Skin Cancer Detection using TensorFlow in Python
· 18 min read · Updated Jan 2022 · Machine Learning · Computer Vision · Healthcare
Skin cancer is an abnormal growth of skin cells. It is one of the most common cancers and, unfortunately, it can become deadly. The good news, though, is that when caught early, your dermatologist can treat it and eliminate it entirely.
Using deep learning and neural networks, we'll be able to classify benign and malignant skin diseases, which may help the doctor diagnose cancer at an earlier stage. In this tutorial, we will build a skin disease classifier that tries to distinguish between benign (nevus and seborrheic keratosis) and malignant (melanoma) skin diseases from photographic images only, using the TensorFlow framework in Python.
To get started, let's install the required libraries:
pip3 install tensorflow tensorflow_hub matplotlib seaborn numpy pandas scikit-learn imbalanced-learn
Open up a new notebook (or Google Colab) and import the necessary modules:
import tensorflow as tf
import tensorflow_hub as hub
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from tensorflow.keras.utils import get_file
from sklearn.metrics import roc_curve, auc, confusion_matrix
from imblearn.metrics import sensitivity_score, specificity_score
import os
import glob
import zipfile
import random

# to get consistent results after multiple runs
tf.random.set_seed(7)
np.random.seed(7)
random.seed(7)

# 0 for benign, 1 for malignant
class_names = ["benign", "malignant"]
Preparing the Dataset
For this tutorial, we'll be using only a small part of the ISIC archive dataset; the below function downloads and extracts the dataset into a new data folder:
def download_and_extract_dataset():
    # dataset from https://github.com/udacity/dermatologist-ai
    # 5.3GB
    train_url = "https://s3-us-west-1.amazonaws.com/udacity-dlnfd/datasets/skin-cancer/train.zip"
    # 824.5MB
    valid_url = "https://s3-us-west-1.amazonaws.com/udacity-dlnfd/datasets/skin-cancer/valid.zip"
    # 5.1GB
    test_url = "https://s3-us-west-1.amazonaws.com/udacity-dlnfd/datasets/skin-cancer/test.zip"
    for i, download_link in enumerate([valid_url, train_url, test_url]):
        temp_file = f"temp{i}.zip"
        data_dir = get_file(origin=download_link, fname=os.path.join(os.getcwd(), temp_file))
        print("Extracting", download_link)
        with zipfile.ZipFile(data_dir, "r") as z:
            z.extractall("data")
        # remove the temp file
        os.remove(temp_file)

# comment the below line if you already downloaded the dataset
download_and_extract_dataset()
This will take several minutes depending on your connection. After that, the data folder will appear, containing the training, validation, and testing sets. Each set is a folder that has three categories of skin disease images (nevus, seborrheic_keratosis, and melanoma).
Note: You may struggle to download the dataset using the above Python function if you have a slow Internet connection; in that case, you should download it and extract it manually into the data folder in the current directory.
Now that we have the dataset on our machine, let's find a way to label these images. Remember, we're only going to classify benign versus malignant skin diseases, so we need to label nevus and seborrheic keratosis with the value 0 and melanoma with 1.
The below cell generates a metadata CSV file for each set; each row in the CSV file corresponds to a path to an image along with its label (0 or 1):
# preparing data
# generate CSV metadata file to read img paths and labels from it
def generate_csv(folder, label2int):
    folder_name = os.path.basename(folder)
    labels = list(label2int)
    # generate CSV file
    df = pd.DataFrame(columns=["filepath", "label"])
    i = 0
    for label in labels:
        print("Reading", os.path.join(folder, label, "*"))
        for filepath in glob.glob(os.path.join(folder, label, "*")):
            df.loc[i] = [filepath, label2int[label]]
            i += 1
    output_file = f"{folder_name}.csv"
    print("Saving", output_file)
    df.to_csv(output_file)

# generate CSV files for all data portions, labeling nevus and seborrheic keratosis
# as 0 (benign), and melanoma as 1 (malignant)
# you should replace "data" with your extracted dataset path
# don't replace it if you used the download_and_extract_dataset() function
generate_csv("data/train", {"nevus": 0, "seborrheic_keratosis": 0, "melanoma": 1})
generate_csv("data/valid", {"nevus": 0, "seborrheic_keratosis": 0, "melanoma": 1})
generate_csv("data/test", {"nevus": 0, "seborrheic_keratosis": 0, "melanoma": 1})
The generate_csv() function accepts two arguments. The first is the path of the set; for example, if you have downloaded and extracted the dataset into "E:\datasets\skin-cancer", then the training set should be something like "E:\datasets\skin-cancer\train". The second parameter is a dictionary that maps each skin disease category to its corresponding label value (again, 0 for benign and 1 for malignant).
The reason I wrote a function like this is to be able to reuse it for other skin disease classification tasks (such as melanocytic classification), so you can add more skin diseases and use it for other problems as well.
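For instance, a hypothetical three-class setup could reuse the same function; the label values below are purely illustrative assumptions and are not used in the rest of this tutorial:

# hypothetical example: reusing generate_csv() for a three-class problem
# (the label values here are illustrative, not part of this tutorial)
generate_csv("data/train", {"nevus": 0, "seborrheic_keratosis": 1, "melanoma": 2})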
Once you run the cell, you'll notice that three CSV files appear in your current directory. Now let's use the from_tensor_slices() method from the tf.data API to load these metadata files:
# loading data
train_metadata_filename = "train.csv"
valid_metadata_filename = "valid.csv"
# load CSV files as DataFrames
df_train = pd.read_csv(train_metadata_filename)
df_valid = pd.read_csv(valid_metadata_filename)
n_training_samples = len(df_train)
n_validation_samples = len(df_valid)
print("Number of training samples:", n_training_samples)
print("Number of validation samples:", n_validation_samples)
train_ds = tf.data.Dataset.from_tensor_slices((df_train["filepath"], df_train["label"]))
valid_ds = tf.data.Dataset.from_tensor_slices((df_valid["filepath"], df_valid["label"]))
Now we have loaded the dataset (train_ds and valid_ds); each sample is a tuple of filepath (path to the image file) and label (0 for benign and 1 for malignant). Here is the output:
Number of training samples: 2000
Number of validation samples: 150
Let's load the images:
# preprocess data
def decode_img(img):
    # convert the compressed string to a 3D uint8 tensor
    img = tf.image.decode_jpeg(img, channels=3)
    # use `convert_image_dtype` to convert to floats in the [0,1] range
    img = tf.image.convert_image_dtype(img, tf.float32)
    # resize the image to the desired size
    return tf.image.resize(img, [299, 299])

def process_path(filepath, label):
    # load the raw data from the file as a string
    img = tf.io.read_file(filepath)
    img = decode_img(img)
    return img, label

valid_ds = valid_ds.map(process_path)
train_ds = train_ds.map(process_path)
# test_ds = test_ds
for image, label in train_ds.take(1):
    print("Image shape:", image.shape)
    print("Label:", label.numpy())
The above code uses the map() method to execute the process_path() function on each sample in both sets; it basically loads the images, decodes the image format, converts the image pixels to the range [0, 1], and resizes them to (299, 299, 3). We then take one image and print its shape:
Image shape: (299, 299, 3)
Label: 0
Everything is as expected. Now let's prepare this dataset for training:
# training parameters
batch_size = 64
optimizer = "rmsprop"

def prepare_for_training(ds, cache=True, batch_size=64, shuffle_buffer_size=1000):
    if cache:
        if isinstance(cache, str):
            ds = ds.cache(cache)
        else:
            ds = ds.cache()
    # shuffle the dataset
    ds = ds.shuffle(buffer_size=shuffle_buffer_size)
    # repeat forever
    ds = ds.repeat()
    # split into batches
    ds = ds.batch(batch_size)
    # `prefetch` lets the dataset fetch batches in the background while the model
    # is training
    ds = ds.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
    return ds

valid_ds = prepare_for_training(valid_ds, batch_size=batch_size, cache="valid-cached-data")
train_ds = prepare_for_training(train_ds, batch_size=batch_size, cache="train-cached-data")
Here is what we did:
- cache(): Since we're making too many calculations on each set, we used the cache() method to save our preprocessed dataset into a local cache file; this way it will only be preprocessed the very first time (in the first epoch during training).
- shuffle(): To basically shuffle the dataset, so the samples are in random order.
- repeat(): Every time we iterate over the dataset, it'll keep generating samples repeatedly; this will help us during training.
- batch(): We batch our dataset into 64 or 32 samples per training step.
- prefetch(): This will enable us to fetch batches in the background while the model is training.
The below cell gets the first validation batch and plots the images along with their corresponding labels:
batch = next(iter(valid_ds))

def show_batch(batch):
    plt.figure(figsize=(12, 12))
    for n in range(25):
        ax = plt.subplot(5, 5, n + 1)
        plt.imshow(batch[0][n])
        plt.title(class_names[batch[1][n].numpy()].title())
        plt.axis('off')

show_batch(batch)
Output:
As you can see, it's extremely hard to differentiate between malignant and benign diseases; let's see how our model will deal with it.
Great, now our dataset is ready; let's dive into building our model.
Building the Model
Notice that earlier we resized all images to (299, 299, 3), and that's because it's what the InceptionV3 architecture expects as input, so we'll be using transfer learning with the TensorFlow Hub library to download and load the InceptionV3 architecture along with its ImageNet pre-trained weights:
# building the model
# InceptionV3 model & pre-trained weights
module_url = "https://tfhub.dev/google/tf2-preview/inception_v3/feature_vector/4"
m = tf.keras.Sequential([
    hub.KerasLayer(module_url, output_shape=[2048], trainable=False),
    tf.keras.layers.Dense(1, activation="sigmoid")
])
m.build([None, 299, 299, 3])
m.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["accuracy"])
m.summary()
We set trainable to False so we won't be able to adjust the pre-trained weights during our training; we also added a final output layer with one unit that is expected to output a value between 0 and 1 (close to 0 means benign, and close to 1 means malignant).
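If you later want to fine-tune the pre-trained weights as well (not done in this tutorial), a minimal sketch would be to rebuild the model with trainable=True and a much lower learning rate; the optimizer and learning rate below are assumptions, not tuned settings:

# optional fine-tuning sketch (an addition, not part of the original tutorial):
# make the InceptionV3 weights trainable and use a small learning rate
# so the pre-trained features aren't destroyed early in training
m_finetune = tf.keras.Sequential([
    hub.KerasLayer(module_url, output_shape=[2048], trainable=True),
    tf.keras.layers.Dense(1, activation="sigmoid")
])
m_finetune.build([None, 299, 299, 3])
m_finetune.compile(loss="binary_crossentropy",
                   optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-5),
                   metrics=["accuracy"])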
After that, since this is a binary classification problem, we built our model using the binary cross-entropy loss, and used accuracy as our metric (not that reliable a metric here; we'll see why shortly). Here is the output of our model summary:
Model: "sequential" _________________________________________________________________ Layer (blazon) Output Shape Param # ================================================================= keras_layer (KerasLayer) multiple 21802784 _________________________________________________________________ dense (Dense) multiple 2049 ================================================================= Full params: 21,804,833 Trainable params: ii,049 Non-trainable params: 21,802,784 _________________________________________________________________
Learn also: Satellite Image Classification using TensorFlow in Python
Training the Model
We now have our dataset and the model; let's put them together:
model_name = f"benign-vs-malignant_{batch_size}_{optimizer}"
tensorboard = tf.keras.callbacks.TensorBoard(log_dir=os.path.join("logs", model_name))
# saves model checkpoint whenever we reach better weights
modelcheckpoint = tf.keras.callbacks.ModelCheckpoint(model_name + "_{val_loss:.3f}.h5",
                                                     save_best_only=True, verbose=1)

history = m.fit(train_ds, validation_data=valid_ds,
                steps_per_epoch=n_training_samples // batch_size,
                validation_steps=n_validation_samples // batch_size,
                verbose=1, epochs=100,
                callbacks=[tensorboard, modelcheckpoint])
We're using the ModelCheckpoint callback to save the best weights so far after each epoch. That's why I set epochs to 100: the model can converge to better weights at any time. To save your time, feel free to reduce that to 30 or so.
I also added TensorBoard as a callback in case you want to experiment with different hyperparameter values.
Since the fit() method doesn't know how many samples there are in the dataset, we need to specify the steps_per_epoch and validation_steps parameters, i.e., the number of iterations (the number of samples divided by the batch size) for the training set and validation set respectively. For example, with 2000 training samples and a batch size of 64, steps_per_epoch is 2000 // 64 = 31.
Here is a part of the output during training:
Train for 31 steps, validate for 2 steps
Epoch 1/100
30/31 [============================>.] - ETA: 9s - loss: 0.4609 - accuracy: 0.7760
Epoch 00001: val_loss improved from inf to 0.49703, saving model to benign-vs-malignant_64_rmsprop_0.497.h5
31/31 [==============================] - 282s 9s/step - loss: 0.4646 - accuracy: 0.7722 - val_loss: 0.4970 - val_accuracy: 0.8125
<..SNIPPED..>
Epoch 27/100
30/31 [============================>.] - ETA: 0s - loss: 0.2982 - accuracy: 0.8708
Epoch 00027: val_loss improved from 0.40253 to 0.38991, saving model to benign-vs-malignant_64_rmsprop_0.390.h5
31/31 [==============================] - 21s 691ms/step - loss: 0.3025 - accuracy: 0.8684 - val_loss: 0.3899 - val_accuracy: 0.8359
<..SNIPPED..>
Epoch 41/100
30/31 [============================>.] - ETA: 0s - loss: 0.2800 - accuracy: 0.8802
Epoch 00041: val_loss did not improve from 0.38991
31/31 [==============================] - 21s 690ms/step - loss: 0.2829 - accuracy: 0.8790 - val_loss: 0.3948 - val_accuracy: 0.8281
Epoch 42/100
30/31 [============================>.] - ETA: 0s - loss: 0.2680 - accuracy: 0.8859
Epoch 00042: val_loss did not improve from 0.38991
31/31 [==============================] - 21s 693ms/step - loss: 0.2722 - accuracy: 0.8831 - val_loss: 0.4572 - val_accuracy: 0.8047
Model Evaluation
First, let's load our test set, just like previously:
# evaluation
# load testing set
test_metadata_filename = "test.csv"
df_test = pd.read_csv(test_metadata_filename)
n_testing_samples = len(df_test)
print("Number of testing samples:", n_testing_samples)
test_ds = tf.data.Dataset.from_tensor_slices((df_test["filepath"], df_test["label"]))

def prepare_for_testing(ds, cache=True, shuffle_buffer_size=1000):
    if cache:
        if isinstance(cache, str):
            ds = ds.cache(cache)
        else:
            ds = ds.cache()
    ds = ds.shuffle(buffer_size=shuffle_buffer_size)
    return ds

test_ds = test_ds.map(process_path)
test_ds = prepare_for_testing(test_ds, cache="test-cached-data")
The above code loads our test data and prepares it for testing:
Number of testing samples: 600
600 images of the shape (299, 299, 3) can fit in our memory; let's convert our test set from tf.data into a NumPy array:
# convert testing set to numpy arrays to fit in memory (don't do that when the
# testing set is too big)
y_test = np.zeros((n_testing_samples,))
X_test = np.zeros((n_testing_samples, 299, 299, 3))
for i, (img, label) in enumerate(test_ds.take(n_testing_samples)):
    # print(img.shape, label.shape)
    X_test[i] = img
    y_test[i] = label.numpy()

print("y_test.shape:", y_test.shape)
The above cell will construct our arrays; it will take some time the first time it's executed because it's doing all the preprocessing defined in the process_path() and prepare_for_testing() functions.
Now let's load our optimal weights that were saved by ModelCheckpoint during the training:
# load the weights with the least loss
m.load_weights("benign-vs-malignant_64_rmsprop_0.390.h5")
You may not have the exact filename of the optimal weights; you need to search the current directory for the saved weights with the least loss (a short sketch for doing that automatically is shown below).
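A minimal sketch for locating that checkpoint programmatically; it assumes the filename pattern produced by the ModelCheckpoint callback above, i.e., that the validation loss is the last underscore-separated field before the .h5 extension:

# pick the saved checkpoint with the smallest val_loss encoded in its filename
# (assumes files named like "benign-vs-malignant_64_rmsprop_<val_loss>.h5")
checkpoints = glob.glob(f"{model_name}_*.h5")
best_checkpoint = min(checkpoints, key=lambda path: float(path.rsplit("_", 1)[-1][:-3]))
print("Loading", best_checkpoint)
m.load_weights(best_checkpoint)

The below code evaluates the model using the accuracy metric: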
print("Evaluating the model...") loss, accuracy = m.evaluate(X_test, y_test, verbose=0) print("Loss:", loss, " Accuracy:", accuracy)
Output:
Evaluating the model...
Loss: 0.4476394319534302   Accuracy: 0.8
We've reached about 84% accuracy on the validation set and 80% on the test set, but that's not all. Since our dataset is largely unbalanced, accuracy doesn't tell the whole story. In fact, a model that predicts every image as benign would also get an accuracy of 80%, since malignant samples are only about 20% of the total set.
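You can verify that baseline yourself with a quick sanity check (a small sketch, reusing the y_test array built above):

# accuracy of a hypothetical model that always predicts "benign" (label 0)
baseline_accuracy = np.mean(y_test == 0)
print("All-benign baseline accuracy:", baseline_accuracy)  # roughly 0.8 on this test set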
As a result, we need a better way to evaluate our model. In the upcoming cells, we'll use the seaborn and matplotlib libraries to draw the confusion matrix, which tells us more about how well our model is doing.
But before we do that, I just want to make something clear: we all know that predicting a malignant disease as benign is a terrible mistake; you can kill people doing that! So we need a way to predict malignant cases more aggressively, even though we have very few malignant samples compared to benign ones. A good method is introducing a threshold.
Recall that the output of the neural network is a value between 0 and 1. Normally, when the neural network produces a value between 0 and 0.5, we automatically assign it as benign, and from 0.5 to 1.0 as malignant. Since we want to reduce the risk of predicting a malignant disease as benign (that's only one of the many reasons), we can say, for example, that from 0 to 0.3 is benign and from 0.3 to 1.0 is malignant. This means we are using a threshold value of 0.3, which will improve our predictions.
The below function does that:
def get_predictions(threshold=None):
    """
    Returns predictions for binary classification given `threshold`
    For instance, if threshold is 0.3, then it'll output 1 (malignant)
    for a sample if the probability of 1 is 30% or more (instead of 50%)
    """
    y_pred = m.predict(X_test)
    if not threshold:
        threshold = 0.5
    result = np.zeros((n_testing_samples,))
    for i in range(n_testing_samples):
        # test melanoma probability
        if y_pred[i][0] >= threshold:
            result[i] = 1
        # else, it's 0 (benign)
    return result

threshold = 0.23
# get predictions with a 23% threshold,
# which means if the model is 23% sure or more that a sample is malignant,
# it's assigned as malignant, otherwise it's benign
y_pred = get_predictions(threshold)
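If you want to see how the choice of threshold trades sensitivity against specificity before settling on a value, here is a minimal sketch (an addition to the tutorial, reusing X_test/y_test and the imblearn metrics imported earlier); it predicts the probabilities once and sweeps a few candidate thresholds:

# sweep a few thresholds and report sensitivity/specificity for each
# (the threshold grid below is illustrative; pick whatever values you like)
y_proba = m.predict(X_test).ravel()
for t in [0.2, 0.3, 0.4, 0.5]:
    preds = (y_proba >= t).astype(float)
    print(f"threshold={t:.2f}  "
          f"sensitivity={sensitivity_score(y_test, preds):.3f}  "
          f"specificity={specificity_score(y_test, preds):.3f}")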
Now let's draw our confusion matrix and interpret it:
def plot_confusion_matrix(y_test, y_pred):
    cmn = confusion_matrix(y_test, y_pred)
    # normalize
    cmn = cmn.astype('float') / cmn.sum(axis=1)[:, np.newaxis]
    # print it
    print(cmn)
    fig, ax = plt.subplots(figsize=(10, 10))
    sns.heatmap(cmn, annot=True, fmt='.2f',
                xticklabels=[f"pred_{c}" for c in class_names],
                yticklabels=[f"true_{c}" for c in class_names],
                cmap="Blues")
    plt.ylabel('Actual')
    plt.xlabel('Predicted')
    # plot the resulting confusion matrix
    plt.show()

plot_confusion_matrix(y_test, y_pred)
Output:
Sensitivity
So our model gets about a 0.72 probability of a positive test given that the patient has the disease (bottom right of the confusion matrix); that's often called sensitivity.
Sensitivity is a statistical measure that is widely used in medicine and is given by the following formula (from Wikipedia):

sensitivity = true positives / (true positives + false negatives)
So in our case, out of all patients that have a malignant skin disease, we successfully predicted 72% of them as malignant; not bad, but it needs improvement.
Specificity
The other metric is specificity; you can read it in the top left of the confusion matrix. We got about 63%. It is basically the probability of a negative test given that the patient is well:

specificity = true negatives / (true negatives + false positives)
In our case, out of all patients that have a benign skin disease, we predicted 63% of them as benign.
With high specificity, the test rarely gives positive results in healthy patients, whereas high sensitivity means that the model is reliable when its result is negative. I invite you to read more about it in this Wikipedia article.
Alternatively, you can use the imblearn module to get these scores:
sensitivity = sensitivity_score(y_test, y_pred)
specificity = specificity_score(y_test, y_pred)
print("Melanoma Sensitivity:", sensitivity)
print("Melanoma Specificity:", specificity)
Output:
Melanoma Sensitivity: 0.717948717948718
Melanoma Specificity: 0.6252587991718427
Receiver Operating Characteristic
Another good metric is the ROC curve, which is basically a graphical plot that shows us the diagnostic ability of our binary classifier; it features the true positive rate on the Y-axis and the false positive rate on the X-axis. The perfect point we want to reach is in the top left corner of the plot. Here is the code for plotting the ROC curve using matplotlib:
def plot_roc_auc(y_true, y_pred):
    """
    This function plots the ROC curve and provides the score.
    """
    # prepare the figure
    plt.figure()
    fpr, tpr, _ = roc_curve(y_true, y_pred)
    # obtain ROC AUC
    roc_auc = auc(fpr, tpr)
    # print score
    print(f"ROC AUC: {roc_auc:.3f}")
    # plot ROC curve
    plt.plot(fpr, tpr, color="blue", lw=2,
             label=f'ROC curve (area = {roc_auc:.2f})')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('ROC curves')
    plt.legend(loc="lower right")
    plt.show()

plot_roc_auc(y_test, y_pred)
Output:
ROC AUC: 0.671
Awesome! Since we want to maximize the true positive rate and minimize the false positive rate, computing the area under the ROC curve proves to be useful; we got 0.671 as the Area Under the Curve (ROC AUC). An area of 1 means the model is ideal for all cases.
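Note that here the curve is drawn from the already-thresholded 0/1 predictions; if you'd like a smoother curve and a threshold-independent AUC, a small variation on the code above is to pass the raw predicted probabilities instead:

# variation: compute the ROC curve from the raw sigmoid outputs instead of
# the 0/1 predictions, so every possible threshold contributes a point
y_proba = m.predict(X_test).ravel()
plot_roc_auc(y_test, y_proba)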
Conclusion
We're done! There you have it. See how you can improve the model: we only used 2000 training samples, so go to the ISIC archive and download more and add them to the data folder; the scores will improve significantly depending on the number of samples you add. You can use the ISIC archive downloader, which may help you download the dataset in the way you want.
I also encourage you to tweak the hyperparameters, such as the threshold we set earlier, and see if you can get better sensitivity and specificity scores.
I used the InceptionV3 model architecture; you're free to use any CNN architecture you want. I invite you to browse TensorFlow Hub and choose the newest model. For example, in satellite image classification, we've chosen EfficientNetV2; try it out and you may increase the performance significantly!
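Swapping architectures only takes a couple of changes; here is a minimal sketch (the feature-vector URL and input size below are placeholders, to be replaced with the values listed on the tfhub.dev page of whatever model you pick):

# swap in a different TF Hub feature-vector model (placeholder URL and size;
# use the URL and expected input size from the model's tfhub.dev page)
module_url = "https://tfhub.dev/<publisher>/<model>/feature_vector/<version>"
image_size = 299  # change to the input size the chosen model expects
m = tf.keras.Sequential([
    hub.KerasLayer(module_url, trainable=False),
    tf.keras.layers.Dense(1, activation="sigmoid")
])
m.build([None, image_size, image_size, 3])
m.compile(loss="binary_crossentropy", optimizer=optimizer, metrics=["accuracy"])
# remember to also resize the images in decode_img() to the same size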
References
- Dermatologist-level classification of skin cancer with deep neural networks
- Dermatologist AI
Learn also: How to Perform YOLO Object Detection using OpenCV and PyTorch in Python.
Happy Learning ♥
View Full Code