Data --- the Challenge
One of the "frustrating" challenges in Deep Learning, for beginners and experienced practitioners alike, is getting data: both collecting it and preparing and formatting it for the Deep Learning framework of choice (e.g. Tensorflow).
-
ISSUE 1: Downloading data can be frustrating as many datasets are huge (e.g. ImageNet 2012 is 150GB+)
-
Tensorflow has some of the more common datasets built in and ready for you to use. See https://github.com/tensorflow/datasets/blob/master/docs/announce_proxy.md
-
See the Tensorflow documentation for details. Note that you first have to install the tensorflow_datasets module with:
pip install -q tensorflow-datasets
The CODE will look something like:
import tensorflow_datasets as tfds
# split="train" selects the training split of CIFAR-100
train_dataset = tfds.load(name="cifar100", split="train")
# shuffle with a 2048-element buffer, then group into batches of 64
train_dataset = train_dataset.shuffle(2048).batch(64)
-
-
IF your dataset is not one of these, you will have to download it yourself (and, if you are training on the cloud, upload it there too).
-
-
ISSUE 2: Loading data (we will focus only on images and video)
-
Look at the example exercises for CNN and LSTM that I have provided. BUT your dataset may be formatted differently (it usually will be), and you are going to have to figure this out.
-
TIP: if you are building a CNN or RNN (like LSTM) type model and creating your own training data, reuse the format from the examples above for your new data.
-
ISSUE 3: Resizing, Scaling and Sampling
-
You will generally choose to downsample your training and testing images (or images from video) from their original, probably larger, size. See the CNN and LSTM examples. (tf.image.resize and cv2.resize are methods to use.)
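-
As a rough illustration of the downsampling step, here is a minimal sketch using tf.image.resize. The random array stands in for a decoded image, and the 224x224 target size is a hypothetical choice, not something taken from the CNN or LSTM examples; use the input size your own model expects.

```python
import numpy as np
import tensorflow as tf

# Stand-in for a decoded image: 480x640 RGB with uint8 pixel values.
# (Hypothetical data; a real image would come from a file or a video frame.)
original = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)

# Hypothetical target size, given as (height, width).
TARGET_SIZE = (224, 224)

# With the default bilinear method, tf.image.resize returns a float32 tensor.
# It changes the image size only; pixel values stay in the 0-255 range.
resized = tf.image.resize(original, TARGET_SIZE)

print(resized.shape, resized.dtype)
```

Note that resizing and rescaling pixel values are separate steps, so the result here would still need to be scaled if your model expects 0 to 1.0 input.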
-
If you have video you will probably want to sample a fixed number of images per video sequence for training/validation/test - see the LSTM example
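-
One simple way to pick a fixed number of frames is to spread them evenly over the video. The sketch below is an assumption about how this could be done, not the method used in the LSTM example; the function name sample_frame_indices is hypothetical.

```python
def sample_frame_indices(total_frames, num_samples):
    """Return up to num_samples frame indices spread evenly across a video."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # Short video: use every frame that exists. The caller may need to
        # pad or repeat frames to reach a fixed sequence length.
        return list(range(total_frames))
    step = total_frames / num_samples
    # Take the frame at the middle of each of num_samples equal segments.
    return [int(step * i + step / 2) for i in range(num_samples)]


# Example: choose 4 frames from a 100-frame video.
print(sample_frame_indices(100, 4))
```

The returned indices can then be used to decide which frames to keep while reading the video.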
-
Many models expect input pixels with RGB values from 0-255 to be rescaled to floating point values from 0 to 1.0. If this is the case for your model, your input will have to be divided by 255.
-
See image_gen = ImageDataGenerator(rescale=1./255, horizontal_flip=True) inside the CNN example.
-
See img = tf.keras.applications.inception_v3.preprocess_input(img) inside the LSTM example, where a pre-built inception_v3 CNN processes each image frame selected for the LSTM. Note that this call scales pixel values to the range -1 to 1 rather than 0 to 1.
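-
The two scaling conventions mentioned above can be compared side by side with plain NumPy. This is only a sketch of the arithmetic on a tiny made-up image; the second formula reflects my understanding that the Inception preprocessing maps 0-255 to the range -1 to 1. In real code, call the preprocessing function that belongs to your model.

```python
import numpy as np

# Hypothetical 2x2 RGB image with uint8 values in 0-255.
pixels = np.array([[[0, 128, 255], [255, 255, 255]],
                   [[0, 0, 0], [64, 64, 64]]], dtype=np.uint8)

# Convention A: scale to [0, 1], the same arithmetic as rescale=1./255.
scaled_01 = pixels.astype(np.float32) / 255.0

# Convention B: scale to [-1, 1] (assumed equivalent of the Inception-style
# preprocessing; use the library function itself in practice).
scaled_pm1 = pixels.astype(np.float32) / 127.5 - 1.0

print(scaled_01.min(), scaled_01.max())
print(scaled_pm1.min(), scaled_pm1.max())
```

Whichever convention your model was trained with is the one you must use everywhere else.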
-
-
-
ISSUE 3: Creating Tensorflow Datasets or Data Generators (e.g. ImageDataGenerator) for batch processing input
-
You will most likely be creating your own datasets for image-based input (including images coming from videos). Here are a few ways in Tensorflow to load the data, via ImageDataGenerator or Dataset:
-
tf.keras.preprocessing.image.ImageDataGenerator (see the example at Google and the CNN example). The code below creates an ImageDataGenerator that scales the image pixel values (rgb) from the 0-255 range down to 0-1 and randomly flips images horizontally. It reads the files in from a directory (train_dir).
from tensorflow.keras.preprocessing.image import ImageDataGenerator
image_gen = ImageDataGenerator(rescale=1./255, horizontal_flip=True)
train_data_gen = image_gen.flow_from_directory(batch_size=BATCH_SIZE,
                                               directory=train_dir,
                                               shuffle=True,
                                               target_size=(IMG_SHAPE, IMG_SHAPE))
-
tf.data.Dataset: SEE the LSTM example for creating a dataset of image frames sampled from a video, which are then processed by an Inception CNN. The code sample has a function called frame_generator that grabs images from the video samples and prepares them for processing by the CNN.
# each element is a (float32 image of shape 299x299x3, scalar string) pair
dataset = tf.data.Dataset.from_generator(frame_generator,
                                         output_types=(tf.float32, tf.string),
                                         output_shapes=((299, 299, 3), ()))
# group frames into batches of 16 and prefetch while the model is busy
dataset = dataset.batch(16).prefetch(tf.data.experimental.AUTOTUNE)
-
WHEN training the LSTM network itself, the input will not be images but "data", basically feature vectors (vectors of numbers). In this case, the image input has already been processed by a CNN. And if this is for training (not run time), you have probably done this once and saved the feature data in files.
-
tf.data.Dataset is one possibility. In the LSTM example you can see that it uses a function make_generator that reads the stored training feature vectors for the image sequence of each training video. See the LSTM example.
-
-
-
-
ISSUE 4: Run Time: You must process your input at run time in the same way you processed it for your model in training
-
See the CNN example, where I have processed a single image for prediction.
-
See the Deploy on Google Cloud example for issues related to taking image data and processing it for delivery, using json, to a CNN model hosted on Google Cloud (the idea is similar for LSTM).
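-
To make the run-time point concrete, here is a minimal sketch of preparing one image for prediction and packing it into JSON. It is not the code from the CNN or Google Cloud examples. The preprocess function, the IMG_SHAPE value of 150, the divide-by-255 scaling, and the "instances" key in the payload are all assumptions for illustration; match them to what your own training code and hosting service actually use.

```python
import json

import numpy as np

IMG_SHAPE = 150  # hypothetical value; must equal the size used in training


def preprocess(image):
    """Apply the same scaling that was used in training (assumed: divide by 255)."""
    expected = (IMG_SHAPE, IMG_SHAPE, 3)
    if image.shape != expected:
        raise ValueError(f"expected shape {expected}, got {image.shape}")
    return image.astype(np.float32) / 255.0


# Stand-in for one already-resized RGB image (all pixels set to 255).
image = np.full((IMG_SHAPE, IMG_SHAPE, 3), 255, dtype=np.uint8)

# Models predict on batches, so add a leading batch dimension of 1.
batch = np.expand_dims(preprocess(image), axis=0)

# JSON cannot hold NumPy arrays, so convert to nested lists first.
# Assumption: the service expects a top-level "instances" list.
payload = json.dumps({"instances": batch.tolist()})

print(batch.shape, len(payload))
```

Keeping the scaling in one function that both the training pipeline and the prediction code call is a simple way to avoid the two drifting apart.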
-