
Let us see how to implement image data augmentation on a CNN. To do so, you can follow these steps:

Step 1: Start by importing the necessary libraries, including Keras and NumPy:
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.preprocessing.image import ImageDataGenerator
import numpy as np

Step 2: Create an ImageDataGenerator object and specify the desired data augmentation techniques:
datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    validation_split=0.2
)

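The ImageDataGenerator object generates batches of augmented images by applying the specified transformations at random each time a batch is requested. In this example, we use rotations of up to 20 degrees, width and height shifts of up to 20% of the image size, and horizontal flipping. The validation_split=0.2 argument reserves 20% of the images for validation; it is used in Step 3.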
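
To make two of these transformations concrete, here is a toy sketch in plain Python. It is not how Keras implements them: the image is a tiny nested list, the transforms are fixed rather than random, and the helper names horizontal_flip and shift_right are made up for this illustration. The shift repeats the edge pixel to fill the gap, which is similar in spirit to the default fill_mode='nearest'.

```python
# Toy illustration only: ImageDataGenerator applies random transforms to
# real image arrays; here the transforms are fixed and the image is tiny.

def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def shift_right(image, pixels):
    """Shift each row right by `pixels`, repeating the left edge value."""
    shifted = []
    for row in image:
        step = max(0, min(pixels, len(row)))  # clamp to the row width
        shifted.append([row[0]] * step + row[:len(row) - step])
    return shifted

# A 2x4 grayscale "image" (rows are assumed to be non-empty)
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]

flipped = horizontal_flip(image)
shifted = shift_right(image, 1)

# A fractional width_shift_range is relative to the image width, so
# 0.2 on a 224-pixel-wide image allows shifts of up to about 44 pixels.
max_shift_pixels = 0.2 * 224

print(flipped)
print(shifted)
```

Because the generator draws new random parameters for every batch, the model rarely sees exactly the same pixels twice, which is what helps reduce overfitting.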

Step 3: Load the original dataset and split it into training and validation sets:
train_generator = datagen.flow_from_directory(
    '/path/to/dataset',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical',
    subset='training'
)
val_generator = datagen.flow_from_directory(
    '/path/to/dataset',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical',
    subset='validation'
)

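Here, we use the flow_from_directory() method to load the original dataset from the specified directory. We also specify the target size of the images, the batch size, and the class mode (categorical in this case). The subset parameter selects the training or validation part of the split defined by validation_split. Note that both generators come from the same ImageDataGenerator, so the validation images are augmented as well; if you want to validate on unmodified images, one option is a second ImageDataGenerator that sets only validation_split.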
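
The following sketch illustrates the directory layout that flow_from_directory() expects and roughly how a 20% split behaves. It uses only the standard library and does not call Keras. The class names cats and dogs, the empty placeholder files, and the helpers discover_classes and split_files are assumptions made for this example. The split logic is a simplification, so the exact files Keras assigns to each subset may differ.

```python
import os
import tempfile

def discover_classes(root):
    """Map each subdirectory of root to an index, in sorted name order."""
    names = sorted(
        entry for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry))
    )
    return {name: index for index, name in enumerate(names)}

def split_files(files, validation_split):
    """Reserve a fraction of the sorted files of one class for validation."""
    files = sorted(files)
    cut = int(validation_split * len(files))
    return files[cut:], files[:cut]  # (training, validation)

with tempfile.TemporaryDirectory() as root:
    # Build root/dogs/img_0.jpg ... and root/cats/img_0.jpg ...
    # The files are empty placeholders; no image is decoded here.
    for class_name in ("dogs", "cats"):
        class_dir = os.path.join(root, class_name)
        os.makedirs(class_dir)
        for i in range(10):
            open(os.path.join(class_dir, f"img_{i}.jpg"), "w").close()

    class_indices = discover_classes(root)
    train_files, val_files = split_files(
        os.listdir(os.path.join(root, "cats")), validation_split=0.2
    )

print(class_indices)
print(len(train_files), len(val_files))
```

With a real generator, the corresponding mapping is available as train_generator.class_indices, which is a convenient way to check that the classes were picked up as intended.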

Let's break down the parameters passed to flow_from_directory() in Step 3:

  • '/path/to/dataset': The path to the directory containing the dataset. The function looks for subdirectories inside this directory and treats each subdirectory as a separate class.
  • target_size=(224, 224): The size to which every image is resized on loading, here 224×224 pixels. A uniform size is needed because the network expects a fixed input shape, and pre-trained models in particular expect a specific input size.
  • batch_size=32: The number of images loaded and processed in each iteration during training or validation. A larger batch size can speed up training but requires more memory; smaller batch sizes are often used when memory is limited. The batch size also affects the gradient updates, and therefore the stability and convergence of training.
  • class_mode='categorical': How the labels are represented. Here it is set to categorical, so the labels are one-hot encoded (a binary matrix representation of class membership). Other possible values include binary for binary classification, sparse for integer-encoded class labels, and None for no labels (used for test datasets).
  • subset='training' or subset='validation': Selects which part of the split the generator yields. The split itself is defined by validation_split=0.2 on the ImageDataGenerator, which reserves 20% of the images in each class subdirectory for validation. No separate train and validation subdirectories are needed; such subdirectories placed directly under the dataset path would be interpreted as classes.
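
As a small illustration of two of these parameters, the sketch below shows the difference between integer (sparse) and one-hot (categorical) labels, and how the batch size determines the number of batches per epoch. It is plain Python rather than Keras code; the one_hot helper and the dataset size of 1000 images are hypothetical.

```python
import math

def one_hot(label, num_classes):
    """Return a list with 1.0 at position `label` and 0.0 elsewhere."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

# Integer labels, as class_mode='sparse' would represent them
sparse_labels = [0, 1, 1, 0]

# One-hot labels, as class_mode='categorical' would represent them
categorical_labels = [one_hot(y, num_classes=2) for y in sparse_labels]

# Number of batches needed to see every image once
num_images = 1000  # hypothetical dataset size
batch_size = 32
batches_per_epoch = math.ceil(num_images / batch_size)
last_batch_size = num_images - (batches_per_epoch - 1) * batch_size

print(categorical_labels)
print(batches_per_epoch, last_batch_size)
```

One-hot labels pair with the categorical_crossentropy loss, while integer labels would call for sparse_categorical_crossentropy instead.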

In summary, these parameters help configure the data generator to load and preprocess images from a directory. The choices made for target size, batch size, and class mode are often determined by the requirements of the machine learning model being used, the available computing resources, and the characteristics of the dataset.

Step 4: Create a CNN model:
model = Sequential()
# Four convolution + max-pooling blocks with an increasing number of filters
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(256, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
# Classifier head
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
# Two output units: this assumes the dataset contains exactly two classes
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])
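
To see how the 224×224 input shrinks as it passes through this network, the following plain-Python sketch traces the feature-map size and counts the trainable parameters by hand. It assumes the Keras defaults used above (unpadded 3×3 convolutions with stride 1, and 2×2 pooling with stride 2); the helper names conv_valid and max_pool are made up for this example.

```python
def conv_valid(size, kernel=3):
    """Output size of a stride-1 convolution without padding."""
    return size - kernel + 1

def max_pool(size, pool=2):
    """Output size of non-overlapping max pooling (floor division)."""
    return size // pool

size = 224      # input height and width
channels = 3    # RGB input
conv_params = 0
for filters in (32, 64, 128, 256):
    # 3x3 weights per input channel, plus one bias, for each filter
    conv_params += (3 * 3 * channels + 1) * filters
    size = max_pool(conv_valid(size))
    channels = filters

flattened = size * size * channels        # length of the Flatten() output
dense_params = (flattened + 1) * 512      # Dense(512)
output_params = (512 + 1) * 2             # Dense(2)
total_params = conv_params + dense_params + output_params

print(size, flattened, total_params)
```

Almost all of the parameters sit in the first Dense layer, which is one reason the Dropout layer is placed right after it. On the real model, model.summary() reports the same kind of per-layer shapes and counts.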
