
Yolo App - YOLOv5 Data Preparation

Shenzhen, China

  1. Prepare your Images and get Data
  2. Train your Tensorflow Model
  3. Use your Model to do Predictions
  4. Use Tesseract to Read Number Plates
  5. Flask Web Application
  6. Yolo v5 - Data Prep

I now have a clean dataset, a working model and a web application for testing. But the detection process is relatively slow and not suitable for real-time video detection. This is where YOLOv5 comes in.

Data Preparation

There is only one problem with the data that was used to train the Tensorflow model. There, I defined the bounding box around each license plate by the variables xmin, xmax, ymin and ymax. YOLO instead expects the X and Y values of the center point of the region of interest, together with its width and height.
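To make the difference concrete, here is a minimal sketch of the conversion for a single box. The function name voc_to_yolo and the example numbers are my own illustration and not part of the project code. The formulas take the center and size of the box and divide them by the image dimensions.

```python
# Hypothetical helper (illustration only): convert corner coordinates
# to the normalized center format that YOLO expects.
def voc_to_yolo(xmin, xmax, ymin, ymax, img_width, img_height):
    center_x = (xmin + xmax) / (2 * img_width)
    center_y = (ymin + ymax) / (2 * img_height)
    bb_width = (xmax - xmin) / img_width
    bb_height = (ymax - ymin) / img_height
    return center_x, center_y, bb_width, bb_height


# Made-up example: a 40x100 pixel box inside a 100x200 pixel image
example = voc_to_yolo(xmin=20, xmax=60, ymin=50, ymax=150,
                      img_width=100, img_height=200)
print(example)  # (0.4, 0.5, 0.4, 0.5)
```

All four values end up between 0 and 1, so they do not depend on the resolution of the image.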

Load Labels

Let's start by importing the label file for our images:

import numpy as np
import pandas as pd
from glob import glob
import xml.etree.ElementTree as xet
import cv2
import os
from shutil import copy

df = pd.read_csv('../labels.csv')
df.head()
    filepath 	                xmin 	xmax 	ymin 	ymax
0 ../resources/cars_170.xml 224 439 77 167
1 ../resources/cars_171.xml 416 620 536 600
2 ../resources/cars_172.xml 184 325 114 148
3 ../resources/cars_173.xml 154 373 91 149
4 ../resources/cars_174.xml 131 279 213 256

Parse XML Data

I now need to convert these bbox values to center_x, center_y, width and height and normalize them to their image size. I can start by extracting the image width and height from the generated XML label:

<annotation>
<folder>resources</folder>
<filename>cars_1.jpg</filename>
<path>/opt/yolo-app/resources/cars_1.jpg</path>
<source>
<database>Unknown</database>
</source>
<size>
<width>1600</width>
<height>1153</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>number_plate</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>1085</xmin>
<ymin>561</ymin>
<xmax>1354</xmax>
<ymax>683</ymax>
</bndbox>
</object>
</annotation>

The function to parse the image label is:

# Parsing XML labels
def xmlparsing(path):
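    # Read the XML tree and get its root element
    parser = xet.parse(path).getroot()
    # Build the image path from the folder and filename tags
    image_path = '../' + parser.find('folder').text + '/' + parser.find('filename').text
    # Image dimensions are stored in the size tag
    image_size = parser.find('size')
    width = int(image_size.find('width').text)
    height = int(image_size.find('height').text)

    return image_path, width, height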
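To try the parser without touching the dataset, the sketch below writes a trimmed copy of the annotation above to a temporary file and parses it. It repeats the function so that it runs on its own. The temporary file and the names SAMPLE_XML and result are assumptions made for this illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as xet

# Trimmed copy of the annotation shown above (illustration only)
SAMPLE_XML = """<annotation>
<folder>resources</folder>
<filename>cars_1.jpg</filename>
<size>
<width>1600</width>
<height>1153</height>
<depth>3</depth>
</size>
</annotation>"""


def xmlparsing(path):
    parser = xet.parse(path).getroot()
    image_path = '../' + parser.find('folder').text + '/' + parser.find('filename').text
    image_size = parser.find('size')
    width = int(image_size.find('width').text)
    height = int(image_size.find('height').text)
    return image_path, width, height


# Write the sample to a throwaway file, parse it, then clean up
with tempfile.NamedTemporaryFile('w', suffix='.xml', delete=False) as tmp:
    tmp.write(SAMPLE_XML)

result = xmlparsing(tmp.name)
os.remove(tmp.name)
print(result)  # ('../resources/cars_1.jpg', 1600, 1153)
```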

Now I can append the image width and height to my Pandas dataframe with:

# Take filepath from df and function to append
# image_path, width and height from XML label
df[['image_path','width','height']] = df['filepath'].apply(xmlparsing).apply(pd.Series)
df.head()
    filepath 	                  xmin 	xmax 	ymin 	ymax 	image_path 	                width height
0 ../resources/cars_170.xml 224 439 77 167 ../resources/cars_170.jpeg 500 234
1 ../resources/cars_171.xml 416 620 536 600 ../resources/cars_171.jpeg 1070 907
2 ../resources/cars_172.xml 184 325 114 148 ../resources/cars_172.jpeg 500 333
3 ../resources/cars_173.xml 154 373 91 149 ../resources/cars_173.jpeg 500 250
4 ../resources/cars_174.xml 131 279 213 256 ../resources/cars_174.jpeg 414 432

Calculate Bounding Box

Now I can calculate the variables that YOLO needs:

# Calculate center_x, center_y, width and height of bounding box
# and normalize them to image size
df['center_x'] = (df['xmax'] + df['xmin'])/(2*df['width'])
df['center_y'] = (df['ymax'] + df['ymin'])/(2*df['height'])

df['bb_width'] = (df['xmax'] - df['xmin'])/df['width']
df['bb_height'] = (df['ymax'] - df['ymin'])/df['height']

df.head()
 	filepath         image_path 	      width height 	center_x 	center_y 	bb_width 	bb_height
0 .../cars_170.xml .../cars_170.jpeg 500 234 0.663000 0.521368 0.430000 0.384615
1 .../cars_171.xml .../cars_171.jpeg 1070 907 0.484112 0.626240 0.190654 0.070562
2 .../cars_172.xml .../cars_172.jpeg 500 333 0.509000 0.393393 0.282000 0.102102
3 .../cars_173.xml .../cars_173.jpeg 500 250 0.527000 0.480000 0.438000 0.232000
4 .../cars_174.xml .../cars_174.jpeg 414 432 0.495169 0.542824 0.357488 0.099537
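A quick way to check the normalized values is to convert them back to pixel coordinates and compare the result with the original labels. The helper yolo_to_voc below is a hypothetical name of mine and not part of the project. It is fed the rounded values of the first row (cars_170) from the table above.

```python
# Hypothetical helper (illustration only): undo the normalization and
# return the corner coordinates in pixels.
def yolo_to_voc(center_x, center_y, bb_width, bb_height, img_width, img_height):
    half_w = bb_width * img_width / 2
    half_h = bb_height * img_height / 2
    xmin = round(center_x * img_width - half_w)
    xmax = round(center_x * img_width + half_w)
    ymin = round(center_y * img_height - half_h)
    ymax = round(center_y * img_height + half_h)
    return xmin, xmax, ymin, ymax


# First row of the table: cars_170 with a 500x234 pixel image
corners = yolo_to_voc(0.663000, 0.521368, 0.430000, 0.384615, 500, 234)
print(corners)  # (224, 439, 77, 167)
```

The result matches the xmin, xmax, ymin and ymax values of that row in the original label file.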

Split Testing and Training Data

Divide the images into one set used for training and one used for testing:

# Take first 220 images for training
df_train = df.iloc[:220]
# Take remaining images for testing
df_test = df.iloc[220:]
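Taking the first 220 rows keeps whatever order the label file has. If that order is not random, a shuffled split may be preferable. The following is a minimal sketch of that idea, assuming a fixed seed for reproducibility. The name split_indices and its parameters are hypothetical, and the total of 225 rows is only an example value.

```python
import random


# Hypothetical helper (illustration only): shuffle the row positions
# and cut them into a training and a testing part.
def split_indices(n_rows, n_train, seed=42):
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


# Example with an assumed total of 225 rows
train_idx, test_idx = split_indices(n_rows=225, n_train=220)
print(len(train_idx), len(test_idx))  # 220 5

# With the dataframe from above this could be applied as:
# df_train = df.iloc[train_idx]
# df_test = df.iloc[test_idx]
```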

Create labels for the training images and copy everything into the training folder:

# Training Data
train_folder = '../data/train'

train_values = df_train[['image_path', 'center_x', 'center_y', 'bb_width', 'bb_height']].values

# Make sure the target folder exists
os.makedirs(train_folder, exist_ok=True)

# Create label and copy images to folder
for fname, x, y, w, h in train_values:
    # Get filename from filepath
    image_name = os.path.split(fname)[-1]
    # Remove file extension
    label_name = os.path.splitext(image_name)[0]

    # Copy training images to train folder
    dst_image_path = os.path.join(train_folder, image_name)
    copy(fname, dst_image_path)

    # Create image label file (class 0 = number plate)
    label_values = f'0 {x} {y} {w} {h}'
    label_path = os.path.join(train_folder, label_name + '.txt')
    # The with block closes the file automatically
    with open(label_path, mode='w') as f:
        f.write(label_values)

And repeat this step for the testing images:

# Testing Data
test_folder = '../data/test'

test_values = df_test[['image_path', 'center_x', 'center_y', 'bb_width', 'bb_height']].values

# Make sure the target folder exists
os.makedirs(test_folder, exist_ok=True)

# Create label and copy images to folder
for fname, x, y, w, h in test_values:
    # Get filename from filepath
    image_name = os.path.split(fname)[-1]
    # Remove file extension
    label_name = os.path.splitext(image_name)[0]

    # Copy testing images to test folder
    dst_image_path = os.path.join(test_folder, image_name)
    copy(fname, dst_image_path)

    # Create image label file (class 0 = number plate)
    label_values = f'0 {x} {y} {w} {h}'
    label_path = os.path.join(test_folder, label_name + '.txt')
    # The with block closes the file automatically
    with open(label_path, mode='w') as f:
        f.write(label_values)
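Since both loops are identical apart from the target folder, they could be folded into one function. The sketch below shows one way to do that. The name export_split is hypothetical, and the demo at the bottom uses a temporary folder and an empty placeholder file instead of the real images.

```python
import os
import tempfile
from shutil import copy


# Hypothetical helper (illustration only): copy each image into the
# folder and write a label file with the same base name next to it.
def export_split(values, folder, class_id=0):
    os.makedirs(folder, exist_ok=True)
    for fname, x, y, w, h in values:
        image_name = os.path.split(fname)[-1]
        label_name = os.path.splitext(image_name)[0]
        copy(fname, os.path.join(folder, image_name))
        label_path = os.path.join(folder, label_name + '.txt')
        with open(label_path, mode='w') as f:
            f.write(f'{class_id} {x} {y} {w} {h}')


# Demo with a placeholder file standing in for a real image
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'cars_170.jpeg')
    open(src, 'w').close()
    out_folder = os.path.join(tmp, 'train')
    export_split([(src, 0.663, 0.521368, 0.43, 0.384615)], out_folder)
    with open(os.path.join(out_folder, 'cars_170.txt')) as f:
        label_text = f.read()
    copied = os.path.exists(os.path.join(out_folder, 'cars_170.jpeg'))

print(label_text)  # 0 0.663 0.521368 0.43 0.384615
```

With the variables from above, the two calls would be export_split(train_values, train_folder) and export_split(test_values, test_folder).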

Plate Detection Flask App