OpenCV Meanshift Algorithm for Object Tracking
Meanshift
The idea behind the Meanshift algorithm is that every frame of the video is checked for its pixel distribution. We define an initial window, a region of interest (ROI), around the object we want to follow and describe it by its colour histogram. For every new frame, each pixel is weighted by how well its colour matches that histogram, and the algorithm moves the ROI towards the region where these weights are densest. In other words, it tries to maximize the overlap of the histogram inside the window with the original histogram of the area we selected. The direction of movement depends on the difference between the center of the tracking window and the centroid of all the pixels inside that window.
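The window update described above can be sketched in a few lines of NumPy. This is a minimal illustration, not OpenCV's implementation: the function name `mean_shift_window`, the synthetic `density` map and the start window are all made up for this example, and the per-pixel weights are assumed to be given already.

```python
import numpy as np

def mean_shift_window(weights, window, max_iter=10, eps=1.0):
    """Shift a fixed-size window towards the local centroid of `weights`.

    weights: 2D array of non-negative per-pixel weights.
    window:  (x, y, w, h) with integer top-left corner and size.
    """
    x, y, w, h = window
    rows, cols = weights.shape
    for _ in range(max_iter):
        patch = weights[y:y + h, x:x + w]
        total = patch.sum()
        if total == 0:
            break  # no weight inside the window, nothing to follow
        ys, xs = np.mgrid[0:patch.shape[0], 0:patch.shape[1]]
        # weighted centroid of the pixels inside the window
        cx = (xs * patch).sum() / total
        cy = (ys * patch).sum() / total
        # offset between that centroid and the geometric window center
        dx = int(round(cx - (patch.shape[1] - 1) / 2))
        dy = int(round(cy - (patch.shape[0] - 1) / 2))
        # move the window, keeping it inside the image
        new_x = min(max(x + dx, 0), cols - w)
        new_y = min(max(y + dy, 0), rows - h)
        moved = abs(new_x - x) + abs(new_y - y)
        x, y = new_x, new_y
        if moved < eps:
            break  # window has settled
    return x, y, w, h

# synthetic weight map: a 20x20 blob whose top-left corner is at x=60, y=40
density = np.zeros((120, 160), dtype=np.float64)
density[40:60, 60:80] = 1.0

# start with a window that only partly overlaps the blob
start = (45, 30, 20, 20)
result = mean_shift_window(density, start)
print(start, "->", result)
```

The window size never changes during the iteration. Only the top-left corner moves, which is the first limitation listed below.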
Disadvantages of using the Meanshift Algorithm:
- The size of the ROI remains the same irrespective of the distance of the object from the camera.
- The ROI tracks the object only if the object lies inside the initial bounding box we define.
Get your Videostream
Get your RTSP video stream input and define a region of interest for the Meanshift algorithm:
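# imports assumed by this walkthrough: OpenCV and the imutils VideoStream helper
import cv2
from imutils.video import VideoStream
# get video stream from IP camera
# (args is assumed to hold the parsed command line arguments, with the RTSP URL under "url")
print("[INFO] starting video stream")
vs = VideoStream(args["url"]).start()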
# first frame from stream
frame = vs.read()
# select region of interest
bbox = cv2.selectROI(frame)
x, y, w, h = bbox
track_window = (x, y, w, h)
# define area of bounding box as area of interest
roi = frame[y:y+h, x:x+w]
Histogram Calculation in OpenCV
The Meanshift algorithm uses the histogram of your region of interest to track the object you selected above. First we have to convert the ROI to the HSV colour space, calculate the histogram of its hue channel and normalize it:
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
# get histogram of channel [0] of the HSV image, the hue (OpenCV hue values run from 0 to 179)
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
# scale the histogram values to the range 0-255
roi_hist = cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
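To make the normalization step concrete, the following NumPy-only sketch applies the same kind of min-max scaling to a tiny hue histogram. The hue values are invented for illustration and `np.histogram` stands in for `cv2.calcHist`, so treat it as an approximation of the idea rather than of OpenCV's exact output.

```python
import numpy as np

# hypothetical hue values (0..179) of a small ROI, not taken from a real image
hue = np.array([[10, 10, 12],
                [10, 90, 90],
                [12, 10, 90]], dtype=np.uint8)

# 180-bin hue histogram, one bin per hue value
hist, _ = np.histogram(hue, bins=180, range=(0, 180))
hist = hist.astype(np.float32)

# min-max scaling: the smallest bin maps to 0, the largest bin to 255
scaled = (hist - hist.min()) / (hist.max() - hist.min()) * 255.0

print(scaled[10], scaled[12], scaled[90])
```

After scaling, the most frequent hue in the ROI always has the value 255, regardless of how many pixels the ROI contains.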
The histogram itself is calculated with the cv.calcHist() function. Let's go through the function and its parameters:
cv.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]])
- images : the source image of type uint8 or float32. It should be given in square brackets, i.e. [img].
- channels : also given in square brackets. It is the index of the channel for which we calculate the histogram. For example, if the input is a grayscale image, its value is [0]. For a colour image, you can pass [0], [1] or [2] to calculate the histogram of the blue, green or red channel respectively.
- mask : mask image. To find the histogram of the full image, it is given as "None". If you want the histogram of a particular region of the image, you have to create a mask image for that region and pass it as mask. (An example follows in the Histogram Plot section.)
- histSize : the bin count. It needs to be given in square brackets. For full scale, we pass [256] for RGB and [180] for the hue channel of HSV.
- ranges : the range of values. Normally it is [0, 256] for RGB and [0, 180] for the hue channel of HSV.
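import cv2 as cv
# imread returns a BGR image, so convert it to HSV before taking the hue histogram
img = cv.imread('image_hsv.jpg')
hsv = cv.cvtColor(img, cv.COLOR_BGR2HSV)
hist = cv.calcHist([hsv],[0],None,[180],[0, 180])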
Histogram Plot
with OpenCV
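import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
# load the image as grayscale (flag 0)
img = cv.imread('image_rgb.jpg',0)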
# create a mask
mask = np.zeros(img.shape[:2], np.uint8)
mask[100:300, 100:400] = 255
masked_img = cv.bitwise_and(img,img,mask = mask)
# Calculate histogram with mask and without mask
# Check third argument for mask
hist_full = cv.calcHist([img],[0],None,[256],[0,256])
hist_mask = cv.calcHist([img],[0],mask,[256],[0,256])
plt.subplot(221), plt.imshow(img, 'gray')
plt.subplot(222), plt.imshow(mask,'gray')
plt.subplot(223), plt.imshow(masked_img, 'gray')
plt.subplot(224), plt.plot(hist_full), plt.plot(hist_mask)
plt.xlim([0,256])
plt.show()
with Matplotlib
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('home.jpg')
color = ('b','g','r')
for i,col in enumerate(color):
    # 256 bins covering the values 0..255 of each colour channel
    histr = cv.calcHist([img],[i],None,[256],[0,256])
    plt.plot(histr,color = col)
    plt.xlim([0,256])
plt.show()
Apply the Meanshift Algorithm
Now that we have the ROI coordinates and the corresponding histogram, we can add a while loop that keeps fetching new frames from the video stream. We use OpenCV back projection to compare each incoming frame with the histogram of our ROI. The Meanshift tracking algorithm then uses the resulting density function to find the best match for the new coordinates of our region of interest:
# set up the termination criteria: stop after 10 iterations
# or when the window moves by less than 1 pt
parameter = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
# now loop through the rest of the available frames
# and use meanshift to track the defined roi
while True:
    # get next frame
    frame = vs.read()
    # stop when the stream delivers no more frames
    if frame is None:
        break
    # convert to hsv
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # back-project the roi histogram onto the hue channel of the current frame
    dst = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    # call meanShift() to find the match of the histogram in the current frame
    # and get the new coordinates; the first return value is the number of
    # iterations meanShift ran, and 0 is treated as a lost track here
    ok, track_window = cv2.meanShift(dst, (x, y, w, h), parameter)
    if not ok:
        print('[WARNING] track lost')
    # now update the roi coordinates to the new values
    x, y, w, h = track_window
    cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 255), 5)
    # display track
    cv2.imshow("Meanshift Track", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
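What the back projection dst from the loop above contains can be shown without a camera. The sketch below is NumPy only and uses an invented 6x8 hue image. It looks up, for every pixel, the histogram value of that pixel's hue, which is the idea behind cv2.calcBackProject for a single channel with 180 bins and a scale of 1. All values and names here are assumptions made for this example.

```python
import numpy as np

# hypothetical 6x8 hue image: background hue 100, a 2x3 object with hue 20
hue = np.full((6, 8), 100, dtype=np.uint8)
hue[2:4, 3:6] = 20

# hue histogram of the ROI that covers the object (rows 2..3, columns 3..5)
roi = hue[2:4, 3:6]
roi_hist = np.bincount(roi.ravel(), minlength=180).astype(np.float32)
roi_hist = roi_hist / roi_hist.max() * 255.0  # scale so the peak bin is 255

# back projection: replace every pixel by the histogram value of its hue
dst = roi_hist[hue]

# the brightest region of dst coincides with the object
ys, xs = np.nonzero(dst == dst.max())
print(dst)
```

Pixels whose hue is common in the ROI come out bright and everything else stays dark. The Meanshift window then climbs towards the brightest area of this map.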
A representation of the density function dst used by the Meanshift algorithm to track down the region of interest:
The generated coordinates can be used to draw a rectangle around the calculated new position of our selected object: