OpenCV Object Tracking

Shenzhen, China

Tracking objects in videos with OpenCV. See also pyimagesearch, and the code on GitHub.

Setting up OpenCV

python -m venv .env
source .env/bin/activate
python -m pip install --upgrade pip

Add a file dependencies.txt with all project pip dependencies:

# opencv-contrib-python already bundles the main modules; install only one OpenCV package
opencv-contrib-python

Install all dependencies with:

pip install -r dependencies.txt

Test installation of OpenCV by running the following Python script:

import cv2
print(cv2.__version__)

python scripts/main.py
4.5.5

Single Track

Select Tracking Algorithm

OpenCV includes 7 separate legacy object tracking implementations:

  1. BOOSTING Tracker: Based on the same algorithm used by Haar cascades (AdaBoost). Slow and doesn’t work very well.
  2. MIL Tracker: Better accuracy than BOOSTING tracker.
  3. KCF Tracker: Kernelized Correlation Filters. Faster than BOOSTING and MIL. Like BOOSTING and MIL, it does not handle full occlusion well.
  4. TLD Tracker: Tracking, Learning and Detection.
  5. MedianFlow Tracker: Does a nice job reporting failures; doesn't handle big changes in motion / lighting very well.
  6. MOSSE Tracker: Not as accurate as CSRT or KCF. Good choice for speed.
  7. CSRT Tracker: Discriminative Correlation Filter (with Channel and Spatial Reliability). More accurate than KCF but slightly slower.

In the following video (free stock footage downloaded from pexels.com) I select the woman in the brown coat in the background on the right. All algorithms do well up to the point where she is blocked from view; none of them reacquires her:

BOOSTING vs KCF

OpenCV - Select Tracking Algorithm

MOSSE vs CSRT

OpenCV - Select Tracking Algorithm

  1. CSRT: high tracking accuracy but lower FPS throughput.
  2. KCF: slightly higher FPS in my tests, but accuracy is supposed to be slightly worse than CSRT.
  3. MOSSE: for easy-to-track objects, when only speed matters.
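The FPS differences above depend on the machine and on the video, so it can help to measure them directly. The following is a minimal sketch and not part of the original project: `measure_fps` is a hypothetical helper that times any per-frame callable, for example a tracker's `update` method.

```python
import time


def measure_fps(update, frames):
    """Call update(frame) once per frame and return the achieved frames per second.

    `update` is any callable taking one frame (hypothetical helper, e.g. tracker.update).
    """
    count = 0
    start = time.perf_counter()
    for frame in frames:
        update(frame)
        count += 1
    elapsed = time.perf_counter() - start
    # guard against a zero interval on very coarse clocks
    return count / elapsed if elapsed > 0 else float('inf')
```

With a real tracker one could pass `tracker.update` and a list of already decoded frames. The result includes everything the callable does, so it is only a rough basis for comparing algorithms.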

Interestingly, the MOSSE algorithm seems to keep the ROI size constant, which must make it more difficult to follow the object when it moves towards or away from the camera:

OpenCV - Select Tracking Algorithm
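Whether a tracker rescales its ROI can also be checked from the bounding boxes it reports instead of by eye. A small sketch, assuming the boxes are collected as `(x, y, w, h)` tuples; the helper names and the sample values are made up for illustration and are not measured output:

```python
def roi_sizes(bboxes):
    """Return the set of distinct (width, height) pairs in a sequence of (x, y, w, h) boxes."""
    return {(int(w), int(h)) for (_x, _y, w, h) in bboxes}


def keeps_roi_size(bboxes):
    """True if every bounding box in the sequence has the same width and height."""
    return len(roi_sizes(bboxes)) <= 1


# illustrative values only, not measured output
fixed = [(100.0, 50.0, 84.0, 76.0), (96.0, 52.0, 84.0, 76.0)]
scaled = [(100.0, 50.0, 84.0, 76.0), (96.0, 52.0, 88.0, 79.0)]
print(keeps_roi_size(fixed))   # True
print(keeps_roi_size(scaled))  # False
```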

import cv2

tracker_types = ['BOOSTING', 'MIL', 'KCF', 'TLD', 'MEDIANFLOW', 'MOSSE', 'CSRT']
tracker_type = tracker_types[4]

if tracker_type == 'BOOSTING':
    tracker = cv2.legacy.TrackerBoosting_create()
elif tracker_type == 'MIL':
    tracker = cv2.legacy.TrackerMIL_create()
elif tracker_type == 'KCF':
    tracker = cv2.legacy.TrackerKCF_create()
elif tracker_type == 'TLD':
    tracker = cv2.legacy.TrackerTLD_create()
elif tracker_type == 'MEDIANFLOW':
    tracker = cv2.legacy.TrackerMedianFlow_create()
elif tracker_type == 'MOSSE':
    tracker = cv2.legacy.TrackerMOSSE_create()
elif tracker_type == 'CSRT':
    tracker = cv2.legacy.TrackerCSRT_create()

# Change tracker_type index to check if objects are created:
print(tracker)
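The if/elif chain above can also be written as a lookup table. This is only a sketch of an alternative: `TRACKER_FACTORIES` and `create_tracker` are hypothetical names, and the function expects `cv2.legacy` to be passed in, so it also reports a clearer error when the legacy trackers are missing from the installed OpenCV build.

```python
# constructor names as used in the if/elif chain of this article
TRACKER_FACTORIES = {
    'BOOSTING': 'TrackerBoosting_create',
    'MIL': 'TrackerMIL_create',
    'KCF': 'TrackerKCF_create',
    'TLD': 'TrackerTLD_create',
    'MEDIANFLOW': 'TrackerMedianFlow_create',
    'MOSSE': 'TrackerMOSSE_create',
    'CSRT': 'TrackerCSRT_create',
}


def create_tracker(tracker_type, legacy_module):
    """Return a new tracker for tracker_type.

    legacy_module is assumed to be cv2.legacy (hypothetical helper, not an OpenCV API).
    """
    try:
        factory_name = TRACKER_FACTORIES[tracker_type.upper()]
    except KeyError:
        raise ValueError(f'unknown tracker type: {tracker_type!r}') from None
    factory = getattr(legacy_module, factory_name, None)
    if factory is None:
        raise RuntimeError(f'{factory_name} not available, is opencv-contrib-python installed?')
    return factory()
```

In the script this would be called as `tracker = create_tracker(tracker_type, cv2.legacy)`.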

Load Video File

import sys

# load video
video = cv2.VideoCapture('resources/car_race_02.mp4')
if not video.isOpened():
    print('[ERROR] video file not loaded')
    sys.exit()
# capture first frame
ok, frame = video.read()
if not ok:
    print('[ERROR] no frame captured')
    sys.exit()
print('[INFO] video loaded and frame capture started')

Select Object to Track

Load the first frame from your video file and use the OpenCV Region of Interest selector to mark your object:

# print the instructions before the (blocking) selection window opens
print('[INFO] select ROI and press ENTER or SPACE')
print('[INFO] cancel selection by pressing C')
bbox = cv2.selectROI(frame)
print(bbox)

OpenCV Object Tracking

The bbox variable holds the bounding box selected with the ROI selector as (x, y, width, height), where x and y are the coordinates of the top-left corner:

(1028, 190, 84, 76)
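Since `cv2.selectROI` reports the box as top-left corner plus width and height, drawing functions such as `cv2.rectangle` need the opposite corner computed from it. A minimal sketch of that conversion; `bbox_to_corners` is a hypothetical helper name:

```python
def bbox_to_corners(bbox):
    """Convert (x, y, width, height) to ((x1, y1), (x2, y2)) integer corner points."""
    x, y, w, h = (int(v) for v in bbox)
    return (x, y), (x + w, y + h)


print(bbox_to_corners((1028, 190, 84, 76)))  # ((1028, 190), (1112, 266))
```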

Track the Object

Now we can start the object tracking based on our selected region of interest.

from random import randint

ok = tracker.init(frame, bbox)
if not ok:
    print('[ERROR] tracker not initialized')
    sys.exit()
print('[INFO] tracker was initialized on ROI')
# randomly generate a colour for the bounding box
colours = (randint(0, 255), randint(0, 255), randint(0, 255))
# loop through all frames of the video file
while True:
    ok, frame = video.read()
    if not ok:
        print('[INFO] end of video file reached')
        break
    # update position of ROI based on tracker prediction
    ok, bbox = tracker.update(frame)
    # test print coordinates of predicted bounding box for all frames
    print(ok, bbox)

This will loop through every frame of our video file and update the position of our bounding box based on the tracker prediction until the end of file is reached:

[INFO] select ROI and press ENTER or SPACE
[INFO] cancel selection by pressing C
[INFO] tracker was initialized on ROI
True (1013.0, 196.0, 89.0, 72.0)
True (1000.0, 201.0, 91.0, 74.0)
True (986.0, 205.0, 92.0, 75.0)

...

True (76.0, 1034.0, 67.0, 55.0)
[INFO] end of video file reached

Now we can use the predicted position of our bounding box to draw a rectangle around our tracked object:

    if ok:
        (x, y, w, h) = [int(v) for v in bbox]
        # use predicted bounding box coordinates to draw a rectangle
        cv2.rectangle(frame, (x, y), (x+w, y+h), colours, 3)
        cv2.putText(frame, str(tracker_type), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255))
    else:
        # if prediction failed and no bounding box coordinates are available
        cv2.putText(frame, 'No Track', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255))

    # display object track
    cv2.imshow('Single Track', frame)
    # press 'q' to break loop and close window
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# after the loop: release the video and close the window
video.release()
cv2.destroyAllWindows()
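When the tracker loses the object, the loop above only draws 'No Track' and carries on. One possible extension, sketched here as an assumption and not taken from the original project, is to count consecutive failed updates and re-select the ROI after too many misses. `TrackMonitor` and its default of 30 frames are hypothetical:

```python
class TrackMonitor:
    """Counts consecutive failed tracker updates and signals when to re-select the ROI."""

    def __init__(self, max_misses=30):
        # arbitrary default: roughly one second of video at 30 fps
        self.max_misses = max_misses
        self.misses = 0

    def update(self, ok):
        """Record one tracker result; return True once max_misses failures occurred in a row."""
        if ok:
            self.misses = 0
            return False
        self.misses += 1
        return self.misses >= self.max_misses
```

Inside the while loop one could then check `if monitor.update(ok):`, call `cv2.selectROI` again, and initialize a fresh tracker on the new box.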

Record the Output Video

Set the recording parameters:

import datetime

frame_width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
# keep the frame rate as a float so that e.g. 29.97 fps is not truncated
fps = video.get(cv2.CAP_PROP_FPS)
video_codec = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
# note: the recording/ directory has to exist already
prefix = 'recording/'+datetime.datetime.now().strftime("%y%m%d_%H%M%S")
basename = "object_track.mp4"
video_output = cv2.VideoWriter("_".join([prefix, basename]), video_codec, fps, (frame_width, frame_height))
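The output file name above combines a timestamp with a base name inside a `recording/` folder. As a sketch, the same naming scheme can be wrapped in a small function that also creates the folder if it is missing. `build_output_path` is a hypothetical helper; its `now` parameter exists only to make the result reproducible:

```python
import datetime
from pathlib import Path


def build_output_path(directory, basename, now=None):
    """Return <directory>/<yymmdd_HHMMSS>_<basename>, creating the directory if needed."""
    now = now or datetime.datetime.now()
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / f"{now.strftime('%y%m%d_%H%M%S')}_{basename}"
```

The result would be handed to the writer as a string, for example `cv2.VideoWriter(str(build_output_path('recording', 'object_track.mp4')), video_codec, fps, (frame_width, frame_height))`.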

Then write each frame inside the while loop, and call video_output.release() once the loop has finished:

video_output.write(frame)

User Input by Arguments

Parse arguments to select the tracking algorithm and video file:

import argparse

...

ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", type=str, default='resources/group_of_people_01.mp4', help="path to input video file")
ap.add_argument("-t", "--tracker", type=int, default=6, help="Select tracker [0-6]: boosting, mil, kcf, "
                "tld, medianflow, mosse, csrt")
args = vars(ap.parse_args())

tracker_types = ['BOOSTING', 'MIL', 'KCF', 'TLD', 'MEDIANFLOW', 'MOSSE', 'CSRT']
tracker_type = tracker_types[args["tracker"]]


...


video = cv2.VideoCapture(args["video"])
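To try the argument handling without starting the tracker, the parser can be put into a function and fed an explicit argument list. This is a sketch built on the arguments shown above; `build_parser` and `parse` are hypothetical names, and `choices` is my addition so that an index outside 0-6 is rejected by argparse instead of raising an IndexError later.

```python
import argparse

TRACKER_TYPES = ['BOOSTING', 'MIL', 'KCF', 'TLD', 'MEDIANFLOW', 'MOSSE', 'CSRT']


def build_parser():
    """Build the command line parser with the same options as in the article."""
    ap = argparse.ArgumentParser()
    ap.add_argument("-v", "--video", type=str, default='resources/group_of_people_01.mp4',
                    help="path to input video file")
    ap.add_argument("-t", "--tracker", type=int, default=6, choices=range(len(TRACKER_TYPES)),
                    help="Select tracker [0-6]: boosting, mil, kcf, tld, medianflow, mosse, csrt")
    return ap


def parse(argv=None):
    """Return (video_path, tracker_type) for the given argument list (sys.argv if None)."""
    args = vars(build_parser().parse_args(argv))
    return args["video"], TRACKER_TYPES[args["tracker"]]


print(parse(["-t", "2"]))  # ('resources/group_of_people_01.mp4', 'KCF')
```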

Multitrack and GOTURN

OpenCV Object Tracking