OpenCV Object Tracking
Tracking objects in videos with OpenCV - see also the pyimagesearch tutorials and the accompanying code on Github.
Setting up OpenCV
python -m venv .env
source .env/bin/activate
python -m pip install --upgrade pip
Add a file dependencies.txt with all project pip dependencies:
opencv-python
opencv-contrib-python
Install all dependencies with:
pip install -r dependencies.txt
Test installation of OpenCV by running the following Python script:
import cv2
print(cv2.__version__)
python scripts/main.py
4.5.5
Single Track
Select Tracking Algorithm
OpenCV includes 7 separate legacy object tracking implementations:
- BOOSTING Tracker: Based on the same algorithm used by Haar cascades (AdaBoost). Slow and doesn’t work very well.
- MIL Tracker: Better accuracy than BOOSTING tracker.
- KCF Tracker: Kernelized Correlation Filters. Faster than BOOSTING and MIL but, like MIL, does not handle full occlusion well.
- TLD Tracker: Tracking, Learning and Detection. Said to cope with occlusion, but notoriously prone to false positives.
- MedianFlow Tracker: Does a nice job reporting failures; doesn't handle big changes in motion / lighting very well.
- MOSSE Tracker: Not as accurate as CSRT or KCF. Good choice for speed.
- CSRT Tracker: Discriminative Correlation Filter (with Channel and Spatial Reliability). More accurate than KCF but slightly slower.
In the following video (free stock footage downloaded from pexels.com) I am selecting the woman in the brown coat in the background to the right. All algorithms track her well up to the point where she is blocked from view - none of them reacquires her afterwards:
BOOSTING vs KCF
MOSSE vs CSRT
- CSRT: high tracking accuracy but slower FPS throughput.
- KCF: the FPS I am getting is slightly higher, but accuracy is supposed to be slightly worse than CSRT.
- MOSSE: for easy to track objects - only speed matters.
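The speed trade-offs above are easy to measure in your own loop. Below is a minimal frames-per-second counter using only the Python standard library - the class name and interface are my own sketch, not part of OpenCV:

```python
import time

class FPSCounter:
    """Track the average frames-per-second of a processing loop."""

    def __init__(self):
        self._start = None
        self._frames = 0

    def start(self):
        # reset the clock and the frame count
        self._start = time.perf_counter()
        self._frames = 0
        return self

    def update(self):
        # call once per processed frame
        self._frames += 1

    def fps(self):
        # average frames per second since start()
        elapsed = time.perf_counter() - self._start
        return self._frames / elapsed if elapsed > 0 else 0.0
```

Call `start()` before the while loop, `update()` once per frame right after `tracker.update()`, and print `fps()` when the loop ends to compare the trackers on your own hardware.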
Interesting - the MOSSE algorithm seems to keep the ROI size constant, which must make it more difficult to recognize the object if it moves towards or away from the camera:
import cv2
tracker_types = ['BOOSTING', 'MIL', 'KCF', 'TLD', 'MEDIANFLOW', 'MOSSE', 'CSRT']
tracker_type = tracker_types[4]
if tracker_type == 'BOOSTING':
    tracker = cv2.legacy.TrackerBoosting_create()
elif tracker_type == 'MIL':
    tracker = cv2.legacy.TrackerMIL_create()
elif tracker_type == 'KCF':
    tracker = cv2.legacy.TrackerKCF_create()
elif tracker_type == 'TLD':
    tracker = cv2.legacy.TrackerTLD_create()
elif tracker_type == 'MEDIANFLOW':
    tracker = cv2.legacy.TrackerMedianFlow_create()
elif tracker_type == 'MOSSE':
    tracker = cv2.legacy.TrackerMOSSE_create()
elif tracker_type == 'CSRT':
    tracker = cv2.legacy.TrackerCSRT_create()

# Change tracker_type index to check if objects are created:
print(tracker)
Load Video File
import sys

video = cv2.VideoCapture('resources/car_race_02.mp4')

# load video
if not video.isOpened():
    print('[ERROR] video file not loaded')
    sys.exit()

# capture first frame
ok, frame = video.read()
if not ok:
    print('[ERROR] no frame captured')
    sys.exit()

print('[INFO] video loaded and frame capture started')
Select Object to Track
Load the first frame from your video file and use the OpenCV Region of Interest selector to mark your object:
print('[INFO] select ROI and press ENTER or SPACE')
print('[INFO] cancel selection by pressing C')

bbox = cv2.selectROI(frame)
print(bbox)
The bbox variable holds the top-left corner point plus the width and height of the bounding box selected with the ROI selector, in the form (x, y, w, h):
(1028, 190, 84, 76)
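With boxes in this (x, y, w, h) format you can, for example, score a tracker's prediction against a hand-labelled ground-truth box. The helpers below are my own sketch (not part of OpenCV); they convert a box to corner points and compute the intersection-over-union of two boxes:

```python
def to_corners(bbox):
    """Convert an (x, y, w, h) box to (x1, y1, x2, y2) corner points."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)

def iou(box_a, box_b):
    """Intersection-over-union of two (x, y, w, h) boxes, in [0, 1]."""
    ax1, ay1, ax2, ay2 = to_corners(box_a)
    bx1, by1, bx2, by2 = to_corners(box_b)
    # overlap rectangle (empty if the boxes are disjoint)
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0
```

An IoU of 1.0 means a perfect overlap; values above roughly 0.5 are commonly counted as a successful track.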
Track the Object
Now we can start the object tracking based on our selected region of interest.
from random import randint

ok = tracker.init(frame, bbox)
if not ok:
    print('[ERROR] tracker not initialized')
    sys.exit()
print('[INFO] tracker was initialized on ROI')

# randomly generate a colour for the bounding box
colours = (randint(0, 255), randint(0, 255), randint(0, 255))

# loop through all frames of the video file
while True:
    ok, frame = video.read()
    if not ok:
        print('[INFO] end of video file reached')
        break
    # update position of ROI based on tracker prediction
    ok, bbox = tracker.update(frame)
    # test print coordinates of predicted bounding box for all frames
    print(ok, bbox)
This will loop through every frame of our video file and update the position of our bounding box based on the tracker prediction until the end of file is reached:
[INFO] select ROI and press ENTER or SPACE
[INFO] cancel selection by pressing C
[INFO] tracker was initialized on ROI
True (1013.0, 196.0, 89.0, 72.0)
True (1000.0, 201.0, 91.0, 74.0)
True (986.0, 205.0, 92.0, 75.0)
...
True (76.0, 1034.0, 67.0, 55.0)
[INFO] end of video file reached
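Since every prediction is an (x, y, w, h) tuple, consecutive predictions also give you the object's per-frame motion. A small sketch (the helper names are my own, not OpenCV API) that computes the displacement of the box center between two frames:

```python
def center(bbox):
    """Center point of an (x, y, w, h) box."""
    x, y, w, h = bbox
    return (x + w / 2, y + h / 2)

def displacement(prev_bbox, curr_bbox):
    """Euclidean distance in pixels between the centers of two boxes."""
    (px, py) = center(prev_bbox)
    (cx, cy) = center(curr_bbox)
    return ((cx - px) ** 2 + (cy - py) ** 2) ** 0.5
```

Applied to the first two predictions above, the tracked woman moves about 13 pixels between frames; dividing by the frame interval would give a speed in pixels per second.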
Now we can use the predicted position of our bounding box to draw a rectangle around our tracked object. This block goes inside the while loop, right after the tracker.update() call:

if ok:
    (x, y, w, h) = [int(v) for v in bbox]
    # use predicted bounding box coordinates to draw a rectangle
    cv2.rectangle(frame, (x, y), (x + w, y + h), colours, 3)
    cv2.putText(frame, str(tracker_type), (10, 30), cv2.QT_FONT_NORMAL, 1, (255, 255, 255))
else:
    # if prediction failed and no bounding box coordinates are available
    cv2.putText(frame, 'No Track', (10, 30), cv2.QT_FONT_NORMAL, 1, (0, 0, 255))

# display object track
cv2.imshow('Single Track', frame)

# press 'q' to break the loop and close the window
if cv2.waitKey(1) & 0xFF == ord('q'):
    break
Record the Output Video
Set the recording parameters:

import datetime

frame_width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(video.get(cv2.CAP_PROP_FPS))
video_codec = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
prefix = 'recording/' + datetime.datetime.now().strftime("%y%m%d_%H%M%S")
basename = "object_track.mp4"
video_output = cv2.VideoWriter("_".join([prefix, basename]), video_codec, fps, (frame_width, frame_height))
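cv2.VideoWriter_fourcc does nothing more than pack the four character codes into a little-endian 32-bit integer - 'mp4v' selects the MPEG-4 codec. The packing can be reproduced in plain Python, which is handy for checking codec values without OpenCV installed:

```python
def fourcc(c1, c2, c3, c4):
    """Pack four character codes into a 32-bit FOURCC integer (little-endian)."""
    return ord(c1) | (ord(c2) << 8) | (ord(c3) << 16) | (ord(c4) << 24)
```

Which codec/container pairs actually work depends on your OpenCV build; 'm', 'p', '4', 'v' with an .mp4 file is a common, broadly supported choice.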
And write each frame to the output file inside the while loop:

video_output.write(frame)

After the loop ends, release the writer so the video file is finalized:

video_output.release()
User Input by Arguments
Parse arguments to select the tracking algorithm and video file:
import argparse
...
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", type=str, default='resources/group_of_people_01.mp4',
                help="path to input video file")
ap.add_argument("-t", "--tracker", type=int, default=6,
                help="Select tracker [0-6]: boosting, mil, kcf, tld, medianflow, mosse, csrt")
args = vars(ap.parse_args())

tracker_types = ['BOOSTING', 'MIL', 'KCF', 'TLD', 'MEDIANFLOW', 'MOSSE', 'CSRT']
tracker_type = tracker_types[args["tracker"]]
...
video = cv2.VideoCapture(args["video"])
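An out-of-range --tracker value would currently fail with a bare IndexError when indexing tracker_types. A small validation helper (hypothetical, not in the original script) turns that into a readable error message:

```python
def pick_tracker(index, tracker_types=('BOOSTING', 'MIL', 'KCF', 'TLD',
                                       'MEDIANFLOW', 'MOSSE', 'CSRT')):
    """Validate the --tracker index before using it and return the tracker name."""
    if not 0 <= index < len(tracker_types):
        raise ValueError(
            f"--tracker must be in [0-{len(tracker_types) - 1}], got {index}")
    return tracker_types[index]
```

Replacing the direct indexing with tracker_type = pick_tracker(args["tracker"]) tells the user exactly which range is allowed instead of printing a traceback.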