OpenCV Crash Course Part I

Shenzhen, China

Murtaza's Workshop - Robotics and AI: We are going to learn everything required to get started with OpenCV in Python. We will be using Python since it is one of the most popular programming languages and has opened up numerous job opportunities in various sectors. We will start from the installation process and work right up to creating exciting projects such as detecting colors, shapes, humans and even vehicle number plates. So if you are a beginner, don't worry: this course is for you. We will skip all the boring theory and focus on the practical implementation, so you can get the computer vision skill set you have always wanted on your CV. By the end of the course you will be familiar with the core principles of OpenCV and able to apply different techniques to solve real-world problems using computer vision.

Setup

python --version
Python 3.9.7
pip --version
pip 21.3.1
mkdir /opt/Python/opencv-python
cd /opt/Python/opencv-python

Open the project folder in PyCharm Community Edition and add opencv-python to your virtual environment - this will also install NumPy as a dependency.

Read Images, Videos, Webcam

Load and Display an Image for 5 Seconds

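import cv2

img = cv2.imread('resources/shenzhen-subway.jpg')  # returns None if the file cannot be read
cv2.imshow("Output", img)
cv2.waitKey(5000)  # keep the window open for 5000 ms or until a key is pressed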

Open a video file and display it frame by frame until the q key is pressed or the video ends

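import cv2

cap = cv2.VideoCapture('resources/sz-office.mp4')

while True:
    success, vid = cap.read()
    if not success:  # end of the file or a read error
        break
    cv2.imshow('Video', vid)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()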

Webcams on Linux

Verify that your webcam is working in VLC - if more than one camera is connected, increment the device number (/dev/video1, /dev/video2, ...):

vlc v4l2:///dev/video0:chroma=mjpg:width=1920:height=1080

VLC media player 3.0.16 Vetinari (revision 3.0.13-8-g41878ff4f2)
[000056127761b7b0] main libvlc: Running vlc with the default interface. Use 'cvlc' to use vlc without interface.
[00007f2700006380] gl gl: Initialized libplacebo v3.120.3 (API v120)
[00007f2700006380] glconv_vaapi_x11 gl error: vaCreateSurfaces: attribute not supported
[00007f270809e7e0] main video output error: video output creation failed
[00007f27040096c0] main decoder error: failed to create video output
[00007f2700498530] gl gl: Initialized libplacebo v3.120.3 (API v120)
QObject::~QObject: Timers cannot be stopped from another thread

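import cv2

cap = cv2.VideoCapture(0)  # capture video source zero, /dev/video0
cap.set(3, 1920)  # property 3: frame width
cap.set(4, 1080)  # property 4: frame height

while True:
    success, vid = cap.read()
    if not success:  # the camera delivered no frame
        break
    cv2.imshow('Video', vid)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()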

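For reference, the first argument of cap.set() is the numeric ID of a capture property. The IDs are listed below; in Python the same values are available as named constants without the CV_ prefix, for example cv2.CAP_PROP_FRAME_WIDTH for 3: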

ID  Property                   Description
0   CV_CAP_PROP_POS_MSEC       Current position of the video file in milliseconds.
1   CV_CAP_PROP_POS_FRAMES     0-based index of the frame to be decoded/captured next.
2   CV_CAP_PROP_POS_AVI_RATIO  Relative position of the video file.
3   CV_CAP_PROP_FRAME_WIDTH    Width of the frames in the video stream.
4   CV_CAP_PROP_FRAME_HEIGHT   Height of the frames in the video stream.
5   CV_CAP_PROP_FPS            Frame rate.
6   CV_CAP_PROP_FOURCC         4-character code of codec.
7   CV_CAP_PROP_FRAME_COUNT    Number of frames in the video file.
8   CV_CAP_PROP_FORMAT         Format of the Mat objects returned by retrieve().
9   CV_CAP_PROP_MODE           Backend-specific value indicating the current capture mode.
10  CV_CAP_PROP_BRIGHTNESS     Brightness of the image (only for cameras).
11  CV_CAP_PROP_CONTRAST       Contrast of the image (only for cameras).
12  CV_CAP_PROP_SATURATION     Saturation of the image (only for cameras).
13  CV_CAP_PROP_HUE            Hue of the image (only for cameras).
14  CV_CAP_PROP_GAIN           Gain of the image (only for cameras).
15  CV_CAP_PROP_EXPOSURE       Exposure (only for cameras).
16  CV_CAP_PROP_CONVERT_RGB    Boolean flag indicating whether images should be converted to RGB.
17  CV_CAP_PROP_WHITE_BALANCE  Currently unsupported.
18  CV_CAP_PROP_RECTIFICATION  Rectification flag for stereo cameras.

Image Grayscale, Blur and Edge Detection

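import cv2
import numpy as np

img = cv2.imread('resources/shenzhen-subway.jpg')
kernel = np.ones((5, 5), np.uint8)  # 5x5 structuring element for dilation and erosion

imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
imgBlur = cv2.GaussianBlur(imgGray, (9, 9), 0)  # 9x9 kernel; sigma 0 is derived from the kernel size
imgLowEdge = cv2.Canny(img, 100, 100)  # low thresholds: more edges are kept
imgHighEdge = cv2.Canny(img, 200, 300)  # high thresholds: only strong edges are kept
imgDilation = cv2.dilate(imgLowEdge, kernel, iterations=5)  # thicken the edges
imgEroded = cv2.erode(imgDilation, kernel, iterations=3)  # thin them again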

cv2.imshow("Grayscale Image", imgGray)
cv2.imshow("Blurred Image", imgBlur)
cv2.imshow("Low Edge Threshold Image", imgLowEdge)
cv2.imshow("High Edge Threshold Image", imgHighEdge)
cv2.imshow("Increased Edge Thickness", imgDilation)
cv2.imshow("Decreased Edge Thickness", imgEroded)
cv2.waitKey(5000)

Joining Images

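Opening a separate window for every image quickly becomes messy. NumPy offers a solution: np.hstack and np.vstack join images horizontally or vertically, provided they have matching heights (for hstack) or widths (for vstack) and the same number of channels: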

import cv2
import numpy as np

img = cv2.imread('resources/shenzhen-subway.jpg')
kernel = np.ones((5, 5), np.uint8)

imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
imgBlur = cv2.GaussianBlur(imgGray, (9, 9), 0)
imgLowEdge = cv2.Canny(img, 100, 100)
imgHighEdge = cv2.Canny(img, 200, 300)
imgDilation = cv2.dilate(imgLowEdge, kernel, iterations=5)
imgEroded = cv2.erode(imgDilation, kernel, iterations=3)

imgVer = np.vstack((imgGray, imgBlur))
imgHor = np.hstack((imgLowEdge, imgHighEdge, imgDilation, imgEroded))

cv2.imshow("Horizontal Stack", imgHor)
cv2.imshow("Vertical Stack", imgVer)
cv2.waitKey(5000)

Horizontal Stack

Vertical Stack

Image Cropping and Resizing

Get the image dimensions:

import cv2

img = cv2.imread('resources/shenzhen-subway.jpg')

print(img.shape)

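The output is (683, 1024, 3), which stands for a height of 683 pixels, a width of 1024 pixels and 3 colour channels (BGR). We can now use the OpenCV resize function to change the dimensions to 300x200. Note that cv2.resize expects the target size as (width, height), the reverse of the order in img.shape: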

import cv2

img = cv2.imread('resources/shenzhen-subway.jpg')
print(img.shape)
imgResize = cv2.resize(img, (300, 200))
print(imgResize.shape)

cv2.imshow("Resized Image", imgResize)
cv2.waitKey(5000)

Running the script gives us the dimensions:

python chapter_01.py

(683, 1024, 3)
(200, 300, 3)
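Cropping needs no OpenCV function at all: an image is a NumPy array, so a crop is a slice. The sketch below uses a synthetic array with the same dimensions as the tutorial image; the crop region is an arbitrary example and not taken from the original post.

```python
import numpy as np

# Synthetic stand-in with the same dimensions as the tutorial image.
img = np.zeros((683, 1024, 3), np.uint8)

# Rows (y) come first, columns (x) second: img[y_start:y_end, x_start:x_end].
imgCropped = img[100:400, 200:800]
print(imgCropped.shape)  # (300, 600, 3)

# Slicing returns a view on the original pixels; copy() makes it independent.
imgCroppedCopy = imgCropped.copy()
imgCroppedCopy[:] = 255
print(int(img.max()))  # 0, the original is unchanged
```

Without the copy(), writing into the crop would also change the pixels of the source image.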

Adding Shapes and Text

import cv2
import numpy as np

img = np.zeros((512, 512, 3), np.uint8) # 512x512 black image background with 3 colour channels BGR
imageDim = str(img.shape[0]) + ' : ' + str(img.shape[1]) # String of image dimensions

img[:] = 255, 0, 0 # Colour entire image blue
img[112:400, 112:400] = 0, 255, 0 # Add a green square in the middle

# cv2.line(img, (0, 0), (512, 512), (0, 0, 255), 3) # Draw a line from the top left to bottom right
# cv2.line(img, (512, 0), (0, 512), (0, 0, 255), 3) # Draw a line from the top right to bottom left
# Do the same thing - but for an image of unknown dimensions
# with img.shape[1]=width and img.shape[0]=height
cv2.line(img, (0, 0), (img.shape[1], img.shape[0]), (0, 0, 255), 3)
cv2.line(img, (img.shape[1], 0), (0, img.shape[0]), (0, 0, 255), 3)

cv2.rectangle(img, (3, 3), (509, 509), (0, 0, 255), 5) # Draw a rectangle around the image
cv2.rectangle(img, (212, 212), (300, 300), (0, 0, 255), cv2.FILLED) # Draw square in the middle

cv2.circle(img, (256, 256), 30, (0, 255, 255), cv2.FILLED) # Add a circle in the middle

cv2.putText(img, imageDim, (180, 70), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 1) # Print image dimensions (putText expects a Hershey font constant)

cv2.imshow("Image", img)
cv2.waitKey(5000)
