
YOLOv7 Introduction

Guangzhou, China

Citation: YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao: YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS and has the highest accuracy 56.8% AP among all known real-time object detectors with 30 FPS or higher on GPU V100. YOLOv7-E6 object detector (56 FPS V100, 55.9% AP) outperforms both transformer-based detector SWIN-L Cascade-Mask R-CNN (9.2 FPS A100, 53.9% AP) by 509% in speed and 2% in accuracy, and convolutional-based detector ConvNeXt-XL Cascade-Mask R-CNN (8.6 FPS A100, 55.2% AP) by 551% in speed and 0.7% AP in accuracy, as well as YOLOv7 outperforms: YOLOR, YOLOX, Scaled-YOLOv4, YOLOv5, DETR, Deformable DETR, DINO-5scale-R50, ViT-Adapter-B and many other object detectors in speed and accuracy. Moreover, we train YOLOv7 only on MS COCO dataset from scratch without using any other datasets or pre-trained weights. GitHub

Run YOLOv7 on Arch Linux

Clone the Repository

git clone https://github.com/mpolinowski/yolov7.git
cd yolov7

Install all Dependencies

The repository comes with a requirements.txt that you can install globally or inside a virtual environment with pip3 install -r requirements.txt. But I am going to use Pipenv instead:

pipenv --version
pipenv, version 2022.12.19

pipenv install -r requirements.txt
Creating a Pipfile for this project...
Requirements file provided! Importing into Pipfile...
Pipfile.lock not found, creating...
Installing dependencies from Pipfile.lock (9ec603)...
To activate this project's virtualenv, run pipenv shell.
Alternatively, run a command inside the virtualenv with pipenv run.

Pipfile

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
matplotlib = ">=3.2.2"
numpy = "<1.24.0,>=1.18.5"
opencv-python = ">=4.1.1"
pillow = ">=7.1.2"
pyyaml = ">=5.3.1"
requests = ">=2.23.0"
scipy = ">=1.4.1"
torch = ">=1.7.0,!=1.12.0"
torchvision = ">=0.8.1,!=0.13.0"
tqdm = ">=4.41.0"
protobuf = "<4.21.3"
tensorboard = ">=2.4.1"
pandas = ">=1.1.4"
seaborn = ">=0.11.0"
ipython = "*"
psutil = "*"
thop = "*"

[dev-packages]

[requires]
python_version = "3.10"

UPDATE: I ran into some issues changing the torch and torchvision versions here to their CUDA counterparts. Since I had all the dependencies installed globally anyway, I dropped the virtual environment and made sure that PyTorch was installed with GPU support:

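You can verify the GPU build directly from Python - a minimal sanity check, assuming a CUDA-enabled torch package and an NVIDIA GPU are present:

import torch

print(torch.__version__)          # wheels from the CUDA index carry a suffix, e.g. 1.13.1+cu117
print(torch.cuda.is_available())  # True once the GPU-enabled build is installed

if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the first visible GPU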

You can empty the virtual environment with pipenv uninstall --all and then run all scripts without the pipenv run prefix if you encounter the same issue:

python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source hongkong.jpg


Download pre-trained Weights

The weights we can use are linked in the repo README:

wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt

Test the Model

The README also lists the two commands we can use to try out the pre-trained model - one for video and one for still images:

python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source yourvideo.mp4
python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source inference/images/horses.jpg

Image Files

Since I used Pipenv, I have to prefix the python command with pipenv run:

pipenv run python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source harbin.jpeg

21 persons, 7 cars, 1 bus, 1 truck, 4 traffic lights, 1 umbrella, 3 handbags, Done. (49.1ms) Inference, (11.6ms) NMS
The image with the result is saved in: runs/detect/exp/harbin.jpeg
Done. (0.786s)


Video Files

pipenv run python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source video.mp4


Video Streaming

According to the documentation you can select a connected webcam on your system as the video source by passing its device index, e.g. --source 0, --source 1, etc.:

pipenv run python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --view-img --nosave --source 0

But a look at the detect.py file shows that it also accepts RTSP streams from IP cameras as a source:

webcam = source.isnumeric() or source.endswith('.txt') or source.lower().startswith(
    ('rtsp://', 'rtmp://', 'http://', 'https://'))
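
These network sources end up being opened with OpenCV, so before pointing detect.py at a camera you can sanity-check the stream with a few lines of Python - a minimal sketch, assuming opencv-python is installed and the RTSP URL below matches your camera:

import cv2

# the same RTSP URL that will be passed to detect.py --source
source = 'rtsp://admin:instar@192.168.2.120:554/livestream/13'

cap = cv2.VideoCapture(source)  # open the network stream
ret, frame = cap.read()         # try to grab a single frame
print('stream opened:', cap.isOpened(), '/ frame received:', ret)
cap.release()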

So let's try this with an INSTAR WQHD IP camera:

pipenv run python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --view-img --nosave --source rtsp://admin:instar@192.168.2.120:554/livestream/13
