DnnOpenVinoDetector C++ library
v1.3.0
Table of contents
- Overview
- Versions
- Library files
- Key features and capabilities
- Supported pixel formats
- Library principles
- DnnOpenVinoDetector class description
- DnnOpenVinoDetector class declaration
- getVersion method
- initObjectDetector method
- setParam method
- getParam method
- getParams method
- executeCommand method
- detect method
- setMask method
- decodeAndExecuteCommand method
- encodeSetParamCommand method of ObjectDetector class
- encodeCommand method of ObjectDetector class
- decodeCommand method of ObjectDetector class
- Data structures
- ObjectDetectorParams class description
- Build and connect to your project
- Example
- Benchmark
- Demo application
Overview
The DnnOpenVinoDetector C++ library performs automatic object detection in video using neural networks. The library is implemented in the C++17 standard and uses Intel’s OpenVINO™ runtime (tested with version 2023.3.0). It is optimized and tested for the Ultralytics YOLOv8 neural network model and provides a simple interface. The library configures the OpenVINO™ runtime, performs frame preprocessing (pixel format conversion and scaling), passes the frame to OpenVINO™ for computation and interprets the results. The library inherits its interface from the ObjectDetector class (provides interface for object detectors, source code included, Apache 2.0 license). The library depends on the OpenCV library (version >=4.5.0, linked, Apache 2.0 license) for image processing operations (scaling, pixel format conversion etc.). Additionally, the demo application depends on the SimpleFileDialog library (provides function to open video file via file dialog, source code included, Apache 2.0 license).
Versions
Table 1 - Library versions.
Version | Release date | What’s new |
---|---|---|
1.0.0 | 27.09.2023 | First version. |
1.0.1 | 04.01.2024 | - Demo application updated. - Documentation updated. |
1.1.0 | 09.01.2024 | - Updated with new interface of ObjectDetector with class names in parameters structure. - Added github actions with automatic install of OpenCV and OpenVino and build on linux. |
1.1.0 | 29.03.2024 | - ObjectDetector class updated. - Image preprocessing updated. - Code structure changes for YOLOv8 support. - Demo application updated. - Documentation updated. |
1.2.0 | 15.04.2024 | - Tracker added. Provides unique object ID from frame to frame. - Documentation updated. |
1.2.1 | 20.05.2024 | - Submodules updated. - Documentation updated. |
1.3.0 | 25.07.2024 | - CMake structure updated. - Implementation folder added. |
Library files
The library is supplied as source code only. The user is provided with a set of files in the form of a CMake project (repository). The repository structure is shown below:
CMakeLists.txt ----------------------- Main CMake file of the library.
3rdparty ----------------------------- Folder with third-party libraries.
CMakeLists.txt ------------------- CMake file to include third-party libraries.
ObjectDetector ------------------- Folder with ObjectDetector library source code.
src ---------------------------------- Folder with library source code.
CMakeLists.txt ------------------- CMake file of the library.
DnnOpenVinoDetector.cpp ---------- C++ class definition file.
DnnOpenVinoDetector.h ------------ Main library header file.
DnnOpenVinoDetectorVersion.h ----- Header file with library version.
DnnOpenVinoDetectorVersion.h.in -- Service CMake file to generate version header.
impl
BoxTracker.h ----------------- Header file of box tracker library.
BoxTracker.cpp --------------- C++ implementation file of box tracker library.
DnnOpenVinoDetectorImpl.h ---- Implementation header file.
DnnOpenVinoDetectorImpl.cpp -- C++ implementation file.
demo --------------------------------- Folder for demo application.
CMakeLists.txt ------------------- CMake file for demo application.
3rdparty ------------------------- Folder with third-party libraries.
CMakeLists.txt --------------- CMake file to include third-party libraries.
SimpleFileDialog ------------- Folder with SimpleFileDialog library source code.
main.cpp ------------------------- Source C++ file of demo application.
example ------------------------------ Folder for example application.
CMakeLists.txt ------------------- CMake file of example application.
main.cpp ------------------------- Source C++ file of example application.
benchmark ---------------------------- Folder with benchmark.
CMakeLists.txt ------------------- CMake file of benchmark.
main.cpp ------------------------- Source C++ file of benchmark.
Key features and capabilities
Table 2 - Key features and capabilities.
Parameter and feature | Description |
---|---|
Programming language | C++ (standard C++17) using the OpenVINO™ library (version 2023.3.0) and OpenCV library (version >=4.5.0). |
Supported OS | Compatible with any operating system that supports the C++ compiler (C++17 standard), the OpenVINO™ library (version 2023.3.0) and OpenCV library (version >=4.5.0). |
Shape of detected objects | The library is capable of detecting moving objects of various shape. The minimum and maximum height and width of the objects to be detected are set by the user in the library parameters. |
Supported pixel formats | RGB24, BGR24 and YUV24. The library uses the RGB24 format for video processing. If the pixel format of the image is different from RGB24, the library pre-converts the pixel formats to RGB24. |
Maximum and minimum video frame size | The minimum size of video frames to be processed is 32x32 pixels, and the maximum size is 8192x8192 pixels. The size of the video frames to be processed doesn’t have a significant impact on the computational speed, because input images are resized to input network size (usually 640x640 pixels). |
Coordinate system | The algorithm uses a window coordinate system with the zero point in the upper left corner of the video frame. |
Calculation speed | The processing time per video frame depends mostly on loaded neural network model and also on the computing platform used. The processing time per video frame can be estimated with the demo application. |
Type of algorithm for detection of objects | To detect objects on the current frame, the library implements an interface for loading and running a neural network model. This interface relies on the OpenVINO™ implementation and uses its features: pre-processing of input data, reading and launching the neural network model and obtaining output results. Overlapping boxes are removed from the output data and the correct types are assigned to the output vector of objects. |
Discreteness of computation of coordinates | The library uses a bounding box for each detected object. If boxes overlap and have the same object type, they are merged into one combined box. This means the discreteness of the library depends strongly on the loaded neural network model. |
Working conditions | The library is designed to function on a variety of devices. It is optimized for GPU hardware support, which can significantly enhance processing speed. The detector processes each frame independently, so camera movement does not impact the results. |
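The merging rule from Table 2 (overlapping boxes of the same object type are combined into one) can be illustrated with a short sketch. It is not the library's implementation: the Box type and the mergeOverlapping function are hypothetical names, and keeping the highest probability of the merged group is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical box type used only for this sketch.
struct Box
{
    int x, y, width, height; // Top-left corner and size, pixels.
    int type;                // Class label index.
    float p;                 // Detection probability.
};

// True if the two rectangles share at least one pixel.
bool overlaps(const Box& a, const Box& b)
{
    return a.x < b.x + b.width  && b.x < a.x + a.width &&
           a.y < b.y + b.height && b.y < a.y + a.height;
}

// Merge overlapping boxes of the same type into their bounding union.
// Assumption: the merged box keeps the highest probability of the group.
std::vector<Box> mergeOverlapping(std::vector<Box> boxes)
{
    bool merged = true;
    while (merged)
    {
        merged = false;
        for (std::size_t i = 0; i < boxes.size() && !merged; ++i)
        {
            for (std::size_t j = i + 1; j < boxes.size(); ++j)
            {
                if (boxes[i].type != boxes[j].type || !overlaps(boxes[i], boxes[j]))
                    continue;
                const int x0 = std::min(boxes[i].x, boxes[j].x);
                const int y0 = std::min(boxes[i].y, boxes[j].y);
                const int x1 = std::max(boxes[i].x + boxes[i].width,  boxes[j].x + boxes[j].width);
                const int y1 = std::max(boxes[i].y + boxes[i].height, boxes[j].y + boxes[j].height);
                boxes[i] = {x0, y0, x1 - x0, y1 - y0, boxes[i].type,
                            std::max(boxes[i].p, boxes[j].p)};
                boxes.erase(boxes.begin() + static_cast<std::ptrdiff_t>(j));
                merged = true; // Restart: the grown box may now touch others.
                break;
            }
        }
    }
    return boxes;
}
```

The restart after each merge keeps the sketch simple at the cost of speed; with the small number of boxes a detector typically returns per frame this is unlikely to matter.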
Supported pixel formats
Frame library (included in DnnOpenVinoDetector library) contains Fourcc enum, which defines supported pixel formats (Frame.h file). DnnOpenVinoDetector library supports BGR24, RGB24 and YUV24 input pixel formats. The library uses the RGB24 format for video processing. If the pixel format of the image is different from RGB24, the library pre-converts the pixel formats to RGB24. Fourcc enum declaration:
enum class Fourcc
{
/// RGB 24bit pixel format.
RGB24 = MAKE_FOURCC_CODE('R', 'G', 'B', '3'),
/// BGR 24bit pixel format.
BGR24 = MAKE_FOURCC_CODE('B', 'G', 'R', '3'),
/// YUYV 16bits per pixel format.
YUYV = MAKE_FOURCC_CODE('Y', 'U', 'Y', 'V'),
/// UYVY 16bits per pixel format.
UYVY = MAKE_FOURCC_CODE('U', 'Y', 'V', 'Y'),
/// Grayscale 8bit.
GRAY = MAKE_FOURCC_CODE('G', 'R', 'A', 'Y'),
/// YUV 24bit per pixel format.
YUV24 = MAKE_FOURCC_CODE('Y', 'U', 'V', '3'),
/// NV12 pixel format.
NV12 = MAKE_FOURCC_CODE('N', 'V', '1', '2'),
/// NV21 pixel format.
NV21 = MAKE_FOURCC_CODE('N', 'V', '2', '1'),
/// YU12 (YUV420) - Planar pixel format.
YU12 = MAKE_FOURCC_CODE('Y', 'U', '1', '2'),
/// YV12 (YVU420) - Planar pixel format.
YV12 = MAKE_FOURCC_CODE('Y', 'V', '1', '2'),
/// JPEG compressed format.
JPEG = MAKE_FOURCC_CODE('J', 'P', 'E', 'G'),
/// H264 compressed format.
H264 = MAKE_FOURCC_CODE('H', '2', '6', '4'),
/// HEVC compressed format.
HEVC = MAKE_FOURCC_CODE('H', 'E', 'V', 'C')
};
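The MAKE_FOURCC_CODE macro is defined in Frame.h and is not reproduced here. A common convention packs the four characters into one 32-bit value with the first character in the lowest byte; the sketch below assumes that convention. The names makeFourccCode and isSupportedInput are hypothetical, and the real macro may order the bytes differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for MAKE_FOURCC_CODE: packs four ASCII characters
// into one 32-bit value, first character in the lowest byte (assumed order).
constexpr uint32_t makeFourccCode(char a, char b, char c, char d)
{
    return  static_cast<uint32_t>(static_cast<uint8_t>(a))        |
           (static_cast<uint32_t>(static_cast<uint8_t>(b)) << 8)  |
           (static_cast<uint32_t>(static_cast<uint8_t>(c)) << 16) |
           (static_cast<uint32_t>(static_cast<uint8_t>(d)) << 24);
}

// Returns true for the three input formats the detector accepts
// (RGB24, BGR24 and YUV24, as listed in this section).
constexpr bool isSupportedInput(uint32_t fourcc)
{
    return fourcc == makeFourccCode('R', 'G', 'B', '3') ||
           fourcc == makeFourccCode('B', 'G', 'R', '3') ||
           fourcc == makeFourccCode('Y', 'U', 'V', '3');
}
```

A check of this kind lets the caller reject an unsupported frame before passing it to the detector.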
Table 3 - Bytes layout of supported pixel formats (RGB24, BGR24 and YUV24). Example of 4x4 pixels image.
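To make the pre-conversion to RGB24 concrete, here is a minimal per-pixel sketch for YUV24 input. The library delegates pixel format conversion to OpenCV, so the full-range BT.601 coefficients and the names Rgb, clampToByte and yuvToRgb below are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical RGB24 pixel: three bytes in R, G, B order.
struct Rgb { uint8_t r, g, b; };

// Clamp to [0, 255] and round to the nearest integer.
uint8_t clampToByte(double v)
{
    return static_cast<uint8_t>(std::lround(std::min(255.0, std::max(0.0, v))));
}

// Convert one YUV24 pixel to RGB24 using full-range BT.601 coefficients
// (an assumption of this sketch, not taken from the library).
Rgb yuvToRgb(uint8_t y, uint8_t u, uint8_t v)
{
    const double cb = u - 128.0; // Blue-difference chroma, centred on zero.
    const double cr = v - 128.0; // Red-difference chroma, centred on zero.
    return { clampToByte(y + 1.402 * cr),
             clampToByte(y - 0.344136 * cb - 0.714136 * cr),
             clampToByte(y + 1.772 * cb) };
}
```

With neutral chroma (U = V = 128) the output is a gray pixel whose three channels equal Y.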
Library principles
The object detection feature within this library is seamlessly integrated with OpenVINO™ runtime support, designed to facilitate efficient object detection based on neural network models. The library simplifies the process and provides a straightforward usage sequence for developers. The algorithm primarily consists of the following steps:
- The library accepts input frames directly and sets the proper inputs when compiling the neural network model, so the user does not need to preprocess frames.
- After model compilation, input frames are forwarded directly into the OpenVINO™ pipeline, which uses the neural network model for object detection.
- The computation produces an output tensor, which is converted (according to the detector parameters) to the final vector of objects.
- Results store not only coordinates but also the probability and type of each object. The type can be mapped to a particular item according to the current network labels.
The library is available as source code only. To utilize the library as source code, developers must include the library’s files into their project. The recommended usage sequence for the library is as follows:
- Integration: Include the library files in your project.
- Initialization: Create an instance of the DnnOpenVinoDetector C++ class for each camera or input source you wish to process. The library supports multiple instances for parallel camera processing.
- Customization: If needed, you can customize the library’s behavior by using the setParam() method. This allows you to adapt the library to specific requirements.
- Object Detection: Create a Frame object to represent the input frame, and prepare a vector to store the detected objects.
- Detection Process: Call the detect(...) method to initiate the object detection process.
- Object Retrieval: Retrieve the detected objects by using the getObjects() method. The library provides a vector of Object structures containing information about the detected objects, such as their positions and attributes.
DnnOpenVinoDetector class description
DnnOpenVinoDetector class declaration
DnnOpenVinoDetector.h file contains DnnOpenVinoDetector class declaration. DnnOpenVinoDetector class inherits interface from ObjectDetector interface class. Class declaration:
class DnnOpenVinoDetector : public ObjectDetector
{
public:
/// Class constructor.
DnnOpenVinoDetector();
/// Class destructor.
~DnnOpenVinoDetector();
/// Get string of current library version.
static std::string getVersion();
/// Init object detector.
bool initObjectDetector(ObjectDetectorParams& params) override;
/// Set object detector param.
bool setParam(ObjectDetectorParam id, float value) override;
/// Get object detector param value.
float getParam(ObjectDetectorParam id) override;
/// Get object detector params structure.
void getParams(ObjectDetectorParams& params) override;
/// Get list of objects.
std::vector<Object> getObjects() override;
/// Execute command.
bool executeCommand(ObjectDetectorCommand id) override;
/// Perform detection.
bool detect(cr::video::Frame& frame) override;
/// Set detection mask.
bool setMask(cr::video::Frame mask) override;
/// Decode command and execute command.
bool decodeAndExecuteCommand(uint8_t* data, int size) override;
};
getVersion method
The getVersion() method returns a string with the current version of the DnnOpenVinoDetector class. Method declaration:
static std::string getVersion();
The method can be used without a DnnOpenVinoDetector class instance. Example:
std::cout << "DnnOpenVinoDetector version: " << DnnOpenVinoDetector::getVersion() << std::endl;
Console output:
DnnOpenVinoDetector version: 1.3.0
initObjectDetector method
The initObjectDetector(…) method initializes the object detector. At the first call (before the first video frame is processed), the method only checks that the initialization string (field initString of ObjectDetectorParams class) is correct and reads the neural network model file to check its validity. If the initialization string is correct and the neural network file is read, the method returns TRUE. Full library initialization (loading and compiling the neural network) is performed when the first frame is processed. Compiling a neural network takes a long time, and this scheme means the caller does not have to wait inside initObjectDetector(…). Method declaration:
bool initObjectDetector(ObjectDetectorParams& params) override;
Parameter | Value |
---|---|
params | ObjectDetectorParams class object. The library takes into account only the following parameters from ObjectDetectorParams class: initString, minObjectWidth (default value 4), maxObjectWidth (default value 128), minObjectHeight (default value 4), maxObjectHeight (default value 128), minDetectionProbability (default value 0.5f), type (default value 1), xDetectionCriteria (default value 1), yDetectionCriteria (default value 1), resetCriteria (default value 1). If a particular parameter is out of the valid range, the library will set its default value automatically. |
Returns: TRUE if the object detector was initialized (initialization params are valid) or FALSE if not.
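The two-stage start-up described for initObjectDetector(…) (a cheap check at initialization, the expensive compilation on the first frame) can be sketched with a self-contained stand-in. LazyDetector and all of its members are hypothetical names that do not exist in the library, and the empty compileModel() only marks the place where the slow step would run.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical stand-in that mimics the two-stage start-up described above.
class LazyDetector
{
public:
    // Stage 1: cheap validation only. Returns false if the model file
    // cannot be opened. No compilation happens here.
    bool init(const std::string& modelPath)
    {
        m_modelPath.clear();
        m_compiled = false;
        std::ifstream file(modelPath, std::ios::binary);
        if (!file.is_open())
            return false;
        m_modelPath = modelPath;
        return true;
    }

    // Stage 2: the expensive step runs once, on the first frame.
    bool detect()
    {
        if (m_modelPath.empty())
            return false; // init() was not called or failed.
        if (!m_compiled)
        {
            compileModel();
            m_compiled = true;
        }
        return true;
    }

    bool isCompiled() const { return m_compiled; }

private:
    // Placeholder: real code would compile the network here.
    void compileModel() {}

    std::string m_modelPath;
    bool m_compiled{false};
};
```

One consequence of this scheme is that the first detect call is much slower than the following ones, which is worth keeping in mind when measuring processing time.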
setParam method
The setParam(…) method is designed to set a new DnnOpenVinoDetector object parameter value. Method declaration:
bool setParam(ObjectDetectorParam id, float value) override;
Parameter | Description |
---|---|
id | Parameter ID according to ObjectDetectorParam enum. The library does not support all parameters from ObjectDetectorParam enum. |
value | Parameter value. Value depends on parameter ID. |
Returns: TRUE if the parameter was set or FALSE if not (not supported or out of valid range).
getParam method
The getParam(…) method is designed to obtain an object detector parameter value. Method declaration:
float getParam(ObjectDetectorParam id) override;
Parameter | Description |
---|---|
id | Parameter ID according to ObjectDetectorParam enum. |
Returns: parameter value or -1.0f if the parameter is not supported by DnnOpenVinoDetector.
getParams method
The getParams(…) method is designed to obtain the object detector params structure as well as the list of detected objects. Method declaration:
void getParams(ObjectDetectorParams& params) override;
Parameter | Description |
---|---|
params | ObjectDetectorParams class object. |
getObjects method
The getObjects() method is designed to obtain the list of detected objects. The user can obtain the list of detected objects via the getParams(…) method as well. Method declaration:
std::vector<Object> getObjects() override;
Returns: list of detected objects (see Object structure description). If no objects are detected, the list will be empty.
executeCommand method
The executeCommand(…) method is designed to execute an object detector command. Method declaration:
bool executeCommand(ObjectDetectorCommand id) override;
Parameter | Description |
---|---|
id | Command ID according to ObjectDetectorCommand enum. |
Returns: TRUE if the command was executed or FALSE if not (only if the command ID is not valid).
detect method
The detect(…) method is designed to run the detection algorithm. Before passing the frame to OpenVINO™, the method scales the frame to match the input resolution of the neural network. Method declaration:
bool detect(cr::video::Frame& frame) override;
Parameter | Description |
---|---|
frame | Video frame for processing. Object detector processes only BGR24, RGB24 and YUV24 pixel formats (see Frame class description). Size of frame should be from 32x32 to 8192x8192. |
Returns: TRUE if the video frame was processed or FALSE if not. Note: If the object detector is disabled (see ObjectDetectorParam enum description) the method returns TRUE.
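Because the frame is scaled to the network input size before inference, box coordinates produced in network space have to be mapped back to frame space. The sketch below shows one way to do that. It assumes a plain stretch to the network size without preserving the aspect ratio, which this document does not confirm; Rect and toFrameCoords are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical rectangle: top-left corner and size, pixels.
struct Rect { int x, y, width, height; };

// Map a box from network input coordinates (for example 640x640) back to
// the original frame. Assumption: the frame was stretched to the network
// size independently along each axis (no letterboxing).
Rect toFrameCoords(const Rect& box, int netW, int netH, int frameW, int frameH)
{
    assert(netW > 0 && netH > 0);
    const double sx = static_cast<double>(frameW) / netW;
    const double sy = static_cast<double>(frameH) / netH;
    return { static_cast<int>(std::lround(box.x * sx)),
             static_cast<int>(std::lround(box.y * sy)),
             static_cast<int>(std::lround(box.width * sx)),
             static_cast<int>(std::lround(box.height * sy)) };
}
```

For a 1920x1080 frame and a 640x640 network input the horizontal scale is 3 and the vertical scale is 1.6875, so a box centred in network space lands at the centre of the frame.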
setMask method
The setMask(…) method is designed to set the detection mask. The user can disable detection in any area of the video frame. For this purpose the user can create an image of any size and configuration with GRAY (preferable), NV12, NV21, YV12 or YU12 pixel format. Mask pixel values equal to 0 prohibit detection of objects at the corresponding location of video frames. Any other mask pixel value allows detection of objects at the corresponding location of video frames. The mask is used for detection algorithms to compute a binary motion mask. The method can be called either before video frame processing or during video frame processing. Method declaration:
bool setMask(cr::video::Frame mask) override;
Parameter | Description |
---|---|
mask | Image of detection mask. Should have GRAY (preferable), NV12, NV21, YV12 or YU12 pixel format (see Frame description). The detector omits image segments where the detection mask pixel values equal 0. The mask can have any resolution. If the resolution of the mask (width and height) is not equal to the video frame resolution, the library will scale the mask to the resolution of the processed video. |
Returns: TRUE if the mask was accepted or FALSE if not (not valid pixel format or empty).
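The mask rule (a zero pixel prohibits detection, any other value allows it, and the mask is scaled to the frame resolution) can be sketched for a single-channel GRAY mask. Mask and isDetectionAllowed are hypothetical names. Nearest-neighbour sampling and treating an empty mask as "everything allowed" are assumptions of this sketch, not documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-channel (GRAY) mask used only for this sketch.
struct Mask
{
    int width;
    int height;
    std::vector<uint8_t> data; // Row-major, one byte per pixel.
};

// Returns true if detection is allowed at frame pixel (x, y).
// The mask is sampled with nearest-neighbour scaling, so it may have a
// different resolution than the frame. An empty mask allows everything.
bool isDetectionAllowed(const Mask& mask, int x, int y, int frameW, int frameH)
{
    if (mask.data.empty())
        return true;
    assert(frameW > 0 && frameH > 0);
    assert(x >= 0 && x < frameW && y >= 0 && y < frameH);
    assert(mask.data.size() == static_cast<std::size_t>(mask.width) *
                               static_cast<std::size_t>(mask.height));
    const int mx = x * mask.width / frameW;  // Mask column for this pixel.
    const int my = y * mask.height / frameH; // Mask row for this pixel.
    const std::size_t index = static_cast<std::size_t>(my) *
                              static_cast<std::size_t>(mask.width) +
                              static_cast<std::size_t>(mx);
    return mask.data[index] != 0;
}
```

A 2x2 mask with a single zero pixel is enough to block one quarter of a frame of any size.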
decodeAndExecuteCommand method
The decodeAndExecuteCommand(…) method decodes and executes a command encoded by the encodeSetParamCommand(…) or encodeCommand(…) method of ObjectDetector class. decodeAndExecuteCommand(…) is a thread-safe method, so it can be safely called from any thread. Method declaration:
bool decodeAndExecuteCommand(uint8_t* data, int size) override;
Parameter | Description |
---|---|
data | Pointer to input command. |
size | Size of command. Must be 11 bytes for SET_PARAM or 7 bytes for COMMAND. |
Returns: TRUE if the command was decoded (SET_PARAM or COMMAND) and executed (action command or set param command) or FALSE if not.
encodeSetParamCommand method of ObjectDetector class
The encodeSetParamCommand(…) static method of the ObjectDetector interface is designed to encode a command that changes any parameter of a remote object detector (including motion detectors). To control an object detector remotely, the developer has to design their own protocol, encode the command according to it and deliver it over the communication channel. To simplify this, the ObjectDetector class contains static methods for encoding the control command. The ObjectDetector class provides two types of commands: a parameter change command (SET_PARAM) and an action command (COMMAND). encodeSetParamCommand(…) is designed to encode the SET_PARAM command. Method declaration:
static void encodeSetParamCommand(uint8_t* data, int& size, ObjectDetectorParam id, float value);
Parameter | Description |
---|---|
data | Pointer to data buffer for encoded command. Must have size >= 11. |
size | Size of encoded data. Will be 11 bytes. |
id | Parameter ID according to ObjectDetectorParam enum. |
value | Parameter value. Value depends on parameter ID. |
encodeSetParamCommand(…) is static and is used without an ObjectDetector class instance. This method is used on the client side (control system). Command encoding example:
// Buffer for encoded data.
uint8_t data[11];
// Size of encoded data.
int size = 0;
// Random parameter value.
float outValue = (float)(rand() % 20);
// Encode command.
ObjectDetector::encodeSetParamCommand(data, size, ObjectDetectorParam::MIN_OBJECT_WIDTH, outValue);
encodeCommand method of ObjectDetector class
The encodeCommand(…) static method of the ObjectDetector interface is designed to encode a command for a remote object detector (including motion detectors). To control an object detector remotely, the developer has to design their own protocol, encode the command according to it and deliver it over the communication channel. To simplify this, the ObjectDetector class contains static methods for encoding the control command. The ObjectDetector class provides two types of commands: a parameter change command (SET_PARAM) and an action command (COMMAND). encodeCommand(…) is designed to encode COMMAND (action command). Method declaration:
static void encodeCommand(uint8_t* data, int& size, ObjectDetectorCommand id);
Parameter | Description |
---|---|
data | Pointer to data buffer for encoded command. Must have size >= 7. |
size | Size of encoded data. Will be 7 bytes. |
id | Command ID according to ObjectDetectorCommand enum. |
encodeCommand(…) is static and is used without an ObjectDetector class instance. This method is used on the client side (control system). Command encoding example:
// Buffer for encoded data.
uint8_t data[11];
// Size of encoded data.
int size = 0;
// Encode command.
ObjectDetector::encodeCommand(data, size, ObjectDetectorCommand::RESET);
decodeCommand method of ObjectDetector class
The decodeCommand(…) static method of the ObjectDetector interface is designed to decode, on the object detector side (edge device), a command encoded by the encodeSetParamCommand(…) or encodeCommand(…) method of ObjectDetector class. Method declaration:
static int decodeCommand(uint8_t* data, int size, ObjectDetectorParam& paramId, ObjectDetectorCommand& commandId, float& value);
Parameter | Description |
---|---|
data | Pointer to input command. |
size | Size of command. Should be 11 bytes for SET_PARAM or 7 bytes for COMMAND. |
paramId | Parameter ID according to ObjectDetectorParam enum. After decoding SET_PARAM command the method will return parameter ID. |
commandId | Command ID according to ObjectDetectorCommand enum. After decoding COMMAND the method will return command ID. |
value | Parameter value (after decoding SET_PARAM command). |
Returns: 0 when a COMMAND was decoded, 1 when a SET_PARAM command was decoded or -1 in case of errors.
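To show how an encode and decode round trip fits together, here is a self-contained sketch with a purely hypothetical wire layout. Only the message sizes (11 bytes for SET_PARAM, 7 bytes for COMMAND) and the return convention (0, 1, -1) come from the descriptions above. The header byte values, the placeholder version bytes, the field order and all function names are assumptions, and the real format is defined by the ObjectDetector class.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical wire layout (only the 11- and 7-byte sizes come from the text):
// [0] message type, [1..2] version, [3..6] ID, [7..10] float value.
constexpr uint8_t kTypeCommand  = 0x00;
constexpr uint8_t kTypeSetParam = 0x01;

// Encode a parameter change. The buffer must hold at least 11 bytes.
void sketchEncodeSetParam(uint8_t* data, int& size, int32_t id, float value)
{
    data[0] = kTypeSetParam;
    data[1] = 1; // Placeholder major version.
    data[2] = 0; // Placeholder minor version.
    std::memcpy(&data[3], &id, sizeof(id));
    std::memcpy(&data[7], &value, sizeof(value));
    size = 11;
}

// Encode an action command. The buffer must hold at least 7 bytes.
void sketchEncodeCommand(uint8_t* data, int& size, int32_t id)
{
    data[0] = kTypeCommand;
    data[1] = 1; // Placeholder major version.
    data[2] = 0; // Placeholder minor version.
    std::memcpy(&data[3], &id, sizeof(id));
    size = 7;
}

// Returns 0 for COMMAND, 1 for SET_PARAM, -1 on error (same convention
// as decodeCommand(...) above).
int sketchDecode(const uint8_t* data, int size, int32_t& paramId,
                 int32_t& commandId, float& value)
{
    if (data == nullptr || size < 7)
        return -1;
    if (data[0] == kTypeCommand && size == 7)
    {
        std::memcpy(&commandId, &data[3], sizeof(commandId));
        return 0;
    }
    if (data[0] == kTypeSetParam && size == 11)
    {
        std::memcpy(&paramId, &data[3], sizeof(paramId));
        std::memcpy(&value, &data[7], sizeof(value));
        return 1;
    }
    return -1;
}
```

Copying fields with std::memcpy avoids unaligned access but ties the format to the byte order of the host, so both ends of the channel would need the same endianness in this sketch.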
Data structures
ObjectDetectorCommand enum
Enum declaration:
enum class ObjectDetectorCommand
{
/// Reset.
RESET = 1,
/// Enable.
ON,
/// Disable.
OFF
};
Table 4 - Object detector commands description. Some commands may be unsupported by a particular object detector class.
Command | Description |
---|---|
RESET | Reset algorithm. Clears the list of detected objects and resets all internal filters. |
ON | Enable object detector. If the detector is not activated, frame processing is not performed - the list of detected objects will always be empty. |
OFF | Disable object detector. If the detector is not activated, frame processing is not performed - the list of detected objects will always be empty. |
ObjectDetectorParam enum
Enum declaration:
enum class ObjectDetectorParam
{
/// Logging mode. Values: 0 - Disable, 1 - Only file,
/// 2 - Only terminal (console), 3 - File and terminal (console).
LOG_MODE = 1,
/// Frame buffer size. Depends on implementation.
FRAME_BUFFER_SIZE,
/// Minimum object width to be detected, pixels. To be detected object's
/// width must be >= MIN_OBJECT_WIDTH.
MIN_OBJECT_WIDTH,
/// Maximum object width to be detected, pixels. To be detected object's
/// width must be <= MAX_OBJECT_WIDTH.
MAX_OBJECT_WIDTH,
/// Minimum object height to be detected, pixels. To be detected object's
/// height must be >= MIN_OBJECT_HEIGHT.
MIN_OBJECT_HEIGHT,
/// Maximum object height to be detected, pixels. To be detected object's
/// height must be <= MAX_OBJECT_HEIGHT.
MAX_OBJECT_HEIGHT,
/// Minimum object's horizontal speed to be detected, pixels/frame. To be
/// detected object's horizontal speed must be >= MIN_X_SPEED.
MIN_X_SPEED,
/// Maximum object's horizontal speed to be detected, pixels/frame. To be
/// detected object's horizontal speed must be <= MAX_X_SPEED.
MAX_X_SPEED,
/// Minimum object's vertical speed to be detected, pixels/frame. To be
/// detected object's vertical speed must be >= MIN_Y_SPEED.
MIN_Y_SPEED,
/// Maximum object's vertical speed to be detected, pixels/frame. To be
/// detected object's vertical speed must be <= MAX_Y_SPEED.
MAX_Y_SPEED,
/// Probability threshold from 0 to 1. To be detected object detection
/// probability must be >= MIN_DETECTION_PROBABILITY.
MIN_DETECTION_PROBABILITY,
/// Horizontal track detection criteria, frames. By default shows how many
/// frames the objects must move in any(+/-) horizontal direction to be
/// detected.
X_DETECTION_CRITERIA,
/// Vertical track detection criteria, frames. By default shows how many
/// frames the objects must move in any(+/-) vertical direction to be
/// detected.
Y_DETECTION_CRITERIA,
/// Track reset criteria, frames. By default shows how many
/// frames the objects should be not detected to be excluded from results.
RESET_CRITERIA,
/// Detection sensitivity. Depends on implementation. Default from 0 to 1.
SENSITIVITY,
/// Frame scaling factor for processing purposes. Reduce the image size by
/// scaleFactor times horizontally and vertically for faster processing.
SCALE_FACTOR,
/// Num threads. Number of threads for parallel computing.
NUM_THREADS,
/// Processing time of last frame in microseconds.
PROCESSING_TIME_MCS,
/// Algorithm type. Depends on implementation.
TYPE,
/// Mode. Default: 0 - Off, 1 - On.
MODE,
/// Custom parameter. Depends on implementation.
CUSTOM_1,
/// Custom parameter. Depends on implementation.
CUSTOM_2,
/// Custom parameter. Depends on implementation.
CUSTOM_3
};
Table 5 - DnnOpenVinoDetector class params description (from ObjectDetector interface class). Some params are not supported by DnnOpenVinoDetector class.
Parameter | Access | Description |
---|---|---|
LOG_MODE | read / write | Not used. Can have any value. |
FRAME_BUFFER_SIZE | read / write | Not used. Can have any value. |
MIN_OBJECT_WIDTH | read / write | Minimum object width to be detected, pixels. Valid values from 1 to 8192. Must be < MAX_OBJECT_WIDTH. To be detected object’s width must be >= MIN_OBJECT_WIDTH. Default value is 4. |
MAX_OBJECT_WIDTH | read / write | Maximum object width to be detected, pixels. Valid values from 1 to 8192. Must be > MIN_OBJECT_WIDTH. To be detected object’s width must be <= MAX_OBJECT_WIDTH. Default value is 128. |
MIN_OBJECT_HEIGHT | read / write | Minimum object height to be detected, pixels. Valid values from 1 to 8192. Must be < MAX_OBJECT_HEIGHT. To be detected object’s height must be >= MIN_OBJECT_HEIGHT. Default value is 4. |
MAX_OBJECT_HEIGHT | read / write | Maximum object height to be detected, pixels. Valid values from 1 to 8192. Must be > MIN_OBJECT_HEIGHT. To be detected object’s height must be <= MAX_OBJECT_HEIGHT. Default value is 128. |
MIN_X_SPEED | read / write | Not used. Can have any value. |
MAX_X_SPEED | read / write | Not used. Can have any value. |
MIN_Y_SPEED | read / write | Not used. Can have any value. |
MAX_Y_SPEED | read / write | Not used. Can have any value. |
MIN_DETECTION_PROBABILITY | read / write | Defines threshold for object detection probability. Only objects with probability greater than MIN_DETECTION_PROBABILITY will be detected. Can have any values from 0 to 1. |
X_DETECTION_CRITERIA | read / write | Has the same meaning as Y_DETECTION_CRITERIA. The number of video frames on which the object must be detected before it is included in the list of detected objects. The library keeps a detection counter for each object. After each detection the library increases the detection counter for that object. When detection counter >= X_DETECTION_CRITERIA the object is included in the output list of detected objects. If the object is not detected, the library decreases the detection counter. When the user calls setParam(…) for X_DETECTION_CRITERIA, Y_DETECTION_CRITERIA is automatically set to the same value. The getParam(…) method also returns the same value for X_DETECTION_CRITERIA and Y_DETECTION_CRITERIA. Valid values from 0 to 8192. |
Y_DETECTION_CRITERIA | read / write | Has the same meaning as X_DETECTION_CRITERIA. The number of video frames on which the object must be detected before it is included in the list of detected objects. The library keeps a detection counter for each object. After each detection the library increases the detection counter for that object. When detection counter >= Y_DETECTION_CRITERIA the object is included in the output list of detected objects. If the object is not detected, the library decreases the detection counter. When the user calls setParam(…) for Y_DETECTION_CRITERIA, X_DETECTION_CRITERIA is automatically set to the same value. The getParam(…) method also returns the same value for X_DETECTION_CRITERIA and Y_DETECTION_CRITERIA. Valid values from 0 to 8192. |
RESET_CRITERIA | read / write | The number of consecutive video frames in which the object was not detected before it is excluded from the internal object list. The internal object list is used to associate newly detected objects with objects detected in the previous video frame. It allows the library to keep a stable object ID from frame to frame. If the object is not detected for fewer than RESET_CRITERIA frames, its object ID will not change the next time it is detected. |
SENSITIVITY | read / write | Not used. Can have any value. |
SCALE_FACTOR | read / write | Not used. Can have any value. |
NUM_THREADS | read / write | Not used. Can have any value. |
PROCESSING_TIME_MCS | read only | Read only. Stores processing time of current frame in microseconds. |
TYPE | read / write | Type defines kind of device for neural network model computation. Default 0 - CPU, 1 - integrated GPU (GPU.0), 2 - separate GPU (GPU.1) 3 - both GPUs (MULTI mode). More information - OpenVinoDevices. |
MODE | read / write | Mode. Default: 0 - Off, 1 - On. If the detector is not activated, frame processing is not performed - the list of detected objects will always be empty. |
CUSTOM_1 | read / write | Not used. Can have any value. |
CUSTOM_2 | read / write | Not used. Can have any value. |
CUSTOM_3 | read / write | Not used. Can have any value. |
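The interaction between the detection counter, X_DETECTION_CRITERIA / Y_DETECTION_CRITERIA and RESET_CRITERIA described in Table 5 can be sketched as a small per-object state update. TrackState, updateTrack and isReported are hypothetical names. Clamping the counter at zero and dropping a track once the number of missed frames reaches RESET_CRITERIA are assumptions of this sketch.

```cpp
#include <cassert>

// Hypothetical per-object track state for this sketch.
struct TrackState
{
    int detectionCounter; // Raised on each detection, lowered on each miss.
    int missedFrames;     // Consecutive frames without a detection.
};

// Update the state for one frame. Returns false when the track should be
// dropped from the internal list (missed for resetCriteria frames in a row).
bool updateTrack(TrackState& state, bool detected, int resetCriteria)
{
    if (detected)
    {
        ++state.detectionCounter;
        state.missedFrames = 0;
    }
    else
    {
        if (state.detectionCounter > 0)
            --state.detectionCounter; // Assumption: never goes below zero.
        ++state.missedFrames;
    }
    return state.missedFrames < resetCriteria;
}

// An object is reported once its counter reaches the detection criteria.
bool isReported(const TrackState& state, int detectionCriteria)
{
    return state.detectionCounter >= detectionCriteria;
}
```

With a detection criteria of 2 an object appears in the output on its second detected frame, and with a reset criteria of 3 a track that is kept alive retains its ID through up to two missed frames.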
Object structure
Object structure is used to describe detected object. Object structure declared in ObjectDetector.h file. Structure declaration:
typedef struct Object
{
/// Object ID. Must be unique for particular object.
int id{0};
/// Frame ID. Must be the same as frame ID of processed video frame.
int frameId{0};
/// Object type. Depends on implementation.
int type{0};
/// Object rectangle width, pixels.
int width{0};
/// Object rectangle height, pixels.
int height{0};
/// Object rectangle top-left horizontal coordinate, pixels.
int x{0};
/// Object rectangle top-left vertical coordinate, pixels.
int y{0};
/// Horizontal component of object velocity, +-pixels/frame.
float vX{0.0f};
/// Vertical component of object velocity, +-pixels/frame.
float vY{0.0f};
/// Detection probability from 0 (minimum) to 1 (maximum).
float p{0.0f};
} Object;
Table 6 - Object structure fields description.
Field | Type | Description |
---|---|---|
id | int | Object ID. The library will try to keep the same object ID for a particular object from frame to frame. |
frameId | int | Frame ID. Will be the same as frame ID of processed video frame. |
type | int | Object type according to probability for particular label that was returned from neural network model inference output. |
width | int | Object rectangle width, pixels. Must be MIN_OBJECT_WIDTH <= width <= MAX_OBJECT_WIDTH (see ObjectDetectorParam enum description). |
height | int | Object rectangle height, pixels. Must be MIN_OBJECT_HEIGHT <= height <= MAX_OBJECT_HEIGHT (see ObjectDetectorParam enum description). |
x | int | Object rectangle top-left horizontal coordinate, pixels. |
y | int | Object rectangle top-left vertical coordinate, pixels. |
vX | float | Not used. Will have value 0.0f. |
vY | float | Not used. Will have value 0.0f. |
p | float | Object detection probability (score). |
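The size and probability constraints listed in Table 6 amount to a simple filter over candidate objects. The sketch below uses reduced, hypothetical copies of the relevant fields (SketchObject, SketchLimits) rather than the library types. It follows the >= comparison for probability given in the ObjectDetectorParam enum comments; Table 5 words the same threshold as "greater than", so the boundary case is an assumption.

```cpp
#include <cassert>
#include <vector>

// Reduced, hypothetical copies of the fields used by this sketch.
struct SketchObject
{
    int width;  // Object rectangle width, pixels.
    int height; // Object rectangle height, pixels.
    float p;    // Detection probability.
};

struct SketchLimits
{
    int minObjectWidth;
    int maxObjectWidth;
    int minObjectHeight;
    int maxObjectHeight;
    float minDetectionProbability;
};

// Keep only the objects that satisfy the size and probability limits.
std::vector<SketchObject> filterObjects(const std::vector<SketchObject>& objects,
                                        const SketchLimits& limits)
{
    std::vector<SketchObject> result;
    for (const SketchObject& object : objects)
    {
        const bool widthOk  = object.width  >= limits.minObjectWidth &&
                              object.width  <= limits.maxObjectWidth;
        const bool heightOk = object.height >= limits.minObjectHeight &&
                              object.height <= limits.maxObjectHeight;
        const bool probabilityOk = object.p >= limits.minDetectionProbability;
        if (widthOk && heightOk && probabilityOk)
            result.push_back(object);
    }
    return result;
}
```

With the default limits (4 to 128 pixels, probability 0.5) a 200 pixel tall box is rejected even at high probability, so the limits usually need adjusting to the frame size and expected object scale.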
ObjectDetectorParams class description
ObjectDetectorParams class declaration
The ObjectDetectorParams class is used for object detector initialization (initObjectDetector(…) method) or to get all actual params (getParams() method), including the list of detected objects. ObjectDetectorParams also provides a structure to write/read params from JSON files (JSON_READABLE macro, see ConfigReader class description) and provides methods to encode and decode params. Class declaration:
class ObjectDetectorParams
{
public:
/// Init string. Depends on implementation.
std::string initString{""};
/// Logging mode. Values: 0 - Disable, 1 - Only file,
/// 2 - Only terminal (console), 3 - File and terminal (console).
int logMode{0};
/// Frame buffer size. Depends on implementation.
int frameBufferSize{1};
/// Minimum object width to be detected, pixels. To be detected object's
/// width must be >= minObjectWidth.
int minObjectWidth{4};
/// Maximum object width to be detected, pixels. To be detected object's
/// width must be <= maxObjectWidth.
int maxObjectWidth{128};
/// Minimum object height to be detected, pixels. To be detected object's
/// height must be >= minObjectHeight.
int minObjectHeight{4};
/// Maximum object height to be detected, pixels. To be detected object's
/// height must be <= maxObjectHeight.
int maxObjectHeight{128};
/// Minimum object's horizontal speed to be detected, pixels/frame. To be
/// detected object's horizontal speed must be >= minXSpeed.
float minXSpeed{0.0f};
/// Maximum object's horizontal speed to be detected, pixels/frame. To be
/// detected object's horizontal speed must be <= maxXSpeed.
float maxXSpeed{30.0f};
/// Minimum object's vertical speed to be detected, pixels/frame. To be
/// detected object's vertical speed must be >= minYSpeed.
float minYSpeed{0.0f};
/// Maximum object's vertical speed to be detected, pixels/frame. To be
/// detected object's vertical speed must be <= maxYSpeed.
float maxYSpeed{30.0f};
/// Probability threshold from 0 to 1. To be detected object detection
/// probability must be >= minDetectionProbability.
float minDetectionProbability{0.5f};
/// Horizontal track detection criteria, frames. By default shows how many
/// frames the objects must move in any(+/-) horizontal direction to be
/// detected.
int xDetectionCriteria{1};
/// Vertical track detection criteria, frames. By default shows how many
/// frames the objects must move in any(+/-) vertical direction to be
/// detected.
int yDetectionCriteria{1};
/// Track reset criteria, frames. By default shows how many
/// frames the objects should be not detected to be excluded from results.
int resetCriteria{1};
/// Detection sensitivity. Depends on implementation. Default from 0 to 1.
float sensitivity{0.04f};
/// Frame scaling factor for processing purposes. Reduce the image size by
/// scaleFactor times horizontally and vertically for faster processing.
int scaleFactor{1};
/// Num threads. Number of threads for parallel computing.
int numThreads{1};
/// Processing time of last frame in microseconds.
int processingTimeMcs{0};
/// Algorithm type. Depends on implementation.
int type{0};
/// Mode: false - Off, true - On.
bool enable{true};
/// Custom parameter. Depends on implementation.
float custom1{0.0f};
/// Custom parameter. Depends on implementation.
float custom2{0.0f};
/// Custom parameter. Depends on implementation.
float custom3{0.0f};
/// List of detected objects.
std::vector<Object> objects;
// A list of object class names used in detectors that recognize different
// object classes. Detected objects have an attribute called 'type.'
// If a detector doesn't support object class recognition or can't determine
// the object type, the 'type' field must be set to 0. Otherwise, the 'type'
// should correspond to the ordinal number of the class name from the
// 'classNames' list (if the list was set in params), starting from 1
// (where the first element in the list has 'type == 1').
std::vector<std::string> classNames{ "" };
JSON_READABLE(ObjectDetectorParams, initString, logMode, frameBufferSize,
minObjectWidth, maxObjectWidth, minObjectHeight, maxObjectHeight,
minXSpeed, maxXSpeed, minYSpeed, maxYSpeed, minDetectionProbability,
xDetectionCriteria, yDetectionCriteria, resetCriteria, sensitivity,
scaleFactor, numThreads, type, enable, custom1, custom2, custom3);
/**
* @brief operator =
* @param src Source object.
* @return ObjectDetectorParams object.
*/
ObjectDetectorParams& operator= (const ObjectDetectorParams& src);
/**
* @brief Encode params. Method doesn't encode initString.
* @param data Pointer to data buffer.
* @param size Size of data.
* @param mask Pointer to parameters mask.
*/
void encode(uint8_t* data, int& size,
ObjectDetectorParamsMask* mask = nullptr);
/**
* @brief Decode params. Method doesn't decode initString.
* @param data Pointer to data.
* @return TRUE if params decoded or FALSE if not.
*/
bool decode(uint8_t* data);
};
Table 7 - ObjectDetectorParams class fields description. Some params may be unsupported by DnnOpenVinoDetector class.
Field | Type | Description |
---|---|---|
initString | string | Has to include path to neural network model that is supported by OpenVino and has standard one-batch layout. Also has to include dimension (width;height) of neural network input images, everything separated by semicolon. E.g.: “./model.onnx;640;640”. If neural network model consists of 2 files it should be as this example: “./model.xml;./model.bin;256;480”. |
logMode | int | Not used. Can have any value. |
frameBufferSize | int | Not used. Can have any value. |
minObjectWidth | int | Minimum object width to be detected, pixels. Valid values from 1 to 8192. Must be < maxObjectWidth. To be detected object’s width must be >= minObjectWidth. Default value is 4. |
maxObjectWidth | int | Maximum object width to be detected, pixels. Valid values from 1 to 8192. Must be > minObjectWidth. To be detected object’s width must be <= maxObjectWidth. Default value is 128. |
minObjectHeight | int | Minimum object height to be detected, pixels. Valid values from 1 to 8192. Must be < maxObjectHeight. To be detected object’s height must be >= minObjectHeight. Default value is 4. |
maxObjectHeight | int | Maximum object height to be detected, pixels. Valid values from 1 to 8192. Must be > minObjectHeight. To be detected object’s height must be <= maxObjectHeight. Default value is 128. |
minXSpeed | float | Not used. Can have any value. |
maxXSpeed | float | Not used. Can have any value. |
minYSpeed | float | Not used. Can have any value. |
maxYSpeed | float | Not used. Can have any value. |
minDetectionProbability | float | Defines threshold for object detection probability. Only objects with probability greater than MIN_DETECTION_PROBABILITY will be detected. Can have any values from 0 to 1. |
xDetectionCriteria | int | The same meaning as yDetectionCriteria. The number of video frames on which the object must be detected before it is included in the list of detected objects. The library keeps a detection counter for each object. After each detection the library increases the detection counter for that object. When detection counter >= xDetectionCriteria, the object is included in the output list of detected objects. If the object is not detected, the library decreases the detection counter. When the user calls setParam(…) for xDetectionCriteria, yDetectionCriteria is automatically set to the same value. The getParams(…) method also returns the same value for xDetectionCriteria and yDetectionCriteria. Valid values from 0 to 8192. |
yDetectionCriteria | int | The same meaning as xDetectionCriteria. The number of video frames on which the object must be detected before it is included in the list of detected objects. The library keeps a detection counter for each object. After each detection the library increases the detection counter for that object. When detection counter >= yDetectionCriteria, the object is included in the output list of detected objects. If the object is not detected, the library decreases the detection counter. When the user calls setParam(…) for xDetectionCriteria, yDetectionCriteria is automatically set to the same value. The getParams(…) method also returns the same value for xDetectionCriteria and yDetectionCriteria. Valid values from 0 to 8192. |
resetCriteria | int | The number of consecutive video frames in which the object was not detected, after which it is excluded from the internal object list. The internal object list is used to associate newly detected objects with objects detected in the previous video frame, which allows the library to keep the object ID stable from frame to frame. If the object is not detected for a number of frames < resetCriteria, its ID will not change the next time it is detected. |
sensitivity | float | Not used. Can have any value. |
scaleFactor | int | Not used. Can have any value. |
numThreads | int | Not used. Can have any value. |
processingTimeMcs | int | Read only. Stores processing time of current frame in microseconds. |
type | int | Defines the kind of device used for neural network model computation: 0 - CPU (default), 1 - integrated GPU (GPU.0), 2 - separate GPU (GPU.1), 3 - both GPUs (MULTI mode). More information - OpenVinoDevices. |
enable | bool | Mode: false - Off, true - On. If the detector is not activated, frame processing is not performed - the list of detected objects will always be empty. |
custom1 | float | Not used. Can have any value. |
custom2 | float | Not used. Can have any value. |
custom3 | float | Not used. Can have any value. |
objects | std::vector<Object> | List of detected objects. |
classNames | std::vector<std::string> | A list of object class names used in detectors that recognize different object classes. Detected objects have an attribute called ‘type’. If a detector doesn’t support object class recognition or can’t determine the object type, the ‘type’ field must be set to 0. Otherwise, the ‘type’ should correspond to the ordinal number of the class name from the ‘classNames’ list (if the list was set in params), starting from 1 (where the first element in the list has ‘type == 1’). |
Note: ObjectDetectorParams class fields listed in Table 7 must reflect params set/get by methods setParam(…) and getParam(…).
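The interplay between the detection counter (xDetectionCriteria / yDetectionCriteria) and resetCriteria described in Table 7 can be illustrated with a small state update. This is only a sketch of the documented behavior, not the library's implementation: the `TrackState` and `TrackStatus` structures and the `updateTrack` function are hypothetical, and reporting an object only on frames where it is actually detected is an assumption.

```cpp
#include <algorithm>

// Hypothetical per-object state (the library's internal structure is not documented).
struct TrackState
{
    int detectionCounter{0}; // Increases on detection, decreases on a miss.
    int missedFrames{0};     // Consecutive frames without detection.
};

// Hypothetical result of one update step.
struct TrackStatus
{
    bool reported{false}; // Object is included in the output list.
    bool dropped{false};  // Object is removed from the internal list.
};

// Updates the state for one video frame and returns what happens to the object.
TrackStatus updateTrack(TrackState& state, bool detected,
                        int detectionCriteria, int resetCriteria)
{
    if (detected)
    {
        ++state.detectionCounter;
        state.missedFrames = 0;
    }
    else
    {
        // The counter never goes below zero.
        state.detectionCounter = std::max(state.detectionCounter - 1, 0);
        ++state.missedFrames;
    }

    TrackStatus status;
    // Assumption: the object is reported only when seen in the current frame.
    status.reported = detected && state.detectionCounter >= detectionCriteria;
    // Missing for resetCriteria consecutive frames drops the object (and its ID).
    status.dropped = state.missedFrames >= resetCriteria;
    return status;
}
```

With detectionCriteria = 2 and resetCriteria = 3, an object appears in the output on its second consecutive detection and loses its ID after three consecutive missed frames.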
Serialize object detector params
ObjectDetectorParams class provides method encode(…) to serialize object detector params (fields of ObjectDetectorParams class, see Table 7). Serialization of object detector params is necessary when you need to send params via communication channels. The method provides options to exclude particular parameters from serialization. To do this, the method inserts a binary mask (3 bytes) in which each bit represents a particular parameter, and the decode(…) method recognizes it. Method declaration:
void encode(uint8_t* data, int dataBufferSize, int& size, ObjectDetectorParamsMask* mask = nullptr);
Parameter | Value |
---|---|
data | Pointer to data buffer. Buffer size should be at least 99 bytes. |
dataBufferSize | Size of data buffer. If the data buffer size is not large enough to serialize all detected objects (40 bytes per object), not all objects will be included in the data. |
size | Size of encoded data. 99 bytes by default. |
mask | Parameters mask - pointer to ObjectDetectorParamsMask structure. ObjectDetectorParamsMask (declared in ObjectDetector.h file) determines flags for each field (parameter) declared in ObjectDetectorParams class. To exclude particular parameters from serialization, the user passes a pointer to the mask and sets the corresponding flags in the ObjectDetectorParamsMask structure to false. |
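The sizes quoted in the table above (99 bytes of encoded data by default, 40 bytes per detected object) allow a rough estimate of how large the data buffer has to be. The helpers below are a hypothetical sketch that assumes the 99 bytes cover the parameters without any objects; space taken by class names is not quantified in this document and is ignored here.

```cpp
// Sizes taken from the parameter table of the encode(...) method.
constexpr int kBaseParamsSize = 99; // Encoded params without objects (assumption).
constexpr int kObjectSize = 40;     // Bytes per serialized object.

// Returns how many detected objects fit into a buffer of the given size,
// or -1 if the buffer is too small even for the params themselves.
int maxObjectsInBuffer(int dataBufferSize)
{
    if (dataBufferSize < kBaseParamsSize)
        return -1;
    return (dataBufferSize - kBaseParamsSize) / kObjectSize;
}

// Returns the estimated buffer size needed for the given number of objects.
int requiredBufferSize(int numObjects)
{
    return kBaseParamsSize + numObjects * kObjectSize;
}
```

Under these assumptions the 1024-byte buffer used in the examples has room for 23 objects, and five objects need 299 bytes.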
ObjectDetectorParamsMask structure declaration:
struct ObjectDetectorParamsMask
{
bool initString{ true };
bool logMode{ true };
bool frameBufferSize{ true };
bool minObjectWidth{ true };
bool maxObjectWidth{ true };
bool minObjectHeight{ true };
bool maxObjectHeight{ true };
bool minXSpeed{ true };
bool maxXSpeed{ true };
bool minYSpeed{ true };
bool maxYSpeed{ true };
bool minDetectionProbability{ true };
bool xDetectionCriteria{ true };
bool yDetectionCriteria{ true };
bool resetCriteria{ true };
bool sensitivity{ true };
bool scaleFactor{ true };
bool numThreads{ true };
bool processingTimeMcs{ true };
bool type{ true };
bool enable{ true };
bool custom1{ true };
bool custom2{ true };
bool custom3{ true };
bool objects{ true };
bool classNames{ true };
};
Example without parameters mask:
// Prepare random params.
ObjectDetectorParams in;
in.logMode = rand() % 255;
in.classNames = { "apple", "banana", "orange", "pineapple", "strawberry",
"watermelon", "lemon", "peach", "pear", "plum" };
for (int i = 0; i < 5; ++i)
{
Object obj;
obj.id = rand() % 255;
obj.type = rand() % 255;
obj.width = rand() % 255;
obj.height = rand() % 255;
obj.x = rand() % 255;
obj.y = rand() % 255;
obj.vX = rand() % 255;
obj.vY = rand() % 255;
obj.p = rand() % 255;
in.objects.push_back(obj);
}
// Encode data.
uint8_t data[1024];
int size = 0;
in.encode(data, size);
cout << "Encoded data size: " << size << " bytes" << endl;
Example with parameters mask:
// Prepare random params.
ObjectDetectorParams in;
in.logMode = rand() % 255;
in.classNames = { "apple", "banana", "orange", "pineapple", "strawberry",
"watermelon", "lemon", "peach", "pear", "plum" };
for (int i = 0; i < 5; ++i)
{
Object obj;
obj.id = rand() % 255;
obj.type = rand() % 255;
obj.width = rand() % 255;
obj.height = rand() % 255;
obj.x = rand() % 255;
obj.y = rand() % 255;
obj.vX = rand() % 255;
obj.vY = rand() % 255;
obj.p = rand() % 255;
in.objects.push_back(obj);
}
// Prepare mask.
ObjectDetectorParamsMask mask;
mask.logMode = false;
// Encode data.
uint8_t data[1024];
int size = 0;
in.encode(data, size, &mask);
cout << "Encoded data size: " << size << " bytes" << endl;
Deserialize object detector params
ObjectDetectorParams class provides method decode(…) to deserialize params (fields of ObjectDetectorParams class, see Table 7). Deserialization of params is necessary when you need to receive params via communication channels. The method automatically recognizes which parameters were serialized by the encode(…) method. Method declaration:
bool decode(uint8_t* data);
Parameter | Value |
---|---|
data | Pointer to encoded data buffer. |
Returns: TRUE if data decoded (deserialized) or FALSE if not.
Example:
// Prepare random params.
ObjectDetectorParams in;
in.logMode = rand() % 255;
in.classNames = { "apple", "banana", "orange", "pineapple", "strawberry",
"watermelon", "lemon", "peach", "pear", "plum" };
for (int i = 0; i < 5; ++i)
{
Object obj;
obj.id = rand() % 255;
obj.type = rand() % 255;
obj.width = rand() % 255;
obj.height = rand() % 255;
obj.x = rand() % 255;
obj.y = rand() % 255;
obj.vX = rand() % 255;
obj.vY = rand() % 255;
obj.p = rand() % 255;
in.objects.push_back(obj);
}
// Encode data.
uint8_t data[1024];
int size = 0;
in.encode(data, size);
cout << "Encoded data size: " << size << " bytes" << endl;
// Decode data.
ObjectDetectorParams out;
if (!out.decode(data))
{
cout << "Can't decode data" << endl;
return false;
}
Read params from JSON file and write to JSON file
ObjectDetector library depends on ConfigReader library which provides method to read params from JSON file and to write params to JSON file. Example of writing and reading params to JSON file:
// Prepare random params.
ObjectDetectorParams in;
in.logMode = rand() % 255;
in.classNames = { "apple", "banana", "orange", "pineapple", "strawberry",
"watermelon", "lemon", "peach", "pear", "plum" };
for (int i = 0; i < 5; ++i)
{
Object obj;
obj.id = rand() % 255;
obj.type = rand() % 255;
obj.width = rand() % 255;
obj.height = rand() % 255;
obj.x = rand() % 255;
obj.y = rand() % 255;
obj.vX = rand() % 255;
obj.vY = rand() % 255;
obj.p = rand() % 255;
in.objects.push_back(obj);
}
// Write params to file.
cr::utils::ConfigReader inConfig;
inConfig.set(in, "ObjectDetectorParams");
inConfig.writeToFile("ObjectDetectorParams.json");
// Read params from file.
cr::utils::ConfigReader outConfig;
if(!outConfig.readFromFile("ObjectDetectorParams.json"))
{
cout << "Can't open config file" << endl;
return false;
}
ObjectDetectorParams out;
if(!outConfig.get(out, "ObjectDetectorParams"))
{
cout << "Can't read params from file" << endl;
return false;
}
ObjectDetectorParams.json will look like:
{
"ObjectDetectorParams": {
"classNames": [
"apple",
"banana",
"orange",
"pineapple",
"strawberry",
"watermelon",
"lemon",
"peach",
"pear",
"plum"
],
"custom1": 57.0,
"custom2": 244.0,
"custom3": 68.0,
"enable": false,
"frameBufferSize": 200,
"initString": "sfsfsfsfsf",
"logMode": 111,
"maxObjectHeight": 103,
"maxObjectWidth": 199,
"maxXSpeed": 104.0,
"maxYSpeed": 234.0,
"minDetectionProbability": 53.0,
"minObjectHeight": 191,
"minObjectWidth": 149,
"minXSpeed": 213.0,
"minYSpeed": 43.0,
"numThreads": 33,
"resetCriteria": 62,
"scaleFactor": 85,
"sensitivity": 135.0,
"type": 178,
"xDetectionCriteria": 224,
"yDetectionCriteria": 199
}
}
Build and connect to your project
Before building the DnnOpenVinoDetector library, download the OpenVINO™ runtime library. The library is tested with OpenVINO™ runtime version 2023.3.0. Download the OpenVINO™ runtime library for Windows or for Linux, then install it according to the instructions: on Windows or on Linux. To build the library use commands:
cd DnnOpenVinoDetector
mkdir build
cd build
for Windows:
cmake .. -D CMAKE_PREFIX_PATH="C:/Program Files (x86)/Intel/openvino/runtime/cmake"
or for Linux:
cmake .. -D CMAKE_PREFIX_PATH=/opt/intel/openvino_2023.3.0/runtime/cmake
make
If you want to connect DnnOpenVinoDetector library to your CMake project as source code, you can do the following. For example, if your repository has structure:
CMakeLists.txt
src
CMakeLists.txt
yourLib.h
yourLib.cpp
Create 3rdparty folder in your repository. Copy DnnOpenVinoDetector repository to 3rdparty folder. Remember to specify path to selected OpenCV library build as above. The new structure of your repository will be as follows:
CMakeLists.txt
src
CMakeLists.txt
yourLib.h
yourLib.cpp
3rdparty
DnnOpenVinoDetector
Create CMakeLists.txt file in 3rdparty folder. CMakeLists.txt should contain:
cmake_minimum_required(VERSION 3.13)
################################################################################
## 3RD-PARTY
## dependencies for the project
################################################################################
project(3rdparty LANGUAGES CXX)
################################################################################
## SETTINGS
## basic 3rd-party settings before use
################################################################################
# To inherit the top-level architecture when the project is used as a submodule.
SET(PARENT ${PARENT}_YOUR_PROJECT_3RDPARTY)
# Disable self-overwriting of parameters inside included subdirectories.
SET(${PARENT}_SUBMODULE_CACHE_OVERWRITE OFF CACHE BOOL "" FORCE)
################################################################################
## CONFIGURATION
## 3rd-party submodules configuration
################################################################################
SET(${PARENT}_SUBMODULE_DNN_OPENVINO_DETECTOR ON CACHE BOOL "" FORCE)
if (${PARENT}_SUBMODULE_DNN_OPENVINO_DETECTOR)
SET(${PARENT}_DNN_OPENVINO_DETECTOR ON CACHE BOOL "" FORCE)
SET(${PARENT}_DNN_OPENVINO_DETECTOR_BENCHMARK OFF CACHE BOOL "" FORCE)
SET(${PARENT}_DNN_OPENVINO_DETECTOR_DEMO OFF CACHE BOOL "" FORCE)
SET(${PARENT}_DNN_OPENVINO_DETECTOR_EXAMPLE OFF CACHE BOOL "" FORCE)
endif()
################################################################################
## INCLUDING SUBDIRECTORIES
## Adding subdirectories according to the 3rd-party configuration
################################################################################
if (${PARENT}_SUBMODULE_DNN_OPENVINO_DETECTOR)
add_subdirectory(DnnOpenVinoDetector)
endif()
File 3rdparty/CMakeLists.txt adds folder DnnOpenVinoDetector to your project and excludes test applications and examples from compiling (by default test applications and example are excluded from compiling if DnnOpenVinoDetector is included as sub-repository). The new structure of your repository:
CMakeLists.txt
src
CMakeLists.txt
yourLib.h
yourLib.cpp
3rdparty
CMakeLists.txt
DnnOpenVinoDetector
Next, you need to include the ‘3rdparty’ folder in the main CMakeLists.txt file of your repository. Add the following line at the end of your main CMakeLists.txt:
add_subdirectory(3rdparty)
Next, you have to include DnnOpenVinoDetector library in your src/CMakeLists.txt file:
target_link_libraries(${PROJECT_NAME} DnnOpenVinoDetector)
Done!
Example
A simple application shows how to use the DnnOpenVinoDetector library. The application opens a video file “test.mp4” and copies the video frame data into an object of the Frame class and performs objects detection.
#include <opencv2/opencv.hpp>
#include "DnnOpenVinoDetector.h"
int main(void)
{
// Open video file "test.mp4".
cv::VideoCapture videoSource;
if (!videoSource.open("test.mp4"))
return -1;
// Create and init detector.
cr::detector::DnnOpenVinoDetector detector;
cr::detector::ObjectDetectorParams params;
params.initString = "./yolo8n_thermal.onnx;640;640";
params.maxObjectHeight = 256;
params.maxObjectWidth = 256;
params.minObjectHeight = 4;
params.minObjectWidth = 4;
params.type = 1; // 1 - GPU, 0 - CPU.
detector.initObjectDetector(params);
// Main loop.
cv::Mat frameBgrOpenCv;
int frameId = 0;
while (true)
{
// Capture next video frame.
videoSource >> frameBgrOpenCv;
if (frameBgrOpenCv.empty())
{
videoSource.set(cv::CAP_PROP_POS_FRAMES, 0);
continue;
}
// Prepare Frame object.
cr::video::Frame frameBgr;
frameBgr.fourcc = cr::video::Fourcc::BGR24;
frameBgr.width = frameBgrOpenCv.size().width;
frameBgr.height = frameBgrOpenCv.size().height;
frameBgr.size = frameBgr.width * frameBgr.height * 3;
frameBgr.data = frameBgrOpenCv.data;
frameBgr.frameId = frameId++;
// Detect objects.
detector.detect(frameBgr);
// Get list of objects.
std::vector<cr::detector::Object> objects = detector.getObjects();
// Draw detected objects.
for (int n = 0; n < objects.size(); ++n)
{
rectangle(frameBgrOpenCv, cv::Rect(objects[n].x, objects[n].y,
objects[n].width, objects[n].height),
cv::Scalar(0, 0, 255), 1);
putText(frameBgrOpenCv, std::to_string(objects[n].p),
cv::Point(objects[n].x, objects[n].y),
1, 1, cv::Scalar(0, 0, 255));
}
// Show video.
cv::imshow("VIDEO", frameBgrOpenCv);
if (cv::waitKey(1) == 27)
return -1;
}
return 1;
}
Benchmark
The benchmark application is located in the /benchmark folder and is intended to check performance (processing time per frame) on a particular platform. After start, the user should enter the initialization string (see ObjectDetectorParams class description). The benchmark application loads the neural network and opens the test.mp4 video file (it must be placed in the benchmark’s folder; the test.mp4 file is located in the /video folder). The benchmark application shows the number of detected objects and the processing time per frame (microseconds). Benchmark output example:
DnnOpenVinoDetector v1.1.1 benchmark
Set init string ('model;width;height'): yolo8n_thermal.onnx;640;640
Set detector type (0 - CPU, 1 - GPU): 1
2 objects detected, time 16932571 us
2 objects detected, time 9956 us
2 objects detected, time 13013 us
2 objects detected, time 10884 us
2 objects detected, time 9805 us
2 objects detected, time 10910 us
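In the output above the first frame takes far longer than the following ones. A common explanation is one-time initialization on the first inference call, although this document does not state the cause. When summarizing benchmark results it can therefore be useful to average the per-frame times without the first sample. The helper below is a hypothetical sketch and is not part of the benchmark application.

```cpp
#include <numeric>
#include <vector>

// Returns the mean of all samples except the first one, which is treated
// as a warm-up frame. Returns 0.0 when fewer than two samples are available.
double steadyStateMeanUs(const std::vector<long long>& timesUs)
{
    if (timesUs.size() < 2)
        return 0.0;
    const long long sum = std::accumulate(timesUs.begin() + 1, timesUs.end(), 0LL);
    return static_cast<double>(sum) / static_cast<double>(timesUs.size() - 1);
}
```

For the six samples listed above, the five frames after the first one average 10913.6 us, about 10.9 ms per frame.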
Demo application
Demo application overview
The demo application is intended to evaluate the performance of the DnnOpenVinoDetector C++ library. The application allows you to evaluate the detection algorithm with a chosen video file. It is a console application and can be used as an example of DnnOpenVinoDetector library usage. The application uses the OpenCV (version >=4.5) library for capturing video, recording video, displaying video, and forming a simple user interface.
Launch and user interface
The demo application does not require installation. The demo application is compiled for Windows OS x64 (Windows 10 and newer) as well as for Linux OS (a few distros; to get the demo application for a specific Linux distro, send us a request). Table 8 shows the list of files of the demo application.
Table 8 - List of files of demo application (example for Windows OS).
File | Description |
---|---|
DnnOpenVinoDetectorDemo.exe | Demo application executable file for windows OS. |
DnnOpenVinoDetectorDemo.json | Demo application config file. |
opencv_world480.dll | OpenCV library file version 4.8.0 for Windows x64. |
openvino.dll | OpenVino library file version 2022.3.1 for Windows x64. |
tbb.dll | TBB library file from OpenVino 3rdparty for Windows x64. |
openvino_intel_cpu_plugin.dll | OpenVino library file for CPU. |
openvino_onnx_frontend.dll | OpenVino library file for reading ONNX models. |
openvino_paddle_frontend.dll | OpenVino library file for reading Paddle models. |
openvino_tensorflow_frontend.dll | OpenVino library file for reading TensorFlow models. |
openvino_intel_gpu_plugin.dll | OpenVino library file for Intel GPU. |
openvino_auto_plugin.dll | OpenVino library file for auto mode. |
plugins.xml | OpenVino xml config file. |
test.mp4 | Test video file. |
yolo8n_thermal.onnx | Example of YOLOv8 neural network. |
To launch demo application run DnnOpenVinoDetectorDemo.exe executable file on Windows x64 OS or run commands on Linux:
sudo chmod +x DnnOpenVinoDetectorDemo
./DnnOpenVinoDetectorDemo
If a message about missing system libraries appears (on Windows OS) when launching the application, you must install the VC_redist.x64.exe program, which will install the system libraries required for operation. The config file DnnOpenVinoDetectorDemo.json includes video capture params (videoSource section) and object detector parameters (objectDetector section). If the demo application does not find the file after startup, it will create it with default parameters. Config file content:
{
"Params": {
"objectDetector": {
"classNames": [
"bus",
"car",
"person"
],
"custom1": -1.0,
"custom2": -1.0,
"custom3": -1.0,
"enable": true,
"frameBufferSize": -1,
"initString": "yolo8n_thermal.onnx;640;640",
"logMode": 0,
"maxObjectHeight": 256,
"maxObjectWidth": 256,
"maxXSpeed": -1.0,
"maxYSpeed": -1.0,
"minDetectionProbability": 0.5,
"minObjectHeight": 4,
"minObjectWidth": 4,
"minXSpeed": -1.0,
"minYSpeed": -1.0,
"numThreads": -1,
"resetCriteria": 10,
"scaleFactor": -1,
"sensitivity": -1.0,
"type": 1,
"xDetectionCriteria": 1,
"yDetectionCriteria": 1
},
"videoSource": "file dialog"
}
}
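The initString value in the config file follows the layout described for ObjectDetectorParams: one or two model file paths followed by the input width and height, all separated by semicolons. The sketch below shows how such a string could be split and validated before it is passed to the detector. The `InitConfig` structure and the `parseInitString` function are hypothetical and do not reflect how the library parses the string internally.

```cpp
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical parsed form of initString.
struct InitConfig
{
    std::vector<std::string> modelFiles; // One file (.onnx) or two (.xml and .bin).
    int width{0};
    int height{0};
};

// Splits "model;width;height" or "model.xml;model.bin;width;height".
// Returns false if the string does not match either form.
bool parseInitString(const std::string& initString, InitConfig& config)
{
    std::vector<std::string> parts;
    std::stringstream stream(initString);
    std::string item;
    while (std::getline(stream, item, ';'))
        parts.push_back(item);

    if (parts.size() != 3 && parts.size() != 4)
        return false;

    try
    {
        // The last two fields are always width and height.
        config.width = std::stoi(parts[parts.size() - 2]);
        config.height = std::stoi(parts[parts.size() - 1]);
    }
    catch (const std::exception&)
    {
        return false; // Width or height is not a number.
    }
    if (config.width <= 0 || config.height <= 0)
        return false;

    config.modelFiles.assign(parts.begin(), parts.end() - 2);
    return true;
}
```

Checking the string in the application gives a clearer error message than a failed detector initialization when the config file is edited by hand.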
Table 9 - Parameters description.
Field | Type | Description |
---|---|---|
videoSource | string | Video source initialization string. If the parameter is set to “file dialog”, then after start the program will offer to select a video file via the file selection dialog. The parameter can contain a full file name (e.g. “test.mp4”) or an RTP (RTSP) stream string (format “rtsp://username:password@IP:PORT”). You can also specify the camera number in the system (e.g. “0” or “1” or other). When capturing video from a video file, the software plays the video with repetition, i.e. when the end of the video file is reached, playback starts again. |
classNames | string | A list of object class names used in detectors that recognize different object classes. Detected objects have an attribute called ‘type.’ If a detector doesn’t support object class recognition or can’t determine the object type, the ‘type’ field must be set to 0. Otherwise, the ‘type’ should correspond to the ordinal number of the class name from the ‘classNames’ list (if the list was set in params), starting from 1 (where the first element in the list has ‘type == 1’). In provided example classNames are created basing on yolo models layout. |
custom1 | float | Not used. Can have any value. |
custom2 | float | Not used. Can have any value. |
custom3 | float | Not used. Can have any value. |
enable | bool | enable / disable detector. |
frameBufferSize | int | Not used. Can have any value. |
initString | string | Must include the path to a neural network model that is supported by OpenVino and has standard one-batch layout, followed by the dimensions (width;height) of the neural network input images, all separated by semicolons. E.g.: “./model.onnx;640;640”. If the neural network model consists of 2 files, follow this example: “./model.xml;./model.bin;256;480”. |
logMode | int | Not used. Can have any value. |
maxObjectHeight | int | Maximum object height to be detected, pixels. To be detected object’s height must be <= maxObjectHeight. Default value 128. |
maxObjectWidth | int | Maximum object width to be detected, pixels. To be detected object’s width must be <= maxObjectWidth. Default value 128. |
maxXSpeed | float | Not used. Can have any value. |
maxYSpeed | float | Not used. Can have any value. |
minDetectionProbability | float | Defines threshold for object detection probability. Only objects with probability greater than minDetectionProbability will be detected. Can have any values from 0 to 1. |
minObjectHeight | int | Minimum object height to be detected, pixels. To be detected object’s height must be >= minObjectHeight. Default value 4. |
minObjectWidth | int | Minimum object width to be detected, pixels. To be detected object’s width must be >= minObjectWidth. Default value 4. |
minXSpeed | float | Not used. Can have any value. |
minYSpeed | float | Not used. Can have any value. |
numThreads | int | Not used. Can have any value. |
resetCriteria | int | The number of consecutive video frames in which the object was not detected, after which it is excluded from the internal object list. The internal object list is used to associate newly detected objects with objects detected in the previous video frame, which allows the library to keep the object ID stable from frame to frame. If the object is not detected for a number of frames < resetCriteria, its ID will not change the next time it is detected. |
scaleFactor | int | Not used. Can have any value. |
sensitivity | float | Not used. Can have any value. |
type | int | Defines the kind of device used for neural network model computation: 0 - CPU (default), 1 - integrated GPU (GPU.0), 2 - separate GPU (GPU.1), 3 - both GPUs (MULTI mode). More information - OpenVinoDevices. |
xDetectionCriteria | int | The same meaning as yDetectionCriteria. The number of video frames on which the object must be detected before it is included in the list of detected objects. The library keeps a detection counter for each object. After each detection the library increases the detection counter for that object. When detection counter >= xDetectionCriteria, the object is included in the output list of detected objects. If the object is not detected, the library decreases the detection counter. When the user calls setParam(…) for xDetectionCriteria, yDetectionCriteria is automatically set to the same value. The getParams(…) method also returns the same value for xDetectionCriteria and yDetectionCriteria. Valid values from 0 to 8192. |
yDetectionCriteria | int | The same meaning as xDetectionCriteria. The number of video frames on which the object must be detected before it is included in the list of detected objects. The library keeps a detection counter for each object. After each detection the library increases the detection counter for that object. When detection counter >= yDetectionCriteria, the object is included in the output list of detected objects. If the object is not detected, the library decreases the detection counter. When the user calls setParam(…) for xDetectionCriteria, yDetectionCriteria is automatically set to the same value. The getParams(…) method also returns the same value for xDetectionCriteria and yDetectionCriteria. Valid values from 0 to 8192. |
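The classNames convention (type 0 means unknown, type 1 refers to the first name in the list) is easy to get wrong by one. The helper below is a small sketch of the lookup an application could use when drawing labels; `classNameForType` is a hypothetical function, not a library method.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the class name for a 1-based object type, or an empty string when
// the type is 0 (unknown) or falls outside the classNames list.
std::string classNameForType(int type, const std::vector<std::string>& classNames)
{
    if (type < 1 || static_cast<std::size_t>(type) > classNames.size())
        return "";
    return classNames[static_cast<std::size_t>(type) - 1];
}
```

With the classNames list from the config file above ("bus", "car", "person"), type 1 maps to "bus", type 3 maps to "person", and types 0 and 4 give an empty string.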
After starting the application (running the executable file) the user should select the video file in the dialog box (if parameter “videoSource” in config file is set to “file dialog”). After that the user will see the user interface as shown below. The window shows the original video (top) with detection results and binary motion mask (bottom). Each detected object is represented by a red rectangle with metadata above it, which includes (left to right): detection probability, object ID and class name.
Control
To control the application, the main video display window must be active (in focus), and the English keyboard layout must be active with CapsLock off. The program is controlled by the keyboard and mouse (detection ROI control).
Table 10 - Control buttons.
Button | Description |
---|---|
ESC | Exit the application. If video recording is active, it will be stopped. |
SPACE | Reset motion detector. |
R | Start/stop video recording. When video recording is enabled, a file dst_[date and time].avi (result video) is created in the directory with the application executable file. Recording is performed of what is displayed to the user. To stop the recording, press the R key again. During video recording, the application shows a warning message. |
↑ | Arrow up. Navigate through parameters. The active parameter is highlighted in red. |
↓ | Arrow down. Navigate through parameters. The active parameter is highlighted in red. |
→ | Arrow right. Increase parameter value. |
← | Arrow left. Decrease parameter value. |
The user can set the detection mask (mark a rectangular area where objects are to be detected). In order to set a rectangular detection area it is necessary to draw a line with the mouse with the left button pressed from the left-top corner of the required detection area to the right-bottom corner. The detection area will be marked in blue color as shown in the image.
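Turning a mouse drag into a rectangular detection area comes down to ordering the two corner points and clamping them to the frame. The sketch below is hypothetical and is not taken from the demo application: the `Roi` structure and `roiFromDrag` function are illustrative names, and accepting drags in any direction is an extra convenience beyond the top-left to bottom-right gesture described above.

```cpp
#include <algorithm>

// Hypothetical rectangle type for the detection area, in pixels.
struct Roi
{
    int x{0};
    int y{0};
    int width{0};
    int height{0};
};

// Builds a rectangle from the mouse press point (x0, y0) and release point
// (x1, y1), clamped to a frame of frameWidth x frameHeight pixels.
Roi roiFromDrag(int x0, int y0, int x1, int y1, int frameWidth, int frameHeight)
{
    const int left = std::max(0, std::min(x0, x1));
    const int top = std::max(0, std::min(y0, y1));
    const int right = std::min(frameWidth, std::max(x0, x1));
    const int bottom = std::min(frameHeight, std::max(y0, y1));

    Roi roi;
    roi.x = left;
    roi.y = top;
    roi.width = std::max(0, right - left);   // Zero if the drag is outside the frame.
    roi.height = std::max(0, bottom - top);
    return roi;
}
```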