CN112487976A - Monitoring method and device based on image recognition and storage medium - Google Patents

Monitoring method and device based on image recognition and storage medium

Info

Publication number
CN112487976A
CN112487976A (application number CN202011373574.5A)
Authority
CN
China
Prior art keywords
image
target
safety
decoding
data
Prior art date
Legal status
Granted
Application number
CN202011373574.5A
Other languages
Chinese (zh)
Other versions
CN112487976B (en)
Inventor
段勃
李浩澜
杨东鑫
张春明
张杨
Current Assignee
Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences
Original Assignee
Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences
Priority date
Filing date
Publication date
Application filed by Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences filed Critical Western Institute Of Advanced Technology Institute Of Computing Chinese Academy Of Sciences
Priority to CN202011373574.5A priority Critical patent/CN112487976B/en
Publication of CN112487976A publication Critical patent/CN112487976A/en
Application granted granted Critical
Publication of CN112487976B publication Critical patent/CN112487976B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06F 18/214 - Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N 3/045 - Combinations of networks
    • G06N 3/08 - Learning methods
    • G06N 3/126 - Evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G06N 5/01 - Dynamic search techniques; heuristics; dynamic trees; branch-and-bound
    • G06T 9/00 - Image coding
    • G06V 10/32 - Normalisation of the pattern dimensions
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Physiology (AREA)
  • Genetics & Genomics (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Image Analysis (AREA)
  • Alarm Systems (AREA)

Abstract

The invention provides a monitoring method, a monitoring device and a storage medium based on image recognition. The method comprises the following steps: acquiring video image data; decoding the video image data to obtain a decoded image set; preprocessing the decoded image set to obtain a target image set; inputting the target image set into a target detection model for image recognition to obtain a target recognition result for each target image; and performing data analysis on all target recognition results to obtain a monitoring result. The invention thereby solves the problems of high monitoring difficulty and high monitoring cost in prior-art methods for monitoring hanging basket operations, improves the precision and timeliness of safety monitoring, and meets the requirement for monitoring the safety of hanging basket operations.

Description

Monitoring method and device based on image recognition and storage medium
Technical Field
The invention relates to the technical field of image recognition, in particular to a monitoring method and device based on image recognition and a storage medium.
Background
The hanging basket is a novel type of high-altitude operation equipment that can replace traditional scaffolding, reduce labor intensity, improve working efficiency and be reused, and it is widely applied in operations such as exterior wall construction, curtain wall installation, thermal insulation construction, maintenance and cleaning of high-rise buildings. However, hanging basket operation is a high-risk type of work in the construction industry: safety and reliability are relatively poor and operation management is relatively disordered. At present, traditional video acquisition technology is usually adopted to record images on site, and the safety of the hanging basket operation is then supervised manually from those images, which leads to high monitoring difficulty and high cost.
Therefore, the safety monitoring methods for hanging basket operations in the prior art suffer from high monitoring difficulty and high monitoring cost, and potential safety hazards arise when manual monitoring is not in place, so they cannot meet the requirement for monitoring the safety of hanging basket operations.
Disclosure of Invention
Aiming at the above defects in the prior art, the monitoring method, monitoring device and storage medium based on image recognition provided by the invention solve the problems of high monitoring difficulty and high monitoring cost of prior-art safety monitoring methods for hanging basket operations, improve the precision and timeliness of safety monitoring, and meet the requirement for automatic monitoring of hanging basket operation safety.
In a first aspect, the present invention provides a monitoring method based on image recognition, the method comprising: acquiring video image data; decoding the video image data to obtain a decoded image set; preprocessing the decoded image set to obtain a target image set; inputting the target image set into a target detection model for image recognition to obtain a target recognition result of each target image; and performing data analysis on all target recognition results to obtain a monitoring result.
Optionally, before acquiring the video image data, the method further includes: acquiring a sample data set; carrying out data annotation on the sample data set to obtain an annotated data set; performing data enhancement on the labeled data set to obtain a training data set; inputting the training data set into an artificial intelligence method for iterative training to obtain the target detection model.
Optionally, decoding the video image data to obtain a decoded image set includes: the main thread distributes each frame of image in the video image data to a plurality of decoding sub-threads for decoding; the plurality of decoding sub-threads store the decoded images in a buffer queue; and all decoded images in the buffer queue are used as the decoded image set.
Optionally, preprocessing the decoded image set to obtain a target image set, including: modifying the size of each image in the decoded image set according to a preset standard size to obtain a standard image set; and filtering the standard image set to obtain the target image set.
Optionally, the target recognition result of each target image includes: a plurality of operation categories and a plurality of pieces of position information, the position information corresponding to the operation categories.
Optionally, after the target image set is input into a target detection model for image recognition and the target recognition result of each target image is obtained, the method further includes: generating a plurality of position frames according to the plurality of pieces of position information; and combining the plurality of operation categories, the plurality of position frames and the video image data to obtain an annotated video image.
Optionally, when the plurality of operation categories include not wearing a safety helmet, wearing a safety belt, not wearing a safety belt, fastened safety buckle, unfastened safety buckle, and safety buckle position violation, performing data analysis on all target recognition results to obtain a monitoring result includes: classifying and summarizing all the target recognition results to obtain the number of each operation category; judging whether the number of worn safety helmets is smaller than a first threshold; when the number of worn safety helmets is larger than or equal to the first threshold, judging whether the number of worn safety belts is smaller than a second threshold; when the number of worn safety belts is larger than or equal to the second threshold, judging whether the number of fastened safety buckles is smaller than a third threshold; when the number of fastened safety buckles is larger than or equal to the third threshold, judging whether a violation exists in the position of the safety buckle; and when the position of the safety buckle is not violated, the monitoring result is a normal operation state.
Optionally, the determining whether the position of the safety buckle is violated includes: acquiring first pixel information and second pixel information of the safety buckle and a target reference position in the target image; obtaining a reference distance from the safety buckle to the target reference position according to the first pixel information and the second pixel information; obtaining a reference size of the safety buckle in the target image according to the first pixel information of the safety buckle; obtaining a scaling according to the ratio of the reference size and the actual size of the safety buckle; obtaining the actual distance from the safety buckle to the target reference position according to the reference distance and the scaling; and comparing the actual distance with a preset distance, and judging whether the position of the safety buckle is violated.
In a second aspect, the present invention provides an image recognition-based monitoring device, the device comprising: a video image acquisition module for acquiring video image data; a decoding module for decoding the video image data to obtain a decoded image set; an image preprocessing module for preprocessing the decoded image set to obtain a target image set; an image recognition module for inputting the target image set into a target detection model for image recognition to obtain a target recognition result of each target image; and a data analysis module for performing data analysis on all target recognition results to obtain a monitoring result.
In a third aspect, the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of: acquiring video image data; decoding the video image data to obtain a decoded image set; preprocessing the decoded image set to obtain a target image set; inputting the target image set into a target detection model for image recognition to obtain a target recognition result of each target image; and performing data analysis on all target recognition results to obtain a monitoring result.
Compared with the prior art, the invention has the following beneficial effects:
1. According to the invention, the working conditions of operators are obtained automatically by performing image acquisition, image recognition and data analysis on the monitoring area of the hanging basket operation, so the monitoring precision is high and the problems of high monitoring difficulty and high monitoring cost in the prior art are solved.
2. By decoding the video images with multiple threads, the invention reduces or avoids waiting time in the image recognition process, improves image processing efficiency and the timeliness of safety monitoring, meets the requirement of real-time monitoring of hanging basket operation safety, and helps prevent safety accidents in time.
Drawings
Fig. 1 is a schematic flow chart of a monitoring method based on image recognition according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of another monitoring method based on image recognition according to an embodiment of the present invention;
fig. 3 is a schematic flow chart of another monitoring method based on image recognition according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a monitoring device based on image recognition according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 is a schematic flow chart of a monitoring method based on image recognition according to an embodiment of the present invention; as shown in fig. 1, the monitoring method based on image recognition specifically includes the following steps:
step S101, video image data is acquired.
In this embodiment, the acquired video image data may be real-time video image data of the current time period or historical video image data. To meet the requirement of monitoring the video images in real time, the duration of each time period can be set according to the actual situation, for example 1 second, 200 milliseconds or 300 milliseconds, and all image data acquired in the target monitoring area within the current time period constitute the video image data.
In practical application, the image acquisition device is fixed on the hanging basket at a target monitoring point from which the video image data of the hanging basket operation area can be acquired with the largest field of view, so the operation area monitored at the target monitoring point is used as the target monitoring area.
And step S102, decoding the video image data to obtain a decoded image set.
Specifically, this embodiment can decode the video image data in a single thread; to improve decoding efficiency, it can also decode the video image data with multiple threads simultaneously. The image acquisition device compresses and encodes the captured operation video to obtain the video image data and transmits it to the core processor of a server, and the core processor performs multi-threaded decoding on each image in the received video image data to obtain a plurality of decoded images.
Further, decoding the video image data to obtain a decoded image set, including: the main thread distributes each frame of image in the video image data to a plurality of decoding sub-threads for decoding; the plurality of decoding sub-threads store the decoded images in a buffer queue; wherein all decoded pictures in the buffer queue are used as the decoded picture set.
It should be noted that, in order to improve video image processing efficiency, in this embodiment the core processor decodes the video images, the graphics card performs image recognition on the video images, and the core processor then analyzes and calculates the image recognition results to obtain the safety monitoring result. The core processor therefore needs to send the video images to the graphics card for image recognition after decoding them one by one, so the decoding efficiency affects the efficiency of image recognition.
In order to reduce or avoid the time the graphics card waits to acquire decoded images, the core processor uses multiple threads to decode the video synchronously. The main thread is a control thread that receives data such as control instructions, image recognition results and safety analysis results and sends control instructions to the decoding threads, so that each decoding thread acquires encoded images from the corresponding video source address according to the control instruction, decodes them, and stores the decoded images in a buffer queue in sequence, from which the graphics card retrieves them.
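For illustration only, the following sketch shows one way such a multi-threaded decoding scheme could look in Python, assuming the frames arrive as JPEG-compressed byte strings and using OpenCV together with the standard threading and queue modules; the function and queue names are hypothetical and are not taken from the patent.

```python
import threading
import queue

import cv2
import numpy as np

encoded_frames = queue.Queue()   # frames handed out by the main thread
buffer_queue = queue.Queue()     # decoded images (the "decoded image set")

def decode_worker():
    """Decoding sub-thread: take an encoded frame, decode it, push it to the buffer queue."""
    while True:
        item = encoded_frames.get()
        if item is None:          # sentinel: no more frames for this worker
            break
        frame_id, jpeg_bytes = item
        image = cv2.imdecode(np.frombuffer(jpeg_bytes, np.uint8), cv2.IMREAD_COLOR)
        buffer_queue.put((frame_id, image))
        encoded_frames.task_done()

def decode_video(frames, num_workers=4):
    """Main thread: distribute each encoded frame to the decoding sub-threads."""
    workers = [threading.Thread(target=decode_worker, daemon=True) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for frame_id, jpeg_bytes in enumerate(frames):
        encoded_frames.put((frame_id, jpeg_bytes))
    encoded_frames.join()                       # wait until every frame has been decoded
    for _ in workers:
        encoded_frames.put(None)                # stop the workers
    decoded = []
    while not buffer_queue.empty():
        decoded.append(buffer_queue.get())
    decoded.sort(key=lambda item: item[0])      # restore frame order
    return [image for _, image in decoded]      # the decoded image set
```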
And step S103, preprocessing the decoding image set to obtain a target image set.
Specifically, preprocessing the decoded image set to obtain a target image set includes: modifying the size of each image in the decoded image set according to a preset standard size to obtain a standard image set; and filtering the standard image set to obtain the target image set.
It should be noted that, before the images are fed into the target detection model, a preprocessing step is required: each decoded image is resized to the fixed input size of the model, which in this embodiment may be 608 × 608 or 416 × 416; the larger the image, the more features the model can capture and the better the recognition effect. The resized image is then blurred by Gaussian filtering, mainly to reduce random noise caused by small, distant targets in the background; after the Gaussian kernel parameters are calculated with the Gaussian filtering formula, global filtering is applied to the image.
The Gaussian filtering formula employed in this implementation is the standard two-dimensional Gaussian kernel

G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²)),

where σ is the standard deviation that controls the strength of the smoothing.
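A minimal sketch of this preprocessing step, assuming OpenCV is available; the kernel size and σ below are illustrative values, not parameters specified by the patent.

```python
import cv2

MODEL_INPUT_SIZE = (608, 608)   # could also be (416, 416); larger inputs preserve more detail

def preprocess(decoded_images, kernel_size=(5, 5), sigma=1.0):
    """Resize each decoded image to the model input size and apply Gaussian filtering."""
    target_images = []
    for image in decoded_images:
        resized = cv2.resize(image, MODEL_INPUT_SIZE, interpolation=cv2.INTER_LINEAR)
        # Gaussian blur suppresses random noise from small, distant background objects.
        blurred = cv2.GaussianBlur(resized, kernel_size, sigma)
        target_images.append(blurred)
    return target_images
```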
and step S104, inputting the target image set into a target detection model for image recognition to obtain a target recognition result of each target image.
It should be noted that the target recognition result of each target image includes a plurality of operation categories and a plurality of pieces of position information, the position information corresponding to the operation categories; the operation categories include not wearing a safety helmet, wearing a safety belt, not wearing a safety belt, fastened safety buckle, unfastened safety buckle, and safety buckle position violation.
The target detection model is a model that has been trained, verified and tested on a sample data set. Each target image in the target image set is input into the target detection model and passes through multiple layers of convolution, pooling and activation to obtain its target recognition result, i.e., whether one or more of the above operation categories appear in the target image.
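The shape of a target recognition result can be illustrated with a small container type; the category names, the box format and the `detector` callable below are assumptions for illustration, not the patent's actual model interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Detection:
    category: str                      # e.g. "helmet_on", "no_seatbelt", "buckle_position_violation"
    box: Tuple[int, int, int, int]     # (x1, y1, x2, y2) position information in pixels
    confidence: float

def recognize(target_images, detector: Callable) -> List[List[Detection]]:
    """Run the trained target detection model on every target image;
    the result is one list of detections (categories + positions) per image."""
    return [detector(image) for image in target_images]
```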
And step S105, performing data analysis on all target recognition results to obtain monitoring results.
It should be noted that the monitoring result is obtained after comprehensively analyzing the target recognition results of all target images in the current time period; the monitoring results include, but are not limited to, personnel in a safe state, safety helmet not worn, safety belt not worn, safety buckle not fastened, safety buckle position violation, and the like. On-site operators and remote managers can then be reminded in time according to the safety detection result, which reduces safety risks and helps avoid safety accidents in time.
Fig. 2 is a schematic flow chart of another monitoring method based on image recognition according to an embodiment of the present invention; as shown in fig. 2, before acquiring video image data, the method provided by the present invention further comprises the following steps:
step S201, acquiring a sample data set;
step S202, carrying out data annotation on the sample data set to obtain an annotated data set;
step S203, performing data enhancement on the labeled data set to obtain a training data set;
and step S204, inputting the training data set into an artificial intelligence method for iterative training to obtain the target detection model.
In practical application, according to the real hanging basket scene, the operation categories are divided into: not wearing a safety helmet, wearing a safety helmet, wearing a safety belt, not wearing a safety belt, fastened safety buckle, unfastened safety buckle, and safety buckle position violation. Video images of each category are collected separately, and the image data in these videos are used as the sample data set. The sample data set is annotated and augmented according to the operation categories to obtain an enhanced data set, which is divided into a training data set, a verification data set and a test data set at a ratio of 8:1:1.
Further, the training data set is input into an artificial intelligence method for iterative training to obtain a training detection model, which is then verified and tested with the verification data set and the test data set to obtain the target detection model. The artificial intelligence methods include, but are not limited to, convolutional neural networks, recurrent neural networks, deep neural networks, decision trees, rule-based expert systems, genetic algorithms, regression algorithms, Bayesian algorithms, and other methods with similar functions.
In this embodiment, the sample data set is expanded by collecting images from upward and downward camera viewing angles and under different illumination, backgrounds and partial occlusion, and enhancement methods such as geometric distortion, illumination distortion, random tilt, layer mixing, blurring, mosaic and random occlusion are added in the model training stage, which ensures a high generalization capability of the model.
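As a hedged illustration of this data preparation, the sketch below shows the 8:1:1 split plus two simple augmentations (horizontal flip and illumination change); the other enhancements mentioned above (random tilt, layer mixing, mosaic, random occlusion) would be added in the same style, and the function names are illustrative only.

```python
import random

import cv2
import numpy as np

def split_dataset(samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle the annotated samples and split them into training/validation/test sets (8:1:1)."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n_train = int(len(samples) * ratios[0])
    n_val = int(len(samples) * ratios[1])
    return samples[:n_train], samples[n_train:n_train + n_val], samples[n_train + n_val:]

def augment(image):
    """Two illustrative enhancements: horizontal flip (geometric distortion) and brightness
    change (illumination distortion). Box annotations must be flipped together with the image."""
    if random.random() < 0.5:
        image = cv2.flip(image, 1)
    gain = random.uniform(0.7, 1.3)
    return np.clip(image.astype(np.float32) * gain, 0, 255).astype(np.uint8)
```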
Through customized classification for the scene, the invention effectively reduces the false recognition rate; the expansion of the data set ensures the generalization capability of the model and improves recognition in more complex scenes; and mosaic enhancement ensures the detection capability for small targets.
Fig. 3 is a schematic flow chart of another monitoring method based on image recognition according to an embodiment of the present invention. As shown in fig. 3, the invention obtains a batch of image detection results (object detection category and frame position information) to judge whether the person is in a safe state within the calculation cycle. If the number of safety helmets detected in the batch is less than a threshold (number of images in the batch × false detection rate), it is determined that the person is not wearing a safety helmet. It is then judged whether the number of safety belts detected in the batch is smaller than the corresponding threshold (number of images in the batch × false detection rate), to determine whether the person is wearing a safety belt. Finally, it is judged whether the buckle is at a legal position on the safety rope, and the detection result is output in case of a violation. The buckle position is observed by a network camera with an unchanged background, so the input image view of the vision algorithm does not change as the hanging basket is raised or lowered.
In another embodiment of the present invention, performing data analysis on all the target recognition results to obtain the monitoring result includes: classifying and summarizing all the target recognition results to obtain the number of each operation category; judging whether the number of worn safety helmets is smaller than a first threshold; when the number of worn safety helmets is larger than or equal to the first threshold, judging whether the number of worn safety belts is smaller than a second threshold; when the number of worn safety belts is larger than or equal to the second threshold, judging whether the number of fastened safety buckles is smaller than a third threshold; when the number of fastened safety buckles is larger than or equal to the third threshold, judging whether a violation exists in the position of the safety buckle; and when the position of the safety buckle is not violated, the monitoring result is a normal operation state.
It should be noted that the first threshold, the second threshold and the third threshold are set according to the number of operators in the target monitoring area. When there is only one operator, the three thresholds are all theoretically 1; however, since the target detection model and the personnel safety analysis algorithm have corresponding calculation errors, the thresholds are set by comprehensive calculation based on the actual conditions and those errors.
When the number of the safety helmets is smaller than the first threshold value, the safety detection result of the current time period is that the safety helmets are not worn; when the number of the safety belts is smaller than the second threshold value, the safety detection result of the current time period is that no safety belt is worn; when the number of the buckles is smaller than the third threshold value, the safety detection result of the current time period is that the safety buckles are not tied; and when the position of the safety buckle is violated, the safety detection result of the current time period is the violation of the buckle position.
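A minimal sketch of this cascaded threshold logic, reusing the illustrative `Detection` container from the earlier sketch; the category strings and the `buckle_position_ok` callback are assumptions, and each threshold would be set roughly as batch size × false detection rate, as described above.

```python
from collections import Counter

def analyze(batch_detections, t_helmet, t_belt, t_buckle, buckle_position_ok):
    """Cascaded threshold checks over one batch of target recognition results."""
    counts = Counter(d.category for detections in batch_detections for d in detections)
    if counts["helmet_on"] < t_helmet:
        return "safety helmet not worn"
    if counts["seatbelt_on"] < t_belt:
        return "safety belt not worn"
    if counts["buckle_fastened"] < t_buckle:
        return "safety buckle not fastened"
    if not buckle_position_ok(batch_detections):
        return "safety buckle position violation"
    return "normal operation"
```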
In another embodiment of the present invention, the determining whether the position of the safety buckle has a violation includes: acquiring first pixel information and second pixel information of the safety buckle and a target reference position in the target image; obtaining a reference distance from the safety buckle to the target reference position according to the first pixel information and the second pixel information; obtaining a reference size of the safety buckle in the target image according to the first pixel information of the safety buckle; obtaining a scaling according to the ratio of the reference size and the actual size of the safety buckle; obtaining the actual distance from the safety buckle to the target reference position according to the reference distance and the scaling; and comparing the actual distance with a preset distance, and judging whether the position of the safety buckle is violated.
It should be noted that, in order to further ensure the safety of the hanging basket operator, the safety buckle needs to be fastened at a preset position whose distance from the bottom of the hanging basket is a preset distance, such as 2 meters or 2.5 meters, and the target reference position is the position of the bottom of the hanging basket; whether the position of the safety buckle violates the rule is therefore detected by measuring the distance from the safety buckle to the target reference position.
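The distance estimate can be sketched as follows, under the assumptions that the buckle's real size is known, that the reference position is the y-coordinate of the basket bottom in the image, and that a 2-metre preset distance applies; the comparison direction depends on the site rule and is only an example.

```python
def buckle_position_violation(buckle_box, reference_y, buckle_actual_size_m, preset_distance_m=2.0):
    """Estimate the real-world distance from the safety buckle to the target reference position
    (here: the y-coordinate of the basket bottom) and compare it with the preset distance."""
    x1, y1, x2, y2 = buckle_box
    buckle_center_y = (y1 + y2) / 2.0
    reference_distance_px = abs(reference_y - buckle_center_y)   # distance in pixels
    reference_size_px = max(y2 - y1, 1)                          # buckle size in pixels
    metres_per_pixel = buckle_actual_size_m / reference_size_px  # scaling from the known buckle size
    actual_distance_m = reference_distance_px * metres_per_pixel
    return actual_distance_m < preset_distance_m                 # True indicates a position violation
```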
In another embodiment of the present invention, after inputting the target image set into a target detection model for image recognition, and obtaining a target recognition result of each target image, the method further includes: generating a plurality of position frames according to the plurality of position information; and combining the plurality of operation types, the plurality of position frames and the video image data to obtain an annotated video image.
It should be noted that, after the target detection model performs target detection on the video image data, position frames are generated from the position information in the detection results to track and mark the corresponding targets in the video image, which yields the annotated video image of the current time period so that remote managers can monitor and manage the field operation more clearly.
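A short sketch of generating such an annotated frame with OpenCV, reusing the illustrative `Detection` container from above; the colour and font settings are arbitrary choices.

```python
import cv2

def annotate(frame, detections):
    """Draw each detected operation category and its position frame onto the video frame."""
    for det in detections:
        x1, y1, x2, y2 = det.box
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)
        cv2.putText(frame, det.category, (x1, max(y1 - 5, 0)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1)
    return frame
```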
Fig. 4 is a schematic structural diagram of a monitoring device based on image recognition according to an embodiment of the present invention, and as shown in fig. 4, the monitoring device based on image recognition specifically includes:
a video image obtaining module 410, configured to obtain video image data;
a decoding module 420, configured to decode the video image data to obtain a decoded image set;
the image preprocessing module 430 is configured to preprocess the decoded image set to obtain a target image set;
the image recognition module 440 is configured to input the target image set to a target detection model for image recognition, so as to obtain a target recognition result of each target image;
and the data analysis module 450 is configured to perform data analysis on all the target identification results to obtain a monitoring result.
In the present invention, CSPDarkNet53 is used as the neural network backbone. First, CSPNet is fused into DarkNet53, mainly to solve the problem of repeated gradient information in the backbone of a deep convolutional neural network framework: the gradient changes are integrated into the feature map, which reduces the parameters of the model and balances speed and accuracy, corresponding to the fast and accurate characteristics mentioned in the invention. Second, building on the concept of DenseNet, CSPNet passes a copy of the base layer's feature map around the dense block to the next stage, which effectively alleviates gradient vanishing, supports feature propagation and allows the network to reuse feature information.
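The cross-stage-partial idea can be sketched in PyTorch as below; this is a simplified, illustrative block only, not the actual CSPDarkNet53 backbone, which stacks many such blocks together with residual units and downsampling layers.

```python
import torch
import torch.nn as nn

class CSPBlock(nn.Module):
    """Cross-stage-partial block sketch: one half of the channels bypasses the
    feature-extraction stack and is concatenated back, avoiding duplicated gradient paths."""
    def __init__(self, channels: int):
        super().__init__()
        half = channels // 2
        self.split_bypass = nn.Conv2d(channels, half, kernel_size=1, bias=False)
        self.split_main = nn.Conv2d(channels, half, kernel_size=1, bias=False)
        self.stack = nn.Sequential(
            nn.Conv2d(half, half, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(half),
            nn.Mish(),                          # torch.nn.Mish (available in PyTorch >= 1.9)
        )
        self.transition = nn.Conv2d(2 * half, channels, kernel_size=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        bypass = self.split_bypass(x)           # part that skips the stack
        main = self.stack(self.split_main(x))   # part that goes through the feature stack
        return self.transition(torch.cat([bypass, main], dim=1))
```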
The invention also adopts the more accurate Mish activation function in place of the LeakyReLU method for training. The negative interval is not completely truncated, which preserves the possibility of small gradients flowing through, makes the activation after convolution more accurate, and speeds up model convergence. The expression of the Mish activation function is Mish(x) = x · tanh(ln(1 + e^x)).
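A direct implementation of this expression, using softplus as a numerically stable form of ln(1 + e^x):

```python
import torch
import torch.nn.functional as F

def mish(x: torch.Tensor) -> torch.Tensor:
    # Mish(x) = x * tanh(ln(1 + e^x))
    return x * torch.tanh(F.softplus(x))
```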
In another embodiment of the present invention, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program: acquiring video image data; decoding the video image data to obtain a decoded image set; preprocessing the decoded image set to obtain a target image set; inputting the target image set into a target detection model for image recognition to obtain a target recognition result of each target image; and performing data analysis on all target recognition results to obtain a monitoring result.
In another embodiment of the present invention, decoding the video image data to obtain a decoded image set further comprises:
the main thread acquires the number of decoding tasks in a task queue and the maximum number of threads of the current central processing unit, wherein the task queue stores a plurality of decoding tasks and each decoding task corresponds to one video stream;
the main thread creates a plurality of decoding sub-threads according to the task number and the maximum thread number;
the plurality of decoding sub-threads receive the decoding instructions sent by the main thread and, according to the decoding instructions, simultaneously decode each frame of image in each decoding task to obtain decoded frames;
the decoding sub-threads sequentially store each decoded frame into a decoding queue;
the main thread acquires, in real time, the number of decoded images matching each decoding task in the decoding queue;
judging whether the number of decoded images is equal to a first preset value;
and when the number of decoded images is equal to the first preset value, taking the decoded images corresponding to that number as the decoded image set.
The embodiment of the invention can be applied to the field of safety monitoring of hanging basket operations: each hanging basket is equipped with a video monitoring device, and the video images collected on a plurality of hanging baskets are processed and recognized to judge whether personnel are operating safely. Therefore, one video stream corresponds to one decoding task; when there are multiple video streams, there are multiple decoding tasks. Each decoding task has a unique identity, and all decoding tasks are stored in the task queue.
The main thread of the central processing unit creates an appropriate number of decoding sub-threads according to the number of decoding tasks in the task queue and the maximum number of threads of the current central processing unit. For example, when the number of decoding tasks is 8 and the current maximum number of threads is 16, 8 decoding sub-threads are created. When the number of decoding tasks is greater than the maximum number of decoding sub-threads that can currently be created, a central processing unit may be added and the creation of decoding sub-threads and the assignment of tasks performed in a distributed multi-threaded manner; the processing procedure is the same as in this embodiment and is not repeated here.
After the main thread assigns a decoding task to each decoding sub-thread, it generates a decoding instruction carrying the identity of that decoding task and sends it to the corresponding decoding sub-thread, so that each decoding sub-thread acquires the video images of its decoding task from the corresponding video source address according to the identity in the decoding instruction and decodes each frame. Each decoded frame is stored in the decoding queue in sequence. Each decoding sub-thread also adds a timestamp to each decoded frame, so the identifier of each decoded frame consists of the identity of the decoding task and the timestamp. A decoded frame containing a timestamp is stored in the decoding queue as soon as the decoding sub-thread finishes decoding it, so the decoded images in the decoding queue follow an unordered, first-in first-out storage rule.
Further, the main thread acquires in real time the number of decoded images corresponding to each decoding task in the decoding queue; when that number reaches a first preset value, all decoded images corresponding to the decoding task are taken as the target decoded image set. For example, when the first preset value is 8, the main thread judges in real time whether the decoded images from the same decoding task in the decoding queue have reached 8 frames; when 8 decoded frames exist, all 8 frames are taken out as the decoded image set. When a decoding task has not yet reached 8 decoded frames, the corresponding decoding sub-thread must continue decoding. Of course, the first preset value may be adjusted according to the actual processing capacity of the graphics processor.
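A sketch of this main-thread gating logic: decoded frames tagged with a task identity and timestamp are grouped per task, and a batch is emitted once a task has accumulated the first preset value of frames. The names and the queue item layout are assumptions for illustration.

```python
import queue
from collections import defaultdict

FIRST_PRESET_VALUE = 8   # batch size per decoding task; tune to the graphics processor's capacity

def collect_batches(decode_queue: "queue.Queue", pending=None):
    """Drain the decoding queue, group decoded frames by task identity, and emit a
    decoded image set whenever a task has accumulated FIRST_PRESET_VALUE frames."""
    pending = pending if pending is not None else defaultdict(list)
    batches = []
    while True:
        try:
            task_id, timestamp, image = decode_queue.get_nowait()
        except queue.Empty:
            break
        pending[task_id].append((timestamp, image))
        if len(pending[task_id]) >= FIRST_PRESET_VALUE:
            batch = sorted(pending.pop(task_id), key=lambda item: item[0])  # restore frame order
            batches.append((task_id, [image for _, image in batch]))
    return batches, pending   # incomplete tasks stay in `pending` until more frames arrive
```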
In a further embodiment of the invention, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, implements the following steps: acquiring video image data; decoding the video image data to obtain a decoded image set; preprocessing the decoded image set to obtain a target image set; inputting the target image set into a target detection model for image recognition to obtain a target recognition result of each target image; and performing data analysis on all target recognition results to obtain a monitoring result.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium, and can include the processes of the embodiments of the methods described above when the program is executed. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A monitoring method based on image recognition, the method comprising:
acquiring video image data;
decoding the video image data to obtain a decoded image set;
preprocessing the decoded image set to obtain a target image set;
inputting the target image set into a target detection model for image recognition to obtain a target recognition result of each target image;
and performing data analysis on all target recognition results to obtain a monitoring result.
2. The image recognition-based monitoring method of claim 1, wherein prior to acquiring video image data, the method further comprises:
acquiring a sample data set;
carrying out data annotation on the sample data set to obtain an annotated data set;
performing data enhancement on the labeled data set to obtain a training data set;
inputting the training data set into an artificial intelligence method for iterative training to obtain the target detection model.
3. The image recognition-based monitoring method of claim 1, wherein decoding the video image data to obtain a decoded image set comprises:
the main thread distributes each frame of image in the video image data to a plurality of decoding sub-threads for decoding;
the plurality of decoding sub-threads store the decoded images in a buffer queue;
wherein all decoded images in the buffer queue are used as the decoded image set.
4. The image recognition-based monitoring method of claim 1, wherein preprocessing the decoded image set to obtain a target image set comprises:
modifying the size of each image in the decoded image set according to a preset standard size to obtain a standard image set;
and filtering the standard image set to obtain the target image set.
5. The image recognition-based monitoring method of claim 1, wherein the target recognition result of each target image comprises:
a plurality of operation categories and a plurality of pieces of position information, the position information corresponding to the operation categories.
6. The image recognition-based monitoring method of claim 5, wherein after inputting the target image set into a target detection model for image recognition to obtain a target recognition result of each target image, the method further comprises:
generating a plurality of position frames according to the plurality of position information;
and combining the plurality of operation categories, the plurality of position frames and the video image data to obtain an annotated video image.
7. The image recognition-based monitoring method of claim 5, wherein when the plurality of operation categories include not wearing a safety helmet, wearing a safety belt, not wearing a safety belt, fastened safety buckle, unfastened safety buckle, and safety buckle position violation, performing data analysis on all target recognition results to obtain a monitoring result comprises:
classifying and summarizing all the target recognition results to obtain the number of each operation category;
judging whether the number of worn safety helmets is smaller than a first threshold value;
when the number of worn safety helmets is larger than or equal to the first threshold value, judging whether the number of worn safety belts is smaller than a second threshold value;
when the number of worn safety belts is larger than or equal to the second threshold value, judging whether the number of fastened safety buckles is smaller than a third threshold value;
when the number of fastened safety buckles is larger than or equal to the third threshold value, judging whether a violation exists in the position of the safety buckle;
and when the position of the safety buckle is not violated, the monitoring result is a normal operation state.
8. The image recognition-based monitoring method of claim 7, wherein determining whether a violation exists in the position of the safety buckle comprises:
acquiring first pixel information and second pixel information of the safety buckle and a target reference position in the target image;
obtaining a reference distance from the safety buckle to the target reference position according to the first pixel information and the second pixel information;
obtaining a reference size of the safety buckle in the target image according to the first pixel information of the safety buckle;
obtaining a scaling according to the ratio of the reference size and the actual size of the safety buckle;
obtaining the actual distance from the safety buckle to the target reference position according to the reference distance and the scaling;
and comparing the actual distance with a preset distance, and judging whether the position of the safety buckle is violated.
9. An image recognition-based monitoring device, the device comprising:
the video image acquisition module is used for acquiring video image data;
the decoding module is used for decoding the video image data to obtain a decoded image set;
the image preprocessing module is used for preprocessing the decoded image set to obtain a target image set;
the image identification module is used for inputting the target image set into a target detection model for image identification to obtain a target identification result of each target image;
and the data analysis module is used for carrying out data analysis on all the target recognition results to obtain monitoring results.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 8.
CN202011373574.5A 2020-11-30 2020-11-30 Monitoring method, device and storage medium based on image recognition Active CN112487976B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011373574.5A CN112487976B (en) 2020-11-30 2020-11-30 Monitoring method, device and storage medium based on image recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011373574.5A CN112487976B (en) 2020-11-30 2020-11-30 Monitoring method, device and storage medium based on image recognition

Publications (2)

Publication Number Publication Date
CN112487976A 2021-03-12
CN112487976B CN112487976B (en) 2023-10-24

Family

ID=74937354

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011373574.5A Active CN112487976B (en) 2020-11-30 2020-11-30 Monitoring method, device and storage medium based on image recognition

Country Status (1)

Country Link
CN (1) CN112487976B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101291434A (en) * 2007-04-17 2008-10-22 华为技术有限公司 Encoding/decoding method and device for multi-video
CN101753572A (en) * 2009-12-23 2010-06-23 西北工业大学 BitTorrent file pollution method based on anti-blacklist mechanism
CN105100803A (en) * 2014-04-29 2015-11-25 三星电子(中国)研发中心 Video decoding optimization method
CN106095536A (en) * 2016-06-22 2016-11-09 福建联迪商用设备有限公司 Multithreading coding/decoding method based on multinuclear MPU and system
CN107391090A (en) * 2017-07-28 2017-11-24 成都优博创通信技术股份有限公司 Multithreading performs method and device
CN108551582A (en) * 2018-03-19 2018-09-18 上海交通大学 A kind of identity method and system of image, video data
CN110852283A (en) * 2019-11-14 2020-02-28 南京工程学院 Helmet wearing detection and tracking method based on improved YOLOv3
CN111178212A (en) * 2019-12-23 2020-05-19 深圳供电局有限公司 Image recognition method and device, computer equipment and storage medium
CN111462226A (en) * 2020-01-19 2020-07-28 杭州海康威视系统技术有限公司 Positioning method, system, device, electronic equipment and storage medium
CN111476117A (en) * 2020-03-25 2020-07-31 中建科技有限公司深圳分公司 Safety helmet wearing detection method and device and terminal

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114355877A (en) * 2021-11-25 2022-04-15 烟台杰瑞石油服务集团股份有限公司 Method and device for distributing multi-robot operation area
CN114355877B (en) * 2021-11-25 2023-11-03 烟台杰瑞石油服务集团股份有限公司 Multi-robot operation area distribution method and device
CN113989625A (en) * 2021-12-13 2022-01-28 上海华谊信息技术有限公司 Image identification method, device and equipment based on edge calculation and storage medium

Also Published As

Publication number Publication date
CN112487976B (en) 2023-10-24

Similar Documents

Publication Publication Date Title
CN112488906B (en) Image processing method based on multithreading, computer equipment and storage medium
CN108319926A (en) A kind of the safety cap wearing detecting system and detection method of building-site
CN111783744A (en) Operation site safety protection detection method and device
CN108062542B (en) Method for detecting shielded human face
CN112487976B (en) Monitoring method, device and storage medium based on image recognition
CN112396658B (en) Indoor personnel positioning method and system based on video
KR102195706B1 (en) Method and Apparatus for Detecting Intruder
CN111259763B (en) Target detection method, target detection device, electronic equipment and readable storage medium
Zhafran et al. Computer vision system based for personal protective equipment detection, by using convolutional neural network
CN106570440A (en) People counting method and people counting device based on image analysis
CN115620192A (en) Method and device for detecting wearing of safety rope in aerial work
KR20210122766A (en) Facility maintain management system using smart glass and qr code
CN111027518A (en) Suspicious crowd intelligent alarm method and device, computer equipment and storage medium
US20180158269A1 (en) System and method for identifying fraud attempt of an entrance control system
CN113052125B (en) Construction site violation image recognition and alarm method
CN108460357B (en) Windowing alarm detection system and method based on image recognition
CN114332587A (en) Fire disaster identification early warning method and device
CN116092198B (en) Mining safety helmet identification detection method, device, equipment and medium
CN112216062B (en) Community security early warning method, device, computer equipment and system
CN113887310A (en) Worker standard dressing identification method, system and medium
CN112906488A (en) Security protection video quality evaluation system based on artificial intelligence
CN112132011B (en) Face recognition method, device, equipment and storage medium
CN114119531A (en) Fire detection method and device applied to campus smart platform and computer equipment
CN112115876A (en) Water-soluble method experimental process identification method based on 3D convolutional neural network
CN117876971B (en) Building construction safety monitoring and early warning method based on machine vision

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant