WO2024066044A1 - Dangerous behavior recognition method, system and related equipment based on super-resolution reconstruction

Info

Publication number
WO2024066044A1
Authority
WO
WIPO (PCT)
Prior art keywords
resolution
super
image
training
behavior
Prior art date
Application number
PCT/CN2022/137061
Other languages
English (en)
French (fr)
Inventor
杨之乐
吴承科
刘祥飞
郭媛君
王尧
冯伟
Original Assignee
深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳先进技术研究院
Publication of WO2024066044A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Definitions

  • the present invention relates to the field of image processing technology, and in particular to a dangerous behavior recognition method, system and related equipment based on super-resolution reconstruction.
  • a frame of image captured by a surveillance camera is usually directly analyzed to determine whether there is a dangerous behavior.
  • the problem with the prior art is that the resolution of the surveillance camera is usually low, and the resolution of the corresponding captured image is also low. It is difficult to achieve clear recognition when identifying and analyzing the captured image, which is not conducive to improving the accuracy of dangerous behavior recognition.
  • the main purpose of the present invention is to provide a dangerous behavior identification method, system and related equipment based on super-resolution reconstruction, aiming to solve the problem that the solution in the prior art that directly analyzes a frame of image captured by a surveillance camera to determine whether there is a dangerous behavior is not conducive to improving the accuracy of dangerous behavior identification.
  • the first aspect of the present invention provides a dangerous behavior recognition method based on super-resolution reconstruction, wherein the dangerous behavior recognition method based on super-resolution reconstruction comprises:
  • dangerous behavior identification is performed through the trained behavior recognition model to obtain the target behavior category corresponding to the above-mentioned image to be identified, wherein the above-mentioned target behavior category includes one or more of a plurality of preset behavior categories, and the above-mentioned plurality of preset behavior categories include no danger, protective gear abnormality and climbing.
  • the super-resolution reconstruction of the continuous multiple frames of images to be identified according to the trained super-resolution reconstruction model to obtain continuous multiple frames of super-resolution images includes:
  • the above-mentioned image to be identified is an image obtained by photographing the target area with a camera, and the above-mentioned super-resolution reconstruction model uses an interpolation method to perform super-resolution reconstruction.
  • the super-resolution reconstruction model is pre-trained according to the following steps:
  • the first training data includes a plurality of first training image groups, each of the first training image groups includes a low-resolution training image and a high-resolution training image, one low-resolution training image corresponds to one high-resolution training image, and the resolution of the low-resolution training image is lower than that of the high-resolution training image;
  • the model parameters of the super-resolution reconstruction model are adjusted, and the step of inputting the low-resolution training image in the first training data into the super-resolution reconstruction model is continued until the first preset training condition is met to obtain the trained super-resolution reconstruction model.
  • the high-resolution training image is an image captured by a high-definition camera, the low-resolution training image corresponding to the high-resolution training image is an image captured by a low-definition camera of a shooting area corresponding to the high-resolution training image, and the resolution of the high-definition camera is higher than that of the low-definition camera.
  • alternatively, the high-resolution training image is an image captured by a high-definition camera, the low-resolution training image corresponding to the high-resolution training image is an image obtained by processing the high-resolution training image based on a preset interference method, and the preset interference method includes applying at least one of jitter and blur processing.
  • the dangerous behavior recognition is performed based on the super-resolution image by using a trained behavior recognition model to obtain a target behavior category corresponding to the image to be recognized, including:
  • a regional weight value is set for each image sub-region of each frame of the super-resolution image, wherein the regional weight value of the same image sub-region corresponding to each frame of the super-resolution image in the above-mentioned continuous multiple frames of super-resolution images is the same, and the regional weight value of the above-mentioned image sub-region is determined according to the change amount of the pixel value of the image sub-region in each frame of the super-resolution image, and the regional weight value corresponding to the image sub-region with a large change amount of pixel value is greater than the regional weight value corresponding to the image sub-region with a small change amount of pixel value;
  • feature recognition is performed on the continuous multi-frame super-resolution images according to the regional weight values to obtain feature vectors, and dangerous behavior classification is performed on the feature vectors to obtain the target behavior category.
  • the above behavior recognition model is pre-trained according to the following steps:
  • the model parameters of the above-mentioned behavior recognition model are adjusted, and the above-mentioned step of inputting the above-mentioned continuous multi-frame training super-resolution images in the second training data into the above-mentioned behavior recognition model is continued until the preset second training condition is met to obtain the trained behavior recognition model.
  • the method further includes:
  • a second aspect of the present invention provides a dangerous behavior recognition system based on super-resolution reconstruction, wherein the dangerous behavior recognition system based on super-resolution reconstruction comprises:
  • the module for acquiring images to be recognized is used to acquire multiple consecutive frames of images to be recognized
  • a super-resolution reconstruction module used for performing super-resolution reconstruction on the continuous multiple frames of images to be identified according to the trained super-resolution reconstruction model to obtain continuous multiple frames of super-resolution images, wherein the resolution of the super-resolution images is higher than that of the images to be identified;
  • the dangerous behavior recognition module is used to identify dangerous behaviors based on the super-resolution image through a trained behavior recognition model, and obtain the target behavior category corresponding to the image to be identified, wherein the target behavior category includes one or more of a plurality of preset behavior categories, and the plurality of preset behavior categories include no danger, protective gear abnormality, and climbing.
  • a third aspect of the present invention provides an intelligent terminal, which includes a memory, a processor, and a dangerous behavior identification program based on super-resolution reconstruction stored in the memory and executable on the processor.
  • when the dangerous behavior identification program based on super-resolution reconstruction is executed by the processor, the steps of any one of the dangerous behavior identification methods based on super-resolution reconstruction are implemented.
  • a plurality of continuous frames of images to be identified are obtained; super-resolution reconstruction is performed on the plurality of continuous frames of images to be identified according to a trained super-resolution reconstruction model to obtain a plurality of continuous frames of super-resolution images, wherein the resolution of the super-resolution images is higher than that of the images to be identified; based on the super-resolution images, dangerous behaviors are identified through a trained behavior recognition model to obtain target behavior categories corresponding to the images to be identified, wherein the target behavior categories include one or more of a plurality of preset behavior categories, and the plurality of preset behavior categories include no danger, abnormal protective gear, and climbing.
  • FIG. 1 is a flow chart of a dangerous behavior identification method based on super-resolution reconstruction provided by an embodiment of the present invention
  • FIG. 2 is a schematic diagram of the structure of a dangerous behavior recognition system based on super-resolution reconstruction provided by an embodiment of the present invention
  • FIG. 3 is a block diagram of the internal structure of an intelligent terminal provided by an embodiment of the present invention.
  • the term “if” may be interpreted as “when” or “upon” or “in response to determining” or “in response to being classified into,” depending on the context.
  • the phrase “if it is determined” or “if classified into [described condition or event]” may be interpreted as meaning “upon determination” or “in response to determining” or “upon classification into [described condition or event]” or “in response to being classified into [described condition or event],” depending on the context.
  • in the prior art, a frame of image captured by a surveillance camera is usually analyzed directly to determine whether there is any dangerous behavior.
  • the problem with the prior art is that the resolution of the surveillance camera is usually low, and the resolution of the corresponding captured image is also low. It is difficult to achieve clear recognition when identifying and analyzing the captured image, which is not conducive to improving the accuracy of dangerous behavior recognition.
  • the data quality (especially the resolution) collected by the surveillance camera and other image acquisition equipment installed at the construction site is generally low, making it difficult to accurately identify dangerous behaviors.
  • dangerous behavior can usually only be identified based on a single frame image, while the behavior of the target object (such as the worker who needs to be monitored) is usually continuous, and it is difficult to accurately judge from a single frame image, which is not conducive to improving the accuracy of dangerous behavior identification.
  • the existing technology usually requires monitoring personnel to manually identify monitoring images, which consumes a lot of human resources (for example, it is necessary to arrange multiple groups of monitoring personnel to be on duty 24 hours a day), and the recognition efficiency is not high, which is not conducive to improving the efficiency of dangerous behavior identification.
  • a plurality of continuous frames of images to be identified are obtained; super-resolution reconstruction is performed on the plurality of continuous frames of images to be identified according to a trained super-resolution reconstruction model to obtain a plurality of continuous frames of super-resolution images, wherein the resolution of the super-resolution image is higher than that of the image to be identified; based on the super-resolution image, dangerous behavior identification is performed through a trained behavior recognition model to obtain a target behavior category corresponding to the image to be identified, wherein the target behavior category includes one or more of a plurality of preset behavior categories, and the plurality of preset behavior categories include no danger, protective gear abnormality, and climbing.
  • the scheme of the present invention can combine continuous multi-frame images to identify dangerous behaviors, and can comprehensively consider continuous behaviors to further improve the accuracy of dangerous behavior identification.
  • image super-resolution reconstruction can be automatically performed through a trained super-resolution reconstruction model, and dangerous behavior identification can be automatically performed through a trained behavior recognition model, without the need for manual processing and identification by monitoring personnel, which is conducive to improving the efficiency of dangerous behavior identification.
  • an embodiment of the present invention provides a dangerous behavior recognition method based on super-resolution reconstruction. Specifically, the method includes the following steps:
  • Step S100 obtaining a plurality of consecutive frames of images to be recognized.
  • the above-mentioned images to be identified are images obtained by photographing the target area to be monitored and/or the target object to be monitored through a pre-set surveillance camera, and the above-mentioned continuous multiple frames of images to be identified are obtained by continuously shooting multiple frames with the surveillance camera, so as to determine whether the target object has dangerous behavior based on the continuous images to be identified.
  • the target area is a construction site area, and the target object is a worker at the construction site.
  • the target area and the corresponding target object can also be determined according to actual needs.
  • the target area can also be a railway inspection area, and the target object can also be an inspection personnel, which is not specifically limited here.
  • the number of frames of the acquired image to be recognized can be preset (for example, preset to 8 frames) or adjusted according to actual needs.
  • the number of frames acquired is 8 to 16 frames, which is not specifically limited.
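Grouping a camera stream into runs of consecutive frames can be sketched as follows. This is a minimal illustration, assuming a sliding window over the stream; the function name `frame_windows` and the default window of 8 frames are illustrative choices, and the source does not say whether successive groups overlap.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, TypeVar

Frame = TypeVar("Frame")


def frame_windows(frames: Iterable[Frame], window: int = 8) -> Iterator[List[Frame]]:
    """Yield every run of `window` consecutive frames from a stream.

    Assumption: a sliding window that advances one frame at a time.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    buffer: Deque[Frame] = deque(maxlen=window)  # oldest frame drops out automatically
    for frame in frames:
        buffer.append(frame)
        if len(buffer) == window:
            yield list(buffer)
```

Each yielded list would then be passed on to the super-resolution reconstruction step as one group of images to be identified.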
  • Step S200 performing super-resolution reconstruction on the continuous multiple frames of images to be identified according to the trained super-resolution reconstruction model to obtain continuous multiple frames of super-resolution images, wherein the resolution of the super-resolution images is higher than that of the images to be identified.
  • the trained super-resolution reconstruction model is a pre-trained model for super-resolution reconstruction of images to improve image resolution.
  • the super-resolution reconstruction model is a deep reinforcement learning (DRL) model, but this is not a specific limitation.
  • the super-resolution reconstruction model performs super-resolution reconstruction on each frame of the image to be identified in turn to improve its image resolution, thereby performing better feature recognition.
  • the super-resolution reconstruction of the continuous multiple frames of the image to be identified based on the trained super-resolution reconstruction model to obtain continuous multiple frames of super-resolution images includes:
  • the super-resolution reconstruction model is pre-trained according to the following steps: a low-resolution training image in the first training data is input into the super-resolution reconstruction model, and pixel interpolation is performed according to the super-resolution reconstruction model to generate a high-resolution reconstructed image corresponding to the low-resolution training image, wherein the first training data includes a plurality of first training image groups, each of the first training image groups includes a low-resolution training image and a high-resolution training image, one of the low-resolution training images corresponds to one of the high-resolution training images, and the resolution of the low-resolution training image is lower than that of the high-resolution training image;
  • the model parameters of the super-resolution reconstruction model are adjusted, and the step of inputting the low-resolution training image in the first training data into the super-resolution reconstruction model is continued until the first preset training condition is met to obtain the trained super-resolution reconstruction model.
  • the model parameters of the super-resolution reconstruction model are used to determine whether to interpolate and which interpolation method to use (e.g., average pooling or maximum pooling).
  • the first preset training condition includes that the number of iterations corresponding to the super-resolution reconstruction model is not less than a preset first iteration threshold, or the loss value between the high-resolution training image and the high-resolution reconstructed image is not greater than a preset first loss threshold, and may also include other conditions, which are not specifically limited here.
  • the image contents contained in the high-resolution training image and the low-resolution training image are the same, for example, they are images with different resolutions acquired from the exact same area and object.
  • the high-resolution training image is an image captured by a high-definition camera, the low-resolution training image corresponding to the high-resolution training image is an image captured by a low-definition camera of the shooting area corresponding to the high-resolution training image, and the resolution of the high-definition camera is higher than that of the low-definition camera.
  • alternatively, the high-resolution training image is an image captured by a high-definition camera, the low-resolution training image corresponding to the high-resolution training image is an image obtained by processing the high-resolution training image based on a preset interference method, and the preset interference method includes applying at least one of jitter and blur processing.
  • the training images in the first training data are also obtained by shooting the target area where dangerous behavior identification is required.
  • a high-definition camera is used to collect images of the target area in advance to obtain a high-resolution training image, and the high-resolution training image is then processed with jitter, blurring, and other methods to reduce its resolution and obtain the corresponding low-resolution training image.
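One way to derive a low-resolution training image from a high-resolution one is sketched below. This is an illustration under stated assumptions rather than the patent's procedure: the image is a grayscale grid of floats, jitter is modeled as a random horizontal shift per row, blur is a 3x3 mean filter, and the final block-averaging downsample is an assumed stand-in for the "other methods". All function names and parameters are hypothetical.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def jitter(img: Image, max_shift: int, rng: random.Random) -> Image:
    """Shift every row sideways by a random offset, repeating edge pixels."""
    width = len(img[0])
    out = []
    for row in img:
        shift = rng.randint(-max_shift, max_shift)
        out.append([row[min(width - 1, max(0, x + shift))] for x in range(width)])
    return out


def box_blur(img: Image) -> Image:
    """3x3 mean filter; border pixels average only the neighbours that exist."""
    height, width = len(img), len(img[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def downsample(img: Image, scale: int) -> Image:
    """Average each scale x scale block into a single pixel (assumed step)."""
    height, width = len(img) // scale, len(img[0]) // scale
    return [
        [
            sum(img[y * scale + j][x * scale + i]
                for j in range(scale) for i in range(scale)) / (scale * scale)
            for x in range(width)
        ]
        for y in range(height)
    ]


def make_low_res(high: Image, scale: int = 4, max_shift: int = 1, seed: int = 0) -> Image:
    """Jitter, blur, then shrink a high-resolution image into its low-resolution pair."""
    rng = random.Random(seed)  # seeded so the training pair is reproducible
    return downsample(box_blur(jitter(high, max_shift, rng)), scale)
```

Pairing `high` with `make_low_res(high)` gives one training image group in which both images show exactly the same content.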
  • for the super-resolution reconstruction model (i.e., a deep reinforcement learning model), the current environment perceived by the model is the pixel values of all pixels in the entire image.
  • Each pixel is a reinforcement learning agent, and its behavior is defined as interpolation with neighboring pixels (each pixel has 8 neighboring points around it).
  • This action can be used to undo image interference such as jitter, and whether to interpolate and which interpolation method to use (average pooling, maximum pooling, etc.) are the strategy that the agent needs to learn (i.e., the policy in reinforcement learning).
  • Model parameters can be used to limit whether interpolation is required for each pixel and what interpolation method is used.
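The per-pixel action space described above can be illustrated with a small sketch. It assumes three actions per pixel (keep the value, replace it with the mean of its neighbours, or replace it with their maximum) and takes a hand-written action map as input; the learned policy that would produce such a map is not shown, and the names `KEEP`, `AVG`, `MAX` and `apply_actions` are hypothetical.

```python
from typing import List

Image = List[List[float]]

# Assumed action codes; the source only names average and maximum pooling as examples.
KEEP, AVG, MAX = 0, 1, 2


def neighbours(img: Image, y: int, x: int) -> List[float]:
    """Values of the up-to-8 pixels surrounding (y, x); fewer at image borders."""
    height, width = len(img), len(img[0])
    return [
        img[j][i]
        for j in range(max(0, y - 1), min(height, y + 2))
        for i in range(max(0, x - 1), min(width, x + 2))
        if (j, i) != (y, x)
    ]


def apply_actions(img: Image, actions: List[List[int]]) -> Image:
    """Apply one action per pixel; all pixels read from the original image."""
    out = []
    for y, row in enumerate(img):
        new_row = []
        for x, value in enumerate(row):
            around = neighbours(img, y, x)
            action = actions[y][x]
            if action == AVG and around:
                new_row.append(sum(around) / len(around))
            elif action == MAX and around:
                new_row.append(max(around))
            else:
                new_row.append(value)  # KEEP, or a 1x1 image with no neighbours
        out.append(new_row)
    return out
```

Reading every neighbour from the unmodified input makes the update order-independent, which is a design choice of this sketch rather than something the source specifies.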
  • during super-resolution reconstruction, after all agents in the deep reinforcement learning model have completed their actions, the high-resolution reconstructed image is compared with the original high-resolution training image: the features of the two images are extracted through a convolutional neural network (CNN), the cosine distance between them is calculated, and this distance is used as the loss to update the model parameters through back-propagation.
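The cosine distance used as the loss can be sketched as below, assuming the common definition of one minus the cosine similarity and assuming the CNN features of each image have already been flattened into a plain vector. The feature extractor itself is not shown, and `cosine_distance` is an illustrative name.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cos(angle between a and b); 0 means the features point the same way."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)
```

A reconstructed image whose features are parallel to those of the original gives a loss near 0, while orthogonal features give a loss of 1.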
  • the above-mentioned super-resolution reconstruction model corresponds to a reconstruction scale, which is used to limit the resolution improvement multiple when reconstructing an image.
  • the reconstruction scale can be input by the user in real time or set in advance to a fixed value (for example, 4 times).
  • a fixed reconstruction scale is used regardless of whether the super-resolution reconstruction model is used or trained.
  • the user can determine its value in real time according to actual needs and input it into the model to limit the resolution improvement multiple.
  • multiple reconstruction scales will be pre-set when the above super-resolution reconstruction model is trained, and corresponding low-resolution training images and corresponding high-resolution training images under the reconstruction scale are set for different reconstruction scales.
  • the reconstruction scale input by the user is one of the multiple reconstruction scales that are pre-set, that is, the user selects from the multiple reconstruction scales that are pre-set.
  • weight values may also be set for different regions of the above-mentioned image to be identified, so that the above-mentioned super-resolution reconstruction model pays more attention to the regions that need attention, thereby improving the image reconstruction efficiency of the above-mentioned super-resolution reconstruction model.
  • the weight values of different regions in the above-mentioned image to be identified may be pre-set by the user according to actual needs (for example, the weight of the position of the center of the image is set to be larger, or the weight of the position where the target object often appears is set to be larger, or the weight of the position where the important equipment is located is set to be larger, or the weight of the more dangerous area is set to be larger), or may be adjusted by the user in real time (the user confirms which region needs to be set with a larger weight based on the current image to be identified), or may be determined based on the complexity of the pixel values in the current frame of the image to be identified (the weight value corresponding to the region with large pixel values or large variance corresponding to the pixel values is large), or may be determined based on the degree of change in the pixel values of the same region of the image to be identified in the previous and next frames (or all frames) in the continuous multiple frames of the image to be identified, and the weight value of the region with a large degree of change in
  • Step S300 based on the super-resolution image, dangerous behavior recognition is performed through a trained behavior recognition model to obtain a target behavior category corresponding to the image to be recognized, wherein the target behavior category includes one or more of a plurality of preset behavior categories, and the plurality of preset behavior categories include no danger, protective gear abnormality, and climbing.
  • the trained behavior recognition model is a pre-trained model for extracting features from images (or continuous image frames) and identifying the behavior categories therein, and the target behavior category is the irregular (potentially dangerous) behavior that may exist in the original continuous multiple frames of images to be recognized.
  • the behavior recognition model can recognize dangerous behaviors for a single frame of image, and then combine the dangerous behaviors identified for each frame of the image to be recognized as the target behavior category corresponding to the image to be recognized.
  • the above-mentioned trained behavior recognition model is combined with continuous multi-frame super-resolution images to perform dangerous behavior recognition, so as to better combine the actions of the target object between the previous and next frames to perform dangerous behavior recognition.
  • weight values can also be set for each area of the super-resolution image, following the same approach used to set weight values for different areas of the image to be identified. For example, the weight values may be pre-set by the user according to actual needs (the weight of the center of the image is set larger, or the weight of the position where the target object often appears is set larger, or the weight of the location of important equipment is set larger, or the weight of a more dangerous area is set larger), adjusted by the user in real time (the user confirms which area needs a larger weight based on the current super-resolution image), determined according to the complexity of the pixel values in the current frame of the super-resolution image (an area with large pixel values, or with a large variance of pixel values, has a large weight value), or determined according to the degree of change of the pixel values of the same area between the previous and next frames (or all frames) of the continuous multi-frame super-resolution images, where an area with a larger degree of change in pixel values has a larger weight value. This is not specifically limited here.
  • the dangerous behavior recognition is performed based on the super-resolution image by using the trained behavior recognition model to obtain the target behavior category corresponding to the image to be recognized, including:
  • a regional weight value is set for each image sub-region of each frame of the super-resolution image, wherein the regional weight value of the same image sub-region corresponding to each frame of the super-resolution image in the above-mentioned continuous multiple frames of super-resolution images is the same, and the regional weight value of the above-mentioned image sub-region is determined according to the change amount of the pixel value of the image sub-region in each frame of the super-resolution image, and the regional weight value corresponding to the image sub-region with a large change amount of pixel value is greater than the regional weight value corresponding to the image sub-region with a small change amount of pixel value;
  • feature recognition is performed on the continuous multi-frame super-resolution images according to the regional weight values to obtain feature vectors, and dangerous behavior classification is performed on the feature vectors to obtain the target behavior category.
  • the image size and resolution of each frame of super-resolution image are the same, so every frame can be divided into the same image sub-regions, for example by equally dividing each image into a preset number of regions.
  • the weight value of each image sub-region is determined according to the change amount of the pixel value in the continuous multi-frame image corresponding to the region, wherein the change amount of the above-mentioned pixel value can be the difference, variance or standard deviation of the pixel value in the continuous multi-frame image.
  • the regional weight value corresponding to the image sub-region with a large change amount of pixel value is greater than the regional weight value corresponding to the image sub-region with a small change amount of pixel value.
  • the corresponding weight value can be set to 1, 2, 3, etc. in sequence, or the specific weight value can be fine-tuned or standardized to the interval of 0-1 to improve the calculation efficiency and obtain better results.
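Deriving regional weight values from pixel variation across frames can be sketched as follows. This is a sketch under assumptions: frames are grayscale grids of equal size, each frame is split into an even `rows` x `cols` grid, the change amount of a sub-region is the per-pixel variance over time averaged over the region, and weights are min-max scaled to the interval 0-1. The name `region_weights` and the uniform fallback for a static scene are illustrative choices.

```python
from statistics import pvariance
from typing import List

Image = List[List[float]]


def region_weights(frames: List[Image], rows: int, cols: int) -> List[List[float]]:
    """Weight each sub-region by how much its pixels change across the frames."""
    height, width = len(frames[0]), len(frames[0][0])
    region_h, region_w = height // rows, width // cols  # leftover edge pixels are ignored
    raw = []
    for r in range(rows):
        raw_row = []
        for c in range(cols):
            total = 0.0
            for y in range(r * region_h, (r + 1) * region_h):
                for x in range(c * region_w, (c + 1) * region_w):
                    # variance of this pixel's value over the consecutive frames
                    total += pvariance([frame[y][x] for frame in frames])
            raw_row.append(total / (region_h * region_w))
        raw.append(raw_row)
    low = min(min(row) for row in raw)
    high = max(max(row) for row in raw)
    if high == low:
        # Assumption: a scene with uniform change gets uniform weights.
        return [[1.0] * cols for _ in range(rows)]
    return [[(value - low) / (high - low) for value in row] for row in raw]
```

The same weight grid is shared by every frame in the group, matching the requirement that a given sub-region has the same regional weight value in each frame.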
  • the above behavior recognition model is pre-trained according to the following steps:
  • the model parameters of the above-mentioned behavior recognition model are adjusted, and the above-mentioned step of inputting the above-mentioned continuous multi-frame training super-resolution images in the second training data into the above-mentioned behavior recognition model is continued until the preset second training condition is met to obtain the trained behavior recognition model.
  • the above-mentioned second training data is pre-collected data for training the behavior recognition model, including multiple groups of continuous multi-frame training super-resolution images, and each group of training super-resolution images has pre-labeled training actual dangerous behavior categories, and the corresponding training target behavior categories are the dangerous categories that may be contained in the image predicted by the model after feature recognition and feature classification matching during the model training process.
  • the above-mentioned second preset training conditions include that the number of iterations corresponding to the behavior recognition model is not less than the preset second iteration threshold, or the loss value between the training target behavior category and the training actual dangerous behavior category is not greater than the preset second loss threshold, and may also include other conditions, which are not specifically limited here.
  • multiple consecutive frames (e.g., 8-16 frames)
  • corresponding training super-resolution images can be obtained by super-resolution reconstruction of collected low-resolution images
  • the corresponding training actual dangerous behavior categories are manually labeled, such as no dangerous behavior, or one or more dangerous behaviors such as abnormal protective gear, climbing, smoking, etc.
  • the corresponding training super-resolution images are then input into the behavior recognition model for dangerous behavior recognition.
  • the above-mentioned behavior recognition model is specifically a gated recurrent unit (GRU) neural network model.
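How a GRU folds a sequence of per-frame features into one vector can be sketched with a single-cell forward pass. This is a generic textbook GRU, not the patent's network: the parameter names (`Wz`, `Uz`, `bz`, and so on) are hypothetical, the per-frame feature vectors are assumed to be given, and the update convention `h' = (1 - z) * h + z * candidate` is one of two conventions in common use.

```python
import math
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def affine(W: Matrix, x: Vector, U: Matrix, h: Vector, b: Vector) -> Vector:
    """Compute W @ x + U @ h + b, one output row at a time."""
    return [
        sum(w * xi for w, xi in zip(W[i], x))
        + sum(u * hi for u, hi in zip(U[i], h))
        + b[i]
        for i in range(len(b))
    ]


def gru_step(x: Vector, h: Vector, p: Dict[str, list]) -> Vector:
    """One GRU update: gates decide how much of the old state to keep."""
    z = [sigmoid(v) for v in affine(p["Wz"], x, p["Uz"], h, p["bz"])]  # update gate
    r = [sigmoid(v) for v in affine(p["Wr"], x, p["Ur"], h, p["br"])]  # reset gate
    gated = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(v) for v in affine(p["Wh"], x, p["Uh"], gated, p["bh"])]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def encode_sequence(frame_features: List[Vector], h0: Vector, p: Dict[str, list]) -> Vector:
    """Run the GRU over consecutive frames and return the final hidden state."""
    h = h0
    for x in frame_features:
        h = gru_step(x, h, p)
    return h
```

The final hidden state plays the role of the 1D feature vector that is then classified into behavior categories.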
  • the above-mentioned super-resolution reconstruction model and the above-mentioned behavior recognition model can be trained based on semi-supervised learning during the training process, which is not specifically limited here.
  • the data involved in the model use and training process are similar or corresponding, for example, the training super-resolution image corresponds to the super-resolution image, and the training feature vector has a corresponding feature vector.
  • the names only distinguish whether the data is involved in model training or in model use, and are not a specific limitation.
  • a super-resolution reconstruction model is used to read low-resolution images of the engineering site and reconstruct them into high-resolution training super-resolution images.
  • risky behaviors are easier to identify in the above training super-resolution images.
  • these training super-resolution images are input into the behavior recognition model, and the behavior recognition network processes them into a 1D vector (i.e., the training feature vector).
  • the softmax layer of the model outputs the probability of each risky behavior, the behavior with the highest probability (including the risk-free behavior) is selected for comparison with the true label (i.e., the training actual dangerous behavior category), and the GRU model parameters are updated by cross-entropy back-propagation.
  • in this way, DRL and GRU are combined to accurately identify risks in low-resolution engineering video.
  • the method further includes: matching each behavior in the target behavior category with a preset dangerous behavior level table and obtaining all corresponding matching danger levels, and taking the matching danger level with the highest danger level as the target danger level corresponding to the target behavior category; and issuing a danger warning according to the target danger level.
  • the above-mentioned dangerous behavior level table is a pre-set table for storing different dangerous behaviors and corresponding danger levels.
  • the danger level corresponding to no danger can be pre-stored as the first level
  • the danger level corresponding to smoking can be the second level
  • the danger level corresponding to abnormal protective gear and climbing can be the third level.
  • when multiple dangerous behaviors are identified, for example abnormal protective gear and smoking, the highest matching danger level is used as the target danger level; in this example, the third level.
  • a danger alarm is issued according to the current target danger level.
  • different alarm methods can be set in advance for different danger levels. For example, no alarm is required for the first level, a sound-and-light alarm is issued for the second level, and for the third level a sound-and-light alarm is issued and alarm information is sent to a preset mobile device, so as to improve the safety of the construction site.
  • after a low-resolution image to be identified is acquired, dangerous behavior identification is not performed directly on it. Instead, the image to be identified is super-resolution reconstructed to obtain a super-resolution image with improved resolution, and dangerous behavior identification is then performed on this higher-resolution image, which helps improve the accuracy of dangerous behavior identification.
  • an embodiment of the present invention further provides a dangerous behavior recognition system based on super-resolution reconstruction, and the dangerous behavior recognition system based on super-resolution reconstruction includes:
  • the to-be-recognized image acquisition module 410 is used to acquire a plurality of consecutive frames of to-be-recognized images
  • a super-resolution reconstruction module 420 is used to perform super-resolution reconstruction on the continuous multiple frames of the to-be-recognized images according to the trained super-resolution reconstruction model to obtain continuous multiple frames of super-resolution images, wherein the resolution of the super-resolution images is higher than that of the to-be-recognized images;
  • the dangerous behavior identification module 430 is used to identify dangerous behaviors based on the super-resolution image through a trained behavior recognition model, and obtain the target behavior category corresponding to the image to be identified, wherein the target behavior category includes one or more of a plurality of preset behavior categories, and the plurality of preset behavior categories include no danger, protective gear abnormality, and climbing.
  • the specific functions of the dangerous behavior identification system based on super-resolution reconstruction and its modules can refer to the corresponding description in the dangerous behavior identification method based on super-resolution reconstruction, which will not be repeated here.
  • the way the modules of the dangerous behavior recognition system based on super-resolution reconstruction are divided is not unique and is not specifically limited here.
  • the present invention also provides an intelligent terminal, and its principle block diagram can be shown in Figure 3.
  • the above intelligent terminal includes a processor and a memory.
  • the memory of the intelligent terminal includes a dangerous behavior identification program based on super-resolution reconstruction, and the memory provides an environment for the operation of the dangerous behavior identification program based on super-resolution reconstruction.
  • when the dangerous behavior identification program based on super-resolution reconstruction is executed by the processor, the steps of any of the above-mentioned dangerous behavior identification methods based on super-resolution reconstruction are implemented.
  • the above intelligent terminal may also include other functional modules or units, which are not specifically limited here.
  • FIG. 3 is only a block diagram of a partial structure related to the solution of the present invention, and does not constitute a limitation on the smart terminal to which the solution of the present invention is applied.
  • the smart terminal may include more or fewer components than those shown in the figure, or combine certain components, or have a different arrangement of components.
  • An embodiment of the present invention also provides a computer-readable storage medium, on which is stored a dangerous behavior identification program based on super-resolution reconstruction.
  • when the dangerous behavior identification program based on super-resolution reconstruction is executed by a processor, the steps of any one of the dangerous behavior identification methods based on super-resolution reconstruction provided by an embodiment of the present invention are implemented.
  • the disclosed system/intelligent terminal and method can be implemented in other ways.
  • the system/intelligent terminal embodiments described above are only schematic; for example, the division of the above modules or units is only a division by logical function, and other divisions can be used in an actual implementation: multiple units or components can be combined or integrated into another system, or some features can be omitted or not executed.
  • if the above integrated modules/units are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the above method embodiments can also be completed by a computer program instructing the relevant hardware.
  • the above computer program can be stored in a computer-readable storage medium. When the computer program is executed by the processor, the steps of the above method embodiments can be implemented.
  • the above computer program includes computer program code, and the above computer program code can be in source code form, object code form, executable file or some intermediate form.
  • the above computer-readable medium can include: any entity or device that can carry the above computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium. It should be noted that the content covered by the above computer-readable storage medium can be appropriately expanded or reduced according to the requirements of legislation and patent practice in the jurisdiction.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

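A dangerous behavior recognition method, system, and related device based on super-resolution reconstruction. The method includes: acquiring multiple consecutive frames of images to be recognized (S100); performing super-resolution reconstruction on the consecutive frames of images to be recognized with a trained super-resolution reconstruction model to obtain multiple consecutive frames of super-resolution images, where the resolution of the super-resolution images is higher than that of the images to be recognized (S200); and performing dangerous behavior recognition on the super-resolution images with a trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized, where the target behavior category includes one or more of multiple preset behavior categories, and the preset behavior categories include no danger, abnormal protective gear, and climbing (S300). Compared with the prior art, the method helps improve the accuracy of dangerous behavior recognition.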

Description

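Dangerous behavior recognition method, system, and related device based on super-resolution reconstruction
Technical Field
The present invention relates to the technical field of image processing, and in particular to a dangerous behavior recognition method, system, and related device based on super-resolution reconstruction.
Background Art
With the development of science and technology, and especially of image acquisition devices such as cameras, solutions that use camera images to monitor places or persons (such as processing workers and equipment operators) for dangerous behavior are receiving increasing attention.
In the prior art, a single frame captured by a surveillance camera is usually analyzed directly to determine whether dangerous behavior is present. The problem is that surveillance cameras usually have low resolution, so the captured images also have low resolution; it is difficult to recognize their content clearly, which limits the accuracy of dangerous behavior recognition.
Therefore, the prior art still needs to be improved and developed.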
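Technical Problem
The main purpose of the present invention is to provide a dangerous behavior recognition method, system, and related device based on super-resolution reconstruction, aiming to solve the problem that prior-art solutions, which directly analyze a single frame captured by a surveillance camera to determine whether dangerous behavior is present, limit the accuracy of dangerous behavior recognition.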
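Technical Solution
To achieve the above purpose, a first aspect of the present invention provides a dangerous behavior recognition method based on super-resolution reconstruction, where the method includes:
acquiring multiple consecutive frames of images to be recognized;
performing super-resolution reconstruction on the consecutive frames of images to be recognized with a trained super-resolution reconstruction model to obtain multiple consecutive frames of super-resolution images, where the resolution of the super-resolution images is higher than that of the images to be recognized;
performing dangerous behavior recognition on the super-resolution images with a trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized, where the target behavior category includes one or more of multiple preset behavior categories, and the preset behavior categories include no danger, abnormal protective gear, and climbing.
Optionally, performing super-resolution reconstruction on the consecutive frames of images to be recognized with the trained super-resolution reconstruction model to obtain multiple consecutive frames of super-resolution images includes:
inputting each frame of the images to be recognized into the trained super-resolution reconstruction model in turn, and performing super-resolution reconstruction on each image to be recognized with the model to obtain one corresponding frame of super-resolution image for each;
where the images to be recognized are obtained by photographing a target area with a camera, and the super-resolution reconstruction model performs super-resolution reconstruction by interpolation.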
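Optionally, the super-resolution reconstruction model is trained in advance according to the following steps:
inputting a low-resolution training image from first training data into the super-resolution reconstruction model, and performing pixel interpolation with the model to generate a high-resolution reconstructed image corresponding to the low-resolution training image, where the first training data includes multiple first training image groups, each first training image group includes a low-resolution training image and a high-resolution training image, each low-resolution training image corresponds to one high-resolution training image, and the resolution of the low-resolution training image is lower than that of the high-resolution training image;
adjusting the model parameters of the super-resolution reconstruction model according to the high-resolution training image and the high-resolution reconstructed image that correspond to the low-resolution training image, and continuing to perform the step of inputting a low-resolution training image from the first training data into the super-resolution reconstruction model until a first preset training condition is met, so as to obtain the trained super-resolution reconstruction model.
Optionally, the high-resolution training image is captured by a high-definition camera, and the corresponding low-resolution training image is captured by a low-definition camera photographing the same area as the high-resolution training image, where the resolution of the high-definition camera is higher than that of the low-definition camera.
Optionally, the high-resolution training image is captured by a high-definition camera, and the corresponding low-resolution training image is obtained by processing the high-resolution training image with a preset interference method, where the preset interference method includes at least one of applying jitter and blurring.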
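Optionally, performing dangerous behavior recognition on the super-resolution images with the trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized includes:
inputting the consecutive frames of super-resolution images into the trained behavior recognition model;
in the trained behavior recognition model, setting a region weight value for each image sub-region of each frame of super-resolution image, where the same image sub-region has the same region weight value in every frame of the consecutive super-resolution images, the region weight value of an image sub-region is determined by the amount of change of the pixel values of that sub-region across the frames, and a sub-region with a larger amount of change has a larger region weight value than a sub-region with a smaller amount of change;
in the trained behavior recognition model, performing feature recognition on the consecutive frames of super-resolution images according to the region weight values to obtain a feature vector, and performing dangerous behavior classification on the feature vector to obtain the target behavior category.
Optionally, the behavior recognition model is trained in advance according to the following steps:
inputting consecutive frames of training super-resolution images from second training data into the behavior recognition model, setting, by the behavior recognition model, a training region weight value for each training image sub-region of each frame of training super-resolution image, performing, by the behavior recognition model, feature recognition on the consecutive frames of training super-resolution images according to the training region weight values to obtain a training feature vector, and performing dangerous behavior classification on the training feature vector to obtain the training target behavior category corresponding to the training super-resolution images, where the second training data includes multiple second training image groups, and each second training image group includes a group of consecutive frames of training super-resolution images and the corresponding training actual dangerous behavior category;
adjusting the model parameters of the behavior recognition model according to the training target behavior category corresponding to the consecutive frames of training super-resolution images and the training actual dangerous behavior category corresponding to the consecutive frames of images to be recognized, and continuing to perform the step of inputting consecutive frames of training super-resolution images from the second training data into the behavior recognition model until a second preset training condition is met, so as to obtain the trained behavior recognition model.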
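Optionally, after performing dangerous behavior recognition on the super-resolution images with the trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized, the method further includes:
matching each behavior in the target behavior category against a preset dangerous behavior level table to obtain all matching danger levels, and taking the highest matching danger level as the target danger level corresponding to the target behavior category;
issuing a danger alarm according to the target danger level.
A second aspect of the present invention provides a dangerous behavior recognition system based on super-resolution reconstruction, where the system includes:
an image acquisition module, configured to acquire multiple consecutive frames of images to be recognized;
a super-resolution reconstruction module, configured to perform super-resolution reconstruction on the consecutive frames of images to be recognized with a trained super-resolution reconstruction model to obtain multiple consecutive frames of super-resolution images, where the resolution of the super-resolution images is higher than that of the images to be recognized;
a dangerous behavior recognition module, configured to perform dangerous behavior recognition on the super-resolution images with a trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized, where the target behavior category includes one or more of multiple preset behavior categories, and the preset behavior categories include no danger, abnormal protective gear, and climbing.
A third aspect of the present invention provides an intelligent terminal, which includes a memory, a processor, and a dangerous behavior recognition program based on super-resolution reconstruction that is stored in the memory and can run on the processor; when executed by the processor, the program implements the steps of any one of the above dangerous behavior recognition methods based on super-resolution reconstruction.
As can be seen from the above, in the solution of the present invention, multiple consecutive frames of images to be recognized are acquired; super-resolution reconstruction is performed on them with a trained super-resolution reconstruction model to obtain multiple consecutive frames of super-resolution images, where the resolution of the super-resolution images is higher than that of the images to be recognized; and dangerous behavior recognition is performed on the super-resolution images with a trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized, where the target behavior category includes one or more of multiple preset behavior categories, and the preset behavior categories include no danger, abnormal protective gear, and climbing.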
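Beneficial Effects
Compared with the prior art, in the solution of the present invention, after a low-resolution image to be recognized is acquired, dangerous behavior recognition is not performed on it directly. Instead, super-resolution reconstruction is performed on the image to be recognized to obtain a super-resolution image with improved resolution, and dangerous behavior recognition is then performed on this higher-resolution image, which helps improve the accuracy of dangerous behavior recognition.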
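Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a dangerous behavior recognition method based on super-resolution reconstruction provided by an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a dangerous behavior recognition system based on super-resolution reconstruction provided by an embodiment of the present invention;
FIG. 3 is a principle block diagram of the internal structure of an intelligent terminal provided by an embodiment of the present invention.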
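Embodiments of the Invention
In the following description, specific details such as particular system structures and techniques are set forth for the purpose of illustration rather than limitation, so that the embodiments of the present invention can be thoroughly understood. However, it should be clear to those skilled in the art that the present invention can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary details do not obscure the description of the present invention.
It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, wholes, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components, and/or collections thereof.
It should also be understood that the terms used in this specification are for the purpose of describing particular embodiments only and are not intended to limit the present invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining", or "in response to classifying". Similarly, the phrase "if it is determined" or "if [the described condition or event] is classified" may be interpreted, depending on the context, as "once it is determined", "in response to determining", "once [the described condition or event] is classified", or "in response to classifying [the described condition or event]".
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Many specific details are set forth in the following description to facilitate a full understanding of the present invention, but the present invention can also be implemented in ways other than those described here, and those skilled in the art can make similar generalizations without departing from the spirit of the present invention. Therefore, the present invention is not limited by the specific embodiments disclosed below.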
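With the development of science and technology, and especially of image acquisition devices such as cameras, more and more surveillance cameras are installed in all kinds of areas, and solutions that use camera images to monitor places (such as construction sites) or persons (such as workers, processing workers, and equipment operators) for dangerous behavior are receiving increasing attention.
In the prior art, a single frame captured by a surveillance camera is usually analyzed directly to determine whether dangerous behavior is present. The problem is that surveillance cameras usually have low resolution, so the captured images also have low resolution; it is difficult to recognize their content clearly, which limits the accuracy of dangerous behavior recognition. For example, the data collected by image acquisition devices such as surveillance cameras installed on construction sites is generally of low quality (especially in resolution), which makes accurate dangerous behavior recognition difficult.
In addition, the prior art can usually perform dangerous behavior recognition only on a single frame, whereas the behavior of a target object (such as a worker who needs to be monitored) is usually continuous and is hard to judge accurately from a single frame, which also limits the accuracy of dangerous behavior recognition.
Moreover, the prior art usually requires monitoring personnel to inspect surveillance images manually, which consumes a large amount of human resources (for example, several shifts of monitoring personnel must be arranged for 24-hour duty) and is inefficient, which limits the efficiency of dangerous behavior recognition.
To solve at least one of the above problems, in the solution of the present invention, multiple consecutive frames of images to be recognized are acquired; super-resolution reconstruction is performed on them with a trained super-resolution reconstruction model to obtain multiple consecutive frames of super-resolution images, where the resolution of the super-resolution images is higher than that of the images to be recognized; and dangerous behavior recognition is performed on the super-resolution images with a trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized, where the target behavior category includes one or more of multiple preset behavior categories, and the preset behavior categories include no danger, abnormal protective gear, and climbing.
Compared with the prior art, in the solution of the present invention, after a low-resolution image to be recognized is acquired, dangerous behavior recognition is not performed on it directly. Instead, super-resolution reconstruction is performed on the image to be recognized to obtain a super-resolution image with improved resolution, and dangerous behavior recognition is then performed on this higher-resolution image, which helps improve the accuracy of dangerous behavior recognition.
In addition, the solution of the present invention can combine multiple consecutive frames for dangerous behavior recognition and can therefore take continuous behavior into account, further improving accuracy. Furthermore, image super-resolution reconstruction can be performed automatically by the trained super-resolution reconstruction model and dangerous behavior recognition can be performed automatically by the trained behavior recognition model, without manual processing and recognition by monitoring personnel, which helps improve the efficiency of dangerous behavior recognition.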
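Exemplary Method
As shown in FIG. 1, an embodiment of the present invention provides a dangerous behavior recognition method based on super-resolution reconstruction. Specifically, the method includes the following steps:
Step S100: acquire multiple consecutive frames of images to be recognized.
The images to be recognized are captured by a pre-installed surveillance camera photographing the target area and/or target object that needs to be monitored. The surveillance camera captures multiple consecutive frames to obtain the consecutive frames of images to be recognized, so that whether the target object exhibits dangerous behavior can be determined from the consecutive images.
In this embodiment, the target area is a building construction site, and the target object is a worker on the construction site. In actual use, the target area and the corresponding target object can also be determined according to actual needs; for example, the target area may be a railway inspection area and the target object may be an inspector, which is not specifically limited here.
Specifically, the number of frames of images to be recognized can be preset (for example, to 8 frames) or adjusted according to actual needs. In this embodiment, 8 to 16 frames are collected, which is not specifically limited.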
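A minimal Python sketch of how the consecutive frames of step S100 might be grouped. The function name `frame_windows`, the sliding-window behavior, and the default of 8 frames are illustrative assumptions; the description only states that a preset number of consecutive frames (8 to 16 in this embodiment) is collected.

```python
from collections import deque


def frame_windows(frames, size=8):
    """Yield every run of `size` consecutive frames as a list.

    `frames` can be any iterable of frame objects. The sliding window
    is an assumption made for illustration; the source only requires
    a preset number of consecutive frames.
    """
    window = deque(maxlen=size)  # oldest frame is dropped automatically
    for frame in frames:
        window.append(frame)
        if len(window) == size:
            yield list(window)
```

Each yielded list would then be passed on to the reconstruction step (S200) as one group of images to be recognized.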
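Step S200: perform super-resolution reconstruction on the consecutive frames of images to be recognized with a trained super-resolution reconstruction model to obtain multiple consecutive frames of super-resolution images, where the resolution of the super-resolution images is higher than that of the images to be recognized.
The trained super-resolution reconstruction model is a pre-trained model that performs super-resolution reconstruction on images to increase their resolution. In this embodiment, the super-resolution reconstruction model is a deep reinforcement learning (DRL) model, but this is not a specific limitation.
In this embodiment, the super-resolution reconstruction model performs super-resolution reconstruction on each frame of the images to be recognized in turn to increase its resolution, so that features can be recognized better. In this embodiment, performing super-resolution reconstruction on the consecutive frames of images to be recognized with the trained super-resolution reconstruction model to obtain multiple consecutive frames of super-resolution images includes:
inputting each frame of the images to be recognized into the trained super-resolution reconstruction model in turn, and performing super-resolution reconstruction on each image to be recognized with the model to obtain one corresponding frame of super-resolution image for each; where the images to be recognized are obtained by photographing a target area with a camera, and the super-resolution reconstruction model performs super-resolution reconstruction by interpolation.
Further, the super-resolution reconstruction model is trained in advance according to the following steps: inputting a low-resolution training image from first training data into the super-resolution reconstruction model, and performing pixel interpolation with the model to generate a high-resolution reconstructed image corresponding to the low-resolution training image, where the first training data includes multiple first training image groups, each first training image group includes a low-resolution training image and a high-resolution training image, each low-resolution training image corresponds to one high-resolution training image, and the resolution of the low-resolution training image is lower than that of the high-resolution training image;
adjusting the model parameters of the super-resolution reconstruction model according to the high-resolution training image and the high-resolution reconstructed image that correspond to the low-resolution training image, and continuing to perform the step of inputting a low-resolution training image from the first training data into the super-resolution reconstruction model until a first preset training condition is met, so as to obtain the trained super-resolution reconstruction model.
The model parameters of the super-resolution reconstruction model define whether to interpolate and which interpolation method to use (for example, average pooling or max pooling). The first preset training condition includes that the number of iterations of the super-resolution reconstruction model is not less than a preset first iteration threshold, or that the loss value between the high-resolution training image and the high-resolution reconstructed image is not less than a preset first loss threshold; it may also include other conditions, which are not specifically limited here.
It should be noted that the high-resolution training image and the low-resolution training image contain the same image content; for example, they are images of different resolutions captured of exactly the same area and objects.
In one application scenario, the high-resolution training image is captured by a high-definition camera, and the corresponding low-resolution training image is captured by a low-definition camera photographing the same area as the high-resolution training image, where the resolution of the high-definition camera is higher than that of the low-definition camera.
In another application scenario, the high-resolution training image is captured by a high-definition camera, and the corresponding low-resolution training image is obtained by processing the high-resolution training image with a preset interference method, where the preset interference method includes at least one of applying jitter and blurring.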
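The second scenario, in which the low-resolution training image is derived from the high-resolution one, could be sketched in Python as follows. This is an illustration under stated assumptions, not the applicant's implementation: images are plain 2D lists of grayscale values, jitter is modeled as a random per-row horizontal shift, blur as a 3x3 mean filter, and the final subsampling step and all function names are hypothetical.

```python
import random


def jitter(image, max_shift=1, rng=None):
    """Shift each row horizontally by a random offset (assumed jitter model)."""
    if rng is None:
        rng = random.Random(0)
    width = len(image[0])
    shifted = []
    for row in image:
        s = rng.randint(-max_shift, max_shift)
        # Clamp the source index so border pixels are repeated.
        shifted.append([row[min(width - 1, max(0, x + s))] for x in range(width)])
    return shifted


def box_blur(image):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            vals = [image[j][i]
                    for j in range(max(0, y - 1), min(height, y + 2))
                    for i in range(max(0, x - 1), min(width, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out


def make_low_res(high_res, scale=4, rng=None):
    """Jitter, blur, then keep every `scale`-th pixel (subsampling is assumed)."""
    degraded = box_blur(jitter(high_res, rng=rng))
    return [row[::scale] for row in degraded[::scale]]
```

Each `(make_low_res(hr), hr)` pair would then play the role of one first training image group.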
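In this embodiment, the training images in the first training data are also obtained by photographing the target area in which dangerous behavior recognition is needed. Specifically, a high-definition camera is used in advance to capture the target area and obtain high-resolution training images, and methods such as jitter and blurring are then applied to the high-resolution training images to reduce their resolution and obtain the corresponding low-resolution training images. During training of the super-resolution reconstruction model (that is, a deep reinforcement learning model), the current environment perceived by the model is the pixel values of all pixels in the whole image, and each pixel acts as a reinforcement learning agent. The action available to an agent is defined as interpolating with its neighboring pixels (each pixel has 8 adjacent pixels); this action can be used to undo image interference such as jitter. Whether to interpolate and which interpolation method to use (average pooling, max pooling, etc.) is the policy that the agent needs to learn, and the model parameters can define, for each pixel, whether interpolation is needed and which interpolation method is used. Further, following the definition of super-resolution reconstruction, each time all agents of the deep reinforcement learning model have executed their actions, the resulting high-resolution reconstructed image is compared with the original high-resolution training image: a convolutional neural network (CNN) extracts features from both images, the cosine distance between them is computed, this distance is used as the loss, and the model parameters are updated by back-propagation.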
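The per-pixel actions and the cosine-distance loss can be sketched in Python as follows. The sketch makes several assumptions that the description does not fix: the action codes `KEEP`, `MEAN_POOL`, and `MAX_POOL` are hypothetical, pooling is computed over the up-to-8 neighbours excluding the pixel itself, and the feature vectors are passed in directly instead of being extracted by a CNN. Policy learning itself is not shown.

```python
import math

# Hypothetical action codes for each pixel agent.
KEEP, MEAN_POOL, MAX_POOL = 0, 1, 2


def neighbours(image, y, x):
    """Values of the up-to-8 pixels adjacent to (y, x)."""
    height, width = len(image), len(image[0])
    return [image[j][i]
            for j in range(max(0, y - 1), min(height, y + 2))
            for i in range(max(0, x - 1), min(width, x + 2))
            if (j, i) != (y, x)]


def apply_actions(image, actions):
    """Apply one action per pixel; every agent reads the unmodified input."""
    out = [row[:] for row in image]
    for y, action_row in enumerate(actions):
        for x, action in enumerate(action_row):
            near = neighbours(image, y, x)
            if not near:  # 1x1 image: nothing to interpolate with
                continue
            if action == MEAN_POOL:
                out[y][x] = sum(near) / len(near)
            elif action == MAX_POOL:
                out[y][x] = max(near)
    return out


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)
```

In a training loop, `cosine_distance` would be evaluated on the features of the reconstructed image and of the high-resolution training image, and the result used as the loss.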
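It should be noted that the super-resolution reconstruction model has a reconstruction scale, which defines the factor by which the resolution is increased during image reconstruction. The reconstruction scale can be entered by the user in real time or preset to a fixed value. This embodiment is described with a preset fixed value (for example, 4x); that is, the same fixed reconstruction scale is used both when the super-resolution reconstruction model is used and when it is trained.
When the reconstruction scale is entered by the user, the user can determine its value in real time according to actual needs and enter it into the model to define the resolution increase factor. In that case, multiple reconstruction scales are also preset when the super-resolution reconstruction model is trained, and for each reconstruction scale corresponding low-resolution training images and high-resolution training images at that scale are provided. The reconstruction scale entered by the user is one of the preset reconstruction scales; that is, the user chooses among the preset reconstruction scales.
In one application scenario, weight values can also be set for different regions of the image to be recognized so that the super-resolution reconstruction model pays more attention to the regions that need it, which improves the image reconstruction efficiency of the model. Specifically, the weight values of different regions can be preset by the user according to actual needs (for example, a larger weight for the image center, for positions where the target object often appears, for positions of important equipment, or for more dangerous regions), adjusted by the user in real time (the user decides from the current image to be recognized which region needs a larger weight), determined from the complexity of the pixel values in the current frame (regions with large pixel values or a large pixel-value variance receive larger weight values), or determined from the degree of change of the pixel values in the same region between adjacent frames (or across all frames) of the consecutive images to be recognized, with regions of greater change receiving larger weight values.
Step S300: perform dangerous behavior recognition on the super-resolution images with a trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized, where the target behavior category includes one or more of multiple preset behavior categories, and the preset behavior categories include no danger, abnormal protective gear, and climbing.
The trained behavior recognition model is a pre-trained model that extracts features from an image (or consecutive image frames) and recognizes the behavior categories in it; the target behavior category is the non-standard (potentially dangerous) behavior recognized as possibly present in the original consecutive frames of images to be recognized. In one application scenario, the behavior recognition model can perform dangerous behavior recognition on single frames, and the dangerous behaviors recognized for each frame are then combined as the target behavior category corresponding to the images to be recognized.
In this embodiment, the trained behavior recognition model performs dangerous behavior recognition on the consecutive frames of super-resolution images together, so that the movements of the target object between frames are better taken into account. In addition, in this embodiment weight values can also be set for the regions of the super-resolution images. These weight values can be set in the same ways as for the regions of the images to be recognized: they can be preset by the user according to actual needs (for example, a larger weight for the image center, for positions where the target object often appears, for positions of important equipment, or for more dangerous regions), adjusted by the user in real time (the user decides from the current super-resolution image which region needs a larger weight), determined from the complexity of the pixel values in the current frame of super-resolution image (regions with large pixel values or a large pixel-value variance receive larger weight values), or determined from the degree of change of the pixel values in the same region between adjacent frames (or across all frames) of the consecutive super-resolution images, with regions of greater change receiving larger weight values, which is not specifically limited here.
In this embodiment, performing dangerous behavior recognition on the super-resolution images with the trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized includes:
inputting the consecutive frames of super-resolution images into the trained behavior recognition model;
in the trained behavior recognition model, setting a region weight value for each image sub-region of each frame of super-resolution image, where the same image sub-region has the same region weight value in every frame of the consecutive super-resolution images, the region weight value of an image sub-region is determined by the amount of change of the pixel values of that sub-region across the frames, and a sub-region with a larger amount of change has a larger region weight value than a sub-region with a smaller amount of change;
in the trained behavior recognition model, performing feature recognition on the consecutive frames of super-resolution images according to the region weight values to obtain a feature vector, and performing dangerous behavior classification on the feature vector to obtain the target behavior category.
The frames of super-resolution images have the same image size and resolution, so dividing them into image sub-regions yields the same sub-regions in every frame; for example, dividing each image evenly according to a preset number of regions gives the corresponding image sub-regions. In this embodiment, the weight value of each image sub-region is determined from the amount of change of the pixel values of that region across the consecutive frames, where the amount of change can be the difference, variance, or standard deviation of the pixel values across the frames. For example, for each pixel in an image sub-region, its variance across the consecutive frames is computed, and the sum of the variances of all pixels in the sub-region can be used as the amount of change of the pixel values of that region. When weight values are set, a sub-region with a larger amount of change has a larger region weight value than one with a smaller amount of change. For example, the weight values can be set to 1, 2, 3, and so on in order of the amount of change, or the specific weight values can be fine-tuned or normalized to the interval 0-1 to improve computational efficiency and obtain better results.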
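The variance-based region weighting can be sketched in Python as follows. The per-pixel temporal variance summed over each sub-region follows the example in the description; the even grid split, the min-max normalization to the interval 0-1, the fallback to equal weights when nothing changes, and the name `region_weights` are assumptions made for illustration.

```python
from statistics import pvariance


def region_weights(frames, rows, cols):
    """Return a rows x cols grid of weights in [0, 1].

    `frames` is a list of equally sized 2D grayscale images (lists of
    lists). Pixels left over when the size is not divisible by the grid
    are ignored in this sketch.
    """
    height, width = len(frames[0]), len(frames[0][0])
    rh, cw = height // rows, width // cols
    change = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for y in range(r * rh, (r + 1) * rh):
                for x in range(c * cw, (c + 1) * cw):
                    # Variance of this pixel across the consecutive frames.
                    change[r][c] += pvariance([frame[y][x] for frame in frames])
    flat = [v for row in change for v in row]
    low, span = min(flat), max(flat) - min(flat)
    if span == 0:  # no region changes more than another: equal weights
        return [[1.0] * cols for _ in range(rows)]
    return [[(v - low) / span for v in row] for row in change]
```

The same grid of weights is applied to every frame in the group, matching the requirement that a given sub-region has the same weight value in all frames.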
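In this embodiment, the behavior recognition model is trained in advance according to the following steps:
inputting consecutive frames of training super-resolution images from second training data into the behavior recognition model, setting, by the behavior recognition model, a training region weight value for each training image sub-region of each frame of training super-resolution image, performing, by the behavior recognition model, feature recognition on the consecutive frames of training super-resolution images according to the training region weight values to obtain a training feature vector, and performing dangerous behavior classification on the training feature vector to obtain the training target behavior category corresponding to the training super-resolution images, where the second training data includes multiple second training image groups, and each second training image group includes a group of consecutive frames of training super-resolution images and the corresponding training actual dangerous behavior category;
adjusting the model parameters of the behavior recognition model according to the training target behavior category corresponding to the consecutive frames of training super-resolution images and the training actual dangerous behavior category corresponding to the consecutive frames of images to be recognized, and continuing to perform the step of inputting consecutive frames of training super-resolution images from the second training data into the behavior recognition model until a second preset training condition is met, so as to obtain the trained behavior recognition model.
The second training data is data collected in advance for training the behavior recognition model. It includes multiple groups of consecutive frames of training super-resolution images, each group has a pre-labeled training actual dangerous behavior category, and the corresponding training target behavior category is the danger category that the model predicts the images may contain after feature recognition and feature classification matching during training. The second preset training condition includes that the number of iterations of the behavior recognition model is not less than a preset second iteration threshold, or that the loss value between the training target behavior category and the training actual dangerous behavior category is not less than a preset second loss threshold; it may also include other conditions, which are not specifically limited here.
In this embodiment, multiple consecutive frames (for example, 8-16 frames) of high-resolution images of the target area (with the same resolution as the super-resolution images produced when the model is in use) are collected in advance as training super-resolution images; alternatively, the training super-resolution images can be obtained by super-resolution reconstruction of collected low-resolution images. Their training actual dangerous behavior categories are labeled manually, for example no dangerous behavior, or one or more dangerous behaviors such as abnormal protective gear, climbing, and smoking. The training super-resolution images are then input into the behavior recognition model for dangerous behavior recognition.
In this embodiment, the behavior recognition model is specifically a gated recurrent unit (GRU) neural network model. The super-resolution reconstruction model and the behavior recognition model can be trained by semi-supervised learning, which is not specifically limited here. It should be noted that the data involved in model use and model training is similar or corresponding; for example, the training super-resolution images correspond to the super-resolution images, and the training feature vector corresponds to the feature vector. The names only distinguish whether the data is involved in training or in use, and are not a specific limitation.
Specifically, in this embodiment, the super-resolution reconstruction model reads low-resolution images of the engineering site and reconstructs them into high-resolution training super-resolution images, in which risky behavior is easier to recognize. These training super-resolution images are input into the behavior recognition model, which processes them into a 1D vector (the training feature vector). The softmax layer of the model outputs the probabilities of the different risky behaviors, the behavior with the highest probability (including the no-risk behavior) is selected and compared with the true label (the training actual dangerous behavior category), and the GRU model parameters are updated by cross-entropy back-propagation. In this way, the solution of this embodiment combines DRL and GRU to accurately recognize risks in low-resolution engineering video.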
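The softmax output and cross-entropy loss used at the end of the behavior recognition model can be sketched in Python as follows. The category list, the raw scores (`logits`), and the function names are illustrative assumptions; the GRU itself and the parameter update are not shown. Note that the standard cross-entropy loss is computed from the probability assigned to the true label, while the highest-probability category is what the model reports as its prediction.

```python
import math

# Example categories taken from the description; the order is arbitrary.
CATEGORIES = ["no danger", "abnormal protective gear", "climbing", "smoking"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_index):
    """Negative log-probability assigned to the labeled category."""
    return -math.log(probs[true_index])


def predict(logits):
    """Return the most probable category together with all probabilities."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best], probs
```

For example, `predict([0.0, 2.0, 0.5, -1.0])` selects "abnormal protective gear", and `cross_entropy` applied to the returned probabilities and the index of the true label gives the loss that would be back-propagated.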
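Further, after performing dangerous behavior recognition on the super-resolution images with the trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized, the method further includes: matching each behavior in the target behavior category against a preset dangerous behavior level table to obtain all matching danger levels, and taking the highest matching danger level as the target danger level corresponding to the target behavior category; and issuing a danger alarm according to the target danger level.
The dangerous behavior level table is a preset table that stores different dangerous behaviors and their danger levels. For example, the danger level for no danger can be stored as the first level, for smoking as the second level, and for abnormal protective gear and climbing as the third level. The target danger level is determined from the recognized target behavior category. It should be noted that when multiple dangerous behaviors are recognized, for example abnormal protective gear and smoking, the highest matching danger level is taken as the target danger level; in this case the third level. A danger alarm is then issued according to the current target danger level. Specifically, different alarm methods can be preset for different danger levels: for example, no alarm for the first level, a sound-and-light alarm for the second level, and a sound-and-light alarm plus an alarm message sent to a preset mobile device for the third level, so as to improve the safety of the construction site.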
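The level lookup and alarm selection can be sketched in Python as follows. The levels and alarm methods mirror the example in the description; the dictionary layout, the action strings, and the function names are hypothetical.

```python
# Example dangerous behavior level table from the description.
DANGER_LEVELS = {
    "no danger": 1,
    "smoking": 2,
    "abnormal protective gear": 3,
    "climbing": 3,
}

# Example alarm methods per level; the action names are placeholders.
ALARMS = {
    1: [],
    2: ["sound_and_light"],
    3: ["sound_and_light", "notify_mobile_device"],
}


def target_danger_level(behaviors):
    """Highest danger level among all recognized behaviors."""
    return max(DANGER_LEVELS[b] for b in behaviors)


def alarm_actions(behaviors):
    """Alarm methods to trigger for the recognized behaviors."""
    return ALARMS[target_danger_level(behaviors)]
```

With the example from the description, `target_danger_level(["abnormal protective gear", "smoking"])` yields the third level, so both a sound-and-light alarm and a message to the mobile device would be triggered.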
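As can be seen from the above, in the solution of the present invention, after a low-resolution image to be recognized is acquired, dangerous behavior recognition is not performed on it directly. Instead, super-resolution reconstruction is performed on the image to be recognized to obtain a super-resolution image with improved resolution, and dangerous behavior recognition is then performed on this higher-resolution image, which helps improve the accuracy of dangerous behavior recognition.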
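Exemplary Device
As shown in FIG. 2, corresponding to the above dangerous behavior recognition method based on super-resolution reconstruction, an embodiment of the present invention further provides a dangerous behavior recognition system based on super-resolution reconstruction, which includes:
an image acquisition module 410, configured to acquire multiple consecutive frames of images to be recognized;
a super-resolution reconstruction module 420, configured to perform super-resolution reconstruction on the consecutive frames of images to be recognized with a trained super-resolution reconstruction model to obtain multiple consecutive frames of super-resolution images, where the resolution of the super-resolution images is higher than that of the images to be recognized;
a dangerous behavior recognition module 430, configured to perform dangerous behavior recognition on the super-resolution images with a trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized, where the target behavior category includes one or more of multiple preset behavior categories, and the preset behavior categories include no danger, abnormal protective gear, and climbing.
Specifically, in this embodiment, for the specific functions of the system and its modules, refer to the corresponding description of the method above; details are not repeated here.
It should be noted that the way the modules of the system are divided is not unique and is not a specific limitation here.
Based on the above embodiments, the present invention further provides an intelligent terminal, whose principle block diagram may be as shown in FIG. 3. The intelligent terminal includes a processor and a memory. The memory stores a dangerous behavior recognition program based on super-resolution reconstruction and provides an environment for running it. When executed by the processor, the program implements the steps of any one of the above dangerous behavior recognition methods based on super-resolution reconstruction. It should be noted that the intelligent terminal may also include other functional modules or units, which are not specifically limited here.
Those skilled in the art will understand that the block diagram shown in FIG. 3 shows only the part of the structure relevant to the solution of the present invention and does not limit the intelligent terminal to which the solution is applied. Specifically, the intelligent terminal may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
An embodiment of the present invention further provides a computer-readable storage medium storing a dangerous behavior recognition program based on super-resolution reconstruction; when executed by a processor, the program implements the steps of any one of the dangerous behavior recognition methods based on super-resolution reconstruction provided by the embodiments of the present invention.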
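It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic and should not constitute any limitation on the implementation of the embodiments of the present invention.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the above division into functional units and modules is only an example. In practice, the above functions can be assigned to different functional units and modules as needed; that is, the internal structure of the system can be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit; the integrated unit may be implemented in hardware or as a software functional unit. In addition, the specific names of the functional units and modules serve only to distinguish them from one another and do not limit the protection scope of the present invention. For the specific working process of the units and modules in the system, refer to the corresponding process in the method embodiments above; details are not repeated here.
In the above embodiments, each embodiment is described with its own emphasis; for parts not detailed or recorded in one embodiment, refer to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed system/intelligent terminal and method can be implemented in other ways. For example, the system/intelligent terminal embodiments described above are only illustrative; the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed.
If the integrated modules/units are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes of the above method embodiments can also be completed by a computer program instructing the relevant hardware. The computer program can be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content covered by the computer-readable storage medium may be appropriately expanded or reduced according to the requirements of legislation and patent practice in the jurisdiction.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention.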

Claims (10)

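  1. A dangerous behavior recognition method based on super-resolution reconstruction, characterized in that the method comprises:
    acquiring multiple consecutive frames of images to be recognized;
    performing super-resolution reconstruction on the consecutive frames of images to be recognized with a trained super-resolution reconstruction model to obtain multiple consecutive frames of super-resolution images, wherein the resolution of the super-resolution images is higher than that of the images to be recognized;
    performing dangerous behavior recognition on the super-resolution images with a trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized, wherein the target behavior category includes one or more of multiple preset behavior categories, and the preset behavior categories include no danger, abnormal protective gear, and climbing.
  2. The dangerous behavior recognition method based on super-resolution reconstruction according to claim 1, characterized in that performing super-resolution reconstruction on the consecutive frames of images to be recognized with the trained super-resolution reconstruction model to obtain multiple consecutive frames of super-resolution images comprises:
    inputting each frame of the images to be recognized into the trained super-resolution reconstruction model in turn, and performing super-resolution reconstruction on each image to be recognized with the model to obtain one corresponding frame of super-resolution image for each;
    wherein the images to be recognized are obtained by photographing a target area with a camera, and the super-resolution reconstruction model performs super-resolution reconstruction by interpolation.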
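  3. The dangerous behavior recognition method based on super-resolution reconstruction according to claim 2, characterized in that the super-resolution reconstruction model is trained in advance according to the following steps:
    inputting a low-resolution training image from first training data into the super-resolution reconstruction model, and performing pixel interpolation with the model to generate a high-resolution reconstructed image corresponding to the low-resolution training image, wherein the first training data includes multiple first training image groups, each first training image group includes a low-resolution training image and a high-resolution training image, each low-resolution training image corresponds to one high-resolution training image, and the resolution of the low-resolution training image is lower than that of the high-resolution training image;
    adjusting the model parameters of the super-resolution reconstruction model according to the high-resolution training image and the high-resolution reconstructed image that correspond to the low-resolution training image, and continuing to perform the step of inputting a low-resolution training image from the first training data into the super-resolution reconstruction model until a first preset training condition is met, so as to obtain the trained super-resolution reconstruction model.
  4. The dangerous behavior recognition method based on super-resolution reconstruction according to claim 3, characterized in that the high-resolution training image is captured by a high-definition camera, and the corresponding low-resolution training image is captured by a low-definition camera photographing the same area as the high-resolution training image, wherein the resolution of the high-definition camera is higher than that of the low-definition camera.
  5. The dangerous behavior recognition method based on super-resolution reconstruction according to claim 3, characterized in that the high-resolution training image is captured by a high-definition camera, and the corresponding low-resolution training image is obtained by processing the high-resolution training image with a preset interference method, wherein the preset interference method includes at least one of applying jitter and blurring.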
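  6. The dangerous behavior recognition method based on super-resolution reconstruction according to claim 1, characterized in that performing dangerous behavior recognition on the super-resolution images with the trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized comprises:
    inputting the consecutive frames of super-resolution images into the trained behavior recognition model;
    in the trained behavior recognition model, setting a region weight value for each image sub-region of each frame of super-resolution image, wherein the same image sub-region has the same region weight value in every frame of the consecutive super-resolution images, the region weight value of an image sub-region is determined by the amount of change of the pixel values of that sub-region across the frames, and a sub-region with a larger amount of change has a larger region weight value than a sub-region with a smaller amount of change;
    in the trained behavior recognition model, performing feature recognition on the consecutive frames of super-resolution images according to the region weight values to obtain a feature vector, and performing dangerous behavior classification on the feature vector to obtain the target behavior category.
  7. The dangerous behavior recognition method based on super-resolution reconstruction according to claim 6, characterized in that the behavior recognition model is trained in advance according to the following steps:
    inputting consecutive frames of training super-resolution images from second training data into the behavior recognition model, setting, by the behavior recognition model, a training region weight value for each training image sub-region of each frame of training super-resolution image, performing, by the behavior recognition model, feature recognition on the consecutive frames of training super-resolution images according to the training region weight values to obtain a training feature vector, and performing dangerous behavior classification on the training feature vector to obtain the training target behavior category corresponding to the training super-resolution images, wherein the second training data includes multiple second training image groups, and each second training image group includes a group of consecutive frames of training super-resolution images and the corresponding training actual dangerous behavior category;
    adjusting the model parameters of the behavior recognition model according to the training target behavior category corresponding to the consecutive frames of training super-resolution images and the training actual dangerous behavior category corresponding to the consecutive frames of images to be recognized, and continuing to perform the step of inputting consecutive frames of training super-resolution images from the second training data into the behavior recognition model until a second preset training condition is met, so as to obtain the trained behavior recognition model.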
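  8. The dangerous behavior recognition method based on super-resolution reconstruction according to any one of claims 1-7, characterized in that, after performing dangerous behavior recognition on the super-resolution images with the trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized, the method further comprises:
    matching each behavior in the target behavior category against a preset dangerous behavior level table to obtain all matching danger levels, and taking the highest matching danger level as the target danger level corresponding to the target behavior category;
    issuing a danger alarm according to the target danger level.
  9. A dangerous behavior recognition system based on super-resolution reconstruction, characterized in that the system comprises:
    an image acquisition module, configured to acquire multiple consecutive frames of images to be recognized;
    a super-resolution reconstruction module, configured to perform super-resolution reconstruction on the consecutive frames of images to be recognized with a trained super-resolution reconstruction model to obtain multiple consecutive frames of super-resolution images, wherein the resolution of the super-resolution images is higher than that of the images to be recognized;
    a dangerous behavior recognition module, configured to perform dangerous behavior recognition on the super-resolution images with a trained behavior recognition model to obtain the target behavior category corresponding to the images to be recognized, wherein the target behavior category includes one or more of multiple preset behavior categories, and the preset behavior categories include no danger, abnormal protective gear, and climbing.
  10. An intelligent terminal, characterized in that the intelligent terminal comprises a memory, a processor, and a dangerous behavior recognition program based on super-resolution reconstruction that is stored in the memory and can run on the processor, wherein the program, when executed by the processor, implements the steps of the dangerous behavior recognition method based on super-resolution reconstruction according to any one of claims 1-8.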
PCT/CN2022/137061 2022-09-27 2022-12-06 Dangerous behavior recognition method, system, and related device based on super-resolution reconstruction WO2024066044A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211181324.0 2022-09-27
CN202211181324.0A CN115619638A (zh) 2022-09-27 2022-09-27 Dangerous behavior recognition method, system, and related device based on super-resolution reconstruction

Publications (1)

Publication Number Publication Date
WO2024066044A1 true WO2024066044A1 (zh) 2024-04-04

Family

ID=84861313

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/137061 WO2024066044A1 (zh) 2022-09-27 2022-12-06 Dangerous behavior recognition method, system, and related device based on super-resolution reconstruction

Country Status (2)

Country Link
CN (1) CN115619638A (zh)
WO (1) WO2024066044A1 (zh)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110490802A (zh) * 2019-08-06 2019-11-22 北京观微科技有限公司 一种基于超分辨率的卫星影像飞机目标型号识别方法
CN110858286A (zh) * 2018-08-23 2020-03-03 杭州海康威视数字技术股份有限公司 一种用于目标识别的图像处理方法及装置
CN111784578A (zh) * 2020-06-28 2020-10-16 Oppo广东移动通信有限公司 图像处理、模型训练方法及装置、设备、存储介质
CN112508782A (zh) * 2020-09-10 2021-03-16 浙江大华技术股份有限公司 网络模型的训练方法、人脸图像超分辨率重建方法及设备
CN113361689A (zh) * 2021-06-09 2021-09-07 上海联影智能医疗科技有限公司 超分辨率重建网络模型的训练方法和扫描图像处理方法
JP2022122989A (ja) * 2021-07-28 2022-08-23 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド 画像認識モデルを構築するための方法及び装置、画像認識方法及び装置、電子デバイス、コンピュータ可読記憶媒体、並びにコンピュータプログラム
CN115035599A (zh) * 2022-06-08 2022-09-09 中国兵器工业计算机应用技术研究所 一种融合装备与行为特征的武装人员识别方法和系统

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110858286A (zh) * 2018-08-23 2020-03-03 杭州海康威视数字技术股份有限公司 一种用于目标识别的图像处理方法及装置
CN110490802A (zh) * 2019-08-06 2019-11-22 北京观微科技有限公司 一种基于超分辨率的卫星影像飞机目标型号识别方法
CN111784578A (zh) * 2020-06-28 2020-10-16 Oppo广东移动通信有限公司 图像处理、模型训练方法及装置、设备、存储介质
CN112508782A (zh) * 2020-09-10 2021-03-16 浙江大华技术股份有限公司 网络模型的训练方法、人脸图像超分辨率重建方法及设备
CN113361689A (zh) * 2021-06-09 2021-09-07 上海联影智能医疗科技有限公司 超分辨率重建网络模型的训练方法和扫描图像处理方法
JP2022122989A (ja) * 2021-07-28 2022-08-23 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド 画像認識モデルを構築するための方法及び装置、画像認識方法及び装置、電子デバイス、コンピュータ可読記憶媒体、並びにコンピュータプログラム
CN115035599A (zh) * 2022-06-08 2022-09-09 中国兵器工业计算机应用技术研究所 一种融合装备与行为特征的武装人员识别方法和系统

Also Published As

Publication number Publication date
CN115619638A (zh) 2023-01-17

Similar Documents

Publication Publication Date Title
CN107566781B (zh) 视频监控方法和视频监控设备
CN110245579B (zh) 人流密度预测方法及装置、计算机设备及可读介质
CN109085174A (zh) 显示屏外围电路检测方法、装置、电子设备及存储介质
CN108921840A (zh) 显示屏外围电路检测方法、装置、电子设备及存储介质
CN112487913A (zh) 一种基于神经网络的标注方法、装置及电子设备
CN110826522A (zh) 人体异常行为监控方法、系统、存储介质及监控设备
CN112417955A (zh) 巡检视频流处理方法及装置
CN113869137A (zh) 事件检测方法、装置、终端设备及存储介质
CN115546742A (zh) 一种基于单目热红外摄像头的铁轨异物识别方法及系统
CN116612389B (zh) 一种建筑施工进度管理方法及系统
WO2024066044A1 (zh) 基于超分辨率重建的危险行为识别方法、系统及相关设备
CN110611793B (zh) 基于工业视觉的供应链信息采集与数据分析方法及装置
CN104809438B (zh) 一种检测电子眼的方法和设备
KR20220151130A (ko) 영상 처리 방법, 장치, 전자 기기, 저장 매체 및 컴퓨터 프로그램
CN116363578A (zh) 一种基于视觉的船舶封闭舱室人员监测方法和系统
CN114360064A (zh) 基于深度学习的办公场所人员行为轻量级目标检测方法
CN113971762A (zh) 一种旋转机械作业安全风险智能识别方法及系统
CN114049682A (zh) 一种人体异常行为识别方法、装置、设备及存储介质
CN111753574A (zh) 抛扔区域定位方法、装置、设备及存储介质
CN115409753B (zh) 图像融合方法、装置、电子设备及计算机可读存储介质
CN110956057A (zh) 一种人群态势分析方法、装置及电子设备
Omeiza A step towards exposing bias in trained deep convolutional neural network models
CN114418064B (zh) 一种目标检测方法、终端设备及存储介质
CN112101279B (zh) 目标物异常检测方法、装置、电子设备及存储介质
CN118351487A (zh) 基于ai的员工违章行为辨识方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22960639

Country of ref document: EP

Kind code of ref document: A1