CN112216049B - Construction warning area monitoring and early warning system and method based on image recognition


Publication number
CN112216049B
CN112216049B (application CN202011026889.2A)
Authority
CN
China
Prior art keywords
module
early warning
area
image
images
Prior art date
Legal status
Active
Application number
CN202011026889.2A
Other languages
Chinese (zh)
Other versions
CN112216049A (en)
Inventor
刘伟 (Liu Wei)
李春阳 (Li Chunyang)
李伟 (Li Wei)
陈磊 (Chen Lei)
杨弘卿 (Yang Hongqing)
Current Assignee
Research Institute of Highway Ministry of Transport
Original Assignee
Research Institute of Highway Ministry of Transport
Priority date
Application filed by Research Institute of Highway Ministry of Transport
Priority to CN202011026889.2A
Publication of CN112216049A
Application granted
Publication of CN112216049B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B13/00 Burglar, theft or intruder alarms
    • G08B13/18 Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength
    • G08B13/189 Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems
    • G08B13/194 Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems
    • G08B13/196 Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B13/00 Burglar, theft or intruder alarms
    • G08B13/18 Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength
    • G08B13/189 Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems
    • G08B13/194 Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems
    • G08B13/196 Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
    • G08B13/19639 Details of the system layout
    • G08B13/19645 Multiple cameras, each having view on one of a plurality of scenes, e.g. multiple cameras for multi-room surveillance or for tracking an object by view hand-over

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

The invention provides a construction warning area monitoring and early warning system and method based on image recognition. The system comprises an image input module, an image splicing module, an interactive calibration module, a pedestrian detection module, a visual intrusion module, a feature extraction module and a decision module, and combines visual intrusion detection, target detection and re-identification technologies. Cameras arranged around a construction warning area or dangerous construction equipment acquire surrounding environment information and information on persons entering and leaving in real time; an operator can change the early warning trigger area and operator registration information at any time; whether a personnel intrusion signal exists in the early warning trigger area is updated in real time, and when such a signal exists an alarm is activated to warn unauthorized persons that entry to the area is prohibited, thereby ensuring the safety of the construction warning area.

Description

Construction warning area monitoring and early warning system and method based on image recognition
Technical Field
The invention relates to the technical field of information monitoring and early warning, in particular to a construction warning area monitoring and early warning system and method based on image recognition.
Background
In engineering construction areas, warning zones are required by national and industry standards or guidelines for hoisting operations (including bridge girder erection machines and tower cranes), mechanical operations, hydraulic slip forms, blasting operations, main tower construction, the periphery of bin walls, tensioning operations and the like, as well as below high-altitude operations such as cradle construction and movable formwork construction. A wire netting fence is usually erected to keep unauthorized personnel out of the construction area and prevent unsafe events. However, for a large construction area, directly erecting a fence not only hinders the entry and exit of constructors but can also leave gaps in coverage. Moreover, a large construction area must change as the project progresses, so the wire netting fence approach wastes resources and labor. A safer, more effective and more convenient monitoring and early warning method is therefore needed.
Disclosure of Invention
The invention aims to provide a construction warning area monitoring and early warning system and method based on image recognition, so as to solve the problems in the prior art.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
a construction warning area monitoring and early warning system based on image recognition comprises a plurality of cameras arranged outside a construction warning area, the Internet and terminal equipment, and further comprises an image input module, an image splicing module, an interactive calibration module, a pedestrian detection module, a visual intrusion module, a feature extraction module and a decision module,
the image input module is used for acquiring synchronized video images from all cameras around the highway engineering construction warning area and sending the collected synchronized video images to the image splicing module, and the image splicing module is used for splicing the multiple synchronized video images to obtain a panoramic image of the operation area;
the interactive calibration module is used for interactively calibrating an early warning trigger area on the panoramic image and simultaneously activating the visual intrusion module;
after the visual intrusion module is activated, acquiring a calibrated early warning trigger area, observing video image information in the early warning trigger area in real time, and if intrusion information exists in the early warning trigger area, sending early warning to a pedestrian detection module;
after receiving the early warning sent by the visual intrusion module, the pedestrian detection module detects an intruded pedestrian target in the early warning trigger area and determines basic information of the intruded pedestrian;
the feature extraction module is used for extracting the features of intruding pedestrians in the early warning trigger area and transmitting the extracted feature information to the decision module, and the decision module compares the acquired feature information with the feature information of the operating personnel recorded in the system and judges whether warning information needs to be sent.
Preferably, the image input module comprises an input process in an initialization state and an input process in an operation state;
the input processing of the initialization state means that while the video images of each camera are transmitted directly to the image splicing module, the input module extracts video frames of different camera positions at the same moment according to the camera serial numbers, so that the contents of video frames with adjacent numbers can be spliced;
the input processing in the operating state means that after an early warning trigger area is set, video frames covering the boundary of the early warning trigger area are transmitted to the visual intrusion module in real time; for an area where an intrusion response has been activated, all video frames of the area are directly transmitted into the pedestrian detection module.
Preferably, the image splicing module comprises a video feature extraction submodule, a video feature matching submodule and a matrix regression submodule; the video feature extraction submodule adopts a high-resolution network to extract features of the video images input by two adjacent cameras at the same moment; the video feature matching submodule first performs L2 normalization on the two extracted video image features and then performs feature matching on the normalized features to obtain a similarity score matrix; the matrix regression submodule processes the similarity score matrix with a convolutional neural network to obtain a global homography matrix and, according to the global homography matrix, visually aligns the images through a mapping transformation to complete the splicing of the two images.
Preferably, the interactive calibration module is configured to map the multiple vertex coordinates calibrated by the user into the original video frames according to the homography matrix calculated by the image splicing module, and to use the region enclosed by the vertex connecting lines as the early warning trigger region.
Preferably, the visual intrusion module realizes the visual intrusion detection by calling a vibi function, and specifically includes:
1) a GetImMask module is designed to obtain the early warning trigger area; the area can be set according to actual needs and includes regions formed by horizontal lines, vertical lines, oblique lines, rectangular frames and trapezoids;
2) through the ViBe class and its member functions, resource initialization, dynamic background modeling, background updating, real-time foreground acquisition and other functions are implemented on the video data in the early warning trigger area;
3) filtering a detection frame which is not adjacent to a boundary line or a region of the early warning trigger region through an isoverLapWithBorder module to remove false detection;
4) through the dup _ rect _ eliminate module, duplicate or overlapping detection frames are eliminated when the detection frames are drawn.
Preferably, the feature extraction module trains and generates a convolutional neural network for feature extraction by constructing a twin (Siamese) neural network, specifically comprising triplet data construction, loss design and a person feature extraction network;
the triplet data construction builds a triplet data training set of operator characteristics; each group of triplet data comprises a pair of similar images and a dissimilar image, namely, the acquired images of the same operator at different camera positions and different moments are recorded as samples a_i, and the acquired images of other operators are recorded as samples a_j; each time data are selected to construct triplet data, two images are randomly extracted from a_i and one image from a_j to construct a triplet, and the cos similarity measurement distance is calculated;
the loss design is specifically as follows:
selecting a group of triplet training data, including an anchor picture a and a positive sample picture p extracted from samples a_i and a negative sample picture n extracted from samples a_j, and calculating the triplet loss:

$$L_t = \left(D(a,p) - D(a,n) + \mathrm{margin}\right)_+$$

wherein margin is a boundary hyperparameter, D(a, p) represents the similarity distance between picture a and picture p, and D(a, n) represents the similarity distance between picture a and picture n;
the person feature extraction network submodule adopts a three-branch input structure network to input the sample feature data, unifies the sizes of the input sample feature maps, and performs class division and sample identification on the sample data to obtain person features.
The invention also aims to provide a construction warning area monitoring and early warning method based on image recognition, which specifically comprises the following steps:
S1, deploying a camera set to cover the engineering construction operation and the surrounding warning area; inputting the images collected by the multiple cameras into the image splicing module for splicing to obtain a panoramic image of the operation area;
S2, marking an early warning trigger area on the obtained panoramic image through the interactive calibration module, and simultaneously recording the characteristics and number of the operators allowed to enter the early warning trigger area;
S3, the visual intrusion module monitors the video images in the early warning trigger area in real time, and when an intruder enters the early warning trigger area, an early warning signal is sent out to activate the pedestrian detection module;
S4, after receiving the early warning from the visual intrusion module, the pedestrian detection module detects the pedestrian targets in the early warning trigger area, counts the number of pedestrians, intercepts the areas where the pedestrians are located from the video image, and determines their specific positions according to the warning position information;
and S5, the feature extraction module performs feature extraction on the obtained intruding pedestrian images and measures the distance between the obtained features and the operator features recorded in step S2, thereby determining whether non-operators exist in the early warning trigger area and giving a corresponding warning signal.
Preferably, step S3 specifically includes:
S31, the visual intrusion module acquires a real-time video stream, splits the video data stream to obtain a single-frame image, and acquires the early warning trigger area through the GetImMask module;
S32, performing border-crossing detection and area detection on the obtained early warning trigger area respectively;
the out-of-range detection is to detect whether personnel intrusion signals exist at the upper/lower/left/right sides of the boundary line of the early warning trigger area, and if so, an alarm signal is sent out; if not, returning to the step S31, and acquiring the single-frame image again;
the area detection means detecting whether a personnel intrusion signal exists within the early warning trigger area; if so, an alarm signal is sent out; if not, the process returns to step S31 to acquire a single-frame image again.
Preferably, step S4 further includes: when the number of intruding pedestrians is larger than the number of recorded operators, warning information is sent out directly; steps S4 and S5 may also be initiated on a timed schedule or at specific times.
Preferably, step S5 specifically includes:
S51, constructing a triplet data training set: each group of triplet data comprises a pair of similar images and a dissimilar image; that is, the acquired images of the same operator at different camera positions and different moments are recorded as samples a_i, and the acquired images of other operators as samples a_j; each time data are selected to construct triplet data, two images a and p are randomly extracted from a_i and one image n from a_j to construct a triplet, and the cos similarity measurement distance of each triplet is calculated;
S52, training on the triplet data training set using the triplet loss:
in the training process, the number of training images read in at a time is set to P × K, that is, images of P categories are randomly selected each time, and K images are randomly selected from each category to train the network; the triplet loss of each batch of read-in training images is calculated with the following formula:
$$L_t=\sum_{i=1}^{P}\sum_{a=1}^{K}\left[\max_{p=1,\dots,K}D\left(x_a^i,x_p^i\right)-\min_{\substack{j=1,\dots,P\\n=1,\dots,K\\j\neq i}}D\left(x_a^i,x_n^j\right)+\mathrm{margin}\right]_+$$

wherein $\max_{p}D\left(x_a^i,x_p^i\right)$ refers to the same-class sample with the largest similarity distance and $\min_{j,n}D\left(x_a^i,x_n^j\right)$ refers to the different-class sample with the smallest similarity distance; i and j respectively represent different categories, subscripts a and p represent picture labels within the same category, and subscript n represents a picture label from a different category;
S53, inputting the sample feature data through a three-branch input structure network, aggregating input sample feature maps of different sizes into feature maps of uniform size via ROI Align in a feature aggregation mode, and retaining effective features while compressing the images;
and S54, using a multi-task learning method, performing class division and sample identification on the uniform-size sample feature maps respectively: images of persons in the same area at different camera positions in the same time period are grouped into one class and numbered, the triplet loss is used for modeling, the distance between images of different persons is measured by cos similarity, and sample pairs are finally identified through the measured similarity distance between them.
The invention has the beneficial effects that:
The invention provides a construction warning area monitoring and early warning system and method based on image recognition. The system and method combine visual intrusion detection, target detection and re-identification technologies; cameras are arranged around the construction warning area or dangerous construction equipment to acquire surrounding environment information and information on persons entering and leaving in real time; the operator can change the early warning trigger area and operator registration information at any time; whether a personnel intrusion signal exists in the early warning trigger area is updated in real time, and when such a signal exists an alarm is sent to warn unauthorized persons that entry to the area is prohibited, thereby ensuring the safety of the construction warning area.
In addition, to save resources, generally only the visual intrusion module remains active, and the other modules are activated only when the visual intrusion module sends a signal; however, all modules are activated during important time periods, such as noon, evening or periods when people may appear, and all modules can also be activated at regular intervals to prevent missed alarms by the visual intrusion module.
Drawings
FIG. 1 is a diagram of a construction warning area monitoring and early warning system based on image recognition;
FIG. 2 is a flowchart of the overall algorithm of the visual intrusion module;
FIG. 3 is a functional relationship diagram of the visual intrusion module.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Example 1
This embodiment provides a construction warning area monitoring and early warning system based on image recognition, which comprises a plurality of cameras arranged outside a construction warning area, the Internet and terminal equipment, and further comprises an image input module, an image splicing module, an interactive calibration module, a pedestrian detection module, a visual intrusion module, a feature extraction module and a decision module, as shown in FIG. 1;
the image input module is used for acquiring synchronized video images from all cameras around the highway engineering construction warning area and sending the collected synchronized video images to the image splicing module, and the image splicing module is used for splicing the multiple synchronized video images to obtain a panoramic image of the operation area;
the interactive calibration module is used for interactively calibrating an early warning trigger area on the panoramic image and simultaneously activating the visual intrusion module;
after the visual intrusion module is activated, acquiring a calibrated early warning trigger area, observing video image information in the early warning trigger area in real time, and if intrusion information exists in the early warning trigger area, sending early warning to a pedestrian detection module;
after receiving the early warning sent by the visual intrusion module, the pedestrian detection module detects an intruded pedestrian target in the early warning trigger area and determines basic information of the intruded pedestrian;
the feature extraction module is used for extracting the features of intruding pedestrians in the early warning trigger area and transmitting the extracted feature information to the decision module, and the decision module compares the acquired feature information with the feature information of the operating personnel recorded in the system and judges whether warning information needs to be sent.
Specifically, the input module is mainly divided into two parts: the input processing in the initialization state and the input processing in the operating state.
Initialization state: in the initialization state, human-computer interaction experience is the main consideration, and the panoramic image of the operation area needs to be displayed using image stitching. The input module extracts video frames of different camera positions at the same moment from the input data and numbers the camera positions, so that the contents of video frames with adjacent numbers can be stitched.
Operating state: in the operating state, video frames covering the boundary area are provided to the visual intrusion module in real time, and for areas where an intrusion response has been activated, all video frames are sent to the pedestrian detection module.
The focus of the image stitching task is image registration. The image registration part uses a convolutional neural network composed of a feature extraction module, a feature matching module and a matrix regression module. End-to-end neural network training and optimization overcomes the divergence of optimization targets between the separate stages of traditional methods and makes image registration more robust and stable. The input of the whole network is two images and the output is 8 regression values, from which a homography matrix is obtained. Specifically, the feature extraction module extracts features from the two input images, the feature matching module calculates the similarity relations among the features, and the matrix regression module finally predicts the 8 regression values.
Because the image stitching task needs to retain more spatial detail information to perceive small differences between the two images, the feature extraction module adopts an HRNet high-resolution network to ensure that the feature maps retain sufficient spatial detail.
The feature matching module mainly calculates correlation coefficients between the two sets of features. In this module, the features obtained for both images are first L2-normalized, and feature matching then yields a similarity score matrix.
The matrix regression module uses a convolutional neural network to estimate the homography matrix. A ReLU operation is first applied to the similarity score matrix produced by the feature matching module to remove the negatively correlated part; features are then extracted through several stacked convolution + ReLU + BatchNorm modules, and two fully connected layers finally output the 8 regression values used to generate the homography matrix, yielding the global homography matrix. Finally, according to the obtained global homography matrix, the two images are visually aligned through a mapping transformation and stitched together.
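For illustration, a minimal PyTorch sketch of this three-stage registration network follows; the small convolutional encoder stands in for the HRNet backbone, and all layer sizes, the pooling choice and fixing the ninth homography element to 1 are assumptions rather than the patent's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegistrationNet(nn.Module):
    """Sketch: feature extraction -> L2-normalised feature matching -> matrix regression."""
    def __init__(self):
        super().__init__()
        # Stand-in encoder; the patent uses an HRNet backbone to keep spatial detail.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Convolution + ReLU + BatchNorm blocks, then two FC layers -> 8 regression values.
        self.regressor = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.BatchNorm2d(16),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(), nn.BatchNorm2d(32),
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(32 * 8 * 8, 256), nn.ReLU(), nn.Linear(256, 8),
        )

    def forward(self, img_a, img_b):
        fa, fb = self.encoder(img_a), self.encoder(img_b)
        b, c, h, w = fa.shape
        fa = F.normalize(fa.reshape(b, c, h * w), dim=1)  # L2 normalisation
        fb = F.normalize(fb.reshape(b, c, h * w), dim=1)
        scores = torch.bmm(fa.transpose(1, 2), fb)        # similarity score matrix
        scores = F.relu(scores).unsqueeze(1)              # drop negatively correlated part
        vals = self.regressor(scores)                     # 8 regression values
        ones = torch.ones(b, 1, device=vals.device)       # fix the ninth element to 1
        return torch.cat([vals, ones], dim=1).reshape(b, 3, 3)

H = RegistrationNet()(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
print(H.shape)  # torch.Size([1, 3, 3])
```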
The interactive calibration module mainly provides the user with a panoramic picture of the operation area for interactive labeling; workers can select the early warning area range on a notebook computer, tablet computer or other device. The calibration module maps the 4 vertex coordinates calibrated by the user into the original video frames according to the homography matrix, producing the calibration information for the visual intrusion module. Meanwhile, after the calibration process is finished, the program automatically starts the pedestrian detection and feature extraction modules and records the feature information of the operating personnel on site.
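As a sketch of this mapping step, assuming each camera's homography H maps its frame into panorama coordinates, the user-calibrated vertices can be pushed back into an original frame with OpenCV; the function name and the example homography are illustrative only.

```python
import cv2
import numpy as np

def panorama_region_to_frame(vertices_pano, H):
    """Map warning-region vertices calibrated on the panorama back into one
    camera's original frame via the inverse of that camera's homography H."""
    pts = np.asarray(vertices_pano, dtype=np.float32).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, np.linalg.inv(H)).reshape(-1, 2)

# 4 vertices selected by the worker on the stitched panorama (pixel coordinates).
region = [(120, 80), (420, 90), (400, 300), (110, 290)]
H = np.array([[1.0, 0.02, 30.0],
              [0.0, 1.00, 12.0],
              [0.0, 0.00,  1.0]])  # illustrative homography for one camera
print(panorama_region_to_frame(region, H))
```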
The overall algorithm flow of the visual intrusion module is shown in fig. 2, and the alarm signal refers to a signal for activating a subsequent module, and the specific contents are as follows:
Firstly, the visual intrusion module acquires a real-time video stream, splits the video data stream to obtain a single-frame image, and acquires the early warning trigger area through the GetImMask function;
then, performing border crossing detection and area detection on the obtained early warning trigger area respectively;
the out-of-range detection is to detect whether personnel intrusion signals exist at the upper/lower/left/right sides of the boundary line of the early warning trigger area, and if so, an alarm signal is sent out; if not, the real-time video stream is acquired again and a single-frame image is re-extracted;
the area detection means detecting whether a personnel intrusion signal exists within the early warning trigger area; if so, an alarm signal is sent out; if not, the single-frame image is acquired again, and the steps are repeated.
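A minimal sketch of this loop follows, assuming OpenCV: cv2.createBackgroundSubtractorMOG2 stands in for the patent's ViBe-based model, get_im_mask only loosely mirrors the GetImMask module, and the stream URL and pixel threshold are illustrative.

```python
import cv2
import numpy as np

def get_im_mask(shape, polygon):
    """Rasterise the calibrated early warning trigger region into a binary mask."""
    mask = np.zeros(shape[:2], dtype=np.uint8)
    cv2.fillPoly(mask, [np.asarray(polygon, dtype=np.int32)], 255)
    return mask

cap = cv2.VideoCapture("rtsp://camera-01/stream")             # illustrative stream
bg = cv2.createBackgroundSubtractorMOG2(detectShadows=False)  # stand-in for ViBe
region = [(100, 100), (500, 100), (500, 400), (100, 400)]
mask = None

while True:
    ok, frame = cap.read()              # split the stream into single frames
    if not ok:
        break
    if mask is None:
        mask = get_im_mask(frame.shape, region)
    fg = bg.apply(frame)                # background model -> real-time foreground
    fg = cv2.bitwise_and(fg, mask)      # area detection inside the trigger region
    if cv2.countNonZero(fg) > 500:      # illustrative alarm threshold (pixels)
        print("alarm: possible intrusion in the trigger region")
```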
The relationship of the functions in the visual intrusion detection module is shown in FIG. 3. First, the main program main calls the vibi function to realize visual intrusion detection. There are four main functional blocks in the vibi function: 1) a monitoring area is obtained through the GetImMask module, supporting monitoring areas of various shapes such as horizontal lines, vertical lines, oblique lines, rectangular frames and irregular quadrilaterals; 2) resource initialization, dynamic background modeling, background updating, real-time foreground acquisition and other functions are realized through the ViBe class and its member functions; 3) detection frames not adjacent to the monitoring line or monitoring area are filtered out through the isoverLapWithBorder module to remove false detections; 4) duplicate or overlapping detection frames are eliminated through the dup _ rect _ eliminate module when the detection frames are drawn.
It is worth noting that ViBe is a pixel-level video background modeling and foreground detection algorithm that performs better than comparable known algorithms while occupying little hardware memory. Its main distinction is the update strategy of the background model: the pixel samples to be replaced are selected at random, and neighborhood pixels are randomly selected for updating. When the model of pixel change cannot be determined, this random update strategy can, to a certain extent, simulate the uncertainty of pixel change.
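The update strategy just described can be sketched compactly; the following grayscale-only NumPy class is illustrative (parameter defaults follow the original ViBe paper) and is not the patent's vibi implementation.

```python
import numpy as np

class ViBe:
    """Sketch of ViBe: per-pixel sample models with random, conservative updates."""
    def __init__(self, first_frame, n_samples=20, radius=20, min_matches=2, subsample=16):
        self.n, self.r, self.k, self.phi = n_samples, radius, min_matches, subsample
        h, w = first_frame.shape
        noise = np.random.randint(-10, 11, (self.n, h, w))
        self.samples = np.clip(first_frame.astype(int) + noise, 0, 255)

    def apply(self, frame):
        f = frame.astype(int)
        matches = (np.abs(self.samples - f) < self.r).sum(axis=0)
        fg = matches < self.k                 # too few matching samples -> foreground
        # Randomly subsampled, conservative update of background pixels only.
        update = (~fg) & (np.random.randint(0, self.phi, f.shape) == 0)
        ys, xs = np.nonzero(update)
        self.samples[np.random.randint(0, self.n, ys.shape), ys, xs] = f[ys, xs]
        # Spatial propagation: also refresh a random sample of a random neighbour.
        h, w = f.shape
        ny = np.clip(ys + np.random.randint(-1, 2, ys.shape), 0, h - 1)
        nx = np.clip(xs + np.random.randint(-1, 2, xs.shape), 0, w - 1)
        self.samples[np.random.randint(0, self.n, ys.shape), ny, nx] = f[ys, xs]
        return (fg * 255).astype(np.uint8)

frames = [np.random.randint(0, 256, (120, 160), dtype=np.uint8) for _ in range(5)]
vibe = ViBe(frames[0])
for f in frames[1:]:
    foreground = vibe.apply(f)
```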
The pedestrian detection module in this embodiment uses the classic SSD (Single Shot MultiBox Detector) network to quickly locate the positions and number of pedestrians in the video frames captured by each camera position. SSD achieves real-time detection speed without significantly sacrificing detection accuracy; its three major characteristics are multi-scale feature maps, anchor boxes with various aspect ratios, and data enhancement strategies. It effectively combines ideas from Faster R-CNN, YOLO and multi-scale convolutional features, reaching detection accuracy comparable to the most advanced two-stage detection methods of its time while meeting real-time requirements.
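An inference sketch using torchvision's COCO-pretrained SSD300-VGG16 as a stand-in for the patent's detector; the weight name, the score threshold and the COCO "person" label index 1 are assumptions of that checkpoint, not details from the patent.

```python
import torch
from torchvision.models.detection import ssd300_vgg16

model = ssd300_vgg16(weights="DEFAULT").eval()  # COCO-pretrained SSD

def detect_pedestrians(frame, score_thresh=0.5):
    """frame: float image tensor in [0, 1], shape (3, H, W)."""
    with torch.no_grad():
        out = model([frame])[0]
    keep = (out["labels"] == 1) & (out["scores"] > score_thresh)  # label 1 = person
    boxes = out["boxes"][keep]
    return boxes, len(boxes)  # pedestrian boxes and headcount

boxes, count = detect_pedestrians(torch.rand(3, 480, 640))
print(count, boxes.shape)
```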
The feature extraction module in this embodiment mainly trains and generates a convolutional neural network for feature extraction by constructing a triplet-loss-based twin (Siamese) neural network, and mainly includes triplet data construction, loss design and a person feature extraction network. The main purpose of the triplet data construction is to provide high-quality triplet data as training data for the subsequent high-resolution feature learning network. Each group of triplet data must contain three pieces of image data during training, of which one pair is "similar" and one image is "dissimilar". Specifically, the collected images of the same operator at different camera positions and different times are recorded as samples a_i, and the collected images of other operators as samples a_j; each time data are selected to construct triplet data, two images are randomly extracted from a_i and one image from a_j (i not equal to j) to construct a triplet, and distance is measured through cos similarity.
The triplet loss is a widely applied metric learning loss; compared with other losses (classification loss, contrastive loss), it offers end-to-end training, clustering properties and strong feature embedding. Each group of triplet-loss training data requires three input pictures. An input Triplet includes a pair of positive samples and a pair of negative samples; the three pictures are named the fixed picture (Anchor) a, the positive sample picture (Positive) p and the negative sample picture (Negative) n. Picture a and picture p form a positive pair, while picture a and picture n form a negative pair. The triplet loss is expressed as:
$$L_t = \left(D(a,p) - D(a,n) + \mathrm{margin}\right)_+$$
where margin is a boundary hyperparameter, D(a, p) represents the distance between picture a and picture p, and D(a, n) represents the distance between picture a and picture n.
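A direct transcription of this loss, assuming D is the cosine distance implied by the patent's "cos similarity measurement distance"; the margin value is illustrative.

```python
import torch
import torch.nn.functional as F

def triplet_loss(a, p, n, margin=0.3):
    """L_t = (D(a, p) - D(a, n) + margin)_+ with D = 1 - cosine similarity."""
    d_ap = 1.0 - F.cosine_similarity(a, p, dim=-1)
    d_an = 1.0 - F.cosine_similarity(a, n, dim=-1)
    return F.relu(d_ap - d_an + margin).mean()

anchor, positive, negative = (torch.randn(4, 128) for _ in range(3))
print(triplet_loss(anchor, positive, negative))  # 128-d embeddings are illustrative
```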
However, in the training process of the triplet loss network, a large number of negative sample pairs are generated combinatorially, so that the numbers of positive and negative sample pairs become unbalanced, training stalls and the convergence result is poor; the design of the training strategy for personnel images therefore directly influences the performance of deep network learning. Thus, during training, the Batch size (number of images read in at a time) is set to P × K, i.e., P classes of images are randomly selected each time, and K images are randomly selected from each class for training the network. The triplet loss within each Batch is calculated using the following formula:
$$L_t=\sum_{i=1}^{P}\sum_{a=1}^{K}\left[\max_{p=1,\dots,K}D\left(x_a^i,x_p^i\right)-\min_{\substack{j=1,\dots,P\\n=1,\dots,K\\j\neq i}}D\left(x_a^i,x_n^j\right)+\mathrm{margin}\right]_+$$

wherein $\max_{p}D\left(x_a^i,x_p^i\right)$ refers to the same-class sample with the largest similarity distance and $\min_{j,n}D\left(x_a^i,x_n^j\right)$ refers to the different-class sample with the smallest similarity distance; i and j respectively represent different categories, subscripts a and p represent picture labels within the same category, and subscript n represents a picture label from a different category.
Through this training mode, the least similar positive sample pair and the most similar, least distinguishable negative sample pair in each Batch are selected each time to calculate the loss, which reduces the training data, alleviates the problem of training sample imbalance, and makes the feature representations learned by the network stronger.
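The batch-hard selection just described can be sketched as follows, again using cosine distance; the margin and the P = 4, K = 3 batch layout are illustrative.

```python
import torch
import torch.nn.functional as F

def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """For each anchor in a P*K batch, pick the least similar positive and the
    most similar negative, then apply the hinge as in the formula above."""
    e = F.normalize(embeddings, dim=1)
    dist = 1.0 - e @ e.t()                                # pairwise cosine distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    hardest_pos = dist.masked_fill(~same, float("-inf")).max(1).values
    hardest_neg = dist.masked_fill(same, float("inf")).min(1).values
    return F.relu(hardest_pos - hardest_neg + margin).mean()

labels = torch.arange(4).repeat_interleave(3)   # P=4 identities, K=3 images each
print(batch_hard_triplet_loss(torch.randn(12, 128), labels))
```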
This part targets the characteristics of the operation scene and, combined with the feature representation learning task, designs a reasonable deep network structure to extract features with strong representation capability and strong robustness.
The feature extraction network comprises three parts: feature representation, feature aggregation and multi-task construction. For feature representation: according to the characteristics of the triplet loss, the network in this embodiment is designed as a three-branch input structure corresponding to a set of positive sample pairs (X_i, X_j) and a set of negative sample pairs (X_j, X_l); a backbone network with a shared set of parameters simultaneously produces the feature maps (Feature Map) of the operating personnel.
Feature aggregation: considering the speed and computation limits of early warning and the feature-dimension requirements of computing measurement distances, the number of feature map channels extracted by the backbone network should not be too high; for example, VGG produces 512-dimensional feature maps, ResNet 1024-dimensional and Inception 1024-dimensional, which affect image retrieval speed and the usability of computed measurement distances, so a parameter-shared 1 × 1 convolution is added after the backbone network to compress the channel count of the three image feature maps. Meanwhile, because the input image sizes are not consistent, the feature maps of the three images obtained through the backbone network are only consistent along the channel dimension, while the feature map sizes must be kept consistent for distance computation and classification feature extraction; ROI Align is therefore used to unify the sizes. ROI Align obtains the image value at pixel points with floating-point coordinates by bilinear interpolation, turning the whole feature aggregation process into a continuous operation. Feature maps of different sizes are aggregated into feature maps of the same size through the ROI Align operation, and effective features are retained while the feature map size is compressed.
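A sketch of this aggregation step with torchvision's roi_align, where one whole-map box pools each feature map to a uniform grid; the channel count, map sizes and 7 × 7 output are illustrative.

```python
import torch
from torchvision.ops import roi_align

# Three feature maps of different spatial sizes, already compressed to 64
# channels (standing in for the shared 1x1 convolution after the backbone).
maps = [torch.randn(1, 64, h, w) for h, w in [(48, 24), (80, 40), (56, 60)]]

pooled = []
for fm in maps:
    _, _, h, w = fm.shape
    box = torch.tensor([[0.0, 0.0, 0.0, float(w), float(h)]])  # (batch_idx, x1, y1, x2, y2)
    # Bilinear sampling gathers the whole map into a uniform 7x7 grid.
    pooled.append(roi_align(fm, box, output_size=(7, 7), aligned=True))

features = torch.cat(pooled)
print(features.shape)  # torch.Size([3, 64, 7, 7]): uniform-size feature maps
```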
Multi-task construction: Multi-Task Learning is a derivative transfer learning method in which a deep network learns several related tasks together; through a shallow shared representation, the learning processes share and complement domain-related information, promoting each other and improving generalization. The feature expression learning network simultaneously adopts a classification task and a sample-pair identification task; the classification task serves as an auxiliary task that benefits feature learning and network convergence. In the classification task, all sample data are coarsely classified: images of the same person are assigned one class and numbered in the range 1-N (N being the number of classes). The feature map extracted by ROI Align for each image is activated by a ReLU function and connected to a Fully Connected Layer; after the fully connected layer the feature map is stretched into a one-dimensional feature vector, activated by ReLU, input to the classification layer, and classified with softmax logistic regression (softmax regression). In the sample-pair identification task, the triplet loss guides network learning and is the core idea of the feature representation modeling: the distance between images of different persons is first measured by cos similarity in the feature space, and sample pairs are finally identified through the measured distance between them.
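A minimal sketch of this shared two-task head, assuming 64 × 7 × 7 pooled maps, a 128-d embedding and 50 identities; softmax is applied implicitly by CrossEntropyLoss during training.

```python
import torch
import torch.nn as nn

class MultiTaskHead(nn.Module):
    """ReLU -> flatten -> FC gives the embedding for the triplet loss; a second
    FC branch classifies identities as the auxiliary task."""
    def __init__(self, in_dim=64 * 7 * 7, emb_dim=128, num_ids=50):
        super().__init__()
        self.fc = nn.Sequential(nn.ReLU(), nn.Flatten(), nn.Linear(in_dim, emb_dim))
        self.classifier = nn.Linear(emb_dim, num_ids)  # auxiliary ID classification

    def forward(self, feat_map):
        emb = self.fc(feat_map)                    # embedding for the triplet loss
        logits = self.classifier(torch.relu(emb))  # identity logits for softmax
        return emb, logits

emb, logits = MultiTaskHead()(torch.randn(12, 64, 7, 7))
print(emb.shape, logits.shape)  # torch.Size([12, 128]) torch.Size([12, 50])
```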
The decision module in this embodiment mainly decides whether to send out warning information or to register the information of a newly appearing operator. In addition, the decision module activates all modules during key time intervals (noon, evening, or periods when people may appear; in general only the visual intrusion module stays active) and also activates all modules at regular intervals to prevent missed reports by the visual intrusion module.
Specifically, the decision module calculates the cos similarity distances between the obtained features of persons in the area and all stored personnel features; when the similarity distance is smaller than a preset threshold no warning is issued, and when it exceeds the threshold an early warning is sent to remind the person to leave.
Meanwhile, when the decision module finds that the number of detected personnel in the monitoring area is larger than the number of recorded personnel, early warning can be directly triggered.
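Both decision rules can be sketched in a few lines; the distance threshold of 0.4 and the 128-d features are illustrative assumptions.

```python
import numpy as np

def decide(detected_feats, registered_feats, dist_thresh=0.4):
    """Alarm when more people are detected than registered, or when a detected
    person's smallest cosine distance to all registered workers exceeds the threshold."""
    if len(detected_feats) > len(registered_feats):
        return "alarm: headcount exceeds registered operators"
    reg = registered_feats / np.linalg.norm(registered_feats, axis=1, keepdims=True)
    for f in detected_feats:
        f = f / np.linalg.norm(f)
        if np.min(1.0 - reg @ f) > dist_thresh:   # distance to closest registered worker
            return "alarm: unregistered person in the trigger region"
    return "ok"

print(decide(np.random.rand(2, 128), np.random.rand(3, 128)))
```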
Example 2
This embodiment provides a highway engineering construction warning area monitoring and early warning method based on image recognition, which specifically comprises the following steps:
S1, deploying a camera set to cover the engineering construction operation and the surrounding warning area; inputting the images collected by the multiple cameras into the image splicing module for splicing to obtain a panoramic image of the operation area;
S2, marking an early warning trigger area on the obtained panoramic image through the interactive calibration module, and simultaneously recording the characteristics and number of the operators allowed to enter the early warning trigger area;
S3, the visual intrusion module monitors the video images in the early warning trigger area in real time, and when an intruder enters the early warning trigger area, an early warning signal is sent out to activate the pedestrian detection module;
S4, after receiving the early warning from the visual intrusion module, the pedestrian detection module detects the pedestrian targets in the early warning trigger area, counts the number of pedestrians, intercepts the areas where the pedestrians are located from the video image, and determines their specific positions according to the warning position information;
and S5, the feature extraction module performs feature extraction on the obtained intruding pedestrian images and measures the distance between the obtained features and the operator features recorded in step S2, thereby determining whether non-operators exist in the early warning trigger area and giving a corresponding warning signal.
Step S3 specifically includes:
S31, the visual intrusion module acquires a real-time video stream, splits the video data stream to obtain a single-frame image, and acquires the early warning trigger area through the GetImMask module;
S32, performing border-crossing detection and area detection on the obtained early warning trigger area respectively;
the out-of-range detection is to detect whether personnel intrusion signals exist at the upper/lower/left/right sides of the boundary line of the early warning trigger area, and if so, an alarm signal is sent out; if not, returning to the step S31, and acquiring the single-frame image again;
the area detection means detecting whether a personnel intrusion signal exists within the early warning trigger area; if so, an alarm signal is sent out; if not, the process returns to step S31 to acquire a single-frame image again.
Step S4 further includes: when the number of intruding pedestrians is larger than the number of recorded operators, warning information is sent out directly; steps S4 and S5 may also be initiated on a timed schedule or at specific times.
Step S5 specifically includes:
S51, constructing a triplet data training set: each group of triplet data comprises a pair of similar images and a dissimilar image; that is, the acquired images of the same operator at different camera positions and different moments are recorded as samples a_i, and the acquired images of another operator as samples a_j; each time data are selected to construct triplet data, two images are randomly extracted from a_i and one image from a_j to construct a triplet, and the cos similarity measurement distance is calculated;
S52, training on the triplet data training set using the triplet loss:
during training, the trained Batch size (the number of images read in at a time) is set to be P × K, that is, P classes of images are randomly selected at a time, and K images are randomly selected for training the network for each class. The triple penalty within each Batch size is calculated using the following formula:
$$L_t=\sum_{i=1}^{P}\sum_{a=1}^{K}\left[\max_{p=1,\dots,K}D\left(x_a^i,x_p^i\right)-\min_{\substack{j=1,\dots,P\\n=1,\dots,K\\j\neq i}}D\left(x_a^i,x_n^j\right)+\mathrm{margin}\right]_+$$
S53, inputting the sample feature data through a three-branch input structure network, aggregating input sample feature maps of different sizes into feature maps of uniform size via ROI Align in a feature aggregation mode, and retaining effective features while compressing the images;
and S54, using a multi-task learning method, performing class division and sample identification on the uniform-size sample feature maps respectively: images of persons in the same area at different camera positions in the same time period are grouped into one class and numbered, the triplet loss is used for modeling, the distance between images of different persons is measured by cos similarity, and sample pairs are finally identified through the measured distance between them.
By adopting the technical scheme disclosed by the invention, the following beneficial effects are obtained:
The invention provides a construction warning area monitoring and early warning system and method based on image recognition. The system and method combine visual intrusion detection, target detection and re-identification technologies; cameras are arranged around the construction warning area or dangerous construction equipment to acquire surrounding environment information and information on persons entering and leaving in real time; the operator can change the early warning trigger area and operator registration information at any time; whether a personnel intrusion signal exists in the early warning trigger area is updated in real time, and when such a signal exists an alarm is sent to warn unauthorized persons that entry to the area is prohibited, thereby ensuring the safety of the construction warning area.
In addition, to save resources, generally only the visual intrusion module remains active, and the other modules are activated only when the visual intrusion module sends a signal; however, all modules are activated during important time periods, such as noon, evening or periods when people may appear, and all modules can also be activated at regular intervals to prevent missed alarms by the visual intrusion module.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and improvements can be made without departing from the principle of the present invention, and such modifications and improvements should also be considered within the scope of the present invention.

Claims (5)

1. A construction warning area monitoring and early warning system based on image recognition, comprising a plurality of cameras arranged outside a construction warning area, the Internet and terminal equipment, characterized by further comprising an image input module, an image splicing module, an interactive calibration module, a pedestrian detection module, a visual intrusion module, a feature extraction module and a decision module,
the image input module is used for acquiring synchronous video images of all cameras around a highway engineering construction warning area and sending the collected synchronous video images to the image splicing module, and the image splicing module is used for splicing a plurality of synchronous video images to obtain a panoramic image in an operation area;
the interactive calibration module is used for interactively calibrating an early warning trigger area on the panoramic image and simultaneously activating the visual intrusion module;
after the visual intrusion module is activated, acquiring a calibrated early warning trigger area, observing video image information in the early warning trigger area in real time, and if intrusion information exists in the early warning trigger area, sending early warning to a pedestrian detection module;
the pedestrian detection module adopts an SSD detection network; after receiving the early warning sent by the visual intrusion module, it detects intruding pedestrian targets in the early warning trigger area, rapidly locates the positions and number of pedestrians in the video frames captured by each camera position, and determines the intruding pedestrians' position information;
the feature extraction module is used for extracting features of intruding pedestrians in the early warning trigger area and transmitting the extracted feature information to the decision module, and the decision module compares the acquired feature information with the feature information of operating personnel recorded in the system and judges whether warning information needs to be sent;
the decision module mainly decides whether to send out warning information or to register newly appearing operator information; in addition, the decision module activates all modules during key time intervals, including noon, evening or periods when people are likely to appear, and can also activate all modules at regular intervals to prevent missed reports by the visual intrusion module;
the specific steps of judging whether warning information needs to be sent are as follows: the decision module calculates the cos similarity distances between the obtained features of persons in the area and all stored personnel features; when the similarity distance is smaller than a preset threshold no warning is issued, and when it exceeds the threshold an early warning is sent to remind the person to leave; meanwhile, when the decision module finds that the number of detected persons in the monitoring area is larger than the number of recorded persons, the early warning is triggered directly;
the operation process of the visual intrusion module is specifically as follows:
firstly, the vision invasion module acquires a real-time video stream, splits the video data stream to obtain a single-frame image, and acquires an early warning trigger area through a GetImMask function;
then, performing border crossing detection and area detection on the obtained early warning trigger area respectively;
the out-of-range detection is to detect whether personnel intrusion signals exist at the upper/lower/left/right sides of the boundary line of the early warning trigger area, and if so, an alarm signal is sent out; if not, the real-time video stream is obtained again, and the single-frame image is obtained again;
the area detection means detecting whether a personnel intrusion signal exists within the early warning trigger area; if so, an alarm signal is sent out; if not, the single-frame image is acquired again, and the steps are repeated;
the visual intrusion module realizes visual intrusion detection by calling a vibi function, and specifically comprises the following steps:
1) acquiring an early warning trigger area through a designed GetImMask module, wherein the early warning trigger area includes, but is not limited to, monitoring areas formed by horizontal lines, vertical lines, oblique lines, rectangular frames and trapezoids;
2) realizing resource initialization, dynamic background modeling, background updating and real-time foreground acquisition on the video data in the early warning trigger area through the ViBe class and its member functions;
3) filtering a detection frame which is not adjacent to a boundary line or a region of the early warning trigger region through an isoverLapWithBorder module to remove false detection;
4) eliminating the detection frames which are repeated or overlapped when the detection frames are drawn through a dup _ rect _ eliminate module;
the image input module comprises input processing in an initialization state and input processing in an operation state;
the input processing of the initialization state means that while the video images of each camera are transmitted directly to the image splicing module, the input module extracts video frames of different camera positions at the same moment according to the camera serial numbers, so that the contents of video frames with adjacent numbers can be spliced;
the input processing in the operating state means that after an early warning trigger area is set, video frames covering the boundary of the early warning trigger area are transmitted to the visual intrusion module in real time; for the area with activated intrusion response, directly transmitting all video frames of the area into the pedestrian detection module;
the image splicing module comprises a video feature extraction submodule, a video feature matching submodule and a matrix regression submodule; the video feature extraction submodule adopts a high-resolution network to extract features of the video images input by two adjacent cameras at the same moment; the video feature matching submodule first performs L2 normalization on the two extracted video image features and then performs feature matching on the normalized features to obtain a similarity score matrix; the matrix regression submodule processes the similarity score matrix with a convolutional neural network to obtain a global homography matrix and, according to the global homography matrix, visually aligns the images through a mapping transformation to complete the splicing of the two images;
the interactive calibration module is used for mapping the 4 vertex coordinates calibrated by the user into the original video frames through the homography matrix calculated by the image splicing module, taking the area enclosed by the 4 vertex connecting lines as the early warning trigger area; meanwhile, after the calibration process is finished, the program automatically starts the pedestrian detection and feature extraction modules to record the feature information of operators on site;
the feature extraction module is used for training and generating a convolutional neural network for feature extraction by constructing a twin neural network, and specifically comprises triplet data construction, loss design and a person feature extraction network;
the triplet data construction builds a triplet data training set of operator characteristics; each group of triplet data comprises a pair of similar images and a dissimilar image, namely, the acquired images of the same operator at different camera positions and different moments are recorded as samples a_i, and the acquired images of other operators are recorded as samples a_j; each time data are selected to construct triplet data, two images are randomly extracted from a_i and one image from a_j to construct a triplet, and the cos similarity measurement distance is calculated;
the loss design is specifically as follows:
selecting a set of triplet training data, including from sample aiA positive sample picture a and a positive sample picture p extracted from the image data, from the sample ajExtracting a negative sample picture n, and calculating the loss of the triplet:
L_t = [D(a, p) - D(a, n) + margin]_+
wherein margin is a boundary hyperparameter, D(a, p) denotes the similarity distance between picture a and picture p, D(a, n) denotes the similarity distance between picture a and picture n, and [·]_+ denotes max(·, 0);
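A direct transcription of this loss, taking D as 1 - cosine similarity (one choice consistent with the cosine similarity distance above; the margin value is an assumption):

```python
import torch
import torch.nn.functional as F

def cosine_distance(u, v):
    """Similarity distance D defined as 1 - cos(u, v)."""
    return 1.0 - F.cosine_similarity(u, v, dim=-1)

def triplet_loss(a, p, n, margin=0.3):
    """L_t = [D(a, p) - D(a, n) + margin]_+ for anchor/positive/negative
    embedding tensors."""
    return F.relu(cosine_distance(a, p) - cosine_distance(a, n) + margin)
```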
the person feature extraction network submodule adopts a three-branch input network; samples are scaled to a uniform input size, and the sample data undergo class division and sample identification to obtain the person features.
2. A construction warning area monitoring and early warning method based on image recognition, characterized in that the early warning system of claim 1 is adopted and the method specifically comprises the following steps:
S1, deploying a camera set covering the construction operation and the surrounding warning area; inputting the images collected by the cameras into the image stitching module for stitching to obtain a panoramic image of the work area;
S2, marking the early warning trigger area on the obtained panoramic image through the interactive calibration module, and at the same time recording the features and the number of the operators allowed to enter the early warning trigger area;
S3, the visual intrusion module monitors the video images of the early warning trigger area in real time, and when an intruder enters the early warning trigger area, it sends an early warning signal to activate the pedestrian detection module;
S4, after receiving the early warning from the visual intrusion module, the pedestrian detection module detects pedestrian targets in the early warning trigger area, counts the number of pedestrians, crops the area where each pedestrian is located from the video image, and determines the specific position of that area according to the warning position information;
and S5, the feature extraction module extracts features from the obtained images of intruding pedestrians and measures the distance between the extracted features and the operator features recorded in step S2, thereby determining whether non-operators are present in the early warning trigger area and issuing a corresponding warning signal.
3. The construction warning area monitoring and early warning method based on image recognition as claimed in claim 2, wherein step S3 specifically comprises:
S31, the visual intrusion module acquires the real-time video stream, splits the video data stream into single-frame images, and obtains the early warning trigger area through a GetImMask module;
S32, performing out-of-range detection and area detection on the obtained early warning trigger area;
out-of-range detection checks whether a personnel intrusion signal exists on the upper/lower/left/right sides of the boundary line of the early warning trigger area; if so, an alarm signal is sent out; if not, the process returns to step S31 to acquire a single-frame image again;
area detection checks whether a personnel intrusion signal exists inside the early warning trigger area; if so, an alarm signal is sent out; if not, the process returns to step S31 to acquire a single-frame image again.
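One plausible realization of the area check, using frame differencing inside the trigger-area mask (the thresholds are assumptions, not taken from the patent); applying the same routine to a dilated mask of the boundary line gives the out-of-range check:

```python
import cv2
import numpy as np

def intrusion_signal(prev_gray, cur_gray, zone_mask, min_area=500):
    """Return True if a moving region larger than min_area pixels appears
    inside the masked zone between two consecutive grayscale frames."""
    diff = cv2.absdiff(prev_gray, cur_gray)
    _, moving = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    moving = cv2.bitwise_and(moving, zone_mask)
    contours, _ = cv2.findContours(moving, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return any(cv2.contourArea(c) >= min_area for c in contours)
```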
4. The construction warning area monitoring and early warning method based on image recognition as claimed in claim 2, wherein step S4 further comprises: when the number of intruding pedestrians is larger than the number of recorded operators, warning information is sent out directly; steps S4 and S5 may be initiated on a timed schedule or at a specified time.
5. The construction warning area monitoring and early warning method based on image recognition as claimed in claim 2, wherein step S5 specifically comprises:
S51, constructing a triplet training set: each group of triplet data comprises a pair of similar images and one dissimilar image; the collected images of the same operator at different camera positions at different moments are recorded as class sample a_i, and the collected images of other operators are recorded as class sample a_j; each time triplet data are constructed, an anchor image a and a positive image p are drawn at random from a_i and a negative image n is drawn from a_j to form a triplet, and the cosine similarity distance of each triplet is calculated;
S52, training on the triplet training set with the triplet loss:
in the training process, the number of training images read in at a time is set to P×K, i.e., P classes are randomly selected each time, and K images are randomly selected from each class to train the network; the triplet loss of each batch of training images is calculated with the following formula:
L = Σ_{i=1..P} Σ_{a=1..K} [ max_{p=1..K} D(x_a^i, x_p^i) - min_{j=1..P, j≠i; n=1..K} D(x_a^i, x_n^j) + margin ]_+
wherein x_a^i denotes the a-th image of class i; max_{p} D(x_a^i, x_p^i) selects the same-class sample with the largest similarity distance (the hardest positive), and min_{j≠i, n} D(x_a^i, x_n^j) selects the different-class sample with the smallest similarity distance (the hardest negative); i and j denote different classes, subscripts a and p denote picture indices within the same class, and subscript n denotes a picture index in a different class;
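A batch-hard mining sketch of this loss for one P×K batch, again taking D = 1 - cosine similarity (margin value assumed):

```python
import torch
import torch.nn.functional as F

def batch_hard_triplet_loss(emb, labels, margin=0.3):
    """For each anchor in a P*K batch, pick the same-class sample with the
    largest distance and the different-class sample with the smallest one."""
    emb = F.normalize(emb, dim=1)
    dist = 1.0 - emb @ emb.t()                     # pairwise cosine distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    hardest_pos = dist.masked_fill(~same, float('-inf')).max(dim=1).values
    hardest_neg = dist.masked_fill(same, float('inf')).min(dim=1).values
    return F.relu(hardest_pos - hardest_neg + margin).sum()
```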
S53, inputting the sample feature data through the three-branch input network, and aggregating input sample feature maps of different sizes into feature maps of uniform size using ROI Align, so that effective features are aggregated and retained while the features are compressed;
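A sketch of the aggregation step, using torchvision's roi_align to pool variable-size person regions from a feature map into uniform-size feature maps (the box format and output size are assumptions):

```python
import torch
from torchvision.ops import roi_align

def gather_uniform_features(feature_map, boxes, out_size=(7, 7)):
    """Pool each (x1, y1, x2, y2) region of a (1, C, H, W) feature map into a
    fixed out_size grid, retaining effective features while compressing."""
    idx = torch.zeros(boxes.shape[0], 1)           # all boxes from batch image 0
    rois = torch.cat([idx, boxes], dim=1)          # (N, 5): (batch_idx, x1, y1, x2, y2)
    return roi_align(feature_map, rois, output_size=out_size)
```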
and S54, performing class division and sample identification on the uniform-size sample feature maps with a multi-task learning method: images of persons appearing in the same area at different camera positions within the same time period are grouped into one class and numbered, the model is trained with the triplet loss, the distance between images of different persons is measured by cosine similarity, and sample pairs are finally identified by the measured similarity distance between them.
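The final pair identification can then reduce to a simple distance threshold (the threshold value is an assumption):

```python
import torch.nn.functional as F

def same_person(feat_a, feat_b, threshold=0.4):
    """Identify a sample pair by its cosine similarity distance."""
    return (1.0 - F.cosine_similarity(feat_a, feat_b, dim=-1)) < threshold
```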

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011026889.2A CN112216049B (en) 2020-09-25 2020-09-25 Construction warning area monitoring and early warning system and method based on image recognition


Publications (2)

Publication Number Publication Date
CN112216049A CN112216049A (en) 2021-01-12
CN112216049B (en) 2022-04-29

Family

ID=74051245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011026889.2A Active CN112216049B (en) 2020-09-25 2020-09-25 Construction warning area monitoring and early warning system and method based on image recognition


Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113128340B (en) * 2021-03-16 2022-09-02 广州华微明天软件技术有限公司 Personnel intrusion detection method and device
CN112990711B (en) * 2021-03-19 2023-11-17 云南建投第九建设有限公司 Aluminum alloy template construction monitoring method and system based on site construction
CN113115225B (en) * 2021-04-09 2022-07-15 国能智慧科技发展(江苏)有限公司 Electronic fence area generation system based on dangerous source monitoring and personnel positioning
CN113052137B (en) * 2021-04-25 2022-11-01 烟台大迈物联科技有限公司 Identification and judgment method for construction site environment
CN113340390B (en) * 2021-06-28 2023-09-05 广东韶钢松山股份有限公司 Anti-cheating weighing system and method
CN113673046B (en) * 2021-07-20 2023-06-06 杭州大杰智能传动科技有限公司 Internet of things communication system and method for intelligent tower crane emergency early warning
CN113762171A (en) * 2021-09-09 2021-12-07 赛思沃德(武汉)科技有限公司 Method and device for monitoring safety of railway construction site
CN113888825A (en) * 2021-09-16 2022-01-04 无锡湖山智能科技有限公司 Monitoring system and method for driving safety
WO2023060405A1 (en) * 2021-10-11 2023-04-20 深圳市大疆创新科技有限公司 Unmanned aerial vehicle monitoring method and apparatus, and unmanned aerial vehicle and monitoring device
CN114445393B (en) * 2022-02-07 2023-04-07 无锡雪浪数制科技有限公司 Bolt assembly process detection method based on multi-vision sensor
CN114411561B (en) * 2022-02-11 2023-05-12 中交第二公路工程局有限公司 Control method of prestress tension control system based on voice and image recognition technology
CN114639214B (en) * 2022-05-23 2022-08-12 安徽送变电工程有限公司 Intelligent safety distance early warning system and method for electric power hoisting operation
CN115206091B (en) * 2022-06-07 2024-06-07 西安电子科技大学广州研究院 Road condition and event monitoring system and method based on multiple cameras and millimeter wave radar
CN115147770B (en) * 2022-08-30 2022-12-02 山东千颐科技有限公司 Belt foreign matter vision recognition system based on image processing
CN115691040A (en) * 2022-09-07 2023-02-03 保利长大工程有限公司 Safe distance judgment type bridge construction early warning system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107483889A (en) * 2017-08-24 2017-12-15 北京融通智慧科技有限公司 The tunnel monitoring system of wisdom building site control platform
CN108875505B (en) * 2017-11-14 2022-01-21 北京旷视科技有限公司 Pedestrian re-identification method and device based on neural network
CN111163286A (en) * 2018-11-08 2020-05-15 北京航天长峰科技工业集团有限公司 Panoramic monitoring system based on mixed reality and video intelligent analysis technology
JP7232650B2 (en) * 2019-01-25 2023-03-03 富士古河E&C株式会社 Intrusion detection system
CN111598787B (en) * 2020-04-01 2023-06-02 西安电子科技大学 Biological radar image denoising method and device, electronic equipment and storage medium thereof



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant