CN111898541A - Intelligent visual monitoring and warning system for safety operation of gantry crane - Google Patents

Intelligent visual monitoring and warning system for safety operation of gantry crane

Info

Publication number
CN111898541A
CN111898541A (application CN202010755699.8A)
Authority
CN
China
Prior art keywords
module, video, limb, candidate, confidence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010755699.8A
Other languages
Chinese (zh)
Inventor
潘晴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Blue Ocean Yangzhou Intelligent Vision Technology Co Ltd
Original Assignee
Zhongke Blue Ocean Yangzhou Intelligent Vision Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Blue Ocean Yangzhou Intelligent Vision Technology Co Ltd filed Critical Zhongke Blue Ocean Yangzhou Intelligent Vision Technology Co Ltd
Priority to CN202010755699.8A
Publication of CN111898541A
Legal status: Pending


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G08SIGNALLING
    • G08BSIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B13/00Burglar, theft or intruder alarms
    • G08B13/18Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength
    • G08B13/189Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems
    • G08B13/194Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems
    • G08B13/196Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
    • G08B13/19602Image analysis to detect motion of the intruder, e.g. by frame subtraction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources

Abstract

The invention relates to an intelligent visual monitoring and warning system for the safe operation of a gantry crane, which comprises image acquisition modules, a video analysis system, a video data server and a video display module. The image acquisition modules are each connected to the server through a wired or wireless network; the video analysis system and the video data server are deployed on the server, and the video data server is connected to the video display module. The system has six functions: operator qualification examination; follow-along monitoring and warning; safety protection detection; detection of worker falls, worker gatherings and operator calls for help; gantry crane area intrusion detection; and nipping accidents in the bud. The system is a new-infrastructure product for production safety: if every gantry crane were equipped with one set, it would not only effectively help production units improve the efficiency of safety production management, but could also be networked to the local big data bureau, enabling the relevant local government departments to master big data on the safe operation of local gantry cranes.

Description

Intelligent visual monitoring and warning system for safety operation of gantry crane
Technical Field
The invention belongs to the field of safety production management, relates to intelligent monitoring equipment, and particularly relates to an intelligent visual monitoring and warning system for the safe operation of a gantry crane.
Background
In recent years, the national infrastructure industry has developed explosively, but the pain point it brings has still not been solved effectively. Statistics on crane safety accidents show that more than 90 percent are caused by illegal operation by personnel, in three respects: 1. persons without formal training operating without permission; 2. the dangerous area not being cleared in time (because the crane moves, the dangerous area is dynamic: a place that is safe at ordinary times can become dangerous during operation); 3. protective measures for field personnel not being in place, for example not wearing a safety helmet, which has led to a great number of head-injury accidents. Safety is the greatest pain point in the infrastructure industry. The intelligent visual monitoring and warning system for the safe operation of the gantry crane can protect gantry cranes on new construction sites.
No published patent identical to the content of the present invention was found through searching.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an intelligent visual monitoring and warning system for the safe operation of a gantry crane, based mainly on computer vision posture monitoring and acceleration sensor posture monitoring, with higher system efficiency, high speed and high accuracy.
The technical problem to be solved by the invention is realized by adopting the following technical scheme:
the utility model provides a navigation hangs safety work intelligence vision monitoring and alarming system which characterized in that: the system comprises image acquisition modules, video analysis systems, video data servers and video display modules, wherein the image acquisition modules are respectively connected with the servers through wired or wireless networks;
the image acquisition module comprises cameras or high-speed shooting instruments, distributors, video technology modules and network video servers, the cameras or high-speed shooting instruments are respectively connected with the distributors, the output ends of the distributors are respectively connected with the video technology modules and the network video servers, and the signal output lines of the video technology modules and the network video servers are connected with the servers; the camera or the high-speed shooting instrument is used for acquiring pictures and videos of a scene, the video technology module adopts the image/video enhancement module to perform signal denoising and enhancement on the acquired images/videos, the network server is used for network transmission of the images and the videos, and the distributor is used for signal distribution transmission of the plurality of acquisition modules, the video technology module and the network server.
In addition, an internal management system and a public service platform are also installed in the server.
Moreover, the video analysis system comprises a target detection module, a face recognition module and a key point detection module, and specifically comprises:
a target detection module: the human body identification and positioning device is used for identifying and positioning the human body and providing basis for further analysis;
the face recognition module is used for extracting, recognizing and comparing the faces of the pedestrians in the video one by one to realize qualification examination of operators;
and the key point detection module is used for positioning the characteristic information and providing position information for subsequent identification.
Moreover, the video data server comprises a video retrieval module, an application security module, a distributed database and a distributed file system, and specifically comprises:
the video retrieval module: searches and plays back related videos according to key information;
the application security module: provides data encryption to ensure data security;
distributed database: the system is a database foundation of a multi-server architecture and provides data support for distributed computing;
distributed file system: the file system is a file system foundation of a multi-server architecture and provides data management for distributed computing.
The operation platform of the system comprises an acquisition module, an operation center, a data center, a demonstration center and a Web server, wherein the acquisition module is respectively connected with the demonstration center and the data center in a communication manner, the demonstration center and the data center are respectively connected with the operation center in a communication manner, and the operation center, the data center and the demonstration center are all deployed in the server;
a server: deploying an automatic monitoring and management system and a network communication module;
the demonstration center: the system comprises a user interface, a split screen display module and an image/video enhancement module;
the data center comprises: the system is used for collecting, storing and managing high-definition videos and comprises a video retrieval module, an application security module, a distributed database and a distributed file system;
an operation center: the intelligent video analysis module comprises a target detection module, a face recognition module and a key point detection module.
A working method of the intelligent visual monitoring and warning system for the safe operation of a gantry crane, characterized by safety monitoring and warning in a fixed scene:
(1) extracting foreground characteristics through the pictures and videos acquired by the image acquisition module;
(2) the operation center calculates the position, direction, perimeter, width-to-height ratio, area-to-enclosed area ratio, eccentricity and other geometric characteristics of an object in the foreground image, and the target detection module forms a characteristic vector to reflect the human body posture through the extracted geometric characteristic parameters;
(3) according to the algorithm requirement, different postures are adopted as samples, the characteristics in the samples are extracted as a data set and used as data source reference, and the key point detection module and the face recognition module recognize the person;
the key point detection module realizes the behavior recognition functions of the human body, including gantry crane area intrusion, worker falls, worker gatherings and safety protection detection;
the face recognition module accurately performs face comparison through deep learning of faces, realizing the operator qualification examination function of the intelligent visual monitoring system for the safe operation of the gantry crane; by building a face library it can recognize blacklists and whitelists, set an AOI surveillance area, and raise an alarm on intrusion into a dangerous area.
Moreover, the algorithm steps are as follows:
(1) The video analysis system takes a color image as input and produces, as output, the 2D locations of anatomical keypoints for each individual in the image;
first, a feedforward network predicts a set of 2D confidence maps S for body part locations and a set of 2D part affinity fields (PAFs) L that encode the degree of association between parts; the set S = (S_1, S_2, …, S_J) has J confidence maps, one per part, where S_j ∈ R^{w×h}, j ∈ {1, …, J}; the set L = (L_1, L_2, …, L_C) has C vector fields, one per limb, where L_c ∈ R^{w×h×2}, c ∈ {1, …, C}, and each image location in L_c encodes a 2D vector; finally, the confidence maps and the affinity fields are parsed by greedy inference to output the 2D keypoints of all individuals in the image;
for multi-person pose estimation, body parts belonging to the same individual are connected; different directions are encoded with different colors, and for each pixel in the field, a 2D vector encodes the position and direction of the limb;
(2) Overall architecture: the whole picture is taken as input to a two-branch CNN that jointly predicts the confidence maps for body part detection and the part affinity fields for part association; a parsing step performs a set of bipartite matchings to associate body part candidates, which are finally assembled into the full-body poses of all individuals in the image;
the framework simultaneously predicts the detection confidence maps and the affinity fields encoding part-to-part association, and the network is divided into two branches: a top branch, predicting the confidence maps; a bottom branch, predicting the affinity fields; each branch is an iterative prediction architecture in which the predictions are refined over successive stages, with supervision added at each stage.
In addition, in the two-branch multi-stage CNN architecture, each stage in the first branch predicts the confidence maps, each stage in the second branch predicts the PAFs, and after each stage the predictions of the two branches, together with the image features, are passed to the next stage;
the image is first analyzed by a convolutional network to produce a set of feature maps F that is used as input to the first stage of each branch; at the first stage, the network produces a set of detection confidence maps S^1 = ρ^1(F) and a set of part affinity fields L^1 = φ^1(F), where ρ^1 and φ^1 are the CNN inferences at the first stage; in each subsequent stage, the predictions of both branches from the previous stage, concatenated with the original image features F, are used to produce refined predictions:

S^t = ρ^t(F, S^{t−1}, L^{t−1}), ∀t ≥ 2   (2)
L^t = φ^t(F, S^{t−1}, L^{t−1}), ∀t ≥ 2   (3)

where ρ^t and φ^t are the CNN inferences at stage t.
To guide the network to iteratively predict the confidence maps of body parts in the first branch and the PAFs in the second branch, two loss functions are added at the end of each stage, one per branch; an L2 loss is used between the predictions and the ground-truth maps and fields, weighted spatially to address the problem that some datasets do not completely label all individuals. Specifically, at stage t, the loss functions in the two branches are:

f_S^t = Σ_{j=1}^{J} Σ_p W(p) · ‖S_j^t(p) − S_j*(p)‖₂²   (4)
f_L^t = Σ_{c=1}^{C} Σ_p W(p) · ‖L_c^t(p) − L_c*(p)‖₂²   (5)

where S_j* is the ground-truth part confidence map, L_c* is the ground-truth part affinity vector field, and W is a binary mask with W(p) = 0 when the annotation is missing at image location p. The mask is used to avoid penalizing true positive predictions during training; the intermediate supervision at each stage addresses the vanishing-gradient problem by replenishing the gradient periodically. The overall objective is:

f = Σ_{t=1}^{T} (f_S^t + f_L^t)   (6)
For detection-combination confidence: given a set of detected body parts, they must be assembled, without knowing the number of individuals, into the full-body pose of each individual. This requires a confidence measure for each pair of body part detections indicating that they belong to the same individual. The representation stores both the position information of the limb's support region and its direction information: for each limb, the part association is a 2D vector field, and for each pixel in the region belonging to a particular limb, a 2D vector encodes the direction from one part of the limb to the other. Each type of limb has a corresponding affinity field connecting its two associated parts.
To evaluate f_L in equation (5) during training, the ground-truth part affinity vector field L*_{c,k} is defined at each point p of the image as:

L*_{c,k}(p) = v, if p lies on limb c of individual k; 0 otherwise   (7)

where v = (x_{j2,k} − x_{j1,k}) / ‖x_{j2,k} − x_{j1,k}‖₂ is the unit vector in the direction of the limb, and x_{j1,k} and x_{j2,k} are the ground-truth positions of parts j1 and j2 of limb c for individual k. The set of points on the limb is defined as those within a distance threshold of the line segment, i.e. those points p for which:

0 ≤ v · (p − x_{j1,k}) ≤ l_{c,k} and |v⊥ · (p − x_{j1,k})| ≤ σ_l   (8)

where the limb width σ_l is a distance in pixels, the limb length is l_{c,k} = ‖x_{j2,k} − x_{j1,k}‖₂, and v⊥ is a vector orthogonal to v.
The ground-truth part affinity field averages the affinity fields of all individuals in the picture:

L_c*(p) = (1/n_c(p)) Σ_k L*_{c,k}(p)   (9)

where n_c(p) is the number of non-zero vectors at point p across all k individuals.
During testing, the association between candidate part detections is measured by computing the line integral over the corresponding PAF along the line segment connecting the candidate part locations; that is, the alignment is measured between the predicted PAF and the candidate limb that would be formed by connecting the detected body parts. Specifically, for two candidate part locations d_{j1} and d_{j2}, the predicted part affinity field L_c is sampled along the line segment to measure the confidence of their connection:

E = ∫_{u=0}^{u=1} L_c(p(u)) · (d_{j2} − d_{j1}) / ‖d_{j2} − d_{j1}‖₂ du   (10)

where p(u) interpolates the positions of the two body parts d_{j1} and d_{j2}:

p(u) = (1 − u) · d_{j1} + u · d_{j2}   (11)

In practice, the integral is approximated by sampling and summing uniformly spaced values of u;
applying non-maximum suppression to the detection confidence maps yields a discrete set of candidate part locations; for each part there may be several candidates, which define a large set of possible limbs. Each candidate limb is scored using the line integral over the PAF defined by equation (10). Finding the optimal parse corresponds to a K-dimensional matching problem.
Formally, a set of body part detection candidates D_j is obtained for multiple individuals, where D_j = {d_j^m : j ∈ {1, …, J}, m ∈ {1, …, N_j}}, N_j is the number of candidates for part j, and d_j^m ∈ R² is the location of the m-th detection candidate of body part j. These candidate detections still need to be associated with the other parts of the same individual. A variable z_{j1j2}^{mn} ∈ {0, 1} is defined to indicate whether two candidate detections d_{j1}^m and d_{j2}^n are connected, and the goal is to find the optimal assignment over the set of all possible connections Z = {z_{j1j2}^{mn} : j1, j2 ∈ {1, …, J}, m ∈ {1, …, N_{j1}}, n ∈ {1, …, N_{j2}}}.
If a single pair of parts j1 and j2 is considered for the c-th limb, finding the optimal connection reduces to a maximum-weight bipartite graph matching problem in which the nodes of the graph are the body part detection candidates d_{j1}^m and d_{j2}^n and the edges are all possible connections between pairs of candidates; further, each edge is weighted by the part affinity aggregate of equation (10). A matching in the bipartite graph is a subset of edges chosen such that no two edges share a node; the goal is to find a matching with maximum total weight over the selected edges:

max_{Z_c} E_c = max_{Z_c} Σ_{m∈D_{j1}} Σ_{n∈D_{j2}} E_{mn} · z_{j1j2}^{mn}   (12)
s.t. ∀m ∈ D_{j1}: Σ_{n∈D_{j2}} z_{j1j2}^{mn} ≤ 1   (13)
∀n ∈ D_{j2}: Σ_{m∈D_{j1}} z_{j1j2}^{mn} ≤ 1   (14)

where E_c is the total weight of the matching for limb type c, Z_c is the subset of Z for limb type c, and E_mn is the part affinity between d_{j1}^m and d_{j2}^n defined by equation (10). Equations (13) and (14) enforce that no two edges share a node, so that no two limbs of the same type share a part.
Furthermore, to find the full-body poses of multiple individuals, two relaxations are added to the optimization: first, a minimal number of edges is chosen to obtain a spanning tree skeleton of the individual pose instead of using the complete graph; second, the matching problem is further decomposed into a set of bipartite matching subproblems, and the matching in adjacent tree nodes is determined independently.
With these two relaxations, the optimization simply decomposes into:

max_Z E = Σ_{c=1}^{C} max_{Z_c} E_c   (15)

Limb connection candidates are obtained independently for each limb type using equations (12)-(14); with all candidate limb connections available, the connections that share the same candidate detection part are assembled into full-body pose detections for multiple individuals.
The invention has the advantages and positive effects that:
1. Intelligence: based on deep-learning artificial intelligence edge computing, the system adopts advanced, mature audio/video coding and network transmission technology on top of a traditional network video monitoring system; video coding adopts the advanced H.264 standard, with a high compression ratio, good image quality, strong fault tolerance and strong network adaptability.
2. Stability: the system software adopts a C/S architecture with good stability; the monitoring front end adopts an embedded design and a dedicated chip, giving a compact structure, low power consumption and high stability, and can adapt to various harsh environments.
3. Simplicity: the method and the system are closely combined with the application requirements of different services of the user, the system operation process is clear and efficient, the software operation interface is concise and friendly, and the method and the system are easy to use.
4. Customization: Zhongke Blue Ocean has a professional technical team engaged in the research, development and construction of intelligent video technology for many years; each function of the system adopts a modular design and can be customized to actual requirements, fully meeting users' individual needs and providing optimal support.
5. The system has six functions. First: operator qualification examination, verifying the operator's qualification by face recognition. Second: follow-along warning; the system is mounted on the traveling crane and moves with it. Third: safety protection detection, checking whether protective measures are in place when a person enters the work area, such as whether a safety helmet or work clothing is worn. Fourth: detection of worker falls, worker gatherings and operator calls for help. Fifth: gantry crane area intrusion detection. Sixth: prevention before accidents happen.
6. The system is a new-infrastructure product for production safety: if every gantry crane is equipped with one set, it can not only effectively help production units improve the efficiency of safety production management, but can also be networked to the local big data bureau, enabling the relevant local government departments to master big data on the safe operation of local gantry cranes.
Drawings
FIG. 1 is a schematic diagram of the system of the present invention;
FIG. 2 is a system operating platform of the present invention;
FIG. 3 is a schematic diagram of the architecture of a two-branch multi-step CNN;
FIG. 4 is a confidence graph illustration;
FIG. 5 is a schematic view of a simple limb movement;
FIG. 6 is a graph matching illustration: (a) original image with part detection results; (b) K-partite graph; (c) tree structure; (d) a set of bipartite graphs.
Detailed Description
The present invention will be described in further detail with reference to the following embodiments, which are illustrative only and not limiting, and the scope of the present invention is not limited thereby.
An intelligent visual monitoring and warning system for the safe operation of a gantry crane comprises image acquisition modules, a video analysis system, a video data server and a video display module; the image acquisition modules are each connected to the server through a wired or wireless network, the video analysis system and the video data server are deployed on the server, and the video data server is connected to the video display module;
an internal management system and a public service platform are also installed in the server;
the image acquisition module comprises cameras or high-speed photographic instruments, distributors, video technology modules and network video servers; the cameras or high-speed photographic instruments are each connected to a distributor, the output ends of the distributors are connected to the video technology modules and the network video servers respectively, and the signal output lines of the video technology modules and the network video servers are connected to the server. The cameras or high-speed photographic instruments acquire photos and videos of the scene; the video technology module uses the image/video enhancement module to denoise and enhance the acquired images/videos; the network video server handles network transmission of the images and videos; and the distributor distributes the signals among the multiple acquisition modules, the video technology module and the network video server.
The video analysis system comprises a target detection module, a face recognition module and a key point detection module, and specifically comprises:
a target detection module: the human body identification and positioning device is used for identifying and positioning the human body and providing basis for further analysis;
the face recognition module is used for extracting, recognizing and comparing the faces of the pedestrians in the video one by one to realize qualification examination of operators;
and the key point detection module is used for positioning the characteristic information and providing position information for subsequent identification.
The video data server comprises a video retrieval module, an application security module, a distributed database and a distributed file system, and specifically comprises the following steps:
the video retrieval module: searches and plays back related videos according to key information;
the application security module: provides data encryption to ensure data security;
distributed database: the system is a database foundation of a multi-server architecture and provides data support for distributed computing;
distributed file system: the file system is a file system foundation of a multi-server architecture and provides data management for distributed computing.
The system comprises the following functions:
First: qualification examination. Face recognition verifies the operator's qualification.
Second: follow-along warning. The system is mounted on the traveling crane and moves with it; when a person is found in a dangerous area, a voice alarm is raised automatically, avoiding gaps in manual on-site supervision, which arise easily while the crane is moving.
Third: protective equipment check. When a person enters the work area, the system detects whether protective measures are in place, such as whether a safety helmet or work clothing is worn;
Fourth: detection of worker falls, worker gatherings and operator calls for help;
Fifth: gantry crane area intrusion detection;
Sixth: prevention before accidents happen. For every real accident there may be a hundred illegal operations that were never discovered and corrected; by finding and correcting them in time, the system nips accidents in the bud.
The working method of the system comprises the following safety monitoring and alarming steps in a fixed scene:
Firstly, foreground features are extracted from the photos and videos acquired by the image acquisition module;
then, the operation center calculates the position, direction, perimeter, width-to-height ratio, ratio of area to enclosed area, eccentricity and other geometric characteristics of each object in the foreground image, and the target detection module forms a feature vector from the extracted geometric parameters to reflect the human body posture;
finally, according to the algorithm requirements, different postures are taken as samples and the features in the samples are extracted as a data set for reference, so that behavior recognition of personnel is achieved; the whole process depends on the computing power of the computer and the soundness of the data modeling;
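As a concrete illustration of the feature extraction in the steps above, the following OpenCV sketch computes the listed geometric descriptors from a binary foreground mask; the noise threshold and the exact feature set are illustrative assumptions, not the patent's actual implementation.

```python
# Hypothetical sketch: compute the geometric features named above
# (position, perimeter, width-to-height ratio, area-to-enclosed-area
# ratio, eccentricity) for each object in a binary foreground mask.
import cv2
import numpy as np

def foreground_features(mask):
    """mask: uint8 binary foreground image; returns one feature vector per
    detected object for the target detection module to classify."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    features = []
    for c in contours:
        area = cv2.contourArea(c)
        if area < 100:                        # assumed noise threshold
            continue
        x, y, w, h = cv2.boundingRect(c)      # position and extent
        perimeter = cv2.arcLength(c, True)
        aspect = w / h                        # width-to-height ratio
        extent = area / (w * h)               # area / enclosing-box area
        ecc = 0.0
        if len(c) >= 5:                       # fitEllipse needs >= 5 points
            (_, _), (d1, d2), _ = cv2.fitEllipse(c)
            a, b = max(d1, d2) / 2.0, min(d1, d2) / 2.0
            ecc = float(np.sqrt(1.0 - (b / a) ** 2)) if a > 0 else 0.0
        features.append([x, y, perimeter, aspect, extent, ecc])
    return features
```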
the key point detection module can realize the behavior recognition function of a human body, and has the main functions of inspection of an operator of the gantry crane, intrusion of a gantry crane area, falling of workers, gathering of the workers, safety protection detection (namely safety helmet protective clothing) and the like, so that the behavior recognition function is realized.
The face recognition module accurately performs face comparison through deep learning of faces, realizing the qualification examination function of the intelligent visual monitoring system for the safe operation of the gantry crane; by building a face library it can recognize blacklists and whitelists, and an AOI surveillance area can also be set flexibly, so that an alarm is raised on intrusion into a dangerous area, realizing the gantry crane operator inspection function.
The algorithm of the system is based on "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields"; the specific steps are as follows:
the video analysis system takes a color image as input and then produces as output 2D locations for contact keypoints for each individual in the image. First, a forward network is used to predict a set of 2D confidence maps S for body part locations and a 2D vector field L for part affinity, which encodes the relevance between the locations, the set S ∈ (S1, S2., Sj) having J confidence maps, one for each location, where S ∈ Rw × h, J ∈ {1.. J }, the set L ∈ { L1, L2.,. LC } has C vector fields, one for each limb, where LC ∈ Rw × h × 2, C ∈ {1.. C }, each picture encodes a 2D vector at a location LC, and finally, the confidence dictionary and the relevance field are resolved by greedy inference to output 2D key points for all individuals in the image.
For multiple individual posture assessments, body parts belonging to the same individual are connected, different directions are coded with different colors, and for each pixel in the domain, a 2D vector encodes the position and direction of the limb;
Then the whole picture is taken as input to a two-branch CNN that jointly predicts the confidence maps for body part detection and the part affinity fields for part association; a parsing step performs a set of bipartite matchings to associate body part candidates, which are finally assembled into the full-body poses of all individuals in the image.
The complete framework simultaneously predicts the detection confidence maps and the affinity fields encoding part-to-part association. The network is divided into two branches: the top branch, marked in beige, predicts the confidence maps, while the bottom branch, marked in blue, predicts the affinity fields. Each branch is an iterative prediction architecture following the method proposed by Wei et al., refining the predictions over successive stages t ∈ {1, …, T}, with supervision added at each stage;
referring to FIG. 3, the two-branch multi-stage CNN architecture is shown: each stage in the first branch predicts the confidence maps, and each stage in the second branch predicts the PAFs. After each stage, the predictions of the two branches, together with the image features, are passed to the next stage;
the image is first analyzed by a convolutional network (initialized with and fine-tuned from the first 10 layers of VGG-19) to produce a set of feature maps F, which is used as input to the first stage of each branch. At the first stage, the network produces a set of detection confidence maps S^1 = ρ^1(F) and a set of part affinity fields L^1 = φ^1(F), where ρ^1 and φ^1 are the CNN inferences at the first stage. In each subsequent stage, the predictions of both branches from the previous stage, concatenated with the original image features F, are used to produce refined predictions:

S^t = ρ^t(F, S^{t−1}, L^{t−1}), ∀t ≥ 2   (2)
L^t = φ^t(F, S^{t−1}, L^{t−1}), ∀t ≥ 2   (3)

where ρ^t and φ^t are the CNN inferences at stage t.
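The iterative two-branch prediction of equations (2) and (3) can be sketched as follows; this is a minimal PyTorch illustration in which the channel counts, layer sizes and class names are assumptions, not the network actually claimed here.

```python
# Minimal sketch of the two-branch multi-stage architecture (eqs. (2)-(3)).
import torch
import torch.nn as nn

def branch(in_ch, out_ch):
    # One refinement branch: a small conv stack ending in a 1x1 projection.
    return nn.Sequential(
        nn.Conv2d(in_ch, 128, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(128, out_ch, 1))

class TwoBranchPoseNet(nn.Module):
    def __init__(self, feat_ch=128, n_parts=19, n_paf=38, stages=3):
        super().__init__()
        self.backbone = nn.Sequential(        # stand-in for the VGG-19 features F
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(inplace=True))
        in_later = feat_ch + n_parts + n_paf  # later stages see F, S^{t-1}, L^{t-1}
        self.rho = nn.ModuleList([branch(feat_ch, n_parts)] +
                                 [branch(in_later, n_parts) for _ in range(stages - 1)])
        self.phi = nn.ModuleList([branch(feat_ch, n_paf)] +
                                 [branch(in_later, n_paf) for _ in range(stages - 1)])

    def forward(self, img):
        F = self.backbone(img)
        S, L = self.rho[0](F), self.phi[0](F)    # S^1 = rho^1(F), L^1 = phi^1(F)
        outputs = [(S, L)]
        for rho_t, phi_t in zip(self.rho[1:], self.phi[1:]):
            x = torch.cat([F, S, L], dim=1)      # eqs. (2)-(3): refine from F, S, L
            S, L = rho_t(x), phi_t(x)
            outputs.append((S, L))               # keep every stage for supervision
        return outputs
```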
FIG. 4 illustrates the refinement of the confidence maps and affinity fields over several stages. To guide the network to iteratively predict the confidence maps of body parts in the first branch and the PAFs in the second branch, we add two loss functions at the end of each stage, one per branch. We use an L2 loss between the predictions and the ground-truth maps and fields. Here we weight the loss spatially to address the problem that some datasets do not completely label all individuals. Specifically, at the t-th stage, the loss functions in the two branches are:

f_S^t = Σ_{j=1}^{J} Σ_p W(p) · ‖S_j^t(p) − S_j*(p)‖₂²   (4)
f_L^t = Σ_{c=1}^{C} Σ_p W(p) · ‖L_c^t(p) − L_c*(p)‖₂²   (5)

where S_j* is the ground-truth part confidence map, L_c* is the ground-truth part affinity vector field, and W is a binary mask with W(p) = 0 when the annotation is missing at image location p. This mask is used to avoid penalizing true positive predictions during training. The intermediate supervision at each stage addresses the vanishing-gradient problem by replenishing the gradient periodically. The overall objective is:

f = Σ_{t=1}^{T} (f_S^t + f_L^t)   (6)
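A minimal sketch of the masked L2 losses of equations (4)-(6), assuming the tensor layout of the network sketch above:

```python
# Masked L2 losses (eqs. (4)-(5)) and the overall objective (eq. (6)).
import torch

def stage_losses(S_pred, L_pred, S_true, L_true, W):
    """S_*: (B, J, H, W') confidence maps; L_*: (B, 2C, H, W') PAFs;
    W: (B, 1, H, W') binary mask, 0 where annotations are missing."""
    f_S = (W * (S_pred - S_true) ** 2).sum()   # eq. (4)
    f_L = (W * (L_pred - L_true) ** 2).sum()   # eq. (5)
    return f_S, f_L

def overall_objective(stage_outputs, S_true, L_true, W):
    # eq. (6): sum both branch losses over all stages; this intermediate
    # supervision is what replenishes the gradient at every stage.
    return sum(sum(stage_losses(S, L, S_true, L_true, W))
               for S, L in stage_outputs)
```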
a joint confidence measure is detected for each body part. Given a set of body part detection results, how to integrate them without knowing the number of individuals to construct the overall body posture of each individual. A confidence measure is needed for each pair of body-part detection combinations, i.e. they belong to the same individual. One possible approach is by detecting an additional midpoint for each pair of sites on the limb and examining its incidence between candidate site detections. However, but the individuals crowd together, these midpoints may express erroneous connections that increase due to two expression limitations: (1) it only encodes the position of each limb, not the direction; (2) it narrows the support field of a limb to a simple point. To address these limitations, a novel representation of features is proposed, where the position information of the limb support field is stored, and the orientation information is stored, where for each limb the part association is a 2D vector field, where for each pixel within the area belonging to a particular limb, a 2D vector encodes the orientation of the limb from one part to another. Each type of limb has a corresponding contact field to contact its associated two parts.
For the simple limb shown in FIG. 5, let x_{j1,k} and x_{j2,k} denote the ground-truth coordinates of parts j1 and j2 of limb c for individual k. If a point p falls on the limb, the value of L*_{c,k}(p) is the unit vector pointing from j1 to j2; for all other points the vector is zero.
To evaluate f_L in equation (5) during training, we define the ground-truth part affinity vector field L*_{c,k}, which at each point p of the picture is:

L*_{c,k}(p) = v, if p lies on limb c of individual k; 0 otherwise   (7)

where v = (x_{j2,k} − x_{j1,k}) / ‖x_{j2,k} − x_{j1,k}‖₂ is the unit vector in the direction of the limb. The set of points on the limb is defined as those within a distance threshold of the line segment; that is, those points p for which:

0 ≤ v · (p − x_{j1,k}) ≤ l_{c,k} and |v⊥ · (p − x_{j1,k})| ≤ σ_l   (8)

where the limb width σ_l is a distance in pixels, the limb length is l_{c,k} = ‖x_{j2,k} − x_{j1,k}‖₂, and v⊥ is a vector orthogonal to v.
The ground-truth part affinity field averages the affinity fields of all individuals in the picture:

L_c*(p) = (1/n_c(p)) Σ_k L*_{c,k}(p)   (9)

where n_c(p) is the number of non-zero vectors at point p across all k individuals (i.e., it averages over pixels where the limbs of different individuals overlap).
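Equations (7)-(9) can be rasterized directly; the following NumPy sketch builds the ground-truth PAF for one limb type, under assumed conventions ((h, w, 2) fields, (x, y) coordinates) rather than any layout prescribed by the patent:

```python
# Ground-truth PAF construction per eqs. (7)-(9).
import numpy as np

def limb_paf(h, w, x_j1, x_j2, sigma_l=4.0):
    """Rasterize L*_{c,k} for one limb of one individual: the unit vector v
    on the limb's support region (eq. (8)), zero elsewhere (eq. (7))."""
    x_j1, x_j2 = np.asarray(x_j1, float), np.asarray(x_j2, float)
    length = np.linalg.norm(x_j2 - x_j1)                  # limb length l_{c,k}
    v = (x_j2 - x_j1) / max(length, 1e-8)                 # unit limb direction
    v_perp = np.array([-v[1], v[0]])                      # vector orthogonal to v
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    d = np.stack([xs - x_j1[0], ys - x_j1[1]], axis=-1)   # p - x_{j1,k}
    along = d @ v
    across = np.abs(d @ v_perp)
    on_limb = (along >= 0) & (along <= length) & (across <= sigma_l)
    return on_limb[..., None] * v

def groundtruth_paf(h, w, limbs):
    """Average the per-individual fields per eq. (9); `limbs` holds one
    (x_j1, x_j2) pair per individual with this limb labeled."""
    fields = [limb_paf(h, w, a, b) for a, b in limbs]
    n_c = np.sum([np.any(f != 0, axis=-1) for f in fields], axis=0)
    return np.sum(fields, axis=0) / np.maximum(n_c, 1)[..., None]
```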
At test time, we measure the association between candidate part detections by computing the line integral over the corresponding PAF along the line segment connecting the candidate part locations. That is, we measure the alignment of the predicted PAF with the candidate limb that would be formed by connecting the detected body parts. Specifically, for two candidate part locations d_{j1} and d_{j2}, we sample the predicted part affinity field L_c along the line segment to measure the confidence of their connection:

E = ∫_{u=0}^{u=1} L_c(p(u)) · (d_{j2} − d_{j1}) / ‖d_{j2} − d_{j1}‖₂ du   (10)

where p(u) interpolates the positions of the two body parts d_{j1} and d_{j2}:

p(u) = (1 − u) · d_{j1} + u · d_{j2}   (11)

In practice, we approximate the integral by sampling and summing uniformly spaced values of u.
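The sampled approximation of equations (10)-(11) is straightforward; in this sketch the nearest-pixel sampling and the (h, w, 2) PAF layout are assumptions for illustration:

```python
# Connection confidence E (eq. (10)) approximated by uniform sampling of u.
import numpy as np

def connection_score(paf, d_j1, d_j2, n_samples=10):
    """paf: (h, w, 2) predicted field L_c; d_j*: (x, y) candidate locations."""
    d_j1, d_j2 = np.asarray(d_j1, float), np.asarray(d_j2, float)
    norm = np.linalg.norm(d_j2 - d_j1)
    if norm < 1e-8:
        return 0.0
    direction = (d_j2 - d_j1) / norm                 # unit vector of the candidate limb
    total = 0.0
    for u in np.linspace(0.0, 1.0, n_samples):
        p = (1.0 - u) * d_j1 + u * d_j2              # eq. (11)
        x, y = int(round(p[0])), int(round(p[1]))    # nearest-pixel sample
        total += paf[y, x] @ direction               # integrand of eq. (10)
    return total / n_samples
```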
Non-maximum suppression is applied to the detection confidence maps to obtain a discrete set of candidate part locations. For each part there may be multiple candidate locations, because there are multiple individuals or false detections in the image (as shown in FIG. 6b). These candidate parts define a large set of possible limbs. We score each candidate limb using the line integral over the PAF defined by equation (10). The problem of finding the optimal parse corresponds to a K-dimensional matching problem, which is NP-hard (as shown in FIG. 6c). In this work we propose a greedy relaxation that consistently produces high-quality matches. We conjecture that the reason is that the pairwise association scores implicitly encode global context, owing to the large receptive field of the PAF network.
Formally, we obtain a set of body part detection candidates D_j for multiple individuals, where D_j = {d_j^m : j ∈ {1, …, J}, m ∈ {1, …, N_j}}, N_j is the number of candidates for part j, and d_j^m ∈ R² is the location of the m-th detection candidate of body part j. These candidate detections still need to be associated with the other parts of the same individual; in other words, we need to find the pairs of part detections that actually lie on the same limb. We define a variable z_{j1j2}^{mn} ∈ {0, 1} to indicate whether two candidate detections d_{j1}^m and d_{j2}^n are connected, and our goal is to find the optimal assignment over the set of all possible connections Z = {z_{j1j2}^{mn} : j1, j2 ∈ {1, …, J}, m ∈ {1, …, N_{j1}}, n ∈ {1, …, N_{j2}}}.
If we consider a single pair of parts j1 and j2 (e.g., neck and right shoulder) for the c-th limb, finding the optimal connection reduces to a maximum-weight bipartite graph matching problem. This case is shown in FIG. 6b. In this graph matching problem, the nodes of the graph are the body part detection candidates d_{j1}^m and d_{j2}^n, and the edges are all possible connections between pairs of candidates. In addition, each edge is weighted by the part affinity aggregate of equation (10). A matching in the bipartite graph is a subset of edges chosen such that no two edges share a node. Our goal is to find a matching with maximum total weight over the selected edges:

max_{Z_c} E_c = max_{Z_c} Σ_{m∈D_{j1}} Σ_{n∈D_{j2}} E_{mn} · z_{j1j2}^{mn}   (12)
s.t. ∀m ∈ D_{j1}: Σ_{n∈D_{j2}} z_{j1j2}^{mn} ≤ 1   (13)
∀n ∈ D_{j2}: Σ_{m∈D_{j1}} z_{j1j2}^{mn} ≤ 1   (14)

where E_c is the total weight of the matching for limb type c, Z_c is the subset of Z for limb type c, and E_mn is the part affinity between d_{j1}^m and d_{j2}^n defined by equation (10). Equations (13) and (14) enforce that no two edges share a node, i.e. that no two limbs of the same type (e.g., two left forearms) share a part. We can use the Hungarian algorithm to obtain the optimal matching.
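For a single limb type, the maximum-weight bipartite matching of equations (12)-(14) can be solved with an off-the-shelf Hungarian solver; in this sketch the weights come from connection_score() above and the filtering threshold is an illustrative assumption:

```python
# Bipartite matching for one limb type (eqs. (12)-(14)).
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_limb(paf, cands_j1, cands_j2, min_score=0.05):
    """cands_j*: lists of (x, y) candidates for the limb's two parts;
    returns [(m, n, score)] with at most one connection per candidate."""
    if not cands_j1 or not cands_j2:
        return []
    E = np.array([[connection_score(paf, a, b) for b in cands_j2]
                  for a in cands_j1])             # weight matrix E_mn
    rows, cols = linear_sum_assignment(-E)        # Hungarian algorithm, maximizing
    return [(m, n, E[m, n]) for m, n in zip(rows, cols)
            if E[m, n] > min_score]               # drop implausible connections
```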
Finding the full-body poses of multiple individuals requires determining the joint assignment Z, which is a K-dimensional matching problem. This problem is NP-hard, and many relaxations exist. In our work we add two relaxations to the optimization, tailored to our domain. First, we choose a minimal number of edges to obtain a spanning tree skeleton of the individual pose instead of using the complete graph, as shown in FIG. 6c. Second, we further decompose the matching problem into a set of bipartite graph matching subproblems and determine the matching in adjacent tree nodes independently, as shown in FIG. 6d. We show detailed comparison results in Section 3.1, which demonstrate that this minimal greedy inference closely approximates the global solution at a fraction of the computational cost. The reason is that the relationship between adjacent tree nodes is modeled explicitly by the PAFs, while the relationship between non-adjacent tree nodes is modeled implicitly by the CNN: because the CNN is trained with a large receptive field, PAFs of non-adjacent tree nodes also influence the predicted PAF.
After adding these two relaxations, the optimization can simply be decomposed into:

max_Z E = Σ_{c=1}^{C} max_{Z_c} E_c   (15)

We can thus obtain the limb connection candidates independently for each limb type using equations (12)-(14). With all candidate limb connections available, we assemble the connections sharing the same candidate detection part into full-body pose detections for multiple individuals. Our optimization over the tree structure is orders of magnitude faster than optimization over the fully connected graph.
The system operation platform comprises an acquisition module, an operation center, a data center, a demonstration center and a Web server, wherein the acquisition module is respectively connected with the demonstration center and the data center in a communication manner, the demonstration center and the data center are respectively connected with the operation center in a communication manner, and the operation center, the data center and the demonstration center are all deployed in the server;
a server: deploying an automatic monitoring and management system and a network communication module;
the demonstration center: the system comprises a user interface, a split screen display module and an image/video enhancement module;
the data center comprises: the system is used for collecting, storing and managing high-definition videos and comprises a video retrieval module, an application security module, a distributed database and a distributed file system;
an operation center: the intelligent video analysis module comprises a target detection module, a face recognition module and a key point detection module.
System operating parameters
1) Hardware environment: an industrial server with a high-performance GPU processor
2) Software environment: a B/S-architecture software system based on the Ubuntu system
Basic function of system
1) Behavior recognition function: the system recognizes human behavior, mainly comprising gantry crane operator inspection (namely face comparison), gantry crane area intrusion, worker falls, worker gatherings, safety protection detection (i.e. safety helmet and protective clothing) and the like.
2) Retrieval and query functions: the system provides target retrieval and related queries, mainly comprising retrieval by face photo, retrieval by video frame and the like; a minimal retrieval sketch follows this list.
3) Data storage function: the system stores original data, face images, key frames and the like for at least 1 month.
4) Interface operation and display functions: the system interface is attractive and well laid out, operation is convenient, and the display software visually presents the process of the computer's artificial intelligence video analysis.
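As an illustration of retrieval by face photo, the following is a minimal sketch of embedding-based search over stored face records; the embedding model is left abstract and the gallery record layout is a hypothetical assumption, not the system's actual storage format:

```python
# Hypothetical face-photo retrieval by cosine similarity over embeddings.
import numpy as np

def cosine_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def search_faces(query_emb, gallery, top_k=5):
    """gallery: list of (video_id, frame_no, embedding) records kept by the
    data center; returns the top_k most similar stored faces."""
    scored = [(vid, frame, cosine_sim(query_emb, emb))
              for vid, frame, emb in gallery]
    scored.sort(key=lambda r: r[2], reverse=True)
    return scored[:top_k]
```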
Although the embodiments of the present invention and the accompanying drawings are disclosed for illustrative purposes, those skilled in the art will appreciate that: various substitutions, changes and modifications are possible without departing from the spirit and scope of the invention and the appended claims, and therefore the scope of the invention is not limited to the disclosure of the embodiments and the accompanying drawings.

Claims (9)

1. An intelligent visual monitoring and warning system for the safe operation of a gantry crane, characterized in that: the system comprises image acquisition modules, a video analysis system, a video data server and a video display module, wherein the image acquisition modules are each connected to the server through a wired or wireless network;
the image acquisition module comprises cameras or high-speed photographic instruments, distributors, video technology modules and network video servers; the cameras or high-speed photographic instruments are each connected to a distributor, the output ends of the distributors are connected to the video technology modules and the network video servers respectively, and the signal output lines of the video technology modules and the network video servers are connected to the server; the cameras or high-speed photographic instruments acquire on-site photos and videos, the video technology module uses the image/video enhancement module to denoise and enhance the acquired images/videos, the network video server handles network transmission of the images and videos, and the distributor distributes the signals among the multiple acquisition modules, the video technology module and the network video server.
2. The intelligent visual monitoring and warning system for the safe operation of a gantry crane according to claim 1, characterized in that: an internal management system and a public service platform are also installed on the server.
3. The intelligent visual monitoring and warning system for the safe operation of a gantry crane according to claim 1, characterized in that: the video analysis system comprises a target detection module, a face recognition module and a key point detection module, specifically:
a target detection module: the human body identification and positioning device is used for identifying and positioning the human body and providing basis for further analysis;
the face recognition module is used for extracting, recognizing and comparing the faces of the pedestrians in the video one by one to realize qualification examination of operators;
and the key point detection module is used for positioning the characteristic information and providing position information for subsequent identification.
4. The intelligent visual monitoring and warning system for the safe operation of a gantry crane according to claim 1, characterized in that: the video data server comprises a video retrieval module, an application security module, a distributed database and a distributed file system, specifically:
the video retrieval module: searches and plays back related videos according to key information;
the application security module: provides data encryption to ensure data security;
distributed database: the system is a database foundation of a multi-server architecture and provides data support for distributed computing;
distributed file system: the file system is a file system foundation of a multi-server architecture and provides data management for distributed computing.
5. The intelligent visual monitoring and warning system for the safe operation of a gantry crane according to claim 1, characterized in that: the operation platform of the system comprises an acquisition module, an operation center, a data center, a demonstration center and a Web server, wherein the acquisition module is communicatively connected to the demonstration center and the data center respectively, the demonstration center and the data center are each communicatively connected to the operation center, and the operation center, the data center and the demonstration center are all deployed on the server;
a server: deploying an automatic monitoring and management system and a network communication module;
the demonstration center: the system comprises a user interface, a split screen display module and an image/video enhancement module;
the data center comprises: the system is used for collecting, storing and managing high-definition videos and comprises a video retrieval module, an application security module, a distributed database and a distributed file system;
an operation center: the intelligent video analysis module comprises a target detection module, a face recognition module and a key point detection module.
6. A working method of the intelligent visual monitoring and warning system for the safe operation of a gantry crane, characterized by safety monitoring and warning in a fixed scene:
(1) extracting foreground characteristics through the pictures and videos acquired by the image acquisition module;
(2) the operation center calculates the position, direction, perimeter, width-to-height ratio, area-to-enclosed area ratio, eccentricity and other geometric characteristics of an object in the foreground image, and the target detection module forms a characteristic vector to reflect the human body posture through the extracted geometric characteristic parameters;
(3) according to the algorithm requirement, different postures are adopted as samples, the characteristics in the samples are extracted as a data set and used as data source reference, and the key point detection module and the face recognition module recognize the person;
the key point detection module realizes the behavior recognition functions of the human body, including gantry crane area intrusion, worker falls, worker gatherings and safety protection detection;
the face recognition module accurately performs face comparison through deep learning of faces, realizing the operator qualification examination function of the intelligent visual monitoring system for the safe operation of the gantry crane; by building a face library it can recognize blacklists and whitelists, set an AOI surveillance area, and raise an alarm on intrusion into a dangerous area.
7. The working method of the intelligent visual monitoring and warning system for the safe operation of a gantry crane according to claim 6, characterized in that the algorithm steps are as follows:
(1) The video analysis system takes a color image as input and produces, as output, the 2D locations of anatomical keypoints for each individual in the image;
first, a feedforward network predicts a set of 2D confidence maps S for body part locations and a set of 2D part affinity fields (PAFs) L that encode the degree of association between parts; the set S = (S_1, S_2, …, S_J) has J confidence maps, one per part, where S_j ∈ R^{w×h}, j ∈ {1, …, J}; the set L = (L_1, L_2, …, L_C) has C vector fields, one per limb, where L_c ∈ R^{w×h×2}, c ∈ {1, …, C}, and each image location in L_c encodes a 2D vector; finally, the confidence maps and the affinity fields are parsed by greedy inference to output the 2D keypoints of all individuals in the image;
for multi-person pose estimation, body parts belonging to the same individual are connected; different directions are encoded with different colors, and for each pixel in the field, a 2D vector encodes the position and direction of the limb;
(2) Overall architecture: the whole picture is taken as input to a two-branch CNN that jointly predicts the confidence maps for body part detection and the part affinity fields for part association; a parsing step performs a set of bipartite matchings to associate body part candidates, which are finally assembled into the full-body poses of all individuals in the image;
the framework simultaneously predicts the detection confidence maps and the affinity fields encoding part-to-part association, and the network is divided into two branches: a top branch, predicting the confidence maps; a bottom branch, predicting the affinity fields; each branch is an iterative prediction architecture in which the predictions are refined over successive stages, with supervision added at each stage.
8. The working method of the intelligent visual surveillance system for safety operation of gantry cranes of claim 7, characterized in that: in the two-branch multi-step CNN architecture method, each stage in a first branch predicts a confidence dictionary, each stage in a second branch predicts PAFs, and after each stage, the prediction results of the two branches plus picture features are transmitted to the next stage;
the image is first analyzed by a convolutional network to produce a set of feature maps F, which is used as input to the first stage of each branch; at the first stage, the network produces a set of detection confidence maps S^1 = ρ^1(F) and a set of part affinity fields L^1 = φ^1(F), where ρ^1 and φ^1 are the CNN inferences at the first stage; in each subsequent stage, the predictions from both branches at the previous stage, concatenated with the original image features F, are used to produce refined predictions:

S^t = \rho^t(F, S^{t-1}, L^{t-1}), \quad \forall t \ge 2,   (1)

L^t = \phi^t(F, S^{t-1}, L^{t-1}), \quad \forall t \ge 2,   (2)

where ρ^t and φ^t are the CNN inferences at stage t.
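As an illustration of this two-branch, multi-stage refinement, a minimal PyTorch sketch follows; the layer widths, kernel sizes, stage count, and the zero-initialized first-stage inputs (standing in for a separate ρ^1(F), φ^1(F) head) are illustrative assumptions, not the patented network:

```python
import torch
import torch.nn as nn

class Stage(nn.Module):
    """One refinement stage: maps (F, S_prev, L_prev) to (S_t, L_t), per equations (1)-(2)."""
    def __init__(self, feat_ch, J, C):
        super().__init__()
        in_ch = feat_ch + J + 2 * C  # image features + previous predictions of both branches
        def branch(out_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 128, 7, padding=3), nn.ReLU(),
                nn.Conv2d(128, 128, 7, padding=3), nn.ReLU(),
                nn.Conv2d(128, out_ch, 1))
        self.rho = branch(J)      # top branch: confidence maps
        self.phi = branch(2 * C)  # bottom branch: part affinity fields (2 channels per limb)

    def forward(self, F, S_prev, L_prev):
        x = torch.cat([F, S_prev, L_prev], dim=1)
        return self.rho(x), self.phi(x)

# Illustrative sizes: J parts, C limbs, T stages, 128 backbone feature channels.
J, C, T, feat_ch = 18, 19, 3, 128
F = torch.randn(1, feat_ch, 46, 46)   # features from a backbone CNN (assumed given)
S = torch.zeros(1, J, 46, 46)         # zero placeholder for S^0 (simplification)
L = torch.zeros(1, 2 * C, 46, 46)     # zero placeholder for L^0 (simplification)
stages = nn.ModuleList([Stage(feat_ch, J, C) for _ in range(T)])
for stage in stages:                  # S^t, L^t = rho^t, phi^t applied to (F, S^{t-1}, L^{t-1})
    S, L = stage(F, S, L)
print(S.shape, L.shape)
```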
To guide the network to iteratively predict confidence maps of body parts in the first branch and PAFs in the second branch, two loss functions are applied at the end of each stage, one per branch, using an L2 loss between the predicted estimates and the ground-truth maps and fields; the loss is weighted spatially to address the problem that some datasets do not label all individuals completely; specifically, at stage t, the loss functions in the two branches are:

f_S^t = \sum_{j=1}^{J} \sum_{p} W(p) \cdot \| S_j^t(p) - S_j^*(p) \|_2^2,   (3)

f_L^t = \sum_{c=1}^{C} \sum_{p} W(p) \cdot \| L_c^t(p) - L_c^*(p) \|_2^2,   (4)

where S_j^* is the ground-truth part confidence map, L_c^* is the ground-truth part affinity vector field, and W is a binary mask with W(p) = 0 when the annotation is missing at image location p; the mask avoids penalizing true positive predictions during training; the intermediate supervision at each stage addresses the vanishing gradient problem by periodically replenishing the gradient, and the overall objective is:

f = \sum_{t=1}^{T} \left( f_S^t + f_L^t \right).   (5)
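A minimal sketch of the masked losses in equations (3)-(5), assuming PyTorch tensors shaped like the previous sketch; all shapes and the single-stage usage are illustrative:

```python
import torch

def stage_loss(S_pred, S_true, L_pred, L_true, W):
    """f_S^t + f_L^t of equations (3)-(4): L2 losses masked spatially by W."""
    f_S = (W * (S_pred - S_true) ** 2).sum()
    f_L = (W * (L_pred - L_true) ** 2).sum()
    return f_S + f_L

# Illustrative shapes: J = 18 part maps, C = 19 limb fields (2 channels each), 46x46 grid.
S_pred, S_true = torch.randn(1, 18, 46, 46), torch.randn(1, 18, 46, 46)
L_pred, L_true = torch.randn(1, 38, 46, 46), torch.randn(1, 38, 46, 46)
W = torch.ones(1, 1, 46, 46)  # set W(p) = 0 wherever the annotation is missing
loss = stage_loss(S_pred, S_true, L_pred, L_true, W)  # one stage; equation (5) sums this over t
print(loss.item())
```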
for combining body part detections with a confidence measure: given a set of body part detection results, they must be assembled into whole-body poses without knowing the number of individuals; this requires a confidence measure for each pair of body part detections indicating that they belong to the same individual; the part affinity field preserves both the position information of the limb's support region and its direction information: for each limb, the part relation is a 2D vector field, and for each pixel in the region belonging to a particular limb, a 2D vector encodes the direction from one part of the limb to the other; each type of limb has a corresponding affinity field connecting its two associated parts.
To evaluate f_L in equation (5) during training, the ground-truth part affinity vector field L^*_{c,k} is defined, for each point p in the image, as:

L^*_{c,k}(p) = \begin{cases} v & \text{if } p \text{ lies on limb } c \text{ of individual } k \\ 0 & \text{otherwise,} \end{cases}   (8)

where v = (x_{j_2,k} - x_{j_1,k}) / \| x_{j_2,k} - x_{j_1,k} \|_2 is the unit vector along the limb; the set of points on the limb is defined as those within a distance threshold of the line segment, i.e. those points p for which:

0 \le v \cdot (p - x_{j_1,k}) \le l_{c,k} \quad \text{and} \quad | v_{\perp} \cdot (p - x_{j_1,k}) | \le \sigma_l,

where the limb width σ_l is a distance at the pixel level, the limb length is l_{c,k} = \| x_{j_2,k} - x_{j_1,k} \|_2, and v_⊥ is the vector orthogonal to v;

the ground-truth part affinity field averages the affinity fields of all individuals in the image:

L^*_c(p) = \frac{1}{n_c(p)} \sum_{k} L^*_{c,k}(p),   (9)

where n_c(p) is the number of non-zero vectors at point p across all k individuals.
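A numpy sketch of the ground-truth PAF construction in equation (8) for one limb of one individual; the grid size, endpoint coordinates, and σ_l value are illustrative assumptions:

```python
import numpy as np

def limb_paf(x_j1, x_j2, h, w, sigma_l=1.0):
    """Ground-truth PAF L*_{c,k}: the unit vector v on the limb region, 0 elsewhere."""
    x_j1, x_j2 = np.asarray(x_j1, float), np.asarray(x_j2, float)
    length = np.linalg.norm(x_j2 - x_j1)                   # limb length l_{c,k}
    v = (x_j2 - x_j1) / length                             # unit vector along the limb
    v_perp = np.array([-v[1], v[0]])                       # vector orthogonal to v
    ys, xs = np.mgrid[0:h, 0:w]
    rel = np.stack([xs - x_j1[0], ys - x_j1[1]], axis=-1)  # p - x_{j1,k} for every pixel
    along = rel @ v                                        # v . (p - x_{j1,k})
    across = np.abs(rel @ v_perp)                          # |v_perp . (p - x_{j1,k})|
    on_limb = (along >= 0) & (along <= length) & (across <= sigma_l)
    paf = np.zeros((h, w, 2))
    paf[on_limb] = v
    return paf

paf = limb_paf((3, 10), (40, 30), h=46, w=46)
print(paf.shape, np.count_nonzero(paf.any(axis=-1)))  # pixels inside the limb region
```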
during testing, the association between candidate part detections is measured by computing the line integral over the corresponding PAF along the line segment connecting the candidate part locations; that is, the alignment of the predicted PAF with the candidate limb that would be formed by connecting the detected body parts is measured; specifically, for two candidate part locations d_{j_1} and d_{j_2}, the predicted part affinity field L_c is sampled along the line segment to measure the confidence of the connection between them:

E = \int_{u=0}^{u=1} L_c(p(u)) \cdot \frac{d_{j_2} - d_{j_1}}{\| d_{j_2} - d_{j_1} \|_2} \, du,   (10)

where p(u) interpolates the locations of the two body parts d_{j_1} and d_{j_2}:

p(u) = (1 - u) \, d_{j_1} + u \, d_{j_2};   (11)
in practice, the integral is approximated by sampling and summing the integrand at equally spaced values of u;
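A sketch of that sampled approximation to equations (10)-(11), reusing the PAF layout from the previous sketch; the sample count and the nearest-pixel lookup are illustrative choices:

```python
import numpy as np

def connection_score(paf, d_j1, d_j2, num_samples=10):
    """Approximate E: mean of L_c(p(u)) . unit(d_j2 - d_j1) over equally spaced u."""
    d_j1, d_j2 = np.asarray(d_j1, float), np.asarray(d_j2, float)
    direction = d_j2 - d_j1
    norm = np.linalg.norm(direction)
    if norm == 0:
        return 0.0
    unit = direction / norm
    score = 0.0
    for u in np.linspace(0.0, 1.0, num_samples):
        p = (1 - u) * d_j1 + u * d_j2          # p(u), equation (11)
        x, y = int(round(p[0])), int(round(p[1]))
        score += paf[y, x] @ unit              # dot product with the sampled PAF vector
    return score / num_samples

# With the ground-truth PAF from the previous sketch, a segment lying on the
# limb scores near 1, while an unrelated segment scores near 0.
```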
non-maximum suppression is applied to the detection confidence maps to obtain a discrete set of candidate part locations; each part may have several candidate locations, which together define a large set of possible limbs; each candidate limb is scored using the line integral over the PAF defined by equation (10), and finding the optimal parse then corresponds to a K-dimensional matching problem,
formally, we first obtain a set of body part detection candidates D_J for multiple individuals, where D_J = \{ d_j^m : j ∈ \{1...J\}, m ∈ \{1...N_j\} \}, N_j is the number of candidates for part j, and d_j^m ∈ R^2 is the location of the m-th detection candidate of body part j; these candidate detections still need to be associated with the other parts of the same individual; a variable z_{j_1 j_2}^{mn} ∈ \{0, 1\} is defined to indicate whether two detection candidates d_{j_1}^m and d_{j_2}^n are connected, and the goal is to find the optimal assignment over the set of all possible connections,

Z = \{ z_{j_1 j_2}^{mn} : j_1, j_2 ∈ \{1...J\}, \; m ∈ \{1...N_{j_1}\}, \; n ∈ \{1...N_{j_2}\} \};
if a single pair of parts j_1 and j_2 is considered for the c-th limb, finding the optimal connection reduces to a maximum weight bipartite graph matching problem; in this graph matching problem, the nodes of the graph are the body part detection candidates D_{j_1} and D_{j_2}, and the edges are all possible connections between pairs of detection candidates; additionally, each edge is weighted by equation (10); a matching in a bipartite graph is a subset of edges chosen so that no two edges share a node, and the goal is to find a matching with maximum total weight over the selected edges:

\max_{Z_c} E_c = \max_{Z_c} \sum_{m \in D_{j_1}} \sum_{n \in D_{j_2}} E_{mn} \cdot z_{j_1 j_2}^{mn},   (12)

\text{s.t.} \quad \forall m \in D_{j_1}: \; \sum_{n \in D_{j_2}} z_{j_1 j_2}^{mn} \le 1,   (13)

\quad \forall n \in D_{j_2}: \; \sum_{m \in D_{j_1}} z_{j_1 j_2}^{mn} \le 1,   (14)

where E_c is the total weight of the matching for limb type c, Z_c is the subset of Z for limb type c, and E_{mn} is the part affinity between body parts d_{j_1}^m and d_{j_2}^n defined by equation (10); constraints (13) and (14) enforce that no two edges share a node, so that no two limbs of the same type share a part.
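A sketch of the per-limb maximum weight bipartite matching of equations (12)-(14); SciPy's Hungarian solver is substituted here as an exact solution for a single limb type, which is an illustrative choice rather than the claimed procedure:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_limb(E):
    """E[m, n] = affinity between candidate m of part j1 and candidate n of part j2.
    Returns the connections (m, n) maximizing total weight, at most one edge per node."""
    row, col = linear_sum_assignment(-E)   # solver minimizes cost, so negate to maximize
    return [(m, n) for m, n in zip(row, col) if E[m, n] > 0]

E = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.1, 0.0]])                 # 3 candidates of part j1, 2 of part j2
print(match_limb(E))                       # [(0, 0), (1, 1)]
```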
9. The working method of the intelligent visual surveillance system for safety operation of gantry cranes of claim 8, characterized in that: to find the whole-body poses of multiple individuals, two relaxations are added to the optimization: first, a minimal number of edges is chosen to obtain a spanning tree skeleton of the human pose instead of using the complete graph; second, the matching problem is further decomposed into a set of bipartite matching sub-problems, and the matching in adjacent tree edges is determined independently,
after adding these two relaxations, the optimization decomposes simply into:

\max_{Z} E = \sum_{c=1}^{C} \max_{Z_c} E_c;   (15)
limb connection candidates are thus obtained independently for each limb type using equations (12)-(14); once all candidate limb connections are available, the connections that share a candidate detection part are assembled to form the whole-body pose detections of multiple individuals.
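Finally, a sketch of that assembly step: connections sharing a (part, candidate) detection are merged into person instances; the greedy merge order and the part/candidate indexing scheme are illustrative assumptions:

```python
def assemble_poses(limb_connections):
    """limb_connections: list of ((part_id_a, candidate_a), (part_id_b, candidate_b)).
    Merges connections that share a detected part into one person instance."""
    people = []  # each person is a dict: part_id -> candidate index
    for (pa, ca), (pb, cb) in limb_connections:
        for person in people:
            if person.get(pa) == ca or person.get(pb) == cb:
                person[pa], person[pb] = ca, cb   # extend an existing person
                break
        else:
            people.append({pa: ca, pb: cb})       # start a new person
    return people

# Two people, three parts (0 = neck, 1 = shoulder, 2 = elbow), two limb types:
connections = [((0, 0), (1, 0)), ((0, 1), (1, 1)),
               ((1, 0), (2, 0)), ((1, 1), (2, 1))]
print(assemble_poses(connections))
# [{0: 0, 1: 0, 2: 0}, {0: 1, 1: 1, 2: 1}]
```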
CN202010755699.8A 2020-07-31 2020-07-31 Intelligent visual monitoring and warning system for safety operation of gantry crane Pending CN111898541A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010755699.8A CN111898541A (en) 2020-07-31 2020-07-31 Intelligent visual monitoring and warning system for safety operation of gantry crane

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010755699.8A CN111898541A (en) 2020-07-31 2020-07-31 Intelligent visual monitoring and warning system for safety operation of gantry crane

Publications (1)

Publication Number Publication Date
CN111898541A true CN111898541A (en) 2020-11-06

Family

ID=73183019

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010755699.8A Pending CN111898541A (en) 2020-07-31 2020-07-31 Intelligent visual monitoring and warning system for safety operation of gantry crane

Country Status (1)

Country Link
CN (1) CN111898541A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112349150A (en) * 2020-11-19 2021-02-09 飞友科技有限公司 Video acquisition method and system for airport flight guarantee time node
CN112434828A (en) * 2020-11-23 2021-03-02 南京富岛软件有限公司 Intelligent identification method for safety protection in 5T operation and maintenance
CN112434827A (en) * 2020-11-23 2021-03-02 南京富岛软件有限公司 Safety protection identification unit in 5T fortune dimension

Similar Documents

Publication Publication Date Title
CN111898541A (en) Intelligent visual monitoring and warning system for safety operation of gantry crane
CN111898514B (en) Multi-target visual supervision method based on target detection and action recognition
CN110425005B (en) Safety monitoring and early warning method for man-machine interaction behavior of belt transport personnel under mine
KR101839827B1 (en) Smart monitoring system applied with recognition technic of characteristic information including face on long distance-moving object
CN108734055A (en) A kind of exception personnel detection method, apparatus and system
WO2020122456A1 (en) System and method for matching similarities between images and texts
CN114758362B (en) Clothing changing pedestrian re-identification method based on semantic perception attention and visual shielding
CN114998934B (en) Clothes-changing pedestrian re-identification and retrieval method based on multi-mode intelligent perception and fusion
CN110852222A (en) Campus corridor scene intelligent monitoring method based on target detection
CN110929584A (en) Network training method, monitoring method, system, storage medium and computer equipment
Zhang et al. Transmission line abnormal target detection based on machine learning yolo v3
CN112149494A (en) Multi-person posture recognition method and system
CN110322472A (en) A kind of multi-object tracking method and terminal device
CN113505704B (en) Personnel safety detection method, system, equipment and storage medium for image recognition
CN114067396A (en) Vision learning-based digital management system and method for live-in project field test
CN114882251A (en) Internet of things big data intelligent video security monitoring method and device
KR20020082476A (en) Surveillance method, system and module
Dogra et al. Scene representation and anomalous activity detection using weighted region association graph
Lai Real-Time Aerial Detection and Reasoning on Embedded-UAVs in Rural Environments
CN115083229A (en) Intelligent recognition and warning system of flight training equipment based on AI visual recognition
CN113159984A (en) Substation worker work path tracking method
Lestari et al. Comparison of two deep learning methods for detecting fire
Sharma et al. Face mask detection using artificial intelligence for workplaces
Kumari et al. Deep learning and computer vision-based social distancing detection system
CN110175521A (en) Method based on double camera linkage detection supervision indoor human body behavior

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination