CN111626276A - Two-stage neural network-based work shoe wearing detection method and device - Google Patents


Publication number
CN111626276A
Authority
CN
China
Prior art keywords: shoe, human body, frame, body frame, stage
Prior art date
Legal status: Granted
Application number: CN202010750662.6A
Other languages: Chinese (zh)
Other versions: CN111626276B (en)
Inventor
张逸
徐晓刚
王军
徐芬
张文广
何鹏飞
Current Assignee: Zhejiang Lab
Original Assignee: Zhejiang Lab
Application filed by Zhejiang Lab
Priority to CN202010750662.6A
Publication of CN111626276A
Application granted
Publication of CN111626276B
Legal status: Active
Anticipated expiration

Classifications

    • G06V 20/20 - Scenes; scene-specific elements in augmented reality scenes (G PHYSICS › G06 COMPUTING; CALCULATING OR COUNTING › G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING)
    • G06F 18/24 - Classification techniques (G06F ELECTRIC DIGITAL DATA PROCESSING › G06F 18/00 Pattern recognition › G06F 18/20 Analysing)
    • G06F 18/25 - Fusion techniques (G06F ELECTRIC DIGITAL DATA PROCESSING › G06F 18/00 Pattern recognition › G06F 18/20 Analysing)
    • G06N 3/045 - Combinations of networks (G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N 3/00 Computing arrangements based on biological models › G06N 3/04 Architecture)
    • G06N 3/08 - Learning methods (G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N 3/00 Computing arrangements based on biological models › G06N 3/02 Neural networks)

Abstract

The invention discloses a method and a device for detecting the wearing of work shoes based on a two-stage neural network. The method comprises the following steps: acquiring a picture data set from a monitoring video; labeling the shoe targets and human body targets contained in the picture data set to obtain a labeled data set; constructing a two-stage neural network model formed by cascading a first-stage human body detection network model and a second-stage shoe detection network model, where the input of the second-stage shoe detection network model is the output of the first-stage human body detection network model; inputting a picture to be detected into the two-stage neural network model and outputting the position of the human body frame, the offset of the shoe position relative to the human body frame, and the confidence of wearing work shoes; and calculating the position of the shoes from the position of the human body frame and the offset of the shoe position relative to the human body frame, and judging from the confidence whether work shoes are worn. The method solves the problem of low detection recall caused by the small size of workers' shoe targets in video, and can be used to detect the wearing of work shoes in a factory.

Description

Two-stage neural network-based work shoe wearing detection method and device
Technical Field
The invention belongs to the technical field of artificial intelligence and computer vision, and particularly relates to a method and a device for detecting the wearing of work shoes based on a two-stage neural network.
Background
With the development of intelligent technology, the safety of production and daily life has become a focus of increasing attention. Cameras have been installed at industrial production sites and in many corners of cities, creating good objective conditions for automated monitoring using computer vision techniques.
In industrial production, human body accessories, as key parts of the human body, are often used as detection targets. For example, at an industrial production site, wearing a safety helmet can greatly reduce the occurrence of personal injury accidents, so helmet-wearing detection at the head position relies on human head detection technology. Likewise, in a factory, wearing work shoes can to a great extent prevent the various dangerous situations caused by slipping soles; therefore, target detection technology is needed to detect shoes in order to detect the wearing of work shoes.
In recent years, with the development of deep learning, the performance of target detection technology has improved greatly, and target detection represented by Yolo has been widely applied in industry; in particular, human body detection has become mature. However, small-target detection remains one of the difficulties of target detection. Because factory and urban road cameras are installed relatively high, human body accessory targets appear small in video images, so directly detecting accessories such as shoes on the video image often fails to reach an ideal recall rate. A human body accessory detection technology with high recall, high accuracy and limited computational overhead is therefore urgently needed.
Disclosure of Invention
The embodiment of the invention aims to provide a method and a device for detecting the wearing of work shoes based on a two-stage neural network, so as to solve the problem that an ideal recall rate cannot be obtained when human body accessories such as shoes are detected in a video image.
In order to achieve the above object, the techniques adopted in the embodiments of the present invention are as follows:
in a first aspect, an embodiment of the present invention provides a two-stage neural network-based method for detecting the wearing of work shoes, including:
acquiring a picture data set of a monitoring video;
marking the shoe target and the human body target contained in the picture data set to obtain a marked data set;
constructing a two-stage neural network model, wherein the two-stage neural network model is formed by cascading a first-stage human body detection network model and a second-stage shoe detection network model; the first-stage human body detection network model is obtained by training a human body detection network with the human body frame portion of the labeled data set, the second-stage shoe detection network model is obtained by jointly training a second-stage shoe detection network with the human body frame, shoe frame and shoe category portions of the labeled data set, and the input of the second-stage shoe detection network model is the output of the first-stage human body detection network model;
inputting the picture to be detected into the two-stage neural network model, and outputting the position of the human body frame, the offset of the shoe position relative to the human body frame, and the confidence of wearing work shoes;
and calculating the position of the shoes according to the position of the human body frame and the offset of the shoe position relative to the human body frame, and judging whether work shoes are worn by comparing the confidence of wearing work shoes, fused over multiple frames, with a threshold value.
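The cascade described in the steps above can be sketched end to end as follows. This is a minimal illustration, not the patented implementation: `detect_bodies` and `shoe_head` are hypothetical stand-ins for the two trained networks, returning fixed values, and the decoding arithmetic follows the offset definitions given later in this description.

```python
# Sketch of the two-stage cascade: stage one finds human body frames,
# stage two regresses per-body shoe offsets and a wearing confidence.
# detect_bodies and shoe_head are hypothetical stand-ins for the networks.

def detect_bodies(image):
    """Stand-in for the first-stage network: returns body frames as
    (x1, y1, w, h) with (x1, y1) the lower-left corner in image coords."""
    return [(100.0, 400.0, 100.0, 200.0)]

def shoe_head(image, body):
    """Stand-in for the second-stage network: returns the five-dimensional
    prediction (P_bias0, P_bias1, P_bias2, P_bias3, P_label)."""
    return (0.5, 0.1, 0.2, 0.06, 0.9)

def detect_work_shoes(image, threshold=0.5):
    results = []
    for (x1, y1, w, h) in detect_bodies(image):
        b0, b1, b2, b3, label = shoe_head(image, (x1, y1, w, h))
        # decode the normalized offsets back to image coordinates (y grows down)
        shoe_box = (x1 + (b0 - b2 / 2) * w, y1 - (b1 + b3 / 2) * h,
                    x1 + (b0 + b2 / 2) * w, y1 - (b1 - b3 / 2) * h)
        results.append((shoe_box, label >= threshold))
    return results
```

With the stand-in values above, the single detected body yields a shoe box of roughly (140, 374, 160, 386) and a positive wearing verdict, since 0.9 exceeds the threshold.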
Further, acquiring a picture data set of the surveillance video, comprising:
and acquiring a monitoring video, and performing picture segmentation on the monitoring video to obtain a picture data set.
Further, labeling the shoe targets and the human body targets contained in the picture data set includes:
and marking the wearing worker shoe personnel and the non-wearing worker shoe personnel contained in the picture data set, and respectively marking the human body position, the shoe position and the shoe category of each personnel. As the shoes are possibly shielded from each other and at least one shoe is visible, only one shoe is detected for simplifying the problem, the position of one completely visible shoe is marked, the category of the shoe is the work shoe, the category is marked as 1, and otherwise, the category is marked as 0.
Further, the first-level human detection network model is obtained by training a human detection network by a human frame part in the labeled data set, and includes:
and training a human body detection network by adopting the human body frame part in the labeled data set to obtain a primary human body detection network model, wherein a branch is led out before a last convolution module of the primary human body detection network, and the detection results of the characteristic diagram and the primary human body detection network are output together.
Further, jointly training the second-stage shoe detection network with the human body frame, shoe frame and shoe category portions of the labeled data set includes:
step S3.1: the setting of the secondary shoe detection network structure comprises the following substeps:
step S3.1.1: the feature map and the human body detection frames output by the first-stage human body detection network model are used as the input of the second-stage shoe detection network, where the dimension of the feature map is N × C × H × W, in which N is the batch size, C is the number of channels, and H, W are respectively the height and width of the input image after network downsampling; the dimension of the human body detection frames is N × M × 4, where M represents the number of human body detection frames corresponding to a single image. Based on the feature map output by the first-stage human body detection network, the features of each human body detection frame region are pooled to obtain features of dimension N × M × C × P × P, where P denotes both the width and the height of the pooled features;
step S3.1.2: the obtained features are flattened so that the dimension becomes N × M × (C·P·P);
Step S3.1.3: sequentially sending the characteristics into two layers of full-connection layers, wherein the two layers of full-connection layers are respectively connected with one another after trainingdrop_outLayer to avoid network overfitting;
step S3.1.4: sending the characteristics into a full-connection layer with the neuron number of 5 to obtain five-dimensional output;
step S3.1.5: normalizing the five-dimensional output to obtain a five-dimensional predicted value (P_bias0,P_bias1,P_ bias2,P_bias3,P_label),P_bias0The predicted value of the ratio of the deviation of the shoe center point relative to the left lower corner of the human body frame in the direction of the transverse axis to the width of the human body frame is obtained;P_bias1the predicted value of the ratio of the deviation of the central point of the shoe relative to the lower left corner of the human body frame in the direction of the longitudinal axis to the height of the human body frame is obtained,P_bias2the predicted value of the ratio of the width of the shoes to the width of the human body frame,P_ bias3Is a predicted value of the ratio of the height of the shoes to the height of the human body frame,P_labela confidence prediction value for the wearer shoe;
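The shape flow of steps S3.1.1 through S3.1.5 can be checked with a shape-level sketch using random weights. The hidden-layer width of 2048 is taken from the embodiment described later; the ReLU activations and weight scales are assumptions of this sketch, not part of the patent.

```python
import numpy as np

# Shape-level sketch of the second-stage head (random weights; 2048-neuron
# hidden layers per the embodiment; ReLU activations are an assumption).
N, M, C, P = 2, 3, 64, 7          # batch, boxes per image, channels, pooled size
rng = np.random.default_rng(0)

pooled = rng.standard_normal((N, M, C, P, P))     # per-body pooled features
flat = pooled.reshape(N, M, C * P * P)            # step S3.1.2: flatten

def fc(x, out_dim):
    """A fully connected layer with freshly drawn random weights, plus ReLU."""
    w = rng.standard_normal((x.shape[-1], out_dim)) * 0.01
    return np.maximum(x @ w, 0.0)

hidden = fc(fc(flat, 2048), 2048)                 # step S3.1.3: two FC layers
logits = hidden @ (rng.standard_normal((2048, 5)) * 0.01)  # step S3.1.4: 5-neuron FC
pred = 1.0 / (1.0 + np.exp(-logits))              # step S3.1.5: sigmoid normalization
```

The result `pred` has shape N × M × 5: one (P_bias0, P_bias1, P_bias2, P_bias3, P_label) tuple per human body detection frame, each component squashed into [0, 1].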
step S3.2: setting a loss function of a secondary shoe detection network;
The offset Gt_bias of the shoe center point (x0, y0) relative to the lower-left corner (x1, y1) of the human body frame comprises five components (Gt_bias0, Gt_bias1, Gt_bias2, Gt_bias3, Gt_label): Gt_bias0 is the ratio of the horizontal-axis offset of the shoe center point relative to the lower-left corner of the human body frame to the width of the human body frame; Gt_bias1 is the ratio of the vertical-axis offset of the shoe center point relative to the lower-left corner of the human body frame to the height of the human body frame; Gt_bias2 is the ratio of the shoe width to the width of the human body frame; Gt_bias3 is the ratio of the shoe height to the height of the human body frame; and Gt_label is the shoe category label. (Gt_bias0, Gt_bias1, Gt_bias2, Gt_bias3, Gt_label) serves as the truth value, i.e. the regression target, in the training of the second-stage shoe detection network. The expressions are respectively:
Gt_bias0 = Δx / W_body
Gt_bias1 = Δy / H_body
Gt_bias2 = w_shoe / W_body
Gt_bias3 = h_shoe / H_body
where Δx is the offset of the shoe center point relative to the lower-left corner of the human body frame along the horizontal axis (a positive value), Δy is the offset of the shoe center point relative to the lower-left corner of the human body frame along the vertical axis (a positive value), W_body is the width of the human body detection frame, H_body is the height of the human body detection frame, h_shoe is the height of the shoe, and w_shoe is the width of the shoe.
The lower-left corner of the human body detection frame is taken as the reference point with coordinates (0, 0), and the width and height of the human body detection frame, i.e. W_body and H_body, are both set to unit 1. In these normalized coordinates, the center point of the truth-value shoe is (Gt_bias0, Gt_bias1), its width is Gt_bias2, and its height is Gt_bias3.
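The regression targets described above reduce to four ratios plus a label, which can be transcribed directly. Function and argument names below are illustrative.

```python
# Direct transcription of the regression-target ratios described above.
def regression_targets(dx, dy, w_shoe, h_shoe, w_body, h_body, is_work_shoe):
    """Return (Gt_bias0, Gt_bias1, Gt_bias2, Gt_bias3, Gt_label).

    dx, dy: positive offsets of the shoe center point from the body frame's
    lower-left corner along the horizontal and vertical axes."""
    return (dx / w_body, dy / h_body,
            w_shoe / w_body, h_shoe / h_body,
            1 if is_work_shoe else 0)

# e.g. a 30x16 shoe whose center sits 20 px right of and 10 px above the
# lower-left corner of a 100x200 body frame:
targets = regression_targets(20, 10, 30, 16, 100, 200, True)
```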
From this, the truth-value coordinates of the upper-left corner of the shoe target frame (Gt_x_t, Gt_y_t) can be obtained:
Gt_x_t = Gt_bias0 - Gt_bias2 / 2
Gt_y_t = Gt_bias1 + Gt_bias3 / 2
and the truth-value coordinates of the lower-right corner of the shoe target frame (Gt_x_b, Gt_y_b):
Gt_x_b = Gt_bias0 + Gt_bias2 / 2
Gt_y_b = Gt_bias1 - Gt_bias3 / 2
Similarly, the predicted coordinates of the upper-left corner of the shoe target frame (P_x_t, P_y_t):
P_x_t = P_bias0 - P_bias2 / 2
P_y_t = P_bias1 + P_bias3 / 2
and the predicted coordinates of the lower-right corner of the shoe target frame (P_x_b, P_y_b):
P_x_b = P_bias0 + P_bias2 / 2
P_y_b = P_bias1 - P_bias3 / 2
(Here the vertical axis points upward from the lower-left reference point, so the upper corners have the larger vertical coordinate.)
The corners (Gt_x_t, Gt_y_t) and (Gt_x_b, Gt_y_b) define the truth-value shoe target frame box_gt, and the corners (P_x_t, P_y_t) and (P_x_b, P_y_b) define the predicted shoe target frame box_p.
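The center-plus-size to corner conversion described above is a few lines of arithmetic in the normalized frame whose vertical axis points up from the body frame's lower-left corner. Names below are illustrative.

```python
# Corner conversion in the normalized body frame (vertical axis points up).
def corners(bias0, bias1, bias2, bias3):
    """(center_x, center_y, w, h) ratios -> upper-left and lower-right corners."""
    top_left = (bias0 - bias2 / 2, bias1 + bias3 / 2)
    bottom_right = (bias0 + bias2 / 2, bias1 - bias3 / 2)
    return top_left, bottom_right

tl, br = corners(0.5, 0.1, 0.2, 0.06)
```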
The overlap degree GIoU of box_gt and box_p is computed, and the target frame loss is then Loss = 1 - GIoU.
The GIoU is calculated as follows: for box_gt and box_p, first find the minimum box C that can enclose both, and denote its area c_area. Let
inter = area(box_gt ∩ box_p)
union = area(box_gt) + area(box_p) - inter
then
IoU = inter / union
GIoU = IoU - (c_area - union) / c_area
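The GIoU computation above can be sketched for boxes given as (x_t, y_t, x_b, y_b) corner pairs, with the vertical axis pointing up as in the normalized frame, so y_t > y_b.

```python
# Sketch of the GIoU overlap above for corner boxes (x_t, y_t, x_b, y_b),
# vertical axis pointing up (so y_t > y_b).
def giou(box_a, box_b):
    def area(b):
        return (b[2] - b[0]) * (b[1] - b[3])
    # intersection
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[1], box_b[1]) - max(box_a[3], box_b[3]))
    inter = iw * ih
    union = area(box_a) + area(box_b) - inter
    # smallest box C enclosing both, with area c_area
    c_area = ((max(box_a[2], box_b[2]) - min(box_a[0], box_b[0])) *
              (max(box_a[1], box_b[1]) - min(box_a[3], box_b[3])))
    iou = inter / union
    return iou - (c_area - union) / c_area

# identical boxes -> GIoU = 1, so the target frame loss 1 - GIoU = 0
loss = 1 - giou((0.4, 0.13, 0.6, 0.07), (0.4, 0.13, 0.6, 0.07))
```

For non-overlapping boxes GIoU goes negative (down to -1 in the limit), which is what makes the loss informative even when the prediction and truth do not intersect at all.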
For the category loss Loss_label, the cross-entropy loss function is adopted, calculated as:
Loss_label = -[Gt_label · log(P_label) + (1 - Gt_label) · log(1 - P_label)]
The total loss function is:
Loss_total = λ1 · Loss + λ2 · Loss_label
where λ1 is the weight of the shoe position loss and λ2 is the weight of the shoe classification loss.
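The classification and total losses above can be transcribed directly; the default λ weights below are example values, not taken from the patent.

```python
import math

# Classification loss and weighted total loss as defined above
# (the default lambda weights are example values).
def label_loss(gt_label, p_label):
    """Binary cross-entropy between the shoe label and predicted confidence."""
    return -(gt_label * math.log(p_label) + (1 - gt_label) * math.log(1 - p_label))

def total_loss(box_loss, gt_label, p_label, lam1=1.0, lam2=1.0):
    """Loss_total = lam1 * Loss + lam2 * Loss_label."""
    return lam1 * box_loss + lam2 * label_loss(gt_label, p_label)
```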
step S3.3: performing secondary shoe detection network training;
and based on the network structure in the step S3.1 and the loss function in the step S3.2, training a secondary shoe detection network together by adopting the human body frame, the shoe frame and the shoe category part in the labeled data set to obtain a secondary shoe detection network model.
Further, calculating the position of the shoes according to the position of the human body frame and the offset of the shoe position relative to the human body frame includes:
obtaining the shoe target frame according to the position of the human body frame and the offset of the shoe position relative to the human body frame. With the lower-left corner of the human body frame output by the first-stage human body detection network at image coordinates (x1, y1), and with W_body and H_body the width and height of that frame, the upper-left corner (x_t, y_t) and lower-right corner (x_b, y_b) of the shoe are calculated as:
x_t = x1 + (P_bias0 - P_bias2 / 2) · W_body
y_t = y1 - (P_bias1 + P_bias3 / 2) · H_body
x_b = x1 + (P_bias0 + P_bias2 / 2) · W_body
y_b = y1 - (P_bias1 - P_bias3 / 2) · H_body
(in image coordinates the vertical axis increases downward, so offsets above the lower-left corner are subtracted from y1).
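The decoding step described above, from normalized offsets back to image coordinates, can be written as a small function. The exact corner formulas are this sketch's interpretation of the offset definitions; names are illustrative.

```python
# Decode the predicted normalized offsets into an image-coordinate shoe box;
# (x1, y1) is the body frame's lower-left corner, vertical axis growing down.
def decode_shoe_box(x1, y1, w_body, h_body, p_bias):
    b0, b1, b2, b3 = p_bias
    x_t = x1 + (b0 - b2 / 2) * w_body
    y_t = y1 - (b1 + b3 / 2) * h_body
    x_b = x1 + (b0 + b2 / 2) * w_body
    y_b = y1 - (b1 - b3 / 2) * h_body
    return x_t, y_t, x_b, y_b

box = decode_shoe_box(100, 400, 100, 200, (0.5, 0.1, 0.2, 0.06))
```

Note that because the image's vertical axis grows downward, the decoded top edge y_t ends up numerically smaller than the bottom edge y_b, as expected.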
further, whether wearing the work shoes is judged according to the result of comparing the confidence coefficient of the wearing work shoes with the threshold value by fusing the multi-frame images, and the method comprises the following steps:
judging whether the current frame wears the worker shoes by adopting a median filtering result, and setting a filtering length parameter asxI.e. the confidence of wearing the work shoe at the current frame is frontxAnd outputting the mean value of the confidence coefficient predicted values by the frame secondary shoe detection network, and if the mean value is greater than or equal to a set threshold value, determining that the worker shoes are worn, otherwise, determining that the worker shoes are not worn. The method can make the detection result of the shoe type more robust and avoid frequent jump of the prediction result.
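The multi-frame fusion above amounts to a sliding mean over the last x raw confidences. A minimal sketch, where x and the threshold are example values:

```python
from collections import deque

# Sliding-window mean fusion of per-frame confidences, as described above
# (x and the threshold are example values).
class ShoeWearFilter:
    def __init__(self, x=3, threshold=0.5):
        self.window = deque(maxlen=x)
        self.threshold = threshold

    def update(self, confidence):
        """Feed one frame's raw confidence; return the fused wearing verdict."""
        self.window.append(confidence)
        fused = sum(self.window) / len(self.window)
        return fused >= self.threshold

f = ShoeWearFilter(x=3, threshold=0.5)
verdicts = [f.update(c) for c in (0.9, 0.2, 0.9, 0.1)]
```

A single low-confidence frame (the 0.2 above) does not flip the verdict; only a sustained drop does, which is exactly the jump-suppression the description claims.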
In a second aspect, an embodiment of the present invention further provides a two-stage neural network-based device for detecting the wearing of work shoes, including:
the acquisition module is used for acquiring a picture data set of the monitoring video;
the marking module is used for marking the shoe target and the human body target contained in the picture data set to obtain a marked data set;
the building module is used for building a two-stage neural network model formed by cascading a first-stage human body detection network model and a second-stage shoe detection network model; the first-stage human body detection network model is obtained by training a human body detection network with the human body frame portion of the labeled data set, the second-stage shoe detection network model is obtained by jointly training a second-stage shoe detection network with the human body frame, shoe frame and shoe category portions of the labeled data set, and the input of the second-stage shoe detection network model is the output of the first-stage human body detection network model;
the output module is used for inputting the picture to be detected into the two-stage neural network model and outputting the position of the human body frame, the offset of the shoe position relative to the human body frame, and the confidence of wearing work shoes;
and the calculation and judgment module is used for calculating the position of the shoes according to the position of the human body frame and the offset of the shoe position relative to the human body frame, and judging whether work shoes are worn by comparing the confidence of wearing work shoes, fused over multiple frames, with a threshold value.
According to the technical scheme, the invention has the beneficial effects that:
(1) Because the shoe target is very small in the video image and direct detection of shoes yields a low recall rate, the invention adopts a network cascade, converting shoe target detection into human body detection followed by shoe position regression and classification based on the human body detection; the implementation is simple. Compared with detection by a single-stage neural network, the second-stage shoe detection network in the cascade is dedicated to detecting the position and category of the shoes within the human body, overcoming the difficulty of directly detecting small shoe targets in factory monitoring video and obviously improving the recall rate and accuracy of detection.
(2) In industrial applications, the inference time of the detection algorithm obviously affects the application effect. The second-stage shoe detection network takes only the features and detection box results of the first-stage human body detection network as input, with no large memory copies, so its inference time is low, which benefits deployment in industrial scenes.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a flow chart of the two-stage neural network-based work shoe wearing detection method according to an embodiment of the present invention;
FIG. 2 is a network model structure diagram of the two-stage neural network-based work shoe wearing detection method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the calculation of the offset of a shoe target frame relative to a human target frame in accordance with an embodiment of the present invention;
fig. 4 is a block diagram of the two-stage neural network-based work shoe wearing detection device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Example 1:
FIG. 1 is a flow chart of the two-stage neural network-based work shoe wearing detection method according to an embodiment of the present invention; the method comprises the following steps:
step S1, acquiring a picture data set of a monitoring video;
specifically, a monitoring video is obtained, picture segmentation is performed on the monitoring video, one frame of image is extracted every 200 frames, a picture data set is obtained, and each frame of image in the data set is ensured to have certain difference.
Step S2: the shoe targets and human body targets contained in the picture data set are labeled to obtain the labeled data set;
specifically, the wearer shoe personnel and the non-wearer shoe personnel contained in the picture data set are marked, and for each personnel, the human body position, the shoe position and the shoe category of each personnel are marked respectively. As the shoes are possibly shielded from each other and at least one shoe is visible, only one shoe is detected for simplifying the problem, the position of one completely visible shoe is marked, the category of the shoe is the work shoe, the category is marked as 1, and otherwise, the category is marked as 0.
Step S3: a two-stage neural network model is constructed. FIG. 2 is a network model structure diagram of the two-stage neural network-based work shoe wearing detection method according to an embodiment of the present invention. The two-stage neural network model is formed by cascading a first-stage human body detection network model and a second-stage shoe detection network model; the first-stage human body detection network model is obtained by training a human body detection network with the human body frame portion of the labeled data set, the second-stage shoe detection network model is obtained by jointly training a second-stage shoe detection network with the human body frame, shoe frame and shoe category portions of the labeled data set, and the input of the second-stage shoe detection network model is the output of the first-stage human body detection network model;
specifically, the first-stage human body detection network model is obtained by training a human body detection network by a human body frame part in the labeled data set, and includes:
training using the human frame portion in the labeled datasetyolov3The human body detection network obtains a first-level human body detection network model by adoptingadamIn an optimization mode, the initial learning rate is set to 0.0005,epochis set to be at a speed of 90 degrees,batchsizeset to 64. Wherein the underlying network employsdarknet53Fusing 32-time and 16-time downsampling branches in a subsequent network and then fusing with 8-time downsampling branches, finally leading out a branch before a last convolution module of a first-level human body detection network, wherein the branch is a characteristic diagram of an image subjected to 8-time downsampling, and the same-level human body detection networkAre output together.
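The training hyper-parameters stated in the embodiment can be collected in one place; the dictionary layout itself is an illustrative assumption, not a format from the patent.

```python
# The first-stage training hyper-parameters stated above, gathered in one
# place (the dictionary layout is illustrative, not from the patent).
train_config = {
    "detector": "yolov3",
    "backbone": "darknet53",
    "optimizer": "adam",
    "initial_learning_rate": 0.0005,
    "epochs": 90,
    "batch_size": 64,
}
```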
Jointly training the second-stage shoe detection network with the human body frame, shoe frame and shoe category portions of the labeled data set includes:
step S3.1: the setting of the secondary shoe detection network structure comprises the following substeps:
step S3.1.1: the feature map and the human body detection frames output by the first-stage human body detection network model are used as the input of the second-stage shoe detection network, where the dimension of the feature map is N × C × H × W, in which N is the batch size, C is the number of channels, and H, W are respectively the height and width of the input image after network downsampling; the dimension of the human body detection frames is N × M × 4, where M represents the number of human body detection frames corresponding to a single image and the 4 dimensions are the center-point coordinates and the width and height of each human body detection frame. Based on the feature map output by the first-stage human body detection network, roi-align pooling is performed on the features of each human body detection frame region to obtain features of dimension N × M × C × P × P, where P denotes the width and height of the roi-align pooled features; in this embodiment P = 7. This yields the features of each human body frame region, so that the subsequent steps can focus on shoe detection within the human body frame region;
step S3.1.2: the obtained features are flattened so that the dimension becomes N × M × (C·P·P);
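The region pooling in step S3.1.1 can be illustrated with a deliberately simplified sketch. True roi-align samples the feature map with bilinear interpolation; the nearest-neighbour version below only shows how each body box is reduced to a fixed P × P grid, and all names are illustrative.

```python
# Simplified region pooling sketch: true roi-align uses bilinear sampling,
# while this nearest-neighbour pick only illustrates reducing a box to a
# fixed P x P grid of features.
def naive_roi_pool(feature, box, P):
    """feature: 2-D list (a single channel); box: (x0, y0, x1, y1) in
    feature-map coordinates. Returns a P x P grid of sampled values."""
    x0, y0, x1, y1 = box
    out = []
    for gy in range(P):
        row = []
        for gx in range(P):
            # nearest feature cell under the center of grid cell (gx, gy)
            fy = min(int(y0 + (gy + 0.5) * (y1 - y0) / P), len(feature) - 1)
            fx = min(int(x0 + (gx + 0.5) * (x1 - x0) / P), len(feature[0]) - 1)
            row.append(feature[fy][fx])
        out.append(row)
    return out

feat = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
pooled = naive_roi_pool(feat, (0, 0, 4, 4), 2)
```

In a real implementation this per-channel pooling would run over all C channels of all M boxes, producing the N × M × C × P × P tensor described above (e.g. via torchvision's roi_align).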
Step S3.1.3: sequentially sending the characteristics into a full-connection layer network with 2048 neurons in two layers, wherein the two full-connection layers are respectively connected with one another after trainingdrop_outLayers to avoid overfitting;
step S3.1.4: sending the characteristics into a full-connection layer with the neuron number of 5 to obtain five-dimensional output;
step S3.1.5: for five-dimensional outputsigmoidNormalization processing to obtain five-dimensional predicted value (P_bias0,P_ bias1,P_bias2,P_bias3,P_label),P_bias0The predicted value of the ratio of the deviation of the shoe center point relative to the left lower corner of the human body frame in the direction of the transverse axis to the width of the human body frame is obtained;P_bias1the predicted value of the ratio of the deviation of the central point of the shoe relative to the lower left corner of the human body frame in the direction of the longitudinal axis to the height of the human body frame is obtained,P_bias2the predicted value of the ratio of the width of the shoes to the width of the human body frame,P_bias3Is a predicted value of the ratio of the height of the shoes to the height of the human body frame,P_labela confidence prediction value for the wearer shoe;
step S3.2: setting a loss function of a secondary shoe detection network;
FIG. 3 is a schematic diagram illustrating the calculation of the offset of a shoe target frame from a human body target frame according to an embodiment of the present invention. The offset Gt_bias of the shoe center point (x0, y0) relative to the lower-left corner (x1, y1) of the human body frame comprises five components (Gt_bias0, Gt_bias1, Gt_bias2, Gt_bias3, Gt_label): Gt_bias0 is the ratio of the horizontal-axis offset of the shoe center point relative to the lower-left corner of the human body frame to the width of the human body frame; Gt_bias1 is the ratio of the vertical-axis offset of the shoe center point relative to the lower-left corner of the human body frame to the height of the human body frame; Gt_bias2 is the ratio of the shoe width to the human body frame width; Gt_bias3 is the ratio of the shoe height to the human body frame height; and Gt_label is the shoe category label. (Gt_bias0, Gt_bias1, Gt_bias2, Gt_bias3, Gt_label) serves as the ground truth, i.e. the regression target, in the training of the secondary shoe detection network, with the expressions:

Gt_bias0 = Δx / W
Gt_bias1 = Δy / H
Gt_bias2 = w / W
Gt_bias3 = h / H

where Δx is the horizontal-axis offset of the shoe center point relative to the lower-left corner of the human body frame (taken as positive), Δy is the vertical-axis offset of the shoe center point relative to the lower-left corner of the human body frame (taken as positive), W is the width of the human body detection frame, H is the height of the human body detection frame, h is the height of the shoe, and w is the width of the shoe.
The lower-left corner of the human body detection frame is taken as the reference point with coordinates (0,0), and the width W and height H of the human body detection frame are both set to unit 1. Under this convention, the ground-truth shoe center point has coordinates (Gt_bias0, Gt_bias1), with width Gt_bias2 and height Gt_bias3.
From this, the ground-truth coordinates of the upper-left corner of the shoe target frame (Gt_x_t, Gt_y_t) are:

Gt_x_t = Gt_bias0 - Gt_bias2/2
Gt_y_t = Gt_bias1 + Gt_bias3/2

and the ground-truth coordinates of the lower-right corner of the shoe target frame (Gt_x_b, Gt_y_b) are:

Gt_x_b = Gt_bias0 + Gt_bias2/2
Gt_y_b = Gt_bias1 - Gt_bias3/2
Similarly, the predicted coordinates of the upper-left corner of the shoe target frame (P_x_t, P_y_t) are:

P_x_t = P_bias0 - P_bias2/2
P_y_t = P_bias1 + P_bias3/2

and the predicted coordinates of the lower-right corner of the shoe target frame (P_x_b, P_y_b) are:

P_x_b = P_bias0 + P_bias2/2
P_y_b = P_bias1 - P_bias3/2
The ground-truth corners (Gt_x_t, Gt_y_t) and (Gt_x_b, Gt_y_b) define the ground-truth shoe target frame box_gt, and the predicted corners (P_x_t, P_y_t) and (P_x_b, P_y_b) define the predicted shoe target frame box_p. The overlap GIoU of box_gt and box_p is computed, and the target frame loss is Loss = 1 - GIoU.
The GIoU is calculated as follows: for box_gt and box_p, first find the smallest box C that encloses both, and denote its area by c_area. Let gt_area and p_area be the areas of box_gt and box_p, let inter be the area of their intersection, and let union = gt_area + p_area - inter. Then

IoU = inter / union
GIoU = IoU - (c_area - union) / c_area
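The GIoU computation above can be sketched for axis-aligned boxes given as (xmin, ymin, xmax, ymax) tuples (a generic illustration; this corner ordering is an assumption, not taken from the patent):

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes (xmin, ymin, xmax, ymax)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # intersection area
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    # individual areas and union
    a_area = (ax1 - ax0) * (ay1 - ay0)
    b_area = (bx1 - bx0) * (by1 - by0)
    union = a_area + b_area - inter
    # smallest enclosing box C and its area c_area
    c_area = (max(ax1, bx1) - min(ax0, bx0)) * (max(ay1, by1) - min(ay0, by0))
    return inter / union - (c_area - union) / c_area

def giou_loss(box_gt, box_p):
    # target frame loss: Loss = 1 - GIoU
    return 1.0 - giou(box_gt, box_p)
```

Unlike plain IoU, GIoU stays informative for disjoint boxes: it goes negative as the empty gap inside the enclosing box C grows, which gives the regression a gradient even when the predicted and ground-truth frames do not overlap.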
For the category loss Loss_label, a cross-entropy loss function is adopted, with the calculation formula:

Loss_label = -[Gt_label · log(P_label) + (1 - Gt_label) · log(1 - P_label)]
The total loss function is:

Loss_total = λ1 · Loss + λ2 · Loss_label

where λ1 is the weight of the shoe position loss and λ2 is the weight of the shoe classification loss; λ1 is set to 0.6 and λ2 to 0.4.
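The weighted total loss can be sketched in plain Python (the function names and the `eps` clamp for numerical stability are added assumptions; `bce` implements the cross-entropy above for a single binary label):

```python
import math

def bce(gt_label, p_label, eps=1e-7):
    # binary cross-entropy, clamped away from 0/1 for numerical stability
    p = min(max(p_label, eps), 1.0 - eps)
    return -(gt_label * math.log(p) + (1.0 - gt_label) * math.log(1.0 - p))

def total_loss(giou_value, gt_label, p_label, w_pos=0.6, w_cls=0.4):
    # Loss_total = w_pos * (1 - GIoU) + w_cls * Loss_label
    return w_pos * (1.0 - giou_value) + w_cls * bce(gt_label, p_label)
```

A perfect prediction (GIoU of 1 and a confident correct label) drives the total loss to essentially zero, while the 0.6/0.4 weighting tilts training slightly toward box regression over classification.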
Step S3.3: performing secondary shoe detection network training;
Based on the network structure of step S3.1 and the loss function of step S3.2, the human body frame, shoe frame and shoe category parts of the labeled data set are used to jointly train the secondary shoe detection network, obtaining the secondary shoe detection network model. The adam optimization method is adopted, with the initial learning rate set to 0.0002, epoch set to 50, and batchsize set to 64.
Step S4, inputting the picture to be detected into the two-stage neural network model, and outputting the position of the human body frame, the offset of the shoe position relative to the human body frame, and the confidence prediction value of wearing work shoes;
Specifically, each video frame is downsampled with nearest-neighbor interpolation so that its longest edge is scaled to 608 pixels, and the short edge is padded to give a 608 × 608 pixel image. After normalization, the image is input into the first-stage human body detection network model to obtain the human body detection frames and the corresponding 8× downsampled features. These serve as the input of the second-stage shoe detection network, which further outputs the offset of the shoe position relative to the human body frame and the confidence prediction value of wearing work shoes.
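The resize-and-pad step can be sketched as a nearest-neighbor letterbox (pure Python over nested lists, so this is illustrative only; a real pipeline would use an image library, and `letterbox` is a hypothetical helper name; `size` defaults to the 608 used in the text):

```python
def letterbox(img, size=608, pad_value=0):
    """Nearest-neighbor resize so the longest edge equals `size`,
    then pad the short edge (right/bottom) to a size x size square.
    `img` is an H x W grid of pixel values (nested lists)."""
    h, w = len(img), len(img[0])
    scale = size / max(h, w)
    nh, nw = max(1, round(h * scale)), max(1, round(w * scale))
    # nearest-neighbor sampling of the source grid
    resized = [[img[min(h - 1, int(y / scale))][min(w - 1, int(x / scale))]
                for x in range(nw)]
               for y in range(nh)]
    # pad right and bottom with pad_value up to size x size
    out = [row + [pad_value] * (size - nw) for row in resized]
    out += [[pad_value] * size for _ in range(size - nh)]
    return out
```

Scaling by the longest edge preserves the aspect ratio, and padding (rather than stretching) the short edge keeps the human body proportions intact for the detector.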
Step S5, calculating the shoe position from the position of the human body frame and the offset of the shoe position relative to the human body frame, and judging whether work shoes are worn by comparing the multi-frame fused work-shoe confidence against a threshold.
Specifically, calculating the shoe position from the position of the human body frame and the offset of the shoe position relative to the human body frame comprises:
obtaining the shoe target frame from the position of the human body frame and the offset of the shoe position relative to it. With the lower-left corner of the human body frame output by the primary human body detection network at (x1, y1), and the frame width W and height H, the upper-left corner (x_t, y_t) and lower-right corner (x_b, y_b) of the shoe are calculated as:

x_t = x1 + (P_bias0 - P_bias2/2) · W
y_t = y1 - (P_bias1 + P_bias3/2) · H
x_b = x1 + (P_bias0 + P_bias2/2) · W
y_b = y1 - (P_bias1 - P_bias3/2) · H
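The decoding step can be sketched in plain Python (the function name is illustrative, and the sign convention, offsets measured up/right from the lower-left corner with the image y axis growing downward, is a reconstructed assumption consistent with the target definitions):

```python
def decode_shoe_box(x1, y1, body_w, body_h, p_bias):
    """Map predicted (P_bias0..P_bias3) back to image-space shoe corners,
    given the body frame's lower-left corner (x1, y1) and its size."""
    b0, b1, b2, b3 = p_bias
    x_t = x1 + (b0 - b2 / 2) * body_w   # upper-left x
    y_t = y1 - (b1 + b3 / 2) * body_h   # upper-left y (image y grows down)
    x_b = x1 + (b0 + b2 / 2) * body_w   # lower-right x
    y_b = y1 - (b1 - b3 / 2) * body_h   # lower-right y
    return x_t, y_t, x_b, y_b
```

For a frame with lower-left corner (0, 100), width 50 and height 100, the biases (0.5, 0.1, 0.4, 0.1) decode to the shoe corners (15, 85) and (35, 95), i.e. a 20 × 10 shoe centered at (25, 90).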
Judging whether work shoes are worn by fusing the comparison of the multi-frame work-shoe confidence against the threshold comprises:
judging whether work shoes are worn in the current frame using a median-filtered result with the filter length parameter set to 10, i.e. the work-shoe confidence of the current frame is the mean of the confidence prediction values output by the secondary shoe detection network over the previous 10 frames. If this value is greater than or equal to the set threshold of 0.5, work shoes are judged to be worn in the current frame; otherwise they are not. This method makes the shoe category detection result more robust and avoids frequent jumps in the prediction result.
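The per-frame decision rule can be sketched as a sliding-window mean over the most recent confidences (plain Python; the class and method names are illustrative, not from the patent):

```python
from collections import deque

class WorkShoeDecision:
    """Smooth per-frame work-shoe confidences over a fixed-length window
    and compare the window mean against a threshold."""

    def __init__(self, window=10, threshold=0.5):
        self.buf = deque(maxlen=window)  # keeps only the last `window` values
        self.threshold = threshold

    def update(self, confidence):
        self.buf.append(confidence)
        mean = sum(self.buf) / len(self.buf)
        return mean >= self.threshold  # True: work shoes judged worn
```

Because the decision depends on the window mean rather than a single frame, one noisy low-confidence prediction cannot flip the result, which is the robustness the text describes.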
Embodiment 2:
Referring to FIG. 4, this embodiment provides a two-stage neural-network-based work shoe wearing detection apparatus, a virtual apparatus implementing the two-stage neural-network-based work shoe wearing detection method of embodiment 1, with corresponding functional modules and beneficial effects. The apparatus comprises:
an obtaining module 91, configured to obtain a picture data set of a monitoring video;
the labeling module 92 is used for labeling the shoe targets and human body targets contained in the picture data set to obtain a labeled data set;
the building module 93 is used for building a two-stage neural network model, where the two-stage neural network model is formed by cascading a first-stage human body detection network model and a second-stage shoe detection network model, the first-stage human body detection network model is obtained by training a human body detection network on the human body frame part of the labeled data set, the second-stage shoe detection network model is obtained by jointly training a second-stage shoe detection network on the human body frame, shoe frame and shoe category parts of the labeled data set, and the input of the second-stage shoe detection network model is the output of the first-stage human body detection network model;
the output module 94 is used for inputting the picture to be detected into the two-stage neural network model, and outputting the position of the human body frame, the offset of the shoe position relative to the human body frame and the confidence coefficient of the wearing worker shoe;
and the calculation and judgment module 95 is used for calculating the position of the shoe according to the position of the human body frame and the offset of the shoe position relative to the human body frame, and judging whether the shoe is worn according to the result of the comparison between the confidence coefficient of the shoe worn by the multi-frame picture and the threshold value.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described device embodiments are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (8)

1. A work shoe wearing detection method based on a two-stage neural network, characterized by comprising the following steps:
acquiring a picture data set of a monitoring video;
marking the shoe target and the human body target contained in the picture data set to obtain a marked data set;
constructing a two-stage neural network model, wherein the two-stage neural network model is formed by cascading a first-stage human body detection network model and a second-stage shoe detection network model, the first-stage human body detection network model is obtained by training a human body detection network by a human body frame part in the labeled data set, the second-stage shoe detection network model is obtained by jointly training a second-stage shoe detection network by a human body frame, a shoe frame and a shoe category part in the labeled data set, and the input of the second-stage shoe detection network model is the output of the first-stage human body detection network model;
inputting the picture to be detected into the two-stage neural network model, and outputting the position of the human body frame, the offset of the shoe position relative to the human body frame and the confidence coefficient of the wearer shoe;
and calculating the position of the shoe according to the position of the human body frame and the offset of the position of the shoe relative to the human body frame, and judging whether the shoe is worn or not by combining the result of the comparison between the confidence coefficient of the shoe worn by the multi-frame picture and the threshold value.
2. The two-stage neural network-based method for detecting wearing of the industrial shoes according to claim 1, wherein the step of obtaining a picture data set of the monitoring video comprises the following steps:
and acquiring a monitoring video, and performing picture segmentation on the monitoring video to obtain a picture data set.
3. The work shoe wearing detection method based on the two-stage neural network according to claim 1 or 2, wherein labeling the shoe targets and human body targets contained in the picture data set comprises the following steps:
labeling the persons wearing work shoes and the persons not wearing work shoes contained in the picture data set: the human body position of each person is labeled, and only the position of one fully visible shoe of each person is labeled; if the shoe is a work shoe its category is labeled 1, otherwise its category is labeled 0.
4. The work shoe wearing detection method based on the two-stage neural network according to claim 1, wherein the one-stage human body detection network model is obtained by training a human body detection network on the human body frame part of the labeled data set, comprising:
and training a human body detection network by adopting the human body frame part in the labeled data set to obtain a primary human body detection network model, wherein a branch is led out before a last convolution module of the primary human body detection network, and the detection results of the characteristic diagram and the primary human body detection network are output together.
5. The method for detecting wearing of a worker shoe based on the two-stage neural network as claimed in claim 1, wherein the training of the two-stage shoe detection network by the human body frame, the shoe frame and the shoe category part in the labeled data set comprises:
step S3.1: the setting of the secondary shoe detection network structure comprises the following substeps:
step S3.1.1: the feature map and the human body detection frames output by the primary human body detection network model are used as the input of the secondary shoe detection network, where the dimension of the feature map is N × C × H × W, in which N is the batch size, C is the number of channels, and H, W are respectively the height and width of the input image after network downsampling; the dimension of the human body detection frames is N × M × 4, where M represents the number of human body detection frames corresponding to a single image; based on the feature map output by the primary human body detection network, the features of each human body detection frame region are pooled, giving features of dimension (N · M) × C × h_p × w_p, where w_p and h_p denote the width and height of the pooled features;
step S3.1.2: the obtained features are flattened so that the dimension becomes (N · M) × (C · h_p · w_p);
Step S3.1.3: sequentially sending the characteristics into two layers of full-connection layers, wherein the two layers of full-connection layers are respectively connected with one another after trainingdrop_outLayer to avoid network overfitting;
step S3.1.4: sending the characteristics into a full-connection layer with the neuron number of 5 to obtain five-dimensional output;
step S3.1.5: applying sigmoid normalization to the five-dimensional output to obtain the five-dimensional predicted value (P_bias0, P_bias1, P_bias2, P_bias3, P_label), where P_bias0 is the predicted ratio of the horizontal-axis offset of the shoe center point relative to the lower-left corner of the human body frame to the width of the human body frame; P_bias1 is the predicted ratio of the vertical-axis offset of the shoe center point relative to the lower-left corner of the human body frame to the height of the human body frame; P_bias2 is the predicted ratio of the shoe width to the human body frame width; P_bias3 is the predicted ratio of the shoe height to the human body frame height; and P_label is the confidence prediction value for wearing work shoes;
step S3.2: setting a loss function of a secondary shoe detection network;
the offset Gt_bias of the shoe center point (x0, y0) relative to the lower-left corner (x1, y1) of the human body frame comprises five components (Gt_bias0, Gt_bias1, Gt_bias2, Gt_bias3, Gt_label): Gt_bias0 is the ratio of the horizontal-axis offset of the shoe center point relative to the lower-left corner of the human body frame to the width of the human body frame; Gt_bias1 is the ratio of the vertical-axis offset of the shoe center point relative to the lower-left corner of the human body frame to the height of the human body frame; Gt_bias2 is the ratio of the shoe width to the human body frame width; Gt_bias3 is the ratio of the shoe height to the human body frame height; and Gt_label is the shoe category label;
(Gt_bias0, Gt_bias1, Gt_bias2, Gt_bias3, Gt_label) serves as the ground truth, i.e. the regression target, in the training of the secondary shoe detection network, with the expressions:
Gt_bias0 = Δx / W
Gt_bias1 = Δy / H
Gt_bias2 = w / W
Gt_bias3 = h / H

where Δx is the horizontal-axis offset of the shoe center point relative to the lower-left corner of the human body frame (taken as positive), Δy is the vertical-axis offset of the shoe center point relative to the lower-left corner of the human body frame (taken as positive), W is the width of the human body detection frame, H is the height of the human body detection frame, h is the height of the shoe, and w is the width of the shoe;
the lower-left corner of the human body detection frame is taken as the reference point with coordinates (0,0), and the width W and height H of the human body detection frame are both set to unit 1; under this convention, the ground-truth shoe center point has coordinates (Gt_bias0, Gt_bias1), with width Gt_bias2 and height Gt_bias3;
from this, the ground-truth coordinates of the upper-left corner of the shoe target frame (Gt_x_t, Gt_y_t) are:

Gt_x_t = Gt_bias0 - Gt_bias2/2
Gt_y_t = Gt_bias1 + Gt_bias3/2

and the ground-truth coordinates of the lower-right corner of the shoe target frame (Gt_x_b, Gt_y_b) are:

Gt_x_b = Gt_bias0 + Gt_bias2/2
Gt_y_b = Gt_bias1 - Gt_bias3/2
similarly, the predicted coordinates of the upper-left corner of the shoe target frame (P_x_t, P_y_t) are:

P_x_t = P_bias0 - P_bias2/2
P_y_t = P_bias1 + P_bias3/2

and the predicted coordinates of the lower-right corner of the shoe target frame (P_x_b, P_y_b) are:

P_x_b = P_bias0 + P_bias2/2
P_y_b = P_bias1 - P_bias3/2
the ground-truth corners (Gt_x_t, Gt_y_t) and (Gt_x_b, Gt_y_b) define the ground-truth shoe target frame box_gt, and the predicted corners (P_x_t, P_y_t) and (P_x_b, P_y_b) define the predicted shoe target frame box_p; the overlap GIoU of box_gt and box_p is computed, and the target frame loss is Loss = 1 - GIoU;
the GIoU is calculated as follows: for box_gt and box_p, first find the smallest box C that encloses both, and denote its area by c_area; let gt_area and p_area be the areas of box_gt and box_p, let inter be the area of their intersection, and let union = gt_area + p_area - inter; then

IoU = inter / union
GIoU = IoU - (c_area - union) / c_area
for the category loss Loss_label, a cross-entropy loss function is adopted, with the calculation formula:

Loss_label = -[Gt_label · log(P_label) + (1 - Gt_label) · log(1 - P_label)]
the total loss function is:

Loss_total = λ1 · Loss + λ2 · Loss_label

where λ1 is the weight of the shoe position loss and λ2 is the weight of the shoe classification loss;
step S3.3: performing secondary shoe detection network training;
and based on the network structure in the step S3.1 and the loss function in the step S3.2, training a secondary shoe detection network together by adopting the human body frame, the shoe frame and the shoe category part in the labeled data set to obtain a secondary shoe detection network model.
6. The work shoe wearing detection method based on the two-stage neural network according to claim 5, wherein calculating the shoe position from the position of the human body frame and the offset of the shoe position relative to the human body frame comprises the following steps:
obtaining the shoe target frame from the position of the human body frame and the offset of the shoe position relative to it, where the position, width W and height H of the human body frame are output by the primary human body detection network, the lower-left corner of the frame is (x1, y1), and the upper-left corner (x_t, y_t) and lower-right corner (x_b, y_b) of the shoe are calculated as:

x_t = x1 + (P_bias0 - P_bias2/2) · W
y_t = y1 - (P_bias1 + P_bias3/2) · H
x_b = x1 + (P_bias0 + P_bias2/2) · W
y_b = y1 - (P_bias1 - P_bias3/2) · H
7. The work shoe wearing detection method based on the two-stage neural network according to claim 5, wherein judging whether work shoes are worn by comparing the fused multi-frame work-shoe confidence against the threshold comprises the following steps:
judging whether work shoes are worn in the current frame using a median-filtered result with the filter length parameter set to x, i.e. the work-shoe confidence of the current frame is the mean of the confidence prediction values output by the secondary shoe detection network over the previous x frames; if this value is greater than or equal to the set threshold, work shoes are judged to be worn, otherwise they are not.
8. A work shoe wearing detection apparatus based on a two-stage neural network, characterized by comprising:
the acquisition module is used for acquiring a picture data set of the monitoring video;
the marking module is used for marking the shoe target and the human body target contained in the picture data set to obtain a marked data set;
the building module is used for building a two-stage neural network model, the two-stage neural network model is formed by cascading a first-stage human body detection network model and a second-stage shoe detection network model, the first-stage human body detection network model is obtained by training a human body detection network by a human body frame part in the labeled data set, the second-stage shoe detection network model is obtained by jointly training a second-stage shoe detection network by a human body frame, a shoe frame and a shoe category part in the labeled data set, and the input of the second-stage shoe detection network model is the output of the first-stage human body detection network model;
the output module is used for inputting the picture to be detected into the two-stage neural network model and outputting the position of the human body frame, the offset of the shoe position relative to the human body frame and the confidence coefficient of the wearing worker shoe;
and the calculation and judgment module is used for calculating the position of the shoe according to the position of the human body frame and the offset of the shoe position relative to the human body frame, and judging whether the shoe is worn according to the result of the comparison between the confidence coefficient of the shoe worn by the multi-frame picture and the threshold value.
CN202010750662.6A 2020-07-30 2020-07-30 Two-stage neural network-based work shoe wearing detection method and device Active CN111626276B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010750662.6A CN111626276B (en) 2020-07-30 2020-07-30 Two-stage neural network-based work shoe wearing detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010750662.6A CN111626276B (en) 2020-07-30 2020-07-30 Two-stage neural network-based work shoe wearing detection method and device

Publications (2)

Publication Number Publication Date
CN111626276A true CN111626276A (en) 2020-09-04
CN111626276B CN111626276B (en) 2020-10-30

Family

ID=72259604

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010750662.6A Active CN111626276B (en) 2020-07-30 2020-07-30 Two-stage neural network-based work shoe wearing detection method and device

Country Status (1)

Country Link
CN (1) CN111626276B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111968114A (en) * 2020-09-09 2020-11-20 山东大学第二医院 Orthopedics consumable detection method and system based on cascade deep learning method
CN111968115A (en) * 2020-09-09 2020-11-20 山东大学第二医院 Method and system for detecting orthopedic consumables based on rasterization image processing method
CN112528960A (en) * 2020-12-29 2021-03-19 之江实验室 Smoking behavior detection method based on human body posture estimation and image classification
CN113554682A (en) * 2021-08-03 2021-10-26 同济大学 Safety helmet detection method based on target tracking

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845430A (en) * 2017-02-06 2017-06-13 东华大学 Pedestrian detection and tracking based on acceleration region convolutional neural networks
CN109934081A (en) * 2018-08-29 2019-06-25 厦门安胜网络科技有限公司 A kind of pedestrian's attribute recognition approach, device and storage medium based on deep neural network
CN110580445A (en) * 2019-07-12 2019-12-17 西北工业大学 Face key point detection method based on GIoU and weighted NMS improvement

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845430A (en) * 2017-02-06 2017-06-13 东华大学 Pedestrian detection and tracking based on acceleration region convolutional neural networks
CN109934081A (en) * 2018-08-29 2019-06-25 厦门安胜网络科技有限公司 A kind of pedestrian's attribute recognition approach, device and storage medium based on deep neural network
CN110580445A (en) * 2019-07-12 2019-12-17 西北工业大学 Face key point detection method based on GIoU and weighted NMS improvement

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHAO YIHENG等: "A Novel Real-time Driver Monitoring System Based on Deep Convolutional Neural Network", 《2019 IEEE INTERNATIONAL SYMPOSIUM ON ROBOTIC AND SENSORS ENVIRONMENT》 *
王凯迪: "基于小目标检测的工人不安全行为检测系统", 《中国优秀硕士学位论文全文数据库工程科技Ⅰ辑》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111968114A (en) * 2020-09-09 2020-11-20 山东大学第二医院 Orthopedics consumable detection method and system based on cascade deep learning method
CN111968115A (en) * 2020-09-09 2020-11-20 山东大学第二医院 Method and system for detecting orthopedic consumables based on rasterization image processing method
CN111968115B (en) * 2020-09-09 2021-05-04 山东大学第二医院 Method and system for detecting orthopedic consumables based on rasterization image processing method
CN112528960A (en) * 2020-12-29 2021-03-19 之江实验室 Smoking behavior detection method based on human body posture estimation and image classification
CN112528960B (en) * 2020-12-29 2023-07-14 之江实验室 Smoking behavior detection method based on human body posture estimation and image classification
CN113554682A (en) * 2021-08-03 2021-10-26 同济大学 Safety helmet detection method based on target tracking

Also Published As

Publication number Publication date
CN111626276B (en) 2020-10-30

Similar Documents

Publication Publication Date Title
CN111626276B (en) Two-stage neural network-based work shoe wearing detection method and device
CN110097568B (en) Video object detection and segmentation method based on space-time dual-branch network
CN108491835B (en) Two-channel convolutional neural network for facial expression recognition
Sun et al. Abnormal event detection for video surveillance using deep one-class learning
Brulin et al. Posture recognition based on fuzzy logic for home monitoring of the elderly
CN108062574B (en) Weak supervision target detection method based on specific category space constraint
CN107220635A (en) Human face in-vivo detection method based on many fraud modes
CN111598066A (en) Helmet wearing identification method based on cascade prediction
Pan et al. Intelligent diagnosis of northern corn leaf blight with deep learning model
CN112163564B (en) Tumble prejudging method based on human body key point behavior identification and LSTM (least Square TM)
CN111079539B (en) Video abnormal behavior detection method based on abnormal tracking
CN115273244B (en) Human body action recognition method and system based on graph neural network
CN110084201A (en) A kind of human motion recognition method of convolutional neural networks based on specific objective tracking under monitoring scene
CN113158983A (en) Airport scene activity behavior recognition method based on infrared video sequence image
CN113139502A (en) Unsupervised video segmentation method
CN109409224B (en) Method for detecting flame in natural scene
CN111291785A (en) Target detection method, device, equipment and storage medium
CN115909409A (en) Pedestrian attribute analysis method and device, storage medium and electronic equipment
CN114359796A (en) Target identification method and device and electronic equipment
Huynh et al. An efficient model for copy-move image forgery detection
CN108665479B (en) Infrared target tracking method based on compressed domain multi-scale feature TLD
CN110245666A (en) Multiple target Interval Valued Fuzzy based on dual membership driving clusters image partition method
CN114565639A (en) Target tracking method and system based on composite convolutional network
CN113743190A (en) Flame detection method and system based on BiHR-Net and YOLOv3-head
Joshi et al. Unsupervised synthesis of anomalies in videos: Transforming the normal

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant