CN110222570B - Automatic identification method for cargo throwing/kicking behaviors of express industry based on monocular camera


Info

Publication number: CN110222570B
Application number: CN201910369755.1A
Authority: CN (China)
Prior art keywords: kicking, goods, behavior, analysis, throwing
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN110222570A (application publication)
Inventors: 刘立力, 吴晓晖
Current Assignee: Hangzhou Shizai Technology Co ltd
Original Assignee: Hangzhou Shizai Technology Co ltd
Priority date / Filing date: 2019-05-06
Application filed by Hangzhou Shizai Technology Co ltd, with priority to CN201910369755.1A; published as CN110222570A (application), granted and published as CN110222570B

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 — Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 — Movements or behaviour, e.g. gesture recognition
    • G06V40/23 — Recognition of whole body movements, e.g. for sport training

Abstract

The invention belongs to the technical field of video analysis, and in particular relates to a method for automatically identifying goods throwing/kicking behavior in the express delivery industry based on a monocular camera. The method is easy to set up, widely applicable, requires no calibration, and is convenient to deploy on site. It samples a variety of throwing/kicking behaviors and normal working behaviors from multiple viewing angles, obtains sparse features of these behaviors by dictionary learning, and compares them with the sparse features of known throwing/kicking behaviors, which effectively resolves the ambiguity that arises when skeleton points extracted from a monocular camera are used for behavior analysis. By combining the skeleton-point-based throwing/kicking analysis with the motion trajectory of the goods for joint judgment, the method reduces the probability of misjudgment.

Description

Automatic identification method for cargo throwing/kicking behaviors of express industry based on monocular camera
Technical Field
The invention belongs to the technical field of video analysis, and in particular relates to a method for automatically identifying goods throwing/kicking behavior in the express delivery industry based on a monocular camera.
Background
In the express delivery industry, throwing or kicking parcels by couriers can seriously damage goods and harm the reputation of the company. Express companies therefore invest substantial manpower every year to supervise such violations, but manpower is limited and the supervision workload is enormous, so the violations cannot be fundamentally prevented. With the development of artificial intelligence technology, using the surveillance cameras already deployed on site together with computer vision algorithms to automatically detect throwing/kicking behavior and issue timely warnings has important practical significance.
A vision-based solution for analyzing throwing/kicking behavior typically comprises the following steps: first, detect the couriers in the frame and locate their positions and sizes; second, extract the skeleton points of each detected person; finally, analyze the skeleton points to judge whether goods are being thrown or kicked. Depending on the sensor type, there are three solutions for extracting human skeleton points:
First, skeleton point detection based on a depth camera, with behavior recognition on the detected skeleton points. Its advantages are fast detection and good real-time performance; it acquires three-dimensional coordinates of the skeleton points, which greatly reduces the ambiguity of behavior recognition. Its disadvantage is that depth cameras are far less widely deployed than ordinary RGB cameras, and in the express delivery industry in particular the sites would need to be re-equipped, which complicates implementation;
Second, methods based on binocular vision. These share the advantages of the depth-camera method, but the intrinsic and extrinsic parameters of the binocular camera must be accurately calibrated, which likewise makes implementation inconvenient;
Third, methods based on a single camera. Their advantage is that recognition can be performed with the cameras already installed on site, without any additional calibration or installation work; their disadvantage is the ambiguity problem in recognition.
Disclosure of Invention
In view of the above deficiencies in the prior art, the invention aims to provide a method for automatically identifying goods throwing/kicking behavior in the express delivery industry based on a monocular camera.
To this end, the above purpose of the invention is achieved by the following technical scheme:
the automatic identification method for the goods throwing/kicking behavior of the express industry based on the monocular camera comprises the following steps:
(1) detecting the couriers in the picture by using a deep learning model, and extracting skeleton points, wherein the extraction of the skeleton points adopts a depth model method of a convolutional neural network;
(2) throw/play goods behavior analysis based on skeleton point specifically includes:
a. Modeling phase
Input: normalized skeleton point coordinates of different throwing/kicking behaviors at different angles, recorded as $Y \in \mathbb{R}^{m \times n}$; a specified number of iterations N and convergence error e;
Output: the dictionary D and the sparse coding $X \in \mathbb{R}^{K \times n}$ of the original skeleton point data;
Step 1: Initialization: randomly select K columns from the original normalized skeleton point samples as the initial dictionary D; set j = 0 and repeat Step 2 to Step 3 below until the specified number of iterations N is reached or the error converges to within e;
Step 2: Perform sparse coding with the dictionary obtained in Step 1 to obtain $X^{(j)} \in \mathbb{R}^{K \times n}$;
Step 3: Update the dictionary $D^{(j)}$ column by column over its columns $d_k \in \{d_1, d_2, \ldots, d_K\}$:
Step 3.1: When updating $d_k$, compute the error matrix
$$E_k = Y - \sum_{i \neq k} d_i x_T^i,$$
where $x_T^i$ denotes the i-th row of X;
Step 3.2: Collect the indices at which the k-th row of the sparse matrix X is nonzero into a set, recorded as
$$\omega_k = \{\, i \mid 1 \le i \le n,\ x_T^k(i) \neq 0 \,\},$$
and let $\Omega_k \in \mathbb{R}^{n \times |\omega_k|}$ be the selection matrix with ones at the positions $(\omega_k(i), i)$ and zeros elsewhere;
Step 3.3: Restrict the error matrix $E_k$ to the columns indexed by $\omega_k$, obtaining $E'_k = E_k \Omega_k$;
Step 3.4: Apply the singular value decomposition $E'_k = U \Sigma V^T$; take the first column of U to update the k-th column of the dictionary, i.e. $d_k = U(\cdot, 1)$; let
$$x'^k_T = \Sigma(1,1)\, V(\cdot, 1)^T,$$
and write its entries back into the original row $x_T^k$ at the positions indexed by $\omega_k$;
Step 3.5: j = j + 1;
b. Sparse representation phase for skeleton points
Input: the skeleton point vector y to be analyzed and the dictionary D;
Output: the sparse feature x of the skeleton point vector;
Procedure: solve the following optimization problem with the orthogonal matching pursuit algorithm (OMP):
$$\min_x \|y - Dx\|_2^2 \quad \text{s.t.} \quad \|x\|_0 \le T_0,$$
where $T_0$ is the sparsity level. The resulting vector x is used as the sparse coding of the skeleton point vector y; OMP itself is a publicly disclosed technique;
c. Behavior analysis phase
The behavior analysis proceeds from two aspects: the first is an apparent-feature analysis method based on dictionary learning; the second is a trajectory-based analysis method. The final result of the behavior analysis is the fusion of the results of the two methods;
Step 1: Apparent-feature throwing/kicking analysis based on dictionary learning
Form a subspace from the sparse codes of the throwing/kicking data set; compute the distance between the sparse feature of the behavior to be analyzed and this subspace, normalize it to [0, 1], and record it as Ps. If Ps is greater than a set threshold, the behavior is classified as throwing/kicking; otherwise it is judged to be normal working behavior;
step 2: trajectory-based cast/kick analysis
Step2.1: extracting a plurality of front and back frame pictures, detecting goods such as packages appearing in the pictures, and obtaining a motion track G of the goods on the pictures, wherein the set of the tracks of all the goods appearing in the target field is recorded as G (G1, G2.., gn); the method for detecting goods adopts a method based on a deep convolutional neural network;
step2.2: extracting the motion tracks of the left hand, the right hand or the left leg and the right leg of a plurality of frames of pictures before and after the picture is recorded as H ═ hl, hr }; calculating the similarity of G and H according to the following formula, and normalizing to 0 to 1;
Simularity(G,H)=max{Simularity(gi,hj)|gi∈G,hj∈H};
Figure BDA0002049502270000041
wherein, simulity (gi, hj) is a similarity measure of the trajectory gi and the trajectory hj, and the measurement distance may be euclidean distance or cos distance;
step 3: the fusion probability of a cast/kick behavior is calculated using the following equation:
P=Simularity(G,H)*Ps;
d. Behavior judgment phase
If the fusion probability P is greater than the threshold set by the user, the behavior is judged to be a throwing/kicking behavior; the user-set threshold can be obtained by training a neural network.
The invention provides a method for automatically identifying goods throwing/kicking behavior in the express delivery industry based on a monocular camera. The method is easy to set up, widely applicable, requires no calibration, and is convenient to deploy on site. Various throwing/kicking behaviors and normal working behaviors are sampled at multiple viewing angles; sparse features of the behaviors are obtained by dictionary learning and compared with the sparse features of throwing/kicking behaviors, which effectively resolves the ambiguity problem that arises when skeleton points extracted from a monocular camera are used for behavior analysis; and the skeleton-point-based throwing/kicking analysis is combined with the motion trajectory of the goods for joint judgment, reducing the probability of misjudgment.
Drawings
Fig. 1 is a schematic diagram of cargo throwing behavior of a courier.
Detailed Description
The invention is described in further detail with reference to the figures and specific embodiments.
The method for automatically identifying goods throwing/kicking behavior in the express delivery industry based on a monocular camera comprises the following steps:
(1) Detect the couriers in the frame with a deep learning model and extract their skeleton points; the skeleton point extraction uses a convolutional-neural-network-based deep model;
(2) Skeleton-point-based throwing/kicking behavior analysis, which specifically includes:
a. Modeling phase
Input: normalized skeleton point coordinates of different throwing/kicking behaviors at different angles, recorded as $Y \in \mathbb{R}^{m \times n}$; a specified number of iterations N and convergence error e;
Output: the dictionary D and the sparse coding $X \in \mathbb{R}^{K \times n}$ of the original skeleton point data;
Step 1: Initialization: randomly select K columns from the original normalized skeleton point samples as the initial dictionary D; set j = 0 and repeat Step 2 to Step 3 below until the specified number of iterations N is reached or the error converges to within e;
Step 2: Perform sparse coding with the dictionary obtained in Step 1 to obtain $X^{(j)} \in \mathbb{R}^{K \times n}$;
Step 3: Update the dictionary $D^{(j)}$ column by column over its columns $d_k \in \{d_1, d_2, \ldots, d_K\}$:
Step 3.1: When updating $d_k$, compute the error matrix
$$E_k = Y - \sum_{i \neq k} d_i x_T^i,$$
where $x_T^i$ denotes the i-th row of X;
Step 3.2: Collect the indices at which the k-th row of the sparse matrix X is nonzero into a set, recorded as
$$\omega_k = \{\, i \mid 1 \le i \le n,\ x_T^k(i) \neq 0 \,\},$$
and let $\Omega_k \in \mathbb{R}^{n \times |\omega_k|}$ be the selection matrix with ones at the positions $(\omega_k(i), i)$ and zeros elsewhere;
Step 3.3: Restrict the error matrix $E_k$ to the columns indexed by $\omega_k$, obtaining $E'_k = E_k \Omega_k$;
Step 3.4: Apply the singular value decomposition $E'_k = U \Sigma V^T$; take the first column of U to update the k-th column of the dictionary, i.e. $d_k = U(\cdot, 1)$; let
$$x'^k_T = \Sigma(1,1)\, V(\cdot, 1)^T,$$
and write its entries back into the original row $x_T^k$ at the positions indexed by $\omega_k$;
Step 3.5: j = j + 1;
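As a minimal illustrative sketch (not the patent's own code), the modeling phase above corresponds to a K-SVD-style dictionary update, written here in Python/NumPy; the function name `ksvd_dictionary_update` and the in-place update convention are assumptions for illustration, and the sparse-coding of Step 2 is deferred to the OMP sketch after phase b:

```python
import numpy as np

def ksvd_dictionary_update(Y, D, X):
    """One column-by-column dictionary update pass (Step 3 above).

    Y : (m, n) normalized skeleton point samples, one sample per column
    D : (m, K) current dictionary
    X : (K, n) current sparse codes, so that Y is approximately D @ X
    """
    K = D.shape[1]
    for k in range(K):
        # Step 3.2: indices where the k-th row of X is nonzero
        omega_k = np.nonzero(X[k, :])[0]
        if omega_k.size == 0:
            continue  # atom d_k is unused in this pass; leave it unchanged
        # Step 3.1: error matrix E_k = Y - sum_{i != k} d_i x_T^i
        E_k = Y - D @ X + np.outer(D[:, k], X[k, :])
        # Step 3.3: restrict E_k to the columns indexed by omega_k
        E_k_restricted = E_k[:, omega_k]
        # Step 3.4: rank-1 approximation via SVD of E'_k
        U, S, Vt = np.linalg.svd(E_k_restricted, full_matrices=False)
        D[:, k] = U[:, 0]                # d_k = U(., 1)
        X[k, omega_k] = S[0] * Vt[0, :]  # x'_k written back at positions omega_k
    return D, X
```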
b. Sparse representation phase for skeleton points
Input: the skeleton point vector y to be analyzed and the dictionary D;
Output: the sparse feature x of the skeleton point vector;
Procedure: solve the following optimization problem with the orthogonal matching pursuit algorithm (OMP):
$$\min_x \|y - Dx\|_2^2 \quad \text{s.t.} \quad \|x\|_0 \le T_0,$$
where $T_0$ is the sparsity level. The resulting vector x is used as the sparse coding of the skeleton point vector y; OMP itself is a publicly disclosed technique;
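OMP is available off the shelf; as a sketch of the sparse representation phase, the snippet below uses scikit-learn's OrthogonalMatchingPursuit, with the target sparsity `T0` as an assumed stopping criterion (the patent states only the optimization problem):

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def sparse_code(y, D, T0=5):
    """Solve min ||y - D x||_2^2  s.t.  ||x||_0 <= T0 with OMP.

    y : (m,) skeleton point vector to be analyzed
    D : (m, K) learned dictionary
    Returns the sparse feature x of shape (K,).
    """
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=T0, fit_intercept=False)
    omp.fit(D, y)  # the columns of D act as the regressors
    return omp.coef_
```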
c. Behavior analysis phase
The behavior analysis proceeds from two aspects: the first is an apparent-feature analysis method based on dictionary learning; the second is a trajectory-based analysis method. The final result of the behavior analysis is the fusion of the results of the two methods;
Step 1: Apparent-feature throwing/kicking analysis based on dictionary learning
Form a subspace from the sparse codes of the throwing/kicking data set; compute the distance between the sparse feature of the behavior to be analyzed and this subspace, normalize it to [0, 1], and record it as Ps. If Ps is greater than a set threshold, the behavior is classified as throwing/kicking; otherwise it is judged to be normal working behavior;
step 2: trajectory-based cast/kick analysis
Step2.1: extracting a plurality of front and back frame pictures, detecting goods such as packages appearing in the pictures, and obtaining a motion track G of the goods on the pictures, wherein the set of the tracks of all the goods appearing in the target field is recorded as G (G1, G2.., gn); the method for detecting goods adopts a method based on a deep convolutional neural network;
step2.2: extracting the motion tracks of the left hand, the right hand or the left leg and the right leg of a plurality of frames of pictures before and after the picture is recorded as H ═ hl, hr }; calculating the similarity of G and H according to the following formula, and normalizing to 0 to 1;
Simularity(G,H)=max{Simularity(gi,hj)|gi∈G,hj∈H};
Figure BDA0002049502270000061
wherein, simulity (gi, hj) is a similarity measure of the trajectory gi and the trajectory hj, and the measurement distance may be euclidean distance or cos distance;
step 3: the fusion probability of a cast/kick behavior is calculated using the following equation:
P=Simularity(G,H)*Ps;
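The behavior-analysis phase can be sketched as follows; since the patent leaves the exact subspace distance and the pairwise trajectory similarity to equation images, the orthogonal projection in `apparent_score` and the inverse-distance form in `trajectory_similarity` are assumptions, and all trajectories are assumed to be resampled to a common length beforehand:

```python
import numpy as np

def apparent_score(x, S_basis):
    """Ps: normalized distance from sparse feature x to the subspace
    spanned by the columns of S_basis (sparse codes of the
    throwing/kicking data set); squashed into [0, 1)."""
    coeffs, *_ = np.linalg.lstsq(S_basis, x, rcond=None)
    dist = np.linalg.norm(x - S_basis @ coeffs)
    return dist / (1.0 + dist)

def trajectory_similarity(G, H):
    """Similarity(G, H) = max over all pairs (g_i, h_j); each pair score
    uses an assumed inverse-Euclidean-distance form in (0, 1]."""
    def pair_sim(g, h):
        return 1.0 / (1.0 + np.linalg.norm(np.asarray(g) - np.asarray(h)))
    return max(pair_sim(g, h) for g in G for h in H)

def fusion_probability(x, S_basis, G, H):
    """Step 3 above: P = Similarity(G, H) * Ps."""
    return trajectory_similarity(G, H) * apparent_score(x, S_basis)
```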
d. Behavior judgment phase
If the fusion probability P is greater than the threshold set by the user, the behavior is judged to be a throwing/kicking behavior; the user-set threshold can be obtained by training a neural network.
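Finally, the judgment phase reduces to a threshold test on the fused score. A toy sketch follows; the 0.5 default is illustrative, and the comparison direction follows this description, where a larger P indicates throwing/kicking:

```python
def judge_behavior(P, threshold=0.5):
    """Phase d: final judgment. The patent notes the user threshold
    can itself be obtained by training a neural network."""
    return "throwing/kicking" if P > threshold else "normal working behavior"

# Example with made-up scores: P = Similarity(G, H) * Ps
print(judge_behavior(0.9 * 0.8))  # -> throwing/kicking
print(judge_behavior(0.4 * 0.3))  # -> normal working behavior
```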
At present the invention mainly addresses the throwing/kicking behavior of couriers; of course, other large-amplitude non-standard behaviors can also be identified and analyzed with the proposed method.
The above-described embodiments are intended to illustrate the present invention, but not to limit the present invention, and any modifications, equivalents, improvements, etc. made within the spirit of the present invention and the scope of the claims fall within the scope of the present invention.

Claims (1)

1. A method for automatically identifying goods throwing/kicking behavior in the express delivery industry based on a monocular camera, characterized by comprising the following steps:
(1) Detect the couriers in the frame with a deep learning model and extract their skeleton points; the skeleton point extraction uses a convolutional-neural-network-based deep model;
(2) Skeleton-point-based throwing/kicking behavior analysis, which specifically includes:
a. Modeling phase
Input: normalized skeleton point coordinates of different throwing/kicking behaviors at different angles, recorded as $Y \in \mathbb{R}^{m \times n}$; a specified number of iterations N and convergence error e;
Output: the dictionary D and the sparse coding $X \in \mathbb{R}^{K \times n}$ of the original skeleton point data;
Step 1: Initialization: randomly select K columns from the original normalized skeleton point samples as the initial dictionary D; set j = 0 and repeat Step 2 to Step 3 below until the specified number of iterations N is reached or the error converges to within e;
Step 2: Perform sparse coding with the dictionary obtained in Step 1 to obtain $X^{(j)} \in \mathbb{R}^{K \times n}$;
Step 3: Update the dictionary $D^{(j)}$ column by column over its columns $d_k \in \{d_1, d_2, \ldots, d_K\}$:
Step 3.1: When updating $d_k$, compute the error matrix
$$E_k = Y - \sum_{i \neq k} d_i x_T^i,$$
where $x_T^i$ denotes the i-th row of X;
Step 3.2: Collect the indices at which the k-th row of the sparse matrix X is nonzero into a set, recorded as
$$\omega_k = \{\, i \mid 1 \le i \le n,\ x_T^k(i) \neq 0 \,\},$$
and let $\Omega_k \in \mathbb{R}^{n \times |\omega_k|}$ be the selection matrix with ones at the positions $(\omega_k(i), i)$ and zeros elsewhere;
Step 3.3: Restrict the error matrix $E_k$ to the columns indexed by $\omega_k$, obtaining $E'_k = E_k \Omega_k$;
Step 3.4: Apply the singular value decomposition $E'_k = U \Sigma V^T$; take the first column of U to update the k-th column of the dictionary, i.e. $d_k = U(\cdot, 1)$; let
$$x'^k_T = \Sigma(1,1)\, V(\cdot, 1)^T,$$
and write its entries back into the original row $x_T^k$ at the positions indexed by $\omega_k$;
Step 3.5: j = j + 1;
b. Sparse representation phase for skeleton points
Input: the skeleton point vector y to be analyzed and the dictionary D;
Output: the sparse feature x of the skeleton point vector;
Procedure: solve the following optimization problem with the orthogonal matching pursuit algorithm OMP:
$$\min_x \|y - Dx\|_2^2 \quad \text{s.t.} \quad \|x\|_0 \le T_0,$$
where $T_0$ is the sparsity level; the resulting vector x is used as the sparse coding of the skeleton point vector y, OMP being a publicly disclosed technique;
c. Behavior analysis phase
The behavior analysis proceeds from two aspects: the first is an apparent-feature analysis method based on dictionary learning; the second is a trajectory-based analysis method; the final result of the behavior analysis is the fusion of the results of the two methods;
Step 1: Apparent-feature throwing/kicking analysis based on dictionary learning
Form a subspace from the sparse codes of the throwing/kicking data set; compute the distance between the sparse feature of the behavior to be analyzed and this subspace, normalize it to [0, 1], and record it as Ps; if Ps is less than a set threshold, the behavior is classified as throwing/kicking; otherwise it is judged to be normal working behavior;
step 2: trajectory-based cast/kick analysis
Step2.1: extracting a plurality of front and back frame pictures, detecting goods such as packages appearing in the pictures, and obtaining a motion track G of the goods on the pictures, wherein the set of the tracks of all the goods appearing in the target field is recorded as G (G1, G2.., gn); the method for detecting goods adopts a method based on a deep convolutional neural network;
step2.2: extracting the motion tracks of the left hand, the right hand or the left leg and the right leg of a plurality of frames of pictures before and after the picture is recorded as H ═ hl, hr }; calculating the similarity of G and H according to the following formula, and normalizing to 0 to 1;
Simularity(G,H)=max{Simularity(gi,hj)|gi∈G,hj∈H};
wherein, simulity (gi, hj) is a similarity measure of the trajectory gi and the trajectory hj, and the measurement distance may be euclidean distance or cos distance;
simulity (G, H) is normalized to 0 to 1 and is noted simulity (G, H)';
step 3: the fusion probability of a cast/kick behavior is calculated using the following equation:
P=Simularity(G,H)′*Ps
d. Behavior judgment phase
If the fusion probability P is smaller than the threshold set by the user, the behavior is judged to be a throwing/kicking behavior; the user-set threshold can be obtained by training a neural network.
Application CN201910369755.1A (priority date 2019-05-06, filing date 2019-05-06) — Automatic identification method for cargo throwing/kicking behaviors of express industry based on monocular camera — granted as CN110222570B (en), status: Active

Priority Applications (1)

Application Number — Priority Date — Filing Date — Title
CN201910369755.1A — 2019-05-06 — 2019-05-06 — Automatic identification method for cargo throwing/kicking behaviors of express industry based on monocular camera (granted as CN110222570B)


Publications (2)

Publication Number — Publication Date
CN110222570A (en) — 2019-09-10
CN110222570B (en) — 2021-11-23

Family

ID=67820360

Family Applications (1)

Application Number — Title — Priority Date — Filing Date
CN201910369755.1A (granted as CN110222570B, Active) — Automatic identification method for cargo throwing/kicking behaviors of express industry based on monocular camera — 2019-05-06 — 2019-05-06

Country Status (1)

Country Link
CN (1) CN110222570B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516102A (en) * 2021-08-06 2021-10-19 上海中通吉网络技术有限公司 Deep learning parabolic behavior detection method based on video

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104091169A (en) * 2013-12-12 2014-10-08 华南理工大学 Behavior identification method based on multi feature fusion
US10902343B2 (en) * 2016-09-30 2021-01-26 Disney Enterprises, Inc. Deep-learning motion priors for full-body performance capture in real-time
CN106897670B (en) * 2017-01-19 2020-09-22 南京邮电大学 Express violence sorting identification method based on computer vision
US10489656B2 (en) * 2017-09-21 2019-11-26 NEX Team Inc. Methods and systems for ball game analytics with a mobile device
CN108960078A * 2018-06-12 2018-12-07 温州大学 A method for recognizing identity from actions based on monocular vision
CN109614874B (en) * 2018-11-16 2023-06-30 深圳市感动智能科技有限公司 Human behavior recognition method and system based on attention perception and tree skeleton point structure

Also Published As

Publication number Publication date
CN110222570A (en) 2019-09-10


Legal Events

Code — Description
PB01 — Publication
SE01 — Entry into force of request for substantive examination
GR01 — Patent grant