CN110222570B - Automatic identification method for cargo throwing/kicking behaviors of express industry based on monocular camera


Info

Publication number: CN110222570B
Application number: CN201910369755.1A
Authority: CN (China)
Prior art keywords: kicking, goods, behavior, analysis, throwing
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN110222570A (application publication)
Inventors: 刘立力, 吴晓晖
Current Assignee: Hangzhou Shizai Technology Co ltd
Original Assignee: Hangzhou Shizai Technology Co ltd
Priority date / Filing date: 2019-05-06
Application filed by Hangzhou Shizai Technology Co ltd, with priority to CN201910369755.1A; published as CN110222570A (application), granted and published as CN110222570B

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 — Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 — Movements or behaviour, e.g. gesture recognition
    • G06V40/23 — Recognition of whole body movements, e.g. for sport training

Abstract

The invention belongs to the technical field of video analysis, and in particular relates to a method for automatically identifying goods throwing/kicking behavior in the express delivery industry based on a monocular camera. The method is easy to set up, widely applicable, requires no calibration, and is convenient to deploy on site. It samples a variety of throwing/kicking behaviors and normal working behaviors from multiple viewing angles, obtains sparse features of these behaviors by dictionary learning, and compares them with the sparse features of known throwing/kicking behaviors, which effectively resolves the ambiguity that arises when skeleton points extracted from a monocular camera are used for behavior analysis. By combining the skeleton-point-based throwing/kicking analysis with the motion trajectory of the goods for joint judgment, the method reduces the probability of misjudgment.

Description

Automatic identification method for cargo throwing/kicking behaviors of express industry based on monocular camera
Technical Field
The invention belongs to the technical field of video analysis, and in particular relates to a method for automatically identifying goods throwing/kicking behavior in the express delivery industry based on a monocular camera.
Background
In the express delivery industry, throwing or kicking parcels by couriers can seriously damage goods and harm the reputation of the company. Express companies therefore invest substantial manpower every year to supervise such violations, but manpower is limited and the supervision workload is enormous, so the violations cannot be fundamentally prevented. With the development of artificial intelligence technology, using the surveillance cameras already deployed on site together with computer vision algorithms to automatically detect throwing/kicking behavior and issue timely warnings has important practical significance.
A vision-based solution for analyzing throwing/kicking behavior typically comprises the following steps: first, detect the couriers in the frame and locate their positions and sizes; second, extract the skeleton points of each detected person; finally, analyze the skeleton points to judge whether goods are being thrown or kicked. Depending on the sensor type, there are three solutions for extracting human skeleton points:
First, skeleton point detection based on a depth camera, with behavior recognition on the detected skeleton points. Its advantages are fast detection and good real-time performance; it acquires three-dimensional coordinates of the skeleton points, which greatly reduces the ambiguity of behavior recognition. Its disadvantage is that depth cameras are far less widely deployed than ordinary RGB cameras, and in the express delivery industry in particular the sites would need to be re-equipped, which complicates implementation;
Second, methods based on binocular vision. These share the advantages of the depth-camera method, but the intrinsic and extrinsic parameters of the binocular camera must be accurately calibrated, which likewise makes implementation inconvenient;
Third, methods based on a single camera. Their advantage is that recognition can be performed with the cameras already installed on site, without any additional calibration or installation work; their disadvantage is the ambiguity problem in recognition.
Disclosure of Invention
In view of the above deficiencies in the prior art, the invention aims to provide a method for automatically identifying goods throwing/kicking behavior in the express delivery industry based on a monocular camera.
To this end, the above purpose of the invention is achieved by the following technical scheme:
the automatic identification method for the goods throwing/kicking behavior of the express industry based on the monocular camera comprises the following steps:
(1) detecting the couriers in the picture by using a deep learning model, and extracting skeleton points, wherein the extraction of the skeleton points adopts a depth model method of a convolutional neural network;
(2) throw/play goods behavior analysis based on skeleton point specifically includes:
a. Modeling phase
Input: normalized skeleton point coordinates of different throwing/kicking behaviors at different angles, recorded as $Y \in \mathbb{R}^{m \times n}$; a specified number of iterations N and convergence error e;
Output: the dictionary D and the sparse coding $X \in \mathbb{R}^{K \times n}$ of the original skeleton point data;
Step 1: Initialization: randomly select K columns from the original normalized skeleton point samples as the initial dictionary D; set j = 0 and repeat Step 2 to Step 3 below until the specified number of iterations N is reached or the error converges to within e;
Step 2: Perform sparse coding with the dictionary obtained in Step 1 to obtain $X^{(j)} \in \mathbb{R}^{K \times n}$;
Step 3: Update the dictionary $D^{(j)}$ column by column over its columns $d_k \in \{d_1, d_2, \ldots, d_K\}$:
Step 3.1: When updating $d_k$, compute the error matrix
$$E_k = Y - \sum_{i \neq k} d_i x_T^i,$$
where $x_T^i$ denotes the i-th row of X;
Step 3.2: Collect the indices at which the k-th row of the sparse matrix X is nonzero into a set, recorded as
$$\omega_k = \{\, i \mid 1 \le i \le n,\ x_T^k(i) \neq 0 \,\},$$
and let $\Omega_k \in \mathbb{R}^{n \times |\omega_k|}$ be the selection matrix with ones at the positions $(\omega_k(i), i)$ and zeros elsewhere;
Step 3.3: Restrict the error matrix $E_k$ to the columns indexed by $\omega_k$, obtaining $E'_k = E_k \Omega_k$;
Step 3.4: Apply the singular value decomposition $E'_k = U \Sigma V^T$; take the first column of U to update the k-th column of the dictionary, i.e. $d_k = U(\cdot, 1)$; let
$$x'^k_T = \Sigma(1,1)\, V(\cdot, 1)^T,$$
and write its entries back into the original row $x_T^k$ at the positions indexed by $\omega_k$;
Step 3.5: j = j + 1;
b. Sparse representation phase for skeleton points
Input: the skeleton point vector y to be analyzed and the dictionary D;
Output: the sparse feature x of the skeleton point vector;
Procedure: solve the following optimization problem with the orthogonal matching pursuit algorithm (OMP):
$$\min_x \|y - Dx\|_2^2 \quad \text{s.t.} \quad \|x\|_0 \le T_0,$$
where $T_0$ is the sparsity level. The resulting vector x is used as the sparse coding of the skeleton point vector y; OMP itself is a publicly disclosed technique;
c. Behavior analysis phase
The behavior analysis proceeds from two aspects: the first is an apparent-feature analysis method based on dictionary learning; the second is a trajectory-based analysis method. The final result of the behavior analysis is the fusion of the results of the two methods;
Step 1: Apparent-feature throwing/kicking analysis based on dictionary learning
Form a subspace from the sparse codes of the throwing/kicking data set; compute the distance between the sparse feature of the behavior to be analyzed and this subspace, normalize it to [0, 1], and record it as Ps. If Ps is greater than a set threshold, the behavior is classified as throwing/kicking; otherwise it is judged to be normal working behavior;
step 2: trajectory-based cast/kick analysis
Step2.1: extracting a plurality of front and back frame pictures, detecting goods such as packages appearing in the pictures, and obtaining a motion track G of the goods on the pictures, wherein the set of the tracks of all the goods appearing in the target field is recorded as G (G1, G2.., gn); the method for detecting goods adopts a method based on a deep convolutional neural network;
step2.2: extracting the motion tracks of the left hand, the right hand or the left leg and the right leg of a plurality of frames of pictures before and after the picture is recorded as H ═ hl, hr }; calculating the similarity of G and H according to the following formula, and normalizing to 0 to 1;
Simularity(G,H)=max{Simularity(gi,hj)|gi∈G,hj∈H};
Figure BDA0002049502270000041
wherein, simulity (gi, hj) is a similarity measure of the trajectory gi and the trajectory hj, and the measurement distance may be euclidean distance or cos distance;
step 3: the fusion probability of a cast/kick behavior is calculated using the following equation:
P=Simularity(G,H)*Ps;
d. Behavior judgment phase
If the fusion probability P is greater than the threshold set by the user, the behavior is judged to be a throwing/kicking behavior; the user-set threshold can be obtained by training a neural network.
The invention provides a method for automatically identifying goods throwing/kicking behavior in the express delivery industry based on a monocular camera. The method is easy to set up, widely applicable, requires no calibration, and is convenient to deploy on site. Various throwing/kicking behaviors and normal working behaviors are sampled at multiple viewing angles; sparse features of the behaviors are obtained by dictionary learning and compared with the sparse features of throwing/kicking behaviors, which effectively resolves the ambiguity problem that arises when skeleton points extracted from a monocular camera are used for behavior analysis; and the skeleton-point-based throwing/kicking analysis is combined with the motion trajectory of the goods for joint judgment, reducing the probability of misjudgment.
Drawings
Fig. 1 is a schematic diagram of cargo throwing behavior of a courier.
Detailed Description
The invention is described in further detail with reference to the figures and specific embodiments.
The method for automatically identifying goods throwing/kicking behavior in the express delivery industry based on a monocular camera comprises the following steps:
(1) Detect the couriers in the frame with a deep learning model and extract their skeleton points; the skeleton point extraction uses a convolutional-neural-network-based deep model;
(2) Skeleton-point-based throwing/kicking behavior analysis, which specifically includes:
a. Modeling phase
Input: normalized skeleton point coordinates of different throwing/kicking behaviors at different angles, recorded as $Y \in \mathbb{R}^{m \times n}$; a specified number of iterations N and convergence error e;
Output: the dictionary D and the sparse coding $X \in \mathbb{R}^{K \times n}$ of the original skeleton point data;
Step 1: Initialization: randomly select K columns from the original normalized skeleton point samples as the initial dictionary D; set j = 0 and repeat Step 2 to Step 3 below until the specified number of iterations N is reached or the error converges to within e;
Step 2: Perform sparse coding with the dictionary obtained in Step 1 to obtain $X^{(j)} \in \mathbb{R}^{K \times n}$;
Step 3: Update the dictionary $D^{(j)}$ column by column over its columns $d_k \in \{d_1, d_2, \ldots, d_K\}$:
Step 3.1: When updating $d_k$, compute the error matrix
$$E_k = Y - \sum_{i \neq k} d_i x_T^i,$$
where $x_T^i$ denotes the i-th row of X;
Step 3.2: Collect the indices at which the k-th row of the sparse matrix X is nonzero into a set, recorded as
$$\omega_k = \{\, i \mid 1 \le i \le n,\ x_T^k(i) \neq 0 \,\},$$
and let $\Omega_k \in \mathbb{R}^{n \times |\omega_k|}$ be the selection matrix with ones at the positions $(\omega_k(i), i)$ and zeros elsewhere;
Step 3.3: Restrict the error matrix $E_k$ to the columns indexed by $\omega_k$, obtaining $E'_k = E_k \Omega_k$;
Step 3.4: Apply the singular value decomposition $E'_k = U \Sigma V^T$; take the first column of U to update the k-th column of the dictionary, i.e. $d_k = U(\cdot, 1)$; let
$$x'^k_T = \Sigma(1,1)\, V(\cdot, 1)^T,$$
and write its entries back into the original row $x_T^k$ at the positions indexed by $\omega_k$;
Step 3.5: j = j + 1;
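As a minimal illustrative sketch (not the patent's own code), the modeling phase above corresponds to a K-SVD-style dictionary update, written here in Python/NumPy; the function name `ksvd_dictionary_update` and the in-place update convention are assumptions for illustration, and the sparse-coding of Step 2 is deferred to the OMP sketch after phase b:

```python
import numpy as np

def ksvd_dictionary_update(Y, D, X):
    """One column-by-column dictionary update pass (Step 3 above).

    Y : (m, n) normalized skeleton point samples, one sample per column
    D : (m, K) current dictionary
    X : (K, n) current sparse codes, so that Y is approximately D @ X
    """
    K = D.shape[1]
    for k in range(K):
        # Step 3.2: indices where the k-th row of X is nonzero
        omega_k = np.nonzero(X[k, :])[0]
        if omega_k.size == 0:
            continue  # atom d_k is unused in this pass; leave it unchanged
        # Step 3.1: error matrix E_k = Y - sum_{i != k} d_i x_T^i
        E_k = Y - D @ X + np.outer(D[:, k], X[k, :])
        # Step 3.3: restrict E_k to the columns indexed by omega_k
        E_k_restricted = E_k[:, omega_k]
        # Step 3.4: rank-1 approximation via SVD of E'_k
        U, S, Vt = np.linalg.svd(E_k_restricted, full_matrices=False)
        D[:, k] = U[:, 0]                # d_k = U(., 1)
        X[k, omega_k] = S[0] * Vt[0, :]  # x'_k written back at positions omega_k
    return D, X
```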
b. Sparse representation phase for skeleton points
Input: the skeleton point vector y to be analyzed and the dictionary D;
Output: the sparse feature x of the skeleton point vector;
Procedure: solve the following optimization problem with the orthogonal matching pursuit algorithm (OMP):
$$\min_x \|y - Dx\|_2^2 \quad \text{s.t.} \quad \|x\|_0 \le T_0,$$
where $T_0$ is the sparsity level. The resulting vector x is used as the sparse coding of the skeleton point vector y; OMP itself is a publicly disclosed technique;
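OMP is available off the shelf; as a sketch of the sparse representation phase, the snippet below uses scikit-learn's OrthogonalMatchingPursuit, with the target sparsity `T0` as an assumed stopping criterion (the patent states only the optimization problem):

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def sparse_code(y, D, T0=5):
    """Solve min ||y - D x||_2^2  s.t.  ||x||_0 <= T0 with OMP.

    y : (m,) skeleton point vector to be analyzed
    D : (m, K) learned dictionary
    Returns the sparse feature x of shape (K,).
    """
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=T0, fit_intercept=False)
    omp.fit(D, y)  # the columns of D act as the regressors
    return omp.coef_
```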
c. Behavior analysis phase
The behavior analysis proceeds from two aspects: the first is an apparent-feature analysis method based on dictionary learning; the second is a trajectory-based analysis method. The final result of the behavior analysis is the fusion of the results of the two methods;
Step 1: Apparent-feature throwing/kicking analysis based on dictionary learning
Form a subspace from the sparse codes of the throwing/kicking data set; compute the distance between the sparse feature of the behavior to be analyzed and this subspace, normalize it to [0, 1], and record it as Ps. If Ps is greater than a set threshold, the behavior is classified as throwing/kicking; otherwise it is judged to be normal working behavior;
step 2: trajectory-based cast/kick analysis
Step2.1: extracting a plurality of front and back frame pictures, detecting goods such as packages appearing in the pictures, and obtaining a motion track G of the goods on the pictures, wherein the set of the tracks of all the goods appearing in the target field is recorded as G (G1, G2.., gn); the method for detecting goods adopts a method based on a deep convolutional neural network;
step2.2: extracting the motion tracks of the left hand, the right hand or the left leg and the right leg of a plurality of frames of pictures before and after the picture is recorded as H ═ hl, hr }; calculating the similarity of G and H according to the following formula, and normalizing to 0 to 1;
Simularity(G,H)=max{Simularity(gi,hj)|gi∈G,hj∈H};
Figure BDA0002049502270000061
wherein, simulity (gi, hj) is a similarity measure of the trajectory gi and the trajectory hj, and the measurement distance may be euclidean distance or cos distance;
step 3: the fusion probability of a cast/kick behavior is calculated using the following equation:
P=Simularity(G,H)*Ps;
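The behavior-analysis phase can be sketched as follows; since the patent leaves the exact subspace distance and the pairwise trajectory similarity to equation images, the orthogonal projection in `apparent_score` and the inverse-distance form in `trajectory_similarity` are assumptions, and all trajectories are assumed to be resampled to a common length beforehand:

```python
import numpy as np

def apparent_score(x, S_basis):
    """Ps: normalized distance from sparse feature x to the subspace
    spanned by the columns of S_basis (sparse codes of the
    throwing/kicking data set); squashed into [0, 1)."""
    coeffs, *_ = np.linalg.lstsq(S_basis, x, rcond=None)
    dist = np.linalg.norm(x - S_basis @ coeffs)
    return dist / (1.0 + dist)

def trajectory_similarity(G, H):
    """Similarity(G, H) = max over all pairs (g_i, h_j); each pair score
    uses an assumed inverse-Euclidean-distance form in (0, 1]."""
    def pair_sim(g, h):
        return 1.0 / (1.0 + np.linalg.norm(np.asarray(g) - np.asarray(h)))
    return max(pair_sim(g, h) for g in G for h in H)

def fusion_probability(x, S_basis, G, H):
    """Step 3 above: P = Similarity(G, H) * Ps."""
    return trajectory_similarity(G, H) * apparent_score(x, S_basis)
```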
d. Behavior judgment phase
If the fusion probability P is greater than the threshold set by the user, the behavior is judged to be a throwing/kicking behavior; the user-set threshold can be obtained by training a neural network.
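Finally, the judgment phase reduces to a threshold test on the fused score. A toy sketch follows; the 0.5 default is illustrative, and the comparison direction follows this description, where a larger P indicates throwing/kicking:

```python
def judge_behavior(P, threshold=0.5):
    """Phase d: final judgment. The patent notes the user threshold
    can itself be obtained by training a neural network."""
    return "throwing/kicking" if P > threshold else "normal working behavior"

# Example with made-up scores: P = Similarity(G, H) * Ps
print(judge_behavior(0.9 * 0.8))  # -> throwing/kicking
print(judge_behavior(0.4 * 0.3))  # -> normal working behavior
```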
At present the invention mainly addresses the throwing/kicking behavior of couriers; of course, other large-amplitude non-standard behaviors can also be identified and analyzed with the proposed method.
The above-described embodiments are intended to illustrate the present invention, but not to limit the present invention, and any modifications, equivalents, improvements, etc. made within the spirit of the present invention and the scope of the claims fall within the scope of the present invention.

Claims (1)

1. A method for automatically identifying goods throwing/kicking behavior in the express delivery industry based on a monocular camera, characterized by comprising the following steps:
(1) Detect the couriers in the frame with a deep learning model and extract their skeleton points; the skeleton point extraction uses a convolutional-neural-network-based deep model;
(2) Skeleton-point-based throwing/kicking behavior analysis, which specifically includes:
a. Modeling phase
Input: normalized skeleton point coordinates of different throwing/kicking behaviors at different angles, recorded as $Y \in \mathbb{R}^{m \times n}$; a specified number of iterations N and convergence error e;
Output: the dictionary D and the sparse coding $X \in \mathbb{R}^{K \times n}$ of the original skeleton point data;
Step 1: Initialization: randomly select K columns from the original normalized skeleton point samples as the initial dictionary D; set j = 0 and repeat Step 2 to Step 3 below until the specified number of iterations N is reached or the error converges to within e;
Step 2: Perform sparse coding with the dictionary obtained in Step 1 to obtain $X^{(j)} \in \mathbb{R}^{K \times n}$;
Step 3: Update the dictionary $D^{(j)}$ column by column over its columns $d_k \in \{d_1, d_2, \ldots, d_K\}$:
Step 3.1: When updating $d_k$, compute the error matrix
$$E_k = Y - \sum_{i \neq k} d_i x_T^i,$$
where $x_T^i$ denotes the i-th row of X;
Step 3.2: Collect the indices at which the k-th row of the sparse matrix X is nonzero into a set, recorded as
$$\omega_k = \{\, i \mid 1 \le i \le n,\ x_T^k(i) \neq 0 \,\},$$
and let $\Omega_k \in \mathbb{R}^{n \times |\omega_k|}$ be the selection matrix with ones at the positions $(\omega_k(i), i)$ and zeros elsewhere;
Step 3.3: Restrict the error matrix $E_k$ to the columns indexed by $\omega_k$, obtaining $E'_k = E_k \Omega_k$;
Step 3.4: Apply the singular value decomposition $E'_k = U \Sigma V^T$; take the first column of U to update the k-th column of the dictionary, i.e. $d_k = U(\cdot, 1)$; let
$$x'^k_T = \Sigma(1,1)\, V(\cdot, 1)^T,$$
and write its entries back into the original row $x_T^k$ at the positions indexed by $\omega_k$;
Step 3.5: j = j + 1;
b. Sparse representation phase for skeleton points
Input: the skeleton point vector y to be analyzed and the dictionary D;
Output: the sparse feature x of the skeleton point vector;
Procedure: solve the following optimization problem with the orthogonal matching pursuit algorithm OMP:
$$\min_x \|y - Dx\|_2^2 \quad \text{s.t.} \quad \|x\|_0 \le T_0,$$
where $T_0$ is the sparsity level; the resulting vector x is used as the sparse coding of the skeleton point vector y, OMP being a publicly disclosed technique;
c. Behavior analysis phase
The behavior analysis proceeds from two aspects: the first is an apparent-feature analysis method based on dictionary learning; the second is a trajectory-based analysis method; the final result of the behavior analysis is the fusion of the results of the two methods;
Step 1: Apparent-feature throwing/kicking analysis based on dictionary learning
Form a subspace from the sparse codes of the throwing/kicking data set; compute the distance between the sparse feature of the behavior to be analyzed and this subspace, normalize it to [0, 1], and record it as Ps; if Ps is less than a set threshold, the behavior is classified as throwing/kicking; otherwise it is judged to be normal working behavior;
step 2: trajectory-based cast/kick analysis
Step2.1: extracting a plurality of front and back frame pictures, detecting goods such as packages appearing in the pictures, and obtaining a motion track G of the goods on the pictures, wherein the set of the tracks of all the goods appearing in the target field is recorded as G (G1, G2.., gn); the method for detecting goods adopts a method based on a deep convolutional neural network;
step2.2: extracting the motion tracks of the left hand, the right hand or the left leg and the right leg of a plurality of frames of pictures before and after the picture is recorded as H ═ hl, hr }; calculating the similarity of G and H according to the following formula, and normalizing to 0 to 1;
Simularity(G,H)=max{Simularity(gi,hj)|gi∈G,hj∈H};
wherein, simulity (gi, hj) is a similarity measure of the trajectory gi and the trajectory hj, and the measurement distance may be euclidean distance or cos distance;
simulity (G, H) is normalized to 0 to 1 and is noted simulity (G, H)';
step 3: the fusion probability of a cast/kick behavior is calculated using the following equation:
P=Simularity(G,H)′*Ps
d. Behavior judgment phase
If the fusion probability P is smaller than the threshold set by the user, the behavior is judged to be a throwing/kicking behavior; the user-set threshold can be obtained by training a neural network.
Application CN201910369755.1A (priority date 2019-05-06, filing date 2019-05-06) — Automatic identification method for cargo throwing/kicking behaviors of express industry based on monocular camera — granted as CN110222570B (en), status: Active

Priority Applications (1)

Application Number — Priority Date — Filing Date — Title
CN201910369755.1A — 2019-05-06 — 2019-05-06 — Automatic identification method for cargo throwing/kicking behaviors of express industry based on monocular camera (granted as CN110222570B)


Publications (2)

Publication Number — Publication Date
CN110222570A (en) — 2019-09-10
CN110222570B (en) — 2021-11-23

Family

ID=67820360

Family Applications (1)

Application Number — Title — Priority Date — Filing Date
CN201910369755.1A (granted as CN110222570B, Active) — Automatic identification method for cargo throwing/kicking behaviors of express industry based on monocular camera — 2019-05-06 — 2019-05-06

Country Status (1)

Country Link
CN (1) CN110222570B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516102A (en) * 2021-08-06 2021-10-19 上海中通吉网络技术有限公司 Deep learning parabolic behavior detection method based on video

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104091169A (en) * 2013-12-12 2014-10-08 华南理工大学 Behavior identification method based on multi feature fusion
US10902343B2 (en) * 2016-09-30 2021-01-26 Disney Enterprises, Inc. Deep-learning motion priors for full-body performance capture in real-time
CN106897670B (en) * 2017-01-19 2020-09-22 南京邮电大学 Express violence sorting identification method based on computer vision
US10489656B2 (en) * 2017-09-21 2019-11-26 NEX Team Inc. Methods and systems for ball game analytics with a mobile device
CN108960078A * 2018-06-12 2018-12-07 温州大学 A method for recognizing identity from actions based on monocular vision
CN109614874B (en) * 2018-11-16 2023-06-30 深圳市感动智能科技有限公司 Human behavior recognition method and system based on attention perception and tree skeleton point structure

Also Published As

Publication number Publication date
CN110222570A (en) 2019-09-10


Legal Events

Code — Description
PB01 — Publication
SE01 — Entry into force of request for substantive examination
GR01 — Patent grant