CN114663834B - On-site monitoring method for express storage - Google Patents
- Publication number
- CN114663834B CN114663834B CN202210282392.XA CN202210282392A CN114663834B CN 114663834 B CN114663834 B CN 114663834B CN 202210282392 A CN202210282392 A CN 202210282392A CN 114663834 B CN114663834 B CN 114663834B
- Authority
- CN
- China
- Prior art keywords
- layer
- image
- express
- hidden layer
- nodes
- Prior art date
- Legal status
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/08—Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
Abstract
The invention provides an on-site monitoring method for express storage, comprising the following steps. S1: a camera acquires first video data of an acquisition area. S2: a first group of images F_1 is intercepted from the first video data and each image is analyzed to detect whether an express courier is present; if so, proceed to S3, otherwise return to S1. S3: express delivery notification information is sent to the terminal. The method autonomously monitors and reports express delivery and abnormal pickup events, helps the receiving customer remotely verify the delivery state, and solves the problem of failed delivery when the customer cannot accept the package on site.
Description
Technical Field
The invention relates to the fields of smart home and artificial intelligence equipment, and in particular to on-site monitoring methods for express storage.
Background
In modern society, with the development of the economy and technology, express delivery has gradually become an indispensable industry in daily life. Modern people, whether at home or at work, depend increasingly on the courier industry. Conversely, the development of the express industry drives the economy and technology forward. Economically, it makes economic activity denser and more efficient; technically, the industry's high demands on efficiency, convenience, and safety create broad demand for automation and intelligent applications, and these application demands in turn drive the invention and production of new technologies.
Delivery is the final link of express logistics. It largely determines the receiving customer's acceptance of the express service and is the link most likely to cause customer disputes. One of the main contradictions is the conflict between the dispatch time and the customer's receiving time. Because express delivery is time-sensitive, a conflict between the courier's delivery time and the time the customer can receive the package easily causes problems such as spoiled goods, extra delivery cost, and lost goods. There is therefore room for improvement for both delivery enterprises and receiving customers.
In particular, to prevent cross-infection during epidemics, a safe distance between the courier and the customer must be maintained and close direct contact avoided; contactless, rapid delivery at school gates and residential buildings is thus a problem to be solved.
The express cabinet is one effective solution to these problems: an automated, secured locker stores goods at relatively low cost, resolving the difficulty of customers who cannot receive goods in time. However, the express cabinet scheme still has several problems. First, many cabinets push pickup notifications by SMS or WeChat; in an age of information overload these messages are easily filtered out as spam, so the customer never sees them, or the customer receives them but forgets to pick up the package, which both occupies cabinet resources and risks loss of the goods. Second, cabinets occupy public space and power resources, requiring advance planning by property management; many existing residences and offices lack these conditions, making retrofitting and management costly. Third, deploying and operating cabinets carries ongoing operation and maintenance costs; these are usually borne by the delivery enterprise in the initial deployment stage, but as a long-term, continuous expenditure they are eventually passed on, invisibly raising the overall cost of delivery.
Disclosure of Invention
To solve these problems, the invention provides an on-site monitoring method for express storage. Using artificial intelligence and machine vision, it autonomously monitors courier deliveries: when a courier delivering a package is detected, a preset receiving customer is notified, the courier is prompted to place the package at a predetermined position, and on-site video can be retrieved on the customer's instruction to manually verify the delivery; when someone is detected taking the article from the predetermined position, an abnormality notification is sent to the receiving customer. The method autonomously monitors and reports delivery and abnormal pickup events, helps the receiving customer remotely verify the delivery state, and solves the problem of failed delivery when the customer cannot accept the package on site. It requires only a single network camera, so material cost is low and there is no operation or maintenance cost, avoiding the resource occupation and high operating cost of express cabinets.
The invention provides an on-site monitoring method for express storage, which comprises the following steps:
s1: the camera acquires first video data of an acquisition area;
S2: intercepting a first group of images F_1 from the first video data, analyzing each image, acquiring subgraphs of each image, and detecting whether an express courier is present; if so, proceed to S3, otherwise return to S1;
s3: sending express delivery notification information to a terminal;
A neural network model N_1 is used to detect in the subgraphs whether a courier is present.
The input layer of the neural network model N_1 is a subgraph S_I(u',v'); the output layer is a two-dimensional vector whose two components indicate, respectively, whether an express courier and whether a general pedestrian is present in the input image;
The neural network model N_1 is defined as follows.

The first hidden layer of N_1:

$$h^{0}_{x,y}=\sigma\left(\sum_{p=-4}^{4}\sum_{q=-4}^{4} w^{0}_{p,q}\, S_I(u'+p,\,v'+q)+b_0\right) \qquad (1)$$

where w^0_{p,q} is the weight of the 9×9 convolution window centered at (u',v') in the input layer; p and q are the integer coordinates of relative position within the window, ranging from -4 to 4; S_I(u'+p, v'+q) is the pixel value of the input-layer subgraph at coordinates (u'+p, v'+q); h^0_{x,y} is the node with coordinates (x,y) in the first hidden layer, which depends on the window weights w^0_{p,q} and is connected to 9×9 nodes of the input layer; and b_0 is a linear bias. σ(x) is the nonlinear function

$$\sigma(x)=\frac{1}{1+e^{-\alpha x}} \qquad (2)$$

where e^x is the exponential function, enabling the network to classify nonlinearly separable samples, and α is an empirical parameter;
The second hidden layer of N_1:

$$h^{1}_{x,y}=\sigma\left(\max_{p,q\in\{0,1,2,3\}} h^{0}_{4x+p,\,4y+q}+b_1\right) \qquad (3)$$

where h^1_{x,y} is the node with coordinates (x,y) in the second hidden layer, connected to 4×4 = 16 nodes of the first hidden layer; max takes the maximum over the 16 first-hidden-layer nodes h^0_{4x+p,4y+q} at the corresponding positions, with p and q each taking the values 0, 1, 2, 3 in the x and y directions; b_1 is a linear bias; and σ(x) is defined by formula (2);
The third hidden layer of N_1:

$$h^{2}_{x,y}=\sigma\left(\sum_{p=-3}^{3}\sum_{q=-3}^{3} w^{2}_{p,q}\, h^{1}_{x+p,\,y+q}+b_2\right) \qquad (4)$$

where h^2_{x,y} is the node with coordinates (x,y) in the third hidden layer; h^1_{x+p,y+q} is the node of the second hidden layer at coordinates (x+p, y+q); w^2_{p,q} is the weight of the 7×7 convolution window, with p and q the integer coordinates of relative position within the window, ranging from -3 to 3; each node h^2_{x,y} is connected through the weights w^2_{p,q} to 7×7 nodes of the second hidden layer; b_2 is a linear bias; and σ(x) is the nonlinear function defined by formula (2);
The fourth hidden layer of N_1:

$$h^{3}_{x,y}=\sigma\left(\max_{p,q\in\{0,1\}} h^{2}_{2x+p,\,2y+q}+b_3\right) \qquad (5)$$

where h^3_{x,y} is the node with coordinates (x,y) in the fourth hidden layer, connected to 2×2 = 4 nodes of the third hidden layer; max takes the maximum over the 4 third-hidden-layer nodes h^2_{2x+p,2y+q} at the corresponding positions, with p and q each taking the values 0 and 1; b_3 is a linear bias; and σ(x) is defined by formula (2);
The fifth hidden layer of N_1:

$$h^{4}_{i,x,y}=\sigma\left(\sum_{p=-3}^{3}\sum_{q=-3}^{3} w^{4}_{i,p,q}\, h^{3}_{x+p,\,y+q}+b_4\right),\quad i\in\{1,2\} \qquad (6)$$

where the fifth hidden layer consists of two two-dimensional matrices of nodes h^4_{i,x,y}; (x,y) are the coordinates in the x and y directions and the subscript i indicates which of the two matrices the node belongs to; h^3_{x+p,y+q} is the node of the fourth hidden layer at coordinates (x+p, y+q); w^4_{1,p,q} and w^4_{2,p,q} are the corresponding 7×7 convolution window weights, with p and q the integer coordinates of relative position within the window, ranging from -3 to 3. Thus h^4_{1,x,y} is connected through the weights w^4_{1,p,q} to 7×7 nodes of the fourth hidden layer, and h^4_{2,x,y} through the weights w^4_{2,p,q} to 7×7 nodes of the fourth hidden layer; b_4 is a linear bias; and σ(x) is the nonlinear function defined by formula (2);
The first four hidden layers of N_1 treat the courier target and the general pedestrian target jointly, extracting their common features; the fifth hidden layer extracts the differentiating features of the courier target and the general pedestrian target, further distinguishing them;
The output layer of N_1:

$$\omega_i=\sigma\left(\sum_{c}\sum_{d} v_{i,c,d}\, h^{4}_{i,c,d}+b_5\right),\quad \Omega=[\omega_1,\omega_2]^{T} \qquad (7)$$

where h^4_{i,c,d} are the nodes of the fifth hidden layer and Ω = [ω_1, ω_2]^T are the two nodes of the output layer; v_{i,c,d} is the connection weight corresponding to h^4_{i,c,d}, with c and d ranging over the same values as x and y, i.e. each v_{1,c,d} corresponds to a node h^4_{1,x,y} and each v_{2,c,d} to a node h^4_{2,x,y}; b_5 is a linear bias; and σ(x) is defined by formula (2).

The output nodes Ω = [ω_1, ω_2]^T of the neural network take values in [0, 1] and represent, respectively, the probability that an express courier and that a general pedestrian is present in the input image. When ω_i (i ∈ {1,2}) tends to 0 the target is absent; when ω_i tends to 1 the target is present. To make the determination definite, a threshold is applied: the target i is judged present if ω_i ≥ 0.5 and absent otherwise.
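The layer definitions above can be sketched end to end. The following is a minimal NumPy sketch, not the patented implementation: the sigmoid form of σ, the 100×100 input size, and the random parameters are assumptions for illustration.

```python
import numpy as np

def sigma(x, alpha=1.0):
    """Assumed sigmoid form of the nonlinearity: 1 / (1 + e^(-alpha * x))."""
    return 1.0 / (1.0 + np.exp(-alpha * x))

def conv_layer(img, w, b):
    """'Valid' convolution + activation: each output node connects to a k*k window."""
    k = w.shape[0]
    H, W = img.shape[0] - k + 1, img.shape[1] - k + 1
    out = np.empty((H, W))
    for x in range(H):
        for y in range(W):
            out[x, y] = np.sum(w * img[x:x + k, y:y + k]) + b
    return sigma(out)

def maxpool_layer(img, s, b):
    """s*s max pooling + activation, as in the second and fourth hidden layers."""
    H, W = img.shape[0] // s, img.shape[1] // s
    out = np.empty((H, W))
    for x in range(H):
        for y in range(W):
            out[x, y] = img[s * x:s * x + s, s * y:s * y + s].max() + b
    return sigma(out)

def forward_N1(sub, p):
    """Forward pass: conv 9x9 -> 4x4 pool -> conv 7x7 -> 2x2 pool
    -> two parallel conv 7x7 branches -> two-node output."""
    h0 = conv_layer(sub, p["w0"], p["b0"])
    h1 = maxpool_layer(h0, 4, p["b1"])
    h2 = conv_layer(h1, p["w2"], p["b2"])
    h3 = maxpool_layer(h2, 2, p["b3"])
    h4 = [conv_layer(h3, w4, p["b4"]) for w4 in p["w4"]]  # two parallel feature maps
    omega = np.array([sigma(np.sum(v * h) + p["b5"]) for v, h in zip(p["v5"], h4)])
    return omega  # omega[0]: courier probability, omega[1]: pedestrian probability

rng = np.random.default_rng(0)
params = {
    "w0": rng.normal(size=(9, 9)) * 0.1, "b0": 0.0,
    "b1": 0.0,
    "w2": rng.normal(size=(7, 7)) * 0.1, "b2": 0.0,
    "b3": 0.0,
    "w4": [rng.normal(size=(7, 7)) * 0.1 for _ in range(2)], "b4": 0.0,
    "v5": [rng.normal(size=(2, 2)) * 0.1 for _ in range(2)], "b5": 0.0,
}
omega = forward_N1(rng.random((100, 100)), params)
```

With a 100×100 subgraph the shapes chain as 100 → 92 (conv 9×9) → 23 (4×4 pool) → 17 (conv 7×7) → 8 (2×2 pool) → 2×2 per branch; in practice the Python loops would be vectorized or replaced by a deep-learning framework.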
the weight parameters and bias parameters of each layer of the neural network in the formulas (1) - (7) need to be obtained through training sample learning. Preparing a plurality of groups of training samples in advance, wherein the training samples are divided into three types, the first type is an image of an express dispatcher who dispatches packages, and the corresponding output is [1,0 ] ]The second type is an image of a common pedestrian, and the corresponding output value is [0,1]The third class is an image containing neither express dispatcher nor general pedestrian, and the corresponding output value is [0,0]. Each group of training samples comprises images and output values thereof for training the neural network model N 1 . Calculating the output result for a given training sample input value according to the definitions of (1) - (7) and comparing it with the labeled value of the training sample, a comparison value can be obtained, the comparison value being defined as a cost function:
wherein ,marking output value representing sample, Ω= [ ω ] 1 ,ω 2 ] T Representation according to a neural network model N 1 The output estimated value is calculated from the input image.Representation vector->Is a norm of (a). The parameter theta is a control parameter, so that the robustness of the noise of the model is improved;
The extremum of the cost function (8) is found by backpropagation, realizing the training of the neural network model N_1 and determining the parameters of formulas (1)–(7).
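The training idea can be sketched on a toy scale. The bounded cost form, the one-weight "network", and the use of numerical gradients instead of full backpropagation are all assumptions for illustration, not the patented procedure.

```python
import numpy as np

def robust_cost(omega_hat, omega, theta=0.5):
    """Hypothetical bounded cost: e2 / (e2 + theta). Small errors behave
    quadratically; large (noisy) errors saturate near 1, limiting their influence."""
    e2 = float(np.sum((np.asarray(omega_hat, dtype=float)
                       - np.asarray(omega, dtype=float)) ** 2))
    return e2 / (e2 + theta)

def sigma(x, alpha=1.0):
    """Assumed sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-alpha * x))

def train(samples, w=0.0, lr=1.0, steps=200, eps=1e-5):
    """Gradient descent on the cost over a one-weight model omega = sigma(w * x),
    using central-difference gradients as a stand-in for backpropagation."""
    for _ in range(steps):
        grad = sum(
            (robust_cost(label, sigma((w + eps) * x)) -
             robust_cost(label, sigma((w - eps) * x))) / (2 * eps)
            for x, label in samples
        )
        w -= lr * grad
    return w

# Two labeled samples: positive input should map to 1, negative input to 0.
samples = [(1.0, 1.0), (-1.0, 0.0)]
w = train(samples)
```

After training, the learned weight separates the two samples; the same descent principle, applied layer by layer via backpropagation, fits the parameters of (1)–(7).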
Optionally, the method further comprises:
s4, the camera acquires second video data of the acquisition area;
S5: intercepting a second group of images F_2 from the second video data, analyzing each image, acquiring subgraphs of each image, and detecting whether an express package is present in the designated sub-region of the image; if so, proceed to S6, otherwise return to S4;
And S6, sending the express delivery completion notification information to the terminal.
Optionally, the method further comprises:
s7, the camera acquires third video data of the acquisition area;
S8: intercepting a third group of images F_3 from the third video data, analyzing each image, and detecting whether the express package is still present in the designated sub-region of the image; if it is no longer present, proceed to S9, otherwise return to S7;
s9, sending the express package abnormal notification information to the terminal.
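The S1–S9 flow amounts to a three-state loop over captured frames. A minimal sketch, with `detect_courier` and `detect_package` as hypothetical stand-ins for the neural-network detectors, and with the abnormal notification of S9 triggered when a deposited package disappears from the designated region, per the disclosure:

```python
def monitor(frames, detect_courier, detect_package, notify):
    """Three-state event loop:
    WAIT_COURIER (S1-S3) -> WAIT_PACKAGE (S4-S6) -> WATCH_PACKAGE (S7-S9)."""
    state = "WAIT_COURIER"
    for frame in frames:
        if state == "WAIT_COURIER":
            if detect_courier(frame):            # S2: courier detected
                notify("courier arriving")       # S3
                state = "WAIT_PACKAGE"
        elif state == "WAIT_PACKAGE":
            if detect_package(frame):            # S5: package placed at position
                notify("delivery complete")      # S6
                state = "WATCH_PACKAGE"
        else:  # WATCH_PACKAGE
            if not detect_package(frame):        # S8: package no longer present
                notify("abnormal: package removed")  # S9
                state = "WAIT_COURIER"

# Toy run with string-tagged frames standing in for images.
events = []
monitor(
    frames=["empty", "courier", "courier+package", "package", "empty"],
    detect_courier=lambda f: "courier" in f,
    detect_package=lambda f: "package" in f,
    notify=events.append,
)
```

Each notification corresponds to one of the messages sent to the terminal in S3, S6, and S9.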
Optionally, in step S2, several subgraphs of each image are acquired, and the neural network model N_1 detects from the subgraphs, according to the courier's distinguishing characteristics, whether an express courier is present.
Optionally, for each complete image I, the method for acquiring subgraphs comprises:

SS.A: give the size of the subgraph;

SS.B: give a step value;

SS.C: taking the initial pixel of the complete image I as the reference pixel and the reference pixel as the cutting start point, cut out a subgraph of the size given in step SS.A;

SS.D: move the reference pixel of step SS.C along the two independent directions of the image by the step value given in SS.B;

SS.E: with the new reference pixel as the start point, repeat steps SS.C and SS.D until no new subgraph can be cut out.
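The steps above amount to a sliding-window crop. A minimal sketch, with the list-of-lists image representation assumed for illustration:

```python
def extract_subgraphs(image, sub_h, sub_w, step):
    """SS.A/SS.B: given subgraph size and step value; SS.C-SS.E: slide the cutting
    start point along both image directions, cropping until no full-size subgraph fits."""
    H, W = len(image), len(image[0])
    subs = []
    for u in range(0, H - sub_h + 1, step):          # move along the first direction
        for v in range(0, W - sub_w + 1, step):      # move along the second direction
            subs.append([row[v:v + sub_w] for row in image[u:u + sub_h]])
    return subs

# Toy 8x8 image whose pixel value encodes its coordinates (10*u + v).
image = [[10 * u + v for v in range(8)] for u in range(8)]
subs = extract_subgraphs(image, 4, 4, 4)   # yields four 4x4 subgraphs
```

The crops cover every position of the complete image; each one is then fed to the detector as a candidate subgraph.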
Optionally, in S5 the designated sub-region is a predetermined rectangular region [(u_1,v_1), (u_2,v_2), (u_3,v_3), (u_4,v_4)] marking the express deposit position, where (u_1,v_1), (u_2,v_2), (u_3,v_3), (u_4,v_4) are the image coordinates of the rectangle's four vertices.
Optionally, the method for detecting whether an express package is present in the designated sub-region of the image comprises:

while no express package has been placed at the designated position, intercepting a frame from the video shot by the camera and recording its subgraph over the predetermined region as a reference image R(u',v') containing no package;

given the subgraph S_I(u',v') of another image I(u,v) to be judged over the same region, taking the difference

D(u',v') = |S_I(u',v') - R(u',v')|

i.e. the absolute value of the difference between the pixels at corresponding positions of the two subgraphs;

introducing a neural network model N_2 whose input layer is the difference map D(u',v') and whose output layer is a scalar ψ: ψ = 1 indicates that an express package is present in the input image, and ψ = 0 that none is.
Optionally, the interception takes several images from the video data at equal intervals over a period of time.

Optionally, if the number of images in which a courier is detected exceeds a first preset value, and in at least one image no general pedestrian is detected, the courier is determined to be present.

Optionally, if the number of images in which an express package is detected exceeds a second preset value and those images were captured consecutively, the package is determined to be present.
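The two confirmation rules above can be sketched as follows; the preset values and the per-image flag representation are assumptions for illustration:

```python
def courier_confirmed(courier_flags, pedestrian_flags, first_preset=2):
    """Courier confirmed when courier detections exceed the first preset value
    and at least one image contains no general pedestrian."""
    return sum(courier_flags) > first_preset and not all(pedestrian_flags)

def package_confirmed(package_flags, second_preset=2):
    """Package confirmed when more than second_preset detections occur in images
    whose capture order is consecutive (i.e. in an unbroken run)."""
    run = best = 0
    for flag in package_flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best > second_preset

# Per-image detection flags, one entry per captured frame.
a = courier_confirmed([1, 1, 1, 0], [1, 0, 1, 1])   # 3 > 2 and frame 2 has no pedestrian
b = package_confirmed([1, 1, 0, 1, 1, 1, 1])        # run of 4 consecutive detections
c = package_confirmed([1, 0, 1, 0, 1, 0, 1])        # detections never consecutive
```

Requiring multiple (and for packages, consecutive) detections suppresses single-frame false positives from the per-image detectors.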
The invention has the following technical effects:

1. The invention innovatively analyzes three events, namely (1) a courier arriving at the door, (2) a package being placed at the designated position, and (3) a package being taken from the designated position, enabling rapid and accurate monitoring and judgment of delivery and abnormal conditions.

2. Dedicated neural networks are designed to detect the three events accurately, with the network structure, activation function, and cost function optimized to improve detection accuracy while keeping detection time acceptable.

3. The invention innovatively provides a neural-network-based method for detecting couriers and general pedestrians in a single-frame image: the courier target and the general pedestrian target are treated jointly to extract their common features, while a dedicated hidden layer extracts their differentiating features, further distinguishing the targets and improving detection performance.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a schematic diagram of the composition and relationship of an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The embodiment of the invention provides an on-site monitoring method for express storage, which comprises the following steps:
s1: the camera acquires first video data of an acquisition area;
S2: intercepting a first group of images F_1 from the first video data, analyzing each image, and detecting whether an express courier is present; if so, proceed to S3, otherwise return to S1;
S3: and sending the express delivery notification information to the terminal.
Based on this, the above method further comprises: s4, the camera acquires second video data of the acquisition area;
S5: intercepting a second group of images F_2 from the second video data, analyzing each image, and detecting whether an express package is present in the designated sub-region of the image; if so, proceed to S6, otherwise return to S4;
and S6, sending the express delivery completion notification information to the terminal.
Further, the method further comprises the steps of:
s7, the camera acquires third video data of the acquisition area;
S8: intercepting a third group of images F_3 from the third video data, analyzing each image, and detecting whether the express package is still present in the designated sub-region of the image; if it is no longer present, proceed to S9, otherwise return to S7;
s9, sending the express package abnormal notification information to the terminal.
In steps S1–S3, a wireless network camera installed at the express deposit site collects continuous video for the various intelligent methods. The customer installs service software on a mobile phone or personal computer; the software communicates with the camera over the network to obtain the collected video and images and to start and stop the camera. Please refer to FIG. 1.
The express deposit site is a place, agreed by the customer and requiring monitoring, where packages are placed, such as a home or office doorway; it generally belongs to the customer's private premises. To use the invention and its device, the customer should have the right to photograph and record video of the site.
The wireless network camera refers to a general camera device with wireless network access, communication function and video image shooting and recording function, and can provide functions of other software systems for accessing data acquired by the camera in real time in the forms of secondary development and the like.
The service software is software installed on a general-purpose computing device, such as the customer's mobile phone or personal computer, that provides the monitoring and notification functions: it acquires video and images from the camera over the communication network, automatically analyzes the acquired data, and notifies the user of the analysis results in a preset content and format.
The images and video are acquired by the camera and converted into two- and three-dimensional digital matrices by preprocessing methods such as sampling and quantization. An image is a mapped subset of the video in the time dimension: several frames of images can be taken from a video, and a video can be synthesized from a group of frames.
The customers described herein refer to natural persons or organizations at home or in offices, and are generally recipients of express goods, and for various reasons, express delivery cannot be received in time, so that the methods and devices described herein have a use requirement. However, the methods described herein are not limited in theory or technology to the identity or role of the client.
In step S2, detecting couriers and general pedestrians in a single frame means automatically detecting both from a single-frame image and distinguishing the two; intuitively, a courier carries express packages by hand or on a carrier, while a pedestrian lacks these characteristics. The detection takes an image as input and outputs the set of image coordinates marking the region where a courier is located, or the set marking the region where a general pedestrian is located; each coordinate set is the set of vertex coordinates of a rectangle enclosed by four points.
A frame is intercepted from the video shot by the camera and used to detect whether a courier or a general pedestrian is present. The image is represented in digital form as a two-dimensional matrix:
I(u,v)
where I denotes the image, I(u,v) an element of the image's two-dimensional matrix (a pixel of the image), and u, v the coordinates of the pixel in image I.
One subset of the image matrix I is called the subgraph of I:
S I (u′,v′)
where S_I denotes a subgraph of I, S_I(u',v') a pixel in the subgraph, and u', v' the coordinates of the pixel in the subgraph. I is then called the original image.
Multiple subgraphs can be acquired from one image I, and the method comprises the following steps:
ss.a, give the size of the subgraph;
ss.b, give a step value;
ss.c, take the initial pixel of the complete image I as the reference pixel and, with the reference pixel as the cutting start point, cut a subgraph of the size given in step ss.a;
ss.d, move the reference pixel of step ss.c along the two independent directions of the image by the step value given in ss.b;
ss.e, with the new reference pixel as the starting point, repeat steps ss.c and ss.d until no further subgraph can be cut.
Through steps ss.a-ss.e, a number of subgraphs covering every position of the complete image can be cut, so that if an express-deliverer or ordinary-pedestrian target is present in the original image I, it can be detected from the subgraphs by an intelligent algorithm.
If the subgraph size is unsuitable, the size is reset and steps ss.a-ss.e are repeated to check again whether the target is present.
If, after all options have been tried, no target can be detected from any subgraph, the corresponding target is considered absent from the original image.
As a preferred configuration, the subgraph size is set here to 1/16, 1/12, 1/8 or 1/6 of the original image size, and the step value to 4 pixels in the original image.
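The cutting procedure of steps ss.a-ss.e amounts to a sliding-window scan. A minimal sketch, assuming the fractional sizes are taken along each axis; the function name `subgraphs`, the 48x64 image and its contents are illustrative, not from the patent:

```python
import numpy as np

def subgraphs(image, size, step):
    """Slide a size[0] x size[1] window over `image` with the given step
    (steps ss.a-ss.e), yielding each reference position and sub-graph."""
    h, w = image.shape
    sh, sw = size
    for top in range(0, h - sh + 1, step):
        for left in range(0, w - sw + 1, step):
            yield (top, left), image[top:top + sh, left:left + sw]

# Illustrative 48x64 image; windows of 1/8 the original size per axis,
# 4-pixel step, as in the preferred configuration.
img = np.zeros((48, 64), dtype=np.uint8)
windows = list(subgraphs(img, (48 // 8, 64 // 8), 4))
```

Every position of the image is covered because consecutive windows overlap by more than the 4-pixel step.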
A neural network model N1 is used to detect an express-deliverer or ordinary-pedestrian target in the subgraphs.
By the usual definition, a neural network model consists of an input layer, hidden layers and an output layer. There may be several hidden layers: the first hidden layer is a logical operation on the input layer, each subsequent hidden layer a logical operation on the previous one, and the output layer a logical operation on the last hidden layer. Each layer consists of a number of scalars, also called nodes; the index of a hidden layer is the number of nodes on the shortest path from that layer to the input layer. The logical relations between nodes are defined by connections, and there are no connections between nodes of the same layer index.
In the neural network model N1 described herein, the input layer is a subgraph S_I(u′, v′) and the output layer is a two-dimensional vector whose two components indicate, respectively, whether an express deliverer and whether an ordinary pedestrian is present in the input image.
The neural network model N1 is defined as follows.
Define the first hidden layer of the neural network model N1:

X_1(x, y) = σ( Σ_{p=-4..4} Σ_{q=-4..4} w_0(p, q) · S_I(u′+p, v′+q) + b_0 ) … (1)

where w_0(p, q) is the weight of the convolution window centred on (u′, v′) in the input layer, p, q are the integer coordinates of the relative position inside the window, the window size is 9×9, and p, q accordingly range from -4 to 4. S_I(u′+p, v′+q) is the pixel value of the input-layer subgraph at coordinates (u′+p, v′+q); X_1(x, y) is the node with coordinates (x, y) in the first hidden layer, defined by the window weights w_0(p, q) and connected to 9×9 nodes of the input layer. b_0 is a linear offset. σ(x) is a nonlinear function:

σ(x) = 1 / (1 + e^{-αx}) … (2)

where e^x is the exponential function, which enables the neural network to classify nonlinear data samples, and α is an empirical parameter, preferably α = 10. Compared with classical classification functions, this nonlinearity helps improve the classification performance of the model.
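The formula image for σ(x) is not reproduced in this text; given that it is built from the exponential function, maps onto values usable as probabilities in [0, 1] and is controlled by the empirical slope parameter α = 10, a steepened logistic is one plausible reading. The sketch below assumes that form; `sigma` and `ALPHA` are illustrative names:

```python
import math

ALPHA = 10.0  # empirical slope parameter, alpha = 10 as preferred in the text

def sigma(x, alpha=ALPHA):
    """Assumed form of the nonlinearity sigma(x): a logistic steepened by
    alpha, built from the exponential function and mapping onto (0, 1)."""
    return 1.0 / (1.0 + math.exp(-alpha * x))
```

With α = 10 the function saturates quickly, pushing node activations toward 0 or 1, which matches the thresholded readout of the output layer.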
Define the second hidden layer of the neural network N1:

X_2(x, y) = σ( max_{p,q ∈ {0,…,3}} X_1(4x+p, 4y+q) + b_1 ) … (3)

where X_2(x, y) is the node with coordinates (x, y) in the second hidden layer, connected to 4×4 = 16 nodes of the first hidden layer; max takes the maximum of the 16 first-hidden-layer nodes at the corresponding positions, the 16 nodes being indexed by p and q in the x and y directions, i.e. p and q take the values 0, 1, 2, 3. X_1(4x+p, 4y+q) is the node with coordinates (4x+p, 4y+q) in the first hidden layer. b_1 is a linear offset. σ(x) is defined by equation (2).
Define the third hidden layer of the neural network N1:

X_3(x, y) = σ( Σ_{p=-3..3} Σ_{q=-3..3} w_2(p, q) · X_2(x+p, y+q) + b_2 ) … (4)

where X_3(x, y) is the node with coordinates (x, y) in the third hidden layer and X_2(x+p, y+q) the node at coordinates (x+p, y+q) in the second hidden layer; w_2(p, q) is the weight of the convolution window, the window size is 7×7, p, q accordingly range from -3 to 3 and are the integer coordinates of the relative position inside the window. X_3(x, y) is connected, through the weights w_2(p, q), to 7×7 nodes of the second hidden layer. b_2 is a linear offset. σ(x) is the nonlinear function defined by equation (2).
Define the fourth hidden layer of the neural network N1:

X_4(x, y) = σ( max_{p,q ∈ {0,1}} X_3(2x+p, 2y+q) + b_3 ) … (5)

where X_4(x, y) is the node with coordinates (x, y) in the fourth hidden layer, connected to 2×2 = 4 nodes of the third hidden layer; max takes the maximum of the 4 third-hidden-layer nodes at the corresponding positions, the 4 nodes being indexed by p and q in the x and y directions, i.e. p and q take the values 0 and 1. X_3(2x+p, 2y+q) is the node with coordinates (2x+p, 2y+q) in the third hidden layer. b_3 is a linear offset. σ(x) is defined by equation (2).
Define the fifth hidden layer of the neural network N1:

X_5^i(x, y) = σ( Σ_{p=-3..3} Σ_{q=-3..3} w_4^i(p, q) · X_4(x+p, y+q) + b_4 ), i = 1, 2 … (6)

where X_5^1 and X_5^2 jointly denote the nodes of the fifth hidden layer, so that the fifth layer consists of two two-dimensional matrices of nodes; (x, y) are the coordinates in the x, y directions and the index i numbers the matrix. X_4(x+p, y+q) is the node at coordinates (x+p, y+q) in the fourth hidden layer, and w_4^1(p, q), w_4^2(p, q) are the convolution-window weights corresponding to X_5^1 and X_5^2 respectively; the window size is 7×7, p, q accordingly range from -3 to 3 and are the integer coordinates of the relative position inside the window. It can be seen that X_5^1(x, y) is connected, through the weights w_4^1(p, q), to 7×7 nodes of the fourth hidden layer, and X_5^2(x, y) likewise through w_4^2(p, q). b_4 is a linear offset. σ(x) is the nonlinear function defined by equation (2).
By the design of the neural network N1, the express-deliverer target to be detected and the ordinary-pedestrian target are treated jointly and their common features are extracted; in particular, the fifth hidden layer extracts the features that differ between the two targets, which helps to separate the target characteristics further and improves target-detection performance.
Define the output layer of N1:

ω_i = σ( Σ_{c,d} w_5^i(c, d) · X_5^i(c, d) + b_5 ), i = 1, 2 … (7)

where X_5^i(c, d) are the nodes of the fifth hidden layer and Ω = [ω_1, ω_2]^T the two nodes of the output layer; w_5^i(c, d) are the connection weights corresponding to X_5^i(c, d), and c, d have the same range as x, y, i.e. each weight w_5^1(c, d) corresponds to a node X_5^1(c, d) and each weight w_5^2(c, d) to a node X_5^2(c, d). b_5 is a linear offset. σ(x) is defined by equation (2).
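Taken together, the definitions (1)-(7) describe a small convolutional network. The sketch below is a numpy rendering of that forward pass with untrained random weights; the logistic form of σ, the 96x96 sub-graph size (chosen only so that every layer fits) and the shared zero biases are assumptions for illustration, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
ALPHA = 10.0

def sigma(x):
    return 1.0 / (1.0 + np.exp(-ALPHA * x))

def conv(x, w, b):
    """Valid convolution with a k x k window plus bias (eqs (1), (4), (6))."""
    k = w.shape[0]
    out = np.empty((x.shape[0] - k + 1, x.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w) + b
    return sigma(out)

def pool(x, k, b):
    """k x k max-pooling plus bias (eqs (3), (5))."""
    out = np.empty((x.shape[0] // k, x.shape[1] // k))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = x[i * k:(i + 1) * k, j * k:(j + 1) * k].max() + b
    return sigma(out)

def forward_n1(sub, w0, w2, w4a, w4b, w5a, w5b, b=0.0):
    x1 = conv(sub, w0, b)        # first hidden layer: 9x9 convolution
    x2 = pool(x1, 4, b)          # second hidden layer: 4x4 max pool
    x3 = conv(x2, w2, b)         # third hidden layer: 7x7 convolution
    x4 = pool(x3, 2, b)          # fourth hidden layer: 2x2 max pool
    # fifth hidden layer: two parallel 7x7 convolutions (eq (6))
    x5a, x5b = conv(x4, w4a, b), conv(x4, w4b, b)
    # output layer (eq (7)): one fully connected node per branch
    return sigma(np.sum(w5a * x5a) + b), sigma(np.sum(w5b * x5b) + b)

mk = lambda k: rng.standard_normal((k, k)) * 0.01
sub = rng.random((96, 96))
omega1, omega2 = forward_n1(sub, mk(9), mk(7), mk(7), mk(7), mk(2), mk(2))
```

With a 96x96 sub-graph the layer sizes come out as 88x88, 22x22, 16x16, 8x8 and 2x2, so the output weights w5a, w5b are 2x2.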
The output-layer nodes Ω = [ω_1, ω_2]^T of the neural network take values in [0, 1] and represent, respectively, the probability that an express deliverer and that an ordinary pedestrian is present in the input image. When ω_i (i ∈ {1, 2}) tends to 0, the corresponding target is absent; when it tends to 1, the target is present. To make the determination definite, a threshold is set: the target is judged present if ω_i > 0.5 and absent if ω_i ≤ 0.5, i ∈ {1, 2}.
the weight parameters and bias parameters of each layer of the neural network in the formulas (1) - (7) need to be obtained through training sample learning. Preparing a plurality of groups of training samples in advance, wherein the training samples are divided into three types, the first type is an image of an express dispatcher who dispatches packages, and the corresponding output is [1,0 ]]The second type is an image of a common pedestrian, and the corresponding output value is [0,1]The third class is an image containing neither express dispatcher nor general pedestrian, and the corresponding output value is [0,0 ]. Each group of training samples comprises images and output values thereof for training the neural network model N 1 . Calculating the output result of the given training sample input value according to the definitions of (1) - (7) and comparing the result with the labeled value of the training sample to obtain a comparison value, wherein the comparison value is defined as a cost functionThe number:
E(Ω̂, Ω) = ‖Ω̂ − Ω‖² / (‖Ω̂ − Ω‖² + θ) … (8)

where Ω̂ is the labelled output of the sample, Ω = [ω_1, ω_2]^T the estimate computed by the neural network model N1 from the input image, and ‖·‖ the vector norm. The parameter θ is a control parameter that helps improve the robustness of the model to noise. Preferably, θ = 0.045.
The extremum of the cost function (8) is found by back-propagation, which trains the neural network model N1 and determines the parameters of equations (1)-(7).
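The back-propagation training can be illustrated at toy scale. The sketch below substitutes a plain squared-error cost (the robust θ term of (8) is omitted) and a finite-difference gradient as a stand-in for analytic back-propagation; the single-weight model, learning rate and sample values are all illustrative:

```python
import math

def sigma(x):
    """Assumed logistic nonlinearity with alpha = 10."""
    return 1.0 / (1.0 + math.exp(-10.0 * x))

def cost(w, samples):
    """Squared-error stand-in for cost (8) over (input, label) pairs."""
    return sum((sigma(w * x) - t) ** 2 for x, t in samples)

samples = [(1.0, 1.0), (-1.0, 0.0)]  # (input, labelled output) pairs
w, lr, eps = 0.0, 0.5, 1e-6
for _ in range(200):
    # central finite difference approximates dE/dw
    grad = (cost(w + eps, samples) - cost(w - eps, samples)) / (2 * eps)
    w -= lr * grad  # gradient-descent step toward the cost extremum
```

In the patent the same descent is applied, via back-propagated analytic gradients, to every weight and bias of equations (1)-(7).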
With this model, the express-deliverer target and the ordinary-pedestrian target can be detected and distinguished, enabling the subsequent steps to screen events more accurately by target type.
Steps 4-6 form a method for detecting an express package at a specified position, i.e. automatically detecting from a frame whether an express package is present in a specified sub-area. The method takes as input an image together with a predetermined rectangular area [(u_1, v_1), (u_2, v_2), (u_3, v_3), (u_4, v_4)] marking the express-deposit position, where (u_1, v_1), (u_2, v_2), (u_3, v_3), (u_4, v_4) are the coordinates of the four vertices of the rectangular area in the image. It outputs a scalar ψ: ψ = 1 indicates that an express package is detected, and ψ = 0 that none is detected in the area.
Before an express package has been deposited at the designated position, a frame is captured from the camera video and its subgraph over the preset area is recorded as the reference image R(u′, v′), which contains no package.
Given the subgraph S_I(u′, v′) of another image I(u, v) to be judged over the same area, the difference is taken:

D(u′, v′) = |S_I(u′, v′) − R(u′, v′)| … (9)

i.e. the absolute value of the difference between the pixels at corresponding positions of the two subgraphs.
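Equation (9) is a per-pixel absolute difference. A minimal sketch; the 8x8 size and pixel values are illustrative:

```python
import numpy as np

def difference_map(sub, ref):
    """Eq (9): per-pixel absolute difference between the current sub-graph
    and the package-free reference image R (cast to avoid uint8 wraparound)."""
    return np.abs(sub.astype(np.int16) - ref.astype(np.int16)).astype(np.uint8)

ref = np.full((8, 8), 100, dtype=np.uint8)  # reference area without a package
cur = ref.copy()
cur[2:5, 2:5] = 30                          # a darker "package" appears
D = difference_map(cur, ref)
```

Only the pixels where the scene changed are non-zero, which is what lets the difference map suppress packages already present in the reference.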
A neural network model N2 is introduced, whose input layer is the difference map D(u′, v′) and whose output layer is a scalar ψ: ψ = 1 indicates that an express package is present in the input image, ψ = 0 that none is. Using the difference map as the network input effectively removes interference from other express packages: for example, a package already lying at the position before the new one arrives could otherwise cause a recognition error. This is one of the key points of the invention.
The neural network model N2 is defined as follows.
Define the first hidden layer of the neural network model N2:

Y_1(x, y) = σ( Σ_{p=-4..4} Σ_{q=-4..4} f_1(p, q) · D(u′+p, v′+q) + e_0 ) … (10)

where f_1(p, q) is the weight of the convolution window centred on (u′, v′) in the input layer, p, q are the integer coordinates of the relative position inside the window, the window size is 9×9, and p, q accordingly range from -4 to 4. D(u′+p, v′+q) is the pixel value of the input-layer difference subgraph at coordinates (u′+p, v′+q); Y_1(x, y) is the node with coordinates (x, y) in the first hidden layer, defined by the window weights f_1(p, q) and connected to 9×9 nodes of the input layer. e_0 is a linear offset. σ(x) is defined as in (2) of model N1.
Define the second hidden layer of the neural network N2:

Y_2(x, y) = σ( max_{p,q ∈ {0,…,3}} Y_1(4x+p, 4y+q) + e_1 ) … (11)

where Y_2(x, y) is the node with coordinates (x, y) in the second hidden layer, connected to 4×4 = 16 nodes of the first hidden layer; max takes the maximum of the 16 first-hidden-layer nodes at the corresponding positions, the 16 nodes being indexed by p and q in the x and y directions, i.e. p and q take the values 0, 1, 2, 3. Y_1(4x+p, 4y+q) is the node with coordinates (4x+p, 4y+q) in the first hidden layer. e_1 is a linear offset. σ(x) is defined by equation (2).
Define the output layer of N2:

ψ = σ( Σ_{c,d} g(c, d) · Y_2(c, d) + e_2 ) … (12)

where Y_2(c, d) are the nodes of the second hidden layer and ψ the node of the output layer; g(c, d) are the connection weights corresponding to Y_2(c, d), and c, d have the same range as x, y, i.e. each weight g(c, d) corresponds to a node Y_2(c, d). e_2 is a linear offset. σ(x) is defined by equation (2).
The output-layer node ψ of the neural network takes values in [0, 1] and indicates whether a package is present in the original image corresponding to the input difference map. When ψ tends to 0, no package is present; when ψ tends to 1, a package is present. To make the determination definite, a threshold is set: a package is judged present if ψ > 0.5 and absent if ψ ≤ 0.5.
As in step 2, training-sample images containing and not containing a package are prepared in advance, with labelled outputs 1 and 0 respectively, and the model N2 is trained to obtain the parameter values in equations (10)-(12). The cost function is defined as:

E(ψ̂, ψ) = (ψ̂ − ψ)² / ((ψ̂ − ψ)² + θ) … (13)

where ψ̂ is the labelled output of the training sample and ψ the estimate computed by the neural network model N2 from the input difference map. The parameter θ is a control parameter that helps improve the robustness of the model to noise. Preferably, θ = 0.065.
With this model, the presence of a package in the preset area can be detected and used as the basis for autonomous event discrimination in the subsequent steps.
Steps 7-9 form a method of monitoring and notification for express delivery and abnormal collection: after the customer starts the service software, it runs automatically in the background of the customer's device, receives the video data transmitted by the camera, performs autonomous detection by the methods above, judges from the detection results whether a monitored event has occurred, and notifies the customer when it has.
The events for which the customer is notified fall into three categories: (1) an express deliverer arrives at the door; (2) the express is deposited at the specified position; (3) the express is taken from the specified position. It is further agreed that, among these three categories, notification of event (2) occurs only after notification of event (1), and notification of event (3) likewise occurs only after notification of event (1).
To detect and notify these events autonomously, the relevant method captures several frames within a certain period, detects them autonomously, judges from the results whether the event has occurred, and initiates a notification when it has. The relevant methods and steps are described in detail herein in steps 2-6.
The service software runs on a general-purpose terminal device designated by the customer, such as a mobile phone or personal computer; the choice of terminal does not affect the method. Once installed, the software enters the running state on the customer's instruction and runs as a resident background service. While running, it automatically receives the video data transmitted by the camera, detects the relevant events, and notifies the customer when an event occurs.
The video-receiving, event-detection and notification functions of the service can be deployed on different terminal devices as required, to realize the service better. For example, placing the video-receiving and event-detection functions on a personal computer exploits its higher computing performance and improves data-processing and detection efficiency, while placing the notification function on the mobile phone allows the user to be notified promptly; since the phone then only initiates event notifications, mobile data usage and the associated cost are greatly reduced and communication efficiency is improved.
After the customer installs a camera, a rectangular area is preset in the service software to mark the deposit and monitoring area of the express.
The monitoring and notification method for express delivery is realized by judging whether the following events occur: event (1), the deliverer arrives at the door; event (2), the express is deposited at the specified position. The express-delivery event is the sequential combination of events (1) and (2).
The occurrence of event (1) is judged as follows:
S1.A. Within a period T_1, capture F_1 images from the video at equal intervals, detect the express deliverer and ordinary pedestrians in each image by the method of step 2, and record the detection results.
S1.B. If a deliverer is detected in all F_1 images of step S1.A, i.e. ω_1 > 0.5 in step 2, and no ordinary pedestrian is detected in at least one of the images, i.e. ω_2 ≤ 0.5 in step 2, event (1) is judged to have occurred.
S1.C. Otherwise, repeat steps S1.A-S1.B until event (1) occurs.
After event (1) has occurred according to steps S1.A, S1.B and S1.C, the service software of step 1 sends the customer a notification that an express deliverer has arrived at the door, and begins transmitting live video to the customer. The customer can view the live video through the service software and observe the delivery.
The parameters T_1 and F_1 are obtained empirically; based on extensive experiments, T_1 = 5 seconds and F_1 = 11 are preferred here.
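The decision rule S1.B can be sketched directly from the thresholds of step 2; the function name and sample scores are illustrative:

```python
def event1_occurs(detections, thr=0.5):
    """S1.B decision: detections holds (omega1, omega2) pairs for the F1
    frames. Event (1) occurs when every frame shows a deliverer
    (omega1 > thr) and at least one frame shows no ordinary pedestrian
    (omega2 <= thr)."""
    return (all(o1 > thr for o1, _ in detections)
            and any(o2 <= thr for _, o2 in detections))

# F1 = 11 frames sampled over T1 = 5 s, as in the preferred parameters.
frames = [(0.9, 0.2)] * 10 + [(0.8, 0.7)]
```

Requiring the deliverer in every frame while tolerating pedestrians in some frames reduces false alarms from passers-by.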
After event (1) occurs, a monitoring step of event (2) is entered.
The occurrence of event (2) is judged as follows:
S2.A. Event (1) has occurred.
S2.B. Within a period T_2, capture F_2 images from the video at equal intervals, detect the express package in each image by the method of step 3, and record the detection results.
S2.C. If in step S2.B more than H_2 images (H_2 < F_2) show a detected package, i.e. ψ > 0.5 in step 3, and these H_2 images are consecutive in capture order, event (2) is judged to have occurred.
S2.D. Otherwise, repeat steps S2.B and S2.C.
After events (1) and (2) have occurred according to steps S2.A, S2.B, S2.C and S2.D, the service software of step 1 sends the customer a notification that the delivery is complete.
The parameters T_2, F_2 and H_2 are obtained empirically; based on extensive experiments, T_2 = 10 seconds, F_2 = 6 and H_2 = 4 are preferred here.
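The consecutive-run condition of S2.C can be sketched as a longest-run check over the F_2 sampled frames; names and scores are illustrative, and "more than H_2" is read here as at least H_2:

```python
def longest_detection_run(psi_values, thr=0.5):
    """Length of the longest run of consecutive frames with psi > thr."""
    run = best = 0
    for psi in psi_values:
        run = run + 1 if psi > thr else 0
        best = max(best, run)
    return best

def event2_occurs(psi_values, h2=4, thr=0.5):
    """S2.C decision: at least h2 consecutive frames (h2 < F2) among the
    F2 sampled frames must show a detected package (psi > thr)."""
    return longest_detection_run(psi_values, thr) >= h2
```

The consecutiveness requirement filters out isolated false positives, e.g. a package briefly occluded or a shadow crossing the area.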
After events (1) and (2) have occurred, the service software stops transmitting the live video to the customer and proceeds to the monitoring step of event (3).
The occurrence of event (3) is judged as follows:
S3.A. Events (1) and (2) have occurred.
S3.B. Within a period T_3, capture F_3 images from the video at equal intervals, detect the express package in each image by the method of step 3, and record the detection results.
S3.C. If in step S3.B more than M_3 images (M_3 < F_3) show no detected package, i.e. ψ ≤ 0.5 in step 3, and these M_3 images are consecutive in capture order, event (3) is judged to have occurred.
S3.D. Otherwise, repeat steps S3.B and S3.C.
After event (3) has occurred according to steps S3.A, S3.B, S3.C and S3.D, the service software of step 1 sends the customer a notification that an express-package anomaly has been detected, together with the live video from a certain period (for example 60 seconds) before event (3) occurred.
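Sending video from before the event implies the service keeps a rolling buffer of recent frames. A minimal sketch with a fixed-length deque; the class name, 60-second window and 2 fps rate are illustrative assumptions:

```python
from collections import deque

class PreEventBuffer:
    """Keeps the most recent seconds * fps frames so that, when event (3)
    fires, the video from the preceding window (e.g. 60 s) can be sent."""
    def __init__(self, seconds=60, fps=2):
        self.frames = deque(maxlen=seconds * fps)  # old frames drop off

    def push(self, frame):
        self.frames.append(frame)

    def snapshot(self):
        return list(self.frames)

buf = PreEventBuffer(seconds=60, fps=2)  # 120 frames retained at most
for i in range(500):                     # simulate a long stream of frames
    buf.push(i)
clip = buf.snapshot()                    # the last 60 "seconds" of frames
```

Because `deque(maxlen=...)` discards the oldest entries automatically, memory stays bounded no matter how long the service runs.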
After the customer starts the service software, it runs automatically in the background of the customer's device, receives the video data transmitted by the camera, detects autonomously according to the above and preceding steps, autonomously judges monitored events such as express arrival and abnormal express collection, notifies the customer of these events, and activates the live-video viewing function. In this way, remote monitoring of the customer's express deliveries is realized and the customer's property is protected.
Table 1 shows the test results of the methods described herein in terms of two indicators: the recognition rate of express-arrival monitoring and the recognition rate of express-anomaly monitoring. The former is the probability that, after a deliverer deposits the express, the delivery event is autonomously and correctly judged and notified; the latter is the probability that, when someone takes the express from the site, the anomaly event is autonomously and correctly judged and notified. The response time is the time difference between the express being deposited or taken and the system autonomously reporting the event. The test results show that the method achieves a high recognition rate for both event types with a short response period, realizing an autonomous, intelligent remote express-monitoring function.
TABLE 1
In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions in accordance with the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by a wired (e.g., coaxial cable, fiber optic, digital subscriber line), or wireless (e.g., infrared, wireless, microwave, etc.). Computer readable storage media can be any available media that can be accessed by a computer or data storage devices, such as servers, data centers, etc., that contain an integration of one or more available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid State Disk), etc.
It will be appreciated that in addition to the foregoing, some conventional structures and conventional methods are included, and as such are well known, they will not be described in detail. But this does not mean that the structures and methods do not exist in the present invention.
It will be appreciated by those skilled in the art that while a number of exemplary embodiments of the invention have been shown and described herein in detail, many other variations or modifications which are in accordance with the principles of the invention may be directly ascertained or inferred from the present disclosure without departing from the spirit and scope of the invention. Accordingly, the scope of the present invention should be understood and deemed to cover all such other variations or modifications.
Claims (6)
1. The on-site monitoring method for the express storage is characterized by comprising the following steps of:
s1: the camera acquires first video data of an acquisition area;
s2: intercepting a first group of images F1 from the first video data, analyzing each image, obtaining a plurality of subgraphs of each image, detecting whether an express delivery person exists, if yes, entering S3, otherwise, returning to S1;
s3: sending express delivery notification information to a terminal;
using a neural network model N1 to detect in the subgraphs whether the express deliverer is present,
wherein in the neural network model N1 the input layer is a subgraph S_I and the output layer is a two-dimensional vector whose two components indicate, respectively, whether an express deliverer and whether an ordinary pedestrian is present in the input image;
the neural network model N1 is defined as follows:
define the first hidden layer of the neural network model N1:

X_1(x, y) = σ( Σ_{p=-4..4} Σ_{q=-4..4} w_0(p, q) · S_I(u′+p, v′+q) + b_0 ) … (1)

where w_0(p, q) is the weight of the convolution window centred on (u′, v′) in the input layer, p, q are the integer coordinates of the relative position inside the window, the window size is 9×9 and p, q accordingly range from -4 to 4; S_I(u′+p, v′+q) is the pixel value of the input-layer subgraph at coordinates (u′+p, v′+q); X_1(x, y) is the node with coordinates (x, y) in the first hidden layer, defined by the window weights w_0(p, q) and connected to 9×9 nodes of the input layer; b_0 is a linear offset and σ(x) is a nonlinear function:

σ(x) = 1 / (1 + e^{-αx}) … (2)

where e^x is the exponential function, which enables the neural network to classify nonlinear data samples, and α is an empirical parameter;
define the second hidden layer of the neural network N1:

X_2(x, y) = σ( max_{p,q ∈ {0,…,3}} X_1(4x+p, 4y+q) + b_1 ) … (3)

where X_2(x, y) is the node with coordinates (x, y) in the second hidden layer, connected to 4×4 = 16 nodes of the first hidden layer; max takes the maximum of the 16 first-hidden-layer nodes at the corresponding positions, the 16 nodes being indexed by p and q in the x and y directions, i.e. p and q take the values 0, 1, 2, 3; X_1(4x+p, 4y+q) is the node with coordinates (4x+p, 4y+q) in the first hidden layer; b_1 is a linear offset; σ(x) is defined by equation (2);
define the third hidden layer of the neural network N1:

X_3(x, y) = σ( Σ_{p=-3..3} Σ_{q=-3..3} w_2(p, q) · X_2(x+p, y+q) + b_2 ) … (4)

where X_3(x, y) is the node with coordinates (x, y) in the third hidden layer and X_2(x+p, y+q) the node at coordinates (x+p, y+q) in the second hidden layer; w_2(p, q) is the weight of the convolution window, the window size is 7×7, p, q accordingly range from -3 to 3 and are the integer coordinates of the relative position inside the window; X_3(x, y) is connected, through the weights w_2(p, q), to 7×7 nodes of the second hidden layer; b_2 is a linear offset; σ(x) is a nonlinear function as defined by equation (2);
define the fourth hidden layer of the neural network N1:

X_4(x, y) = σ( max_{p,q ∈ {0,1}} X_3(2x+p, 2y+q) + b_3 ) … (5)

where X_4(x, y) is the node with coordinates (x, y) in the fourth hidden layer, connected to 2×2 = 4 nodes of the third hidden layer; max takes the maximum of the 4 third-hidden-layer nodes at the corresponding positions, the 4 nodes being indexed by p and q in the x and y directions, i.e. p and q take the values 0 and 1; X_3(2x+p, 2y+q) is the node with coordinates (2x+p, 2y+q) in the third hidden layer; b_3 is a linear offset; σ(x) is defined by equation (2);
define the fifth hidden layer of the neural network N1:

X_5^i(x, y) = σ( Σ_{p=-3..3} Σ_{q=-3..3} w_4^i(p, q) · X_4(x+p, y+q) + b_4 ), i = 1, 2 … (6)

where X_5^1 and X_5^2 jointly denote the nodes of the fifth hidden layer, so that the fifth layer consists of two two-dimensional matrices of nodes; (x, y) are the coordinates in the x, y directions and the index i numbers the matrix; X_4(x+p, y+q) is the node at coordinates (x+p, y+q) in the fourth hidden layer, and w_4^1(p, q), w_4^2(p, q) are the convolution-window weights corresponding to X_5^1 and X_5^2 respectively; the window size is 7×7, p, q accordingly range from -3 to 3 and are the integer coordinates of the relative position inside the window, so that X_5^1(x, y) is connected, through the weights w_4^1(p, q), to 7×7 nodes of the fourth hidden layer, and X_5^2(x, y) likewise through w_4^2(p, q); b_4 is a linear offset; σ(x) is a nonlinear function as defined by equation (2);
by the design of the neural network N1, the express-deliverer target to be detected and the ordinary-pedestrian target are treated jointly and their common features are extracted; the fifth hidden layer extracts the features that differ between the express-deliverer target and the ordinary-pedestrian target, further separating the target characteristics;
define the output layer of N1:

ω_i = σ( Σ_{c,d} w_5^i(c, d) · X_5^i(c, d) + b_5 ), i = 1, 2 … (7)

where X_5^i(c, d) are the nodes of the fifth hidden layer and Ω = [ω_1, ω_2]^T the two nodes of the output layer; w_5^i(c, d) are the connection weights corresponding to X_5^i(c, d), and c, d have the same range as x, y, i.e. each weight w_5^1(c, d) corresponds to a node X_5^1(c, d) and each weight w_5^2(c, d) to a node X_5^2(c, d); b_5 is a linear offset; σ(x) is defined by equation (2);
the output-layer nodes Ω = [ω_1, ω_2]^T of the neural network take values in [0, 1] and represent, respectively, the probability that an express deliverer and that an ordinary pedestrian is present in the input image; when ω_i (i ∈ {1, 2}) tends to 0 the corresponding target is absent, and when it tends to 1 the target is present; to make the determination definite, a threshold is set: the target is judged present if ω_i > 0.5 and absent if ω_i ≤ 0.5;
the weight and bias parameters of each layer of the neural network in equations (1)-(7) must be obtained by learning from training samples; several groups of training samples are prepared in advance and divided into three classes: the first class are images of an express deliverer delivering a package, with labelled output [1, 0]; the second class are images of ordinary pedestrians, with labelled output [0, 1]; the third class are images containing neither an express deliverer nor an ordinary pedestrian, with labelled output [0, 0]; each group of training samples comprises images and their output values and is used to train the neural network model N1; for a given training-sample input, the output is calculated according to definitions (1)-(7) and compared with the labelled value of the sample, giving a comparison value defined as the cost function:

E(Ω̂, Ω) = ‖Ω̂ − Ω‖² / (‖Ω̂ − Ω‖² + θ) … (8)

where Ω̂ is the labelled output of the sample, Ω = [ω_1, ω_2]^T the estimate computed by the neural network model N1 from the input image, and ‖·‖ the vector norm; the parameter θ is a control parameter that improves the robustness of the model to noise;
the extremum of the cost function (8) is solved by back-propagation, which trains the neural network model N1 and determines the parameters of equations (1)-(7);
the method further comprises: S4, the camera acquires second video data of the acquisition area;
S5, intercepting a second group of images F_2 from the second video data, analyzing each image and detecting whether an express package is present in the designated sub-area of the image; if so, proceeding to S6, otherwise returning to S4;
S6, sending express-delivery-completion notification information to the terminal;
The detection specifically comprises: the input is an image containing a predetermined rectangular area [(u1, v1), (u2, v2), (u3, v3), (u4, v4)] marking the express storage location, where (u1, v1), (u2, v2), (u3, v3), (u4, v4) are the coordinates of the four vertices of the rectangular area in the image; the output is a scalar value ψ, where ψ = 1 indicates an express package has been detected in the area and ψ = 0 indicates no express package has been detected;
While no express package has yet been delivered to the designated position, a frame is captured from the video shot by the camera, and the sub-image of this frame within the preset area, containing no express package, is recorded as the reference image R(u′, v′);
Given the sub-image S_I(u′, v′) of another image I(u, v) to be judged within the set area, the difference is taken:

D(u′, v′) = |S_I(u′, v′) − R(u′, v′)| … (9)

which represents the absolute value of the pixel-wise difference between the two sub-images at corresponding positions;
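Equation (9) maps directly to array code. One practical detail worth noting: 8-bit images must be widened to a signed type before subtracting, or the difference underflows.

```python
import numpy as np

def difference_map(sub_img, reference):
    """Per-pixel absolute difference (equation (9)) between the sub-image
    under test and the package-free reference sub-image."""
    s = np.asarray(sub_img, dtype=np.int16)   # widen to avoid uint8 underflow
    r = np.asarray(reference, dtype=np.int16)
    return np.abs(s - r).astype(np.uint8)
```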
A neural network model N2 is introduced. Its input layer is the difference map D(u′, v′) and its output layer is a scalar ψ: ψ = 1 indicates that an express package is present in the input image, and ψ = 0 indicates that none is present.
Neural network model N2 is defined as follows:
Define the first hidden layer of neural network model N2:
where f1(p, q) denotes the weights of the convolution window centered at (u′, v′) in the input layer, and p and q are the integer coordinates of the relative position within the window; the window size is 9×9, so p and q each range from −4 to 4; D(u′+p, v′+q) denotes the pixel value of the input-layer difference sub-image at coordinates (u′+p, v′+q). Each node with coordinates (x, y) in the first hidden layer is defined by the window parameters f1(p, q) and is connected to 9×9 nodes of the input layer; e0 is a linear offset; σ(x) is the same as in definition (2) of model N1;
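The first hidden layer just described can be sketched as an explicit double loop over the 9×9 window. Equation (10) itself is an image that did not survive extraction, so the form σ(Σ f1(p,q)·D(u′+p, v′+q) + e0) is reconstructed from the surrounding prose; f1 and e0 are placeholders for the learned parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden1_node(D, x, y, f1, e0=0.0):
    """One first-hidden-layer node: 9x9 convolution window centered at
    (x, y) in the difference map D, with weights f1 (indexed 0..8 for
    p, q in -4..4) and linear offset e0, passed through sigma."""
    acc = 0.0
    for p in range(-4, 5):
        for q in range(-4, 5):
            acc += f1[p + 4, q + 4] * D[x + p, y + q]
    return sigmoid(acc + e0)
```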
Define the second hidden layer of neural network N2:
where each node with coordinates (x, y) in the second hidden layer is connected to 4×4 = 16 nodes of the first hidden layer, and max takes the maximum of those 16 first-hidden-layer nodes at the corresponding positions; the 16 nodes are indexed by p and q in the x and y directions, with p and q each ranging over 0, 1, 2, 3, i.e. they are the first-hidden-layer nodes with coordinates (4x+p, 4y+q); e1 is a linear offset; σ(x) is defined by formula (2);
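This second layer is a 4×4 max-pooling step. Equation (11) is likewise an image lost in extraction, so the exact composition σ(max + e1) below is reconstructed from the prose (the text names max, the offset e1, and σ together) and should be read as an assumption.

```python
import numpy as np

def hidden2_node(H1, x, y, e1=0.0):
    """One second-hidden-layer node: maximum over the 16 first-layer
    nodes (4x+p, 4y+q), p, q in 0..3, with offset e1 and sigmoid."""
    block = H1[4 * x:4 * x + 4, 4 * y:4 * y + 4]
    return 1.0 / (1.0 + np.exp(-(block.max() + e1)))
```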
Define the output layer of N2:
where the nodes of the second hidden layer are fully connected to the output-layer node ψ: each second-hidden-layer node has a corresponding connection weight, whose indices c, d have the same value ranges as x, y; e2 is a linear offset; σ(x) is defined by formula (2);
The output-layer node ψ of the neural network takes values in [0, 1] and indicates whether a package is present in the original image corresponding to the input difference image: when ψ tends to 0 no package is present, and when ψ tends to 1 a package is present. To make the determination definite, a threshold is set as follows:
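The output layer and its thresholding can be sketched as a weighted sum over all second-layer nodes followed by the sigmoid. Equation (12) and the threshold formula are images lost in extraction; the weights g, offset e2, and the 0.5 cut-off are placeholders/assumptions consistent with the prose and with the ψ ≤ 0.5 test used for event (3).

```python
import numpy as np

def output_psi(H2, g, e2=0.0):
    """Output node of N2: full connection over all second-hidden-layer
    nodes H2 with weights g (same shape), offset e2, sigmoid, and an
    assumed 0.5 decision threshold. Returns (psi, 0-or-1 decision)."""
    z = float(np.sum(g * H2)) + e2
    psi = 1.0 / (1.0 + np.exp(-z))
    return psi, int(psi > 0.5)
```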
Training sample images containing a package and containing no package are prepared in advance, with labeled output values of 1 and 0 respectively, and model N2 is trained to obtain the values of the parameters in formulas (10)-(12); the cost function is defined as follows:
where ψ̂ denotes the labeled output value of a training sample, ψ denotes the estimated output computed from the input difference image by neural network model N2, and θ is a control parameter;
Through the above steps, it is judged whether the following events occur: event (1), the express courier delivers to the door; event (2), the package is placed at the designated position;
After events (1) and (2) have occurred, the service software stops pushing live video to the client and enters the monitoring step for event (3);
the occurrence condition of the event (3) is judged as follows:
A. Events (1) and (2) have already occurred;
B. Within a time period T3, F3 images are captured from the video at equal intervals, each image is tested by the above detection method, and the results are recorded;
C. If more than M3 of the images in step B fail to detect the package, i.e. ψ ≤ 0.5, and those M3 images were captured consecutively, event (3) is determined to have occurred;
D. Otherwise, steps B and C are repeated;
When event (3) occurs, a notification of abnormal express-package monitoring is sent to the client, together with the live video of the 60 seconds preceding event (3).
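The consecutive-failure test in steps A-D reduces to scanning the per-frame ψ values for a run of sub-threshold detections. A minimal sketch, with frame capture abstracted away:

```python
def event3_occurred(psi_values, m3):
    """Return True if at least m3 consecutive frames fail the package
    check (psi <= 0.5), per step C of the event (3) judgment."""
    run = 0
    for psi in psi_values:
        run = run + 1 if psi <= 0.5 else 0
        if run >= m3:
            return True
    return False
```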
2. The monitoring method of claim 1, wherein the method further comprises:
S7. The camera acquires third video data of the acquisition area;
S8. A third group of images F3 is captured from the third video data, and each image is analyzed to detect whether an express package is present in the designated sub-area of the image; if so, proceed to S9, otherwise return to S7;
S9. Express-package-abnormality notification information is sent to the terminal.
3. The monitoring method according to claim 1, wherein the method for obtaining the subgraph comprises, for each complete image I:
SS.A. The size of the sub-image is given;
SS.B. A step value is given;
SS.C. Taking the initial pixel of the complete image I as the reference pixel and the reference pixel as the cutting start point, a sub-image of the size given in step SS.A is cut;
SS.D. The reference pixel of step SS.C is moved along the two independent directions of the image by the step value given in SS.B;
SS.E. With the new reference pixel as the starting point, steps SS.C and SS.D are repeated until no new sub-image can be cut.
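The extraction procedure of steps SS.A-SS.E is a standard sliding-window crop. A minimal sketch, assuming the same step value is used in both directions:

```python
def extract_subgraphs(I, win_h, win_w, step):
    """Slide a win_h x win_w window over image I (list of pixel rows)
    with the given step in both directions, cutting a sub-image at each
    reference pixel until no complete window fits."""
    H, W = len(I), len(I[0])
    subs = []
    for top in range(0, H - win_h + 1, step):
        for left in range(0, W - win_w + 1, step):
            subs.append([row[left:left + win_w] for row in I[top:top + win_h]])
    return subs
```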
4. The monitoring method of claim 1, wherein the capturing is capturing a plurality of images from the video data at equal intervals over a period of time.
5. The monitoring method of claim 1, wherein the courier is determined to be present if the number of images in which the courier is detected exceeds a first preset value and at least one of those images contains no general pedestrian.
6. The monitoring method of claim 1, wherein packages are determined to be present if the number of images in which an express package is detected exceeds a second preset value and those images were captured in consecutive order.
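The decision rules of claims 5 and 6 can be sketched directly. The helper names and the frame-index representation are illustrative, not from the patent:

```python
def courier_present(courier_flags, pedestrian_flags, first_preset):
    """Claim 5: courier detected in more than first_preset frames, and
    at least one frame contains no general pedestrian."""
    return (sum(courier_flags) > first_preset
            and any(not p for p in pedestrian_flags))

def package_present(package_frame_indices, second_preset):
    """Claim 6: package detected in more than second_preset frames, and
    those frames were captured in consecutive order."""
    consecutive = all(b - a == 1 for a, b in
                      zip(package_frame_indices, package_frame_indices[1:]))
    return len(package_frame_indices) > second_preset and consecutive
```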
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210282392.XA CN114663834B (en) | 2022-03-22 | 2022-03-22 | On-site monitoring method for express storage |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114663834A CN114663834A (en) | 2022-06-24 |
CN114663834B true CN114663834B (en) | 2023-04-28 |
Family
ID=82031729
Legal Events

Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |