CN110796412B - Parcel tracking method and related device - Google Patents

Parcel tracking method and related device

Info

Publication number
CN110796412B
Authority
CN (China)
Prior art keywords
image, target, detected, area, offset
Prior art date
Legal status
Active
Application number
CN201911037111.9A
Other languages
Chinese (zh)
Other versions
CN110796412A
Inventor
付建海
赵蕾
熊剑平
Current Assignee
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN201911037111.9A
Publication of CN110796412A
Application granted
Publication of CN110796412B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/083 Shipping
    • G06Q10/0833 Tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/48 Matching video sequences

Abstract

The application discloses a parcel tracking method and a related device. The parcel tracking method comprises the following steps: acquiring video data obtained by scanning a parcel conveying area with a scanning device; taking an image in which a target parcel is detected in the video data as a target image, and taking the next frame after the target image in the video data as an image to be detected; determining, based on the position of the target parcel in the target image, the tracking areas corresponding to the target parcel in the target image and in the image to be detected; training a detection model based on the image data of the tracking area of the target image; and detecting the tracking area of the image to be detected with the detection model to obtain the target area corresponding to the target parcel in the image to be detected. This scheme can improve parcel detection efficiency.

Description

Parcel tracking method and related device
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a package tracking method and a related apparatus.
Background
As modern society pays increasing attention to travel safety, security inspection at stations, airports, customs and other places has been strengthened. Under these circumstances, parcels of all sizes, such as shopping bags and luggage cases, need to be strictly inspected.
At present, detecting whether prohibited articles exist in a parcel relies either on manual identification, which suffers from low efficiency, or on an image-based detection model applied to images scanned by devices such as an X-ray machine or a millimeter-wave security scanner to determine whether target parcels containing prohibited articles exist. However, during security inspection a parcel often needs to stay for a certain time in the parcel conveying area, such as a conveyor belt. To determine whether the parcels conveyed in the parcel conveying area are target parcels containing prohibited articles, detection must be performed continuously during this period, so the same parcel is detected repeatedly, which affects detection efficiency. In view of this, how to improve parcel detection efficiency is an urgent problem to be solved.
Disclosure of Invention
The technical problem mainly solved by the application is to provide a parcel tracking method and a related device, which can improve parcel detection efficiency.
In order to solve the above problems, a first aspect of the present application provides a package tracking method, including: acquiring video data obtained by scanning a parcel conveying area by a scanning device; taking an image of the target package detected in the video data as a target image, and taking a next frame image of the target image in the video data as an image to be detected; determining a tracking area corresponding to the target package in the target image and the image to be detected based on the position of the target package in the target image; training based on image data of a tracking area of a target image to obtain a detection model; and detecting the tracking area of the image to be detected by using the detection model to obtain a target area corresponding to the target package in the image to be detected.
In order to solve the above problem, a second aspect of the present application provides a package tracking apparatus, including a memory and a processor coupled to each other, the processor being configured to execute program instructions stored in the memory to implement the package tracking method in the first aspect.
In order to solve the above problems, a third aspect of the present application provides a storage device storing program instructions executable by a processor, the program instructions being configured to implement the package tracking method in the first aspect.
According to the scheme, the image in which the target parcel is detected, within the video data obtained by scanning the parcel conveying area with the scanning device, is taken as the target image, and the next frame after the target image in the video data is taken as the image to be detected. The tracking areas corresponding to the target parcel in the target image and in the image to be detected are determined based on the position of the target parcel in the target image; a detection model is trained on the image data of the tracking area of the target image; and the detection model is used to detect the tracking area of the image to be detected, obtaining the target area corresponding to the target parcel in the image to be detected. Continuous tracking of the target parcel is thus achieved, and the target parcel need not be detected anew in subsequent images of the video data, so parcel detection efficiency can be improved.
Drawings
FIG. 1 is a schematic flow chart diagram of one embodiment of a package tracking method of the present application;
FIG. 2 is a schematic diagram of an embodiment of a target image and an image to be detected;
FIG. 3 is a diagram illustrating an embodiment of cyclic shift processing performed on image data of a tracking area of a target image;
FIG. 4 is a diagram illustrating an embodiment of cyclic shift processing performed on image data of a tracking area of an image to be detected;
FIG. 5 is a schematic flow chart diagram of another embodiment of a package tracking method of the present application;
FIG. 6 is a flowchart illustrating an embodiment of step S14 in FIG. 1;
FIG. 7 is a flowchart illustrating an embodiment of step S142 in FIG. 6;
FIG. 8 is a block diagram of an embodiment of a training sample image;
FIG. 9 is a flowchart illustrating an embodiment of step S15 in FIG. 1;
FIG. 10 is a flowchart illustrating an embodiment of step S154 in FIG. 9;
FIG. 11 is a block diagram of the framework of one embodiment of the package tracking device of the present application;
FIG. 12 is a block diagram of another embodiment of the package tracking device of the present application;
FIG. 13 is a block diagram of an embodiment of a memory device according to the present application.
Detailed Description
The following describes in detail the embodiments of the present application with reference to the drawings attached hereto.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, interfaces, techniques, etc. in order to provide a thorough understanding of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein is merely an association describing an associated object, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter associated objects are in an "or" relationship. Further, the term "plurality" herein means two or more than two.
Referring to fig. 1, fig. 1 is a schematic flow chart diagram of an embodiment of a package tracking method of the present application. Specifically, the method may include the steps of:
step S11: and acquiring video data obtained by scanning the parcel conveying area by the scanning device.
In this embodiment, the scanning device may include, but is not limited to, an X-ray machine or a millimeter-wave security scanner. The parcel conveying area may be the area of a conveyor belt scanned by the scanning device; the conveyor belt carries the parcel so that it moves in the conveying direction, and the scanning device thus scans images of the parcel at different positions, these images constituting the video data.
The image in the video data may be a color image. In one implementation scenario, the image in the video data may be a grayscale image in order to reduce subsequent processing loads. In addition, the video data in this embodiment may be in MP4 format, AVI format, or the like, and this embodiment is not limited in this respect.
In this embodiment, the video data may be acquired from the scanning device by an electrical connection manner, or the video data scanned by the scanning device may be acquired by a wireless connection with the scanning device.
Step S12: and taking the image of the detected target parcel in the video data as a target image, and taking the next frame image of the target image in the video data as an image to be detected.
In this embodiment, the target package may be set according to a specific application scenario. In one implementation scenario, the target parcel may be a parcel containing contraband items such as knives, batteries, etc.; in another implementation scenario, the target package may also be a package containing a large number of repeated items, for example, a package containing a large number of cosmetics of the same type, or a package containing a large number of electronic products of the same type, which is not illustrated here.
In this embodiment, the target parcel may be identified through manual inspection, and the image in which the target parcel is identified may be taken as the target image. Alternatively, the images in the video data may be detected with a preset target parcel detection model, so that the image in which the target parcel is detected for the first time is taken as the target image and the next frame after the target image as the image to be detected. For example, suppose the video data includes video frames {f1, f2, f3, …, fn}. If the target parcel is first detected in video frame fi, then fi is taken as the target image and fi+1 as the image to be detected.
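As a rough, non-authoritative sketch of this frame-pairing logic, the following Python fragment (using OpenCV) scans the video until a detector callback fires and returns the target image together with the next frame. Here detect_target_parcel is an assumed placeholder for manual labeling or a pre-trained detector, not something defined by the patent.

```python
import cv2


def find_first_target(video_path, detect_target_parcel):
    """Scan frames until a target parcel is detected.

    detect_target_parcel(frame) is an assumed callback returning a
    bounding box (x, y, w, h) when a target parcel is found, else None.
    Returns (target_image, image_to_detect, box), or None at video end.
    """
    cap = cv2.VideoCapture(video_path)
    prev_frame, prev_box = None, None
    while True:
        ok, frame = cap.read()
        if not ok:
            cap.release()
            return None  # no detection before the video ended
        if prev_box is not None:
            cap.release()
            return prev_frame, frame, prev_box  # f_i and f_{i+1}
        box = detect_target_parcel(frame)
        if box is not None:
            prev_frame, prev_box = frame, box
```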
Step S13: and determining the tracking areas of the target package in the target image and the image to be detected based on the position of the target package in the target image.
In this embodiment, a target area containing the target parcel in the target image may be determined first. When the target image is the first image in the video data in which the target parcel is detected, the target area may be determined manually: for example, a base point in the target image is chosen manually, a rectangular frame containing the target parcel is drawn from that base point, and the rectangular frame is taken as the target area. The target area containing the target parcel may also be detected automatically from the foreground and background of the target image; this embodiment is not specifically limited here. Alternatively, after the target area corresponding to the target parcel in the image to be detected has been determined through the following steps of this embodiment, in order to keep tracking the target parcel in subsequent images of the video data, the image to be detected may be taken as the new target image and its following frame as the new image to be detected, with the target area of the old image to be detected used directly as the target area of the new target image.
After the target area containing the target parcel in the target image is determined, the target area can be enlarged, and the enlarged target area is taken as the tracking area in both the target image and the image to be detected; the tracking area in the target image and the tracking area in the image to be detected then have the same size and position. Specifically, with the center of the target area as the base point, the target area may be enlarged by a preset multiple, the enlarged target area taken as the tracking area of the target image, and the area of the image to be detected with the same position and size as the tracking area of the target image taken as the tracking area of the image to be detected. In one implementation scenario, there may also be a certain offset between the position of the tracking area in the target image and that in the image to be detected; this offset must then be taken into account when the target area of the image to be detected is subsequently determined, which is not elaborated further here.
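A minimal sketch of the enlargement step, assuming axis-aligned (x, y, w, h) boxes; the clamping to the image border is a practical assumption rather than something the patent specifies:

```python
def expand_box(box, scale, img_w, img_h):
    """Enlarge (x, y, w, h) by `scale` about its center and clamp
    the result to the image, yielding the tracking area."""
    x, y, w, h = box
    cx, cy = x + w / 2.0, y + h / 2.0
    nw, nh = w * scale, h * scale
    nx = max(0.0, cx - nw / 2.0)
    ny = max(0.0, cy - nh / 2.0)
    nw = min(nw, img_w - nx)
    nh = min(nh, img_h - ny)
    return int(nx), int(ny), int(nw), int(nh)
```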
Referring to fig. 2, fig. 2 is a schematic diagram of an embodiment of a target image and an image to be detected. As shown in fig. 2, the target area is determined in the target image fi of the video data in which the target parcel was first detected, and is enlarged to obtain the tracking area (solid-line frame in the figure); the tracking area of the target image fi and the tracking area of the next frame, the image to be detected fi+1, have the same position and size. After the target area of the image to be detected fi+1 is determined through the following steps, fi+1 may be taken as the new target image and its next frame as the new image to be detected fi+2. At this point, the tracking areas of the new target image fi+1 and the new image to be detected fi+2 (dotted-line frame in the figure) may be determined in the same way based on the target area of fi+1, these tracking areas again having the same position and size. When the new image to be detected fi+2 also has subsequent images, the same reasoning applies; this embodiment does not illustrate them one by one here.
In addition, for clarity, only one target parcel is shown in the target image and the image to be detected in fig. 2; in a specific application there may be 2, 3, 4 or more target parcels in the target image, and this embodiment is not specifically limited here.
Step S14: and training based on the image data of the tracking area of the target image to obtain a detection model.
Training is performed based on the image data of the tracking area of the target image, for example with a Support Vector Machine (SVM), to obtain the detection model.
In one implementation scenario, in order to enrich the sample data required for training, cyclic shift processing may be performed on the image data of the tracking area, so that a large amount of sample data is obtained. Referring to fig. 3, fig. 3 is a schematic diagram of an embodiment of performing cyclic shift processing on the image data of the tracking area of a target image. As shown in fig. 3, cyclic shift processing is performed on the image data Pi of the tracking area of the target image, obtaining a plurality of sample data Pi^k with different offset directions and offset amounts relative to the image data of the tracking area of the target image. Only part of the sample data is shown in fig. 3; the other sample data are analogous and are not illustrated one by one here. The sample datum Pi^0 in fig. 3, which is not offset relative to the image data of the tracking area of the target image, may be used as a positive sample, and the other, offset sample data may be used as negative samples. In addition, in order to avoid an imbalance between the numbers of positive and negative samples, the non-offset sample datum may be duplicated many times; this embodiment is not limited in this respect. For one-dimensional image data, cyclic shift processing shifts every pixel of the image data by a number of pixel positions in a given direction (such as left or right); each further shift moves the data by additional pixel positions on top of the previous shift, continuing cyclically in the same way. For two-dimensional image data, cyclic shift processing is, in short, the one-dimensional shift applied in two directions instead of one, such as right and down, so that sample data with different offset directions and offset amounts are obtained.
In another implementation scenario, in order to enrich sample data required during training, the image data of the entire target package may be shifted toward different shift directions by different shift amounts in the tracking area, so that a plurality of sample data may be directly obtained, which is not limited in this embodiment.
In one implementation scenario, in order to determine the target area of the target parcel in the image to be detected more accurately, feature extraction may further be performed after the plurality of sample data are obtained, yielding feature images corresponding to the plurality of sample data; for example, grayscale feature extraction, HOG (Histogram of Oriented Gradients) feature extraction, or CN (Color Names) feature extraction may be performed, or two or more of these features may be fused. This embodiment is not limited in this respect.
Step S15: and detecting the tracking area of the image to be detected by using the detection model to obtain a target area corresponding to the target package in the image to be detected.
In one implementation scenario, when the detection model is used to detect the tracking area of the image to be detected, the image data of the tracking area of the image to be detected may likewise be subjected to cyclic shift processing to obtain a plurality of sample data, which are then each detected by the detection model. Referring to fig. 4, fig. 4 is a schematic diagram of an embodiment of performing cyclic shift processing on the image data of the tracking area of an image to be detected. As shown in fig. 4, cyclic shift processing is performed on the image data Qi of the tracking area of the image to be detected, obtaining a plurality of sample data Qi^k with different offset amounts and offset directions. Fig. 4 shows only part of the sample data; the other sample data are analogous and are not illustrated one by one here. After the plurality of sample data are obtained, they are detected with the detection model, for example by correlation value calculation, which yields a correlation value for each sample datum Qi^k. The offset direction and offset amount of the sample datum with the largest correlation value then give the target area: for example, the target area in the image to be detected may be obtained by shifting the target area in the target image by the obtained offset amount in the direction opposite to the obtained offset direction; or, in order to reduce errors as far as possible, the obtained offset may be decomposed along the conveying direction of the parcel conveying area to obtain the component of the offset in that direction, and the target area in the target image shifted by this component along the conveying direction to obtain the target area in the image to be detected. This embodiment is not limited in this respect.
In one implementation scenario, in order to determine the target area of the target parcel in the image to be detected more accurately, after the plurality of sample data are obtained and before detection with the detection model, feature extraction may be performed on the plurality of sample data obtained by shifting the image data of the tracking area of the image to be detected, yielding a plurality of feature images corresponding to the plurality of sample data, and the detection model is used to detect these feature images. Feature extraction may include, but is not limited to, grayscale features, HOG features and CN features; this embodiment is not limited in detail here.
According to the scheme, the image in which the target parcel is detected, within the video data obtained by scanning the parcel conveying area with the scanning device, is taken as the target image, and the next frame after the target image in the video data is taken as the image to be detected. The tracking areas corresponding to the target parcel in the target image and in the image to be detected are determined based on the position of the target parcel in the target image; a detection model is trained on the image data of the tracking area of the target image; and the detection model is used to detect the tracking area of the image to be detected, obtaining the target area corresponding to the target parcel in the image to be detected. Continuous tracking of the target parcel is thus achieved, and the target parcel need not be detected anew in subsequent images of the video data, so parcel detection efficiency can be improved.
Referring to fig. 5, fig. 5 is a schematic flow chart of another embodiment of the package tracking method of the present application. Specifically, in this embodiment, the package tracking method may include the following steps:
step S501: and acquiring video data obtained by scanning the parcel conveying area by the scanning device.
Please refer to step S11 in the above embodiment.
Step S502: and taking the image of the target parcel detected in the video data as a target image.
Please refer to step S12 in the above embodiment.
Step S503: and taking the next frame image of the target image in the video data as an image to be detected.
Please refer to step S12 in the above embodiment.
Step S504: and determining the tracking areas of the target package in the target image and the image to be detected based on the position of the target package in the target image.
Please refer to step S13 in the above embodiment.
Step S505: and training based on the image data of the tracking area of the target image to obtain a detection model.
Please refer to step S14 in the above embodiment.
Step S506: and detecting the tracking area of the image to be detected by using the detection model to obtain a target area corresponding to the target package in the image to be detected.
Please refer to step S15 in the above embodiment.
In one implementation scenario, to enhance the user experience, after the target area corresponding to the target parcel in the image to be detected is obtained, it may be judged whether the first sequence number of the image to be detected in the video data is an integer multiple of a fourth preset number; if so, the image to be detected and its target area are output. Referring to fig. 2, after the target area corresponding to the target parcel in the image to be detected fi+1 is obtained, it is judged whether the first sequence number i+1 is an integer multiple of the fourth preset number. In this embodiment the fourth preset number may be set according to the specific application scenario, for example 5, 10 or 15; this embodiment is not limited in this respect. When the first sequence number i+1 is an integer multiple of the fourth preset number, the image to be detected fi+1 and its target area are output, so that the target area can be presented to the user at regular frame intervals, the tracking of the target parcel can be presented vividly, and user experience and perception are improved.
Step S507: and judging whether the area boundaries of the target area of the image to be detected are all within the image boundaries of the image to be detected. If so, go to step S508, otherwise go to step S510.
Referring to fig. 2, after the detection of the image to be detected fi+1 is completed and its target area determined, it can further be judged whether the area boundaries of the target area all lie within the image boundaries. Clearly, in fig. 2 the area boundaries of the target area of fi+1 lie within the image boundaries, meaning the target parcel is still being scanned in the parcel conveying area, so the target parcel still needs to be tracked. Step S508 is therefore executed: fi+1 is taken as the new target image, and the relevant steps of the above embodiments are executed again to keep tracking the target parcel.
In other implementation scenarios, if one of the area boundaries of the target area of the image to be detected exceeds the image boundary or coincides with the image boundary, it is indicated that the target package is about to be conveyed out of the range that can be scanned by the scanning device, and therefore, it is determined that the tracking of the target package is completed.
Step S508: and taking the image to be detected as a new target image.
And when the area boundaries of the target area of the image to be detected are all within the image boundaries of the image to be detected, the tracking of the target package is not completed, and the image to be detected is used as a new target image.
Step S509: step S503 and subsequent steps are re-executed.
Referring to fig. 2, after the image to be detected fi+1 is taken as the new target image, the frame after the new target image fi+1 is taken as the new image to be detected fi+2, so that the tracking areas corresponding to the target parcel in the target image fi+1 and the image to be detected fi+2 are re-determined, training is performed based on the image data of the tracking area of the target image fi+1 to update the detection model, and the detection model is then used to detect the tracking area of the image to be detected fi+2, obtaining the target area corresponding to the target parcel in fi+2. Reference may be made to steps S13 to S15 of the above embodiments, which are not repeated here.
Step S510: and determining that the target package is completely tracked.
When one of the area boundaries of the target area of the image to be detected exceeds or coincides with the image boundary, the target parcel is about to be conveyed out of the range the scanning device can scan; therefore, it is determined that tracking of the target parcel is complete.
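A compact sketch of the S503-S510 loop under assumed helper signatures; train_model and detect_box are hypothetical stand-ins for steps S505 and S506:

```python
def box_inside_image(box, img_w, img_h):
    """True if all four boundaries of (x, y, w, h) lie within the image."""
    x, y, w, h = box
    return x >= 0 and y >= 0 and x + w <= img_w and y + h <= img_h


def track_until_exit(frames, first_box, train_model, detect_box):
    """Pair each target image with the next frame and keep tracking
    until the target area touches or leaves the image boundary."""
    box = first_box
    for target_img, to_detect in zip(frames, frames[1:]):
        model = train_model(target_img, box)      # step S505
        box = detect_box(model, to_detect, box)   # step S506
        h, w = to_detect.shape[:2]
        if not box_inside_image(box, w, h):       # steps S507/S510
            break                                 # tracking finished
    return box
```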
Different from the foregoing embodiment, in the above scheme, after the target area corresponding to the target parcel in the image to be detected is obtained with the detection model, it is further judged whether the area boundaries of that target area all lie within the image boundaries of the image to be detected. If they all lie within the image boundaries, tracking of the target parcel is not yet complete, so the image to be detected is taken as the new target image and tracking continues; if the condition is not met, tracking of the target parcel is complete. Tracking of the target parcel is thereby maintained throughout its whole scanning process.
Referring to fig. 6, fig. 6 is a schematic flowchart illustrating an embodiment of step S14 in fig. 1. Specifically, the method may include the steps of:
step S141: and carrying out cyclic displacement processing on the image data of the tracking area of the target image to obtain a plurality of training sample images.
In this embodiment, the plurality of training sample images have different offset directions and offset amounts with respect to the target image. Referring to fig. 3, cyclic shift processing is performed on the image data Pi of the tracking area of the target image, obtaining a plurality of training sample images Pi^k. As shown in fig. 3, one training sample image is offset toward the lower left relative to the image data Pi of the tracking area of the target image, the training sample image Pi^0 is not offset relative to Pi, and another training sample image is offset toward the upper right relative to Pi. Fig. 3 shows only part of the training sample images obtained after the cyclic shift processing; the other training sample images are analogous and are not illustrated one by one here.
Step S142: and extracting the gradient features and the gray features of the training sample images by adopting a preset feature extraction mode to obtain a plurality of training feature images corresponding to the training sample images.
Unless otherwise specified, the gradient feature in this example and the following examples refers to the HOG feature. In one implementation scenario, in order to reduce interference from irrelevant image information, the feature extraction regions of the plurality of training sample images may be determined based on the target region of the target image before feature extraction. For example, with the center of the target region of the target image as the base point, the target region may be narrowed by a second preset number of pixels, and the narrowed target region taken as the feature extraction region of the plurality of training sample images; the feature extraction regions of the plurality of training sample images have the same size and position as the narrowed target region. The second preset number may be an even number, such as 2, 4, 6 or 8; this embodiment is not limited in this respect.
Referring to fig. 7, fig. 7 is a flowchart illustrating an embodiment of step S142 in fig. 6, which specifically includes the following steps:
step S1421: and extracting the gradient feature of the feature extraction area of the training sample image to obtain a gradient feature map of the training sample image.
Referring to fig. 8, fig. 8 is a frame diagram of an embodiment of a training sample image. As shown in fig. 8, a training sample image Pi^k is divided into a plurality of contiguous image sub-blocks (shown by the thick lines), each of size c × c pixels; the image sub-blocks shown in fig. 8 are 4 × 4 pixels, and in other implementation scenarios they may also be 6 × 6 pixels, this embodiment not being limited in this respect. The gradient direction and gradient amplitude of each pixel Pi^k(m, n) in each image sub-block are calculated, and histogram features over several direction bins are extracted within each sub-block. In this embodiment the bins may cover 0-20, 20-40, 40-60, 60-80, 80-100, 100-120, 120-140, 140-160 and 160-180 degrees; in a specific implementation, the gradient direction of each pixel may be mapped into 0-180 degrees. In addition, considering the redundancy of the feature information, a PCA (Principal Component Analysis) algorithm may further be used to reduce the dimensionality of the features of each image sub-block, for example to n1 dimensions; in this example n1 may be 1800, and in other implementation scenarios other values such as 2000 or 1600 are possible, this embodiment not being specifically limited here. The details of the PCA algorithm are prior art and are not described here.
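A sketch of this per-sample HOG extraction with PCA reduction, assuming grayscale samples; scikit-image's hog and scikit-learn's PCA stand in for the per-sub-block histograms and dimension reduction, and the PCA target dimensionality from the text (1800) is capped by the sample count, a practical necessity the patent does not discuss:

```python
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import PCA


def hog_features(samples, cell=4, orientations=9):
    """9-bin HOG over c-by-c cells (here 4x4, as in the embodiment)
    for each training sample image, followed by PCA reduction."""
    feats = [
        hog(s, orientations=orientations, pixels_per_cell=(cell, cell),
            cells_per_block=(1, 1), feature_vector=True)
        for s in samples
    ]
    x = np.asarray(feats)
    n_components = min(1800, x.shape[0], x.shape[1])
    return PCA(n_components=n_components).fit_transform(x)
```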
In one implementation scenario, in order to eliminate as far as possible the influence of edge interference on subsequent training, the edge image sub-blocks may be removed after feature extraction, for example 1 row of image sub-blocks at each of the top, bottom, left and right edges. Thus, when the feature extraction region has size w × h and the image sub-blocks have size c × c, the gradient feature map obtained after this edge-removal operation has size fw × fh, where:

fw = w/c - 2

fh = h/c - 2
step S1422: and extracting the gray characteristic of the characteristic extraction area of the training sample image to obtain a gray characteristic image with the same size as the gradient characteristic image.
In order to reduce the difference between the grayscale features and the gradient features in the number of channels, and to avoid the subsequent training feature map being dominated by the gradient features, the number of grayscale feature channels may be increased. Specifically, the training sample image may be processed into a training grayscale image in which the grayscale value of each pixel lies within a preset range, which may be [0, 1]. A first preset number n2 of grayscale feature values is then selected from the preset range; for example, with n2 = 5, the values 0.2, 0.4, 0.6, 0.8 and 1.0 may be selected. For each pixel of the training grayscale image, the grayscale feature value with the smallest difference from that pixel's grayscale value is taken as the value of the corresponding pixel in the grayscale feature map; for example, if a pixel of the training grayscale image has grayscale value 0.45, the grayscale feature value 0.4, which differs from it least, is taken as that pixel's value. A training grayscale map with 5 channels is thus formed. In order to keep the final grayscale feature map the same size as the gradient feature map, the training grayscale map is down-sampled, so that the final grayscale feature map has size fw × fh. In this example and the following examples, when '[' and ']' and '(' and ')' indicate numerical ranges, '[' and ']' indicate that the endpoint is included, and '(' and ')' indicate that it is excluded.
In one implementation scenario, in order to highlight the central features of the grayscale feature map and weaken the boundary effect, after the grayscale feature value with the smallest difference from each pixel's grayscale value has been selected as the value of the corresponding pixel in the grayscale feature map, a preset grayscale value may be subtracted from the grayscale value of each pixel of the grayscale feature map, with any value below 0 after subtraction set to 0. The preset grayscale value in this embodiment may be the median of the preset range; for example, when the preset range is [0, 1], the preset grayscale value may be 0.5. Afterwards, the grayscale feature map obtained after subtraction may further be processed with a Hamming window; specifically, the Hamming window may be multiplied with the grayscale feature map, so that the feature center is more prominent.
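A sketch of the grayscale branch, assuming the 5-channel map is a one-hot encoding of the quantized levels (the patent does not spell out the exact channel layout) and omitting the final down-sampling to fw × fh:

```python
import numpy as np


def gray_channels(gray, levels=(0.2, 0.4, 0.6, 0.8, 1.0)):
    """Quantize a [0, 1] grayscale patch to the nearest of n2 levels,
    one channel per level, then subtract the median 0.5 (floored at 0)
    and apply a Hamming window to emphasize the center."""
    levels = np.asarray(levels)
    idx = np.abs(gray[..., None] - levels).argmin(axis=-1)
    quantized = levels[idx]                        # nearest feature value
    onehot = (idx[..., None] == np.arange(len(levels))).astype(float)
    channels = onehot * quantized[..., None]       # keep the level's value
    centered = np.clip(channels - 0.5, 0.0, None)  # fade the boundary
    win = np.outer(np.hamming(gray.shape[0]), np.hamming(gray.shape[1]))
    return centered * win[..., None]               # highlight the center
```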
Step S1423: and taking the gradient feature of each pixel point of the gradient feature map and the gray feature of the pixel point corresponding to the gray feature map as the image feature of the pixel point corresponding to the training feature image of the training sample image.
The gradient feature of each pixel of the gradient feature map, together with the grayscale feature of the corresponding pixel of the grayscale feature map, is taken as the image feature of the corresponding pixel of the training feature image of the training sample image, so that the feature dimensionality of the final training feature image is fw × fh × (n1 + n2).
Step S143: and training by using a plurality of training characteristic images to obtain a detection model.
In this embodiment, the training feature image extracted from the training sample image that is not offset relative to the image data of the tracking region of the target image may be used as a positive sample, and the training feature images extracted from the offset training sample images may be used as negative samples to train a classifier, thereby obtaining the detection model. In one implementation scenario, in order to balance the numbers of positive and negative samples, the positive sample may be replicated into multiple identical positive samples before the classifier is trained.
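A minimal training sketch along these lines; LinearSVC stands in for the SVM mentioned above, and pos_copies is an assumed balancing factor:

```python
import numpy as np
from sklearn.svm import LinearSVC


def train_detector(features, offsets, pos_copies=10):
    """Label the unshifted sample positive and the shifted ones
    negative, duplicate the positive to balance the classes, and
    fit a linear SVM as the detection model."""
    x, y = [], []
    for f, (dy, dx) in zip(features, offsets):
        if dy == 0 and dx == 0:
            x.extend([f] * pos_copies)
            y.extend([1] * pos_copies)
        else:
            x.append(f)
            y.append(0)
    return LinearSVC().fit(np.asarray(x), np.asarray(y))
```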
Different from the foregoing embodiment, in the above scheme, cyclic shift processing of the image data of the tracking area of the target image yields a plurality of training sample images; the gradient features and grayscale features of these training sample images are extracted to obtain the corresponding training feature images, and the training feature images are used for training to obtain the detection model. This improves the speed at which the detection model is obtained while preserving stability, and can meet the real-time requirements of practical scenarios.
Referring to fig. 9, fig. 9 is a schematic flowchart illustrating an embodiment of step S15 in fig. 1. Specifically, the method may include the steps of:
step S151: and performing cyclic displacement processing on the image data of the tracking area of the image to be detected to obtain a plurality of detection sample images.
In this embodiment, the plurality of detection sample images have different offset directions and offset amounts with respect to the image to be detected.
In this embodiment, the operation of performing cyclic shift processing on the image data of the tracking area of the image to be detected is substantially the same as the operation of performing cyclic shift processing on the image data of the tracking area of the target image in the above embodiment; reference may be made to step S141 of the above embodiment, which is not repeated here.
Step S152: and extracting the gradient features and the gray features of the plurality of detection sample images by adopting a preset feature extraction mode to obtain a plurality of detection feature images corresponding to the detection sample images.
In this embodiment, the operation step of extracting the gradient feature and the gray feature of the multiple detection sample images is substantially the same as the operation step of extracting the gradient feature and the gray feature of the multiple training sample images in the above embodiment, and reference may be specifically made to step S142 in the above embodiment, which is not described herein again.
Step S153: and calculating correlation values of the detection characteristic images by using the detection model, and counting the correlation values corresponding to the detection sample images.
In theory, the smaller the positional offset of the target parcel in a detection sample image relative to the target area of the target parcel in the target image, the larger the correlation value obtained when the detection model computes the correlation of the detection feature image extracted from that detection sample image. On this premise, the cyclic shift processing of the image data of the tracking area of the image to be detected in the above steps makes it possible to find the offset direction and offset amount of the detection sample image whose detection feature image has the highest correlation value under the detection model, and thus to determine the target area in the image to be detected.
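A sketch of the scoring step, assuming the LinearSVC detector from the earlier sketch; its decision_function plays the role of the correlation value here (a correlation-filter response would serve the same purpose in a filter-based implementation):

```python
import numpy as np


def best_offset(model, det_features, det_offsets):
    """Score every shifted detection sample and return the offset
    (dy, dx) of the highest-scoring one together with its score."""
    scores = model.decision_function(np.asarray(det_features))
    k = int(np.argmax(scores))
    return det_offsets[k], float(scores[k])
```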
Step S154: and determining a target area in the image to be detected according to the offset direction and the offset of the detected sample image corresponding to the maximum correlation value.
In this embodiment, the tracking area in the image to be detected and the tracking area in the target image have the same size and position. Therefore, after the offset direction and offset amount of the detection sample image corresponding to the maximum correlation value are obtained, the target area in the image to be detected can be obtained by shifting the target area in the target image by the obtained offset amount in the direction opposite to the obtained offset direction. In one implementation scenario, the tracking area in the image to be detected and the tracking area in the target image have the same size but different positions; in that case, after the offset direction and offset amount of the detection sample image corresponding to the maximum correlation value are obtained, besides shifting the target area in the target image by the obtained offset amount opposite to the obtained offset direction, the offset of the tracking area in the image to be detected relative to the tracking area in the target image must also be taken into account and a further compensating shift applied, so that the target area in the image to be detected is obtained. This embodiment is not specifically limited here.
Referring to fig. 2, assume that the tracking area of the image to be detected fi+1 (solid-line frame) and the tracking area of the target image fi (solid-line frame) have the same size and position. After the cyclic shift operation is performed on the image data of the tracking area of fi+1, the detection sample images shown in fig. 4 are obtained. Suppose the detection sample image corresponding to the largest correlation value is offset by a certain amount in the offset direction of -1 degree. Then, referring to fig. 2, the target area of the image to be detected fi+1 is obtained simply by shifting the target area of the target image fi by the same amount in the direction of 179 degrees.
Referring to fig. 10, fig. 10 is a flowchart illustrating an embodiment of step S154 in fig. 9. The following steps may further be implemented to eliminate, as far as possible, the calculation error produced in the non-conveying direction.
Step S1541: and taking the direction opposite to the offset direction as the motion direction of the target area in the image to be detected.
The direction opposite to the offset direction is taken as the movement direction of the target area in the image to be detected. For example, if the obtained offset direction is -1 degree, the movement direction is 179 degrees; or, if the obtained offset direction is 1 degree, the movement direction is 181 degrees. This embodiment does not illustrate further cases here.
Step S1542: the direction of movement is broken down into a first direction of movement that is the same as the direction of conveyance in the parcel delivery area and a second direction of movement that is perpendicular to the direction of conveyance.
The movement direction is decomposed into a first movement direction, the same as the conveying direction of the parcel conveying area, and a second movement direction perpendicular to the conveying direction.
Step S1543: the offset is decomposed into a first offset in the first direction of motion and a second offset in the second direction of motion.
For example, if the movement direction is arctan(3/4) relative to the conveying direction and the offset is 5, the first offset resolved along the first movement direction is 4 and the second offset resolved along the second movement direction is 3. When the movement direction and offset take other values, the same reasoning applies; this embodiment does not illustrate them one by one here.
Step S1544: and offsetting the target area in the target image by a first offset amount towards a first movement direction, and taking the offset target area as the target area of the image to be detected.
Still taking the motion direction as arctan (3/4) and the offset as 5 as an example, at this time, the target area of the target image is offset by a first offset, that is, by 4, towards the first motion direction, and the target area in the image to be detected can be obtained.
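A sketch of steps S1541-S1544, assuming the conveying direction is given as an angle in image coordinates (0 degrees along the x axis):

```python
import math


def conveyor_component(offset_dir_deg, offset_amount, conveyor_dir_deg=0.0):
    """Reverse the offset direction to get the movement direction,
    then split the offset into components along and perpendicular to
    the conveying direction; only the first component is applied."""
    motion_dir = math.radians(offset_dir_deg + 180.0)  # opposite direction
    rel = motion_dir - math.radians(conveyor_dir_deg)
    first_offset = offset_amount * math.cos(rel)       # along the belt
    second_offset = offset_amount * math.sin(rel)      # discarded
    return first_offset, second_offset
```

For the example above (movement direction arctan(3/4), offset 5), this returns 4 for the first offset and 3 for the discarded perpendicular component.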
Different from the foregoing embodiment, in the above scheme, cyclic shift processing of the image data of the tracking area of the image to be detected yields a plurality of detection sample images; the gradient features and grayscale features of these detection sample images are extracted with the preset feature extraction method to obtain the corresponding detection feature images; the detection model computes correlation values for the detection feature images, the correlation values corresponding to the detection sample images are collected, and the target area in the image to be detected is determined from the offset direction and offset amount of the detection sample image corresponding to the maximum correlation value. This speeds up determination of the target area and hence parcel tracking.
In one embodiment, in order to further smooth the offsets calculated in the steps of the above embodiments, when the difference between the first sequence number of the image to be detected in the video data and the second sequence number of the image in which the target parcel was detected is not greater than a third preset number, the first offset and its corresponding first sequence number may be stored, and the stored first offsets used to construct a moving average filter. In this embodiment the third preset number may be set according to the actual application scenario, for example 3, 4 or 5; this embodiment is not specifically limited here. For example, suppose the first sequence number of the image to be detected in the video data is i+1, the second sequence number of the image in which the target parcel was detected is i, and the third preset number is 5. Since the difference between the first sequence number i+1 and the second sequence number i is not greater than the third preset number, the first offset Si+1 calculated as in the above embodiment is stored; when the next frame to be detected, fi+2, is tracked, the first offset Si+2 is likewise stored, and so on up to Si+3, Si+4 and Si+5, thus constructing the moving average filter {Si+1, Si+2, Si+3, Si+4, Si+5}.

When the difference between the first sequence number of the image to be detected in the video data and the second sequence number of the image in which the target parcel was detected is greater than the third preset number, the average of the third-preset-number stored first offsets is calculated, and it is judged whether the difference between the decomposed first offset and the calculated average is greater than a preset threshold. If so, the calculated average is taken as the first offset for determining the target area of the image to be detected; if not, the decomposed first offset is taken as the first offset for determining the target area of the image to be detected, and the moving average filter is updated. Taking the third preset number as 5 as an example, when the difference between the first sequence number i+6 of the image to be detected fi+6 and the second sequence number i of the image fi in which the target parcel was detected is greater than 5, the average of the 5 stored first offsets is calculated, and it is judged whether the difference between the decomposed first offset Si+6 and this average is greater than a preset threshold Sth. If not, the decomposed first offset Si+6 is taken as the first offset for determining the target area of the image to be detected fi+6 and the moving average filter is updated: specifically, the decomposed first offset Si+6 and its first sequence number i+6 are stored, the stored first offset with the smallest first sequence number is deleted, and the moving average filter becomes {Si+2, Si+3, Si+4, Si+5, Si+6}. If so, the calculated average is taken as the first offset for determining the target area of the image to be detected fi+6.
Referring to fig. 11, fig. 11 is a block diagram of an embodiment of a package tracking device 1100 according to the present application. In this embodiment, the package tracking apparatus 1100 includes a data obtaining module 1110, an image extracting module 1120, an area determining module 1130, a data training module 1140, and a detection tracking module 1150, where the data obtaining module 1110 is configured to obtain video data obtained by scanning a package conveying area by a scanning device; the image extraction module 1120 is configured to use an image of the target package detected in the video data as a target image, and use a next frame image of the target image in the video data as an image to be detected; the region determining module 1130 is configured to determine a tracking region of the target package in the target image and the image to be detected based on the position of the target package in the target image; the data training module 1140 is configured to train based on image data of a tracking area of the target image to obtain a detection model; the detection tracking module 1150 is configured to detect a tracking area of the image to be detected by using the detection model, and obtain a target area corresponding to the target package in the image to be detected.
According to the scheme, the image in which the target parcel is detected, within the video data obtained by scanning the parcel conveying area with the scanning device, is taken as the target image, and the next frame after the target image in the video data is taken as the image to be detected. The tracking areas corresponding to the target parcel in the target image and in the image to be detected are determined based on the position of the target parcel in the target image; a detection model is trained on the image data of the tracking area of the target image; and the detection model is used to detect the tracking area of the image to be detected, obtaining the target area corresponding to the target parcel in the image to be detected. Continuous tracking of the target parcel is thus achieved, and the target parcel need not be detected anew in subsequent images of the video data, so parcel detection efficiency can be improved.
In some embodiments, the package tracking apparatus 1100 further includes a boundary judging module configured to judge whether the area boundaries of the target area of the image to be detected all lie within the image boundaries of the image to be detected. If the judgment is yes, the image extraction module 1120 is further configured to take the image to be detected as the new target image and, together with the region determining module 1130, the data training module 1140 and the detection tracking module 1150, to re-execute the steps of the above embodiments; if the judgment is no, the detection tracking module 1150 is further configured to determine that tracking of the target parcel is complete.
Different from the foregoing embodiment, in the above scheme, after the target area corresponding to the target parcel in the image to be detected is obtained with the detection model, it is further judged whether the area boundaries of that target area all lie within the image boundaries of the image to be detected. If they all lie within the image boundaries, tracking of the target parcel is not yet complete, so the image to be detected is taken as the new target image and tracking continues; if the condition is not met, tracking of the target parcel is complete. Tracking of the target parcel is thereby maintained throughout its whole scanning process.
In some embodiments, the region determining module 1130 further includes a region obtaining sub-module configured to determine a target region containing the target package in the target image, and the region determining module 1130 further includes a region enlarging sub-module configured to enlarge the target region and use the enlarged target region as a tracking region in the target image and the image to be detected, where the tracking region in the target image and the tracking region in the image to be detected are the same in size and the same in position. In an implementation scenario, the region expansion sub-module is specifically configured to expand the target region by a preset multiple with a center of the target region as a base point, use the expanded target region as a tracking region of the target image, and use a region in the image to be detected, which is at the same position and has the same size as the tracking region of the target image, as the tracking region of the image to be detected.
In some embodiments, the package tracking apparatus 1100 further includes a sequence number judging module for judging whether the first sequence number of the image to be detected in the video data is an integer multiple of a fourth preset number, and the package tracking apparatus 1100 further includes an image output module for outputting the image to be detected and the target area of the image to be detected when the judgment result of the sequence number judging module is yes.
Different from the foregoing embodiment, this scheme presents the target area to the user at regular intervals, so that the tracking of the target package can be shown intuitively, improving the user experience. A minimal sketch of the output condition follows.
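The sketch below assumes an illustrative fourth preset number of 25 (roughly one output per second at 25 fps); the value is not from the patent.

```python
def should_output(first_seq, every=25):
    """Emit the frame and its target area every `every`-th frame."""
    return first_seq % every == 0
```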
In some embodiments, the data training module 1140 includes a first cyclic shift sub-module configured to perform cyclic shift processing on the image data of the tracking region of the target image to obtain a plurality of training sample images, where the training sample images have different offset directions and offset amounts relative to the target image; a first feature extraction sub-module configured to extract gradient features and gray features of the training sample images in a preset feature extraction manner to obtain training feature images corresponding to the training sample images; and a training sub-module configured to train on the training feature images to obtain the detection model.
Different from the foregoing embodiment, in this scheme, cyclic shift processing of the image data of the tracking area of the target image yields a plurality of training sample images; gradient and gray features are extracted from them to obtain the corresponding training feature images, which are then used to train the detection model. This speeds up obtaining the detection model while preserving stability, and can meet the real-time requirements of practical scenarios.
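The patent does not name a specific tracking algorithm, but the cyclic-shift-and-correlate structure is reminiscent of discriminative correlation filter trackers. Under that reading, a toy sketch of the sample-generation step might be:

```python
import numpy as np

def cyclic_shift_samples(patch, step=4):
    """Generate shifted training samples from one tracking-region patch.

    Each sample is the patch cyclically shifted by (dy, dx); the shifts are
    recorded so each sample's offset label is known. The granularity `step`
    is an illustrative choice, not from the patent.
    """
    h, w = patch.shape[:2]
    samples, offsets = [], []
    for dy in range(0, h, step):
        for dx in range(0, w, step):
            samples.append(np.roll(patch, (dy, dx), axis=(0, 1)))
            offsets.append((dy, dx))
    return samples, offsets
```

In practice, correlation filter trackers exploit the circulant structure of these shifts via the FFT instead of materializing every sample, which is what makes training this fast.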
In some embodiments, the region determination module 1130 further includes a feature region acquisition sub-module configured to determine the feature extraction regions of the plurality of training sample images based on the target region of the target image. The first feature extraction sub-module further includes a gradient feature extraction unit configured to extract the gradient features of the feature extraction region of a training sample image to obtain the gradient feature map of that training sample image, a gray feature extraction unit configured to extract the gray features of the feature extraction region of the training sample image to obtain a gray feature map of the same size as the gradient feature map, and a feature fusion unit configured to take the gradient feature of each pixel of the gradient feature map and the gray feature of the corresponding pixel of the gray feature map as the image features of the corresponding pixel of the training feature image of the training sample image.
In some embodiments, the grayscale feature extraction unit further includes: a grayscale normalization subunit configured to process the training sample image into a training grayscale image in which the grayscale value of each pixel lies within a preset range; a grayscale feature value selection subunit configured to select a first preset number of grayscale feature values from the preset range; a grayscale clustering subunit configured to assign each pixel of the training grayscale image the grayscale feature value closest to its grayscale value, which becomes the grayscale value of the corresponding pixel in the grayscale feature map; and a downsampling subunit configured to downsample the grayscale feature map so that its size matches that of the gradient feature map. In one implementation scenario, the preset range is greater than or equal to 0 and less than or equal to 1; in one implementation scenario, the first preset number is 5.
Different from the foregoing embodiment, this scheme increases the number of channels of the gray features, reduces the channel-count gap between the gray features and the gradient features, and avoids the subsequent training feature map being dominated by the gradient features. A toy version of the quantization and downsampling is sketched below.
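The sketch assumes the [0, 1] range and the five feature values from the implementation scenario; the one-channel-per-level split and the cell size of 4 are our assumptions.

```python
import numpy as np

def gray_feature_map(img_u8, num_levels=5):
    """Quantize an 8-bit grayscale image to `num_levels` feature values in [0, 1]."""
    gray = img_u8.astype(np.float32) / 255.0           # normalize into [0, 1]
    levels = np.linspace(0.0, 1.0, num_levels)         # the gray feature values
    idx = np.abs(gray[..., None] - levels).argmin(-1)  # nearest feature value
    quantized = levels[idx]                            # single-map form
    # One channel per level -- our reading of the channel-count remark.
    channels = (idx[..., None] == np.arange(num_levels)).astype(np.float32)
    return quantized, channels

def downsample(feat, cell=4):
    """Average-pool by `cell` so the gray map matches the gradient map size
    (cell-based gradient features such as HOG are computed per cell)."""
    h2, w2 = feat.shape[0] // cell, feat.shape[1] // cell
    return feat[:h2 * cell, :w2 * cell].reshape(h2, cell, w2, cell, -1).mean((1, 3))
```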
In some embodiments, the grayscale feature extraction unit further includes a grayscale centering subunit configured to subtract a preset grayscale value from the grayscale value of each pixel of the grayscale feature map, and further configured to process the grayscale feature map obtained after this subtraction with a Hamming window. In one implementation scenario, the preset grayscale value is the median of the preset range.
Different from the foregoing embodiment, this scheme highlights the central features of the gray feature map and attenuates the boundary effect.
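A minimal sketch of the centering and windowing step, assuming the preset gray value 0.5 (the median of [0, 1]):

```python
import numpy as np

def center_and_window(gray_feat, preset_gray=0.5):
    """Subtract the preset gray value and apply a 2-D Hamming window,
    which keeps the patch center prominent and fades the borders."""
    h, w = gray_feat.shape[:2]
    window = np.outer(np.hamming(h), np.hamming(w))
    centered = gray_feat - preset_gray
    return centered * window if centered.ndim == 2 else centered * window[..., None]
```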
In some embodiments, the feature region acquisition submodule is specifically configured to, with the center of the target region of the target image as a base point, shrink the target region by a second preset number of pixel points, and use the shrunk target region as a feature extraction region of the plurality of training sample images.
In some embodiments, the detection tracking module 1150 includes: a second cyclic shift sub-module configured to perform cyclic shift processing on the image data of the tracking area of the image to be detected to obtain a plurality of detection sample images, where the detection sample images have different offset directions and offset amounts relative to the image to be detected; a second feature extraction sub-module configured to extract gradient features and gray features of the detection sample images in the preset feature extraction manner to obtain detection feature images corresponding to the detection sample images; a correlation operation sub-module configured to perform correlation value calculation on the detection feature images with the detection model and to collect the correlation values corresponding to the detection sample images; and a target area determination sub-module configured to determine the target area in the image to be detected according to the offset direction and offset amount of the detection sample image corresponding to the maximum correlation value.
Different from the foregoing embodiment, in the above scheme, cyclic shift processing of the image data of the tracking area of the image to be detected yields a plurality of detection sample images, whose gradient and gray features are extracted in the preset feature extraction manner to obtain the corresponding detection feature images. Correlation values are computed for these detection feature images with the detection model, and the target area in the image to be detected is determined from the offset direction and offset amount of the detection sample image with the maximum correlation value. This speeds up the determination of the target area and thus the package tracking as a whole.
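For illustration, a sketch of the score-and-select step; the plain inner-product correlation is an assumption, since the patent does not fix the exact form of the correlation value:

```python
import numpy as np

def correlation_scores(model, detection_feats):
    """Correlation value of each detection feature image with the model."""
    return [float(np.vdot(model, f).real) for f in detection_feats]

def best_motion(scores, offsets):
    """Pick the shift with the highest response; the target area moved in
    the direction opposite to that shift."""
    i = int(np.argmax(scores))
    dy, dx = offsets[i]
    return -dy, -dx
```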
In some embodiments, the target area determination sub-module includes: a moving direction determination unit configured to take the direction opposite to the offset direction as the moving direction of the target area in the image to be detected; a moving direction decomposition unit configured to decompose that moving direction into a first moving direction identical to the conveying direction in the parcel conveying area and a second moving direction perpendicular to the conveying direction; an offset decomposition unit configured to decompose the offset into a first offset in the first moving direction and a second offset in the second moving direction; and an area offset unit configured to offset the target area in the target image by the first offset in the first moving direction and to take the shifted area as the target area of the image to be detected.
Different from the foregoing embodiment, in the above scheme, the direction opposite to the offset direction is taken as the moving direction of the target area in the image to be detected; this moving direction is decomposed into a first moving direction identical to the conveying direction in the parcel conveying area and a second moving direction perpendicular to it, and the offset is decomposed accordingly into a first and a second offset. Only the first offset is applied to shift the target area in the target image, which then becomes the target area of the image to be detected: since the package moves with the conveyor, the perpendicular component is treated as measurement noise and discarded, suppressing calculation errors and improving the accuracy of parcel tracking.
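A minimal sketch of this decomposition, assuming the conveying direction is known as a 2-D vector; the names and the example values are illustrative:

```python
import numpy as np

def decompose_offset(offset, belt_dir):
    """Split a 2-D offset into components along and perpendicular to the
    conveying direction; only the along-belt component is applied."""
    d = np.asarray(belt_dir, dtype=float)
    d /= np.linalg.norm(d)                         # unit vector along the belt
    v = np.asarray(offset, dtype=float)
    first = float(v @ d)                           # first offset (along belt)
    second = float(np.linalg.norm(v - first * d))  # perpendicular residual
    return first, second

# Belt along +x, measured motion (7.2, -0.6): first ~ 7.2 is applied,
# second ~ 0.6 is treated as noise and discarded.
first, second = decompose_offset((7.2, -0.6), (1.0, 0.0))
```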
In some embodiments, the sequence number determination module is further configured to determine whether the difference between the first sequence number of the image to be detected in the video data and the second sequence number of the image in the video data in which the target package was detected is not greater than a third preset number. The target area determination sub-module further includes: an offset storage unit configured to store the first offset and its corresponding first sequence number when the sequence number determination module determines that the difference is not greater; a filter construction unit configured to construct a moving average filter from the stored first offsets; an average calculation unit configured to calculate the average of the stored third preset number of first offsets when the sequence number determination module determines that the difference is greater; a difference determination unit configured to determine whether the difference between the decomposed first offset and the calculated average exceeds a preset threshold; and an offset determination unit configured to take the calculated average as the first offset for determining the target area of the image to be detected when the difference determination unit determines that the threshold is exceeded, and to take the decomposed first offset as that first offset otherwise. The filter construction unit is further configured to update the moving average filter when the difference determination unit determines that the threshold is not exceeded.
Unlike the foregoing embodiment, the above scheme smooths the calculated offset, so that a smooth tracking effect can be ensured as far as possible. A minimal sketch of such a filter follows.
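The sketch follows the logic of claims 11 and 12; the window length and threshold are illustrative values, not from the patent.

```python
from collections import deque

class OffsetSmoother:
    """Moving average filter over the last n first offsets (n plays the
    role of the 'third preset number')."""
    def __init__(self, n=10, threshold=3.0):
        self.buf = deque(maxlen=n)   # appending when full drops the oldest
        self.threshold = threshold

    def smooth(self, first_offset):
        if len(self.buf) < self.buf.maxlen:   # still constructing the filter
            self.buf.append(first_offset)
            return first_offset
        mean = sum(self.buf) / len(self.buf)
        if abs(first_offset - mean) > self.threshold:
            return mean                       # outlier: use the average
        self.buf.append(first_offset)         # inlier: update the filter
        return first_offset
```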
Referring to fig. 12, fig. 12 is a block diagram of an embodiment of a package tracking device 1200 according to the present application. In this embodiment, the package tracking device 1200 includes a memory 1210 and a processor 1220 coupled to each other, the processor 1220 being configured to execute program instructions stored in the memory 1210 to implement the steps of any of the package tracking method embodiments described above.
In particular, the processor 1220 is configured to control itself and the memory 1210 to implement the steps in any of the package tracking method embodiments described above. The processor 1220 may also be referred to as a CPU (Central Processing Unit). The processor 1220 may be an integrated circuit chip having signal processing capabilities. The processor 1220 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In addition, the processor 1220 may be implemented jointly by a plurality of integrated circuit chips.
As with the foregoing embodiments, this arrangement enables continuous tracking of the target package without repeatedly detecting it in subsequent frames of the video data, thereby improving package detection efficiency.
In some embodiments, the package tracking device 1200 further includes a scanning device 1230 configured to scan the package conveying area to obtain video data.
Referring to fig. 13, fig. 13 is a block diagram of an embodiment of a storage device 1300 according to the present application. The storage device 1300 stores program instructions 1310 executable by a processor, the program instructions 1310 being for implementing the steps in any of the package tracking method embodiments described above.
By means of the above scheme, continuous tracking of the target package can be achieved without repeatedly detecting it in subsequent images of the video data, so package detection efficiency can be improved.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into modules or units is only one kind of logical division, and other divisions are possible in practice; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices or units, and may be electrical, mechanical or in other forms.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Claims (17)

1. A package tracking method, comprising:
acquiring video data obtained by scanning a parcel conveying area by a scanning device;
taking an image of a target package detected in the video data as a target image, and taking a next frame image of the target image in the video data as an image to be detected;
determining a tracking area corresponding to the target parcel in the target image and the image to be detected based on the position of the target parcel in the target image; the tracking area in the target image and the tracking area in the image to be detected have the same size and the same position, and the tracking area is obtained by expanding a target area containing the target package in the target image;
training based on image data of the tracking area of the target image to obtain a detection model;
detecting a tracking area of the image to be detected by using the detection model to obtain a target area corresponding to the target package in the image to be detected, wherein cyclic displacement processing is performed on the image data of the tracking area of the image to be detected to obtain a plurality of detection sample images, the plurality of detection sample images having different offset directions and offset amounts relative to the image to be detected; the detection model is used to perform correlation value calculation on detection feature images of the plurality of detection sample images, and the correlation values corresponding to the plurality of detection sample images are counted; the target area in the target image is then offset, based on the direction opposite to the offset direction of the detection sample image corresponding to the maximum correlation value and on the offset amount of that detection sample image, to determine the target area in the image to be detected; and when the difference value between the first sequence number of the image to be detected in the video data and the second sequence number of the image in the video data in which the target package is detected is not greater than a third preset number, a first offset obtained by decomposing the offset amount in the conveying direction and the first sequence number corresponding to the first offset are stored, and a moving average filter is constructed using the stored first offsets, wherein the moving average filter is used for smoothing the offset.
2. The parcel tracking method according to claim 1, wherein after the tracking area of the image to be detected is detected by using the detection model to obtain a target area corresponding to the target parcel in the image to be detected, the method further comprises:
judging whether the area boundaries of the target area of the image to be detected are all within the image boundaries of the image to be detected;
if so, taking the image to be detected as a new target image, and re-executing the step of taking the next frame image of the target image in the video data as the image to be detected and the subsequent steps;
and if not, determining to finish tracking the target package.
3. The package tracking method of claim 1,
the training of the image data of the tracking area based on the target image to obtain a detection model comprises the following steps:
performing cyclic displacement processing on image data of a tracking area of the target image to obtain a plurality of training sample images, wherein the plurality of training sample images have different offset directions and offset amounts relative to the target image;
extracting gradient features and gray features of the training sample images by adopting a preset feature extraction mode to obtain a plurality of training feature images corresponding to the training sample images;
and training by using the training characteristic images to obtain the detection model.
4. The package tracking method according to claim 3, wherein before the extracting gradient features and gray features of the training sample images by using a preset feature extraction manner to obtain a plurality of training feature images corresponding to the training sample images, the method further comprises:
determining feature extraction regions of the plurality of training sample images based on a target region of the target image;
the extracting the gradient features and the gray features of the training sample images by adopting a preset feature extraction mode to obtain a plurality of training feature images corresponding to the training sample images comprises the following steps:
extracting the gradient feature of the feature extraction area of the training sample image to obtain a gradient feature map of the training sample image;
extracting the gray scale features of the feature extraction area of the training sample image to obtain a gray scale feature map with the same size as the gradient feature map;
and taking the gradient feature of each pixel point of the gradient feature map and the gray feature of the pixel point corresponding to the gray feature map as the image feature of the pixel point corresponding to the training feature image of the training sample image.
5. The package tracking method according to claim 4, wherein the extracting the gray features of the training sample image to obtain the gray feature map with the same size as the gradient feature map comprises:
processing the training sample image into a training gray image with the gray value of each pixel point within a preset range;
selecting a first preset number of gray characteristic values from the preset range;
selecting a gray characteristic value with the minimum difference value with the gray value of each pixel point in the training gray image as the gray value of the corresponding pixel point in the gray characteristic image;
and performing down-sampling processing on the gray feature map, so that the size of the gray feature map after the down-sampling processing is the same as that of the gradient feature map.
6. The package tracking method according to claim 5, wherein after selecting the gray feature value having the smallest difference with the gray value of each pixel point in the training gray image as the gray value of the corresponding pixel point in the gray feature map, and before performing down-sampling on the gray feature map so that the size of the down-sampled gray feature map is the same as the size of the gradient feature map, the method further comprises:
subtracting a preset gray value from the gray value of each pixel point of the gray feature map;
and processing the gray feature map obtained after subtracting the preset gray value by using a Hamming window.
7. The package tracking method of claim 6, wherein the preset range is greater than or equal to 0 and less than or equal to 1; and/or,
the first preset number is 5; and/or,
the preset gray value is the median of the preset range.
8. The package tracking method of claim 4, wherein the determining the feature extraction regions of the plurality of training sample images based on the target region of the target image comprises:
with the center of a target area of the target image as a base point, shrinking the target area by a second preset number of pixel points;
and taking the contracted target area as a feature extraction area of the plurality of training sample images.
9. The parcel tracking method according to claim 3, wherein the detecting the tracking area of the image to be detected by using the detection model to obtain the target area corresponding to the target parcel in the image to be detected comprises:
performing the cyclic displacement processing on the image data of the tracking area of the image to be detected to obtain a plurality of detection sample images, wherein the plurality of detection sample images have different offset directions and offset amounts relative to the image to be detected;
extracting gradient features and gray features of the detection sample images by adopting the preset feature extraction mode to obtain a plurality of detection feature images corresponding to the detection sample images;
calculating correlation values of the detection characteristic images by using the detection model, and counting the correlation values corresponding to the detection sample images;
and determining a target area in the image to be detected according to the offset direction and the offset of the detected sample image corresponding to the maximum correlation value.
10. The package tracking method according to claim 9, wherein the determining the target area in the image to be detected according to the offset direction and the offset of the detection sample image corresponding to the maximum correlation value comprises:
taking the direction opposite to the offset direction as the motion direction of the target area in the image to be detected;
decomposing the direction of movement into a first direction of movement that is the same as the direction of conveyance in the parcel delivery area and a second direction of movement that is perpendicular to the direction of conveyance;
decomposing the offset into a first offset in the first movement direction and a second offset in the second movement direction;
and offsetting a target area in the target image by a first offset amount towards the first movement direction, and taking the offset target area as the target area of the image to be detected.
11. The parcel tracking method according to claim 10, wherein, when the difference value between the first sequence number of the image to be detected in the video data and the second sequence number of the image in the video data in which the target parcel is detected is greater than a third preset number, after the decomposing of the offset into a first offset in the first moving direction and a second offset in the second moving direction, and before the offsetting of the target area in the target image by the first offset towards the first moving direction and taking the offset target area as the target area of the image to be detected, the method further comprises:
calculating the average value of the stored third preset number of first offsets;
judging whether the difference value between the first offset value obtained by decomposition and the average value obtained by calculation is larger than a preset threshold value or not;
if so, taking the calculated average value as a first offset for determining a target area of the image to be detected;
if not, the decomposed first offset is used as the first offset for determining the target area of the image to be detected, and the moving average filter is updated.
12. The package tracking method of claim 11, wherein the updating the moving average filter comprises:
storing the decomposed first offset and a first sequence number corresponding to the decomposed first offset;
and deleting the first offset with the minimum first sequence number in the stored first offsets.
13. The package tracking method according to claim 3, wherein enlarging the target area and using the enlarged target area as a tracking area in the target image and the image to be detected comprises:
expanding the target area by a preset multiple by taking the center of the target area as a base point;
taking the expanded target area as a tracking area of the target image;
and taking the area with the same position and size as the tracking area of the target image in the image to be detected as the tracking area of the image to be detected.
14. The parcel tracking method according to claim 1, wherein after the tracking area of the image to be detected is detected by using the detection model to obtain a target area corresponding to the target parcel in the image to be detected, the method further comprises:
judging whether the first sequence number of the image to be detected in the video data is an integral multiple of a fourth preset number or not;
and if so, outputting the image to be detected and the target area of the image to be detected.
15. A package tracking device comprising a memory and a processor coupled to each other, the processor being configured to execute program instructions stored by the memory to implement the package tracking method of any of claims 1 to 14.
16. The package tracking device of claim 15, further comprising a scanning device for scanning the package delivery area for video data.
17. A storage device having stored thereon program instructions executable by a processor to implement the package tracking method of any one of claims 1 to 14.
CN201911037111.9A 2019-10-29 2019-10-29 Parcel tracking method and related device Active CN110796412B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911037111.9A CN110796412B (en) 2019-10-29 2019-10-29 Parcel tracking method and related device


Publications (2)

Publication Number Publication Date
CN110796412A (en) 2020-02-14
CN110796412B (en) 2022-09-06

Family

ID=69441852

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911037111.9A Active CN110796412B (en) 2019-10-29 2019-10-29 Parcel tracking method and related device

Country Status (1)

Country Link
CN (1) CN110796412B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112053382A (en) * 2020-08-04 2020-12-08 浙江大华技术股份有限公司 Access & exit monitoring method, equipment and computer readable storage medium
US11645610B2 (en) 2020-08-13 2023-05-09 Zflo Technologies Llc System and method for deterring theft of package, and device therefor
CN113255651A (en) * 2021-06-25 2021-08-13 科大讯飞(苏州)科技有限公司 Package security check method, device and system, node equipment and storage device
CN114693735B (en) * 2022-03-23 2023-03-14 成都智元汇信息技术股份有限公司 Video fusion method and device based on target recognition
CN114694064B (en) * 2022-03-23 2023-05-02 成都智元汇信息技术股份有限公司 Graph cutting method and system based on target recognition
CN115393563A (en) * 2022-09-15 2022-11-25 杭州萤石软件有限公司 Package detection method and system and electronic equipment
CN115249254B (en) * 2022-09-21 2022-12-30 江西财经大学 Target tracking method and system based on AR technology


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20060003670A (en) * 2004-07-07 2006-01-11 삼성전자주식회사 Method and apparatus for auto-reporting a result of self-test
CN105550750A (en) * 2015-12-21 2016-05-04 长沙网动网络科技有限公司 Method for improving identification precision of convolutional neural network
CN106934397A (en) * 2017-03-13 2017-07-07 北京市商汤科技开发有限公司 Image processing method, device and electronic equipment
CN110378264A (en) * 2019-07-08 2019-10-25 Oppo广东移动通信有限公司 Method for tracking target and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Video-Based Motion Detection and Tracking Technology; Wang Song; China Master's and Doctoral Dissertations Full-text Database, Information Science and Technology; 2018-01-15; Chapters 3-4 of the main text *


Similar Documents

Publication Publication Date Title
CN110796412B (en) Parcel tracking method and related device
CN107358149B (en) Human body posture detection method and device
US8582887B2 (en) Image processing system, learning device and method, and program
EP3306562B1 (en) Image processing method and device
KR101932009B1 (en) Image processing apparatus and method for multiple object detection
US8594460B2 (en) Digital image manipulation
US8548247B2 (en) Image processing apparatus and method, and program
US10297029B2 (en) Method and device for image segmentation
CN108108731B (en) Text detection method and device based on synthetic data
CN108475331A (en) Use the candidate region for the image-region for including interested object of multiple layers of the characteristic spectrum from convolutional neural networks model
US9477885B2 (en) Image processing apparatus, image processing method and image processing program
US20170011272A1 (en) Realtime object measurement
US20190347508A1 (en) System and method for object recognition based estimation of planogram compliance
KR20190080275A (en) Barcode detecting apparatus and barcode detecting method using the apparatus
US20170352170A1 (en) Nearsighted camera object detection
US9191554B1 (en) Creating an electronic book using video-based input
Fang et al. 1-D barcode localization in complex background
Nguyen et al. Tensor voting based text localization in natural scene images
CN117218633A (en) Article detection method, device, equipment and storage medium
CN112884755B (en) Method and device for detecting contraband
EP2884427B1 (en) Method and system for describing an image
JP2007140729A (en) Method and device detecting position and attitude of article
CN114550062A (en) Method and device for determining moving object in image, electronic equipment and storage medium
JP5625196B2 (en) Feature point detection device, feature point detection method, feature point detection program, and recording medium
CN112364835A (en) Video information frame taking method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant