CN113762190A - Neural network-based parcel stacking detection method and device - Google Patents

Neural network-based parcel stacking detection method and device

Info

Publication number
CN113762190A
CN113762190A · Application CN202111078544.6A
Authority
CN
China
Prior art keywords
parcel
detection
package
stacking
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111078544.6A
Other languages
Chinese (zh)
Other versions
CN113762190B (en)
Inventor
朱强
唐金亚
杜萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Weizhi Intelligent Manufacturing Technology Jiangsu Co ltd
Original Assignee
Zhongke Weizhi Intelligent Manufacturing Technology Jiangsu Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Weizhi Intelligent Manufacturing Technology Jiangsu Co ltd filed Critical Zhongke Weizhi Intelligent Manufacturing Technology Jiangsu Co ltd
Priority to CN202111078544.6A priority Critical patent/CN113762190B/en
Publication of CN113762190A publication Critical patent/CN113762190A/en
Application granted granted Critical
Publication of CN113762190B publication Critical patent/CN113762190B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a neural network-based parcel stacking detection method and device. The method comprises the following steps: 101, training a parcel detection neural network model, wherein the network is based on the YOLOv3 object detection model, the backbone adopts the MobileNetV2 architecture, and the detection head outputs detection boxes through two YOLO layers; 102, acquiring a belt image with a top-scan camera while a parcel is conveyed; 103, extracting parcel information from the belt image with the parcel detection neural network model, the extracted information comprising the positions of parcels and the relations between parcels; and 104, judging the parcel stacking state from the extracted information. The invention effectively detects parcel stacking and improves the accuracy of automatic parcel feeding and the sorting efficiency.

Description

Neural network-based parcel stacking detection method and device
Technical Field
The invention relates to a detection method and a detection device, in particular to a neural network-based parcel stacking detection method and a neural network-based parcel stacking detection device.
Background
A fully automatic parcel feeding system consists of a cross-belt sorting system and a parcel separation system. Parcels must enter the cross-belt sorting system one by one, in sequence; if stacked parcels enter, sorting failures occur and sorting efficiency drops.
At present, parcel separation is controlled by varying the belt speed, which cannot separate severely stacked parcels. Therefore, to guarantee parcel sorting accuracy and efficiency, detecting stacked parcels before they enter the cross-belt sorting system is a problem to be solved urgently.
Disclosure of Invention
The invention aims to overcome the above defects of the prior art and provide a neural network-based parcel stacking detection method and device that effectively detect parcel stacking and improve the accuracy of automatic parcel feeding and the sorting efficiency.
According to the technical scheme provided by the invention, the neural network-based parcel stacking detection method comprises the following steps:
101, training a parcel detection neural network model, wherein the network is based on the YOLOv3 object detection model, the backbone adopts the MobileNetV2 architecture, and the detection head outputs detection boxes through two YOLO layers;
102, acquiring a belt image with a top-scan camera while a parcel is conveyed;
103, extracting parcel information from the belt image with the parcel detection neural network model, the extracted information comprising the positions of parcels and the relations between parcels;
and 104, judging the parcel stacking state from the extracted parcel information.
In step 101, when the YOLOv3-based object detection model is adopted, training the neural network comprises a pre-training step and a stacking detection training step; wherein,
during pre-training, the data set is the COCO data set and the network weights are initialized with the Kaiming (He) initialization method; the network is trained for a fixed number M1 of iterations, and the weights with the best mAP are selected as the pre-training weights;
during stacking detection training, training is performed on a collected stacking detection data set, and the network weights are initialized with the weights obtained by pre-training; the validation set contains only images with exactly one single-parcel or exactly one multi-parcel target; after non-maximum suppression, each validation image is classified as single-parcel or multi-parcel, the F1-score is computed, an F1-score curve is recorded, and the weights with the best F1-score are selected at the end of training; the initial learning rate is 0.0005 and the learning-rate decay strategy is cosine annealing.
When the parcel information in the belt image is extracted by the parcel detection neural network model, the extracted parcel targets are post-processed with non-maximum suppression, and suppression is performed only among targets of the same class.
When non-maximum suppression is used to post-process the extracted parcel targets, the procedure is as follows:
sort the target boxes in the target box set {B} by confidence from high to low;
traverse the target box set {B}; for each target box B_i, go backwards starting from target box B_{i+1}; if target box B_i and target box B_j belong to the same class and their intersection-over-union is larger than the threshold, mark target box B_j as removed;
while traversing the target box set {B}, if target box B_i has already been marked as removed, skip the remaining operations of this round and start the next round from target box B_{i+1}; where i ranges over [0, N-2], N is the number of target boxes, and j ranges over [i+1, N-1].
In step 104, the parcel stacking state is judged from the extracted parcel information as follows:
when only one single parcel appears, it is judged as not stacked;
when multiple parcels appear and the head-to-tail distance between the upright bounding rectangles of the first and second parcels is smaller than a threshold L, it is judged as stacked;
when multiple parcels appear, the first parcel is of the single-parcel class, and the head-to-tail distance between the upright bounding rectangles of the first and second parcels is larger than the threshold L, it is judged as not stacked;
regardless of whether multiple parcels appear, when the first parcel is of the multi-parcel class, it is judged as stacked.
The threshold L is not greater than 250 mm.
A neural network-based parcel stacking detection device, comprising:
an image acquisition module that acquires a belt image while a parcel is conveyed;
a stacking detection module connected to the image acquisition module and provided with a parcel detection neural network model, wherein the network is based on the YOLOv3 object detection model, the backbone adopts a MobileNetV2-based structure, and the detection head outputs detection boxes through two YOLO layers; the stacking detection module extracts parcel information from the belt image acquired by the image acquisition module and judges the corresponding parcel stacking state from the extracted information;
and a signal sending module connected to the stacking detection module and to the parcel conveying control system, which sends the parcel stacking state judged by the stacking detection module to the parcel conveying control system.
The image acquisition module comprises an area-array color camera, and the distance between the image acquisition module and the belt conveying the parcels is 1000 mm to 1200 mm.
The parcel stacking state is judged from the extracted parcel information as follows:
when only one single parcel appears, it is judged as not stacked;
when multiple parcels appear and the head-to-tail distance between the upright bounding rectangles of the first and second parcels is smaller than a threshold L, it is judged as stacked;
when multiple parcels appear, the first parcel is of the single-parcel class, and the head-to-tail distance between the upright bounding rectangles of the first and second parcels is larger than the threshold L, it is judged as not stacked;
regardless of whether multiple parcels appear, when the first parcel is of the multi-parcel class, it is judged as stacked.
A virtual sending line is configured in the signal sending module; the stacking signal is sent to the control system only when a parcel touches the sending line, and target tracking prevents the same parcel from triggering the signal multiple times.
The invention has the following advantages: the image acquisition module acquires an image of the belt conveying the parcels; the stacking detection module is provided with a parcel detection neural network model, which is based on the YOLOv3 object detection model, whose backbone adopts a MobileNetV2-based structure, and whose detection head outputs detection boxes through two YOLO layers; the stacking detection module extracts parcel information from the belt image acquired by the image acquisition module and judges the corresponding parcel stacking state, i.e. it extracts the parcel information and judges the stacking situation at the current moment, and the signal sending module sends this situation to the control system; the control system can change the parcel running direction according to the stacking situation, so that stacked parcels are prevented from entering the subsequent sorting system, which effectively improves parcel sorting accuracy and efficiency.
Drawings
FIG. 1 is a flow chart of the detection according to the present invention.
Fig. 2 is a flow chart of parcel conveying control according to the present invention.
Fig. 3 is a schematic diagram of the architecture of MobileNetV2 according to the present invention.
Detailed Description
The invention is further illustrated by the following specific figures and examples.
As shown in fig. 1, in order to effectively detect parcel stacking and improve the accuracy of automatic parcel feeding and the sorting efficiency, the parcel stacking detection method of the present invention comprises the following steps:
101, training a parcel detection neural network model, wherein the network is based on the YOLOv3 object detection model, the backbone adopts the MobileNetV2 architecture, and the detection head outputs detection boxes through two YOLO layers;
in specific implementation, the YOLOV3 target detection model is a commonly used target detection model, and is specifically known to those skilled in the art, and will not be described herein again. In order to adapt to package detection, the backbone network adopts a structure based on MobileNet V2, and a detection header part outputs a detection frame by using a two-layer YOLO layer, wherein the specific situation based on the structure of MobileNet V2 is consistent with the prior art. Fig. 3 shows a case based on the MobileNetV2 architecture, specifically, input represents the size of an input tensor, Operator represents a tensor operation method, c represents the number of output channels, n represents the number of Operator iterations, s represents the step length of output relative to input, Conv2D is a 2D convolution, bottleneck is an inverse residual module, and the inverse residual module bottleneck may specifically adopt a conventional and commonly used form, that is, the MobileNetV2 architecture is specifically known by those skilled in the art, and details thereof are not repeated here.
In addition, the detection head outputs detection boxes through two YOLO layers, which is a property of the YOLOv3 object detection model, is well known to those skilled in the art, and is not described again here.
In the embodiment of the invention, when the parcel detection neural network model is based on the YOLOv3 object detection model, training the neural network comprises a pre-training step and a stacking detection training step. In specific implementation, during pre-training and stacking detection training, the training image resolution constrains the image resolution required in forward inference: the training resolutions are chosen according to the inference resolution, which should lie between the minimum and maximum training resolutions. For example, if the inference resolution is 320 × 320, the minimum training resolution is 224 and the maximum is 512, i.e. the input resolutions are chosen as multiples of 32 between 224 and 512.
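As a sketch of this multi-scale constraint, the following hypothetical helper draws a training input resolution as a multiple of 32 between 224 and 512, so that a 320 × 320 inference resolution stays within the training range; the function name and the use of a uniform random choice are assumptions.

```python
import random

def pick_training_resolution(min_res=224, max_res=512, stride=32):
    """Pick a random training input resolution that is a multiple of 32."""
    choices = list(range(min_res, max_res + 1, stride))  # 224, 256, ..., 512
    return random.choice(choices)

# Example: the inference resolution (e.g. 320 x 320) lies inside [224, 512].
print(pick_training_resolution())
```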
During pre-training, the data set is the COCO data set and the network weights are initialized with the Kaiming (He) initialization method; the network is trained for a fixed number M1 of iterations, and the weights with the best mAP are selected as the pre-training weights.
in order to adapt to package detection, after a YOLOV 3-based target detection model is selected, a corresponding training step is further required, the training step specifically includes a pre-training step and a stack detection step, and the purpose of the specific step is consistent with that of the prior art, and is not described herein again.
In the embodiment of the invention, during pre-training the model is trained on the COCO data set and the network weights are initialized with the Kaiming (He) initialization method; specifically, the COCO data set is the MS-COCO-2017 data set, which can be obtained from the MS COCO official website. During pre-training the network is set to detect 80 classes. In specific implementation, the fixed iteration count M1 is set to 800020, and the weights with the best mAP value are selected as the pre-training weights; the initial learning rate is 0.001, the learning-rate decay strategy is piecewise (step) decay, with the rate multiplied by 0.9 at iterations 400000 and 650000; during the first 4000 iterations the learning rate increases linearly from 0 to 0.001.
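The pre-training learning-rate schedule described above (linear warm-up from 0 to 0.001 over the first 4000 iterations, then a 0.9× step decay at iterations 400000 and 650000) can be sketched as follows; the function name and defaults are illustrative assumptions.

```python
def pretrain_learning_rate(iteration, base_lr=0.001, warmup_iters=4000,
                           decay_points=(400000, 650000), decay_factor=0.9):
    """Learning rate for pre-training: linear warm-up, then piecewise step decay."""
    if iteration < warmup_iters:
        # Warm-up: increase linearly from 0 to base_lr.
        return base_lr * iteration / warmup_iters
    lr = base_lr
    for point in decay_points:
        if iteration >= point:
            lr *= decay_factor  # multiply by 0.9 at each decay point passed
    return lr
```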
After the data set, the network weight initialization method and other conditions of the pre-training process are chosen, pre-training of the YOLOv3 object detection model can be carried out; the specific training process is well known to those skilled in the art and is not described again here.
During stacking detection training, training is performed on the collected stacking detection data set, and the network weights are initialized with the weights obtained by pre-training; the validation set contains only images with exactly one single-parcel or exactly one multi-parcel target; after non-maximum suppression, each validation image is classified as single-parcel or multi-parcel, the F1-score is computed, an F1-score curve is recorded, and the weights with the best F1-score are selected at the end of training; the initial learning rate is 0.0005 and the learning-rate decay strategy is cosine annealing.
In the embodiment of the invention, the stacking detection training data set can be captured on an actual production line. Specifically, the camera is fixed above the belt to collect images. When no samples exist yet, images are captured at equal intervals and valid samples are selected manually; once a small-scale sample set is available, a preliminary stacking detection training is carried out, the camera captures images in real time, and the trained model automatically saves images containing parcels as valid samples. During stacking detection training the network is set to detect 2 classes, i.e. single-parcel and multi-parcel samples; the specific procedure for building the stacking detection training data set can be chosen according to actual needs and is well known to those skilled in the art, so it is not described again here.
In specific implementation, training is carried out on the collected stacking detection data set, and the network weights are initialized with the weights obtained by pre-training; the validation set contains only images with exactly one single-parcel or exactly one multi-parcel target; after non-maximum suppression, each validation image is classified as single-parcel or multi-parcel, the F1-score is computed, an F1-score curve is recorded, and the weights with the best F1-score are selected at the end of training; the initial learning rate is 0.0005 and the decay strategy is cosine annealing; the total amount of training is determined by the data set size and should be no less than 2000 iteration batches. In practice, when more image samples have been collected, training can be fixed at 150 epochs.
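A minimal sketch of the cosine-annealing schedule used in the stacking detection training, starting from the stated initial learning rate of 0.0005; the total iteration count and the final learning-rate floor of 0 are assumptions.

```python
import math

def cosine_annealed_lr(iteration, total_iters, base_lr=0.0005, min_lr=0.0):
    """Cosine annealing from base_lr down to min_lr over total_iters iterations."""
    progress = min(iteration, total_iters) / total_iters
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```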
In the training process, the specific steps of computing the F1-score, recording the F1-score curve, and selecting the weights with the best F1-score at the end of training are consistent with the prior art, are well known to those skilled in the art, and are not described again here. After pre-training and stacking detection training, the required parcel detection neural network model based on the YOLOv3 object detection model is obtained.
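As a sketch of the checkpoint-selection criterion: each validation image holds exactly one single-parcel or one multi-parcel target, the post-NMS prediction is reduced to a per-image label, and the weights with the best F1-score are kept. The choice of the multi-parcel class as the positive class and the helper names are assumptions.

```python
def f1_score(true_labels, pred_labels, positive="multi"):
    """F1 on per-image labels, treating the multi-parcel class as positive."""
    tp = sum(1 for t, p in zip(true_labels, pred_labels)
             if t == positive and p == positive)
    fp = sum(1 for t, p in zip(true_labels, pred_labels)
             if t != positive and p == positive)
    fn = sum(1 for t, p in zip(true_labels, pred_labels)
             if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

# After each validation pass, keep the weights whose F1 is the best so far, e.g.:
# if current_f1 > best_f1: best_f1, best_weights = current_f1, copy(weights)
```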
102, acquiring a belt image with a top-scan camera while a parcel is conveyed;
specifically, the manner and process of acquiring the belt image by the top-scan camera are consistent with those of the prior art, which are well known to those skilled in the art, and are not described herein again.
103, extracting parcel information from the belt image with the parcel detection neural network model, the extracted information comprising the positions of parcels and the relations between parcels;
In the embodiment of the invention, after the parcel detection neural network model is obtained, the parcel information in any acquired belt image can be extracted with the model; the process of obtaining parcel information with the model is the same as in the prior art and is not described again here. In specific implementation, the extracted parcel information includes the positions of parcels and the relations between parcels.
In specific implementation, when the parcel information in the belt image is extracted by the parcel detection neural network model, the extracted parcel targets are post-processed with non-maximum suppression, and suppression is performed only among targets of the same class.
In the embodiment of the invention, when non-maximum suppression is used to post-process the extracted parcel targets, the procedure is as follows:
sort the target boxes in the target box set {B} by confidence from high to low;
traverse the target box set {B}; for each target box B_i, go backwards starting from target box B_{i+1}; if target box B_i and target box B_j belong to the same class and their intersection-over-union is larger than the threshold, mark target box B_j as removed;
while traversing the target box set {B}, if target box B_i has already been marked as removed, skip the remaining operations of this round and start the next round from target box B_{i+1}; where i ranges over [0, N-2], N is the number of target boxes, and j ranges over [i+1, N-1].
In the embodiment of the invention, the target box set {B} contains a plurality of target boxes, whose number depends on the acquired belt image, which is well known to those skilled in the art and is not described again here. When the parcel information is extracted with the parcel detection neural network model, the confidence of each target box is also output, consistent with existing methods, and is not repeated here. After the non-maximum suppression processing, interference between single-parcel and multi-parcel detections is reduced.
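A minimal Python sketch of the class-aware non-maximum suppression described above: boxes are sorted by confidence, and a box is suppressed only by a higher-confidence box of the same class whose intersection-over-union exceeds the threshold. The (x1, y1, x2, y2) box format and the function names are assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def class_aware_nms(detections, iou_threshold=0.5):
    """detections: list of (box, score, cls). Suppress only within the same class."""
    dets = sorted(detections, key=lambda d: d[1], reverse=True)  # confidence high to low
    removed = [False] * len(dets)
    for i in range(len(dets) - 1):
        if removed[i]:
            continue  # B_i already marked as removed: skip this round
        for j in range(i + 1, len(dets)):
            if removed[j]:
                continue
            same_class = dets[i][2] == dets[j][2]
            if same_class and iou(dets[i][0], dets[j][0]) > iou_threshold:
                removed[j] = True  # mark B_j as removed
    return [d for d, r in zip(dets, removed) if not r]
```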
104, judging the parcel stacking state from the extracted parcel information.
In specific implementation, the parcel stacking state is judged from the extracted parcel information as follows:
when only one single parcel appears, it is judged as not stacked;
when multiple parcels appear and the head-to-tail distance between the upright bounding rectangles of the first and second parcels is smaller than a threshold L, it is judged as stacked;
when multiple parcels appear, the first parcel is of the single-parcel class, and the head-to-tail distance between the upright bounding rectangles of the first and second parcels is larger than the threshold L, it is judged as not stacked;
regardless of whether multiple parcels appear, when the first parcel is of the multi-parcel class, it is judged as stacked.
As can be seen from the above description, after the parcel information is extracted from the belt image, the positions of the parcels and the relations between them are available, so the parcel stacking state can be judged from this information. In specific implementation, the threshold L is not greater than 250 mm; its exact value can be chosen according to the actual conditions of parcel conveying, which are well known to those skilled in the art and are not described again here.
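A sketch of the decision rules above, assuming detections are ordered along the conveying direction (foremost parcel first), that classes are labelled "single" and "multi", and that the head-to-tail distance is the gap between the upright bounding rectangles along the belt; these conventions are illustrative, not prescribed by the patent.

```python
def judge_stacking(parcels, threshold_l_mm=250):
    """parcels: list of (cls, rect) ordered front to back, cls in {"single", "multi"},
    rect = (head_mm, tail_mm) of the upright bounding rectangle along the belt."""
    if not parcels:
        return "not_stacked"
    first_cls, first_rect = parcels[0]
    # A multi-parcel detection at the front is always judged as stacking.
    if first_cls == "multi":
        return "stacked"
    if len(parcels) == 1:
        return "not_stacked"          # only one single parcel
    _, second_rect = parcels[1]
    gap = second_rect[0] - first_rect[1]   # head of second minus tail of first
    return "stacked" if gap < threshold_l_mm else "not_stacked"
```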
In summary, a neural network-based parcel stacking detection device can be obtained, specifically comprising:
an image acquisition module that acquires a belt image while a parcel is conveyed;
a stacking detection module connected to the image acquisition module and provided with a parcel detection neural network model, wherein the network is based on the YOLOv3 object detection model, the backbone adopts a MobileNetV2-based structure, and the detection head outputs detection boxes through two YOLO layers; the stacking detection module extracts parcel information from the belt image acquired by the image acquisition module and judges the corresponding parcel stacking state from the extracted information;
and a signal sending module connected to the stacking detection module and to the parcel conveying control system, which sends the parcel stacking state judged by the stacking detection module to the parcel conveying control system.
In the embodiment of the invention, the image acquisition module may use an area-array camera with a color or grayscale sensor, selected according to actual needs, which is not described again here. When an area-array color camera is used to acquire the target image, the mounting height depends on the actual field of view; preferably, the camera is suspended 1000 mm to 1200 mm above the belt.
In real scenes the parcels may move at high speed, so the exposure time should be reduced to avoid motion blur in the captured image; a supplementary lighting module should therefore also be included. The lighting module can adopt a mainstream LED light source, which generates little heat, has high brightness and a long service life; the device uses white LED beads with a fixed emission angle.
For the stacking detection module, the parcel detection neural network is provided inside it; the specifics of the parcel detection neural network and of the detection performed on the belt image are all consistent with the above description, to which reference may be made, and are not repeated here.
A virtual sending line is configured in the signal sending module; the stacking signal is sent to the control system only when a parcel touches the sending line, and target tracking prevents the same parcel from triggering the signal multiple times.
In the embodiment of the invention, each parcel is tracked and added to a tracking list when it enters the field of view. When a parcel touches the virtual sending line, the stacking result is sent to the parcel conveying control system, which controls the conveying process of the parcel; the parcel conveying control system may take an existing, commonly used form, its conveying control process is consistent with the prior art, and it is well known to those skilled in the art and not described again here.
In implementation, the image acquisition frame rate and the parcel moving speed should be considered when setting the virtual sending line: it must be ensured that a parcel cannot move from a position where it has not yet touched the sending line to outside the field of view within the interval between two image acquisitions. When the parcel moving speed is 1800 mm/s and the capture interval is 100 ms, the virtual sending line is placed 250 mm from the exit edge of the field of view; here the exit edge refers to the side that a parcel touches when it leaves the image. The exact position of the virtual sending line must be adjusted according to the belt speed and the image acquisition interval, and its distance from the exit edge should be slightly larger than the distance the belt moves in one acquisition interval. During operation, each frame is checked for the positional relation between the foremost parcel and the sending line; when the foremost parcel touches the sending line, the current judgment result is sent to the parcel conveying control system.
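As a worked check of this placement rule: the distance between the virtual sending line and the exit edge must exceed the belt travel in one capture interval; with 1800 mm/s and 100 ms this travel is 180 mm, so the 250 mm offset is sufficient. The helper name is hypothetical.

```python
def min_send_line_offset(belt_speed_mm_s, capture_interval_s, margin_mm=0.0):
    """Smallest admissible distance between the virtual sending line and the
    field-of-view exit edge: belt travel in one capture interval plus a margin."""
    return belt_speed_mm_s * capture_interval_s + margin_mm

travel = min_send_line_offset(1800, 0.100)   # 180 mm per capture interval
assert 250 > travel                           # the 250 mm offset satisfies the constraint
```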
Target tracking may be performed in a tracking-by-detection manner known to those skilled in the art, matching the parcel detections of the current frame against the results of the previous frame; the specific target tracking process is consistent with the prior art, is well known to those skilled in the art, and is not described again.
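A minimal tracking-by-detection sketch in this spirit: current-frame detections are greedily matched to previous-frame tracks by intersection-over-union (reusing the iou helper from the non-maximum suppression sketch above), and unmatched detections open new track identifiers. The greedy matching strategy and the data layout are assumptions, not necessarily the exact method used.

```python
def match_tracks(prev_tracks, detections, iou_threshold=0.3):
    """prev_tracks: {track_id: box}; detections: list of boxes (x1, y1, x2, y2).
    Returns ({track_id: box} for the current frame, next free track id).
    Uses the iou() helper defined in the NMS sketch above."""
    next_id = max(prev_tracks, default=-1) + 1
    current, used = {}, set()
    for det in detections:
        # Greedy best-IoU match against not-yet-matched previous tracks.
        best_id, best_iou = None, iou_threshold
        for tid, box in prev_tracks.items():
            if tid in used:
                continue
            overlap = iou(box, det)
            if overlap > best_iou:
                best_id, best_iou = tid, overlap
        if best_id is None:
            best_id = next_id        # a new parcel entered the field of view
            next_id += 1
        used.add(best_id)
        current[best_id] = det
    return current, next_id
```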
As shown in fig. 2, the work flow of the parcel stacking detection device of the invention is as follows: the image acquisition module acquires an image of the belt conveying the parcels; the stacking detection module extracts the parcel information, including parcel position, parcel spacing and parcel class, and judges the stacking situation at the current moment; and the signal sending module sends the stacking situation to the control system. This solves the mis-sorting problem caused by parcel stacking and effectively improves the efficiency and accuracy of automatic parcel feeding.
The technical features of the embodiments described above may be combined arbitrarily; for brevity, not all possible combinations are described, but any combination of these features that involves no contradiction should be considered within the scope of this specification.
The embodiments described above express only several implementations of the present application; their description is relatively specific and detailed but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A neural network-based parcel stacking detection method, characterized by comprising the following steps:
101, training a parcel detection neural network model, wherein the network is based on the YOLOv3 object detection model, the backbone adopts the MobileNetV2 architecture, and the detection head outputs detection boxes through two YOLO layers;
102, acquiring a belt image with a top-scan camera while a parcel is conveyed;
103, extracting parcel information from the belt image with the parcel detection neural network model, the extracted information comprising the positions of parcels and the relations between parcels;
and 104, judging the parcel stacking state from the extracted parcel information.
2. The method according to claim 1, characterized in that in step 101, when the YOLOv3-based object detection model is adopted, training the neural network comprises a pre-training step and a stacking detection training step; wherein,
during pre-training, the data set is the COCO data set and the network weights are initialized with the Kaiming (He) initialization method; the network is trained for a fixed number M1 of iterations, and the weights with the best mAP are selected as the pre-training weights;
during stacking detection training, training is performed on a collected stacking detection data set, and the network weights are initialized with the weights obtained by pre-training; the validation set contains only images with exactly one single-parcel or exactly one multi-parcel target; after non-maximum suppression, each validation image is classified as single-parcel or multi-parcel, the F1-score is computed, an F1-score curve is recorded, and the weights with the best F1-score are selected at the end of training; the initial learning rate is 0.0005 and the learning-rate decay strategy is cosine annealing.
3. The method of claim 1, characterized in that when the parcel information in the belt image is extracted by the parcel detection neural network model, the extracted parcel targets are post-processed with non-maximum suppression, and suppression is performed only among targets of the same class.
4. The method of claim 3, characterized in that the non-maximum suppression of the extracted parcel targets comprises:
sorting the target boxes in the target box set {B} by confidence from high to low;
traversing the target box set {B}; for each target box B_i, going backwards starting from target box B_{i+1}; if target box B_i and target box B_j belong to the same class and their intersection-over-union is larger than the threshold, marking target box B_j as removed;
while traversing the target box set {B}, if target box B_i has already been marked as removed, skipping the remaining operations of this round and starting the next round from target box B_{i+1}; where i ranges over [0, N-2], N is the number of target boxes, and j ranges over [i+1, N-1].
5. The neural network-based parcel stacking detection method as claimed in any one of claims 1 to 4, characterized in that in step 104, the parcel stacking state is judged from the extracted parcel information as follows:
when only one single parcel appears, it is judged as not stacked;
when multiple parcels appear and the head-to-tail distance between the upright bounding rectangles of the first and second parcels is smaller than a threshold L, it is judged as stacked;
when multiple parcels appear, the first parcel is of the single-parcel class, and the head-to-tail distance between the upright bounding rectangles of the first and second parcels is larger than the threshold L, it is judged as not stacked;
regardless of whether multiple parcels appear, when the first parcel is of the multi-parcel class, it is judged as stacked.
6. The neural network-based parcel stacking detection method of claim 5, wherein said threshold L is not greater than 250 mm.
7. A neural network-based parcel stacking detection device, characterized by comprising:
an image acquisition module that acquires a belt image while a parcel is conveyed;
a stacking detection module connected to the image acquisition module and provided with a parcel detection neural network model, wherein the network is based on the YOLOv3 object detection model, the backbone adopts a MobileNetV2-based structure, and the detection head outputs detection boxes through two YOLO layers; the stacking detection module extracts parcel information from the belt image acquired by the image acquisition module and judges the corresponding parcel stacking state from the extracted information;
and a signal sending module connected to the stacking detection module and to the parcel conveying control system, which sends the parcel stacking state judged by the stacking detection module to the parcel conveying control system.
8. The neural network-based parcel stacking detection device of claim 7, characterized in that the image acquisition module comprises an area-array color camera, and the distance between the image acquisition module and the belt conveying the parcels is 1000 mm to 1200 mm.
9. The neural network-based parcel stacking detection device as claimed in claim 7 or 8, characterized in that the parcel stacking state is judged from the extracted parcel information as follows:
when only one single parcel appears, it is judged as not stacked;
when multiple parcels appear and the head-to-tail distance between the upright bounding rectangles of the first and second parcels is smaller than a threshold L, it is judged as stacked;
when multiple parcels appear, the first parcel is of the single-parcel class, and the head-to-tail distance between the upright bounding rectangles of the first and second parcels is larger than the threshold L, it is judged as not stacked;
regardless of whether multiple parcels appear, when the first parcel is of the multi-parcel class, it is judged as stacked.
10. The neural network-based parcel stacking detection device as claimed in claim 7 or 8, characterized in that a virtual sending line is configured in the signal sending module; the stacking signal is sent to the control system only when a parcel touches the sending line, and target tracking prevents the same parcel from triggering the signal multiple times.
CN202111078544.6A 2021-09-15 2021-09-15 Method and device for detecting package stacking based on neural network Active CN113762190B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111078544.6A CN113762190B (en) 2021-09-15 2021-09-15 Method and device for detecting package stacking based on neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111078544.6A CN113762190B (en) 2021-09-15 2021-09-15 Method and device for detecting package stacking based on neural network

Publications (2)

Publication Number Publication Date
CN113762190A true CN113762190A (en) 2021-12-07
CN113762190B CN113762190B (en) 2024-03-29

Family

ID=78795635

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111078544.6A Active CN113762190B (en) 2021-09-15 2021-09-15 Method and device for detecting package stacking based on neural network

Country Status (1)

Country Link
CN (1) CN113762190B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188767A (en) * 2023-01-13 2023-05-30 湖北普罗格科技股份有限公司 Neural network-based stacked wood board counting method and system

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100222915A1 (en) * 2009-03-02 2010-09-02 Ralf Kuehnemann Method and device for automated loading of packages on a load carrier
CN108875595A (en) * 2018-05-29 2018-11-23 重庆大学 A kind of Driving Scene object detection method merged based on deep learning and multilayer feature
CN111310622A (en) * 2020-02-05 2020-06-19 西北工业大学 Fish swarm target identification method for intelligent operation of underwater robot
CN111369540A (en) * 2020-03-06 2020-07-03 西安电子科技大学 Plant leaf disease identification method based on mask convolutional neural network
CN111428558A (en) * 2020-02-18 2020-07-17 东华大学 Vehicle detection method based on improved YOLOv3 method
CN111461127A (en) * 2020-03-30 2020-07-28 华南理工大学 Example segmentation method based on one-stage target detection framework
CN111553200A (en) * 2020-04-07 2020-08-18 北京农业信息技术研究中心 Image detection and identification method and device
CN112163545A (en) * 2020-10-12 2021-01-01 北京易华录信息技术股份有限公司 Head feature extraction method and device, electronic equipment and storage medium
CN112257799A (en) * 2020-10-30 2021-01-22 电子科技大学中山学院 Method, system and device for detecting household garbage target
US20210027532A1 (en) * 2019-07-25 2021-01-28 General Electric Company Primitive-based 3d building modeling, sensor simulation, and estimation
CN112487915A (en) * 2020-11-25 2021-03-12 江苏科技大学 Pedestrian detection method based on Embedded YOLO algorithm
CN112668663A (en) * 2021-01-05 2021-04-16 南京航空航天大学 Aerial photography car detection method based on YOLOv4

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784204B (en) * 2018-12-25 2023-04-07 江苏大学 Method for identifying and extracting main fruit stalks of stacked cluster fruits for parallel robot
CN111461224B (en) * 2020-04-01 2022-08-16 西安交通大学 Phase data unwrapping method based on residual self-coding neural network

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100222915A1 (en) * 2009-03-02 2010-09-02 Ralf Kuehnemann Method and device for automated loading of packages on a load carrier
CN108875595A (en) * 2018-05-29 2018-11-23 重庆大学 A kind of Driving Scene object detection method merged based on deep learning and multilayer feature
US20210027532A1 (en) * 2019-07-25 2021-01-28 General Electric Company Primitive-based 3d building modeling, sensor simulation, and estimation
CN111310622A (en) * 2020-02-05 2020-06-19 西北工业大学 Fish swarm target identification method for intelligent operation of underwater robot
CN111428558A (en) * 2020-02-18 2020-07-17 东华大学 Vehicle detection method based on improved YOLOv3 method
CN111369540A (en) * 2020-03-06 2020-07-03 西安电子科技大学 Plant leaf disease identification method based on mask convolutional neural network
CN111461127A (en) * 2020-03-30 2020-07-28 华南理工大学 Example segmentation method based on one-stage target detection framework
CN111553200A (en) * 2020-04-07 2020-08-18 北京农业信息技术研究中心 Image detection and identification method and device
CN112163545A (en) * 2020-10-12 2021-01-01 北京易华录信息技术股份有限公司 Head feature extraction method and device, electronic equipment and storage medium
CN112257799A (en) * 2020-10-30 2021-01-22 电子科技大学中山学院 Method, system and device for detecting household garbage target
CN112487915A (en) * 2020-11-25 2021-03-12 江苏科技大学 Pedestrian detection method based on Embedded YOLO algorithm
CN112668663A (en) * 2021-01-05 2021-04-16 南京航空航天大学 Aerial photography car detection method based on YOLOv4

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
熊咏平 (Xiong Yongping): "Research on Deep Learning Object Detection Algorithms Based on YOLO" (基于YOLO的深度学习目标检测算法研究), China Master's Theses Full-text Database (Basic Sciences), pages 138-984 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188767A (en) * 2023-01-13 2023-05-30 湖北普罗格科技股份有限公司 Neural network-based stacked wood board counting method and system
CN116188767B (en) * 2023-01-13 2023-09-08 湖北普罗格科技股份有限公司 Neural network-based stacked wood board counting method and system

Also Published As

Publication number Publication date
CN113762190B (en) 2024-03-29

Similar Documents

Publication Publication Date Title
CN106960195B (en) Crowd counting method and device based on deep learning
CN110889328B (en) Method, device, electronic equipment and storage medium for detecting road traffic condition
CN111191611B (en) Traffic sign label identification method based on deep learning
JP7234538B2 (en) Image processing device, axle number detection system, fee setting device, fee setting system and program
CN107705334B (en) Camera abnormity detection method and device
CN113038018A (en) Method and device for assisting user in shooting vehicle video
US9216869B2 (en) Device and method for controlling the tracking of a value document stack
CN113052151B (en) Unmanned aerial vehicle automatic landing guiding method based on computer vision
CN111127520B (en) Vehicle tracking method and system based on video analysis
CN113884497A (en) FPCA (focal plane array) fly shooting detection method based on composite light source
CN115482474B (en) Bridge deck vehicle load identification method and system based on aerial image
JP2008004123A (en) Specific shape region extraction device and method, specific region extraction device and method, and copy condition decision device and method
CN113762190A (en) Neural network-based parcel stacking detection method and device
CN115082834B (en) Engineering vehicle black smoke emission monitoring method and system based on deep learning
CN111160100A (en) Lightweight depth model aerial photography vehicle detection method based on sample generation
CN111832345A (en) Container monitoring method, device and equipment and storage medium
CN110781962A (en) Target detection method based on lightweight convolutional neural network
CN112149471B (en) Loop detection method and device based on semantic point cloud
CN1296148C (en) Visual data processing system for fruit external appearance quality online detection technology
CN114511788A (en) Slope crack identification method, system, equipment and storage medium
CN109583306B (en) Bobbin residual yarn detection method based on machine vision
CN103605960A (en) Traffic state identification method based on fusion of video images with different focal lengths
CN113011408A (en) Method and system for recognizing characters and vehicle identification codes of multi-frame picture sequence
RU2618927C2 (en) Method for detecting moving objects
CN117274967A (en) Multi-mode fusion license plate recognition algorithm based on convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: No. 979, Antai Third Road, Xishan District, Wuxi City, Jiangsu Province, 214000

Applicant after: Zhongke Weizhi Technology Co.,Ltd.

Address before: 299 Dacheng Road, Xishan District, Wuxi City, Jiangsu Province

Applicant before: Zhongke Weizhi intelligent manufacturing technology Jiangsu Co.,Ltd.

Country or region before: China

CB02 Change of applicant information
GR01 Patent grant
GR01 Patent grant