CN113762190B - Method and device for detecting package stacking based on neural network


Info

Publication number
CN113762190B
Authority
CN
China
Prior art keywords
package
detection
stacking
neural network
training
Prior art date
Legal status
Active
Application number
CN202111078544.6A
Other languages
Chinese (zh)
Other versions
CN113762190A (en)
Inventor
朱强 (Zhu Qiang)
唐金亚 (Tang Jinya)
杜萍 (Du Ping)
Current Assignee
Zhongke Weizhi Technology Co ltd
Original Assignee
Zhongke Weizhi Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Zhongke Weizhi Technology Co ltd filed Critical Zhongke Weizhi Technology Co ltd
Priority to CN202111078544.6A priority Critical patent/CN113762190B/en
Publication of CN113762190A publication Critical patent/CN113762190A/en
Application granted granted Critical
Publication of CN113762190B publication Critical patent/CN113762190B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30 - Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a neural network-based package stacking detection method and device. The method comprises the following steps: step 101, training a package detection neural network model, wherein the network is based on the YOLOv3 object detection model, the backbone network adopts a MobileNetV2-based architecture, and the detection head uses two YOLO layers to output detection boxes; step 102, acquiring belt images while packages are being conveyed, through a top-mounted scanning camera; step 103, extracting the package information in the belt image through the package detection neural network model, wherein the extracted package information comprises the positions of the packages and the relationships between them; and step 104, judging the package stacking state according to the extracted package information. The invention can effectively detect package stacking and improve the accuracy of automatic package feeding and the sorting efficiency.

Description

Method and device for detecting package stacking based on neural network
Technical Field
The invention relates to a detection method and device, and in particular to a neural network-based package stacking detection method and device.
Background
A fully automatic package feeding system consists of a cross-belt sorting system and a package separation system. Packages must enter the cross-belt sorting system one by one in sequence; if stacked packages enter the cross-belt sorting system, sorting failures and reduced sorting efficiency result.
At present, package separation is controlled by varying the belt speed, and severely stacked packages cannot be separated in this way. To ensure sorting accuracy and efficiency, detecting stacked packages before they enter the cross-belt sorting system is therefore a critical problem to be solved.
Disclosure of Invention
The invention aims to overcome the above defects in the prior art by providing a neural network-based package stacking detection method and device that can effectively detect package stacking and improve the accuracy of automatic package feeding and the sorting efficiency.
According to the technical solution provided by the invention, the neural network-based package stacking detection method comprises the following steps:
Step 101, training a package detection neural network model, wherein the package detection neural network is based on the YOLOv3 object detection model, the backbone network adopts a MobileNetV2-based architecture, and the detection head uses two YOLO layers to output detection boxes;
Step 102, acquiring belt images while packages are being conveyed, through a top-mounted scanning camera;
Step 103, extracting the package information in the belt image through the package detection neural network model, wherein the extracted package information comprises the positions of the packages and the relationships between them;
Step 104, judging the package stacking state according to the extracted package information.
In step 101, when the YOLOv3-based object detection model is adopted, training of the neural network includes a pre-training step and a stacking detection training step; wherein,
in the pre-training step, the pre-training data set is the COCO data set, and the Kaiming He initialization method is used to initialize the network weights; the network is trained for a fixed number of M1 iterations, and the weights with the best mAP are selected as the pre-training weights;
during stacking detection training, stacking detection training is performed on the collected stacking detection data set, and the network weights are initialized with the weights obtained from pre-training; when selecting the validation set, only images that contain exactly one single package or exactly one multi-package sample are selected; after non-maximum suppression, each validation image is classified as single package or multiple packages, the F1-score is computed and its curve recorded, and the weights with the best F1-score are selected when training ends; the initial learning rate is 0.0005 and the learning-rate decay strategy is cosine annealing.
When the package information in the belt image is extracted through the package detection neural network model, non-maximum suppression is used to post-process the extracted package targets, and suppression is carried out only within the same class.
When non-maximum suppression is used to post-process and suppress the extracted package targets, the method comprises the following steps:
sorting the target boxes in the target box set {B} from high to low by confidence;
traversing the target box set {B}: for each target box B_i, traversing backwards from target box B_(i+1); if target box B_i and target box B_j belong to the same class and their intersection-over-union is greater than a threshold, marking target box B_j as removed;
while traversing the target box set {B}, if target box B_i has been marked as removed, skipping the remaining operations of this round of traversal and starting traversing from target box B_(i+1); wherein the value range of i is [0, N-2], N is the number of target boxes, and the value range of j is [i+1, N-1].
In step 104, the package stacking state is judged according to the extracted package information as follows:
when only one single package appears, it is judged as a non-stacking situation;
when multiple packages appear, if the head-to-tail distance between the axis-aligned bounding boxes of the first package and the second package is smaller than a threshold L, it is judged as a stacking situation;
when multiple packages appear, if the class of the first package is single package and the head-to-tail distance between its axis-aligned bounding box and that of the second package is greater than the threshold L, it is judged as a non-stacking situation;
regardless of whether multiple packages appear, if the class of the first package is multi-package, it is judged as a stacking situation.
The threshold L is not greater than 250mm.
A neural network-based package stacking detection device, characterized in that it comprises:
an image acquisition module, which can acquire belt images while packages are being conveyed;
a stacking detection module connected with the image acquisition module, in which a package detection neural network model is provided, wherein the package detection neural network is based on the YOLOv3 object detection model, the backbone network adopts a MobileNetV2-based architecture, and the detection head uses two YOLO layers to output detection boxes; the stacking detection module can extract package information from the belt image acquired by the image acquisition module and judge the package stacking state corresponding to the belt image according to the extracted package information;
a signal sending module connected with the stacking detection module and with the package conveying control system, which can send the package stacking state judged by the stacking detection module to the package conveying control system.
The image acquisition module comprises an area-array color camera, and the distance between the image acquisition module and the belt conveying the packages is 1000 mm to 1200 mm.
The package stacking state is judged according to the extracted package information as follows:
when only one single package appears, it is judged as a non-stacking situation;
when multiple packages appear, if the head-to-tail distance between the axis-aligned bounding boxes of the first package and the second package is smaller than a threshold L, it is judged as a stacking situation;
when multiple packages appear, if the class of the first package is single package and the head-to-tail distance between its axis-aligned bounding box and that of the second package is greater than the threshold L, it is judged as a non-stacking situation;
regardless of whether multiple packages appear, if the class of the first package is multi-package, it is judged as a stacking situation.
A virtual send line is configured in the signal sending module, and a stacking signal is sent to the control system only when a package touches the send line; target tracking is used to prevent the same package from triggering the signal multiple times.
The invention has the following advantages: the image acquisition module acquires images of the belt used to convey the packages; a package detection neural network model, based on the YOLOv3 object detection model with a MobileNetV2-based backbone and a detection head that outputs detection boxes from two YOLO layers, is provided in the stacking detection module; the stacking detection module extracts package information from the acquired belt image and judges the package stacking state at the current moment, and the signal sending module sends the stacking result to the control system; the control system can then change the travel direction of the packages according to the stacking result, preventing stacked packages from entering the downstream sorting system and effectively improving package feeding accuracy and sorting efficiency.
Drawings
FIG. 1 is a flow chart of the detection method of the present invention.
Fig. 2 is a flow chart of the present invention for controlling package delivery.
Fig. 3 is a schematic diagram of the MobileNetV2-based architecture according to the present invention.
Detailed Description
The invention will be further described with reference to the following specific drawings and examples.
As shown in Fig. 1, in order to effectively detect package stacking and improve automatic package feeding accuracy and sorting efficiency, the package stacking detection method of the invention comprises the following steps:
Step 101, training a package detection neural network model, wherein the package detection neural network is based on the YOLOv3 object detection model, the backbone network adopts a MobileNetV2-based architecture, and the detection head uses two YOLO layers to output detection boxes;
in specific implementation, the YOLOV3 target detection model is a conventional target detection model, which is well known to those skilled in the art, and will not be described herein. In order to adapt to package detection, a backbone network adopts a MobileNet V2-based architecture, and a detection head part uses a two-layer YOLO layer output detection frame, wherein the specific situation based on the MobileNet V2 architecture is consistent with the prior art. In fig. 3, a case based on a MobileNetV2 architecture is shown, specifically, input represents the input tensor size, operator represents the tensor operation method, c represents the number of output channels, n represents the number of iterations of Operator, s represents the step size of output relative to input, conv2D is 2D convolution, bottleck is an anti-residual module, and the anti-residual module bottleck may specifically take a conventional commonly used form, i.e. the specific embodiment of the MobileNetV2 architecture is well known to those skilled in the art, and will not be repeated herein.
In addition, the detection head uses two YOLO layers to output detection boxes; this part follows the YOLOv3 object detection model, is well known in the art, and is not described here.
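As an illustration of the two building blocks named above, the following is a minimal PyTorch sketch of a MobileNetV2-style inverted-residual (Bottleneck) block and a YOLO-style output layer. It is not the patented network; the channel widths, feature-map sizes, anchor count and two-class setting are assumptions made only for illustration.

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """MobileNetV2 bottleneck: 1x1 expand -> 3x3 depthwise -> 1x1 project."""
    def __init__(self, c_in, c_out, stride=1, expand=6):
        super().__init__()
        hidden = c_in * expand
        self.use_res = stride == 1 and c_in == c_out
        self.block = nn.Sequential(
            nn.Conv2d(c_in, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride, 1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, c_out, 1, bias=False),
            nn.BatchNorm2d(c_out),
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_res else out  # residual only when shapes match

class YoloHead(nn.Module):
    """One YOLO output layer: (x, y, w, h, objectness, classes) per anchor."""
    def __init__(self, c_in, num_anchors=3, num_classes=2):
        super().__init__()
        self.pred = nn.Conv2d(c_in, num_anchors * (5 + num_classes), 1)

    def forward(self, x):
        return self.pred(x)

if __name__ == "__main__":
    feat16 = torch.randn(1, 96, 20, 20)   # assumed stride-16 feature map
    feat32 = torch.randn(1, 320, 10, 10)  # assumed stride-32 feature map
    block = InvertedResidual(96, 96)
    head16, head32 = YoloHead(96), YoloHead(320)
    print(block(feat16).shape, head16(feat16).shape, head32(feat32).shape)
```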
In the embodiment of the invention, when the package detection neural network model is based on the YOLOv3 object detection model, training of the neural network comprises a pre-training step and a stacking detection training step. In particular, the image resolution used during pre-training and stacking detection training affects the image resolution required during forward inference. The training resolution is chosen according to the resolution required for inference; specifically, the inference resolution lies between the minimum and maximum training resolutions. For example, if the inference resolution is 320x320, the minimum training resolution is 224 and the maximum is 512, i.e. the selected input resolution is a multiple of 32 between 224 and 512.
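A small sketch of the multi-scale selection rule just described, i.e. drawing a training resolution that is a multiple of 32 between the stated minimum of 224 and maximum of 512; the function name and the uniform sampling policy are assumptions for illustration.

```python
import random

def sample_train_resolution(min_size=224, max_size=512, stride=32):
    # Candidate resolutions: 224, 256, ..., 512 (all multiples of 32).
    choices = list(range(min_size, max_size + 1, stride))
    return random.choice(choices)

print([sample_train_resolution() for _ in range(5)])
```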
In the pre-training process, the pre-training data set is a COCO data set, and a He Kaiming initialization method is used for initializing the network weight; the network is fixedly iterated for M1 times, and the optimal weight of MAP is selected as a pre-training weight;
To adapt the model to package detection, after the YOLOv3-based object detection model is selected, a corresponding training procedure is still required; it specifically includes a pre-training step and a stacking detection training step, whose purposes are consistent with existing practice and are not repeated here.
In the embodiment of the invention, pre-training is performed on the COCO data set, and the Kaiming He initialization method is used to initialize the network weights; in particular, the COCO data set is the MS-COCO-2017 data set, which can be obtained from the official MS COCO website. During pre-training the network is configured to detect 80 classes. In a specific implementation, the fixed number of iterations M1 is set to 800020, and the weights with the best mAP value are selected as the pre-training weights. The initial learning rate is 0.001 and the learning-rate decay strategy is piecewise (step) decay: the learning rate is multiplied by 0.9 at the 400000th and the 650000th iteration, and during the first 4000 iterations it rises from 0 to 0.001 at a constant rate.
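The schedule described above can be summarised in a short sketch that reproduces the stated numbers (constant-rate warm-up to 0.001 over the first 4000 iterations, 0.9x step decay at iterations 400000 and 650000, 800020 iterations in total). This is only an illustration of the stated hyperparameters, not the authors' training code.

```python
def pretrain_lr(it, base_lr=0.001, warmup=4000, steps=(400000, 650000), gamma=0.9):
    if it < warmup:
        return base_lr * it / warmup  # constant-rate warm-up from 0 to base_lr
    lr = base_lr
    for s in steps:
        if it >= s:
            lr *= gamma               # 0.9x decay at each milestone
    return lr

for it in (0, 2000, 4000, 399999, 400000, 650000, 800019):
    print(it, round(pretrain_lr(it), 6))
```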
After the data set, the weight initialization method and the other pre-training conditions are selected, pre-training of the YOLOv3 object detection model can be carried out; the specific training process is well known to those skilled in the art and is not repeated here.
During stacking detection training, stacking detection training is performed on the collected stacking detection data set, and the network weights are initialized with the weights obtained from pre-training; when selecting the validation set, only images that contain exactly one single package or exactly one multi-package sample are selected; after non-maximum suppression, each validation image is classified as single package or multiple packages, the F1-score is computed and its curve recorded, and the weights with the best F1-score are selected when training ends; the initial learning rate is 0.0005 and the learning-rate decay strategy is cosine annealing.
In the embodiment of the invention, the stacking detection training data set can be collected by capturing images on an actual production line. Specifically, the camera is fixed above the belt to collect images. When no samples are available, images are captured at equal intervals and effective samples are picked out manually; once a small-scale sample set has been obtained, preliminary stacking detection training is carried out, the camera captures images in real time, and images containing packages are automatically saved as effective samples using the preliminarily trained model. During stacking detection training the network is configured with 2 detection classes, i.e. single-package samples and multi-package samples; the specific process of collecting the stacking detection training data set can be chosen according to actual needs, is well known to those skilled in the art, and is not repeated here.
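A hedged sketch of the sample-collection loop just described: frames are grabbed from the camera and kept only when a preliminarily trained detector finds packages in them. The grab_frame and detect_packages callables are placeholders (assumptions), not interfaces defined by the patent, and OpenCV is assumed here only for writing image files.

```python
import cv2  # OpenCV, assumed only for saving images to disk

def collect_samples(grab_frame, detect_packages, out_dir, max_frames=10000):
    """grab_frame() -> image or None; detect_packages(image) -> list of boxes.
    Both callables are placeholders for the camera interface and the
    preliminarily trained detector."""
    saved = 0
    for _ in range(max_frames):
        frame = grab_frame()
        if frame is None:
            break
        if detect_packages(frame):  # keep only frames that actually contain packages
            cv2.imwrite(f"{out_dir}/sample_{saved:06d}.jpg", frame)
            saved += 1
    return saved
```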
In a specific implementation, training is carried out on the collected stacking detection data set, and the network weights are initialized with the weights obtained from pre-training. When selecting the validation set, only images that contain exactly one single package or exactly one multi-package sample are selected. After non-maximum suppression, each validation image is classified as single package or multiple packages, the F1-score is computed and its curve recorded, and the weights with the best F1-score are selected when training ends. The initial learning rate is 0.0005 and the learning-rate decay strategy is cosine annealing. The total number of training epochs is determined by the data set size and should be no fewer than 2000 iteration batches; when a larger number of image samples has been collected, training can be fixed at 150 epochs.
In the training process, the specific procedures of computing the F1-score, recording the F1-score curve and selecting the weights with the best F1-score at the end of training are consistent with existing practice, are well known to those skilled in the art, and are not repeated here. After pre-training and stacking detection training, the required package detection neural network model based on the YOLOv3 object detection model is obtained.
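For illustration, a minimal sketch of the validation rule described above, under the assumptions that every validation image contains exactly one object (a single package or a multi-package), that each image is classified by its highest-confidence detection after NMS, and that class id 1 denotes the multi-package class; the checkpoint with the highest F1-score would then be kept.

```python
def classify_image(detections):
    """detections: list of (class_id, confidence) left after NMS for one image."""
    if not detections:
        return None
    return max(detections, key=lambda d: d[1])[0]  # class of the most confident box

def f1_single_vs_multi(preds, labels, positive=1):
    tp = sum(p == positive and y == positive for p, y in zip(preds, labels))
    fp = sum(p == positive and y != positive for p, y in zip(preds, labels))
    fn = sum(p != positive and y == positive for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Example: per-image predictions vs. ground-truth labels (0 = single, 1 = multi).
preds = [1, 0, 1, 1, 0]
labels = [1, 0, 0, 1, 0]
print(f1_single_vs_multi(preds, labels))  # 0.8
```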
Step 102, acquiring belt images while packages are being conveyed, through a top-mounted scanning camera;
Specifically, the manner and process of acquiring the belt image with the top-mounted scanning camera are consistent with the prior art, well known to those skilled in the art, and not described here.
Step 103, extracting the package information in the belt image through the package detection neural network model, wherein the package information extracted by the package detection neural network model comprises the positions of packages and the relation among the packages;
In the embodiment of the invention, after the package detection neural network model is obtained, for any acquired belt image the package information in the belt image can be extracted using the package detection neural network model; the process of obtaining package information with the model is consistent with the prior art and is not repeated here. In a specific implementation, the extracted package information includes the positions of the packages and the relationships between them.
In a specific implementation, when package information in the belt image is extracted through the package detection neural network model, non-maximum suppression is used to post-process the extracted package targets, and suppression is carried out only within the same class.
In the embodiment of the invention, when non-maximum suppression is used to post-process and suppress the extracted package targets, the method comprises the following steps:
sorting the target boxes in the target box set {B} from high to low by confidence;
traversing the target box set {B}: for each target box B_i, traversing backwards from target box B_(i+1); if target box B_i and target box B_j belong to the same class and their intersection-over-union is greater than a threshold, marking target box B_j as removed;
while traversing the target box set {B}, if target box B_i has been marked as removed, skipping the remaining operations of this round of traversal and starting traversing from target box B_(i+1); wherein the value range of i is [0, N-2], N is the number of target boxes, and the value range of j is [i+1, N-1].
In the embodiment of the invention, the target box set {B} contains several target boxes, and the number of target boxes depends on the acquired belt image, which is well known in the art and not described here. After the package information is extracted with the package detection neural network model, the confidence of each target box is also given, consistent with existing practice, and not repeated here. After non-maximum suppression, the interference between single-package and multi-package detections is reduced.
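The class-aware non-maximum suppression steps above can be sketched as follows; the (x1, y1, x2, y2) box format and the 0.45 IoU threshold are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def class_aware_nms(boxes, iou_thr=0.45):
    """boxes: list of dicts {'box': (x1, y1, x2, y2), 'cls': int, 'conf': float}."""
    boxes = sorted(boxes, key=lambda d: d["conf"], reverse=True)  # high to low confidence
    removed = [False] * len(boxes)
    for i in range(len(boxes) - 1):            # B_i, i in [0, N-2]
        if removed[i]:
            continue                           # B_i already suppressed: skip this round
        for j in range(i + 1, len(boxes)):     # B_j, j in [i+1, N-1]
            if removed[j] or boxes[j]["cls"] != boxes[i]["cls"]:
                continue                       # suppress only within the same class
            if iou(boxes[i]["box"], boxes[j]["box"]) > iou_thr:
                removed[j] = True              # mark B_j as removed
    return [b for b, r in zip(boxes, removed) if not r]
```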
Step 104, judging the package stacking state according to the extracted package information.
In a specific implementation, the package stacking state is judged according to the extracted package information as follows:
when only one single package appears, it is judged as a non-stacking situation;
when multiple packages appear, if the head-to-tail distance between the axis-aligned bounding boxes of the first package and the second package is smaller than a threshold L, it is judged as a stacking situation;
when multiple packages appear, if the class of the first package is single package and the head-to-tail distance between its axis-aligned bounding box and that of the second package is greater than the threshold L, it is judged as a non-stacking situation;
regardless of whether multiple packages appear, if the class of the first package is multi-package, it is judged as a stacking situation.
As can be seen from the above description, after the package information is extracted from the belt image, the positions of the packages and the relationships between them are obtained, so the package stacking state can be judged from the package information. In a specific implementation, the threshold L is not greater than 250 mm; the specific value of the threshold L can be chosen according to the actual package conveying conditions, which are well known in the art and not described here.
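The four decision rules above can be written as a short function; the 'single'/'multi' class labels, the millimetre box coordinates along the belt direction, and the front-to-back ordering convention are assumptions made only for illustration.

```python
def is_stacked(detections, threshold_l=250.0):
    """detections: list of (class, (x1, y1, x2, y2)); class is 'single' or 'multi';
    x is the conveying direction in millimetres (larger x = further forward)."""
    if not detections:
        return False
    # Sort packages front-to-back along the conveying direction (assumed +x).
    detections = sorted(detections, key=lambda d: d[1][2], reverse=True)
    first_cls, first_box = detections[0]
    if first_cls == "multi":          # a detected multi-package is always a stack
        return True
    if len(detections) == 1:          # exactly one single package: no stacking
        return False
    second_box = detections[1][1]
    head_to_tail = first_box[0] - second_box[2]  # rear edge of 1st to front edge of 2nd
    return head_to_tail < threshold_l            # closer than L: treated as stacked

# Gap of 50 mm between two single packages -> stacked.
print(is_stacked([("single", (900, 0, 1200, 300)), ("single", (500, 0, 850, 300))]))
```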
In summary, a neural network-based package stacking detection device can be obtained, specifically comprising:
an image acquisition module, which can acquire belt images while packages are being conveyed;
a stacking detection module connected with the image acquisition module, in which a package detection neural network model is provided, wherein the package detection neural network is based on the YOLOv3 object detection model, the backbone network adopts a MobileNetV2-based architecture, and the detection head uses two YOLO layers to output detection boxes; the stacking detection module can extract package information from the belt image acquired by the image acquisition module and judge the package stacking state corresponding to the belt image according to the extracted package information;
a signal sending module connected with the stacking detection module and with the package conveying control system, which can send the package stacking state judged by the stacking detection module to the package conveying control system.
In the embodiment of the invention, the image acquisition module may use an area-array camera with a color or grayscale sensor, selected according to actual needs and not described further here. When an area-array color camera is used to acquire the target image, the mounting height depends on the required field of view; specifically, the mounting height of the area-array color camera is preferably between 1000 mm and 1200 mm.
In a real scene the packages may move at high speed, so the exposure time should be reduced to avoid motion blur in the captured images. A supplementary lighting module should therefore also be included. The lighting module can use a mainstream LED light source, which has the advantages of low heat generation, high brightness and long service life; here, white LEDs with a fixed emission angle are used.
For the stacking detection module, a package detection neural network is provided inside it; the specific details of the package detection neural network and of the detection of the belt image are consistent with the description above, to which reference can be made, and are not repeated here.
A virtual send line is configured in the signal sending module, and a stacking signal is sent to the control system only when a package touches the send line; target tracking is used to prevent the same package from triggering the signal multiple times.
In the embodiment of the invention, each package is tracked, and when a package enters the field of view it is added to the tracking list. With the virtual send line in place, the stacking result is sent to the package conveying control system when a package touches the virtual send line; the package conveying control system controls the conveying process of the packages and can adopt a conventional, commonly used form, and the conveying control process is consistent with existing practice, well known to those skilled in the art, and not repeated here.
In particular, the position of the virtual send line should be set taking the image acquisition frame rate and the package movement speed into account: it must be ensured that a package cannot move from a position where it has not yet touched the send line to outside the image field of view within the interval between two frame acquisitions. For example, when the package movement speed is 1800 mm/s and the capture interval is 100 ms, the virtual send line is placed 250 mm from the exit edge of the field of view; here, "touching the send line" refers to the side of the package that leaves the image first. The specific position of the virtual send line must be adjusted according to the belt speed and the image acquisition interval, and its distance from the exit edge must be slightly larger than the distance the belt moves within one acquisition interval. During operation, for each frame the positional relationship between the front-most package and the send line is evaluated, and when the front-most package touches the send line, the current judgment result is sent to the package conveying control system.
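A sketch of the send-line placement and trigger check described above: the offset from the field-of-view exit edge must exceed the distance the belt moves in one acquisition interval. The 1.2 safety margin and the variable names are assumptions; with the stated 1800 mm/s and 100 ms values the hard lower bound is 180 mm, consistent with the 250 mm placement given in the text.

```python
def send_line_offset(belt_speed_mm_s=1800.0, frame_interval_s=0.1, margin=1.2):
    # Minimum safe offset: belt travel per frame, scaled by an assumed margin.
    return belt_speed_mm_s * frame_interval_s * margin  # 1800 * 0.1 * 1.2 = 216 mm

def touches_send_line(front_box, field_exit_x, offset_mm):
    """front_box: (x1, y1, x2, y2) of the leading package; exit edge at field_exit_x."""
    line_x = field_exit_x - offset_mm
    return front_box[2] >= line_x  # leading edge has reached the virtual send line

offset = send_line_offset()
print(offset, touches_send_line((1900, 0, 2200, 300), field_exit_x=2400, offset_mm=250))
```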
Target tracking can be implemented in a tracking-by-detection manner well known to those skilled in the art, by matching the package detection results of the current frame with the results of the previous frame; the specific target tracking process is consistent with existing methods, is well known to those skilled in the art, and is not repeated here.
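A minimal tracking-by-detection sketch under assumptions: current detections are matched to the previous frame's tracks by highest IoU, matched tracks keep their id and their "already sent" flag, and unmatched detections open new tracks. The signal for a track would be sent only while its flag is still False, which suppresses repeated triggers for the same package; the IoU threshold and data layout are illustrative, not from the patent.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def update_tracks(tracks, detections, next_id, iou_thr=0.3):
    """tracks: {track_id: {'box': (x1,y1,x2,y2), 'sent': bool}}; detections: list of boxes."""
    new_tracks = {}
    for box in detections:
        best_id, best_iou = None, iou_thr
        for tid, t in tracks.items():
            overlap = iou(t["box"], box)
            if overlap > best_iou and tid not in new_tracks:
                best_id, best_iou = tid, overlap
        if best_id is not None:
            # Same physical package as in the previous frame: keep id and 'sent' flag.
            new_tracks[best_id] = {"box": box, "sent": tracks[best_id]["sent"]}
        else:
            # A new package has entered the field of view.
            new_tracks[next_id] = {"box": box, "sent": False}
            next_id += 1
    return new_tracks, next_id
```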
As shown in Fig. 2, the workflow of the package stacking detection device of the invention is as follows: the image acquisition module acquires images of the belt used to convey the packages; the stacking detection module extracts package information, including package positions, package spacing and package classes, and judges the stacking condition at the current moment; and the signal sending module sends the stacking condition to the control system. This solves the mis-sorting problem caused by package stacking and effectively improves the efficiency and accuracy of automatic package feeding.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction between the combinations of these technical features, they should be considered to be within the scope of this description.
The above examples merely represent a few embodiments of the present application; their description is relatively specific and detailed, but it is not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, all of which fall within its protection scope. Accordingly, the scope of protection of the present application is determined by the appended claims.

Claims (9)

1. A neural network-based package stacking detection method, characterized by comprising the following steps:
Step 101, training a package detection neural network model, wherein the package detection neural network is based on the YOLOv3 object detection model, the backbone network adopts a MobileNetV2-based architecture, and the detection head uses two YOLO layers to output detection boxes;
Step 102, acquiring belt images while packages are being conveyed, through a top-mounted scanning camera;
Step 103, extracting the package information in the belt image through the package detection neural network model, wherein the extracted package information comprises the positions of the packages and the relationships between them;
Step 104, judging the package stacking state according to the extracted package information;
wherein, in step 101, when the YOLOv3-based object detection model is adopted, training of the neural network includes a pre-training step and a stacking detection training step; wherein,
in the pre-training step, the pre-training data set is the COCO data set, and the Kaiming He initialization method is used to initialize the network weights; the network is trained for a fixed number of M1 iterations, and the weights with the best mAP are selected as the pre-training weights;
during stacking detection training, stacking detection training is performed on the collected stacking detection data set, and the network weights are initialized with the weights obtained from pre-training; when selecting the validation set, only images that contain exactly one single package or exactly one multi-package sample are selected; after non-maximum suppression, each validation image is classified as single package or multiple packages, the F1-score is computed and its curve recorded, and the weights with the best F1-score are selected when training ends; the initial learning rate is 0.0005 and the learning-rate decay strategy is cosine annealing.
2. The neural network-based package stacking detection method according to claim 1, characterized in that, when the package information in the belt image is extracted through the package detection neural network model, non-maximum suppression is used to post-process the extracted package targets, and suppression is performed only within the same class.
3. The neural network-based package stacking detection method according to claim 2, characterized in that, when non-maximum suppression is used to post-process and suppress the extracted package targets, the method comprises the following steps:
sorting the target boxes in the target box set {B} from high to low by confidence;
traversing the target box set {B}: for each target box B_i, traversing backwards from target box B_(i+1); if target box B_i and target box B_j belong to the same class and their intersection-over-union is greater than a threshold, marking target box B_j as removed;
while traversing the target box set {B}, if target box B_i has been marked as removed, skipping the remaining operations of this round of traversal and starting traversing from target box B_(i+1); wherein the value range of i is [0, N-2], N is the number of target boxes, and the value range of j is [i+1, N-1].
4. The neural network-based package stacking detection method according to any one of claims 1 to 3, characterized in that, in step 104, the package stacking state is judged according to the extracted package information as follows:
when only one single package appears, it is judged as a non-stacking situation;
when multiple packages appear, if the head-to-tail distance between the axis-aligned bounding boxes of the first package and the second package is smaller than a threshold L, it is judged as a stacking situation;
when multiple packages appear, if the class of the first package is single package and the head-to-tail distance between its axis-aligned bounding box and that of the second package is greater than the threshold L, it is judged as a non-stacking situation;
regardless of whether multiple packages appear, if the class of the first package is multi-package, it is judged as a stacking situation.
5. The neural network-based package stacking detection method according to claim 4, characterized in that the threshold L is not greater than 250 mm.
6. A neural network-based package stacking detection device, characterized in that it comprises:
an image acquisition module, which can acquire belt images while packages are being conveyed;
a stacking detection module connected with the image acquisition module, in which a package detection neural network model is provided, wherein the package detection neural network is based on the YOLOv3 object detection model, the backbone network adopts a MobileNetV2-based architecture, and the detection head uses two YOLO layers to output detection boxes; the stacking detection module can extract package information from the belt image acquired by the image acquisition module and judge the package stacking state corresponding to the belt image according to the extracted package information;
a signal sending module connected with the stacking detection module and with the package conveying control system, which can send the package stacking state judged by the stacking detection module to the package conveying control system;
wherein providing the detection neural network model in the stacking detection module comprises training to obtain the package detection neural network model, wherein,
when the YOLOv3-based object detection model is adopted, training of the neural network comprises a pre-training step and a stacking detection training step; wherein,
in the pre-training step, the pre-training data set is the COCO data set, and the Kaiming He initialization method is used to initialize the network weights; the network is trained for a fixed number of M1 iterations, and the weights with the best mAP are selected as the pre-training weights;
during stacking detection training, stacking detection training is performed on the collected stacking detection data set, and the network weights are initialized with the weights obtained from pre-training; when selecting the validation set, only images that contain exactly one single package or exactly one multi-package sample are selected; after non-maximum suppression, each validation image is classified as single package or multiple packages, the F1-score is computed and its curve recorded, and the weights with the best F1-score are selected when training ends; the initial learning rate is 0.0005 and the learning-rate decay strategy is cosine annealing.
7. The neural network-based package stacking detection device according to claim 6, characterized in that the image acquisition module comprises an area-array color camera, and the distance between the image acquisition module and the belt conveying the packages is 1000 mm to 1200 mm.
8. The neural network-based package stacking detection device according to claim 6 or 7, characterized in that the package stacking state is judged according to the extracted package information as follows:
when only one single package appears, it is judged as a non-stacking situation;
when multiple packages appear, if the head-to-tail distance between the axis-aligned bounding boxes of the first package and the second package is smaller than a threshold L, it is judged as a stacking situation;
when multiple packages appear, if the class of the first package is single package and the head-to-tail distance between its axis-aligned bounding box and that of the second package is greater than the threshold L, it is judged as a non-stacking situation;
regardless of whether multiple packages appear, if the class of the first package is multi-package, it is judged as a stacking situation.
9. The neural network-based package stacking detection device according to claim 6 or 7, characterized in that a virtual send line is configured in the signal sending module, and a stacking signal is sent to the control system only when a package touches the send line; and target tracking is used to prevent the same package from triggering the signal multiple times.
CN202111078544.6A 2021-09-15 2021-09-15 Method and device for detecting package stacking based on neural network Active CN113762190B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111078544.6A CN113762190B (en) 2021-09-15 2021-09-15 Method and device for detecting package stacking based on neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111078544.6A CN113762190B (en) 2021-09-15 2021-09-15 Method and device for detecting package stacking based on neural network

Publications (2)

Publication Number Publication Date
CN113762190A CN113762190A (en) 2021-12-07
CN113762190B true CN113762190B (en) 2024-03-29

Family

ID=78795635

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111078544.6A Active CN113762190B (en) 2021-09-15 2021-09-15 Method and device for detecting package stacking based on neural network

Country Status (1)

Country Link
CN (1) CN113762190B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116188767B (en) * 2023-01-13 2023-09-08 湖北普罗格科技股份有限公司 Neural network-based stacked wood board counting method and system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102009011300B4 (en) * 2009-03-02 2022-08-11 Kuka Roboter Gmbh Loading of loading equipment with packages using a manipulator
US11544900B2 (en) * 2019-07-25 2023-01-03 General Electric Company Primitive-based 3D building modeling, sensor simulation, and estimation
CN111461127B (en) * 2020-03-30 2023-06-06 华南理工大学 Instance segmentation method based on one-stage target detection framework
CN112257799A (en) * 2020-10-30 2021-01-22 电子科技大学中山学院 Method, system and device for detecting household garbage target
CN112487915B (en) * 2020-11-25 2024-04-23 江苏科技大学 Pedestrian detection method based on Embedded YOLO algorithm

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875595A (en) * 2018-05-29 2018-11-23 重庆大学 A kind of Driving Scene object detection method merged based on deep learning and multilayer feature
CN109784204A (en) * 2018-12-25 2019-05-21 江苏大学 A kind of main carpopodium identification of stacking string class fruit for parallel robot and extracting method
CN111310622A (en) * 2020-02-05 2020-06-19 西北工业大学 Fish swarm target identification method for intelligent operation of underwater robot
CN111428558A (en) * 2020-02-18 2020-07-17 东华大学 Vehicle detection method based on improved YOLOv3 method
CN111369540A (en) * 2020-03-06 2020-07-03 西安电子科技大学 Plant leaf disease identification method based on mask convolutional neural network
CN111461224A (en) * 2020-04-01 2020-07-28 西安交通大学 Phase data unwrapping method based on residual self-coding neural network
CN111553200A (en) * 2020-04-07 2020-08-18 北京农业信息技术研究中心 Image detection and identification method and device
CN112163545A (en) * 2020-10-12 2021-01-01 北京易华录信息技术股份有限公司 Head feature extraction method and device, electronic equipment and storage medium
CN112668663A (en) * 2021-01-05 2021-04-16 南京航空航天大学 Aerial photography car detection method based on YOLOv4

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"基于YOLO的深度学习目标检测算法研究";熊咏平;《中国优秀硕士学位论文全文数据库 (基础科学辑)》;I138-984 *

Also Published As

Publication number Publication date
CN113762190A (en) 2021-12-07

Similar Documents

Publication Publication Date Title
CN111145545B (en) Road traffic behavior unmanned aerial vehicle monitoring system and method based on deep learning
CN109460709B (en) RTG visual barrier detection method based on RGB and D information fusion
CN103824066B (en) A kind of licence plate recognition method based on video flowing
CN111709935B (en) Real-time coal gangue positioning and identifying method for ground moving belt
CN106707296A (en) Dual-aperture photoelectric imaging system-based unmanned aerial vehicle detection and recognition method
CN111191611B (en) Traffic sign label identification method based on deep learning
CN113762190B (en) Method and device for detecting package stacking based on neural network
CN111723854B (en) Expressway traffic jam detection method, equipment and readable storage medium
US9216869B2 (en) Device and method for controlling the tracking of a value document stack
CN110570454A (en) Method and device for detecting foreign matter invasion
CN205788226U (en) Container number based on image procossing and truck car number identification system
CN104361323B (en) Recognition method of container number in channel and system
CN113807466B (en) Logistics package autonomous detection method based on deep learning
CN113128507B (en) License plate recognition method and device, electronic equipment and storage medium
CN109934108A (en) The vehicle detection and range-measurement system and implementation method of a kind of multiple target multiple types
CN107506753B (en) Multi-vehicle tracking method for dynamic video monitoring
CN111127520B (en) Vehicle tracking method and system based on video analysis
CN116434088A (en) Lane line detection and lane auxiliary keeping method based on unmanned aerial vehicle aerial image
CN115841633A (en) Power tower and power line associated correction power tower and power line detection method
CN117274967A (en) Multi-mode fusion license plate recognition algorithm based on convolutional neural network
CN113591643A (en) Underground vehicle station entering and exiting detection system and method based on computer vision
CN117215327A (en) Unmanned aerial vehicle-based highway inspection detection and intelligent flight control method
CN115082834B (en) Engineering vehicle black smoke emission monitoring method and system based on deep learning
CN114511832B (en) Lane line analysis method and device, electronic device and storage medium
CN116342632A (en) Depth information-based matting method and matting network training method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
    Address after: No. 979, Antai Third Road, Xishan District, Wuxi City, Jiangsu Province, 214000
    Applicant after: Zhongke Weizhi Technology Co.,Ltd.
    Country or region after: China
    Address before: 299 Dacheng Road, Xishan District, Wuxi City, Jiangsu Province
    Applicant before: Zhongke Weizhi intelligent manufacturing technology Jiangsu Co.,Ltd.
    Country or region before: China
GR01 Patent grant