CN113076889B

CN113076889B - Container lead seal identification method, device, electronic equipment and storage medium

Info

Publication number: CN113076889B
Application number: CN202110382063.8A
Authority: CN
Inventors: 谭黎敏; 蔡文扬; 李金涛
Original assignee: Shanghai Westwell Information Technology Co Ltd
Current assignee: Shanghai Westwell Information Technology Co Ltd
Priority date: 2021-04-09
Filing date: 2021-04-09
Publication date: 2023-06-30
Anticipated expiration: 2041-04-09
Also published as: CN113076889A

Abstract

The invention relates to the technical field of computer vision, and provides a container lead seal identification method, a container lead seal identification device, electronic equipment and a storage medium. The container lead seal identification method comprises the following steps: collecting video of a container; extracting an image from the video, inputting the image into a trained first convolutional neural network for lead sealing identification, and at least obtaining a local image area where lead sealing is located in the image as a target image; inputting the target image into a trained second convolutional neural network for lead sealing identification, and at least obtaining the category and the confidence of lead sealing; and carrying out weighted calculation according to the types and the confidence levels of the lead seals corresponding to the images in the video to obtain the lead seal types of the container. According to the method and the device, the lead sealing type of the container can be accurately identified in real time according to the acquired container video, the accuracy requirement and the real-time requirement of lead sealing identification are met, and an accurate data basis is provided for intelligent management of the container.

Description

Container lead seal identification method, device, electronic equipment and storage medium

Technical Field

The invention relates to the technical field of computer vision, in particular to a container lead seal identification method, a device, electronic equipment and a storage medium.

Background

In port operations, it is necessary to detect the seal of a container and to record seal information.

At present, workers mainly rely on ports to count lead sealing information of containers, so that the personnel demand is large, the efficiency of manual statistics is low, the accuracy is poor, and the intelligent transformation is needed for managing and burying subsequent containers.

However, small target detection is a difficulty in computer vision. Because the size of the lead seal is smaller, if a traditional computer vision algorithm is adopted, the lead seal is detected in a camera picture of the container, so that much noise interference exists, and the accuracy cannot be ensured; and because the available information of the target is less, the algorithm model has great parameter requirements, and the real-time performance cannot be ensured.

Therefore, accurate real-time detection of the container lead seals is not realized at present.

Disclosure of Invention

The invention provides a method, a device, electronic equipment and a storage medium for identifying the lead seal of a container, which can accurately identify the lead seal type of the container in real time according to the acquired container video, meet the accuracy requirement and the real-time requirement of lead seal identification and provide an accurate data basis for intelligent management of the container.

One aspect of the present invention provides a container lead seal identification method, comprising the steps of: collecting video of a container; extracting an image from the video, inputting the image into a trained first convolutional neural network for lead sealing identification, and at least obtaining a local image area where lead sealing is located in the image as a target image; inputting the target image into a trained second convolutional neural network for lead sealing identification, and at least obtaining the category and the confidence of lead sealing; and carrying out weighted calculation according to the types and the confidence levels of the lead seals corresponding to the images in the video to obtain the lead seal types of the container.

In some embodiments, in the training process of the first convolutional neural network, a first type of picture is taken as a training sample, at least a local image area where a lead seal is positioned in the first type of picture is taken as a target to be output, and the first type of picture is a picture of a container with various lead seals; in the training process of the second convolutional neural network, a second type of picture is taken as a training sample, a local image area where the lead seal is positioned in the second type of picture, the type of the lead seal and the confidence level are taken as targets to be output, and the second type of picture is a picture of various lead seals.

In some embodiments, the second type of picture is taken from the first type of picture.

In some embodiments, in the weighting calculation, the weight of an image increases in a forward direction with the picture occupancy in the image of the local image region in which the lead seal is located in the image.

In some embodiments, the structure of the first convolutional neural network comprises: the device comprises two groups of connected convolution modules, wherein each convolution module in each group of convolution modules comprises a feature extraction layer and a downsampling layer which are constructed based on cavity convolution; the first detection layer is connected with the convolution modules of the latter group of the two groups of convolution modules; the second detection layer is respectively connected with the former group of convolution modules and the latter group of convolution modules of the two groups of convolution modules through the feature fusion layer.

In some embodiments, the identification process of the first convolutional neural network comprises: performing cyclic convolution on the image for preset times through the previous group of convolution modules to obtain a first feature map; convolving the first feature map by the latter group of convolution modules to obtain a second feature map; lead sealing identification is carried out on the second feature map through the first detection layer, and at least one detection frame corresponding to the local image area where the lead sealing is located is obtained; performing feature fusion on the first feature map and the deconvolution up-sampled second feature map through the feature fusion layer to obtain a third feature map; lead sealing identification is carried out on the third feature map through the second detection layer, and at least one detection frame corresponding to the local image area where the lead sealing is located is obtained; and screening out target detection frames from the multiple detection frames based on non-maximum suppression, and outputting the target detection frames as the local image area.

In some embodiments, the training process of the first convolutional neural network comprises: obtaining a training sample containing a plurality of first-class pictures, and marking a local image area where a lead seal is positioned in the first-class pictures to obtain the width, height and class of a lead seal frame of the first-class pictures; clustering the width and height of the lead sealing frame of the training sample through a clustering algorithm to obtain the average width and height of each class of width and height obtained through clustering, and forming a plurality of prior frames obtained through clustering; respectively obtaining a first sample feature map and a third sample feature map of each first type of picture through two groups of convolution modules and the feature fusion layer; and respectively inserting the prior frames into the first sample feature map and the third sample feature map of each first type of picture at a pixel level, and selecting the prior frame with highest matching degree with the lead sealing frame of the first type of picture as the identification targets of the first detection layer and the second detection layer respectively.

In some embodiments, in the training process, a priori frame with the matching degree smaller than the matching degree threshold value with the lead sealing frame of each first type of picture is also selected, and is used as a negative sample for loss calculation and optimization based on gradient descent.

In some embodiments, the loss calculation process includes: calculating a first loss value loss_box according to a first formula:

wherein S is ² The number of pixels in the first sample feature map or the third sample feature map; b is the number of prior frames obtained by clustering,

marking whether the prior frame at the position i and j contains lead sealing, if so, the prior frame is 1, and if not, the prior frame at the position j is 0; w (w) _i,j The width of the a priori frame at i, j, h _i,j The height of the a priori frame at i, j, x _i,j And y _i,j The x-coordinate and the y-coordinate of the center point of the prior frame at i, j respectively; />

Width of lead seal corresponding to a priori frame at i, j in the first type of picture calculated for the current loss,/>

Calculation of the first to the current lossThe height of the lead seal corresponding to the a priori frame at i, j in a class of pictures,/-, is->

And->

Respectively calculating an x coordinate and a y coordinate of a center point of a lead sealing frame corresponding to the prior frame at the i and j positions in the first type of pictures of the current loss; lambda (lambda) _coord Taking 5;

calculating a second loss value loss_cls according to a second formula:

wherein p is _i,j (c) The probability that the class of the a priori box at i, j is class c,

probability of category c of lead seal corresponding to a priori frame at i, j in first-class picture calculated for current loss>

Take 0 or 1, lambda _class Taking 1;

calculating a third loss value loss_obj according to a third formula:

Wherein,,

marking whether the prior frame at the position i and j does not contain lead sealing, if so, the prior frame is 1, and if not, the prior frame at the position j is 0; d, d _i,j For i, j the probability that the a priori box contains lead seals,/>

A priori block at i, jMatching the probability of the corresponding lead sealing frame in the first type of picture calculated by the current loss; lambda (lambda) _noobj And lambda (lambda) _obj Taking 0.5 each;

calculating a total loss value loss according to a fourth formula:

loss＝loss_box+loss_cls+loss_obj。

in some embodiments, the second convolutional neural network is structured to replace the hole convolution with a standard convolution.

In some embodiments, in the identifying process of the second convolutional neural network, the detection layer further obtains a category and a confidence of the lead seal in each detection frame, and the second convolutional neural network further outputs the category and the confidence of the lead seal in the target detection frame.

Another aspect of the present invention provides a container lead seal identification device, comprising: the video acquisition module is used for acquiring videos of the container; the first recognition module is used for extracting an image from the video, inputting the image into a trained first convolutional neural network for lead sealing recognition, and at least obtaining a local image area where lead sealing is located in the image as a target image; the second recognition module is used for inputting the target image into a trained second convolutional neural network to carry out lead sealing recognition, and at least obtaining the category and the confidence of the lead sealing; and the weighting calculation module is used for carrying out weighting calculation according to the types and the confidence levels of the lead seals corresponding to the images in the video to obtain the lead seal types of the container.

Yet another aspect of the present invention provides an electronic device, comprising: a processor; a memory having executable instructions stored therein; when the executable instructions are executed by the processor, the container lead seal identification method of any embodiment is realized.

A further aspect of the present invention provides a computer-readable storage medium storing a program which, when executed, implements the container seal identification method according to any of the above embodiments.

Compared with the prior art, the invention has the beneficial effects that at least:

preliminary identification is carried out on lead seal in the container image through a first convolutional neural network, a local image area where the lead seal is positioned is rapidly obtained, and then a target image with extremely reduced picture size and extremely expanded lead seal occupation ratio in a picture is obtained; accurately identifying the lead seal in the target image through a second convolutional neural network, and accurately obtaining the lead seal type; finally, weighting the identification results of the plurality of images to obtain the lead sealing type of the container;

therefore, the invention can accurately and real-timely identify the lead sealing type of the container according to the acquired container video, meets the accuracy requirement and the real-time requirement of lead sealing identification, and provides an accurate data basis for intelligent management of the container.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention. It is evident that the figures described below are only some embodiments of the invention, from which other figures can be obtained without inventive effort for a person skilled in the art.

Fig. 1 is a schematic flow chart of steps of a method for identifying a lead seal of a container according to an embodiment of the invention;

fig. 2 is a schematic view of a scene flow of a container lead seal identification method in an embodiment of the invention;

FIG. 3 is a schematic diagram of a first convolutional neural network in an embodiment of the present invention;

FIG. 4 is a schematic diagram of an identification process of a first convolutional neural network in an embodiment of the present invention;

FIG. 5 is a schematic diagram of a training process of a first convolutional neural network in an embodiment of the present invention;

FIG. 6 is a schematic block diagram of a container lead seal identification device in an embodiment of the invention;

fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention;

Fig. 8 shows a schematic structure of a computer-readable storage medium in an embodiment of the present invention.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. However, the example embodiments may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art.

Furthermore, the drawings are merely schematic illustrations of the present invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.

The step numbers in the following embodiments are merely for representing different execution contents, and do not strictly limit the execution order between steps. The use of the terms "first," "second," and the like in the description herein do not denote any order, quantity, or importance, but rather are used to distinguish one element from another. It should be noted that, without conflict, the embodiments of the present invention and features in different embodiments may be combined with each other.

Fig. 1 shows a main step flow of a container lead seal identification method, and referring to fig. 1, the container lead seal identification method in this embodiment includes: in step S110, video of the container is collected; in step S120, an image is extracted from the video, the image is input into a trained first convolutional neural network for lead seal recognition, and at least a local image area where lead seal is located in the image is obtained as a target image; in step S130, inputting the target image into the trained second convolutional neural network for lead seal identification, and at least obtaining the category and confidence of lead seal; in step S140, a weighting calculation is performed according to the types and confidence levels of the lead seals corresponding to the plurality of images in the video, so as to obtain the lead seal types of the container.

Fig. 2 shows a scene flow of a container lead seal identification method, and in combination with fig. 1 and fig. 2, in a container lead seal identification process:

first, video of the container 22 is acquired. For example, during the entry of a truck, video of a container 22 carried on the truck is acquired by a camera 21 installed at the port site. The invention is suitable for various ports, such as shore bridges, field bridges and the like, and is also suitable for the container transportation process, such as lead sealing identification of the container carried on the truck when the truck passes through a gate; or, the container video does not require real-time acquisition, and the video pre-stored in the system acquired in advance can be called for lead sealing identification.

Next, a plurality of images are extracted from the video and input to the first convolutional neural network 23. The frame of the video can be regularly extracted within the preset interval of the camera picture. For example, when the container 22 enters the preset range of the camera 21, 5 frames per second are extracted from the video stream acquired by the camera 21, and the corresponding images of each frame are sequentially input into the first convolutional neural network 23 for lead sealing recognition. The first convolutional neural network 23 is trained to detect the area ROI (region of interest) where the lead seal is located, and output a local image area in the image where the lead seal is located. In one embodiment, the first convolutional neural network 23 specifically outputs coordinate information of a local image region where the lead seal is located in the image.

In the training process, the first convolutional neural network 23 takes a first type of picture as a training sample, at least takes a local image area where the lead seal is positioned in the first type of picture as a target to output, and the first type of picture is a picture of a container with various lead seals. The first convolutional neural network 23 may also output the lead seal type and confidence, but is not utilized as subsequent data.

After the local image area where the lead seal is positioned in the image is obtained, the local image area in the image can be intercepted, and the target image is obtained. The target image greatly reduces the picture size and greatly expands the lead seal duty in the picture compared to the original container image.

Again, the target image is sequentially input into the second convolutional neural network 24 for lead seal recognition. The second convolutional neural network 24 is trained to pinpoint the lead seals and output the type and confidence of the lead seals. In one embodiment, the second convolutional neural network 24 specifically outputs the lead seal coordinates, categories, and confidence levels in the target image.

In the training process, the second convolutional neural network 24 takes the second type of picture as a training sample, and takes the local image area where the lead seal is located in the second type of picture, the type of the lead seal and the confidence level as targets for outputting, wherein the second type of picture is a picture of various lead seals. Here, the coordinate information of the lead seal is the partial image area where the lead seal is located. The second type of pictures can be obtained by intercepting the first type of pictures.

The network structure of the second convolutional neural network 24 may be substantially the same as the first convolutional neural network 23, with only the convolutional kernels and the parameter amounts being smaller than the first convolutional neural network 23. The first convolutional neural network 23 detects the coordinate information of the area where the lead seal is located in the image, and intercepts the original image according to the coordinate information as an input of the second convolutional neural network 24. The frame occupancy of the lead seal in the target image is greatly improved compared with the frame occupancy of the lead seal in the original image, so that the second convolutional neural network 24 can perform fine detection and identification by only using a smaller neural network model.

The training of the second convolutional neural network 24 differs from the training of the first convolutional neural network 23 in that: training the first convolutional neural network 23 by using the container picture, so that the first convolutional neural network 23 has the function of detecting a small target area, namely an area where lead sealing is positioned, from the container picture; and training the second convolutional neural network 24 by using the lead seal picture, so that the second convolutional neural network 24 has the function of finely identifying a small target and positioning the coordinates and the category of the lead seal. Therefore, by combining the trained first convolutional neural network 23 and the trained second convolutional neural network 24, lead seals can be accurately and real-time identified according to the acquired container video.

The pictures of the first type of pictures are not limited to containers, and can also comprise part of scene pictures such as containers loaded on trucks, containers placed on road surfaces and the like; likewise, the pictures of the second type are not limited to lead seals, but may also include lead sealed carriers, such as partial pictures of containers.

In this embodiment, the types of lead seals specifically include four types: none, rod-like, ring-like and filiform. That is, the types of lead seals are actually classified into two main types, no lead seal and a lead seal, and the types of lead seals are further classified into three types of rod, ring and wire according to the shape of the lead seal. As the lead sealing product changes, the type of lead sealing in the invention can be adjusted accordingly.

And finally, merging the identification results of the plurality of images, and weighting to obtain the lead sealing identification result of the container. In the weighting calculation, the weight of an image may be increased in a forward direction with the picture occupation ratio of the local image area where the lead seal is located in the image. That is, the larger the picture proportion of the local image area where the lead seal is located in an image in the image, the larger the weight of the image is, so as to adapt to dynamic processes such as container port entering transportation and the like, and improve the identification accuracy.

The lead seals on the container may include a plurality of local image areas output by the first convolutional neural network 23, which may be the whole area where the detected lead seals are located, or may be the local area where each detected lead seal is located; the category and confidence of the lead seals output by the second convolutional neural network 24 are the category and confidence of each lead seal detected. Therefore, the finally obtained recognition result of each image specifically comprises a plurality of groups of recognition results corresponding to a plurality of lead seals, and each group of recognition results comprises the confidence that the lead seals in the group belong to each category.

In the weighting calculation, for each lead seal, the confidence coefficient of each recognition result of the lead seal is taken as the basic weight of each recognition result, and the recognition results of each category of the lead seal are weighted by considering the weight of the image corresponding to the recognition result.

The container is provided with a lead seal (referred to as target lead seal) for a simple example, but not as a limitation of the invention. Presetting a picture proportion as a reference proportion, and taking the ratio of the picture proportion of a local image area where a target lead seal is positioned in each image to the reference proportion as the weight of each image; and taking the product of the confidence coefficient of each recognition result of the target lead seal and the weight of the image corresponding to the recognition result as the weight of the recognition result. Thus, for each category, the recognition result of the category to which the target lead seal belongs is weighted and calculated, and the result value of the category to which the target lead seal belongs is obtained. And finally, taking the category with the largest result value as the category of the target lead seal, namely the lead seal category of the container.

Of course, in other embodiments, the confidence may be directly weighted as a weight.

Further, in addition to the seal type, the seal identification result of the container may also include the presence or absence of seal and seal coordinates. The process for judging whether the container has lead sealing is as follows: the method for obtaining the lead sealing type of the container can be adopted to judge the major class (without lead sealing or with lead sealing) of the lead sealing type of the container, and the identification result of whether the lead sealing exists in the container is obtained; or, firstly, assigning a value to the class to which each identification result of each lead seal belongs, if the identification result is lead-free seal, assigning a value of 0, and if the identification result is lead seal, assigning a value of 1; and then taking the confidence coefficient of each recognition result as a weight, and taking the weight of the image corresponding to the recognition result into consideration, and carrying out weighted calculation on the recognition result of each lead seal to obtain a judgment result of whether each lead seal exists or not. The process for obtaining the lead sealing coordinates of the container comprises the following steps: when the container contains lead seals, the identification results with lead seals and the confidence coefficient greater than the confidence coefficient threshold value can be screened out, and the average coordinates of the identification results of each lead seal are taken as the coordinates of each lead seal.

According to the container lead seal identification method, lead seals in container images are primarily identified through the first convolutional neural network, local image areas where the lead seals are located are rapidly obtained, and then target images with extremely reduced picture sizes and extremely expanded lead seal occupation ratios in pictures are obtained; accurately identifying the lead seal in the target image through the second convolutional neural network, and accurately obtaining the lead seal type; and finally, weighting the identification results of the plurality of images to obtain the lead sealing type of the container. Therefore, the lead seal type of the container is accurately identified in real time according to the collected container video, the accuracy requirement and the real-time requirement of lead seal identification are met, and an accurate data basis is provided for intelligent management of the container.

The first convolutional neural network and the second convolutional neural network are specifically described below.

Fig. 3 shows a main structure of the first convolutional neural network, and referring to fig. 3, in one embodiment, the structure of the first convolutional neural network includes: two connected sets of convolution modules (a former set of convolution modules 310 and a latter set of convolution modules 320), each convolution module in each set of convolution modules including a feature extraction layer and a downsampling layer constructed based on hole convolution; a first detection layer 330 connected to the convolution module 320 of the latter of the two sets of convolution modules; the second detection layer 350 is connected to the former set of convolution modules 310 and the latter set of convolution modules 320 of the two sets of convolution modules, respectively, through the feature fusion layer 340.

In one embodiment, the previous set of convolution modules 310 specifically includes four convolution modules that perform a circular convolution based on the concept of a Resnet residual error. The characteristic extraction layer of each convolution module comprises a BN normalization layer and a LeakyReLU activation layer besides the cavity convolution layer. The downsampling layer may employ a MaxPooling function. The latter set of convolution modules 320 includes one convolution module. A deconvolution upsampling layer is also included between the latter set of convolution modules 320 and the feature fusion layer 340. Feature fusion layer 340 may employ a concat function.

The structure of the second convolutional neural network is basically the same as that of the first convolutional neural network, and only the hole convolution in each convolutional module is replaced by standard convolution, so that the description is not repeated.

Fig. 4 shows a process for identifying a first convolutional neural network, and in combination with the process shown in fig. 4, the process for identifying a first convolutional neural network includes: step S410, performing cyclic convolution on the image for preset times through a previous group of convolution modules to obtain a first feature map. Here, the resize function may be used to resize the image to a suitable size before inputting the image into the first convolutional neural network, and then input into the first convolutional neural network. Step S420, convolving the first feature map through a later group of convolution modules to obtain a second feature map. Step S430, lead sealing recognition is carried out on the second feature map through the first detection layer, and at least one detection frame corresponding to the local image area where the lead sealing is located is obtained. Step S440, performing feature fusion on the first feature map and the deconvoluted up-sampled second feature map through a feature fusion layer to obtain a third feature map. Step S450, lead sealing recognition is carried out on the third feature map through the second detection layer, and at least one detection frame corresponding to the local image area where the lead sealing is located is obtained. Step S460, a target detection frame is selected from a plurality of detection frames based on Non-maximum suppression (NMS for short), and the target detection frame is output as a local image area.

As described above, the local image area output here may be a set of coordinate information corresponding to the area where the plurality of lead seals are located in the container, or may be a plurality of sets of coordinate information corresponding to the area where each lead seal is located.

The identification process of the second convolutional neural network is basically the same as that of the first convolutional neural network, except that: and the detection layer of the second convolutional neural network also obtains the category and the confidence coefficient of the lead seal in each detection frame in the lead seal identification process, and the second convolutional neural network also outputs the category and the confidence coefficient of the lead seal in the target detection frame.

Fig. 5 shows a training process of the first convolutional neural network, and in combination with the training process of fig. 5, the training process of the first convolutional neural network includes:

step S510, a training sample containing a plurality of first-class pictures is obtained, and a local image area where the lead seal is positioned in the first-class pictures is marked, so that the width, the height and the class of the lead seal frame of the first-class pictures are obtained. The first type of pictures may be taken from the video stream of the container on a frame-by-frame basis. The marking can be carried out by adopting a data marking tool such as Labelimg, and the marking content is the angular point coordinates and the categories of the lead sealing frame, so that the width and height information of the lead sealing frame can be obtained. During sample collection, when some sample data is small, data enhancement can be used to expand its specific gravity in the training sample. Enhancement methods include flipping, scaling, cropping, contrast and noise, and the like.

Step S520, clustering the width and height of the lead sealing frame of the training sample through a clustering algorithm to obtain the average width and height of each class of width and height obtained through clustering, and forming a plurality of priori frames obtained through clustering. In one embodiment, 9 sets of width and height information in an average sense may be obtained, forming 9 a priori frames.

In step S530, a first sample feature map and a third sample feature map of each first type of picture are obtained through two sets of convolution modules and feature fusion layers, respectively. Reference is specifically made to the process of obtaining the first feature map and the third feature map in the above-described identification process, and a description thereof will not be repeated here. Before inputting the network, the aspect ratio (16:9) of the original image (1920 pixels×1080 pixels for example) is maintained, and the design of the downsampling multiple of the network is considered, so that the input video stream single frame picture resize is of a fixed size (672 pixels×384 pixels). Considering the size change in the camera picture in the process of adapting to the container entering port, the detection layers of the first convolutional neural network are arranged in two, and the downsampling multiples are respectively 16 and 32. Meanwhile, considering that the semantic information of different feature images has differences, the feature images with the downsampling multiple of 16 can be fused with the information of the deep feature images, so that the deep semantic and the shallow high precision are considered.

In step S540, in the first sample feature map and the third sample feature map of each first type of picture, the prior frames are inserted at the pixel level, and the prior frame with the highest matching degree with the lead sealing frame of the first type of picture is selected as the recognition target of the first detection layer and the second detection layer respectively. The pixel level inserting the prior frames refers to inserting 9 prior frames obtained by clustering in each pixel of the feature map, calculating an IOU (interaction-over-Union) between each prior frame in the feature map and a lead sealing frame of a first type of picture corresponding to the feature map, and taking the prior frame corresponding to the maximum IOU as a prediction target of the detection layer on the feature map.

Further, in the training process, a priori frame with the matching degree smaller than the matching degree threshold value with the lead sealing frame of each first type of picture is also screened out, and is used as a negative sample for loss calculation and optimization based on gradient descent. For example, when the IOU of a certain prior box and any lead seal box is less than the threshold of 0.5, it participates in loss of loss calculation as a negative sample. And in loss calculation, positive samples matched with the number of samples of the negative samples are also obtained, and a priori frame set for loss calculation is formed.

When the loss is calculated, calculating a first loss value according to the coordinates and the width and the height of the central point of the priori frame and the coordinates and the width and the height of the central point of the corresponding lead sealing frame; calculating a second loss value according to the category of the priori frame and the category of the corresponding lead sealing frame; calculating a third loss value according to the probability that the priori frame contains lead sealing and the probability that the priori frame is matched with the corresponding lead sealing frame; and calculating the total loss value of the prior frame according to the first loss value, the second loss value and the third loss value. Based on the loss values, a SGDM (random gradient descent with momentum) algorithm may be used for gradient updating to optimize the first convolutional neural network.

In addition, in the calculation of the loss, the cross entropy is adopted as the classification loss, and the mean square error is adopted as the regression loss. The regression loss mainly comprises the central point of the prior frame and the information of the width and height dimensions, the coordinates of the central point can be constrained in (0, 1) by adopting a sigmoid function, and the value range of the width and height can be expanded to the whole real space by adopting a logarithmic function. In addition, to ensure the detection accuracy of the small object, a coefficient of (2-wh) may be multiplied to increase the loss weight of the small object.

The calculation formula of the first loss value loss_box specifically includes:

wherein S is ² The number of pixels in the first sample feature map or the third sample feature map, that is, the size of the detection layer; b is the number of prior frames obtained by clustering,

Marking whether the prior frame at the position i and j contains lead sealing, if so, the prior frame is 1, and if not, the prior frame at the position j is 0; w (w) _i,j The width of the a priori frame at i, j, h _i,j The height of the a priori frame at i, j, x _i,j And y _i,j X-coordinate and y-coordinate of the center point of the a priori frame at i, j, ++>

The height of the lead seal corresponding to the a priori frame at i, j in the first class of pictures calculated for the current loss,/>

And->

Respectively calculating the x coordinate and the y coordinate of the center point of the lead sealing frame corresponding to the prior frame at the i and j positions in the first type of pictures of the current loss, lambda _coord The weighting for the loss function is preferably 5.

The calculation formula of the second loss value loss_cls is specifically as follows:

Take 0 or 1, lambda _class The weighting for the loss function may be 1.

The calculation formula of the third loss value loss_obj specifically includes:

wherein,,

marking whether the prior frame at the position i and j does not contain lead sealing, if so, the prior frame is 1, and if not, the prior frame at the position j is 0; d, d _i,j For i, j the probability that the a priori box contains lead seals,/ >

If the IOU between the prior frame and the lead sealing frame is larger than a threshold value, the prior frame is considered to be matched with the corresponding lead sealing frame; lambda (lambda) _noobj And lambda (lambda) _obj The weights for the different parts of the loss function may each be taken to be 0.5.

The calculation formula of the total loss value loss is specifically as follows:

loss＝loss_box+loss_cls+loss_obj。

further, considering the requirement of the identification frame rate, the first convolutional neural network can be properly layer-by-layer channel pruning under the condition that the accuracy is not affected remarkably during training, and the specific pruning proportion is automatically selected by the principle that the accuracy is not reduced during actual training.

The training process of the second convolutional neural network is basically the same as that of the first convolutional neural network, and the main difference is that: the input size of the second convolutional neural network is smaller, 208 pixels are selected in practical training, and training of the second convolutional neural network can be converged rapidly. The rest of the training process, pruning setting and the like can refer to the first convolutional neural network, and the description is not repeated.

In conclusion, the container lead seal identification method combines two convolutional neural networks to realize coarse positioning and fine detection of a tiny target, can carry out high-precision real-time identification on the container lead seal, can adapt to dynamic processes such as port entry transportation of the container and the like, and provides an accurate data basis for intelligent management of the container.

The embodiment of the invention also provides a container lead seal identification device which can be used for realizing the container lead seal identification method described in any embodiment. The features and principles of the container lead seal identification method described in any of the above embodiments are applicable to the following container lead seal identification device embodiments. In the following embodiments of the container lead seal identification device, the features and principles already explained with respect to container lead seal identification will not be repeated.

Fig. 6 shows the main modules of the container lead seal identification device, and referring to fig. 6, in one embodiment, the container lead seal identification device 600 comprises: a video acquisition module 610, configured to acquire a video of a container; the first recognition module 620 is configured to extract an image from the video, input the image into the trained first convolutional neural network for lead seal recognition, and obtain at least a local image area where lead seals are located in the image as a target image; the second recognition module 630 is configured to input the target image into a trained second convolutional neural network to perform lead seal recognition, and at least obtain a category and a confidence of lead seal; and the weighting calculation module 640 is used for carrying out weighting calculation according to the types and the confidence levels of the lead seals corresponding to the images in the video to obtain the types of the lead seals of the container.

Further, the container lead seal identification device 600 may further include modules for implementing other flow steps of the foregoing embodiments of the container lead seal identification method, and specific principles of each module may refer to the foregoing descriptions of the foregoing embodiments of the container lead seal identification method, which are not repeated herein.

As described above, the container lead seal identification device can combine two convolutional neural networks to realize coarse positioning and fine detection of a tiny target, can perform high-precision real-time identification on the container lead seal, can adapt to dynamic processes such as port entering transportation of the container, and provides an accurate data basis for intelligent management of the container.

The embodiment of the invention also provides electronic equipment, which comprises a processor and a memory, wherein executable instructions are stored in the memory, and when the executable instructions are executed by the processor, the container lead seal identification method described in any embodiment is realized.

As described above, the electronic equipment can combine two convolutional neural networks to realize coarse positioning and fine detection of a tiny target, perform high-precision real-time identification on the lead sealing of the container, adapt to dynamic processes such as port entry transportation of the container and the like, and provide an accurate data basis for intelligent management of the container.

Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and it should be understood that fig. 7 is only a schematic diagram illustrating each module, and the modules may be virtual software modules or actual hardware modules, and the combination, splitting and addition of the remaining modules are all within the scope of the present invention.

As shown in fig. 7, the electronic device 700 is embodied in the form of a general purpose computing device. Components of electronic device 700 include, but are not limited to: at least one processing unit 710, at least one memory unit 720, a bus 730 connecting the different platform components (including memory unit 720 and processing unit 710), a display unit 740, and the like.

The storage unit stores therein a program code that can be executed by the processing unit 710, so that the processing unit 710 performs the steps of the container lead seal identification method described in any of the above embodiments. For example, the processing unit 710 may perform the steps as shown in fig. 1.

The memory unit 720 may include readable media in the form of volatile memory units, such as Random Access Memory (RAM) 7201 and/or cache memory 7202, and may further include Read Only Memory (ROM) 7203.

The storage unit 720 may also include a program/utility 7204 having one or more program modules 7205, such program modules 7205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.

Bus 730 may be a bus representing one or more of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.

The electronic device 700 may also communicate with one or more external devices 800, which external devices 800 may be one or more of a keyboard, pointing device, bluetooth device, etc. These external devices 800 enable a user to interactively communicate with the electronic device 700. The electronic device 700 can also communicate with one or more other computing devices, including a router, modem, as shown. Such communication may occur through an input/output (I/O) interface 750. Also, electronic device 700 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 660. Network adapter 760 may communicate with other modules of electronic device 700 via bus 730. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 700, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage platforms, and the like.

The embodiment of the invention also provides a computer readable storage medium for storing a program, which when executed, implements the container lead seal identification method described in any of the above embodiments. In some possible embodiments, the aspects of the invention may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the container seal identification method as described in any of the above embodiments, when the program product is run on the terminal device.

As described above, the computer-readable storage medium of the invention can combine two convolutional neural networks to realize coarse positioning and fine detection of a tiny target, can perform high-precision real-time identification on the lead seal of the container, can adapt to dynamic processes such as port entry transportation of the container, and provides an accurate data basis for intelligent management of the container.

Fig. 8 is a schematic structural view of a computer-readable storage medium of the present invention. Referring to fig. 8, a program product 900 for implementing the above-described method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the readable storage medium include, but are not limited to: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The computer readable storage medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable storage medium may also be any readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device, such as through the Internet using an Internet service provider.

The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.

Claims

1. The container lead seal identification method is characterized by comprising the following steps:

collecting video of a container;

extracting an image from the video, inputting the image into a trained first convolutional neural network for lead sealing identification, and at least obtaining a local image area where lead sealing is located in the image as a target image;

inputting the target image into a trained second convolutional neural network for lead sealing identification, and at least obtaining the category and the confidence of lead sealing;

according to the types and the confidence levels of the lead seals corresponding to the images in the video, carrying out weighted calculation to obtain the types of the lead seals of the container;

in the training process of the first convolutional neural network, a first type of picture is taken as a training sample, at least a local image area where a lead seal is positioned in the first type of picture is taken as a target to be output, and the first type of picture is a picture of a container with various lead seals;

in the training process of the second convolutional neural network, taking a second type of picture as a training sample, taking a local image area where lead seals are positioned in the second type of picture, the type of the lead seals and the confidence level as targets for outputting, wherein the second type of picture is a picture of various lead seals, and the second type of picture is obtained by intercepting the first type of picture; and

In the weighting calculation, the weight of an image increases forward with the picture occupation ratio of the local image area where the lead seal is located in the image.

2. The method for identifying a lead seal of a container as defined in claim 1, wherein the structure of the first convolutional neural network comprises:

the device comprises two groups of connected convolution modules, wherein each convolution module in each group of convolution modules comprises a feature extraction layer and a downsampling layer which are constructed based on cavity convolution;

the first detection layer is connected with the convolution modules of the latter group of the two groups of convolution modules;

the second detection layer is respectively connected with the former group of convolution modules and the latter group of convolution modules of the two groups of convolution modules through the feature fusion layer.

3. The method for identifying a lead seal of a container as in claim 2, wherein the identification process of the first convolutional neural network comprises:

performing cyclic convolution on the image for preset times through the previous group of convolution modules to obtain a first feature map;

convolving the first feature map by the latter group of convolution modules to obtain a second feature map;

lead sealing identification is carried out on the second feature map through the first detection layer, and at least one detection frame corresponding to the local image area where the lead sealing is located is obtained;

Performing feature fusion on the first feature map and the deconvolution up-sampled second feature map through the feature fusion layer to obtain a third feature map;

lead sealing identification is carried out on the third feature map through the second detection layer, and at least one detection frame corresponding to the local image area where the lead sealing is located is obtained;

and screening out target detection frames from the multiple detection frames based on non-maximum suppression, and outputting the target detection frames as the local image area.

4. A method of identifying a lead seal for a container as claimed in claim 3 wherein the training process of the first convolutional neural network comprises:

obtaining a training sample containing a plurality of first-class pictures, and marking a local image area where a lead seal is positioned in the first-class pictures to obtain the width, height and class of a lead seal frame of the first-class pictures;

clustering the width and height of the lead sealing frame of the training sample through a clustering algorithm to obtain the average width and height of each class of width and height obtained through clustering, and forming a plurality of prior frames obtained through clustering;

respectively obtaining a first sample feature map and a third sample feature map of each first type of picture through two groups of convolution modules and the feature fusion layer;

And respectively inserting the prior frames into the first sample feature map and the third sample feature map of each first type of picture at a pixel level, and selecting the prior frame with highest matching degree with the lead sealing frame of the first type of picture as the identification targets of the first detection layer and the second detection layer respectively.

5. The method for identifying lead seals of containers as defined in claim 4, wherein in the training process, a priori frame having a matching degree smaller than a matching degree threshold value with lead seal frames of each of the first type pictures is also selected as a negative sample for loss calculation and optimized based on gradient descent.

6. The method for identifying a lead seal of a container as claimed in claim 5, wherein the loss calculation process comprises:

calculating a first loss value loss_box according to a first formula:

marking whether the prior frame at the position i and j contains lead sealing, if so, the prior frame is 1, and if not, the prior frame at the position j is 0; w (w) _i,j The width of the a priori frame at i, j, h _i,j The height of the a priori frame at i, j, x _i,j And y _i,j The x-coordinate and the y-coordinate of the center point of the prior frame at i, j respectively; / >

And->

calculating a second loss value loss_cls according to a second formula:

Take 0 or 1, lambda _class Taking 1;

calculating a third loss value loss_obj according to a third formula:

wherein,,

The probability that the prior frame at the position i and j is matched with the corresponding lead sealing frame in the first type of picture calculated by the current loss; lambda (lambda) _noobj And lambda (lambda) _obj Taking 0.5 each;

calculating a total loss value loss according to a fourth formula:

loss＝loss_box+loss_cls+loss_obj。

7. a container lead seal identification method as defined in claim 2, wherein said second convolutional neural network is structured to replace said hole convolution with a standard convolution.

8. A container lead seal identification method as in claim 3, wherein in the identification process of the second convolutional neural network, the detection layer further obtains the category and the confidence of the lead seal in each detection frame, and the second convolutional neural network further outputs the category and the confidence of the lead seal in the target detection frame.

9. A container lead seal identification device, comprising:

the video acquisition module is used for acquiring videos of the container;

the first recognition module is used for extracting an image from the video, inputting the image into a trained first convolutional neural network for lead sealing recognition, and at least obtaining a local image area where lead sealing is located in the image as a target image;

the second recognition module is used for inputting the target image into a trained second convolutional neural network to carry out lead sealing recognition, and at least obtaining the category and the confidence of the lead sealing;

the weighting calculation module is used for carrying out weighting calculation according to the category and the confidence coefficient of the lead seal corresponding to the plurality of images in the video to obtain the category of the lead seal of the container;

10. An electronic device, comprising:

a processor;

a memory having executable instructions stored therein;

wherein the executable instructions, when executed by the processor, implement the container lead identification method of any one of claims 1-8.

11. A computer-readable storage medium storing a program, wherein the program when executed implements the container lead seal identification method according to any one of claims 1-8.