CN111985466A - Container dangerous goods mark identification method - Google Patents

Container dangerous goods mark identification method

Info

Publication number
CN111985466A
CN111985466A (application CN202010840426.3A)
Authority
CN
China
Prior art keywords
image
pixel
dangerous goods
label
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010840426.3A
Other languages
Chinese (zh)
Inventor
丁一
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Maritime University
Original Assignee
Shanghai Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Maritime University filed Critical Shanghai Maritime University
Priority to CN202010840426.3A
Publication of CN111985466A
Legal status: Withdrawn

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/23Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on positionally close patterns or neighbourhood relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image

Abstract

The invention discloses a container dangerous goods mark identification method based on a deep learning algorithm, comprising the following steps: data acquisition; box edge detection; perspective transformation; image enhancement; dangerous goods area labeling; image standardization; region-of-interest acquisition; feature extraction; feature fusion; upsampling; hard example mining; and connected-domain calculation, finally determining whether the inspected video or picture contains a dangerous goods mark and the coordinates of that mark, and building an identification system around the model. Compared with the prior art, the method extracts image information effectively with multilayer convolution, avoids the influence of low-level information such as brightness, color and texture, uses scale fusion to avoid the loss of image feature information during the network's convolution and pooling, speeds up dangerous goods mark identification, and improves the efficiency of wharf production operations.

Description

Container dangerous goods mark identification method
Technical Field
The invention relates to the technical field of image recognition, in particular to a method for recognizing dangerous goods signs of a container.
Background
A hazardous material mark indicates the physical and chemical properties of a hazardous material and its degree of danger. Dangerous goods move by five main transport modes: rail, waterway, road, air and pipeline. Port hubs, as transfer stations between waterway, road and rail transport, carry a large share of dangerous goods traffic. After long-distance transport and repeated loading, unloading and transfer, the marks on a dangerous goods container are easily damaged and faded, so that ships and related personnel cannot grasp the properties and characteristics of the loaded goods promptly and accurately, leading to dangerous goods accidents with serious economic loss and casualties. Existing dangerous goods mark identification works on essentially the same principle, relying on radio-frequency identification and sensor technology. The present method instead extracts picture information effectively with multilayer convolution, avoids the influence of low-level information such as brightness, color and texture, uses scale fusion to avoid the loss of image feature information during the network's convolution and pooling, speeds up dangerous goods mark identification, and helps improve the efficiency of wharf production operations.
Disclosure of Invention
In view of the above disadvantages of the prior art, the present invention aims to provide a container dangerous goods mark identification method based on a deep learning algorithm.
In order to achieve the above and other related objects, the present invention provides a container dangerous goods mark identification method that builds a dangerous goods mark identification system around a neural network. The method mainly comprises the following steps:
S1, data acquisition: a camera mounted on a bridge crane or road junction serves as the video collector, and images of the container being handled are acquired with the aid of radio-frequency identification; training data are acquired and labeled and pictures to be identified are obtained; the large volume of rich training samples required by the deep learning algorithm is constructed, together with the data labeling management system needed for standardized acquisition, labeling, storage and efficient transmission of test samples. The data labeling management system mainly comprises a labeling terminal, a labeling subsystem and a big data subsystem;
further, in an implementation manner of the present invention, the step of acquiring the training data set includes:
preprocessing the acquired picture, mainly comprising format conversion, data set renaming and picture data screening;
uploading the preprocessed pictures to a big data system;
the marking subsystem acquires a picture to be marked from the big data system and sends the picture to the marking terminal;
the marking terminal marks the picture and uploads the marked picture to the big data system through the marking subsystem to generate training data;
S2, box edge detection: a Hough transform algorithm locates straight lines to detect the box edges:
obtaining all object contours of the picture;
for each point (x, y) on a contour, calculating its corresponding angle vector (θ, p);
accumulating the angle vectors of all points, clustering the (θ, p) values within a certain error range;
setting a threshold T; when the vote count of a (θ, p) cluster exceeds T, the corresponding contour point set is a straight line.
S3, perspective transformation: the image is projected onto a new viewing plane by perspective transformation. The process converts a two-dimensional coordinate into a three-dimensional coordinate system and then projects the three-dimensional coordinate system onto a new two-dimensional coordinate system; in OpenCV the perspective matrix is obtained with cv2.getPerspectiveTransform and the transformed image with cv2.warpPerspective.
S4, image enhancement: the acquired picture is further enhanced. The image enhancement comprises the following steps:
selecting daytime container-region pictures with sufficient light and clearly visible danger marks and converting them to grayscale; computing the average gray value h0 as the brightness baseline;
for a received picture to be identified, computing its average gray value h and comparing it with the baseline to obtain the proportionality coefficient

γ = h / h0

and correcting the original image:

g(x, y) = 255 · (f(x, y) / 255)^γ

(when γ > 1 the contrast of high-gray regions of the image is enhanced; when γ < 1 the contrast of low-gray regions is enhanced; when γ = 1 the image is unchanged), thereby improving the brightness of low-brightness regions.
Further, the model training step based on a large amount of labeled data and a deep learning algorithm comprises:
classifying input images: the labeled images in the big data system are structured into categorical information describing each picture with a predetermined category or instance ID;
acquiring the contour coordinates of dangerous goods signs of the training pictures through the marked training data;
acquiring a picture area (ROI) corresponding to the dangerous goods mark through the contour boundary coordinates;
a Convolutional Neural Network (CNN) is used as a feature extractor, and after high-dimensional features are extracted, feature fusion and pixel prediction are carried out in an FCN mode;
performing neural network calculation on an input image to finally generate a binary image which has the same size as the original input image and only contains 0 and 1;
Further, the image processing comprises:
S5, dangerous goods area labeling: before the training samples are used, the discriminant regions are marked in the training-set images and a binary image is output;
S6, image standardization: although a fully convolutional network (FCN) can accept a color image of any size, to improve the accuracy of model discrimination the original labeled images are cropped to 512 × 512, so that the input of the neural network is 512 × 512 × 3;
S7, region-of-interest acquisition: a candidate-region method extracts regions of interest from the contours, comprising:
obtaining an initial segmentation R = {r1, r2, …, rn} with a graph-based image segmentation method;
initializing the similarity set S = ∅;
computing the similarity of every pair of adjacent regions and adding it to the similarity set S;
finding the two most similar regions ri and rj in S, merging them into one region rt, removing from S every similarity involving ri or rj, computing the similarity between rt and each of its neighbours (the regions formerly adjacent to ri or rj), and adding the results to S; at the same time adding the new region rt to the region set R;
obtaining the bounding box (Bounding Box) of each region; the result is the set L of possible object positions;
S8, feature extraction: a model is constructed on a fully convolutional network containing 5 pooling layers, each pooling reducing the image to 1/2 the size of the previous layer. The fully convolutional network extracts features of different sizes from the input image; the final fully connected layer is converted into a 1 × 1 convolution for feature extraction, forming a heat map for use by the upsampling layer.
S9, feature fusion with skip connections: after 7 convolution layers and 5 pooling layers the feature map is 1/32 of the input size; it is deconvolved to 1/16 size, and the feature map of pooling layer 4 (pool4) is selected for fusion to generate a 1/16-size image; the 1/16-size image is then deconvolved to 1/8 size and fused with the feature map of pooling layer 3 (pool3) to generate a 1/8-size image.
S10, upsampling: bilinear interpolation determines each pixel in the target image from the four real pixels surrounding the corresponding point in the original image, doubling the height and width of the matrix.
Furthermore, the neural network adopts depthwise separable convolution, reducing convolution kernel parameters and raising computation speed. ResNet18 is used as the backbone, and feature extraction is strengthened by increasing the number of network layers; in the feature fusion stage, feature maps are concatenated (concat) after upsampling, merging parameters extracted by different layers and improving the expressive power of the different encoder layers.
S11, hard example mining: during model training, online hard example mining (OHEM) is used; in each training round, each batch of training data is sorted by its current loss value, the N samples with the largest loss are selected, the loss of the positive-sample regions is added, and gradient descent is performed on the combined loss.
S12, connected-domain calculation: after the input image passes through the neural network, a binary image of the same size as the original input, containing only 0 and 1, is generated. Positions with pixel value 1 correspond to the dangerous goods mark region; positions with value 0 are background. The connected domains of value 1 are located, finally yielding the coordinates of the dangerous goods mark region.
S13, identification system construction: after the dangerous goods mark model is trained, it is offered externally as a service to guarantee the efficiency and availability of the identification model. The identification system comprises a basic resource layer, a containerization layer, an algorithm development layer, a service layer and an acquisition layer. The basic resource layer is built from several X86-architecture hosts with GPU capability; the containerization layer uses Kubernetes as the overall solution and provides services externally in containerized form; the algorithm development layer provides customized development container images; the service layer exposes the recognition algorithm as RESTful HTTP services in container form; the acquisition layer integrates dedicated software with hardware and supports the terminal-side acquisition of identification images, videos and other data.
As described above, the container dangerous goods mark identification method provides an FCN algorithm to build a dangerous goods mark identification model, trains the model on the collected data, and finally uses the trained model to identify dangerous goods marks in pictures transmitted by cameras at the container terminal job site, effectively improving the recognition rate and accuracy of container dangerous goods mark identification.
Drawings
In order to more clearly illustrate the embodiments of the present invention and the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic flow chart of the dangerous goods identification method for the container of the present invention;
FIG. 2 is a collected image to be identified provided by an embodiment of the identification method for dangerous goods labels of a container according to the present invention;
FIG. 3 is a schematic diagram of a fully-connected neural network architecture provided by an embodiment of the method for identifying dangerous goods in a container according to the present invention;
FIG. 4 is a schematic diagram of a neural network training process provided by an embodiment of the identification method for dangerous goods in a container according to the present invention;
FIG. 5 is a schematic diagram of a convolution kernel calculation provided by an embodiment of the container dangerous goods identification method of the present invention;
FIG. 6 is a binary image obtained from the identification according to the embodiment of the dangerous goods identification method for container of the present invention;
FIG. 7 is a schematic diagram of training data collection provided by an embodiment of the dangerous goods identification method for a container according to the invention;
FIG. 8 is a schematic view of a dangerous goods identification system provided by an embodiment of the dangerous goods identification method for a container of the present invention;
Detailed Description
In order to make the technical means, inventive features, objectives and effects of the present invention easily understood, the present invention will be further described with reference to the following detailed drawings.
Please refer to fig. 1-8. It should be noted that the drawings provided in the present embodiment are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
Fig. 1 shows a method for identifying dangerous goods labels of a container, which comprises the following steps:
S1, data acquisition: a camera mounted on a bridge crane or road junction serves as the video collector, and radio-frequency identification or a trigger notifies the system equipment to prepare to capture pictures of the container being handled. A camera-carrying device such as a bridge crane may involve one or more types of sensor triggers which, on detecting certain conditions, command the attached camera to begin capturing images (video). Because one side to be recognized may be captured in several photographs, the invention stitches the photographs into one complete picture by panoramic stitching.
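The filing gives no code for the stitching step; a minimal sketch using OpenCV's high-level Stitcher might look as follows (the file names and panorama mode are illustrative assumptions, not taken from the patent):

```python
# Hypothetical sketch: stitch several shots of one container side into a panorama.
import cv2

images = [cv2.imread(p) for p in ["side_1.jpg", "side_2.jpg", "side_3.jpg"]]
stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
status, panorama = stitcher.stitch(images)
if status == cv2.Stitcher_OK:
    cv2.imwrite("container_side.jpg", panorama)
else:
    print("stitching failed, status:", status)
```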
S2, box edge detection: because dangerous goods containers are photographed in many and complex scenes, objects resembling dangerous goods marks may exist outside the box and cause false alarms. The image area outside the container therefore needs to be filtered out. As shown in fig. 2, container edges appear as standard line segments, so the straight lines (container edges) possibly present in the picture are located first. The straight-line location uses the Hough transform algorithm and comprises the following steps:
obtaining all object contours of the picture;
for each point (x, y) on a contour, calculating its corresponding angle vector (θ, p);
accumulating the angle vectors of all points, clustering the (θ, p) values within a certain error range;
setting a threshold T; when the vote count of a (θ, p) cluster exceeds T, the corresponding contour point set is a straight line.
Several straight lines are obtained in this way; the container edge point coordinates are then derived from properties such as the length of the lines, whether they are parallel, and where they intersect.
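A minimal OpenCV sketch of this edge-location step follows; the Canny and Hough thresholds are illustrative assumptions, and cv2.HoughLines returns exactly the (ρ, θ) angle vectors accumulated above:

```python
# Sketch (assumed thresholds): Canny edges followed by the standard Hough transform.
import cv2
import numpy as np

img = cv2.imread("container.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)

# Each returned line is (rho, theta); threshold=200 plays the role of T above.
lines = cv2.HoughLines(edges, rho=1, theta=np.pi / 180, threshold=200)
if lines is not None:
    for rho, theta in lines[:, 0]:
        a, b = np.cos(theta), np.sin(theta)
        x0, y0 = a * rho, b * rho
        p1 = (int(x0 - 2000 * b), int(y0 + 2000 * a))
        p2 = (int(x0 + 2000 * b), int(y0 - 2000 * a))
        cv2.line(img, p1, p2, (0, 0, 255), 2)  # draw candidate container edges
```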
S3, perspective transformation: the region inside the coordinates is stretched by a perspective transformation algorithm to obtain a standard rectangular output image. The process projects the image onto a new viewing plane: a two-dimensional coordinate is converted into a three-dimensional coordinate system and then projected back into a new two-dimensional coordinate system; in OpenCV the perspective matrix is obtained with cv2.getPerspectiveTransform and the transformed image with cv2.warpPerspective. The transformation relationship of the perspective transformation is:

[X, Y, Z]^T = M · [x, y, 1]^T, where M is the 3 × 3 perspective matrix,
x′ = X / Z, y′ = Y / Z.
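Using the two OpenCV calls named above, the rectification step can be sketched as follows; the corner coordinates here are hypothetical, standing in for the intersections found in S2:

```python
# Sketch: map four detected box corners (hypothetical values) to a rectangle.
import cv2
import numpy as np

img = cv2.imread("container.jpg")
src = np.float32([[120, 80], [1480, 95], [1500, 860], [100, 840]])  # detected corners
dst = np.float32([[0, 0], [1400, 0], [1400, 800], [0, 800]])        # target rectangle

M = cv2.getPerspectiveTransform(src, dst)        # the 3 x 3 matrix M above
rectified = cv2.warpPerspective(img, M, (1400, 800))
```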
S4, image enhancement: because dangerous goods marks are identified outdoors, weather and light intensity vary widely. At night in particular, some danger marks are photographed blurrily, lowering the recognition rate, so the image needs to be enhanced. The image enhancement comprises the following steps:
selecting daytime container-region pictures with sufficient light and clearly visible danger marks and converting them to grayscale; computing the average gray value h0 as the brightness baseline;
for a received picture to be identified, computing its average gray value h and comparing it with the baseline to obtain the proportionality coefficient

γ = h / h0

and correcting the original image:

g(x, y) = 255 · (f(x, y) / 255)^γ

(when γ > 1 the contrast of high-gray regions of the image is enhanced; when γ < 1 the contrast of low-gray regions is enhanced; when γ = 1 the image is unchanged), thereby improving the brightness of low-brightness regions;
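A short sketch of this correction; the filing shows the formulas only as images, so the direction of the ratio γ = h / h0 and the baseline value are assumptions:

```python
# Sketch: gamma correction against a daytime brightness baseline h0.
import cv2
import numpy as np

h0 = 140.0                                  # baseline from well-lit samples (assumed)
img = cv2.imread("to_identify.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gamma = gray.mean() / h0                    # gamma < 1 for a dark picture

# g = 255 * (f / 255) ** gamma, applied channel-wise via a lookup table
table = (255.0 * (np.arange(256) / 255.0) ** gamma).astype(np.uint8)
enhanced = cv2.LUT(img, table)
```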
S5, dangerous goods area labeling: as shown in fig. 2, the dangerous goods mark area is labeled; before training, the discriminant regions are marked in the training-set images and a binary image is output;
S6, image standardization: the original labeled images are cropped to 512 × 512, so that the input of the fully convolutional network is 512 × 512 × 3;
S7, region-of-interest acquisition: the contour coordinates of the dangerous goods marks in the training pictures are obtained from the labeled training data, and regions of interest are extracted from the contours with a candidate-region method (sketched in code below). The specific steps are:
obtaining an initial segmentation R = {r1, r2, …, rn} with a graph-based image segmentation method;
initializing the similarity set S = ∅;
computing the similarity of every pair of adjacent regions and adding it to the similarity set S;
finding the two most similar regions ri and rj in S, merging them into one region rt, removing from S every similarity involving ri or rj, computing the similarity between rt and each of its neighbours (the regions formerly adjacent to ri or rj), and adding the results to S; at the same time adding the new region rt to the region set R; obtaining the bounding box (Bounding Box) of each region; the result is the set L of possible object positions;
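A compact sketch of this merge loop, in the style of selective search; regions are modelled as plain pixel sets and the similarity measure is a caller-supplied function, so this illustrates the loop structure rather than the patent's exact implementation:

```python
# Sketch of the hierarchical region merging described above.
def merge_regions(R, neighbors, sim):
    """R: dict id -> pixel set; neighbors: iterable of (i, j) id pairs;
    sim(i, j, R): similarity of two regions (assumed helper)."""
    S = {(i, j): sim(i, j, R) for (i, j) in neighbors}
    proposals = list(R.values())
    while S:
        i, j = max(S, key=S.get)                 # most similar adjacent pair
        t = max(R) + 1
        R[t] = R[i] | R[j]                       # merge r_i and r_j into r_t
        stale = [p for p in S if i in p or j in p]
        former = {a for p in stale for a in p} - {i, j}
        for p in stale:                          # drop similarities involving r_i, r_j
            del S[p]
        for k in former:                         # link r_t to their old neighbours
            S[(t, k)] = sim(t, k, R)
        proposals.append(R[t])
    return proposals                             # bounding boxes of these give L
```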
S8, feature extraction: a model is built on a fully convolutional network; features of different sizes are extracted from the input image, the final fully connected layer is converted into a 1 × 1 convolution for feature extraction, and a heat map is formed for the upsampling layer. A convolutional neural network performs feature extraction on the extracted regions of interest; after the high-dimensional features are extracted, the data set is trained in an FCN-like manner.
S9, feature fusion with skip connections: after 7 convolution layers and 5 pooling layers the feature map is 1/32 of the input size; it is deconvolved to 1/16 size and fused with the feature map of pooling layer 4 to generate a 1/16-size image; the 1/16-size image is then deconvolved to 1/8 size and fused with the feature map of pooling layer 3 to generate a 1/8-size image. The deconvolution comprises the following steps:
(1) first, the convolution kernel is flipped up-down and left-right;
(2) second, the feature map produced by convolution is taken as input and expanded by zero insertion, i.e. zeros are inserted after each pixel: according to the stride of the convolution operation, (stride − 1) zeros are inserted after each element along the stride direction; if the stride is 1, no zeros are inserted;
(3) third, a second zero-padding pass is applied on the basis of the expanded feature map: taking the shape of the original input feature map as the output, the positions and number of padding zeros are computed from the convolution padding rule, and the padding positions are mirrored top-bottom and left-right;
(4) fourth, the zero-padded feature map is taken as input and a deconvolution with stride 1 is performed (a sketch of the whole fusion stage follows);
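An FCN-8s-style PyTorch sketch of the fusion stage described in S9; the channel counts (4096 for the last convolution, 512 for pool4, 256 for pool3) are assumptions borrowed from the usual VGG backbone, not taken from the filing:

```python
# Sketch: 1/32 map -> 2x up + pool4 fuse -> 2x up + pool3 fuse -> 8x up.
import torch
import torch.nn as nn

class SkipFusion(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.score32 = nn.Conv2d(4096, n_classes, 1)  # 1x1 conv on the conv7 output
        self.score16 = nn.Conv2d(512, n_classes, 1)   # 1x1 conv on pool4
        self.score8 = nn.Conv2d(256, n_classes, 1)    # 1x1 conv on pool3
        self.up2a = nn.ConvTranspose2d(n_classes, n_classes, 4, stride=2, padding=1)
        self.up2b = nn.ConvTranspose2d(n_classes, n_classes, 4, stride=2, padding=1)
        self.up8 = nn.ConvTranspose2d(n_classes, n_classes, 16, stride=8, padding=4)

    def forward(self, conv7, pool4, pool3):
        x = self.up2a(self.score32(conv7))  # 1/32 -> 1/16
        x = x + self.score16(pool4)         # fuse with the pool4 feature map
        x = self.up2b(x)                    # 1/16 -> 1/8
        x = x + self.score8(pool3)          # fuse with the pool3 feature map
        return self.up8(x)                  # 1/8 -> input resolution
```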
S10, upsampling: bilinear interpolation determines each pixel in the target image from the four real pixels surrounding the corresponding point in the original image, doubling the height and width of the matrix;
The core of the upsampling is linear interpolation in two directions. To find the value at point P = (x, y), assume the function f is known at the four points Q11 = (x1, y1), Q12 = (x1, y2), Q21 = (x2, y1) and Q22 = (x2, y2). Linear interpolation is first performed in the x direction:

f(x, y1) ≈ ((x2 − x) / (x2 − x1)) · f(Q11) + ((x − x1) / (x2 − x1)) · f(Q21)
f(x, y2) ≈ ((x2 − x) / (x2 − x1)) · f(Q12) + ((x − x1) / (x2 − x1)) · f(Q22)

then linear interpolation is performed in the y direction, giving the desired result f(x, y):

f(x, y) ≈ ((y2 − y) / (y2 − y1)) · f(x, y1) + ((y − y1) / (y2 − y1)) · f(x, y2)
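These formulas transcribe directly into NumPy; the sketch below doubles an image's height and width, sampling each target pixel from its four real neighbours:

```python
# Sketch: 2x bilinear upsampling from the formulas above.
import numpy as np

def bilinear_upsample_2x(img):
    h, w = img.shape[:2]
    ys = np.linspace(0, h - 1, 2 * h)             # target row -> source coordinate
    xs = np.linspace(0, w - 1, 2 * w)
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 2)
    wy = (ys - y0)[:, None]                       # interpolation weights in [0, 1]
    wx = (xs - x0)[None, :]
    if img.ndim == 3:                             # broadcast over colour channels
        wy, wx = wy[..., None], wx[..., None]
    p = img.astype(float)
    top = p[y0][:, x0] * (1 - wx) + p[y0][:, x0 + 1] * wx          # f(x, y1)
    bot = p[y0 + 1][:, x0] * (1 - wx) + p[y0 + 1][:, x0 + 1] * wx  # f(x, y2)
    return (top * (1 - wy) + bot * wy).astype(img.dtype)           # f(x, y)
```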
S11, hard example mining: in each round of model training, each batch of training data is sorted by its current loss value, the N samples with the largest loss are selected, the losses of the positive-sample regions are added, and gradient descent is performed on the combined loss.
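A hedged PyTorch sketch of one common pixel-level reading of this OHEM step for the two-class per-pixel output used here; the number of hard pixels kept (n_hard) is an illustrative parameter, and the loss-function formula itself appears in the filing only as an image:

```python
# Sketch: keep the N hardest negative pixels plus all positive-region pixels.
import torch
import torch.nn.functional as F

def ohem_loss(logits, target, n_hard=4096):
    """logits: (B, 2, H, W); target: (B, H, W) long, 1 = dangerous-goods mark."""
    pixel_loss = F.cross_entropy(logits, target, reduction="none").flatten()
    pos = (target == 1).flatten()
    neg_loss = pixel_loss[~pos]
    hard_neg, _ = neg_loss.topk(min(n_hard, neg_loss.numel()))  # N largest losses
    return torch.cat([hard_neg, pixel_loss[pos]]).mean()        # combine, then SGD
```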
S12, connected-domain calculation: after the input picture to be identified passes through the neural network of the invention, a binary picture of the same size, containing only 0 and 1, is generated; positions with pixel value 1 correspond to the dangerous goods mark region and positions with value 0 are background. The connected domains of value 1 are located, finally yielding the coordinates of the dangerous goods mark region. The method comprises the following steps:
(1) acquiring a binary image;
(2) the image is scanned for the first time line by line, from top to bottom and left to right, and each valid pixel is given a label according to the following rules:
a) if the left and upper pixel values in the pixel's 4-neighbourhood are both 0 and carry no label, the pixel is given a new label;
b) if exactly one of the left or upper pixel values in the 4-neighbourhood is 1, the pixel takes the label of that neighbour;
c) if the left and upper pixel values are both 1 and carry the same label, the pixel takes that label;
d) if the left and upper pixel values are both 1 but carry different labels, the pixel takes the smaller label. After this pass a connected domain may contain several different labels, and the labels of the left and upper pixels are recorded as equivalent;
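The two-pass procedure above, with the equivalences of rule d resolved by union-find, can be sketched as follows; in practice cv2.connectedComponents performs the same service:

```python
# Sketch: two-pass connected-component labelling over the 4-neighbourhood.
import numpy as np

def two_pass_label(binary):
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=int)
    parent = {}                                   # union-find over labels

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]         # path halving
            a = parent[a]
        return a

    nxt = 1
    for y in range(h):                            # first pass: provisional labels
        for x in range(w):
            if binary[y, x] == 0:
                continue
            left = labels[y, x - 1] if x > 0 else 0
            up = labels[y - 1, x] if y > 0 else 0
            if left == 0 and up == 0:             # rule a: new label
                parent[nxt] = nxt
                labels[y, x] = nxt
                nxt += 1
            elif left and up:                     # rules c/d: smaller label, record equality
                labels[y, x] = min(left, up)
                parent[find(max(left, up))] = find(min(left, up))
            else:                                 # rule b: copy the single neighbour
                labels[y, x] = left or up
    for y in range(h):                            # second pass: resolve equivalences
        for x in range(w):
            if labels[y, x]:
                labels[y, x] = find(labels[y, x])
    return labels
```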
S13, identification system construction: after the dangerous goods mark model is trained, it is offered externally as a service to guarantee the efficiency and availability of the identification model;
fig. 3, 4 and 5 disclose the network structure, model-training process and convolution-kernel calculation of the invention. Image discrimination is mainly based on a Unet-style network structure, comprising convolution layers, activation functions, deconvolution and a skip-level structure. The combination of convolution layers and activation functions performs feature extraction, downsampling and data dimensionality reduction; the deconvolution layer produces predictions for the input image; the skip-level structure combines local information with global information in the coarse output of the deconvolution layer to refine the output result.
The model network architecture is divided into two parts: a fully convolutional part and a deconvolution part. The left half of the network architecture depicted in fig. 3 is the fully convolutional part and the right half is the deconvolution part. The fully convolutional part converts the last fully connected layer into a 1 × 1 convolution for feature extraction, forming a heat map. The deconvolution part upsamples the small heat map to obtain a semantic segmentation image of the original size;
the convolutional neural network can input an image of arbitrary size, the output is the same as the input size, but the original image is still cut to 512 x 512 size for batch gradient descent of the training set data. The present invention divides images into two categories: background + critical area, and therefore the number of channels, i.e. the depth, is 2. And inputting the original gray image and the depth image into a network in parallel, wherein the upper-layer original gray image comprises a plurality of information such as scenes, positions and the like. The input to the fully convolutional neural network is a 512 x 3 image, which is passed through a series of convolutional and downsampled layers to transform the image data into a feature matrix of size 7 x 2. At this time, the 3 fully-connected layers are converted into convolutional layers, each filter is provided with a filter size, and the final output data volume is 1 × 1000. And each convolution reduces the resolution of the image, and a final feature map is obtained after multiple convolutions, wherein the feature map contains all semantic information in the image.
The deconvolution operation can be understood as the inverse operation of the convolution operation, and the result after convolution is convolved again by transposing the convolution kernel, so that the image is reduced to the original image size to obtain the pixel-by-pixel prediction result.
In the pixel-by-pixel prediction process the prediction depth is 2: the input is 16 × 16 × 4096, a 1 × 1 convolution template is applied, and the output is 16 × 16 × 2, predicting 2 classes of results from the 4096-dimensional features. The deconvolution process enlarges the image pixels and raises the resolution of the image until it matches the original; regions with high weight are where the target lies.
The result of upsampling directly from the smallest feature map to the original size in the deconvolution process is usually very coarse, so a skip-level structure that combines local information with global information can be used to refine the coarse output. In the network structure shown in fig. 3, the original image undergoes convolution conv1 and after pool1 is reduced to 1/2; a second convolution conv2 and pool2 reduce it to 1/4; a third convolution conv3 and pool3 reduce it to 1/8, and so on. The classification result obtained by direct upsampling is not accurate enough and cannot recover some details of the image. Therefore the outputs of the third and fourth layers are also deconvolved in turn, requiring 8× and 16× upsampling respectively; local and global information are both taken into account and the result is finer.
Fig. 6 discloses a binarized picture obtained by the container dangerous goods mark identification method. Binarizing an image sets the gray value of each point to 0 or 255, giving the whole image an unmistakable black-and-white appearance: a suitable threshold is chosen on the 256-level grayscale image to obtain a binary image that still reflects the global and local features of the image.
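In OpenCV this thresholding is a single call; the threshold value 127 below is an illustrative choice (Otsu's method is a common automatic alternative):

```python
# Sketch: binarise a 256-level grayscale image.
import cv2

gray = cv2.imread("prediction.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
```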
Fig. 7 shows that the dangerous goods mark identification system for the container comprises a data marking management system and a dangerous goods mark identification service system. The data labeling management system mainly comprises a labeling terminal, a labeling subsystem and a big data subsystem. The marking terminal is a PC computer used by a marking person, and marking work is carried out on the marking terminal; the marking subsystem mainly acquires original marks and preprocessed materials stored in the big data system from the big data system, and can issue marking tasks and corresponding data to be marked to the terminal; and receiving the labeled data uploaded by the terminal, verifying the uploaded data, and synchronizing the labeled data to the big data system. The big data system provides a marked data persistence function and realizes efficient data export for model training.
In the training data labeling system, data sources mainly comprise video and picture data, and the sources mainly comprise the following types:
(1) relevant video and picture data acquired in a container terminal management system (TOS);
(2) relevant video and picture data crawled and downloaded on the internet;
(3) and (4) uploading related video and picture data by the operator.
A deep neural network contains a large number of parameters; for these parameters to work properly, the acquired data need to be augmented to improve the generalization ability of the model, add noise data, and strengthen the model's robustness. For the container dangerous goods mark identification disclosed here, segmentation is used to cut the dangerous goods marks out of the images; several shape images are obtained for each mark, and the cut-outs are randomly rotated and flipped to produce large-scale training data.
The training data preprocessing of the data labeling management system disclosed by the invention is completed in a labeling subsystem, and the processing of a data source mainly comprises the following points:
(1) converting the format, namely converting the data into a format which can be processed by model training, such as converting a video into an mp4 format and converting a picture into a jpg format;
(2) renaming the source data according to the data specification, and reserving original information;
(3) extracting frames from the video data as picture data according to needs;
(4) the pictures after the frame extraction processing are sent to a labeling terminal for screening, and suitable pictures are screened out for data labeling;
fig. 8 shows that the hazardous article identifier recognition system disclosed by the invention comprises a base resource layer, a containerization layer, an algorithm development layer, a service layer and an acquisition layer.
The basic resource layer is constructed by a plurality of X86 architectures and a host with GPU capability, and is used as a hardware basic layer of cloud service to provide basic functions of computing, storing, networking and the like.
The containerization layer is used as an integral solution by Kubemeters and provides services to the outside in a containerization mode;
the algorithm development layer provides a customized development container mirror image, and deploys a development environment and required controls rapidly;
The service layer exposes the recognition algorithm as RESTful HTTP services in container form;
the acquisition layer integrates a specific software program through hardware and provides hardware support for acquisition work of terminal identification images, videos, acquisition data and the like.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Therefore, it is intended that all modifications and variations which may occur to those skilled in the art without departing from the spirit and scope of the invention disclosed herein be covered by the appended claims.

Claims (1)

1. A method for identifying dangerous goods marks of a container, comprising the steps of:
S1, data acquisition: a camera mounted on a bridge crane or road junction serves as the video collector, and images of the container being handled are acquired with the aid of radio-frequency identification;
S2, box edge detection: a Hough transform algorithm locates straight lines; for each point (x, y) on an object contour the corresponding angle vector (θ, p) is calculated, and when the vote count of an angle vector (θ, p) exceeds a set threshold T the contour points form a straight line;
S3, perspective transformation: the two-dimensional matrix image is transformed through a three-dimensional representation; a point before transformation is taken with Z value 1, so its three-dimensional value is (x, y, 1) and its projection on the two-dimensional plane is (x, y); the point is transformed by a matrix into the three-dimensional point (X, Y, Z) and then back into the two-dimensional point (x′, y′) by dividing by the Z-axis value:

[X, Y, Z]^T = M · [x, y, 1]^T, where M is the 3 × 3 perspective matrix,
x′ = X / Z, y′ = Y / Z;
S4, image enhancement: daytime container-region pictures with sufficient light and clearly visible dangerous goods marks are selected and converted to grayscale, and the average gray value h0 is computed as the brightness baseline; for a received picture to be identified, its average gray value h is computed and compared with the baseline to obtain the proportionality coefficient

γ = h / h0

and the original image is corrected:

g(x, y) = 255 · (f(x, y) / 255)^γ

when γ > 1 the contrast of high-gray regions of the image is enhanced; when γ < 1 the contrast of low-gray regions is enhanced; when γ = 1 the image is unchanged, thereby improving the brightness of low-brightness regions;
S5, dangerous goods area labeling: the area of the dangerous goods mark is labeled; before training, the discriminant regions are marked in the training-set images and a binary image is output;
S6, image standardization: the original labeled images are cropped to 512 × 512, so that the input of the fully convolutional network is 512 × 512 × 3;
S7, region-of-interest acquisition: the contour coordinates of the dangerous goods marks in the training pictures are obtained from the labeled training data, and the picture regions corresponding to the marks are obtained from the contour boundary coordinates;
S8, feature extraction: a model is built on a fully convolutional network; features of different sizes are extracted from the input image, the final fully connected layer is converted into a 1 × 1 convolution for feature extraction, and a heat map is formed for the upsampling layer;
S9, feature fusion with skip connections: after 7 convolution layers and 5 pooling layers the feature map is 1/32 of the input size; it is deconvolved to 1/16 size and fused with the feature map of pooling layer 4 to generate a 1/16-size image; the 1/16-size image is then deconvolved to 1/8 size and fused with the feature map of pooling layer 3 to generate a 1/8-size image;
S10, upsampling: bilinear interpolation determines each pixel in the target image from the four real pixels surrounding the corresponding point in the original image, doubling the height and width of the matrix;
S11, hard example mining: in each round of model training, each batch of training data is sorted by its current loss value, the N samples with the largest loss are selected, the losses of the positive-sample regions are added, and gradient descent is performed on the combined loss;
S12, connected-domain calculation: the input picture to be identified is passed through the neural network of the invention, finally generating a binary picture of the same size containing only 0 and 1; positions with pixel value 1 correspond to the dangerous goods mark region and positions with pixel value 0 are background; the connected domains of value 1 are located, finally yielding the coordinates of the dangerous goods mark region;
S13, identification system construction: after the dangerous goods mark model is trained, it is offered externally as a service to guarantee the efficiency and availability of the identification model;
the data acquisition in step S1 includes training-data acquisition and labeling and the acquisition of pictures to be recognized; the large volume of rich training samples required by deep learning training is constructed, together with the data labeling management system needed for standardized acquisition, labeling, storage and efficient transmission of test samples; the data labeling management system mainly comprises a labeling terminal, a labeling subsystem and a big data subsystem;
the step S7 of acquiring the region of interest includes the following steps:
(1) first, an initial segmentation R = {r1, r2, …, rn} is obtained with a graph-based image segmentation method, and the similarity set is initialized to S = ∅;
(2) second, the similarity of every pair of adjacent regions is computed and added to the similarity set S;
(3) third, the two most similar regions ri and rj are found in the similarity set S and merged into one region rt; every similarity involving ri or rj is removed from the set; the similarity between rt and its adjacent regions is computed and added to the similarity set S, and the new region rt is added to the region set R;
(4) fourth, the bounding box of each region is obtained; the result is the set L of possible dangerous goods mark positions;
the step S8 feature extraction divides the images into two categories, background + critical area, so the number of channels, i.e. the depth, is 2; the original gray image and the depth image are fed into the network in parallel, the original gray image carrying scene, position and other information; the input of the fully convolutional network is a 512 × 512 × 3 image, which passes through a series of convolution and downsampling layers that turn the image data into a feature matrix of size 7 × 7 × 2; at this point the 3 fully connected layers are converted into convolutional layers, each with its own filter size, and the final output data volume is 1 × 1 × 1000; each convolution reduces the resolution of the image, and after multiple convolutions a final feature map is obtained that contains all the semantic information in the image;
the step S9 feature fusion includes the following deconvolution steps:
(1) first, the convolution kernel is flipped up-down and left-right;
(2) second, the feature map produced by convolution is taken as input and expanded by zero insertion, i.e. zeros are inserted after each pixel: according to the stride of the convolution operation, (stride − 1) zeros are inserted after each element along the stride direction; if the stride is 1, no zeros are inserted;
(3) third, a second zero-padding pass is applied on the basis of the expanded feature map: taking the shape of the original input feature map as the output, the positions and number of padding zeros are computed from the convolution padding rule, and the padding positions are mirrored top-bottom and left-right;
(4) fourth, the zero-padded feature map is taken as input and a deconvolution with stride 1 is performed;
the core of step S10 is linear interpolation in two directions: to find the value at point P = (x, y), the function f is assumed known at the four points Q11 = (x1, y1), Q12 = (x1, y2), Q21 = (x2, y1) and Q22 = (x2, y2); linear interpolation is first performed in the x direction:

f(x, y1) ≈ ((x2 − x) / (x2 − x1)) · f(Q11) + ((x − x1) / (x2 − x1)) · f(Q21)
f(x, y2) ≈ ((x2 − x) / (x2 − x1)) · f(Q12) + ((x − x1) / (x2 − x1)) · f(Q22)

then linear interpolation is performed in the y direction, giving the desired result:

f(x, y) ≈ ((y2 − y) / (y2 − y1)) · f(x, y1) + ((y − y1) / (y2 − y1)) · f(x, y2);
the step S12 of calculating the connected component includes the steps of:
(1) acquiring a binary image;
(2) the image is scanned for the first time line by line, from top to bottom and left to right, and each valid pixel is given a label according to the following rules:
a) if the left and upper pixel values in the pixel's 4-neighbourhood are both 0 and carry no label, the pixel is given a new label;
b) if exactly one of the left or upper pixel values in the 4-neighbourhood is 1, the pixel takes the label of that neighbour;
c) if the left and upper pixel values are both 1 and carry the same label, the pixel takes that label;
d) if the left and upper pixel values are both 1 but carry different labels, the pixel takes the smaller label; after this pass a connected domain may contain several different labels, and the labels of the left and upper pixels are recorded as equivalent;
(3) the image is scanned line by line a second time, and each label in an equivalence relation is replaced by the smallest label in its class, i.e. the provisional labels are visited and merged into their equivalence classes;
the system in step S13 is divided into multiple layers, including:
the system comprises a basic resource layer, a containerization layer, an algorithm development layer, a service layer and an acquisition layer.
CN202010840426.3A 2020-08-19 2020-08-19 Container dangerous goods mark identification method Withdrawn CN111985466A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010840426.3A CN111985466A (en) 2020-08-19 2020-08-19 Container dangerous goods mark identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010840426.3A CN111985466A (en) 2020-08-19 2020-08-19 Container dangerous goods mark identification method

Publications (1)

Publication Number Publication Date
CN111985466A true CN111985466A (en) 2020-11-24

Family

ID=73434894

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010840426.3A Withdrawn CN111985466A (en) 2020-08-19 2020-08-19 Container dangerous goods mark identification method

Country Status (1)

Country Link
CN (1) CN111985466A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112598073A (en) * 2020-12-28 2021-04-02 南方电网深圳数字电网研究院有限公司 Power grid equipment image labeling method, electronic equipment and storage medium
CN113269197A (en) * 2021-04-25 2021-08-17 南京三百云信息科技有限公司 Certificate image vertex coordinate regression system and identification method based on semantic segmentation
CN113269197B (en) * 2021-04-25 2024-03-08 南京三百云信息科技有限公司 Certificate image vertex coordinate regression system and identification method based on semantic segmentation
CN114782676A (en) * 2022-04-02 2022-07-22 北京广播电视台 Method and system for extracting region of interest of video
CN114782676B (en) * 2022-04-02 2023-01-06 北京广播电视台 Method and system for extracting region of interest of video
CN114882342A (en) * 2022-05-11 2022-08-09 北京国泰星云科技有限公司 Container dangerous article identification detection method based on machine vision and deep learning
CN114743073A (en) * 2022-06-13 2022-07-12 交通运输部水运科学研究所 Dangerous cargo container early warning method and device based on deep learning
CN116429790A (en) * 2023-06-14 2023-07-14 山东力乐新材料研究院有限公司 Wooden packing box production line intelligent management and control system based on data analysis
CN116429790B (en) * 2023-06-14 2023-08-15 山东力乐新材料研究院有限公司 Wooden packing box production line intelligent management and control system based on data analysis
CN116703918A (en) * 2023-08-07 2023-09-05 山东辰欣佛都药业股份有限公司 Medicine packaging box quality detection method and system based on neural network model
CN116703918B (en) * 2023-08-07 2023-10-20 山东辰欣佛都药业股份有限公司 Medicine packaging box quality detection method and system based on neural network model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication
Application publication date: 20201124