CN115830585A - Port container number identification method based on image enhancement

Info

Publication number: CN115830585A
Application number: CN202211545908.1A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: image, enhancement, illumination, network, decomposition
Legal status: Pending
Inventors: 宋广军, 侯佳辛, 陆洋, 毕振波
Current Assignee: Zhejiang Ocean University (ZJOU)
Original Assignee: Zhejiang Ocean University (ZJOU)

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a port container number identification method based on image enhancement, which comprises the following steps: collecting container images under a low-illumination environment at a freight port to form an image library; constructing a Retinex-decomposition-based adversarial enhancement network and performing image enhancement on the image library to improve the enhancement effect; preprocessing the enhanced container images, extracting the container number characters to construct a character data set, and iteratively training the constructed convolutional neural network model on this data set to improve character recognition accuracy; loading an actually acquired low-illumination container image to be identified into the trained image enhancement network to complete the enhancement of the low-illumination image and obtain a high-quality image; and preprocessing the enhanced high-quality image, extracting the container number character images to be recognized, and inputting them into the trained convolutional neural network model for container number recognition to obtain the final result.

Description

Port container number identification method based on image enhancement
Technical Field
The invention belongs to the technical field of container identification, and particularly relates to a port container number identification method based on image enhancement.
Background
The growing number of containers has driven the continuous expansion of port operations, and ports must run container tallying around the clock to avoid heavy backlogs of containers. A modern, intelligent shipping system for port containers is therefore indispensable, and its essential basic task is to intelligently identify and collect container number information.
However, at most small ports and general port areas in China, container tallying is still performed manually, with several tallying personnel collecting container number information on site. This mode is highly error-prone both during recording and during later verification; it also fails to improve work efficiency while greatly increasing enterprise operating costs. In particular, when container numbers are identified and collected in low-light environments at night, areas with insufficient illumination or shadows force most ports to deploy multiple high-wattage industrial searchlights to reduce the acquisition errors caused by low light, which in turn increases energy consumption. Therefore, the problem that low illumination at night blurs image details and degrades image recognition is a technical hotspot that urgently needs to be solved.
The present application mainly addresses the enhancement of low-quality container images acquired under low-illumination conditions at night. The algorithm adjusts the overall hue of the image and the corresponding color saturation to brighten dark regions and suppress over-bright regions to a certain extent, thereby effectively improving image contrast and enhancing the detail features of the container number character region, so that the container number can subsequently be located and recognized clearly. This improves the accuracy of container number character recognition and provides a guarantee for intelligent cargo handling operations on port containers.
Disclosure of Invention
The invention provides a port container number identification method based on image enhancement, aiming to solve technical problems such as the errors that easily occur when acquiring, locating, and recognizing low-illumination container images at night.
The specific technical scheme of the application is as follows:
a port container number identification method based on image enhancement comprises the following steps:
collecting container images under a low-illumination environment at a freight port to form an image library;
constructing a Retinex-decomposition-based adversarial enhancement network, and performing image enhancement on the image library to improve the enhancement effect;
preprocessing the enhanced container images, extracting the container number characters to construct a character data set, and iteratively training the constructed convolutional neural network model on this data set to improve character recognition accuracy;
loading the actually acquired low-illumination container image to be identified into the trained image enhancement network to complete the enhancement of the low-illumination image and obtain a high-quality image;
and preprocessing the enhanced high-quality image, extracting the container number character images to be recognized, and inputting them into the trained convolutional neural network model for container number recognition to obtain the final result.
Further, constructing the Retinex-decomposition-based adversarial enhancement network comprises the following steps:
S1, effectively decomposing the original low-illumination image in a decomposition network, with multiple decomposition losses ensuring that details can be recovered during decomposition;
S2, performing enhancement learning on the decomposed result with a fusion enhancement network;
and S3, playing an adversarial game between the fused enhancement result and the original reference image through a discriminator network to obtain an enhancement result closer to the reference image.
Further, the multi-term decomposition loss of the decomposition network in step S1 is L_RD = L_ini + α·L_wtv + β·L_com + γ·L_err, where L_ini is the initialization loss, L_wtv the weighted total variation loss, L_com the decomposition loss, and L_err the reflection error loss, and the constants α, β and γ are weighting parameters; the terms are as follows:
Initialization loss L_ini: the decomposition of the illumination P_I is guided by computing the mean square error (MSE) between the decomposed illumination P_I and the estimated illumination P_v, with the following formula:
L_ini = ||In_I - In_V||² + ||Lb_I - Lb_V||²
where In_I is the illumination component decomposed from the input image, In_V is the estimated illumination component of the input image, Lb_I is the illumination component decomposed from the reference image, and Lb_V is the estimated illumination component of the reference image;
Weighted total variation loss L_wtv: L_wtv uses the total variation of the image to constrain image noise, so as to smooth the illumination component and avoid halo artifacts; a weight matrix P_W is defined as:
P_W,d(x) = 1 / ( |Σ_{y∈w(x)} ∇_d In_V(y)| + ε ),  d ∈ {h, v}
where ∇_h and ∇_v denote the total variation (gradients) in the horizontal and vertical directions, w(x) is a 3×3 window centered on pixel x, and ε is a small stabilizing constant, so L_wtv can be expressed as:
L_wtv = Σ_x Σ_{d∈{h,v}} P_W,d(x) · |∇_d In_I(x)|
Decomposition loss L_com: the accuracy of the image decomposition is ensured by computing the mean square error (MSE) between P_I·P_R and P, where P is the original image and P_R is the decomposed reflectance; the specific calculation is:
L_com = ||In_I·In_R - Input||² + ||Lb_I·Lb_R - Label||²
where Input is the original low-illumination image and Label is the target (reference) image;
Reflection error loss L_err:
L_err = ||In_R - Lb_R||²
where In_R is the reflectance component of the input image and Lb_R is the reflectance component of the reference image; computing the mean square error (MSE) between In_R and Lb_R ensures the accuracy of the reflectance obtained by the decomposition.
Further, the fusion enhancement network in step S2 includes a CRM, which is used to map the illumination component decomposed from the low-illumination image to a roughly well-exposed result; the CRM processing can be represented by the following formulas:
k = (1 / (In_I + ε))^a
CRM(In_I, Input) = e^(b(1-k)) · Input^k
where ε is a small constant and a and b are camera parameters, set to -0.3293 and 1.1258 respectively to accommodate most exposures; the final enhancement result of the fusion enhancement network can then be expressed as:
Out = FENet(Input, CRM(In_I, Input), In_R),
The pre-training of this fusion enhancement network uses the content loss based on the VGG19 model:
L_con = (1/(C_j·H_j·W_j)) · ||φ_j(Out) - φ_j(Label)||²
where φ_j(·) denotes the feature map of the j-th layer of the pre-trained VGG19 network and C_j, H_j, W_j are its dimensions.
the VGG19 network structure totally comprises 19 hidden layers, and consists of 16 convolution layers of 3 multiplied by 3 and 3 full connection layers of 2 multiplied by 2, the whole structure is very clear and concise, and the convolution layer combination of a plurality of small filters in the VGG19 network is better than the filtering effect generated by directly using one large filter.
Further, in step S3 the discriminator network includes 6 convolutional layers and 1 fully connected layer and is activated with the PReLU function; a new adversarial loss RDGAN_d is used in the alternating training of the discriminator network, computed from the enhancement result and the reflectance component of the original reference image, which helps to recover image color and details; the formula is as follows:
L_RDGAN_d = -log(D_real) - log(1 - D_fake)
The final adversarial enhancement loss consists of the content loss from the pre-trained VGG19 model of the fusion enhancement network and the adversarial loss:
L_FE = L_con + L_RDGAN_g
where D_fake is the probability that the input image belongs to the enhanced image, D_real is the probability that the input image belongs to the reference image, and Out_R denotes the reflectance component obtained by decomposing the enhancement result image with the decomposition network.
Further, the image preprocessing comprises the steps of carrying out image graying, gray stretching, binarization and morphological processing on the container image after the enhancement processing.
Further, extracting the container number characters comprises projecting the preprocessed binary image in the horizontal direction and analyzing the characteristics of its gray-level distribution; by counting the number of white pixels along the y direction, the container number is effectively distinguished from the other container parameter information printed below it.
Furthermore, a convolutional neural network model is adopted for identifying the container number characters, and the specific structure of the convolutional neural network model comprises 2 convolutional layers, 1 maximum pooling layer, 2 full-connection layers and a Dropout layer.
Further, a GUI visualization system interface is designed using the PyQt5 tool, displaying the container number identification workflow and its results together so as to present the complete working process.
Further, the GUI visualization system interface includes the loaded original low light container box image, the enhanced box image, the located box number region, and the finally obtained box number character recognition result.
Compared with the prior art, the invention has the following advantages:
(1) For the enhancement of low-illumination container images captured at night, the method of enhancing low-illumination images with a Retinex-decomposition-based adversarial enhancement network is analyzed, the enhancement network is built, and actually acquired night-time container images are input into the network to obtain the corresponding processing results;
(2) Several image quality evaluation parameters are introduced to objectively analyze the results produced by the network model, combined with the corresponding brightness distribution histograms; the various measurements confirm that images processed by the Retinex-decomposition-based adversarial enhancement network effectively avoid color distortion, preserve image details more completely, and effectively suppress the noise introduced during enhancement;
(3) To reduce the difficulty of locating the container number characters, a series of preprocessing operations, including gray-level stretching, morphological processing and binarization, is applied to the enhanced result image, which removes noise and other interference factors and avoids adhesion between characters;
(4) The container number characters are acquired by studying and analyzing the container number region and the character features and adopting a projection-based localization and segmentation method; the overall image is first projected horizontally to locate the container number region, the region image is then projected vertically, the peak and valley characteristics of the corresponding pixel distribution histogram are captured, and the container number characters are accurately segmented;
(5) For the recognition and acquisition of container number information, a convolutional neural network is built and iteratively trained on the constructed container number character data set, and the training accuracy reaches 98.09%;
(6) To visually present the effect achieved by each part and show the completeness of the whole working process, the called controls are configured with the Qt Designer visual interface designer in the PyQt5 toolkit, and a corresponding GUI system interface is designed;
(7) Some of the actually collected night-time port container images were extracted and input into the final integrated working system for repeated testing; the presented results show that, for container images that are poorly lit or affected by shadows at night, a good enhancement effect is achieved, and the container number recognition accuracy after enhancement remains above 95%.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a general framework diagram of the countermeasure enhancement network based on Retinex decomposition according to the present invention;
FIG. 3 shows the component P_v used to initialize the estimated illumination according to the present invention;
FIG. 4 compares the illumination component image obtained by the actual decomposition according to the present invention with the decomposition result obtained without using P_v to initialize the estimated illumination;
FIG. 5 is a CRM function definition of the present invention;
FIG. 6 is a graph showing the CRM adjustment result of the present invention;
FIG. 7 is a graph comparing the enhancement results of the present invention with the original low-light image;
FIG. 8 is a luminance histogram of the enhancement result of the present invention and the original low-luminance image;
FIG. 9 is an image of a container according to an embodiment of the present invention;
FIG. 10 is a diagram of the effect of horizontal projection according to the present invention;
FIG. 11 is a diagram of the vertical projection effect of the present invention;
FIG. 12 is a diagram of a convolutional neural network model (CNN) architecture of the present invention;
FIG. 13 is a box number character dataset of the present invention;
FIG. 14 is a table of accuracy and loss for convolutional neural network training of the present invention;
FIG. 15 is the training loss curve and the training accuracy curve of FIG. 14 of the present invention;
FIG. 16 is a block diagram of a port container box number identification system of the present invention;
FIG. 17 is a graph of the challenge enhancement network test results of the present invention;
FIG. 18 is a system flow diagram of the present invention;
FIG. 19 is a table of system test accuracy in accordance with the present invention;
FIG. 20 is a graph of system test identification accuracy of the present invention.
Detailed Description
In order to make the specific solution of the present invention more clear to those skilled in the art, the port container number identification method based on image enhancement of the present invention is further described below with reference to the accompanying drawings.
As shown in fig. 1, a port container number identification method based on image enhancement includes the following steps:
S1, collecting container images under a low-illumination environment at a freight port to form an image library;
S2, constructing a Retinex-decomposition-based adversarial enhancement network, and performing image enhancement on the image library to improve the enhancement effect;
S3, preprocessing the enhanced container images, extracting the container number characters to construct a character data set, and iteratively training the constructed convolutional neural network model on this data set to improve character recognition accuracy;
S4, loading the actually acquired low-illumination container image to be identified into the trained image enhancement network to perform the enhancement of the low-illumination image and obtain a high-quality image;
and S5, preprocessing the enhanced high-quality image, extracting the container number character images to be recognized, and inputting them into the trained convolutional neural network model for container number recognition to obtain the final result.
Specifically, the step S1 is to acquire an original image with low illumination: the original image may be an image acquired in an environment with insufficient illumination level, for example, an image acquired at night, which is not particularly limited in the embodiment of the present invention, and for convenience of description, the "original image with low illumination" is simply referred to as a "low illumination image" or an "original image" hereinafter.
As shown in fig. 2, the Retinex-decomposition-based adversarial enhancement network in step S2 is an adversarial learning framework built on Retinex theory, with better color and detail recovery performance. It is an end-to-end adversarial learning network whose overall framework comprises a generator network (Generator Network) and a discriminator network (Discriminator Network), where the generator network can be divided into a decomposition network (RDNet) and a fusion enhancement network (FENet). The original low-illumination image is first effectively decomposed in the decomposition network, with multiple decomposition losses ensuring that details can be recovered during decomposition; the fusion enhancement network then performs enhancement learning on the decomposed result; the purpose of the discriminator network is to play an adversarial game between the fused enhancement result and the original reference image so as to obtain an enhancement result closer to the reference image. The specific steps are as follows:
(1) Decomposition network:
the method comprises the steps of taking an original image and a corresponding reference image thereof as input of a decomposition network, outputting an obtained illumination component and a reflectivity component, constructing a brightness graph by finding a maximum brightness value channel in the illumination component, greatly reducing a resolution space and reducing calculation cost, therefore, operating an HSV (hue, saturation and value) space of the original image in the decomposition network, and carrying out morphological closing operation on an obtained V channel component to obtain P for initializing and estimating the illumination component in order to fuse a narrow fracture and fill a gap on a contour v (ii) a The whole network structure is composed of a plurality of convolution layers, and deconvolution operation is introduced to change a low-resolution image into a high-resolution image; at the end of the decomposition network, the nearest neighbor upsampling layer and the 3 x 3 convolutional layer are used to replace the deconvolution layer, while the PRelu is used as the activation function to consider the negative region.
The multiple decomposition losses of the decomposition network are L_RD = L_ini + α·L_wtv + β·L_com + γ·L_err, where L_ini is the initialization loss, L_wtv the weighted total variation loss, L_com the decomposition loss, and L_err the reflection error loss, and the constants α, β and γ are weighting parameters; the terms are as follows:
Initialization loss L_ini: the decomposition of the illumination P_I is guided by computing the mean square error (MSE) between the decomposed illumination P_I and the estimated illumination P_v, with the following formula:
L_ini = ||In_I - In_V||² + ||Lb_I - Lb_V||²
where In_I is the illumination component decomposed from the input image, In_V is the estimated illumination component of the input image, Lb_I is the illumination component decomposed from the reference image, and Lb_V is the estimated illumination component of the reference image;
Weighted total variation loss L_wtv: L_wtv uses the total variation of the image to constrain image noise, so as to smooth the illumination component and avoid halo artifacts; a weight matrix P_W is defined as:
P_W,d(x) = 1 / ( |Σ_{y∈w(x)} ∇_d In_V(y)| + ε ),  d ∈ {h, v}
where ∇_h and ∇_v denote the total variation (gradients) in the horizontal and vertical directions, w(x) is a 3×3 window centered on pixel x, and ε is a small stabilizing constant, so L_wtv can be expressed as:
L_wtv = Σ_x Σ_{d∈{h,v}} P_W,d(x) · |∇_d In_I(x)|
Decomposition loss L_com: the accuracy of the image decomposition is ensured by computing the mean square error (MSE) between P_I·P_R and P, where P is the original image and P_R is the decomposed reflectance; the specific calculation is:
L_com = ||In_I·In_R - Input||² + ||Lb_I·Lb_R - Label||²
where Input is the original low-illumination image and Label is the target (reference) image;
Reflection error loss L_err:
L_err = ||In_R - Lb_R||²
where In_R is the reflectance component of the input image and Lb_R is the reflectance component of the reference image; computing the mean square error (MSE) between In_R and Lb_R ensures the accuracy of the reflectance obtained by the decomposition.
(2) Fusion enhancement network:
The framework of the fusion enhancement network is essentially the same as that of the decomposition network; the main differences are the number of convolutional layers and the absence of a fully connected layer in the fusion enhancement network. To maintain the naturalness of the image, a CRM (Camera Response Model) is added to the fusion enhancement network; the CRM maps the illumination component decomposed from the low-illumination image to a roughly well-exposed result, and the CRM processing can be represented by the following formulas:
k = (1 / (In_I + ε))^a
CRM(In_I, Input) = e^(b(1-k)) · Input^k
where ε is a small constant and a and b are camera parameters, set to -0.3293 and 1.1258 respectively to accommodate most exposures, so the final enhancement result of the fusion enhancement network can be expressed as:
Out = FENet(Input, CRM(In_I, Input), In_R)
The pre-training of this fusion enhancement network uses the content loss based on the VGG19 model:
L_con = (1/(C_j·H_j·W_j)) · ||φ_j(Out) - φ_j(Label)||²
where φ_j(·) denotes the feature map of the j-th layer of the pre-trained VGG19 network and C_j, H_j, W_j are its dimensions.
the VGG19 network structure comprises 19 hidden layers, and the hidden layers comprise 16 convolution layers of 3 x 3 and 3 full connection layers of 2 x 2, so that the integral structure is very clear and concise, and the combination of the convolution layers using a plurality of small filters in the VGG19 network is better than the filtering effect generated by directly using one large filter.
(3) Discriminator network:
The overall structure of the discriminator network consists of 6 convolutional layers and 1 fully connected layer, activated with the PReLU function. A new adversarial loss RDGAN_d is adopted in the alternating training of the discriminator network; it is computed from the enhancement result and the reflectance component of the original reference image and helps to recover image color and details. The formula is expressed as:
L_RDGAN_d = -log(D_real) - log(1 - D_fake)
where D_fake is the probability that the input image belongs to the enhanced image, D_real is the probability that the input image belongs to the reference image, and Out_R denotes the reflectance component obtained by decomposing the enhancement result image with the decomposition network.
Thus, the final adversarial enhancement loss consists of the content loss from the pre-trained VGG19 model of the fusion enhancement network and the adversarial loss: L_FE = L_con + L_RDGAN_g.
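A minimal sketch of how the adversarial terms could be computed with standard binary cross-entropy; this particular loss form is an assumption, since the description only states that RDGAN_d is computed from the enhancement result and the reflectance component of the reference image:

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()

def discriminator_loss(d_real, d_fake):
    # d_real: discriminator output for the reference image,
    # d_fake: discriminator output for the enhanced image.
    return bce(tf.ones_like(d_real), d_real) + bce(tf.zeros_like(d_fake), d_fake)

def generator_adversarial_loss(d_fake):
    # L_RDGAN_g: the generator tries to make the enhanced image look real.
    return bce(tf.ones_like(d_fake), d_fake)
```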
For the adversarial learning framework of the Retinex-decomposition-based adversarial enhancement network, the main purpose is to feed the low-illumination image and its corresponding reference image into the network at the same time, and to drive the whole adversarial enhancement network to gradually learn the true distribution of the samples through their continuous game. For this reason, an image data set containing two types of samples must be provided for training the adversarial enhancement network model: one type consists of low-illumination images with different exposures, and the other of the corresponding reference images under normal illumination.
This embodiment illustrates the enhancement process of the whole adversarial enhancement network with a low-illumination container image actually collected at night; the specific enhancement process comprises the following steps:
A1. Acquiring the V-channel image for the initialized illumination component: the V-channel component of the original image is extracted in HSV space, and a morphological closing operation is applied to it in order to fuse narrow breaks and fill gaps on the contour; the result (the component P_v used to initialize the estimated illumination) is shown in fig. 3, and a code sketch of this step is given after this list;
A2. Decomposition network: the component P_v obtained in A1 and the original low-illumination image are used simultaneously as the input of the decomposition network, and the decomposition network model produces the corresponding decomposition results, namely the reflectance component and the illumination component. As shown in fig. 4, comparing the illumination component image obtained by the actual decomposition with the decomposition result obtained without using P_v to initialize the estimated illumination shows that initializing the estimated illumination with P_v in the decomposition network yields a clearer result and effectively avoids the loss of detail caused by the subsequent smoothing of the illumination component;
A3. CRM-adjusted exposure: before the fusion enhancement network model is called to obtain the enhancement result, the low-illumination image must be adjusted to a roughly well-exposed state by means of the CRM function; the definition of the function and the adjusted effect are shown in FIGS. 5 and 6, respectively;
A4. Adversarial enhancement processing: the original low-illumination image, the reflectance component obtained by decomposition, and the CRM adjustment result are input simultaneously into the fusion enhancement network and the discriminator network module to perform the recovery adjustment and obtain the final enhancement result.
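A minimal OpenCV sketch of step A1; the kernel shape and size of the closing operation are illustrative assumptions:

```python
import cv2

def initial_illumination_estimate(bgr_img, kernel_size=5):
    # Extract the V channel in HSV space.
    hsv = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2HSV)
    v = hsv[:, :, 2]
    # Morphological closing fuses narrow breaks and fills gaps on the contours.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_size, kernel_size))
    p_v = cv2.morphologyEx(v, cv2.MORPH_CLOSE, kernel)
    return p_v
```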
Analysis of the enhanced image results: the adversarial enhancement results are compared with the input low-illumination images, as shown in fig. 7. The comparison shows that the Retinex-decomposition-based adversarial enhancement network can effectively adjust the brightness and contrast of the image, has good color recovery capability, and exhibits no obvious color distortion. To further analyze the exposure and brightness distribution, the corresponding brightness histograms are drawn, as shown in fig. 8. The histograms show that the brightness distribution after enhancement is more balanced, the region of roughly stable gray values lies between 100 and 200, and there are almost no pixels around 0 and 255, which indicates that the image processed by the enhancement network suffers neither from noise nor from overexposure, and the overall effect is good.
Wherein the image quality evaluation parameters include:
image Mean (Mean): the method is the most basic index parameter in image quality evaluation, and the average brightness of an image is expressed by calculating the average value of gray values of all pixel points, and the specific formula is as follows:
Figure BDA0003979831720000091
standard Deviation (Standard development): the gray level distribution is used for representing the dispersion degree of each level of gray level values relative to the image mean value, and the larger the value of the dispersion degree is, the more dispersed the gray level distribution of the reflected image is, namely the image contrast is larger; the smaller the standard deviation is, the smaller the contrast of the image is, and the calculation formula is as follows:
Figure BDA0003979831720000092
mean Square Error (MSE): the image enhancement method is an objective evaluation index for comparing the difference between a processed image and an original reference image, the smaller the value of the mean square error is, the higher the detail similarity between the enhanced processed image and the original image is, namely the better the relative quality of the image is, the mathematical formula is as follows:
Figure BDA0003979831720000101
peak signal-to-noise ratio (PSNR): the most important and widely used image objective evaluation index parameter is based on the error between corresponding pixel points, namely, the parameter is used for evaluating the image quality based on the mean square error, the peak signal-to-noise ratio is inversely proportional to the distortion degree of the image, the larger the value of the peak signal-to-noise ratio is, the lower the corresponding distortion degree is, and the better the image quality is. The corresponding mathematical formula is expressed as follows:
Figure BDA0003979831720000102
entropy of information (Entropy) of image: the method is a parameter for evaluating an image by using a quantization standard, and is generally used for measuring the information amount in the image, reflecting the corresponding richness, wherein the larger the value of the parameter is, the more the information contained in the image is represented, and p (i) is assumed to be the probability of the ith gray level of the image, when all the gray levels are in equal probability distribution, the information entropy of the image is the maximum, and the histogram distribution at the moment is basically equalized, and the formula of the information entropy is expressed as follows:
Figure BDA0003979831720000103
average Gradient (Average Gradient): the image processing method is a parameter for objectively reflecting the image definition, in the aspect of image processing technology, the image definition is an important index, the clearer the image is, the larger the corresponding average gradient value is, the more comprehensive the image information received by a human visual system is, and the mathematical formula is as follows:
Figure BDA0003979831720000104
wherein the content of the first and second substances,
Figure BDA0003979831720000105
representing the image in horizontal directionThe size of the gradient;
Figure BDA0003979831720000106
indicating the magnitude of the gradient of the image in the vertical direction.
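A minimal NumPy sketch of these evaluation parameters, transcribing the formulas above (the gradient is approximated here by simple forward differences):

```python
import numpy as np

def image_metrics(img, ref=None, max_val=255.0):
    img = img.astype(np.float64)
    metrics = {"mean": img.mean(), "std": img.std()}
    # Information entropy from the gray-level histogram.
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    metrics["entropy"] = -np.sum(p * np.log2(p))
    # Average gradient from horizontal and vertical differences.
    gx = np.diff(img, axis=1)[:-1, :]
    gy = np.diff(img, axis=0)[:, :-1]
    metrics["avg_gradient"] = np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0))
    if ref is not None:
        mse = np.mean((img - ref.astype(np.float64)) ** 2)
        metrics["mse"] = mse
        metrics["psnr"] = 10 * np.log10(max_val ** 2 / mse)
    return metrics
```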
Specifically, in step S3, the box number identification method includes image preprocessing, box number region positioning, and segmentation acquisition of box number characters, and specifically includes:
B1. Image preprocessing, including image graying, gray-level stretching, binarization, and morphological processing of the enhanced container image, specifically as follows (a sketch of this preprocessing chain is given after this list):
Image graying: the color (RGB) original image is converted into a grayscale image, which effectively reduces the required storage space and improves the overall processing speed while retaining fairly complete image feature information;
Gray-level stretching: the gray distribution range of the whole image is stretched or compressed through a mapping calculation, which further improves the contrast of the grayscale image and makes it clearer;
Morphological processing: the morphological operations applied include the opening operation and the top-hat operation. The opening operation (erosion followed by dilation) is usually used to remove fine noise, making the edge contours of the target region smoother while effectively breaking narrow spurious connections and leaving the position and size of objects essentially unchanged. The top-hat operation takes the difference between the original input gray image and the result of the opening operation; performing the top-hat operation after the opening operation yields the regions that are brighter than their surroundings in the original image, thereby enhancing bright target objects against a dark background;
Image binarization: with a suitable threshold, an image is obtained that clearly presents both global and local features; the detail information then depends only on the positions of the black and white pixels with values 0 and 255, which is more convenient for the subsequent analysis and processing of the target region in the image.
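A minimal OpenCV sketch of the preprocessing chain in B1; the stretch range, structuring-element size, and the use of Otsu thresholding are illustrative assumptions:

```python
import cv2

def preprocess(enhanced_bgr):
    # Grayscale conversion.
    gray = cv2.cvtColor(enhanced_bgr, cv2.COLOR_BGR2GRAY)
    # Gray-level stretching to the full 0-255 range to raise contrast.
    stretched = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)
    # Opening (erosion then dilation) removes fine noise.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 15))
    opened = cv2.morphologyEx(stretched, cv2.MORPH_OPEN, kernel)
    # Top-hat = original gray image minus the opening result: keeps regions
    # brighter than their surroundings (bright characters on a dark background).
    tophat = cv2.subtract(stretched, opened)
    # Binarization with an automatically chosen (Otsu) threshold.
    _, binary = cv2.threshold(tophat, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary
```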
B2. Acquiring the container number characters:
To improve the efficiency of container management in ports and ensure real-time tracking of goods during tallying and transport, related information such as the container type identifier, the manufacturer's logo, the container purpose, and the container number is usually printed on the container. The rules and characteristics of the container number code can be obtained by analyzing a large number of collected container images; several container images are selected for display, as shown in fig. 9. The container number coding rule follows the ISO 6346 standard and consists of 11 printed characters of fixed size: the first four characters are capital English letters, followed by six Arabic digits, and the last character is a single Arabic digit set apart on its own, usually distinguished from the first ten characters of the registration code by an additional frame. At the same time, the spatial variation frequency of the container number region is high, and the spacing between the letters and the digits is essentially uniform.
B3. Locating the container number: the preprocessed binary image is projected in the horizontal direction and the characteristics of its gray distribution are analyzed; by counting the number of white pixels along the y direction, the container number is effectively distinguished from the other container parameter information printed below it, with the effect shown in fig. 10. For horizontally arranged container numbers the characters are concentrated in one particular row band, where the number of character pixels is clearly larger than in other character regions, and prior knowledge tells us that the characters are located at the top of the whole image; therefore, by inspecting the histogram obtained from the horizontal projection, the band where the white pixels are most concentrated and the peak is highest is selected as the candidate region. The boundaries of the container number character region are determined from the peak and valley characteristics, namely the row index of the first non-zero row sum and the row index of the next adjacent zero row sum, which completes the localization of the container number region;
B4. Segmenting the container number characters: the segmented character region image is projected vertically and the number of white pixels along the x direction is counted to obtain the corresponding distribution histogram, in which the region of each character shows a peak-valley-peak pattern, as can be seen in fig. 11. Because the image preprocessing of the previous stage effectively avoids adhesion between characters, the gaps between characters are well reflected in the histogram as the columns where the number of white pixels is zero. By recording the column indices where the pixel count is zero and checking the difference between the position of the next such column and the current one, the start and end column of each character are determined, which finally completes the segmentation and acquisition of each container number character (a sketch of this projection-based procedure is given below).
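A minimal sketch of the projection-based localization and segmentation described in B3 and B4, assuming white characters on a black background in the binary image; the peak-band expansion logic is a simplification:

```python
import numpy as np

def locate_number_row(binary):
    # Horizontal projection: count white pixels per row and grow a band
    # around the strongest peak as the candidate container-number region.
    row_counts = (binary > 0).sum(axis=1)
    peak = int(np.argmax(row_counts))
    top = peak
    while top > 0 and row_counts[top - 1] > 0:
        top -= 1
    bottom = peak
    while bottom < len(row_counts) - 1 and row_counts[bottom + 1] > 0:
        bottom += 1
    return binary[top:bottom + 1, :]

def split_characters(number_row):
    # Vertical projection: columns with zero white pixels are the gaps
    # between characters (the valleys of the peak-valley-peak pattern).
    col_counts = (number_row > 0).sum(axis=0)
    chars, start = [], None
    for x, count in enumerate(col_counts):
        if count > 0 and start is None:
            start = x
        elif count == 0 and start is not None:
            chars.append(number_row[:, start:x])
            start = None
    if start is not None:
        chars.append(number_row[:, start:])
    return chars
```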
B5. Recognizing the container number characters: a convolutional neural network model (CNN) is adopted, whose structure consists of 2 convolutional layers, 1 max pooling layer, 2 fully connected layers, and a Dropout layer, as shown in fig. 12. The input of the recognition network is the binary image of a single segmented character, with a size of 32×40. The first convolutional layer uses a 3×3 kernel with 32 filters and Relu activation; the second convolutional layer increases the number of filters to 64; the pooling layer uses max pooling with a 2×2 pool size and a stride of 1; the first fully connected layer uses Relu activation and 128 units; the second fully connected layer uses the Softmax function as its activation, and its number of units is determined by the number of output classes, namely the number of digit and English letter categories contained in the container numbers.
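A minimal Keras sketch of the recognition network described in B5; the Dropout rate, the placement of the Dropout layer, the orientation of the 32×40 input, and the number of output classes (here 10 digits plus 26 letters) are illustrative assumptions:

```python
from tensorflow.keras import layers, models

def build_char_cnn(num_classes=36):
    model = models.Sequential([
        layers.Input(shape=(40, 32, 1)),  # single 32x40 binary character image
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D(pool_size=(2, 2), strides=1),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

After training on the container number character data set, such a model could be saved with model.save("char_cnn.h5") for the system to load later.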
Training the network of the invention requires constructing a corresponding container number character data set in advance from a large number of container images collected in the field. Container number characters are located and segmented with the projection-based localization and segmentation method; severely damaged containers and containers photographed from the side are handled with manual localization and segmentation. Each obtained character picture is binarized and normalized to a size of 32×40, yielding 3997 character pictures that form the container number character data set, part of which is shown in fig. 13. As shown in figs. 14-15, the constructed character data set is fed into the constructed convolutional neural network for training, with the corresponding batch size and number of iterations configured; during training the model loss drops from 2.41 to 0.07 and the recognition accuracy rises from 36.36% to 98.09%. A mature convolutional neural network model for recognizing the container number characters is thus obtained and saved as the char_cnn.h5 model for direct use by the system later.
As shown in fig. 16, the port container number identification system based on image enhancement is designed and built on the TensorFlow framework using the Python 3.6 language in an Anaconda3 environment, and specifically includes:
The development environment layer: a GUI visualization interface can be built simply with PyQt5, which offers a rich set of control styles, complete documentation, and high stability; in addition, the PyQt5 ecosystem supports converting the drawn .ui file into a .py file for later integration and modification.
The base layer mainly comprises the two kinds of data content the system needs to use: one part is the night-time container images acquired and collected in advance; the other is the container number character data set that the recognition network needs for training.
The data processing layer covers data loading and model training and optimization. Data loading means loading the path information of the corresponding pictures from the specified data set that has to be called for each model training run; model training and optimization means training the adversarial enhancement network model and the convolutional neural network model for container number character recognition respectively, and saving the obtained models separately so that they can be called directly by the system, enabling it to output high-quality enhancement results and accurate recognition results.
The logic layer mainly loads the actually collected night-time low-illumination container images into the trained enhancement network and recognition network for testing: the enhancement network first completes the enhancement of the low-illumination image, a series of preprocessing operations is then applied to the enhanced image, and the obtained character images are finally fed into the recognition network to recognize the final result.
The display layer uses the GUI interface to show the original night-time container image, the enhanced image, the segmented container number character region, and the finally recognized container number result in the system window.
The port container number identification system based on image enhancement mainly comprises an enhancement module, an identification module and a system visualization module.
After each functional module and function needed by the whole network has been defined, a parser can be created during training and the corresponding parameters configured (for example by calling an add_argument()-style function) to carry out the training process; once the final network models are obtained, the decomposition network model and the fusion enhancement network model are saved under the corresponding paths, so that they can be called directly by path when an input night-time low-illumination container image is enhanced in the system later. A night-time low-illumination container image is input into the Retinex-decomposition-based adversarial enhancement network, and the corresponding illumination and reflectance components are obtained through the decomposition network; after the decomposition result is obtained, the illumination component is adjusted to a roughly natural, well-exposed result through the introduced camera response model (CRM); then the fusion enhancement network (FENet) takes the input low-illumination image, the decomposed reflectance component, and the CRM processing result based on the illumination component simultaneously as its input, so as to obtain the final enhanced result image.
As shown in fig. 17, two container images collected at the port in a night-time low-light environment are input into the adversarial enhancement network for testing, and the corresponding enhancement result images are obtained.
Designing the system GUI interface: when the GUI visualization system interface is designed with the PyQt5 tool configured in PyCharm, it can be implemented either directly in Python or with the help of Qt Designer; to allow the enhancement module and the recognition model to be called and their results presented directly in the interface, the .ui file needs to be converted into a .py file by means of the pyuic5 tool. The GUI visualization interface mainly contains the loaded original night-time low-illumination container image, the enhancement effect, the located container number region, and the finally obtained container number character recognition result. The function of each button and the effect of each part are as follows (a sketch of such a window follows the list):
Load container image: presents the original low-illumination container image and the enhanced image at the same time;
Locate container number: displays the segmented target container number region below the button;
Recognition result: displays the finally recognized container number character information.
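A minimal PyQt5 sketch of such a window; the widget names and the layout are illustrative assumptions, and the actual enhancement, localization, and recognition calls are only indicated by a comment:

```python
import sys
from PyQt5.QtWidgets import (QApplication, QWidget, QPushButton,
                             QLabel, QVBoxLayout)

class BoxNumberUI(QWidget):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("Port container number identification")
        self.original_view = QLabel("Original image")
        self.enhanced_view = QLabel("Enhanced image")
        self.region_view = QLabel("Container number region")
        self.result_label = QLabel("Recognition result")
        load_btn = QPushButton("Load container image")
        locate_btn = QPushButton("Locate container number")
        recog_btn = QPushButton("Recognition result")
        # The button handlers would call the enhancement network, the
        # projection-based localization, and the trained char_cnn.h5 model.
        layout = QVBoxLayout(self)
        for widget in (load_btn, self.original_view, self.enhanced_view,
                       locate_btn, self.region_view, recog_btn, self.result_label):
            layout.addWidget(widget)

if __name__ == "__main__":
    app = QApplication(sys.argv)
    ui = BoxNumberUI()
    ui.show()
    sys.exit(app.exec_())
```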
Fig. 18 shows the overall workflow of the system. First, the imaging of the container images collected at night under low illumination is enhanced, so that the images are freed from the low-illumination effect while feature details are effectively retained and no serious color distortion occurs, and the information of the target region is displayed more clearly. Second, a series of preprocessing operations is performed on the enhanced result image to remove the influence of noise and other interference factors on the subsequent recognition. Then the container number region and the container number characters to be recognized are effectively located and segmented, and normalization is completed so that the subsequent recognition can proceed smoothly. Finally, the segmented single-character images are input into the recognition network, and the recognized container number information is displayed in the system visualization interface.
In a specific embodiment, three images are extracted from the collected night-time low-illumination container images for the system test, and these three original images are input into the system to complete the enhancement processing and container number recognition. The visualization interface presents the working process of the system, in which the effect of each core functional module can be clearly seen: a clear enhancement result is obtained as soon as the container image is loaded, the container number region is then accurately located, and the required container number information is finally obtained.
As shown in figs. 19-20, across the recognition of multiple container numbers and repeated tests, the recognition accuracy stays above 95%, which provides a good guarantee for the stable operation of the whole system.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A port container number identification method based on image enhancement is characterized by comprising the following steps:
collecting container images under a low-illumination environment at a freight port to form an image library;
constructing a Retinex-decomposition-based adversarial enhancement network, and performing image enhancement on the image library to improve the enhancement effect;
preprocessing the enhanced container images, extracting the container number characters to construct a character data set, and iteratively training the constructed convolutional neural network model on this data set to improve character recognition accuracy;
loading the actually acquired low-illumination container image to be identified into the trained image enhancement network to complete the enhancement of the low-illumination image and obtain a high-quality image;
and preprocessing the enhanced high-quality image, extracting the container number character images to be recognized, and inputting them into the trained convolutional neural network model for container number recognition to obtain the final result.
2. The port container number identification method based on image enhancement as claimed in claim 1, wherein constructing the Retinex-decomposition-based adversarial enhancement network comprises the following steps:
S1, effectively decomposing the original low-illumination image in a decomposition network, with multiple decomposition losses ensuring that details can be recovered during decomposition;
S2, performing enhancement learning on the decomposed result with a fusion enhancement network;
and S3, playing an adversarial game between the fused enhancement result and the original reference image through a discriminator network to obtain an enhancement result closer to the reference image.
3. The port container number identification method based on image enhancement as claimed in claim 2, wherein the multi-term decomposition loss of the decomposition network in step S1 is L_RD = L_ini + α·L_wtv + β·L_com + γ·L_err, where L_ini is the initialization loss, L_wtv the weighted total variation loss, L_com the decomposition loss, and L_err the reflection error loss, and the constants α, β and γ are weighting parameters; the terms are as follows:
Initialization loss L_ini: the decomposition of the illumination P_I is guided by computing the mean square error (MSE) between the decomposed illumination P_I and the estimated illumination P_v, with the following formula:
L_ini = ||In_I - In_V||² + ||Lb_I - Lb_V||²
where In_I is the illumination component decomposed from the input image, In_V is the estimated illumination component of the input image, Lb_I is the illumination component decomposed from the reference image, and Lb_V is the estimated illumination component of the reference image;
Weighted total variation loss L_wtv: L_wtv uses the total variation of the image to constrain image noise, so as to smooth the illumination component and avoid halo artifacts; a weight matrix P_W is defined as:
P_W,d(x) = 1 / ( |Σ_{y∈w(x)} ∇_d In_V(y)| + ε ),  d ∈ {h, v}
where ∇_h and ∇_v denote the total variation (gradients) in the horizontal and vertical directions, w(x) is a 3×3 window centered on pixel x, and ε is a small stabilizing constant, so L_wtv can be expressed as:
L_wtv = Σ_x Σ_{d∈{h,v}} P_W,d(x) · |∇_d In_I(x)|
Decomposition loss L_com: the accuracy of the image decomposition is ensured by computing the mean square error (MSE) between P_I·P_R and P, where P is the original image and P_R is the decomposed reflectance; the specific calculation is:
L_com = ||In_I·In_R - Input||² + ||Lb_I·Lb_R - Label||²
where Input is the original low-illumination image and Label is the target (reference) image;
Reflection error loss L_err:
L_err = ||In_R - Lb_R||²
where In_R is the reflectance component of the input image and Lb_R is the reflectance component of the reference image; computing the mean square error (MSE) between In_R and Lb_R ensures the accuracy of the reflectance obtained by the decomposition.
4. The image-enhancement-based port container number identification method according to claim 2, wherein the fusion enhancement network in step S2 includes a CRM, which is used to map the illumination component decomposed from the low-illumination image to a roughly well-exposed result; the CRM processing can be represented by the following formulas:
k = (1 / (In_I + ε))^a
CRM(In_I, Input) = e^(b(1-k)) · Input^k
where ε is a small constant and a and b are camera parameters, set to -0.3293 and 1.1258 respectively to accommodate most exposures, and the final enhancement result of the fusion enhancement network can be expressed as:
Out = FENet(Input, CRM(In_I, Input), In_R),
The pre-training of this fusion enhancement network uses the content loss based on the VGG19 model:
L_con = (1/(C_j·H_j·W_j)) · ||φ_j(Out) - φ_j(Label)||²
where φ_j(·) denotes the feature map of the j-th layer of the pre-trained VGG19 network and C_j, H_j, W_j are its dimensions.
the VGG19 network structure comprises 19 hidden layers in total, wherein 16 convolution layers of 3 x 3 and 3 full-connection layers of 2 x 2.
5. The method as claimed in claim 2, wherein in step S3 the discriminator network includes 6 convolutional layers and 1 fully connected layer and is activated with the PReLU function, and a new adversarial loss RDGAN_d is used in the alternating training of the discriminator network, computed from the enhancement result and the reflectance component of the original reference image, with the formula:
L_RDGAN_d = -log(D_real) - log(1 - D_fake)
The final adversarial enhancement loss consists of the content loss from the pre-trained VGG19 model of the fusion enhancement network and the adversarial loss: L_FE = L_con + L_RDGAN_g, where D_fake is the probability that the input image belongs to the enhanced image, D_real is the probability that the input image belongs to the reference image, and Out_R denotes the reflectance component obtained by decomposing the enhancement result image with the decomposition network.
6. The port container number identification method based on image enhancement as claimed in claim 1, characterized in that the image preprocessing comprises image graying, grayscale stretching, binarization and morphological processing of the enhanced container image.
7. The port container number identification method based on image enhancement as claimed in claim 6, wherein the extracting of the container number characters comprises performing horizontal projection on the preprocessed binary image to locate a container number area; and vertically projecting the obtained box number area image, and completing the segmentation and acquisition of box number characters by analyzing a corresponding pixel point distribution histogram.
8. The port container number identification method based on image enhancement as claimed in claim 7, wherein the identification of the container number characters adopts a convolutional neural network model, and the specific structure of the convolutional neural network model comprises 2 convolutional layers, 1 max pooling layer, 2 full connection layers and a Dropout layer.
9. The port container number identification method based on image enhancement as claimed in any one of claims 1-8, wherein a PyQt5 tool is used to design a GUI visualization system interface, and the work process and results of the container number identification are displayed in a fused manner to show the complete work process.
10. The image enhancement-based port container number identification method as claimed in claim 9, wherein the GUI visualization system interface comprises the loaded original low-light container body image, the enhanced body image, the located container number area, and the final box number character identification result.
CN202211545908.1A 2022-12-05 2022-12-05 Port container number identification method based on image enhancement Pending CN115830585A (en)

Priority Applications (1)

Application number: CN202211545908.1A
Priority date / Filing date: 2022-12-05
Title: Port container number identification method based on image enhancement

Publications (1)

Publication number: CN115830585A
Publication date: 2023-03-21

Family ID: 85545029

Country Status (1)

Country: CN
CN (1) CN115830585A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116343125A (en) * 2023-03-30 2023-06-27 北京国泰星云科技有限公司 Container bottom lock head detection method based on computer vision
CN116343125B (en) * 2023-03-30 2024-04-02 北京国泰星云科技有限公司 Container bottom lock head detection method based on computer vision
CN116110053A (en) * 2023-04-13 2023-05-12 济宁能源发展集团有限公司 Container surface information detection method based on image recognition


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination