CN112861800B - Express identification method based on improved Faster R-CNN model - Google Patents


Info

Publication number: CN112861800B (application number CN202110279121.4A)
Authority: CN (China)
Prior art keywords: express, target, box, suggestion, initial
Legal status: Active (the listed status is an assumption, not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN112861800A
Inventors: 张昀 (Zhang Yun), 王瑶 (Wang Yao), 于舒娟 (Yu Shujuan), 黄橙 (Huang Cheng)
Current Assignee: Nanjing University of Posts and Telecommunications
Original Assignee: Nanjing University of Posts and Telecommunications
Application filed by Nanjing University of Posts and Telecommunications; priority to CN202110279121.4A
Publication of application CN112861800A, granted as CN112861800B

Classifications

    • G06V 30/412 (Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables)
    • G06F 18/214 (Generating training patterns; Bootstrap methods, e.g. bagging or boosting)
    • G06F 18/23213 (Non-hierarchical clustering using statistics or function optimisation, with a fixed number of clusters, e.g. k-means clustering)
    • G06N 3/045 (Neural network architectures; combinations of networks)
    • G06N 3/08 (Neural network learning methods)
    • G06V 10/25 (Determination of region of interest [ROI] or a volume of interest [VOI])
    • G06V 10/44 (Local feature extraction by analysis of parts of the pattern, e.g. edges, contours, corners, strokes; connectivity analysis)


Abstract

The invention discloses an express identification method based on an improved Faster R-CNN model, aiming at the prior-art problems that no express sorting technology based on the express carrier exists and that damaged or contaminated express bills are difficult to identify accurately. The method improves the traditional Faster R-CNN model with an RPN network refined by the k-means++ algorithm and with a double-threshold non-maximum suppression algorithm based on the candidate-box aspect ratio, then processes the express image to be identified with the trained improved Faster R-CNN model to obtain the express-mark identification result. The method can identify and sort express parcels by express carrier, with high identification speed, high accuracy, and little missed detection.

Description

Express identification method based on improved Faster R-CNN model
Technical Field
The invention relates to an express identification method based on an improved Faster R-CNN model, and belongs to the technical field of express identification.
Background
With the rapid development of electronic commerce, enormous numbers of parcels must be received, dispatched and transported, and express logistics has rapidly grown into a huge industry; improving the efficiency and market competitiveness of express handling has become an important concern for the whole industry. Within the logistics chain, order sorting is the most complicated link and a key factor in express efficiency: sorting time and sorting error rate directly affect user satisfaction. Facing ever-growing parcel volumes, the express market demands that sorting be upgraded from manual collection to intelligent processing, and an automatic sorting system is now standard equipment at every express station.
Current express sorting systems mainly identify and sort parcels automatically by the two-dimensional code or bar code on the express bill. Sorting systems based on Chinese character recognition exist, but they mainly sort by recipient name and address; systems that sort by express-carrier information are rare. At present, sorting by express carrier is done manually, and for express stations and warehouses that cooperate with several carriers simultaneously, existing sorting systems can hardly meet the sorting requirements. In addition, express bills are easily damaged or contaminated during transportation, and existing sorting systems recognize the information on damaged or contaminated bills poorly, making it difficult to guarantee sorting accuracy.
Disclosure of Invention
Aiming at the prior-art problems that no express sorting technology based on the express carrier exists and that damaged or contaminated express bills are difficult to identify accurately, the invention provides an express identification method based on an improved Faster R-CNN model.
In order to solve the technical problems, the invention adopts the following technical means:
the invention provides an express identification method based on an improved Faster R-CNN model, which comprises the following steps:
acquiring an express image to be identified;
processing an express image to be identified by using a trained improved Faster R-CNN model to obtain an express mark identification result;
the improved Faster R-CNN model comprises an RPN network improved based on the k-means++ algorithm and a double-threshold non-maximum suppression algorithm based on the candidate-box aspect ratio.
Further, the express mark comprises an icon and name characters of the express carrier, the identification result of the express mark comprises a target frame and a target name, and the target name is the name of the express carrier.
Further, the training process of the improved Faster R-CNN model is as follows:
obtaining a model training data set, wherein the model training data set comprises a plurality of express delivery images marked with target frames and target names;
carrying out feature extraction on the express images in the model training data set by using a feature extraction network to obtain a target feature map;
based on the target feature map, clustering and modifying the anchor points of the RPN network by using the k-means++ algorithm to obtain modified anchor points;
processing the target characteristic graph through the sliding window and the modified anchor point to obtain an initial suggestion frame and a score thereof;
updating the score of each initial suggestion box by using the double-threshold non-maximum suppression algorithm based on the candidate-box aspect ratio, and obtaining the Top-m suggestion candidate boxes according to the updated scores;
processing the suggestion candidate boxes by using the detection sub-network and the double-threshold non-maximum suppression algorithm to obtain suggestion boxes and corresponding suggestion names;
updating network parameters of the improved Faster R-CNN model with the validation dataset based on the suggestion box and the suggestion name;
and processing the express images in the model training data set by using the updated improved Faster R-CNN model until the trained improved Faster R-CNN model is obtained.
Further, the method for acquiring the model training data set and the verification data set comprises the following steps:
image acquisition is carried out on the express boxes by using camera equipment, and express images in a JPG format are obtained;
compressing the collected express image, labeling the express marks in the compressed express image by using a circumscribed rectangle method, obtaining a target frame and a target name of each express mark, and generating a labeled data set;
carrying out noise adding and data enhancing processing on the marked data set to obtain a sample data set, wherein the noise adding comprises image cutting, collaging and smearing processing, and the data enhancing processing comprises image turning, brightness enhancing and weakening and saturation enhancing and weakening processing;
and dividing the sample data set into a model training data set and a verification data set according to a preset proportion.
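The proportional split described above can be sketched as follows; this is a minimal illustration (the function name, fixed seed, and list-of-samples representation are assumptions, not part of the patent):

```python
import random

def split_dataset(samples, ratios=(3, 3, 4), seed=42):
    """Shuffle the sample data set, then split it into model-training,
    verification and test subsets in the given preset proportion."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    total = sum(ratios)
    n = len(shuffled)
    n_first = n * ratios[0] // total     # training share
    n_second = n * ratios[1] // total    # verification share
    return (shuffled[:n_first],
            shuffled[n_first:n_first + n_second],
            shuffled[n_first + n_second:])
```

With `ratios=(3, 3, 4)` and 100 samples this yields a 30/30/40 split, matching the 3:3:4 proportion used later in the embodiment; a two-way split is obtained by passing a two-element ratio.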
Further, the specific operations of clustering and modifying the anchor points of the RPN network with the k-means++ algorithm are as follows:
(1) generating a target set $C = \{c_1(w_1, h_1), c_2(w_2, h_2), \ldots, c_i(w_i, h_i), \ldots, c_n(w_n, h_n)\}$ from the lengths and widths of the target boxes labeled in the target feature map, where $c_i(w_i, h_i)$ denotes the $i$-th target box in the target set, $w_i$ the length of the $i$-th target box, $h_i$ the width of the $i$-th target box, and $i = 1, 2, \ldots, n$;
(2) randomly selecting a target box from the target set $C$ as the initial cluster center $a_1$, then calculating the distance from each target box in $C$ to $a_1$ and the probability of each target box being selected as the next cluster center:
$$d(c_i(w_i, h_i), a_1) = 1 - IoU(c_i(w_i, h_i), a_1) \quad (1)$$
$$P_i = \frac{d(c_i(w_i, h_i), a_1)^2}{\sum_{c' \in C} d(c', a_1)^2} \quad (2)$$
where $d(c_i(w_i, h_i), a_1)$ denotes the distance from target box $c_i(w_i, h_i)$ to the initial cluster center $a_1$, $IoU(c_i(w_i, h_i), a_1)$ denotes the overlap between $c_i(w_i, h_i)$ and $a_1$, $P_i$ denotes the probability that $c_i(w_i, h_i)$ is selected as the next cluster center, $c'$ ranges over the target boxes in $C$, and $d(c', a_1)^2$ denotes the squared distance from $c'$ to $a_1$;
(3) letting
$$Q_i = \sum_{t=1}^{i} P_t$$
when the preset value $r$ falls within the interval $[Q_{i-1}, Q_i]$, target box $c_i(w_i, h_i)$ becomes the next cluster center, where $r \in [0, 1]$;
(4) repeating steps (2) and (3) until $k$ cluster centers have been selected, generating the cluster-center set $A = \{a_1, a_2, \ldots, a_p, \ldots, a_k\}$, where $a_p$ denotes the $p$-th cluster center and $p = 1, 2, \ldots, k$;
(5) calculating in turn the distance from each target box in $C$ to the $k$ cluster centers and assigning each target box to the class of its nearest cluster center, yielding $k$ classes;
(6) recalculating the cluster center of each of the $k$ classes from step (5):
$$a_p = \left( \frac{1}{n_p} \sum_{j=1}^{n_p} w_j, \; \frac{1}{n_p} \sum_{j=1}^{n_p} h_j \right) \quad (3)$$
where $c_j(w_j, h_j)$ denotes the $j$-th target box in the $p$-th class, $j = 1, 2, \ldots, n_p$, and $n_p$ is the number of target boxes in the $p$-th class;
(7) repeating steps (5) and (6) until the values of the $k$ cluster centers no longer change;
(8) using the aspect ratios of the $k$ cluster centers from step (7) as the anchor ratios of the RPN network.
Further, let the initial suggestion boxes be $B = (b_1, b_2, \ldots, b_w)$, with corresponding scores $S = (s_1, s_2, \ldots, s_w)$; the $q$-th initial suggestion box $b_q$ has coordinates $(x_q, y_q, w_q, h_q)$, where $q = 1, 2, \ldots, w$, $(x_q, y_q)$ denotes the position of $b_q$, and $(w_q, h_q)$ denotes the length and width of $b_q$;
the specific operation of updating the score of each initial suggestion box with the double-threshold non-maximum suppression algorithm based on the candidate-box aspect ratio is as follows:
screening all initial suggestion boxes against a preset candidate-box aspect-ratio range to obtain a suggestion-box screening set;
selecting from the screening set the initial suggestion box $b_{max}$ with the highest score;
updating the score of every initial suggestion box in the screening set other than $b_{max}$ according to its overlap with $b_{max}$:
$$s_q = \begin{cases} s_q, & IoU(b_{max}, b_q) < N_f \\ s_q \, e^{-IoU(b_{max}, b_q)^2 / \sigma}, & N_f \le IoU(b_{max}, b_q) \le N_t \\ 0, & IoU(b_{max}, b_q) > N_t \end{cases} \quad (4)$$
where $s_q$ denotes the score of the $q$-th candidate box, $IoU(b_{max}, b_q)$ denotes the overlap between the $q$-th candidate box $b_q$ and $b_{max}$, $\sigma$ is a manually set parameter, $N_f$ denotes the preset minimum overlap threshold, $N_t$ denotes the preset maximum overlap threshold, and $b_{max} \ne b_q$.
Further, $IoU(b_{max}, b_q)$ is calculated as follows:
$$IoU(b_{max}, b_q) = \frac{area(b_{max}, b_q)}{area(b_{max}) + area(b_q) - area(b_{max}, b_q)} \quad (5)$$
where $area(b_{max}, b_q)$ denotes the intersection area of $b_q$ and $b_{max}$, $area(b_{max})$ the area of $b_{max}$, and $area(b_q)$ the area of $b_q$.
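The overlap computation above can be sketched in Python for axis-aligned boxes; the function name and the (x, y, w, h) top-left-corner convention are illustrative assumptions:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, w, h), with
    (x, y) the top-left corner: intersection area divided by
    area(a) + area(b) - intersection area."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # overlap rectangle, clamped to zero when the boxes are disjoint
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```

For example, `iou((0, 0, 2, 2), (1, 1, 2, 2))` gives 1/7, since the two boxes share a 1 × 1 corner region.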
The following advantages can be obtained by adopting the technical means:
the invention provides an express identification method based on an improved Faster R-CNN model, which is used for identifying express images to obtain express companies to which express belongs and realizing the effect of identifying and sorting express according to the express companies. According to the invention, the improved Faster R-CNN model is trained by using the data subjected to noise processing, the training data is more fit with the actual express image, the trained model can more accurately identify the express mark on the damaged or polluted express bill, and the accuracy of express identification is ensured. Because the express mark has smaller size relative to the whole express image, the invention improves the traditional RPN by using a clustering algorithm, clusters the anchor points to obtain the proportion and the size of the anchor points meeting the express identification requirement, and improves the accuracy of the initial suggestion frame output by the RPN; in addition, the invention improves the traditional NMS algorithm by using the length-width ratio and the double thresholds of the express delivery marks, punishs the scores of the candidate frames according to the coincidence degrees between different candidate frames, can better screen redundant frames, and avoids the problem of express delivery missing detection caused by undersize express delivery marks, overlapping of a plurality of express delivery marks or too close distance. Therefore, the method can accurately identify the express of different express companies, and ensures higher identification speed and higher identification precision.
Drawings
FIG. 1 is a flow chart illustrating steps of an express delivery identification method based on an improved Faster R-CNN model according to the present invention;
fig. 2 is a schematic diagram of an express delivery identifier identification result in the embodiment of the present invention;
FIG. 3 is a schematic diagram of an improved Faster R-CNN model in accordance with an embodiment of the present invention;
FIG. 4 is a flowchart illustrating the process of clustering the anchor points of the RPN network with the k-means++ algorithm in the embodiment of the present invention;
fig. 5 is a flowchart illustrating the steps of updating the score based on the candidate box aspect ratio dual threshold-non-maximum suppression algorithm in accordance with an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further explained below with reference to the accompanying drawings:
the invention provides an express identification method based on an improved Faster R-CNN model, which specifically comprises the following steps as shown in figure 1:
step A, shooting the express to be identified from multiple angles by utilizing camera equipment such as a mobile phone and the like, and acquiring an image of the express to be identified.
And step B, processing the express image to be identified by using the trained improved Faster R-CNN model to obtain an express mark identification result, wherein the express mark comprises an icon and name characters of an express transport company, the express mark identification result comprises a target frame and a corresponding target name, the icon or the name characters of the express transport company on the express image are selected by the target frame, the target name refers to the name of the express transport company annotated on the target frame, and the target name generally uses lower case letters, as shown in FIG. 2.
In order to achieve accurate and reliable express identification effect, the invention provides an improved Faster R-CNN model, as shown in FIG. 3, the difference between the model and the prior art fast R-CNN model is that the model adopts an RPN network improved based on a k-means + + algorithm and a dual-threshold-non-maximum suppression algorithm based on the length-width ratio of a candidate frame.
In the embodiment of the invention, the training process of the improved Faster R-CNN model is as follows:
step B01, obtaining a model training data set, a verification data set and a test data set by using various image acquisition and processing technologies, wherein the specific operations are as follows:
(1) utilize camera equipment to carry out image acquisition to a plurality of express delivery boxes from a plurality of angles, should contain clear complete express delivery sign in amazing express delivery image, the format of express delivery image is JPG.
(2) The collected express images are compressed, the express images collected by the camera equipment are generally large in pixel, in order to reduce memory and video memory occupied by model training, PS software is adopted to compress the express images in batches, and the pixel size of the compressed images is about 649 multiplied by 480. After compression is completed, the express delivery marks in the compressed express delivery image are marked by using an external rectangle method, a target frame and a target name of each express delivery mark are marked, and a marking data set is generated. In the embodiment of the invention, a LabelImg labeling tool is adopted to label the pictures, and each target box has only one express delivery mark.
(3) Apply noise addition and data enhancement to the labeled data set to obtain the sample data set. In actual transportation, express marks may be contaminated or damaged by weather and other factors; to guarantee the identification effect, part of the express images in the labeled data set are processed with noise, including image cutting, collaging and smearing, so that the noised images simulate the damage and contamination of real express images. In addition, to increase sample diversity and avoid the overfitting risk of too small a data set, data enhancement is applied to the noised images, mainly image flipping, brightness enhancement and weakening, and saturation enhancement and weakening. Noise addition and brightness or contrast adjustment change neither the annotation information nor its position, so only the picture name and file name need to be matched to the labeled image; for flipped and rotated images, the coordinates of the rotated target box are computed from the image symmetry and the rotation angle, and the annotation information is adjusted accordingly.
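The remark above that flipped images need their target-box coordinates recomputed can be sketched for a horizontal flip; the function name and the (x, y, w, h) top-left-corner convention are illustrative assumptions:

```python
def flip_box_horizontal(box, img_w):
    """Mirror a target box (x, y, w, h), with (x, y) the top-left corner,
    across the vertical centre line of an image of width img_w.
    The box shape (w, h) is unchanged; only x moves."""
    x, y, w, h = box
    return (img_w - x - w, y, w, h)
```

Flipping twice returns the original box, which is a convenient sanity check for the adjusted annotations.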
(4) Divide the sample data set into a model training data set, a verification data set and a test data set in a preset ratio of 3:3:4. The model training and verification data sets each comprise a number of express images labeled with target boxes and target names; the test data set comprises a number of unlabeled express images.
Step B02: extract features from the express images in the model training data set with the feature extraction network of the improved Faster R-CNN model to obtain the target feature map, and share the feature-extraction parameters with the RPN network and the detection sub-network. The target feature map output by the feature extraction network serves as the deep convolutional feature of the express image and is fed into the RPN network and the detection sub-network respectively.
Step B03: based on the target feature map, cluster and modify the anchor points of the RPN network with the k-means++ algorithm to obtain the modified anchor points. Because the icon and name of an express carrier occupy only a small part of the whole express bill, express-mark identification places high demands on the anchor points, and improper anchor settings lower the final recognition rate. The invention therefore clusters the anchor points of the RPN network with the k-means++ algorithm and adjusts the anchor scales according to the actual size of express marks, generating anchors suited to express-mark identification. As shown in FIG. 4, the specific operations are as follows:
(1) Generate a target set $C = \{c_1(w_1, h_1), c_2(w_2, h_2), \ldots, c_i(w_i, h_i), \ldots, c_n(w_n, h_n)\}$ from the lengths and widths of the target boxes labeled in the target feature map, where $c_i(w_i, h_i)$ denotes the $i$-th target box in the target set, $w_i$ the length of the $i$-th target box, $h_i$ the width of the $i$-th target box, and $i = 1, 2, \ldots, n$.
(2) Take the target set $C$ as the input of the k-means++ algorithm, randomly select a target box from $C$ as the initial cluster center $a_1$, and calculate the distance from each target box in $C$ to $a_1$ and the probability of each target box being selected as the next cluster center:
$$d(c_i(w_i, h_i), a_1) = 1 - IoU(c_i(w_i, h_i), a_1) \quad (6)$$
$$P_i = \frac{d(c_i(w_i, h_i), a_1)^2}{\sum_{c' \in C} d(c', a_1)^2} \quad (7)$$
where $d(c_i(w_i, h_i), a_1)$ denotes the distance from target box $c_i(w_i, h_i)$ to the initial cluster center $a_1$, $IoU(c_i(w_i, h_i), a_1)$ denotes the overlap between $c_i(w_i, h_i)$ and $a_1$, $P_i$ denotes the probability that $c_i(w_i, h_i)$ is selected as the next cluster center, $c'$ ranges over the target boxes in $C$, and $d(c', a_1)^2$ denotes the squared distance from $c'$ to $a_1$.
(3) Let
$$Q_i = \sum_{t=1}^{i} P_t$$
When the preset value $r$ falls within the interval $[Q_{i-1}, Q_i]$, target box $c_i(w_i, h_i)$ becomes the next cluster center, where $r$ is a randomly selected number, $r \in [0, 1]$.
(4) Repeat steps (2) and (3) until $k$ cluster centers have been selected, generating the cluster-center set $A = \{a_1, a_2, \ldots, a_p, \ldots, a_k\}$, where $a_p$ denotes the size of the $p$-th cluster center and $p = 1, 2, \ldots, k$.
(5) Calculate in turn the distance from each target box in $C$ to the $k$ cluster centers and assign each target box to the class of its nearest cluster center, yielding $k$ classes.
(6) Recalculate the cluster center of each of the $k$ classes from step (5):
$$a_p = \left( \frac{1}{n_p} \sum_{j=1}^{n_p} w_j, \; \frac{1}{n_p} \sum_{j=1}^{n_p} h_j \right) \quad (8)$$
where $c_j(w_j, h_j)$ denotes the $j$-th target box in the $p$-th class, $j = 1, 2, \ldots, n_p$, and $n_p$ is the number of target boxes in the $p$-th class.
(7) Repeat steps (5) and (6) until the values of the $k$ cluster centers no longer change.
(8) Use the aspect ratios (i.e., width divided by height) of the $k$ cluster centers from step (7) as the anchor ratios of the RPN network.
(9) After the anchor ratios are modified, modify the anchor sizes by combining the pixel size of the express image with the aspect ratio of each express mark in the actual scene, thereby eliminating interference from extraneous features in complex backgrounds and reducing false recognition.
In the embodiment of the invention, the number of cluster centers k is set to 3. Processing a large number of express images yields the 3 cluster centers (0.056, 0.05247376), (0.094, 0.07646177) and (0.152, 0.12762763), with corresponding aspect ratios 1.07, 1.19 and 1.23; these ratios are used as the anchor ratios of the RPN network during model training.
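Steps (1)–(8) can be sketched as follows. This is a simplified illustration (function names, the fixed seed, and the (width, height) pair representation are assumptions); it uses 1 − IoU of shape-aligned boxes as the clustering distance, as the patent specifies:

```python
import random

def iou_wh(a, b):
    """IoU of two boxes compared by shape only (aligned at a common
    corner), each given as a (width, height) pair."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)

def kmeanspp_anchor_ratios(boxes, k, seed=0, iters=100):
    """Cluster labeled-box shapes with k-means++ under the 1 - IoU
    distance; the w/h ratios of the returned centers become anchor ratios."""
    rng = random.Random(seed)
    dist = lambda b, c: 1.0 - iou_wh(b, c)
    centers = [rng.choice(boxes)]                  # step (2): first center
    while len(centers) < k:                        # steps (2)-(4)
        d2 = [min(dist(b, c) for c in centers) ** 2 for b in boxes]
        r = rng.random() * sum(d2)                 # roulette selection by d^2
        acc = 0.0
        for b, weight in zip(boxes, d2):
            acc += weight
            if r <= acc:
                centers.append(b)
                break
    for _ in range(iters):                         # steps (5)-(7): Lloyd loop
        clusters = [[] for _ in range(k)]
        for b in boxes:
            nearest = min(range(k), key=lambda p: dist(b, centers[p]))
            clusters[nearest].append(b)
        new_centers = [
            (sum(w for w, _ in cl) / len(cl), sum(h for _, h in cl) / len(cl))
            if cl else centers[p]
            for p, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers
```

With box shapes forming two clearly separated clusters and k = 2, the returned centers approach the per-cluster mean widths and heights regardless of which boxes the seeding picks.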
Step B04: process the target feature map through the sliding window and the modified anchor points in the RPN network to obtain the initial suggestion boxes and their scores.
Step B05: update the score of each initial suggestion box with the double-threshold non-maximum suppression algorithm based on the candidate-box aspect ratio, and obtain the Top-m suggestion candidate boxes according to the updated scores.
In the standard Faster R-CNN model, non-maximum suppression (NMS) is a greedy algorithm: a candidate box is either kept or deleted outright. In express identification, however, express marks are generally small, and different express marks may overlap or lie very close together, so deleting candidate boxes outright easily causes express marks to be missed.
Let the initial suggestion boxes be $B = (b_1, b_2, \ldots, b_w)$, with corresponding scores $S = (s_1, s_2, \ldots, s_w)$; the $q$-th initial suggestion box $b_q$ has coordinates $(x_q, y_q, w_q, h_q)$, where $q = 1, 2, \ldots, w$, $(x_q, y_q)$ denotes the position of $b_q$, and $(w_q, h_q)$ denotes the length and width of $b_q$. As shown in FIG. 5, the specific operation of updating the initial suggestion-box scores is as follows:
(1) Screen all initial suggestion boxes against the preset candidate-box aspect-ratio range $[\varepsilon_{min}, \varepsilon_{max}]$: delete any initial suggestion box whose aspect ratio lies outside this range, and form the suggestion-box screening set from all initial suggestion boxes that satisfy it, where $\varepsilon_{min}$ is the minimum aspect ratio and $\varepsilon_{max}$ the maximum aspect ratio, both generally obtained through big-data statistics.
(2) Sort the initial suggestion boxes in the screening set by score from low to high, and select the initial suggestion box $b_{max}$ with the highest score.
(3) Update the score of every initial suggestion box in the screening set other than $b_{max}$ according to its overlap with $b_{max}$:
$$s_q = \begin{cases} s_q, & IoU(b_{max}, b_q) < N_f \\ s_q \, e^{-IoU(b_{max}, b_q)^2 / \sigma}, & N_f \le IoU(b_{max}, b_q) \le N_t \\ 0, & IoU(b_{max}, b_q) > N_t \end{cases} \quad (9)$$
where $s_q$ denotes the score of the $q$-th candidate box, $IoU(b_{max}, b_q)$ denotes the overlap between the $q$-th candidate box $b_q$ and $b_{max}$, $\sigma$ is a manually set parameter, $N_f$ denotes the preset minimum overlap threshold, $N_t$ denotes the preset maximum overlap threshold, and $b_{max} \ne b_q$.
$IoU(b_{max}, b_q)$ is calculated as follows:
$$IoU(b_{max}, b_q) = \frac{area(b_{max}, b_q)}{area(b_{max}) + area(b_q) - area(b_{max}, b_q)} \quad (10)$$
where $area(b_{max}, b_q)$ denotes the intersection area of $b_q$ and $b_{max}$, $area(b_{max})$ the area of $b_{max}$, and $area(b_q)$ the area of $b_q$.
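The screening and score-update procedure can be sketched as follows. The Gaussian decay form, the default thresholds, and the score floor are illustrative assumptions layered on the double-threshold idea, not values taken from the patent:

```python
import math

def iou(box_a, box_b):
    """IoU of two boxes given as (x, y, w, h), (x, y) the top-left corner."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def dual_threshold_nms(boxes, scores, eps_min=0.5, eps_max=2.0,
                       n_f=0.3, n_t=0.6, sigma=0.5, score_floor=1e-3):
    """Aspect-ratio screening followed by double-threshold suppression:
    overlap below n_f leaves a score unchanged, overlap between n_f and
    n_t decays it smoothly, overlap above n_t zeroes it. Returns the
    indices of the kept boxes in selection order."""
    # step (1): keep only boxes whose aspect ratio w/h lies in the range
    cand = {i: scores[i] for i, (_, _, w, h) in enumerate(boxes)
            if eps_min <= w / h <= eps_max}
    kept = []
    while cand:
        # step (2): take the highest-scoring remaining box
        best = max(cand, key=cand.get)
        kept.append(best)
        del cand[best]
        # step (3): penalise the others according to their overlap with it
        for i in list(cand):
            o = iou(boxes[best], boxes[i])
            if o > n_t:
                cand[i] = 0.0                        # near-duplicate: suppress
            elif o >= n_f:
                cand[i] *= math.exp(-o * o / sigma)  # soft Gaussian decay
            if cand[i] < score_floor:
                del cand[i]
    return kept
```

Unlike plain greedy NMS, a moderately overlapping box is only down-weighted, so a second small express mark close to the first can still survive with a reduced score.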
Step B06: process the suggestion candidate boxes with the detection sub-network and the double-threshold non-maximum suppression algorithm to obtain the suggestion boxes and corresponding suggestion names. The double-threshold non-maximum suppression algorithm suppresses the candidate boxes and removes redundant ones; the RoI pooling layer in the detection sub-network pools the differently scaled suggestion candidate boxes obtained from the RPN into a uniform, fixed size. After full-connection-layer processing, the results are fed into the classification layer and the regression layer of the detection sub-network: the classification layer judges the category of the object in each suggestion candidate box to obtain the suggestion name, and the regression layer fine-tunes the position of each suggestion candidate box to obtain the suggestion box.
Step B07, updating the network parameters of the improved Faster R-CNN model with the validation dataset based on the suggestion box and the suggestion name.
And step B08, processing the express images in the model training data set by using the updated improved Faster R-CNN model until the trained improved Faster R-CNN model is obtained. After training is completed, the trained improved Faster R-CNN model can be evaluated on the test data set to measure the model's performance.
Compared with existing express identification methods, the present method uses the improved Faster R-CNN model to identify express images, so that parcels can be identified and sorted by express carrier. It offers fast identification and high accuracy, is less prone to missed detections, and can meet the express identification needs of delivery stations and warehouses.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.

Claims (5)

1. An express identification method based on an improved Faster R-CNN model is characterized by comprising the following steps:
acquiring an express image to be identified;
processing an express image to be identified by using a trained improved Faster R-CNN model to obtain an express mark identification result;
the improved Faster R-CNN model comprises an RPN network improved based on the k-means++ algorithm and a dual-threshold non-maximum suppression algorithm based on the candidate box aspect ratio;
wherein the training process of the improved Faster R-CNN model comprises the following steps:
obtaining a model training data set, wherein the model training data set comprises a plurality of express delivery images marked with target frames and target names;
carrying out feature extraction on the express images in the model training data set by using a feature extraction network to obtain a target feature map;
based on the target feature map, clustering and modifying the anchor points of the RPN network by using the k-means++ algorithm to obtain modified anchor points;
processing the target feature map through the sliding window and the modified anchor points to obtain initial suggestion boxes and their scores;
updating the score of each initial suggestion box by using the dual-threshold non-maximum suppression algorithm based on the candidate box aspect ratio, and obtaining the Top-m suggestion candidate boxes according to the updated scores;
processing the suggestion candidate boxes by using the detection sub-network and the dual-threshold non-maximum suppression algorithm to obtain suggestion boxes and corresponding suggestion names;
updating network parameters of the improved Faster R-CNN model with the validation dataset based on the suggestion box and the suggestion name;
processing express images in the model training data set by using the updated improved Faster R-CNN model until a trained improved Faster R-CNN model is obtained;
the specific operations of clustering and modifying the anchor points of the RPN network by using the k-means++ algorithm are as follows:
(1) generating a target set C = {c_1(w_1, h_1), c_2(w_2, h_2), ..., c_i(w_i, h_i), ..., c_n(w_n, h_n)} according to the length and width of the target frames marked in the target feature map, wherein c_i(w_i, h_i) represents the i-th target frame in the target set, w_i denotes the length of the i-th target frame, h_i denotes the width of the i-th target frame, and i = 1, 2, ..., n;
(2) randomly selecting a target frame from the target set C as the initial cluster center a_1, and calculating the distance from each target frame in C to a_1 and the probability of each target frame being selected as the next cluster center, the calculation formulas being as follows:

d(c_i(w_i, h_i), a_1) = 1 − IoU(c_i(w_i, h_i), a_1)

P_i = d(c_i(w_i, h_i), a_1)² / Σ_{c'∈C} d(c', a_1)²

wherein d(c_i(w_i, h_i), a_1) represents the distance from target frame c_i(w_i, h_i) to the initial cluster center a_1, IoU(c_i(w_i, h_i), a_1) represents the degree of overlap between target frame c_i(w_i, h_i) and the initial cluster center a_1, P_i represents the probability of target frame c_i(w_i, h_i) being selected as the next cluster center, c' represents a target frame in the target set C, and d(c', a_1)² represents the squared distance from target frame c' to the initial cluster center a_1;
(3) setting

Q_i = Σ_{t=1}^{i} P_t

when a preset value r falls within the interval [Q_{i−1}, Q_i], the target frame c_i(w_i, h_i) is selected as the next cluster center, where r ∈ [0, 1];
(4) repeating steps (2) and (3) until k cluster centers have been selected, generating the cluster center set A = {a_1, a_2, …, a_p, …, a_k}, wherein a_p represents the p-th cluster center, p = 1, 2, …, k;
(5) sequentially calculating the distance from each target frame in C to the k cluster centers, and assigning each target frame to the class of the cluster center with the smallest distance, obtaining k classes;
(6) recalculating the cluster center of each of the k classes obtained in step (5), the calculation formula being as follows:

a_p = ((1/l) Σ_{j=1}^{l} w_j, (1/l) Σ_{j=1}^{l} h_j)

wherein c_j(w_j, h_j) represents the j-th target frame in the p-th class, j = 1, 2, …, l, and l is the number of target frames in the p-th class;
(7) repeating the steps (5) and (6) until the values of the k clustering centers are not changed;
(8) using the aspect ratios of the k cluster centers obtained in step (7) as the anchor point proportions of the RPN network.
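Steps (1)-(8) can be sketched in Python as follows (a simplified illustration using 1 − IoU over box shapes (width, height) as the distance, with boxes compared as if aligned at a common corner; the cumulative-probability selection of step (3) is implemented here via weighted random choice, which is mathematically equivalent, and all names are illustrative):

```python
import random

def shape_iou(a, b):
    # IoU of two (w, h) box shapes aligned at a common corner.
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)

def dist(a, b):
    # Step (2): distance is 1 minus the degree of overlap.
    return 1.0 - shape_iou(a, b)

def kmeans_pp_anchors(boxes, k, iters=20, seed=0):
    rng = random.Random(seed)
    # Steps (2)-(4): k-means++ seeding; each next center is chosen with
    # probability proportional to its squared distance to the nearest center.
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        d2 = [min(dist(b, c) for c in centers) ** 2 for b in boxes]
        centers.append(rng.choices(boxes, weights=d2, k=1)[0])
    # Steps (5)-(7): assign each box to the nearest center, then recompute
    # each center as the mean width/height of its class, until convergence.
    for _ in range(iters):
        classes = [[] for _ in range(k)]
        for b in boxes:
            idx = min(range(k), key=lambda p: dist(b, centers[p]))
            classes[idx].append(b)
        centers = [
            (sum(w for w, _ in cl) / len(cl), sum(h for _, h in cl) / len(cl))
            if cl else centers[p]
            for p, cl in enumerate(classes)
        ]
    # Step (8): the aspect ratio of each final center becomes an anchor proportion.
    return [w / h for w, h in centers]
```

With target frames forming tall and wide shape clusters, the returned ratios approximate the dominant aspect ratios of the labeled express marks.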
2. The express delivery identification method based on the improved Faster R-CNN model according to claim 1, wherein the express delivery label comprises an icon and name text of an express delivery carrier, and the express delivery label identification result comprises a target box and a target name, wherein the target name is the name of the express delivery carrier.
3. The express delivery identification method based on the improved Faster R-CNN model according to claim 1, wherein the model training dataset and the verification dataset are obtained by:
image acquisition is carried out on the express boxes by using camera equipment, and express images in a JPG format are obtained;
compressing the collected express image, labeling the express marks in the compressed express image by using a circumscribed rectangle method, obtaining a target frame and a target name of each express mark, and generating a labeled data set;
carrying out noise adding and data enhancing processing on the marked data set to obtain a sample data set, wherein the noise adding comprises image cutting, collaging and smearing processing, and the data enhancing processing comprises image turning, brightness enhancing and weakening and saturation enhancing and weakening processing;
and dividing the sample data set into a model training data set and a verification data set according to a preset proportion.
4. The express delivery identification method based on the improved Faster R-CNN model according to claim 1, wherein the set of initial suggestion boxes is B = (b_1, b_2, ..., b_w), the corresponding scores are S = (s_1, s_2, ..., s_w), and the coordinates of the q-th initial suggestion box b_q are (x_q, y_q, w_q, h_q), wherein q = 1, 2, ..., w, (x_q, y_q) denotes the coordinates of b_q, and (w_q, h_q) denotes the length and width of b_q;
the specific operation of updating the score of each initial suggestion box using the dual-threshold non-maximum suppression algorithm based on the candidate box aspect ratio is as follows:
screening all initial suggestion boxes according to a preset candidate box aspect ratio range to obtain a suggestion box screening set;
selecting the initial suggestion box b_max with the highest score from the suggestion box screening set;
for each initial suggestion box in the suggestion box screening set other than b_max, updating its score according to its degree of overlap with b_max, the updating expression being as follows:

s_q = s_q,                                 if IoU(b_max, b_q) < N_f
s_q = s_q · exp(−IoU(b_max, b_q)² / σ),    if N_f ≤ IoU(b_max, b_q) ≤ N_t
s_q = 0,                                   if IoU(b_max, b_q) > N_t

wherein s_q represents the score of the q-th candidate box, IoU(b_max, b_q) represents the degree of overlap between the q-th candidate box b_q and b_max, σ is an artificially set parameter, N_f denotes the preset minimum overlap threshold, N_t denotes the preset maximum overlap threshold, and b_max ≠ b_q.
5. The express delivery identification method based on the improved Faster R-CNN model according to claim 4, wherein IoU(b_max, b_q) is calculated as follows:

IoU(b_max, b_q) = area(b_max, b_q) / (area(b_max) + area(b_q) − area(b_max, b_q))

wherein area(b_max, b_q) represents the area of intersection of the initial suggestion boxes b_q and b_max, area(b_max) represents the area of the initial suggestion box b_max, and area(b_q) represents the area of the initial suggestion box b_q.
CN202110279121.4A 2021-03-16 2021-03-16 Express identification method based on improved Faster R-CNN model Active CN112861800B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110279121.4A CN112861800B (en) 2021-03-16 2021-03-16 Express identification method based on improved Faster R-CNN model


Publications (2)

Publication Number Publication Date
CN112861800A CN112861800A (en) 2021-05-28
CN112861800B true CN112861800B (en) 2022-08-05

Family

ID=75994624


Country Status (1)

Country Link
CN (1) CN112861800B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110991435A (en) * 2019-11-27 2020-04-10 南京邮电大学 Express waybill key information positioning method and device based on deep learning
CN111488920A (en) * 2020-03-27 2020-08-04 浙江工业大学 Bag opening position detection method based on deep learning target detection and recognition




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant