CN110046650B - Express package bar code rapid detection method - Google Patents

Express package bar code rapid detection method

Info

Publication number
CN110046650B
CN110046650B (application CN201910197753.9A)
Authority
CN
China
Prior art keywords
feature
convolution
network
detection method
feature fusion
Prior art date
Legal status
Active
Application number
CN201910197753.9A
Other languages
Chinese (zh)
Other versions
CN110046650A (en)
Inventor
许绍云
易帆
李功燕
Current Assignee
Zhongke Weizhi Technology Co ltd
Original Assignee
Zhongke Weizhi Intelligent Manufacturing Technology Jiangsu Co ltd
Priority date
Filing date
Publication date
Application filed by Zhongke Weizhi Intelligent Manufacturing Technology Jiangsu Co ltd filed Critical Zhongke Weizhi Intelligent Manufacturing Technology Jiangsu Co ltd
Priority to CN201910197753.9A priority Critical patent/CN110046650B/en
Publication of CN110046650A publication Critical patent/CN110046650A/en
Application granted granted Critical
Publication of CN110046650B publication Critical patent/CN110046650B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a rapid detection method for express parcel barcodes, comprising the following steps: constructing a cascaded multi-scale feature fusion network, wherein the feature fusion network comprises a feature reduction network and a feature retention network connected in sequence; inputting the express parcel barcode image into the feature fusion network and extracting a feature map by convolution; and inputting the feature map into a detection module, which performs classification and coordinate regression on the feature map to obtain the detection result. The method offers clear advantages in both accuracy and speed, and can well solve the barcode detection problem in practical applications. In other application environments, the number of feature fusion layers in the network can be customized to specific requirements, giving the method generality.

Description

Express package bar code rapid detection method
Technical Field
The invention relates to the technical field of deep learning and image processing, and in particular to a rapid detection method for express parcel barcodes.
Background
Image processing is an effective way to achieve barcode detection. Under relatively ideal physical conditions, traditional image algorithms typically extract barcode edge-texture features, obtain the barcode region by morphological erosion and dilation, or detect barcode edge lines with the Hough transform. Such algorithms impose strict requirements on the detection environment and achieve good results only under specific physical conditions. In real automatic parcel-sorting scenes, however, illumination conditions and the field environment make the captured images uneven in quality: severe illumination changes, complex background interference, barcode distortion, small barcode targets, and the like easily cause false and missed detections, raising the difficulty of barcode detection. Research into highly reliable and stable barcode detection and localization methods is therefore important for efficient automatic sorting of logistics parcels in complex environments.
Unlike traditional image algorithms, which require manually designed features, deep learning extracts the relevant features through self-learning and can integrate feature extraction, screening, and classification into a single network for joint optimization, a notable advantage. In particular, convolutional neural networks far surpass traditional image algorithms on tasks such as image recognition, image understanding, object detection, and semantic segmentation, and their robustness makes them applicable to a wide range of scene tasks.
Object detection models based on convolutional neural networks fall into two main types. One is the two-stage, candidate-region-based detector represented by Faster R-CNN: a deep convolutional neural network first produces a feature map of the image, a region proposal network (RPN) then generates candidate regions from the feature map, and a classifier, bounding-box regressor, non-maximum suppression, and related algorithms classify and refine the candidate regions into valid targets. The other is the one-stage, regression-based detector represented by YOLO and SSD, which selects feature maps from the output or intermediate layers and performs classification and coordinate regression on them directly. Unlike the two-stage detector, the one-stage detector omits the candidate-region generation of the region proposal network and directly integrates target identification and judgment, greatly reducing computational cost and latency; this makes it important for real-time, end-to-end object detection.
Disclosure of Invention
The object of the invention is achieved by the following technical solution.
Specifically, the invention provides a method for rapidly detecting express package bar codes, which comprises the following steps:
constructing a cascade multi-scale feature fusion network, wherein the feature fusion network comprises a feature reduction network and a feature retention network which are connected in sequence;
inputting the express package barcode image into the feature fusion network, and obtaining a feature map through convolution extraction;
and inputting the feature map into a detection module, and carrying out classification and coordinate regression on the feature map to obtain a detection result.
Preferably, the feature reduction network is used for feature map size reduction and feature information extraction; the feature retention network is used for fusing feature semantic information.
Preferably, the feature reduction network includes a plurality of feature fusion modules, each consisting of three 3x3 convolutional layers; the output of each feature fusion module is obtained by concatenating the feature maps produced by the three convolutions, the outputs of the second and third 3x3 convolutional layers being upsampled by transposed convolution to the same size before concatenation.
Preferably, the feature retention network comprises a plurality of feature fusion modules, each consisting of three 3x3 dilated convolutional layers, the results of the three dilated convolutions being concatenated directly to output feature maps of the same size.
Preferably, the method further comprises the following steps: and adding a 1x1 convolution compression layer between adjacent feature fusion modules to reduce the number of feature maps output by the feature fusion modules.
Preferably, the method further comprises the following steps: and converting the 3x3 convolution layer in the feature reduction network, and splitting the convolution layer into a depth convolution and a dot-product convolution, wherein the number of convolution kernel channels of the depth convolution is 1, and the size of convolution kernel space of the dot-product convolution is 1.
More preferably, the method further comprises the following steps: and further optimizing the dot product convolution by adopting packet convolution, dividing dot product convolution kernels into a plurality of groups, performing convolution respectively, and finally merging and outputting results.
More preferably, the method further comprises the following steps: and adding channel recombination after the grouping convolution, and cross-mixing the feature maps of different groups.
Preferably, the detection module comprises a classifier, a regressor and a non-maxima suppression unit.
The invention has the advantages that: the method offers clear advantages in both accuracy and speed, and can well solve the barcode detection problem in practical applications. In other application environments, the number of feature fusion layers in the network can be customized to specific requirements, giving the method generality.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 shows a flow chart of a quick detection method for express package bar codes according to an embodiment of the invention;
FIG. 2 shows a schematic structural diagram of a feature fusion layer I according to an embodiment of the invention;
FIG. 3 shows a schematic structural diagram of a feature fusion layer II according to an embodiment of the invention;
FIG. 4 illustrates a feature map channel compression diagram according to an embodiment of the present invention;
FIG. 5 illustrates a schematic diagram of a standard convolution kernel and a depth separable convolution kernel according to an embodiment of the present invention;
FIG. 6 shows a schematic diagram of channel reorganization according to an embodiment of the present invention;
fig. 7 shows a schematic diagram of a standard 3x3 convolutional layer structure (left) and a modified 3x3 depth separable convolutional layer structure (right) in accordance with an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Taking a one-stage detector as the basic framework, and considering the particular parcel-sorting scene and the specific characteristics of the barcode targets to be detected, the method designs a lightweight cascaded multi-scale feature fusion network that extracts features directly by convolution and locates barcodes by classification and regression. The model structure and flow are shown in fig. 1. The feature extraction part mainly comprises two networks: a feature reduction network and a feature retention network. The feature reduction network reduces the feature-map size and extracts feature information; the feature retention network further fuses feature semantic information while keeping the feature-map size unchanged. The detection part consists of a classifier, a regressor, and a non-maximum suppression unit, and performs classification and coordinate regression on the resulting feature map to obtain the target's detection box.
The following provides a detailed description of various arrangements and specific parts of the present invention:
1. Anchor setup
In the present invention, anchor setup mainly involves adjusting two kinds of hyperparameters: the aspect ratio R and the area S. Analysis of barcode sizes in the actual application field shows the barcodes are small, so S is set to {32², 48², 64², 80², 96²} and R to {1:1, 1:2, 1:3, 1:4, 2:1, 3:1, 4:1}. The anchors then cover the size range of all barcodes in the barcode dataset, effectively improving the final detection accuracy.
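A minimal sketch of how the anchor set above could be enumerated; the helper name and the rounding are illustrative assumptions, not from the patent:

```python
from itertools import product

# Areas S = {32^2, 48^2, 64^2, 80^2, 96^2} and aspect ratios
# R = {1:1, 1:2, 1:3, 1:4, 2:1, 3:1, 4:1} as described above.
AREAS = [s * s for s in (32, 48, 64, 80, 96)]
RATIOS = [(1, 1), (1, 2), (1, 3), (1, 4), (2, 1), (3, 1), (4, 1)]

def make_anchors(areas=AREAS, ratios=RATIOS):
    """Return (w, h) anchor sizes for every area/aspect-ratio combination."""
    anchors = []
    for area, (rw, rh) in product(areas, ratios):
        # Solve w * h = area subject to w / h = rw / rh.
        h = (area * rh / rw) ** 0.5
        w = area / h
        anchors.append((round(w, 1), round(h, 1)))
    return anchors

anchors = make_anchors()
print(len(anchors))  # 5 areas x 7 ratios = 35 anchors
```

Every feature-map location would then be tiled with these 35 shapes, which is what lets the detector cover the whole range of barcode sizes.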
2. Feature reduction network
As shown in fig. 1, the invention designs a multi-scale feature fusion layer I that enhances barcode features by fusing semantic information at different hierarchical scales, as shown in fig. 2. Each feature fusion layer I corresponds to a feature fusion module consisting of three 3x3 convolutional layers, each with stride 2. The output of fusion layer I is obtained by concatenating (concat) the feature maps after the three convolutions; the outputs of the second and third 3x3 convolutional layers are upsampled by transposed convolution (deconvolution) to the same size before concatenation. In this way, each multi-scale feature fusion layer I downsamples the feature map by a factor of 2, and stacking several fusion layers I rapidly reduces the feature-map size. To ensure the barcode feature points do not vanish under repeated downsampling, the number of feature fusion modules is set to T = 4, so the feature map of the last layer is downsampled to 1/16 of the input image, preserving the integrity of the barcode features while fully extracting semantic information.
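A minimal PyTorch sketch of one fusion layer I under the description above. The channel counts, padding, and transposed-convolution kernel sizes are assumptions filled in for the sketch, since the patent does not fix them:

```python
import torch
import torch.nn as nn

class FusionLayerI(nn.Module):
    """Multi-scale feature fusion layer I: three cascaded stride-2 3x3
    convolutions whose outputs are upsampled to a common scale and
    concatenated, so the layer as a whole downsamples by 2."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, stride=2, padding=1)
        self.conv3 = nn.Conv2d(out_ch, out_ch, 3, stride=2, padding=1)
        # Transposed convolutions bring the 2nd and 3rd outputs back to
        # the size of the 1st output before concatenation.
        self.up2 = nn.ConvTranspose2d(out_ch, out_ch, 2, stride=2)
        self.up3 = nn.ConvTranspose2d(out_ch, out_ch, 4, stride=4)

    def forward(self, x):
        f1 = self.conv1(x)   # H/2
        f2 = self.conv2(f1)  # H/4
        f3 = self.conv3(f2)  # H/8
        # Concatenate along the channel axis at the H/2 scale.
        return torch.cat([f1, self.up2(f2), self.up3(f3)], dim=1)

y = FusionLayerI(3, 16)(torch.randn(1, 3, 64, 64))
print(y.shape)  # torch.Size([1, 48, 32, 32])
```

Stacking T = 4 such layers would downsample the input by 2⁴ = 16, matching the 1/16 figure above.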
3. Feature retention network
To further extract semantic features, a feature fusion layer II is designed and appended after the feature reduction network. Its structure is shown in fig. 3: each feature fusion layer II consists of three 3x3 dilated convolutional layers, each with stride 1, and the three convolution outputs are concatenated directly into feature maps of the same size. Replacing the traditional standard convolution with dilated convolution keeps the feature-map size unchanged while better enlarging the receptive field and enriching the semantic features. The receptive field of the dilated convolution is computed as:
RF=(K+1)×(DR-1)+K
where RF denotes the receptive field of the dilated convolution output, K the convolution kernel size, and DR the dilation rate.
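The formula above can be evaluated directly; the function name is illustrative:

```python
def dilated_rf(k, dr):
    """Receptive field of a dilated convolution per the formula above:
    RF = (K + 1) * (DR - 1) + K."""
    return (k + 1) * (dr - 1) + k

# With DR = 1 the dilated convolution reduces to a standard one (RF = K);
# larger rates widen the receptive field without shrinking the feature map.
for dr in (1, 2, 3):
    print(dr, dilated_rf(3, dr))
```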
4. Optimized acceleration of the network model
To increase the model's processing speed and meet practical application requirements, the invention further accelerates the model in two respects: the output channels and the convolution depth.
(1) Output feature map channel compression
In feature fusion, concatenating feature maps increases the channel count and hence the model's complexity and computation. To address this, a 1x1 convolutional compression layer is added between adjacent fusion modules to reduce the number of feature maps output by each module. Specifically, the number V of 1x1 convolution kernels is set smaller than the number M of input feature-map channels, as shown in fig. 4.
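A sketch of the compression layer; the channel counts M = 48 and V = 16 are illustrative assumptions (the patent only requires V < M):

```python
import torch
import torch.nn as nn

# 1x1 compression: V output channels with V < M input channels, so the
# concatenated fusion-module output is squeezed back down.
M, V = 48, 16
compress = nn.Conv2d(M, V, kernel_size=1)

x = torch.randn(1, M, 32, 32)  # e.g. a concatenated fusion-module output
out = compress(x)
print(out.shape)  # torch.Size([1, 16, 32, 32])
```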
(2) Convolution kernel depth separation
In the convolution computation, every 3x3 standard convolution is converted into an improved depthwise separable convolution, which effectively speeds up the convolutional layers. Take a standard convolution kernel of spatial size K and depth M (fig. 5(a)), with N kernels in total; the convolution can be split into a depthwise convolution (fig. 5(b)) and a pointwise convolution (fig. 5(c)). Each depthwise convolution kernel has a single channel, i.e. each channel of the feature map has its own kernel; this strips out the cross-channel mixing of standard convolution and focuses on extracting spatial features of the feature map. The pointwise convolution kernel has spatial size 1, equivalent to a 1x1 kernel, and, complementary to the depthwise convolution, mixes and propagates feature information across channels.
Generally, the number of kernels N is set large (e.g. 128 or 256) and the kernel sizes are mostly 3x3 or 5x5, so the computation of the depthwise separable convolution is dominated by the pointwise convolution. The 1x1 pointwise convolution can be further optimized with grouped convolution: the pointwise kernels are divided into g groups, each group is convolved separately, and the results are finally merged, reducing the computation by a factor of g. However, grouped convolution tends to isolate information between channels, contrary to the cross-channel mixing effect of the 1x1 pointwise convolution, so a channel shuffle (fig. 6) is added after the grouped convolution to cross-mix the feature maps of different groups.
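A PyTorch sketch of the improved layer: depthwise convolution, grouped 1x1 pointwise convolution, then channel shuffle. The channel counts and class name are illustrative assumptions:

```python
import torch
import torch.nn as nn

def channel_shuffle(x, g):
    """Cross-mix the g groups: reshape to (N, g, C/g, H, W), swap the
    group and channel axes, then flatten back."""
    n, c, h, w = x.shape
    return x.view(n, g, c // g, h, w).transpose(1, 2).reshape(n, c, h, w)

class SeparableConv(nn.Module):
    """Improved 3x3 layer: depthwise conv (one single-channel kernel per
    input channel), grouped pointwise conv (g groups), channel shuffle."""
    def __init__(self, in_ch, out_ch, g=2):
        super().__init__()
        self.g = g
        # Depthwise: groups == channels, so each kernel has depth 1.
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
        # Pointwise: 1x1 kernels split into g groups to cut computation by g.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, groups=g)

    def forward(self, x):
        x = self.pointwise(self.depthwise(x))
        return channel_shuffle(x, self.g)

y = SeparableConv(16, 32)(torch.randn(1, 16, 24, 24))
print(y.shape)  # torch.Size([1, 32, 24, 24])
```

The shuffle is what restores cross-group information flow after the grouped pointwise convolution isolates the groups.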
Through the above improvement, the structure of the 3 × 3 convolutional layer is changed as shown in fig. 7.
Assuming an output feature map of size DxD, the computation of the standard convolution is compared with that of the improved depthwise separable convolution as follows:
The standard convolution costs K²·M·N·D² multiplications, while the improved version costs M·N·D²/g for the grouped pointwise convolution plus K²·M·D² for the depthwise convolution, giving the ratio:

(M·N·D²/g + K²·M·D²) / (K²·M·N·D²) = 1/(g·K²) + 1/N
In the invention, the convolution kernel size is 3x3 and the pointwise convolution uses g = 2 groups. Since the number of kernels N is generally large, the second term of the ratio can be neglected, so the improved depthwise separable convolution is theoretically about 18 times faster than the standard convolution (g·K² = 2 × 9 = 18), effectively reducing the model's complexity and parameter count and increasing its speed.
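The 18x figure can be checked numerically. Under the standard per-layer multiply-count model, the cost of the improved convolution relative to the standard one is 1/(g·K²) + 1/N; the function name below is illustrative:

```python
# Ratio of depthwise-separable (with grouped pointwise) convolution cost
# to standard convolution cost: 1/(g*K^2) + 1/N.
def cost_ratio(k, g, n):
    return 1 / (g * k * k) + 1 / n

k, g = 3, 2
for n in (128, 256):
    # Speedup approaches g * K^2 = 18x as N grows.
    print(n, round(1 / cost_ratio(k, g, n), 1))
```

For realistic kernel counts the speedup is already close to the asymptote (about 15.8x at N = 128, 16.8x at N = 256), which is why the 1/N term is safe to ignore.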
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (7)

1. A quick detection method for express package bar codes is characterized by comprising the following steps:
constructing a cascade multi-scale feature fusion network, wherein the feature fusion network comprises a feature reduction network and a feature retention network which are connected in sequence;
inputting the express package barcode image into the feature fusion network, and obtaining a feature map through convolution extraction;
inputting the feature map into a detection module, and carrying out classification and coordinate regression on the feature map to obtain a detection result;
the feature reduction network comprises a plurality of feature fusion modules, each consisting of three 3x3 convolutional layers; the output of each feature fusion module is obtained by concatenating the feature maps produced by the three convolutions, the outputs of the second and third 3x3 convolutional layers being upsampled by transposed convolution to the same size before concatenation;
the feature retention network comprises a plurality of feature fusion layers, each consisting of three 3x3 dilated convolutional layers, the results of the three dilated convolutions being concatenated directly to output feature maps of the same size.
2. The express parcel barcode rapid detection method according to claim 1,
the feature reduction network is used for reducing the size of the feature map and extracting feature information; the feature preserving network is used for feature semantic information fusion.
3. The express package bar code rapid detection method according to claim 1, characterized by further comprising:
and adding a 1x1 convolution compression layer between adjacent feature fusion modules to reduce the number of feature maps output by the feature fusion modules.
4. The express package bar code rapid detection method according to claim 1, characterized by further comprising:
and converting the 3x3 convolution layer in the feature reduction network, and splitting the convolution layer into a depth convolution and a dot-product convolution, wherein the number of convolution kernel channels of the depth convolution is 1, and the size of convolution kernel space of the dot-product convolution is 1.
5. The express package barcode rapid detection method of claim 4, further comprising:
and further optimizing the dot product convolution by adopting packet convolution, dividing dot product convolution kernels into a plurality of groups, performing convolution respectively, and finally merging and outputting results.
6. The express package barcode rapid detection method of claim 5, further comprising:
and adding channel recombination after the grouping convolution, and cross-mixing the feature maps of different groups.
7. The express parcel barcode rapid detection method according to claim 1,
the detection module includes a classifier, a regressor, and a non-maxima suppression unit.
CN201910197753.9A 2019-03-15 2019-03-15 Express package bar code rapid detection method Active CN110046650B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910197753.9A CN110046650B (en) 2019-03-15 2019-03-15 Express package bar code rapid detection method

Publications (2)

Publication Number Publication Date
CN110046650A (en) 2019-07-23
CN110046650B (en) 2021-05-28

Family

ID=67274910

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910197753.9A Active CN110046650B (en) 2019-03-15 2019-03-15 Express package bar code rapid detection method

Country Status (1)

Country Link
CN (1) CN110046650B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108073849B (en) * 2016-11-18 2021-04-30 杭州海康威视数字技术股份有限公司 Bar code detection method, device and system
CN108510012B (en) * 2018-05-04 2022-04-01 四川大学 Target rapid detection method based on multi-scale feature map
CN108985145A (en) * 2018-05-29 2018-12-11 同济大学 The Opposite direction connection deep neural network model method of small size road traffic sign detection identification

Also Published As

Publication number Publication date
CN110046650A (en) 2019-07-23

Similar Documents

Publication Publication Date Title
CN112287940B (en) Semantic segmentation method of attention mechanism based on deep learning
Kamal et al. Automatic traffic sign detection and recognition using SegU-Net and a modified Tversky loss function with L1-constraint
CN107527007B (en) Method for detecting object of interest in vehicle image processing system
CN111104903B (en) Depth perception traffic scene multi-target detection method and system
US10198657B2 (en) All-weather thermal-image pedestrian detection method
CN105512638B (en) A kind of Face datection and alignment schemes based on fusion feature
Zhao et al. Improved vision-based vehicle detection and classification by optimized YOLOv4
Haloi Traffic sign classification using deep inception based convolutional networks
Ribeiro et al. An end-to-end deep neural architecture for optical character verification and recognition in retail food packaging
CN109446922B (en) Real-time robust face detection method
CN105488468A (en) Method and device for positioning target area
CN114155527A (en) Scene text recognition method and device
Ma et al. Fusioncount: Efficient crowd counting via multiscale feature fusion
CN110349167A (en) A kind of image instance dividing method and device
CN117037119A (en) Road target detection method and system based on improved YOLOv8
Cho et al. Semantic segmentation with low light images by modified CycleGAN-based image enhancement
CN111353544A (en) Improved Mixed Pooling-Yolov 3-based target detection method
CN112883926B (en) Identification method and device for form medical images
Sabater et al. Event Transformer+. A multi-purpose solution for efficient event data processing
Yin et al. Road Damage Detection and Classification based on Multi-level Feature Pyramids.
Golgire Traffic Sign Recognition using Machine Learning: A Review
CN116778346A (en) Pipeline identification method and system based on improved self-attention mechanism
CN110046650B (en) Express package bar code rapid detection method
CN112541469B (en) Crowd counting method and system based on self-adaptive classification
CN114694042A (en) Disguised person target detection method based on improved Scaled-YOLOv4

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 214105 No. 299 Dacheng Road, Xishan District, Jiangsu, Wuxi

Applicant after: Zhongke Weizhi intelligent manufacturing technology Jiangsu Co.,Ltd.

Address before: 214105 No. 299 Dacheng Road, Xishan District, Jiangsu, Wuxi

Applicant before: ZHONGKE WEIZHI INTELLIGENT MANUFACTURING TECHNOLOGY JIANGSU Co.,Ltd.

TA01 Transfer of patent application right

Effective date of registration: 20200917

Address after: 214105 No. 299 Dacheng Road, Xishan District, Jiangsu, Wuxi

Applicant after: ZHONGKE WEIZHI INTELLIGENT MANUFACTURING TECHNOLOGY JIANGSU Co.,Ltd.

Address before: Floor 7, ITRI Building, No. 1699 Reed City Road, Kunshan, Suzhou, Jiangsu 215347

Applicant before: KUNSHAN BRANCH, INSTITUTE OF MICROELECTRONICS OF CHINESE ACADEMY OF SCIENCES

GR01 Patent grant
CP03 Change of name, title or address

Address after: No. 979, Antai Third Road, Xishan District, Wuxi City, Jiangsu Province, 214000

Patentee after: Zhongke Weizhi Technology Co.,Ltd.

Address before: No. 299, Dacheng Road, Xishan District, Wuxi City, Jiangsu Province

Patentee before: Zhongke Weizhi intelligent manufacturing technology Jiangsu Co.,Ltd.