CN111046866A - Method for detecting RMB crown word number region by combining CTPN and SVM - Google Patents

Method for detecting RMB crown word number region by combining CTPN and SVM

Info

Publication number
CN111046866A
Authority
CN
China
Prior art keywords
crown word
word number
rmb
svm
sample set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911289182.8A
Other languages
Chinese (zh)
Other versions
CN111046866B (en)
Inventor
杨志钢
李辉洋
黎明
王军亮
胡家欣
孙鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University filed Critical Harbin Engineering University
Priority to CN201911289182.8A priority Critical patent/CN111046866B/en
Publication of CN111046866A publication Critical patent/CN111046866A/en
Application granted granted Critical
Publication of CN111046866B publication Critical patent/CN111046866B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/14 Image acquisition
    • G06V 30/148 Segmentation of character regions
    • G06V 30/153 Segmentation of character regions using recognition of characters or words
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method for detecting the RMB crown word number (serial number) region by combining CTPN and SVM, which comprises the following steps: photographing RMB banknotes of each denomination with camera equipment, marking the crown word number region as a positive sample and the other regions as negative samples, and establishing a sample set; training a CTPN network on the sample set to obtain a preliminary positioning model that proposes candidate crown word number regions; preprocessing the sample set, extracting projection statistical feature vectors, and training an SVM (support vector machine) to obtain a secondary discrimination model that screens the crown word number region; during detection, for the acquired RMB picture to be examined, the primary positioning model proposes several candidate crown word number regions, projection statistical feature vectors of the candidates are extracted, and the secondary discrimination model then yields the correct crown word number region. The beneficial effects of the invention are that the probability of missed detection is greatly reduced while the accuracy of crown word number region detection is improved, and the crown word number region can still be identified accurately when the RMB picture suffers from color change, blurring and similar degradations.

Description

Method for detecting RMB crown word number region by combining CTPN and SVM
Technical Field
The invention relates to the technical field of image processing, and in particular to a method for detecting the RMB crown word number region by combining CTPN and SVM.
Background
The crown word number (serial number) is a distinguishing feature of RMB banknotes: the code string on each note is unique and can therefore be used to identify an individual note. Through crown word number coding, the banking system can not only identify counterfeit money but also assist in tracing robbed or stolen RMB. RMB crown word number detection and recognition technology therefore helps protect public and national property and has clear application value in the security field. At present, the fifth series of RMB and the part of the fourth series still in circulation carry rich pattern and texture interference, which seriously hampers the crown word number region detection task; in addition, because of long circulation, many notes are badly worn, with defects such as tears, repairs, stains and obvious creases, so pictures of notes of the same denomination can look very different, which further increases the difficulty of detecting the crown word number region. Research on RMB crown word number region detection algorithms can therefore provide a solid basis for crown word number recognition and supervision tasks and is of real practical significance. Existing detection methods for the RMB crown word number fall into two categories: detection based on the CIS transmission image and detection based on the visible-light image:
the detection method based on the CIS perspective adopts a mode that the perspective of the RMB is collected through the transmission light of the CIS, and then the perspective is subjected to image analysis to obtain the crown word number region, so that the method is a mature crown word number region detection method in the early stage. The RMB perspective is formed by superposing front and back patterns, so that the interference is more serious, and the solution is to directly search the area by using the prior knowledge of the position of the serial number or carry out convolution matching through a template so as to obtain the serial number area. However, the method requires that paper money is cleaner and more complete, facilities are required for matching, the use conditions are more rigorous, and the algorithm robustness is not strong.
In the visible-light approach, a picture of the front of the banknote is captured by camera equipment and then analysed to obtain the crown word number region. In recent years, thanks to the wide range of scenes in which visible-light images can be acquired and the increase in device computing speed, more researchers have turned to this approach. The front view contains only one side of the note, but the pattern interference is still severe, which makes it hard to isolate the crown word number region. One solution simply follows the transmission-image approach and finds an approximate region by template matching or prior position knowledge, with similarly demanding use conditions; another solution uses recently emerged deep learning methods and strong device computing power to detect the crown word number region, which offers better adaptability and interference resistance.
Disclosure of Invention
The invention aims to provide a method for detecting the RMB crown word number region by combining CTPN and SVM: a CTPN-based primary positioning model first ensures that the crown word number region is not missed, an SVM-based secondary discrimination model then screens the candidate regions to obtain the correct crown word number region, and this coarse-to-fine detection strategy guarantees both accuracy and robustness of detection.
The invention is realized by the following steps:
a method for detecting a RMB crown word number region by combining CTPN and SVM comprises the following steps:
(1) photographing RMB banknotes of different denominations with a camera, including pictures affected by factors such as poor contrast and sharpness, incomplete paper, worn paper and color change, marking the RMB crown word number region as a positive sample and the other regions as negative samples, and establishing a sample set;
(2) training a CTPN network with the sample set to obtain a preliminary positioning model that proposes candidate crown word number regions;
(3) preprocessing the sample set, extracting projection statistical feature vectors, and training an SVM (support vector machine) to obtain a secondary discrimination model capable of screening the crown word number region;
(4) during detection, for the acquired RMB picture to be examined, the primary positioning model proposes several candidate crown word number regions, projection statistical feature vectors of the candidates are extracted, and the secondary discrimination model is applied to obtain the correct crown word number region.
In step (2), training the CTPN network to obtain the preliminary positioning model comprises the following steps:
(2.1) scaling the sample set picture while preserving its original aspect ratio: for a picture of length m and width n, the scaled length is m' = 1200 and the scaled width is n' = (n/m) × m';
(2.2) training the network with the sample set using the Adam gradient-descent optimization algorithm until both the loss value and the miss-detection rate fall below their thresholds, giving the required primary positioning model;
(2.3) during detection, inputting the RMB picture into the primary positioning model to obtain several candidate regions, which include the crown word number region and other non-crown-word-number regions.
In step (3), the training and detection process of the secondary discrimination model obtained from the projection statistical feature vectors and the SVM is as follows:
(3.1) scaling the sample set image to 300 × 50, converting it to a grayscale image, and, after bilateral filtering, thresholding it twice: OTSU thresholding first removes the background region to give a foreground map, and mean thresholding then further separates the pattern from the characters, yielding the secondary thresholded image;
(3.2) calculating the projection statistical feature vector f(z) of the secondary thresholded image by projecting the image I(x, y) onto the x and y axes, so that each element is the statistic of one column or one row:
p_x(x) = Σ_y I(x, y), x = 1, 2, …, 300;  p_y(y) = Σ_x I(x, y), y = 1, 2, …, 50
then concatenating the elements of the two directions into a vector of 350 elements, which serves as the feature expression vector of the secondary thresholded image;
(3.3) training the SVM with the feature vectors of the sample set, using a linear kernel to prevent overfitting, to obtain the secondary discrimination model;
(3.4) during detection, inputting the candidate regions obtained by the primary positioning model into the secondary discrimination model, finally obtaining the correct region containing the crown word number.
In step (3.1), the OTSU algorithm computes the optimal threshold as follows:
w1 = N1/SUM
w2 = 1 - w1
g = w1*w2*(u1-u2)^2
where N1 is the number of pixels in one of the two classes, SUM is the total number of pixels in the whole image, u1 and u2 are the mean pixel values of the two classes, and the threshold that maximizes the between-class variance g is taken as the optimal threshold.
The beneficial effects of the invention are as follows: a CTPN-based primary positioning model first ensures that the crown word number region is not missed, and an SVM-based secondary discrimination model then screens the candidate regions to obtain the correct crown word number region. This coarse-to-fine detection strategy balances the miss-detection rate and the accuracy, greatly reducing the probability of missed detection while improving the accuracy of crown word number region detection. Describing the candidate regions by stable projection statistical features means the model can still identify the crown word number region accurately when the RMB picture suffers from color change, blurring and similar degradations, further improving the robustness and accuracy of crown word number region detection.
Drawings
FIG. 1 is a schematic overall view of a RMB crown word number region detection method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a CTPN-based preliminary positioning model detection process in an embodiment of the present invention;
FIG. 3 is a schematic flow chart of preprocessing and calculating projection statistics according to an embodiment of the present invention;
FIG. 4(a) is an original image of a crown word number region provided in an embodiment of the present invention;
FIG. 4(b) is a diagram of the first thresholding result of the crown word number region according to the embodiment of the present invention;
FIG. 4(c) is a diagram of the second thresholding result of the crown word number region according to the embodiment of the present invention;
FIG. 5(a) is an original image of a non-crown-word-number region provided in an embodiment of the present invention;
FIG. 5(b) is a diagram of the first thresholding result of the non-crown-word-number region according to the embodiment of the present invention;
FIG. 5(c) is a diagram of the second thresholding result of the non-crown-word-number region according to the embodiment of the present invention.
Detailed Description
The present invention is described in further detail below with reference to the attached drawing figures.
The invention relates to the technical field of image processing, in particular to a method for detecting a RMB crown word number region by combining CTPN and SVM.
To address the shortcomings of the prior art, the invention provides a method for detecting the RMB crown word number region by combining CTPN and SVM: a CTPN-based primary positioning model first ensures that the crown word number region is not missed, an SVM-based secondary discrimination model then screens the correct crown word number region from several candidate regions, and this coarse-to-fine detection strategy guarantees both accuracy and robustness of detection. To achieve the above object, the technical solution of the invention is as follows:
a method for detecting a RMB crown word number region by combining a CTPN and an SVM is characterized by comprising the following steps:
(1) photographing RMB banknotes of different denominations with a camera, including pictures affected by factors such as poor contrast and sharpness, incomplete paper, worn paper and color change, marking the RMB crown word number region as a positive sample and the other regions as negative samples, and establishing a sample set;
(2) training a CTPN network with the sample set to obtain a preliminary positioning model that proposes candidate crown word number regions;
(3) preprocessing the sample set, extracting projection statistical feature vectors, and training an SVM (support vector machine) to obtain a secondary discrimination model capable of screening the crown word number region;
(4) during detection, for the acquired RMB picture to be examined, the primary positioning model proposes several candidate crown word number regions, projection statistical feature vectors of the candidates are extracted, and the secondary discrimination model is applied to obtain the correct crown word number region.
Further, the specific training and detection process for the preliminary positioning model obtained from the CTPN network is as follows:
(2-a) scaling the sample set picture while preserving its original aspect ratio: for a picture of length m and width n, the scaled length is m' = 1200 and the scaled width is n' = (n/m) × m';
(2-b) training the network with the sample set using the Adam gradient-descent optimization algorithm until both the loss value and the miss-detection rate fall below their thresholds, giving the required primary positioning model;
(2-c) during detection, inputting the RMB picture into the primary positioning model to obtain several candidate regions, which include the crown word number region and other non-crown-word-number regions.
Further, the specific training and detection process for the secondary discrimination model obtained from the projection statistical feature vectors and the SVM is as follows:
(3-a) scaling the sample set image to 300 × 50, converting it to a grayscale image, and, after bilateral filtering, thresholding it twice: OTSU thresholding first removes the background region to give a foreground map, and mean thresholding then further separates the pattern from the characters, yielding the secondary thresholded image;
(3-b) calculating the projection statistical feature vector f(z) of the secondary thresholded image by projecting the image I(x, y) onto the x and y axes, so that each element is the statistic of one column or one row:
p_x(x) = Σ_y I(x, y), x = 1, 2, …, 300;  p_y(y) = Σ_x I(x, y), y = 1, 2, …, 50
then concatenating the elements of the two directions into a vector of 350 elements, which serves as the feature expression vector of the secondary thresholded image;
(3-c) training the SVM with the feature vectors of the sample set, using a linear kernel to prevent overfitting, to obtain the secondary discrimination model;
(3-d) during detection, inputting the candidate regions obtained by the primary positioning model into the secondary discrimination model, finally obtaining the correct region containing the crown word number.
The invention is further described as follows:
in order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail and completely with reference to all the accompanying drawings in the following embodiments. It should be understood that the described embodiments are merely illustrative of the invention, rather than all embodiments and are not limiting of the invention.
Fig. 1 shows that the method for detecting the rmb crown word number region by combining CTPN and SVM according to the present embodiment is divided into a training phase and a detection phase. The training stage aims to obtain a primary positioning model sensitive to the crown word number region and a secondary distinguishing model capable of accurately distinguishing the region type, and specifically comprises the following steps:
s1, shooting each currency RMB picture influenced by different factors by the front side of a camera device, marking a RMB crown word number area as a sample library, and concretely, the method comprises the following steps:
s11, collecting the RMB pictures, and acquiring corresponding pictures according to specific scene requirements to enable the model to adapt to specific tasks.
The factors that affect an RMB picture include internal factors of the paper and external factors of the shooting scene. Internal factors can cause partially incomplete paper, aging, fading or color change; external factors can cause over-bright or over-dark pictures, low sharpness or even blurred characters, and pictures that capture only part of the note.
S12, annotating the crown word number region of each note; regions whose IoU with the annotation exceeds 0.5 are taken as positive samples, and other regions, which contain no crown word number, are taken as negative samples.
The samples used to train the CTPN network are the whole pictures, while the samples used to train the SVM are crown-word-number regions and non-crown-word-number regions cropped from the pictures.
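The IoU criterion in S12 is stated but not spelled out; purely as an illustration (the box format (x1, y1, x2, y2) and the helper name are assumptions, not part of the patent), it could be computed as follows:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A region is labeled as a positive sample when iou(region, annotated_box) > 0.5.
```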
S2, training the CTPN network with the sample set to obtain a preliminary positioning model that proposes candidate crown word number regions.
CTPN (Connectionist Text Proposal Network) is a text detection method derived from the R-CNN family of models and offers greatly improved detection of horizontal text, which makes it suitable for the preliminary positioning of the crown word number region on RMB notes. The CTPN framework uses VGG as the image feature extraction network, a bidirectional RNN to connect context information, a combined proposal classification and regression loss as the training target, and a text-line construction method to connect the text regions.
FIG. 2 is a schematic diagram of the detection process of the CTPN-based primary positioning model. The forward propagation of the model proceeds as follows: the RMB picture is first normalized; the VGG network then extracts features from the scaled picture to obtain a feature map; at each pixel of the feature map, anchor boxes of fixed width and varying heights are generated and fed into a bidirectional LSTM to obtain sequential features; the network then judges whether each anchor contains text and predicts the centre y-coordinate and height of the box; one or more text proposal regions are screened out; and finally the text proposals are connected into a crown word number region by the text-line construction method.
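The patent does not reproduce the CTPN architecture itself; the following PyTorch sketch is only a rough, non-authoritative illustration of the structure described above (VGG backbone, 3×3 sliding window, bidirectional LSTM over each feature-map row, and per-anchor text/coordinate heads). The layer sizes and the anchor count k = 10 are assumptions taken from the original CTPN paper, not from this patent.

```python
import torch
import torch.nn as nn
import torchvision


class CTPNSketch(nn.Module):
    """Rough sketch of a CTPN-style proposal network (illustrative only)."""

    def __init__(self, k=10):
        super().__init__()
        vgg = torchvision.models.vgg16().features
        self.backbone = nn.Sequential(*list(vgg)[:30])      # up to conv5_3 + ReLU
        self.rpn_conv = nn.Conv2d(512, 512, 3, padding=1)   # 3x3 sliding window
        self.bilstm = nn.LSTM(512, 128, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(256, 512)
        self.cls_head = nn.Linear(512, 2 * k)   # text / non-text score per anchor
        self.reg_head = nn.Linear(512, 2 * k)   # centre-y offset and height per anchor

    def forward(self, x):
        f = self.rpn_conv(self.backbone(x))                  # B x 512 x H x W feature map
        b, c, h, w = f.shape
        rows = f.permute(0, 2, 3, 1).reshape(b * h, w, c)    # one sequence per feature-map row
        rows, _ = self.bilstm(rows)                          # context along the row
        rows = torch.relu(self.fc(rows))
        scores = self.cls_head(rows).reshape(b, h, w, -1)
        coords = self.reg_head(rows).reshape(b, h, w, -1)
        return scores, coords
```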
In practice, the patterns on RMB notes cause strong interference, and the CTPN network cannot reach a low enough loss value during training; in other words, the model does not fully converge. This embodiment therefore exploits only the good sensitivity of the CTPN network to horizontal character regions and uses it purely as the preliminary positioning component for the crown word number region, thereby sidestepping the convergence problem. To obtain the preliminary positioning model required by the invention, the training process specifically comprises the following steps:
S21, scaling the sample set picture while preserving its original aspect ratio: for a picture of length m and width n, the scaled length is m' = 1200 and the scaled width is n' = (n/m) × m'.
S22, training the network with the sample set using the Adam gradient-descent optimization algorithm until both the loss value and the miss-detection rate fall below their thresholds, giving the required primary positioning model.
The goal of training the CTPN network is a model with a low miss-detection rate; the Adam method is used only to reduce the loss value quickly, and the model's total loss function is not modified. The miss-detection rate and the total loss value are monitored, training is stopped as soon as both meet their thresholds, and the model at that point is taken as the primary positioning model. The resulting preliminary positioning model proposes one or more regions, one of which must contain the crown word number.
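A minimal sketch of the stopping rule described above; train_one_epoch and evaluate_miss_rate are caller-supplied callables (assumed, not defined by the patent) returning the epoch's total loss and the validation miss-detection rate:

```python
def train_until_thresholds(model, train_one_epoch, evaluate_miss_rate,
                           loss_threshold, miss_threshold, max_epochs=100):
    """Stop training as soon as both the total loss and the miss-detection rate meet their thresholds."""
    for epoch in range(max_epochs):
        loss = train_one_epoch(model)
        miss_rate = evaluate_miss_rate(model)
        if loss < loss_threshold and miss_rate < miss_threshold:
            break
    return model
```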
When a single picture is input into the preliminary positioning model, the resulting regions are of two types: regions containing the crown word number and regions that do not; the region containing the crown word number is the correct region.
S3, extracting projection statistical feature vectors from the sample set and training an SVM (support vector machine) to obtain a secondary discrimination model for screening the region containing the crown word number.
Image analysis shows how the crown word number region differs from the other regions: because the RMB crown word number region overlaps the portrait area, the portrait contributes rich edge interference, and the portrait and the characters cannot be separated by pixel values alone; the non-crown-word-number interference regions in the preliminary positioning result, for their part, have rich edge information but little contour information. Simple background interference can therefore be removed by image processing, feature vectors are then extracted, and finally a machine learning method judges whether the region contains characters, completing the screening of the correct region containing the complete crown word number.
S31, preprocessing the positive and negative sample images in the sample set.
Fig. 3 is a schematic diagram illustrating a flow of preprocessing and calculating a projection statistical characteristic according to an embodiment of the present invention, where the preprocessing flow includes:
and S311, normalizing the picture, namely scaling the picture to be 300 multiplied by 50, taking a gray image, and then carrying out bilateral filtering to obtain a filtered gray image.
The picture with the same currency value has larger color difference due to the brightness and paper fading, so that the image analysis can not be carried out by utilizing a single channel in RGB, and the gray-scale image contains the integral information of the picture and is suitable for the separation task of processing the crown word number and the background. Then, bilateral filtering is used for processing noise interference contained in the RMB picture and a large number of fine veins which are difficult to process by other filtering methods, and filtered gray level images of a crown word number area and a non-crown word number area are obtained.
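One way S311 might be written with OpenCV; the bilateral-filter parameters below are illustrative assumptions, since the patent does not specify them:

```python
import cv2


def preprocess_candidate(region_bgr):
    """Resize a candidate region to 300x50, convert to grayscale and bilateral-filter it (S311)."""
    resized = cv2.resize(region_bgr, (300, 50), interpolation=cv2.INTER_LINEAR)
    gray = cv2.cvtColor(resized, cv2.COLOR_BGR2GRAY)
    # d, sigmaColor and sigmaSpace are illustrative values, not taken from the patent.
    return cv2.bilateralFilter(gray, d=9, sigmaColor=75, sigmaSpace=75)
```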
S312, applying secondary thresholding to the grayscale image: OTSU thresholding first yields a foreground image with the background removed, and mean thresholding then further separates the pattern from the characters.
Threshold segmentation is an effective way to separate foreground from background, discarding unnecessary background information and simplifying the subsequent steps. In practice, RMB pictures are affected by lighting and paper fading, so the pixel values of the same part can differ greatly between pictures, and only an adaptive threshold method can achieve this. The OTSU algorithm is an efficient adaptive thresholding algorithm that finds the optimal threshold between the two classes of pixel values according to the maximum between-class variance principle:
w1 = N1/SUM
w2 = 1 - w1
g = w1*w2*(u1-u2)^2
where N1 is the number of pixels in one of the two classes, SUM is the total number of pixels in the whole image, u1 and u2 are the mean pixel values of the two classes, and the threshold that maximizes the between-class variance g is taken as the optimal threshold.
FIG. 4 and FIG. 5 show the secondary thresholding of a crown-word-number region and a non-crown-word-number region, respectively, according to an embodiment of the invention. In the foreground image produced by the first OTSU segmentation, the crown word number region of some denominations still retains heavy pattern interference; a second, mean-value threshold can separate it further in a simple and efficient way. First, the OTSU foreground image is used as a mask over the grayscale image to obtain the original foreground pixels of the region; the average of these foreground pixel values is then used as the segmentation threshold, giving the secondary thresholded image. For the crown word number region, the secondary thresholding result is a binary image of the character pattern; for non-crown-word-number regions, it is a binary image of the interference pattern.
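The two-stage thresholding could look roughly like the sketch below. Treating the characters as darker than the paper (hence THRESH_BINARY_INV) and restricting the second threshold to pixels kept by the first stage are interpretations of the description above, not details fixed by the patent:

```python
import cv2
import numpy as np


def secondary_threshold(filtered_gray):
    """Two-stage thresholding (S312): OTSU to drop the background, then a mean threshold
    over the remaining foreground pixels to separate characters from pattern."""
    # Stage 1: OTSU splits the paper background from the foreground (characters + pattern).
    _, fg_mask = cv2.threshold(filtered_gray, 0, 255,
                               cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    foreground = cv2.bitwise_and(filtered_gray, filtered_gray, mask=fg_mask)
    # Stage 2: the mean of the foreground pixels becomes the second threshold.
    mean_val = foreground[fg_mask > 0].mean() if np.any(fg_mask > 0) else 0
    _, second = cv2.threshold(filtered_gray, mean_val, 255, cv2.THRESH_BINARY_INV)
    # Keep only pixels that survived the first stage.
    return cv2.bitwise_and(second, second, mask=fg_mask)
```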
S32, calculating the projection statistical features of the secondary thresholded image.
The secondary thresholded image has already separated foreground objects such as characters from the background, so the features of the crown-word-number and non-crown-word-number regions can be described by the pixel histograms of the thresholded image in the row and column directions. The calculation is as follows:
For a thresholded image I(x, y) of size 300 × 50, project it onto the x and y axes, so that each element is the statistic of one column or one row:
p_x(x) = Σ_y I(x, y), x = 1, 2, …, 300;  p_y(y) = Σ_x I(x, y), y = 1, 2, …, 50
The elements of the two directions are then concatenated into a vector f(z) containing 350 elements, which serves as the feature expression vector of the secondary thresholded image.
S33, training the SVM with the projection statistical feature vectors of the sample set, using a linear kernel to prevent overfitting, to obtain the secondary discrimination model.
An SVM (Support Vector Machine) is a machine learning model that finds an optimal separating hyperplane which not only separates the two classes of samples but also maximizes the classification margin. Training an SVM gives a fast and reliable secondary discrimination model that distinguishes crown-word-number regions from non-crown-word-number regions. Because RMB pictures in real scenes vary widely, overfitting is a prominent risk, and the feature vectors of the crown word number region are noticeably sparse, so a linear kernel is adopted to avoid overfitting.
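A scikit-learn rendering of S33 is sketched below; the choice of sklearn's SVC and the regularization value C = 1.0 are assumptions, since the patent only specifies a linear kernel:

```python
import numpy as np
from sklearn.svm import SVC


def train_secondary_model(features, labels):
    """Train the linear-kernel SVM discriminator (S33).

    features: (N, 350) array of projection feature vectors.
    labels:   (N,) array, 1 for crown-word-number regions, 0 otherwise.
    """
    clf = SVC(kernel="linear", C=1.0)   # C=1.0 is an assumed default, not from the patent
    clf.fit(np.asarray(features), np.asarray(labels))
    return clf


# During detection, clf.predict(candidate_features) keeps the candidates classified as 1.
```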
The projection feature vectors of the candidate regions are input into the secondary discrimination model; regions identified as non-crown-word-number regions are removed, and the correct crown-word-number regions are retained.
S4, during detection, the primary positioning model proposes several candidate crown word number regions for the acquired RMB picture, the projection statistical feature vectors of the candidates are extracted, and the secondary discrimination model yields the correct crown word number region.
As shown in FIG. 1, the detection flow of the method in this embodiment is as follows: the RMB picture to be examined is input into the trained primary positioning model to obtain several candidate regions, which include the crown word number region and non-crown-word-number regions; each candidate region is then preprocessed, its projection statistical feature vector is computed, and the trained secondary discrimination model classifies it; the non-crown-word-number regions are deleted from the candidate list, finally leaving the correct region containing the crown word number.
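Putting the earlier sketches together, a hypothetical end-to-end detection routine for the flow just described might look like this; propose_regions stands in for the trained CTPN-based primary positioning model and is assumed, as are the helper functions defined in the sketches above:

```python
def detect_crown_word_number(image_bgr, propose_regions, svm_model):
    """Coarse-to-fine detection (S4): CTPN-style proposals, then SVM screening.

    propose_regions: callable returning a list of (x1, y1, x2, y2) candidate boxes (assumed).
    svm_model:       the trained secondary discrimination model from train_secondary_model.
    """
    accepted = []
    for (x1, y1, x2, y2) in propose_regions(image_bgr):
        crop = image_bgr[y1:y2, x1:x2]
        gray = preprocess_candidate(crop)        # resize to 300x50, grayscale, bilateral filter
        binary = secondary_threshold(gray)       # two-stage thresholding
        feature = projection_feature(binary)     # 350-element projection vector
        if svm_model.predict(feature.reshape(1, -1))[0] == 1:
            accepted.append((x1, y1, x2, y2))
    return accepted
```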
In conclusion, the coarse-to-fine detection strategy is well suited to detecting the crown word number region on RMB notes with complex patterns; the CTPN-based preliminary positioning model exploits its sensitivity to text content to reduce or even avoid missed crown word number detections; and the SVM-based secondary discrimination model, built on stable and reliable projection statistical features, strengthens the interference resistance while accurately identifying the crown word number region, so the method copes with poor image quality such as color change and blurring of the RMB picture and is both practical and robust.
In summary, the invention provides a method for detecting the RMB crown word number region by combining CTPN and SVM: a CTPN-based primary positioning model first ensures that the crown word number region is not missed, and an SVM-based secondary discrimination model then screens the candidate regions to obtain the correct crown word number region. The coarse-to-fine detection strategy balances the miss-detection rate and the accuracy, greatly reducing the probability of missed detection while improving the accuracy of crown word number region detection; describing the candidate regions by stable projection statistical features means the model can still identify the crown word number region accurately when the RMB picture suffers from color change, blurring and similar degradations, further improving the robustness and accuracy of detection.
Having described the basic principles, main features and advantages of the method for detecting the RMB crown word number region, those skilled in the art should understand that the above description of the embodiments is intended only to help understand the method, technique and core idea of the invention and not to limit it; changes made to the specific implementation and application scope in accordance with the idea of this application all fall within the protection scope of the invention.

Claims (4)

1. A method for detecting the RMB crown word number area by combining CTPN and SVM is characterized in that: the method comprises the following steps:
(1) photographing RMB banknotes of different denominations with a camera, including pictures affected by factors such as poor contrast and sharpness, incomplete paper, worn paper and color change, marking the RMB crown word number region as a positive sample and the other regions as negative samples, and establishing a sample set;
(2) training a CTPN network with the sample set to obtain a preliminary positioning model that proposes candidate crown word number regions;
(3) preprocessing the sample set, extracting projection statistical feature vectors, and training an SVM (support vector machine) to obtain a secondary discrimination model capable of screening the crown word number region;
(4) during detection, for the acquired RMB picture to be examined, the primary positioning model proposes several candidate crown word number regions, projection statistical feature vectors of the candidates are extracted, and the secondary discrimination model is applied to obtain the correct crown word number region.
2. The RMB crown word number region detection method combining CTPN and SVM as claimed in claim 1, wherein in step (2), training the CTPN network to obtain the preliminary positioning model comprises the following steps:
(2.1) scaling the sample set picture while preserving its original aspect ratio: for a picture of length m and width n, the scaled length is m' = 1200 and the scaled width is n' = (n/m) × m';
(2.2) training the network with the sample set using the Adam gradient-descent optimization algorithm until both the loss value and the miss-detection rate fall below their thresholds, giving the required primary positioning model;
(2.3) during detection, inputting the RMB picture into the primary positioning model to obtain several candidate regions, which include the crown word number region and other non-crown-word-number regions.
3. The RMB crown word number region detection method combining CTPN and SVM as claimed in claim 1, wherein in step (3), the training and detection process of the secondary discrimination model obtained from the projection statistical feature vectors and the SVM is as follows:
(3.1) scaling the sample set image to 300 × 50, converting it to a grayscale image, and, after bilateral filtering, thresholding it twice: OTSU thresholding first removes the background region to give a foreground map, and mean thresholding then further separates the pattern from the characters, yielding the secondary thresholded image;
(3.2) calculating the projection statistical feature vector f(z) of the secondary thresholded image by projecting the image I(x, y) onto the x and y axes, so that each element is the statistic of one column or one row:
p_x(x) = Σ_y I(x, y), x = 1, 2, …, 300;  p_y(y) = Σ_x I(x, y), y = 1, 2, …, 50
then concatenating the elements of the two directions into a vector of 350 elements, which serves as the feature expression vector of the secondary thresholded image;
(3.3) training the SVM with the feature vectors of the sample set, using a linear kernel to prevent overfitting, to obtain the secondary discrimination model;
(3.4) during detection, inputting the candidate regions obtained by the primary positioning model into the secondary discrimination model, finally obtaining the correct region containing the crown word number.
4. The RMB crown word number region detection method combining CTPN and SVM as claimed in claim 1, wherein in step (3.1), the OTSU algorithm computes the optimal threshold as follows:
w1 = N1/SUM
w2 = 1 - w1
g = w1*w2*(u1-u2)^2
where N1 is the number of pixels in one of the two classes, SUM is the total number of pixels in the whole image, u1 and u2 are the mean pixel values of the two classes, and the threshold that maximizes the between-class variance g is taken as the optimal threshold.
CN201911289182.8A 2019-12-13 2019-12-13 Method for detecting RMB crown word number region by combining CTPN and SVM Active CN111046866B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911289182.8A CN111046866B (en) 2019-12-13 2019-12-13 Method for detecting RMB crown word number region by combining CTPN and SVM

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911289182.8A CN111046866B (en) 2019-12-13 2019-12-13 Method for detecting RMB crown word number region by combining CTPN and SVM

Publications (2)

Publication Number Publication Date
CN111046866A true CN111046866A (en) 2020-04-21
CN111046866B CN111046866B (en) 2023-04-18

Family

ID=70236491

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911289182.8A Active CN111046866B (en) 2019-12-13 2019-12-13 Method for detecting RMB crown word number region by combining CTPN and SVM

Country Status (1)

Country Link
CN (1) CN111046866B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111583502A (en) * 2020-05-08 2020-08-25 辽宁科技大学 Renminbi (RMB) crown word number multi-label identification method based on deep convolutional neural network
WO2021204014A1 (en) * 2020-11-12 2021-10-14 平安科技(深圳)有限公司 Model training method and related apparatus

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010225013A (en) * 2009-03-25 2010-10-07 Hitachi Omron Terminal Solutions Corp Serial number recognition device, paper sheet processor, automatic transaction processor, and serial number recognition method
CN102521911A (en) * 2011-12-16 2012-06-27 尤新革 Identification method of crown word number (serial number) of bank note
JP2013182474A (en) * 2012-03-02 2013-09-12 Keijiro Sawano Sns (social networking service) server characterized in associating same banknote holder to each other, method and system
CN106056751A (en) * 2016-05-20 2016-10-26 聚龙股份有限公司 Prefix number identification method and system
CN106683264A (en) * 2017-02-20 2017-05-17 深圳怡化电脑股份有限公司 Identification method and device of stained crown size of RMB
CN106780953A (en) * 2017-01-13 2017-05-31 广州广电运通金融电子股份有限公司 A kind of paper money discrimination method and system based on double comb font size
CN107346420A (en) * 2017-06-19 2017-11-14 中国科学院信息工程研究所 Text detection localization method under a kind of natural scene based on deep learning
CN107393115A (en) * 2017-08-03 2017-11-24 恒银金融科技股份有限公司 Banknote serial number identification and counterfeit identification method
CN107464335A (en) * 2017-08-03 2017-12-12 恒银金融科技股份有限公司 Paper currency crown word number positioning method
CN107481389A (en) * 2017-08-03 2017-12-15 恒银金融科技股份有限公司 Banknote serial number area image binarization processing method
WO2018054326A1 (en) * 2016-09-22 2018-03-29 北京市商汤科技开发有限公司 Character detection method and device, and character detection training method and device
CN108960229A (en) * 2018-04-23 2018-12-07 中国科学院信息工程研究所 One kind is towards multidirectional character detecting method and device
CN109344824A (en) * 2018-09-21 2019-02-15 泰康保险集团股份有限公司 A kind of line of text method for detecting area, device, medium and electronic equipment
CN110533018A (en) * 2018-05-23 2019-12-03 北京国双科技有限公司 A kind of classification method and device of image
CN111709419A (en) * 2020-06-10 2020-09-25 中国工商银行股份有限公司 Method, system and equipment for positioning banknote serial number and readable storage medium

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010225013A (en) * 2009-03-25 2010-10-07 Hitachi Omron Terminal Solutions Corp Serial number recognition device, paper sheet processor, automatic transaction processor, and serial number recognition method
CN102521911A (en) * 2011-12-16 2012-06-27 尤新革 Identification method of crown word number (serial number) of bank note
JP2013182474A (en) * 2012-03-02 2013-09-12 Keijiro Sawano Sns (social networking service) server characterized in associating same banknote holder to each other, method and system
CN106056751A (en) * 2016-05-20 2016-10-26 聚龙股份有限公司 Prefix number identification method and system
WO2018054326A1 (en) * 2016-09-22 2018-03-29 北京市商汤科技开发有限公司 Character detection method and device, and character detection training method and device
CN106780953A (en) * 2017-01-13 2017-05-31 广州广电运通金融电子股份有限公司 A kind of paper money discrimination method and system based on double comb font size
CN106683264A (en) * 2017-02-20 2017-05-17 深圳怡化电脑股份有限公司 Identification method and device of stained crown size of RMB
CN107346420A (en) * 2017-06-19 2017-11-14 中国科学院信息工程研究所 Text detection localization method under a kind of natural scene based on deep learning
CN107464335A (en) * 2017-08-03 2017-12-12 恒银金融科技股份有限公司 Paper currency crown word number positioning method
CN107481389A (en) * 2017-08-03 2017-12-15 恒银金融科技股份有限公司 Banknote serial number area image binarization processing method
CN107393115A (en) * 2017-08-03 2017-11-24 恒银金融科技股份有限公司 Banknote serial number identification and counterfeit identification method
CN108960229A (en) * 2018-04-23 2018-12-07 中国科学院信息工程研究所 One kind is towards multidirectional character detecting method and device
CN110533018A (en) * 2018-05-23 2019-12-03 北京国双科技有限公司 A kind of classification method and device of image
CN109344824A (en) * 2018-09-21 2019-02-15 泰康保险集团股份有限公司 A kind of line of text method for detecting area, device, medium and electronic equipment
CN111709419A (en) * 2020-06-10 2020-09-25 中国工商银行股份有限公司 Method, system and equipment for positioning banknote serial number and readable storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HU, Chenxi: "Research on preprocessing methods for banknote image counterfeit detection", Henan Science and Technology *
ZHAO, Min: "Extraction and recognition of the crown word number of 100-yuan RMB banknotes", Journal of Nanchang Hangkong University (Natural Sciences) *
JIN, Zhenwei: "Research and implementation of a CTPN-based system for extracting business registration information of online shops", Modern Information Technology *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111583502A (en) * 2020-05-08 2020-08-25 辽宁科技大学 Renminbi (RMB) crown word number multi-label identification method based on deep convolutional neural network
CN111583502B (en) * 2020-05-08 2022-06-03 辽宁科技大学 Renminbi (RMB) crown word number multi-label identification method based on deep convolutional neural network
WO2021204014A1 (en) * 2020-11-12 2021-10-14 平安科技(深圳)有限公司 Model training method and related apparatus

Also Published As

Publication number Publication date
CN111046866B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
CN104050471B (en) Natural scene character detection method and system
CN107491762B (en) A kind of pedestrian detection method
CN109918971B (en) Method and device for detecting number of people in monitoring video
CN111242027B (en) Unsupervised learning scene feature rapid extraction method fusing semantic information
CN104463134B (en) A kind of detection method of license plate and system
CN111046866B (en) Method for detecting RMB crown word number region by combining CTPN and SVM
CN112818952A (en) Coal rock boundary recognition method and device and electronic equipment
CN110689003A (en) Low-illumination imaging license plate recognition method and system, computer equipment and storage medium
CN107122732B (en) High-robustness rapid license plate positioning method in monitoring scene
CN111783673B (en) Video segmentation improvement method based on OSVOS
Zhao et al. An effective binarization method for disturbed camera-captured document images
Gui et al. A fast caption detection method for low quality video images
CN113221603A (en) Method and device for detecting shielding of monitoring equipment by foreign matters
CN110619331A (en) Color distance-based color image field positioning method
CN110598521A (en) Behavior and physiological state identification method based on intelligent analysis of face image
Liu et al. Effectively localize text in natural scene images
CN114554188A (en) Mobile phone camera detection method and device based on image sensor pixel array
Singh Texture-based real-time character extraction and recognition in natural images
CN114332983A (en) Face image definition detection method, face image definition detection device, electronic equipment and medium
Singh et al. An efficient hybrid scheme for key frame extraction and text localization in video
Chan et al. Using colour features to block dubious images
CN111627047A (en) Underwater fish dynamic visual sequence moving target detection method
CN112070041A (en) Living body face detection method and device based on CNN deep learning model
Fang et al. Traffic sign detection based on co-training method
RastegarSani et al. Playfield extraction in soccer video based on Lab color space classification

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant