CN116363655A - Financial bill identification method and system - Google Patents


Info

Publication number
CN116363655A
CN116363655A
Authority
CN
China
Prior art keywords
image
feature
network
model
financial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310294104.7A
Other languages
Chinese (zh)
Inventor
张子荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yicai Information Technology Co ltd
Original Assignee
Shenzhen Yicai Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Yicai Information Technology Co ltd filed Critical Shenzhen Yicai Information Technology Co ltd
Priority to CN202310294104.7A priority Critical patent/CN116363655A/en
Publication of CN116363655A publication Critical patent/CN116363655A/en
Pending legal-status Critical Current

Classifications

    • G06V 30/147 Determination of region of interest
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G06N 3/08 Learning methods
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G06V 10/30 Noise filtering
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 10/764 Recognition or understanding using classification, e.g. of video objects
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/82 Recognition or understanding using neural networks
    • G06V 30/162 Quantising the image signal
    • G06V 30/164 Noise filtering
    • G06V 30/1801 Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
    • G06V 30/19173 Classification techniques
    • G06V 30/1918 Fusion techniques, i.e. combining data from various sources, e.g. sensor fusion
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention discloses a financial bill identification method comprising the following steps: acquiring a financial bill image to be identified; preprocessing the financial bill image to obtain a processed image; constructing an improved YOLO-v3 model; locating the region of interest of the processed image with the constructed improved YOLO-v3 model, extracting the coordinate position of the region of interest, and cropping out a target picture according to the coordinate position; performing text segmentation on the target picture to obtain segmented pictures; and performing text recognition on the segmented pictures with a CRNN network model to obtain a bill recognition result. Preprocessing the financial bill image aids the subsequent extraction of bill information, locating the region of interest with the improved YOLO-v3 model improves the accuracy of region-of-interest detection, and performing character recognition with the CRNN network model effectively improves the accuracy of financial bill identification while greatly saving labor cost.

Description

Financial bill identification method and system
Technical Field
The invention relates to the technical field of bill identification, in particular to a financial bill identification method and a financial bill identification system.
Background
A bill is an important document that records the economic behavior of a commodity or service transaction. With the development of the social economy, people attach more and more importance to bills; as the information carriers of economic activities, invoices are an important object of organization for financial staff. The advent of electronic invoices has brought much convenience to the work, but also challenges to financial management staff.
When financial staff process invoices, manual filing is labor-intensive, error-prone, and extremely inefficient. Staff must read invoice contents carefully and manually enter the content to be extracted, organized, and archived, so the cost of manual data entry is high, efficiency is low, and errors occur easily.
Disclosure of Invention
Aiming at the defects in the prior art, the financial bill identification method and system provided by the invention can accurately and automatically extract text information from bill images, improving the accuracy of bill recognition results and greatly saving labor cost.
In a first aspect, a method for identifying a financial bill provided by an embodiment of the present invention includes:
acquiring a financial bill image to be identified;
preprocessing the financial bill image to obtain a processed image;
constructing an improved YOLO-v3 model;
positioning the region of interest of the processed image by adopting the constructed improved YOLO-v3 model, extracting the coordinate position of the region of interest, and cutting out a target picture according to the coordinate position;
text segmentation is carried out on the target picture to obtain a segmented picture;
and carrying out text recognition on the segmented picture by adopting a CRNN network model to obtain a bill recognition result.
In a second aspect, an embodiment of the present invention provides a financial bill identification system comprising an acquisition module, a preprocessing module, a model construction module, a positioning module, a cutting module, and an identification module, wherein:
the acquisition module is used for acquiring a financial bill image to be identified;
the preprocessing module is used for preprocessing the financial bill image to obtain a processed image;
the model building module is used for building an improved YOLO-v3 model;
the positioning module adopts the constructed improved YOLO-v3 model to position the region of interest of the processed image, extracts the coordinate position of the region of interest, and cuts out a target picture according to the coordinate position;
the cutting module performs text segmentation on the target picture to obtain a segmented picture;
and the recognition module adopts a CRNN network model to carry out text recognition on the segmented picture, so as to obtain a bill recognition result.
The invention has the beneficial effects that:
according to the financial bill identification method and system provided by the embodiment of the invention, the image preprocessing is carried out on the financial bill, so that the image noise is reduced, the image azimuth posture is corrected, the extraction of subsequent bill information is facilitated, the improved YOLO-v3 model is adopted for positioning the region of interest, the accuracy of detecting the region of interest is improved, the CRNN network model is adopted for character identification, the accuracy of identifying the financial bill is effectively improved, and the labor cost is greatly saved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. Like elements or portions are generally identified by like reference numerals throughout the several figures. In the drawings, elements or portions thereof are not necessarily drawn to scale.
FIG. 1 is a flow chart of a financial bill identification method according to a first embodiment of the invention;
fig. 2 is a schematic diagram of a financial bill identification system according to a second embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be understood that the terms "comprises" and "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to a determination", or "in response to detection". Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted, depending on the context, as "upon determination", "in response to determination", "upon detection of a [described condition or event]", or "in response to detection of a [described condition or event]".
It is noted that unless otherwise indicated, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this invention pertains.
As shown in fig. 1, a flowchart of a financial bill identifying method according to a first embodiment of the present invention is shown, the method comprising the steps of:
s1: acquiring a financial bill image to be identified;
s2: preprocessing the financial bill image to obtain a processed image;
s3: constructing an improved YOLO-v3 model;
s4: positioning the region of interest of the processed image by adopting the constructed improved YOLO-v3 model, extracting the coordinate position of the region of interest, and cutting out a target picture according to the coordinate position;
s5: text segmentation is carried out on the target picture to obtain a segmented picture;
s6: and carrying out text recognition on the segmented picture by adopting a CRNN network model to obtain a bill recognition result.
In this embodiment, financial bills include various common financial reimbursement vouchers such as value-added tax plain invoices, value-added tax special invoices, value-added tax electronic plain invoices, value-added tax electronic special invoices, train tickets, taxi tickets, aviation receipts, and the like. Image quality has a direct impact on subsequent detection and recognition, so the financial bill image is preprocessed to improve its quality and reduce the noise in the image. The region of interest of the processed image is then located and calibrated with the constructed improved YOLO-v3 model, its coordinate position is extracted, and a target picture is cropped out; text segmentation of the target picture yields segmented pictures, on which text recognition with a CRNN network model produces the bill recognition result.
Preprocessing the financial bill image improves the bill image quality and facilitates its processing by subsequent models. The improved YOLO-v3 model locates and detects the region of interest; it extracts feature maps well and detects quickly. Text recognition of the segmented pictures with the CRNN network model improves the accuracy of the recognition results.
Specifically, in the method for identifying the financial bill provided by the invention, the specific method for preprocessing the financial bill image to obtain the processed image comprises the following steps:
s201: carrying out Gaussian filtering on the financial bill image to obtain a filtered image;
s202: performing binarization processing on the filtered image to obtain a binarized image;
s203: performing edge detection on the binarized image, and performing perspective transformation on the edge image to obtain a corrected image;
s204: and correcting the azimuth and attitude of the corrected image to obtain a positive financial bill image.
In order to eliminate the noise common in financial bill images, this embodiment filters the financial bill image with Gaussian filtering, removing Gaussian noise from the image and improving the quality of the original image to the greatest extent. The image captured by a camera is generally a color image. A color image contains abundant information, but most of it contributes little to the character recognition task and may even have adverse effects. To remove this redundant information and facilitate subsequent processing, the image is converted into a grayscale image, in which each pixel is represented by a brightness value from 0 to 255. The binarization formula is as follows:
g(x, y) = 255, if f(x, y) ≥ T
g(x, y) = 0,   if f(x, y) < T
where f(x, y) denotes the pixel value at position (x, y), g(x, y) the binarized output, and T the threshold. As the formula shows, the image is binarized by comparing each pixel value f(x, y) with the threshold T one by one. After binarization, the resulting image has relatively little noise and high contrast, which benefits the subsequent extraction of character information.
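The thresholding rule above can be sketched in a few lines of pure Python on a nested-list grayscale image (the threshold value T = 128 here is an arbitrary illustration, not a value taken from the patent):

```python
def binarize(gray, T=128):
    """Global thresholding: map each pixel f(x, y) to 255 if it
    reaches the threshold T, else to 0."""
    return [[255 if f >= T else 0 for f in row] for row in gray]

# A tiny 2x3 "grayscale image" for illustration.
img = [[12, 200, 128],
       [255, 40, 99]]
print(binarize(img))  # [[0, 255, 255], [255, 0, 0]]
```

In practice T would be chosen per image (e.g. by histogram analysis), but the comparison itself is exactly the per-pixel test the formula describes.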
Edge detection is performed on the obtained binarized image in three steps: I. compute the gradient direction of the image edges; II. suppress non-maximum values; III. apply upper and lower thresholds to detect edges. With the two thresholds set, a pixel whose value is above the upper threshold is directly judged an edge point, and a pixel below the lower threshold is directly judged a non-edge point; a pixel between the two thresholds is judged an edge point only if it is adjacent to an edge pixel, otherwise it is a non-edge point. Contour detection on the bill image yields a great deal of contour information, but only the contour covering the bill itself is needed, so the contour with the largest area is taken as the effective information region. This gives the set of contour points wrapping the bill image, from which the four vertex coordinates of the bill must be found. First, Hough transformation is applied to detect straight lines; these are then screened by criteria such as intersecting, not being too close to each other, and forming a quadrilateral, finally yielding four vertex coordinates that satisfy the constraints. The corrected image is obtained by a perspective transformation, which can produce a new quadrilateral that is not necessarily a parallelogram; a perspective transformation has 8 degrees of freedom in total, and an affine transformation can be regarded as a subset of it. After perspective transformation the image becomes more regular, and the noise interference of the surrounding irrelevant areas is removed.
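The double-threshold decision in step III can be sketched as follows (a simplified two-pass version in pure Python; the neighbour test only consults pixels already classified as edges, and the gradient magnitudes and threshold values are illustrative):

```python
def hysteresis(mag, low, high):
    """Classify each pixel of a gradient-magnitude map: >= high -> edge (1);
    < low -> non-edge (0); in between -> edge only if an 8-neighbour is an edge."""
    h, w = len(mag), len(mag[0])
    edge = [[0] * w for _ in range(h)]
    # Pass 1: mark strong edges (above the upper threshold).
    for y in range(h):
        for x in range(w):
            if mag[y][x] >= high:
                edge[y][x] = 1
    # Pass 2: keep weak pixels only when adjacent to an edge pixel.
    for y in range(h):
        for x in range(w):
            if low <= mag[y][x] < high:
                neighbours = [edge[j][i]
                              for j in range(max(0, y - 1), min(h, y + 2))
                              for i in range(max(0, x - 1), min(w, x + 2))
                              if (j, i) != (y, x)]
                if any(neighbours):
                    edge[y][x] = 1
    return edge

print(hysteresis([[10, 90, 50], [5, 40, 200]], low=30, high=100))
# → [[0, 1, 1], [0, 1, 1]]
```

A production implementation (e.g. Canny) would iterate connectivity until convergence; this sketch shows only the thresholding criterion the text describes.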
For a non-upright image, the azimuth and attitude are corrected to obtain an upright financial bill image. During azimuth and attitude correction, two-dimensional codes, bar codes, seals, bill headers, and the like can be used to assist the correction; the correction itself can be implemented with existing techniques.
Specifically, in the financial bill identification method provided by the invention, the specific method for constructing the improved YOLO-v3 model comprises the following steps:
improving the Darknet-53 backbone network to obtain a Darknet-39 backbone network model;
extracting image features with the trained Darknet-39 backbone network model to obtain feature maps from 5 convolution layers of different scales;
optimally combining the feature maps of the convolution layers of different scales to obtain combined feature maps;
carrying out weighted feature fusion on the combined feature maps;
and carrying out regression prediction on the fused feature maps with the YOLO-v3 algorithm to obtain the located region.
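The weighted-feature-fusion step above can be illustrated as a normalized weighted sum of same-shape feature maps. The patent does not specify the weighting scheme, so the sketch below assumes BiFPN-style fast normalization, w_i / (Σ w_j + ε), which is only one plausible form:

```python
def weighted_fuse(feature_maps, weights, eps=1e-4):
    """Fuse same-shape 2-D feature maps as a normalized weighted sum.
    `weights` would be learnable in a real network; here they are given."""
    total = sum(weights) + eps  # eps keeps the division stable
    norm = [w / total for w in weights]
    h, w_ = len(feature_maps[0]), len(feature_maps[0][0])
    return [[sum(n * fm[y][x] for n, fm in zip(norm, feature_maps))
             for x in range(w_)] for y in range(h)]

# Two 1x2 maps fused with equal weights average out element-wise.
fused = weighted_fuse([[[1.0, 3.0]], [[3.0, 1.0]]], [1.0, 1.0])
```

With equal weights the fusion reduces to a plain average; unequal learned weights let the network favor the more informative scale.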
The method for constructing the improved YOLO-v3 model further comprises training the Darknet-39 backbone network model, and the specific method for training it comprises:
adding 2 convolution layers to the backbone network of the traditional YOLO-v3 algorithm and using feature maps from 5 convolution layers of different scales for target detection;
acquiring a data set and dividing it into a training set, a test set, and a validation set;
and re-clustering the bounding-box coordinates on the training set with a k-means clustering algorithm, computing 15 bounding-box coordinates for the feature maps of the 5 convolution layers of different scales.
The Darknet-39 backbone network model is formed by pruning the channels of the Darknet-53 backbone network, which reduces the number of model parameters while still fully extracting picture features and improves running efficiency; the improved YOLO-v3 algorithm reduces the original computation by 80% and quadruples the speed.
The Darknet-39 backbone network module prunes the Darknet-53 backbone network reasonably, optimizing the network structure and removing some redundant convolution operations to obtain the Darknet-39 backbone network. The specific operations are: halve the number of channels of the Level 5 layer and use the Level 5 layer as a feature output layer, at which point the stride is 4, which helps improve the detection rate of small target objects; halve the number of channels, and thus the number of operations, of the Level 4, Level 3, and Level 2 layers, whose strides are 8, 16, and 32 respectively; finally, add a 3×3 convolution layer with stride 64, which strengthens the feature extraction effect while hardly increasing the number of parameters. The resulting Darknet-39 network cannot directly load the weight parameters of the original Darknet-53 and must be retrained. The specific training method is: add 2 convolution layers to the backbone network of the traditional YOLO-v3 algorithm and use feature maps from 5 convolution layers of different scales for target detection; acquire a data set and divide it into a training set, a test set, and a validation set; re-cluster the bounding-box coordinates on the training set with a k-means clustering algorithm, computing 15 bounding-box coordinates for the feature maps of the 5 convolution layers of different scales. By improving the traditional YOLO-v3 algorithm and extracting feature maps from 5 convolution layers of different scales, shallow and deep feature information are fully fused, which improves the detection effect, shrinks the model, increases the target detection speed, and reduces the computation.
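The claimed computation saving is consistent with simple parameter arithmetic: a k×k convolution has k·k·C_in·C_out weights, so halving both the input and output channels of a layer cuts its weights to one quarter. A sketch of that calculation (the 512/1024 channel counts are illustrative, not the actual Darknet channel table):

```python
def conv_params(k, c_in, c_out):
    """Weight count of a k x k convolution layer (biases ignored)."""
    return k * k * c_in * c_out

# An illustrative deep block before and after halving its channels.
before = conv_params(3, 512, 1024)   # 4,718,592 weights
after = conv_params(3, 256, 512)     # 1,179,648 weights
print(f"reduction: {1 - after / before:.0%}")  # halving both sides -> 75%
```

Halving channels alone gives roughly a 75% cut per layer; combined with the removed redundant convolutions, a figure on the order of the stated 80% overall reduction is plausible.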
In this embodiment, the CRNN network model comprises a CNN network, a recurrent network, and a transcription layer. The CNN network extracts scale-invariant features of the segmented pictures to form feature maps, segments the feature maps by columns to form feature sequences, and inputs the feature sequences into the recurrent network;
the recurrent network is composed of two layers of bidirectional LSTM networks and maps the feature sequences input by the CNN network to classification labels, outputting an ordered label sequence whose substrings contain the true final result, and inputs the ordered label sequence into the transcription layer;
and the transcription layer performs a de-duplication and recombination operation on the ordered label sequence to obtain the recognition result.
In actual model training, a data set is generated in batches by image processing and divided into a training set and a test set at a ratio of 10:1; the picture path of each picture sample and the true label of the picture text are mapped in the corresponding label file as K,V key-value pairs, and finally input for iterative training into the CRNN model formed by a CNN network, a BiLSTM, and a transcription layer. The CRNN model outputs the recognized characters, achieving higher speed and recognition accuracy.
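In a CRNN, the transcription layer's de-duplication and recombination is typically CTC-style greedy decoding: collapse consecutive repeated labels, then drop the blank symbol. A minimal sketch under that assumption (label 0 is taken to be the blank; the patent does not name the decoding rule explicitly):

```python
def ctc_greedy_decode(labels, blank=0):
    """Collapse consecutive repeated labels, then remove blanks.
    The blank separates genuine repeats: [1, 1, 0, 1] -> [1, 1]."""
    out, prev = [], None
    for label in labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out

print(ctc_greedy_decode([3, 3, 0, 3, 0, 0, 5, 5]))  # [3, 3, 5]
```

Mapping the surviving labels through the character table then yields the recognized text string.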
According to the financial bill identification method provided by the embodiment of the invention, preprocessing the financial bill image reduces image noise and corrects the image's azimuth and attitude, which aids the subsequent extraction of bill information; locating the region of interest with the improved YOLO-v3 model improves the accuracy of region-of-interest detection; and performing character recognition with the CRNN network model effectively improves the accuracy of financial bill identification while greatly saving labor cost.
The first embodiment provides a financial bill identification method; corresponding to that method, the application also provides a financial bill identification system. Fig. 2 is a block diagram of a financial bill identification system according to a second embodiment of the present invention. Since the apparatus embodiments are substantially similar to the method embodiments, the description is relatively brief; for the relevant points, refer to the description of the method embodiments. The device embodiments described below are merely illustrative.
Referring now to FIG. 2, a block diagram of a financial bill identification system according to the second embodiment of the present invention is shown. The system comprises an acquisition module, a preprocessing module, a model construction module, a positioning module, a cutting module, and an identification module, wherein:
the acquisition module is used for acquiring a financial bill image to be identified;
the preprocessing module is used for preprocessing the financial bill image to obtain a processed image;
the model building module is used for building an improved YOLO-v3 model;
the positioning module adopts the constructed improved YOLO-v3 model to position the region of interest of the processed image, extracts the coordinate position of the region of interest, and cuts out a target picture according to the coordinate position;
the cutting module performs text segmentation on the target picture to obtain a segmented picture;
and the recognition module adopts a CRNN network model to carry out text recognition on the segmented picture, so as to obtain a bill recognition result.
The preprocessing module comprises a filtering unit, a binarization processing unit, a correction unit, and an azimuth correction unit. The filtering unit carries out Gaussian filtering on the financial bill image to obtain a filtered image; the binarization processing unit performs binarization processing on the filtered image to obtain a binarized image; the correction unit performs edge detection on the binarized image and perspective transformation on the edge image to obtain a corrected image; and the azimuth correction unit corrects the azimuth and attitude of the corrected image to obtain an upright financial bill image.
The model construction module comprises a Darknet-39 backbone network unit, a feature combination unit, a weighted feature fusion unit and a prediction unit. The Darknet-39 backbone network unit improves the Darknet-53 backbone network to obtain a Darknet-39 backbone network model, and extracts image features with this model to obtain feature maps from five convolution layers of different scales; the feature combination unit optimally combines the feature maps of the five different-scale convolution layers to obtain a combined feature map; the weighted feature fusion unit performs weighted feature fusion on the combined feature map; and the prediction unit performs regression prediction on the fused feature maps with the YOLO-v3 algorithm to obtain a target detection result.
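The patent does not give the weighted-fusion formula, so the sketch below assumes one common choice, fast-normalised weighted fusion of same-size feature maps (the scheme popularised by BiFPN), together with nearest-neighbour upsampling to bring maps to a common scale. Both function names and the formula are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def upsample2x(f):
    # nearest-neighbour 2x upsampling of a (C, H, W) feature map
    return f.repeat(2, axis=1).repeat(2, axis=2)

def weighted_fusion(feats, weights, eps=1e-4):
    # fast-normalised fusion: sum(w_i * f_i) / (sum(w_i) + eps),
    # with weights clipped to be non-negative; feats must share one shape
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)
    out = sum(wi * f for wi, f in zip(w, feats))
    return out / (w.sum() + eps)
```

In a trained network the weights would be learned parameters; here they are plain numbers for illustration.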
The Darknet-39 backbone network unit also comprises a network training unit: two convolution layers are added to the backbone network of the conventional YOLO-v3 algorithm, and feature maps from five convolution layers of different scales are used for target detection; a data set is acquired and divided into a training set, a test set and a validation set, the bounding-box coordinates are re-clustered on the training set with a k-means clustering algorithm, and 15 bounding-box coordinates are computed for the feature maps of the five different-scale convolution layers.
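The anchor re-clustering step can be sketched as k-means over box widths and heights, with 1 − IoU as the distance measure as in the original YOLO papers (k = 15 here would give three anchors for each of the five scales). The implementation below, including the name `kmeans_anchors`, is a hypothetical minimal version, not code from the patent.

```python
import numpy as np

def iou_wh(boxes, anchors):
    # IoU between (N, 2) box sizes and (K, 2) anchor sizes,
    # treating all boxes as centred at the origin
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = boxes[:, None, 0] * boxes[:, None, 1] + \
            anchors[None, :, 0] * anchors[None, :, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k=15, iters=100, seed=0):
    # k-means with 1 - IoU distance (argmax IoU == argmin 1 - IoU)
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else anchors[i] for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    # sort by area so anchors can be dealt out to the five scales
    return anchors[np.argsort(anchors.prod(axis=1))]
```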
The CRNN network model comprises a CNN network, a recurrent network and a transcription layer. The CNN network extracts scale-invariant features from the segmented picture to form feature maps, slices the feature maps column by column into feature sequences, and inputs them to the recurrent network;
the recurrent network consists of two bidirectional LSTM layers; it maps the feature sequences supplied by the CNN network to classification labels, outputs an ordered label sequence whose substrings contain the true final result, and inputs the ordered label sequence to the transcription layer;
and the transcription layer integrates the ordered label sequence to obtain the recognition result.
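The patent says only that the transcription layer "integrates" the ordered label sequence; in the standard CRNN architecture this is done with CTC decoding, so the sketch below assumes a greedy CTC collapse: repeated labels are merged, then blank labels are dropped.

```python
def ctc_greedy_decode(label_seq, blank=0):
    """Collapse repeated labels, then drop blanks -- the usual CTC way of
    turning a per-column label sequence into the final text labels."""
    out, prev = [], None
    for lab in label_seq:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out
```

For example, the per-column sequence `[0, 1, 1, 0, 1, 2, 2, 0]` decodes to `[1, 1, 2]`: the repeated `1`s and `2`s collapse, but the blank between the two `1`s keeps them as separate characters.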
In the financial bill identification system provided by this embodiment of the invention, image preprocessing reduces image noise and corrects the image orientation, which facilitates the subsequent extraction of bill information; the improved YOLO-v3 model locates the region of interest, improving the accuracy of region-of-interest detection; and the CRNN network model performs character recognition, which effectively improves the accuracy of financial bill recognition and greatly reduces labor costs.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical schemes described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention and are intended to be included within the scope of the appended claims and description.

Claims (10)

1. A financial bill identification method, comprising:
acquiring a financial bill image to be identified;
preprocessing the financial bill image to obtain a processed image;
constructing an improved YOLO-v3 model;
locating a region of interest in the processed image with the constructed improved YOLO-v3 model, extracting the coordinate position of the region of interest, and cropping out a target picture according to the coordinate position;
performing text segmentation on the target picture to obtain a segmented picture;
and performing text recognition on the segmented picture with a CRNN network model to obtain a bill recognition result.
2. The financial bill identification method according to claim 1, wherein preprocessing the financial bill image to obtain a processed image specifically comprises:
performing Gaussian filtering on the financial bill image to obtain a filtered image;
performing binarization on the filtered image to obtain a binarized image;
performing edge detection on the binarized image and applying a perspective transformation to the edge image to obtain a corrected image;
and correcting the orientation of the corrected image to obtain an upright financial bill image.
3. The financial bill identification method according to claim 1, wherein constructing the improved YOLO-v3 model specifically comprises:
improving the Darknet-53 backbone network to obtain a Darknet-39 backbone network model;
extracting image features with the trained Darknet-39 backbone network model to obtain feature maps from five convolution layers of different scales;
optimally combining the feature maps of the different-scale convolution layers to obtain a combined feature map;
performing weighted feature fusion on the combined feature map;
and performing regression prediction on the fused feature maps with the YOLO-v3 algorithm to obtain a positioning area.
4. The financial bill identification method according to claim 3, further comprising a step of training the Darknet-39 backbone network model, which specifically comprises:
adding two convolution layers to the backbone network of the conventional YOLO-v3 algorithm, and using feature maps from five convolution layers of different scales for target detection;
acquiring a data set and dividing it into a training set, a test set and a validation set;
and re-clustering the bounding-box coordinates on the training set with a k-means clustering algorithm, and computing 15 bounding-box coordinates for the feature maps of the five different-scale convolution layers.
5. The financial bill identification method according to claim 1, wherein the CRNN network model comprises a CNN network, a recurrent network and a transcription layer; the CNN network extracts scale-invariant features from the segmented picture to form feature maps, slices the feature maps column by column into feature sequences, and inputs them to the recurrent network;
the recurrent network consists of two bidirectional LSTM layers, maps the feature sequences input by the CNN network to classification labels, outputs an ordered label sequence whose substrings contain the true final result, and inputs the ordered label sequence to the transcription layer;
and the transcription layer integrates the ordered label sequence to obtain the recognition result.
6. A financial bill identification system, comprising: an acquisition module, a preprocessing module, a model construction module, a positioning module, a cutting module and a recognition module, wherein
the acquisition module is used for acquiring a financial bill image to be identified;
the preprocessing module is used for preprocessing the financial bill image to obtain a processed image;
the model building module is used for building an improved YOLO-v3 model;
the positioning module uses the constructed improved YOLO-v3 model to locate the region of interest in the processed image, extracts the coordinate position of the region of interest, and crops out a target picture according to that coordinate position;
the cutting module performs text segmentation on the target picture to obtain a segmented picture;
and the recognition module performs text recognition on the segmented picture with a CRNN network model to obtain a bill recognition result.
7. The financial bill identification system according to claim 6, wherein the preprocessing module comprises a filtering unit, a binarization unit, a correction unit and an orientation correction unit, wherein the filtering unit performs Gaussian filtering on the financial bill image to obtain a filtered image;
the binarization unit performs binarization on the filtered image to obtain a binarized image;
the correction unit performs edge detection on the binarized image and applies a perspective transformation to the edge image to obtain a corrected image;
and the orientation correction unit corrects the orientation of the corrected image to obtain an upright financial bill image.
8. The financial bill identification system according to claim 6, wherein the model construction module comprises a Darknet-39 backbone network unit, a feature combination unit, a weighted feature fusion unit and a prediction unit, wherein
the Darknet-39 backbone network unit improves the Darknet-53 backbone network to obtain a Darknet-39 backbone network model, and extracts image features with this model to obtain feature maps from five convolution layers of different scales;
the feature combination unit optimally combines the feature maps of the five different-scale convolution layers to obtain a combined feature map;
the weighted feature fusion unit performs weighted feature fusion on the combined feature map;
and the prediction unit performs regression prediction on the fused feature maps with the YOLO-v3 algorithm to obtain a target detection result.
9. The financial bill identification system according to claim 8, wherein the Darknet-39 backbone network unit further comprises a network training unit, which adds two convolution layers to the backbone network of the conventional YOLO-v3 algorithm and uses feature maps from five convolution layers of different scales for target detection;
acquires a data set and divides it into a training set, a test set and a validation set;
and re-clusters the bounding-box coordinates on the training set with a k-means clustering algorithm, and computes 15 bounding-box coordinates for the feature maps of the five different-scale convolution layers.
10. The financial bill identification system according to claim 6, wherein the CRNN network model comprises a CNN network, a recurrent network and a transcription layer; the CNN network extracts scale-invariant features from the segmented picture to form feature maps, slices the feature maps column by column into feature sequences, and inputs them to the recurrent network;
the recurrent network consists of two bidirectional LSTM layers, maps the feature sequences input by the CNN network to classification labels, outputs an ordered label sequence whose substrings contain the true final result, and inputs the ordered label sequence to the transcription layer;
and the transcription layer integrates the ordered label sequence to obtain the recognition result.
CN202310294104.7A 2023-03-15 2023-03-15 Financial bill identification method and system Pending CN116363655A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310294104.7A CN116363655A (en) 2023-03-15 2023-03-15 Financial bill identification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310294104.7A CN116363655A (en) 2023-03-15 2023-03-15 Financial bill identification method and system

Publications (1)

Publication Number Publication Date
CN116363655A true CN116363655A (en) 2023-06-30

Family

ID=86906517

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310294104.7A Pending CN116363655A (en) 2023-03-15 2023-03-15 Financial bill identification method and system

Country Status (1)

Country Link
CN (1) CN116363655A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117422958A (en) * 2023-12-19 2024-01-19 山东工程职业技术大学 Financial data verification method and system based on deep learning
CN117422958B (en) * 2023-12-19 2024-03-19 山东工程职业技术大学 Financial data verification method and system based on deep learning

Similar Documents

Publication Publication Date Title
CN111325203B (en) American license plate recognition method and system based on image correction
CN113160192B (en) Visual sense-based snow pressing vehicle appearance defect detection method and device under complex background
CN110569878B (en) Photograph background similarity clustering method based on convolutional neural network and computer
CN108805076B (en) Method and system for extracting table characters of environmental impact evaluation report
CN110008956B (en) Invoice key information positioning method, invoice key information positioning device, computer equipment and storage medium
CN112651289B (en) Value-added tax common invoice intelligent recognition and verification system and method thereof
CN106203539B (en) Method and device for identifying container number
CN110119741A (en) A kind of card card image information recognition methods having powerful connections
CN111091124B (en) Spine character recognition method
CN106874901B (en) Driving license identification method and device
CN113963147B (en) Key information extraction method and system based on semantic segmentation
CN113780087B (en) Postal package text detection method and equipment based on deep learning
CN111461133B (en) Express delivery surface single item name identification method, device, equipment and storage medium
CN111275040A (en) Positioning method and device, electronic equipment and computer readable storage medium
CN113158895A (en) Bill identification method and device, electronic equipment and storage medium
CN113870202A (en) Far-end chip defect detection system based on deep learning technology
CN116363655A (en) Financial bill identification method and system
CN114463767A (en) Credit card identification method, device, computer equipment and storage medium
CN112686872B (en) Wood counting method based on deep learning
CN113780116A (en) Invoice classification method and device, computer equipment and storage medium
CN111914706B (en) Method and device for detecting and controlling quality of text detection output result
CN110298347B (en) Method for identifying automobile exhaust analyzer screen based on GrayWorld and PCA-CNN
CN111950556A (en) License plate printing quality detection method based on deep learning
Bouafif et al. A hybrid method for three segmentation level of handwritten Arabic script
CN107403192A (en) A kind of fast target detection method and system based on multi-categorizer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination