CN115223181A - Text detection-based method and device for recognizing characters of seal of report material

Info

Publication number: CN115223181A
Application number: CN202210700107.1A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: image, seal, stamp, character, characters
Legal status: Pending
Inventors: 林利祥, 朱以顺, 朱志芳, 吴国玥, 梁毅, 佟佳俊, 马景行
Current Assignee: Guangzhou Power Supply Bureau of Guangdong Power Grid Co Ltd
Original Assignee: Guangzhou Power Supply Bureau of Guangdong Power Grid Co Ltd
Application filed by: Guangzhou Power Supply Bureau of Guangdong Power Grid Co Ltd
Priority: CN202210700107.1A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables

Abstract

The application relates to a text detection-based method, device, storage medium and computer program product for recognizing the characters of a seal on report material. The method comprises the following steps: acquiring a seal image; preprocessing the seal image to obtain a seal area image in the seal image; inputting the seal area image into a pre-trained character detection model, and predicting the area information to which the characters in the seal area image belong; extracting a seal character area image from the seal area image based on the area information to which the characters belong; and processing the seal character area image to identify the seal characters in the seal image. Adopting the method can improve the recognition accuracy of the seal characters.

Description

Text detection-based method and device for recognizing characters of seal of report material
Technical Field
The present application relates to the field of computer technologies, and in particular, to a text detection-based method and apparatus for recognizing characters on a seal of a report material, a computer device, a storage medium, and a computer program product.
Background
With the development of computer technology, and in order to implement a company's digital transformation strategy, comprehensively improve document management quality and accelerate document auditing, the conventional approach is to run a character detection algorithm on electronic equipment to examine the relevant contents of a document (such as the seal characters in the seal area of the document).
However, most current character detection algorithms are limited when identifying the characters in a seal area: for example, they can detect only horizontal text lines and cannot detect text with a large degree of curvature, which ultimately results in poor recognition accuracy.
Disclosure of Invention
In view of the above, it is necessary to provide a text detection-based method, apparatus, computer device, computer-readable storage medium and computer program product for recognizing the characters of a seal on report material, which can improve the accuracy of recognizing the seal characters.
In a first aspect, the application provides a text detection-based method for recognizing characters of a seal of a report material. The method comprises the following steps:
acquiring a seal image;
preprocessing the stamp image to obtain a stamp area image in the stamp image;
inputting the seal area image into a pre-trained character detection model, and predicting the area information of characters in the seal area image;
extracting a stamp character area image from the stamp area image based on the area information of the characters;
and processing the stamp character area image to identify the stamp characters in the stamp image.
In one embodiment, the preprocessing the stamp image to obtain a stamp region image in the stamp image includes:
performing edge detection on the stamp image to determine a central point coordinate in the stamp image and radius information corresponding to the central point coordinate;
determining a seal area to be extracted according to the central point coordinate and the radius information;
and denoising the seal region to be extracted based on an HSI color space filtering method to obtain a seal region image in the seal image.
In one embodiment, the denoising of the seal region to be extracted based on an HSI color space filtering method to obtain a seal region image in the seal image includes:
calculating pixel parameters of all pixel points in the seal area to be extracted;
determining a seal color saturation range, a seal color intensity range and a seal color wavelength range of the HSI color space based on the pixel parameters;
and extracting pixel points which simultaneously meet the seal color saturation range, the seal color intensity range and the seal color wavelength range to obtain a seal region image in the seal image.
In one embodiment, the inputting the stamp region image to a pre-trained character detection model to predict region information to which characters in the stamp region image belong includes:
the character feature extraction network of the character detection model extracts character features of characters in the seal area image;
an encoder of the character detection model obtains, according to the position code corresponding to each character and the character features of that character, a feature vector of the character that attends to the positional relationship between the character and the other characters;
the decoder of the character detection model obtains the identifications of the various prediction boxes of the target query set according to the feature vectors of the characters and the target query set;
and the recognition head of the character detection model predicts the region information to which the characters in the stamp region image belong according to the identifications of the various prediction boxes and the feature vectors of the characters.
In one embodiment, the extracting text features of the text in the stamp area image includes:
and performing convolution and downsampling processing on the seal area image through the feature extraction network, extracting a feature map of the seal area image, and processing the feature map by using a space attention mechanism to obtain character features of each character in the seal area image.
In one embodiment, the text region information includes coordinate information of a plurality of control points;
extracting a stamp character region image from the stamp region image based on the region information to which the characters belong, wherein the stamp character region image comprises:
determining a Bezier curve based on the coordinate information of each control point;
extracting a sector ring-shaped seal character area image from the seal area image based on the Bezier curve;
and carrying out polar coordinate conversion on the fan-shaped annular seal character area image, and converting the fan-shaped annular seal character area image into a rectangular seal character area image, wherein the seal character area image is a rectangular seal character area image.
In one embodiment, the processing the stamp character region image to identify the stamp characters in the stamp image includes:
inputting the stamp character area image into a convolutional recurrent neural network;
the convolutional layer of the convolutional recurrent neural network performs convolution processing on the stamp character area image to obtain a convolutional character feature map of the stamp character area;
the recurrent layer of the convolutional recurrent neural network converts the convolutional character feature map into character feature vectors, performs feature coding on the character feature vectors to obtain a convolutional character feature sequence, and predicts on the convolutional character feature sequence to obtain a predicted label distribution;
and the transcription layer of the convolutional recurrent neural network performs sequence decoding on the predicted label distribution to identify the seal characters in the seal image.
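The sequence decoding performed by the transcription layer can be illustrated with a greedy decode. The sketch below assumes CTC-style decoding, as in the commonly used CRNN design; the source only says "sequence decoding", so the blank index, the toy alphabet and the function name `ctc_greedy_decode` are illustrative assumptions rather than the applicant's implementation.

```python
from typing import List, Sequence


def ctc_greedy_decode(label_probs: Sequence[Sequence[float]],
                      alphabet: str,
                      blank: int = 0) -> str:
    """Pick the best label per time step, merge repeats, then drop blanks."""
    best = [max(range(len(step)), key=step.__getitem__) for step in label_probs]
    chars: List[str] = []
    prev = None
    for idx in best:
        if idx != prev and idx != blank:
            chars.append(alphabet[idx - 1])  # index 0 is reserved for the blank
        prev = idx
    return "".join(chars)


# Toy predicted label distribution over [blank, 'A', 'B'] for six time steps.
probs = [
    [0.10, 0.80, 0.10],  # A
    [0.10, 0.70, 0.20],  # A again (repeat, merged with the previous step)
    [0.90, 0.05, 0.05],  # blank
    [0.20, 0.70, 0.10],  # A (a new A, because a blank separates it)
    [0.10, 0.20, 0.70],  # B
    [0.80, 0.10, 0.10],  # blank
]
decoded = ctc_greedy_decode(probs, alphabet="AB")
```

The blank label is what lets the decoder distinguish a genuinely repeated character from one character spread over several time steps.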
In a second aspect, the present application further provides a device for recognizing characters of a stamp on a report material based on text detection, the device including:
the image acquisition module is used for acquiring a seal image;
the first image processing module is used for preprocessing the stamp image to obtain a stamp area image in the stamp image;
the information acquisition module is used for inputting the stamp region image into a pre-trained character detection model and predicting the region information of characters in the stamp region image;
the second image processing module is used for extracting a stamp character area image from the stamp area image based on the area information of the characters;
and the character recognition module is used for processing the stamp character area image and recognizing the stamp characters in the stamp image.
In a third aspect, the present application further provides a computer device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the steps of the above method for recognizing characters on a seal of a report material based on text detection when executing the computer program.
In a fourth aspect, the present application further provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, realizes the steps of the above text detection-based text recognition method for a report material stamp.
In a fifth aspect, the present application further provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of the above-described text detection-based method for recognizing characters on a stamp of a report material.
According to the above text detection-based method, apparatus, computer device, storage medium and computer program product for recognizing the characters of a seal on report material, the seal image is preprocessed to obtain the seal area image, and the seal area image is input to the pre-trained character detection model, which yields the area information to which the characters belong. With this area information, the seal character area image can be accurately extracted from the seal area image, so that the seal characters in the seal image can subsequently be recognized more efficiently. This effectively reduces the work intensity of auditing personnel, improves document auditing efficiency, and promotes the digital transformation of the business.
Drawings
FIG. 1 is a schematic flow chart diagram of a method for text detection-based identification of a stamp of a reporting material in one embodiment;
FIG. 2 is a schematic diagram of a text detection model of a report material stamp identification method based on text detection in one embodiment;
FIG. 3 is a schematic diagram of a neural network model of a text detection-based method for identifying a stamp of a report material in one embodiment;
FIG. 4 is a block diagram of a reporting material stamp identification apparatus based on text detection in one embodiment;
FIG. 5 is a diagram of the internal structure of a report material computer device based on text detection in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The text detection-based method for recognizing the characters of a seal on report material can be applied to a terminal or to a server. Specifically, when auditing various services, the seal in a service file needs to be recognized so that routine official seal auditing can be completed according to the recognized seal characters. Using this method, the terminal can: acquire the seal image; preprocess the seal image to obtain a seal area image in the seal image; input the seal area image to a pre-trained character detection model and predict the area information to which the characters in the seal area image belong; extract a seal character area image from the seal area image based on that area information; and process the seal character area image to identify the seal characters in the seal image, thereby improving the seal character recognition accuracy.
The terminal can be but not limited to various personal computers, notebook computers, smart phones, tablet computers, internet of things equipment and portable wearable equipment, and the internet of things equipment can be smart sound boxes, smart televisions, smart air conditioners, smart vehicle-mounted equipment and the like. The portable wearable device can be a smart watch, a smart bracelet, a head-mounted device, and the like. The server may be implemented as a stand-alone server or as a server cluster consisting of a plurality of servers.
In one embodiment, as shown in fig. 1, a text detection-based method for recognizing characters of a seal of a report material is provided, which is described by taking the method as an example applied to a terminal, and includes the following steps:
step S102, obtaining a seal image.
The stamp image may be a document image stamped with a stamp, and the stamp image may be obtained by obtaining a text page (stamped with the stamp) and then performing image conversion on the text page.
And step S104, preprocessing the stamp image to obtain a stamp area image in the stamp image.
The preprocessing of the stamp image refers to performing image processing (such as denoising) on the stamp image to obtain a rough stamp region image.
In one example, the preprocessing the stamp image to obtain a stamp region image in the stamp image includes:
performing edge detection on the stamp image to determine a central point coordinate in the stamp image and radius information corresponding to the central point coordinate;
determining a seal area to be extracted according to the central point coordinate and the radius information;
and denoising the to-be-extracted seal region based on an HSI color space filtering method to obtain a seal region image in the seal image.
Edge detection is used to detect whether an arc-shaped region exists in the stamp image. The central point coordinate is the center of the seal region to be extracted, expressed in a rectangular coordinate system of the stamp image, and the radius is correspondingly the radius of the circular seal region to be extracted. HSI color space filtering describes the image in terms of color. Because the HSI color space is sensitive to brightness and saturation, noise in the seal region to be extracted (that is, information other than the seal itself) can be accurately separated from the seal using the HSI space model, thereby achieving denoising.
And S106, inputting the seal area image to a pre-trained character detection model, and predicting the area information of the characters in the seal area image.
After the stamp region image is obtained, it is input into the pre-trained character detection model, which performs prediction on the stamp region image to obtain the region information to which the characters in the stamp region image belong.
And S108, extracting a stamp character area image from the stamp area image based on the area information of the characters.
The stamp character region image refers to a region to which characters belong in the stamp region image, and the stamp character region image can be extracted from the stamp region image after the region information to which the characters belong is determined.
Step S110, processing the stamp character area image, and identifying the stamp characters in the stamp image.
After the stamp character area image is acquired, character recognition can be performed on it to recognize the stamp characters. Specifically, a neural network model can be used for the recognition, or any other recognition method can be used as long as the stamp characters can be recognized accurately.
According to the above text detection-based method for recognizing the characters of a seal on report material, the seal image is preprocessed to obtain the seal area image, and the seal area image is input to the pre-trained character detection model, which yields the area information to which the characters belong. With this area information, the seal character area image can be accurately extracted from the seal area image, so that the seal characters in the seal image can subsequently be recognized more efficiently. This ultimately reduces the working intensity of auditors, improves document auditing efficiency, and promotes the digital transformation of the business.
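The flow of steps S102 to S110 can be summarized as a simple chain of stages. The following is only a structural sketch: the function name `recognize_seal_text` and the four stage callables are hypothetical placeholders, and the stub stages in the usage lines stand in for the real preprocessing, detection, extraction and recognition components.

```python
from typing import Any, Callable


def recognize_seal_text(seal_image: Any,
                        preprocess: Callable[[Any], Any],
                        detect_text_region: Callable[[Any], Any],
                        extract_text_image: Callable[[Any, Any], Any],
                        recognize: Callable[[Any], str]) -> str:
    """Chain steps S104 to S110; step S102 (acquisition) supplies seal_image."""
    seal_region = preprocess(seal_image)                        # S104
    region_info = detect_text_region(seal_region)               # S106
    text_image = extract_text_image(seal_region, region_info)   # S108
    return recognize(text_image)                                # S110


# Usage with stub stages that only trace the data flow.
result = recognize_seal_text(
    "page",
    preprocess=lambda img: img + ">region",
    detect_text_region=lambda region: "ctrl-points",
    extract_text_image=lambda region, info: region + ">" + info,
    recognize=lambda img: img.upper(),
)
```

Passing the stages in as callables keeps each step replaceable, which matches the statement that any recognition method may be used in step S110.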
In one embodiment, when determining the stamp region image, the Hough transform can be applied for edge detection to determine the center point coordinate in the stamp image and the radius information corresponding to that center point coordinate, so that the stamp region to be extracted can be extracted.
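To illustrate how a Hough transform yields a center and radius, the sketch below votes over candidate (center, radius) cells for a set of edge pixels. It assumes the edge pixels have already been produced by some edge detector; the function name `hough_circle`, the candidate radius list and the angular resolution are illustrative assumptions, and a production system would more likely rely on an optimized library routine.

```python
import math
from collections import Counter
from typing import Iterable, Sequence, Tuple


def hough_circle(edge_points: Iterable[Tuple[int, int]],
                 radii: Sequence[int],
                 angle_steps: int = 72) -> Tuple[int, int, int]:
    """Return the (cx, cy, r) accumulator cell that collects the most votes."""
    votes: Counter = Counter()
    for x, y in edge_points:
        for r in radii:
            # Every center at distance r from this edge pixel is a candidate.
            cells = set()
            for k in range(angle_steps):
                t = 2.0 * math.pi * k / angle_steps
                cells.add((round(x - r * math.cos(t)),
                           round(y - r * math.sin(t)), r))
            for cell in cells:  # one vote per cell per edge pixel
                votes[cell] += 1
    return votes.most_common(1)[0][0]


# Synthetic edge pixels on a circle centred at (20, 15) with radius 10.
edge = {(round(20 + 10 * math.cos(2.0 * math.pi * k / 72)),
         round(15 + 10 * math.sin(2.0 * math.pi * k / 72)))
        for k in range(72)}
center_x, center_y, radius = hough_circle(sorted(edge), range(8, 13))
```

The returned center and radius are what the preprocessing step uses to delimit the seal region to be extracted.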
In one embodiment, the denoising processing of the stamp region to be extracted based on the HSI color space filtering method to obtain the stamp region image in the stamp image includes:
calculating pixel parameters of all pixel points in the seal area to be extracted;
determining a seal color saturation range, a seal color intensity range and a seal color wavelength range of the HSI color space based on the pixel parameters;
and extracting pixel points which simultaneously meet the seal color saturation range, the seal color intensity range and the seal color wavelength range, and obtaining the seal region image in the seal image.
A pixel's parameters depend on its color, so pixels of different colors have different seal color saturation, seal color intensity and seal color wavelength values in the HSI color space. The pixel parameters of each pixel point can therefore be calculated to obtain its component values on H (hue), S (saturation) and I (intensity, i.e. brightness), from which the seal color hue range, the seal color saturation range and the seal color intensity range are determined. For example, the elements present in a seal image may include black characters, red lines, a red seal and so on. By calculating the pixel parameters of the pixel points, the color range of the pixel points to be kept can be determined, so that pixel points outside the seal area are excluded, the screened seal pixel points are obtained, and the seal area image is finally determined.
In one embodiment, the stamp region is red, and the determined value range of the H component is [-30°, +30°], the value range of the S component is [0.1, 1], and the value range of the I component is [0.4, 0.9]. Pixel points whose H, S and I components all fall within these ranges are extracted, which removes other pixel points (such as black characters) and yields the stamp region image in the stamp image.
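The range test described above can be sketched as follows. The RGB-to-HSI conversion uses the standard textbook formulas, with the hue expressed as a signed angle so that it can be compared against [-30°, +30°] directly; the function names `rgb_to_hsi` and `is_seal_pixel` and the sample pixel values are illustrative assumptions.

```python
import math
from typing import Tuple


def rgb_to_hsi(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Convert RGB in [0, 1] to (hue in degrees in [-180, 180], saturation, intensity)."""
    i = (r + g + b) / 3.0
    s = 0.0 if i == 0 else 1.0 - min(r, g, b) / i
    den = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if den == 0:
        h = 0.0  # grey pixel: hue is undefined, 0 is used by convention
    else:
        cos_h = max(-1.0, min(1.0, 0.5 * ((r - g) + (r - b)) / den))
        h = math.degrees(math.acos(cos_h))
        if b > g:
            h = -h  # same direction as 360 - h, written as a signed angle
    return h, s, i


def is_seal_pixel(r: float, g: float, b: float) -> bool:
    """Keep pixels whose H, S and I all fall inside the ranges given in the text."""
    h, s, i = rgb_to_hsi(r, g, b)
    return -30.0 <= h <= 30.0 and 0.1 <= s <= 1.0 and 0.4 <= i <= 0.9


red_stamp = is_seal_pixel(0.9, 0.2, 0.2)      # red seal ink
black_text = is_seal_pixel(0.05, 0.05, 0.05)  # printed black characters
white_paper = is_seal_pixel(1.0, 1.0, 1.0)    # background
blue_ink = is_seal_pixel(0.2, 0.2, 0.8)       # e.g. a handwritten signature
```

Black text fails the intensity test, white paper fails the saturation test, and blue ink fails the hue test, so only the red seal pixel is kept.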
In one embodiment, after the stamp region image is obtained, it can be converted into a grayscale image and filtered with a bilateral filter. The image is then binarized using the between-class variance method (Otsu's method), and noise is further removed by erosion and dilation, which further improves the accuracy of the stamp region image.
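The binarization step can be illustrated with a small implementation of the between-class variance criterion. This sketch works on a flat list of 8-bit gray values and covers only the threshold selection, not the bilateral filtering or the morphological clean-up; the function name `otsu_threshold` and the sample data are illustrative assumptions.

```python
from typing import Sequence


def otsu_threshold(gray: Sequence[int], levels: int = 256) -> int:
    """Return the threshold t that maximises the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * levels
    for v in gray:
        hist[v] += 1
    total = len(gray)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the low class
    sum0 = 0  # sum of gray values in the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Dark strokes around 10 to 12, bright background around 200 to 210.
gray = [10] * 50 + [12] * 50 + [200] * 40 + [210] * 60
t = otsu_threshold(gray)
binary = [1 if v > t else 0 for v in gray]
```

Any threshold between the two clusters separates them equally well; the loop keeps the first one it finds.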
In one embodiment, the inputting the stamp region image to a pre-trained character detection model to predict region information to which characters in the stamp region image belong includes:
extracting character features of characters in the seal area image by the feature extraction network of the character detection model;
an encoder of the character detection model obtains, according to the position code corresponding to each character and the character features of that character, a feature vector of the character that attends to the positional relationship between the character and the other characters;
the decoder of the character detection model obtains the identifications of the various prediction boxes of the target query set according to the feature vectors of the characters and the target query set;
and the recognition head of the character detection model predicts the region information to which the characters in the seal region image belong according to the identifications of the various prediction boxes and the feature vectors of the characters.
The character features are features related to the characters in the stamp image; for example, they may be stroke features such as "horizontal, vertical, left-falling, right-falling". The position code of a character refers to the positional relationship of each character in the stamp region image, and the feature vector is a vector representation of the character features.
The target query set comprises N elements, where N is a preset hyperparameter that is far larger than the number of characters in the seal region image. The identifications of the various prediction boxes of the target query set are the identifications of the N prediction boxes obtained by inputting the target query set and the feature vectors of the characters into the multilayer perceptron. The number of identifications of each type of prediction box corresponds to the number of elements of the target query set. It can be understood that the N elements of the target query set cover all possible positions of each character in the stamp region image. The prediction boxes mark the prediction result corresponding to the seal region image.
In one embodiment, FIG. 2 shows a schematic structural diagram of the text detection model, which mainly includes four modules: a backbone network (feature extraction network) 202, an encoder 204, a decoder 206, and a prediction head (recognition head) 208.
The backbone network 202 is a classical convolutional neural network (CNN) combined with a ResNet-50 network. It can be used to extract the character features of the characters in the stamp area image, for example stroke features such as "horizontal, vertical, left-falling, right-falling".
The encoder 204 encodes the character features together with the positions corresponding to the characters to obtain the corresponding feature vectors.
The decoder 206 receives the feature vectors obtained by the encoder 204 and a target query set, and obtains the identifications of the various prediction boxes of the target query set according to the feature vectors of the characters and the target query set.
The prediction head 208 (FNN) is a 3-layer multilayer perceptron whose activation function is the Rectified Linear Unit (ReLU) and whose number of hidden nodes is d. Through the prediction head 208, each target query predicts a prediction box and a category of the target, where the prediction box has four values: the center point coordinates and the width and height of the target. The prediction head 208 can thus predict the N prediction boxes and their categories. Specifically, the recognition head predicts the matching probability between the identifications of the various prediction boxes and the characters according to those identifications and the character feature vectors, and marks the characters with the prediction box having the maximum matching probability to indicate the region information to which the characters belong.
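The final selection step can be sketched as follows: each of the N queries yields a box in (center x, center y, width, height) form together with a matching probability, and the box with the highest probability is kept. The helper names `to_corners` and `best_prediction`, the use of normalized coordinates and the sample values are illustrative assumptions; the multilayer perceptron that produces these values is not reproduced here.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]


def to_corners(box: Box) -> Box:
    """Convert (cx, cy, w, h) to (x_min, y_min, x_max, y_max)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def best_prediction(boxes: Sequence[Box],
                    match_probs: Sequence[float]) -> Tuple[int, Box]:
    """Return the index and corner form of the box with the highest matching probability."""
    idx = max(range(len(boxes)), key=lambda i: match_probs[i])
    return idx, to_corners(boxes[idx])


# Three of the N predicted boxes, in normalised image coordinates.
boxes = [(0.5, 0.5, 0.25, 0.125), (0.25, 0.75, 0.5, 0.25), (0.75, 0.25, 0.125, 0.125)]
match_probs = [0.10, 0.85, 0.30]
index, corners = best_prediction(boxes, match_probs)
```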
In one embodiment, the extracting text features of text in the stamp region image includes:
and performing convolution and downsampling processing on the stamp area image through a feature extraction network, extracting a feature map of the stamp area image, and processing the feature map by using a space attention mechanism to obtain character features of each character in the stamp area image.
Specifically, the stamp area image is convolved and downsampled by the ResNet-50 network, and three feature maps θ, φ and g are derived. The spatial attention mechanism stretches, transposes and matrix-multiplies these feature maps and normalizes the result with a softmax function to obtain the character features of each character. The processed feature map is then re-expanded and compressed, finally yielding a 32-times downsampled feature map that contains the character features of the characters in the stamp area image.
In more detail, the output of the fourth layer of the ResNet-50 network is denoted x and has dimensions [1024, H4, W4], where 1024 is the number of channels, H4 = H0/16 and W4 = W0/16, with H0 and W0 being the height and width of the input image. Three 1 × 1 convolutions are applied to this output feature map, each reducing the number of channels to half of the original, to obtain three feature maps θ, φ and g, each of dimensions [512, H4, W4].
First, each feature map is stretched to dimensions [512, H4 × W4], and the θ and g feature maps are transposed to dimensions [H4 × W4, 512]. Then θ and φ are matrix-multiplied to obtain a matrix of dimensions [H4 × W4, H4 × W4], which is normalized with a softmax function; the result is the attention score.
Next, the attention score is multiplied by the g matrix to obtain a result y of dimensions [H4 × W4, 512]. y is transposed to obtain a feature map of [512, H4 × W4], and the H4 × W4 dimension is then reshaped back to [H4, W4], giving a feature map of dimensions [512, H4, W4].
Finally, a 1 × 1 convolution is applied to this feature map to expand the channels to 1024, giving a feature map of [1024, H4, W4], which matches the dimensions of the input feature x. As the last step, x is added to the resulting feature map. The sum is fed back into the original ResNet-50 structure, which finally outputs a 32-times downsampled feature map of dimensions [2048, H0/32, W0/32].
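The stretch, transpose, multiply and softmax sequence can be traced on small matrices. The sketch below covers only the attention core on already flattened [C, N] maps (N = H4 × W4); the learned 1 × 1 convolutions and the residual addition of x are omitted, the softmax is assumed to be applied row-wise, and the function names are illustrative.

```python
import math
from typing import List

Matrix = List[List[float]]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def softmax_rows(m: Matrix) -> Matrix:
    out = []
    for row in m:
        peak = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def spatial_attention(theta: Matrix, phi: Matrix, g: Matrix) -> Matrix:
    """theta, phi, g are [C, N] maps; the result is again a [C, N] map."""
    scores = softmax_rows(matmul(transpose(theta), phi))  # [N, N] attention score
    y = matmul(scores, transpose(g))                      # [N, C]
    return transpose(y)                                   # [C, N]


# C = 2 channels, N = 3 spatial positions. A zero theta gives uniform attention,
# so every position receives the mean of g over all positions.
theta = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
phi = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
g = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
out = spatial_attention(theta, phi, g)
```

With C = 512 and N = H4 × W4 the intermediate shapes are exactly those listed in the text: [N, N] for the attention score and [N, 512] for y.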
In this embodiment, the seal area image is subjected to convolution and downsampling processing based on the ResNet-50 network to obtain three feature maps θ, Φ, and g, and then the feature maps are processed by using a spatial attention mechanism to obtain character features of characters in the seal area image.
In one embodiment, the text region information includes coordinate information of a plurality of control points;
extracting a stamp character region image from the stamp region image based on the region information to which the characters belong, wherein the stamp character region image comprises:
determining a Bezier curve based on the coordinate information of each control point;
extracting a sector ring-shaped seal character area image from the seal area image based on the Bezier curve;
and carrying out polar coordinate conversion on the fan-shaped annular seal character area image, and converting the fan-shaped annular seal character area image into a rectangular seal character area image, wherein the seal character area image is a rectangular seal character area image.
The region information to which the characters belong includes coordinate information of a plurality of control points. The control points are points that can be used to extract the stamp characters. The number of control points may be 16 or another number, and the specific number may be determined according to actual conditions.
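As an illustration of how a curve can be determined from control point coordinates, the following sketch samples a Bezier curve of arbitrary degree using the Bernstein polynomial form. It is a minimal example under assumptions: the function name `bezier_points` and the sampling density are hypothetical, and how the control points are grouped (for example, into an upper and a lower boundary curve) is not specified here, so the sketch evaluates a single curve from whatever control points it is given.

```python
import numpy as np
from math import comb

def bezier_points(control_points, num_samples=50):
    # control_points: sequence of (x, y) pairs, length n + 1 for a degree-n curve.
    p = np.asarray(control_points, dtype=float)      # [n+1, 2]
    n = len(p) - 1
    t = np.linspace(0.0, 1.0, num_samples)[:, None]  # [T, 1] curve parameter
    k = np.arange(n + 1)                             # [n+1] basis index
    binom = np.array([comb(n, i) for i in k], dtype=float)
    # Bernstein basis: C(n, k) * t^k * (1 - t)^(n - k), shape [T, n+1].
    basis = binom * t ** k * (1.0 - t) ** (n - k)
    return basis @ p                                 # [T, 2] points on the curve
```

The curve starts at the first control point and ends at the last one; the interior control points only pull the curve toward them.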
After the coordinate information of the control points is obtained, the Bezier curve corresponding to the 16 control point parameters can be determined. The stamp region image is divided, pixel by pixel, into a plurality of strips; the differences between the strips along the vertical axis are then computed, and the strips are adjusted to the same vertical coordinate value, taking the vertical coordinate of the upper-left corner as the reference, so as to obtain the sector-ring-shaped stamp character region image. Finally, polar coordinate conversion is performed on the sector-ring-shaped stamp character region image to convert it into a rectangular stamp character region image, and the stamp character region image is this rectangular stamp character region image.
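The polar coordinate conversion from a sector ring to a rectangle can be sketched as follows. This is a simplified illustration, not the procedure of this application: it assumes the ring is described by a center, an inner and outer radius, and a start and end angle (all hypothetical parameters), and it uses nearest-neighbour sampling. Angles are measured in image coordinates, where the y axis points downward.

```python
import numpy as np

def unwrap_ring(image, center, r_inner, r_outer, theta_start, theta_end,
                out_h=32, out_w=256):
    # image: [H, W] or [H, W, C]; center: (cx, cy) in pixel coordinates.
    cx, cy = center
    radii = np.linspace(r_outer, r_inner, out_h)         # top row = outer edge
    thetas = np.linspace(theta_start, theta_end, out_w)  # left to right along the arc
    rr, tt = np.meshgrid(radii, thetas, indexing="ij")   # both [out_h, out_w]
    # Polar -> Cartesian, rounded to the nearest source pixel.
    xs = np.rint(cx + rr * np.cos(tt)).astype(int)
    ys = np.rint(cy + rr * np.sin(tt)).astype(int)
    xs = np.clip(xs, 0, image.shape[1] - 1)
    ys = np.clip(ys, 0, image.shape[0] - 1)
    return image[ys, xs]                                 # [out_h, out_w, ...]
```

Each output row corresponds to one radius and each output column to one angle, so text that follows the arc of the stamp becomes a horizontal line. The default output height of 32 is chosen only to match the input height mentioned for the recognition network.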
In one embodiment, the processing the stamp character region image to identify the stamp characters in the stamp image includes:
inputting the stamp character region image into a convolutional recurrent neural network;
performing, by the convolutional layer of the convolutional recurrent neural network, convolution processing on the stamp character region image to obtain a convolutional character feature map of the stamp character region;
converting, by the recurrent layer of the convolutional recurrent neural network, the convolutional character feature map into character feature vectors, performing feature encoding on the character feature vectors to obtain a convolutional character feature sequence, and predicting on the convolutional character feature sequence to obtain a predicted label distribution;
and performing, by the transcription layer of the convolutional recurrent neural network, sequence decoding on the predicted label distribution to identify the stamp characters in the stamp image.
In one embodiment, fig. 3 is a schematic structural diagram of the convolutional recurrent neural network (CRNN), in which the convolutional layer is a seven-layer convolutional neural network adapted from the VGG16 network structure. The feature map of the text image produced by the convolutional network cannot be used directly as the input of the RNN (recurrent neural network); therefore, a sequence of feature vectors is extracted from the feature map in the form required by the RNN layer.
In the CRNN, the height of the input image is restricted to 32 pixels, and a sequence of feature vectors is extracted from the feature maps output by the convolutional layer. Each feature vector is generated column by column on the feature maps, from left to right, and each column may contain 512-dimensional features; that is, the i-th feature vector is the concatenation of the i-th column of every feature map, and these feature vectors form a sequence.
Because the max pooling layer and the activation function in the convolutional layer operate on local regions, they are translation-invariant. Each column of the feature map (that is, one feature vector) therefore corresponds to one rectangular region (the receptive field) of the original image, and these rectangular regions are ordered from left to right in the same way as their corresponding columns on the feature map. Each feature vector represents the features over a certain width of the image, for example a width of 1, that is, a single pixel. The sequence of feature vectors is used as the input of the recurrent layer, with each feature vector fed to the RNN at one time step.
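The conversion from a convolutional feature map to a sequence of per-column feature vectors can be written as a single transpose and reshape. The sketch below is illustrative only; the function name `map_to_sequence` and the [C, H, W] layout are assumptions made for the example.

```python
import numpy as np

def map_to_sequence(feature_map):
    # feature_map: [C, H, W] output of the convolutional layers.
    c, h, w = feature_map.shape
    # Move the width axis first, then flatten channels and height:
    # [C, H, W] -> [W, C, H] -> [W, C*H]. Row i is the i-th column of every map.
    return feature_map.transpose(2, 0, 1).reshape(w, c * h)
```

For a feature map of shape [512, 1, W], this yields W feature vectors of 512 dimensions each, ordered from left to right, one per RNN time step.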
To recognize text of indefinite length, the RNN processes text information of arbitrary length sequentially, in time order. Compared with the traditional approach of recognizing characters after segmentation, this sequence-based recognition method has higher accuracy and precision. For example, on the special long text "forest, lake and moon are long and night", the former is easily misrecognized as "wood", "pair", "forest", "staphyl", "moon", "bloat" and "night", whereas the recognition result of the latter better matches the characters in the original picture. However, the structure of a plain RNN can cause vanishing and exploding gradients, and in particular loses much of the text information when long-range dependencies are involved; LSTM is therefore introduced into the network for feature encoding to solve this problem. Because context information from the forward and backward directions is complementary and helps the prediction of the sequence, the CRNN uses a bidirectional LSTM (BLSTM) for sequence encoding; that is, two LSTMs are combined into one BLSTM, and BLSTMs are stacked to form a deep bidirectional RNN structure.
In the RNN, one feature vector is input at each time step, and a softmax probability distribution over all characters is predicted for it and used as the input of the CTC layer. For example, if 40 feature vectors are input, the final output is a posterior probability matrix formed by 40 vectors whose length equals the number of character categories. This matrix is passed to the transcription layer, which performs sequence decoding on the predicted label distribution and outputs the prediction result.
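One common way to decode such a posterior probability matrix is greedy (best-path) CTC decoding: take the most probable label at each time step, merge consecutive repeats, and drop the blank label. The text above does not state which decoding strategy the transcription layer uses, so the sketch below is an assumption; the function name, the blank index of 0, and the toy character set are hypothetical.

```python
import numpy as np

def ctc_greedy_decode(probs, charset, blank=0):
    # probs: [T, num_classes] per-time-step probability distribution.
    # charset: sequence mapping class index -> character; charset[blank] is unused.
    best_path = np.argmax(probs, axis=1)  # most probable label per time step
    decoded = []
    previous = blank
    for index in best_path:
        # Emit a character only when it is not blank and not a repeat
        # of the label at the previous time step.
        if index != blank and index != previous:
            decoded.append(charset[index])
        previous = index
    return "".join(decoded)
```

A blank between two identical labels keeps them as two separate characters, which is how repeated characters in the text survive the merging of repeats.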
It should be understood that, although the steps in the flowcharts of the above embodiments are displayed in the sequence indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated herein, the steps are not strictly limited to the illustrated order and may be performed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; the sub-steps or stages are not necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the application also provides a stamp character recognition device for realizing the stamp character recognition method. The implementation scheme for solving the problem provided by the device is similar to the implementation scheme recorded in the method, so the specific limitations in one or more embodiments of the stamp character recognition device provided below can refer to the limitations on the stamp character recognition method in the above, and details are not repeated here.
In one embodiment, as shown in FIG. 4, there is provided a text detection-based device for recognizing characters of a seal of a report material, comprising an image acquisition module, a first image processing module, an information acquisition module, a second image processing module and a character recognition module, wherein:
The image acquisition module 402 is configured to acquire a stamp image.
The first image processing module 404 is configured to preprocess the stamp image to obtain a stamp region image in the stamp image.
The information acquisition module 406 is configured to input the stamp region image to a pre-trained character detection model and predict the region information to which the characters in the stamp region image belong.
The second image processing module 408 is configured to extract a stamp character region image from the stamp region image based on the region information to which the characters belong.
The character recognition module 410 is configured to process the stamp character region image and recognize the stamp characters in the stamp image.
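The way the five modules hand data to one another can be summarized in a short structural sketch. It is purely illustrative: the class name, field names, and the use of plain callables are assumptions for the example and do not describe how the device is actually implemented.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class StampRecognitionPipeline:
    acquire: Callable[[Any], Any]                  # image acquisition module (402)
    preprocess: Callable[[Any], Any]               # first image processing module (404)
    detect_regions: Callable[[Any], Any]           # information acquisition module (406)
    extract_text_image: Callable[[Any, Any], Any]  # second image processing module (408)
    recognize: Callable[[Any], str]                # character recognition module (410)

    def run(self, source: Any) -> str:
        stamp_image = self.acquire(source)
        region_image = self.preprocess(stamp_image)
        region_info = self.detect_regions(region_image)
        # The second image processing module needs both the region image
        # and the region information predicted from it.
        text_image = self.extract_text_image(region_image, region_info)
        return self.recognize(text_image)
```

Keeping each module behind a plain callable makes it possible to exercise the data flow with stubs before any model is available.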
In one embodiment, the first image processing module is configured to perform edge detection on the stamp image, and determine a center point coordinate in the stamp image and radius information corresponding to the center point coordinate; determining a seal area to be extracted according to the central point coordinate and the radius information; and denoising the to-be-extracted seal region based on an HSI color space filtering method to obtain a seal region image in the seal image.
In one embodiment, the first image processing module is configured to calculate a pixel parameter of each pixel point in the to-be-extracted stamp region; determine a seal color saturation range, a seal color intensity range and a seal color wavelength range of the HSI color space based on the pixel parameters; and extract the pixel points that simultaneously satisfy the seal color saturation range, the seal color intensity range and the seal color wavelength range, to obtain a stamp region image in the stamp image.
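A range filter in HSI space can be sketched as below, using the standard RGB-to-HSI conversion. Several points are assumptions made for illustration: the "wavelength range" is interpreted as a hue range, the seal is assumed to be red (so the hue range wraps around 0), and all threshold values and function names are hypothetical rather than taken from this application.

```python
import numpy as np

def rgb_to_hsi(rgb):
    # rgb: [..., 3] array with values in 0..255. Returns hue (radians), saturation, intensity.
    rgb = np.asarray(rgb, dtype=float) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    eps = 1e-8
    intensity = (r + g + b) / 3.0
    saturation = 1.0 - np.minimum(np.minimum(r, g), b) / (intensity + eps)
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt(np.maximum((r - g) ** 2 + (r - b) * (g - b), 0.0)) + eps
    hue = np.arccos(np.clip(num / den, -1.0, 1.0))
    hue = np.where(b > g, 2.0 * np.pi - hue, hue)  # lower half of the hue circle
    return hue, saturation, intensity

def stamp_mask(rgb, hue_max=np.pi / 6, sat_min=0.3, int_min=0.2, int_max=0.95):
    # Keep pixels whose hue is near red and whose saturation and intensity
    # fall inside the (assumed) seal color ranges.
    hue, saturation, intensity = rgb_to_hsi(rgb)
    red_hue = (hue <= hue_max) | (hue >= 2.0 * np.pi - hue_max)
    return red_hue & (saturation >= sat_min) & (intensity >= int_min) & (intensity <= int_max)
```

The three conditions are combined with a logical AND, mirroring the requirement that a pixel satisfy all three ranges at the same time.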
In one embodiment, the information acquisition module is configured such that: the feature extraction network of the character detection model extracts character features of the characters in the stamp region image; the encoder of the character detection model obtains, according to the position code corresponding to each character and the character features of that character, a feature vector of the character that attends to the positional relationship between the character and other characters; the decoder of the character detection model obtains the identifiers of the various prediction boxes of the target query set according to the feature vectors of the characters and the target query set; and the recognition head of the character detection model predicts the region information to which the characters in the stamp region image belong according to the identifiers of the various prediction boxes and the feature vectors of the characters.
In one embodiment, the information obtaining module is configured to perform convolution and downsampling on the stamp region image through the feature extraction network, extract a feature map of the stamp region image, and process the feature map by using a spatial attention mechanism to obtain character features of each character in the stamp region image.
In one embodiment, the second image processing module is configured to determine a Bezier curve based on the coordinate information of each of the control points; extract a sector-ring-shaped stamp character region image from the stamp region image based on the Bezier curve; and perform polar coordinate conversion on the sector-ring-shaped stamp character region image to convert it into a rectangular stamp character region image, wherein the stamp character region image is the rectangular stamp character region image.
In one embodiment, the character recognition module is configured to input the stamp character region image into a convolutional recurrent neural network; the convolutional layer of the convolutional recurrent neural network performs convolution processing on the stamp character region image to obtain a convolutional character feature map of the stamp character region; the recurrent layer of the convolutional recurrent neural network converts the convolutional character feature map into character feature vectors, performs feature encoding on the character feature vectors to obtain a convolutional character feature sequence, and predicts on the convolutional character feature sequence to obtain a predicted label distribution; and the transcription layer of the convolutional recurrent neural network performs sequence decoding on the predicted label distribution to identify the stamp characters in the stamp image.
The modules in the above stamp character recognition device may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 5. The computer device comprises a processor, a memory, a communication interface, a display screen and an input device which are connected through a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program is executed by a processor to realize a seal character recognition method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the configuration shown in fig. 5 is a block diagram of only the part of the configuration related to the present application and does not limit the computer device to which the present application may be applied; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the above text detection based text recognition method for reporting material stamps when executing the computer program.
In one embodiment, a computer readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, performs the steps of the above-described text detection-based text recognition method for a reporting material stamp.
In one embodiment, a computer program product is provided that includes a computer program that when executed by a processor performs the steps of the above-described text detection-based text recognition method for a report material stamp.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include a Read-Only Memory (ROM), a magnetic tape, a floppy disk, a flash memory, an optical memory, a high-density embedded non-volatile memory, a Resistive Random Access Memory (ReRAM), a Magnetic Random Access Memory (MRAM), a Ferroelectric Random Access Memory (FRAM), a Phase Change Memory (PCM), a graphene memory, and the like. The volatile memory may include a Random Access Memory (RAM), an external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided herein may include at least one of relational and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database. The processors referred to in the embodiments provided herein may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like.
All possible combinations of the technical features in the above embodiments may not be described for the sake of brevity, but should be considered as being within the scope of the present disclosure as long as there is no contradiction between the combinations of the technical features.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (10)

1. A text detection-based method for recognizing characters on a seal of a report material is characterized by comprising the following steps:
acquiring a seal image;
preprocessing the stamp image to obtain a stamp area image in the stamp image;
inputting the seal area image into a pre-trained character detection model, and predicting the area information of characters in the seal area image;
extracting a stamp character area image from the stamp area image based on the area information of the characters;
and processing the stamp character area image to identify the stamp characters in the stamp image.
2. The method according to claim 1, wherein said preprocessing said stamp image to obtain a stamp region image in said stamp image comprises:
performing edge detection on the stamp image to determine a central point coordinate in the stamp image and radius information corresponding to the central point coordinate;
determining a seal area to be extracted according to the central point coordinate and the radius information;
and denoising the to-be-extracted seal region based on an HSI color space filtering method to obtain a seal region image in the seal image.
3. The method according to claim 2, wherein the denoising processing is performed on the stamp region to be extracted based on an HSI color space filtering method to obtain a stamp region image in the stamp image, and the denoising processing comprises:
calculating pixel parameters of all pixel points in the seal area to be extracted;
determining a seal color saturation range, a seal color intensity range and a seal color wavelength range of the HSI color space based on the pixel parameters;
and extracting pixel points which simultaneously meet the seal color saturation range, the seal color intensity range and the seal color wavelength range to obtain a seal region image in the seal image.
4. The method according to claim 1, wherein the inputting the stamp region image to a pre-trained character detection model to predict region information to which characters in the stamp region image belong comprises:
extracting character features of characters in the seal area image by the feature extraction network of the character detection model;
an encoder of the character detection model obtains a feature vector of the character, which pays attention to the position relation between the character and other characters, according to the position code corresponding to the character and the character features of the character;
the decoder of the character detection model obtains the identifications of various prediction frames of the target query set according to the feature vectors of the characters and the target query set;
and the recognition head of the character detection model predicts the region information of the characters in the stamp region image according to the identifications of the various prediction frames and the feature vectors of the characters.
5. The method according to claim 4, wherein the extracting text features of the text in the stamp area image comprises:
and performing convolution and downsampling processing on the seal area image through the feature extraction network, extracting a feature map of the seal area image, and processing the feature map by using a space attention mechanism to obtain character features of each character in the seal area image.
6. The method according to claim 4, wherein the text region information includes coordinate information of a plurality of control points;
extracting a stamp character region image from the stamp region image based on the region information to which the characters belong, wherein the stamp character region image comprises:
determining a Bezier curve based on the coordinate information of each control point;
extracting a sector-ring-shaped seal character area image from the seal area image based on the Bezier curve;
and carrying out polar coordinate conversion on the sector-ring-shaped seal character area image, and converting the sector-ring-shaped seal character area image into a rectangular seal character area image, wherein the seal character area image is the rectangular seal character area image.
7. The method according to claim 1, wherein said processing said stamp text region image to identify stamp text in said stamp image comprises:
inputting the stamp character area image into a convolutional recurrent neural network;
carrying out convolution processing on the stamp character area image by the convolutional layer of the convolutional recurrent neural network to obtain a convolutional character feature map of the stamp character area;
the recurrent layer of the convolutional recurrent neural network converts the convolutional character feature map into character feature vectors, performs feature coding on the character feature vectors to obtain a convolutional character feature sequence, and predicts the convolutional character feature sequence to obtain a predicted label distribution;
and the transcription layer of the convolutional recurrent neural network carries out sequence decoding on the predicted label distribution, and the seal characters in the seal image are identified.
8. A device for recognizing characters of a seal of a report material based on text detection, said device comprising:
the image acquisition module is used for acquiring a seal image;
the first image processing module is used for preprocessing the seal image to obtain a seal area image in the seal image;
the information acquisition module is used for inputting the seal area image to a pre-trained character detection model and predicting the area information of characters in the seal area image;
the second image processing module is used for extracting a stamp character area image from the stamp area image based on the area information of the characters;
and the character recognition module is used for processing the stamp character area image and recognizing the stamp characters in the stamp image.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202210700107.1A 2022-06-20 2022-06-20 Text detection-based method and device for recognizing characters of seal of report material Pending CN115223181A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210700107.1A CN115223181A (en) 2022-06-20 2022-06-20 Text detection-based method and device for recognizing characters of seal of report material

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210700107.1A CN115223181A (en) 2022-06-20 2022-06-20 Text detection-based method and device for recognizing characters of seal of report material

Publications (1)

Publication Number Publication Date
CN115223181A true CN115223181A (en) 2022-10-21

Family

ID=83607064

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210700107.1A Pending CN115223181A (en) 2022-06-20 2022-06-20 Text detection-based method and device for recognizing characters of seal of report material

Country Status (1)

Country Link
CN (1) CN115223181A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117422958A (en) * 2023-12-19 2024-01-19 山东工程职业技术大学 Financial data verification method and system based on deep learning
CN117422958B (en) * 2023-12-19 2024-03-19 山东工程职业技术大学 Financial data verification method and system based on deep learning

Similar Documents

Publication Publication Date Title
Liu et al. Automatic building extraction on high-resolution remote sensing imagery using deep convolutional encoder-decoder with spatial pyramid pooling
CN109657582B (en) Face emotion recognition method and device, computer equipment and storage medium
CN104661037B (en) The detection method and system that compression image quantization table is distorted
CN111860233B (en) SAR image complex building extraction method and system based on attention network selection
Li et al. DMNet: A network architecture using dilated convolution and multiscale mechanisms for spatiotemporal fusion of remote sensing images
US20220084165A1 (en) System and method for single-modal or multi-modal style transfer and system for random stylization using the same
CN116797787B (en) Remote sensing image semantic segmentation method based on cross-modal fusion and graph neural network
CN114387289B (en) Semantic segmentation method and device for three-dimensional point cloud of power transmission and distribution overhead line
CN113887472A (en) Remote sensing image cloud detection method based on cascade color and texture feature attention
CN109784154B (en) Emotion recognition method, device, equipment and medium based on deep neural network
CN115223181A (en) Text detection-based method and device for recognizing characters of seal of report material
CN113657225B (en) Target detection method
CN117409330B (en) Aquatic vegetation identification method, aquatic vegetation identification device, computer equipment and storage medium
Pathak et al. Content‐based image retrieval for super‐resolutioned images using feature fusion: Deep learning and hand crafted
Wang et al. An unsupervised heterogeneous change detection method based on image translation network and post-processing algorithm
CN111274936B (en) Multispectral image ground object classification method, system, medium and terminal
Pavithra et al. Texture image classification and retrieval using multi-resolution radial gradient binary pattern
Zhang et al. Blind image quality assessment based on local quantized pattern
Luo et al. A full-scale hierarchical encoder-decoder network with cascading edge-prior for infrared and visible image fusion
CN110781884A (en) Method for realizing intelligent reading of electric meter data
CN117314756B (en) Verification and protection method and device based on remote sensing image, computer equipment and storage medium
Li et al. Bisupervised network with pyramid pooling module for land cover classification of satellite remote sensing imagery
CN112287940B (en) Semantic segmentation method of attention mechanism based on deep learning
CN115620013B (en) Semantic segmentation method and device, computer equipment and computer readable storage medium
Yu-Dong et al. Image Quality Predictor with Highly Efficient Fully Convolutional Neural Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination