CN114067192A - Character recognition method and system - Google Patents

Character recognition method and system

Info

Publication number
CN114067192A
Authority
CN
China
Prior art keywords
image
character
background
area
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210012424.4A
Other languages
Chinese (zh)
Inventor
许占林
张宏杰
张健
刘树
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xuxianwang Technology Development Co ltd
Original Assignee
Beijing Xuxianwang Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xuxianwang Technology Development Co ltd filed Critical Beijing Xuxianwang Technology Development Co ltd
Priority to CN202210012424.4A priority Critical patent/CN114067192A/en
Publication of CN114067192A publication Critical patent/CN114067192A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Abstract

The invention relates to the technical field of image recognition, and in particular to a character recognition method and a character recognition system, wherein the method comprises the following steps: acquiring an image to be recognized and preprocessing it to obtain an enhanced image; uploading the enhanced image and extracting the character area in the enhanced image to obtain a character area image; performing background processing on the character area image to obtain a background-free character image; and cleaning the line segments and points of the background-free character image and recognizing them with a convolutional neural network to obtain a character recognition result. According to the invention, preprocessing and enhancing the image makes the text content it contains cleaner; after the background is removed, recognition with a convolutional neural network yields the character recognition result. The recognition accuracy is high, and not only printed characters but also non-printed characters can be accurately recognized.

Description

Character recognition method and system
Technical Field
The invention belongs to the technical field of image recognition, and particularly relates to a character recognition method and system.
Background
The technology of automatically recognizing characters by computer is an important field of pattern recognition application. In production and daily life, people need to process large volumes of documents, reports and text. To reduce manual labor and improve processing efficiency, general character recognition methods began to be explored and optical character readers were developed.
With the development and progress of science and technology, character recognition has been widely applied; for example, some software can recognize the text contained in a picture by processing a screenshot, and for printed characters the recognition results are very accurate.
However, such methods struggle to achieve satisfactory recognition accuracy on non-printed characters, so a method capable of recognizing non-printed characters is needed to solve this problem.
Disclosure of Invention
An embodiment of the present invention is directed to providing a character recognition method, and aims to solve the problem set forth in the background art.
The embodiment of the invention is realized in such a way that a character recognition method comprises the following steps:
acquiring an image to be identified, and preprocessing the image to be identified to obtain an enhanced image;
uploading the enhanced image, and extracting the character area in the enhanced image to obtain a character area image;
carrying out background processing on the character area image to obtain a background-free character image;
and cleaning the line segment and the point of the background-free character image, and identifying by using a convolutional neural network to obtain a character identification result.
Preferably, the step of uploading the enhanced image and extracting the text region in the enhanced image to obtain the text region image specifically includes:
training an artificial neural network, performing target recognition on the enhanced image by using the trained artificial neural network, and extracting a target area image;
and cutting the target area image to obtain a character area image.
Preferably, the step of performing background processing on the character area image to obtain a background-free character image specifically includes: binarizing the character picture using automatic thresholding and region-wise processing, and removing the background using the binarization result as a mask to obtain a background-free character image.
Preferably, the step of cleaning the line segments and points of the background-free character image and recognizing them with a convolutional neural network to obtain a character recognition result specifically includes: performing Hough line fitting on the points of the line segments, determining the effective area of each point after fitting, eliminating points whose effective area is smaller than a preset value, and recognizing with a convolutional neural network to obtain a character recognition result.
Preferably, the preset value is 0.01.
Preferably, the preprocessing step at least includes performing noise filtering processing and color enhancement processing.
Preferably, the image after the binarization processing includes a black area and a white area, wherein the black area is a background to be removed.
Another object of an embodiment of the present invention is to provide a text recognition system, including:
the image acquisition module is used for acquiring an image to be identified and preprocessing the image to be identified to obtain an enhanced image;
the image enhancement module is used for uploading an enhanced image and extracting a character area in the enhanced image to obtain a character area image;
the background processing module is used for carrying out background processing on the character area image to obtain a background-free character image;
and the character recognition module is used for cleaning the line segments and the points of the background-free character images, and recognizing by utilizing a convolutional neural network to obtain a character recognition result.
According to the character recognition method provided by the embodiment of the invention, preprocessing and enhancing the image makes the character content it contains cleaner; after the background is removed, recognition with a convolutional neural network yields the character recognition result. The recognition accuracy is high, and not only printed characters but also non-printed characters can be accurately recognized.
Drawings
Fig. 1 is a flowchart of a text recognition method according to an embodiment of the present invention;
fig. 2 is an architecture diagram of a text recognition system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Specific implementations of the present invention are described in detail below with reference to specific embodiments.
As shown in fig. 1, which is a flowchart of a text recognition method provided in an embodiment of the present invention, the method includes:
and S100, acquiring an image to be identified, and preprocessing the image to be identified to obtain an enhanced image.
In this step, the image to be recognized is captured; any device with an image capture function may be used, such as a mobile phone, tablet computer or camera. An automatically binarized image is then used for noise filtering, and color enhancement is performed on the effective part of the original image.
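The embodiment does not specify the exact preprocessing filters. Purely as an illustrative sketch, a median filter (for noise) followed by a linear contrast stretch (for enhancing the effective part) could look like this; `preprocess` is a name introduced here for illustration:

```python
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    """Denoise with a 3x3 median filter, then stretch contrast to [0, 255]."""
    # Pad with edge values so the output has the same shape as the input.
    padded = np.pad(image, 1, mode="edge")
    # 3x3 median filter: gather the 9 neighbours of every pixel.
    windows = np.stack([padded[r:r + image.shape[0], c:c + image.shape[1]]
                        for r in range(3) for c in range(3)])
    denoised = np.median(windows, axis=0)
    # Linear contrast stretch (the "color enhancement" of the effective part).
    lo, hi = denoised.min(), denoised.max()
    if hi == lo:
        return denoised.astype(np.uint8)
    enhanced = (denoised - lo) / (hi - lo) * 255.0
    return enhanced.astype(np.uint8)
```

In practice a library routine (e.g. a median blur from an image-processing library) would replace the hand-rolled filter; the sketch only shows the shape of the step.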
And S200, uploading the enhanced image, and extracting the character area in the enhanced image to obtain a character area image.
In this step, the enhanced image is uploaded to a server via HTTP. An artificial neural network for target recognition is trained; the trained network then performs target recognition on the current image, the most central target is automatically selected, and the target is cropped by its bounding box to extract the individual characters. That is, an artificial neural network is trained, the trained artificial neural network performs target recognition on the enhanced image, and a target area image is extracted; the target area image is then cropped to obtain the character area image.
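The detector itself is not specified in this embodiment. Assuming detections are already available as (x, y, w, h) bounding boxes from some trained network, the "take the most central target and crop it" step might be sketched as follows (`crop_most_central` is a hypothetical helper introduced here):

```python
import numpy as np

def crop_most_central(image, boxes):
    """Pick the detection box whose centre is closest to the image centre
    and crop it out, mirroring the 'take the most central target' step.
    `boxes` are (x, y, w, h) detections from some trained detector (assumed)."""
    img_cy, img_cx = image.shape[0] / 2, image.shape[1] / 2

    def dist(box):
        x, y, w, h = box
        return (x + w / 2 - img_cx) ** 2 + (y + h / 2 - img_cy) ** 2

    x, y, w, h = min(boxes, key=dist)
    return image[y:y + h, x:x + w]
```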
And S300, carrying out background processing on the character area image to obtain a background-free character image.
In this step, the character picture is binarized using automatic thresholding and region-wise processing, and the background is removed using the binarization result as a mask to obtain a background-free character image.
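The embodiment does not name the automatic thresholding method; Otsu's method is a common choice and is assumed here purely for illustration. A minimal sketch of binarizing and then masking out the background:

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Automatic threshold via Otsu's method: maximise between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = gray.size
    cum_count = np.cumsum(hist)
    cum_sum = np.cumsum(hist * np.arange(256))
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = cum_count[t - 1], total - cum_count[t - 1]
        if w0 == 0 or w1 == 0:
            continue
        mu0 = cum_sum[t - 1] / w0
        mu1 = (cum_sum[255] - cum_sum[t - 1]) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def remove_background(gray: np.ndarray) -> np.ndarray:
    """Binarise, then use the binary image as a mask to blank the background."""
    t = otsu_threshold(gray)
    mask = gray >= t          # foreground (e.g. character strokes) kept
    return np.where(mask, gray, 0)
```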
S400, cleaning the line segment and the point of the background-free character image, and identifying by using a convolutional neural network to obtain a character identification result.
In this step, Hough line detection is used. Its basic principle exploits the duality of points and lines in the line detection task: a straight line in image space corresponds one-to-one to a point in parameter space, and a straight line in parameter space corresponds one-to-one to a point in image space. This yields two very useful conclusions:
1) each line in the image space is represented in the parameter space corresponding to a single point;
2) any part of line segments on the straight line in the image space correspond to the same point in the parameter space.
Therefore, the Hough line detection algorithm converts the line detection problem in image space into a point detection problem in parameter space, and completes the line detection task by searching for peaks in parameter space.
Automatic thresholding: a calculation method is selected according to the color distribution and variation in the picture and the specific segmentation requirement, the segmentation threshold is calculated, and each picture is segmented with its own threshold.
The Hough transform subdivides the parameter space into accumulator cells and maps each point on a line into parameter space, so that the line produces a high count in one accumulator cell of the parameter plane; the parameters of that cell are then mapped back to the original line. The Hough transform can thus find straight lines in a picture: for each pixel in the image, a curve is drawn in parameter space that represents all the straight lines passing through that pixel. An intersection of these curves represents a line that passes through all of the corresponding pixels; in effect, the algorithm looks for a straight line that passes through as many valid pixels as possible. If a line passing through a sufficiently large number of valid pixels is found (this count is the threshold to be tuned), a straight line has been detected in the image.
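As an illustrative sketch of the accumulator idea described above (not the embodiment's actual implementation), a minimal rho-theta Hough accumulator can be written as follows; each point votes for all lines through it, and the peak cell recovers the dominant line's parameters:

```python
import numpy as np

def hough_peak(points, n_theta=180, rho_max=50):
    """Vote each edge point into a (rho, theta) accumulator and return the
    peak cell -- the line passed by the most points (point/line duality)."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    # rho resolution of 1 pixel; rho in [-rho_max, rho_max] mapped to rows.
    acc = np.zeros((2 * rho_max + 1, n_theta), dtype=int)
    for x, y in points:
        # All lines through (x, y): rho = x*cos(theta) + y*sin(theta).
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + rho_max, np.arange(n_theta)] += 1
    r_idx, t_idx = np.unravel_index(np.argmax(acc), acc.shape)
    return r_idx - rho_max, thetas[t_idx]
```

In practice a library routine such as a probabilistic Hough transform would be used; the sketch only shows how the peak search replaces per-line testing.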
Region-wise processing: different regions are made into selections and processed separately, with smooth transitions coordinating the various parameters. Because the pixel distribution of every picture is different, each picture must be segmented with its own thresholds in a targeted manner.
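The embodiment does not give the exact per-region rule. As a sketch under the assumption that each tile is thresholded at its own mean, region-wise binarization might look like this (`tiled_binarize` is a name introduced here for illustration):

```python
import numpy as np

def tiled_binarize(gray, tile=8):
    """Binarise each tile with its own threshold (the tile's mean), so regions
    with different lighting or colour distributions are segmented independently."""
    h, w = gray.shape
    out = np.zeros_like(gray, dtype=np.uint8)
    for r in range(0, h, tile):
        for c in range(0, w, tile):
            block = gray[r:r + tile, c:c + tile]
            t = block.mean()                       # per-region threshold
            out[r:r + tile, c:c + tile] = (block > t).astype(np.uint8) * 255
    return out
```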
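The embodiment does not specify the architecture of the convolutional neural network used for the final recognition. Purely as an illustration of the forward computation such a classifier performs (convolution, ReLU, max-pooling, a dense layer and softmax), and not the embodiment's actual network, a minimal sketch is:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2D cross-correlation, the core operation of a CNN layer."""
    kh, kw = kernel.shape
    h, w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def forward(img, kernel, weights, bias):
    """conv -> ReLU -> 2x2 max-pool -> flatten -> dense -> softmax scores."""
    x = np.maximum(conv2d(img, kernel), 0.0)              # ReLU activation
    h2, w2 = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    pooled = x[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2).max(axis=(1, 3))
    logits = pooled.ravel() @ weights + bias              # dense layer
    e = np.exp(logits - logits.max())
    return e / e.sum()                                    # class probabilities
```

Training the kernel and weights (e.g. by backpropagation on labelled character images) is out of scope for this sketch.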
As shown in fig. 2, a text recognition system provided in an embodiment of the present invention includes:
the image acquisition module 100 is configured to acquire an image to be identified, and preprocess the image to be identified to obtain an enhanced image.
In the system, the image to be recognized is captured; any device with an image capture function may be used, such as a mobile phone, tablet computer or camera. An automatically binarized image is then used for noise filtering and for color enhancement of the effective part of the original image.
The image enhancement module 200 is configured to upload an enhanced image, and extract a text region in the enhanced image to obtain a text region image.
In the system, the image enhancement module 200 uploads the enhanced image to the server via HTTP. An artificial neural network for target recognition is trained, the trained network performs target recognition on the current image, and the most central target is automatically extracted.
The background processing module 300 is configured to perform background processing on the text region image to obtain a background-free text image.
In the system, the background processing module 300 binarizes the character picture using automatic thresholding and region-wise processing, and removes the background using the binarization result as a mask to obtain a background-free character image.
And the character recognition module 400 is used for cleaning the line segments and the points of the background-free character image, and recognizing by using a convolutional neural network to obtain a character recognition result.
In the system, the character recognition module 400 performs Hough line fitting on the points of the line segments, determines the effective area of each point after fitting, eliminates points whose effective area is smaller than a preset value, and performs recognition with a convolutional neural network to obtain a character recognition result, wherein the preset value is 0.01.
It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in sequence as indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the steps are not strictly limited in order and may be performed in other orders. Moreover, at least a portion of the steps in various embodiments may include multiple sub-steps or stages that are not necessarily performed at the same time but may be performed at different times, and these sub-steps or stages need not be performed sequentially; they may be performed in turn or alternately with other steps, or with at least a portion of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium, and can include the processes of the embodiments of the methods described above when the program is executed. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (8)

1. A method for recognizing a character, the method comprising:
acquiring an image to be identified, and preprocessing the image to be identified to obtain an enhanced image;
uploading the enhanced image, and extracting the character area in the enhanced image to obtain a character area image;
carrying out background processing on the character area image to obtain a background-free character image;
and cleaning the line segment and the point of the background-free character image, and identifying by using a convolutional neural network to obtain a character identification result.
2. The character recognition method according to claim 1, wherein the step of uploading the enhanced image and extracting the character region in the enhanced image to obtain the character region image specifically comprises:
training an artificial neural network, performing target recognition on the enhanced image by using the trained artificial neural network, and extracting a target area image;
and cutting the target area image to obtain a character area image.
3. The character recognition method of claim 1, wherein the step of performing background processing on the character region image to obtain a background-free character image specifically comprises: binarizing the character picture using automatic thresholding and region-wise processing, and removing the background using the binarization result as a mask to obtain a background-free character image.
4. The method according to claim 1, wherein the step of cleaning the line segments and points of the background-free character image and recognizing them with a convolutional neural network to obtain a character recognition result specifically comprises: performing Hough line fitting on the points of the line segments, determining the effective area of each point after fitting, eliminating points whose effective area is smaller than a preset value, and recognizing with a convolutional neural network to obtain a character recognition result.
5. The character recognition method of claim 4, wherein the preset value is 0.01.
6. The method of claim 1, wherein the preprocessing step comprises at least noise filtering and color enhancement.
7. The character recognition method of claim 3, wherein the image after the binarization processing comprises a black area and a white area, wherein the black area is a background to be removed.
8. A character recognition system, the system comprising:
the image acquisition module is used for acquiring an image to be identified and preprocessing the image to be identified to obtain an enhanced image;
the image enhancement module is used for uploading an enhanced image and extracting a character area in the enhanced image to obtain a character area image;
the background processing module is used for carrying out background processing on the character area image to obtain a background-free character image;
and the character recognition module is used for cleaning the line segments and the points of the background-free character images, and recognizing by utilizing a convolutional neural network to obtain a character recognition result.
CN202210012424.4A 2022-01-07 2022-01-07 Character recognition method and system Pending CN114067192A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210012424.4A CN114067192A (en) 2022-01-07 2022-01-07 Character recognition method and system


Publications (1)

Publication Number Publication Date
CN114067192A true CN114067192A (en) 2022-02-18

Family

ID=80230722

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210012424.4A Pending CN114067192A (en) 2022-01-07 2022-01-07 Character recognition method and system

Country Status (1)

Country Link
CN (1) CN114067192A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105574513A (en) * 2015-12-22 2016-05-11 北京旷视科技有限公司 Character detection method and device
CN108229483A (en) * 2018-01-11 2018-06-29 中国计量大学 Based on the doorplate pressed characters identification device under caffe and soft triggering
CN109766898A (en) * 2018-12-26 2019-05-17 平安科技(深圳)有限公司 Image character recognition method, device, computer equipment and storage medium
CN110210297A (en) * 2019-04-25 2019-09-06 上海海事大学 The method declaring at customs the positioning of single image Chinese word and extracting
CN113158808A (en) * 2021-03-24 2021-07-23 华南理工大学 Method, medium and equipment for Chinese ancient book character recognition, paragraph grouping and layout reconstruction
CN113705673A (en) * 2021-08-27 2021-11-26 四川医枢科技有限责任公司 Character detection method, device, equipment and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination