CN113537221A - Image recognition method, device and equipment

Image recognition method, device and equipment

Info

Publication number
CN113537221A
Authority
CN
China
Prior art keywords
image
processed
title
obtaining
key
Prior art date
Legal status
Pending
Application number
CN202010294760.3A
Other languages
Chinese (zh)
Inventor
张诗禹
高飞宇
王永攀
郑琪
罗楚威
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image recognition method, which comprises the following steps: acquiring an image to be processed, wherein the image to be processed comprises a title area and a table area; identifying the content in the title area to obtain a title; obtaining an analysis template corresponding to the image to be processed according to the title; and identifying the content in the table area according to the analysis template. By adopting the method, the robustness of image recognition is improved.

Description

Image recognition method, device and equipment
Technical Field
The application relates to the field of computer technology, and in particular to an image recognition method, an image recognition device, an electronic device, and a storage device. It also relates to a further image recognition method, an optical character recognition method, a movie ticket image recognition method, a test paper image recognition method, and a job image recognition method.
Background
Commercial activity produces a large amount of image information, a considerable part of which is table information, such as photographed bank statements or charge receipts from medical institutions. Processing these pictured tables begins with table understanding, that is, determining how the data in a table is organized. Implementing table understanding with machine learning reduces manual participation and improves data processing efficiency.
In the prior art, some picture-table recognition products can only perform targeted automatic parsing of pure KV (key-value) pairs and pure list tables. Other products merge content outside the table according to the positions of the text blocks and restore content inside the table according to the table lines; such structured recognition mainly works by looking up keys in a key library, so it supports only automatic parsing of KV pairs and cannot parse lists or tables with more complex structures. Still other products only restore the content inside the table according to the table lines; their custom templates are interactive template products that support table parsing, but only for simple list tables, without grouping of values.
It can be seen that, owing to the ambiguity of KV semantics, most conventional table understanding schemes for image recognition apply only to a particular field and have poor robustness in new scenarios.
Disclosure of Invention
The application provides an image recognition method, an image recognition device, an electronic device, and a storage device, so as to improve the robustness of image recognition.
The image identification method provided by the application comprises the following steps:
acquiring an image to be processed, wherein the image to be processed comprises a title area and a table area;
identifying the content in the title area to obtain a title;
obtaining an analysis template corresponding to the image to be processed according to the title;
and identifying the content in the table area according to the analysis template.
Optionally, the obtaining of the image to be processed, where the image to be processed includes a header area and a table area, includes:
acquiring an image to be processed;
and performing layout analysis on the image to be processed to obtain a title area of the image to be processed and a table area of the image to be processed.
Optionally, the performing layout analysis on the image to be processed to obtain a header area of the image to be processed and a table area of the image to be processed includes:
acquiring projection characteristics of an image to be processed;
analyzing the projection characteristics to obtain blank space information of the image to be processed and a connected interval threshold of the image to be processed;
segmenting the image to be processed according to the blank space information and the connected interval threshold value to obtain segmented text blocks;
and identifying the segmented text block to obtain a title area of the image to be processed and a table area of the image to be processed.
Optionally, the content in the header area includes at least one of the following:
the content of the characters;
the graphical content is marked.
Optionally, the identifying the content in the title area to obtain a title includes:
extracting the characteristics of the characters to be recognized in the header area to obtain the characteristic data of the characters to be recognized;
and performing matching query from a feature library according to the feature data of the characters to be recognized to obtain the title of the picture table.
Optionally, the obtaining, according to the title, an analysis template corresponding to the image to be processed includes:
searching in the existing template library according to the title to obtain a search result;
analyzing the retrieval result and judging whether the retrieval is successful;
if the retrieval result is successful, taking a target template obtained by retrieval as an analysis template corresponding to the image to be processed;
and if the retrieval result is failure, acquiring the analysis template of the image to be processed through user interaction.
Optionally, the obtaining an analysis template of the image to be processed through user interaction includes:
obtaining a key list of the image to be processed according to a template file provided by a user;
matching the character blocks in the table area by using the key list to obtain a key block of the image to be processed and a value block of the image to be processed;
analyzing the key block and the value block to obtain a corresponding relation between a key structure of the image to be processed and a key value of the image to be processed;
and acquiring an analysis template of the image to be processed according to the key structure of the image to be processed and the key value corresponding relation of the image to be processed.
Optionally, the matching the text blocks in the table area by using the key list to obtain the key block of the image to be processed and the value block of the image to be processed includes:
obtaining a key from the list of keys;
carrying out fuzzy matching on the character blocks in the table area by using the keys to obtain the character blocks corresponding to the keys;
identifying the character block corresponding to the key as the key block of the image to be processed;
and marking other character blocks except the key block in the table area as the value blocks of the image to be processed.
Optionally, the analyzing the key block and the value block to obtain a correspondence between the key structure of the image to be processed and the key value of the image to be processed includes:
acquiring a pre-trained graph model for table understanding;
and analyzing the key block and the value block by using the graph model to obtain the corresponding relation between the key structure of the image to be processed and the key value of the image to be processed.
Optionally, the obtaining a graph model trained in advance for table understanding includes:
obtaining a graph model to be trained;
obtaining a table structure data sample;
obtaining a key block sample corresponding to the table structure data sample;
obtaining a value block sample corresponding to the table structure data sample;
and training the graph model to be trained according to the table structure data sample, the key block sample and the value block sample to obtain a pre-trained graph model for table understanding.
Optionally, the identifying the content in the table area according to the parsing template includes:
obtaining a key value structure in the table area according to the analysis template;
and identifying the attribute of the value in the table area according to the key value structure in the table area.
The application provides an image recognition apparatus, including:
the image acquisition unit is used for acquiring an image to be processed, wherein the image to be processed comprises a title area and a table area;
a title obtaining unit, for identifying the content in the title area and obtaining a title;
the template obtaining unit is used for obtaining an analysis template corresponding to the image to be processed according to the title;
and the content identification unit is used for identifying the content in the table area according to the analysis template.
Optionally, the image obtaining unit is specifically configured to:
acquiring an image to be processed;
and performing layout analysis on the image to be processed to obtain a title area of the image to be processed and a table area of the image to be processed.
Optionally, the image acquiring unit is further configured to:
acquiring projection characteristics of an image to be processed;
analyzing the projection characteristics to obtain blank space information of the image to be processed and a connected interval threshold of the image to be processed;
segmenting the image to be processed according to the blank space information and the connected interval threshold value to obtain segmented text blocks;
and identifying the segmented text block to obtain a title area of the image to be processed and a table area of the image to be processed.
Optionally, the content in the header area includes at least one of the following:
the content of the characters;
the graphical content is marked.
Optionally, the title obtaining unit is specifically configured to:
extracting the characteristics of the characters to be recognized in the header area to obtain the characteristic data of the characters to be recognized;
and performing matching query from a feature library according to the feature data of the characters to be recognized to obtain the title of the picture table.
Optionally, the template obtaining unit is specifically configured to:
searching in the existing template library according to the title to obtain a search result;
analyzing the retrieval result and judging whether the retrieval is successful;
if the retrieval result is successful, taking a target template obtained by retrieval as an analysis template corresponding to the image to be processed;
and if the retrieval result is failure, acquiring the analysis template of the image to be processed through user interaction.
Optionally, the template obtaining unit is further configured to:
obtaining a key list of the image to be processed according to a template file provided by a user;
matching the character blocks in the table area by using the key list to obtain a key block of the image to be processed and a value block of the image to be processed;
analyzing the key block and the value block to obtain a corresponding relation between a key structure of the image to be processed and a key value of the image to be processed;
and acquiring an analysis template of the image to be processed according to the key structure of the image to be processed and the key value corresponding relation of the image to be processed.
Optionally, the template obtaining unit is further configured to:
obtaining a key from the list of keys;
carrying out fuzzy matching on the character blocks in the table area by using the keys to obtain the character blocks corresponding to the keys;
identifying the character block corresponding to the key as the key block of the image to be processed;
and marking other character blocks except the key block in the table area as the value blocks of the image to be processed.
Optionally, the template obtaining unit is further configured to:
acquiring a pre-trained graph model for table understanding;
and analyzing the key block and the value block by using the graph model to obtain the corresponding relation between the key structure of the image to be processed and the key value of the image to be processed.
Optionally, the template obtaining unit is further configured to:
obtaining a graph model to be trained;
obtaining a table structure data sample;
obtaining a key block sample corresponding to the table structure data sample;
obtaining a value block sample corresponding to the table structure data sample;
and training the graph model to be trained according to the table structure data sample, the key block sample and the value block sample to obtain a pre-trained graph model for table understanding.
Optionally, the content identification unit is specifically configured to:
obtaining a key value structure in the table area according to the analysis template;
and identifying the attribute of the value in the table area according to the key value structure in the table area.
The application provides an electronic device, including:
a processor; and
a memory for storing a program of a data processing method, the apparatus performing the following steps after being powered on and running the program of the data processing method by the processor:
acquiring an image to be processed, wherein the image to be processed comprises a title area and a table area;
identifying the content in the title area to obtain a title;
obtaining an analysis template corresponding to the image to be processed according to the title;
and identifying the content in the table area according to the analysis template.
The application provides a storage device storing a program of a data processing method, the program being executed by a processor to perform the following steps:
acquiring an image to be processed, wherein the image to be processed comprises a title area and a table area;
identifying the content in the title area to obtain a title;
obtaining an analysis template corresponding to the image to be processed according to the title;
and identifying the content in the table area according to the analysis template.
The application provides an image recognition method, which comprises the following steps:
creating an analysis template corresponding to an image to be processed, wherein the image to be processed comprises a title area and a table area;
uploading the analysis template and the marked image of the image to be processed to a server side for training;
determining a trained analysis template according to the feedback information of the server;
uploading an image to be identified to a server;
and obtaining the recognition result of the image to be recognized returned by the server.
The application provides an optical character recognition method, which comprises the following steps:
acquiring an optical character form image to be recognized;
identifying the optical character form image to obtain a title;
obtaining characteristic data of the optical character form image according to the title;
obtaining an identification template of the optical character form image according to the characteristic data;
and identifying the optical character form image by using the identification template to obtain the form content of the optical character form image.
The application provides a movie ticket image identification method, which comprises the following steps:
acquiring a movie ticket image to be identified;
identifying the movie ticket image to obtain a title;
acquiring characteristic data of the movie ticket image according to the title;
obtaining an identification template of the movie ticket image according to the characteristic data;
and identifying the movie ticket image by using the identification template to obtain the content information of the movie ticket image.
The application provides a test paper image identification method, which comprises the following steps:
acquiring a test paper image to be identified;
identifying the test paper image to obtain a title;
obtaining feature data of the test paper image according to the title;
obtaining an identification template of the test paper image according to the characteristic data;
and identifying the test paper image by using the identification template to obtain the content information of the test paper image.
The application provides a job image identification method, which comprises the following steps:
acquiring a job image to be identified;
identifying the operation image to obtain a title;
obtaining feature data of the operation image according to the title;
obtaining an identification template of the operation image according to the feature data;
and identifying the job image by using the identification template to obtain the content information of the job image.
Compared with the prior art, the method has the following advantages:
the image identification method provided by the application acquires an image to be processed, wherein the image to be processed comprises a title area and a table area; identifying the content in the title area to obtain a title; obtaining an analysis template corresponding to the image to be processed according to the title; and identifying the content in the table area according to the analysis template. By adopting the method provided by the application, the analysis template corresponding to the image to be processed is obtained according to the title; and identifying the content in the table area according to the analysis template, so that the image identification is more easily adapted to a new scene, and the robustness of the image identification is improved.
Drawings
Fig. 1a is a schematic view of an application scenario embodiment of an image recognition method provided in the present application.
Fig. 1 is a flowchart of an image recognition method according to a first embodiment of the present application.
Fig. 2 is a schematic view of table understanding related to the first embodiment of the present application.
Fig. 3 is a schematic diagram illustrating an effect of the table structure understanding model according to the first embodiment of the present application.
Fig. 4 is a schematic diagram of an image recognition apparatus according to a second embodiment of the present application.
Fig. 5 is a flowchart of an image recognition method according to a fifth embodiment of the present application.
Fig. 6 is a schematic diagram of a recognition result according to a fifth embodiment of the present application.
Fig. 7 is a flowchart of an optical character recognition method according to a sixth embodiment of the present application.
Fig. 8 is a flowchart of a movie ticket image recognition method according to a seventh embodiment of the present application.
Fig. 9 is a flowchart of a test paper image recognition method according to an eighth embodiment of the present application.
Fig. 10 is a flowchart of a job image recognition method according to a ninth embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein.
In order to make the technical solutions of the present application better understood, a detailed description is first given of a specific application scenario embodiment of the present application.
First, a user 108-1 sends an image to be processed to the image recognition server 100 through the network 105 via a client application 107-1 on the client device 106-1. Then, the image obtaining unit 101 acquires the image to be processed, which includes a title area and a table area; next, the title obtaining unit 102 identifies the content in the title area and obtains a title; then, the template obtaining unit 103 obtains an analysis template corresponding to the image to be processed according to the title; further, the content identification unit 104 identifies the content in the table area according to the analysis template and obtains a recognition result. Finally, the image recognition server 100 transmits the recognition result to the client device.
The first embodiment of the present application provides an image recognition method. Please refer to fig. 1, which is a flowchart illustrating a first embodiment of the present application. The first embodiment of the present application will be described in detail below with reference to fig. 1. The implementation of the method comprises the following steps:
as shown in fig. 1, in step S101, a to-be-processed image is acquired, wherein the to-be-processed image includes a header area and a table area.
The image to be processed may be a photographed form; please refer to fig. 2. Such a form may include a title area and a table area, as shown in fig. 2.
The image to be processed may also be a picture table, such as the bank statement in fig. 2.
The acquiring of the image to be processed, wherein the image to be processed includes a header area and a table area, and includes:
acquiring an image to be processed;
and performing layout analysis on the image to be processed to obtain a title area of the image to be processed and a table area of the image to be processed.
The performing layout analysis on the image to be processed to obtain a header area of the image to be processed and a table area of the image to be processed includes:
acquiring projection characteristics of an image to be processed;
analyzing the projection characteristics to obtain blank space information of the image to be processed and a connected interval threshold of the image to be processed;
segmenting the image to be processed according to the blank space information and the connected interval threshold value to obtain segmented text blocks;
and identifying the segmented text block to obtain a title area of the image to be processed and a table area of the image to be processed.
Please refer to fig. 2, which is a schematic diagram of table understanding. In fig. 2, the picture before layout analysis is a picture table to be processed. OCR (Optical Character Recognition) refers to the process in which an electronic device (e.g., a scanner or digital camera) examines characters printed on paper, determines their shapes by detecting dark and light patterns, and then translates the shapes into computer text by a character recognition method. Layout analysis refers to analyzing the block structure of a text image for subsequent OCR processing. Because relationships must be identified accurately and in the correct text order, layout analysis is particularly important.
Layout analysis includes layout segmentation and, after segmentation, determination of text block attributes. Several analysis algorithms are common for layout segmentation. One example is the segmentation method based on the run-length smoothing algorithm (RLSA): the image is scanned in the horizontal and vertical directions (the X and Y directions), every run of consecutive blank pixels shorter than a given prior threshold is set to black, and the two images produced by the two passes are then combined logically. Regions with different attributes in the layout are segmented by varying the prior threshold. The method has strong noise resistance, but it requires a prior threshold, needs multiple scans, and is computationally expensive.
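By way of illustration only (the patent provides no code), a minimal Python sketch of RLSA-style smoothing is given below. It assumes a binary image in which 0 is ink and 255 is background; the threshold handling and the logical AND combination follow the description above, and all function names are illustrative.

    import numpy as np

    def rlsa_1d(line, threshold):
        # Fill (set to ink) every background run shorter than `threshold`
        # that is closed off by an ink pixel.
        out = line.copy()
        run_start = None
        for i, px in enumerate(line):
            if px == 255 and run_start is None:       # background run begins
                run_start = i
            elif px == 0 and run_start is not None:   # run ends at ink
                if i - run_start < threshold:
                    out[run_start:i] = 0
                run_start = None
        return out

    def rlsa(binary, h_threshold, v_threshold):
        # Horizontal pass, vertical pass, then a logical AND of the two
        # results (black only where both passes are black).
        horiz = np.array([rlsa_1d(row, h_threshold) for row in binary])
        vert = np.array([rlsa_1d(col, v_threshold) for col in binary.T]).T
        return np.where((horiz == 0) & (vert == 0), 0, 255)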
Another approach is a connected domain based layout segmentation method.
Performing layout analysis on the picture table to be processed to obtain a title area of the picture table and a table area of the picture table, including:
acquiring projection characteristics of a picture table;
analyzing the projection characteristics to obtain blank space information of the picture table and a connected interval threshold value of the picture table;
segmenting the picture table according to the blank space information and the connected interval threshold value to obtain segmented text blocks;
and identifying the segmented text block to obtain a title area of the picture table and a table area of the picture table.
This method starts from the whole image: it analyzes the global projection features of the image to obtain information such as blank spacing, derives a connected-interval threshold, and then performs a connection operation on the image with that threshold to obtain whole text blocks that are convenient to segment. The method adapts well and has moderate processing speed.
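As a hedged sketch of this projection-based approach (again illustrative, with an assumed binary image where 0 is ink), the function below computes a projection profile along one axis and splits the image into bands wherever a blank run reaches a connected-interval threshold. Calling it with axis=1 separates horizontal bands such as the title band and the table band.

    import numpy as np

    def projection_segments(binary, axis=1, min_gap=5):
        # Projection profile: ink-pixel count per row (axis=1) or column
        # (axis=0). `min_gap` stands in for the connected-interval
        # threshold derived from the blank-space statistics.
        profile = (binary == 0).sum(axis=axis)
        bands, start, gap = [], None, 0
        for i, count in enumerate(profile):
            if count > 0:
                if start is None:
                    start = i            # a new band begins
                gap = 0
            elif start is not None:
                gap += 1
                if gap >= min_gap:       # blank run wide enough: close band
                    bands.append((start, i - gap + 1))
                    start, gap = None, 0
        if start is not None:
            bands.append((start, len(profile)))
        return bands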
As can be seen from fig. 2, after layout analysis, two areas, a title and a table, are obtained.
As shown in fig. 1, in step S102, the content in the title area is identified, and a title is obtained.
The content in the title area comprises at least one of the following content:
the content of the characters;
the graphical content is marked.
As shown in fig. 2, the content of the header area may be text, such as "XX bank statement", or LOGO graphic content, such as the LOGO of XX bank.
The identifying the content in the title area and obtaining the title comprises:
extracting the characteristics of the characters to be recognized in the header area to obtain the characteristic data of the characters to be recognized;
and performing a matching query against a feature library according to the feature data of the characters to be recognized, to obtain the title of the picture table.
Feature extraction is the process of extracting statistical or structural features from a single character image. The stability and validity of the extracted features determine recognition performance. Statistical features can be extracted with methods from statistical pattern recognition; structural features are extracted with methods chosen according to the recognition primitives determined by the specific characters. Over the long history of character recognition research, human empirical knowledge has guided the design of character features, such as edge features, transform features, penetration features, grid features, feature-point features, and directional line-element features.
Feature matching is the process of finding, in an existing feature library, the character most similar to the character to be recognized. After feature extraction, whether statistical or structural features are used, a feature library is needed for comparison; the library contains the features of every character in the character set to be recognized. Many matching methods exist; commonly used ones include Euclidean-space comparison, relaxation matching, dynamic-programming matching, and HMMs (hidden Markov models). Such template matching was used in Chinese-character OCR for a long time before the advent of neural networks.
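For illustration, a minimal sketch of one such scheme: grid features (ink density per cell) compared by Euclidean distance against a feature library. The grid size, the library layout (a dict from character to feature vector), and the function names are assumptions, not the patent's method; glyphs are assumed to be binary images at least grid x grid pixels in size.

    import numpy as np

    def grid_features(glyph, grid=8):
        # Grid features: fraction of ink pixels (value 0) in each cell of
        # a grid x grid partition of the character image.
        h, w = glyph.shape
        feats = np.empty(grid * grid)
        for r in range(grid):
            for c in range(grid):
                cell = glyph[r*h//grid:(r+1)*h//grid, c*w//grid:(c+1)*w//grid]
                feats[r*grid + c] = (cell == 0).mean()
        return feats

    def match_character(glyph, feature_library):
        # Return the library character whose feature vector is nearest
        # to the query in Euclidean distance.
        query = grid_features(glyph)
        return min(feature_library,
                   key=lambda ch: np.linalg.norm(feature_library[ch] - query))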
Deep learning has been applied successfully to the OCR field. Its development replaces heavy feature engineering: features of an image are learned automatically from large numbers of labeled samples. CNNs (convolutional neural networks) stand out in particular; besides removing the manual feature extraction step, weight sharing reduces the number of weights and greatly lowers the computational cost. These two advantages make CNNs prominent in the OCR field.
The character recognition in the step can be realized by adopting a traditional characteristic extraction and matching method or a deep learning method.
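For the deep learning route, a minimal CNN character classifier sketch is shown below (PyTorch; the 32x32 input size, layer widths, and class count are illustrative assumptions, not values from the patent):

    import torch.nn as nn

    class CharCNN(nn.Module):
        # Two convolution blocks (their weights are shared across the
        # image) followed by a linear classification head.
        def __init__(self, num_classes):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # 16 -> 8
            )
            self.head = nn.Linear(32 * 8 * 8, num_classes)

        def forward(self, x):            # x: (batch, 1, 32, 32) grayscale glyphs
            return self.head(self.features(x).flatten(1))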
As shown in fig. 1, in step S103, an analysis template corresponding to the image to be processed is obtained according to the title.
The obtaining of the analysis template corresponding to the image to be processed according to the title includes:
searching in the existing template library according to the title to obtain a search result;
analyzing the retrieval result and judging whether the retrieval is successful;
if the retrieval result is successful, taking a target template obtained by retrieval as an analysis template corresponding to the image to be processed;
and if the retrieval result is failure, acquiring the analysis template of the image to be processed through user interaction.
For example, for an XX bank statement, an analysis template for the XX bank statement can be found in the existing template library. The template includes the attributes of each key in the statement, such as the name and account number keys, together with the corresponding name and account number values.
If retrieval fails, the user is asked to provide an analysis template.
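A minimal sketch of this retrieval-with-fallback step (all names are illustrative; `build_template_interactively` stands for whatever user-interaction flow produces a template):

    def get_parsing_template(title, template_library, build_template_interactively):
        # Search the existing template library by title.
        template = template_library.get(title)
        if template is not None:                     # retrieval succeeded
            return template
        # Retrieval failed: obtain the template through user interaction
        # and cache it so the next retrieval succeeds.
        template = build_template_interactively(title)
        template_library[title] = template
        return template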
The obtaining of the analysis template of the image to be processed through user interaction includes:
obtaining a key list of the image to be processed according to a template file provided by a user;
matching the character blocks in the table area by using the key list to obtain a key block of the image to be processed and a value block of the image to be processed;
analyzing the key block and the value block to obtain a corresponding relation between a key structure of the image to be processed and a key value of the image to be processed;
and acquiring an analysis template of the image to be processed according to the key structure of the image to be processed and the key value corresponding relation of the image to be processed.
The template file may include the title of the image to be recognized and may also include all the key information on the image to be processed. A key list of the image to be processed is obtained according to the title. The key list is the list of keys under which data is stored in the table; fig. 2, for example, shows data for a bank statement whose key list includes date, time, amount, and so on.
The obtaining a key list of the picture table according to the title includes:
searching in an existing key list library according to the title to obtain a search result;
analyzing the retrieval result and judging whether the retrieval is successful;
if the retrieval result is successful, taking the key list obtained by retrieval as the key list of the picture list;
and if the retrieval result is failure, acquiring the key list of the picture table through user interaction.
First, a knowledge base storing key lists can be established according to industry characteristics; for example, a knowledge base of key lists conforming to the characteristics of the banking industry. The existing key list library is searched according to the title, and the retrieval result shows whether the title exists in the knowledge base. If it does, the key list stored in the knowledge base is fetched directly. If it does not, the key list corresponding to the title can be obtained through user interaction or other technical means, and the obtained key list can be stored in the knowledge base so that it can be retrieved conveniently next time.
This step corresponds to the Key (Key) block search step in fig. 2.
The matching the character blocks in the table area by using the key list to obtain the key blocks of the picture table and the value blocks of the picture table includes:
obtaining a key from the list of keys;
carrying out fuzzy matching on the character blocks in the table area by using the keys to obtain the character blocks corresponding to the keys;
identifying the character block corresponding to the key as a key block of the picture table;
and identifying other character blocks except the key block in the table area as the value blocks of the picture table.
First, a key is obtained from the key list, say "date", and the text blocks in the table area are fuzzy-matched against it to obtain the block corresponding to that key. The block corresponding to the date key is then identified as a key block of the picture table. Likewise, the blocks corresponding to the time and amount keys are identified as key blocks. Once all key blocks have been identified, the remaining text blocks in the table area are identified as the value blocks of the picture table.
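A hedged sketch of this key/value classification, using difflib's similarity ratio for the fuzzy match (the 0.8 threshold is an illustrative choice, not a value from the patent):

    from difflib import SequenceMatcher

    def classify_blocks(text_blocks, key_list, threshold=0.8):
        # A text block whose best similarity to any key reaches the
        # threshold is a key block; every other block is a value block.
        def best_ratio(text):
            return max(SequenceMatcher(None, text, key).ratio()
                       for key in key_list)
        key_blocks = [b for b in text_blocks if best_ratio(b) >= threshold]
        value_blocks = [b for b in text_blocks if best_ratio(b) < threshold]
        return key_blocks, value_blocks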
The analyzing the key block and the value block to obtain a key structure of the picture table and a key value corresponding relation of the picture table includes:
acquiring a pre-trained graph model for table understanding;
and analyzing the key block and the value block by using the graph model to obtain a key structure of the picture table and a key value corresponding relation of the picture table.
The key structure and the KV correspondence are then parsed with a pre-trained table understanding model, according to the acquired KV classification information of the text blocks. The table understanding model does not depend on domain-specific knowledge; it is a general structure understanding model. As shown in fig. 3, once the text blocks have been classified as keys or values, a general table structure understanding model can be learned on a large amount of generated table structure data by methods such as graph models.
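The patent does not spell out the graph construction, so the sketch below substitutes a plain spatial heuristic for the trained graph model: each value block is attached to the nearest key block lying above it or to its left. Blocks are assumed to be dicts with 'text', 'x', and 'y' fields; everything here is illustrative, not the patent's algorithm.

    import math

    def link_values_to_keys(key_blocks, value_blocks):
        # Toy stand-in for the structure understanding model: attach each
        # value to the nearest key that lies above or to the left of it.
        mapping = {}
        for v in value_blocks:
            candidates = [k for k in key_blocks
                          if k['y'] <= v['y'] or k['x'] <= v['x']] or key_blocks
            nearest = min(candidates,
                          key=lambda k: math.hypot(k['x'] - v['x'],
                                                   k['y'] - v['y']))
            mapping.setdefault(nearest['text'], []).append(v['text'])
        return mapping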
The obtaining of a pre-trained graph model for table understanding includes:
obtaining a graph model to be trained;
obtaining a table structure data sample;
obtaining a key block sample corresponding to the table structure data sample;
obtaining a value block sample corresponding to the table structure data sample;
and training the graph model to be trained according to the table structure data sample, the key block sample and the value block sample to obtain a pre-trained graph model for table understanding.
In engineering practice, a pre-trained graph model for table understanding can be obtained with the supervised training procedure provided in this step. Since this part is prior art, it is not described here in detail.
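As a hedged skeleton of the supervised procedure listed above (PyTorch-style; the model internals, loss function, and optimizer are placeholders, since the patent leaves them to prior art):

    def train_table_model(model, samples, optimizer, loss_fn, epochs=10):
        # Each sample pairs a table structure (the target) with its key
        # block and value block samples (the input).
        for _ in range(epochs):
            for structure, key_blocks, value_blocks in samples:
                optimizer.zero_grad()
                predicted = model(key_blocks, value_blocks)
                loss = loss_fn(predicted, structure)
                loss.backward()
                optimizer.step()
        return model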
Traditional table understanding is limited to specific domains primarily because general KV semantics are ambiguous; with the KV classification known, parsing the KV structure is generic. The method provided in the first embodiment of the application separates KV classification from KV structure parsing: the KV classification information is first acquired through a knowledge base or user interaction, and the KV structure is then parsed with a pre-trained structure understanding model according to that classification information, thereby realizing general table understanding.
As shown in fig. 1, in step S104, the content in the table area is identified according to the analysis template.
The identifying the content in the table area according to the parsing template includes:
obtaining a key value structure in the table area according to the analysis template;
and identifying the attribute of the value in the table area according to the key value structure in the table area.
According to the analysis template, once the text content on the picture has been recognized, the attribute of each value on the picture can be determined. For example, the value corresponding to the gender key may be determined to be female.
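A minimal sketch of this final step, assuming the analysis template records, for each key, the region its values occupy (the template format and field names are illustrative):

    def attach_attributes(parsing_template, recognized_blocks):
        # Label each recognized value with the key whose region contains it.
        result = {}
        for key, region in parsing_template.items():
            result[key] = [b['text'] for b in recognized_blocks
                           if region['x0'] <= b['x'] <= region['x1']
                           and region['y0'] <= b['y'] <= region['y1']]
        return result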
Corresponding to the method for processing the picture table provided in the first embodiment of the present application, a second embodiment of the present application also provides a device for processing the picture table.
As shown in fig. 4, the apparatus for processing a picture table provided in this embodiment includes:
an image obtaining unit 401, configured to obtain an image to be processed, where the image to be processed includes a header area and a table area;
a title obtaining unit 402, configured to identify the content in the title area, and obtain a title;
a template obtaining unit 403, configured to obtain, according to the title, an analysis template corresponding to the image to be processed;
a content identification unit 404, configured to identify the content in the table area according to the parsing template.
In this embodiment, the image obtaining unit is specifically configured to:
acquiring an image to be processed;
and performing layout analysis on the image to be processed to obtain a title area of the image to be processed and a table area of the image to be processed.
In this embodiment, the image obtaining unit is further configured to:
acquiring projection characteristics of an image to be processed;
analyzing the projection characteristics to obtain blank space information of the image to be processed and a connected interval threshold of the image to be processed;
segmenting the image to be processed according to the blank space information and the connected interval threshold value to obtain segmented text blocks;
and identifying the segmented text block to obtain a title area of the image to be processed and a table area of the image to be processed.
In this embodiment, the content in the title area includes at least one of the following:
the content of the characters;
the graphical content is marked.
In this embodiment, the title obtaining unit is specifically configured to:
extracting the characteristics of the characters to be recognized in the header area to obtain the characteristic data of the characters to be recognized;
and performing matching query from a feature library according to the feature data of the characters to be recognized to obtain the title of the picture table.
In this embodiment, the template obtaining unit is specifically configured to:
searching in the existing template library according to the title to obtain a search result;
analyzing the retrieval result and judging whether the retrieval is successful;
if the retrieval result is successful, taking a target template obtained by retrieval as an analysis template corresponding to the image to be processed;
and if the retrieval result is failure, acquiring the analysis template of the image to be processed through user interaction.
In this embodiment, the template obtaining unit is further configured to:
obtaining a key list of the image to be processed according to a template file provided by a user;
matching the character blocks in the table area by using the key list to obtain a key block of the image to be processed and a value block of the image to be processed;
analyzing the key block and the value block to obtain a corresponding relation between a key structure of the image to be processed and a key value of the image to be processed;
and acquiring an analysis template of the image to be processed according to the key structure of the image to be processed and the key value corresponding relation of the image to be processed.
In this embodiment, the template obtaining unit is further configured to:
obtaining a key from the list of keys;
carrying out fuzzy matching on the character blocks in the table area by using the keys to obtain the character blocks corresponding to the keys;
identifying the character block corresponding to the key as the key block of the image to be processed;
and marking other character blocks except the key block in the table area as the value blocks of the image to be processed.
In this embodiment, the template obtaining unit is further configured to:
acquiring a pre-trained graph model for table understanding;
and analyzing the key block and the value block by using the graph model to obtain the corresponding relation between the key structure of the image to be processed and the key value of the image to be processed.
In this embodiment, the template obtaining unit is further configured to:
obtaining a graph model to be trained;
obtaining a table structure data sample;
obtaining a key block sample corresponding to the table structure data sample;
obtaining a value block sample corresponding to the table structure data sample;
and training the graph model to be trained according to the table structure data sample, the key block sample and the value block sample to obtain a pre-trained graph model for table understanding.
In this embodiment, the content identification unit is specifically configured to:
obtaining a key value structure in the table area according to the analysis template;
and identifying the attribute of the value in the table area according to the key value structure in the table area.
It should be noted that, for the detailed description of the apparatus provided in the second embodiment of the present application, reference may be made to the related description of the first embodiment of the present application, and details are not repeated here.
Corresponding to the method for processing the picture table provided in the first embodiment of the present application, a third embodiment of the present application provides an electronic device, including:
a processor; and
a memory for storing a program of a data processing method, the apparatus performing the following steps after being powered on and running the program of the data processing method by the processor:
acquiring an image to be processed, wherein the image to be processed comprises a title area and a table area;
identifying the content in the title area to obtain a title;
obtaining an analysis template corresponding to the image to be processed according to the title;
and identifying the content in the table area according to the analysis template.
Optionally, the obtaining of the image to be processed, where the image to be processed includes a header area and a table area, includes:
acquiring an image to be processed;
and performing layout analysis on the image to be processed to obtain a title area of the image to be processed and a table area of the image to be processed.
Optionally, the performing layout analysis on the image to be processed to obtain a header area of the image to be processed and a table area of the image to be processed includes:
acquiring projection characteristics of an image to be processed;
analyzing the projection characteristics to obtain blank space information of the image to be processed and a connected interval threshold of the image to be processed;
segmenting the image to be processed according to the blank space information and the connected interval threshold value to obtain segmented text blocks;
and identifying the segmented text block to obtain a title area of the image to be processed and a table area of the image to be processed.
Optionally, the content in the header area includes at least one of the following:
the content of the characters;
the graphical content is marked.
Optionally, the identifying the content in the title area to obtain a title includes:
extracting the characteristics of the characters to be recognized in the header area to obtain the characteristic data of the characters to be recognized;
and performing matching query from a feature library according to the feature data of the characters to be recognized to obtain the title of the picture table.
Optionally, the obtaining, according to the title, an analysis template corresponding to the image to be processed includes:
searching in the existing template library according to the title to obtain a search result;
analyzing the retrieval result and judging whether the retrieval is successful;
if the retrieval result is successful, taking a target template obtained by retrieval as an analysis template corresponding to the image to be processed;
and if the retrieval result is failure, acquiring the analysis template of the image to be processed through user interaction.
Optionally, the obtaining an analysis template of the image to be processed through user interaction includes:
obtaining a key list of the image to be processed according to a template file provided by a user;
matching the character blocks in the table area by using the key list to obtain a key block of the image to be processed and a value block of the image to be processed;
analyzing the key block and the value block to obtain a corresponding relation between a key structure of the image to be processed and a key value of the image to be processed;
and acquiring an analysis template of the image to be processed according to the key structure of the image to be processed and the key value corresponding relation of the image to be processed.
Optionally, the matching the text blocks in the table area by using the key list to obtain the key block of the image to be processed and the value block of the image to be processed includes:
obtaining a key from the list of keys;
carrying out fuzzy matching on the character blocks in the table area by using the keys to obtain the character blocks corresponding to the keys;
identifying the character block corresponding to the key as the key block of the image to be processed;
and marking other character blocks except the key block in the table area as the value blocks of the image to be processed.
Optionally, the analyzing the key block and the value block to obtain a correspondence between the key structure of the image to be processed and the key value of the image to be processed includes:
acquiring a pre-trained graph model for table understanding;
and analyzing the key block and the value block by using the graph model to obtain the corresponding relation between the key structure of the image to be processed and the key value of the image to be processed.
Optionally, the obtaining a graph model trained in advance for table understanding includes:
obtaining a graph model to be trained;
obtaining a table structure data sample;
obtaining a key block sample corresponding to the table structure data sample;
obtaining a value block sample corresponding to the table structure data sample;
and training the graph model to be trained according to the table structure data sample, the key block sample and the value block sample to obtain a pre-trained graph model for table understanding.
Optionally, the identifying the content in the table area according to the parsing template includes:
obtaining a key value structure in the table area according to the analysis template;
and identifying the attribute of the value in the table area according to the key value structure in the table area.
It should be noted that, for the detailed description of the electronic device provided in the third embodiment of the present application, reference may be made to the related description of the first embodiment of the present application, and details are not repeated here.
In accordance with the method for processing a picture table provided in the first embodiment of the present application, a fourth embodiment of the present application provides a storage device storing a program of a data processing method, the program being executed by a processor to perform the following steps:
acquiring an image to be processed, wherein the image to be processed comprises a title area and a table area;
identifying the content in the title area to obtain a title;
obtaining an analysis template corresponding to the image to be processed according to the title;
and identifying the content in the table area according to the analysis template.
It should be noted that, for the detailed description of the storage device provided in the fourth embodiment of the present application, reference may be made to the related description of the first embodiment of the present application, and details are not repeated here.
A fifth embodiment of the present application provides an image recognition method, please refer to fig. 5, which is a flowchart of the image recognition method provided in the present application. The present embodiment may be implemented at the client.
As shown in fig. 5, in step S501, a parsing template corresponding to an image to be processed is created, wherein the image to be processed includes a header area and a table area.
The image to be processed may be an image of a specified category, such as an XX bank statement. The analysis template may include keys in the XX bank statement, and may further include a spatial distribution relationship corresponding to key values in the bank statement, and the like.
As shown in fig. 5, in step S502, the analysis template and the annotation image of the image to be processed are uploaded to a server for training.
For example, the annotated images of the XX bank statement may be sent to an image recognition server; the server builds a deep-learning neural network model from the annotated images and the analysis template uploaded by the client and performs machine learning on the analysis template. The server then transmits the training result of the machine-learned analysis template to the client.
As shown in fig. 5, in step S503, the analysis template after training is determined according to the feedback information of the server.
After receiving the training result, the client can determine the trained analysis template from it; the trained template is used to recognize images to be recognized in the production stage. For example, if the training result shows that the analysis template reaches 95% accuracy on the test set and the client accepts that accuracy, the current analysis template is taken as the trained template and the server is notified to use it as such. If the client does not accept the accuracy, the image server is asked to perform more training.
As shown in fig. 5, in step S504, the image to be recognized is uploaded to the server.
Once the trained analysis template has been determined, the image to be recognized can be uploaded to the server. The server recognizes the image to be recognized with the trained analysis template and obtains a recognition result.
As shown in fig. 5, in step S505, the recognition result of the image to be recognized returned by the server is obtained.
For example, please refer to the schematic diagram of the recognition result in fig. 6. The left side of fig. 6 is an image to be recognized, and the right side is a recognition result of the image to be recognized.
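A hedged client-side sketch of this embodiment's flow (steps S501 through S505). The server URL, endpoint paths, JSON fields, and the 0.9 acceptance bar are all illustrative assumptions; the patent specifies none of them.

    import requests

    SERVER = "http://image-recognition.example.com"   # hypothetical server

    def train_and_recognize(template, annotated_images, image_to_recognize):
        # S501-S502: upload the analysis template and annotated images.
        resp = requests.post(f"{SERVER}/train",
                             json={"template": template,
                                   "images": annotated_images}).json()
        # S503: accept the trained template only if its accuracy is adequate;
        # otherwise the server would be asked to train further.
        if resp["accuracy"] < 0.9:
            raise RuntimeError("accuracy too low; request more training")
        requests.post(f"{SERVER}/confirm",
                      json={"template_id": resp["template_id"]})
        # S504-S505: upload the image to be recognized and fetch the result.
        return requests.post(f"{SERVER}/recognize",
                             json={"image": image_to_recognize}).json()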
A sixth embodiment of the present application provides an optical character recognition method, please refer to fig. 7. The optical character recognition method comprises the following steps:
as shown in fig. 7, in step S701, an optical character form image to be recognized is acquired.
Please refer to fig. 6, the left side is the optical character form image to be recognized.
As shown in fig. 7, in step S702, the optical character form image is recognized, and a title is obtained.
As shown in fig. 7, in step S703, feature data of the optical character form image is obtained according to the title.
The feature data of the optical character form image may be the key structure of the image to be processed and the key-value correspondence of the image to be processed described in the first embodiment of the present application.
As shown in fig. 7, in step S704, a recognition template of the optical character form image is obtained according to the feature data.
The identification template may be a parsing template in the first embodiment of the present application.
As shown in fig. 7, in step S705, the optical character form image is recognized by using the recognition template, and the form content of the optical character form image is obtained.
Please refer to fig. 6, the right side of the table is the table content of the optical character table image.
It should be noted that, for the detailed description of the optical character recognition method provided in the sixth embodiment of the present application, reference may be made to the related description of the first embodiment of the present application, and details are not repeated herein.
A seventh embodiment of the present application provides a movie ticket image recognition method, please refer to fig. 8. The movie ticket image identification method comprises the following steps:
as shown in fig. 8, in step S801, a movie ticket image to be recognized is acquired.
As shown in fig. 8, in step S802, the movie ticket image is recognized, and a title is obtained.
As shown in fig. 8, in step S803, feature data of the movie ticket image is obtained from the title.
The feature data of the movie ticket image may be the key structure of the image to be processed and the key-value correspondence of the image to be processed described in the first embodiment of the present application.
As shown in fig. 8, in step S804, an identification template of the movie ticket image is obtained according to the feature data.
The identification template may be a parsing template in the first embodiment of the present application.
As shown in fig. 8, in step S805, the movie ticket image is identified by using the identification template, and content information of the movie ticket image is obtained.
After the content information of the movie ticket image is obtained, it can be judged whether the movie ticket is valid; if it is, the gate is opened automatically.
It should be noted that, for the detailed description of the movie ticket image recognition method provided in the seventh embodiment of the present application, reference may be made to the related description of the first embodiment of the present application, and details are not repeated here.
An eighth embodiment of the present application provides a method for identifying a test paper image, please refer to fig. 9. The test paper image identification method comprises the following steps:
as shown in fig. 9, in step S901, a test sheet image to be recognized is acquired.
As shown in fig. 9, in step S902, the test sheet image is recognized and a title is obtained.
As shown in fig. 9, in step S903, feature data of the test paper image is obtained according to the title.
The feature data of the test paper image may be a correspondence between a key structure of the image to be processed and a key value of the image to be processed in the first embodiment of the present application.
As shown in fig. 9, in step S904, an identification template of the test paper image is obtained according to the feature data.
The identification template may be a parsing template in the first embodiment of the present application.
As shown in fig. 9, in step S905, the test paper image is recognized by using the recognition template, and content information of the test paper image is obtained.
After the content information of the test paper image is obtained, it can be compared with standard answer information prepared in advance to obtain the score of the test paper image.
It should be noted that, for the detailed description of the test paper image identification method provided in the eighth embodiment of the present application, reference may be made to the related description of the first embodiment of the present application, and details are not repeated here.
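A hedged sketch of that comparison step follows; the question identifiers, the answer format, and the one-point-per-question scoring are assumptions of the example, not details given in the disclosure.

```python
def score_test_paper(recognized, answer_key, points_per_question=1):
    """Compare the content information recognized from the test paper
    image (step S905) with the standard answers prepared in advance."""
    return sum(
        points_per_question
        for question, answer in answer_key.items()
        if recognized.get(question, "").strip().lower() == answer.strip().lower()
    )

recognized = {"Q1": "B", "Q2": "C", "Q3": "A"}   # from the test paper image
standard   = {"Q1": "B", "Q2": "D", "Q3": "A"}   # prepared in advance
print(score_test_paper(recognized, standard))    # -> 2
```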
A ninth embodiment of the present application provides a method for identifying a job image, please refer to fig. 10. The job image identification method comprises the following steps:
as shown in fig. 10, in step S1001, a job image to be recognized is acquired.
As shown in fig. 10, in step S1002, the job image is recognized, and a title is obtained.
As shown in fig. 10, in step S1003, feature data of the job image is obtained from the title.
The feature data of the job image may be a correspondence between a key structure of the image to be processed and a key value of the image to be processed in the first embodiment of the present application.
As shown in fig. 10, in step S1004, an identification template of the job image is obtained from the feature data.
The identification template may be a parsing template in the first embodiment of the present application.
As shown in fig. 10, in step S1005, the job image is identified by using the identification template, and content information of the job image is obtained.
After the content information of the job image is obtained, it can be compared with standard answer information prepared in advance to obtain the score of the job image.
It should be noted that, for the detailed description of the job image identification method provided in the ninth embodiment of the present application, reference may be made to the related description of the first embodiment of the present application, and details are not repeated herein.
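Embodiments six to nine share the same recognition steps and differ only in how the recognized content information is used afterwards (table extraction, gate control, or scoring). Purely as a sketch of that shared structure, with every name an assumption of the example, the common flow can be written once with the application-specific step injected as a callback:

```python
def recognize_document(image, recognize_title, get_template, extract_content,
                       post_process=None):
    """Generic form of the shared steps: recognize the title, obtain the
    identification template from it via the feature data, recognize the
    content, then run the application-specific step (opening a gate for
    movie tickets, scoring for test papers and jobs, and so on)."""
    title = recognize_title(image)               # step S*02
    template = get_template(title)               # steps S*03-S*04
    content = extract_content(image, template)   # step S*05
    return post_process(content) if post_process else content
```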
Although the present application has been described with reference to preferred embodiments, these embodiments are not intended to limit the present application. Those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application; the scope of protection of the present application should therefore be determined by the claims that follow.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Claims (19)

1. An image recognition method, comprising:
acquiring an image to be processed, wherein the image to be processed comprises a title area and a table area;
identifying the content in the title area to obtain a title;
obtaining an analysis template corresponding to the image to be processed according to the title;
and identifying the content in the table area according to the analysis template.
2. The image recognition method of claim 1, wherein the acquiring an image to be processed, wherein the image to be processed comprises a title area and a table area, comprises:
acquiring an image to be processed;
and performing layout analysis on the image to be processed to obtain a title area of the image to be processed and a table area of the image to be processed.
3. The image recognition method of claim 2, wherein the performing layout analysis on the image to be processed to obtain a title area of the image to be processed and a table area of the image to be processed comprises:
acquiring projection characteristics of an image to be processed;
analyzing the projection characteristics to obtain blank space information of the image to be processed and a threshold value of a communication interval of the image to be processed;
segmenting the image to be processed according to the blank space information and the connected interval threshold value to obtain segmented text blocks;
and identifying the segmented text block to obtain a title area of the image to be processed and a table area of the image to be processed.
4. The image recognition method of claim 1, wherein the content in the title area comprises at least one of the following:
text content;
logo graphic content.
5. The image recognition method of claim 1, wherein the recognizing the content in the title area to obtain a title comprises:
extracting features of the characters to be recognized in the title area to obtain feature data of the characters to be recognized;
and performing a matching query in a feature library according to the feature data of the characters to be recognized to obtain the title of the pictured table.
6. The image recognition method according to claim 1, wherein the obtaining an analysis template corresponding to the image to be processed according to the title includes:
searching in the existing template library according to the title to obtain a search result;
analyzing the retrieval result and judging whether the retrieval is successful;
if the retrieval is successful, taking a target template obtained by the retrieval as the analysis template corresponding to the image to be processed;
and if the retrieval fails, obtaining the analysis template of the image to be processed through user interaction.
7. The image recognition method of claim 6, wherein the obtaining the analysis template of the image to be processed through user interaction comprises:
obtaining a key list of the image to be processed according to a template file provided by a user;
matching the character blocks in the table area by using the key list to obtain a key block of the image to be processed and a value block of the image to be processed;
analyzing the key block and the value block to obtain a correspondence between a key structure of the image to be processed and a key value of the image to be processed;
and obtaining an analysis template of the image to be processed according to the correspondence between the key structure of the image to be processed and the key value of the image to be processed.
8. The image recognition method of claim 7, wherein the matching the text blocks in the table area by using the key list to obtain the key block of the image to be processed and the value block of the image to be processed comprises:
obtaining a key from the list of keys;
carrying out fuzzy matching on the character blocks in the table area by using the keys to obtain the character blocks corresponding to the keys;
identifying the character block corresponding to the key as the key block of the image to be processed;
and marking other character blocks except the key block in the table area as the value blocks of the image to be processed.
9. The image recognition method according to claim 7, wherein the analyzing the key block and the value block to obtain a correspondence between a key structure of the image to be processed and a key value of the image to be processed includes:
acquiring a pre-trained graph model for table understanding;
and analyzing the key block and the value block by using the graph model to obtain the corresponding relation between the key structure of the image to be processed and the key value of the image to be processed.
10. The image recognition method of claim 9, wherein the obtaining a pre-trained graph model for table understanding comprises:
obtaining a graph model to be trained;
obtaining a table structure data sample;
obtaining a key block sample corresponding to the table structure data sample;
obtaining a value block sample corresponding to the table structure data sample;
and training the graph model to be trained according to the table structure data sample, the key block sample and the value block sample to obtain a pre-trained graph model for table understanding.
11. The image recognition method of claim 1, wherein the recognizing the content in the table area according to the parsing template comprises:
obtaining a key value structure in the table area according to the analysis template;
and identifying the attribute of the value in the table area according to the key value structure in the table area.
12. An image recognition apparatus, comprising:
the image acquisition unit is used for acquiring an image to be processed, wherein the image to be processed comprises a title area and a table area;
a title obtaining unit, for identifying the content in the title area and obtaining a title;
the template obtaining unit is used for obtaining an analysis template corresponding to the image to be processed according to the title;
and the content identification unit is used for identifying the content in the table area according to the analysis template.
13. An electronic device, comprising:
a processor; and
a memory for storing a program of a data processing method, wherein after the device is powered on and the program of the data processing method is run by the processor, the following steps are performed:
acquiring an image to be processed, wherein the image to be processed comprises a title area and a table area;
identifying the content in the title area to obtain a title;
obtaining an analysis template corresponding to the image to be processed according to the title;
and identifying the content in the table area according to the analysis template.
14. A storage device, characterized by storing a program of a data processing method, wherein the program, when executed by a processor, performs the following steps:
acquiring an image to be processed, wherein the image to be processed comprises a title area and a table area;
identifying the content in the title area to obtain a title;
obtaining an analysis template corresponding to the image to be processed according to the title;
and identifying the content in the table area according to the analysis template.
15. An image recognition method, comprising:
creating an analysis template corresponding to an image to be processed, wherein the image to be processed comprises a title area and a table area;
uploading the analysis template and the marked image of the image to be processed to a server side for training;
determining a trained analysis template according to the feedback information of the server;
uploading an image to be identified to a server;
and obtaining the recognition result of the image to be recognized returned by the server.
16. An optical character recognition method, comprising:
acquiring an optical character form image to be recognized;
identifying the optical character form image to obtain a title;
obtaining characteristic data of the optical character form image according to the title;
obtaining an identification template of the optical character form image according to the characteristic data;
and identifying the optical character form image by using the identification template to obtain the form content of the optical character form image.
17. A movie ticket image recognition method is characterized by comprising the following steps:
acquiring a movie ticket image to be identified;
identifying the movie ticket image to obtain a title;
acquiring characteristic data of the movie ticket image according to the title;
obtaining an identification template of the movie ticket image according to the characteristic data;
and identifying the movie ticket image by using the identification template to obtain the content information of the movie ticket image.
18. A test paper image recognition method is characterized by comprising the following steps:
acquiring a test paper image to be identified;
identifying the test paper image to obtain a title;
obtaining feature data of the test paper image according to the title;
obtaining an identification template of the test paper image according to the characteristic data;
and identifying the test paper image by using the identification template to obtain the content information of the test paper image.
19. A method for identifying a job image, comprising:
acquiring a job image to be identified;
identifying the operation image to obtain a title;
obtaining feature data of the operation image according to the title;
obtaining an identification template of the operation image according to the feature data;
and identifying the job image by using the identification template to obtain the content information of the job image.
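To illustrate the projection-based layout analysis recited in claim 3, a minimal NumPy sketch is given below. The binarization convention, the row-wise projection axis, and the default gap threshold are assumptions of this illustration rather than details fixed by the claim.

```python
import numpy as np

def split_rows(binary, gap_threshold=5):
    """Split a binarized page image (1 = ink, 0 = background) into
    horizontal text blocks using its row-projection profile.
    Rows whose projection is zero are blank space; blank runs longer
    than gap_threshold (the connected-interval threshold) separate
    adjacent blocks."""
    profile = binary.sum(axis=1)           # projection feature per row
    ink_rows = np.flatnonzero(profile > 0)
    if ink_rows.size == 0:
        return []
    blocks, start = [], ink_rows[0]
    for prev, cur in zip(ink_rows, ink_rows[1:]):
        if cur - prev > gap_threshold:     # blank gap wide enough to split
            blocks.append((start, prev + 1))
            start = cur
    blocks.append((start, ink_rows[-1] + 1))
    return blocks                          # list of (top, bottom) row ranges

# Toy 12-row page: a one-line "title" followed by a gap, then a "table".
page = np.zeros((12, 20), dtype=np.uint8)
page[1, 2:18] = 1      # title row
page[8:11, 1:19] = 1   # table rows
print(split_rows(page))  # -> [(1, 2), (8, 11)]
```

Identifying which of the returned blocks is the title area and which is the table area, the final step of claim 3, would then be performed on each block.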
CN202010294760.3A 2020-04-15 2020-04-15 Image recognition method, device and equipment Pending CN113537221A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010294760.3A CN113537221A (en) 2020-04-15 2020-04-15 Image recognition method, device and equipment

Publications (1)

Publication Number Publication Date
CN113537221A true CN113537221A (en) 2021-10-22

Family

ID=78088240

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010294760.3A Pending CN113537221A (en) 2020-04-15 2020-04-15 Image recognition method, device and equipment

Country Status (1)

Country Link
CN (1) CN113537221A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007323317A (en) * 2006-05-31 2007-12-13 Canon Inc Conversion device, conversion method, and program
CN109726628A (en) * 2018-11-05 2019-05-07 东北大学 A kind of recognition methods and system of form image
CN110008944A (en) * 2019-02-20 2019-07-12 平安科技(深圳)有限公司 OCR recognition methods and device, storage medium based on template matching
CN110414529A (en) * 2019-06-26 2019-11-05 深圳中兴网信科技有限公司 Paper information extracting method, system and computer readable storage medium
CN110442744A (en) * 2019-08-09 2019-11-12 泰康保险集团股份有限公司 Extract method, apparatus, electronic equipment and the readable medium of target information in image
CN110795525A (en) * 2019-09-17 2020-02-14 腾讯科技(深圳)有限公司 Text structuring method and device, electronic equipment and computer readable storage medium
CN110866457A (en) * 2019-10-28 2020-03-06 世纪保众(北京)网络科技有限公司 Electronic insurance policy obtaining method and device, computer equipment and storage medium
CN110942061A (en) * 2019-10-24 2020-03-31 泰康保险集团股份有限公司 Character recognition method, device, equipment and computer readable medium
CN110956166A (en) * 2019-12-02 2020-04-03 中国银行股份有限公司 Bill marking method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
段露 (Duan Lu); 宋永红 (Song Yonghong); 张元林 (Zhang Yuanlin): "一种面向问卷图像的版面分析算法" (A layout analysis algorithm for questionnaire images), 软件学报 (Journal of Software), no. 02 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117195846A (en) * 2023-11-07 2023-12-08 江西五十铃汽车有限公司 Expert table management method, system, storage medium and equipment
CN117195846B (en) * 2023-11-07 2024-03-01 江西五十铃汽车有限公司 Expert table management method, system, storage medium and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination