CN114911963A - Template picture classification method, device, equipment, storage medium and product


Info

Publication number
CN114911963A
CN114911963A
Authority
CN
China
Prior art keywords
picture
tested
template
text
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210516779.7A
Other languages
Chinese (zh)
Other versions
CN114911963B
Inventor
闻婷
郭玉杰
徐永达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Transwarp Technology Shanghai Co Ltd
Original Assignee
Transwarp Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Transwarp Technology Shanghai Co Ltd filed Critical Transwarp Technology Shanghai Co Ltd
Priority to CN202210516779.7A
Publication of CN114911963A
Application granted
Publication of CN114911963B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval of still image data
    • G06F 16/55 Clustering; Classification
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements using pattern recognition or machine learning
    • G06V 10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/761 Proximity, similarity or dissimilarity measures
    • G06V 10/764 Recognition or understanding using classification, e.g. of video objects


Abstract

The invention discloses a template picture classification method, device, equipment, storage medium and product. The method comprises the following steps: determining at least one template picture corresponding to a picture to be tested according to whether the picture to be tested contains a table; obtaining a multi-dimensional feature matrix based on the picture to be tested and its corresponding template pictures; inputting the feature matrix into a preset classification model and outputting the similarity between the picture to be tested and each corresponding template picture; and taking the category corresponding to the target template picture as the category to which the picture to be tested belongs, the target template picture being the template picture with the highest similarity to the picture to be tested. The method simplifies the template picture classification process and improves classification accuracy.

Description

Template picture classification method, device, equipment, storage medium and product
Technical Field
The embodiment of the invention relates to the technical field of image recognition, in particular to a template picture classification method, a template picture classification device, template picture classification equipment, a storage medium and a product.
Background
At present, general-purpose recognition in the field of Optical Character Recognition (OCR) is relatively mature, and common picture types can be handled through customized detection, recognition and targeted post-processing. In more personalized scenarios, a user may need to identify the location of specific keywords in a certain class of pictures. To meet this growing demand, a method commonly used in the industry is to extract key information from a class of pictures in batches through a custom template.
For the user, before each identification the picture must be compared with the template pictures one by one to find the template ID of the class to which the picture belongs; for the service, a picture cannot be identified if the user does not specify a template ID.
Existing methods usually require configuring multiple training pictures for the template picture of each category and then performing online training to learn the differences between pictures of different categories. Whenever a picture category is added, retraining is required, making the process complex and inflexible.
Disclosure of Invention
The invention provides a template picture classification method, device, equipment, storage medium and product, which aim to overcome the defects of the prior art: multiple training pictures must be additionally provided to train a classification model, and the model must be retrained whenever a new category is added.
According to an aspect of the present invention, there is provided a template picture classification method, including:
determining at least one template picture corresponding to the picture to be tested according to whether the picture to be tested contains a table;
obtaining a multi-dimensional feature matrix based on the picture to be tested and at least one template picture corresponding to the picture to be tested;
inputting the feature matrix into a preset classification model, and outputting the similarity between the picture to be tested and each template picture corresponding to the picture to be tested;
and taking the category to which the picture to be tested belongs as the category corresponding to the target template picture, wherein the target template picture is the template picture with the highest similarity to the picture to be tested.
According to another aspect of the present invention, there is provided a template picture classification apparatus, including:
the template picture determining module is used for determining at least one template picture corresponding to the picture to be tested according to whether the picture to be tested contains a table;
the characteristic matrix determining module is used for obtaining a multi-dimensional characteristic matrix based on the picture to be tested and at least one template picture corresponding to the picture to be tested;
the similarity determining module is used for inputting the characteristic matrix into a preset classification model and outputting the similarity between the picture to be tested and each template picture corresponding to the picture to be tested;
and the classification module is used for taking the category to which the picture to be tested belongs as the category corresponding to the target template picture, and the target template picture is the template picture with the highest similarity with the picture to be tested.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the template picture classification method according to any embodiment of the invention.
According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions for causing a processor to implement the template picture classification method according to any embodiment of the present invention when the computer instructions are executed.
According to the technical solution of the embodiments of the invention, the target template picture is determined through effective feature extraction and feature matching, and the category of the target template picture can then be directly assigned to the picture to be tested. This solves the problems of existing template picture classification methods, namely the difficulty of obtaining training pictures and the need for repeated model training, thereby simplifying the classification process and improving classification accuracy.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present invention, nor do they necessarily limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic flowchart of a template picture classification method according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating a process of classifying template pictures according to a second embodiment of the present invention;
fig. 3 is a schematic structural diagram of a template picture classification apparatus according to a third embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device of a template picture classification method according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. It should be understood that the various steps recited in method embodiments of the present invention may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the invention is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Moreover, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It is noted that references to "a", "an", and "the" modifications in the present invention are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that reference to "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present invention are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Example one
A common template picture classification method at present is to upload about 30 training pictures for each category of template picture after its configuration is completed, and to realize classification among multiple templates by constructing a template feature classifier. In essence, this method trains classifiers on the user's own data to distinguish template categories: the N template categories must be determined in advance, the trained classification model can only distinguish those N template pictures, and whenever a picture category is added the model must be retrained on the N+1 template pictures.
The conventional template picture classification method has the following problems:
1. it is not ready to use upon configuration: the template feature classifier must be retrained every time a new category of template picture is added;
2. it imposes a lower limit on the number of the user's pictures: a required number of training pictures is specified for each category of template, and multi-template classification cannot be realized if the user's current pictures do not meet this requirement;
3. training the classifier costs the user redundant effort and additional time.
Based on the above problems, embodiments of the present invention provide a template picture classification method, which can effectively solve the above problems.
Fig. 1 is a flowchart of a template picture classification method according to an embodiment of the present invention. The method is applicable to identifying information in pictures and can be executed by a template picture classification apparatus, which can be implemented in software and/or hardware and is generally integrated in an electronic device. In this embodiment, the electronic device includes, but is not limited to, a computer device.
As shown in fig. 1, a method for classifying template pictures according to an embodiment of the present invention includes the following steps:
s110, determining at least one template picture corresponding to the picture to be tested according to whether the picture to be tested contains the table.
The picture to be tested can be a picture to be identified, and information to be identified in the picture to be tested can be identified according to the template picture with the same type as the picture to be tested. For example, if the picture to be tested is a train ticket picture, the template picture of the train ticket type can be used to identify the user-specified identification keyword region in the train ticket picture, and identify the keywords in the specified identification keyword region.
In this embodiment, a set of candidate template pictures corresponding to the picture to be tested is preliminarily determined according to whether the picture to be tested contains a table.
Specifically, determining at least one template picture corresponding to the picture to be tested according to whether the picture to be tested contains a table includes: if the picture to be tested contains a table, acquiring all template pictures containing a table from the plurality of template pictures configured by the user as the first template pictures corresponding to the picture to be tested; and if the picture to be tested does not contain a table, acquiring all template pictures not containing a table from the plurality of template pictures configured by the user as the second template pictures corresponding to the picture to be tested.
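This pre-filtering step can be sketched in a few lines; the template record layout and the `has_table` flag below are illustrative assumptions, not structures disclosed by the patent:

```python
def select_candidate_templates(picture_has_table, templates):
    """Keep only the templates whose table-presence flag matches the
    picture to be tested (first template pictures if it has a table,
    second template pictures otherwise)."""
    return [t for t in templates if t["has_table"] == picture_has_table]

# Hypothetical user-configured templates.
templates = [
    {"id": "vat_invoice", "has_table": True},
    {"id": "train_ticket", "has_table": False},
    {"id": "bank_receipt", "has_table": True},
]

# Picture to be tested contains a table -> only table templates remain.
candidates = select_candidate_templates(True, templates)
```

The point of the split is to narrow all later feature matching to templates with the same coarse structure, which keeps the pre-classification step cheap.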
The first template picture can be understood as a template picture containing a form; the second template picture may be understood as a template picture that does not contain a form. The first template picture and the second template picture can be determined after identification.
In this embodiment, the template pictures are pre-configured by the user; the configuration includes extracting anchor keywords and framing the key regions to be identified in the template picture. After the user configures a template picture, the configuration result is stored in the database, and when the user inputs a picture to be tested, the template pictures are obtained directly from the database. The template pictures pre-configured by the user can include different types of template pictures.
Anchor keywords are keywords whose positions and contents remain relatively unchanged in the template picture. For example, in a train ticket picture, anchor keywords may include the origin and destination, departure time, ticket price, seat number, and the like. A key region to be identified is a region from which key information can be recognized.
S120, obtaining a multi-dimensional feature matrix based on the picture to be tested and at least one template picture corresponding to the picture to be tested.
In this embodiment, feature extraction is performed on a picture to be tested and at least one template picture corresponding to the picture to be tested, a plurality of features of the picture to be tested are respectively matched with corresponding features of each template picture one by one, and matching information obtained after matching forms a multi-dimensional feature matrix. After feature matching is performed on the picture to be tested and one template picture, a group of multi-dimensional feature vectors can be obtained, and the feature matrix can comprise a plurality of groups of feature vectors.
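The assembly of the multi-dimensional feature matrix can be sketched as one match vector per template; `toy_match` below is only a placeholder for the real text/table matching described in the following steps:

```python
def build_feature_matrix(test_features, template_features_list, match):
    """One row per template: the multi-dimensional match vector between
    the picture to be tested and that template picture."""
    return [match(test_features, tf) for tf in template_features_list]

def toy_match(a, b):
    # Stand-in for the real matching: per-feature absolute differences.
    return [abs(x - y) for x, y in zip(a, b)]

# Two candidate templates -> a 2-row feature matrix.
matrix = build_feature_matrix([1.0, 2.0], [[1.0, 2.0], [0.0, 0.5]], toy_match)
```

Because each row is computed independently, adding a new template category only adds a row; nothing has to be retrained.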
It can be understood that different features are extracted depending on whether the picture to be tested contains a table: for a picture containing a table, text features and table features are extracted separately, whereas for a picture without a table only text features are extracted. Moreover, the text features extracted from a picture containing a table differ from those extracted from a picture without a table.
In an embodiment, if the picture to be tested contains a table, obtaining the multi-dimensional feature matrix based on the picture to be tested and its corresponding template pictures includes: extracting features from the table information and text information in the picture to be tested to obtain first table features and first text features; extracting features from the table information and text information in each first template picture to obtain second table features and second text features; matching the first text features with the second text features of each first template picture to obtain text matching information between the picture to be tested and each first template picture; matching the first table features with the second table features of each first template picture to obtain table matching information between the picture to be tested and each first template picture; and combining the text matching information and the table matching information into the multi-dimensional feature matrix.
The first text features are the text features extracted after character detection and character recognition on a picture to be tested containing a table, and may include text position features and variable-length coding features, namely IDS (ideographic description sequence) coding features. The variable-length coding features can be used to compare the glyph similarity of Chinese characters: the coding turns a Chinese character into a string ordered by its radicals and strokes, and a tree edit distance measures the difference between characters; the larger the tree edit distance, the greater the difference between the characters.
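The patent measures character difference with a tree edit distance over component strings. As a simplified, hedged sketch, a plain string edit distance over hypothetical component sequences already captures the idea that structurally similar characters get a small distance:

```python
def edit_distance(a, b):
    """Levenshtein distance between two component strings, computed with
    a single rolling row of the dynamic-programming table."""
    if len(a) < len(b):
        a, b = b, a
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            cur = dp[j]
            # deletion, insertion, substitution (free when characters match)
            dp[j] = min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
            prev = cur
    return dp[-1]

def glyph_similarity(comp_a, comp_b):
    """Map the distance into [0, 1]: identical component strings score 1."""
    longest = max(len(comp_a), len(comp_b)) or 1
    return 1.0 - edit_distance(comp_a, comp_b) / longest
```

A string edit distance ignores the tree structure the patent exploits, so treat this only as an approximation of the same idea.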
The first table features are the table features extracted after table recognition on a picture to be tested containing a table, and may include global table features, cell local features and cell text features.
The second text features are the text features extracted after character detection and character recognition on a first template picture, and may include the text position features and the variable-length coding features of the anchor keywords.
The second table features are the table features extracted after table recognition on a first template picture, and may include global table features, cell local features and cell text features. It should be further noted that the global table features may include row and column features, table position features and a normalized size; the cell local features may include the normalized coordinates of each cell in the table and the cross-row and cross-column information of each cell; and the cell text features may include the variable-length coding features in each cell.
In this embodiment, the text matching information may be a text similarity index between the picture to be tested and the first template picture, and the text matching information may be determined in a manner that: and performing similarity comparison on the first text characteristic and the second text characteristic to obtain a text similarity index, and taking the text similarity index as text matching information.
Specifically, the text matching information includes the number of matched anchor keywords, the anchor keyword matching similarity and the anchor keyword position similarity. Correspondingly, matching the first text features with the second text features of each first template picture to obtain the text matching information includes: comparing the variable-length coding features of the anchor keywords in the second text features with the variable-length coding features in the first text features to determine the number of matched anchor keywords and the anchor keyword matching similarity; and determining the anchor keyword position similarity according to the text position features in the first text features and the text position features in the second text features.
The more closely the text position features in the first text features match those in the second text features, the higher the anchor keyword position similarity.
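One illustrative way (not necessarily the patent's formula) to turn position agreement into a score is to average the center distances of matched anchor boxes on normalized coordinates and map the mean into (0, 1]:

```python
import math

def position_similarity(test_centers, template_centers):
    """Mean Euclidean distance between matched anchor-keyword centers
    (normalized coordinates), mapped so identical layouts score 1.0."""
    dists = [math.hypot(x1 - x2, y1 - y2)
             for (x1, y1), (x2, y2) in zip(test_centers, template_centers)]
    return 1.0 / (1.0 + sum(dists) / len(dists))
```

Any monotone decreasing map of the mean distance would serve equally well; the 1/(1+d) form is just a convenient bounded choice.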
In this embodiment, the table matching information may include a table similarity index between the picture to be tested and a first template picture, determined by comparing the first table features with the second table features; the resulting table similarity index serves as the table matching information.
Specifically, matching the first table features with the second table features of each first template picture to obtain the table matching information includes: determining the row, column and cell-count comparison results and the table normalized-position comparison result according to the global table features in the first table features and in the second table features, and taking these comparison results as the global table matching information in the table matching information; matching the cells in the table one by one according to the normalized size and the cell local features in the first table features and in the second table features, determining the number of matched cells, the number of unmatched cells, the size similarity of matched cells, the overlapping area of matched cells, the row offset of cross-row cells and the column offset of cross-column cells, and taking these matching results as the cell matching information in the table matching information; and determining the text similarity and text position similarity of matched cells according to the cell text features in the first table features and in the second table features, and taking these similarity results as the cell text matching information in the table matching information.
The table matching information may be determined as follows: the global table features of the picture to be tested are compared with those of the first template picture to obtain the row, column and cell-count comparison results and the table normalized-position comparison result; cell shape information is obtained from the normalized position features of the cells, and matching cells one by one on size similarity and overlapping area yields the number of matched cells, the size similarity of matched cells and the degree of cell overlap; the offsets of cross-row and cross-column cells are computed from those cells; comparing the variable-length coding features inside cells yields the cell text similarity, and comparing the text position features inside cells yields the cell text position similarity.
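The cell-by-cell matching above can be sketched with an intersection-over-union (IoU) criterion; the greedy strategy and the 0.5 threshold are illustrative assumptions, not the patent's specification:

```python
def cell_iou(a, b):
    """Intersection-over-union of two cells given as (x0, y0, x1, y1)
    in normalized coordinates."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def match_cells(test_cells, template_cells, threshold=0.5):
    """Greedy one-to-one matching: each test cell takes the best still
    unmatched template cell whose IoU exceeds the threshold."""
    used, matches = set(), []
    for i, tc in enumerate(test_cells):
        best_j, best_iou = None, threshold
        for j, pc in enumerate(template_cells):
            if j in used:
                continue
            iou = cell_iou(tc, pc)
            if iou > best_iou:
                best_j, best_iou = j, iou
        if best_j is not None:
            used.add(best_j)
            matches.append((i, best_j, best_iou))
    return matches
```

The matched pairs then feed the cell-count, size-similarity and overlap statistics listed above; unmatched cells on either side are simply those left out of `matches`.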
In one embodiment, if the picture to be tested does not contain a table, obtaining the multi-dimensional feature matrix based on the picture to be tested and its corresponding template pictures includes: extracting features from the text information in the picture to be tested to obtain third text features; for each second template picture, extracting features from its text information to obtain fourth text features; matching the third text features with the fourth text features of each second template picture to obtain the matching information between the picture to be tested and each second template picture; and combining the matching information of the picture to be tested with each second template picture into the multi-dimensional feature matrix.
The third text features are the text features extracted after character detection and character recognition on a picture to be tested that does not contain a table, and may include text position features, variable-length coding features and word vector features. The word vector features may be obtained by word-vector encoding of the text in the picture using natural language processing.
The fourth text features are the text features extracted after character detection and character recognition on a second template picture, and may include text position features, variable-length coding features and word vector features.
In one embodiment, the matching information includes the number of matched anchor keywords, the anchor keyword position similarity and the picture text similarity. Correspondingly, matching the third text features with the fourth text features of each second template picture to obtain the matching information includes: comparing the variable-length coding features of the anchor keywords in the fourth text features with the variable-length coding features in the third text features to determine the number of matched anchor keywords and the anchor keyword similarity; determining the anchor keyword position similarity according to the text position features in the third text features and in the fourth text features; and determining the picture text similarity according to the word vector features in the third text features and in the fourth text features.
Comparing the variable-length coding features of the picture to be tested and the second template picture yields the number of matched anchor keywords and the anchor keyword similarity; comparing their text position features yields the anchor keyword position similarity; and comparing their word vector features yields the picture text similarity.
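The word-vector comparison is typically a cosine similarity; the patent does not name the similarity function, so cosine is an assumption here:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two word vectors: 1.0 for parallel
    vectors, 0.0 for orthogonal (or zero) vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

Averaging this over the word vectors of both pictures would give a single picture-text-similarity entry for the feature matrix.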
S130, inputting the feature matrix into a preset classification model, and outputting the similarity between the picture to be tested and each template picture corresponding to the picture to be tested.
The preset classification model may be a preset binary classification model. According to the input feature matrix, the preset classification model analyzes the similarity between the picture to be tested and the template pictures, and correspondingly outputs the similarity between the picture to be tested and each template picture.
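As a sketch of how such a preset binary classification model might map each row of the feature matrix to a similarity score in [0, 1] (the logistic-regression form and the zero-initialized weights are assumptions for illustration; the patent does not specify the model architecture or training):

```python
import numpy as np

class PresetBinaryClassifier:
    """Minimal sketch of a preset binary classification model: a logistic
    regression that maps each (picture, template) feature row to a
    similarity score in [0, 1]. Weights are placeholders; a real model
    would be trained on labelled matching / non-matching pairs."""

    def __init__(self, n_features):
        self.w = np.zeros(n_features)  # hypothetical trained weights
        self.b = 0.0

    def predict_similarity(self, feature_matrix):
        # One row per (picture-to-be-tested, template picture) pair.
        z = feature_matrix @ self.w + self.b
        return 1.0 / (1.0 + np.exp(-z))

clf = PresetBinaryClassifier(n_features=4)
scores = clf.predict_similarity(np.zeros((3, 4)))  # three candidate templates
```

With untrained zero weights every pair scores 0.5; after training, rows with many anchor matches and high overlap push the score toward 1.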
S140, taking the template picture with the highest similarity as the target template picture of the category to which the picture to be tested belongs, so as to complete the template picture classification.
In this embodiment, after the similarity between the picture to be tested and each template picture is obtained, the template picture with the highest similarity may be used as the target template picture, and the picture type to which the target template picture belongs is the same as the picture type of the picture to be tested. For example, if the picture to be tested is a ticket type, the target template picture is also the ticket type.
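The selection in S140 reduces to an argmax over the similarity scores; a minimal sketch (the names are hypothetical):

```python
def select_target_template(template_names, similarities):
    """S140: pick the template picture with the highest similarity as the
    target template picture; its category is taken as the category of the
    picture to be tested."""
    best = max(range(len(similarities)), key=similarities.__getitem__)
    return template_names[best], similarities[best]

name, score = select_target_template(["train_ticket", "passport"], [0.91, 0.34])
```

Here a train-ticket picture to be tested would be assigned the train-ticket template and, with it, the ticket category.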
The template picture classification method provided by the embodiment of the invention first determines at least one template picture corresponding to the picture to be tested according to whether the picture to be tested contains a table; then obtains a multi-dimensional feature matrix based on the picture to be tested and the at least one corresponding template picture; next inputs the feature matrix into a preset classification model and outputs the similarity between the picture to be tested and each corresponding template picture; and finally takes the category to which the picture to be tested belongs as the category corresponding to the target template picture, where the target template picture is the template picture with the highest similarity to the picture to be tested. With this method, no additional template pictures need to be added: by extracting features from the picture to be tested and the template pictures, the template picture most similar to the picture to be tested can be matched through feature comparison, and the category of the template picture is thereby determined.
Further, the embodiment of the present invention includes: performing information identification on the picture to be tested through the target template picture, and displaying the name of the target template picture and the identification result to a user, where the identification result is obtained after the picture to be tested is identified according to the key region to be identified in the target template picture.
The name of the target template picture is displayed to the user so that the user can judge, from the displayed name, whether the target template picture and the picture to be tested belong to the same category, and thereby further confirm the accuracy of the identification result. For example, if the name of the target template picture displayed to the user is a passport template picture while the picture to be tested input by the user is a train ticket picture, it can be determined that the target template picture was selected incorrectly, and accordingly that the displayed identification result is incorrect.
Example two
On the basis of the technical solutions of the above embodiments, this embodiment of the present invention provides a specific implementation, which may be taken as an exemplary embodiment of the present invention.
Fig. 2 is a flowchart illustrating a template picture classification process according to the second embodiment of the present invention. As shown in Fig. 2, a picture to be tested input by a user is obtained, and text recognition, text detection, and table detection are performed on it in sequence to determine whether the picture to be tested contains a table. If the picture to be tested contains a table, feature generation is performed on the picture to be tested together with all template pictures containing table information to obtain a group of multi-dimensional feature matrices, and template pictures without table information do not participate in feature generation. If the picture to be tested does not contain a table, feature generation is performed on the picture to be tested together with all template pictures that do not contain a table to obtain a group of multi-dimensional feature matrices, and template pictures containing table information do not participate in feature generation. The generated feature matrix is input into the binary classifier to obtain the similarity between the picture to be tested and each corresponding template picture; the best template picture is then selected according to the similarity and taken as the template picture of the same category as the picture to be tested.
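The Fig. 2 flow can be summarized in Python as follows (a sketch under assumptions: the `Template` record, the precomputed per-template features, and the scoring function stand in for the actual feature generation and binary classifier, none of which are specified at this level of detail):

```python
from dataclasses import dataclass

@dataclass
class Template:
    name: str
    has_table: bool
    features: list  # hypothetical precomputed matching features vs. the picture

def classify_template_picture(picture_has_table, templates, score_fn):
    """Sketch of the Fig. 2 flow: templates whose table / no-table status
    differs from the picture to be tested do not participate in feature
    generation; the remaining candidates are scored by the classifier and
    the best one is returned as the same-category template."""
    candidates = [t for t in templates if t.has_table == picture_has_table]
    if not candidates:
        return None
    scored = [(score_fn(t.features), t) for t in candidates]
    return max(scored, key=lambda pair: pair[0])[1]

templates = [
    Template("vat_invoice", True,  [0.9, 0.8]),
    Template("passport",    False, [0.2, 0.1]),
    Template("train_ticket", False, [0.7, 0.9]),
]
best = classify_template_picture(False, templates,
                                 score_fn=lambda f: sum(f) / len(f))
```

Filtering by table presence first halves the candidate set before any feature matrix is built, which is what keeps the batch-identification case cheap.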
With the template picture classification method provided by this embodiment of the invention, when a batch picture identification task is performed, no specific template picture needs to be specified, and a matching template picture in the template library is selected automatically; feature matching determines the similarity between the template pictures and the picture to be tested more accurately, and the picture category to which the picture to be tested belongs can be taken as the template picture classification result.
EXAMPLE III
Fig. 3 is a schematic structural diagram of a template picture classifying device according to a third embodiment of the present invention, which is applicable to a situation where information in a picture is identified, where the device may be implemented by software and/or hardware and is generally integrated on an electronic device.
As shown in fig. 3, the apparatus includes: a template picture determination module 110, a feature matrix determination module 120, a similarity determination module 130, and a classification module 140.
The template picture determining module 110 is configured to determine at least one template picture corresponding to a picture to be tested according to whether the picture to be tested includes a table;
the feature matrix determining module 120 is configured to obtain a multi-dimensional feature matrix based on the picture to be tested and at least one template picture corresponding to the picture to be tested;
the similarity determining module 130 is configured to input the feature matrix into a preset classification model, and output the similarity between the picture to be tested and each template picture corresponding to the picture to be tested;
the classification module 140 is configured to use the category to which the to-be-tested picture belongs as a category corresponding to a target template picture, where the target template picture is a template picture with the highest similarity to the to-be-tested picture.
In this embodiment, the apparatus first determines at least one template picture corresponding to a picture to be tested according to whether the picture to be tested includes a table through the template picture determining module 110; then, a multi-dimensional feature matrix is obtained through the feature matrix determining module 120 based on the picture to be tested and at least one template picture corresponding to the picture to be tested; then, the feature matrix is input into a preset classification model through a similarity determination module 130, and the similarity between the picture to be tested and each template picture corresponding to the picture to be tested is output; and finally, the classification module 140 takes the class to which the picture to be tested belongs as the class corresponding to the target template picture, wherein the target template picture is the template picture with the highest similarity with the picture to be tested.
The embodiment provides a template picture classification device, which can simplify the classification process of template pictures and improve the accuracy of template picture classification.
Further, the template picture is obtained after being configured in advance by a user, and the configured content includes extracting anchor keywords and framing the key regions to be identified in the template picture.
Further, the template picture determining module 110 is configured to: if the picture to be tested contains a table, acquire all template pictures containing a table from a plurality of template pictures pre-configured by the user as the plurality of first template pictures corresponding to the picture to be tested; and if the picture to be tested does not contain a table, acquire all template pictures that do not contain a table from the plurality of template pictures pre-configured by the user as the plurality of second template pictures corresponding to the picture to be tested.
Based on the above technical solution, if the picture to be tested contains a table, the feature matrix determining module 120 is specifically configured to: perform feature extraction on the table information and the text information in the picture to be tested respectively to obtain first table features and first text features; perform feature extraction on the table information and the text information in each first template picture respectively to obtain second table features and second text features; perform feature matching between the first text features and the second text features corresponding to each first template picture to obtain the text matching information between the picture to be tested and each first template picture; perform feature matching between the first table features and the second table features corresponding to each first template picture to obtain the table matching information between the picture to be tested and each first template picture; and combine the text matching information and the table matching information to obtain the multi-dimensional feature matrix.
Further, the text matching information includes the number of anchor keyword matches, the anchor keyword matching similarity, and the anchor keyword position similarity. Correspondingly, matching the first text features with the second text features corresponding to each first template picture to obtain the text matching information between the picture to be tested and each first template picture includes: performing text comparison between the variable-length coding features of the anchor keywords included in the second text features and the variable-length coding features included in the first text features, and determining the number of anchor keyword matches and the anchor keyword matching similarity; and determining the anchor keyword position similarity according to the text position features included in the first text features and the text position features included in the second text features.
Further, matching the first table features with the second table features corresponding to each first template picture to obtain the table matching information between the picture to be tested and each first template picture includes: determining the row, column, and cell number comparison results and the table normalized position comparison result according to the global table features included in the first table features and the global table features included in the second table features, and taking the determined comparison results as the global table matching information in the table matching information; matching the cells in the table one by one according to the normalized size, the cell local features included in the first table features, and the cell local features included in the second table features, determining the number of matched cells, the number of unmatched cells, the size similarity of matched cells, the overlapping area of matched cells, the row offset of cross-row cells, and the column offset of cross-column cells, and taking the determined matching results as the cell matching information in the table matching information; and determining the text similarity and text position similarity of matched cells according to the cell text features included in the first table features and the cell text features included in the second table features, and taking the determined similarity results as the cell text matching information in the table matching information.
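The cell-by-cell matching described above can be illustrated as follows (a sketch: greedy intersection-over-union matching on normalized cell boxes is an assumption, since the patent does not prescribe a matching algorithm, and the row/column offsets here are simply the grid-index differences of matched cells):

```python
def _iou(a, b):
    """Overlap of two normalized cell boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def match_cells(cells_a, cells_b, iou_threshold=0.5):
    """Greedy cell matching. Each cell is ((row, col), box); returns counts
    of matched/unmatched cells, mean overlap of matched cells, and the
    row/column offsets of matched cells whose grid position shifted."""
    used, matched, overlaps, row_off, col_off = set(), 0, [], [], []
    for pos_a, box_a in cells_a:
        best_j, best_iou = -1, iou_threshold
        for j, (pos_b, box_b) in enumerate(cells_b):
            if j in used:
                continue
            score = _iou(box_a, box_b)
            if score >= best_iou:
                best_j, best_iou = j, score
        if best_j >= 0:
            used.add(best_j)
            matched += 1
            overlaps.append(best_iou)
            pos_b = cells_b[best_j][0]
            row_off.append(pos_b[0] - pos_a[0])  # cross-row offset
            col_off.append(pos_b[1] - pos_a[1])  # cross-column offset
    unmatched = (len(cells_a) - matched) + (len(cells_b) - matched)
    mean_overlap = sum(overlaps) / len(overlaps) if overlaps else 0.0
    return matched, unmatched, mean_overlap, row_off, col_off
```

Because the boxes are normalized to the table size first, the overlap and offset quantities compare cell layout rather than absolute pixel positions.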
Based on the above technical solution, if the picture to be tested does not contain a table, the feature matrix determining module 120 is specifically configured to: perform feature extraction on the text information in the picture to be tested to obtain third text features; for each second template picture, perform feature extraction on the text information in the second template picture to obtain fourth text features; match the third text features with the fourth text features corresponding to each second template picture to obtain the matching information between the picture to be tested and each second template picture; and combine the matching information between the picture to be tested and each second template picture to obtain the multi-dimensional feature matrix.
Further, the matching information includes the number of anchor keyword matches, the anchor keyword matching similarity, the anchor keyword position similarity, and the picture text similarity. Correspondingly, matching the third text features with the fourth text features corresponding to each second template picture to obtain the matching information between the picture to be tested and each second template picture includes: performing text comparison between the variable-length coding features of the anchor keywords included in the fourth text features and the variable-length coding features included in the third text features, and determining the number of anchor keyword matches and the anchor keyword matching similarity; determining the anchor keyword position similarity according to the text position features included in the third text features and the text position features included in the fourth text features; and determining the picture text similarity according to the word vector features included in the third text features and the word vector features included in the fourth text features.
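The matching information listed above can then be assembled into the multi-dimensional feature matrix, one row per (picture to be tested, second template picture) pair; a minimal sketch (the dictionary keys and the column order are assumptions made for illustration):

```python
import numpy as np

def build_feature_matrix(per_template_matches):
    """Stack per-template matching information into the multi-dimensional
    feature matrix: one row per (picture-to-be-tested, template) pair with
    columns [anchor match count, anchor matching similarity,
    anchor position similarity, picture text similarity]."""
    return np.array(
        [[m["n_anchor"], m["anchor_sim"], m["pos_sim"], m["text_sim"]]
         for m in per_template_matches],
        dtype=float)
```

The resulting matrix is exactly the input expected by the preset classification model in S130, so the no-table branch and the table branch differ only in which columns are populated.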
Further, the device also comprises an identification module, which is used for carrying out information identification on the picture to be tested through the target template picture and displaying the name and the identification result of the target template picture to a user; and the identification result is obtained after the picture to be tested is identified according to the key area to be identified in the target template picture.
The template picture classification device can execute the template picture classification method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example four
FIG. 4 shows a schematic block diagram of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the invention described and/or claimed herein.
As shown in fig. 4, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM)12, a Random Access Memory (RAM)13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM)12 or the computer program loaded from a storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data necessary for the operation of the electronic apparatus 10 can also be stored. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The processor 11 performs the various methods and processes described above, such as the template picture classification method.
In some embodiments, the template picture classification method may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the template picture classification method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the template picture classification method by any other suitable means (e.g. by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program may execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, a host product in the cloud computing service system that overcomes the defects of high management difficulty and weak service scalability in traditional physical host and VPS (Virtual Private Server) services.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (13)

1. A template picture classification method is characterized by comprising the following steps:
determining at least one template picture corresponding to the picture to be tested according to whether the picture to be tested contains a form or not;
obtaining a multi-dimensional feature matrix based on the picture to be tested and at least one template picture corresponding to the picture to be tested;
inputting the feature matrix into a preset classification model, and outputting the similarity between the picture to be tested and each template picture corresponding to the picture to be tested;
and taking the category to which the picture to be tested belongs as the category corresponding to the target template picture, wherein the target template picture is the template picture with the highest similarity with the picture to be tested.
2. The method of claim 1, wherein the template picture is obtained after being configured in advance by a user, and the configured content includes extracting anchor keywords and framing the key regions to be identified in the template picture.
3. The method of claim 1, wherein determining at least one template picture corresponding to the picture to be tested according to whether the picture to be tested contains a table comprises:
if the picture to be tested contains a table, acquiring all template pictures containing a table from a plurality of template pictures pre-configured by a user as a plurality of first template pictures corresponding to the picture to be tested;
and if the picture to be tested does not contain the form, acquiring all template pictures which do not contain the form from a plurality of template pictures which are configured in advance by a user as a plurality of second template pictures corresponding to the picture to be tested.
4. The method of claim 3, wherein if the picture to be tested includes a table, and correspondingly, obtaining a multi-dimensional feature matrix based on the picture to be tested and at least one template picture corresponding to the picture to be tested comprises:
respectively extracting features of the table information and the text information in the picture to be tested to obtain a first table feature and a first text feature;
respectively extracting the characteristics of the form information and the text information in each first template picture to obtain a second form characteristic and a second text characteristic;
respectively performing feature matching on the first text features and second text features corresponding to each template picture to obtain text matching information of the picture to be tested and each first template picture;
respectively carrying out feature matching on the first form features and second form features corresponding to each first template picture to obtain form matching information of the picture to be tested and each first template picture;
and combining the text matching information and the table matching information to obtain a multi-dimensional feature matrix.
5. The method of claim 4, wherein the text matching information includes a number of anchor keyword matches, an anchor keyword matching similarity, and an anchor keyword location similarity, and correspondingly, the matching the first text features with the second text features corresponding to each template picture to obtain the text matching information between the picture to be tested and each first template picture includes:
text comparison is carried out on the variable length coding features of the anchor keywords included in the second text features and the variable length coding features included in the first text features, and the matching number of the anchor keywords and the matching similarity of the anchor keywords are determined;
and determining the position similarity of the anchor keywords according to the text position characteristics included in the first text characteristics and the text position characteristics included in the second text characteristics.
6. The method according to claim 4, wherein the matching the first form feature with the second form feature corresponding to each first template picture to obtain form matching information between the picture to be tested and each first template picture comprises:
determining row, column, cell number comparison results and table normalization position comparison results in a table according to global table features included in the first table features and global table features included in the second table features, and taking the determined comparison results as global table matching information in the table matching information;
according to the normalized size, the cell local features included in the first table features, and the cell local features included in the second table features, matching the cells in the table one by one, determining the number of matched cells, the number of unmatched cells, the size similarity of the matched cells, the overlapping area of the matched cells, the row offset of the cross-row cells, and the column offset of the cross-column cells, and taking the determined matching result as cell matching information in the table matching information;
and determining the similarity of the matched cell texts and the similarity of the positions of the matched cell texts according to the cell text features included in the first table features and the cell text features included in the second table features, and taking the determined similarity result as the cell text matching information in the table matching information.
7. The method of claim 3, wherein if the picture to be tested does not include a table, and correspondingly, obtaining a multi-dimensional feature matrix based on the picture to be tested and at least one template picture corresponding to the picture to be tested comprises:
performing feature extraction on the text information in the picture to be tested to obtain a third text feature;
for each second template picture, performing feature extraction on text information in the second template picture to obtain a fourth text feature;
matching the third text characteristics with fourth text characteristics corresponding to each second template picture respectively to obtain matching information of the picture to be tested and each second template picture;
and combining the picture to be tested with the matching information of each second template picture to obtain a multi-dimensional characteristic matrix.
8. The method according to claim 7, wherein the matching information includes an anchor keyword matching number, an anchor keyword matching similarity, an anchor keyword position similarity, and a picture text similarity, and correspondingly, matching the third text feature with the fourth text feature corresponding to each second template picture to obtain the matching information between the picture to be tested and each second template picture comprises:
performing text comparison between the variable-length coding features of the anchor keywords included in the fourth text feature and the variable-length coding features included in the third text feature, and determining the anchor keyword matching number and the anchor keyword matching similarity;
determining the anchor keyword position similarity according to the text position features included in the third text feature and the text position features included in the fourth text feature;
and determining the picture text similarity according to the word vector features included in the third text feature and the word vector features included in the fourth text feature.
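A minimal Python sketch of the anchor-keyword signals in claim 8. It is not the patent's method: `difflib` stands in for the variable-length-coding text comparison, positions are plain `(x, y)` pairs, and the 0.8 match threshold and page-diagonal normalisation are assumptions for the example.

```python
import math
from difflib import SequenceMatcher


def text_similarity(a: str, b: str) -> float:
    """Character-level similarity, an illustrative stand-in for the
    variable-length coding comparison in the claim."""
    return SequenceMatcher(None, a, b).ratio()


def position_similarity(p1, p2, diag: float) -> float:
    """Similarity of two (x, y) text positions, normalised by a page diagonal."""
    dist = math.hypot(p1[0] - p2[0], p1[1] - p2[1])
    return max(0.0, 1.0 - dist / diag)


def match_anchor_keywords(test_words, template_anchors,
                          threshold=0.8, diag=1000.0):
    """Return (matching number, mean text similarity, mean position similarity)
    between the picture under test and one template's anchor keywords.
    Each word is a (text, (x, y)) pair -- a simplified feature format."""
    matched, text_sims, pos_sims = 0, [], []
    for anchor_text, anchor_pos in template_anchors:
        # best candidate in the test picture for this anchor keyword
        best = max(test_words, key=lambda w: text_similarity(w[0], anchor_text))
        sim = text_similarity(best[0], anchor_text)
        if sim >= threshold:
            matched += 1
            text_sims.append(sim)
            pos_sims.append(position_similarity(best[1], anchor_pos, diag))
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return matched, mean(text_sims), mean(pos_sims)
```

The picture text similarity over word-vector features would be a separate signal (e.g. a cosine similarity of pooled embeddings), omitted here for brevity.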
9. The method of claim 1, further comprising:
performing information identification on the picture to be tested through the target template picture, and displaying the name of the target template picture and the identification result to a user;
wherein the identification result is obtained by identifying the picture to be tested according to the key area to be identified in the target template picture.
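The key-area identification in claim 9 can be sketched as cropping each template-defined region from the test picture and running a recognizer on the crop. This is an illustrative assumption: the 2-D list picture, the `(x, y, w, h)` region format, and the `ocr` callable are not from the patent.

```python
def recognise_with_template(picture, key_regions, ocr):
    """Crop each key area defined in the target template out of the picture
    under test and run a recognition callable on the crop, returning a
    {field name: recognised text} mapping. `picture` is a 2-D list of pixel
    values; region format and field names are assumptions for this sketch."""
    results = {}
    for name, (x, y, w, h) in key_regions.items():
        crop = [row[x:x + w] for row in picture[y:y + h]]
        results[name] = ocr(crop)
    return results
```

In practice `ocr` would be a real OCR engine; here it is left abstract so the cropping logic stands on its own.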
10. A template picture classification apparatus, the apparatus comprising:
a template picture determining module, configured to determine at least one template picture corresponding to a picture to be tested according to whether the picture to be tested contains a table;
a feature matrix determining module, configured to obtain a multi-dimensional feature matrix based on the picture to be tested and the at least one template picture corresponding to the picture to be tested;
a similarity determining module, configured to input the feature matrix into a preset classification model and output the similarity between the picture to be tested and each template picture corresponding to the picture to be tested;
and a classification module, configured to take the category corresponding to a target template picture as the category to which the picture to be tested belongs, wherein the target template picture is the template picture with the highest similarity to the picture to be tested.
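The classification module's selection step reduces to an argmax over the model's per-template similarity scores. A minimal sketch (function and parameter names are assumptions, not from the patent):

```python
def classify_by_template(similarities, template_names):
    """Assign the picture under test to the category of the target template,
    i.e. the template picture with the highest model similarity score.
    `similarities[i]` is the score for `template_names[i]`."""
    best = max(range(len(similarities)), key=lambda i: similarities[i])
    return template_names[best], similarities[best]
```

The returned name is what claim 9 would display to the user alongside the identification result.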
11. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the template picture classification method of any one of claims 1-9.
12. A computer-readable storage medium storing computer instructions which, when executed, cause a processor to implement the template picture classification method according to any one of claims 1-9.
13. A computer program product, comprising a computer program which, when executed by a processor, implements the template picture classification method according to any one of claims 1-9.
CN202210516779.7A 2022-05-12 2022-05-12 Template picture classification method, device, equipment, storage medium and product Active CN114911963B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210516779.7A CN114911963B (en) 2022-05-12 2022-05-12 Template picture classification method, device, equipment, storage medium and product

Publications (2)

Publication Number Publication Date
CN114911963A true CN114911963A (en) 2022-08-16
CN114911963B CN114911963B (en) 2023-09-01

Family

ID=82767587

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210516779.7A Active CN114911963B (en) 2022-05-12 2022-05-12 Template picture classification method, device, equipment, storage medium and product

Country Status (1)

Country Link
CN (1) CN114911963B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111444922A (en) * 2020-03-27 2020-07-24 Oppo广东移动通信有限公司 Picture processing method and device, storage medium and electronic equipment
CN111476227A (en) * 2020-03-17 2020-07-31 平安科技(深圳)有限公司 Target field recognition method and device based on OCR (optical character recognition) and storage medium
CN113361636A (en) * 2021-06-30 2021-09-07 山东建筑大学 Image classification method, system, medium and electronic device
CN113569070A (en) * 2021-07-24 2021-10-29 平安科技(深圳)有限公司 Image detection method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114911963B (en) 2023-09-01

Similar Documents

Publication Publication Date Title
US11861919B2 (en) Text recognition method and device, and electronic device
CN112541332B (en) Form information extraction method and device, electronic equipment and storage medium
CN113780098B (en) Character recognition method, character recognition device, electronic equipment and storage medium
CN114494784A (en) Deep learning model training method, image processing method and object recognition method
CN113657395B (en) Text recognition method, training method and device for visual feature extraction model
CN114490998A (en) Text information extraction method and device, electronic equipment and storage medium
CN114092948B (en) Bill identification method, device, equipment and storage medium
CN112699237B (en) Label determination method, device and storage medium
CN115248890A (en) User interest portrait generation method and device, electronic equipment and storage medium
CN115510212A (en) Text event extraction method, device, equipment and storage medium
CN116010916A (en) User identity information identification method and device, electronic equipment and storage medium
CN115934928A (en) Information extraction method, device, equipment and storage medium
CN115600592A (en) Method, device, equipment and medium for extracting key information of text content
CN115909376A (en) Text recognition method, text recognition model training device and storage medium
CN114911963B (en) Template picture classification method, device, equipment, storage medium and product
CN114417862A (en) Text matching method, and training method and device of text matching model
CN115116080A (en) Table analysis method and device, electronic equipment and storage medium
CN114444514A (en) Semantic matching model training method, semantic matching method and related device
CN114049686A (en) Signature recognition model training method and device and electronic equipment
CN113535916A (en) Question and answer method and device based on table and computer equipment
CN112580620A (en) Sign picture processing method, device, equipment and medium
CN113536751B (en) Processing method and device of form data, electronic equipment and storage medium
CN113822057B (en) Location information determination method, location information determination device, electronic device, and storage medium
CN116644724B (en) Method, device, equipment and storage medium for generating bid
CN117807972A (en) Method, device, equipment and medium for extracting form information in long document

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant