CN114911963B - Template picture classification method, device, equipment, storage medium and product - Google Patents


Info

Publication number
CN114911963B
CN114911963B (application CN202210516779.7A)
Authority
CN
China
Prior art keywords
picture
text
tested
template picture
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210516779.7A
Other languages
Chinese (zh)
Other versions
CN114911963A (en)
Inventor
闻婷
郭玉杰
徐永达
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Transwarp Technology Shanghai Co Ltd
Original Assignee
Transwarp Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Transwarp Technology Shanghai Co Ltd filed Critical Transwarp Technology Shanghai Co Ltd
Priority to CN202210516779.7A
Publication of CN114911963A
Application granted
Publication of CN114911963B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; database structures therefor; file system structures therefor
    • G06F16/50: Information retrieval of still image data
    • G06F16/55: Clustering; classification
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements using pattern recognition or machine learning
    • G06V10/74: Image or video pattern matching; proximity measures in feature spaces
    • G06V10/761: Proximity, similarity or dissimilarity measures
    • G06V10/764: Recognition using classification, e.g. of video objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a template picture classification method, apparatus, device, storage medium and product. The method comprises the following steps: determining at least one template picture corresponding to a picture to be tested according to whether the picture to be tested contains a table; obtaining a multi-dimensional feature matrix based on the picture to be tested and the at least one corresponding template picture; inputting the feature matrix into a preset classification model, and outputting the similarity between the picture to be tested and each corresponding template picture; and taking the category of a target template picture as the category to which the picture to be tested belongs, wherein the target template picture is the template picture with the highest similarity to the picture to be tested. With this method, the template picture classification process can be simplified and the classification accuracy improved.

Description

Template picture classification method, device, equipment, storage medium and product
Technical Field
The embodiments of the invention relate to the technical field of image recognition, and in particular to a template picture classification method, apparatus, device, storage medium and product.
Background
Currently, general-purpose recognition in the field of optical character recognition (OCR) is mature, and common picture types can be handled through customized detection, recognition, and targeted post-processing. In more personalized scenarios, the user may need to identify the position of specified keywords in a particular type of picture. To meet this growing demand, a method commonly used in the industry is to extract key information from a class of pictures in batches through custom templates.
For a user, before each recognition the picture must be compared with the template pictures to find the template ID of the category to which the picture belongs; for a service, a picture cannot be recognized if the user does not specify a template ID.
The existing method usually requires configuring a number of training pictures for each type of template picture and then performing online training to learn the differences between picture types; whenever a picture type is added, the model must be retrained, making the process complex and inflexible.
Disclosure of Invention
The invention provides a template picture classification method, apparatus, device, storage medium and product, which overcome two defects of the prior art: a number of extra training pictures must be provided to train a classification model, and the model must be retrained whenever a new category is added.
According to an aspect of the present invention, there is provided a template picture classification method, including:
determining at least one template picture corresponding to a picture to be tested according to whether the picture to be tested contains a table;
obtaining a multi-dimensional feature matrix based on the picture to be tested and the at least one corresponding template picture;
inputting the feature matrix into a preset classification model, and outputting the similarity between the picture to be tested and each corresponding template picture;
and taking the category of a target template picture as the category to which the picture to be tested belongs, wherein the target template picture is the template picture with the highest similarity to the picture to be tested.
According to another aspect of the present invention, there is provided a template picture classifying apparatus including:
a template picture determining module, configured to determine at least one template picture corresponding to a picture to be tested according to whether the picture to be tested contains a table;
a feature matrix determining module, configured to obtain a multi-dimensional feature matrix based on the picture to be tested and the at least one corresponding template picture;
a similarity determining module, configured to input the feature matrix into a preset classification model and output the similarity between the picture to be tested and each corresponding template picture;
and a classification module, configured to take the category of a target template picture as the category to which the picture to be tested belongs, wherein the target template picture is the template picture with the highest similarity to the picture to be tested.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, to enable the at least one processor to perform the template picture classification method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions which, when executed, cause a processor to implement the template picture classification method according to any one of the embodiments of the present invention.
According to the technical scheme of the embodiments of the invention, the target template picture is determined through effective feature extraction and feature matching, so that the category of the target template picture can be directly taken as the category of the picture to be tested. This avoids the difficulty of collecting training pictures and the repeated model training of existing template picture classification methods, simplifying the classification process and improving classification accuracy.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a template picture classification method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a template picture classification according to a second embodiment of the present invention;
fig. 3 is a schematic structural diagram of a template image classifying device according to a third embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to a template picture classification method in an embodiment of the invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention. It should be understood that the various steps recited in the method embodiments of the present invention may be performed in a different order and/or in parallel. Furthermore, method embodiments may include additional steps and/or omit illustrated steps. The scope of the invention is not limited in this respect.
The term "including" and variations thereof as used herein are open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that references to "a" and "an" in this disclosure are illustrative rather than limiting, and those skilled in the art will appreciate that they should be construed as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the devices in the embodiments of the present invention are for illustrative purposes only and are not intended to limit the scope of such messages or information.
Example 1
The template picture classification method commonly used at present is to upload 30 training pictures for each type of template picture after completing its configuration, and to classify among multiple templates by building a template feature classifier. In essence, the method trains a multi-class classifier on the user's data to distinguish template categories: the N template categories must be determined in advance, the trained classification model can only distinguish those N template pictures, and once an (N+1)-th picture category is added, the model must be retrained.
The common template picture classification method has the following problems:
1. it cannot be configured and used immediately: the template feature classifier must be retrained every time a new template picture is added;
2. the user's pictures are limited in number: a required number of training pictures is specified for each type of template, and if the user cannot currently supply enough pictures, multi-template classification cannot be performed;
3. training the classifier imposes redundant workload and extra time overhead on the user.
Based on the above problems, the embodiment of the invention provides a template picture classification method, which can effectively solve the above problems.
Fig. 1 is a flowchart of a template picture classification method according to a first embodiment of the present invention. The method is applicable to identifying information in pictures and may be performed by a template picture classification apparatus, which may be implemented in software and/or hardware and is generally integrated on an electronic device; in this embodiment, the electronic device includes, but is not limited to, a computer device.
As shown in fig. 1, a template picture classification method provided in a first embodiment of the present invention includes the following steps:
s110, determining at least one template picture corresponding to the to-be-tested attempted sheet according to whether the to-be-tested picture contains a table or not.
The picture to be tested is the picture to be recognized; the information to be recognized in it can be identified using a template picture of the same picture type. For example, if the picture to be tested is a train-ticket picture, a template picture of the train-ticket type may be used to locate the user-specified keyword regions in the train-ticket picture and recognize the keywords within those regions.
In this embodiment, the candidate template pictures corresponding to the picture to be tested are preliminarily determined according to whether the picture to be tested contains a table.
Specifically, the determining, according to whether the picture to be tested contains a table, at least one template picture corresponding to the picture to be tested includes: if the picture to be tested contains a table, obtaining all template pictures that contain tables from the plurality of template pictures configured by the user, as a plurality of first template pictures corresponding to the picture to be tested; and if the picture to be tested does not contain a table, obtaining all template pictures that do not contain tables from the plurality of template pictures configured by the user, as a plurality of second template pictures corresponding to the picture to be tested.
Wherein, the first template picture can be understood as a template picture containing a table; the second template picture may be understood as a template picture that does not contain a table. The first template picture and the second template picture may be determined by recognition.
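As a minimal sketch of step S110 (the `Template` type and its `has_table` flag are assumptions for illustration, not names from the patent), the pre-filtering can be expressed as:

```python
# Hypothetical sketch of step S110: pre-filter the user's configured templates
# so only those matching the table status of the picture to be tested remain.
from dataclasses import dataclass

@dataclass
class Template:
    name: str
    has_table: bool  # whether this template picture contains a table

def candidate_templates(picture_has_table, templates):
    """Keep only templates whose table status matches the picture to be tested."""
    return [t for t in templates if t.has_table == picture_has_table]

templates = [Template("train_ticket", True), Template("invoice", True),
             Template("receipt", False)]
# A test picture containing a table is compared only against table templates.
print([t.name for t in candidate_templates(True, templates)])  # -> ['train_ticket', 'invoice']
```

This cheap split shrinks the candidate set before the more expensive feature matching of step S120.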
In this embodiment, the template pictures are preconfigured by the user; the configuration includes marking the anchor keywords and box-selecting the key regions to be identified in each template picture. After the user configures a template picture, the configuration result is stored in a database, and when the user inputs a picture to be tested, the template pictures can be obtained directly from the database. The user-preconfigured template pictures may include different types of template pictures.
Anchor keywords are keywords whose position and content are relatively fixed in the template picture. For example, in a train-ticket picture, anchor keywords may include the origin and destination, departure time, fare, seat number, etc. A key region to be identified is a region from which key information can be recognized.
S120, obtaining a multi-dimensional feature matrix based on the picture to be tested and its at least one corresponding template picture.
In this embodiment, feature extraction is performed on the picture to be tested and on each of its corresponding template pictures; the features of the picture to be tested are matched one by one against the corresponding features of each template picture, and the resulting matching information forms a multi-dimensional feature matrix. The feature matrix may comprise multiple groups of feature vectors.
It can be understood that the features to be extracted differ depending on whether the picture to be tested contains a table: for a picture containing a table, both text features and table features must be extracted, whereas for a picture without a table, only text features are needed. The text features extracted from pictures with tables also differ from those extracted from pictures without tables.
In one embodiment, if the picture to be tested contains a table, the obtaining a multi-dimensional feature matrix based on the picture to be tested and its at least one corresponding template picture includes: extracting features from the table information and the text information in the picture to be tested, to obtain first table features and first text features; for each first template picture, extracting features from its table information and text information, to obtain second table features and second text features; matching the first text features against the second text features of each first template picture, to obtain text matching information between the picture to be tested and each first template picture; matching the first table features against the second table features of each first template picture, to obtain table matching information between the picture to be tested and each first template picture; and combining the text matching information and the table matching information to obtain the multi-dimensional feature matrix.
The first text features are the text features extracted after text detection and text recognition on a picture to be tested that contains a table; they may include text position features and variable-length coding features, i.e., IDS coding features. It should be noted that the variable-length coding feature can be used to compare the glyph similarity of Chinese characters: each character is encoded as a string following the order of its radicals and strokes, and the difference between characters is measured by tree edit distance; the larger the tree edit distance, the greater the difference between the characters.
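The patent measures glyph difference with a tree edit distance over radical-stroke decompositions; the sketch below substitutes a flat Levenshtein string edit distance over hypothetical decomposition strings, which is a simplification for illustration, not the patented procedure:

```python
# Simplified stand-in for the glyph-difference measure: a plain edit distance
# over decomposition strings instead of the patent's tree edit distance.
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[len(b)]

def glyph_similarity(code_a, code_b):
    """Larger edit distance -> lower similarity, normalized to [0, 1]."""
    d = edit_distance(code_a, code_b)
    return 1.0 - d / max(len(code_a), len(code_b), 1)

# Hypothetical decompositions: visually close characters share most components.
print(glyph_similarity("⿰氵青", "⿰氵晴"))
```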
The first table features are the table features extracted after table recognition on a picture to be tested that contains a table; they may include global table features, local cell features and cell text features.
The second text features are the text features extracted after text detection and text recognition on the first template picture; they may include the text position features and variable-length coding features of the anchor keywords.
The second table features are the table features extracted after table recognition on the first template picture; they may likewise include global table features, local cell features and cell text features. Specifically, the global table features may include row and column features, table position features and a normalized size; the local cell features may include the normalized coordinates of each cell in the table and per-cell cross-row/cross-column information; and the cell text features may include the variable-length coding features within each cell.
In this embodiment, the text matching information may be a text similarity index between the picture to be tested and a first template picture, determined as follows: the first text features and the second text features are compared for similarity to obtain a text similarity index, which serves as the text matching information.
Specifically, the text matching information includes the number of matched anchor keywords, the anchor keyword matching similarity and the anchor keyword position similarity. Correspondingly, the matching of the first text features against the second text features of each first template picture, to obtain the text matching information between the picture to be tested and each first template picture, includes: comparing, for text similarity, the variable-length coding features of the anchor keywords included in the second text features with the variable-length coding features included in the first text features, to determine the number of matched anchor keywords and the anchor keyword matching similarity; and determining the anchor keyword position similarity according to the text position features included in the first text features and those included in the second text features.
The more the text position features of the first text features coincide with those of the second text features, the higher the anchor keyword position similarity.
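A hedged sketch of the anchor-keyword matching just described: the `difflib` ratio and the 0.8 threshold are illustrative assumptions, and position similarity is approximated here by box intersection-over-union:

```python
# Sketch of anchor-keyword matching: for each template anchor, find the best
# text line in the test picture by string similarity, then compare positions.
from difflib import SequenceMatcher

def box_iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def match_anchors(anchors, lines, text_thresh=0.8):
    """anchors, lines: lists of (text, box). Returns the three text-matching
    features: matched count, mean text similarity, mean position similarity."""
    matched, text_sims, pos_sims = 0, [], []
    for a_text, a_box in anchors:
        best_text, best_box = max(
            lines, key=lambda l: SequenceMatcher(None, a_text, l[0]).ratio())
        sim = SequenceMatcher(None, a_text, best_text).ratio()
        if sim >= text_thresh:  # illustrative threshold, not from the patent
            matched += 1
            text_sims.append(sim)
            pos_sims.append(box_iou(a_box, best_box))
    avg = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return matched, avg(text_sims), avg(pos_sims)
```

The three returned values correspond to the matched-anchor count, matching similarity, and position similarity that enter the feature matrix.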
In this embodiment, the table matching information may include a table similarity index between the picture to be tested and a first template picture, determined as follows: the first table features and the second table features are compared for similarity to obtain a table similarity index, which serves as the table matching information.
Specifically, the matching of the first table features against the second table features of each first template picture, to obtain the table matching information between the picture to be tested and each first template picture, includes: determining the row, column and cell-count comparison results and the table normalized-position comparison result according to the global table features included in the first and second table features, and taking these comparison results as the global table matching information within the table matching information; matching the cells in the tables one by one according to the normalized size and the local cell features included in the first and second table features, determining the numbers of matched and unmatched cells, the size similarity of matched cells, the overlap area of matched cells, and the row and column offsets of cross-row and cross-column cells, and taking these matching results as the cell matching information within the table matching information; and determining the matched-cell text similarity and matched-cell text position similarity according to the cell text features included in the first and second table features, and taking these similarity results as the cell text matching information within the table matching information.
In other words, the table matching information may be determined as follows: the global table features of the picture to be tested are compared with those of the first template picture to obtain the row, column and cell-count comparison results and the table normalized-position comparison result; cell shapes are obtained from the normalized position features of the cells, and the size similarity and overlap between matched cells yield the number of matched cells, the matched-cell size similarity and the matched-cell overlap area; the offsets of cross-row and cross-column cells are computed from those cells; and the cell text similarity is obtained by comparing the variable-length coding features within the cells, while the cell text position similarity is obtained by comparing the text position features within the cells.
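The cell-by-cell matching can be sketched as a greedy pairing of normalized cell boxes by overlap area; the greedy strategy and the returned summary are assumptions for illustration, since the patent does not fix the pairing algorithm:

```python
# Greedy cell pairing under normalized coordinates: each template cell takes
# the not-yet-used test cell with the largest overlap area.
def cell_overlap(a, b):
    """Overlap area of two normalized (x1, y1, x2, y2) cell boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    return ix * iy

def match_cells(test_cells, tmpl_cells):
    """Returns (matched pairs with their overlap, total unmatched cell count)."""
    pairs, used = [], set()
    for t in tmpl_cells:
        best_i, best_o = None, 0.0
        for i, c in enumerate(test_cells):
            o = cell_overlap(t, c)
            if i not in used and o > best_o:
                best_i, best_o = i, o
        if best_i is not None:
            used.add(best_i)
            pairs.append((t, test_cells[best_i], best_o))
    unmatched = (len(tmpl_cells) - len(pairs)) + (len(test_cells) - len(pairs))
    return pairs, unmatched
```

The pair count, unmatched count, and per-pair overlaps feed the cell matching information described above.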
In one embodiment, if the picture to be tested does not contain a table, the obtaining a multi-dimensional feature matrix based on the picture to be tested and its at least one corresponding template picture includes: extracting features from the text information in the picture to be tested, to obtain third text features; for each second template picture, extracting features from its text information, to obtain fourth text features; matching the third text features against the fourth text features of each second template picture, to obtain matching information between the picture to be tested and each second template picture; and combining the matching information between the picture to be tested and each second template picture to obtain the multi-dimensional feature matrix.
The third text features are the text features extracted after text detection and text recognition on a picture to be tested that does not contain a table; they may include text position features, variable-length coding features and word vector features. The word vector features may be obtained by encoding the text in the picture with word vectors, as in natural language processing.
The fourth text features are the text features extracted after text detection and text recognition on the second template picture; they likewise may include text position features, variable-length coding features and word vector features.
In one embodiment, the matching information includes the number of matched anchor keywords, the anchor keyword matching similarity, the anchor keyword position similarity and the picture text similarity. Correspondingly, the matching of the third text features against the fourth text features of each second template picture, to obtain the matching information between the picture to be tested and each second template picture, includes: determining the number of matched anchor keywords and the anchor keyword matching similarity by comparing the variable-length coding features of the anchor keywords included in the fourth text features with the variable-length coding features included in the third text features; determining the anchor keyword position similarity according to the text position features included in the third and fourth text features; and determining the picture text similarity according to the word vector features included in the third and fourth text features.
That is, the variable-length coding features of the picture to be tested and of the second template picture are compared to obtain the number of matched anchor keywords and the anchor keyword matching similarity; the text position features of the two are compared to obtain the anchor keyword position similarity; and the word vector features of the two are compared to obtain the picture text similarity.
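The picture text similarity from word vector features can be sketched as the cosine similarity of averaged word vectors; the toy 3-dimensional vectors below are assumptions, as real word vectors would come from an NLP embedding model:

```python
# Picture-level text similarity as cosine similarity of mean word vectors.
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mean_vector(vectors):
    """Average a list of word vectors into one picture-level vector."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

# Toy 3-d word vectors for the test picture and the template (assumptions).
test_vecs = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.0]]
tmpl_vecs = [[1.0, 0.1, 0.1]]
print(round(cosine(mean_vector(test_vecs), mean_vector(tmpl_vecs)), 3))
```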
S130, inputting the feature matrix into the preset classification model, and outputting the similarity between the picture to be tested and each of its corresponding template pictures.
The preset classification model analyzes the similarity between the picture to be tested and each template picture according to the input feature matrix, and outputs the corresponding similarity for each template picture.
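The patent does not fix the form of the preset classification model; one plausible minimal form, shown purely as an assumption, is a logistic scorer that maps each (picture, template) matching-feature row to a similarity in (0, 1):

```python
# Hypothetical minimal "classification model": a logistic scorer over one
# row of matching features. The weights are illustrative, not trained values.
import math

def similarity_score(features, weights, bias=0.0):
    """Weighted sum of matching features squashed to (0, 1) by a sigmoid."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative weights for (anchor count, text sim, position sim, table sim).
weights = [1.5, 2.0, 1.0, 0.5]
row = [0.8, 0.9, 0.7, 0.6]  # one template's matching features (made up)
print(round(similarity_score(row, weights), 3))
```

A real system could replace this scorer with any model that consumes the feature matrix row-wise; the key point is that no per-category retraining is needed when a template is added.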
S140, taking the template picture with the highest similarity as the target template picture of the category to which the picture to be tested belongs, so as to complete the template picture classification.
In this embodiment, after the similarity between the picture to be tested and each template picture is obtained, the template picture with the highest similarity may be used as the target template picture; the picture category to which the target template picture belongs is the same as the category of the picture to be tested. For example, if the picture to be tested is a ticket, the target template picture is also of the ticket category.
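A minimal sketch of steps S130–S140, with the preset classification model stubbed out as any callable that maps one feature vector to a similarity score; the toy model and all names here are assumptions for illustration only:

```python
def classify_by_template(similarity_fn, feature_vectors, template_names):
    """S130/S140: score the picture to be tested against each candidate
    template and return the name of the best-matching (target) template.

    similarity_fn maps one multi-dimensional feature vector (picture vs.
    one template) to a similarity score in [0, 1].
    """
    similarities = [similarity_fn(v) for v in feature_vectors]
    best_index = max(range(len(similarities)), key=similarities.__getitem__)
    return template_names[best_index], similarities[best_index]

# Toy similarity: fraction of matched features, standing in for the
# trained classifier of the patent.
toy_model = lambda v: sum(v) / len(v)
name, score = classify_by_template(
    toy_model, [[1, 0, 1], [1, 1, 1]], ["passport", "train_ticket"]
)
# name == "train_ticket", score == 1.0
```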
According to the template picture classification method provided by the first embodiment of the invention, at least one template picture corresponding to the picture to be tested is first determined according to whether the picture to be tested contains a table; a multi-dimensional feature matrix is then obtained based on the picture to be tested and its corresponding template pictures; the feature matrix is input into a preset classification model, which outputs the similarity between the picture to be tested and each corresponding template picture; and finally the category corresponding to the target template picture is taken as the category to which the picture to be tested belongs, where the target template picture is the template picture with the highest similarity to the picture to be tested. With this method, no additional template picture needs to be specified: by extracting features from the picture to be tested and the template pictures, the template picture most similar to the picture to be tested is obtained through feature comparison and matching, and the category of that template picture is then determined.
Further, the embodiment of the invention further includes: performing information recognition on the picture to be tested through the target template picture, and displaying the name of the target template picture and the recognition result to the user; the recognition result is obtained by recognizing the picture to be tested according to the key to-be-recognized area in the target template picture.
The name of the target template picture is displayed to the user so that the user can determine, from the displayed template name, whether the target template picture and the picture to be tested belong to the same category, and can thereby further judge the accuracy of the recognition result. For example, if the displayed name of the target template picture is a passport template while the picture to be tested input by the user is a train ticket, the user can determine that the target template picture was selected incorrectly, and therefore that the displayed recognition result is unreliable.
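The recognition step above can be sketched as cropping each key to-be-recognized area defined in the template out of the picture to be tested and running OCR on the crop; the data layout, field names, and the `ocr` callable are assumptions for illustration:

```python
def recognize_with_template(picture, template, ocr):
    """Crop each key to-be-recognized region defined in the template from
    the picture to be tested and run OCR on the crop.

    template["regions"] maps field name -> (x1, y1, x2, y2) in pixels;
    picture is a nested list of pixel rows; ocr(crop) -> str.
    """
    results = {}
    for field, (x1, y1, x2, y2) in template["regions"].items():
        crop = [row[x1:x2] for row in picture[y1:y2]]  # rectangular crop
        results[field] = ocr(crop)
    return results
```

In practice `picture` would be an image array and `ocr` a real text recognizer; only the region-driven control flow is the point here.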
Example Two
This embodiment of the present invention provides a specific implementation based on the technical solutions of the above embodiments, and can serve as an exemplary embodiment of the present invention.
Fig. 2 is a flowchart illustrating a template picture classification process according to a second embodiment of the present invention. As shown in Fig. 2, the picture to be tested input by the user is obtained, and text recognition, text detection and table detection are performed on it in sequence to determine whether it contains a table. If the picture to be tested contains a table, feature generation is performed on the picture to be tested and all template pictures containing table information to obtain a set of multi-dimensional feature matrices, and the template pictures without table information do not participate in feature generation. If the picture to be tested does not contain a table, feature generation is performed on the picture to be tested and all template pictures without tables, and the template pictures containing table information do not participate in feature generation. The generated feature matrices are input into a binary classifier to obtain the similarity between the picture to be tested and each corresponding template picture; the optimal template picture is then selected according to the similarity and taken as the template picture of the same category as the picture to be tested.
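The flow of Fig. 2 can be sketched end to end with the table detector, feature generator and binary classifier stubbed out as callables; every name below is an assumption for illustration, not the patent's implementation:

```python
def classify_template_picture(picture, templates, contains_table, make_features, classifier):
    """Flow of Fig. 2: filter candidate templates by table presence,
    generate one feature vector per (picture, template) pair, score the
    pairs with a binary classifier, and pick the best template.

    contains_table(image) -> bool      (table detection)
    make_features(img, tpl) -> list    (feature generation)
    classifier(features) -> float      (similarity in [0, 1])
    """
    has_table = contains_table(picture)
    # Only templates with the same table/no-table property participate.
    candidates = [t for t in templates if contains_table(t["image"]) == has_table]
    scored = [(classifier(make_features(picture, t["image"])), t) for t in candidates]
    best_score, best_template = max(scored, key=lambda pair: pair[0])
    return best_template["name"], best_score

# Toy stubs: "images" are dicts; the classifier reads a single feature.
contains_table = lambda img: img.get("table", False)
make_features = lambda a, b: [1.0 if a["text"] == b["text"] else 0.0]
clf = lambda f: f[0]
templates = [
    {"name": "invoice", "image": {"table": True, "text": "inv"}},
    {"name": "bill", "image": {"table": True, "text": "bill"}},
    {"name": "receipt", "image": {"table": False, "text": "rcp"}},
]
name, score = classify_template_picture(
    {"table": True, "text": "inv"}, templates, contains_table, make_features, clf
)
# name == "invoice"; "receipt" never participates because it has no table.
```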
According to the template picture classification method provided by the second embodiment of the invention, when a batch of pictures is recognized, no specific template picture needs to be designated: the matching template picture is selected from the template library automatically. Feature matching makes it possible to determine the similarity between each template picture and the picture to be tested more accurately, and the category of the best-matching template picture can then be taken as the classification result for the picture to be tested.
Example Three
Fig. 3 is a schematic structural diagram of a template picture classification device according to a third embodiment of the present invention. The device is applicable to the case of recognizing information in a picture, may be implemented in software and/or hardware, and is generally integrated on an electronic device.
As shown in fig. 3, the apparatus includes: template picture determination module 110, feature matrix determination module 120, similarity determination module 130, and classification module 140.
The template picture determining module 110 is configured to determine at least one template picture corresponding to the picture to be tested according to whether the picture to be tested contains a table;
the feature matrix determining module 120 is configured to obtain a multi-dimensional feature matrix based on the picture to be tested and the at least one template picture corresponding to it;
the similarity determining module 130 is configured to input the feature matrix into a preset classification model and output the similarity between the picture to be tested and each corresponding template picture;
the classification module 140 is configured to take the category corresponding to a target template picture as the category to which the picture to be tested belongs, where the target template picture is the template picture with the highest similarity to the picture to be tested.
In this embodiment, the device first determines, through the template picture determining module 110, at least one template picture corresponding to the picture to be tested according to whether the picture to be tested contains a table; then obtains, through the feature matrix determining module 120, a multi-dimensional feature matrix based on the picture to be tested and its corresponding template pictures; then inputs, through the similarity determining module 130, the feature matrix into the preset classification model and outputs the similarity between the picture to be tested and each corresponding template picture; and finally takes, through the classification module 140, the category corresponding to the target template picture as the category to which the picture to be tested belongs, where the target template picture is the template picture with the highest similarity to the picture to be tested.
This embodiment provides a template picture classification device, which can simplify the template picture classification process and improve classification accuracy.
Further, the template picture is pre-configured by the user, and the configuration includes extracting the anchor keywords in the template picture and framing the key area to be recognized.
Further, the template picture determining module 110 is configured to: if the picture to be tested contains a table, acquire all template pictures containing tables from the plurality of template pictures preset by the user as the plurality of first template pictures corresponding to the picture to be tested; and if the picture to be tested does not contain a table, acquire all template pictures not containing tables from the plurality of template pictures preset by the user as the plurality of second template pictures corresponding to the picture to be tested.
On the basis of the above technical solution, if the picture to be tested contains a table, the feature matrix determining module 120 is specifically configured to: extract features from the table information and the text information in the picture to be tested to obtain a first table feature and a first text feature; for each first template picture, extract features from the table information and the text information in that template picture to obtain a second table feature and a second text feature; match the first text feature with the second text feature corresponding to each first template picture to obtain the text matching information of the picture to be tested and each first template picture; match the first table feature with the second table feature corresponding to each first template picture to obtain the table matching information of the picture to be tested and each first template picture; and combine the text matching information and the table matching information to obtain the multi-dimensional feature matrix.
Further, the text matching information includes the anchor keyword matching number, the anchor keyword matching similarity and the anchor keyword position similarity. Correspondingly, matching the first text feature with the second text feature corresponding to each first template picture to obtain the text matching information of the picture to be tested and each first template picture includes: determining the anchor keyword matching number and the anchor keyword matching similarity by text comparison between the variable-length coding features of the anchor keywords included in the second text feature and the variable-length coding features included in the first text feature; and determining the anchor keyword position similarity according to the text position features included in the first text feature and the text position features included in the second text feature.
Further, matching the first table feature with the second table feature corresponding to each first template picture to obtain the table matching information of the picture to be tested and each first template picture includes: determining the comparison results for the numbers of rows, columns and cells in the table and the comparison result for the normalized table position according to the global table features included in the first table feature and in the second table feature, and taking the determined comparison results as the global table matching information in the table matching information; matching the cells in the table one by one according to the normalized size and the cell local features included in the first table feature and in the second table feature, determining the numbers of matched and unmatched cells, the size similarity of the matched cells, the overlapping area of the matched cells, and the row and column offsets of cells spanning rows or columns, and taking the determined matching results as the cell matching information in the table matching information; and determining the matched-cell text similarity and the matched-cell text position similarity according to the cell text features included in the first table feature and in the second table feature, and taking the determined similarity results as the cell text matching information in the table matching information.
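As an illustrative sketch of the one-by-one cell matching described above, normalized cell rectangles could be paired by overlap area; the greedy strategy and the 0.5 overlap threshold are assumptions for illustration, not the patent's prescribed algorithm:

```python
def overlap_area(a, b):
    """Overlap area of two normalized cells given as (x1, y1, x2, y2)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)

def match_cells(cells_a, cells_b, threshold=0.5):
    """Greedily match cells of the picture to be tested against template
    cells; returns (matched index pairs, number of unmatched cells)."""
    pairs, used = [], set()
    for i, a in enumerate(cells_a):
        best_j, best_ov = None, 0.0
        for j, b in enumerate(cells_b):
            if j in used:
                continue
            ov = overlap_area(a, b)
            if ov > best_ov:
                best_j, best_ov = j, ov
        # A pair counts as matched if the overlap covers enough of the cell.
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        if best_j is not None and area_a > 0 and best_ov / area_a >= threshold:
            pairs.append((i, best_j))
            used.add(best_j)
    unmatched = (len(cells_a) - len(pairs)) + (len(cells_b) - len(pairs))
    return pairs, unmatched
```

The remaining quantities in the claim (size similarity, row/column offsets, matched-cell text similarity) would be computed per matched pair from the same rectangles and cell text features.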
On the basis of the above technical solution, if the picture to be tested does not contain a table, the feature matrix determining module 120 is specifically configured to: extract features from the text information in the picture to be tested to obtain a third text feature; for each second template picture, extract features from the text information in that template picture to obtain a fourth text feature; match the third text feature with the fourth text feature corresponding to each second template picture to obtain the matching information of the picture to be tested and each second template picture; and combine the matching information of the picture to be tested and each second template picture to obtain the multi-dimensional feature matrix.
Further, the matching information includes the anchor keyword matching number, the anchor keyword matching similarity, the anchor keyword position similarity and the picture text similarity. Correspondingly, matching the third text feature with the fourth text feature corresponding to each second template picture to obtain the matching information of the picture to be tested and each second template picture includes: determining the anchor keyword matching number and the anchor keyword matching similarity by text comparison between the variable-length coding features of the anchor keywords included in the fourth text feature and the variable-length coding features included in the third text feature; determining the anchor keyword position similarity according to the text position features included in the third text feature and the text position features included in the fourth text feature; and determining the picture text similarity according to the word vector features included in the third text feature and the word vector features included in the fourth text feature.
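The picture text similarity based on word vector features could, for instance, be the cosine similarity of averaged word vectors; this sketch is illustrative only, and the mean-pooling choice is an assumption:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def picture_text_similarity(word_vectors_a, word_vectors_b):
    """Average each picture's word vectors, then compare the averages."""
    mean = lambda vecs: [sum(col) / len(vecs) for col in zip(*vecs)]
    return cosine_similarity(mean(word_vectors_a), mean(word_vectors_b))
```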
Further, the device further includes a recognition module, configured to perform information recognition on the picture to be tested through the target template picture and display the name of the target template picture and the recognition result to the user; the recognition result is obtained by recognizing the picture to be tested according to the key to-be-recognized area in the target template picture.
The above template picture classification device can execute the template picture classification method provided by any embodiment of the present invention, and has the corresponding functional modules and beneficial effects of the executed method.
Example Four
Fig. 4 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic equipment may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in Fig. 4, the electronic device 10 includes at least one processor 11 and a memory communicatively connected to the at least one processor 11, such as a read-only memory (ROM) 12 and a random access memory (RAM) 13. The memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the ROM 12 or the computer program loaded from the storage unit 18 into the RAM 13. The RAM 13 may also store various programs and data required for the operation of the electronic device 10. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to the bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the template picture classification method.
In some embodiments, the template picture classification method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the template picture classification method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the template picture classification method in any other suitable way (e.g. by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the defects of difficult management and weak service scalability found in traditional physical hosts and VPS services.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (13)

1. A template picture classification method, the method comprising:
determining at least one template picture corresponding to a picture to be tested according to whether the picture to be tested contains a table;
obtaining a multi-dimensional feature matrix based on the picture to be tested and the at least one template picture corresponding to the picture to be tested, comprising: extracting features from the picture to be tested and from the at least one template picture corresponding to the picture to be tested respectively, matching a plurality of features of the picture to be tested one by one with the corresponding features of each template picture, and forming the multi-dimensional feature matrix from the resulting matching information; wherein feature matching between the picture to be tested and one template picture yields one group of multi-dimensional feature vectors, and the feature matrix comprises a plurality of groups of feature vectors;
inputting the feature matrix into a preset classification model, and outputting the similarity between the picture to be tested and each template picture corresponding to the picture to be tested;
and taking the category corresponding to a target template picture as the category to which the picture to be tested belongs, wherein the target template picture is the template picture with the highest similarity to the picture to be tested.
2. The method of claim 1, wherein template pictures are pre-configured by a user, and wherein configuring comprises extracting anchor keywords in the template pictures and framing key regions to be identified.
3. The method according to claim 1, wherein the determining at least one template picture corresponding to the picture to be tested according to whether the picture to be tested contains a table comprises:
if the picture to be tested contains a table, acquiring all template pictures containing tables from a plurality of template pictures preset by a user as a plurality of first template pictures corresponding to the picture to be tested;
and if the picture to be tested does not contain a table, acquiring all template pictures not containing tables from the plurality of template pictures preset by the user as a plurality of second template pictures corresponding to the picture to be tested.
4. The method according to claim 3, wherein if the picture to be tested contains a table, the obtaining a multi-dimensional feature matrix based on the picture to be tested and the at least one template picture corresponding to the picture to be tested comprises:
extracting features from the table information and the text information in the picture to be tested respectively to obtain a first table feature and a first text feature;
for each first template picture, extracting features from the table information and the text information in the first template picture respectively to obtain a second table feature and a second text feature;
performing feature matching between the first text feature and the second text feature corresponding to each first template picture respectively to obtain text matching information of the picture to be tested and each first template picture;
performing feature matching between the first table feature and the second table feature corresponding to each first template picture respectively to obtain table matching information of the picture to be tested and each first template picture;
and combining the text matching information and the table matching information to obtain the multi-dimensional feature matrix.
5. The method of claim 4, wherein the text matching information comprises an anchor keyword matching number, an anchor keyword matching similarity and an anchor keyword position similarity, and the matching the first text feature with the second text feature corresponding to each first template picture respectively to obtain the text matching information of the picture to be tested and each first template picture comprises:
determining the anchor keyword matching number and the anchor keyword matching similarity by text comparison between the variable-length coding features of the anchor keywords included in the second text feature and the variable-length coding features included in the first text feature;
and determining the anchor keyword position similarity according to the text position features included in the first text feature and the text position features included in the second text feature.
6. The method of claim 4, wherein the matching the first table feature with the second table feature corresponding to each first template picture to obtain the table matching information of the picture to be tested and each first template picture comprises:
determining comparison results for the numbers of rows, columns and cells in the table and a comparison result for the normalized table position according to the global table features included in the first table feature and the global table features included in the second table feature, and taking the determined comparison results as global table matching information in the table matching information;
matching the cells in the table one by one according to the normalized size, the cell local features included in the first table feature and the cell local features included in the second table feature, determining the numbers of matched and unmatched cells, the size similarity of the matched cells, the overlapping area of the matched cells, and the row and column offsets of cells spanning rows or columns, and taking the determined matching results as cell matching information in the table matching information;
and determining the matched-cell text similarity and the matched-cell text position similarity according to the cell text features included in the first table feature and the cell text features included in the second table feature, and taking the determined similarity results as cell text matching information in the table matching information.
7. The method according to claim 3, wherein if the picture to be tested does not contain a table, correspondingly, the obtaining a multi-dimensional feature matrix based on the picture to be tested and the at least one template picture corresponding to the picture to be tested comprises:
performing feature extraction on the text information in the picture to be tested to obtain a third text feature;
for each second template picture, performing feature extraction on the text information in the second template picture to obtain a fourth text feature;
matching the third text feature with the fourth text feature corresponding to each second template picture respectively to obtain matching information of the picture to be tested and each second template picture;
and combining the matching information of the picture to be tested and each second template picture to obtain the multi-dimensional feature matrix.
8. The method of claim 7, wherein the matching information includes an anchor keyword matching number, an anchor keyword matching similarity, an anchor keyword position similarity, and a picture text similarity, and the matching the third text feature with the fourth text feature corresponding to each second template picture respectively to obtain the matching information of the picture to be tested and each second template picture comprises:
determining the anchor keyword matching number and the anchor keyword matching similarity according to a text comparison between the variable-length coding features of the anchor keywords included in the fourth text feature and the variable-length coding features included in the third text feature;
determining the anchor keyword position similarity according to the text position features included in the third text feature and the text position features included in the fourth text feature;
and determining the picture text similarity according to the word vector features included in the third text feature and the word vector features included in the fourth text feature.
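A minimal sketch of the anchor-keyword matching in claim 8, assuming a string-similarity measure and a match threshold that the claims do not specify (the claims compare variable-length coding features; plain string comparison stands in for that here):

```python
from difflib import SequenceMatcher


def anchor_keyword_matching(template_anchors, test_texts, min_sim=0.8):
    """Match template anchor keywords against recognized text snippets.

    Returns (match_count, mean_match_similarity). Both the 0.8 threshold and
    the SequenceMatcher ratio are illustrative assumptions.
    """
    sims = []
    for anchor in template_anchors:
        # Best similarity of this anchor keyword against any recognized text.
        best = max(
            (SequenceMatcher(None, anchor, text).ratio() for text in test_texts),
            default=0.0,
        )
        if best >= min_sim:
            sims.append(best)
    count = len(sims)
    return count, (sum(sims) / count if count else 0.0)
```

Here the matching number and matching similarity fall out of one pass; the position similarity and word-vector text similarity of claim 8 would be computed from the text position features and word vectors separately.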
9. The method of claim 1, further comprising:
performing information identification on the picture to be tested through the target template picture, and displaying the name of the target template picture and the identification result to a user;
wherein the identification result is obtained by identifying the picture to be tested according to the key region to be identified in the target template picture.
10. A template picture classification apparatus, the apparatus comprising:
a template picture determining module, configured to determine at least one template picture corresponding to the picture to be tested according to whether the picture to be tested contains a table;
the template picture determining module is specifically configured to perform feature extraction on the picture to be tested and the at least one template picture corresponding to the picture to be tested, match a plurality of features of the picture to be tested one by one with the corresponding features of each template picture, and form a multi-dimensional feature matrix from the matching information obtained by the matching; wherein feature matching between the picture to be tested and one template picture yields one group of multi-dimensional feature vectors, and the feature matrix comprises a plurality of groups of feature vectors;
a feature matrix determining module, configured to obtain the multi-dimensional feature matrix based on the picture to be tested and the at least one template picture corresponding to the picture to be tested;
a similarity determining module, configured to input the feature matrix into a preset classification model and output the similarity between the picture to be tested and each template picture corresponding to the picture to be tested;
and a classification module, configured to take the category corresponding to a target template picture as the category to which the picture to be tested belongs, wherein the target template picture is the template picture with the highest similarity to the picture to be tested.
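The module chain in claim 10 (feature matrix, then classification model, then the highest-similarity template) could be exercised as in this sketch; `model` stands in for the unspecified preset classification model and is assumed to return one similarity score per template row:

```python
import numpy as np


def classify_picture(feature_matrix, template_names, model):
    """Score the picture to be tested against each template and pick the best.

    model(feature_matrix) is assumed to return one similarity score per row
    (one row per template); the actual classification model is not specified
    by the claims. Returns the name and score of the target template picture.
    """
    scores = model(feature_matrix)
    best = int(np.argmax(scores))  # template with the highest similarity
    return template_names[best], float(scores[best])
```

For example, with a toy scorer that sums each row, the template whose row sums highest is selected as the target template picture.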
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, to enable the at least one processor to perform the template picture classification method of any one of claims 1-9.
12. A computer-readable storage medium storing computer instructions, wherein the computer instructions, when executed, cause a processor to implement the template picture classification method of any one of claims 1-9.
13. A computer program product, comprising a computer program which, when executed by a processor, implements the template picture classification method of any one of claims 1-9.
CN202210516779.7A 2022-05-12 2022-05-12 Template picture classification method, device, equipment, storage medium and product Active CN114911963B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210516779.7A CN114911963B (en) 2022-05-12 2022-05-12 Template picture classification method, device, equipment, storage medium and product

Publications (2)

Publication Number Publication Date
CN114911963A CN114911963A (en) 2022-08-16
CN114911963B true CN114911963B (en) 2023-09-01

Family

ID=82767587

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210516779.7A Active CN114911963B (en) 2022-05-12 2022-05-12 Template picture classification method, device, equipment, storage medium and product

Country Status (1)

Country Link
CN (1) CN114911963B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111444922A (en) * 2020-03-27 2020-07-24 Oppo广东移动通信有限公司 Picture processing method and device, storage medium and electronic equipment
CN111476227A (en) * 2020-03-17 2020-07-31 平安科技(深圳)有限公司 Target field recognition method and device based on OCR (optical character recognition) and storage medium
CN113361636A (en) * 2021-06-30 2021-09-07 山东建筑大学 Image classification method, system, medium and electronic device
CN113569070A (en) * 2021-07-24 2021-10-29 平安科技(深圳)有限公司 Image detection method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114911963A (en) 2022-08-16

Similar Documents

Publication Publication Date Title
CN113657395B (en) Text recognition method, training method and device for visual feature extraction model
CN112541332B (en) Form information extraction method and device, electronic equipment and storage medium
CN113656582A (en) Training method of neural network model, image retrieval method, device and medium
CN113780098A (en) Character recognition method, character recognition device, electronic equipment and storage medium
CN113408280A (en) Negative example construction method, device, equipment and storage medium
CN116340831B (en) Information classification method and device, electronic equipment and storage medium
CN116309963B (en) Batch labeling method and device for images, electronic equipment and storage medium
CN114911963B (en) Template picture classification method, device, equipment, storage medium and product
CN111597336A (en) Processing method and device of training text, electronic equipment and readable storage medium
CN115510212A (en) Text event extraction method, device, equipment and storage medium
CN115934928A (en) Information extraction method, device, equipment and storage medium
CN114417862A (en) Text matching method, and training method and device of text matching model
CN114049686A (en) Signature recognition model training method and device and electronic equipment
CN117807972A (en) Method, device, equipment and medium for extracting form information in long document
CN114299522B (en) Image recognition method device, apparatus and storage medium
CN116644724B (en) Method, device, equipment and storage medium for generating bid
CN116127948B (en) Recommendation method and device for text data to be annotated and electronic equipment
CN113536751B (en) Processing method and device of form data, electronic equipment and storage medium
CN114998906B (en) Text detection method, training method and device of model, electronic equipment and medium
CN113190698B (en) Paired picture set generation method and device, electronic equipment and storage medium
CN115761445A (en) Method, device, equipment and medium for training chromosome analysis model
CN116229490A (en) Layout analysis method, device, equipment and medium of graphic neural network
CN118013065A (en) Man-machine interaction method, device, equipment and storage medium based on image
CN116343236A (en) Efficient small language recognition model training and text recognition method, device and equipment
CN116451092A (en) Text difference rate determination method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant