CN111753817B - Information processing method and device, electronic equipment and computer readable storage medium - Google Patents


Info

Publication number
CN111753817B
Authority
CN
China
Prior art keywords
attribute
auditing
checked
text
audited
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010599478.6A
Other languages
Chinese (zh)
Other versions
CN111753817A (en)
Inventor
周晶
张宾
孙喜民
贾江凯
李慧超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Digital Technology Holdings Co ltd
State Grid E Commerce Technology Co Ltd
Original Assignee
State Grid Digital Technology Holdings Co ltd
State Grid E Commerce Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Digital Technology Holdings Co ltd, State Grid E Commerce Technology Co Ltd
Priority to CN202010599478.6A
Publication of CN111753817A
Application granted
Publication of CN111753817B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0623 Item investigation
    • G06Q30/0625 Directed, with specific intent or strategy
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Abstract

An information processing method and apparatus, an electronic device, and a computer readable storage medium are disclosed. The method comprises: obtaining a file to be audited of an object; performing text detection and text recognition on the file to be audited to obtain the text information it contains; locating attribute items to be audited in the text information and extracting the attribute values corresponding to those attribute items from the text information; and auditing the audit information according to a preset audit rule to obtain an audit result for the object. Because the attribute items to be audited are audit attribute items included in a pre-constructed audit attribute item database, the attribute values extracted from the text information can be ensured to be the content of the text information that needs to be audited. This improves the accuracy of auditing the file to be audited and supports the credibility of the audit result for the object.

Description

Information processing method and device, electronic equipment and computer readable storage medium
Technical Field
The present invention relates to the field of image information processing, and in particular, to an information processing method and apparatus, an electronic device, and a computer readable storage medium.
Background
With the development of the Internet, many manufacturers choose to sell products online through third-party shopping platforms. To keep online transactions reliable, a third-party shopping platform needs to determine whether the products offered by a vendor meet the requirements for being listed (put on the shelf).
At present, auditing works as follows. The manufacturer uploads a qualification picture of a product to the third-party shopping platform. The qualification picture is proof of qualification for the product issued by a third-party testing organization, such as a quality inspection report of the product. After the platform receives the qualification picture, a platform employee audits it to determine whether the product meets the listing requirements.
Because qualification pictures come in many types and in large quantities, manual auditing has a high error rate. How to audit qualification pictures accurately, so as to determine whether a product qualifies for listing, is therefore a problem that urgently needs to be solved.
Disclosure of Invention
In order to achieve the above object, the present application provides the following technical solutions:
a method of information processing, comprising:
acquiring a file to be checked of an object;
performing text detection and text recognition on the file to be checked to obtain text information included in the file to be checked;
positioning an attribute item to be audited from the text information, and extracting an attribute value corresponding to the attribute item to be audited from the text information; the attribute items to be checked are check attribute items included in a pre-constructed database; any one of the auditing attribute items in the database is extracted from a history to-be-audited file;
and auditing the audit information according to a preset audit rule to obtain an audit result aiming at the object, wherein the audit information at least comprises the attribute value corresponding to the attribute item to be audited.
Optionally, in the above method, the process of constructing the database includes:
acquiring a sample set of files to be checked, wherein the sample set of files to be checked comprises a plurality of history files to be checked;
acquiring a target audit attribute item, wherein the target audit attribute item is an audit attribute item specified by the audit rule;
and extracting text content with the similarity reaching a first threshold value with the target audit attribute item from each historical to-be-audited file aiming at each target audit attribute item, and taking the text content as the audit attribute item of the database.
Optionally, in the above method, locating the attribute item to be audited from the text information and extracting the attribute value corresponding to the attribute item to be audited from the text information includes:
taking the first text content which is the same as the auditing attribute item in the database in the text information as the attribute item to be audited of the text information;
and determining an indication identifier corresponding to the attribute item to be checked in the text information, taking the second text content indicated by the indication identifier as the attribute value corresponding to the attribute item to be checked, and extracting the second text content.
Optionally, in the above method, auditing the audit information according to the preset audit rule to obtain the audit result for the object includes:
for the attribute value corresponding to each attribute item to be audited, determining the audit provision in the audit rule that corresponds to that attribute item; judging whether the attribute value meets the audit provision, and if so, determining that the audit result for the attribute value of the attribute item to be audited is audit passed;
and obtaining the audit result for the object according to the audit result of the attribute value of each attribute item to be audited.
Optionally, the above method further includes obtaining feature information of the object;
the audit information further includes the feature information of the object;
and auditing the audit information according to the preset audit rule to obtain the audit result for the object includes:
for each piece of feature information, obtaining the attribute value corresponding to the feature information from the attribute values of the attribute items to be audited, and determining that the audit result for the attribute value is audit passed when the similarity between the attribute value and the feature information reaches a second threshold; the feature information corresponds to the attribute value when the audit attribute item preset for the feature information is the same as the attribute item to be audited to which the attribute value belongs;
for the attribute value of each attribute item to be audited that does not correspond to any feature information, determining the audit provision in the audit rule that corresponds to that attribute item, and determining that the audit result for the attribute value is audit passed when the attribute value meets the audit provision;
and obtaining the audit result for the object according to the audit result of the attribute value of each attribute item to be audited.
Optionally, in the above method, the file to be audited is a picture;
and performing text detection and text recognition on the file to be audited to obtain the text information included in the file to be audited includes:
preprocessing the picture;
detecting the text included in the preprocessed picture with a text detection tool;
and recognizing the detected text with a text recognition tool to obtain the text information included in the picture.
Optionally, in the above method, performing text detection and text recognition on the file to be audited to obtain the text information included in the file to be audited includes:
when the picture quality value of the picture is greater than a third threshold, performing text detection and text recognition on the file to be audited to obtain the text information included in the file to be audited.
An apparatus for information processing, comprising:
the acquisition unit is used for acquiring the file to be checked of the object;
the identification unit is used for carrying out text detection and text identification on the file to be checked to obtain text information included in the file to be checked;
the extraction unit is used for positioning the attribute items to be checked from the text information and extracting attribute values corresponding to the attribute items to be checked from the text information; the attribute items to be checked are check attribute items included in a pre-constructed database; any one of the auditing attribute items in the database is extracted from a history to-be-audited file;
the auditing unit is used for auditing the auditing information according to a preset auditing rule to obtain an auditing result aiming at the object, wherein the auditing information at least comprises the attribute value corresponding to the attribute item to be audited.
An electronic device, comprising: a processor and a memory for storing a program; the processor is configured to run the program to implement the above-described information processing method.
A computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the method of information processing described above.
In the method and apparatus of the present application, a file to be audited of an object is obtained; text detection and text recognition are performed on the file to obtain the text information it includes; the attribute items to be audited are located in the text information and their corresponding attribute values are extracted from it; and the audit information is audited according to a preset audit rule to obtain an audit result for the object. Because the attribute items to be audited are audit attribute items included in a pre-constructed audit attribute item database, the attribute values extracted from the text information can be ensured to be the content of the text information that needs to be audited. Auditing those attribute values is therefore equivalent to auditing the content in the text information that needs to be audited, which improves the accuracy of auditing the file and in turn supports the credibility of the audit result for the object.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method of information processing provided in an embodiment of the present application;
FIG. 2 is a flowchart of a method of building a database according to an embodiment of the present application;
FIG. 3 is a flow chart of another method for information processing provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an information processing apparatus according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
Fig. 1 is a flowchart of a method for processing information according to an embodiment of the present application, including the following steps:
S101, obtaining a file to be audited of the object.
The object may be a product, an enterprise or organization, or a natural person; this embodiment places no limitation on it. The file to be audited may be text or a picture, and may be a material document that evaluates the object. For example, if the object is a product, the file to be audited may be a quality report of the product.
S102, performing text detection and text recognition on the file to be checked to obtain text information included in the file to be checked.
Specifically, an optical character recognition (OCR) text detection tool may be used to detect the text included in the file to be audited, and an OCR text recognition tool may be used to recognize the detected text, yielding the text information. For further details of OCR-based text detection and text recognition, reference may be made to the prior art.
When the file to be audited is a picture, the picture is first preprocessed, for example by binarization, noise removal, and skew correction, before text detection and text recognition are performed.
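As an illustration of the binarization step only, the following minimal sketch thresholds a grayscale picture at its mean intensity. The function name `binarize`, the nested-list picture representation, and the mean-based threshold are assumptions made for this example; the source does not specify a binarization method, and noise removal and skew correction are not shown.

```python
def binarize(gray_pixels, threshold=None):
    """Map each grayscale pixel (0-255) to 0 or 255.

    `gray_pixels` is a list of rows of integers. If no threshold is
    given, the mean intensity of the picture is used (an assumption
    made for this sketch, not a method stated in the source).
    """
    flat = [p for row in gray_pixels for p in row]
    if threshold is None:
        threshold = sum(flat) / len(flat)
    return [[255 if p > threshold else 0 for p in row] for row in gray_pixels]


picture = [[10, 200], [220, 30]]
print(binarize(picture))
```

A production system would more likely rely on an image-processing library and an adaptive thresholding method; the sketch only shows where binarization sits before text detection.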
S103, positioning the attribute items to be audited from the text information, and extracting attribute values corresponding to the attribute items to be audited from the text information.
The attribute items to be audited are audit attribute items included in a pre-constructed database, and any audit attribute item in the database is extracted from the historical file to be audited. The process of constructing the database may refer to the flow shown in fig. 2.
A specific implementation of this step includes steps A1 and A2:
Step A1: take the first text content in the text information that is the same as an audit attribute item in the database as an attribute item to be audited of the text information.
For each audit attribute item in the database, judge whether the text information contains first text content that is the same as the audit attribute item; if so, take the first text content as an attribute item to be audited of the text information.
Step A2: determine the indication identifier corresponding to the attribute item to be audited in the text information, take the second text content indicated by the indication identifier as the attribute value corresponding to the attribute item to be audited, and extract the second text content.
Determining the indication identifier corresponding to the attribute item to be audited means determining the indication identifier corresponding to the first text content. The indication identifier may be a character such as a colon, a space, or a bar, or a word such as "is" that characterizes the relationship between two pieces of content. For example, if the attribute item to be audited is denoted A, the indication identifier is a colon ":", and the text information contains "A: B", then the attribute value corresponding to the attribute item A is B.
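Steps A1 and A2 can be sketched as follows. This is a minimal illustration under several assumptions that are not in the source: the database is a plain list named `AUDIT_ATTRIBUTE_ITEMS`, each attribute item appears at the start of its own line, and the indication identifier is a colon (ASCII or full-width), a bar, the word "is", or whitespace.

```python
import re

# Hypothetical database of audit attribute items (illustrative values only).
AUDIT_ATTRIBUTE_ITEMS = ["product name", "commodity name", "model"]

# Indication identifiers assumed for this sketch: colon (ASCII or
# full-width), bar, the word "is", or a single whitespace character.
INDICATOR = r"\s*(?::|:|\||\bis\b|\s)\s*"


def extract_attribute_values(text, items=AUDIT_ATTRIBUTE_ITEMS):
    """Return {attribute item: attribute value} found in the text."""
    results = {}
    for line in text.splitlines():
        # Try longer items first so "commodity name" wins over shorter ones.
        for item in sorted(items, key=len, reverse=True):
            pattern = rf"\s*{re.escape(item)}{INDICATOR}(.+)"
            match = re.match(pattern, line, flags=re.IGNORECASE)
            if match:
                # Step A1: the line starts with a known audit attribute item.
                # Step A2: the text after the indicator is its attribute value.
                results[item] = match.group(1).strip()
                break
    return results


sample = "Product name: disposable medical mask\nModel is KN-95"
print(extract_attribute_values(sample))
```

Lines that do not begin with a known attribute item are ignored, which mirrors the idea that only content listed in the database is extracted for auditing.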
S104, auditing attribute values corresponding to the attribute items to be audited according to preset auditing rules to obtain auditing results aiming at the objects.
One implementation of this step may be: for the attribute value corresponding to each attribute item to be audited, determine the audit provision in the audit rule that corresponds to that attribute item; judge whether the attribute value meets the audit provision, and if so, determine that the audit result for the attribute value of the attribute item to be audited is audit passed; then obtain the audit result for the object according to the audit result of the attribute value of each attribute item to be audited.
Different attribute items to be audited have different audit provisions. For example, for a textile fabric product, the attribute item to be audited is "formaldehyde content", and the audit provision for this attribute item in the audit rule is that the formaldehyde content of a textile fabric product must not exceed 1.0 mg/kg. If the attribute value extracted from the text information for formaldehyde content exceeds this limit, the attribute value is determined not to meet the audit provision.
The audit result for the object is obtained from the audit results of the attribute values of all attribute items to be audited. For example, if the audit result for the attribute value of any one attribute item to be audited is audit not passed, the audit result for the object is determined to be audit not passed.
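The rule-based audit described above can be sketched as below. The names `max_mg_per_kg`, `AUDIT_RULE`, and `audit_object` are hypothetical, and the sample value of 1.5 mg/kg in the usage lines is an illustrative assumption rather than a figure from the source. Only the 1.0 mg/kg limit comes from the text.

```python
import re


def max_mg_per_kg(limit):
    """Build an audit provision: the numeric value must not exceed `limit`."""
    def provision(value):
        number = re.search(r"\d+(?:\.\d+)?", value)
        return number is not None and float(number.group()) <= limit
    return provision


# Hypothetical audit rule: one audit provision per attribute item.
AUDIT_RULE = {"formaldehyde content": max_mg_per_kg(1.0)}


def audit_object(attribute_values, rule=AUDIT_RULE):
    """Audit each attribute value, then combine into a result for the object.

    The object passes only if at least one attribute value was audited
    and every audited attribute value passed (an assumption of this sketch).
    """
    per_item = {}
    for item, value in attribute_values.items():
        provision = rule.get(item)
        if provision is not None:
            per_item[item] = provision(value)
    return bool(per_item) and all(per_item.values()), per_item


print(audit_object({"formaldehyde content": "1.5 mg/kg"}))
print(audit_object({"formaldehyde content": "0.8 mg/kg"}))
```

Treating a value that cannot be parsed as a failed audit is a conservative design choice for the sketch; a real system might instead route such cases to manual review.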
In the method provided by this embodiment, a file to be audited of an object is obtained; text detection and text recognition are performed on the file to obtain the text information it includes; the attribute items to be audited are located in the text information and their corresponding attribute values are extracted from it; and the audit information is audited according to a preset audit rule to obtain an audit result for the object. Because the attribute items to be audited are audit attribute items included in a pre-constructed audit attribute item database, the attribute values extracted from the text information can be ensured to be the content of the text information that needs to be audited. Auditing those attribute values is therefore equivalent to auditing the content in the text information that needs to be audited, which improves the accuracy of auditing the file and in turn supports the credibility of the audit result for the object.
In the above embodiment, when the file to be audited is a picture, the method further includes calculating a picture quality value of the picture. If the picture quality value is greater than a third threshold, the text detection and text recognition steps are performed on the picture; if it is not greater than the third threshold, prompt information indicating that the picture quality is unqualified is issued. Performing text detection and text recognition only on pictures with a high picture quality value can improve the accuracy of text detection and text recognition.
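The quality gate can be sketched as follows. The source does not say how the picture quality value is computed, so this example substitutes a simple stand-in, the standard deviation of grayscale intensities as a rough contrast measure. The names `picture_quality` and `recognize_if_quality_ok`, the threshold value, and the `run_ocr` callback are all assumptions for illustration.

```python
from statistics import pstdev

THIRD_THRESHOLD = 20.0  # hypothetical value; the source gives no number


def picture_quality(gray_pixels):
    """Stand-in quality value: spread of grayscale intensities (contrast)."""
    flat = [p for row in gray_pixels for p in row]
    return pstdev(flat)


def recognize_if_quality_ok(gray_pixels, run_ocr, threshold=THIRD_THRESHOLD):
    """Run OCR only when the quality value exceeds the threshold.

    Returns (text, None) on success, or (None, prompt) when the picture
    quality is not greater than the threshold.
    """
    if picture_quality(gray_pixels) > threshold:
        return run_ocr(gray_pixels), None
    return None, "picture quality is unqualified"


# `run_ocr` is a placeholder for whatever OCR tool is used.
print(recognize_if_quality_ok([[128, 128], [128, 128]], lambda p: "text"))
```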
Fig. 2 is a method for constructing a database according to an embodiment of the present application, including the following steps:
S201, acquiring a sample set of files to be audited.
The sample set of files to be audited includes a plurality of historical files to be audited.
S202, acquiring a target audit attribute item.
The target audit attribute item is an audit attribute item specified by the audit rule. Different target audit attribute items may be set for different audit rules.
S203, extracting text content with similarity reaching a first threshold value with the target audit attribute items from each historical to-be-audited file aiming at each target audit attribute item, and taking the text content as the audit attribute item of the database.
For example, if the target audit attribute item is "product name", text content such as "commodity name", "product name", and "food name" in the historical files to be audited can be used as audit attribute items of the database.
According to the technical scheme, a sample set of files to be checked is obtained, target checking attribute items are obtained, text content, the similarity of which with the target checking attribute items reaches a first threshold, is extracted from each historical file to be checked, and the text content is used as the checking attribute item of the database.
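The database construction in S201 to S203 can be sketched as below. The source does not state how similarity is measured, so this example uses character-level similarity from Python's `difflib.SequenceMatcher` as a stand-in. The threshold value, the function names, and the assumption that candidate text content is the label before a colon on each line are all illustrative.

```python
from difflib import SequenceMatcher

FIRST_THRESHOLD = 0.5  # hypothetical value; the source gives no number


def candidate_labels(history_file_text):
    """Collect the text before a colon on each line as candidate content."""
    labels = []
    for line in history_file_text.splitlines():
        label, sep, _ = line.partition(":")
        if sep and label.strip():
            labels.append(label.strip().lower())
    return labels


def build_database(history_files, target_items, threshold=FIRST_THRESHOLD):
    """Map each target audit attribute item to similar text content."""
    database = {target: set() for target in target_items}
    for text in history_files:
        for label in candidate_labels(text):
            for target in target_items:
                similarity = SequenceMatcher(None, target, label).ratio()
                if similarity >= threshold:
                    database[target].add(label)
    return database


history = [
    "Commodity name: mask\nTest date: 2020-01-01",
    "Food name: biscuit",
]
print(build_database(history, ["product name"]))
```

Character-level similarity is only a rough proxy; synonyms that share few characters would need a semantic similarity measure, which is outside the scope of this sketch.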
FIG. 3 is a flowchart of another information processing method provided in an embodiment of the present application. Steps in this embodiment that are the same as those in the embodiment above are described there and are not repeated here. This embodiment includes the following steps:
S301, obtaining a file to be audited of the object and feature information of the object.
The feature information of the object may be the name of the object or an identification code of the object, such as the specification model or product model of a product.
S302, performing text detection and text recognition on the file to be checked to obtain text information included in the file to be checked.
S303, positioning the attribute items to be audited from the text information, and extracting attribute values corresponding to the attribute items to be audited from the text information.
S304, for each piece of feature information, obtaining the attribute value corresponding to the feature information from the attribute values of the attribute items to be audited, and determining that the audit result for the attribute value is audit passed when the similarity between the attribute value and the feature information reaches a second threshold.
The feature information corresponds to an attribute value when the audit attribute item preset for the feature information is the same as the attribute item to be audited to which the attribute value belongs. For example, the feature information of the object is "disposable medical mask", and the audit attribute item preset for "disposable medical mask" is "product name". If the attribute item to be audited located from the text information of the object is the same audit attribute item, the feature information of the object is determined to correspond to the attribute value of that attribute item to be audited. If the attribute value corresponding to the commodity name extracted from the text information is "disposable medical mask", the similarity between the attribute value and the feature information reaches 100%, and the audit result for the attribute value "disposable medical mask" is determined to be audit passed. In this step, the second threshold may be set as required.
S305, for the attribute value of each attribute item to be audited that does not correspond to any feature information, determining the audit provision in the audit rule that corresponds to that attribute item, and determining that the audit result for the attribute value of the attribute item to be audited is audit passed when the attribute value meets the audit provision.
An attribute value does not correspond to the feature information when the audit attribute item preset for the feature information differs from the attribute item to be audited to which the attribute value belongs.
Such attribute values are audited with the audit provision in the audit rule that corresponds to their attribute item to be audited, and the audit result for the attribute value is determined to be audit passed when the attribute value meets the audit provision.
S306, according to the auditing result of the attribute value of each attribute item to be audited, the auditing result aiming at the object is obtained.
In the method provided by this embodiment, for each piece of feature information, the attribute value corresponding to the feature information is obtained from the attribute values of the attribute items to be audited, and the audit result for that attribute value is determined to be audit passed when the similarity between the attribute value and the feature information reaches the second threshold. For the attribute value of each attribute item to be audited that does not correspond to any feature information, the corresponding audit provision in the audit rule is determined, and the audit result for the attribute value is determined to be audit passed when the attribute value meets the audit provision. Comparing the similarity between the feature information and the attribute values of the attribute items to be audited in the text information can further improve the accuracy of the audit.
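Steps S304 to S306 can be sketched as a single function. As before, `difflib.SequenceMatcher` stands in for the unspecified similarity measure, and the names `audit_with_features`, `SECOND_THRESHOLD`, and the dictionary layouts are assumptions for illustration. The preset correspondence between feature information and audit attribute items is modelled by keying `feature_info` on the attribute item.

```python
from difflib import SequenceMatcher

SECOND_THRESHOLD = 0.9  # hypothetical value; the source leaves it configurable


def audit_with_features(attribute_values, feature_info, provisions,
                        threshold=SECOND_THRESHOLD):
    """Audit attribute values against feature information or provisions.

    attribute_values: {attribute item: extracted attribute value}
    feature_info:     {attribute item: feature information of the object}
    provisions:       {attribute item: function(value) -> bool}
    """
    results = {}
    for item, value in attribute_values.items():
        if item in feature_info:
            # S304: compare with the corresponding feature information.
            similarity = SequenceMatcher(None, value, feature_info[item]).ratio()
            results[item] = similarity >= threshold
        elif item in provisions:
            # S305: fall back to the audit provision for this attribute item.
            results[item] = provisions[item](value)
    # S306: the object passes only if every audited attribute value passed.
    return bool(results) and all(results.values()), results


values = {
    "product name": "disposable medical mask",
    "quality detection result": "qualified",
}
features = {"product name": "disposable medical mask"}
rules = {"quality detection result": lambda v: v == "qualified"}
print(audit_with_features(values, features, rules))
```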
The method provided by the embodiment of the present invention can be applied to a scenario in which a third-party shopping platform audits the qualification picture of a product provided by a manufacturer to confirm whether the product qualifies for listing. In this scenario, the third-party platform determines that a product qualifies for listing when the name of the product is the same as the product name in the qualification picture, the model of the product is the same as the product model in the qualification picture, and the quality inspection result of the product in the qualification picture is qualified.
The specific implementation mode comprises the following steps:
Step 1: acquire a qualification picture of the product.
Step 2: obtain the qualification text information included in the qualification picture using an OCR text recognition tool.
Step 3: locate the first, second, and third attribute items to be audited in the qualification text information according to a preset database.
The first attribute item to be audited may be text content such as "product name", "commodity name", "article name", "trade name", or "name". The second attribute item to be audited may be text content such as "product model", "commodity model", "article model", or "model". The third attribute item to be audited may be text content such as "product quality detection result", "commodity quality detection result", or "quality detection result".
All possible designations of the first, second, and third attribute items to be audited are stored in the database in advance. These designations are derived from the qualification pictures of historical products. For the specific process of generating the database, reference may be made to the above embodiments; details are not repeated here.
Step 4: extract from the qualification text information the first attribute value of the first attribute item to be audited, the second attribute value of the second attribute item to be audited, and the third attribute value of the third attribute item to be audited.
The first attribute value is the specific product name included in the qualification text information, the second attribute value is the specific product model included in the qualification text information, and the third attribute value is the specific quality inspection result of the product included in the qualification text information, for example "qualified".
Step 5: judge whether the first attribute value is consistent with the actual name of the product, whether the second attribute value is consistent with the actual model of the product, and whether the third attribute value indicates a qualified inspection result. If all three conditions hold, determine and output the result that the product qualifies for listing.
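Steps 3 to 5 can be sketched end to end as follows, starting from text that an OCR tool has already produced. The `DESIGNATIONS` table, the function names, the colon-separated line format, and the exact-match comparison are assumptions for illustration; the designations listed are a small subset of the examples given above.

```python
# Hypothetical database: each attribute item to be audited with the
# designations under which it may appear in a qualification picture.
DESIGNATIONS = {
    "name": ["product name", "commodity name", "article name"],
    "model": ["product model", "commodity model", "model"],
    "quality": ["quality detection result"],
}


def extract_fields(qualification_text):
    """Steps 3 and 4: locate known designations and extract their values."""
    fields = {}
    for line in qualification_text.splitlines():
        label, sep, value = line.partition(":")
        if not sep:
            continue
        label = label.strip().lower()
        for key, names in DESIGNATIONS.items():
            if label in names:
                fields[key] = value.strip()
    return fields


def has_listing_qualification(qualification_text, actual_name, actual_model):
    """Step 5: all three checks must hold for the product to qualify."""
    fields = extract_fields(qualification_text)
    return (fields.get("name") == actual_name
            and fields.get("model") == actual_model
            and fields.get("quality") == "qualified")


text = (
    "Commodity name: disposable medical mask\n"
    "Model: KN-95\n"
    "Quality detection result: qualified"
)
print(has_listing_qualification(text, "disposable medical mask", "KN-95"))
```

Exact string comparison is the simplest reading of "consistent with"; a similarity threshold, as in the embodiment of FIG. 3, could be substituted to tolerate OCR errors.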
The method provided by the embodiment can improve the accuracy and efficiency of auditing the qualification picture of the product and can save a great deal of manpower.
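As an illustration of the extraction and judgment steps above, the following minimal Python sketch compares three extracted attribute values with the product's declared name and model and with a passing inspection result. The `Product` class, the dictionary keys, the sample values, and the use of exact string equality for "consistent" are assumptions made for this example; the embodiment does not prescribe them.

```python
from dataclasses import dataclass


@dataclass
class Product:
    """Hypothetical record of the name and model declared for the product."""
    name: str
    model: str


def has_shelf_qualification(extracted: dict, product: Product,
                            passing_result: str = "qualified") -> bool:
    """Return True only if all three extracted attribute values pass.

    Assumption: "consistent" is modelled as exact equality after trimming
    surrounding whitespace.
    """
    name_ok = extracted.get("name", "").strip() == product.name
    model_ok = extracted.get("model", "").strip() == product.model
    result_ok = extracted.get("quality_result", "").strip() == passing_result
    return name_ok and model_ok and result_ok


# Illustrative values only; they do not come from the patent.
extracted_values = {
    "name": "Insulated Cable",
    "model": "YJV-3x25",
    "quality_result": "qualified",
}
product = Product(name="Insulated Cable", model="YJV-3x25")
print(has_shelf_qualification(extracted_values, product))
```

A mismatch in any one of the three values makes the function return `False`, which mirrors the requirement that all three judgments succeed.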
Fig. 4 is a schematic structural diagram of an information processing apparatus 400 according to an embodiment of the present application, including:
an obtaining unit 401, configured to obtain a file to be checked of an object;
the recognition unit 402 is configured to perform text detection and text recognition on the file to be checked, so as to obtain text information included in the file to be checked;
the extracting unit 403 is configured to locate an attribute item to be audited from the text information, and extract an attribute value corresponding to the attribute item to be audited from the text information; the attribute items to be audited are audit attribute items included in a pre-constructed database; any one audit attribute item in the database is extracted from the historical to-be-audited file;
and the auditing unit 404 is configured to audit the auditing information according to a preset auditing rule, so as to obtain an auditing result for the object, where the auditing information at least includes an attribute value corresponding to the attribute item to be audited.
The apparatus 400 further includes a construction unit 405 configured to construct the database. The specific implementation process is: acquiring a sample set of files to be audited, the sample set comprising a plurality of historical files to be audited; acquiring target audit attribute items, a target audit attribute item being an audit attribute item specified by the audit rule; and, for each target audit attribute item, extracting from each historical file to be audited the text content whose similarity to the target audit attribute item reaches a first threshold, and taking that text content as an audit attribute item of the database.
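The database construction described above can be sketched as follows. This is a minimal sketch under stated assumptions: each historical file is already available as a list of recognized text lines, a candidate label is the text before the first colon on a line, and similarity is approximated with the character-level ratio from Python's standard `difflib.SequenceMatcher`. The patent does not name a similarity measure, and the function names, sample lines, and threshold value are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for the unspecified measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def build_attribute_database(historical_texts, target_items, first_threshold):
    """Collect, per target audit attribute item, the labels seen in historical files.

    historical_texts: list of files, each a list of recognized text lines.
    target_items: audit attribute items specified by the audit rule.
    """
    database = {target: set() for target in target_items}
    for lines in historical_texts:
        for line in lines:
            # Assumption: the candidate label is the text before the first colon.
            label = line.split(":", 1)[0].strip()
            if not label:
                continue
            for target in target_items:
                if similarity(label, target) >= first_threshold:
                    database[target].add(label)
    return database


# Illustrative historical text lines; not taken from the patent.
historical_texts = [
    ["Product name: Cable A", "Model: X1"],
    ["Commodity name: Cable B", "Issued by: Test Lab"],
]
database = build_attribute_database(
    historical_texts, ["product name", "model"], first_threshold=0.5
)
print(database)
```

A character-level ratio only captures spelling overlap. A production system would more plausibly use a semantic similarity measure so that synonyms with little shared spelling are also collected.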
Optionally, the extracting unit 403 locates the attribute item to be audited in the text information and extracts the corresponding attribute value as follows: taking first text content in the text information that is identical to an audit attribute item in the database as the attribute item to be audited; determining the indication identifier corresponding to the attribute item to be audited in the text information; and taking the second text content indicated by the indication identifier as the attribute value corresponding to the attribute item to be audited, and extracting the second text content.
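The locating and extraction just described can be sketched as follows, assuming that the indication identifier is a colon (half-width or full-width) separating a label from its value on the same line. The patent does not fix the form of the identifier, so the `indicators` parameter, the database layout, and the sample lines are hypothetical.

```python
def extract_attribute_values(text_lines, database, indicators=(":", "：")):
    """Map each located attribute item to the value that follows its indicator.

    database: dict mapping a canonical attribute item to the set of
    designations stored for it (hypothetical layout).
    """
    designation_to_item = {
        designation: item
        for item, designations in database.items()
        for designation in designations
    }
    values = {}
    for line in text_lines:
        # Find the earliest indication identifier on the line, if any.
        positions = [(line.find(mark), mark) for mark in indicators if mark in line]
        if not positions:
            continue
        pos, mark = min(positions)
        label = line[:pos].strip()
        value = line[pos + len(mark):].strip()
        # The label must be identical to a designation stored in the database.
        item = designation_to_item.get(label)
        if item is not None:
            values[item] = value
    return values


# Illustrative database and recognized lines; not taken from the patent.
attribute_db = {
    "product name": {"Product name", "Commodity name"},
    "model": {"Model"},
}
recognized_lines = [
    "Commodity name: Insulated Cable",
    "Model：YJV-3x25",
    "Issued by: Test Lab",
]
print(extract_attribute_values(recognized_lines, attribute_db))
```

Lines whose label is not in the database, such as "Issued by" here, are ignored, so only content that needs auditing is extracted.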
Optionally, the auditing unit 404 audits the audit information according to the preset audit rule and obtains the audit result for the object as follows: for the attribute value corresponding to each attribute item to be audited, determining the audit provision, included in the audit rule, that corresponds to that attribute item; judging whether the attribute value meets the requirement of the audit provision and, if so, determining that the audit result for the attribute value of the attribute item is a pass; and obtaining the audit result for the object from the audit results of the attribute values of all attribute items to be audited.
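A minimal sketch of this per-item audit is given below. It assumes that the audit rule can be represented as a dictionary mapping each attribute item to a predicate on its value, that an item with no extracted value fails, and that the object passes only when every item passes. The patent leaves both the representation and the aggregation open, so these choices and all names are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Hypothetical representation: one predicate per attribute item.
Provision = Callable[[str], bool]


def audit_object(values: Dict[str, str],
                 provisions: Dict[str, Provision]) -> Tuple[bool, Dict[str, bool]]:
    """Audit each attribute value against its provision and aggregate.

    Assumptions: a missing value fails its item, and the object passes
    only if every item passes.
    """
    item_results = {}
    for item, provision in provisions.items():
        value = values.get(item)
        item_results[item] = value is not None and provision(value)
    return all(item_results.values()), item_results


# Illustrative rule and values; not taken from the patent.
audit_rule = {
    "quality_result": lambda v: v == "qualified",
    "model": lambda v: len(v) > 0,
}
passed, details = audit_object(
    {"quality_result": "qualified", "model": "YJV-3x25"}, audit_rule
)
print(passed, details)
```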
Optionally, the obtaining unit 401 is further configured to obtain feature information of the object.
Optionally, the auditing unit 404 may alternatively audit the audit information according to the preset audit rule and obtain the audit result for the object as follows: for each piece of feature information, acquiring the attribute value corresponding to the feature information from the attribute values of the attribute items to be audited, and determining that the audit result for the attribute value is a pass when the similarity between the attribute value and the feature information reaches a second threshold; here, feature information corresponds to an attribute value when the audit attribute item preset for the feature information is the same as the attribute item to be audited to which the attribute value belongs;
for the attribute value of each attribute item to be audited that does not correspond to any feature information, determining the audit provision, included in the audit rule, that corresponds to that attribute item; and determining that the audit result for the attribute value is a pass when the attribute value meets the requirement of the audit provision;
and obtaining an auditing result aiming at the object according to the auditing result of the attribute value of each attribute item to be audited.
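The two-branch audit above can be sketched as follows. The sketch assumes that feature information and provisions are both keyed by attribute item, that similarity is the character-level ratio from the standard `difflib.SequenceMatcher`, and that an item with neither feature information nor a provision is treated as failing. None of these choices is stated in the patent, and all names and sample values are hypothetical.

```python
from difflib import SequenceMatcher


def audit_with_features(values, features, provisions, second_threshold):
    """Audit values against feature information first, then against provisions.

    values: attribute item -> extracted attribute value.
    features: attribute item -> feature information of the object.
    provisions: attribute item -> predicate on the value.
    """
    results = {}
    for item, value in values.items():
        if item in features:
            # Branch 1: compare the value with the object's feature information.
            score = SequenceMatcher(None, value.lower(), features[item].lower()).ratio()
            results[item] = score >= second_threshold
        elif item in provisions:
            # Branch 2: no feature information, so apply the item's provision.
            results[item] = provisions[item](value)
        else:
            # Assumption: an item with nothing to check against does not pass.
            results[item] = False
    return all(results.values()), results


# Illustrative inputs; not taken from the patent.
extracted = {
    "name": "Insulated Cable",
    "model": "YJV-3x25",
    "quality_result": "qualified",
}
object_features = {"name": "insulated cable", "model": "YJV-3x25"}
item_provisions = {"quality_result": lambda v: v == "qualified"}
overall, per_item = audit_with_features(
    extracted, object_features, item_provisions, second_threshold=0.9
)
print(overall, per_item)
```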
Optionally, the file to be audited is a picture, and the recognition unit 402 performs text detection and text recognition on it as follows: preprocessing the picture; detecting the text included in the preprocessed picture with a text detection tool; and recognizing the detected text with a text recognition tool to obtain the text information included in the picture.
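The three stages above (preprocessing, detection, recognition) can be sketched as a small pipeline. Because the patent does not name the text detection or text recognition tools, they are passed in as callables; the stand-in functions, the grayscale list-of-lists image format, and binarization as the preprocessing step are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]            # grayscale pixels, 0..255 (assumed format)
Box = Tuple[int, int, int, int]    # (top, left, bottom, right) of a text region


def binarize(image: Image, threshold: int = 128) -> Image:
    """Assumed preprocessing step: map each pixel to pure black or white."""
    return [[255 if p >= threshold else 0 for p in row] for row in image]


def extract_text(image: Image,
                 detect: Callable[[Image], Sequence[Box]],
                 recognize: Callable[[Image, Box], str],
                 preprocess: Callable[[Image], Image] = binarize) -> List[str]:
    """Run preprocessing, then detection, then recognition on each detected box."""
    cleaned = preprocess(image)
    boxes = detect(cleaned)
    return [recognize(cleaned, box) for box in boxes]


# Stand-ins for real detection and recognition tools (hypothetical).
def fake_detect(image: Image) -> Sequence[Box]:
    return [(0, 0, len(image), len(image[0]))]


def fake_recognize(image: Image, box: Box) -> str:
    return "Model: X1"


sample_picture = [[10, 200], [130, 90]]
print(extract_text(sample_picture, fake_detect, fake_recognize))
```

Keeping the tools behind callables lets the detection and recognition components be replaced without changing the order of the stages.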
Optionally, the recognition unit 402 performs text detection and text recognition on the file to be audited, to obtain the text information it includes, when the picture quality value of the picture is greater than a third threshold.
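The quality gate above can be sketched as follows. The patent does not define how the picture quality value is computed, so this sketch uses the standard deviation of grayscale pixel intensities as a hypothetical contrast-based score; the function names, the image format, and the threshold value are likewise assumptions.

```python
from statistics import pstdev


def picture_quality(image):
    """Hypothetical quality value: population standard deviation of pixel intensities."""
    pixels = [p for row in image for p in row]
    return pstdev(pixels)


def recognize_if_clear(image, third_threshold, run_ocr):
    """Run text detection and recognition only when quality exceeds the threshold.

    Returns None when the picture is rejected by the quality gate.
    """
    if picture_quality(image) > third_threshold:
        return run_ocr(image)
    return None


flat_picture = [[100, 100], [100, 100]]   # no contrast at all
sharp_picture = [[0, 255], [255, 0]]      # maximal contrast

# The lambda stands in for the real detection and recognition stage.
print(recognize_if_clear(flat_picture, 10, lambda img: "text"))
print(recognize_if_clear(sharp_picture, 10, lambda img: "text"))
```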
With this apparatus, the file to be audited of an object is obtained; text detection and text recognition are performed on it to obtain the text information it includes; the attribute items to be audited are located in the text information and their corresponding attribute values are extracted; and the audit information is audited according to the preset audit rule to obtain the audit result for the object. Because the attribute items to be audited are audit attribute items included in the pre-constructed audit attribute item database, the attribute values extracted from the text information are guaranteed to be the content that needs to be audited. Auditing these attribute values is therefore equivalent to auditing the content of the text information that requires auditing, which improves the accuracy of auditing the file and makes the audit result for the object more credible.
The embodiment of the application further provides an electronic device 500, a schematic structural diagram of which is shown in fig. 5, which specifically includes: a processor 501 and a memory 502, the memory 502 for storing a program; the processor 501 is configured to execute a program to implement the method of information processing of the present application, that is, to execute the following steps:
acquiring a file to be checked of an object;
performing text detection and text recognition on the file to be checked to obtain text information included in the file to be checked;
positioning attribute items to be audited from the text information, and extracting attribute values corresponding to the attribute items to be audited from the text information; the attribute items to be audited are audit attribute items included in a pre-constructed database; any one audit attribute item in the database is extracted from the historical to-be-audited file;
and auditing the audit information according to a preset audit rule to obtain an audit result aiming at the object, wherein the audit information at least comprises an attribute value corresponding to the attribute item to be audited.
Optionally, the process of constructing the database includes: acquiring a sample set of files to be audited, the sample set comprising a plurality of historical files to be audited; acquiring target audit attribute items, a target audit attribute item being an audit attribute item specified by the audit rule; and, for each target audit attribute item, extracting from each historical file to be audited the text content whose similarity to the target audit attribute item reaches a first threshold, and taking that text content as an audit attribute item of the database.
Optionally, locating the attribute item to be audited from the text information, and extracting the attribute value corresponding to the attribute item to be audited from the text information, including: taking the first text content which is the same as the auditing attribute item in the database in the text information as the attribute item to be audited of the text information; and determining an indication identifier corresponding to the attribute item to be checked in the text information, taking the second text content indicated by the indication identifier as an attribute value corresponding to the attribute item to be checked, and extracting the second text content.
The embodiments of the present application also provide a computer readable storage medium having instructions stored therein, which when run on a computer, cause the computer to perform the method of information processing of the present application, i.e. to perform the steps of:
acquiring a file to be checked of an object;
performing text detection and text recognition on the file to be checked to obtain text information included in the file to be checked;
positioning attribute items to be audited from the text information, and extracting attribute values corresponding to the attribute items to be audited from the text information; the attribute items to be audited are audit attribute items included in a pre-constructed database; any one audit attribute item in the database is extracted from the historical to-be-audited file;
and auditing the audit information according to a preset audit rule to obtain an audit result aiming at the object, wherein the audit information at least comprises an attribute value corresponding to the attribute item to be audited.
Optionally, the process of constructing the database includes: acquiring a sample set of files to be audited, the sample set comprising a plurality of historical files to be audited; acquiring target audit attribute items, a target audit attribute item being an audit attribute item specified by the audit rule; and, for each target audit attribute item, extracting from each historical file to be audited the text content whose similarity to the target audit attribute item reaches a first threshold, and taking that text content as an audit attribute item of the database.
Optionally, locating the attribute item to be audited from the text information, and extracting the attribute value corresponding to the attribute item to be audited from the text information, including: taking the first text content which is the same as the auditing attribute item in the database in the text information as the attribute item to be audited of the text information; and determining an indication identifier corresponding to the attribute item to be checked in the text information, taking the second text content indicated by the indication identifier as an attribute value corresponding to the attribute item to be checked, and extracting the second text content.
If the functions described in the methods of the present application are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a storage medium readable by a computing device. Based on this understanding, the part of the embodiments of the present application that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computing device (which may be a personal computer, a server, a mobile computing device, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In this specification, the embodiments are described progressively: each embodiment focuses on its differences from the others, and for parts that are the same or similar, the embodiments may be referred to one another.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A method of information processing, comprising:
acquiring a file to be checked of an object;
performing text detection and text recognition on the file to be checked to obtain text information included in the file to be checked;
positioning an attribute item to be audited from the text information, and extracting an attribute value corresponding to the attribute item to be audited from the text information; the attribute items to be checked are check attribute items included in a pre-constructed database; any one of the auditing attribute items in the database is extracted from a history to-be-audited file;
according to a preset auditing rule, auditing the auditing information to obtain an auditing result aiming at the object, wherein the auditing information at least comprises the attribute value corresponding to the attribute item to be audited;
the positioning the attribute item to be audited from the text information, and extracting the attribute value corresponding to the attribute item to be audited from the text information, includes:
taking the first text content which is the same as the auditing attribute item in the database in the text information as the attribute item to be audited of the text information;
determining an indication identifier corresponding to the attribute item to be checked in the text information, taking the second text content indicated by the indication identifier as the attribute value corresponding to the attribute item to be checked, and extracting the second text content;
the auditing information according to the preset auditing rule to obtain the auditing result aiming at the object comprises the following steps:
determining an auditing rule corresponding to the attribute item to be audited, which is included in the auditing rule, according to the attribute value corresponding to each attribute item to be audited; judging whether the attribute value meets the requirement of the auditing regulations, if so, determining that the auditing result of the attribute value aiming at the attribute item to be audited is auditing passing;
and obtaining an auditing result aiming at the object according to the auditing result of the attribute value of each attribute item to be audited.
2. The method of claim 1, wherein the process of building the database comprises:
acquiring a sample set of files to be checked, wherein the sample set of files to be checked comprises a plurality of history files to be checked;
acquiring a target audit attribute item, wherein the target audit attribute item is an audit attribute item specified by the audit rule;
and extracting text content with the similarity reaching a first threshold value with the target audit attribute item from each historical to-be-audited file aiming at each target audit attribute item, and taking the text content as the audit attribute item of the database.
3. The method of claim 1, further comprising obtaining characteristic information of the object;
the audit information also comprises the characteristic information of the object;
and auditing the audit information according to the preset audit rule to obtain an audit result aiming at the object, wherein the audit result comprises the following steps:
for each piece of characteristic information, acquiring the attribute value corresponding to the characteristic information from the attribute value of the attribute item to be checked, and determining that the checking result for the attribute value is checking passing under the condition that the similarity between the attribute value and the characteristic information reaches a second threshold; the characteristic information corresponds to the attribute value, namely, the characteristic information presets that the corresponding auditing attribute item is the same as the attribute item to be audited corresponding to the attribute value;
determining an auditing rule corresponding to the attribute items to be audited, which is included in the auditing rule, according to the attribute value of each attribute item to be audited, which does not correspond to the characteristic information; under the condition that the attribute value meets the requirement of the auditing rule, determining that the auditing result of the attribute value aiming at the attribute item to be audited is auditing passing;
and obtaining an auditing result aiming at the object according to the auditing result of the attribute value of each attribute item to be audited.
4. The method of claim 1, wherein the document to be reviewed is a picture;
and performing text detection and text recognition on the file to be checked to obtain text information included in the file to be checked, wherein the text information includes:
preprocessing the picture;
detecting text included in the preprocessed picture based on a text detection tool;
and identifying the detected text based on a text identification tool to obtain the text information included in the picture.
5. The method of claim 4, wherein the performing text detection and text recognition on the document to be checked to obtain text information included in the document to be checked includes:
and under the condition that the picture quality value of the picture is larger than a third threshold value, performing text detection and text recognition on the file to be checked to obtain text information included in the file to be checked.
6. An apparatus for information processing, comprising:
the acquisition unit is used for acquiring the file to be checked of the object;
the identification unit is used for carrying out text detection and text identification on the file to be checked to obtain text information included in the file to be checked;
the extraction unit is used for positioning the attribute items to be checked from the text information and extracting attribute values corresponding to the attribute items to be checked from the text information; the attribute items to be checked are check attribute items included in a pre-constructed database; any one of the auditing attribute items in the database is extracted from a history to-be-audited file;
the auditing unit is used for auditing the auditing information according to a preset auditing rule to obtain an auditing result aiming at the object, wherein the auditing information at least comprises the attribute value corresponding to the attribute item to be audited;
the positioning the attribute item to be audited from the text information, and extracting the attribute value corresponding to the attribute item to be audited from the text information, includes:
taking the first text content which is the same as the auditing attribute item in the database in the text information as the attribute item to be audited of the text information;
determining an indication identifier corresponding to the attribute item to be checked in the text information, taking the second text content indicated by the indication identifier as the attribute value corresponding to the attribute item to be checked, and extracting the second text content;
the auditing information according to the preset auditing rule to obtain the auditing result aiming at the object comprises the following steps:
determining an auditing rule corresponding to the attribute item to be audited, which is included in the auditing rule, according to the attribute value corresponding to each attribute item to be audited; judging whether the attribute value meets the requirement of the auditing regulations, if so, determining that the auditing result of the attribute value aiming at the attribute item to be audited is auditing passing;
and obtaining an auditing result aiming at the object according to the auditing result of the attribute value of each attribute item to be audited.
7. An electronic device, comprising: a processor and a memory for storing a program; the processor is configured to run the program to implement the method of information processing according to any one of claims 1 to 5.
8. A computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the method of information processing according to any of claims 1-5.
CN202010599478.6A 2020-06-28 2020-06-28 Information processing method and device, electronic equipment and computer readable storage medium Active CN111753817B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010599478.6A CN111753817B (en) 2020-06-28 2020-06-28 Information processing method and device, electronic equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111753817A CN111753817A (en) 2020-10-09
CN111753817B true CN111753817B (en) 2024-01-26

Family

ID=72677677

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010599478.6A Active CN111753817B (en) 2020-06-28 2020-06-28 Information processing method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111753817B (en)

Also Published As

Publication number Publication date
CN111753817A (en) 2020-10-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Address after: 100053 room 8018, 8 / F, building 7, Guangyi street, Xicheng District, Beijing
Applicant after: State Grid Digital Technology Holdings Co.,Ltd.
Applicant after: State Grid E-Commerce Technology Co.,Ltd.
Address before: 311 guanganmennei street, Xicheng District, Beijing 100053
Applicant before: STATE GRID ELECTRONIC COMMERCE Co.,Ltd.
Applicant before: State Grid E-Commerce Technology Co.,Ltd.
GR01 Patent grant