CN115169335B - Invoice data calibration method and device, computer equipment and storage medium - Google Patents

Info

Publication number
CN115169335B
Authority
CN
China
Prior art keywords
character
calibrated
current
reference character
character string
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211087287.7A
Other languages
Chinese (zh)
Other versions
CN115169335A (en)
Inventor
张民遐
许金明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Gaodeng Computer Technology Co ltd
Original Assignee
Shenzhen Gaodeng Computer Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Gaodeng Computer Technology Co ltd filed Critical Shenzhen Gaodeng Computer Technology Co ltd
Priority to CN202211087287.7A priority Critical patent/CN115169335B/en
Publication of CN115169335A publication Critical patent/CN115169335A/en
Application granted granted Critical
Publication of CN115169335B publication Critical patent/CN115169335B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Character Discrimination (AREA)

Abstract

The application relates to an invoice data calibration method, an invoice data calibration device, computer equipment and a storage medium. The method comprises the following steps: acquiring invoice data to be calibrated and invoice head-up library data, and determining a character string to be calibrated in the invoice data to be calibrated; for each reference character string included in the invoice head-up library data, carrying out similarity detection on the reference character string and the character string to be calibrated to obtain a detection result of the corresponding reference character string; screening a target reference character string from the plurality of reference character strings according to the detection result corresponding to each reference character string and a preset detection threshold value; and calibrating the character string to be calibrated through the target reference character string to obtain a target character string. By adopting the method, the accuracy of invoice data calibration can be improved.

Description

Invoice data calibration method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an invoice data calibration method, apparatus, computer device, and storage medium.
Background
With the development of computer technology, financial auditing is now mainly carried out after invoices have been recognized, so the authenticity and compliance of the recognized invoice data is an important research topic at the present stage.
At present, after an invoice image is recognized by optical character recognition to obtain initial invoice data, the invoice is verified through a third-party invoice verification interface, and the calibrated invoice data is then determined according to the verification result. However, a third party cannot verify invoices in real time, and the invoice data obtained in this way is usually not highly accurate. Therefore, how to verify invoice data in real time so as to improve the accuracy of invoice data calibration is the problem to be solved by the present disclosure.
Disclosure of Invention
In view of the above, there is a need to provide an invoice data calibration method, apparatus, computer device and computer readable storage medium capable of improving accuracy of invoice data.
In a first aspect, the present application provides a method for invoice data calibration. The method comprises the following steps:
acquiring invoice data to be calibrated and invoice head-up library data, and determining character strings to be calibrated in the invoice data to be calibrated; the invoice head-up library data comprises a plurality of reference character strings;
for each reference character string in a plurality of reference character strings, carrying out similarity detection on the reference character string and the character string to be calibrated to obtain a detection result of the corresponding reference character string;
screening out a target reference character string from the plurality of reference character strings according to the detection result corresponding to each reference character string and a preset detection threshold value;
and calibrating the character string to be calibrated through the target reference character string to obtain a target character string.
In one embodiment, the similarity detection of the reference character string and the character string to be calibrated to obtain a detection result of the corresponding reference character string includes: obtaining an initial result, and determining a current reference character in the reference character string and a current character to be calibrated in the character string to be calibrated; carrying out similarity detection on the current reference character and the current character to be calibrated to obtain a first sub-result of the current reference character; determining a detection sub-result of the current reference character according to the first sub-result; entering the next round of character similarity detection, determining a new current reference character, a new current character to be calibrated and a new initial result, returning to the step of performing similarity detection on the current reference character and the current character to be calibrated, and continuing to execute the step until a detection sub-result of the last reference character in the reference character string is obtained; and taking the detection sub-result of the last reference character as the detection result corresponding to the reference character string.
In one embodiment, determining a detection sub-result for the current reference character based on the first sub-result comprises: if the first sub-result represents that the current reference character is similar to the current character to be calibrated, overlapping the initial result and the first sub-result to obtain a detection sub-result of the current reference character; determining a new current reference character, a new current character to be calibrated and a new initial result, returning to the step of detecting the similarity between the current reference character and the current character to be calibrated, and continuing to execute the steps, wherein the steps comprise: determining a next reference character in the reference character string and a next character to be calibrated in the character string to be calibrated, taking the next reference character as a new current reference character, taking the next character to be calibrated as a new current character to be calibrated, and taking a detection sub-result of the current reference character as a new initial result; and returning to the step of detecting the similarity between the current reference character and the current character to be calibrated, and continuing to execute the steps.
In one embodiment, the similarity detection of the current reference character and the current character to be calibrated to obtain a first sub-result of the current reference character includes: judging whether the current reference character is the same as the current character to be calibrated or not; if the reference characters are the same as the characters to be calibrated, determining the similarity score between the current reference character and the current character to be calibrated as a preset first score; and obtaining a first sub-result corresponding to the current reference character according to the difference between the preset first score and the preset similarity threshold.
In one embodiment, the method further comprises: if not, acquiring a similar character list corresponding to the current character to be calibrated; the similar character list comprises at least one similar character and a similarity score corresponding to each similar character; finding out a target similar character which is the same as the current reference character from at least one similar character; and obtaining a first sub-result corresponding to the current reference character according to the difference between the similarity score corresponding to the target similar character and the similarity threshold.
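The two embodiments above (identical characters, then lookup in a similar character list) can be sketched as follows. This is a minimal illustration, not the patented implementation: the constant values, the contents of `SIMILAR_CHARS`, the sign of the difference (score minus threshold), and the use of `None` to represent "not similar" are all assumptions, since the source gives no concrete values.

```python
# Hypothetical constants; the source does not state concrete values.
FIRST_SCORE = 1.0           # preset first score for identical characters
SIMILARITY_THRESHOLD = 1.0  # preset similarity threshold

# Hypothetical similar character list, keyed by the character to be calibrated.
# Each entry maps a similar character to its similarity score.
SIMILAR_CHARS = {
    "O": {"0": 0.9, "D": 0.7},
    "l": {"1": 0.9, "I": 0.85},
    "B": {"8": 0.8},
}


def first_sub_result(ref_char, cal_char):
    """Return (score - threshold), or None when the characters are not similar."""
    if ref_char == cal_char:
        # Identical characters receive the preset first score.
        return FIRST_SCORE - SIMILARITY_THRESHOLD
    # Otherwise look for the reference character among the characters
    # listed as similar to the character to be calibrated.
    score = SIMILAR_CHARS.get(cal_char, {}).get(ref_char)
    if score is None:
        return None  # assumption: None stands for "not similar"
    return score - SIMILARITY_THRESHOLD
```

For example, `first_sub_result("0", "O")` looks up the reference character `0` in the list for the recognized character `O` and returns a small negative deviation, while a pair that appears in neither branch yields `None`.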
In one embodiment, determining the detection sub-result for the current reference character based on the first sub-result comprises: if the first sub-result represents that the current reference character is not similar to the current character to be calibrated, acquiring a first character splitting list corresponding to the current character to be calibrated; according to the first character splitting list, carrying out similarity detection on the current reference character and the current character to be calibrated to obtain a second sub-result of the current reference character; superposing the initial result and the second sub-result to obtain a detection sub-result of the current reference character; determining a new current reference character, a new current character to be calibrated and a new initial result, returning to the step of detecting the similarity between the current reference character and the current character to be calibrated, and continuing to execute the steps, wherein the steps comprise: updating the character string to be calibrated to obtain an updated character string to be calibrated, and determining a current updated calibration character in the updated character string to be calibrated; taking the detection sub-result of the current reference character as the detection sub-result of the next reference character in the reference character string, which is adjacent to the current reference character; taking the reference character two positions after the current reference character in the reference character string as the new current reference character, taking the updated calibration character two positions after the current updated calibration character in the updated character string to be calibrated as the new current character to be calibrated, and taking the detection sub-result of the current reference character as a new initial result; and returning to the step of detecting the similarity between the current reference character and the current character to be calibrated and continuing to execute.
In one embodiment, the reference character string includes an i-th character M(i), and the character string to be calibrated includes an i-th character L(i), where i is a positive integer, and i is less than or equal to the number of candidate characters of the reference character string, or i is less than or equal to the number of candidate characters of the character string to be calibrated; according to the first character splitting list, carrying out similarity detection on the current reference character and the current character to be calibrated to obtain a second sub-result of the current reference character comprises: determining that the current reference character is character M(i) and that the current character to be calibrated is character L(i); determining a first split character and a second split character in the first character splitting list; judging whether the first split character is the same as character M(i) and whether the second split character is the same as character M(i+1); if both are the same, obtaining a second preset score between character M(i) and the first split character, and a third preset score between character M(i+1) and the second split character; and obtaining the second sub-result of character M(i) according to the second preset score, the third preset score and a preset similarity threshold.
In one embodiment, the method further comprises: when the first split character is different from character M(i), or the second split character is different from character M(i+1), stopping the process of detecting the similarity between the reference character string and the character string to be calibrated.
In one embodiment, updating the character string to be calibrated to obtain an updated character string to be calibrated includes: moving each character from the (i+1)-th character L(i+1) to the n-th character L(n) in the character string to be calibrated by one character position toward the tail of the character string to be calibrated, to obtain a candidate character string to be calibrated; adding a null character to the tail of the reference character string; assigning character M(i) to character L(i) to obtain a new character L(i), and assigning character M(i+1) to character L(i+1) to obtain a new character L(i+1); and combining the new character L(i), the new character L(i+1) and the candidate character string to be calibrated to obtain the updated character string to be calibrated.
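The update described above, for the case where one recognized character stands in for two reference characters, can be sketched as follows. The function name `update_for_split`, the use of `"\0"` as the null character, 0-based indexing, and the example characters are assumptions made for illustration only.

```python
NULL = "\0"  # assumption: the null character used for padding


def update_for_split(reference, to_calibrate, i):
    """Apply the update at 0-based position i.

    Returns (new_reference, new_to_calibrate) as lists of characters.
    Requires i + 1 < len(reference).
    """
    ref = list(reference)
    cal = list(to_calibrate)
    # Shift L(i+1)..L(n) one position toward the tail by opening a slot at i+1.
    cal.insert(i + 1, NULL)
    # Add a null character to the tail of the reference character string.
    ref.append(NULL)
    # Assign M(i) to L(i) and M(i+1) to L(i+1).
    cal[i] = ref[i]
    cal[i + 1] = ref[i + 1]
    return ref, cal
```

As a usage illustration with hypothetical data, if the reference is `["日", "月", "司"]` and recognition produced `["明", "司", "\0"]`, calling `update_for_split` with `i = 0` yields two lists of equal length in which the first two positions of the string to be calibrated now hold `日` and `月`.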
In one embodiment, determining the detection sub-result for the current reference character based on the first sub-result comprises: if the first sub-result represents that the current reference character is not similar to the current character to be calibrated, acquiring a second character splitting list corresponding to the current reference character; according to the second character splitting list, carrying out similarity detection on the current reference character and the current character to be calibrated to obtain a third sub-result of the current reference character; superposing the initial result and the third sub-result to obtain a detection sub-result of the current reference character; determining a new current reference character, a new current character to be calibrated and a new initial result, returning to the step of detecting the similarity between the current reference character and the current character to be calibrated, and continuing to execute the steps, wherein the steps comprise: updating the character string to be calibrated to obtain an updated character string to be calibrated, and determining a current updated calibration character in the updated character string to be calibrated; taking the next reference character in the reference character string, which is next to the current reference character, as a new current reference character, taking the next updated calibration character in the updated character string to be calibrated, which is next to the current updated calibration character, as a new current character to be calibrated, and taking the detection sub-result of the current reference character as a new initial result; and returning to the step of detecting the similarity between the current reference character and the current character to be calibrated, and continuing to execute the steps.
In one embodiment, the reference character string includes an i-th character M(i), and the character string to be calibrated includes an i-th character L(i), where i is a positive integer, and i is less than or equal to the number of candidate characters of the reference character string, or i is less than or equal to the number of candidate characters of the character string to be calibrated; according to the second character splitting list, carrying out similarity detection on the current reference character and the current character to be calibrated to obtain a third sub-result of the current reference character comprises: determining that the current reference character is character M(i) and that the current character to be calibrated is character L(i); determining a third split character and a fourth split character in the second character splitting list; judging whether the third split character is the same as character L(i) and whether the fourth split character is the same as character L(i+1); if both are the same, obtaining a fourth preset score between character M(i) and the second character splitting list; and obtaining the third sub-result of character M(i) according to the third preset score and the preset similarity threshold.
In one embodiment, the method further comprises: when the third split character is different from character L(i), or the fourth split character is different from character L(i+1), stopping the process of detecting the similarity between the reference character string and the character string to be calibrated.
In one embodiment, updating the character string to be calibrated to obtain an updated character string to be calibrated includes: moving each character from the (i+2)-th character L(i+2) to the n-th character L(n) in the character string to be calibrated by one character position toward the head of the character string to be calibrated, and adding a null character at the tail of the character string to be calibrated, to obtain a candidate character string to be calibrated; assigning character M(i) to character L(i) to obtain a new character L(i); and combining the new character L(i) and the candidate character string to be calibrated to obtain the updated character string to be calibrated.
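The update described above, for the case where two recognized characters stand in for one reference character, can be sketched in the same style. The function name `update_for_merge`, the null character `"\0"`, 0-based indexing, and the example characters are assumptions for illustration.

```python
NULL = "\0"  # assumption: the null character used for padding


def update_for_merge(reference, to_calibrate, i):
    """Apply the update at 0-based position i and return the new list.

    Requires i + 1 < len(to_calibrate) and i < len(reference).
    """
    cal = list(to_calibrate)
    # Moving L(i+2)..L(n) one position toward the head overwrites L(i+1).
    del cal[i + 1]
    # Add a null character at the tail so the length is unchanged.
    cal.append(NULL)
    # Assign M(i) to L(i).
    cal[i] = reference[i]
    return cal
```

With the hypothetical reference `["明", "司", "\0"]` and recognized string `["日", "月", "司"]`, calling `update_for_merge` with `i = 0` collapses the first two recognized characters into the single reference character and pads the tail.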
In one embodiment, after performing similarity detection on the reference character string and the character string to be calibrated to obtain a detection result of the corresponding reference character string, the method further includes: deleting the same number of trailing null characters from each reference character string and from the corresponding character string to be calibrated, to obtain each new reference character string and each new character string to be calibrated; and determining the number of target characters in each new reference character string. Screening a target reference character string from the plurality of reference character strings according to the detection result corresponding to each reference character string and a preset detection threshold comprises: respectively determining the detection value corresponding to each reference character string according to the detection result corresponding to each reference character string and the number of target characters, where the detection value comprises at least one of a variance value and a mean square error value; screening the reference character strings whose detection values are smaller than the preset detection threshold as candidate reference character strings, where the detection threshold comprises at least one of a variance threshold and a mean square error threshold; and taking the candidate reference character string with the minimum detection value as the target reference character string.
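The screening step above can be sketched as follows. The source does not give the formula for the detection value, so this sketch assumes that the detection result is an accumulated squared deviation and that the detection value is that result divided by the number of target characters (a mean-square-error style value). The function name `select_target` and the input layout are hypothetical.

```python
def select_target(results, threshold):
    """Pick the reference string with the smallest detection value.

    results maps each reference string to a tuple
    (detection_result, target_char_count); detection_result is None
    when similarity detection was stopped early for that string.
    Returns None when no reference string passes the threshold.
    """
    candidates = {}
    for ref, (result, count) in results.items():
        if result is None or count == 0:
            continue  # detection was stopped, or nothing to compare
        value = result / count  # assumed detection value
        if value < threshold:
            candidates[ref] = value
    if not candidates:
        return None
    # The candidate with the minimum detection value is the target.
    return min(candidates, key=candidates.get)
```

Returning `None` when no candidate passes leaves the decision about uncalibrated strings to the caller; the source does not say what happens in that case.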
In a second aspect, the application further provides an invoice data calibration device. The device comprises:
the data acquisition module is used for acquiring invoice data to be calibrated and invoice head-up library data and determining character strings to be calibrated in the invoice data to be calibrated; the invoice head-up library data comprises a plurality of reference character strings;
the similarity detection module is used for carrying out similarity detection on each reference character string in a plurality of reference character strings and the character string to be calibrated to obtain a detection result of the corresponding reference character string;
the character string determining module is used for screening out a target reference character string from the plurality of reference character strings according to the detection result corresponding to each reference character string and a preset detection threshold value; and calibrating the character string to be calibrated through the target reference character string to obtain a target character string.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the following steps when executing the computer program:
acquiring invoice data to be calibrated and invoice head-up library data, and determining character strings to be calibrated in the invoice data to be calibrated; the invoice head-up library data comprises a plurality of reference character strings;
for each reference character string in a plurality of reference character strings, carrying out similarity detection on the reference character string and the character string to be calibrated to obtain a detection result of the corresponding reference character string;
screening out a target reference character string from the plurality of reference character strings according to the detection result corresponding to each reference character string and a preset detection threshold;
and calibrating the character string to be calibrated through the target reference character string to obtain a target character string.
In a fourth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the steps of:
acquiring invoice data to be calibrated and invoice head-up library data, and determining character strings to be calibrated in the invoice data to be calibrated; the invoice head-up library data comprises a plurality of reference character strings;
for each reference character string in a plurality of reference character strings, carrying out similarity detection on the reference character string and the character string to be calibrated to obtain a detection result of the corresponding reference character string;
screening out a target reference character string from the plurality of reference character strings according to the detection result corresponding to each reference character string and a preset detection threshold;
and calibrating the character string to be calibrated through the target reference character string to obtain a target character string.
According to the invoice data calibration method, the invoice data calibration device, the computer equipment and the storage medium, the invoice data to be calibrated and the invoice head-up library data are obtained, the character string to be calibrated in the invoice data to be calibrated is determined, and for each reference character string in the invoice head-up library data, the similarity detection is carried out on the reference character string and the character string to be calibrated to obtain the detection result of the corresponding reference character string. Because the target reference character string is screened out firstly, and then the character string to be calibrated is calibrated, compared with the traditional mode of verifying by adopting an invoice checking interface of a third party, the method and the device can detect the similarity of each character string to be calibrated in real time, obtain the calibrated target character string and improve the accuracy of invoice data calibration. Meanwhile, because the system does not depend on an interface of a third party any more, the efficiency of obtaining real invoice data is ensured, and the management cost of financial management is also reduced.
Drawings
FIG. 1 is a diagram of an environment in which the invoice data calibration method may be used in one embodiment;
FIG. 2 is a schematic flow chart diagram of a method for invoice data calibration in one embodiment;
FIG. 3 is a diagram illustrating a structure of string comparison in one embodiment;
FIG. 4 is a diagram illustrating similar characters in a list of similar characters in one embodiment;
FIG. 5 is a flow diagram illustrating an exemplary process for determining a detection sub-result for a current reference character;
FIG. 6 is a diagram illustrating split characters in a character splitting list in one embodiment;
FIG. 7 is a flow diagram illustrating the determination of a new current reference character, a new current character to be calibrated, and a new initial result, in one embodiment;
FIG. 8 is a diagram illustrating an updated structure of a string in one embodiment;
FIG. 9 is a flow chart illustrating the process of determining the detection sub-result of the current reference character in another embodiment;
FIG. 10 is a flow diagram illustrating the determination of a new current reference character, a new current character to be calibrated, and a new initial result in another embodiment;
FIG. 11 is a diagram showing an updated structure of a character string in another embodiment;
FIG. 12 is a general architecture diagram of the invoice data calibration method in one embodiment;
FIG. 13 is a block diagram of the invoice data calibration apparatus in one embodiment;
FIG. 14 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The invoice data calibration method provided by the embodiment of the application can be applied to the application environment shown in fig. 1, in which the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104 or may be placed on the cloud or another network server. The terminal 102 is configured to send invoice data to be calibrated and invoice head-up library data to the server 104. The server 104 is configured to determine a character string to be calibrated in the invoice data to be calibrated, and to perform similarity detection on each reference character string included in the invoice head-up library data and the character string to be calibrated to obtain a detection result of the corresponding reference character string. The server 104 is further configured to screen out a target reference character string from the plurality of reference character strings according to the detection result corresponding to each reference character string and a preset detection threshold, and to calibrate the character string to be calibrated through the target reference character string to obtain the target character string. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, internet of things device or portable wearable device. The internet of things device may be a smart speaker, smart television, smart air conditioner, smart vehicle-mounted device, or the like. The portable wearable device may be a smart watch, a smart bracelet, a head-mounted device, or the like. The server 104 may be implemented as a stand-alone server or as a server cluster composed of multiple servers.
In one embodiment, as shown in fig. 2, there is provided an invoice data calibration method, which is illustrated by applying the method to the computer device in fig. 1, and includes the following steps:
step 202, acquiring invoice data to be calibrated and invoice head-up library data, and determining character strings to be calibrated in the invoice data to be calibrated.
The invoice head-up library data comprises a plurality of reference character strings. The invoice head-up library data is data pre-stored in the invoice head-up library by the enterprise system, and the invoice head-up library can be a relational database management system (such as MySQL) or a data structure storage system (such as Remote Dictionary Server, i.e. Redis).
Specifically, an invoice parsing controller and a recognition server are provided in the computer device, as shown in fig. 3, which is an overall architecture diagram of the invoice data calibration system; the system may also be referred to as an electronic reimbursement system. When a user submits an invoice, the invoice parsing controller controls the recognition server to perform Optical Character Recognition (OCR) on the invoice to obtain the invoice data to be calibrated. The invoice data to be calibrated includes a plurality of initial character strings, such as an enterprise name character string. The computer device determines the invoice type according to the invoice data to be calibrated. When the invoice type is an ordinary invoice, the character strings to be calibrated determined from the initial character strings include the enterprise name character string and the tax number character string; when the invoice type is a special value-added tax invoice, the character strings to be calibrated include the enterprise name character string, the tax number character string, the address character string, the account-opening bank character string and the like; when the invoice type is a general machine-printed invoice, a quota invoice, a railway ticket, an admission ticket or the like, the initial character strings do not need to be calibrated.
Meanwhile, the computer equipment provides a character deviation calculation engine which is used for receiving invoice data to be calibrated, which is obtained by the invoice analysis controller, and invoice head-up library data which is obtained from the invoice head-up library.
In one embodiment, if the address character string, the account-opening bank character string and the like in the invoice data to be calibrated are null while the invoice type is a special value-added tax invoice, the invoice is marked as an abnormal invoice.
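The type-dependent selection of character strings to be calibrated, together with the abnormal-invoice check, can be sketched as follows. The type labels, field names and function names are hypothetical, the source's "and the like" is not expanded, and treating any single missing field as abnormal is an assumption (the source does not say whether all fields must be null).

```python
# Hypothetical invoice type labels and field names.
FIELDS_BY_TYPE = {
    "ordinary": ["enterprise_name", "tax_number"],
    "special_vat": ["enterprise_name", "tax_number", "address", "bank"],
}


def strings_to_calibrate(invoice_type, fields):
    """Return the subset of recognized fields that needs calibration.

    Types without an entry (for example quota invoices or railway
    tickets) yield an empty dict, i.e. nothing is calibrated.
    """
    wanted = FIELDS_BY_TYPE.get(invoice_type, [])
    return {name: fields.get(name, "") for name in wanted}


def is_abnormal(invoice_type, fields):
    """Flag a special VAT invoice whose address or bank field is empty."""
    if invoice_type != "special_vat":
        return False
    # Assumption: one empty field is enough to mark the invoice abnormal.
    return any(not fields.get(name) for name in ("address", "bank"))
```

Keeping the mapping in a table rather than in branching code makes it straightforward to add further invoice types without touching the selection logic.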
In one embodiment, a head-up server is provided in the computer device. As shown in fig. 3, the head-up server creates an enterprise head-up list including a plurality of enterprise head-ups in response to an editing operation performed by an administrator on the invoice head-up library, and stores the enterprise head-up list in the invoice head-up library. Each enterprise head-up comprises an enterprise name character string, a tax number character string, an address character string, an account-opening bank character string and the like, and can be referenced by a unique head-up ID. The computer device treats the character strings included in the enterprise head-up list as the reference character strings in the invoice head-up library data.
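One way to picture the enterprise head-up list is as a collection of records keyed by head-up ID, from which the reference character strings for a given field are drawn. The class name `EnterpriseHeadUp`, its field names and the helper `reference_strings` are hypothetical; the source only states which kinds of strings a head-up contains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnterpriseHeadUp:
    """One entry of the enterprise head-up list (hypothetical layout)."""
    head_up_id: str
    enterprise_name: str
    tax_number: str
    address: str = ""
    bank: str = ""


def reference_strings(head_ups, field_name):
    """Map each head-up ID to its reference string for one field."""
    return {h.head_up_id: getattr(h, field_name) for h in head_ups}
```

For a tax number calibration, `reference_strings(head_ups, "tax_number")` would give one reference character string per enterprise head-up, which matches the one-to-one correspondence the description relies on.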
Step 204, for each reference character string in the plurality of reference character strings, performing similarity detection on the reference character string and the character string to be calibrated to obtain a detection result corresponding to that reference character string.
It is easy to understand that, when the character string to be calibrated is a tax number character string, the plurality of reference character strings in the invoice head-up library data are the tax number character strings in the invoice head-up library; that is, each reference character string is the tax number character string of one enterprise head-up. Since the calibration process is similar for different types of character strings, for simplicity the present application describes only the tax number character string as an example.
Since the process of detecting similarity against the character string to be calibrated is the same for every reference character string, for simplicity the present application describes any one reference character string as an example. That reference character string corresponds to any one enterprise head-up in the enterprise head-up list.
In one embodiment, performing similarity detection on the reference character string and the character string to be calibrated to obtain the detection result corresponding to the reference character string includes: acquiring an initial result, and determining a current reference character in the reference character string and a current character to be calibrated in the character string to be calibrated; performing similarity detection on the current reference character and the current character to be calibrated to obtain a first sub-result of the current reference character; determining a detection sub-result of the current reference character according to the first sub-result; entering the next round of character similarity detection, determining a new current reference character, a new current character to be calibrated, and a new initial result, and returning to the step of performing similarity detection on the current reference character and the current character to be calibrated, until the detection sub-result of the last reference character in the reference character string is obtained; and taking the detection sub-result of the last reference character as the detection result corresponding to the reference character string.
The initial result is the detection sub-result of the reference character immediately preceding the current reference character in the reference character string; when the current reference character is the first character of the reference character string, the initial result may be set to zero. The first sub-result represents the difference between the similarity score $S_i$ of the current reference character and the preset similarity threshold of 100; for example, the first sub-result is $(S_i - 100)^2$.
Specifically, since similarity detection between the reference character string and the character string to be calibrated is performed character by character, the computer device needs to determine the current reference character in the reference character string and the current character to be calibrated in the character string to be calibrated. Fig. 4 is a schematic diagram of character string comparison. As shown in fig. 4, if the current character to be calibrated is character No. 2, "1", in the character string st0 to be calibrated, the current reference character is character No. 2, "L", in the reference character string st1. The computer device then performs similarity detection on the current reference character and the current character to be calibrated and obtains the first sub-result of the current reference character as $(S_2 - 100)^2$.
In one embodiment, determining the detection sub-result of the current reference character according to the first sub-result includes: if the first sub-result indicates that the current reference character is similar to the current character to be calibrated, superimposing the initial result and the first sub-result to obtain the detection sub-result of the current reference character.
Specifically, if the difference between the similarity score $S_i$ and the preset similarity threshold of 100 is less than or equal to a preset difference value, the current reference character is considered similar to the current character to be calibrated. For example, when the preset difference value is 25 and the similarity score $S_2$ is 95, reference character No. 2 is similar to character to be calibrated No. 2. The initial result at this point is the detection sub-result of reference character No. 1, i.e. $(S_1 - 100)^2$. Therefore, after the computer device superimposes the initial result and the first sub-result, the detection sub-result of the current reference character is obtained as $(S_1 - 100)^2 + (S_2 - 100)^2$.
Further, determining the new current reference character, the new current character to be calibrated, and the new initial result, and returning to the step of performing similarity detection on the current reference character and the current character to be calibrated, includes: determining the next reference character in the reference character string and the next character to be calibrated in the character string to be calibrated; taking the next reference character as the new current reference character, the next character to be calibrated as the new current character to be calibrated, and the detection sub-result of the current reference character as the new initial result; and returning to the step of performing similarity detection on the current reference character and the current character to be calibrated.
Specifically, if the first sub-result indicates that the current reference character is similar to the current character to be calibrated, then, continuing the above example, the computer device takes reference character No. 3, "4", adjacent to reference character No. 2, as the new current reference character, takes character to be calibrated No. 3, adjacent to character to be calibrated No. 2, as the new current character to be calibrated, and takes the detection sub-result of reference character No. 2 as the new initial result. It then continues the similarity detection on the current reference character and the current character to be calibrated and obtains the first sub-result of reference character No. 3 as $(S_3 - 100)^2$. In summary, if the current reference character is character No. i in the reference character string, the corresponding detection sub-result is $\sum_{k=1}^{i} (S_k - 100)^2$. That is, the first sub-results of the reference characters are superimposed continuously up to character No. n of the reference character string, i.e. the last reference character, so that the detection sub-result of character No. n is the detection result corresponding to the reference character string.
In this embodiment, after the current reference character in the reference character string and the current character to be calibrated in the character string to be calibrated are determined, similarity detection is performed on them to obtain the first sub-result of the current reference character. The first sub-results of the reference characters are superimposed continuously, so that the detection result corresponding to the reference character string is finally obtained accurately. Meanwhile, because the detection sub-result of each reference character is associated with the detection sub-result of the previous reference character, the similarity score between the current reference character and the current character to be calibrated can be accurately obtained.
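The character-by-character superposition can be sketched as follows. The squared-difference form of the first sub-result is an assumption inferred from the worked numbers in the text (a score of 95 yields 25 and a score of 100 yields 0); the function names are hypothetical.

```python
SIMILARITY_THRESHOLD = 100  # preset similarity threshold given in the text


def first_sub_result(score):
    # Assumed squared difference between the score and the threshold.
    return (score - SIMILARITY_THRESHOLD) ** 2


def detection_result(scores):
    """Superimpose the first sub-results of all reference characters."""
    result = 0  # initial result for the first reference character
    for score in scores:
        # The running total is the detection sub-result of the current
        # character and the initial result for the next one.
        result += first_sub_result(score)
    return result
```

For example, `detection_result([100, 95, 100])` evaluates to 25, and an exact match of every character evaluates to 0.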
It is easy to understand that, in the example of the present application, the tax number character string to be calibrated includes only 12 characters; in practice, a tax number generally consists of a 15-, 17-, 18-, or 20-digit code, in which the characters at different positions represent information such as an area code, an economic nature code, and an industry code.
In one embodiment, before performing similarity detection on the reference character string and the character string to be calibrated, the computer device further needs to determine whether the initial number of characters of the reference character string is equal to the initial number of characters of the character string to be calibrated. If they are not equal, null characters are appended to the tail of the character string with fewer initial characters until its number of characters equals that of the character string with more initial characters, so that the reference character string and the character string to be calibrated have the same number of candidate characters.
In this embodiment, adding null characters to the reference character string or the character string to be calibrated ensures the accuracy of similarity detection between each current reference character and the corresponding current character to be calibrated.
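The length-equalising step might look like the sketch below. The choice of `"\0"` as the null character and the function name are assumptions made for illustration; the text does not say how a null character is represented.

```python
NULL_CHAR = "\0"  # hypothetical representation of the null character


def pad_to_equal_length(reference, to_calibrate, null_char=NULL_CHAR):
    """Append null characters to the shorter string until both lengths match."""
    target = max(len(reference), len(to_calibrate))
    return reference.ljust(target, null_char), to_calibrate.ljust(target, null_char)
```

After padding, position i in one string always has a counterpart at position i in the other, which is what the per-character comparison relies on.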
Step 206, screening out a target reference character string from the plurality of reference character strings according to the detection result corresponding to each reference character string and a preset detection threshold.
In one embodiment, after the detection result corresponding to each reference character string is obtained, the method further includes: deleting the same number of null characters from the tail of each reference character string and from the tail of the character string to be calibrated to obtain each new reference character string and a new character string to be calibrated; and determining the number of target characters in each new reference character string.
Detecting the similarity between the current reference character and the current character to be calibrated may involve updating the reference character string and the character string to be calibrated; that is, some characters in the two character strings may shift position, so that null characters may appear. Therefore, to ensure that the plurality of reference character strings can be accurately screened subsequently, the same number of null characters must be deleted from the tail of each reference character string and from the tail of the character string to be calibrated.
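One possible reading of this clean-up step is to remove, from both strings, the number of trailing null characters that they have in common. The sketch below follows that reading; the interpretation, the `"\0"` null character, and the function name are all assumptions.

```python
def strip_common_trailing_nulls(reference, to_calibrate, null_char="\0"):
    """Delete the same number of trailing null characters from both strings."""

    def trailing(text):
        # Count of null characters at the tail of the string.
        return len(text) - len(text.rstrip(null_char))

    count = min(trailing(reference), trailing(to_calibrate))
    if count == 0:
        return reference, to_calibrate
    return reference[:-count], to_calibrate[:-count]
```

The length of the returned reference string can then serve as the number of target characters used when computing the detection value.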
In one embodiment, screening out the target reference character string from the plurality of reference character strings according to the detection result corresponding to each reference character string and the preset detection threshold includes: determining the detection value corresponding to each reference character string according to its detection result and its number of target characters; taking the reference character strings whose detection values are smaller than the preset detection threshold as candidate reference character strings; and taking the candidate reference character string with the smallest detection value as the target reference character string.
The detection value includes at least one of a variance value and a mean square error value, and the detection threshold correspondingly includes at least one of a variance threshold and a mean square error threshold.
Specifically, the detection value corresponding to each reference character string may be determined as follows:

$$D = \frac{T}{n}$$

$$\sigma = \sqrt{D} = \sqrt{\frac{T}{n}}$$

where T is the detection result corresponding to the reference character string, n is the number of target characters of the reference character string, D is the variance value, and $\sigma$ is the mean square error value. The computer device obtains the preset detection threshold, compares it with the detection value corresponding to each reference character string, takes the reference character strings whose detection values are smaller than the detection threshold as candidate reference character strings, and takes the candidate reference character string with the smallest detection value as the target reference character string.
It is easy to understand that, when the reference character string and the character string to be calibrated are identical, the variance value or the mean square error value is 0; the larger the variance value or the mean square error value, the lower the similarity between the reference character string and the character string to be calibrated.
In one embodiment, if a plurality of candidate reference character strings share the smallest detection value, any one of them may be selected as the target reference character string.
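The screening step can be sketched as below. It assumes that the variance value is the detection result T divided by the number of target characters n and that the mean square error value is its square root; these forms are inferred from the symbol definitions, not quoted from the source, and the function names and the input mapping are hypothetical.

```python
import math


def detection_values(total, count):
    """Return (variance value, mean square error value) for one reference string."""
    variance = total / count  # assumes count > 0
    return variance, math.sqrt(variance)


def select_target(results, threshold):
    """results maps each reference string to (T, n); variance is the detection value."""
    candidates = {}
    for reference, (total, count) in results.items():
        variance, _ = detection_values(total, count)
        if variance < threshold:  # keep only strings below the detection threshold
            candidates[reference] = variance
    if not candidates:
        return None  # no reference string is close enough
    # On ties, min() returns the first candidate encountered, which is one
    # acceptable arbitrary choice.
    return min(candidates, key=candidates.get)
```

Returning `None` when no candidate survives leaves it to the caller to decide how the invoice is handled.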
In one embodiment, the computer device determines the detection threshold according to the preset similarity threshold, the number of characters of the character string to be calibrated, and a preset special value. The preset special value may be zero, which represents that the similarity score between a character in the character string to be calibrated and the corresponding character in the reference character string is 0. For example, when the number of characters of the character string to be calibrated is 18, the detection threshold is given by: [formula image not recoverable from the extracted text]
In one embodiment, if, after similarity detection is performed on the tax number character string in the invoice data to be calibrated, the detection results corresponding to all the reference character strings are greater than or equal to the preset detection threshold, the similarity detection of the other character strings in the invoice data to be calibrated is stopped.
In one embodiment, for the enterprise name character string, the tax number character string, the address character string, the account opening character string, and the like in the invoice data to be calibrated, similarity detection against the reference character strings may be performed for each type of character string in parallel.
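A minimal sketch of the parallel mode, using Python's standard thread pool. The text does not specify a concurrency mechanism, so the use of threads, the function `detect_in_parallel`, and the `detect` callback signature are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def detect_in_parallel(strings_to_calibrate, references_by_field, detect):
    """Run detect(value, references) for every field concurrently."""
    with ThreadPoolExecutor() as pool:
        futures = {
            field: pool.submit(detect, value, references_by_field.get(field, []))
            for field, value in strings_to_calibrate.items()
        }
        # Collect each field's result once its task has finished.
        return {field: future.result() for field, future in futures.items()}
```

Here `detect` stands for whatever per-field similarity detection routine is in use; each field is checked only against reference strings of the same type.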
Step 208, calibrating the character string to be calibrated using the target reference character string to obtain a target character string.
Specifically, the computer device corrects the character string to be calibrated to the target reference character string to obtain the target character string, and looks up, in the enterprise head-up list, the target enterprise head-up containing the target reference character string, so that the other character strings included in the target enterprise head-up can be used to calibrate the invoice data to be calibrated.
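The lookup-and-overwrite behaviour described here can be sketched as follows. The dictionary layout of an enterprise head-up, the field names, and the function name are hypothetical.

```python
def calibrate_with_head_up(invoice, head_up_list, field, target_reference):
    """Return a calibrated copy of invoice, or None if no head-up matches."""
    for head_up in head_up_list:
        if head_up.get(field) == target_reference:
            # Overwrite every invoice field that the matching head-up also
            # defines; fields unknown to the head-up are left unchanged.
            return {key: head_up.get(key, value) for key, value in invoice.items()}
    return None
```

Because every replacement value comes from a preconfigured head-up entry, the calibration introduces no character strings that are not already in the library.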
In one embodiment, when the character string to be calibrated is the tax number character string, after it has been calibrated and the target character string has been obtained, the computer device further needs to calibrate the address character string, the account opening character string, and the like.
In one embodiment, if the computer device determines that the detection value corresponding to the address character string or the account opening character string is greater than or equal to the detection threshold and the invoice type is a value-added special invoice, it determines that the invoice is an abnormal invoice. In this case, the address character string, the account opening character string, and the like in the invoice data to be calibrated need not be updated, and another enterprise head-up needs to be determined from the enterprise head-up list to provide the reference character string.
In one embodiment, when an invoice is flagged as an abnormal invoice, an administrator may be alerted to pay attention to the invoice, or a function may be triggered that prevents the user from submitting a reimbursement form containing the invoice.
In one embodiment, referring to fig. 3, once the character deviation calculation engine in the computer device obtains the target character string, the target invoice data, i.e. the calibrated invoice data containing the invoice elements of the invoice submitted by the user, can be obtained. The invoice parsing controller sends the target invoice data obtained from the character deviation calculation engine to the invoice management device, so that the invoice management device completes the audit of the reimbursement form corresponding to the invoice.
In this embodiment, the target reference character string is data configured in advance in the enterprise head-up list, so no new character string data is introduced when the character string to be calibrated is calibrated. This improves the accuracy of calibrating the character string to be calibrated and makes it more convenient for users to submit invoices for audit.
According to the above invoice data calibration method, the invoice data to be calibrated and the invoice head-up library data are obtained, the character string to be calibrated in the invoice data to be calibrated is determined, and, for each reference character string in the invoice head-up library data, similarity detection is performed on the reference character string and the character string to be calibrated to obtain the corresponding detection result. Because the present application first screens out the target reference character string and then calibrates the character string to be calibrated, compared with the traditional approach of verification through a third-party invoice checking interface, the present application can perform similarity detection on every character string to be calibrated in real time and obtain the calibrated target character string, which improves the accuracy of invoice data calibration.
In one embodiment, performing similarity detection on the current reference character and the current character to be calibrated to obtain the first sub-result of the current reference character includes: judging whether the current reference character is the same as the current character to be calibrated; if they are the same, determining the similarity score between the current reference character and the current character to be calibrated to be a preset first score; and obtaining the first sub-result corresponding to the current reference character according to the difference between the preset first score and the preset similarity threshold.
Specifically, referring to fig. 4, if the current reference character is reference character No. 1 and the current character to be calibrated is character to be calibrated No. 1, "9", the computer device determines that the current reference character is the same as the current character to be calibrated and sets the similarity score $S_1$ between them to 100, i.e. the preset first score. The computer device then determines the difference between the preset first score and the preset similarity threshold, i.e. $(S_1 - 100)^2 = (100 - 100)^2 = 0$, and obtains a first sub-result of 0 for the current reference character.
In one embodiment, the method further includes: if they are not the same, acquiring a similar character list corresponding to the current character to be calibrated; finding, among the at least one similar character in the list, a target similar character that is the same as the current reference character; and obtaining the first sub-result corresponding to the current reference character according to the difference between the similarity score corresponding to the target similar character and the similarity threshold.
The similar character list includes at least one similar character and a similarity score corresponding to each similar character. The similar character list is preconfigured in the system; fig. 5 is a schematic diagram of similar characters in the similar character list.
Specifically, referring to fig. 4, if the current reference character is reference character No. 2, "L", and the current character to be calibrated is character to be calibrated No. 2, "1", the computer device determines that the current reference character is different from the current character to be calibrated and obtains the similar character list corresponding to the current character to be calibrated. From that list, the computer device finds the target similar character "L", which is the same as the current reference character, and determines that its similarity score is 95. The computer device therefore determines the difference between this similarity score and the preset similarity threshold, i.e. $(95 - 100)^2 = 25$, and obtains a first sub-result of 25 for the current reference character. It is easy to understand that the smaller the first sub-result corresponding to the current reference character, the higher the similarity between the current reference character and the current character to be calibrated.
In one embodiment, the similar character list is stored in a glyph similarity library. The data in the glyph similarity library may be obtained by collating and entering the common Chinese characters in the national standard Chinese character coded character set, or by collating and entering the relevant Chinese characters and symbols in the invoice head-up library.
In one embodiment, the glyph similarity library may be a relational database management system, or a data structure storage system that serves as the middleware for databases, caches, and messages.
In this embodiment, the computer device first judges whether the current reference character is the same as the current character to be calibrated. If they are the same, the difference between the preset first score and the preset similarity threshold is determined directly; if they are not, the similar character list is obtained and the first sub-result corresponding to the current reference character is determined from the similarity score of the target similar character in that list. Multiple detection schemes are thus provided for detecting the similarity between the current reference character and the current character to be calibrated, which improves the flexibility of similarity detection.
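The two-branch character comparison can be sketched as below. The pairing of "1" with "L" at a score of 95 comes from the worked example; the dictionary layout, the function names, and the squared-difference form of the first sub-result are assumptions.

```python
# Hypothetical similar character list: character to be calibrated ->
# {similar character: similarity score}. Only the "1"/"L" entry is from the text.
SIMILAR_CHARACTERS = {"1": {"L": 95}}


def similarity_score(reference_char, char_to_calibrate, first_score=100):
    """Return the similarity score, or None if the characters are not similar."""
    if reference_char == char_to_calibrate:
        return first_score  # preset first score for identical characters
    return SIMILAR_CHARACTERS.get(char_to_calibrate, {}).get(reference_char)


def char_first_sub_result(reference_char, char_to_calibrate, threshold=100):
    score = similarity_score(reference_char, char_to_calibrate)
    if score is None:
        return None  # not similar: a further check would be needed
    return (score - threshold) ** 2  # assumed squared difference
```

A `None` result signals that neither branch applied, which is the situation the character splitting check is meant to handle.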
In one embodiment, as shown in fig. 6, determining a detection sub-result of the current reference character according to the first sub-result includes the following steps:
Step 602, if the first sub-result indicates that the current reference character is not similar to the current character to be calibrated, obtaining a first character splitting list corresponding to the current character to be calibrated.
When the computer device determines that the current reference character is different from the current character to be calibrated and finds no target similar character identical to the current reference character in the similar character list corresponding to the current character to be calibrated, the current reference character is considered not similar to the current character to be calibrated.
Specifically, referring to fig. 4, if the current reference character is reference character No. 4, "1", and the current character to be calibrated is character to be calibrated No. 4, "B", the two are not similar, and the computer device needs to obtain the first character splitting list corresponding to the current character to be calibrated. The first character splitting list includes two split characters and is any one of the character splitting lists in the character splitting library, which is preconfigured in the system. Fig. 7 is a schematic diagram of the split characters in a character splitting list.
In one embodiment, the first character splitting list is stored in the character splitting library. The data in the character splitting library may be obtained by collating and entering the common Chinese characters in the national standard Chinese character coded character set, or by collating and entering the relevant Chinese characters and symbols in the invoice head-up library.
In one embodiment, the character splitting library may be a relational database management system, or a data structure storage system that serves as middleware for databases, caches, and messages.
Step 604, performing similarity detection on the current reference character and the current character to be calibrated according to the first character splitting list to obtain a second sub-result of the current reference character.
In one embodiment, performing similarity detection on the current reference character and the current character to be calibrated according to the first character splitting list to obtain the second sub-result of the current reference character includes: determining that the current reference character is character M(i) and the current character to be calibrated is character L(i); determining a first split character and a second split character in the first character splitting list; judging whether the first split character is the same as character M(i) and whether the second split character is the same as character M(i+1); if both are the same, acquiring a second preset score between character M(i) and the first split character and a third preset score between character M(i+1) and the second split character; and obtaining the second sub-result of character M(i) according to the second preset score, the third preset score, and the preset similarity threshold.
The reference character string includes character No. i, M(i), and the character string to be calibrated includes character No. i, L(i), where i is a positive integer not greater than the number of candidate characters of the reference character string, which is the same as the number of candidate characters of the character string to be calibrated.
Specifically, referring to fig. 4, the computer device takes reference character No. 4, "1", as character M(i) and character to be calibrated No. 4, "B", as character L(i), with i = 4, and determines that, in the first character splitting list corresponding to the current character to be calibrated, the character "1" is the first split character and the other character "3" is the second split character. The computer device determines that the first split character is the same as M(i) and the second split character is the same as M(i+1), and then acquires the second preset score between character M(i) and the first split character and the third preset score between character M(i+1) and the second split character. The second and third preset scores are preset by the system and may be, for example, 99. The computer device determines a first difference between the second preset score and the preset similarity threshold, i.e. $(99 - 100)^2 = 1$, determines a second difference between the third preset score and the preset similarity threshold, i.e. $(99 - 100)^2 = 1$, and then superimposes the first difference and the second difference to obtain the second sub-result of character M(i) as $(99 - 100)^2 + (99 - 100)^2 = 2$.
In one embodiment, when the first character splitting list includes at least two split characters, the computer device needs to determine the plurality of split characters, for example three, in the first character splitting list and judge whether each split character matches character M(i), character M(i+1), and character M(i+2), respectively.
In one embodiment, the method further includes: when the first split character is different from character M(i) or the second split character is different from character M(i+1), stopping the similarity detection of the reference character string and the character string to be calibrated. This indicates that the initially selected reference character string st1 does not match the character string st0 to be calibrated, and a reference character string st2 from another enterprise head-up needs to be selected from the invoice head-up library to continue similarity detection with the character string st0 to be calibrated.
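The character splitting check can be sketched as follows. The split of "B" into "1" and "3" and the preset score of 99 come from the worked example; the lookup table layout, the function name, the 0-based index, the sample reference string used in the usage note, and the squared-difference form are assumptions.

```python
# Hypothetical character splitting library; only the "B" entry is from the text.
SPLIT_LISTS = {"B": ["1", "3"]}


def second_sub_result(reference, index, char_to_calibrate,
                      preset_score=99, threshold=100):
    """Compare the split characters with reference[index], reference[index + 1], ...

    index is 0-based. Returns None when detection against this reference
    string should stop.
    """
    parts = SPLIT_LISTS.get(char_to_calibrate)
    if parts is None:
        return None  # no splitting list configured for this character
    window = list(reference[index:index + len(parts)])
    if window != parts:
        return None  # a split character differs: try another reference string
    # One assumed squared difference per matched split character.
    return sum((preset_score - threshold) ** 2 for _ in parts)
```

With a reference string such as `"9L413MRHVV9DR"` (assumed here for illustration), `second_sub_result("9L413MRHVV9DR", 3, "B")` evaluates to 2.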
Step 606, superimposing the initial result and the second sub-result to obtain the detection sub-result of the current reference character.
For the specific implementation, refer to the process in step 204 of superimposing the initial result and the first sub-result to obtain the detection sub-result of the current reference character. In this embodiment, the initial result is determined to be the detection sub-result of character No. i-1, M(i-1), namely $\sum_{k=1}^{i-1} (S_k - 100)^2$, and the detection sub-result corresponding to character M(i) is then obtained as the sum of this initial result and the second sub-result.
In one embodiment, the computer device may take the detection sub-result of character M(i) as the detection sub-result of character M(i+1).
In this embodiment, when the current reference character is not similar to the current character to be calibrated, similarity detection is performed using the first character splitting list corresponding to the current character to be calibrated, so that characters conforming to a split-character structure are detected in a targeted manner. As a result, the current character to be calibrated can still be accurately identified when characters in the invoice data are deformed, which supports the subsequent calibration of the character string to be calibrated.
In one embodiment, as shown in fig. 8, if the first sub-result indicates that the current reference character is not similar to the current character to be calibrated, determining the new current reference character, the new current character to be calibrated, and the new initial result, and returning to the step of performing similarity detection on the current reference character and the current character to be calibrated, includes the following steps:
Step 802, updating the character string to be calibrated to obtain an updated character string to be calibrated, and determining a current updated calibration character in the updated character string to be calibrated.
In one embodiment, updating the character string to be calibrated to obtain an updated character string to be calibrated includes: moving each character from character No. i+1, L(i+1), to character No. n, L(n), in the character string to be calibrated by one character position toward the tail of the character string to be calibrated, to obtain a candidate character string to be calibrated; adding a null character to the tail of the reference character string; assigning character M(i) to character L(i) to obtain a new character L(i), and assigning character M(i+1) to character L(i+1) to obtain a new character L(i+1); and combining the new character L(i), the new character L(i+1), and the candidate character string to be calibrated to obtain the updated character string to be calibrated.
Specifically, referring to FIG. 4, when character L(i) corresponds to character to be calibrated No. 4, B, the computer device moves each character from character No. i+1, L(i+1), to character No. n, L(n), that is, the substring MRHVV9DR of the character string st0 to be calibrated, by one character position toward the tail of the character string to be calibrated, obtaining the candidate character string to be calibrated 91AB[]MRHVV9DR. The computer device deletes the B in character L(i), assigns the 1 in character M(i) to character L(i), and assigns the 3 in character M(i+1) to character L(i+1), so that the updated character string to be calibrated st0(a1) is 91A13MRHVV9DR. Meanwhile, the computer device adds a null character to the tail of the reference character string so that the number of characters in the reference character string matches the number of characters in the updated character string to be calibrated.
Further, as shown in fig. 9, fig. 9 is a schematic structural diagram after a character string is updated. Once the updated character string to be calibrated is determined, the computer device determines, according to the number corresponding to character L(i), i.e. i = 4, that the current updated calibration character L(i) in the updated character string st0(a1) to be calibrated is 1.
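The update in step 802 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `update_after_first_split`, the list-of-characters representation, and the use of `None` for the null character are assumptions made for the example. The sample strings follow the st0 and st1 values implied by the description above.

```python
from typing import List, Optional, Tuple

Chars = List[Optional[str]]  # None stands in for the null character "[]" (assumption)


def update_after_first_split(
    calibrated: Chars, reference: Chars, i: int
) -> Tuple[Chars, Chars]:
    """Replace L(i) with M(i) and M(i+1); i is 1-based, as in the text."""
    k = i - 1  # 0-based index of L(i)
    # Shift L(i+1)..L(n) one position toward the tail, leaving an empty slot at i+1.
    candidate = calibrated[: k + 1] + [None] + calibrated[k + 1 :]
    # Assign M(i) to L(i) and M(i+1) to L(i+1).
    candidate[k] = reference[k]
    candidate[k + 1] = reference[k + 1]
    # Pad the reference string so both strings have the same number of characters.
    return candidate, reference + [None]


updated, padded = update_after_first_split(
    list("91ABMRHVV9DR"), list("91A13MRHW9DR"), 4
)
print("".join(updated))  # 91A13MRHVV9DR
```

Working on lists rather than Python strings keeps the null character as an explicit element, which mirrors the fixed character numbering used in the figures.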
Step 804, using the reference character two positions after the current reference character in the reference character string as the new current reference character, using the updated calibration character two positions after the current updated calibration character in the updated character string to be calibrated as the new current character to be calibrated, and using the detection sub-result of the current reference character as the new initial result.
Specifically, referring to fig. 9, the computer device may take reference character No. 6, M, which is two positions after reference character No. i = 4, as the new current reference character, take updated calibration character No. 6, M, which is two positions after the current updated calibration character No. i = 4, as the new current character to be calibrated, and take the detection sub-result corresponding to character M(i) as the new initial result. It is easy to understand that, because characters M(i) and M(i+1) have the same detection sub-result, the detection sub-result of character M(i+1) may also be used as the new initial result.
Step 806, returning to the step of performing similarity detection on the current reference character and the current character to be calibrated, and continuing execution.
Specifically, the computer device continues to perform similarity detection on the current reference character and the current character to be calibrated, that is, returns to the specific implementation process in step 204, and obtains the first sub-result of the reference character No. 6, where the first sub-result is 0 since the current reference character and the current character to be calibrated are both M.
In this embodiment, after the similarity detection is performed on the current reference character and the current character to be calibrated by using the first character splitting list, the character string to be calibrated needs to be updated to ensure that the updated character in the character string to be calibrated and the character in the reference character string are in a corresponding state.
In one embodiment, as shown in fig. 10, determining a detection sub-result of the current reference character according to the first sub-result further comprises the following steps:
Step 1002, if the first sub-result indicates that the current reference character is not similar to the current character to be calibrated, obtaining a second word splitting list corresponding to the current reference character.
The second word splitting list is any word splitting list in the word splitting library.
Step 1004, performing similarity detection on the current reference character and the current character to be calibrated according to the second word splitting list to obtain a third sub-result of the current reference character.
In one embodiment, performing similarity detection on the current reference character and the current character to be calibrated according to the second word splitting list to obtain a third sub-result of the current reference character includes: determining that the current reference character is character M(i) and the current character to be calibrated is character L(i); determining a third split character and a fourth split character in the second word splitting list; judging whether the third split character is the same as character L(i) and whether the fourth split character is the same as character L(i+1); if both are the same, obtaining a fourth preset score between character M(i) and the second word splitting list; and obtaining the third sub-result of character M(i) according to the fourth preset score and the preset similarity threshold.
Wherein the reference character string includes character No. i, M(i), and the character string to be calibrated includes character No. i, L(i); i is a positive integer, i is less than or equal to the number of candidate characters of the reference character string, and i is less than or equal to the number of candidate characters of the character string to be calibrated. The reference character string and the character string to be calibrated have the same number of candidate characters.
Specifically, referring to fig. 9, the computer device may take reference character No. 9, W, in the reference character string st1 as character M(i), and take updated calibration character No. 9, V, in the updated character string st0(a1) to be calibrated as character L(i), with i = 9; it determines that one character V in the second word splitting list corresponding to the current reference character is the third split character and the other character V is the fourth split character. The computer device determines that the third split character is the same as character L(i) and the fourth split character is the same as character L(i+1); at this point, a fourth preset score between character M(i) and the second word splitting list can be obtained. The fourth preset score is preset by the system and may be, for example, 99. The computer device determines the difference between the fourth preset score and the preset similarity threshold (formula image 019 in the original publication) to obtain the third sub-result of character M(i).
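The check against the second word splitting list can be sketched as below. This is an illustrative sketch under stated assumptions: the table `SECOND_SPLIT_LISTS`, the function name `third_sub_result`, the threshold value 100, and the use of an absolute difference are all hypothetical, since the exact formula appears only as an image in the original publication. Only the example score 99 and the W to V, V split come from the text.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical word splitting library: reference character -> its split characters.
SECOND_SPLIT_LISTS: Dict[str, Tuple[str, ...]] = {"W": ("V", "V")}


def third_sub_result(
    reference: Sequence[str],
    calibrated: Sequence[str],
    i: int,
    split_lists: Dict[str, Tuple[str, ...]] = SECOND_SPLIT_LISTS,
    split_score: float = 99.0,   # "fourth preset score"; 99 is the example in the text
    threshold: float = 100.0,    # similarity threshold; value assumed for illustration
) -> Optional[float]:
    """Return the third sub-result for M(i), or None if the split does not match."""
    k = i - 1  # the text numbers characters from 1
    parts = split_lists.get(reference[k])
    if parts is None or k + len(parts) > len(calibrated):
        return None
    # Compare each split character with L(i), L(i+1), ... in order.
    if tuple(calibrated[k : k + len(parts)]) != parts:
        return None  # the text stops detection for this reference string
    # Assumed form of the "difference between the score and the threshold".
    return abs(split_score - threshold)


print(third_sub_result("91A13MRHW9DR", "91A13MRHVV9DR", 9))  # 1.0
```

Returning `None` on a mismatch lets the caller decide to stop and move on to another reference character string, as the embodiment describes.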
In one embodiment, when the second word splitting list includes at least two split characters, the computer device needs to determine a plurality of split characters in the second word splitting list, for example 3, and determine whether each split character is the same as character L(i), character L(i+1), and character L(i+2), respectively.
In one embodiment, the method further comprises: when the third split character differs from character L(i), or the fourth split character differs from character L(i+1), stopping the similarity detection between the reference character string and the character string to be calibrated. This indicates that the initially selected reference character string st1 differs from the updated character string st0(a1) to be calibrated, so a reference character string st2 from the head-up of another enterprise needs to be selected from the invoice head-up library to perform similarity detection with the character string st0 to be calibrated again.
In one embodiment, when the computer device determines that the first sub-result represents that the current reference character is not similar to the current character to be calibrated, the computer device may obtain a first word splitting list corresponding to the current character to be calibrated and a second word splitting list corresponding to the current reference character in parallel, so that the efficiency of similarity detection may be improved, and the time for data calculation may be reduced.
Step 1006, superimposing the initial result and the third sub-result to obtain a detection sub-result of the current reference character.
Following the implementation in step 204 of superimposing the initial result and the first sub-result, this embodiment determines that the initial result is the detection sub-result of character No. i-1, M(i-1) (formula image 020 in the original publication), and then obtains the detection sub-result of character M(i) (formula image 021 in the original publication).
In this embodiment, when the current reference character is not similar to the current character to be calibrated, similarity detection is performed on them using the second word splitting list corresponding to the current reference character. Characters that match a word splitting structure can thereby be detected in a targeted manner, so the current character to be calibrated can still be identified accurately when characters in the invoice data are deformed, which supports the subsequent calibration of the character string to be calibrated.
In an embodiment, as shown in fig. 11, if the first sub-result represents that the current reference character is not similar to the current character to be calibrated, determining a new current reference character, a new current character to be calibrated, and a new initial result, and returning to the step of performing similarity detection on the current reference character and the current character to be calibrated to continue the execution, the method includes the following steps:
Step 1102, updating the character string to be calibrated to obtain an updated character string to be calibrated, and determining a current updated calibration character in the updated character string to be calibrated.
In one embodiment, updating the character string to be calibrated to obtain an updated character string to be calibrated includes: moving each character from character No. i+2, L(i+2), to character No. n, L(n), in the character string to be calibrated by one character position toward the head of the character string to be calibrated, and adding a null character at the tail of the character string to be calibrated, to obtain a candidate character string to be calibrated; assigning character M(i) to character L(i) to obtain a new character L(i); and combining the new character L(i) and the candidate character string to be calibrated to obtain the updated character string to be calibrated.
When the character string to be calibrated is st0(a1), the updated character string to be calibrated is st0(b); when the character string to be calibrated is st0, the updated character string to be calibrated is st0(a2). The following takes the process of obtaining st0(b) as an example.
Specifically, referring to FIG. 9, when character L(i) corresponds to updated calibration character No. 9, V, the computer device deletes the V in character L(i) and the V in character L(i+1), moves each character from character No. i+2, L(i+2), to character No. n, L(n), that is, the substring 9DR of the character string st0(a1) to be calibrated, by one character position toward the head of the character string to be calibrated, and adds a null character to the tail of the character string to be calibrated, obtaining the candidate character string to be calibrated 91A13MRH[]9DR[]. The computer device assigns the W in character M(i) to character L(i), so that the updated character string to be calibrated st0(b) is 91A13MRHW9DR[].
Further, as shown in fig. 12, fig. 12 is a schematic structural diagram after a character string is updated. The computer device determines, according to the number corresponding to character L(i) in the character string st0(a1) to be calibrated, i.e. i = 9, that the current updated calibration character L(i) in the updated character string st0(b) to be calibrated is W.
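The update in step 1102 can be sketched in the same style. Again this is only an illustration: the function name `update_after_second_split`, the list representation, and `None` as the null character are assumptions, while the sample strings are the st0(a1) and padded st1 values from the description.

```python
from typing import List, Optional

Chars = List[Optional[str]]  # None stands in for the null character "[]" (assumption)


def update_after_second_split(calibrated: Chars, reference: Chars, i: int) -> Chars:
    """Merge L(i) and L(i+1) into M(i); i is 1-based, as in the text."""
    k = i - 1  # 0-based index of L(i)
    # Drop L(i) and L(i+1), put M(i) in position i, and let L(i+2)..L(n)
    # follow it, which shifts them one position toward the head.
    updated = calibrated[:k] + [reference[k]] + calibrated[k + 2 :]
    # Add a null character at the tail so the character count is unchanged.
    updated.append(None)
    return updated


st0_a1 = list("91A13MRHVV9DR")
st1_padded = list("91A13MRHW9DR") + [None]
st0_b = update_after_second_split(st0_a1, st1_padded, 9)
print("".join(c if c is not None else "[]" for c in st0_b))  # 91A13MRHW9DR[]
```

After this update the string to be calibrated and the padded reference string line up position by position, so detection can continue with character No. 10.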
Step 1104, taking the next reference character adjacent to the current reference character in the reference character string as the new current reference character, taking the next updated calibration character adjacent to the current updated calibration character in the updated character string to be calibrated as the new current character to be calibrated, and taking the detection sub-result of the current reference character as the new initial result.
Specifically, referring to fig. 12, the computer device may take reference character No. 10, which immediately follows reference character No. i = 9, as the new current reference character, take updated calibration character No. 10, which immediately follows updated calibration character No. i = 9, as the new current character to be calibrated, and take the detection sub-result corresponding to character M(i) as the new initial result.
Step 1106, returning to the step of detecting the similarity between the current reference character and the current character to be calibrated, and continuing to execute the steps.
In this embodiment, after the similarity detection is performed on the current reference character and the current character to be calibrated by using the second character splitting list, the character string to be calibrated needs to be updated to ensure that the updated character in the character string to be calibrated and the character in the reference character string are in a corresponding state.
It should be understood that, although the steps in the flowcharts related to the embodiments are shown in sequence as indicated by the arrows, the steps are not necessarily executed in that sequence. Unless explicitly stated herein, the steps are not limited to the exact order illustrated and may be performed in other orders. Moreover, at least a part of the steps in the flowcharts related to the above embodiments may include multiple steps or multiple stages, which are not necessarily performed at the same time but may be performed at different times; the order of performing these steps or stages is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least a part of the steps or stages in other steps.
Based on the same inventive concept, the embodiment of the application also provides an invoice data calibration device for realizing the invoice data calibration method. The implementation scheme for solving the problem provided by the device is similar to the implementation scheme described in the above method, so the specific limitations in one or more embodiments of the invoice data calibration device provided below can be referred to the limitations on the invoice data calibration method in the above description, and details are not repeated here.
In one embodiment, as shown in fig. 13, there is provided an invoice data calibration apparatus 1300, comprising: a data acquisition module 1302, a similarity detection module 1304, and a string determination module 1306, wherein:
the data acquisition module 1302 is configured to acquire invoice data to be calibrated and invoice head-up library data, and determine a character string to be calibrated in the invoice data to be calibrated; the invoice head-up library data includes a plurality of reference strings.
The similarity detection module 1304 is configured to perform similarity detection on each of the plurality of reference character strings between the reference character string and the character string to be calibrated, so as to obtain a detection result of the corresponding reference character string.
A character string determining module 1306, configured to screen out a target reference character string from the multiple reference character strings according to a detection result corresponding to each reference character string and a preset detection threshold; and calibrating the character string to be calibrated through the target reference character string to obtain the target character string.
In one embodiment, the similarity detection module 1304 is configured to obtain an initial result, and determine a current reference character in the reference character string and a current character to be calibrated in the character string to be calibrated; carrying out similarity detection on the current reference character and the current character to be calibrated to obtain a first sub-result of the current reference character; determining a detection sub-result of the current reference character according to the first sub-result; entering the next round of character similarity detection, determining a new current reference character, a new current character to be calibrated and a new initial result, returning to the step of performing similarity detection on the current reference character and the current character to be calibrated, and continuing to execute the step until a detection sub-result of the last reference character in the reference character string is obtained; and taking the detection sub-result of the last reference character as the detection result corresponding to the reference character string.
In one embodiment, the similarity detection module 1304 further includes a first sub-result module 1304a, configured to determine whether the current reference character is the same as the current character to be calibrated; if the reference characters are the same as the characters to be calibrated, determining the similarity score between the current reference character and the current character to be calibrated as a preset first score; and obtaining a first sub-result corresponding to the current reference character according to the difference between the preset first score and the preset similarity threshold.
In an embodiment, the first sub-result module 1304a is configured to, if the characters are different from each other, obtain a similar character list corresponding to the current character to be calibrated; the similar character list comprises at least one similar character and a similarity score corresponding to each similar character; finding out a target similar character which is the same as the current reference character from at least one similar character; and obtaining a first sub-result corresponding to the current reference character according to the difference between the similarity score corresponding to the target similar character and the similarity threshold.
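The behavior of the first sub-result module can be sketched as follows. This is a hedged sketch: the table `SIMILAR`, its entries and scores, the function name `first_sub_result`, the default values 100 for the first preset score and the threshold, and the absolute difference are all assumptions chosen for illustration. They are consistent with the earlier statement that identical characters give a first sub-result of 0, but the source does not state them.

```python
from typing import Dict, Optional

# Hypothetical similar-character table:
# character to be calibrated -> {similar character: similarity score}.
SIMILAR: Dict[str, Dict[str, float]] = {"0": {"O": 95.0, "D": 90.0}}


def first_sub_result(
    ref_char: str,
    cal_char: str,
    similar: Dict[str, Dict[str, float]] = SIMILAR,
    first_score: float = 100.0,  # "preset first score"; value assumed
    threshold: float = 100.0,    # similarity threshold; value assumed
) -> Optional[float]:
    """Return the first sub-result, or None if the characters are not similar."""
    if ref_char == cal_char:
        # Identical characters use the preset first score.
        return abs(first_score - threshold)
    # Otherwise look for the reference character among the similar characters
    # listed for the current character to be calibrated.
    score = similar.get(cal_char, {}).get(ref_char)
    if score is None:
        return None  # caller falls back to the word splitting lists
    return abs(score - threshold)


print(first_sub_result("M", "M"))  # 0.0
print(first_sub_result("O", "0"))  # 5.0
```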
In one embodiment, the similarity detecting module 1304 further includes a second sub-result module 1304b, configured to obtain a first word splitting list corresponding to the current character to be calibrated, if the first sub-result indicates that the current reference character is not similar to the current character to be calibrated; according to the first character splitting list, carrying out similarity detection on the current reference character and the current character to be calibrated to obtain a second sub-result of the current reference character; and superposing the initial result and the second sub-result to obtain a detection sub-result of the current reference character.
In an embodiment, the second sub-result module 1304b is further configured to: determine that the current reference character is character M(i) and the current character to be calibrated is character L(i); determine a first split character and a second split character in the first word splitting list; judge whether the first split character is the same as character M(i) and whether the second split character is the same as character M(i+1); if both are the same, obtain a second preset score between character M(i) and the first split character and a third preset score between character M(i+1) and the second split character; and obtain the second sub-result of character M(i) according to the second preset score, the third preset score, and the preset similarity threshold.
In one embodiment, the similarity detection module 1304 further includes an updating module 1304c, configured to: move each character from character No. i+1, L(i+1), to character No. n, L(n), in the character string to be calibrated by one character position toward the tail of the character string to be calibrated, to obtain a candidate character string to be calibrated; add a null character to the tail of the reference character string; assign character M(i) to character L(i) to obtain a new character L(i), and assign character M(i+1) to character L(i+1) to obtain a new character L(i+1); and combine the new character L(i), the new character L(i+1), and the candidate character string to be calibrated to obtain the updated character string to be calibrated.
In one embodiment, the similarity detection module 1304 includes a third sub-result module 1304d, configured to obtain a second word splitting list corresponding to the current reference character if the first sub-result indicates that the current reference character is not similar to the current character to be calibrated; according to the second character splitting list, carrying out similarity detection on the current reference character and the current character to be calibrated to obtain a third sub-result of the current reference character; and superposing the initial result and the third sub-result to obtain a detection sub-result of the current reference character.
In an embodiment, the third sub-result module 1304d is further configured to: determine that the current reference character is character M(i) and the current character to be calibrated is character L(i); determine a third split character and a fourth split character in the second word splitting list; judge whether the third split character is the same as character L(i) and whether the fourth split character is the same as character L(i+1); if both are the same, obtain a fourth preset score between character M(i) and the second word splitting list; and obtain the third sub-result of character M(i) according to the fourth preset score and the preset similarity threshold.
In one embodiment, the updating module 1304c is further configured to: move each character from character No. i+2, L(i+2), to character No. n, L(n), in the character string to be calibrated by one character position toward the head of the character string to be calibrated, and add a null character at the tail of the character string to be calibrated, to obtain a candidate character string to be calibrated; assign character M(i) to character L(i) to obtain a new character L(i); and combine the new character L(i) and the candidate character string to be calibrated to obtain the updated character string to be calibrated.
In an embodiment, the character string determining module 1306 is further configured to determine, according to the detection result and the target character number respectively corresponding to each reference character string, a detection numerical value respectively corresponding to each reference character string; screening the reference character strings corresponding to the detection numerical values smaller than the preset detection threshold value into candidate reference character strings; and taking the candidate reference character string with the minimum detection value as a target reference character string.
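The screening performed by the character string determining module can be sketched as below. The source says only that the detection value is determined "according to" the detection result and the target character number, so dividing one by the other is an assumption made for this example, as are the function name `select_target` and the sample data.

```python
from typing import Dict, Optional, Tuple


def select_target(
    results: Dict[str, Tuple[float, int]], detection_threshold: float
) -> Optional[str]:
    """Pick the target reference string.

    `results` maps each reference string to (detection result, target character
    number). The detection value is assumed to be their quotient.
    """
    values = {ref: result / count for ref, (result, count) in results.items()}
    # Keep only reference strings whose detection value is below the threshold.
    candidates = {ref: v for ref, v in values.items() if v < detection_threshold}
    if not candidates:
        return None  # no reference string is close enough
    # The candidate with the smallest detection value becomes the target.
    return min(candidates, key=candidates.get)


sample = {"REF_A": (6.0, 12), "REF_B": (24.0, 12), "REF_C": (3.0, 12)}
print(select_target(sample, detection_threshold=1.0))  # REF_C
```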
The various modules in the invoice data calibration apparatus described above may be implemented in whole or in part by software, hardware, and combinations thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 14. The computer device includes a processor, a memory, an Input/Output (I/O) interface, a communication interface, a display unit, and an Input apparatus. The processor, the memory and the input/output interface are connected by a system bus, and the communication interface, the display unit and the input device are connected by the input/output interface to the system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operating system and the computer program to run on the non-volatile storage medium. The input/output interface of the computer device is used for exchanging information between the processor and an external device. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program when executed by a processor implements a method of invoice data calibration. The display unit of the computer equipment is used for forming a visual and visible picture, and can be a display screen, a projection device or a virtual reality imaging device, the display screen can be a liquid crystal display screen or an electronic ink display screen, the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the configuration shown in fig. 14 is a block diagram of only a portion of the configuration associated with the present application, and is not intended to limit the computing device to which the present application may be applied, and that a particular computing device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of the computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps of the above-described method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware related to instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium, and when executed, the computer program can include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical Memory, high-density embedded nonvolatile Memory, resistive Random Access Memory (ReRAM), magnetic Random Access Memory (MRAM), ferroelectric Random Access Memory (FRAM), phase Change Memory (PCM), graphene Memory, and the like. Volatile Memory can include Random Access Memory (RAM), external cache Memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others. The databases referred to in various embodiments provided herein may include at least one of relational and non-relational databases. The non-relational database may include, but is not limited to, a block chain based distributed database, and the like. The processors referred to in the embodiments provided herein may be general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic, data processing logic, etc., and are not limited thereto.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above examples only express several embodiments of the present application, and the description thereof is more specific and detailed, but not to be construed as limiting the scope of the present application. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these are all within the scope of protection of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (17)

1. A method of invoice data calibration, the method comprising:
acquiring invoice data to be calibrated and invoice head-up library data, and determining character strings to be calibrated in the invoice data to be calibrated; the invoice head-up library data comprises a plurality of reference character strings;
for each reference character string in the plurality of reference character strings, determining a current reference character in the reference character string and a current character to be calibrated in the character string to be calibrated, and acquiring an initial result; the initial result is the detection sub-result of the previous reference character adjacent to the current reference character in the reference character string;
carrying out similarity detection on the current reference character and the current character to be calibrated to obtain a first sub-result of the current reference character;
determining a detection sub-result of the current reference character according to the first sub-result;
entering the next round of character similarity detection, determining a new current reference character, a new current character to be calibrated and a new initial result, returning to the step of performing similarity detection on the current reference character and the current character to be calibrated, and continuing to execute the steps until a detection sub-result of the last reference character in the reference character string is obtained;
taking the detection sub-result of the last reference character as the detection result corresponding to the reference character string;
screening out a target reference character string from the plurality of reference character strings according to the detection result corresponding to each reference character string and a preset detection threshold;
and calibrating the character string to be calibrated through the target reference character string to obtain a target character string.
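The loop recited in claim 1 can be illustrated with a minimal sketch. It assumes that sub-results are numbers, that superposition is addition, and that a smaller accumulated result means a closer match; the names `detect_string`, `calibrate` and `mismatch` are hypothetical and do not come from the patent, and the toy `mismatch` function merely stands in for the similarity detection of the later claims.

```python
from typing import Callable, List, Optional, Tuple


def detect_string(reference: str, candidate: str,
                  sub_result: Callable[[str, str], float]) -> float:
    """Return the detection result for one reference character string."""
    result = 0.0  # initial result before the first reference character
    # zip stops at the shorter string; unequal lengths are outside this sketch
    for ref_char, cand_char in zip(reference, candidate):
        # detection sub-result = previous sub-result + current first sub-result
        result += sub_result(ref_char, cand_char)
    return result  # sub-result of the last (final) reference character


def calibrate(candidate: str, references: List[str],
              sub_result: Callable[[str, str], float],
              threshold: float) -> Optional[str]:
    """Return the best reference string below the threshold, or None."""
    scored: List[Tuple[float, str]] = [
        (detect_string(ref, candidate, sub_result), ref) for ref in references
    ]
    passing = [pair for pair in scored if pair[0] < threshold]
    return min(passing)[1] if passing else None


def mismatch(ref_char: str, cand_char: str) -> float:
    """Toy sub-result (assumption): 0 for identical characters, 1 otherwise."""
    return 0.0 if ref_char == cand_char else 1.0


target = calibrate("Acne Corp", ["Acme Corp", "Apex Corp"], mismatch, threshold=2.0)
print(target)
```

In this sketch the character string to be calibrated is simply replaced by the selected reference character string; returning `None` when no reference passes the threshold is a choice of the sketch, since the claim does not say what happens in that case.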
2. The method of claim 1, wherein the invoice head-up library data comprises a relational database management system.
3. The method of claim 1, wherein determining a detection sub-result for the current reference character according to the first sub-result comprises:
if the first sub-result represents that the current reference character is similar to the current character to be calibrated, superposing the initial result and the first sub-result to obtain a detection sub-result of the current reference character;
the determining a new current reference character, a new current character to be calibrated and a new initial result, returning to the step of performing similarity detection on the current reference character and the current character to be calibrated, and continuing to execute comprises:
determining a next reference character in the reference character string and a next character to be calibrated in the character string to be calibrated, taking the next reference character as a new current reference character, taking the next character to be calibrated as a new current character to be calibrated, and taking a detection sub-result of the current reference character as a new initial result;
and returning to the step of detecting the similarity between the current reference character and the current character to be calibrated, and continuing to execute the step.
4. The method according to claim 1, wherein the detecting similarity between the current reference character and the current character to be calibrated to obtain a first sub-result of the current reference character comprises:
judging whether the current reference character is the same as the current character to be calibrated or not;
if the current reference character is the same as the current character to be calibrated, determining the similarity score between the current reference character and the current character to be calibrated as a preset first score;
and obtaining a first sub-result corresponding to the current reference character according to the difference between the preset first score and a preset similarity threshold.
5. The method of claim 4, further comprising:
if the current reference character is not the same as the current character to be calibrated, acquiring a similar character list corresponding to the current character to be calibrated; the similar character list comprises at least one similar character and a similarity score corresponding to each similar character;
finding out a target similar character which is the same as the current reference character from at least one similar character;
and obtaining a first sub-result corresponding to the current reference character according to the difference between the similarity score corresponding to the target similar character and the similarity threshold.
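The scoring of claims 4 and 5 can be sketched as follows. The claims only say that the first sub-result is obtained "according to the difference" between a score and the similarity threshold; the squared difference used here, the constant values, the contents of the `SIMILAR` table, and the use of `None` to signal "not similar" are all assumptions made for illustration.

```python
from typing import Dict, Optional

SAME_SCORE = 1.0            # hypothetical "preset first score"
SIMILARITY_THRESHOLD = 1.0  # hypothetical "preset similarity threshold"

# Hypothetical similar-character lists, keyed by the character to be calibrated.
SIMILAR: Dict[str, Dict[str, float]] = {
    "0": {"O": 0.9, "D": 0.6},
    "1": {"l": 0.9, "I": 0.8},
}


def first_sub_result(ref_char: str, cand_char: str,
                     similar: Dict[str, Dict[str, float]] = SIMILAR) -> Optional[float]:
    """Return the first sub-result, or None when no similar character matches."""
    if ref_char == cand_char:
        score = SAME_SCORE  # identical characters get the preset first score
    else:
        # look for a target similar character equal to the reference character
        found = similar.get(cand_char, {}).get(ref_char)
        if found is None:
            return None  # not similar; claims 6 and 10 cover this branch
        score = found
    # assumption: the "difference" is taken as a squared difference
    return (SIMILARITY_THRESHOLD - score) ** 2


print(first_sub_result("A", "A"))  # identical characters
print(first_sub_result("O", "0"))  # "0" was recognized where "O" was expected
print(first_sub_result("X", "0"))  # no similar character found
```

A squared difference is convenient because the accumulated result can later be divided by the character count to give a variance-like value, as claim 14 suggests; other difference measures would fit the claim wording equally well.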
6. The method of claim 1, wherein determining a detection sub-result for the current reference character according to the first sub-result comprises:
if the first sub-result represents that the current reference character is not similar to the current character to be calibrated, acquiring a first character splitting list corresponding to the current character to be calibrated;
according to the first character splitting list, carrying out similarity detection on the current reference character and the current character to be calibrated to obtain a second sub-result of the current reference character;
superposing the initial result and the second sub-result to obtain a detection sub-result of the current reference character;
the determining a new current reference character, a new current character to be calibrated and a new initial result, returning to the step of performing similarity detection on the current reference character and the current character to be calibrated, and continuing to execute comprises:
updating the character string to be calibrated to obtain an updated character string to be calibrated, and determining a current updated calibration character in the updated character string to be calibrated;
taking the next two reference characters in the reference character string which are adjacent to the current reference character as new current reference characters, taking the next two updated calibration characters in the updated character string to be calibrated which are adjacent to the current updated calibration character as new current characters to be calibrated, and taking the detection sub-result of the current reference character as a new initial result;
and returning to the step of detecting the similarity between the current reference character and the current character to be calibrated, and continuing to execute the step.
7. The method of claim 6, wherein the reference character string comprises an i-th character M(i), and the character string to be calibrated comprises an i-th character L(i), where i is a positive integer and i is less than or equal to the number of candidate characters of the reference character string, or i is less than or equal to the number of candidate characters of the character string to be calibrated; the performing similarity detection on the current reference character and the current character to be calibrated according to the first character splitting list to obtain a second sub-result of the current reference character comprises:
determining that the current reference character is the character M(i) and the current character to be calibrated is the character L(i);
determining a first split character and a second split character in the first character splitting list;
judging whether the first split character is the same as the character M(i) and whether the second split character is the same as the character M(i+1);
if both are the same, acquiring a second preset score between the character M(i) and the first split character, and a third preset score between the character M(i+1) and the second split character;
obtaining the second sub-result of the character M(i) according to the second preset score, the third preset score and a preset similarity threshold.
8. The method of claim 7, further comprising:
when the first split character is different from the character M(i), or the second split character is different from the character M(i+1), stopping the process of performing similarity detection on the reference character string and the character string to be calibrated.
9. The method according to claim 7, wherein the updating the character string to be calibrated to obtain an updated character string to be calibrated includes:
moving the (i+1)-th character L(i+1) through the n-th character L(n) of the character string to be calibrated by one character position toward the tail of the character string to be calibrated, to obtain a candidate character string to be calibrated;
adding a null character to the tail of the reference character string;
assigning the character M(i) to the character L(i) to obtain a new character L(i), and assigning the character M(i+1) to the character L(i+1) to obtain a new character L(i+1);
combining the new character L(i), the new character L(i+1) and the candidate character string to be calibrated to obtain the updated character string to be calibrated.
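Claims 7 to 9 describe the case where one character to be calibrated stands for two adjacent reference characters. The sketch below illustrates only the split check and the string update, using 0-based list indices; the second and third preset scores are omitted. The `SPLITS` table, the example characters, the empty string used as the null character, and the name `try_expand` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

NULL = ""  # stand-in for the "null character" (assumption)

# Hypothetical first character splitting list: one recognized character
# mapped to the two characters it may have been merged from.
SPLITS: Dict[str, Tuple[str, str]] = {"明": ("日", "月")}


def try_expand(reference: List[str], candidate: List[str], i: int,
               splits: Dict[str, Tuple[str, str]] = SPLITS
               ) -> Optional[Tuple[List[str], List[str]]]:
    """Expand candidate[i] into reference[i] and reference[i + 1], if it splits so."""
    parts = splits.get(candidate[i])
    if parts is None or i + 1 >= len(reference):
        return None  # nothing to compare against
    if parts != (reference[i], reference[i + 1]):
        return None  # claim 8: stop detection for this reference string
    # shift candidate[i + 1:] one position toward the tail
    updated = candidate[:i + 1] + [NULL] + candidate[i + 1:]
    # assign M(i) and M(i+1) to the two freed positions
    updated[i], updated[i + 1] = reference[i], reference[i + 1]
    # claim 9 also appends a null character to the reference string
    return reference + [NULL], updated


result = try_expand(list("日月潭"), list("明潭"), 0)
print(result)
```

Working on lists rather than Python strings keeps the positional assignment of claim 9 explicit, since strings are immutable.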
10. The method of claim 1, wherein determining a detection sub-result for the current reference character according to the first sub-result comprises:
if the first sub-result represents that the current reference character is not similar to the current character to be calibrated, acquiring a second character splitting list corresponding to the current reference character;
according to the second character splitting list, carrying out similarity detection on the current reference character and the current character to be calibrated to obtain a third sub-result of the current reference character;
superposing the initial result and the third sub-result to obtain a detection sub-result of the current reference character;
the determining a new current reference character, a new current character to be calibrated and a new initial result, returning to the step of performing similarity detection on the current reference character and the current character to be calibrated, and continuing to execute comprises:
updating the character string to be calibrated to obtain an updated character string to be calibrated, and determining a current updated calibration character in the updated character string to be calibrated;
taking a next reference character in the reference character string, which is next to the current reference character, as a new current reference character, taking a next updated calibration character in the updated character string to be calibrated, which is next to the current updated calibration character, as a new current character to be calibrated, and taking a detection sub-result of the current reference character as a new initial result;
and returning to the step of detecting the similarity between the current reference character and the current character to be calibrated, and continuing to execute the step.
11. The method of claim 10, wherein the reference character string comprises an i-th character M(i), and the character string to be calibrated comprises an i-th character L(i), where i is a positive integer and i is less than or equal to the number of candidate characters of the reference character string, or i is less than or equal to the number of candidate characters of the character string to be calibrated; the performing similarity detection on the current reference character and the current character to be calibrated according to the second character splitting list to obtain a third sub-result of the current reference character comprises:
determining that the current reference character is the character M(i) and the current character to be calibrated is the character L(i);
determining a third split character and a fourth split character in the second character splitting list;
judging whether the third split character is the same as the character L(i) and whether the fourth split character is the same as the character L(i+1);
if both are the same, acquiring a fourth preset score between the character M(i) and the second character splitting list;
obtaining the third sub-result of the character M(i) according to the fourth preset score and a preset similarity threshold.
12. The method of claim 11, further comprising:
when the third split character is different from the character L(i), or the fourth split character is different from the character L(i+1), stopping the process of performing similarity detection on the reference character string and the character string to be calibrated.
13. The method according to claim 11, wherein the updating the character string to be calibrated to obtain an updated character string to be calibrated includes:
moving the (i+2)-th character L(i+2) through the n-th character L(n) of the character string to be calibrated by one character position toward the head of the character string to be calibrated, and adding a null character at the tail of the character string to be calibrated, to obtain a candidate character string to be calibrated;
assigning the character M(i) to the character L(i) to obtain a new character L(i), and combining the new character L(i) and the candidate character string to be calibrated to obtain the updated character string to be calibrated.
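Claims 11 to 13 cover the opposite case, where one reference character corresponds to two adjacent characters to be calibrated. The following sketch shows the split check and the update only, again with 0-based indices and without the fourth preset score; `SPLITS`, the example characters, the empty-string null character and the name `try_merge` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

NULL = ""  # stand-in for the "null character" (assumption)

# Hypothetical second character splitting list: a reference character
# mapped to the two characters it may have been split into.
SPLITS: Dict[str, Tuple[str, str]] = {"明": ("日", "月")}


def try_merge(reference: List[str], candidate: List[str], i: int,
              splits: Dict[str, Tuple[str, str]] = SPLITS) -> Optional[List[str]]:
    """Merge candidate[i] and candidate[i + 1] into reference[i], if they match its split."""
    parts = splits.get(reference[i])
    if parts is None or i + 1 >= len(candidate):
        return None  # nothing to compare against
    if parts != (candidate[i], candidate[i + 1]):
        return None  # claim 12: stop detection for this reference string
    # candidate[i + 2:] moves one position toward the head, M(i) takes slot i,
    # and a null character is appended so the length is unchanged
    return candidate[:i] + [reference[i]] + candidate[i + 2:] + [NULL]


merged = try_merge(list("明潭"), list("日月潭"), 0)
print(merged)
```

Appending the null character keeps the updated string the same length as before the merge, which is consistent with claim 14 later deleting trailing null characters.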
14. The method according to claim 1, wherein after the detecting the similarity between the reference character string and the character string to be calibrated to obtain the detection result of the corresponding reference character string, the method further comprises:
deleting the same number of null characters from the tail of each reference character string and from the tail of the character string to be calibrated, respectively, to obtain each new reference character string and a new character string to be calibrated;
determining the number of target characters in each new reference character string;
the screening of the target reference character string from the plurality of reference character strings according to the detection result corresponding to each reference character string and a preset detection threshold value comprises:
respectively determining a detection numerical value corresponding to each reference character string according to the detection result corresponding to each reference character string and the number of target characters; the detection value comprises at least one of a variance value and a mean square error value;
screening the reference character strings whose detection values are smaller than a preset detection threshold as candidate reference character strings; the detection threshold comprises at least one of a variance threshold and a mean square error threshold;
and taking the candidate reference character string with the minimum detection numerical value as a target reference character string.
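The screening of claim 14 can be sketched under stated assumptions: the accumulated detection result is taken to be a sum of squared differences, the variance is that sum divided by the number of target characters, and the second detection value is taken to be the square root of the variance. None of these formulas is given in the claims, and the names `detection_value` and `pick_target` are hypothetical.

```python
import math
from typing import Dict, Optional


def detection_value(total: float, n_chars: int, use_root: bool = False) -> float:
    """Normalize an accumulated detection result by the number of target characters."""
    variance = total / n_chars  # assumption: total is a sum of squared differences
    return math.sqrt(variance) if use_root else variance


def pick_target(results: Dict[str, float], threshold: float) -> Optional[str]:
    """Keep references whose value is below the threshold and return the smallest."""
    candidates: Dict[str, float] = {}
    for reference, total in results.items():
        if not reference:
            continue  # skip empty strings to avoid dividing by zero
        value = detection_value(total, len(reference))
        if value < threshold:
            candidates[reference] = value
    return min(candidates, key=candidates.get) if candidates else None


# accumulated detection results per reference string (illustrative numbers)
chosen = pick_target({"Acme Corp": 0.09, "Apex Corp": 1.8}, threshold=0.05)
print(chosen)
```

Dividing by the character count makes results comparable across reference character strings of different lengths, which is presumably why the claim determines the number of target characters after removing the null padding.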
15. An invoice data calibration apparatus, characterised in that the apparatus comprises:
the data acquisition module is used for acquiring invoice data to be calibrated and invoice head-up library data and determining character strings to be calibrated in the invoice data to be calibrated; the invoice head-up library data comprises a plurality of reference character strings;
the similarity detection module is used for determining, for each reference character string in the plurality of reference character strings, a current reference character in the reference character string and a current character to be calibrated in the character string to be calibrated, and acquiring an initial result; the initial result is the detection sub-result of the previous reference character adjacent to the current reference character in the reference character string; carrying out similarity detection on the current reference character and the current character to be calibrated to obtain a first sub-result of the current reference character; determining a detection sub-result of the current reference character according to the first sub-result; entering the next round of character similarity detection, determining a new current reference character, a new current character to be calibrated and a new initial result, returning to the step of performing similarity detection on the current reference character and the current character to be calibrated, and continuing to execute the step until a detection sub-result of the last reference character in the reference character string is obtained; taking the detection sub-result of the last reference character as the detection result corresponding to the reference character string;
the character string determining module is used for screening out a target reference character string from the plurality of reference character strings according to the detection result corresponding to each reference character string and a preset detection threshold value; and calibrating the character string to be calibrated through the target reference character string to obtain a target character string.
16. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 14.
17. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 14.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211087287.7A CN115169335B (en) 2022-09-07 2022-09-07 Invoice data calibration method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211087287.7A CN115169335B (en) 2022-09-07 2022-09-07 Invoice data calibration method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115169335A CN115169335A (en) 2022-10-11
CN115169335B true CN115169335B (en) 2023-01-13

Family

ID=83480533

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211087287.7A Active CN115169335B (en) 2022-09-07 2022-09-07 Invoice data calibration method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115169335B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104636971A (en) * 2013-11-06 2015-05-20 航天信息股份有限公司 Method of detecting one number for multiple names of value added tax invoice and system thereof
CN113807256A (en) * 2021-09-17 2021-12-17 上海亿保健康管理有限公司 Bill data processing method and device, electronic equipment and storage medium

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3255816B2 (en) * 1995-02-15 2002-02-12 松下電器産業株式会社 Character recognition device
JP3126945B2 (en) * 1997-10-30 2001-01-22 株式会社エイ・ティ・アール音声翻訳通信研究所 Character error correction device
CN105046252B (en) * 2014-11-21 2018-09-07 华中科技大学 A kind of RMB prefix code recognition methods
TWI607387B (en) * 2016-11-25 2017-12-01 財團法人工業技術研究院 Character recognition systems and character recognition methods thereof
CN109087439B (en) * 2018-07-03 2021-02-09 百度在线网络技术(北京)有限公司 Bill checking method, terminal device, storage medium and electronic device
CN109740417B (en) * 2018-10-29 2023-05-16 深圳壹账通智能科技有限公司 Invoice type identification method, invoice type identification device, storage medium and computer equipment
CN111209827B (en) * 2019-12-31 2023-07-14 中国南方电网有限责任公司 Method and system for OCR (optical character recognition) bill problem based on feature detection
CN113192252B (en) * 2020-01-14 2024-02-02 深圳怡化电脑股份有限公司 Method, device, equipment and readable medium for detecting note duplicate
CN113554029B (en) * 2021-09-17 2022-03-11 北京奇虎科技有限公司 Bill verification method, device, equipment and storage medium
CN114913538A (en) * 2022-05-19 2022-08-16 山东国子软件股份有限公司 Multi-class invoice identification method and system based on deep learning

Also Published As

Publication number Publication date
CN115169335A (en) 2022-10-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant