CN116228265A - Invoice risk identification method, device and equipment - Google Patents

Info

Publication number
CN116228265A
Authority
CN
China
Prior art keywords
invoice
information
invoice information
risk
registration text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310302024.1A
Other languages
Chinese (zh)
Inventor
郑传勇
崔翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhongnuo Lianjie Digital Technology Co ltd
Original Assignee
Beijing Zhongnuo Lianjie Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhongnuo Lianjie Digital Technology Co ltd
Priority to CN202310302024.1A
Publication of CN116228265A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products
    • G06Q30/0185 Product, service or business identity fraud
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/12 Accounting
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02W CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO WASTEWATER TREATMENT OR WASTE MANAGEMENT
    • Y02W90/00 Enabling technologies or technologies with a potential or indirect contribution to greenhouse gas [GHG] emissions mitigation

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Technology Law (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure relates to an invoice risk identification method, device and equipment. The invoice risk identification method comprises: acquiring enterprise information and invoice information; downloading, based on the enterprise information, asset registration text corresponding to the enterprise information; and identifying, based on the invoice information and the asset registration text, whether an invoice reuse risk exists. The method can automatically identify whether an invoice is at risk of repeated use in financing activities, thereby improving the efficiency and accuracy of invoice risk identification.

Description

Invoice risk identification method, device and equipment
Technical Field
The disclosure relates to the technical field of data processing, and in particular relates to an invoice risk identification method, device and equipment.
Background
In supply chain finance activities, a financing party needs to provide the financing institution with invoices for an asset to prove that the asset actually exists. However, if an invoice has already been used in another financing process, the financing project carries a large risk. Checking whether an invoice is at risk of repeated use is therefore important to the financing institution's risk prevention and control.
At present, to check whether an invoice is at risk of repeated use, a data auditor at the financing institution logs in to the Unified Registration and Publicity System for Movable Property Financing of the Credit Reference Center of the People's Bank of China (hereinafter "Zhongdeng Net"), queries the asset registration texts related to the financing party, and manually compares the invoices provided in the financing process against each asset registration text, one by one, to verify whether any invoice has already been registered (that is, whether the invoice is at risk of repeated use). Checking for repeated use manually is inefficient and inaccurate.
Disclosure of Invention
In view of this, the present disclosure provides an invoice risk recognition method, device and equipment, which can automatically complete invoice risk recognition, and improve the efficiency and accuracy of invoice risk recognition.
According to a first aspect of the present disclosure, there is provided an invoice risk recognition method, including:
acquiring enterprise information and invoice information;
downloading asset registration text corresponding to the enterprise information based on the enterprise information;
and identifying whether invoice reuse risks exist or not based on the invoice information and the asset registration text.
In one possible implementation, when identifying whether there is a risk of invoice reuse based on the invoice information and the asset registration text, the method includes:
performing word segmentation processing on the asset registration text to obtain a word segmentation result of the asset registration text;
calculating the association degree of the invoice information and the registration text based on the invoice information and the word segmentation result;
and identifying whether the invoice reuse risk exists or not based on the association degree between the invoice information and the registration text.
In one possible implementation, the invoice information includes at least two hierarchical invoice information;
when calculating the association degree between the invoice information and the registration text based on the invoice information and the word segmentation result, the method comprises the following steps:
calculating the association degree of each hierarchical invoice information and the registration text based on each hierarchical invoice information and the word segmentation result;
and calculating the association degree of the invoice information and the registration text based on the association degree of each piece of hierarchical invoice information and the registration text.
In one possible implementation, when acquiring invoice information, the method includes:
acquiring an invoice file, and identifying initial invoice information in the invoice file based on an optical character identification algorithm;
verifying whether the initial invoice information is accurate or not based on a preset verification rule;
under the condition of verifying that the initial invoice information is accurate, verifying whether the initial invoice information is true;
and under the condition of verifying that the initial invoice information is true, taking the initial invoice information as the invoice information.
In one possible implementation, an abnormality alarm is raised for the initial invoice information in the event that the initial invoice information is verified to be inaccurate or not authentic.
In one possible implementation, after acquiring the enterprise information and the invoice information, the method further includes: verifying whether the enterprise information is consistent with the invoice information;
and under the condition that the enterprise information is verified to be consistent with the invoice information, downloading an asset registration text corresponding to the enterprise information based on the enterprise information.
In one possible implementation, when an invoice is identified as having a reuse risk, a reuse risk alarm is raised for the invoice.
In one possible implementation, the method further includes: generating an instance of invoice risk identification, and adding the instance to a risk control task center, so that risk identification is performed on the invoice periodically according to the scheduling of the risk control task center.
According to a second aspect of the present disclosure, there is provided an invoice risk recognition device, comprising:
the first data acquisition module is used for acquiring enterprise information and invoice information;
the second data acquisition module is used for downloading asset registration text corresponding to the enterprise information based on the enterprise information;
and the risk identification module is used for identifying whether the invoice reuse risk exists or not based on the invoice information and the asset registration text.
According to a third aspect of the present disclosure, there is provided an invoice risk recognition apparatus, comprising: a processor; a memory for storing processor-executable instructions; wherein the processor is configured to perform the method of the first aspect of the present disclosure.
The invoice risk identification method comprises: acquiring enterprise information and invoice information; downloading, based on the enterprise information, asset registration text corresponding to the enterprise information; and identifying, based on the invoice information and the asset registration text, whether an invoice reuse risk exists. The method can automatically identify whether an invoice is at risk of repeated use in financing activities, thereby improving the efficiency and accuracy of invoice risk identification.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features and aspects of the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 illustrates a flow chart of an invoice risk recognition method according to an embodiment of the present disclosure;
FIG. 2 illustrates an example flow diagram of an invoice risk recognition method, according to an embodiment of this disclosure;
FIG. 3 shows a schematic block diagram of an invoice risk recognition device, according to an embodiment of the present disclosure;
FIG. 4 shows a schematic block diagram of invoice risk recognition equipment, according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the disclosure will be described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
In addition, numerous specific details are set forth in the following detailed description in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art have not been described in detail in order not to obscure the present disclosure.
< method example >
FIG. 1 illustrates a flow chart of an invoice risk recognition method according to an embodiment of the present disclosure. As shown in fig. 1, the method includes steps S1100-S1300.
S1100, acquiring enterprise information and invoice information. The enterprise information may include at least one of an enterprise name and a taxpayer identification number. The invoice information may include at least one of an invoice number, an invoice code, an invoice type, encrypted data, purchaser information, seller information, a goods name, a quantity, a unit price, an amount, a tax rate, a tax amount, and an invoicing date.
In one possible implementation, when acquiring invoice information, the method may include the following steps:
First, an invoice file is acquired, and initial invoice information in the invoice file is identified based on an optical character recognition (OCR) algorithm. The invoice file is an electronic file of an invoice, and its file type can be PDF, OFD, PNG, JPG, or another file type, which is not particularly limited herein.
In one possible implementation, after the invoice file is acquired, its file type is identified first. When the file type is PDF or OFD, the invoice file is converted to obtain a PNG invoice file, and the initial invoice information is then identified from the PNG invoice file. When the file type is PNG or JPG, the initial invoice information can be identified directly from the PNG or JPG invoice file.
Second, whether the initial invoice information is accurate is verified based on a preset verification rule.
In one possible implementation, the preset verification rule may include at least one of a first verification rule, a second verification rule, and a third verification rule.
In one possible implementation, the first verification rule may be at least one of the following: the purchaser name length is 1-100 characters; the purchaser taxpayer identification number length is 15-20 characters, and only uppercase letters and digits are supported; the purchaser address length is 0-100 characters; the telephone number length is 0-100 characters; the length of the purchaser's account-opening bank is 0-100 characters; the goods name length is 1-100 characters; the specification and model length is 0-40 characters; and the unit of measurement length is 0-22 characters. The first verification rule verifies whether the identified initial invoice information is complete, well-formed, and valid.
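The length constraints of the first verification rule can be sketched as a small table-driven check. This is only an illustrative sketch: the dictionary keys are hypothetical field names that do not come from the disclosure, and the sketch checks every listed constraint even though the text allows any subset of them.

```python
import re

# (min_length, max_length) per field, taken from the first verification rule.
# The keys are hypothetical field names chosen for this sketch.
LENGTH_LIMITS = {
    "purchaser_name": (1, 100),
    "purchaser_taxpayer_id": (15, 20),
    "purchaser_address": (0, 100),
    "purchaser_phone": (0, 100),
    "purchaser_bank": (0, 100),
    "goods_name": (1, 100),
    "specification": (0, 40),
    "unit": (0, 22),
}
# Only uppercase letters and digits are allowed in the taxpayer identification number.
TAXPAYER_ID_PATTERN = re.compile(r"[A-Z0-9]+")


def check_first_rule(invoice: dict) -> list:
    """Return the names of fields that violate a length or character constraint."""
    failed = []
    for field, (low, high) in LENGTH_LIMITS.items():
        value = invoice.get(field, "")  # a missing field is treated as empty
        if not low <= len(value) <= high:
            failed.append(field)
    taxpayer_id = invoice.get("purchaser_taxpayer_id", "")
    if taxpayer_id and not TAXPAYER_ID_PATTERN.fullmatch(taxpayer_id):
        if "purchaser_taxpayer_id" not in failed:
            failed.append("purchaser_taxpayer_id")
    return failed
```

An empty result means the initial invoice information passed this rule; a non-empty result lists the fields that would trigger the inaccuracy path.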
In one possible implementation, the second verification rule may be: verifying whether the invoice number, the invoice code and the invoice type satisfy a set constraint relation, and determining that the initial invoice information is accurate if they do. The set constraint relation may be: when the invoice code is empty, the invoice number length is 20 characters; when the invoice code is not empty, the invoice code length is 8-12 characters and the invoice number length is 8 characters.
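A minimal sketch of this constraint relation follows. It assumes that, in the non-empty case, the 8-character length applies to the invoice number and the 8-12 character length applies to the invoice code. The text gives no concrete constraint for the invoice type, so the sketch leaves it out; the function name is hypothetical.

```python
def check_second_rule(invoice_code: str, invoice_number: str) -> bool:
    """Check the code/number constraint relation of the second verification rule."""
    if not invoice_code:
        # No invoice code: the invoice number must be 20 characters long.
        return len(invoice_number) == 20
    # Invoice code present: code is 8-12 characters, number is 8 characters (assumed reading).
    return 8 <= len(invoice_code) <= 12 and len(invoice_number) == 8
```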
In one possible implementation, the third verification rule may be: parsing the encrypted data, verifying whether the parsed data is consistent with the initial invoice information, and determining that the initial invoice information is accurate if it is consistent. For example, parsing the encrypted data yields parsed data such as the invoice number, invoice code, invoicing date, amount and tax. The parsed invoice number, invoice code, invoicing date, amount and tax are each compared with the corresponding item in the initial invoice information, and the initial invoice information is determined to be accurate when all of them are consistent.
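The field-by-field comparison can be sketched as below. How the encrypted data is parsed is not specified in the text, so the sketch starts from an already parsed dictionary; the field names are hypothetical.

```python
# Fields compared by the third verification rule (hypothetical key names).
COMPARED_FIELDS = ("invoice_number", "invoice_code", "invoicing_date", "amount", "tax")


def mismatched_fields(parsed: dict, initial: dict) -> list:
    """Return the compared fields whose parsed value differs from the OCR result."""
    return [name for name in COMPARED_FIELDS if parsed.get(name) != initial.get(name)]


def check_third_rule(parsed: dict, initial: dict) -> bool:
    """The initial invoice information is accurate only if every compared field matches."""
    return not mismatched_fields(parsed, initial)
```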
In an implementation in which the preset verification rule includes the first verification rule, the second verification rule and the third verification rule, the accuracy of the initial invoice information can be verified by applying the first, second and third verification rules in sequence, and the initial invoice information is determined to be accurate only when it satisfies all three rules. When the initial invoice information fails any one of the verification rules, it is determined to be inaccurate.
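Applying the rules in sequence and stopping at the first failure can be sketched with a generic rule list. The `Rule` type and function name are assumptions of this sketch; each rule is any callable that takes the initial invoice information and returns a boolean.

```python
from typing import Callable, Iterable

# A rule takes the initial invoice information and reports whether it passes.
Rule = Callable[[dict], bool]


def verify_accuracy(initial: dict, rules: Iterable[Rule]) -> bool:
    """Apply the rules in order; all() stops at the first rule that fails."""
    return all(rule(initial) for rule in rules)
```

Because `all()` short-circuits, later rules (for example the comparatively expensive parsing of the encrypted data) are skipped once an earlier rule has failed.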
Third, in the case where the initial invoice information is verified to be accurate, whether the initial invoice information is authentic is verified.
It should be noted that verifying the accuracy of the initial invoice information can only determine whether the invoice information has missing content, irregular entries, contradictions caused by tampering, and the like. It cannot determine whether the invoice was actually issued, that is, whether the invoice is authentic. Therefore, once the initial invoice information has been verified to be accurate, its authenticity is verified as well.
In one possible implementation, whether the initial invoice information is authentic is verified through the invoice authenticity query interface provided by the State Taxation Administration. Specifically, an invoice verification application programming interface (API) is called by sending an HTTPS GET or POST request to the server address of the invoice verification API, with the corresponding request parameters added to the request according to the interface specification of the invoice verification API. The request parameters may include at least one of a check code and, from the initial invoice information, the invoice code, the invoice number, the invoicing date, and the amount excluding tax. The invoice verification service then verifies the authenticity of the initial invoice information and, if the verification result is true, returns the real invoice information as structured data in JSON (JavaScript Object Notation) format. That is, when real invoice information returned by the invoice verification interface is received, the initial invoice information can be determined to be authentic. The returned real invoice information may include at least one of: invoice number, invoice code, invoice type, seller name, seller taxpayer identification number, seller contact, seller issuer, buyer name, buyer taxpayer identification number, buyer contact, buyer issuer, invoice verification code, invoice machine code, whether the invoice is void, total tax, total price and tax, total quantity, update time, tax office check number, and remark information.
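The request construction described above might look roughly like the following sketch. The endpoint URL, the parameter names, and the shape of the JSON response are all placeholders invented for illustration; the real values come from the invoice verification API's interface specification, which is not reproduced in the text. The sketch only builds the request and interprets a response body, and does not send anything.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint; the real server address comes from the API specification.
VERIFY_URL = "https://example.invalid/invoice/verify"


def build_verify_request(initial: dict, method: str = "POST") -> urllib.request.Request:
    """Build an HTTPS GET or POST request carrying the verification parameters."""
    params = {  # hypothetical parameter names
        "checkCode": initial["check_code"],
        "invoiceCode": initial["invoice_code"],
        "invoiceNumber": initial["invoice_number"],
        "invoicingDate": initial["invoicing_date"],
        "amountExcludingTax": initial["amount_excluding_tax"],
    }
    query = urllib.parse.urlencode(params)
    if method == "GET":
        return urllib.request.Request(f"{VERIFY_URL}?{query}", method="GET")
    return urllib.request.Request(VERIFY_URL, data=query.encode("utf-8"), method="POST")


def is_authentic(response_body: str) -> bool:
    """Assumed response shape: real invoice data is returned under an 'invoice' key."""
    payload = json.loads(response_body)
    return bool(payload.get("invoice"))
```

In a deployment the request would be sent with `urllib.request.urlopen` (or an HTTP client of choice) and the response body passed to `is_authentic`.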
And finally, under the condition of verifying that the initial invoice information is true, taking the initial invoice information as invoice information.
In this implementation, the initial invoice information is first identified from the invoice file, and its accuracy is then verified through locally preset verification rules. If the initial invoice information is accurate, its authenticity is verified through the invoice authenticity query interface provided by the State Taxation Administration, and if it is verified to be authentic, the initial invoice information is used as the invoice information for reuse risk identification. This improves the efficiency of acquiring valid invoice information.
In a possible implementation manner, in the case of acquiring the invoice information, the method further comprises generating a unique identifier of the invoice information, and storing the invoice information and the unique identifier of the invoice information, so that the stored invoice information can be read based on the unique identifier of the invoice information later.
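A minimal in-memory sketch of storing invoice information under a generated unique identifier follows. The text does not say how the identifier is generated or where the data is stored; a random UUID and a dictionary are assumptions of this sketch, and the class name is hypothetical.

```python
import uuid


class InvoiceStore:
    """Keeps invoice information retrievable by a generated unique identifier."""

    def __init__(self) -> None:
        self._items = {}

    def save(self, invoice_info: dict) -> str:
        invoice_id = uuid.uuid4().hex  # assumed identifier scheme: random UUID
        self._items[invoice_id] = dict(invoice_info)  # store a copy
        return invoice_id

    def load(self, invoice_id: str) -> dict:
        return self._items[invoice_id]
```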
In one possible implementation, in the case where the initial invoice information is verified to be inaccurate or not authentic, the method further includes raising an abnormality alarm for the initial invoice information. Specifically, when the initial invoice information is inaccurate, a first alarm message reflecting the inaccuracy is generated and sent to a designated external application by broadcast or short message. When the initial invoice information is not authentic, a second alarm message reflecting the lack of authenticity is generated and sent to the designated external application by broadcast or short message. The user can thus receive the alarm message promptly through the external application and handle the invoice file with abnormal initial invoice information accordingly.
It should be noted that when an enterprise seeks financing from a financing institution, it needs to provide invoice information related to that enterprise. Therefore, in one possible implementation, after the enterprise information and the invoice information are obtained, the method further includes: verifying whether the enterprise information is consistent with the invoice information, and, if they are verified to be consistent, performing the operation of downloading the asset registration text corresponding to the enterprise information based on the enterprise information. This avoids errors in invoice reuse risk identification caused by inconsistency between the invoice information and the enterprise information.
In an implementation in which the enterprise information includes an enterprise name and a taxpayer identification number, it may be verified whether the enterprise name and the taxpayer identification number are consistent with the buyer information or the seller information in the invoice information, and if so, the enterprise information is determined to be consistent with the invoice information. The buyer information comprises the enterprise name and taxpayer identification number of the buyer, and the seller information comprises the enterprise name and taxpayer identification number of the seller.
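The consistency check between enterprise information and invoice parties can be sketched as follows. The dictionary layout and key names are hypothetical; the sketch requires both the name and the taxpayer identification number to match the same party.

```python
def enterprise_matches_invoice(enterprise: dict, invoice: dict) -> bool:
    """True if the enterprise is the buyer or the seller named on the invoice."""
    key = (enterprise["name"], enterprise["taxpayer_id"])
    parties = (invoice["buyer"], invoice["seller"])
    return any((party["name"], party["taxpayer_id"]) == key for party in parties)
```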
S1200, downloading, based on the enterprise information, an asset registration text corresponding to the enterprise information. Specifically, the asset registration file related to the enterprise information is downloaded from the internet based on the enterprise information, and an asset registration text is then generated from the asset registration file. The type of the asset registration file may be PDF, OFD, OpenXML, or another file type, which is not specifically limited herein.
In one possible implementation, the asset registration text is generated from the asset registration file according to the file type of the asset registration file. Specifically, after the asset registration file is acquired, its file type is identified, and an algorithm matching that file type is selected to extract the text content of the asset registration file and generate the asset registration text.
In one possible implementation, when the file type of the asset registration file is PDF or OFD, an optical character recognition (OCR) algorithm is selected to extract the text content of the asset registration file and obtain the asset registration text.
In one possible implementation, when the file type of the asset registration file is OpenXML, a tag parsing algorithm is selected to extract the text content of the asset registration file and obtain the asset registration text. Specifically, the text in the asset registration file contains various OpenXML tags. According to the length of the text, a corresponding number of thread instances is used to classify, match, sort and assemble the OpenXML tags in the text, so as to obtain the text content corresponding to the asset registration file, and the asset registration text is generated from that text content. Here, the OpenXML tags refer to the "Ecma Office Open XML" standard proposed by Ecma Technical Committee TC45, an international open standard for word-processing documents, presentations and spreadsheets. Selecting a number of thread instances that matches the length of the text to classify, match, sort and assemble the OpenXML tags improves the processing speed of the asset registration file.
In this implementation, classifying, matching, sorting and assembling the OpenXML tags in the text with a corresponding number of thread instances according to the length of the text, so as to obtain the text content corresponding to the asset registration file, may include the following steps:
First, a fileLoader is used to read the file openXmlFile, and it is determined whether openXmlFile is in an OpenXML format to be processed. The OpenXML formats to be processed include xlsx and docx.
Second, an algorithm matching the format of openXmlFile is selected, and openXmlFile is processed to obtain the text content corresponding to the asset registration text.
The second step is described in detail below, taking an openXmlFile in xlsx format as an example.
First, openXmlFile is decompressed into an OpenXML directory folder, and the corresponding XML files are obtained according to the directory rules. Here, xl/worksheets/sheet1.xml is the table structure file and xl/sharedStrings.xml is the shared string storage file. The table structure file is read as sheet1, and the shared string storage file is read as sharedString.
Next, sheet1 is read into the XML structure sheetXml1, and sharedString is read into the XML structure sharedStringXml.
Finally, the row array sheetXml1Rows[] of sheetXml1 is obtained, and the row array is processed into the text sheetXmlStr. Specifically: 1) All cells in a row are acquired. 2) A value processing method is selected according to the cell type. When the cell type is s (string type), the corresponding text is fetched from sharedStringXml by its index and returned; when the cell type is n (numeric type), the value v is fetched directly and returned; when the cell type is any other type, all XML tags are removed with a regular expression, and the remaining content is concatenated and returned. 3) All cell texts in the row are collected and joined with half-width commas to form the row text string. 4) A row index correction number is appended to the end of the row text. 5) The row texts are assembled, separated by "|", into the table text string (that is, the text content corresponding to the asset registration text).
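The xlsx steps above can be sketched with the Python standard library. This is an illustrative sketch, not the disclosed implementation: it runs single-threaded, it assumes the worksheet is stored at xl/worksheets/sheet1.xml with shared strings at xl/sharedStrings.xml, it strips tags from cells of other types with `itertext()` instead of a regular expression, and it omits the row index correction number because its format is not specified. The function name is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet and shared-string parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def xlsx_to_text(xlsx_bytes: bytes) -> str:
    """Flatten the first worksheet: cells joined by ',', rows joined by '|'."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as archive:
        sheet_xml = ET.fromstring(archive.read("xl/worksheets/sheet1.xml"))
        shared_xml = ET.fromstring(archive.read("xl/sharedStrings.xml"))
    # Each <si> entry is one shared string; its text may be split across child tags.
    shared = ["".join(si.itertext()) for si in shared_xml.findall("m:si", NS)]
    rows = []
    for row in sheet_xml.findall("m:sheetData/m:row", NS):
        cells = []
        for cell in row.findall("m:c", NS):
            cell_type = cell.get("t", "n")  # cells without a type attribute are numeric
            value = cell.findtext("m:v", default="", namespaces=NS)
            if cell_type == "s":
                cells.append(shared[int(value)])  # value is an index into shared strings
            elif cell_type == "n":
                cells.append(value)  # numeric value is used as-is
            else:
                cells.append("".join(cell.itertext()))  # drop tags, keep text
        rows.append(",".join(cells))
    return "|".join(rows)
```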
Here, when the file type of the asset registration file is plain text, the asset registration file is used directly as the asset registration text.
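Selecting an extraction algorithm by file type can be sketched as a small registry keyed by file extension. The extractor functions here are placeholders that only label which path would be taken; the extension-based detection, the ".txt" mapping for plain text, and all function names are assumptions of this sketch.

```python
from pathlib import Path
from typing import Callable, Dict


def extract_with_ocr(path: str) -> str:
    return f"ocr:{path}"  # placeholder for an OCR-based extractor


def extract_with_tag_parsing(path: str) -> str:
    return f"tags:{path}"  # placeholder for an OpenXML tag parser


def use_as_is(path: str) -> str:
    return f"text:{path}"  # placeholder: a plain-text file is already the registration text


# Maps a file extension to the extraction algorithm that matches it.
EXTRACTORS: Dict[str, Callable[[str], str]] = {
    ".pdf": extract_with_ocr,
    ".ofd": extract_with_ocr,
    ".xlsx": extract_with_tag_parsing,
    ".docx": extract_with_tag_parsing,
    ".txt": use_as_is,
}


def extract_registration_text(path: str) -> str:
    """Dispatch to the extractor registered for the file's extension."""
    suffix = Path(path).suffix.lower()
    try:
        extractor = EXTRACTORS[suffix]
    except KeyError:
        raise ValueError(f"no extractor registered for {suffix or '(no extension)'}") from None
    return extractor(path)
```

Other file types, which the text leaves open, could be supported by registering additional entries in the table.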
After the asset registration text is obtained, all asset registration texts may be cached in the local server. How long the asset registration text stays cached in the local server can be set according to the specific application scenario. For example, the caching time may be set to 4 hours.
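A time-limited cache of this kind can be sketched as follows. The class name, the in-memory dictionary, and the injectable clock are assumptions of this sketch; only the 4-hour default comes from the example in the text.

```python
import time


class RegistrationTextCache:
    """In-memory cache whose entries expire after a fixed time to live."""

    def __init__(self, ttl_seconds: float = 4 * 60 * 60, clock=time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}

    def put(self, enterprise_key: str, text: str) -> None:
        self._entries[enterprise_key] = (self._clock() + self._ttl, text)

    def get(self, enterprise_key: str):
        entry = self._entries.get(enterprise_key)
        if entry is None:
            return None
        expires_at, text = entry
        if self._clock() >= expires_at:
            del self._entries[enterprise_key]  # expired: drop it and report a miss
            return None
        return text
```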
S1300, identifying whether the invoice reuse risk exists or not based on the invoice information and the asset registration text.
In one possible implementation, when identifying whether there is a risk of invoice reuse based on invoice information and asset registration text, the following steps may be included:
first, word segmentation processing is carried out on the asset registration text, and a word segmentation result of the asset registration text is obtained. Specifically, the asset registration text may be subjected to word segmentation using a natural language processing tool, thereby obtaining a word segmentation result including a plurality of word segments.
Second, based on the invoice information and the word segmentation result, the association degree of the invoice information and the registration text is calculated.
In one possible implementation, the association degree between the invoice information and the registration text is calculated from the invoice information and the word segmentation result according to a preset invoice information hierarchy.
In one possible implementation, the preset invoice information hierarchy may be at least two levels. For example, the invoice information may be divided into two levels, wherein an invoice number in the invoice information is taken as first level invoice information and an amount in the invoice information is taken as second level invoice information. For another example, the invoice information may be further divided into three levels, wherein the invoice number in the invoice information may be used as the first level invoice information, the buyer's taxpayer identification number in the invoice information may be used as the second level invoice information, and the buyer's name in the invoice information may be used as the third level invoice information.
In this implementation manner, when calculating the association degree between the invoice information and the registration text based on the invoice information and the word segmentation result, the method may include the following steps:
First, the association degree of each level of invoice information with the registration text is calculated based on that level's invoice information and the word segmentation result.
For example, the preset invoice information is divided into N levels, namely, first-level invoice information, second-level invoice information, …, and Nth-level invoice information. The association degree between each level of invoice information and the registration text is calculated in sequence as follows. The association degree of the first-level invoice information and the registration text is calculated based on the first-level invoice information and the word segmentation result. Specifically, the first-level invoice information and the word segmentation result may be input into a TF-IDF model, a first word frequency of the first-level invoice information in the registration text is calculated, and the product of the first word frequency and a preset first-level weight is taken as the association degree of the first-level invoice information and the registration text. The association degree of the second-level invoice information and the registration text is then calculated based on the second-level invoice information and the word segmentation result. Specifically, the second-level invoice information and the word segmentation result may be input into the TF-IDF model, a second word frequency of the second-level invoice information in the registration text is calculated, and the product of the second word frequency, the word frequency of the preceding level (namely, the first word frequency), and a preset second-level weight is taken as the association degree of the second-level invoice information and the registration text. The association degrees of the third-level through Nth-level invoice information and the registration text are then calculated in sequence.
The specific calculation follows that of the association degree between the second-level invoice information and the registration text and is not repeated here.
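The per-level calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the disclosure does not specify how the word frequency is normalized, so `term_frequency` here assumes the share of tokens equal to the field value, and the function names and sample tokens are hypothetical. The level rule follows the text: the first level uses its word frequency times its weight, and each later level additionally multiplies by the word frequency of the preceding level.

```python
def term_frequency(value: str, tokens: list[str]) -> float:
    """Share of tokens equal to `value` (assumed definition of word frequency)."""
    if not tokens:
        return 0.0
    return tokens.count(value) / len(tokens)


def level_associations(term_freqs: list[float], weights: list[float]) -> list[float]:
    """Level 1: tf_1 * w_1. Level k > 1: tf_k * tf_(k-1) * w_k."""
    if len(term_freqs) != len(weights):
        raise ValueError("one weight is required per level")
    result = []
    for k, (tf, weight) in enumerate(zip(term_freqs, weights)):
        # The first level has no preceding level, so its factor is 1.
        previous_tf = term_freqs[k - 1] if k > 0 else 1.0
        result.append(tf * previous_tf * weight)
    return result


# Hypothetical two-level example: invoice number, then amount.
tokens = ["invoice", "12345678", "amount", "5000"]
tfs = [term_frequency("12345678", tokens), term_frequency("5000", tokens)]
assoc = level_associations(tfs, [0.85, 0.89])
```

Because each later level is multiplied by the preceding level's word frequency, a lower-level field contributes little when the higher-level field is rarely found in the registration text.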
It should be noted here that the preset weight corresponding to each level of invoice information may be configured according to specific requirements and is not limited herein. For example, in an implementation where the invoice information is divided into two levels, with the invoice number as the first-level invoice information and the amount as the second-level invoice information, the first-level weight may be set to 0.85 and the second-level weight may be set to 0.89. For another example, in an implementation where the invoice information is divided into three levels, with the invoice number as the first-level invoice information, the buyer's taxpayer identification number as the second-level invoice information, and the buyer's name as the third-level invoice information, the weights corresponding to the three levels may be set to 0.85, 0.82, and 0.63 in sequence.
Next, the degree of association of the invoice information with the registration text is calculated based on the degree of association of each hierarchical invoice information with the registration text. Specifically, an average value of the degree of association of each level of invoice information with the registration text may be taken as the degree of association between the invoice information and the registration text.
Third, based on the degree of association between the invoice information and the registration text, it is identified whether there is a risk of invoice reuse. Specifically, a risk identification threshold may be preset, and in the case that the association degree between the invoice information and the registration text is greater than the risk identification threshold, it is determined that the invoice file corresponding to the invoice information has a reuse risk. The risk identification threshold may be configured according to specific application requirements; for example, it may be set to 0.85.
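The aggregation and threshold decision can be sketched as below. The function names are hypothetical, and the per-level association values passed in are made-up numbers chosen only to exercise both outcomes; the averaging rule and the strictly-greater-than comparison against a threshold of 0.85 follow the text above.

```python
from statistics import mean


def overall_association(level_assoc: list[float]) -> float:
    """Average the per-level association degrees."""
    return mean(level_assoc)


def has_reuse_risk(level_assoc: list[float], threshold: float = 0.85) -> bool:
    """Flag a reuse risk only when the average strictly exceeds the threshold."""
    return overall_association(level_assoc) > threshold


# Made-up per-level association degrees for illustration.
flagged = has_reuse_risk([0.90, 0.88])
not_flagged = has_reuse_risk([0.60, 0.40])
```

Keeping the threshold as a parameter mirrors the statement that it may be configured according to application requirements.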
In one possible implementation, when an invoice is identified as having a reuse risk, an alert is raised for the invoice reuse risk. Specifically, a third alarm message indicating that the invoice has a reuse risk may be generated and sent to the designated external application end by broadcast or short message. The user can then handle the reused invoice file according to the third alarm message received by the external application end.
In one possible implementation, the invoice risk recognition method further includes: generating an invoice risk identification instance and adding the instance to a wind control task center, so that periodic risk identification is performed on the invoice according to the scheduling of the wind control task center. Further, after the financing activity involving the invoice ends, the instance may be removed from the wind control task center to end the periodic risk identification of the invoice. In this way, the invoice reuse risk can be identified throughout the financing period, which improves risk prevention and control during the financing period.
The present disclosure provides an invoice risk recognition method, including: acquiring enterprise information and invoice information; based on the enterprise information, downloading asset registration text corresponding to the enterprise information; and, based on the invoice information and the asset registration text, identifying whether there is an invoice reuse risk. With the invoice risk recognition method of the present disclosure, whether an invoice used in a financing activity is at risk of reuse can be identified automatically, which improves the efficiency and accuracy of invoice risk identification.
< method example >
FIG. 2 illustrates a flow chart of an example of an invoice risk recognition method, according to an embodiment of the present disclosure. This example is implemented through interaction between an external application end and an invoice risk recognition system. The invoice risk recognition system comprises an invoice wind control service module and a basic service module. As shown in FIG. 2, the method includes steps S2001-S2012.
S2001, an invoice verification request is sent to the invoice wind control service module through the external application end, where the invoice verification request includes an invoice file used in the financing process.
S2002, after receiving the invoice verification request, the invoice wind control service module sends an OCR recognition request to the basic service module, where the OCR recognition request includes the invoice file from step S2001.
S2003, after receiving the OCR recognition request, the basic service module parses the invoice file from the OCR recognition request, performs file type conversion on the invoice file, and identifies the invoice information from the invoice file using an optical character recognition algorithm.
S2004, the basic service module sends the identified invoice information to the invoice wind control service module.
S2005, after receiving the invoice information, the invoice wind control service module verifies the accuracy and the authenticity of the invoice information.
S2006, the invoice wind control service module feeds the invoice information and the accuracy and authenticity verification result back to the external application end, and raises an alert if the accuracy or authenticity of the invoice is abnormal. Meanwhile, invoice information verified as accurate and authentic is stored, where each piece of stored invoice information is assigned a unique invoice identifier so that the corresponding invoice information can be retrieved based on that identifier.
S2007, an invoice duplication checking request is sent to the invoice wind control service module through the external application end, where the invoice duplication checking request includes enterprise information and the invoice identifier of an invoice used for financing.
S2008, after receiving the invoice duplication checking request, the invoice wind control service module forwards it to the basic service module.
S2009, after receiving the invoice duplication checking request, the basic service module parses the enterprise information and the invoice identifier from the request, and pulls the asset registration file related to the enterprise information from the internet based on the enterprise information.
S2010, the basic service module sends the enterprise information, the invoice identifier, and the asset registration file to the invoice wind control service module.
S2011, the invoice wind control service module reads the corresponding invoice information based on the invoice identifier, converts the asset registration file into an asset registration text, and identifies whether an invoice reuse risk exists based on the invoice information and the asset registration text.
S2012, the invoice wind control service module feeds the identification result for the invoice reuse risk back to the external application end, and raises an alert for the invoice reuse risk.
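The two request flows in steps S2001-S2012 can be sketched as a small service object. This is a simplified illustration, not the disclosed implementation: the class name `InvoiceRiskService`, its method names, the in-memory dictionary used as the invoice store, and the stub callables standing in for OCR, verification, registration download, and risk identification are all assumptions introduced for the example.

```python
import uuid
from typing import Callable


class InvoiceRiskService:
    """Hypothetical stand-in for the invoice wind control service module."""

    def __init__(
        self,
        ocr: Callable[[bytes], dict],
        verify: Callable[[dict], bool],
        fetch_registration_text: Callable[[str], str],
        identify_risk: Callable[[dict, str], bool],
    ) -> None:
        self._ocr = ocr
        self._verify = verify
        self._fetch = fetch_registration_text
        self._identify = identify_risk
        self._store: dict[str, dict] = {}  # invoice identifier -> invoice information

    def handle_verification(self, invoice_file: bytes) -> dict:
        info = self._ocr(invoice_file)       # S2002-S2004: OCR via the basic service
        verified = self._verify(info)        # S2005: accuracy and authenticity check
        invoice_id = None
        if verified:                         # S2006: store only verified invoices
            invoice_id = uuid.uuid4().hex
            self._store[invoice_id] = info
        return {"invoice_info": info, "verified": verified, "invoice_id": invoice_id}

    def handle_duplication_check(self, enterprise_info: str, invoice_id: str) -> bool:
        text = self._fetch(enterprise_info)  # S2008-S2010: pull registration text
        info = self._store[invoice_id]       # S2011: look up the stored invoice
        return self._identify(info, text)    # S2011-S2012: identify reuse risk


# Stub collaborators, for illustration only.
service = InvoiceRiskService(
    ocr=lambda f: {"number": f.decode()},
    verify=lambda info: info["number"].isdigit(),
    fetch_registration_text=lambda enterprise: "pledged invoice 12345678",
    identify_risk=lambda info, text: info["number"] in text.split(),
)
first = service.handle_verification(b"12345678")
risky = service.handle_duplication_check("Example Co", first["invoice_id"])
```

Passing the collaborators in as callables keeps the sketch independent of any particular OCR engine or registration data source.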
In this method example, the invoice risk recognition method further includes the steps of:
The invoice wind control service module packages steps S2011-S2012 into an instance and adds the instance to the wind control task center, so that periodic risk identification is performed on the invoice according to the scheduling of the wind control task center. Each instance is assigned a unique instance identifier.
When the invoice wind control service module receives the invoice duplication checking request, the instance is removed from the wind control task center based on the instance identifier, so as to end the risk identification for the invoice.
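The instance lifecycle described above (register an instance, run it on each scheduling pass, remove it by identifier) can be sketched as follows. The class `RiskTaskCenter` and its methods are hypothetical names, the integer counter is just one possible way to obtain unique instance identifiers, and the periodic trigger itself is left out: an external timer or scheduler is assumed to call `run_once` at each interval.

```python
import itertools
from typing import Callable


class RiskTaskCenter:
    """Hypothetical in-memory stand-in for the wind control task center."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._instances: dict[int, Callable[[], bool]] = {}

    def add(self, check: Callable[[], bool]) -> int:
        """Register a risk check and return its unique instance identifier."""
        instance_id = next(self._ids)
        self._instances[instance_id] = check
        return instance_id

    def remove(self, instance_id: int) -> None:
        """Remove an instance; unknown identifiers are ignored."""
        self._instances.pop(instance_id, None)

    def run_once(self) -> dict[int, bool]:
        """Run every registered check once, as one scheduling pass would."""
        return {iid: check() for iid, check in self._instances.items()}


center = RiskTaskCenter()
instance_id = center.add(lambda: True)  # stub check that always reports a risk
results_before = center.run_once()
center.remove(instance_id)
results_after = center.run_once()
```

After removal the instance no longer appears in later scheduling passes, which corresponds to ending the periodic risk identification for that invoice.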
< device example >
FIG. 3 shows a schematic block diagram of an invoice risk recognition device, according to an embodiment of the present disclosure. As shown in FIG. 3, the invoice risk recognition apparatus 100 includes:
a first data acquisition module 110, configured to acquire enterprise information and invoice information;
a second data acquisition module 120, configured to download an asset registration text corresponding to the enterprise information based on the enterprise information;
and a risk identification module 130, configured to identify whether there is an invoice reuse risk based on the invoice information and the asset registration text.
< device example >
FIG. 4 shows a schematic block diagram of an invoice risk recognition device, according to an embodiment of the present disclosure. As shown in FIG. 4, the invoice risk recognition apparatus 200 includes: a processor 210 and a memory 220 for storing instructions executable by the processor 210. The processor 210 is configured to implement any of the invoice risk recognition methods described above when executing the executable instructions.
Here, it should be noted that the number of processors 210 may be one or more. Meanwhile, in the invoice risk recognition apparatus 200 of the embodiment of the present disclosure, an input device 230 and an output device 240 may be further included. The processor 210, the memory 220, the input device 230, and the output device 240 may be connected by a bus, or may be connected by other means, which is not specifically limited herein.
The memory 220 is a computer-readable storage medium that can be used to store software programs, computer-executable programs, and various modules, such as the programs or modules corresponding to the invoice risk identification method of the embodiments of the present disclosure. The processor 210 performs the various functional applications and data processing of the invoice risk recognition apparatus 200 by running the software programs or modules stored in the memory 220.
The input device 230 may be used to receive input numbers or signals. The signal may be a key signal generated in connection with user settings and function control of the device/terminal/server. The output device 240 may comprise a display device such as a display screen.
The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the technical improvement of the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. An invoice risk recognition method, comprising:
acquiring enterprise information and invoice information;
downloading asset registration text corresponding to the enterprise information based on the enterprise information;
and identifying whether invoice reuse risks exist or not based on the invoice information and the asset registration text.
2. The method of claim 1, wherein identifying whether there is a risk of invoice reuse based on the invoice information and the asset registration text, comprises:
performing word segmentation processing on the asset registration text to obtain a word segmentation result of the asset registration text;
calculating the association degree of the invoice information and the registration text based on the invoice information and the word segmentation result;
and identifying whether the invoice reuse risk exists or not based on the association degree between the invoice information and the registration text.
3. The method of claim 2, wherein the invoice information includes at least two hierarchical invoice information;
when calculating the association degree between the invoice information and the registration text based on the invoice information and the word segmentation result, the method comprises the following steps:
calculating the association degree of each hierarchical invoice information and the registration text based on each hierarchical invoice information and the word segmentation result;
and calculating the association degree of the invoice information and the registration text based on the association degree of each piece of hierarchical invoice information and the registration text.
4. The method of claim 1, wherein obtaining the invoice information comprises:
acquiring an invoice file, and identifying initial invoice information in the invoice file based on an optical character recognition algorithm;
verifying whether the initial invoice information is accurate or not based on a preset verification rule;
under the condition of verifying that the initial invoice information is accurate, verifying whether the initial invoice information is true;
and under the condition of verifying that the initial invoice information is true, taking the initial invoice information as the invoice information.
5. The method of claim 4, wherein an abnormality alert is raised for the initial invoice information in the event that the initial invoice information is verified to be inaccurate or not authentic.
6. The method of claim 1, further comprising, after obtaining the enterprise information and the invoice information: verifying whether the enterprise information is consistent with the invoice information;
and under the condition that the enterprise information is verified to be consistent with the invoice information, downloading an asset registration text corresponding to the enterprise information based on the enterprise information.
7. The method of any one of claims 1-6, wherein, upon identifying that an invoice has a reuse risk, an alert is raised for the invoice reuse risk.
8. The method of any one of claims 1-6, further comprising: generating an invoice risk identification instance, and adding the invoice risk identification instance to a wind control task center to perform periodic risk identification on the invoice according to the scheduling of the wind control task center.
9. An invoice risk recognition device, comprising:
the first data acquisition module is used for acquiring enterprise information and invoice information;
the second data acquisition module is used for downloading asset registration text corresponding to the enterprise information based on the enterprise information;
and the risk identification module is used for identifying whether the invoice reuse risk exists or not based on the invoice information and the asset registration text.
10. An invoice risk recognition device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the method of any one of claims 1 to 8 when executing the executable instructions.
CN202310302024.1A 2023-03-24 2023-03-24 Invoice risk identification method, device and equipment Pending CN116228265A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310302024.1A CN116228265A (en) 2023-03-24 2023-03-24 Invoice risk identification method, device and equipment

Publications (1)

Publication Number Publication Date
CN116228265A true CN116228265A (en) 2023-06-06

Family

ID=86582579

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310302024.1A Pending CN116228265A (en) 2023-03-24 2023-03-24 Invoice risk identification method, device and equipment

Country Status (1)

Country Link
CN (1) CN116228265A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN2606400Y (en) * 2002-02-22 2004-03-10 何长杰 System for distinguishing receipt
WO2008108861A1 (en) * 2006-06-12 2008-09-12 Datacert, Inc Electronic document processing
CN106033445A (en) * 2015-03-16 2016-10-19 北京国双科技有限公司 Method and device for obtaining article association degree data
CN106339378A (en) * 2015-07-07 2017-01-18 中国科学院信息工程研究所 Data collecting method based on keyword oriented topic web crawlers
CN109472918A (en) * 2018-10-12 2019-03-15 深圳壹账通智能科技有限公司 Invoice validation method, financing checking method, device, equipment and medium
CN109523685A (en) * 2018-09-04 2019-03-26 航天信息股份有限公司 A kind of electronic invoice checking method and system based on OFD formatted file
WO2020119287A1 (en) * 2018-12-13 2020-06-18 阿里巴巴集团控股有限公司 Blockchain-based invoice creation method and apparatus, and electronic device
CN112069808A (en) * 2020-09-28 2020-12-11 深圳壹账通智能科技有限公司 Financing wind control method and device, computer equipment and storage medium
CN115018613A (en) * 2022-04-20 2022-09-06 中银金融科技有限公司 Report analysis method, device, equipment, storage medium and product


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination