CN113888143A - Bill data processing method and device and storage medium - Google Patents

Info

Publication number
CN113888143A
Authority
CN
China
Prior art keywords
statement
category
account
word
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111488115.6A
Other languages
Chinese (zh)
Other versions
CN113888143B (en)
Inventor
刘纯熙
王栋
岂小伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CHANJET INFORMATION TECHNOLOGY CO LTD
Original Assignee
CHANJET INFORMATION TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CHANJET INFORMATION TECHNOLOGY CO LTD
Priority to CN202111488115.6A (patent CN113888143B)
Publication of CN113888143A
Application granted
Publication of CN113888143B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Abstract

The invention provides a method, a device and a storage medium for processing statement data. The method comprises the following steps: training a business classification model using historical statement data acquired from a database to obtain a trained business classification model; classifying the statement data to be processed using the trained business classification model to obtain an initial category of the statement data; adjusting the initial category based on the counterparty account in the statement data to serve as the category of the statement data, and/or presenting the initial category to a user through a display interface, the user determining whether the initial category needs to be modified and, if so, modifying it to obtain the category of the statement data; and processing the statement data based on the category to obtain a bookkeeping voucher. The invention uses an improved fastText model that takes the relations between words into account when generating word vectors, which improves the accuracy of the model and ensures the accuracy of the generated voucher.

Description

Bill data processing method and device and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a device for processing statement data and a storage medium.
Background
Modern information technologies such as the internet, big data and artificial intelligence are developing rapidly, advanced intelligent technologies are being applied to every aspect of social life, and the intelligent era has arrived. In financial accounting, however, technical resources are limited: the accounting staff of many small and micro enterprises still process receipt and payment documents and prepare vouchers by hand, so business processing is very inefficient. Intelligent assistance for business processing is urgently needed to free financial staff from heavy, monotonous work and to improve working efficiency in this field.
In fund flows and bank statements, the transaction summary is a brief description of the whole transaction. Although brief, it is rich in meaning and very helpful for subsequent financial processing. A fastText model can be used to classify text quickly based on the transaction summary of a statement, but because the statement text is short, it is difficult to obtain an accurate classification model during training, which leads to poor accuracy in subsequent processing. This is a drawback of the prior art.
Disclosure of Invention
The present invention proposes the following technical solutions to address one or more technical defects in the prior art.
A method of statement data processing, the method comprising:
training, namely training a business classification model by using historical statement data acquired from a database to obtain a trained business classification model;
classifying, namely classifying the account statement data to be processed by using the trained business classification model to obtain an initial class of the account statement data;
a judging step, namely adjusting the initial category based on an account of the other party in the statement of account data to be used as the category of the statement of account data, or/and providing the initial category for a user through a display interface, determining whether the initial category needs to be modified by the user, and if so, modifying the initial category by the user to obtain the category of the statement of account data;
and a generation step, namely processing the statement bill data based on the category of the statement bill data to obtain a bookkeeping voucher and then storing the bookkeeping voucher in a database.
Furthermore, the business classification model is an improved fastText model whose input is the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$. Based on these words and the relations between the different words, input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ are generated; the word sequence vectors are processed by the hidden layer and then fed to the softmax layer, which outputs the initial category of the statement data.
Further, the operation of generating the input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ based on the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ and the relations between the different words is as follows:
based on the indices of the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$, the vectorized vectors of the words are obtained, and a relation value between any two words is calculated: $\mathrm{rel}(W_{t-i}, W_{t-j}) = a \cdot \mathrm{dis} + b \cdot \mathrm{sim} + c \cdot \mathrm{diff}$, where $a$, $b$, $c$ are the respective weight values;
dis = [formula image 100002_DEST_PATH_IMAGE001, not reproduced in this text]; sim represents the semantic relatedness of the words $W_{t-i}$ and $W_{t-j}$; diff denotes the absolute value of the difference between the numbers of the words $W_{t-i}$ and $W_{t-j}$; $i \neq j$; and T1 is a preset threshold;
the vectorized vector of each of the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ is multiplied by the corresponding sumrel to obtain the word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$;
where sumrel = [formula image DEST_PATH_IMAGE002, not reproduced in this text].
Furthermore, when the initial category needs to be modified, the user modifies the initial category. If the user does not modify the initial category but instead directly modifies the voucher template, the category corresponding to the voucher template is looked up in the database according to the counterpart subject of the modified voucher template, and the initial category is modified based on that category.
Further, the generating step operates to: and after the types of the statement of account data are matched, matching the statement of account data based on the matching rules of the account name and the subject details to obtain an account keeping voucher, and then storing the account keeping voucher in a database.
The invention also provides a statement of account data processing device, which comprises:
the training unit is used for training the business classification model by using the historical statement of account data acquired from the database to obtain a trained business classification model;
the classification unit is used for classifying the account statement data to be processed by using the trained business classification model to obtain an initial class of the account statement data;
the judging unit is used for adjusting the initial category based on the account number of the other party in the statement of account data to be used as the category of the statement of account data, or/and providing the initial category for the user through a display interface, the user determines whether the initial category needs to be modified, and if so, the user modifies the initial category to obtain the category of the statement of account data;
and the generation unit is used for processing the statement bill data based on the category of the statement bill data to obtain an accounting voucher and then storing the accounting voucher in a database.
Furthermore, the business classification model is an improved fastText model whose input is the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$. Based on these words and the relations between the different words, input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ are generated; the word sequence vectors are processed by the hidden layer and then fed to the softmax layer, which outputs the initial category of the statement data.
Further, the operation of generating the input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ based on the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ and the relations between the different words is as follows:
based on the indices of the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$, the vectorized vectors of the words are obtained, and a relation value between any two words is calculated: $\mathrm{rel}(W_{t-i}, W_{t-j}) = a \cdot \mathrm{dis} + b \cdot \mathrm{sim} + c \cdot \mathrm{diff}$, where $a$, $b$, $c$ are the respective weight values;
dis = [formula image 857937DEST_PATH_IMAGE001, not reproduced in this text]; sim represents the semantic relatedness of the words $W_{t-i}$ and $W_{t-j}$; diff denotes the absolute value of the difference between the numbers of the words $W_{t-i}$ and $W_{t-j}$; $i \neq j$; and T1 is a preset threshold;
the vectorized vector of each of the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ is multiplied by the corresponding sumrel to obtain the word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$;
where sumrel = [formula image 203468DEST_PATH_IMAGE002, not reproduced in this text].
Furthermore, when the initial category needs to be modified, the user modifies the initial category. If the user does not modify the initial category but instead directly modifies the voucher template, the category corresponding to the voucher template is looked up in the database according to the counterpart subject of the modified voucher template, and the initial category is modified based on that category.
Still further, the operation of the generating unit is: and after the types of the statement of account data are matched, matching the statement of account data based on the matching rules of the account name and the subject details to obtain an account keeping voucher, and then storing the account keeping voucher in a database.
The invention also proposes a computer-readable storage medium having stored thereon computer program code which, when executed by a computer, performs any of the methods described above.
The technical effects of the invention are as follows. The invention relates to a method, a device and a storage medium for processing statement data, wherein the method comprises: a training step S101 of training a business classification model using historical statement data acquired from a database to obtain a trained business classification model; a classification step S102 of classifying the statement data to be processed using the trained business classification model to obtain an initial category of the statement data; a judging step S103 of adjusting the initial category based on the counterparty account in the statement data to serve as the category of the statement data, and/or presenting the initial category to a user through a display interface, the user determining whether the initial category needs to be modified and, if so, modifying it to obtain the category of the statement data; and a generating step S104 of processing the statement data based on its category to obtain a bookkeeping voucher, which is then stored in a database.
In the invention, the business type of the statement data is first judged using the trained business classification model. Because the precision of this initial classification is limited, some of the preliminarily confirmed business types are then fine-tuned based on the counterparty account information, and the user fine-tunes manually where the recommendation is inaccurate, so that an accurate category of the statement data is obtained and an accurate bookkeeping voucher is generated. The invention uses an improved fastText model that takes the relations between words into account when generating word vectors, which improves the accuracy of the model, and it provides a specific way of applying those relations to the generated word vectors. In addition, when the user does not fine-tune the initial category directly but instead modifies the voucher template recommended by the system, the category corresponding to the voucher template is looked up in the database according to the counterpart subject of the modified template, and the initial category is modified based on that category and then used as the category of the statement data. This improves the compatibility of the system and the accuracy of subsequent templates: the voucher records manually adjusted by the user are merged into the system, so that the next time a user enters a similar voucher, the system gives an optimal result based on those records and the previously adjusted voucher information, which ensures the accuracy of the generated voucher.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings.
Fig. 1 is a flowchart of a statement data processing method according to an embodiment of the present invention.
Fig. 2 is a block diagram of a statement data processing apparatus according to an embodiment of the present invention.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows a statement of account data processing method of the present invention, which includes:
a training step S101, training a business classification model by using historical statement data acquired from a database to obtain a trained business classification model;
a classification step S102, classifying the account statement data to be processed by using the trained business classification model to obtain an initial class of the account statement data;
a judging step S103, in which the initial category is adjusted based on the account number of the other party in the statement of account data to be used as the category of the statement of account data, or/and the initial category is provided for the user through a display interface, the user determines whether the initial category needs to be modified, if so, the user modifies the initial category to obtain the category of the statement of account data;
and a generating step S104, processing the statement bill data based on the category of the statement bill data to obtain a bookkeeping voucher, and then storing the bookkeeping voucher in a database.
In the invention, the categories of the statement data include: collecting money, providing services, interest income, receiving investment, paying, receiving services, interest expenditure, bank commission, issuing wages, paying social security, paying housing provident fund contributions, paying welfare, paying value-added taxes, paying income taxes, paying personal taxes, paying additional taxes, etc. The business type of the statement data is first judged using the trained business classification model. Because the precision of this initial classification is limited, some of the preliminarily confirmed business types are then fine-tuned based on the counterparty account information, and the user fine-tunes manually where the recommendation is inaccurate, so that an accurate category of the statement data is obtained and an accurate bookkeeping voucher is generated.
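The flow of steps S102 to S104 can be illustrated with a minimal sketch. Everything named here (`StatementLine`, `determine_category`, `generate_voucher`, `toy_classifier`, the override table) is hypothetical and not taken from the patent; the keyword rule merely stands in for the trained model of step S101, and the voucher is reduced to a plain dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class StatementLine:
    summary: str                # transaction summary text from the statement
    counterparty_account: str   # account of the other party
    amount: float


def determine_category(
    line: StatementLine,
    classify: Callable[[str], str],
    account_overrides: Dict[str, str],
    user_choice: Optional[str] = None,
) -> str:
    """Steps S102-S103: model prediction, account-based adjustment, optional user edit."""
    category = classify(line.summary)  # S102: initial category from the model
    # S103: adjust the initial category when the counterparty account is known
    category = account_overrides.get(line.counterparty_account, category)
    if user_choice is not None:        # S103: a user modification takes precedence
        category = user_choice
    return category


def generate_voucher(line: StatementLine, category: str) -> dict:
    """Step S104 (simplified): build the voucher record that would be stored."""
    return {"category": category, "amount": line.amount, "summary": line.summary}


def toy_classifier(summary: str) -> str:
    """Hypothetical stand-in for the trained business classification model."""
    return "interest income" if "interest" in summary else "collecting money"


line = StatementLine("quarterly interest", "6222-0001", 12.5)
category = determine_category(line, toy_classifier, {"6222-0001": "interest income"})
voucher = generate_voucher(line, category)
```

Passing `user_choice="bank commission"` to `determine_category` models the case where the user overrides the recommendation in the display interface.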
In one embodiment, the business classification model is an improved fastText model whose input is the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$. Based on these words and the relations between the different words, input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ are generated; the word sequence vectors are processed by the hidden layer and then fed to the softmax layer, which outputs the initial category of the statement data.
The invention improves the existing fastText model. The input is the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$, and the input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ are generated based on these words and the relations between the different words, replacing the prior-art practice of feeding the generated word vectors in directly. Here $t-1, t-2, \ldots, t-n+1$ are the labels of the words; that is, there are $n-1$ words in total, numbered $t-1, t-2, \ldots, t-n+1$ in sequence. The relations between different words are considered when generating the word vectors because the texts in a statement are short, and a vector generated from the words alone carries too little information, which makes the classification inaccurate. Taking the relations between words into account when generating the word vectors therefore improves the accuracy of the model, which is another inventive point of the invention.
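The path from word sequence vectors through a hidden layer to a softmax output can be sketched as below. This is only an illustration under stated assumptions: the hidden layer is taken to average the input vectors, as in the standard fastText classifier, and the patent does not say whether the improved model keeps that choice. The function names and toy numbers are hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(word_vectors: List[List[float]], output_weights: List[List[float]]) -> List[float]:
    """Average the input vectors (assumed hidden layer), apply a linear layer, then softmax."""
    dim = len(word_vectors[0])
    hidden = [sum(v[k] for v in word_vectors) / len(word_vectors) for k in range(dim)]
    scores = [sum(w[k] * hidden[k] for k in range(dim)) for w in output_weights]
    return softmax(scores)


# Toy two-dimensional vectors C(W) for three words, and two candidate categories.
vectors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
weights = [[2.0, 0.0], [0.0, 2.0]]  # one weight row per category
probs = forward(vectors, weights)
predicted = max(range(len(probs)), key=probs.__getitem__)  # index of the initial category
```

In the improved model described above, the `vectors` passed to `forward` would already have been adjusted using the relations between words rather than being raw word vectors.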
In one embodiment, the operation of generating the input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ based on the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ and the relations between the different words is as follows: based on the indices of the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$, the vectorized vectors of the words are obtained, and a relation value between any two words is calculated: $\mathrm{rel}(W_{t-i}, W_{t-j}) = a \cdot \mathrm{dis} + b \cdot \mathrm{sim} + c \cdot \mathrm{diff}$, where $a$, $b$, $c$ are the respective weight values;
dis = [formula image 822668DEST_PATH_IMAGE001, not reproduced in this text]; sim represents the semantic relatedness of the words $W_{t-i}$ and $W_{t-j}$; diff denotes the absolute value of the difference between the numbers of the words $W_{t-i}$ and $W_{t-j}$; $i \neq j$; and T1 is a preset threshold;
the vectorized vector of each of the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ is multiplied by the corresponding sumrel to obtain the word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$;
where sumrel = [formula image 620860DEST_PATH_IMAGE002, not reproduced in this text].
The invention provides a specific way of applying the relation between words to the generated word vector, and practical tests show that the operation can enable the generated word vector to accurately express the information in the statement abstract, so that the accuracy of model classification is improved, which is an important invention point of the invention.
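The relation-weighting idea can be sketched as follows, with several assumptions stated plainly because the formula images for dis and sumrel are not available in this text. In this sketch `dis` and `sim` are caller-supplied functions (the placeholder `dis` used below ignores the threshold T1 entirely), `diff` is read as the absolute difference between the position numbers of the two words, and `sumrel` is assumed to be the sum of a word's relation values with all other words. None of these choices is confirmed by the patent.

```python
from typing import Callable, List, Sequence


def relation_matrix(
    words: Sequence[str],
    dis: Callable[[int, int], float],
    sim: Callable[[str, str], float],
    a: float,
    b: float,
    c: float,
) -> List[List[float]]:
    """rel(i, j) = a*dis + b*sim + c*diff for every pair i != j (diagonal stays 0)."""
    n = len(words)
    rel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                diff = abs(i - j)  # assumed reading: difference of word position numbers
                rel[i][j] = a * dis(i, j) + b * sim(words[i], words[j]) + c * diff
    return rel


def weight_vectors(vectors: List[List[float]], rel: List[List[float]]) -> List[List[float]]:
    """Scale each word vector by sumrel, assumed here to be the row sum of rel."""
    return [[x * sum(row) for x in vec] for vec, row in zip(vectors, rel)]


words = ["pay", "march", "salary"]
rel = relation_matrix(
    words,
    dis=lambda i, j: 1.0,   # hypothetical placeholder for the unreproduced dis formula
    sim=lambda u, v: 0.5,   # hypothetical constant semantic relatedness
    a=1.0, b=1.0, c=1.0,
)
raw_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sequence_vectors = weight_vectors(raw_vectors, rel)
```

With these placeholder inputs the middle word receives a smaller scale factor than the outer words, purely because of the position term; a real `sim` function would change that balance.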
In one embodiment, when it is determined that the initial category needs to be modified, the user modifies the initial category. If the user does not modify the initial category but instead directly modifies the voucher template, the category corresponding to the voucher template is looked up in the database according to the counterpart subject of the modified voucher template, and the initial category is modified based on that category.
A special case in the invention is that the user does not fine-tune the initial category directly but instead modifies the voucher template recommended by the system. In that case the category corresponding to the voucher template is looked up in the database according to the counterpart subject of the modified template, and the initial category is modified based on that category and then used as the category of the statement data. This improves the compatibility of the system and the accuracy of subsequent templates: the voucher records manually adjusted by the user are merged into the system, so that the next time the user enters a similar voucher, the system gives an optimal result based on those records and the previously adjusted voucher information, which ensures the accuracy of the generated voucher. This is another important inventive point of the invention.
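The fallback from an edited template to a category can be sketched as a simple lookup. The function name, the in-memory dictionary standing in for the database, and the subject names are all hypothetical; the patent only states that the category is found in the database from the counterpart subject of the modified template.

```python
from typing import Dict, Optional


def resolve_category(
    initial_category: str,
    user_category: Optional[str],
    edited_counterpart_subject: Optional[str],
    subject_to_category: Dict[str, str],
) -> str:
    """Pick the final category from a direct edit, a template edit, or the model output."""
    if user_category is not None:
        # The user changed the category directly.
        return user_category
    if edited_counterpart_subject is not None:
        # The user edited the voucher template instead: look the category up
        # by the template's counterpart subject (dictionary stands in for the database).
        return subject_to_category.get(edited_counterpart_subject, initial_category)
    return initial_category


subject_to_category = {
    "accounts receivable": "collecting money",
    "interest income": "interest income",
}
final = resolve_category("collecting money", None, "interest income", subject_to_category)
```

Keeping the initial category when the lookup fails is a design choice of this sketch, not something the patent specifies.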
In one embodiment, the operation of the generating step S104 is: and after the types of the statement of account data are matched, matching the statement of account data based on the matching rules of the account name and the subject details to obtain an account keeping voucher, and then storing the account keeping voucher in a database.
For example, when the category of the statement data is income, the existing system files (customer files and employee files) are searched, and a name that matches the counterparty name by at least 90% is judged to be a match. If a customer file is matched, the receipt and payment type is "collection", and the counterparty name references the customer file. If an employee file is matched, the type is "personal collection", and the counterparty name references the personal file. The system also searches for a matching detail subject under accounts receivable, other receivables and other payables; if one is found, it is automatically assigned as the counterpart subject. If the parent subject is accounts receivable, the receipt and payment type is "collection"; if the parent subject is other receivables or other payables, the type is "personal collection". At import time, if the 90% name-matching rule (matching auxiliary accounting items or detail subjects) is met, the counterparty name is changed to the corresponding matched name for display. If none of the above matches and the counterparty name is 2 or 3 characters long, the default is "personal collection". When the category is an expenditure, similar processing is performed. In other words, after the category of the statement data has been matched, the statement data is matched based on the matching rules for account names and detail subjects to obtain the bookkeeping voucher, which improves the accuracy of voucher generation. This is another important inventive point of the invention.
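The name-matching rules for the income case can be sketched as below. The patent does not say how the 90% match is measured, so this sketch assumes the similarity ratio from Python's `difflib.SequenceMatcher`; the function names and the "unmatched" label are hypothetical, and the search for detail subjects under receivables and payables is left out for brevity.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional, Tuple

MATCH_THRESHOLD = 0.9  # the 90% name-matching rule


def best_match(name: str, candidates: Iterable[str]) -> Optional[str]:
    """Return the candidate most similar to name, if it reaches the threshold."""
    best, best_score = None, 0.0
    for cand in candidates:
        score = SequenceMatcher(None, name, cand).ratio()  # assumed similarity measure
        if score >= MATCH_THRESHOLD and score > best_score:
            best, best_score = cand, score
    return best


def income_type(
    name: str, customers: Iterable[str], employees: Iterable[str]
) -> Tuple[str, Optional[str]]:
    """Return (receipt type, matched file name) for an income statement line."""
    customer = best_match(name, customers)
    if customer is not None:
        return "collection", customer            # customer file matched
    employee = best_match(name, employees)
    if employee is not None:
        return "personal collection", employee   # employee file matched
    if len(name) in (2, 3):
        return "personal collection", None       # short name: default to a person
    return "unmatched", None                     # hypothetical label for no match


result = income_type("Acme Trading Co", ["Acme Trading Co."], ["Li Lei"])
```

Here `result` carries the matched file name so that the displayed counterparty name can be replaced with it, as described for the import step.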
Fig. 2 shows a statement data processing device of the present invention, which includes:
a training unit 201, which trains the business classification model by using the historical statement data acquired from the database to obtain a trained business classification model;
a classification unit 202, configured to classify the statement data to be processed by using the trained business classification model to obtain an initial category of the statement data;
the judging unit 203 is used for adjusting the initial category based on the account number of the other party in the statement of account data to be used as the category of the statement of account data, or/and providing the initial category for the user through a display interface, and the user determines whether the initial category needs to be modified, if so, the user modifies the initial category to obtain the category of the statement of account data;
and the generating unit 204 is configured to process the statement bill data based on the category of the statement bill data to obtain a billing voucher, and store the billing voucher in a database.
In the invention, the categories of the statement data include: collecting money, providing services, interest income, receiving investment, paying, receiving services, interest expenditure, bank commission, issuing wages, paying social security, paying housing provident fund contributions, paying welfare, paying value-added taxes, paying income taxes, paying personal taxes, paying additional taxes, etc. The business type of the statement data is first judged using the trained business classification model. Because the precision of this initial classification is limited, some of the preliminarily confirmed business types are then fine-tuned based on the counterparty account information, and the user fine-tunes manually where the recommendation is inaccurate, so that an accurate category of the statement data is obtained and an accurate bookkeeping voucher is generated.
In one embodiment, the business classification model is an improved fastText model whose input is the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$. Based on these words and the relations between the different words, input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ are generated; the word sequence vectors are processed by the hidden layer and then fed to the softmax layer, which outputs the initial category of the statement data.
The invention improves the existing fastText model. The input is the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$, and the input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ are generated based on these words and the relations between the different words, replacing the prior-art practice of feeding the generated word vectors in directly. Here $t-1, t-2, \ldots, t-n+1$ are the labels of the words; that is, there are $n-1$ words in total, numbered $t-1, t-2, \ldots, t-n+1$ in sequence. The relations between different words are considered when generating the word vectors because the texts in a statement are short, and a vector generated from the words alone carries too little information, which makes the classification inaccurate.
In one embodiment, the operation of generating the input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ based on the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ and the relations between the different words is as follows: based on the indices of the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$, the vectorized vectors of the words are obtained, and a relation value between any two words is calculated: $\mathrm{rel}(W_{t-i}, W_{t-j}) = a \cdot \mathrm{dis} + b \cdot \mathrm{sim} + c \cdot \mathrm{diff}$, where $a$, $b$, $c$ are the respective weight values;
dis = [formula image 836071DEST_PATH_IMAGE001, not reproduced in this text]; sim represents the semantic relatedness of the words $W_{t-i}$ and $W_{t-j}$; diff denotes the absolute value of the difference between the numbers of the words $W_{t-i}$ and $W_{t-j}$; $i \neq j$; and T1 is a preset threshold;
the vectorized vector of each of the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ is multiplied by the corresponding sumrel to obtain the word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$;
where sumrel = [formula image 770529DEST_PATH_IMAGE002, not reproduced in this text].
The invention provides a specific way of applying the relation between words to the generated word vector, and practical tests show that the operation can enable the generated word vector to accurately express the information in the statement abstract, so that the accuracy of model classification is improved, which is an important invention point of the invention.
In one embodiment, when it is determined that the initial category needs to be modified, the user modifies the initial category. If the user does not modify the initial category but instead directly modifies the voucher template, the category corresponding to the voucher template is looked up in the database according to the counterpart subject of the modified voucher template, and the initial category is modified based on that category.
A special case in the invention is that the user does not fine-tune the initial category directly but instead modifies the voucher template recommended by the system. In that case the category corresponding to the voucher template is looked up in the database according to the counterpart subject of the modified template, and the initial category is modified based on that category and then used as the category of the statement data. This improves the compatibility of the system and the accuracy of subsequent templates: the voucher records manually adjusted by the user are merged into the system, so that the next time the user enters a similar voucher, the system gives an optimal result based on those records and the previously adjusted voucher information, which ensures the accuracy of the generated voucher. This is another important inventive point of the invention.
In one embodiment, the operation of the generating unit 204 is: and after the types of the statement of account data are matched, matching the statement of account data based on the matching rules of the account name and the subject details to obtain an account keeping voucher, and then storing the account keeping voucher in a database.
For example, when the category of the statement data is income, the existing system files (customer files and employee files) are searched, and a name that matches the counterparty name by at least 90% is judged to be a match. If a customer file is matched, the receipt and payment type is "collection", and the counterparty name references the customer file. If an employee file is matched, the type is "personal collection", and the counterparty name references the personal file. The system also searches for a matching detail subject under accounts receivable, other receivables and other payables; if one is found, it is automatically assigned as the counterpart subject. If the parent subject is accounts receivable, the receipt and payment type is "collection"; if the parent subject is other receivables or other payables, the type is "personal collection". At import time, if the 90% name-matching rule (matching auxiliary accounting items or detail subjects) is met, the counterparty name is changed to the corresponding matched name for display. If none of the above matches and the counterparty name is 2 or 3 characters long, the default is "personal collection". When the category is an expenditure, similar processing is performed. In other words, after the category of the statement data has been matched, the statement data is matched based on the matching rules for account names and detail subjects to obtain the bookkeeping voucher, which improves the accuracy of voucher generation. This is another important inventive point of the invention.
In one embodiment of the present invention, a statement data processing device is provided. The device includes a processor and a memory connected to the processor through a bus; the memory stores a computer program, and the processor executes the computer program stored in the memory to implement the method described above.
An embodiment of the present invention provides a computer storage medium on which a computer program is stored; when executed by a processor, the computer program implements the method described above. The computer storage medium may be a hard disk, a DVD, a CD, a flash memory, or the like.
For convenience of description, the above device is described as being divided by function into various units, each described separately. Of course, when implementing the present application, the functions of the units may be implemented in one or more pieces of software and/or hardware.
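As one illustration of a purely software arrangement, the four units (training, classification, judging, generation) can be written as plain functions. This is a minimal sketch under stated assumptions: the token-count classifier is only a stand-in and is not the modified fastText model, the account-based adjustment is assumed to be a simple lookup table, and the names `train`, `classify`, `judge`, `generate`, `account_overrides`, and the voucher fields are all hypothetical.

```python
from collections import Counter, defaultdict


def train(history):
    """Training unit: count tokens per category (stand-in model, NOT fastText)."""
    counts = defaultdict(Counter)
    for summary, category in history:
        counts[category].update(summary.lower().split())
    return counts


def classify(model, summary):
    """Classification unit: pick the category whose training tokens overlap most."""
    tokens = summary.lower().split()
    scores = {cat: sum(c[t] for t in tokens) for cat, c in model.items()}
    return max(scores, key=scores.get)


def judge(initial, counterparty_account, account_overrides, user_choice=None):
    """Judging unit: adjust by counterparty account (assumed lookup), then allow a user override."""
    category = account_overrides.get(counterparty_account, initial)
    return user_choice if user_choice is not None else category


def generate(statement, category, database):
    """Generation unit: build an accounting voucher record and store it."""
    voucher = {
        "category": category,
        "amount": statement["amount"],
        "counterparty": statement["counterparty"],
    }
    database.append(voucher)  # a list stands in for the database
    return voucher


history = [
    ("salary payment to staff", "payroll"),
    ("customer payment for goods", "sales receipt"),
    ("office rent payment", "rent"),
]
model = train(history)
statement = {"summary": "payment for goods from customer",
             "amount": 1200.0, "counterparty": "Acme Trading Co",
             "account": "6222-0001"}
database = []
initial = classify(model, statement["summary"])
final = judge(initial, statement["account"], account_overrides={})
print(generate(statement, final, database))
```

Keeping each unit as a separate function mirrors the unit division in the description, so any one of them (for example the classifier) could be replaced without touching the others.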
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product, which may be stored in a storage medium such as ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments, or in some portions of the embodiments, of the present application.
Finally, it should be noted that although the present invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that modifications and equivalent substitutions may be made without departing from the spirit and scope of the invention, and all such modifications and substitutions are intended to be covered by the appended claims.

Claims (10)

1. A statement data processing method, comprising:
a training step of training a business classification model using historical statement data acquired from a database to obtain a trained business classification model;
a classification step of classifying statement data to be processed using the trained business classification model to obtain an initial category of the statement data;
a judging step of adjusting the initial category based on the counterparty account in the statement data and using the result as the category of the statement data, and/or presenting the initial category to a user through a display interface, the user determining whether the initial category needs to be modified and, if so, modifying the initial category to obtain the category of the statement data;
and a generation step of processing the statement data based on the category of the statement data to obtain an accounting voucher, and then storing the accounting voucher in a database.
2. The method of claim 1, wherein the business classification model is a modified fastText model whose input parameters are the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$; input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ are generated based on the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ and the relationships between different words; and the input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ are processed by a hidden layer and then fed to a softmax layer, which outputs the initial category of the statement data.
3. The method of claim 2, wherein generating the input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ based on the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ and the relationships between different words comprises:
obtaining a vectorized vector for each of the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ based on the index of that word, and calculating a relationship value between any two words: $\mathrm{rel}(W_{t-i}, W_{t-j}) = a \cdot \mathrm{dis} + b \cdot \mathrm{sim} + c \cdot \mathrm{diff}$, where $a$, $b$, $c$ are the respective weights; dis = [formula image not reproduced]; sim denotes the semantic relatedness of the words $W_{t-i}$ and $W_{t-j}$; diff denotes the absolute value of the difference in character count between the words $W_{t-i}$ and $W_{t-j}$; $i \neq j$; and $T1$ is a preset threshold;
multiplying the vectorized vector of each of the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ by the corresponding sumrel to obtain the word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$;
wherein sumrel = [formula image not reproduced].
4. The method according to claim 3, wherein, when it is determined that the initial category needs to be modified, the user modifies the initial category; and if the user does not modify the initial category but directly modifies the voucher template, the category corresponding to the voucher template is looked up in the database according to the counterparty subject of the modified voucher template, and the initial category is modified based on the category corresponding to the voucher template.
5. The method of claim 4, wherein the generation step comprises: after the category of the statement data has been determined, matching the statement data against the matching rules for the account name and the detail subject to obtain an accounting voucher, and then storing the accounting voucher in a database.
6. A statement data processing device, characterized in that the device comprises:
a training unit configured to train a business classification model using historical statement data acquired from a database to obtain a trained business classification model;
a classification unit configured to classify statement data to be processed using the trained business classification model to obtain an initial category of the statement data;
a judging unit configured to adjust the initial category based on the counterparty account in the statement data and use the result as the category of the statement data, and/or to present the initial category to a user through a display interface, the user determining whether the initial category needs to be modified and, if so, modifying the initial category to obtain the category of the statement data;
and a generation unit configured to process the statement data based on the category of the statement data to obtain an accounting voucher, and then store the accounting voucher in a database.
7. The apparatus of claim 6, wherein the business classification model is a modified fastText model whose input parameters are the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$; input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ are generated based on the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ and the relationships between different words; and the input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ are processed by a hidden layer and then fed to a softmax layer, which outputs the initial category of the statement data.
8. The apparatus of claim 7, wherein generating the input-layer word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$ based on the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ and the relationships between different words comprises:
obtaining a vectorized vector for each of the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ based on the index of that word, and calculating a relationship value between any two words: $\mathrm{rel}(W_{t-i}, W_{t-j}) = a \cdot \mathrm{dis} + b \cdot \mathrm{sim} + c \cdot \mathrm{diff}$, where $a$, $b$, $c$ are the respective weights; dis = [formula image not reproduced]; sim denotes the semantic relatedness of the words $W_{t-i}$ and $W_{t-j}$; diff denotes the absolute value of the difference in character count between the words $W_{t-i}$ and $W_{t-j}$; $i \neq j$; and $T1$ is a preset threshold;
multiplying the vectorized vector of each of the words $W_{t-1}, W_{t-2}, \ldots, W_{t-n+1}$ by the corresponding sumrel to obtain the word sequence vectors $C(W_{t-1}), C(W_{t-2}), \ldots, C(W_{t-n+1})$;
wherein sumrel = [formula image not reproduced].
9. The apparatus according to claim 8, wherein, when it is determined that the initial category needs to be modified, the user modifies the initial category; and if the user does not modify the initial category but directly modifies the voucher template, the category corresponding to the voucher template is looked up in the database according to the counterparty subject of the modified voucher template, and the initial category is modified based on the category corresponding to the voucher template.
10. A computer storage medium, characterized in that the computer storage medium has stored thereon a computer program which, when being executed by a processor, carries out the method of any one of claims 1-5.
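
The relationship value recited in claims 3 and 8, rel = a·dis + b·sim + c·diff, can be sketched in Python as follows. The formulas for dis and sumrel are given only as figures, so this sketch makes loud assumptions: `dis_fn` and `sim_fn` are caller-supplied placeholders, diff is taken to be a difference in character count, and sumrel is assumed, purely for illustration, to be the sum of rel over all other words. The function names are hypothetical.

```python
def relation_value(words, i, j, dis_fn, sim_fn, a=1.0, b=1.0, c=1.0):
    """Compute rel = a*dis + b*sim + c*diff for the words at positions i and j (i != j)."""
    if i == j:
        raise ValueError("i must differ from j")
    dis = dis_fn(i, j)                # positional term; its real definition is not reproduced
    sim = sim_fn(words[i], words[j])  # semantic relatedness; supplied by the caller
    diff = abs(len(words[i]) - len(words[j]))  # ASSUMED: difference in character count
    return a * dis + b * sim + c * diff


def weighted_vectors(words, vectors, dis_fn, sim_fn, weights=(1.0, 1.0, 1.0)):
    """Scale each word vector by its sumrel.

    ASSUMPTION: sumrel for word i is the sum of rel(i, j) over all j != i.
    """
    a, b, c = weights
    scaled = []
    for i, vec in enumerate(vectors):
        sumrel = sum(
            relation_value(words, i, j, dis_fn, sim_fn, a, b, c)
            for j in range(len(words)) if j != i
        )
        scaled.append([sumrel * x for x in vec])
    return scaled


words = ["bank", "fee"]
vectors = [[1.0, 2.0], [0.5, 0.5]]
# Constant placeholders stand in for the real dis and sim definitions.
print(weighted_vectors(words, vectors, lambda i, j: 1.0, lambda u, v: 0.5))
```

The scaled vectors play the role of the input-layer word sequence vectors C(W) that are passed on to the hidden layer.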
CN202111488115.6A 2021-12-08 2021-12-08 Bill data processing method and device and storage medium Active CN113888143B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111488115.6A CN113888143B (en) 2021-12-08 2021-12-08 Bill data processing method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111488115.6A CN113888143B (en) 2021-12-08 2021-12-08 Bill data processing method and device and storage medium

Publications (2)

Publication Number Publication Date
CN113888143A (en) 2022-01-04
CN113888143B CN113888143B (en) 2022-02-25

Family

ID=79016531

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111488115.6A Active CN113888143B (en) 2021-12-08 2021-12-08 Bill data processing method and device and storage medium

Country Status (1)

Country Link
CN (1) CN113888143B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109598479A (en) * 2018-10-25 2019-04-09 北京奇虎科技有限公司 A kind of bill extracting method, device, electronic equipment and medium
CN110069252A (en) * 2019-04-11 2019-07-30 浙江网新恒天软件有限公司 A kind of source code file multi-service label mechanized classification method
US20200058025A1 (en) * 2018-08-15 2020-02-20 Royal Bank Of Canada System, methods, and devices for payment recovery platform
WO2021179483A1 (en) * 2020-03-09 2021-09-16 平安科技(深圳)有限公司 Intention identification method, apparatus and device based on loss function, and storage medium

Also Published As

Publication number Publication date
CN113888143B (en) 2022-02-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant