CN117151647A - Auxiliary auditing method, device, and medium for documents - Google Patents


Info

Publication number
CN117151647A
CN117151647A (Application No. CN202311183836.5A)
Authority
CN
China
Prior art keywords
reject
text
information
bill
logistic regression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311183836.5A
Other languages
Chinese (zh)
Inventor
王印智
马士中
王金丽
任聪
唐昌明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur General Software Co Ltd
Original Assignee
Inspur General Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur General Software Co Ltd
Priority to CN202311183836.5A
Publication of CN117151647A
Legal status: Pending

Classifications

    • G06Q10/103 Workflow collaboration or project management
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/27 Regression, e.g. linear or logistic regression


Abstract

The application provides an auxiliary auditing method, device, and medium for documents, belonging to the technical field of data processing. The method comprises: obtaining document information to be audited from a user terminal; determining, based on a pre-trained logistic regression model, at least one rejection-prediction sub-information sequence corresponding to the document information, the sequence being built from document sub-information whose first rejection-prediction probability value exceeds a preset first probability threshold; determining a second rejection-prediction probability value corresponding to the document information based on the logistic regression model and the rejection-prediction sub-information sequence; when the second rejection-prediction probability value exceeds a second probability threshold, generating audit prompt information and sending it to the corresponding audit terminal; and generating a rejection analysis text from the set of rejection-reason texts corresponding to the rejection-prediction sub-information sequence and storing the rejection analysis text on a cloud server.

Description

Auxiliary auditing method, device, and medium for documents
Technical Field
The present application relates to the field of data processing technologies, and in particular, to an auxiliary auditing method, device, and medium for documents.
Background
As enterprise business grows, the efficiency and accuracy of document auditing have become important assessment indicators; in particular, the workload of financial staff auditing documents at a group's financial shared-service center keeps increasing. An excessive document-auditing workload makes financial staff prone to errors during auditing, and a submitter whose document is rejected may not receive a timely explanation.
Currently, for a rejected document, the submitter typically learns the rejection reason and asks for a solution by consulting the auditor offline or online. This consumes auditing time, degrades the user experience of the document auditing system, and fails to meet users' expectations for an intelligent auditing system.
Accordingly, a technical solution is needed that reduces the manual auditing workload of document auditors, improves auditing efficiency, and raises the intelligence level of the document auditing system.
Disclosure of Invention
The embodiments of the present application provide an auxiliary auditing method, device, and medium for documents, to address the large manual auditing workload, low auditing efficiency, and low intelligence level of current document auditing systems.
In one aspect, an embodiment of the present application provides an auxiliary auditing method for documents, the method comprising:
obtaining document information to be audited from a user terminal, the document information comprising at least a document header, document transaction content, and the document auditor;
determining at least one rejection-prediction sub-information sequence corresponding to the document information based on a pre-trained logistic regression model, where the sequence is built from document sub-information whose first rejection-prediction probability value exceeds a preset first probability threshold, and the document sub-information is obtained by matching a preset historical keyword set against the document information;
determining a second rejection-prediction probability value corresponding to the document information based on the logistic regression model and the rejection-prediction sub-information sequence;
when the second rejection-prediction probability value exceeds a second probability threshold, generating audit prompt information and sending it to the corresponding audit terminal; and
generating a rejection analysis text from the set of rejection-reason texts corresponding to the rejection-prediction sub-information sequence, and storing the rejection analysis text on a cloud server.
In one implementation of the present application, before determining the at least one rejection-prediction sub-information sequence based on the pre-trained logistic regression model, the method further comprises:
obtaining a plurality of historical rejected-document records and the corresponding rejection-reason text data set;
determining, via a preset TF-IDF model, the term frequency and inverse document frequency of each word in the rejection-reason text data set;
when the product of a word's term frequency and inverse document frequency exceeds a preset value, taking the word as a keyword of the rejection-reason text data set; and
generating, from the obtained keywords and a bag-of-words model, keyword feature vectors corresponding to each rejection-reason text in the data set, so as to train the logistic regression model on the keyword feature vectors and the historical rejected-document records.
In one implementation of the present application, training the logistic regression model on the keyword feature vectors and the historical rejected-document records specifically comprises:
taking any one of the keyword feature vectors as the vector to be associated;
computing, in turn, the cosine similarity between the vector to be associated and each of the other keyword feature vectors, and generating the associated feature-vector set for the vector to be associated according to how each similarity compares with a first preset threshold, where each associated feature-vector set corresponds to a predetermined historical rejection reason;
determining, from the resulting associated feature-vector sets, the occurrence frequency of each keyword feature vector across the sets, and from those frequencies the occurrence probability value corresponding to each keyword feature vector; and
adding the document sub-information of each historical rejected document, together with its keyword feature vector and occurrence probability value, to a data dictionary as model training samples for training the logistic regression model.
In one implementation of the present application, training the logistic regression model specifically comprises:
inputting, in turn, the associated feature-vector set and corresponding occurrence probability value for each piece of document sub-information in the data dictionary into the logistic regression model to be trained; and
during training, updating the model parameter values via a gradient descent algorithm until parameter values are found for which the logistic-regression cost function falls below a second preset threshold, yielding the trained logistic regression model.
In one implementation of the present application, the method further comprises:
obtaining each piece of document information to be audited as it is updated in real time, together with the corresponding rejection-reason text; and
adding the document information and the corresponding rejection-reason text to a preset database to update the model training samples, and retraining the logistic regression model.
In one implementation of the present application, generating the rejection analysis text from the set of rejection-reason texts corresponding to the rejection-prediction sub-information sequence specifically comprises:
inputting the set of rejection-reason texts into a preset analysis-text generation model to combine the words of the rejection-reason texts, the model having been trained on a plurality of rejection-reason word samples and corresponding rejection-reason sentences; and
determining the rejection analysis text from the output of the analysis-text generation model.
In one implementation of the present application, the method further comprises:
in response to a rejection operation at the audit terminal, sending prompt information corresponding to the rejection analysis text to the audit terminal, the prompt information comprising a control for viewing the rejection analysis text;
when the audit terminal does not add the rejection analysis text to the rejection-reason text box of the rejection operation, sending the prompt information to the user terminal so that the user terminal can view the rejection analysis text; and
when the audit terminal adds a rejection-reason description text in the rejection-reason text box, comparing the description text with the rejection analysis text and, according to the comparison result, sending the description text and/or the prompt information to the user terminal.
In one implementation of the present application, sending the rejection-reason description text and/or the prompt information to the user terminal according to the text comparison result specifically comprises:
computing the text feature vectors corresponding to the rejection-reason description text and the rejection analysis text, and their text similarity, the text similarity being cosine similarity;
when the text similarity exceeds a similarity threshold, sending the prompt information to the user terminal; and
when the text similarity is less than or equal to the similarity threshold, sending the rejection-reason description text to the user terminal, adding the description text and the document information to a preset database to update the model training samples, and retraining the logistic regression model.
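The comparison-and-routing step above can be sketched as follows; this is a minimal illustration, and the vector values, threshold, and function names are hypothetical rather than taken from the patent:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two text feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def route_rejection_texts(desc_vec, analysis_vec, similarity_threshold):
    """If the auditor's description agrees with the generated analysis,
    send the prompt information; otherwise send the description text and
    flag the pair as a new training sample for retraining."""
    sim = cosine_similarity(desc_vec, analysis_vec)
    if sim > similarity_threshold:
        return "send_prompt", False      # texts agree, no retraining
    return "send_description", True      # texts differ, update samples and retrain

# Parallel vectors point the same way, so similarity is 1.0.
action, retrain = route_rejection_texts([1, 0, 2], [2, 0, 4], 0.9)
```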
In another aspect, an embodiment of the present application further provides an auxiliary auditing device for documents, the device comprising:
at least one processor; and a memory communicatively coupled to the at least one processor, the memory storing instructions executable by the at least one processor to enable the at least one processor to:
obtain document information to be audited from a user terminal, the document information comprising at least a document header, document transaction content, and the document auditor;
determine at least one rejection-prediction sub-information sequence corresponding to the document information based on a pre-trained logistic regression model, where the sequence is built from document sub-information whose first rejection-prediction probability value exceeds a preset first probability threshold, and the document sub-information is obtained by matching a preset historical keyword set against the document information;
determine a second rejection-prediction probability value corresponding to the document information based on the logistic regression model and the rejection-prediction sub-information sequence;
when the second rejection-prediction probability value exceeds a second probability threshold, generate audit prompt information and send it to the corresponding audit terminal; and
generate a rejection analysis text from the set of rejection-reason texts corresponding to the rejection-prediction sub-information sequence, and store the rejection analysis text on a cloud server.
In yet another aspect, an embodiment of the present application further provides a non-volatile computer storage medium for auxiliary auditing of documents, storing computer-executable instructions configured to:
obtain document information to be audited from a user terminal, the document information comprising at least a document header, document transaction content, and the document auditor;
determine at least one rejection-prediction sub-information sequence corresponding to the document information based on a pre-trained logistic regression model, where the sequence is built from document sub-information whose first rejection-prediction probability value exceeds a preset first probability threshold, and the document sub-information is obtained by matching a preset historical keyword set against the document information;
determine a second rejection-prediction probability value corresponding to the document information based on the logistic regression model and the rejection-prediction sub-information sequence;
when the second rejection-prediction probability value exceeds a second probability threshold, generate audit prompt information and send it to the corresponding audit terminal; and
generate a rejection analysis text from the set of rejection-reason texts corresponding to the rejection-prediction sub-information sequence, and store the rejection analysis text on a cloud server.
Through the above technical solution, the present application predicts a document's rejection probability with a logistic regression model so as to classify submitted documents, letting auditors see at a glance which documents carry rejection risk and focus their review on those documents. Meanwhile, the logistic regression model automatically captures and learns the key feature variables, improving prediction accuracy and reducing the risk of missed judgments. This addresses the large manual auditing workload, low auditing efficiency, and low intelligence level of current document auditing systems.
In addition, the application provides the rejection analysis text for auditors or document submitters to view, without extra manual effort to generate or obtain it, thereby improving the user experience of the document auditing system.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a schematic flow chart of an auxiliary audit method for documents according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an auxiliary audit device for documents according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
To make the document auditing workflow more intelligent, algorithms such as support vector machines, random forests, and decision trees have been applied to intelligent document auditing. However, these algorithms are computationally inefficient and applicable only to a limited range of documents, and in some cases part of the auditing work still requires manual participation; for example, when an auditor gives no rejection reason, the document submitter cannot learn the reason in time. Current document auditing systems therefore fall short of users' needs for an intelligent auditing process.
To this end, embodiments of the present application provide an auxiliary auditing method, device, and medium for documents, addressing the large manual auditing workload, low auditing efficiency, and low intelligence level of current document auditing systems.
Various embodiments of the present application are described in detail below with reference to the attached drawing figures.
An embodiment of the present application provides an auxiliary auditing method for documents. As shown in FIG. 1, the method may comprise steps S101-S105:
S101: The server obtains the document information to be audited from a user terminal.
The document information to be audited comprises at least a document header, document transaction content, and the document auditor.
It should be noted that the server is merely an example of the execution subject of the auxiliary auditing method; the execution subject is not limited to the server, and the present application imposes no particular limitation in this regard.
The server may be the server running the document auditing system or a server connected to it, and may connect both to the user terminal from which documents are submitted and to the audit terminal on which documents are reviewed. A document submitter can submit the document information to be audited through the document auditing system from a mobile phone, computer, or other device, such as a financial reimbursement clerk's laptop.
The document header includes the document type; the document transaction content refers to the specific content of the document to be audited, for example, in a financial reimbursement document, the reimbursement amount and the types and quantities of reimbursement items. The document auditor is the person responsible for reviewing the document after the submitter files it.
S102: The server determines at least one rejection-prediction sub-information sequence corresponding to the document information based on a pre-trained logistic regression model.
The rejection-prediction sub-information sequence is built from document sub-information whose first rejection-prediction probability value exceeds the preset first probability threshold; the document sub-information is obtained by matching a preset historical keyword set against the document information to be audited. The first probability threshold may be specified by the user, and the present application imposes no particular limitation on it. Since the document information to be audited may contain several different document types, one piece of document information may yield multiple rejection-prediction sub-information sequences.
The first rejection-prediction probability value can be understood as the probability that the document is rejected when the given sub-information appears in it. The first probability threshold screens the sub-information: when a first rejection-prediction probability value exceeds the threshold, that sub-information is added to the rejection-prediction sub-information sequence. For example, if the sub-information for a certain expense item has a first rejection-prediction probability value of 80% and the preset threshold is 60%, that sub-information is added to the sequence; if the sub-information for another document's travel-expense selection has a probability value of 30% against the same 60% threshold, it is not added. These values are merely illustrative; the threshold can be set in actual use, and the present application imposes no particular limitation on it.
The present application processes the document information to be audited with a logistic regression model, where the rejection-prediction sub-information sequence can be understood as the pre-partitioned, rule-matching document sub-information within the document information. Document sub-information covers items such as the amount, project, date, and sensitive words in the abstract; a concrete rejection-prediction sub-information sequence may include feature items such as "the expense item should be travel expense", "the invoice image is duplicated", and "the itinerary is not closed-loop", each with its own rejection probability (first rejection-prediction probability value), as shown in the table below. The first probability threshold is set by the user, and the present application imposes no particular limitation on it.
Document sub-information                      Rejection probability
The expense item should be travel expense     80%
The invoice image is duplicated               70%
The itinerary is not closed-loop              30%
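The thresholding step above can be sketched in a few lines; this is an illustrative sketch, and the sub-information strings, probabilities, and function name are hypothetical, not from the patent's data:

```python
def build_rejection_sequence(sub_info_probs, first_threshold):
    """Keep only the document sub-information whose first
    rejection-prediction probability value exceeds the preset
    first probability threshold."""
    return [info for info, p in sub_info_probs if p > first_threshold]

# Illustrative sub-information with first rejection-prediction probabilities.
sub_info_probs = [
    ("expense item should be travel expense", 0.80),
    ("invoice image is duplicated", 0.70),
    ("itinerary is not closed-loop", 0.30),
]

# With a 60% threshold, only the first two items enter the sequence.
sequence = build_rejection_sequence(sub_info_probs, first_threshold=0.60)
```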
In an embodiment of the present application, before determining the at least one rejection-prediction sub-information sequence based on the pre-trained logistic regression model, the method further includes the following.
The server obtains a plurality of historical rejected-document records and the corresponding rejection-reason text data set. It then determines, via a preset TF-IDF model, the term frequency and inverse document frequency of each word in the rejection-reason text data set. When the product of a word's term frequency and inverse document frequency exceeds a preset value, the word is taken as a keyword of the rejection-reason text data set. From the obtained keywords and a bag-of-words model, the server generates keyword feature vectors for each rejection-reason text in the data set, so as to train the logistic regression model on the keyword feature vectors and the historical rejected-document records.
In other words, the server can obtain the logistic regression model's training samples from a database preset by the user. The samples comprise a number of historical rejected-document records and the corresponding rejection-reason text data set, which consists of the historical rejection-reason texts of those documents; the texts may be generated in advance by the user or filled in by document auditors during actual use, and the present application imposes no particular limitation on this. Using the TF-IDF model, the server first counts the term frequency of each word within a given rejection-reason text and computes its inverse document frequency across all rejection-reason texts. The TF-IDF value of a word is the product of its term frequency and inverse document frequency; the server compares this product with a preset value and selects the words above it as keywords of the rejection-reason text data set. The preset value is set by the user in actual use, and the present application imposes no particular limitation on it.
After obtaining the keywords of the rejection-reason text data set, the server may use a Bag-of-Words (BoW) model to obtain a keyword feature vector for the keywords contained in each rejection-reason text. For example, if the rejection-reason text is "the amount exceeds the reimbursement quota" and the keywords are "exceeds" and "quota", the keyword feature vector is {1, 2}.
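A minimal sketch of this keyword selection and vectorization, assuming term frequency is taken within each text and inverse document frequency across the whole set; the toy corpus, the preset value, and the function names are illustrative, not from the patent:

```python
import math
from collections import Counter

def tf_idf_keywords(docs, preset_value):
    """Select words whose TF-IDF (term frequency times inverse document
    frequency) exceeds a preset value, per the description above."""
    n_docs = len(docs)
    # Document frequency of each word across the rejection-reason texts.
    df = Counter(w for doc in docs for w in set(doc))
    keywords = set()
    for doc in docs:
        counts = Counter(doc)
        for w, c in counts.items():
            tf = c / len(doc)
            idf = math.log(n_docs / df[w])
            if tf * idf > preset_value:
                keywords.add(w)
    return keywords

def bow_vector(doc, keywords):
    """Bag-of-words count vector over the selected keywords (sorted order)."""
    counts = Counter(doc)
    return [counts[k] for k in sorted(keywords)]

# Toy, pre-tokenized rejection-reason texts.
docs = [
    ["amount", "exceeds", "reimbursement", "quota"],
    ["invoice", "image", "duplicated"],
    ["amount", "missing"],
]
keys = tf_idf_keywords(docs, preset_value=0.25)
vec = bow_vector(docs[0], keys)
```

Note that "amount" appears in two of the three texts, so its inverse document frequency is low and it falls below the preset value, while the rarer words become keywords.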
The server then trains the logistic regression model on the keyword feature vectors and the historical rejected-document records, specifically as follows.
The server takes any one of the keyword feature vectors as the vector to be associated, then computes in turn the cosine similarity between that vector and each of the other keyword feature vectors, generating the associated feature-vector set for the vector to be associated according to how each similarity compares with the first preset threshold. Each associated feature-vector set corresponds to a predetermined historical rejection reason. From the resulting sets, the server determines the occurrence frequency of each keyword feature vector across the sets, and from those frequencies the occurrence probability value of each vector. Finally, the document sub-information of each historical rejected document, together with its keyword feature vector and occurrence probability value, is added to a data dictionary as model training samples for training the logistic regression model.
That is, the rejection-reason text data set contains multiple rejection-reason texts, each corresponding to one keyword feature vector. The server may pick any keyword feature vector as the vector to be associated and build the associated feature-vector set by computing its cosine similarity with the other keyword feature vectors. Specifically, when the cosine similarity between the vector to be associated and another keyword feature vector exceeds the first preset threshold, the two vectors are similar and their rejection reasons are related, so both are added to the associated feature-vector set. The first preset threshold may be set according to actual use, and the present application imposes no particular limitation on it. Because of this rejection-reason association, the keyword feature vectors in one associated set may correspond to a single rejection reason or to several, i.e., the same rejection reason may correspond to multiple keyword feature vectors.
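The association step can be sketched as follows; the vectors and threshold are illustrative and the helper names are hypothetical:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two keyword feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def associated_set(target, vectors, first_preset_threshold):
    """Gather the keyword feature vectors whose cosine similarity to the
    vector under association exceeds the first preset threshold."""
    group = [target]
    for v in vectors:
        if v is not target and cosine_similarity(target, v) > first_preset_threshold:
            group.append(v)
    return group

# [2, 4, 0] is a scaled copy of [1, 2, 0] (similarity 1.0); [0, 0, 3] is orthogonal.
vectors = [[1, 2, 0], [2, 4, 0], [0, 0, 3]]
group = associated_set(vectors[0], vectors, first_preset_threshold=0.8)
```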
The server can also count the occurrence times of the keyword feature vector in different associated feature vector sets, calculate the ratio of the occurrence times and the value to the associated feature vector sets, and use the occurrence probability value corresponding to the keyword feature vector as the occurrence probability value of the reject reason corresponding to the keyword feature vector, and use the probability value corresponding to the reject reason to reject the bill. The server may further predict the occurrence probability value as a first reject prediction probability value corresponding to the bill information according to a correspondence between the reject cause and the bill information. The reasons for rejection are that bill information A is wrong, bill information B is wrong, etc.
And then, the server can generate a data dictionary containing bill sub-information, the keyword feature vector corresponding to the bill sub-information and the occurrence frequency value of the keyword feature vector for training of the logistic regression model. The logistic regression model is trained, and specifically comprises the following steps:
The server sequentially inputs, for each piece of bill sub-information in the data dictionary, the corresponding associated feature vector set and occurrence probability value into the logistic regression model to be trained. During training, the corresponding model parameter values are updated through a gradient descent algorithm until parameter values are found for which the function value of the logistic regression cost function is smaller than a second preset threshold, at which point the trained logistic regression model is obtained.
The server models with a logistic regression algorithm based on the feature variables and the bill sub-information (feature value) data dictionary. The modeling data set is split into a training set and a corresponding test set for training and evaluation, and the specific implementation logic is as follows. First, assume there are n keyword feature vectors X = [x1, x2, …, xn], where xi is the i-th keyword feature vector. Logistic regression assumes the probability function h_θ(x) = g(θ^T x), where h_θ(x) is the occurrence probability value of the keyword feature variable x, θ is the parameter vector of the model, and θ^T denotes the transpose of θ. g(z) is the sigmoid function g(z) = 1/(1 + e^(−z)), with z = θ^T x. In addition, the logistic regression cost function measures the difference between actual and predicted values using the log loss: J(θ) = −(1/m) Σ [y·log(h_θ(x)) + (1 − y)·log(1 − h_θ(x))], where J(θ) is the cost function, m is the number of training samples, y is the actual label value (0 or 1), and h_θ(x) is the model's predicted estimate. To minimize the cost function J(θ), a gradient descent algorithm is used to estimate the optimal parameter values θ. By iteratively updating the parameters θ, the model is continuously adjusted until it is optimal, that is, until the function value of the logistic regression cost function is smaller than a second preset threshold, and the trained logistic regression model is obtained. The second preset threshold is set by the user, which the present application does not particularly limit.
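A minimal sketch of this training loop, assuming a tiny separable toy data set, a learning rate of 1.0, and 0.05 standing in for the second preset threshold (all illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # J(theta) = -(1/m) * sum[y*log(h) + (1-y)*log(1-h)]
    h = sigmoid(X @ theta)
    eps = 1e-12  # guard against log(0)
    return float(-np.mean(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps)))

def train(X, y, lr=1.0, tol=0.05, max_iter=20000):
    # Gradient descent on the logistic regression cost function until its
    # value falls below the (assumed) second preset threshold tol.
    theta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        h = sigmoid(X @ theta)
        grad = X.T @ (h - y) / len(y)
        theta -= lr * grad
        if cost(theta, X, y) < tol:
            break
    return theta

# Toy samples: a bias column plus one keyword feature; labels 0 = accepted, 1 = rejected.
X = np.array([[1.0, 0.1], [1.0, 0.4], [1.0, 0.6], [1.0, 0.9]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = train(X, y)
preds = sigmoid(X @ theta)
```

Once the cost drops below the threshold, the fitted probabilities separate the rejected samples from the accepted ones.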
Through the trained logistic regression model, the server can calculate a first reject prediction probability value for each piece of bill sub-information in the to-be-checked bill information, and then generate the reject prediction sub-information sequence from the result of comparing those values against the first probability threshold.
S103: the server determines a second reject prediction probability value corresponding to the to-be-checked bill information based on the logistic regression model and the reject prediction sub-information sequence.
Through the logistic regression model, a weighted average can be computed over the first reject prediction probability values in each reject prediction sub-information sequence. The weights can be set according to the different reject causes during training of the logistic regression model, that is, different associated feature vector sets are given different weights, and the second reject prediction probability value is calculated from the sum of the products of each weight and its first reject prediction probability value.
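A sketch of this weighted combination, using made-up first reject prediction probability values and per-reject-cause weights; normalizing by the weight sum is an assumption that keeps the result a proper probability:

```python
def second_reject_probability(first_probs, weights):
    # Weighted average: sum of weight * first reject prediction probability,
    # normalized by the total weight so the result stays in [0, 1].
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, first_probs)) / total

# One first reject prediction probability per reject prediction sub-information
# entry; a heavier weight marks a reject cause treated as more serious.
p2 = second_reject_probability([0.9, 0.3, 0.6], [2.0, 1.0, 1.0])
# (2*0.9 + 1*0.3 + 1*0.6) / 4 = 0.675
```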
S104: when the second reject prediction probability value is larger than a second probability threshold, the server generates audit prompt information and sends the audit prompt information to the corresponding auditing terminal.
The second probability threshold may be set according to actual use, which the present application does not particularly limit. When the second reject prediction probability value is larger than the second probability threshold, the server can generate prompt information that prompts the auditor at the auditing terminal to audit the bill sub-information corresponding to the second reject prediction probability value, and can label that bill sub-information when generating the prompt information.
S105: the server generates a reject analysis text according to the reject reason text set corresponding to the reject prediction sub-information sequence, and stores the reject analysis text to the cloud server.
The cloud server may be connected to the server for storing the reject analysis text.
In the embodiment of the application, the method further comprises the following steps:
The server acquires each piece of real-time-updated to-be-checked bill information and the corresponding reject reason text, adds them to a preset database to update the model training samples, and retrains the logistic regression model.
That is, if the to-be-checked bill information is rejected, the server can retrain the logistic regression model using the rejected to-be-checked bill information and its corresponding reject reason text obtained from the logistic regression model's processing, so as to ensure that the logistic regression model remains optimal.
In an embodiment of the present application, generating the reject analysis text according to the reject reason text set corresponding to the reject prediction sub-information sequence specifically includes:
The server inputs the reject reason text set into a preset analysis text generation model so as to combine the reject reason text words; the analysis text generation model is trained on a plurality of reject reason text word samples and their corresponding reject reason sentences. The reject analysis text is then determined according to the output result of the analysis text generation model.
The analysis text generation model may be a neural network model trained on a plurality of reject cause text word samples and the reject cause sentences corresponding to those samples. It can recognize reject cause texts and output the corresponding reject cause sentences, and the output reject cause sentences are taken as the reject analysis text.
Furthermore, the present application can also perform the following:
In response to a rejection operation at the auditing terminal, prompt information corresponding to the reject analysis text is sent to the auditing terminal; the prompt information includes a control for viewing the reject analysis text. If the auditing terminal does not add the reject analysis text to the reject reason text box of the rejection operation, the prompt information is sent to the user terminal so that the user terminal can view the reject analysis text. If the auditing terminal adds a reject reason description text in the reject reason text box, the reject reason description text is compared with the reject analysis text, and the reject reason description text and/or the prompt information is sent to the user terminal according to the text comparison result.
That is, the auditing terminal can perform a rejection operation. After the second reject prediction probability value is obtained, the user at the auditing terminal can examine the to-be-checked bill data information carrying the audit prompt information, and can click the reject control to reject the to-be-checked bill data information when the rejection conditions are met. At this time, the server may obtain the reject analysis text and generate a control through which the reject analysis text can be viewed; for example, the user may view the reject analysis text by clicking or sliding the control.
If, while auditing the to-be-checked bill information, the auditing terminal does not actively add the reject analysis text, the server can send prompt information to the user terminal so that the user terminal accesses the cloud server to view the reject analysis text. If the auditing terminal actively adds a reject reason description text, the reject analysis text and the reject reason description text can be further compared, and the reject reason description text and/or prompt information is sent according to the comparison result.
Sending the reject reason description text and/or prompt information to the user terminal according to the text comparison result specifically includes the following:
The server calculates the text feature vectors corresponding to the reject reason description text and the reject analysis text, respectively, and the corresponding text similarity; the text similarity is a cosine similarity. If the text similarity is larger than a similarity threshold, the prompt information is sent to the user terminal. If the text similarity is smaller than or equal to the similarity threshold, the reject reason description text is sent to the user terminal, the reject reason description text and the to-be-checked bill data information are added to a preset database to update the model training samples, and the logistic regression model is retrained.
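A minimal sketch of this comparison, assuming simple bag-of-words text feature vectors and a similarity threshold of 0.7 (both illustrative, not values from the application):

```python
import math
from collections import Counter

def text_cosine_similarity(text_a, text_b):
    # Bag-of-words term-count vectors, then cosine similarity.
    a, b = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def route_feedback(description_text, analysis_text, threshold=0.7):
    # Similar texts: only prompt information goes to the user terminal.
    # Dissimilar texts: the reject reason description text is sent instead
    # (and would also be added to the database to retrain the model).
    sim = text_cosine_similarity(description_text, analysis_text)
    return "prompt" if sim > threshold else "description"

similar = route_feedback("amount field mismatch on header",
                         "amount field mismatch detected in header")
dissimilar = route_feedback("date is wrong", "missing attachment")
```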
In other words, when the reject reason description text is dissimilar to the reject analysis text given by the logistic regression model, the server may send the reject reason description text to the user terminal and use it for feedback training of the logistic regression model, so as to keep the logistic regression model up to date.
Through the above technical scheme, the present application can predict the reject probability of a bill through the logistic regression model so as to classify submitted bills, making bills with a reject risk immediately apparent to auditors, who can then focus their auditing on that portion of the bills. Meanwhile, the logistic regression model can automatically capture and learn key feature variables, improving the accuracy of intelligent prediction and reducing the risk of missed judgments. This addresses the problems of heavy audit workloads for bill auditors, low auditing efficiency, and the low intelligence level of current bill auditing systems.
In addition, the present application can also produce the reject analysis text for auditors or bill submitters to view, without requiring additional manpower to generate or acquire it, thereby improving the user experience of the bill auditing system.
Fig. 2 is a schematic structural diagram of an auxiliary audit device for documents according to an embodiment of the present application, where, as shown in fig. 2, the device includes:
at least one processor; and a memory communicatively coupled to the at least one processor. Wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to:
Acquire to-be-checked bill data information from a user terminal; the to-be-checked bill data information at least comprises a header, bill transaction content and a bill auditor. Determine at least one reject prediction sub-information sequence corresponding to the to-be-checked bill information based on a pre-trained logistic regression model; the reject prediction sub-information sequence is obtained based on bill sub-information in the to-be-checked bill information whose first reject prediction probability value is larger than a preset first probability threshold, and the bill sub-information is obtained by comparing a preset historical keyword set with the to-be-checked bill information. Determine a second reject prediction probability value corresponding to the to-be-checked bill data information based on the logistic regression model and the reject prediction sub-information sequence. When the second reject prediction probability value is larger than a second probability threshold, generate audit prompt information and send the audit prompt information to the corresponding auditing terminal. Generate a reject analysis text according to the reject reason text set corresponding to the reject prediction sub-information sequence, and store the reject analysis text to the cloud server.
The embodiment of the present application also provides a non-volatile computer storage medium for auxiliary auditing of documents, which stores computer-executable instructions set to perform the following:
Acquire to-be-checked bill data information from a user terminal; the to-be-checked bill data information at least comprises a header, bill transaction content and a bill auditor. Determine at least one reject prediction sub-information sequence corresponding to the to-be-checked bill information based on a pre-trained logistic regression model; the reject prediction sub-information sequence is obtained based on bill sub-information in the to-be-checked bill information whose first reject prediction probability value is larger than a preset first probability threshold, and the bill sub-information is obtained by comparing a preset historical keyword set with the to-be-checked bill information. Determine a second reject prediction probability value corresponding to the to-be-checked bill data information based on the logistic regression model and the reject prediction sub-information sequence. When the second reject prediction probability value is larger than a second probability threshold, generate audit prompt information and send the audit prompt information to the corresponding auditing terminal. Generate a reject analysis text according to the reject reason text set corresponding to the reject prediction sub-information sequence, and store the reject analysis text to the cloud server.
The embodiments of the present application are described in a progressive manner, and the same and similar parts of the embodiments are all referred to each other, and each embodiment is mainly described in the differences from the other embodiments. In particular, for the apparatus, medium embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
The device, medium and method provided by the embodiment of the application are in one-to-one correspondence, so that the device and medium also have similar beneficial technical effects as the corresponding method, and the beneficial technical effects of the device and medium are not repeated here because the beneficial technical effects of the method are described in detail above.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (10)

1. An auxiliary auditing method for documents, the method comprising:
acquiring to-be-checked bill data information from a user terminal; the to-be-checked bill information at least comprises a header, bill transaction content and a bill auditor;
determining at least one reject predictor information sequence corresponding to the to-be-checked bill information based on a pre-trained logistic regression model; the reject prediction sub-information sequence is obtained based on bill sub-information with a first reject prediction probability value larger than a preset first probability threshold value in the to-be-checked bill information; the bill sub-information is obtained by comparing a preset historical keyword set with the to-be-checked bill information;
determining a second reject prediction probability value corresponding to the to-be-checked bill data information based on the logistic regression model and the reject prediction sub-information sequence;
generating audit prompt information and sending the audit prompt information to a corresponding audit terminal under the condition that the second reject prediction probability value is larger than a second probability threshold value; and
and generating a rejection analysis text according to the rejection reason text set corresponding to the rejection prediction sub-information sequence, and storing the rejection analysis text to a cloud server.
2. The auxiliary auditing method for documents according to claim 1, wherein before determining at least one reject prediction sub-information sequence corresponding to the to-be-checked bill information based on a pre-trained logistic regression model, the method further comprises:
acquiring a plurality of historical reject bill information and corresponding reject reason text data sets;
respectively determining word frequency and inverse text frequency index of words in the reject cause text data set through a preset TF-IDF model;
when the product value of the word frequency of the word and the inverse text frequency index is larger than a preset value, the word is used as a keyword of the reject reason text data set;
and generating keyword feature vectors respectively corresponding to the reject cause text data in the reject cause text data set based on the obtained keywords and a bag-of-words model, so as to train the logistic regression model according to the keyword feature vectors and the historical reject document information.
3. The auxiliary auditing method for documents according to claim 2, wherein training the logistic regression model according to the keyword feature vectors and the historical reject document information specifically comprises:
taking any one of the keyword feature vectors as a feature vector to be associated;
sequentially calculating the cosine similarity between the feature vector to be associated and the other keyword feature vectors, and generating an associated feature vector set corresponding to the feature vector to be associated according to a comparison result between the cosine similarity and a first preset threshold; one associated feature vector set corresponds to a predetermined historical reject cause;
determining occurrence frequencies of the keyword feature vectors in the associated feature vector sets according to the obtained associated feature vector sets, so as to determine occurrence probability values corresponding to the keyword feature vectors respectively according to the occurrence frequencies;
and adding the bill sub-information of each history reject bill information, the corresponding keyword feature vector and the occurrence probability value of each history reject bill information to a data dictionary as model training samples so as to train the logistic regression model.
4. The auxiliary auditing method for documents according to claim 3, wherein training the logistic regression model comprises:
sequentially inputting the associated feature vector set and the corresponding occurrence frequency value corresponding to each bill sub-information in the data dictionary into a logistic regression model to be trained so as to train the logistic regression model to be trained;
while training the logistic regression model to be trained, updating the corresponding model parameter values through a gradient descent algorithm until model parameter values are determined for which the function value of the logistic regression cost function is smaller than a second preset threshold, thereby obtaining the trained logistic regression model.
5. The auxiliary auditing method for documents according to claim 1, wherein the method further comprises:
acquiring each piece of real-time updated to-be-checked document information and corresponding reject reason text;
and adding the to-be-checked bill data information and the corresponding reject reason text to a preset database to update a model training sample, and retraining the logistic regression model.
6. The auxiliary auditing method for documents according to claim 1, wherein generating a reject analysis text according to a reject cause text set corresponding to the reject prediction sub-information sequence specifically comprises:
inputting the reject cause text set into a preset analysis text generation model so as to combine the words of the reject cause text; the analysis text generation model is obtained based on training of a plurality of reject reason text word samples and corresponding reject reason sentences;
and determining the reject analysis text according to the output result of the analysis text generation model.
7. The auxiliary auditing method for documents according to claim 6, wherein the method further comprises:
responding to the rejection operation of the auditing terminal, and sending prompt information corresponding to the rejection analysis text to the auditing terminal; the prompt message comprises a control for checking the refused analysis text;
the prompt information is sent to the user terminal under the condition that the auditing terminal does not add the rejection analysis text to the rejection reason text box of the rejection operation, so that the user terminal can check the rejection analysis text;
and under the condition that the checking terminal adds the reject reason description text in the reject reason text box, comparing the reject reason description text with the reject analysis text, and sending the reject reason description text and/or the prompt message to the user terminal according to a text comparison result.
8. The auxiliary auditing method for documents according to claim 7, wherein sending the reject cause description text and/or the prompt message to the user terminal according to the text comparison result specifically comprises:
respectively calculating text feature vectors corresponding to the reject cause description text and the reject analysis text, and corresponding text similarity; the text similarity is cosine similarity;
sending the prompt information to the user terminal under the condition that the text similarity is larger than a similarity threshold;
and under the condition that the text similarity is smaller than or equal to the similarity threshold, sending the reject reason description text to the user terminal, adding the reject reason description text and the to-be-checked bill data information to a preset database to update a model training sample, and retraining the logistic regression model.
9. An auxiliary auditing apparatus for documents, the apparatus comprising:
at least one processor; the method comprises the steps of,
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the auxiliary auditing method for documents according to any one of claims 1-8.
10. A non-volatile computer storage medium for auxiliary auditing of documents, storing computer-executable instructions capable of performing the auxiliary auditing method for documents according to any one of claims 1-8.
CN202311183836.5A 2023-09-13 2023-09-13 Auxiliary auditing method, equipment and medium for bill Pending CN117151647A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311183836.5A CN117151647A (en) 2023-09-13 2023-09-13 Auxiliary auditing method, equipment and medium for bill


Publications (1)

Publication Number Publication Date
CN117151647A true CN117151647A (en) 2023-12-01

Family

ID=88902382

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311183836.5A Pending CN117151647A (en) 2023-09-13 2023-09-13 Auxiliary auditing method, equipment and medium for bill

Country Status (1)

Country Link
CN (1) CN117151647A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117495314A (en) * 2024-01-02 2024-02-02 尚恰实业有限公司 Automatic approval method and system based on machine learning
CN117495314B (en) * 2024-01-02 2024-04-02 尚恰实业有限公司 Automatic approval method and system based on machine learning

Similar Documents

Publication Publication Date Title
CN107958317B (en) Method and device for selecting crowdsourcing participants in crowdsourcing project
CN110070391B (en) Data processing method and device, computer readable medium and electronic equipment
Slapin et al. Words as data: Content analysis in legislative studies
CN117151647A (en) Auxiliary auditing method, equipment and medium for bill
CN107122432A (en) CSR analysis method, device and system
CN109359302A (en) A kind of optimization method of field term vector and fusion sort method based on it
CN107403325A (en) Air ticket order reliability evaluation method and device
CN110310012A (en) Data analysing method, device, equipment and computer readable storage medium
CN115423578A (en) Bidding method and system based on micro-service containerization cloud platform
CN113672797A (en) Content recommendation method and device
Haryono et al. Aspect-based sentiment analysis of financial headlines and microblogs using semantic similarity and bidirectional long short-term memory
CN117911039A (en) Control method, equipment and storage medium for after-sales service system
Latypova Reviewer assignment decision support in an academic journal based on multicriteria assessment and text mining
CN117172508A (en) Automatic dispatch method and system based on city complaint worksheet recognition
CN111353728A (en) Risk analysis method and system
CN116975910A (en) Method, device, equipment and medium for determining security level of data table
Kanchinadam et al. Graph neural networks to predict customer satisfaction following interactions with a corporate call center
CN116757835A (en) Method and device for monitoring transaction risk in credit card customer credit
TW202117584A (en) Intelligent conversation management method and system based on natural language processing for managing man-machine conversations related to different business fields
CN116611911A (en) Credit risk prediction method and device based on support vector machine
US11922352B1 (en) System and method for risk tracking
Koromyslova et al. Feature selection for natural language call routing based on self-adaptive genetic algorithm
CN114881600A (en) Evaluation method and system for reimbursement items
Su et al. Detection of tax arrears based on ensemble learning model
Guo et al. Prediction and analysis of success on crowdfunding projects

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination