CN113850331A - Reimbursement bill anomaly detection method, using method, device, equipment and storage medium - Google Patents


Publication number
CN113850331A
Authority
CN
China
Prior art keywords
reimbursement
information
keyword
abnormal
bill information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111137345.8A
Other languages
Chinese (zh)
Inventor
李同
巴堃
庄伯金
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202111137345.8A
Publication of CN113850331A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 Bayesian classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Abstract

The application relates to the technical field of artificial intelligence and discloses a method, a device, equipment and a storage medium for detecting abnormal reimbursement bills. The method comprises the following steps: acquiring a reimbursement bill information set to be processed, wherein each piece of reimbursement bill information in the set comprises a qualitative characteristic, a quantitative characteristic and a remark information characteristic; extracting keywords from the remark information characteristic of each piece of reimbursement bill information and de-duplicating them to obtain a keyword word set; calculating the weight value corresponding to each keyword in the keyword word set; matching the remark information characteristics according to the keyword word set and the weight values of the keywords to obtain the weight variable characteristic corresponding to each piece of reimbursement bill information; performing comprehensive characteristic extraction on the qualitative characteristics and the weight variable characteristics to obtain the comprehensive variable characteristics of each piece of reimbursement bill information in the set; and identifying abnormal reimbursement bill information in the set through an anomaly detection model according to the comprehensive variable characteristics and the quantitative characteristics.

Description

Reimbursement bill anomaly detection method, using method, device, equipment and storage medium
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a reimbursement bill anomaly detection method, a using method, a device, equipment and a storage medium.
Background
In financial accounting management, travel expense reimbursement has always been a difficult pain point. Reimbursement bill data contain several types of data: qualitative variables representing categories, such as the departure place and the destination; quantitative data representing numerical values, such as the reimbursement amount; and unstructured text, such as remarks and descriptions. This mix makes it difficult to identify whether a reimbursement bill is abnormal.
Disclosure of Invention
The main purpose of the application is to provide a reimbursement bill anomaly detection method, a using method, a device, equipment and a storage medium, so that abnormal reimbursement bills can be identified accurately.
In a first aspect, the present application provides a method for detecting an exception of a reimbursement bill, including:
acquiring a reimbursement bill information set to be processed, wherein the reimbursement bill information set comprises a plurality of reimbursement bill information, and each reimbursement bill information comprises a qualitative characteristic, a quantitative characteristic and a remark information characteristic;
extracting keywords from the remark information characteristics corresponding to each reimbursement information in the reimbursement note information set, and performing duplicate removal processing on the extracted keywords to obtain a keyword word set;
calculating the keyword frequency of each corresponding keyword in the keyword word set appearing in the remark information characteristic corresponding to each reimbursement note information, and calculating the weight value corresponding to each keyword according to the keyword frequency;
matching the remark information characteristics of the reimbursement note information set according to the keyword word set and the weight value of each keyword to obtain weight variable characteristics corresponding to reimbursement note information;
performing comprehensive characteristic extraction on the qualitative characteristics and the weight variable characteristics of the reimbursement bill information to obtain comprehensive variable characteristics of each reimbursement bill information in the reimbursement bill information set;
and identifying abnormal reimbursement bill information in the reimbursement bill information set through a preset abnormal detection model according to the comprehensive variable characteristics and the quantitative characteristics.
In a second aspect, the present application further provides an abnormal reimbursement bill detection device, including:
the reimbursement bill information acquisition module is used for acquiring a reimbursement bill information set to be processed, wherein the reimbursement bill information set comprises a plurality of reimbursement bill information, and each reimbursement bill information comprises a qualitative characteristic, a quantitative characteristic and a remark information characteristic;
the keyword extraction module is used for extracting keywords from the remark information characteristics corresponding to each reimbursement note information in the reimbursement note information set and performing duplication removal processing on the extracted keywords to obtain a keyword word set;
the keyword weight matching module is used for calculating the keyword frequency of each corresponding keyword in the keyword word set appearing in the remark information characteristic corresponding to each reimbursement note information, and calculating the weight value corresponding to each keyword according to the keyword frequency;
the weight variable matching module is used for matching the remark information characteristics of the reimbursement note information set according to the keyword word set and the weight value of each keyword to obtain the weight variable characteristics corresponding to the reimbursement note information;
the comprehensive characteristic extraction module is used for carrying out comprehensive characteristic extraction on the qualitative characteristics and the weight variable characteristics of the reimbursement bill information to obtain comprehensive variable characteristics of each reimbursement bill information in the reimbursement bill information set;
and the abnormality detection module is used for identifying abnormal reimbursement bill information in the reimbursement bill information set through a preset abnormality detection model according to the comprehensive variable characteristics and the quantitative characteristics.
In a third aspect, the present application also provides a computer device, which includes a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the steps of the reimbursement order anomaly detection method as described above.
In a fourth aspect, the present application further provides a storage medium having a computer program stored thereon, where the computer program is executed by a processor to implement the steps of the reimbursement note anomaly detection method as described above.
In the application, keywords are extracted from the remark information characteristics in the reimbursement bill information, and the weight value of each keyword is calculated. The remark information characteristics are then converted into corresponding weight variable characteristics according to the keywords and their weight values. Comprehensive characteristic extraction is performed on the qualitative characteristics and the weight variable characteristics of the reimbursement bill information to obtain the corresponding comprehensive variable characteristics. An anomaly detection model then examines the comprehensive variable characteristics and the quantitative characteristics of the reimbursement bill information and identifies the abnormal reimbursement bill information. This addresses the problems that reimbursement bill data are of complex, mixed types, are difficult to use directly for model-based data analysis, and make abnormal reimbursement bills hard to identify.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show some embodiments of the present application, and those skilled in the art can obtain other drawings based on them without creative effort.
FIG. 1 is a schematic flowchart illustrating steps of a reimbursement bill anomaly detection method according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating steps corresponding to one embodiment of step S11 of FIG. 1;
FIG. 3 is a flowchart illustrating steps corresponding to one embodiment of step S13 of FIG. 1;
FIG. 4 is a schematic block diagram of a reimbursement bill anomaly detection device according to an embodiment of the present application;
FIG. 5 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application.
The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The flow diagrams depicted in the figures are merely illustrative and do not necessarily include all of the elements and operations/steps, nor do they necessarily have to be performed in the order depicted. For example, some operations/steps may be decomposed, combined or partially combined, so that the actual execution sequence may be changed according to the actual situation. In addition, although the division of the functional blocks is made in the device diagram, in some cases, it may be divided in blocks different from those in the device diagram.
The embodiments of the application provide a reimbursement bill anomaly detection method, a using method, a device, equipment and a storage medium. The reimbursement bill anomaly detection method can be applied to terminal equipment or a server, wherein the terminal equipment can be electronic equipment such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant or a wearable device; the server may be a single server or a server cluster including a plurality of servers. The method is explained below by taking an example in which it is applied to a server.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating a step of a method for detecting an exception of a reimbursement bill according to an embodiment of the present disclosure.
As shown in FIG. 1, the reimbursement bill anomaly detection method includes steps S10 to S15.
Step S10, acquiring a reimbursement bill information set to be processed, wherein the reimbursement bill information set comprises a plurality of reimbursement bill information, and each reimbursement bill information comprises a qualitative characteristic, a quantitative characteristic and a remark information characteristic.
It can be understood that if the method for detecting the abnormal reimbursement bill is implemented by the server, the reimbursement bill information can be directly obtained from the database for the server. If the method for detecting the abnormal reimbursement bill is executed by the terminal equipment, reimbursement bill information can be obtained by sending a network request to the server for the terminal equipment. The reimbursement bill information set is a set formed by a plurality of pieces of acquired reimbursement bill information.
The reimbursement bill information includes qualitative characteristics, quantitative characteristics and remark information characteristics. The quantitative characteristic may comprise a plurality of sub-characteristics that characterize the numerical data of the reimbursement bill information. The qualitative characteristic may comprise a plurality of sub-characteristics that characterize the category data of the reimbursement bill information. The remark information characteristic may comprise a plurality of sub-characteristics that characterize the text data of the reimbursement bill information.
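For example, assume that a piece of reimbursement bill information includes the data shown in Table 1:
Table 1: [provided as an image in the original publication and not reproduced here; the fields discussed in the text include the amount (yuan), the travel distance (kilometers), the vehicle and the remark]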
In the above reimbursement bill information, the values corresponding to "amount (yuan)" and "travel distance (kilometers)" are numerical data and correspond to the quantitative characteristics of the reimbursement bill information. The value corresponding to "vehicle" is generally a category such as airplane, high-speed rail or ship, and corresponds to the qualitative characteristics of the reimbursement bill information. The value corresponding to "remark" is the explanatory text filled in on the reimbursement bill by the user, and corresponds to the remark information characteristic of the reimbursement bill information.
In some embodiments, the qualitative characteristics include the business trip type, origin, destination, vehicle, cabin class, personnel grade and invoice status; the quantitative characteristics include the reimbursement amount and the distance traveled; and the remark information characteristics include the remarks.
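As an illustration of how the three groups of characteristics might be held together in code, the following is a minimal sketch. The class name, field names and example values are all hypothetical; the source only lists which items belong to each group.

```python
from dataclasses import dataclass


@dataclass
class ReimbursementRecord:
    """One piece of reimbursement bill information (all field names are hypothetical)."""

    # Qualitative (category) characteristics
    trip_type: str
    origin: str
    destination: str
    vehicle: str
    cabin_class: str
    personnel_grade: str
    invoice_status: str
    # Quantitative (numerical) characteristics
    amount_yuan: float
    distance_km: float
    # Remark information characteristic (free text)
    remark: str

    def qualitative(self) -> dict:
        """Return the category-valued fields as a name -> label mapping."""
        return {
            "trip_type": self.trip_type,
            "origin": self.origin,
            "destination": self.destination,
            "vehicle": self.vehicle,
            "cabin_class": self.cabin_class,
            "personnel_grade": self.personnel_grade,
            "invoice_status": self.invoice_status,
        }

    def quantitative(self) -> list:
        """Return the numerical fields as a list of floats."""
        return [self.amount_yuan, self.distance_km]


# Made-up example values, for illustration only.
example = ReimbursementRecord(
    trip_type="field work", origin="Shenzhen", destination="Shanghai",
    vehicle="airplane", cabin_class="economy", personnel_grade="staff",
    invoice_status="submitted", amount_yuan=1200.0, distance_km=1100.0,
    remark="fare for the round trip to the station",
)
```

Keeping the groups separable matters because the later steps treat them differently: the remark is turned into a keyword, the category fields are reduced to numerical values, and the numerical fields are passed to the detection model unchanged.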
Step S11, extracting keywords from the remark information characteristics corresponding to each reimbursement note information in the reimbursement note information set, and performing de-duplication processing on the extracted keywords to obtain a keyword word set.
It will be appreciated that the reimbursement order information set has a plurality of reimbursement order information, each reimbursement order information having a corresponding remark information characteristic. And sequentially extracting keywords from the remark information characteristics corresponding to each reimbursement information of the reimbursement note information set, and performing de-duplication processing on the extracted keywords to obtain a keyword word set.
As shown in fig. 2, in some embodiments, step S11 includes steps S110 to S112.
Step S110, performing word segmentation processing on the remark information characteristics corresponding to each reimbursement note information in the reimbursement note information set to obtain a first word set;
Step S111, filtering the first word set through a preset keyword filter to obtain a second word set;
Step S112, carrying out keyword de-duplication processing on the second word set to obtain a keyword word set.
It can be understood that the remark information characteristics of the reimbursement bill information in the reimbursement bill information set are segmented in sequence, and the resulting words are placed together into one array, which gives the first word set. In some embodiments, the remark information characteristic of the reimbursement bill information may be tokenized using NLP (Natural Language Processing) techniques.
The keyword filter is provided with useless words which need to be filtered out. According to the keyword filter, useless phrases in the first word set can be filtered out, so that a second word set is obtained. And carrying out duplication removal processing on the second word set to obtain a keyword word set.
For example, assume that the reimbursement bill information set contains two pieces of reimbursement bill information. The remark of one is: "Went out on field work today, this is the fare for the round trip to the station, haha", and the remark of the other is: "This is the taxi cost for today's field work".
After the remark information characteristics corresponding to the two pieces of reimbursement bill information are segmented, the obtained first word set is ["today", "go", "field work", "got", "this", "is", "round-trip station", "fare", "haha", "this", "is", "today", "field work", "taxi", "cost"].
Assume that the useless words set in the keyword filter that need to be filtered out include: "go", "got", "this", "is", "was", "haha", "today" and "cost".
Filtering the first word set through the keyword filter then gives the second word set: ["field work", "round-trip station", "fare", "field work", "taxi"].
Keyword de-duplication is then performed on the second word set, which removes the repeated keyword "field work" and gives the keyword word set: ["round-trip station", "fare", "field work", "taxi"].
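Steps S110 to S112 can be sketched as follows. This is a minimal illustration that assumes the remarks have already been segmented into tokens (for Chinese text this would be done by a word segmentation tool, which is not shown); the function name and the English tokens are illustrative only.

```python
def build_keyword_set(token_lists, useless_words):
    """Pool tokens from all remarks, filter useless words, then de-duplicate."""
    # S110: put the tokens of every remark into one array (first word set).
    first_word_set = [token for tokens in token_lists for token in tokens]
    # S111: drop the words listed in the keyword filter (second word set).
    second_word_set = [t for t in first_word_set if t not in useless_words]
    # S112: remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(second_word_set))


remark_tokens = [
    ["today", "go", "field work", "got", "this", "is",
     "round-trip station", "fare", "haha"],
    ["this", "is", "today", "field work", "taxi", "cost"],
]
useless = {"go", "got", "this", "is", "was", "haha", "today", "cost"}

keyword_set = build_keyword_set(remark_tokens, useless)
```

The sketch keeps keywords in first-seen order, so the list order can differ from the order written in the text; since the result is used as a set, the order carries no meaning.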
Step S12, calculating the keyword frequency of each corresponding keyword in the keyword word set appearing in the remark information characteristic corresponding to each reimbursement note information, and calculating the weight value corresponding to each keyword according to the keyword frequency.
It can be understood that the keyword frequency corresponding to a keyword may be the number of pieces of reimbursement bill information in the reimbursement bill information set whose remark information characteristic contains the keyword. The weight value corresponding to each keyword can then be calculated by combining this frequency with the total number of pieces of reimbursement bill information in the set.
In some embodiments, the weight value corresponding to a keyword may be obtained by calculating the inverse document frequency (IDF) of the keyword. For a keyword w, the IDF value is calculated as IDF(w) = log(D / Dw), where D is the number of pieces of reimbursement bill information in the reimbursement bill information set, and Dw is the number of pieces of reimbursement bill information in the set whose remark information characteristic contains the keyword w, that is, the keyword frequency corresponding to the keyword. The calculated IDF value is the weight value corresponding to the keyword w. It can be understood that the fewer pieces of reimbursement bill information in the set contain the keyword w in their remark information characteristics, the larger the IDF value of the keyword, the stronger its category-distinguishing ability, and the larger its weight value.
For example, assume that the number of pieces of reimbursement bill information in the reimbursement bill information set is 10,000,000, and that the keyword "round-trip station" appears in the remark information characteristics of 1,000 of them. The keyword frequency corresponding to "round-trip station" is then 1,000, and its IDF value is log(10,000,000 / 1,000) = 4, that is, the weight value corresponding to the keyword "round-trip station" is 4.
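The weight calculation of step S12 can be sketched as below. The logarithm base is an assumption: the source writes only "log", but the worked example (a ratio of 10,000 giving 4) is consistent with base 10, so `math.log10` is used. The function names are hypothetical.

```python
import math


def idf(total_records, records_with_keyword):
    """Inverse document frequency log10(D / Dw); base 10 is assumed from the example."""
    return math.log10(total_records / records_with_keyword)


def keyword_weights(keyword_set, remark_token_lists):
    """Map each keyword to its IDF weight over a list of tokenized remarks."""
    total = len(remark_token_lists)
    weights = {}
    for keyword in keyword_set:
        # Keyword frequency Dw: number of remarks that contain the keyword.
        frequency = sum(1 for tokens in remark_token_lists if keyword in tokens)
        if frequency > 0:  # keywords that never occur get no weight
            weights[keyword] = idf(total, frequency)
    return weights


example_weight = idf(10_000_000, 1_000)
```

With this definition a keyword that appears in every remark gets weight 0, which matches the idea that such a keyword has no category-distinguishing ability.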
And step S13, matching the remark information characteristics of the reimbursement note information set according to the keyword word set and the weight value of each keyword to obtain the weight variable characteristics corresponding to the reimbursement note information.
It can be understood that, for each piece of reimbursement bill information in the set, the corresponding remark information characteristic may contain several keywords from the keyword word set. One of these keywords can be selected, according to the weight values of the keywords, as the weight variable characteristic of that remark information characteristic. Because the weight variable characteristic is one of the keywords in the keyword word set, it is category data, and its nature is equivalent to that of a qualitative characteristic.
As shown in fig. 3, in some embodiments, step S13 includes steps S130 through S132.
Step S130, acquiring the reimbursement bill information in the reimbursement bill information set in sequence;
Step S131, according to the keyword word set, performing keyword matching on the remark information characteristics of the reimbursement bill information to obtain a keyword matching word set corresponding to the reimbursement bill information;
Step S132, according to the weight value corresponding to each keyword, acquiring the keyword with the maximum weight value in the keyword matching word set to obtain the weight variable characteristic corresponding to the reimbursement bill information.
It will be appreciated that the remark information of the reimbursement note information is characterized by text-type data such as remarks. Through keyword matching, which keywords in the keyword word set are contained in the remark information characteristics can be known, and the keyword matching word set corresponding to the reimbursement note information is obtained according to an array formed by the matched keywords.
And each keyword has a corresponding weight value, and the keyword with the maximum weight value is selected from the keyword matching word set, so that the weight variable characteristic corresponding to the reimbursement bill information is obtained.
Illustratively, assume that the keyword word set is ["round-trip station", "fare", "parking"], that the weight value corresponding to the keyword "round-trip station" is 10, that the weight value corresponding to the keyword "fare" is 7, and that the weight value corresponding to the keyword "parking" is 2.
Suppose the remark information characteristic corresponding to one piece of reimbursement bill information in the set is the remark: "this is the cost of the round-trip station and also the parking".
Matching the remark information characteristic against the keyword word set gives the keyword matching word set ["round-trip station", "parking"]. Because "round-trip station" has the largest weight value in the keyword matching word set, "round-trip station" is selected as the weight variable characteristic corresponding to the reimbursement bill information.
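Steps S131 and S132 can be sketched as follows. This minimal version assumes that keyword matching is a plain substring test on the remark text, and it returns `None` when no keyword matches; the source does not specify either choice, and the function name is hypothetical.

```python
def weight_variable_feature(remark, weights):
    """Return the matched keyword with the largest weight, or None if none match."""
    # S131: keep the keywords that occur in the remark (substring match assumed).
    matched = [keyword for keyword in weights if keyword in remark]
    if not matched:
        return None  # assumption: the source does not cover the no-match case
    # S132: pick the matched keyword whose weight value is largest.
    return max(matched, key=weights.get)


weights = {"round-trip station": 10, "fare": 7, "parking": 2}
remark = "this is the cost of the round-trip station and also the parking"
feature = weight_variable_feature(remark, weights)
```

Here `feature` is a keyword string, so downstream it is handled like any other category label.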
Step S14, carrying out comprehensive feature extraction on the qualitative features and the weight variable features of the reimbursement bill information to obtain comprehensive variable features of each reimbursement bill information in the reimbursement bill information set.
It will be appreciated that the composite variable signature may comprise a plurality of sub-signatures, and that the resulting composite variable signature is a numerical class of data, the nature of which equates to a quantitative signature.
In some embodiments, the comprehensive characteristic extraction is performed on the qualitative characteristic and the weighted variable characteristic of the reimbursement bill information, and may be performed by using a principal component analysis method to obtain a comprehensive variable characteristic corresponding to the reimbursement bill information. The principal component analysis method can refine M-dimensional category data into N-dimensional numerical data, wherein N < M. That is, the qualitative characteristics and the weighted variable characteristics corresponding to the reimbursement bill information in the reimbursement bill information set can be subjected to dimensionality reduction by the principal component analysis method to obtain the comprehensive variable characteristics.
In some embodiments, the comprehensive characteristic extraction on the qualitative characteristics and the weight variable characteristics of the reimbursement bill information may also be performed using a Multiple Correspondence Analysis (MCA) algorithm to obtain the comprehensive variable characteristics corresponding to the reimbursement bill information. The principle of multiple correspondence analysis is similar to that of principal component analysis: it also reduces the dimensionality of multidimensional variables.
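The dimensionality reduction of step S14 can be illustrated with a small pure-Python sketch. It assumes the category labels are first one-hot encoded (the source does not say how categories are made numerical) and it extracts only the first principal component, using power iteration rather than a library routine. All names are hypothetical.

```python
import math
import random


def one_hot(rows):
    """Turn rows of category labels into 0/1 indicator vectors (assumed encoding)."""
    columns = sorted({(j, value) for row in rows for j, value in enumerate(row)})
    position = {column: k for k, column in enumerate(columns)}
    encoded = []
    for row in rows:
        vector = [0.0] * len(columns)
        for j, value in enumerate(row):
            vector[position[(j, value)]] = 1.0
        encoded.append(vector)
    return encoded


def first_principal_component(matrix, iterations=100, seed=0):
    """Project each row onto the leading principal component (power iteration)."""
    n, d = len(matrix), len(matrix[0])
    means = [sum(row[k] for row in matrix) / n for k in range(d)]
    centered = [[row[k] - means[k] for k in range(d)] for row in matrix]
    # Random start: a constant vector would be cancelled by centered one-hot data.
    rng = random.Random(seed)
    direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
    for _ in range(iterations):
        # Multiply by X^T X (the covariance matrix up to a constant factor).
        scores = [sum(x * v for x, v in zip(row, direction)) for row in centered]
        updated = [sum(s * row[k] for s, row in zip(scores, centered)) for k in range(d)]
        norm = math.sqrt(sum(u * u for u in updated))
        if norm == 0.0:
            raise ValueError("data has no variance along the current direction")
        direction = [u / norm for u in updated]
    return [sum(x * v for x, v in zip(row, direction)) for row in centered]


# Each row: (vehicle, weight variable feature); labels are illustrative.
rows = [
    ("airplane", "round-trip station"),
    ("airplane", "round-trip station"),
    ("train", "fare"),
    ("train", "fare"),
]
scores = first_principal_component(one_hot(rows))
```

In this example four indicator dimensions are reduced to one numerical value per record, and records with identical labels receive identical values. The sign of a principal component is arbitrary, so only relative positions are meaningful.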
And step S15, according to the comprehensive variable characteristics and the quantitative characteristics, recognizing abnormal reimbursement bill information in the reimbursement bill information set through a preset abnormal detection model.
It can be understood that the anomaly detection model is a classification model obtained by pre-training on a reimbursement bill training data set using a linear regression classification algorithm or an isolation forest algorithm. The anomaly detection model can also be trained with other classification algorithms, and the specific algorithm can be selected as needed.
And in the process of carrying out abnormity detection on the reimbursement bill information in the reimbursement bill information set by the abnormity detection model, judging according to the comprehensive variable characteristics and the quantitative characteristics of the reimbursement bill information to identify whether the reimbursement bill information is abnormal or not.
It can be understood that the anomaly detection model needs numerical input variables in order to identify whether reimbursement bill information is abnormal. Because the comprehensive variable characteristics and the quantitative characteristics corresponding to the reimbursement bill information are both numerical data, the abnormal reimbursement bill information in the reimbursement bill information set can be identified through the anomaly detection model.
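To make the idea of detecting anomalies from purely numerical inputs concrete, the following is a simplified, isolation-style scoring sketch in pure Python. It is not the pre-trained model described in the source and not a full isolation forest (no subsampling, no shared trees, no path-length normalization): it only measures how many random splits are needed to separate a record from the others, on the assumption that abnormal records are separated quickly. The function names, the threshold rule and the data are hypothetical.

```python
import random


def average_path_length(point, data, trials=200, max_depth=50, seed=0):
    """Average number of random axis-aligned splits needed to isolate `point`."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        subset, depth = data, 0
        while len(subset) > 1 and depth < max_depth:
            # Only features that still vary inside the subset can be split.
            varying = [f for f in range(len(point))
                       if min(r[f] for r in subset) != max(r[f] for r in subset)]
            if not varying:
                break
            f = rng.choice(varying)
            low = min(r[f] for r in subset)
            high = max(r[f] for r in subset)
            split = rng.uniform(low, high)
            # Keep only the records that fall on the same side as the point.
            subset = [r for r in subset if (r[f] < split) == (point[f] < split)]
            depth += 1
        total += depth
    return total / trials


def flag_anomalies(data, ratio=0.5):
    """Flag records whose path length is below `ratio` times the mean (assumed rule)."""
    lengths = [average_path_length(p, data) for p in data]
    mean_length = sum(lengths) / len(lengths)
    return [length < ratio * mean_length for length in lengths]


# Each row: [comprehensive variable feature, amount (yuan), distance (km)]; made-up data.
records = [
    [0.0, 100.0, 10.0], [0.1, 110.0, 11.0], [0.2, 105.0, 12.0],
    [0.1, 95.0, 9.0], [0.0, 102.0, 10.5], [0.2, 98.0, 11.5],
    [0.1, 101.0, 10.2], [5.0, 5000.0, 900.0],
]
path_lengths = [average_path_length(p, records) for p in records]
flags = flag_anomalies(records)
```

The last record lies far from the others on every axis, so almost any first split separates it and its average path length stays close to 1, while the clustered records need several splits each.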
In some embodiments, the method further includes steps S20 to S23.
And step S20, acquiring abnormal comprehensive variable characteristics and abnormal quantitative characteristics corresponding to the abnormal reimbursement note information.
It can be understood that the abnormal reimbursement note information is reimbursement note information in which the abnormal state is detected from the reimbursement note information set through the abnormal detection model. And the abnormal comprehensive variable characteristic of the abnormal reimbursement note information is the comprehensive variable characteristic corresponding to the abnormal reimbursement note information. And the abnormal quantitative characteristic of the abnormal reimbursement note information is the quantitative characteristic corresponding to the abnormal reimbursement note information.
Step S21, calculating a matching value of the abnormal reimbursement bill information and the non-abnormal reimbursement bill information in the reimbursement bill information set according to the abnormal comprehensive variable characteristics and the abnormal quantitative characteristics, wherein the non-abnormal reimbursement bill information is the reimbursement bill information except the abnormal reimbursement bill information in the reimbursement bill information set.
It can be understood that the abnormal reimbursement note information and the non-abnormal reimbursement note information both have corresponding comprehensive variable characteristics and quantitative characteristics, and the comprehensive variable characteristics and the quantitative characteristics are numerical data. Therefore, the matching value of the abnormal reimbursement note information and other non-abnormal reimbursement note information in the reimbursement note information set can be calculated according to the comprehensive variable characteristics and quantitative characteristics corresponding to the abnormal reimbursement note information.
In some embodiments, a K-Nearest Neighbor (KNN) algorithm may be used to calculate, from the comprehensive variable features and the quantitative features, a distance value between the abnormal reimbursement bill information and each non-abnormal reimbursement bill information in the reimbursement bill information set. A smaller distance value indicates that the non-abnormal reimbursement bill information is closer to the abnormal reimbursement bill information. In this embodiment, the distance value is the matching value.
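The distance calculation described above can be sketched as follows. The choice of Euclidean distance, the function name and the sample vectors are assumptions for illustration; the application does not specify the distance metric.

```python
import math

def matching_values(abnormal, non_abnormal):
    """Euclidean distance from one abnormal feature vector to each non-abnormal one.

    Each vector is (comprehensive variable feature, quantitative features...).
    """
    return [math.dist(abnormal, other) for other in non_abnormal]

# Hypothetical, already-scaled feature vectors.
abnormal_bill = (0.9, 0.8)
other_bills = [(0.9, 0.7), (0.1, 0.2), (0.8, 0.8)]
distances = matching_values(abnormal_bill, other_bills)
```

Because an amount in currency units would dominate a feature on a 0 to 1 scale, the features would probably need to be scaled to comparable ranges before the distances are meaningful.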
Step S22, screening the reimbursement bill information with the matching value meeting the preset condition in the reimbursement bill information set to obtain a matching reimbursement bill information set corresponding to the abnormal reimbursement bill.
In some embodiments, screening the reimbursement bill information whose matching value meets the preset condition may mean taking, from the reimbursement bill information set, the reimbursement bill information whose matching value falls within a preset range as the matching reimbursement bill information set corresponding to the abnormal reimbursement bill.
In some embodiments, the reimbursement bill information in the reimbursement bill information set may instead be sorted by matching value, and a preset number of reimbursement bill information taken from the sorted result as the matching reimbursement bill information set corresponding to the abnormal reimbursement bill.
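The two screening variants can be sketched as below. The function names and sample values are hypothetical, and the sketch assumes that the matching value is a distance, so that smaller values rank first.

```python
def within_range(bills, values, max_value):
    """Keep the bills whose matching value does not exceed a preset bound."""
    return [b for b, v in zip(bills, values) if v <= max_value]

def nearest_n(bills, values, n):
    """Sort by matching value (ascending) and keep a preset number of bills."""
    order = sorted(range(len(bills)), key=lambda i: values[i])
    return [bills[i] for i in order[:n]]

# Hypothetical bill identifiers and their matching values.
bill_ids = ["B1", "B2", "B3", "B4"]
values = [0.30, 1.20, 0.10, 0.55]
by_range = within_range(bill_ids, values, 0.5)
by_count = nearest_n(bill_ids, values, 3)
```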
Step S23, analyzing the abnormal reimbursement bill information and the matched reimbursement bill information set, and determining the abnormal root cause characteristics of the abnormal reimbursement bill information.
It can be understood that in the matching reimbursement bill information set, the data structure of the reimbursement bill information is the same as that of the abnormal reimbursement bill information, and the probability that each feature in the abnormal reimbursement bill information correspondingly causes the abnormality of the reimbursement bill information can be calculated by analyzing the abnormal reimbursement bill information and the reimbursement bill information in the matching reimbursement bill information set, wherein the feature with the maximum probability is the abnormal root cause feature of the abnormal reimbursement bill information.
In some embodiments, step S23 includes steps S230 to S231.
Step S230, acquiring matching reimbursement bill information in the matching reimbursement bill information set in sequence;
step S231, analyzing the qualitative characteristics, the quantitative characteristics and the weight variable characteristics of the abnormal reimbursement bill information and the matched reimbursement bill information through a preset classification algorithm, and determining abnormal root cause characteristics of the abnormal reimbursement bill information.
In some embodiments, the preset classification algorithm is a naive Bayes classification algorithm or another classification algorithm, selected as needed.
And analyzing the qualitative characteristics, the quantitative characteristics and the weight variable characteristics corresponding to the abnormal reimbursement bill information and the reimbursement bill information in the matched reimbursement bill information set by a naive Bayesian classification algorithm, so as to obtain the probability of the reimbursement bill information abnormality caused by the corresponding qualitative characteristics, quantitative characteristics and weight variable characteristics, wherein the characteristic with the maximum corresponding probability is the abnormal root cause characteristic of the abnormal reimbursement bill information.
For example, assume that, after analysis by the naive Bayes classification algorithm, the probabilities that the qualitative features, the quantitative features and the remark information feature of the abnormal reimbursement bill information cause the abnormality are as shown in Table 2 below:
Table 2: (provided as an image, Figure BDA0003282553690000101, in the original publication)
In the abnormal reimbursement bill information, the probability corresponding to "vehicle" among the qualitative features is the largest; therefore, "vehicle" is the abnormal root cause feature that causes the abnormality of the reimbursement bill.
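The application does not spell out how the per-feature probabilities are derived. One possible reading, sketched below, applies Bayes' rule to each feature separately (the naive independence assumption) with Laplace smoothing, treating the abnormal bill as one class and the matched bills as the other. The function name, the smoothing constant, the sample records and the resulting numbers are all illustrative assumptions and do not reproduce Table 2.

```python
from collections import Counter

def root_cause_ranking(abnormal, matched, alpha=1.0):
    """Rank features by P(abnormal | feature value), highest first.

    abnormal: dict mapping feature name to value for the flagged bill
    matched:  list of dicts with the same keys (similar non-abnormal bills)
    alpha:    Laplace smoothing constant (an assumed choice)
    """
    prior_abn = 1.0 / (1 + len(matched))
    prior_norm = 1.0 - prior_abn
    posterior = {}
    for feature, value in abnormal.items():
        distinct = {value} | {m[feature] for m in matched}
        k = len(distinct)
        counts = Counter(m[feature] for m in matched)
        # Smoothed likelihood of this value within each class.
        like_norm = (counts[value] + alpha) / (len(matched) + alpha * k)
        like_abn = (1 + alpha) / (1 + alpha * k)
        evidence = like_abn * prior_abn + like_norm * prior_norm
        posterior[feature] = like_abn * prior_abn / evidence
    return sorted(posterior.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical records using categorical features only.
abnormal_bill = {"vehicle": "taxi", "destination": "Shanghai", "invoice_status": "valid"}
matched_bills = [
    {"vehicle": "train", "destination": "Shanghai", "invoice_status": "valid"},
    {"vehicle": "train", "destination": "Shanghai", "invoice_status": "valid"},
    {"vehicle": "train", "destination": "Shanghai", "invoice_status": "valid"},
    {"vehicle": "train", "destination": "Beijing", "invoice_status": "valid"},
]
ranking = root_cause_ranking(abnormal_bill, matched_bills)
root_cause_feature = ranking[0][0]
```

Quantitative features such as the reimbursement amount would first have to be binned into categories (or modelled with a continuous likelihood) before they could be ranked this way.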
In some embodiments, the method further comprises: and generating abnormal report information of the reimbursement bill according to the abnormal reimbursement bill information in the to-be-processed reimbursement bill information set and the abnormal root cause characteristics corresponding to the abnormal reimbursement bill information.
The abnormal report information of the reimbursement bill records abnormal reimbursement bill information in the reimbursement bill information set and abnormal root cause characteristics corresponding to the abnormal reimbursement bill information. Through the abnormal report information of the reimbursement bill, the financial staff can correspondingly process the abnormal reimbursement bill.
In the present application, keywords are extracted from the remark information features in the reimbursement bill information, and a weight value is calculated for each keyword. The remark information features are then converted into corresponding weight variable features according to the keywords and their weight values. Comprehensive variable feature extraction is performed on the qualitative features and the weight variable features of the reimbursement bill information to obtain the comprehensive variable features corresponding to the reimbursement bill information. The anomaly detection model then uses the comprehensive variable features and the quantitative features to identify abnormal reimbursement bill information in the reimbursement bill information set. This addresses the problems that reimbursement bill data is of mixed types, is difficult to use directly for model-based data analysis, and that abnormal reimbursement bills are therefore difficult to identify.
Referring to fig. 4, fig. 4 is a schematic block diagram of an abnormal reimbursement bill detection apparatus according to an embodiment of the present disclosure.
As shown in fig. 4, the reimbursement bill abnormality detection apparatus 201 includes:
a reimbursement note information acquiring module 2011, configured to acquire a reimbursement note information set to be processed, where the reimbursement note information set includes a plurality of reimbursement note information, and each reimbursement note information includes a qualitative feature, a quantitative feature, and a remark information feature;
a keyword extraction module 2012, configured to perform keyword extraction on the remark information features corresponding to each reimbursement note information in the reimbursement note information set, and perform deduplication processing on the extracted keywords to obtain a keyword word set;
a keyword weight matching module 2013, configured to calculate a keyword frequency of each corresponding keyword in the keyword word set appearing in a remark information feature corresponding to each reimbursement note information, and calculate a weight value corresponding to each keyword according to the keyword frequency;
a weight variable matching module 2014, configured to match the remark information features of the reimbursement note information set according to the keyword word set and the weight values of the keywords, to obtain weight variable features corresponding to the reimbursement note information;
a comprehensive feature extraction module 2015, configured to perform comprehensive feature extraction on the qualitative features and the weight variable features of the reimbursement note information to obtain comprehensive variable features of each reimbursement note information in the reimbursement note information set;
and an anomaly detection module 2016 configured to identify, according to the comprehensive variable characteristic and the quantitative characteristic, the abnormal reimbursement note information in the reimbursement note information set through a preset anomaly detection model.
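As a rough illustration of what the keyword weight matching module 2013 computes, the sketch below counts how many remarks contain each keyword and turns that count into a weight. The weight formula used here (share of remarks containing the keyword) is an assumption, since this passage only states that the weight is calculated from the keyword frequency; whitespace tokenisation and all names are likewise hypothetical.

```python
from collections import Counter

def keyword_weights(remarks, keywords):
    """Map each keyword to the share of remarks that contain it (assumed formula)."""
    token_sets = [set(r.lower().split()) for r in remarks]
    freq = Counter(k for tokens in token_sets for k in keywords if k in tokens)
    total = len(remarks) or 1  # avoid division by zero on an empty set
    return {k: freq[k] / total for k in keywords}

# Hypothetical remark texts and keyword word set.
remarks = ["taxi to airport", "late taxi after overtime", "train ticket"]
weights = keyword_weights(remarks, ["taxi", "airport", "train"])
```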
In some embodiments, the keyword extraction module 2012, when performing keyword extraction on the remark information features corresponding to each of the reimbursement information in the reimbursement information set and performing deduplication processing on the extracted keywords to obtain a keyword word set, includes:
performing word segmentation on the remark information characteristics corresponding to each reimbursement note information in the reimbursement note information set to obtain a first word set;
filtering the first word set through a preset keyword filter to obtain a second word set;
and carrying out keyword duplicate removal processing on the second word set to obtain a keyword word set.
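The three sub-steps above (word segmentation, filtering, de-duplication) can be sketched as follows. Whitespace splitting stands in for a real word segmenter, and the preset keyword filter is assumed to be a stop-word list; both are simplifications for illustration.

```python
def extract_keywords(remarks, stop_words):
    """Segment remarks, drop filtered words, and de-duplicate while keeping order."""
    first_word_set = [w for remark in remarks for w in remark.lower().split()]
    second_word_set = [w for w in first_word_set if w not in stop_words]
    return list(dict.fromkeys(second_word_set))  # order-preserving de-duplication

# Hypothetical remark texts and filter list.
remarks = ["Taxi to airport", "late taxi after overtime"]
keyword_word_set = extract_keywords(remarks, {"to", "after"})
```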
In some embodiments, when the weight variable matching module 2014 matches the remark information features of the reimbursement note information set according to the keyword word set and the weight value of each keyword to obtain the weight variable features corresponding to the reimbursement note information, the weight variable matching module 2014 includes:
sequentially acquiring the reimbursement bill information in the reimbursement bill information set;
according to the keyword word set, performing keyword matching on the remark information characteristics of the reimbursement note information to obtain a keyword matching word set corresponding to the reimbursement note information;
and acquiring the keyword with the maximum weight value in the keyword matching word set according to the weight value corresponding to each keyword to obtain the weight variable characteristic corresponding to the reimbursement bill information.
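A minimal sketch of this matching step is given below. It returns both the selected keyword and its weight, because the text leaves open which of the two is stored as the weight variable feature; the fallback of (None, 0.0) for remarks without any keyword is also an assumption.

```python
def weight_variable_feature(remark, weights):
    """Return (keyword, weight) for the highest-weighted keyword found in a remark."""
    matched = [w for w in remark.lower().split() if w in weights]
    if not matched:
        return None, 0.0  # assumed fallback when no keyword matches
    best = max(matched, key=lambda w: weights[w])
    return best, weights[best]

# Hypothetical keyword weights.
weights = {"taxi": 0.6, "airport": 0.3, "overtime": 0.8}
feature_a = weight_variable_feature("late taxi after overtime", weights)
feature_b = weight_variable_feature("train ticket", weights)
```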
In some embodiments, the reimbursement bill abnormality detection apparatus 201 further includes:
the abnormal feature obtaining module 2017: the system is used for acquiring abnormal comprehensive variable characteristics and abnormal quantitative characteristics corresponding to the abnormal reimbursement note information;
anomaly feature matching module 2018: the system is used for calculating a matching value of the abnormal reimbursement bill information and non-abnormal reimbursement bill information in the reimbursement bill information set according to the abnormal comprehensive variable characteristics and the abnormal quantitative characteristics, wherein the non-abnormal reimbursement bill information is reimbursement bill information except the abnormal reimbursement bill information in the reimbursement bill information set;
matching reimbursement note obtaining module 2019: the system is used for screening the reimbursement bill information of which the matching values meet preset conditions in the reimbursement bill information set to obtain a matched reimbursement bill information set corresponding to the abnormal reimbursement bill;
abnormal root cause analysis module 2020: and the system is used for analyzing the abnormal reimbursement bill information and the matched reimbursement bill information set and determining the abnormal root cause characteristics of the abnormal reimbursement bill information.
In some embodiments, the abnormal root cause analysis module 2020, when analyzing the abnormal reimbursement bill information and the matching reimbursement bill information set to determine the abnormal root cause characteristics of the abnormal reimbursement bill information, includes:
sequentially acquiring matched reimbursement bill information in the matched reimbursement bill information set;
and analyzing the qualitative characteristics, the quantitative characteristics and the weight variable characteristics of the abnormal reimbursement bill information and the matched reimbursement bill information through a preset classification algorithm, and determining the abnormal root cause characteristics of the abnormal reimbursement bill information.
In some embodiments, the qualitative characteristics include a type of business trip, a departure location, a destination, a vehicle, a cabin class, a class of personnel, and an invoice status, the quantitative characteristics include a reimbursement amount and a distance traveled, and the remark information characteristics include remarks.
In some embodiments, the reimbursement note abnormality detection apparatus 201 further comprises an abnormality report generating module 2021 for: and generating abnormal report information of the reimbursement bill according to the abnormal reimbursement bill information in the to-be-processed reimbursement bill information set and the abnormal root cause characteristics corresponding to the abnormal reimbursement bill information.
It should be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the apparatus and each module and unit described above may refer to the corresponding processes in the foregoing embodiment of the method for detecting an exception of a reimbursement note, and are not described herein again.
The apparatus provided by the above embodiments may be implemented in the form of a computer program that can be run on a computer device as shown in fig. 5.
Referring to fig. 5, fig. 5 is a schematic block diagram of a computer device according to an embodiment of the present disclosure. The computer device includes, but is not limited to, a server.
As shown in fig. 5, the computer device 301 includes a processor 3011, a memory and a network interface connected through a system bus, where the memory may include a storage medium 3012 and an internal memory 3015, and the storage medium 3012 may be non-volatile or volatile.
The storage medium 3012 may store an operating system and computer programs. The computer program includes program instructions that, when executed, cause the processor 3011 to perform any of the reimbursement note abnormality detection methods.
Processor 3011 is used to provide computing and control capabilities, supporting the operation of the overall computer device.
The internal memory 3015 provides an environment for running a computer program on the storage medium 3012, and when the computer program is executed by the processor 3011, the processor 3011 may execute any of the reimbursement note abnormality detection methods.
The network interface is used for network communication, such as sending assigned tasks and the like. Those skilled in the art will appreciate that the architecture shown in fig. 5 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or fewer components than those shown, or may combine certain components, or have a different arrangement of components.
It should be understood that the processor 3011 may be a Central Processing Unit (CPU), and that the processor 3011 may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. The general-purpose processor may be a microprocessor or any conventional processor or the like.
In some embodiments, the processor 3011 is configured to run a computer program stored in the memory to implement the following steps:
acquiring a reimbursement bill information set to be processed, wherein the reimbursement bill information set comprises a plurality of reimbursement bill information, and each reimbursement bill information comprises a qualitative characteristic, a quantitative characteristic and a remark information characteristic;
extracting keywords from the remark information characteristics corresponding to each reimbursement information in the reimbursement note information set, and performing duplicate removal processing on the extracted keywords to obtain a keyword word set;
calculating the keyword frequency of each corresponding keyword in the keyword word set appearing in the remark information characteristic corresponding to each reimbursement note information, and calculating the weight value corresponding to each keyword according to the keyword frequency;
matching the remark information characteristics of the reimbursement note information set according to the keyword word set and the weight value of each keyword to obtain weight variable characteristics corresponding to reimbursement note information;
performing comprehensive characteristic extraction on the qualitative characteristics and the weight variable characteristics of the reimbursement bill information to obtain comprehensive variable characteristics of each reimbursement bill information in the reimbursement bill information set;
and identifying abnormal reimbursement bill information in the reimbursement bill information set through a preset abnormal detection model according to the comprehensive variable characteristics and the quantitative characteristics.
In some embodiments, the processor 3011 is configured to, when performing keyword extraction on the remark information features corresponding to each reimbursement information in the reimbursement information set, and performing deduplication processing on the extracted keywords to obtain a keyword word set, implement:
performing word segmentation on the remark information characteristics corresponding to each reimbursement note information in the reimbursement note information set to obtain a first word set;
filtering the first word set through a preset keyword filter to obtain a second word set;
and carrying out keyword duplicate removal processing on the second word set to obtain a keyword word set.
In some embodiments, the processor 3011 is configured to, when matching the remark information features of the reimbursement note information set according to the keyword word set and the weight value of each keyword to obtain a weight variable feature corresponding to the reimbursement note information, implement:
sequentially acquiring the reimbursement bill information in the reimbursement bill information set;
according to the keyword word set, performing keyword matching on the remark information characteristics of the reimbursement note information to obtain a keyword matching word set corresponding to the reimbursement note information;
and acquiring the keyword with the maximum weight value in the keyword matching word set according to the weight value corresponding to each keyword to obtain the weight variable characteristic corresponding to the reimbursement bill information.
In some embodiments, the processor 3011 is further configured to implement:
acquiring abnormal comprehensive variable characteristics and abnormal quantitative characteristics corresponding to abnormal reimbursement note information;
according to the abnormal comprehensive variable characteristics and the abnormal quantitative characteristics, calculating a matching value of the abnormal reimbursement bill information and the non-abnormal reimbursement bill information in the reimbursement bill information set, wherein the non-abnormal reimbursement bill information is the reimbursement bill information except the abnormal reimbursement bill information in the reimbursement bill information set;
screening the reimbursement bill information of which the matching values meet preset conditions in the reimbursement bill information set to obtain a matched reimbursement bill information set corresponding to the abnormal reimbursement bill;
and analyzing the abnormal reimbursement bill information and the matched reimbursement bill information set, and determining the abnormal root cause characteristics of the abnormal reimbursement bill information.
In some embodiments, the processor 3011, when analyzing the abnormal reimbursement note information and the matching reimbursement note information set to determine an abnormal root cause characteristic of the abnormal reimbursement note information, is configured to:
sequentially acquiring matched reimbursement bill information in the matched reimbursement bill information set;
and analyzing the qualitative characteristics, the quantitative characteristics and the weight variable characteristics of the abnormal reimbursement bill information and the matched reimbursement bill information through a preset classification algorithm, and determining the abnormal root cause characteristics of the abnormal reimbursement bill information.
In some embodiments, the qualitative characteristics include a type of business trip, a departure location, a destination, a vehicle, a cabin class, a class of personnel, and an invoice status, the quantitative characteristics include a reimbursement amount and a distance traveled, and the remark information characteristics include remarks.
In some embodiments, the processor 3011 is further configured to implement: and generating abnormal report information of the reimbursement bill according to the abnormal reimbursement bill information in the to-be-processed reimbursement bill information set and the abnormal root cause characteristics corresponding to the abnormal reimbursement bill information.
It should be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the computer device described above may refer to the corresponding process in the foregoing embodiment of the reimbursement note abnormality detection method, and details are not described herein again.
The embodiment of the present application further provides a storage medium, where the storage medium is a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, where the computer program includes program instructions, and a method implemented when the program instructions are executed may refer to various embodiments of the method for detecting an exception of a reimbursement note in the present application.
The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiment, for example, a hard disk or a memory of the computer device. The computer readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the computer device.
It is to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments. While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto, and those skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for detecting an abnormal reimbursement bill is characterized by comprising the following steps:
acquiring a reimbursement bill information set to be processed, wherein the reimbursement bill information set comprises a plurality of reimbursement bill information, and each reimbursement bill information comprises a qualitative characteristic, a quantitative characteristic and a remark information characteristic;
extracting keywords from the remark information characteristics corresponding to each reimbursement information in the reimbursement note information set, and performing duplicate removal processing on the extracted keywords to obtain a keyword word set;
calculating the keyword frequency of each corresponding keyword in the keyword word set appearing in the remark information characteristic corresponding to each reimbursement note information, and calculating the weight value corresponding to each keyword according to the keyword frequency;
matching the remark information characteristics of the reimbursement note information set according to the keyword word set and the weight value of each keyword to obtain weight variable characteristics corresponding to reimbursement note information;
performing comprehensive characteristic extraction on the qualitative characteristics and the weight variable characteristics of the reimbursement bill information to obtain comprehensive variable characteristics of each reimbursement bill information in the reimbursement bill information set;
and identifying abnormal reimbursement bill information in the reimbursement bill information set through a preset abnormal detection model according to the comprehensive variable characteristics and the quantitative characteristics.
2. The method according to claim 1, wherein the extracting the keyword from the remark information feature corresponding to each of the reimbursement information in the reimbursement information set, and performing de-duplication processing on the extracted keyword to obtain a keyword word set comprises:
performing word segmentation on the remark information characteristics corresponding to each reimbursement note information in the reimbursement note information set to obtain a first word set;
filtering the first word set through a preset keyword filter to obtain a second word set;
and carrying out keyword duplicate removal processing on the second word set to obtain a keyword word set.
3. The method of claim 2, wherein the matching the remark information characteristics of the reimbursement note information set according to the keyword word set and the weight value of each keyword to obtain the weight variable characteristics corresponding to the reimbursement note information comprises:
sequentially acquiring the reimbursement bill information in the reimbursement bill information set;
according to the keyword word set, performing keyword matching on the remark information characteristics of the reimbursement note information to obtain a keyword matching word set corresponding to the reimbursement note information;
and acquiring the keyword with the maximum weight value in the keyword matching word set according to the weight value corresponding to each keyword to obtain the weight variable characteristic corresponding to the reimbursement bill information.
4. The method according to any one of claims 1-3, further comprising:
acquiring abnormal comprehensive variable characteristics and abnormal quantitative characteristics corresponding to abnormal reimbursement note information;
according to the abnormal comprehensive variable characteristics and the abnormal quantitative characteristics, calculating a matching value of the abnormal reimbursement bill information and the non-abnormal reimbursement bill information in the reimbursement bill information set, wherein the non-abnormal reimbursement bill information is the reimbursement bill information except the abnormal reimbursement bill information in the reimbursement bill information set;
screening the reimbursement bill information of which the matching values meet preset conditions in the reimbursement bill information set to obtain a matched reimbursement bill information set corresponding to the abnormal reimbursement bill;
and analyzing the abnormal reimbursement bill information and the matched reimbursement bill information set, and determining the abnormal root cause characteristics of the abnormal reimbursement bill information.
5. The method of claim 4, wherein analyzing the abnormal reimbursement note information and the set of matching reimbursement note information to determine abnormal root cause characteristics of the abnormal reimbursement note information comprises:
sequentially acquiring matched reimbursement bill information in the matched reimbursement bill information set;
and analyzing the qualitative characteristics, the quantitative characteristics and the weight variable characteristics of the abnormal reimbursement bill information and the matched reimbursement bill information through a preset classification algorithm, and determining the abnormal root cause characteristics of the abnormal reimbursement bill information.
6. The method of claim 5, wherein the qualitative characteristics include a type of business trip, a departure location, a destination, a vehicle, a cabin class, a class of personnel, and an invoice status, wherein the quantitative characteristics include a reimbursement amount and a distance traveled, and wherein the remark information characteristics include a remark.
7. The method of claim 6, further comprising:
and generating abnormal report information of the reimbursement bill according to the abnormal reimbursement bill information in the to-be-processed reimbursement bill information set and the abnormal root cause characteristics corresponding to the abnormal reimbursement bill information.
8. An apparatus for detecting an abnormality of a reimbursement bill, the apparatus comprising:
the reimbursement bill information acquisition module is used for acquiring a reimbursement bill information set to be processed, wherein the reimbursement bill information set comprises a plurality of pieces of reimbursement bill information, and each piece of reimbursement bill information comprises a qualitative characteristic, a quantitative characteristic and a remark information characteristic;
the keyword extraction module is used for extracting keywords from the remark information characteristics corresponding to each piece of reimbursement bill information in the reimbursement bill information set, and performing de-duplication processing on the extracted keywords to obtain a keyword word set;
the keyword weight matching module is used for calculating the keyword frequency with which each keyword in the keyword word set appears in the remark information characteristics corresponding to the reimbursement bill information, and calculating the weight value corresponding to each keyword according to the keyword frequency;
the weight variable matching module is used for matching the remark information characteristics of the reimbursement bill information set according to the keyword word set and the weight value of each keyword to obtain the weight variable characteristics corresponding to the reimbursement bill information;
the comprehensive characteristic extraction module is used for carrying out comprehensive characteristic extraction on the qualitative characteristics and the weight variable characteristics of the reimbursement bill information to obtain comprehensive variable characteristics of each reimbursement bill information in the reimbursement bill information set;
and the abnormality detection module is used for identifying abnormal reimbursement bill information in the reimbursement bill information set through a preset abnormality detection model according to the comprehensive variable characteristics and the quantitative characteristics.
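The keyword extraction, keyword weight, and weight variable modules of claim 8 can be sketched as follows. The claim does not fix the extraction method or the weight formula, so everything concrete here is an assumption: keywords are lowercase whitespace tokens, a keyword's frequency is the number of remarks containing it, its weight is that frequency divided by the total over all keywords, and a bill's weight variable is the sum of the weights of the keywords found in its remark. The comprehensive feature extraction and anomaly detection modules are not covered.

```python
from collections import Counter

def extract_keywords(remark):
    """Hypothetical extractor: lowercase whitespace tokens with edge punctuation stripped."""
    tokens = (tok.strip(".,;").lower() for tok in remark.split())
    return [tok for tok in tokens if tok]

def keyword_weights(remarks):
    """Build the de-duplicated keyword set and an assumed frequency-based weight per keyword."""
    per_remark = [set(extract_keywords(r)) for r in remarks]
    vocabulary = sorted(set().union(*per_remark))          # de-duplicated keyword set
    freq = Counter(k for kws in per_remark for k in kws)   # remarks containing each keyword
    total = sum(freq.values())
    return {k: freq[k] / total for k in vocabulary}

def weight_variable(remark, weights):
    """Assumed weight variable: sum of the weights of the keywords present in the remark."""
    return sum(weights.get(k, 0.0) for k in set(extract_keywords(remark)))

# Hypothetical remark fields from three reimbursement bills.
remarks = ["client visit", "client visit", "urgent flight"]
weights = keyword_weights(remarks)
features = [weight_variable(r, weights) for r in remarks]
```

With these inputs the two "client visit" remarks receive the same, higher weight variable, while the remark made of rarer keywords receives a lower one, which is the kind of numeric signal the downstream anomaly detection module could consume alongside the other characteristics.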
9. A computer device comprising a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, carries out the steps of the reimbursement bill abnormality detection method according to any one of claims 1 to 7.
10. A computer-readable storage medium, having stored thereon a computer program, wherein the computer program, when executed by a processor, carries out the steps of the reimbursement bill abnormality detection method according to any one of claims 1 to 7.
CN202111137345.8A 2021-09-27 2021-09-27 Annunciation bill abnormity detection method, using method, device, equipment and storage medium Pending CN113850331A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111137345.8A CN113850331A (en) 2021-09-27 2021-09-27 Annunciation bill abnormity detection method, using method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111137345.8A CN113850331A (en) 2021-09-27 2021-09-27 Annunciation bill abnormity detection method, using method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113850331A true CN113850331A (en) 2021-12-28

Family

ID=78980141

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111137345.8A Pending CN113850331A (en) 2021-09-27 2021-09-27 Annunciation bill abnormity detection method, using method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113850331A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114495137A (en) * 2022-04-15 2022-05-13 深圳高灯计算机科技有限公司 Bill abnormity detection model generation method and bill abnormity detection method
CN114495137B (en) * 2022-04-15 2022-08-02 深圳高灯计算机科技有限公司 Bill abnormity detection model generation method and bill abnormity detection method

Similar Documents

Publication Publication Date Title
US8898092B2 (en) Leveraging user-to-tool interactions to automatically analyze defects in it services delivery
CN110851598B (en) Text classification method and device, terminal equipment and storage medium
US11562373B2 (en) Utilizing machine learning models, predictive analytics, and data mining to identify a vehicle insurance fraud ring
US20240013315A1 (en) Computer-implemented methods, computer-readable media, and systems for identifying causes of loss
CN113435202A (en) Product recommendation method and device based on user portrait, electronic equipment and medium
CN112395875A (en) Keyword extraction method, device, terminal and storage medium
CN111581193A (en) Data processing method, device, computer system and storage medium
CN113342984A (en) Garden enterprise classification method and system, intelligent terminal and storage medium
CN113886708A (en) Product recommendation method, device, equipment and storage medium based on user information
CN115953123A (en) Method, device and equipment for generating robot automation flow and storage medium
CN112579781B (en) Text classification method, device, electronic equipment and medium
CN113850331A (en) Annunciation bill abnormity detection method, using method, device, equipment and storage medium
CN112632000A (en) Log file clustering method and device, electronic equipment and readable storage medium
Hassan et al. Crime news analysis: Location and story detection
Mohemad et al. Performance analysis in text clustering using k-means and k-medoids algorithms for Malay crime documents
CN109886318B (en) Information processing method and device and computer readable storage medium
Sudha et al. Analysis and evaluation of integrated cyber crime offences
Fursov et al. Sequence embeddings help to identify fraudulent cases in healthcare insurance
CN111931229B (en) Data identification method, device and storage medium
CN114064893A (en) Abnormal data auditing method, device, equipment and storage medium
CN113722484A (en) Rumor detection method, device, equipment and storage medium based on deep learning
CN113505117A (en) Data quality evaluation method, device, equipment and medium based on data indexes
CN113435741A (en) Training plan generation method, device, equipment and storage medium
CN110610213A (en) Mail classification method, device, equipment and computer readable storage medium
CN116188049B (en) Potential user mining method and device based on chain analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination