CN113449103A - Bank transaction flow classification method and system integrating label and text interaction mechanism - Google Patents
- Publication number
- CN113449103A (application CN202110119998.7A)
- Authority
- CN
- China
- Prior art keywords
- label
- data
- transaction flow
- model
- des
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/38—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/381—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using identifiers, e.g. barcodes, RFIDs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/04—Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
Abstract
The invention discloses a bank transaction flow classification method and system incorporating a label and text interaction mechanism, and relates to the field of financial business processing. The method comprises the following steps: labeling and cleaning transaction flow data to construct the training data required for model training; iteratively optimizing, on the training data, a neural network model that incorporates the interaction mechanism; calling the optimal model to label new transaction flows; and outputting the corresponding transaction flow labels and automatically pushing weekly, monthly, or quarterly consumption reports according to the user's settings. The system can label a user's transaction flows in real time and show the user the proportion of spending over a specific period, so that the user can understand their own consumption structure comprehensively and systematically. Compared with a traditional classification system, the system adds an interaction layer that explicitly calculates the matching degree between labels and text, which guides the subsequent label prediction.
Description
Technical Field
The invention relates to the field of financial business processing, in particular to a bank transaction flow classification method and system integrating a label and text interaction mechanism.
Background
Financial technology is developing rapidly, and major financial institutions are actively promoting online financial services and working to improve and upgrade traditional financial services and products. Providing financial services through mobile terminals such as mobile banking has changed how financial functions are delivered and has changed people's behavior and habits in traditional offline and online financial transactions; it has become a key measure for banks to gain an early advantage and capture the future market. Online transactions now permeate everyday life, and the volume of online transaction records has grown explosively. For example, payments in daily life consist of large low-frequency payments and small high-frequency payments. Based on this transaction information, technologies such as big data and artificial intelligence can be used to derive valuable new data, such as the characteristics of terminal merchants or of consumers.
People tend to pay close attention to information such as the type of each transaction and the proportion of their spending it represents. According to relevant statistics, checking income and expense details has become one of the functions people use most often after logging in to a mobile banking app. Therefore, presenting richer transaction information based on user needs, such as labeling each transaction with a consumption category (diet, travel, clothing, and so on), makes it easy for the user to review spending in each category; presenting a personalized consumption report as well can greatly improve the user experience and increase user stickiness. However, at present the transaction details in most banking apps show only basic information such as transaction time, counterparty, and amount. They do not output the consumption type of each transaction or the user's consumption structure over a specific period, and they lack an intuitive, systematic presentation of the user's consumption behavior.
Disclosure of Invention
To address the above problems, the invention provides a bank transaction flow classification method and system incorporating a label and text interaction mechanism. The system can label a user's transaction flows in real time and show the user the proportion of spending over a specific period, so that the user can understand their own consumption structure comprehensively and systematically. Compared with a traditional classification system, the system adds an interaction layer that explicitly calculates the matching degree between labels and text, which guides the subsequent label prediction.
According to a first aspect of the present invention, there is provided a bank transaction flow classification method incorporating a label and text interaction mechanism, wherein the method comprises the following steps:
a data construction step: labeling and cleaning the transaction flow data, and constructing the training data required for model training;
a model training step: iteratively optimizing, on the training data, a neural network model incorporating the interaction mechanism;
a label prediction step: calling the optimal model and labeling new transaction flows;
a result display step: outputting the corresponding transaction flow labels, and automatically pushing weekly, monthly, or quarterly consumption reports according to user settings.
Further, the data construction step specifically includes:
S1: constructing a data set by labeling and cleaning the transaction flow data;
S2: preprocessing the cleaned transaction flow data and removing interference words to obtain the training data required for model training.
Further, step S1 specifically includes:
S1.1: labeling the transaction flow data based on regular expressions, thereby obtaining labeled data with corresponding labels;
S1.2: cleaning the labeled data by de-duplicating it and deleting labeled data whose null values exceed a certain threshold;
S1.3: obtaining the description corresponding to each label, and storing the labels and their descriptions in dictionary form: $D = \{l_1: des_1;\ l_2: des_2;\ \dots;\ l_c: des_c\}$, where $c$ is the number of label classes, $l_q$ is the $q$-th class label, and $des_q$ is the description of the $q$-th class label $l_q$ ($1 \le q \le c$).
Further, step S2 specifically includes:
S2.1: removing stop words from the character string fields in the transaction flow data;
S2.2: concatenating the character string fields obtained in step S2.1 and performing word segmentation, so that the content of each transaction flow data field is represented as $S = \{s_1, s_2, s_3, \dots, s_n\}$, where $s_i$ is the $i$-th word of the transaction flow data and $n$ is the number of words in the transaction flow data field ($1 \le i \le n$);
S2.3: performing word segmentation on the description field corresponding to each label to obtain $des_q = \{w_1, w_2, w_3, \dots, w_m\}$, where $w_j$ is the $j$-th word of the description field and $m$ is the number of words in the description field ($1 \le j \le m$),
and the segmented transaction flow data fields and the segmented description fields are used together as the training data.
Further, the stop words include, but are not limited to, special characters, city names, and other information unrelated to labeling.
Further, the model training step specifically includes:
training on the training data with a classification model incorporating the label and text interaction mechanism, which uses word-level information to obtain, through vector dot products, the matching degree between each word in the text and each category, and applies the matching degree to the final prediction layer.
Further, the specific training process of the classification model incorporating the label and text interaction mechanism is as follows:
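an input layer: the words of the training data are mapped to continuous vectors using the word2vec model,
the vector representation of each transaction flow data field is:
$S_e = \{e_{s_1}, e_{s_2}, \dots, e_{s_n}\}$,
where $e_{s_i} \in R^{d}$ is the vector representation of the $i$-th word $s_i$ of the transaction flow data, $R$ denotes the real number space, and $d$ is the dimension of the vector,
the vector representation of the description field $des_q$ corresponding to label $l_q$ is:
$des_{q,e} = \{e_{w_1}, e_{w_2}, \dots, e_{w_m}\}$,
where $e_{w_j} \in R^{d}$ is the vector representation of the $j$-th word $w_j$ of the description field, and the vector representation matrix corresponding to all label descriptions is:
$DES_e \in R^{c \times m \times d}$, where $R^{c \times m \times d}$ indicates that $DES_e$ takes values in the real number space and has shape $c \times m \times d$;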
a coding layer: $S_e$ and $DES_e$ are each input into a Gated Recurrent Unit (GRU) for encoding, where $c$ GRU encoders are required to encode the $c$ classes of labels separately, yielding the encoded representations of the transaction flow and of the label descriptions, respectively: $S_h \in R^{n \times d}$, $DES_h \in R^{c \times d}$, where $R^{n \times d}$ indicates that $S_h$ takes values in the real number space and has shape $n \times d$, and $R^{c \times d}$ indicates that $DES_h$ takes values in the real number space and has shape $c \times d$,
wherein, when $S_e$ is encoded, the output at every time step is retained, and when each label description $des_q$ is encoded, only the last hidden state is retained:
$S_h = \{h_1, h_2, \dots, h_n\} = \mathrm{GRU}(S_e)$, $DES_h = \{h^{(1)}_m, h^{(2)}_m, \dots, h^{(c)}_m\}$, where $h^{(q)}_m$ is the last hidden state of the $q$-th GRU encoder applied to the embedded description $des_q$;
an interaction layer: the matching degree between each word $s_i$ and each label description $des_q$ is calculated, and finer-grained classification clues are obtained from the matching degrees, wherein the matching degree is calculated as:
$Q = S_h \times DES_h^{T}$, $Q \in R^{n \times c}$,
where $Q$ is obtained by multiplying the two matrices and $T$ denotes matrix transposition;
a fully connected layer: $Q$ is input to a fully connected layer, $O$ is obtained with the ReLU activation function, and the probability $P$ of the transaction flow data $S$ belonging to each category is obtained with the softmax function:
$O = \mathrm{ReLU}(I \times W + b)$, $W \in R^{n \times 1}$,
$P = \mathrm{softmax}(O) = \{p_1, p_2, \dots, p_c\}$,
where $I$ is the input of the fully connected layer, $W$ is a parameter matrix, and $b$ is a bias; $W$ and $b$ are parameters to be learned by the model;
an output layer: the label corresponding to the maximum probability value is output as the prediction result:
$Label_{pre} = \mathrm{argmax}(P)$;
wherein the model is optimized based on the following loss function, which, given the symbol definitions below, has the cross-entropy form:
$Loss = -\sum_{i} \sum_{j=1}^{c} y_i^{j} \log \hat{y}_i^{j}$,
where $y_i^{j}$ is the value of the $j$-th dimension of the correct label of the $i$-th transaction flow, and $\hat{y}_i^{j}$ is the probability, predicted by the model, that the label of the $i$-th transaction flow is $j$.
According to a second aspect of the present invention, there is provided a bank transaction flow classification device incorporating a label and text interaction mechanism, wherein the device operates based on the method according to any one of the above aspects and comprises the following modules:
a data construction module: used for labeling and cleaning transaction flow data and constructing the training data required for model training;
a model training module: used for iteratively optimizing, on the training data, a neural network model incorporating the interaction mechanism;
a label prediction module: used for calling the optimal model and labeling new transaction flows;
a result display module: used for outputting the corresponding transaction flow labels and automatically pushing weekly, monthly, or quarterly consumption reports according to user settings.
According to a third aspect of the present invention, there is provided a bank transaction flow classification system incorporating a label and text interaction mechanism, the system comprising:
a processor and a memory for storing executable instructions;
wherein the processor is configured to execute the executable instructions to perform the bank transaction flow classification method incorporating a label and text interaction mechanism according to any one of the above aspects.
According to a fourth aspect of the present invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the bank transaction flow classification method incorporating a label and text interaction mechanism according to any one of the above aspects.
The beneficial effects of the invention are as follows:
1) The data set is constructed automatically based on regular expressions, which saves a great deal of manpower and material resources. Applying a neural network model requires a large amount of labeled data, and the traditional approach relies on manual labeling, which is time-consuming and labor-intensive. The invention obtains labeled data from regular expressions and thereby provides training data for the model, which makes training a neural network model feasible.
2) A neural network model is used to predict the labels of bank transaction flows, which improves the generalization of the system. Chinese expressions are highly varied and rules cannot cover every case, whereas a neural network is self-learning, adaptive, and nonlinear. It can therefore learn complex semantic associations in Chinese and achieves higher precision and recall in classification tasks in real scenarios.
3) Word-level information and label information are fully used, which improves the convergence speed and accuracy of the model. In a traditional classification model, the overall representation of the text largely determines the classification, and word-level information that could provide effective classification clues is ignored; for example, "rice noodles" strongly suggests the diet category. Whereas a traditional classification model uses label information only at the prediction layer, the model adopted by this system also uses label information at the interaction layer, and the interaction result of that layer further guides the optimization direction of the model and accelerates its convergence.
4) The invention can label each of a user's transactions with a specific consumption type in real time and can also push consumption reports to the user periodically, with the report period set by the user. From these reports the user can intuitively understand their own consumption structure, which provides a basis for future spending and for adjusting consumption behavior. Other usage scenarios can be derived from the consumption labels, such as user profiling and product recommendation: the user's interests can be inferred from their consumption structure, so that relevant information or financial products can be recommended in a targeted way.
Drawings
FIG. 1 shows a block diagram of a bank transaction flow classification system incorporating a label and text interaction mechanism according to the present invention;
FIG. 2 shows a structural diagram of a bank transaction flow classification model incorporating a label and text interaction mechanism according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. The following description refers to the accompanying drawings, in which like numerals refer to the same or similar elements throughout the different views, unless otherwise specified. The implementations described in the following exemplary examples do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terms "first," "second," and the like in the description and in the claims of the present disclosure are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein.
Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
"A plurality" means two or more.
The term "and/or" as used in this disclosure merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" may represent: A exists alone, A and B exist simultaneously, or B exists alone.
Examples
The method comprises the following steps:
S1: Construct a data set, mainly by labeling and cleaning the data.
S1.1: A bank transaction flow record generally contains information related to the transaction content, such as the business name, customer introduction, and bank remark. The data is labeled based on regular expressions (for example, when the word "supermarket" appears, the category of the flow record "a certain treasure - AA Supermarket Co., Ltd." is labeled "supermarket and convenience store"), thereby obtaining a large amount of labeled data. However, regular expressions cannot cover all cases (for example, when only "AA" appears but "supermarket" does not), so a neural network model is also needed to learn the latent information in the samples.
S1.2: Clean the data. De-duplicate the data (which increases sample diversity for a training set of a given size) and delete records with many null values.
S1.3: Obtain the description corresponding to each label (for example, {'convenience services': 'shared power bank, housekeeping service, moving, flash delivery, maintenance'}), and store the labels and their descriptions in dictionary form: $D = \{l_1: des_1;\ l_2: des_2;\ \dots;\ l_c: des_c\}$, where $c$ is the number of label classes, $l_q$ is the $q$-th class label, and $des_q$ is the description of the $q$-th class label $l_q$ ($1 \le q \le c$).
A regular expression describes a pattern for matching character strings. It can be used to check whether a string contains a certain substring, to replace a matching substring, or to extract from a string the substrings that meet a certain condition.
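Steps S1.1 to S1.3 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the rule patterns, label names, field names (`business_name`, `customer_intro`, `bank_remark`), and the `max_null_fields` threshold are all hypothetical.

```python
import re

# Hypothetical rule table for S1.1: (compiled pattern, label).
RULES = [
    (re.compile(r"supermarket|convenience store", re.IGNORECASE), "supermarket_convenience"),
    (re.compile(r"noodle|restaurant", re.IGNORECASE), "diet"),
]

# Hypothetical label dictionary D = {l_q: des_q} for S1.3.
LABEL_DESCRIPTIONS = {
    "supermarket_convenience": "supermarket convenience store grocery",
    "diet": "restaurant noodles snacks drinks",
}

# Assumed names for the string fields of one transaction flow record.
FIELDS = ("business_name", "customer_intro", "bank_remark")


def label_record(record):
    """S1.1: return the label of the first matching rule, or None if no rule matches."""
    text = " ".join(record.get(f) or "" for f in FIELDS)
    for pattern, label in RULES:
        if pattern.search(text):
            return label
    return None


def clean(labeled, max_null_fields=1):
    """S1.2: drop duplicates and records with more than max_null_fields empty fields."""
    seen, kept = set(), []
    for record, label in labeled:
        key = tuple(record.get(f) for f in FIELDS) + (label,)
        nulls = sum(1 for f in FIELDS if not record.get(f))
        if key in seen or nulls > max_null_fields:
            continue
        seen.add(key)
        kept.append((record, label))
    return kept


def build_labeled_data(records):
    """Label with the rules first, then clean the labeled data."""
    labeled = [(r, label_record(r)) for r in records]
    return clean([(r, lab) for r, lab in labeled if lab is not None])
```

Records that no rule matches (such as one containing only "AA") are simply left unlabeled here; those are the cases the trained model is meant to handle.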
S2: Preprocess the data. The cleaned data is preprocessed by removing stop words, performing word segmentation, and so on.
S2.1: Remove stop words from the character string fields in the transaction flow. Stop words include information unrelated to labeling, such as special characters and city or place names.
S2.2: Concatenate the character string fields and perform word segmentation, so that the final content of the fields is represented as $S = \{s_1, s_2, s_3, \dots, s_n\}$, where $s_i$ is the $i$-th word of the transaction flow data and $n$ is the number of words in the transaction flow data field ($1 \le i \le n$).
S2.3: Perform word segmentation on the description corresponding to each label to obtain $des_q = \{w_1, w_2, w_3, \dots, w_m\}$, where $w_j$ is the $j$-th word of the description field and $m$ is the number of words in the description field ($1 \le j \le m$).
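The preprocessing in S2 might look like the sketch below. The stop-word list is hypothetical, and the whitespace tokenizer is only a stand-in: real Chinese transaction text would need a proper word segmenter, which the source does not name. For simplicity the sketch filters stop words after segmentation rather than before concatenation.

```python
import re

# Hypothetical stop-word list (city names and other words unrelated to labeling).
STOP_WORDS = {"beijing", "shanghai", "branch"}

# Special characters: anything that is neither a word character nor whitespace.
SPECIAL_CHARS = re.compile(r"[^\w\s]")


def tokenize(text):
    """Stand-in word segmenter: lowercase and split on whitespace."""
    return text.lower().split()


def preprocess(fields):
    """Concatenate string fields, strip special characters, segment, drop stop words."""
    joined = " ".join(f for f in fields if f)
    joined = SPECIAL_CHARS.sub(" ", joined)
    return [w for w in tokenize(joined) if w not in STOP_WORDS]


def segment_descriptions(label_descriptions):
    """S2.3: segment each label description des_q into its word list."""
    return {label: preprocess([des]) for label, des in label_descriptions.items()}
```

`preprocess` yields the word list $S$ for one record, and `segment_descriptions` yields the word list for every $des_q$.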
S3: A classification model incorporating the label and text interaction mechanism is adopted. Using finer-grained, word-level information, the matching degree between each word in the text and each category is obtained through vector dot products and applied to the final prediction layer. (For example, in "BBB tangyuan (glutinous rice ball) shop", the word "tangyuan" strongly suggests the diet category; as the model learns, the matching degree between "tangyuan" and the diet category becomes larger than its matching degree with any other category.) The invention uses GRUs to encode the text and the labels. This network unit can capture sequential information and mitigates vanishing and exploding gradients, and it has been widely used for word-level encoding in recent years. The specific training process of the model is as follows:
S3.1 input layer: the word2vec model is used to map words to continuous vectors. word2vec converts words in natural language into dense vectors that a computer can process; the relative positions of the vectors in the space generally reflect the degree of semantic relatedness between words, that is, words with similar meanings are mapped to nearby positions in the vector space.
The final representation of the string is:
$S_e = \{e_{s_1}, e_{s_2}, \dots, e_{s_n}\}$,
where $e_{s_i} \in R^{d}$ is the vector representation of the $i$-th word $s_i$ of the transaction flow data, $R$ denotes the real number space, and $d$ is the dimension of the vector,
the vector representation of the description field $des_q$ corresponding to label $l_q$ is:
$des_{q,e} = \{e_{w_1}, e_{w_2}, \dots, e_{w_m}\}$,
where $e_{w_j} \in R^{d}$ is the vector representation of the $j$-th word $w_j$ of the description field, and the final representation matrix corresponding to all labels is (where $c$ is the number of label categories):
$DES_e \in R^{c \times m \times d}$, where $R^{c \times m \times d}$ indicates that $DES_e$ takes values in the real number space and has shape $c \times m \times d$;
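To make the input-layer shapes concrete, the sketch below uses a random lookup table as a stand-in for a trained word2vec model; the table, the zero vector for unknown words, and the padding of every description to a common length $m$ are assumptions made for illustration only.

```python
import random


def build_embedding_table(vocab, d, seed=0):
    """Stand-in for a trained word2vec model: one random d-dimensional vector per word."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-0.5, 0.5) for _ in range(d)] for w in vocab}


def pad(words, m, pad_token="<pad>"):
    """Pad or truncate a word list to exactly m entries (assumed convention)."""
    return (words + [pad_token] * m)[:m]


def embed(words, table, d):
    """Map words to d-dimensional vectors; unknown words and padding map to zeros."""
    zero = [0.0] * d
    return [table.get(w, zero) for w in words]


def embed_descriptions(descriptions, table, m, d):
    """Build the c x m x d structure holding every embedded label description."""
    return [embed(pad(words, m), table, d) for words in descriptions]
```

Given a word list of length $n$ for a transaction, `embed` returns an $n \times d$ structure, and `embed_descriptions` returns a $c \times m \times d$ structure for the $c$ label descriptions.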
S3.2 coding layer: $S_e$ and $DES_e$ (the embedded label descriptions) are each input into GRUs for encoding, where $c$ GRU encoders are required to encode the $c$ labels separately. The final representations of the string and the labels are: $S_h \in R^{n \times d}$, $DES_h \in R^{c \times d}$, where $R^{n \times d}$ indicates that $S_h$ takes values in the real number space and has shape $n \times d$, and $R^{c \times d}$ indicates that $DES_h$ takes values in the real number space and has shape $c \times d$,
wherein, when $S_e$ is encoded, the output at every time step is retained, and when each label description $des_q$ is encoded, only the last hidden state is retained:
$S_h = \{h_1, h_2, \dots, h_n\} = \mathrm{GRU}(S_e)$, $DES_h = \{h^{(1)}_m, h^{(2)}_m, \dots, h^{(c)}_m\}$, where $h^{(q)}_m$ is the last hidden state of the $q$-th GRU encoder applied to the embedded description $des_q$;
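The difference between keeping every time step for the text and only the last hidden state for each label description can be illustrated with a minimal GRU written in plain Python. The weights are random and untrained, biases are omitted, and the hidden size is assumed equal to the embedding size $d$; this is a sketch of the standard GRU update, not the patent's trained encoder.

```python
import math
import random


def _matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class GRU:
    """Minimal GRU cell (input size d, hidden size d) with random weights, no biases."""

    def __init__(self, d, seed=0):
        rng = random.Random(seed)

        def mat():
            return [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(d)]

        self.d = d
        self.Wz, self.Uz = mat(), mat()  # update gate
        self.Wr, self.Ur = mat(), mat()  # reset gate
        self.Wh, self.Uh = mat(), mat()  # candidate state

    def step(self, x, h):
        z = [_sigmoid(a + b) for a, b in zip(_matvec(self.Wz, x), _matvec(self.Uz, h))]
        r = [_sigmoid(a + b) for a, b in zip(_matvec(self.Wr, x), _matvec(self.Ur, h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(_matvec(self.Wh, x), _matvec(self.Uh, rh))]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def run(self, xs):
        """Return the hidden state after every time step."""
        h = [0.0] * self.d
        states = []
        for x in xs:
            h = self.step(x, h)
            states.append(h)
        return states


def encode_text(gru, S_e):
    """S_h: keep the output of every time step (n x d)."""
    return gru.run(S_e)


def encode_labels(grus, DES_e):
    """DES_h: one GRU per label, keep only the last hidden state (c x d)."""
    return [g.run(des)[-1] for g, des in zip(grus, DES_e)]
```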
S3.3 interaction layer: the matching degree between each word $s_i$ and each label description $des_q$ is calculated, and finer-grained classification clues are obtained from the matching degrees, wherein the matching degree is calculated as:
$Q = S_h \times DES_h^{T}$, $Q \in R^{n \times c}$,
where $Q$ is obtained by multiplying the two matrices and $T$ denotes matrix transposition;
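As a small worked illustration of the interaction layer, the matching matrix is the product of the $n \times d$ text encoding and the transposed $c \times d$ label encoding, so each entry is the dot product of one word vector with one label vector. The function name below is illustrative.

```python
def interaction(S_h, DES_h):
    """Q[i][q] = dot(S_h[i], DES_h[q]); Q has shape n x c."""
    return [[sum(a * b for a, b in zip(s, des)) for des in DES_h] for s in S_h]


# Three words (n = 3), two labels (c = 2), vector dimension d = 2.
S_h = [[1, 0], [0, 2], [1, 1]]
DES_h = [[1, 1], [0, 3]]
Q = interaction(S_h, DES_h)  # [[1, 0], [2, 6], [2, 3]]
```

In this toy example the second word matches the second label most strongly (entry 6), which is the kind of word-level clue the layer is meant to expose.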
S3.4 fully connected layer: $Q$ is input to a fully connected layer, $O$ is obtained with the ReLU activation function, and the probability $P$ of the transaction flow data $S$ belonging to each category is obtained with the softmax function:
$O = \mathrm{ReLU}(I \times W + b)$, $W \in R^{n \times 1}$,
$P = \mathrm{softmax}(O) = \{p_1, p_2, \dots, p_c\}$,
S3.5 output layer: the label corresponding to the maximum probability value is output as the prediction result:
$Label_{pre} = \mathrm{argmax}(P)$;
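The fully connected, softmax, and output steps can be sketched as below. The source does not say how the layer input $I$ is formed from $Q$; this sketch assumes $I = Q^{T}$ (shape $c \times n$), which is the arrangement under which $W \in R^{n \times 1}$ yields one score per category. It also assumes a scalar bias and a fixed text length $n$.

```python
import math


def fully_connected(Q, W, b):
    """O = ReLU(Q^T W + b): Q is n x c, W has length n, result has length c."""
    n, c = len(Q), len(Q[0])
    return [max(0.0, sum(Q[i][q] * W[i] for i in range(n)) + b) for q in range(c)]


def softmax(O):
    """Numerically stable softmax over the category scores."""
    mx = max(O)
    exps = [math.exp(o - mx) for o in O]
    total = sum(exps)
    return [e / total for e in exps]


def predict(Q, W, b, labels):
    """Return the label with the highest probability together with P."""
    P = softmax(fully_connected(Q, W, b))
    best = max(range(len(P)), key=P.__getitem__)
    return labels[best], P
```

With `Q = [[1, 0], [2, 6], [2, 3]]`, `W = [1.0, 1.0, 1.0]`, and `b = 0.0`, the scores are `[5.0, 9.0]`, so the second label is predicted.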
wherein the model is optimized based on the following loss function, which, given the symbol definitions below, has the cross-entropy form:
$Loss = -\sum_{i} \sum_{j=1}^{c} y_i^{j} \log \hat{y}_i^{j}$,
where $y_i^{j}$ is the value of the $j$-th dimension of the correct label of the $i$-th transaction flow, and $\hat{y}_i^{j}$ is the probability, predicted by the model, that the label of the $i$-th transaction flow is $j$;
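The symbol definitions for the loss (a one-hot correct label and a predicted probability per category) are consistent with a cross-entropy loss, so the sketch below assumes that form; the summation without averaging and the small `eps` guard against `log(0)` are additional assumptions.

```python
import math


def cross_entropy(Y, P, eps=1e-12):
    """Sum over transaction flows i and categories j of -y_ij * log(p_ij)."""
    return -sum(
        y * math.log(p + eps)
        for ys, ps in zip(Y, P)
        for y, p in zip(ys, ps)
    )
```

Here `Y` holds the one-hot correct labels and `P` the predicted probabilities, one row per transaction flow.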
S4: The label prediction results are displayed to the user, and consumption reports are pushed weekly, monthly, or quarterly according to the period the user has set.
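The report in S4 amounts to grouping labeled transactions by period and computing each category's share of spending. The sketch below is one possible way to do this; the `(date, label, amount)` tuple layout and the period keys are hypothetical.

```python
from collections import defaultdict
from datetime import date


def period_key(day, period):
    """Map a date to its week, month, or quarter bucket."""
    if period == "week":
        iso = day.isocalendar()
        return (iso[0], "W%02d" % iso[1])
    if period == "month":
        return (day.year, "M%02d" % day.month)
    if period == "quarter":
        return (day.year, "Q%d" % ((day.month - 1) // 3 + 1))
    raise ValueError("unknown period: %r" % period)


def spending_report(transactions, period="month"):
    """Return {period_key: {label: share of total spending in that period}}."""
    totals = defaultdict(lambda: defaultdict(float))
    for day, label, amount in transactions:
        totals[period_key(day, period)][label] += amount
    report = {}
    for key, by_label in totals.items():
        total = sum(by_label.values())
        report[key] = {lab: amt / total for lab, amt in by_label.items()} if total else {}
    return report
```

For example, two first-quarter transactions of 30.0 (diet) and 10.0 (travel) give shares of 0.75 and 0.25 for that quarter.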
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the above implementation method can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation method. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and various modifications can be made by those skilled in the art without departing from the spirit and scope of the present invention as defined in the appended claims.
Claims (10)
1. A bank transaction flow classification method integrating a label and text interaction mechanism, characterized by comprising the following steps:
a data construction step: labeling and cleaning transaction flow data to construct the training data required for model training;
a model training step: adopting a neural network model incorporating the interaction mechanism, and continuously optimizing the neural network model based on the training data;
a label prediction step: calling the optimized neural network model to label new transaction flows;
a result display step: outputting the corresponding flow labels, and automatically pushing weekly, monthly or quarterly consumption reports according to the user's settings.
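As an illustration of the result display step, the sketch below groups labeled transactions into weekly, monthly or quarterly totals per label. The record layout (date, label, amount), the period key formats and the function names are assumptions made for this example. The patent does not specify how the report is computed or delivered.

```python
from collections import defaultdict
from datetime import date


def period_key(day, period):
    """Map a date to a week, month or quarter identifier."""
    if period == "week":
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if period == "month":
        return f"{day.year}-{day.month:02d}"
    if period == "quarter":
        return f"{day.year}-Q{(day.month - 1) // 3 + 1}"
    raise ValueError(f"unsupported period: {period}")


def consumption_report(transactions, period):
    """Sum amounts per label within each period."""
    totals = defaultdict(lambda: defaultdict(float))
    for day, label, amount in transactions:
        totals[period_key(day, period)][label] += amount
    return {key: dict(by_label) for key, by_label in totals.items()}


# Hypothetical labeled transactions: (date, predicted label, amount).
transactions = [
    (date(2021, 1, 28), "dining", 35.0),
    (date(2021, 1, 29), "dining", 15.0),
    (date(2021, 4, 2), "transport", 50.0),
]
monthly = consumption_report(transactions, "month")
quarterly = consumption_report(transactions, "quarter")
```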
2. The method for classifying bank transaction flow according to claim 1, wherein the data constructing step specifically comprises:
S1: constructing a data set by labeling and cleaning the transaction flow data;
S2: preprocessing the cleaned transaction flow data and removing interference words to obtain the training data required for model training.
3. The method for classifying bank transaction flow according to claim 2, wherein the step S1 specifically includes:
S1.1: labeling the transaction flow data based on regular expressions, thereby obtaining labeled data with corresponding labels;
S1.2: cleaning the labeled data by de-duplicating it and deleting labeled data whose null values exceed a certain threshold;
S1.3: obtaining the description corresponding to each label, and storing the labels and their descriptions in dictionary form: $D = \{l_1: des_1;\ l_2: des_2;\ \ldots;\ l_c: des_c\}$, where c is the number of label classes, $l_q$ is the q-th class label, and $des_q$ is the description of the q-th class label $l_q$ ($1 \le q \le c$).
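Steps S1.1 to S1.3 can be sketched as follows, using only the Python standard library. The regular expressions, label names, descriptions and field names are hypothetical. Treating the null threshold as a maximum fraction of empty fields per record is also an assumption, since the claim does not say how the threshold is measured.

```python
import re

# Hypothetical labeling rules: (compiled pattern, label).
RULES = [
    (re.compile(r"restaurant|cafe|food", re.IGNORECASE), "dining"),
    (re.compile(r"metro|taxi|fuel", re.IGNORECASE), "transport"),
]

# S1.3: labels and their descriptions stored as a dictionary D.
D = {
    "dining": "meals and restaurant spending",
    "transport": "travel and commuting costs",
}


def label_record(record):
    """S1.1: return the first label whose pattern matches the summary field."""
    text = record.get("summary") or ""
    for pattern, label in RULES:
        if pattern.search(text):
            return label
    return None


def clean(records, max_null_ratio=0.5):
    """S1.2: drop duplicates and records with too many empty fields."""
    seen, kept = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if not rec or key in seen:
            continue
        seen.add(key)
        nulls = sum(1 for value in rec.values() if value in (None, ""))
        if nulls / len(rec) > max_null_ratio:
            continue
        kept.append(rec)
    return kept


records = [
    {"summary": "City Cafe lunch", "counterparty": "City Cafe", "amount": "35.00"},
    {"summary": "City Cafe lunch", "counterparty": "City Cafe", "amount": "35.00"},
    {"summary": "", "counterparty": None, "amount": "12.00"},
    {"summary": "Metro top-up", "counterparty": "Metro", "amount": "50.00"},
]
labeled = [dict(rec, label=label_record(rec)) for rec in records]
cleaned = clean(labeled)
```

In this sketch the duplicate record and the record with mostly empty fields are both removed.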
4. The method for classifying bank transaction flow according to claim 2, wherein the step S2 specifically includes:
S2.1: removing stop words from the character string fields of the transaction flow data;
S2.2: concatenating the character string fields obtained in step S2.1 and performing word segmentation, wherein the content corresponding to each transaction flow data field is represented as $S = \{s_1, s_2, s_3, \ldots, s_n\}$, where $s_i$ denotes the i-th word of the transaction flow data and n denotes the number of words in the transaction flow data field ($1 \le i \le n$);
S2.3: performing word segmentation on the description field corresponding to each label to obtain $des_q = \{w_1, w_2, w_3, \ldots, w_m\}$, where $w_j$ denotes the j-th word of the description field and m denotes the number of words in the description field ($1 \le j \le m$),
wherein the segmented transaction flow data fields and the segmented description fields are jointly used as the training data.
5. The method for classifying bank transaction flow according to claim 4, wherein the stop words include, but are not limited to, special characters, city names and other information unrelated to labeling.
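The preprocessing in steps S2.1 to S2.3 can be sketched as below. This is a simplified stand-in. It splits on whitespace, whereas Chinese transaction text would need a dedicated word segmenter, which the claims do not name. The stop-word list, the sample fields and the function names are hypothetical.

```python
import re

# Hypothetical stop words, here city names written in lowercase.
STOP_WORDS = {"beijing", "shanghai"}
# Special characters: anything that is neither a word character nor whitespace.
SPECIAL_CHARS = re.compile(r"[^\w\s]")


def remove_stop_words(field):
    """S2.1: strip special characters and stop words from one string field."""
    tokens = SPECIAL_CHARS.sub(" ", field).lower().split()
    return " ".join(tok for tok in tokens if tok not in STOP_WORDS)


def segment(text):
    """Whitespace segmentation, standing in for a real word segmenter."""
    return text.split()


def preprocess(fields):
    """S2.2: concatenate the cleaned fields and segment them into words."""
    cleaned = [remove_stop_words(field) for field in fields if field]
    return segment(" ".join(cleaned))


# S: the words of one transaction flow record.
S = preprocess(["POS purchase **Beijing** Star Cafe", "Card#1234"])
# des_q: the words of one label description (S2.3).
des_q = segment("meals and restaurant spending")
```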
6. The method for classifying bank transaction flow according to claim 1, wherein the model training step specifically comprises:
training on the training data using a classification model incorporating the label and text interaction mechanism, wherein word-level information is used to obtain, through vector dot products, the matching degree between each word in the text and each category, and the matching degree is applied to the final prediction layer.
7. The method for classifying bank transactions according to claim 6, wherein the specific training process of the classification model using the label and text interaction mechanism is as follows:
an input layer: words of the training data are mapped to continuous vectors using the word2vec model,
the vector representation of each transaction flow data field is denoted $S_e$,
wherein the i-th element of $S_e$ is a vector in $\mathbb{R}^{d}$ representing the i-th word $s_i$ of the transaction flow data, $\mathbb{R}$ denotes the real number space and d denotes the dimension of the vector,
the description field $des_q$ corresponding to label $l_q$ is represented by the vectors of its m words,
and the vector representations of all label descriptions form the matrix:
$DES_e \in \mathbb{R}^{c \times m \times d}$, i.e., $DES_e$ takes values in the real number space and has shape c × m × d;
an encoding layer: inputting $S_e$ and $DES_e$ respectively into Gated Recurrent Units (GRU) for encoding, wherein c GRU encoders are required to encode the c labels separately, yielding the encoded representations of the transaction flow and of the label descriptions respectively: $S_h \in \mathbb{R}^{n \times d}$, $DES_h \in \mathbb{R}^{c \times d}$, i.e., $S_h$ takes values in the real number space with shape n × d, and $DES_h$ takes values in the real number space with shape c × d,
wherein, when encoding S, the output at every time step is retained, whereas when encoding each label description $des_q$, only the last hidden state is retained;
an interaction layer: calculating the matching degree between each word $s_i$ and each label description $des_q$ to obtain finer-grained classification clues, wherein the matching degree is calculated as:
$Q = S_h \times DES_h^{T}$
where Q is obtained by multiplying the two matrices, and T denotes matrix transposition;
a fully connected layer: inputting Q into a fully connected layer, obtaining O using a ReLU activation function, and obtaining, through a softmax function, the probability P that the transaction flow data S belongs to each category:
$O = \mathrm{ReLU}(I \times W + b),\quad W \in \mathbb{R}^{n \times 1}$
$P = \mathrm{softmax}(O) = \{p_1, p_2, \ldots, p_c\}$,
wherein W is a parameter matrix and b is a bias, both of which are parameters to be learned by the model;
an output layer: outputting the label corresponding to the maximum probability value as the prediction result:
$Label_{pre} = \arg\max(P)$;
wherein the model is optimized based on a loss function computed from the correct label of each transaction flow and the probability predicted by the model for each label.
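The loss formula itself is not legible in this text. The specification describes it in terms of the value of the j-th dimension of the correct label of the i-th transaction flow and the predicted probability that the i-th transaction flow has label j, which matches the standard cross-entropy loss. The sketch below shows that standard loss under this assumption. Averaging over the number of transaction flows, the small constant `eps` and the function name are choices made for this example and are not stated in the source.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Average cross-entropy over transaction flows.

    y_true: one-hot rows, y_true[i][j] is the j-th dimension of the correct label.
    y_pred: probability rows, y_pred[i][j] is the predicted probability of label j.
    """
    total = 0.0
    for y_row, p_row in zip(y_true, y_pred):
        for y, p in zip(y_row, p_row):
            total -= y * math.log(p + eps)  # eps guards against log(0)
    return total / len(y_true)


# Two hypothetical transaction flows and two categories.
y_true = [[1, 0], [0, 1]]
y_pred = [[0.5, 0.5], [0.25, 0.75]]
loss = cross_entropy(y_true, y_pred)
```

Only the probability assigned to the correct label contributes to each term, so the loss falls as the model places more probability on the correct label.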
8. A bank transaction flow classification apparatus integrating a label and text interaction mechanism, the apparatus operating based on the method according to any one of claims 1 to 7, the apparatus comprising the following modules:
a data construction module: used for labeling and cleaning transaction flow data and constructing the training data required for model training;
a model training module: used for adopting a neural network model incorporating the interaction mechanism and continuously optimizing the neural network model based on the training data;
a label prediction module: used for calling the optimized neural network model to label new transaction flows;
a result display module: used for outputting the corresponding flow labels and automatically pushing weekly, monthly or quarterly consumption reports according to the user's settings.
9. A bank transaction flow classification system integrating a label and text interaction mechanism, the system comprising:
a processor and a memory for storing executable instructions;
wherein the processor is configured to execute the executable instructions to perform the bank transaction flow classification method integrating the label and text interaction mechanism according to any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the bank transaction flow classification method integrating the label and text interaction mechanism according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110119998.7A CN113449103B (en) | 2021-01-28 | 2021-01-28 | Bank transaction running water classification method and system integrating label and text interaction mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110119998.7A CN113449103B (en) | 2021-01-28 | 2021-01-28 | Bank transaction running water classification method and system integrating label and text interaction mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113449103A (en) | 2021-09-28 |
CN113449103B (en) | 2024-05-10 |
Family
ID=77808887
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110119998.7A Active CN113449103B (en) | 2021-01-28 | 2021-01-28 | Bank transaction running water classification method and system integrating label and text interaction mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113449103B (en) |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020032810A1 (en) * | 1995-06-22 | 2002-03-14 | Wagner Richard Hiers | Open network system for I/O operation including a common gateway interface and an extended open network protocol with non-standard I/O devices utilizing device and identifier for operation to be performed with device |
CN101140645A (en) * | 2007-11-05 | 2008-03-12 | 陆航程 | Tax controlling method based on article internet, and tax controlling method and EPC, EBC article internet and implement used for tax controlling |
CN104272335A (en) * | 2011-12-02 | 2015-01-07 | 艾萨薇公司 | Unified processing of events associated with a transaction executing product purchase and/or use |
CN107908606A (en) * | 2017-10-31 | 2018-04-13 | 上海壹账通金融科技有限公司 | Method and system based on different aforementioned sources automatic report generation |
CN108073677A (en) * | 2017-11-02 | 2018-05-25 | 中国科学院信息工程研究所 | A kind of multistage text multi-tag sorting technique and system based on artificial intelligence |
CN108509485A (en) * | 2018-02-07 | 2018-09-07 | 深圳壹账通智能科技有限公司 | Preprocess method, device, computer equipment and the storage medium of data |
CN109299273A (en) * | 2018-11-02 | 2019-02-01 | 广州语义科技有限公司 | Based on the multi-source multi-tag file classification method and its system for improving seq2seq model |
CN109711848A (en) * | 2018-12-28 | 2019-05-03 | 武汉金融资产交易所有限公司 | A kind of matching system and its construction method, matching process of financial transaction |
CN109829818A (en) * | 2019-02-03 | 2019-05-31 | 中国银行股份有限公司 | Cash demand amount prediction technique, device, electronic equipment and readable storage medium storing program for executing |
CN110188199A (en) * | 2019-05-21 | 2019-08-30 | 北京鸿联九五信息产业有限公司 | A kind of file classification method for intelligent sound interaction |
CN110442707A (en) * | 2019-06-21 | 2019-11-12 | 电子科技大学 | A kind of multi-tag file classification method based on seq2seq |
CN111274791A (en) * | 2020-01-13 | 2020-06-12 | 江苏艾佳家居用品有限公司 | Modeling method of user loss early warning model in online home decoration scene |
CN111754241A (en) * | 2019-05-27 | 2020-10-09 | 北京京东尚科信息技术有限公司 | User behavior perception method, device, equipment and medium |
US10831452B1 (en) * | 2019-09-06 | 2020-11-10 | Digital Asset Capital, Inc. | Modification of in-execution smart contract programs |
Non-Patent Citations (1)
Title |
---|
Chen Qiang: "Innovative Application and Practice of 'AI + Big Data' at Industrial Bank", Financial Computerizing (《金融电子化》), pages 72-74 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116720944A (en) * | 2023-08-10 | 2023-09-08 | 山景智能(北京)科技有限公司 | Bank flowing water marking method and device |
CN116720944B (en) * | 2023-08-10 | 2023-12-19 | 山景智能(北京)科技有限公司 | Bank flowing water marking method and device |
Also Published As
Publication number | Publication date |
---|---|
CN113449103B (en) | 2024-05-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||