CN112561530A - Transaction flow processing method and system based on multi-model fusion - Google Patents
- Publication number
- CN112561530A (application number CN202011567495.8A)
- Authority
- CN
- China
- Prior art keywords
- transaction
- model
- label
- transaction flow
- training set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/389—Keeping log of transactions for guaranteeing non-repudiation of a transaction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24573—Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
A transaction flow processing method and system based on multi-model fusion relate to the technical field of intelligent classification, and the method comprises the following steps. S1: collecting transaction flow samples to construct a training set; S2: preprocessing the training set to obtain an input vector for each transaction flow sample; S3: feeding the input vectors into LightGBM, SVM and Softmax models respectively for training, and predicting the primary label of the transaction flow record to be classified by combining the three trained models; S4: on the basis of the primary label predicted in S3, predicting the secondary label of the transaction flow record to be classified with a convolutional neural network model. The disclosed method adopts a hierarchical classification scheme, can quickly and accurately display consumption-type labels to the user, supports error correction and autonomous learning, interacts well with the user, and improves the user experience.
Description
Technical Field
The invention relates to the technical field of intelligent classification, in particular to a transaction flow processing method and a transaction flow processing system based on multi-model fusion.
Background
The receipt and payment details record all transaction flow data of the account corresponding to a customer, and generally include basic information such as transaction time, customer name and transaction amount for the customer to review. With the development of the internet, convenient payment has opened a new era of mobile payment: online and offline transaction volumes are growing rapidly, customer demands on receipt and payment functions keep rising, and real-time, transparent receipt and payment data have become a major trend in the financial payment market.
According to statistics, viewing the receipt and payment details has become one of the functions customers use most frequently after logging in to a mobile banking APP. Optimizing the content and presentation of these details around user needs can greatly improve the user experience and thereby strengthen user retention. At present, the expenditure details most banking APPs show to a user contain only basic information such as transaction time, transaction account, customer name and transaction amount. Personalized information, such as the consumption type of each expenditure or the proportion of each consumption type in the total, is not presented, so the user has no clear, intuitive picture of the income or expenditure of a single transaction, the source or channel of income, or the destination or type of expenditure.
Disclosure of Invention
In view of this, the invention provides a transaction flow processing method and system based on multi-model fusion, which adopts a hierarchical classification scheme combining machine-learning multi-model fusion with a convolutional neural network model. The system can label user transaction flow in real time, quickly and accurately display the consumption-type label to the user, supports error correction and autonomous learning, interacts well with the user, and improves the user experience.
In order to achieve the purpose, the invention adopts the following technical scheme:
according to a first aspect of the present invention, there is provided a transaction flow processing method based on multi-model fusion, the method comprising the following steps:
s1: collecting transaction flow samples to construct a training set, wherein each sample comprises transaction flow data and the primary label and secondary label corresponding to the transaction flow data;
s2: preprocessing the training set to obtain an input vector for each transaction flow sample;
s3: feeding the input vectors into a light gradient boosting regression tree model, a support vector machine model and a logistic regression model respectively for training, and predicting the primary label of the transaction flow record to be classified by combining the three trained models;
s4: on the basis of the primary label predicted in S3, predicting the secondary label of the transaction flow record to be classified using a convolutional neural network model;
s5: correcting erroneous prediction results to form new training samples, and repeating steps S2-S4 to complete the optimization of the models.
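A minimal sketch of how steps S3 and S4 can be wired together. The callable interfaces below (primary models returning per-label probability lists, secondary models keyed by primary label) are illustrative assumptions, not part of the patent:

```python
def predict_labels(x, primary_models, secondary_models):
    """Two-level prediction: fused vote for the primary label (S3),
    then the model trained for that label predicts the secondary label (S4)."""
    # Each primary model returns one probability per primary label.
    prob_lists = [model(x) for model in primary_models]
    n_labels = len(prob_lists[0])
    # Fuse by averaging: P_j = (P_L(j) + P_SVM(j) + P_S(j)) / 3
    avg = [sum(p[j] for p in prob_lists) / len(prob_lists)
           for j in range(n_labels)]
    primary = max(range(n_labels), key=avg.__getitem__)
    # Route the sample to the CNN associated with the predicted primary label.
    secondary = secondary_models[primary](x)
    return primary, secondary
```

In a real system the three primary models would be the trained LightGBM, SVM and Softmax models and the secondary models would be the per-label CNNs; here they can be any callables with the assumed interface.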
Further, the transaction flow data in S1 includes a plurality of fields, and the fields include name, remark, amount, and transaction time.
Further, the S2 specifically includes:
s21: removing special characters and stop words in the training set transaction flow data;
s22: performing word segmentation processing on each field of the transaction flow data;
s23: converting the words obtained after word segmentation into word vectors;
s24: accumulating all word vectors of each field and taking the average value to obtain a field vector;
s25: splicing the field vectors of each sample to obtain the input vector of each sample.
Further, in S23 the conversion from words to word vectors is completed using the word2vec model, which overcomes the curse of dimensionality and the extreme vector sparsity that arise when a discrete representation model builds a dictionary.
Further, the S3 specifically includes:
s31: inputting the transaction flow data in the training set and the corresponding primary labels into a light gradient boosting regression tree model for training, and predicting, based on the trained model, the probability P_L(j) that the transaction flow record to be classified belongs to each primary label, wherein j denotes the number of the primary label;
s32: inputting the transaction flow data in the training set and the corresponding primary labels into a support vector machine model for training, and predicting, based on the trained model, the probability P_SVM(j) that the transaction flow record to be classified belongs to each primary label;
s33: inputting the transaction flow data in the training set and the corresponding primary labels into a logistic regression model for training, and predicting, based on the trained model, the probability P_S(j) that the transaction flow record to be classified belongs to each primary label;
s34: calculating the average probability of each primary label, P_j = (P_L(j) + P_SVM(j) + P_S(j)) / 3, and selecting the primary label with the maximum average probability as the primary-label prediction result for the transaction flow record to be classified.
Further, the S4 specifically includes:
s41: dividing the training set into a plurality of sub-training sets according to the primary labels;
s42: inputting the transaction flow data in each sub-training set and the corresponding secondary labels into a convolutional neural network for training, to obtain one convolutional neural network model per primary label;
s43: selecting the convolutional neural network model corresponding to the primary label output by S3 to predict the secondary label of the transaction flow record to be classified.
Further, the convolutional neural network model includes:
the input layer is used for converting the transaction flow samples into neural network input vectors;
the convolutional layer is used for extracting text features in each neural network input vector;
the pooling layer is used for screening out important features from the text features;
the full connection layer is used for connecting the important features to the classifier to obtain the probability that the transaction flow to be classified belongs to each secondary label;
and the prediction layer is used for outputting the secondary label corresponding to the maximum probability as a prediction result.
According to a second aspect of the present invention, there is provided a transaction flow processing system based on multi-model fusion, comprising:
the system comprises a sample collection module, a training set generation module and a training set generation module, wherein the sample collection module is used for collecting transaction flow samples to construct a training set, and the samples comprise transaction flow data and a primary label and a secondary label which correspond to the transaction flow data;
the sample processing module is used for preprocessing the training set to obtain an input vector of each transaction flow sample;
the primary label prediction module is used for feeding the input vectors into the light gradient boosting regression tree, support vector machine and logistic regression models respectively for training, and predicting the primary label of the transaction flow record to be classified by majority voting over the outputs of the three trained models;
the secondary label prediction module is used for predicting the secondary label of the transaction flow record to be classified with a convolutional neural network model, on the basis of the primary label output by the primary label prediction module;
and the result optimization module is used for forming a training set by taking the corrected error prediction result as a new sample to complete the optimization of the model.
According to a third aspect of the present invention, a computer-readable storage medium is provided, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, carries out the steps of the method as set forth above.
According to a fourth aspect of the present invention, there is provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method as described above when executing the program.
Compared with the prior art, the transaction flow processing method and the transaction flow processing system based on multi-model fusion have the following advantages:
the system adopts a hierarchical classification system, firstly adopts three models, namely a light gradient lifting regression tree model (LightGBM), a support vector machine model (SVM) and a logistic regression model (Softmax), to fuse votes for primary label classification, then adopts a convolutional neural network model (CNN) to perform secondary label classification on the basis of the primary label classification, and displays the result to the user. Meanwhile, the display of the system results enriches the user transaction detail display interface, supports the user to modify the marking results, provides a more intelligent accounting function for the user, improves the user experience, and improves the user click rate.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an embodiment of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a detailed flow chart of the method of the present invention.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terms "first," "second," and the like in the description and in the claims of the present disclosure are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
"A plurality" means two or more.
The term "and/or" as used in this disclosure merely describes an association between objects and indicates that three relationships may exist. For example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone.
The invention adopts a hierarchical classification system, firstly divides data into a plurality of first-level consumption labels, and then divides the data under each first-level label into second-level consumption labels under corresponding labels. The method resolves a large multi-classification problem into a plurality of sub-classification problems, can greatly reduce the number of labels corresponding to a single model, improves the precision of the model, and effectively avoids the condition of low model classification accuracy under the condition of more classes. Meanwhile, a plurality of models are trained in parallel under a hierarchical classification system, so that the speed can be ensured, and the overall efficiency of the system can be improved.
The invention has a self-learning mechanism and can periodically correct and update the model in the system according to the correction data.
The invention takes the characteristics of the CNN into account and adopts a CNN model to process bank transaction flow data. The network extracts local features and weighs the importance of local information, which suits the fact that salient keywords in the Chinese fields of flow data can have a decisive influence on the result. Extracting local features is equivalent to capturing several different n-gram features of the text; for one n-gram size, several different filters extract useful information from different angles, and combining this information lets the model learn the implicit correspondence between different fields of the flow data. Compared with other neural network models, this model is less prone to overfitting and is fast.
The method specifically comprises the following steps:
s1: the bank flow data consists of a plurality of fields, such as customer name, customer remarks, bank remarks, amount and transaction time. Each transaction is marked with a primary label and a secondary label according to its transaction flow category. The flow data and the corresponding labels form the training set for the models below.
S2: carrying out data preprocessing on the obtained training data set, wherein the data preprocessing steps are as follows:
s21, removing special characters and stop words from the Chinese field;
s22, the Chinese fields are segmented using the jieba word segmenter, and the content of each field is finally expressed as S = {w_1, w_2, w_3, ..., w_l}, where w_i denotes the i-th word produced by the segmentation and l is the number of words obtained after the Chinese field has passed through the above processing steps;
S23, the word2vec model is used to obtain Chinese word vectors. word2vec represents words as dense, low-dimensional, real-valued vectors with good semantic properties, suitable for the diverse merchant names in the data. This distributed representation effectively overcomes the curse of dimensionality and the extreme vector sparsity that arise when a discrete representation model, such as the bag-of-words model, builds a dictionary.
S24: finally, the vector representation of the whole Chinese field is calculated by accumulating and averaging all word vectors:
S_vec = (1/l) * Σ_{i=1}^{l} w_vec_i
where S_vec is the vector of the Chinese field's content and w_vec_i is the word vector of the i-th word.
S25: and splicing vectors corresponding to all Chinese fields to obtain the input of each sample:
X=concat(S_vec1;...;S_vecm)
where m represents the number of chinese fields.
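The preprocessing in S21-S25 can be sketched in plain Python. The toy `word_vectors` dictionary below stands in for a trained word2vec model, and the token lists stand in for jieba's output; both are illustrative assumptions:

```python
def field_vector(tokens, word_vectors):
    """S24: accumulate and average the word vectors of one field's tokens."""
    dim = len(next(iter(word_vectors.values())))
    acc = [0.0] * dim
    for w in tokens:
        vec = word_vectors.get(w, [0.0] * dim)  # unknown words -> zero vector
        acc = [a + v for a, v in zip(acc, vec)]
    return [a / len(tokens) for a in acc]

def sample_vector(fields, word_vectors):
    """S25: concatenate the field vectors into one input vector X."""
    x = []
    for tokens in fields:
        x.extend(field_vector(tokens, word_vectors))
    return x
```

With m fields of dimension d each, the resulting X has m * d components, matching X = concat(S_vec_1; ...; S_vec_m) above.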
And S3, three models are trained on the training data, namely a light gradient boosting regression tree model (LightGBM), a support vector machine model (SVM) and a logistic regression model (Softmax), and the primary classification label of the data is then obtained through a voting mechanism. The three machine learning models are independent of one another, and their optimization functions and ways of processing features differ, so the correlation between the models is low; their predictions overlap less than results obtained from similar models, which suits a fusion scheme based on voting. The specific operations are as follows:
s31, the training set data with primary labels is fed into the LightGBM model for training, and the probability P_L(j) of each primary label is predicted with the trained LightGBM model, where j denotes the number of the primary label. The LightGBM model takes the additive form:
y_hat_i = Σ_{k=1}^{K} f_k(x_i), f_k ∈ F
where f_k(x_i) denotes the prediction of the k-th residual tree for the i-th sample x_i, and F is the function space of residual trees; from a functional point of view, each tree resembles a piecewise function.
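The additive form of the boosted model can be illustrated with toy stand-ins for the residual trees f_k; the lambdas below are illustrative, not trained trees:

```python
# Toy residual "trees": each f_k adds a correction to the running prediction.
trees = [
    lambda x: 1.0 if x > 0 else -1.0,   # coarse first tree
    lambda x: 0.3 * x,                  # residual correction
    lambda x: -0.1,                     # small constant correction
]

def boosted_score(x):
    """Gradient-boosting-style additive prediction: sum of all residual trees."""
    return sum(f(x) for f in trees)
```

In practice the trees are fitted sequentially by LightGBM, each on the residual error of the ensemble so far; only the summation structure is shown here.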
S32, the training set data with only primary labels is fed into the SVM model, and the probability P_SVM(j) of each primary label is predicted using the OVR strategy (one-vs-rest: during training, samples of one class form the positive class and all remaining samples form the negative class). The specific formula is:
P_SVM(j) = max(f_1(x), f_2(x), ..., f_n(x))
where f_n(x) is the n-th SVM classifier and j is the label corresponding to the maximum value among the predictions.
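The one-vs-rest decision can be sketched as follows; the `binary_scorers` are assumed stand-ins for the n trained binary SVM classifiers:

```python
def ovr_predict(x, binary_scorers):
    """One-vs-rest: score x with each binary classifier f_n and return
    the index of the best label together with its score (the max in P_SVM(j))."""
    scores = [f(x) for f in binary_scorers]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

Each scorer answers "this class vs. everything else"; the label whose classifier is most confident wins.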
S33, the training set data with only primary labels is fed into a Softmax regression model, and the probability P_S(j) of each primary label is predicted based on the following formula:
P_S(j) = exp(θ_j^T x) / Σ_{i=1}^{n} exp(θ_i^T x)
where θ is the model parameter, n is the number of primary labels, and x is the sample.
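The softmax formula corresponds to the following sketch: a numerically stabilized softmax over linear scores, where the layout of `thetas` (one parameter row per label) is an assumption for illustration:

```python
from math import exp

def softmax_probs(thetas, x):
    """P_S(j) = exp(theta_j . x) / sum_i exp(theta_i . x)"""
    scores = [sum(t * xi for t, xi in zip(theta, x)) for theta in thetas]
    m = max(scores)                  # subtract max for numerical stability
    exps = [exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]
```

Subtracting the maximum score leaves the probabilities unchanged but avoids overflow for large dot products.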
S34, calculating the probability average value by adopting a voting mechanism for the obtained three prediction probabilities, wherein the concrete formula is as follows:
Pj=(PL(j)+PSVM(j)+PS(j))/3
and finally, selecting the category with the highest probability as a primary label classification result.
And S4, secondary label prediction is performed with a convolutional neural network (CNN) model on the basis of the primary label. Since a bank's transaction flow data can reach hundreds of millions of records, traditional machine learning models train slowly; in contrast, neural network models show strong learning ability on large data volumes. The specific CNN training procedure is as follows:
s41, the data set is divided into n parts according to the primary labels, where n is the number of primary labels and data_i is the data corresponding to the i-th label;
s42: data_i is fed into the CNN for training.
An input layer: this layer receives a vectorized representation of the flow data. The vector consists of a Chinese character vector, a category vector and a numeric vector. The category variables (category fields of the transaction flow: transaction category, transaction behavior, etc.) are one-hot encoded to obtain the vector representation C_vec; the dimension of a one-hot vector equals the number of categories, with a 1 in the dimension of the category in question and 0 elsewhere. The Chinese fields of the input data use pre-trained character vectors: the character vector char_vec_i of each character in the Chinese field is concatenated with the category vectors and the numeric vector to obtain the representation x_i of that character, i.e.
x_i = concat(char_vec_i; C_vec_1; ...; C_vec_n; num_1; ...; num_t)
where n denotes the number of category fields, t denotes the number of numeric fields, and num_i denotes a numeric field.
Finally, the input vector X of each flow data sample is obtained:
X = (x_1, x_2, ..., x_r)
where r is the number of characters in the Chinese field.
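The construction of x_i can be sketched in plain Python; the field names, vocabularies and toy dimensions below are illustrative assumptions:

```python
def one_hot(category, categories):
    """C_vec: dimension = number of categories, 1 at the category's index."""
    return [1.0 if c == category else 0.0 for c in categories]

def char_input_vector(char_vec, cat_values, cat_vocabularies, numeric_values):
    """x_i = concat(char_vec; C_vec_1; ...; C_vec_n; num_1; ...; num_t)"""
    x = list(char_vec)
    for value, vocab in zip(cat_values, cat_vocabularies):
        x.extend(one_hot(value, vocab))
    x.extend(float(v) for v in numeric_values)
    return x
```

Stacking one such vector per character yields the r x d input matrix X consumed by the convolutional layer.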
Convolutional layer: this layer is the core layer of the CNN. The vectors from the input layer are convolved with kernels of size h × d to extract deeper features, yielding the feature representation T = {t_1, t_2, ..., t_k}, where t_k denotes the column vector produced by the k-th convolution kernel; kernel heights of 1, 2 and 3 are used.
A pooling layer: this layer compresses the features from the convolutional layer and extracts the main features (a dimensionality reduction operation). The invention uses max pooling, i.e. the maximum value of each column vector t_k from the previous step is taken as its most important feature.
Full connection layer: the vector output by the pooling layer is fed into the fully connected layer, and a Softmax function yields the prediction probability set P = {p_1, p_2, ..., p_n}, where p_n is the probability of the n-th label and n is the number of labels to be predicted.
Prediction layer: and outputting the label corresponding to the maximum probability value in the probability set output by the full connection layer as a prediction result, wherein the formula is as follows:
Label=argmax(P)
Label represents the final predicted label and P represents the probability set output by the fully connected layer.
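The four layers above can be combined into a single pure-Python forward pass for illustration. The toy weights, the ReLU after convolution and the kernel shapes are assumptions; a real implementation would use a deep learning framework:

```python
from math import exp

def conv_max_feature(X, kernel):
    """Slide one convolution kernel of height h over the r x d input X
    to produce a column vector t_k, then max-pool it to one feature."""
    h = len(kernel)
    activations = []
    for i in range(len(X) - h + 1):
        window = X[i:i + h]
        s = sum(k * v
                for row_k, row_x in zip(kernel, window)
                for k, v in zip(row_k, row_x))
        activations.append(max(0.0, s))   # ReLU
    return max(activations)               # max pooling

def textcnn_forward(X, kernels, fc_weights):
    """Pooled features -> fully connected layer -> softmax -> argmax label."""
    feats = [conv_max_feature(X, k) for k in kernels]
    scores = [sum(w * f for w, f in zip(row, feats)) for row in fc_weights]
    m = max(scores)
    exps = [exp(s - m) for s in scores]
    probs = [e / sum(exps) for e in exps]
    return max(range(len(probs)), key=probs.__getitem__)
```

Each kernel corresponds to one n-gram detector (height 1, 2 or 3 in the patent); max pooling keeps only its strongest match in the field.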
For the n primary labels, n CNN models are trained in total, one per primary label.
And S5, erroneous prediction results are corrected and used as new samples to form a training set, which is fed into the model; the new model is trained on the basis of the previous one.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the above implementation method can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation method. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (10)
1. A transaction flow processing method based on multi-model fusion is characterized by comprising the following steps:
S1: collecting transaction flow samples to construct a training set, wherein each sample comprises transaction flow data and the primary label and secondary label corresponding to the transaction flow data;
S2: preprocessing the training set to obtain an input vector for each transaction flow sample;
S3: substituting the input vectors respectively into a light gradient boosting regression tree model, a support vector machine model, and a logistic regression model for training, and completing the prediction of the primary label of the transaction flow to be classified by combining the three trained models;
S4: on the basis of the primary label predicted in S3, completing the prediction of the secondary label of the transaction flow to be classified by using a convolutional neural network model.
2. The transaction flow processing method based on multi-model fusion of claim 1, wherein the transaction flow data in S1 comprises a plurality of fields, the fields including name, remark, amount, and transaction time.
3. The transaction flow processing method based on multi-model fusion of claim 2, wherein S2 specifically includes:
S21: removing special characters and stop words from the transaction flow data in the training set;
S22: performing word segmentation on each field of the transaction flow data;
S23: converting the words obtained after segmentation into word vectors;
S24: summing all word vectors of each field and taking the mean to obtain a field vector;
S25: concatenating the field vectors of each sample to obtain the input vector of each sample.
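The preprocessing of S21-S25 can be sketched as follows. This is a minimal illustration only: the embedding table, field names, and helper functions are hypothetical stand-ins, and a real system would use proper word segmentation (e.g. jieba for Chinese) and pretrained word2vec vectors rather than the toy lookup below.

```python
import re
import numpy as np

# Hypothetical toy word-embedding table (2-dimensional for readability);
# a real system would load pretrained word vectors after word segmentation.
EMBED = {"coffee": np.array([0.2, 0.1]), "shop": np.array([0.4, 0.3]),
         "refund": np.array([0.9, 0.8])}
DIM = 2
STOPWORDS = {"the", "a", "of"}

def field_vector(text):
    """S21-S24: strip special characters and stop words, embed each word,
    then average the word vectors of the field."""
    tokens = [t for t in re.findall(r"\w+", text.lower()) if t not in STOPWORDS]
    vecs = [EMBED.get(t, np.zeros(DIM)) for t in tokens] or [np.zeros(DIM)]
    return np.mean(vecs, axis=0)

def input_vector(record):
    """S25: concatenate the per-field vectors of one transaction record."""
    return np.concatenate([field_vector(record[f]) for f in ("name", "remark")])

v = input_vector({"name": "coffee shop", "remark": "refund of the coffee"})
print(v.shape)  # one vector per sample: 2 fields x 2 dims -> (4,)
```

Unknown words fall back to a zero vector here; how out-of-vocabulary tokens are handled is not specified in the claims.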
4. The transaction flow processing method based on multi-model fusion of claim 1, wherein erroneous prediction results are corrected and added to the training set as new samples, and model optimization is completed by repeating steps S2-S4.
5. The transaction flow processing method based on multi-model fusion of claim 1, wherein S3 specifically includes:
S31: inputting the transaction flow data in the training set and their corresponding primary labels into a light gradient boosting regression tree model for training, and predicting, based on the trained model, the probability P_L(j) that the transaction flow to be classified belongs to each primary label, wherein j denotes the serial number of the primary label;
S32: inputting the transaction flow data in the training set and their corresponding primary labels into a support vector machine model for training, and predicting, based on the trained model, the probability P_SVM(j) that the transaction flow to be classified belongs to each primary label;
S33: inputting the transaction flow data in the training set and their corresponding primary labels into a logistic regression model for training, and predicting, based on the trained model, the probability P_S(j) that the transaction flow to be classified belongs to each primary label;
S34: calculating the mean probability P_j = (P_L(j) + P_SVM(j) + P_S(j)) / 3 for each primary label, and selecting the primary label with the largest mean probability as the primary-label prediction result for the transaction flow to be classified.
6. The transaction flow processing method based on multi-model fusion of claim 1, wherein S4 specifically includes:
S41: dividing the training set into a plurality of sub-training sets according to the primary labels in the training set;
S42: inputting the transaction flow data contained in each sub-training set and their corresponding secondary labels into a convolutional neural network for training, obtaining one convolutional neural network model per primary label;
S43: selecting the convolutional neural network model corresponding to the primary label of the transaction flow to be classified output by S3, to complete the prediction of its secondary label.
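The per-primary-label routing of S41-S43 can be sketched as below. To keep the sketch dependency-light, a logistic regression stands in for each per-label convolutional neural network (the routing structure, not the classifier, is the point here), and the toy samples and label codes are invented for illustration.

```python
from collections import defaultdict
from sklearn.linear_model import LogisticRegression

# Toy samples: (input vector, primary label, secondary label).
samples = [([0.1, 0.2], 0, 10), ([0.2, 0.1], 0, 10), ([0.8, 0.9], 0, 11),
           ([0.9, 0.8], 0, 11), ([0.1, 0.9], 1, 20), ([0.2, 0.8], 1, 20),
           ([0.9, 0.1], 1, 21), ([0.8, 0.2], 1, 21)]

# S41/S42: one sub-training set, hence one secondary-label model, per primary label.
subsets = defaultdict(list)
for x, primary, secondary in samples:
    subsets[primary].append((x, secondary))
secondary_models = {
    p: LogisticRegression().fit([x for x, _ in rows], [s for _, s in rows])
    for p, rows in subsets.items()}

def predict_secondary(x, primary):
    """S43: route to the model trained for the predicted primary label."""
    return int(secondary_models[primary].predict([x])[0])

print(predict_secondary([0.85, 0.85], primary=0))
```

Because each secondary model only ever sees samples of one primary label, its label space stays small, which is the practical benefit of the hierarchical split.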
7. The transaction flow processing method based on multi-model fusion of claim 6, wherein the convolutional neural network model comprises:
an input layer, used for converting the transaction flow samples into neural-network input vectors;
a convolutional layer, used for extracting text features from each neural-network input vector;
a pooling layer, used for screening important features out of the text features;
a fully connected layer, used for feeding the important features to a classifier to obtain the probability that the transaction flow to be classified belongs to each secondary label;
and a prediction layer, used for outputting the secondary label with the maximum probability as the prediction result.
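The five layers of claim 7 can be illustrated with a dependency-free forward pass in NumPy. The weights are random and untrained and all sizes are invented, so this shows only the data flow through input, convolution, pooling, fully connected, and prediction layers, not a working classifier.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, KERNEL, FILTERS, N_LABELS = 4, 2, 3, 5  # illustrative sizes only

# Randomly initialised weights; training (backpropagation) is omitted.
W_conv = rng.normal(size=(FILTERS, KERNEL * DIM))
W_fc = rng.normal(size=(N_LABELS, FILTERS))

def predict_secondary_label(word_vectors):
    """Forward pass: input -> convolution -> max pooling -> fully connected
    -> prediction (argmax over secondary-label probabilities)."""
    x = np.asarray(word_vectors)                   # input layer: (T, DIM)
    windows = np.stack([x[i:i + KERNEL].ravel()    # convolution layer: slide a
                        for i in range(len(x) - KERNEL + 1)])  # width-2 window
    feats = np.maximum(windows @ W_conv.T, 0.0)    # ReLU feature maps: (T-1, FILTERS)
    pooled = feats.max(axis=0)                     # pooling layer: max over time
    logits = W_fc @ pooled                         # fully connected layer
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                           # softmax classifier
    return int(np.argmax(probs)), probs            # prediction layer

label, probs = predict_secondary_label(rng.normal(size=(6, DIM)))
print(label, probs.round(3))
```

Max-over-time pooling is what lets the model accept transaction texts of varying length: however many windows the convolution produces, the pooled vector always has `FILTERS` entries.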
8. A transaction flow processing system based on multi-model fusion, characterized by comprising:
a sample collection module, used for collecting transaction flow samples to construct a training set, wherein the samples comprise transaction flow data and the primary label and secondary label corresponding to the transaction flow data;
a sample processing module, used for preprocessing the training set to obtain an input vector for each transaction flow sample;
a primary-label prediction module, used for substituting the input vectors respectively into a light gradient boosting regression tree, a support vector machine, and a logistic regression model for training, and completing the prediction of the primary label of the transaction flow to be classified by majority voting over the output results of the three trained models;
and a secondary-label prediction module, used for completing the prediction of the secondary label of the transaction flow to be classified by using a convolutional neural network model, on the basis of the primary label predicted by the primary-label prediction module.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method according to any of claims 1 to 7 are carried out when the program is executed by the processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011567495.8A CN112561530A (en) | 2020-12-25 | 2020-12-25 | Transaction flow processing method and system based on multi-model fusion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112561530A true CN112561530A (en) | 2021-03-26 |
Family
ID=75033059
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011567495.8A Pending CN112561530A (en) | 2020-12-25 | 2020-12-25 | Transaction flow processing method and system based on multi-model fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112561530A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113052635A (en) * | 2021-03-30 | 2021-06-29 | 北京明略昭辉科技有限公司 | Population attribute label prediction method, system, computer device and storage medium |
CN113065941A (en) * | 2021-04-29 | 2021-07-02 | 中国银行股份有限公司 | Automatic accounting method and device |
Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105468713A (en) * | 2015-11-19 | 2016-04-06 | 西安交通大学 | Multi-model fused short text classification method |
CN107943865A (en) * | 2017-11-10 | 2018-04-20 | 阿基米德(上海)传媒有限公司 | It is a kind of to be suitable for more scenes, the audio classification labels method and system of polymorphic type |
CN108509485A (en) * | 2018-02-07 | 2018-09-07 | 深圳壹账通智能科技有限公司 | Preprocess method, device, computer equipment and the storage medium of data |
CN109460725A (en) * | 2018-10-29 | 2019-03-12 | 苏州派维斯信息科技有限公司 | Receipt consumption details content mergence and extracting method |
CN109582793A (en) * | 2018-11-23 | 2019-04-05 | 深圳前海微众银行股份有限公司 | Model training method, customer service system and data labeling system, readable storage medium storing program for executing |
CN109766358A (en) * | 2018-12-17 | 2019-05-17 | 深圳壹账通智能科技有限公司 | Billing data management method, device, computer equipment and storage medium |
CN109886349A (en) * | 2019-02-28 | 2019-06-14 | 成都新希望金融信息有限公司 | A kind of user classification method based on multi-model fusion |
CN109948668A (en) * | 2019-03-01 | 2019-06-28 | 成都新希望金融信息有限公司 | A kind of multi-model fusion method |
WO2019149200A1 (en) * | 2018-02-01 | 2019-08-08 | 腾讯科技(深圳)有限公司 | Text classification method, computer device, and storage medium |
US20190303877A1 (en) * | 2018-03-30 | 2019-10-03 | Microsoft Technology Licensing, Llc | Analyzing pipelined data |
WO2020001106A1 (en) * | 2018-06-25 | 2020-01-02 | 阿里巴巴集团控股有限公司 | Classification model training method and store classification method and device |
CN110765114A (en) * | 2019-09-20 | 2020-02-07 | 北京数衍科技有限公司 | Transaction receipt data merging method |
WO2020073507A1 (en) * | 2018-10-11 | 2020-04-16 | 平安科技(深圳)有限公司 | Text classification method and terminal |
CN111259987A (en) * | 2020-02-20 | 2020-06-09 | 民生科技有限责任公司 | Method for extracting event main body based on BERT (belief-based regression analysis) multi-model fusion |
US20200302234A1 (en) * | 2019-03-22 | 2020-09-24 | Capital One Services, Llc | System and method for efficient generation of machine-learning models |
CN112036403A (en) * | 2020-08-31 | 2020-12-04 | 合肥工业大学 | Intelligent detection method for missing of bolt pin of power transmission tower based on attention mechanism |
CN112084242A (en) * | 2020-09-02 | 2020-12-15 | 深圳市铭数信息有限公司 | Consumption information display method, device, terminal and medium |
Non-Patent Citations (1)
Title |
---|
李恒超; 林鸿飞; 杨亮; 徐博; 魏晓聪; 张绍武; 古丽孜热・艾尼外: "A two-level fusion algorithm framework for constructing user profiles" (一种用于构建用户画像的二级融合算法框架), Computer Science (计算机科学), no. 01, 15 January 2018 (2018-01-15) *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||