
CN116596675A - Data processing method and device, electronic equipment and storage medium

Info

Publication number: CN116596675A
Application number: CN202310428642.0A
Authority: CN
Inventors: 肖钢; 刘勇; 马丽霞; 田田; 吴穹; 仇乾栋
Original assignee: China Securities Co Ltd
Current assignee: China Securities Co Ltd
Priority date: 2023-04-20
Filing date: 2023-04-20
Publication date: 2023-08-15

Abstract

The embodiments of the invention provide a data processing method, a device, electronic equipment and a storage medium, and relate to the technical field of artificial intelligence. The method comprises the following steps: outputting a client management interface for a target client; in response to receiving a trigger operation on a trigger button, acquiring personal attribute information, behavior information and transaction information of the target client in each target sub-time period; inputting the personal attribute information, the behavior information and the transaction information into an asset loss prediction model to obtain the asset loss probability of the target client in a specified future time period; and, when the asset loss probability in the specified future time period is greater than a preset threshold value, sending an early warning notice to a client manager of the target client according to a preset notice mode, the early warning notice indicating that the target client is predicted to have an asset loss condition in the specified future time period. By applying the scheme, the historical asset data of the client can be processed intelligently, which gives the client manager a data reference basis for maintaining the client.

Description

Data processing method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of artificial intelligence technologies, and in particular, to a data processing method, apparatus, electronic device, and storage medium.

Background

With the development of investment markets, the asset management platform of a securities company needs to perform multidimensional analysis on the assets that clients hold at the securities company and provide each client's asset condition to the client managers, so that the client managers can provide comprehensive services according to the clients' asset conditions.

Currently, the asset management platform may provide personal attribute information, behavior information, and transaction information of a client to the client manager. The personal attribute information of the client may be the sex, age, etc. of the client; the behavior information of the client may be the client's operations in the client-side application of the securities company, the transfer of funds into and out of the client's account at the securities company, etc.; and the transaction information of the client may be the asset transaction time, the number of transactions in a specified time, the transaction amount in a specified time, etc.

However, the above-described functionality of the asset management platform remains in the simple data presentation phase and has not been able to meet today's needs for intelligence. Along with the increasing importance of customer experience, in order to better maintain customers, a customer manager hopes that an asset management platform can perform intelligent analysis and processing based on historical asset data of the customers under the condition of not considering the influence of external factors such as economic law change and the like, so that a certain data reference basis is provided for the customer manager to maintain the customers.

Therefore, how to intelligently process the historical asset data of the client, so as to provide a certain data reference basis for the client manager to maintain the client, is a problem to be solved urgently.

Disclosure of Invention

The embodiment of the invention aims to provide a data processing method, a device, electronic equipment and a storage medium, so as to realize intelligent processing of historical asset data of clients, thereby providing a certain data reference basis for clients maintenance of client managers. The specific technical scheme is as follows:

in a first aspect, an embodiment of the present invention provides a data processing method, which is applied to an asset management platform; the method comprises the following steps:

outputting a client management interface for the target client; the client management interface comprises a trigger button aiming at an appointed intelligent analysis task; the appointed intelligent analysis task is an early warning task about whether the target client has asset loss conditions in an appointed future time period;

acquiring appointed personal attribute information, appointed behavior information and appointed transaction information of the target client in each target sub-time period in response to receiving triggering operation of the triggering button; wherein each target sub-time period is a time period which is within a target history time period and belongs to a specified time granularity;

Inputting the acquired appointed personal attribute information, appointed behavior information and appointed transaction information into an asset loss prediction model which is trained in advance to obtain asset loss probability of the target client in the appointed future time period; wherein the asset loss probability is a probability that the target client has an asset loss condition; the asset loss prediction model is as follows: utilizing appointed personal attribute information, appointed behavior information and appointed transaction information of a sample client in each sample sub-time period and label information of the sample client to train an obtained model based on a transformer network; the label information of the sample client is used for identifying: whether the sample client has incurred asset churn over a reference period of time; each sample sub-period is a period which is positioned in a sample history period before the reference period and belongs to a specified time granularity;

when the asset loss probability of the target client in the appointed future time period is greater than a preset threshold value, sending an early warning notice to a client manager of the target client according to a preset notice mode; the early warning notice is used for representing that the target client predicts the asset loss condition in the appointed future time period.
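As an illustration only, the threshold comparison and notification step can be sketched as follows. The names `EarlyWarning`, `maybe_send_warning` and the `notify` callback are hypothetical and stand in for whatever preset notice mode the platform uses; the sketch assumes the probability has already been produced by the asset loss prediction model.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class EarlyWarning:
    """Hypothetical payload of the early warning notice."""
    client_id: str
    manager_id: str
    probability: float
    future_period: str


def maybe_send_warning(
    client_id: str,
    manager_id: str,
    probability: float,
    future_period: str,
    threshold: float,
    notify: Callable[[EarlyWarning], None],
) -> Optional[EarlyWarning]:
    """Send a notice only when the probability is strictly greater than the threshold."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability > threshold:
        warning = EarlyWarning(client_id, manager_id, probability, future_period)
        notify(warning)  # stands in for the preset notice mode (assumption)
        return warning
    return None
```

Passing the notification channel as a callback keeps the comparison logic independent of how the client manager is actually reached.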

Optionally, the asset loss prediction model includes: an input layer and a probability fitting layer;

wherein, input layer is used for:

aiming at each target sub-time period, respectively carrying out data processing on the appointed behavior information and the appointed transaction information of the target client under the target sub-time period according to a preset data preprocessing mode to obtain the valued content corresponding to the appointed behavior information under the target sub-time period and the valued content corresponding to the appointed transaction information under the target sub-time period; the preset data preprocessing mode is used for mapping any piece of data to be processed into a value content, and the data quantity of the value content mapped by any piece of data to be processed is smaller than that of any piece of data to be processed;

for each target sub-time period, respectively coding the appointed personal attribute information of the target client in the target sub-time period, the valued content corresponding to the appointed behavior information in the target sub-time period and the valued content corresponding to the appointed transaction information in the target sub-time period to obtain a plurality of coded contents corresponding to the target sub-time period, and constructing the feature vector of the target client in the target sub-time period based on the plurality of coded contents corresponding to the target sub-time period;

Constructing a tensor matrix corresponding to the target client based on the feature vectors of the target client in each target sub-time period; the tensor matrix is used for representing appointed personal attribute information, appointed behavior information and appointed transaction information of the target client in each target sub-time period;

the probability fitting layer is used for: fitting, based on the tensor matrix of the target client generated by the input layer and by using a transformer network, the asset loss probability of the target client in the specified future time period.
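A minimal sketch of how the input layer might assemble its output is given below. The source does not state which encoding is used, so one-hot encoding is an assumption here, and the names `one_hot`, `feature_vector`, `build_matrix` and the example field names are hypothetical. Each target sub-time period yields one feature vector, and the vectors are stacked row by row into the matrix passed to the probability fitting layer.

```python
from typing import Dict, List


def one_hot(level: int, num_levels: int) -> List[float]:
    """Encode one value content as a one-hot vector (assumed encoding)."""
    if not 0 <= level < num_levels:
        raise ValueError("level out of range")
    return [1.0 if i == level else 0.0 for i in range(num_levels)]


def feature_vector(period_values: Dict[str, int], schema: Dict[str, int]) -> List[float]:
    """Concatenate the coded contents of one sub-period into one feature vector.

    schema maps each field name to its number of possible value contents.
    """
    vec: List[float] = []
    for name, num_levels in schema.items():
        vec.extend(one_hot(period_values[name], num_levels))
    return vec


def build_matrix(periods: List[Dict[str, int]], schema: Dict[str, int]) -> List[List[float]]:
    """Stack the per-period feature vectors; one row per target sub-time period."""
    return [feature_vector(p, schema) for p in periods]
```

For example, with a schema of three fields having 3, 2 and 4 possible value contents, every row has 9 entries and the number of rows equals the number of target sub-time periods.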

Optionally, the input layer performing, for each target sub-time period, data processing on the specified behavior information and the specified transaction information of the target client in the target sub-time period according to the preset data preprocessing mode, to obtain the value content corresponding to the specified behavior information in the target sub-time period and the value content corresponding to the specified transaction information in the target sub-time period, includes:

for each target sub-time period, carrying out layering treatment on the specified behavior information of the target client in the target sub-time period according to a first type of preset quantile, obtaining a layering of the specified behavior information in the target sub-time period, and determining preset value content corresponding to the obtained layering as the value content corresponding to the specified behavior information in the target sub-time period; and carrying out layering treatment on the appointed transaction information of the target client in the target sub-time period according to the second type of preset quantiles, obtaining the layering of the appointed transaction information in the target sub-time period, and obtaining preset value content corresponding to the obtained layering to serve as the value content corresponding to the appointed transaction information in the target sub-time period.
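The layering described above can be sketched as a lookup against preset quantile cut points. The cut points and the preset value contents in the usage below are made-up illustrative numbers, and the function names `layer_of` and `value_content` are hypothetical; the source only states that behavior information and transaction information each use their own type of preset quantiles.

```python
from bisect import bisect_right
from typing import Sequence


def layer_of(value: float, cut_points: Sequence[float]) -> int:
    """Return the index of the layer a raw value falls into.

    cut_points must be sorted in ascending order; k cut points define k + 1 layers.
    A value equal to a cut point is placed in the upper layer.
    """
    return bisect_right(cut_points, value)


def value_content(value: float, cut_points: Sequence[float], layer_values: Sequence[int]) -> int:
    """Map a raw value to the preset value content of its layer."""
    if len(layer_values) != len(cut_points) + 1:
        raise ValueError("need exactly one preset value content per layer")
    return layer_values[layer_of(value, cut_points)]
```

With illustrative cut points `[10, 100, 1000]` for a transaction amount and preset value contents `[0, 1, 2, 3]`, an amount of 500 falls in the third layer and is mapped to 2. A raw amount is thus reduced to a small integer, which is consistent with the requirement that the value content carries less data than the data it is mapped from.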

Optionally, the training process of the asset loss prediction model includes:

determining a plurality of sample clients;

acquiring appointed personal attribute information, appointed behavior information and appointed transaction information of each sample client in each sample sub-time period, and label information of the sample client;

respectively inputting appointed personal attribute information, appointed behavior information and appointed transaction information of each sample client in each sample sub-time period into an asset loss prediction model to be trained to obtain asset loss probability of each sample client in a reference time period;

calculating a model loss value based on the asset loss probability of each sample client in the reference time period and the label information of each sample client;

judging whether the asset loss prediction model in training is converged or not based on the model loss value, and if so, training to obtain a trained asset loss prediction model; if not, adjusting the parameters of the asset loss predictive model, and returning to the step of determining a plurality of sample clients to continue training the asset loss predictive model.
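The control flow of the training steps above (predict, compute a loss value, check convergence, otherwise adjust parameters and repeat) can be sketched as follows. To keep the example self-contained, a logistic regression trained by gradient descent is used as a stand-in for the transformer-based model; this substitution, the binary cross-entropy loss, and the convergence test based on the change in loss are all assumptions, since the source does not specify them.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Stand-in model: probability that a client has asset loss."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def bce_loss(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Binary cross-entropy between predicted probabilities and 0/1 labels."""
    eps = 1e-12
    total = sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1.0 - p, eps))
        for p, y in zip(probs, labels)
    )
    return -total / len(labels)


def train(
    samples: List[List[float]],
    labels: List[int],
    lr: float = 0.5,
    tol: float = 1e-6,
    max_rounds: int = 10000,
) -> Tuple[List[float], float, float]:
    """Repeat predict -> loss -> convergence check -> parameter update."""
    n = len(samples)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    prev_loss = float("inf")
    loss = prev_loss
    for _ in range(max_rounds):
        probs = [predict(weights, bias, x) for x in samples]
        loss = bce_loss(probs, labels)
        if abs(prev_loss - loss) < tol:  # assumed convergence criterion
            break
        prev_loss = loss
        # Not converged: adjust the parameters with one gradient step.
        for j in range(len(weights)):
            grad = sum((p - y) * x[j] for p, y, x in zip(probs, labels, samples)) / n
            weights[j] -= lr * grad
        bias -= lr * sum(p - y for p, y in zip(probs, labels)) / n
    return weights, bias, loss
```

In this sketch the same sample set is reused in every round; the step described above instead returns to determining a plurality of sample clients, which leaves room for drawing a new batch in each round.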

In a second aspect, an embodiment of the present invention provides a data processing apparatus, which is applied to an asset management platform; the device comprises:

An output module for outputting a client management interface for the target client; the client management interface comprises a trigger button aiming at an appointed intelligent analysis task; the appointed intelligent analysis task is an early warning task about whether the target client has asset loss conditions in an appointed future time period;

the acquisition module is used for responding to the received triggering operation of the triggering button and acquiring appointed personal attribute information, appointed behavior information and appointed transaction information of the target client in each target sub-time period; wherein each target sub-time period is a time period which is within a target history time period and belongs to a specified time granularity;

the input module is used for inputting the acquired appointed personal attribute information, the appointed behavior information and the appointed transaction information into an asset loss prediction model which is trained in advance to obtain the asset loss probability of the target client in the appointed future time period; wherein the asset loss probability is a probability that the target client has an asset loss condition; the asset loss prediction model is as follows: utilizing appointed personal attribute information, appointed behavior information and appointed transaction information of a sample client in each sample sub-time period and label information of the sample client to train an obtained model based on a transformer network; the label information of the sample client is used for identifying: whether the sample client has incurred asset churn over a reference period of time; each sample sub-period is a period which is positioned in a sample history period before the reference period and belongs to a specified time granularity;

The notification module is used for sending an early warning notification to a client manager of the target client according to a preset notification mode when the asset loss probability of the target client in the appointed future time period is greater than a preset threshold; the early warning notice is used for representing that the target client predicts the asset loss condition in the appointed future time period.

wherein, input layer is used for:

for each target sub-time period, carrying out layering treatment on the specified behavior information of the target client in the target sub-time period according to a first type of preset quantile, obtaining a layering of the specified behavior information in the target sub-time period, and determining preset value contents corresponding to the obtained layering as the value contents corresponding to the behavior information in the target sub-time period; and carrying out layering treatment on the appointed transaction information of the target client in the target sub-time period according to the second type of preset quantiles, obtaining the layering of the appointed transaction information in the target sub-time period, and obtaining preset value content corresponding to the obtained layering to serve as the value content corresponding to the appointed transaction information in the target sub-time period.

Optionally, the training process of the asset loss prediction model includes:

determining a plurality of sample clients;

In a third aspect, an embodiment of the present invention further provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;

A memory for storing a computer program;

and the processor is used for realizing any data processing method when executing the program stored in the memory.

In a fourth aspect, embodiments of the present invention further provide a computer readable storage medium having a computer program stored therein, which when executed by a processor implements any of the data processing methods described above.

The embodiment of the invention has the beneficial effects that:

in the data processing method provided by the embodiment of the invention, the asset management platform can acquire the appointed personal attribute information, the appointed behavior information and the appointed transaction information of the target client in each target sub-time period in response to receiving the triggering operation of the triggering button of the client management interface; then inputting the acquired appointed personal attribute information, appointed behavior information and appointed transaction information into an asset loss prediction model which is trained in advance to obtain asset loss probability of a target client in an appointed future time period, namely predicting asset loss condition of the target client in the appointed future time period through the asset loss prediction model which is trained in advance; and when the asset loss probability of the target client in the appointed future time period is larger than the preset threshold value, sending an early warning notice used for representing that the target client predicts the asset loss condition in the appointed future time period to a client manager of the target client according to a preset notice mode. Therefore, by applying the data processing method provided by the embodiment of the invention, the historical asset data of the client can be intelligently processed, so that a certain data reference basis is provided for the client manager to maintain the client.

Of course, it is not necessary for any one product or method of practicing the application to achieve all of the advantages set forth above at the same time.

Drawings

In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the application, and those skilled in the art may obtain other embodiments from these drawings.

FIG. 1 is a schematic flow chart of a data processing method according to an embodiment of the present application;

FIG. 2 is a schematic flow chart of a training method of an asset loss prediction model according to an embodiment of the present application;

FIG. 3 is a flowchart of another method for training an asset loss prediction model according to an embodiment of the present application;

FIG. 4 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;

FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. Based on the embodiments of the present application, all other embodiments obtained by the person skilled in the art based on the present application are included in the scope of protection of the present application.

In order to better understand the embodiments of the present invention, concepts presented in the embodiments of the present invention will be described below.

Transformer network: a network built on the multi-head self-attention mechanism, proposed in Google's paper "Attention Is All You Need". Its two main structures, an encoder and a decoder, can capture the global dependency relationship between input and output based on the multi-head self-attention mechanism, and the network is commonly used for sequence modeling tasks. The principle of the transformer network is an important theoretical basis of large models such as ChatGPT (Chat Generative Pre-trained Transformer). In scenarios such as NLP (Natural Language Processing) or recommendation two-tower models, a model applying a transformer network can perform embedding processing on phrases or ID labels, convert them into low-dimensional vectors, and then feed the vectors into a main structure that applies the multi-head self-attention mechanism. However, models applying a transformer network are generally NLP models, which are designed for NLP scenarios and mainly used for processing natural language; in the network structure of an NLP model, the input layer can handle only a few kinds of features, mainly the embedding vectors of phrases or ID labels. The transformer network may also be referred to as a transformer neural network structure, a transformer neural network, or the like; the present invention does not specifically limit the name, and the term "transformer network" is used throughout the description of the present invention.
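For readers unfamiliar with the mechanism, the following is a minimal single-head sketch of scaled dot-product self-attention, softmax(QK^T / sqrt(d)) V, as defined in the cited paper. As a simplification, queries, keys and values are all taken to be the input rows themselves, whereas a real transformer applies learned projection matrices and runs several such heads in parallel; the function names are illustrative.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Softmax with the maximum subtracted for numerical stability."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: Matrix) -> Matrix:
    """Single-head self-attention with Q = K = V = x (no learned projections)."""
    d = len(x[0])
    scale = math.sqrt(d)
    out: Matrix = []
    for q in x:
        # Similarity of this position to every position in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in x]
        weights = softmax(scores)
        # Each output row is a weighted average of all input rows.
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out
```

Because every output row mixes information from all input rows, each position can depend on any other position in the sequence, which is what the text above calls a global dependency relationship.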

The following first describes a data processing method provided by an embodiment of the present invention.

The data processing method provided by the embodiment of the invention can be applied to an asset management platform. The asset management platform may be functional software and may be loaded on the electronic device. In a specific application, the electronic device may be a server or a terminal device, which is reasonable. In practical application, the terminal device may be: tablet computers, desktop computers, and the like. The asset management platform can be software used by the securities company in the process of managing the assets of the clients, and can analyze and check the in-out condition of the assets of each client in the securities company, the running condition of the products invested by each client and the like.

The data processing method provided by the embodiment of the invention can comprise the following steps:

A data processing method provided by an embodiment of the present invention will be described below with reference to the accompanying drawings.

Fig. 1 is a flow chart of a data processing method according to an embodiment of the present invention, as shown in fig. 1, the method may include steps S101 to S104:

S101, outputting a client management interface about a target client; the client management interface comprises a trigger button for a specified intelligent analysis task; the specified intelligent analysis task is an early warning task regarding whether the target client has asset loss conditions in a specified future time period.

It will be appreciated that the asset management platform may present to the interactive user an interactive interface for the target client. The interface may include several buttons, one for each analysis task, and these buttons may include a trigger button for the specified intelligent analysis task, which may be: an early warning task regarding whether the target client will have an asset loss condition in a specified future time period. The interactive user is a user who interacts with the asset management platform, such as a client manager or a technician of the securities company; the specified future time period may be set according to actual conditions, for example, the next day, the next week, the next month, and the like.

S102, acquiring appointed personal attribute information, appointed behavior information and appointed transaction information of the target client in each target sub-time period in response to receiving triggering operation of the triggering button; wherein each target sub-time period is a time period which is within the target history time period and belongs to the specified time granularity.

It will be appreciated that after the interactive user triggers a button for a specified intelligent analysis task at the customer management interface, the asset management platform may obtain specified personal attribute information, specified behavioral information, and specified transaction information for the target customer at each target sub-time period in response to the triggering operation. The asset management platform may obtain, from a database storing client information, specific personal attribute information, specific behavior information, and specific transaction information of the target client in each target sub-period, which is not specifically limited in this embodiment of the present invention.

The specific personal attribute information of the target client in each target sub-period may be personal information of the target client, and may also be referred to as client portraits, for example, age, sex, city, region, academic, risk assessment level, whether to prefer a middle-sized and small-sized board, whether to prefer a main board, whether to be a high net value client, and the like of the target client may be used as the personal attribute information, and the specific personal attribute information is information selected from the personal attribute information; the behavior information of the target client under each target sub-period may be some operations of the target client at the securities company, for example, the login time of the target client at the client of the securities company, the products purchased by the target client at the securities company, the specific amount of the target client in and out of the assets of the securities company, etc. may be used as the behavior information, and the specified behavior information is information selected from the behavior information; the transaction information of the target client under each target sub-period may be transaction data of the target client, for example, a credit account or a total account, a transaction time of the target client, a transaction amount of the target client in a certain period, an account profit margin of the target client in a certain period, an asset amount of the target client, and the like may be used as the transaction information, and the specified transaction information is information selected from the transaction information.

It should be noted that, each target sub-period may be within the target history period, and each target sub-period may be distributed according to a specified time granularity, and the specified time granularity may characterize a specified division of time within the target history period, so that each target sub-period may be each period distributed according to the specified division within the target history period. For example, each target sub-period may be a first quarter, a second quarter, a third quarter, and a fourth quarter in the past year, that is, the target sub-period may be every three months; or each target sub-period may be the first week, the second week, the third week, and the fourth week in the past month, that is, the target sub-period may be each week, and the specific time length of the target sub-period is not specifically limited in the embodiment of the present application.
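A minimal sketch of dividing a target history time period into target sub-periods is shown below. It assumes that the time granularity is expressed as a fixed number of days and that period ends are exclusive; calendar-based granularities such as quarters would need different arithmetic. The function name is hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple

Period = Tuple[date, date]  # (start, end), with end exclusive


def split_into_sub_periods(start: date, end: date, granularity_days: int) -> List[Period]:
    """Split [start, end) into consecutive sub-periods of a fixed length in days.

    The last sub-period is shortened if the history period is not an exact multiple.
    """
    if granularity_days <= 0 or end <= start:
        raise ValueError("need a positive granularity and a non-empty period")
    step = timedelta(days=granularity_days)
    periods: List[Period] = []
    current = start
    while current < end:
        nxt = min(current + step, end)
        periods.append((current, nxt))
        current = nxt
    return periods
```

For instance, splitting the 28 days from 2023-03-01 up to (but not including) 2023-03-29 with a granularity of 7 days gives four weekly sub-periods.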

In addition, it should be noted that, for each target sub-period, there may be one or more kinds of specified personal attribute information in the target sub-period. For example, two kinds of specified personal attribute information, the sex and the risk assessment level of the target client in a target sub-period, may be acquired; or one kind, the risk assessment level of the target client in a target sub-period, may be acquired. The present application does not limit the number of kinds of specified personal attribute information. Similarly, for each target sub-period, there may be one or more kinds of specified behavior information and one or more kinds of specified transaction information in the target sub-period, and the present application does not limit the number of kinds. It is emphasized that when the specified personal attribute information is the ID of the target client, the asset management platform must additionally acquire other personal attribute information as specified personal attribute information.

S103, inputting the acquired appointed personal attribute information, appointed behavior information and appointed transaction information into an asset loss prediction model which is trained in advance, and obtaining the asset loss probability of the target client in the appointed future time period; wherein the asset loss probability is a probability that the target client has an asset loss condition; the asset loss prediction model is as follows: utilizing appointed personal attribute information, appointed behavior information and appointed transaction information of a sample client in each sample sub-time period and label information of the sample client to train an obtained model based on a transformer network; the label information of the sample client is used for identifying: whether the sample client has incurred asset churn over a reference period of time; the respective sample sub-periods are periods that are located within a sample history period preceding the reference period and belong to a specified time granularity.

It can be appreciated that, after the specified personal attribute information, the specified behavior information and the specified transaction information of the target client are obtained, the asset management platform can input them into the pre-trained asset loss prediction model. The pre-trained asset loss prediction model is a model trained on the basis of a transformer network by using the specified personal attribute information, the specified behavior information and the specified transaction information of sample clients in each sample sub-time period together with the label information of the sample clients, where the label information of a sample client identifies whether the sample client has lost assets in the reference time period. Accordingly, inputting the obtained specified personal attribute information, specified behavior information and specified transaction information into the pre-trained asset loss prediction model yields the asset loss probability of the target client in the specified future time period.

It should be noted that each target sub-period belongs to a specified time granularity, and each sample sub-period also belongs to a specified time granularity. That is, the time granularity of the sample sub-periods is the same as that of the aforementioned target sub-periods; for example, the sample sub-period and the target sub-period may both be one month, or both be one week. The specific time lengths of the sample sub-period and the target sub-period are not specifically limited in the embodiments of the present invention. In addition, each sample sub-period is a period within a sample history period preceding a reference period, and the label information used for training identifies whether the sample client has lost assets in the reference period; at prediction time, the specified personal attribute information, the specified behavior information and the specified transaction information of the target client are input into the pre-trained asset loss prediction model to obtain the asset loss probability of the target client in the specified future time period. Therefore, the relation between the specified future time period and the target history time period may follow the relation between the reference time period and the sample history time period used in model training.
For example, the time difference between the reference time period and the sample history time period may be equal to the time difference between the specified future time period and the target history time period. If the sample history time period and the reference time period are three months apart when the asset loss prediction model is trained, then, when the model is used, the asset loss probability obtained from the target history time period is for a specified future time period that is three months apart from the target history time period.
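The alignment between training and prediction periods can be sketched with simple date arithmetic. The sketch interprets the "time difference" as the gap between the end of the history period and the start of the period being predicted, measured in days, which is an assumption; the function name `aligned_future_period` is hypothetical.

```python
from datetime import date
from typing import Tuple

Period = Tuple[date, date]  # (start, end), with end exclusive


def aligned_future_period(
    sample_history: Period,
    reference: Period,
    target_history: Period,
) -> Period:
    """Derive the future period that the model's output refers to.

    The gap and the length used at prediction time mirror those between the
    sample history period and the reference period used during training.
    """
    gap = reference[0] - sample_history[1]
    length = reference[1] - reference[0]
    start = target_history[1] + gap
    return (start, start + length)
```

If the model was trained on history ending 2022-04-01 with a reference period of July 2022, then history ending 2023-04-01 corresponds to a prediction for July 2023.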

In one implementation, an asset loss prediction model includes: an input layer and a probability fitting layer;

wherein, input layer is used for:

It is understood that the asset loss prediction model may be divided into an input layer and a probabilistic fit layer.

As for the specific function of the input layer, the input layer may perform data processing, according to a predetermined data processing manner, on the specified behavior information and the specified transaction information of the target client that are input into the asset loss prediction model. The predetermined data processing manner maps any data to be processed to a valued content whose data volume is smaller than that of the data to be processed, so that the valued content corresponding to the specified behavior information and the valued content corresponding to the specified transaction information are obtained. It should be noted that the input layer performs this processing for each target sub-period, so that the valued content corresponding to the specified behavior information and the valued content corresponding to the specified transaction information of the target client are obtained for each target sub-period. When there are multiple items of specified behavior information, each item may have its own valued content; likewise, when there are multiple items of specified transaction information, each item may have its own valued content. Optionally, the input layer may also represent the acquired specified personal attribute information of the target client as specific valued content; when there are multiple items of specified personal attribute information, each item may correspond to its own valued content.

In one implementation manner, the input layer performs data processing on the specified behavior information and the specified transaction information of the target client in the target sub-time period according to a predetermined data processing manner for each target sub-time period to obtain a value content corresponding to the specified behavior information in the target sub-time period and a value content corresponding to the specified transaction information in the target sub-time period, where the data processing method includes:

It can be understood that the input layer may tier the specified behavior information according to a first type of predetermined quantile to obtain tiers of the specified behavior information, and determine a preset valued content for each tier; likewise, the input layer may tier the specified transaction information according to a second type of predetermined quantile to obtain tiers of the specified transaction information, and determine a preset valued content for each tier. For example, suppose the monthly transaction amounts of the target client over an 11-month target history period are, in units of 10,000 yuan: 6 in the first month, 47 in the second, 49 in the third, 15 in the fourth, 42 in the fifth, 41 in the sixth, 7 in the seventh, 39 in the eighth, 43 in the ninth, 40 in the tenth and 36 in the eleventh. Arranged in ascending order, the amounts are (6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49). If the predetermined quantile is the quartile, the values at the quartile positions of the sorted sequence are 15, 40 and 43, so the sequence is divided into four tiers, (6, 7, 15), (36, 39, 40), (41, 42, 43) and (47, 49), whose valued contents are 1, 2, 3 and 4 in sequence. Therefore, within the target history period, the valued contents corresponding to the transaction amounts of the first to eleventh months are 1, 4, 4, 1, 3, 3, 1, 2, 3, 2 and 2, respectively. Of course, in the above tiering process, the input layer also performs the processing for each target sub-period, so as to obtain the valued content corresponding to the specified behavior information in each target sub-period and the valued content corresponding to the specified transaction information in each target sub-period. In this scheme, the specified behavior information and the specified transaction information are tiered using the predetermined quantiles, and the valued content corresponding to each tier is determined, so that the specified behavior information and the specified transaction information are represented in a compact form. This reduces the computing resources the model needs when using this information and improves both the utilization of computing resources and the running speed of the model.
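
The quartile example above can be reproduced with a short sketch. It is only an illustration under stated assumptions: it uses Python's standard `statistics.quantiles` (whose default "exclusive" method happens to select the 3rd, 6th and 9th sorted values for 11 data points) and assumes that a value equal to a cut point stays in the lower tier, as in the example. The helper name `tier_of` is hypothetical.

```python
from bisect import bisect_left
from statistics import quantiles

# Monthly transaction amounts from the example, in units of 10,000 yuan.
amounts = [6, 47, 49, 15, 42, 41, 7, 39, 43, 40, 36]

# For 11 values, the default "exclusive" method places the quartiles at
# sorted positions 3, 6 and 9, i.e. 15, 40 and 43 as in the example.
cut_points = quantiles(amounts, n=4)


def tier_of(value, cuts):
    """Return the 1-based tier; a value equal to a cut point stays in the lower tier."""
    return bisect_left(cuts, value) + 1


# One valued content per month, in the original month order.
valued_contents = [tier_of(a, cut_points) for a in amounts]
print(cut_points)
print(valued_contents)
```

Each monthly amount is thus replaced by a small integer, which is the compact representation the paragraph above refers to.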

As for the specific function of the input layer, the input layer may separately encode the specified personal attribute information of the target client, the valued content corresponding to the specified behavior information, and the valued content corresponding to the specified transaction information, to obtain a plurality of encoded contents. Optionally, the input layer may use one-hot encoding to obtain the plurality of encoded contents corresponding to the specified personal attribute information, the specified behavior information and the specified transaction information of the target client. The input layer may then construct the feature vector of the target client from the plurality of encoded contents. Because the encoding is performed for each target sub-period, the plurality of encoded contents obtained relate to one target sub-period, and the constructed feature vector is the feature vector of the target client in that target sub-period. The feature vector of the target client in each target sub-period thus contains the specified personal attribute information of the target client in that sub-period, the valued content corresponding to the specified behavior information, and the valued content corresponding to the specified transaction information.
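
As a minimal sketch of the optional one-hot encoding, the following example encodes one personal attribute and two valued contents for a single target sub-period. The attribute (`region`), its vocabulary and the tier count of four are assumptions chosen for illustration and are not specified by the embodiment.

```python
def one_hot(index: int, size: int) -> list[int]:
    """Return a vector of length `size` with a single 1 at position `index`."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    vec = [0] * size
    vec[index] = 1
    return vec


# Hypothetical vocabularies for one target sub-period.
REGIONS = ["north", "south", "east", "west"]  # an assumed personal attribute
NUM_TIERS = 4                                 # valued contents 1..4


def encode_sub_period(region: str, behavior_tier: int, transaction_tier: int):
    """Encode each feature separately; tiers are 1-based, so shift by one."""
    return {
        "region": one_hot(REGIONS.index(region), len(REGIONS)),
        "behavior": one_hot(behavior_tier - 1, NUM_TIERS),
        "transaction": one_hot(transaction_tier - 1, NUM_TIERS),
    }


encoded = encode_sub_period("east", behavior_tier=2, transaction_tier=4)
print(encoded)
```

Each feature yields its own encoded content, which is what allows a separate feature sub-vector to be built per feature.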

Optionally, in one implementation manner, for each target sub-period, based on a plurality of encoded contents corresponding to the target sub-period, the process of constructing the feature vector of the target client under the target sub-period may include:

For each target sub-time period, constructing respective feature sub-vectors of each feature of the target client in the target sub-time period by utilizing a plurality of coded contents corresponding to the target sub-time period; each feature in the target sub-time period is used for representing appointed personal attribute information, value content corresponding to appointed behavior information and value content corresponding to appointed transaction information of a target client in the target sub-time period;

and performing splicing processing and dimension reduction processing on each constructed feature sub-vector to obtain the feature vector of the target client in the target sub-time period.

It can be understood that, since the plurality of encoded contents are obtained by encoding the specified personal attribute information of the target client, the valued content corresponding to the specified behavior information, and the valued content corresponding to the specified transaction information, the input layer can use the plurality of encoded contents to construct feature sub-vectors corresponding to the different features. The input layer may then perform splicing processing and dimension reduction processing on the feature sub-vectors. The splicing processing splices the local features of the individual feature sub-vectors into a global feature that contains the specified personal attribute information, the valued content corresponding to the specified behavior information and the valued content corresponding to the specified transaction information. The dimension reduction processing converts a high-dimensional sparse vector into a low-dimensional dense vector, so that the constructed feature vector concentrates the feature information. The dimension reduction processing may also be called embedding (emb for short), and an input layer in the model that can perform the dimension reduction processing may accordingly be called an embedding layer (emb layer for short). It should be noted that the order of the splicing processing and the dimension reduction processing is not limited. In this scheme, the specified personal attribute information, the specified behavior information and the specified transaction information of the target client are input into the emb layer and subjected to encoding, splicing and dimension reduction; in particular, the financial features in the transaction behavior are embedded, so that the feature vector can splice together more features, data noise is reduced, and the processing speed of the model is improved.
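
The dimension reduction and splicing can be sketched as follows. This is an illustrative example only: it reduces each one-hot sub-vector first and splices afterwards (the embodiment leaves the order open), and the embedding tables are filled with random numbers, whereas a trained embedding layer would learn them. All names are hypothetical.

```python
import random


def make_table(vocab_size, dim, rng):
    """A hypothetical embedding table: one dense row per one-hot position."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def embed(one_hot_vec, table):
    """Multiply a one-hot row vector by the table (equivalent to a row lookup)."""
    dim = len(table[0])
    return [sum(one_hot_vec[i] * table[i][d] for i in range(len(table)))
            for d in range(dim)]


def feature_vector(encoded, tables):
    """Reduce each feature sub-vector, then splice the results in a fixed order."""
    out = []
    for name in sorted(encoded):
        out.extend(embed(encoded[name], tables[name]))
    return out


rng = random.Random(0)
# Encoded contents of one target sub-period (12 sparse dimensions in total).
encoded = {"region": [0, 0, 1, 0], "behavior": [0, 1, 0, 0], "transaction": [0, 0, 0, 1]}
tables = {name: make_table(len(vec), 2, rng) for name, vec in encoded.items()}
vec = feature_vector(encoded, tables)
print(len(vec))
```

Here three sparse sub-vectors with 12 dimensions in total become one dense feature vector with 6 dimensions for the sub-period.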

As for the specific function of the input layer, the input layer may construct a tensor matrix corresponding to the target client based on the feature vectors of the target client in the individual target sub-periods; the tensor matrix is used to represent the specified personal attribute information, the specified behavior information and the specified transaction information of the target client in each target sub-period.

As for the specific function of the probability fitting layer, the probability fitting layer may input the tensor matrix of the target client generated by the input layer into the main structure of a transformer network that applies a multi-head self-attention mechanism, namely its encoder and decoder, that is, into a multi-head self-attention neural network structure. An output result is generated using the self-attention mechanism of this structure, and the probability fitting layer may use the output result to fit the asset loss probability of the target client in the specified future time period. For example, the probability fitting layer may fit the asset loss probability of the target client in the specified future time period using the output result and a softmax function.
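
To make the data flow concrete, the following heavily simplified sketch applies scaled dot-product self-attention over the rows of a tensor matrix (one row per target sub-period) and then a softmax to obtain a probability. It is not the transformer of the embodiment: it uses a single head with identity query, key and value projections, mean pooling, and fixed output weights, all of which are assumptions made for illustration; a real model would use multiple heads and learned parameters.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Single-head scaled dot-product self-attention with identity Q/K/V projections."""
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out


def loss_probability(tensor, w_out, b_out):
    """Attend over sub-periods, mean-pool, project to two logits, apply softmax."""
    attended = self_attention(tensor)
    n, d = len(attended), len(attended[0])
    pooled = [sum(row[j] for row in attended) / n for j in range(d)]
    logits = [sum(p * w for p, w in zip(pooled, row)) + b
              for row, b in zip(w_out, b_out)]
    return softmax(logits)[1]  # index 1 is taken to mean "asset loss occurs"


# Assumed toy tensor matrix: 3 target sub-periods, 3-dimensional feature vectors.
tensor = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2], [0.5, 0.1, 0.1]]
w_out = [[0.1, -0.2, 0.3], [-0.3, 0.2, 0.1]]  # one weight row per class
b_out = [0.0, 0.0]
p = loss_probability(tensor, w_out, b_out)
print(p)
```

The softmax guarantees that the fitted value lies strictly between 0 and 1, so it can be compared directly with a preset threshold.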

In this scheme, the input layer processes the specified personal attribute information, the specified behavior information and the specified transaction information of the target client and splices them, so that more financial features of the target client are incorporated. This extends the application of transformer-based models to financial scenarios, and the self-attention mechanism improves the prediction effect compared with a traditional DNN (Deep Neural Network) model.

S104, when the asset loss probability of the target client in the specified future time period is greater than a preset threshold, sending an early warning notification to a client manager of the target client according to a predetermined notification manner; the early warning notification is used to indicate that the target client is predicted to have an asset loss condition in the specified future time period.

It may be appreciated that, after obtaining the asset loss probability of the target client in the specified future time period, the asset management platform may compare the obtained probability with the preset threshold. If the asset loss probability of the target client in the specified future time period is greater than the preset threshold, the platform sends an early warning notification to the client manager of the target client according to the predetermined notification manner. The early warning notification may indicate that the target client is predicted to have an asset loss condition in the specified future time period, so that the client manager can take measures to retain the target client.
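
Step S104 can be sketched as a simple comparison followed by a callback. The function name, the identifiers and the message text below are hypothetical; the `send` callback stands in for whatever predetermined notification manner the platform uses.

```python
def maybe_warn(client_id, manager_id, probability, threshold, send):
    """Send an early warning only when the probability exceeds the threshold."""
    if probability > threshold:
        send(manager_id,
             f"Client {client_id} is predicted to have asset loss "
             f"in the specified future period (p={probability:.2f}).")
        return True
    return False


sent = []


def notify(manager_id, text):
    """Stand-in notification channel that just records the message."""
    sent.append((manager_id, text))


maybe_warn("C001", "M42", 0.83, 0.7, notify)
maybe_warn("C002", "M42", 0.70, 0.7, notify)  # equal to the threshold: no notice
print(sent)
```

Because the condition is "greater than", a probability exactly equal to the preset threshold does not trigger a notification.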

In this scheme, the network structure of the input layer, namely the embedding layer, of a model applying the transformer network is modified, and the application scenario is extended. Features such as the transaction information and the personal attribute information of the client are input into the asset loss prediction model and processed by the input layer, and the transformer network is added for sequence modeling. The scheme can be applied to predicting asset loss of high-net-worth clients in the securities industry; it improves the generalization and robustness of the model as well as its recall, accuracy, F1 value and the like, so that the obtained asset loss probability represents a better loss prediction effect. Because an NLP large-model structure is adopted, the scheme is also suitable for large-scale data.

In order to better understand the data processing method provided by the embodiment of the present invention, a training method of the asset loss prediction model is described below with reference to fig. 2.

As shown in fig. 2, the training method of the asset loss prediction model may include the following steps S201 to S205:

S201, determining a plurality of sample clients.

It will be appreciated that, before the asset loss prediction model is trained, the training data needs to be determined, that is, a plurality of sample clients are determined; for example, a plurality of high-net-worth clients may be taken as the plurality of sample clients.

S202, acquiring appointed personal attribute information, appointed behavior information and appointed transaction information of each sample client in each sample sub-period, and label information of the sample client.

It will be appreciated that, after the plurality of sample clients are determined, the specified personal attribute information, specified behavior information and specified transaction information of each sample client in each sample sub-period, as well as the tag information of the sample client, may be obtained from a database or device storing historical data. The tag information of a sample client may be determined according to the historical performance of the sample client. For example, if the assets of a sample client at the end of December are 80% lower than its assets at the end of November, the sample client may be labeled in advance as having lost 80% of its assets in the following month. In that case, the specified personal attribute information, specified behavior information and specified transaction information of the sample client for each month from January to November are obtained, together with tag information identifying that the sample client had asset loss in the reference period, where the reference period is December; specifically, the tag information may identify that the sample client lost 80% of its assets in December.
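
The labeling rule in the example can be sketched as below. The sketch assumes that "asset loss" means a month-over-month drop of at least 80% (the text does not say whether the boundary is inclusive) and that a client with no assets at the earlier month-end is labeled 0; the function name and the asset figures are hypothetical.

```python
def churn_label(assets_prev_month_end, assets_ref_month_end, loss_ratio=0.8):
    """Return 1 if assets fell by at least `loss_ratio` over the reference month, else 0."""
    if assets_prev_month_end <= 0:
        return 0  # assumption: nothing to lose, so no loss is recorded
    drop = (assets_prev_month_end - assets_ref_month_end) / assets_prev_month_end
    return 1 if drop >= loss_ratio else 0


# End-of-November assets of 1,000,000 and end-of-December assets of 200,000
# correspond to an 80% drop, so the tag for the December reference period is 1.
label = churn_label(1_000_000, 200_000)
print(label)
```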

S203, respectively inputting the appointed personal attribute information, the appointed behavior information and the appointed transaction information of each sample client in each sample sub-time period into an asset loss prediction model to be trained, and obtaining the asset loss probability of each sample client in the reference time period.

It can be understood that the specified personal attribute information, specified behavior information and specified transaction information of each sample client in each sample sub-period are input into the asset loss prediction model, in which the network parameters are preset at this point. For each sample sub-period, the asset loss prediction model may perform data processing on the specified behavior information and the specified transaction information of the sample client according to the predetermined data processing manner; encode the specified personal attribute information of the sample client in the sample sub-period, the valued content corresponding to the specified behavior information and the valued content corresponding to the specified transaction information; and construct the feature vector of the sample client in the sample sub-period. The model then constructs the tensor matrix corresponding to the sample client and fits the asset loss probability of the sample client in the reference period using the transformer network. Repeating this processing for each sample client yields the asset loss probability of each sample client in the reference period. These steps are similar to the process of using the asset loss prediction model, which has been described in detail in the foregoing embodiments and is not repeated here.

S204, calculating model loss values based on the asset loss probability of each sample client in the reference time period and the label information of each sample client.

It can be appreciated that the supervision value of each sample client can be determined according to the tag information of each sample client, the asset loss probability of each sample client in the reference time period can be used as a predicted value, and the loss function, the predicted value and the supervision value can be utilized to generate the loss value of the model.
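
The embodiment does not name a particular loss function. As one common choice for a two-class probability, the following sketch computes the mean binary cross-entropy between the predicted asset loss probabilities and 0/1 supervision values; treating the supervision value as 0 or 1 is an assumption made here for illustration.

```python
import math


def binary_cross_entropy(predicted, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(predicted, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# Predictions that agree with the labels give a smaller loss value.
loss_good = binary_cross_entropy([0.9, 0.1], [1, 0])
loss_bad = binary_cross_entropy([0.1, 0.9], [1, 0])
print(loss_good, loss_bad)
```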

S205, judging whether an asset loss prediction model in training is converged based on the model loss value, and if so, obtaining the asset loss prediction model after training; if not, adjusting the parameters of the asset loss predictive model, and returning to the step of determining a plurality of sample clients to continue training the asset loss predictive model.

It can be appreciated that, after the model loss value is obtained, an error back-propagation algorithm may be used to update the parameters of the model, so as to reduce the gap between the actual asset loss condition and the predicted asset loss probability and bring the predicted value closer to the true value. If the model loss value is smaller than a preset value, the training of the asset loss prediction model may be judged complete, and the asset management platform may use the model to periodically output the asset loss probability of high-net-worth clients.
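
The convergence test of step S205 can be illustrated with a toy training loop. A one-feature logistic model trained by plain gradient descent is used here as a stand-in for the transformer-based model, and the preset loss value, learning rate and data are all assumed for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(w, b, xs, ys):
    """Mean binary cross-entropy of a one-feature logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(sigmoid(w * x + b), 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)


def train(xs, ys, preset_loss=0.2, lr=0.5, max_epochs=5000):
    """Update parameters by gradient descent until the loss drops below preset_loss."""
    w, b = 0.0, 0.0
    for _ in range(max_epochs):
        loss = mean_loss(w, b, xs, ys)
        if loss < preset_loss:  # convergence test, as in step S205
            return w, b, loss, True
        # Gradients of the mean cross-entropy with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / len(xs)
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, mean_loss(w, b, xs, ys), False


# Assumed toy data: one feature per sample client and a 0/1 supervision value.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
w, b, final_loss, converged = train(xs, ys)
print(converged, final_loss)
```

The loop either stops once the loss value falls below the preset value or gives up after a maximum number of updates; the second exit is an addition of this sketch, not part of the described method.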

In this embodiment, the input layer of the model allows training with more features of the sample clients, which improves the accuracy of model training, and the self-attention mechanism of the transformer network improves the efficiency of model training.

In order to better understand the training method of the asset loss prediction model provided by the embodiment of the present invention, another training method of the asset loss prediction model is described below with reference to fig. 3. As shown in fig. 3, the method may include the following steps S301-S306:

S301, acquiring specified personal attribute information, specified behavior information and specified transaction information of each sample client in each sample sub-period, and tag information of each sample client.

It will be appreciated that the sample clients may be high-net-worth clients, and the sample sub-periods may be a preselected number n of time steps, where n is a natural number, each time step corresponds to one sample sub-period, and each time step may be one day, one month, and so on. When the model is trained, the client features of each high-net-worth client in the n selected time steps, together with the tag information of each sample client, may first be obtained. The client features may include the specified personal attribute information, the specified behavior information and the specified transaction information. The specified personal attribute information may also be referred to as a client profile; it may include a client ID, but it must also include other personal attribute information, such as the client's gender, age and region.

S302, according to a preset data preprocessing mode, respectively carrying out data processing on the appointed behavior information and the appointed transaction information of each sample client in the sample sub-time period to obtain the valued content corresponding to the appointed behavior information in the sample sub-time period and the valued content corresponding to the appointed transaction information in the sample sub-time period.

It will be understood that the specified behavior information and the specified transaction information of each sample client may be referred to as features of the sample client, and that the data processing performed on them for a sample sub-period may consist of tiering the features in each time step according to quantiles.

S303, respectively encoding the appointed personal attribute information of each sample client in the sample sub-time period, the valued content corresponding to the appointed behavior information in the sample sub-time period and the valued content corresponding to the appointed transaction information in the sample sub-time period according to each sample sub-time period to obtain a plurality of encoded contents corresponding to the sample sub-time period, and constructing the feature vector of each sample client in the sample sub-time period based on the plurality of encoded contents corresponding to the sample sub-time period.

It will be appreciated that the specified personal attribute information, the specified behavior information and the specified transaction information may all be regarded as features of the sample client. Specifically, for the t-th time step, each individual feature of the sample client may be encoded and subjected to dimension reduction processing, that is, embedding, and the embedded vectors of all individual features may be spliced into a vector α_t, where the t-th time step may be any one of the time steps. The vector α_t may be taken as the feature vector of the sample client in that sample sub-period.

S304, constructing a tensor matrix corresponding to each sample client based on the eigenvectors of each sample client in each sample sub-time period.

It will be appreciated that, for a single sample client such as a high-net-worth client, the feature vectors of the sample client in the individual sample sub-periods may be constructed into one tensor sample, that is, a tensor matrix, and the tag information (label) of the tensor matrix may be whether the sample client loses about 80% of its assets in the next one or more time steps. It should be noted that a tensor matrix may be constructed for each sample client in the manner of this step.

S305, fitting the asset loss probability of each sample client in the reference time period by using a transformer network based on the tensor matrix corresponding to each sample client.

It will be appreciated that the tensor matrix constructed in step S304 may be passed through the transformer network, that is, a multi-head self-attention neural network model, to fit the loss probability of the high-net-worth client in a certain future time period and thereby train the model.

S306, calculating a model loss value based on the asset loss probability of each sample client in the reference time period and the label information of each sample client; judging whether the asset loss prediction model in training is converged or not based on the model loss value, and ending training if the asset loss prediction model in training is converged to obtain a trained asset loss prediction model; if not, the parameters of the asset loss predictive model are adjusted, and the process returns to step S301 to continue training the asset loss predictive model.

It will be appreciated that, after model training is completed, the model may be deployed online to output the asset loss probability of high-net-worth clients.

It should be noted that, the related steps in this embodiment have been described in the foregoing embodiments, and specific implementation may refer to the foregoing embodiments, which are not repeated herein.

In addition, in this embodiment the asset loss probability of a high-net-worth client can be predicted, so that the client manager can judge whether the client will remain a high-net-worth client in the future and can carry out digital marketing and loss intervention.

Based on the above method embodiment, the present invention further provides a data processing apparatus, where the apparatus is applied to an asset management platform, as shown in fig. 4, the apparatus may include:

an output module 410 for outputting a client management interface for a target client; the client management interface comprises a trigger button aiming at an appointed intelligent analysis task; the appointed intelligent analysis task is an early warning task about whether the target client has asset loss conditions in an appointed future time period;

an obtaining module 420, configured to obtain, in response to receiving a trigger operation for the trigger button, specified personal attribute information, specified behavior information, and specified transaction information of the target client in each target sub-period; wherein each target sub-time period is a time period which is within a target history time period and belongs to a specified time granularity;

an input module 430, configured to input the obtained specified personal attribute information, specified behavior information, and specified transaction information into a pre-trained asset loss prediction model, so as to obtain an asset loss probability of the target client in the specified future time period; wherein the asset loss probability is a probability that the target client has an asset loss condition; the asset loss prediction model is a model based on a transformer network and obtained by training with the specified personal attribute information, specified behavior information and specified transaction information of sample clients in each sample sub-period and the tag information of the sample clients; the tag information of a sample client is used for identifying whether the sample client has incurred asset loss in a reference period; each sample sub-period is a period that lies within a sample history period before the reference period and belongs to the specified time granularity;

a notification module 440, configured to send an early warning notification to a client manager of the target client according to a predetermined notification manner when the asset loss probability of the target client in the specified future time period is greater than a preset threshold; the early warning notification is used to indicate that the target client is predicted to have an asset loss condition in the specified future time period.

wherein, input layer is used for:

Optionally, the training process of the asset loss prediction model includes:

determining a plurality of sample clients;

The embodiment of the invention also provides an electronic device, as shown in fig. 5, which comprises a processor 501, a communication interface 502, a memory 503 and a communication bus 504, wherein the processor 501, the communication interface 502 and the memory 503 complete communication with each other through the communication bus 504,

a memory 503 for storing a computer program;

the processor 501 is configured to implement the steps of the data processing method when executing the program stored in the memory 503:

The communication bus mentioned above for the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be classified as an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is represented in the figure by only one bold line, but this does not mean that there is only one bus or only one type of bus.

The communication interface is used for communication between the electronic device and other devices.

The Memory may include random access Memory (Random Access Memory, RAM) or may include Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the aforementioned processor.

The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; but also digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components.

In yet another embodiment of the present invention, there is also provided a computer readable storage medium having stored therein a computer program which when executed by a processor implements the steps of any of the data processing methods described above.

In a further embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the data processing methods of the above embodiments.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, wireless, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), etc.

It is noted that relational terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.

The embodiments in this specification are described in a related manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for the relevant details, refer to the description of the method embodiments.

The foregoing describes only the preferred embodiments of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims

1. A data processing method, characterized by being applied to an asset management platform; the method comprises the following steps:

2. The method of claim 1, wherein the asset loss prediction model comprises: an input layer and a probability fitting layer;

wherein the input layer is used for:

3. The method according to claim 2, wherein the input layer performs data processing on the specified behavior information and the specified transaction information of the target client in the target sub-time period according to a predetermined data processing manner for each target sub-time period, to obtain a valued content corresponding to the specified behavior information in the target sub-time period and a valued content corresponding to the specified transaction information in the target sub-time period, respectively, including:

4. The method of claim 1 or 2, wherein the training process of the asset loss prediction model comprises:

determining a plurality of sample clients;

5. A data processing apparatus, characterized by being applied to an asset management platform; the device comprises:

6. The apparatus of claim 5, wherein the asset loss prediction model comprises: an input layer and a probability fitting layer;

wherein the input layer is used for:

7. The apparatus of claim 6, wherein the input layer performs data processing on the specified behavior information and the specified transaction information of the target client in the target sub-time period according to a predetermined data processing manner for each target sub-time period, to obtain a valued content corresponding to the specified behavior information in the target sub-time period and a valued content corresponding to the specified transaction information in the target sub-time period, respectively, including:

8. The apparatus of claim 5 or 6, wherein the training process of the asset loss prediction model comprises:

determining a plurality of sample clients;

9. An electronic device, characterized by comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;

the memory is configured to store a computer program;

the processor is configured to implement the method of any of claims 1-4 when executing the program stored in the memory.

10. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program which, when executed by a processor, implements the method of any of claims 1-4.