CN110909775A

CN110909775A - Data processing method and device and electronic equipment

Info

Publication number: CN110909775A
Application number: CN201911092266.2A
Authority: CN
Inventors: 刘腾飞
Original assignee: Alipay Hangzhou Information Technology Co Ltd
Current assignee: Alipay Hangzhou Information Technology Co Ltd
Priority date: 2019-11-08
Filing date: 2019-11-08
Publication date: 2020-03-24

Abstract

The embodiments of this specification disclose a data processing method, a data processing apparatus, and an electronic device. A service model is trained with sample data to determine the feature parameters of each preset feature in the service model with respect to different target services, where the sample data comprises service data of the different target services. The feature parameters in the service model are then adjusted separately for each target service to obtain a sub-model for each target service, so that each sub-model processes its corresponding target service.

Description

Data processing method and device and electronic equipment

Technical Field

The embodiment of the specification relates to the technical field of computers, in particular to a data processing method and device and electronic equipment.

Background

In commercial and production activities, especially in the context of big data, a large number of services need to be processed. To process these services effectively and keep them running normally, different service models are usually used to process the corresponding services.

In the prior art, sample data corresponding to each target service is typically used to train a model for that target service separately, so that one service model is determined per target service and is then used to process that target service.

Disclosure of Invention

In view of this, embodiments of the present specification provide a data processing method, an apparatus, and an electronic device, which are used to solve the problem in the prior art that service models obtained by training on a plurality of services separately have low service processing efficiency.

The embodiments of the specification adopt the following technical solutions:

An embodiment of the present specification provides a data processing method, including:

training a service model by using sample data to determine characteristic parameters of each preset characteristic in the service model when the preset characteristic corresponds to different target services, wherein the sample data comprises service data of the different target services;

and respectively adjusting the characteristic parameters in the service model according to each target service to obtain respective submodels of different target services, so that the submodels perform corresponding target service processing.

An embodiment of the present specification further provides a data processing method, including:

training a risk identification model by using sample data to determine characteristic parameters of each preset characteristic in the risk identification model when different target risks are identified, wherein the sample data comprises risk data of the different target risks;

and respectively adjusting the characteristic parameters in the risk identification model according to each target risk to obtain identification submodels corresponding to different target risks, so that the identification submodels identify the corresponding target risks.

An embodiment of the present specification further provides a data processing method, including:

acquiring service data;

processing the service data by using a sub-model to obtain the target service to which the service data belongs, wherein the sub-model is one of the sub-models of the different target services, obtained by training a service model with sample data to determine the feature parameters of each preset feature in the service model with respect to the different target services and then adjusting the feature parameters in the service model separately for each target service, and wherein the sample data comprises service data of the different target services;

and determining a service processing strategy corresponding to the service data according to the target class service to which the service data belongs.

An embodiment of the present specification further provides a data processing method, including:

acquiring service data;

processing the service data by using an identification sub-model to obtain the target risk to which the service data belongs, wherein the identification sub-model is one of the identification sub-models of the different target risks, obtained by training a risk identification model with sample data to determine the feature parameters of each preset feature in the risk identification model when different target risks are identified and then adjusting the feature parameters in the risk identification model separately for each target risk, and wherein the sample data comprises risk data of the different target risks;

and determining a business processing strategy corresponding to the business data according to the target class risk to which the business data belongs.

An embodiment of the present specification further provides a data processing apparatus, including:

the training module is used for training a service model by using sample data to determine characteristic parameters of each preset characteristic in the service model when the preset characteristic corresponds to different target services, wherein the sample data comprises service data of the different target services;

and the adjusting module is used for respectively adjusting the characteristic parameters in the service models according to the target services to obtain respective sub-models of the different target services, so that the sub-models perform corresponding target service processing.

An embodiment of the present specification further provides a data processing apparatus, including:

the training module is used for training a risk identification model by using sample data to determine the feature parameters of each preset feature in the risk identification model when different target risks are identified, wherein the sample data comprises risk data of the different target risks;

and the adjusting module is used for respectively adjusting the characteristic parameters in the risk identification model according to each target risk to obtain identification submodels corresponding to different target risks, so that the identification submodels identify the corresponding target risks.

An embodiment of the present specification further provides a data processing apparatus, including:

the acquisition module is used for acquiring service data;

the processing module is used for processing the service data by using a sub-model to obtain the target service to which the service data belongs, wherein the sub-model is one of the sub-models of the different target services, obtained by training a service model with sample data to determine the feature parameters of each preset feature in the service model with respect to the different target services and then adjusting the feature parameters in the service model separately for each target service, and wherein the sample data comprises service data of the different target services;

and the determining module is used for determining a service processing strategy corresponding to the service data according to the target class service to which the service data belongs.

An embodiment of the present specification further provides a data processing apparatus, including:

the acquisition module is used for acquiring service data;

the processing module is used for processing the service data by using an identification sub-model to obtain the target risk to which the service data belongs, wherein the identification sub-model is one of the identification sub-models of the different target risks, obtained by training a risk identification model with sample data to determine the feature parameters of each preset feature in the risk identification model when different target risks are identified and then adjusting the feature parameters in the risk identification model separately for each target risk, and wherein the sample data comprises risk data of the different target risks;

and the determining module is used for determining a business processing strategy corresponding to the business data according to the target class risk to which the business data belongs.

Embodiments of the present specification further provide an electronic device, including at least one processor and a memory, where the memory stores programs and is configured to enable the at least one processor to execute the following steps:

and respectively adjusting the feature parameters in the risk identification model according to each target risk to obtain respective identification sub-models of the different target risks, so that the identification sub-models perform the corresponding target risk identification.

acquiring service data;

At least one of the technical solutions adopted in the embodiments of the specification can achieve the following beneficial effects:

A service model is trained with sample data to determine the feature parameters of the same preset feature in the service model with respect to different target services, so that the degree of influence of each preset feature on all of the target services is taken into account. Because the preset features are shared among the different target services during training, the generalization capability of each sub-model can be improved and overfitting can be prevented. The feature parameters in the service model are then adjusted separately for each target service to obtain a sub-model for each target service, which can improve both the training effect and the service processing effect of the sub-models.

Drawings

The accompanying drawings, which are included to provide a further understanding of the embodiments of the specification and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the specification and together with the description serve to explain the application and not to limit the application. In the drawings:

Fig. 1 is a schematic flowchart of a data processing method provided in an embodiment of the present specification;

Fig. 2 is a schematic flowchart of a data processing method provided in an embodiment of the present specification;

Fig. 3 is a schematic flowchart of a data processing method provided in an embodiment of the present specification;

Fig. 4 is a flowchart of a data processing method using a neural network model provided in an embodiment of the present specification;

Fig. 5 is a schematic flowchart of a data processing method provided in an embodiment of the present specification;

Fig. 6 is a schematic flowchart of a data processing method provided in an embodiment of the present specification;

Fig. 7 is a schematic structural diagram of a data processing apparatus provided in an embodiment of the present specification;

Fig. 8 is a schematic structural diagram of a data processing apparatus provided in an embodiment of the present specification;

Fig. 9 is a schematic structural diagram of a data processing apparatus provided in an embodiment of the present specification;

Fig. 10 is a schematic structural diagram of a data processing apparatus provided in an embodiment of the present specification.

Detailed Description

When training different target services, sample data of each target service is generally collected, preset characteristics of each target service are respectively determined, and then independent training is performed to obtain a service model for processing the target service.

In the prior art, when models are trained for different target services with similar service attributes, only the influence of each target service's own preset features is considered, and the mutual influence among the preset features of the different target services is ignored. As a result, the plurality of service models obtained by training have low processing efficiency and are prone to processing errors.

Therefore, embodiments of the present specification provide a data processing method, an apparatus, and an electronic device, which train a service model by using sample data, and further determine feature parameters of the same preset feature in the service model when the preset feature corresponds to different target services, and take into account the influence degree of each preset feature on all target services. When different target services are trained, preset features among the different target services can be shared, so that the generalization capability of each sub-model can be improved, and overfitting is prevented. The characteristic parameters in the service model are respectively adjusted according to the target services to obtain respective submodels of different target services, so that the training effect and the service processing effect of the submodels can be improved.

In order to make the objects, technical solutions and advantages of the present application more clear, the technical solutions of the present application will be clearly and completely described below with reference to the specific embodiments of the present specification and the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present disclosure, and not all embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step are within the scope of the present application.

The technical solutions provided by the embodiments of the present description are described in detail below with reference to the accompanying drawings.

Fig. 1 is a schematic flowchart of a data processing method according to an embodiment of the present disclosure.

S101: training a service model by using sample data to determine characteristic parameters of each preset characteristic in the service model when the preset characteristic corresponds to different target services, wherein the sample data comprises service data of the different target services.

In this embodiment of the present specification, the service model may be a model that processes a target service by using the evaluation features corresponding to that target service, so as to obtain a service processing policy for each service. The service model may specifically be a neural network model or the like, and is not specifically limited herein. For example, when the service model is used to process a target service, the service type of the target service may be determined and a corresponding service processing policy obtained, or an abnormality or risk of the target service may be found in time so that prevention and control measures, such as interception, can be applied to it. This is not limited herein.

The target class service may be understood as a service of the same type that needs to be processed, and different target class services have similar service attributes, such as different types of transactions in the transaction service class, different target class risks in the application class service, and the like, which are not specifically limited herein.

Specifically, the service model may perform service processing such as service prediction on the target service. Service prediction may refer to predicting the operating state of a service, or predicting the specific type of a service, from the service data related to it. For example, the model may predict whether the service is abnormal or risky, the probability of such an abnormality or risk, or the specific type to which the service belongs (such as a transaction service or an application service), so that risks and anomalies can be prevented and controlled in time and losses reduced as far as possible.

The sample data may be training samples for training the service model, and specifically may be a plurality of historical service data covering the different target services. By learning from the sample data, the service model can determine the feature parameters of each preset feature with respect to the different target services; these may specifically serve as the reference conditions for predicting whether the different target services are abnormal or risky.

Further, a feature parameter may be understood as the evaluation parameter of one preset feature with respect to a particular target service, specifically the weight of that preset feature for that target service. In a specific application scenario, the service model may process each target service according to the weights of the preset features for that target service. For example, each target service may be scored, and the score may be used to determine whether the target service satisfies a preset condition indicating some abnormality or risk.
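The weighting and scoring described above can be sketched as follows. This is a minimal illustration only: the feature names follow the transaction example used in this description, while the weight values, the threshold, and the function names are hypothetical assumptions rather than anything specified here.

```python
# Hypothetical feature parameters: one weight per preset feature, per target service.
# All names and numbers are illustrative assumptions.
FEATURE_PARAMETERS = {
    "false_transaction": {
        "transaction_time": 0.5,
        "transaction_amount": 0.5,
        "transaction_location": 0.0,
        "transaction_subject": 0.0,
    },
    "gambling_transaction": {
        "transaction_time": 0.25,
        "transaction_amount": 0.25,
        "transaction_location": 0.5,
        "transaction_subject": 0.0,
    },
}


def score(sample, target_service):
    """Weighted sum of the sample's feature values for one target service."""
    weights = FEATURE_PARAMETERS[target_service]
    return sum(weights[name] * sample.get(name, 0.0) for name in weights)


def meets_preset_condition(sample, target_service, threshold=0.75):
    """Return True when the score reaches the (assumed) risk threshold."""
    return score(sample, target_service) >= threshold


# Feature values are assumed to be normalized to the range [0, 1].
sample = {
    "transaction_time": 1.0,
    "transaction_amount": 1.0,
    "transaction_location": 0.0,
    "transaction_subject": 0.0,
}
print(score(sample, "false_transaction"))
print(meets_preset_condition(sample, "gambling_transaction"))
```

The same sample is scored differently for each target service because every preset feature carries its own weight per service.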

A preset feature may be feature information that is determined from the service attributes of the different target services and that can be used to process those services. Because different target services have similar service attributes, the feature parameters of every preset feature are learned for every target service during training. The service model therefore weighs the influence of all of the preset features when processing any one target service, which yields a more accurate service processing result.

For example, transaction-type services may include a false transaction service, a gambling transaction service, a fraudulent transaction service, and so on. When predicting whether a service is a false transaction, the preset features may be the transaction time and the transaction amount; for the gambling transaction service, they may be the transaction time, the transaction amount, and the transaction location; and for the fraudulent transaction service, they may be the transaction amount and the transaction subject. In the embodiment of the present specification, training is performed across the different transaction-type services, so that the feature parameters of all of the preset features can be determined for predicting each target service. In this case, when the gambling transaction service is predicted with the service model, the transaction amount, transaction time, transaction location, and transaction subject of the transaction data are all evaluated, and a more accurate prediction result can be obtained.

As an application embodiment, before training the business model by using sample data, the method may further include:

and determining the preset characteristics in the service model according to the evaluation characteristics of the different target services.

The evaluation features of the different target services may refer to the evaluation indexes used to process those services. As in the above example, the evaluation features of the false transaction service may be the transaction time, the transaction amount, and the like, and the evaluation features of the gambling transaction service may be the transaction time, the transaction amount, the transaction location, and the like, which is not specifically limited herein.

In the service model in the embodiment of the present specification, by combining the preset features corresponding to different target services together, when performing service processing training of different target services, the influence of all the preset features in the set on the service processing of the target services can be considered, so that the feature parameters of the different target services corresponding to the preset features can be determined, and the service processing effect is improved.
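Combining the evaluation features of the individual target services into one shared set of preset features can be sketched as below. The feature lists are taken from the transaction example in this description; the function name and the dictionary layout are hypothetical.

```python
# Evaluation features per target service, following the transaction example.
EVALUATION_FEATURES = {
    "false_transaction": ["transaction_time", "transaction_amount"],
    "gambling_transaction": [
        "transaction_time",
        "transaction_amount",
        "transaction_location",
    ],
    "fraudulent_transaction": ["transaction_amount", "transaction_subject"],
}


def build_preset_features(evaluation_features):
    """Merge all evaluation features into one ordered, duplicate-free list."""
    preset = []
    for features in evaluation_features.values():
        for name in features:
            if name not in preset:
                preset.append(name)
    return preset


preset_features = build_preset_features(EVALUATION_FEATURES)
print(preset_features)
```

Every target service is then trained against this single list, so a feature such as the transaction subject also receives a weight for services that did not originally list it.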

Because the different target services are trained with the sample data on the basis of the same preset features, the resulting service model can predict the different target services synchronously. Continuing the above example, when an unknown transaction is predicted with the service model and the prediction result shows that the transaction simultaneously meets the prediction conditions of the false transaction service and the fraudulent transaction service, it can be concluded that the transaction carries two different abnormal risks and is high risk. In other words, the service model can process different target services synchronously.

S103: respectively adjusting the feature parameters in the service model according to each target service to obtain respective sub-models of the different target services, so that the sub-models perform the corresponding target service processing.
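The synchronous prediction in the example above can be illustrated with a short sketch. It assumes the service model has already produced one score per target service in a single pass; the threshold, the intermediate "elevated" level, and the function names are hypothetical.

```python
def matched_services(scores, threshold=0.5):
    """Return every target service whose score reaches the assumed threshold."""
    return [service for service, value in scores.items() if value >= threshold]


def risk_level(matches):
    """Two or more matched target services are treated as high risk."""
    if len(matches) >= 2:
        return "high"
    if matches:
        return "elevated"  # assumed intermediate level, not from the source
    return "normal"


# Scores assumed to come from one pass of the shared service model.
scores = {
    "false_transaction": 0.8,
    "gambling_transaction": 0.1,
    "fraudulent_transaction": 0.7,
}
matches = matched_services(scores)
print(matches, risk_level(matches))
```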

In this embodiment of the present specification, a sub-model is a model that processes one corresponding target service. Each sub-model contains the feature parameters of the preset features used when processing its target service, and can retain the service processing accuracy of the service model. In addition, the sub-models are smaller in scale, so they are easier to deploy in different business scenarios.

As an application embodiment, the adjusting the feature parameter corresponding to each target class service in the service model according to each target class service may include:

and compressing the service model according to each target service to adjust the characteristic parameters in the service model to obtain respective sub-models of different target services.

Each sub-model processes its corresponding target service, so each sub-model must contain the feature parameters of that target service. By compressing the service model separately according to the feature parameters corresponding to each target service, the feature parameters needed for that target service are retained in the sub-model, which allows the sub-model to process the target service.

In a specific application scenario, compressing the service model according to each target class service may include:

and respectively cutting the service model according to each target service to reserve the respective characteristic parameters of each target service, and obtaining the sub-model of the target service constructed based on the reserved characteristic parameters.

In order to obtain the submodels corresponding to different target services from the service model, the service model can be cut, model structures which are not needed for processing the target services are respectively removed, and finally the submodels corresponding to the different target services are obtained.

When the business model is cut, the business model can be cut according to each target business, so that each preset characteristic and each characteristic parameter corresponding to each target business are reserved in the obtained sub-models corresponding to different target businesses, and each sub-model can effectively process the corresponding target business.

In a specific application scenario, a model pruning method may be used to split the service model, specifically, for the sub-models corresponding to each target service, the feature parameters of the preset features irrelevant to the processing of the target service are removed by clipping, and the feature parameters of the preset features corresponding to the target service are retained, so that the sub-models corresponding to different target services can be obtained.
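One way to picture this pruning step is the sketch below. It treats the service model as a table of weights and assumes that a preset feature is irrelevant to a target service when its weight magnitude falls below a cutoff; that relevance criterion, the cutoff, and all names and numbers are illustrative assumptions, since no particular pruning rule is specified here.

```python
def prune_for_service(feature_parameters, target_service, min_abs_weight=0.05):
    """Keep only the weights this target service relies on (assumed criterion)."""
    service_weights = feature_parameters[target_service]
    return {
        name: weight
        for name, weight in service_weights.items()
        if abs(weight) >= min_abs_weight
    }


# Hypothetical trained service model: weights per target service and preset feature.
service_model = {
    "false_transaction": {
        "transaction_time": 0.5,
        "transaction_amount": 0.4,
        "transaction_location": 0.01,
        "transaction_subject": 0.0,
    },
    "fraudulent_transaction": {
        "transaction_time": 0.02,
        "transaction_amount": 0.6,
        "transaction_location": 0.0,
        "transaction_subject": 0.4,
    },
}

sub_models = {
    service: prune_for_service(service_model, service) for service in service_model
}
print(sub_models)
```

Each resulting sub-model holds fewer parameters than the full service model, which is what makes it lighter to deploy.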

As another application embodiment, the compressing the service model according to the feature parameters corresponding to each target class service may further include:

obtaining respective sample data of the different target services;

respectively executing the following steps by utilizing the respective sample data of the different target services:

and retraining the service model by using the sample data of the target service to adjust the characteristic parameters corresponding to the target service in the service model to obtain the sub-model corresponding to the target service.

To obtain the sub-models corresponding to the different target services, the service model may be retrained. In a specific application scenario, a knowledge distillation algorithm may be used: the service model is retrained with the sample data of each target service to obtain the sub-model corresponding to that target service.
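A minimal knowledge distillation sketch, under strong simplifying assumptions, is shown below. The trained service model is replaced by a fixed logistic "teacher" with hypothetical weights, and the sub-model is a smaller logistic "student" that sees only the first two features and is fitted to the teacher's soft outputs by gradient descent. None of these modelling choices come from this description.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def teacher_probability(features):
    """Stand-in for the trained service model's output for one target service."""
    weights = [2.0, -1.0, 0.5]  # hypothetical teacher weights
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def student_probability(weights, features):
    """Smaller model: zip() limits it to the first len(weights) features."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def soft_cross_entropy(weights, samples):
    """Mean cross-entropy between teacher soft labels and student outputs."""
    total = 0.0
    for features in samples:
        target = teacher_probability(features)
        predicted = student_probability(weights, features)
        total -= target * math.log(predicted)
        total -= (1.0 - target) * math.log(1.0 - predicted)
    return total / len(samples)


def distill(samples, n_student_features=2, learning_rate=0.5, epochs=200):
    """Fit the student to the teacher's soft labels with gradient descent."""
    weights = [0.0] * n_student_features
    for _ in range(epochs):
        gradient = [0.0] * n_student_features
        for features in samples:
            error = student_probability(weights, features) - teacher_probability(features)
            for j in range(n_student_features):
                gradient[j] += error * features[j]
        weights = [
            w - learning_rate * g / len(samples) for w, g in zip(weights, gradient)
        ]
    return weights


# Illustrative sample data for a single target service.
samples = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 1.0), (0.0, 0.0, 1.0)]
student_weights = distill(samples)
print(student_weights, soft_cross_entropy(student_weights, samples))
```

The student ends with fewer parameters than the teacher while its outputs move toward the teacher's on the target service's own samples.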

In the data processing method provided in the embodiment of the present specification, a service model is trained by using sample data, so as to determine feature parameters of the same preset feature in the service model when the preset feature corresponds to different target services, and the influence degree of each preset feature on all target services is taken into consideration. When different target services are trained, preset features among the different target services can be shared, so that the generalization capability of each sub-model can be improved, and overfitting is prevented. The characteristic parameters in the service model are respectively adjusted according to the target services to obtain respective submodels of different target services, so that the training effect and the service processing effect of the submodels can be improved.

Fig. 2 is a schematic flowchart of a data processing method according to an embodiment of the present disclosure.

After the sub-models are deployed and brought on line, with the development of services, different target services may be updated, and the target service models need to be updated so as to keep high-precision processing on the target services. When at least one target service is updated, the sub-model needs to be updated iteratively, so that the sub-model can maintain a processing effect with high accuracy on the updated target service.

S201: when the evaluation feature corresponding to at least one target service is updated, updating the stored sample data by using service data containing the updated evaluation feature.

When the evaluation feature corresponding to at least one target service is updated, the service model needs to be retrained so that it processes the updated target service accurately. To improve the training effect, the originally stored sample data can be updated with service data containing the updated evaluation feature, so that the service model can learn from the updated sample data how to process the updated target service.

S203: retraining the service model by using the updated sample data to update the feature parameters of each preset feature in the service model with respect to the different target services.

By retraining the service model by using the updated sample data, the characteristic parameters of each preset characteristic in the service model when different target services are predicted can be updated, and then at least one updated target service can be processed synchronously.

As an application embodiment, retraining the business model by using the updated sample data may include:

updating each preset feature in the service model according to the updated evaluation feature;

and training the updated service model by using the updated sample data to update the characteristic parameters of each preset characteristic in the service model when the preset characteristic corresponds to the different target services.

In order to ensure that the service model can accurately process the updated target service, the preset characteristics can be updated through the updated evaluation characteristics, and then the updated service model is subjected to updated service processing training by using the updated sample data.

S205: respectively adjusting, according to each target service, the feature parameters corresponding to that target service in the service model to obtain the sub-models corresponding to the different target services, which may include:

and adjusting the updated characteristic parameters according to the at least one target service to obtain the updated sub-model corresponding to the at least one target service.

As an application embodiment, adjusting the updated characteristic parameter according to the at least one target class service to obtain an updated sub-model corresponding to the at least one target class service may include:

processing other target services, different from the at least one target service, with the updated service model by using the sample data of those other target services, to obtain a service processing result;

judging whether the service processing result meets the expectation;

and if so, updating the sub-models corresponding to the other target services.

After the service model has been retrained for the at least one updated target service, its processing effect on the other target services may show no improvement, or may even be worse than that of the original sub-models. In that case the original sub-models do not need to be updated.

Therefore, before the original submodel is deployed and updated, the processing effects of the two models can be tested and judged, if the updated service model has better processing effects on other target services and accords with expectations, the original submodel can be updated, and otherwise, the original submodel is not updated.
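The test-before-update check can be sketched as follows. Accuracy is used here as the measure of processing effect and 0.9 as the expectation; both are assumptions for illustration, as are the function names and the toy models.

```python
def accuracy(model, samples):
    """Fraction of labelled samples that the model classifies correctly."""
    correct = sum(1 for features, label in samples if model(features) == label)
    return correct / len(samples)


def should_update(original, candidate, samples, min_accuracy=0.9):
    """Update only if the candidate meets expectations and beats the original."""
    candidate_accuracy = accuracy(candidate, samples)
    return (
        candidate_accuracy >= min_accuracy
        and candidate_accuracy > accuracy(original, samples)
    )


# Toy sample data of another target service: (feature value, label).
other_service_samples = [(0.2, 0), (0.6, 1), (0.8, 1), (0.4, 0)]


def original_sub_model(x):
    return int(x > 0.7)


def updated_model(x):
    return int(x > 0.5)


print(should_update(original_sub_model, updated_model, other_service_samples))
```

If the check fails, the originally deployed sub-model for that target service simply stays in place.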

Fig. 3 is a flowchart illustrating a data processing method according to an embodiment of the present disclosure, where risk identification is used as an application of the embodiment of the present disclosure.

S301: training a risk identification model by using sample data to determine characteristic parameters of each preset characteristic in the risk identification model when different target risks are identified, wherein the sample data comprises risk data of the different target risks.

In this embodiment of the present specification, the sample data may be sample data including risk data corresponding to different target class risks, specifically, each sample data may be various types of historical risk data with a respective risk type label, and supervised risk recognition training may be performed on the risk recognition model using the sample data.

The sample data may be collected within a preset time period, that is, it may be a plurality of historical risk data corresponding to the various target-class risks within that period. Collecting sample data within a preset time period allows training on the most recent samples, so that the trained risk identification model can effectively identify and predict the latest target-class risks. The trained risk identification model can identify multiple target-class risks synchronously, and its risk identification results are more accurate.

Different target-class risks can be understood as different risk types of the same target-class service, and each target-class risk has similar feature attributes, such as the false transaction risk, cash-out transaction risk, gambling transaction risk, marketing cheating risk, investment cheating risk, and the like corresponding to transaction-type services, which are not specifically limited herein.
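Restricting the sample data to a preset time period can be sketched as below. The record layout, the field names, and the 30-day window are hypothetical choices made only for illustration.

```python
from datetime import datetime, timedelta


def select_recent_samples(records, now, window_days=30):
    """Keep labelled risk records that occurred within the preset time window."""
    cutoff = now - timedelta(days=window_days)
    return [r for r in records if cutoff <= r["occurred_at"] <= now]


# Hypothetical historical risk data, each record carrying a risk type label.
records = [
    {"occurred_at": datetime(2019, 11, 1), "risk_type": "false_transaction"},
    {"occurred_at": datetime(2019, 9, 1), "risk_type": "gambling_transaction"},
]
recent = select_recent_samples(records, now=datetime(2019, 11, 8))
print(recent)
```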

The preset features may be the risk features corresponding to the target class risks. All preset features of the different target class risks together form the set of preset features in the risk identification model, so that every target class risk is trained for identification on the same preset features and a more accurate risk identification model can be obtained.

S303: adjusting the characteristic parameters in the risk identification model according to each target class risk to obtain respective identification submodels of the different target class risks, so that each identification submodel performs the corresponding target class risk identification.

Because the risk identification model can identify the various target class risks simultaneously, it is large in scale. By determining identification submodels corresponding to the different target class risks, the identification accuracy of the risk identification model can be retained while the scale is reduced, which makes the submodels better suited to deployment in different business scenarios.

When the risk features corresponding to at least one target class risk are updated, the sample data are updated so that the updated sample data include sample data of the at least one target class risk. The risk identification model is then trained with the updated sample data, so as to update the characteristic parameters that each preset feature, including the updated risk features, takes in the risk identification model when the different target class risks are predicted.

Further, adjusting the characteristic parameters in the risk identification model according to each target class risk to obtain respective identification submodels of the different target class risks may include:

and adjusting the updated characteristic parameters according to the at least one target risk to obtain an updated identification submodel corresponding to the at least one target risk.

Specifically, each preset feature in the risk identification model may be updated according to the updated risk feature, and then the updated risk identification model is trained by using the updated sample data, so as to update the feature parameters of each preset feature including the risk feature in the risk identification model when identifying the different target class risks.

To determine whether the updated risk identification model also identifies other target class risks, different from the at least one target class risk, more effectively, sample data of those other target class risks may be used to run an identification test on the updated risk identification model. The test result for the other target class risks is then compared with the identification result of the identification submodel currently corresponding to them; if the test result is better, the identification submodel corresponding to the other target class risks is updated. In this way, unnecessary updates can be avoided.
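The comparison step above can be sketched as follows. This is a minimal sketch, assuming plain accuracy on held-out samples as the comparison metric; the specification does not fix a metric, and the names `accuracy`, `should_replace`, `old_model`, and `new_model` are hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical sample layout: (feature dict, 0/1 label for one target class risk).
Sample = Tuple[dict, int]


def accuracy(predict: Callable[[dict], int], samples: Sequence[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if not samples:
        raise ValueError("need at least one test sample")
    hits = sum(1 for features, label in samples if predict(features) == label)
    return hits / len(samples)


def should_replace(old_predict, new_predict, samples) -> bool:
    """Replace the deployed submodel only if the candidate is strictly better."""
    return accuracy(new_predict, samples) > accuracy(old_predict, samples)


# Toy test data for one "other" target class risk (assumed values).
test_samples = [
    ({"amount": 900}, 1),
    ({"amount": 20}, 0),
    ({"amount": 450}, 1),
    ({"amount": 300}, 0),
]
old_model = lambda f: int(f["amount"] > 500)  # stands in for the deployed submodel
new_model = lambda f: int(f["amount"] > 400)  # stands in for the updated model

replace = should_replace(old_model, new_model, test_samples)
```

Requiring a strict improvement is one way to avoid the unnecessary updates mentioned above; a deployment might instead require a minimum margin.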

According to the data processing method provided by the embodiment of the specification, different target class risks are trained for identification in one batch using the sample data, which determines the characteristic parameters that the same preset feature takes in the risk identification model when different target class risks are identified. Because the influence of each preset feature on the identification of all target class risks is taken into account, the risk identification model can identify each target class risk accurately, and the training cost can be reduced. In addition, the trained risk identification model is processed to obtain the identification submodels corresponding to the different target class risks, so that the identification accuracy of the risk identification model is retained and the submodels can be conveniently deployed in the service scenario corresponding to each target class risk.

Fig. 4 is a flow framework diagram of a neural network model applied to the data processing method provided in an embodiment of the present specification, in which the neural network model is used to perform risk identification training on different target class risks of a transaction service.

S401: a sample data preparation phase.

As an application embodiment, determining sample data of different target class risks may include:

determining at least one target class risk existing in the event of the same target type;

collecting a plurality of historical event data corresponding to each target class risk;

and determining the sample data according to the collected multiple historical event data.

Collecting a plurality of historical event data corresponding to each of the target class risks may include:

and collecting a plurality of historical event data corresponding to each target class risk in a preset time period.
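The collection steps above can be sketched as follows. This is a minimal sketch, assuming each historical event is a dictionary with a `time` field and a list of `risk_labels`; the field names, the label names, and the half-open time window are assumptions, not part of the specification.

```python
from collections import defaultdict
from datetime import datetime


def collect_by_risk(events, start, end):
    """Group events whose timestamp lies in [start, end) by target class risk."""
    grouped = defaultdict(list)
    for event in events:
        if start <= event["time"] < end:
            for label in event["risk_labels"]:
                grouped[label].append(event)
    return dict(grouped)


# Toy historical events (assumed values).
events = [
    {"id": 1, "time": datetime(2019, 10, 1), "risk_labels": ["false_transaction"]},
    {"id": 2, "time": datetime(2019, 10, 15), "risk_labels": ["gambling", "cash_out"]},
    {"id": 3, "time": datetime(2019, 6, 1), "risk_labels": ["gambling"]},
]

# Preset time period: September and October 2019.
window = collect_by_risk(events, datetime(2019, 9, 1), datetime(2019, 11, 1))
```

Event 3 falls outside the preset time period and is therefore left out of the sample data.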

If the target type event is a transaction service, a plurality of pieces of historical transaction data corresponding to each target class risk within a preset time period need to be collected when the sample data are determined. Specifically, each piece of transaction data may include one or more of the following items of transaction information:

transaction subject identification information; a transaction amount; a transaction time; transaction subject location information; information on the device used by the transaction subject to perform the transaction.

In the embodiment of the present specification, supervised training is performed for the different target class risks, so each piece of collected historical transaction data may carry a risk label. However, because the historical transaction data do not take into account the characteristic parameters of the preset features used during risk identification, a single piece of collected historical transaction data may involve multiple risks at the same time. Therefore, when the sample data are prepared, a risk label may also be prepared for each piece of collected historical transaction data with respect to each target class risk, such as false transaction risk, cash-out transaction risk, gambling transaction risk, marketing cheating risk, and investment fraud risk; no specific limitation is imposed here.
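Because one transaction may carry several risks at once, the per-risk labels can be represented as a multi-label vector. The following is a minimal sketch, assuming a fixed ordering of risk classes; the class identifiers and the function `encode_labels` are hypothetical.

```python
# Assumed identifiers for the target class risks named in the text.
RISK_CLASSES = [
    "false_transaction",
    "cash_out",
    "gambling",
    "marketing_cheating",
    "investment_fraud",
]


def encode_labels(risk_labels, classes=RISK_CLASSES):
    """Return one 0/1 label per target class risk for a single transaction."""
    unknown = set(risk_labels) - set(classes)
    if unknown:
        raise ValueError(f"unknown risk labels: {sorted(unknown)}")
    return [1 if name in risk_labels else 0 for name in classes]


# A transaction that is both a gambling risk and a cash-out risk.
label_vector = encode_labels(["gambling", "cash_out"])
```

Each position of the vector can then serve as the supervision signal for one identification task.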

For the risk identification model, once the different target class risks have been determined, the risk features matching each target class risk, namely the preset features, can be determined, and the set of these risk features forms the risk feature pool of the risk identification model. Training the identification of the different target class risks on this unified risk feature pool can effectively improve the generalization capability of the risk identification model and prevent overfitting, so the training effect is better than that of training each risk independently.
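Forming the unified risk feature pool amounts to taking the union of the per-risk feature sets. The following is a minimal sketch, assuming each target class risk has a list of feature names; all feature names and the function `build_feature_pool` are hypothetical.

```python
def build_feature_pool(features_by_risk):
    """Union of all per-risk features, in first-seen order, without duplicates."""
    pool, seen = [], set()
    for features in features_by_risk.values():
        for name in features:
            if name not in seen:
                seen.add(name)
                pool.append(name)
    return pool


# Assumed risk features matched to each target class risk.
features_by_risk = {
    "false_transaction": ["amount", "counterparty_repeat_rate"],
    "cash_out": ["amount", "credit_channel_ratio"],
    "gambling": ["is_night", "amount", "device_shared_accounts"],
}

pool = build_feature_pool(features_by_risk)
```

Keeping a stable order matters because the pool defines the input layout shared by every identification task.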

S403: a batch training phase using the neural network model.

In the embodiment of the present specification, the risk identification model adopts a multitask neural network model, which can use the sample data to perform synchronous, unified risk identification training for the different target class risks. Specifically, synchronous training of multiple target class risks may be implemented with a hard parameter sharing algorithm.

As shown in fig. 4, when the neural network model is used for batch training, the lowest layer of the neural network model consists of the risk feature variables in the unified risk feature pool, the middle layer consists of the hidden layer variables of the neural network, and the top layer consists of the output variables of a plurality of different training tasks, each output variable corresponding to the risk label of one target class risk identification training task. Task 1, task 2, task 3, and so on shown in fig. 4 are the identification tasks of the target class risks corresponding to the different risk labels.
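The layered structure described above can be sketched as a forward pass with one shared hidden layer and one output head per task. This is a minimal sketch of hard parameter sharing in pure Python, assuming a single hidden layer, tanh activations, sigmoid outputs, and a summed cross-entropy loss; none of these choices, nor the class name `SharedBottomNet`, comes from the specification.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class SharedBottomNet:
    """Shared hidden layer (hard-shared parameters) plus one head per task."""

    def __init__(self, n_features, n_hidden, task_names, seed=0):
        rng = random.Random(seed)
        # Hidden-layer weights are shared by every task.
        self.w_shared = [[rng.uniform(-0.5, 0.5) for _ in range(n_features)]
                         for _ in range(n_hidden)]
        self.b_shared = [0.0] * n_hidden
        # Each task owns only its output weights and bias.
        self.heads = {
            task: ([rng.uniform(-0.5, 0.5) for _ in range(n_hidden)], 0.0)
            for task in task_names
        }

    def hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w_shared, self.b_shared)]

    def forward(self, x):
        h = self.hidden(x)
        return {task: sigmoid(sum(w * hi for w, hi in zip(ws, h)) + b)
                for task, (ws, b) in self.heads.items()}


def joint_loss(outputs, labels, eps=1e-12):
    """Sum of per-task binary cross-entropy, so all tasks train together."""
    return -sum(labels[t] * math.log(outputs[t] + eps)
                + (1 - labels[t]) * math.log(1.0 - outputs[t] + eps)
                for t in outputs)


net = SharedBottomNet(n_features=3, n_hidden=4, task_names=["task1", "task2", "task3"])
x = [0.2, 1.0, -0.5]  # one feature vector drawn from the risk feature pool
outputs = net.forward(x)
loss = joint_loss(outputs, {"task1": 1, "task2": 0, "task3": 0})
```

Minimizing the summed loss updates the shared layer with gradients from every task at once, which is what lets one training run serve all target class risks.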

Specifically, based on the risk feature pool, training a risk identification model for identifying different target risks by using the sample data to obtain a risk identification model capable of identifying the different target risks, which may include:

extracting, from each piece of historical data in the sample data, risk information matching at least one risk feature in the risk feature pool;

training recognition tasks of different target risks on the risk recognition model by using the extracted at least one piece of risk information;

and determining a risk identification model capable of identifying the different target risks according to the training result.
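The extraction step above can be sketched as follows. This is a minimal sketch, assuming the risk feature pool maps each feature name to an extractor function over one raw historical record; the feature names, record fields, and the function `extract_risk_info` are hypothetical.

```python
from datetime import datetime

# Hypothetical risk feature pool: feature name -> extractor over one raw record.
FEATURE_POOL = {
    "amount": lambda r: float(r["amount"]),
    "is_night": lambda r: 1.0 if r["time"].hour < 6 else 0.0,
    "new_device": lambda r: 0.0 if r["device_id"] in r["known_devices"] else 1.0,
}


def extract_risk_info(record, feature_pool=FEATURE_POOL):
    """Build the model input vector by applying every extractor in pool order."""
    return [extract(record) for extract in feature_pool.values()]


# One historical transaction record (assumed values).
record = {
    "amount": 1200,
    "time": datetime(2019, 11, 8, 2, 30),
    "device_id": "d-9",
    "known_devices": {"d-1", "d-2"},
}

vector = extract_risk_info(record)
```

The resulting vector has one entry per feature in the pool and can be fed to every identification task.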

With the multitask neural network model, all task labels are gathered into one neural network model for unified training, so separate models need not be trained one by one for individual target class risks, which can greatly reduce the training time and training cost of the risk identification model.

S405: an identification submodel determining phase.

In the identification submodel determining stage, the risk identification model may be compressed and simplified based on each target risk to obtain identification submodels corresponding to the different target risks, so that the identification submodels identify the corresponding target risks.

In the embodiment of the present specification, the risk identification model may be compressed and simplified by using a knowledge distillation algorithm. Specifically, risk identification may be performed on each piece of historical data in the sample data by using the risk identification model to obtain a risk identification result; different target class risk identification tasks are then trained on the risk identification model by using the sample data according to the risk identification result, and the identification submodels corresponding to the different target class risks are determined according to the training result.
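One common form of knowledge distillation trains a small student model on the soft outputs of the large model. The following is a minimal sketch of that form, assuming a logistic-regression student, full-batch gradient descent, and a hand-written teacher function standing in for one output of the trained risk identification model; the specification names knowledge distillation but does not prescribe these details.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def distill_student(teacher_prob, samples, epochs=2000, lr=1.0):
    """Fit a logistic-regression student to the teacher's soft outputs."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    # Step 1: the large model scores every historical sample (soft labels).
    soft_targets = [teacher_prob(x) for x in samples]
    # Step 2: the student is trained to reproduce those scores.
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, target in zip(samples, soft_targets):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = pred - target  # cross-entropy gradient w.r.t. the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias


def teacher_gambling(x):
    """Hypothetical teacher: stands in for the large model's gambling-risk output."""
    return sigmoid(2.0 * x[0] - 1.0 * x[1])


samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5], [-1.0, 1.5]]
weights, bias = distill_student(teacher_gambling, samples)


def student(x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
```

The student here has only three parameters, which illustrates how a per-risk submodel can be much smaller than the shared model while tracking its outputs closely.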

Further, when at least one target risk is updated, updating a preset feature according to the updated target risk, and updating a risk feature pool of a risk identification model by using the updated preset feature; and then, carrying out recognition training on different target risks by using the updated sample data to determine updated characteristic parameters of each preset characteristic in the updated risk recognition model when recognizing the different target risks.

When the identification submodel of one target class risk is updated, the updated risk features are added to the risk feature pool and the multitask neural network model is used to retrain the plurality of tasks, so that the identification submodels of the other target class risks are updated in batch; whether each old model is replaced is decided according to the risk identification effect of the new model. In this way every identification submodel is updated, and compared with the traditional approach of updating the models one by one, the update time and update cost can be greatly reduced.

Fig. 5 is a schematic flow chart of a data processing method provided in an embodiment of the present specification, where the embodiment of the present specification performs specific business processing on business data by using a business model.

S501: acquiring service data.

The service data in the embodiment of the present specification is related to a service that needs to be processed, and the service model can predict a specific service type to which the service data belongs by analyzing the acquired service data.

S503: processing the service data by using a sub-model to obtain the target class service to which the service data belongs, wherein the sub-model is one of the respective sub-models of different target services, obtained by training a service model with sample data to determine the characteristic parameters of each preset feature in the service model for the different target services and then adjusting the characteristic parameters in the service model according to each target service, and wherein the sample data comprises service data of the different target services.

In the embodiment of the present specification, the sub-model is used to process the service data, specifically, the service feature information corresponding to each preset feature in the sub-model is extracted from the service data, and then the extracted service feature information is analyzed and processed, so that the target class service to which the service data belongs can be predicted.

S505: determining a service processing strategy corresponding to the service data according to the target class service to which the service data belongs.

In this embodiment of the present specification, the service processing policy may be a processing manner corresponding to different target services, and the service processing policy corresponding to different target services may be preset, so that after the target service to which the service data belongs is determined by using the sub-model, the service data may be processed according to the corresponding preset service processing policy. For example, when the sub-model is used to determine that the service data is the target transaction service, operations such as preset calculation, statistics, and the like may be performed on the service data according to a service processing policy corresponding to the target transaction service, which is not specifically limited herein.

In the data processing method provided in the embodiment of the present specification, the received service data is processed by using the sub-model, and the service feature information corresponding to each preset feature is extracted from the service data, so that the target class service to which the service data belongs can be predicted. The submodels take the influence degree of each preset characteristic on the prediction of all target services into consideration, and the preset characteristics among different target services can be shared when different target services are trained, so that the generalization capability of each submodel can be improved, overfitting is prevented, and the training effect and the service processing effect of the submodel can be improved.

Fig. 6 is a flowchart illustrating a data processing method provided in an embodiment of the present specification, where the embodiment of the present specification performs specific risk identification on business data by using a risk identification model.

S601: acquiring service data.

The business data in the embodiment of the present specification is related to a business requiring risk identification, and the risk identification model can identify a specific risk type to which the business data belongs by analyzing the obtained business data.

S603: processing the business data by using an identification submodel to obtain the target class risk to which the business data belongs, wherein the identification submodel is one of the respective identification submodels of different target class risks, obtained by training a risk identification model with sample data to determine the characteristic parameters of each preset feature in the risk identification model when the different target class risks are identified and then adjusting the characteristic parameters in the risk identification model according to each target class risk, and wherein the sample data comprises risk data of the different target class risks.

In the embodiment of the present specification, the service data is processed by using the identification submodel. Specifically, the corresponding risk feature information is extracted from the service data according to each preset feature in the identification submodel, and the extracted risk feature information is used to identify the target class risk to which the service data belongs.

S605: determining a business processing strategy corresponding to the business data according to the target class risk to which the business data belongs.

In this embodiment of the present specification, the service processing policy may be the processing manner corresponding to each target class risk, and the service processing policies for the different target class risks may be preset, so that once the identification submodel has determined the target class risk to which the service data belongs, the service data can be processed according to the corresponding preset service processing policy. For example, when the identification submodel determines that the service data carries a gambling transaction risk, operations such as early warning, risk prompting, and account locking may be performed according to the service processing policy corresponding to the gambling transaction risk; no specific limitation is imposed here.
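A preset mapping from target class risk to processing policy can be sketched as a lookup table. This is a minimal sketch, assuming each identification submodel returns a score between 0 and 1; the score threshold, the default `pass` policy, the action names, and the function `determine_policy` are assumptions rather than details from the specification.

```python
# Assumed preset policies; the gambling entry mirrors the example in the text.
RISK_POLICIES = {
    "gambling": ["early_warning", "risk_prompt", "lock_account"],
    "cash_out": ["early_warning"],
}
DEFAULT_POLICY = ["pass"]  # assumption: no action when no risk is identified


def determine_policy(sub_model_scores, threshold=0.5):
    """Return (identified risk, actions) for the highest-scoring risk class."""
    risk, score = max(sub_model_scores.items(), key=lambda item: item[1])
    if score < threshold:
        return None, DEFAULT_POLICY
    return risk, RISK_POLICIES.get(risk, DEFAULT_POLICY)


risky = determine_policy({"gambling": 0.91, "cash_out": 0.30})
clean = determine_policy({"gambling": 0.20, "cash_out": 0.10})
```

Keeping the policies in a table separate from the submodels means a policy can be changed without retraining any model.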

In the data processing method provided in the embodiment of the present specification, the identification submodel is used to process the received service data, and the risk feature information corresponding to each preset feature is extracted from the service data, so that the target class risk to which the service data belongs can be identified. The identification submodel takes into account the influence of each preset feature on the identification of all target class risks, and the preset features can be shared among the different target class risks during training, so that the generalization capability of each identification submodel can be improved, overfitting is prevented, and the training effect and risk identification effect of the identification submodel can be improved.

Fig. 7 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present disclosure.

A training module 701, configured to train a service model by using sample data to determine feature parameters of preset features in the service model when the preset features correspond to different target services, where the sample data includes service data of the different target services;

an adjusting module 702, configured to adjust the characteristic parameters in the service model according to each target service, to obtain respective sub-models of the different target services, so that the sub-models perform corresponding target service processing.

In the data processing apparatus provided in the embodiment of the present specification, a service model is trained by using sample data, so as to determine feature parameters of the same preset feature in the service model when the preset feature corresponds to different target services, and the influence degree of each preset feature on all target services is taken into consideration. When different target services are trained, preset features among the different target services can be shared, so that the generalization capability of each sub-model can be improved, and overfitting is prevented. The characteristic parameters in the service model are respectively adjusted according to the target services to obtain respective submodels of different target services, so that the training effect and the service processing effect of the submodels can be improved.

Fig. 8 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present disclosure.

A training module 801, configured to train a risk identification model using sample data to determine feature parameters of each preset feature in the risk identification model when identifying different target risks, where the sample data includes risk data of the different target risks;

an adjusting module 802, configured to adjust the characteristic parameters in the risk identification model according to each target class risk to obtain respective identification submodels of the different target class risks, so that each identification submodel performs the corresponding target class risk identification.

According to the data processing device provided by the embodiment of the specification, different target class risks are trained for identification in one batch using the sample data, which determines the characteristic parameters that the same preset feature takes in the risk identification model when different target class risks are identified. Because the influence of each preset feature on the identification of all target class risks is taken into account, the risk identification model can identify each target class risk accurately, and the training cost can be reduced. In addition, the trained risk identification model is processed to obtain the identification submodels corresponding to the different target class risks, so that the identification accuracy of the risk identification model is retained and the submodels can be conveniently deployed in the service scenario corresponding to each target class risk.

Fig. 9 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present disclosure.

An obtaining module 901 for obtaining service data;

a processing module 902, configured to process the service data by using a sub-model to obtain a target service to which the service data belongs, where the sub-model is a sub-model for each different target service obtained by training a service model by using sample data, determining a feature parameter of each preset feature in the service model when the preset feature corresponds to the different target service, and then adjusting the feature parameter in the service model according to the different target service, and the sample data includes service data of the different target service;

the determining module 903 is configured to determine a service processing policy corresponding to the service data according to the target class service to which the service data belongs.

In the data processing apparatus provided in the embodiment of the present specification, the sub-model is used to process the received service data, and the service feature information corresponding to each preset feature is extracted from the service data, so that the target class service to which the service data belongs can be predicted. The sub-models take into account the influence of each preset feature on all the target services, and the preset features can be shared among the different target services during training, so that the generalization capability of each sub-model can be improved, overfitting is prevented, and the training effect and service processing effect of the sub-model can be improved.
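The three modules of this apparatus can be sketched as three small cooperating classes. This is a minimal sketch, assuming each sub-model is a scoring function and that the service with the highest score is chosen; the class names, service names, and policies are hypothetical and only loosely mirror modules 901 to 903.

```python
class ObtainingModule:
    """Obtains service data, here from an in-memory iterable."""

    def __init__(self, source):
        self._source = iter(source)

    def obtain(self):
        return next(self._source)


class ProcessingModule:
    """Runs every sub-model and reports the best-matching target service."""

    def __init__(self, sub_models):
        self._sub_models = sub_models  # target service name -> scoring function

    def process(self, data):
        scores = {name: model(data) for name, model in self._sub_models.items()}
        return max(scores, key=scores.get)


class DeterminingModule:
    """Looks up the preset processing policy for a target service."""

    def __init__(self, policies):
        self._policies = policies

    def determine(self, target_service):
        return self._policies[target_service]


# Assumed sub-models and policies for two target services.
sub_models = {
    "payment": lambda d: 1.0 if d["amount"] > 0 else 0.0,
    "refund": lambda d: 1.0 if d["amount"] < 0 else 0.0,
}
policies = {"payment": "run_statistics", "refund": "run_reconciliation"}

obtaining = ObtainingModule([{"amount": -30.0}])
processing = ProcessingModule(sub_models)
determining = DeterminingModule(policies)

data = obtaining.obtain()
target = processing.process(data)
policy = determining.determine(target)
```

Separating the modules this way lets the sub-models be swapped after an update without touching data acquisition or policy lookup.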

An obtaining module 1001 for obtaining service data;

a processing module 1002, configured to process the service data by using an identification submodel to obtain the target class risk to which the service data belongs, where the identification submodel is one of the respective identification submodels of different target class risks, obtained by training a risk identification model with sample data to determine the characteristic parameters of each preset feature in the risk identification model when the different target class risks are identified and then adjusting the characteristic parameters in the risk identification model according to each target class risk, and where the sample data includes risk data of the different target class risks;

a determining module 1003, configured to determine a service processing policy corresponding to the service data according to the target class risk to which the service data belongs.

In the data processing apparatus provided in the embodiment of the present specification, the identification submodel is used to process the received service data, and the risk feature information corresponding to each preset feature is extracted from the service data, so that the target class risk to which the service data belongs can be identified. The identification submodel takes into account the influence of each preset feature on the identification of all target class risks, and the preset features can be shared among the different target class risks during training, so that the generalization capability of each identification submodel can be improved, overfitting is prevented, and the training effect and risk identification effect of the identification submodel can be improved.

Based on the same inventive concept, embodiments of the present specification further provide an electronic device, including at least one processor and a memory, where the memory stores programs and is configured to be executed by the at least one processor to:

For other functions of the processor, reference may also be made to the contents described in the above embodiments, which are not described in detail herein.

Based on the same inventive concept, embodiments of the present specification further provide a computer-readable storage medium including a program for use in conjunction with an electronic device, the program being executable by a processor to perform the steps of:

acquiring service data;

In the 1990s, an improvement to a technology could be clearly distinguished as either an improvement in hardware (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). However, as technology advances, many of today's method flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Thus, it cannot be said that an improvement to a method flow cannot be realized by hardware physical modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. A designer "integrates" a digital system onto a PLD by programming it, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Furthermore, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must likewise be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained merely by lightly programming the method flow into an integrated circuit using one of the hardware description languages described above.

The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer-readable program code, the same functionality can be achieved by logically programming the method steps so that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing the various functions may also be regarded as structures within the hardware component. The means for performing the various functions may even be regarded both as software modules implementing the method and as structures within the hardware component.

The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

For convenience of description, the above devices are described as being divided into various modules by functions and/or various units separately. Of course, the functionality of the modules and/or units may be implemented in the same one or more software and/or hardware when implementing the present application.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.

The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

The embodiments in the present specification are described in a progressive manner. For parts that are the same or similar among the embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiment is substantially similar to the method embodiment, it is described only briefly; for the relevant points, reference may be made to the corresponding description of the method embodiment.

The above description is only an example of the present application and is not intended to limit the present application. Those skilled in the art may make various modifications and changes to the present application. Any modification, equivalent replacement, improvement, or the like made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.

Claims

1. A method of data processing, comprising:

2. The method of claim 1, further comprising, prior to training the service model with the sample data:

3. The method of claim 1, wherein adjusting the feature parameter corresponding to each target class service in the service model according to each target class service comprises:

4. The method of claim 3, wherein compressing the service model according to each of the target class services comprises:

5. The method of claim 3, wherein compressing the service model according to each of the target class services comprises:

obtaining respective sample data of the different target services;

6. The method of claim 1, wherein training the service model using the sample data comprises:

when the evaluation characteristic corresponding to at least one target service is updated, updating the stored sample data by using the service data containing the evaluation characteristic;

retraining the service model by using the updated sample data to update the characteristic parameters of each preset characteristic in the service model when the preset characteristic corresponds to different target services;

respectively adjusting the characteristic parameters in the service model according to each target service to obtain respective submodels of the different target services, including:

7. The method of claim 6, wherein retraining the service model with the updated sample data comprises:

8. The method of claim 7, wherein adjusting the updated characteristic parameter according to the at least one target class service to obtain an updated sub-model corresponding to the at least one target class service comprises:

judging whether the service processing result meets the expectation;

and if so, updating the sub-models corresponding to the other target services.
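For illustration only, the staged update recited in claims 6 to 8 might be sketched as follows. The sketch assumes that the retrained service model exposes one weight per preset feature for each target service, that "adjusting" is a simple magnitude threshold, and that existing sub-models are left unchanged when the check fails. The names `derive_submodel`, `staged_update`, and `meets_expectation` are hypothetical, and none of these choices is stated in the claims.

```python
from typing import Callable, Dict, List

# Hypothetical layout: service name -> feature name -> weight.
Params = Dict[str, Dict[str, float]]
Submodel = Dict[str, float]


def derive_submodel(params: Params, service: str, threshold: float = 0.1) -> Submodel:
    """Adjust the shared parameters for one service by dropping weak features.

    Magnitude thresholding is an assumed stand-in for the adjustment step.
    """
    return {f: w for f, w in params[service].items() if abs(w) >= threshold}


def staged_update(
    new_params: Params,
    submodels: Dict[str, Submodel],
    changed_service: str,
    meets_expectation: Callable[[Submodel], bool],
) -> List[str]:
    """Update the changed service first; update the others only if it passes."""
    candidate = derive_submodel(new_params, changed_service)
    if not meets_expectation(candidate):
        return []  # assumption: keep every existing sub-model unchanged
    submodels[changed_service] = candidate
    updated = [changed_service]
    for service in new_params:
        if service != changed_service:
            submodels[service] = derive_submodel(new_params, service)
            updated.append(service)
    return updated


# Example with made-up services, features, and weights.
params = {
    "fraud": {"amount": 0.8, "device_age": 0.05},
    "theft": {"amount": 0.3, "device_age": 0.6},
}
submodels: Dict[str, Submodel] = {}
updated = staged_update(params, submodels, "fraud", lambda m: "amount" in m)
```

Checking the sub-model of the service whose evaluation feature changed before touching the others limits the effect of a poor retraining run to a single sub-model.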

9. A method of data processing, comprising:

10. A method of data processing, comprising:

acquiring service data;

11. A method of data processing, comprising:

acquiring service data;

processing the business data by using an identifier model to obtain target risks to which the business data belong, wherein the identifier model is obtained by training a risk identification model by using sample data, determining characteristic parameters of preset characteristics in the risk identification model when different target risks are identified, and adjusting the characteristic parameters in the risk identification model according to the target risks to obtain identifier models corresponding to the different target risks, wherein the sample data comprises risk data of the different target risks;
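As a rough sketch of the processing step in claim 11, the following assumes that each identifier model is a set of per-feature weights scored with a logistic function, and that the target risk is the highest-scoring model above a cutoff. The scoring function, the 0.5 cutoff, and the names `risk_score` and `identify_risk` are illustrative assumptions and are not taken from the claim.

```python
import math
from typing import Dict, Optional

Submodel = Dict[str, float]  # feature name -> weight (hypothetical layout)


def risk_score(submodel: Submodel, record: Dict[str, float]) -> float:
    """Logistic score of one record under one identifier model."""
    z = sum(w * record.get(f, 0.0) for f, w in submodel.items())
    return 1.0 / (1.0 + math.exp(-z))


def identify_risk(
    submodels: Dict[str, Submodel],
    record: Dict[str, float],
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the target risk whose model scores highest, or None below the cutoff."""
    if not submodels:
        return None
    scores = {risk: risk_score(m, record) for risk, m in submodels.items()}
    best = max(scores, key=lambda risk: scores[risk])
    return best if scores[best] >= threshold else None


# Example with made-up identifier models and service data.
models = {"fraud": {"amount": 2.0}, "theft": {"device_age": -1.0}}
risky = identify_risk(models, {"amount": 1.5, "device_age": 1.0})
benign = identify_risk(models, {"amount": -2.0, "device_age": 2.0})
```

Because each identifier model keeps only the features relevant to its own risk, a record is scored against small per-risk models rather than the full risk identification model.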

12. A data processing apparatus comprising:

13. A data processing apparatus comprising:

and the adjusting module is used for respectively adjusting the characteristic parameters in the risk identification model according to each target risk to obtain respective identifier models of different target risks, so that the identifier models carry out corresponding target risk identification.

14. A data processing apparatus comprising:

the acquisition module acquires service data;

15. A data processing apparatus comprising:

the acquisition module acquires service data;

16. An electronic device comprising at least one processor and a memory, the memory storing a program configured to cause the at least one processor to perform the steps of:

17. An electronic device comprising at least one processor and a memory, the memory storing a program configured to cause the at least one processor to perform the steps of:

18. An electronic device comprising at least one processor and a memory, the memory storing a program configured to cause the at least one processor to perform the steps of:

acquiring service data;

19. An electronic device comprising at least one processor and a memory, the memory storing a program configured to cause the at least one processor to perform the steps of:

acquiring service data;