Detailed Description
In order to make those skilled in the art better understand the technical solutions in the embodiments of the present specification, these solutions will be described in detail below with reference to the accompanying drawings. It is obvious that the described embodiments are only a part of the embodiments of the present specification, and not all of them. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to fall within the scope of protection.
An embodiment of the present specification provides a prediction model training method for a new scene. Referring to FIG. 1, the method may include the following steps:
S101, acquiring a set of models to be migrated;
Due to differences between the new scene and the old scene, some models deployed and used in the old scene may not be suitable for the new scene, while others may be suitable and may be migrated. Here, a model to be migrated is a model that is deployed and used in the old scene and can be migrated to the new scene.
The specification does not limit the specific manner in which the set of models to be migrated is obtained.
In this specification, the feature vectors input to each model in the old scene may be compared with the feature vectors extractable from training samples in the new scene, so as to determine whether each model in the old scene can be migrated to the new scene. Specifically, a first feature set is obtained, where the first feature set includes: a plurality of feature vectors extractable from predetermined new-scene training samples. Then, for any model deployed in the old scene: a second feature set is obtained, where the second feature set includes: a plurality of feature vectors input to the model; and the model is determined as a model to be migrated when the model conforms to a preset migration rule. The preset migration rule includes: the feature vectors included in the intersection of the first feature set and the second feature set meet a preset migration condition.
The preset migration condition may be in various forms, and the first feature set and the second feature set may be compared from various angles.
For example, the preset migration condition may be: determining whether the model can be migrated to the new scene by comparing the number of feature vectors in the intersection of the first feature set and the second feature set. If the number of feature vectors in the intersection is small, the probability that the model performs poorly in the new scene is high, and the model is therefore considered not migratable to the new scene; otherwise, the model is considered migratable to the new scene.
For another example, some feature vectors in the new scene are important for model training, and when determining whether a model in the old scene is suitable for migration to the new scene, whether those feature vectors are included may be given particular consideration. The preset migration condition may therefore be: the weighted score calculated according to the preset weight of each feature vector in the intersection is not less than a preset threshold. Feature vectors that are more important for model training may be preset with higher weights: the more important the feature vector, the higher its preset weight. Thus, the more important feature vectors the intersection includes, the higher the final computed weighted score, and the model can be considered migratable to the new scene.
The preset migration conditions may also be in other forms, and each migration condition may be used alone or in combination, and those skilled in the art may flexibly set the migration conditions according to actual needs, which is not specifically limited in this specification.
In addition, the preset migration rule may include other specific rules. The type of prediction model to be trained in the new scene may be determined and specified in advance by a developer according to experience or an algorithm. Then, in order to further measure, on the basis of comparing feature vectors, whether each model in the old scene can be migrated to the new scene, at least one type specified in advance for the prediction model of the new scene may also be obtained when obtaining the set of models to be migrated, and the preset migration rule may further include: the at least one pre-specified type includes the type of the model.
And determining whether a certain model in the old scene can be migrated to the new scene or not from two dimensions of the feature vector and the model type, so that the model migrated to the new scene can be better applied to the new scene through further training. Of course, the preset migration rule may also include rules of other dimensions, which is not limited in the embodiment of the present specification.
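The preset migration rule described above can be sketched as follows. This is a minimal illustration only, assuming feature vectors are represented as identifiers in sets; the function name, threshold values, and weight table are hypothetical choices, not required by the specification:

```python
def is_migratable(old_model_features, new_scene_features,
                  feature_weights, old_model_type, allowed_types,
                  min_count=5, min_score=0.6):
    """Check the three example conditions: intersection size,
    weighted score over the intersection, and model type."""
    shared = old_model_features & new_scene_features  # set intersection
    # Condition 1: enough feature vectors are shared with the new scene.
    if len(shared) < min_count:
        return False
    # Condition 2: the weighted score of the shared feature vectors,
    # computed from preset per-feature weights, reaches the threshold.
    score = sum(feature_weights.get(f, 0.0) for f in shared)
    if score < min_score:
        return False
    # Condition 3: the model's type is among the types pre-specified
    # for the new scene's prediction model.
    return old_model_type in allowed_types
```

In practice the conditions could also be combined by "or" rather than "and", or applied individually, as the passage above notes.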
Of course, the models to be migrated from the old scene to the new scene may also be specified by research and development staff. When doing so, the staff may likewise measure, according to experience or an algorithm, whether each model can be migrated to the new scene, its expected performance after migration, and so on, in dimensions such as feature vectors and model types.
S102, selecting at least one model from the set of models to be migrated for prediction labeling of unlabeled samples in a new scene;
S103, obtaining an initial training sample set in a new scene, wherein the initial training sample set includes unlabeled samples;
S104, adding prediction labels for the unlabeled samples in the initial training sample set by using the selected model;
for convenience of description, S102 to S104 are explained in combination.
When model training is based on supervised learning, the training samples are required to be labeled samples. Training samples can often be labeled in a variety of ways. For example, manual labeling is generally accurate, but the volume of training sample data for model training is generally large and manual labeling is inefficient. For another example, in some scenarios labels can be generated from actual outcomes: in a credit card scenario, when a bank verifies that a credit card has been stolen, the card and the corresponding transactions can be marked as black samples; in such a scenario, however, black sample labels may not be available within a short period of time.
In the embodiment of the description, at least one model is selected from a set of models to be migrated and is used for predicting and labeling unlabeled samples in a new scene, so that the labeling efficiency is improved, and the labeling period is shortened.
Each model to be migrated is a model that can be migrated to the new scene. However, because the feature vectors input to each model and the model types differ, some models can be directly applied to the new scene with good results, while others can be well applied only after being updated. Therefore, the models expected to perform better in the new scene can be selected from the set of models to be migrated for predictive labeling.
The selection of at least one model from the set of models to be migrated may specifically be achieved in various ways.
In an embodiment of the present specification, a third feature set may be obtained first, the set including: a plurality of pre-specified feature vectors for predicting sample labels in a new scene; then, obtaining each feature set corresponding to each model to be migrated, wherein any feature set comprises: a plurality of feature vectors input by the corresponding model; and selecting at least one model from the set of models to be migrated according to a preset selection rule.
Similarly to the determination of the model to be migrated in S101, when selecting the model for prediction labeling, whether to select a certain model for prediction labeling may also be measured from dimensions such as the number of feature vectors in the intersection, the number of important feature vectors, and whether the model types are the same, which is not described herein again.
In addition, if selection relies only on determining whether the intersection count or weighted score is not less than a preset threshold and whether the model types match, there may be no model in the set of models to be migrated that conforms to the preset selection rule. Therefore, various priority ordering conditions may also be preset, and one or more models may be selected for predictive labeling according to the ordering result.
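One possible priority ordering is to rank candidate models by how many of the pre-specified labeling feature vectors they consume, then keep the top k. The following is a sketch under that assumption; the ranking criterion and names are illustrative, not mandated by the specification:

```python
def select_labeling_models(candidates, labeling_features, top_k=1):
    """candidates: dict mapping model name -> set of its input
    feature vectors (the per-model feature sets of the text).
    labeling_features: the third feature set, pre-specified for
    predicting sample labels in the new scene."""
    ranked = sorted(
        candidates.items(),
        # Priority: size of the intersection with the labeling features.
        key=lambda kv: len(kv[1] & labeling_features),
        reverse=True,
    )
    # Keep the top_k models for predictive labeling, even when none
    # clears an absolute threshold.
    return [name for name, _ in ranked[:top_k]]
```

Other ordering conditions (weighted scores of important features, type matches) could be combined into the sort key in the same way.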
As described in S101, the models to be migrated may be specified by a developer. In this step, when selecting from the set of models to be migrated a model for performing predictive labeling on unlabeled samples in the new scene, the selection may likewise be made by the developer according to experience or an algorithm, which is not described herein again.
The initial training sample set in the new scene may include unlabeled samples to be labeled with prediction labels, or may include labeled samples (which may be white samples and/or black samples) labeled with actual labels, and the selected model is used to perform prediction labeling on the unlabeled samples.
The predictive tag may specifically be added in a number of ways.
In the embodiments of the present specification, correspondences between different values and different prediction labels may be preset; for example, a predicted value greater than a preset cutoff corresponds to a black sample label, and a value not greater than the cutoff corresponds to a white sample label. For any model selected: the unlabeled samples in the initial training sample set are input into the model to obtain output predicted values. For any unlabeled sample input: the weight of the predicted value output by each model is determined; the weighted sum of the predicted values is calculated, and the prediction label corresponding to the weighted sum is determined; and the prediction label is added to the unlabeled sample.
For example, if only 1 model is selected from the set of models to be migrated for prediction labeling, the corresponding prediction label can be obtained directly according to the prediction value output by the model (i.e. equal to the weighted sum).
For another example, if multiple models are selected from the set of models to be migrated for predictive labeling, a weight may be preset for the output value of each model; for example, better-performing models may be given higher weights. Of course, the models may also be preset with equal weights, which is equivalent to not weighting the models at all.
In addition, the prediction labels added by the selected models may be manually inspected and corrected to improve their accuracy.
Those skilled in the art can flexibly configure the above according to actual situations, which is not limited in the present specification.
And S105, updating the model to be migrated by utilizing the initial training sample set added with the prediction label based on a supervised learning algorithm to obtain the model applicable to the new scene.
When updating the model to be migrated, only the initial training sample set may be used: the training samples to which prediction labels have been added are input into the model to be migrated.
If the number of the training samples accumulated in the new scene is small, a training sample set in the old scene can be obtained, wherein the training sample set comprises labeled samples added with actual labels; and merging the initial sample set in the new scene with the training sample set in the old scene, and updating the model to be migrated by using the merged training sample set based on a supervised learning algorithm.
A large number of training samples are already accumulated in the old scene, and the training samples are labeled samples added with actual labels, so that the method can be used for assisting the updating of the model to be migrated in the new scene under the condition that the number of the training samples accumulated in the new scene is small.
Of course, the training samples in the old scene are not necessarily all suitable for updating the model for the new scene: some may have a higher similarity to the training samples in the new scene, and others a lower similarity. Therefore, after the initial sample set in the new scene and the training sample set in the old scene are merged, different weights may be preset for different training samples in the merged training sample set.
For example, the training samples in the initial sample set have the highest weight; among the training samples in the old scene, those with higher similarity to the new scene have the next highest weight, and those with lower similarity have the lowest weight.
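The weighting of the merged sample set can be sketched as follows. This is an illustrative assumption: samples are dicts with a `scene` field and, for old-scene samples, a `similar` flag, and the three weight values are placeholders:

```python
def assign_sample_weights(samples):
    """Assign a preset weight to each sample in the merged set:
    new-scene samples highest, similar old-scene samples next,
    dissimilar old-scene samples lowest."""
    weights = []
    for s in samples:
        if s["scene"] == "new":
            weights.append(1.0)   # highest: accumulated in the new scene
        elif s.get("similar", False):
            weights.append(0.5)   # old scene, similar distribution
        else:
            weights.append(0.1)   # old scene, dissimilar distribution
    return weights
```

The resulting list could then be passed to a supervised learner that accepts per-sample weights (many libraries expose this as a `sample_weight` argument to their fit routine).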
In addition, as time goes on, labeled samples added with actual labels are accumulated in the new scene, so that an optimized training sample set is formed, and the optimized training sample set in the new scene can be obtained, wherein the optimized training sample set comprises the labeled samples added with actual labels; and merging the initial training sample set added with the prediction label and the optimized training sample set added with the actual label, and updating the model to be migrated by utilizing the merged training sample set based on a supervised learning algorithm.
It can be understood that, according to the new scene's demand for prediction models, each model to be migrated may be applied to the new scene directly and updated according to the above scheme while in use, so as to obtain a model more suitable for the new scene; alternatively, it may be applied to the new scene only after a period of updating, and may continue to be updated after being applied, which is not limited in this specification.
The method for training the prediction model for the new scene provided in the present specification is described below with reference to a more specific example.
In the field of financial risk control, a large amount of accumulated transaction data can be used as sample data, and a risk control model can be trained through machine learning, so that risk decisions and the like can be made on new transactions in a timely and accurate manner based on the trained risk control model.
However, when a risk control model is built for a new scene, a long time is often required to accumulate the large amount of sample data needed to train the model. For example, the sample data volume is generally related to the transaction volume and accumulation time of the new scene, and the training sample set needs to include a certain amount of black sample data.
To address this problem, existing risk control models in the old scene can be migrated to the new scene.
The new scene and the old scene may be trading markets of different countries and regions, and the risk control models deployed and used in the old scene may include: a card-theft risk control model, an account-theft risk control model, a hidden case identification model, and the like. These risk control models may be trained on transaction data from multiple countries and regions.
As shown in FIG. 2, multiple models that can be deployed and used in new and old scenes can be trained in the cloud in advance, based on data gathered from the old scenes.
The card-theft and account-theft risk control models perform risk control for stolen credit cards and stolen payment accounts respectively, and can be trained through supervised learning.
The hidden case identification model is used to identify transactions that a bank has not confirmed as cases (i.e., hidden cases) but that exhibit case characteristics, by inputting more targeted feature vectors.
For example, if multiple credit cards or payment accounts are used simultaneously on the same device (such as a mobile phone) or in the same network environment, the risk of batch card theft and account theft on that device or in that environment is high. For another example, for devices, accounts, credit cards, network environments, and the like associated with blacklisted entities, the risk of card theft and account theft is high. For yet another example, for devices, accounts, credit cards, network environments, and the like that have performed abnormal transactions (such as anomalies in transaction amount, time, or location), the risk of card theft and account theft is high. The hidden case identification model can identify the corresponding transactions as black samples based on the above features.
Moreover, the hidden case identification model can be trained through unsupervised learning, so it can be applied to scenes in which no actual case labels are available.
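The rule-style signals listed above can be sketched as a simple flagging function. This is only an illustration of those three signals, not the hidden case identification model itself (which the text says is trained by unsupervised learning); the field names and thresholds are hypothetical:

```python
def flag_hidden_case(txn, blacklist, max_accounts_per_device=3,
                     max_amount=10_000):
    """Flag a transaction as a black sample candidate based on the
    example features from the passage above."""
    # Many cards/accounts active on one device or network environment.
    if txn["accounts_on_device"] > max_accounts_per_device:
        return True
    # The device or account is associated with a blacklisted entity.
    if txn["device_id"] in blacklist or txn["account_id"] in blacklist:
        return True
    # Abnormal transaction amount (time/location checks would be analogous).
    if txn["amount"] > max_amount:
        return True
    return False
```

Transactions flagged this way would receive black-sample prediction labels for use in the later training stages.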
When the card-theft risk control model, the account-theft risk control model, and the hidden case identification model need to be deployed in a new scene, the models can be issued to the new scene locally in the form of model files. The deployed models can then be used directly on site to score transaction events, make risk decisions, and so on.
The models issued and deployed from the cloud are trained on training samples from multiple countries and regions; the training samples are therefore comprehensive, and the models are highly generalizable.
After each model is deployed locally in the new scene and put into use, the process can be divided into several stages from the perspective of training sample accumulation.
In the first stage, for example within 1 week after deployment, the accumulation time in the new scene is short, few training samples have accumulated, and the samples have no labels, so the models cannot be updated. Thus, in the first stage, risk control is performed on transactions in the new scene using the cloud-trained, not-yet-updated models.
In the second stage, for example between 1 week and 1 month after deployment, a certain number of training samples have accumulated in the new scene, forming an initial training sample set, and, combined with the large amount of old-scene training data issued by the cloud, each model can be updated. However, because the period in which a financial institution processes card-theft and account-theft cases is long, no labeled samples with actual labels have accumulated at this point; therefore, prediction labels can be added to the initial training sample set through the hidden case identification model.
In addition, different weights can be set for training samples from the new and old scenes. For example, suppose the new scene is the Malaysian market and the old scene includes the markets of Thailand, the United States, Japan, and so on. Thailand is closer to Malaysia in consumption levels and habits, so its transaction data is more similar, while the transaction data of the United States and Japan is less similar to Malaysia's. The highest weight can therefore be set for training samples accumulated locally in Malaysia, a higher weight for training samples from Thailand, and a lower weight for training samples from the United States and Japan. In this way, through dynamic weighting, each updated model can be made suitable for the new scene even when data in the new scene is limited.
The models updated in the second stage can continue to be used for transaction risk decisions in the new scene.
In the third stage, for example after 1 month of deployment, a sufficient number of training samples, including labeled samples with actual labels, have accumulated in the new scene, and the models can be further updated. The training samples used for updating may include only the actually-labeled training samples in the new scene, may also include the new-scene training samples to which the hidden case identification model added prediction labels, and may further include a large number of old-scene training samples, and so on.
In addition to deploying and updating the risk control models in the new scene based on the cloud-pre-trained models and accumulated data, the data accumulated in the new scene can also be uploaded to the cloud, for use in updating existing models, training other new models, deploying to other new scenes, and so on.
It can thus be seen that, by applying this scheme, models deployed and used in the old scene can be migrated to a new scene, and when the sample accumulation time in the new scene is short, so that the samples have no or only a few actual labels, label prediction is performed through the models to be migrated. The models to be migrated can thereby be further optimized and made more suitable for use in the new scene, providing a more efficient and accurate prediction model training scheme for the new scene.
Corresponding to the above method embodiment, an embodiment of the present specification further provides a prediction model training apparatus for a new scene. Referring to FIG. 3, the apparatus may include:
a to-be-migrated model obtaining module 110, configured to obtain a set of to-be-migrated models, where the to-be-migrated model is: deploying a model which is used in an old scene and can be migrated to a new scene;
an annotation model selection module 120, configured to select at least one model from the set of models to be migrated, so as to perform predictive annotation on an unlabeled sample in a new scene;
a sample set obtaining module 130, configured to obtain an initial training sample set in a new scene, where the initial training sample set includes unlabeled samples;
a sample labeling module 140, configured to add a prediction label to an unlabeled sample in the initial training sample set by using the selected model;
and the model updating module 150 is configured to update the model to be migrated based on a supervised learning algorithm by using the initial training sample set to which the prediction tag is added, so as to obtain a model applicable to a new scene.
In a specific embodiment provided in this specification, the to-be-migrated model obtaining module 110 may include:
a to-be-migrated feature obtaining unit 111, configured to obtain a first feature set, where the set includes: a plurality of feature vectors which can be extracted by a predetermined new scene training sample; for any model deployed in the old scenario: obtaining a second feature set, the set comprising: a plurality of feature vectors input by the model;
a model to be migrated selecting unit 112, configured to determine the model as a model to be migrated when the model meets a preset migration rule; the preset migration rule comprises the following steps: and the feature vector included in the intersection of the first feature set and the second feature set meets a preset migration condition.
In a specific embodiment provided in this specification, the preset migration condition may include:
the number of the feature vectors in the intersection is not less than a preset threshold; and/or the weighted score calculated according to the preset weight of each feature vector included in the intersection is not less than a preset threshold.
In a specific implementation manner provided in this specification, the to-be-migrated model obtaining module 110 may further include: a to-be-migrated type obtaining unit 113, configured to obtain at least one type specified in advance for the new scene prediction model;
the preset migration rule may further include: the at least one type that is pre-specified includes a type of the model.
In a specific embodiment provided in this specification, the annotation model selecting module 120 may include:
an annotated feature obtaining unit 121, configured to obtain a third feature set, where the set includes: a plurality of pre-specified feature vectors for predicting sample labels in a new scene; obtaining each feature set corresponding to each model to be migrated, wherein any feature set comprises: a plurality of feature vectors input by the corresponding model;
and the annotation model selecting unit 122 is configured to select at least one model from the set of models to be migrated according to a preset selection rule.
In one embodiment provided in the present specification, the sample labeling module 140 may include:
a predicted value determination unit 141 for, for any one of the selected models: inputting unlabeled samples in the initial training sample set into the model to obtain an output predicted value;
a prediction label determination unit 142, configured to, for any unlabeled sample input: determine the weight of the predicted value output by each model; calculate the weighted sum of the predicted values and determine the prediction label corresponding to the weighted sum; and add the prediction label to the unlabeled sample.
In an embodiment provided in this specification, the sample set obtaining module 130 may be further configured to: obtaining an optimized training sample set in a new scene, wherein the optimized training sample set comprises labeled samples added with actual labels;
the model updating module 150 may be specifically configured to: and merging the initial training sample set added with the prediction label and the optimized training sample set added with the actual label, and updating the model to be migrated by utilizing the merged training sample set based on a supervised learning algorithm.
In an embodiment provided in this specification, the sample set obtaining module 130 may be further configured to: obtaining a training sample set in an old scene, wherein the training sample set comprises labeled samples added with actual labels;
the model updating module 150 may be specifically configured to: and merging the initial sample set in the new scene with the training sample set in the old scene, and updating the model to be migrated by using the merged training sample set based on a supervised learning algorithm.
The implementation process of the functions and actions of each module in the above device is specifically described in the implementation process of the corresponding step in the above method, and is not described herein again.
Embodiments of the present specification also provide a computer device, which at least includes a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the aforementioned prediction model training method for a new scene when executing the program. The method at least comprises the following steps:
obtaining a set of models to be migrated, wherein the models to be migrated are as follows: deploying a model which is used in an old scene and can be migrated to a new scene;
selecting at least one model from the set of models to be migrated for predictive labeling of unlabeled samples in a new scene;
obtaining an initial training sample set in a new scene, wherein the initial training sample set comprises unlabeled samples;
adding prediction labels for the unlabeled samples in the initial training sample set by using the selected model;
and updating the model to be migrated by utilizing the initial training sample set added with the prediction label based on a supervised learning algorithm to obtain the model applicable to the new scene.
FIG. 4 is a schematic diagram of a more specific hardware structure of a computing device according to an embodiment of the present specification. The computing device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. The processor 1010, the memory 1020, the input/output interface 1030, and the communication interface 1040 are communicatively coupled to one another within the device via the bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application-Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present specification.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The i/o module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication interaction between the present apparatus and other apparatuses. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
Embodiments of the present specification also provide a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the aforementioned prediction model training method for a new scene. The method at least comprises the following steps:
obtaining a set of models to be migrated, wherein the models to be migrated are as follows: deploying a model which is used in an old scene and can be migrated to a new scene;
selecting at least one model from the set of models to be migrated for predictive labeling of unlabeled samples in a new scene;
obtaining an initial training sample set in a new scene, wherein the initial training sample set comprises unlabeled samples;
adding prediction labels for the unlabeled samples in the initial training sample set by using the selected model;
and updating the model to be migrated by utilizing the initial training sample set added with the prediction label based on a supervised learning algorithm to obtain the model applicable to the new scene.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer readable media do not include transitory computer readable media such as modulated data signals and carrier waves.
From the above description of the embodiments, it is clear to those skilled in the art that the embodiments of the present disclosure can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the embodiments of the present specification may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments of the present specification.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points. The above-described apparatus embodiments are merely illustrative, and the modules described as separate components may or may not be physically separate, and the functions of the modules may be implemented in one or more software and/or hardware when implementing the embodiments of the present disclosure. And part or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The foregoing is only a specific implementation of the embodiments of the present specification. It should be noted that those skilled in the art can make several modifications and improvements without departing from the principles of the embodiments of the present specification, and these modifications and improvements should also be regarded as falling within the protection scope of the embodiments of the present specification.