WO2021068513A1 - Abnormal object identification method, apparatus, medium and electronic device


Info

Publication number: WO2021068513A1
Application number: PCT/CN2020/092812
Authority: WIPO (PCT)
Prior art keywords: neural network; deep neural network model; object data; model
Other languages: English (en); French (fr)
Inventor: 高呈琳
Original Assignee: 平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2021068513A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • H04L41/147: Network analysis or design for predicting network behaviour
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/084: Backpropagation, e.g. using gradient descent
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14: Network analysis or design
    • H04L41/145: Network analysis or design involving simulating, designing, planning or modelling of a network

Definitions

  • the present disclosure relates to the technical field of neural networks, and in particular to an abnormal object recognition method, device, medium and electronic equipment.
  • the purpose of the present disclosure is to provide an abnormal object identification method, device, medium and electronic equipment.
  • a method for identifying an abnormal object including:
  • the multiple object data in the training set and the label corresponding to each object data are respectively input into multiple deep neural network models to be trained, and the multiple deep neural network models to be trained are trained to obtain multiple deep neural network models, wherein the connection weights between the neurons in each deep neural network model to be trained are randomly initialized;
  • a device for identifying an abnormal object comprising:
  • An obtaining module configured to obtain a plurality of object data and a label corresponding to each of the object data representing whether the object is abnormal, the object data including a plurality of object characteristic values;
  • the object data dividing module is configured to divide the multiple object data into a training set and a test set according to a predetermined rule, wherein the training set and the test set respectively contain multiple object data;
  • the training module is configured to input multiple object data and the labels corresponding to each object data in the training set into multiple deep neural network models to be trained, and to train the multiple deep neural network models to be trained to obtain a plurality of deep neural network models, wherein the connection weights between the neurons in each of the deep neural network models to be trained are initialized randomly;
  • the input module is configured to input the object data in the test set into the multiple deep neural network models to obtain the abnormal probability of each object data in the test set output by each of the deep neural network models;
  • a determining module configured to determine a target deep neural network model from the plurality of deep neural network models according to the abnormal probability of each object data in the test set output by each deep neural network model;
  • the cascade module is configured to cascade the target deep neural network model and the extreme gradient boosting model to obtain a cascade model, and to use multiple object data in the training set to train the cascade model to obtain a trained cascade model;
  • the prediction module is configured to input the object data to be recognized into the trained cascade model to predict whether the object corresponding to the object data to be recognized is abnormal.
  • a computer-readable program medium which stores computer program instructions, and when the computer program instructions are executed by a computer, the computer executes the following steps:
  • the multiple object data in the training set and the label corresponding to each object data are respectively input into multiple deep neural network models to be trained, and the multiple deep neural network models to be trained are trained to obtain multiple deep neural network models, wherein the connection weights between the neurons in each deep neural network model to be trained are randomly initialized;
  • an electronic device including:
  • a memory where computer-readable instructions are stored, and when the computer-readable instructions are executed by the processor, the following steps are implemented:
  • the multiple object data in the training set and the label corresponding to each object data are respectively input into multiple deep neural network models to be trained, and the multiple deep neural network models to be trained are trained to obtain multiple deep neural network models, wherein the connection weights between the neurons in each deep neural network model to be trained are randomly initialized;
  • the embodiment of this application first trains multiple deep neural network models and then selects, from the trained deep neural network models, the target deep neural network model most suitable for abnormal object recognition, so that the performance of the selected target deep neural network model is optimal.
  • By cascading the target deep neural network model and the extreme gradient boosting model, the advantages of both models in classification and prediction are retained, the accuracy of identifying abnormal objects is improved, and the rate of missed recognition of abnormal objects is reduced.
  • Fig. 1 is a schematic diagram showing a model structure for an abnormal object recognition method according to an exemplary embodiment
  • Fig. 2 is a flow chart showing a method for identifying abnormal objects according to an exemplary embodiment
  • Fig. 3 is a flowchart showing details of step 250 in one embodiment according to the embodiment corresponding to Fig. 2;
  • Fig. 4 is a flowchart showing details of step 250 in another embodiment according to the embodiment corresponding to Fig. 2;
  • Fig. 5 is a block diagram showing a device for identifying abnormal objects according to an exemplary embodiment
  • Fig. 6 is a block diagram showing an example of an electronic device implementing the above method for identifying abnormal objects according to an exemplary embodiment
  • Fig. 7 shows a computer-readable storage medium for realizing the above abnormal object identification method according to an exemplary embodiment.
  • the present disclosure first provides a method for identifying abnormal objects.
  • The object can be any tangible or intangible entity on which a certain effect can be exerted, and it can be anything that can be processed by a computing device.
  • An abnormal object is an object whose characteristics or attributes do not meet certain requirements. It is necessary to monitor and identify abnormal objects among all objects, and perform corresponding treatment or restriction in accordance with predetermined methods or rules.
  • the abnormal object identification method provided in the present disclosure can be applied to the field of network traffic monitoring and insurance.
  • In the field of network traffic monitoring, the object is visitor traffic;
  • the abnormal object is abnormal traffic, such as illegal user traffic or abnormally large traffic. It is necessary to monitor and limit such abnormal traffic to avoid network congestion, thereby ensuring the availability of the network platform.
  • In the insurance field, the object is the customer who initiates an insurance application;
  • the abnormal object is a customer whose insurance fraud risk is high and who should not be allowed to apply for insurance, or a customer whose insurance is more likely to take effect, so it is necessary to identify such customers and decline their insurance business at the source, reducing the possibility of damage to the insurer's interests caused by these customers and improving the insurer's efficiency.
  • the technical solution of the present application can be applied to the field of artificial intelligence technology, involving neural networks.
  • the implementation terminal of the present disclosure can be any device with computing and processing functions.
  • the device can be connected to an external device for receiving or sending data.
  • It can be a portable mobile device, such as a smartphone, a tablet computer, a notebook computer, or a PDA (Personal Digital Assistant); it can also be a fixed device, such as computer equipment, a field terminal, a desktop computer, a server, or a workstation; or it can be a collection of multiple devices, such as the physical infrastructure of cloud computing.
  • the implementation terminal of the present disclosure may be a server or a physical infrastructure of cloud computing.
  • Fig. 1 is a schematic diagram showing a model structure for an abnormal object recognition method according to an exemplary embodiment.
  • the model structure of the method for identifying abnormal objects includes a deep neural network model 110 and an extreme gradient boosting model 120.
  • A Deep Neural Network (DNN) model is a network belonging to a family of deep learning algorithms. It processes information in a mode that imitates the signal transmission of neurons in the brain, and it includes multiple layers of neurons, with multiple neurons in each layer.
  • The deep neural network model can include at least an input layer, hidden layers, an output layer, and other multi-layer neuron structures, and can also include at least one fully connected layer; the hidden layers of the deep neural network model can themselves be stacked into a multi-layer structure. The "depth" of a deep neural network means that the path from input to output is sufficiently long.
  • The extreme gradient boosting (XGBoost) model is a strong classifier composed of multiple weak classifiers; it is a boosted tree model.
  • The tree model used is generally the CART (Classification And Regression Tree) regression tree.
  • The extreme gradient boosting model 120 includes multiple weak classifiers: CART binary regression trees.
  • XGBoost grows each tree through successive feature splits. Each tree represents a learned function, and each newly grown tree fits the residual of the predictions of the previously generated trees.
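The residual-fitting idea described above can be sketched in a few lines. This is not the patent's implementation or the real XGBoost library; it is a minimal illustration that replaces full CART trees with depth-1 "stumps" on toy data, purely to show how each new tree fits the residuals left by the previous ones.

```python
def fit_stump(xs, ys):
    """Fit a depth-1 regression stump: pick the split on x that
    minimizes squared error, predicting the mean on each side."""
    best = None
    for split in xs:
        left = [y for x, y in zip(xs, ys) if x <= split]
        right = [y for x, y in zip(xs, ys) if x > split]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - (lm if x <= split else rm)) ** 2
                  for x, y in zip(xs, ys))
        if best is None or err < best[0]:
            best = (err, split, lm, rm)
    _, split, lm, rm = best
    return lambda x: lm if x <= split else rm

def boost(xs, ys, n_rounds=5, lr=0.5):
    """Each new stump fits the residuals of the ensemble so far."""
    stumps = []
    preds = [0.0] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

# Toy 1-D data: objects with x > 3 are "abnormal" (label 1).
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = boost(xs, ys)
mse = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

After a few boosting rounds, the ensemble's squared error on the toy data shrinks toward zero, which is exactly the "fit the residual of the previously generated trees" behavior the text describes.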
  • Fig. 2 is a flow chart showing a method for identifying abnormal objects according to an exemplary embodiment. As shown in Figure 2, the following steps can be included:
  • Step 210 Obtain a plurality of object data and a label corresponding to each of the object data that represents whether the object is abnormal.
  • the object data includes a plurality of object feature values, the object data corresponds to the object, and each object feature value corresponds to one object feature.
  • Object data refers to data related to the object, which can be data generated by the object itself, or data obtained by recording the behavior of the object when the object is active.
  • In the field of network traffic monitoring, the object is the visitor's traffic;
  • the abnormal object is abnormal traffic, such as traffic generated by illegal visitors or abnormally large traffic;
  • the object data in this case is the IP address corresponding to the visitor's traffic.
  • In the insurance field, the object is the customer who initiates an insurance application, and the abnormal object is a customer whose fraud risk is high and who should not be allowed to apply for insurance, or a customer whose insurance is more likely to take effect; the object data in this case is data generated from the customer's occupation, age, pension, provident fund, and personal assets.
  • The label representing whether the object is abnormal, corresponding to each object data, identifies whether the object corresponding to that object data is abnormal. The label may be assigned manually, for example based on the experience of an expert;
  • the label may also be applied automatically by a machine. For example, an expert classifies each object data according to experience in advance, and the machine automatically assigns labels according to the category of each object data.
  • the specific form of the label representing whether the object is abnormal or not corresponding to each of the object data may be arbitrary, as long as it can be recognized by the computer device.
  • For example, the label representing that the object is not abnormal can be "NO" and the label representing that the object is abnormal can be "YES"; the non-abnormal label can be "OK" and the corresponding abnormal label can be "" (empty); or the abnormal label can be "1" and the corresponding non-abnormal label can be "0".
  • In one embodiment, the object data and the label indicating whether the object is abnormal, corresponding to each object data, are stored together in a database, and both the object data and the corresponding labels are obtained by querying the database.
  • In another embodiment, the plurality of object data and the data identifier corresponding to each object data are stored on a first terminal, and the label representing whether the object is abnormal, corresponding to each object data, is stored together with the corresponding data identifier on a second terminal.
  • A plurality of object data and the data identifier corresponding to each object data are first obtained from the first terminal; then, using the data identifier corresponding to each object data, the label corresponding to that identifier, representing whether the object is abnormal, is obtained from the second terminal, so that both the object data and the label representing whether each corresponding object is abnormal are obtained.
  • Each object feature value represents the value of the object feature in one dimension of the object data.
  • the object feature can also be referred to as an object attribute, and the object feature value can also be referred to as an object attribute value.
  • For example, for the object feature of IP address, the corresponding object feature value, i.e. the value of the IP address, can be 158.135.213.25; in the insurance field, if the object feature is the monthly pension amount, the value of that feature can be 1000.
  • Step 220 Divide the multiple object data into a training set and a test set according to a predetermined rule.
  • the training set and the test set respectively contain multiple object data, that is, the training set and the test set are both sets of object data.
  • the predetermined rule is to keep the number of object data in the training set and the number of object data in the test set at a predetermined ratio.
  • the advantage of this embodiment is that the relative relationship between the numbers of the training set and the test set is kept within a relatively stable range.
  • For example, the predetermined ratio may be 7:3, that is, for every 7 pieces of object data allocated to the training set, 3 pieces of object data are allocated to the test set; if the number of object data is 100, then the training set contains 70 object data and the test set contains 30.
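A minimal sketch of a 7:3 ratio split, under the assumption that the object data can be shuffled and partitioned by count (the function name and seed are illustrative, not from the patent):

```python
import random

def ratio_split(data, train_parts=7, test_parts=3, seed=0):
    """Shuffle the object data and split it so that the train:test
    counts keep the predetermined ratio (7:3 by default)."""
    rng = random.Random(seed)
    items = list(data)
    rng.shuffle(items)
    n_train = len(items) * train_parts // (train_parts + test_parts)
    return items[:n_train], items[n_train:]

# 100 object data split 7:3 gives 70 training and 30 test items.
train, test = ratio_split(range(100))
```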
  • a predetermined number of object data is obtained from the plurality of object data to form a training set, and the remaining object data is formed into a test set.
  • In one embodiment, the predetermined rule is to keep the number of object data in the training set at a predetermined ratio to the number of object data in the test set, and to make the proportions of object data labeled as abnormal in the training set and the test set the same.
  • The advantage of this embodiment is that it avoids introducing additional bias into the modeling process due to differing proportions of same-label object data between the training set and the test set, which to a certain extent ensures the accuracy of the established model.
  • In the embodiment where a predetermined number of object data forms the training set, the advantage is that by limiting the number of object data constituting the training set, a good training effect can be achieved when the training set is used for model training.
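The label-balanced variant above amounts to a stratified split: shuffle and divide each label group separately so the abnormal proportion is (approximately) equal in both sets. A sketch, with a hypothetical `(features, label)` record layout where label 1 means abnormal:

```python
import random

def stratified_split(records, train_ratio=0.7, seed=0):
    """Split so the abnormal-label proportion is the same in the
    training and test sets, per label group."""
    rng = random.Random(seed)
    train, test = [], []
    for label in (0, 1):
        group = [r for r in records if r[1] == label]
        rng.shuffle(group)
        k = int(len(group) * train_ratio)
        train += group[:k]
        test += group[k:]
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test

# 100 records, 20% abnormal; both splits keep that 20% proportion.
records = [((i,), 1 if i < 20 else 0) for i in range(100)]
train, test = stratified_split(records)
```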
  • Step 230 Input the multiple object data in the training set and the label corresponding to each object data into multiple deep neural network models to be trained, and train the multiple deep neural network models to be trained to obtain multiple deep neural network models.
  • connection weights between the neurons in each of the deep neural network models to be trained are initialized randomly.
  • When the multiple object data in the training set and the label corresponding to each object data are input into a deep neural network model to be trained, the label corresponding to each object data is converted into a numeric value, and the object feature values in the object data are first converted into a vector. The vector is transformed and mapped through the connections of the multi-layer neurons in the deep neural network model, finally yielding the predicted value output by the model. The difference between the predicted value and the numeric value converted from the object data's label is then calculated, and Stochastic Gradient Descent (SGD) and the Backpropagation algorithm (BP algorithm) are used to adjust the connection weights of the multi-layer neurons based on this difference. This process is executed iteratively until the number of iterations reaches an iteration threshold or the training of the model meets a predetermined condition; the model obtained at that point is the trained deep neural network model.
  • When training with SGD, parameters such as the batch size and learning rate can be set.
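The forward-pass / error / weight-update loop described above can be illustrated with a drastically simplified stand-in for the multi-layer DNN: a single logistic neuron trained by stochastic gradient descent on toy data. All names and data here are illustrative assumptions, not the patent's model:

```python
import math
import random

def train_neuron(data, epochs=200, lr=0.5, seed=1):
    """Train one logistic neuron with SGD: forward pass, compute the
    error between prediction and numeric label, update weights."""
    rng = random.Random(seed)
    n_features = len(data[0][0])
    w = [rng.uniform(-1, 1) for _ in range(n_features)]  # random init
    b = rng.uniform(-1, 1)
    for _ in range(epochs):
        for x, label in data:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # predicted abnormal probability
            err = p - label                   # gradient of log-loss w.r.t. z
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Toy training set: feature near 0 -> normal (0), near 1 -> abnormal (1).
data = [((0.0,), 0), ((0.2,), 0), ((0.8,), 1), ((1.0,), 1)]
w, b = train_neuron(data)
```

After training, the neuron's output probability is low for normal-like inputs and high for abnormal-like inputs, mirroring the abnormal-probability output of the full model.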
  • That the connection weights between the neurons in each deep neural network model to be trained are randomly initialized means that the connection weight between each pair of neurons in each model is set randomly at the start. The connection weights between pairs of neurons within the same model to be trained are likely to differ, and the connection weights between pairs of neurons in different models to be trained are also likely to differ. As a result, the connection weights in the trained deep neural network models are essentially all different, which ensures the specificity of each trained model; that is, each trained deep neural network model is a unique model.
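A small demonstration of the point above: seeding each model's initializer differently gives every model-to-be-trained its own random starting weights, so the trained models end up distinct. The function and layer sizes are illustrative assumptions:

```python
import random

def init_weights(n_in, n_out, seed):
    """Randomly initialize one layer's connection weights; a different
    seed per model gives each model a distinct starting point."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_out)]
            for _ in range(n_in)]

# Two models-to-be-trained start from different random weights.
model_a = init_weights(4, 3, seed=1)
model_b = init_weights(4, 3, seed=2)
```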
  • Step 240 Input the object data in the test set to the multiple deep neural network models to obtain the abnormal probability of each object data in the test set output by each of the deep neural network models.
  • The trained deep neural network model can make a prediction for each object data and output the corresponding prediction result for the input object data.
  • The prediction result is the abnormal probability of the object data, a measure of the likelihood that the object corresponding to the object data is abnormal: the greater the abnormal probability of the object data, the more likely the corresponding object is an abnormal object.
  • The abnormality of the object can also be called the abnormality of the object data.
  • The test set is used to test and evaluate the performance of each trained deep neural network model. It is easy to understand that the weights between neurons in each trained model differ, so each deep neural network model is a different model; for a given object data in the test set, the abnormal probabilities output by the different models may differ. It is therefore necessary to use the test set to test the several different trained models in order to evaluate them.
  • Step 250 Determine a target deep neural network model from the plurality of deep neural network models according to the abnormal probability of each object data in the test set output by each deep neural network model.
  • That is, the target deep neural network model is selected from the multiple trained deep neural network models based on the abnormal probability that each model outputs for each object data in the test set.
  • Since the connection weights between neurons differ across the trained deep neural network models, the performance of each model is often different. Evaluating each trained model according to the abnormal probabilities it outputs for the object data in the test set makes it possible to select the deep neural network model most suitable for abnormal object prediction.
  • FIG. 3 is a flowchart showing details of step 250 in an embodiment according to the embodiment corresponding to FIG. 2. As shown in Figure 3, it includes the following steps:
  • Step 251 Obtain the ratio of the number of object data corresponding to the label representing the abnormality of the object in the test set to the number of all object data contained in the test set, as a first ratio.
  • In one embodiment, counters are embedded in the terminal implementing the present disclosure, which can count the number of object data.
  • A first counter counts the number of object data in the test set whose label represents an abnormal object; the first counter and a second counter are initially set to 0.
  • For each object data in the test set, it is judged whether the label corresponding to the object data represents an abnormal object; if so, the first counter is incremented by 1.
  • The second counter is incremented by 1 for every object data judged, until all object data have been judged.
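The two-counter procedure reduces to the following sketch, with label 1 standing in for "abnormal" (an assumed encoding):

```python
def first_ratio(test_labels):
    """First ratio: abnormal object data count over all object data
    count in the test set, via the two counters described above."""
    abnormal = 0   # first counter
    total = 0      # second counter
    for label in test_labels:
        if label == 1:
            abnormal += 1
        total += 1
    return abnormal / total

# 2 abnormal labels out of 10 gives a first ratio of 0.2.
r1 = first_ratio([1, 0, 0, 1, 0, 0, 0, 0, 0, 0])
```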
  • Step 252 For each deep neural network model, sort the abnormal probability of each object data in the test set output by the deep neural network model from large to small.
  • a bubble sorting algorithm is used to sort the abnormal probability of each object data.
  • a quick sort algorithm is used to sort the abnormal probability of each object data.
  • Step 253 For each deep neural network model, each object data corresponding to the deep neural network model is divided into a predetermined number of groups according to the sorting order.
  • Each object data belongs to a group.
  • the purpose of grouping is to make the number of object data contained in most groups the same.
  • In one embodiment, all object data are divided evenly into the predetermined number of groups: when the number of all object data is divisible by the predetermined number, every group contains the same number of object data; when it is not divisible, all groups except the last contain the same number of object data.
  • In another embodiment, the predetermined number is a first predetermined number.
  • A predetermined number of object data is allocated to each of the first second-predetermined-number of groups, and the remaining object data is allocated to the remaining unallocated group, wherein the first predetermined number is greater than the second predetermined number.
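Steps 252 and 253 together can be sketched as: sort the abnormal probabilities in descending order, then cut the sorted list into groups, with the last group absorbing any remainder (the first grouping variant above). The data values are illustrative:

```python
def sort_and_group(probs, n_groups):
    """Sort abnormal probabilities from large to small, then divide
    them into n_groups groups; when the count is not divisible, the
    last group takes the remainder."""
    ordered = sorted(probs, reverse=True)
    size = len(ordered) // n_groups
    groups = [ordered[i * size:(i + 1) * size]
              for i in range(n_groups - 1)]
    groups.append(ordered[(n_groups - 1) * size:])
    return groups

# 7 probabilities into 3 groups: two groups of 2, last group of 3.
groups = sort_and_group([0.9, 0.1, 0.5, 0.7, 0.3, 0.2, 0.8], 3)
```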
  • Step 254 For each deep neural network model, and for each group of object data corresponding to that model, obtain the ratio of the number of object data in the group whose label represents an abnormal object to the number of all object data contained in the group, as the second ratio.
  • Each deep neural network model has its own grouping and sorting of the corresponding object data, so the second ratio of the same-ranked group may differ across the deep neural network models.
  • Step 255 Determine a target deep neural network model among the multiple deep neural network models based on the first ratio and each second ratio obtained for each deep neural network model.
  • step 255 may include:
  • For each deep neural network model, obtain the second ratio corresponding to the first-ranked group of object data of that model as the target second ratio; for each deep neural network model, determine the ratio of that model's target second ratio to the first ratio as the third ratio; and take the deep neural network model with the largest third ratio as the target deep neural network model.
  • Each deep neural network model has its own grouping and sorting of object data, so among the groups corresponding to each model there is a first-ranked group; that group has a second ratio, which can be used as the target second ratio of the corresponding deep neural network model.
  • Since the groups of object data corresponding to each deep neural network model are sorted according to the abnormal probabilities output by that model, a higher-ranked object data is more likely to be recognized by the corresponding model as abnormal object data (i.e. its corresponding object is determined to be an abnormal object). The first-ranked group of object data for a deep neural network model is therefore the data whose corresponding objects that model considers most likely to be abnormal, while the first ratio reflects the proportion of object data whose label represents an abnormal object among all object data, that is, the proportion of abnormal objects overall.
  • The larger the ratio of a model's target second ratio to the first ratio, that is, the larger the third ratio, the better the model concentrates abnormal object data at the top of its ranking compared with selecting object data at random, and the higher its accuracy in identifying abnormal object data. The advantage of this embodiment is therefore that selecting the deep neural network model with the largest third ratio as the target deep neural network model optimizes the performance of the selected model, improving the accuracy of the finally applied model in identifying abnormal objects.
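The third ratio is a lift-style score, and model selection reduces to picking the largest lift. A sketch with made-up labels and top groups (label 1 = abnormal):

```python
def third_ratio(top_group_labels, all_test_labels):
    """Third ratio = second ratio of the top-ranked group divided by
    the first ratio of the whole test set (a lift score)."""
    second = sum(top_group_labels) / len(top_group_labels)
    first = sum(all_test_labels) / len(all_test_labels)
    return second / first

def pick_target_model(models_top_groups, all_test_labels):
    """Return the index of the model with the largest third ratio."""
    scores = [third_ratio(g, all_test_labels) for g in models_top_groups]
    return scores.index(max(scores))

labels = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]   # 30% abnormal overall
# Model 0 puts all abnormal data in its top group; model 1 does not.
best = pick_target_model([[1, 1, 1], [1, 0, 0]], labels)
```

A model whose top group is entirely abnormal against a 30% base rate has a third ratio of 1.0 / 0.3, far above random selection (third ratio 1).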
  • step 255 may include:
  • For each deep neural network model, obtain the average of the second ratios corresponding to the first third-predetermined-number of groups of object data of that model as the target second ratio; for each model, determine the ratio of its target second ratio to the first ratio as the third ratio; take the deep neural network model with the largest third ratio as the target deep neural network model.
  • For example, if the third predetermined number is 3, the average of the second ratios corresponding to the first 3 groups of object data is obtained for each deep neural network model.
  • The second ratio of only the top-ranked group may not fully and objectively reflect the performance of a deep neural network model: a model's top-ranked second ratio may be small while the second ratios of its first several groups as a whole are large enough, which can still indicate that the model's performance is excellent. The advantage of this embodiment is therefore that the average value measures the overall size of the second ratios of each model's first several groups, and using this overall size to select the target deep neural network model improves the fairness and reliability of the selection.
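The averaged variant only changes how the target second ratio is computed; a one-line sketch with illustrative numbers:

```python
def avg_target_second_ratio(second_ratios, k=3):
    """Averaged variant: use the mean of the first k groups' second
    ratios (here k = 3, the example third predetermined number)
    instead of only the top-ranked group's second ratio."""
    return sum(second_ratios[:k]) / k

# Second ratios of the ranked groups of one model.
v = avg_target_second_ratio([0.9, 0.6, 0.3, 0.1, 0.0])
```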
  • step 255 may include:
  • For each deep neural network model, compare the abnormal probability that the model outputs for each object data in the test set with a preset abnormal probability threshold to determine the model's prediction result, i.e. whether each object data in the test set is abnormal; based on the label representing whether the object is abnormal corresponding to each object data in the test set and each model's prediction result for each object data, calculate the recall and precision of each deep neural network model;
  • determine the target deep neural network model from the multiple deep neural network models according to the recall and precision of each deep neural network model, the first ratio, and each second ratio obtained for each deep neural network model.
  • where P is the precision rate; R is the recall rate; TP is the number of object data in the test set for which the deep neural network model's prediction result is abnormal and whose corresponding label represents an abnormal object; FP is the number of object data in the test set for which the model's prediction result is normal but whose corresponding label represents an abnormal object; and FN is the number of object data in the test set for which the model's prediction result is abnormal but whose corresponding label represents a normal object.
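  • As a concrete illustration of how these counts can be combined, the sketch below tallies TP, FP, and FN exactly as they are defined here (note that the FP and FN categories are named differently from the textbook convention) and then forms the two rates. The published application renders the actual formula as an image that is not reproduced in this text, so the quotients used below are an assumption chosen to match the surrounding definitions.

```python
def rates(predictions, labels):
    """Tally the TP/FP/FN categories as defined in the text and derive P and R.

    predictions/labels are booleans: True means abnormal. The quotients are
    an assumption (the original formula is an unreproduced image); they are
    the conventional precision/recall expressed in this text's own counts.
    """
    tp = sum(p and l for p, l in zip(predictions, labels))        # predicted abnormal, label abnormal
    fp = sum((not p) and l for p, l in zip(predictions, labels))  # predicted normal, label abnormal
    fn = sum(p and (not l) for p, l in zip(predictions, labels))  # predicted abnormal, label normal
    precision = tp / (tp + fn) if tp + fn else 0.0  # share of predicted-abnormal that are truly abnormal
    recall = tp / (tp + fp) if tp + fp else 0.0     # share of truly abnormal that are caught
    return precision, recall

preds = [True, True, True, False, False]
labs  = [True, True, False, True, False]
print(rates(preds, labs))  # tp=2, fp=1, fn=1, so both rates are 2/3
```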
  • Determining the target deep neural network model from the multiple deep neural network models according to the recall, the precision, the first ratio, and the second ratios obtained for each deep neural network model includes: calculating a first parameter for each model using its recall and precision; obtaining a second parameter for each model using the first ratio and the second ratios obtained for that model; and determining the target deep neural network model from the multiple models based on the first parameter and the second parameter of each model.
  • The advantage of this embodiment is that the target deep neural network model is selected by integrating the indicators of the second ratios, the recall, and the precision corresponding to each deep neural network model, so that the selected target deep neural network model performs better and is more suitable for identifying abnormal objects.
  • Calculating the first parameter of each deep neural network model using its recall and precision includes: taking the average of each model's recall and precision as its first parameter. Obtaining the second parameter of each model using the first ratio and the second ratios obtained for that model includes: obtaining the weighted sum of the first predetermined number of second ratios corresponding to each model, and calculating the ratio of this weighted sum to the first ratio as the model's second parameter. Determining the target deep neural network model based on the first parameter and the second parameter of each model includes: obtaining weights for the first parameter and the second parameter respectively; for each model, using these weights to determine the weighted sum of the model's first and second parameters; and taking the model with the largest weighted sum as the target deep neural network model.
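  • A minimal sketch of this weighted-sum selection follows; the helper names, the 0.5/0.5 parameter weights, and all per-model numbers are hypothetical, invented purely for illustration.

```python
def first_param(recall, precision):
    # first parameter: average of recall and precision
    return (recall + precision) / 2

def second_param(first_ratio, second_ratios, ratio_weights):
    # second parameter: weighted sum of the leading second ratios over the first ratio
    weighted = sum(w * r for w, r in zip(ratio_weights, second_ratios))
    return weighted / first_ratio

def pick_target(models, w1=0.5, w2=0.5):
    # the model with the largest weighted sum of the two parameters wins
    def score(m):
        return (w1 * first_param(m["recall"], m["precision"])
                + w2 * second_param(m["first_ratio"], m["second_ratios"], m["ratio_weights"]))
    return max(models, key=score)

models = [
    {"name": "dnn_a", "recall": 0.8, "precision": 0.6,
     "first_ratio": 0.1, "second_ratios": [0.5, 0.4], "ratio_weights": [0.6, 0.4]},
    {"name": "dnn_b", "recall": 0.7, "precision": 0.7,
     "first_ratio": 0.1, "second_ratios": [0.6, 0.5], "ratio_weights": [0.6, 0.4]},
]
print(pick_target(models)["name"])  # dnn_b: 0.5*0.7 + 0.5*5.6 beats 0.5*0.7 + 0.5*4.6
```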
  • Step 260: cascade the target deep neural network model with the extreme gradient boosting model to obtain a cascade model, and train the cascade model using the multiple object data in the training set to obtain a trained cascade model.
  • Cascading the target deep neural network model with the extreme gradient boosting model means passing the output of the target deep neural network model directly to the extreme gradient boosting model as its input.
  • In one embodiment, the target deep neural network model includes an output layer and at least one hidden layer, and cascading the target deep neural network model with the extreme gradient boosting model to obtain a cascade model and training the cascade model using the multiple object data in the training set includes: removing the output layer of the target deep neural network model and cascading the last hidden layer of the target deep neural network model with the extreme gradient boosting model, so that the feature vector output by the last hidden layer can be input to the extreme gradient boosting model, obtaining the cascade model; and training the cascade model using the multiple object data in the training set to obtain a trained cascade model.
  • The stochastic gradient descent method and the error back-propagation algorithm can be used to train the cascade model.
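  • The wiring described above (drop the output layer, feed the last hidden layer's feature vector to the boosting stage) can be sketched as follows. A trivial single-feature threshold classifier stands in for the extreme gradient boosting model, and the one-unit hidden layer and its weights are made up, so this is only an illustration of the cascade pattern, not of xgboost itself.

```python
def hidden_features(x, weights, bias):
    # last hidden layer: per-unit weighted sum passed through ReLU
    return [max(0.0, sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(weights, bias)]

class StubBooster:
    """Trivial stand-in for the extreme gradient boosting stage."""
    def fit(self, feats, labels):
        # threshold at the mean of the first feature (illustrative only)
        vals = [f[0] for f in feats]
        self.t = sum(vals) / len(vals)
        return self
    def predict(self, feats):
        return [f[0] > self.t for f in feats]

weights, bias = [[1.0, -1.0]], [0.0]          # one hidden unit, made-up weights
train_x = [[2.0, 0.0], [0.0, 2.0]]
train_y = [True, False]
feats = [hidden_features(x, weights, bias) for x in train_x]  # DNN output -> booster input
booster = StubBooster().fit(feats, train_y)
print(booster.predict([hidden_features([3.0, 0.0], weights, bias)]))  # [True]
```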
  • Step 270: input the object data to be recognized into the trained cascade model, to predict whether the object corresponding to the object data to be recognized is abnormal.
  • After the cascade model is trained, it can be used to make predictions on object data. For example, in the insurance field, it predicts whether handling insurance for the corresponding customer should be disallowed; in the field of network traffic monitoring, it predicts whether a visitor's traffic is abnormal traffic.
  • The cascade model combines the advantages of the deep neural network model and the extreme gradient boosting (xgboost) model: compared with a standalone deep neural network model it improves interpretability, and compared with the xgboost model it improves prediction precision to a certain extent. Because the strengths of the two models in classification and prediction are both retained, the accuracy of identifying abnormal objects is improved and the missed-recognition rate of abnormal objects is reduced.
  • FIG. 4 is a flowchart showing the details of step 250 in another embodiment according to the embodiment corresponding to FIG. 2. As shown in FIG. 4, the following steps are included:
  • Step 251': for each deep neural network model, compare the abnormal probability of each object data in the test set output by that model with a preset abnormal probability threshold, to determine whether the model's prediction result for each object data in the test set is abnormal.
  • For example, if the preset abnormal probability threshold is 0.7 and the deep neural network model outputs an abnormal probability of 0.75 for an object data, then since 0.75 > 0.7, the model's prediction result for that object data can be determined to be abnormal.
  • Step 252': based on the label representing whether the object is abnormal corresponding to each object data in the test set and each deep neural network model's prediction results for the object data in the test set, calculate the recall and precision of each deep neural network model.
  • Step 253': from among the deep neural network models whose recall is greater than a preset recall threshold, select the deep neural network model with the largest precision as the target deep neural network model.
  • The recall reflects the proportion of truly abnormal object data among the object data in the test set that the deep neural network model predicts to be abnormal. Therefore, in order to identify abnormal objects as far as possible, the recall of the selected model needs to be high enough. The advantage of this embodiment is that, by constraining the recall of the selected target deep neural network model and, on that basis, selecting the model with the largest precision, the selected target deep neural network model is more suitable for identifying abnormal objects while maintaining sufficiently high precision.
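  • The two-stage filter of step 253' can be sketched as follows; the recall threshold and the per-model rates are hypothetical numbers for illustration.

```python
def pick_by_recall_then_precision(models, recall_threshold=0.8):
    # keep only models whose recall clears the threshold,
    # then take the one with the largest precision
    eligible = [m for m in models if m["recall"] > recall_threshold]
    return max(eligible, key=lambda m: m["precision"])

models = [
    {"name": "dnn_a", "recall": 0.95, "precision": 0.60},
    {"name": "dnn_b", "recall": 0.85, "precision": 0.75},
    {"name": "dnn_c", "recall": 0.70, "precision": 0.90},  # filtered out: recall too low
]
print(pick_by_recall_then_precision(models)["name"])  # dnn_b
```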
  • the present disclosure also provides an abnormal object recognition device.
  • the following are device embodiments of the present disclosure.
  • Fig. 5 is a block diagram showing a device for identifying abnormal objects according to an exemplary embodiment. As shown in FIG. 5, the device 500 includes:
  • the obtaining module 510 is configured to obtain a plurality of object data and a label corresponding to each of the object data representing whether the object is abnormal, and the object data includes a plurality of object characteristic values;
  • the object data dividing module 520 is configured to divide the multiple object data into a training set and a test set according to a predetermined rule, wherein the training set and the test set respectively contain multiple object data;
  • the training module 530 is configured to input multiple object data in the training set and labels corresponding to each object data to multiple deep neural network models to be trained, and train the multiple deep neural network models to be trained to Obtain a plurality of deep neural network models, wherein the connection weights between the neurons in each of the deep neural network models to be trained are randomly initialized;
  • the input module 540 is configured to input the object data in the test set to the multiple deep neural network models, to obtain the abnormal probability of each object data in the test set output by each of the deep neural network models;
  • the determining module 550 is configured to determine a target deep neural network model from the multiple deep neural network models according to the abnormal probability of each object data in the test set output by each deep neural network model;
  • the cascade module 560 is configured to cascade the target deep neural network model and the extreme gradient boosting model to obtain a cascade model, and train the cascade model by using multiple object data in the training set to obtain Trained cascade model;
  • the prediction module 570 is configured to input the object data to be recognized into the trained cascade model to predict whether the object corresponding to the object data to be recognized is abnormal.
  • According to a third aspect of the present disclosure, an electronic device capable of implementing the above method is also provided.
  • the electronic device 600 according to this embodiment of the present application will be described below with reference to FIG. 6.
  • the electronic device 600 shown in FIG. 6 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
  • the electronic device 600 is represented in the form of a general-purpose computing device.
  • the components of the electronic device 600 may include, but are not limited to: the aforementioned at least one processing unit 610, the aforementioned at least one storage unit 620, and a bus 630 connecting different system components (including the storage unit 620 and the processing unit 610).
  • the storage unit stores program code, and the program code can be executed by the processing unit 610, so that the processing unit 610 executes the steps according to the various exemplary embodiments of the present application described in the "Embodiment Methods" section of this specification.
  • the storage unit 620 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 621 and/or a cache storage unit 622, and may further include a read-only storage unit (ROM) 623.
  • the storage unit 620 may also include a program/utility tool 624 having a set of (at least one) program modules 625, such program modules 625 including, but not limited to: an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.
  • the bus 630 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, a graphics acceleration port, a processing unit, or a local bus using any of a variety of bus structures.
  • the electronic device 600 may also communicate with one or more external devices 800 (such as keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any device (such as a router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. This communication can be performed through an input/output (I/O) interface 650.
  • the electronic device 600 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through the network adapter 660.
  • the network adapter 660 communicates with other modules of the electronic device 600 through the bus 630. It should be understood that, although not shown in the figure, other hardware and/or software modules can be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, etc.
  • the example embodiments described here can be implemented by software, or can be implemented by combining software with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, U disk, mobile hard disk, etc.) or on the network , Including several instructions to make a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) execute the method according to the embodiment of the present disclosure.
  • According to a fourth aspect of the present disclosure, a computer-readable storage medium is also provided, on which is stored a program product capable of implementing the above method of this specification.
  • various aspects of the present application can also be implemented in the form of a program product, which includes program code.
  • when the program product runs on a terminal device, the program code is used to make the terminal device execute the steps according to the various exemplary embodiments of the present application described in the above-mentioned "Exemplary Method" section of this specification.
  • the computer-readable storage medium may be a non-volatile storage medium or a volatile storage medium.
  • a program product 700 for implementing the above method according to an embodiment of the present application is described. It can adopt a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device, for example, a personal computer.
  • the program product of this application is not limited to this.
  • the readable storage medium can be any tangible medium that contains or stores a program, and the program can be used by or combined with an instruction execution system, device, or device.
  • the program product can use any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • the computer-readable signal medium may include a data signal propagated in baseband or as a part of a carrier wave, and readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the readable signal medium may also be any readable medium other than a readable storage medium, and the readable medium may send, propagate, or transmit a program for use by or in combination with the instruction execution system, apparatus, or device.
  • the program code contained on the readable medium can be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the above.
  • the program code used to perform the operations of the present application can be written in any combination of one or more programming languages.
  • the programming languages include object-oriented programming languages, such as Java, C++, etc., as well as conventional procedural programming languages, such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computing device, partly on the user's device, as an independent software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • the remote computing device can be connected to a user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (for example, through the Internet using an Internet service provider).


Abstract

The present disclosure relates to the field of neural networks and discloses an abnormal object recognition method, device, medium, and electronic apparatus. The method includes: obtaining object data and labels, corresponding to the object data, that represent whether the objects are abnormal; dividing the object data into a training set and a test set; inputting the object data in the training set and the corresponding labels into multiple deep neural network models to be trained, to obtain multiple models; inputting the object data of the test set into the deep neural network models to obtain the abnormal probabilities output by the models; determining a target deep neural network model according to the abnormal probabilities output by the models; cascading the target deep neural network model with an extreme gradient boosting model to obtain a cascade model, and training the cascade model with the training set to obtain a trained cascade model; and inputting object data to be recognized into the trained cascade model for prediction. This method improves the accuracy of identifying abnormal objects and reduces the missed-recognition rate of abnormal objects.

Description

Abnormal object recognition method, device, medium, and electronic apparatus
This application claims priority to the Chinese patent application filed with the China Patent Office on October 12, 2019, with application number 201910970120.7 and the invention title "Abnormal object recognition method, device, medium, and electronic apparatus", the entire contents of which are incorporated into this application by reference.
Technical Field
The present disclosure relates to the technical field of neural networks, and in particular to an abnormal object recognition method, device, medium, and electronic apparatus.
Background
When computer-related technology is applied to practical business fields, it is often necessary to identify entities that do not meet certain requirements and then process the identified entities according to a certain strategy. For example, in the field of network traffic monitoring, corresponding rules currently have to be set in order to monitor and restrict abnormal illegal traffic or abnormally large traffic. However, the inventor found that this way of identifying specific entities with fixed rules suffers from defects such as low recognition accuracy and a high missed-recognition rate.
Summary
In the technical field of neural networks, in order to solve the above technical problem, the purpose of the present disclosure is to provide an abnormal object recognition method, device, medium, and electronic apparatus.
According to one aspect of the present application, an abnormal object recognition method is provided, the method including:
obtaining multiple object data and a label, corresponding to each of the object data, that represents whether the object is abnormal, the object data including multiple object feature values;
dividing the multiple object data into a training set and a test set according to a predetermined rule, wherein the training set and the test set each contain multiple object data;
inputting the multiple object data in the training set and the label corresponding to each object data into multiple deep neural network models to be trained, and training the multiple deep neural network models to be trained so as to obtain multiple deep neural network models, wherein the connection weights between the neurons in each of the deep neural network models to be trained are randomly initialized;
inputting the object data in the test set into the multiple deep neural network models respectively, so as to obtain the abnormal probability, output by each of the deep neural network models, of each of the object data in the test set;
determining a target deep neural network model from the multiple deep neural network models according to the abnormal probability of each of the object data in the test set output by each deep neural network model;
cascading the target deep neural network model with an extreme gradient boosting model to obtain a cascade model, and training the cascade model using the multiple object data in the training set to obtain a trained cascade model;
inputting object data to be recognized into the trained cascade model, so as to predict whether the object corresponding to the object data to be recognized is abnormal.
According to another aspect of the present application, an abnormal object recognition device is provided, the device including:
an obtaining module, configured to obtain multiple object data and a label, corresponding to each of the object data, that represents whether the object is abnormal, the object data including multiple object feature values;
an object data dividing module, configured to divide the multiple object data into a training set and a test set according to a predetermined rule, wherein the training set and the test set each contain multiple object data;
a training module, configured to input the multiple object data in the training set and the label corresponding to each object data into multiple deep neural network models to be trained, and to train the multiple deep neural network models to be trained so as to obtain multiple deep neural network models, wherein the connection weights between the neurons in each of the deep neural network models to be trained are randomly initialized;
an input module, configured to input the object data in the test set into the multiple deep neural network models respectively, so as to obtain the abnormal probability, output by each of the deep neural network models, of each of the object data in the test set;
a determining module, configured to determine a target deep neural network model from the multiple deep neural network models according to the abnormal probability of each of the object data in the test set output by each deep neural network model;
a cascade module, configured to cascade the target deep neural network model with an extreme gradient boosting model to obtain a cascade model, and to train the cascade model using the multiple object data in the training set to obtain a trained cascade model;
a prediction module, configured to input object data to be recognized into the trained cascade model, so as to predict whether the object corresponding to the object data to be recognized is abnormal.
According to another aspect of the present application, a computer-readable program medium is provided, which stores computer program instructions that, when executed by a computer, cause the computer to perform the following steps:
obtaining multiple object data and a label, corresponding to each of the object data, that represents whether the object is abnormal, the object data including multiple object feature values;
dividing the multiple object data into a training set and a test set according to a predetermined rule, wherein the training set and the test set each contain multiple object data;
inputting the multiple object data in the training set and the label corresponding to each object data into multiple deep neural network models to be trained, and training the multiple deep neural network models to be trained so as to obtain multiple deep neural network models, wherein the connection weights between the neurons in each of the deep neural network models to be trained are randomly initialized;
inputting the object data in the test set into the multiple deep neural network models respectively, so as to obtain the abnormal probability, output by each of the deep neural network models, of each of the object data in the test set;
determining a target deep neural network model from the multiple deep neural network models according to the abnormal probability of each of the object data in the test set output by each deep neural network model;
cascading the target deep neural network model with an extreme gradient boosting model to obtain a cascade model, and training the cascade model using the multiple object data in the training set to obtain a trained cascade model;
inputting object data to be recognized into the trained cascade model, so as to predict whether the object corresponding to the object data to be recognized is abnormal.
According to another aspect of the present application, an electronic apparatus is provided, the electronic apparatus including:
a processor;
a memory storing computer-readable instructions which, when executed by the processor, implement the following steps:
obtaining multiple object data and a label, corresponding to each of the object data, that represents whether the object is abnormal, the object data including multiple object feature values;
dividing the multiple object data into a training set and a test set according to a predetermined rule, wherein the training set and the test set each contain multiple object data;
inputting the multiple object data in the training set and the label corresponding to each object data into multiple deep neural network models to be trained, and training the multiple deep neural network models to be trained so as to obtain multiple deep neural network models, wherein the connection weights between the neurons in each of the deep neural network models to be trained are randomly initialized;
inputting the object data in the test set into the multiple deep neural network models respectively, so as to obtain the abnormal probability, output by each of the deep neural network models, of each of the object data in the test set;
determining a target deep neural network model from the multiple deep neural network models according to the abnormal probability of each of the object data in the test set output by each deep neural network model;
cascading the target deep neural network model with an extreme gradient boosting model to obtain a cascade model, and training the cascade model using the multiple object data in the training set to obtain a trained cascade model;
inputting object data to be recognized into the trained cascade model, so as to predict whether the object corresponding to the object data to be recognized is abnormal.
The embodiments of the present application first train multiple deep neural network models and then select, from the trained deep neural network models, the target deep neural network model most suitable for abnormal object recognition, so that the performance of the selected target deep neural network model is optimal. On this basis, by cascading the target deep neural network model with the extreme gradient boosting model, the strengths of the two models in classification and prediction are both retained, which improves the accuracy of identifying abnormal objects and reduces the missed-recognition rate of abnormal objects.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of a model structure used for an abnormal object recognition method according to an exemplary embodiment;
FIG. 2 is a flowchart of an abnormal object recognition method according to an exemplary embodiment;
FIG. 3 is a flowchart of the details of step 250 in one embodiment according to the embodiment corresponding to FIG. 2;
FIG. 4 is a flowchart of the details of step 250 in another embodiment according to the embodiment corresponding to FIG. 2;
FIG. 5 is a block diagram of an abnormal object recognition device according to an exemplary embodiment;
FIG. 6 is an example block diagram of an electronic apparatus implementing the above abnormal object recognition method according to an exemplary embodiment;
FIG. 7 shows a computer-readable storage medium implementing the above abnormal object recognition method according to an exemplary embodiment.
Detailed Description
Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities.
The present disclosure first provides an abnormal object recognition method. An object may be any tangible or intangible entity on which some action can be exerted, and may be anything that can be processed by a computing device. An abnormal object is an object whose characteristics or attributes do not meet certain requirements; it is necessary to monitor and identify abnormal objects among all objects and to process or restrict them in a predetermined way or according to predetermined rules. The abnormal object recognition method provided by the present disclosure can be applied to the fields of network traffic monitoring and insurance. For example, in the field of network traffic monitoring, the objects are visitors' traffic, and the abnormal objects are abnormal traffic, such as the traffic of illegal users or abnormally large traffic; it is necessary to monitor and restrict such abnormal traffic to avoid network congestion and thereby guarantee the availability of the network platform. In the insurance field, the objects are customers who file insurance applications, and the abnormal objects are customers with a high risk of insurance fraud for whom insurance should not be handled, or customers for whom the insurance taking effect is more likely; it is therefore necessary to identify such customers and avoid, at the source, handling insurance business for customers unsuitable for insuring, thereby reducing the possibility that the insurance company's interests are damaged by the existence of such customers and improving the insurance company's benefits.
The technical solution of the present application can be applied to the field of artificial intelligence technology and involves neural networks. The implementation terminal of the present disclosure may be any device with computing and processing functions, which may be connected to external devices for receiving or sending data. It may specifically be a portable mobile device, such as a smartphone, tablet computer, notebook computer, or PDA (Personal Digital Assistant); a fixed device, such as a computer device, field terminal, desktop computer, server, or workstation; or a collection of multiple devices, such as the physical infrastructure of cloud computing.
Preferably, the implementation terminal of the present disclosure may be a server or the physical infrastructure of cloud computing.
FIG. 1 is a schematic diagram of a model structure used for an abnormal object recognition method according to an exemplary embodiment. As shown in FIG. 1, the model structure used for the abnormal object recognition method includes a deep neural network model 110 and an extreme gradient boosting model 120. A deep neural network (DNN) model is a network in the deep learning family of algorithms; it imitates the way the brain passes signals between neurons to process information and includes multiple layers of neurons, with multiple neuron nodes per layer. A deep neural network model can include at least a multi-layer neuron structure with an input layer, hidden layers, and an output layer, and can further include at least one fully connected layer, where the hidden layers may themselves form a stacked multi-layer structure; the "depth" in a deep neural network means that the path from input to output is sufficiently long. The extreme gradient boosting model, i.e., the Xgboost (eXtreme Gradient Boosting) model, is a strong classifier model composed of multiple weak classifiers; it is a boosted tree model, and the tree model used is generally the CART (Classification And Regression Tree) regression tree. Referring to FIG. 1, the extreme gradient boosting model 120 includes multiple weak classifiers, namely CART regression binary trees. Xgboost grows each tree by continuously splitting on features; each tree represents a trained function, and each newly grown tree fits the residual predicted by the previously generated trees.
FIG. 2 is a flowchart of an abnormal object recognition method according to an exemplary embodiment. As shown in FIG. 2, the method may include the following steps:
Step 210: obtain multiple object data and a label, corresponding to each of the object data, that represents whether the object is abnormal.
The object data includes multiple object feature values; the object data corresponds to an object, and each object feature value corresponds to an object feature.
Object data is data related to an object; it may be data generated by the object itself, or data obtained by recording the object's behavior during its activity. For example, in the field of network traffic monitoring, the objects are visitors' traffic and the abnormal objects are abnormal traffic, such as traffic generated by illegal visitors or abnormally large traffic; in this case the object data, such as the IP address and WIFI name corresponding to the visitor's traffic, is data obtained by recording the object's behavior during its activity. In the insurance field, the objects are customers who file insurance applications and the abnormal objects are customers with a high risk of insurance fraud for whom insurance should not be handled, or customers for whom the insurance taking effect is more likely; in this case the object data, such as the customer's occupation, age, pension, housing fund, and personal assets, is data generated by the object itself.
The label corresponding to each object data, representing whether the object is abnormal, identifies whether the object corresponding to that object data is abnormal. The label may be annotated manually, for example relying on expert experience; it may also be annotated automatically by a machine, for example when experts classify the object data in advance based on experience and a machine automatically annotates labels according to the category each object data belongs to.
The specific form of the label corresponding to each object data may be arbitrary, as long as it can be recognized by a computer device. For example, the label representing an abnormal object may be "NO" while the label representing a non-abnormal object may be "YES"; or the label representing an abnormal object may be "OK" while the corresponding label representing a non-abnormal object may be "" (empty); or the label representing an abnormal object may be "1" while the corresponding label representing a non-abnormal object may be "0".
In one embodiment, the object data and the label corresponding to each object data are stored together in a database, and both are obtained by querying that database.
In one embodiment, the multiple object data and the data identifier corresponding to each object data are stored on a first terminal, while the label corresponding to each object data and the data identifier corresponding to each object data are stored on a second terminal. The multiple object data and the data identifier corresponding to each object data are first obtained from the first terminal, and then the label corresponding to each data identifier is obtained from the second terminal using the data identifier corresponding to each object data, thereby obtaining the object data and the label corresponding to each object data.
Each object feature value represents the value of an object feature of the object data in one dimension; an object feature may also be called an object attribute, and an object feature value may also be called an object attribute value. For example, in the field of network traffic monitoring, if the object feature is the IP address, the value of the object feature corresponding to the IP address may be 158.135.213.25; in the insurance field, if the object feature is the monthly pension contribution, the value of this object feature may be 1000.
Step 220: divide the multiple object data into a training set and a test set according to a predetermined rule.
The training set and the test set each contain multiple object data; that is, both the training set and the test set are sets of object data.
In one embodiment, the predetermined rule is to keep the number of object data in the training set and the number of object data in the test set at a predetermined ratio.
The advantage of this embodiment is that the relative relationship between the sizes of the training set and the test set is kept within a relatively stable range.
For example, the predetermined ratio may be 7:3; that is, for every 7 object data assigned to the training set, 3 object data are correspondingly assigned to the test set. If the number of object data among the multiple object data is 100, the training set receives 70 object data and the test set receives 30.
In one embodiment, a predetermined number of object data are taken from the multiple object data to form the training set, and the remaining object data form the test set. The advantage of this embodiment is that, by constraining the number of object data composing the training set, a good training effect can be achieved when the training set is used for model training.
In one embodiment, the predetermined rule is to keep the number of object data in the training set and the number in the test set at a predetermined ratio and to make the proportion of object data whose label represents an abnormal object the same in the training set and the test set.
The advantage of this embodiment is that it avoids the possibility of introducing additional bias into the modeling process caused by different proportions of same-labelled object data in the training and test sets at division time, and to a certain extent guarantees the precision of the established model.
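A sketch of such a split, keeping a 7:3 size ratio and the same share of abnormal-labelled data in both sets (a stratified split), might look like this; the helper name and the toy data are assumptions for illustration.

```python
import random

def split(object_data, labels, train_fraction=0.7, seed=0):
    """Split so that train/test sizes keep the 7:3 ratio and the share of
    abnormal-labelled data is the same in both sets (stratified split)."""
    rng = random.Random(seed)
    train, test = [], []
    for flag in (True, False):                       # abnormal / normal strata
        idx = [i for i, l in enumerate(labels) if l == flag]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train += idx[:cut]
        test += idx[cut:]
    return [object_data[i] for i in train], [object_data[i] for i in test]

data = list(range(100))
labels = [i < 20 for i in data]                      # 20 abnormal, 80 normal
train, test = split(data, labels)
print(len(train), len(test))  # 70 30
```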
Step 230: input the multiple object data in the training set and the label corresponding to each object data into multiple deep neural network models to be trained, and train the multiple deep neural network models to be trained so as to obtain multiple deep neural network models.
The connection weights between the neurons in each of the deep neural network models to be trained are randomly initialized.
After the multiple object data in the training set and the corresponding labels are input into a deep neural network model to be trained, the label corresponding to each object data is converted to a numeric value, and the object feature values in the object data are first converted into a vector; the vector is transformed and mapped through the connections of the multi-layer neurons in the deep neural network model, finally yielding the prediction value output by the model. The difference between this prediction value and the numeric value converted from the object data's label is then calculated, and based on this difference the connection weights of the multi-layer neurons in the deep neural network model are adjusted using stochastic gradient descent (SGD) and the error back-propagation algorithm (BP algorithm). This process is executed iteratively until the number of iterations reaches a predetermined iteration threshold or the training of the model satisfies a predetermined condition; the model obtained at that point is a trained deep neural network model.
In one embodiment, parameters such as the batch size and the learning rate can be set when training the deep neural network models to be trained.
Random initialization of the connection weights between the neurons in each deep neural network model to be trained means that the connection weight between every pair of neurons in each model is set randomly at the start; the connection weights between different pairs of neurons within the same model are very likely different, and the connection weights in different models are also very likely different. In this way, the connection weights in each trained deep neural network model are essentially all different, which guarantees the specificity of each trained model; that is, each trained deep neural network model is a unique model.
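The random initialization that distinguishes the models to be trained can be sketched as follows; the layer sizes and the uniform(-1, 1) range are assumptions, since the text does not specify an initialization distribution.

```python
import random

def init_weights(layer_sizes, seed):
    """Randomly initialize the connection weights of one DNN to be trained.

    Returns, per layer pair, a weight matrix as nested lists; the uniform
    range is an illustrative assumption.
    """
    rng = random.Random(seed)
    return [[[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

# Several models to be trained, each with its own random initialization,
# so the trained models end up as distinct networks.
models = [init_weights([4, 3, 1], seed=s) for s in range(5)]
print(models[0] != models[1])  # True: the initializations differ
```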
Step 240: input the object data in the test set into the multiple deep neural network models respectively, so as to obtain the abnormal probability, output by each of the deep neural network models, of each of the object data in the test set.
A trained deep neural network model can make predictions on each object data, outputting a corresponding prediction result for the input object data. The prediction result is the abnormal probability of the object data, which measures the possibility that the object corresponding to the object data is abnormal: the larger the abnormal probability, the more likely the corresponding object is an abnormal object.
It is easy to understand that, since object data and objects correspond one to one, we judge the abnormality of an object through the abnormality of its object data; therefore, the abnormality of an object may also be called the abnormality of its object data.
The test set is used to check and evaluate the performance of each trained deep neural network model. It is easy to understand that the weights between neurons differ among the trained deep neural network models, so the models are different models. For each object data in the test set, the abnormal probabilities output by the various deep neural network models may all be different, so it is necessary to test the several different trained models with the test set in order to evaluate them.
Step 250: determine a target deep neural network model from the multiple deep neural network models according to the abnormal probability of each of the object data in the test set output by each deep neural network model.
In this step, the target deep neural network model is selected from the multiple trained deep neural network models based on the abnormal probability each model outputs for each object data in the test set.
Since the connection weights between neurons differ among the trained deep neural network models, their performance often differs as well. By using the abnormal probabilities each model outputs for the object data in the test set, the performance of each trained model can be evaluated, so that the deep neural network model most suitable for abnormal object prediction can be selected.
In one embodiment, the specific steps of step 250 may be as shown in FIG. 3. FIG. 3 is a flowchart of the details of step 250 in one embodiment according to the embodiment corresponding to FIG. 2. As shown in FIG. 3, the following steps are included:
Step 251: obtain the ratio of the number of object data in the test set whose corresponding label represents an abnormal object to the number of all object data contained in the test set, as a first ratio.
In one embodiment, a counter is embedded in the implementation terminal of the present disclosure and can count the number of object data. Specifically, a first counter in the implementation terminal counts the number of object data in the test set whose corresponding label represents an abnormal object. The first counter and a second counter are first set to 0; for each object data in the test set, it is judged whether the corresponding label represents an abnormal object, and if so, the first counter is incremented by 1. At the same time, for every judgment made on an object data in the test set, the second counter is also incremented by 1, until all object data have been judged.
Step 252: for each deep neural network model, sort the abnormal probabilities, output by that model, of the object data in the test set in descending order.
In one embodiment, the abnormal probabilities of the object data are sorted using the bubble sort algorithm.
In one embodiment, the abnormal probabilities of the object data are sorted using the quicksort algorithm.
Step 253: for each deep neural network model, divide the object data corresponding to that model into a predetermined number of groups according to the sorted order.
Each object data belongs to one group. The purpose of grouping is to make most groups contain the same number of object data.
In one embodiment, all object data are divided evenly into the predetermined number of groups, where, when the number of all object data is divisible by the predetermined number, every group contains the same number of object data, and when it is not divisible, all groups except the last contain the same number of object data.
In one embodiment, the predetermined number is a first predetermined number; when grouping the object data, a preset quantity of object data is assigned to each of the first second-predetermined-number of groups, and the remaining object data are assigned to the not-yet-assigned groups, where the first predetermined number is greater than the second predetermined number.
Step 254: for each deep neural network model, and for each group of object data corresponding to that model, obtain the ratio of the number of object data in the group whose label represents an abnormal object to the number of all object data contained in the group, as a second ratio.
For each deep neural network model there is a grouping and sorting of object data corresponding to that model, so the second ratios of the same-ranked groups corresponding to different models may all be different.
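Steps 251 to 254 (first ratio, descending sort by abnormal probability, grouping, per-group second ratios) can be sketched for a single model as follows; the probabilities and labels are made up, and the even-grouping rule with the remainder in the last group follows the first grouping embodiment above.

```python
def first_and_second_ratios(probs, labels, n_groups):
    # first ratio: share of abnormal-labelled data in the whole test set
    first = sum(labels) / len(labels)
    # sort object data indices by the model's abnormal probability, descending
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    size = len(order) // n_groups
    groups = [order[k * size:(k + 1) * size] for k in range(n_groups)]
    groups[-1] += order[n_groups * size:]   # any remainder goes to the last group
    # second ratio per group: share of abnormal-labelled data inside the group
    seconds = [sum(labels[i] for i in g) / len(g) for g in groups]
    return first, seconds

probs  = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1,   1,   0,   1,   0,   0]
print(first_and_second_ratios(probs, labels, 3))  # (0.5, [1.0, 0.5, 0.0])
```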
Step 255: determine the target deep neural network model from the multiple deep neural network models based on the first ratio and the second ratios obtained for each deep neural network model.
In one embodiment, step 255 may include:
for each deep neural network model, obtaining the second ratio corresponding to the top-ranked group of object data corresponding to that model, as a target second ratio; for each deep neural network model, determining the ratio of the target second ratio obtained for that model to the first ratio, as a third ratio; and taking the deep neural network model with the largest third ratio as the target deep neural network model.
Each deep neural network model has its own grouping and ordering of object data; therefore, among the groups of object data corresponding to each model there is a top-ranked group, which accordingly has a second ratio, and this second ratio can serve as the target second ratio of the corresponding model.
Since the groups of object data corresponding to each deep neural network model are sorted in descending order of the abnormal probabilities the model outputs, the higher an object data is ranked, the more likely it is to be identified by the corresponding model as abnormal object data (i.e., the model judges the corresponding object to be an abnormal object). The top-ranked group of object data corresponding to a model is therefore the data that the model considers most likely, among all object data, to correspond to abnormal objects, while the first ratio reflects the proportion of object data among all object data whose label represents an abnormal object, i.e., the proportion of objects that are abnormal. Hence, the larger the ratio of a model's target second ratio to the first ratio, i.e., the third ratio, the better that model performs at identifying abnormal object data compared with picking object data corresponding to abnormal objects at random, and the higher its accuracy in identifying abnormal object data. The advantage of this embodiment is therefore that, by selecting the deep neural network model with the largest third ratio as the target deep neural network model, the performance of the selected target model is optimal, which improves the precision of the finally established model for identifying abnormal objects.
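The selection just described, taking the top-ranked group's second ratio over the first ratio as a third ratio and keeping the model with the largest value, can be sketched as follows; all numbers are illustrative.

```python
def pick_by_third_ratio(models, first_ratio):
    # third ratio: the top-ranked group's second ratio over the overall
    # abnormal share; the model with the largest value wins
    return max(models, key=lambda m: m["second_ratios"][0] / first_ratio)

first_ratio = 0.1          # 10% of the test set is labelled abnormal
models = [
    {"name": "dnn_a", "second_ratios": [0.40, 0.20, 0.05]},  # third ratio 4.0
    {"name": "dnn_b", "second_ratios": [0.55, 0.15, 0.05]},  # third ratio 5.5
]
print(pick_by_third_ratio(models, first_ratio)["name"])  # dnn_b
```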
In one embodiment, step 255 may include:
for each deep neural network model, obtaining the average of the second ratios corresponding to the first third-predetermined-number of groups of object data corresponding to that model, as a target second ratio; for each deep neural network model, determining the ratio of the target second ratio obtained for that model to the first ratio, as a third ratio; and taking the deep neural network model with the largest third ratio as the target deep neural network model.
For example, if the third predetermined number is 3, the average of the second ratios corresponding to the first 3 groups of object data is obtained for each deep neural network model.
The second ratio corresponding to the top-ranked group may not fully and objectively reflect a deep neural network model's performance; for example, a model's top-ranked second ratio may be small while the second ratios of its first several groups as a whole are large enough, which also indicates that the model performs rather well. The advantage of this embodiment is therefore that the average is used to determine the overall size of the second ratios of the first several groups corresponding to each model, and this overall size is then used to select the target deep neural network model, which improves the fairness and reliability of the target model selection.
In one embodiment, step 255 may include:
for each deep neural network model, comparing the abnormal probability, output by that model, of each object data in the test set with a preset abnormal probability threshold, to determine whether the model's prediction result for each object data in the test set is abnormal; calculating the recall and precision of each deep neural network model based on the labels, corresponding to the object data in the test set, that represent whether the objects are abnormal, and each model's prediction results for the object data in the test set; and determining the target deep neural network model from the multiple deep neural network models according to each model's recall and precision, the first ratio, and the second ratios obtained for each model.
In one embodiment, the recall and precision of each deep neural network model are calculated using the following formulas:
Figure PCTCN2020092812-appb-000001
where P is the precision rate; R is the recall rate; TP is the number of object data in the test set for which the deep neural network model's prediction result is abnormal and whose corresponding label represents an abnormal object; FP is the number of object data in the test set for which the model's prediction result is normal but whose corresponding label represents an abnormal object; and FN is the number of object data in the test set for which the model's prediction result is abnormal but whose corresponding label represents a normal object.
In one embodiment, determining the target deep neural network model from the multiple deep neural network models according to each model's recall and precision as well as the first ratio and the second ratios obtained for each model includes:
calculating a first parameter for each deep neural network model using its recall and precision; obtaining a second parameter for each model using the first ratio and the second ratios obtained for that model; and determining the target deep neural network model from the multiple models based on the first parameter and the second parameter of each model.
The advantage of this embodiment is that, by integrating indicators in several dimensions, namely the second ratios, the recall, and the precision corresponding to each deep neural network model, the selected target deep neural network model performs better and is more suitable for identifying abnormal objects.
In one embodiment, calculating the first parameter of each deep neural network model using its recall and precision includes: taking the average of each model's recall and precision as its first parameter. Obtaining the second parameter of each model using the first ratio and the second ratios obtained for that model includes: obtaining the weighted sum of the first predetermined number of second ratios corresponding to each model, and calculating the ratio of this weighted sum to the first ratio as the model's second parameter. Determining the target deep neural network model based on the first parameter and the second parameter of each model includes: obtaining weights for the first parameter and the second parameter respectively; for each model, using these weights to determine the weighted sum of the model's first and second parameters; and taking the model with the largest weighted sum as the target deep neural network model.
Step 260: cascade the target deep neural network model with the extreme gradient boosting model to obtain a cascade model, and train the cascade model using the multiple object data in the training set to obtain a trained cascade model.
Cascading the target deep neural network model with the extreme gradient boosting model means passing the output of the target deep neural network model directly to the extreme gradient boosting model as its input.
In one embodiment, the target deep neural network model includes an output layer and at least one hidden layer, and cascading the target deep neural network model with the extreme gradient boosting model to obtain a cascade model and training the cascade model using the multiple object data in the training set to obtain a trained cascade model includes:
removing the output layer of the target deep neural network model and cascading the last hidden layer of the target deep neural network model with the extreme gradient boosting model, so that the feature vector output by the last hidden layer can be input to the extreme gradient boosting model, obtaining the cascade model;
training the cascade model using the multiple object data in the training set to obtain a trained cascade model.
The cascade model can be trained using stochastic gradient descent and the error back-propagation algorithm.
Step 270: input the object data to be recognized into the trained cascade model, so as to predict whether the object corresponding to the object data to be recognized is abnormal.
As described above, object data and objects correspond one to one, so predicting whether the object data is abnormal is equivalent to predicting whether the object is abnormal.
Once the cascade model has been trained, it can be used to make predictions on object data. For example, in the insurance field, it predicts whether handling insurance for the corresponding customer should be disallowed; in the field of network traffic monitoring, it predicts whether a visitor's traffic is abnormal traffic. The cascade model combines the advantages of the deep neural network model and the extreme gradient boosting (xgboost) model: compared with a standalone deep neural network model it improves interpretability, and compared with the xgboost model it improves prediction precision to a certain extent.
In summary, according to the abnormal object recognition method shown in the embodiment of FIG. 2, by cascading the selected high-precision target deep neural network model with the extreme gradient boosting model, the strengths of the two models in classification and prediction are both retained, which improves the accuracy of identifying abnormal objects and reduces the missed-recognition rate of abnormal objects.
FIG. 4 is a flowchart of the details of step 250 in another embodiment according to the embodiment corresponding to FIG. 2. As shown in FIG. 4, the following steps are included:
Step 251': for each deep neural network model, compare the abnormal probability, output by that model, of each object data in the test set with a preset abnormal probability threshold, and determine whether the model's prediction result for each object data in the test set is abnormal.
For example, if the preset abnormal probability threshold is 0.7 and the deep neural network model outputs an abnormal probability of 0.75 for an object data, then since 0.75 > 0.7, the model's prediction result for that object data can be determined to be abnormal.
Step 252': based on the labels, corresponding to the object data in the test set, that represent whether the objects are abnormal, and each deep neural network model's prediction results for the object data in the test set, calculate the recall and precision of each deep neural network model.
The recall and precision can be calculated in the manner provided by the foregoing embodiments, which is not repeated here.
Step 253': from among the deep neural network models whose recall is greater than a preset recall threshold, select the deep neural network model with the largest precision as the target deep neural network model.
The recall reflects the proportion of truly abnormal object data among the object data in the test set that the deep neural network model predicts to be abnormal. Therefore, in order to identify abnormal objects as far as possible, the recall of the selected model needs to be high enough. The advantage of this embodiment is that, by constraining the recall of the selected target deep neural network model and, on that basis, selecting the model with the largest precision, the selected target deep neural network model is more suitable for identifying abnormal objects while maintaining sufficiently high precision.
The present disclosure also provides an abnormal object recognition device; the following are device embodiments of the present disclosure.
FIG. 5 is a block diagram of an abnormal object recognition device according to an exemplary embodiment. As shown in FIG. 5, the device 500 includes:
an obtaining module 510, configured to obtain multiple object data and a label, corresponding to each of the object data, that represents whether the object is abnormal, the object data including multiple object feature values;
an object data dividing module 520, configured to divide the multiple object data into a training set and a test set according to a predetermined rule, wherein the training set and the test set each contain multiple object data;
a training module 530, configured to input the multiple object data in the training set and the label corresponding to each object data into multiple deep neural network models to be trained, and to train the multiple deep neural network models to be trained so as to obtain multiple deep neural network models, wherein the connection weights between the neurons in each of the deep neural network models to be trained are randomly initialized;
an input module 540, configured to input the object data in the test set into the multiple deep neural network models respectively, so as to obtain the abnormal probability, output by each of the deep neural network models, of each of the object data in the test set;
a determining module 550, configured to determine a target deep neural network model from the multiple deep neural network models according to the abnormal probability of each of the object data in the test set output by each deep neural network model;
a cascade module 560, configured to cascade the target deep neural network model with an extreme gradient boosting model to obtain a cascade model, and to train the cascade model using the multiple object data in the training set to obtain a trained cascade model;
a prediction module 570, configured to input object data to be recognized into the trained cascade model, so as to predict whether the object corresponding to the object data to be recognized is abnormal.
According to a third aspect of the present disclosure, an electronic apparatus capable of implementing the above method is also provided.
Those skilled in the art will understand that the various aspects of the present application can be implemented as a system, a method, or a program product. Therefore, the various aspects of the present application can be embodied in the following forms: a complete hardware implementation, a complete software implementation (including firmware, microcode, etc.), or an implementation combining hardware and software, which may be collectively referred to herein as a "circuit", "module", or "system".
The electronic apparatus 600 according to this embodiment of the present application is described below with reference to FIG. 6. The electronic apparatus 600 shown in FIG. 6 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
As shown in FIG. 6, the electronic apparatus 600 takes the form of a general-purpose computing device. The components of the electronic apparatus 600 may include, but are not limited to: the aforementioned at least one processing unit 610, the aforementioned at least one storage unit 620, and a bus 630 connecting different system components (including the storage unit 620 and the processing unit 610).
The storage unit stores program code that can be executed by the processing unit 610, so that the processing unit 610 performs the steps according to the various exemplary embodiments of the present application described in the "Embodiment Methods" section above of this specification.
The storage unit 620 may include a readable medium in the form of a volatile storage unit, such as a random access memory unit (RAM) 621 and/or a cache storage unit 622, and may further include a read-only memory unit (ROM) 623.
The storage unit 620 may also include a program/utility 624 having a set of (at least one) program modules 625; such program modules 625 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data, and each or some combination of these examples may include an implementation of a network environment.
The bus 630 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, a graphics acceleration port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic apparatus 600 may also communicate with one or more external devices 800 (such as a keyboard, pointing device, Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic apparatus 600, and/or with any device that enables the electronic apparatus 600 to communicate with one or more other computing devices (such as a router, modem, etc.). Such communication may be performed through an input/output (I/O) interface 650. Moreover, the electronic apparatus 600 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 660. As shown in the figure, the network adapter 660 communicates with the other modules of the electronic apparatus 600 through the bus 630. It should be understood that, although not shown in the figure, other hardware and/or software modules can be used in conjunction with the electronic apparatus 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
Through the above description of the implementations, those skilled in the art will readily understand that the example implementations described here can be realized by software, or by software combined with the necessary hardware. Therefore, the technical solution according to the implementations of the present disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, USB flash drive, mobile hard disk, etc.) or on a network, and includes several instructions to make a computing device (which can be a personal computer, a server, a terminal device, a network device, etc.) execute the method according to the implementations of the present disclosure.
According to a fourth aspect of the present disclosure, a computer-readable storage medium is also provided, on which is stored a program product capable of implementing the above method of this specification. In some possible implementations, the various aspects of the present application can also be implemented in the form of a program product including program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to the various exemplary implementations of the present application described in the "Exemplary Method" section above of this specification. Optionally, the computer-readable storage medium may be a non-volatile storage medium or a volatile storage medium.
Referring to FIG. 7, a program product 700 for implementing the above method according to an implementation of the present application is described; it may adopt a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer. However, the program product of the present application is not limited to this: in this document, a readable storage medium may be any tangible medium containing or storing a program, which may be used by or in combination with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium, which can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
The program code contained on the readable medium may be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the above.
The program code for performing the operations of the present application can be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computing device, partly on the user's device, as an independent software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. In the case involving a remote computing device, the remote computing device can be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computing device (for example, through the Internet using an Internet service provider).
此外,上述附图仅是根据本申请示例性实施例的方法所包括的处理的示意性说明,而不是限制目的。易于理解,上述附图所示的处理并不表明或限制这些处理的时间顺序。另外,也易于理解,这些处理可以是例如在多个模块中同步或异步执行的。
应当理解的是,本申请并不局限于上面已经描述并在附图中示出的精确结构,并且可以在不脱离其范围执行各种修改和改变。本申请的范围仅由所附的权利要求来限制。

Claims (20)

  1. A method for identifying abnormal objects, wherein the method comprises:
    acquiring a plurality of pieces of object data and, for each piece of object data, a corresponding label indicating whether the object is abnormal, the object data comprising a plurality of object feature values;
    dividing the plurality of pieces of object data into a training set and a test set according to a predetermined rule, wherein the training set and the test set each contain a plurality of pieces of object data;
    inputting the pieces of object data in the training set and their corresponding labels into a plurality of deep neural network models to be trained, and training the plurality of deep neural network models to be trained to obtain a plurality of deep neural network models, wherein the connection weights between neurons in each deep neural network model to be trained are randomly initialized;
    inputting the pieces of object data in the test set into each of the plurality of deep neural network models to obtain, from each deep neural network model, an abnormality probability for each piece of object data in the test set;
    determining a target deep neural network model among the plurality of deep neural network models according to the abnormality probabilities output by each deep neural network model for the pieces of object data in the test set;
    cascading the target deep neural network model with an extreme gradient boosting model to obtain a cascade model, and training the cascade model with the pieces of object data in the training set to obtain a trained cascade model; and
    inputting object data to be identified into the trained cascade model to predict whether the object corresponding to the object data to be identified is abnormal.
  2. The method according to claim 1, wherein determining a target deep neural network model among the plurality of deep neural network models according to the abnormality probabilities output by each deep neural network model for the pieces of object data in the test set comprises:
    obtaining, as a first ratio, the ratio of the number of pieces of object data in the test set whose labels indicate an abnormal object to the total number of pieces of object data in the test set;
    for each deep neural network model, sorting the abnormality probabilities output by that deep neural network model for the pieces of object data in the test set in descending order;
    for each deep neural network model, dividing the pieces of object data corresponding to that deep neural network model into a predetermined number of groups according to the sorted order, each piece of object data belonging to one group;
    for each deep neural network model, and for each group of object data corresponding to that deep neural network model, obtaining, as a second ratio, the ratio of the number of pieces of object data in the group whose labels indicate an abnormal object to the total number of pieces of object data in the group; and
    determining the target deep neural network model among the plurality of deep neural network models based on the first ratio and the second ratios obtained for each deep neural network model.
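As an illustration of the ratios defined in claim 2 (this sketch is not part of the claim; the function names, the even group sizes, and the label convention 1 = abnormal are all illustrative assumptions):

```python
# First ratio: overall fraction of abnormal samples in the test set.
# Second ratios: abnormal fraction inside each group after sorting the samples
# by a model's abnormality probability in descending order.

def first_ratio(labels):
    """Fraction of test-set object data labeled abnormal (label == 1)."""
    return sum(labels) / len(labels)

def second_ratios(labels, probs, num_groups):
    """Sort samples by the model's abnormality probability in descending order,
    split them into num_groups contiguous groups, and return the fraction of
    truly abnormal samples in each group (assumes len(labels) divides evenly)."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    group_size = len(order) // num_groups
    ratios = []
    for g in range(num_groups):
        idx = order[g * group_size:(g + 1) * group_size]
        ratios.append(sum(labels[i] for i in idx) / len(idx))
    return ratios
```

For a well-calibrated model the second ratios decrease from the first group to the last, since the highest-probability group should concentrate the abnormal samples.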
  3. The method according to claim 2, wherein determining the target deep neural network model among the plurality of deep neural network models based on the first ratio and the second ratios obtained for each deep neural network model comprises:
    for each deep neural network model, obtaining, as a target second ratio, the second ratio corresponding to the first-ranked group of object data for that deep neural network model;
    for each deep neural network model, determining, as a third ratio, the ratio of the target second ratio obtained for that deep neural network model to the first ratio; and
    taking the deep neural network model with the largest third ratio as the target deep neural network model.
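The third ratio of claim 3 is, in effect, the lift of the model's top-ranked group: the abnormal fraction in the group the model scores highest, divided by the abnormal fraction of the whole test set. A self-contained illustrative sketch (function names and the label convention are assumptions, not claim language):

```python
def third_ratio(labels, probs, num_groups):
    """Lift of the top-ranked group: the abnormal fraction within the group of
    samples the model scores highest (the 'target second ratio'), divided by
    the abnormal fraction of the whole test set (the 'first ratio')."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    top = order[:len(order) // num_groups]            # first-ranked group
    top_ratio = sum(labels[i] for i in top) / len(top)
    overall = sum(labels) / len(labels)
    return top_ratio / overall

def select_by_lift(model_probs, labels, num_groups):
    """Return the model name with the largest third ratio."""
    return max(model_probs, key=lambda m: third_ratio(labels, model_probs[m], num_groups))
```

A third ratio of 1.0 means the model's top group is no better than random sampling; the larger the lift, the better the model concentrates abnormal objects at the top of its ranking.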
  4. The method according to claim 2, wherein determining the target deep neural network model among the plurality of deep neural network models based on the first ratio and the second ratios obtained for each deep neural network model comprises:
    for each deep neural network model, comparing the abnormality probability output by that deep neural network model for each piece of object data in the test set with a preset abnormality probability threshold, to determine whether that deep neural network model's prediction for each piece of object data in the test set is abnormal;
    calculating the recall and precision of each deep neural network model based on the labels, corresponding to the pieces of object data in the test set, indicating whether the objects are abnormal, and on each deep neural network model's predictions for the pieces of object data in the test set; and
    determining the target deep neural network model among the plurality of deep neural network models according to the recall and precision of each deep neural network model, the first ratio, and the second ratios obtained for each deep neural network model.
  5. The method according to claim 4, wherein determining the target deep neural network model among the plurality of deep neural network models according to the recall and precision of each deep neural network model, the first ratio, and the second ratios obtained for each deep neural network model comprises:
    calculating a first parameter of each deep neural network model from the recall and precision of that deep neural network model;
    obtaining a second parameter of each deep neural network model from the first ratio and the second ratios obtained for that deep neural network model; and
    determining the target deep neural network model among the plurality of deep neural network models based on the first parameter and the second parameter of each deep neural network model.
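Claim 5 does not fix how the first parameter is computed from recall and precision, nor how the two parameters are combined. One plausible instantiation, offered purely as an illustration (the F1 score as the first parameter and a weighted sum as the combination are assumptions, not claim language):

```python
def f1_score(recall, precision):
    """One plausible 'first parameter': the harmonic mean (F1) of recall and
    precision. The claim itself leaves the formula unspecified."""
    return 2 * recall * precision / (recall + precision) if recall + precision else 0.0

def combined_score(first_param, second_param, weight=0.5):
    """Hypothetical combination of the two parameters into a single model score;
    the claim leaves the combination rule open."""
    return weight * first_param + (1 - weight) * second_param
```

Under this instantiation, the model with the largest combined score would be taken as the target deep neural network model.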
  6. The method according to claim 1, wherein determining a target deep neural network model among the plurality of deep neural network models according to the abnormality probabilities output by each deep neural network model for the pieces of object data in the test set comprises:
    for each deep neural network model, comparing the abnormality probability output by that deep neural network model for each piece of object data in the test set with a preset abnormality probability threshold, to determine whether that deep neural network model's prediction for each piece of object data in the test set is abnormal;
    calculating the recall and precision of each deep neural network model based on the labels, corresponding to the pieces of object data in the test set, indicating whether the objects are abnormal, and on each deep neural network model's predictions for the pieces of object data in the test set; and
    selecting, as the target deep neural network model, the deep neural network model with the highest precision among the deep neural network models whose recall exceeds a preset recall threshold.
  7. The method according to claim 1, wherein the target deep neural network model comprises an output layer and at least one hidden layer, and wherein cascading the target deep neural network model with an extreme gradient boosting model to obtain a cascade model, and training the cascade model with the pieces of object data in the training set to obtain a trained cascade model, comprises:
    removing the output layer of the target deep neural network model and cascading the last hidden layer of the target deep neural network model with the extreme gradient boosting model, so that the feature vectors output by the last hidden layer of the target deep neural network model can be input into the extreme gradient boosting model, thereby obtaining the cascade model; and
    training the cascade model with the pieces of object data in the training set to obtain the trained cascade model.
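The data flow of the cascade in claim 7 can be sketched as follows. This is an illustrative minimal sketch: the tiny ReLU network and the stand-in logistic "booster" below are hypothetical placeholders; a real system would use a deep learning framework for the network and an actual XGBoost model for the boosting stage.

```python
import math

def hidden_features(x, weights_list):
    """Forward pass through the hidden layers only (output layer removed).
    weights_list holds one weight matrix per hidden layer; ReLU activations.
    Returns the feature vector produced by the last hidden layer."""
    h = x
    for W in weights_list:
        h = [max(0.0, sum(w_ij * h_j for w_ij, h_j in zip(row, h))) for row in W]
    return h

class StandInBooster:
    """Placeholder for the extreme gradient boosting stage: here just a logistic
    model over the extracted features, to show the cascade's data flow."""
    def __init__(self, coefs, bias=0.0):
        self.coefs, self.bias = coefs, bias
    def predict_proba(self, features):
        z = self.bias + sum(c * f for c, f in zip(self.coefs, features))
        return 1.0 / (1.0 + math.exp(-z))

def cascade_predict(x, hidden_weights, booster):
    """Full cascade: hidden layers -> feature vector -> boosting model."""
    return booster.predict_proba(hidden_features(x, hidden_weights))
```

The key design point of the claim is that the boosting model consumes the last hidden layer's feature vectors rather than the raw object features, so the network acts as a learned feature extractor for the booster.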
  8. An apparatus for identifying abnormal objects, wherein the apparatus comprises:
    an acquisition module, configured to acquire a plurality of pieces of object data and, for each piece of object data, a corresponding label indicating whether the object is abnormal, the object data comprising a plurality of object feature values;
    an object data division module, configured to divide the plurality of pieces of object data into a training set and a test set according to a predetermined rule, wherein the training set and the test set each contain a plurality of pieces of object data;
    a training module, configured to input the pieces of object data in the training set and their corresponding labels into a plurality of deep neural network models to be trained, and to train the plurality of deep neural network models to be trained to obtain a plurality of deep neural network models, wherein the connection weights between neurons in each deep neural network model to be trained are randomly initialized;
    an input module, configured to input the pieces of object data in the test set into each of the plurality of deep neural network models to obtain, from each deep neural network model, an abnormality probability for each piece of object data in the test set;
    a determination module, configured to determine a target deep neural network model among the plurality of deep neural network models according to the abnormality probabilities output by each deep neural network model for the pieces of object data in the test set;
    a cascade module, configured to cascade the target deep neural network model with an extreme gradient boosting model to obtain a cascade model, and to train the cascade model with the pieces of object data in the training set to obtain a trained cascade model; and
    a prediction module, configured to input object data to be identified into the trained cascade model to predict whether the object corresponding to the object data to be identified is abnormal.
  9. A computer-readable program medium, wherein it stores computer program instructions which, when executed by a computer, cause the computer to perform the following steps:
    acquiring a plurality of pieces of object data and, for each piece of object data, a corresponding label indicating whether the object is abnormal, the object data comprising a plurality of object feature values;
    dividing the plurality of pieces of object data into a training set and a test set according to a predetermined rule, wherein the training set and the test set each contain a plurality of pieces of object data;
    inputting the pieces of object data in the training set and their corresponding labels into a plurality of deep neural network models to be trained, and training the plurality of deep neural network models to be trained to obtain a plurality of deep neural network models, wherein the connection weights between neurons in each deep neural network model to be trained are randomly initialized;
    inputting the pieces of object data in the test set into each of the plurality of deep neural network models to obtain, from each deep neural network model, an abnormality probability for each piece of object data in the test set;
    determining a target deep neural network model among the plurality of deep neural network models according to the abnormality probabilities output by each deep neural network model for the pieces of object data in the test set;
    cascading the target deep neural network model with an extreme gradient boosting model to obtain a cascade model, and training the cascade model with the pieces of object data in the training set to obtain a trained cascade model; and
    inputting object data to be identified into the trained cascade model to predict whether the object corresponding to the object data to be identified is abnormal.
  10. The computer-readable program medium according to claim 9, wherein, when determining a target deep neural network model among the plurality of deep neural network models according to the abnormality probabilities output by each deep neural network model for the pieces of object data in the test set, the computer specifically performs the following steps:
    obtaining, as a first ratio, the ratio of the number of pieces of object data in the test set whose labels indicate an abnormal object to the total number of pieces of object data in the test set;
    for each deep neural network model, sorting the abnormality probabilities output by that deep neural network model for the pieces of object data in the test set in descending order;
    for each deep neural network model, dividing the pieces of object data corresponding to that deep neural network model into a predetermined number of groups according to the sorted order, each piece of object data belonging to one group;
    for each deep neural network model, and for each group of object data corresponding to that deep neural network model, obtaining, as a second ratio, the ratio of the number of pieces of object data in the group whose labels indicate an abnormal object to the total number of pieces of object data in the group; and
    determining the target deep neural network model among the plurality of deep neural network models based on the first ratio and the second ratios obtained for each deep neural network model.
  11. The computer-readable program medium according to claim 10, wherein, when determining the target deep neural network model among the plurality of deep neural network models based on the first ratio and the second ratios obtained for each deep neural network model, the computer specifically performs the following steps:
    for each deep neural network model, obtaining, as a target second ratio, the second ratio corresponding to the first-ranked group of object data for that deep neural network model;
    for each deep neural network model, determining, as a third ratio, the ratio of the target second ratio obtained for that deep neural network model to the first ratio; and
    taking the deep neural network model with the largest third ratio as the target deep neural network model.
  12. The computer-readable program medium according to claim 10, wherein, when determining the target deep neural network model among the plurality of deep neural network models based on the first ratio and the second ratios obtained for each deep neural network model, the computer specifically performs the following steps:
    for each deep neural network model, comparing the abnormality probability output by that deep neural network model for each piece of object data in the test set with a preset abnormality probability threshold, to determine whether that deep neural network model's prediction for each piece of object data in the test set is abnormal;
    calculating the recall and precision of each deep neural network model based on the labels, corresponding to the pieces of object data in the test set, indicating whether the objects are abnormal, and on each deep neural network model's predictions for the pieces of object data in the test set; and
    determining the target deep neural network model among the plurality of deep neural network models according to the recall and precision of each deep neural network model, the first ratio, and the second ratios obtained for each deep neural network model.
  13. The computer-readable program medium according to claim 12, wherein, when determining the target deep neural network model among the plurality of deep neural network models according to the recall and precision of each deep neural network model, the first ratio, and the second ratios obtained for each deep neural network model, the computer specifically performs the following steps:
    calculating a first parameter of each deep neural network model from the recall and precision of that deep neural network model;
    obtaining a second parameter of each deep neural network model from the first ratio and the second ratios obtained for that deep neural network model; and
    determining the target deep neural network model among the plurality of deep neural network models based on the first parameter and the second parameter of each deep neural network model.
  14. An electronic device, wherein the electronic device comprises:
    a processor; and
    a memory storing computer-readable instructions which, when executed by the processor, implement the following steps:
    acquiring a plurality of pieces of object data and, for each piece of object data, a corresponding label indicating whether the object is abnormal, the object data comprising a plurality of object feature values;
    dividing the plurality of pieces of object data into a training set and a test set according to a predetermined rule, wherein the training set and the test set each contain a plurality of pieces of object data;
    inputting the pieces of object data in the training set and their corresponding labels into a plurality of deep neural network models to be trained, and training the plurality of deep neural network models to be trained to obtain a plurality of deep neural network models, wherein the connection weights between neurons in each deep neural network model to be trained are randomly initialized;
    inputting the pieces of object data in the test set into each of the plurality of deep neural network models to obtain, from each deep neural network model, an abnormality probability for each piece of object data in the test set;
    determining a target deep neural network model among the plurality of deep neural network models according to the abnormality probabilities output by each deep neural network model for the pieces of object data in the test set;
    cascading the target deep neural network model with an extreme gradient boosting model to obtain a cascade model, and training the cascade model with the pieces of object data in the training set to obtain a trained cascade model; and
    inputting object data to be identified into the trained cascade model to predict whether the object corresponding to the object data to be identified is abnormal.
  15. The electronic device according to claim 14, wherein, when determining a target deep neural network model among the plurality of deep neural network models according to the abnormality probabilities output by each deep neural network model for the pieces of object data in the test set, the processor specifically performs the following steps:
    obtaining, as a first ratio, the ratio of the number of pieces of object data in the test set whose labels indicate an abnormal object to the total number of pieces of object data in the test set;
    for each deep neural network model, sorting the abnormality probabilities output by that deep neural network model for the pieces of object data in the test set in descending order;
    for each deep neural network model, dividing the pieces of object data corresponding to that deep neural network model into a predetermined number of groups according to the sorted order, each piece of object data belonging to one group;
    for each deep neural network model, and for each group of object data corresponding to that deep neural network model, obtaining, as a second ratio, the ratio of the number of pieces of object data in the group whose labels indicate an abnormal object to the total number of pieces of object data in the group; and
    determining the target deep neural network model among the plurality of deep neural network models based on the first ratio and the second ratios obtained for each deep neural network model.
  16. The electronic device according to claim 15, wherein, when determining the target deep neural network model among the plurality of deep neural network models based on the first ratio and the second ratios obtained for each deep neural network model, the processor specifically performs the following steps:
    for each deep neural network model, obtaining, as a target second ratio, the second ratio corresponding to the first-ranked group of object data for that deep neural network model;
    for each deep neural network model, determining, as a third ratio, the ratio of the target second ratio obtained for that deep neural network model to the first ratio; and
    taking the deep neural network model with the largest third ratio as the target deep neural network model.
  17. The electronic device according to claim 15, wherein, when determining the target deep neural network model among the plurality of deep neural network models based on the first ratio and the second ratios obtained for each deep neural network model, the processor specifically performs the following steps:
    for each deep neural network model, comparing the abnormality probability output by that deep neural network model for each piece of object data in the test set with a preset abnormality probability threshold, to determine whether that deep neural network model's prediction for each piece of object data in the test set is abnormal;
    calculating the recall and precision of each deep neural network model based on the labels, corresponding to the pieces of object data in the test set, indicating whether the objects are abnormal, and on each deep neural network model's predictions for the pieces of object data in the test set; and
    determining the target deep neural network model among the plurality of deep neural network models according to the recall and precision of each deep neural network model, the first ratio, and the second ratios obtained for each deep neural network model.
  18. The electronic device according to claim 17, wherein, when determining the target deep neural network model among the plurality of deep neural network models according to the recall and precision of each deep neural network model, the first ratio, and the second ratios obtained for each deep neural network model, the processor specifically performs the following steps:
    calculating a first parameter of each deep neural network model from the recall and precision of that deep neural network model;
    obtaining a second parameter of each deep neural network model from the first ratio and the second ratios obtained for that deep neural network model; and
    determining the target deep neural network model among the plurality of deep neural network models based on the first parameter and the second parameter of each deep neural network model.
  19. The electronic device according to claim 14, wherein, when determining a target deep neural network model among the plurality of deep neural network models according to the abnormality probabilities output by each deep neural network model for the pieces of object data in the test set, the processor specifically performs the following steps:
    for each deep neural network model, comparing the abnormality probability output by that deep neural network model for each piece of object data in the test set with a preset abnormality probability threshold, to determine whether that deep neural network model's prediction for each piece of object data in the test set is abnormal;
    calculating the recall and precision of each deep neural network model based on the labels, corresponding to the pieces of object data in the test set, indicating whether the objects are abnormal, and on each deep neural network model's predictions for the pieces of object data in the test set; and
    selecting, as the target deep neural network model, the deep neural network model with the highest precision among the deep neural network models whose recall exceeds a preset recall threshold.
  20. The electronic device according to claim 14, wherein the target deep neural network model comprises an output layer and at least one hidden layer, and wherein, when cascading the target deep neural network model with an extreme gradient boosting model to obtain a cascade model and training the cascade model with the pieces of object data in the training set to obtain a trained cascade model, the processor specifically performs the following steps:
    removing the output layer of the target deep neural network model and cascading the last hidden layer of the target deep neural network model with the extreme gradient boosting model, so that the feature vectors output by the last hidden layer of the target deep neural network model can be input into the extreme gradient boosting model, thereby obtaining the cascade model; and
    training the cascade model with the pieces of object data in the training set to obtain the trained cascade model.
PCT/CN2020/092812 2019-10-12 2020-05-28 Abnormal object identification method and apparatus, medium, and electronic device WO2021068513A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910970120.7 2019-10-12
CN201910970120.7A CN110995459B (zh) 2019-10-12 2019-10-12 Abnormal object identification method and apparatus, medium, and electronic device

Publications (1)

Publication Number Publication Date
WO2021068513A1 true WO2021068513A1 (zh) 2021-04-15

Family

ID=70081940

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/092812 WO2021068513A1 (zh) 2019-10-12 2020-05-28 异常对象识别方法、装置、介质及电子设备

Country Status (2)

Country Link
CN (1) CN110995459B (zh)
WO (1) WO2021068513A1 (zh)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110995459B (zh) * 2019-10-12 2021-12-14 平安科技(深圳)有限公司 Abnormal object identification method and apparatus, medium, and electronic device
CN113705764A (zh) * 2020-05-20 2021-11-26 华为技术有限公司 Discriminatory sample generation method and electronic device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080183427A1 (en) * 2007-01-31 2008-07-31 Fisher-Rosemount Systems, Inc. Heat Exchanger Fouling Detection
CN101582813A (zh) * 2009-06-26 2009-11-18 西安电子科技大学 Intrusion detection system and method based on distributed transfer network learning
CN104935600A (zh) * 2015-06-19 2015-09-23 中国电子科技集团公司第五十四研究所 Deep-learning-based intrusion detection method and device for mobile ad hoc networks
CN106357618A (zh) * 2016-08-26 2017-01-25 北京奇虎科技有限公司 Web anomaly detection method and apparatus
CN107682216A (zh) * 2017-09-01 2018-02-09 南京南瑞集团公司 Deep-learning-based network traffic protocol identification method
CN108632279A (zh) * 2018-05-08 2018-10-09 北京理工大学 Multi-layer anomaly detection method based on network traffic
CN109035488A (zh) * 2018-08-07 2018-12-18 哈尔滨工业大学(威海) Aero-engine time-series anomaly detection method based on CNN feature extraction
CN110995459A (zh) * 2019-10-12 2020-04-10 平安科技(深圳)有限公司 Abnormal object identification method and apparatus, medium, and electronic device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9852019B2 (en) * 2013-07-01 2017-12-26 Agent Video Intelligence Ltd. System and method for abnormality detection
CN107123033A (zh) * 2017-05-04 2017-09-01 北京科技大学 Clothing matching method based on deep convolutional neural networks
CN109600345A (zh) * 2017-09-30 2019-04-09 北京国双科技有限公司 Abnormal data traffic detection method and apparatus
CN108304720B (zh) * 2018-02-06 2020-12-11 恒安嘉新(北京)科技股份公司 Machine-learning-based Android malware detection method
US10878569B2 (en) * 2018-03-28 2020-12-29 International Business Machines Corporation Systems and methods for automatic detection of an indication of abnormality in an anatomical image
CN109190828A (zh) * 2018-09-07 2019-01-11 苏州大学 Method, apparatus, and device for determining leaked gas concentration distribution, and readable storage medium
CN110189769B (zh) * 2019-05-23 2021-11-19 复钧智能科技(苏州)有限公司 Abnormal sound detection method based on a combination of multiple convolutional neural network models


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SONG, JIAMING: "Analysis of Network Abnormal Behavior Based Artifical Intelligence", CHINESE MASTER’S THESES FULL-TEXT DATABASE (ELECTRONIC JOURNAL), 15 August 2019 (2019-08-15), pages 1 - 80, XP055802031 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113900865A (zh) * 2021-08-16 2022-01-07 广东电力通信科技有限公司 Intelligent automated testing method and system for power grid equipment, and readable storage medium
CN113900865B (zh) * 2021-08-16 2023-07-11 广东电力通信科技有限公司 Intelligent automated testing method and system for power grid equipment, and readable storage medium
CN114726749A (zh) * 2022-03-02 2022-07-08 阿里巴巴(中国)有限公司 Data anomaly detection model acquisition method, apparatus, device, medium, and product
CN114726749B (zh) * 2022-03-02 2023-10-31 阿里巴巴(中国)有限公司 Data anomaly detection model acquisition method, apparatus, device, and medium
CN116244659A (zh) * 2023-05-06 2023-06-09 杭州云信智策科技有限公司 Data processing method, apparatus, device, and medium for identifying abnormal devices

Also Published As

Publication number Publication date
CN110995459A (zh) 2020-04-10
CN110995459B (zh) 2021-12-14

Similar Documents

Publication Publication Date Title
WO2021068513A1 (zh) Abnormal object identification method and apparatus, medium, and electronic device
US9811438B1 (en) Techniques for processing queries relating to task-completion times or cross-data-structure interactions
US10867244B2 (en) Method and apparatus for machine learning
US20170255790A1 (en) Systems and methods for processing requests for genetic data based on client permission data
CN112148987B (zh) Message pushing method based on target object activity level, and related devices
CN108108743B (zh) Abnormal user identification method and apparatus for identifying abnormal users
CN111612038B (zh) Abnormal user detection method and apparatus, storage medium, and electronic device
CN110929799B (zh) Method for detecting abnormal users, electronic device, and computer-readable medium
CN110909222B (zh) Clustering-based user profile construction method and apparatus, medium, and electronic device
CN111435463A (zh) Data processing method, and related device and system
US20220180209A1 (en) Automatic machine learning system, method, and device
US11593665B2 (en) Systems and methods driven by link-specific numeric information for predicting associations based on predicate types
US10678821B2 (en) Evaluating theses using tree structures
CN112215604A (zh) Method and apparatus for identifying relationship information between transaction parties
CN111966886A (zh) Object recommendation method and apparatus, electronic device, and storage medium
CN111191825A (zh) User default prediction method and apparatus, and electronic device
CN111062431A (zh) Image clustering method and apparatus, electronic device, and storage medium
CN111582645B (zh) Factorization-machine-based app risk assessment method and apparatus, and electronic device
CN112887371B (zh) Edge computing method and apparatus, computer device, and storage medium
CN111582649B (zh) Risk assessment method and apparatus based on one-hot encoding of user apps, and electronic device
WO2020252925A1 (zh) Method and apparatus for optimizing user features within a user feature group, electronic device, and non-volatile computer-readable storage medium
WO2023236588A1 (zh) User classification method and apparatus based on smoothed optimization of customer-group deviation
CN113837843B (zh) Product recommendation method and apparatus, medium, and electronic device
US20220318819A1 (en) Risk clustering and segmentation
CN114330720A (zh) Knowledge graph construction method for cloud computing, device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20873488

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20873488

Country of ref document: EP

Kind code of ref document: A1