CN113689288B - Risk identification method, device, equipment and storage medium based on entity list - Google Patents

Info

Publication number
CN113689288B
Authority
CN
China
Prior art keywords
entity
vector
trained
interaction
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110983252.0A
Other languages
Chinese (zh)
Other versions
CN113689288A (en)
Inventor
张鹏
陈婷
吴三平
庄伟亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN202110983252.0A
Publication of CN113689288A
Application granted
Publication of CN113689288B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03: Credit; Loans; Processing thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Abstract

The application discloses a risk identification method, apparatus, device and storage medium based on an entity list. The risk identification method based on the entity list comprises: acquiring interaction flow data respectively corresponding to a user to be predicted and each entity, wherein the interaction flow data comprises entity information and action information; constructing each interaction feature vector based on each piece of interaction flow data and a preset vector sharing model, wherein the preset vector sharing model is obtained by iterative training optimization based on to-be-trained entity vectors and to-be-trained action feature vectors corresponding to interaction flow data collected in advance; and predicting the user to be predicted through a risk prediction model based on each interaction feature vector to obtain a risk identification result. The application solves the technical problem of low model prediction accuracy.

Description

Risk identification method, device, equipment and storage medium based on entity list
Technical Field
The present application relates to the field of machine learning technologies of financial technologies (Fintech), and in particular, to a risk identification method, apparatus, device and storage medium based on an entity list.
Background
With the continuous development of financial technology, especially Internet-based finance, more and more technologies (such as distributed computing and artificial intelligence) are being applied in the financial field. The financial industry in turn places higher demands on these technologies, for example on the distribution of the corresponding backlog in the financial industry.
With the development of computer technology, federated learning is increasingly used. Currently, when risk prediction is performed according to interaction flow information between a user and entities, the entities must first be classified. The action quantities are then aggregated after crossing the entity category with the action type and time. For example, for the feature "total number of loan applications made to consumer finance companies in the early morning hours", consumer finance is the entity category, applying for a loan and the early morning hours are the action type and time respectively, and the total number of applications is the aggregated action quantity. Model prediction is then performed on such features. However, this classification incurs a high manual labeling cost and cannot be guaranteed to cover all entity topics. Furthermore, some entities may lack the specific information needed for category judgment, for example an entity whose name or description is very short, so that its actual use cannot be determined. As a result, the accuracy of model prediction is low.
Disclosure of Invention
The application mainly aims to provide a risk identification method, device, equipment and storage medium based on an entity list, and aims to solve the technical problem of low accuracy of model prediction in the prior art.
In order to achieve the above object, the present application provides a risk identification method based on an entity list, the risk identification method based on the entity list includes:
acquiring interaction flow data respectively corresponding to a user to be predicted and each entity, wherein the interaction flow data comprises entity information and action information;
constructing each interaction feature vector based on each piece of interaction flow data and a preset vector sharing model, wherein the preset vector sharing model is obtained by iterative training optimization based on to-be-trained entity vectors and to-be-trained action feature vectors corresponding to interaction flow data collected in advance;
and predicting the user to be predicted through a risk prediction model based on each interaction feature vector to obtain a risk identification result.
Optionally, the step of constructing each interaction feature vector based on each piece of interaction flow data and a preset vector sharing model includes:
converting, based on each piece of interaction flow data, each piece of entity information into an entity vector through a preset entity dictionary;
performing a feature preprocessing operation on the action information corresponding to each piece of interaction flow data to obtain a feature vector corresponding to each piece of action information;
and inputting each entity vector and each feature vector into the preset vector sharing model, and outputting each interaction feature vector.
Optionally, the step of predicting the user to be predicted through a risk prediction model based on each interaction feature vector to obtain a risk identification result includes:
longitudinally aggregating the interaction feature vectors to obtain an aggregated interaction vector;
and predicting the user to be predicted through the risk prediction model based on the aggregated interaction vector to obtain the risk identification result.
Optionally, before the step of constructing each interaction feature vector based on each piece of interaction flow data and a preset vector sharing model, wherein the preset vector sharing model is obtained by iterative training optimization based on to-be-trained entity vectors and to-be-trained action feature vectors corresponding to interaction flow data collected in advance, the risk identification method based on the entity list further includes:
obtaining a vector sharing model to be trained;
acquiring to-be-trained interaction flow data between a sample user and each sample entity;
converting, based on each piece of to-be-trained interaction flow data and a preset entity dictionary, the sample entity corresponding to each piece of to-be-trained interaction flow data into a to-be-trained entity vector;
extracting the action features corresponding to each piece of to-be-trained interaction flow data, and performing a feature preprocessing operation on the action features to obtain each to-be-trained action feature vector;
and iteratively training the vector sharing model to be trained based on each to-be-trained entity vector and each to-be-trained action feature vector to obtain the preset vector sharing model, and outputting each to-be-trained interaction vector.
Optionally, the step of extracting the action features corresponding to each piece of to-be-trained interaction flow data and performing a feature preprocessing operation on the action features to obtain each to-be-trained action feature vector includes:
extracting the action features corresponding to each piece of to-be-trained interaction flow data, and binning the action features to obtain each piece of grouping action information;
and encoding each piece of grouping action information to obtain the to-be-trained action feature vector corresponding to each sample entity.
Optionally, after the step of iteratively training the vector sharing model to be trained based on each to-be-trained entity vector and each to-be-trained action feature vector to obtain the preset vector sharing model and output each to-be-trained interaction vector, the risk identification method based on the entity list further includes:
acquiring a risk prediction model to be trained;
longitudinally aggregating the to-be-trained interaction vectors to obtain a training aggregated interaction vector;
and performing iterative training optimization on the risk prediction model to be trained based on the training aggregated interaction vector to obtain the risk prediction model.
Optionally, the step of performing iterative training optimization on the risk prediction model to be trained based on the training aggregated interaction vector to obtain the risk prediction model includes:
inputting the training aggregated interaction vector into the risk prediction model to be trained, and outputting a prediction label corresponding to the sample user;
calculating a model loss based on the difference between the prediction label and the real label corresponding to the sample user;
and performing iterative training optimization on the risk prediction model to be trained based on the gradient calculated from the model loss to obtain the risk prediction model.
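The three training steps above (prediction label, loss, gradient update) can be sketched as follows. This is a minimal illustration only: the description does not fix the form of the risk prediction model or of the loss, so a logistic-regression scorer with binary cross-entropy is assumed here, and the names `train`, `samples` and `labels` as well as the toy data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, dim, lr=0.5, epochs=200):
    """Gradient-descent training of an assumed logistic risk model.

    samples: training aggregated interaction vectors, one per sample user.
    labels:  real labels (1 = risky, 0 = not risky).
    """
    weights, bias, losses = [0.0] * dim, 0.0, []
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b, loss = [0.0] * dim, 0.0, 0.0
        for x, y in zip(samples, labels):
            # Forward pass: prediction label for this sample user.
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            # Model loss: difference between prediction and real label.
            loss -= y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
            # Gradient of the cross-entropy loss w.r.t. weights and bias.
            err = p - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # Update step using the averaged gradient.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
        losses.append(loss / n)
    return weights, bias, losses

# Toy aggregated vectors for four sample users (illustrative values only).
samples = [[0.0, 1.0], [0.2, 0.8], [1.0, 0.1], [0.9, 0.0]]
labels = [0, 0, 1, 1]
weights, bias, losses = train(samples, labels, dim=2)
```

In this sketch the loop stops after a fixed number of epochs; a convergence criterion on the loss would serve equally well as the stopping condition for the iterative training.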
The application also provides a risk identification apparatus based on the entity list, wherein the risk identification apparatus based on the entity list is a virtual apparatus and comprises:
an acquisition module, configured to acquire interaction flow data respectively corresponding to a user to be predicted and each entity, wherein the interaction flow data comprises entity information and action information;
a feature extraction module, configured to construct each interaction feature vector based on each piece of interaction flow data and a preset vector sharing model, wherein the preset vector sharing model is obtained by iterative training optimization based on to-be-trained entity vectors and to-be-trained action feature vectors corresponding to interaction flow data collected in advance;
and a prediction module, configured to predict the user to be predicted through a risk prediction model based on each interaction feature vector to obtain a risk identification result.
The application also provides a risk identification device based on the entity list, wherein the risk identification device based on the entity list is a physical device and comprises: a memory, a processor, and an entity list-based risk identification program stored on the memory, wherein the entity list-based risk identification program, when executed by the processor, implements the steps of the risk identification method based on the entity list described above.
The application also provides a storage medium, which is a readable storage medium, wherein the readable storage medium stores a risk identification program based on an entity list, and the risk identification program based on the entity list realizes the steps of the risk identification method based on the entity list when being executed by a processor.
The application also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the entity list based risk identification method as described above or the steps of the data prediction method as described above.
In the prior art, when model prediction is performed on the interaction flow information between users and entities, the entities must be classified in advance so that risk can be predicted from the entity category and the interaction action information. In contrast, the present application first acquires interaction flow data respectively corresponding to the user to be predicted and each entity, wherein the interaction flow data comprises entity information and action information; then constructs each interaction feature vector based on each piece of interaction flow data and a preset vector sharing model, wherein the preset vector sharing model is obtained by iterative training optimization based on to-be-trained entity vectors and to-be-trained action feature vectors corresponding to interaction flow data collected in advance; and then predicts the user to be predicted through a risk prediction model based on each interaction feature vector to obtain a risk identification result. Prediction is thus performed directly by the model on the entity information and action information in each piece of interaction flow data: the entities do not need to be classified, and each entity itself is used directly as an input feature, so that the model can learn the hidden information of the entities. This overcomes the technical defect of the prior art, in which classification incurs a high manual labeling cost, cannot be guaranteed to cover all entity topics, and fails for entities that lack the specific information needed for category judgment (so that their actual use cannot be determined), all of which lower the accuracy of model prediction. The accuracy of model prediction is thereby improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the application or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flowchart of a first embodiment of the risk identification method based on an entity list according to the present application;
FIG. 2 is a flowchart of a second embodiment of the risk identification method based on an entity list according to the present application;
FIG. 3 is a flowchart of a third embodiment of the risk identification method based on an entity list according to the present application;
FIG. 4 is a schematic diagram of a model structure in the risk identification method based on an entity list according to the present application;
FIG. 5 is a schematic structural diagram of a device in the hardware operating environment involved in the risk identification method based on an entity list in an embodiment of the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
In a first embodiment of the risk identification method based on the entity list according to the present application, referring to fig. 1, the risk identification method based on the entity list includes:
Step S10, acquiring interaction flow data respectively corresponding to a user to be predicted and each entity, wherein the interaction flow data comprises entity information and action information;
In this embodiment, it should be noted that an entity may be a physical entity or a virtual concept, and may be represented by an entity serial number. For example, if a user applies for a loan at a bank, the bank is an entity; if the user clicks on an APP, the APP itself is an entity. The interaction flow data is the interaction information between the user and an entity. It comprises entity information and action information, and the action information comprises the action type, the time, and the action quantity (number of times, number of days, amount of money, etc.), for example the number of times the user clicks on an APP in a month.
Specifically, the interaction flow data respectively corresponding to the user to be predicted and each entity is acquired by collecting the interaction information between the user to be predicted and each entity within a preset observation time period.
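As an illustration of what one piece of interaction flow data might look like, the following sketch models a record with the fields named above (entity serial number, action type, time, action quantity). The class name `InteractionRecord`, the field names and the sample values are assumptions made for this example and do not come from the application.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InteractionRecord:
    """One piece of interaction flow data between a user and an entity."""
    user_id: str
    entity_id: int     # entity information: the entity serial number
    action_type: str   # action information: type of action
    hour_of_day: int   # action information: time of the action
    amount: float      # action information: count, days, money, ...

# Hypothetical records collected within an observation period.
records = [
    InteractionRecord("u1", 17, "click", 9, 12.0),
    InteractionRecord("u1", 42, "loan_application", 2, 1.0),
    InteractionRecord("u2", 17, "click", 20, 3.0),
]

def entities_of(user_id, rows):
    """Serial numbers of all entities a given user interacted with."""
    return sorted({r.entity_id for r in rows if r.user_id == user_id})
```

Note that no entity category appears anywhere in the record: the entity is identified only by its serial number.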
Step S20, constructing each interaction feature vector based on each interaction flow data and a preset vector sharing model, wherein the preset vector sharing model is obtained by performing iterative training optimization based on entity vectors to be trained and action feature vectors to be trained corresponding to the interaction flow data collected in advance;
In this embodiment, each interaction feature vector is constructed based on each piece of interaction flow data and a preset vector sharing model, wherein the preset vector sharing model is obtained by iterative training optimization based on to-be-trained entity vectors and to-be-trained action feature vectors corresponding to interaction flow data collected in advance. Specifically, the entity information corresponding to each piece of interaction flow data is first converted into an entity vector by means of a preset entity dictionary, which stores the entity vector corresponding to each entity serial number. The action information corresponding to the interaction flow data between the user to be predicted and each entity is then extracted, each entity vector is spliced with its corresponding action information, and the spliced result is input into the preset vector sharing model to obtain each interaction feature vector.
The step of constructing each interaction feature vector based on each interaction flow data and a preset vector sharing model comprises the following steps:
Step S21, converting, based on each piece of interaction flow data, each piece of entity information into an entity vector through a preset entity dictionary;
In this embodiment, it should be noted that the preset entity dictionary stores in advance the entity vector corresponding to each entity serial number, and the vector length of the entity vectors may be set manually, which is not limited herein.
Specifically, for the entity serial number corresponding to each piece of entity information, the entity vector corresponding to that entity serial number is looked up in the preset entity dictionary.
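The dictionary lookup of step S21 can be sketched as a mapping from entity serial number to a fixed-length vector. The names `build_entity_dictionary` and `lookup_entity_vectors`, the vector length of 4 and the random initial values are illustrative assumptions; in the method itself the stored vectors would be the ones obtained through training.

```python
import random

EMBED_DIM = 4  # vector length; a manually chosen value, as the text allows

def build_entity_dictionary(entity_ids, dim=EMBED_DIM, seed=0):
    """Assign each entity serial number a vector of length `dim`.

    Random values stand in for trained vectors in this sketch.
    """
    rng = random.Random(seed)
    return {eid: [rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for eid in entity_ids}

def lookup_entity_vectors(entity_dictionary, entity_ids):
    """Convert entity serial numbers into their entity vectors."""
    return [entity_dictionary[eid] for eid in entity_ids]

entity_dictionary = build_entity_dictionary([17, 42, 99])
vectors = lookup_entity_vectors(entity_dictionary, [42, 17])
```

A serial number missing from the dictionary raises `KeyError` in this sketch; how unseen entities should be handled is not stated in the description.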
Step S22, performing a feature preprocessing operation on the action information corresponding to each piece of interaction flow data to obtain a feature vector corresponding to each piece of action information;
In this embodiment, it should be noted that the action information is the interaction information between the user and the entity, for example the number of times the user clicks on a website.
Specifically, the action information corresponding to each piece of interaction flow data is extracted, and binning and encoding are performed on each piece of action information. Binning is a way of segmenting the action information into intervals, and the encoding may be one-hot encoding or another data encoding method. The feature vector corresponding to each piece of action information is thereby obtained. Because the resulting feature vectors are discretized, the stability of the model is enhanced and model overfitting is avoided;
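A minimal sketch of the binning and one-hot encoding described for step S22 follows. The bin edges, the list of action types and the helper names (`bin_index`, `one_hot`, `encode_action`) are assumptions chosen for illustration; the description does not specify how the bins are chosen.

```python
import bisect

def bin_index(value, edges):
    """Index of the bin containing `value`.

    `edges` are ascending upper bounds (inclusive); anything above the
    last edge falls into one extra final bin.
    """
    return bisect.bisect_left(edges, value)

def one_hot(index, size):
    vec = [0.0] * size
    vec[index] = 1.0
    return vec

def encode_action(action_type, amount, action_types, edges):
    """Feature vector: one-hot action type followed by one-hot amount bin."""
    type_part = one_hot(action_types.index(action_type), len(action_types))
    amount_part = one_hot(bin_index(amount, edges), len(edges) + 1)
    return type_part + amount_part

ACTION_TYPES = ["click", "loan_application"]  # hypothetical action types
AMOUNT_EDGES = [1, 5, 20]                     # hypothetical bin edges

feature = encode_action("click", 12, ACTION_TYPES, AMOUNT_EDGES)
# 12 lies in the bin (5, 20], so feature is [1, 0 | 0, 0, 1, 0].
```

Time could be binned in the same way (for example into early morning, daytime and evening) and appended as a further one-hot segment.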
Step S23, inputting each entity vector and each feature vector into the preset vector sharing model, and outputting each interaction feature vector.
In this embodiment, each entity vector and each feature vector are input into the preset vector sharing model, and each interaction feature vector is output. Specifically, each entity vector is spliced with its associated feature vector to obtain each piece of spliced interaction information, and each piece of spliced interaction information is input into the preset vector sharing model to obtain each interaction feature vector.
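The splicing and the pass through the vector sharing model might look like the sketch below. The internal structure of the vector sharing model is not given in this part of the description, so a single dense layer with ReLU, whose weights are shared across all entity/action pairs, is assumed purely for illustration; `make_shared_layer`, `shared_forward` and the dimensions are hypothetical.

```python
import random

def make_shared_layer(in_dim, out_dim, seed=0):
    """One dense layer; the same weights are reused for every interaction."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias

def shared_forward(layer, x):
    """Dense layer followed by ReLU (an assumed architecture)."""
    weights, bias = layer
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def interaction_feature_vectors(layer, entity_vectors, action_vectors):
    """Splice each entity vector with its action feature vector, then
    map the spliced vector through the shared layer."""
    return [shared_forward(layer, e + a)
            for e, a in zip(entity_vectors, action_vectors)]

entity_vectors = [[0.1, -0.2], [0.3, 0.0]]            # from the entity dictionary
action_vectors = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]   # binned and one-hot encoded
layer = make_shared_layer(in_dim=5, out_dim=3)
features = interaction_feature_vectors(layer, entity_vectors, action_vectors)
```

Because the same layer processes every spliced vector, the number of entities a user interacted with can vary without changing the model's parameters.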
Step S30, predicting the user to be predicted through a risk prediction model based on each interaction feature vector to obtain a risk identification result.
In this embodiment, the user to be predicted is predicted through a risk prediction model based on each interaction feature vector to obtain a risk identification result. Specifically, the interaction feature vectors are longitudinally aggregated to obtain an aggregated interaction vector, and the aggregated interaction vector is input into the risk prediction model to perform risk identification on the user to be predicted, thereby obtaining the risk identification result.
The step of predicting the user to be predicted through a risk prediction model based on each interaction feature vector to obtain a risk recognition result comprises the following steps:
Step S31, longitudinally aggregating each interaction feature vector to obtain an aggregated interaction vector;
In this embodiment, it should be noted that the aggregation may be performed by summation, averaging, or a similar operation.
Specifically, the interaction feature vectors are summed to obtain the aggregated interaction vector corresponding to the user to be predicted.
Step S32, predicting the user to be predicted through the risk prediction model based on the aggregated interaction vector to obtain the risk identification result.
Specifically, the aggregated interaction vector is input into the risk prediction model to perform risk prediction on the user to be predicted and obtain the risk identification result, and whether the user to be predicted is a target user is judged according to the risk identification result.
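Longitudinal aggregation, read here as an element-wise reduction across a user's interaction feature vectors, can be sketched as follows. The function name `aggregate` and the sample values are illustrative; summation and averaging are the two options the description names.

```python
def aggregate(vectors, how="sum"):
    """Element-wise aggregation of a user's interaction feature vectors.

    Collapses any number of equal-length vectors into a single vector.
    """
    if not vectors:
        raise ValueError("at least one interaction feature vector is required")
    totals = [sum(column) for column in zip(*vectors)]
    if how == "sum":
        return totals
    if how == "mean":
        return [t / len(vectors) for t in totals]
    raise ValueError(f"unknown aggregation: {how}")

# Three interaction feature vectors for one user (illustrative values).
user_features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
summed = aggregate(user_features)            # [9.0, 12.0]
averaged = aggregate(user_features, "mean")  # [3.0, 4.0]
```

Either way, the result has a fixed length regardless of how many entities the user interacted with, which is what allows it to be fed to a single risk prediction model.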
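As a sketch of step S32, the aggregated interaction vector can be scored and compared with a threshold to decide whether the user is a target user. The model form is not specified in the description; a logistic scorer is assumed here, and the weights, bias, threshold and the name `predict_risk` are hypothetical stand-ins for a trained model.

```python
import math

def predict_risk(weights, bias, aggregated, threshold=0.5):
    """Score an aggregated interaction vector with an assumed logistic model.

    Returns the risk probability and whether the user is flagged as a
    target user at the given threshold.
    """
    z = sum(w * x for w, x in zip(weights, aggregated)) + bias
    probability = 1.0 / (1.0 + math.exp(-z))
    return probability, probability >= threshold

# Hypothetical trained parameters and one aggregated interaction vector.
probability, is_target = predict_risk([0.8, -0.4], -0.2, [1.5, 0.5])
```

The threshold is a deployment choice: lowering it flags more users as target users at the cost of more false positives.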
In the prior art, when model prediction is performed on the interaction flow information between users and entities, the entities must be classified in advance so that risk can be predicted from the entity category and the interaction action information. In contrast, the embodiment of the present application first acquires interaction flow data respectively corresponding to the user to be predicted and each entity, wherein the interaction flow data comprises entity information and action information; then constructs each interaction feature vector based on each piece of interaction flow data and a preset vector sharing model, wherein the preset vector sharing model is obtained by iterative training optimization based on to-be-trained entity vectors and to-be-trained action feature vectors corresponding to interaction flow data collected in advance; and then predicts the user to be predicted through a risk prediction model based on each interaction feature vector to obtain a risk identification result. Prediction is thus performed directly by the model on the entity information and action information in each piece of interaction flow data: the entities do not need to be classified, and each entity itself is used directly as an input feature, so that the model can learn the hidden information of the entities. This overcomes the technical defect of the prior art, in which classification incurs a high manual labeling cost, cannot be guaranteed to cover all entity topics, and fails for entities that lack the specific information needed for category judgment (so that their actual use cannot be determined), all of which lower the accuracy of model prediction. The accuracy of model prediction is thereby improved.
Further, referring to FIG. 2, based on the first embodiment of the present application, in another embodiment of the present application, before the step of constructing each interaction feature vector based on each piece of interaction flow data and a preset vector sharing model, wherein the preset vector sharing model is obtained by iterative training optimization based on to-be-trained entity vectors and to-be-trained action feature vectors corresponding to interaction flow data collected in advance, the risk identification method based on the entity list further includes:
step A10, obtaining a vector sharing model to be trained;
Step A20, acquiring to-be-trained interaction flow data between a sample user and each sample entity;
In this embodiment, it should be noted that interaction flow data is generated whenever a user, for some reason or need, performs an associated action with an entity, for example the total amount the user has borrowed from a finance company or the number of times the user has clicked on a certain APP.
Step A30, based on the interaction flow data to be trained and a preset entity dictionary, converting sample entities corresponding to the interaction flow data to be trained into entity vectors to be trained;
In this embodiment, the sample entity corresponding to each piece of to-be-trained interaction flow data is converted into a to-be-trained entity vector. Specifically, based on each piece of to-be-trained interaction flow data and the preset entity dictionary, the to-be-trained entity vector corresponding to the entity serial number of each sample entity is determined.
Step A40, extracting the action features corresponding to each piece of to-be-trained interaction flow data, and performing a feature preprocessing operation on the action features to obtain each to-be-trained action feature vector;
In this embodiment, it should be noted that an action feature is the associated action performed when the sample user and a sample entity interact, for example the number of times the user clicks on a certain APP.
Specifically, the action features corresponding to the to-be-trained interaction flow data between the sample user and each sample entity are extracted, and binning and one-hot encoding are performed on the action features to obtain the to-be-trained action feature vector corresponding to each sample entity.
The step of extracting the action characteristics corresponding to the interaction flow data to be trained to perform characteristic preprocessing operation on the action characteristics to obtain action characteristic vectors to be trained comprises the following steps:
Step A41, extracting the action features corresponding to each piece of to-be-trained interaction flow data, and binning the action features to obtain each piece of grouping action information;
In this embodiment, it should be noted that binning is a way of grouping or segmenting data.
Specifically, after the action features are extracted from each piece of to-be-trained interaction flow data, they are binned to obtain each piece of grouping action information. The grouping action information is thereby discretized, which avoids model overfitting.
Step A42, encoding each piece of grouping action information to obtain the to-be-trained action feature vector corresponding to each sample entity.
In this embodiment, it should be noted that encoding is a way of vectorizing the grouping action information.
Specifically, one-hot encoding is performed on each piece of grouping action information to obtain the to-be-trained action feature vector corresponding to each sample entity.
Step A50, iteratively training the vector sharing model to be trained based on each to-be-trained entity vector and each to-be-trained action feature vector to obtain the preset vector sharing model, and outputting each to-be-trained interaction vector.
In this embodiment, specifically, the to-be-trained entity vector and the to-be-trained action feature vector corresponding to each sample entity are spliced to obtain each to-be-trained spliced interaction vector. Each to-be-trained spliced interaction vector is used as an input of the vector sharing model to be trained, the vector sharing model to be trained is optimized through iterative training to obtain the preset vector sharing model, and each to-be-trained interaction vector is output.
The embodiment of the application provides a risk identification method based on an entity list. A vector sharing model to be trained is obtained, and to-be-trained interaction flow data between a sample user and each sample entity is acquired. Based on each piece of to-be-trained interaction flow data and a preset entity dictionary, the sample entity corresponding to each piece of to-be-trained interaction flow data is converted into a to-be-trained entity vector. The action features corresponding to each piece of to-be-trained interaction flow data are then extracted, and a feature preprocessing operation is performed on them to obtain the to-be-trained action feature vectors. Finally, the vector sharing model to be trained is iteratively trained based on the to-be-trained entity vectors and the to-be-trained action feature vectors to obtain the preset vector sharing model, and the to-be-trained interaction vectors are output. Because the vector sharing model is trained on the entity vectors and action feature vectors of the sample entities, the entities do not need to be classified according to their descriptions, and the model can learn the effective information hidden in each entity, which improves the reliability of model prediction. This overcomes the technical defect of the prior art, in which entities must be classified manually in advance before model prediction, yet some entities lack the specific information needed for category judgment, so that their actual use cannot be determined and the accuracy of model prediction is low.
Further, referring to fig. 3, in another embodiment of the present application, after the step of performing iterative training on the vector sharing model to be trained based on each to-be-trained entity vector and each to-be-trained action feature vector to obtain the preset vector sharing model and outputting each to-be-trained interaction vector, the risk identification method based on the entity list further includes:
step B10, acquiring a risk prediction model to be trained;
Step B20, longitudinally aggregating each interaction vector to be trained to obtain a training aggregation interaction vector;
In this embodiment, each interaction vector to be trained is longitudinally aggregated to obtain a training aggregated interaction vector, and specifically, each interaction vector to be trained is longitudinally summed to obtain the training aggregated interaction vector.
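A minimal sketch of the longitudinal summation follows. It reads "longitudinal" as an element-wise sum down the stack of interaction vectors, which is an interpretation consistent with the `sum_emb` label in fig. 4 rather than something the text states outright; the function name is hypothetical.

```python
from typing import List, Sequence


def aggregate(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Sum interaction vectors position by position into one aggregated vector."""
    if not vectors:
        raise ValueError("at least one interaction vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all interaction vectors must have the same length")
    # zip(*vectors) yields one column per vector position.
    return [sum(column) for column in zip(*vectors)]
```

Summation yields a fixed-length input for the risk prediction model regardless of how many entities a user has interacted with.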
Step B30, performing iterative training optimization on the risk prediction model to be trained based on the training aggregation interaction vector to obtain the risk prediction model.
In this embodiment, iterative training optimization is performed on the risk prediction model to be trained based on the training aggregate interaction vector to obtain the risk prediction model. Specifically, the training aggregate interaction vector is input into the risk prediction model to be trained and a prediction label is output. The model loss between the prediction label and the real label corresponding to the sample user is then calculated through a preset loss function, and the parameters of the risk prediction model to be trained are updated by means of a back propagation algorithm. The model is optimized in this way until a preset training end condition is reached, yielding the risk prediction model. The preset training end condition includes conditions such as convergence of the loss function and reaching a maximum iteration number threshold.
The step of performing iterative training optimization on the risk prediction model to be trained based on the training aggregate interaction vector to obtain the risk prediction model comprises the following steps:
step B31, inputting the training aggregation interaction vector into the risk prediction model to be trained, and outputting a prediction label corresponding to the sample user;
In this embodiment, the training aggregate interaction vector is input into the risk prediction model to be trained, and a prediction label corresponding to the sample user is output. Specifically, the training aggregate interaction vector is used as the input of the risk prediction model to be trained, so as to obtain the prediction label corresponding to the sample user.
Step B32, calculating model loss based on the difference degree between the prediction label and the real label corresponding to the sample user;
In this embodiment, the model loss is calculated based on the degree of difference between the prediction label and the real label corresponding to the sample user. Specifically, the model loss is calculated from that degree of difference through a preset loss function, so that the risk prediction model to be trained can be iteratively optimized according to the model loss.
Step B33, performing iterative training optimization on the risk prediction model to be trained based on the gradient calculated from the model loss, to obtain the risk prediction model.
In this embodiment, iterative training optimization is performed on the risk prediction model to be trained based on the gradient calculated from the model loss, to obtain the risk prediction model. Specifically, the gradient corresponding to the model is calculated from the model loss, and the risk prediction model to be trained is iteratively trained and optimized based on that gradient. It is then judged whether the optimized risk prediction model to be trained meets a preset training end condition, where the preset training end condition includes conditions such as convergence of the loss function and reaching a maximum iteration number threshold. If so, the risk prediction model is obtained; if not, execution returns to the step of acquiring to-be-trained interaction flow data between the sample user and each sample entity.
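The loop of steps B31 to B33 can be sketched as below. This is a simplified stand-in under stated assumptions: a logistic-regression model replaces the risk prediction DNN, binary cross-entropy is assumed as the "preset loss function", and the learning rate, tolerance, and iteration cap are hypothetical values. The two end conditions (loss convergence and a maximum iteration count) follow the text.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Step B31: output a prediction (risk probability) for one aggregated vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_risk_model(samples: Sequence[Sequence[float]], labels: Sequence[int],
                     lr: float = 0.5, max_iters: int = 1000,
                     tol: float = 1e-6) -> Tuple[List[float], float, float]:
    """Train by gradient descent; return (weights, bias, last computed loss)."""
    n, dim = len(samples), len(samples[0])
    w, b = [0.0] * dim, 0.0
    prev_loss, loss, eps = float("inf"), float("inf"), 1e-12
    for _ in range(max_iters):  # end condition: maximum iteration threshold
        preds = [predict(x, w, b) for x in samples]
        # Step B32: loss from the difference between predicted and real labels.
        loss = -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                    for p, y in zip(preds, labels)) / n
        if abs(prev_loss - loss) < tol:  # end condition: loss has converged
            break
        prev_loss = loss
        # Step B33: gradient of the loss, then a parameter update.
        grad_w = [sum((p - y) * x[j] for p, y, x in zip(preds, labels, samples)) / n
                  for j in range(dim)]
        grad_b = sum(p - y for p, y in zip(preds, labels)) / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b, loss
```

For a multi-layer network the gradient would be obtained by back propagation instead of the closed-form expression used here, but the control flow of predict, compute loss, update, and check the end condition is the same.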
The embodiment of the application provides a risk identification method based on an entity list. A risk prediction model to be trained is obtained, each to-be-trained interaction vector is longitudinally aggregated to obtain a training aggregation interaction vector, and iterative training optimization is performed on the risk prediction model to be trained based on the training aggregation interaction vector to obtain the risk prediction model. In this way the risk prediction model to be trained is iteratively trained on the to-be-trained entity vector and the to-be-trained action feature vector corresponding to each sample entity, so that it can learn the effective information of each sample entity. The trained risk prediction model therefore does not depend on entity descriptions during prediction, which improves the identification capability of the risk model. This overcomes the technical defect of the prior art, in which entities must be classified manually in advance when model prediction is carried out according to interaction flow information between users and entities, yet some entities may lack the specific information needed for category judgment, so that their actual use cannot be determined and the accuracy of model prediction is low.
Referring to fig. 4, fig. 4 is a schematic diagram of the model structure in the risk identification method based on an entity list. In the figure, the action of the user to be predicted is the action information, the entity label denotes the entity, the entity emb dictionary is the preset entity dictionary, the entity emb is the entity vector, the action feature is the feature vector, the interaction emb is the interaction feature vector, sum_emb is the aggregate interaction vector, the sharing DNN is the preset vector sharing model, DNN is the risk prediction model, and Y is the risk identification result.
Referring to fig. 5, fig. 5 is a schematic device structure diagram of a hardware running environment according to an embodiment of the present application.
As shown in fig. 5, the entity list-based risk identification device may include: a processor 1001 (such as a CPU), a memory 1005, and a communication bus 1002. The communication bus 1002 is used to enable connection and communication between the processor 1001 and the memory 1005. The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a disk memory. Optionally, the memory 1005 may also be a storage device separate from the processor 1001 described above.
Optionally, the entity list-based risk identification device may further include a rectangular user interface, a network interface, a camera, an RF (Radio Frequency) circuit, a sensor, an audio circuit, a WiFi module, and so on. The rectangular user interface may include a Display screen (Display), an input sub-module such as a Keyboard (Keyboard), and the optional rectangular user interface may also include a standard wired interface, a wireless interface. The network interface may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface).
It will be appreciated by those skilled in the art that the entity list based risk identification device structure shown in fig. 5 does not constitute a limitation of the entity list based risk identification device, and may include more or fewer components than shown, or may combine certain components, or may be a different arrangement of components.
As shown in fig. 5, an operating system, a network communication module, and a risk identification program based on an entity list may be included in a memory 1005 as one type of computer storage medium. The operating system is a program that manages and controls the entity list-based risk identification device hardware and software resources, supporting the execution of the entity list-based risk identification program and other software and/or programs. The network communication module is used to implement communication between components within the memory 1005 and other hardware and software in the entity list-based risk identification method system.
In the entity list based risk identification device shown in fig. 5, the processor 1001 is configured to execute an entity list based risk identification program stored in the memory 1005, to implement the steps of the entity list based risk identification method described in any one of the above.
The specific implementation manner of the risk identification device based on the entity list is basically the same as that of each embodiment of the risk identification method based on the entity list, and is not repeated here.
The application also provides a risk identification device based on the entity list, which comprises:
an acquisition module, configured to acquire interaction flow data respectively corresponding to a user to be predicted and each entity, wherein the interaction flow data comprises entity information and action information;
The feature extraction module is used for constructing each interaction feature vector based on each interaction flow data and a preset vector sharing model, wherein the preset vector sharing model is obtained by performing iterative training optimization based on to-be-trained entity vectors and to-be-trained action feature vectors corresponding to the interaction flow data collected in advance;
And the prediction module is used for predicting the user to be predicted through a risk prediction model based on each interaction feature vector to obtain a risk identification result.
Optionally, the feature extraction module is further configured to:
based on the interaction flow data, converting the entity information into entity vectors through a preset entity dictionary;
performing a feature preprocessing operation on the action information corresponding to each piece of interaction flow data to obtain a feature vector corresponding to each piece of action information;
and inputting each entity vector and each feature vector into the preset vector sharing model, and outputting each interaction feature vector.
Optionally, the prediction module is further configured to:
longitudinally aggregating each interaction feature vector to obtain an aggregated interaction vector;
And predicting the user to be predicted through the risk prediction model based on the aggregate interaction vector to obtain the risk identification result.
Optionally, the risk identification device is further configured to:
obtaining a vector sharing model to be trained;
acquiring to-be-trained interaction flow data between a sample user and each sample entity;
Based on the interaction flow data to be trained and a preset entity dictionary, converting sample entities corresponding to the interaction flow data to be trained into entity vectors to be trained;
extracting the action features corresponding to each piece of to-be-trained interaction flow data, and performing a feature preprocessing operation on the action features to obtain the to-be-trained action feature vectors;
performing iterative training on the vector sharing model to be trained based on the to-be-trained entity vectors and the to-be-trained action feature vectors to obtain the preset vector sharing model, and outputting the to-be-trained interaction vectors.
Optionally, the risk identification device is further configured to:
extracting the action features corresponding to each piece of to-be-trained interaction flow data, and performing binning on the action features to obtain grouped action information;
encoding each piece of grouped action information to obtain the to-be-trained action feature vector corresponding to each sample entity.
Optionally, the risk identification device is further configured to:
acquiring a risk prediction model to be trained;
Longitudinally aggregating each interaction vector to be trained to obtain a training aggregation interaction vector;
and carrying out iterative training optimization on the risk prediction model to be trained based on the training aggregation interaction vector to obtain the risk prediction model.
Optionally, the risk identification device is further configured to:
inputting the training aggregation interaction vector into the risk prediction model to be trained, and outputting a prediction label corresponding to the sample user;
calculating model loss based on the degree of difference between the prediction label and the real label corresponding to the sample user;
And carrying out iterative training optimization on the risk prediction model to be trained based on the gradient calculated by the model loss, and obtaining the risk prediction model.
The specific implementation manner of the risk identification device based on the entity list is basically the same as the above embodiments of the risk identification method based on the entity list, and will not be described herein.
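To make the division of labor among the acquisition, feature extraction, and prediction modules concrete, the following sketch wires three small Python classes together. Everything here is hypothetical scaffolding: the class and method names, the in-memory store, and the callables standing in for the preset vector sharing model and the risk prediction model are assumptions, and action information is assumed to arrive already preprocessed into a feature vector.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]


@dataclass
class Interaction:
    entity: str      # entity information
    action: Vector   # action information, assumed already turned into a feature vector


class AcquisitionModule:
    """Looks up the interaction flow data recorded for a user."""

    def __init__(self, store: Dict[str, List[Interaction]]):
        self.store = store

    def acquire(self, user_id: str) -> List[Interaction]:
        return self.store.get(user_id, [])


class FeatureExtractionModule:
    """Builds one interaction feature vector per interaction."""

    def __init__(self, entity_dict: Dict[str, Vector],
                 shared_model: Callable[[Vector], Vector]):
        self.entity_dict = entity_dict    # preset entity dictionary
        self.shared_model = shared_model  # stand-in for the preset vector sharing model

    def extract(self, interactions: List[Interaction]) -> List[Vector]:
        # Assumes every entity is present in the dictionary (KeyError otherwise).
        return [self.shared_model(self.entity_dict[i.entity] + i.action)
                for i in interactions]


class PredictionModule:
    """Aggregates interaction feature vectors and scores the user."""

    def __init__(self, risk_model: Callable[[Vector], float]):
        self.risk_model = risk_model      # stand-in for the risk prediction model

    def predict(self, interaction_vectors: List[Vector]) -> float:
        aggregated = [sum(column) for column in zip(*interaction_vectors)]
        return self.risk_model(aggregated)
```

Passing the models in as callables keeps the module boundaries independent of how the models are trained or stored.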
Embodiments of the present application provide a storage medium, which is a readable storage medium, and the readable storage medium stores one or more programs, and the one or more programs are further executable by one or more processors to implement the steps of the entity list-based risk identification method described in any one of the above.
The specific implementation manner of the readable storage medium of the present application is basically the same as the embodiments of the risk identification method based on the entity list, and will not be described herein.
Embodiments of the present application provide a computer program product comprising one or more computer programs, the one or more computer programs being further executable by one or more processors for performing the steps of the entity list based risk identification method as described in any of the above.
The specific implementation manner of the computer program product of the present application is basically the same as the above embodiments of the risk identification method based on the entity list, and will not be described herein.
The foregoing describes only preferred embodiments of the present application and does not limit its patent scope. Any equivalent structure or equivalent process transformation made on the basis of the contents of this specification and its drawings, or any direct or indirect application in other related technical fields, likewise falls within the scope of patent protection of the application.

Claims (9)

1. The risk identification method based on the entity list is characterized by comprising the following steps of:
acquiring interaction flow data respectively corresponding to a user to be predicted and each entity, wherein the interaction flow data comprises entity information and action information;
Constructing each interaction feature vector based on each interaction flow data and a preset vector sharing model, wherein the preset vector sharing model is obtained by performing iterative training optimization based on entity vectors to be trained and action feature vectors to be trained corresponding to the interaction flow data collected in advance;
based on each interaction feature vector, predicting the user to be predicted through a risk prediction model to obtain a risk identification result;
the step of constructing each interaction feature vector based on each interaction flow data and a preset vector sharing model comprises the following steps:
based on the interaction flow data, converting the entity information into entity vectors through a preset entity dictionary;
performing a feature preprocessing operation on the action information corresponding to each piece of interaction flow data to obtain a feature vector corresponding to each piece of action information;
and inputting each entity vector and each feature vector into the preset vector sharing model, and outputting each interaction feature vector.
2. The risk identification method based on an entity list according to claim 1, wherein the step of predicting the user to be predicted by a risk prediction model based on each of the interaction feature vectors to obtain a risk identification result comprises:
longitudinally aggregating each interaction feature vector to obtain an aggregated interaction vector;
And predicting the user to be predicted through the risk prediction model based on the aggregate interaction vector to obtain the risk identification result.
3. The risk identification method based on an entity list according to claim 1, wherein before the step of constructing each interaction feature vector based on each interaction flow data and a preset vector sharing model, the preset vector sharing model being obtained by performing iterative training optimization based on to-be-trained entity vectors and to-be-trained action feature vectors corresponding to interaction flow data collected in advance, the risk identification method based on the entity list further comprises:
obtaining a vector sharing model to be trained;
acquiring to-be-trained interaction flow data between a sample user and each sample entity;
Based on the interaction flow data to be trained and a preset entity dictionary, converting sample entities corresponding to the interaction flow data to be trained into entity vectors to be trained;
extracting the action features corresponding to each piece of to-be-trained interaction flow data, and performing a feature preprocessing operation on the action features to obtain the to-be-trained action feature vectors;
performing iterative training on the vector sharing model to be trained based on the to-be-trained entity vectors and the to-be-trained action feature vectors to obtain the preset vector sharing model, and outputting the to-be-trained interaction vectors.
4. The risk identification method based on an entity list according to claim 3, wherein the step of extracting the action features corresponding to each piece of to-be-trained interaction flow data and performing a feature preprocessing operation on the action features to obtain the to-be-trained action feature vectors comprises:
extracting the action features corresponding to each piece of to-be-trained interaction flow data, and performing binning on the action features to obtain grouped action information;
encoding each piece of grouped action information to obtain the to-be-trained action feature vector corresponding to each sample entity.
5. The risk identification method based on an entity list according to claim 3, wherein after the step of iteratively training the vector sharing model to be trained based on each to-be-trained entity vector and each to-be-trained action feature vector to obtain the preset vector sharing model and outputting each to-be-trained interaction vector, the risk identification method based on an entity list further comprises:
acquiring a risk prediction model to be trained;
Longitudinally aggregating each interaction vector to be trained to obtain a training aggregation interaction vector;
and carrying out iterative training optimization on the risk prediction model to be trained based on the training aggregation interaction vector to obtain the risk prediction model.
6. The risk identification method based on entity list of claim 5, wherein the step of performing iterative training optimization on the risk prediction model to be trained based on the training aggregate interaction vector to obtain the risk prediction model comprises:
inputting the training aggregation interaction vector into the risk prediction model to be trained, and outputting a prediction label corresponding to the sample user;
Calculating model loss based on the difference between the prediction label and the real label corresponding to the sample user;
And carrying out iterative training optimization on the risk prediction model to be trained based on the gradient calculated by the model loss, and obtaining the risk prediction model.
7. A risk identification device based on an entity list, characterized in that the risk identification device based on the entity list comprises:
an acquisition module, configured to acquire interaction flow data respectively corresponding to a user to be predicted and each entity, wherein the interaction flow data comprises entity information and action information;
a feature extraction module, configured to construct each interaction feature vector based on each interaction flow data and a preset vector sharing model, wherein the preset vector sharing model is obtained by performing iterative training optimization based on to-be-trained entity vectors and to-be-trained action feature vectors corresponding to interaction flow data collected in advance, and the feature extraction module is further configured to: convert each piece of entity information into an entity vector through a preset entity dictionary based on each interaction flow data; perform a feature preprocessing operation on the action information corresponding to each piece of interaction flow data to obtain a feature vector corresponding to each piece of action information; and input each entity vector and each feature vector into the preset vector sharing model, and output each interaction feature vector;
And the prediction module is used for predicting the user to be predicted through a risk prediction model based on each interaction feature vector to obtain a risk identification result.
8. A risk identification device based on an entity list, the risk identification device based on the entity list comprising: a memory, a processor and a risk identification program based on an entity list stored on the memory,
The entity list based risk identification program is executed by the processor to implement the steps of the entity list based risk identification method according to any of claims 1 to 6.
9. A storage medium, the medium being a readable storage medium, wherein the readable storage medium has stored thereon an entity list based risk identification program, the entity list based risk identification program being executed by a processor to implement the steps of the entity list based risk identification method of any of claims 1 to 6.
CN202110983252.0A 2021-08-25 2021-08-25 Risk identification method, device, equipment and storage medium based on entity list Active CN113689288B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110983252.0A CN113689288B (en) 2021-08-25 2021-08-25 Risk identification method, device, equipment and storage medium based on entity list

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110983252.0A CN113689288B (en) 2021-08-25 2021-08-25 Risk identification method, device, equipment and storage medium based on entity list

Publications (2)

Publication Number Publication Date
CN113689288A CN113689288A (en) 2021-11-23
CN113689288B true CN113689288B (en) 2024-05-14

Family

ID=78582666

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110983252.0A Active CN113689288B (en) 2021-08-25 2021-08-25 Risk identification method, device, equipment and storage medium based on entity list

Country Status (1)

Country Link
CN (1) CN113689288B (en)

Also Published As

Publication number Publication date
CN113689288A (en) 2021-11-23


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant