CN109345260B - Method for detecting abnormal operation behavior - Google Patents


Publication number
CN109345260B
Authority
CN
China
Prior art keywords
behavior, user, detected, conversion, feature vector
Prior art date
Legal status
Active
Application number
CN201811180634.4A
Other languages
Chinese (zh)
Other versions
CN109345260A (en)
Inventor
郭豪
孙善萍
王文刚
蔡准
孙悦
郭晓鹏
Current Assignee
Beijing Trusfort Technology Co ltd
Original Assignee
Beijing Trusfort Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Trusfort Technology Co ltd
Priority to CN201811180634.4A
Publication of CN109345260A
Application granted
Publication of CN109345260B
Legal status: Active
Anticipated expiration


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00Payment architectures, schemes or protocols
    • G06Q20/38Payment protocols; Details thereof
    • G06Q20/40Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401Transaction verification
    • G06Q20/4016Transaction verification involving fraud or risk level assessment in transaction processing

Abstract

The application provides a method and an apparatus for detecting abnormal operation behavior. The method comprises: acquiring, for a plurality of sample users, target historical operation behavior information at the sampling time of a target historical operation behavior and within a preset historical time period before that sampling time, together with label information indicating whether each sample user exhibited abnormal behavior at the sampling time; generating, from the acquired target historical operation behavior information, a behavior feature vector sequence and a time interval sequence under each operation type; and training an anomaly detection model based on the behavior feature vector sequences, the time interval sequences and the label information of the sample users. With the method and apparatus, the current target operation behavior that occurs when a user uses electronic banking can be analyzed in light of both the target historical operation behavior and the current target operation behavior, improving the accuracy of determining whether the user's current target operation behavior is abnormal.

Description

Method for detecting abnormal operation behavior
Technical Field
The application relates to the technical field of computer information, in particular to a method for detecting abnormal operation behaviors.
Background
Electronic banking refers to the communication channels that banks open to the public. With the rapid development of the internet and the popularization of intelligent terminals, electronic banking has become a common way of handling banking business in electronic commerce, providing off-counter services such as online transfers, account balance inquiries and online financing, which offer great convenience to users. Alongside this convenience, however, electronic banking carries many security risks, such as abnormal operation behaviors that harm users' interests, for example the theft of account information or the unauthorized transfer of funds.
To strengthen the security of electronic banking, the prior art determines whether a user's current behavior is abnormal by comparing the currently acquired user behavior features with pre-stored abnormal behavior features.
This way of identifying abnormal behavior is one-sided and has poor identification accuracy.
Disclosure of Invention
In view of this, an object of the embodiments of the present application is to provide a method and an apparatus for detecting abnormal operation behavior that analyze the current target operation behavior occurring when a user uses electronic banking in light of both the target historical operation behavior and the target operation behavior at the current time, thereby improving the accuracy of determining whether the user's current target operation behavior is abnormal.
In a first aspect, an embodiment of the present application provides an anomaly detection model training method, where the method includes:
acquiring target historical operation behavior information of a plurality of sample users at the sampling time at which a target historical operation behavior occurs and within a preset historical time period before the sampling time, and label information indicating whether each sample user exhibited abnormal behavior at the sampling time;
for each sample user, generating a behavior feature vector sequence of the sample user under each operation type and a time interval sequence of target historical operation behaviors according to the acquired target historical operation behavior information; the behavior feature vector sequence comprises a plurality of first behavior feature vectors, and different first behavior feature vectors are feature vectors corresponding to target historical operation behaviors occurring at different moments;
performing anomaly detection model training based on the behavior characteristic vector sequence of the sample user under each operation type, the time interval sequence and the label information to obtain an anomaly detection model; the anomaly detection model is used for detecting the anomaly risk probability of the target operation behavior of the user to be detected.
With reference to the first aspect, an embodiment of the present application provides a first possible implementation manner of the first aspect, where: the training of the anomaly detection model based on the behavior feature vector sequence, the time interval sequence and the label information of the sample user under each operation type comprises the following steps:
for each sample user, inputting the behavior characteristic vector sequence of the sample user under each operation type and the time interval sequence into a recurrent neural network to obtain a second behavior characteristic vector corresponding to the operation type;
concatenating the second behavior feature vectors corresponding to the respective operation types, inputting the concatenated vector into a classification neural network, and obtaining the abnormal risk probability of the sample user at the sampling time;
and training the recurrent neural network and the classified neural network based on the abnormal risk probability and the label information to obtain the abnormal detection model.
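The three steps above can be sketched as follows in Python with NumPy. This is a minimal illustration, not the patented implementation: the per-operation-type GRU encoder is stood in for by an interval-weighted average, and the names (`encode_type`, `predict_risk`) and all dimensions are assumptions introduced here.

```python
import numpy as np

def encode_type(seq, intervals):
    """Placeholder for the per-operation-type recurrent encoder: reduces a
    (T, d) sequence of first behavior feature vectors and its T time
    intervals to a single 'second behavior feature vector'. The real model
    would use a GRU over the (vector, interval) pairs."""
    w = np.exp(-np.asarray(intervals, dtype=float))  # illustrative weighting
    w = w / w.sum()
    return (w[:, None] * seq).sum(axis=0)

def predict_risk(seqs_by_type, intervals_by_type, W, b):
    """Concatenate the per-type vectors and apply a logistic classification
    layer to obtain the abnormal-risk probability."""
    parts = [encode_type(s, t) for s, t in zip(seqs_by_type, intervals_by_type)]
    z = np.concatenate(parts)            # concatenating the second vectors
    logit = z @ W + b
    return 1.0 / (1.0 + np.exp(-logit))  # probability in (0, 1)

# Three operation types, each with a (T=5, d=4) feature sequence.
rng = np.random.default_rng(0)
seqs = [rng.normal(size=(5, 4)) for _ in range(3)]
ivals = [rng.uniform(0.1, 2.0, size=5) for _ in range(3)]
W = rng.normal(size=12)                  # 3 types x 4 dims after concatenation
p = predict_risk(seqs, ivals, W, 0.0)
print(float(p))
```

In the actual model, `encode_type` would be the recurrent network described in the second possible implementation below, and `W` and `b` would be trained jointly with it against the label information.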
With reference to the first aspect, an embodiment of the present application provides a second possible implementation manner of the first aspect, where: the recurrent neural network includes: a conversion layer and an encoding layer;
the generating of the second behavior feature vector corresponding to each operation type includes:
inputting the behavior feature vector sequence corresponding to each operation type into the conversion layer, and performing standardized conversion on each first behavior feature vector in the sequence through the conversion layer to obtain a behavior conversion vector for each occurrence of an operation behavior by the sample user;
and, for the time interval sequence under each operation type, inputting each time interval in the sequence together with the corresponding behavior conversion vector into the encoding layer for encoding, generating the second behavior feature vector corresponding to that operation type.
With reference to the first aspect, an embodiment of the present application provides a third possible implementation manner of the first aspect, where: the conversion layer comprises conversion matrixes corresponding to various operation types;
the performing standardized conversion on each first behavior feature vector in the behavior feature vector sequence to generate a behavior conversion vector for each occurrence of an operation behavior by the sample user under each operation type comprises:
under each operation type, multiplying each first behavior feature vector by the conversion matrix for that operation type to generate a behavior conversion vector for each occurrence of an operation behavior by the sample user;
when the recurrent neural network is trained, its parameters are adjusted, and each element value of the conversion matrix in the conversion layer is adjusted along with them.
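As an illustration of the conversion layer just described, the sketch below keeps one conversion matrix per operation type and multiplies each first behavior feature vector in a sequence by it. The operation-type names, matrix shapes and identity-style values are assumptions for demonstration; in training, the elements of each matrix would be learned parameters rather than fixed.

```python
import numpy as np

# One conversion matrix per operation type (shapes are illustrative).
# Multiplying each first behavior feature vector by its type's matrix maps
# features of different raw dimensionalities into a common standardized
# space before the encoding layer.
conversion = {
    "register": np.eye(6, 4),   # 4-dim raw features -> 6-dim converted
    "login":    np.eye(6, 3),   # 3-dim raw features -> 6-dim converted
    "transfer": np.eye(6, 8),   # 8-dim raw features -> 6-dim converted
}

def convert_sequence(op_type, feature_seq):
    """Apply the operation type's conversion matrix to every first behavior
    feature vector in the sequence, yielding behavior conversion vectors."""
    M = conversion[op_type]
    return [M @ v for v in feature_seq]

raw = [np.arange(8, dtype=float) for _ in range(3)]  # 3 transfer behaviors
converted = convert_sequence("transfer", raw)
print(len(converted), converted[0].shape)
```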
In a second aspect, an embodiment of the present application further provides an abnormal operation behavior detection method, including:
after a target operation behavior of a user to be tested is detected, acquiring target operation behavior information of the user to be tested within the most recent preset time period;
generating, according to the target operation behavior information, a to-be-tested behavior feature vector sequence and a to-be-tested time interval sequence for the user to be tested under each operation type; the to-be-tested behavior feature vector sequence comprises a plurality of to-be-tested behavior feature vectors, different to-be-tested behavior feature vectors being the feature vectors corresponding to target operation behaviors occurring at different times;
inputting the behavior feature vector sequence to be detected and the time interval sequence to be detected into an anomaly detection model obtained by the anomaly detection model training method according to any one of the possible implementation manners of the first aspect and the first aspect, and obtaining the anomaly risk probability of the target operation behavior of the user to be detected;
the anomaly detection model includes: a recurrent neural network and a categorical neural network.
In combination with the second aspect, the present embodiments provide a first possible implementation manner of the second aspect, where: the acquiring of the abnormal risk probability of the target operation behavior of the user to be tested comprises:
inputting the behavior feature vector sequence to be tested corresponding to each operation type and the time interval sequence to be tested into a recurrent neural network to obtain a third behavior feature vector under each operation type;
and concatenating the third behavior feature vectors corresponding to the operation types, and inputting the concatenated vector into the classification neural network to obtain the abnormal risk probability of the target operation behavior of the user to be tested.
In combination with the first possible implementation manner of the second aspect, the present embodiments provide a second possible implementation manner of the second aspect, where: the recurrent neural network includes: a conversion layer and an encoding layer;
the obtaining of the third behavior feature vector under each operation type includes:
inputting the to-be-tested behavior feature vector sequence corresponding to each operation type into the conversion layer, and performing standardized conversion on each to-be-tested behavior feature vector in the sequence through the conversion layer to obtain a to-be-tested behavior conversion vector for each occurrence of an operation behavior by the user to be tested;
and inputting each time interval in the to-be-tested time interval sequence together with the corresponding to-be-tested behavior conversion vector into the encoding layer for encoding, generating the third behavior feature vector corresponding to each operation type.
In combination with the second possible implementation manner of the second aspect, the present application provides a third possible implementation manner of the second aspect, where: the conversion layer comprises a conversion matrix corresponding to each operation type;
the performing standardized conversion on each to-be-tested behavior feature vector in the to-be-tested behavior feature vector sequence to generate a to-be-tested behavior conversion vector for each occurrence of an operation behavior by the user to be tested under each operation type comprises:
under each operation type, multiplying each to-be-tested behavior feature vector by the conversion matrix for that operation type to generate a to-be-tested behavior conversion vector for each occurrence of an operation behavior by the user to be tested.
In combination with the second aspect, the present application provides a fourth possible implementation manner of the second aspect, where, after the abnormal risk probability is obtained, the method further comprises:
detecting whether the risk probability reaches a preset risk threshold corresponding to the target operation behavior of the user to be tested;
when the risk probability reaches the preset risk threshold corresponding to the operation behavior of the user to be tested at the current time, intercepting that operation behavior;
and when the risk probability does not reach the preset risk threshold corresponding to the operation behavior of the user to be tested at the current time, allowing that operation behavior to be executed.
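The threshold logic of this fourth implementation can be summarized in a few lines; `handle_operation` and the example threshold values are illustrative names and numbers, not from the patent.

```python
def handle_operation(risk_probability, risk_threshold):
    """Intercept (block) the current operation when the model's abnormal-risk
    probability reaches the preset threshold for that operation behavior;
    otherwise allow it to execute."""
    if risk_probability >= risk_threshold:
        return "intercept"
    return "allow"

# e.g. a high-risk transfer is blocked, a low-risk one is allowed
assert handle_operation(0.93, 0.80) == "intercept"
assert handle_operation(0.42, 0.80) == "allow"
```

Each operation type can carry its own preset threshold, so that, say, transfers are held to a stricter standard than balance inquiries.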
In a third aspect, an embodiment of the present application further provides an electronic device, including: a processor, a memory and a bus, the memory storing processor-executable machine-readable instructions, the processor and the memory communicating via the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the steps of any one of the possible embodiments of the first aspect and any one of the possible embodiments of the second aspect.
In a fourth aspect, the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and the computer program is executed by a processor to perform the steps in any one of the possible implementation manners of the first aspect and any one of the possible implementation manners of the second aspect.
In the embodiments of the present application, a behavior feature vector sequence and a time interval sequence are generated for each operation type from the target historical operation behavior information of each sample user. When the anomaly detection model is trained, both the behavior feature vector sequence and the time interval sequence under each operation type are required; the behavior feature vector sequence comprises a plurality of first behavior feature vectors, each being the feature vector corresponding to a target historical operation behavior occurring at a different time. The model is trained on these sequences together with the label information indicating whether each sample user exhibited abnormal behavior at the sampling time, so that it can learn the association between successive target historical operation behaviors, their time intervals, and the current target operation behavior. The trained anomaly detection model can therefore detect the abnormal risk probability of the target operation behavior of a user to be tested.
Further, when the anomaly detection model performs anomaly detection on the current target operation behavior of a user to be tested, it draws on the user's target operation behavior information over the most recent preset time period. Using the association between that recent behavior and the current target operation behavior, it can judge more accurately whether the current behavior is abnormal, giving higher accuracy than the prior-art approach of detecting anomalies from the current operation behavior alone.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting its scope; those skilled in the art can derive other related drawings from them without inventive effort.
FIG. 1 is a flow chart illustrating a method for training an anomaly detection model according to an embodiment of the present application;
FIG. 2 is a flow chart illustrating a particular method of anomaly detection model training provided by an embodiment of the present application;
fig. 3 is a flowchart illustrating a specific method for obtaining a second behavior feature vector corresponding to the operation type according to an embodiment of the present application;
FIG. 4 is a flow chart illustrating a method for detecting abnormal operation behavior according to an embodiment of the present application;
fig. 5 is a flowchart illustrating a specific method for obtaining a risk probability of an operation behavior of a user to be tested according to an embodiment of the present application;
fig. 6 shows a structure diagram of an anti-anomaly network provided in an embodiment of the present application;
FIG. 7 is a schematic diagram illustrating a GRU operating state change according to an embodiment of the present application;
FIG. 8 is a flow chart illustrating another method for detecting abnormal operating behavior provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram illustrating an anomaly detection model training apparatus according to an embodiment of the present application;
fig. 10 is a schematic structural diagram illustrating an apparatus for detecting abnormal operation behavior according to an embodiment of the present application;
FIG. 11 is a schematic diagram illustrating an anti-anomaly system according to an embodiment of the present application;
fig. 12 is a schematic diagram illustrating an internal functional entity structure of an anti-anomaly system according to an embodiment of the present application;
fig. 13 shows a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. The components of the embodiments, as generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments, presented in the accompanying drawings, is not intended to limit the scope of the claimed application but is merely representative of selected embodiments. All other embodiments derived by a person skilled in the art from the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
At present, the popularization of electronic banking brings convenience to users but also raises account security problems. To strengthen the security of electronic banking, the prior art models banking business with a traditional machine learning model: a large amount of behavior operation information is collected as a training set, the model is trained on it, and whether a user's current behavior is abnormal is judged from the currently acquired behavior features and the model, thereby protecting the user's account.
Current machine learning models typically learn only the features of a user's single operation behavior. However, operation behaviors differ greatly between users, the same user's operation behaviors may differ across time periods, and any single behavior involves considerable chance. For example, if a user genuinely needs to make a large transfer in a certain transaction, it is unreasonable to judge that transaction as abnormal. Detecting whether the user's behavior is abnormal in the current business scenario based only on the current behavior operation information therefore has low accuracy. Moreover, a user's successive operation behaviors, and the time interval between each operation behavior and the previous one, bear a certain association with whether the current operation behavior is abnormal; current machine learning models cannot extract this association from continuous operation behaviors and their time intervals, so detection based on a single operation behavior has low accuracy.
Based on this, the method and apparatus for detecting abnormal operation behavior provided by the embodiments of the present application analyze the current target operation behavior occurring when a user uses electronic banking in light of both the target historical operation behavior and the target operation behavior at the current time, improving the accuracy of judging whether the user's current target operation behavior is abnormal.
To facilitate understanding, the method for training the abnormal behavior detection model disclosed in the embodiments of the present application is first described in detail. The abnormal behavior detection model obtained by this training method can combine a user's historical operation behavior information over a certain period of electronic banking use with the current behavior operation information to determine more accurately whether the current operation behavior is abnormal. In the present application, the trained abnormal behavior detection model comprises two parts: a recurrent neural network and a classification neural network. The recurrent neural network used in the present application is a Gated Recurrent Unit (GRU). For each operation type, the GRU extracts the association between all target historical operation behaviors within a certain period of electronic banking use and whether the current target operation behavior is abnormal, yielding a behavior feature vector for the user under that operation type; the classification neural network then obtains the abnormal risk probability of the user's current behavior from these behavior feature vectors.
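For reference, a single GRU step updates a hidden state using an update gate and a reset gate. The sketch below is a generic, minimal GRU cell in NumPy, shown only to make the recurrent encoding concrete; the dimensions and random weights are illustrative, and the patent does not specify this particular configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, p):
    """One GRU update: the gates decide how much of the previous hidden
    state h to keep versus overwrite with new candidate content."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h)              # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h)              # reset gate
    h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h))  # candidate state
    return (1 - z) * h + z * h_tilde

d_in, d_h = 4, 8                          # illustrative sizes
rng = np.random.default_rng(1)
p = {k: rng.normal(scale=0.1,
                   size=(d_h, d_in if k.startswith("W") else d_h))
     for k in ["Wz", "Uz", "Wr", "Ur", "Wh", "Uh"]}

h = np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):      # encode a 5-step behavior sequence
    h = gru_step(x, h, p)
print(h.shape)                            # final h summarizes the sequence
```

The final hidden state plays the role of the per-operation-type behavior feature vector that is handed to the classification neural network.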
Referring to fig. 1, the method for training an anomaly detection model provided in the embodiment of the present application includes:
s101: the method comprises the steps of obtaining target historical operation behavior information of a plurality of sample users at the sampling time when the target historical operation behavior occurs and in a preset historical time period before the sampling time, and label information indicating whether each sample user has abnormal behavior at the sampling time.
In a specific implementation, the target operation behavior refers to a business handling behavior the user performs in electronic banking, such as transfer, payment, financing or credit card repayment. The target historical operation behavior information describes the operation behaviors through which a sample user handled business via electronic banking in the past, including basic operation behaviors (for example registering account information and logging into the electronic bank) and business operation behaviors (for example transfers, fee payments, financing and credit card repayments). The sampling time and the preset historical time period are temporally continuous. Each sample user performs some operation behavior at the sampling time; if abnormal behavior occurred with that operation behavior, the corresponding sample user is taken as a positive sample user, and if not, as a negative sample user.
The label information indicating whether each sample user has abnormal behavior at the sampling time may be: if the sample user has abnormal behavior, the label information is 1; if the sample user does not have abnormal behavior, the label information is 0. When the anomaly detection model is trained, if the prediction result output by the model for a certain sample user tends to be 1, the probability that the sample user has an abnormal behavior at the sampling moment is considered to be higher; if the prediction result output by the model for a certain sample user tends to be 0, the probability that the sample user has abnormal behaviors at the sampling moment is considered to be lower; and then, training the model based on the prediction result output by the model and the corresponding label information.
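The patent does not name a loss function, but training a model whose output should tend toward 1 for positive samples and toward 0 for negative samples is commonly done with binary cross-entropy; the sketch below shows that standard choice, which is an assumption here rather than part of the patent.

```python
import math

def bce_loss(prediction, label):
    """Binary cross-entropy between the model's predicted abnormal-risk
    probability and the 0/1 label: the loss is small when a positive
    sample's prediction tends toward 1, or a negative sample's toward 0."""
    eps = 1e-12  # guard against log(0)
    return -(label * math.log(prediction + eps)
             + (1 - label) * math.log(1 - prediction + eps))

# A positive sample (label 1) predicted near 1 costs little...
assert bce_loss(0.95, 1) < bce_loss(0.60, 1)
# ...while a negative sample (label 0) predicted near 1 costs much.
assert bce_loss(0.95, 0) > bce_loss(0.05, 0)
```

Minimizing this loss over all sample users drives the model's predictions toward the label information described above.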
S102: for each sample user, generating a behavior feature vector sequence of the sample user under each operation type and a time interval sequence of target historical operation behaviors according to the acquired target historical operation behavior information; the behavior feature vector sequence comprises a plurality of first behavior feature vectors, and different first behavior feature vectors are feature vectors corresponding to target historical operation behaviors occurring at different moments.
In the specific implementation, the target historical operation behavior information of the sample user comprises operation behavior characteristics of the sample user in different operation types within a preset historical time period, each operation behavior characteristic of the sample user corresponds to a corresponding characteristic category, and a characteristic value of the operation behavior characteristic is determined based on the characteristic category of the operation behavior characteristic of the sample user.
Here, the feature categories include binary-category features and numerical features. A binary-category feature is an operation behavior feature that is not naturally numeric but has only two classes, for example whether the user transfers money at a sensitive time: if so, it may be represented by the number "1", and if not, by "0". A numerical feature is one that can be represented directly by a number; for example, if the user transferred money to a certain person 50 times within a certain period, the feature "the user transferred to a certain person 50 times within a certain period" may be represented directly by 50.
Specifically, an embodiment of the present application further provides a specific method for generating a behavior feature vector sequence and a time interval sequence of the sample user under each operation type, where the method includes:
and for each sample user, acquiring the time of each operation of the sample user in a preset historical time period and the characteristic values under a plurality of preset operation characteristics corresponding to each operation type respectively according to the target historical operation behavior information.
According to the time of each operation behavior of the sample user in the preset historical time period, calculating the sampling time and the time interval before the sampling time, which is the time interval between each operation behavior of the sample user in each operation type and the last operation behavior, wherein each time interval corresponds to each target operation behavior of the sample user, and each time interval can be the same or different.
And generating a time interval sequence of the sample user under each operation type according to the time interval corresponding to each occurrence of the operation behavior of the sample user under each operation type.
Generating, for each occurrence of an operation behavior by the sample user within the preset historical time period and at the sampling time, the first behavior feature vector corresponding to each operation type, according to the feature values under the plurality of preset operation behavior features corresponding to that operation type.
Generating, for each operation type, the behavior feature vector sequence of the sample user from the first behavior feature vectors corresponding to that operation type for each occurrence of an operation behavior.
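The interval computation in the steps above can be illustrated as follows. Treating the first operation's interval as 0 is an assumption made for this sketch, since the patent does not state how the first behavior's missing predecessor is handled; the unit (hours) is likewise illustrative.

```python
from datetime import datetime

def interval_sequence(timestamps):
    """Given a sample user's operation times under one operation type
    (in any order), return the interval, in hours, between each operation
    behavior and the previous one. The earliest operation has no
    predecessor, so its interval is taken as 0 (an assumption)."""
    ts = sorted(timestamps)
    out = [0.0]
    for prev, cur in zip(ts, ts[1:]):
        out.append((cur - prev).total_seconds() / 3600.0)
    return out

# Three hypothetical transfer operations by one sample user.
transfers = [datetime(2018, 2, 1, 9), datetime(2018, 2, 1, 12),
             datetime(2018, 2, 3, 12)]
print(interval_sequence(transfers))  # → [0.0, 3.0, 48.0]
```

Each element of the resulting sequence lines up with one target operation behavior, mirroring the one-to-one correspondence described above.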
Each operation type corresponds to a plurality of preset operation behavior characteristics, for example, in the operation type of payment, when a sample user performs an operation behavior at a certain moment within a preset historical time period, the corresponding behavior characteristics at the moment and the behavior characteristics at the historical moment are acquired, for example: the payment times within 1 day, the payment times within 7 days, the payment amount of the same user within 1 day, the payment amount of the same user within 7 days and the like.
The category of each operation behavior feature of the sample user is determined from each operation behavior. For binary category features, one-hot encoding is used, so that the feature value corresponding to the operation behavior feature is only 0 or 1. For example: if the operation behavior feature is "whether the device was tampered with during registration", the feature value is "1" if so and "0" if not. For numerical features, the numerical value corresponding to the operation behavior feature is used directly as the feature value. For example: if the operation behavior feature is "the number of accounts logged in from the same device within 1 day" and that number is 240, the feature value of the operation behavior feature is 240.
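The two encoding rules can be sketched as a single mapping function; the `(kind, raw)` tuple shape is an illustrative assumption.

```python
def feature_value(feature):
    # Map a raw feature to its numeric value: binary-category features
    # become 0/1 (one-hot style), numeric features pass through as-is.
    kind, raw = feature
    if kind == "binary":       # e.g. "device tampered at registration?"
        return 1 if raw else 0
    return float(raw)          # e.g. "logins from same device in 1 day"

print(feature_value(("binary", True)))   # 1
print(feature_value(("numeric", 240)))   # 240.0
```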
For example, the sampling time is March 1, 2018, and the preset time period before the sampling time is October 1, 2017 to March 1, 2018.
During this period, sample user A performed a total of 50 operation behaviors, A1-A50. There are 3 operation types, namely registering account information, login, and transfer, denoted B1, B2, and B3, respectively.
After acquiring the historical operation behavior information of the 50 operation behaviors of sample user A that occurred within the preset historical time period and at the sampling time, first behavior feature vectors under operation types B1 to B3 are generated for each of the operations A1 to A50. The first behavior feature vectors of A1 under operation types B1-B3 are C11-C13; the first behavior feature vectors of A2 under operation types B1-B3 are C21-C23; ...; the first behavior feature vectors of A50 under operation types B1-B3 are C501-C503.
Then, the behavior feature vector sequences corresponding to the three operation types of sample user A are generated from C11-C13, C21-C23, ..., C501-C503, where the behavior feature vector sequences respectively corresponding to the operation types are as follows:
the sequence of behavior feature vectors corresponding to operation type B1 is: C11, C21, C31, ..., C501;
the sequence of behavior feature vectors corresponding to operation type B2 is: C12, C22, C32, ..., C502;
the sequence of behavior feature vectors corresponding to operation type B3 is: C13, C23, C33, ..., C503.
In addition, since the historical operation behavior information may contain missing or abnormal values, a preprocessing operation is further performed on the behavior feature vectors before the sample user's feature vector sequences corresponding to the respective operation types are generated. The preprocessing operation includes: data cleaning, data enhancement, feature screening, and standardization.
Data cleaning removes data whose feature distribution is abnormal and fills in feature values that are missing because of errors or losses during acquisition and transmission. Missing values are filled in the following two ways:
one is as follows: and for the operation behavior with the characteristic category as the numerical characteristic, taking the average value of the operation behavior characteristic of the corresponding sample user as a characteristic value, and forming a characteristic vector with other characteristic values.
The second step is as follows: and for the operation behavior with the characteristic category as the binary category characteristic, filling the characteristic value corresponding to the operation behavior characteristic with the maximum occurrence frequency of the corresponding sample user.
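The two filling rules can be sketched as below; the list-of-values shape and function name are illustrative assumptions.

```python
def fill_missing(values, kind):
    # Fill None entries per the two rules above: the mean of observed
    # values for numeric features, the most frequent observed value for
    # binary-category features.
    observed = [v for v in values if v is not None]
    if kind == "numeric":
        fill = sum(observed) / len(observed)
    else:  # binary category: most frequent value
        fill = max(set(observed), key=observed.count)
    return [fill if v is None else v for v in values]

print(fill_missing([1.0, None, 3.0], "numeric"))  # [1.0, 2.0, 3.0]
print(fill_missing([1, 1, None, 0], "binary"))    # [1, 1, 1, 0]
```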
As for data enhancement: the operation behaviors of the sample users include normal operation behaviors and abnormal operation behaviors, and the obtained quantity distribution of the two is unbalanced. Taking the Synthetic Minority Oversampling Technique (SMOTE) data enhancement algorithm as an example: SMOTE maps all sample users with abnormal behavior into a feature space, where each such sample user corresponds to a point. A point on the line segment connecting the points of any two abnormal-behavior sample users is taken as a newly generated abnormal-behavior sample data point. By repeating this operation, any number of abnormal-behavior sample data points can be generated, and finally the ratio between the generated amount of abnormal-behavior sample data and the amount of normal user data is controlled within a certain range.
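The interpolation step can be sketched as follows. Note this is a simplification: full SMOTE interpolates toward k-nearest neighbours of each minority point, whereas this sketch picks random pairs, matching only the segment-interpolation idea described above.

```python
import random

def smote_like(minority, n_new, seed=0):
    # Simplified SMOTE-style oversampling: pick two abnormal-behavior
    # points at random and synthesize a new point somewhere on the
    # segment between them. Repeat n_new times.
    rng = random.Random(seed)
    new_points = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        lam = rng.random()  # position along the segment, in [0, 1)
        new_points.append(tuple(x + lam * (y - x) for x, y in zip(a, b)))
    return new_points

pts = smote_like([(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)], n_new=5)
print(len(pts))  # 5
```

Every synthesized point lies within the convex hull of the original minority points, so the new samples stay in plausible feature ranges.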
As for feature screening and standardization: the operation behavior data after data enhancement is subjected to dimensionality reduction, removing operation behavior data of lower importance to obtain behavior feature vectors of a uniform standard, and the data is mapped to the same numerical range to eliminate dimension (unit) effects between different features, thereby speeding up model training and improving model recognition accuracy.
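The patent does not specify which scaling is used to map features to the same numerical range; min-max scaling, shown below, is one common assumption that achieves the stated goal of eliminating unit effects.

```python
def min_max_scale(column):
    # Map one feature column to [0, 1]; a constant column maps to 0.0.
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

print(min_max_scale([10, 20, 30]))  # [0.0, 0.5, 1.0]
```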
After the behavior feature vectors are preprocessed, the corresponding behavior feature vector sequences under each operation type are generated from the first behavior feature vectors corresponding to all scenarios.
S103: performing anomaly detection model training based on the behavior characteristic vector sequence, the time interval sequence and the label information of the sample user under each operation type to obtain an anomaly detection model; and the anomaly detection model is used for detecting the anomaly risk probability of the target operation behavior of the user to be detected.
In specific implementation, the anomaly detection model comprises a GRU and a classification neural network. First, the behavior feature vector sequences are input into the GRU. Through the GRU, the relationship between the target historical operation behavior at the sampling moment and the target historical operation behaviors in the preset historical time period before the sampling moment can be characterized, and the relationship between the target historical operation behavior at the sampling moment and the time interval between each occurrence of a target historical operation behavior and the previous one can also be established, yielding a second behavior feature vector that characterizes these relationships. Then, based on the classification neural network, the anomaly risk probability of each sample user at the sampling moment is obtained from the second behavior feature vectors of each sample user under the respective operation types. Finally, based on the anomaly risk probability of each sample user and the label information indicating whether each sample user was abnormal at the sampling moment, the anomaly detection model is trained, thereby obtaining the trained anomaly detection model.
Specifically, referring to fig. 2, an embodiment of the present application further provides a specific method for training an anomaly detection model, including:
s201: and inputting the behavior characteristic vector sequence and the time interval sequence of the sample user under each operation type into the recurrent neural network aiming at each sample user to obtain a second behavior characteristic vector corresponding to the operation type.
In specific implementation, the second behavior feature vector can characterize the internal relationship between the sample user's target operation behavior at the sampling moment and the target operation behaviors occurring in the preset time period before the sampling moment. It should be noted here that, for the GRU, in order to enhance the robustness and generalization capability of the anomaly detection model, the same set of internal GRU parameters is shared among the sample user's operation behaviors under the different operation types; however, in order to reflect the characteristics of different service operations, the input first behavior feature vector needs to undergo a standardized transformation before the second behavior feature vector corresponding to the operation type is obtained.
Specifically, referring to fig. 3, an embodiment of the present application further provides a specific method for obtaining a second behavior feature vector corresponding to the operation type, including:
s301: and inputting the behavior characteristic vector sequence corresponding to each operation type into a conversion layer, and performing standardized conversion on each first behavior characteristic vector in the behavior characteristic vector sequence through the conversion layer to obtain a behavior conversion vector when a sample user generates an operation behavior each time.
S302: and inputting the time interval of each sample user in the time interval sequence when the operation behavior occurs every time and the behavior conversion vector into the coding layer for coding aiming at the time interval sequence under each operation type, and generating second behavior characteristic vectors corresponding to each operation type.
In a specific implementation, the GRU has multiple layers, including a conversion layer and an encoding layer. The behavior feature vector sequences and the time interval sequences are input into the GRU. First, the first behavior feature vectors in the behavior feature vector sequence under each operation type undergo standardized conversion through the conversion layer, yielding a behavior conversion vector for each occurrence of an operation behavior of each sample user under each operation type. Under each operation type, since each behavior conversion vector corresponds to one occurrence of an operation behavior, and each time interval in the time interval sequence likewise corresponds to one occurrence, the behavior conversion vectors correspond one-to-one with the time intervals.
Here, the conversion layer includes transformation matrices corresponding to the different operation types, and a transformation matrix performs the standardized conversion on a first behavior feature vector as follows: the transformation matrix is multiplied by the first behavior feature vector, yielding the behavior conversion vector for each operation of the sample user under each operation type.
Under each operation type, each behavior conversion vector and its corresponding time interval are input together into the encoding layer for encoding, generating the second behavior feature vector corresponding to that operation type. It should be noted here that a second behavior feature vector corresponds to the sampling time and to each time an operation behavior occurred within the preset historical time period before the sampling time; because only the second behavior feature vector corresponding to the sampling time can characterize the internal relationship between the target operation behavior at the sampling time and the preset historical time period, the second behavior feature vector corresponding to each operation type is the one corresponding to the sampling time.
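The conversion-layer step can be sketched as a per-operation-type matrix multiply that brings all types to a shared width for the shared GRU encoder. The dictionary layout and all shapes here are illustrative assumptions.

```python
import numpy as np

def convert(first_vectors, W_by_type, op_type):
    # Conversion layer: multiply the stack of first behavior feature
    # vectors by the transformation matrix of their operation type,
    # producing behavior conversion vectors of a shared width that the
    # shared-parameter GRU encoder can consume.
    W = W_by_type[op_type]        # shape (d_in, d_shared), illustrative
    return first_vectors @ W      # (T, d_in) @ (d_in, d_shared)

rng = np.random.default_rng(0)
W_by_type = {"transfer": rng.normal(size=(4, 3))}
seq = rng.normal(size=(5, 4))     # 5 operations, 4 raw features each
print(convert(seq, W_by_type, "transfer").shape)  # (5, 3)
```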
S202: and splicing the second behavior characteristic vectors corresponding to the operation types, inputting the second behavior characteristic vectors into the classification neural network, and acquiring the abnormal risk probability of the sample user at the sampling moment.
Each operation type includes a basic operation scenario and a service operation scenario. Therefore, the feature vectors corresponding to each operation type include basic operation feature vectors, such as the feature vector corresponding to logging in to an account or registering an account, and service operation feature vectors, such as the feature vectors corresponding to the payment service and the transfer service.
It is noted here that the feature vector corresponding to the registered account is consistent with the behavior feature vector corresponding to the type of registration operation.
For example, in the example above, the second behavior feature vectors of operation types B1 to B3 are E1 to E3, respectively. E1 to E3 are then spliced as E1+E2+E3, and E1+E2+E3 is input into the classification neural network. Here "+" denotes splicing.
For example, if the operation types include B1 and B2, the second behavior feature vector E1 of B1 is (M1, M2), and the second behavior feature vector E2 of B2 is (N1, N2, N3), then E1 and E2 are spliced, and the result after splicing is (M1, M2, N1, N2, N3).
The classification neural network has multiple layers, each with multiple neurons. The interrelations between different elements of the spliced feature vector are established through the neurons of each layer, and after the multi-layer processing, the output of the multi-layer network is classified with a classification function to obtain the anomaly risk probability corresponding to the sample user.
For example: the operation type of the sample user is the transfer service, and the feature vectors comprise the feature vector corresponding to the login account, the feature vector corresponding to the registered account, and the feature vector corresponding to the transfer service; these are spliced and input into the classification neural network, obtaining the anomaly risk probability of the sample user under the transfer operation type.
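A minimal sketch of such a classification network: a hidden layer mixing the concatenated second behavior feature vectors, followed by a sigmoid that squashes the output into a risk probability in (0, 1). The layer sizes and the random stand-in weights are assumptions; a real model would use trained parameters.

```python
import numpy as np

def mlp_risk(x, W1, b1, W2, b2):
    # Two-layer classifier: tanh hidden layer, then sigmoid output so
    # the result is an anomaly-risk probability strictly in (0, 1).
    h = np.tanh(x @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))

rng = np.random.default_rng(1)
x = rng.normal(size=5)                     # concatenated E1+E2+E3
p = mlp_risk(x, rng.normal(size=(5, 4)), rng.normal(size=4),
             rng.normal(size=4), rng.normal(size=1))
print(0.0 < float(p) < 1.0)  # True
```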
S203: and training the circulating neural network and the classified neural network based on the abnormal risk probability and the label information to obtain an abnormal detection model.
In specific implementation, the cross entropy loss between the anomaly risk probability and the label information can be calculated, and the GRU and the classification neural network are trained according to the cross entropy loss and a preset cross entropy loss threshold.
Specifically, training the GRU and the classification neural network according to the cross entropy loss and a preset cross entropy loss threshold includes:
executing the following comparison operation until the cross entropy loss between the abnormal risk probability and the labeling information is not greater than a preset cross entropy loss threshold;
the comparison operation comprises the following steps: determining cross entropy loss based on the abnormal risk probability and the labeling information, and comparing the cross entropy loss with a cross entropy loss threshold; and aiming at the condition that the cross entropy loss is larger than the cross entropy loss threshold value, adjusting parameters of the recurrent neural network and the classified neural network, re-acquiring the labeling information based on the GRU and the classified neural network after the parameters are adjusted, and re-executing comparison operation.
It is noted that when adjusting the parameters of the GRU, for the conversion layer the parameters adjusted are the element values of the transformation matrices.
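The comparison loop above can be sketched as follows. The `step` callback stands in for one round of parameter adjustment plus loss re-evaluation; the cross-entropy helper and the toy geometric loss decay are illustrative assumptions.

```python
import numpy as np

def cross_entropy(p, y):
    # Binary cross-entropy between predicted risk p and label y in {0,1}.
    eps = 1e-12
    return -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def train_until_threshold(step, threshold, max_iters=1000):
    # Repeat the comparison operation: while the loss exceeds the preset
    # threshold, adjust parameters (via `step`) and re-evaluate.
    loss = step()
    it = 0
    while loss > threshold and it < max_iters:
        loss = step()
        it += 1
    return loss

# Toy stand-in for "adjust parameters, recompute loss": loss halves
# on every adjustment, so the loop terminates below the threshold.
state = {"loss": 1.0}
def step():
    state["loss"] *= 0.5
    return state["loss"]

final = train_until_threshold(step, threshold=0.01)
print(final <= 0.01)  # True
```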
In the embodiment of the present application, after target historical operation behavior information of multiple sample users at a sampling time and within a preset historical time period before the sampling time is obtained, a behavior feature vector sequence and a time interval sequence under each operation type are generated for each sample user, and these sequences are input into the recurrent neural network; the anomaly detection model is then trained based on the label information indicating whether abnormal behavior occurred at the sampling time, yielding the anomaly detection model. Because the behavior feature vector sequences and time interval sequences for each operation type are generated from each sample user's target historical operation behavior information, the anomaly detection model trained on them can learn the relationship between consecutive target historical operation behaviors and a user's current operation behavior. When the model is used for abnormal behavior detection, the user's current target operation behavior when using the electronic bank can be analyzed in light of the target historical operation behaviors together with the target operation behavior at the current moment, improving the accuracy of judging whether the user's current target operation behavior is abnormal.
Referring to fig. 4, an embodiment of the present application further provides a method for detecting an abnormal operation behavior, including:
s401: and after the target operation behavior of the user to be detected is detected, acquiring the target operation behavior information of the user to be detected in the latest preset time period.
In the specific implementation, the target operation behavior information refers to information of a service to be handled by the user to be tested, which is carried in a service handling request initiated by the user to be tested through an electronic bank at the current moment. The latest preset time period is a time length between the current time and a certain historical time, wherein the historical time changes along with the change of the time length, and the time lengths can be the same or different for different users.
S402: generating a to-be-tested behavior characteristic vector sequence and a to-be-tested time interval sequence which are respectively corresponding to each operation type of a to-be-tested user under various operation types according to the target operation behavior information; the behavior feature vector sequence to be tested comprises a plurality of behavior feature vectors to be tested, and different behavior feature vectors to be tested are feature matrixes corresponding to target operation behaviors occurring at different moments.
In specific implementation, feature vector extraction and feature vector preprocessing are performed under each operation type according to target operation behavior information, where a preprocessing process of the feature vector is consistent with a corresponding embodiment in the anomaly detection model training method provided by the present application, and details are not repeated here.
Each operation type corresponds to a basic operation scenario and a service operation scenario of the user to be detected. The to-be-detected behavior feature vector sequence corresponding to each operation type comprises, for the basic operation scenario, the to-be-detected behavior feature vector at the current time and the to-be-detected behavior vectors within the latest preset time period, and, for the service operation scenario, the to-be-detected behavior feature vector at the current time and the to-be-detected behavior feature vectors within the third historical time period.
The behavior characteristic vector sequence to be detected and the time interval sequence to be detected are generated in the following mode:
according to the target operation behavior information, acquiring the time of each operation behavior of a user to be tested in the latest preset time period and characteristic values under a plurality of preset operation behavior characteristics corresponding to each operation type respectively;
calculating the time interval between each time of the operation behavior of the user to be tested in each operation type and the last time of the operation behavior in the latest preset time period according to the time of each time of the operation behavior of the user to be tested in the latest preset time period;
generating a time interval sequence to be tested of the user to be tested under each operation type according to the time interval corresponding to each operation behavior of the user to be tested under each operation type;
and the number of the first and second groups,
generating to-be-detected behavior feature vectors respectively corresponding to each operation type for each time the user to be detected performs an operation behavior, according to the feature values under the multiple preset operation behavior features corresponding to each operation type when the user to be detected performs an operation behavior within the latest preset time period;
and generating a to-be-tested behavior feature vector sequence corresponding to each operation type of the to-be-tested user according to the to-be-tested behavior feature vector corresponding to each operation type when the to-be-tested user generates the operation behavior each time.
S403: and inputting the characteristic vector sequence of the behavior to be detected and the time interval sequence to be detected into an abnormality detection model obtained by an abnormality detection model training method, and acquiring the abnormal risk probability of the target operation behavior of the user to be detected.
The abnormal behavior detection model comprises: a GRU corresponding to each operation type and a classification neural network.
The training method of the anomaly detection model may refer to the embodiments corresponding to fig. 1 to fig. 3, and is not described herein again.
In specific implementation, the to-be-detected behavior feature vector sequence and the to-be-detected time interval sequence under each operation type are input into the abnormal behavior detection model corresponding to that operation type. Through this model, the association between all to-be-detected behavior feature vectors within the latest preset time period and the to-be-detected behavior feature vector corresponding to the user's target behavior is established, and the risk probability of the operation behavior of the user to be detected is obtained.
Specifically, referring to fig. 5, an embodiment of the present application further provides a specific method for obtaining a risk probability of an operation behavior of a user to be tested, including:
s501: and inputting the behavior feature vector sequence to be detected and the time interval sequence to be detected corresponding to each operation type into the recurrent neural network to obtain a third behavior feature vector under each operation type.
S502: and splicing the third row characteristic vectors corresponding to each operation type, and inputting the spliced third row characteristic vectors into a classification neural network to obtain the abnormal risk probability of the target operation behavior of the user to be detected.
When the method is concretely realized, the behavior feature vector sequence to be tested corresponding to each operation type is input into the GRU, firstly, the behavior feature vector to be tested in the behavior feature vector sequence to be tested under each operation type is subjected to standardized conversion, and when the standardized conversion is carried out, the conversion is carried out through the conversion matrix of the conversion layer.
Specifically, an embodiment of the present application further provides a specific method for performing standardized transformation on each behavior feature vector to be tested in a behavior feature vector sequence to be tested, including:
the conversion layer corresponds to the conversion matrix of each operation type;
and under each operation type, multiplying each to-be-detected behavior feature vector by the conversion matrix, generating the to-be-detected behavior conversion vector for each time the user to be detected performs an operation behavior under that operation type.
In specific implementation, referring to the network structure diagram shown in fig. 6, the to-be-detected behavior feature vectors in the to-be-detected behavior feature vector sequence under each operation type are multiplied by the conversion matrix to obtain the to-be-detected behavior conversion vectors corresponding to the operation moments within the latest preset time period under each operation type. The to-be-detected behavior conversion vectors and the time intervals corresponding to each of them in the to-be-detected time interval sequence are input into the encoding layer to obtain the third behavior feature vectors corresponding to the operation types; the third behavior feature vectors corresponding to the operation types are then spliced and input into the classification neural network, obtaining the risk probability of the target operation behavior of the user to be detected.
For example: referring to the schematic diagram of the change of the operational state of the GRU shown in fig. 7, the current time is t and the length of the latest preset time period is n+1. The to-be-detected behavior feature vectors within the latest preset time period then refer to the to-be-detected behavior feature vectors corresponding to the operation behaviors of the user to be detected at times t-n, t-(n-1), ..., t-1, and t when registering the account, when logging in to the account, and when initiating the transfer request.
It should be noted here that the conversion matrices in the conversion layer differ for different operation types. When the operation type is a basic operation behavior, the corresponding to-be-detected behavior feature vector does not need to be converted by a conversion matrix. When the operation type is a service operation behavior, the corresponding to-be-detected behavior feature vector needs to be converted through the conversion matrix of the conversion layer; for example, if the operation type is transfer, the conversion matrix is W_business = W_transfer, where W_business denotes the conversion matrix in use and W_transfer denotes the conversion matrix of the transfer operation type.
Under each operation type, the to-be-detected behavior feature vector V_(t-k) at each moment is multiplied by the conversion matrix of the current operation type, i.e. V_(t-k) × W_business, where V_(t-k) is the to-be-detected behavior feature vector at moment t-k, W_business is the conversion matrix, and k = 0, 1, 2, ..., n. The resulting to-be-detected behavior conversion vectors at the different moments, together with the time intervals corresponding to each of them, are input into the encoding layer for encoding. Through the internal parameter state of the encoding layer, a feature vector is obtained that reflects the association between the to-be-detected behavior feature vector V_t at the current moment and the to-be-detected behavior feature vectors V_(t-k) at each historical moment, yielding the third behavior feature vector h_t at the current moment t. This h_t reflects the comprehensive characteristics of the behavior feature vector sequence under the operation type; that is, the third behavior feature vector h_t at the current moment t extracts the characteristics of all the operation behavior information of the user to be detected at the previous n moments.
Specifically, the encoding layer of the GRU obtains the third behavior feature vector through the following process:
each node of the GRU comprises two gates, namely an update gate and a reset gate, wherein the gated state quantity of the update gate and the gated state quantity of the reset gate are obtained through the state transmitted by the previous node and the input quantity of the current node.
For example: the state transmitted by the previous node is s_(m-1), the input of the current node is x_m, and the gating state quantity of the update gate is denoted z_m; then z_m satisfies the following formula (1):
z_m = σ(W_z·x_m + U_z·s_(m-1))    (1);
where z_m denotes the gating state quantity of the update gate, s_(m-1) denotes the state transmitted by the previous node, x_m denotes the input of the current node, W_z denotes the update weight matrix of the current node's input, U_z denotes the update weight matrix of the state transmitted by the previous node, and σ(·) denotes the Sigmoid function.
The gating state quantity of the reset gate is denoted r_m; with the state transmitted by the previous node being s_(m-1) and the input of the current node being x_m, r_m satisfies the following formula (2):
r_m = σ(W_r·x_m + U_r·s_(m-1))    (2);
where r_m denotes the gating state quantity of the reset gate, s_(m-1) denotes the state transmitted by the previous node, x_m denotes the input of the current node, W_r denotes the reset weight matrix of the current node's input, U_r denotes the reset weight matrix of the state transmitted by the previous node, and σ(·) denotes the Sigmoid function.
Here, the input quantity of the current node is a to-be-tested behavior conversion vector corresponding to the business operation behavior or a to-be-tested behavior feature vector corresponding to the basic operation behavior.
In addition, the anomaly detection of the present application considers not only the historical operation behaviors of the user to be detected but also the influence of the time interval between successive operation behaviors on whether the current operation behavior of the user to be detected is abnormal, as represented by formula (3):
T_m = σ(W_t·x_m + σ(Q_t·Δt_m))    (3);
where T_m denotes the time state quantity, x_m denotes the input of the current node, Δt_m denotes the time interval of the current node, σ(·) denotes the Sigmoid function, W_t denotes the time-state weight matrix of the current node's input, and Q_t denotes the weight parameter of the time interval.
After obtaining the gating state quantity z_m of the update gate, the gating state quantity r_m of the reset gate, and the time state quantity T_m, the state quantity s_m of the current node is obtained through the following formula (4):
s_m = z_m × T_m × tanh(W_h·x_m + U_h·(r_m × x_(m-1))) + (1 − z_m) × s_(m-1)    (4);
where s_m denotes the state quantity of the current node, z_m denotes the gating state quantity of the update gate, r_m denotes the gating state quantity of the reset gate, x_(m-1) denotes the input of the previous node, s_(m-1) denotes the state transmitted by the previous node, T_m denotes the time state quantity, W_h denotes the first integrated encoding weight matrix of the current node's input, U_h denotes the second integrated encoding weight matrix, and tanh(·) denotes the hyperbolic tangent function.
After the state quantity s_m of the current node is obtained, the state quantity that integrates the features of the state quantities output by all nodes is taken as the third behavior feature vector.
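The node update described by equations (2)-(4) can be sketched as follows. This is a minimal illustration, not the patented implementation: the update-gate equation (1) falls outside this excerpt, so a standard GRU update gate z_m = σ(W_z x_m + U_z s_{m-1}) is assumed; input and state dimensions are taken equal so that the elementwise products are well defined; and all names (`time_gated_gru_step`, the weight keys) are illustrative, with randomly initialized weights standing in for trained ones.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def time_gated_gru_step(x_m, x_prev, s_prev, dt_m, p):
    """One node of the time-aware recurrent cell, per equations (2)-(4).

    x_m    -- input quantity of the current node (behavior conversion vector)
    x_prev -- input quantity of the previous node (x_{m-1})
    s_prev -- state quantity transmitted by the previous node (s_{m-1})
    dt_m   -- time interval Δt_m of the current node
    p      -- dict of weight matrices/vectors (assumed shapes, see lead-in)
    """
    # update gate (assumed standard GRU form; eq. (1) is outside this excerpt)
    z_m = sigmoid(p["Wz"] @ x_m + p["Uz"] @ s_prev)
    # reset gate, equation (2)
    r_m = sigmoid(p["Wr"] @ x_m + p["Ur"] @ s_prev)
    # time gate, equation (3): inner sigmoid squashes the weighted interval
    T_m = sigmoid(p["Wt"] @ x_m + sigmoid(p["Qt"] * dt_m))
    # candidate state modulated by the reset gate, then blended, equation (4)
    h = np.tanh(p["Wh"] @ x_m + p["Uh"] @ (r_m * x_prev))
    s_m = z_m * T_m * h + (1.0 - z_m) * s_prev
    return s_m
```

Note that, as stated by equation (4), the reset gate here multiplies the previous *input* x_{m-1} rather than the previous state, which is where this cell departs from a textbook GRU.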
After the third behavior feature vectors under each operation type are obtained, the third behavior feature vectors under the different operation types are spliced and input into the classification neural network to obtain the abnormal risk probability of the current target operation behavior of the user to be detected.
After the abnormal risk probability of the current target operation behavior of the user to be detected is obtained, whether the operation behavior at the current moment should be intercepted is decided by comparing its risk probability against the preset risk threshold corresponding to that operation behavior.
Specifically, as shown in fig. 8, the method for detecting an abnormal operation behavior according to the embodiment of the present application further includes:
s801: and detecting whether the risk probability reaches a corresponding preset risk threshold value when the target operation behavior occurs to the user to be detected.
S802: and when the risk probability reaches a preset risk threshold corresponding to the target operation behavior of the user to be detected, intercepting the target operation behavior of the user to be detected.
S803: and when the risk probability does not reach a preset risk threshold corresponding to the target operation behavior of the user to be tested, allowing the target operation behavior of the user to be tested to be executed.
When the target operation behavior is detected as abnormal, the target operation behavior of the user to be detected is intercepted, the corresponding behavior information is marked with an abnormal-behavior label, and the labeled information is stored in the historical operation behavior database.
If the risk probability does not reach the preset risk threshold corresponding to the target operation behavior, the target operation behavior of the user to be detected is treated as normal: it is allowed to execute, its information is marked with a normal-behavior label, and the labeled information is stored in the historical operation behavior database.
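The threshold decision and labeling of steps S801-S803 can be sketched as below. This is a schematic, not the patented system: `handle_target_behavior` and its parameters are hypothetical names, "reaches" is interpreted as "greater than or equal to", and a plain list stands in for the historical operation behavior database.

```python
def handle_target_behavior(risk_prob, threshold, behavior_info, history_db):
    """Intercept or allow a target operation behavior, then label and store it.

    S801: compare risk_prob against the preset threshold for this behavior.
    S802: intercept when the threshold is reached (behavior labeled abnormal).
    S803: allow execution otherwise (behavior labeled normal).
    Either way the labeled record is appended to the historical database,
    from which future model training draws its samples.
    """
    intercepted = risk_prob >= threshold
    label = "abnormal" if intercepted else "normal"
    history_db.append({"behavior": behavior_info,
                       "label": label,
                       "risk_prob": risk_prob})
    return intercepted
```

Storing both normal and abnormal outcomes, as the surrounding text describes, is what lets the detection model be periodically retrained on fresh labels.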
With the method for detecting abnormal operation behavior provided by the embodiments of the present application, a to-be-detected behavior feature vector sequence is obtained for each operation type from the target operation behavior information of the user to be detected within the preset time period. The sequence is input into the abnormal behavior detection model, which establishes the internal relationship between the to-be-detected behavior feature vectors in the sequence and thereby obtains the risk probability of the target operation behavior of the user to be detected. Because the operation behavior at the current moment is analyzed together with the historical operation behavior information, the accuracy of judging whether a user's operation behavior is abnormal is improved.
Based on the same inventive concept, an embodiment of the present application also provides an abnormal behavior detection model training device corresponding to the above training method. Since the device solves the problem on a principle similar to that of the training method, its implementation can refer to the implementation of the method, and repeated details are omitted.
Referring to fig. 9, another embodiment of the present application further provides an anomaly detection model training device, including:
an acquisition module 901, configured to acquire target historical operation behavior information of a plurality of sample users at a sampling moment and in a historical time period before it, together with label information indicating whether each sample user exhibited abnormal behavior at the sampling moment;
a generating module 902, configured to generate, for each sample user, a behavior feature vector sequence of the sample user in each operation type and a time interval sequence of occurrence of target historical operation behavior according to target historical operation behavior information; the behavior feature vector sequence comprises a plurality of first behavior feature vectors, and different first behavior feature vectors are feature vectors corresponding to target historical operation behaviors occurring at different moments;
and the training module 903 is configured to perform anomaly detection model training based on the behavior feature vector sequence, the time interval sequence and the label information of the sample user in each operation type, so as to obtain an anomaly detection model.
Optionally, the training module 903 is configured to perform anomaly detection model training based on the behavior feature vector sequence and the time interval sequence of the sample user in each operation type and the label information in the following manner, including:
for each sample user, inputting the behavior feature vector sequence and the time interval sequence under each operation type into a recurrent neural network to obtain a second behavior feature vector corresponding to that operation type;
splicing the second behavior feature vectors corresponding to the operation types, inputting the spliced vector into a classification neural network, and acquiring the abnormal risk probability of the sample user at the sampling moment;
and training the recurrent neural network and the classification neural network based on the abnormal risk probability and the label information to obtain the anomaly detection model.
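The splicing and classification step of the training module can be sketched as follows. The excerpt does not specify the classification neural network's architecture, so a single logistic layer with binary cross-entropy loss is assumed here; `splice_classify_loss`, `W`, and `b` are illustrative names, and the per-type second behavior feature vectors are stand-ins for real recurrent-network outputs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def splice_classify_loss(second_vectors, W, b, label):
    """Splice per-operation-type second behavior feature vectors, classify,
    and compute the training loss against the sample's label.

    second_vectors -- list of 1-D arrays, one per operation type, each the
                      output of that type's recurrent network
    W, b           -- parameters of the (assumed single-layer) classifier
    label          -- 1 if the sampling-moment behavior was abnormal, else 0
    """
    spliced = np.concatenate(second_vectors)       # splice across types
    risk_prob = sigmoid(float(W @ spliced) + b)    # abnormal risk probability
    eps = 1e-12                                    # numerical safety margin
    loss = -(label * np.log(risk_prob + eps)
             + (1 - label) * np.log(1.0 - risk_prob + eps))
    return risk_prob, loss
```

Backpropagating this loss through both the classifier and the recurrent networks, as the text states, is what jointly trains them into the anomaly detection model.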
Optionally, the training module 903 obtains a second behavior feature vector corresponding to the operation type specifically according to the following manner, including:
the recurrent neural network includes: a conversion layer and an encoding layer;
inputting the behavior characteristic vector sequence corresponding to each operation type into a conversion layer, and performing standardized conversion on each first behavior characteristic vector in the behavior characteristic vector sequence through the conversion layer to obtain a behavior conversion vector when a sample user generates an operation behavior each time;
and, for the time interval sequence under each operation type, inputting the time interval of each occurrence of the operation behavior together with the corresponding behavior conversion vector into the coding layer for coding, generating the second behavior feature vector corresponding to that operation type.
Optionally, the training module 903 is specifically configured to perform the standardized conversion of each first behavior feature vector in the behavior feature vector sequence in the following manner, generating a behavior conversion vector for each occurrence of an operation behavior by a sample user under each operation type:
the conversion layer comprises conversion matrixes corresponding to various operation types;
under each operation type, multiplying each first behavior feature vector by a conversion matrix respectively to generate a behavior conversion vector when a sample user generates an operation behavior each time under each operation type;
when the recurrent neural network is trained, its parameters are adjusted, and the element values of the conversion matrix in the conversion layer are adjusted accordingly.
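The conversion layer's standardized conversion can be sketched as below. The patent only states that each first behavior feature vector is multiplied by a per-operation-type conversion matrix whose entries are adjusted during training; the class name, matrix shapes, and the idea of projecting differently sized per-type features to one common dimension are assumptions for illustration, and the gradient adjustment of the matrices is not shown.

```python
import numpy as np

class ConversionLayer:
    """One learnable conversion matrix per operation type (a sketch)."""

    def __init__(self, in_dims_by_type, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        # randomly initialized matrices; in training these entries would be
        # adjusted along with the rest of the recurrent network's parameters
        self.matrices = {t: rng.standard_normal((out_dim, d)) * 0.1
                         for t, d in in_dims_by_type.items()}

    def convert(self, op_type, feature_vector_sequence):
        """Standardized conversion: matrix-multiply every first behavior
        feature vector in the sequence for this operation type, yielding
        one behavior conversion vector per occurrence of the behavior."""
        M = self.matrices[op_type]
        return [M @ v for v in feature_vector_sequence]
```

A shared output dimension would let the downstream coding layer treat all operation types uniformly, which is one plausible reason for inserting such a layer before encoding.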
Based on the same inventive concept, an embodiment of the present application also provides an abnormal behavior detection device corresponding to the above anomaly detection method. Since the device solves the problem on a principle similar to that of the anomaly detection method, its implementation can refer to the implementation of the method, and repeated details are omitted.
Referring to fig. 10, an embodiment of the present application further provides an apparatus for detecting an abnormal operation behavior, including:
the information acquisition module 1001 is configured to acquire target operation behavior information of a user to be detected within a latest preset time period after detecting that the user to be detected performs a target operation behavior;
the sequence generating module 1002 is configured to generate a to-be-detected behavior feature vector sequence and a to-be-detected time interval sequence, which correspond to each operation type of a to-be-detected user under multiple operation types, according to the target operation behavior information; the behavior feature vector sequence to be tested comprises a plurality of behavior feature vectors to be tested, and different behavior feature vectors to be tested are feature vectors corresponding to target operation behaviors occurring at different moments;
a probability obtaining module 1003, configured to input the feature vector sequence of the behavior to be detected and the time interval sequence to be detected into an anomaly detection model obtained by the anomaly detection model training device, and obtain a risk probability of a target operation behavior of a user to be detected;
the anomaly detection model includes: a recurrent neural network and a categorical neural network.
Optionally, the probability obtaining module 1003 is specifically configured to obtain the abnormal risk probability of the target operation behavior of the user to be tested according to the following manner:
inputting the behavior feature vector sequence to be detected and the time interval sequence to be detected corresponding to each operation type into a recurrent neural network to obtain a third behavior feature vector under each operation type;
and splicing the third behavior feature vectors corresponding to the operation types, and inputting the spliced vector into the classification neural network to obtain the abnormal risk probability of the target operation behavior of the user to be detected.
Optionally, the probability obtaining module 1003 is specifically configured to obtain the third behavior feature vector under each operation type in the following manner, the recurrent neural network including: a conversion layer and a coding layer;
Inputting the behavior feature vector sequence to be tested corresponding to each operation type into a conversion layer, and performing standardized conversion on each behavior feature vector to be tested in the behavior feature vector sequence to be tested through the conversion layer to obtain a behavior conversion vector to be tested when the operation behavior occurs to a user to be tested each time;
and inputting the time interval of each operation behavior of the user to be detected in the time interval sequence to be detected and the conversion vector of the behavior to be detected into the coding layer for coding, and generating a third behavior feature vector corresponding to each operation type.
Optionally, the probability obtaining module 1003 is specifically configured to perform standardized conversion on each to-be-detected behavior feature vector in the to-be-detected behavior feature vector sequence in the following manner, generating a to-be-detected behavior conversion vector for each occurrence of an operation behavior by the user to be detected under each operation type:
the conversion layer includes a conversion matrix corresponding to each operation type;
and under each operation type, each to-be-detected behavior feature vector is multiplied by the conversion matrix to generate the to-be-detected behavior conversion vector for each occurrence of an operation behavior under that operation type.
Optionally, the anomaly detection apparatus further includes a detection module 1004; the detection module 1004 is specifically configured to:
detecting whether the risk probability reaches a corresponding preset risk threshold value when a target operation behavior occurs to a user to be detected;
intercepting the target operation behavior of the user to be detected when the risk probability reaches a preset risk threshold corresponding to the target operation behavior of the user to be detected;
and when the risk probability does not reach a preset risk threshold corresponding to the target operation behavior of the user to be tested, allowing the target operation behavior of the user to be tested to be executed.
Referring to fig. 11, an embodiment of the present application further provides an anomaly detection system, in which the anomaly detection method of the embodiment of fig. 5 specifically proceeds as follows:
the anti-exception system acquires a service request of a user to be detected through the service system, further acquires target operation behavior information of the user to be detected, and calls the target operation behavior information of the latest preset time period from the historical behavior database corresponding to the operation behavior information of the user to be detected. And analyzing the risk value of the target operation behavior of the user to be detected at the current moment for evaluation based on the target operation behavior information, and performing evading operation of the risk according to the evaluation result. And sending the information of normal or abnormal target operation behavior information mark at the current moment to a historical database for storage, and sending the information to a training database for parameter adjustment of an abnormal behavior detection model.
Referring to fig. 12, an embodiment of the present application further provides an internal functional entity structure of an anti-exception system, which specifically includes:
it can be seen from the figure that there are a total of 3 physical modules: one is a timer 10, and the other is an abnormal behavior detection model training pipeline module 11 and a bank anti-abnormal wind control engine 12 based on the user's historical behavior. The timer 10 has a single structure, and is mainly used for periodically triggering the operation of the abnormal behavior detection model training pipeline module 11, so that the abnormal behavior detection model always keeps the identification state of the latest abnormal mode; the anomaly detection model training pipeline module 11 can perform a data preprocessing process before the anomaly detection model training in the embodiment corresponding to fig. 1, such as data vectorization representation, data cleaning, data enhancement, feature screening, standardization operation, and the like, and can also perform a training process of the anomaly detection model in the embodiment corresponding to fig. 2. The bank abnormal wind control engine 12 based on the historical behavior of the user mainly receives the model parameters trained by the detection model training pipeline module 11 and judges the operation behavior of the user in real time.
Corresponding to the anomaly detection model training method in fig. 1 and the abnormal operation behavior detection method in fig. 4, an embodiment of the present invention further provides a computer device 1300. As shown in fig. 13, the device includes a memory 1000, a processor 2000, and a bus 33; the processor 2000 implements the steps of the above training method and detection method when executing the computer program stored in the memory.
Specifically, the memory 1000 and the processor 2000 may be a general-purpose memory and processor, which are not limited here. When the processor 2000 runs the computer program stored in the memory 1000, the steps of the above anomaly detection model training method and abnormal operation behavior detection method are executed, so that the current user's operation behavior can be analyzed together with the historical operation behavior information, improving the accuracy of judging whether the user's operation behavior is abnormal.
Corresponding to the anomaly detection model training method in fig. 1 and the abnormal operation behavior detection method in fig. 4, an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above anomaly detection model training method and abnormal operation behavior detection method are performed.
Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk, a hard disk, or the like, and when a computer program on the storage medium is executed, the above-mentioned abnormality detection model training method and the abnormal operation behavior detection method can be executed, and the operation behavior of the current user can be analyzed according to the historical operation behavior information and the operation behavior information at the current time, so as to improve the accuracy of determining whether the operation behavior of the user is an abnormal behavior.
The computer program product of the method and the apparatus for training the anomaly detection model and the method and the apparatus for detecting the abnormal operation behavior provided in the embodiments of the present application includes a computer readable storage medium storing a program code, and instructions included in the program code may be used to execute the method in the foregoing method embodiments.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, a division of a unit is merely a division of one logic function, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above examples are only specific embodiments of the present application, and are not intended to limit the technical solutions of the present application, and the scope of the present application is not limited thereto, although the present application is described in detail with reference to the foregoing examples, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope disclosed in the present application; such modifications, changes or substitutions do not depart from the spirit and scope of the exemplary embodiments of the present application, and are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (6)

1. A method for detecting abnormal operating behavior, comprising:
acquiring target historical operation behavior information of a plurality of sample users at a sampling moment when target historical operation behaviors occur and in a preset historical time period before the sampling moment, and label information indicating whether each sample user has abnormal behaviors at the sampling moment;
for each sample user, generating a behavior feature vector sequence of the sample user under each operation type and a time interval sequence of target historical operation behaviors according to the acquired target historical operation behavior information; the behavior feature vector sequence comprises a plurality of first behavior feature vectors, and different first behavior feature vectors are feature vectors corresponding to target historical operation behaviors occurring at different moments;
performing anomaly detection model training based on the behavior characteristic vector sequence of the sample user under each operation type, the time interval sequence and the label information to obtain an anomaly detection model; the anomaly detection model is used for detecting the anomaly risk probability of the target operation behavior of the user to be detected;
the method comprises the following steps:
after the target operation behavior of a user to be detected is detected, acquiring target operation behavior information of the user to be detected in a latest preset time period;
generating a to-be-tested behavior characteristic vector sequence and a to-be-tested time interval sequence which are respectively corresponding to each operation type of the to-be-tested user under various operation types according to the target operation behavior information; the behavior feature vector sequence to be tested comprises a plurality of behavior feature vectors to be tested, and different behavior feature vectors to be tested are feature vectors corresponding to target operation behaviors occurring at different moments;
inputting the characteristic vector sequence of the behavior to be detected and the time interval sequence to be detected into the anomaly detection model, and acquiring the anomaly risk probability of the target operation behavior of the user to be detected;
the method further comprises the following steps:
detecting whether the risk probability reaches a corresponding preset risk threshold value when a target operation behavior occurs to a user to be detected;
intercepting the target operation behavior of the user to be detected when the risk probability reaches a preset risk threshold corresponding to the target operation behavior of the user to be detected;
when the risk probability does not reach a preset risk threshold corresponding to the target operation behavior of the user to be tested, allowing the target operation behavior of the user to be tested to be executed;
the training of the anomaly detection model based on the behavior feature vector sequence, the time interval sequence and the label information of the sample user under each operation type comprises the following steps:
for each sample user, inputting the behavior characteristic vector sequence of the sample user under each operation type and the time interval sequence into a recurrent neural network to obtain a second behavior characteristic vector corresponding to the operation type;
splicing the second behavior characteristic vectors corresponding to the operation types, inputting the second behavior characteristic vectors to a classification neural network, and acquiring the abnormal risk probability of the sample user at the sampling moment;
training the recurrent neural network and the classification neural network based on the abnormal risk probability and the label information to obtain the anomaly detection model;
the recurrent neural network includes: a conversion layer and an encoding layer;
the obtaining of the second behavior feature vector corresponding to the operation type includes:
inputting the behavior characteristic vector sequence corresponding to each operation type into the conversion layer, and performing standardized conversion on each first behavior characteristic vector in the behavior characteristic vector sequence through the conversion layer to obtain a behavior conversion vector when a sample user generates an operation behavior each time;
and, for the time interval sequence under each operation type, inputting the time interval of each occurrence of the operation behavior by each sample user together with the behavior conversion vector into the coding layer for coding, generating the second behavior feature vector corresponding to each operation type.
2. The method of claim 1, wherein the translation layer comprises a translation matrix corresponding to each operation type;
performing standardized conversion on each first behavior feature vector in the behavior feature vector sequence to generate a behavior conversion vector when a sample user generates an operation behavior each time under each operation type, wherein the behavior conversion vector comprises:
under each operation type, multiplying each first behavior feature vector by the conversion matrix respectively to generate a behavior conversion vector when a sample user generates an operation behavior each time under each operation type;
when the recurrent neural network is trained, parameter adjustment is carried out on the recurrent neural network, and each element value of the conversion matrix is adjusted according to the conversion layer.
3. The method according to claim 1, wherein the obtaining of the abnormal risk probability of the target operation behavior of the user to be tested comprises:
inputting the behavior feature vector sequence to be tested corresponding to each operation type and the time interval sequence to be tested into a recurrent neural network to obtain a third behavior feature vector under each operation type;
and splicing the third behavior feature vectors corresponding to each operation type, and inputting the spliced vectors into a classification neural network to obtain the abnormal risk probability of the target operation behavior of the user to be detected.
4. The method of claim 3, wherein the recurrent neural network comprises: a conversion layer and an encoding layer;
the obtaining of the third behavior feature vector under each operation type includes:
inputting the behavior feature vector sequence to be tested corresponding to each operation type into the conversion layer, and performing standardized conversion on each behavior feature vector to be tested in the behavior feature vector sequence to be tested through the conversion layer to obtain a behavior conversion vector to be tested when the user to be tested generates an operation behavior each time;
and inputting the time interval of each operation behavior of the user to be detected in the time interval sequence to be detected and the to-be-detected behavior conversion vector into the coding layer for coding, generating a third behavior feature vector corresponding to each operation type.
5. The method of claim 4, wherein the translation layer corresponds to a translation matrix for each operation type;
and carrying out standardized conversion on each behavior feature vector to be detected in the behavior feature vector sequence to be detected to generate a behavior conversion vector to be detected when the user to be detected generates an operation behavior each time under each operation type, wherein the behavior conversion vector to be detected comprises the following steps:
and under each operation type, multiplying each first behavior feature vector by the conversion matrix respectively to generate a to-be-detected behavior conversion vector when the to-be-detected user generates an operation behavior each time under each operation type.
6. An electronic device, comprising: processor, memory and bus, the memory storing machine readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, the machine readable instructions when executed by the processor performing the steps of the method of detecting abnormal operating behavior according to any one of claims 1 to 5.
CN201811180634.4A 2018-10-09 2018-10-09 Method for detecting abnormal operation behavior Active CN109345260B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811180634.4A CN109345260B (en) 2018-10-09 2018-10-09 Method for detecting abnormal operation behavior


Publications (2)

Publication Number Publication Date
CN109345260A CN109345260A (en) 2019-02-15
CN109345260B true CN109345260B (en) 2021-11-30

Family

ID=65308584


Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110033120A (en) * 2019-03-06 2019-07-19 阿里巴巴集团控股有限公司 For providing the method and device that risk profile energizes service for trade company
CN111723200A (en) * 2019-03-20 2020-09-29 京东数字科技控股有限公司 Method and system for determining user behavior characteristics
CN110189134B (en) * 2019-05-17 2023-01-31 同济大学 Suspected fraud transaction reference ordinal-based network payment anti-fraud system architecture design method
CN110191113B (en) * 2019-05-24 2021-09-24 新华三信息安全技术有限公司 User behavior risk assessment method and device
CN110263106B (en) * 2019-06-25 2020-02-21 中国人民解放军国防科技大学 Collaborative public opinion fraud detection method and device
CN110363649A (en) * 2019-06-27 2019-10-22 上海淇馥信息技术有限公司 A kind of method for prewarning risk based on user operation case, device, electronic equipment
CN110335144B (en) * 2019-07-10 2023-04-07 中国工商银行股份有限公司 Personal electronic bank account security detection method and device
CN111178523B (en) * 2019-08-02 2023-06-06 腾讯科技(深圳)有限公司 Behavior detection method and device, electronic equipment and storage medium
CN112347457A (en) * 2019-08-06 2021-02-09 上海晶赞融宣科技有限公司 Abnormal account detection method and device, computer equipment and storage medium
CN110662169B (en) * 2019-09-25 2021-04-27 北京明略软件系统有限公司 Terminal equipment matching method and device
CN111047332B (en) * 2019-11-13 2021-05-07 支付宝(杭州)信息技术有限公司 Model training and risk identification method, device and equipment
CN111274907B (en) * 2020-01-16 2023-04-25 支付宝(中国)网络技术有限公司 Method and apparatus for determining category labels of users using category recognition model
CN111274501B (en) * 2020-02-25 2023-04-18 支付宝(杭州)信息技术有限公司 Method, system and non-transitory storage medium for pushing information
CN111340112B (en) * 2020-02-26 2023-09-26 腾讯科技(深圳)有限公司 Classification method, classification device and classification server
CN111708995A (en) * 2020-06-12 2020-09-25 中国建设银行股份有限公司 Service processing method, device and equipment
CN111709754B (en) * 2020-06-12 2023-08-25 中国建设银行股份有限公司 User behavior feature extraction method, device, equipment and system
US20230111652A1 (en) * 2020-06-16 2023-04-13 Paypal, Inc. Training a Recurrent Neural Network Machine Learning Model with Behavioral Data
CN111836064B (en) * 2020-07-02 2022-01-07 北京字节跳动网络技术有限公司 Live broadcast content identification method and device
CN112037001A (en) * 2020-09-03 2020-12-04 云账户技术(天津)有限公司 Money printing risk prediction model training method, money printing risk prediction method and device
CN112037052B (en) * 2020-11-04 2021-01-26 上海冰鉴信息科技有限公司 User behavior detection method and device
CN112491875B (en) * 2020-11-26 2022-07-08 四川长虹电器股份有限公司 Intelligent tracking safety detection method and system based on account system
CN112634026A (en) * 2020-12-30 2021-04-09 四川新网银行股份有限公司 Credit fraud identification method based on user page operation behavior
CN113362069A (en) * 2021-06-01 2021-09-07 深圳前海微众银行股份有限公司 Dynamic adjustment method, device and equipment of wind control model and readable storage medium
CN113591932A (en) * 2021-07-06 2021-11-02 北京淇瑀信息科技有限公司 User abnormal behavior processing method and device based on support vector machine
KR102591483B1 (en) * 2022-11-16 2023-10-19 주식회사우경정보기술 Apparatus and method for spatiotemporal neural network-based labeling for building anomaly data set
CN116433242B (en) * 2023-02-28 2023-10-31 王宇轩 Fraud detection method based on attention mechanism
CN116259110B (en) * 2023-05-09 2023-08-08 杭州木兰科技有限公司 Security detection method, device, equipment and storage medium for ATM protection cabin
CN117176478B (en) * 2023-11-02 2024-02-02 南京怡晟安全技术研究院有限公司 Network security practical training platform construction method and system based on user operation behaviors

Citations (4)

Publication number Priority date Publication date Assignee Title
US7181768B1 (en) * 1999-10-28 2007-02-20 Cigital Computer intrusion detection system and method based on application monitoring
CN107481019A (en) * 2017-07-28 2017-12-15 上海携程商务有限公司 Order fraud recognition methods, system, storage medium and electronic equipment
CN108428132A (en) * 2018-03-15 2018-08-21 阿里巴巴集团控股有限公司 Fraudulent trading recognition methods, device, server and storage medium
CN108446374A (en) * 2018-03-16 2018-08-24 北京三快在线科技有限公司 User view prediction technique, device, electronic equipment, storage medium

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20190259033A1 (en) * 2015-06-20 2019-08-22 Quantiply Corporation System and method for using a data genome to identify suspicious financial transactions

Non-Patent Citations (1)

Title
Research on Identifying Fraudulent Users on B2B E-commerce Platforms; Zheng Yiman; China Master's Theses Full-text Database, Economics and Management Sciences; 2014-07-15 (No. 07); pp. J157-125 *

Similar Documents

Publication Publication Date Title
CN109345260B (en) Method for detecting abnormal operation behavior
CN109409896B (en) Bank fraud recognition model training method, bank fraud recognition method and device
CN110009174B (en) Risk recognition model training method and device and server
CN109410036A (en) A kind of fraud detection model training method and device and fraud detection method and device
CN109302410B (en) Method and system for detecting abnormal behavior of internal user and computer storage medium
CN110830448B (en) Target event flow abnormity detection method and device, electronic equipment and medium
CN110163242B (en) Risk identification method and device and server
CN110827138B (en) Push information determining method and device
CN109389494B (en) Loan fraud detection model training method, loan fraud detection method and device
CN111737546B (en) Method and device for determining entity service attribute
CN110264270B (en) Behavior prediction method, behavior prediction device, behavior prediction equipment and storage medium
CN112580952A (en) User behavior risk prediction method and device, electronic equipment and storage medium
CN114549001A (en) Method and device for training risk transaction recognition model and recognizing risk transaction
CN112990989B (en) Value prediction model input data generation method, device, equipment and medium
CN115935265B (en) Method for training risk identification model, risk identification method and corresponding device
CN111951008A (en) Risk prediction method and device, electronic equipment and readable storage medium
CN110910241A (en) Cash flow evaluation method, apparatus, server device and storage medium
CN113409050B (en) Method and device for judging business risk based on user operation
CN112632219B (en) Method and device for intercepting junk short messages
CN114722954A (en) Content exception handling method and device for evaluation information
CN110570301B (en) Risk identification method, device, equipment and medium
CN109272398B (en) Operation request processing system
CN113435900A (en) Transaction risk determination method and device and server
CN113052604A (en) Object detection method, device, equipment and storage medium
CN113420789A (en) Method, device, storage medium and computer equipment for predicting risk account

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant