CN110929799B - Method, electronic device, and computer-readable medium for detecting abnormal user


Info

Publication number
CN110929799B
CN110929799B (application CN201911200519.3A)
Authority
CN
China
Prior art keywords
detection result
user
model
detection
abnormal user
Prior art date
Legal status
Active
Application number
CN201911200519.3A
Other languages
Chinese (zh)
Other versions
CN110929799A
Inventor
何莹
Current Assignee
Shanghai Shengpay E-Payment Service Co., Ltd.
Original Assignee
Shanghai Shengpay E-Payment Service Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Shanghai Shengpay E-Payment Service Co., Ltd.
Priority to CN201911200519.3A
Publication of CN110929799A
Application granted
Publication of CN110929799B
Legal status: Active

Classifications

    • G06F18/23 — Pattern recognition; analysing; clustering techniques
    • G06N3/045 — Neural networks; architectures; combinations of networks
    • G06N3/08 — Neural networks; learning methods
    • G06Q40/04 — Finance; trading or exchange, e.g. stocks, commodities, derivatives or currency exchange


Abstract

Embodiments of the present disclosure disclose a method, an electronic device, and a computer-readable medium for detecting an abnormal user. In one embodiment, the method comprises the following steps: inputting a behavior data set of a target user in a target time period into a pre-trained detection model to generate a model detection result; generating a condition detection result based on whether behavior data in the behavior data set satisfies an abnormal user determination condition in a predetermined abnormal user determination condition set; and generating, based on the model detection result and the condition detection result, a target detection result indicating whether the target user is an abnormal user. This embodiment improves the accuracy of abnormal user detection, enables automatic and efficient risk monitoring and management over massive data, and helps reduce the occupation of resources such as CPU and bandwidth by abnormal users, thereby ensuring normal use of those resources by normal users.

Description

Method, electronic device, and computer-readable medium for detecting abnormal user
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular, to a method, an electronic device, and a computer-readable medium for detecting an abnormal user.
Background
Generally, users can be classified into different types according to their characteristics.
For example, a user may be considered an abnormal user when he or she exhibits a high frequency of value-resource exchange behavior (e.g., transaction behavior) over a certain period of time.
Detection of abnormal users is currently an important aspect of network security. Existing detection methods generally rely solely on the personal experience of risk-control staff to identify risky transaction behavior, keep operations safe and convenient, and judge whether a user is an abnormal user.
Disclosure of Invention
Embodiments of the present disclosure propose a method, an electronic device, and a computer-readable medium for detecting an abnormal user.
In a first aspect, embodiments of the present disclosure provide a method for detecting an abnormal user, the method comprising: inputting a behavior data set of a target user in a target time period into a pre-trained detection model to generate a model detection result, wherein the detection model is used for determining whether the user corresponding to the input behavior data set is an abnormal user, and the model detection result is the detection result generated by the detection model; generating a condition detection result based on whether behavior data in the behavior data set satisfies an abnormal user determination condition in a predetermined abnormal user determination condition set, wherein the condition detection result is the detection result generated based on the determination conditions in that set; and generating a target detection result based on the model detection result and the condition detection result, wherein the target detection result indicates whether the target user is an abnormal user.
In a second aspect, embodiments of the present disclosure provide an apparatus for detecting an abnormal user, the apparatus comprising: an input unit configured to input a behavior data set of a target user in a target time period into a pre-trained detection model to generate a model detection result, wherein the detection model is used for determining whether the user corresponding to the input behavior data set is an abnormal user, and the model detection result is the detection result generated by the detection model; a first generation unit configured to generate a condition detection result based on whether behavior data in the behavior data set satisfies an abnormal user determination condition in a predetermined abnormal user determination condition set, wherein the condition detection result is the detection result generated based on the determination conditions in that set; and a second generation unit configured to generate a target detection result based on the model detection result and the condition detection result, wherein the target detection result indicates whether the target user is an abnormal user.
In a third aspect, embodiments of the present disclosure provide an electronic device, comprising: one or more processors; and a storage device having one or more programs stored thereon, which when executed by the one or more processors, cause the one or more processors to implement a method as in any of the embodiments of the method for detecting an abnormal user described above.
In a fourth aspect, embodiments of the present disclosure provide a computer readable medium having stored thereon a computer program which, when executed by a processor, implements a method as in any of the embodiments of the method for detecting an abnormal user described above.
The method, electronic device, and computer-readable medium for detecting an abnormal user provided by embodiments of the present disclosure first generate a model detection result by inputting a behavior data set of a target user in a target time period into a pre-trained detection model, where the detection model is used for determining whether the user corresponding to the input behavior data set is an abnormal user and the model detection result is the detection result generated by the detection model. A condition detection result is then generated based on whether the behavior data in the behavior data set satisfies an abnormal user determination condition in a predetermined abnormal user determination condition set, where the condition detection result is the detection result generated based on those determination conditions. Finally, a target detection result indicating whether the target user is an abnormal user is generated based on the model detection result and the condition detection result. This improves the accuracy of abnormal user detection and helps reduce the occupation of resources such as CPU (Central Processing Unit) and bandwidth by abnormal users, thereby ensuring normal use of those resources by normal users.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings, in which:
FIG. 1 is an exemplary system architecture diagram in which an embodiment of the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a method for detecting an abnormal user according to the present application;
FIG. 3 is a schematic illustration of one application scenario of a method for detecting an abnormal user according to the present application;
FIG. 4 is a flow chart of yet another embodiment of a method for detecting an abnormal user according to the present application;
FIGS. 5A-5B are schematic diagrams of yet another application scenario of a method for detecting an abnormal user according to the present application;
FIG. 6 is a schematic diagram of a computer system suitable for use in implementing embodiments of the present disclosure.
Detailed Description
The present application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the method for detecting an abnormal user of embodiments of the present disclosure may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages (e.g., a set of behavioral data of the target user over a target period of time), etc. Various communication client applications, such as a payment type application, financial software, a web browser application, a shopping type application, a search type application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices including, but not limited to, smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the above-listed electronic devices, implemented either as multiple pieces of software or software modules (e.g., software or software modules for providing distributed services) or as a single piece of software or software module. No particular limitation is imposed herein.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103. The background server may obtain a behavior data set (e.g., user account number, transaction time, transaction amount, transaction duration, transaction count, transaction frequency, transaction characteristics, transaction trend, transaction period, transaction type) of the target user within the target time period from the terminal devices 101, 102, 103, input the acquired behavior data set into a pre-trained detection model to generate a model detection result, generate a condition detection result based on whether behavior data in the behavior data set satisfies an abnormal user determination condition in a predetermined abnormal user determination condition set, and generate a target detection result based on the model detection result and the condition detection result.
It should be noted that, the method for detecting an abnormal user provided by the embodiment of the present disclosure may be performed by the server 105, may be performed by the terminal devices 101, 102, 103, or may be performed by the terminal devices 101, 102, 103 and the server 105 in cooperation with each other. Accordingly, each part (e.g., each unit) included in the means for detecting an abnormal user may be provided in the server 105, may be provided in the terminal devices 101, 102, 103, or may be provided in the server 105 and the terminal devices 101, 102, 103, respectively.
The server may be hardware or software. When the server is hardware, the server may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules (e.g., software or software modules for providing distributed services), or as a single software or software module. The present invention is not particularly limited herein.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. When the electronic device on which the method for detecting an abnormal user operates does not need to perform data transmission with other electronic devices, the system architecture may include only the electronic device on which the method for detecting an abnormal user operates.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for detecting an abnormal user in accordance with the present application is shown. The method for detecting abnormal users comprises the following steps:
step 201, inputting a behavior data set of a target user in a target time period into a pre-trained detection model to generate a model detection result.
In this embodiment, an execution subject of the method for detecting an abnormal user (e.g., the server or a terminal device shown in FIG. 1) may input a behavior data set of a target user in a target time period into a pre-trained detection model to generate a model detection result. The detection model is used for determining whether the user corresponding to the input behavior data set is an abnormal user, and the model detection result is the detection result, generated by the detection model, indicating whether the target user is an abnormal user.
Here, the target user may be a user on whom abnormality detection is to be performed. The target time period may be any historical time period; as an example, it may run from 0:00 on November 11, 2019 to 0:00 on November 12, 2019. The behavior data set may be a set of data generated by various operations of the target user. As an example, the behavior data set may include at least one of: user account number, transaction time, transaction amount, transaction duration, transaction count, transaction frequency, transaction characteristics, transaction trend, transaction period, transaction type.
In practice, the behavior data sets may be characterized by vectors, matrices, etc.
The model detection result can be characterized by words; for example, it can be "yes" or "no". It may also be represented by a number; for example, a model detection result of "0" may represent that the target user is an abnormal user, and "1" may represent that the target user is not an abnormal user (i.e., is a normal user). The model detection result can also be characterized by a matrix or a vector. For example, the execution body may generate, for each piece of behavior data in the behavior data set, a probability that the corresponding user is an abnormal user, and use each probability as an element of a matrix or vector characterizing the model detection result. As another example, the execution body may generate, for each piece of behavior data, a probability that the corresponding user is an abnormal user; if that probability is greater than a preset probability threshold (for example, 50%), 1 is used as the corresponding element of the matrix or vector characterizing the model detection result, and if it is less than or equal to the threshold, 0 is used.
Here, when the model detection result is characterized by a matrix or a vector, the execution body may determine whether the model detection result indicates that the target user is an abnormal user based on each element in the matrix or the vector. For example, the execution subject may determine whether the model detection result indicates that the target user is an abnormal user by calculating a magnitude relation between a mean value of each element in the matrix or the vector and a preset threshold. For another example, the execution subject may determine whether the model detection result indicates that the target user is an abnormal user by calculating a magnitude relation between the number of elements exceeding the first preset threshold and the second preset threshold among the respective elements in the matrix or the vector.
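As a concrete illustration of the two aggregation strategies just described, the following is a minimal sketch; the threshold values and function names are illustrative assumptions rather than part of the disclosure:

```python
import numpy as np

def verdict_by_mean(probabilities, threshold=0.5):
    # Strategy 1: compare the mean of the per-record abnormality
    # probabilities against a preset threshold.
    return float(np.mean(probabilities)) > threshold

def verdict_by_count(probabilities, first_threshold=0.5, second_threshold=2):
    # Strategy 2: count the elements exceeding a first preset threshold,
    # then compare that count against a second preset threshold.
    exceeding = np.sum(np.asarray(probabilities) > first_threshold)
    return int(exceeding) > second_threshold

# Per-record abnormality probabilities for one target user.
scores = [0.9, 0.6, 0.7, 0.1]
print(verdict_by_mean(scores))   # True: mean 0.575 > 0.5
print(verdict_by_count(scores))  # True: 3 elements exceed 0.5, and 3 > 2
```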
In practice, when the execution subject is a terminal device, the target user may operate an application installed on it, thereby generating behavior data; the execution subject, or a server communicatively connected to it, may then use the set of behavior data generated by the target user in the target time period as the behavior data set. When the execution subject is a server, the target user may operate an application supported by it and installed on a terminal device communicatively connected to it, thereby generating behavior data; the execution subject, or the terminal device communicatively connected to it, may then use the set of behavior data generated by the target user during the target time period as the behavior data set.
In some alternative implementations of the present embodiment, the behavior data set may also be obtained by:
First, a user data set of the target user within the target time period is acquired, the user data set comprising at least one of: user account number, transaction time, transaction amount, transaction duration, transaction count, transaction frequency, transaction characteristics, transaction trend, transaction period, transaction type.
And then, carrying out data cleaning on the user data set to obtain a cleaned data set.
And then, carrying out data characteristic derivation on the cleaned data set to obtain a derived data set.
And finally, adopting a principal component analysis method to reduce the dimension of the derived data set, and taking the data set obtained after the dimension reduction as a behavior data set.
It can be appreciated that data cleaning, data feature derivation, and dimension reduction of the behavior data set improve both the accuracy and the speed with which the subsequent steps generate the target detection result. A sketch of this pipeline follows.
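A minimal sketch of this optional pipeline, assuming pandas and scikit-learn; the column names and the number of retained components are illustrative assumptions:

```python
import pandas as pd
from sklearn.decomposition import PCA

def build_behavior_data_set(user_df: pd.DataFrame, n_components: int = 5):
    # Step 1: data cleaning - drop duplicates and normalize formats,
    # filling missing continuous features with the mean and missing
    # categorical features with the mode.
    cleaned = user_df.drop_duplicates().copy()
    cleaned["transaction_time"] = pd.to_datetime(cleaned["transaction_time"])
    cleaned["transaction_amount"] = cleaned["transaction_amount"].fillna(
        cleaned["transaction_amount"].mean())
    cleaned["transaction_type"] = cleaned["transaction_type"].fillna(
        cleaned["transaction_type"].mode().iloc[0])

    # Step 2: data feature derivation - e.g., maximum, average, and total
    # transaction amount derived on top of the original data.
    cleaned["max_amount"] = cleaned["transaction_amount"].max()
    cleaned["avg_amount"] = cleaned["transaction_amount"].mean()
    cleaned["sum_amount"] = cleaned["transaction_amount"].sum()

    # Step 3: principal component analysis for dimension reduction; the
    # reduced data set is used as the behavior data set.
    numeric = cleaned.select_dtypes("number")
    return PCA(n_components=n_components).fit_transform(numeric)
```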
Here, the detection model may be a convolutional neural network model that is trained using a machine learning algorithm based on a predetermined set of training samples. Wherein each training sample in the training sample set may correspond to a user. The training samples in the training sample set may include input data and desired output data. The input data may be a single user's behavioral data set. The desired output data may be used to indicate whether the user is an abnormal user.
Step 202, generating a condition detection result based on whether the behavior data in the behavior data set meets the abnormal user judgment condition in the predetermined abnormal user judgment condition set.
In this embodiment, the execution body may generate the condition detection result based on whether the behavior data in the behavior data set meets the abnormal user determination condition in the predetermined abnormal user determination condition set. The condition detection result is a detection result which is generated based on the abnormal user judgment condition in the abnormal user judgment condition set and is used for indicating whether the target user is the abnormal user or not.
The abnormal user determination condition may be a condition for determining whether a user is an abnormal user. For example, an abnormal user determination condition may be: the number of transactions by the user in the target time period exceeds 50. As another example, an abnormal user determination condition may be: within the target time period, the user has more than 10 transactions whose amount each exceeds ten thousand.
The condition detection result may be characterized by text, for example, the condition detection result may be "yes" or "no". Alternatively, the condition detection result may also be represented by a number, for example, when the condition detection result is "0", it may be represented that the target user is an abnormal user, and when the condition detection result is "1", it may be represented that the target user is not an abnormal user. Alternatively, the condition detection result may also be characterized by a matrix or vector. For example, the execution body may generate, for each piece of behavior data in the behavior data set, a probability that a user corresponding to the piece of behavior data is an abnormal user, and use each probability as an element of a matrix or vector for representing the condition detection result.
Each abnormal user determination condition in the set may correspond to a probability characterizing that a user matching it is abnormal, so the execution subject can use the probability of the determination condition that a piece of behavior data matches as the probability that the corresponding user is an abnormal user. As another example, the execution body may generate, for each piece of behavior data in the behavior data set, a probability that the corresponding user is an abnormal user; if that probability is greater than a preset probability threshold (for example, 50%), 1 is used as the corresponding element of the matrix or vector characterizing the condition detection result, and if it is less than or equal to the threshold, 0 is used.
Here, when the condition detection result is characterized by a matrix or a vector, the above-described execution body may determine whether the condition detection result indicates that the target user is an abnormal user based on each element in the matrix or the vector. For example, the above-described execution subject may determine whether the condition detection result indicates that the target user is an abnormal user by calculating a magnitude relation between a mean value of each element in the matrix or vector and a preset threshold value. For another example, the execution subject may determine whether the condition detection result indicates that the target user is an abnormal user by calculating a magnitude relation between the number of elements exceeding the first preset threshold and a preset number among the respective elements in the matrix or the vector.
Here, the execution subject may generate the condition detection result in various ways.
For example, when the behavior data set contains behavior data that satisfies an abnormal user determination condition in the predetermined set, the execution subject may generate a condition detection result indicating that the target user is an abnormal user; when it contains no such behavior data, the execution subject may generate a condition detection result indicating that the target user is not an abnormal user.
Alternatively, when the number of pieces of behavior data in the behavior data set that satisfy an abnormal user determination condition in the predetermined set is greater than the number that satisfy none, the execution subject may generate a condition detection result indicating that the target user is an abnormal user; otherwise, it may generate a condition detection result indicating that the target user is not an abnormal user.
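As an illustration of the first ("any match") strategy, here is a minimal sketch using the two example determination conditions given earlier; the field names and rule encodings are assumptions made for the example:

```python
from typing import Callable, Dict, List

# Each abnormal user determination condition maps one behavior record
# (here a summary dict) to True/False.
CONDITION_SET: List[Callable[[Dict], bool]] = [
    lambda r: r["transaction_count"] > 50,        # more than 50 transactions
    lambda r: r["large_transaction_count"] > 10,  # >10 transactions over 10,000
]

def condition_detection_result(behavior_data_set: List[Dict]) -> bool:
    # "Any match": the target user is flagged as abnormal as soon as any
    # behavior data satisfies any determination condition in the set.
    return any(cond(record)
               for record in behavior_data_set
               for cond in CONDITION_SET)

records = [{"transaction_count": 72, "large_transaction_count": 3}]
print(condition_detection_result(records))  # True, since 72 > 50
```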
In step 203, a target detection result is generated based on the model detection result and the condition detection result.
In this embodiment, the execution body may generate the target detection result based on the model detection result and the condition detection result. The target detection result is used for indicating whether the target user is an abnormal user or not.
As an example, when the model detection result and the condition detection result indicate the same outcome, the execution subject may take either of them as the target detection result. When the two are inconsistent, the execution body may randomly select one of them and use the randomly selected detection result as the target detection result.
As yet another example, when the model detection result and the condition detection result are inconsistent, the execution subject may rely on a selection operation by a relevant person (for example, the person responsible for detecting whether users are abnormal) and take the detection result indicated by that selection operation (i.e., the model detection result or the condition detection result) as the target detection result.
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for detecting an abnormal user according to the present embodiment. In the application scenario of fig. 3, the server 301 first inputs the behavior data set 3001 of the target user in the target period into the pre-trained detection model 3002, and generates a model detection result 3004 (in fig. 3, the model detection result 3004 indicates that the target user is an abnormal user), where the detection model 3002 is used to determine whether the user corresponding to the input behavior data set is an abnormal user. The model detection result 3004 is a detection result generated by the detection model 3002.
Then, the server 301 generates a condition detection result 3005 based on whether the behavior data in the behavior data set 3001 meets the abnormal user determination condition in the predetermined abnormal user determination condition set 3003 (in fig. 3, the condition detection result 3005 indicates that the target user is not an abnormal user). The condition detection result 3005 is a detection result generated based on the abnormal user determination condition in the abnormal user determination condition set 3003.
Finally, the server 301 generates a target detection result 3006 based on the model detection result 3004 and the condition detection result 3005 (in fig. 3, the target detection result 3006 indicates that the target user is not an abnormal user). The target detection result 3006 is used to indicate whether the target user is an abnormal user.
According to the method provided by the embodiments of the present application, a behavior data set of a target user in a target time period is input into a pre-trained detection model to generate a model detection result, where the detection model is used for determining whether the user corresponding to the input behavior data set is an abnormal user. A condition detection result is then generated based on whether the behavior data in the behavior data set satisfies an abnormal user determination condition in a predetermined abnormal user determination condition set. Finally, a target detection result indicating whether the target user is an abnormal user is generated based on the model detection result and the condition detection result. Because the final detection result (i.e., the target detection result) combines the model detection result generated by the detection model with the condition detection result generated from the determination condition set, the accuracy of abnormal user detection is improved, effective risk monitoring and management can be realized automatically and efficiently over massive data, the occupation of resources such as CPU and bandwidth by abnormal users can be reduced, and normal use of those resources by normal users can be ensured.
In some optional implementations of this embodiment, in a case where the model detection result and the detection result indicated by the condition detection result are inconsistent, the execution body may further execute at least one of:
the first item, determining whether to update the abnormal user decision condition set.
Second, it is determined whether to continue training the detection model.
It can be understood that, when the model detection result and the condition detection result are inconsistent, either the condition detection result determined from the abnormal user determination condition set or the model detection result determined by the detection model may have low accuracy. In this scenario, updating the determination condition set or continuing to train the detection model can raise the accuracy of the respective detection results, so that a more accurate target detection result is obtained with the updated condition set and the further-trained model, further improving the accuracy of abnormal user detection.
In some optional implementations of the present embodiment, when the model detection result and the condition detection result are inconsistent and it is to be determined whether to update the abnormal user determination condition set, the determination may be made as follows: in response to receiving information indicating that the set should be updated, the abnormal user determination condition set is updated based on the behavior data set.
The information for instructing to update the abnormal user determination condition set may be transmitted to the execution subject by a person (for example, a technician) through an electronic device used by the person, or may be directly input to the execution subject by the person.
It will be appreciated that in this alternative implementation, the executing entity may determine whether the abnormal user determination condition set needs to be updated based on experience of the relevant person.
Alternatively, the execution subject may update the abnormal user determination condition set when the accuracy of the condition detection results obtained from it is smaller than a preset accuracy threshold, so that the set can be updated automatically.
The accuracy of the condition detection result obtained based on the abnormal user judgment condition set can be obtained by the following steps:
first, a test sample set is obtained. Wherein each test sample in the set of test samples corresponds to a user. Each test sample comprises a behavior data set of a user corresponding to the test sample and expected result data representing whether the user corresponding to the test sample is an abnormal user.
And a second step of generating, for each test sample in the test sample set, a condition detection result based on whether the behavior data in that sample's behavior data set satisfies an abnormal user determination condition in the set, and taking that condition detection result, which indicates whether the corresponding user is abnormal, as the actual result data for the test sample. The method of generating the condition detection result here may refer to step 202 above and is not repeated.
And a third step of determining, for each piece of actual result data obtained in the second step, whether it indicates the same meaning as the corresponding expected result data (i.e., both indicating that the user is an abnormal user, or both indicating that the user is not), so as to count how many pieces of actual result data match their corresponding expected result data.
And a fourth step of determining the ratio of the number obtained in the third step to the number of test samples included in the test sample set obtained in the first step as the accuracy of the condition detection result obtained based on the abnormal user determination condition set.
In some optional implementations of the present embodiment, in a case where it is determined whether to continue training the detection model in response to the detection result indicated by the model detection result and the condition detection result not being identical, determining whether to continue training the detection model includes: in response to receiving the information indicating to continue training, continuing to train the detection model based on the set of behavioral data.
The information for indicating the continuation of training may be sent to the execution subject by a related person (e.g., a technician) through an electronic device used by the related person, or may be directly input to the execution subject by the related person.
It will be appreciated that in this alternative implementation, the executing entity may determine whether to continue training the detection model based on experience of the relevant person.
Optionally, the executing body may continue training the detection model when the accuracy of the detection model is less than a preset accuracy threshold, so that training can be resumed automatically.
The accuracy of the detection model can be obtained by the following steps (a generic sketch follows the steps):
step one, a test sample set is obtained. Wherein each test sample in the set of test samples corresponds to a user. Each test sample comprises a behavior data set of a user corresponding to the test sample and expected result data representing whether the user corresponding to the test sample is an abnormal user.
And step two, sequentially inputting each test sample in the test sample set into the detection model to obtain a model detection result indicating whether the user corresponding to the test sample is an abnormal user, and taking that model detection result as the actual result data corresponding to the test sample.
And step three, determining, for each piece of actual result data obtained in step two, whether it indicates the same meaning as the corresponding expected result data (i.e., both indicating that the user is an abnormal user, or both indicating that the user is not), so as to count how many pieces of actual result data match their corresponding expected result data.
And step four, determining the ratio of the number obtained in the step three to the number of the test samples included in the test sample set as the accuracy of the detection model.
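Both accuracy procedures above (for the abnormal user determination condition set and for the detection model) reduce to the same comparison over a test sample set. The following generic sketch makes that explicit; the `detect` callable stands for either detector and is an assumption for illustration:

```python
from typing import Callable, List, Tuple

def accuracy(test_samples: List[Tuple[list, bool]],
             detect: Callable[[list], bool]) -> float:
    # Each test sample pairs a behavior data set with the expected
    # "is abnormal" flag (steps one and two above).
    matched = sum(1 for behavior_data, expected in test_samples
                  if detect(behavior_data) == expected)
    # Steps three and four: the ratio of results indicating the same
    # meaning as the expected result data to the total number of samples.
    return matched / len(test_samples)
```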
In some alternative implementations of this embodiment, the model detection result and the condition detection result are respectively represented by numerical values. And, the execution body may further execute step 203 in the following manner:
Determining whether the target user is an abnormal user based on a weighted summation of the model detection result and the condition detection result, and generating the target detection result accordingly. The weight of the model detection result and the weight of the condition detection result are positively correlated with the accuracies of the detection results generated, respectively, by the detection model and from the abnormal user determination condition set.
As an example, when the model detection result and the condition detection result are characterized by a single numerical value, the above-described execution body may determine whether the target user is an abnormal user based on a magnitude relation between a result of weighted summation of the model detection result and the condition detection result and a preset numerical value, and generate the target detection result.
As yet another example, when the model detection result and the condition detection result are characterized by a matrix or vector (i.e., a plurality of values), the above-described execution body may determine whether the target user is an abnormal user based on a magnitude relation between the number of elements larger than the first preset value and the second preset value in the result of weighted summation of the model detection result and the condition detection result, and generate the target detection result. Alternatively, the execution subject may determine whether the target user is an abnormal user based on a magnitude relation between an average value of each element in a result of weighted summation of the model detection result and the condition detection result and a preset average value, and generate the target detection result.
It can be appreciated that this alternative implementation provides multiple ways of generating the target detection result, enriching its generation; in some situations, two or more of the ways described here may also be combined, further improving the accuracy of abnormal user detection. A sketch of the weighted-summation fusion follows.
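A minimal sketch of the weighted-summation fusion for the single-value case; taking the weights as normalized accuracies is an illustrative choice consistent with the positive correlation stated above, not a prescribed formula:

```python
def fused_target_detection(model_result: float, condition_result: float,
                           model_accuracy: float, condition_accuracy: float,
                           preset_value: float = 0.5) -> bool:
    # Weights positively correlated with each source's accuracy
    # (here: simple normalization, an assumption).
    total = model_accuracy + condition_accuracy
    fused = (model_accuracy / total) * model_result \
          + (condition_accuracy / total) * condition_result
    # Compare the weighted sum against a preset value.
    return fused > preset_value

# Example: the model leans abnormal (0.8), the rules lean normal (0.3).
print(fused_target_detection(0.8, 0.3,
                             model_accuracy=0.9, condition_accuracy=0.6))
# True: 0.6 * 0.8 + 0.4 * 0.3 = 0.6 > 0.5
```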
In some optional implementations of this embodiment, in a case where the target detection result indicates that the target user is an abnormal user, the executing body may further execute a predetermined abnormal user control operation. The abnormal user management operation may be an operation for managing and/or controlling the abnormal user.
It can be understood that, when the target detection result indicates that the target user is an abnormal user, the abnormal user can be managed and/or controlled correspondingly by executing the predetermined abnormal user management operation, so that effective risk monitoring and management can be automatically and efficiently realized for massive data, occupation of resources such as a CPU and a bandwidth by the abnormal user can be reduced, and normal use of the resources by a normal user can be ensured.
In some optional implementations of the present embodiment, the abnormal user management operation includes at least one of:
the first item limits the rights of the target user.
It can be understood that limiting the rights of an abnormal user can prevent losses to related parties caused by actions of the abnormal user such as improper use and misoperation.
And second, sending prompt information for indicating abnormal operation to the target user.
It can be appreciated that in some cases the target user's account has been stolen by others, or a background program is running without authorization, either of which may cause the target detection result to indicate that the target user is an abnormal user; the prompt information alerts the genuine user to such abnormal operations.
And thirdly, associating the target user with a preset label.
It will be appreciated that associating the target user with a preset tag distinguishes abnormal users from normal users, so that the two types of users can be managed differently in subsequent processing.
In some alternative implementations of the present embodiment, the target user is a consumer user (rather than a merchant).
It will be appreciated that in the case where the target user is a consumer user, the alternative implementation may enable anomaly detection for the consumer user.
With further reference to fig. 4, a flow 400 of yet another embodiment of a method for detecting an abnormal user is shown. The process 400 of the method for detecting an abnormal user comprises the steps of:
in step 401, a predetermined training sample set is obtained.
In this embodiment, the execution subject of the method for detecting an abnormal user (such as the server or a terminal device shown in FIG. 1) may acquire a predetermined training sample set locally or from other electronic devices through a wired or wireless connection. Each training sample in the training sample set corresponds to a user and includes a behavior data set of the user corresponding to that training sample.
Here, the behavior data set may be a set of data generated by various operations of the user. As an example, the behavioral data set may include at least one of: user account number, transaction time, transaction amount, transaction duration, transaction count, transaction frequency, transaction characteristics, transaction trend, transaction period, transaction type.
In practice, the behavior data sets may be characterized by vectors, matrices, etc.
Step 402, for each anomaly detection algorithm in a predetermined anomaly detection algorithm set, performing anomaly detection on the training sample set by adopting the anomaly detection algorithm to obtain a candidate model corresponding to the anomaly detection algorithm.
In this embodiment, for each anomaly detection algorithm in the predetermined anomaly detection algorithm set, the executing body may perform anomaly detection on the training sample set by using the anomaly detection algorithm to obtain a candidate model corresponding to the anomaly detection algorithm. The candidate model characterizes whether the behavior data set included in the training sample corresponds to an abnormal user.
Here, the anomaly detection algorithm in the above-described anomaly detection algorithm set may be used to perform anomaly detection on the training sample set. As an example, the anomaly detection algorithm may include, but is not limited to, any of the following: classification-based anomaly detection algorithms, nearest neighbor-based anomaly detection algorithms, cluster-based anomaly detection algorithms, statistical-based anomaly detection algorithms (e.g., gaussian model-based anomaly detection algorithms, regression model-based anomaly detection algorithms, mixed parameter distribution-based anomaly detection algorithms, histogram-based anomaly detection algorithms, kernel function-based anomaly detection algorithms, density estimation-based anomaly detection algorithms).
In some cases, the individual anomaly detection algorithms in the set of anomaly detection algorithms described above may be different from one another.
In some alternative implementations of the present embodiment, the set of anomaly detection algorithms may include the following: a Gaussian-distribution-based algorithm, a density-based clustering algorithm, an isolated forest algorithm, and the local outlier factor algorithm (Local Outlier Factor, LOF).
These algorithms are well known to those skilled in the art and are not described in detail herein.
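For illustration, a minimal sketch of step 402 using scikit-learn counterparts of the four listed algorithms; EllipticEnvelope stands in for the Gaussian-distribution approach, and all hyperparameter values are assumptions:

```python
import numpy as np
from sklearn.cluster import DBSCAN                  # density-based clustering
from sklearn.covariance import EllipticEnvelope     # Gaussian distribution
from sklearn.ensemble import IsolationForest        # isolated forest
from sklearn.neighbors import LocalOutlierFactor    # local outlier factor (LOF)

ALGORITHM_SET = {
    "gaussian": EllipticEnvelope(contamination=0.05),
    "density_clustering": DBSCAN(eps=0.5, min_samples=5),
    "isolation_forest": IsolationForest(n_estimators=100, random_state=0),
    "lof": LocalOutlierFactor(n_neighbors=20),
}

def train_candidate_models(training_samples: np.ndarray) -> dict:
    # One candidate model per anomaly detection algorithm in the set.
    return {name: algorithm.fit(training_samples)
            for name, algorithm in ALGORITHM_SET.items()}
```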
It will be appreciated that by performing step 402 described above, a candidate model corresponding to each anomaly detection algorithm in the set may be obtained. Because evaluation indexes such as recall (Recall), precision (Precision), F-measure (F-Measure), macro-average (macro-average), and micro-average (micro-average) of the candidate model corresponding to each anomaly detection algorithm are difficult to predict before that algorithm is used to perform anomaly detection on the training sample set, training one candidate model per anomaly detection algorithm lets the detection model selected through the subsequent steps excel in recall, precision, F-measure, macro-average, micro-average, and so on.
In some optional implementations of this embodiment, for "performing anomaly detection on the training sample set using the anomaly detection algorithm to obtain a candidate model corresponding to the anomaly detection algorithm" in step 402, the executing body may perform the following steps:
the first step, selecting at least two groups of parameter combinations from a predetermined parameter set of the abnormality detection algorithm. Wherein the number of parameters in each set of parameter combinations may be an integer greater than or equal to 0.
Here, each anomaly detection algorithm may correspond to a set of parameters. For example, when the anomaly detection algorithm is an isolated forest algorithm, the set of parameters corresponding to the anomaly detection algorithm may include, but is not limited to: the number of samples, the number of selected characteristics, the number of trees, the number of layers of each tree, and the like. When the anomaly detection algorithm is a clustering algorithm, the parameter set corresponding to the anomaly detection algorithm may include, but is not limited to: the number of clusters, parameters for filtering noise, etc.
And a second step of setting a parameter value of each parameter in the at least two sets of parameter combinations to a parameter value set in advance for the parameter.
Here, the parameter value set in advance for the parameter may be a parameter value set empirically by a technician or may be a default parameter value of the parameter. It should be noted that, for the same parameter, a technician may set a plurality of different parameter values for the same parameter.
And thirdly, aiming at each group of parameter combinations in at least two groups of parameter combinations, carrying out anomaly detection on the training sample set based on the anomaly detection algorithm and parameter values set for each parameter in the group of parameter combinations, and obtaining candidate models corresponding to the anomaly detection algorithm and the group of parameter combinations.
It will be appreciated that, since each anomaly detection algorithm may correspond to multiple parameters, it is often difficult to predict, before model training, which parameter values will yield a model with better performance (e.g., the highest score determined from evaluation indexes such as recall, precision, F-value, macro-average, and micro-average). This alternative implementation trains one candidate model for each parameter combination of each anomaly detection algorithm, as the sketch below illustrates, so that the detection model selected through the subsequent steps performs better in terms of recall, precision, F-value, macro-average, micro-average, and so on.
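A minimal sketch of the per-combination training for one algorithm, using the isolated forest parameters named in the example above; the grid values themselves are assumptions:

```python
from itertools import product
from sklearn.ensemble import IsolationForest

# Predetermined parameter set for the isolated forest algorithm.
PARAM_GRID = {
    "n_estimators": [50, 100, 200],  # number of trees
    "max_samples": [256, 512],       # number of samples drawn per tree
    "max_features": [0.5, 1.0],      # fraction of features selected
}

def isolation_forest_candidates(training_samples):
    # Step 3 above: one candidate model per parameter combination.
    keys = list(PARAM_GRID)
    for values in product(*(PARAM_GRID[k] for k in keys)):
        params = dict(zip(keys, values))
        yield params, IsolationForest(**params).fit(training_samples)
```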
And step 403, taking the candidate models which meet the preset selection conditions in the obtained candidate models as detection models.
In this embodiment, the execution body may use, as the detection model, a candidate model that meets a preset selection condition from among the obtained candidate models.
The preset selection condition may be the highest accuracy, or the highest weighted sum of the candidate model's accuracy, silhouette coefficient, adjusted Rand index, Calinski-Harabasz index, and Fowlkes-Mallows score.
Here, the accuracy, silhouette coefficient, adjusted Rand index, Calinski-Harabasz index, and Fowlkes-Mallows score are model evaluation indexes well known to and widely studied by those skilled in the art, and are not described herein.
In some optional implementations of this embodiment, the executing body may further execute the step 403 in the following manner:
Based on evaluation indexes such as accuracy, silhouette coefficient, adjusted Rand index, Calinski-Harabasz index, and Fowlkes-Mallows score, taking the candidate model determined by a voting mechanism from among the obtained candidate models as the detection model.
It can be appreciated that this alternative implementation uses a voting mechanism to determine the detection model, improving the detection model's performance in terms of accuracy, silhouette coefficient, adjusted Rand index, Calinski-Harabasz index, Fowlkes-Mallows score, and other evaluation indexes. A sketch of such a voting mechanism follows.
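One way such a voting mechanism could look, as a minimal sketch: each evaluation index casts a vote for the candidate it scores highest, and the candidate with the most votes becomes the detection model. The mix of label-based and label-free indexes and the tie-breaking rule are assumptions:

```python
from collections import Counter
from sklearn.metrics import (adjusted_rand_score, calinski_harabasz_score,
                             fowlkes_mallows_score, silhouette_score)

def select_detection_model(X, true_labels, candidate_labels: dict) -> str:
    # candidate_labels maps each candidate model's name to the labels it
    # assigned to the samples in X.
    indexes = {
        "silhouette": lambda y: silhouette_score(X, y),
        "adjusted_rand": lambda y: adjusted_rand_score(true_labels, y),
        "calinski_harabasz": lambda y: calinski_harabasz_score(X, y),
        "fowlkes_mallows": lambda y: fowlkes_mallows_score(true_labels, y),
    }
    votes = Counter()
    for score in indexes.values():
        # Each evaluation index casts one vote for its best candidate.
        best = max(candidate_labels, key=lambda n: score(candidate_labels[n]))
        votes[best] += 1
    return votes.most_common(1)[0][0]  # ties broken by insertion order
```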
Step 404, inputting the behavior data set of the target user in the target time period into a pre-trained detection model to generate a model detection result.
In this embodiment, the execution body may input the behavior data set of the target user in the target period into a pre-trained detection model, and generate a model detection result. The detection model is used for determining whether the user corresponding to the input behavior data set is an abnormal user or not. The model detection result is a detection result generated by the detection model and used for indicating whether the target user is an abnormal user or not.
Step 405, generating a condition detection result based on whether the behavior data in the behavior data set meets the abnormal user judgment condition in the predetermined abnormal user judgment condition set.
In this embodiment, the execution body may generate the condition detection result based on whether the behavior data in the behavior data set meets the abnormal user determination condition in the predetermined abnormal user determination condition set. The condition detection result is a detection result which is generated based on the abnormal user judgment condition in the abnormal user judgment condition set and is used for indicating whether the target user is the abnormal user or not.
In step 406, a target detection result is generated based on the model detection result and the condition detection result.
In this embodiment, the execution body may generate the target detection result based on the model detection result and the condition detection result. The target detection result is used for indicating whether the target user is an abnormal user or not.
In this embodiment, the steps 404, 405, 406 may be substantially identical to the steps 201, 202, 203 in the corresponding embodiment of fig. 2, and will not be described herein.
With continued reference to fig. 5A-5B, fig. 5A-5B are schematic diagrams of yet another application scenario of the method for detecting an abnormal user according to the present embodiment. It should be noted that the schematic diagrams of fig. 5A-5B are only examples, and should not be construed as limiting the present application.
In fig. 5A, an executing body of a method for detecting an abnormal user in an embodiment of the present disclosure may first acquire a historical transaction data set 501. Wherein each historical transaction data in the set of historical transaction data may include, but is not limited to, at least one of the following for a single user: user account number, transaction time, transaction amount, transaction duration, transaction count, transaction frequency, transaction characteristics, transaction trend, transaction period, transaction type.
The execution entity may then perform data preprocessing (e.g., data cleansing), feature engineering (e.g., data feature derivation, dimension reduction), etc. on the historical transaction data set 501 to obtain a processed data set 502.
Here, different cleansing rules may be designed for different types of non-standard data, and the data format may be adjusted or modified accordingly. For data with missing values, the features may be divided into time features, categorical features, and continuous features, which are filled with the mode, the mean, and the like, respectively. The preprocessed data are then transformed and derived to obtain new features, such as maximum transaction amount, average transaction amount, and total transaction amount, on the basis of the original data; finally, principal component analysis is used to select features and reduce the feature dimension, as sketched below.
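A minimal sketch of this preprocessing chain (missing-value filling by feature type, feature derivation, PCA dimension reduction) might look as follows; the column names and the 95% variance target are assumptions for the example.

```python
# Illustrative sketch: fill missing values by feature type, derive
# per-user aggregate features, then reduce dimension with PCA.
import pandas as pd
from sklearn.decomposition import PCA

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    # Missing values: mode for categorical features, mean for continuous.
    df["transaction_type"] = df["transaction_type"].fillna(
        df["transaction_type"].mode().iloc[0])
    df["transaction_amount"] = df["transaction_amount"].fillna(
        df["transaction_amount"].mean())
    # Feature derivation: aggregates on top of the original columns.
    per_user = df.groupby("user_account").agg(
        max_amount=("transaction_amount", "max"),
        mean_amount=("transaction_amount", "mean"),
        total_amount=("transaction_amount", "sum"),
        txn_count=("transaction_amount", "count"),
    )
    # Dimension reduction with principal component analysis, keeping
    # enough components to explain 95% of the variance.
    reduced = PCA(n_components=0.95).fit_transform(per_user)
    return pd.DataFrame(reduced, index=per_user.index)

raw = pd.DataFrame({
    "user_account": ["u1", "u1", "u2", "u2"],
    "transaction_amount": [100.0, None, 250.0, 80.0],
    "transaction_type": ["transfer", None, "payment", "payment"],
})
features = preprocess(raw)
```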
The execution body may then partition the processed data set 502; for example, the processed data set 502 may be divided into a training sample set 503 and a test sample set 504 in a certain proportion.
Then, for each anomaly detection algorithm in the predetermined anomaly detection algorithm set, the executing body may perform anomaly detection on the training sample set using that algorithm, obtaining a candidate model corresponding to it. In fig. 5A, the anomaly detection algorithms include a statistical algorithm, a clustering algorithm, an isolation forest algorithm, and a local outlier factor algorithm. Thus, the executing body may perform anomaly detection on the training sample set 503 using a statistical algorithm 505, a clustering algorithm 506, an isolation forest algorithm 507, and a local outlier factor algorithm 508, obtaining a candidate model 509 corresponding to the statistical algorithm 505, a candidate model 510 corresponding to the clustering algorithm 506, a candidate model 511 corresponding to the isolation forest algorithm 507, and a candidate model 512 corresponding to the local outlier factor algorithm 508.
Here, in the process of performing anomaly detection on the training sample set with an anomaly detection algorithm, different parameter sets may be configured for each algorithm, thereby establishing different candidate models corresponding to that algorithm. For example, when the anomaly detection algorithm is the isolation forest algorithm, the corresponding parameter set may include, but is not limited to: the number of samples, the number of selected features, the number of trees, the depth of each tree, and the like. When the anomaly detection algorithm is a clustering algorithm, the corresponding parameter set may include, but is not limited to: the number of clusters, parameters for filtering noise, and the like.
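The following sketch illustrates building one candidate model per (algorithm, parameter combination) pair. The parameter grids are invented stand-ins for values a practitioner would tune, and a statistical baseline (e.g., a z-score rule) is omitted for brevity.

```python
# Illustrative sketch: candidate models over algorithm x parameter grids.
from itertools import product

import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 8))  # hypothetical processed training set

candidates = []
# Isolation forest: vary the number of trees and the per-tree sample count.
for n_estimators, max_samples in product([50, 100], [128, 256]):
    model = IsolationForest(n_estimators=n_estimators,
                            max_samples=max_samples, random_state=0)
    candidates.append((f"iforest_{n_estimators}_{max_samples}",
                       model.fit(X_train)))
# Clustering: eps acts as DBSCAN's noise-filtering parameter.
for eps, min_samples in product([0.5, 1.0], [5, 10]):
    candidates.append((f"dbscan_{eps}_{min_samples}",
                       DBSCAN(eps=eps, min_samples=min_samples).fit(X_train)))
# Local outlier factor: vary the neighborhood size.
for n_neighbors in [10, 20]:
    lof = LocalOutlierFactor(n_neighbors=n_neighbors, novelty=True)
    candidates.append((f"lof_{n_neighbors}", lof.fit(X_train)))
```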
Then, based on an odd number of evaluation metrics among the accuracy, silhouette coefficient, adjusted Rand index, Calinski-Harabasz index, and Fowlkes-Mallows score, the execution body may use the candidate model determined by the voting mechanism from the obtained candidate models 509-512 as the detection model 513.
Next, please continue to refer to fig. 5B.
In fig. 5B, having obtained a detection model 513 (for example, the detection model 513 obtained by the method shown in fig. 5A), the above-described execution body first acquires a user data set 515 of the target user within the target time period. The user data set 515 may include, but is not limited to, at least one of the following for a single user: user account number, transaction time, transaction amount, transaction duration, transaction count, transaction frequency, transaction characteristics, transaction trend, transaction period, transaction type.
The execution body may then perform data preprocessing (e.g., data cleansing) and feature engineering (e.g., data feature derivation, dimension reduction) on the user data set 515 to obtain the behavior data set 516.
Here, as in fig. 5A, different cleansing rules may be designed for different types of non-standard data, and the data format may be adjusted or modified accordingly. For data with missing values, the features may be divided into time features, categorical features, and continuous features, which are filled with the mode, the mean, and the like, respectively. The preprocessed data are then transformed and derived to obtain new features, such as maximum transaction amount, average transaction amount, and total transaction amount, on the basis of the original data; finally, principal component analysis is used to select features and reduce the feature dimension.
Thereafter, the execution body may input the behavior data set 516 into the pre-trained detection model 513 to generate a model detection result 517, the model detection result 517 being a detection result generated by the detection model 513. The execution body may also generate a condition detection result 518 based on whether the behavior data in the behavior data set 516 meet an abnormal user judgment condition in the predetermined abnormal user judgment condition set 514, the condition detection result 518 being a detection result generated based on the abnormal user judgment conditions in that set.
Finally, the execution body may generate a target detection result 519 based on the model detection result 517 and the condition detection result 518. The target detection result 519 is used to indicate whether the target user is an abnormal user.
Optionally, in a case where the detection results indicated by the model detection result 517 and the condition detection result 518 are inconsistent, the executing body may further execute at least one of the following: determining whether to update the abnormal user judgment condition set; determining whether to continue training the detection model.
Here, in a case where information instructing to continue training is received, the execution body may continue training the detection model 513 based on the behavior data set 516. Alternatively, in a case where information instructing to update the abnormal user judgment condition set is received, the execution body may update the abnormal user judgment condition set 514 based on the behavior data set 516.
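A minimal sketch of this feedback handling follows; the function name, its arguments, and the appended example rule are all hypothetical.

```python
# Illustrative sketch of the optional feedback loop after an inconsistent
# detection: continue training the model and/or extend the judgment
# condition set, according to the received instructions.
def handle_disagreement(model_result, condition_result, behavior_data,
                        detection_model, condition_set,
                        retrain=False, update_conditions=False):
    """Apply the operator's instructions when the two results disagree."""
    if model_result == condition_result:
        return detection_model, condition_set   # nothing to reconcile
    if retrain:
        detection_model.fit(behavior_data)      # continue training
    if update_conditions:
        # Extend the judgment condition set with a new (invented) rule.
        condition_set.append(
            lambda b: b.get("transaction_trend") == "spike")
    return detection_model, condition_set
```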
As can be seen from fig. 4, the process 400 of the method for detecting an abnormal user in this embodiment highlights the training process of the detection model; the accuracy of detecting abnormal users can thus be further improved.
Referring now to fig. 6, a schematic diagram of an electronic device (e.g., server or terminal device of fig. 1) 600 suitable for use in implementing embodiments of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), car terminals (e.g., car navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 6 is merely an example and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 6, the electronic device 600 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the electronic device 600. The processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
In general, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, magnetic tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 shows an electronic device 600 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead. Each block shown in fig. 6 may represent one device or a plurality of devices as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via communication means 609, or from storage means 608, or from ROM 602. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing means 601. It should be noted that, the computer readable medium described in the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present application may be written in one or more programming languages, including object-oriented programming languages such as Python, Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As another aspect, the present application also provides a computer-readable medium that may be contained in the electronic device described in the above embodiment; or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: inputting a behavior data set of a target user in a target time period into a pre-trained detection model to generate a model detection result, wherein the detection model is used for determining whether the user corresponding to the input behavior data set is an abnormal user or not, and the model detection result is a detection result generated by the detection model; generating a condition detection result based on whether the behavior data in the behavior data set accords with an abnormal user judgment condition in a predetermined abnormal user judgment condition set, wherein the condition detection result is a detection result generated based on the abnormal user judgment condition in the abnormal user judgment condition set; and generating a target detection result based on the model detection result and the condition detection result, wherein the target detection result is used for indicating whether the target user is an abnormal user or not.
The foregoing description is only of the preferred embodiments of the present application and is presented as a description of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the invention referred to in this application is not limited to the specific combinations of features described above, and is also intended to cover other technical solutions formed by any combination of the above features or their equivalents without departing from the inventive concept, for example, technical solutions in which the above features are replaced with (but not limited to) technical features with similar functions disclosed in this application.

Claims (14)

1. A method for detecting an abnormal user, comprising:
inputting a behavior data set of a target user in a target time period into a pre-trained detection model to generate a model detection result, wherein the detection model is used for determining whether the user corresponding to the input behavior data set is an abnormal user or not, and the behavior data set is a set of data generated by various operations of the target user;
generating a condition detection result based on whether the behavior data in the behavior data set accords with an abnormal user judgment condition in a predetermined abnormal user judgment condition set;
generating a target detection result based on the model detection result and the condition detection result, wherein the target detection result is used for indicating whether the target user is an abnormal user or not;
the detection model is obtained through training the following steps:
acquiring a predetermined training sample set, wherein each training sample in the training sample set corresponds to one user, and the training samples in the training sample set comprise a behavior data set of the user corresponding to the training sample;
for each anomaly detection algorithm in a predetermined anomaly detection algorithm set, performing anomaly detection on the training sample set by adopting the anomaly detection algorithm to obtain a candidate model corresponding to the anomaly detection algorithm, wherein the candidate model characterizes whether the behavior data set included in a training sample corresponds to an abnormal user or not;
and taking the candidate models which meet the preset selection conditions in the obtained candidate models as detection models.
2. The method of claim 1, wherein the employing the anomaly detection algorithm to perform anomaly detection on the training sample set to obtain a candidate model corresponding to the anomaly detection algorithm comprises:
selecting at least two groups of parameter combinations from a predetermined parameter set of the abnormality detection algorithm;
setting a parameter value of each parameter in the at least two sets of parameter combinations to a parameter value set in advance for the parameter;
and aiming at each group of parameter combinations in the at least two groups of parameter combinations, carrying out anomaly detection on the training sample set based on the anomaly detection algorithm and parameter values set for each parameter in the group of parameter combinations to obtain candidate models corresponding to the anomaly detection algorithm and the group of parameter combinations.
3. The method according to claim 1, wherein the using, as the detection model, the candidate model satisfying the preset selection condition from among the obtained candidate models comprises:
based on an odd number of evaluation metrics among the accuracy, silhouette coefficient, adjusted Rand index, Calinski-Harabasz index, and Fowlkes-Mallows score, using the candidate model determined by a voting mechanism from the obtained candidate models as the detection model.
4. The method of claim 1, wherein an anomaly detection algorithm in the set of anomaly detection algorithms is any one of:
a statistical algorithm, a clustering algorithm, an isolation forest algorithm, a local outlier factor algorithm.
5. The method according to one of claims 1-4, wherein the method further comprises:
in response to the detection results indicated by the model detection result and the condition detection result being inconsistent, performing at least one of:
determining whether to update the abnormal user judgment condition set;
determining whether to continue training the detection model.
6. The method of claim 5, wherein, in the case where it is determined whether to continue training the detection model in response to the detection results indicated by the model detection result and the condition detection result being inconsistent, the determining whether to continue training the detection model comprises:
in response to receiving information indicating to continue training, continuing to train the detection model based on the behavior data set.
7. The method of claim 5, wherein, in the case where it is determined whether to update the abnormal user judgment condition set in response to the detection results indicated by the model detection result and the condition detection result being inconsistent, the determining whether to update the abnormal user judgment condition set comprises:
in response to receiving information indicating to update the abnormal user judgment condition set, updating the abnormal user judgment condition set based on the behavior data set.
8. The method according to one of claims 1-4, wherein the model detection result and the condition detection result are each characterized by a numerical value; and
the generating a target detection result based on the model detection result and the condition detection result includes:
determining whether the target user is an abnormal user based on a result of weighted summation of the model detection result and the condition detection result, and generating the target detection result, wherein the weight of the model detection result and the weight of the condition detection result are positively correlated with the accuracy of the detection results, generated based on the detection model and the abnormal user judgment condition set respectively, for indicating whether a user is an abnormal user.
9. Method according to one of claims 1-4, wherein the behavior data set is obtained by:
acquiring a user data set of the target user in the target time period, wherein the user data set comprises at least one of the following: user account number, transaction time, transaction amount, transaction duration, transaction count, transaction frequency, transaction characteristics, transaction trend, transaction period, transaction type;
performing data cleaning on the user data set to obtain a cleaned data set;
performing data characteristic derivation on the cleaned data set to obtain a derived data set;
and adopting a principal component analysis method to reduce the dimension of the derived data set, and taking the data set obtained after the dimension reduction as a behavior data set.
10. The method according to one of claims 1-4, wherein the method further comprises:
in response to the target detection result indicating that the target user is an abnormal user, executing a predetermined abnormal user control operation.
11. The method of claim 10, wherein the abnormal user control operation comprises at least one of:
limiting the authority of the target user;
sending prompt information for indicating abnormal operation to the target user;
and associating the target user with a preset label.
12. The method of one of claims 1-4, wherein the target user is a consumer user.
13. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-12.
14. A computer readable medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1-12.
CN201911200519.3A 2019-11-29 2019-11-29 Method, electronic device, and computer-readable medium for detecting abnormal user Active CN110929799B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911200519.3A CN110929799B (en) 2019-11-29 2019-11-29 Method, electronic device, and computer-readable medium for detecting abnormal user

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911200519.3A CN110929799B (en) 2019-11-29 2019-11-29 Method, electronic device, and computer-readable medium for detecting abnormal user

Publications (2)

Publication Number Publication Date
CN110929799A CN110929799A (en) 2020-03-27
CN110929799B true CN110929799B (en) 2023-05-12

Family

ID=69847840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911200519.3A Active CN110929799B (en) 2019-11-29 2019-11-29 Method, electronic device, and computer-readable medium for detecting abnormal user

Country Status (1)

Country Link
CN (1) CN110929799B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111612037A (en) * 2020-04-24 2020-09-01 平安直通咨询有限公司上海分公司 Abnormal user detection method, device, medium and electronic equipment
CN112052185B (en) * 2020-09-29 2023-11-10 北京百度网讯科技有限公司 Exception handling method and device for applet, electronic equipment and storage medium
CN112199640B (en) * 2020-09-30 2024-03-12 广州市百果园网络科技有限公司 Abnormal user auditing method and device, electronic equipment and storage medium
CN112445679B (en) * 2020-11-13 2023-01-06 度小满科技(北京)有限公司 Information detection method, device, server and storage medium
CN113191824A (en) * 2021-05-24 2021-07-30 北京大米科技有限公司 Data processing method and device, electronic equipment and readable storage medium
CN113722707A (en) * 2021-11-02 2021-11-30 西安热工研究院有限公司 Database abnormal access detection method, system and equipment based on distance measurement
CN117057941B (en) * 2023-09-14 2024-03-26 上海甄汇信息科技有限公司 Abnormal consumption detection method based on multidimensional data analysis

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107801090A (en) * 2017-11-03 2018-03-13 北京奇虎科技有限公司 Utilize the method, apparatus and computing device of audio-frequency information detection anomalous video file
JP2018051721A (en) * 2016-09-30 2018-04-05 キヤノン株式会社 Abnormality detection device, abnormality detection method, and program
CN108509979A (en) * 2018-02-28 2018-09-07 努比亚技术有限公司 A kind of method for detecting abnormality, server and computer readable storage medium
CN109388548A (en) * 2018-09-29 2019-02-26 北京京东金融科技控股有限公司 Method and apparatus for generating information
CN109886290A (en) * 2019-01-08 2019-06-14 平安科技(深圳)有限公司 Detection method, device, computer equipment and the storage medium of user's request
CN109919684A (en) * 2019-03-18 2019-06-21 上海盛付通电子支付服务有限公司 For generating method, electronic equipment and the computer readable storage medium of information prediction model
CN109936561A (en) * 2019-01-08 2019-06-25 平安科技(深圳)有限公司 User request detection method and device, computer equipment and storage medium
WO2019128552A1 (en) * 2017-12-29 2019-07-04 Oppo广东移动通信有限公司 Information pushing method, apparatus, terminal, and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7941382B2 (en) * 2007-10-12 2011-05-10 Microsoft Corporation Method of classifying and active learning that ranks entries based on multiple scores, presents entries to human analysts, and detects and/or prevents malicious behavior

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2018051721A (en) * 2016-09-30 2018-04-05 キヤノン株式会社 Abnormality detection device, abnormality detection method, and program
CN107801090A (en) * 2017-11-03 2018-03-13 北京奇虎科技有限公司 Utilize the method, apparatus and computing device of audio-frequency information detection anomalous video file
WO2019128552A1 (en) * 2017-12-29 2019-07-04 Oppo广东移动通信有限公司 Information pushing method, apparatus, terminal, and storage medium
CN108509979A (en) * 2018-02-28 2018-09-07 努比亚技术有限公司 A kind of method for detecting abnormality, server and computer readable storage medium
CN109388548A (en) * 2018-09-29 2019-02-26 北京京东金融科技控股有限公司 Method and apparatus for generating information
CN109886290A (en) * 2019-01-08 2019-06-14 平安科技(深圳)有限公司 Detection method, device, computer equipment and the storage medium of user's request
CN109936561A (en) * 2019-01-08 2019-06-25 平安科技(深圳)有限公司 User request detection method and device, computer equipment and storage medium
CN109919684A (en) * 2019-03-18 2019-06-21 上海盛付通电子支付服务有限公司 For generating method, electronic equipment and the computer readable storage medium of information prediction model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Yu Bingjie; Xia Zhanguo; Wang Jiulong. Anomaly detection algorithm based on Gaussian process model. Computer Engineering and Design. 2016, (04), full text. *
Zhu Jiajun; Chen Gong; Shi Yong; Xue Zhi. Abnormal behavior detection based on user profiles. Communications Technology. 2017, (10), full text. *

Also Published As

Publication number Publication date
CN110929799A (en) 2020-03-27

Similar Documents

Publication Publication Date Title
CN110929799B (en) Method, electronic device, and computer-readable medium for detecting abnormal user
US10547618B2 (en) Method and apparatus for setting access privilege, server and storage medium
CN108108743B (en) Abnormal user identification method and device for identifying abnormal user
CN110442712B (en) Risk determination method, risk determination device, server and text examination system
CN109471783B (en) Method and device for predicting task operation parameters
CN110659657B (en) Method and device for training model
CN109961032B (en) Method and apparatus for generating classification model
CN110866040B (en) User portrait generation method, device and system
CN108595448B (en) Information pushing method and device
CN112348321A (en) Risk user identification method and device and electronic equipment
CN110008926B (en) Method and device for identifying age
CN114780338A (en) Host information processing method and device, electronic equipment and computer readable medium
CN110704614B (en) Information processing method and device for predicting user group type in application
CN112116397A (en) User behavior characteristic real-time processing method and device, storage medium and electronic equipment
CN115187364A (en) Method and device for monitoring deposit risk under bank distributed scene
CN112685799B (en) Device fingerprint generation method and device, electronic device and computer readable medium
US20220207284A1 (en) Content targeting using content context and user propensity
CN114925275A (en) Product recommendation method and device, computer equipment and storage medium
CN115048561A (en) Recommendation information determination method and device, electronic equipment and readable storage medium
CN111949860B (en) Method and apparatus for generating a relevance determination model
CN113052509A (en) Model evaluation method, model evaluation apparatus, electronic device, and storage medium
US11303683B2 (en) Methods and systems for managing distribution of online content based on content maturity
CN111898027A (en) Method, device, electronic equipment and computer readable medium for determining feature dimension
CN116911304B (en) Text recommendation method and device
CN113362097B (en) User determination method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant