CN110929799A - Method, electronic device, and computer-readable medium for detecting abnormal user - Google Patents

Method, electronic device, and computer-readable medium for detecting abnormal user

Info

Publication number
CN110929799A
Authority
CN
China
Prior art keywords: detection result, user, model, detection, abnormal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911200519.3A
Other languages
Chinese (zh)
Other versions
CN110929799B (en)
Inventor
何莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sheng Electronic Payment Services Ltd
Original Assignee
Shanghai Sheng Electronic Payment Services Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sheng Electronic Payment Services Ltd filed Critical Shanghai Sheng Electronic Payment Services Ltd
Priority to CN201911200519.3A
Publication of CN110929799A
Application granted
Publication of CN110929799B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G06N3/08 - Learning methods
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 - Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 - Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Debugging And Monitoring (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Embodiments of the present disclosure disclose methods, electronic devices, and computer-readable media for detecting anomalous users. One embodiment of the method comprises: inputting a behavior data set of a target user in a target time period into a pre-trained detection model to generate a model detection result; generating a condition detection result based on whether behavior data in the behavior data set meets an abnormal user judgment condition in a predetermined abnormal user judgment condition set; and generating a target detection result based on the model detection result and the condition detection result, where the target detection result indicates whether the target user is an abnormal user. The embodiment improves the accuracy of abnormal user detection, automatically and efficiently enables effective risk monitoring and management over massive data, and helps reduce the occupation of resources such as CPU (central processing unit) time and bandwidth by abnormal users, thereby ensuring the normal use of those resources by normal users.

Description

Method, electronic device, and computer-readable medium for detecting abnormal user
Technical Field
Embodiments of the present disclosure relate to the field of computer technologies, and in particular, to a method, an electronic device, and a computer-readable medium for detecting an abnormal user.
Background
Generally, users can be classified into different types according to their characteristics.
For example, a user may be considered an anomalous user when he or she exchanges resources of value (e.g., performs transactions) at a high frequency within a certain period of time.
Currently, the detection of abnormal users is an important aspect of network security. Conventional abnormal user detection generally relies only on the personal experience of risk-control personnel to spot risky transaction behavior, keep operations safe and convenient, and judge whether a user is an abnormal user.
Disclosure of Invention
Embodiments of the present disclosure propose methods, electronic devices, and computer-readable media for detecting anomalous users.
In a first aspect, an embodiment of the present disclosure provides a method for detecting an abnormal user, the method including: inputting a behavior data set of a target user in a target time period into a pre-trained detection model to generate a model detection result, wherein the detection model is used for determining whether the user corresponding to an input behavior data set is an abnormal user, and the model detection result is the detection result generated by the detection model; generating a condition detection result based on whether behavior data in the behavior data set meets an abnormal user judgment condition in a predetermined abnormal user judgment condition set, wherein the condition detection result is a detection result generated based on the judgment conditions in that set; and generating a target detection result based on the model detection result and the condition detection result, wherein the target detection result is used for indicating whether the target user is an abnormal user.
In a second aspect, an embodiment of the present disclosure provides an apparatus for detecting an abnormal user, the apparatus including: the input unit is configured to input a behavior data set of a target user in a target time period into a pre-trained detection model, and generate a model detection result, wherein the detection model is used for determining whether a user corresponding to the input behavior data set is an abnormal user, and the model detection result is a detection result generated by the detection model; a first generating unit configured to generate a condition detection result based on whether the behavior data in the behavior data set meets an abnormal user determination condition in a predetermined abnormal user determination condition set, wherein the condition detection result is a detection result generated based on the abnormal user determination condition in the abnormal user determination condition set; and a second generating unit configured to generate a target detection result based on the model detection result and the condition detection result, wherein the target detection result is used for indicating whether the target user is an abnormal user.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; a storage device having one or more programs stored thereon, which when executed by the one or more processors, cause the one or more processors to implement the method of any of the embodiments of the method for detecting an anomalous user as described above.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable medium, on which a computer program is stored, which when executed by a processor, implements the method of any of the embodiments of the method for detecting an anomalous user as described above.
Embodiments of the present disclosure provide a method, an electronic device, and a computer-readable medium for detecting an abnormal user. A model detection result is generated by inputting a behavior data set of a target user in a target time period into a pre-trained detection model, where the detection model is used to determine whether the user corresponding to an input behavior data set is an abnormal user, and the model detection result is the detection result generated by that model. A condition detection result is then generated based on whether behavior data in the behavior data set meets an abnormal user judgment condition in a predetermined judgment condition set. Finally, a target detection result, indicating whether the target user is an abnormal user, is generated based on the model detection result and the condition detection result. This improves the accuracy of abnormal user detection, reduces the occupation of resources such as CPU (central processing unit) time and bandwidth by abnormal users, and ensures the normal use of those resources by normal users.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for detecting anomalous users in accordance with the present application;
FIG. 3 is a schematic diagram of an application scenario of a method for detecting anomalous users in accordance with the present application;
FIG. 4 is a flow diagram of yet another embodiment of a method for detecting anomalous users in accordance with the present application;
FIGS. 5A-5B are schematic diagrams of yet another application scenario of a method for detecting anomalous users in accordance with the present application;
FIG. 6 is a schematic block diagram of a computer system suitable for use with an electronic device implementing embodiments of the present disclosure.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the method for detecting anomalous users of embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 over the network 104 to receive or send messages (e.g., a set of behavioral data of a target user over a target time period), and so on. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as payment-type applications, financial software, web browser applications, shopping-type applications, search-type applications, instant messaging tools, mailbox clients, social platform software, and the like.
The terminal apparatuses 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices, including but not limited to smart phones, tablet computers, electronic book readers, MP3 players (Moving Picture Experts Group Audio Layer III, mpeg Audio Layer 3), MP4 players (Moving Picture Experts Group Audio Layer IV, mpeg Audio Layer 4), laptop portable computers, desktop computers, and the like. When the terminal apparatuses 101, 102, 103 are software, they can be installed in the electronic apparatuses listed above. It may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services) or as a single piece of software or software module. And is not particularly limited herein.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103. The background server may obtain a set of behavior data (e.g., user account number, transaction time, transaction amount, transaction duration, transaction number, transaction frequency, transaction characteristics, transaction trend, transaction period, transaction type) of the target user in the target time period from the terminal devices 101, 102, 103. It may then input the acquired behavior data set into a pre-trained detection model to generate a model detection result, generate a condition detection result based on whether behavior data in the behavior data set meets an abnormal user judgment condition in a predetermined judgment condition set, and generate a target detection result based on the model detection result and the condition detection result.
It should be noted that the method for detecting an abnormal user provided by the embodiment of the present disclosure may be executed by the server 105, or may be executed by the terminal devices 101, 102, and 103 and the server 105 in cooperation with each other. Accordingly, each part (for example, each unit) included in the apparatus for detecting an abnormal user may be provided in the server 105, or may be provided in the terminal devices 101, 102, and 103, or may be provided in the server 105 and the terminal devices 101, 102, and 103, respectively.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. When the electronic device on which the method for detecting an abnormal user operates does not need to perform data transmission with other electronic devices, the system architecture may include only the electronic device on which the method for detecting an abnormal user operates.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for detecting anomalous users in accordance with the present application is shown. The method for detecting the abnormal user comprises the following steps:
step 201, inputting a behavior data set of a target user in a target time period into a pre-trained detection model, and generating a model detection result.
In this embodiment, an executing subject (for example, a server or a terminal device shown in fig. 1) of the method for detecting an abnormal user may input a behavior data set of a target user in a target time period into a pre-trained detection model, and generate a model detection result. The detection model is used for determining whether a user corresponding to the input behavior data set is an abnormal user, and the model detection result is generated by the detection model and used for indicating whether a target user is the abnormal user.
Here, the target user may be a user on whom abnormality detection is to be performed, and the target time period may be any historical time period. As an example, the target time period may run from 0:00 on November 11, 2019 to 0:00 on December 11, 2019. The behavior data set may be a set of data generated by various operations of the target user. As an example, the behavior data set may include at least one of: user account number, transaction time, transaction amount, transaction duration, transaction number, transaction frequency, transaction characteristics, transaction trend, transaction period, and transaction type.
In practice, the behavioral data set may be characterized in the form of vectors, matrices, and the like.
The model detection result may be characterized by words, for example, the model detection result may be yes or no; the model detection result may also be represented numerically, for example, when the model detection result is "0", it may represent that the target user is an abnormal user, and when the model detection result is "1", it may represent that the target user is not an abnormal user (i.e., a normal user); the model detection result can also be characterized by a matrix or a vector. For example, the execution subject may generate, for each behavior data in the behavior data set, a probability that a user corresponding to the behavior data is an abnormal user, so that each probability is used as an element of a matrix or a vector used to characterize the model detection result. For another example, the execution subject may generate, for each behavior data in the behavior data set, a probability that the user corresponding to the behavior data is an abnormal user, and if the probability that the user corresponding to the behavior data is an abnormal user is greater than a preset probability threshold (e.g., 50%), take 1 as an element of a matrix or a vector used for characterizing the model detection result, and if the probability that the user corresponding to the behavior data is an abnormal user is less than or equal to the preset probability threshold (e.g., 50%), take 0 as an element of the matrix or the vector used for characterizing the model detection result.
Here, when the model detection result is characterized by a matrix or a vector, the execution body may determine whether the model detection result indicates that the target user is an abnormal user based on each element in the matrix or the vector. For example, the execution subject may determine whether the model detection result indicates that the target user is an abnormal user by calculating a magnitude relationship between a mean value of each element in the matrix or the vector and a preset threshold value. For another example, the execution subject may determine whether the model detection result indicates that the target user is an abnormal user by calculating a magnitude relationship between the number of elements exceeding the first preset threshold value among the elements in the matrix or the vector and the second preset threshold value.
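By way of illustration, the per-behavior thresholding and the mean-versus-threshold decision described above can be sketched in Python as follows. The 50% per-behavior cutoff comes from the example above; the 0.5 mean cutoff and the sample probabilities are assumed values for illustration only.

```python
import numpy as np

PROB_THRESHOLD = 0.5   # per-behavior probability cutoff (from the example above)
MEAN_THRESHOLD = 0.5   # assumed cutoff on the mean of the result vector

def model_result_vector(probabilities):
    # 1 marks a behavior whose anomaly probability exceeds the cutoff.
    return np.array([1 if p > PROB_THRESHOLD else 0 for p in probabilities])

def indicates_abnormal(result_vector):
    # Mean-versus-threshold decision over the elements of the vector.
    return result_vector.mean() > MEAN_THRESHOLD

vec = model_result_vector([0.9, 0.2, 0.7, 0.8])   # -> array([1, 0, 1, 1])
print(indicates_abnormal(vec))                     # True (mean 0.75 > 0.5)
```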
In practice, when the execution main body is a terminal device, a target user may operate an application installed on the execution main body to generate behavior data, and thus, the execution main body or a server communicatively connected to the execution main body may use a set of behavior data generated by the target user in a target time period as a behavior data set; when the execution agent is a server, the target user may operate an application, which is supported by the execution agent and installed on a terminal device communicatively connected to the execution agent, to generate behavior data, and thereby the execution agent or the terminal device communicatively connected to the execution agent may use a set of behavior data of the target user generated in a target time period as a behavior data set.
In some optional implementations of this embodiment, the behavior data set may also be obtained by:
First, a user data set of the target user in the target time period is obtained. The user data set includes at least one of: user account number, transaction time, transaction amount, transaction duration, transaction number, transaction frequency, transaction characteristics, transaction trend, transaction period, and transaction type.
And then, data cleaning is carried out on the user data set to obtain a cleaned data set.
And then, carrying out data characteristic derivation on the cleaned data set to obtain a derived data set.
And finally, reducing the dimension of the derived data set by adopting a principal component analysis method, and taking the data set obtained after dimension reduction as a behavior data set.
It can be understood that, in the optional implementation manner, data cleaning, data feature derivation and dimension reduction processing are performed on the behavior data set, so that the accuracy and speed of generating the target detection result can be improved through subsequent steps.
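A minimal Python sketch of this clean-derive-reduce pipeline follows. The column names (user_account, transaction_amount), the derived features, and the number of principal components are illustrative assumptions rather than values prescribed by this implementation.

```python
import pandas as pd
from sklearn.decomposition import PCA

def build_behavior_dataset(records: pd.DataFrame, n_components: int = 5) -> pd.DataFrame:
    # Data cleaning: drop duplicate records, fill missing amounts with the mean.
    cleaned = records.drop_duplicates()
    cleaned = cleaned.fillna({"transaction_amount": cleaned["transaction_amount"].mean()})

    # Data feature derivation: per-user aggregate statistics over the raw records.
    derived = cleaned.groupby("user_account").agg(
        max_amount=("transaction_amount", "max"),
        mean_amount=("transaction_amount", "mean"),
        total_amount=("transaction_amount", "sum"),
        tx_count=("transaction_amount", "count"),
    )

    # Dimension reduction by principal component analysis.
    pca = PCA(n_components=min(n_components, derived.shape[1]))
    return pd.DataFrame(pca.fit_transform(derived), index=derived.index)
```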
Here, the detection model may be a convolutional neural network model trained based on a predetermined training sample set by using a machine learning algorithm. Wherein each training sample in the set of training samples may correspond to a user. The training samples in the set of training samples may include input data and expected output data. The input data may be a collection of behavioral data for a single user. The expected output data may be used to indicate whether the user is an anomalous user.
Step 202, generating a condition detection result based on whether the behavior data in the behavior data set meets the abnormal user judgment condition in the predetermined abnormal user judgment condition set.
In this embodiment, the execution subject may generate the condition detection result based on whether the behavior data in the behavior data set meets an abnormal user determination condition in a predetermined abnormal user determination condition set. The condition detection result is a detection result generated based on the abnormal user judgment condition in the abnormal user judgment condition set and used for indicating whether the target user is an abnormal user.
The abnormal user determination condition may be a condition for determining whether the user is an abnormal user. For example, the abnormal user determination condition may include: the number of transactions by the user within the target time period exceeds 50. For another example, the abnormal user determination condition may include: the number of times that the user transacts more than ten thousand dollars in the target time period exceeds 10.
The condition detection result may be characterized in words, for example, the condition detection result may be yes or no. Alternatively, the condition detection result may be represented by a number, for example, when the condition detection result is "0", the target user may be represented as an abnormal user, and when the condition detection result is "1", the target user may be represented as not an abnormal user. Alternatively, the condition detection result can be characterized by a matrix or a vector. For example, the execution subject may generate, for each behavior data in the behavior data set, a probability that a user corresponding to the behavior data is an abnormal user, so that each probability is used as an element of a matrix or a vector used to characterize the condition detection result.
Each abnormal user judgment condition in the judgment condition set may correspond to a probability characterizing that the user is an abnormal user; the execution body may then take the probability corresponding to the judgment condition that a piece of behavior data meets as the probability that the user corresponding to that behavior data is abnormal. As another example, the execution body may generate, for each piece of behavior data in the behavior data set, a probability that the corresponding user is an abnormal user, and take 1 as the element of the matrix or vector characterizing the condition detection result if that probability is greater than a preset probability threshold (e.g., 50%), and 0 otherwise.
Here, when the condition detection result is characterized by a matrix or a vector, the execution body may determine whether the condition detection result indicates that the target user is an abnormal user based on each element in the matrix or the vector. For example, the execution body may determine whether the condition detection result indicates that the target user is an abnormal user by calculating a magnitude relationship between a mean value of each element in the matrix or the vector and a preset threshold value. For another example, the execution main body may determine whether the condition detection result indicates that the target user is an abnormal user by calculating a size relationship between the number of elements exceeding the first preset threshold value among the elements in the matrix or the vector and a preset number.
Here, the execution body may generate the condition detection result in various ways.
For example, in a case where behavior data that meets an abnormal user determination condition in a predetermined abnormal user determination condition set exists in the behavior data set, the execution subject may generate a condition detection result indicating that the target user is an abnormal user; in the case where there is no behavior data that meets an abnormal user determination condition in the predetermined abnormal user determination condition set in the behavior data set, the execution body may generate a condition detection result indicating that the target user is not an abnormal user.
Optionally, in a case that the number of behavior data in the behavior data set that meets the abnormal user determination condition in the predetermined abnormal user determination condition set is greater than the number of behavior data that does not meet the abnormal user determination condition in the abnormal user determination condition set, the execution main body may generate a condition detection result indicating that the target user is an abnormal user; in a case where the number of pieces of behavior data in the behavior data set that satisfy the abnormal user determination condition in the predetermined abnormal user determination condition set is smaller than or equal to the number of pieces of behavior data that do not satisfy the abnormal user determination condition in the abnormal user determination condition set, the execution main body may generate a condition detection result indicating that the target user is not an abnormal user.
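The first generation strategy above (flag the user as soon as any judgment condition is met) can be sketched as follows; the two thresholds come from the example conditions given earlier, while the field names of the per-user behavior summary are assumed for illustration.

```python
# Each abnormal user judgment condition is a predicate over a behavior summary.
RULES = [
    lambda d: d["tx_count"] > 50,        # more than 50 transactions in the period
    lambda d: d["large_tx_count"] > 10,  # more than 10 transactions above 10,000
]

def condition_detection(behavior_summary) -> bool:
    # True (abnormal) as soon as any judgment condition in the set is met.
    return any(rule(behavior_summary) for rule in RULES)

print(condition_detection({"tx_count": 62, "large_tx_count": 3}))  # True
```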
And step 203, generating a target detection result based on the model detection result and the condition detection result.
In this embodiment, the execution body may generate the target detection result based on the model detection result and the condition detection result. And the target detection result is used for indicating whether the target user is an abnormal user.
As an example, in the case where the model detection result and the detection result indicated by the condition detection result coincide, the execution subject described above may take the model detection result or the condition detection result as the target detection result. In the case where the model detection result and the detection result indicated by the condition detection result are not identical, the execution body may randomly select one detection result from the model detection result and the condition detection result, and use the randomly selected detection result as the target detection result.
As still another example, in a case where the model detection result and the condition detection result do not agree, the execution body may take the detection result chosen by a selection operation of relevant personnel (e.g., the person responsible for determining whether users are abnormal), i.e., either the model detection result or the condition detection result, as the target detection result.
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for detecting an abnormal user according to the present embodiment. In the application scenario of fig. 3, the server 301 first inputs a behavior data set 3001 of a target user in a target time period into a pre-trained detection model 3002, and generates a model detection result 3004 (in fig. 3, the model detection result 3004 indicates that the target user is an abnormal user), where the detection model 3002 is used to determine whether a user corresponding to the input behavior data set is an abnormal user. The model detection result 3004 is a detection result generated by the detection model 3002.
Then, the server 301 generates a condition detection result 3005 (in fig. 3, the condition detection result 3005 indicates that the target user is not an abnormal user) based on whether or not the behavior data in the behavior data set 3001 meets the abnormal user determination condition in the abnormal user determination condition set 3003 determined in advance. Here, the condition detection result 3005 is a detection result generated based on the abnormal user determination condition in the abnormal user determination condition set 3003.
Finally, the server 301 generates a target detection result 3006 based on the model detection result 3004 and the condition detection result 3005 (in fig. 3, the target detection result 3006 indicates that the target user is not an abnormal user). The target detection result 3006 is used to indicate whether the target user is an abnormal user.
The method provided by the above embodiment of the present application generates a model detection result by inputting a behavior data set of a target user in a target time period into a pre-trained detection model, where the detection model determines whether the user corresponding to an input behavior data set is an abnormal user. It then generates a condition detection result based on whether behavior data in the behavior data set meets an abnormal user judgment condition in a predetermined judgment condition set, and finally generates a target detection result, indicating whether the target user is an abnormal user, based on the model detection result and the condition detection result. By combining the model detection result generated by the detection model with the condition detection result generated from the judgment condition set into a final detection result (i.e., the target detection result), the method improves the accuracy of abnormal user detection, automatically and efficiently enables effective risk monitoring and management over massive data, reduces the occupation of resources such as CPU time and bandwidth by abnormal users, and ensures the normal use of those resources by normal users.
In some optional implementations of this embodiment, in a case that the detection result indicated by the model detection result and the condition detection result are not consistent, the execution main body may further perform at least one of the following:
first, it is determined whether to update the abnormal user decision condition set.
And the second item is used for determining whether to continue training the detection model.
It can be understood that, when the model detection result and the condition detection result disagree, the accuracy of either the condition detection result determined from the judgment condition set or the model detection result determined by the detection model may be low. In this scenario, updating the judgment condition set or continuing to train the detection model can improve the accuracy of the corresponding detection result, so that a more accurate target detection result can be obtained using the updated judgment condition set and the further-trained detection model, further improving the accuracy of abnormal user detection.
In some optional implementations of the embodiment, when it is determined, in response to the model detection result and the condition detection result being inconsistent, whether to update the abnormal user judgment condition set, the determination may be made as follows: in response to receiving information indicating that the judgment condition set should be updated, the set is updated based on the behavior data set.
The information indicating the update abnormal user determination condition set may be transmitted to the execution main body by a relevant person (for example, a technician) through an electronic device used by the relevant person, or may be directly input to the execution main body by the relevant person.
It is to be understood that, in this alternative implementation, the execution subject may determine whether the abnormal user determination condition set needs to be updated according to experience of relevant personnel.
Optionally, the execution body may update the abnormal user judgment condition set when the accuracy of the condition detection result obtained from the judgment condition set is smaller than a preset accuracy threshold, so that the judgment condition set can be updated automatically.
The accuracy of the condition detection result obtained based on the abnormal user judgment condition set can be obtained by adopting the following steps:
in a first step, a set of test samples is obtained. Wherein each test sample in the set of test samples corresponds to a user. Each test sample comprises a behavior data set of a user corresponding to the test sample and expected result data representing whether the user corresponding to the test sample is an abnormal user.
And secondly, generating a condition detection result for each test sample in the test sample set based on whether the behavior data in the behavior data set meets the abnormal user judgment condition in the abnormal user judgment condition set or not. And taking the condition detection result of whether the user corresponding to the test sample is an abnormal user as the actual result data corresponding to the test sample. The manner of generating the condition detection result in the second step may refer to step 202, and is not described herein again.
And thirdly, determining, for each piece of actual result data obtained in the second step, whether it indicates the same meaning as its corresponding expected result data (namely, whether both indicate that the user is an abnormal user, or both indicate that the user is not), so as to count the number of pieces of actual result data that agree with their corresponding expected result data.
And step four, determining the ratio of the number obtained in the step three to the number of the test samples in the test sample set obtained in the step one as the accuracy of the condition detection result obtained based on the abnormal user judgment condition set.
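The four steps above reduce to a simple ratio, sketched below; the same routine also covers the detection model's accuracy computed in the analogous steps later in this section, since the detect callable may stand for either the condition-based detector or the model's prediction function.

```python
def detection_accuracy(test_samples, detect) -> float:
    # test_samples: list of (behavior_data_set, expected_is_abnormal) pairs.
    matches = sum(1 for behavior_set, expected in test_samples
                  if detect(behavior_set) == expected)
    return matches / len(test_samples)
```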
In some optional implementations of this embodiment, when it is determined, in response to the model detection result and the condition detection result being inconsistent, whether to continue training the detection model, the determination may be made as follows: in response to receiving information indicating that training should continue, the detection model is further trained based on the behavior data set.
The information for instructing to continue training may be transmitted to the execution main body by a relevant person (for example, a technician) through an electronic device used by the relevant person, or may be directly input to the execution main body by the relevant person.
It is understood that in this alternative implementation, the execution subject may determine whether to continue training the detection model according to the experience of the relevant person.
Optionally, the executing body may also continue to train the detection model when the accuracy of the detection model is smaller than a preset accuracy threshold, so that training can continue automatically.
Wherein, the accuracy of the detection model can be obtained by adopting the following steps:
step one, a test sample set is obtained. Wherein each test sample in the set of test samples corresponds to a user. Each test sample comprises a behavior data set of a user corresponding to the test sample and expected result data representing whether the user corresponding to the test sample is an abnormal user.
And step two, sequentially inputting each test sample in the test sample set to a detection model to obtain a model detection result for indicating whether a user corresponding to the test sample is an abnormal user. And taking the model detection result of whether the user corresponding to the test sample is an abnormal user as the actual result data corresponding to the test sample.
And step three, determining, for each piece of actual result data obtained in step two, whether it indicates the same meaning as its corresponding expected result data (that is, whether both indicate that the user is an abnormal user, or both indicate that the user is not), so as to count the number of pieces of actual result data that agree with their corresponding expected result data.
And step four, determining the ratio of the quantity obtained in the step three to the quantity of the test samples in the test sample set as the accuracy of the detection model.
In some optional implementation manners of this embodiment, the model detection result and the condition detection result are respectively represented by a numerical value. And, the executing main body may further execute step 203 in the following manner:
and determining whether the target user is an abnormal user or not based on the result of the weighted summation of the model detection result and the condition detection result, and generating a target detection result. Wherein the weight of the model detection result and the weight of the condition detection result are positively correlated with the accuracy of the detection result generated based on the detection model and the abnormal user determination condition set for indicating whether the user is an abnormal user.
As an example, when the model detection result and the condition detection result are represented by a single value, the execution body may determine whether the target user is an abnormal user based on a magnitude relationship between a result of weighted summation of the model detection result and the condition detection result and a preset value, and generate the target detection result.
As yet another example, when the model detection result and the condition detection result are characterized by a matrix or a vector (i.e., a plurality of values), the execution body may determine whether the target user is an abnormal user and generate the target detection result based on a magnitude relationship between the number of elements greater than a first preset value in a result of weighted summation of the model detection result and the condition detection result and a second preset value. Alternatively, the execution body may determine whether the target user is an abnormal user and generate the target detection result based on a magnitude relationship between an average value of each element in a result of weighted summation of the model detection result and the condition detection result and a preset average value.
It can be understood that the optional implementation manner may employ multiple manners to generate the target detection result, thereby enriching the generation manner of the target detection result, and in some scenarios, two or more manners described in the optional implementation manner may also be employed to generate the target detection result, thereby further improving the accuracy of the abnormal user detection.
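As a sketch of the weighted-summation variant for vector-valued results: the weights would in practice be set in proportion to each detector's measured accuracy, and the 0.5 mean cutoff and example weights here are assumptions, not values fixed by this embodiment.

```python
import numpy as np

def fused_detection(model_vec, cond_vec, w_model, w_cond, mean_cutoff=0.5):
    # Weighted summation of the two result vectors, then a mean-threshold decision.
    combined = w_model * np.asarray(model_vec) + w_cond * np.asarray(cond_vec)
    return combined.mean() > mean_cutoff   # True -> abnormal user

# The detector measured as more accurate receives the larger weight.
print(fused_detection([1, 1, 1, 0], [0, 0, 1, 0], w_model=0.7, w_cond=0.3))  # True
```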
In some optional implementation manners of this embodiment, in a case that the target detection result indicates that the target user is an abnormal user, the execution main body may further perform a predetermined abnormal user management and control operation. The abnormal user management and control operation may be an operation for managing and/or controlling the abnormal user.
It can be understood that, when it is determined that the target detection result indicates that the target user is an abnormal user, the abnormal user can be correspondingly managed and/or controlled by executing predetermined abnormal user management and control operations, so that effective risk monitoring and management can be automatically and efficiently realized for mass data, occupation of resources such as a CPU and a bandwidth by the abnormal user can be reduced, and normal use of the resources by normal users can be ensured.
In some optional implementations of the embodiment, the abnormal user management operation includes at least one of:
the first item, limits the rights of the target user.
It can be understood that limiting an abnormal user's permissions can prevent losses to relevant parties caused by behavior such as illicit profit-taking or misoperation by the abnormal user.
And the second item is used for sending prompt information for indicating abnormal operation to the target user.
It can be understood that, in some cases, the target user's account has been stolen by another person, or a background program is running without authorization, which may cause the target detection result to indicate that the target user is an abnormal user. In this scenario, a prompt message indicating abnormal operation may be sent to the target user, warning that the account currently has a potential safety hazard and preventing losses to the target user.
Third, the target user is associated with a preset tag.
It will be appreciated that the target user may be associated with a preset tag to distinguish between abnormal and normal users for subsequent differentiated management of the two types of users.
In some alternative implementations of the present embodiment, the target user is a consumer user (rather than a merchant).
It will be appreciated that in the case where the target user is a consumer user, this alternative implementation may enable anomaly detection for the consumer user.
With further reference to FIG. 4, a flow 400 of yet another embodiment of a method for detecting anomalous users is illustrated. The process 400 of the method for detecting an abnormal user includes the following steps:
step 401, a predetermined training sample set is obtained.
In this embodiment, an executing subject (for example, the server or the terminal device shown in fig. 1) of the method for detecting an abnormal user may obtain a predetermined training sample set from other electronic devices by a wired connection manner or a wireless connection manner, or locally. Wherein each training sample in the set of training samples corresponds to a user. The training samples in the set of training samples include a set of behavioral data of a user corresponding to the training samples.
Here, the behavior data set may be a set of data generated by various operations of the user. As an example, the behavioral data set may include at least one of: user account number, transaction time, transaction amount, transaction duration, transaction number, transaction frequency, transaction characteristics, transaction trend, transaction period and transaction type.
In practice, the behavioral data set may be characterized in the form of vectors, matrices, and the like.
Step 402, aiming at each anomaly detection algorithm in a predetermined anomaly detection algorithm set, carrying out anomaly detection on the training sample set by adopting the anomaly detection algorithm to obtain a candidate model corresponding to the anomaly detection algorithm.
In this embodiment, for each anomaly detection algorithm in the predetermined set of anomaly detection algorithms, the executing entity may perform anomaly detection on the training sample set by using the anomaly detection algorithm to obtain a candidate model corresponding to the anomaly detection algorithm. And the candidate model represents whether the behavior data set included in the training sample corresponds to the abnormal user or not.
Here, the anomaly detection algorithm in the set of anomaly detection algorithms may be used for anomaly detection on a training sample set. As an example, the anomaly detection algorithm may include, but is not limited to, any of: a classification-based anomaly detection algorithm, a nearest neighbor-based anomaly detection algorithm, a cluster-based anomaly detection algorithm, a statistics-based anomaly detection algorithm (e.g., a gaussian model-based anomaly detection algorithm, a regression model-based anomaly detection algorithm, a mixed parameter distribution-based anomaly detection algorithm, a histogram-based anomaly detection algorithm, a kernel function-based anomaly detection algorithm, a density estimation-based anomaly detection algorithm).
In some cases, the individual anomaly detection algorithms in the set of anomaly detection algorithms described above may be different from one another.
In some optional implementations of this embodiment, the set of anomaly detection algorithms may include the following anomaly detection algorithms: Gaussian distribution, a density-based clustering algorithm, the isolation forest algorithm, and the local outlier factor algorithm (LOF).
Here, Gaussian distribution, density-based clustering, the isolation forest algorithm, and the local outlier factor algorithm are well known to those skilled in the art, and are not described here again.
It will be appreciated that by performing step 402 described above, a candidate model corresponding to each anomaly detection algorithm in the set can be obtained. Before anomaly detection is actually run on the training sample set, it is often difficult to predict in advance how the candidate model of each algorithm will score on evaluation indexes such as recall, precision, F-measure, macro-average, and micro-average. This optional implementation therefore trains a candidate model for every algorithm, so that the detection model selected in the subsequent steps performs better on recall, precision, F-measure, macro-average, micro-average, and the like.
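A sketch of step 402 with scikit-learn, assuming one off-the-shelf detector per algorithm family named above; EllipticEnvelope stands in for the Gaussian-distribution approach, and every hyperparameter shown is an illustrative assumption rather than a value fixed by this embodiment.

```python
from sklearn.cluster import DBSCAN
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

# One candidate detector per anomaly detection algorithm in the set.
candidates = {
    "gaussian": EllipticEnvelope(contamination=0.05),
    "density_clustering": DBSCAN(eps=0.5, min_samples=5),
    "isolation_forest": IsolationForest(n_estimators=100, random_state=0),
    "local_outlier_factor": LocalOutlierFactor(n_neighbors=20, novelty=True),
}

def fit_candidates(X):
    # Fit every candidate model on the training sample set X (step 402).
    return {name: model.fit(X) for name, model in candidates.items()}
```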
In some optional implementation manners of this embodiment, for "performing anomaly detection on the training sample set by using the anomaly detection algorithm to obtain a candidate model corresponding to the anomaly detection algorithm" in step 402, the executing body may perform the following steps:
the first step, select at least two groups of parameter combinations from the predetermined parameter set of the anomaly detection algorithm. Wherein, the number of the parameters in each group of parameter combination can be an integer greater than or equal to 0.
Here, each anomaly detection algorithm may correspond to a set of parameters. For example, when the anomaly detection algorithm is the isolation forest algorithm, its parameter set may include, but is not limited to: the sampling number, the number of selected features, the number of trees, the number of layers of each tree, and the like. When the anomaly detection algorithm is a clustering algorithm, its parameter set may include, but is not limited to: the number of clusters, parameters for filtering noise, and the like.
And a second step of setting a parameter value of each parameter in at least two sets of parameter combinations as a parameter value set in advance for the parameter.
Here, the parameter value set for the parameter in advance may be a parameter value set by a technician based on experience, or may be a default parameter value of the parameter. The technician may set a plurality of different parameter values for the same parameter.
And thirdly, aiming at each group of parameter combination in at least two groups of parameter combinations, carrying out anomaly detection on the training sample set based on the anomaly detection algorithm and the parameter values set for all the parameters in the group of parameter combinations to obtain candidate models corresponding to the anomaly detection algorithm and the group of parameter combinations.
It can be understood that, since each anomaly detection algorithm may correspond to a plurality of parameters, it is often difficult to predict, before training, which parameter values will yield a model (e.g., the detection model) with better performance (e.g., the highest score determined from evaluation indexes such as recall, precision, F-measure, macro-average, and micro-average). This optional implementation can train one candidate model for each parameter combination of each anomaly detection algorithm, so that the detection model selected in the subsequent steps performs better on recall, precision, F-measure, macro-average, micro-average, and the like.
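The per-combination training described in the three steps above might look as follows for the isolation forest case, with assumed value sets for two of its parameters (the sampling number and the number of trees):

```python
from itertools import product
from sklearn.ensemble import IsolationForest

# Assumed value sets; in practice these come from technicians' experience
# or the parameters' default values.
param_values = {
    "max_samples": [256, 512],       # sampling number
    "n_estimators": [50, 100, 200],  # number of trees
}

def candidates_per_combination(X):
    # One fitted candidate model per parameter combination (third step above).
    keys = list(param_values)
    models = {}
    for combo in product(*(param_values[k] for k in keys)):
        models[combo] = IsolationForest(**dict(zip(keys, combo))).fit(X)
    return models
```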
And step 403, using the candidate model meeting the preset selection condition in the obtained candidate models as a detection model.
In this embodiment, the execution subject may use a candidate model that meets a preset selection condition from the obtained candidate models as the detection model.
The preset selection condition may be that the accuracy is the highest, or that the weighted sum of the candidate model's accuracy, silhouette coefficient, adjusted Rand index, Calinski-Harabasz index, and Fowlkes-Mallows score is the largest.
Here, accuracy, the silhouette coefficient, the adjusted Rand index, the Calinski-Harabasz index, and the Fowlkes-Mallows score are well-known model evaluation indexes widely studied by those skilled in the art, and are not described here again.
In some optional implementations of this embodiment, the executing main body may further execute the step 403 in the following manner:
and based on the accuracy, the contour coefficient, the adjusted landed coefficient, the Calinski-Harabasz index and the odd evaluation index in the Fowles-Mallows score, determining the candidate model determined by adopting a voting mechanism from the obtained candidate models as a detection model.
It can be understood that, in this optional implementation, the detection model can be determined by a voting mechanism, which improves the detection model's performance on the chosen odd number of evaluation indexes among accuracy, the silhouette coefficient, the adjusted Rand index, the Calinski-Harabasz index, and the Fowlkes-Mallows score.
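A sketch of the voting mechanism, assuming the five evaluation indexes named above (an odd number, so ties are avoided) and scikit-learn's metric implementations; candidate_labels is assumed to map each candidate model's name to its predicted labels on a common data set X with ground truth y_true.

```python
from sklearn.metrics import (accuracy_score, adjusted_rand_score,
                             calinski_harabasz_score, fowlkes_mallows_score,
                             silhouette_score)

def select_by_voting(X, y_true, candidate_labels):
    # Five indexes, each casting one vote for the candidate it scores highest.
    metrics = [
        lambda labels: accuracy_score(y_true, labels),
        lambda labels: silhouette_score(X, labels),
        lambda labels: calinski_harabasz_score(X, labels),
        lambda labels: adjusted_rand_score(y_true, labels),
        lambda labels: fowlkes_mallows_score(y_true, labels),
    ]
    votes = {name: 0 for name in candidate_labels}
    for metric in metrics:
        best = max(candidate_labels, key=lambda name: metric(candidate_labels[name]))
        votes[best] += 1
    # The candidate with the most votes becomes the detection model.
    return max(votes, key=votes.get)
```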
Step 404, inputting the behavior data set of the target user in the target time period into a pre-trained detection model, and generating a model detection result.
In this embodiment, the executing entity may input a behavior data set of the target user in the target time period into a pre-trained detection model, and generate a model detection result. The detection model is used for determining whether a user corresponding to the input behavior data set is an abnormal user. The model detection result is a detection result generated by the detection model and used for indicating whether the target user is an abnormal user or not.
Step 405, generating a condition detection result based on whether the behavior data in the behavior data set meets the abnormal user determination condition in the predetermined abnormal user determination condition set.
In this embodiment, the execution subject may generate the condition detection result based on whether the behavior data in the behavior data set meets an abnormal user determination condition in a predetermined abnormal user determination condition set. The condition detection result is a detection result generated based on the abnormal user judgment condition in the abnormal user judgment condition set and used for indicating whether the target user is an abnormal user.
And 406, generating a target detection result based on the model detection result and the condition detection result.
In this embodiment, the execution subject may generate the target detection result based on the model detection result and the condition detection result. The target detection result is used for indicating whether the target user is an abnormal user.
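One possible fusion for step 406 is the weighted summation recited later in claim 9; in this sketch the weights and the threshold are illustrative placeholders:

```python
def fuse(model_result: int, condition_result: int,
         w_model: float = 0.6, w_cond: float = 0.4,
         threshold: float = 0.5) -> int:
    """Weighted summation of the two detection results."""
    return int(w_model * model_result + w_cond * condition_result >= threshold)
```

With these placeholder weights, fuse(1, 0) returns 1: the model's abnormal verdict outweighs the rule set's normal verdict.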
In this embodiment, the steps 404, 405, and 406 may be substantially the same as the steps 201, 202, and 203 in the corresponding embodiment of fig. 2, and are not described herein again.
With continuing reference to fig. 5A-5B, fig. 5A-5B are schematic diagrams of yet another application scenario of the method for detecting an abnormal user according to the present embodiment. It should be noted that the schematic diagrams of fig. 5A-5B are only an example and should not impose any limitation on the present application.
In fig. 5A, an execution subject of the method for detecting abnormal users in an embodiment of the present disclosure may first obtain a historical transaction data set 501. Each piece of historical transaction data in the set may include, but is not limited to, at least one of the following for a single user: user account, transaction time, transaction amount, transaction duration, transaction count, transaction frequency, transaction characteristics, transaction trend, transaction period, and transaction type.
The execution subject may then perform data preprocessing (e.g., data cleansing), feature engineering (e.g., data feature derivation, dimension reduction), and the like on the historical transaction data set 501, thereby obtaining a processed data set 502.
Here, different cleansing rules can be designed for different types of non-standard data, and the data format can be adjusted or modified. For data with missing values, the features can be divided into time features, categorical features, and continuous features, filled with the mode (for time and categorical features) or the mean (for continuous features), respectively. Then, the data obtained by data preprocessing is transformed and derived, yielding new features such as the maximum transaction amount, the average transaction amount, and the total transaction amount on the basis of the original data; finally, principal component analysis is adopted to select features and perform feature dimension reduction.
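A condensed sketch of this preprocessing and feature-engineering pipeline, assuming pandas/scikit-learn and hypothetical column names (the concrete data schema is not specified here):

```python
import pandas as pd
from sklearn.decomposition import PCA

def preprocess(raw: pd.DataFrame) -> pd.DataFrame:
    df = raw.copy()
    # Missing-value filling: mode for time/categorical features,
    # mean for continuous features.
    for col in ("transaction_time", "transaction_type"):
        df[col] = df[col].fillna(df[col].mode().iloc[0])
    for col in ("transaction_amount", "transaction_duration"):
        df[col] = df[col].fillna(df[col].mean())
    # Feature derivation on top of the original data.
    derived = df.groupby("user_account")["transaction_amount"].agg(
        max_amount="max", mean_amount="mean", total_amount="sum")
    # Feature selection / dimension reduction by principal component analysis.
    reduced = PCA(n_components=2).fit_transform(derived)
    return pd.DataFrame(reduced, index=derived.index)
```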
The execution subject may then partition the processed data set 502. For example, the processed data set 502 is divided into a training sample set 503 and a testing sample set 504 according to a certain proportion.
Then, for each anomaly detection algorithm in the predetermined anomaly detection algorithm set, the execution subject may perform anomaly detection on the training sample set by using the anomaly detection algorithm, so as to obtain a candidate model corresponding to that anomaly detection algorithm. In fig. 5A, the anomaly detection algorithms include a statistical algorithm, a clustering algorithm, an isolation forest algorithm, and a local outlier factor algorithm. Therefore, the execution subject may perform anomaly detection on the training sample set 503 by using a statistical algorithm 505, a clustering algorithm 506, an isolation forest algorithm 507, and a local outlier factor algorithm 508, so as to obtain a candidate model 509 corresponding to the statistical algorithm 505, a candidate model 510 corresponding to the clustering algorithm 506, a candidate model 511 corresponding to the isolation forest algorithm 507, and a candidate model 512 corresponding to the local outlier factor algorithm 508.
Here, in the process of performing anomaly detection on the training sample set by using an anomaly detection algorithm, different parameter sets may be set for each anomaly detection algorithm, so as to establish different candidate models corresponding to that anomaly detection algorithm. For example, when the anomaly detection algorithm is the isolation forest algorithm, its parameter set may include, but is not limited to: the sampling number, the number of selected features, the number of trees, the depth of each tree, and the like. When the anomaly detection algorithm is the clustering algorithm, its parameter set may include, but is not limited to: the number of clusters, parameters for filtering noise, and the like.
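To make this concrete, here is one hypothetical parameter set per algorithm family, using scikit-learn stand-ins (EllipticEnvelope for the statistical algorithm, DBSCAN for the clustering algorithm); in practice several parameter sets per algorithm would be tried, as described above:

```python
from sklearn.cluster import DBSCAN
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

CANDIDATE_SPECS = {
    # Statistical: a Gaussian covariance-based detector as one stand-in.
    "statistical": (EllipticEnvelope, {"contamination": 0.05}),
    # Clustering: eps/min_samples govern cluster formation and noise filtering.
    "clustering": (DBSCAN, {"eps": 0.5, "min_samples": 5}),
    # Isolation forest: sampling number, feature number, tree number
    # (scikit-learn derives each tree's depth from max_samples).
    "isolation_forest": (IsolationForest,
                         {"max_samples": 256, "max_features": 1.0,
                          "n_estimators": 200}),
    # Local outlier factor, fitted in novelty mode so it can score new data.
    "lof": (LocalOutlierFactor, {"n_neighbors": 20, "novelty": True}),
}

def build_candidates(train_X):
    """Fit one candidate model per algorithm family on the training samples."""
    return {name: cls(**params).fit(train_X)
            for name, (cls, params) in CANDIDATE_SPECS.items()}
```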
Then, the execution subject may determine, based on an odd number of evaluation indexes among the accuracy, the silhouette coefficient, the adjusted Rand index, the Calinski-Harabasz index, and the Fowlkes-Mallows score, a candidate model determined by a voting mechanism from the obtained candidate models 509-512 as the detection model 513.
Please continue with fig. 5B.
In fig. 5B, after obtaining a detection model 513 (e.g., the detection model 513 obtained by the method shown in fig. 5A), the execution subject first obtains a user data set 515 of the target user in the target time period. The user data set 515 may include, but is not limited to, at least one of the following for a single user: user account, transaction time, transaction amount, transaction duration, transaction count, transaction frequency, transaction characteristics, transaction trend, transaction period, and transaction type.
Then, the execution subject may perform data preprocessing (e.g., data cleansing), feature engineering (e.g., data feature derivation, dimension reduction) on the user data set 515, thereby obtaining a behavior data set 516.
Here, as in fig. 5A, different cleansing rules can be designed for different types of non-standard data, and the data format can be adjusted or modified. For data with missing values, the features can be divided into time features, categorical features, and continuous features, filled with the mode (for time and categorical features) or the mean (for continuous features), respectively. Then, the data obtained by data preprocessing is transformed and derived, yielding new features such as the maximum transaction amount, the average transaction amount, and the total transaction amount on the basis of the original data; finally, principal component analysis is adopted to select features and perform feature dimension reduction.
Thereafter, the execution subject may input the behavior data set 516 into the pre-trained detection model 513 to generate a model detection result 517. The model detection result 517 is a detection result generated by the detection model 513. The execution subject may also generate a condition detection result 518 based on whether the behavior data in the behavior data set 516 meets an abnormal user determination condition in the predetermined abnormal user determination condition set 514. Here, the condition detection result 518 is a detection result generated based on the abnormal user determination conditions in that set.
Finally, the execution subject may generate a target detection result 519 based on the model detection result 517 and the condition detection result 518. The target detection result 519 is used to indicate whether the target user is an abnormal user.
Optionally, in a case where the detection results indicated by the model detection result 517 and the condition detection result 518 are inconsistent, the execution subject may further execute at least one of the following: determining whether to update the abnormal user determination condition set; determining whether to continue training the detection model.
Here, in the case where information indicating to continue training is received, the execution subject described above may continue training the detection model 513 based on the behavior data set 516. Alternatively, when receiving information for instructing to update the abnormal user determination condition set, the execution subject may update the abnormal user determination condition set 514 based on the behavior data set 516.
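This inconsistency handling can be sketched as follows; ask_operator stands in for the received instruction information, and rule_set.update_from is a hypothetical helper for revising the determination conditions:

```python
def handle_inconsistency(model_result, condition_result,
                         behavior_data_set, detector, rule_set, ask_operator):
    """On disagreement, let the received instructions drive the follow-up."""
    if model_result == condition_result:
        return
    if ask_operator("continue training the detection model?"):
        detector.fit(behavior_data_set)           # continue training
    if ask_operator("update the determination condition set?"):
        rule_set.update_from(behavior_data_set)   # hypothetical helper
```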
As can be seen from fig. 4, the process 400 of the method for detecting an abnormal user in the present embodiment highlights the training process of the detection model, so that the accuracy of detecting an abnormal user can be further improved.
Referring now to fig. 6, a schematic diagram of an electronic device (e.g., the server or terminal device of fig. 1) 600 suitable for use in implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a fixed terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 6, electronic device 600 may include a processing means (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic device 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or may represent multiple devices as desired.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of embodiments of the present disclosure.

It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Python, Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: inputting a behavior data set of a target user in a target time period into a pre-trained detection model to generate a model detection result, wherein the detection model is used for determining whether a user corresponding to the input behavior data set is an abnormal user, and the model detection result is a detection result generated by the detection model; generating a condition detection result based on whether the behavior data in the behavior data set meet the abnormal user judgment condition in the predetermined abnormal user judgment condition set, wherein the condition detection result is a detection result generated based on the abnormal user judgment condition in the abnormal user judgment condition set; and generating a target detection result based on the model detection result and the condition detection result, wherein the target detection result is used for indicating whether the target user is an abnormal user.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (15)

1. A method for detecting anomalous users, comprising:
inputting a behavior data set of a target user in a target time period into a pre-trained detection model to generate a model detection result, wherein the detection model is used for determining whether a user corresponding to the input behavior data set is an abnormal user;
generating a condition detection result based on whether the behavior data in the behavior data set meets an abnormal user determination condition in a predetermined abnormal user determination condition set;
and generating a target detection result based on the model detection result and the condition detection result, wherein the target detection result is used for indicating whether the target user is an abnormal user.
2. The method of claim 1, wherein the detection model is trained by:
acquiring a predetermined training sample set, wherein each training sample in the training sample set corresponds to a user, and the training samples in the training sample set comprise a behavior data set of the user corresponding to the training sample;
aiming at each abnormal detection algorithm in a predetermined abnormal detection algorithm set, performing abnormal detection on the training sample set by adopting the abnormal detection algorithm to obtain a candidate model corresponding to the abnormal detection algorithm, wherein the candidate model represents whether a behavior data set included in the training sample corresponds to an abnormal user or not;
and taking the candidate model which meets the preset selection condition in the obtained candidate models as a detection model.
3. The method of claim 2, wherein the performing anomaly detection on the training sample set by using the anomaly detection algorithm to obtain a candidate model corresponding to the anomaly detection algorithm comprises:
selecting at least two groups of parameter combinations from a predetermined parameter set of the anomaly detection algorithm;
setting a parameter value of each parameter in the at least two groups of parameter combinations as a parameter value preset for the parameter;
and aiming at each group of parameter combinations in the at least two groups of parameter combinations, carrying out anomaly detection on the training sample set based on the anomaly detection algorithm and the parameter values set for all the parameters in the group of parameter combinations to obtain candidate models corresponding to the anomaly detection algorithm and the group of parameter combinations.
4. The method according to claim 2 or 3, wherein the step of using the candidate model meeting the preset selection condition in the obtained candidate models as the detection model comprises the following steps:
and determining, based on an odd number of evaluation indexes among the accuracy, the silhouette coefficient, the adjusted Rand index, the Calinski-Harabasz index, and the Fowlkes-Mallows score, a candidate model from the obtained candidate models by a voting mechanism as the detection model.
5. The method according to one of claims 2-4, wherein an anomaly detection algorithm of the set of anomaly detection algorithms is any one of:
a statistical algorithm, a clustering algorithm, an isolation forest algorithm, and a local outlier factor algorithm.
6. The method according to one of claims 1-5, wherein the method further comprises:
in response to the detection results indicated by the model detection result and the condition detection result being inconsistent, performing at least one of:
determining whether to update the abnormal user determination condition set;
determining whether to continue training the detection model.
7. The method of claim 6, wherein, in a case where it is determined whether to continue training the detection model in response to the detection results indicated by the model detection result and the condition detection result being inconsistent, the determining whether to continue training the detection model comprises:
continuing to train the detection model based on the set of behavioral data in response to receiving information indicative of continuing to train.
8. The method according to claim 6 or 7, wherein, in a case where it is determined whether to update the abnormal user determination condition set in response to the detection results indicated by the model detection result and the condition detection result being inconsistent, the determining whether to update the abnormal user determination condition set comprises:
in response to receiving information indicating to update the abnormal user determination condition set, updating the abnormal user determination condition set based on the behavior data set.
9. The method according to one of claims 1 to 8, wherein the model detection result and the condition detection result are each characterized by a numerical value; and
generating a target detection result based on the model detection result and the condition detection result, including:
determining whether the target user is an abnormal user based on a result of weighted summation of the model detection result and the condition detection result, and generating the target detection result, wherein the weight of the model detection result is positively correlated with the accuracy of the detection model, and the weight of the condition detection result is positively correlated with the accuracy of detection results that are generated based on the abnormal user determination condition set and used for indicating whether a user is an abnormal user.
10. The method of one of claims 1 to 9, wherein the behavioural data set is obtained by:
acquiring a user data set of the target user in the target time period, wherein the user data set comprises at least one of the following items: a user account, transaction time, transaction amount, transaction duration, transaction count, transaction frequency, transaction characteristics, transaction trend, transaction period, and transaction type;
performing data cleaning on the user data set to obtain a cleaned data set;
carrying out data characteristic derivation on the cleaned data set to obtain a derived data set;
and reducing the dimension of the derived data set by adopting a principal component analysis method, and taking the data set obtained after dimension reduction as a behavior data set.
11. The method according to one of claims 1-10, wherein the method further comprises:
and executing predetermined abnormal user management and control operation in response to the target detection result indicating that the target user is an abnormal user.
12. The method of claim 11, wherein the anomalous user management operations include at least one of:
limiting the authority of the target user;
sending prompt information for indicating abnormal operation to the target user;
and associating the target user with a preset label.
13. The method of any of claims 1-12, wherein the target user is a consumer user.
14. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-13.
15. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-13.
CN201911200519.3A 2019-11-29 2019-11-29 Method, electronic device, and computer-readable medium for detecting abnormal user Active CN110929799B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911200519.3A CN110929799B (en) 2019-11-29 2019-11-29 Method, electronic device, and computer-readable medium for detecting abnormal user

Publications (2)

Publication Number Publication Date
CN110929799A true CN110929799A (en) 2020-03-27
CN110929799B CN110929799B (en) 2023-05-12

Family

ID=69847840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911200519.3A Active CN110929799B (en) 2019-11-29 2019-11-29 Method, electronic device, and computer-readable medium for detecting abnormal user

Country Status (1)

Country Link
CN (1) CN110929799B (en)


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090099988A1 (en) * 2007-10-12 2009-04-16 Microsoft Corporation Active learning using a discriminative classifier and a generative model to detect and/or prevent malicious behavior
JP2018051721A (en) * 2016-09-30 2018-04-05 キヤノン株式会社 Abnormality detection device, abnormality detection method, and program
CN107801090A (en) * 2017-11-03 2018-03-13 北京奇虎科技有限公司 Utilize the method, apparatus and computing device of audio-frequency information detection anomalous video file
WO2019128552A1 (en) * 2017-12-29 2019-07-04 Oppo广东移动通信有限公司 Information pushing method, apparatus, terminal, and storage medium
CN108509979A (en) * 2018-02-28 2018-09-07 努比亚技术有限公司 A kind of method for detecting abnormality, server and computer readable storage medium
CN109388548A (en) * 2018-09-29 2019-02-26 北京京东金融科技控股有限公司 Method and apparatus for generating information
CN109886290A (en) * 2019-01-08 2019-06-14 平安科技(深圳)有限公司 Detection method, device, computer equipment and the storage medium of user's request
CN109936561A (en) * 2019-01-08 2019-06-25 平安科技(深圳)有限公司 User request detection method and device, computer equipment and storage medium
CN109919684A (en) * 2019-03-18 2019-06-21 上海盛付通电子支付服务有限公司 For generating method, electronic equipment and the computer readable storage medium of information prediction model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YU Bingjie; XIA Zhanguo; WANG Jiulong: "Anomaly Detection Algorithm Based on Gaussian Process Model" *
ZHU Jiajun; CHEN Gong; SHI Yong; XUE Zhi: "Abnormal Behavior Detection Based on User Profiles" *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111612037A (en) * 2020-04-24 2020-09-01 平安直通咨询有限公司上海分公司 Abnormal user detection method, device, medium and electronic equipment
CN112052185B (en) * 2020-09-29 2023-11-10 北京百度网讯科技有限公司 Exception handling method and device for applet, electronic equipment and storage medium
CN112052185A (en) * 2020-09-29 2020-12-08 北京百度网讯科技有限公司 Applet exception handling method and device, electronic device and storage medium
CN112199640A (en) * 2020-09-30 2021-01-08 广州市百果园网络科技有限公司 Abnormal user auditing method and device, electronic equipment and storage medium
EP4198775A4 (en) * 2020-09-30 2024-03-13 Bigo Tech Pte Ltd Abnormal user auditing method and apparatus, electronic device, and storage medium
CN112199640B (en) * 2020-09-30 2024-03-12 广州市百果园网络科技有限公司 Abnormal user auditing method and device, electronic equipment and storage medium
WO2022068493A1 (en) * 2020-09-30 2022-04-07 百果园技术(新加坡)有限公司 Abnormal user auditing method and apparatus, electronic device, and storage medium
CN112445679A (en) * 2020-11-13 2021-03-05 上海优扬新媒信息技术有限公司 Information detection method, device, server and storage medium
CN112445679B (en) * 2020-11-13 2023-01-06 度小满科技(北京)有限公司 Information detection method, device, server and storage medium
CN113191824A (en) * 2021-05-24 2021-07-30 北京大米科技有限公司 Data processing method and device, electronic equipment and readable storage medium
CN113722707A (en) * 2021-11-02 2021-11-30 西安热工研究院有限公司 Database abnormal access detection method, system and equipment based on distance measurement
CN117057941A (en) * 2023-09-14 2023-11-14 上海甄汇信息科技有限公司 Abnormal consumption detection method based on multidimensional data analysis
CN117057941B (en) * 2023-09-14 2024-03-26 上海甄汇信息科技有限公司 Abnormal consumption detection method based on multidimensional data analysis

Also Published As

Publication number Publication date
CN110929799B (en) 2023-05-12


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant