Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
According to an embodiment of the present invention, a method embodiment of a risk assessment method for user behavior is provided. It should be noted that the steps illustrated in the flowchart of the accompanying drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different from the one shown here.
The method provided by the first embodiment of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking the example of running on a computer terminal, fig. 1 is a hardware structure block diagram of a computer terminal of a risk assessment method for user behavior according to an embodiment of the present invention. As shown in fig. 1, the computer terminal 10 may include one or more (only one shown) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), a memory 104 for storing data, and a transmission device 106 for communication functions. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and is not intended to limit the structure of the electronic device. For example, the computer terminal 10 may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be configured to store software programs and modules of application software, such as program instructions/modules corresponding to the risk assessment method for user behavior in the embodiment of the present invention. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104, that is, it implements the risk assessment method for user behavior described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10. In one example, the transmission device 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 can be a Radio Frequency (RF) module, which is used to communicate with the internet in a wireless manner.
Under the operating environment, the application provides a risk assessment method for user behavior as shown in fig. 2. Fig. 2 is a flowchart of a risk assessment method for user behavior according to a first embodiment of the present invention.
As shown in fig. 2, the method for risk assessment of user behavior may include the following implementation steps:
Step S202, acquiring a user behavior frequency corresponding to the first behavior executed by the first account within a first preset time period.
The first behavior, i.e., the user behavior in step S202, may include all behaviors that the user performs on the website, such as searching, browsing, scoring, commenting, adding to a shopping cart, taking out a shopping basket, adding to a wish list (WishList), purchasing, using a discount coupon, returning goods, and the like; and may even include related activities on third-party websites such as price comparison, viewing related assessments, participating in discussions, communication on social media, interacting with friends, and the like.
When the risk assessment device for user behavior performs risk assessment on a first account, it may acquire behavior data of the first account by day, by week, by month, or over any other time interval; that is, it acquires the first behavior executed by the first account within a first preset time period. The first behavior is essentially an event combination that includes a specific behavior and its object; for example, the first behavior may be "buy-living goods" or "browse-page".
After acquiring the first behavior executed by the first account within the first preset time period, the risk assessment device for user behavior may calculate the corresponding user Behavior Frequency (BF). For the first account, the user behavior frequency is the number of times a behavior occurs within a time window divided by the total number of all behaviors of the first account within that time window, where the time window is the first preset time period.
Taking the case where the first behavior includes "buy-living goods" as an example: if the total number of all behaviors of the first account in the first preset time period is 100 and "buy-living goods" occurs 3 times in that period, the user behavior frequency of "buy-living goods" is 3/100 = 0.03.
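The calculation of the user behavior frequency can be illustrated with a minimal Python sketch. The function name `behavior_frequency` and the representation of an account's behaviors as a list of strings are assumptions made for illustration only; they are not part of the described embodiment.

```python
from collections import Counter


def behavior_frequency(events, behavior):
    """Share of `behavior` among all events of one account in the time window.

    `events` is a hypothetical list of behavior names recorded for the account.
    """
    if not events:
        return 0.0  # no behaviors recorded in the window
    return Counter(events)[behavior] / len(events)


# 3 occurrences of "buy-living goods" among 100 behaviors in the window
events = ["buy-living goods"] * 3 + ["browse-page"] * 97
bf = behavior_frequency(events, "buy-living goods")
print(bf)  # 0.03
```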
Step S204, obtaining the reversal behavior frequency corresponding to the user behavior frequency, where the reversal behavior frequency is obtained according to a first total number and a second total number, the first total number is the number of the first behaviors of all accounts in a first preset time period, and the second total number is the number of all behaviors of all accounts in the first preset time period.
In the above step S204, the Inversion Behavior Frequency (IBF) is the base-10 logarithm of the total number of all behaviors of all accounts in the time window divided by the number of times all accounts execute the first behavior (for example, "buy-living goods") in the time window.
Still taking the case where the first behavior includes "buy-living goods" as an example, if "buy-living goods" occurs 1,000 times in the first preset time period and the total number of all behaviors of all accounts in the first preset time period is 10,000,000, the inversion behavior frequency is lg(10,000,000/1,000) = 4.
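As a sketch of the inversion behavior frequency under the same example figures, the following Python snippet computes the base-10 logarithm of the second total number divided by the first total number. The function and parameter names are hypothetical, and the input validation is an added assumption rather than something the embodiment specifies.

```python
import math


def inversion_behavior_frequency(first_total, second_total):
    """lg(second_total / first_total).

    first_total:  occurrences of the first behavior across all accounts
    second_total: occurrences of all behaviors across all accounts
    """
    if first_total <= 0 or second_total <= 0:
        raise ValueError("both totals must be positive")
    return math.log10(second_total / first_total)


ibf = inversion_behavior_frequency(1_000, 10_000_000)
print(ibf)  # 4.0
```

A behavior that is rare across all accounts yields a large value, while a behavior that nearly every account performs yields a value close to 0.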
Step S206, a first feature value corresponding to the first behavior is obtained according to the user behavior frequency and the inversion behavior frequency.
In the foregoing step S206, the first feature value may serve as an important feature for classifying or clustering the first account. In the embodiment of the present invention, the obtained user behavior frequency may be multiplied by the obtained inversion behavior frequency to obtain the first feature value corresponding to the first behavior, where the larger the first feature value is, the more distinctive the first behavior is.
Still taking the case where the first behavior includes "buy-living goods" as an example, with the user behavior frequency of 0.03 and the inversion behavior frequency of 4 obtained as described above, the first feature value is BF × IBF = 0.03 × 4 = 0.12.
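The product of the two frequencies can be sketched for every behavior of an account at once. In the sketch below, `account_counts` and `global_counts` are hypothetical dictionaries mapping behavior names to occurrence counts in the time window; the figures are chosen to match the "buy-living goods" example (3 of 100 behaviors for the account, 1,000 of 10,000,000 behaviors overall).

```python
import math


def feature_values(account_counts, global_counts):
    """Return {behavior: BF * IBF} for every behavior of one account."""
    account_total = sum(account_counts.values())
    global_total = sum(global_counts.values())
    features = {}
    for behavior, count in account_counts.items():
        bf = count / account_total                                # user behavior frequency
        ibf = math.log10(global_total / global_counts[behavior])  # inversion behavior frequency
        features[behavior] = bf * ibf
    return features


account_counts = {"buy-living goods": 3, "browse-page": 97}
global_counts = {"buy-living goods": 1_000, "browse-page": 9_999_000}
features = feature_values(account_counts, global_counts)
print(round(features["buy-living goods"], 6))  # 0.12
```

In this illustration "browse-page" receives a much smaller feature value than "buy-living goods" even though the account performs it far more often, because almost all behaviors of all accounts are "browse-page".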
It should be noted that the description here only takes the case where the first behavior includes "buy-living goods" as an example. When the first behavior further includes other behaviors, such as "browse-page", the calculation method is the same as the above method, and details are not described here.
Step S208, calculating a feature ratio of the first feature value in the feature values of all behaviors of all accounts according to the first feature value corresponding to the first behavior.
In step S208, after obtaining the first feature value corresponding to the first behavior based on the user behavior frequency and the reversal behavior frequency, the risk assessment apparatus for the user behavior needs to calculate a feature ratio of the first feature value in the feature values of all behaviors of all accounts, so as to be used as an input parameter for risk assessment in the following.
The feature values of all behaviors of all accounts can be calculated according to the methods described in the above steps S202 to S206, and are not described herein again.
Step S210, obtaining a risk assessment result of the first account executing the first behavior within the first preset time period based on the feature proportion and the pre-obtained user behavior parameters.
In the above step S210, the pre-obtained user behavior parameters may include a conditional probability parameter and a classification ratio, where the conditional probability parameter and the classification ratio are obtained by creating a training sample set in advance and based on a naive bayes model, and a detailed description will be given in a subsequent embodiment of a specific method.
The risk assessment device for the user behavior of the embodiment of the invention can perform risk assessment on the first behavior executed by the first account in the first preset time period based on the characteristic proportion and the pre-acquired user behavior parameters.
As can be seen from the above, in the scheme provided in the first embodiment of the present application, the characteristic proportion of the first behavior is obtained by obtaining the user behavior frequency and the reversal behavior frequency of the first account, and then the risk assessment result of the first behavior is obtained based on the pre-obtained user behavior parameters, so that the purpose of accurately performing risk assessment on the user behavior is achieved, thereby achieving the technical effect of increasing the accuracy of risk assessment, and further solving the technical problem that in some special cases, the error rate of the risk assessment result is high due to the fact that the risk assessment of the user behavior is performed only based on the user behavior frequency in the prior art.
In an alternative solution provided by the foregoing embodiment of the application, in step S202, acquiring the user behavior frequency corresponding to the first behavior executed by the first account within the first preset time period may include:
S20, determining a third total number and a fourth total number, wherein the third total number refers to the number of the first behaviors performed by the first account within the first preset time period, and the fourth total number refers to the number of all the behaviors performed by the first account within the first preset time period.
In step S20, when the user behavior risk assessment apparatus needs to obtain the user behavior frequency corresponding to the first behavior executed by the first account within the first time period, it needs to obtain two data, one of which is the number of the first behavior executed by the first account within the first preset time period, that is, the third total number, and the second of which is the number of all the behaviors of the first account within the first preset time period, that is, the fourth total number.
Still taking the case where the first behavior includes "buy-living goods" as an example, the number of times the first account performs the first behavior within the first preset time period, that is, the third total number, is 3, and the number of all behaviors performed by the first account within the first preset time period, that is, the fourth total number, is 100.
S22, calculating the user behavior frequency according to the third total number and the fourth total number.
In step S22, after determining the number of first behaviors performed by the first account in the first preset time period and the number of all behaviors performed by the first account in the first preset time period, the risk assessment device for user behavior may calculate the user behavior frequency from these two numbers.
Still taking the case where the first behavior includes "buy-living goods" as an example, since the risk assessment apparatus for user behavior determines that the third total number is 3 and the fourth total number is 100, the user behavior frequency is 3/100 = 0.03.
In an alternative solution provided by the foregoing embodiment of the present application, in step S204, obtaining the inversion behavior frequency corresponding to the user behavior frequency may include:
S30, determining the first total number and the second total number.
In step S30, when the risk assessment apparatus for user behavior obtains the inversion behavior frequency corresponding to the user behavior frequency, it needs to obtain two values: the number of first behaviors of all accounts in the first preset time period, that is, the first total number, and the number of all behaviors of all accounts in the first preset time period, that is, the second total number.
Still taking the case where the first behavior includes "buy-living goods" as an example, if "buy-living goods" occurs 1,000 times within the first preset time period and the total number of all behaviors of all accounts in the first preset time period is 10,000,000, then the number of the first behaviors of all accounts in the first preset time period, that is, the first total number, is 1,000, and the number of all behaviors of all accounts in the first preset time period, that is, the second total number, is 10,000,000.
S32, calculating the inversion behavior frequency by the formula I = lg(k/q), where I denotes the inversion behavior frequency, k denotes the second total number, and q denotes the first total number.
In step S32, after determining the number of first behaviors of all accounts within the first preset time period and the number of all behaviors of all accounts within the first preset time period, the risk assessment device for user behavior may calculate the inversion behavior frequency from these two numbers.
Still taking the case where the first behavior includes "buy-living goods" as an example, since the risk assessment apparatus for user behavior determines that the first total number is 1,000 and the second total number is 10,000,000, the inversion behavior frequency is lg(10,000,000/1,000) = 4.
In an alternative solution provided by the foregoing embodiment of the application, in step S208, calculating, according to the first feature value corresponding to the first behavior, the feature ratio of the first feature value among the feature values of all behaviors of all accounts may include:
obtaining, by the formula [formula missing in source], the feature ratio of the first feature value among the feature values of all behaviors of all accounts, where $a_j$ represents the first feature value, $P(a_j)$ represents the feature ratio of $a_j$ among the feature values of all behaviors of all accounts, and $j$ is an integer greater than 0.
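The text describes the feature ratio only in words, as the proportion of one feature value among the feature values of all behaviors of all accounts. One plausible reading, which is an assumption and not a formula confirmed by the text, is that each feature value is divided by the sum of all feature values. A minimal Python sketch under that assumption, with a hypothetical function name and made-up input values:

```python
def feature_ratios(values):
    """Normalise feature values so that they sum to 1.

    Assumption: the feature ratio of a_j is a_j divided by the sum of all
    feature values. The text does not confirm this exact form.
    """
    total = sum(values)
    if total == 0:
        raise ValueError("feature values must not all be zero")
    return [v / total for v in values]


# Illustrative feature values; 0.12 is the "buy-living goods" example value.
ratios = feature_ratios([0.12, 0.03, 0.05])
print([round(r, 2) for r in ratios])  # [0.6, 0.15, 0.25]
```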
Optionally, calculating the user behavior frequency according to the third total and the fourth total, including: and dividing the third total number by the fourth total number to obtain the user behavior frequency.
Optionally, obtaining a first feature value corresponding to the first behavior according to the user behavior frequency and the inversion behavior frequency includes: and multiplying the user behavior frequency by the reversal behavior frequency to obtain a first characteristic value corresponding to the first behavior.
In an alternative solution provided by the foregoing embodiment of the present application, as shown in fig. 3, in a case that the pre-obtained user behavior parameters include a conditional probability parameter and a classification ratio, the step of obtaining the conditional probability parameter and the classification ratio may include:
S302, creating a training sample set, wherein the training sample set comprises at least one sample characteristic value and a risk assessment label corresponding to the at least one sample characteristic value.
In step S302, before the risk assessment apparatus for user behavior performs risk assessment on the first behavior executed by the first account within the first preset time period, a training sample set may be created, and a model based on naive bayes is established.
Similar to the calculation process of the first characteristic value, the at least one sample characteristic value in the training sample set may also be obtained from a sample user behavior frequency corresponding to the sample behavior of the sample account and a sample reversal behavior frequency corresponding to the sample user behavior frequency. Optionally, the risk assessment label may be 0 or 1; for example, 0 indicates no risk and 1 indicates risk.
Optionally, the step S302 of creating a training sample set, where the training sample set comprises at least one sample characteristic value and a risk assessment label corresponding to the at least one sample characteristic value, may include:
S40, acquiring at least one sample user behavior of at least one sample account in a second preset time period.
In step S40, the training sample set created by the risk assessment apparatus for user behavior is likewise based on the behaviors of some users in a certain time period. To distinguish them from the first account and the first behavior, the accounts in the training sample set are referred to as sample accounts and their behaviors as sample user behaviors. The sample user behaviors may also include all behaviors that users perform on the website, such as searching, browsing, scoring, commenting, adding to a shopping cart, taking out a shopping basket, adding to a wish list, purchasing, using a discount coupon, returning goods, and the like; and may even include related activities on third-party websites such as price comparison, viewing related assessments, participating in discussions, communication on social media, interacting with friends, and the like.
S42, calculating a sample user behavior frequency of the at least one sample user behavior and a sample reversal behavior frequency corresponding to the sample user behavior frequency, wherein the sample reversal behavior frequency is obtained according to a fifth total number and a sixth total number, and the fifth total number and the sixth total number are obtained from the number of the at least one sample user behavior of all accounts in the second preset time period and the total number of all behaviors of all accounts in the second preset time period.
In the above step S42, similar to the above process, after acquiring the at least one sample user behavior of the at least one sample account in the second preset time period, the risk assessment apparatus for user behavior needs to calculate the sample user behavior frequency of the at least one sample user behavior and the sample reversal behavior frequency corresponding to the sample user behavior frequency, where the sample reversal behavior frequency is obtained according to the fifth total number and the sixth total number, and the fifth total number and the sixth total number are obtained from the number of the at least one sample user behavior of all accounts in the second preset time period and the total number of all behaviors of all accounts in the second preset time period.
Optionally, calculating a sample user behavior frequency of the at least one sample user behavior, and a sample reversal behavior frequency corresponding to the sample user behavior frequency may include:
dividing the number of the at least one sample user behavior of the at least one sample account in the second preset time period by the number of all behaviors of the at least one sample account in the second preset time period to obtain the sample user behavior frequency of the at least one sample user behavior; and calculating the sample inversion behavior frequency by the formula I' = lg(k'/q'), where I' denotes the sample inversion behavior frequency, k' denotes the fifth total number, and q' denotes the sixth total number.
And S44, obtaining at least one sample characteristic value according to the sample user behavior frequency and the sample inversion behavior frequency.
In the above step S44, similar to the above process, the risk assessment device for user behavior may obtain at least one sample feature value according to the sample user behavior frequency and the sample inversion behavior frequency after calculating the sample user behavior frequency of at least one sample user behavior and the sample inversion behavior frequency corresponding to the sample user behavior frequency.
Optionally, obtaining at least one sample feature value according to the sample user behavior frequency and the sample inversion behavior frequency includes: and multiplying the sample user behavior frequency by the sample reversal behavior frequency to obtain at least one sample characteristic value.
And S46, creating a training sample set according to the at least one sample characteristic value and the risk assessment label corresponding to the at least one sample characteristic value.
In step S46, the risk assessment apparatus for user behavior creates a training sample set based on the at least one sample feature value and the risk assessment label corresponding to the at least one sample feature value after obtaining the at least one sample feature value.
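Steps S40 to S46 can be sketched as a single function that turns raw sample behaviors into labelled feature records. This is a minimal sketch: the data layout (`sample_events` mapping each sample account to a list of behavior names, `labels` mapping each sample account to 0 or 1) and the function name are assumptions made for illustration, and the accounts and behaviors shown are made up.

```python
import math
from collections import Counter


def build_training_set(sample_events, labels):
    """Return a list of (features, label) records, one per sample account.

    features maps each behavior of the account to
    sample BF * sample IBF, computed over the second preset time period.
    """
    global_counts = Counter(b for events in sample_events.values() for b in events)
    global_total = sum(global_counts.values())
    training_set = []
    for account, events in sample_events.items():
        counts = Counter(events)
        features = {
            behavior: (count / len(events))
            * math.log10(global_total / global_counts[behavior])
            for behavior, count in counts.items()
        }
        training_set.append((features, labels[account]))
    return training_set


sample_events = {
    "u1": ["browse-page"] * 9 + ["buy-living goods"],
    "u2": ["browse-page"] * 5 + ["return-goods"] * 5,
}
labels = {"u1": 0, "u2": 1}  # 0: no risk, 1: risk
training_set = build_training_set(sample_events, labels)
```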
It should be added that, after obtaining the at least one sample characteristic value and the risk assessment label corresponding to the at least one sample characteristic value, the risk assessment apparatus for user behavior may further optimize them. That is, it may first draw T pieces of data with replacement from the set formed by the at least one sample characteristic value and the corresponding risk assessment labels, where each piece of data includes N sample characteristic values and the risk assessment labels corresponding to the N sample characteristic values, and then extract M sample characteristic values and the risk assessment labels corresponding to the M sample characteristic values, so as to obtain the training sample set, where $M = Z^{1/2}$ and the value of T is slightly larger than the value of Z; for example, Z is 400 and T is 500.
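The resampling step is described only briefly, so the following Python sketch shows one possible reading rather than the definitive procedure. It assumes that Z is the number of records in the original set, that T records are drawn with replacement, and that M records (the integer square root of Z) are then selected from that draw. The function name, the fixed seed, and the record layout are hypothetical.

```python
import math
import random


def resample(records, t, seed=0):
    """Draw t records with replacement, then keep m = isqrt(len(records)) of them.

    Assumption: this two-stage draw is one interpretation of the text,
    which does not fully specify how Z, N, M and T relate.
    """
    rng = random.Random(seed)               # fixed seed for reproducibility
    m = math.isqrt(len(records))            # M = Z^(1/2), rounded down
    bootstrap = rng.choices(records, k=t)   # sampling with replacement
    subset = rng.sample(bootstrap, m)       # M records selected from the draw
    return bootstrap, subset


# Z = 400 made-up (feature value, label) records, T = 500 draws
records = [(i / 400, i % 2) for i in range(400)]
bootstrap, subset = resample(records, t=500)
print(len(bootstrap), len(subset))  # 500 20
```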
S304, obtaining a conditional probability parameter and a classification proportion according to the at least one sample characteristic value and the risk assessment label corresponding to the at least one sample characteristic value.
In the above step S304, similar to the above process, after obtaining at least one sample characteristic value and a risk assessment label corresponding to the at least one sample characteristic value, the risk assessment device for user behavior may obtain the conditional probability parameter and the classification ratio according to the at least one sample characteristic value and the risk assessment label corresponding to the at least one sample characteristic value.
Optionally, in step S304, obtaining a conditional probability parameter and a classification ratio according to at least one sample characteristic value and a risk assessment label corresponding to the at least one sample characteristic value may include:
by the formula $P(a'_j \mid c_i) = \dfrac{\mathrm{Count}(a'_j \mid c_i)}{\mathrm{Count}(c_i)}$, obtaining the conditional probability parameter, where $P(a'_j \mid c_i)$ represents the conditional probability parameter of $a'_j$ belonging to $c_i$, $a'_j$ represents a sample characteristic value, $c_i$ represents a risk assessment label, $\mathrm{Count}(a'_j \mid c_i)$ represents the number of times $a'_j$ appears among the samples belonging to $c_i$, $\mathrm{Count}(c_i)$ represents the number of samples belonging to $c_i$, $0 < j < n$, $n$ is the total number of samples in the training sample set, $0 < i < m$, $m$ is the number of types of risk assessment labels, and $i$ and $j$ are integers; and
by the formula [formula missing in source], obtaining the classification ratio, where $P(c_i)$ represents the classification ratio of $c_i$ among all risk assessment labels.
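The two parameters can be estimated by simple counting, as in a standard naive Bayes model. The sketch below makes two assumptions that the text does not state: sample characteristic values are discretised into bins (here the hypothetical labels "low" and "high") so that occurrences can be counted, and the classification ratio of a label is the number of samples carrying that label divided by the total number of samples n.

```python
from collections import Counter


def fit_naive_bayes(samples):
    """Estimate conditional probability parameters and classification ratios.

    samples: list of (binned_feature_value, risk_label) pairs (assumed layout).
    Returns (cond_prob, class_ratio) where
      cond_prob[(a, c)] = Count(a | c) / Count(c)
      class_ratio[c]    = Count(c) / n   (assumed form)
    """
    n = len(samples)
    class_counts = Counter(label for _, label in samples)
    joint_counts = Counter(samples)  # counts each (value, label) pair
    cond_prob = {
        (value, label): count / class_counts[label]
        for (value, label), count in joint_counts.items()
    }
    class_ratio = {label: count / n for label, count in class_counts.items()}
    return cond_prob, class_ratio


# Made-up training samples: label 0 means no risk, label 1 means risk.
samples = [("low", 0), ("low", 0), ("high", 0),
           ("high", 1), ("high", 1), ("low", 1)]
cond_prob, class_ratio = fit_naive_bayes(samples)
```

With these made-up samples, "high" appears in two of the three samples labelled 1, so `cond_prob[("high", 1)]` is 2/3, and each label accounts for half of the samples.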
In an alternative solution provided by the foregoing embodiment of the application, in step S210, obtaining a risk assessment result of the first account executing the first action within the first preset time period based on the feature proportion and the pre-obtained user action parameter, may include:
S50, obtaining, by the formula [formula missing in source], the risk assessment result of the first account executing the first behavior within the first preset time period, where $c_{MAP}$ is the risk assessment result of the first account executing the first behavior within the first preset time period.
In the above step S50 of the present application, after obtaining $P(a_j)$, the risk assessment device for user behavior obtains, based on $P(a'_j \mid c_i)$ and $P(c_i)$ obtained from the training sample set, the risk assessment result $c_{MAP}$ of the first account executing the first behavior within the first preset time period.
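The subscript MAP suggests a maximum a posteriori decision. The sketch below applies the textbook naive Bayes rule, which selects the label that maximises the classification ratio multiplied by the conditional probabilities of the observed features. This is an assumption: the text does not show how the feature proportion enters the formula, so the sketch simply treats binned feature values as the observed evidence. The function name, the `floor` value for unseen pairs, and the parameter values are all hypothetical.

```python
import math


def map_class(observed, cond_prob, class_ratio, floor=1e-9):
    """Return the label maximising log P(c) + sum of log P(a | c).

    Unseen (value, label) pairs receive a small floor probability so that
    the logarithm stays defined; this smoothing is an added assumption.
    """
    best_label, best_score = None, -math.inf
    for label, prior in class_ratio.items():
        score = math.log(prior)
        for value in observed:
            score += math.log(cond_prob.get((value, label), floor))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up parameters: label 0 means no risk, label 1 means risk.
cond_prob = {("high", 1): 2 / 3, ("low", 1): 1 / 3,
             ("high", 0): 1 / 3, ("low", 0): 2 / 3}
class_ratio = {0: 0.5, 1: 0.5}
result = map_class(["high", "high"], cond_prob, class_ratio)
print(result)  # 1
```

Working in log space avoids numerical underflow when many conditional probabilities are multiplied together.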
The following describes a risk assessment method for user behavior according to an embodiment of the present invention with reference to fig. 4:
Step A, collecting sample behaviors of the sample accounts within a second preset time period.
To distinguish them from the first account and the first behavior, the accounts in the training sample set are referred to as sample accounts and their behaviors as sample user behaviors. The sample user behaviors may also include all behaviors that users perform on the website, such as searching, browsing, scoring, commenting, adding to a shopping cart, taking out a shopping basket, adding to a wish list, purchasing, using a discount coupon, returning goods, and the like; and may even include related activities on third-party websites such as price comparison, viewing related assessments, participating in discussions, communication on social media, interacting with friends, and the like.
Step B, calculating a sample BF and a sample IBF.
The number of the at least one sample user behavior of the at least one sample account in the second preset time period is divided by the number of all behaviors of the at least one sample account in the second preset time period to obtain the sample user behavior frequency of the at least one sample user behavior; and the sample inversion behavior frequency is calculated by the formula I' = lg(k'/q'), where I' denotes the sample inversion behavior frequency, k' denotes the fifth total number, and q' denotes the sixth total number.
Step C, summarizing the sample BF, the sample IBF, and the corresponding risk assessment labels, and creating a training sample set.
After calculating the sample user behavior frequency of the at least one sample user behavior and the sample reversal behavior frequency corresponding to the sample user behavior frequency, the risk assessment device for the user behavior can obtain at least one sample characteristic value according to the sample user behavior frequency and the sample reversal behavior frequency.
Optionally, obtaining at least one sample feature value according to the sample user behavior frequency and the sample inversion behavior frequency includes: and multiplying the sample user behavior frequency by the sample reversal behavior frequency to obtain at least one sample characteristic value.
After obtaining the at least one sample characteristic value, the risk assessment device for the user behavior creates a training sample set based on the at least one sample characteristic value and a risk assessment label corresponding to the at least one sample characteristic value.
Step D, obtaining user behavior parameters based on the training sample set.
The user behavior parameters comprise conditional probability parameters and classification proportions.
In particular, by the formula $P(a'_j \mid c_i) = \dfrac{\mathrm{Count}(a'_j \mid c_i)}{\mathrm{Count}(c_i)}$, the conditional probability parameter is obtained, where $P(a'_j \mid c_i)$ represents the conditional probability parameter of $a'_j$ belonging to $c_i$, $a'_j$ represents a sample characteristic value, $c_i$ represents a risk assessment label, $\mathrm{Count}(a'_j \mid c_i)$ represents the number of times $a'_j$ appears among the samples belonging to $c_i$, $\mathrm{Count}(c_i)$ represents the number of samples belonging to $c_i$, $0 < j < n$, $n$ is the total number of samples in the training sample set, $0 < i < m$, $m$ is the number of types of risk assessment labels, and $i$ and $j$ are integers; and
by the formula [formula missing in source], the classification ratio is obtained, where $P(c_i)$ represents the classification ratio of $c_i$ among all risk assessment labels.
Step E, performing risk assessment on the first behavior of the first account in the first preset time period.
Like the above steps S202 to S210, the risk assessment apparatus for user behavior may perform risk assessment on the first behavior of the first account in the first preset time period, so as to obtain a risk assessment result of the first behavior of the first account in the first preset time period.
In the embodiment of the invention, the user behavior frequency corresponding to the first behavior executed by the first account within the first preset time period is obtained; the reversal behavior frequency corresponding to the user behavior frequency is obtained, where the reversal behavior frequency is obtained according to a first total number and a second total number, the first total number being the number of first behaviors of all accounts in the first preset time period and the second total number being the number of all behaviors of all accounts in the first preset time period; a first characteristic value corresponding to the first behavior is obtained according to the user behavior frequency and the reversal behavior frequency; the characteristic proportion of the first characteristic value among the characteristic values of all behaviors of all accounts is calculated according to the first characteristic value; and a risk assessment result of the first account executing the first behavior within the first preset time period is obtained based on the characteristic proportion and the pre-obtained user behavior parameters. Because the characteristic proportion of the first behavior is derived from both the user behavior frequency and the reversal behavior frequency of the first account, and the risk assessment result is then obtained based on the pre-obtained user behavior parameters, the purpose of accurately performing risk assessment on the user behavior is achieved. This increases the accuracy of risk assessment and solves the technical problem in the prior art that, in some special cases, performing risk assessment based only on the user behavior frequency leads to a high error rate in the risk assessment result.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Example 2
According to an embodiment of the present invention, an apparatus embodiment for implementing the foregoing method embodiment is also provided, and the apparatus provided by the foregoing embodiment of the present application may be run on a computer terminal.
Fig. 5 is a schematic structural diagram of a risk assessment device for user behavior according to an embodiment of the present application.
As shown in fig. 5, the risk assessment apparatus for user behavior may include a first obtaining unit 502, a second obtaining unit 504, a processing unit 506, a first calculating unit 508, and a risk assessment unit 510.
The first obtaining unit 502 is configured to obtain a user behavior frequency corresponding to a first behavior executed by a first account within a first preset time period. The second obtaining unit 504 is configured to obtain a reversal behavior frequency corresponding to the user behavior frequency, where the reversal behavior frequency is obtained according to a first total number and a second total number, the first total number is the number of first behaviors of all accounts in the first preset time period, and the second total number is the number of all behaviors of all accounts in the first preset time period. The processing unit 506 is configured to obtain a first feature value corresponding to the first behavior according to the user behavior frequency and the reversal behavior frequency. The first calculating unit 508 is configured to calculate, according to the first feature value corresponding to the first behavior, the feature proportion of the first feature value among the feature values of all behaviors of all accounts. The risk assessment unit 510 is configured to obtain a risk assessment result of the first account executing the first behavior within the first preset time period based on the feature proportion and a pre-obtained user behavior parameter.
As can be seen from the above, in the scheme provided in the first embodiment of the present application, the characteristic proportion of the first behavior is obtained by obtaining the user behavior frequency and the reversal behavior frequency of the first account, and then the risk assessment result of the first behavior is obtained based on the pre-obtained user behavior parameters, so that the purpose of accurately performing risk assessment on the user behavior is achieved, thereby achieving the technical effect of increasing the accuracy of risk assessment, and further solving the technical problem that in some special cases, the error rate of the risk assessment result is high due to the fact that the risk assessment of the user behavior is performed only based on the user behavior frequency in the prior art.
It should be noted here that the first obtaining unit 502, the second obtaining unit 504, the processing unit 506, the first calculating unit 508 and the risk assessment unit 510 correspond to steps S202 to S210 in the first embodiment. The five modules share the examples and application scenarios of their corresponding steps, but are not limited to what is disclosed in the first embodiment. The modules described above, as part of the apparatus, may run in the computer terminal 10 provided in the first embodiment and may be implemented in software or hardware.
Optionally, as shown in fig. 6, the first obtaining unit 502 may include: a first determining subunit 602 and a first calculating subunit 604.
A first determining subunit 602, configured to determine a third total and a fourth total, where the third total refers to a number of the first actions performed by the first account within the first preset time period, and the fourth total refers to a number of all the actions performed by the first account within the first preset time period; a first calculating subunit 604, configured to calculate the user behavior frequency according to the third total and the fourth total.
It should be noted here that the first determining subunit 602 and the first calculating subunit 604 correspond to steps S20 to S22 in the first embodiment. The two modules share the examples and application scenarios of their corresponding steps, but are not limited to what is disclosed in the first embodiment. The modules described above, as part of the apparatus, may run in the computer terminal 10 provided in the first embodiment and may be implemented in software or hardware.
Optionally, as shown in fig. 7, the second obtaining unit 504 may include: a second determining subunit 702 and a second calculating subunit 704.
The second determining subunit 702 is configured to determine the first total number and the second total number; the second calculating subunit 704 is configured to calculate the reversal behavior frequency by the formula I = lg(k/q), where I represents the reversal behavior frequency, k represents the second total number, and q represents the first total number.
It should be noted here that the second determining subunit 702 and the second calculating subunit 704 correspond to steps S30 to S32 in the first embodiment. The two modules share the examples and application scenarios of their corresponding steps, but are not limited to what is disclosed in the first embodiment. The modules described above, as part of the apparatus, may run in the computer terminal 10 provided in the first embodiment and may be implemented in software or hardware.
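As an illustration of the formula I = lg(k/q), the following minimal Python sketch computes the reversal behavior frequency. It assumes that lg denotes the base-10 logarithm; the function name and the sample counts are hypothetical and are not part of the embodiment.

```python
import math


def reversal_behavior_frequency(second_total: int, first_total: int) -> float:
    """Return I = lg(k / q).

    k (second_total): number of all behaviors of all accounts in the period.
    q (first_total):  number of first behaviors of all accounts in the period.
    lg is read here as the base-10 logarithm (an assumption).
    """
    if second_total <= 0 or first_total <= 0:
        raise ValueError("both totals must be positive")
    return math.log10(second_total / first_total)


# Hypothetical counts: a behavior that is rare across all accounts scores
# higher than one that is common across all accounts.
rare = reversal_behavior_frequency(second_total=10_000, first_total=10)
common = reversal_behavior_frequency(second_total=10_000, first_total=1_000)
```

Under this reading, a behavior that few accounts perform receives a larger I, so it weighs more heavily in the feature value than a behavior that every account performs routinely.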
Optionally, the first calculating unit 508 is configured to perform the following step to calculate, according to the first feature value corresponding to the first behavior, the feature proportion of the first feature value among the feature values of all behaviors of all accounts: obtaining, by a formula (not reproduced in this text), the feature proportion of the first feature value among the feature values of all behaviors of all accounts, where a_j represents the first feature value, P(a_j) represents the feature proportion of a_j among the feature values of all behaviors of all accounts, and j is an integer greater than 0.
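The formula for P(a_j) does not appear in the text above. One plausible reading, offered only as an assumption, is a simple normalization in which each feature value is divided by the sum of all feature values. The sketch below follows that reading; the function name and the behavior names are hypothetical.

```python
from typing import Dict


def feature_proportions(feature_values: Dict[str, float]) -> Dict[str, float]:
    """Assumed form: P(a_j) = a_j / sum of all feature values."""
    total = sum(feature_values.values())
    if total <= 0:
        raise ValueError("the sum of feature values must be positive")
    return {name: value / total for name, value in feature_values.items()}


# Hypothetical feature values for three behaviors.
proportions = feature_proportions({"login": 1.0, "transfer": 3.0, "query": 1.0})
```

With this normalization the proportions sum to 1, which lets them be treated as a distribution over behaviors in the later classification step.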
Optionally, the first calculating subunit 604 is configured to perform the following step to calculate the user behavior frequency according to the third total and the fourth total: dividing the third total number by the fourth total number to obtain the user behavior frequency.
The processing unit 506 is configured to perform the following step to obtain the first feature value corresponding to the first behavior according to the user behavior frequency and the reversal behavior frequency: multiplying the user behavior frequency by the reversal behavior frequency to obtain the first feature value corresponding to the first behavior.
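The division and multiplication described above can be sketched as follows. This is a minimal illustration rather than the embodiment's implementation: the function names and counts are hypothetical, and lg in I = lg(k/q) is assumed to be the base-10 logarithm.

```python
import math


def user_behavior_frequency(third_total: int, fourth_total: int) -> float:
    """Third total (first behaviors of the account) / fourth total (all its behaviors)."""
    if fourth_total <= 0:
        raise ValueError("the fourth total must be positive")
    return third_total / fourth_total


def first_feature_value(third_total: int, fourth_total: int,
                        first_total: int, second_total: int) -> float:
    """User behavior frequency multiplied by the reversal behavior frequency."""
    frequency = user_behavior_frequency(third_total, fourth_total)
    reversal = math.log10(second_total / first_total)  # I = lg(k / q)
    return frequency * reversal


# Hypothetical counts: the account performed the behavior 5 times out of 50,
# while all accounts performed it 100 times out of 10,000 behaviors.
value = first_feature_value(third_total=5, fourth_total=50,
                            first_total=100, second_total=10_000)
```

Here the frequency is 0.1 and the reversal behavior frequency is 2, so the feature value is about 0.2.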
Optionally, as shown in fig. 8, in a case that the pre-obtained user behavior parameters include a conditional probability parameter and a classification ratio, the risk assessment apparatus for user behavior may further include: a creating unit 802 and a second calculating unit 804.
The creating unit 802 is configured to create a training sample set, where the training sample set includes at least one sample feature value and a risk assessment label corresponding to the at least one sample feature value; the second calculating unit 804 is configured to obtain the conditional probability parameter and the classification ratio according to the at least one sample feature value and the risk assessment label corresponding to the at least one sample feature value.
It should be noted here that the creating unit 802 and the second calculating unit 804 correspond to steps S302 to S304 in the first embodiment. The two modules share the examples and application scenarios of their corresponding steps, but are not limited to what is disclosed in the first embodiment. The modules described above, as part of the apparatus, may run in the computer terminal 10 provided in the first embodiment and may be implemented in software or hardware.
Optionally, as shown in fig. 9, the creating unit 802 may include: an acquiring subunit 902, a third calculating subunit 904, a fourth calculating subunit 906, and a creating subunit 908.
The acquiring subunit 902 is configured to acquire at least one sample user behavior of at least one sample account within a second preset time period. The third calculating subunit 904 is configured to calculate a sample user behavior frequency of the at least one sample user behavior and a sample reversal behavior frequency corresponding to the sample user behavior frequency, where the sample reversal behavior frequency is obtained according to a fifth total number and a sixth total number, and the fifth total number and the sixth total number are obtained from the number of the at least one sample user behavior of all accounts in the second preset time period and the total number of all behaviors of all accounts in the second preset time period. The fourth calculating subunit 906 is configured to obtain the at least one sample feature value according to the sample user behavior frequency and the sample reversal behavior frequency. The creating subunit 908 is configured to create the training sample set according to the at least one sample feature value and the risk assessment label corresponding to the at least one sample feature value.
It should be noted here that the acquiring subunit 902, the third calculating subunit 904, the fourth calculating subunit 906 and the creating subunit 908 correspond to steps S40 to S46 in the first embodiment. The four modules share the examples and application scenarios of their corresponding steps, but are not limited to what is disclosed in the first embodiment. The modules described above, as part of the apparatus, may run in the computer terminal 10 provided in the first embodiment and may be implemented in software or hardware.
Optionally, the third calculating subunit 904 is configured to calculate the sample user behavior frequency of the at least one sample user behavior and the sample reversal behavior frequency corresponding to the sample user behavior frequency by:
dividing the number of the at least one sample user behavior of the at least one sample account in the second preset time period by the number of all behaviors of the at least one sample account in the second preset time period to obtain the sample user behavior frequency of the at least one sample user behavior; and
calculating the sample reversal behavior frequency by the formula I' = lg(k'/q'), where I' denotes the sample reversal behavior frequency, k' denotes the fifth total number, and q' denotes the sixth total number.
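The two calculations above can be sketched for a small behavior log as follows. The log format (a list of account and behavior pairs), the function name and the risk label are hypothetical. By analogy with I = lg(k/q), the sketch assumes that k' is the total number of all behaviors of all accounts and q' is the number of occurrences of the sample behavior across all accounts, and that lg is the base-10 logarithm.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def sample_feature_values(log: List[Tuple[str, str]], account: str) -> Dict[str, float]:
    """Return frequency * lg(k'/q') for each behavior of one sample account."""
    per_account = Counter(behavior for acct, behavior in log if acct == account)
    overall = Counter(behavior for _, behavior in log)
    account_total = sum(per_account.values())  # all behaviors of the sample account
    overall_total = len(log)                   # assumed k': all behaviors of all accounts
    features: Dict[str, float] = {}
    for behavior, count in per_account.items():
        frequency = count / account_total      # sample user behavior frequency
        reversal = math.log10(overall_total / overall[behavior])  # I' = lg(k'/q')
        features[behavior] = frequency * reversal
    return features


# Hypothetical log for the second preset time period.
log = [("u1", "login"), ("u1", "transfer"), ("u2", "login"), ("u2", "login")]
features = sample_feature_values(log, "u1")
# One training sample: feature values paired with a hypothetical label.
training_set = [(features, "high_risk")]
```

In this toy log, "transfer" is performed only by u1, so it receives a larger feature value than "login", which both accounts perform.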
Optionally, the second calculating unit 804 is configured to perform the following steps to obtain the conditional probability parameter and the classification ratio according to the at least one sample feature value and the risk assessment label corresponding to the at least one sample feature value:
obtaining the conditional probability parameter by a formula (not reproduced in this text), where P(a'_j | c_i) represents the conditional probability parameter of a'_j belonging to c_i, a'_j represents the sample feature value, c_i represents the risk assessment label, Count(a'_j | c_i) represents the number of times a'_j appears among those belonging to c_i, Count(c_i) represents the total count belonging to c_i, 0 < j < n, n is the total number of samples in the training sample set, 0 < i < m, m is the number of types of risk assessment labels, and i and j are integers; and
obtaining the classification ratio by a formula (not reproduced in this text), where P(c_i) represents the classification ratio of c_i among all risk assessment labels.
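The formulas themselves are not shown in the text, but the definitions of Count(a'_j | c_i) and Count(c_i) suggest the usual counting estimates of a naive Bayes classifier. The sketch below assumes P(a'_j | c_i) = Count(a'_j | c_i) / Count(c_i) and P(c_i) = Count(c_i) / n, and further assumes that sample feature values have already been mapped to discrete levels. The function name, levels and labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple


def estimate_parameters(
    samples: List[Tuple[Hashable, str]],
) -> Tuple[Dict[Tuple[Hashable, str], float], Dict[str, float]]:
    """Estimate conditional probabilities and classification ratios by counting.

    samples: (discretized sample feature value, risk assessment label) pairs.
    """
    label_counts = Counter(label for _, label in samples)  # Count(c_i)
    pair_counts = Counter(samples)                         # Count(a'_j | c_i)
    total = len(samples)                                   # n
    classification_ratio = {
        label: count / total for label, count in label_counts.items()
    }
    conditional = {
        (value, label): count / label_counts[label]
        for (value, label), count in pair_counts.items()
    }
    return conditional, classification_ratio


# Hypothetical training sample set.
samples = [("low", "normal"), ("low", "normal"), ("high", "normal"), ("high", "risky")]
conditional, classification_ratio = estimate_parameters(samples)
```

Pairs that never occur in the training sample set are simply absent from the result; whether the embodiment applies any smoothing for such cases is not stated in the text.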
Optionally, the risk assessment unit 510 is configured to perform the following step to obtain, based on the feature proportion and the pre-obtained user behavior parameters, the risk assessment result of the first account executing the first behavior within the first preset time period: obtaining the risk assessment result by a formula (not reproduced in this text), where c_MAP is the risk assessment result of the first account executing the first behavior within the first preset time period.
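The subscript MAP suggests a maximum a posteriori decision, although the formula is not shown in the text. A common form, assumed here, is c_MAP = argmax over c_i of P(c_i) multiplied by the product of P(a_j | c_i) over the observed values. The sketch below follows that assumption; the function name, the probability floor for unseen pairs, and all parameter values are hypothetical.

```python
import math
from typing import Dict, Hashable, List, Optional, Tuple


def map_label(
    observed: List[Hashable],
    conditional: Dict[Tuple[Hashable, str], float],
    classification_ratio: Dict[str, float],
    floor: float = 1e-9,
) -> Optional[str]:
    """Return the label maximizing P(c) * prod P(a | c), computed in log space."""
    best_label, best_score = None, -math.inf
    for label, prior in classification_ratio.items():
        score = math.log(prior)
        for value in observed:
            # Pairs unseen in training fall back to a small floor (an assumption).
            score += math.log(conditional.get((value, label), floor))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical pre-obtained user behavior parameters.
classification_ratio = {"normal": 0.8, "risky": 0.2}
conditional = {
    ("low", "normal"): 0.9, ("high", "normal"): 0.1,
    ("low", "risky"): 0.2, ("high", "risky"): 0.8,
}
result_high = map_label(["high"], conditional, classification_ratio)
result_low = map_label(["low"], conditional, classification_ratio)
```

Summing logarithms instead of multiplying probabilities avoids numerical underflow when many values are observed; it does not change which label wins.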
In the embodiment of the present invention, a user behavior frequency corresponding to a first behavior executed by a first account within a first preset time period is obtained. A reversal behavior frequency corresponding to the user behavior frequency is obtained, where the reversal behavior frequency is derived from a first total number and a second total number; the first total number is the number of first behaviors of all accounts within the first preset time period, and the second total number is the number of all behaviors of all accounts within the first preset time period. A first feature value corresponding to the first behavior is obtained from the user behavior frequency and the reversal behavior frequency. The feature proportion of the first feature value among the feature values of all behaviors of all accounts is then calculated from the first feature value. Finally, a risk assessment result of the first account executing the first behavior within the first preset time period is obtained based on the feature proportion and pre-obtained user behavior parameters. Because the assessment combines the user behavior frequency with the reversal behavior frequency instead of relying on the user behavior frequency alone, the accuracy of risk assessment of user behavior is increased, which solves the technical problem in the prior art that the error rate of the risk assessment result is high in some special cases when risk assessment is performed only on the basis of the user behavior frequency.
Example 3
The embodiment of the present invention also provides a storage medium. Optionally, in this embodiment, the storage medium may be configured to store program code for executing the risk assessment method for user behavior provided in the first embodiment.
Optionally, in this embodiment, the storage medium may be located in any one of computer terminals in a computer terminal group in a computer network, or in any one of mobile terminals in a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring user behavior frequency corresponding to the execution of a first behavior of a first account within a first preset time period; acquiring reversal behavior frequency corresponding to the user behavior frequency, wherein the reversal behavior frequency is obtained according to a first total number and a second total number, the first total number refers to the number of first behaviors of all accounts in the first preset time period, and the second total number refers to the number of all behaviors of all accounts in the first preset time period; obtaining a first characteristic value corresponding to the first behavior according to the user behavior frequency and the reversal behavior frequency; calculating the characteristic proportion of the first characteristic value in the characteristic values of all the behaviors of all the accounts according to the first characteristic value corresponding to the first behavior; and obtaining a risk assessment result of the first account executing the first behavior within the first preset time period based on the characteristic proportion and the pre-acquired user behavior parameters.
Optionally, the storage medium is further arranged to store program code for performing the steps of: determining a third total and a fourth total, wherein the third total refers to the number of the first actions performed by the first account within the first preset time period, and the fourth total refers to the number of all the actions performed by the first account within the first preset time period; and calculating the user behavior frequency according to the third total number and the fourth total number.
Optionally, the storage medium is further arranged to store program code for performing the steps of: determining the first total number and the second total number; and calculating the reversal behavior frequency by the formula I = lg(k/q), where I represents the reversal behavior frequency, k represents the second total number, and q represents the first total number.
Optionally, the storage medium is further arranged to store program code for performing the step of: obtaining, by a formula (not reproduced in this text), the feature proportion of the first feature value among the feature values of all behaviors of all accounts, where a_j represents the first feature value, P(a_j) represents the feature proportion of a_j among the feature values of all behaviors of all accounts, and j is an integer greater than 0.
Optionally, the storage medium is further arranged to store program code for performing the step of: dividing the third total number by the fourth total number to obtain the user behavior frequency.
Optionally, the storage medium is further arranged to store program code for performing the step of: multiplying the user behavior frequency by the reversal behavior frequency to obtain the first feature value corresponding to the first behavior.
Optionally, the storage medium is further arranged to store program code for performing the steps of: creating a training sample set, wherein the training sample set at least comprises one sample characteristic value and a risk assessment label corresponding to the at least one sample characteristic value; and obtaining the conditional probability parameter and the classification proportion according to the at least one sample characteristic value and the risk assessment label corresponding to the at least one sample characteristic value.
Optionally, the storage medium is further arranged to store program code for performing the steps of: obtaining at least one sample user behavior of at least one sample account within a second preset time period; calculating a sample user behavior frequency of the at least one sample user behavior and a sample reversal behavior frequency corresponding to the sample user behavior frequency, where the sample reversal behavior frequency is obtained according to a fifth total number and a sixth total number, and the fifth total number and the sixth total number are obtained from the number of the at least one sample user behavior of all accounts in the second preset time period and the total number of all behaviors of all accounts in the second preset time period; obtaining the at least one sample feature value according to the sample user behavior frequency and the sample reversal behavior frequency; and creating the training sample set according to the at least one sample feature value and the risk assessment label corresponding to the at least one sample feature value.
Optionally, the storage medium is further arranged to store program code for performing the steps of: dividing the number of the at least one sample user behavior of the at least one sample account in the second preset time period by the number of all behaviors of the at least one sample account in the second preset time period to obtain the sample user behavior frequency of the at least one sample user behavior; and calculating the sample reversal behavior frequency by the formula I' = lg(k'/q'), where I' denotes the sample reversal behavior frequency, k' denotes the fifth total number, and q' denotes the sixth total number.
Optionally, the storage medium is further arranged to store program code for performing the steps of: obtaining the conditional probability parameter by a formula (not reproduced in this text), where P(a'_j | c_i) represents the conditional probability parameter of a'_j belonging to c_i, a'_j represents the sample feature value, c_i represents the risk assessment label, Count(a'_j | c_i) represents the number of times a'_j appears among those belonging to c_i, Count(c_i) represents the total count belonging to c_i, 0 < j < n, n is the total number of samples in the training sample set, 0 < i < m, m is the number of types of risk assessment labels, and i and j are integers; and obtaining the classification ratio by a formula (not reproduced in this text), where P(c_i) represents the classification ratio of c_i among all risk assessment labels.
Optionally, the storage medium is further arranged to store program code for performing the step of: obtaining, by a formula (not reproduced in this text), the risk assessment result of the first account executing the first behavior within the first preset time period, where c_MAP is the risk assessment result of the first account executing the first behavior within the first preset time period.
Optionally, in this embodiment, the storage medium may include, but is not limited to: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and various other media capable of storing program code.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments, and this embodiment is not described herein again.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and various other media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also fall within the protection scope of the present invention.