Summary of the Invention
An object of the present invention is to provide a behavioral data evaluation method and device, so as to solve the problem of automatically identifying, from user behavior data, the reasons why users interrupt transactions.
According to a first aspect of the present invention, a behavioral data evaluation method is provided, the method comprising:
extracting, according to a user's operations on an application program, behavioral feature data of the user for the application program;
inputting the behavioral feature data into a pre-established data analysis model for matching, and obtaining matching result information;
generating behavioral data evaluation information of the user according to the matching result information.
Further, the method according to the first aspect of the present invention further comprises:
determining sample feature data;
calculating the contribution degree of the sample feature data according to a preset contribution degree algorithm;
screening out, according to the contribution degree of each sample feature data, the sample feature data that meets a contribution degree condition as model input variables;
training on the model input variables to construct the data analysis model.
Further, the method according to the first aspect of the present invention further comprises:
determining whether the data volume of a specified feature in the sample feature data meets a preset quantity condition;
if so, and when the corresponding behavior meaning can be determined from the sample feature data, performing derivation processing on the sample feature data;
if not, performing data integration processing on the sample feature data according to service type, and executing again the step of determining whether the data volume in the sample feature data meets the preset quantity condition.
Further, the method according to the first aspect of the present invention further comprises:
when the data volume in the sample feature data meets the preset quantity condition, if the corresponding behavior meaning cannot be determined from the sample feature data, performing vector product conversion processing on multiple sample feature data in the same business scenario, so as to determine the behavior meaning corresponding to the sample feature data.
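Further, in the method according to the first aspect of the present invention, the preset contribution degree algorithm is:
Information entropy:
$$\mathrm{Entropy}(S) = -p_{+}\log p_{+} - p_{-}\log p_{-}$$
Sample feature gain:
$$\mathrm{Gain}(a) = \mathrm{Entropy}(S) - p(a)\,\mathrm{Entropy}(S_{a}) - p(\bar{a})\,\mathrm{Entropy}(S_{\bar{a}})$$
where S is the sample set, p+ is the probability of a high-security-sense user, and p- is the probability of a low-security-sense user; S_a and S_ā are the subsets of S in which feature a does and does not occur, and p(a) and p(ā) are the proportions of S falling in those subsets.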
Further, in the method according to the first aspect of the present invention, screening out, according to the contribution degree of each sample feature data, the sample feature data that meets the contribution degree condition as model input variables comprises:
sorting the contribution degrees of the sample feature data in descending order;
selecting a preset quantity of top-ranked sample feature data as the model input variables.
Further, the method according to the first aspect of the present invention further comprises:
if a continuous variable among the model input variables has missing values, filling in the missing values of the continuous variable;
splitting the model input variables, after the missing values have been filled in, into a training set and a test set according to a preset ratio, and using the test set data to test and evaluate the model trained on the training set data.
According to a second aspect of the present invention, a behavioral data evaluation device is provided, comprising:
a data extraction module, configured to extract, according to a user's operations on an application program, behavioral feature data of the user for the application program;
a data matching module, configured to input the behavioral feature data into a pre-established data analysis model for matching, and to obtain matching result information;
a data evaluation module, configured to generate behavioral data evaluation information of the user according to the matching result information.
According to a third aspect of the present invention, a storage device is also provided. The storage device stores computer program instructions which, when executed, carry out the method according to the first aspect or the second aspect of the present invention.
According to a fourth aspect of the present invention, a computing device is also provided, comprising a memory for storing computer program instructions and a processor for executing the computer program instructions, wherein the computer program instructions, when executed by the processor, trigger the computing device to carry out the method according to the first aspect or the second aspect of the present invention.
In the behavioral data evaluation method and device provided by the present invention, the behavioral feature data of a user for an application program is input into a data analysis model for matching, so as to obtain the corresponding behavioral data evaluation information. The behavior of all users can thus be analyzed on the basis of massive data, so that users' sense of transaction security is evaluated quickly and accurately and the key reasons affecting it are quickly located. Because the data analysis model is based on machine learning techniques, hidden information can be mined from the users' behavioral features with high accuracy.
Detailed Description
The present invention is described in further detail below with reference to the accompanying drawings.
In a typical configuration of the present invention, the terminal and the devices of the service network each include one or more processors (CPU), an input/output interface, a network interface, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
Fig. 1 is a flow diagram of the behavioral data evaluation method of Embodiment 1 of the present invention. As shown in Fig. 1, the behavioral data evaluation method provided by Embodiment 1 is applied to a behavioral data evaluation device, which may be a server, a computer, or the like. The method comprises:
Step S101: extract, according to the user's operations on an application program, behavioral feature data of the user for the application program.
Specifically, the application program (APP, Application) may be any type of application, such as a social or shopping application, and in particular a payment-type application that has payment functions and therefore places high demands on the sense of security. The user's operations on the application program include the user's click-trigger behavior on any function in the application, for example: opening the application, clicking the personal center module in the application, clicking to exit the personal center module, clicking the bill button, exiting bill browsing, and changing function settings. The behavioral feature data may include behavioral feature data for any of the above operations, and may also include data such as interaction logs with the server.
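As a non-limiting illustration, the extraction in step S101 can be sketched as follows. The event names in `TRACKED_EVENTS`, the function name `extract_behavior_features`, and the choice of one count and one Boolean flag per operation are assumptions made for this sketch; the embodiment itself only lists example operations and does not prescribe a feature layout.

```python
from collections import Counter

# Hypothetical event names, loosely following the example operations above.
TRACKED_EVENTS = [
    "open_app",
    "enter_personal_center",
    "exit_personal_center",
    "click_bill",
    "exit_bill",
]


def extract_behavior_features(events):
    """Turn one user's ordered list of event names into a flat feature dict."""
    counts = Counter(events)
    features = {}
    for name in TRACKED_EVENTS:
        occurrences = counts.get(name, 0)
        features[f"{name}_count"] = occurrences          # how often the operation occurred
        features[f"{name}_flag"] = int(occurrences > 0)  # whether it occurred at all
    return features


# Example: one session of a single user.
session = ["open_app", "click_bill", "exit_bill", "click_bill"]
features = extract_behavior_features(session)
```

Events that are not listed in `TRACKED_EVENTS` are ignored in this sketch; server-side interaction logs could be mapped to additional event names in the same way.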
Step S102: input the behavioral feature data into the pre-established data analysis model for matching, and obtain matching result information.
Specifically, the data analysis model may be obtained by training through machine learning; for example, it may be constructed by GBDT (Gradient Boosting Decision Tree) binary classification modeling. Depending on the application scenario, sense-of-security evaluation may be performed on the behavioral features of a single user, or on the behavioral feature data of all users or of some users. The behavioral feature data used for evaluation may be the data set corresponding to all operations of a user on the application program within a preset condition range, for example the data set corresponding to all operations during each use or the current use of the application program, or the data set corresponding to all operations while using the application program within a preset time period. The trained data analysis model works by feature matching: in some application scenarios, when a user's sense of security needs to be evaluated, the behavioral feature data of the user to be evaluated is input into the pre-established data analysis model and matched against the model input variables of the model, and the corresponding matching result information is output.
Step S103: generate behavioral data evaluation information of the user according to the matching result information.
The behavioral data evaluation information can reflect the user's sense of transaction security when using the application program. The key reasons affecting the user's sense of transaction security can therefore be located quickly, the application program or related services can be optimized according to those reasons, and decision support can be provided for personalized recommendation of security products.
Fig. 2 is a flow diagram of the behavioral data evaluation method of Embodiment 2 of the present invention. As shown in Fig. 2, the behavioral data evaluation method provided by Embodiment 2 is applied to a behavioral data evaluation device, and the method comprises:
Step S201: determine sample feature data.
Fig. 3 is a flow diagram of the behavioral data evaluation method of Embodiment 2 of the present invention. As shown in Fig. 3, step S201 may comprise the following steps S2011 to S2016:
Step S2011: obtain sample feature data.
According to the account information of each user, the user's operations on the application program through the client, such as the behavior trajectory, are traversed to obtain sample behavioral features as sampled data. The business scenario to which a sampled behavioral feature belongs may also be determined by accessing the user setting information stored on the server; the user setting information includes the options the user has set for specific functions in the application program, such as payment code setting options. The service attribute of a sampled feature is characterized by the business scenario to which it belongs; the result attribute of a sampled feature is characterized by exposure, clicks, and interaction logs with the server. The sample feature data of the user is then optimized in combination with the business scenario, the service attribute, and the result attribute, including:
Step S2012: determine whether the data volume of a specified feature in the sample feature data meets a preset quantity condition.
To avoid accidental feature behaviors of individual users in the collected samples, it is necessary to determine whether each sample behavioral feature meets the preset quantity condition. The preset quantity condition may be set according to the actual application; for example, it may require that a specified sample feature account for one thousandth of the behavioral features of all users. If it is determined that a sample behavioral feature meets the preset quantity condition, step S2013 is executed; if not, step S2015 is executed.
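By way of a hedged example, the branch in step S2012 can be sketched as below. The function names and the reading of the condition as "users exhibiting the feature divided by all users is at least `min_share`" are assumptions of this sketch; the embodiment only gives one thousandth as an example threshold.

```python
def meets_quantity_condition(feature_user_count, total_user_count, min_share=0.001):
    """Return True when the feature is observed for at least `min_share` of all users."""
    if total_user_count <= 0:
        raise ValueError("total_user_count must be positive")
    return feature_user_count / total_user_count >= min_share


def next_step(feature_user_count, total_user_count):
    """Pick the step that follows S2012 for one sample behavioral feature."""
    if meets_quantity_condition(feature_user_count, total_user_count):
        return "S2013"  # enough data: check whether the behavior meaning is clear
    return "S2015"      # too sparse: integrate data, then re-check in S2012


common = next_step(1500, 1_000_000)  # 0.15% of users
rare = next_step(200, 1_000_000)     # 0.02% of users
```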
Step S2013: judge whether the corresponding behavior meaning can be determined from the sample feature data.
The business meaning characterizes specific business content; for example, the business meaning may be clicking to log out of an account. If it is determined that the corresponding behavior meaning can be determined from the sample feature data, step S2014 is executed; if the corresponding behavior meaning cannot be determined from the sample feature data, step S2016 is executed.
Step S2014: perform derivation processing on the sample feature data.
Specifically, normalization may be used to accelerate convergence of the training network, and derivation processing such as Booleanization and window dilation may be applied to the sample feature data in order to analyze the corresponding behavioral features.
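The derivation operations named above can be illustrated with the following minimal sketch. The concrete choices are assumptions, since the embodiment does not define them: min-max scaling stands in for normalization, a threshold test for Booleanization, and a trailing rolling sum for the window operation. All function names are hypothetical.

```python
def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def booleanize(values, threshold=0):
    """Derive a 0/1 variable: 1 where the value exceeds the threshold."""
    return [int(v > threshold) for v in values]


def window_sums(values, window):
    """Sum each value with up to `window - 1` preceding values."""
    return [sum(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]


# Example: daily click counts of one feature for one user.
daily_clicks = [0, 2, 0, 5, 1]
normalized = min_max_normalize(daily_clicks)
flags = booleanize(daily_clicks)
rolling = window_sums(daily_clicks, window=3)
```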
Step S2015: perform data integration processing on the sample feature data according to the service attribute, and continue with step S2012.
Step S2016: perform vector product conversion processing on multiple sample feature data in the same business scenario, so as to determine the behavior meaning corresponding to the sample feature data, and continue with step S2013.
If the corresponding behavior meaning cannot be determined from the sample feature data, multiple behavioral features under the same business scenario are obtained, the vector product of those behavioral features is converted, and a new variable is constructed. This is repeated until the sample feature data can express the corresponding business meaning, after which step S2013 is executed again.
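The embodiment does not specify how the vector product is computed. Purely as an assumption for illustration, the sketch below reads it as an element-wise product of per-user feature vectors, which for Boolean features yields a new variable that is 1 only for users who exhibit every combined behavior. The function and variable names are hypothetical.

```python
def combine_features(*feature_vectors):
    """Element-wise product of equally long per-user feature vectors."""
    lengths = {len(vector) for vector in feature_vectors}
    if len(lengths) != 1:
        raise ValueError("all feature vectors must have the same length")
    combined = []
    for values in zip(*feature_vectors):
        product = 1
        for value in values:
            product *= value
        combined.append(product)
    return combined


# Two Boolean features from the same (assumed) payment-settings scenario,
# one entry per user.
opened_payment_settings = [1, 1, 0, 1]
changed_payment_code = [0, 1, 0, 1]
changed_code_via_settings = combine_features(opened_payment_settings, changed_payment_code)
```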
Step S202: calculate the contribution degree of each sample feature data in the sample feature data according to a preset contribution degree algorithm.
Specifically, the preset contribution degree algorithm may be the IG (Information Gain) algorithm. The IG algorithm is used to calculate the information gain that the occurrence or non-occurrence of each behavioral feature provides for distinguishing users' evaluations of transaction security perception, and the contribution degree of each sample feature data is determined accordingly. The formulas are:
Information entropy:
$$\mathrm{Entropy}(S) = -p_{+}\log p_{+} - p_{-}\log p_{-}$$
Behavioral feature gain:
$$\mathrm{Gain}(a) = \mathrm{Entropy}(S) - p(a)\,\mathrm{Entropy}(S_{a}) - p(\bar{a})\,\mathrm{Entropy}(S_{\bar{a}})$$
where S is the sample set, p+ is the probability of a high-security-sense user, and p- is the probability of a low-security-sense user; S_a and S_ā are the groups of users with and without behavioral feature a, and p(a) and p(ā) are the proportions of S falling in those groups.
Taking "the user turns off the face verification switch (face_off)" as an example, let p_{i+} be the probability of a high-security-sense user in the face_off group and p_{ii+} the probability of a high-security-sense user in the not_face_off group. The information entropies of the two groups, with and without the behavioral feature, are respectively:
$$\mathrm{Entropy}(\mathrm{face\_off}) = -p_{i+}\log p_{i+} - p_{i-}\log p_{i-}$$
$$\mathrm{Entropy}(\mathrm{not\_face\_off}) = -p_{ii+}\log p_{ii+} - p_{ii-}\log p_{ii-}$$
From these, the contribution degree of the behavior "the user turns off the face verification switch" to distinguishing users' sense of transaction security is calculated as:
$$\mathrm{Gain}(\mathrm{face\_off}) = \mathrm{Entropy}(S) - p(\mathrm{face\_off})\,\mathrm{Entropy}(\mathrm{face\_off}) - p(\mathrm{not\_face\_off})\,\mathrm{Entropy}(\mathrm{not\_face\_off})$$
Assuming that {a1, a2, ..., an} is the set of all behavioral features in step 1, Gain(i), i = 1, ..., n, is the information gain of each single behavior variable with respect to the whole, and this value indicates the level of the contribution degree of each sample feature data.
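The contribution degree calculation above can be sketched as follows. This is an illustrative sketch, not the embodiment's implementation: the base-2 logarithm, the function names, and the input format (counts of high-security-sense users in the groups with and without the behavior) are assumptions.

```python
import math


def entropy(p_pos):
    """Binary entropy of a group whose share of high-security-sense users is p_pos."""
    total = 0.0
    for p in (p_pos, 1.0 - p_pos):
        if p > 0:  # treat 0 * log(0) as 0
            total -= p * math.log2(p)
    return total


def information_gain(pos_with, n_with, pos_without, n_without):
    """Gain of one behavioral feature.

    pos_with / n_with: high-security-sense users / all users showing the behavior.
    pos_without / n_without: the same counts for users not showing it.
    """
    n = n_with + n_without
    if n == 0:
        raise ValueError("the sample set is empty")
    base = entropy((pos_with + pos_without) / n)
    weighted = 0.0
    for pos, size in ((pos_with, n_with), (pos_without, n_without)):
        if size > 0:
            weighted += (size / n) * entropy(pos / size)
    return base - weighted


# Hypothetical counts for the face_off example: 50 users turned the switch off
# (10 of them high-security-sense), 50 did not (40 of them high-security-sense).
gain_face_off = information_gain(10, 50, 40, 50)
```

A feature that splits the users into two pure groups reaches the maximum gain, which equals the entropy of the whole sample set, while a feature whose two groups have the same class mix as the whole set has a gain of zero.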
Step S203: screen out, according to the contribution degree of each sample feature data, the sample feature data that meets the contribution degree condition as model input variables.
Sorting the sample feature data by contribution degree makes it possible to determine the model input variables that are most meaningful as a reference for sense-of-security evaluation. Specifically, the sample feature data may be sorted in descending order of contribution degree, and a preset quantity of top-ranked sample feature data, considered to have significant discriminating power, is selected as the model input variables. The value of the preset quantity may be determined according to the actual situation.
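The descending sort and top-ranked selection of step S203 can be sketched as below. The feature names and contribution values are made up for illustration, and `select_model_inputs` is a hypothetical helper name.

```python
def select_model_inputs(contributions, preset_quantity):
    """Return the names of the `preset_quantity` features with the highest contribution."""
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:preset_quantity]]


# Hypothetical contribution degrees (information gain) per behavioral feature.
contributions = {
    "face_off": 0.12,
    "click_bill": 0.03,
    "change_payment_code": 0.08,
}
selected = select_model_inputs(contributions, preset_quantity=2)
```

Python's `sorted` is stable, so features with equal contribution keep their original relative order in this sketch.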
Step S204: train on the model input variables to construct the data analysis model.
In the embodiments of the present invention, the data analysis model may be constructed in several ways, for example by GBDT binary classification modeling. A certain number of black samples may be set among the model input variables (for example, the number of black samples may be set to 7% of the questionnaire recovery rate). Black samples are the data corresponding to users with a low sense of security, that is, behavior showing a lack of security that was collected through survey data or explicitly indicated by the user through channels such as telephone. For example, if the preset quantity of model input variables is 55, they may be set to include 7 user demographic variables, 30 behavior Boolean variables, and 16 derived variables. If a continuous variable among the model input variables has missing values, the missing values may be filled in by mean filling: the variable most strongly correlated with the variable that has missing values is found, the data is divided into multiple groups by that variable, the mean of each group is calculated, and the group mean is filled in at each missing position as its value, which improves the distribution of the data to a certain extent. To improve the effect of the algorithm, binning may also be applied to the sample feature data. The model input variables, after the missing values have been filled in, are split into a training set and a test set according to a preset ratio; for example, the initial ratio of training set to test set may be 7:3. Training proceeds over multiple rounds of iteration, and each round produces a weak classifier T(x; θ_m). Each weak classifier is trained on the training set on the basis of the residual of the previous round's weak classifier. The trained model can be described as:
$$F_{M}(x) = \sum_{m=1}^{M} T(x;\theta_{m})$$
The loss function of the weak classifier is:
$$\hat{\theta}_{m} = \arg\min_{\theta_{m}} \sum_{i=1}^{N} L\bigl(y_{i},\, F_{m-1}(x_{i}) + T(x_{i};\theta_{m})\bigr)$$
The loss function descends along the gradient direction: each round of iteration fits the negative gradient of the loss function under the current model, so that each round of training reduces the loss function and the model converges toward the optimal solution as quickly as possible. Afterwards, the model trained on the training set data can be tested and evaluated using the test set data, thereby verifying the accuracy of the sense-of-security evaluation.
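The preprocessing and boosting steps of S204 can be illustrated with the following simplified sketch. Several assumptions are made and should not be read as the embodiment's implementation: the grouping variable for mean filling is passed in by the caller rather than found by correlation, squared loss is used in place of a binary classification loss (so the negative gradient is simply the residual), each weak learner T(x; θ_m) is a one-split decision stump, and all function names are hypothetical.

```python
import random


def fill_missing_by_group_mean(rows, target, group_key):
    """Fill None in `target` with the mean of `target` over rows sharing `group_key`."""
    sums, counts = {}, {}
    for row in rows:
        if row[target] is not None:
            group = row[group_key]
            sums[group] = sums.get(group, 0.0) + row[target]
            counts[group] = counts.get(group, 0) + 1
    filled = []
    for row in rows:
        new_row = dict(row)
        group = row[group_key]
        if new_row[target] is None and group in counts:
            new_row[target] = sums[group] / counts[group]
        filled.append(new_row)
    return filled


def split_train_test(items, train_share=0.7, seed=0):
    """Shuffle and split into training and test sets (7:3 by default)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_share)
    return shuffled[:cut], shuffled[cut:]


def fit_stump(xs, residuals):
    """Weak learner: the single-feature threshold split that best fits the residuals."""
    best = None
    for j in range(len(xs[0])):
        for threshold in sorted({x[j] for x in xs}):
            left = [r for x, r in zip(xs, residuals) if x[j] <= threshold]
            right = [r for x, r in zip(xs, residuals) if x[j] > threshold]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, threshold, lm, rm)
    if best is None:  # no valid split: fall back to a constant prediction
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]


def stump_predict(stump, x):
    j, threshold, left_value, right_value = stump
    return left_value if x[j] <= threshold else right_value


def boost(xs, ys, rounds=10):
    """Fit F_M(x) = sum over m of T(x; theta_m), one stump per round, on residuals."""
    stumps, preds = [], [0.0] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient of squared loss
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + stump_predict(stump, x) for p, x in zip(preds, xs)]
    return stumps


def predict(stumps, x):
    return sum(stump_predict(stump, x) for stump in stumps)


# Toy data: one Boolean feature that separates the two classes.
rows = [{"segment": "a", "amount": 10.0}, {"segment": "a", "amount": None},
        {"segment": "a", "amount": 20.0}, {"segment": "b", "amount": None}]
filled = fill_missing_by_group_mean(rows, "amount", "segment")
train, test = split_train_test(range(10))
model = boost([[0], [0], [1], [1]], [0.0, 0.0, 1.0, 1.0], rounds=3)
```

In this sketch a missing value is left as `None` when its group has no observed values, and the real-valued score from `predict` would still need a threshold to yield a high or low sense-of-security label.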
With the above data analysis model, the present invention can evaluate the sense of security of all users comprehensively and quickly, and can score each account individually, which refines the granularity of the evaluation. At the same time, the key reasons affecting users' sense of transaction security can be located quickly, the application program or related services can be optimized according to those reasons, and decision support can be provided for personalized recommendation of security products.
Fig. 4 is a structural schematic diagram of the behavioral data evaluation device of Embodiment 3 of the present invention. As shown in Fig. 4, the device comprises a data extraction module 41, a data matching module 42, and a data evaluation module 43.
The data extraction module 41 is configured to extract, according to a user's operations on an application program, behavioral feature data of the user for the application program.
The data matching module 42 is configured to input the behavioral feature data into the pre-established data analysis model for matching, and to obtain matching result information.
The data evaluation module 43 is configured to generate behavioral data evaluation information of the user according to the matching result information.
The behavioral data evaluation device of Embodiment 3 is a device implementing the behavioral data evaluation method shown in Fig. 1; for details, refer to the embodiment of Fig. 1, which are not repeated here.
An embodiment of the present invention also provides a storage device. The storage device stores computer program instructions which, when executed, carry out the methods shown in Figs. 1 to 3 of the present invention.
An embodiment of the present invention also provides a computing device, comprising a memory for storing computer program instructions and a processor for executing the computer program instructions, wherein the computer program instructions, when executed by the processor, trigger the computing device to carry out the methods shown in Figs. 1 to 3 of the present invention.
In addition, some embodiments of the present invention also provide a computer-readable medium on which computer program instructions are stored; the computer-readable instructions can be executed by a processor to implement the methods and/or technical solutions of the foregoing embodiments of the present invention.
It should be noted that the present invention may be implemented in software and/or in a combination of software and hardware, for example using an application-specific integrated circuit (ASIC), a general-purpose computer, or any other similar hardware device. In some embodiments, the software program of the present invention may be executed by a processor to implement the steps or functions described above. Likewise, the software program of the present invention (including related data structures) may be stored in a computer-readable recording medium, for example RAM memory, a magnetic or optical drive, a floppy disk, or similar devices. In addition, some steps or functions of the present invention may be implemented in hardware, for example as circuitry that cooperates with a processor to perform each step or function.
It is obvious to those skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the present invention can be realized in other specific forms without departing from its spirit or essential attributes. The embodiments are therefore to be considered in all respects as illustrative and not restrictive. The scope of the present invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and scope of equivalents of the claims are intended to be embraced by the present invention. Any reference signs in the claims shall not be construed as limiting the claims concerned. Furthermore, the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in a device claim may also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not indicate any particular order.