CN108108743A - Abnormal user identification method and device for identifying abnormal users - Google Patents
Abnormal user identification method and device for identifying abnormal users
- Publication number
- CN108108743A (application CN201611051585.5A)
- Authority
- CN
- China
- Prior art keywords
- characteristic
- abnormal user
- user
- abnormal
- key
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application discloses an abnormal user identification method and a device for identifying abnormal users. One embodiment of the method includes: obtaining feature data of multiple users and, based on the feature data, determining the abnormal users among them by unsupervised learning; based on the feature data of the determined abnormal users, selecting by supervised learning, from the multiple feature parameters, the key feature parameters used to build a classification model, and generating key feature data containing those key feature parameters; and building the classification model from the key feature data. Abnormal users are thus identified by unsupervised learning, and the key features used to build the classification model are selected by supervised learning from the abnormal users' feature data, so that the classification model identifies abnormal users using only the key features that matter most for that task. This avoids interference from less important features, improves identification accuracy, and reduces the overhead of the identification process.
Description
Technical field
This application relates to the computer field, specifically to the big data field, and in particular to an abnormal user identification method and a device for identifying abnormal users.
Background
In big data analysis, it is often necessary to identify abnormal users and remove their data in order to improve the accuracy of the analysis. At present, this is usually done by configuring identification rules and judging whether a user's features match those rules, which determines whether the user is an abnormal user.
However, identifying abnormal users and removing their data in this way has two drawbacks. First, because user data is massive, matching the feature information of each user against the identification rules one by one makes the identification process expensive. Second, because the importance of each user feature for identifying abnormal users cannot be determined, a large number of low-importance features take part in the computation; this interferes with the identification process, lowers accuracy, and further increases the overhead.
Summary of the invention
This application provides an abnormal user identification method and a device for identifying abnormal users, to solve the technical problems described in the Background section above.
In a first aspect, this application provides an abnormal user identification method, including: obtaining feature data of multiple users and, based on the feature data, determining the abnormal users among the multiple users by unsupervised learning, where the feature data includes multiple feature parameters that indicate features of a user; based on the feature data of the determined abnormal users, selecting by supervised learning, from the multiple feature parameters, the key feature parameters used to build a classification model, and generating key feature data containing the key feature parameters; and building the classification model from the key feature data, so that the classification model can be used to identify whether a user is an abnormal user.
In a second aspect, this application provides a device for identifying abnormal users, including: a recognition unit, configured to obtain feature data of multiple users and, based on the feature data, determine the abnormal users among the multiple users by unsupervised learning, where the feature data includes multiple feature parameters that indicate features of a user; a selection unit, configured to select by supervised learning, from the multiple feature parameters and based on the feature data of the determined abnormal users, the key feature parameters used to build a classification model, and to generate key feature data containing the key feature parameters; and a construction unit, configured to build the classification model from the key feature data, so that the classification model can be used to identify whether a user is an abnormal user.
With the abnormal user identification method and the device for identifying abnormal users provided by this application, feature data of multiple users is obtained and the abnormal users among them are determined by unsupervised learning, where the feature data includes multiple feature parameters that indicate features of a user; based on the feature data of the determined abnormal users, the key feature parameters used to build a classification model are selected from the multiple feature parameters by supervised learning, and key feature data containing the key feature parameters is generated; and the classification model is built from the key feature data. Abnormal users are thus identified by unsupervised learning, and the key features used to build the classification model are selected by supervised learning from the abnormal users' feature data, so that the classification model identifies abnormal users using only the key features that matter most for that task. This avoids interference from less important features, improves identification accuracy, and at the same time reduces the overhead of the identification process.
Brief description of the drawings
Other features, objects, and advantages of this application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
Fig. 1 shows an exemplary system architecture to which the abnormal user identification method or the device for identifying abnormal users of this application can be applied;
Fig. 2 shows a flowchart of one embodiment of the abnormal user identification method according to this application;
Fig. 3 shows a flowchart of another embodiment of the abnormal user identification method according to this application;
Fig. 4 shows a structural diagram of one embodiment of the device for identifying abnormal users according to this application;
Fig. 5 shows a structural diagram of a computer system suitable for implementing the device for identifying abnormal users of the embodiments of this application.
Detailed description
This application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.
It should be noted that, where there is no conflict, the embodiments of this application and the features of those embodiments may be combined with one another. This application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the abnormal user identification method or the device for identifying abnormal users of this application can be applied.
As shown in Fig. 1, the system architecture 100 may include terminals 101, 102, 103, a network 104, and a server 105. The network 104 is the medium that provides transmission links between the terminals 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless transmission links or fiber-optic cables.
A user may use the terminals 101, 102, 103 to interact with the server 105 through the network 104, for example to receive or send messages. Various communication applications may be installed on the terminals 101, 102, 103, such as search applications, group-buying applications, and instant messaging applications.
The terminals 101, 102, 103 may be various electronic devices that have a display screen and support network communication, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, and desktop computers.
The terminals 101, 102, 103 may collect feature parameters that indicate features of a user, such as the user's account, user name, telephone number, and URL, and send feature data containing multiple feature parameters to the server 105. The server 105 may use the feature data to build a classification model for identifying whether a user is an abnormal user.
Referring to Fig. 2, a flow 200 of one embodiment of the abnormal user identification method according to this application is shown. The abnormal user identification method provided by the embodiments of this application may be performed by the server 105 in Fig. 1; correspondingly, the device for identifying abnormal users may be arranged in the server 105. The method includes the following steps:
Step 201: determine the abnormal users among multiple users by unsupervised learning.
In this embodiment, multiple feature parameters may be used to describe the features of a user along multiple dimensions. The feature data of a user includes multiple feature parameters that indicate features of the user, where each feature may correspond to at least one feature parameter. After the feature data of each of the multiple users is obtained, unsupervised learning may be applied to that feature data to determine the abnormal users among the multiple users.
In this embodiment, attributes of the user may be selected in advance as feature parameters, based on the information carried by each attribute. For example, the selected feature parameters may include the account, user name, telephone number, and URL, so that the user's features are described along multiple dimensions by the user's account, user name, telephone number, URL, and so on. Correspondingly, the feature data of a user may include feature parameters such as the user's account, user name, telephone number, and URL.
In some optional implementations of this embodiment, determining the abnormal users among the multiple users by unsupervised learning includes: clustering the feature data of the multiple users with a clustering algorithm to obtain multiple clusters; and, when a cluster contains feature data that matches preset abnormal feature data, determining the users corresponding to all feature data in that cluster to be abnormal users.
In this embodiment, the unsupervised learning may be a clustering algorithm, for example a density-based clustering algorithm. The clustering algorithm may be used to cluster the feature data of the multiple users, grouping together the feature data of highly correlated users to obtain multiple clusters. Each cluster may contain the feature data of multiple highly correlated users.
In this embodiment, each cluster may be checked for feature data that matches the abnormal feature data. The abnormal feature data may be composed in advance from feature parameters with abnormal values. When a cluster contains feature data that matches the preset abnormal feature data, then, because the feature information of the users within one cluster is highly correlated, the feature data of all users in that cluster may be determined to be feature data of abnormal users. Correspondingly, the users to whom that feature data belongs are determined to be abnormal users.
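The clustering step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes each user's feature data has already been encoded as a numeric tuple, uses a small hand-written DBSCAN as the density-based clustering algorithm, and treats "matching" the preset abnormal feature data as exact equality. The names `dbscan`, `find_abnormal_users`, `FEATURES`, and `PRESET_ABNORMAL`, and the `eps` and `min_pts` values, are all hypothetical.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns one cluster id per point; -1 marks noise."""
    labels = [None] * len(points)

    def neighbors_of(i):
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = neighbors_of(i)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # noise reached from a core point joins as a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors_of(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, so keep expanding
    return labels


def find_abnormal_users(features, preset_abnormal, eps, min_pts):
    """Flag every user in any cluster that contains preset abnormal feature data."""
    user_ids = list(features)
    labels = dbscan([features[u] for u in user_ids], eps, min_pts)
    flagged_clusters = {
        label
        for u, label in zip(user_ids, labels)
        if label != -1 and features[u] in preset_abnormal  # assumption: match = equality
    }
    return {u for u, label in zip(user_ids, labels) if label in flagged_clusters}


# Hypothetical numeric encodings of user feature data.
FEATURES = {
    "u1": (1.0, 1.0), "u2": (1.2, 0.9), "u3": (0.9, 1.1),
    "u4": (9.0, 9.0), "u5": (9.1, 8.8), "u6": (8.9, 9.2),
}
PRESET_ABNORMAL = {(9.0, 9.0)}
flagged = find_abnormal_users(FEATURES, PRESET_ABNORMAL, eps=1.0, min_pts=2)
print(sorted(flagged))
```

In this toy data only `u4` matches the preset abnormal feature data, but `u5` and `u6` are flagged as well because they fall in the same cluster, which mirrors the cluster-level decision described in the text.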
Step 202: based on the feature data of the abnormal users, select key feature parameters by supervised learning and generate key feature data.
In this embodiment, after the abnormal users among the multiple users have been determined by unsupervised learning in step 201, supervised learning may be applied to the feature data of the determined abnormal users to select, from the multiple feature parameters, the key feature parameters used to build the classification model that identifies whether a user is an abnormal user, that is, the feature parameters that matter most for identifying abnormal users.
Taking feature data that includes the user's account, user name, telephone number, URL, and so on as an example, the feature data of the users determined in step 201 may be analyzed, and the feature parameters that matter most for identifying abnormal users may be selected from the account, user name, telephone number, URL, and other feature parameters.
Step 203: build the classification model from the key feature data.
In this embodiment, after the key feature data containing the key feature parameters has been generated in step 202, the classification model may be built from the key feature data. For example, the key feature data is used as training samples to train the classification model, and the trained classification model is then used to identify whether a user is an abnormal user.
In this embodiment, the classification model built from the key feature data identifies abnormal users using only the key feature parameters determined in step 202, which matter most for identifying abnormal users. This avoids interference from less important features, improves identification accuracy, and at the same time reduces the overhead of the identification process.
Referring to Fig. 3, a flow 300 of another embodiment of the abnormal user identification method according to this application is shown. The abnormal user identification method provided by the embodiments of this application may be performed by the server 105 in Fig. 1. The method includes the following steps:
Step 301: based on the feature data of multiple users, determine the abnormal users among the multiple users by unsupervised learning.
In this embodiment, multiple feature parameters may be used to describe the features of a user along multiple dimensions. The feature data of a user includes multiple feature parameters that indicate features of the user, where each feature corresponds to one feature parameter. For example, the feature data of a user includes feature parameters such as the account, user name, telephone number, and URL. After the feature data of each of the multiple users is obtained, unsupervised learning may be applied to that feature data to determine the abnormal users among the multiple users.
Step 302: based on the feature data of the abnormal users, select key feature parameters with a decision tree or a naive Bayes algorithm and generate key feature data.
In this embodiment, in order to build a classification model that identifies abnormal users, supervised learning may first be used to analyze the feature data of the abnormal users determined in step 301 and to select, from the feature parameters, the key feature parameters used to build the classification model, that is, the parameters that matter most for identifying abnormal users.
In this embodiment, the supervised learning may use a decision tree. Before the decision tree is used to select the key feature parameters for building the classification model, the decision tree is first built from the feature data of the abnormal users determined in step 301. By training the decision tree with the feature data of multiple abnormal users as training samples, the decision tree learns how important each feature parameter in the abnormal users' feature data is for identifying abnormal users. The decision tree built from the feature data of the abnormal users determined in step 301 contains multiple nodes, each corresponding to one feature parameter; the closer a node is to the root of the decision tree, the more important its feature parameter is for identifying abnormal users. The feature parameters corresponding to nodes whose depth in the decision tree is greater than a depth threshold, that is, the more important feature parameters, may be chosen as the key feature parameters for building the classification model.
Taking feature data that includes the account, user name, telephone number, URL, and so on as an example, the decision tree built from the abnormal users' feature data contains nodes corresponding to the account, user name, telephone number, URL, and other feature parameters. Because these feature parameters differ in how important they are for identifying abnormal users, their corresponding nodes also sit at different depths in the decision tree.
In this embodiment, after the key feature parameters for building the classification model have been selected with the decision tree, the feature data of abnormal users that meets the following condition may be selected from the feature data of the abnormal users determined in step 301: the result of classifying the abnormal user's feature data with the decision tree is "abnormal user". That is, the decision tree is used to classify the feature data of the abnormal users identified in step 301 once more, producing classification results. When the decision tree classifies an abnormal user's feature data as belonging to an abnormal user, the key feature parameters in that user's feature data may be combined to obtain key feature data, which is then used to build the classification model.
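The decision-tree selection can be sketched as below, under several assumptions that go beyond the text. The sketch uses a small ID3-style tree with information gain, which the patent does not specify; it assumes binary feature parameters and labels taken from the unsupervised step; and, because the text states that parameters nearer the root are more important, it interprets the depth threshold as "keep parameters whose node lies within `max_distance` levels of the root". The names `node_depths`, `select_key_params`, and the sample parameters are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, param):
    remainder = 0.0
    for value in {r[param] for r in rows}:
        subset = [l for r, l in zip(rows, labels) if r[param] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder


def node_depths(rows, labels, params, depth=0, found=None):
    """Grow an ID3-style tree and record the shallowest depth of each parameter."""
    found = {} if found is None else found
    if len(set(labels)) <= 1 or not params:
        return found  # pure node or nothing left to split on
    best = max(params, key=lambda p: information_gain(rows, labels, p))
    found[best] = min(depth, found.get(best, depth))
    rest = [p for p in params if p != best]
    for value in {r[best] for r in rows}:
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        node_depths([rows[i] for i in idx], [labels[i] for i in idx],
                    rest, depth + 1, found)
    return found


def select_key_params(depths, max_distance):
    """Assumed reading of the depth threshold: keep nodes near the root."""
    return sorted(p for p, d in depths.items() if d <= max_distance)


PARAMS = ["phone_reused", "url_suspicious", "name_odd"]  # hypothetical binary parameters
RAW = [  # (phone_reused, url_suspicious, name_odd, abnormal)
    (1, 1, 0, 1), (1, 1, 1, 1), (1, 0, 0, 0), (1, 0, 1, 0), (0, 1, 0, 0),
    (0, 0, 1, 0), (0, 0, 0, 0), (0, 1, 1, 0), (0, 1, 0, 0),
]
rows = [dict(zip(PARAMS, r[:3])) for r in RAW]
labels = [r[3] for r in RAW]

depths = node_depths(rows, labels, PARAMS)
key_params = select_key_params(depths, max_distance=1)
print(depths, key_params)
```

In this toy data `name_odd` never appears in the tree, so it is dropped regardless of the threshold; a real system would more likely rely on a library tree and its importance scores.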
In this embodiment, the supervised learning may also use a naive Bayes algorithm. The naive Bayes algorithm may be applied to the feature data of the abnormal users determined in step 301 to compute the abnormal probability of each feature parameter. The abnormal probability of a feature parameter is the probability that the user is an abnormal user when the value of that feature parameter is abnormal. The abnormal probability represents how important the feature parameter is for identifying abnormal users: the larger the abnormal probability of a feature parameter, the more important it is for identifying abnormality. After the abnormal probability of each feature parameter has been computed with the naive Bayes algorithm, the feature parameters whose abnormal probability is greater than a probability threshold may be used as the key feature parameters for building the classification model.
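The abnormal probability described above can be written with Bayes' rule as P(abnormal user | parameter abnormal) = P(parameter abnormal | abnormal user) · P(abnormal user) / P(parameter abnormal). The sketch below computes it per parameter by counting. It is only an illustration: it assumes each parameter has already been reduced to a 0/1 flag meaning "value is abnormal", that labels come from the unsupervised step, and that the threshold value of 0.3 is arbitrary. The names `abnormal_probability` and `select_by_probability` are hypothetical.

```python
def abnormal_probability(rows, labels, param):
    """P(abnormal user | param value abnormal), computed with Bayes' rule."""
    n = len(rows)
    n_abnormal = sum(labels)
    p_flag = sum(r[param] for r in rows) / n  # P(param abnormal)
    if n_abnormal == 0 or p_flag == 0:
        return 0.0
    p_abnormal = n_abnormal / n  # P(abnormal user)
    p_flag_given_abnormal = (
        sum(r[param] for r, l in zip(rows, labels) if l) / n_abnormal
    )
    return p_flag_given_abnormal * p_abnormal / p_flag


def select_by_probability(rows, labels, params, threshold):
    """Keep parameters whose abnormal probability exceeds the threshold."""
    return [p for p in params if abnormal_probability(rows, labels, p) > threshold]


PARAMS = ["phone_reused", "url_suspicious", "name_odd"]  # hypothetical 0/1 flags
RAW = [  # (phone_reused, url_suspicious, name_odd, abnormal)
    (1, 1, 0, 1), (1, 1, 1, 1), (1, 0, 0, 0), (1, 0, 1, 0), (0, 1, 0, 0),
    (0, 0, 1, 0), (0, 0, 0, 0), (0, 1, 1, 0), (0, 1, 0, 0),
]
rows = [dict(zip(PARAMS, r[:3])) for r in RAW]
labels = [r[3] for r in RAW]

probabilities = {p: abnormal_probability(rows, labels, p) for p in PARAMS}
key_params = select_by_probability(rows, labels, PARAMS, threshold=0.3)
print(probabilities, key_params)
```

Note that this is a per-parameter Bayes calculation rather than a full naive Bayes classifier; a classifier would additionally multiply the per-parameter likelihoods under the independence assumption.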
In this embodiment, after the key feature parameters for building the classification model have been selected with the naive Bayes algorithm, the feature data of abnormal users that meets the following condition may be selected from the feature data of the abnormal users determined in step 301: the result of classifying the abnormal user's feature data with the naive Bayes algorithm is "abnormal user". That is, the naive Bayes algorithm is used to classify the feature data of the abnormal users identified in step 301 once more, producing classification results. When the naive Bayes algorithm classifies an abnormal user's feature data as belonging to an abnormal user, the key feature parameters in that user's feature data may be combined to obtain key feature data, which is then used to build the classification model.
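Both selection paths end with the same two operations: re-classify the abnormal users found in step 301, then keep only the key feature parameters of the users that are again classified as abnormal. A minimal sketch follows, assuming feature data is held as dictionaries and the supervised model is available as a callable; `build_key_feature_data` and `toy_classifier` are hypothetical stand-ins, not names from the patent.

```python
def build_key_feature_data(abnormal_rows, key_params, classify):
    """Keep users that `classify` again labels abnormal, projected onto key_params."""
    return [
        {p: row[p] for p in key_params}
        for row in abnormal_rows
        if classify(row)
    ]


def toy_classifier(row):
    # Stand-in for the trained decision tree or naive Bayes model.
    return row["phone_reused"] == 1 and row["url_suspicious"] == 1


abnormal_rows = [  # feature data of users flagged in step 301 (made-up values)
    {"phone_reused": 1, "url_suspicious": 1, "name_odd": 0},
    {"phone_reused": 1, "url_suspicious": 0, "name_odd": 1},
    {"phone_reused": 1, "url_suspicious": 1, "name_odd": 1},
]
key_feature_data = build_key_feature_data(
    abnormal_rows, ["phone_reused", "url_suspicious"], toy_classifier
)
print(key_feature_data)
```

The second user is dropped because the stand-in classifier does not confirm it as abnormal, and `name_odd` is dropped from the remaining rows because it is not a key feature parameter.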
Step 303: build the classification model from the key feature data.
In this embodiment, the classification model may be a decision tree model. A decision tree model may be created, and the key feature data containing the key feature parameters generated in step 302 may be used as training samples to train it. The trained decision tree model may then be used to identify whether a user is an abnormal user.
In this embodiment, the trained decision tree model identifies abnormal users using only the key feature parameters determined in step 302, which matter most for identifying abnormal users. This avoids interference from less important features, improves identification accuracy, and at the same time reduces the overhead of the identification process.
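Training a decision tree model on key feature data and using it to classify a user might look like the sketch below. It makes assumptions the text leaves open: the tree is a small ID3-style learner over categorical values, and the training set also contains key feature data of users not flagged as abnormal, since a classifier needs both classes. The names `train_tree` and `predict` and the sample values are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())


def train_tree(rows, labels, params):
    """Return a nested-dict decision tree, or a bare label for a leaf."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not params:
        return majority

    def gain(param):
        remainder = 0.0
        for value in {r[param] for r in rows}:
            subset = [l for r, l in zip(rows, labels) if r[param] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder

    best = max(params, key=gain)
    node = {"param": best, "default": majority, "children": {}}
    rest = [p for p in params if p != best]
    for value in {r[best] for r in rows}:
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        node["children"][value] = train_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], rest
        )
    return node


def predict(tree, row):
    """Walk the tree; unseen values fall back to the node's majority label."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["param"]], tree["default"])
    return tree


KEY_PARAMS = ["phone_reused", "url_suspicious"]  # hypothetical key feature parameters
samples = [(1, 1, True), (1, 1, True), (1, 0, False),
           (0, 1, False), (0, 0, False), (0, 1, False)]
rows = [dict(zip(KEY_PARAMS, s[:2])) for s in samples]
labels = [s[2] for s in samples]

model = train_tree(rows, labels, KEY_PARAMS)
is_abnormal = predict(model, {"phone_reused": 1, "url_suspicious": 1})
print(is_abnormal)
```

Because the model only ever reads the key feature parameters, any other parameters in a user's feature data are ignored at prediction time, which is the overhead reduction the text describes.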
Referring to Fig. 4, a structural diagram of one embodiment of the device for identifying abnormal users according to this application is shown. This device embodiment corresponds to the method embodiment shown in Fig. 2.
As shown in Fig. 4, the device 400 for identifying abnormal users of this embodiment includes a recognition unit 401, a selection unit 402, and a construction unit 403. The recognition unit 401 is configured to obtain feature data of multiple users and, based on the feature data, determine the abnormal users among the multiple users by unsupervised learning, where the feature data includes multiple feature parameters that indicate features of a user. The selection unit 402 is configured to select by supervised learning, from the multiple feature parameters and based on the feature data of the determined abnormal users, the key feature parameters used to build a classification model, and to generate key feature data containing the key feature parameters. The construction unit 403 is configured to build the classification model from the key feature data, so that the classification model can be used to identify whether a user is an abnormal user.
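The division into three units can be sketched as a small composition of callables. This is only an illustration of the data flow between units 401, 402, and 403; the class name `AbnormalUserDevice`, the method `build`, and the stub implementations are hypothetical, and the stubs are deliberately trivial.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AbnormalUserDevice:
    recognize: Callable  # recognition unit 401: feature data -> abnormal user ids
    select: Callable     # selection unit 402: (feature data, ids) -> key feature data
    construct: Callable  # construction unit 403: key feature data -> classifier

    def build(self, feature_data):
        abnormal_ids = self.recognize(feature_data)
        key_feature_data = self.select(feature_data, abnormal_ids)
        return self.construct(key_feature_data)


# Trivial stubs standing in for clustering, feature selection, and model training.
data = {
    "u1": {"phone_reused": 1, "name_odd": 0},
    "u2": {"phone_reused": 0, "name_odd": 1},
}
device = AbnormalUserDevice(
    recognize=lambda fd: {u for u, row in fd.items() if row["phone_reused"]},
    select=lambda fd, ids: [{"phone_reused": fd[u]["phone_reused"]} for u in sorted(ids)],
    construct=lambda kd: (lambda row: {"phone_reused": row["phone_reused"]} in kd),
)
model = device.build(data)
print(model(data["u1"]), model(data["u2"]))
```

Keeping the units as injected callables means the clustering algorithm, the selection strategy (decision tree or naive Bayes), and the final model can each be swapped without touching the other two.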
In some optional implementations of this embodiment, the recognition unit 401 includes an abnormal user identification subunit (not shown), configured to: cluster the feature data of the multiple users with a clustering algorithm to obtain multiple clusters; and, when a cluster contains feature data that matches preset abnormal feature data, determine the users corresponding to all feature data in that cluster to be abnormal users.
In some optional implementations of this embodiment, the selection unit 402 includes a decision tree selection subunit (not shown), configured to: build a decision tree using the feature data of the determined abnormal users as training samples, where each node in the decision tree corresponds to one feature parameter; use the feature parameters corresponding to nodes whose depth in the decision tree is greater than a depth threshold as the key feature parameters for building the classification model; select the feature data of abnormal users that meets the following condition: the result of classifying the abnormal user's feature data with the decision tree is "abnormal user"; and combine the key feature parameters in the selected abnormal users' feature data to obtain the key feature data.
In some optional implementations of this embodiment, the selection unit 402 includes a Bayes selection subunit (not shown), configured to: use a naive Bayes algorithm to compute, from the feature data of the determined abnormal users, the abnormal probability of each feature parameter, where the abnormal probability indicates the probability that the user is an abnormal user when the value of the feature parameter is abnormal; use the feature parameters whose abnormal probability is greater than a probability threshold as the key feature parameters for building the classification model; select the feature data of abnormal users that meets the following condition: the result of classifying the abnormal user's feature data with the naive Bayes algorithm is "abnormal user"; and combine the key feature parameters in the selected abnormal users' feature data to obtain the key feature data.
In some optional implementations of this embodiment, the construction unit 403 includes a model construction subunit (not shown), configured to: create a decision tree model; and train the decision tree model using the key feature data as training samples, so that the trained decision tree model can be used to identify whether a user is an abnormal user.
Fig. 5 shows a structural diagram of a computer system suitable for implementing the device for identifying abnormal users of the embodiments of this application.
As shown in Fig. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processing according to a program stored in read-only memory (ROM) 502 or a program loaded from a storage section 508 into random access memory (RAM) 503. The RAM 503 also stores the various programs and data required for the operation of the system 500. The CPU 501, the ROM 502, and the RAM 503 are connected to one another via a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a cathode ray tube (CRT) or liquid crystal display (LCD) and a speaker; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card or a modem. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 510 as needed, so that a computer program read from it can be installed into the storage section 508 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowcharts may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product that includes a computer program tangibly embodied on a machine-readable medium, where the computer program contains program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509 and/or installed from the removable medium 511.
The flowcharts and block diagrams in the drawings illustrate the possible architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of this application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of such boxes, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
In another aspect, the present application also provides a non-volatile computer storage medium. The
non-volatile computer storage medium may be the one included in the device described in the above
embodiments, or it may exist separately without being assembled into a terminal. The non-volatile
computer storage medium stores one or more programs which, when executed by a device, cause the
device to: obtain feature data of a plurality of users and, based on the feature data, determine
abnormal users among the plurality of users by unsupervised learning, the feature data including a
plurality of feature parameters that indicate features of a user; based on the feature data of the
determined abnormal users, select, by supervised learning, key feature parameters for building a
classification model from the plurality of feature parameters, and generate key feature data
containing the key feature parameters; and build a classification model using the key feature data,
so that the classification model can be used to identify whether a user is an abnormal user.
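As an illustration only, the unsupervised step described above can be sketched in Python. The sketch assumes k-means as the clustering algorithm (the application names clustering but no particular algorithm), treats "matching" preset abnormal feature data as exact equality of feature tuples, and uses hypothetical names (`kmeans`, `label_abnormal_users`, the user ids and feature values); none of these come from the application itself.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means (an assumed choice); returns a cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        assign = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return assign


def label_abnormal_users(features, preset_abnormal, k=2):
    """features: {user_id: feature tuple}.
    preset_abnormal: set of feature tuples already known to be abnormal.
    Returns the ids of all users whose cluster contains a preset match."""
    users = list(features)
    assign = kmeans([features[u] for u in users], k)
    # Flag a cluster if any member equals a preset abnormal feature tuple
    # (exact equality is an assumption made for this sketch).
    flagged = {a for u, a in zip(users, assign)
               if features[u] in preset_abnormal}
    return {u for u, a in zip(users, assign) if a in flagged}


# Hypothetical data: (logins per hour, orders per day) for six users.
features = {
    "u1": (1.0, 1.0), "u2": (1.2, 0.9), "u3": (0.9, 1.1),
    "b1": (9.0, 9.5), "b2": (9.2, 9.1), "b3": (8.8, 9.3),
}
abnormal = label_abnormal_users(features, preset_abnormal={(9.0, 9.5)}, k=2)
print(sorted(abnormal))  # ['b1', 'b2', 'b3']
```

Only `b1` matches the preset abnormal data, yet `b2` and `b3` are labelled as well because they fall in the same cluster. The resulting labels are what the later supervised steps would take as training input.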
The above description is only a preferred embodiment of the present application and an explanation
of the technical principles applied. Those skilled in the art should understand that the scope of
the invention involved in the present application is not limited to technical solutions formed by
the particular combination of the above technical features; it also covers other technical solutions
formed by any combination of the above technical features or their equivalents without departing
from the inventive concept, for example, technical solutions formed by replacing the above features
with (but not limited to) technical features having similar functions disclosed in the present
application.
Claims (10)
1. An abnormal user identification method, comprising:
obtaining feature data of a plurality of users and, based on the feature data, determining abnormal
users among the plurality of users by unsupervised learning, the feature data comprising a plurality
of feature parameters that indicate features of a user;
based on the feature data of the determined abnormal users, selecting, by supervised learning, key
feature parameters for building a classification model from the plurality of feature parameters, and
generating key feature data comprising the key feature parameters; and
building a classification model using the key feature data, so as to identify, using the
classification model, whether a user is an abnormal user.
2. The method according to claim 1, wherein determining abnormal users among the plurality of users
by unsupervised learning comprises:
clustering the feature data of the plurality of users using a clustering algorithm to obtain a
plurality of clusters; and
when a cluster contains feature data that matches preset abnormal feature data, determining the
users corresponding to all feature data in that cluster to be abnormal users.
3. The method according to claim 2, wherein selecting, by supervised learning and based on the
feature data of the determined abnormal users, key feature parameters for building a classification
model from the plurality of feature parameters, and generating key feature data comprising the key
feature parameters, comprises:
building a decision tree using the feature data of the determined abnormal users as training
samples, wherein each node in the decision tree corresponds to one feature parameter;
taking the feature parameters corresponding to nodes whose depth in the decision tree is greater
than a depth threshold as the key feature parameters for building the classification model;
selecting the feature data of abnormal users that satisfies the following condition: the result of
classifying the feature data of the abnormal user with the decision tree is an abnormal user; and
combining the key feature parameters in the feature data of the selected abnormal users to obtain
the key feature data.
4. The method according to claim 2, wherein selecting, by supervised learning and based on the
feature data of the determined abnormal users, key feature parameters for building a classification
model from the plurality of feature parameters, and generating key feature data comprising the key
feature parameters, comprises:
calculating, using a naive Bayes algorithm and according to the feature data of the determined
abnormal users, an abnormal probability for each feature parameter, the abnormal probability
indicating the probability that a user is an abnormal user when the value of the feature parameter
is abnormal;
taking the feature parameters whose abnormal probability is greater than a probability threshold as
the key feature parameters for building the classification model;
selecting the feature data of abnormal users that satisfies the following condition: the result of
classifying the feature data of the abnormal user with the naive Bayes algorithm is an abnormal
user; and
combining the key feature parameters in the feature data of the selected abnormal users to obtain
the key feature data.
5. The method according to claim 3 or 4, wherein the classification model is a decision tree model,
and building a classification model using the key feature data, so as to identify, using the
classification model, whether a user is an abnormal user, comprises:
creating a decision tree model; and
training the decision tree model using the key feature data as training samples, so as to identify,
using the trained decision tree model, whether a user is an abnormal user.
6. A device for identifying abnormal users, comprising:
a recognition unit configured to obtain feature data of a plurality of users and, based on the
feature data, determine abnormal users among the plurality of users by unsupervised learning, the
feature data comprising a plurality of feature parameters that indicate features of a user;
a selection unit configured to, based on the feature data of the determined abnormal users, select,
by supervised learning, key feature parameters for building a classification model from the
plurality of feature parameters, and generate key feature data comprising the key feature
parameters; and
a construction unit configured to build a classification model using the key feature data, so as to
identify, using the classification model, whether a user is an abnormal user.
7. The device according to claim 6, wherein the recognition unit comprises:
an abnormal user identification subunit configured to cluster the feature data of the plurality of
users using a clustering algorithm to obtain a plurality of clusters and, when a cluster contains
feature data that matches preset abnormal feature data, determine the users corresponding to all
feature data in that cluster to be abnormal users.
8. The device according to claim 7, wherein the selection unit comprises:
a decision tree selection subunit configured to: build a decision tree using the feature data of the
determined abnormal users as training samples, wherein each node in the decision tree corresponds to
one feature parameter; take the feature parameters corresponding to nodes whose depth in the
decision tree is greater than a depth threshold as the key feature parameters for building the
classification model; select the feature data of abnormal users that satisfies the following
condition: the result of classifying the feature data of the abnormal user with the decision tree is
an abnormal user; and combine the key feature parameters in the feature data of the selected
abnormal users to obtain the key feature data.
9. The device according to claim 8, wherein the selection unit comprises:
a Bayes selection subunit configured to: calculate, using a naive Bayes algorithm and according to
the feature data of the determined abnormal users, an abnormal probability for each feature
parameter, the abnormal probability indicating the probability that a user is an abnormal user when
the value of the feature parameter is abnormal; take the feature parameters whose abnormal
probability is greater than a probability threshold as the key feature parameters for building the
classification model; select the feature data of abnormal users that satisfies the following
condition: the result of classifying the feature data of the abnormal user with the naive Bayes
algorithm is an abnormal user; and combine the key feature parameters in the feature data of the
selected abnormal users to obtain the key feature data.
10. The device according to claim 9, wherein the construction unit comprises:
a model construction subunit configured to create a decision tree model and train the decision tree
model using the key feature data as training samples, so as to identify, using the trained decision
tree model, whether a user is an abnormal user.
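The Bayes-based selection of key feature parameters recited in the claims can be illustrated with a small Python sketch. It is a simplified reading, not the claimed implementation: it estimates P(abnormal user | feature value is abnormal) with Bayes' rule from labelled samples, assumes a caller-supplied test for what counts as an "abnormal value" of each feature, and omits the additional step of keeping only users that the naive Bayes classifier itself labels abnormal. All function names, feature names, and thresholds are hypothetical.

```python
def abnormal_probability(samples, labels, feature, is_abnormal_value):
    """Estimate P(abnormal user | value of `feature` is abnormal) via Bayes' rule.

    samples: list of dicts mapping feature name -> value.
    labels:  list of bools, True for users labelled abnormal.
    is_abnormal_value: assumed predicate deciding whether a value is abnormal.
    """
    n = len(samples)
    n_abn = sum(labels)
    if n == 0 or n_abn == 0:
        return 0.0
    prior = n_abn / n  # P(abnormal user)
    hits_abn = sum(1 for s, y in zip(samples, labels)
                   if y and is_abnormal_value(s[feature]))
    likelihood = hits_abn / n_abn  # P(value abnormal | abnormal user)
    hits_all = sum(1 for s in samples if is_abnormal_value(s[feature]))
    if hits_all == 0:
        return 0.0
    evidence = hits_all / n  # P(value abnormal)
    return likelihood * prior / evidence


def select_key_features(samples, labels, value_tests, threshold):
    """Keep features whose abnormal probability exceeds the threshold."""
    return [f for f, test in value_tests.items()
            if abnormal_probability(samples, labels, f, test) > threshold]


def key_feature_data(samples, labels, key_features):
    """Project the abnormal users' feature data onto the key features."""
    return [{f: s[f] for f in key_features}
            for s, y in zip(samples, labels) if y]


# Hypothetical samples; labels would come from the unsupervised step.
samples = [
    {"logins_per_hour": 50, "account_age_days": 3},
    {"logins_per_hour": 60, "account_age_days": 400},
    {"logins_per_hour": 2, "account_age_days": 5},
    {"logins_per_hour": 1, "account_age_days": 300},
]
labels = [True, True, False, False]
value_tests = {
    "logins_per_hour": lambda v: v > 20,   # assumed notion of "abnormal"
    "account_age_days": lambda v: v < 30,  # assumed notion of "abnormal"
}
keys = select_key_features(samples, labels, value_tests, threshold=0.8)
print(keys)  # ['logins_per_hour']
print(key_feature_data(samples, labels, keys))
```

In this toy data an abnormal login rate occurs only among abnormal users (probability 1.0), while a young account is equally common among normal users (probability 0.5), so only the former passes the 0.8 threshold. The projected records are what a decision tree model would then be trained on.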
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611051585.5A CN108108743B (en) | 2016-11-24 | 2016-11-24 | Abnormal user identification method and device for identifying abnormal user |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611051585.5A CN108108743B (en) | 2016-11-24 | 2016-11-24 | Abnormal user identification method and device for identifying abnormal user |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108108743A (en) | 2018-06-01 |
CN108108743B (en) | 2022-06-24 |
Family
ID=62204087
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611051585.5A Active CN108108743B (en) | 2016-11-24 | 2016-11-24 | Abnormal user identification method and device for identifying abnormal user |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108108743B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020133721A1 (en) * | 2001-03-15 | 2002-09-19 | Akli Adjaoute | Systems and methods for dynamic detection and prevention of electronic fraud and network intrusion |
CN103458042A (en) * | 2013-09-10 | 2013-12-18 | 上海交通大学 | Microblog advertisement user detection method |
CN103793484A (en) * | 2014-01-17 | 2014-05-14 | 五八同城信息技术有限公司 | Fraudulent conduct identification system based on machine learning in classified information website |
CN104657744A (en) * | 2015-01-29 | 2015-05-27 | 中国科学院信息工程研究所 | Multi-classifier training method and classifying method based on non-deterministic active learning |
CN105005594A (en) * | 2015-06-29 | 2015-10-28 | 嘉兴慧康智能科技有限公司 | Abnormal Weibo user identification method |
CN105376248A (en) * | 2015-11-30 | 2016-03-02 | 睿峰网云(北京)科技股份有限公司 | Method and device for identifying abnormal flow |
CN105873113A (en) * | 2015-01-21 | 2016-08-17 | 中国移动通信集团福建有限公司 | Method and system for positioning wireless quality problem |
Non-Patent Citations (1)
Title |
---|
赵秀恒等: "《概率统计模型与优化》", 30 June 2015 * |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108269012A (en) * | 2018-01-12 | 2018-07-10 | 中国平安人寿保险股份有限公司 | Construction method, device, storage medium and the terminal of risk score model |
CN109166624A (en) * | 2018-09-21 | 2019-01-08 | 广州杰赛科技股份有限公司 | A kind of behavior analysis method, device, server, system and storage medium |
WO2020078059A1 (en) * | 2018-10-17 | 2020-04-23 | 阿里巴巴集团控股有限公司 | Interpretation feature determination method and device for anomaly detection |
TWI723476B (en) * | 2018-10-17 | 2021-04-01 | 開曼群島商創新先進技術有限公司 | Interpretation feature determination method, device and equipment for abnormal detection |
CN110008980A (en) * | 2019-01-02 | 2019-07-12 | 阿里巴巴集团控股有限公司 | Identification model generation method, recognition methods, device, equipment and storage medium |
WO2020143322A1 (en) * | 2019-01-08 | 2020-07-16 | 平安科技(深圳)有限公司 | User request detection method and apparatus, computer device, and storage medium |
CN109902486A (en) * | 2019-01-24 | 2019-06-18 | 平安科技(深圳)有限公司 | Electronic device, abnormal user processing strategie Intelligent Decision-making Method and storage medium |
CN109918279A (en) * | 2019-01-24 | 2019-06-21 | 平安科技(深圳)有限公司 | Electronic device, method and storage medium based on daily record data identification user's abnormal operation |
CN109918279B (en) * | 2019-01-24 | 2022-09-27 | 平安科技(深圳)有限公司 | Electronic device, method for identifying abnormal operation of user based on log data and storage medium |
CN110570244A (en) * | 2019-09-04 | 2019-12-13 | 深圳创新奇智科技有限公司 | hot-selling commodity construction method and system based on abnormal user identification |
CN113822309A (en) * | 2020-09-25 | 2021-12-21 | 京东科技控股股份有限公司 | User classification method, device and non-volatile computer-readable storage medium |
CN113822309B (en) * | 2020-09-25 | 2024-04-16 | 京东科技控股股份有限公司 | User classification method, apparatus and non-volatile computer readable storage medium |
CN112308566A (en) * | 2020-09-27 | 2021-02-02 | 中智关爱通(上海)科技股份有限公司 | Enterprise fraud identification method |
CN113129054A (en) * | 2021-03-30 | 2021-07-16 | 广州博冠信息科技有限公司 | User identification method and device |
CN113129054B (en) * | 2021-03-30 | 2024-05-31 | 广州博冠信息科技有限公司 | User identification method and device |
CN113743963A (en) * | 2021-09-28 | 2021-12-03 | 北京奇艺世纪科技有限公司 | Abnormal recognition model training method, abnormal object recognition device and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN108108743B (en) | 2022-06-24 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN108108743A (en) | Abnormal user recognition methods and the device for identifying abnormal user | |
US11741361B2 (en) | Machine learning-based network model building method and apparatus | |
US11593458B2 (en) | System for time-efficient assignment of data to ontological classes | |
Shanthamallu et al. | A brief survey of machine learning methods and their sensor and IoT applications | |
KR102252081B1 (en) | Acquisition of image characteristics | |
CN112232925A (en) | Method for carrying out personalized recommendation on commodities by fusing knowledge maps | |
CN111966904B (en) | Information recommendation method and related device based on multi-user portrait model | |
US20220284349A1 (en) | Techniques to generate network simulation scenarios | |
CN110598869B (en) | Classification method and device based on sequence model and electronic equipment | |
CN108229485A (en) | For testing the method and apparatus of user interface | |
US20080189237A1 (en) | Goal seeking using predictive analytics | |
CN110995459B (en) | Abnormal object identification method, device, medium and electronic equipment | |
CN108111399B (en) | Message processing method, device, terminal and storage medium | |
CN110457476A (en) | Method and apparatus for generating disaggregated model | |
CN107679737A (en) | The method and device of project recommendation | |
CN110708285A (en) | Flow monitoring method, device, medium and electronic equipment | |
CN111459898A (en) | Machine learning method, computer-readable recording medium, and machine learning apparatus | |
CN111159481B (en) | Edge prediction method and device for graph data and terminal equipment | |
CN104077408B (en) | Extensive across media data distributed semi content of supervision method for identifying and classifying and device | |
CN109961075A (en) | User gender prediction method, apparatus, medium and electronic equipment | |
CN109961163A (en) | Gender prediction's method, apparatus, storage medium and electronic equipment | |
CN105357583A (en) | Method and device for discovering interest and preferences of intelligent television user | |
WO2019062404A1 (en) | Application program processing method and apparatus, storage medium, and electronic device | |
CN114898184A (en) | Model training method, data processing method and device and electronic equipment | |
CN114861004A (en) | Social event detection method, device and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||