CN108205763A - A kind of user account detection method - Google Patents

A user account detection method

Info

Publication number
CN108205763A
Authority
CN
China
Prior art keywords
account
characteristic attribute
user
classification
weights
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611174998.2A
Other languages
Chinese (zh)
Inventor
靳皎洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201611174998.2A
Publication of CN108205763A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201: Market modelling; Market analysis; Collecting market data
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/50: Network services
    • H04L67/535: Tracking the activity of the user

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Signal Processing (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a user account detection method. The method includes: determining, according to the weight range to which each characteristic attribute of the current user account belongs and a training sample database, the probability that each characteristic attribute occurs under each category; calculating, from those probabilities, the probability that the current user account belongs to each category; and selecting the category with the highest probability as the user account detection result. The invention enables automatic detection and improves detection efficiency.

Description

A user account detection method
Technical field
The present invention relates to the technical field of electronic commerce, and in particular to a user account detection method.
Background
An e-commerce platform system holds many thousands of user accounts. Once the number of registered accounts reaches a certain scale, the quality and category of the accounts become a concern: some accounts are not genuine, that is, they are false accounts generated by a program, and some accounts show no operation information such as browsing or purchasing after registration. Such accounts need to be screened by a systematic method, to reduce their impact on the system and the maintenance cost caused by the large number of accounts. At present there is no unified, systematic method for account detection. Detection is mostly manual: a person checks and judges the account information recorded by the system, which is inefficient and produces inaccurate results.
An existing account detection method includes the following steps:
1) scan all account information;
2) check the operation log of each account over a period of time;
3) analyze the log by program or manually;
4) handle the accounts detected as abnormal;
5) repeat steps 2 to 4 until all accounts have been checked.
As can be seen from the above, the prior art has the following shortcomings:
1) the detection procedure is complicated and relies on manual operation;
2) detection accuracy is poor, so accounts are easily handled by mistake;
3) detection efficiency falls as the number of users grows;
4) detection results are not timely.
Summary of the invention
The purpose of the present invention is to provide a user account detection method that detects accounts automatically and improves detection efficiency.
To achieve the above purpose, the present invention provides a user account detection method for detecting the category to which a user account belongs. The method includes: determining, according to the weight range to which each characteristic attribute of the current user account belongs and a training sample database, the probability that each characteristic attribute occurs under each category; calculating, from the probability that each characteristic attribute occurs under each category, the probability that the current user account belongs to each category; and selecting the category with the highest probability as the user account detection result.
In summary, the user account detection method provided by the present invention combines the current user account with the training sample database to classify the account automatically. Compared with the manual detection of the prior art, it improves both the efficiency and the accuracy of detection.
Description of the drawings
Fig. 1 is a flow diagram of the user account detection method of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the scheme of the present invention is described in further detail below with reference to the drawing and embodiments.
For users of an e-commerce platform, attributes from several angles affect user activity, such as purchase behavior, completeness of personal information, number of logins, and number of products browsed. We call these the characteristic attributes of the user. Here we select key factors that influence user activity as characteristic attributes:
Operations such as logging in, logging out, and browsing all leave operation logs in the system. The user account operation log is designated characteristic attribute A1;
The conversion rate is central to an e-commerce platform and is the most telling indicator of user activity, so the user account purchase record is selected as characteristic attribute A2;
The user's personal information, such as telephone number, shipping address, and transaction card number, has a key effect on user behavior, so user account information completeness is selected as characteristic attribute A3.
It should be noted that the characteristic attributes of a user account include, but are not limited to, the above three; any key factor that influences user activity can serve as a characteristic attribute.
Accordingly, the characteristic attribute weights of a user account can include: the average number of operation logs since registration, the average number of purchase records since registration, and a value determined by whether the user account information is complete.
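The three weights can be sketched in code as follows. This is a minimal illustration, not the patent's implementation: the `AccountRecord` type and its field names are hypothetical, and the "average" is assumed to be a per-day average since registration, which matches the "quantity / registration days" wording used later in the embodiment.

```python
from dataclasses import dataclass


@dataclass
class AccountRecord:
    # Hypothetical raw fields; the patent does not name them.
    days_since_registration: int
    operation_log_count: int
    purchase_record_count: int
    information_complete: bool


def attribute_weights(record: AccountRecord) -> tuple[float, float, int]:
    """Return the weights (a1, a2, a3) for one account."""
    # Guard against division by zero for accounts registered today (an assumption).
    days = max(record.days_since_registration, 1)
    a1 = record.operation_log_count / days    # average operation logs per day
    a2 = record.purchase_record_count / days  # average purchase records per day
    a3 = 1 if record.information_complete else 0
    return a1, a2, a3


example = AccountRecord(100, 12, 30, False)
print(attribute_weights(example))  # (0.12, 0.3, 0)
```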
Fig. 1 is a flow diagram of the user account detection method of the present invention, which is used to detect the category to which a user account belongs. The method includes:
Step 11: according to the weight range to which each characteristic attribute of the current user account belongs and the training sample database, determine the probability that each characteristic attribute occurs under each category;
Before step 11 is performed, a plurality of characteristic attributes of user accounts are determined in advance, and for each characteristic attribute the weight ranges corresponding to active accounts, ordinary accounts, and inactive accounts are defined.
Specifically, the method of forming the training sample database includes:
S111: obtain a predetermined number of sample user accounts, where each sample user account has a determined category and each characteristic attribute of each sample user account has a determined weight;
S112: establish a statistical table for each characteristic attribute in the training sample database;
S113: record in the first row of the statistical table the weight ranges defined in advance for this characteristic attribute for active accounts, ordinary accounts, and inactive accounts, and record in the first column each category of user account;
S114: according to the category of each sample user account and the weight of each of its characteristic attributes, count, for each category, the number of sample user accounts falling in each weight range of each characteristic attribute, and add the counts to the statistical tables to form the training sample database.
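Steps S111 to S114 amount to counting labeled samples per category and per weight range. The sketch below shows one way to build such statistical tables in memory; the nested-dictionary layout, the function name `build_statistics`, and the category and range labels are illustrative assumptions, since the patent does not specify a storage format.

```python
from collections import Counter, defaultdict

# Each sample: (category, {attribute name: weight-range label}).
# Category and range labels here are illustrative placeholders.
samples = [
    ("active", {"A1": "mid", "A3": "complete"}),
    ("active", {"A1": "high", "A3": "complete"}),
    ("inactive", {"A1": "low", "A3": "incomplete"}),
]


def build_statistics(samples):
    """Return (category totals, per-attribute count tables)."""
    category_totals = Counter()
    # tables[attribute][category][weight_range] -> number of sample accounts
    tables = defaultdict(lambda: defaultdict(Counter))
    for category, ranges in samples:
        category_totals[category] += 1
        for attribute, weight_range in ranges.items():
            tables[attribute][category][weight_range] += 1
    return category_totals, tables


totals, tables = build_statistics(samples)
print(totals["active"], tables["A1"]["active"]["mid"])  # 2 1
```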
Specifically, the method of determining the probability that each characteristic attribute occurs under each category, according to the weight range to which each characteristic attribute of the current user account belongs and the training sample database, includes:
S116: according to the weight of each characteristic attribute of the current user account, determine the weight range to which each characteristic attribute of the account belongs;
S117: according to the weight range to which each characteristic attribute of the account belongs, look up in the training sample database the number of sample user accounts in each category that fall in that weight range of the corresponding characteristic attribute;
S118: for one characteristic attribute under one category, according to the naive Bayes classification algorithm, divide the number of sample user accounts in that category that fall in the weight range of the current account's attribute by the total number of sample user accounts in that category, to obtain the probability that this characteristic attribute occurs under that category.
Step 12: according to the probability that each characteristic attribute occurs under each category, calculate the probability that the current user account belongs to each category;
Specifically, for one category, the probabilities that each characteristic attribute occurs under that category are multiplied together and then multiplied by the probability of that category in the training sample database, to obtain the probability that the current user account belongs to that category.
Step 13: select the category with the highest probability as the user account detection result.
This completes the user account detection method of the present invention.
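Steps 11 to 13 can be sketched as a single function. This is an illustrative reading of the text rather than the patent's own code: the function name `classify`, the dictionary layout, and the toy counts are assumptions.

```python
def classify(ranges, category_totals, tables):
    """Pick the category with the largest naive Bayes score.

    ranges: {attribute: weight-range label} for the current account.
    category_totals: {category: number of sample accounts}.
    tables: {attribute: {category: {weight-range label: count}}}.
    """
    sample_count = sum(category_totals.values())
    scores = {}
    for category, total in category_totals.items():
        score = total / sample_count  # probability of the category in the samples
        for attribute, weight_range in ranges.items():
            count = tables[attribute][category].get(weight_range, 0)
            score *= count / total  # step 11: P(attribute range | category)
        scores[category] = score  # step 12
    return max(scores, key=scores.get), scores  # step 13


# Toy data: 6 active and 4 inactive samples, one attribute.
totals = {"active": 6, "inactive": 4}
tables = {"A3": {"active": {"complete": 5, "incomplete": 1},
                 "inactive": {"complete": 1, "incomplete": 3}}}
label, scores = classify({"A3": "incomplete"}, totals, tables)
print(label)  # inactive
```

In this sketch a weight range with no sample accounts in a category drives that category's score to zero; the patent text does not describe how such a case should be handled.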
To illustrate the present invention clearly, a concrete scenario is given below. In this embodiment, user accounts are divided into two categories: active accounts and inactive accounts.
1) With user account operation log A1, user account purchase record A2, and user account information completeness A3 as the characteristic attributes, the weight range of each characteristic attribute is divided as follows:
A1: {a1<=0.05, 0.05<a1<0.2, a1>=0.2}
The reference is the number of operation logs the account has generated since registration. For example, for an account registered 100 days ago, 5 or fewer operations indicate an inactive account, more than 5 and fewer than 20 operations indicate an ordinary account, and 20 or more operations indicate an active account. That is, the weight range for inactive accounts is a1<=0.05, the weight range for ordinary accounts is 0.05<a1<0.2, and the weight range for active accounts is a1>=0.2.
A2: {a2<=0.1, 0.1<a2<0.8, a2>=0.8}
The reference is the number of purchase records the account has generated since registration. For example, for an account registered 100 days ago, 10 or fewer purchases indicate an inactive account, more than 10 and fewer than 80 purchases indicate an ordinary account, and 80 or more purchases indicate an active account. That is, the weight range for inactive accounts is a2<=0.1, the weight range for ordinary accounts is 0.1<a2<0.8, and the weight range for active accounts is a2>=0.8.
A3: {a3=0 (no), a3=1 (yes)}
The reference is whether the user account information is complete. Incomplete information indicates an inactive account, a3=0; complete information indicates an active account, a3=1.
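The range division above can be expressed as a small lookup. The thresholds come from the embodiment; the helper names `weight_range` and `ranges_for` and the use of the range expressions as string labels are assumptions made for illustration.

```python
def weight_range(value, low, high, name):
    """Map a weight to one of the three ranges: <=low, between, >=high."""
    if value <= low:
        return f"{name}<={low}"
    if value < high:
        return f"{low}<{name}<{high}"
    return f"{name}>={high}"


def ranges_for(a1, a2, a3):
    """Return the weight-range label of each characteristic attribute."""
    return {
        "A1": weight_range(a1, 0.05, 0.2, "a1"),
        "A2": weight_range(a2, 0.1, 0.8, "a2"),
        "A3": f"a3={a3}",  # a3 is already a 0/1 flag
    }


print(ranges_for(0.12, 0.3, 0))
# {'A1': '0.05<a1<0.2', 'A2': '0.1<a2<0.8', 'A3': 'a3=0'}
```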
2) Form the training sample database
10,000 user accounts that have already been checked manually are used as training samples. Each sample user account has a determined category, and each characteristic attribute of each sample user account has a determined weight. Of the samples, 8,900 (89%) are active accounts (denoted C0) and 1,100 (11%) are inactive accounts (denoted C1).
The probability of an active account is 89%, that is, P(C0)=8900/10000=0.89
The probability of an inactive account is 11%, that is, P(C1)=1100/10000=0.11
A statistical table is established for each characteristic attribute in the training sample database:
For characteristic attribute A1:
Category               a1<=0.05    0.05<a1<0.2    a1>=0.2
Active account C0      2670        4450           1780
Inactive account C1    880         110            110
Table 1
According to the statistics in Table 1, in the active account category the number of sample user accounts whose A1 weight falls in the range a1<=0.05 is 2670, in the range 0.05<a1<0.2 is 4450, and in the range a1>=0.2 is 1780. In the inactive account category the number of sample user accounts whose A1 weight falls in the range a1<=0.05 is 880, in the range 0.05<a1<0.2 is 110, and in the range a1>=0.2 is 110.
For characteristic attribute A2:
Category               a2<=0.1     0.1<a2<0.8     a2>=0.8
Active account C0      890         6230           1780
Inactive account C1    770         220            110
Table 2
According to the statistics in Table 2, in the active account category the number of sample user accounts whose A2 weight falls in the range a2<=0.1 is 890, in the range 0.1<a2<0.8 is 6230, and in the range a2>=0.8 is 1780. In the inactive account category the number of sample user accounts whose A2 weight falls in the range a2<=0.1 is 770, in the range 0.1<a2<0.8 is 220, and in the range a2>=0.8 is 110.
For characteristic attribute A3:
Category               a3=0        a3=1
Active account C0      1780        7120
Inactive account C1    990         110
Table 3
According to the statistics in Table 3, in the active account category the number of sample user accounts with a3=0 is 1780 and with a3=1 is 7120. In the inactive account category the number of sample user accounts with a3=0 is 990 and with a3=1 is 110.
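As a quick consistency check, the three tables can be written down as data and each row summed against the class totals of 8900 and 1100. The dictionary layout and the names `CLASS_TOTALS`, `TABLES`, and `rows_consistent` are illustrative assumptions; the counts are those stated in the text.

```python
CLASS_TOTALS = {"C0": 8900, "C1": 1100}

# Counts from Tables 1 to 3: TABLES[attribute][category][weight range]
TABLES = {
    "A1": {"C0": {"a1<=0.05": 2670, "0.05<a1<0.2": 4450, "a1>=0.2": 1780},
           "C1": {"a1<=0.05": 880, "0.05<a1<0.2": 110, "a1>=0.2": 110}},
    "A2": {"C0": {"a2<=0.1": 890, "0.1<a2<0.8": 6230, "a2>=0.8": 1780},
           "C1": {"a2<=0.1": 770, "0.1<a2<0.8": 220, "a2>=0.8": 110}},
    "A3": {"C0": {"a3=0": 1780, "a3=1": 7120},
           "C1": {"a3=0": 990, "a3=1": 110}},
}


def rows_consistent(tables, class_totals):
    """Check that every table row sums to its category's sample total."""
    return all(sum(row.values()) == class_totals[category]
               for table in tables.values()
               for category, row in table.items())


print(rows_consistent(TABLES, CLASS_TOTALS))  # True
```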
3) Suppose the current user account is to be detected.
The weight of A1, obtained as the current account's number of operation logs divided by its number of days since registration, falls in the ordinary-account range 0.05<a1<0.2;
The weight of A2, obtained as the current account's number of purchase records divided by its number of days since registration, falls in the ordinary-account range 0.1<a2<0.8;
The current account's information is incomplete, so the weight of A3 is a3=0, which falls in the inactive-account range.
4) According to the naive Bayes classification algorithm, obtain the probability that each characteristic attribute of the current user account occurs under each category:
Operation log probability given an active account: F1=P(A1|C0)=4450/8900=0.5
Operation log probability given an inactive account: F2=P(A1|C1)=110/1100=0.1
Purchase record probability given an active account: F3=P(A2|C0)=6230/8900=0.7
Purchase record probability given an inactive account: F4=P(A2|C1)=220/1100=0.2
Incomplete information probability given an active account: F5=P(A3|C0)=1780/8900=0.2
Incomplete information probability given an inactive account: F6=P(A3|C1)=990/1100=0.9
A brief introduction to Bayes' theorem follows. Bayes' theorem concerns the conditional probabilities (and marginal probabilities) of random events A and B, where P(A|B) is the probability that A occurs given that B has occurred. For example, P(A1|C0) is the probability of the current account's operation log weight range occurring given the active account category. It is calculated as follows: the A1 weight of the current user account falls in the ordinary-account range 0.05<a1<0.2; from Table 1, the number of sample user accounts in the active account category within the range 0.05<a1<0.2 is 4450, and the total number of sample user accounts in the active account category is 8900; therefore F1=P(A1|C0)=4450/8900=0.5. The other probabilities are obtained in the same way and are not repeated here.
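For reference, Bayes' theorem and the naive Bayes factorization that the calculation in this embodiment relies on can be written as below. The conditional independence of A1, A2, and A3 given the category is the standard naive Bayes assumption; the patent text does not state it explicitly, and the index notation C_k is introduced here only for illustration.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

P(C_k \mid A_1, A_2, A_3)
  = \frac{P(A_1, A_2, A_3 \mid C_k)\, P(C_k)}{P(A_1, A_2, A_3)}
  \;\propto\; P(C_k) \prod_{i=1}^{3} P(A_i \mid C_k)
```

Because the denominator is the same for every category, comparing the products on the right is enough to pick the most probable category.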
5) The probability that the current user account belongs to the active account category is calculated as:
R1 = F1 × F3 × F5 × P(C0) = P(A1|C0) × P(A2|C0) × P(A3|C0) × P(C0)
= 0.5 × 0.7 × 0.2 × 0.89 = 0.0623
The probability that the current user account belongs to the inactive account category is calculated as:
R2 = F2 × F4 × F6 × P(C1) = P(A1|C1) × P(A2|C1) × P(A3|C1) × P(C1)
= 0.1 × 0.2 × 0.9 × 0.11 = 0.00198
6) Comparing the two, R1>R2, so the current user account is classified as an active account.
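The arithmetic of steps 4) to 6) can be reproduced from the raw counts. The sketch below also normalizes R1 and R2 so that they sum to 1; that normalization is an addition for illustration and is not part of the patent, which only compares R1 with R2. The variable names are assumptions.

```python
import math

# Category probabilities from the 10,000 training samples
p_c0, p_c1 = 8900 / 10000, 1100 / 10000

# Current account ranges: 0.05<a1<0.2, 0.1<a2<0.8, a3=0
likelihoods_c0 = [4450 / 8900, 6230 / 8900, 1780 / 8900]  # F1, F3, F5
likelihoods_c1 = [110 / 1100, 220 / 1100, 990 / 1100]     # F2, F4, F6

r1 = p_c0 * math.prod(likelihoods_c0)
r2 = p_c1 * math.prod(likelihoods_c1)

# Not in the patent: rescale so the two scores sum to 1.
posterior_c0 = r1 / (r1 + r2)

result = "active account" if r1 > r2 else "inactive account"
print(round(r1, 5), round(r2, 5), round(posterior_c0, 3), result)
# 0.0623 0.00198 0.969 active account
```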
As the above embodiment shows, the category of the user account to be detected can be calculated from the training sample database using the naive Bayes classification algorithm, which realizes automatic classification detection and improves both the efficiency and the accuracy of detection. The above embodiment uses three characteristic attributes and divides the weight range of each into segments to perform account detection. This is only an example: in practical applications, characteristic attributes can be selected according to the concrete scenario, and the weight range of each characteristic attribute can be divided flexibly. Further, the segment boundaries of each characteristic attribute can be tuned over time to further improve detection accuracy.
The user account detection method of the present invention has the following advantages:
1. It reduces the impact of abnormal accounts on the system and the maintenance cost caused by the large number of accounts;
2. It identifies and distinguishes user categories, which makes it easier to build user profiles and provide personalized services;
3. Through the detection and monitoring of accounts, it improves system security and stability.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (7)

1. A user account detection method for detecting the category to which a user account belongs, characterized in that the method includes:
determining, according to the weight range to which each characteristic attribute of the current user account belongs and a training sample database, the probability that each characteristic attribute occurs under each category;
calculating, according to the probability that each characteristic attribute occurs under each category, the probability that the current user account belongs to each category;
selecting the category with the highest probability as the user account detection result.
2. The method of claim 1, characterized in that the method further includes: determining in advance a plurality of characteristic attributes of user accounts, and defining for each characteristic attribute the weight ranges corresponding to active accounts, ordinary accounts, and inactive accounts.
3. The method of claim 2, characterized in that the method of forming the training sample database includes:
obtaining a predetermined number of sample user accounts, where each sample user account has a determined category and each characteristic attribute of each sample user account has a determined weight;
establishing a statistical table for each characteristic attribute in the training sample database;
recording in the first row of the statistical table the weight ranges defined in advance for this characteristic attribute for active accounts, ordinary accounts, and inactive accounts, and recording in the first column each category of user account;
counting, according to the category of each sample user account and the weight of each of its characteristic attributes, the number of sample user accounts in each category falling in each weight range of each characteristic attribute, and adding the counts to the statistical tables to form the training sample database.
4. The method of claim 3, characterized in that the determining, according to the weight range to which each characteristic attribute of the current user account belongs and the training sample database, of the probability that each characteristic attribute occurs under each category includes:
determining, according to the weight of each characteristic attribute of the current user account, the weight range to which each characteristic attribute of the account belongs;
determining, according to the weight range to which each characteristic attribute of the account belongs, the number of sample user accounts in each category in the training sample database that fall in that weight range of the corresponding characteristic attribute;
for one characteristic attribute under one category, according to the naive Bayes classification algorithm, dividing the number of sample user accounts in that category that fall in the weight range of the current account's attribute by the total number of sample user accounts in that category, to obtain the probability that this characteristic attribute occurs under that category.
5. The method of claim 4, characterized in that the calculating, according to the probability that each characteristic attribute occurs under each category, of the probability that the current user account belongs to each category includes:
for one category, multiplying the probabilities that each characteristic attribute occurs under that category by the probability of that category in the training sample database, to obtain the probability that the current user account belongs to that category.
6. The method of claim 1, characterized in that the user account characteristic attributes include: user account operation log, user account purchase record, and user account information completeness.
7. The method of claim 6, characterized in that the user account characteristic attribute weights include: the average number of operation logs of the user account since registration, the average number of purchase records of the user account since registration, and a value determined by whether the user account information is complete.
CN201611174998.2A 2016-12-19 2016-12-19 A kind of user account detection method Pending CN108205763A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611174998.2A CN108205763A (en) 2016-12-19 2016-12-19 A kind of user account detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611174998.2A CN108205763A (en) 2016-12-19 2016-12-19 A kind of user account detection method

Publications (1)

Publication Number Publication Date
CN108205763A true CN108205763A (en) 2018-06-26

Family

ID=62602792

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611174998.2A Pending CN108205763A (en) 2016-12-19 2016-12-19 A kind of user account detection method

Country Status (1)

Country Link
CN (1) CN108205763A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635529A (en) * 2018-11-13 2019-04-16 平安科技(深圳)有限公司 Account shares detection method, device, medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103853701A (en) * 2012-11-30 2014-06-11 中国科学院声学研究所 Neural-network-based self-learning semantic detection method and system
CN104348810A (en) * 2013-08-05 2015-02-11 深圳市腾讯计算机系统有限公司 Method, device and system for detecting stolen account
CN104901847A (en) * 2015-05-27 2015-09-09 国家计算机网络与信息安全管理中心 Social network zombie account detection method and device
CN105389704A (en) * 2015-11-16 2016-03-09 小米科技有限责任公司 Method and device for judging authenticity of users

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180626