CN103793484A - Fraudulent conduct identification system based on machine learning in classified information website - Google Patents


Info

Publication number
CN103793484A
Authority
CN
China
Prior art keywords
data
user
user behavior
probability
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410022138.1A
Other languages
Chinese (zh)
Other versions
CN103793484B (en)
Inventor
张鹏
张爱华
张美琦
张朝阳
孙亚健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing 58 Information Technology Co Ltd
Original Assignee
Beijing 58 Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2014-01-17
Filing date
2014-01-17
Publication date
2014-05-14
Application filed by Beijing 58 Information Technology Co Ltd
Priority to CN201410022138.1A
Publication of CN103793484A
Application granted
Publication of CN103793484B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method for a machine-learning-based fraud identification system in a classified information website. The method comprises the following steps: (a) extracting sample data from existing user behavior data, for generating a model for the first time; (b) selecting and extracting multiple user behavior features according to the training data of different service types; (c) vectorizing the sample training data based on the extracted user behavior features; (d) generating a prediction model from the vectorized sample training data; (e) detecting online data with the generated model, based on classification and clustering rules; (f) processing the abnormal user data detected. The method identifies user behavior in multiple dimensions and efficiently reduces the amount of false transaction information. Moreover, low-quality user behavior can still be identified well even when the training data contain noise.

Description

Machine-learning-based fraud identification system for a classified information website
Technical field
The present invention relates to Internet technology, and in particular to a machine-learning-based fraud identification system for a classified information website.
Background
A classified information website is a newly emerging type of website on the Internet that covers information on every aspect of daily life. On these websites users can obtain free and convenient information publishing services, covering second-hand goods trading, used car trading, housing, pets, recruitment, part-time work, job hunting, dating and social activities, everyday services, and so on. Classified information is also called classified advertising. The advertisements people see every day on television and in newspapers and periodicals are imposed on the viewer whether the viewer is willing or not; such advertising is passive advertising. Information that people actively look up, for example on recruitment, house rental or travel, is called active advertising. As the information society develops, passive advertising increasingly arouses dislike, while active advertising is widely favored. Almost every local evening paper, daily paper and lifestyle and entertainment paper carries classified information, and the better the newspaper, the more space it tends to devote to classified information. Classified information websites arose from this.
Among the users who publish information on a classified information website there is often a portion of low-quality users, who seek profit by defrauding other users, for example by publishing false information. Classified information websites therefore set up processing rules and filtering logic for low-quality information.
Existing means of identifying false information are mainly rule-based, supplemented by some manual intervention. For example, rules such as counting the number of posts published from one IP address within a period of time, checking whether the information content contains prohibited words, or checking whether the price range of the published goods or services is unreasonable are used to judge whether a user is a low-quality user who publishes false information, so that measures such as deleting the information, issuing a warning or cancelling the user account can be taken. However, common processing rules and filtering logic usually identify low-quality behavior in a single dimension, so low-quality users can probe for the critical points of the rules and find ways to bypass the system's processing and filtering of low-quality information.
In addition, as more and more rules go online, fewer and fewer usable rules remain, because rules capture only obvious features. Rule-based identification in existing methods can only separate users with a linear decision surface, so most low-quality information is neither identified nor processed by the system.
Therefore, a machine-learning-based fraud identification system for a classified information website is needed that identifies user behavior in multiple dimensions, thereby efficiently reducing the amount of false transaction information and improving the authenticity of transaction information.
Summary of the invention
The object of the present invention is to provide a machine-learning-based fraud identification system for a classified information website.
According to one aspect of the present invention, a method is provided for a machine-learning-based fraud identification system in a classified information website, the method comprising the following steps: a) extracting sample data from existing user behavior data, for generating a model for the first time; b) selecting and extracting multiple user behavior features for the training data of different service types; c) vectorizing the sample training data based on the extracted user behavior features; d) producing a prediction model from the vectorized sample training data; e) detecting online data with the produced model, based on classification and clustering rules; f) processing the abnormal user data obtained by the detection.
Preferably, the sample data in step a comprise positive sample data and negative sample data, corresponding respectively to users with good behavior and users with low-quality behavior.
Preferably, in step b, the user behavior features comprise user behavior data for the same cookie and statistical counts for each dimension of the user.
Preferably, in step b, the user features extracted for different service types are selected by computing information entropy and by cross-validating the model on data.
Preferably, in step d, a probabilistic classifier is used for decision-making.
Preferably, in step e, the model is used to calculate a probability score representing the probability that the user behavior data are abnormal.
Preferably, the probability score is calculated as follows: multiple models each examine one of several feature groups of the user behavior data and each produce a sub-probability score; a sum-and-product transformation is then applied to the sub-probability scores to obtain the probability score of the user behavior data.
Preferably, in step e, the user anomaly detection method based on classification rules comprises setting a probability line (threshold) for judging whether user behavior data are bad data.
Preferably, in step e, the user anomaly detection method based on clustering rules comprises: e1) monitoring the probability scores for clustering; e2) for probability scores that cluster, inspecting a certain number of the user behaviors, to judge whether the user behaviors clustered at the same probability score are low-quality user behaviors; e3) according to the inspection result, updating, by the abnormal user behavior discrimination model, the probability score of that class of user behavior; e4) adding the newly found bad data to the sample store as training data; e5) training the model with the new training data.
Preferably, in step e5, user behavior data whose probability score is inaccurate are analyzed offline to discover new user behavior features and select suitable features.
With the machine-learning-based fraud identification system for a classified information website according to the present invention, user behavior can be identified in multiple dimensions, thereby efficiently reducing the amount of false transaction information and improving the authenticity of transaction information. Moreover, low-quality user behavior can be identified well even when the training data contain noise.
Brief description of the drawings
The objects, functions and advantages of the present invention are illustrated by the following description of embodiments with reference to the accompanying drawings, in which:
Fig. 1 schematically shows a flowchart of the method of the machine-learning-based fraud identification system for a classified information website according to the present invention.
Detailed description of the embodiments
The objects and functions of the present invention, and the methods for achieving them, are explained with reference to exemplary embodiments. The present invention is not, however, limited to the exemplary embodiments disclosed below; it can be implemented in different forms. The description serves only to help those skilled in the relevant art gain a comprehensive understanding of the specific details of the invention.
Embodiments of the present invention are described below with reference to the drawings. In the drawings, identical reference numerals denote identical or similar components, or identical or similar steps.
The fraud information identification method of the present invention uses data generated from user behavior and can identify information as soon as the user publishes it. The machine-learning model adopted by the invention identifies user behavior in multiple dimensions; in particular, users who publish low-quality information find it hard to know which dimensions are used for identification, and therefore cannot evade detection by getting around rules. The invention can make predictions on data in an environment with few samples and high noise, with high accuracy. By collecting and modeling the various behaviors of users, abnormal users are identified.
Fig. 1 schematically shows a flowchart of the method of the machine-learning-based fraud identification system for a classified information website according to the present invention. As shown in Fig. 1:
Step 110: extract sample data from existing user behavior data, for generating the model for the first time. The sample data can be drawn from an existing database of audited user behavior and are mainly used to divide users into good users and low-quality users, corresponding to positive sample data and negative sample data respectively. Positive samples are the behavior data of good users who passed the audit; negative samples are the behavior data of users whose low-quality behavior was confirmed by the audit, for example users who violated relatively serious rules (such as publishing false transaction information). The sample data in the existing audit database form a user behavior database built by classifying users with conventional user behavior identification methods, for example detecting whether the text a user publishes contains prohibited words, or whether the pictures a user publishes contain prohibited content.
In subsequent model iterations after the model has been generated, the method according to the invention can use the extracted positive and negative sample store directly, without going back to the original audit database.
Step 120: select and extract multiple user behavior features for the training data of different service types. Which user behavior features to use in each business line is determined by experimental results.
User behavior features are usually very numerous. To meet the requirements of both computational accuracy and computational efficiency, the user behavior features summarized according to the invention are features that discriminate between good and low-quality users, and many features are not required. The smaller the model the better, so that multiple models can be used to examine the data at the final identification stage.
User behavior features are behavioral features found through offline analysis. They generally do not duplicate other online rules and can be understood as features that back-office auditors cannot find. Examples are user behavior data for the same cookie, including: the number of posts across cities, the number of mobile phone numbers used, time intervals and mouse click behavior, the time interval from user registration to posting, and the user's login behavior and browsing behavior; and statistical counts for each dimension of the user, such as N-day counts of cross-city IPs and cookie counts.
Preferably, the system extracts different features for the training data of different service types. For example, the features used by the used-car business line may comprise only: the time from registration to posting, the time the user takes to fill in the post, the user's mouse track, 30-day IP-related counts, and so on. The second-hand goods business line may comprise only the following features: the user's registration and login times, and the number of pages browsed from the user's IP in the preceding N days.
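As a minimal sketch of per-business-line feature selection, the mapping below lists one feature set per line and builds a vector in a fixed order. The feature keys (register_to_post_seconds and so on) and the helper to_vector are hypothetical names chosen for illustration; the text names the features only in prose.

```python
# Hypothetical feature names per business line, paraphrasing the examples above.
FEATURES_BY_LINE = {
    "used_car": [
        "register_to_post_seconds",
        "form_fill_seconds",
        "mouse_track_length",
        "ip_count_30d",
    ],
    "second_hand": [
        "register_to_login_seconds",
        "ip_page_views_last_n_days",
    ],
}


def to_vector(record, business_line):
    """Return the record's features in the fixed order defined for its line.

    Missing features default to 0 so every vector for a line has equal length.
    """
    return [record.get(name, 0) for name in FEATURES_BY_LINE[business_line]]


post = {"register_to_post_seconds": 42, "form_fill_seconds": 7, "ip_count_30d": 130}
vector = to_vector(post, "used_car")
```

Keeping the feature list per line in one place means each line can have its own small model, which matches the stated preference for small models.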
More preferably, the features extracted for each business line are selected by computing information entropy and by cross-validating the model on data.
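One common way to use information entropy for feature selection is to rank features by information gain, the drop in label entropy after splitting on a feature. The sketch below assumes that reading; the text does not say exactly how entropy is applied, and the feature names and toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted entropy after splitting on the feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: label 1 = low-quality user, 0 = good user.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "posts_soon_after_register": [1, 1, 1, 0, 0, 0],  # separates the classes
    "uses_mobile_number": [1, 0, 1, 0, 1, 0],         # carries little signal
}
gains = {name: information_gain(values, labels) for name, values in features.items()}
ranked = sorted(gains, key=gains.get, reverse=True)
```

Features near the top of the ranking would then be confirmed by cross-validating a model trained with and without them.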
The model built by the method according to the invention selects only a small number of feature dimensions. Each dimension the model uses was found by data analysis to classify well, so the data in each dimension are not sparse. The method of the invention overcomes the shortcoming of the prior art in which too many dimensions make computation too complex: the prior art usually distinguishes users by segmenting text into words and treating each word as a dimension, which produces a very large number of dimensions, requires too many training samples, makes computation complex, and places excessive demands on the accuracy of the sample data.
Step 130: vectorize the sample training data according to the user behavior features selected in step 120. The vectorized training data can be saved to a file. Each component of a training vector is one selected feature. The vectorization of the training data is illustrated below, taking as an example the dimension "time the user takes to fill in a post":
1. Obtain the fill-in time of each post in the sample data.
2. Clean the data; the purpose is to bring outliers back into range.
3. Discretize the continuous-valued attribute: run K-means clustering 100 times and take the cluster centers with the smallest error as the discretization cut points.
4. Apply a final correction to the data.
5. Vectorization is complete.
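The cleaning and discretization steps above can be sketched in plain Python as follows. This is an illustration under assumptions, not the patented implementation: clipping to fixed bounds stands in for the unspecified outlier handling, squared error is assumed as the error measure, and the bin count k, the bounds and the sample times are hypothetical.

```python
import random


def clip(values, low, high):
    """Data cleaning: pull out-of-range values back to the nearest bound."""
    return [min(max(v, low), high) for v in values]


def kmeans_1d(values, k, rng, max_iter=50):
    """One k-means run on 1-D data; returns (sorted centers, squared error)."""
    centers = rng.sample(values, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        updated = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    error = sum(min((v - c) ** 2 for c in centers) for v in values)
    return sorted(centers), error


def best_centers(values, k, runs=100, seed=0):
    """Repeat k-means with random starts and keep the lowest-error centers."""
    rng = random.Random(seed)
    results = [kmeans_1d(values, k, rng) for _ in range(runs)]
    return min(results, key=lambda result: result[1])[0]


def discretize(value, centers):
    """Map a continuous value to the index of its nearest center."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


# Hypothetical fill-in times in seconds; -5 is an invalid reading.
raw_times = [-5, 3, 4, 5, 60, 62, 65, 300, 305, 310]
times = clip(raw_times, 0, 3600)
centers = best_centers(times, k=3)
bins = [discretize(t, centers) for t in times]
```

Repeating the clustering matters because a single k-means run depends on its random starting centers; keeping the lowest-error run makes the bins more stable between training sessions.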
Step 140: produce a prediction model from the vectorized sample training data. When training the model, a probabilistic classifier is preferably used for decision-making. The probabilistic classifier is used to compute the probability score of user behavior data.
A probabilistic classifier is used for the following reason: the final purpose of the model is to identify and delete abnormal behavior information, so the model needs very high accuracy; because classifiers such as neural networks or decision trees can wrongly flag legitimate users, a probabilistic classifier is used for decision-making.
The model is preferably a Bayesian network. A Bayesian network is a mathematical model based on probabilistic inference; it generalizes well and combines a robust layered logical structure with probabilistic output, so it is well suited to behavior identification.
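To show what a probabilistic classifier outputs, the sketch below implements naive Bayes, the simplest special case of a Bayesian network (every feature depends only on the class). It is a stand-in chosen for brevity, not the Bayesian network model the text prefers, and the class name, the Laplace smoothing and the toy data are assumptions.

```python
from collections import Counter, defaultdict


class NaiveBayes:
    """Naive Bayes over discrete features with Laplace smoothing."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (label, feature index) -> value counts
        self.feature_values = defaultdict(set)    # feature index -> values seen
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[(label, i)][value] += 1
                self.feature_values[i].add(value)
        return self

    def predict_proba(self, row, positive=1):
        """Posterior probability that the row belongs to the positive class."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = count / total  # class prior
            for i, value in enumerate(row):
                seen = self.value_counts[(label, i)][value]
                # Add-one smoothing keeps unseen values from zeroing the product.
                score *= (seen + 1) / (count + len(self.feature_values[i]))
            scores[label] = score
        return scores[positive] / sum(scores.values())


# Hypothetical discretized vectors: (fill-in time bin, cross-city flag).
rows = [(0, 1), (0, 1), (0, 0), (2, 0), (2, 0), (1, 0)]
labels = [1, 1, 1, 0, 0, 0]  # 1 = low-quality behavior, 0 = good behavior
model = NaiveBayes().fit(rows, labels)
suspicious = model.predict_proba((0, 1))
ordinary = model.predict_proba((2, 0))
```

The output is a probability rather than a hard label, which is what allows a threshold to be tuned for very few false positives.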
Preferably, the open-source program WEKA is used to train the model. WEKA (Waikato Environment for Knowledge Analysis) is an open data-mining workbench that brings together a large number of machine learning algorithms for data-mining tasks, including data preprocessing, classification, regression, clustering, association rules, and visualization in an interactive interface.
Step 150: detect online data with the multiple models produced, based on classification and clustering rules, and process low-quality users. When the abnormal user behavior identification model examines online user behavior data, it computes a probability score for every user behavior record. The probability score is the probability that the record is bad data; the higher the score, the more likely the record is bad data. The probability score of a user behavior record is generated as follows: each model examines one of several feature groups of the record (groups of features in different dimensions) and produces a probability value for the record's quality tendency (hereinafter a sub-probability score); finally a sum-and-product transformation is applied to the sub-probability scores to obtain the probability score of the record.
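The text does not define the transformation that merges the per-model scores. One plausible reading, shown here purely as an assumption, is the usual product rule for independent probability estimates: multiply the scores, multiply their complements, and normalize by the sum of the two products. The function name combine and the clipping constant are hypothetical.

```python
import math


def combine(sub_scores, eps=1e-6):
    """Fuse independent sub-scores: prod(p) / (prod(p) + prod(1 - p))."""
    # Clip away exact 0 and 1 so the denominator can never be zero.
    clipped = [min(max(p, eps), 1 - eps) for p in sub_scores]
    bad = math.prod(clipped)
    good = math.prod(1 - p for p in clipped)
    return bad / (bad + good)


agreeing = combine([0.9, 0.8])     # both models lean towards bad data
conflicting = combine([0.9, 0.2])  # the models disagree
neutral = combine([0.5, 0.5])      # neither model has an opinion
```

Under this rule, models that agree reinforce each other and a model that outputs 0.5 leaves the result unchanged; other combination rules would behave differently.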
There are two methods for detecting abnormal user behavior:
User anomaly detection based on classification rules. A probability line (that is, a probability threshold) is set for judging whether user behavior data are bad data. If the probability score of a user behavior record exceeds the threshold, the record is judged to be bad data and the user is judged to be a low-quality user. Otherwise the record is judged to be normal data and the user a good user. The probability line is obtained by manual verification.
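A minimal sketch of the threshold rule follows. The classify function mirrors the rule as stated; pick_threshold is only one assumed way to turn manually verified samples into a probability line, namely the lowest threshold whose flagged records reach a target precision. The names and sample values are hypothetical.

```python
def classify(score, threshold):
    """Apply the probability line: scores above it count as bad data."""
    return "low-quality" if score > threshold else "good"


def pick_threshold(verified, min_precision):
    """Lowest threshold whose flagged records reach the target precision.

    verified: list of (probability score, manually confirmed as bad) pairs.
    Returns None when no threshold reaches the target.
    """
    for threshold in sorted({score for score, _ in verified}):
        flagged = [is_bad for score, is_bad in verified if score > threshold]
        if flagged and sum(flagged) / len(flagged) >= min_precision:
            return threshold
    return None


# Hypothetical manually reviewed records.
verified = [
    (0.20, False), (0.40, False), (0.60, True),
    (0.70, False), (0.90, True), (0.95, True),
]
line = pick_threshold(verified, min_precision=1.0)
```

Choosing the line for high precision reflects the stated goal of avoiding wrongly deleting information from good users, at the cost of missing some bad records below the line.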
User anomaly detection based on clustering rules. The specific steps are as follows:
Step a: monitor the probability scores for clustering.
Step b: for probability scores that cluster, hand a certain number of the user behavior records to operations staff, who judge whether the user behavior clustered at the same probability score is low-quality behavior, and check whether the other online user behavior records with that probability score are all bad data. User behavior records that cluster at the same probability score represent the same or similar behavior, and may differ in only a few feature dimensions. The preferred check is whether the user behavior records with that probability score have all been handled by other rules. Here, other rules are ways of identifying bad data outside the invention, for example: the text a user publishes contains prohibited words, or the pictures a user publishes contain prohibited content.
Step c: according to the operations staff's findings, the abnormal user behavior discrimination model updates the probability score of that class of user behavior. That is, if the behavior is found to be low-quality user behavior, the user behavior record is judged to be bad data and its probability score is raised to a higher value, for example 0.999.
Step d: add the newly found bad data to the sample store as training data. That is, user behavior records judged to be bad data are added to the sample store as training data for the next model, providing new training data for updating the model.
Step e: train the model with the new training data.
Preferably, in step e, user behavior data whose probability score is inaccurate are analyzed offline to discover new user behavior features and select suitable features. The newly generated model is cross-validated to judge whether it performs better.
With the clustering-rule-based detection of online data described above, user behavior data can be detected even when the sample data are not accurate enough. Steps a to e also allow the model to be updated in a semi-supervised fashion. This mechanism avoids the problem of the model computing inaccurate probability scores because the training samples contain noise, so low-quality user behavior can be identified well even when the sample data are inaccurate.
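Steps a to d can be sketched as the small feedback loop below. Grouping records whose rounded scores are equal is an assumed stand-in for the cluster monitoring, the operator's verdict is passed in as a plain boolean, and the record layout, function names and minimum cluster size are hypothetical.

```python
from collections import defaultdict


def find_score_clusters(records, min_size, digits=3):
    """Step a: group records by rounded score and keep the large groups."""
    groups = defaultdict(list)
    for record in records:
        groups[round(record["score"], digits)].append(record)
    return {score: group for score, group in groups.items() if len(group) >= min_size}


def apply_review(cluster, is_bad, sample_store, boosted_score=0.999):
    """Steps c and d: raise the scores and store the records as negative samples."""
    if not is_bad:
        return
    for record in cluster:
        record["score"] = boosted_score
        sample_store.append({"features": record["features"], "label": 1})


# Hypothetical online records; three of them share the same score.
records = [
    {"features": (0, 1), "score": 0.612},
    {"features": (0, 1), "score": 0.612},
    {"features": (0, 1), "score": 0.612},
    {"features": (2, 0), "score": 0.140},
    {"features": (1, 0), "score": 0.305},
]
sample_store = []
clusters = find_score_clusters(records, min_size=3)
for cluster in clusters.values():
    # Step b: in practice operations staff review a sample of the cluster;
    # here the verdict is hard-coded for illustration.
    apply_review(cluster, is_bad=True, sample_store=sample_store)
```

Step e would then retrain the model on the enlarged sample store, which is what makes the loop semi-supervised: only the sampled clusters need human labels.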
Step 160: process the abnormal user data. Once a user behavior is determined to be abnormal, the system can act on the low-quality user, for example by deleting the information the user has published on the website.
With the machine-learning-based fraud identification system for a classified information website according to the present invention, user behavior can be identified in multiple dimensions, thereby efficiently reducing the amount of false transaction information and improving the authenticity of transaction information. Moreover, low-quality user behavior can be identified well even when the training data contain noise.
Other embodiments of the invention will be readily apparent to those skilled in the art from the description and practice of the invention disclosed herein. The description and embodiments are to be regarded as exemplary only; the true scope and spirit of the invention are defined by the claims.

Claims (10)

1. A method for a machine-learning-based fraud identification system in a classified information website, the method comprising the following steps:
a) extracting sample data from existing user behavior data, for generating a model for the first time;
b) selecting and extracting multiple user behavior features for the training data of different service types;
c) vectorizing the sample training data based on the extracted user behavior features;
d) producing a prediction model from the vectorized sample training data;
e) detecting online data with the produced model, based on classification and clustering rules;
f) processing the abnormal user data obtained by the detection.
2. The method of claim 1, wherein the sample data in step a comprise positive sample data and negative sample data, corresponding respectively to users with good behavior and users with low-quality behavior.
3. The method of claim 1, wherein in step b the user behavior features comprise user behavior data for the same cookie and statistical counts for each dimension of the user.
4. The method of claim 1, wherein in step b the user features extracted for different service types are selected by computing information entropy and by cross-validating the model on data.
5. The method of claim 1, wherein in step d a probabilistic classifier is used for decision-making.
6. The method of claim 1, wherein in step e the model is used to calculate a probability score representing the probability that the user behavior data are abnormal.
7. The method of claim 6, wherein the probability score is calculated as follows: multiple models each examine one of several feature groups of the user behavior data and each produce a sub-probability score; a sum-and-product transformation is then applied to the sub-probability scores to obtain the probability score of the user behavior data.
8. The method of claim 1, wherein the user anomaly detection method based on classification rules in step e comprises setting a probability line for judging whether user behavior data are bad data.
9. The method of claim 1, wherein the user anomaly detection method based on clustering rules in step e comprises:
e1) monitoring the probability scores for clustering;
e2) for probability scores that cluster, inspecting a certain number of the user behaviors, to judge whether the user behaviors clustered at the same probability score are low-quality user behaviors;
e3) according to the inspection result, updating, by the abnormal user behavior discrimination model, the probability score of that class of user behavior;
e4) adding the newly found bad data to the sample store as training data;
e5) training the model with the new training data.
10. The method of claim 1, wherein in step e5 user behavior data whose probability score is inaccurate are analyzed offline to discover new user behavior features and select suitable features.
CN201410022138.1A 2014-01-17 2014-01-17 The fraud identifying system based on machine learning in classification information website Active CN103793484B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410022138.1A CN103793484B (en) 2014-01-17 2014-01-17 The fraud identifying system based on machine learning in classification information website

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410022138.1A CN103793484B (en) 2014-01-17 2014-01-17 The fraud identifying system based on machine learning in classification information website

Publications (2)

Publication Number Publication Date
CN103793484A (en) 2014-05-14
CN103793484B (en) 2017-03-15

Family

ID=50669150

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410022138.1A Active CN103793484B (en) 2014-01-17 2014-01-17 The fraud identifying system based on machine learning in classification information website

Country Status (1)

Country Link
CN (1) CN103793484B (en)

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361010A (en) * 2014-10-11 2015-02-18 北京中搜网络技术股份有限公司 Automatic classification method for correcting news classification
CN104463668A (en) * 2014-10-24 2015-03-25 南京邦科威信息科技有限公司 Online credit checking method and device
CN104636912A (en) * 2015-02-13 2015-05-20 银联智惠信息服务(上海)有限公司 Identification method and device for withdrawal of credit cards
CN105306495A (en) * 2015-11-30 2016-02-03 百度在线网络技术(北京)有限公司 User identification method and device
CN105447755A (en) * 2014-09-01 2016-03-30 阿里巴巴集团控股有限公司 Transaction control method and apparatus
CN105703966A (en) * 2014-11-27 2016-06-22 阿里巴巴集团控股有限公司 Internet behavior risk identification method and apparatus
CN105760950A (en) * 2016-02-05 2016-07-13 北京物思创想科技有限公司 Method for providing or obtaining prediction result and device thereof and prediction system
CN105808639A (en) * 2016-02-24 2016-07-27 平安科技(深圳)有限公司 Network access behavior recognizing method and device
CN105844501A (en) * 2016-05-18 2016-08-10 上海亿保健康管理有限公司 Consumption behavior risk control system and method
CN106096657A (en) * 2016-06-13 2016-11-09 北京物思创想科技有限公司 The method and system of prediction data examination & verification target are carried out based on machine learning
CN106230849A (en) * 2016-08-22 2016-12-14 中国科学院信息工程研究所 A kind of smart machine machine learning safety monitoring system based on user behavior
CN106228410A (en) * 2016-07-29 2016-12-14 武汉斗鱼网络科技有限公司 Virtual present task anti-brush system and method in a kind of live platform
CN106296406A (en) * 2015-05-13 2017-01-04 阿里巴巴集团控股有限公司 The processing method and processing device of interaction data
CN106294508A (en) * 2015-06-10 2017-01-04 深圳市腾讯计算机系统有限公司 A kind of brush amount tool detection method and device
CN106408411A (en) * 2016-08-31 2017-02-15 北京城市网邻信息技术有限公司 Credit assessment method and device
CN106682067A (en) * 2016-11-08 2017-05-17 浙江邦盛科技有限公司 Machine learning anti-fraud monitoring system based on transaction data
CN106682985A (en) * 2016-12-26 2017-05-17 深圳先进技术研究院 Financial fraud identification method and system thereof
CN107093076A (en) * 2016-02-18 2017-08-25 卡巴斯基实验室股份制公司 The system and method for detecting fraudulent user transaction
CN107169768A (en) * 2016-03-07 2017-09-15 阿里巴巴集团控股有限公司 The acquisition methods and device of abnormal transaction data
CN107335220A (en) * 2017-06-06 2017-11-10 广州华多网络科技有限公司 A kind of recognition methods of passive user, device and server
CN107465648A (en) * 2016-06-06 2017-12-12 腾讯科技(深圳)有限公司 The recognition methods of warping apparatus and device
WO2017215370A1 (en) * 2016-06-14 2017-12-21 平安科技(深圳)有限公司 Method and apparatus for constructing decision model, computer device and storage device
CN107730717A (en) * 2017-10-31 2018-02-23 华中科技大学 A kind of suspicious card identification method of public transport of feature based extraction
WO2018072580A1 (en) * 2016-10-21 2018-04-26 中国银联股份有限公司 Method for detecting illegal transaction and apparatus
CN108108743A (en) * 2016-11-24 2018-06-01 百度在线网络技术(北京)有限公司 Abnormal user recognition methods and the device for identifying abnormal user
CN108512682A (en) * 2017-02-28 2018-09-07 腾讯科技(深圳)有限公司 A kind of method and apparatus of determining false terminal iidentification
CN108550052A (en) * 2018-04-03 2018-09-18 杭州呯嘭智能技术有限公司 Brush list detection method and system based on user behavior data feature
CN108875761A (en) * 2017-05-11 2018-11-23 华为技术有限公司 A kind of method and device for expanding potential user
CN109325779A (en) * 2018-08-20 2019-02-12 北京数美时代科技有限公司 Read-write portrait method and system for anti-fraud, and portrait processing system
CN109389494A (en) * 2018-10-25 2019-02-26 北京芯盾时代科技有限公司 Loan fraud detection model training method, loan fraud detection method and device
CN109416700A (en) * 2017-09-30 2019-03-01 深圳市得道健康管理有限公司 Internet behavior classification training method and network terminal
CN109714636A (en) * 2018-12-21 2019-05-03 武汉瓯越网视有限公司 User identification method, device, equipment and medium
CN109804370A (en) * 2017-09-30 2019-05-24 深圳市得道健康管理有限公司 Internet behavior evaluation method and network terminal
CN109829713A (en) * 2019-01-28 2019-05-31 重庆邮电大学 Mobile payment mode identification method based on common drive of knowledge and data
CN109886284A (en) * 2018-12-12 2019-06-14 同济大学 Fraud detection method and system based on hierarchical clustering
CN110097066A (en) * 2018-01-31 2019-08-06 阿里巴巴集团控股有限公司 A kind of user classification method, device and electronic equipment
CN110557447A (en) * 2019-08-26 2019-12-10 腾讯科技(武汉)有限公司 User behavior identification method and device, storage medium and server
CN110703723A (en) * 2018-07-09 2020-01-17 佳能株式会社 System, method, and non-transitory computer-readable storage medium
CN110796153A (en) * 2018-08-01 2020-02-14 阿里巴巴集团控股有限公司 Training sample processing method and device
CN111259985A (en) * 2020-02-19 2020-06-09 腾讯科技(深圳)有限公司 Classification model training method and device based on business safety and storage medium
CN111476258A (en) * 2019-01-24 2020-07-31 杭州海康威视数字技术股份有限公司 Feature extraction method and device based on attention mechanism and electronic equipment
CN111639681A (en) * 2020-05-09 2020-09-08 同济大学 Early warning method, system, medium and device based on education drive type fraud
CN111914645A (en) * 2020-06-30 2020-11-10 五八有限公司 Method and device for identifying false information, electronic equipment and storage medium
CN112533208A (en) * 2019-08-27 2021-03-19 中国移动通信有限公司研究院 Model training method, false terminal identification method and device, and electronic device
CN112818142A (en) * 2021-01-29 2021-05-18 北京达佳互联信息技术有限公司 Account behavior information processing method and device, electronic equipment and storage medium
CN112887325A (en) * 2021-02-19 2021-06-01 浙江警察学院 Telecommunication network fraud crime fraud identification method based on network flow
CN113506084A (en) * 2021-06-23 2021-10-15 上海师范大学 False recruitment position detection method based on deep learning

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1649311A (en) * 2005-03-23 2005-08-03 北京首信科技有限公司 Detecting system and method for user behaviour abnormal based on machine study
US20100094768A1 (en) * 2008-06-12 2010-04-15 Tom Miltonberger Fraud Detection and Analysis System
CN102176698A (en) * 2010-12-20 2011-09-07 北京邮电大学 Method for detecting abnormal behaviors of user based on transfer learning
CN102238045A (en) * 2010-04-27 2011-11-09 广州迈联计算机科技有限公司 System and method for predicting user behavior in wireless Internet
CN102402517A (en) * 2010-09-09 2012-04-04 北京启明星辰信息技术股份有限公司 Method and system for establishing normal database login model and method and system for detecting abnormal login behavior

Cited By (75)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105447755A (en) * 2014-09-01 2016-03-30 阿里巴巴集团控股有限公司 Transaction control method and apparatus
CN105447755B (en) * 2014-09-01 2021-12-17 创新先进技术有限公司 Transaction control method and device
CN104361010A (en) * 2014-10-11 2015-02-18 北京中搜网络技术股份有限公司 Automatic classification method for correcting news classification
CN104463668A (en) * 2014-10-24 2015-03-25 南京邦科威信息科技有限公司 Online credit checking method and device
CN105703966A (en) * 2014-11-27 2016-06-22 阿里巴巴集团控股有限公司 Internet behavior risk identification method and apparatus
CN104636912A (en) * 2015-02-13 2015-05-20 银联智惠信息服务(上海)有限公司 Identification method and device for withdrawal of credit cards
US10956847B2 (en) 2015-05-13 2021-03-23 Advanced New Technologies Co., Ltd. Risk identification based on historical behavioral data
CN106296406A (en) * 2015-05-13 2017-01-04 阿里巴巴集团控股有限公司 Method and device for processing interaction data
CN106294508B (en) * 2015-06-10 2020-02-11 深圳市腾讯计算机系统有限公司 Brushing amount tool detection method and device
CN106294508A (en) * 2015-06-10 2017-01-04 深圳市腾讯计算机系统有限公司 Brushing amount tool detection method and device
CN105306495B (en) * 2015-11-30 2018-06-19 百度在线网络技术(北京)有限公司 User identification method and device
CN105306495A (en) * 2015-11-30 2016-02-03 百度在线网络技术(北京)有限公司 User identification method and device
CN105760950B (en) * 2016-02-05 2018-09-11 第四范式(北京)技术有限公司 Method for providing or obtaining prediction result and device thereof and prediction system
CN105760950A (en) * 2016-02-05 2016-07-13 北京物思创想科技有限公司 Method for providing or obtaining prediction result and device thereof and prediction system
CN109146151A (en) * 2016-02-05 2019-01-04 第四范式(北京)技术有限公司 Method for providing or obtaining prediction result and device thereof and prediction system
CN107093076A (en) * 2016-02-18 2017-08-25 卡巴斯基实验室股份制公司 System and method for detecting fraudulent user transactions
CN105808639B (en) * 2016-02-24 2021-02-09 平安科技(深圳)有限公司 Network access behavior identification method and device
CN105808639A (en) * 2016-02-24 2016-07-27 平安科技(深圳)有限公司 Network access behavior recognizing method and device
CN107169768B (en) * 2016-03-07 2021-07-27 阿里巴巴集团控股有限公司 Method and device for acquiring abnormal transaction data
CN107169768A (en) * 2016-03-07 2017-09-15 阿里巴巴集团控股有限公司 Method and device for acquiring abnormal transaction data
CN105844501A (en) * 2016-05-18 2016-08-10 上海亿保健康管理有限公司 Consumption behavior risk control system and method
CN107465648A (en) * 2016-06-06 2017-12-12 腾讯科技(深圳)有限公司 Abnormal equipment identification method and device
CN107465648B (en) * 2016-06-06 2020-09-04 腾讯科技(深圳)有限公司 Abnormal equipment identification method and device
CN106096657B (en) * 2016-06-13 2019-04-30 第四范式(北京)技术有限公司 Method and system for predicting data audit target based on machine learning
CN106096657A (en) * 2016-06-13 2016-11-09 北京物思创想科技有限公司 Method and system for predicting data audit target based on machine learning
WO2017215370A1 (en) * 2016-06-14 2017-12-21 平安科技(深圳)有限公司 Method and apparatus for constructing decision model, computer device and storage device
JP2018522343A (en) * 2016-06-14 2018-08-09 平安科技(深▲せん▼)有限公司 Method, computer device and storage device for building a decision model
CN106228410A (en) * 2016-07-29 2016-12-14 武汉斗鱼网络科技有限公司 Anti-brushing system and method for virtual gift tasks in a live streaming platform
CN106230849B (en) * 2016-08-22 2019-04-19 中国科学院信息工程研究所 Smart device machine learning safety monitoring system based on user behavior
CN106230849A (en) * 2016-08-22 2016-12-14 中国科学院信息工程研究所 Smart device machine learning safety monitoring system based on user behavior
CN106408411A (en) * 2016-08-31 2017-02-15 北京城市网邻信息技术有限公司 Credit assessment method and device
WO2018072580A1 (en) * 2016-10-21 2018-04-26 中国银联股份有限公司 Method for detecting illegal transaction and apparatus
CN106682067A (en) * 2016-11-08 2017-05-17 浙江邦盛科技有限公司 Machine learning anti-fraud monitoring system based on transaction data
CN106682067B (en) * 2016-11-08 2018-05-01 浙江邦盛科技有限公司 Machine learning anti-fraud monitoring system based on transaction data
CN108108743A (en) * 2016-11-24 2018-06-01 百度在线网络技术(北京)有限公司 Abnormal user identification method and device for identifying abnormal users
CN106682985A (en) * 2016-12-26 2017-05-17 深圳先进技术研究院 Financial fraud identification method and system thereof
CN106682985B (en) * 2016-12-26 2020-03-27 深圳先进技术研究院 Financial fraud identification method and system
CN108512682A (en) * 2017-02-28 2018-09-07 腾讯科技(深圳)有限公司 Method and device for determining false terminal identification
CN108512682B (en) * 2017-02-28 2021-02-26 腾讯科技(深圳)有限公司 Method and device for determining false terminal identification
CN108875761A (en) * 2017-05-11 2018-11-23 华为技术有限公司 Method and device for expanding potential users
CN107335220A (en) * 2017-06-06 2017-11-10 广州华多网络科技有限公司 Negative user identification method and device and server
CN107335220B (en) * 2017-06-06 2021-01-26 广州华多网络科技有限公司 Negative user identification method and device and server
CN109416700A (en) * 2017-09-30 2019-03-01 深圳市得道健康管理有限公司 Internet behavior classification training method and network terminal
WO2019061377A1 (en) * 2017-09-30 2019-04-04 深圳市得道健康管理有限公司 Internet behavior classification training method and network terminal
CN109804370A (en) * 2017-09-30 2019-05-24 深圳市得道健康管理有限公司 Internet behavior evaluation method and network terminal
CN107730717B (en) * 2017-10-31 2019-08-30 华中科技大学 Suspicious public transport card identification method based on feature extraction
CN107730717A (en) * 2017-10-31 2018-02-23 华中科技大学 Suspicious public transport card identification method based on feature extraction
CN110097066A (en) * 2018-01-31 2019-08-06 阿里巴巴集团控股有限公司 User classification method and device and electronic equipment
CN110097066B (en) * 2018-01-31 2024-01-05 阿里巴巴集团控股有限公司 User classification method and device and electronic equipment
CN108550052A (en) * 2018-04-03 2018-09-18 杭州呯嘭智能技术有限公司 Brush list detection method and system based on user behavior data feature
CN110703723B (en) * 2018-07-09 2023-02-24 佳能株式会社 System, method, and non-transitory computer-readable storage medium
CN110703723A (en) * 2018-07-09 2020-01-17 佳能株式会社 System, method, and non-transitory computer-readable storage medium
CN110796153B (en) * 2018-08-01 2023-06-20 阿里巴巴集团控股有限公司 Training sample processing method and device
CN110796153A (en) * 2018-08-01 2020-02-14 阿里巴巴集团控股有限公司 Training sample processing method and device
CN109325779A (en) * 2018-08-20 2019-02-12 北京数美时代科技有限公司 Read-write portrait method and system for anti-fraud, and portrait processing system
CN109389494B (en) * 2018-10-25 2021-11-05 北京芯盾时代科技有限公司 Loan fraud detection model training method, loan fraud detection method and device
CN109389494A (en) * 2018-10-25 2019-02-26 北京芯盾时代科技有限公司 Loan fraud detection model training method, loan fraud detection method and device
CN109886284A (en) * 2018-12-12 2019-06-14 同济大学 Fraud detection method and system based on hierarchical clustering
CN109714636A (en) * 2018-12-21 2019-05-03 武汉瓯越网视有限公司 User identification method, device, equipment and medium
CN109714636B (en) * 2018-12-21 2021-04-23 武汉瓯越网视有限公司 User identification method, device, equipment and medium
CN111476258A (en) * 2019-01-24 2020-07-31 杭州海康威视数字技术股份有限公司 Feature extraction method and device based on attention mechanism and electronic equipment
CN111476258B (en) * 2019-01-24 2024-01-05 杭州海康威视数字技术股份有限公司 Feature extraction method and device based on attention mechanism and electronic equipment
CN109829713A (en) * 2019-01-28 2019-05-31 重庆邮电大学 Mobile payment mode identification method based on common drive of knowledge and data
CN109829713B (en) * 2019-01-28 2020-09-15 重庆邮电大学 Mobile payment mode identification method based on common drive of knowledge and data
CN110557447B (en) * 2019-08-26 2022-06-10 腾讯科技(武汉)有限公司 User behavior identification method and device, storage medium and server
CN110557447A (en) * 2019-08-26 2019-12-10 腾讯科技(武汉)有限公司 User behavior identification method and device, storage medium and server
CN112533208A (en) * 2019-08-27 2021-03-19 中国移动通信有限公司研究院 Model training method, false terminal identification method and device, and electronic device
CN111259985A (en) * 2020-02-19 2020-06-09 腾讯科技(深圳)有限公司 Classification model training method and device based on business safety and storage medium
CN111639681A (en) * 2020-05-09 2020-09-08 同济大学 Early warning method, system, medium and device based on education drive type fraud
CN111914645A (en) * 2020-06-30 2020-11-10 五八有限公司 Method and device for identifying false information, electronic equipment and storage medium
CN112818142B (en) * 2021-01-29 2023-12-08 北京达佳互联信息技术有限公司 Account behavior information processing method and device, electronic equipment and storage medium
CN112818142A (en) * 2021-01-29 2021-05-18 北京达佳互联信息技术有限公司 Account behavior information processing method and device, electronic equipment and storage medium
CN112887325B (en) * 2021-02-19 2022-04-01 浙江警察学院 Telecommunication network fraud crime fraud identification method based on network flow
CN112887325A (en) * 2021-02-19 2021-06-01 浙江警察学院 Telecommunication network fraud crime fraud identification method based on network flow
CN113506084A (en) * 2021-06-23 2021-10-15 上海师范大学 False recruitment position detection method based on deep learning

Also Published As

Publication number Publication date
CN103793484B (en) 2017-03-15

Similar Documents

Publication Publication Date Title
CN103793484A (en) Fraudulent conduct identification system based on machine learning in classified information website
US11025735B2 (en) Trend detection in a messaging platform
Artzi et al. Predicting responses to microblog posts
Mendon et al. A hybrid approach of machine learning and lexicons to sentiment analysis: Enhanced insights from twitter data of natural disasters
US10484413B2 (en) System and a method for detecting anomalous activities in a blockchain network
Chen et al. Non-parametric scan statistics for event detection and forecasting in heterogeneous social media graphs
US20230289665A1 (en) Failure feedback system for enhancing machine learning accuracy by synthetic data generation
Lin et al. Voices of victory: A computational focus group framework for tracking opinion shift in real time
TW201443811A (en) Social media impact assessment (1)
CN111614690A (en) Abnormal behavior detection method and device
WO2017013529A1 (en) System and method for determining credit worthiness of a user
CN102365637A (en) Characterizing user information
CN112329816A (en) Data classification method and device, electronic equipment and readable storage medium
KR102407057B1 (en) Systems and methods for analyzing the public data of SNS user channel and providing influence report
CN108021651A (en) Network public opinion risk assessment method and device
Nilizadeh et al. Think outside the dataset: Finding fraudulent reviews using cross-dataset analysis
CN106294406B (en) Method and equipment for processing application access data
Kwon et al. User profiling via application usage pattern on digital devices for digital forensics
CN109933648B (en) Real user comment distinguishing method and device
KR20210148573A (en) Systems and methods for gathering public data of SNS user channel and providing influence reports based on the collected public data
Jenkins Clickgraph: Web page embedding using clickstream data for multitask learning
Kim et al. Crowdsourced promotions in doubt: Analyzing effective crowdsourced promotions
Zhao et al. Detecting fake reviews via dynamic multimode network
CN112541669A (en) Risk identification method, system and device
Kanchana et al. Stress detection using classification algorithm

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant