CN103793484B - The fraud identifying system based on machine learning in classification information website - Google Patents

Info

Publication number
CN103793484B
Authority
CN
China
Prior art keywords
data
user
user behavior
behavior
model
Legal status
Active
Application number
CN201410022138.1A
Other languages
Chinese (zh)
Other versions
CN103793484A (en)
Inventor
张鹏
张爱华
张美琦
张朝阳
孙亚健
Current Assignee
Beijing 58 Information Technology Co Ltd
Original Assignee
Beijing 58 Information Technology Co Ltd
Application filed by Beijing 58 Information Technology Co Ltd
Priority to CN201410022138.1A
Publication of CN103793484A
Application granted
Publication of CN103793484B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method for a machine-learning-based fraud identification system for classified-information websites. The method comprises the following steps: a) extracting sample data from existing user behavior data to generate an initial model; b) selecting and extracting multiple user behavior features from the training data for each business type; c) vectorizing the sample training data based on the extracted user behavior features; d) producing a prediction model from the vectorized sample training data; e) using the produced model to detect online data based on classification and clustering rules; f) processing the abnormal user data obtained by detection. The invention identifies user behavior along multiple dimensions and effectively reduces the amount of false transaction information. It can also identify low-quality user behavior well even when the training data contain noise.

Description

Machine-learning-based fraud identification system for classified-information websites
Technical field
The present invention relates to Internet technology, and in particular to a machine-learning-based fraud identification system for classified-information websites.
Background art
A classified-information website is a newly emerging type of Internet site that carries information on every aspect of daily life. On these sites users can get free and convenient information publishing services, covering second-hand goods trading, used-car sales, housing, pets, recruitment, part-time work, job hunting, social activities, local life services, and so on. Classified information is also known as classified advertising. The advertisements people see every day on television and in newspapers and periodicals are imposed on viewers whether they want them or not; these are passive advertisements. Information that people actively look up, such as recruitment, rental, or travel listings, is called active advertising. As the information society develops, passive advertising increasingly annoys people, while active advertising is widely welcomed. Almost every local evening paper, daily paper, and lifestyle or entertainment paper carries a classified section, and the better the newspaper, the larger its classified section tends to be. Classified-information websites arose from this.
Among the users who publish information on classified-information websites, there are often some low-quality users who defraud other users for profit, for example by publishing false information. Classified-information websites therefore set up processing rules and filtering logic for low-quality information.
Existing means of identifying false information are mainly rule-based, supplemented by some manual intervention. For example, rules count the number of posts from one IP address within a period of time, check whether the content contains prohibited words, or check whether the price range of the published goods or services is unreasonable, in order to judge whether a user is a low-quality user publishing false information. The site then takes measures such as deleting the information, issuing a warning, or closing the account. However, common processing rules and filtering logic generally identify low-quality behavior along a single dimension, so low-quality users can probe for the critical point of a rule by every possible means and thereby bypass the system's processing and filtering of low-quality information.
In addition, as more rules go online, fewer usable rules remain, because rules can only rely on obvious features. Rule-based identification in existing methods can only separate data with a linear decision surface, so most low-quality information is never identified or processed by the system.
What is needed, therefore, is a machine-learning-based fraud identification system for classified-information websites that identifies user behavior along multiple dimensions, thereby effectively reducing the amount of false transaction information and improving the authenticity of transaction information.
Summary of the invention
The object of the invention is to provide a machine-learning-based fraud identification system for classified-information websites.
According to one aspect of the invention, a method is provided for a machine-learning-based fraud identification system for classified-information websites. The method comprises the following steps: a) extracting sample data from existing user behavior data to generate an initial model; b) selecting and extracting multiple user behavior features from the training data for each business type; c) vectorizing the sample training data based on the extracted user behavior features; d) producing a prediction model from the vectorized sample training data; e) using the produced model to detect online data based on classification and clustering rules; f) processing the abnormal user data obtained by detection.
Preferably, the sample data in step a include positive sample data and negative sample data, corresponding to users with high-quality behavior and users with low-quality behavior, respectively.
Preferably, the user behavior features in step b include user behavior data for the same cookie and per-dimension statistical counts for the user.
Preferably, in step b the user features extracted for each business type are selected by computing information entropy and by model cross-validation.
Preferably, a probabilistic classifier is used for decision-making in step d.
Preferably, in step e the model is used to calculate a probability score representing the probability that the user behavior data are abnormal.
Preferably, the probability score is calculated as follows: multiple models each examine one of several feature groups of the user behavior data and each produce a sub-probability score; a sum-of-products transformation is then applied to the sub-probability scores to obtain the probability score of the user behavior data.
Preferably, the classification-rule-based method for detecting abnormal user behavior in step e includes setting a probability line used to judge whether user behavior data are bad data.
Preferably, the clustering-rule-based method for detecting abnormal user behavior in step e includes: e1) monitoring the probability scores for clustering; e2) inspecting a certain number of the user behaviors whose probability scores cluster, to judge whether the user behaviors clustered at the same probability score are low-quality user behaviors; e3) according to the inspection result, having the abnormal-user-behavior identification model update the probability score of such user behaviors; e4) adding the bad data newly found through inspection to the sample library as training data; e5) training the model with the new training data.
Preferably, in step e5, user behavior data whose probability scores are inaccurate are analyzed offline to find new user behavior features and select suitable ones.
The machine-learning-based fraud identification system for classified-information websites according to the invention identifies user behavior along multiple dimensions, thereby effectively reducing the amount of false transaction information and improving the authenticity of transaction information. It can also identify low-quality user behavior well even when the training data contain noise.
Brief description of the drawings
Further objects, functions, and advantages of the invention are illustrated by the following description of embodiments of the invention with reference to the accompanying drawing, in which:
Fig. 1 schematically shows a flow chart of the method of the machine-learning-based fraud identification system for classified-information websites according to the invention.
Detailed description of the embodiments
The objects and functions of the invention, and the methods for achieving them, are explained with reference to exemplary embodiments. The invention is not, however, limited to the exemplary embodiments disclosed below; it can be implemented in different forms. The description serves only to help those skilled in the relevant art gain a comprehensive understanding of the specific details of the invention.
Embodiments of the invention are described below with reference to the drawing. In the drawing, identical reference numerals denote identical or similar parts, or identical or similar steps.
The fraud information identification method of the invention uses data generated from user behavior and can identify information published by users in real time. The machine-learning model adopted identifies user behavior along multiple dimensions, so that users who publish information, especially low-quality information, have difficulty knowing which dimensions are used for identification and therefore cannot evade detection by working around rules. The invention can make predictions from a small number of samples and highly noisy data, with high accuracy. By collecting and modeling users' various behaviors, it identifies abnormal users.
Fig. 1 schematically shows a flow chart of the method of the machine-learning-based fraud identification system for classified-information websites according to the invention. As shown in Fig. 1:
Step 110: extract sample data from existing user behavior data to generate an initial model. The sample data can be drawn from an existing database of audited user behavior and serve mainly to divide users into high-quality users and low-quality users, corresponding to positive and negative sample data respectively. Positive samples are the behavior data of users whose behavior was audited and found to be high-quality. Negative samples are the behavior data of users whose behavior was identified as low-quality in auditing, for example users who violated some of the more serious rules (such as publishing false transaction information). The sample data in the existing audit library form a database built by classifying user behavior with conventional user behavior identification methods, for example detecting whether the text a user publishes contains prohibited words, or whether the pictures a user publishes contain prohibited content.
In model iterations after the initial model has been generated, the method can use the extracted positive and negative sample library directly, without going back to the original audit library.
Step 120: select and extract multiple user behavior features from the training data for each business type. Experimental results determine which user behavior features are used for each business line.
Users usually exhibit a great many behavioral features. To balance computational accuracy and efficiency, the user behavior features of the invention are restricted to those that discriminate between good and bad users. A large number of features is therefore not required, and the smaller the model the better; the aim is to be able to use multiple models to examine the data in the final identification.
User behavior features are behavioral features found through offline analysis. They generally do not duplicate other online rules and can be understood as features that back-office auditors cannot spot. Examples, for user behavior data under the same cookie, include: the number of cross-city posts, the number of mobile phone numbers used, time intervals and click behavior, and the interval from user registration to posting. They also include data in dimensions such as the user's login behavior and browsing behavior, and per-dimension statistical counts such as the IP's cross-city count over N days and cookie counts.
Preferably, the system extracts different features for the training data of different business types. For example, the used-car business line might use only: the time from registration to posting, the time the posting user spent filling in the form, the user's mouse track, 30-day IP-related counts, and so on. The second-hand goods business line might include only: the user's registration and login times, and the number of page views from the user's IP in the preceding N days.
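The per-business-line feature choice described above can be sketched as a simple lookup table. This is only an illustration: the feature names, the business-line keys, and the default value of 0 for missing fields are hypothetical and do not come from the patent text.

```python
# Hypothetical feature lists per business line (names are illustrative only).
FEATURES_BY_BUSINESS_LINE = {
    "used_car": [
        "registration_to_post_seconds",
        "form_fill_seconds",
        "mouse_track_length",
        "ip_post_count_30d",
    ],
    "second_hand": [
        "registration_to_login_seconds",
        "ip_page_views_last_n_days",
    ],
}


def extract_features(record, business_line):
    """Return the feature values for one behavior record, in a fixed order.

    Missing fields default to 0, which is an assumption of this sketch.
    """
    names = FEATURES_BY_BUSINESS_LINE[business_line]
    return [record.get(name, 0) for name in names]


record = {"registration_to_post_seconds": 42, "form_fill_seconds": 7}
used_car_vector = extract_features(record, "used_car")
```

Keeping one short list per business line matches the stated goal of small models with few, discriminative dimensions.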
More preferably, the features extracted for each business line are selected by computing information entropy and by model cross-validation.
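The text does not say how information entropy is applied. One common reading is to rank candidate features by information gain, the reduction in label entropy after splitting on a feature. The sketch below assumes that reading and uses made-up toy data; the cross-validation part is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: 1 = low-quality user, 0 = high-quality user.
labels = [1, 1, 0, 0]
cross_city = ["many", "many", "few", "few"]  # separates the classes perfectly
weekday = ["mon", "tue", "mon", "tue"]       # carries no information

gains = {
    "cross_city": information_gain(cross_city, labels),
    "weekday": information_gain(weekday, labels),
}
ranked = sorted(gains, key=gains.get, reverse=True)
```

Features with a gain near zero, such as `weekday` here, would be dropped before the model is trained.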
The model built by the method selects only a small number of feature dimensions; that is, every dimension the model uses has been found through data analysis to classify well, so the data in each dimension are not sparse. The method overcomes the excessive dimensionality, and the resulting computational complexity, of the prior art. The prior art usually distinguishes users by segmenting text into words, with each word as one dimension, which produces a huge number of dimensions, requires too many training samples, makes computation complex, and places excessive demands on the accuracy of the sample data.
Step 130: vectorize the sample training data according to the user behavior features selected in step 120. The vectorized training data can be saved to a file. Each component of a training vector is one selected feature. The vectorization of training data is described below using one dimension, the time a user takes to fill in the posting form, as an example:
1. Obtain the fill-in time of each post from the sample data.
2. Clean these data; the aim is to bring outliers back into range.
3. Discretize the continuous-valued attribute: run K-means clustering 100 times and use the cluster centers of the run with the smallest error as the discretization cut points.
4. Then apply a final correction to the data.
5. Vectorization is complete.
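Steps 2 and 3 above can be sketched in plain Python as follows. The clipping bounds, the number of bins `k`, the iteration count, and the nearest-center binning rule are assumptions of this sketch; the text only specifies 100 K-means runs and keeping the run with the smallest error.

```python
import random


def clip_outliers(values, low, high):
    """Data cleaning (step 2): pull outliers back into [low, high]."""
    return [min(max(v, low), high) for v in values]


def kmeans_1d(values, k, iterations=20, rng=None):
    """One run of 1-D K-means; returns (sorted centers, squared error)."""
    rng = rng or random.Random()
    centers = rng.sample(values, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center when a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    error = sum(min((v - c) ** 2 for c in centers) for v in values)
    return sorted(centers), error


def best_centers(values, k, restarts=100, seed=0):
    """Step 3: repeat K-means and keep the centers of the lowest-error run."""
    rng = random.Random(seed)
    runs = (kmeans_1d(values, k, rng=rng) for _ in range(restarts))
    return min(runs, key=lambda run: run[1])[0]


def discretize(value, centers):
    """Map a continuous value to the index of its nearest center."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


# Hypothetical form fill-in times in seconds.
fill_times = clip_outliers([2, 3, 4, 60, 62, 64, 300, 305, 310], low=0, high=600)
centers = best_centers(fill_times, k=3)
bins = [discretize(t, centers) for t in fill_times]
```

Repeating the clustering guards against a poor random initialization, which is presumably why the text keeps only the run with the smallest error.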
Step 140: produce a prediction model from the vectorized sample training data. A probabilistic classifier is preferably used for decision-making when training the model. The probabilistic classifier is used to calculate the probability score of user behavior data.
A probabilistic classifier is used because the purpose of the final model is to identify and delete abnormal behavior information, so the model needs very high accuracy. Classifiers such as neural networks or decision trees can produce false positives, that is, wrongly flag legitimate users, so a probabilistic classifier is used for decision-making.
The model is preferably a Bayesian network. A Bayesian network is a mathematical model based on probabilistic inference; it generalizes well and supports clearly layered logic and probabilistic output, so it is well suited to behavior identification.
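As a rough illustration of a probabilistic classifier over discretized features, the sketch below implements naive Bayes, which is the simplest special case of a Bayesian network (every feature depends only on the class). It is not the network structure used in the patent, which is not disclosed, and it is not WEKA's API; the Laplace smoothing and the toy data are assumptions.

```python
from collections import defaultdict


class NaiveBayes:
    """Naive Bayes over discrete features with Laplace smoothing."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = {c: labels.count(c) for c in self.classes}
        n_features = len(rows[0])
        # counts[c][j][value] = number of class-c rows whose feature j equals value
        self.counts = {c: [defaultdict(int) for _ in range(n_features)] for c in self.classes}
        self.values = [set() for _ in range(n_features)]
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                self.counts[label][j][value] += 1
                self.values[j].add(value)
        return self

    def predict_proba(self, row):
        """Return {class: posterior probability} for one feature vector."""
        total = sum(self.class_counts.values())
        scores = {}
        for c in self.classes:
            p = self.class_counts[c] / total  # class prior
            for j, value in enumerate(row):
                # Smoothed estimate of P(feature j = value | class c)
                p *= (self.counts[c][j][value] + 1) / (self.class_counts[c] + len(self.values[j]))
            scores[c] = p
        norm = sum(scores.values())
        return {c: s / norm for c, s in scores.items()}


# Hypothetical vectors: (fill_time_bin, cross_city_bin); label 1 = low-quality user.
rows = [(0, 1), (0, 1), (0, 0), (2, 0), (2, 0), (1, 0)]
labels = [1, 1, 1, 0, 0, 0]
model = NaiveBayes().fit(rows, labels)
suspicious = model.predict_proba((0, 1))[1]
ordinary = model.predict_proba((2, 0))[1]
```

The output is a probability rather than a hard label, which is what lets the later steps apply a threshold and watch for clusters of identical scores.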
Preferably, the model is trained with the open-source program WEKA (Waikato Environment for Knowledge Analysis). As an open data mining workbench, WEKA brings together a large number of machine learning algorithms for data mining tasks, including data preprocessing, classification, regression, clustering, association rules, and visualization in its interactive interface.
Step 150: use the produced models to detect online data based on classification and clustering rules, and process low-quality users. When the abnormal-user-behavior identification model examines online user behavior data, it computes a probability score for each user behavior record. The probability score represents the probability that the record is bad data: the higher the score, the more likely the record is bad. The probability score is generated as follows: each model examines one of several feature groups of the user behavior data (groups of features in different dimensions) and produces a probability value representing the probability that the data tend toward low quality (hereinafter, a sub-probability score); finally a sum-of-products transformation is applied to the sub-probability scores to obtain the probability score of the user behavior data.
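The text does not define the "sum of products" transformation. One plausible reading, assumed here, is a normalized weighted sum: each sub-probability score is multiplied by a weight for its model and the products are added. The weights and the normalization are hypothetical.

```python
def combine_sub_scores(sub_scores, weights=None):
    """Combine per-model sub-probability scores into one probability score.

    Assumed interpretation: sum of (weight * sub_score), divided by the
    total weight so that the result stays in [0, 1].
    """
    if weights is None:
        weights = [1.0] * len(sub_scores)
    if len(weights) != len(sub_scores):
        raise ValueError("one weight is required per sub-score")
    total_weight = sum(weights)
    return sum(w * s for w, s in zip(weights, sub_scores)) / total_weight


# Three hypothetical models, each scoring a different feature group.
equal_weighted = combine_sub_scores([0.9, 0.8, 0.1])
trust_first_model = combine_sub_scores([0.9, 0.1], weights=[3, 1])
```

Other combinations, such as multiplying likelihood ratios, would also fit the wording; the choice should be validated against audited data.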
There are two methods for detecting abnormal user behavior:
Classification-rule-based detection of abnormal user behavior. A probability line (that is, a probability threshold) is set to judge whether user behavior data are bad data. If the probability score of a user behavior record exceeds the threshold, the record is judged to be bad data and the user is judged to be a low-quality user. Otherwise the record is judged to be normal data and the user is judged to be a high-quality user. The probability line is determined through manual verification.
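The threshold rule itself is a single comparison. How the probability line is derived from manual verification is not described, so `pick_threshold` below is only one hypothetical procedure: choose the lowest candidate threshold that flags none of the records reviewers confirmed as good.

```python
def is_bad_data(score, threshold):
    """A record is bad data when its probability score exceeds the probability line."""
    return score > threshold


def pick_threshold(verified, candidates):
    """Pick the smallest candidate that flags no manually verified good record.

    verified: list of (score, is_bad) pairs confirmed by reviewers.
    Returns None when every candidate would flag a good record.
    """
    for threshold in sorted(candidates):
        flags_good_record = any(
            is_bad_data(score, threshold) and not is_bad for score, is_bad in verified
        )
        if not flags_good_record:
            return threshold
    return None


# Hypothetical manually verified records.
verified = [(0.95, True), (0.90, True), (0.85, False), (0.30, False)]
threshold = pick_threshold(verified, candidates=[0.5, 0.8, 0.9, 0.99])
```

Favoring a threshold with no false positives on verified data reflects the stated concern about wrongly penalizing legitimate users, at the cost of missing some bad records such as the one scored 0.90 here.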
Clustering-rule-based detection of online data. The specific steps are as follows:
Step a: monitor the probability scores for clustering.
Step b: hand a certain number of the user behavior records whose probability scores cluster to operations staff for inspection, to judge whether the user behaviors clustered at the same probability score are low-quality user behaviors, that is, whether the other online user behavior records with that probability score are all bad data. User behavior records clustered at the same probability score reflect the same or similar behavior and may differ in only a few feature dimensions. The preferred inspection method is to check whether the user behavior records with that probability score have all been handled by other rules, where other rules are ways of identifying bad data outside the invention, for example: the text a user publishes contains prohibited words, or the pictures a user publishes contain prohibited content.
Step c: according to the operators' inspection result, the abnormal-user-behavior identification model updates the probability score of such user behavior. That is, if the behavior is found to be that of a low-quality user, the user behavior data are judged to be bad data and the probability score of the behavior is raised to some higher value, for example 0.999.
Step d: add the bad data newly found through inspection to the sample library as training data. That is, user behavior data judged to be bad are added to the sample library as training data for the next model, providing new training data for model updates.
Step e: train the model with the new training data.
Preferably, in step e, user behavior data whose probability scores are inaccurate are analyzed offline to find new user behavior features and select suitable ones, and the newly produced model is cross-validated to judge whether it performs better.
With the above clustering-rule-based detection of online data, user behavior data can be detected even when the sample data are not accurate enough. Steps a to e also update the model in a semi-supervised machine learning manner. This mechanism avoids the problem of the model calculating inaccurate probability scores because the sample data used for training contain noise, so low-quality user behavior can still be identified well even when the sample data are inaccurate.
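Steps a to d of the clustering-based loop can be sketched as below. The record layout, the rounding used to detect identical scores, the minimum cluster size, and the stand-in for the operators' inspection (a set of record ids already caught by other rules) are all assumptions of this sketch; retraining in step e is left out.

```python
from collections import defaultdict


def find_score_clusters(records, min_size, digits=3):
    """Step a: group records by (rounded) probability score and keep large groups."""
    groups = defaultdict(list)
    for record in records:
        groups[round(record["score"], digits)].append(record)
    return {score: group for score, group in groups.items() if len(group) >= min_size}


def apply_cluster_feedback(records, min_size, sample_size, caught_by_other_rules,
                           raised_score=0.999):
    """Steps b to d: inspect a sample of each cluster and promote confirmed clusters.

    caught_by_other_rules: ids of records already handled by rules outside the
    model, used here in place of the manual inspection.
    Returns the records to add to the sample library as new negative samples.
    """
    new_negative_samples = []
    for group in find_score_clusters(records, min_size).values():
        inspected = group[:sample_size]
        if all(r["id"] in caught_by_other_rules for r in inspected):
            for r in group:
                r["score"] = raised_score       # step c: raise the probability score
                new_negative_samples.append(r)  # step d: extend the sample library
    return new_negative_samples


# Hypothetical online records: four share score 0.412, three share score 0.7.
records = (
    [{"id": i, "score": 0.412} for i in (1, 2, 3, 4)]
    + [{"id": 5, "score": 0.2}]
    + [{"id": i, "score": 0.7} for i in (6, 7, 8)]
)
new_samples = apply_cluster_feedback(
    records, min_size=3, sample_size=2, caught_by_other_rules={1, 2, 6}
)
```

In this toy run only the 0.412 cluster is promoted, because one inspected record of the 0.7 cluster was not confirmed as bad; the promoted records would then feed the next training round.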
Step 160: process the abnormal user data. Once a user behavior is determined to be abnormal, the system processes the low-quality user, for example by deleting the information the user published on the website.
The machine-learning-based fraud identification system for classified-information websites according to the invention identifies user behavior along multiple dimensions, thereby effectively reducing the amount of false transaction information and improving the authenticity of transaction information. It can also identify low-quality user behavior well even when the training data contain noise.
Other embodiments of the invention will be readily apparent to those skilled in the art from the description and practice of the invention disclosed herein. The description and embodiments are to be regarded as exemplary only; the true scope and spirit of the invention are defined by the claims.

Claims (9)

1. A method for a machine-learning-based fraud identification system for classified-information websites, the method comprising the following steps:
a) extracting sample data from existing user behavior data to generate an initial model;
b) selecting and extracting multiple user behavior features from the training data for each business type;
c) vectorizing the sample training data based on the extracted user behavior features;
d) producing a prediction model from the vectorized sample training data;
e) using the produced model to detect online data based on classification and clustering rules, wherein
the clustering-rule-based method for detecting abnormal user behavior includes:
e1) monitoring the probability scores for clustering;
e2) inspecting a certain number of the user behaviors whose probability scores cluster, to judge whether the user behaviors clustered at the same probability score are low-quality user behaviors;
e3) according to the inspection result, having the abnormal-user-behavior identification model update the probability score of such user behaviors;
e4) adding the bad data newly found through inspection to the sample library as training data;
e5) training the model with the new training data;
f) processing the abnormal user data obtained by detection.
2. The method of claim 1, wherein the sample data in step a include positive sample data and negative sample data, corresponding to users with high-quality behavior and users with low-quality behavior, respectively.
3. The method of claim 1, wherein the user behavior features in step b include user behavior data for the same cookie and per-dimension statistical counts for the user.
4. The method of claim 1, wherein in step b the user features extracted for each business type are selected by computing information entropy and by model cross-validation.
5. The method of claim 1, wherein a probabilistic classifier is used for decision-making in step d.
6. The method of claim 1, wherein in step e the model is used to calculate a probability score representing the probability that the user behavior data are abnormal.
7. The method of claim 6, wherein the probability score is calculated as follows: multiple models each examine one of several feature groups of the user behavior data and each produce a sub-probability score; a sum-of-products transformation is then applied to the sub-probability scores to obtain the probability score of the user behavior data.
8. The method of claim 1, wherein the classification-rule-based method for detecting abnormal user behavior in step e includes setting a probability line used to judge whether user behavior data are bad data.
9. The method of claim 1, wherein in step e5 user behavior data whose probability scores are inaccurate are analyzed offline to find new user behavior features and select suitable ones.
CN201410022138.1A 2014-01-17 2014-01-17 The fraud identifying system based on machine learning in classification information website Active CN103793484B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410022138.1A CN103793484B (en) 2014-01-17 2014-01-17 The fraud identifying system based on machine learning in classification information website

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410022138.1A CN103793484B (en) 2014-01-17 2014-01-17 The fraud identifying system based on machine learning in classification information website

Publications (2)

Publication Number Publication Date
CN103793484A (en) 2014-05-14
CN103793484B (en) 2017-03-15

Family

ID=50669150

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410022138.1A Active CN103793484B (en) 2014-01-17 2014-01-17 The fraud identifying system based on machine learning in classification information website

Country Status (1)

Country Link
CN (1) CN103793484B (en)

Families Citing this family (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105447755B (en) * 2014-09-01 2021-12-17 创新先进技术有限公司 Transaction control method and device
CN104361010A (en) * 2014-10-11 2015-02-18 北京中搜网络技术股份有限公司 Automatic classification method for correcting news classification
CN104463668A (en) * 2014-10-24 2015-03-25 南京邦科威信息科技有限公司 Online credit checking method and device
CN105703966A (en) * 2014-11-27 2016-06-22 阿里巴巴集团控股有限公司 Internet behavior risk identification method and apparatus
CN104636912A (en) * 2015-02-13 2015-05-20 银联智惠信息服务(上海)有限公司 Identification method and device for withdrawal of credit cards
CN106296406A (en) * 2015-05-13 2017-01-04 阿里巴巴集团控股有限公司 The processing method and processing device of interaction data
CN106294508B (en) * 2015-06-10 2020-02-11 深圳市腾讯计算机系统有限公司 Brushing amount tool detection method and device
CN105306495B (en) * 2015-11-30 2018-06-19 百度在线网络技术(北京)有限公司 user identification method and device
CN109146151A (en) * 2016-02-05 2019-01-04 第四范式(北京)技术有限公司 There is provided or obtain the method, apparatus and forecasting system of prediction result
RU2626337C1 (en) * 2016-02-18 2017-07-26 Акционерное общество "Лаборатория Касперского" Method of detecting fraudulent activity on user device
CN105808639B (en) * 2016-02-24 2021-02-09 平安科技(深圳)有限公司 Network access behavior identification method and device
CN107169768B (en) * 2016-03-07 2021-07-27 阿里巴巴集团控股有限公司 Method and device for acquiring abnormal transaction data
CN105844501A (en) * 2016-05-18 2016-08-10 上海亿保健康管理有限公司 Consumption behavior risk control system and method
CN107465648B (en) * 2016-06-06 2020-09-04 腾讯科技(深圳)有限公司 Abnormal equipment identification method and device
CN106096657B (en) * 2016-06-13 2019-04-30 第四范式(北京)技术有限公司 Based on machine learning come the method and system of prediction data audit target
CN106384282A (en) * 2016-06-14 2017-02-08 平安科技(深圳)有限公司 Method and device for building decision-making model
CN106228410A (en) * 2016-07-29 2016-12-14 武汉斗鱼网络科技有限公司 Virtual present task anti-brush system and method in a kind of live platform
CN106230849B (en) * 2016-08-22 2019-04-19 中国科学院信息工程研究所 A kind of smart machine machine learning safety monitoring system based on user behavior
CN106408411A (en) * 2016-08-31 2017-02-15 北京城市网邻信息技术有限公司 Credit assessment method and device
CN106548343B (en) * 2016-10-21 2020-11-10 中国银联股份有限公司 Illegal transaction detection method and device
CN106682067B (en) * 2016-11-08 2018-05-01 浙江邦盛科技有限公司 A kind of anti-fake monitoring system of machine learning based on transaction data
CN108108743B (en) * 2016-11-24 2022-06-24 百度在线网络技术(北京)有限公司 Abnormal user identification method and device for identifying abnormal user
CN106682985B (en) * 2016-12-26 2020-03-27 深圳先进技术研究院 Financial fraud identification method and system
CN108512682B (en) * 2017-02-28 2021-02-26 腾讯科技(深圳)有限公司 Method and device for determining false terminal identification
CN108875761B (en) * 2017-05-11 2022-06-28 华为技术有限公司 Method and device for expanding potential users
CN107335220B (en) * 2017-06-06 2021-01-26 广州华多网络科技有限公司 Negative user identification method and device and server
WO2019061376A1 (en) * 2017-09-30 2019-04-04 深圳市得道健康管理有限公司 Method for evaluating internet behavior and network terminal
CN109416700A (en) * 2017-09-30 2019-03-01 深圳市得道健康管理有限公司 Classification training method for internet behavior and network terminal
CN107730717B (en) * 2017-10-31 2019-08-30 华中科技大学 Method for identifying suspicious public transport cards based on feature extraction
CN110097066B (en) * 2018-01-31 2024-01-05 阿里巴巴集团控股有限公司 User classification method and device and electronic equipment
CN108550052A (en) * 2018-04-03 2018-09-18 杭州呯嘭智能技术有限公司 Fake-order (brushing) detection method and system based on user behavior data features
JP7091174B2 (en) * 2018-07-09 2022-06-27 キヤノン株式会社 System, system control method and program
CN110796153B (en) * 2018-08-01 2023-06-20 阿里巴巴集团控股有限公司 Training sample processing method and device
CN109325779A (en) * 2018-08-20 2019-02-12 北京数美时代科技有限公司 Read-write profiling method and system, and profile processing system, for anti-fraud
CN109389494B (en) * 2018-10-25 2021-11-05 北京芯盾时代科技有限公司 Loan fraud detection model training method, loan fraud detection method and device
CN109886284B (en) * 2018-12-12 2021-02-12 同济大学 Fraud detection method and system based on hierarchical clustering
CN109714636B (en) * 2018-12-21 2021-04-23 武汉瓯越网视有限公司 User identification method, device, equipment and medium
CN111476258B (en) * 2019-01-24 2024-01-05 杭州海康威视数字技术股份有限公司 Feature extraction method and device based on attention mechanism and electronic equipment
CN109829713B (en) * 2019-01-28 2020-09-15 重庆邮电大学 Mobile payment mode identification method based on common drive of knowledge and data
CN110557447B (en) * 2019-08-26 2022-06-10 腾讯科技(武汉)有限公司 User behavior identification method and device, storage medium and server
CN112533208A (en) * 2019-08-27 2021-03-19 中国移动通信有限公司研究院 Model training method, false terminal identification method and device, and electronic device
CN111259985B (en) * 2020-02-19 2023-06-30 腾讯云计算(长沙)有限责任公司 Classification model training method and device based on business safety and storage medium
CN111639681A (en) * 2020-05-09 2020-09-08 同济大学 Early warning method, system, medium and device based on education drive type fraud
CN111914645A (en) * 2020-06-30 2020-11-10 五八有限公司 Method and device for identifying false information, electronic equipment and storage medium
CN112818142B (en) * 2021-01-29 2023-12-08 北京达佳互联信息技术有限公司 Account behavior information processing method and device, electronic equipment and storage medium
CN112887325B (en) * 2021-02-19 2022-04-01 浙江警察学院 Telecommunication network fraud crime fraud identification method based on network flow
CN113506084A (en) * 2021-06-23 2021-10-15 上海师范大学 False recruitment position detection method based on deep learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1649311A (en) * 2005-03-23 2005-08-03 北京首信科技有限公司 System and method for detecting abnormal user behaviour based on machine learning
CN102176698A (en) * 2010-12-20 2011-09-07 北京邮电大学 Method for detecting abnormal behaviors of user based on transfer learning
CN102238045A (en) * 2010-04-27 2011-11-09 广州迈联计算机科技有限公司 System and method for predicting user behavior in wireless Internet
CN102402517A (en) * 2010-09-09 2012-04-04 北京启明星辰信息技术股份有限公司 Method and system for establishing normal database login model and method and system for detecting abnormal login behavior

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009152465A1 (en) * 2008-06-12 2009-12-17 Guardian Analytics, Inc. Modeling users for fraud detection and analysis

Also Published As

Publication number Publication date
CN103793484A (en) 2014-05-14

Similar Documents

Publication Publication Date Title
CN103793484B (en) The fraud identifying system based on machine learning in classification information website
Morstatter et al. A new approach to bot detection: striking the balance between precision and recall
US10484413B2 (en) System and a method for detecting anomalous activities in a blockchain network
Yu et al. Attention-based convolutional approach for misinformation identification from massive and noisy microblog posts
Pozzana et al. Measuring bot and human behavioral dynamics
CN103795612B (en) Rubbish and illegal information detecting method in instant messaging
CN107807941B (en) Information processing method and device
CN111371767B (en) Malicious account identification method, malicious account identification device, medium and electronic device
CN103530540A (en) User identity attribute detection method based on man-machine interaction behavior characteristics
CN105574544A (en) Data processing method and device
CN108833139B (en) OSSEC alarm data aggregation method based on category attribute division
CN104040963A (en) System and methods for spam detection using frequency spectra of character strings
KR102124846B1 (en) Source analysis based news reliability evaluation system and method thereof
CN112329816A (en) Data classification method and device, electronic equipment and readable storage medium
CN109063736B (en) Data classification method and device, electronic equipment and computer readable storage medium
CN110598129B (en) Cross-social network user identity recognition method based on two-stage information entropy
CN106537387B (en) Retrieval/storage image associated with event
CN106202126B (en) Data analysis method and device for logistics monitoring
WO2024067387A1 (en) User portrait generation method based on characteristic variable scoring, device, vehicle, and storage medium
WO2019200739A1 (en) Data fraud identification method, apparatus, computer device, and storage medium
CN103853744A (en) Deceptive junk comment detection method oriented to user generated contents
Yamak et al. Detection of multiple identity manipulation in collaborative projects
CN115545103A (en) Abnormal data identification method, label identification method and abnormal data identification device
CN113157871B (en) News public opinion text processing method, server and medium applying artificial intelligence
CN109478219A (en) User interface for displaying network analysis

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant