CN108197795A - Malicious group account identification method, device, terminal and storage medium - Google Patents
- Publication number
- CN108197795A (application number CN201711460104.0A)
- Authority
- CN
- China
- Prior art keywords
- data set
- characteristic
- account
- weight coefficient
- standard data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0609—Buyer or seller confidence or verification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Abstract
The present invention relates to the field of big data mining and provides a malicious group account identification method, device, terminal and storage medium. The method includes: obtaining the characteristic data sets of all accounts to be identified; standardizing each characteristic data set to obtain a corresponding standard data set; predicting each standard data set with a preset account relationship prediction model to obtain the first weight coefficient corresponding to each standard data set; performing feature extraction on each standard data set to obtain the second weight coefficient corresponding to each standard data set; and obtaining the malicious group accounts among all accounts to be identified according to the first and second weight coefficients of all standard data sets. The invention considers the association weights between accounts from multiple dimensions, which improves the accuracy of the classification results when a community discovery algorithm is applied to practical problems.
Description
Technical field
The present invention relates to the field of big data mining, and in particular to a malicious group account identification method, device, terminal and storage medium.
Background
With the development of Internet technology, China's online travel market has entered a period of rapid growth. For market promotion, travel platforms often adopt aggressive subsidy strategies to attract more drivers to the platform. At the same time, various illegal cheating methods have emerged to earn these high subsidies, and current trends show that such cheating is gradually evolving from individual offences into group offences. Meanwhile, with the development of machine learning, artificial intelligence and related technologies, methods that identify groups intelligently through data mining are also maturing. In real application scenarios, however, current group identification methods consider only a single factor of the relationship between group accounts, so their identification accuracy is not high.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a malicious group account identification method, device, terminal and storage medium that address the above problem.
To achieve this purpose, the embodiments of the present invention adopt the following technical solutions:
In a first aspect, an embodiment of the present invention provides a malicious group account identification method. The method includes: obtaining the characteristic data sets of all accounts to be identified; standardizing each characteristic data set to obtain a corresponding standard data set; predicting each standard data set with a preset account relationship prediction model to obtain the first weight coefficient corresponding to each standard data set; performing feature extraction on each standard data set to obtain the second weight coefficient corresponding to each standard data set; and obtaining the malicious group accounts among all accounts to be identified according to the first and second weight coefficients of all standard data sets.
In a second aspect, an embodiment of the present invention further provides a malicious group account identification device. The device includes a characteristic data set acquisition module, a data normalization module, a data prediction module, a feature extraction module and a malicious account division module. The characteristic data set acquisition module obtains the characteristic data sets of all accounts to be identified. The data normalization module standardizes each characteristic data set to obtain a corresponding standard data set. The data prediction module predicts each standard data set with a preset account relationship prediction model to obtain the first weight coefficient corresponding to each standard data set. The feature extraction module performs feature extraction on each standard data set to obtain the second weight coefficient corresponding to each standard data set. The malicious account division module obtains the malicious group accounts among all accounts to be identified according to the first and second weight coefficients of all standard data sets.
In a third aspect, an embodiment of the present invention further provides a terminal. The terminal includes one or more processors and a memory for storing one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the above malicious group account identification method.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the above malicious group account identification method.
The malicious group account identification method, device, terminal and storage medium provided by the embodiments of the present invention work as follows. First, each characteristic data set is standardized to obtain the standard data set corresponding to each account to be identified. Then, prediction and feature extraction are performed separately on each standard data set to obtain its first weight coefficient and second weight coefficient. Finally, the first and second weight coefficients are combined to identify the malicious group accounts. Compared with the prior art, the embodiments of the present invention consider, from multiple dimensions, the various factors that influence the relationship between accounts to be identified, assign different weights according to how strongly each factor influences the account relationship, and merge all weights into one comprehensive weight. Malicious group accounts are identified with this comprehensive weight, which improves the accuracy of malicious group account identification.
To make the above objects, features and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present invention and should therefore not be regarded as limiting its scope. A person of ordinary skill in the art can obtain other relevant drawings from these drawings without creative effort.
Fig. 1 shows a block diagram of the terminal provided by an embodiment of the present invention.
Fig. 2 shows a flow chart of the malicious group account identification method provided by an embodiment of the present invention.
Fig. 3 is a flow chart of the sub-steps of step S102 shown in Fig. 2.
Fig. 4 is a flow chart of the sub-steps of step S104 shown in Fig. 2.
Fig. 5 shows a schematic diagram of the relational network graph provided by an embodiment of the present invention.
Fig. 6 is a flow chart of the sub-steps of step S105 shown in Fig. 2.
Fig. 7 shows a block diagram of the malicious group account identification device provided by an embodiment of the present invention.
Reference numerals: 100: terminal; 101: memory; 102: storage controller; 103: processor; 200: malicious group account identification device; 201: characteristic data set acquisition module; 202: data normalization module; 203: data prediction module; 204: feature extraction module; 205: malicious account division module.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. The components of the embodiments, as generally described and illustrated in the drawings, can be arranged and designed in a variety of different configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings. In the description of the present invention, the terms "first", "second" and so on are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
Referring to Fig. 1, Fig. 1 shows a block diagram of the terminal 100 provided by an embodiment of the present invention. The terminal 100 may be, but is not limited to, a smartphone, a tablet computer, a personal computer (PC), a server and the like. The operating system of the terminal 100 may be, but is not limited to, Android, iOS (iPhone operating system), Windows Phone, Windows and the like. The terminal 100 includes a malicious group account identification device 200, a memory 101, a storage controller 102 and a processor 103.
The memory 101, the storage controller 102 and the processor 103 are electrically connected to one another, directly or indirectly, to transmit or exchange data. For example, these elements may be electrically connected through one or more communication buses or signal lines. The malicious group account identification device 200 includes at least one software function module that can be stored in the memory 101 in the form of software or firmware, or built into the operating system (OS) of the terminal 100. The processor 103 executes the executable modules stored in the memory 101, such as the software function modules and computer programs included in the malicious group account identification device 200.
The memory 101 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM) and the like. The memory 101 stores a program, and the processor 103 executes the program after receiving an execution instruction.
The processor 103 may be an integrated circuit chip with signal processing capability. It may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a speech processor, a video processor and the like. It may also be a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor, or the processor 103 may be any conventional processor.
First embodiment
Referring to Fig. 2, Fig. 2 shows a flow chart of the malicious group account identification method provided by an embodiment of the present invention. The method includes the following steps:
Step S101: obtain the characteristic data sets of all accounts to be identified.
In this embodiment, an account to be identified may be a user who has registered on a transaction platform for the purpose of trading and who is suspected of carrying out malicious transactions on that platform. Take users registered on an online ride-hailing platform as an example; a user may be a driver or a passenger. If such users are suspected of obtaining unlawful benefits on the platform by improper means, for example by exploiting platform loopholes through illegal means or by deliberately falsifying transaction data to obtain high subsidies, these users suspected of malicious transactions are the accounts to be identified.

A characteristic attribute is an attribute that characterizes the transaction information between an account to be identified and other accounts. Characteristic attributes may include, but are not limited to, the transaction risk level, transaction location, transaction count and transaction amount between each account to be identified and other accounts. A characteristic data set is the set of values of all characteristic attributes of one account to be identified. It is extracted from the raw data of that account. The raw data may come from, but is not limited to, data tables of the account such as the daily full transaction table, daily full transfer table, refund record table, device information table, basic information table, order record table, daily full table, resource acquisition table, churn prediction table, message record table and consultation and complaint table. The raw data is collected with a period of 30 days, that is, once every 30 days.

A characteristic data set can be written as {x1_1 = 3, x1_2 = 1, x1_3 = 10, …, x1_125 = 37}. The first number in the subscript is the serial number of the characteristic data set, and the second number is the serial number of the characteristic attribute within that data set. In this embodiment, the 1st characteristic attribute is the transaction risk level, the 2nd is the transaction location, the 3rd is the transaction count, and the 125th is the transaction amount. Thus x1_1 = 3 means that the 1st characteristic attribute of the 1st characteristic data set, the transaction risk level, is level 3; x1_2 = 1 means that its 2nd characteristic attribute, the transaction location, is 1, where 1 stands for a particular city; x1_3 = 10 means that its 3rd characteristic attribute, the transaction count, is 10; and x1_125 = 37 means that its 125th characteristic attribute, the transaction amount, is 370,000 yuan (the value is expressed in units of 10,000 yuan). The characteristic data sets of all accounts to be identified are the collection of the characteristic data sets of each account to be identified.
[example not reproduced in the source text]
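As a rough illustration only, a characteristic data set could be held in memory as a mapping from attribute name to value. The attribute names below are hypothetical (the text names only four of the 125 attributes), and `None` is an assumed marker for a missing value.

```python
# Hypothetical attribute names; the text names only these four attributes.
FEATURES = ["danger_level", "city", "trade_count", "trade_amount"]

def make_feature_set(values):
    """Pair each attribute name with its value for one account."""
    if len(values) != len(FEATURES):
        raise ValueError("expected one value per attribute")
    return dict(zip(FEATURES, values))

# First account from the text: risk level 3, city code 1, 10 trades, amount 37.
account_1 = make_feature_set([3, 1, 10, 37])

# The collection for all accounts is simply a list of such sets.
# The second account is made up; None marks a missing value (an assumption).
all_feature_sets = [account_1, make_feature_set([2, 3, 4, None])]
```

A list of dictionaries keeps missing attributes explicit, which the screening steps later in the text rely on.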
Step S102: standardize each characteristic data set to obtain the corresponding standard data set.

In this embodiment, the characteristic attribute set corresponding to the characteristic data sets obtained in step S101 is determined first. The characteristic attribute set is a set of characteristic attributes, each of which corresponds to a column of the characteristic data sets. For example, suppose there are 3 characteristic data sets: the characteristic attributes of the 1st include transaction risk level and transaction location, those of the 2nd include transaction count and transaction amount, and those of the 3rd include transaction risk level and transaction count. The characteristic attribute set of the 3 characteristic data sets then includes transaction risk level, transaction location, transaction count and transaction amount.

Second, the lacking samples and effective samples of each characteristic attribute are counted across the characteristic data sets to obtain a lack sample value and an effective sample value. A lacking sample of a characteristic attribute is a characteristic data set that either does not contain the attribute or contains it with a missing value. An effective sample of a characteristic attribute is a characteristic data set that contains the attribute with a valid value. For example, if the 1st characteristic data set has no transaction count attribute, it is a lacking sample of the transaction count attribute; if the 2nd characteristic data set contains the transaction location attribute with a valid value, it is an effective sample of the transaction location attribute. The lack sample value of a characteristic attribute may be the number of its lacking samples across all characteristic data sets divided by the total number of characteristic data sets. For example, if 8 of 10 characteristic data sets lack the transaction count attribute, the lack sample value of transaction count is 0.8. Likewise, the effective sample value of a characteristic attribute may be the number of its effective samples across all characteristic data sets divided by the total number of characteristic data sets.

Then, the characteristic data sets that actually contain effective information are screened out according to the lack sample value and the effective sample value. Screening may consist of deleting the invalid characteristic attribute columns from the characteristic data sets of all accounts to be identified. An invalid characteristic attribute column is the column of a characteristic attribute whose lack sample value reaches a first threshold, or whose effective sample value does not reach a second threshold. The first threshold may be, but is not limited to, an empirical value. For example, if the first threshold is 0.7, the column of any characteristic attribute whose lack sample value is greater than or equal to 0.7 is invalid and is deleted from the characteristic data sets. The second threshold may likewise be, but is not limited to, an empirical value. For example, if the second threshold is 0.8, the column of any characteristic attribute whose effective sample value is less than 0.8 is invalid and is deleted from the characteristic data sets.

Next, the sample mean and sample standard deviation of each characteristic attribute in the screened characteristic data sets are calculated. The sample mean of a characteristic attribute is the sum of its values across the screened characteristic data sets divided by the number of screened characteristic data sets. The sample standard deviation is calculated according to formula (1):

[formula (1) not reproduced in the source text]

where $u_{ij}$ is a value of characteristic attribute $u_i$, $\bar{u}_i$ is the sample mean of $u_i$, and $\sigma_{u_i}$ is the sample standard deviation of $u_i$.

Finally, each characteristic data set is standardized with the standardization formula (2), using the sample mean and sample standard deviation of each characteristic attribute, to obtain the corresponding standard data set. Formula (2) is as follows:

$$u'_{ij} = \frac{u_{ij} - \bar{u}_i}{\sigma_{u_i}} \tag{2}$$

where $u_{ij}$ is a value of characteristic attribute $u_i$, $u'_{ij}$ is the value of $u_i$ after standardization, $\bar{u}_i$ is the sample mean of $u_i$, and $\sigma_{u_i}$ is the sample standard deviation of $u_i$.
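The standardization step can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes every column is numeric and complete after screening, and it assumes the sample standard deviation uses the n - 1 denominator (as `statistics.stdev` does), since formula (1) is not available in this text. The function names and the sample rows are hypothetical.

```python
from statistics import mean, stdev

def standardize_column(values):
    """Return (value - mean) / standard deviation for one attribute column."""
    mu = mean(values)
    sigma = stdev(values)  # assumption: n - 1 denominator
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def standardize(rows):
    """Standardize every column of a list of equal-length numeric rows."""
    columns = list(zip(*rows))
    scaled = [standardize_column(col) for col in columns]
    return [list(row) for row in zip(*scaled)]

# Made-up rows: transaction count, transaction amount, transaction risk level.
rows = [[10, 1000, 3], [20, 3000, 1], [30, 2000, 2]]
standard_rows = standardize(rows)
```

After this transformation every column has mean 0 and unit standard deviation, so attributes of very different magnitudes become comparable.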
It should be noted that, in this step, the missing attribute value and effective attribute value of the characteristic attribute set of each characteristic data set may also be counted, and the characteristic data sets of all accounts to be identified may be screened accordingly. The missing attribute value of a characteristic data set may be the number of attributes that are absent from it, or present with a missing value, divided by the total number of attributes in its characteristic attribute set. The effective attribute value may be the number of attributes in its characteristic attribute set whose value is valid, divided by the total number of attributes in the set. For example, if the characteristic attribute set of the 1st characteristic data set contains 100 attributes, of which 90 have missing values and 10 have valid values, the missing attribute value of the 1st characteristic data set is 0.9 and its effective attribute value is 0.1. When the missing attribute value reaches a third threshold, the characteristic data set is considered to contain too little information and is deleted from the characteristic data sets of all accounts to be identified. The third threshold may be, but is not limited to, an empirical value such as 0.7; when the missing attribute value is greater than or equal to 0.7, the corresponding characteristic data set is deleted. When the effective attribute value does not reach a fourth threshold, the effective information in the characteristic data set is considered to be of little value, and the characteristic data set is likewise deleted from the characteristic data sets of all accounts to be identified. The fourth threshold may be, but is not limited to, an empirical value such as 0.8; when the effective attribute value is less than 0.8, the corresponding characteristic data set is deleted.
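The per-data-set screening described above might look like the following sketch. It assumes each characteristic data set is a fixed-length list in which `None` marks a missing value and every non-missing value counts as valid; the function names are hypothetical and the default thresholds are the example values 0.7 and 0.8 from the text.

```python
MISSING = None  # assumption: a missing attribute value is stored as None

def attribute_ratios(feature_set):
    """Return (missing attribute value, effective attribute value) for one set."""
    total = len(feature_set)
    missing = sum(1 for v in feature_set if v is MISSING)
    return missing / total, (total - missing) / total

def screen_feature_sets(feature_sets, third_threshold=0.7, fourth_threshold=0.8):
    """Drop sets with too many missing values or too few valid ones."""
    kept = []
    for fs in feature_sets:
        missing_value, effective_value = attribute_ratios(fs)
        if missing_value >= third_threshold or effective_value < fourth_threshold:
            continue
        kept.append(fs)
    return kept

# The example from the text: 100 attributes, 90 missing and 10 valid.
sparse = [MISSING] * 90 + [1] * 10
dense = [1] * 100
kept = screen_feature_sets([sparse, dense])
```

With these example thresholds the effective-value test is the stricter of the two, so a data set needs at least 80% valid attributes to survive.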
Referring to Fig. 3, step S102 may also include the following sub-steps:
Step S1021: obtain the characteristic attribute set of all accounts to be identified, where the characteristic attribute set includes multiple characteristic attributes.
In this embodiment, the characteristic attributes may be, but are not limited to, transaction risk level, transaction location, transaction count, transaction amount and the like.
Step S1022: count the lacking samples and effective samples of each characteristic attribute across the characteristic data sets to obtain the lack sample value and the effective sample value.
In this embodiment, for each characteristic attribute, the number of lacking samples plus the number of effective samples is less than or equal to the total number of characteristic data sets of all accounts to be identified, so the sum of the lack sample value and the effective sample value is less than or equal to 1. For example, suppose there are 10 characteristic data sets in total, of which 5 lack the transaction count attribute and 4 contain it with a valid value. The lack sample value of transaction count is then 0.5 and its effective sample value is 0.4.
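The counting in step S1022 can be sketched as below. This is an illustrative assumption about the data layout: each characteristic data set is a dictionary, an absent key or `None` counts as lacking, and validity is decided by a caller-supplied predicate. The attribute names, the sample data and the negative count used as an invalid value are all hypothetical.

```python
def sample_values(feature_sets, attribute, is_valid):
    """Return (lack sample value, effective sample value) for one attribute."""
    total = len(feature_sets)
    lacking = 0
    effective = 0
    for fs in feature_sets:
        value = fs.get(attribute)  # None when the attribute is absent or empty
        if value is None:
            lacking += 1
        elif is_valid(value):
            effective += 1
    return lacking / total, effective / total

def is_invalid_column(lack_value, effective_value,
                      first_threshold=0.7, second_threshold=0.8):
    """Apply the example thresholds from the text to one attribute column."""
    return lack_value >= first_threshold or effective_value < second_threshold

# 10 sets: 5 lack trade_count, 4 hold a valid count, 1 holds an invalid one.
feature_sets = (
    [{"city": 1} for _ in range(5)]
    + [{"trade_count": n} for n in (3, 10, 7, 2)]
    + [{"trade_count": -1}]
)
lack, effective = sample_values(feature_sets, "trade_count", lambda v: v >= 0)
```

The two values sum to less than 1 here because one data set holds the attribute with an invalid value, which is neither a lacking sample nor an effective one.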
Step S1023: obtain the sample mean and sample standard deviation of each characteristic attribute according to the lack sample value and the effective sample value.
In this embodiment, the characteristic data sets that actually contain effective information are screened out according to the lack sample value and the effective sample value, and the sample mean and sample standard deviation of each characteristic attribute in the screened characteristic data sets are then calculated. Before screening, the missing characteristic attribute values in each characteristic data set may also be filled in, depending on the actual situation. The filling method may be, but is not limited to, average-value filling: the values of the corresponding characteristic attribute in the other characteristic data sets of all accounts to be identified are averaged, and the average is used as the missing attribute value. For example, suppose there are 5 characteristic data sets with the following values of the transaction amount attribute:
[values not reproduced in the source text]
Here the value of the 125th characteristic attribute is missing from the 3rd characteristic data set. The average of the 125th characteristic attribute over the characteristic data sets is 33, so the missing value in the 3rd characteristic data set is filled with the average 33. The attribute values after filling are:
[values not reproduced in the source text]
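A minimal sketch of average-value filling follows. The five transaction amounts are hypothetical, chosen only so that the four known values average 33 as in the text's example; `None` is an assumed marker for the missing value.

```python
def fill_with_mean(values):
    """Replace each missing value (None) with the mean of the known values."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no known values to average")
    avg = sum(present) / len(present)
    return [avg if v is None else v for v in values]

# Hypothetical amounts for 5 data sets; the 3rd one is missing.
amounts = [37, 25, None, 40, 30]
filled = fill_with_mean(amounts)
```

Filling with the column mean leaves the sample mean of the attribute unchanged, although it does shrink the sample standard deviation slightly.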
Step S1024: calculate the standard data set corresponding to each characteristic data set with a standardization algorithm, according to the sample mean and sample standard deviation.
In this embodiment, the data in each characteristic data set have different dimensions. For example, the total number of transactions between A and B is 10, the total transaction amount is 1000 yuan, and the transaction risk level of A and B is 3; these 3 characteristic attributes differ in order of magnitude. The characteristic data sets therefore cannot be fed directly into the algorithm as input data. They must first be standardized with the standardization formula to obtain the standard data set corresponding to each characteristic data set.
Step S103: predict each standard data set with the preset account relationship prediction model to obtain the first weight coefficient corresponding to each standard data set.
In this embodiment, the account relationship prediction model is a prediction model in the form of a multiple linear regression equation, obtained by performing multiple linear regression analysis on historical accounts. The historical accounts are accounts already confirmed to be malicious group accounts. Substituting each standard data set into the multiple linear regression equation yields the malicious relationship depth predicted value of that standard data set. Normalizing this predicted value then gives the first weight coefficient corresponding to each standard data set.
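To make the regression step concrete, here is a small pure-Python sketch that fits a multiple linear regression by least squares and then predicts a value for a new row. It solves the normal equations with Gaussian elimination, which is one standard way to obtain least-squares coefficients and is an assumption here, not necessarily the formula used in the patent. All function names and the training data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; attributes may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear_regression(rows, targets):
    """Least-squares fit of y = theta0 + theta1*x1 + ... + thetan*xn."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(theta, row):
    """Evaluate the fitted regression equation for one data row."""
    return theta[0] + sum(t * x for t, x in zip(theta[1:], row))

# Made-up history generated from y = 1 + 2*x1 + 3*x2.
history = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
depths = [1, 3, 4, 6, 8]
theta = fit_linear_regression(history, depths)
predicted_depth = predict(theta, (2, 2))
```

In practice a numerical library would replace the hand-written solver; it is spelled out here only to keep the sketch self-contained.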
In one implementation, the first weight coefficient may be obtained as follows.
First, a multiple linear regression equation $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$ is defined, where $n$ is the number of characteristic attributes and $x_j$ is the value of the $j$-th characteristic attribute in each characteristic data set. For example, $x_j$ may be the transaction count or the transaction amount of each pair of transacting accounts. To simplify processing by a program, the multiple linear regression equation is rewritten as $h_\theta(x) = \theta^{T}X = a + b x_j$, where $\theta$ and $x$ are $(n+1, 1)$ vectors. For example, the transaction amounts of 2000 pairs of transactions in the historical data can be written as a $(2000, 1)$ vector, that is, a single-column vector holding the transaction amounts of the 2000 pairs.
Second, a loss function is defined as in formula (3):
[formula (3) not reproduced in the source text]
where $n$ is the number of characteristic attributes (columns), $m$ is the number of historical accounts (rows), $y_i$ is the known true result value in the historical account database, and $h_\theta(x_i)$ is the estimated value.
Third, the formula for calculating the regression coefficients is derived with the least squares method, as in formula (4):
[formula (4) not reproduced in the source text]
Fourth, convert the value of each feature attribute in each feature data set of each historical account into dummy variables to obtain virtual attribute variables. The conversion method is: obtain the full value range of each feature attribute in each feature data set of each historical account, then convert every value in that range into a virtual attribute variable that uses 1 or 0 to indicate whether the value is hit. For example, if the value range of x2 is {1, 2, 3}, representing the three cities Beijing, Shanghai and Guangzhou, it is converted into the three columns "is Beijing", "is Shanghai" and "is Guangzhou", and each value is then mapped one to one. Thus x2 = 1 is converted to: x2_Beijing = 1, meaning the virtual attribute variable "is Beijing" takes the value "yes"; x2_Shanghai = 0, meaning the virtual attribute variable "is Shanghai" takes the value "no"; and x2_Guangzhou = 0, meaning the virtual attribute variable "is Guangzhou" takes the value "no".
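The dummy-variable conversion described above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function name `to_dummy_columns` and the column naming scheme are assumptions made for the example.

```python
def to_dummy_columns(values, labels, prefix):
    """Expand one categorical attribute into 0/1 columns, one per value in its range.

    `labels` maps each code in the value range to a column suffix (hypothetical naming).
    """
    return [
        {f"{prefix}_{name}": int(value == code) for code, name in labels.items()}
        for value in values
    ]


# Value range {1, 2, 3} of attribute x2 stands for three cities, as in the text.
city_labels = {1: "Beijing", 2: "Shanghai", 3: "Guangzhou"}
dummy_rows = to_dummy_columns([1, 3], city_labels, prefix="x2")
print(dummy_rows[0])  # {'x2_Beijing': 1, 'x2_Shanghai': 0, 'x2_Guangzhou': 0}
```

Each input value produces exactly one column set to 1, so the original code can always be recovered from the dummy columns.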
Fifth, substitute the virtual attribute values into the regression equation $h_\theta(x)=\theta^Tx=a+bx_j$ defined in the first step, whose regression coefficients a and b are unknown, and solve for a and b to obtain the complete regression equation.
Sixth, substitute each standard data set into the complete regression equation obtained in the fifth step to obtain the corresponding malicious-relationship depth prediction value. For example, substituting the standard data set {x200_1 = 0, x200_2 = 1, x200_3 = 40, ..., x200_125 = 55} into the regression equation $h_\theta(x)=\theta^Tx=a+bx_j$, where the values of a and b were both obtained in the fifth step, gives the corresponding malicious-relationship depth {y200 = 98.233}.
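As an illustration of solving for a and b and then predicting, the sketch below fits the simplified form h(x) = a + b·x by ordinary least squares on a single attribute. The data values and the function name `fit_simple_regression` are hypothetical; the patent's actual model uses many attributes.

```python
def fit_simple_regression(xs, ys):
    """Least-squares estimates of a and b in h(x) = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical history: one attribute value and one known depth per transaction pair.
history_x = [1.0, 2.0, 3.0, 4.0]
history_y = [12.0, 14.0, 16.0, 18.0]
a, b = fit_simple_regression(history_x, history_y)

# Substituting a new standardized value into the fitted equation gives its prediction.
predicted_depth = a + b * 5.0
print(a, b, predicted_depth)  # 10.0 2.0 20.0
```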
Seventh, normalize the malicious-relationship depth prediction value obtained in the previous step to obtain the corresponding first weight coefficient. The normalization formula is formula (5):
$w_i' = (w_i - w_{\min})/(w_{\max} - w_{\min})$ (5)
where $w_i$ is each feature value, $w_{\min}$ is the minimum of the feature values, $w_{\max}$ is the maximum of the feature values, and $w_i'$ is the normalized value. For example, the malicious-relationship depth of the standard data set {x200_1 = 0, x200_2 = 1, x200_3 = 40, ..., x200_125 = 55} above is {y200 = 98.233}; after linear normalization, the corresponding first weight coefficient is {w1_200' = 0.878804}.
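The linear (min-max) normalization of formula (5) can be sketched as below. The list of predicted depths is hypothetical apart from the value 98.233 quoted in the text, so the resulting weights are not expected to match the patent's figures.

```python
def min_max_normalize(values):
    """Map each value to (v - min) / (max - min); all zeros if the values are constant."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical predicted depths for four standard data sets.
depths = [10.0, 55.0, 98.233, 110.0]
first_weights = min_max_normalize(depths)
```

The smallest depth maps to 0 and the largest to 1, so every first weight coefficient lies in the interval [0, 1].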
Step S104: perform feature extraction on each standard data set to obtain the second weight coefficient corresponding to each standard data set.
In this embodiment of the present invention, the second weight coefficient corresponding to each standard data set may be calculated as follows. First, a total standard data set is obtained from the standard data sets of the accounts to be identified; the total standard data set includes the standard data set of every account to be identified. Principal component analysis is performed on the total standard data set to obtain its multiple principal components. For example, if the feature attribute set includes the four attributes transaction risk level, transaction location, transaction count and transaction amount, principal component analysis may show that transaction risk level, transaction location and transaction count are its principal components. Then, taking the variance contribution rate of each principal component of each standard data set as the weight, the multiple principal components of each standard data set are weighted to obtain one composite closeness value for that standard data set, and this composite closeness value is normalized to obtain the second weight coefficient corresponding to each standard data set. The variance contribution rate reflects how strongly a principal component influences each standard data set.
Referring to Fig. 4, step S104 may further include the following sub-steps:
Sub-step S1041: perform principal component analysis on the total standard data set to obtain the multiple principal components of the total standard data set, where the total standard data set includes the standard data set of each account to be identified.
In this embodiment of the present invention, principal component analysis is performed on the total standard data set in order to reduce the dimensionality of the feature attribute set and obtain the most relevant feature attributes; the characteristic equation is then solved for these most relevant feature attributes to obtain the principal components.
In one embodiment, the principal component analysis may include:
First, obtain the standardized matrix of the standard data sets. Each row of the standardized matrix represents one pair of transactions, and each column represents one feature attribute of the transactions.
Second, compute the correlation coefficient matrix of the standardized matrix from the previous step using formula (6).
The correlation coefficient is a statistical indicator of how closely variables are related. It is calculated by the product-moment method: based on the deviations of two variables from their respective means, the two deviations are multiplied to reflect the degree of correlation between the two variables. The correlation coefficient lies between -1 and +1, that is, -1 ≤ r ≤ 1. Its properties are as follows:
When r > 0, the two variables are positively correlated; when r < 0, the two variables are negatively correlated.
When |r| = 1, the two variables are perfectly linearly correlated, that is, they have a functional relationship.
When r = 0, there is no linear relationship between the two variables.
When 0 < |r| < 1, the two variables are linearly correlated to some degree. The closer |r| is to 1, the closer the linear relationship between the two variables; the closer |r| is to 0, the weaker their linear correlation. Three levels are generally distinguished: |r| ≤ 0.4 is low linear correlation; 0.4 < |r| < 0.7 is significant correlation; 0.7 ≤ |r| < 1 is high linear correlation. For example, in this embodiment of the present invention, x1 represents the risk level of transactions between accounts and x3 represents the number of transactions between accounts, and r13 = r31 = -0.0423465, which shows that the risk level and the transaction count between accounts are only weakly correlated.
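The product-moment calculation and the three-level grading described above can be sketched as follows. The function names are illustrative assumptions, and the thresholds are the ones stated in the text.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


def correlation_level(r):
    """Grade |r| using the thresholds 0.4 and 0.7 given in the text."""
    strength = abs(r)
    if strength <= 0.4:
        return "low"
    if strength < 0.7:
        return "significant"
    return "high"


print(correlation_level(-0.0423465))  # low
```

Computing `pearson_r` for every pair of columns of the standardized matrix fills in the correlation coefficient matrix; the matrix is symmetric because r13 = r31.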
Third, solve the characteristic equation $|R-\lambda I_p|=0$ for the correlation coefficient matrix R obtained in the second step to obtain p characteristic roots λ.
Fourth, to determine the number m of most relevant characteristic roots among the p characteristic roots from the third step, use formula (7) to determine the value of m; that is, choose the m most relevant characteristic roots from the p characteristic roots so that the information utilization rate reaches 95%. For each characteristic root $\lambda_j$, j = 1, 2, ..., m, solve the system of equations $Rb=\lambda_jb$ to obtain the unit eigenvector.
Fifth, convert the eigenvectors obtained in the fourth step into principal components using formula (8),
where U1 is called the first principal component, U2 the second principal component, and Um the m-th principal component. In this embodiment of the present invention, the first principal component is U1 = {12.044529, 11.927115, ..., -5.901484}, the second principal component is U2 = {-0.427374, -0.079803, ..., -0.793191}, and so on, for 67 principal components in total.
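Formulas (7) and (8) are not reproduced in this text. The sketch below assumes the usual reading: m is the smallest number of largest characteristic roots whose cumulative share of the total reaches 95%, and each principal component score is the dot product of a standardized row with a unit eigenvector. The eigenvalues, eigenvector and function names are hypothetical.

```python
def choose_component_count(eigenvalues, threshold=0.95):
    """Smallest m such that the m largest eigenvalues reach `threshold` of their sum."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for m, root in enumerate(ordered, start=1):
        running += root
        if running / total >= threshold:
            return m
    return len(ordered)


def component_scores(standardized_rows, unit_eigenvector):
    """Project every standardized row onto one unit eigenvector."""
    return [
        sum(z * b for z, b in zip(row, unit_eigenvector))
        for row in standardized_rows
    ]


# Hypothetical characteristic roots: cumulative shares are 0.60, 0.85, 0.975, 1.0.
m = choose_component_count([2.4, 1.0, 0.5, 0.1])
print(m)  # 3

# Hypothetical 2-attribute rows projected onto the unit vector (0.6, 0.8).
u1 = component_scores([[1.0, 2.0], [3.0, 4.0]], [0.6, 0.8])
```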
Sub-step S1042: obtain the second weight coefficient corresponding to each standard data set according to the contribution rate of each principal component to each standard data set.
In this embodiment of the present invention, the second weight coefficient may be obtained as follows: taking the variance contribution rate of each principal component in each standard data set as the weight, compute the weighted sum of the m principal components of each standard data set to obtain one composite weight coefficient, and normalize this composite weight coefficient to obtain the corresponding second weight coefficient. For example, the second weight coefficient of the 1st standard data set is ResultWeight = {w2_1 = 0.143221642}, and the second weight coefficient of the 200th standard data set is ResultWeight = {w2_200 = -0.136319985}.
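The weighted summation of principal components can be sketched as follows. The contribution rates and component values are hypothetical, and the sketch stops at the composite value, before the normalization step.

```python
def weighted_component_score(component_values, contribution_rates):
    """Sum of principal component values, each weighted by its variance contribution rate."""
    return sum(
        rate * value for rate, value in zip(contribution_rates, component_values)
    )


# Hypothetical setting: m = 2 principal components explaining 75% and 25% of the variance.
contribution_rates = [0.75, 0.25]
component_rows = [[2.0, 1.0], [0.0, -1.0], [1.0, 0.0]]  # one row per standard data set
composite = [weighted_component_score(row, contribution_rates) for row in component_rows]
print(composite)  # [1.75, -0.25, 0.75]
```

A composite value can be negative, as in the second row, which is why a normalization step follows before the value is used as a weight.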
Step S105: obtain the malicious group accounts among all accounts to be identified according to the first weight coefficients and second weight coefficients corresponding to all standard data sets.
In this embodiment of the present invention, after the first weight coefficient and second weight coefficient corresponding to each standard data set are obtained, first, a weighted average of the first weight coefficient and second weight coefficient of each standard data set is computed to obtain the third weight coefficient corresponding to each standard data set. Then, the relational network graph among all accounts to be identified is reconstructed from the third weight coefficients corresponding to all standard data sets. For example, suppose there are 4 standard data sets: the 1st represents a transaction initiated by account 5 to account 8, with a malicious weight coefficient of 0.325; the 2nd represents a transaction initiated by account 3 to account 5, with a malicious weight coefficient of 0.56; the 3rd represents a transaction initiated by account 6 to account 3, with a malicious weight coefficient of 0.84; and the 4th represents a transaction initiated by account 6 to account 5, with a malicious weight coefficient of 0.66. The relational network graph reconstructed from this information is shown in Fig. 5. Finally, a hierarchical clustering greedy algorithm is applied to the relational network to perform clustering, yielding the malicious group accounts among all accounts to be identified.
Referring to Fig. 6, step S105 may further include the following sub-steps:
Sub-step S1051: compute a weighted average of the first weight coefficient and second weight coefficient corresponding to each standard data set to obtain the third weight coefficient corresponding to each standard data set.
In this embodiment of the present invention, if the weighted average of the first weight coefficient and second weight coefficient of each standard data set is too small, which is unfavorable for the subsequent community detection algorithm, the weighted average of each standard data set may be scaled up by a suitable factor and used as the corresponding third weight coefficient. For example, the first weight coefficient of the 1st standard data set is ResultWeight = {w1_1' = 0.104926}, and the first weight coefficient of the 200th standard data set is ResultWeight = {w1_200' = 0.878804}; the second weight coefficient of the 1st standard data set is ResultWeight = {w2_1' = 0.595968}, and the second weight coefficient of the 200th standard data set is ResultWeight = {w2_200' = 0.294188}. After weighted averaging, the third weight coefficient of the 1st standard data set is ResultWeight = {W1 = 3.50447} and the third weight coefficient of the 200th standard data set is ResultWeight = {W200 = 5.86496}; to facilitate the subsequent algorithm, the weighted average has here been multiplied by 10. Concretely, after the first weight and second weight of the first pair of transactions are combined, its malicious-relationship familiarity depth is 3.50447, and after the first weight and second weight of the 200th pair of transactions are combined, its malicious-relationship familiarity depth is 5.86496.
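The example figures above are reproduced by an equal-weight average followed by multiplication by 10. The sketch below makes that arithmetic explicit; the equal weighting (`alpha = 0.5`) is inferred from the example numbers rather than stated in the text, and the function name is hypothetical.

```python
def third_weight(w1, w2, alpha=0.5, scale=10.0):
    """Weighted average of the first and second coefficients, scaled up by `scale`."""
    return scale * (alpha * w1 + (1.0 - alpha) * w2)


W1 = third_weight(0.104926, 0.595968)
W200 = third_weight(0.878804, 0.294188)
print(round(W1, 5), round(W200, 5))  # 3.50447 5.86496
```

For the 1st standard data set, (0.104926 + 0.595968) / 2 = 0.350447, which becomes 3.50447 after scaling; for the 200th, (0.878804 + 0.294188) / 2 = 0.586496, which becomes 5.86496.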
Sub-step S1052: reconstruct the relational network graph among all accounts to be identified according to the third weight coefficients corresponding to all standard data sets.
In this embodiment of the present invention, the nodes of the relational network graph are the accounts corresponding to the standard data sets, and the edges carry the third weight coefficients of the standard data sets corresponding to the accounts. The relational network graph among all accounts to be identified is reconstructed from this node and edge information. For example, if ResultWeight = {W1 = 3.50447} represents the relation data of a transaction initiated by account 0 to account 9, then there is a directed edge from the node representing account 0 to the node representing account 9, and the weight of that edge is 3.50447.
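A relational network graph of this kind can be held in a nested dictionary, as in the minimal sketch below. It uses the four example transactions given earlier in the description of step S105; the function name and the adjacency-dictionary representation are assumptions.

```python
def build_relation_graph(records):
    """Build a directed weighted graph from (initiator, receiver, weight) records."""
    graph = {}
    for initiator, receiver, weight in records:
        graph.setdefault(initiator, {})[receiver] = weight
        graph.setdefault(receiver, {})  # make sure receiving accounts are nodes too
    return graph


# Four standard data sets: account 5 -> 8, 3 -> 5, 6 -> 3 and 6 -> 5.
records = [(5, 8, 0.325), (3, 5, 0.56), (6, 3, 0.84), (6, 5, 0.66)]
graph = build_relation_graph(records)
print(graph[6])  # {3: 0.84, 5: 0.66}
```

Each key is an account node, and each inner entry is a directed edge from the initiating account to the receiving account together with its weight.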
Sub-step S1053: identify the malicious group accounts among all accounts to be identified from the relational network graph using the hierarchical clustering greedy algorithm.
In this embodiment of the present invention, one malicious group corresponds to one cluster in the algorithm. The hierarchical clustering greedy algorithm repeatedly adds new nodes to an existing cluster as long as the modularity does not decrease, finally obtaining the cluster with the largest modularity and the most nodes, and thereby the malicious group accounts among all accounts to be identified. A cluster in the algorithm is also called a community, a group, and so on; those skilled in the art will recognize these as different names for the same concept. In the description below, "community" and "cluster" have the same meaning.
In one embodiment, the hierarchical clustering greedy algorithm may include:
First, calculate the initial modularity Q using formula (9),
where the normalizing quantity is the sum of all weights in the network, $A_{i,j}$ is the weight between node i and node j, the weight of vertex i is the total weight of the edges connected to vertex i, $c_i$ is the community to which vertex i is assigned, and $\delta(c_i,c_j)$ indicates whether vertex i and vertex j are in the same community, returning 1 if so and 0 otherwise.
For ease of calculation, formula (9) is simplified to formula (10),
where $\Sigma_{in}$ is the weight inside community c, and $\Sigma_{tot}$ is the weight of the edges connected to the nodes inside community c, including both edges inside the community and edges leading outside it.
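Formulas (9) and (10) are not reproduced in this text. The definitions above match the widely used weighted modularity, which is given here as an assumed reconstruction; the symbols m (sum of all weights) and $k_i$ (total weight of the edges at vertex i) are names chosen for this illustration.

```latex
Q = \frac{1}{2m}\sum_{i,j}\left[A_{i,j} - \frac{k_i k_j}{2m}\right]\delta(c_i, c_j),
\qquad m = \frac{1}{2}\sum_{i,j}A_{i,j}

Q = \sum_{c}\left[\frac{\Sigma_{in}}{2m} - \left(\frac{\Sigma_{tot}}{2m}\right)^{2}\right]
```

In the second form, $\Sigma_{in}$ is taken as the sum of $A_{i,j}$ over ordered pairs of nodes inside community c, so each internal edge is counted twice. The simplification groups the double sum by community, which avoids visiting every pair of nodes.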
Second, assign any one node to the community of a node adjacent to it and recalculate the modularity according to formula (10). If the new modularity is not less than the initial modularity calculated in the first step, assign this node to that community; otherwise, this node cannot be assigned to that community.
Third, take the new modularity as the initial modularity and continue iterating the second step until the modularity reaches its maximum or all nodes have been assigned, finally obtaining one community.
Fourth, repeat the first through third steps on the remaining nodes to finally obtain all communities in the relational network graph, which correspond to all malicious groups.
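The greedy procedure can be sketched as below. This is a simplified illustration, not the patent's exact algorithm: it treats the graph as undirected, recomputes modularity from scratch after each trial move, and accepts a move only when modularity strictly increases (the text accepts "not less than") so that the loop is guaranteed to terminate. The example graph and function names are hypothetical.

```python
def modularity(edges, community):
    """Weighted modularity, summed per community as in the simplified formula.

    edges: {(u, v): weight} for an undirected graph, each edge listed once.
    community: {node: community label}.
    """
    two_m = 2.0 * sum(edges.values())
    inside = {}  # twice the internal edge weight of each community
    total = {}   # summed weighted degree of each community
    for (u, v), w in edges.items():
        total[community[u]] = total.get(community[u], 0.0) + w
        total[community[v]] = total.get(community[v], 0.0) + w
        if community[u] == community[v]:
            inside[community[u]] = inside.get(community[u], 0.0) + 2.0 * w
    return sum(
        inside.get(c, 0.0) / two_m - (t / two_m) ** 2 for c, t in total.items()
    )


def greedy_communities(edges):
    """Move nodes into neighbouring communities while modularity improves."""
    nodes = sorted({n for edge in edges for n in edge})
    neighbours = {n: set() for n in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    community = {n: n for n in nodes}  # start with one community per node
    best = modularity(edges, community)
    improved = True
    while improved:
        improved = False
        for n in nodes:
            for nb in sorted(neighbours[n]):
                if community[nb] == community[n]:
                    continue
                old = community[n]
                community[n] = community[nb]      # trial move
                q = modularity(edges, community)
                if q > best + 1e-12:
                    best, improved = q, True      # keep the move
                else:
                    community[n] = old            # undo the move
    return community, best


# Hypothetical graph: two tightly linked triangles joined by one weak edge.
edges = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
         (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0, (2, 3): 0.1}
community, q = greedy_communities(edges)
```

On this graph the two triangles end up as separate communities, which corresponds to two candidate groups in the sense of the method.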
In this embodiment of the present invention, communities are divided by the hierarchical clustering greedy algorithm, and 7 malicious groups are finally obtained. The first associated-transaction account group has 12 accounts, {1, 9, 21, 22, 26, 30, 41, 43, 44, 48, 53, 61}; the second has 7 accounts, {8, 12, 19, 35, 46, 52, 62}; the third has 9 accounts, {3, 4, 5, 11, 29, 38, 47, 55, 64}; the fourth has 10 accounts, {0, 10, 13, 15, 16, 17, 20, 25, 49, 54}; the fifth has 11 accounts, {7, 14, 23, 27, 31, 32, 33, 34, 37, 59, 65}; the sixth has 8 accounts, {18, 28, 40, 42, 45, 50, 57, 63}; and the seventh has 9 accounts, {2, 6, 24, 36, 39, 51, 56, 58, 60}; 66 accounts in total. The concrete meaning of each group is as follows. Taking the first associated-transaction account group of 12 accounts, {1, 9, 21, 22, 26, 30, 41, 43, 44, 48, 53, 61}, as an example: the risk levels of these 12 accounts are close to one another and their mutual dealings are excessively close, so the probability of malicious group crime is relatively high.
In this embodiment of the present invention, first, the feature data sets of all accounts to be identified are obtained by extracting them from the raw data. On the one hand, this ensures that the transaction attributes related to suspected malicious transactions are extracted as comprehensively as possible, so that the first and second weight coefficients calculated later incorporate malice evaluation criteria from enough dimensions, which helps improve the accuracy of malicious group account identification. On the other hand, information entirely unrelated to malicious transactions is removed, which reduces the volume of data to be processed later and improves data processing efficiency. Second, when feature extraction is performed on each standard data set, on the one hand the information of the accounts to be identified is fully used, with the information utilization rate reaching 95%, so that the resulting second weight coefficient reflects as much as possible the feature attribute information most relevant to malice in the accounts to be identified, further improving the accuracy of the final identification. On the other hand, feature attributes with little relevance to malice are removed, which reduces the volume of data to be processed later and improves data processing efficiency. Finally, when the hierarchical clustering greedy algorithm is used to compute the malicious group accounts among all accounts to be identified, the modularity formula is simplified, which shortens the computation time and improves the efficiency of the algorithm.
Second embodiment
Referring to Fig. 7, Fig. 7 shows a block diagram of the malicious group account identification device 200 provided by an embodiment of the present invention. The malicious group account identification device 200 is applied to the terminal 100 and includes a feature data set acquisition module 201, a data standardization module 202, a data prediction module 203, a feature extraction module 204, and a malicious account division module 205.
The feature data set acquisition module 201 is configured to obtain the feature data sets of all accounts to be identified.
In this embodiment of the present invention, the feature data set acquisition module 201 may be used to perform step S101.
The data standardization module 202 is configured to standardize each feature data set to obtain the corresponding standard data set.
In this embodiment of the present invention, the data standardization module 202 may be used to perform step S102.
In this embodiment of the present invention, the data standardization module 202 may also be used to perform sub-steps S1021-S1024 of step S102.
The data prediction module 203 is configured to predict each standard data set using the preset account relationship prediction model to obtain the first weight coefficient corresponding to each standard data set.
In this embodiment of the present invention, the data prediction module 203 may be used to perform step S103.
The feature extraction module 204 is configured to perform feature extraction on each standard data set to obtain the second weight coefficient corresponding to each standard data set.
In this embodiment of the present invention, the feature extraction module 204 may be used to perform step S104.
In this embodiment of the present invention, the feature extraction module 204 may also be used to perform sub-steps S1041-S1042 of step S104.
The malicious account division module 205 is configured to obtain the malicious group accounts among all accounts to be identified according to the first weight coefficients and second weight coefficients corresponding to all standard data sets.
In this embodiment of the present invention, the malicious account division module 205 may be used to perform step S105.
In this embodiment of the present invention, the malicious account division module 205 may also be used to perform sub-steps S1051-S1053 of step S105.
An embodiment of the present invention further discloses a computer-readable storage medium on which a computer program is stored; when the computer program is executed by the processor 103, it implements the malicious group account identification method disclosed by the present invention.
In summary, the present invention provides a malicious group account identification method, device, terminal and storage medium. The method includes: obtaining the feature data sets of all accounts to be identified; standardizing each feature data set to obtain the corresponding standard data set; predicting each standard data set using a preset account relationship prediction model to obtain the first weight coefficient corresponding to each standard data set; performing feature extraction on each standard data set to obtain the second weight coefficient corresponding to each standard data set; and obtaining the malicious group accounts among all accounts to be identified according to the first weight coefficients and second weight coefficients corresponding to all standard data sets. Compared with the prior art, the present invention considers, from multiple dimensions, the many factors that affect the relationships between the accounts to be identified, assigns different weights to these factors according to their degree of influence on the account relationships, and finally combines all the weights into one composite weight. Malicious group account identification is performed using this composite weight, which improves the accuracy of malicious group account identification.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architecture, functions and operations of devices, methods and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the function modules in the embodiments of the present invention may be integrated to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.
If the functions are implemented in the form of software function modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc. It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" or any other variant are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes the element.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the present invention; for those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention. It should be noted that similar reference numerals and letters denote similar items in the drawings; therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings.
Claims (10)
1. A malicious group account identification method, characterized in that the method comprises:
obtaining the feature data sets of all accounts to be identified;
standardizing each feature data set to obtain the corresponding standard data set;
predicting each standard data set using a preset account relationship prediction model to obtain the first weight coefficient corresponding to each standard data set;
performing feature extraction on each standard data set to obtain the second weight coefficient corresponding to each standard data set;
obtaining the malicious group accounts among all accounts to be identified according to the first weight coefficients and second weight coefficients corresponding to all standard data sets.
2. The method according to claim 1, characterized in that the step of standardizing each feature data set to obtain the corresponding standard data set comprises:
obtaining the feature attribute set of all accounts to be identified, wherein the feature attribute set includes multiple feature attributes;
counting the missing samples and valid samples of each feature attribute in each feature data set to obtain a missing sample value and a valid sample value;
obtaining the sample mean and sample standard deviation of each feature attribute according to the missing sample value and the valid sample value;
calculating the standard data set corresponding to each feature data set using a standardization algorithm according to the sample mean and sample standard deviation.
3. The method according to claim 1, characterized in that the preset account relationship prediction model is obtained by performing multiple regression analysis on the feature data sets of historical accounts, wherein the historical accounts are malicious group accounts.
4. The method according to claim 1, characterized in that the step of performing feature extraction on each standard data set to obtain the second weight coefficient corresponding to each standard data set comprises:
performing principal component analysis on a total standard data set to obtain multiple principal components of the total standard data set, wherein the total standard data set includes the standard data set of each account to be identified;
obtaining the second weight coefficient corresponding to each standard data set according to the contribution rate of each principal component to each standard data set.
5. The method according to claim 1, characterized in that the step of obtaining the malicious group accounts among all accounts to be identified according to the first weight coefficients and second weight coefficients corresponding to all standard data sets comprises:
computing a weighted average of the first weight coefficient and second weight coefficient corresponding to each standard data set to obtain the third weight coefficient corresponding to each standard data set;
reconstructing the relational network graph among all accounts to be identified according to the third weight coefficients corresponding to all standard data sets;
identifying the malicious group accounts among all accounts to be identified from the relational network graph using a hierarchical clustering greedy algorithm.
6. A malicious group account identification device, characterized in that the device comprises:
a feature data set acquisition module, configured to obtain the feature data sets of all accounts to be identified;
a data standardization module, configured to standardize each feature data set to obtain the corresponding standard data set;
a data prediction module, configured to predict each standard data set using a preset account relationship prediction model to obtain the first weight coefficient corresponding to each standard data set;
a feature extraction module, configured to perform feature extraction on each standard data set to obtain the second weight coefficient corresponding to each standard data set;
a malicious account division module, configured to obtain the malicious group accounts among all accounts to be identified according to the first weight coefficients and second weight coefficients corresponding to all standard data sets.
7. The device according to claim 6, characterized in that the data standardization module is further configured to:
obtain the feature attribute set of all accounts to be identified, wherein the feature attribute set includes multiple feature attributes;
count the missing samples and valid samples of each feature attribute in each feature data set to obtain a missing sample value and a valid sample value;
obtain the sample mean and sample standard deviation of each feature attribute according to the missing sample value and the valid sample value;
calculate the standard data set corresponding to each feature data set using a standardization algorithm according to the sample mean and sample standard deviation.
8. The device according to claim 6, characterized in that the preset account relationship prediction model is obtained by performing multiple regression analysis on the feature data sets of historical accounts, wherein the historical accounts are malicious group accounts.
9. A terminal, characterized in that the terminal comprises:
one or more processors;
a memory, configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-5.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711460104.0A CN108197795B (en) | 2017-12-28 | 2017-12-28 | Malicious group account identification method, device, terminal and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108197795A (en) | 2018-06-22 |
CN108197795B (en) | 2020-11-03 |
Family
ID=62585345
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711460104.0A Active CN108197795B (en) | 2017-12-28 | 2017-12-28 | Malicious group account identification method, device, terminal and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108197795B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110232630A (en) * | 2019-05-29 | 2019-09-13 | 腾讯科技(深圳)有限公司 | The recognition methods of malice account, device and storage medium |
CN110414987A (en) * | 2019-07-18 | 2019-11-05 | 中国工商银行股份有限公司 | Recognition methods, device and the computer system of account aggregation |
CN112015723A (en) * | 2019-05-28 | 2020-12-01 | 顺丰科技有限公司 | Data grading method and device, computer equipment and storage medium |
CN113506150A (en) * | 2021-06-24 | 2021-10-15 | 深圳市盈捷创想科技有限公司 | Network behavior monitoring method and device and computer readable storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060080422A1 (en) * | 2004-06-02 | 2006-04-13 | Bernardo Huberman | System and method for discovering communities in networks |
CN104408149A (en) * | 2014-12-04 | 2015-03-11 | 威海北洋电气集团股份有限公司 | Criminal suspect mining association method and system based on social network analysis |
CN106549974A (en) * | 2016-12-06 | 2017-03-29 | 北京知道创宇信息技术有限公司 | Prediction the social network account whether equipment of malice, method and system |
CN106599273A (en) * | 2016-12-23 | 2017-04-26 | 贾志娟 | Social network analysis-based microblog swindling team mining method |
CN106952167A (en) * | 2017-03-06 | 2017-07-14 | 浙江工业大学 | A kind of catering trade good friend Lian Bian influence force prediction methods based on multiple linear regression |
CN107169768A (en) * | 2016-03-07 | 2017-09-15 | 阿里巴巴集团控股有限公司 | The acquisition methods and device of abnormal transaction data |
CN107403326A (en) * | 2017-08-14 | 2017-11-28 | 云数信息科技(深圳)有限公司 | A kind of Insurance Fraud recognition methods and device based on teledata |
- 2017-12-28: CN application CN201711460104.0A, patent CN108197795B (en), status Active
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112015723A (en) * | 2019-05-28 | 2020-12-01 | 顺丰科技有限公司 | Data grading method and device, computer equipment and storage medium |
CN110232630A (en) * | 2019-05-29 | 2019-09-13 | 腾讯科技(深圳)有限公司 | The recognition methods of malice account, device and storage medium |
CN110414987A (en) * | 2019-07-18 | 2019-11-05 | 中国工商银行股份有限公司 | Recognition methods, device and the computer system of account aggregation |
CN113506150A (en) * | 2021-06-24 | 2021-10-15 | 深圳市盈捷创想科技有限公司 | Network behavior monitoring method and device and computer readable storage medium |
CN113506150B (en) * | 2021-06-24 | 2023-12-05 | 深圳市盈捷创想科技有限公司 | Network behavior monitoring method, device and computer readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN108197795B (en) | 2020-11-03 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN108229314B (en) | Target person searching method and device and electronic equipment | |
CN112148987B (en) | Message pushing method based on target object activity and related equipment | |
CN108133418A (en) | Real-time credit risk management system | |
CN104899508B (en) | A kind of multistage detection method for phishing site and system | |
CN108197795A (en) | Malicious group account identification method, device, terminal and storage medium | |
CN109191226B (en) | Risk control method and device | |
CN110400215B (en) | Method and system for constructing enterprise family-oriented small micro enterprise credit assessment model | |
CN103605714B (en) | The recognition methods of website abnormal data and device | |
CN110163242B (en) | Risk identification method and device and server | |
CN112926699A (en) | Abnormal object identification method, device, equipment and storage medium | |
CN110084609B (en) | Transaction fraud behavior deep detection method based on characterization learning | |
CN111325248A (en) | Method and system for reducing pre-loan business risk | |
CN113837323B (en) | Training method and device of satisfaction prediction model, electronic equipment and storage medium | |
CN113011889A (en) | Account abnormity identification method, system, device, equipment and medium | |
CN109670933A (en) | Identify method, user equipment, storage medium and the device of user role | |
CN114782051A (en) | Ether phishing account detection device and method based on multi-feature learning | |
CN107885754A (en) | The method and apparatus for extracting credit variable from transaction data based on LDA models | |
CN112966728A (en) | Transaction monitoring method and device | |
CN112487284A (en) | Bank customer portrait generation method, equipment, storage medium and device | |
WO2019223082A1 (en) | Customer category analysis method and apparatus, and computer device and storage medium | |
CN112991079B (en) | Multi-card co-occurrence medical treatment fraud detection method, system, cloud end and medium | |
CN110472680B (en) | Object classification method, device and computer-readable storage medium | |
CN113706258A (en) | Product recommendation method, device, equipment and storage medium based on combined model | |
CN113706279A (en) | Fraud analysis method and device, electronic equipment and storage medium | |
CN113065892A (en) | Information pushing method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||