CN109600344A - Identify the method, apparatus and electronic equipment of risk group - Google Patents

Method, apparatus, and electronic device for identifying risk groups

Info

Publication number
CN109600344A
Authority
CN
China
Prior art keywords
communicating
number combination
group
combination
characteristic value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710937630.5A
Other languages
Chinese (zh)
Other versions
CN109600344B (en)
Inventor
刘站奇
李健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201710937630.5A
Publication of CN109600344A
Application granted
Publication of CN109600344B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/535 Tracking the activity of the user
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/12 Detection or prevention of fraud

Abstract

The embodiments of the invention disclose a method, an apparatus, and an electronic device for identifying risk groups. The method includes: obtaining historical behavior data corresponding to communication numbers; adding at least two communication numbers in the historical behavior data that share at least one similar network behavior to a number combination; calculating the association weight corresponding to each number combination; clustering all communication numbers in the number combinations according to those association weights to obtain at least one group; and identifying risk groups according to the number of risk numbers in each group. Because groups are determined by jointly considering data in multiple dimensions, such as IP address, request time, and number features, and risk groups are then identified from the number of risk numbers in each group, the embodiments determine groups more accurately and therefore identify risk groups more accurately.

Description

Method, apparatus, and electronic device for identifying risk groups
Technical Field
Embodiments of the present invention relate to the field of data analysis, and in particular to a method, an apparatus, and an electronic device for identifying risk groups.
Background
Currently, internet platforms attract potential users by launching marketing campaigns. Some bad actors take part in these campaigns by registering member accounts in bulk. Such batch-registered members can be called a risk group, also known as the "wool party". To avoid wasting the resources invested in a marketing campaign, risk groups usually need to be identified.
In the related art, risk groups are identified by counting the number of members registered under a single dimension, for example the same IP (Internet Protocol) address or the same time period. For example, if the number of members registered under the same IP address exceeds a first threshold, the members registered under that IP address are determined to belong to the same risk group. As another example, if the number of members registered within the same time period exceeds a second threshold, the members registered within that period are determined to belong to the same risk group.
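The single-dimension counting described for the related art can be sketched as follows. This is only an illustrative reading of that baseline: the function name `flag_by_ip`, the member identifiers, and the threshold value are assumptions made for the example and do not come from the patent.

```python
from collections import defaultdict


def flag_by_ip(registrations, first_threshold):
    """Return {ip: members} for every IP whose member count exceeds the threshold.

    `registrations` is an iterable of (member_id, ip) pairs (hypothetical layout).
    """
    members_by_ip = defaultdict(set)
    for member_id, ip in registrations:
        members_by_ip[ip].add(member_id)
    # All members under an over-threshold IP are treated as one risk group.
    return {
        ip: members
        for ip, members in members_by_ip.items()
        if len(members) > first_threshold
    }


# Hypothetical registrations: three members share one IP address.
registrations = [
    ("m1", "10.10.10.10"),
    ("m2", "10.10.10.10"),
    ("m3", "10.10.10.10"),
    ("m4", "11.11.11.11"),
]
flagged = flag_by_ip(registrations, first_threshold=2)
```

Because only one dimension is inspected at a time, unrelated members who happen to share a busy IP address are flagged together, which is consistent with the low accuracy the text attributes to this approach.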
The method provided by the related art identifies risk groups with low accuracy.
Summary
Embodiments of the present invention provide a method, an apparatus, and an electronic device for identifying risk groups, to solve the problem of low accuracy in identifying risk groups in the related art. The technical solutions are as follows:
In a first aspect, a method for identifying risk groups is provided, the method including:
obtaining historical behavior data corresponding to communication numbers, where the historical behavior data includes multiple historical behavior records, and each historical behavior record includes: the Internet Protocol (IP) address used by the communication number when executing a network behavior, and the request time at which the communication number executed the network behavior;
adding at least two communication numbers in the historical behavior data that share at least one similar network behavior to a number combination, where similar network behaviors are network behaviors that use the same IP address and whose request times fall within the same time period;
calculating the association weight corresponding to each number combination, where the association weight of a number combination characterizes the degree of association between the communication numbers in that number combination;
clustering all communication numbers in the number combinations according to the association weight corresponding to each number combination, to obtain at least one group; and
identifying risk groups according to the number of risk numbers in each group.
Optionally, calculating the association weight corresponding to each number combination includes:
calculating the feature values corresponding to each number combination, where the feature values include at least one of a first feature value, a second feature value, and a third feature value; the first feature value of a number combination characterizes the number of similar network behaviors shared by the communication numbers in the number combination, the second feature value characterizes the similarity between the communication numbers in the number combination, and the third feature value characterizes the type information of the IP address used by the communication numbers in the number combination each time they executed a similar network behavior; and
determining the association weight corresponding to each number combination according to the feature values corresponding to that number combination.
Optionally, when the feature values include the second feature value, calculating the feature values corresponding to each number combination includes:
for each number combination, obtaining the number feature values corresponding to the communication numbers in the number combination, where the number feature values include at least one of a call feature value, a binding feature value, and an activity feature value; the call feature value is obtained by quantifying the call behavior of a communication number, the binding feature value is obtained by quantifying the binding behavior of a communication number, and the activity feature value is obtained by quantifying the activity level of the application programs bound to a communication number; and
calculating the second feature value corresponding to the number combination according to the number feature values corresponding to the communication numbers in the number combination.
Optionally, when the feature values include the third feature value, calculating the feature values corresponding to each number combination includes:
for each number combination, obtaining the type of the IP address used by the communication numbers in the number combination each time they executed a similar network behavior; and
determining the third feature value corresponding to the number combination according to the number of times IP addresses of a specified type were used.
Optionally, when each number combination includes two communication numbers, clustering all communication numbers in the number combinations according to the association weight corresponding to each number combination to obtain at least one group includes:
constructing a group feature graph, where each node in the group feature graph represents one communication number included in the number combinations, and the edge between two connected nodes represents the association weight of the number combination formed by the communication numbers corresponding to those two nodes;
adding a different label to each node in the group feature graph;
executing at least one round of an update process on the labels of the nodes in the group feature graph, where in each round, for each node of the group feature graph, the label of the node is updated according to the labels of the other nodes connected to it; and
when the at least one round of the update process is complete, adding the communication numbers corresponding to nodes with the same label in the group feature graph to the same group.
Optionally, when the number of communication numbers in a number combination is greater than 2, clustering all communication numbers in the number combinations according to the association weight corresponding to each number combination to obtain at least one group includes:
if the association weight corresponding to one number combination is greater than a first threshold, adding the communication numbers in that number combination to the same group;
and/or
if the association weights corresponding to multiple number combinations are all greater than a second threshold, and the number of communication numbers shared by any two of those number combinations is greater than a third threshold, adding the communication numbers in those number combinations to the same group.
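The two threshold rules above can be sketched as follows. This is a simplified reading rather than the patented procedure: the greedy merge order, the function names, and all sample weights and thresholds are assumptions made for illustration.

```python
def groups_by_first_rule(weighted_combos, first_threshold):
    """Rule 1: a single combination whose weight exceeds the first threshold forms a group."""
    return [set(combo) for combo, weight in weighted_combos if weight > first_threshold]


def groups_by_second_rule(weighted_combos, second_threshold, third_threshold):
    """Rule 2 (greedy sketch): merge combinations that all exceed the second
    threshold and pairwise share more than `third_threshold` numbers."""
    clusters = []  # each cluster is a list of frozensets
    for combo, weight in weighted_combos:
        if weight <= second_threshold:
            continue
        combo = frozenset(combo)
        for cluster in clusters:
            # Join a cluster only if the overlap condition holds with every member.
            if all(len(combo & other) > third_threshold for other in cluster):
                cluster.append(combo)
                break
        else:
            clusters.append([combo])
    # Only clusters made of several combinations count as merged groups.
    return [set().union(*cluster) for cluster in clusters if len(cluster) > 1]


# Hypothetical combinations of three numbers each, with made-up weights.
weighted_combos = [
    (("a", "b", "c"), 0.9),
    (("a", "b", "d"), 0.8),
    (("x", "y", "z"), 0.7),
    (("a", "e", "f"), 0.2),
]
strong = groups_by_first_rule(weighted_combos, first_threshold=0.85)
merged = groups_by_second_rule(weighted_combos, second_threshold=0.5, third_threshold=1)
```

The greedy pass depends on the order in which combinations are visited; an exact treatment of the "any two combinations" condition would need to search for sets of mutually overlapping combinations.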
Optionally, after identifying the risk group according to the number of risk numbers in the group, the method further includes:
adding the communication numbers in the risk group that are not recorded in the blacklist to the blacklist.
In a second aspect, an apparatus for identifying risk groups is provided, the apparatus including:
a data acquisition module, configured to obtain historical behavior data corresponding to communication numbers, where the historical behavior data includes multiple historical behavior records, and each historical behavior record includes: the Internet Protocol (IP) address used by the communication number when executing a network behavior, and the request time at which the communication number executed the network behavior;
a combination extraction module, configured to add at least two communication numbers in the historical behavior data that share at least one similar network behavior to a number combination, where similar network behaviors are network behaviors that use the same IP address and whose request times fall within the same time period;
a weight calculation module, configured to calculate the association weight corresponding to each number combination, where the association weight of a number combination characterizes the degree of association between the communication numbers in that number combination;
a clustering module, configured to cluster all communication numbers in the number combinations according to the association weight corresponding to each number combination, to obtain at least one group; and
a group determination module, configured to identify risk groups according to the number of risk numbers in each group.
Optionally, the weight calculation module includes:
a first calculation unit, configured to calculate the feature values corresponding to each number combination, where the feature values include at least one of a first feature value, a second feature value, and a third feature value; the first feature value of a number combination characterizes the number of similar network behaviors shared by the communication numbers in the number combination, the second feature value characterizes the similarity between the communication numbers in the number combination, and the third feature value characterizes the type information of the IP address used by the communication numbers in the number combination each time they executed a similar network behavior; and
a second calculation unit, configured to determine the association weight corresponding to each number combination according to the feature values corresponding to that number combination.
Optionally, when the feature values include the second feature value, the first calculation unit is configured to:
for each number combination, obtain the number feature values corresponding to the communication numbers in the number combination, where the number feature values include at least one of a call feature value, a binding feature value, and an activity feature value; the call feature value is obtained by quantifying the call behavior of a communication number, the binding feature value is obtained by quantifying the binding behavior of a communication number, and the activity feature value is obtained by quantifying the activity level of the application programs bound to a communication number; and
calculate the second feature value corresponding to the number combination according to the number feature values corresponding to the communication numbers in the number combination.
Optionally, when the feature values include the third feature value, the first calculation unit is configured to:
for each number combination, obtain the type of the IP address used by the communication numbers in the number combination each time they executed a similar network behavior; and
determine the third feature value corresponding to the number combination according to the number of times IP addresses of a specified type were used.
Optionally, when each number combination includes two communication numbers, the clustering module includes:
a feature graph construction unit, configured to construct a group feature graph, where each node in the group feature graph represents one communication number included in the number combinations, and the edge between two connected nodes represents the association weight of the number combination formed by the communication numbers corresponding to those two nodes;
a label adding unit, configured to add a different label to each node in the group feature graph;
a label update unit, configured to execute at least one round of an update process on the labels of the nodes in the group feature graph, where in each round, for each node of the group feature graph, the label of the node is updated according to the labels of the other nodes connected to it; and
a first clustering unit, configured to add, when the at least one round of the update process is complete, the communication numbers corresponding to nodes with the same label in the group feature graph to the same group.
Optionally, when the number of communication numbers in a number combination is greater than 2, the clustering module includes:
a second clustering unit, configured to add the communication numbers in one number combination to the same group when the association weight corresponding to that number combination is greater than a first threshold;
and/or
a third clustering unit, configured to add the communication numbers in multiple number combinations to the same group when the association weights corresponding to those number combinations are all greater than a second threshold and the number of communication numbers shared by any two of those number combinations is greater than a third threshold.
Optionally, the apparatus further includes:
a number adding module, configured to add the communication numbers in the risk group that are not recorded in the blacklist to the blacklist.
In a third aspect, an electronic device is provided. The electronic device includes a processor and a memory; the memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the method for identifying risk groups according to the first aspect.
In a fourth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the method for identifying risk groups according to the first aspect.
In a fifth aspect, a computer program product is provided. When executed, the computer program product performs the method for identifying risk groups according to the first aspect.
The technical solutions provided by the embodiments of the present invention can bring the following benefits:
The association weights between communication numbers that executed network behaviors within the same time period and from the same IP address are calculated, and the numbers are clustered according to those association weights to determine groups. Because groups are determined by jointly considering data in multiple dimensions, such as IP address, request time, and number features, and risk groups are then identified from the number of risk numbers in each group, the embodiments determine groups more accurately and therefore identify risk groups more accurately.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for identifying risk groups provided by one embodiment of the present invention;
Fig. 2 is a schematic diagram related to the embodiment shown in Fig. 1;
Fig. 3 is a flowchart of a method for identifying risk groups provided by another embodiment of the present invention;
Fig. 4 is a schematic diagram of a group feature graph provided by one embodiment of the present invention;
Fig. 5 is a schematic diagram related to the embodiment shown in Fig. 3;
Fig. 6 is a block diagram of an apparatus for identifying risk groups provided by one embodiment of the present invention;
Fig. 7 is a block diagram of an electronic device provided by one embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the drawings.
The embodiments of the present invention provide a method, an apparatus, and an electronic device for identifying risk groups. The association weights between communication numbers that executed network behaviors within the same time period and from the same IP address are calculated, and the numbers are clustered according to those association weights to determine groups. Because groups are determined by jointly considering data in multiple dimensions, such as IP address, request time, and number features, and risk groups are then identified from the number of risk numbers in each group, groups are determined more accurately and risk groups are identified more accurately, which helps target subsequent marketing campaigns precisely.
In the method provided by the embodiments of the present invention, each step may be executed by an electronic device with data analysis and processing capability. Optionally, the electronic device is a server. The server may be a single server, a server cluster composed of several servers, or a cloud computing service center.
Referring to Fig. 1, which shows a flowchart of a method for identifying risk groups according to one embodiment of the present invention. The method may include the following steps:
Step 101: obtain the historical behavior data corresponding to communication numbers.
The historical behavior data includes multiple historical behavior records. The historical behavior data can be collected by an electronic device that has established a communication connection with mobile terminals. For example, to identify whether a risk group or risk numbers exist among all communication numbers corresponding to a specified application, the electronic device may be the background server of that specified application. Optionally, the historical behavior data is the behavior data within a preset time period, which can be set as needed; for example, the historical behavior data is the behavior data within the most recent 7 days.
Each historical behavior record includes: the IP address used by the communication number when executing a network behavior, and the request time at which the communication number executed the network behavior. The network behavior is executed by the mobile terminal corresponding to the communication number. Optionally, the network behavior includes at least one of a registration behavior and a transaction behavior. A registration behavior is the use of a communication number to register an account for an application program. A transaction behavior is the completion of a transaction through that registered account, for example claiming a coupon or a red packet; the embodiments of the present invention do not limit this.
In the embodiments of the present invention, a communication number is an identification number that an operator assigns to a user to identify the user and the mobile terminal. A communication number usually refers to a mobile directory number, that is, a mobile phone number. In other possible examples, the communication number is a number in an instant messaging (Instant Messaging) application.
For an example of historical behavior data, see Table 1 below.
Table 1
Time          IP address     Phone number
2017-06-01    10.10.10.10    13600000001
2017-06-01    10.10.10.10    13600000002
2017-06-01    10.10.10.10    13600000003
2017-06-02    11.11.11.11    13600000001
2017-06-02    11.11.11.11    13600000002
2017-06-02    11.11.11.11    13600000004
Step 102: add at least two communication numbers in the historical behavior data that share at least one similar network behavior to a number combination.
Similar network behaviors are network behaviors that use the same IP address and whose request times fall within the same time period.
The electronic device first detects the communication numbers with similar network behaviors in the historical behavior data, and then extracts number combinations from those communication numbers. A number combination may include all or only some of the communication numbers that share a similar network behavior; the embodiments of the present invention do not limit the number of communication numbers in a number combination.
Taking Table 1 as an example, when a number combination includes all communication numbers that share a similar network behavior, the number combinations extracted by the electronic device include (13600000001, 13600000002, 13600000003) and (13600000001, 13600000002, 13600000004). When a number combination includes only some of the communication numbers that share a similar network behavior, the number combinations extracted by the electronic device include (13600000001, 13600000002), (13600000001, 13600000003), (13600000002, 13600000003), and (13600000001, 13600000004).
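The extraction in step 102 can be sketched as follows, using the rows of the example table. The sketch assumes that "the same time period" means the same calendar day, which is the granularity of the Time column; the patent does not fix the period length. The function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Rows of the example table: (time, IP address, phone number).
RECORDS = [
    ("2017-06-01", "10.10.10.10", "13600000001"),
    ("2017-06-01", "10.10.10.10", "13600000002"),
    ("2017-06-01", "10.10.10.10", "13600000003"),
    ("2017-06-02", "11.11.11.11", "13600000001"),
    ("2017-06-02", "11.11.11.11", "13600000002"),
    ("2017-06-02", "11.11.11.11", "13600000004"),
]


def full_combinations(records):
    """One combination per (period, IP) bucket holding at least two numbers."""
    buckets = defaultdict(set)
    for period, ip, number in records:
        buckets[(period, ip)].add(number)
    return [tuple(sorted(nums)) for nums in buckets.values() if len(nums) >= 2]


def pair_combinations(records):
    """Every pair of numbers that shares at least one (period, IP) bucket."""
    pairs = set()
    for combo in full_combinations(records):
        pairs.update(combinations(combo, 2))
    return sorted(pairs)


full = full_combinations(RECORDS)
pairs = pair_combinations(RECORDS)
```

On these rows, `full` holds the two three-number combinations given in the text. `pairs` holds the four listed pairs plus (13600000002, 13600000004), which also shares the 2017-06-02 behavior although the text does not enumerate it.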
Step 103: calculate the association weight corresponding to each number combination.
The association weight of a number combination characterizes the degree of association between the communication numbers in that number combination, and the two are positively correlated. That is, the larger the association weight of a number combination, the stronger the association between its communication numbers; the smaller the association weight, the weaker the association. The detailed process of calculating the association weight corresponding to each number combination is introduced in the embodiments below.
Step 104: cluster all communication numbers in the number combinations according to the association weight corresponding to each number combination, to obtain at least one group.
Clustering is the process of dividing a set of physical or abstract objects into multiple classes made up of similar objects. In the embodiments of the present invention, clustering is the process of dividing all communication numbers included in the number combinations into multiple groups, where each group includes multiple communication numbers that are strongly associated with one another. The algorithm used to cluster all communication numbers included in the n number combinations may be the label propagation algorithm (Label Propagation Algorithm, LPA), an improved label propagation algorithm (Speaker-listener Label Propagation Algorithm, SLPA), or the HANP algorithm; the embodiments of the present invention do not limit this.
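A weighted label propagation pass over two-number combinations can be sketched as follows. This is a minimal, deterministic variant written for illustration, not the implementation used in the patent: nodes are visited in sorted order, ties are broken by the smallest label, and the round cap `max_rounds` is an assumption. The sample weights and the numbers ending in 05 and 06 are hypothetical.

```python
from collections import defaultdict


def label_propagation(weighted_pairs, max_rounds=20):
    """Cluster numbers from {(number_a, number_b): association_weight}."""
    neighbors = defaultdict(dict)
    for (a, b), weight in weighted_pairs.items():
        neighbors[a][b] = weight
        neighbors[b][a] = weight

    # Every node starts with its own unique label.
    labels = {node: node for node in neighbors}

    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbors):
            # Sum the edge weights behind each label carried by a neighbor.
            scores = defaultdict(float)
            for other, weight in neighbors[node].items():
                scores[labels[other]] += weight
            # Highest score wins; the smallest label breaks ties.
            best = min(scores, key=lambda label: (-scores[label], label))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break

    groups = defaultdict(list)
    for node, label in labels.items():
        groups[label].append(node)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical association weights for two-number combinations.
weighted_pairs = {
    ("13600000001", "13600000002"): 2.0,
    ("13600000001", "13600000003"): 1.0,
    ("13600000002", "13600000003"): 1.0,
    ("13600000001", "13600000004"): 1.0,
    ("13600000002", "13600000004"): 1.0,
    ("13600000005", "13600000006"): 3.0,
}
groups = label_propagation(weighted_pairs)
```

Standard label propagation usually breaks ties at random and visits nodes in random order; the fixed order here only makes the example reproducible.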
Step 105: identify risk groups according to the number of risk numbers in each group.
A risk number is a communication number recorded in the blacklist. Optionally, a risk number is a communication number with little or no call behavior; such a communication number can be called a "cat pool number". Optionally, a risk number is a communication number whose IP address, when executing a network behavior, is a risk IP.
Optionally, if the number of risk numbers in a group exceeds a preset threshold, the group is determined to be a risk group. The preset threshold can be determined from the number of communication numbers included in the group. Optionally, the electronic device sets the preset threshold to 60% of the number of communication numbers included in the group. In other possible examples, the preset threshold can be set manually. If the number of risk numbers among the communication numbers included in the group is less than or equal to the preset threshold, the group is determined to be a safe group. After a risk group is determined, the communication numbers in the risk group can be barred from participating when the internet platform later launches a marketing campaign, which avoids wasting the resources the campaign requires.
In other possible examples, if the number of safe numbers in a group is less than a specified threshold, the group is determined to be a risk group. A safe number can be a communication number recorded in a whitelist.
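The blacklist-based check in step 105 can be sketched as follows. The 60% ratio comes from the optional example in the text; the function names, the sample numbers, and the step that extends the blacklist with the remaining members of a risk group are illustrative assumptions based on the optional blacklist update described in the summary.

```python
def is_risk_group(group, blacklist, ratio=0.6):
    """True if the count of blacklisted numbers exceeds ratio * group size."""
    risk_count = sum(1 for number in group if number in blacklist)
    return risk_count > ratio * len(group)


def update_blacklist(group, blacklist):
    """Add every number of a risk group that the blacklist does not yet record."""
    blacklist.update(number for number in group if number not in blacklist)


# Hypothetical group of five numbers, four of which are already blacklisted.
group = ["n1", "n2", "n3", "n4", "n5"]
blacklist = {"n1", "n2", "n3", "n4"}

risky = is_risk_group(group, blacklist)
if risky:
    update_blacklist(group, blacklist)
```

A whitelist-based variant would follow the same shape, counting safe numbers and comparing the count against a specified threshold instead.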
Referring also to Fig. 2, which shows a schematic diagram related to the embodiment shown in Fig. 1. The electronic device obtains the historical behavior data for the most recent 7 days and, using a preset algorithm (such as the label propagation algorithm), analyzes the phone numbers that executed network behaviors within those 7 days and the IP addresses they used, to determine groups. The electronic device can then use the risk numbers recorded in the blacklist to determine whether the phone numbers included in a group are all risk numbers, and update the blacklist that records risk numbers and the blacklist that records risk IPs, for later queries. Here, k is zero or a positive integer.
In summary, in the method provided by the embodiments of the present invention, the association weights between communication numbers that executed network behaviors within the same time period and from the same IP address are calculated, and the numbers are clustered according to those association weights to determine groups. Because groups are determined by jointly considering data in multiple dimensions, such as IP address, request time, and number features, and risk groups are then identified from the number of risk numbers in each group, the embodiments determine groups more accurately and therefore identify risk groups more accurately.
Referring to Fig. 3, it shows a flow chart of the method for identifying a risk group according to one embodiment of the present invention. The method may include the following steps.
Step 301, obtain the historical behavior data corresponding to communicating numbers.
The historical behavior data include a plurality of historical behavior records. Each historical behavior record includes: the IP address used by a communicating number when executing a network behavior, and the request moment at which the communicating number executed the network behavior.
Step 302, add at least two communicating numbers in the historical behavior data that have at least one similar network behavior to one number combination.
Similar network behaviors are network behaviors that use the same IP address and whose request moments fall in the same period.
Step 303, calculate the characteristic value corresponding to each number combination, the characteristic value including at least one of the First Eigenvalue, the Second Eigenvalue and the third feature value.
The First Eigenvalue corresponding to a number combination characterizes the number of similar network behaviors that the communicating numbers in the number combination have.
Taking Table-1 as an example, the number of similar network behaviors that number combination (13600000001, 13600000002) has is 2, so the First Eigenvalue corresponding to number combination (13600000001, 13600000002) is 2; the number of similar network behaviors that number combination (13600000001, 13600000003) has is 1, so its First Eigenvalue is 1; the number of similar network behaviors that number combination (13600000002, 13600000003) has is 1, so its First Eigenvalue is 1; the number of similar network behaviors that number combination (13600000001, 13600000004) has is 1, so its First Eigenvalue is 1.
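Steps 302 and 303 for the First Eigenvalue can be sketched as follows. This is only an illustration under stated assumptions: "the same period" is modelled as a fixed-length time bucket (`period_seconds`), records are `(number, ip, request_time)` tuples, and the sample records are hypothetical rather than the contents of Table-1.

```python
from collections import defaultdict
from itertools import combinations


def count_similar_behaviours(records, period_seconds=3600):
    """Count, for each pair of numbers, how often they share an IP in the same period.

    records -- iterable of (number, ip, request_time_in_seconds) tuples
    Returns a dict mapping (number_a, number_b) to the First Eigenvalue of that pair.
    """
    # Group the numbers seen on the same IP address within the same time bucket.
    buckets = defaultdict(set)
    for number, ip, request_time in records:
        buckets[(ip, request_time // period_seconds)].add(number)

    # Every pair inside one bucket has one more similar network behaviour.
    counts = defaultdict(int)
    for numbers in buckets.values():
        for pair in combinations(sorted(numbers), 2):
            counts[pair] += 1
    return dict(counts)


# Hypothetical records: three numbers share one IP, three share another.
records = [
    ("13600000001", "10.0.0.1", 100),
    ("13600000002", "10.0.0.1", 200),
    ("13600000003", "10.0.0.1", 300),
    ("13600000001", "10.0.0.2", 4000),
    ("13600000002", "10.0.0.2", 4100),
    ("13600000004", "10.0.0.2", 4200),
]
first_values = count_similar_behaviours(records)
```

Each key of the returned dictionary is one number combination of two communicating numbers, and its value is the number of similar network behaviors the pair has.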
The Second Eigenvalue corresponding to a number combination characterizes the similarity between the communicating numbers in the number combination. The Second Eigenvalue is positively correlated with that similarity. That is, the larger the Second Eigenvalue corresponding to a number combination, the higher the similarity between the communicating numbers in the number combination; the smaller the Second Eigenvalue, the lower the similarity.
Optionally, the Second Eigenvalue corresponding to a number combination can be calculated through the following sub-steps:
Step 303a, for each number combination, obtain the number characteristic value corresponding to each communicating number in the number combination, the number characteristic value including at least one of a conversational nature value, a binding characteristic value and an activity characteristic value.
The conversational nature value is obtained by quantifying the call behavior corresponding to a communicating number. The call behavior corresponding to a communicating number includes the number of calls, the call duration and so on. The conversational nature value corresponding to a communicating number is positively correlated with both its number of calls and its call duration. The more calls a communicating number has, the higher its conversational nature value; the fewer calls, the lower its conversational nature value. The longer the call duration of a communicating number, the higher its conversational nature value; the shorter the call duration, the lower its conversational nature value.
Specifically, for each number combination, the electronic equipment obtains data such as the number of calls and the call duration corresponding to each communicating number in the number combination, quantifies those data, and thereby obtains the conversational nature value corresponding to each communicating number in the number combination. For example, number 13600000001 has 6 calls with a total call duration of 31 minutes, and the electronic equipment quantifies a conversational nature value of 0.8 for number 13600000001. For another example, number 13600000002 has 2 calls with a total call duration of 3 minutes, and the electronic equipment quantifies a conversational nature value of 0.1 for number 13600000002.
The binding characteristic value is obtained by quantifying the binding behavior corresponding to a communicating number. The binding behavior corresponding to a communicating number includes whether the communicating number is bound to an application program, the quantity of application programs bound to the communicating number, and so on. The binding characteristic value corresponding to a communicating number that is not bound to any application program should be less than the binding characteristic value corresponding to a communicating number that is bound to an application program. For a communicating number bound to application programs, the binding characteristic value is positively correlated with the quantity of application programs bound to the communicating number. The more application programs a communicating number is bound to, the higher its binding characteristic value; the fewer application programs, the lower its binding characteristic value.
Specifically, for each number combination, the electronic equipment obtains data such as whether each communicating number in the number combination is bound to an application program and the quantity of bound application programs, quantifies those data, and thereby obtains the binding characteristic value corresponding to each communicating number in the number combination. For example, number 13600000001 is bound to 13 application programs, and the electronic equipment quantifies a binding characteristic value of 0.7 for number 13600000001. For another example, number 13600000002 is bound to 2 application programs, and the electronic equipment quantifies a binding characteristic value of 0.1 for number 13600000002.
The activity characteristic value is obtained by quantifying the activity level corresponding to the application programs bound to a communicating number. The activity level corresponding to the application programs bound to a communicating number is positively correlated with the activity characteristic value. That is, the greater the activity level corresponding to the application programs bound to a communicating number, the larger the activity characteristic value corresponding to that communicating number; the smaller the activity level, the smaller the activity characteristic value.
The activity level corresponding to an application program bound to a communicating number can be measured by the number of times the user logs in to the client of the application program: the more times the user logs in, the greater the activity level. When the application program is a social application program, the activity level can also be measured by the number of sessions between the user and other users through the client of the application program: the more sessions, the greater the activity level. When the application program is a shopping application program, the activity level can also be measured by the number of purchases the user makes through the client of the application program: the more purchases, the greater the activity level. The embodiment of the present invention does not limit the manner of measuring the activity level corresponding to an application program bound to a communicating number.
Specifically, for each number combination, the electronic equipment obtains data such as the activity level of the application programs bound to each communicating number in the number combination, quantifies those data, and thereby obtains the activity characteristic value corresponding to each communicating number in the number combination. For example, the electronic equipment quantifies an activity characteristic value of 0.9 for number 13600000001. For another example, the electronic equipment quantifies an activity characteristic value of 0.2 for number 13600000002.
Step 303b, calculate the Second Eigenvalue corresponding to the number combination according to the number characteristic values corresponding to the communicating numbers in the number combination.
Optionally, the electronic equipment calculates the similarity between the communicating numbers in the number combination according to their corresponding number characteristic values, and takes it as the Second Eigenvalue corresponding to the number combination. The algorithm used to calculate the similarity can be Euclidean distance (Euclidean metric), Jaccard distance, cosine similarity, etc., which is not limited in the embodiment of the present invention.
Taking cosine similarity as the algorithm, for number combination (13600000001, 13600000002), the conversational nature value, binding characteristic value and activity characteristic value corresponding to number 13600000001 are 0.8, 0.7 and 0.9 respectively, and those corresponding to number 13600000002 are 0.1, 0.1 and 0.2 respectively. The Second Eigenvalue corresponding to number combination (13600000001, 13600000002) is then computed from the two vectors (0.8, 0.7, 0.9) and (0.1, 0.1, 0.2).
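For reference, a textbook cosine similarity over the two number characteristic vectors can be sketched as below. The function name `cosine_similarity` is an assumption, and the source's own formula is not reproduced in this text, so the value used in the source's later worked example may have been obtained differently from what this standard definition returns.

```python
import math


def cosine_similarity(a, b):
    """Standard cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as not similar (assumed convention).
        return 0.0
    return dot / (norm_a * norm_b)


# (conversational nature value, binding characteristic value, activity characteristic value)
features_1 = (0.8, 0.7, 0.9)  # number 13600000001
features_2 = (0.1, 0.1, 0.2)  # number 13600000002
similarity = cosine_similarity(features_1, features_2)
```

Cosine similarity compares only the direction of the two vectors, so two numbers with proportionally similar profiles score close to 1 even when their absolute values differ; Euclidean distance would behave differently in that case.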
The third feature value corresponding to a number combination characterizes the type information of the IP addresses used by the communicating numbers in the number combination each time they execute a similar network behavior. Optionally, the type information of an IP address includes whether the IP address is an IP address of a specified type, the usage quantity of IP addresses of the specified type, and so on. The IP address of the specified type can be preset; for example, the IP address of the specified type is a risk IP. A risk IP can be a proxy IP or a gang IP.
Optionally, the third feature value corresponding to a number combination can be calculated through the following sub-steps.
Step 303c, for each number combination, obtain the type of the IP address used by the communicating numbers in the number combination each time they execute a similar network behavior.
Step 303d, determine the third feature value corresponding to the number combination according to the usage quantity of IP addresses of the specified type.
Optionally, the electronic equipment directly takes the usage quantity of IP addresses of the specified type as the third feature value corresponding to the number combination. For example, the IP addresses used by number combination (13600000001, 13600000002) when executing similar network behaviors are IP address 1 and IP address 2, where IP address 1 is a proxy IP and IP address 2 is a gang IP. Both IP address 1 and IP address 2 are risk IPs, so the electronic equipment determines the third feature value corresponding to number combination (13600000001, 13600000002) to be 2.
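Steps 303c and 303d can be sketched as a simple count. The function name `third_feature_value` is hypothetical, and the sketch assumes that "usage quantity" means one count per similar network behavior executed over a risk IP; the description does not say whether repeated use of the same IP is counted once or several times.

```python
def third_feature_value(ips_used, risk_ips):
    """Count the similar network behaviours that were executed over a risk IP.

    ips_used -- list with one IP address per similar network behaviour of the combination
    risk_ips -- set of IP addresses of the specified type (for example proxy or gang IPs)
    """
    return sum(1 for ip in ips_used if ip in risk_ips)


# The combination used IP address 1 and IP address 2, both of which are risk IPs.
value = third_feature_value(["IP address 1", "IP address 2"], {"IP address 1", "IP address 2"})
```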
Step 304, determine the associated weight corresponding to each number combination according to the characteristic value corresponding to that number combination.
Optionally, the First Eigenvalue, the Second Eigenvalue and the third feature value corresponding to each number combination are summed to obtain the associated weight corresponding to that number combination. Taking number combination (13600000001, 13600000002) as an example, its First Eigenvalue, Second Eigenvalue and third feature value are 2, 0.46 and 2 respectively, so the associated weight corresponding to number combination (13600000001, 13600000002) = 2 + 0.46 + 2 = 4.46.
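The summation in step 304 can be written as a small helper. The name `associated_weight` and the default of 0.0 for a characteristic value that was not calculated are assumptions; the description only states that the characteristic value includes at least one of the three values and that they are summed.

```python
def associated_weight(first=0.0, second=0.0, third=0.0):
    """Sum the available characteristic values of one number combination.

    A characteristic value that was not calculated is left at 0.0 (assumed convention).
    """
    return first + second + third


# Values from the worked example: First Eigenvalue 2, Second Eigenvalue 0.46, third feature value 2.
weight = associated_weight(first=2, second=0.46, third=2)
```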
Step 305, when each number combination includes two communicating numbers, construct a population characteristic figure.
A node in the population characteristic figure represents one communicating number included in the number combinations. The line between two connected nodes in the population characteristic figure represents the associated weight corresponding to the number combination formed by the communicating numbers of those two nodes.
With reference to Fig. 4, which shows a schematic diagram of the population characteristic figure provided by one embodiment of the present invention. Node 1 to node 6 respectively represent phone number 1 to phone number 6. The line between node 1 and node 2 indicates that the associated weight corresponding to the number combination formed by phone number 1 and phone number 2 is 4.46, and the meaning of the lines between other nodes follows by analogy.
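One way to hold the population characteristic figure in memory is an undirected weighted adjacency map, sketched below. The function name `build_graph` and the dictionary layout are assumptions; the two sample weights 4.46 and 2.15 are the ones the description gives for node 1.

```python
from collections import defaultdict


def build_graph(pair_weights):
    """Build an undirected weighted graph from two-number combinations.

    pair_weights -- dict mapping (number_a, number_b) to the associated weight
    Returns a dict: node -> {neighbour: associated weight}.
    """
    graph = defaultdict(dict)
    for (a, b), weight in pair_weights.items():
        # The line between two nodes carries the same weight in both directions.
        graph[a][b] = weight
        graph[b][a] = weight
    return dict(graph)


graph = build_graph({
    ("phone number 1", "phone number 2"): 4.46,
    ("phone number 1", "phone number 3"): 2.15,
})
```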
Step 306, add a different label for each node in the population characteristic figure.
Taking the population characteristic figure shown in Fig. 4 as an example, the labels the electronic equipment adds for node 1 to node 6 are group 1 to group 6 respectively.
Step 307, execute at least one round of an update process on the labels of the nodes in the population characteristic figure. In each round of the update process, for each node of the population characteristic figure, the label of the node is updated according to the labels of the other nodes connected to the node.
The number of rounds of the update process can be determined according to the quantity of nodes included in the population characteristic figure. The more nodes the population characteristic figure includes, the more rounds of the update process; the fewer nodes, the fewer rounds. The other nodes connected to a node are the nodes that have a line to that node.
The process of step 307 can be referred to as "label propagation". Optionally, for each node of the population characteristic figure, the label of the node is updated according to the associated weights corresponding to the number combinations formed by the communicating number of the node and the communicating number of each adjacent node. Specifically, the electronic equipment selects any node as the starting point of label propagation, then obtains the associated weight corresponding to the number combination formed by the communicating number of the node and that of each adjacent node, and updates the label of the node to the label of the other node included in the number combination with the largest associated weight.
For the population characteristic figure shown in Fig. 4, the associated weight corresponding to the number combination formed by the communicating numbers of node 1 and node 2 is 4.46, the associated weight corresponding to the number combination formed by the communicating numbers of node 1 and node 3 is 2.15, and the associated weight corresponding to the number combination formed by the communicating numbers of node 1 and a third adjacent node is 1.75. Among these, the associated weight corresponding to the number combination formed by the communicating numbers of node 1 and node 2 is the largest, so the electronic equipment updates the label of node 1 from group 1 to group 2.
Step 308, when the at least one round of the update process is completed, add the communicating numbers corresponding to the nodes with the same label in the population characteristic figure to the same group.
Optionally, when the label of every node in the population characteristic figure no longer changes, the at least one round of the update process is completed, and the electronic equipment then adds the communicating numbers corresponding to the nodes with the same label to the same group.
For the population characteristic figure shown in Fig. 4, when the at least one round of the update process is completed, the labels of node 1, node 2, node 4 and node 5 are all group 5, and the labels of node 3 and node 6 are both group 6. The electronic equipment then adds phone number 1, phone number 2, phone number 4 and phone number 5, which correspond to node 1, node 2, node 4 and node 5, to one group, and adds phone number 3 and phone number 6, which correspond to node 3 and node 6, to another group.
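Steps 306 to 308 can be sketched as follows. This follows the variant described for step 307, where a node takes the label of the single neighbour with the largest associated weight, rather than the weighted majority vote used in many label propagation implementations. The fixed visiting order, the `max_rounds` cap and the sample graph (which is not the graph of Fig. 4) are assumptions.

```python
def propagate_labels(graph, max_rounds=20):
    """Cluster nodes by repeatedly copying the label of the heaviest neighbour.

    graph -- dict: node -> {neighbour: associated weight}
    Returns a list of sets, one set of nodes per group.
    """
    # Step 306: every node starts with its own label.
    labels = {node: f"group {node}" for node in graph}

    # Step 307: update rounds, stopping early once no label changes.
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            neighbours = graph[node]
            if not neighbours:
                continue
            heaviest = max(neighbours, key=neighbours.get)
            if labels[node] != labels[heaviest]:
                labels[node] = labels[heaviest]
                changed = True
        if not changed:
            break

    # Step 308: nodes that ended with the same label form one group.
    groups = {}
    for node, label in labels.items():
        groups.setdefault(label, set()).add(node)
    return list(groups.values())


# Hypothetical four-node graph; only 4.46 and 2.15 appear in the description.
sample_graph = {
    1: {2: 4.46, 3: 2.15},
    2: {1: 4.46},
    3: {1: 2.15, 4: 3.0},
    4: {3: 3.0},
}
groups = propagate_labels(sample_graph)
```

Labels are updated in place within a round, so a node can already see the new label of a neighbour visited earlier in the same round. Updating all nodes simultaneously from a snapshot would let two mutually heaviest neighbours swap labels indefinitely, which is why the in-place order was chosen for this sketch.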
Step 309, identify risk groups according to the quantity of risk numbers in each group.
A risk number is a communicating number recorded in the blacklist.
Step 310, add the communicating numbers in the risk group that are not recorded in the blacklist to the blacklist.
Because a group consists of communicating numbers with a high degree of association, when most of the communicating numbers in a group are risk numbers, all communicating numbers in the group are considered to be risk numbers, and the electronic equipment therefore also adds the communicating numbers in the risk group that are not recorded in the blacklist to the blacklist.
Optionally, the electronic equipment determines the IP addresses used by the communicating numbers in the group when executing similar network behaviors to be risk IPs, and adds those risk IPs to the blacklist for recording risk IPs.
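Step 310 and the optional IP update can be sketched as set operations. The function name `update_blacklists`, the `ips_by_number` mapping and the sample values are assumptions made for illustration.

```python
def update_blacklists(risk_group, ips_by_number, number_blacklist, ip_blacklist):
    """Add the unlisted numbers of a risk group, and the IPs they used, to the blacklists.

    risk_group       -- set of communicating numbers in a group identified as a risk group
    ips_by_number    -- dict: number -> set of IPs used for similar network behaviours
    number_blacklist -- set of risk numbers, updated in place
    ip_blacklist     -- set of risk IPs, updated in place
    Returns the set of numbers that were newly added.
    """
    newly_added = set(risk_group) - number_blacklist
    number_blacklist |= newly_added
    for number in risk_group:
        ip_blacklist |= ips_by_number.get(number, set())
    return newly_added


# Hypothetical data: number "c" is in the risk group but not yet blacklisted.
number_blacklist = {"a", "b"}
ip_blacklist = set()
added = update_blacklists(
    {"a", "b", "c"},
    {"a": {"IP address 1"}, "c": {"IP address 2"}},
    number_blacklist,
    ip_blacklist,
)
```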
With reference to Fig. 5, which shows a schematic diagram related to the embodiment shown in Fig. 3. Phone number 1, phone number 2 and phone number 3 executed network behavior using IP address 1, and phone number 1, phone number 2 and phone number 4 executed network behavior using IP address 2. The number combination (phone number 1, phone number 2) is extracted from these historical behavior records. The following are then calculated separately: the number of similar network behaviors that phone number 1 and phone number 2 have (namely the First Eigenvalue), the similarity between phone number 1 and phone number 2 (namely the Second Eigenvalue), and the IP features corresponding to IP address 1 and IP address 2 (namely the third feature value). The First Eigenvalue, the Second Eigenvalue and the third feature value are combined to obtain the associated weight corresponding to the number combination (phone number 1, phone number 2). Clustering is then performed using a preset algorithm (such as a label propagation algorithm) to obtain at least one group, and risk groups are finally determined according to the risk numbers recorded in the blacklist. Further, the communicating numbers in a risk group that are not marked as risk numbers are added to the blacklist, and the IP addresses used by the risk numbers when executing similar network behaviors are added to the blacklist for recording risk IPs.
In addition, when the quantity of communicating numbers included in a number combination is greater than 2, clustering all communicating numbers in the number combinations according to the associated weight corresponding to each number combination to obtain at least one group may include the following two possible implementations.
In one possible implementation, if the associated weight corresponding to a number combination is greater than a first threshold, the communicating numbers included in that number combination are added to the same group. The first threshold can be determined according to the precision required for determining groups. If the required precision is higher, the first threshold is larger; if the required precision is lower, the first threshold is smaller. For example, the first threshold is 7; that is, if the associated weight corresponding to a number combination is greater than 7, the electronic equipment adds the communicating numbers included in that number combination to the same group.
In another possible implementation, if the associated weights corresponding to multiple number combinations are all greater than a second threshold, and the quantity of identical communicating numbers shared by any two of the multiple number combinations is greater than a third threshold, the communicating numbers included in the multiple number combinations are added to the same group. The second threshold and the third threshold can also be determined according to the precision required for determining groups. For example, the second threshold is 5, the third threshold is 6, the associated weights corresponding to number combination 1 and number combination 2 are 5.16 and 5.27 respectively, and number combination 1 and number combination 2 have 9 identical communicating numbers; the electronic equipment then adds the communicating numbers included in number combination 1 and the communicating numbers included in number combination 2 to the same group.
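The two threshold-based implementations can be sketched as below. The function names `group_single` and `merge_combos`, the use of `None` for "no group formed" and the integer stand-ins for communicating numbers are assumptions; the thresholds 7, 5 and 6 and the weights 5.16 and 5.27 are taken from the examples above.

```python
from itertools import combinations


def group_single(combo, weight, first_threshold):
    """First implementation: one combination forms a group if its weight is large enough."""
    return set(combo) if weight > first_threshold else None


def merge_combos(combos, weights, second_threshold, third_threshold):
    """Second implementation: merge several combinations into one group.

    All weights must exceed the second threshold, and every pair of combinations
    must share more than third_threshold identical communicating numbers.
    """
    if not all(weight > second_threshold for weight in weights):
        return None
    for a, b in combinations(combos, 2):
        if len(set(a) & set(b)) <= third_threshold:
            return None
    return set().union(*(set(combo) for combo in combos))


# Hypothetical combinations that share 9 numbers (3 to 11).
combo_1 = set(range(0, 12))
combo_2 = set(range(3, 15))
merged = merge_combos([combo_1, combo_2], [5.16, 5.27], second_threshold=5, third_threshold=6)
```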
In conclusion method provided in an embodiment of the present invention, by calculating in the same period and using same IP address The associated weights between the communicating number of network behavior are executed, and are clustered according to above-mentioned associated weights, and then determine group Body determines group due to comprehensive consideration IP address, the data of request multiple dimensions such as moment and number feature, and further Risk group is identified according to the quantity of group's risk number, therefore the embodiment of the present invention determines that the accuracy rate of group is higher, Identify that the accuracy rate of risk group is also higher.
It also is updated to blacklist by the way that risk number will be not labeled as in risk group, subsequent determining risk groups can be improved The accuracy rate of body.
The following are apparatus embodiments of the present invention, which can be used to execute the method embodiments of the present invention. For details not disclosed in the apparatus embodiments, please refer to the method embodiments of the present invention.
Referring to Fig. 6, it shows a block diagram of the device for identifying a risk group provided by one embodiment of the present invention. The device has the function of implementing the above method examples, and the function can be implemented by hardware, or by hardware executing corresponding software. The device may include: a data acquisition module 601, a combination extraction module 602, a weight calculation module 603, a cluster module 604 and a group determining module 605.
The data acquisition module 601 is used for obtaining the historical behavior data corresponding to communicating numbers. The historical behavior data include a plurality of historical behavior records, and each historical behavior record includes: the IP address used by a communicating number when executing a network behavior, and the request moment at which the communicating number executed the network behavior.
The combination extraction module 602 is used for adding at least two communicating numbers in the historical behavior data that have at least one similar network behavior to one number combination, where similar network behaviors are network behaviors that use the same IP address and whose request moments fall in the same period.
The weight calculation module 603 is used for calculating the associated weight corresponding to each number combination, where the associated weight corresponding to a number combination characterizes the degree of association between the communicating numbers in the number combination.
The cluster module 604 is used for clustering all communicating numbers in the number combinations according to the associated weight corresponding to each number combination, to obtain at least one group.
The group determining module 605 is used for identifying risk groups according to the quantity of risk numbers in each group.
In an alternative embodiment provided based on the embodiment shown in Fig. 6, the weight calculation module 603 includes: a first computing unit and a second computing unit (not shown).
The first computing unit is used for calculating the characteristic value corresponding to each number combination, the characteristic value including at least one of the First Eigenvalue, the Second Eigenvalue and the third feature value. The First Eigenvalue corresponding to a number combination characterizes the number of similar network behaviors that the communicating numbers in the number combination have; the Second Eigenvalue corresponding to a number combination characterizes the similarity between the communicating numbers in the number combination; and the third feature value corresponding to a number combination characterizes the type information of the IP addresses used by the communicating numbers in the number combination each time they execute a similar network behavior.
The second computing unit is used for determining the associated weight corresponding to each number combination according to the characteristic value corresponding to that number combination.
In another alternative embodiment provided based on the embodiment shown in Fig. 6, when the characteristic value includes the Second Eigenvalue, the first computing unit is used for:
for each number combination, obtaining the number characteristic value corresponding to each communicating number in the number combination, the number characteristic value including at least one of a conversational nature value, a binding characteristic value and an activity characteristic value, where the conversational nature value is obtained by quantifying the call behavior corresponding to the communicating number, the binding characteristic value is obtained by quantifying the binding behavior corresponding to the communicating number, and the activity characteristic value is obtained by quantifying the activity level corresponding to the application programs bound to the communicating number;
calculating the Second Eigenvalue corresponding to the number combination according to the number characteristic values corresponding to the communicating numbers in the number combination.
In another alternative embodiment provided based on the embodiment shown in Fig. 6, when the characteristic value includes the third feature value, the first computing unit is used for:
for each number combination, obtaining the type of the IP address used by the communicating numbers in the number combination each time they execute a similar network behavior;
determining the third feature value corresponding to the number combination according to the usage quantity of IP addresses of the specified type.
In another alternative embodiment provided based on the embodiment shown in Fig. 6, when each number combination includes two communicating numbers, the cluster module 604 includes: a characteristic figure construction unit, a label adding unit, a label updating unit and a first cluster unit (not shown).
The characteristic figure construction unit is used for constructing a population characteristic figure. A node in the population characteristic figure represents one communicating number included in the number combinations, and the line between two connected nodes in the population characteristic figure represents the associated weight corresponding to the number combination formed by the communicating numbers of those two nodes.
The label adding unit is used for adding a different label for each node in the population characteristic figure.
The label updating unit is used for executing at least one round of an update process on the labels of the nodes in the population characteristic figure. In each round of the update process, for each node of the population characteristic figure, the label of the node is updated according to the labels of the other nodes connected to the node.
The first cluster unit is used for adding, when the at least one round of the update process is completed, the communicating numbers corresponding to the nodes with the same label in the population characteristic figure to the same group.
In another alternative embodiment provided based on the embodiment shown in Fig. 6, when the quantity of communicating numbers in a number combination is greater than 2, the cluster module 604 includes: a second cluster unit, and/or, a third cluster unit (not shown).
The second cluster unit is used for adding, when the associated weight corresponding to a number combination is greater than a first threshold, the communicating numbers included in that number combination to the same group.
The third cluster unit is used for adding the communicating numbers in multiple number combinations to the same group if the associated weights corresponding to the multiple number combinations are all greater than a second threshold and the quantity of identical communicating numbers shared by any two of the multiple number combinations is greater than a third threshold.
In another alternative embodiment provided based on the embodiment shown in Fig. 6, the device further includes: a number adding module (not shown).
The number adding module is used for adding the communicating numbers in the risk group that are not recorded in the blacklist to the blacklist.
In conclusion, in the device provided in the embodiment of the present invention, the associated weights between communicating numbers that execute network behavior in the same period using the same IP address are calculated, clustering is performed according to those associated weights to determine groups, and risk groups are then identified according to the quantity of risk numbers in each group. Because data of multiple dimensions, such as IP address, request moment and number features, are considered together when determining groups, the accuracy of determining groups is higher, and the accuracy of identifying risk groups is also higher.
Referring to FIG. 7, the structural block diagram of the electronic equipment 700 provided it illustrates another embodiment of the present invention.It should The method for the identification risk group that electronic equipment 700 is used to implement to provide in above-described embodiment.
The electronic equipment 700 includes 702 He of central processing unit (CPU) 701 including random access memory (RAM) The system storage 704 of read-only memory (ROM) 703, and connection system storage 704 and central processing unit 701 be System bus 705.The electronic equipment 700 further include help computer in each device between transmit information it is basic input/ Output system (I/O system) 706, and for the great Rong of storage program area 713, application program 714 and other program modules 715 Amount storage electronic equipment 707.
The basic input/output 706 includes display 708 for showing information and inputs letter for user The input electronic equipment 709 of such as mouse, keyboard etc of breath.Wherein the display 708 and input electronic equipment 709 are all logical It crosses and is connected to the input and output controller 710 of system bus 705 and is connected to central processing unit 701.The basic input/output System 706 can also include input and output controller 710 for receiving and handling from keyboard, mouse or electronic touch pen Etc. the input of other multiple electronic equipments.Similarly, input and output controller 710 also provide output to display screen, printer or Other kinds of output electronic equipment.
The mass storage device 707 is connected to the central processing unit 701 through a mass storage controller (not shown) that is connected to the system bus 705. The mass storage device 707 and its associated computer-readable medium provide non-volatile storage for the electronic equipment 700. That is, the mass storage device 707 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
Without loss of generality, the computer-readable medium may include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, tape cassettes, magnetic tape, disk storage, or other magnetic storage devices. Of course, a person skilled in the art will appreciate that computer storage media are not limited to those listed above. The system memory 704 and the mass storage device 707 described above may be collectively referred to as the memory.
According to various embodiments of the present invention, the electronic equipment 700 may also operate through a remote computer connected to a network such as the Internet. That is, the electronic equipment 700 may be connected to the network 712 through a network interface unit 711 connected to the system bus 705; in other words, the network interface unit 711 may also be used to connect to other types of networks or remote computer systems (not shown).
The memory stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the above method for identifying a risk group.
In an exemplary embodiment, a computer-readable storage medium is further provided. The storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor of electronic equipment to implement the method for identifying a risk group in the above method embodiments.
Optionally, the above computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
It should be understood that "multiple" as used herein means two or more. "And/or" describes the association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the objects before and after it. The words "first", "second", and similar terms used herein do not denote any order, quantity, or importance, and are only used to distinguish different components.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the merits of the embodiments.
The above are merely exemplary embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A method for identifying a risk group, characterized in that the method comprises:
obtaining historical behavior data corresponding to communicating numbers, wherein the historical behavior data includes multiple historical behavior records, and each historical behavior record includes: the Internet Protocol (IP) address used by the communicating number when executing a network behavior, and the request moment at which the communicating number executed the network behavior;
adding at least two communicating numbers in the historical behavior data that have at least one similar network behavior to one number combination, wherein similar network behaviors are network behaviors that use the same IP address and whose request moments fall within the same period;
calculating the associated weight corresponding to each number combination, wherein the associated weight corresponding to a number combination characterizes the degree of association between the communicating numbers in the number combination;
clustering all communicating numbers in the number combinations according to the associated weight corresponding to each number combination, to obtain at least one group;
identifying a risk group according to the quantity of risk numbers in the group.
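The combination step of claim 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout `(number, ip, request_ts)`, the fixed-length window `period_seconds` used to decide whether two request moments fall in "the same period", and the function name are all hypothetical and not taken from the patent. The sketch counts at most one similar network behavior per pair of numbers in each (IP address, window) bucket.

```python
from collections import defaultdict
from itertools import combinations


def build_number_combinations(records, period_seconds=3600):
    """Return {frozenset({a, b}): number of similar network behaviors}.

    records: iterable of (number, ip, request_ts) tuples (assumed layout).
    period_seconds: assumed fixed window standing in for "the same period".
    """
    # Bucket communicating numbers by (IP address, time window).
    buckets = defaultdict(set)
    for number, ip, request_ts in records:
        buckets[(ip, request_ts // period_seconds)].add(number)

    # Each pair of numbers sharing a bucket has one similar network behavior.
    pair_counts = defaultdict(int)
    for numbers in buckets.values():
        for a, b in combinations(sorted(numbers), 2):
            pair_counts[frozenset((a, b))] += 1
    return dict(pair_counts)


records = [
    ("A", "10.0.0.1", 100),
    ("B", "10.0.0.1", 200),
    ("A", "10.0.0.2", 4000),
    ("B", "10.0.0.2", 4100),
    ("C", "10.0.0.9", 150),
]
combos = build_number_combinations(records)
```

Here numbers A and B share two (IP address, window) buckets, so they form one two-number combination, while C shares no bucket with any other number and joins no combination.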
2. The method according to claim 1, characterized in that calculating the associated weight corresponding to each number combination comprises:
calculating the characteristic value corresponding to each number combination, wherein the characteristic value includes at least one of a first characteristic value, a second characteristic value, and a third characteristic value; the first characteristic value corresponding to a number combination characterizes the number of similar network behaviors that the communicating numbers in the number combination have, the second characteristic value corresponding to the number combination characterizes the similarity between the communicating numbers in the number combination, and the third characteristic value corresponding to the number combination characterizes type information of the IP address used by the communicating numbers in the number combination each time they execute the similar network behavior;
determining the associated weight corresponding to each number combination according to the characteristic value corresponding to each number combination.
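The claim does not fix how the characteristic values are combined into an associated weight. One simple possibility, shown here purely as an assumption, is a weighted sum over whichever characteristic values are available; the coefficient values and the dictionary keys `first`, `second`, and `third` are hypothetical.

```python
def associated_weight(features, coefficients=None):
    """Combine the available characteristic values into one associated weight.

    features: dict with any of the keys "first", "second", "third"
              (the claim requires at least one of them).
    coefficients: assumed per-feature coefficients; not specified by the patent.
    """
    if coefficients is None:
        coefficients = {"first": 0.5, "second": 0.3, "third": 0.2}
    # Weighted sum over the characteristic values that are present.
    return sum(coefficients[name] * value for name, value in features.items())


weight = associated_weight({"first": 1.0, "second": 0.5})
```

In practice the characteristic values would likely need to be normalized to a comparable scale before being combined.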
3. The method according to claim 2, characterized in that, when the characteristic value includes the second characteristic value, calculating the characteristic value corresponding to each number combination comprises:
for each number combination, obtaining the number characteristic values corresponding to the communicating numbers in the number combination, wherein the number characteristic values include at least one of a call characteristic value, a binding characteristic value, and an activity characteristic value; the call characteristic value is obtained by quantifying the call behavior corresponding to the communicating number, the binding characteristic value is obtained by quantifying the binding behavior corresponding to the communicating number, and the activity characteristic value is obtained by quantifying the activity of the application programs bound to the communicating number;
calculating the second characteristic value corresponding to the number combination according to the number characteristic values corresponding to the communicating numbers in the number combination.
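The claim leaves the similarity measure open. As a hedged illustration, the second characteristic value of a two-number combination could be the cosine similarity of the two numbers' characteristic vectors; cosine similarity and the vector layout `(call, binding, activity)` are assumptions made for this sketch, not the patent's stated method.

```python
import math


def second_characteristic_value(u, v):
    """Cosine similarity between two number characteristic vectors.

    u, v: equal-length sequences such as (call, binding, activity) values
          (assumed layout). Returns 0.0 if either vector is all zeros.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


similarity = second_characteristic_value((1, 0, 0), (1, 0, 0))
```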
4. The method according to claim 2, characterized in that, when the characteristic value includes the third characteristic value, calculating the characteristic value corresponding to each number combination comprises:
for each number combination, obtaining the type of the IP address used by the communicating numbers in the number combination each time they execute the similar network behavior;
determining the third characteristic value corresponding to the number combination according to the number of times an IP address of a specified type is used.
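A minimal sketch of this counting step follows. It assumes the IP type of each similar network behavior is already known as a string label; the labels `"proxy"` and `"home"` and the choice of using the raw count as the third characteristic value are hypothetical.

```python
def third_characteristic_value(ip_types, specified_type="proxy"):
    """Count how often an IP address of the specified type was used.

    ip_types: one IP type label per similar network behavior of the
              number combination (labels are assumed, e.g. "proxy").
    """
    return sum(1 for ip_type in ip_types if ip_type == specified_type)


value = third_characteristic_value(["proxy", "home", "proxy"])
```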
5. The method according to any one of claims 1 to 4, characterized in that, when the number combination includes two communicating numbers, clustering all communicating numbers in the number combinations according to the associated weight corresponding to each number combination to obtain at least one group comprises:
constructing a group feature graph, wherein each node in the group feature graph represents one communicating number included in the number combinations, and the edge between two connected nodes in the group feature graph represents the associated weight corresponding to the number combination formed by the communicating numbers corresponding to the two nodes;
adding a different label to each node in the group feature graph;
executing at least one round of an update process on the labels of the nodes in the group feature graph, wherein, in each round of the update process, for each node of the group feature graph, the label of the node is updated according to the labels of the other nodes connected to the node;
when the at least one round of the update process is completed, adding the communicating numbers corresponding to nodes with the same label in the group feature graph to the same group.
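The label update described in this claim resembles weighted label propagation. The sketch below is one possible reading, not the patent's definitive procedure: each node adopts the label with the largest total associated weight among its neighbors, ties are broken by the smallest label, nodes are visited in sorted order, and iteration stops when no label changes or after `max_rounds`. All of these choices, and the function names, are assumptions.

```python
from collections import defaultdict


def propagate_labels(weights, max_rounds=10):
    """weights: {frozenset({a, b}): associated weight}. Returns {node: label}."""
    # Build the weighted graph: one node per number, one edge per combination.
    neighbors = defaultdict(dict)
    for pair, weight in weights.items():
        a, b = tuple(pair)
        neighbors[a][b] = weight
        neighbors[b][a] = weight

    labels = {node: node for node in neighbors}  # a different label per node
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbors):
            score = defaultdict(float)
            for other, weight in neighbors[node].items():
                score[labels[other]] += weight
            # Largest total weight wins; ties go to the smallest label.
            best = min(score, key=lambda label: (-score[label], label))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels


def groups_from_labels(labels):
    """Collect nodes that share a label into groups (sorted for stable output)."""
    groups = defaultdict(set)
    for node, label in labels.items():
        groups[label].add(node)
    return sorted(sorted(group) for group in groups.values())


weights = {
    frozenset(("A", "B")): 0.9,
    frozenset(("B", "C")): 0.8,
    frozenset(("D", "E")): 0.7,
}
groups = groups_from_labels(propagate_labels(weights))
```

With these sample weights, A, B, and C converge to one label and D and E to another, giving two groups.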
6. The method according to any one of claims 1 to 4, characterized in that, when the quantity of communicating numbers in the number combination is greater than 2, clustering all communicating numbers in the number combinations according to the associated weight corresponding to each number combination to obtain at least one group comprises:
if the associated weight corresponding to one number combination is greater than a first threshold, adding the communicating numbers in the one number combination to the same group;
and/or
if the associated weights corresponding to multiple number combinations are all greater than a second threshold, and the quantity of identical communicating numbers shared by any two number combinations among the multiple number combinations is greater than a third threshold, adding the communicating numbers in the multiple number combinations to the same group.
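The two threshold rules can be sketched as follows. This is a simplified illustration: the second rule is applied only to two combinations at a time, whereas the claim covers any set of combinations whose members pairwise satisfy the overlap condition. The function name, the input layout, and the final step that drops groups contained in a larger group are assumptions.

```python
def cluster_by_thresholds(combos, first, second, third):
    """combos: list of (set of numbers, associated weight) pairs (assumed layout)."""
    groups = []
    # Rule 1: one combination whose weight exceeds the first threshold.
    for numbers, weight in combos:
        if weight > first:
            groups.append(set(numbers))
    # Rule 2, restricted here to pairs of combinations: both weights exceed
    # the second threshold and the shared numbers exceed the third threshold.
    strong = [set(numbers) for numbers, weight in combos if weight > second]
    for i, left in enumerate(strong):
        for right in strong[i + 1:]:
            if len(left & right) > third:
                groups.append(left | right)
    # Drop any group that is strictly contained in a larger group.
    return [g for g in groups if not any(g < other for other in groups)]


combos = [
    ({"A", "B", "C"}, 0.9),
    ({"B", "C", "D"}, 0.6),
    ({"E", "F", "G"}, 0.2),
]
groups = cluster_by_thresholds(combos, first=0.8, second=0.5, third=1)
```

In this example the first two combinations share two numbers, so they merge into a single group of A, B, C, and D, and the third combination is too weakly associated to form a group.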
7. The method according to any one of claims 1 to 4, characterized in that, after identifying the risk group according to the quantity of risk numbers in the group, the method further comprises:
adding, to the blacklist, the communicating numbers in the risk group that are not recorded in the blacklist.
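The final two steps, identifying risk groups and extending the blacklist, might look like the sketch below. It assumes that a risk number is a number already recorded in the blacklist and that a group counts as a risk group once it contains at least `min_risk_count` such numbers; the claims only say the decision depends on the quantity of risk numbers, so the threshold rule and the function names are hypothetical.

```python
def identify_risk_groups(groups, blacklist, min_risk_count=2):
    """Return the groups containing at least min_risk_count blacklisted numbers."""
    return [group for group in groups if len(group & blacklist) >= min_risk_count]


def extend_blacklist(risk_groups, blacklist):
    """Return a new blacklist that also covers every number in a risk group."""
    updated = set(blacklist)
    for group in risk_groups:
        updated |= group
    return updated


groups = [{"A", "B", "C"}, {"D", "E"}]
blacklist = {"A", "B", "D"}
risk_groups = identify_risk_groups(groups, blacklist)
new_blacklist = extend_blacklist(risk_groups, blacklist)
```

Here only the first group reaches the assumed threshold, so C is the one number newly added to the blacklist.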
8. A device for identifying a risk group, characterized in that the device comprises:
a data acquisition module, configured to obtain historical behavior data corresponding to communicating numbers, wherein the historical behavior data includes multiple historical behavior records, and each historical behavior record includes: the IP address used by the communicating number when executing a network behavior, and the request moment at which the communicating number executed the network behavior;
a combination extraction module, configured to add at least two communicating numbers in the historical behavior data that have at least one similar network behavior to one number combination, wherein similar network behaviors are network behaviors that use the same IP address and whose request moments fall within the same period;
a weight calculation module, configured to calculate the associated weight corresponding to each number combination, wherein the associated weight corresponding to a number combination characterizes the degree of association between the communicating numbers in the number combination;
a cluster module, configured to cluster all communicating numbers in the number combinations according to the associated weight corresponding to each number combination, to obtain at least one group;
a group determining module, configured to identify a risk group according to the quantity of risk numbers in the group.
9. Electronic equipment, characterized in that the electronic equipment comprises a processor and a memory, the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the method for identifying a risk group according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the method for identifying a risk group according to any one of claims 1 to 7.
CN201710937630.5A 2017-09-30 2017-09-30 Method and device for identifying risk group and electronic equipment Active CN109600344B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710937630.5A CN109600344B (en) 2017-09-30 2017-09-30 Method and device for identifying risk group and electronic equipment

Publications (2)

Publication Number Publication Date
CN109600344A true CN109600344A (en) 2019-04-09
CN109600344B CN109600344B (en) 2021-03-23

Family

ID=65956849

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710937630.5A Active CN109600344B (en) 2017-09-30 2017-09-30 Method and device for identifying risk group and electronic equipment

Country Status (1)

Country Link
CN (1) CN109600344B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110166635A (en) * 2019-07-11 2019-08-23 中国联合网络通信集团有限公司 Suspicious terminal identification method and suspicious terminal recognition system
CN110225036A (en) * 2019-06-12 2019-09-10 北京奇艺世纪科技有限公司 A kind of account detection method, device, server and storage medium
CN111245815A (en) * 2020-01-07 2020-06-05 同盾控股有限公司 Data processing method, data processing device, storage medium and electronic equipment
CN111931047A (en) * 2020-07-31 2020-11-13 中国平安人寿保险股份有限公司 Artificial intelligence-based black product account detection method and related device
CN112351441A (en) * 2019-08-06 2021-02-09 中国移动通信集团广东有限公司 Data processing method and device and electronic equipment
CN112615966A (en) * 2020-12-14 2021-04-06 南方电网海南数字电网研究院有限公司 Cat pool terminal identification method
CN113641970A (en) * 2021-08-16 2021-11-12 深圳竹云科技有限公司 Risk detection method and device and computing equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102413013A (en) * 2011-11-21 2012-04-11 北京神州绿盟信息安全科技股份有限公司 Method and device for detecting abnormal network behavior
CN103577991A (en) * 2012-08-03 2014-02-12 阿里巴巴集团控股有限公司 User identification method and device
CN104933570A (en) * 2014-03-20 2015-09-23 阿里巴巴集团控股有限公司 User detection method and device
CN106157326A (en) * 2015-04-07 2016-11-23 中国科学院深圳先进技术研究院 Group abnormality behavioral value method and system
CN106339615A (en) * 2016-08-29 2017-01-18 北京红马传媒文化发展有限公司 Abnormal registration behavior recognition method, system and equipment
CN106919953A (en) * 2017-02-23 2017-07-04 北京工业大学 A kind of abnormal trip Stock discrimination method based on track traffic data analysis
US20170244735A1 (en) * 2014-12-22 2017-08-24 Palantir Technologies Inc. Systems and user interfaces for dynamic and interactive investigation of bad actor behavior based on automatic clustering of related data in various data structures

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110225036A (en) * 2019-06-12 2019-09-10 北京奇艺世纪科技有限公司 A kind of account detection method, device, server and storage medium
CN110166635A (en) * 2019-07-11 2019-08-23 中国联合网络通信集团有限公司 Suspicious terminal identification method and suspicious terminal recognition system
CN110166635B (en) * 2019-07-11 2021-06-08 中国联合网络通信集团有限公司 Suspicious terminal identification method and suspicious terminal identification system
CN112351441A (en) * 2019-08-06 2021-02-09 中国移动通信集团广东有限公司 Data processing method and device and electronic equipment
CN112351441B (en) * 2019-08-06 2023-08-15 中国移动通信集团广东有限公司 Data processing method and device and electronic equipment
CN111245815A (en) * 2020-01-07 2020-06-05 同盾控股有限公司 Data processing method, data processing device, storage medium and electronic equipment
CN111931047A (en) * 2020-07-31 2020-11-13 中国平安人寿保险股份有限公司 Artificial intelligence-based black product account detection method and related device
CN112615966A (en) * 2020-12-14 2021-04-06 南方电网海南数字电网研究院有限公司 Cat pool terminal identification method
CN113641970A (en) * 2021-08-16 2021-11-12 深圳竹云科技有限公司 Risk detection method and device and computing equipment
CN113641970B (en) * 2021-08-16 2022-08-26 深圳竹云科技有限公司 Risk detection method and device and computing equipment

Also Published As

Publication number Publication date
CN109600344B (en) 2021-03-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant