CN109919790A - Group type recognition methods, device, electronic equipment and storage medium - Google Patents

Publication number
CN109919790A
Authority
CN
China
Prior art keywords
cluster
measured
parameter
behavioral parameters
significance level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711331027.9A
Other languages
Chinese (zh)
Inventor
杨洋
郑雪菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201711331027.9A
Priority to PCT/CN2018/115353 (WO2019114481A1)
Publication of CN109919790A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism

Abstract

Embodiments of the present invention provide a group type recognition method, device, electronic equipment and storage medium. In the method, a feature to be measured of a cluster to be measured is obtained. Because the feature to be measured includes at least the significance level parameter corresponding to each member of the cluster to be measured on each class of preset behavioral parameters, the feature to be measured can characterize both the behavioural features between the members of the cluster to be measured and the significance level of each member on each class of preset behavioral parameters. Because the group type prediction model is obtained by training a neural network, the model can capture the cluster structure of the cluster to be measured from multiple angles and dimensions based on the feature to be measured. And because the model is trained with the objective of minimizing the length of the difference vector between the predicted vector and the true vector, it can accurately predict the probability that the cluster to be measured belongs to each class of known cluster.

Description

Group type recognition methods, device, electronic equipment and storage medium
Technical field
This application relates to the field of communication technology, and more particularly to a group type recognition method, device, electronic equipment and storage medium.
Background art
Current networks are made up of a number of groups; a group is a combination of members that share the same characteristics. For example, a group may be a QQ group or a WeChat group. The associations between the members of the groups determine whether the groups belong to the same corporation. For example, for each group in the same corporation, the ratio of the number of overlapping members to the total number of members of that group is greater than or equal to a preset threshold; one corporation may, for instance, include two QQ groups.
The structure of a group can reflect the behavioural features of the members of the group, and the structure of a corporation can reflect the behavioural features of the members of the corporation. From the structure of a group, the class to which the group belongs can be obtained, for example a multi-level marketing group, a gambling group, a pornographic website propagation group, or a legal group. From the structure of a corporation, the class to which the corporation belongs can be obtained, for example a multi-level marketing corporation, a gambling corporation, a pornographic website propagation corporation, or a legal corporation.
For ease of description, groups and corporations are collectively referred to herein as clusters; that is, a cluster is a group or a corporation. Obtaining the group type to which a cluster belongs is significant for discovering illegal clusters. At present, however, the group type of a cluster is determined manually, which is inefficient. How to quickly obtain the group type to which a cluster belongs is therefore a problem for those skilled in the art to consider.
Summary of the invention
In view of this, embodiments of the present invention provide a group type recognition method, device, electronic equipment and storage medium, to overcome the inefficiency of the prior art.
To achieve the above object, the embodiments of the present invention provide the following technical solutions:
A group type recognition method, comprising:
obtaining a feature to be measured of a cluster to be measured, the feature to be measured including at least: the significance level parameter corresponding to each member of the cluster to be measured on at least one class of preset behavioral parameters;
inputting the feature to be measured into a pre-built group type prediction model;
wherein the group type prediction model is obtained by training a neural network with the training objective of minimizing the length of the difference vector between the predicted vector and the true vector of a sample cluster; the predicted vector includes the prediction probabilities that the sample cluster belongs to each class of known cluster; the true vector includes the true probabilities that the sample cluster belongs to each class of known cluster;
obtaining the prediction probabilities, output by the group type prediction model, that the cluster to be measured belongs to each class of known cluster.
A group type identification device, comprising:
a first obtaining module, configured to obtain a feature to be measured of a cluster to be measured, the feature to be measured including at least: the significance level parameter corresponding to each member of the cluster to be measured on at least one class of preset behavioral parameters;
an input module, configured to input the feature to be measured into a pre-built group type prediction model;
wherein the group type prediction model is obtained by training a neural network with the training objective of minimizing the length of the difference vector between the predicted vector and the true vector of a sample cluster; the predicted vector includes the prediction probabilities that the sample cluster belongs to each class of known cluster; the true vector includes the true probabilities that the sample cluster belongs to each class of known cluster;
a second obtaining module, configured to obtain the prediction probabilities, output by the group type prediction model, that the cluster to be measured belongs to each class of known cluster.
An electronic equipment, comprising:
a memory, configured to store a program;
a processor, configured to execute the program, the program being specifically used for:
obtaining a feature to be measured of a cluster to be measured, the feature to be measured including at least: the significance level parameter corresponding to each member of the cluster to be measured on at least one class of preset behavioral parameters;
inputting the feature to be measured into a pre-built group type prediction model;
wherein the group type prediction model is obtained by training a neural network with the training objective of minimizing the length of the difference vector between the predicted vector and the true vector of a sample cluster; the predicted vector includes the prediction probabilities that the sample cluster belongs to each class of known cluster; the true vector includes the true probabilities that the sample cluster belongs to each class of known cluster;
obtaining the prediction probabilities, output by the group type prediction model, that the cluster to be measured belongs to each class of known cluster.
A storage medium, storing a program suitable for execution by a processor, the program being used for:
obtaining a feature to be measured of a cluster to be measured, the feature to be measured including at least: the significance level parameter corresponding to each member of the cluster to be measured on at least one class of preset behavioral parameters;
inputting the feature to be measured into a pre-built group type prediction model;
wherein the group type prediction model is obtained by training a neural network with the training objective of minimizing the length of the difference vector between the predicted vector and the true vector of a sample cluster; the predicted vector includes the prediction probabilities that the sample cluster belongs to each class of known cluster; the true vector includes the true probabilities that the sample cluster belongs to each class of known cluster;
obtaining the prediction probabilities, output by the group type prediction model, that the cluster to be measured belongs to each class of known cluster.
It can be seen from the above technical solutions that, compared with the prior art, the embodiments of the present invention provide a group type recognition method in which a feature to be measured of a cluster to be measured is obtained. Because the feature to be measured includes at least the significance level parameter corresponding to each member of the cluster to be measured on each class of preset behavioral parameters, the feature to be measured can characterize both the behavioural features between the members of the cluster to be measured and the significance level of each member on each class of preset behavioral parameters. Because the group type prediction model is obtained by training a neural network, the model can capture the cluster structure of the cluster to be measured from multiple angles and dimensions based on the feature to be measured. And because the model is trained with the objective of minimizing the length of the difference vector between the predicted vector and the true vector, it can accurately predict the probability that the cluster to be measured belongs to each class of known cluster.
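The training objective described above can be illustrated with a minimal sketch. It assumes that the length of the difference vector is its Euclidean norm, which this section does not specify; the function name `difference_length` and the sample vectors are hypothetical and chosen only for illustration.

```python
import math


def difference_length(predicted, true):
    """Return the length of the difference vector (predicted - true).

    Assumption: "length" is taken to be the Euclidean (L2) norm.
    """
    if len(predicted) != len(true):
        raise ValueError("predicted and true vectors must have the same size")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, true)))


# Hypothetical sample cluster whose true class is the first known cluster.
true_vector = [1.0, 0.0, 0.0, 0.0]
predicted_vector = [0.95, 0.03, 0.02, 0.0]

# Training would adjust the network so that this value becomes as small as possible.
loss = difference_length(predicted_vector, true_vector)
print(f"length of difference vector: {loss:.4f}")
```

Under this assumption, a prediction that matches the true vector exactly has a length of zero, and the length grows as probability mass moves to the wrong classes.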
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are merely embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1a is a schematic diagram of a process of classifying a cluster to be measured using a group type prediction model, provided by an embodiment of the present invention;
Fig. 1b is a schematic diagram of an application scenario of a group type recognition method provided by an embodiment of the present invention;
Fig. 2 is an internal structure diagram of an electronic equipment provided by an embodiment of the present invention;
Fig. 3 is a flow chart of a group type recognition method provided by an embodiment of the present invention;
Fig. 4 is a flow chart of one implementation of obtaining the feature to be measured of the cluster to be measured in a group type recognition method provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of the relationships between the members of the cluster to be measured in an example provided by an embodiment of the present invention;
Fig. 6 is a flow chart of another implementation of obtaining the feature to be measured of the cluster to be measured in a group type recognition method provided by an embodiment of the present invention;
Fig. 7 is a flow chart of one implementation, in a group type recognition method provided by an embodiment of the present invention, of determining the significance level parameters corresponding to each member of the cluster to be measured on a class of preset behavioral parameters as the elements of the parameter matrix corresponding to that class of preset behavioral parameters;
Fig. 8 is a flow chart of another implementation, in a group type recognition method provided by an embodiment of the present invention, of determining the significance level parameters corresponding to each member of the cluster to be measured on a class of preset behavioral parameters as the elements of the parameter matrix corresponding to that class of preset behavioral parameters;
Fig. 9 is a schematic diagram of a process of training a neural network provided by an embodiment of the present invention;
Fig. 10 is a structure diagram of a group type identification device provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The group type recognition method provided by the embodiments of the present invention uses a group type prediction model to classify existing clusters (a cluster is a corporation or a group) and to detect which type a cluster to be measured belongs to, for example a multi-level marketing cluster, a gambling cluster, a pornographic website propagation cluster, or a legal cluster. Illegal clusters in the network, such as multi-level marketing clusters, gambling clusters and pornographic website propagation clusters, can thus be discovered in time.
Fig. 1a is a schematic diagram of a process of classifying a cluster to be measured using a group type prediction model, provided by an embodiment of the present invention.
The feature to be measured of the cluster to be measured, for example a feature characterizing the transfer amounts and/or the transfer counts in a QQ group to be measured, is input into group type prediction model 10, which outputs the prediction probability that the cluster to be measured belongs to each group type. As shown in Fig. 1a, the prediction probability that the QQ group to be measured belongs to the multi-level marketing cluster is 95%; the prediction probability that it belongs to the gambling cluster is 3%; the prediction probability that it belongs to the pornographic website propagation cluster is 2%; and the prediction probability that it belongs to the legal cluster is 0%.
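How such an output might be consumed can be sketched as follows. The dictionary layout, the function name `most_likely_type` and the type labels are assumptions made for illustration; the probabilities are the ones given in the Fig. 1a example.

```python
def most_likely_type(prediction):
    """Return the (group type, probability) pair with the highest probability."""
    group_type = max(prediction, key=prediction.get)
    return group_type, prediction[group_type]


# Prediction probabilities from the Fig. 1a example.
prediction = {
    "multi-level marketing cluster": 0.95,
    "gambling cluster": 0.03,
    "pornographic website propagation cluster": 0.02,
    "legal cluster": 0.00,
}

best_type, best_probability = most_likely_type(prediction)
print(best_type, best_probability)
```

Keeping the full probability vector, rather than only the most likely type, leaves room for a downstream review step to treat borderline predictions differently.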
Optionally, the above group type recognition method may be applied in an electronic device, for example a desktop computer, a mobile terminal (such as a smartphone), an iPad or a server.
In one example, the group type recognition method may be implemented as a client running on the electronic device, such as an instant messaging client like WeChat or QQ. The client may be an application client or a web client.
The above server may be a single server, a server cluster consisting of several servers, or a cloud computing service center, for example the background server of an instant messaging client such as WeChat or QQ.
The method is illustrated below with reference to a concrete application scenario. Fig. 1b is a schematic diagram of an application scenario of a group type recognition method provided by an embodiment of the present invention.
The cluster to be measured may be a group or a corporation, and a corporation may include one or more groups. If a corporation includes multiple groups, the groups may belong to the same social software or the same social networking site; they may also belong to different social software and/or different social networking sites. For example, a corporation may include a WeChat group and a QQ group. The cluster to be measured in the embodiments of the present invention therefore includes one or more groups.
In practical applications, each member uses the social software or social networking site on an intelligent terminal 11 (such as a smartphone, computer or PAD) to interact with other members. Groups can be established in the social software or social networking site, so that the members interact within a group of the social software or social networking site.
Each intelligent terminal 11 can upload to server 12 the relevant information about the corresponding member's interactions with other members in the group; server 12 can obtain the feature to be measured based on the relevant information of each member in the cluster to be measured.
The memory in server 12 can store the group type prediction model and the program corresponding to the group type recognition method. The processor in server 12 can execute the program based on the feature to be measured and call the group type prediction model to obtain the prediction probability that the cluster to be measured belongs to each group type.
Server 12 may be a first background server corresponding to social software, for example the background QQ server corresponding to the QQ social software; or a second background server corresponding to a social networking site, for example the background server of forum A. Server 12 may also be a server that can obtain, from the first background server and/or the second background server, the relevant information of each member included in the cluster to be measured.
Fig. 2 shows an internal structure diagram of an electronic device provided by an embodiment of the present invention.
The electronic device may include a bus, input device 1, memory 2, processor 3, output device 4 and communication interface 5.
Input device 1, memory 2, processor 3, output device 4 and communication interface 5 are connected to one another through the bus.
Input device 1 may include devices that receive data and information input by a user, such as a keyboard, mouse, camera, scanner, light pen, voice input device, touch screen, pedometer or gravity sensor. For example, the input device is used to input the feature to be measured of a QQ group to be measured into a client in the electronic device that has the group type recognition function.
Memory 2 stores the program that executes the technical solution of the embodiments of the present invention, and may also store an operating system and other key services. Specifically, the program may include program code, and the program code includes computer operation instructions. More specifically, memory 2 may include read-only memory (ROM), other types of static storage devices that can store static information and instructions, random access memory (RAM), other types of dynamic storage devices that can store information and instructions, magnetic disk storage, flash memory, and so on.
In the embodiments of the present invention, memory 2 may store the group type prediction model and the program corresponding to the group type recognition method.
Processor 3 may execute the program in memory 2 and call the group type prediction model while executing the program, so that the group type prediction model outputs the prediction probabilities that the QQ group to be measured belongs to each group type.
Processor 3 may be a general-purpose processor, such as a central processing unit (CPU), a network processor (NP) or a microprocessor; it may also be an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling execution of the program of the embodiments of the present invention. It may also be a digital signal processor (DSP), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components.
Processor 3 may include a main processor, and may also include a baseband chip, a modem and the like.
Output device 4 may include devices that output information to the user, such as a display screen, printer or loudspeaker. For example, the display shows the prediction probabilities, output by the group type prediction model, that the QQ group to be measured belongs to each class of known cluster: the prediction probability for the multi-level marketing cluster is 95%; for the gambling cluster, 3%; for the pornographic website propagation cluster, 2%; and for the legal cluster, 0%.
Processor 3 executes the program stored in memory 2 and calls other devices, and can thereby implement each step of the group type recognition method provided by the embodiments of the present invention.
Communication interface 5 may include any transceiver-like device for communicating with other devices or communication networks, such as Ethernet, a radio access network (RAN) or a wireless local area network (WLAN).
Based on the common aspects of the embodiments of the present invention described above, the embodiments are described in further detail below. Fig. 3 is a flow chart of a group type recognition method provided by an embodiment of the present invention. The method includes:
Step S301: obtain the feature to be measured of the cluster to be measured.
The feature to be measured includes: the significance level parameter corresponding to each member of the cluster to be measured on at least one class of preset behavioral parameters.
The above "at least one class of preset behavioral parameters" may be P classes of preset behavioral parameters, where P is a positive integer greater than or equal to 1. The P classes of preset behavioral parameters are set in advance.
The P classes of preset behavioral parameters may correspond to the behavioural features of at least one class of known cluster, where the behavioural features characterize how the members of that class of known cluster interact with one another.
That the P classes of preset behavioral parameters correspond to the behavioural features of at least one class of known cluster may cover the following cases:
One, the P classes of preset behavioral parameters correspond to the behavioural features of one class of known cluster, where that class is any one of the classes of known cluster that the group type prediction model can predict.
For example, the classes of known cluster include: gambling cluster, legal cluster, etc.
Suppose the P classes of preset behavioral parameters correspond to the behavioural features of the multi-level marketing cluster. In a multi-level marketing cluster, bottom-layer members transfer money to middle-layer members, and middle-layer members transfer money to high-layer members; that is, transfers pass up the hierarchy level by level. The interactions between the members of a multi-level marketing cluster can therefore be characterized as hierarchical transfer behaviour, and the P classes of preset behavioral parameters may include: a transfer count parameter characterizing the number of transfers between members, and/or a transfer amount parameter characterizing the amount of money transferred between members.
Suppose instead that the P classes of preset behavioral parameters correspond to the behavioural features of the legal cluster, and that the members of the legal cluster only interact with one another and make no transfers. The interactions between the members of the legal cluster can then be characterized as interactions based on text, voice, video or pictures, and the P classes of preset behavioral parameters may include: an interaction count parameter characterizing the number of interactions between members.
In summary, the P classes of preset behavioral parameters may differ depending on which class of known cluster's behavioural features they correspond to. Whichever class of known cluster the P classes of preset behavioral parameters correspond to, the group type prediction model predicts the probability that the cluster to be measured belongs to that class of known cluster more accurately.
For example, suppose the P classes of preset behavioral parameters correspond to the behavioural features of the legal cluster and consist of the interaction count parameter. Based on the interaction count parameter, the group type prediction model can predict the probability that the cluster to be measured belongs to the multi-level marketing cluster, say 5%, but the accuracy of this value is relatively poor; the predicted probability that the cluster to be measured belongs to the legal cluster, say 95%, has relatively high accuracy.
In practical applications, determining which illegal known cluster the cluster to be measured belongs to is of greater practical significance. It is therefore more useful for the accuracy of the prediction probability that the cluster to be measured belongs to an illegal known cluster to be higher than the accuracy of the prediction probability that it belongs to the legal cluster. Preferably, therefore, "the P classes of preset behavioral parameters correspond to the behavioural features of one class of known cluster" means "the P classes of preset behavioral parameters correspond to the behavioural features of one class of illegal known cluster".
Two, the P classes of preset behavioral parameters correspond to the behavioural features shared by at least two classes of known cluster, where each class is any one of the classes of known cluster that the group type prediction model can predict.
Optionally, "the P classes of preset behavioral parameters correspond to the behavioural features shared by at least two classes of known cluster" may include: the P classes of preset behavioral parameters correspond to the behavioural features shared by two classes of known cluster, or by three classes of known cluster, or ..., or by all classes of known cluster.
The case in which the P classes of preset behavioral parameters correspond to the behavioural features shared by all classes of known cluster is illustrated below.
Suppose the classes of known cluster include: gambling cluster, multi-level marketing cluster, pornographic website propagation cluster, legal cluster, etc.
In a multi-level marketing cluster, bottom-layer members transfer money to middle-layer members, and middle-layer members transfer money to high-layer members; that is, transfers pass up the hierarchy level by level. The interactions between the members of a multi-level marketing cluster can therefore be characterized as hierarchical transfer behaviour. The members of a gambling cluster transfer money to one another, and gambling usually takes place at night, so the interactions between the members of a gambling cluster are characterized as mutual transfers between members at night. In a pornographic website propagation cluster, the members transfer money to the member who holds the pornographic videos or links, so the interactions between its members are characterized as transfers concentrated on one or a few members.
Suppose the legal cluster consists of the employees of a company, and that one or a few leaders hand out bonuses at festivals. The interactions between the members of this legal cluster can then be characterized as follows: at certain fixed times, the same one or few members send fixed-amount red packet transfers to multiple other members.
If the P classes of preset behavioral parameters correspond to the behavioural features shared by the known clusters, they correspond to the common features of the interactions between the members of each known cluster. In the example above, the common feature of the interactions between the members of each known cluster is transfer behaviour, so the P classes of preset behavioral parameters may include: a transfer amount parameter and/or a transfer count parameter.
In the embodiments of the present application, clusters other than illegal known clusters are legal clusters. It can be understood that there are many types of legal cluster; for example, legal clusters may also include news exchange clusters or shopping sharing clusters. Different classes of legal cluster seldom share behavioural features with illegal known clusters; for example, the interactions between the members of a news exchange cluster may involve only text, voice, video or pictures and no transfer behaviour. Preferably, therefore, "the P classes of preset behavioral parameters correspond to the behavioural features shared by at least two classes of known cluster" may include: the P classes of preset behavioral parameters correspond to the behavioural features shared by at least two classes of illegal known cluster, or the P classes of preset behavioral parameters correspond to the behavioural features shared by the legal cluster and at least one class of illegal known cluster.
The case in which the P classes of preset behavioral parameters correspond to the behavioural features shared by all classes of illegal known cluster is illustrated below. In the example above, the common feature of the interactions between the members of each illegal known cluster is transfer behaviour, so the P classes of preset behavioral parameters may include: a transfer amount parameter and/or a transfer count parameter.
Because the P classes of preset behavioral parameters correspond to the behavioural features shared by all classes of illegal known cluster, the prediction probabilities, output by the group type prediction model, that the cluster to be measured belongs to each illegal known cluster are relatively accurate.
The cluster to be measured may be a group, for example a group in social software, such as a QQ group or a WeChat group, or a group on a social networking site.
The cluster to be measured may also be a corporation, and a corporation may include at least one group.
If a corporation includes multiple groups, the multiple groups may have at least one of the following features:
One: for each group, the ratio of the number of overlapping members to the total number of members of that group is greater than or equal to a first preset threshold.
For example, suppose there are three groups: a first group with 200 members in total, a second group with 150 members, and a third group with 300 members. If 100 members belong to all three groups, then the ratio of the number of overlapping members to the total number of members is 100/200 = 1/2 for the first group, 100/150 = 2/3 for the second group, and 100/300 = 1/3 for the third group. Assuming the first preset threshold is 1/5, then because 1/2, 2/3 and 1/3 are all greater than 1/5, the first group, the second group and the third group belong to the same corporation.
Two: for each group, the ratio of the number of overlapping administrators to the total number of administrators of that group is greater than or equal to a second preset threshold.
For example, suppose there are three groups: a first group with 3 administrators, a second group with 5 administrators, and a third group with 4 administrators, and suppose 2 administrators are shared by all three groups. The ratio of the number of overlapping administrators to the total number of administrators is then 2/3 for the first group, 2/5 for the second group, and 2/4 = 1/2 for the third group. Assuming the second preset threshold is 1/4, then because 2/3, 2/5 and 1/2 are all greater than 1/4, the first group, the second group and the third group belong to the same corporation.
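The two threshold checks above can be sketched as follows. This is only an illustrative sketch: the function name `overlap_ratios` and its arguments are hypothetical and do not come from the application; the numbers are those of the two examples.

```python
from fractions import Fraction

def overlap_ratios(overlap_count, group_totals, threshold):
    """Return (all_pass, ratios): the ratio overlap/total for each group,
    and whether every ratio is greater than or equal to the threshold."""
    ratios = [Fraction(overlap_count, total) for total in group_totals]
    return all(r >= threshold for r in ratios), ratios

# Feature one: 100 overlapping members; groups of 200, 150 and 300 members.
members_ok, member_ratios = overlap_ratios(100, [200, 150, 300], Fraction(1, 5))

# Feature two: 2 overlapping administrators; groups with 3, 5 and 4 administrators.
admins_ok, admin_ratios = overlap_ratios(2, [3, 5, 4], Fraction(1, 4))

print(members_ok, member_ratios)
print(admins_ok, admin_ratios)
```

Exact fractions are used so that the comparison with the threshold is not affected by floating-point rounding.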
Step S302: input the feature to be measured into the pre-built group type prediction model.
The group type prediction model is obtained by training a neural network with the training objective of minimizing the length of the difference vector between the predicted vector and the true vector of a sample cluster. The predicted vector includes the prediction probabilities that the sample cluster belongs to each class of known cluster; the true vector includes the true probabilities that the sample cluster belongs to each class of known cluster.
The classes of known cluster may include the classes of illegal cluster and the legal cluster.
The classes of illegal cluster may include: gambling clusters, multi-level marketing clusters and pornographic website propagation clusters. In the embodiments of the present invention, every type of cluster other than the illegal clusters is referred to as a legal cluster.
Suppose that, after the training feature of a sample cluster is input to the neural network, the neural network outputs a prediction probability of 95% that the sample cluster belongs to the gambling cluster, 4% that it belongs to the multi-level marketing cluster, 1% that it belongs to the pornographic website propagation cluster, and 0% that it belongs to the legal cluster. The predicted vector of the sample cluster is then (95%, 4%, 1%, 0%). If the sample cluster is in fact a gambling cluster, its true vector is (1, 0, 0, 0). The difference vector between the predicted vector and the true vector of the sample cluster is (-0.05, 0.04, 0.01, 0).
In the embodiments of the present invention, "the group type prediction model takes minimizing the length of the difference vector between the predicted vector and the true vector of the sample cluster as the training objective" means that the closer the predicted vector of the sample cluster is to the true vector, the better. The neural network can be trained in several ways to reach this training objective. The detailed method is described later in the process of building the group type prediction model; the two descriptions can be cross-referenced, and it is not repeated here.
The prediction probabilities that the sample cluster belongs to each class of known cluster are predicted by the neural network (before training yields the group type prediction model, the model is referred to as the neural network).
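The difference vector and its length for the sample cluster above can be computed as in the following sketch. The application only speaks of the "length" of the difference vector; using the Euclidean norm here is an assumption, and the function name `difference_length` is hypothetical.

```python
import math

def difference_length(predicted, true):
    """Return the difference vector (predicted - true) and its Euclidean length.
    The Euclidean norm is an assumption; the source only says "length"."""
    diff = [p - t for p, t in zip(predicted, true)]
    return diff, math.sqrt(sum(d * d for d in diff))

# Order: gambling, multi-level marketing, pornographic website propagation, legal.
predicted = [0.95, 0.04, 0.01, 0.0]
true = [1.0, 0.0, 0.0, 0.0]

diff, length = difference_length(predicted, true)
print(diff, length)
```

Training would adjust the network so that this length becomes as small as possible over the sample clusters.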
Step S303: obtain the prediction probabilities, output by the group type prediction model, that the cluster to be measured belongs to each class of known cluster.
In an optional embodiment, the group type prediction model can obtain the cluster topology of the cluster to be measured based on the feature to be measured. Because the cluster topologies of different group types differ, the prediction probabilities that the cluster to be measured belongs to each class of known cluster can be obtained based on its cluster topology.
For example, the cluster topology of a gambling cluster includes at least: members frequently transfer to one another. The cluster topology of a multi-level marketing cluster includes at least: each member transfers layer by layer to superior members, members seldom transfer to one another, and there is a notion of hierarchy. The cluster topology of a pornographic website propagation cluster includes at least: each member transfers to a few fixed members, and layer-by-layer transfers and mutual transfers generally do not occur.
If, based on the feature to be measured, the members of the cluster to be measured are found to transfer to one another frequently and not to transfer layer by layer, then the prediction probability that the cluster to be measured belongs to the gambling cluster is high, and the prediction probabilities that it belongs to the multi-level marketing cluster or to the pornographic website propagation cluster are low.
An embodiment of the present invention provides a group type recognition method in which the feature to be measured of the cluster to be measured is obtained. Because the feature to be measured includes at least the significance level parameters of each member of the cluster to be measured for each class of preset behavioral parameter, the feature to be measured can characterize both the behavioral features between the members of the cluster to be measured and the significance level of each member within the cluster for each class of preset behavioral parameter. Because the group type prediction model is obtained by training a neural network, the model can obtain the cluster topology of the cluster to be measured from multiple angles and dimensions based on the feature to be measured. And because the training objective during training is to minimize the length of the difference vector between the predicted vector and the true vector, the group type prediction model can accurately predict the probabilities that the cluster to be measured belongs to each class of known cluster.
In the embodiments of the present invention, step S301 can be implemented in many ways. The embodiments of the present invention provide, without limitation, the following methods.
First method: the significance level parameters of each member of the cluster to be measured for at least one class of preset behavioral parameter are taken directly as the feature to be measured. For the specific method see Fig. 4, which is a flowchart of one implementation of obtaining the feature to be measured of the cluster to be measured in a group type recognition method provided by an embodiment of the present invention. The method may include:
Step S401: obtain the behavioral parameter values of each member of the cluster to be measured for at least one class of preset behavioral parameter.
Step S402: for each class of preset behavioral parameter, determine the significance level parameter of each member of the cluster to be measured for that class according to the behavioral parameter values of each member for that class, so as to obtain the significance level parameters of each member of the cluster to be measured for the at least one class of preset behavioral parameter.
Step S403: determine the significance level parameters of each member of the cluster to be measured for the at least one class of preset behavioral parameter as the feature to be measured of the cluster to be measured.
The above method is illustrated below with an example.
For example, assume the at least one class of preset behavioral parameters includes a transfer amount parameter and a transfer count parameter. The first method then mainly includes the step "obtain the significance level parameter of each member of the cluster to be measured for the transfer amount parameter" and the step "obtain the significance level parameter of each member of the cluster to be measured for the transfer count parameter".
Fig. 5 is a schematic diagram of the relationships between the members of the cluster to be measured in one example.
Assume the cluster to be measured includes member A, member B, member C and member D, and is composed of two groups. Member A, member B and member C belong to group 1; member A, member C and member D belong to group 2. In Fig. 5, a line between two members represents an interaction path.
In the embodiments of the present invention, the relationship between two members that can interact directly is called a first-degree linking relationship, for example the relationship between member A and member B. The relationship between two members that can interact only through at least one other member is called a second-degree linking relationship, for example the relationship between member B and member D. The relationship between two members that can interact only through at least two other members is called a third-degree linking relationship, and so on.
1. Obtain the significance level parameter of each member of the cluster to be measured for the transfer amount parameter.
Assume that, within a period of time, the total amount transferred from member A to member B is 100; from member A to member C, 0; from member A to member D, 200; from member B to member A, 60; from member B to member C, 150; from member C to member A, 600; from member C to member B, 180; from member C to member D, 300; from member D to member A, 900; and from member D to member C, 450. Because there is no interaction path between member B and member D, there are no direct transfers between member B and member D.
First step: obtain the behavioral parameter value of each member of the cluster to be measured for the transfer amount parameter.
Taking Fig. 5 as an example, the behavioral parameter values of the members for the transfer amount parameter are: A → A = 0, A → B = 100, A → C = 0, A → D = 200; B → A = 60, B → B = 0, B → C = 150, B → D = 0; C → A = 600, C → B = 180, C → C = 0, C → D = 300; D → A = 900, D → B = 0, D → C = 450, D → D = 0.
For clarity, the above data are expressed in matrix form, denoted as matrix M1.
In M1, the first column represents the amounts transferred by member A to member A, member B, member C and member D respectively; the second column represents the amounts transferred by member B to member A, member B, member C and member D respectively; the third column represents the amounts transferred by member C to member A, member B, member C and member D respectively; and the fourth column represents the amounts transferred by member D to member A, member B, member C and member D respectively.
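As an illustration, a matrix with the column layout just described can be assembled from the listed transfer amounts as in the sketch below. The names `members`, `transfers` and `build_matrix` are hypothetical, and the sketch uses the raw amounts; whether the application normalizes the entries is not stated here.

```python
members = ["A", "B", "C", "D"]

# (sender, receiver) -> total amount transferred; pairs not listed are 0.
transfers = {
    ("A", "B"): 100, ("A", "D"): 200,
    ("B", "A"): 60, ("B", "C"): 150,
    ("C", "A"): 600, ("C", "B"): 180, ("C", "D"): 300,
    ("D", "A"): 900, ("D", "C"): 450,
}

def build_matrix(members, transfers):
    """Column j holds what member j sent; row i holds what member i received."""
    index = {m: i for i, m in enumerate(members)}
    n = len(members)
    matrix = [[0] * n for _ in range(n)]
    for (sender, receiver), amount in transfers.items():
        matrix[index[receiver]][index[sender]] = amount
    return matrix

M1 = build_matrix(members, transfers)
for row in M1:
    print(row)
```

Reading the first column from top to bottom gives member A's transfers to A, B, C and D, matching the description of the first column.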
Second step: determine the significance level parameter of each member of the cluster to be measured for the transfer amount parameter according to the behavioral parameter values of each member for the transfer amount parameter.
M1 holds the behavioral parameter values of the transfer amount parameter over one period of time. These values only represent the transfer amounts of the current period and are therefore somewhat unstable; that is, they cannot represent the historical transfer amounts of the cluster to be measured. To make the obtained significance level parameters representative, the behavioral parameter values of each member for the transfer amount parameter are processed as follows.
The probability that each member of the cluster to be measured initiates a transfer can be obtained based on the historical data of the cluster to be measured. In one example, assume every member of the cluster to be measured initiates a transfer with the same probability; for the four members in Fig. 5 this probability is 1/4. The behavioral parameter values of each member for the transfer amount parameter are then processed as follows:
The following iteration is carried out repeatedly:
$V_{je1} = M1 \cdot V_{je0}$; $V_{je2} = M1 \cdot V_{je1}$; $V_{je3} = M1 \cdot V_{je2}$; ...; $V_{je(n1)} = M1 \cdot V_{je(n1-1)}$
For example,
Alternatively,
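$V_{je1} = \alpha_1 M1 \cdot V_{je0} + (1-\alpha_1) e_1$; $V_{je2} = \alpha_1 M1 \cdot V_{je1} + (1-\alpha_1) e_1$;
$V_{je3} = \alpha_1 M1 \cdot V_{je2} + (1-\alpha_1) e_1$; ...; $V_{je(n1)} = \alpha_1 M1 \cdot V_{je(n1-1)} + (1-\alpha_1) e_1$
where $\alpha_1$ may be any value greater than 0 and less than 1;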
Optionally,
The iteration continues until two successive results $V_{je(n1)}$ and $V_{je(n1-1)}$ no longer change, or until their difference is within a preset range. The iteration is then considered to have converged, that is, the result is generally representative. Assume the converged result is $V_{je(n1)} = [V_{je11}, V_{je21}, V_{je31}, V_{je41}]^T$. Then the significance level parameter of member A for the transfer amount parameter is $V_{je11}$; that of member B is $V_{je21}$; that of member C is $V_{je31}$; and that of member D is $V_{je41}$.
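The iteration with the weighting factor can be sketched as below. Several points are assumptions rather than statements of the application: each column of the matrix is divided by its column sum so that the vector stays a probability distribution, the starting vector and the vector e are both taken as the uniform 1/4 vector, and alpha = 0.85 and the tolerance are arbitrary illustrative values. The function names are hypothetical.

```python
def normalize_columns(matrix):
    """Divide each column by its sum (assumption: columns act as transfer shares)."""
    n = len(matrix)
    sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [[matrix[r][c] / sums[c] if sums[c] else 0.0 for c in range(n)]
            for r in range(n)]

def significance(matrix, alpha=0.85, tol=1e-10, max_iter=1000):
    """Iterate V_k = alpha * M * V_(k-1) + (1 - alpha) * e until V stops changing."""
    n = len(matrix)
    m = normalize_columns(matrix)
    e = [1.0 / n] * n          # uniform vector, e.g. 1/4 for four members
    v = e[:]                   # assumed starting vector V_0
    for _ in range(max_iter):
        nxt = [alpha * sum(m[r][c] * v[c] for c in range(n)) + (1 - alpha) * e[r]
               for r in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, v)) < tol:
            return nxt
        v = nxt
    return v

# Transfer amounts from the example; column j = amounts sent by member j.
M1 = [[0, 60, 600, 900], [100, 0, 180, 0], [0, 150, 0, 450], [200, 0, 300, 0]]
v_je = significance(M1)        # entries for members A, B, C, D in that order
print(v_je)
```

With normalized columns and alpha below 1 the update shrinks the distance between successive vectors at every step, which is why the stopping test is eventually met.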
2. Obtain the significance level parameter of each member of the cluster to be measured for the transfer count parameter.
Assume that, in the cluster to be measured shown in Fig. 5, within a period of time, the number of transfers from member A to member B is 10; from member A to member C, 0; from member A to member D, 2; from member B to member A, 1; from member B to member C, 3; from member C to member A, 6; from member C to member B, 2; from member C to member D, 6; from member D to member A, 9; and from member D to member C, 9.
First step: obtain the behavioral parameter value of each member of the cluster to be measured for the transfer count parameter.
Taking Fig. 5 as an example, the behavioral parameter values of the members for the transfer count parameter are: A → A = 0, A → B = 10, A → C = 0, A → D = 2; B → A = 1, B → B = 0, B → C = 3, B → D = 0; C → A = 6, C → B = 2, C → C = 0, C → D = 6; D → A = 9, D → B = 0, D → C = 9, D → D = 0.
For clarity, the above data are expressed in matrix form, denoted as matrix M2.
In M2, the first column represents the numbers of transfers from member A to member A, member B, member C and member D respectively; the second column represents the numbers of transfers from member B to member A, member B, member C and member D respectively; the third column represents the numbers of transfers from member C to member A, member B, member C and member D respectively; and the fourth column represents the numbers of transfers from member D to member A, member B, member C and member D respectively.
Second step: determine the significance level parameter of each member of the cluster to be measured for the transfer count parameter according to the behavioral parameter values of each member for the transfer count parameter.
M2 holds the behavioral parameter values of the transfer count parameter over one period of time. These values only represent the transfer counts of the current period and are therefore unstable; they cannot represent the historical transfer counts of the cluster to be measured. To make the obtained significance level parameters representative, the behavioral parameter values of each member for the transfer count parameter are processed as follows.
The probability that each member of the cluster to be measured initiates a transfer can be obtained based on the historical data of the cluster to be measured. In one example, assume every member of the cluster to be measured initiates a transfer with the same probability; for the four members in Fig. 5 this probability is 1/4 (the embodiments of the present invention are not limited to this). The behavioral parameter values of each member for the transfer count parameter are then processed as follows:
The following iteration is carried out repeatedly:
$V_{cs1} = M2 \cdot V_{cs0}$; $V_{cs2} = M2 \cdot V_{cs1}$; $V_{cs3} = M2 \cdot V_{cs2}$; ...; $V_{cs(n2)} = M2 \cdot V_{cs(n2-1)}$
For example,
Alternatively,
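$V_{cs1} = \alpha_2 M2 \cdot V_{cs0} + (1-\alpha_2) e_2$; $V_{cs2} = \alpha_2 M2 \cdot V_{cs1} + (1-\alpha_2) e_2$;
$V_{cs3} = \alpha_2 M2 \cdot V_{cs2} + (1-\alpha_2) e_2$; ...; $V_{cs(n2)} = \alpha_2 M2 \cdot V_{cs(n2-1)} + (1-\alpha_2) e_2$
where $\alpha_2$ may be any value greater than 0 and less than 1;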
Optionally,
The iteration continues until two successive results $V_{cs(n2)}$ and $V_{cs(n2-1)}$ no longer change, or until their difference is within a preset range; the iteration is then considered to have converged. Assume the converged result is $V_{cs(n2)} = [V_{cs11}, V_{cs21}, V_{cs31}, V_{cs41}]^T$. Then the significance level parameter of member A for the transfer count parameter is $V_{cs11}$; that of member B is $V_{cs21}$; that of member C is $V_{cs31}$; and that of member D is $V_{cs41}$.
To sum up, the values $V_{je11}, V_{je21}, V_{je31}, V_{je41}$ of the members of the cluster to be measured for the transfer amount parameter, together with the values $V_{cs11}, V_{cs21}, V_{cs31}, V_{cs41}$ of the members for the transfer count parameter, are taken as the feature to be measured.
The step "obtain the significance level parameter of each member of the cluster to be measured for the transfer amount parameter" and the step "obtain the significance level parameter of each member of the cluster to be measured for the transfer count parameter" have no fixed order. They may be executed simultaneously; the transfer amount step may be executed first and the transfer count step afterwards; or the transfer count step may be executed first and the transfer amount step afterwards.
Second method: the feature to be measured of the cluster to be measured is determined according to the significance level parameters of each member of the cluster to be measured for at least one class of preset behavioral parameter. The specific method is shown in Fig. 6, which is a flowchart of another implementation of obtaining the feature to be measured of the cluster to be measured in a group type recognition method provided by an embodiment of the present invention. The method includes:
Step S601: obtain the behavioral parameter values of each member of the cluster to be measured for at least one class of preset behavioral parameter;
Step S602: for each class of preset behavioral parameter, determine the significance level parameter of each member of the cluster to be measured for that class according to the behavioral parameter values of each member for that class, so as to obtain the significance level parameters of each member of the cluster to be measured for the at least one class of preset behavioral parameter;
Step S603: for each class of preset behavioral parameter, take the significance level parameters of the members of the cluster to be measured for that class as the elements of the parameter matrix corresponding to that class, so as to obtain the parameter matrix corresponding to each of the at least one class of preset behavioral parameter.
Still taking Fig. 5 as an example, for the transfer amount parameter, the elements of the corresponding parameter matrix are drawn from $V_{je11}, V_{je21}, V_{je31}, V_{je41}$; for the transfer count parameter, the elements of the corresponding parameter matrix are drawn from $V_{cs11}, V_{cs21}, V_{cs31}, V_{cs41}$. However, the number of times each significance level parameter appears in the parameter matrix, and the positions at which it appears, differ.
In an optional embodiment, the parameter matrices corresponding to the at least one class of preset behavioral parameter all have the same number of rows and the same number of columns.
One specific way of implementing "determine the significance level parameters of the members of the cluster to be measured for a class of preset behavioral parameter as the elements of the parameter matrix corresponding to that class" is shown in Fig. 7 and includes:
Step S701: sort the significance level parameters of the members for that class of preset behavioral parameter in descending order to obtain the first ranking result corresponding to that class.
Step S702: take the first ranking result corresponding to that class of behavioral parameter as the first column of the parameter matrix corresponding to that class.
Step S703: for each row of the parameter matrix corresponding to that class of behavioral parameter, determine the first member corresponding to the first significance level parameter of the row.
Step S704: obtain the significance level parameters of the first target members that have an X-degree linking relationship with the first member corresponding to the row. The initial value of X is 1; an X-degree linking relationship means that the first member and the first target member interact through at least X-1 other members.
Step S705: sort the significance level parameters of the first target members corresponding to the row in descending order to obtain the second ranking result corresponding to the row.
Step S706: take the significance level parameters in the second ranking result corresponding to the row, in order, as the next elements of the row until the number of elements in the row equals the total number of columns of the parameter matrix. If, after all significance level parameters in the second ranking result have been taken as elements of the row, the number of elements in the row is still less than the total number of columns, assign X+1 to X and return to step S704. In this way all elements of every row are obtained.
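Steps S701 to S706 can be sketched as follows. The sketch finds the X-degree members with a breadth-first search, which is one possible way to do it and is not stated in the application. The numeric significance values are hypothetical placeholders chosen only so that A > C > D > B; the interaction paths are those of the Fig. 5 example (A-B, A-C, A-D, B-C, C-D). The function names are hypothetical, and rows are returned as member labels for readability.

```python
from collections import deque

def link_degrees(start, neighbors):
    """Breadth-first search: map each reachable member to its linking degree."""
    degree = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for nxt in neighbors[cur]:
            if nxt not in degree:
                degree[nxt] = degree[cur] + 1
                queue.append(nxt)
    return degree

def parameter_matrix(importance, neighbors, total_columns):
    rows = []
    # S701/S702: the descending order of the members gives the first column.
    for first in sorted(importance, key=importance.get, reverse=True):
        row = [first]                               # S703: first member of the row
        degree = link_degrees(first, neighbors)
        x, max_degree = 1, max(degree.values())
        while len(row) < total_columns and x <= max_degree:
            # S704/S705: members at degree x, sorted by descending significance.
            targets = sorted((m for m, d in degree.items() if d == x),
                             key=importance.get, reverse=True)
            # S706: append in order until the row is full, else go to degree x+1.
            row.extend(targets[: total_columns - len(row)])
            x += 1
        rows.append(row)
    return rows

importance = {"A": 0.4, "B": 0.1, "C": 0.3, "D": 0.2}   # hypothetical values
neighbors = {"A": ["B", "C", "D"], "B": ["A", "C"],
             "C": ["A", "B", "D"], "D": ["A", "C"]}
rows = parameter_matrix(importance, neighbors, total_columns=4)
for row in rows:
    print(row)
```

Replacing each label by the member's significance level parameter gives the numeric parameter matrix. A row stays shorter than the total number of columns if too few members are reachable; how such a row would be filled is not covered by this sketch.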
The above steps are illustrated below with an example.
1. Obtain the parameter matrix corresponding to the transfer amount parameter.
First step: sort the significance level parameters of the members for the transfer amount parameter in descending order to obtain the first ranking result corresponding to the transfer amount parameter.
Still taking Fig. 5 as an example, assume $V_{je11} > V_{je31} > V_{je41} > V_{je21}$; the first ranking result is then: $V_{je11}, V_{je31}, V_{je41}, V_{je21}$.
Second step: take the first ranking result corresponding to the transfer amount parameter as the first column of the parameter matrix corresponding to the transfer amount parameter.
In an optional embodiment, because the first ranking result is used as the first column, the number of rows of the parameter matrix is the total number M of members in the cluster to be measured. Taking Fig. 5 as an example, the total number of rows of the parameter matrix is M = 4. The number of columns of the parameter matrix is also set in advance; optionally, the total number of columns is N = 1/2 * M. The embodiments of the present invention do not limit the total number of columns of the parameter matrix; for example, any positive integer less than or equal to M and greater than or equal to 1 may be chosen.
It will be understood that the parameter matrix may be a one-dimensional vector, that is, the parameter matrix corresponding to the transfer amount parameter may include only the first column.
Another optional embodiment may further include the following steps.
Third step: for each row of the parameter matrix corresponding to the transfer amount parameter, determine the first member corresponding to the first significance level parameter of the row.
Still taking Fig. 5 as an example, the parameter matrix includes 4 rows in total. The first significance level parameter of the first row is $V_{je11}$, and the corresponding first member is member A; the first significance level parameter of the second row is $V_{je31}$, and the corresponding first member is member C; the first significance level parameter of the third row is $V_{je41}$, and the corresponding first member is member D; the first significance level parameter of the fourth row is $V_{je21}$, and the corresponding first member is member B.
Fourth step: obtain the significance level parameters of the first target members that have an X-degree linking relationship with the first member corresponding to the row.
The initial value of X is 1; an X-degree linking relationship means that the first member and the first target member interact through at least X-1 other members.
Still taking Fig. 5 as an example, the first target members that have a first-degree linking relationship with member A, the first member of the first row of the parameter matrix, are member B, member C and member D. The significance level parameter of member B is $V_{je21}$; that of member C is $V_{je31}$; that of member D is $V_{je41}$. In the example shown in Fig. 5, no member has a second-degree linking relationship with the first member A.
The first target members that have a first-degree linking relationship with member C, the first member of the second row of the parameter matrix, are member A, member B and member D. The significance level parameter of member A is $V_{je11}$; that of member B is $V_{je21}$; that of member D is $V_{je41}$. In the example shown in Fig. 5, no member has a second-degree linking relationship with the first member C.
The first target members that have a first-degree linking relationship with member D, the first member of the third row of the parameter matrix, are member A and member C. The significance level parameter of member A is $V_{je11}$; that of member C is $V_{je31}$. The first target member that has a second-degree linking relationship with the first member D is member B.
The first target members that have a first-degree linking relationship with member B, the first member of the fourth row of the parameter matrix, are member A and member C. The significance level parameter of member A is $V_{je11}$; that of member C is $V_{je31}$. The first target member that has a second-degree linking relationship with the first member B is member D.
Fifth step: sort the significance level parameters of the first target members corresponding to the row in descending order to obtain the second ranking result corresponding to the row.
The ranking results of the first target members that have a first-degree linking relationship with the first member of each row are as follows:
The second ranking result corresponding to the first row is: $V_{je31}, V_{je41}, V_{je21}$; the second ranking result corresponding to the second row is: $V_{je11}, V_{je41}, V_{je21}$; the second ranking result corresponding to the third row is: $V_{je11}, V_{je31}$; the second ranking result corresponding to the fourth row is: $V_{je11}, V_{je31}$.
The second ranking results of the first target members that have a second-degree linking relationship with the first member of each row are as follows:
In the example shown in Fig. 5, no first target member has a second-degree linking relationship with the first member A of the first row or with the first member C of the second row, so the first row and the second row are not discussed here.
The only first target member that has a second-degree linking relationship with the first member D of the third row is member B, so the second ranking result is: $V_{je21}$. The only first target member that has a second-degree linking relationship with the first member B of the fourth row is member D, so the second ranking result is: $V_{je41}$.
Sixth step: determine the significance level parameters in the second ranking result corresponding to the row, in order, as elements of the row, until the number of elements in the row equals the total number of columns of the parameter matrix. If, after every significance level parameter in the second ranking result has been taken as an element of the row, the number of elements in the row is still less than the total number of columns, assign X+1 to X and return to the fourth step.
Still taking Fig. 5 as an example, suppose the total number of columns of the parameter matrix is N=4. For the first row, the second ranking result of the first object members having a one-degree linking relationship with first member A is Vje31, Vje41, Vje21. These are determined, in order, as elements of the first row until the row contains 4 elements. Once Vje31, Vje41 and Vje21 have all been taken as elements, the first row contains exactly 4 elements, so the first row is complete: [Vje11, Vje31, Vje41, Vje21].
For the second row, the second ranking result of the first object members having a one-degree linking relationship with first member C is Vje11, Vje41, Vje21. These are determined, in order, as elements of the second row until the row contains 4 elements. Once Vje11, Vje41 and Vje21 have all been taken as elements, the second row contains exactly 4 elements, so the second row is complete: [Vje31, Vje11, Vje41, Vje21].
For the third row, the second ranking result of the first object members having a one-degree linking relationship with first member D is Vje11, Vje31. These are determined, in order, as elements of the third row until the row contains 4 elements. Once Vje11 and Vje31 have both been taken as elements, the third row contains only 3 elements (3 < 4), so the third row is not yet complete. At this point its elements are [Vje41, Vje11, Vje31]; the fourth element of the third row is undetermined.
For the fourth row, the second ranking result of the first object members having a one-degree linking relationship with first member B is Vje11, Vje31. These are determined, in order, as elements of the fourth row until the row contains 4 elements. Once Vje11 and Vje31 have both been taken as elements, the fourth row contains only 3 elements (3 < 4), so the fourth row is not yet complete. At this point its elements are [Vje21, Vje11, Vje31]; the fourth element of the fourth row is undetermined.
The third step to the sixth step must be executed for every row until the number of elements in the row equals the total number of columns.
For the third row, after the fourth to sixth steps have been executed for the first time, the elements are [Vje41, Vje11, Vje31] and the fourth element is undetermined, so the procedure returns to the fourth step. Because X+1 has been assigned to X, X is 2 during the second execution of the fourth to sixth steps. In the fourth step, the significance level parameter of the first object member having a two-degree linking relationship with first member D of the third row is obtained; that first object member is member B, whose significance level parameter is Vje21. In the fifth step, Vje21 is sorted in descending order; since there is only one significance level parameter, the second ranking result is Vje21. In the sixth step, Vje21 is determined as an element of the third row; the third row then contains 4 elements, so it is complete: [Vje41, Vje11, Vje31, Vje21].
The fourth row is handled in the same way and is not described again here; its elements are [Vje21, Vje11, Vje31, Vje41].
Seventh step: repeat the fourth step to the sixth step until all elements of every row of the parameter matrix corresponding to this transfer amount parameter have been obtained.
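The seven steps above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the names `EDGES`, `SIGNIFICANCE`, `hop_distances` and `build_parameter_matrix` are hypothetical, the link structure is the one read off the Fig. 5 walk-through, and the numeric significance values are invented solely so that member A > member C > member D > member B. Each matrix cell holds the member whose significance level parameter occupies that position.

```python
from collections import deque

# Link structure read off the Fig. 5 walk-through (undirected links).
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "D")]
# Hypothetical significance values, chosen only so that A > C > D > B.
SIGNIFICANCE = {"A": 0.4, "C": 0.3, "D": 0.2, "B": 0.1}


def hop_distances(adjacency, start):
    """Breadth-first search: number of links on the shortest path from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    return dist


def build_parameter_matrix(edges, significance, n_cols):
    adjacency = {member: set() for member in significance}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    # Steps 1-2: the first column lists all members by descending significance.
    first_column = sorted(significance, key=significance.get, reverse=True)
    matrix = []
    for first_member in first_column:  # step 3: first member of the row
        dist = hop_distances(adjacency, first_member)
        row = [first_member]
        x = 1
        while len(row) < n_cols and x <= max(dist.values()):
            # Step 4: members exactly X links away; step 5: descending sort.
            targets = sorted((m for m, d in dist.items() if d == x),
                             key=significance.get, reverse=True)
            row.extend(targets[: n_cols - len(row)])  # step 6: fill the row
            x += 1  # assign X+1 to X and repeat from step 4
        matrix.append(row)
    return matrix


MATRIX = build_parameter_matrix(EDGES, SIGNIFICANCE, n_cols=4)
for row in MATRIX:
    print(row)
```

With these assumed values the rows come out in the same member order as the walk-through (A, C, D, B for the first row, and so on). A row can stay shorter than `n_cols` if the first member cannot reach enough other members; this sketch does not pad such rows.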
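Taking Fig. 5 as an example, the parameter matrix corresponding to the transfer amount parameter, assembled from the rows obtained above, is:
[Vje11, Vje31, Vje41, Vje21]
[Vje31, Vje11, Vje41, Vje21]
[Vje41, Vje11, Vje31, Vje21]
[Vje21, Vje11, Vje31, Vje41]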
2. Obtain the parameter matrix corresponding to the transfer count parameter.
First step: sort the significance level parameters of all members for this transfer count parameter in descending order to obtain the first ranking result corresponding to this transfer count parameter.
Still taking Fig. 5 as an example, assume Vcs11 > Vcs21 ≥ Vcs31 > Vcs41; the first ranking result is then Vcs11, Vcs21, Vcs31, Vcs41.
Second step: take the first ranking result corresponding to this transfer count parameter as the first column of elements in the parameter matrix corresponding to this transfer count parameter.
In an optional embodiment, because the first ranking result is used as the first column, the number of rows of the parameter matrix is the total number M of members in the cluster to be measured. Taking Fig. 5 as an example, the total number of rows of the parameter matrix is M=4. The number of columns of the parameter matrix is also preset; preferably, the total number of columns is N=1/2*M. The embodiments of the present invention do not limit the total number of columns of the parameter matrix; for example, any positive integer greater than or equal to 1 and less than or equal to M may also be chosen.
It can be understood that the parameter matrix may be a one-dimensional vector; that is, the parameter matrix corresponding to this transfer count parameter may contain only the first column of elements.
Another optional embodiment may further include the following steps.
Third step: for each row of the parameter matrix corresponding to this transfer count parameter, determine the first member corresponding to the first significance level parameter of the row.
Still taking Fig. 5 as an example, the parameter matrix has 4 rows in total. The first significance level parameter of the first row is Vcs11, and the corresponding first member is member A; the first significance level parameter of the second row is Vcs21, and the corresponding first member is member B; the first significance level parameter of the third row is Vcs31, and the corresponding first member is member C; the first significance level parameter of the fourth row is Vcs41, and the corresponding first member is member D.
Fourth step: obtain the significance level parameters of the first object members having an X-degree linking relationship with the first member corresponding to the row.
The initial value of X is 1. An X-degree linking relationship means that the first member and the first object member interact through at least X-1 intermediate members.
Still taking Fig. 5 as an example, in the parameter matrix the first object members having a one-degree linking relationship with first member A of the first row include member B, member C and member D. The significance level parameter of member B is Vcs21, that of member C is Vcs31, and that of member D is Vcs41. In the example shown in Fig. 5, no member has a two-degree linking relationship with first member A.
In the parameter matrix, the first object members having a one-degree linking relationship with first member B of the second row include member A and member C. The significance level parameter of member A is Vcs11, and that of member C is Vcs31. The first object member having a two-degree linking relationship with first member B is member D.
In the parameter matrix, the first object members having a one-degree linking relationship with first member C of the third row include member A, member B and member D. The significance level parameter of member A is Vcs11, that of member B is Vcs21, and that of member D is Vcs41. In the example shown in Fig. 5, no member has a two-degree linking relationship with first member C.
In the parameter matrix, the first object members having a one-degree linking relationship with first member D of the fourth row include member A and member C. The significance level parameter of member A is Vcs11, and that of member C is Vcs31. The first object member having a two-degree linking relationship with first member D is member B.
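The X-degree linking relationships listed above can be reproduced with a breadth-first search. The sketch below is illustrative only: `EDGES`, `build_adjacency` and `members_by_degree` are hypothetical names, the link list is the one implied by the Fig. 5 walk-through, and "X-degree" is interpreted as "the shortest path uses X links", which is an assumption consistent with the text.

```python
from collections import deque

# Link structure implied by the Fig. 5 walk-through (undirected links).
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "D")]


def build_adjacency(edges):
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def members_by_degree(adjacency, start):
    """Map X to the sorted members whose shortest path from start has X links."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    grouped = {}
    for member, degree in dist.items():
        if degree > 0:  # skip the first member itself
            grouped.setdefault(degree, []).append(member)
    return {degree: sorted(names) for degree, names in grouped.items()}


ADJACENCY = build_adjacency(EDGES)
for first_member in "ABCD":
    print(first_member, members_by_degree(ADJACENCY, first_member))
```

For member B this yields members A and C at one degree and member D at two degrees, matching the description of the second row.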
Fifth step: sort the significance level parameters of the first object members corresponding to the row in descending order to obtain the second ranking result corresponding to the row.
The ranking results of the first object members having a one-degree linking relationship with the first member of each row are as follows:
The second ranking result corresponding to the first row is Vcs21, Vcs31, Vcs41; that of the second row is Vcs11, Vcs31; that of the third row is Vcs11, Vcs21, Vcs41; and that of the fourth row is Vcs11, Vcs31.
The second ranking results of the first object members having a two-degree linking relationship with the first member of each row are as follows:
In the example shown in Fig. 5, neither first member A of the first row nor first member C of the third row has any first object member with a two-degree linking relationship, so the first row and the third row are not discussed here.
The only first object member having a two-degree linking relationship with first member B of the second row is member D, so the second ranking result is Vcs41. The only first object member having a two-degree linking relationship with first member D of the fourth row is member B, so the second ranking result is Vcs21.
Sixth step: determine the significance level parameters in the second ranking result corresponding to the row, in order, as elements of the row, until the number of elements in the row equals the total number of columns of the parameter matrix. If, after every significance level parameter in the second ranking result has been taken as an element of the row, the number of elements in the row is still less than the total number of columns, assign X+1 to X and return to the fourth step.
Still taking Fig. 5 as an example, suppose the total number of columns of the parameter matrix is N=4. For the first row, the second ranking result of the first object members having a one-degree linking relationship with first member A is Vcs21, Vcs31, Vcs41. These are determined, in order, as elements of the first row until the row contains 4 elements. Once Vcs21, Vcs31 and Vcs41 have all been taken as elements, the first row contains exactly 4 elements, so the first row is complete: [Vcs11, Vcs21, Vcs31, Vcs41].
For the second row, the second ranking result of the first object members having a one-degree linking relationship with first member B is Vcs11, Vcs31. These are determined, in order, as elements of the second row until the row contains 4 elements. Once Vcs11 and Vcs31 have both been taken as elements, the second row contains only 3 elements (3 < 4), so the second row is not yet complete. At this point its elements are [Vcs21, Vcs11, Vcs31]; the fourth element of the second row is undetermined.
For the third row, the second ranking result of the first object members having a one-degree linking relationship with first member C is Vcs11, Vcs21, Vcs41. These are determined, in order, as elements of the third row until the row contains 4 elements. Once Vcs11, Vcs21 and Vcs41 have all been taken as elements, the third row contains exactly 4 elements, so the third row is complete: [Vcs31, Vcs11, Vcs21, Vcs41].
For the fourth row, the second ranking result of the first object members having a one-degree linking relationship with first member D is Vcs11, Vcs31. These are determined, in order, as elements of the fourth row until the row contains 4 elements. Once Vcs11 and Vcs31 have both been taken as elements, the fourth row contains only 3 elements (3 < 4), so the fourth row is not yet complete. At this point its elements are [Vcs41, Vcs11, Vcs31]; the fourth element of the fourth row is undetermined.
The third step to the sixth step must be executed for every row until the number of elements in the row equals the total number of columns.
For the second row, after the fourth to sixth steps have been executed for the first time, the elements are [Vcs21, Vcs11, Vcs31] and the fourth element is undetermined, so the procedure returns to the fourth step. Because X+1 has been assigned to X, X is 2 during the second execution of the fourth to sixth steps. In the fourth step, the significance level parameter of the first object member having a two-degree linking relationship with first member B of the second row is obtained; that first object member is member D, whose significance level parameter is Vcs41. In the fifth step, Vcs41 is sorted in descending order; since there is only one significance level parameter, the second ranking result is Vcs41. In the sixth step, Vcs41 is determined as an element of the second row; the second row then contains 4 elements, so it is complete: [Vcs21, Vcs11, Vcs31, Vcs41].
The fourth row is handled in the same way and is not described again here; its elements are [Vcs41, Vcs11, Vcs31, Vcs21].
Seventh step: repeat the fourth step to the sixth step until all elements of every row of the parameter matrix corresponding to this transfer count parameter have been obtained.
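Taking Fig. 5 as an example, the parameter matrix corresponding to the transfer count parameter, assembled from the rows obtained above, is:
[Vcs11, Vcs21, Vcs31, Vcs41]
[Vcs21, Vcs11, Vcs31, Vcs41]
[Vcs31, Vcs11, Vcs21, Vcs41]
[Vcs41, Vcs11, Vcs31, Vcs21]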
In the above embodiment, "obtaining the parameter matrix corresponding to the transfer amount parameter" and "obtaining the parameter matrix corresponding to the transfer count parameter" may be executed simultaneously or one after the other.
Third method: building on the second method, another specific implementation of "determining the significance level parameters of each member of the cluster to be measured for this class of preset behavioral parameters as elements of the parameter matrix corresponding to this class of preset behavioral parameters" may be as shown in Fig. 8 and includes:
Step S801: sort the significance level parameters of all members for this class of preset behavioral parameters in descending order to obtain the first ranking result corresponding to this class of preset behavioral parameters;
Step S802: take the first ranking result corresponding to this class of behavioral parameters as the first column of elements in the parameter matrix corresponding to this class of behavioral parameters;
Step S803: for each row of the parameter matrix corresponding to this class of behavioral parameters, determine the first member corresponding to the first significance level parameter of the row;
Step S804: obtain the significance level parameters of the first object members having an X-degree linking relationship with the first member corresponding to the row, where the initial value of X is 1 and an X-degree linking relationship means that the first member and the first object member interact through at least X-1 intermediate members;
Step S805: sort the significance level parameters of the first object members corresponding to the row in descending order to obtain the second ranking result corresponding to the row;
Step S806: determine the significance level parameters in the second ranking result corresponding to the row, in order, as elements of the row, until the number of elements in the row equals the total number of columns of the parameter matrix;
If, after the significance level parameters in the second ranking result have all been taken as elements of the row, the number of elements in the row is less than the total number of columns and X is less than a preset value, assign X+1 to X and return to step S804 (obtaining the significance level parameters of the first object members having an X-degree linking relationship with the first member corresponding to the row); the preset value is a positive integer greater than or equal to 1;
If, after the significance level parameters in the second ranking result have all been taken as elements of the row, the number of elements in the row is less than the total number of columns and X is equal to the preset value, supplement zero elements in the row until the number of elements in the row equals the total number of columns, so as to obtain all elements of each row.
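The difference from the second method is the cap on X and the zero padding. A minimal sketch of filling one row under that rule is given below; `fill_row`, `values_by_degree` and `max_degree` (standing for the preset value) are hypothetical names, and the numbers are invented for illustration only.

```python
def fill_row(first_value, values_by_degree, n_cols, max_degree):
    """Fill one row of the parameter matrix (steps S804 to S806).

    values_by_degree maps X to the significance values of the members that
    have an X-degree linking relationship with the row's first member.
    """
    row = [first_value]
    for x in range(1, max_degree + 1):
        # S805: descending sort of the X-degree significance values.
        ranked = sorted(values_by_degree.get(x, []), reverse=True)
        # S806: take values in order until the row is full.
        row.extend(ranked[: n_cols - len(row)])
        if len(row) == n_cols:
            break
    # X reached the preset value and the row is still short: pad with zeros.
    row.extend([0.0] * (n_cols - len(row)))
    return row


# Hypothetical values for a first member with two one-degree neighbours
# (0.4 and 0.3) and one two-degree neighbour (0.1).
ROW_CAP_1 = fill_row(0.2, {1: [0.3, 0.4], 2: [0.1]}, n_cols=4, max_degree=1)
ROW_CAP_2 = fill_row(0.2, {1: [0.3, 0.4], 2: [0.1]}, n_cols=4, max_degree=2)
print(ROW_CAP_1)
print(ROW_CAP_2)
```

With a preset value of 1 the two-degree neighbour is never consulted and the last position is a zero element; with a preset value of 2 the row is filled exactly as in the second method.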
The above steps are illustrated below, still taking Fig. 5 as an example and assuming the preset value is 1. The preset value may also be 2, 3, and so on; the embodiments of the present application do not limit this.
1. Obtain the parameter matrix corresponding to the transfer amount parameter.
The first execution of step S801 to step S805 is the same as the first execution of the first to fifth steps of "obtaining the parameter matrix corresponding to the transfer amount parameter" in the second method. When step S806 is executed, the significance level parameters in the second ranking result corresponding to the row are determined, in order, as elements of the row. Because X = preset value = 1, if the number of elements in the row is still less than the total number of columns, zero elements are supplemented in the row until the number of elements in the row equals the total number of columns. These steps are executed for every row to obtain all elements of each row.
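The parameter matrix corresponding to the transfer amount parameter (with total columns N=4, as above) is:
[Vje11, Vje31, Vje41, Vje21]
[Vje31, Vje11, Vje41, Vje21]
[Vje41, Vje11, Vje31, 0]
[Vje21, Vje11, Vje31, 0]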
2. Obtain the parameter matrix corresponding to the transfer count parameter.
The first execution of step S801 to step S805 is the same as the first execution of the first to fifth steps of "obtaining the parameter matrix corresponding to the transfer count parameter" in the second method. When step S806 is executed, the significance level parameters in the second ranking result corresponding to the row are determined, in order, as elements of the row. Because X = preset value = 1, if the number of elements in the row is still less than the total number of columns, zero elements are supplemented in the row until the number of elements in the row equals the total number of columns. These steps are executed for every row to obtain all elements of each row.
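The parameter matrix corresponding to the transfer count parameter (with total columns N=4, as above) is:
[Vcs11, Vcs21, Vcs31, Vcs41]
[Vcs21, Vcs11, Vcs31, 0]
[Vcs31, Vcs11, Vcs21, Vcs41]
[Vcs41, Vcs11, Vcs31, 0]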
In an optional embodiment, the feature to be measured provided in the embodiments of the present invention further includes the significance level parameter of each member of the cluster to be measured for the social parameter. Accordingly, obtaining the feature to be measured of the cluster to be measured further includes:
First step: for each member to be measured, obtain the probability that each member in the cluster to be measured interacts with that member to be measured; the member to be measured is any member of the cluster to be measured.
Still taking Fig. 5 as an example, in an optional embodiment, assume that if a member to be measured has a one-degree linking relationship with K members, the probability that the member to be measured interacts with any one of them is 1/K. For example, member A has a one-degree linking relationship with member B, member C and member D, so, optionally, the initial probability that member A interacts with each of member B, member C and member D is 1/3.
The probabilities that each member in Fig. 5 interacts with each of the other members can be represented in the form of a matrix M3.
The first row gives, in order, the initial probabilities that member A, member B, member C and member D interact with member A; the second row gives, in order, the initial probabilities that member A, member B, member C and member D interact with member B; the third row gives, in order, the initial probabilities that member A, member B, member C and member D interact with member C; the fourth row gives, in order, the initial probabilities that member A, member B, member C and member D interact with member D.
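Under the 1/K assumption, M3 can be derived from the link structure. The sketch below is illustrative: `MEMBERS`, `NEIGHBOURS` and `interaction_matrix` are hypothetical names, the neighbour sets are the one-degree relationships described for Fig. 5, and a member is assumed not to interact with itself.

```python
from fractions import Fraction

MEMBERS = ["A", "B", "C", "D"]
# One-degree linking relationships described for Fig. 5.
NEIGHBOURS = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"A", "C"},
}


def interaction_matrix(members, neighbours):
    """Row i, column j: probability that member j interacts with member i.

    A member with K one-degree neighbours interacts with each of them with
    probability 1/K and with nobody else (assumption taken from the text).
    """
    matrix = []
    for target in members:
        row = []
        for source in members:
            if target in neighbours[source]:
                row.append(Fraction(1, len(neighbours[source])))
            else:
                row.append(Fraction(0))
        matrix.append(row)
    return matrix


M3 = interaction_matrix(MEMBERS, NEIGHBOURS)
for row in M3:
    print([str(p) for p in row])
```

Each column of the resulting matrix sums to 1, because every member distributes its interaction probability over its own one-degree neighbours.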
Second step: obtain the probability that each member in the cluster to be measured initiates social behavior.
The probability that each member in the cluster to be measured initiates social behavior may be obtained from historical data of the cluster to be measured. In one example, assume that every member in the cluster to be measured initiates social behavior with the same probability; for Fig. 5, assume this probability is 1/4 for each member.
Third step: combine the probability that each member initiates social behavior with the corresponding initial probability of interacting with the member to be measured, to obtain the final probability that each member carries out social behavior with the member to be measured.
Fourth step: determine the sum of the final probabilities that the members carry out social behavior with the member to be measured as the social probability of the member to be measured, so as to obtain the social probability of each member in the cluster to be measured.
Still taking Fig. 5 as an example, the third step and the fourth step can be expressed by the following formulas:
The following iteration is carried out repeatedly:
Vsj(1) = M3 * Vsj(0); Vsj(2) = M3 * Vsj(1); Vsj(3) = M3 * Vsj(2); ...; Vsj(n3) = M3 * Vsj(n3-1)
Alternatively,
Vsj(1) = α3 * M3 * Vsj(0) + (1 - α3) * e3; Vsj(2) = α3 * M3 * Vsj(1) + (1 - α3) * e3;
Vsj(3) = α3 * M3 * Vsj(2) + (1 - α3) * e3; ...; Vsj(n3) = α3 * M3 * Vsj(n3-1) + (1 - α3) * e3
where α3 may be any value greater than 0 and less than 1;
Optionally,
The iteration continues until two consecutive results Vsj(n3) and Vsj(n3-1) no longer change, or until their difference falls within a preset range, at which point the iteration is considered to have converged. Assume the converged result is Vsj(n3) = [Vsj11, Vsj21, Vsj31, Vsj41]^T, where the social probability of member A is Vsj11, that of member B is Vsj21, that of member C is Vsj31, and that of member D is Vsj41.
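The damped iteration can be sketched as follows. Several things here are assumptions rather than facts from the text: the entries of `M3` follow from applying the 1/K rule to the Fig. 5 links, `e3` is taken to be the uniform vector, `alpha` is set to 0.85 (the text only requires a value between 0 and 1), and the tolerance is arbitrary. The function name `iterate_social_probability` is hypothetical.

```python
# M3 under the 1/K assumption for Fig. 5: column j holds the probabilities
# that member j interacts with members A, B, C, D (rows).
M3 = [
    [0.0, 1 / 2, 1 / 3, 1 / 2],
    [1 / 3, 0.0, 1 / 3, 0.0],
    [1 / 3, 1 / 2, 0.0, 1 / 2],
    [1 / 3, 0.0, 1 / 3, 0.0],
]
# Every member initiates social behavior with probability 1/4.
V0 = [0.25, 0.25, 0.25, 0.25]


def iterate_social_probability(m3, v0, alpha=0.85, tol=1e-9, max_iter=1000):
    """Repeat V(n) = alpha * M3 * V(n-1) + (1 - alpha) * e3 until converged."""
    n = len(v0)
    e3 = [1.0 / n] * n  # assumed uniform; the source does not define e3 here
    current = list(v0)
    for _ in range(max_iter):
        updated = [
            alpha * sum(m3[i][j] * current[j] for j in range(n))
            + (1 - alpha) * e3[i]
            for i in range(n)
        ]
        # Converged when two consecutive results differ by less than tol.
        if max(abs(a - b) for a, b in zip(updated, current)) < tol:
            return updated
        current = updated
    return current


SOCIAL = iterate_social_probability(M3, V0)
print(dict(zip("ABCD", SOCIAL)))
```

Under these assumptions members A and C end up with equal social probabilities that are larger than those of members B and D, which reflects their larger number of one-degree links.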
Fifth step: based on the social probability of each member in the cluster to be measured, obtain the significance level parameter of each member of the cluster to be measured for the social parameter.
The fifth step can be implemented in three ways:
First, the social probability of each member in the cluster to be measured is used directly as the significance level parameter of that member for the social parameter.
Taking Fig. 5 as an example, the significance level parameters of the members of the cluster to be measured for the social parameter are Vsj11, Vsj21, Vsj31 and Vsj41.
Second, the significance level parameters of the members of the cluster to be measured for this social parameter are used as the elements of the parameter matrix corresponding to this social parameter.
1. Sort the significance level parameters of all members for this social parameter in descending order to obtain the first ranking result corresponding to this social parameter.
Taking Fig. 5 as an example, assume Vsj11 ≥ Vsj31 > Vsj21 ≥ Vsj41; the first ranking result is then Vsj11, Vsj31, Vsj21, Vsj41.
2. Take the first ranking result corresponding to this social parameter as the first column of elements in the parameter matrix corresponding to this social parameter.
3. For each row of the parameter matrix corresponding to this social parameter, determine the first member corresponding to the first significance level parameter of the row.
4. Obtain the significance level parameters of the first object members having an X-degree linking relationship with the first member corresponding to the row.
5. Sort the significance level parameters of the first object members corresponding to the row in descending order to obtain the second ranking result corresponding to the row.
6. Determine the significance level parameters in the second ranking result corresponding to the row, in order, as elements of the row, until the number of elements in the row equals the total number of columns of the parameter matrix. If, after every significance level parameter in the second ranking result has been taken as an element of the row, the number of elements in the row is still less than the total number of columns, assign X+1 to X and return to step 4.
The parameter matrix corresponding to the social parameter is as follows:
Third, the significance level parameters of the members of the cluster to be measured for this social parameter, together with zero elements, are used as the elements of the parameter matrix corresponding to this social parameter.
Steps 1 to 6 of this method are the same as step S801 to step S806; assume the preset value is 1.
1. Sort the significance level parameters of all members for this social parameter in descending order to obtain the first ranking result corresponding to this social parameter.
Taking Fig. 5 as an example, assume Vsj11 ≥ Vsj31 > Vsj21 ≥ Vsj41; the first ranking result is then Vsj11, Vsj31, Vsj21, Vsj41.
2. Take the first ranking result corresponding to this social parameter as the first column of elements in the parameter matrix corresponding to this social parameter.
3. For each row of the parameter matrix corresponding to this social parameter, determine the first member corresponding to the first significance level parameter of the row.
4. Obtain the significance level parameters of the first object members having an X-degree linking relationship with the first member corresponding to the row.
5. Sort the significance level parameters of the first object members corresponding to the row in descending order to obtain the second ranking result corresponding to the row.
6. Determine the significance level parameters in the second ranking result corresponding to the row, in order, as elements of the row, until the number of elements in the row equals the total number of columns of the parameter matrix. If, after every significance level parameter in the second ranking result has been taken as an element of the row, the number of elements in the row is still less than the total number of columns and X=1, supplement zero elements in the row until the number of elements in the row equals the total number of columns, so as to obtain all elements of each row.
The parameter matrix corresponding to the social parameter is as follows:
In an optional embodiment, one class of preset behavioral parameters (for distinction, called the preset first target behavioral parameter) may be a combination of at least two classes of preset behavioral parameters (for distinction, called the at least two classes of preset second target behavioral parameters), or a combination of at least one class of preset behavioral parameters (for distinction, called the at least one class of preset second target behavioral parameters) and the social parameter. In that case, the significance level parameters of the members of the cluster to be measured for the preset first target behavioral parameter include: a combination of the significance level parameters of the members of the cluster to be measured for each class of preset second target behavioral parameters; or a combination of the significance level parameters of the members of the cluster to be measured for the at least one class of preset second target behavioral parameters with the significance level parameters of the corresponding members of the cluster to be measured for the social parameter.
For example, the at least one class of preset behavioral parameters may include a transfer parameter, which is a combination of the transfer amount parameter and the transfer count parameter.
The significance level parameters of the members of the cluster to be measured for the transfer parameter are then:
a linear or non-linear combination of the significance level parameter of each member of the cluster to be measured for the transfer amount parameter with the significance level parameter of the corresponding member for the transfer count parameter. For example, the significance level parameter of each member of the cluster to be measured for the transfer parameter is: first preset weight * (significance level parameter of the member for the transfer amount parameter) + second preset weight * (significance level parameter of the member for the transfer count parameter).
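The linear combination can be sketched per member as follows. The function name `combine_significance` and all numeric values are hypothetical; only the weighted-sum form comes from the text.

```python
def combine_significance(amount, count, w_amount=0.5, w_count=0.5):
    """Weighted sum of two significance level parameters for each member."""
    return {
        member: w_amount * amount[member] + w_count * count[member]
        for member in amount
    }


# Hypothetical significance values for the transfer amount and transfer
# count parameters of the four members.
AMOUNT = {"A": 0.4, "B": 0.1, "C": 0.3, "D": 0.2}
COUNT = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}

TRANSFER = combine_significance(AMOUNT, COUNT)
print(TRANSFER)
```

The combined values can then be fed into the same matrix construction as any other class of behavioral parameters; a non-linear combination would only change the body of the function.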
Still taking Fig. 5 as an example, assume the first preset weight and the second preset weight are both 1/2; the parameter matrix corresponding to the transfer parameter is then as follows:
In an alternative embodiment, the core members of the cluster to be measured can be obtained based on the parameter matrix corresponding to at least one class of behavioral parameters.
In the parameter matrix corresponding to each class of behavioral parameters, the significance level parameters in the first column are arranged in descending order, and within each row the elements other than the first are also arranged in descending order. Therefore, candidate core members with larger significance level parameters can be obtained from the parameter matrix of each class of behavioral parameters, and the core members of the cluster to be measured can be determined by combining the candidate core members of all classes.
For example, member A, who is a candidate core member for both the transfer amount parameter and the transfer count parameter, is determined to be a core member of the cluster to be measured.
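One possible reading of this selection is sketched below. It assumes that the candidates of a class are its `top_k` members by significance level parameter and that "combining" means taking the members that are candidates in every class; both choices, the function names and the numeric values are assumptions made for illustration, not details stated in the text.

```python
def candidate_core_members(importance, top_k):
    """Return the top_k members with the largest significance level parameters."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    return set(ranked[:top_k])


def core_members(importance_by_parameter, top_k=2):
    """Members that are candidates for every class of behavioral parameter."""
    candidate_sets = [
        candidate_core_members(importance, top_k)
        for importance in importance_by_parameter.values()
    ]
    return set.intersection(*candidate_sets) if candidate_sets else set()


# Hypothetical significance level parameters per class of behavioral parameter.
importance_by_parameter = {
    "transfer_amount": {"A": 0.5, "B": 0.3, "C": 0.2},
    "transfer_count": {"A": 0.6, "C": 0.3, "B": 0.1},
}

core = core_members(importance_by_parameter, top_k=2)
```

With these values the candidates are {A, B} for the transfer amount and {A, C} for the transfer count, so only member A is selected, which matches the example in the text.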
So that those skilled in the art can better understand the group type recognition method provided in the embodiment of the present invention, a specific example is described below.
Still taking Fig. 5 as an example, assume that the cluster to be measured is a community that includes two groups, group 1 and group 2. Group 1 includes member A, member B and member C; group 2 includes member A, member C and member D.
First, the feature to be measured of the community is obtained.
Assume that the feature to be measured of the community includes:
the parameter matrix corresponding to the transfer amount parameter, the parameter matrix corresponding to the transfer count parameter, and the parameter matrix corresponding to the social parameter (the three matrices are not reproduced in this text).
Second, the feature to be measured is input into the pre-built group type prediction model.
That is, the feature to be measured of the cluster to be measured consists of the above three matrices in Fig. 1.
Based on the above three matrices, the pre-built group type prediction model can obtain the community structure of the community, and can thus predict the probability that the community belongs to each class of known community.
Assume that the community structure of the community is as follows: each member transfers money level by level to a superior member, members seldom transfer money to one another, and there is a notion of hierarchy.
Finally, the pre-built group type prediction model outputs the following prediction probabilities: 95% that the community is a multi-level marketing community, 3% that it is a gambling community, 2% that it is a pornographic website propagation community, and 0% that it is a legal community.
It can be understood that the group type prediction model mentioned above is obtained by training a neural network. The method for constructing the group type prediction model is described below.
In the first step, multiple positive sample clusters and multiple negative sample clusters are obtained, yielding multiple sample clusters.
Positive sample clusters may be illegal clusters; negative sample clusters may be legal clusters.
In the embodiment of the present invention, legal clusters are all clusters other than illegal clusters.
In the second step, the training features corresponding to each sample cluster are obtained.
The training features of a sample cluster include: the significance level parameters of each member of the sample cluster for at least one class of default behavioral parameters.
The process of obtaining the training features of a sample cluster is the same as the process of obtaining the feature to be measured of the cluster to be measured; refer to that process, which is not repeated here.
In the third step, the training features of each sample cluster are used as the training input of a neural network, and the group type prediction model is obtained by training.
In an alternative embodiment, the third step may specifically include:
1. The training features of each sample cluster are used as the training input of the neural network to obtain the prediction probabilities that each sample cluster belongs to each class of known cluster, yielding the predicted vector of each sample cluster.
2. Based on the result of comparing the predicted vector of each sample cluster with the true vector of that sample cluster, the feature extraction parameters and the probability calculation parameters in the neural network are updated.
Here, the feature extraction parameters are at least used to extract the cluster structure of a sample cluster from its training features; the probability calculation parameters are used to obtain, based on the cluster structure of the sample cluster, the prediction probabilities that the sample cluster belongs to each class of known cluster.
3. Return to step 1 until the comparison result meets the termination condition, at which point the group type prediction model is obtained.
Executing steps 1 and 2 once is called one iteration. In each iteration, the neural network estimates, based on the updated feature extraction parameters and probability calculation parameters, the prediction probabilities that each sample cluster belongs to each class of known cluster.
In the embodiment of the present invention, "the group type prediction model takes minimizing the length of the difference vector between the predicted vector and the true vector of a sample cluster as its training objective" means that the closer the predicted vector of a sample cluster is to its true vector, the better. To reach this training objective, the neural network can be trained by the following methods; the embodiment of the present invention provides, but is not limited to, these methods.
First method: obtain the difference vector between the predicted vector of each sample cluster and the true vector of that sample cluster; sum the difference vectors of all sample clusters to obtain a sum vector; update the feature extraction parameters and the probability calculation parameters in the neural network according to the sum vector.
Over successive iterations, the feature extraction parameters and the probability calculation parameters in the neural network are updated continually, so that the length of the sum vector becomes smaller and smaller.
Second method: obtain the difference vector between the predicted vector of each sample cluster and the true vector of that sample cluster; update the feature extraction parameters and the probability calculation parameters in the neural network based on the difference vector of each sample cluster.
Over successive iterations, the feature extraction parameters and the probability calculation parameters in the neural network are updated continually, so that the length of the difference vector of each sample cluster becomes smaller and smaller.
Third method: obtain the difference vector between the predicted vector of each sample cluster and the true vector of that sample cluster; for each dimension of the difference vectors, take from the difference vectors of all sample clusters the difference probability with the largest absolute value in that dimension, thereby obtaining the largest-absolute-value difference probability of every dimension; form a target difference vector from these difference probabilities; update the feature extraction parameters and the probability calculation parameters based on the target difference vector.
Over successive iterations, the feature extraction parameters and the probability calculation parameters in the neural network are updated continually, so that the length of the target difference vector becomes smaller and smaller.
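The three quantities used by these methods (the per-sample difference vectors, the sum vector and the target difference vector) can be computed as in the following sketch. The text does not say how the "length" of a vector is measured; the Euclidean norm is assumed here. The function names and the predicted and true vectors are hypothetical.

```python
import math


def difference_vector(predicted, true):
    """Element-wise difference: predicted probability minus true probability."""
    return [p - t for p, t in zip(predicted, true)]


def length(vector):
    """Euclidean length (an assumption; the text only says 'length')."""
    return math.sqrt(sum(x * x for x in vector))


def sum_vector(diffs):
    """First method: sum of the difference vectors of all sample clusters."""
    return [sum(column) for column in zip(*diffs)]


def target_difference_vector(diffs):
    """Third method: per dimension, the difference with the largest absolute value."""
    return [max(column, key=abs) for column in zip(*diffs)]


# Two hypothetical sample clusters, four classes of known cluster.
predicted = [[0.95, 0.04, 0.01, 0.0], [0.1, 0.8, 0.1, 0.0]]
true = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

diffs = [difference_vector(p, t) for p, t in zip(predicted, true)]
summed = sum_vector(diffs)
target = target_difference_vector(diffs)
per_sample_lengths = [length(d) for d in diffs]  # used by the second method
```

The second method works on `diffs` directly, one sample cluster at a time, whereas the first and third methods first reduce all difference vectors to a single vector.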
Still taking the difference vector (-0.05, 0.04, 0.01, 0) as an example: this difference vector has four dimensions. The number of dimensions of a difference vector is determined by, and equal to, the number of classes of known clusters that the neural network can predict. Each dimension of the difference vector corresponds to one class of known cluster, and its value is the difference probability between the prediction probability, given by the neural network, that the sample cluster belongs to that class of known cluster and the true probability that the sample cluster belongs to that class. For example, the difference probability 0.04 above = the prediction probability 0.04, given by the neural network, that the sample cluster belongs to the multi-level marketing group type, minus the true probability 0 that the sample cluster belongs to the multi-level marketing group type.
Whichever of the above methods is used to update the parameters of the neural network, the training objective is to minimize the length of the difference vector between the predicted vector and the true vector of each sample cluster.
The termination condition may be: the length of the difference vector of every sample cluster is less than or equal to a first preset length threshold; and/or the length of the sum vector is less than or equal to a second preset length threshold; and/or the length of the target difference vector is less than or equal to a third preset length threshold.
The first preset length threshold, the second preset length threshold and the third preset length threshold may be equal or unequal; the embodiment of the present invention does not specifically limit this.
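A minimal sketch of such a termination check follows. Because the text joins the three conditions with "and/or", the hypothetical flag `require_all` lets the caller choose whether one satisfied condition is enough or all three must hold; the function name and the numeric values are likewise illustrative assumptions.

```python
def termination_met(per_sample_lengths, sum_length, target_length,
                    first_threshold, second_threshold, third_threshold,
                    require_all=False):
    """Evaluate the three termination conditions described above."""
    conditions = [
        # Every sample cluster's difference vector is short enough.
        all(l <= first_threshold for l in per_sample_lengths),
        # The sum vector is short enough.
        sum_length <= second_threshold,
        # The target difference vector is short enough.
        target_length <= third_threshold,
    ]
    return all(conditions) if require_all else any(conditions)


# Hypothetical lengths and thresholds.
stop_any = termination_met([0.05, 0.2], 0.3, 0.25, 0.1, 0.5, 0.1)
stop_all = termination_met([0.05, 0.2], 0.3, 0.25, 0.1, 0.5, 0.1,
                           require_all=True)
```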
Optionally, the neural network may be a fully connected neural network (for example an MLP network, where MLP stands for Multi-Layer Perceptron), or a neural network of another form (for example a convolutional neural network or a deep neural network).
It should be noted that the parameters initially used by the neural network may be randomly initialized. The neural network first processes the training features of each sample cluster based on the random parameters and updates the parameters based on the processing result; it then processes the training features of each sample cluster again based on the updated parameters and updates the parameters again based on the new processing result. After multiple iterations, training stops if the number of iterations exceeds a preset number or the processing result meets the termination condition, and the final group type prediction model is obtained.
Optionally, the parameters of the neural network can be updated using the back-propagation gradient descent algorithm, which realizes the iterative training and convergence of the neural network.
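The iterate-and-update loop described above can be illustrated with a deliberately small stand-in: a single-layer softmax classifier trained by gradient descent in plain Python. This is not the network of the embodiment; the model, the cross-entropy gradient, the learning rate, the thresholds and the toy features are all assumptions chosen so that the example stays short and runnable.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, features):
    """Predicted vector: probability of each class of known cluster."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(scores)


def train(samples, n_classes, lr=0.5, max_iters=2000,
          length_threshold=0.1, seed=0):
    rng = random.Random(seed)
    n_features = len(samples[0][0])
    # Randomly initialized parameters, as described in the text.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
               for _ in range(n_classes)]
    for _ in range(max_iters):  # stop after a preset number of iterations
        grads = [[0.0] * n_features for _ in range(n_classes)]
        worst_length = 0.0
        for features, true in samples:
            diff = [p - t for p, t in zip(predict(weights, features), true)]
            worst_length = max(worst_length,
                               math.sqrt(sum(d * d for d in diff)))
            # Gradient of the softmax cross-entropy loss (assumed loss).
            for c in range(n_classes):
                for j in range(n_features):
                    grads[c][j] += diff[c] * features[j]
        if worst_length <= length_threshold:  # termination condition
            break
        for c in range(n_classes):
            for j in range(n_features):
                weights[c][j] -= lr * grads[c][j] / len(samples)
    return weights


# Toy training features (last entry acts as a bias term) and true vectors.
samples = [([1.0, 0.0, 1.0], [1.0, 0.0]), ([0.0, 1.0, 1.0], [0.0, 1.0])]
model = train(samples, n_classes=2)
```

Training ends either when every difference vector is shorter than the threshold or when the iteration budget is exhausted, mirroring the two stopping rules in the text.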
So that those skilled in the art can better understand the training process of the group type prediction model in the embodiment of the present invention, a specific example is described below.
Fig. 9 is a schematic diagram of a process of training a neural network provided in an embodiment of the present invention.
The group type prediction model includes at least: a convolutional layer, a pooling layer, a fully connected layer and a normalization layer.
The convolutional layer performs the convolution process shown at 91 in Fig. 9.
The convolutional layer can obtain the cluster structure of a sample cluster from the input training features of that sample cluster.
The convolutional layer is provided with multiple convolution kernels, each of which is convolved with the input training features to obtain the cluster structure of the sample cluster from the training features. The feature extraction parameters include the convolution kernels of the convolutional layer.
In an alternative embodiment, the training features may be the parameter matrices corresponding to at least one class of behavioral parameters.
Assuming there are P classes of behavioral parameters in total, P parameter matrices can be input at the input layer; in the convolutional layer, each parameter matrix can be convolved with the multiple convolution kernels. Assume that convolving the P parameter matrices yields Q first matrices. The value of Q is very large, far greater than P.
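The following sketch shows one way P parameter matrices could turn into Q first matrices. It assumes that every parameter matrix is convolved separately with every kernel, so that Q = P × K for K kernels, and it uses the "valid" sliding-window product without kernel flipping that is customary in convolutional networks. The matrices, kernels and function names are hypothetical.

```python
def convolve2d_valid(matrix, kernel):
    """Slide the kernel over the matrix (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(matrix) - kh + 1
    out_w = len(matrix[0]) - kw + 1
    return [
        [
            sum(matrix[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def convolution_layer(parameter_matrices, kernels):
    """Convolve each parameter matrix with each kernel: Q = P * K outputs."""
    return [convolve2d_valid(m, k) for m in parameter_matrices for k in kernels]


# P = 2 hypothetical parameter matrices and K = 3 hypothetical kernels.
parameter_matrices = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[9, 8, 7], [6, 5, 4], [3, 2, 1]],
]
kernels = [[[1, 0], [0, 1]], [[0, 1], [1, 0]], [[1, 1], [1, 1]]]

first_matrices = convolution_layer(parameter_matrices, kernels)
```

Here Q is 6; with the many kernels of a real model Q becomes far larger than P, as the text notes.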
A pooling layer is added after the convolutional layer because it reduces the amount of data to be computed and thus increases computation speed. The pooling layer is also called a down-sampling layer.
The pooling layer includes multiple pooling kernels, which are used to further extract the cluster structure of the sample cluster from the Q first matrices. Assume that the pooling layer down-samples each of the Q first matrices, yielding Q second matrices. 92 in Fig. 9 is the process of down-sampling the Q convolved first matrices to obtain the Q second matrices.
The feature extraction parameters may also include the pooling kernels of the pooling layer.
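Down-sampling can be illustrated with max pooling over non-overlapping windows. The text does not name the pooling operation, so max pooling and the 2 × 2 window are assumptions, as are the function name and the sample matrix.

```python
def max_pool(matrix, size=2):
    """Keep the largest value of each non-overlapping size x size window."""
    rows = len(matrix) // size
    cols = len(matrix[0]) // size
    return [
        [
            max(matrix[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# A hypothetical 4 x 4 first matrix is reduced to a 2 x 2 second matrix.
first_matrix = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 9, 8],
    [2, 6, 7, 3],
]
second_matrix = max_pool(first_matrix)
```

The output holds a quarter of the input values, which is the reduction in data volume that motivates the pooling layer.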
The fully connected layer (FC) acts as the "classifier" of the whole neural network. Based on the probability calculation parameters, the fully connected layer combines the Q pooled second matrices and calculates the score of the sample cluster for each class of known cluster. 93 in Fig. 9 shows the process by which the fully connected layer combines the Q second matrices to obtain an overall cluster structure.
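A minimal sketch of this combination step is shown below. It assumes that the second matrices are flattened into one vector and that each class score is a weighted sum of that vector plus a bias; the weights and biases stand in for the probability calculation parameters. All names and values are hypothetical.

```python
def fully_connected(second_matrices, weights, biases):
    """Flatten the pooled matrices and compute one score per class."""
    flat = [value for matrix in second_matrices for row in matrix for value in row]
    return [
        sum(w * x for w, x in zip(class_weights, flat)) + bias
        for class_weights, bias in zip(weights, biases)
    ]


# Two hypothetical 1 x 2 second matrices and parameters for two classes.
second_matrices = [[[1.0, 2.0]], [[3.0, 4.0]]]
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]
biases = [0.0, 0.5]

scores = fully_connected(second_matrices, weights, biases)
```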
The normalization layer 94 normalizes the scores of the sample cluster for each class of known cluster and outputs the prediction probabilities that the sample cluster belongs to each class of known cluster.
That is, the probabilities that the sample cluster belongs to each class of known cluster sum to 1. Assume that the classes of known clusters are the gambling group type, the multi-level marketing group type and the pornographic website propagation type, and that the score of the sample cluster is 90 for the gambling group type (scores may also be decimals; the embodiment of the present invention does not specifically limit this), 80 for the multi-level marketing group type and 70 for the pornographic website propagation type. After the normalization layer 94, the prediction probability that the sample cluster belongs to the gambling group type is 90/(90+80+70) = 0.375, the prediction probability that it belongs to the multi-level marketing group type is 80/(90+80+70) ≈ 0.333, and the prediction probability that it belongs to the pornographic website propagation type is 70/(90+80+70) ≈ 0.292.
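The arithmetic of this example can be reproduced with a few lines. The function name `normalize_scores` is hypothetical; the scores are the ones given in the text, and the normalization is the plain score-over-sum rule used there.

```python
def normalize_scores(scores):
    """Divide each score by the sum of all scores so the results sum to 1."""
    total = sum(scores.values())
    return {name: value / total for name, value in scores.items()}


scores = {
    "gambling": 90,
    "multi-level marketing": 80,
    "pornographic website propagation": 70,
}
probabilities = normalize_scores(scores)
```

Rounded to three decimals, the results are 0.375, 0.333 and 0.292.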
Based on the difference vector between the predicted vector and the true vector of the sample cluster, the neural network updates the feature extraction parameters and the probability calculation parameters: for example, it updates each convolution kernel in the convolutional layer, each pooling kernel in the pooling layer, and the probability calculation parameters in the fully connected layer. Because the parameters are updated from the error between the predicted vector and the true vector, proceeding backwards from the last layer to the first, this update process is also called the back-propagation (backpropagation) process of the neural network.
After the parameters are updated, the training features of the sample cluster need to be input into the neural network again. The convolutional layer extracts the cluster structure of the sample cluster from the training features based on the updated convolution kernels; the pooling layer further extracts the cluster structure from the Q convolved first matrices based on the updated pooling kernels; the fully connected layer combines the Q pooled second matrices based on the updated probability calculation parameters and calculates the score of the sample cluster for each class of known cluster; the normalization layer 94 normalizes these scores and outputs the prediction probabilities that the sample cluster belongs to each class of known cluster.
Based on the difference vector between this new predicted vector and the true vector of the sample cluster, the neural network updates the feature extraction parameters and the probability calculation parameters again.
The above iterative process is executed repeatedly until the length of the difference vector between the predicted vector and the true vector of each sample cluster is less than or equal to the preset length threshold, or until the number of iterations reaches the preset number; the final group type prediction model is then obtained.
In an alternative embodiment, the group type prediction model may include multiple convolutional layers. The reason is that the cluster structure learned by a single convolutional layer is often local; the more convolutional layers there are, the more global the learned cluster structure becomes. A global cluster structure can therefore be obtained through multiple convolutional layers. The group type prediction model includes at least one group of convolutional layer and pooling layer; that is, the group type prediction model may also include: a convolutional layer, a pooling layer, a convolutional layer, a pooling layer, a fully connected layer and a normalization layer.
The ellipsis in Fig. 9 stands for the omitted repetitions of processes 91 and 92.
Fig. 10 is a structural diagram of a group type identification device provided in an embodiment of the present invention. The device includes:
a first acquisition module 1001, configured to obtain the feature to be measured of the cluster to be measured, the feature to be measured including at least: the significance level parameters of each member of the cluster to be measured for at least one class of default behavioral parameters;
an input module 1002, configured to input the feature to be measured into the pre-built group type prediction model;
wherein the group type prediction model is obtained by training a neural network with minimizing the length of the difference vector between the predicted vector and the true vector of a sample cluster as the training objective; the predicted vector includes the prediction probabilities that the sample cluster belongs to each class of known cluster; the true vector includes the true probabilities that the sample cluster belongs to each class of known cluster;
a second acquisition module 1003, configured to obtain the prediction probabilities, output by the group type prediction model, that the cluster to be measured belongs to each class of known cluster.
Optionally, the first acquisition module includes:
a first acquisition unit, configured to obtain the behavioral parameter values of each member of the cluster to be measured for at least one class of default behavioral parameters;
a first determination unit, configured to, for each class of default behavioral parameters, determine the significance level parameter of each member of the cluster to be measured for that class according to the behavioral parameter values of each member for that class, so as to obtain the significance level parameters of each member of the cluster to be measured for the at least one class of default behavioral parameters;
a second determination unit, configured to determine the feature to be measured of the cluster to be measured according at least to the significance level parameters of each member of the cluster to be measured for the at least one class of default behavioral parameters.
Optionally, the feature to be measured further includes: the significance level parameter of each member of the cluster to be measured for the social parameter; the first acquisition module further includes:
a second acquisition unit, configured to, for each member to be measured, obtain the probability that each member in the cluster to be measured interacts with the member to be measured; the member to be measured is any member of the cluster to be measured;
a third acquisition unit, configured to obtain the probability that each member in the cluster to be measured carries out social behavior;
a fourth acquisition unit, configured to combine the probability that each member carries out social behavior with the corresponding probability that the member interacts with the member to be measured, so as to obtain the final probability that each member carries out social behavior with the member to be measured;
a third determination unit, configured to determine the sum of the final probabilities that the members carry out social behavior with the member to be measured as the social probability of the member to be measured, so as to obtain the social probability of each member in the cluster to be measured;
a fifth acquisition unit, configured to obtain, based on the social probability of each member in the cluster to be measured, the significance level parameter of each member of the cluster to be measured for the social parameter.
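The computation performed by these units can be sketched as follows. The text only says that the two probabilities are "combined"; multiplying them is an assumption made here, as are the function name and all numeric values.

```python
def social_probabilities(interaction_prob, behavior_prob):
    """Social probability of each member to be measured.

    interaction_prob[m][t]: probability that member m interacts with member t.
    behavior_prob[m]: probability that member m carries out social behavior.
    """
    result = {}
    for target in behavior_prob:
        # Final probability of each other member acting toward `target`
        # (product of the two probabilities, an assumed combination rule),
        # summed over all other members.
        result[target] = sum(
            behavior_prob[m] * interaction_prob[m].get(target, 0.0)
            for m in behavior_prob
            if m != target
        )
    return result


# Hypothetical probabilities for a three-member cluster.
behavior_prob = {"A": 0.5, "B": 0.3, "C": 0.2}
interaction_prob = {
    "A": {"B": 0.5, "C": 0.5},
    "B": {"A": 1.0, "C": 0.0},
    "C": {"A": 1.0, "B": 0.0},
}

social = social_probabilities(interaction_prob, behavior_prob)
```

With these values member A, toward whom both other members always direct their interactions, receives the largest social probability.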
Optionally, the second determination unit includes:
a determination subunit, configured to, for each class of default behavioral parameters, take the significance level parameters of each member of the cluster to be measured for that class as the elements of the parameter matrix corresponding to that class, so as to obtain the parameter matrix corresponding to each of the at least one class of default behavioral parameters.
Optionally, the determination subunit includes:
a first acquisition submodule, configured to, for each class of default behavioral parameters, sort the significance level parameters of the members for that class in descending order to obtain a first ranking result corresponding to that class;
a first determination submodule, configured to take the first ranking result corresponding to that class as the first-column elements of the parameter matrix corresponding to that class, so as to obtain the first-column elements of the parameter matrix corresponding to each of the at least one class of default behavioral parameters.
Optionally, the determination subunit further includes:
a second determination submodule, configured to, for each row of the parameter matrix corresponding to that class of behavioral parameters, determine the first member corresponding to the first significance level parameter of the row;
a second acquisition submodule, configured to obtain the significance level parameters of the first target members that have an X-degree connection relationship with the first member corresponding to the row, where the initial value of X is 1 and an X-degree connection relationship means that the first member and the first target member interact through at least X-1 members;
a third acquisition submodule, configured to sort the significance level parameters of the first target members corresponding to the row in descending order to obtain a second ranking result corresponding to the row;
a third determination submodule, configured to take the significance level parameters in the second ranking result corresponding to the row, in order, as elements of the row until the number of elements in the row equals the total number of columns of the parameter matrix;
a return submodule, configured to, if the number of elements in the row is still less than the total number of columns after all significance level parameters in the second ranking result have been taken as elements of the row, assign X+1 to X and return to the step of obtaining the significance level parameters of the first target members that have an X-degree connection relationship with the first member corresponding to the row, so as to obtain all the elements of each row.
Optionally, the return submodule may be specifically configured to:
if, after all significance level parameters in the second ranking result have been taken as elements of the row, the number of elements in the row is less than the total number of columns and X is less than a preset value, assign X+1 to X and return to the step of obtaining the significance level parameters of the first target members that have an X-degree connection relationship with the first member corresponding to the row, the preset value being a positive integer greater than or equal to 1;
if, after all significance level parameters in the second ranking result have been taken as elements of the row, the number of elements in the row is less than the total number of columns and X is equal to the preset value, supplement zero elements in the row until the number of elements in the row equals the total number of columns, so as to obtain all the elements of each row.
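The row-filling procedure carried out by these submodules can be sketched as follows. The sketch treats an X-degree connection relationship as a shortest interaction path of exactly X hops, which is one reading of "interacting through at least X-1 members"; the interaction graph, the significance level parameters and the function names are hypothetical.

```python
from collections import deque


def members_at_degree(graph, start, degree):
    """Members whose shortest interaction path from `start` has `degree` hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == degree:
            continue  # no need to explore beyond the requested degree
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return [m for m, d in distance.items() if d == degree]


def build_parameter_matrix(importance, graph, total_columns, max_degree):
    # First column: significance level parameters in descending order.
    first_column = sorted(importance, key=importance.get, reverse=True)
    matrix = []
    for first_member in first_column:
        row = [importance[first_member]]
        degree = 1  # initial value of X
        while len(row) < total_columns and degree <= max_degree:
            targets = members_at_degree(graph, first_member, degree)
            ranked = sorted((importance[t] for t in targets), reverse=True)
            row.extend(ranked[: total_columns - len(row)])
            degree += 1  # assign X+1 to X and repeat
        # X reached the preset value: supplement zero elements.
        row.extend([0.0] * (total_columns - len(row)))
        matrix.append(row)
    return matrix


# Hypothetical cluster: A interacts with B and C, and C interacts with D.
importance = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}
graph = {"A": ["B", "C"], "B": ["A"], "C": ["A", "D"], "D": ["C"]}

matrix = build_parameter_matrix(importance, graph, total_columns=3, max_degree=2)
```

With `max_degree=1` the rows of B and D cannot be filled from direct contacts alone and end in a zero element, which is the padding case described in the last step.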
Optionally, the device further includes:
a third acquisition module, configured to obtain the core members of the cluster to be measured based on the parameter matrix corresponding to the at least one class of behavioral parameters.
Optionally, the device further includes:
a fourth acquisition module, configured to obtain multiple positive sample clusters and multiple negative sample clusters, yielding multiple sample clusters;
a fifth acquisition module, configured to obtain the training features corresponding to each sample cluster, the training features of a sample cluster including: the significance level parameters of each member of the sample cluster for at least one class of default behavioral parameters;
a training module, configured to use the training features of each sample cluster as the training input of a neural network and obtain the group type prediction model by training.
Optionally, the training module includes:
a sixth acquisition unit, configured to use the training features of each sample cluster as the training input of the neural network and obtain the prediction probabilities that each sample cluster belongs to each class of known cluster, yielding the predicted vector of each sample cluster;
an updating unit, configured to update the feature extraction parameters and the probability calculation parameters in the neural network based on the result of comparing the predicted vector of each sample cluster with the true vector of that sample cluster;
a trigger unit, configured to trigger the sixth acquisition unit until the comparison result meets the termination condition, at which point the group type prediction model is obtained;
wherein the feature extraction parameters are at least used to extract the cluster structure of a sample cluster from its training features; the probability calculation parameters are used to obtain, based on the cluster structure of the sample cluster, the prediction probabilities that the sample cluster belongs to each class of known cluster.
The embodiment of the present invention also provides a storage medium storing a program suitable for execution by a processor, the program being used to:
obtain the feature to be measured of the cluster to be measured, the feature to be measured including at least: the significance level parameters of each member of the cluster to be measured for at least one class of default behavioral parameters;
input the feature to be measured into the pre-built group type prediction model;
wherein the group type prediction model is obtained by training a neural network with minimizing the length of the difference vector between the predicted vector and the true vector of a sample cluster as the training objective; the predicted vector includes the prediction probabilities that the sample cluster belongs to each class of known cluster; the true vector includes the true probabilities that the sample cluster belongs to each class of known cluster;
obtain the prediction probabilities, output by the group type prediction model, that the cluster to be measured belongs to each class of known cluster.
Optionally, when obtaining the feature to be measured of the cluster to be measured, the processor is specifically configured to:
obtain the behavioral parameter values of each member of the cluster to be measured for at least one class of default behavioral parameters;
for each class of default behavioral parameters, determine the significance level parameter of each member of the cluster to be measured for that class according to the behavioral parameter values of each member for that class, so as to obtain the significance level parameters of each member of the cluster to be measured for the at least one class of default behavioral parameters;
determine the feature to be measured of the cluster to be measured according at least to the significance level parameters of each member of the cluster to be measured for the at least one class of default behavioral parameters.
Optionally, the feature to be measured further includes: the significance level parameter of each member of the cluster to be measured for the social parameter; when obtaining the feature to be measured of the cluster to be measured, the processor is further specifically configured to:
for each member to be measured, obtain the initial probability that each member in the cluster to be measured interacts with the member to be measured; the member to be measured is any member of the cluster to be measured;
obtain the probability that each member in the cluster to be measured carries out social behavior;
combine the probability that each member carries out social behavior with the corresponding probability that the member interacts with the member to be measured, so as to obtain the final probability that each member carries out social behavior with the member to be measured;
determine the sum of the final probabilities that the members carry out social behavior with the member to be measured as the social probability of the member to be measured, so as to obtain the social probability of each member in the cluster to be measured;
obtain, based on the social probability of each member in the cluster to be measured, the significance level parameter of each member of the cluster to be measured for the social parameter.
Optionally, when determining the feature to be measured of the cluster to be measured according at least to the significance level parameters of each member of the cluster to be measured for the at least one class of default behavioral parameters, the processor is specifically configured to:
for each class of default behavioral parameters, take the significance level parameters of each member of the cluster to be measured for that class as the elements of the parameter matrix corresponding to that class, so as to obtain the parameter matrix corresponding to each of the at least one class of default behavioral parameters.
Optionally, when taking the significance level parameters of each member of the cluster to be measured for a class of default behavioral parameters as the elements of the parameter matrix corresponding to that class, the processor is specifically configured to:
sort the significance level parameters of the members for that class in descending order to obtain a first ranking result corresponding to that class;
take the first ranking result corresponding to that class as the first-column elements of the parameter matrix corresponding to that class.
Optionally, when determining the significance level parameters respectively corresponding to each member of the cluster to be measured on that class of preset behavioral parameters as elements of the parameter matrix corresponding to that class of preset behavioral parameters, the processor is further configured to:
For each row of the parameter matrix corresponding to that class of behavioral parameters, determine the first member, that is, the member corresponding to the first significance level parameter of the row;
Obtain the significance level parameters of the first target members that have an X-degree linking relationship with the first member corresponding to the row, where the initial value of X is 1, and an X-degree linking relationship means that the first member and the first target member interact through at least X-1 members;
Sort the significance level parameters of the first target members corresponding to the row in descending order, to obtain a second ranking result corresponding to the row;
Determine the significance level parameters in the second ranking result corresponding to the row, in order, as elements of the row, until the number of elements contained in the row equals the total number of columns of the parameter matrix;
If all the significance level parameters in the second ranking result have become elements of the row and the number of elements contained in the row is still less than the total number of columns, assign X+1 to X and return to the step of obtaining the significance level parameters of the first target members that have an X-degree linking relationship with the first member corresponding to the row; so as to obtain all the elements respectively contained in each row.
Optionally, when executing the step of, if all the significance level parameters in the second ranking result have become elements of the row and the number of elements contained in the row is less than the total number of columns, assigning X+1 to X and returning to the step of obtaining the significance level parameters of the first target members that have an X-degree linking relationship with the first member corresponding to the row, the processor is specifically configured to:
If all the significance level parameters in the second ranking result have become elements of the row, the number of elements contained in the row is less than the total number of columns, and X is less than a preset value, assign X+1 to X and return to the step of obtaining the significance level parameters of the first target members that have an X-degree linking relationship with the first member corresponding to the row, where the preset value is a positive integer greater than or equal to 1;
If all the significance level parameters in the second ranking result have become elements of the row, the number of elements contained in the row is less than the total number of columns, and X is equal to the preset value, pad the row with zero elements until the number of elements contained in the row equals the total number of columns.
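The row-filling procedure described above can be sketched as follows. This is only an illustration under stated assumptions: an X-degree linking relationship is interpreted as a shortest interaction path of X hops, found with a breadth-first search; the names `members_at_degree`, `build_parameter_matrix`, `adj`, `total_columns`, and `max_degree` (standing in for the preset value) are hypothetical.

```python
from collections import deque


def members_at_degree(adj, start, max_degree):
    """Group members by shortest-path distance (1..max_degree) from start."""
    dist = {start: 0}
    by_degree = {}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == max_degree:
            continue  # do not expand beyond the preset value
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                by_degree.setdefault(dist[v], []).append(v)
                queue.append(v)
    return by_degree


def build_parameter_matrix(importance, adj, total_columns, max_degree):
    """Build one parameter matrix for one class of behavioral parameters."""
    # First column: significance level parameters in descending order.
    ordered = sorted(importance, key=importance.get, reverse=True)
    matrix = []
    for first in ordered:
        row = [importance[first]]
        by_degree = members_at_degree(adj, first, max_degree)
        for x in range(1, max_degree + 1):
            if len(row) >= total_columns:
                break
            # Second ranking result for the current X, in descending order.
            ranked = sorted((importance[m] for m in by_degree.get(x, [])),
                            reverse=True)
            row.extend(ranked[: total_columns - len(row)])
        # X reached the preset value: pad the row with zero elements.
        row.extend([0.0] * (total_columns - len(row)))
        matrix.append(row)
    return matrix


# Toy chain a - b - c - d (values are made up for illustration).
importance = {"a": 0.5, "b": 0.3, "c": 0.2, "d": 0.1}
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
matrix_2 = build_parameter_matrix(importance, adj, total_columns=3, max_degree=2)
matrix_1 = build_parameter_matrix(importance, adj, total_columns=3, max_degree=1)
```

With `max_degree=1`, the row for member "d" has only one direct neighbor available, so its last element is a padded zero; with `max_degree=2`, the same position is filled from the 2-degree neighbor instead.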
Optionally, the processor is further configured to:
Obtain the core members in the cluster to be measured based on the parameter matrix respectively corresponding to the at least one class of behavioral parameters.
Optionally, the processor is further configured to:
Obtain a plurality of sample clusters;
Obtain the training features respectively corresponding to each sample cluster, where the training features of a sample cluster include: the significance level parameters respectively corresponding to each member of the sample cluster on at least one class of preset behavioral parameters;
Use the training features respectively corresponding to each sample cluster as the training input of a neural network, and train to obtain the group type prediction model.
Optionally, when using the training features respectively corresponding to each sample cluster as the training input of the neural network and training to obtain the group type prediction model, the processor is specifically configured to:
Use the training features respectively corresponding to each sample cluster as the training input of the neural network, obtain the prediction probabilities that each sample cluster respectively belongs to each class of known clusters, and thereby obtain the predicted vector corresponding to each sample cluster;
Update the feature extraction parameters and the probability calculation parameters in the neural network based on the result of comparing the predicted vector corresponding to each sample cluster with the true vector corresponding to the same sample cluster;
Wherein, the feature extraction parameters are at least used to extract the cluster structure of a sample cluster from the training features of the sample cluster; the probability calculation parameters are used to obtain, based on the cluster structure of the sample cluster, the prediction probabilities that the sample cluster respectively belongs to each class of known clusters;
Return to the step of using the training features respectively corresponding to each sample cluster as the training input of the neural network and obtaining the prediction probabilities that each sample cluster respectively belongs to each class of known clusters, until the comparison result meets a termination condition; the group type prediction model is thereby obtained.
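The training loop above can be sketched with a very small network. This is a hedged illustration, not the network of the application: it assumes a single tanh layer as the "feature extraction parameters" (`W_feat`), a softmax layer as the "probability calculation parameters" (`W_prob`), plain gradient descent, and a termination condition of the mean squared length of the difference vector dropping below `tol`. All names, layer sizes, and hyperparameters are hypothetical.

```python
import numpy as np


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def forward(params, x):
    h = np.tanh(x @ params["W_feat"])   # feature extraction
    p = softmax(h @ params["W_prob"])   # probability calculation
    return h, p


def diff_length_loss(p, y):
    # Mean squared length of the difference vector (predicted - true).
    return float(np.mean(np.sum((p - y) ** 2, axis=1)))


def train(x, y, hidden=8, lr=0.5, tol=1e-3, max_steps=2000, seed=0):
    rng = np.random.default_rng(seed)
    params = {
        "W_feat": rng.normal(0.0, 0.1, (x.shape[1], hidden)),
        "W_prob": rng.normal(0.0, 0.1, (hidden, y.shape[1])),
    }
    losses = []
    for _ in range(max_steps):
        h, p = forward(params, x)
        losses.append(diff_length_loss(p, y))
        if losses[-1] < tol:  # termination condition (assumed)
            break
        g_p = 2.0 * (p - y) / x.shape[0]
        # Back-propagate through softmax, then through tanh.
        g_z = p * (g_p - np.sum(g_p * p, axis=1, keepdims=True))
        g_h = g_z @ params["W_prob"].T
        g_a = g_h * (1.0 - h ** 2)
        params["W_prob"] -= lr * (h.T @ g_z)
        params["W_feat"] -= lr * (x.T @ g_a)
    return params, losses


# Toy training features for four sample clusters and two known cluster types.
x = np.array([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]])
y = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
params, losses = train(x, y)
_, predicted = forward(params, x)
```

Minimizing the squared length has the same minimizer as minimizing the length itself and keeps the gradient simple, which is why the sketch uses it.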
Finally, it should also be noted that, herein, relational terms such as first and second are used merely to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein can be implemented in other embodiments without departing from the spirit or scope of the application. Therefore, the application is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (15)

1. A group type recognition method, characterized by comprising:
Obtaining the feature to be measured of a cluster to be measured, the feature to be measured including at least: the significance level parameters respectively corresponding to each member of the cluster to be measured on at least one class of preset behavioral parameters;
Inputting the feature to be measured into a pre-built group type prediction model;
Wherein, the group type prediction model is obtained by training a neural network with the training objective of minimizing the length of the difference vector between the predicted vector and the true vector of a sample cluster; the predicted vector includes the prediction probabilities that the sample cluster respectively belongs to each class of known clusters; the true vector includes the true probabilities that the sample cluster respectively belongs to each class of known clusters;
Obtaining the prediction probabilities, output by the group type prediction model, that the cluster to be measured respectively belongs to each class of known clusters.
2. The group type recognition method according to claim 1, characterized in that obtaining the feature to be measured of the cluster to be measured comprises:
Obtaining the behavioral parameter values respectively corresponding to each member of the cluster to be measured on at least one class of preset behavioral parameters;
For each class of preset behavioral parameters, determining, according to the behavioral parameter values respectively corresponding to each member of the cluster to be measured on that class of preset behavioral parameters, the significance level parameters respectively corresponding to each member of the cluster to be measured on that class of preset behavioral parameters, so as to obtain the significance level parameters respectively corresponding to each member of the cluster to be measured on the at least one class of preset behavioral parameters;
Determining the feature to be measured of the cluster to be measured according at least to the significance level parameters respectively corresponding to each member of the cluster to be measured on the at least one class of preset behavioral parameters.
3. The group type recognition method according to claim 1 or claim 2, characterized in that the feature to be measured further includes: the significance level parameter corresponding to each member of the cluster to be measured on a social parameter; and obtaining the feature to be measured of the cluster to be measured further comprises:
For each member to be measured, obtaining the initial probability that each member in the cluster to be measured respectively interacts with the member to be measured, the member to be measured being any member in the cluster to be measured;
Obtaining the probability that each member in the cluster to be measured carries out social behaviors;
Combining the probability that each member carries out social behaviors with the corresponding probability of interacting with the member to be measured, to obtain the final probability that each member respectively carries out social behaviors with the member to be measured;
Determining the sum of the final probabilities with which the members respectively carry out social behaviors with the member to be measured as the social probability of the member to be measured, so as to obtain the social probability corresponding to each member in the cluster to be measured;
Obtaining, based on the social probability corresponding to each member in the cluster to be measured, the significance level parameter corresponding to each member of the cluster to be measured on the social parameter.
4. The group type recognition method according to claim 2, characterized in that determining the feature to be measured of the cluster to be measured according at least to the significance level parameters respectively corresponding to each member of the cluster to be measured on the at least one class of preset behavioral parameters comprises:
For each class of preset behavioral parameters, using the significance level parameters respectively corresponding to each member of the cluster to be measured on that class of preset behavioral parameters as the elements of the parameter matrix corresponding to that class of preset behavioral parameters, so as to obtain the parameter matrix respectively corresponding to the at least one class of preset behavioral parameters.
5. The group type recognition method according to claim 4, characterized in that determining the significance level parameters respectively corresponding to each member of the cluster to be measured on that class of preset behavioral parameters as elements of the parameter matrix corresponding to that class of preset behavioral parameters comprises:
Sorting the significance level parameters respectively corresponding to each member on that class of preset behavioral parameters in descending order, to obtain a first ranking result corresponding to that class of preset behavioral parameters;
Using the first ranking result corresponding to that class of behavioral parameters as the first-column elements of the parameter matrix corresponding to that class of behavioral parameters.
6. The group type recognition method according to claim 5, characterized in that determining the significance level parameters respectively corresponding to each member of the cluster to be measured on that class of preset behavioral parameters as elements of the parameter matrix corresponding to that class of preset behavioral parameters further comprises:
For each row of the parameter matrix corresponding to that class of behavioral parameters, determining the first member, that is, the member corresponding to the first significance level parameter of the row;
Obtaining the significance level parameters of the first target members that have an X-degree linking relationship with the first member corresponding to the row, where the initial value of X is 1, and an X-degree linking relationship means that the first member and the first target member interact through at least X-1 members;
Sorting the significance level parameters of the first target members corresponding to the row in descending order, to obtain a second ranking result corresponding to the row;
Determining the significance level parameters in the second ranking result corresponding to the row, in order, as elements of the row, until the number of elements contained in the row equals the total number of columns of the parameter matrix;
If all the significance level parameters in the second ranking result have become elements of the row and the number of elements contained in the row is still less than the total number of columns, assigning X+1 to X and returning to the step of obtaining the significance level parameters of the first target members that have an X-degree linking relationship with the first member corresponding to the row; so as to obtain all the elements respectively contained in each row.
7. The group type recognition method according to claim 6, characterized in that the step of, if all the significance level parameters in the second ranking result have become elements of the row and the number of elements contained in the row is less than the total number of columns, assigning X+1 to X and returning to the step of obtaining the significance level parameters of the first target members that have an X-degree linking relationship with the first member corresponding to the row, comprises:
If all the significance level parameters in the second ranking result have become elements of the row, the number of elements contained in the row is less than the total number of columns, and X is less than a preset value, assigning X+1 to X and returning to the step of obtaining the significance level parameters of the first target members that have an X-degree linking relationship with the first member corresponding to the row, where the preset value is a positive integer greater than or equal to 1;
If all the significance level parameters in the second ranking result have become elements of the row, the number of elements contained in the row is less than the total number of columns, and X is equal to the preset value, padding the row with zero elements until the number of elements contained in the row equals the total number of columns.
8. The group type recognition method according to any one of claims 4 to 7, characterized by further comprising:
Obtaining the core members in the cluster to be measured based on the parameter matrix respectively corresponding to the at least one class of behavioral parameters.
9. The group type recognition method according to claim 1, characterized in that the method of building the group type prediction model comprises:
Obtaining a plurality of sample clusters;
Obtaining the training features respectively corresponding to each sample cluster, where the training features of a sample cluster include: the significance level parameters respectively corresponding to each member of the sample cluster on at least one class of preset behavioral parameters;
Using the training features respectively corresponding to each sample cluster as the training input of a neural network, and training to obtain the group type prediction model.
10. The group type recognition method according to claim 8, characterized in that using the training features respectively corresponding to each sample cluster as the training input of the neural network and training to obtain the group type prediction model comprises:
Using the training features respectively corresponding to each sample cluster as the training input of the neural network, obtaining the prediction probabilities that each sample cluster respectively belongs to each class of known clusters, and thereby obtaining the predicted vector corresponding to each sample cluster;
Updating the feature extraction parameters and the probability calculation parameters in the neural network based on the result of comparing the predicted vector corresponding to each sample cluster with the true vector corresponding to the same sample cluster;
Wherein, the feature extraction parameters are at least used to extract the cluster structure of a sample cluster from the training features of the sample cluster; the probability calculation parameters are used to obtain, based on the cluster structure of the sample cluster, the prediction probabilities that the sample cluster respectively belongs to each class of known clusters;
Returning to the step of using the training features respectively corresponding to each sample cluster as the training input of the neural network and obtaining the prediction probabilities that each sample cluster respectively belongs to each class of known clusters, until the comparison result meets a termination condition; the group type prediction model is thereby obtained.
11. A group type identification device, characterized by comprising:
A first obtaining module, configured to obtain the feature to be measured of a cluster to be measured, the feature to be measured including at least: the significance level parameters respectively corresponding to each member of the cluster to be measured on at least one class of preset behavioral parameters;
An input module, configured to input the feature to be measured into a pre-built group type prediction model;
Wherein, the group type prediction model is obtained by training a neural network with the training objective of minimizing the length of the difference vector between the predicted vector and the true vector of a sample cluster; the predicted vector includes the prediction probabilities that the sample cluster belongs to each class of known clusters; the true vector includes the true probabilities that the sample cluster belongs to each class of known clusters;
A second obtaining module, configured to obtain the prediction probabilities, output by the group type prediction model, that the cluster to be measured respectively belongs to each class of known clusters.
12. The group type identification device according to claim 11, characterized in that the first obtaining module comprises:
A first acquisition unit, configured to obtain the behavioral parameter values respectively corresponding to each member of the cluster to be measured on at least one class of preset behavioral parameters;
A first determination unit, configured to, for each class of preset behavioral parameters, determine, according to the behavioral parameter values respectively corresponding to each member of the cluster to be measured on that class of preset behavioral parameters, the significance level parameters respectively corresponding to each member of the cluster to be measured on that class of preset behavioral parameters, so as to obtain the significance level parameters respectively corresponding to each member of the cluster to be measured on the at least one class of preset behavioral parameters;
A second determination unit, configured to determine the feature to be measured of the cluster to be measured according at least to the significance level parameters respectively corresponding to each member of the cluster to be measured on the at least one class of preset behavioral parameters.
13. The group type identification device according to claim 12, characterized in that the second determination unit comprises:
A determination subunit, configured to, for each class of preset behavioral parameters, use the significance level parameters respectively corresponding to each member of the cluster to be measured on that class of preset behavioral parameters as the elements of the parameter matrix corresponding to that class of preset behavioral parameters, so as to obtain the parameter matrix respectively corresponding to the at least one class of preset behavioral parameters.
14. An electronic device, characterized by comprising:
A memory, configured to store a program;
A processor, configured to execute the program, the program being specifically used for:
Obtaining the feature to be measured of a cluster to be measured, the feature to be measured including at least: the significance level parameters respectively corresponding to each member of the cluster to be measured on at least one class of preset behavioral parameters;
Inputting the feature to be measured into a pre-built group type prediction model;
Wherein, the group type prediction model is obtained by training a neural network with the training objective of minimizing the length of the difference vector between the predicted vector and the true vector of a sample cluster; the predicted vector includes the prediction probabilities that the sample cluster respectively belongs to each class of known clusters; the true vector includes the true probabilities that the sample cluster respectively belongs to each class of known clusters;
Obtaining the prediction probabilities, output by the group type prediction model, that the cluster to be measured respectively belongs to each class of known clusters.
15. A storage medium, characterized in that the storage medium stores a program suitable for execution by a processor, the program being used for:
Obtaining the feature to be measured of a cluster to be measured, the feature to be measured including at least: the significance level parameters respectively corresponding to each member of the cluster to be measured on at least one class of preset behavioral parameters;
Inputting the feature to be measured into a pre-built group type prediction model;
Wherein, the group type prediction model is obtained by training a neural network with the training objective of minimizing the length of the difference vector between the predicted vector and the true vector of a sample cluster; the predicted vector includes the prediction probabilities that the sample cluster respectively belongs to each class of known clusters; the true vector includes the true probabilities that the sample cluster respectively belongs to each class of known clusters;
Obtaining the prediction probabilities, output by the group type prediction model, that the cluster to be measured respectively belongs to each class of known clusters.
CN201711331027.9A 2017-12-13 2017-12-13 Group type recognition methods, device, electronic equipment and storage medium Pending CN109919790A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201711331027.9A CN109919790A (en) 2017-12-13 2017-12-13 Group type recognition methods, device, electronic equipment and storage medium
PCT/CN2018/115353 WO2019114481A1 (en) 2017-12-13 2018-11-14 Cluster type recognition method, apparatus, electronic apparatus, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711331027.9A CN109919790A (en) 2017-12-13 2017-12-13 Group type recognition methods, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN109919790A true CN109919790A (en) 2019-06-21

Family

ID=66819870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711331027.9A Pending CN109919790A (en) 2017-12-13 2017-12-13 Group type recognition methods, device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN109919790A (en)
WO (1) WO2019114481A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110532758A (en) * 2019-07-24 2019-12-03 阿里巴巴集团控股有限公司 A kind of Risk Identification Method and device for group

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103853841A (en) * 2014-03-19 2014-06-11 北京邮电大学 Method for analyzing abnormal behavior of user in social networking site
CN106204083B (en) * 2015-04-30 2020-02-18 中国移动通信集团山东有限公司 Target user classification method, device and system
EP3335126A4 (en) * 2015-08-11 2019-05-01 Cognoa, Inc. Methods and apparatus to determine developmental progress with artificial intelligence and user input
CN105894372B (en) * 2016-06-13 2018-03-16 腾讯科技(深圳)有限公司 The method and apparatus for predicting colony's credit
CN107273454B (en) * 2017-05-31 2020-11-03 北京京东尚科信息技术有限公司 User data classification method, device, server and computer readable storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110532758A (en) * 2019-07-24 2019-12-03 阿里巴巴集团控股有限公司 A kind of Risk Identification Method and device for group
CN110532758B (en) * 2019-07-24 2023-06-06 创新先进技术有限公司 Risk identification method and device for group

Also Published As

Publication number Publication date
WO2019114481A1 (en) 2019-06-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination