CN109977992A - Electronic device, method for identifying batch registration behavior, and storage medium


Info

Publication number
CN109977992A
Authority
CN
China
Prior art keywords
processed
characteristic information
account
eigenmatrix
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910067104.7A
Other languages
Chinese (zh)
Other versions
CN109977992B (en
Inventor
关欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201910067104.7A
Publication of CN109977992A
Application granted
Publication of CN109977992B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals

Abstract

The present invention relates to artificial intelligence technology and discloses an electronic device, a method for identifying batch registration behavior, and a computer-readable storage medium. The invention obtains characteristic information from each account to be processed, the obtained characteristic information including characteristic information marked as critical fields; generates a feature vector for each account to be processed; performs cluster analysis on all feature vectors to obtain multiple feature matrices; determines whether any feature matrix satisfies a first preset condition and, if so, takes each such feature matrix as a matrix to be processed; and queries all matrices to be processed that satisfy a second preset condition, marks each matrix found as an abnormal matrix, and identifies the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account. Compared with the prior art, the invention can identify multiple kinds of batch registration behavior with high recognition accuracy.

Description

Electronic device, method for identifying batch registration behavior, and storage medium
Technical field
The present invention relates to the field of computer technology, and in particular to an electronic device, a method for identifying batch registration behavior, and a computer-readable storage medium.
Background
With the development of Internet technology, the Internet has become widely used in people's work and daily life. Before conducting a transaction or receiving a service over the Internet, a user usually has to register an account on the service platform that provides it. For any one service platform, one or a few accounts are enough to meet a user's needs. Alongside normal registration, however, service platforms also see batch registration carried out for profit, and such batch registration is something each platform works hard to suppress.
The existing method for identifying batch registration behavior counts, from users' registration information, the number of accounts registered from the same Internet Protocol address (IP address). When the number of accounts registered from an IP address exceeds a preset threshold, the accounts registered from that IP address are judged to be batch-registered accounts. The drawback of this method is that it can only identify batch registration from a single IP address, so its recognition accuracy is low.
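The prior-art check described above can be sketched in a few lines. This is only an illustration under assumed names: `flag_by_ip`, the `(account, ip)` tuple layout, and the threshold value are hypothetical and do not come from the patent.

```python
from collections import Counter

def flag_by_ip(registrations, threshold):
    """Return accounts whose registration IP was used by more than `threshold` accounts.

    `registrations` is a list of (account_id, ip_address) pairs (assumed layout).
    """
    counts = Counter(ip for _, ip in registrations)
    return [account for account, ip in registrations if counts[ip] > threshold]

registrations = [
    ("a1", "10.11.12.13"),
    ("a2", "10.11.12.13"),
    ("a3", "10.11.12.13"),
    ("a4", "10.20.30.40"),
]
flagged = flag_by_ip(registrations, threshold=2)
```

A batch of accounts spread over many IP addresses never exceeds the per-address count, which is the weakness the background section points out.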
Therefore, improving the recognition accuracy for batch registration behavior has become a problem in urgent need of a solution.
Summary of the invention
The main object of the present invention is to provide an electronic device, a method for identifying batch registration behavior, and a computer-readable storage medium, with the aim of improving the recognition accuracy for batch registration behavior.
To achieve the above object, the present invention proposes an electronic device comprising a memory and a processor. The memory stores a program for identifying batch registration behavior which, when executed by the processor, implements the following steps:
Obtaining step: obtain a first preset quantity of items of characteristic information from each account to be processed, where the first preset quantity of items includes a second preset quantity of items marked as critical fields;
Generation step: convert each of the first preset quantity of items of characteristic information of each account to be processed into a corresponding characteristic value, and generate a feature vector for each account to be processed from the characteristic values of its items of characteristic information;
Clustering step: perform cluster analysis on all the feature vectors to obtain multiple feature matrices, each consisting of several feature vectors;
Extraction step: obtain from each feature matrix the characteristic values corresponding to each critical field, and take all characteristic values corresponding to one critical field within the same feature matrix as the characteristic value group of that critical field;
Judgment step: based on the characteristic value group of each critical field in each feature matrix, determine whether any feature matrix satisfies a first preset condition and, if so, take each feature matrix that satisfies it as a matrix to be processed;
Identification step: query all matrices to be processed that satisfy a second preset condition; mark every matrix found as an abnormal matrix, and identify the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account.
Preferably, the judgment step includes:
Calculating the dispersion of the characteristic value group of each critical field in each feature matrix;
Determining whether any feature matrix satisfies the first preset condition and, if so, taking each feature matrix that satisfies it as a matrix to be processed, where the first preset condition is that the dispersions of the characteristic value groups of all critical fields in a feature matrix are all less than a first preset threshold.
Preferably, when executing the program for identifying batch registration behavior, the processor also performs the following step before the identification step:
Determining the characteristic value distribution data of each critical field from the characteristic values of all critical fields in all matrices to be processed;
The identification step includes:
Determining, from the characteristic value distribution data of each critical field, the distribution probability value of the characteristic value group of each critical field in each matrix to be processed;
Querying all characteristic value groups whose distribution probability value is less than a third preset threshold; marking the matrices to be processed to which the characteristic value groups found belong as abnormal matrices, and identifying the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account.
Preferably, the generation step includes:
Determining the preprocessing rule for each item of characteristic information according to a predetermined mapping between characteristic information and preprocessing rules;
Preprocessing each item of characteristic information according to its preprocessing rule to obtain the characteristic value of that item;
Generating the feature vector of each account to be processed from the characteristic values of the items of characteristic information in that account.
Preferably, the critical fields include one or more of a mobile phone number, a network address, and device identification information;
When an item of characteristic information is a mobile phone number, a network address, or device identification information, the preprocessing rule for that item includes:
Taking the item as the characteristic information to be processed, and extracting at least one feature field from that characteristic information in each account to be processed;
Adding all feature fields of the characteristic information to be processed, across all accounts to be processed, to the feature field set of that characteristic information, and counting how often each feature field occurs in the set;
Determining the characteristic value of the characteristic information to be processed in each account to be processed from the frequency of occurrence of each of its feature fields.
In addition, to achieve the above object, the present invention also proposes a method for identifying batch registration behavior, comprising the steps of:
Obtaining step: obtain a first preset quantity of items of characteristic information from each account to be processed, where the first preset quantity of items includes a second preset quantity of items marked as critical fields;
Generation step: convert each of the first preset quantity of items of characteristic information of each account to be processed into a corresponding characteristic value, and generate a feature vector for each account to be processed from the characteristic values of its items of characteristic information;
Clustering step: perform cluster analysis on all the feature vectors to obtain multiple feature matrices, each consisting of several feature vectors;
Extraction step: obtain from each feature matrix the characteristic values corresponding to each critical field, and take all characteristic values corresponding to one critical field within the same feature matrix as the characteristic value group of that critical field;
Judgment step: based on the characteristic value group of each critical field in each feature matrix, determine whether any feature matrix satisfies a first preset condition and, if so, take each feature matrix that satisfies it as a matrix to be processed;
Identification step: query all matrices to be processed that satisfy a second preset condition; mark every matrix found as an abnormal matrix, and identify the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account.
Preferably, the judgment step includes:
Calculating the dispersion of the characteristic value group of each critical field in each feature matrix;
Determining whether any feature matrix satisfies the first preset condition and, if so, taking each feature matrix that satisfies it as a matrix to be processed, where the first preset condition is that the dispersions of the characteristic value groups of all critical fields in a feature matrix are all less than a first preset threshold.
Preferably, before the identification step, the method further includes:
Determining the characteristic value distribution data of each critical field from the characteristic values of all critical fields in all matrices to be processed;
The identification step includes:
Determining, from the characteristic value distribution data of each critical field, the distribution probability value of the characteristic value group of each critical field in each matrix to be processed;
Querying all characteristic value groups whose distribution probability value is less than a third preset threshold; marking the matrices to be processed to which the characteristic value groups found belong as abnormal matrices, and identifying the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account.
Preferably, the generation step includes:
Determining the preprocessing rule for each item of characteristic information according to a predetermined mapping between characteristic information and preprocessing rules;
Preprocessing each item of characteristic information according to its preprocessing rule to obtain the characteristic value of that item;
Generating the feature vector of each account to be processed from the characteristic values of the items of characteristic information in that account.
In addition, to achieve the above object, the present invention also proposes a computer-readable storage medium storing a program for identifying batch registration behavior. The program can be executed by at least one processor to cause the at least one processor to perform the steps of the method for identifying batch registration behavior described in any of the above.
The present invention obtains a first preset quantity of items of characteristic information from each account to be processed, including a second preset quantity of items marked as critical fields; converts each item of characteristic information of each account into a corresponding characteristic value and generates a feature vector for each account from those values; performs cluster analysis on all feature vectors to obtain multiple feature matrices, each consisting of several feature vectors; obtains from each feature matrix the characteristic values corresponding to each critical field, taking all characteristic values of one critical field within the same feature matrix as that field's characteristic value group; determines, from the characteristic value groups, whether any feature matrix satisfies a first preset condition and, if so, takes each such feature matrix as a matrix to be processed; and queries all matrices to be processed that satisfy a second preset condition, marks each matrix found as an abnormal matrix, and identifies the account corresponding to each feature vector in each abnormal matrix as a batch-registered account. Compared with the prior art, the invention analyzes multiple kinds of characteristic information, including critical fields that are closely associated with batch registration behavior, and identifies batch-registered accounts only after applying several analysis techniques. It can therefore identify multiple kinds of batch registration behavior with high recognition accuracy.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from the structures they show without creative effort.
Fig. 1 is a schematic diagram of the operating environment of the first embodiment of the program for identifying batch registration behavior of the present invention;
Fig. 2 is a program module diagram of the first embodiment of the program for identifying batch registration behavior of the present invention;
Fig. 3 is a flow diagram of the first embodiment of the method for identifying batch registration behavior of the present invention.
The realization of the object, the functional features, and the advantages of the present invention are further described in the embodiments with reference to the drawings.
Detailed description of the embodiments
The principles and features of the present invention are described below with reference to the drawings. The examples given serve only to explain the invention and are not intended to limit its scope.
The present invention proposes a program for identifying batch registration behavior.
Fig. 1 is a schematic diagram of the operating environment of the first embodiment of the program 10 for identifying batch registration behavior of the present invention.
In this embodiment, the program 10 for identifying batch registration behavior is installed and run in an electronic device 1. The electronic device 1 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a server. The electronic device 1 may include, but is not limited to, a memory 11, a processor 12, and a display 13. Fig. 1 shows only the electronic device 1 with components 11-13, but it should be understood that not all of the components shown are required; more or fewer components may be implemented instead.
In some embodiments the memory 11 may be an internal storage unit of the electronic device 1, such as its hard disk or internal memory. In other embodiments the memory 11 may be an external storage device of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the electronic device 1. The memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 stores the application software installed on the electronic device 1 and various kinds of data, for example the program code of the program 10 for identifying batch registration behavior. The memory 11 may also be used to temporarily store data that has been output or is about to be output.
In some embodiments the processor 12 may be a Central Processing Unit (CPU), a microprocessor, or another data processing chip. It runs the program code stored in the memory 11 or processes data, for example by executing the program 10 for identifying batch registration behavior.
In some embodiments the display 13 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display 13 shows the information processed in the electronic device 1 and presents a visual user interface. The components 11-13 of the electronic device 1 communicate with one another over a program bus.
Fig. 2 is a program module diagram of the first embodiment of the program 10 for identifying batch registration behavior of the present invention. In this embodiment, the program 10 may be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to carry out the invention. For example, in Fig. 2 the program 10 is divided into an obtaining module 101, a generation module 102, a cluster module 103, an extraction module 104, a judgment module 105, and an identification module 106. A module, as the term is used in the present invention, is a series of computer program instruction segments that accomplishes a specific function; the term is better suited than "program" to describing how the program 10 executes in the electronic device 1. Specifically:
The obtaining module 101 obtains a first preset quantity of items of characteristic information from each account to be processed, where the first preset quantity of items includes a second preset quantity of items marked as critical fields.
The characteristic information obtained by the obtaining module 101 includes one or more of a mobile phone number, a network address (for example, an IP address), and device identification information. In some application scenarios it also includes one or more of geographical location information, education information, and the amount of missing information. Among the first preset quantity of items of characteristic information of each account to be processed, a second preset quantity of items are marked as critical fields; for example, the mobile phone number, network address, and device identification information may be marked as critical fields. The first preset quantity is greater than or equal to the second preset quantity.
In this embodiment, the obtaining module 101 also sets up the characteristic information and the critical fields, as follows:
Account information of multiple categories is obtained from each account to be processed, the degree of correlation between the account information of each category and batch registration behavior is determined, and the categories of account information are sorted by degree of correlation. The first preset quantity of categories with the highest degree of correlation are chosen as the characteristic information, and from these the second preset quantity of items with the highest degree of correlation are chosen as the critical fields.
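The selection rule above amounts to a sort followed by two prefix slices. The sketch below assumes the degrees of correlation have already been computed; the patent does not say how they are measured, so the scores, field names, and the function `select_fields` are all hypothetical.

```python
def select_fields(relevance, n_features, n_critical):
    """Pick the top `n_features` categories as characteristic information and,
    among those, the top `n_critical` as critical fields."""
    if n_critical > n_features:
        raise ValueError("first preset quantity must be >= second preset quantity")
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    features = ranked[:n_features]
    return features, features[:n_critical]

# Hypothetical correlation scores between each category and batch registration.
relevance = {
    "phone": 0.9, "ip": 0.8, "device_id": 0.7,
    "location": 0.4, "education": 0.2, "nickname": 0.1,
}
features, critical = select_fields(relevance, n_features=5, n_critical=3)
```

Because the critical fields are taken from the already ranked characteristic information, they are always a subset of it, which matches the requirement that the first preset quantity be at least the second.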
The generation module 102 converts each of the first preset quantity of items of characteristic information of each account to be processed into a corresponding characteristic value, and generates a feature vector for each account to be processed from the characteristic values of its items of characteristic information.
In this embodiment, the generation module 102 also:
First, determines the preprocessing rule for each item of characteristic information according to a predetermined mapping between characteristic information and preprocessing rules.
Then, preprocesses each item of characteristic information according to its preprocessing rule to obtain the characteristic value of that item.
Finally, generates the feature vector of each account to be processed from the characteristic values of the items of characteristic information in that account.
The preprocessing rules can be set according to the specific application scenario, for example with reference to the following examples:
Example one: when an item of characteristic information is a mobile phone number, a network address, or device identification information, its preprocessing rule includes extracting at least one feature field from that characteristic information in each account to be processed. For example, the first seven digits 1234567 of the mobile phone number 12345678912 are taken as the feature field of the phone number; the first two groups of digits 10.11 of the IP address 10.11.12.13, or its first three groups 10.11.12, are taken as the feature field of the IP address; and a field extracted from the device identification information is taken as the feature field of the device identification information. All feature fields of the characteristic information to be processed, across all accounts to be processed, are then added to the feature field set of that characteristic information, and the frequency of occurrence of each feature field in the set is counted. The characteristic value of the characteristic information to be processed in each account is determined from the frequency of occurrence of each of its feature fields.
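As a minimal sketch of Example one, the code below extracts a prefix as the feature field and uses its frequency of occurrence across all accounts directly as the characteristic value. Using the raw count is an assumption, since the text only says the value is determined from the frequency; the helper names are hypothetical.

```python
from collections import Counter

def phone_prefix(phone):
    """Feature field of a phone number: its first seven digits."""
    return phone[:7]

def ip_prefix(ip, groups=2):
    """Feature field of an IPv4 address: its first `groups` dot-separated groups."""
    return ".".join(ip.split(".")[:groups])

def frequency_values(raw_values, extract):
    """Map each account's raw value to how often its feature field occurs overall."""
    fields = [extract(value) for value in raw_values]
    counts = Counter(fields)  # the feature field set with occurrence counts
    return [counts[field] for field in fields]

phones = ["12345678912", "12345678900", "19876543210"]
phone_values = frequency_values(phones, phone_prefix)

ips = ["10.11.12.13", "10.11.99.1", "172.16.0.5"]
ip_values = frequency_values(ips, ip_prefix)
```

Accounts that share a prefix with many others receive a high value, so batches registered from one number segment or subnet end up close together in feature space.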
Example two: when an item of characteristic information is geographical location information or education information, its preprocessing rule includes converting the characteristic information to be processed of each account into a code by one-hot encoding and taking the resulting code as the characteristic value of that characteristic information. For example, if the education information has three possible field values, namely high school, bachelor's degree, and master's degree, a three-bit code is used as its characteristic value, with each bit representing one education level. When the education field of an account to be processed is high school, the bit representing high school is set to 1 and the other two bits are set to 0.
Alternatively, the field values of the characteristic information to be processed in all accounts to be processed are added to the field value set of that characteristic information, and the frequency of occurrence of each field value in the set is counted. The characteristic value of the characteristic information to be processed in each account is then determined from the frequency of occurrence of its field value.
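Both options in Example two can be sketched briefly. The category order, the English labels, and the choice to use the raw count as the frequency-based value are illustrative assumptions rather than details from the patent.

```python
from collections import Counter

def one_hot(value, categories):
    """Encode `value` as a list with a single 1 at the position of its category."""
    return [1 if value == category else 0 for category in categories]

EDUCATION = ["high school", "bachelor", "master"]  # assumed category order
encoded = one_hot("high school", EDUCATION)

def value_frequencies(values):
    """Alternative rule: replace each field value by its frequency of occurrence."""
    counts = Counter(values)
    return [counts[value] for value in values]

freqs = value_frequencies(["bachelor", "master", "bachelor"])
```

One-hot encoding widens the feature vector by one position per category, whereas the frequency rule keeps a single number per field.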
The cluster module 103 performs cluster analysis on all the feature vectors to obtain multiple feature matrices, each consisting of several feature vectors.
The cluster module 103 inputs all feature vectors into a pre-established clustering model (for example, a clustering model built on the expectation-maximization algorithm). The clustering model performs cluster analysis on the feature vectors using, for example, the K-means algorithm (a hard clustering algorithm) or a Gaussian mixture model (GMM), and obtains multiple groups of feature vectors. Each group is output in the form of a feature matrix; for example, the feature vectors in a group are used as row vectors or column vectors to form the corresponding feature matrix.
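To make the clustering step concrete, here is a small pure-Python K-means sketch that groups feature vectors and emits one matrix (a list of row vectors) per cluster. It is a toy stand-in, not the patent's model: the starting centroids are supplied by the caller to keep the run deterministic, and the function names and sample vectors are hypothetical.

```python
import math

def kmeans(vectors, centroids, iterations=10):
    """Plain K-means with caller-supplied starting centroids (hard assignment)."""
    labels = []
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(v, centroids[k]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members; keep it if it has none.
        updated = []
        for k, old in enumerate(centroids):
            members = [v for v, label in zip(vectors, labels) if label == k]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(list(old))
        centroids = updated
    return labels

def group_into_matrices(vectors, labels):
    """Collect the vectors of each cluster as the rows of one matrix."""
    matrices = {}
    for vector, label in zip(vectors, labels):
        matrices.setdefault(label, []).append(vector)
    return [matrices[k] for k in sorted(matrices)]

vectors = [[50, 48, 51], [49, 50, 50], [1, 2, 1], [2, 1, 1]]
labels = kmeans(vectors, centroids=[[0, 0, 0], [40, 40, 40]])
matrices = group_into_matrices(vectors, labels)
```

A production system would more likely rely on a tested library implementation and a principled choice of the number of clusters; the sketch only shows how clusters become row-vector matrices.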
The extraction module 104 obtains from each feature matrix the characteristic values corresponding to each critical field, and takes all characteristic values corresponding to one critical field within the same feature matrix as the characteristic value group of that critical field.
For example, if a feature matrix is formed by using each feature vector in a group as a row vector, so that each column of the matrix holds all characteristic values of one item of characteristic information, then the column corresponding to each critical field can be taken directly as the characteristic value group of that critical field.
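With row-vector matrices, extracting a characteristic value group is a column read. The sketch below assumes a hypothetical mapping `critical_columns` from critical field name to column index.

```python
def value_groups(matrix, critical_columns):
    """Return, for each critical field, the column of values it has in `matrix`."""
    return {
        name: [row[index] for row in matrix]
        for name, index in critical_columns.items()
    }

# Rows are feature vectors; columns 0 and 1 are assumed to be critical fields.
matrix = [[50, 48, 3], [49, 50, 7]]
critical_columns = {"phone": 0, "ip": 1}
groups = value_groups(matrix, critical_columns)
```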
The judgment module 105 determines, based on the characteristic value group of each critical field in each feature matrix, whether any feature matrix satisfies a first preset condition and, if so, takes each feature matrix that satisfies it as a matrix to be processed.
The judgment module 105 also:
First, calculates the dispersion of the characteristic value group of each critical field in each feature matrix. The dispersion of a characteristic value group is the degree of difference, or spread, among the characteristic values in the group. For example, an indicator such as the standard deviation, variance, or mean deviation can be calculated from the characteristic values in the group and used as the dispersion of that group.
Next, determines whether any feature matrix satisfies the first preset condition. If so, each feature matrix that satisfies it is taken as a matrix to be processed; if not, a result stating that no batch-registered account was identified is output. The first preset condition is that the dispersions of the characteristic value groups of all critical fields in a feature matrix are all less than a first preset threshold.
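The first preset condition can be sketched with the population standard deviation as the dispersion measure, which is one of the indicators the text lists. The threshold value, sample matrices, and function names are hypothetical.

```python
from statistics import pstdev

def is_candidate(matrix, critical_indices, threshold):
    """True if every critical-field column has dispersion below `threshold`."""
    return all(
        pstdev([row[i] for row in matrix]) < threshold
        for i in critical_indices
    )

def candidate_matrices(matrices, critical_indices, threshold):
    """Matrices to be processed; an empty list means nothing was identified."""
    return [m for m in matrices if is_candidate(m, critical_indices, threshold)]

tight = [[50, 48, 3], [49, 50, 9], [51, 49, 1]]  # critical columns barely vary
loose = [[50, 2, 3], [1, 40, 7], [20, 9, 5]]     # critical columns vary widely
pending = candidate_matrices([tight, loose], critical_indices=[0, 1], threshold=2.0)
```

Only the critical-field columns are tested, so the third column of `tight` may vary freely without disqualifying the matrix.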
Identification module 106, configured to query all matrices to be processed that satisfy the second preset condition and, when any are found, to label every matrix to be processed that is found as an abnormal matrix and to identify the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account.
Further, in this embodiment, the program further includes:
A determining module (not shown), configured to determine the feature value distribution data of each key field according to the feature values of all key fields in all matrices to be processed.
For example, the determining module forms a full matrix from the feature vectors of all accounts to be processed, used either as row vectors or as column vectors, extracts all feature values corresponding to each key field from the full matrix, and compiles statistics on all feature values of each key field to obtain the feature value distribution data of that key field (for example, a cumulative distribution curve or a cumulative distribution table).
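One way to obtain such distribution data is an empirical cumulative distribution function. The sketch below is an illustration under assumptions: the full matrix is a list of row vectors, column 0 is taken to be a key field, and build_cdf is a hypothetical helper name.

```python
from bisect import bisect_right


def build_cdf(values):
    """Return a function F where F(x) is the fraction of observed values <= x."""
    ordered = sorted(values)
    total = len(ordered)

    def cdf(x):
        # bisect_right counts how many observed values are <= x
        return bisect_right(ordered, x) / total

    return cdf


# Hypothetical full matrix: one row per account, column 0 is a key field.
full_matrix = [[1.0, 5.0], [2.0, 6.0], [2.0, 7.0], [4.0, 8.0]]
key_col = 0
cdf_key = build_cdf([row[key_col] for row in full_matrix])
```

Evaluating cdf_key at each observed value yields the rows of a cumulative distribution table for that key field.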
Further, in this embodiment, the identification module 106 is further configured to:
Determine, according to the feature value distribution data of each key field, the distribution probability value of the feature value group of each key field in each matrix to be processed. For example, in the feature value group of a key field in a matrix to be processed, denote the largest feature value as M and the smallest as N, and denote the numerical interval of the feature value group as [N, M]. The distribution probability value of the interval [N, M] is then determined from the feature value distribution data of the key field; for example, the cumulative distribution probability values of N and M are looked up, and the value for N is subtracted from the value for M to obtain the distribution probability value of the interval [N, M].
Query all feature value groups whose distribution probability value is less than the third preset threshold. When any are found, label the matrix to be processed to which each such feature value group belongs as an abnormal matrix, and identify the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account.
The present invention obtains a first preset number of items of feature information from each account to be processed, the first preset number of items including a second preset number of items marked as key fields; converts each of the first preset number of items of feature information of each account to be processed into a corresponding feature value, and generates a feature vector for each account to be processed according to the feature values corresponding to the items of feature information in that account; performs cluster analysis on all the feature vectors to obtain multiple feature matrices, each consisting of several feature vectors; obtains the feature values corresponding to each key field from each feature matrix, and takes all feature values corresponding to one key field in the same feature matrix as one feature value group of that key field; judges, according to the feature value group of each key field in each feature matrix, whether there is a feature matrix that satisfies the first preset condition and, if so, takes each such feature matrix as a matrix to be processed; and queries all matrices to be processed that satisfy the second preset condition and, when any are found, labels every matrix found as an abnormal matrix and identifies the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account. Compared with the prior art, the present invention takes multiple kinds of feature information, including key fields that are highly associated with batch registration behavior, as the analysis target, and identifies batch-registered accounts only after analysis by several analysis means; therefore, the present invention can identify a variety of batch registration behaviors with high recognition accuracy.
In addition, the present invention proposes a method for identifying batch registration behavior.
As shown in Fig. 3, Fig. 3 is a schematic flowchart of a first embodiment of the method for identifying batch registration behavior of the present invention.
In this embodiment, the method includes:
Step S10: obtain a first preset number of items of feature information from each account to be processed, the first preset number of items of feature information including a second preset number of items marked as key fields.
A first preset number of items of feature information is obtained from each account to be processed. The feature information includes one or more of a mobile phone number, a network address (for example, an IP address), and device identification information; in some application scenarios, it further includes one or more of geographic location information, education information, and the amount of missing information. Among the first preset number of items of feature information of each account to be processed, a second preset number of items are marked as key fields; for example, feature information such as the mobile phone number, network address, and device identification information may be marked as key fields. The first preset number is greater than or equal to the second preset number.
In this embodiment, the feature information and the key fields are set as follows:
Account information items of multiple categories are obtained from each account to be processed, the degree of correlation between the account information items of each category and batch registration behavior is determined, and the account information items of the categories are sorted by degree of correlation. The first preset number of account information items are selected as feature information in descending order of correlation, and the second preset number of items are then selected from the feature information as key fields, again in descending order of correlation.
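The selection by descending correlation can be sketched as follows. The description does not specify how the degree of correlation is computed, so the scores below are invented placeholders, and select_features, first_n, and second_n are hypothetical names standing in for the selection procedure and the first and second preset numbers.

```python
def select_features(relevance, first_n, second_n):
    """Pick the top first_n categories as feature information and the
    top second_n of those as key fields, both in descending relevance."""
    if second_n > first_n:
        raise ValueError("second_n must not exceed first_n")
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    features = ranked[:first_n]
    key_fields = features[:second_n]
    return features, key_fields


# Placeholder relevance scores; how they are obtained is not stated in the source.
relevance = {
    "phone_number": 0.92,
    "ip_address": 0.88,
    "device_id": 0.85,
    "location": 0.40,
    "education": 0.25,
    "nickname": 0.05,
}
features, key_fields = select_features(relevance, first_n=5, second_n=3)
```

Because the key fields are taken from the head of the same ranking, they are always a subset of the feature information, which matches the requirement that the first preset number be greater than or equal to the second.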
Step S20: convert each of the first preset number of items of feature information of each account to be processed into a corresponding feature value, and generate a feature vector for each account to be processed according to the feature values corresponding to the items of feature information in that account.
In this embodiment, step S20 includes:
First, determine the preprocessing rule corresponding to each item of feature information according to a predetermined mapping between feature information and preprocessing rules.
Then, preprocess each item of feature information according to its corresponding preprocessing rule to obtain the feature value corresponding to each item of feature information.
Finally, generate the feature vector of each account to be processed according to the feature values corresponding to the items of feature information in that account.
The preprocessing rules can be set according to the specific application scenario; for example, they can be set with reference to the following examples:
Example one: when an item of feature information is any of a mobile phone number, a network address, or device identification information, the preprocessing rule corresponding to that feature information includes: extracting at least one feature field from the feature information to be processed of each account to be processed. For example, the first seven digits, 1234567, are taken from the mobile phone number 12345678912 as a feature field of the phone number; as another example, the first two groups of digits, 10.11, or the first three groups, 10.11.12, are taken from the IP address 10.11.12.13 as a feature field of the IP address; as a further example, part of the device identification information is extracted as a feature field of the device identification information. Then, all feature fields of the feature information to be processed in all accounts to be processed are added to the feature field set of that feature information, and the number of occurrences of each feature field is counted in the set. The feature value of the feature information to be processed is determined according to the number of occurrences of each feature field of the feature information to be processed in each account to be processed.
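Example one can be sketched as below. The description only says that the feature value is determined according to the number of occurrences, so using the raw count directly as the feature value is an assumption, and phone_prefix, ip_prefix, and frequency_feature are hypothetical helper names.

```python
from collections import Counter


def phone_prefix(phone):
    """Feature field of a phone number: its first seven digits."""
    return phone[:7]


def ip_prefix(ip, groups=2):
    """Feature field of an IPv4 address: its first `groups` dotted groups."""
    return ".".join(ip.split(".")[:groups])


def frequency_feature(raw_values, extract):
    """Map each account's raw value to the occurrence count of its feature field.

    Assumption: the count itself is used as the feature value.
    """
    fields = [extract(value) for value in raw_values]
    counts = Counter(fields)  # occurrences of each feature field in the set
    return [counts[field] for field in fields]


# Hypothetical phone numbers of four accounts to be processed.
phones = ["12345678912", "12345670001", "12345679999", "19876543210"]
phone_values = frequency_feature(phones, phone_prefix)
```

Accounts that share a prefix receive the same, larger feature value, so registrations from one number block or one subnet end up close together in feature space.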
Example two: when an item of feature information is either geographic location information or education information, the preprocessing rule corresponding to that feature information includes: converting the feature information to be processed of each account to be processed into a code by one-hot encoding, and using the resulting code as the feature value of the feature information to be processed. For example, if the education information has three field values, namely senior high school, bachelor's degree, and master's degree, a three-bit code is used as the feature value of the education information, where each bit represents one education level; when the field value of the education information of an account to be processed is senior high school, the bit representing senior high school is set to 1 and the other two bits are set to 0.
Alternatively, the field values of the feature information to be processed in all accounts to be processed are added to the field value set of that feature information, and the number of occurrences of each field value is counted in the set. The feature value of the feature information to be processed is determined according to the number of occurrences of the field value of the feature information to be processed in each account to be processed.
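The one-hot variant of example two can be written in a few lines. This is a sketch only; the category labels and their order are assumptions, and one_hot is a hypothetical helper name.

```python
def one_hot(value, categories):
    """Return a list with 1 at the position of `value` and 0 elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if category == value else 0 for category in categories]


# Assumed ordering of the three education levels from the example.
EDUCATION_LEVELS = ["high school", "bachelor", "master"]
code = one_hot("high school", EDUCATION_LEVELS)
```

When the code is placed into a feature vector, each bit typically occupies its own component, so an item with three field values contributes three columns to the feature matrix.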
Step S30: perform cluster analysis on all the feature vectors to obtain multiple feature matrices, each feature matrix consisting of several feature vectors.
All feature vectors are input into a pre-established clustering model (for example, a clustering model based on the EM algorithm). The clustering model performs cluster analysis on the feature vectors using, for example, the K-means algorithm (a hard clustering algorithm) or a Gaussian mixture model (GMM), to obtain multiple feature vector groups. Each feature vector group is output in the form of a feature matrix; for example, the feature vectors in a group are used as row vectors or as column vectors to form a corresponding feature matrix.
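The grouping of feature vectors into feature matrices can be illustrated with a deliberately small K-means sketch. It is not the clustering model of the embodiment: initialising the centroids with the first k vectors, the fixed iteration count, and the sample vectors are all simplifying assumptions, and a production system would more likely rely on an established library.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Group vectors into k clusters; each cluster is returned as a
    feature matrix whose rows are the member feature vectors."""
    centroids = [list(v) for v in vectors[:k]]  # naive initialisation (assumption)
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            # assign the vector to its nearest centroid (Euclidean distance)
            nearest = min(range(k), key=lambda i: math.dist(v, centroids[i]))
            groups[nearest].append(v)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster becomes empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return groups


# Hypothetical two-dimensional feature vectors of five accounts.
vectors = [[1.0, 1.0], [9.0, 9.0], [1.2, 0.9], [8.8, 9.1], [0.9, 1.1]]
matrices = kmeans(vectors, k=2)
```

Each element of matrices is one feature matrix with the feature vectors of a group as its rows.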
Step S40: obtain the feature values corresponding to each key field from each feature matrix, and take all feature values corresponding to one key field in the same feature matrix as one feature value group of that key field.
For example, if a feature matrix is formed by taking each feature vector in a feature vector group as a row vector, so that each column of the feature matrix holds all feature values corresponding to one item of feature information, then the column corresponding to each key field can be taken directly as the feature value group of that key field.
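With row vectors, taking a feature value group amounts to reading one column. The sketch below assumes a hypothetical mapping from key field names to column indices; feature_value_groups is an illustrative name.

```python
def feature_value_groups(matrix, key_columns):
    """Return {key field name: list of that column's values} for one feature matrix."""
    columns = list(zip(*matrix))  # transpose rows into columns
    return {name: list(columns[idx]) for name, idx in key_columns.items()}


# Hypothetical feature matrix: rows are accounts, columns are feature information.
matrix = [
    [3, 12, 0.1],
    [3, 11, 0.7],
    [2, 12, 0.4],
]
KEY_COLUMNS = {"phone_number": 0, "ip_address": 1}  # assumed column layout
groups = feature_value_groups(matrix, KEY_COLUMNS)
```

The third column is not a key field in this layout, so it does not produce a feature value group.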
Step S50: judge, according to the feature value group of each key field in each feature matrix, whether there is a feature matrix that satisfies the first preset condition and, if so, take each feature matrix that satisfies the first preset condition as a matrix to be processed.
Step S50 includes:
First, calculate the dispersion of the feature value group of each key field in each feature matrix. The dispersion of a feature value group refers to the degree of difference, or degree of scatter, among the feature values in the group. For example, an indicator such as the standard deviation, variance, or mean deviation can be calculated from the feature values in a feature value group and used as the dispersion of that group.
Next, judge whether there is a feature matrix that satisfies the first preset condition. If there is, take each feature matrix that satisfies the first preset condition as a matrix to be processed; if there is not, output a result indicating that no batch-registered account has been identified. The first preset condition is that the dispersions of the feature value groups of all key fields in a feature matrix are all less than the first preset threshold.
Step S60: query all matrices to be processed that satisfy the second preset condition and, when any are found, label every matrix to be processed that is found as an abnormal matrix and identify the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account.
Further, in this embodiment, before step S60, the method further includes:
Determining the feature value distribution data of each key field according to the feature values of all key fields in all matrices to be processed.
For example, a full matrix is formed from the feature vectors of all accounts to be processed, used either as row vectors or as column vectors; all feature values corresponding to each key field are extracted from the full matrix; and statistics are compiled on all feature values of each key field to obtain the feature value distribution data of that key field (for example, a cumulative distribution curve or a cumulative distribution table).
Further, in this embodiment, step S60 includes:
Determining, according to the feature value distribution data of each key field, the distribution probability value of the feature value group of each key field in each matrix to be processed. For example, in the feature value group of a key field in a matrix to be processed, denote the largest feature value as M and the smallest as N, and denote the numerical interval of the feature value group as [N, M]. The distribution probability value of the interval [N, M] is then determined from the feature value distribution data of the key field; for example, the cumulative distribution probability values of N and M are looked up, and the value for N is subtracted from the value for M to obtain the distribution probability value of the interval [N, M].
Querying all feature value groups whose distribution probability value is less than the third preset threshold. When any are found, the matrix to be processed to which each such feature value group belongs is labeled as an abnormal matrix, and the account to be processed corresponding to each feature vector in each abnormal matrix is identified as a batch-registered account.
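The interval probability and the third-threshold check can be sketched together. The sketch assumes an empirical cumulative distribution over all accounts, reads the text as flagging a matrix when any one of its key-field groups falls below the threshold, and uses hypothetical names (interval_probability, find_abnormal) and sample values throughout.

```python
from bisect import bisect_right


def interval_probability(all_values, group):
    """Distribution probability of [N, M] computed as F(M) - F(N), where F is
    the empirical cumulative distribution of all_values."""
    ordered = sorted(all_values)

    def cdf(x):
        return bisect_right(ordered, x) / len(ordered)

    n, m = min(group), max(group)
    return cdf(m) - cdf(n)


def find_abnormal(matrices, all_rows, key_columns, threshold):
    """Return the indices of matrices that have at least one key-field
    feature value group with an interval probability below threshold."""
    abnormal = []
    for idx, matrix in enumerate(matrices):
        for col in key_columns:
            population = [row[col] for row in all_rows]
            group = [row[col] for row in matrix]
            if interval_probability(population, group) < threshold:
                abnormal.append(idx)
                break  # one qualifying group is enough under this reading
    return abnormal


# Hypothetical data: ten accounts with one key field (column 0).
all_rows = [[float(v)] for v in range(1, 11)]
narrow = [[5.0], [6.0]]   # spans little of the overall distribution
wide = [[1.0], [10.0]]    # spans most of the overall distribution
abnormal_indices = find_abnormal([narrow, wide], all_rows, key_columns=[0], threshold=0.2)
```

A group confined to an interval that carries little probability mass in the overall population is treated as suspicious, so only the first matrix is labeled abnormal here. Note that F(M) - F(N) with an empirical distribution excludes the mass located exactly at N, which follows the subtraction described in the text.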
The present invention obtains a first preset number of items of feature information from each account to be processed, the first preset number of items including a second preset number of items marked as key fields; converts each of the first preset number of items of feature information of each account to be processed into a corresponding feature value, and generates a feature vector for each account to be processed according to the feature values corresponding to the items of feature information in that account; performs cluster analysis on all the feature vectors to obtain multiple feature matrices, each consisting of several feature vectors; obtains the feature values corresponding to each key field from each feature matrix, and takes all feature values corresponding to one key field in the same feature matrix as one feature value group of that key field; judges, according to the feature value group of each key field in each feature matrix, whether there is a feature matrix that satisfies the first preset condition and, if so, takes each such feature matrix as a matrix to be processed; and queries all matrices to be processed that satisfy the second preset condition and, when any are found, labels every matrix found as an abnormal matrix and identifies the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account. Compared with the prior art, the present invention takes multiple kinds of feature information, including key fields that are highly associated with batch registration behavior, as the analysis target, and identifies batch-registered accounts only after analysis by several analysis means; therefore, the present invention can identify a variety of batch registration behaviors with high recognition accuracy.
Further, the present invention also proposes a computer-readable storage medium storing a program for identifying batch registration behavior, the program being executable by at least one processor to cause the at least one processor to perform the method for identifying batch registration behavior in any of the above embodiments.
The above are only preferred embodiments of the present invention and do not limit the patent scope of the invention. Any equivalent structural transformation made using the contents of the description and drawings of the present invention under its inventive concept, or any direct or indirect application in other related technical fields, is included in the scope of patent protection of the present invention.

Claims (10)

1. An electronic device comprising a memory and a processor, characterized in that a program for identifying batch registration behavior is stored in the memory, and the program, when executed by the processor, implements the following steps:
Obtaining step: obtaining a first preset number of items of feature information from each account to be processed, the first preset number of items of feature information including a second preset number of items marked as key fields;
Generation step: converting each of the first preset number of items of feature information of each account to be processed into a corresponding feature value, and generating a feature vector for each account to be processed according to the feature values corresponding to the items of feature information in that account;
Clustering step: performing cluster analysis on all the feature vectors to obtain multiple feature matrices, each feature matrix consisting of several feature vectors;
Extraction step: obtaining the feature values corresponding to each key field from each feature matrix, and taking all feature values corresponding to one key field in the same feature matrix as one feature value group of that key field;
Judgment step: judging, according to the feature value group of each key field in each feature matrix, whether there is a feature matrix that satisfies a first preset condition and, if so, taking each feature matrix that satisfies the first preset condition as a matrix to be processed;
Identification step: querying all matrices to be processed that satisfy a second preset condition and, when any are found, labeling every matrix to be processed that is found as an abnormal matrix and identifying the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account.
2. The electronic device according to claim 1, characterized in that the judgment step comprises:
calculating the dispersion of the feature value group of each key field in each feature matrix;
judging whether there is a feature matrix that satisfies the first preset condition and, if so, taking each feature matrix that satisfies the first preset condition as a matrix to be processed, the first preset condition being that the dispersions of the feature value groups of all key fields in a feature matrix are all less than a first preset threshold.
3. The electronic device according to claim 1 or 2, characterized in that, when executing the program for identifying batch registration behavior, the processor further implements the following step before the identification step:
determining the feature value distribution data of each key field according to the feature values of all key fields in all matrices to be processed;
and the identification step comprises:
determining, according to the feature value distribution data of each key field, the distribution probability value of the feature value group of each key field in each matrix to be processed;
querying all feature value groups whose distribution probability value is less than a third preset threshold and, when any are found, labeling the matrix to be processed to which each such feature value group belongs as an abnormal matrix and identifying the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account.
4. The electronic device according to claim 1 or 2, characterized in that the generation step comprises:
determining the preprocessing rule corresponding to each item of feature information according to a predetermined mapping between feature information and preprocessing rules;
preprocessing each item of feature information according to its corresponding preprocessing rule to obtain the feature value corresponding to each item of feature information;
generating the feature vector of each account to be processed according to the feature values corresponding to the items of feature information in that account.
5. The electronic device according to claim 4, characterized in that the key fields include one or more of a mobile phone number, a network address, and device identification information;
when an item of feature information is any of a mobile phone number, a network address, or device identification information, the preprocessing rule corresponding to that feature information comprises:
taking the feature information as feature information to be processed, and extracting at least one feature field from the feature information to be processed of each account to be processed;
adding all feature fields of the feature information to be processed in all accounts to be processed to the feature field set of the feature information to be processed, and counting the number of occurrences of each feature field in the feature field set of the feature information to be processed;
determining the feature value of the feature information to be processed according to the number of occurrences of each feature field of the feature information to be processed in each account to be processed.
6. A method for identifying batch registration behavior, characterized in that the method comprises the steps of:
Obtaining step: obtaining a first preset number of items of feature information from each account to be processed, the first preset number of items of feature information including a second preset number of items marked as key fields;
Generation step: converting each of the first preset number of items of feature information of each account to be processed into a corresponding feature value, and generating a feature vector for each account to be processed according to the feature values corresponding to the items of feature information in that account;
Clustering step: performing cluster analysis on all the feature vectors to obtain multiple feature matrices, each feature matrix consisting of several feature vectors;
Extraction step: obtaining the feature values corresponding to each key field from each feature matrix, and taking all feature values corresponding to one key field in the same feature matrix as one feature value group of that key field;
Judgment step: judging, according to the feature value group of each key field in each feature matrix, whether there is a feature matrix that satisfies a first preset condition and, if so, taking each feature matrix that satisfies the first preset condition as a matrix to be processed;
Identification step: querying all matrices to be processed that satisfy a second preset condition and, when any are found, labeling every matrix to be processed that is found as an abnormal matrix and identifying the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account.
7. The method for identifying batch registration behavior according to claim 6, characterized in that the judgment step comprises:
calculating the dispersion of the feature value group of each key field in each feature matrix;
judging whether there is a feature matrix that satisfies the first preset condition and, if so, taking each feature matrix that satisfies the first preset condition as a matrix to be processed, the first preset condition being that the dispersions of the feature value groups of all key fields in a feature matrix are all less than a first preset threshold.
8. The method for identifying batch registration behavior according to claim 6 or 7, characterized in that, before the identification step, the method further comprises:
determining the feature value distribution data of each key field according to the feature values of all key fields in all matrices to be processed;
and the identification step comprises:
determining, according to the feature value distribution data of each key field, the distribution probability value of the feature value group of each key field in each matrix to be processed;
querying all feature value groups whose distribution probability value is less than a third preset threshold and, when any are found, labeling the matrix to be processed to which each such feature value group belongs as an abnormal matrix and identifying the account to be processed corresponding to each feature vector in each abnormal matrix as a batch-registered account.
9. The method for identifying batch registration behavior according to claim 6 or 7, characterized in that the generation step comprises:
determining the preprocessing rule corresponding to each item of feature information according to a predetermined mapping between feature information and preprocessing rules;
preprocessing each item of feature information according to its corresponding preprocessing rule to obtain the feature value corresponding to each item of feature information;
generating the feature vector of each account to be processed according to the feature values corresponding to the items of feature information in that account.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a program for identifying batch registration behavior, the program being executable by at least one processor to cause the at least one processor to perform the steps of the method for identifying batch registration behavior according to any one of claims 6 to 9.
CN201910067104.7A 2019-01-24 2019-01-24 Electronic device, method for identifying batch registration behaviors and storage medium Active CN109977992B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910067104.7A CN109977992B (en) 2019-01-24 2019-01-24 Electronic device, method for identifying batch registration behaviors and storage medium

Publications (2)

Publication Number Publication Date
CN109977992A true CN109977992A (en) 2019-07-05
CN109977992B CN109977992B (en) 2023-01-17

Family

ID=67076625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910067104.7A Active CN109977992B (en) 2019-01-24 2019-01-24 Electronic device, method for identifying batch registration behaviors and storage medium

Country Status (1)

Country Link
CN (1) CN109977992B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105634855A (en) * 2014-11-06 2016-06-01 Alibaba Group Holding Ltd Method and device for recognizing network address abnormality
CN105791255A (en) * 2014-12-23 2016-07-20 Alibaba Group Holding Ltd Method and system for identifying computer risks based on account clustering
CN105808988A (en) * 2014-12-31 2016-07-27 Alibaba Group Holding Ltd Method and device for identifying abnormal accounts

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Fang Yong et al.: "Fake user detection based on hierarchical clustering", Journal of Tsinghua University (Science and Technology) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110324352A (en) * 2019-07-11 2019-10-11 Wuhan Douyu Network Technology Co Ltd Method and device for identifying batch registered account groups
CN110324352B (en) * 2019-07-11 2021-10-15 Wuhan Douyu Network Technology Co Ltd Method and device for identifying batch registered account groups

Also Published As

Publication number Publication date
CN109977992B (en) 2023-01-17

Similar Documents

Publication Publication Date Title
CN111371858B (en) Group control equipment identification method, device, medium and electronic equipment
CN109685092B (en) Clustering method, equipment, storage medium and device based on big data
CN111507470A (en) Abnormal account identification method and device
CN109302541A (en) Electronic device, agent seat allocation method and computer readable storage medium
CN110264274A (en) Customer group division method, model generation method, device, equipment and storage medium
CN109933502A (en) Electronic device, method for processing user operation records and storage medium
CN110135421A (en) Licence plate recognition method, device, computer equipment and computer readable storage medium
CN108614895B (en) Abnormal data access behavior identification method and data processing device
CN116662839A (en) Associated big data cluster analysis method and device based on multidimensional intelligent acquisition
CN112950347B (en) Resource data processing optimization method and device, storage medium and terminal
CN111680167A (en) Service request response method and server
CN109977992A (en) Electronic device, method for identifying batch registration behaviors and storage medium
CN116186594B (en) Method for realizing intelligent detection of environment change trend based on decision network combined with big data
CN109784634A (en) Coverage area division method, electronic device and readable storage medium
CN109753561B (en) Automatic reply generation method and device
CN112069269A (en) Big data and multidimensional feature-based data tracing method and big data cloud server
CN113515591B (en) Text defect information identification method and device, electronic equipment and storage medium
CN109583492A (en) Method and terminal for identifying adversarial images
CN109034542A (en) Investment combination generation method, device and computer readable storage medium
CN112559589A (en) Remote surveying and mapping data processing method and system
CN112559590A (en) Mapping data resource processing method and device and server
CN115225489B (en) Dynamic control method for queue service flow threshold, electronic equipment and storage medium
CN116484230B (en) Method for identifying abnormal business data and training method of AI digital person
CN115759875B (en) Classified and hierarchical management method and system for suppliers of public resource transaction
CN117519948B (en) Method and system for realizing computing resource adjustment under building construction based on cloud platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant