CN106469179A - Information monitoring method and device - Google Patents
Information monitoring method and device
- Publication number
- CN106469179A CN201510518846.9A
- Authority
- CN
- China
- Prior art keywords
- feature
- account
- character feature
- character
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Transfer Between Computers (AREA)
Abstract
This application discloses an information monitoring method and device. The method includes: determining the account information of each account to be identified that is received within a specific time period; extracting the character features from each piece of received account information; counting, for each extracted character feature, the number of accounts within the specific time period that share that character feature; and judging whether each piece of account information is malicious information, according to a pre-established account quantity standard for each character feature and the counted number of accounts that share the same character feature within the specific time period. As a result, when faced with account information provided by users, a network service provider can effectively distinguish normal account information from malicious account information, which improves the accuracy of that distinction.
Description
Technical field
This application relates to the field of computer technology, and in particular to an information monitoring method and device.
Background
With the continuous development of network technology, a network service provider (for example, a website) can offer users a rich variety of network services after receiving the user information they provide.
At present, the account information received by a network service provider includes different types of information, such as account information registered by a user on a shopping website or on a gaming website. The network service provider stores the account information provided by users on a network server. However, the account information provided by a user may be malicious (for example, account information registered maliciously in bulk). Such malicious account information can interfere with the normal operation of the network service provider and waste resources unnecessarily.
In the prior art, the network service provider identifies and processes the account information it receives. Typically, the provider extracts from the received account information the records that have the same or similar information features, for example the same or a similar prefix in the account name. It then uses the other attributes in the account information to quantify the probability that the account information was generated in bulk. These other attributes include the user's personal information (such as the user's name and phone number) and device information (such as the Internet Protocol (IP) address). The more of these other attributes are identical, the greater the probability that the account information was generated in bulk, and hence the greater the probability that it is malicious account information; otherwise, the account information can be considered normal.
For example, suppose that users register account information on a shopping website, and the registered account information includes luha001@163.com, luha002@163.com, luha003@163.com, luha004@163.com and luha005@163.com. These records are clearly similar: the letters in the prefix are identical, and the digits in the prefix increase by a regular increment. Such account information is likely to be malicious. The server of the shopping website can therefore extract the account information containing these mailboxes, collect the other attributes contained in those records (such as the user's name and phone number), and compare them. The more attributes are identical, the greater the probability that the account information registered on the website is malicious.
However, for mailbox providers with a large user base, where millions of new users register every day, it is normal for a fairly large number of mailboxes to share the same pattern. Identification then necessarily relies on other account information, which makes the computation complex and prone to false positives. In addition, the device information in the account information (such as the IP address) is not stable, and users can change their IP address with certain network devices. As a result, the network service provider distinguishes normal account information from malicious account information with low accuracy.
Summary of the invention
The embodiments of this application provide an information monitoring method and device, to solve the problem that a network service provider distinguishes normal account information from malicious account information with low accuracy.
An information monitoring method provided by an embodiment of this application includes:
determining the account information of each account to be identified that is received within a specific time period;
extracting the character features from each piece of received account information;
counting, according to the extracted character features, the number of accounts within the specific time period that have the same character feature;
judging whether each piece of account information is malicious information, according to the pre-established account quantity standard corresponding to each character feature and the counted number of accounts within the specific time period that have the same character feature.
An information monitoring device provided by an embodiment of this application includes:
a receiving module, configured to determine the account information of each account to be identified that is received within a specific time period;
an extraction module, configured to extract the character features from each piece of received account information;
a statistics module, configured to count, according to the extracted character features, the number of accounts within the specific time period that have the same character feature;
a judgment module, configured to judge whether each piece of account information is malicious information, according to the pre-established account quantity standard corresponding to each character feature and the counted number of accounts within the specific time period that have the same character feature.
The embodiments of this application provide an information monitoring method and device. In the method, for the user information to be monitored, a first network server determines the account information of each account to be identified that is received within a specific time period, extracts the character features from each piece of received account information, counts the number of accounts within the specific time period that have the same character feature, and judges whether each piece of account information is malicious information according to the pre-established account quantity standard corresponding to each character feature and the counted number of accounts that have the same character feature. As a result, when faced with account information provided by users, a network service provider can effectively distinguish normal account information from malicious account information, which improves the accuracy of that distinction.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of this application and form a part of it. The illustrative embodiments of this application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of the process of the information monitoring method provided by an embodiment of this application;
Fig. 2 is a schematic structural diagram of the information monitoring device provided by an embodiment of this application.
Detailed description of the embodiments
To make the purpose, technical solutions and advantages of this application clearer, the technical solutions are described clearly and completely below with reference to specific embodiments of this application and the corresponding drawings. The described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this application, without creative effort, fall within the scope of protection of this application.
Fig. 1 shows the information monitoring process provided by an embodiment of this application, which specifically includes the following steps:
S101: Determine the account information of each account to be identified that is received within a specific time period.
The account information includes, but is not limited to, account information that a user fills in on a web page (or an application interface). The specific time period may be the current time period or a time period in the past.
In this embodiment, the network server receives and stores each piece of account information registered by users. When the network server receives a judgment instruction, it first determines the account information registered by users within the specific time period (for example, the past day or the past hour); this is the account information of the accounts to be identified. The judgment instruction causes the network server to judge whether the account information of each account to be identified is malicious information.
S102: Extract the character features from each piece of received account information.
A character feature is information that characterizes the account information, such as the number of characters in the account information.
In this embodiment, the network server receives the account information provided by users within the specific time period and extracts from it the information that carries a given feature. For example, suppose that in the current time period a user registers an account on a shopping website. The website's server receives the account information registered by the user and applies a specific feature extraction method (for example, one that marks only the positions of the digits, without recording which digits they are) to extract a character feature from that account information, such as the number of digits.
S103: According to the extracted character features, count the number of accounts within the specific time period that have the same character feature.
In this embodiment, within the specific time period, the network server extracts character features from the account information using the specific feature extraction method (for example, marking only the positions of the digits, without recording which digits they are), groups identical character features into the same category, counts the number of accounts corresponding to the character feature in each category, and stores the counts on the network server.
For example, suppose that in the current time period the server of a shopping website receives the following account information (account names) from users: dafa123, dasa324, dafa897 and dasa898. Suppose also that the feature extraction method marks only the positions of the digits, without recording which digits they are. The character features of these account names are then dafa^^^, dasa^^^, dafa^^^ and dasa^^^. The character features of dafa123 and dafa897 are both dafa^^^, and those of dasa324 and dasa898 are both dasa^^^. The server therefore assigns dafa^^^ to a first category and dasa^^^ to a second category. The first category corresponds to 2 accounts and the second category corresponds to 2 accounts, and these two counts are stored on the network server.
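The extraction and counting in S103 can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: the function names `mask_digits` and `count_by_feature` are hypothetical, and the sketch assumes that account names contain only ASCII digits.

```python
import re
from collections import Counter


def mask_digits(account_name: str) -> str:
    """Replace each ASCII digit with '^', keeping the digit positions."""
    return re.sub(r"[0-9]", "^", account_name)


def count_by_feature(account_names):
    """Group account names by character feature and count each group."""
    return Counter(mask_digits(name) for name in account_names)


# Account names from the example in the text.
names = ["dafa123", "dasa324", "dafa897", "dasa898"]
counts = count_by_feature(names)
print(dict(counts))  # {'dafa^^^': 2, 'dasa^^^': 2}
```

A `Counter` keyed by the masked string serves as the "category" here, so grouping and counting happen in a single pass over the names.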
S104: Judge whether each piece of account information is malicious information, according to the pre-established account quantity standard corresponding to each character feature and the counted number of accounts within the specific time period that have the same character feature.
The account quantity standard corresponding to each character feature is a criterion established from the number of accounts corresponding to that character feature, for example the mean number of accounts corresponding to the character feature.
Within the specific time period, the network server uses the specific feature extraction method to count the number of accounts corresponding to each category of character feature in the users' account information. For each category, it looks up the matching character feature category among the pre-established account quantity standards (for example, the mean number of accounts corresponding to the character feature), retrieves the account quantity standard of that category, and compares the count against it.
Continuing the previous example, suppose that among the pre-established account quantity standards, the standard for the character feature of the first category is 3 and the standard for the character feature of the second category is 1. The number of accounts counted by the network server for the first category is 2, which does not exceed that category's standard, so the network server releases the account information corresponding to the first category (that is, it takes no action on that account information). The number of accounts counted for the second category is 2, which exceeds that category's standard, so the network server applies risk control processing to the account information corresponding to the second category (that is, it raises a behavior warning).
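The comparison in S104 can be sketched as follows, using the counts and standards from the example. The function name `judge` and the labels `"release"` and `"risk_control"` are hypothetical, and the sketch assumes that every counted category already has a pre-established standard.

```python
def judge(counts, standards):
    """Flag a category when its count exceeds its account quantity standard."""
    decisions = {}
    for feature, count in counts.items():
        # Assumption: a standard exists for every counted category.
        if count > standards[feature]:
            decisions[feature] = "risk_control"  # raise a behavior warning
        else:
            decisions[feature] = "release"       # take no action
    return decisions


# Counts from S103 and the standards assumed in the example.
counts = {"dafa^^^": 2, "dasa^^^": 2}
standards = {"dafa^^^": 3, "dasa^^^": 1}
decisions = judge(counts, standards)
print(decisions)  # {'dafa^^^': 'release', 'dasa^^^': 'risk_control'}
```

The strict `>` comparison follows the wording of the example, in which a count that does not exceed the standard is released.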
Through the above steps, within the specific time period, the network server receives the account information of each account to be identified, extracts the corresponding character features using the feature extraction method, groups identical character features into the same category, and counts the number of accounts corresponding to the character feature of each category within the specific time period. For the character feature of each category, it looks up the corresponding account quantity standard and thereby judges whether the account information is malicious information. As a result, when faced with account information provided by users, a network service provider can effectively distinguish normal account information from malicious account information, which improves the accuracy of that distinction.
To illustrate the information monitoring method of this application more clearly, the following describes in detail the case where the account information includes an account name and the specific time period is a time span divided by a preset unit of time.
In practical applications, a user can register account information on a shopping website in order to carry out the operations they need on that website. However, the account information registered by a user may be malicious. Therefore, after receiving the account information, the network server determines the account name in the account information and extracts character features from each account name according to at least one preset feature extraction method. The preset feature extraction methods are based on any combination of the number of characters in the account name, the character types, and the character ordering.
For example, suppose the account names in the account information registered by users on a shopping website include fawd2431, faad783, fawd7972, faad442 and luha8988, and that there are eight feature extraction methods:
Method one: mask all digits in the account name and keep the number of masked digits, where masking means that the specific digit values are not recorded;
Method two: mask all digits in the account name without recording how many there are; it is only necessary to mark that the masked part is numeric;
Method three: mask all letters in the account name and keep the number of masked letters;
Method four: mask all letters in the account name without recording how many there are; it is only necessary to mark that the masked part is alphabetic;
Method five: mask all characters in the account name except those at a specified position; the number of digits is not recorded, and only the number of masked non-digit characters (letters and symbols) is kept, where the characters at the specified position must be non-digit characters;
Method six: mask all characters in the account name except those at a specified position; the number of masked characters is not recorded, and it is only necessary to mark whether each masked part is numeric or non-numeric, where the characters at the specified position must be non-digit characters;
Method seven: mask all letter combinations and all digit combinations in the account name; it is only necessary to mark whether each masked part is a digit combination or a letter combination;
Method eight: mask all character combinations in the account name without recording the number of characters in each masked combination; it is only necessary to mark that the masked part is a character combination, where a character combination is any run of characters other than the delimiter characters that separate segments.
These eight methods are applied in parallel, and each feature extraction method yields one character feature per account name. Specifically:
The character features extracted from the above account names by method one are: fawd^^^^, faad^^^, fawd^^^^, faad^^^, luha^^^^;
The character features extracted by method two are: fawd^, faad^, fawd^, faad^, luha^;
The character features extracted by method three are: cccc2431, cccc783, cccc7972, cccc442, cccc8988;
The character features extracted by method four are: c2431, c783, c7972, c442, c8988;
The character features extracted by method five are: facc^, facc^, facc^, facc^, lucc^, where the specified position is the first two non-digit characters;
The character features extracted by method six are: fac^, fac^, fac^, fac^, luc^, where the specified position is the first two non-digit characters;
The character features extracted by method seven are: c^, c^, c^, c^, c^;
The character features extracted by method eight are: x, x, x, x, x.
Here "c" is the letter identifier, "^" is the digit identifier, and "x" is the character-combination identifier. The eight feature extraction methods thus yield 22 distinct character features from the 5 account names.
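The eight methods can be approximated with regular expressions, as in the sketch below. Several details are assumptions made only for illustration: the function names are hypothetical, the "specified position" in methods five and six is taken to be the first two characters (which are non-digits in every example name), and the delimiter set for method eight is taken to be `.`, `_`, `-` and `@`, which the text does not specify.

```python
import re

KEEP = 2            # assumed "specified position": the first two characters
DELIMS = r"._\-@"   # assumed delimiter characters for method eight


def method_1(name):  # mask digits, keep their number
    return re.sub(r"[0-9]", "^", name)

def method_2(name):  # mask digit runs, drop their length
    return re.sub(r"[0-9]+", "^", name)

def method_3(name):  # mask letters, keep their number
    return re.sub(r"[A-Za-z]", "c", name)

def method_4(name):  # mask letter runs, drop their length
    return re.sub(r"[A-Za-z]+", "c", name)

def method_5(name):  # keep the head; count masked non-digits, collapse digits
    head, rest = name[:KEEP], name[KEEP:]
    rest = re.sub(r"[^0-9]", "c", rest)
    return head + re.sub(r"[0-9]+", "^", rest)

def method_6(name):  # keep the head; collapse non-digit runs and digit runs
    head, rest = name[:KEEP], name[KEEP:]
    rest = re.sub(r"[^0-9]+", "c", rest)
    return head + re.sub(r"[0-9]+", "^", rest)

def method_7(name):  # collapse letter runs to "c" and digit runs to "^"
    return re.sub(r"[0-9]+", "^", re.sub(r"[A-Za-z]+", "c", name))

def method_8(name):  # collapse every run of non-delimiter characters to "x"
    return re.sub(rf"[^{DELIMS}]+", "x", name)


METHODS = [method_1, method_2, method_3, method_4,
           method_5, method_6, method_7, method_8]

names = ["fawd2431", "faad783", "fawd7972", "faad442", "luha8988"]
features = {m.__name__: [m(n) for n in names] for m in METHODS}
distinct = {f for per_method in features.values() for f in per_method}
print(len(distinct))  # 22
```

Each method yields five features (40 in total), of which 22 are distinct strings, matching the count given in the text.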
In practical applications, the account information registered by users is not limited to the 5 records in the example and may number in the hundreds, thousands, or even tens of thousands; 5 records are used here only to illustrate how the steps are carried out. Likewise, the feature extraction methods are not limited to the 8 above: N extraction methods can be set as needed.
In this embodiment, within the specific time period, different character features are extracted from the account names according to the feature extraction methods. The network server groups identical character features into the same category, counts the number of accounts corresponding to the character feature in each category, and stores the counts for the specific time period on the network server.
Continuing the example, suppose the account names in the account information registered by users on a shopping website still include fawd2431, faad783, fawd7972, faad442 and luha8988, and that only three feature extraction methods are selected: methods one, two and seven from the example above. The character features extracted from the account names by method one are fawd^^^^, faad^^^, fawd^^^^, faad^^^ and luha^^^^; by method two, fawd^, faad^, fawd^, faad^ and luha^; and by method seven, c^, c^, c^, c^ and c^. The three methods extract 15 character features in total from the 5 account names. Identical character features are grouped into the same category, so the 15 character features fall into 7 categories: fawd^^^^, faad^^^, luha^^^^, fawd^, faad^, luha^ and c^. The network server then counts the number of accounts corresponding to the character feature in each category: 2 for fawd^^^^, 2 for faad^^^, 1 for luha^^^^, 2 for fawd^, 2 for faad^, 1 for luha^, and 5 for c^. These counts are stored on the network server.
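The grouping of the 15 extracted features into 7 categories can be reproduced with a short sketch. The function names are hypothetical, and the regular expressions assume ASCII letters and digits only.

```python
import re
from collections import Counter


def mask_digits_keep_count(name):      # method one
    return re.sub(r"[0-9]", "^", name)

def mask_digits_drop_count(name):      # method two
    return re.sub(r"[0-9]+", "^", name)

def mask_letter_and_digit_runs(name):  # method seven
    return re.sub(r"[0-9]+", "^", re.sub(r"[A-Za-z]+", "c", name))


names = ["fawd2431", "faad783", "fawd7972", "faad442", "luha8988"]
methods = [mask_digits_keep_count,
           mask_digits_drop_count,
           mask_letter_and_digit_runs]

# One feature per (method, account name) pair; identical features share a key.
category_counts = Counter(m(n) for m in methods for n in names)

for feature, count in category_counts.items():
    print(feature, count)
```

The sum of the counts is 15 (3 methods times 5 names), while the number of keys is 7, which is the number of categories stored by the server in the example.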
After counting the number of accounts corresponding to the character feature of each category, the network server compares each count with the pre-established account quantity standard for that character feature. The account quantity standard for each character feature therefore needs to be established in advance.
In this embodiment, establishing the account quantity standard for each character feature specifically includes: based on the character features in the historical account information received in multiple historical time periods, grouping identical character features within each historical time period, where each historical time period has the same length as the specific time period; for each feature category, counting the number of accounts corresponding to the historical character feature of that category in each historical time period; determining from these counts the mean and standard deviation of the category's account quantity over all historical time periods; and determining the category's account quantity standard from that mean and standard deviation.
For example, suppose 4 historical time periods are set, each one day long. On the first day, 100 pieces of historical account information are received from users (assume that the historical account names in these 100 records fall under the same character features as the two sample names that follow). Take two of the historical account names as examples: fawd2431 and faw783. Using feature extraction methods one and two from above, the character features extracted by method one are fawd^^^^ and faw^^^, and the character features extracted by method two are fawd^ and faw^, where "^" is the digit identifier. The network server treats fawd^^^^ as the first category, faw^^^ as the second category, fawd^ as the third category, and faw^ as the fourth category. From the 100 account names, the network server counts the number of accounts corresponding to the character feature of each category: 70 for the first category, 30 for the second category, 60 for the third category, and 40 for the fourth category.
Suppose 100 pieces of historical account information are received from users on the second day. Among the character features extracted by method one, the first category corresponds to 60 accounts and the second category to 40 accounts. Among the character features extracted by method two, the third category corresponds to 50 accounts and the fourth category to 50 accounts.
Suppose 100 pieces of historical account information are received from users on the third day. Among the character features extracted by method one, the first category corresponds to 50 accounts and the second category to 50 accounts. Among the character features extracted by method two, the third category corresponds to 40 accounts and the fourth category to 60 accounts.
Suppose 100 pieces of historical account information are received from users on the fourth day. Among the character features extracted by method one, the first category corresponds to 80 accounts and the second category to 20 accounts. Among the character features extracted by method two, the third category corresponds to 50 accounts and the fourth category to 50 accounts.
The network server counts the number of accounts corresponding to each category on each of these four days; that is, each character feature corresponds to a certain number of accounts on each day. By fitting a normal distribution, the mean and standard deviation of the daily account count for each character feature can be obtained: for the first-category character feature the mean daily account count is 65 and the standard deviation is 11; for the second category the mean is 35 and the standard deviation is 11; for the third category the mean is 55 and the standard deviation is 11; and for the fourth category the mean is 45 and the standard deviation is 11.
Assume that in this example the formula (μ + kσ) is used as the account quantity standard for each character feature, where μ denotes the mean daily account count for the category's character feature, σ denotes the standard deviation of that daily account count, and k denotes the anomaly index coefficient, assumed to be 2 in this example. The account quantity standard is then 87 for the first-category character feature, 57 for the second, 77 for the third and 67 for the fourth. The account quantity standard obtained for each category, together with the mean and standard deviation of that category's account count, is stored in the database of the network server as the feature quantity standard.
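By way of illustration only, the computation of the account quantity standard (μ + kσ) can be sketched in Python as follows. The function names are hypothetical and do not come from the application. The use of the population standard deviation in `standard_from_history` is an assumption, since the text does not say which form of standard deviation is intended.

```python
from statistics import mean, pstdev


def account_quantity_standard(mu, sigma, k=2.0):
    """Return the account quantity standard mu + k * sigma."""
    return mu + k * sigma


def standard_from_history(daily_counts, k=2.0):
    """Derive (mean, standard deviation, standard) from per-day counts.

    Assumption: the population standard deviation is used; the text does
    not state whether the population or sample form is intended.
    """
    mu = mean(daily_counts)
    sigma = pstdev(daily_counts)
    return mu, sigma, account_quantity_standard(mu, sigma, k)


# Means and standard deviations quoted in the example, with k = 2.
history = {"first": (65, 11), "second": (35, 11),
           "third": (55, 11), "fourth": (45, 11)}
standards = {cat: account_quantity_standard(mu, sigma)
             for cat, (mu, sigma) in history.items()}
print(standards)
```

With the quoted means and standard deviations, the sketch yields the standards 87, 57, 77 and 67 stated in the example.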
In this embodiment of the application, within the specific time period the network server, through the above steps, extracts the character features of the account names in all account information according to the specified feature extraction methods, groups identical character features into the same category, counts the number of accounts corresponding to the character feature of each category, and compares that count with the account quantity standard established above for each character feature, in order to judge whether each account information item is malicious information. Judging whether each account information item is malicious information therefore specifically includes: for each feature category, judging whether the number of accounts corresponding to the character feature of that category in the specific time period is greater than the account quantity standard of that category; if so, the account information received in the specific time period that has the character feature of that category is malicious information; otherwise, that account information is normal information.
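The grouping, counting and comparison described above can be sketched as follows. This is a minimal sketch under stated assumptions: `extract_feature` is a hypothetical extractor (letters mapped to "L", digits to "D") standing in for the application's feature extraction methods, which are defined elsewhere in the description, and the standards in the usage lines are made-up values.

```python
from collections import Counter


def extract_feature(account_name):
    """Hypothetical extractor: letters become 'L', digits become 'D',
    any other character is kept unchanged."""
    return "".join(
        "L" if ch.isalpha() else "D" if ch.isdigit() else ch
        for ch in account_name
    )


def flag_categories(account_names, standards):
    """Return the feature categories whose account count in this time
    period is greater than the stored account quantity standard.

    Categories without a stored standard are not flagged (an assumption;
    the text does not cover previously unseen categories).
    """
    counts = Counter(extract_feature(name) for name in account_names)
    return {feature for feature, n in counts.items()
            if feature in standards and n > standards[feature]}


# Illustrative data only.
names = ["abc123", "xyz789", "qwe456", "bob"]
standards = {"LLLDDD": 2, "LLL": 5}
flagged = flag_categories(names, standards)
print(flagged)
```

Accounts whose extracted feature falls in a flagged category would be treated as malicious information, and all others as normal information.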
Before the account information received in the specific time period that has the character feature of a feature category is determined to be malicious information, the following is performed for each feature category whose account count exceeds its account quantity standard: the difference between the account count of that category in the specific time period and the mean account count of that category is determined, and the ratio of that difference to the standard deviation of the category's account count is determined. Among the ratios so determined, the feature category with the largest ratio is identified, and the account information corresponding to that feature category is malicious information.
The account information items identified as malicious information yield identical character features after feature extraction, so their character features are grouped into one category. When the number of accounts corresponding to that character feature exceeds the account quantity standard, the number of these account information items exceeds the daily number of normal account information items having the same character feature, and they may therefore be batch-registered malicious account information.
Continuing the above example, assume that the network server receives 155 account names registered by users on the current day. Feature extraction method one yields 76 accounts for the first-category character feature and 79 accounts for the second-category character feature; feature extraction method two yields 77 accounts for the third-category character feature and 78 accounts for the fourth-category character feature. Comparing these counts with the account quantity standards established above shows that the first-category count (76) is less than its standard (87), the second-category count (79) is greater than its standard (57), the third-category count (77) is equal to its standard (77), and the fourth-category count (78) is greater than its standard (67).
Therefore, of the four categories, the account counts of the first and third categories do not exceed their account quantity standards, while the account counts of the second and fourth categories do. For the second category, the difference between the current account count and the historical mean account count is 44, that is, 79 - 35 = 44; for the fourth category, the difference is 33, that is, 78 - 45 = 33. The ratio of the difference to the historical standard deviation is 4 for the second category (44 / 11) and 3 for the fourth category (33 / 11). The ratio of the second category is the larger of the two, so the account information corresponding to the second-category character feature is determined to be malicious information.
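The arithmetic of this example can be checked with a short Python sketch. The variable and function names are hypothetical; the numbers are those given in the example, with k = 2.

```python
def deviation_ratio(count, mu, sigma):
    """Ratio of (count - mean) to the standard deviation."""
    return (count - mu) / sigma


# (mean, standard deviation) per category, from the historical data.
stats = {"first": (65, 11), "second": (35, 11),
         "third": (55, 11), "fourth": (45, 11)}
# Account counts received on the current day.
today = {"first": 76, "second": 79, "third": 77, "fourth": 78}
k = 2

# Only categories strictly above mu + k * sigma are candidates.
exceeding = [cat for cat, n in today.items()
             if n > stats[cat][0] + k * stats[cat][1]]

ratios = {cat: deviation_ratio(today[cat], *stats[cat]) for cat in exceeding}
most_abnormal = max(ratios, key=ratios.get)
print(exceeding, ratios, most_abnormal)
```

The third category is not a candidate because its count equals, but does not exceed, its standard of 77.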
In the above example, the account information corresponding to the second category is judged to be malicious information and is processed accordingly: the malicious information corresponding to the second category is removed, for a total of 79 accounts. The number of these malicious accounts that also fall in each of the other categories is counted and subtracted from that category. Assume that the third category has 77 accounts, of which 35 are malicious, and that the fourth category has 78 accounts, of which 44 are malicious. After the malicious accounts are removed from each category, the third category has 42 accounts and the fourth category has 34 accounts. The server then recalculates and judges whether the new account counts of the first, third and fourth categories exceed their account quantity standards, and repeats the process until no category's account count exceeds its account quantity standard; the account information corresponding to the remaining character features is then all normal information.
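The iterative removal described above can be sketched as follows. This is an illustrative sketch only: each account is represented as a tuple of category labels, one per feature extraction method, which is an assumed data layout. The 42/34 split of the first-category accounts is inferred from the counts in the example, and every standard deviation is assumed to be nonzero.

```python
from collections import Counter


def iterative_removal(accounts, stats, k=2.0):
    """Repeatedly remove the most abnormal category until none exceeds
    its account quantity standard mu + k * sigma.

    accounts: list of tuples of category labels, one per extraction method.
    stats: {category: (mean, standard deviation)}, with sigma > 0.
    Returns (normal_accounts, malicious_accounts).
    """
    remaining = list(accounts)
    malicious = []
    while True:
        counts = Counter(cat for acct in remaining for cat in acct)
        ratios = {cat: (n - stats[cat][0]) / stats[cat][1]
                  for cat, n in counts.items()
                  if n > stats[cat][0] + k * stats[cat][1]}
        if not ratios:
            return remaining, malicious
        worst = max(ratios, key=ratios.get)
        malicious += [acct for acct in remaining if worst in acct]
        remaining = [acct for acct in remaining if worst not in acct]


stats = {1: (65, 11), 2: (35, 11), 3: (55, 11), 4: (45, 11)}
# 155 accounts: (category under method one, category under method two).
accounts = [(2, 3)] * 35 + [(2, 4)] * 44 + [(1, 3)] * 42 + [(1, 4)] * 34
normal, malicious = iterative_removal(accounts, stats)
print(len(malicious), Counter(cat for acct in normal for cat in acct))
```

On the example data, the sketch removes the 79 second-category accounts in the first pass, after which the first, third and fourth categories hold 76, 42 and 34 accounts and no further category exceeds its standard.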
In practical applications, the network server may store the account count of each category's character feature counted in each specific time period. If the account count of a category is above that category's account quantity standard for several consecutive days, for example if the account count of the second category is above the second category's account quantity standard for three consecutive days, the network server may issue an early warning for the account information corresponding to the second category.
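A minimal sketch of the early-warning check follows. The function name and the sample counts are hypothetical; the three-day window and the standard of 57 follow the example for the second category.

```python
def needs_warning(daily_counts, standard, min_days=3):
    """Return True if the count stays above the standard for at least
    min_days consecutive days anywhere in the stored series."""
    run = 0
    for count in daily_counts:
        run = run + 1 if count > standard else 0
        if run >= min_days:
            return True
    return False


# Illustrative stored counts for the second category (standard 57).
print(needs_warning([60, 58, 59], 57))
print(needs_warning([60, 50, 59, 61], 57))
```

The first series stays above the standard on three consecutive days and triggers a warning; the second is interrupted on its second day and does not.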
The above is the information monitoring method provided by the embodiments of this application. Based on the same idea, the embodiments of this application also provide an information monitoring device.
As shown in Fig. 2, an information monitoring device provided by an embodiment of this application includes:
a receiving module 201, configured to determine the account information of each account to be identified received in a specific time period;
an extraction module 202, configured to extract the character features in each received account information item;
a statistics module 203, configured to count, according to the extracted character features, the number of accounts having the same character feature in the specific time period;
a judgment module 204, configured to judge whether each account information item is malicious information according to the pre-established account quantity standard for each character feature and the counted number of accounts having the same character feature in the specific time period.
In this embodiment of the application, the account information includes an account name, and the specific time period includes a time length divided by a preset unit of time.
The extraction module 202 is specifically configured to determine the account name in each account information item and to extract character features from each account name according to at least one preset feature extraction method.
The statistics module 203 is specifically configured to group identical character features among the character features extracted in the specific time period, and to count the number of accounts corresponding to the character feature of each feature category.
The device further includes:
a pre-establishing module 205, specifically configured to: group in advance, according to the character features in the historical account information received in multiple historical time periods, the identical character features within each historical time period, where each historical time period has the same time length as the specific time period; for each feature category, count the number of accounts corresponding to the historical character feature of that category in each historical time period; determine, from the counted numbers, the mean and standard deviation of the account count of that category over all historical time periods; and determine the account quantity standard of that category according to the mean and standard deviation of its account count.
The judgment module 204 is specifically configured to judge, for each feature category, whether the number of accounts corresponding to the character feature of that category in the specific time period is greater than the account quantity standard of that category; if so, the account information received in the specific time period that has the character feature of that category is malicious information; otherwise, that account information is normal information.
The judgment module 204 is further specifically configured to determine, for each feature category, the difference between the account count of that category in the specific time period and the mean account count of that category, to determine the ratio of that difference to the standard deviation of the category's account count, and to identify, among the ratios so determined, the feature category with the largest ratio.
The device further includes:
a processing module 206, specifically configured to, after the largest ratio among the ratios of the feature categories is determined, remove the account information corresponding to the largest ratio, count again the number of accounts corresponding to the character feature of each feature category, and compare the counts with the pre-established account quantity standard of each character feature, until the account count of every feature category is less than the corresponding account quantity standard.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces and memory.
Memory may include non-permanent memory in computer-readable media, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" and any other variants are intended to cover non-exclusive inclusion, so that a process, method, commodity or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, commodity or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, commodity or device that includes the element.
Those skilled in the art will understand that the embodiments of this application may be provided as a method, a system or a computer program product. Accordingly, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage and the like) containing computer-usable program code.
The above are only embodiments of this application and are not intended to limit this application. For those skilled in the art, this application may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of this application shall fall within the scope of the claims of this application.
Claims (16)
1. An information monitoring method, characterized by comprising:
determining the account information of each account to be identified received in a specific time period;
extracting the character features in each received account information item;
counting, according to the extracted character features, the number of accounts having the same character feature in the specific time period;
judging whether each account information item is malicious information according to a pre-established account quantity standard for each character feature and the counted number of accounts having the same character feature in the specific time period.
2. The method of claim 1, characterized in that the account information includes an account name;
the specific time period includes a time length divided by a preset unit of time.
3. The method of claim 2, characterized in that extracting the character features in each received account information item specifically includes:
determining the account name in each account information item;
extracting character features from each account name according to at least one preset feature extraction method.
4. The method of claim 1, characterized in that counting the number of accounts having the same character feature in the specific time period specifically includes:
grouping identical character features among the character features extracted in the specific time period;
counting the number of accounts corresponding to the character feature of each feature category.
5. The method of claim 4, characterized in that pre-establishing the account quantity standard for each character feature specifically includes:
grouping in advance, according to the character features in the historical account information received in multiple historical time periods, the identical character features within each historical time period, where each historical time period has the same time length as the specific time period;
for each feature category, counting the number of accounts corresponding to the historical character feature of that category in each historical time period;
determining, from the counted numbers of accounts corresponding to the historical character feature of that category in each historical time period, the mean and standard deviation of the account count of that category over all historical time periods;
determining the account quantity standard of that category according to the mean and standard deviation of its account count.
6. The method of claim 5, characterized in that judging whether each account information item is malicious information specifically includes:
for each feature category, judging whether the number of accounts corresponding to the character feature of that category in the specific time period is greater than the account quantity standard of that category;
if so, the account information received in the specific time period that has the character feature of that category is malicious information;
otherwise, the account information received in the specific time period that has the character feature of that category is normal information.
7. The method of claim 6, characterized in that, before the account information received in the specific time period that has the character feature of that category is determined to be malicious information, the method further includes:
for each feature category, determining the difference between the account count of that category in the specific time period and the mean account count of that category;
determining the ratio of that difference to the standard deviation of the account count of that category;
identifying, among the ratios so determined, the feature category with the largest ratio.
8. The method of claim 7, characterized in that the method further includes: after the largest ratio among the ratios of the feature categories is determined, removing the account information corresponding to the largest ratio, counting again the number of accounts corresponding to the character feature of each feature category, and comparing the counts with the pre-established account quantity standard of each character feature, until the account count of every feature category is less than the corresponding account quantity standard.
9. An information monitoring device, characterized by comprising:
a receiving module, configured to determine the account information of each account to be identified received in a specific time period;
an extraction module, configured to extract the character features in each received account information item;
a statistics module, configured to count, according to the extracted character features, the number of accounts having the same character feature in the specific time period;
a judgment module, configured to judge whether each account information item is malicious information according to a pre-established account quantity standard for each character feature and the counted number of accounts having the same character feature in the specific time period.
10. The device of claim 9, characterized in that the account information includes an account name;
the specific time period includes a time length divided by a preset unit of time.
11. The device of claim 10, characterized in that the extraction module is specifically configured to determine the account name in each account information item and to extract character features from each account name according to at least one preset feature extraction method.
12. The device of claim 9, characterized in that the statistics module is specifically configured to group identical character features among the character features extracted in the specific time period, and to count the number of accounts corresponding to the character feature of each feature category.
13. The device of claim 12, characterized in that the device further includes:
a pre-establishing module, specifically configured to: group in advance, according to the character features in the historical account information received in multiple historical time periods, the identical character features within each historical time period, where each historical time period has the same time length as the specific time period; for each feature category, count the number of accounts corresponding to the historical character feature of that category in each historical time period; determine, from the counted numbers, the mean and standard deviation of the account count of that category over all historical time periods; and determine the account quantity standard of that category according to the mean and standard deviation of its account count.
14. The device of claim 13, characterized in that the judgment module is specifically configured to judge, for each feature category, whether the number of accounts corresponding to the character feature of that category in the specific time period is greater than the account quantity standard of that category; if so, the account information received in the specific time period that has the character feature of that category is malicious information; otherwise, that account information is normal information.
15. The device of claim 14, characterized in that the judgment module is specifically configured to determine, for each feature category, the difference between the account count of that category in the specific time period and the mean account count of that category, to determine the ratio of that difference to the standard deviation of the account count of that category, and to identify, among the ratios so determined, the feature category with the largest ratio.
16. The device of claim 15, characterized in that the device further includes:
a processing module, specifically configured to, after the largest ratio among the ratios of the feature categories is determined, remove the account information corresponding to the largest ratio, count again the number of accounts corresponding to the character feature of each feature category, and compare the counts with the pre-established account quantity standard of each character feature, until the account count of every feature category is less than the corresponding account quantity standard.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510518846.9A CN106469179A (en) | 2015-08-21 | 2015-08-21 | A kind of information monitoring method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106469179A true CN106469179A (en) | 2017-03-01 |
Family
ID=58229738
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510518846.9A Pending CN106469179A (en) | 2015-08-21 | 2015-08-21 | A kind of information monitoring method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106469179A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102081774A (en) * | 2009-11-26 | 2011-06-01 | 中国移动通信集团广东有限公司 | Card-raising identification method and system |
CN102402517A (en) * | 2010-09-09 | 2012-04-04 | 北京启明星辰信息技术股份有限公司 | Method and system for establishing normal database login model and method and system for detecting abnormal login behavior |
CN103377319A (en) * | 2012-04-13 | 2013-10-30 | 索尼公司 | System and method used for detecting users of piracy |
CN103905532A (en) * | 2014-03-13 | 2014-07-02 | 微梦创科网络科技(中国)有限公司 | Microblog marketing account recognition method and system |
CN104572765A (en) * | 2013-10-25 | 2015-04-29 | 西安群丰电子信息科技有限公司 | Method and system for finding vest account based on behavior analysis of user account |
CN104715007A (en) * | 2014-12-26 | 2015-06-17 | 小米科技有限责任公司 | User identification method and device |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110825924A (en) * | 2019-11-01 | 2020-02-21 | 深圳市前海随手数据服务有限公司 | Data detection method, device and storage medium |
CN110825924B (en) * | 2019-11-01 | 2022-12-06 | 深圳市卡牛科技有限公司 | Data detection method, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20170301 |