CN101686444A - System and method for detecting spam SMS sender number in real time - Google Patents


Publication number
CN101686444A
Authority
CN
China
Prior art keywords
rule
center node
spam sms
real
sms sender
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN200810168774A
Other languages
Chinese (zh)
Other versions
CN101686444B (en
Inventor
王晨
李洁
陆薇
田启明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Priority to CN 200810168774 priority Critical patent/CN101686444B/en
Publication of CN101686444A publication Critical patent/CN101686444A/en
Application granted granted Critical
Publication of CN101686444B publication Critical patent/CN101686444B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a system and a method for detecting spam SMS sender numbers in real time. The system comprises an event handling engine and a graph analysis engine. The event handling engine acquires an SMS event stream in real time, performs rule matching according to predetermined time parameters and their rules, and extracts potential spam SMS sender numbers. The graph analysis engine receives the numbers extracted by the event handling engine, acquires the social network data of each number, and performs spatial behavior pattern analysis in combination with at least one predetermined spatial feature and its parameters and rules, so as to determine whether the number is a real spam SMS sender number. In this way, real spam SMS senders can be detected accurately and in real time.

Description

Spam SMS sender number real-time detecting system and method
Technical field
The present application relates to a system and method for detecting spam SMS sender numbers in real time.
Background technology
Spam SMS (spam short message) is defined as bulk short messages with illegal or advertising content that a particular organization or individual sends to telecom terminal users whose mobile numbers lie in a fixed number range or were obtained improperly. Spam SMS disrupts the daily life of ordinary mobile phone users. Without an effective solution to this problem, that is, an effective method for detecting spam SMS sender numbers, every mobile phone becomes a trash bin from which the end user can hardly obtain any useful information. Another serious problem is that spam SMS senders hold the personal information of a large number of mobile phone users, which raises privacy and security concerns for customers. Furthermore, spam SMS senders constantly try to abuse tariff packages that offer very cheap telecom service prices, which seriously affects the normal revenue of telecom operators.
Telecom operators have tried to analyze the characteristics of spam SMS, including the following: spam SMS sender numbers exist only for a short time (less than 3 months on average), and their rate of defaulting subscribers is relatively high (above 41%). At the same time, a misjudgment, that is, wrongly identifying a normal user as a spam SMS sender, leads to more serious customer complaints. Detection of spam SMS sender numbers therefore needs to be both real-time and accurate.
Operators have adopted several solutions for detecting spam SMS sender numbers, such as monitoring in real time the number of messages a number sends, or monitoring message content. All of these solutions have limitations. China Mobile has one solution: set a threshold on the number of messages sent within a 1-minute interval; if a sender sends more messages than the threshold within 1 minute, the sender is assumed to be a spam SMS sender and the number is locked. This solution can identify certain types of real spam SMS senders, but on special occasions such as the Chinese Spring Festival, ordinary users may use SMS to send greetings to all their friends. A simple threshold-based detection rule can mistake such ordinary users for spam SMS senders and lock their numbers, causing economic loss to the operator. In addition, more and more real spam SMS senders tend to use automatic message-sending systems, such as SMS modems, to imitate the normal sending behavior of ordinary users and avoid detection. Monitoring based on message content requires more matching rules. On the one hand, spam SMS senders can evade it by obfuscating content or adding special characters, for example writing "invoice" as "a * ticket" or "fa ticket". On the other hand, because text retrieval and matching are time-consuming, this approach is inefficient for large volumes of data.
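The per-minute threshold rule described above can be illustrated with a minimal sketch. The function name `flag_by_threshold`, the `(sender, timestamp)` event layout, and the use of calendar-minute buckets are assumptions made for illustration only; the source does not describe how the operator implements the rule.

```python
from collections import defaultdict

def flag_by_threshold(events, threshold):
    """Return the senders that exceed `threshold` messages in any one-minute bucket.

    events: iterable of (sender, timestamp_in_seconds) pairs (hypothetical layout).
    """
    counts = defaultdict(int)  # (sender, minute index) -> messages seen
    flagged = set()
    for sender, timestamp in events:
        bucket = (sender, int(timestamp // 60))
        counts[bucket] += 1
        if counts[bucket] > threshold:
            flagged.add(sender)
    return flagged

# Sender "A" sends 10 messages within the first minute; "B" sends 2 messages far apart.
events = [("A", t) for t in range(0, 50, 5)] + [("B", 10), ("B", 200)]
print(flag_by_threshold(events, threshold=5))
```

Note that an ordinary user who sends many holiday greetings within one minute would be flagged in exactly the same way as "A", which is the weakness the paragraph describes.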
The business intelligence systems of many operators have built data-mining-based techniques for detecting service provider fraud and potential defaulting subscribers with abnormal bills. All of these techniques run in offline systems, and periodic data collection and analysis cannot guarantee real-time detection.
Summary of the invention
In one embodiment of the invention, a system for detecting spam SMS sender numbers in real time is provided, comprising: an event handling engine for acquiring an SMS event stream in real time, performing rule matching according to predetermined time parameters and their rules, and extracting potential spam SMS sender numbers; and a graph analysis engine for receiving the numbers extracted by the event handling engine, then acquiring the social network data of each number and performing spatial behavior pattern analysis in combination with at least one predetermined spatial feature and its parameters and rules, so as to determine whether the number is a real spam SMS sender number.
In another embodiment of the invention, a method for detecting spam SMS sender numbers in real time is provided, comprising: acquiring an SMS event stream in real time, performing rule matching according to predetermined time parameters and their rules, and extracting potential spam SMS sender numbers; and acquiring the social network data of each extracted number and performing spatial behavior pattern analysis in combination with at least one predetermined spatial feature and its parameters and rules, so as to determine whether the number is a real spam SMS sender number.
According to an embodiment of the invention, the following advantages can be obtained:
1. Pattern recognition based on data mining can supplement the operator's knowledge, because current spam SMS sender number detection systems can only rely on the operator's experience to establish simple decision rules.
2. Online, time-based detection of potential spam SMS sender numbers on the event stream allows existing stream data processing systems to be used to provide an effective detection solution.
3. The space-based user social network graph further helps to filter potential spam SMS senders accurately and identify the real ones. This meets the accuracy requirement of telecom operators, improving their revenue while avoiding potential economic loss.
4. Real spam SMS senders are detected in real time and accurately.
Description of drawings
Fig. 1 illustrates the system architecture according to an embodiment of the invention.
Fig. 2A and Fig. 2B illustrate one-degree subgraphs.
Fig. 3A and Fig. 3B illustrate two-degree subgraphs.
Fig. 4 A and Fig. 4 B illustrate the flow chart according to the method for an embodiment of the invention.
Embodiment
A main idea of the invention is to provide a new solution that detects real spam SMS sender numbers effectively and accurately. In one embodiment of the invention, to achieve this, a layered architecture is designed that integrates monitoring and analysis of the temporal and spatial features of user behavior. It can be divided into an offline part and an online part.
Spam SMS is usually sent automatically by machine, and over a relatively long period its sending pattern does follow certain statistical regularities. By recording, for each SMS user at each point in time, the sending number, the recipients, and the sending interval and frequency as events (that is, SMS summaries), sending behavior can be described as an event sequence. Using the operator's experience and data mining techniques, rules can be defined and deployed in the SMS center to perform rule matching on the event sequence and thereby find potential spam SMS sender numbers. Real-time detection of spam SMS sender numbers in this way is therefore useful. There is one problem, however: some normal senders may show the same behavior at certain points in time. For example, an event organizer may send a notification to many users at once, which resembles spam SMS behavior. Further verification is therefore important to avoid complaints from normal users.
The online part, that is, the online detection subsystem for monitoring and analyzing spam SMS sender numbers, can be further divided into two sublayers: a low-level temporal analysis engine and a high-level graph analysis engine. The low-level temporal analysis engine is a time-based user behavior pattern analyzer that uses the event stream retrieved from SMS center records to find potential spam SMS sender numbers in real time. The high-level graph analysis engine is a space-based user social network pattern analyzer that uses network graphs to further filter the potential spam SMS sender numbers from the low level, so as to detect real spam SMS sender numbers accurately.
The offline part, that is, the offline mining subsystem, concentrates on historical data analysis. Using data mining techniques, it can mine spam SMS sender behavior patterns from long-term statistical records and help the operator identify and confirm specific patterns, assisting in establishing the decision rules of the high-level online detection subsystem.
Embodiments of the invention are described in detail below with reference to the drawings.
Fig. 1 illustrates the system architecture according to an embodiment of the invention, together with the environment in which the embodiment is implemented. The system of this embodiment comprises online detection module 1 and offline pattern mining module 2.
Online detection module 1 comprises event handling engine 11 and graph analysis engine 12.
Event handling engine 11 performs rule matching based on the event stream. All sent SMS messages pass through event handling engine 11, which performs stream processing in real time on each passing event, that is, performs rule matching according to predetermined time parameters and rules (described in detail later). All potential spam SMS messages are captured, and their sending numbers are reported to graph analysis engine 12 for further verification. Here, event handling engine 11 analyzes the inherent regularities of spam SMS senders' temporal behavior: it performs temporal analysis, that is, temporal behavior pattern analysis, extracts the sending numbers that match the rules, and reports them to graph analysis engine 12.
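As a rough illustration of event-stream rule matching with a time parameter, the sketch below keeps a sliding window of recent events per sender. The class name `TemporalRuleMatcher`, the two limits, and the specific rule (many messages to many distinct recipients within the window) are hypothetical; the source leaves the actual time parameters and rules to the offline mining step and the operator. Events are assumed to arrive in timestamp order for each sender.

```python
from collections import defaultdict, deque

class TemporalRuleMatcher:
    """Sliding-window rule matching over an SMS event stream (illustrative only)."""

    def __init__(self, window_seconds, max_messages, max_recipients):
        self.window_seconds = window_seconds  # time parameter
        self.max_messages = max_messages      # rule: message count limit
        self.max_recipients = max_recipients  # rule: distinct recipient limit
        self._recent = defaultdict(deque)     # sender -> (timestamp, recipient) events

    def observe(self, sender, recipient, timestamp):
        """Record one event; return True if the sender now matches the rule."""
        window = self._recent[sender]
        window.append((timestamp, recipient))
        # Drop events that have fallen out of the time window.
        while timestamp - window[0][0] > self.window_seconds:
            window.popleft()
        recipients = {r for _, r in window}
        return (len(window) > self.max_messages
                and len(recipients) > self.max_recipients)

matcher = TemporalRuleMatcher(window_seconds=60, max_messages=3, max_recipients=2)
# "S" messages five different recipients; "F" messages one friend five times.
stream = [("S", f"r{i}", i) for i in range(5)] + [("F", "r0", i) for i in range(5)]
suspects = {s for s, r, t in stream if matcher.observe(s, r, t)}
print(suspects)
```

Only the numbers collected in `suspects` would be passed on for the slower graph-based verification.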
Graph analysis engine 12 performs network graph analysis, that is, spatial behavior pattern analysis of the social network of a caller or SMS sender (described in detail later). Graph analysis engine 12 analyzes the topology of the user's call or SMS service graph (the telecom social network).
Through specific telecom connection features, such as call or SMS records, the connections between mobile phone users within a time period form a directed network graph in which nodes represent users. This graph is called the telecom social network. For spam SMS sender detection, the topology of this directed network graph describes the users' spatial behavior during the time period. Statistical features can be extracted from the telecom social network and analyzed to help detect spam SMS sender behavior. Because these are spatial statistical features determined by the target recipients, spam SMS behavior is difficult to hide from them.
The analysis of the telecom social network topology can be divided into three progressively deeper steps, and all the data needed come from the operator's detailed record system. The following description uses the SMS service as an example.
In the first step, one-degree subgraph analysis is performed. As shown in Fig. 2A and Fig. 2B, a one-degree subgraph represents the contact behavior between a caller or SMS sender (the central node in the figure) and its direct contacts (all nodes directly connected to the central node). Fig. 2A shows the one-degree subgraph of a potential spam SMS sender: the central node represents the potential spam SMS sender, the highlighted edges connected to the central node represent its contact behavior with its direct contacts (the potential victims), and the arrow on each edge indicates the sending direction. In contrast, Fig. 2B shows the one-degree subgraph of a normal SMS user: the central node represents a particular normal SMS user (the sender of interest), the highlighted edges connected to the central node represent that user's contact behavior with its direct contacts, and the arrow on each edge indicates the sending direction. The one-degree subgraph thus shows the spatial relationship behavior of the current sender. Comparing Fig. 2A and Fig. 2B shows that, for most normal SMS users, the out-degree (sending) and in-degree (receiving) indicated by the arrows on the edges connected to the central node are roughly balanced, that is, sending and receiving behavior are roughly balanced. For a potential spam SMS sender, however, the out-degree is far greater than the in-degree, that is, the behavior consists mainly of mass sending, because the purpose is to send spam SMS rather than to communicate normally.
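The in-degree and out-degree comparison in the one-degree subgraph can be sketched as follows. The function name `degree_balance` and the `(sender, recipient)` record layout are assumptions for illustration; degrees are counted here as numbers of messages received and sent by the central number.

```python
def degree_balance(records, center):
    """Return (in_degree, out_degree, in/out ratio) for the central number.

    records: list of (sender, recipient) pairs observed in one time period.
    """
    out_degree = sum(1 for s, r in records if s == center)
    in_degree = sum(1 for s, r in records if r == center)
    ratio = in_degree / out_degree if out_degree else None
    return in_degree, out_degree, ratio

# A star-shaped sender: 100 messages out, 1 reply back.
spam_like = [("S", f"u{i}") for i in range(100)] + [("u0", "S")]
# A balanced user: 3 messages out, 3 messages in.
normal = [("N", "a"), ("N", "b"), ("N", "c"), ("a", "N"), ("b", "N"), ("c", "N")]

print(degree_balance(spam_like, "S"))
print(degree_balance(normal, "N"))
```

A ratio close to 1 corresponds to the balanced behavior of Fig. 2B, while a ratio close to 0 corresponds to the mass-sending pattern of Fig. 2A.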
In the second step, two-degree subgraph analysis is performed. As shown in Fig. 3A and Fig. 3B, in addition to the contact behavior between the caller or SMS sender (the central node in the figure) and its direct contacts (all nodes directly connected to the central node), a two-degree subgraph also shows the contact behavior among those direct contacts. That is, a two-degree subgraph can show the communication behavior among all direct contacts of a potential spam SMS sender. Fig. 3A shows the two-degree subgraph of a potential spam SMS sender: the central node represents the potential spam SMS sender, the highlighted edges connected to the central node represent its contact behavior with its direct contacts (the potential victims), and the arrow on each edge indicates the sending direction. In Fig. 3A there are essentially no edges between the nodes representing the direct contacts, so the highlighted edges still form a star shape. In contrast, Fig. 3B shows the two-degree subgraph of a normal SMS user: the central node represents a particular normal SMS user (the sender of interest), the highlighted edges connected to the central node represent that user's contact behavior with its direct contacts, and the arrow on each edge indicates the sending direction. In Fig. 3B there may also be directed edges between the nodes representing the direct contacts, so the highlighted edges form a mesh. Comparing Fig. 3A and Fig. 3B shows that, for a normal SMS user, there may be many connections among the direct contacts (the mesh of highlighted edges in Fig. 3B), because the direct contacts of a normal SMS user are often friends of one another. For a potential spam SMS sender, however, the SMS recipients (direct contacts) are chosen at random or taken from a target list, so connections among these recipients are rare (the highlighted edges in Fig. 3A seldom develop into a mesh and remain star-shaped).
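The star-versus-mesh distinction in the two-degree subgraph amounts to counting edges among the central node's direct contacts. The sketch below uses a hypothetical helper `contact_edges` and the same assumed `(sender, recipient)` record layout; it is an illustration of the idea, not the analysis procedure of the source.

```python
def contact_edges(records, center):
    """Return the set of directed edges between direct contacts of `center`."""
    contacts = {r for s, r in records if s == center}
    contacts |= {s for s, r in records if r == center}
    contacts.discard(center)
    return {(s, r) for s, r in records
            if s in contacts and r in contacts and s != r}

# Star shape: recipients never message each other.
star = [("S", "a"), ("S", "b"), ("S", "c")]
# Mesh shape: the user's contacts also message each other.
mesh = [("N", "a"), ("N", "b"), ("a", "N"), ("a", "b"), ("b", "a")]

print(len(contact_edges(star, "S")))
print(len(contact_edges(mesh, "N")))
```

An empty or nearly empty result corresponds to the star shape of Fig. 3A, and a larger result to the mesh of Fig. 3B.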
In the third step, three-degree subgraph analysis is performed, that is, the contact behavior in the deeper social networks of the direct contacts is considered, so that normal SMS users and spam SMS senders can be distinguished more accurately by comparing the differences in topology. Specifically, in addition to the contact behavior between the caller or SMS sender (the central node in the figures above) and its direct contacts (all nodes directly connected to the central node) and the contact behavior among those direct contacts, the three-degree subgraph also shows the deeper connections of the direct contacts. That is, it shows the two-degree graph centered on each direct contact, reflecting each contact's own social network connections beyond the direct communication among the contacts.
The analysis can be deepened further by extending the above steps to include still deeper levels of social network connections.
By analyzing the spatial behavior pattern (telecom social network) of SMS users, real spam SMS senders can be identified with relatively high accuracy, which compensates for the weakness (low accuracy) of temporal behavior pattern analysis. Spatial behavior pattern analysis requires building the SMS sender's social network graph, which is more difficult and time-consuming than event stream analysis. It is therefore not suitable for handling a large number of senders of interest at the same time.
Offline pattern mining module 2 comprises transaction manager 21 and pattern mining engine 22.
Transaction manager 21 performs data preprocessing. As shown in Fig. 1, all historical SMS service data enter transaction manager 21 from billing system 5 and business intelligence (BI) system 6 through data access layer 4 and are filtered, that is, only the relevant fields of each SMS service record are retained. The relevant fields include, but are not limited to, the following attributes: sender ID (identifier), recipient ID, and sending time. These attributes are then summarized into feature data for further learning. Each resulting data record summarizes the behavior of an SMS sender number within a certain minimum time period (for example, 10 seconds from the appearance of the first message). The fields of the resulting data include, but are not limited to: sender ID, the number of distinct recipient numbers, the total number of SMS messages sent, the sending frequency, and the variation of the frequency. Together these elements form the feature data, also called SMS sending events.
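The preprocessing described above (keep the relevant fields, then summarize each sender's behavior over a short period) might look like the following minimal sketch. The field names `sender_id`, `recipient_id`, and `sent_at`, the function name `summarize`, and the choice to express "variation of the frequency" as the variance of the gaps between consecutive messages are all assumptions for illustration.

```python
from collections import defaultdict
from statistics import pvariance

def summarize(records, window_seconds=10):
    """Summarize each sender's first `window_seconds` of activity as one event."""
    by_sender = defaultdict(list)
    for rec in records:
        # Keep only the relevant fields; any other fields are ignored.
        by_sender[rec["sender_id"]].append((rec["sent_at"], rec["recipient_id"]))

    events = []
    for sender, items in by_sender.items():
        items.sort()
        start = items[0][0]
        in_window = [(t, r) for t, r in items if t - start < window_seconds]
        gaps = [b[0] - a[0] for a, b in zip(in_window, in_window[1:])]
        events.append({
            "sender_id": sender,
            "distinct_recipients": len({r for _, r in in_window}),
            "total_sent": len(in_window),
            "frequency": len(in_window) / window_seconds,  # messages per second
            "gap_variance": pvariance(gaps) if gaps else 0.0,
        })
    return events

raw = [{"sender_id": "S", "recipient_id": f"r{i}", "sent_at": i, "cell": "x"}
       for i in range(4)]
raw.append({"sender_id": "S", "recipient_id": "r9", "sent_at": 20, "cell": "x"})
print(summarize(raw))
```

Perfectly regular gaps, as in this example, give a gap variance of zero, which is the kind of machine-like regularity the temporal analysis looks for.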
Pattern mining engine 22 performs pattern mining and produces features, parameters, and rules. It receives events from transaction manager 21 and clusters them with a suitable clustering algorithm, outputting clustered event sets. The operator can visualize the results (attributes) of the clustered events and then use this knowledge to find or verify spam SMS sender numbers and the behavior patterns, that is, rules, within a given cluster. In other words, for spatial behavior patterns, pattern mining engine 22 uses machine learning methods to perform cluster analysis and validation on all features, find an effective feature subset, and derive suitable parameters and rules. For temporal behavior patterns, such as sending frequency and sending volume, pattern mining engine 22 uses machine learning methods to analyze the data and derive suitable parameters and rules. The operator can also manually add or modify features, parameters, and rules, for example to meet particular requirements.
The features used in the spatial behavior pattern analysis (spatial features) combine features of the one-degree, two-degree, and three-degree subgraphs, and include but are not limited to the following:
● the in-degree of the central node, representing the number of SMS messages the central number receives within a given time period;
● the out-degree of the central node, representing the number of SMS messages the central number sends within a given time period;
● the in-degree/out-degree ratio of the central node, representing the ratio of the number of SMS messages the central number receives to the number it sends within a given time period;
● the proportion of bidirectional edges among all edges connected to the central node, representing the proportion of contacts that, within a given time period, both received SMS messages from the central number and sent SMS messages back to it (regardless of the order of sending and receiving);
● the average weight of all edges connected to the central node, representing the average number of SMS messages the central number sends to or receives from each direct contact (a number in direct contact with the central number during the time period) within a given time period;
● the maximum weight among all edges connected to the central node, representing the largest number of SMS messages the central number sends to or receives from any direct contact (as defined above) within a given time period;
● the variance of the weights of all edges connected to the central node, representing how widely the number of SMS messages sent to or received from each direct contact (as defined above) varies within a given time period;
● the number of edges between the nodes of all direct contacts of the central node, representing the number of sending or receiving relationships among the central number's direct contacts (as defined above) within a given time period;
● the average weight of the edges between the nodes of all direct contacts of the central node, representing the average number of SMS messages exchanged among the central number's direct contacts (as defined above) within a given time period; and
● the sum of the weights of the edges between the nodes of all direct contacts of the central node, representing the total number of SMS messages exchanged among the central number's direct contacts (as defined above) within a given time period.
The above explanation of the spatial features uses a social network built from SMS records as an example. If the telecom social network is built from voice call records, "sending an SMS message" in the explanation above can be replaced with "making a call" and "receiving an SMS message" with "receiving a call".
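Several of the edge-based spatial features listed above can be computed from the same message records. In the sketch below, the function name `center_edge_features` is hypothetical, and the weight of an edge is taken to be the number of messages exchanged in either direction between the central number and one contact, which is one possible reading of the feature definitions rather than a definition given in the source.

```python
from collections import Counter
from statistics import mean, pvariance

def center_edge_features(records, center):
    """Compute edge-based features of the central node from (sender, recipient) pairs."""
    sent = Counter(r for s, r in records if s == center and r != center)
    received = Counter(s for s, r in records if r == center and s != center)
    contacts = set(sent) | set(received)
    if not contacts:
        return None
    # Edge weight: messages exchanged with one contact, in either direction.
    weights = [sent[c] + received[c] for c in contacts]
    bidirectional = sum(1 for c in contacts if sent[c] and received[c])
    return {
        "in_degree": sum(received.values()),
        "out_degree": sum(sent.values()),
        "bidirectional_ratio": bidirectional / len(contacts),
        "avg_weight": mean(weights),
        "max_weight": max(weights),
        "weight_variance": pvariance(weights),
    }

records = [("C", "a"), ("C", "a"), ("a", "C"), ("C", "b")]
print(center_edge_features(records, "C"))
```

For a graph built from call records instead, the same code applies with each pair read as (caller, callee).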
Taking the in-degree/out-degree ratio feature as an example, cluster analysis may show that separating normal users from spam SMS senders with the ratio interval 0.001-0.01 achieves 90% accuracy. The interval 0.001-0.01 is then called the parameter, and the statement "if the in-degree/out-degree ratio falls within 0.001-0.01, the number can be judged to be a spam SMS sender with 90% probability" is called the rule. One of the above features, or a combination of them, can be selected for the spatial behavior pattern analysis as required.
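The relationship between a parameter and a rule in this example can be written down as a small function. The name `ratio_rule` is hypothetical, and returning 0.0 outside the interval is an assumption; the source only states the probability for ratios inside the interval.

```python
def ratio_rule(in_out_ratio, low=0.001, high=0.01, probability=0.9):
    """Rule for the in-degree/out-degree ratio feature.

    (low, high) is the parameter; the mapping to a probability is the rule.
    The 0.0 returned outside the interval is an illustrative assumption.
    """
    return probability if low <= in_out_ratio <= high else 0.0

print(ratio_rule(0.005))  # ratio inside the interval
print(ratio_rule(1.0))    # balanced sender, outside the interval
```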
Fig. 4A and Fig. 4B illustrate flow charts of the method according to an embodiment of the invention: Fig. 4A shows the online detection flow and Fig. 4B shows the offline pattern mining flow. The method according to an embodiment of the invention is described below with reference to Fig. 1, Fig. 4A, and Fig. 4B.
As shown in Fig. 4A, at step S42, event handling engine 11 obtains the latest communication data, that is, the SMS event stream, from SMS center 3 in real time at regular intervals. At step S44, event handling engine 11 performs event-stream-based rule matching on the SMS stream, that is, it performs temporal analysis on the SMS event stream data according to the parameters and rules determined by offline pattern mining module 2, filters out the numbers that communicate normally, extracts the suspicious numbers (potential spam SMS sender numbers), and outputs them to graph analysis engine 12. At step S46, for each suspicious number, graph analysis engine 12 obtains historical social network data through data access layer 4 from billing system 5, BI system 6, SMS center 3, and other related systems. In combination with the features, parameters, and rules obtained from offline pattern mining module 2, it performs spatial behavior pattern analysis, that is, it makes a comprehensive evaluation of the social network of each sending number of interest (suspicious number), and then outputs the spam SMS suspicion probability of each suspicious number. At step S48, whether a suspicious number is a spam SMS sender number is determined according to its spam SMS suspicion probability. The operator's business staff can make this determination manually based on the probability, or the determination can be made automatically by setting a threshold. For example, if the suspicion probability exceeds 60%, the suspicious number is automatically judged to be a spam SMS sender number.
The detailed process by which graph analysis engine 12 performs spatial behavior pattern analysis at step S46 is described further below. As noted above, this process requires a comprehensive evaluation of the social network of the sending number of interest (suspicious number), which can be implemented in several ways. For example, a rule-based weighted-sum evaluation can be used:
1. Compute the values of the relevant features above from the complete social network data of the sending number of interest;
2. For each feature, compare the computed value with the parameter and rule of that feature to obtain a probability for the feature;
3. Take the weighted sum of the probabilities of all features to obtain the final probability, that is, the spam SMS suspicion probability.
Alternatively, an evaluation based on a machine-learned classifier can be used:
1. Using offline data as training samples, train according to the selected classification method (for example, a neural network classifier) to obtain a classifier for spam SMS sender number detection;
2. Compute the values of the relevant features above from the complete social network data of the sending number of interest;
3. Feed the sequence (vector) of all computed feature values to the classifier as input for classification;
4. The output of the classifier is the final probability, that is, the spam SMS suspicion probability; it can also be a binary result of 0 (is a spam SMS sender number) or 1 (is not a spam SMS sender number).
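The rule-based weighted-sum evaluation can be sketched as follows, together with the automatic 60% threshold mentioned for step S48. The function name `spam_probability`, the feature names, the example rule outputs, and the weights are hypothetical, and dividing by the total weight so the result stays between 0 and 1 is an assumption not stated in the source.

```python
def spam_probability(feature_values, rules, weights):
    """Weighted sum of per-feature probabilities (normalized by total weight)."""
    total_weight = sum(weights[name] for name in rules)
    score = sum(weights[name] * rules[name](feature_values[name]) for name in rules)
    return score / total_weight

# Hypothetical rules: each maps a feature value to a probability.
rules = {
    "in_out_ratio": lambda v: 0.9 if 0.001 <= v <= 0.01 else 0.0,
    "contact_edges": lambda v: 0.8 if v == 0 else 0.1,
}
weights = {"in_out_ratio": 0.6, "contact_edges": 0.4}

suspicious = {"in_out_ratio": 0.005, "contact_edges": 0}
ordinary = {"in_out_ratio": 1.0, "contact_edges": 12}

THRESHOLD = 0.6  # automatic decision threshold from the example in step S48
for name, features in [("suspicious", suspicious), ("ordinary", ordinary)]:
    p = spam_probability(features, rules, weights)
    print(name, round(p, 2), p > THRESHOLD)
```

The classifier-based alternative would replace `spam_probability` with a trained model that takes the same feature vector as input.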
For the offline processing flow, as shown in Fig. 4B, at step S52 transaction manager 21 obtains historical communication data, including calls and SMS messages, through data access layer 4 from billing system 5, BI system 6, SMS center 3, and other related systems. At step S54, transaction manager 21 preprocesses the communication data according to its format, that is, it retains only the relevant fields, forms feature data, and outputs it to pattern mining engine 22. At step S56, pattern mining engine 22 performs temporal and spatial behavior pattern analysis and learning on the preprocessed feature data. Based on the analysis results, the operator decides which time parameters and rules and which spatial features, parameters, and rules to use in online detection module 1. As noted above, the operator can also manually modify the parameters and rules as required.
The application describes only specific embodiments and implementations of the invention. Based on the content described in the application, various improvements, variations, and other embodiments and implementations can be made.
For example, in addition to the system shown in Fig. 1 and the method shown in Fig. 4A and Fig. 4B according to an embodiment of the invention, another embodiment of the system according to the invention may comprise only event handling engine 11 and graph analysis engine 12, and another embodiment of the method according to the invention may comprise only the online detection flow shown in Fig. 4A.

Claims (12)

1. spam SMS sender number real-time detecting system comprises:
The event handling engine is used for obtaining in real time short message event stream, carries out rule match according to preset time parameter and rule thereof, extracts potential spam SMS sender number; And
The map analysis engine, be used to receive the described number that the event handling engine extracts, obtain the community network data of described number then, in conjunction with at least one predetermined space characteristics and parameter and rule, carry out the spatial behavior pattern analysis, so that judge the spam SMS sender number that described number is whether real.
2. spam SMS sender number real-time detecting system according to claim 1, wherein the map analysis engine carries out the spatial behavior pattern analysis and comprises:
Value according to the described space characteristics of community network data computation of described number, the value of each calculating and the parameter and the rule of this space characteristics are compared, obtain probability, then the probability of all space characteristics is asked weighted sum, obtain the suspicious probability of refuse messages corresponding to this space characteristics.
3. The system for detecting spam SMS sender numbers in real time according to claim 1, wherein the spatial behavior pattern analysis performed by the graph analysis engine comprises:
using the social network data of the number as training samples and training according to a selected classification method to obtain a classifier for spam SMS sender number detection, computing the values of the spatial features from the social network data of the number, and then inputting the sequence formed by all computed values into the classifier for discrimination to obtain a spam SMS suspicion probability.
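The claim leaves the classification method open. As one possible instance, the sketch below trains a small logistic regression by stochastic gradient descent on labeled feature vectors and returns a probability for a new vector. Logistic regression, the two scaled features, the training data, and the hyperparameters are assumptions chosen for illustration, not the method the patent prescribes.

```python
import math

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(samples, labels, lr=0.1, epochs=2000):
    """Fit weights and bias with plain stochastic gradient descent."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = _sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            err = p - y  # gradient of the log loss with respect to z
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(model, x):
    """Return the estimated probability that x belongs to a spam sender."""
    w, b = model
    return _sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))

# Hypothetical feature vectors: (scaled out-degree, bidirectional edge ratio).
X = [[0.9, 0.05], [0.8, 0.1], [1.0, 0.0],   # labeled spam senders
     [0.1, 0.6], [0.2, 0.5], [0.05, 0.8]]   # labeled normal users
y = [1, 1, 1, 0, 0, 0]
model = train_logistic(X, y)
```

Calling `predict(model, [0.95, 0.02])` yields a probability above 0.5 on this toy data, while `predict(model, [0.1, 0.7])` falls below it.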
4. The system for detecting spam SMS sender numbers in real time according to claim 1, wherein the spatial features comprise at least one of the following features of a one-degree subgraph, a two-degree subgraph, or a three-degree subgraph of the social network:
the in-degree of the central node;
the out-degree of the central node;
the ratio of in-degree to out-degree of the central node;
the proportion of bidirectional edges among all edges connected to the central node;
the average weight of all edges connected to the central node;
the maximum weight of all edges connected to the central node;
the variance of the weights of all edges connected to the central node;
the number of edges between all direct contact nodes of the central node;
the average weight of the edges between all direct contact nodes of the central node; and
the sum of the weights of the edges between all direct contact nodes of the central node.
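To make the listed features concrete, the sketch below computes them for a one-degree subgraph stored as a dictionary of weighted directed edges. The edge representation, the use of message counts as weights, the population variance, and the fallback value of 0.0 for undefined ratios and averages are assumptions; the claim does not fix any of these details.

```python
from statistics import mean, pvariance

def one_degree_features(edges, center):
    """edges: dict mapping (sender, receiver) -> weight, e.g. message count."""
    out_nbrs = {v for (u, v) in edges if u == center}
    in_nbrs = {u for (u, v) in edges if v == center}
    contacts = out_nbrs | in_nbrs
    # Weights of edges touching the central node.
    center_w = [w for (u, v), w in edges.items() if center in (u, v)]
    # Weights of edges running between two direct contacts.
    contact_w = [w for (u, v), w in edges.items()
                 if u in contacts and v in contacts]
    two_way = sum(1 for (u, v) in edges
                  if center in (u, v) and (v, u) in edges)
    return {
        "in_degree": len(in_nbrs),
        "out_degree": len(out_nbrs),
        "in_out_ratio": len(in_nbrs) / len(out_nbrs) if out_nbrs else 0.0,
        "bidirectional_ratio": two_way / len(center_w) if center_w else 0.0,
        "avg_weight": mean(center_w) if center_w else 0.0,
        "max_weight": max(center_w, default=0),
        "weight_variance": pvariance(center_w) if center_w else 0.0,
        "contact_edge_count": len(contact_w),
        "contact_avg_weight": mean(contact_w) if contact_w else 0.0,
        "contact_weight_sum": sum(contact_w),
    }

# Hypothetical graph: "S" messages three numbers, only "a" ever replies.
edges = {("S", "a"): 5, ("S", "b"): 5, ("S", "c"): 5, ("a", "S"): 1, ("a", "b"): 2}
f = one_degree_features(edges, "S")
```

A sender whose contacts rarely reply and rarely message each other, as in this toy graph, has a low bidirectional ratio and few edges among contacts, which is the kind of pattern these features are meant to expose.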
5. The system for detecting spam SMS sender numbers in real time according to claim 1, further comprising:
a transaction manager for acquiring historical communication data and preprocessing it into feature data; and
a pattern mining engine for receiving the feature data and performing temporal and spatial behavior pattern analysis and learning on the feature data, to produce the preset time parameters and rules and the predetermined spatial features and their parameters and rules.
6. The system for detecting spam SMS sender numbers in real time according to claim 5, wherein, for the spatial behavior pattern, the pattern mining engine performs cluster analysis and verification on all features to find an effective feature subset and derive suitable parameters and rules.
7. A method for detecting spam SMS sender numbers in real time, comprising:
acquiring a short message event stream in real time, performing rule matching according to preset time parameters and their rules, and extracting a potential spam SMS sender number; and
acquiring the social network data of the extracted number, and performing spatial behavior pattern analysis in combination with at least one predetermined spatial feature and its parameters and rules, so as to determine whether the number is a real spam SMS sender number.
8. The method for detecting spam SMS sender numbers in real time according to claim 7, wherein performing the spatial behavior pattern analysis comprises:
computing the value of each spatial feature from the social network data of the number, comparing each computed value with the parameters and rules of that spatial feature to obtain a probability corresponding to that spatial feature, and then taking a weighted sum of the probabilities of all spatial features to obtain a spam SMS suspicion probability.
9. The method for detecting spam SMS sender numbers in real time according to claim 7, wherein performing the spatial behavior pattern analysis comprises:
using the social network data of the number as training samples and training according to a selected classification method to obtain a classifier for spam SMS sender number detection, computing the values of the spatial features from the social network data of the number, and then inputting the sequence formed by all computed values into the classifier for discrimination to obtain a spam SMS suspicion probability.
10. The method for detecting spam SMS sender numbers in real time according to claim 7, wherein the spatial features comprise at least one of the following features of a one-degree subgraph, a two-degree subgraph, or a three-degree subgraph of the social network:
the in-degree of the central node;
the out-degree of the central node;
the ratio of in-degree to out-degree of the central node;
the proportion of bidirectional edges among all edges connected to the central node;
the average weight of all edges connected to the central node;
the maximum weight of all edges connected to the central node;
the variance of the weights of all edges connected to the central node;
the number of edges between all direct contact nodes of the central node;
the average weight of the edges between all direct contact nodes of the central node; and
the sum of the weights of the edges between all direct contact nodes of the central node.
11. The method for detecting spam SMS sender numbers in real time according to claim 7, further comprising:
acquiring historical communication data and preprocessing it into feature data; and
performing temporal and spatial behavior pattern analysis and learning on the feature data, to produce the preset time parameters and rules and the predetermined spatial features and their parameters and rules.
12. The method for detecting spam SMS sender numbers in real time according to claim 11, wherein, for the spatial behavior pattern, cluster analysis and verification are performed on all features to find an effective feature subset and derive suitable parameters and rules.
CN 200810168774 2008-09-28 2008-09-28 System and method for detecting spam SMS sender number in real time Expired - Fee Related CN101686444B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200810168774 CN101686444B (en) 2008-09-28 2008-09-28 System and method for detecting spam SMS sender number in real time

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200810168774 CN101686444B (en) 2008-09-28 2008-09-28 System and method for detecting spam SMS sender number in real time

Publications (2)

Publication Number Publication Date
CN101686444A true CN101686444A (en) 2010-03-31
CN101686444B CN101686444B (en) 2012-12-26

Family

ID=42049348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200810168774 Expired - Fee Related CN101686444B (en) 2008-09-28 2008-09-28 System and method for detecting spam SMS sender number in real time

Country Status (1)

Country Link
CN (1) CN101686444B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101909261A (en) * 2010-08-10 2010-12-08 中兴通讯股份有限公司 Method and system for monitoring spam
CN102510400A (en) * 2011-10-31 2012-06-20 百度在线网络技术(北京)有限公司 Method, apparatus and equipment used for determining user suspectableness degree
CN103686639A (en) * 2013-12-04 2014-03-26 华为技术有限公司 Message processing method, device and system
US8874649B2 (en) 2011-06-30 2014-10-28 International Business Machines Corporation Determination of a spammer through social network characterization
CN104714947A (en) * 2013-12-11 2015-06-17 深圳市腾讯计算机系统有限公司 Preset type number recognition method and device
WO2016058390A1 (en) * 2014-10-13 2016-04-21 中兴通讯股份有限公司 Method and device for blocking spam short messages
CN108391240A (en) * 2018-05-23 2018-08-10 中国联合网络通信集团有限公司 Garbage multimedia messages judgment method and device
CN108769933A (en) * 2018-05-31 2018-11-06 中国联合网络通信集团有限公司 Multimedia message recognition method and multimedia message identifying system
CN110913353A (en) * 2018-09-17 2020-03-24 阿里巴巴集团控股有限公司 Short message classification method and device
CN111124698A (en) * 2018-10-30 2020-05-08 北京奇虎科技有限公司 Communication event identification method and device, electronic equipment and readable storage medium
CN113839962A (en) * 2021-11-25 2021-12-24 阿里云计算有限公司 User attribute determination method, apparatus, storage medium, and program product

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1520214A (en) * 2003-09-02 2004-08-11 �ź㴫 Firewall system for short message and method for building up firewall
CN1256001C (en) * 2003-11-18 2006-05-10 海信集团有限公司 Method for negative receiving of short message for handset

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101909261A (en) * 2010-08-10 2010-12-08 中兴通讯股份有限公司 Method and system for monitoring spam
WO2012019386A1 (en) * 2010-08-10 2012-02-16 中兴通讯股份有限公司 Method and system for monitoring spam short messages
US8874649B2 (en) 2011-06-30 2014-10-28 International Business Machines Corporation Determination of a spammer through social network characterization
US8880604B2 (en) 2011-06-30 2014-11-04 International Business Machines Corporation Determination of a spammer through social network characterization
CN102510400A (en) * 2011-10-31 2012-06-20 百度在线网络技术(北京)有限公司 Method, apparatus and equipment used for determining user suspectableness degree
CN102510400B (en) * 2011-10-31 2015-09-30 百度在线网络技术(北京)有限公司 A kind of method of the suspectableness degree for determining user, device and equipment
CN103686639A (en) * 2013-12-04 2014-03-26 华为技术有限公司 Message processing method, device and system
CN104714947A (en) * 2013-12-11 2015-06-17 深圳市腾讯计算机系统有限公司 Preset type number recognition method and device
WO2016058390A1 (en) * 2014-10-13 2016-04-21 中兴通讯股份有限公司 Method and device for blocking spam short messages
CN108391240A (en) * 2018-05-23 2018-08-10 中国联合网络通信集团有限公司 Garbage multimedia messages judgment method and device
CN108391240B (en) * 2018-05-23 2021-08-24 中国联合网络通信集团有限公司 Junk multimedia message judgment method and device
CN108769933A (en) * 2018-05-31 2018-11-06 中国联合网络通信集团有限公司 Multimedia message recognition method and multimedia message identifying system
CN108769933B (en) * 2018-05-31 2021-06-04 中国联合网络通信集团有限公司 Multimedia message identification method and multimedia message identification system
CN110913353A (en) * 2018-09-17 2020-03-24 阿里巴巴集团控股有限公司 Short message classification method and device
CN110913353B (en) * 2018-09-17 2022-01-18 阿里巴巴集团控股有限公司 Short message classification method and device
CN111124698A (en) * 2018-10-30 2020-05-08 北京奇虎科技有限公司 Communication event identification method and device, electronic equipment and readable storage medium
CN113839962A (en) * 2021-11-25 2021-12-24 阿里云计算有限公司 User attribute determination method, apparatus, storage medium, and program product
CN113839962B (en) * 2021-11-25 2022-05-06 阿里云计算有限公司 User attribute determination method, apparatus, storage medium, and program product

Also Published As

Publication number Publication date
CN101686444B (en) 2012-12-26

Similar Documents

Publication Publication Date Title
CN101686444B (en) System and method for detecting spam SMS sender number in real time
CN109600752B (en) Deep clustering fraud detection method and device
CN106791220B (en) Method and system for preventing telephone fraud
CN106550155B (en) Swindle sample is carried out to suspicious number and screens the method and system sorted out and intercepted
CN109451182B (en) Detection method and device for fraud telephone
Becker et al. Fraud detection in telecommunications: History and lessons learned
CN108471429A (en) A kind of network attack alarm method and system
Barson et al. The detection of fraud in mobile phone networks
CN104301896A (en) Intelligent fraud short message monitor and alarm system and method
Wang et al. A behavior-based SMS antispam system
CN106970911A (en) A kind of strick precaution telecommunication fraud system and method based on big data and machine learning
CN107819747B (en) Telecommunication fraud association analysis system and method based on communication event sequence
CN110248322B (en) Fraud group partner identification system and identification method based on fraud short messages
CN101860822A (en) Method and system for monitoring spam messages
CN108881263A (en) A kind of network attack result detection method and system
CN102802133A (en) Junk information identification method, device and system
CN110337059A (en) A kind of parser, server and the network system of subscriber household relationship
CN111917574B (en) Social network topology model and construction method, user confidence and affinity calculation method and telecom fraud intelligent interception system
CN101389085B (en) Rubbish short message recognition system and method based on sending behavior
CN110493476B (en) Detection method, device, server and storage medium
CN111131627B (en) Method, device and readable medium for detecting personal harmful call based on streaming data atlas
CN101909261A (en) Method and system for monitoring spam
CN108234435A (en) A kind of automatic testing method based on IP classification
CN111105064A (en) Method and device for determining suspected information of fraud event
CN104581729A (en) Junk information processing method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121226

Termination date: 20160928