CN102761848A - Method for determining short message intercepting key words - Google Patents
Method for determining short message intercepting key words
- Publication number
- CN102761848A
- Authority
- CN
- China
- Prior art keywords
- keyword
- short message
- interception
- probe
- mutation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a method for determining short message interception keywords. The method comprises: presetting probe keywords; collecting in real time all short messages transmitted in a mobile network and extracting those that contain the preset probe keywords; analyzing the extracted short messages using word segmentation to obtain variant keywords; and analyzing the variant keywords to determine new interception keywords, which are added to the short message interception keyword database. The keyword database is thereby updated automatically and in real time, so that spam or advertising messages are intercepted more accurately when matched against the keywords in the database.
Description
Technical field
The present invention relates to the field of communications, and in particular to a method for determining short message interception keywords.
Background art
In the prior art, keywords are added entirely by hand: staff must analyze a large volume of spam messages to identify keywords and then enter them manually into the spam message interception system. Over time a keyword may mutate into variants, and messages containing a variant can no longer be intercepted with the predefined keywords. The messages must then be analyzed manually again to extract the variant keywords. The whole process demands a large amount of labor, so its cost is high. Moreover, because variant keywords are extracted by manual analysis, they reach the system late and cannot be fully effective, and manual extraction has a high probability of missing variants.
Summary of the invention
The purpose of the present invention is to provide a method for determining short message interception keywords that solves the lag and incompleteness caused in the prior art by entering keywords manually. Preset keyword probes are used to collect and analyze a series of keywords that may have mutated, yielding accurate new keywords, and the keyword database is updated automatically in real time, so that spam or advertising messages are intercepted more accurately with these keywords.
To achieve the above purpose, the invention provides a method for determining short message interception keywords, comprising: presetting probe keywords; collecting in real time all short messages transmitted in a mobile network, and extracting the short messages that contain the preset probe keywords; analyzing the extracted short messages containing the probe keywords using word segmentation to obtain variant keywords; and analyzing the obtained variant keywords to determine new interception keywords, wherein the analysis of the variant keywords includes part-of-speech analysis and frequency-of-occurrence analysis.
Preferably, the step of collecting in real time all short messages transmitted in the mobile network specifically comprises: collecting in real time all short messages transmitted in the mobile network, and removing special characters from the message content.
Preferably, the step of analyzing the extracted short messages containing the probe keywords using word segmentation to obtain variant keywords further comprises: segmenting the extracted short messages containing the probe keywords into words; and matching the resulting words against the probe keywords to obtain the variant keywords.
After the step of analyzing the obtained variant keywords to determine new interception keywords, the method further comprises: adding the determined new interception keywords to the short message interception keyword database, for the spam message interception system to call when intercepting spam messages in real time.
Compared with the prior art, the present invention has the following beneficial effects:
By presetting keyword probes, the invention collects and analyzes a series of keywords that may have mutated, obtains accurate new keywords, and updates the keyword database automatically in real time, so that spam or advertising messages are intercepted more accurately with these keywords.
Brief description of the drawings
To describe the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort:
Fig. 1 is a flow chart of the method for determining short message interception keywords according to embodiment one of the invention;
Fig. 2 is a flow chart of the method for determining short message interception keywords according to embodiment two of the invention.
Detailed description of the embodiments
The technical solutions of the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments that a person of ordinary skill in the art obtains from the embodiments of the invention without creative effort fall within the scope of protection of the invention.
Referring to Fig. 1, which is a flow chart of the method for determining short message interception keywords according to embodiment one of the invention, the method comprises the following steps:
Step S101: preset probe keywords;
Step S102: collect in real time all short messages transmitted in the mobile network, and extract the short messages that contain the preset probe keywords. To identify variant keywords more accurately, special characters may first be removed from the content of the collected messages, and the messages containing the preset probe keywords are then extracted from the cleaned messages;
Step S103: analyze the extracted short messages containing the probe keywords using word segmentation to obtain variant keywords. Specifically, the extracted messages may be segmented into words, and the resulting words matched against the probe keywords to obtain the variant keywords;
Step S104: analyze the obtained variant keywords to determine new interception keywords, and add the determined new interception keywords to the short message interception keyword database, for the spam message interception system to call when intercepting spam messages in real time. The analysis of the variant keywords includes part-of-speech analysis and frequency-of-occurrence analysis. In this embodiment, the system uses the method to track keywords that may have mutated and update them in real time, determining suitable keywords so that spam or advertising messages are intercepted more accurately with them.
The method of the embodiments of the invention is described in detail below with reference to the flow chart of Fig. 2.
Referring to Fig. 2, which is a flow chart of the method for determining short message interception keywords according to embodiment two of the invention, the method comprises the following steps:
Step S201: preset probe keywords, such as ticket, tax, square, and mortgage.
Step S202: the platform collects in real time all short messages transmitted in the mobile network and removes special characters from the message content. Users can configure the special characters in advance, such as spaces and underscores; when analyzing a message, the short message platform removes the configured special characters from its content. For example, suppose spaces, "#", "...", "*", "&", "-" and similar characters are configured as characters to be removed. When the platform collects a message whose content is "hairdresser ticket & issued # on behalf, please - contact * 1223222", removing the special characters turns it into "hairdresser ticket issued on behalf, please contact 1223222".
Step S203: from the messages cleaned in step S202, extract those that contain a preset probe keyword. For example, the cleaned message "hairdresser ticket issued on behalf, please contact 1223222" contains the word "ticket", so it is extracted.
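As a rough illustration of steps S202 and S203, the following Python sketch removes a configured set of special characters and keeps only messages that contain a probe keyword. The names SPECIAL_CHARS, PROBE_KEYWORDS, strip_special_chars, and contains_probe are hypothetical, and the sample message is an assumed English rendering of the example. The space is left out of the character set here only because English text depends on spaces as word boundaries.

```python
# Hypothetical, user-configurable settings; the specification only says
# that the special characters and probe keywords are preset.
SPECIAL_CHARS = ["#", "...", "*", "&", "-"]
PROBE_KEYWORDS = ["ticket", "tax", "square", "mortgage"]


def strip_special_chars(text, special_chars=SPECIAL_CHARS):
    """Remove every configured special character from the message text."""
    for ch in special_chars:
        text = text.replace(ch, "")
    # Collapse the runs of whitespace left behind by the removals.
    return " ".join(text.split())


def contains_probe(text, probes=PROBE_KEYWORDS):
    """Return True if the message contains at least one probe keyword."""
    return any(probe in text for probe in probes)


raw = "hairdresser ticket & issued # on behalf, please - contact * 1223222"
cleaned = strip_special_chars(raw)

# Only messages that still contain a probe keyword go on to segmentation.
extracted = [m for m in [cleaned, "see you at noon"] if contains_probe(m)]
```

Cleaning before matching means a sender cannot hide a probe keyword simply by splitting it with filler characters.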
Step S204: segment the extracted messages containing probe keywords into words. For example, segmenting the message "hairdresser ticket issued on behalf, please contact 1223222" yields words such as "hairdresser ticket", "issued on behalf", and "contact".
Step S205: match the segmented words against the probe keywords to obtain variant keywords. For example, comparing the words "hairdresser ticket", "issued on behalf", and "contact" with the preset probe keywords (ticket, tax, square, mortgage, and so on) identifies "hairdresser ticket" as a variant keyword.
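Steps S204 and S205 could be sketched as follows. The specification does not name a particular segmenter, so this sketch substitutes forward maximum matching over a small hypothetical lexicon, applied to whitespace-separated tokens of an assumed English rendering of the example. LEXICON, MAX_WORDS, segment, and variant_keywords are illustrative names, and treating a word as a variant only when it contains a probe keyword without being identical to it is likewise an assumption.

```python
import re

# Hypothetical lexicon of known multi-word terms; a production system
# would rely on a full word segmentation dictionary or library.
LEXICON = {"hairdresser ticket", "issued on behalf"}
PROBE_KEYWORDS = ["ticket", "tax", "square", "mortgage"]
MAX_WORDS = 3  # longest lexicon entry, measured in tokens


def segment(text, lexicon=LEXICON, max_words=MAX_WORDS):
    """Forward maximum matching: take the longest lexicon entry at each position."""
    tokens = re.findall(r"\w+", text)
    words = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            # A single token is always accepted as a fallback.
            if n == 1 or candidate in lexicon:
                words.append(candidate)
                i += n
                break
    return words


def variant_keywords(words, probes=PROBE_KEYWORDS):
    """Keep words that contain a probe keyword but are not the probe itself."""
    return [w for w in words if any(p in w and w != p for p in probes)]


words = segment("hairdresser ticket issued on behalf, please contact 1223222")
variants = variant_keywords(words)
```

With this lexicon the message is split into "hairdresser ticket", "issued on behalf", "please", "contact", and "1223222", and only "hairdresser ticket" is reported as a variant of the probe keyword "ticket".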
Step S206: analyze the obtained variant keywords by part of speech, frequency of occurrence, and so on, to determine new interception keywords. Taking the result of step S205 as an example, count how often the extracted keyword "hairdresser ticket" appears across all messages; if the frequency reaches a certain (configurable) percentage, "hairdresser ticket" can be regarded as a spam message keyword. For instance, if 1,000 sent messages are analyzed and 300 of them contain "hairdresser ticket", the frequency is greater than or equal to 30%, so the word is taken as a spam message keyword.
Step S207: add the determined new interception keywords to the short message interception keyword database, for the spam message interception system to call when intercepting spam messages in real time.
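The frequency check of step S206 and the database update of step S207 can be illustrated with a minimal Python sketch. It covers only the frequency-of-occurrence analysis; part-of-speech analysis is omitted because the specification gives no details for it. The function select_new_keywords, the default threshold of 0.30, and the in-memory set keyword_db standing in for the keyword database are all assumptions made for illustration.

```python
def select_new_keywords(messages, candidates, threshold=0.30):
    """Return the candidates found in at least `threshold` of the messages."""
    total = len(messages)
    if total == 0:
        return []
    selected = []
    for keyword in candidates:
        hits = sum(1 for message in messages if keyword in message)
        if hits / total >= threshold:
            selected.append(keyword)
    return selected


# Hypothetical stand-in for the short message interception keyword database.
keyword_db = {"ticket", "tax"}

# 1,000 analyzed messages, 300 of which contain the candidate keyword.
messages = (["hairdresser ticket issued on behalf"] * 300
            + ["see you at noon"] * 700)

new_keywords = select_new_keywords(messages, ["hairdresser ticket", "noon ticket"])
keyword_db.update(new_keywords)  # step S207: make the keywords available for interception
```

Here "hairdresser ticket" appears in 300 of 1,000 messages, which meets the 30% threshold, so it is added to keyword_db, while "noon ticket" never appears and is discarded.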
By presetting keyword probes, the embodiments of the invention collect and analyze a series of keywords that may have mutated, obtain accurate new keywords, and update the keyword database automatically in real time, so that spam or advertising messages are intercepted more accurately with these keywords.
All features disclosed in this specification, and all steps of any method or process disclosed, may be combined in any manner, except for mutually exclusive features and/or steps.
Any feature disclosed in this specification (including any accompanying claims, abstract, and drawings) may, unless specifically stated otherwise, be replaced by an alternative feature that is equivalent or serves a similar purpose. That is, unless specifically stated otherwise, each feature is only one example of a series of equivalent or similar features.
The present invention is not limited to the foregoing embodiments. The invention extends to any new feature or any new combination of features disclosed in this specification, and to any new method or process step, or any new combination thereof, disclosed herein.
Claims (4)
1. A method for determining short message interception keywords, characterized in that it comprises:
presetting probe keywords;
collecting in real time all short messages transmitted in a mobile network, and extracting the short messages that contain the preset probe keywords;
analyzing the extracted short messages containing the probe keywords using word segmentation to obtain variant keywords;
analyzing the obtained variant keywords to determine new interception keywords, wherein the analysis of the variant keywords includes part-of-speech analysis and frequency-of-occurrence analysis.
2. The method of claim 1, characterized in that the step of collecting in real time all short messages transmitted in the mobile network specifically comprises:
collecting in real time all short messages transmitted in the mobile network, and removing special characters from the message content.
3. The method of claim 2, characterized in that the step of analyzing the extracted short messages containing the probe keywords using word segmentation to obtain variant keywords further comprises:
segmenting the extracted short messages containing the probe keywords into words;
matching the resulting words against the probe keywords to obtain the variant keywords.
4. The method of claim 3, characterized in that, after the step of analyzing the obtained variant keywords to determine new interception keywords, the method further comprises:
adding the determined new interception keywords to the short message interception keyword database, for the spam message interception system to call when intercepting spam messages in real time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210270843.4A CN102761848B (en) | 2012-08-01 | 2012-08-01 | Method for determining short message intercepting key words |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210270843.4A CN102761848B (en) | 2012-08-01 | 2012-08-01 | Method for determining short message intercepting key words |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102761848A true CN102761848A (en) | 2012-10-31 |
CN102761848B CN102761848B (en) | 2015-05-06 |
Family
ID=47056138
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210270843.4A Active CN102761848B (en) | 2012-08-01 | 2012-08-01 | Method for determining short message intercepting key words |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102761848B (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103605690A (en) * | 2013-11-04 | 2014-02-26 | 北京奇虎科技有限公司 | Device and method for recognizing advertising messages in instant messaging |
CN103605692A (en) * | 2013-11-04 | 2014-02-26 | 北京奇虎科技有限公司 | Device and method used for shielding advertisement contents in ask-and-answer community |
CN103888921A (en) * | 2013-09-21 | 2014-06-25 | 天津思博科科技发展有限公司 | Short message intelligent deleting module |
CN104765784A (en) * | 2015-03-20 | 2015-07-08 | 新浪网技术(中国)有限公司 | Key words list maintenance method and system |
CN104915333A (en) * | 2014-03-10 | 2015-09-16 | 中国移动通信集团设计院有限公司 | Method and device for generating keyword combined strategy |
CN105426405A (en) * | 2015-10-29 | 2016-03-23 | 维沃移动通信有限公司 | Information processing method and mobile terminal |
CN106899947A (en) * | 2015-12-21 | 2017-06-27 | 北京奇虎科技有限公司 | Short message method for cleaning and device |
CN111092803A (en) * | 2018-10-23 | 2020-05-01 | 阿里巴巴集团控股有限公司 | Message processing method, device, system and storage medium |
CN111090787A (en) * | 2018-10-23 | 2020-05-01 | 阿里巴巴集团控股有限公司 | Message processing method, device, system and storage medium |
CN113919337A (en) * | 2021-11-02 | 2022-01-11 | 湖南快乐阳光互动娱乐传媒有限公司 | Short message interception method and device, storage medium and electronic equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101137087A (en) * | 2007-08-01 | 2008-03-05 | 浙江大学 | Short message monitoring center and monitoring method |
CN101150762A (en) * | 2007-11-06 | 2008-03-26 | 中国移动通信集团江苏有限公司 | A spam real time interception method and system |
CN101304589A (en) * | 2008-04-14 | 2008-11-12 | 中国联合通信有限公司 | Method and system for monitoring and filtering garbage short message transmitted by short message gateway |
CN101472244A (en) * | 2007-12-29 | 2009-07-01 | 上海粱江通信系统有限公司 | Rubbish short message interception system implemented in signaling link layer |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101137087A (en) * | 2007-08-01 | 2008-03-05 | 浙江大学 | Short message monitoring center and monitoring method |
CN101150762A (en) * | 2007-11-06 | 2008-03-26 | 中国移动通信集团江苏有限公司 | A spam real time interception method and system |
CN101472244A (en) * | 2007-12-29 | 2009-07-01 | 上海粱江通信系统有限公司 | Rubbish short message interception system implemented in signaling link layer |
CN101304589A (en) * | 2008-04-14 | 2008-11-12 | 中国联合通信有限公司 | Method and system for monitoring and filtering garbage short message transmitted by short message gateway |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103888921A (en) * | 2013-09-21 | 2014-06-25 | 天津思博科科技发展有限公司 | Short message intelligent deleting module |
CN103605690A (en) * | 2013-11-04 | 2014-02-26 | 北京奇虎科技有限公司 | Device and method for recognizing advertising messages in instant messaging |
CN103605692A (en) * | 2013-11-04 | 2014-02-26 | 北京奇虎科技有限公司 | Device and method used for shielding advertisement contents in ask-and-answer community |
CN104915333A (en) * | 2014-03-10 | 2015-09-16 | 中国移动通信集团设计院有限公司 | Method and device for generating keyword combined strategy |
CN104915333B (en) * | 2014-03-10 | 2017-11-28 | 中国移动通信集团设计院有限公司 | A kind of method and device for generating key combination strategy |
CN104765784A (en) * | 2015-03-20 | 2015-07-08 | 新浪网技术(中国)有限公司 | Key words list maintenance method and system |
CN105426405A (en) * | 2015-10-29 | 2016-03-23 | 维沃移动通信有限公司 | Information processing method and mobile terminal |
CN105426405B (en) * | 2015-10-29 | 2019-05-17 | 维沃移动通信有限公司 | Information processing method and mobile terminal |
CN106899947A (en) * | 2015-12-21 | 2017-06-27 | 北京奇虎科技有限公司 | Short message method for cleaning and device |
CN111092803A (en) * | 2018-10-23 | 2020-05-01 | 阿里巴巴集团控股有限公司 | Message processing method, device, system and storage medium |
CN111090787A (en) * | 2018-10-23 | 2020-05-01 | 阿里巴巴集团控股有限公司 | Message processing method, device, system and storage medium |
CN113919337A (en) * | 2021-11-02 | 2022-01-11 | 湖南快乐阳光互动娱乐传媒有限公司 | Short message interception method and device, storage medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN102761848B (en) | 2015-05-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102761848A (en) | Method for determining short message intercepting key words | |
CN101784022A (en) | Method and system for filtering and classifying short messages | |
CN104794125B (en) | A kind of recognition methods of refuse messages and device | |
CN106328124A (en) | Voice recognition method based on user behavior characteristics | |
CN106934068A (en) | The method that robot is based on the semantic understanding of environmental context | |
CN104317787A (en) | Instant communication terminal and information translation method and device thereof | |
CN102761872A (en) | Spam message intercepting method | |
CN110880142B (en) | Risk entity acquisition method and device | |
CN105812554A (en) | Method and system for intelligently managing text messages in mobile phones | |
CN110213152B (en) | Method, device, server and storage medium for identifying junk mails | |
CN106897290B (en) | Method and device for establishing keyword model | |
CN105469789A (en) | Voice information processing method and voice information processing terminal | |
CN109634994A (en) | A kind of the matching method for pushing and computer equipment and storage medium of resume and position | |
US8775534B2 (en) | Method and system for e-mail enhancement | |
CN103108290A (en) | Short message handling method and device | |
CN104714938A (en) | Message processing method and electronic device | |
WO2016058390A1 (en) | Method and device for blocking spam short messages | |
CN105589845A (en) | Junk text recognizing method, device and system | |
CN104602274A (en) | Method and system for dynamic identification on terminal brand and terminal type | |
CN102236639A (en) | System and method for updating language model | |
CN105912725A (en) | System for calling vast intelligence applications through natural language interaction | |
CN103778226A (en) | Method for establishing language information recognition model and language information recognition device | |
CN104284306A (en) | Junk message filter method and system, mobile terminal and cloud server | |
CN103279483B (en) | A kind of topic Epidemic Scope appraisal procedure towards micro-blog and system | |
CN104765784A (en) | Key words list maintenance method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C41 | Transfer of patent application or patent right or utility model | ||
TR01 | Transfer of patent right |
Effective date of registration: 20160308 Address after: 1, No. three, 2, 4, 610041 Garden Road, Chengdu hi tech Zone, Sichuan, China Patentee after: CHENGDU SEFON SOFTWARE CO., LTD. Address before: High tech Zone Gaopeng road in Chengdu city of Sichuan province 610041 No. 11 block 22ABC Patentee before: Chengdu Sifang Technologies Co., Ltd. |