CN102761848A - Method for determining short message intercepting key words - Google Patents

Info

Publication number
CN102761848A
Authority
CN
China
Prior art keywords
keyword
short message
interception
probe
mutation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012102708434A
Other languages
Chinese (zh)
Other versions
CN102761848B (en)
Inventor
王纯斌
谢崇竹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Sefon Software Co Ltd
Original Assignee
CHENGDU SIFANG TECHNOLOGIES Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CHENGDU SIFANG TECHNOLOGIES Co Ltd
Priority to CN201210270843.4A
Publication of CN102761848A
Application granted
Publication of CN102761848B
Legal status: Active
Anticipated expiration

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for determining short message interception keywords. The method comprises: presetting probe keywords; collecting, in real time, all short messages transmitted in a mobile network and extracting those that contain the preset probe keywords; analyzing the extracted short messages using word segmentation to obtain variant keywords; and analyzing the variant keywords to determine new interception keywords, which are added to a short message interception keyword database. The keyword database is thereby updated automatically and in real time, so that spam or advertising messages are intercepted more accurately when filtering with the keywords in the database.

Description

A method for determining short message interception keywords
Technical field
The present invention relates to the field of communications, and in particular to a method for determining short message interception keywords.
Background art
In the prior art, keywords are added entirely by hand: staff must analyze a large volume of spam messages to identify keywords and then enter them manually into the spam message interception system. Over time, keywords may mutate into variants, and messages containing a variant can no longer be intercepted with the predefined keywords. The messages must then be analyzed manually again to extract the variant keywords. The whole process requires a large investment of manpower, so the labor cost is too high. Moreover, because variant keywords are extracted by manual analysis, they reach the system with a delay, the keywords cannot take full effect, and the probability of missing a variant keyword is high.
Summary of the invention
The purpose of the present invention is to provide a method for determining short message interception keywords that solves the lag and incompleteness caused in the prior art by manual keyword entry. By presetting probe keywords, the method collects and analyzes a series of keywords that may have mutated, obtains accurate new keywords, and updates the keyword database automatically in real time, so that interception accuracy is higher when these keywords are used to intercept spam or advertising messages.
To achieve the above purpose, the present invention provides a method for determining short message interception keywords, comprising: presetting probe keywords; collecting, in real time, all short messages transmitted in the mobile network, and extracting the short messages that contain the preset probe keywords; analyzing the extracted short messages containing the probe keywords using word segmentation to obtain variant keywords; and analyzing the obtained variant keywords to determine new interception keywords, wherein the analysis of the variant keywords includes part-of-speech analysis and frequency-of-occurrence analysis.
Preferably, the step of collecting, in real time, all short messages transmitted in the mobile network is specifically: collecting, in real time, all short messages transmitted in the mobile network, and removing the special characters from the message content.
Preferably, the step of analyzing the extracted short messages containing the probe keywords using word segmentation to obtain variant keywords further comprises: segmenting the extracted short messages containing the probe keywords into words; and matching the resulting words against the probe keywords to obtain the variant keywords.
After the step of analyzing the obtained variant keywords to determine the new interception keywords, the method further comprises: adding the determined new interception keywords to the short message interception keyword database, for the spam message interception system to call when intercepting spam in real time.
Compared with the prior art, the present invention has the following beneficial effects:
By presetting probe keywords, the invention collects and analyzes a series of keywords that may have mutated, obtains accurate new keywords, and updates the keyword database automatically in real time, so that interception accuracy is higher when these keywords are used to intercept spam or advertising messages.
Description of drawings
To illustrate the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort:
Fig. 1 is a flow chart of the method for determining short message interception keywords according to Embodiment 1 of the invention;
Fig. 2 is a flow chart of the method for determining short message interception keywords according to Embodiment 2 of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
Referring to Fig. 1, a flow chart of the method for determining short message interception keywords according to Embodiment 1 of the invention, the method comprises the following steps:
Step S101: Preset the probe keywords;
Step S102: Collect, in real time, all short messages transmitted in the mobile network, and extract the short messages that contain the preset probe keywords. To identify variant keywords more accurately, after the short messages are collected the special characters in the message content can first be removed, and the messages containing the preset probe keywords are then extracted from the cleaned messages;
Step S103: Analyze the extracted short messages containing the probe keywords using word segmentation to obtain the variant keywords. Specifically, this can be done by segmenting the extracted short messages into words and matching the resulting words against the probe keywords to obtain the variant keywords;
Step S104: Analyze the obtained variant keywords to determine the new interception keywords, and add the determined new interception keywords to the short message interception keyword database, for the spam message interception system to call when intercepting spam in real time. The analysis of the variant keywords includes part-of-speech analysis and frequency-of-occurrence analysis. For keywords that may have mutated, the system uses this method to update the keywords in real time and determine suitable keywords, so that spam or advertising messages are intercepted more accurately with them.
The method of the embodiment of the invention is described in detail below with reference to the flow chart in Fig. 2.
Referring to Fig. 2, a flow chart of the method for determining short message interception keywords according to Embodiment 2 of the invention, the method comprises the following steps:
Step S201: Preset the probe keywords, such as ticket, tax, square, mortgage, etc.
Step S202: The platform collects, in real time, all short messages transmitted in the mobile network and removes the special characters from the message content. Users can configure the special characters in advance, such as the space and the underscore; when the short message platform analyzes a message, it removes the configured special characters from the content. For example, suppose the space, "#", "...", "*", "&", "-" and similar characters are configured as characters to be removed. When the platform collects a short message whose content is "hairdresser ticket & on behalf of # open, please - contact * 1223222", removing the special characters turns it into "hairdresser ticket on behalf of open, please contact 1223222".
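The cleaning in step S202 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `strip_special_chars` and the default character list are assumptions, and the space is left out of the default list (although the text names it) so that the English rendering of the sample message keeps its word boundaries.

```python
# Hypothetical default list; the patent says the characters are user-configurable.
# The space is deliberately omitted here so English sample text stays readable.
DEFAULT_SPECIAL = ("#", "...", "*", "&", "-", "_")


def strip_special_chars(text, special=DEFAULT_SPECIAL):
    """Remove every configured special character (or character run) from text."""
    for token in special:
        text = text.replace(token, "")
    # Collapse the gaps left behind by the removed characters.
    return " ".join(text.split())


if __name__ == "__main__":
    raw = "hairdresser ticket & on behalf of # open, please - contact * 1223222"
    print(strip_special_chars(raw))
```

For unspaced text such as Chinese, the space could simply be added to the configured list and the final whitespace-collapsing line dropped.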
Step S203: From the short messages cleaned in step S202, extract those that contain a preset probe keyword. For example, the cleaned message "hairdresser ticket on behalf of open, please contact 1223222" contains the probe keyword "ticket", so it is extracted.
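A minimal sketch of the extraction in step S203, assuming plain substring matching against the probe keywords listed in step S201. The function names are hypothetical, and the second sample message is invented purely for illustration.

```python
# Probe keywords taken from the example in step S201.
PROBE_KEYWORDS = ("ticket", "tax", "square", "mortgage")


def contains_probe(message, probes=PROBE_KEYWORDS):
    """Return True if the message contains at least one probe keyword."""
    return any(probe in message for probe in probes)


def extract_probe_messages(messages, probes=PROBE_KEYWORDS):
    """Keep only the messages that contain a probe keyword."""
    return [m for m in messages if contains_probe(m, probes)]


if __name__ == "__main__":
    cleaned = [
        "hairdresser ticket on behalf of open, please contact 1223222",
        "see you at the station at six",  # invented non-matching message
    ]
    print(extract_probe_messages(cleaned))
```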
Step S204: Use word segmentation to segment the extracted short messages that contain probe keywords. For example, segmenting the message "hairdresser ticket on behalf of open, please contact 1223222" yields words such as "hairdresser ticket", "on behalf of open" and "contact".
Step S205: Match the words obtained by segmentation against the probe keywords to obtain the variant keywords. For example, comparing the words above ("hairdresser ticket", "on behalf of open", "contact" and so on) with the preset probe keywords (ticket, tax, square, mortgage, etc.) yields "hairdresser ticket" as a variant keyword.
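Steps S204 and S205 can be sketched as below. The patent does not name a segmentation algorithm, so this example swaps in a toy dictionary-based forward maximum matching segmenter over English words as a stand-in for a real Chinese word segmenter; the lexicon, the function names, and the rule that a variant is a token that contains a probe keyword without being identical to it are all assumptions.

```python
import re

# Hypothetical lexicon of multi-word terms, standing in for a real
# segmentation dictionary.
LEXICON = {
    ("hairdresser", "ticket"),
    ("on", "behalf", "of", "open"),
}
MAX_TERM_LEN = 4  # longest lexicon entry, in words


def segment(message, lexicon=LEXICON, max_len=MAX_TERM_LEN):
    """Forward maximum matching: at each position take the longest lexicon term."""
    words = re.findall(r"\w+", message.lower())
    tokens = []
    i = 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = tuple(words[i:i + n])
            # A single word is always accepted as a fallback token.
            if n == 1 or candidate in lexicon:
                tokens.append(" ".join(candidate))
                i += n
                break
    return tokens


def find_variants(tokens, probes):
    """Tokens that contain a probe keyword but are not the probe itself."""
    return [t for t in tokens if any(p in t and p != t for p in probes)]


if __name__ == "__main__":
    msg = "hairdresser ticket on behalf of open, please contact 1223222"
    tokens = segment(msg)
    print(tokens)
    print(find_variants(tokens, ("ticket", "tax", "square", "mortgage")))
```

Excluding tokens that equal a probe keyword reflects the idea that the probe itself is already known; only longer or altered forms count as new candidates.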
Step S206: Analyze the obtained variant keywords by part of speech, frequency of occurrence and so on to determine the new interception keywords. Taking the result of step S205 as an example, count how often the extracted keyword "hairdresser ticket" appears across the messages analyzed; if the proportion reaches a configurable percentage, "hairdresser ticket" can be treated as a spam message keyword. For instance, if 1000 messages are analyzed and 300 of them contain "hairdresser ticket", the frequency is 30%; with a threshold of 30%, the frequency is greater than or equal to the threshold, so "hairdresser ticket" is a spam message keyword;
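The frequency check in step S206 amounts to a simple proportion test. The sketch below reproduces the 300-out-of-1000 example; the function names are hypothetical, the denominator is assumed to be the set of messages analyzed, and part-of-speech analysis is not shown because the text gives no details for it.

```python
def keyword_frequency(keyword, messages):
    """Fraction of messages that contain the keyword (0.0 for an empty list)."""
    if not messages:
        return 0.0
    hits = sum(1 for m in messages if keyword in m)
    return hits / len(messages)


def is_interception_keyword(keyword, messages, threshold=0.30):
    """Accept the keyword when its frequency reaches the configurable threshold."""
    return keyword_frequency(keyword, messages) >= threshold


if __name__ == "__main__":
    # 300 messages containing the variant keyword, 700 that do not.
    sample = ["hairdresser ticket on behalf of open"] * 300 + ["hello"] * 700
    print(keyword_frequency("hairdresser ticket", sample))
    print(is_interception_keyword("hairdresser ticket", sample))
```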
Step S207: Add the determined new interception keywords to the short message interception keyword database, for the spam message interception system to call when intercepting spam in real time.
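As a rough illustration of step S207, the keyword database can be modeled as an in-memory set that the interception system queries for each incoming message. The class `KeywordDatabase` and its methods are hypothetical; a real deployment would presumably use persistent storage, which the text does not specify.

```python
class KeywordDatabase:
    """Toy in-memory stand-in for the short message interception keyword database."""

    def __init__(self, keywords=()):
        self._keywords = set(keywords)

    def add(self, keyword):
        """Add a new interception keyword; return True if it was not present."""
        if keyword in self._keywords:
            return False
        self._keywords.add(keyword)
        return True

    def should_intercept(self, message):
        """Return True if the message contains any stored interception keyword."""
        return any(k in message for k in self._keywords)


if __name__ == "__main__":
    db = KeywordDatabase()
    db.add("hairdresser ticket")
    print(db.should_intercept("hairdresser ticket on behalf of open, please contact 1223222"))
    print(db.should_intercept("see you at six"))
```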
By presetting probe keywords, the embodiment of the invention collects and analyzes a series of keywords that may have mutated, obtains accurate new keywords, and updates the keyword database automatically in real time, so that interception accuracy is higher when these keywords are used to intercept spam or advertising messages.
All features disclosed in this specification, and all steps of any method or process disclosed, may be combined in any manner, except for mutually exclusive features and/or steps.
Any feature disclosed in this specification (including any accompanying claims, abstract and drawings) may, unless specifically stated otherwise, be replaced by an alternative feature that is equivalent or serves a similar purpose. That is, unless specifically stated otherwise, each feature is only one example of a series of equivalent or similar features.
The invention is not limited to the foregoing embodiments. It extends to any new feature or any new combination of features disclosed in this specification, and to any new method or process step, or any new combination of steps, so disclosed.

Claims (4)

1. A method for determining short message interception keywords, characterized by comprising:
presetting probe keywords;
collecting, in real time, all short messages transmitted in a mobile network, and extracting the short messages that contain the preset probe keywords;
analyzing, using word segmentation, the extracted short messages containing the probe keywords to obtain variant keywords; and
analyzing the obtained variant keywords to determine new interception keywords, wherein the analysis of the obtained variant keywords includes part-of-speech analysis and frequency-of-occurrence analysis.
2. The method of claim 1, characterized in that the step of collecting, in real time, all short messages transmitted in the mobile network is specifically:
collecting, in real time, all short messages transmitted in the mobile network, and removing the special characters from the message content.
3. The method of claim 2, characterized in that the step of analyzing, using word segmentation, the extracted short messages containing the probe keywords to obtain variant keywords further comprises:
segmenting the extracted short messages containing the probe keywords into words using word segmentation; and
matching the words obtained by segmentation against the probe keywords to obtain the variant keywords.
4. The method of claim 3, characterized in that, after the step of analyzing the obtained variant keywords to determine the new interception keywords, the method further comprises:
adding the determined new interception keywords to the short message interception keyword database, for the spam message interception system to call when intercepting spam in real time.
CN201210270843.4A 2012-08-01 2012-08-01 Method for determining short message intercepting key words Active CN102761848B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210270843.4A CN102761848B (en) 2012-08-01 2012-08-01 Method for determining short message intercepting key words

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210270843.4A CN102761848B (en) 2012-08-01 2012-08-01 Method for determining short message intercepting key words

Publications (2)

Publication Number Publication Date
CN102761848A true CN102761848A (en) 2012-10-31
CN102761848B CN102761848B (en) 2015-05-06

Family

ID=47056138

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210270843.4A Active CN102761848B (en) 2012-08-01 2012-08-01 Method for determining short message intercepting key words

Country Status (1)

Country Link
CN (1) CN102761848B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103605690A (en) * 2013-11-04 2014-02-26 北京奇虎科技有限公司 Device and method for recognizing advertising messages in instant messaging
CN103605692A (en) * 2013-11-04 2014-02-26 北京奇虎科技有限公司 Device and method used for shielding advertisement contents in ask-and-answer community
CN103888921A (en) * 2013-09-21 2014-06-25 天津思博科科技发展有限公司 Short message intelligent deleting module
CN104765784A (en) * 2015-03-20 2015-07-08 新浪网技术(中国)有限公司 Key words list maintenance method and system
CN104915333A (en) * 2014-03-10 2015-09-16 中国移动通信集团设计院有限公司 Method and device for generating keyword combined strategy
CN105426405A (en) * 2015-10-29 2016-03-23 维沃移动通信有限公司 Information processing method and mobile terminal
CN106899947A (en) * 2015-12-21 2017-06-27 北京奇虎科技有限公司 Short message method for cleaning and device
CN111092803A (en) * 2018-10-23 2020-05-01 阿里巴巴集团控股有限公司 Message processing method, device, system and storage medium
CN111090787A (en) * 2018-10-23 2020-05-01 阿里巴巴集团控股有限公司 Message processing method, device, system and storage medium
CN113919337A (en) * 2021-11-02 2022-01-11 湖南快乐阳光互动娱乐传媒有限公司 Short message interception method and device, storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101137087A (en) * 2007-08-01 2008-03-05 浙江大学 Short message monitoring center and monitoring method
CN101150762A (en) * 2007-11-06 2008-03-26 中国移动通信集团江苏有限公司 A spam real time interception method and system
CN101304589A (en) * 2008-04-14 2008-11-12 中国联合通信有限公司 Method and system for monitoring and filtering garbage short message transmitted by short message gateway
CN101472244A (en) * 2007-12-29 2009-07-01 上海粱江通信系统有限公司 Rubbish short message interception system implemented in signaling link layer

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101137087A (en) * 2007-08-01 2008-03-05 浙江大学 Short message monitoring center and monitoring method
CN101150762A (en) * 2007-11-06 2008-03-26 中国移动通信集团江苏有限公司 A spam real time interception method and system
CN101472244A (en) * 2007-12-29 2009-07-01 上海粱江通信系统有限公司 Rubbish short message interception system implemented in signaling link layer
CN101304589A (en) * 2008-04-14 2008-11-12 中国联合通信有限公司 Method and system for monitoring and filtering garbage short message transmitted by short message gateway

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103888921A (en) * 2013-09-21 2014-06-25 天津思博科科技发展有限公司 Short message intelligent deleting module
CN103605690A (en) * 2013-11-04 2014-02-26 北京奇虎科技有限公司 Device and method for recognizing advertising messages in instant messaging
CN103605692A (en) * 2013-11-04 2014-02-26 北京奇虎科技有限公司 Device and method used for shielding advertisement contents in ask-and-answer community
CN104915333A (en) * 2014-03-10 2015-09-16 中国移动通信集团设计院有限公司 Method and device for generating keyword combined strategy
CN104915333B (en) * 2014-03-10 2017-11-28 中国移动通信集团设计院有限公司 A kind of method and device for generating key combination strategy
CN104765784A (en) * 2015-03-20 2015-07-08 新浪网技术(中国)有限公司 Key words list maintenance method and system
CN105426405A (en) * 2015-10-29 2016-03-23 维沃移动通信有限公司 Information processing method and mobile terminal
CN105426405B (en) * 2015-10-29 2019-05-17 维沃移动通信有限公司 Information processing method and mobile terminal
CN106899947A (en) * 2015-12-21 2017-06-27 北京奇虎科技有限公司 Short message method for cleaning and device
CN111092803A (en) * 2018-10-23 2020-05-01 阿里巴巴集团控股有限公司 Message processing method, device, system and storage medium
CN111090787A (en) * 2018-10-23 2020-05-01 阿里巴巴集团控股有限公司 Message processing method, device, system and storage medium
CN113919337A (en) * 2021-11-02 2022-01-11 湖南快乐阳光互动娱乐传媒有限公司 Short message interception method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN102761848B (en) 2015-05-06

Similar Documents

Publication Publication Date Title
CN102761848A (en) Method for determining short message intercepting key words
CN101784022A (en) Method and system for filtering and classifying short messages
CN104794125B (en) A kind of recognition methods of refuse messages and device
CN106328124A (en) Voice recognition method based on user behavior characteristics
CN106934068A (en) The method that robot is based on the semantic understanding of environmental context
CN104317787A (en) Instant communication terminal and information translation method and device thereof
CN102761872A (en) Spam message intercepting method
CN110880142B (en) Risk entity acquisition method and device
CN105812554A (en) Method and system for intelligently managing text messages in mobile phones
CN110213152B (en) Method, device, server and storage medium for identifying junk mails
CN106897290B (en) Method and device for establishing keyword model
CN105469789A (en) Voice information processing method and voice information processing terminal
CN109634994A (en) A kind of the matching method for pushing and computer equipment and storage medium of resume and position
US8775534B2 (en) Method and system for e-mail enhancement
CN103108290A (en) Short message handling method and device
CN104714938A (en) Message processing method and electronic device
WO2016058390A1 (en) Method and device for blocking spam short messages
CN105589845A (en) Junk text recognizing method, device and system
CN104602274A (en) Method and system for dynamic identification on terminal brand and terminal type
CN102236639A (en) System and method for updating language model
CN105912725A (en) System for calling vast intelligence applications through natural language interaction
CN103778226A (en) Method for establishing language information recognition model and language information recognition device
CN104284306A (en) Junk message filter method and system, mobile terminal and cloud server
CN103279483B (en) A kind of topic Epidemic Scope appraisal procedure towards micro-blog and system
CN104765784A (en) Key words list maintenance method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20160308

Address after: 1, No. three, 2, 4, 610041 Garden Road, Chengdu hi tech Zone, Sichuan, China

Patentee after: CHENGDU SEFON SOFTWARE CO., LTD.

Address before: High tech Zone Gaopeng road in Chengdu city of Sichuan province 610041 No. 11 block 22ABC

Patentee before: Chengdu Sifang Technologies Co., Ltd.