CN103888921A - Short message intelligent deleting module - Google Patents

Short message intelligent deleting module

Info

Publication number
CN103888921A
Authority
CN
China
Prior art keywords
short message
note
word
extraction
dictionary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310433559.9A
Other languages
Chinese (zh)
Inventor
牛晓芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Siboke Technology Development Co Ltd
Original Assignee
Tianjin Siboke Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Siboke Technology Development Co Ltd filed Critical Tianjin Siboke Technology Development Co Ltd
Priority to CN201310433559.9A priority Critical patent/CN103888921A/en
Publication of CN103888921A publication Critical patent/CN103888921A/en
Pending legal-status Critical Current

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses an intelligent short message deleting module. Its aim is to extract key information from the content of a short message, judge whether the message is spam, and process the message according to the result. The module comprises four steps: feature word extraction, keyword marking, message content judgment, and message processing. Message processing takes one of two forms: retaining the message or deleting it. Feature word extraction requires a dictionary to be built, and a good dictionary underpins the accuracy of Chinese word segmentation.

Description

An intelligent short message deleting module
Technical field
The present invention relates to the field of intelligent short message deletion, and more specifically to a module that extracts key information from the content of a short message, judges whether the message is spam, and processes the message according to the result.
 
Background technology
The Short Message Service (SMS) lets users send or receive text or numeric messages directly from a mobile phone or other telecommunication terminal. A single message is limited to 160 English or numeric characters, or 70 Chinese characters. In 1992 the world's first short message was sent from a PC to a mobile phone over Vodafone's GSM network in the United Kingdom, marking the birth of SMS. Nobody expected that this service, which telecom operators introduced as a cheap text alternative to expensive mobile calls and which seemed a trivial matter, would go on to have such a large impact on people's economic and cultural life, and even on politics, for many years afterward.
When and where China's first short message was sent is unknown, but according to research China's mobile network supported SMS as early as 1994, although the people who owned mobile phones at the time had no need for it. As mobile phones became widespread, China Mobile and China Unicom successively expanded their short message services on a large scale from 1998: in 2000 China's mobile short message volume exceeded 1 billion; in 2001 it reached 18.9 billion; in 2004 it soared to 90 billion. SMS thus naturally became the "fifth medium", and the expression "SMS life" was coined. In the seven years since 1998, whether we wanted it or not, SMS has gradually entered our lives and become part of them, and our lives are changing because of it.
According to the latest statistics published by the Ministry of Industry and Information Technology (MIIT), national SMS volume in 2012 reached 897.31 billion messages, a year-on-year increase of only 2.1%, the smallest increase in four years. Comparing the two figures shows that growth in China's mobile phone users far outpaced growth in SMS traffic; average SMS volume per user in China actually fell by about 9% in 2012.
MIIT data show that in 2012 China had 1.1 billion mobile phone users, of whom 760 million were SMS users, a penetration rate of 68.8%, and about 420 million were mobile internet users, a penetration rate of 38.2%. Meanwhile, among China's 1.1 billion mobile phone users, Tencent's WeChat had more than 300 million users, a penetration rate of 27.3%. More than 7 of every 10 mobile internet users in China use WeChat.
The telecommunications research and advisory firm Ovum had earlier reported that, as large numbers of smartphone users switched to free messaging applications, telecom operators would lose 23 billion US dollars in SMS revenue by the end of 2012.
As information has proliferated, messages of every kind swirl like heavy snow in the twelfth lunar month, and the mobile phone is no longer the pure, quiet place it once was. Whether in traditional SMS or in social applications such as WeChat and Momo, spam messages catch users off guard.
What is a spam message? A spam message is any message the user did not request that contains illegal or unwanted content such as advertising, fraud, pornography, or abuse, or the same content sent repeatedly within a short time. Any message that interferes with the user's normal use of the phone, work, or life is a spam message.
Intelligent interception and deletion of short messages have therefore become a research focus for the communications industry and its experts.
 
Summary of the invention
The invention discloses an intelligent short message deleting module. Its object is to extract key information from the content of a short message, judge whether the message is spam, and process the message according to the result.
The invention adopts the following technical scheme: an intelligent short message deleting module comprising four steps: feature word extraction, keyword marking, message content judgment, and message processing. Message processing takes one of two forms: retaining the message or deleting it.
The invention further comprises the following technical schemes:
Feature word extraction involves Chinese text information extraction and automatic Chinese word segmentation; the invention uses the forward maximum matching algorithm to extract keywords.
Feature word extraction requires a dictionary to be built; a good dictionary underpins the accuracy of Chinese word segmentation.
Forward maximum matching algorithm: scan the text to be segmented from left to right, matching runs of consecutive characters against the word list; when a run matches, a word is split off. There is one catch: to achieve a maximum match, the text cannot simply be cut at the first match. An example follows (indices start at 1):
Text to be segmented: content[] = { "中", "华", "民", "族", "从", "此", "站", "起", "来", "了", "。" }
Word list: dict[] = { "中华", "中华民族", "从此", "站起来" } (roughly "China", "the Chinese nation", "from now on", "stand up")
(1) Starting from content[1], on reaching content[2] we find that "中华" is in the word list dict[]. It cannot be cut yet, because we do not know whether the following characters form a longer word (maximum match).
(2) Continuing to content[3], we find that "中华民" is not a word in dict[]. We still cannot conclude that the "中华" found earlier is the longest word, because "中华民" is a prefix of dict[2].
(3) On reaching content[4], we find that "中华民族" is a word in dict[]. Scanning continues.
(4) On reaching content[5], we find that "中华民族从" is neither a word in the word list nor a prefix of any word. The longest word found so far, "中华民族", can therefore be split off.
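The matching rule walked through above can be sketched in Java as follows. This is a minimal illustration, not the patent's own program: the class name ForwardMaxMatch and the method segment are hypothetical, the word list is assumed to be a plain in-memory set, and the sketch uses a common variant that tries the longest candidate first and shrinks it, which produces the same longest-match cut as the character-by-character scan described above. Characters that start no dictionary word are emitted one at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative forward maximum matching; all names here are hypothetical. */
final class ForwardMaxMatch {

    /** Splits text left to right, taking the longest dictionary word at each position. */
    static List<String> segment(String text, Set<String> dict) {
        // The longest dictionary word bounds how far ahead a candidate can reach.
        int maxLen = 1;
        for (String word : dict) {
            maxLen = Math.max(maxLen, word.length());
        }

        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + maxLen);
            // Shrink the candidate until it is a dictionary word or a single character.
            while (end > i + 1 && !dict.contains(text.substring(i, end))) {
                end--;
            }
            tokens.add(text.substring(i, end));
            i = end;
        }
        return tokens;
    }
}
```

On the example above, such a routine would cut the four-character word for "Chinese nation" first, rather than stopping at the two-character word for "China".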
The advantages and beneficial effects of the invention are as follows:
1. The invention improves the deletion of spam messages. As spam messages are processed, the spam dictionary is refined automatically, so the accuracy of spam judgment keeps increasing, which is vital to the robustness of the module.
2. Specific messages, such as those from certain people or organizations, can be deleted on a schedule, which reduces the storage space taken up by messages on the phone.
3. Information lookup also becomes more efficient, since the user no longer has to search again and again through a vast number of messages for a particular one.
 
Brief description of the drawings
Fig. 1 is a schematic diagram of the execution steps of the invention;
Fig. 2 is a flow chart of automatic dictionary building.
Embodiment
The implementation of the invention is further described below with reference to Fig. 1:
An intelligent short message deleting module comprises four steps: feature word extraction, keyword marking, message content judgment, and message processing. Message processing takes one of two forms: retaining the message or deleting it.
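The four steps can be sketched as a small Java class. The description does not state how the judgment is made, so the rule used here (a message is spam when it contains at least a given number of dictionary words) is purely an assumption for illustration, as are the names SpamSmsFilter, Action, extractKeywords, isSpam, process, and threshold.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative four-step pipeline; the judgment rule and all names are assumptions. */
final class SpamSmsFilter {
    enum Action { RETAIN, DELETE }

    private final Set<String> spamWords;
    private final int maxWordLength;
    private final int threshold;

    SpamSmsFilter(Set<String> spamWords, int threshold) {
        this.spamWords = spamWords;
        this.threshold = threshold;
        int longest = 0;
        for (String word : spamWords) {
            longest = Math.max(longest, word.length());
        }
        this.maxWordLength = longest;
    }

    /** Steps 1 and 2: extract and mark the spam feature words, longest match first. */
    List<String> extractKeywords(String message) {
        List<String> hits = new ArrayList<>();
        int i = 0;
        while (i < message.length()) {
            int end = Math.min(message.length(), i + maxWordLength);
            while (end > i && !spamWords.contains(message.substring(i, end))) {
                end--;
            }
            if (end > i) {
                hits.add(message.substring(i, end));
                i = end;   // skip past the matched word
            } else {
                i++;       // no dictionary word starts here
            }
        }
        return hits;
    }

    /** Step 3: assumed rule, spam if enough feature words were found. */
    boolean isSpam(String message) {
        return extractKeywords(message).size() >= threshold;
    }

    /** Step 4: delete spam, retain everything else. */
    Action process(String message) {
        return isSpam(message) ? Action.DELETE : Action.RETAIN;
    }
}
```

A real module would likely need a more careful judgment step than a raw count; the sketch only shows how the four steps connect.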
Feature word extraction involves Chinese text information extraction and automatic Chinese word segmentation; the invention uses the forward maximum matching algorithm to extract keywords.
Forward maximum matching algorithm: scan the text to be segmented from left to right, matching runs of consecutive characters against the word list; when a run matches, a word is split off. There is one catch: to achieve a maximum match, the text cannot simply be cut at the first match. An example follows (indices start at 1):
Text to be segmented: content[] = { "中", "华", "民", "族", "从", "此", "站", "起", "来", "了", "。" }
Word list: dict[] = { "中华", "中华民族", "从此", "站起来" } (roughly "China", "the Chinese nation", "from now on", "stand up")
(1) Starting from content[1], on reaching content[2] we find that "中华" is in the word list dict[]. It cannot be cut yet, because we do not know whether the following characters form a longer word (maximum match).
(2) Continuing to content[3], we find that "中华民" is not a word in dict[]. We still cannot conclude that the "中华" found earlier is the longest word, because "中华民" is a prefix of dict[2].
(3) On reaching content[4], we find that "中华民族" is a word in dict[]. Scanning continues.
(4) On reaching content[5], we find that "中华民族从" is neither a word in the word list nor a prefix of any word. The longest word found so far, "中华民族", can therefore be split off.
The building of the dictionary is further described below with reference to Fig. 2:
A good dictionary underpins the accuracy of Chinese word segmentation. To build one, first collect currently common spam messages and manually extract their feature words, building a dictionary that is reasonably complete; common spam feature words include "invoice issuing", "certificate handling", and "SIM card cloning". Second, organize how these words are stored: phrases are stored in English alphabetical order, and the words under each alphabetic index are stored from longest to shortest, which helps maximum matching.
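The storage arrangement described above can be sketched as follows. The description does not say how a Chinese word maps to an alphabetic index (a romanized initial seems likely), so this sketch assumes the simplest stand-in: the word's first character is the index key. The class name IndexedDictionary and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative indexed word store; the index key and all names are assumptions. */
final class IndexedDictionary {
    // TreeMap keeps the index keys in sorted order.
    private final Map<Character, List<String>> buckets = new TreeMap<>();

    /** Adds a word under its first character, keeping each bucket longest-first. */
    void add(String word) {
        if (word.isEmpty()) {
            return;
        }
        List<String> bucket = buckets.computeIfAbsent(word.charAt(0), k -> new ArrayList<>());
        if (!bucket.contains(word)) {
            bucket.add(word);
            bucket.sort((a, b) -> Integer.compare(b.length(), a.length()));
        }
    }

    /** Returns the longest stored word that text starts with at start, or null if none. */
    String longestWordAt(String text, int start) {
        if (start < 0 || start >= text.length()) {
            return null;
        }
        List<String> bucket = buckets.get(text.charAt(start));
        if (bucket == null) {
            return null;
        }
        // Buckets are longest-first, so the first hit is the maximum match.
        for (String word : bucket) {
            if (text.startsWith(word, start)) {
                return word;
            }
        }
        return null;
    }
}
```

Because each bucket is ordered from longest to shortest, a lookup can stop at its first hit, which is the benefit for maximum matching that the description mentions.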
Program implementation of the dictionary:
import java.util.HashMap;

/**
 * Trie node used to build the in-memory dictionary.
 */
public class TrieNode {
    /** Node key: one character of a Chinese word. */
    public char key = (char) 0;
    /** True if this character is the last character of a word. */
    public boolean bound = false;
    /** Links to the next nodes, keyed by the character that follows the current one in a word. */
    public HashMap<Character, TrieNode> childs = new HashMap<Character, TrieNode>();

    public TrieNode() {
    }

    public TrieNode(char k) {
        this.key = k;
    }
}
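The description gives only the node class. One plausible way to use it, sketched below, is to insert each dictionary word character by character and then walk the trie from a text position, remembering the last node marked as a word end; this mirrors the rule of continuing while the scanned characters are still a prefix of some word. The TrieDictionary class and its insert and longestMatch methods are hypothetical, and TrieNode is repeated in reduced form only so that the sketch stands alone.

```java
import java.util.HashMap;

/** Same fields as the node class above, repeated so the sketch is self-contained. */
final class TrieNode {
    char key = (char) 0;
    boolean bound = false;
    HashMap<Character, TrieNode> childs = new HashMap<>();

    TrieNode() {
    }

    TrieNode(char k) {
        this.key = k;
    }
}

/** Illustrative trie dictionary; class and method names are assumptions. */
final class TrieDictionary {
    private final TrieNode root = new TrieNode();

    /** Inserts a word, creating one node per character and marking the last one. */
    void insert(String word) {
        TrieNode node = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            TrieNode next = node.childs.get(c);
            if (next == null) {
                next = new TrieNode(c);
                node.childs.put(c, next);
            }
            node = next;
        }
        if (node != root) {
            node.bound = true;
        }
    }

    /** Length of the longest dictionary word starting at text[start], or 0 if none. */
    int longestMatch(String text, int start) {
        TrieNode node = root;
        int best = 0;
        for (int i = start; i < text.length(); i++) {
            node = node.childs.get(text.charAt(i));
            if (node == null) {
                break;                 // no longer a prefix of any word: stop scanning
            }
            if (node.bound) {
                best = i - start + 1;  // a word ends here; keep going for a longer one
            }
        }
        return best;
    }
}
```

Compared with probing a flat word set, the trie makes the prefix test explicit: scanning stops as soon as the characters read so far cannot begin any dictionary word.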
Any similar technical scheme that uses the technical scheme of the invention, or that a person skilled in the art designs under its inspiration, and that achieves the above technical effects falls within the scope of protection of the invention.

Claims (3)

1. An intelligent short message deleting module, characterized in that it comprises four steps: feature word extraction, keyword marking, message content judgment, and message processing; feature word extraction requires a dictionary to be built; and the message processing step takes one of two forms, retaining the message or deleting it.
2. The intelligent short message deleting module according to claim 1, characterized in that the feature word extraction step uses the forward maximum matching algorithm.
3. The intelligent short message deleting module according to claim 1, characterized in that the message processing step takes one of two forms, retaining the message or deleting it.
CN201310433559.9A 2013-09-21 2013-09-21 Short message intelligent deleting module Pending CN103888921A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310433559.9A CN103888921A (en) 2013-09-21 2013-09-21 Short message intelligent deleting module

Publications (1)

Publication Number Publication Date
CN103888921A true CN103888921A (en) 2014-06-25

Family

ID=50957605

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310433559.9A Pending CN103888921A (en) 2013-09-21 2013-09-21 Short message intelligent deleting module

Country Status (1)

Country Link
CN (1) CN103888921A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140625