CN106156105A - Email aggregation and classification method and device - Google Patents


Publication number
CN106156105A
Authority
CN
China
Prior art keywords
mail
information
contents information
mail contents
keyword
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510155716.3A
Other languages
Chinese (zh)
Inventor
王明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201510155716.3A
Publication of CN106156105A

Abstract

Disclosed are an email aggregation and classification method and device. The device includes: a mail acquisition module, configured to obtain mail stored on a mail server; a keyword extraction module, configured to obtain the content information of the mail, perform word segmentation on the mail content information using a preset word segmentation method, and extract keywords of the mail content information; and an aggregation and classification module, configured to aggregate and classify the mail content according to the keywords of the mail content information to form a classification tree. With the present application, mail content can be conveniently classified and viewed.

Description

Email aggregation and classification method and device
Technical field
The present application relates to the field of computer technology, and in particular to an email aggregation and classification method and device.
Background
Within an enterprise, a large number of internal emails are sent every day, either between individuals or from an individual to a small team. Much of the mail content consists of discussions and summaries of a particular processing scheme or piece of technical content, and it often includes important, high-value information.
However, most mail is buried in the mailbox or forgotten once someone has read it, and has to be searched for in the mailbox the next time it is needed. In addition, much mail is sent point to point to avoid disturbing others, so when other team members run into a similar situation they have difficulty finding the solution, and the corresponding knowledge is hard to accumulate and organize.
In the prior art, personal and enterprise mailboxes lack a scheme for automatically integrating and classifying similar content. With current processing, much valuable internal mail is simply kept permanently on the mail server. Moreover, mail content is mostly visible only to a single recipient or to the recipients of a group mailing, and others can hardly obtain the corresponding content information at the same time. As a result, most existing personal and enterprise mailboxes only serve for work communication and notification, and rarely contribute to knowledge accumulation and sharing.
Summary of the invention
The main purpose of the present application is to provide an email aggregation and classification method and device, so as to overcome the problem in the prior art that mail content cannot be shared and mail cannot be classified and organized.
To solve the above problem, an embodiment of the present application provides an email aggregation and classification device, comprising: a mail acquisition module, configured to obtain mail stored on a mail server; a keyword extraction module, configured to obtain the content information of the mail, perform word segmentation on the mail content information using a preset word segmentation method, and extract keywords of the mail content information; and an aggregation and classification module, configured to aggregate and classify the mail content according to the keywords of the mail content information to form a classification tree.
The aggregation and classification module is further configured to use the keywords of the mail content information as labels of that mail content information, and to match the mail content information to preset categories according to the labels.
The device further comprises: a classification tree index service module, configured to build an index file of the mail content information according to the labels of the mail content information, and to provide a search service for the mail content information.
The device further comprises: a central content extraction module, configured to extract the central content from the received mail content information; the keyword extraction module is further configured to extract keywords from the central content of the mail content information.
The keyword extraction module is further configured to calculate the word frequency of the extracted keywords and to extract the keywords whose word frequency is greater than a preset value.
The device further comprises: a spam filtering processing module, configured to delete spam and/or duplicate mail from the mail that the mail acquisition module obtains from the mail server.
The device further comprises: a content sorting module, configured to delete duplicate content and/or mail formatting content from the mail content information.
The mail content information includes one or a combination of the following: mail body, subject, summary, sender information, recipient information.
An embodiment of the present application further provides an email aggregation and classification device, comprising: a mail acquisition module, configured to obtain mail stored on a mail server; a central content extraction module, configured to obtain the content information of the mail and extract the central content of the mail content information; and a search service module, configured to build an index file of the mail content information according to its central content, and to provide a search service for the mail content information.
The device further comprises: a spam filtering processing module, configured to delete spam and/or duplicate mail from the mail that the mail acquisition module obtains from the mail server.
The device further comprises: a content sorting module, configured to delete duplicate content and/or mail formatting content from the mail content information.
The mail content information includes one or a combination of the following: mail body, subject, summary, sender information, recipient information.
An embodiment of the present application further provides an email aggregation and classification method, comprising: obtaining mail stored on a mail server; obtaining the content information of the mail, performing word segmentation on the mail content information using a preset word segmentation method, and extracting keywords of the mail content information; and aggregating and classifying the mail content according to the keywords of the mail content information to form a classification tree.
Aggregating and classifying the mail content according to the keywords of the mail content information to form a classification tree further includes: using the keywords of the mail content information as labels of that mail content information, and matching the mail content information to preset categories according to the labels.
The method further includes: building an index file of the mail content information according to the labels of the mail content information, and providing a search service for the mail content information.
The method further includes: extracting the central content from the received mail content information; extracting the keywords of the mail content information then further includes extracting keywords from the central content of the mail content information.
Extracting the keywords of the mail content information further includes: calculating the word frequency of the extracted keywords, and extracting the keywords whose word frequency is greater than a preset value.
The method further includes: deleting spam and/or duplicate mail from the mail obtained from the mail server.
The method further includes: deleting duplicate content and/or mail formatting content from the mail content information.
The mail content information includes one or a combination of the following: mail body, subject, summary, sender information, recipient information.
An embodiment of the present application further provides an email aggregation and classification method, comprising: obtaining mail stored on a mail server; obtaining the content information of the mail and extracting the central content of the mail content information; and building an index file of the mail content information according to its central content, and providing a search service for the mail content information.
The method further includes: deleting spam and/or duplicate mail from the mail obtained from the mail server.
The method further includes: deleting duplicate content and/or mail formatting content from the mail content information.
The mail content information includes one or a combination of the following: mail body, subject, summary, sender information, recipient information.
Compared with the prior art, the technical solution of the present application analyzes mail content to extract keywords, and aggregates and classifies the mail content according to those keywords to form a classification tree, so that mail content can be conveniently classified and viewed.
Brief description of the drawings
The accompanying drawings described here are provided for further understanding of the present application and form a part of it. The illustrative embodiments of the present application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 shows a structural block diagram of an email aggregation and classification device 100 according to one embodiment of the present application;
Fig. 2 shows a structural block diagram of an email aggregation and classification device 200 according to another embodiment of the present application;
Fig. 3 shows a structural block diagram of an email aggregation and classification device 300 according to another embodiment of the present application;
Fig. 4 shows a structural block diagram of an email aggregation and classification device 400 according to another embodiment of the present application;
Fig. 5 shows a flow chart of an email aggregation and classification method according to one embodiment of the present application;
Fig. 6 shows a flow chart of an email aggregation and classification method according to another embodiment of the present application.
Detailed description
To make the purpose, technical solutions and advantages of the present application clearer, the technical solutions are described clearly and completely below with reference to specific embodiments and the corresponding drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.
An embodiment of the present application provides an email aggregation and classification device. Referring to Fig. 1, which shows a structural block diagram of an email aggregation and classification device 100 according to one embodiment of the present application, the device 100 includes:
Mail acquisition module 110, configured to obtain mail stored on a mail server, where the mail acquisition module 110 may passively receive mail forwarded from the mail server, or may actively collect mail stored on the mail server;
In one embodiment of the present application, when an important email that has been replied to and discussed by several people, or an ordinary email, needs to be shared or needs its content integrated, the specified mail is sent to the email aggregation and classification device 100, so that the mail acquisition module 110 receives it. For example, in practice, the specified mail is sent to a mailbox at a predetermined address, and this mailbox forwards the mail to the device 100 in real time or periodically; alternatively, a forwarding function is provided in the action menu of the personal or enterprise mailbox, and when the user clicks the forwarding button, the specified mail is sent to the device 100.
Keyword extraction module 120, configured to obtain the content information of the mail, perform word segmentation on the mail content information using a preset word segmentation method, and extract keywords of the mail content information;
The mail content information includes, but is not limited to, one or a combination of the following dimensions: mail body, subject, summary, sender information, recipient information. The keyword extraction module 120 performs word segmentation on the mail body, subject, summary, sender information and/or recipient information, and extracts the keywords of the mail content information.
In one embodiment of the present application, the keyword extraction module 120 is further configured to calculate the word frequency of the extracted keywords, keeping the keywords whose word frequency is greater than a preset value and discarding those whose word frequency is less than the preset value; or to directly extract the keyword with the highest word frequency; or to use a preset word frequency list containing multiple characters or words and extract the keywords that match entries in the list. The keywords of the mail content information may also be extracted in other ways, which the present application does not limit.
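The three selection strategies described for the keyword extraction module can be sketched as follows. This is a minimal illustration, not the patent's implementation: the regex tokenizer stands in for the unspecified "preset word segmentation method" (Chinese text would need a real segmenter), and the stop-word set and all function names are assumptions introduced here.

```python
import re
from collections import Counter

# Hypothetical stop-word set; the patent does not define one.
STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "in", "for", "on"}


def tokenize(text):
    # Stand-in for the "preset word segmentation method" (assumption).
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_counts(text):
    # Word frequency of every candidate keyword in the text.
    return Counter(t for t in tokenize(text) if t not in STOPWORDS)


def keywords_above(text, preset_value):
    # Strategy 1: keep keywords whose word frequency exceeds the preset value.
    counts = keyword_counts(text)
    return sorted(w for w, c in counts.items() if c > preset_value)


def top_keywords(text):
    # Strategy 2: keep only the keyword(s) with the highest word frequency.
    counts = keyword_counts(text)
    if not counts:
        return []
    best = max(counts.values())
    return sorted(w for w, c in counts.items() if c == best)


def keywords_in_list(text, word_list):
    # Strategy 3: keep words that also appear in a preset word list.
    return sorted(set(tokenize(text)) & set(word_list))


body = "Deploy failed. The deploy script needs a rollback. Rollback the deploy."
print(keywords_above(body, 1))
print(top_keywords(body))
print(keywords_in_list(body, {"rollback", "database"}))
```

The preset value is compared with a strict "greater than", matching the wording of the description; whether counts or normalized frequencies are meant is not stated in the source.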
Aggregation and classification module 130, configured to aggregate and classify the mail content according to the keywords of the mail content information to form a classification tree. Further, the aggregation and classification module 130 is also configured to use the keywords of the mail content information as labels of that mail content information, and to match the mail content information to preset categories according to the labels.
For example, a classification tree includes several categories such as "Personnel", "Administration" and "Technology". The "Technology" category includes label A and label B, and mail content whose body keywords or other mail content information carry label A is aggregated into this category. The classification can be a multi-level structure. As another example, the "Personnel" category includes a "department" label, so that mail content is classified and displayed along multiple dimensions according to the department of the sender or recipient. With the classification tree, mail content can be conveniently classified and viewed.
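The label matching in the example above might look like the following sketch. The `Category` class, the `classify` function and the "Storage" sub-category are hypothetical names chosen for illustration; the patent only states that keywords become labels and that labelled content is matched to preset, possibly multi-level, categories.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    # One node of the classification tree (illustrative structure).
    name: str
    labels: set = field(default_factory=set)
    children: list = field(default_factory=list)
    mails: list = field(default_factory=list)


def classify(root, mail_id, mail_labels):
    """Attach mail_id to every category sharing a label with the mail.

    Returns the slash-separated paths of the matching categories.
    """
    paths = []

    def visit(node, path):
        here = path + [node.name]
        if node.labels & mail_labels:
            node.mails.append(mail_id)
            paths.append("/".join(here))
        for child in node.children:
            visit(child, here)

    visit(root, [])
    return paths


tree = Category("root", children=[
    Category("Personnel", labels={"department"}),
    Category("Administration"),
    Category("Technology", labels={"A", "B"},
             children=[Category("Storage", labels={"A"})]),
])

print(classify(tree, "mail-1", {"A"}))
print(classify(tree, "mail-2", {"department"}))
```

Because a mail can carry several labels, the same mail may be attached to more than one category, which is one way to read the "multi-dimensional" classification mentioned in the text.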
Referring to Fig. 2, which shows a structural block diagram of an email aggregation and classification device 200 according to another embodiment of the present application, the device 200 includes:
Mail acquisition module 210, configured to obtain mail stored on a mail server;
Spam filtering processing module 220, coupled to the mail acquisition module 210 and configured to delete spam and/or duplicate mail from the mail that the mail acquisition module 210 obtains from the mail server;
Content sorting module 230, coupled to the spam filtering processing module 220 and configured to delete duplicate content and/or mail formatting content from the mail content information; the processed mail is sent to the central content extraction module 240;
Through the above processing, mail obtained by the mail acquisition module 210 passes through the spam filtering processing module 220 and the content sorting module 230 in turn, which provides a safeguard and improves processing speed; deleting non-central content also improves keyword accuracy.
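The two cleanup stages can be sketched roughly as below. The patent does not say how spam, duplicate mail or "mail formatting content" are detected, so the caller-supplied `is_spam` predicate, the hash-based duplicate check, and the treatment of quoted reply lines and signatures as removable content are all assumptions made for this example.

```python
import hashlib


def drop_spam_and_duplicates(mails, is_spam):
    # Keep the first copy of each mail body; skip anything flagged as spam.
    seen = set()
    kept = []
    for mail in mails:
        if is_spam(mail):
            continue
        # Normalize whitespace and case so trivially different copies match.
        normalized = " ".join(mail["body"].split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        kept.append(mail)
    return kept


def clean_body(body):
    # Drop quoted reply lines and everything after a signature delimiter.
    lines = []
    for line in body.splitlines():
        stripped = line.strip()
        if stripped == "--":  # conventional "-- " signature separator
            break
        if stripped.startswith(">"):  # quoted text from earlier replies
            continue
        if stripped:
            lines.append(stripped)
    return "\n".join(lines)


mails = [
    {"sender": "a@example.com", "body": "Hello  world"},
    {"sender": "b@example.com", "body": "hello world"},
    {"sender": "spam@example.com", "body": "Buy now"},
]
kept = drop_spam_and_duplicates(mails, lambda m: m["sender"].startswith("spam@"))
print([m["sender"] for m in kept])
print(clean_body("Fix applied.\n> old quoted text\n\nThanks\n-- \nBob"))
```

Running the filters before keyword extraction reduces the amount of text to segment, which is the speed benefit the description mentions.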
Central content extraction module 240, coupled to the content sorting module 230 and configured to extract the central content from the received mail content information; the processed mail is sent to the keyword extraction module 250;
Keyword extraction module 250, coupled to the central content extraction module 240 and configured to obtain the central content of the mail content information, perform word segmentation on that central content using a preset word segmentation method, and extract the keywords of the central content. Further, the keyword extraction module 250 is also configured to calculate the word frequency of the extracted keywords and to extract the keywords whose word frequency is greater than a preset value.
Aggregation and classification module 260, coupled to the keyword extraction module 250 and configured to aggregate and classify the mail content (central content) according to the keywords of the mail content information (central content) to form a classification tree. Specifically, the keywords of the mail content information are used as labels of that mail content information, and the mail content information is matched to preset categories according to the labels.
Classification tree index service module 270, coupled to the aggregation and classification module 260 and configured to build an index file of the mail content information according to the labels of the mail content information, and to provide a search service for the mail content information.
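One simple way to realize a label-based index with a search service is an in-memory inverted index, sketched below. The patent speaks of an "index file" without specifying its format, so the `LabelIndex` class, its methods and the AND semantics of multi-label queries are assumptions for illustration only.

```python
from collections import defaultdict


class LabelIndex:
    """Inverted index from label to the ids of mails carrying that label."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, mail_id, labels):
        # Register the mail under each of its labels (case-insensitive).
        for label in labels:
            self._postings[label.lower()].add(mail_id)

    def search(self, *labels):
        # Return ids of mails that carry all of the requested labels.
        sets = [self._postings.get(label.lower(), set()) for label in labels]
        if not sets:
            return []
        return sorted(set.intersection(*sets))


index = LabelIndex()
index.add("m1", {"deploy", "rollback"})
index.add("m2", {"deploy"})
print(index.search("deploy"))
print(index.search("Deploy", "rollback"))
```

A persistent deployment would write the postings to disk or delegate to a search engine; the dictionary here only shows the lookup structure.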
With the embodiments of the present application, mail content is automatically organized, classified and otherwise aggregated, which enables sharing of mail content and accumulation of knowledge; the classification tree of mail content provides mail classification, so users can find mail content information easily.
Referring to Fig. 3, which shows a structural block diagram of an email aggregation and classification device 300 according to another embodiment of the present application, the device 300 includes:
Mail acquisition module 310, configured to obtain mail stored on a mail server, where the mail acquisition module 310 may passively receive mail forwarded from the mail server, or may actively collect mail stored on the mail server;
In one embodiment of the present application, when an important email that has been replied to and discussed by several people, or an ordinary email, needs to be shared or needs its content integrated, the specified mail is sent to the email aggregation and classification device 300, so that the mail acquisition module 310 receives it. For example, in practice, the specified mail is sent to a mailbox at a predetermined address, and this mailbox forwards the mail to the device 300 in real time or periodically; alternatively, a forwarding function is provided in the action menu of the personal or enterprise mailbox, and when the user clicks the forwarding button, the specified mail is sent to the device 300.
Central content extraction module 320, configured to obtain the content information of the mail and extract the central content of the mail content information;
The mail content information includes, but is not limited to, one or a combination of the following dimensions: mail body, subject, summary, sender information, recipient information. The central content is obtained by performing central content extraction on this mail content information.
Search service module 330, configured to build an index file of the mail content information according to its central content, and to provide a search service for the mail content information.
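The description does not say how the central content is extracted. One plausible reading is extractive summarization, sketched below under that assumption: each sentence is scored by the average frequency of its words across the whole mail, and the highest-scoring sentences are kept. The function name and the scoring rule are illustrative choices, not the patent's method.

```python
import re
from collections import Counter


def _words(sentence):
    return re.findall(r"[a-z0-9]+", sentence.lower())


def central_sentences(text, max_sentences=1):
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in _words(s))

    def score(sentence):
        # Average document-wide frequency of the words in this sentence.
        ws = _words(sentence)
        return sum(freq[w] for w in ws) / len(ws) if ws else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    # Return the chosen sentences in their original order.
    return [sentences[i] for i in sorted(ranked[:max_sentences])]


text = ("Thanks for the report. "
        "The cache service crashed because the cache disk was full. "
        "See you.")
print(central_sentences(text))
```

The extracted sentences could then be passed to an index or to keyword extraction in place of the full mail body.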
Referring to Fig. 4, which shows a structural block diagram of an email aggregation and classification device 400 according to another embodiment of the present application, the device 400 includes:
Mail acquisition module 410, configured to obtain mail stored on a mail server;
Spam filtering processing module 420, coupled to the mail acquisition module 410 and configured to delete spam and/or duplicate mail from the mail that the mail acquisition module 410 obtains from the mail server;
Content sorting module 430, coupled to the spam filtering processing module 420 and configured to delete duplicate content and/or mail formatting content from the mail content information;
Through the above processing, mail obtained by the mail acquisition module 410 passes through the spam filtering processing module 420 and the content sorting module 430 in turn, which provides a safeguard and improves processing speed; deleting non-central content also improves keyword accuracy.
Central content extraction module 440, coupled to the content sorting module 430 and configured to obtain the content information of the mail and extract the central content of the mail content information; the mail content information includes, but is not limited to, one or a combination of the following: mail body, subject, summary, sender information, recipient information.
Search service module 450, coupled to the central content extraction module 440 and configured to build an index file of the mail content information according to its central content, and to provide a search service for the mail content information.
With the embodiments of the present application, mail content is automatically organized, classified and otherwise aggregated, and an index service for mail content information is provided, so users can easily search for the mail content information they need.
An embodiment of the present application further provides an email aggregation and classification method. Fig. 5 shows a flow chart of the method according to one embodiment of the present application. Referring to Fig. 5, the method includes the following steps:
Step S502: obtain mail stored on a mail server, where the mail may be passively received as forwarded from the mail server, or actively collected from the mail server;
Step S504: obtain the content information of the mail, perform word segmentation on the mail content information using a preset word segmentation method, and extract keywords of the mail content information;
The mail content information includes, but is not limited to, one or a combination of the following dimensions: mail body, subject, summary, sender information, recipient information. Word segmentation is performed separately on the mail body, subject, summary, sender information and/or recipient information, and the keywords of the mail content information are extracted.
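Obtaining the content information dimensions listed above from a raw message can be sketched with Python's standard `email` package. The `content_info` function and the dictionary keys are names invented for this example, and the "summary" dimension is left out because the source does not say how a summary is obtained.

```python
from email import message_from_string
from email.policy import default


def content_info(raw_message):
    # Parse the raw RFC 5322 text into an EmailMessage object.
    msg = message_from_string(raw_message, policy=default)
    # Prefer the plain-text body part if the message has one.
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    return {
        "subject": str(msg["Subject"] or ""),
        "sender": str(msg["From"] or ""),
        "recipients": str(msg["To"] or ""),
        "body": body.strip(),
    }


raw = (
    "From: alice@example.com\n"
    "To: team@example.com\n"
    "Subject: Build failure\n"
    "\n"
    "The build is broken.\n"
)
info = content_info(raw)
print(info["subject"])
print(info["body"])
```

Each value in the returned dictionary can then be segmented separately, as the step describes.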
Further, the following are also performed before step S504:
Deleting spam and/or duplicate mail from the mail obtained from the mail server;
Deleting duplicate content and/or mail formatting content from the mail content information.
The above processing provides a safeguard and improves processing speed; deleting non-central content also improves keyword accuracy.
Then, the central content is extracted from the received mail content information, and keywords are extracted from that central content.
Further, extracting the keywords of the mail content information also includes: calculating the word frequency of the extracted keywords, keeping the keywords whose word frequency is greater than a preset value and discarding those whose word frequency is less than the preset value; or directly extracting the keyword with the highest word frequency; or using a preset word frequency list containing multiple characters or words and extracting the keywords that match entries in the list. The keywords of the mail content information may also be extracted in other ways, which the present application does not limit.
Step S506: aggregate and classify the mail content according to the keywords of the mail content information to form a classification tree. Further, the keywords of the mail content information are used as labels of that mail content information, and the mail content information is matched to preset categories according to the labels.
Further, the method also includes: building an index file of the mail content information according to the labels of the mail content information, and providing a search service for the mail content information, so that users can easily find mail content information.
Fig. 6 shows a flow chart of an email aggregation and classification method according to another embodiment of the present application. Referring to Fig. 6, the method includes the following steps:
Step S602: obtain mail stored on a mail server, where the mail may be passively received as forwarded from the mail server, or actively collected from the mail server;
Step S604: obtain the content information of the mail, and extract the central content of the mail content information;
The mail content information includes, but is not limited to, one or a combination of the following dimensions: mail body, subject, summary, sender information, recipient information. Central content extraction is performed separately on the mail body, subject, summary, sender information and/or recipient information to obtain the central content of the mail content information.
Further, the following are also performed before step S604:
Deleting spam and/or duplicate mail from the mail obtained from the mail server;
Deleting duplicate content and/or mail formatting content from the mail content information.
The above processing provides a safeguard and improves processing speed; deleting non-central content also improves keyword accuracy.
Step S606: build an index file of the mail content information according to its central content, and provide a search service for the mail content information.
In summary, the technical solution of the present application analyzes mail content to extract keywords, and aggregates and classifies the mail content according to those keywords to form a classification tree, so that mail content can be conveniently classified and viewed.
Those skilled in the art will understand that embodiments of the present application may be provided as a method, a system or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces and memory.
Memory may include volatile memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" and any variants are intended to cover non-exclusive inclusion, so that a process, method, product or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, product or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, product or device that includes the element.
The above are only embodiments of the present application and are not intended to limit it. Those skilled in the art may make various modifications and changes to the present application. Any modification, equivalent replacement or improvement made within the spirit and principles of the present application shall fall within the scope of its claims.

Claims (24)

1. An email aggregation and classification device, characterized by comprising:
a mail acquisition module, configured to obtain mail stored on a mail server;
a keyword extraction module, configured to obtain the content information of the mail, perform word segmentation on the mail content information using a preset word segmentation method, and extract keywords of the mail content information;
an aggregation and classification module, configured to aggregate and classify the mail content according to the keywords of the mail content information to form a classification tree.
Device the most according to claim 1, it is characterised in that described polymerization sort module is additionally operable to, Using the keyword of Mail Contents information as the label of this Mail Contents information, and according to label by mail Hold in the classification that information matches is extremely preset.
Device the most according to claim 2, it is characterised in that also include:
Classification tree index service module, for setting up described mail according to the label of described Mail Contents information The index file of content information, and the search service of described Mail Contents information is provided.
Device the most according to claim 1, it is characterised in that also include:
Content center extraction module, for the Mail Contents information received is carried out centre point extraction, Obtain the centre point of Mail Contents information;
Described keyword-extraction module is additionally operable to, and the centre point of described Mail Contents information is carried out key Word extraction process.
5. according to the device described in claim 1 or 4, it is characterised in that described keyword-extraction module It is additionally operable to, word frequency calculating is carried out for the keyword extracted, extract the word frequency key more than preset value Word.
Device the most according to claim 1, it is characterised in that also include:
Spam filtering processing module, is used for deleting described mail acquisition module and obtains from mail server To mail in spam and/or repeat mail.
Device the most according to claim 6, it is characterised in that also include:
Content sorting module, for deleting in the duplicate contents in Mail Contents information and/or mail format Hold.
Device the most according to claim 1, it is characterised in that described Mail Contents information include with One or a combination set of lower: message body, theme, summary, sender information, addressee information.
9. An e-mail aggregation and classification device, characterized by comprising:
a mail acquisition module, configured to obtain mail stored on a mail server;
a content center extraction module, configured to obtain content information of the mail and perform central content extraction on the mail content information to obtain the central content of the mail content information;
a search service module, configured to build an index file of the mail content information according to the central content of the mail content information, and to provide a search service for the mail content information.
10. The device according to claim 9, characterized by further comprising:
a spam filtering module, configured to delete spam and/or duplicate mail from the mail that the mail acquisition module obtains from the mail server.
11. The device according to claim 10, characterized by further comprising:
a content organizing module, configured to delete duplicate content and/or mail formatting content from the mail content information.
12. The device according to claim 9, characterized in that the mail content information includes one or a combination of the following: message body, subject, summary, sender information, recipient information.
13. An e-mail aggregation and classification method, characterized by comprising:
obtaining mail stored on a mail server;
obtaining content information of the mail, performing word segmentation on the mail content information using a preset word segmentation mode, and extracting keywords of the mail content information;
performing aggregation and classification processing on the mail content according to the keywords of the mail content information to form a classification tree.
14. The method according to claim 13, characterized in that performing aggregation and classification processing on the mail content according to the keywords of the mail content information to form a classification tree further comprises:
using the keywords of the mail content information as labels of that mail content information, and matching the mail content information to preset categories according to the labels.
15. The method according to claim 14, characterized by further comprising:
building an index file of the mail content information according to the labels of the mail content information, and providing a search service for the mail content information.
16. The method according to claim 13, characterized by further comprising:
performing central content extraction on the received mail content information to obtain the central content of the mail content information;
wherein extracting the keywords of the mail content information further comprises: performing keyword extraction on the central content of the mail content information.
17. The method according to claim 13 or 16, characterized in that extracting the keywords of the mail content information further comprises: performing word frequency calculation on the extracted keywords, and extracting the keywords whose word frequency is greater than a preset value.
18. The method according to claim 13, characterized by further comprising:
deleting spam and/or duplicate mail from the mail obtained from the mail server.
19. The method according to claim 18, characterized by further comprising:
deleting duplicate content and/or mail formatting content from the mail content information.
20. The method according to claim 13, characterized in that the mail content information includes one or a combination of the following: message body, subject, summary, sender information, recipient information.
21. An e-mail aggregation and classification method, characterized by comprising:
obtaining mail stored on a mail server;
obtaining content information of the mail, and performing central content extraction on the mail content information to obtain the central content of the mail content information;
building an index file of the mail content information according to the central content of the mail content information, and providing a search service for the mail content information.
22. The method according to claim 21, characterized by further comprising:
deleting spam and/or duplicate mail from the mail obtained from the mail server.
23. The method according to claim 22, characterized by further comprising:
deleting duplicate content and/or mail formatting content from the mail content information.
24. The method according to claim 21, characterized in that the mail content information includes one or a combination of the following: message body, subject, summary, sender information, recipient information.
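
As an illustration only, the method of claims 13 to 15 and 17 can be sketched in Python as follows. The claims do not specify a segmentation algorithm, a stop-word list, the preset categories, or the shape of the classification tree, so every name below (STOP_WORDS, PRESET_CATEGORIES, segment, extract_keywords, build_classification_tree, search) is a hypothetical assumption. A regular-expression tokenizer for English text stands in for the "preset word segmentation mode", and an in-memory lookup stands in for the index file.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stop words and preset categories; the claims specify neither.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "for", "is", "in", "on", "please"}
PRESET_CATEGORIES = {
    "finance": {"invoice", "payment"},
    "meetings": {"meeting", "agenda"},
}


def segment(text):
    """Stand-in for the preset word segmentation mode: lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_keywords(text, threshold=1):
    """Keep non-stop-word tokens whose word frequency is greater than the preset value."""
    counts = Counter(t for t in segment(text) if t not in STOP_WORDS)
    return sorted(word for word, n in counts.items() if n > threshold)


def build_classification_tree(mails, threshold=1):
    """Use keywords as labels and match each mail to a preset category.

    Returns {category: {label: [mail ids]}}; labels that match no preset
    category are grouped under "other" (an assumption of this sketch).
    """
    tree = defaultdict(lambda: defaultdict(list))
    for mail_id, text in mails.items():
        for label in extract_keywords(text, threshold):
            category = next(
                (c for c, words in PRESET_CATEGORIES.items() if label in words),
                "other",
            )
            tree[category][label].append(mail_id)
    return {category: dict(labels) for category, labels in tree.items()}


def search(tree, label):
    """Look up mail ids by label, using the tree as a simple in-memory index."""
    return [m for labels in tree.values() for m in labels.get(label, [])]


if __name__ == "__main__":
    sample = {
        "m1": "Invoice attached. Please settle the invoice payment; payment is due Friday.",
        "m2": "Meeting agenda: the meeting starts at ten.",
    }
    tree = build_classification_tree(sample)
    print(tree)
    print(search(tree, "payment"))
```

For Chinese mail, the segment function would be replaced by a real word segmenter, and a production system would persist the index rather than rebuild it in memory.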
CN201510155716.3A 2015-04-02 2015-04-02 Email polymerization sorting technique and device Pending CN106156105A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510155716.3A CN106156105A (en) 2015-04-02 2015-04-02 Email polymerization sorting technique and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510155716.3A CN106156105A (en) 2015-04-02 2015-04-02 Email polymerization sorting technique and device

Publications (1)

Publication Number Publication Date
CN106156105A true CN106156105A (en) 2016-11-23

Family

ID=57338201

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510155716.3A Pending CN106156105A (en) 2015-04-02 2015-04-02 Email polymerization sorting technique and device

Country Status (1)

Country Link
CN (1) CN106156105A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101150529A (en) * 2006-09-21 2008-03-26 腾讯科技(深圳)有限公司 A method and system for mail search
CN101059815A (en) * 2007-05-09 2007-10-24 宋鸣 Network abstract customization search engine
US20140359480A1 (en) * 2013-06-04 2014-12-04 Yahoo! Inc. System and method for contextual mail recommendations
CN103942282A (en) * 2014-04-02 2014-07-23 新浪网技术(中国)有限公司 Sample data obtaining method, device and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
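Yang Huijie (杨慧洁): "Research on important node and community discovery techniques in e-mail communication relationship networks" (邮件通联关系网络中重要节点及社团发现技术研究), China Master's Theses Full-text Database, Information Science and Technology series (《中国优秀硕士学位论文全文数据库 信息科技辑》) *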

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018102995A1 (en) * 2016-12-06 2018-06-14 深圳市唯德科创信息有限公司 Mail management method and system
CN110073345A (en) * 2016-12-06 2019-07-30 深圳市唯德科创信息有限公司 A kind of management method and system of Email attachment
CN110235117A (en) * 2016-12-06 2019-09-13 深圳市唯德科创信息有限公司 A kind of management method and system of mail
CN109144632A (en) * 2018-07-19 2019-01-04 平安科技(深圳)有限公司 A kind of method, apparatus, storage medium and electronic equipment handling mail
CN109600300A (en) * 2018-11-19 2019-04-09 郑州云海信息技术有限公司 A kind of artificial intelligence mail management system and method
CN109800433A (en) * 2019-01-24 2019-05-24 深圳市小满科技有限公司 Method, apparatus of filing, electronic equipment and medium based on two disaggregated model of mail
CN109800433B (en) * 2019-01-24 2023-11-10 深圳市小满科技有限公司 Filing method and device based on mail two-class model, electronic equipment and medium
CN111047455A (en) * 2019-12-31 2020-04-21 武汉市烽视威科技有限公司 Personal statue method and system for mail
CN113595863A (en) * 2020-04-30 2021-11-02 北京字节跳动网络技术有限公司 Display method and device of shared mails, electronic equipment and storage medium
CN113595863B (en) * 2020-04-30 2023-04-18 北京字节跳动网络技术有限公司 Display method and device of shared mails, electronic equipment and storage medium
US11895075B2 (en) 2020-04-30 2024-02-06 Beijing Bytedance Network Technology Co., Ltd. Method and apparatus for displaying shared mail, and electronic device and storage medium

Similar Documents

Publication Publication Date Title
CN106156105A (en) Email polymerization sorting technique and device
US9819634B2 (en) Organizing messages in a messaging system using social network information
Navaney et al. SMS spam filtering using supervised machine learning algorithms
CN104982011B (en) Use the document classification of multiple dimensioned text fingerprints
US8688690B2 (en) Method for calculating semantic similarities between messages and conversations based on enhanced entity extraction
CN103514174B (en) A kind of file classification method and device
US10360537B1 (en) Generating and applying event data extraction templates
Saad et al. A survey of machine learning techniques for Spam filtering
CN103136266A (en) Method and device for classification of mail
US9785705B1 (en) Generating and applying data extraction templates
WO2019179010A1 (en) Data set acquisition method, classification method and device, apparatus, and storage medium
US20140074947A1 (en) Automated e-mail screening to verify recipients of an outgoing e-mail message
CN110213152B (en) Method, device, server and storage medium for identifying junk mails
Bhat et al. Classification of email using BeaKS: Behavior and keyword stemming
CN108347367B (en) E-mail processing method and device, server and client
Khan et al. Text mining approach to detect spam in emails
Dewi et al. Multiclass SMS message categorization: Beyond spam binary classification
CN106230690B (en) A kind of process for sorting mailings and system of combination user property
CN104268214A (en) Micro-blog user relationship based user gender identification method and system
CN111047455A (en) Personal statue method and system for mail
Patidar et al. A novel technique of email classification for spam detection
CN106779080A (en) A kind of people information knowledge base method for auto constructing
US10163005B2 (en) Document structure analysis device with image processing
JP2008250437A (en) Mail data sorting apparatus, mail data sorting program, mail data sorting method, e-mail data hierarchy localization device, e-mail data hierarchy localization program, and e-mail data hierarchy localization method
CN109660961B (en) Method and device for matching short message number and attribution information thereof and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20161123