CN101208686B - Thread identification and classification - Google Patents


Info

Publication number
CN101208686B
Authority
CN
China
Prior art keywords
thread
message
electronic message
subset
said electronic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2006800114025A
Other languages
Chinese (zh)
Other versions
CN101208686A (en
Inventor
安德鲁·本斯基
阿内诗·马达坡西
弗雷德里克·米勒
约尔·萨奇
马格纳斯·斯坦斯蒙
詹姆斯·查尔斯·威廉姆斯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ev Ott
K-boom Knight Limited by Share Ltd.
Original Assignee
Metalincs Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US11/128,935 external-priority patent/US8055715B2/en
Application filed by Metalincs Corp filed Critical Metalincs Corp
Publication of CN101208686A publication Critical patent/CN101208686A/en
Application granted granted Critical
Publication of CN101208686B publication Critical patent/CN101208686B/en

Abstract

Systems, methods and apparatuses for analyzing electronic messages and grouping them into threads are described. In addition, the present invention may classify threads based on their relationship to other threads.

Description

Thread identification and classification
Technical field
The present invention relates generally to electronic messaging technology, and more specifically to analyzing electronic messages and grouping them into threads (also referred to as discussions or conversations).
Background
As personal and commercial use of electronic messaging (such as e-mail) continues to grow, improved applications for storing, archiving and retrieving electronic messages are needed. One field in which this technology is important is investigation, where large numbers of electronic messages must be searched for those matching a particular search topic. For example, electronic messages may need to be searched and retrieved to meet specific corporate compliance obligations or to respond to discovery requests made during litigation.
One way electronic messages can be organized is by grouping them into threads. A thread comprises one or more electronic messages that form a chain of correspondence. A thread begins with an initial electronic message and includes any subsequent reply to, or forward of, the initial message or any other message in the thread. One problem with current applications that group electronic messages into threads is that messages are grouped into threads based only on the subject of the electronic message.
Grouping electronic messages by subject alone may cause unrelated electronic messages to be grouped into the same thread. For example, each department of a company may hold a quarterly meeting to discuss the status of, or problems with, that quarter's projects. Each department may send an electronic message (such as an e-mail message) containing the meeting agenda to the members of the department. If every department uses the subject "Quarterly Meeting" for its message, however, a current application may group the meeting messages from every group into a single thread.
Grouping electronic messages in this way may cause few problems in a single-inbox environment, because only the electronic messages sent to that inbox will be included in the thread. When electronic messages are viewed across multiple inboxes, however, unrelated electronic messages that may involve different sets of contacts may be grouped together in a single thread merely because they share a common subject. Such grouping may make searching electronic messages harder, rather than making it easier to locate the electronic messages relevant to a particular search query.
Summary of the invention
The invention describes systems, apparatuses and methods for grouping messages received from multiple electronic message accounts into one or more threads. The invention may group messages that use different electronic message formats, and that may be received out of order, into the appropriate threads. The electronic messages and the identified threads may be stored in a database or other storage medium that can be searched based on search terms entered by a user. Electronic messages and threads matching the search terms may be displayed to the user for further investigation.
In one embodiment of the invention, an electronic message may be parsed to identify one or more header fields that can be used to identify the thread to which the message may belong. Examples of such header fields include, but are not limited to, one or more thread identification headers found in the electronic message, the subject, and the date. When the thread to which the message belongs is identified, the electronic message may be added to the thread and stored in a database or other storage medium. If no thread is identified for a particular electronic message, the message may be stored in the database and may start a new thread.
In addition to grouping electronic messages into threads, the invention also identifies related threads and classifies one or more threads relative to their related threads. In one embodiment, each thread may be classified according to how its contacts differ from those of a related thread. In another embodiment, each thread may be classified according to the difference or similarity of the topic or content discussed in the particular thread.
Brief description of the drawings
Reference is made to embodiments of the invention, examples of which may be illustrated in the accompanying drawings. These drawings are intended to be illustrative and not limiting. Although the invention is generally described in the context of these embodiments, it should be understood that this is not intended to limit the scope of the invention to these particular embodiments.
Fig. 1 illustrates a thread of electronic messages together with a related thread.
Fig. 2 is a flowchart of a method for grouping electronic messages into threads according to one embodiment of the invention.
Fig. 3A illustrates a visual representation of the date ranges of two threads 310 and 320.
Fig. 3B illustrates a visual representation of the date ranges of two threads 330 and 340.
Fig. 4 is a diagram illustrating three example threads.
Fig. 5 is a table of information for three electronic messages.
Fig. 6 is a diagram illustrating the example threads of Fig. 4 with the electronic messages of Fig. 5 added to the threads.
Detailed description
Systems, apparatuses and methods for analyzing electronic messages and grouping one or more messages into threads are described below. In addition to grouping messages into threads, the invention also establishes relationships between threads and classifies the threads based on those relationships. In the following description, for purposes of explanation, specific details are set forth in order to provide an understanding of the invention. It will be apparent to those skilled in the art, however, that the invention can be practiced without these details. In addition, those skilled in the art will recognize that the embodiments of the invention described below may be implemented in a variety of ways, including software, hardware, firmware, or a combination thereof. Accordingly, the flowcharts described below are illustrative of specific embodiments of the invention and are meant to avoid obscuring the invention.
Reference in the specification to "one embodiment", "a preferred embodiment" or "an embodiment" means that a particular feature, structure, characteristic or function described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment.
In one embodiment, a software application (the application) analyzes electronic messages received from one or more user accounts and groups the messages into threads. For example, the application may analyze the electronic messages of all users in a company, or in a department of the company, and group them into threads for further analysis. There are many ways in which the application can receive electronic messages from multiple user accounts. In one embodiment, the application may receive electronic messages from an electronic message server (such as an e-mail server in the case of e-mail messages). The electronic message server may contain copies of all electronic messages sent or received by the users of the company or of a particular department. In another embodiment, the electronic message server may be configured to forward a copy of each incoming or outgoing electronic message for analysis. Those skilled in the art will recognize that, besides those described above, there are other ways in which the application may access electronic messages from multiple user accounts. Although not specifically mentioned, such other methods are considered to be within the scope of the invention.
In one embodiment, the application of the invention parses each electronic message to identify one or more header fields. The header fields of the electronic message may be compared with the corresponding header fields of one or more existing threads stored in a database or other storage medium to determine the thread to which the message belongs. The application may store the electronic message in the database and add it to the appropriate thread. The invention may also analyze the messages forming each thread to determine the relationships between threads, and may classify the threads based on those relationships.
Fig. 1 illustrates an example of a thread group 100 according to one embodiment of the invention. A thread group is a set of messages with an initial message that is not a reply to or forward of any earlier message, such that every other message in the thread group is a reply to or forward of the initial message or of an earlier message in the thread group. In Fig. 1, messages A, B, C, D, E and F form a thread group. Message A is the initial message because it is the earliest message in the thread. The other messages are replies to message A or to another message in the chain of messages originating from message A.
There are many ways to determine whether an electronic message is a reply to or forward of another message. For example, a message may include a reply or forward header that identifies the message to which the current message is a reply or forward. As another example, a message may be classified as a reply to or forward of a second message if it shares a normalized or modified subject with the second message, falls within an acceptable date range of the second message, and shares one or more contacts with any other message in the thread group. The normalized subject of a message may be found by removing any subject prefixes (such as Re, RE, FW, FWD, Fwd, etc.) from the subject header field of the message. The normalized subject may be compared with the normalized subjects of other messages to find one or more messages with which the message may form part of a thread group.
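The subject normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `normalize_subject`, the case folding, and the requirement that each prefix be followed by a colon are assumptions made for the example.

```python
import re

# Prefixes named in the text (Re, RE, FW, FWD, Fwd), matched case-insensitively.
# Requiring a trailing colon is an assumption of this sketch.
_PREFIX = re.compile(r"^\s*(?:re|fwd?)\s*:\s*", re.IGNORECASE)

def normalize_subject(subject: str) -> str:
    """Strip leading reply/forward prefixes, collapse whitespace, fold case."""
    previous = None
    while previous != subject:          # repeat until no prefix is left
        previous = subject
        subject = _PREFIX.sub("", subject, count=1)
    return " ".join(subject.split()).lower()
```

Two messages would then be compared on `normalize_subject(a) == normalize_subject(b)` before the date and contact checks are applied.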
The date of a message may be compared with the date range of the messages sharing the same normalized subject to determine whether the messages form a thread group. For example, it may be desirable to limit the messages with the same subject that are grouped into a thread group to those that occur within some interval of time before or after the messages sharing that subject, as discussed below.
As another example, whether a message is a reply to or forward of a previous message may be inferred heuristically from the subject or body of the message. The presence of a subject prefix (such as Re, RE, FW, FWD, Fwd, etc.) may indicate that the message is a reply to a previous message. As another example, the body of the message may include text or another identifier indicating that the message is a reply to or forward of a previous message. For example, if the body text includes a statement such as "Begin forwarded message" or includes indented text from a previous message, this may indicate that the message is a reply to or forward of a previous message.
In one embodiment of the invention, the electronic messages forming a thread group may be grouped into one or more threads. Many criteria may be used to group messages into threads. In one embodiment, messages may be grouped into threads based on the contacts of each message. Messages in a thread group may be grouped into a thread when they have the same contacts and are forwards of or replies to a common message, or forwards of or replies to messages in the chain originating from that common message.
In one embodiment of the invention, the contacts of a message may be defined as the sender and all recipients, whether a recipient is named explicitly in a recipient field of the electronic message, is named through membership in a mailing list, or is an unlisted recipient (bcc) identified by having received the message. A mailing list is a virtual address that can be entered in an electronic message to represent multiple recipients. The members of a mailing list may be determined in many ways. In one embodiment, the application may request from the electronic message (e-message) server (such as the e-mail server in the case of email messages) each person represented by a mailing list. In another embodiment, membership in a mailing list may be inferred heuristically. For example, if someone replies to a message in which he was not explicitly named as a contact, and the message included a mailing list, that person may be assumed to be a member of one of the mailing lists.
Because the messages forming a thread group may have different sets of contacts, each thread group may include multiple threads. In one embodiment of the invention, the threads may be further classified so that a thread group includes a main thread and one or more related threads. In one embodiment, the main thread is the thread that includes the initial message of the thread group and the chain of replies to the initial message that have the same contacts as the initial message. For example, in Fig. 1, messages A, B, C and D form the main thread of thread group 100. In this example, electronic messages A, B, C and D have the same subject (product issue) and the same contacts (Tim, Carl, Bob and Ray). Message A is the initial message of the main thread, messages B and C are replies to message A, and message D is a reply to message C.
The remaining threads in the thread group may be classified as related threads. A related thread includes an initial message that is a reply to, or forward of, a message that is part of another thread in the thread group but that has a different set of contacts (or that meets some other criterion distinguishing the electronic message from the parent message it replies to or forwards). A related thread is said to branch from the other thread. Referring again to Fig. 1, messages E and F are messages in the thread group that may be grouped into a second thread. As shown, message E is a reply to message B. Message E, however, is a reply from Tim that was sent only to Carl. Because message E includes fewer contacts than message B, message E is the initial message of a related thread that branches from the main thread at message B. In this example, the related thread includes message E and message F, which is a reply to message E.
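The split of the Fig. 1 thread group into a main thread and a related thread can be sketched as follows. This is an illustration under stated assumptions: the `Message` record, the `split_into_threads` function, the thread labels, and the requirement that parents appear before their replies in the input are all hypothetical choices for the example. The only rule taken from the text is that a reply stays in its parent's thread when the contacts are identical and otherwise starts a new thread.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Message:
    msg_id: str
    parent_id: Optional[str]   # None for the initial message of the thread group
    contacts: frozenset        # sender plus all recipients

def split_into_threads(messages):
    """Group a thread group into threads; parents must precede their replies."""
    by_id = {m.msg_id: m for m in messages}
    thread_of = {}             # msg_id -> thread label
    threads = {}               # thread label -> list of msg_ids
    for m in messages:
        parent = by_id.get(m.parent_id) if m.parent_id else None
        if parent is not None and parent.contacts == m.contacts:
            label = thread_of[parent.msg_id]      # same contacts: same thread
        else:
            label = f"thread-from-{m.msg_id}"     # initial message or a branch
        thread_of[m.msg_id] = label
        threads.setdefault(label, []).append(m.msg_id)
    return threads

everyone = frozenset({"Tim", "Carl", "Bob", "Ray"})
tim_carl = frozenset({"Tim", "Carl"})
fig1 = [
    Message("A", None, everyone), Message("B", "A", everyone),
    Message("C", "A", everyone), Message("D", "C", everyone),
    Message("E", "B", tim_carl), Message("F", "E", tim_carl),
]
fig1_threads = split_into_threads(fig1)
```

With the Fig. 1 data, messages A to D land in one thread and messages E and F in a second thread that branches at message B.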
In one embodiment, it may be assumed that the topic or content of message E, although possibly related to the topic or content of the main thread, may be relevant only to Tim and Carl. By distinguishing threads in this way, and classifying a thread sent to contacts different from those of the thread it branches from as a related thread, the electronic messages forming the threads can be presented to the user in a way that helps the user determine which messages may be relevant to a particular search query or topic of discussion. For example, a related thread may be shown differently from the main thread. In Fig. 1, the messages of the related thread are connected by dashed lines rather than solid lines. This gives the user a visual indication that messages E and F form a related thread.
A related thread may be further classified relative to the thread from which it branches, based on the difference in contacts between the two threads. For example, a thread may be classified as a reduced thread if its initial message is a reply to another message in the thread group but includes fewer than all of the contacts of the message it replies to. In Fig. 1, the related thread including messages E and F may be classified as a reduced thread, because the contacts of message E are a subset of the contacts of the message (message B) from which the related thread branches.
As another example, a thread may be classified as an expanded thread if its initial message is a reply to another message in the thread group and includes the same contacts as the message it replies to plus additional contacts. In another embodiment, however, a reply message that includes all of the contacts of the message it replies to, together with one or more additional contacts, may be grouped into the same thread rather than classified as an expanded thread. In another example, a thread may be classified as an overlapping thread if its initial message is a reply whose contacts are neither a strict reduction nor an expansion. For example, a thread may be classified as an overlapping thread if its initial message includes a subset of the contacts of its parent message (i.e., the message it replies to) plus some new contacts not found in the parent message.
In another example, a thread may be classified as a forwarded thread if its initial message is a forward of another message in the thread group, regardless of its contacts. As another example, a forwarded or overlapping thread that has no contacts in common with the main thread may be classified as a spawned thread.
A thread may be considered a modified thread if its subject has been changed from the subject of the main thread, or of the thread from which it branches, in a way that is more than a simple change of prefix, capitalization, whitespace or punctuation. As another example, a thread that has exactly one contact in common with the message from which it branches may be classified as a private thread.
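The contact-based classifications above reduce to set comparisons between the contacts of the branching message and those of its parent. The sketch below is illustrative only: the function name `classify_branch`, the returned label strings, and the order in which the checks are applied are assumptions, and the spawned, modified and private categories are left out to keep the example short.

```python
def classify_branch(parent_contacts, branch_contacts, is_forward=False):
    """Label a branching message by comparing its contacts with its parent's."""
    parent, branch = set(parent_contacts), set(branch_contacts)
    if is_forward:
        return "forwarded"        # forwards are classified regardless of contacts
    if branch == parent:
        return "same"             # no branch: the message stays in the parent's thread
    if branch < parent:
        return "reduced"          # strict subset of the parent's contacts
    if branch > parent:
        return "expanded"         # all parent contacts plus additional ones
    return "overlapping"          # neither a strict reduction nor an expansion
```

For Fig. 1, `classify_branch({"Tim", "Carl", "Bob", "Ray"}, {"Tim", "Carl"})` yields the reduced label for the thread that starts at message E.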
In another example, a thread whose topic has changed from that of the thread it branches from may be classified as a modified thread. In one embodiment, the content or topic discussed in an electronic message may be compared with the content or topic discussed in one or more messages in its own thread or in the thread from which it branches. If the topic or content of the electronic message differs from that of the other messages in the thread, or of the messages in the thread from which it branches, the message may be classified as a modified thread. Again, this may be useful to an investigator reviewing message threads.
In one embodiment, keywords and/or phrases may be extracted from the messages forming a thread. These keywords and/or phrases may be compared with keywords and/or phrases parsed from the electronic message to determine the overlap between the message bodies. High overlap indicates that the content of the messages is similar. Low overlap indicates that the electronic message may be classified as a modified thread, because its content differs from the content of the rest of the thread. Those skilled in the art will recognize that many other content-similarity tests may be applied to the messages forming a thread to determine the similarity between the content of a message and the content of the remaining messages in the thread.
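One simple way to measure keyword overlap is the Jaccard ratio between the keyword set of the new message and the combined keyword set of the thread. The text does not name a specific measure, so the Jaccard ratio, the tiny stop-word list, and the function names `keywords` and `overlap` are all assumptions of this sketch.

```python
import re

# A deliberately small stop-word list, assumed for illustration only.
STOPWORDS = frozenset({"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on"})

def keywords(text):
    """Lower-cased word tokens with stop words removed."""
    return {w for w in re.findall(r"[a-z0-9']+", text.lower()) if w not in STOPWORDS}

def overlap(message_body, thread_bodies):
    """Jaccard overlap between a message and the union of a thread's keywords."""
    msg = keywords(message_body)
    thread = set().union(*(keywords(b) for b in thread_bodies))
    if not msg or not thread:
        return 0.0
    return len(msg & thread) / len(msg | thread)
```

A caller would compare the result against a threshold of its own choosing; a value near zero suggests the message has drifted to a different topic.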
Those skilled in the art will recognize that these are only some of the possible ways of classifying threads. Other classifications are possible and are considered to be within the scope of the invention.
Fig. 2 illustrates a flowchart for grouping electronic messages into threads according to one embodiment of the invention. An electronic message is parsed 110 to identify one or more of its header fields. Examples of header fields include, but are not limited to, the subject of the message, the date the message was sent, thread identification headers, a references field, an in-reply-to field, and one or more recipient fields.
The recipient fields (including the To:, From:, Cc:, Bcc: and Apparently-To: fields) typically contain the electronic addresses of the contacts of the electronic message. A recipient field may also contain one or more mailing lists. For example, a company may have a mailing list called "sales department" that maps to a list of the electronic addresses of the personnel in the company's sales department. When sending an electronic message to the personnel in the sales department, a user may enter "sales department" in a recipient field rather than entering each person's electronic message address. When the message is sent, the electronic message server recognizes the mailing list and expands it by forwarding a copy of the message to the address of each person represented by the mailing list.
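For messages in the Internet Message Format, the parsing step can be sketched with Python's standard `email` package. The function name `parse_headers` and the shape of the returned dictionary are assumptions of this example; the header names are the ones mentioned in the text.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

RECIPIENT_FIELDS = ("From", "To", "Cc", "Bcc", "Apparently-To")

def parse_headers(raw):
    """Extract the header fields used for threading from a raw message."""
    msg = message_from_string(raw)
    contacts = set()
    for name in RECIPIENT_FIELDS:
        # get_all returns every occurrence of the header; getaddresses splits lists
        for _display, addr in getaddresses(msg.get_all(name, [])):
            if addr:
                contacts.add(addr.lower())
    date = msg.get("Date")
    return {
        "subject": msg.get("Subject", ""),
        "date": parsedate_to_datetime(date) if date else None,
        "message_id": msg.get("Message-ID"),
        "in_reply_to": msg.get("In-Reply-To"),
        "references": (msg.get("References") or "").split(),
        "thread_index": msg.get("Thread-Index"),
        "contacts": contacts,
    }
```

Any field that is absent comes back as `None` or an empty value, so later steps can fall back to the subject, date and contacts when no thread identification header is present.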
In one embodiment of the invention, when the recipient fields of an electronic message containing a mailing list are parsed, the mailing list may be expanded by looking up the electronic message address of each person represented by the mailing list. For example, the application may request the electronic message addresses represented by the mailing list from the electronic message server or a directory service. The application may associate each electronic message address that the mailing list represented at the time the message was sent with the appropriate recipient header field and store it in the database. The mailing list itself may also be stored in the database with the associated message.
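Mailing-list expansion can be sketched as a lookup against a membership table. In this example the table is a plain dictionary standing in for the electronic message server or directory service; the function name `expand_contacts` and its return shape are hypothetical.

```python
def expand_contacts(addresses, list_members):
    """Replace mailing-list addresses with their members.

    list_members maps a list address to its members at the time of sending
    (a stand-in for a server or directory-service query).
    Returns (individual contacts, mailing lists that were expanded).
    """
    contacts, lists_seen = set(), set()
    for addr in addresses:
        members = list_members.get(addr)
        if members is None:
            contacts.add(addr)            # an ordinary individual address
        else:
            lists_seen.add(addr)          # keep the list so it can be stored too
            contacts.update(members)
    return contacts, lists_seen
```

Returning the expanded lists alongside the contacts mirrors the text's note that the mailing list itself may be stored with the message.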
The invention uses the header fields parsed from a message to determine the thread to which the message belongs. Those skilled in the art will recognize that the number and type of header fields found in an electronic message may vary with the message format used to create it. For example, an electronic message in the RFC 2822 Internet Message Format may contain header fields that are not found in Microsoft's proprietary electronic message format, and vice versa. As described below, the invention can analyze messages of different formats and group them into the appropriate threads.
One example of a header field that may be included in an electronic message is a thread identification header. As discussed here, a thread identification header contains information that can be used to identify the thread to which a message belongs or the one or more messages to which the current electronic message is a reply. For example, a thread identification header may contain a unique number or other identifier that is included in the header fields of every message in a thread or thread group. As another example, a thread identification header may list the parent message, or all or some of the other messages forming the same thread and/or thread group. These thread identification headers provide a shortcut for identifying the appropriate thread for a given message.
If a thread identification header is found 120, it may be compared with the corresponding thread identification headers of each existing thread. If the thread identification header has been seen before 130 (i.e., it matches a thread identification header associated with one or more existing threads), the one or more threads identified by the header may be added to a first set of candidate threads 140. If one or more related threads share a common thread identifier, there may be more than one candidate thread for the electronic message.
For example, Microsoft uses a proprietary electronic message format that includes a thread identification field called the thread index. The thread index identifies the current electronic message and each message in the thread preceding the current message. Electronic messages originating from Microsoft applications, such as Microsoft Exchange or Microsoft Outlook, may include a thread index. The chain of messages forming the thread may be parsed from the thread index and compared with each existing thread to identify the one or more candidate threads to which the message may belong. An existing thread that contains a message identified in the thread index of the electronic message is a potential candidate thread that may be added to the first set of candidate threads.
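As a rough illustration of how the chain of earlier messages can be read out of a thread index, the sketch below assumes the commonly documented layout of the `Thread-Index` value: a base64 string that decodes to a 22-byte root block followed by one 5-byte block per reply or forward. That layout is not stated in this document and should be checked against Microsoft's own specification; the function name `ancestor_indexes` is hypothetical.

```python
import base64

HEADER_LEN = 22   # assumed size of the root block
CHILD_LEN = 5     # assumed size of each reply/forward block

def ancestor_indexes(thread_index_b64):
    """Return the thread-index values of every ancestor, root first."""
    raw = base64.b64decode(thread_index_b64)
    if len(raw) < HEADER_LEN or (len(raw) - HEADER_LEN) % CHILD_LEN:
        raise ValueError("unexpected Thread-Index length")
    depth = (len(raw) - HEADER_LEN) // CHILD_LEN
    # Dropping trailing child blocks yields the index of each earlier message.
    return [
        base64.b64encode(raw[:HEADER_LEN + CHILD_LEN * k]).decode("ascii")
        for k in range(depth)
    ]
```

Each returned value can be looked up among the thread-index values already stored for existing threads to find candidate threads.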
As another example, a message in RFC 2822 format may contain one or more header fields that are thread identification headers identifying the parent message(s) to which the electronic message is a reply. Examples of thread identification headers in RFC 2822 formatted messages include, but are not limited to, the "In-Reply-To" and "References" header fields. Each electronic message in RFC 2822 format contains a unique message ID. The In-Reply-To header contains the message ID of the parent message (the message being replied to). Similarly, the References header field may contain the message IDs of the other electronic messages forming the discussion. In other words, the References field may contain the message IDs of one or more electronic messages that precede the current electronic message in the thread.
By comparing the message IDs found in the In-Reply-To and References fields with the message ID fields of the messages forming existing threads, one or more candidate threads 140 to which the electronic message may belong can be identified. These candidate threads may be added to the first set of candidate threads.
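The message-ID comparison can be sketched as a set intersection. The function name `candidates_by_reference` and the in-memory dictionary standing in for the database of existing threads are assumptions of this example.

```python
def candidates_by_reference(in_reply_to, references, thread_message_ids):
    """Return the ids of existing threads that contain any cited message ID.

    thread_message_ids maps a thread id to the set of Message-IDs in that
    thread (a stand-in for the database of existing threads).
    """
    cited = set(references)
    if in_reply_to:
        cited.add(in_reply_to)
    return {tid for tid, ids in thread_message_ids.items() if cited & ids}
```

An empty result means no thread identification header matched, in which case the subject, date and contact comparison can be used instead.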
First set of candidate thread can reduce from the subclass of clue that has first set of same correspondents with electronic information by discerning 155.As above discuss, clue can be restricted to and comprise that those are as the answer of another message or forwarding and comprise same contact person's message.Have the answer of different linkman sets or transmit and to be classified as related thread.Therefore, the clue that has a same correspondents with electronic information is the clue that message may belong to.
Should be noted that not every electronic information all will comprise clue identification header fields.Many these clue identification header fields are optional header fields, and many electronic message programs are not included in each electronic information.Yet as described herein, the present invention can use other header fields of resolving from electronic information, the potential candidate thread that may belong to identification message.By handling electronic information and not having the message of thread identification header with thread identification header, and can handle the electronic information of clue identification header, present invention is capable of grouping electronic messages of differing formats in the suitable clue with variation or mixed format.
If do not find the clue 130 of sharing common clue sign header with message, the identification header 120 if perhaps do not pick up scent in electronic information, then the additional header field of electronic information can be used for the one or more existing clue that identification message may belong to.In one embodiment, the title of electronic information, date and contact person can compare with title, date range and the contact person of each existing clue, to discern one or more potential candidate thread.
In one embodiment, second set of candidate thread can be identified 150, and it has the title identical with electronic information, and to it near the date of the same time electronic information taking place with the one or more electronic informations that form clue.In one embodiment, the title of electronic information can come standardization by removing any designator, and this designator Indication message is the response of previous message or from previous forwards.For example, many e-mail programs add the title of the message that has been forwarded or has answered respectively to " Re: " and " Fw: ".Standardized title can be compared with the common title of the message in the existing clue in being stored in database.
In one embodiment, the date range of clue limit in the clue the earliest message and clue in time durations between the message that takes place the latest.In one embodiment, if the date of electronic information falls in the date range of clue, or fall in the clue oldest message time quantum (space before) before, or fall in the time quantum (back at interval) after the message that clue takes place the latest, if it shares the title identical with electronic information, clue can be added to second set of candidate thread so.Space before can be and back identical at interval time quantum or different time quantums.Person of skill in the art will appreciate that space before and back can be depending on application at interval and change.Space before and back can be fixed at interval in one embodiment, and space before at interval can be with other characteristic dynamic adjustments of clue size or clue with the back in another embodiment.For example, space before and/or back at interval for the clue that comprises smallest number message can be compared to have during long-time on the big clue of a large amount of message of diffusion little.
Fig. 3 A is the expression for the timeline of two clues 310 and 320.Solid box is represented the date range of clue.The date of first message is represented by the left end of solid box in the clue, and in the clue date of stop press represent by the right-hand member of solid box.The dashed extensions of solid box is represented space before and back respectively at interval.Space before and back solve at interval as the clue part but than the Zao or late message of current message that forms clue, limit the outer time quantum of current clue date range simultaneously, and the message that has same title betwixt can be considered to the part clue.
For this example, assume that the electronic message currently being analyzed has the date represented by line A in Fig. 3A. Also assume that the subject of the electronic message is the same as the subject of threads 310 and 320. Because the subject matches thread 310 and the message falls within the leading interval before the first message of thread 310, thread 310 may be added to the second set of candidate threads. However, because the message does not fall within the leading interval before the first message of thread 320, that thread is not a candidate thread for the electronic message, even though it shares the same subject.
The first or second set of candidate threads may be reduced by identifying 155 the subset of threads that have the same set of correspondents as the electronic message. As discussed above, a thread may be defined to include those messages that are replies to, or forwards of, another message and that include the same correspondents. Replies and forwards with a different set of correspondents may be classified as related threads. Therefore, the threads that have the same correspondents as the electronic message are the threads to which the message may belong.
If a thread is found 170 in the subset of the first or second set of threads in block 155, the identified thread is a candidate thread to which the electronic message may belong. In most cases only a single thread will be found, and it is the thread to which the electronic message belongs. When a single thread is identified, the database may be updated 190 to add the electronic message to the database and to include the electronic message as part of the identified thread.
In some cases, more than one thread may be found 180 in block 155. This may occur when the electronic message is a bridge between two existing threads, bringing the two threads together into a single thread. In this case, the two threads may be merged 185 into a single thread.
For example, two clues 330 of Fig. 3 B diagram and 340, it has by amount interval time after the message last before first message in each clue of dashed extensions identification and in the clue.Example supposition clue 330 is shared identical title and identical contact person with 340 with the electronic information with date of being represented by dotted line B for this reason.In this situation, two clues will be identified in frame 150 and 155.Reason is that two clues are similar parts of same thread.Yet, they have been classified as different clues up to this point, because the stop press of first message of clue 340 and 330 is less than taking place in measuring the back interval time after 330 stop press or in the space before time quantum before first message of clue 340.Therefore, they can not be in being grouped together into single clue before.Along with time B (its before falling into the beginning of clue 340 the space before time quantum and or fall into clue 330 after back interval time of amount) introducing of electronic information of generation, two clues can be merged in the single clue together, and database can be updated the clue that merges with reflection.
Returning to decision block 170, if no thread is found 170 in the subset of the first or second set of candidate threads, the electronic message is part of a new thread 175. In this case, the threads identified in blocks 140 and 150 may be classified 195 as related threads. These threads share the thread-identification header, or share the same subject and date range, with the electronic message. Further analysis may be performed to classify the related threads further. In one embodiment, the correspondents of the electronic message may be compared with the correspondents of each related thread to determine whether the new thread may be classified as a reduced, expanded, private, or overlapping thread with respect to each related thread.
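The correspondent-based classification can be illustrated with plain set comparisons. The sketch below follows the definitions of reduced, expanded, and overlapping threads given in claims 17 to 19; the function name and the returned labels are assumptions, the remaining category named above is not modelled because this passage does not define it, and the "disjoint" and "same correspondents" labels are added only so that the function is total.

```python
def classify_related(new_correspondents, related_correspondents) -> str:
    """Classify a new thread relative to a related thread by correspondents."""
    new = frozenset(new_correspondents)
    related = frozenset(related_correspondents)
    if new == related:
        return "same correspondents"  # would have joined the thread instead
    if new < related:
        return "reduced"      # proper subset: fewer correspondents
    if new > related:
        return "expanded"     # all of the related set plus at least one more
    if new & related:
        return "overlapping"  # some shared, plus at least one new
    return "disjoint"         # no correspondents in common
```

For the worked example later in this description, a message sent to A and C that branches from a thread among A, B, and C yields "reduced".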
Alternatively, the message type or other attributes of the electronic message may be compared with each related thread to determine its relationship to the related thread. For example, if the electronic message is a forward of a previous message of a related thread, the new thread may be classified as a forwarded thread. If the forwarded thread includes no correspondents in common with the message from which it was forwarded, the forwarded thread may be further classified as an initiating thread. As another example, if the electronic message has a subject that was revised from the subject of the related thread in a way other than a change to a simple prefix, capitalization, white space, and/or punctuation, the new thread may be classified as a revised thread.
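The revised-subject test above can be approximated by reducing both subjects to a canonical form that ignores exactly the changes the text excludes (simple prefixes, capitalization, white space, and punctuation) and then comparing the results. The helper names, the prefix list, and the use of ASCII punctuation from `string.punctuation` are assumptions made for this sketch.

```python
import re
import string

# Hypothetical prefix list; extend it for other mail clients and languages.
_PREFIXES = re.compile(r"^\s*(?:(?:re|fw|fwd)\s*:\s*)+", re.IGNORECASE)
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def canonical(subject: str) -> str:
    """Drop reply/forward prefixes, punctuation, case, and extra whitespace."""
    text = _PREFIXES.sub("", subject)
    text = text.translate(_STRIP_PUNCT).lower()
    return " ".join(text.split())


def is_revised_subject(message_subject: str, thread_subject: str) -> bool:
    """True if the subjects differ by more than the ignored changes."""
    return canonical(message_subject) != canonical(thread_subject)
```

Under this sketch, "RE: budget review!" is not a revision of "Budget Review", whereas "Re: Budget review (Q2 numbers)" is.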
As another example, if the topic of the electronic message differs from the topic of the thread from which it branched, the new thread may be classified as a modified thread. As discussed above, the content or topic of the electronic message may be compared with the content or topic discussed in one or more messages of the related thread. If the topic or content of the electronic message differs from the content or topic of the messages forming the related thread, the new thread may be classified as a modified thread with respect to the related thread.
It should be noted that the invention is not limited to the thread definitions above. Those skilled in the art will recognize that other thread definitions and classifications are possible and are considered within the scope of the invention. It should also be noted that the invention can group messages received out of order into the appropriate threads.
The database may be updated 190 with the new thread, and may be updated to include the classifications of the various threads and/or the relationships between the new thread and each related thread found in block 195.
Fig. 4 illustrates three threads in graphical form: thread 1, thread 2, and thread 3. As shown, thread 1 includes three messages and has a set of correspondents that includes A, B, and C and the subject "Entry". Thread 1 has no thread ID or any other thread-identification header. Thread 2 includes two messages and has a set of correspondents that includes A, B, and D and the subject "Meeting". Thread 3 includes a single message and has a set of correspondents that includes A, B, and D and the subject "Meeting". Thread 3 also has a thread-identification header indicating that it belongs to the thread with thread ID 12.
Fig. 5 illustrates a table of three messages A, B, and C, with the header fields date, subject, correspondents, and thread ID parsed from each message. Using the threads illustrated in Fig. 4 as the existing threads stored in the database, the messages listed in Table 1 may be analyzed, grouped, and classified as described in the flowchart illustrated in Fig. 2. Fig. 6 illustrates the threads that result after the messages from Table 1 have been grouped into the appropriate threads.
As shown in Table 1, message A has no thread-identification header. As a result, the subject and date of message A may be analyzed to identify 150 one or more candidate threads to which the message may belong. For this example, assume that the leading and trailing intervals are each defined as 4 days. Because the date of message A, February 1, 2005, falls within 4 days of the first message of threads 1 and 3, they may be potential candidate threads. However, when the subject of message A is compared with the subjects of threads 1 and 3, the subject of message A is found not to match the subject of thread 3. Therefore, thread 1 may be added to the second set of candidate threads. Note that no candidate threads were found in block 140, because no thread-identification header was found in message A.
Proceeding to block 155, the correspondents of message A are compared with the correspondents of each thread in the second set to identify the subset of threads that have the same correspondents as the message. In this case, thread 1 is the only thread in the second set that has the same correspondents as message A. Because only one candidate thread is found 180, the database may be updated 190 to include message A and to add message A to thread 1. Fig. 6 illustrates thread 1 with message A added.
Returning to Table 1, message B likewise has no thread-identification header. Therefore, the subject and date of message B are analyzed to identify 150 one or more candidate threads to which the message may belong. Because the date of message B falls within the date range of thread 1, within 4 days after the last message of thread 2, and within 4 days of the first message of thread 3, all three threads may be potential candidate threads. However, when the subject of message B is compared with the subjects of the three threads, thread 1 is found to be the only thread that shares the same subject as message B. Therefore, thread 1 may be added to the second set of candidate threads.
Proceeding to block 155, the correspondents of message B are compared with the correspondents of each thread in the second set to identify the subset of threads that have the same correspondents as message B. In this case, thread 1 is the only thread in the second set. However, thread 1 includes correspondents A, B, and C, whereas message B includes only correspondents A and C.
Because no thread in the second set has the same correspondents as message B, message B begins a new thread 175. Proceeding to block 195, thread 1 was identified in block 150. Therefore, the new thread begun by message B is a related thread with respect to thread 1. Comparing the correspondents of message B with those of thread 1 again shows that message B includes fewer correspondents than thread 1. Therefore, the new thread begun by message B may be classified as a reduced thread with respect to thread 1. The database may be updated 190 to add message B and the new thread begun by message B. In addition, the database may be updated to include the classification of the new thread as a reduced thread with respect to thread 1.
Fig. 6 illustrates thread 1 and the new thread begun by message B, labeled thread 4. As shown, message B branches from thread 1 at message X. Note that message B is connected to message X of thread 1 by a dashed line to mark message B as a related thread, in this case a reduced thread. Note also that a different visual representation may be used for each classification of related thread, to distinguish the different types of related threads.
Message C also has no thread-identification header. Therefore, the subject and date of message C may be analyzed to identify 150 one or more candidate threads to which the message may belong. Because the date of message C falls within the date range of thread 1, within 4 days after the last message of thread 2, and within 4 days of the first message of thread 3, all three threads may be potential candidate threads. However, when the subject of message C is compared with the subjects of the three threads, only threads 2 and 3 are found to share the same subject as message C. Therefore, threads 2 and 3 may be added to the second set of candidate threads.
Proceeding to block 155, the correspondents of message C are compared with the correspondents of each thread in the second set to identify the subset of threads that have the same correspondents as message C. In this case, thread 2 and thread 3 both share the same correspondents as message C. Therefore, threads 2 and 3 are identified as the subset.
Because more than one thread is found 180 in block 155, a special case has been encountered in which message C provides a bridge between two threads previously considered to be independent. Threads 2 and 3 may be merged 185, and message C added to the merged thread. The database may be updated 190 to include message C and to update the threads so that thread 2, thread 3, and message C are combined into a single thread. Fig. 6 illustrates the merged thread.
It is important to note that thread 3 has thread ID 12 while thread 2 has no thread ID, even though the two threads are parts of the same thread. The reason is that the messages analyzed and classified together may originate from multiple user accounts, which may use different e-mail formats. For example, the message forming thread 3 may have originated from user A, who uses an electronic messaging application that applies a thread-identification header to each message. Users B and D, however, may use e-mail applications that do not add a thread-identification header to messages. It is important to note that the invention groups these messages into the same thread regardless of differences in the header fields parsed from each message.
In an alternative embodiment of the invention, once messages have been grouped into threads, further analysis is performed to determine whether the topic discussed in a message matches the topic discussed in the remaining messages forming the thread. In one embodiment, the content or topic discussed in an electronic message may be compared with the content or topic discussed in one or more other messages in its thread. If the topic or content of the electronic message differs from the topic or content of the remaining message or messages forming the thread, the electronic message may be classified as a modified thread. Again, this may be useful to an investigator reviewing message threads.
In one embodiment, keywords and/or phrases may be extracted from each message forming the thread. These keywords and phrases, and/or terms similar to them, may be compared with the keywords and/or phrases parsed from the electronic message to determine the overlap in the keywords used in the message bodies. A high overlap indicates that the content of the messages is similar. A low overlap indicates that the electronic message may be classified as a modified thread, because its content differs from the remaining content of the thread. Those skilled in the art will recognize that many other content-similarity tests exist that may be applied to the messages forming a thread to determine how similar the content of a message is to the content of the remaining messages in the thread.
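One simple way to measure the keyword overlap described above is the fraction of a message's keywords that also appear elsewhere in its thread. The sketch below is only one of the many possible similarity tests mentioned in the text; the stop-word list, the tokenizing pattern, the function names, and the 0.3 threshold are all assumptions chosen for illustration.

```python
import re

# Hypothetical stop-word list; a real system would use a fuller one.
STOPWORDS = frozenset(
    {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on"}
)


def keywords(text: str) -> set:
    """Lower-case word tokens with common stop words removed."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def overlap_fraction(message_text: str, thread_texts) -> float:
    """Share of the message's keywords that occur in the thread's messages."""
    message_words = keywords(message_text)
    if not message_words:
        return 0.0
    thread_words = set()
    for text in thread_texts:
        thread_words |= keywords(text)
    return len(message_words & thread_words) / len(message_words)


def looks_modified(message_text: str, thread_texts, threshold: float = 0.3) -> bool:
    """Flag the message as a possible modified thread when overlap is low."""
    return overlap_fraction(message_text, thread_texts) < threshold
```

A phrase-based or stemmed comparison would catch similar terms that this exact-match sketch misses.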
The systems, methods, and apparatuses described above may process electronic messages received from one or more user accounts in a number of ways. For example, messages may be analyzed sequentially and grouped into threads as they are received in real time. As another example, messages may be analyzed in batches, after which the threads from each batch are grouped to determine the various threads. Alternatively, messages may be analyzed sequentially as they are received from an electronic message server or other storage medium. In these examples, messages may be analyzed and grouped out of order. As a result, the threads and the classifications associated with each thread may change over time. Thus, for example, threads previously considered distinct may need to be merged because of date-range considerations or other factors. Those skilled in the art will recognize that there are many ways to analyze and group messages into threads using the invention as described above. All such processes are considered within the scope of the invention.
Although the invention has been described with reference to certain embodiments, those skilled in the art will recognize that various modifications may be provided. For example, although the invention is described generally in terms of electronic messages, those skilled in the art will recognize that it may be applied to e-mail messages, instant messages (IM), Short Message Service (SMS) messages, voice messages, video messages, and the like. In addition, although various embodiments of the invention are described for an application that organizes electronic messages from multiple mailboxes, many of the features discussed above may also be used in a single-mailbox environment. The invention provides for variations and modifications of the embodiments, and the invention is limited only by the claims.

Claims (25)

1. A method for grouping an electronic message into one of a plurality of threads, the method comprising:
parsing the electronic message to identify a subject, a date, and a set of correspondents of the electronic message;
identifying a first subset of a plurality of existing threads, each thread of the first subset having:
a subject that matches the subject of the electronic message; and
a date range within which the date of the electronic message falls;
identifying a second subset of threads, the second subset comprising those threads of the first subset that include the same correspondents as the set of correspondents of the electronic message; and
in response to the second subset comprising a single thread, adding the electronic message to that thread.
2. The method of claim 1, wherein, in response to the second subset comprising a plurality of threads, the plurality of threads are merged into a single thread and the electronic message is added to the merged thread.
3. The method of claim 1, wherein, in response to the second subset comprising no threads, a new thread is created, the electronic message forming the new thread.
4. The method of claim 3, wherein the new thread is classified as a related thread with respect to each thread identified in the first subset.
5. The method of claim 4, wherein the new thread is classified as a reduced thread with respect to a thread of the first subset that includes more correspondents than the set of correspondents of the electronic message.
6. The method of claim 4, wherein the new thread is classified as an expanded thread with respect to a thread of the first subset that includes a subset of the correspondents found in the set of correspondents of the electronic message.
7. The method of claim 1, wherein the subject parsed from the electronic message is normalized.
8. The method of claim 1, wherein the date range includes an interval of time before the first message of the thread.
9. The method of claim 1, wherein the date range includes an interval of time after the last message of the thread.
10. A method for grouping an electronic message into one of a plurality of threads, the method comprising:
parsing the electronic message to identify a thread-identification header and a set of correspondents;
identifying a first subset of a plurality of existing threads that match the thread-identification header;
identifying a second subset of threads, the second subset comprising those threads of the first subset that include the same correspondents as the set of correspondents of the electronic message; and
in response to the second subset comprising a single thread, adding the electronic message to that thread.
11. The method of claim 10, wherein, in response to the second subset comprising no threads, a new thread is created, the electronic message forming the new thread.
12. The method of claim 11, wherein the new thread is classified as a related thread with respect to each thread identified in the first subset.
13. The method of claim 12, wherein the new thread is classified as a reduced thread with respect to a thread of the first subset that includes more correspondents than the set of correspondents of the electronic message.
14. The method of claim 12, wherein the new thread is classified as an expanded thread with respect to a thread of the first subset that includes a subset of the correspondents found in the set of correspondents of the electronic message.
15. The method of claim 10, wherein the thread-identification header identifies one or more messages that precede the electronic message in the thread.
16. A method for classifying a first thread of a group of threads with respect to a second thread of the group of threads, the method comprising:
comparing a first set of correspondents of the first thread with a second set of correspondents of the second thread; and
if the first set of correspondents of the first thread differs from the second set of correspondents of the second thread, classifying the first thread as a related thread with respect to the second thread.
17. The method of claim 16, wherein, if the first set of correspondents is a subset of the second set of correspondents, the first thread is a reduced thread.
18. The method of claim 16, wherein, if the first set includes all of the correspondents of the second set plus at least one additional correspondent, the first thread is an expanded thread.
19. The method of claim 16, wherein, if the first set includes a subset of the second set plus at least one additional correspondent not found in the second set, the first thread is an overlapping thread.
20. A method for classifying an electronic message with respect to a first thread to which the electronic message belongs, the method comprising:
comparing content parsed from the electronic message with content parsed from one or more messages forming the first thread; and
if the content of the electronic message differs from the content of the one or more messages forming the first thread, classifying the electronic message as a new thread.
21. The method of claim 20, wherein, if the subject of the electronic message differs from the subject of the one or more messages forming the first thread, the new thread is classified as a revised thread with respect to the first thread.
22. The method of claim 20, wherein, if the difference in content between the electronic message and the one or more messages forming the first thread indicates that the topic discussed in the electronic message differs from the topic of the first thread, the new thread is classified as a modified thread with respect to the first thread.
23. The method of claim 20, wherein, if the electronic message is a forward of one of the messages forming the first thread, the new thread is classified as a forwarded thread with respect to the first thread.
24. A method for grouping an electronic message into one of a plurality of existing threads, the method comprising:
parsing the electronic message to identify one or more header fields;
in response to identifying a thread-identification header:
comparing the thread-identification header with each of the plurality of existing threads to identify a first set of the existing threads that match the thread-identification header;
comparing the set of correspondents of the electronic message with the set of correspondents of each thread of the first set to identify a thread having the same set of correspondents; and
adding the electronic message to the thread having the same set of correspondents; and
in response to not identifying a thread-identification header:
comparing the subject and date of the electronic message with the subject and date range of each of the plurality of existing threads to identify a second set of the existing threads to which the electronic message may belong;
comparing the set of correspondents of the electronic message with the set of correspondents of each thread of the second set to identify a thread having the same set of correspondents; and
adding the electronic message to the thread having the same set of correspondents.
25. The method of claim 24, wherein the thread-identification header identifies one or more messages that precede the electronic message in the thread.
CN2006800114025A 2005-02-01 2006-01-31 Thread identification and classification Expired - Fee Related CN101208686B (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US64939505P 2005-02-01 2005-02-01
US60/649,395 2005-02-01
US11/128,935 2005-05-12
US11/128,935 US8055715B2 (en) 2005-02-01 2005-05-12 Thread identification and classification
PCT/US2006/003332 WO2006083820A2 (en) 2005-02-01 2006-01-31 Thread identification and classification

Publications (2)

Publication Number Publication Date
CN101208686A CN101208686A (en) 2008-06-25
CN101208686B true CN101208686B (en) 2010-09-29

Family

ID=39334893

Family Applications (2)

Application Number Title Priority Date Filing Date
CNA2006800114006A Pending CN101167077A (en) 2005-02-01 2006-01-30 Electronic communication analysis and visualization
CN2006800114025A Expired - Fee Related CN101208686B (en) 2005-02-01 2006-01-31 Thread identification and classification

Country Status (1)

Country Link
CN (2) CN101167077A (en)

Also Published As

Publication number Publication date
CN101208686A (en) 2008-06-25
CN101167077A (en) 2008-04-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20160412

Address after: Zurich

Patentee after: K-boom Knight Limited by Share Ltd.

Address before: Delaware

Patentee before: EV Ott

Effective date of registration: 20160412

Address after: Delaware

Patentee after: EV Ott

Address before: California, USA

Patentee before: Metalincs Corp.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100929

Termination date: 20180131