CN101119326B - Method and device for managing instant communication conversation record - Google Patents

Method and device for managing instant communication conversation record

Info

Publication number
CN101119326B
Authority
CN
China
Prior art keywords
conversation recording
session
conversation
session theme
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2006101095396A
Other languages
Chinese (zh)
Other versions
CN101119326A (en)
Inventor
石燕伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN2006101095396A
Publication of CN101119326A
Application granted
Publication of CN101119326B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a method for managing instant messaging conversation records, which addresses the prior-art problem that querying information in conversation records is both cumbersome and inefficient for instant messaging users. The method comprises the following steps: obtaining a user's conversation records and classifying them to obtain sample sets; performing correlation analysis on each sample set to obtain corresponding classification groups, each containing the feature vectors that correspond to conversation records in the sample set; determining the conversation topic of each classification group according to the frequency with which words occur in that group, and associating the conversation topic with the conversation records corresponding to the group; and searching for the conversation topic that matches a keyword entered by the user, and presenting the conversation records associated with that topic to the user. The present invention also discloses a device for managing instant messaging conversation records.

Description

Method and device for managing instant messaging conversation records
Technical Field
The present invention relates to the fields of communications and computer technology, and in particular to a method and device for managing instant messaging conversation records.
Background Art
With the continuing development and spread of instant messaging (IM) technology, more and more users use IM software not only to chat with other users on the network but also as a tool for consulting other users about problems encountered at work or in study. The conversation records produced by these exchanges are saved in the IM system, where they serve as data for users who later want to look up information of interest.
For example, user A consults user B about a problem and user B replies with an answer. When user C later asks user A or user B about the same problem, user A must look through the conversation records with user B (or user B through those with user A) and find the relevant entries manually. When there are many conversation records, or when a long time has passed between user A's consultation and user C's, this prior-art approach increases the manual search workload and makes searching inefficient.
Suppose user A consults several users about the same problem and later wants to query information across the conversation records with all of them. With the prior art, if the instant messaging system in use only offers a function for viewing conversation records, user A can only check the records of each user manually, one by one, to find the information of interest. Even if user A uses an instant messaging system that offers import and export of conversation record data, user A must first export the conversation record data of the several users and then search the exported data. User A can search the exported data by a keyword related to the information of interest, but a keyword search only locates paragraphs that contain the keyword, and such a paragraph is not necessarily related to the information the user cares about. The user therefore cannot search conversation records effectively.
Summary of the Invention
The invention provides a method and device for managing instant messaging conversation records, in order to solve the prior-art problem that querying information in conversation records is both cumbersome and inefficient for instant messaging users.
The invention provides the following technical solution:
A method for managing instant messaging conversation records, comprising the following steps:
obtaining a user's conversation records and classifying them to obtain sample sets;
generating a feature vector for each conversation record in a sample set, analyzing the correlation between each feature vector and the other feature vectors, and classifying the feature vectors according to the correlation to generate classification groups;
determining the conversation topic of each classification group according to the frequency with which words occur in that group, and associating the conversation topic with the conversation records corresponding to the group; and
looking up the conversation topic that matches a keyword entered by the user when querying, and presenting the conversation records associated with the found conversation topic to the user.
After the conversation topics are generated, the correlation between conversation topics is further analyzed, conversation topics whose correlation exceeds a predetermined threshold are merged into a single conversation topic, and the merged conversation topic is associated with the conversation records of all the conversation topics that were merged.
The conversation records are classified by conversation participant to generate the sample sets.
Preferably, a sample set is further divided into several sample sets according to the time interval between the conversation records in it.
Generating a feature vector for each conversation record in the sample set and analyzing the correlation between each feature vector and the other feature vectors specifically comprises:
performing word segmentation on each conversation record, deleting the words with no practical meaning to obtain a set S, merging the synonyms in S, and vectorizing the result to generate the feature vector $\vec{d} = (W_1, W_2, W_3, \ldots, W_n)$ corresponding to the conversation record, where $W_i$ is the weight of the i-th element and each element is a word in S;
calculating the weight of each word in the feature vector $\vec{d}$ of each conversation record, and calculating the correlation between feature vectors from the weights of the words that make up each feature vector.
The conversation topic of a classification group is determined from the words whose frequency of occurrence in the group exceeds a predetermined threshold.
A device for managing instant messaging conversation records, comprising:
a unit for storing users' conversation records;
a unit for classifying the conversation records to generate sample sets;
a unit for generating a feature vector for each conversation record in a sample set, analyzing the correlation between each feature vector and the other feature vectors, and classifying the feature vectors according to the correlation to generate classification groups;
a unit for determining the conversation topic of each classification group and associating the conversation topic with the conversation records corresponding to the group; and
a unit for looking up the conversation topic that matches a keyword entered by the user when querying, and presenting the conversation records associated with the found conversation topic to the user.
Preferably, the device further comprises:
a unit for analyzing the correlation between conversation topics, merging conversation topics whose correlation exceeds a predetermined threshold into a single conversation topic, and associating the merged conversation topic with the conversation records of all the conversation topics that were merged.
The beneficial effects of the invention are as follows:
The invention classifies users' conversation records to generate sample sets, performs correlation analysis on each sample set to generate classification groups, determines the conversation topic of each classification group, and associates the conversation topic with the conversation records corresponding to the group. With the invention, a user who needs to query information in conversation records only has to enter a keyword; the system automatically looks up the conversation topic that matches the keyword and presents the associated conversation records to the user. This spares the user the tedious work of searching manually and improves search efficiency.
Brief Description of the Drawings
Fig. 1 is a schematic structural diagram of the device for managing user conversation records in an embodiment of the invention;
Fig. 2 is a schematic diagram of the method for managing user conversation records in an embodiment of the invention;
Fig. 3 is a flowchart of classifying user conversation records in an embodiment of the invention;
Fig. 4 is a flowchart of performing correlation analysis on a sample set in an embodiment of the invention.
Detailed Description of the Embodiments
To solve the prior-art problem that querying information in conversation records is both cumbersome and inefficient for instant messaging users, this embodiment classifies users' conversation records to generate sample sets, performs correlation analysis on each sample set to generate classification groups, determines the conversation topic of each classification group, and associates the conversation topic with the conversation records corresponding to the group. It then looks up the conversation topic that matches a keyword entered by the user and presents the conversation records associated with the found topic to the user.
Referring to Fig. 1, the device for managing user conversation records in this embodiment comprises a storage unit 101, a classification unit 102, an analysis unit 103, a conversation topic unit 104, a merging unit 105 and a query unit 106.
The storage unit 101 stores users' conversation records and conversation topics. The classification unit 102 obtains conversation records and classifies them to obtain sample sets. The analysis unit 103 performs correlation analysis on the sample sets to generate the classification groups of each sample set. The conversation topic unit 104 determines the conversation topic of each classification group and associates the topic with the conversation records corresponding to the group. The merging unit 105 analyzes the correlation between conversation topics, merges conversation topics whose correlation exceeds a predetermined threshold into a single conversation topic, and associates the merged topic with the conversation records of all the topics that were merged. The query unit 106 receives the keyword entered by the user when querying information in conversation records, looks up the conversation topic that matches the keyword, and presents the conversation records associated with the found topic to the user.
Referring to Fig. 2, the method for managing user conversation records in this embodiment comprises:
Step 201: Obtain the user's conversation records and classify them to obtain sample sets.
Step 202: Perform correlation analysis on the generated sample sets to generate the corresponding classification groups.
Step 203: Determine the conversation topic of each classification group according to the frequency with which words occur in that group, and associate the conversation topic with the conversation records corresponding to the group.
Step 204: Analyze the correlation between conversation topics, merge conversation topics whose correlation exceeds a predetermined threshold into a single conversation topic, and associate the merged topic with the conversation records of all the topics that were merged.
Step 205: When the user queries information in the conversation records, look up the conversation topic that matches the keyword entered by the user, and present the conversation records associated with the found topic to the user.
The process of classifying conversation records in step 201 is shown in Fig. 3 and proceeds as follows:
Step 301: Determine whether a conversation record has already been classified. If it has, do not process it further; otherwise, execute step 302.
Step 302: Classify the conversation records that have not yet been classified according to the users involved. For example, determine whether conversation record TR_i and conversation record TR_j belong to conversations between the same users. If TR_i and TR_j belong to conversations between different users, place them in different sample sets TS; if they belong to conversations between the same users, place them in the same sample set.
Step 303: Further divide each sample set into different sample sets according to the time interval between the conversation records in it. The time interval can be set according to the application, for example to one week.
The sample sets TS produced by step 303 are the sample sets on which correlation analysis is performed.
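Steps 302 and 303 can be illustrated with a minimal Python sketch. The record layout (a tuple of participants, timestamp and text), the function name `build_sample_sets` and the one-week default are illustrative assumptions, not details taken from the patent.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def build_sample_sets(records, max_gap=timedelta(weeks=1)):
    """Group records by participant pair, then split each group on long gaps.

    `records` is an iterable of (participants, timestamp, text) tuples; this
    layout and the `max_gap` default are assumptions made for illustration.
    """
    by_users = defaultdict(list)
    for participants, timestamp, text in records:
        # Step 302: records between the same users share a sample set.
        by_users[frozenset(participants)].append((timestamp, text))

    sample_sets = []
    for group in by_users.values():
        group.sort(key=lambda item: item[0])
        current = [group[0]]
        for previous, item in zip(group, group[1:]):
            # Step 303: a gap longer than max_gap starts a new sample set.
            if item[0] - previous[0] > max_gap:
                sample_sets.append(current)
                current = []
            current.append(item)
        sample_sets.append(current)
    return sample_sets
```

Using a `frozenset` of participants means that a record from A to B and one from B to A fall into the same group, which is one plausible reading of "conversations between the same users".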
Referring to Fig. 4, correlation analysis of a sample set using the KNN (K Nearest Neighbor) algorithm proceeds as follows:
Step 401: Generate a feature vector for each conversation record TR in the sample set TS. First, perform word segmentation on each conversation record TR and remove the words with no practical meaning, such as particles and interjections, to obtain a set S. Then merge the synonyms in S; for example, two different words that both mean "computer" are treated as a single term. Finally, vectorize the synonym-merged set S of each conversation record to generate the feature vector $\vec{d} = (W_1, W_2, W_3, \ldots, W_n)$, where $W_i$ is the weight of the i-th element and each element is a word in S.
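The preprocessing in step 401 can be sketched as follows. Real Chinese text would need a proper word segmenter; this sketch substitutes whitespace splitting, and the stop-word list, the synonym table and the name `to_term_counts` are hypothetical.

```python
from collections import Counter

# Hypothetical stop-word list and synonym table, for illustration only.
STOP_WORDS = {"the", "a", "oh", "of", "is"}
SYNONYMS = {"pc": "computer", "laptop": "computer"}

def to_term_counts(text, stop_words=STOP_WORDS, synonyms=SYNONYMS):
    """Segment a record, drop filler words, merge synonyms, count terms."""
    tokens = text.lower().split()  # stand-in for real word segmentation
    kept = [t for t in tokens if t not in stop_words]
    merged = [synonyms.get(t, t) for t in kept]
    return Counter(merged)
```

The resulting term counts are the raw material from which the weights $W_i$ of the feature vector are computed in the next step.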
Step 402: Calculate the weight W of each element in the feature vector $\vec{d}$ of each conversation record TR, using the following formula:
$$W(t,\vec{d}) = \frac{tf(t,\vec{d}) \times \log(N/n_t + 0.01)}{\sqrt{\sum_{t \in \vec{d}} \left[ tf(t,\vec{d}) \times \log(N/n_t + 0.01) \right]^2}}$$
where $W(t,\vec{d})$ is the weight of word t in feature vector $\vec{d}$, $tf(t,\vec{d})$ is the frequency of word t in feature vector $\vec{d}$, N is the total number of conversation records TR in each sample set TS, $n_t$ is the number of conversation records TR in each sample set TS in which word t occurs, and the denominator is a normalization factor.
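A minimal sketch of the weight calculation in step 402 follows. It assumes that the normalization factor is the Euclidean norm of the unnormalized weights (the square root of the sum of squares), as in the usual TF-IDF formulation, and that each record is given as a mapping from term to term frequency; the function name `tfidf_weights` is illustrative.

```python
import math

def tfidf_weights(term_counts_per_record):
    """Return one {term: weight} dict per record, normalized to unit length."""
    n_records = len(term_counts_per_record)
    doc_freq = {}
    for counts in term_counts_per_record:
        for term in counts:
            doc_freq[term] = doc_freq.get(term, 0) + 1  # n_t

    weighted = []
    for counts in term_counts_per_record:
        raw = {
            term: tf * math.log(n_records / doc_freq[term] + 0.01)
            for term, tf in counts.items()
        }
        # Assumed normalization: Euclidean norm of the raw weights.
        norm = math.sqrt(sum(value * value for value in raw.values()))
        weighted.append({t: (v / norm if norm else 0.0) for t, v in raw.items()})
    return weighted
```

The 0.01 term keeps the logarithm positive even for a word that occurs in every record of the sample set, so such a word still receives a small nonzero weight.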
Step 403: Calculate the correlation coefficients between the feature vectors of the conversation records, and from the calculated coefficients determine the K feature vectors most similar to each feature vector.
In a specific implementation, the correlation coefficient between the feature vectors of the conversation records is calculated with the following formula:
$$Sim(d_i, d_j) = \frac{\sum_{k=1}^{M} W_{ik} \times W_{jk}}{\sqrt{\left(\sum_{k=1}^{M} W_{ik}^2\right)\left(\sum_{k=1}^{M} W_{jk}^2\right)}}$$
where $Sim(d_i, d_j)$ is the correlation coefficient between feature vector $d_i$ and feature vector $d_j$, and $W_{ik}$ and $W_{jk}$ are the weights of the k-th elements of $d_i$ and $d_j$ respectively.
The calculation yields the correlation coefficient between every pair of feature vectors. Based on these coefficients, the K feature vectors most correlated with each feature vector are combined into a set. The value of K can be chosen according to the application.
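Step 403 can be sketched as below, assuming the correlation coefficient is the cosine similarity (with a square root over the product of the squared norms) and that feature vectors are stored sparsely as term-to-weight mappings. The names `cosine` and `k_nearest`, and the tie-breaking by index, are illustrative choices.

```python
import math

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {term: weight}."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def k_nearest(vectors, k):
    """For each vector index, return the indices of its k most similar peers."""
    neighbours = []
    for i, u in enumerate(vectors):
        scored = [(cosine(u, v), j) for j, v in enumerate(vectors) if j != i]
        # Highest similarity first; ties broken by lower index.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        neighbours.append([j for _, j in scored[:k]])
    return neighbours
```

This brute-force version compares every pair of vectors, which is adequate for a sample set restricted to one pair of users and one time window.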
Step 404: Divide the feature vectors of the conversation records into the different classes of the classification C to generate classification groups.
The classification C is the set formed from the feature vectors corresponding to the conversation records in the sample set TS.
Method one: when the classification C is empty, a vector set c is generated as follows and then added to the classification C:
If the feature vectors of two conversation records each belong to the set formed by the other's K most similar neighbours, the two feature vectors belong to the same class c. The class c is generated and associated with the conversation records corresponding to the two feature vectors, and the class c is then added to the classification C. The feature vectors in each class c form one classification group.
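Method one can be sketched as a grouping of mutual nearest neighbours. The text only states that two vectors that appear in each other's K-neighbour sets share a class; chaining such pairs transitively into one class with a union-find structure is an assumption of this sketch, as is the name `mutual_knn_classes`.

```python
def mutual_knn_classes(neighbours):
    """Group indices whose k-nearest-neighbour lists contain each other.

    `neighbours[i]` lists the k nearest indices of item i. Chaining mutual
    pairs into one class (union-find) is an assumption of this sketch.
    """
    parent = list(range(len(neighbours)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, near in enumerate(neighbours):
        for j in near:
            if i in neighbours[j]:  # i and j are mutual neighbours
                parent[find(i)] = find(j)

    classes = {}
    for i in range(len(neighbours)):
        classes.setdefault(find(i), []).append(i)
    return sorted(classes.values())
```

A vector with no mutual neighbour ends up in a class of its own, which matches the later remark that every feature vector is eventually assigned to some class.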
Method two: when the classification C is not empty, the weight with which the feature vector $\vec{x}$ of each conversation record belongs to a class c (c ∈ C) is calculated with the following formula:
$$p(\vec{x}, C_j) = \sum_{\vec{d}_i \in KNN} Sim(\vec{x}, \vec{d}_i)\, y(\vec{d}_i, C_j)$$
where $\vec{x}$ is the feature vector of a conversation record; $\vec{d}_i$ is a feature vector in the set formed by the K neighbours most similar to $\vec{x}$; $Sim(\vec{x}, \vec{d}_i)$ is the correlation coefficient between $\vec{x}$ and its neighbour $\vec{d}_i$, which can be obtained from the result of step 403; and $y(\vec{d}_i, C_j)$ is the class attribute function, whose value is 1 if the feature vector $\vec{d}_i$ belongs to class $C_j$ and 0 otherwise. The calculated values $p(\vec{x}, C_j)$ are used to compare the weights of the feature vector $\vec{x}$ in each class $C_j$; $\vec{x}$ is assigned to the class $C_j$ with the largest weight, and that class is associated with the conversation record corresponding to $\vec{x}$.
When method two is used, if the correlation between the feature vector $\vec{x}$ and every existing class c is very small, a new class c' can be generated in the manner of method one, added to the classification C, and associated with the conversation record corresponding to $\vec{x}$.
After every feature vector has been processed, all feature vectors have been assigned to classes, and each class forms one classification group.
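The weighted vote of method two, together with the fallback to a new class, can be sketched as follows. The inputs are assumed to be the similarities of the K nearest neighbours and the classes those neighbours already belong to; the `min_score` cutoff that triggers a new class is a hypothetical parameter, since the text only says the correlation is "very small".

```python
def assign_class(similarities, labels, min_score=0.0):
    """Pick the class with the largest summed neighbour similarity.

    `similarities` maps neighbour id -> Sim(x, d_i) for the K nearest
    neighbours of x; `labels` maps neighbour id -> class id. Returns None
    when no class scores above `min_score`, signalling that a new class
    should be created. The `min_score` cutoff is a hypothetical parameter.
    """
    scores = {}
    for neighbour, sim in similarities.items():
        label = labels.get(neighbour)
        if label is not None:  # y(d_i, C_j) is 1 only for the neighbour's class
            scores[label] = scores.get(label, 0.0) + sim
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > min_score else None
```

Summing similarities rather than counting neighbours means that one very close neighbour can outweigh several distant ones.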
The N words with the highest frequency of occurrence in each generated classification group, or the words whose frequency exceeds a, are taken as the conversation topic of that group. The values of N and a are chosen according to the application.
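Both ways of choosing topic words can be sketched in a few lines. The function name `conversation_topic` and the parameters `top_n` and `min_freq` (standing in for N and a) are illustrative, and frequency is taken here to mean the total count of a word across the records of the group.

```python
from collections import Counter

def conversation_topic(term_counts_in_group, top_n=None, min_freq=None):
    """Return topic words: the top_n most frequent, or those above min_freq."""
    totals = Counter()
    for counts in term_counts_in_group:
        totals.update(counts)  # adds the per-record counts together
    if min_freq is not None:
        return sorted(t for t, c in totals.items() if c > min_freq)
    return [term for term, _ in totals.most_common(top_n)]
```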
After each sample set TS has been processed as described above, the classification groups and their conversation topics have been generated. To analyze the correlation between the generated conversation topics, the conversation topics are used as the sample set of the KNN algorithm: the weight of each word in each conversation topic is calculated, the correlation coefficient between conversation topics is calculated from these weights using the formula in step 403, and conversation topics whose correlation coefficient exceeds a set threshold are merged.
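The topic merging step can be sketched with a greedy pass over the topics. Representing a topic as a pair of word weights and associated record ids, merging into the first sufficiently similar topic, and summing weights on merge are all assumptions of this sketch; the text specifies only that topics whose correlation coefficient exceeds the threshold are merged and that the merged topic keeps all associated records.

```python
import math

def merge_topics(topics, threshold):
    """Merge topics whose cosine similarity exceeds `threshold`.

    Each topic is a (weights, record_ids) pair, where `weights` maps a topic
    word to its weight. This representation is an illustrative assumption.
    """
    def cosine(u, v):
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        nu = math.sqrt(sum(w * w for w in u.values()))
        nv = math.sqrt(sum(w * w for w in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    merged = []
    for weights, record_ids in topics:
        for target in merged:
            if cosine(weights, target[0]) > threshold:
                for term, w in weights.items():
                    target[0][term] = target[0].get(term, 0.0) + w
                target[1].update(record_ids)  # keep every associated record
                break
        else:
            merged.append((dict(weights), set(record_ids)))
    return merged
```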
When conversation records are presented to the user, they can be arranged by conversation participant, or ordered by the weight of each conversation record within the conversation topic.
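The query step can be sketched as a lookup over a topic index followed by ordering by weight. The index layout (a tuple of topic words mapped to record id and weight pairs), the exact-word match and the name `search_records` are assumptions; the text does not specify how a keyword is matched against a topic.

```python
def search_records(keyword, topic_index):
    """Return record ids of every topic that contains `keyword`.

    `topic_index` maps a tuple of topic words to a list of
    (record_id, weight) pairs; both the layout and the exact-word match
    are assumptions of this sketch.
    """
    hits = []
    for topic_words, records in topic_index.items():
        if keyword in topic_words:
            hits.extend(records)
    # Present the most heavily weighted records first.
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return [record_id for record_id, _ in hits]
```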
The above embodiment uses the KNN algorithm to perform correlation analysis on sample sets, but the invention is not limited to the KNN algorithm. Correlation analysis of conversation records can also use other training algorithms and classification methods based on the vector space model, such as vector machine algorithms, neural network algorithms and Bayesian algorithms. For example, with a Bayesian algorithm, the probability that each word in the feature vector of a conversation record occurs in a given session is calculated, the probability that the feature vector belongs to that session is then calculated with Bayes' formula, and the feature vector is added to the session with the highest probability.
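The Bayesian alternative can be sketched with a small naive Bayes classifier. The text does not say which variant is intended; the multinomial model, the add-one smoothing and the name `naive_bayes_assign` are assumptions made here for illustration, and every class is assumed to contain at least one record.

```python
import math
from collections import Counter

def naive_bayes_assign(record_terms, classes):
    """Return the class id with the highest posterior for `record_terms`.

    `classes` maps class id -> list of term lists already in that class.
    Add-one smoothing and the multinomial model are assumptions.
    """
    vocabulary = {t for docs in classes.values() for doc in docs for t in doc}
    vocabulary.update(record_terms)
    total_docs = sum(len(docs) for docs in classes.values())

    best_class, best_score = None, -math.inf
    for class_id, docs in classes.items():
        counts = Counter(t for doc in docs for t in doc)
        total_terms = sum(counts.values())
        score = math.log(len(docs) / total_docs)  # log prior
        for term in record_terms:
            # Smoothed probability of the term occurring in this class.
            score += math.log((counts[term] + 1) / (total_terms + len(vocabulary)))
        if score > best_score:
            best_class, best_score = class_id, score
    return best_class
```

Working in log space avoids numeric underflow when a record contains many words.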
With the invention, a user who wants to query information of interest in conversation records only has to enter a keyword; the system automatically looks up the conversation topic that matches the keyword and presents the conversation records associated with that topic to the user. This spares the user the tedious work of searching manually and improves search efficiency.
Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. If these changes and modifications fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to include them as well.

Claims (8)

1. A method for managing instant messaging conversation records, characterized by comprising the following steps:
obtaining a user's conversation records and classifying them to obtain sample sets;
generating a feature vector for each conversation record in a sample set, analyzing the correlation between each feature vector and the other feature vectors, and classifying the feature vectors according to the correlation to generate classification groups;
determining the conversation topic of each classification group according to the frequency with which words occur in that group, and associating the conversation topic with the conversation records corresponding to the group; and
looking up the conversation topic that matches a keyword entered by the user when querying, and presenting the conversation records associated with the found conversation topic to the user.
2. The method of claim 1, characterized in that, after the conversation topics are generated, the correlation between conversation topics is further analyzed, conversation topics whose correlation exceeds a predetermined threshold are merged into a single conversation topic, and the merged conversation topic is associated with the conversation records of all the conversation topics that were merged.
3. The method of claim 1 or 2, characterized in that the conversation records are classified by conversation participant to generate the sample sets.
4. The method of claim 3, characterized in that a sample set is further divided into several sample sets according to the time interval between the conversation records in it.
5. The method of claim 1, characterized in that generating a feature vector for each conversation record in the sample set and analyzing the correlation between each feature vector and the other feature vectors specifically comprises:
performing word segmentation on each conversation record, deleting the words with no practical meaning to obtain a set S, merging the synonyms in S, and vectorizing the result to generate the feature vector $\vec{d} = (W_1, W_2, W_3, \ldots, W_n)$ corresponding to the conversation record, where $W_i$ is the weight of the i-th element and each element is a word in S;
calculating the weight of each word in the feature vector $\vec{d}$ of each conversation record, and calculating the correlation between feature vectors from the weights of the words that make up each feature vector.
6. The method of claim 1, characterized in that the conversation topic of a classification group is determined from the words whose frequency of occurrence in the group exceeds a predetermined threshold.
7. A device for managing instant messaging conversation records, characterized by comprising:
a unit for storing users' conversation records;
a unit for classifying the conversation records to generate sample sets;
a unit for generating a feature vector for each conversation record in a sample set, analyzing the correlation between each feature vector and the other feature vectors, and classifying the feature vectors according to the correlation to generate classification groups;
a unit for determining the conversation topic of each classification group and associating the conversation topic with the conversation records corresponding to the group; and
a unit for looking up the conversation topic that matches a keyword entered by the user when querying, and presenting the conversation records associated with the found conversation topic to the user.
8. The device of claim 7, characterized by further comprising:
a unit for analyzing the correlation between conversation topics, merging conversation topics whose correlation exceeds a predetermined threshold into a single conversation topic, and associating the merged conversation topic with the conversation records of all the conversation topics that were merged.
CN2006101095396A 2006-08-04 2006-08-04 Method and device for managing instant communication conversation record Active CN101119326B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2006101095396A CN101119326B (en) 2006-08-04 2006-08-04 Method and device for managing instant communication conversation record

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2006101095396A CN101119326B (en) 2006-08-04 2006-08-04 Method and device for managing instant communication conversation record

Publications (2)

Publication Number Publication Date
CN101119326A CN101119326A (en) 2008-02-06
CN101119326B true CN101119326B (en) 2010-07-28

Family

ID=39055265

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2006101095396A Active CN101119326B (en) 2006-08-04 2006-08-04 Method and device for managing instant communication conversation record

Country Status (1)

Country Link
CN (1) CN101119326B (en)

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102246175A (en) * 2008-12-12 2011-11-16 皇家飞利浦电子股份有限公司 An assertion-based record linkage in distributed and autonomous healthcare environments
CN101483620B (en) * 2009-02-17 2012-09-26 腾讯科技(深圳)有限公司 Session reservation method and system in instant communication tool
CN101997964A (en) * 2009-08-13 2011-03-30 中国电信股份有限公司 Processing method of mobile communication terminal and contact records thereof
CN103078781A (en) * 2011-10-25 2013-05-01 国际商业机器公司 Method for instant messaging system and instant messaging system
CN102646134A (en) * 2012-03-29 2012-08-22 百度在线网络技术(北京)有限公司 Method and device for determining message session in message record
CN103425648B (en) * 2012-05-15 2016-04-13 腾讯科技(深圳)有限公司 The disposal route of relation loop and system
CN103279465B (en) * 2012-12-18 2018-05-25 北京奇虎科技有限公司 The control method and device of communication historical data
CN103279466B (en) * 2012-12-18 2018-01-26 北京奇虎科技有限公司 Control the method and device of communication historical data
CN105024906B (en) * 2014-04-21 2018-10-02 腾讯科技(深圳)有限公司 The storage of group's message, querying method and system in social networks
CN105450497A (en) * 2014-07-31 2016-03-30 国际商业机器公司 Method and device for generating clustering model and carrying out clustering based on clustering model
CN104361003A (en) * 2014-10-10 2015-02-18 金硕澳门离岸商业服务有限公司 Method and device for classified displaying of chat records
CN104462518B (en) * 2014-12-22 2018-10-19 百度在线网络技术(北京)有限公司 Method and apparatus for being labeled to IM information
CN105141502A (en) * 2015-08-12 2015-12-09 深圳前海珩昌科技有限公司 Method and device for managing instant communication process
CN105049336A (en) * 2015-08-12 2015-11-11 深圳前海珩昌科技有限公司 Method and system for processing instant communication messages, server and client
CN106487640A * 2015-08-25 2017-03-08 平安科技(深圳)有限公司 Control method and server for multiple communication modules
CN106888236B (en) * 2015-12-15 2021-08-31 腾讯科技(深圳)有限公司 Session management method and session management device
CN105589625B (en) * 2015-12-21 2020-06-02 惠州Tcl移动通信有限公司 Processing method and device of social media message and communication terminal
CN105959205A * 2016-04-29 2016-09-21 杨夫春 Method for keeping chat records
CN106599147A (en) * 2016-12-06 2017-04-26 庄爱芹 Method and device for browser browsing history management
CN106777013B (en) * 2016-12-07 2020-09-11 科大讯飞股份有限公司 Conversation management method and device
CN108737240A * 2017-04-18 2018-11-02 阿里巴巴集团控股有限公司 Method and apparatus for automatically creating a chat group, and group creation method
CN111357245B (en) * 2017-11-15 2022-08-09 华为技术有限公司 Information searching method, terminal, network equipment and system
CN111698143B (en) * 2019-03-14 2022-12-16 阿里巴巴集团控股有限公司 Information processing method, information display method and device
CN110138645B (en) * 2019-03-29 2021-06-18 腾讯科技(深圳)有限公司 Session message display method, device, equipment and storage medium
CN110781930A (en) * 2019-10-14 2020-02-11 西安交通大学 User portrait grouping and behavior analysis method and system based on log data of network security equipment
CN112769673A (en) * 2019-11-05 2021-05-07 钉钉控股(开曼)有限公司 Communication record generation, recommendation and display method and device
CN111327518B (en) * 2020-01-21 2022-10-11 上海掌门科技有限公司 Method and equipment for splicing messages
CN111708866B (en) * 2020-08-24 2020-12-11 北京世纪好未来教育科技有限公司 Session segmentation method and device, electronic equipment and storage medium
CN111798870A (en) * 2020-09-08 2020-10-20 共道网络科技有限公司 Session link determining method, device and equipment and storage medium
CN113113017B (en) * 2021-04-08 2024-04-09 百度在线网络技术(北京)有限公司 Audio processing method and device
CN113595886A (en) * 2021-07-29 2021-11-02 北京达佳互联信息技术有限公司 Instant messaging message processing method and device, electronic equipment and storage medium
CN114691830B (en) * 2022-03-31 2022-12-20 江苏冬云云计算股份有限公司 Network security analysis method and system based on big data

Citations (3)

Publication number Priority date Publication date Assignee Title
CN1584883A (en) * 2004-05-27 2005-02-23 威盛电子股份有限公司 Related document connecting managing system, method and recording media
CN1609859A (en) * 2004-11-26 2005-04-27 孙斌 Search result clustering method
CN1741012A * 2004-08-23 2006-03-01 富士施乐株式会社 Text search apparatus and method

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN1584883A (en) * 2004-05-27 2005-02-23 威盛电子股份有限公司 Related document connecting managing system, method and recording media
CN1741012A * 2004-08-23 2006-03-01 富士施乐株式会社 Text search apparatus and method
CN1609859A (en) * 2004-11-26 2005-04-27 孙斌 Search result clustering method

Non-Patent Citations (2)

Title
JP Unexamined Patent Publication (Kokai) No. 2005-173847A, 2005.06.30
Same as above.

Also Published As

Publication number Publication date
CN101119326A (en) 2008-02-06

Similar Documents

Publication Publication Date Title
CN101119326B (en) Method and device for managing instant communication conversation record
CN103678670B (en) Micro-blog hot word and hot topic mining system and method
CN105808590B (en) Search engine implementation method, searching method and device
JP5092165B2 (en) Data construction method and system
US8543380B2 (en) Determining a document specificity
Weng et al. Using text classification and multiple concepts to answer e-mails
CN104573130B (en) Entity resolution method and device based on cluster computing
CN109101479A (en) Clustering method and device for Chinese sentences
CN107729336A (en) Data processing method, equipment and system
CN108549647B (en) Method for realizing active prediction of emergency in mobile customer service field without marking corpus based on SinglePass algorithm
CN104731954A (en) Music recommendation method and system based on group perspective
CN101621391A (en) Method and system for classifying short texts based on probabilistic topics
US20110191335A1 (en) Method and system for conducting legal research using clustering analytics
CN116455861B (en) Big data-based computer network security monitoring system and method
CN112257419A (en) Intelligent retrieval method and device for calculating patent document similarity based on word frequency and semantics, electronic equipment and storage medium thereof
EP2045732A2 (en) Determining the depths of words and documents
CN105787662A (en) Mobile application software performance prediction method based on attributes
CN112149422A (en) Enterprise news dynamic monitoring method based on natural language
CN105159898A (en) Searching method and searching device
JP2005092442A (en) Multi-dimensional space model expressing device and method
CN105512270B (en) Method and device for determining related objects
Goldberg et al. CASTLE: crowd-assisted system for text labeling and extraction
CN113761104A (en) Method and device for detecting entity relationship in knowledge graph and electronic equipment
CN103793448B (en) Article information providing method and system
Siegen Virtual Citation Proximity (VCP): Calculating Co-Citation-Proximity-Based Document Relatedness for Uncited Documents with Machine Learning (preprint)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant