CN105653671A - Similar information recommendation method and system - Google Patents

Similar information recommendation method and system

Info

Publication number
CN105653671A
Authority
CN
China
Prior art keywords
word
information
keyword
semantic
search content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201511017551.XA
Other languages
Chinese (zh)
Inventor
沈磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CHANJET INFORMATION TECHNOLOGY Co Ltd
Original Assignee
CHANJET INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CHANJET INFORMATION TECHNOLOGY Co Ltd
Priority to CN201511017551.XA
Publication of CN105653671A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis

Abstract

The present invention provides a similar information recommendation method and system. The similar information recommendation method comprises: determining a preliminary candidate set according to keywords in search content; determining, in the preliminary candidate set, similar information corresponding to the search content according to the semantic similarity between the search content and each piece of information in the preliminary candidate set; and displaying the similar information. With the technical solution of the present invention, the problem that a plain keyword search cannot capture the specific semantics of the search content is avoided, information similar to the search content can be provided more accurately, the user's search efficiency is improved, and repeated posting by users is avoided, thereby improving the user experience.

Description

Similar information recommendation method and system
Technical field
The present invention relates to the field of computer technology, and specifically to a similar information recommendation method and a similar information recommendation system.
Background
At present, in online communities, when a user posts a question or browses posts related to a question they have asked, the system can proactively recommend similar questions and their answers to the user. For example, when the user types a question into an input box, the system can provide a list of similar questions, and the recommendation list changes as the user's input changes. As another example, when the user browses a post related to the question asked, the system can provide a list of questions similar to the user's question. In this way, identical or similar questions already stored on the network, together with their answers, are recommended to the user, so the same question need not be asked and answered again. This both reduces redundant identical or similar posts and improves user satisfaction.
However, the above methods are usually based on a keyword search over the subject of the user's question rather than on a semantic understanding of the question. As a result, many similar questions cannot be recommended by the system because individuals phrase them differently.
Therefore, a new technical solution is needed that can provide information similar to the search content more accurately and improve the user's search efficiency.
Summary of the invention
Based on the above problem, the present invention proposes a new technical solution that can provide information similar to the search content more accurately and improve the user's search efficiency.
In view of this, one aspect of the present invention proposes a similar information recommendation method, comprising: determining a preliminary candidate set according to keywords in search content; determining, in the preliminary candidate set, the similar information corresponding to the search content according to the semantic similarity between the search content and each piece of information in the preliminary candidate set; and displaying the similar information.
In this technical solution, after the preliminary candidate set is determined from the keywords, the semantic similarity between the search content and each piece of information in the preliminary candidate set is calculated, and the similar information for the search content is determined from that semantic similarity and recommended to the user. This avoids the problem that a plain keyword search cannot capture the specific semantics of the search content, so information similar to the search content can be provided more accurately, the user's search efficiency is improved, and repeated posting is avoided, which is more convenient for the user.
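The two-stage flow described above (keyword-based candidate set, then ranking by similarity) can be sketched as follows. This is only an illustrative sketch: the function names, the whitespace tokenizer, and the Jaccard placeholder are assumptions introduced here and are not taken from the patent, which ranks by a semantic similarity computed from trained vectors.

```python
from typing import Callable, List, Set


def tokenize(text: str) -> Set[str]:
    """Hypothetical keyword extractor: lowercase whitespace tokens."""
    return set(text.lower().split())


def candidate_set(query: str, corpus: List[str]) -> List[str]:
    """Stage 1: keep every stored item that shares at least one keyword."""
    query_keywords = tokenize(query)
    return [item for item in corpus if query_keywords & tokenize(item)]


def recommend(
    query: str,
    corpus: List[str],
    similarity: Callable[[str, str], float],
    top_n: int = 3,
) -> List[str]:
    """Stage 2: rank the candidates by similarity, highest first."""
    candidates = candidate_set(query, corpus)
    ranked = sorted(candidates, key=lambda item: similarity(query, item), reverse=True)
    return ranked[:top_n]


def jaccard(a: str, b: str) -> float:
    """Placeholder score (token-set overlap); it is NOT a semantic measure."""
    ta, tb = tokenize(a), tokenize(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0
```

Any function with the same signature, for instance one built on sentence vectors, could be passed as `similarity` in place of the placeholder.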
In the above technical solution, preferably, the search content comprises a question being asked, and the information in the preliminary candidate set comprises existing questions and existing answers.
In this technical solution, the search content comprises a question being asked, that is, a question the user raises on a social website such as a forum, and the information in the preliminary candidate set comprises existing questions and existing answers. In other words, when retrieving for the user's question, the semantics of both existing questions and existing answers can be covered, which makes it easier to provide information similar to the search content more accurately and to show the user a more accurate answer.
In any of the above technical solutions, preferably, determining, in the preliminary candidate set, the similar information corresponding to the search content according to the semantic similarity between the search content and each piece of information in the preliminary candidate set comprises: training, by means of a language model, unit semantic vectors for the search content and for the information in the preliminary candidate set, wherein a unit semantic vector is a character vector or a word vector; and calculating, according to the unit semantic vectors, the semantic similarity between the search content and the information in the preliminary candidate set, wherein the semantic similarity comprises: the cumulative sum of character vectors, the cumulative sum of word vectors, the mean of character vectors, or the mean of word vectors. Displaying the similar information comprises: sorting the information in the preliminary candidate set by semantic similarity from high to low and displaying it.
In this technical solution, the unit semantic vectors are trained by a language model. A sentence in the search content has two kinds of basic semantic unit: the character and the word. Both character meanings and word meanings can be used to compose the meaning of a sentence. If the word is the basic unit, the sentence must be segmented into words; if the character is the basic unit, the sentence must be cut character by character. The unit semantic vector is therefore a character vector or a word vector. Both methods train the character or word semantic vectors with a language model on a text corpus prepared in advance. A language model is a probabilistic model that computes the probability of a sentence; it is based on the Markov assumption, that is, the occurrence of the next word depends only on the one or several words before it. On this principle, character vectors or word vectors can be trained from the corpus. The relation between two semantic vectors trained in this way is reflected directly in the difference of the two vectors, where the difference is the mathematical one, an element-wise subtraction. For example, semantic("king") - semantic("queen") ≈ semantic("man") - semantic("woman"), so the vector closest to semantic("king") - semantic("man") + semantic("woman") is semantic("queen").
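The vector-difference relation can be illustrated with a toy example. The two-dimensional vectors below are hand-picked for illustration and are an assumption of this sketch; real character or word vectors would come from training on a corpus and would have many more dimensions.

```python
import math
from typing import Dict, List

# Hand-made toy vectors (assumption): dimensions are (royalty, maleness).
VECTORS: Dict[str, List[float]] = {
    "king": [0.9, 0.9],
    "queen": [0.9, 0.1],
    "man": [0.1, 0.9],
    "woman": [0.1, 0.1],
}


def subtract(u: List[float], v: List[float]) -> List[float]:
    """Element-wise difference of two vectors."""
    return [a - b for a, b in zip(u, v)]


def add(u: List[float], v: List[float]) -> List[float]:
    """Element-wise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]


def nearest(target: List[float], table: Dict[str, List[float]]) -> str:
    """Return the key whose vector has the smallest Euclidean distance to target."""
    return min(table, key=lambda word: math.dist(table[word], target))


# semantic("king") - semantic("man") + semantic("woman")
analogy = add(subtract(VECTORS["king"], VECTORS["man"]), VECTORS["woman"])
closest_word = nearest(analogy, VECTORS)
```

With these toy values `closest_word` is "queen", and the two differences king - queen and man - woman coincide up to floating-point rounding.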
In this technical solution, the key point is how to obtain a sentence vector from the character vectors or word vectors. The quality of the sentence vector determines the quality of the sentence similarity, which in turn affects how well sentences are recommended. Two methods can be used to calculate the sentence vector: one takes the cumulative sum of the character (or word) semantic vectors as the sentence vector; the other takes the mean of the character (or word) semantic vectors as the sentence vector.
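A minimal sketch of the two sentence-vector constructions follows. The text does not state which measure compares two sentence vectors; cosine similarity is used here purely as an assumption, and the names `sum_vector`, `mean_vector`, and `cosine` are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def sum_vector(units: List[str], table: Dict[str, Vector]) -> Vector:
    """Sentence vector as the cumulative sum of its unit vectors."""
    dim = len(next(iter(table.values())))
    total = [0.0] * dim
    for unit in units:
        vec = table.get(unit)
        if vec is not None:  # units without a trained vector are skipped
            total = [t + x for t, x in zip(total, vec)]
    return total


def mean_vector(units: List[str], table: Dict[str, Vector]) -> Vector:
    """Sentence vector as the mean of its unit vectors."""
    known = sum(1 for unit in units if unit in table)
    total = sum_vector(units, table)
    return [t / known for t in total] if known else total


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
```

Under cosine similarity the sum and the mean give the same score, because the mean is the sum scaled by a positive constant; the two constructions differ only under length-sensitive measures such as Euclidean distance.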
With the above technical solution, the problem that a plain keyword search cannot capture the specific semantics of the search content is avoided, information semantically similar to the search content can be provided more accurately, the user's search efficiency is improved, and repeated posting is avoided, which is more convenient for the user.
In any of the above technical solutions, preferably, before the similar information corresponding to the search content is determined in the preliminary candidate set, the method further comprises: determining other similarities between the search content and the information in the preliminary candidate set, wherein the other similarities comprise one or a combination of the following: keyword similarity, keyword deduplication similarity, keyword discrete similarity, and product word similarity. Determining, in the preliminary candidate set, the similar information corresponding to the search content specifically comprises: determining the similar information according to the semantic similarity and the other similarities.
In this technical solution, one or more of keyword similarity, keyword deduplication similarity, keyword discrete similarity, and product word similarity can be used together with semantic similarity as the recommendation criterion. Keyword similarity is calculated from the weight of each keyword in the question and the degree of keyword overlap between the user's input and the candidate question (including its answer). Keyword deduplication similarity removes the influence of repeated keywords and simply counts how many distinct keywords the two have in common. Keyword discrete similarity measures whether the keywords are distributed in the same way in the user's search content and in the information in the preliminary candidate set, that is, whether they are spread evenly or concentrated in one place; typically both the search content and the candidate information are cut into clauses, and the number of clauses that contain common keywords is taken as the keyword discrete similarity score. In addition, many user questions concern products such as software, and questions raised about different products should not be regarded as similar. For example, if two questions contain the same product word, the product word similarity is 1; if they do not, it is 0. By using semantic similarity together with one or more of these other similarities, similar information can be recommended more accurately and the user experience is improved.
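The four auxiliary similarities can be sketched as below. The text gives no exact formulas, so each function is one plausible reading and should be treated as an assumption; the function names, the default keyword weight of 1.0, and the punctuation used to split clauses are all hypothetical.

```python
import re
from typing import Dict, List, Set


def keyword_similarity(query_kw: List[str], cand_kw: List[str],
                       weights: Dict[str, float]) -> float:
    """Weight of the shared keywords divided by the weight of all query keywords."""
    distinct_query = set(query_kw)
    total = sum(weights.get(k, 1.0) for k in distinct_query)
    if total == 0:
        return 0.0
    shared = distinct_query & set(cand_kw)
    return sum(weights.get(k, 1.0) for k in shared) / total


def dedup_similarity(query_kw: List[str], cand_kw: List[str]) -> int:
    """Number of distinct keywords the two sides have in common."""
    return len(set(query_kw) & set(cand_kw))


def split_clauses(text: str) -> List[str]:
    """Cut text into clauses at common Western and Chinese punctuation."""
    return [c.strip() for c in re.split(r"[,.;!?，。；！？]", text) if c.strip()]


def discrete_similarity(query: str, candidate: str, common: Set[str]) -> int:
    """Number of clauses, over both texts, containing at least one common keyword."""
    clauses = split_clauses(query) + split_clauses(candidate)
    return sum(1 for c in clauses if common & set(c.lower().split()))


def product_word_similarity(query_kw: List[str], cand_kw: List[str],
                            product_words: Set[str]) -> int:
    """1 if both sides mention the same product word, otherwise 0."""
    return 1 if set(query_kw) & set(cand_kw) & product_words else 0
```

Whitespace tokenization inside `discrete_similarity` suits space-delimited text only; for Chinese text the clause would first need word segmentation.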
In any of the above technical solutions, preferably, before the preliminary candidate set is determined according to the keywords in the search content, the method further comprises: removing expression words and stop words from the keywords.
In this technical solution, because expression words and stop words usually carry no useful information and can make the recommendations deviate from what is expected, they can be removed from the keywords before the preliminary candidate set is formed, which improves the validity of the recommended content.
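The filtering step can be sketched as a simple set lookup. The two word lists below are small hypothetical samples chosen for illustration; the text does not list the words it treats as expression words or stop words.

```python
from typing import List

# Hypothetical sample lists; a real system would load much larger ones.
STOP_WORDS = {"the", "a", "is", "to", "how", "do", "i", "my"}
EXPRESSION_WORDS = {"please", "thanks", "hi", "urgent"}


def clean_keywords(tokens: List[str]) -> List[str]:
    """Drop stop words and expression words, keeping the original order."""
    drop = STOP_WORDS | EXPRESSION_WORDS
    return [t for t in tokens if t.lower() not in drop]


cleaned = clean_keywords("Hi how do I reset my password please".split())
```

Here `cleaned` keeps only "reset" and "password", which are then used to build the preliminary candidate set.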
Another aspect of the present invention proposes a similar information recommendation system, comprising: a candidate set determining unit, which determines a preliminary candidate set according to keywords in search content; a similar information determining unit, which determines, in the preliminary candidate set, the similar information corresponding to the search content according to the semantic similarity between the search content and each piece of information in the preliminary candidate set; and a similar information display unit, which displays the similar information.
In this technical solution, after the preliminary candidate set is determined from the keywords, the semantic similarity between the search content and each piece of information in the preliminary candidate set is calculated, and the similar information for the search content is determined from that semantic similarity and recommended to the user. This avoids the problem that a plain keyword search cannot capture the specific semantics of the search content, so information similar to the search content can be provided more accurately, the user's search efficiency is improved, and repeated posting is avoided, which is more convenient for the user.
In the above technical solution, preferably, the search content comprises a question being asked, and the information in the preliminary candidate set comprises existing questions and existing answers.
In this technical solution, the search content comprises a question being asked, that is, a question the user raises on a social website such as a forum, and the information in the preliminary candidate set comprises existing questions and existing answers. In other words, when retrieving for the user's question, the semantics of both existing questions and existing answers can be covered, which makes it easier to provide information similar to the search content more accurately and to show the user a more accurate answer.
In any of the above technical solutions, preferably, the similar information determining unit comprises: a vector training unit, which trains, by means of a language model, unit semantic vectors for the search content and for the information in the preliminary candidate set, wherein a unit semantic vector is a character vector or a word vector; and a semantic similarity calculation unit, which calculates, according to the unit semantic vectors, the semantic similarity between the search content and the information in the preliminary candidate set, wherein the semantic similarity comprises: the cumulative sum of character vectors, the cumulative sum of word vectors, the mean of character vectors, or the mean of word vectors. The similar information display unit is specifically configured to: sort the information in the preliminary candidate set by semantic similarity from high to low and display it.
In this technical solution, the unit semantic vectors are trained by a language model. A sentence in the search content has two kinds of basic semantic unit: the character and the word. Both character meanings and word meanings can be used to compose the meaning of a sentence. If the word is the basic unit, the sentence must be segmented into words; if the character is the basic unit, the sentence must be cut character by character. The unit semantic vector is therefore a character vector or a word vector. Both methods train the character or word semantic vectors with a language model on a text corpus prepared in advance. A language model is a probabilistic model that computes the probability of a sentence; it is based on the Markov assumption, that is, the occurrence of the next word depends only on the one or several words before it. On this principle, character vectors or word vectors can be trained from the corpus. The relation between two semantic vectors trained in this way is reflected directly in the difference of the two vectors, where the difference is the mathematical one, an element-wise subtraction. For example, semantic("king") - semantic("queen") ≈ semantic("man") - semantic("woman"), so the vector closest to semantic("king") - semantic("man") + semantic("woman") is semantic("queen").
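The Markov assumption mentioned above can be illustrated with a first-order (bigram) model, in which each word depends only on the single word before it. This sketch only shows how a sentence probability factorizes under that assumption; it does not train semantic vectors, and the function names and the `<s>` start marker are assumptions of the example.

```python
from collections import Counter
from typing import List, Tuple

START = "<s>"  # hypothetical sentence-start marker


def train_bigram(corpus: List[List[str]]) -> Tuple[Counter, Counter]:
    """Count how often each word serves as context and how often each pair occurs."""
    contexts: Counter = Counter()
    pairs: Counter = Counter()
    for sentence in corpus:
        tokens = [START] + sentence
        contexts.update(tokens[:-1])
        pairs.update(zip(tokens, tokens[1:]))
    return contexts, pairs


def sentence_probability(sentence: List[str], contexts: Counter, pairs: Counter) -> float:
    """P(sentence) as the product of P(word | previous word), estimated from counts."""
    tokens = [START] + sentence
    probability = 1.0
    for previous, current in zip(tokens, tokens[1:]):
        if contexts[previous] == 0:
            return 0.0  # unseen context: no estimate without smoothing
        probability *= pairs[(previous, current)] / contexts[previous]
    return probability
```

The estimates are unsmoothed relative frequencies, so any sentence containing an unseen word pair receives probability zero; practical language models add smoothing or use neural estimators.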
In this technical solution, the key point is how to obtain a sentence vector from the character vectors or word vectors. The quality of the sentence vector determines the quality of the sentence similarity, which in turn affects how well sentences are recommended. Two methods can be used to calculate the sentence vector: one takes the cumulative sum of the character (or word) semantic vectors as the sentence vector; the other takes the mean of the character (or word) semantic vectors as the sentence vector.
With the above technical solution, the problem that a plain keyword search cannot capture the specific semantics of the search content is avoided, information semantically similar to the search content can be provided more accurately, the user's search efficiency is improved, and repeated posting is avoided, which is more convenient for the user.
In any of the above technical solutions, preferably, the system further comprises: an other-similarity determining unit, which, before the similar information corresponding to the search content is determined in the preliminary candidate set, determines other similarities between the search content and the information in the preliminary candidate set, wherein the other similarities comprise one or a combination of the following: keyword similarity, keyword deduplication similarity, keyword discrete similarity, and product word similarity. The similar information determining unit is configured to: determine the similar information according to the semantic similarity and the other similarities.
In this technical solution, one or more of keyword similarity, keyword deduplication similarity, keyword discrete similarity, and product word similarity can be used together with semantic similarity as the recommendation criterion. Keyword similarity is calculated from the weight of each keyword in the question and the degree of keyword overlap between the user's input and the candidate question (including its answer). Keyword deduplication similarity removes the influence of repeated keywords and simply counts how many distinct keywords the two have in common. Keyword discrete similarity measures whether the keywords are distributed in the same way in the user's search content and in the information in the preliminary candidate set, that is, whether they are spread evenly or concentrated in one place; typically both the search content and the candidate information are cut into clauses, and the number of clauses that contain common keywords is taken as the keyword discrete similarity score. In addition, many user questions concern products such as software, and questions raised about different products should not be regarded as similar. For example, if two questions contain the same product word, the product word similarity is 1; if they do not, it is 0. By using semantic similarity together with one or more of these other similarities, similar information can be recommended more accurately and the user experience is improved.
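The text does not say how the semantic similarity and the other similarities are merged into one ranking criterion. One possible combination, offered only as an assumption, is a weighted sum of scores that is zeroed out when the product word similarity is 0; the weights and the names `DEFAULT_WEIGHTS` and `combined_score` are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical weights; they would need tuning on real data.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "semantic": 0.6,
    "keyword": 0.2,
    "dedup": 0.1,
    "discrete": 0.1,
}


def combined_score(scores: Dict[str, float], product_word_similarity: int,
                   weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted sum of similarity scores, gated by the 0/1 product word similarity."""
    if product_word_similarity == 0:
        return 0.0  # assumption: questions about different products are never recommended
    active = DEFAULT_WEIGHTS if weights is None else weights
    return sum(weight * scores.get(name, 0.0) for name, weight in active.items())
```

The sketch assumes each score has already been normalized to a comparable range, for example by dividing count-based scores by the number of keywords or clauses. A hard gate also rejects pairs in which neither question names a product, so a softer penalty may be preferable in practice.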
In any of the above technical solutions, preferably, the system further comprises: a removal unit, which removes expression words and stop words from the keywords before the preliminary candidate set is determined according to the keywords in the search content.
In this technical solution, because expression words and stop words usually carry no useful information and can make the recommendations deviate from what is expected, they can be removed from the keywords before the preliminary candidate set is formed, which improves the validity of the recommended content.
With the above technical solutions, the problem that a plain keyword search cannot capture the specific semantics of the search content is avoided, information similar to the search content can be provided more accurately, the user's search efficiency is improved, and repeated posting is avoided, thereby improving the user experience.
Brief description of the drawings
Fig. 1 shows a flowchart of a similar information recommendation method according to one embodiment of the present invention;
Fig. 2 shows a block diagram of a similar information recommendation system according to one embodiment of the present invention;
Fig. 3 shows a schematic diagram of similar information recommendation according to one embodiment of the present invention;
Fig. 4 shows a schematic diagram of determining semantic similarity according to one embodiment of the present invention;
Fig. 5 shows a schematic diagram of a similar information recommendation interface according to one embodiment of the present invention;
Fig. 6 shows a schematic diagram of a similar information recommendation interface according to another embodiment of the present invention.
Detailed description of the embodiments
In order that the above objects, features, and advantages of the present invention may be understood more clearly, the present invention is described in further detail below with reference to the drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another.
Many specific details are set forth in the following description to provide a thorough understanding of the present invention. However, the present invention can also be implemented in ways other than those described here; therefore, the scope of protection of the present invention is not limited by the specific embodiments disclosed below.
Fig. 1 shows a flowchart of a similar information recommendation method according to one embodiment of the present invention.
As shown in Fig. 1, the similar information recommendation method according to one embodiment of the present invention comprises:
Step 102: determining a preliminary candidate set according to keywords in search content;
Step 104: determining, in the preliminary candidate set, the similar information corresponding to the search content according to the semantic similarity between the search content and each piece of information in the preliminary candidate set;
Step 106: displaying the similar information.
In this technical solution, after the preliminary candidate set is determined from the keywords, the semantic similarity between the search content and each piece of information in the preliminary candidate set is calculated, and the similar information for the search content is determined from that semantic similarity and recommended to the user. This avoids the problem that a plain keyword search cannot capture the specific semantics of the search content, so information similar to the search content can be provided more accurately, the user's search efficiency is improved, and repeated posting is avoided, which is more convenient for the user.
In the above technical solution, preferably, the search content comprises a question being asked, and the information in the preliminary candidate set comprises existing questions and existing answers.
In this technical solution, the search content comprises a question being asked, that is, a question the user raises on a social website such as a forum, and the information in the preliminary candidate set comprises existing questions and existing answers. In other words, when retrieving for the user's question, the semantics of both existing questions and existing answers can be covered, which makes it easier to provide information similar to the search content more accurately and to show the user a more accurate answer.
In any of the above technical solutions, preferably, step 104 comprises: training, by means of a language model, unit semantic vectors for the search content and for the information in the preliminary candidate set, wherein a unit semantic vector is a character vector or a word vector; and calculating, according to the unit semantic vectors, the semantic similarity between the search content and the information in the preliminary candidate set, wherein the semantic similarity comprises: the cumulative sum of character vectors, the cumulative sum of word vectors, the mean of character vectors, or the mean of word vectors. Step 106 comprises: sorting the information in the preliminary candidate set by semantic similarity from high to low and displaying it.
In this technical solution, the unit semantic vectors are trained by a language model. A sentence in the search content has two kinds of basic semantic unit: the character and the word. Both character meanings and word meanings can be used to compose the meaning of a sentence. If the word is the basic unit, the sentence must be segmented into words; if the character is the basic unit, the sentence must be cut character by character. The unit semantic vector is therefore a character vector or a word vector. Both methods train the character or word semantic vectors with a language model on a text corpus prepared in advance. A language model is a probabilistic model that computes the probability of a sentence; it is based on the Markov assumption, that is, the occurrence of the next word depends only on the one or several words before it. On this principle, character vectors or word vectors can be trained from the corpus. The relation between two semantic vectors trained in this way is reflected directly in the difference of the two vectors, where the difference is the mathematical one, an element-wise subtraction. For example, semantic("king") - semantic("queen") ≈ semantic("man") - semantic("woman"), so the vector closest to semantic("king") - semantic("man") + semantic("woman") is semantic("queen").
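The two ways of cutting a sentence into basic units can be sketched as follows. Character cutting is trivial; for word segmentation the text names no algorithm, so forward maximum matching against a small dictionary is used here as an assumption. The vocabulary and the function names are hypothetical.

```python
from typing import List, Set


def split_characters(sentence: str) -> List[str]:
    """Character as the basic unit: cut the sentence character by character."""
    return [ch for ch in sentence if not ch.isspace()]


def split_words(sentence: str, vocabulary: Set[str], max_len: int = 4) -> List[str]:
    """Word as the basic unit: greedy forward maximum matching.

    At each position the longest dictionary word is taken; a single
    character is emitted when no dictionary word matches.
    """
    words: List[str] = []
    i = 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in vocabulary:
                words.append(piece)
                i += size
                break
    return words


# Hypothetical dictionary: "cannot", "print", "invoice".
VOCABULARY = {"无法", "打印", "发票"}
```

For the sentence "无法打印发票" ("cannot print the invoice"), `split_words` yields three two-character words, whereas `split_characters` yields six single characters; either list can then be mapped to unit semantic vectors.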
In this technical solution, the key point is how to obtain a sentence vector from the character vectors or word vectors. The quality of the sentence vector determines the quality of the sentence similarity, which in turn affects how well sentences are recommended. Two methods can be used to calculate the sentence vector: one takes the cumulative sum of the character (or word) semantic vectors as the sentence vector; the other takes the mean of the character (or word) semantic vectors as the sentence vector.
With the above technical solution, the problem that a plain keyword search cannot capture the specific semantics of the search content is avoided, information semantically similar to the search content can be provided more accurately, the user's search efficiency is improved, and repeated posting is avoided, which is more convenient for the user.
In any of the above technical solutions, preferably, before step 104, the method further comprises: determining other similarities between the search content and the information in the preliminary candidate set, wherein the other similarities comprise one or a combination of the following: keyword similarity, keyword deduplication similarity, keyword discrete similarity, and product word similarity. Step 104 specifically comprises: determining the similar information according to the semantic similarity and the other similarities.
In this technical solution, one or more of keyword similarity, keyword deduplication similarity, keyword discrete similarity, and product word similarity can be used together with semantic similarity as the recommendation criterion. Keyword similarity is calculated from the weight of each keyword in the question and the degree of keyword overlap between the user's input and the candidate question (including its answer). Keyword deduplication similarity removes the influence of repeated keywords and simply counts how many distinct keywords the two have in common. Keyword discrete similarity measures whether the keywords are distributed in the same way in the user's search content and in the information in the preliminary candidate set, that is, whether they are spread evenly or concentrated in one place; typically both the search content and the candidate information are cut into clauses, and the number of clauses that contain common keywords is taken as the keyword discrete similarity score. In addition, many user questions concern products such as software, and questions raised about different products should not be regarded as similar. For example, if two questions contain the same product word, the product word similarity is 1; if they do not, it is 0. By using semantic similarity together with one or more of these other similarities, similar information can be recommended more accurately and the user experience is improved.
In any of the above technical solutions, preferably, before step 102, the method further comprises: removing expression words and stop words from the keywords.
In this technical solution, because expression words and stop words usually carry no useful information and can make the recommendations deviate from what is expected, they can be removed from the keywords before the preliminary candidate set is formed, which improves the validity of the recommended content.
Fig. 2 shows a block diagram of a similar information recommendation system according to one embodiment of the present invention.
As shown in Fig. 2, the similar information recommendation system 200 according to one embodiment of the present invention comprises: a candidate set determining unit 202, a similar information determining unit 204, and a similar information display unit 206.
The candidate set determining unit 202 is configured to determine a preliminary candidate set according to keywords in search content; the similar information determining unit 204 is configured to determine, in the preliminary candidate set, the similar information corresponding to the search content according to the semantic similarity between the search content and each piece of information in the preliminary candidate set; and the similar information display unit 206 is configured to display the similar information.
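The division into three units can be sketched as three pluggable callables wired together in order. The class name, field names, and callable signatures below are assumptions made for illustration; the reference numerals in the comments refer to the units of Fig. 2.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SimilarInfoRecommender:
    """Hypothetical wiring of the three units of system 200."""

    determine_candidates: Callable[[str], List[str]]          # unit 202
    determine_similar: Callable[[str, List[str]], List[str]]  # unit 204
    display: Callable[[List[str]], None]                      # unit 206

    def run(self, search_content: str) -> List[str]:
        """Pass the search content through units 202, 204, and 206 in order."""
        candidates = self.determine_candidates(search_content)
        similar = self.determine_similar(search_content, candidates)
        self.display(similar)
        return similar
```

Keeping the units as separate callables lets the keyword retrieval, the similarity ranking, and the presentation layer be replaced independently.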
In this technical scheme, it is possible to after determining that according to keyword preliminary candidate collects, calculating search content and preliminary candidate concentrate the semantic similarity of every bar information, thus determine that the analog information of search content recommends user according to semantic similarity. By this technical scheme, avoid the problem that simple keyword search cannot confirm the concrete semanteme of search information, it is possible to the information similar to search content more accurately is provided, improves the search efficiency of user, also user is avoided to repeat to post, convenient for users.
In the above technical solution, preferably, the search content comprises a question being asked, and the information in the preliminary candidate set comprises existing questions and existing answers.
In this technical solution, the search content comprises a question being asked, that is, a question the user raises on a social website such as a forum, and the information in the preliminary candidate set comprises existing questions and existing answers. When retrieval is performed for the user's question, the semantics of both the existing questions and their answers are therefore covered, which makes it easier to provide information similar to the search content and to show the user a more accurate answer.
In any of the above technical solutions, preferably, the similar information determining unit 204 comprises: a vector training unit 2042 and a semantic similarity calculation unit 2044.
The vector training unit 2042 is configured to train, by means of a language model, unit semantic vectors for the search content and for the information in the preliminary candidate set, where a unit semantic vector is a character vector or a word vector; the semantic similarity calculation unit 2044 is configured to calculate, according to the unit semantic vectors, the semantic similarity between the search content and the information in the preliminary candidate set, where the semantic similarity comprises: a cumulative sum of character vectors, a cumulative sum of word vectors, a mean of character vectors, or a mean of word vectors; and the similar information display unit 206 is specifically configured to sort the information in the preliminary candidate set from high to low by semantic similarity and display it.
In this technical solution, unit semantic vectors are trained by a language model. A sentence in the search content has two kinds of basic semantic unit, the character and the word, and either character semantics or word semantics can be composed into sentence semantics. With the word as the basic unit, the sentence must be segmented into words; with the character as the basic unit, the sentence must be cut character by character. The unit semantic vector is therefore a character vector or a word vector. Both methods train character semantic vectors or word semantic vectors with a language model on a previously prepared text corpus. A language model is a probabilistic model that computes the probability of a sentence; it is based on the Markov assumption, that is, the occurrence of the next word depends only on the one or several words before it. On this principle, character vectors or word vectors can be trained from the corpus. The relationship between two semantic vectors trained in this way is reflected directly in their difference, which is the difference in the mathematical sense, obtained by element-wise subtraction. For example, semantic("king") - semantic("queen") ≈ semantic("man") - semantic("woman"), and the vector closest to semantic("king") - semantic("man") + semantic("woman") is semantic("queen").
The key point of this technical solution is how to obtain the sentence vector from the character vectors or word vectors: the quality of the sentence vector determines the quality of the sentence similarity, which in turn affects the recommendation results. The sentence vector can be computed in two ways: as the cumulative sum of the character (or word) semantic vectors, or as their mean.
With the above technical solution, the problem that a pure keyword search cannot establish the specific semantics of the search content is avoided, information semantically similar to the search content is provided more accurately, the user's search efficiency is improved, and users are kept from posting duplicate questions.
In any of the above technical solutions, preferably, the system further comprises: an other-similarity determining unit 208, configured to determine, before the similar information corresponding to the search content is determined in the preliminary candidate set, other similarities between the search content and the information in the preliminary candidate set, where the other similarities comprise one or a combination of the following: keyword similarity, keyword deduplicated similarity, keyword dispersion similarity, and product word similarity; and the similar information determining unit 204 is configured to determine the similar information according to the semantic similarity and the other similarities.
In this technical solution, one or more of keyword similarity, keyword deduplicated similarity, keyword dispersion similarity, and product word similarity may be used together with semantic similarity as the recommendation criterion. Keyword similarity is calculated from the weight of each keyword in the question and the degree of keyword overlap between the user's input and the candidate question (including its corresponding answer). Keyword deduplicated similarity removes the influence of repeated keywords and counts how many distinct keywords the two share. Keyword dispersion similarity measures whether keywords are distributed in the same way in the user's search content and in the information in the preliminary candidate set, that is, whether they are spread evenly or concentrated in one place; typically, both texts are cut into clauses and the number of clauses containing a common keyword is counted as the keyword dispersion similarity score. In addition, many user questions concern products such as software, and questions about different products should not be treated as similar; for example, if two questions contain the same product word, the product word similarity is 1, and otherwise it is 0. With this technical solution, semantic similarity is used together with one or more other similarities, so that similar information can be recommended more accurately and the user experience is improved.
In any of the above technical solutions, preferably, the system further comprises: a removal unit 210, configured to remove emoticon words and stop words from the keywords before the preliminary candidate set is determined according to the keywords in the search content.
In this technical solution, emoticon words and stop words usually carry no useful information and can cause recommendations that do not match expectations, so they may be removed from the keywords before the preliminary candidate set is formed, which improves the effectiveness of the recommended content.
With the above technical solutions, the problem that a pure keyword search cannot establish the specific semantics of the search content is avoided, information similar to the search content is provided more accurately, the user's search efficiency is improved, and users are kept from posting duplicate questions, which improves the user experience.
Fig. 3 shows a schematic diagram of the framework of a similar information recommendation system according to another embodiment of the present invention.
As shown in Fig. 3, when recommending similar information, the system first searches the search system with the keywords and key phrases of the user's question to obtain a preliminary candidate question set, then calculates the similarity between each candidate question in the set and the user's question, and sorts by similarity to obtain a ranked candidate set. Finally, the ranked candidate set is filtered and the recommendation results are produced.
The main features and the design and implementation of the system are explained in detail below.
The main features of the system are:
(1) Recommendations are produced quickly.
(2) Several methods are used to calculate question similarity, measuring it from multiple angles and combining multiple factors to give more effective recommendation results.
(3) Dynamic data updates are supported, keeping the system synchronized in real time with the posting system.
The design and implementation of the system are as follows:
(1) A search system is used to screen the preliminary candidate set quickly.
Because the number of posts in an online community is at least in the millions, the system uses a search system to provide the preliminary candidate set. The search system records questions and answers together with their keywords and phrases. Searching it for the keywords and phrases in the user's input yields a preliminary candidate question set. This candidate set is N times the size of the recommendation result set, where N can be set as required; preliminary screening can thus be done quickly enough to meet the system's real-time requirement. Using a search system also supports adding, deleting, and modifying posts at any time, keeping the system synchronized in real time with the posting system.
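As an illustration only, the screening step of section (1) can be sketched with a toy in-memory inverted index in Python. The class `KeywordIndex`, its method names, and the default multiple `n=5` are hypothetical choices for this sketch; the patent does not say which search system is used or how it orders its hits.

```python
from collections import Counter, defaultdict


class KeywordIndex:
    """Toy inverted index mapping each keyword to the ids of posts containing it."""

    def __init__(self):
        self.postings = defaultdict(set)   # keyword -> {post_id, ...}
        self.keywords_of = {}              # post_id -> set of its keywords

    def add(self, post_id, keywords):
        """Add a post, or replace it if it already exists (supports modification)."""
        self.remove(post_id)
        self.keywords_of[post_id] = set(keywords)
        for kw in self.keywords_of[post_id]:
            self.postings[kw].add(post_id)

    def remove(self, post_id):
        """Delete a post from the index; a no-op if the post is unknown."""
        for kw in self.keywords_of.pop(post_id, set()):
            self.postings[kw].discard(post_id)

    def preliminary_candidates(self, query_keywords, result_size, n=5):
        """Return up to result_size * n post ids, ordered by number of matched keywords."""
        hits = Counter()
        for kw in set(query_keywords):
            for post_id in self.postings.get(kw, ()):
                hits[post_id] += 1
        # Most matched keywords first; ties broken by post id for determinism.
        ranked = sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))
        return [post_id for post_id, _ in ranked[: result_size * n]]
```

Because `add` and `remove` update the index in place, a newly posted, edited, or deleted post is reflected in the next query, which is the property the text relies on for real-time synchronization.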
(2) Similarity is measured.
The system uses keyword similarity, keyword Jaccard similarity (Jaccard coefficient), keyword dispersion similarity, product name (product word) similarity, and semantic similarity as measures. The score from each measure is multiplied by its weight, and the sum of these products is the final question similarity score. Sorting by this score gives the ranked candidate question set.
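The weighted combination described above can be sketched as follows. This is a minimal Python illustration; the function names, the dictionary layout of a candidate, and any weight values are assumptions, since the patent does not disclose the weights.

```python
def final_score(scores, weights):
    """Weighted sum of the individual similarity scores.

    scores and weights are dicts keyed by measure name (hypothetical names such
    as "keyword" or "semantic"); every weighted measure must have a score.
    """
    return sum(weights[name] * scores[name] for name in weights)


def rank_candidates(candidates, weights):
    """Sort candidate questions from highest to lowest final score."""
    return sorted(
        candidates,
        key=lambda c: final_score(c["scores"], weights),
        reverse=True,
    )
```

For example, with weights `{"keyword": 0.5, "semantic": 0.5}`, a candidate scoring 0.2 on keywords and 0.9 on semantics ranks above one scoring 0.6 and 0.4.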
A. Calculating keyword similarity.
Keyword similarity is calculated from the weight of each keyword in the question and the degree of keyword overlap between the user's input and the candidate question (including its corresponding answer).
To calculate the keyword similarity of a question, the keywords and phrases contained in the question must first be extracted. The quality of the keyword library is therefore very important for the similarity measure.
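The text does not give a formula for keyword similarity. One plausible reading, sketched here in Python under that assumption, is the weighted share of the query's keywords that also occur in the candidate; the function name and the `weights` mapping are hypothetical.

```python
def keyword_similarity(query_keywords, candidate_keywords, weights):
    """Weighted overlap: weight of shared keywords / weight of all query keywords.

    weights maps a keyword to its importance (for example a TF-IDF value);
    keywords missing from the mapping count as weight 0.
    """
    query = set(query_keywords)
    shared = query & set(candidate_keywords)
    total = sum(weights.get(kw, 0.0) for kw in query)
    if total == 0:
        return 0.0
    return sum(weights.get(kw, 0.0) for kw in shared) / total
```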
The keyword library has two sources. First, posts from the website are collected to form a corpus, the corpus is segmented into words, the TF-IDF value (a weighting commonly used in information retrieval and text mining) of each word is calculated, the words are sorted by that value, and the top N are added to the keyword library; in the process, common stop words and meaningless words must be removed from these N words. Second, keywords from the same domain are collected from the web and added to the library. The key phrase library is built in a similar way.
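The first source of the keyword library can be sketched as below. This is only an illustration: the patent does not state which TF-IDF variant is used, so the sketch assumes a corpus-level term frequency multiplied by the logarithm of the inverse document frequency, and it filters stop words before taking the top N rather than after. The function name and parameters are hypothetical.

```python
import math
from collections import Counter


def build_keyword_library(tokenized_posts, top_n, stop_words=frozenset()):
    """Rank words of an already segmented corpus by TF-IDF and keep the top_n.

    tokenized_posts is a list of token lists, one per post.
    """
    n_docs = len(tokenized_posts)
    term_freq = Counter()   # occurrences of each word in the whole corpus
    doc_freq = Counter()    # number of posts containing each word
    for tokens in tokenized_posts:
        term_freq.update(tokens)
        doc_freq.update(set(tokens))
    total = sum(term_freq.values())
    scores = {
        term: (term_freq[term] / total) * math.log(n_docs / doc_freq[term])
        for term in term_freq
        if term not in stop_words
    }
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_n]
```

A word that appears in every post gets an inverse document frequency of zero under this variant, so ubiquitous words fall to the bottom of the ranking even before the stop word list is applied.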
B. Calculating keyword Jaccard similarity.
The user's question and a candidate question share some keywords, and some keywords occur repeatedly within a question. Keyword Jaccard similarity removes the influence of these repeated keywords and counts how many distinct keywords the two questions share.
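A minimal Python sketch of this measure follows. It uses the standard Jaccard coefficient, the size of the intersection divided by the size of the union; converting each keyword list to a set is what removes the effect of repeated keywords. The function name is hypothetical, and returning 0 for two empty keyword lists is a choice made for this sketch.

```python
def keyword_jaccard(query_keywords, candidate_keywords):
    """Jaccard coefficient of the two distinct-keyword sets."""
    a, b = set(query_keywords), set(candidate_keywords)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```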
C. Calculating keyword dispersion similarity.
Keyword dispersion similarity measures whether keywords are distributed in the same way in the user's search content and in the information in the preliminary candidate set, that is, whether they are spread evenly or concentrated in one place. Typically, the user's search content and the information in the preliminary candidate set are cut into clauses, and the number of clauses containing a common keyword is counted as the keyword dispersion similarity score.
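The clause-counting idea can be sketched in Python as below. The clause delimiters, the substring matching, and the normalization by the total number of clauses are assumptions made for this sketch; the patent only says that clauses containing common keywords are counted.

```python
import re

# Western and Chinese punctuation treated as clause boundaries (an assumption).
CLAUSE_DELIMITERS = r"[,.;!?，。；！？、]"


def split_clauses(text):
    """Cut a text into non-empty clauses at punctuation marks."""
    return [c.strip() for c in re.split(CLAUSE_DELIMITERS, text) if c.strip()]


def keyword_dispersion_similarity(query, candidate, keywords):
    """Fraction of clauses, over both texts, that contain a common keyword."""
    common = [kw for kw in keywords if kw in query and kw in candidate]
    clauses = split_clauses(query) + split_clauses(candidate)
    if not common or not clauses:
        return 0.0
    covered = sum(1 for clause in clauses if any(kw in clause for kw in common))
    return covered / len(clauses)
```

A score near 1 means the shared keywords run through most clauses of both texts, while a low score means they are concentrated in a few clauses.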
D. Calculating product word similarity.
Many user questions concern products such as software, and questions about different products should not be treated as similar. For example, if two questions contain the same product word, the product word similarity is 1; if they do not, the product word similarity is 0.
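The 0/1 rule above can be written directly, as in the following Python sketch. The function name and the idea of passing in a product word list are assumptions; the product names in the usage example are placeholders, not products named in the patent.

```python
def product_word_similarity(query, candidate, product_words):
    """Return 1.0 if both texts mention at least one common product word, else 0.0."""
    in_query = {p for p in product_words if p in query}
    in_candidate = {p for p in product_words if p in candidate}
    return 1.0 if in_query & in_candidate else 0.0
```

For instance, with the placeholder list `["ProductA", "ProductB"]`, two questions that both mention "ProductA" score 1.0, while a question about "ProductA" and one about "ProductB" score 0.0.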
E. Calculating semantic similarity.
The sentence of a question (or the set of sentences of a question, including the answers to the question; hereinafter simply "the sentence") can be cut into smaller basic units. The system composes the semantics of the question from the semantics of these basic units, and then uses the question semantic vectors to calculate the similarity between questions.
As shown in Fig. 4, unit semantic vectors are trained by a language model. A sentence in the search content has two kinds of basic semantic unit, the character and the word, and either character semantics or word semantics can be composed into sentence semantics. With the word as the basic unit, the sentence must be segmented into words, dividing it into multiple basic semantic units; with the character as the basic unit, the sentence must be cut character by character, likewise dividing it into multiple basic semantic units. The unit semantic vector is therefore a character vector or a word vector.
For question 1 and question 2, unit semantic vectors are trained by the language model for each: each sentence is cut into multiple basic semantic units and its semantics are computed, and the semantic similarity between question 1 and question 2 is then calculated.
Both methods train character semantic vectors or word semantic vectors with a language model on a previously prepared text corpus. A language model is a probabilistic model that computes the probability of a sentence; it is based on the Markov assumption, that is, the occurrence of the next word depends only on the one or several words before it. On this principle, character vectors or word vectors can be trained from the corpus. The relationship between two semantic vectors trained in this way is reflected directly in their difference, which is the difference in the mathematical sense, obtained by element-wise subtraction. For example, semantic("king") - semantic("queen") ≈ semantic("man") - semantic("woman"), and the vector closest to semantic("king") - semantic("man") + semantic("woman") is semantic("queen").
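The vector arithmetic in the king and queen example can be checked with a small Python sketch. The three-dimensional vectors below are written by hand purely for illustration (the dimensions stand loosely for royalty, male, and female); they are not the output of any trained language model, and the helper names are hypothetical.

```python
import math

# Hand-made toy vectors (royalty, male, female); NOT trained embeddings.
VECTORS = {
    "king": (1.0, 1.0, 0.0),
    "queen": (1.0, 0.0, 1.0),
    "man": (0.0, 1.0, 0.0),
    "woman": (0.0, 0.0, 1.0),
    "child": (0.0, 0.5, 0.5),
}


def subtract(u, v):
    """Element-wise difference of two vectors."""
    return tuple(a - b for a, b in zip(u, v))


def add(u, v):
    """Element-wise sum of two vectors."""
    return tuple(a + b for a, b in zip(u, v))


def nearest_word(target, vectors, exclude=()):
    """Word whose vector has the smallest Euclidean distance to target."""
    candidates = [w for w in vectors if w not in exclude]
    return min(candidates, key=lambda w: math.dist(target, vectors[w]))


# king - man + woman, then look for the closest remaining word.
target = add(subtract(VECTORS["king"], VECTORS["man"]), VECTORS["woman"])
closest = nearest_word(target, VECTORS, exclude={"king", "man", "woman"})
```

With real trained vectors the two differences are only approximately equal, which is why the text speaks of the closest vector rather than an exact match.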
The key point of this technical solution is how to obtain the sentence vector from the character vectors or word vectors: the quality of the sentence vector determines the quality of the sentence similarity, which in turn affects the recommendation results. The sentence vector can be computed in two ways: as the cumulative sum of the character (or word) semantic vectors, or as their mean.
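Both ways of building a sentence vector can be sketched in Python as follows. The patent does not name the function used to compare two sentence vectors; cosine similarity is assumed here because it is a common choice. The function names and the handling of units without a vector are also assumptions.

```python
import math


def sentence_vector(tokens, unit_vectors, mode="mean"):
    """Combine unit (character or word) vectors into one sentence vector.

    mode="sum" gives the cumulative sum, mode="mean" the average.
    Units without a vector are skipped; returns None if none are known.
    """
    known = [unit_vectors[t] for t in tokens if t in unit_vectors]
    if not known:
        return None
    summed = [sum(dim) for dim in zip(*known)]
    if mode == "sum":
        return summed
    return [x / len(known) for x in summed]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

Under the cosine assumption, the sum and the mean of the same unit vectors give the same similarity, because they differ only by a positive scale factor; the two would differ under a distance-based comparison.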
With the above technical solution, the problem that a pure keyword search cannot establish the specific semantics of the search content is avoided, information semantically similar to the search content is provided more accurately, the user's search efficiency is improved, and users are kept from posting duplicate questions.
Finally, the results are filtered and the recommendations are produced. A question in the online community that has not been answered is not worth recommending, so questions without answers are filtered out when the recommendation results are produced. If an answer to a question has been recommended by a community expert, the answer has the expert's approval, so such questions are placed first in the recommendation results. Finally, the top N items of the ranked candidate set are selected as the recommendation results and presented to the user.
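The filtering step can be sketched in Python as below. The record fields `answers` and `expert_recommended` are hypothetical names for this sketch, and the input is assumed to be already sorted by similarity score.

```python
def recommend(ranked_candidates, top_n):
    """Drop unanswered questions, move expert-approved ones first, keep top_n.

    ranked_candidates must already be ordered from most to least similar.
    """
    answered = [c for c in ranked_candidates if c["answers"]]
    # sorted() is stable, so similarity order is kept within each group;
    # False sorts before True, hence the negation.
    prioritized = sorted(answered, key=lambda c: not c["expert_recommended"])
    return prioritized[:top_n]
```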
In addition, because the keyword library is extracted from users' questions and answers, some emoticon words enter the keyword library because of their high TF-IDF values. Emoticon words affect the recommendation results: for example, if a user's question contains an emoticon, questions containing the same emoticon word enter the preliminary candidate set and then the recommendation set, which does not match what the recommendation is expected to do. The emoticon words therefore need to be removed from the keyword library.
Similarly, common stop words such as "consult" and "may I ask" are meaningless, so words of this kind also need to be removed from the keyword library.
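The clean-up of the keyword library can be sketched in a few lines of Python. The function name and the sample word lists in the note below are assumptions; the patent does not describe how emoticon words are identified.

```python
def clean_keywords(keywords, emoticon_words, stop_words):
    """Remove emoticon words and stop words, keeping the original keyword order."""
    blocked = set(emoticon_words) | set(stop_words)
    return [kw for kw in keywords if kw not in blocked]
```

For example, `clean_keywords(["may I ask", "income tax", "[smile]"], ["[smile]"], ["may I ask", "consult"])` keeps only `"income tax"`.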
Fig. 5 shows a schematic diagram of an information recommendation interface according to an embodiment of the present invention.
As shown in Fig. 5, the information recommendation system of the present invention is applied on the finance and accounting website "Accounting Home". When a user browses a post on the website, the system recommends similar questions. The user asks: "On the quarterly income tax return, how should operating revenue, operating cost, and total profit be filled in?" Through the semantic similarity calculation, the system then shows a similar question: "On the quarterly income tax return, is operating cost equal to operating revenue minus total profit? Or should it follow the operating cost on the income statement ...?" together with the answers to such similar questions, thereby resolving the user's question.
Fig. 6 shows a schematic diagram of an information recommendation interface according to another embodiment of the present invention.
As shown in Fig. 6, in an information recommendation interface according to another embodiment of the present invention, the user asks: "Income tax was paid at this year's final settlement and was later refunded by the tax bureau; how should the accounting entries be made?" Through the semantic similarity calculation, the system then shows several similar questions and their answers. The semantics of the user's question are analyzed correctly, improving the user experience.
The technical solution of the present invention has been described above with reference to the accompanying drawings. With it, information similar to the search content can be provided more accurately, the user's search efficiency is improved, and users are kept from posting duplicate questions, which improves the user experience.
The above are only preferred embodiments of the present invention and are not intended to limit it; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A similar information recommendation method, characterized by comprising:
determining a preliminary candidate set according to keywords in search content;
determining, in the preliminary candidate set, similar information corresponding to the search content according to the semantic similarity between the search content and each piece of information in the preliminary candidate set; and
displaying the similar information.
2. The similar information recommendation method according to claim 1, characterized in that the search content comprises a question being asked, and the information in the preliminary candidate set comprises existing questions and existing answers.
3. The similar information recommendation method according to claim 1 or 2, characterized in that determining, in the preliminary candidate set, the similar information corresponding to the search content according to the semantic similarity between the search content and each piece of information in the preliminary candidate set comprises:
training, by means of a language model, unit semantic vectors for the search content and for the information in the preliminary candidate set, wherein a unit semantic vector is a character vector or a word vector;
calculating, according to the unit semantic vectors, the semantic similarity between the search content and the information in the preliminary candidate set, wherein the semantic similarity comprises: a cumulative sum of character vectors, a cumulative sum of word vectors, a mean of character vectors, or a mean of word vectors; and
displaying the similar information comprises:
sorting the information in the preliminary candidate set from high to low by semantic similarity and displaying it.
4. The similar information recommendation method according to claim 3, characterized in that, before the similar information corresponding to the search content is determined in the preliminary candidate set, the method further comprises:
determining other similarities between the search content and the information in the preliminary candidate set, wherein the other similarities comprise one or a combination of the following: keyword similarity, keyword deduplicated similarity, keyword dispersion similarity, and product word similarity; and
determining, in the preliminary candidate set, the similar information corresponding to the search content specifically comprises:
determining the similar information according to the semantic similarity and the other similarities.
5. The similar information recommendation method according to claim 3, characterized in that, before determining the preliminary candidate set according to the keywords in the search content, the method further comprises:
removing emoticon words and stop words from the keywords.
6. A similar information recommendation system, characterized by comprising:
a candidate set determining unit, configured to determine a preliminary candidate set according to keywords in search content;
a similar information determining unit, configured to determine, in the preliminary candidate set, similar information corresponding to the search content according to the semantic similarity between the search content and each piece of information in the preliminary candidate set; and
a similar information display unit, configured to display the similar information.
7. The similar information recommendation system according to claim 6, characterized in that the search content comprises a question being asked, and the information in the preliminary candidate set comprises existing questions and existing answers.
8. The similar information recommendation system according to claim 6 or 7, characterized in that the similar information determining unit comprises:
a vector training unit, configured to train, by means of a language model, unit semantic vectors for the search content and for the information in the preliminary candidate set, wherein a unit semantic vector is a character vector or a word vector;
a semantic similarity calculation unit, configured to calculate, according to the unit semantic vectors, the semantic similarity between the search content and the information in the preliminary candidate set, wherein the semantic similarity comprises: a cumulative sum of character vectors, a cumulative sum of word vectors, a mean of character vectors, or a mean of word vectors; and
the similar information display unit is specifically configured to:
sort the information in the preliminary candidate set from high to low by semantic similarity and display it.
9. The similar information recommendation system according to claim 8, characterized by further comprising:
an other-similarity determining unit, configured to determine, before the similar information corresponding to the search content is determined in the preliminary candidate set, other similarities between the search content and the information in the preliminary candidate set, wherein the other similarities comprise one or a combination of the following: keyword similarity, keyword deduplicated similarity, keyword dispersion similarity, and product word similarity; and
the similar information determining unit is configured to:
determine the similar information according to the semantic similarity and the other similarities.
10. The similar information recommendation system according to claim 9, characterized by further comprising:
a removal unit, configured to remove emoticon words and stop words from the keywords before the preliminary candidate set is determined according to the keywords in the search content.
CN201511017551.XA 2015-12-29 2015-12-29 Similar information recommendation method and system Pending CN105653671A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201511017551.XA CN105653671A (en) 2015-12-29 2015-12-29 Similar information recommendation method and system

Publications (1)

Publication Number Publication Date
CN105653671A true CN105653671A (en) 2016-06-08

Family

ID=56477329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201511017551.XA Pending CN105653671A (en) 2015-12-29 2015-12-29 Similar information recommendation method and system

Country Status (1)

Country Link
CN (1) CN105653671A (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106407280A (en) * 2016-08-26 2017-02-15 合网络技术(北京)有限公司 Query target matching method and device
CN107169077A (en) * 2017-05-10 2017-09-15 百度在线网络技术(北京)有限公司 Method and apparatus for pushed information
CN107423284A (en) * 2017-06-14 2017-12-01 中国科学院自动化研究所 Merge the construction method and system of the sentence expression of Chinese language words internal structural information
WO2018033030A1 (en) * 2016-08-19 2018-02-22 中兴通讯股份有限公司 Natural language library generation method and device
CN108196926A (en) * 2017-12-29 2018-06-22 努比亚技术有限公司 Content of platform identification method, terminal and computer readable storage medium
CN109033305A (en) * 2018-07-16 2018-12-18 深圳前海微众银行股份有限公司 Question answering method, equipment and computer readable storage medium
CN109033156A (en) * 2018-06-13 2018-12-18 腾讯科技(深圳)有限公司 A kind of information processing method, device and terminal
CN109063000A (en) * 2018-07-06 2018-12-21 深圳前海微众银行股份有限公司 Question sentence recommended method, customer service system and computer readable storage medium
CN110019669A (en) * 2017-10-31 2019-07-16 北京国双科技有限公司 A kind of text searching method and device
CN110019670A (en) * 2017-10-31 2019-07-16 北京国双科技有限公司 A kind of text searching method and device
CN110020171A (en) * 2017-12-28 2019-07-16 阿里巴巴集团控股有限公司 Data processing method, device, equipment and computer readable storage medium
CN110019668A (en) * 2017-10-31 2019-07-16 北京国双科技有限公司 A kind of text searching method and device
CN110096567A (en) * 2019-03-14 2019-08-06 中国科学院自动化研究所 Selection method, system are replied in more wheels dialogue based on QA Analysis of Knowledge Bases Reasoning
CN112613320A (en) * 2019-09-19 2021-04-06 北京国双科技有限公司 Method and device for acquiring similar sentences, storage medium and electronic equipment
JP2021099774A (en) * 2019-12-20 2021-07-01 ベイジン バイドゥ ネットコム サイエンス アンド テクノロジー カンパニー リミテッド Vectorized representation method of document, vectorized representation device of document, and computer device
CN113239276A (en) * 2021-05-31 2021-08-10 上海明略人工智能(集团)有限公司 Method and device for determining recommended materials based on session information
WO2022111726A1 (en) * 2020-11-30 2022-06-02 华为技术有限公司 Information sorting method, and electronic device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101118554A (en) * 2007-09-14 2008-02-06 中兴通讯股份有限公司 Intelligent interactive request-answering system and processing method thereof
CN101373532A (en) * 2008-07-10 2009-02-25 昆明理工大学 FAQ Chinese request-answering system implementing method in tourism field
CN101408894A (en) * 2007-10-12 2009-04-15 莱克西私人有限公司 System and method for enhancing search relevancy using semantic keys
CN103218373A (en) * 2012-01-20 2013-07-24 腾讯科技(深圳)有限公司 System, method and device for relevant searching
CN103577558A (en) * 2013-10-21 2014-02-12 北京奇虎科技有限公司 Device and method for optimizing search ranking of frequently asked question and answer pairs
CN104331523A (en) * 2014-11-27 2015-02-04 韩慧健 Conceptual object model-based question searching method
CN104573028A (en) * 2015-01-14 2015-04-29 百度在线网络技术(北京)有限公司 Intelligent question-answer implementing method and system
CN105183714A (en) * 2015-08-27 2015-12-23 北京时代焦点国际教育咨询有限责任公司 Sentence similarity calculation method and apparatus

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
EASTMOUNT: "Python简单实现基于VSM的余弦相似度计算", 《CSDN HTTPS://BLOG.CSDN.NET/EASTMOUNT/ARTICLE/DETAILS/49898133》 *
LIANGXIAXU: "文本相似度算法", 《博客园 HTTPS://WWW.CNBLOGS.COM/LIANGXIAXU/ARCHIVE/2012/05/05/2484972.HTML》 *
阮一峰: "TF-IDF与余弦相似性的应用(二):找出相似文章", 《阮一峰的网络日志 HTTP://WWW.RUANYIFENG.COM/BLOG/2013/03/COSINE_SIMILARITY.HTML》 *

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018033030A1 (en) * 2016-08-19 2018-02-22 中兴通讯股份有限公司 Natural language library generation method and device
CN106407280A (en) * 2016-08-26 2017-02-15 合网络技术(北京)有限公司 Query target matching method and device
CN107169077A (en) * 2017-05-10 2017-09-15 百度在线网络技术(北京)有限公司 Method and apparatus for pushed information
CN107423284A (en) * 2017-06-14 2017-12-01 中国科学院自动化研究所 Merge the construction method and system of the sentence expression of Chinese language words internal structural information
CN107423284B (en) * 2017-06-14 2020-03-06 中国科学院自动化研究所 Method and system for constructing sentence representation fusing internal structure information of Chinese words
CN110019669B (en) * 2017-10-31 2021-06-29 北京国双科技有限公司 Text retrieval method and device
CN110019669A (en) * 2017-10-31 2019-07-16 北京国双科技有限公司 Text retrieval method and device
CN110019670A (en) * 2017-10-31 2019-07-16 北京国双科技有限公司 Text retrieval method and device
CN110019668A (en) * 2017-10-31 2019-07-16 北京国双科技有限公司 Text retrieval method and device
CN110020171B (en) * 2017-12-28 2023-05-16 阿里巴巴集团控股有限公司 Data processing method, device, equipment and computer readable storage medium
CN110020171A (en) * 2017-12-28 2019-07-16 阿里巴巴集团控股有限公司 Data processing method, device, equipment and computer readable storage medium
CN108196926A (en) * 2017-12-29 2018-06-22 努比亚技术有限公司 Platform content identification method, terminal and computer readable storage medium
CN108196926B (en) * 2017-12-29 2021-03-26 努比亚技术有限公司 Platform content identification method, terminal and computer readable storage medium
CN109033156A (en) * 2018-06-13 2018-12-18 腾讯科技(深圳)有限公司 Information processing method, device and terminal
CN109033156B (en) * 2018-06-13 2021-06-15 腾讯科技(深圳)有限公司 Information processing method and device and terminal
CN109063000A (en) * 2018-07-06 2018-12-21 深圳前海微众银行股份有限公司 Question recommendation method, customer service system and computer readable storage medium
CN109063000B (en) * 2018-07-06 2022-02-01 深圳前海微众银行股份有限公司 Question recommendation method, customer service system and computer-readable storage medium
CN109033305A (en) * 2018-07-16 2018-12-18 深圳前海微众银行股份有限公司 Question answering method, equipment and computer readable storage medium
CN109033305B (en) * 2018-07-16 2022-04-01 深圳前海微众银行股份有限公司 Question answering method, device and computer readable storage medium
CN110096567A (en) * 2019-03-14 2019-08-06 中国科学院自动化研究所 QA knowledge base reasoning-based multi-round dialogue reply selection method and system
CN110096567B (en) * 2019-03-14 2020-12-25 中国科学院自动化研究所 QA knowledge base reasoning-based multi-round dialogue reply selection method and system
CN112613320A (en) * 2019-09-19 2021-04-06 北京国双科技有限公司 Method and device for acquiring similar sentences, storage medium and electronic equipment
JP7194150B2 (en) 2019-12-20 2022-12-21 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Document vector representation method, document vector representation device and computer equipment
JP2021099774A (en) * 2019-12-20 2021-07-01 ベイジン バイドゥ ネットコム サイエンス アンド テクノロジー カンパニー リミテッド Vectorized representation method of document, vectorized representation device of document, and computer device
US11403468B2 (en) 2019-12-20 2022-08-02 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for generating vector representation of text, and related computer device
WO2022111726A1 (en) * 2020-11-30 2022-06-02 华为技术有限公司 Information sorting method, and electronic device
CN113239276A (en) * 2021-05-31 2021-08-10 上海明略人工智能(集团)有限公司 Method and device for determining recommended materials based on session information

Similar Documents

Publication Publication Date Title
CN105653671A (en) Similar information recommendation method and system
US9990356B2 (en) Device and method for analyzing reputation for objects by data mining
KR101536520B1 (en) Method and server for extracting topic and evaluating compatibility of the extracted topic
US20180165712A1 (en) Method and apparatus for composing search phrases, distributing ads and searching product information
CN103729359B (en) Method and system for recommending search words
CN101408885B (en) Modeling topics using statistical distributions
CN103177090B (en) Topic detection method and device based on big data
CN110347814B (en) Lawyer accurate recommendation method and system
CN104077407B (en) Intelligent data search system and method
US20140081995A1 (en) Method and System for Creating a Data Profile Engine, Tool Creation Engines and Product Interfaces for Identifying and Analyzing File and Sections of Files
US20130080422A1 (en) Method, Apparatus and System of Intelligent Navigation
US20150205580A1 (en) Method and System for Sorting Online Videos of a Search
CN102262663B (en) Method for repairing software defect reports
CN106327227A (en) Information recommendation system and information recommendation method
CN104063523A (en) E-commerce search scoring and ranking method and system
CN101566997A (en) Determining words related to given set of words
CN104462336A (en) Information pushing method and device
CN107544988A (en) Method and apparatus for obtaining public sentiment data
CN107885857B (en) Method, apparatus and system for mining user behavior patterns on search results pages
KR20160089177A (en) Polarity-based user opinion ranking algorithm and system
CN104050243B (en) Network search method and system combining search with social networking
CN112418695A (en) Multi-dimensional portrait construction method and recommendation method for scientific researchers in tobacco field
CN105740434A (en) Network information scoring method and device
CN103425705B (en) Method and device for acquiring negative keywords, and search method and device
KR101902460B1 (en) Device for document categorizing

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160608