CN104077327B - Method and apparatus for identifying core word importance, and method and apparatus for ranking search results - Google Patents

Method and apparatus for identifying core word importance, and method and apparatus for ranking search results

Info

Publication number
CN104077327B
Authority
CN
China
Prior art keywords
information
core word
word
user
core
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310109430.2A
Other languages
Chinese (zh)
Other versions
CN104077327A (en
Inventor
宁伟
黄云平
顾湘余
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201310109430.2A
Publication of CN104077327A
Application granted
Publication of CN104077327B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a method and apparatus for identifying core word importance, and to a method and apparatus for ranking search results. The core word importance identification method is used to identify the importance of core words in information published by an information publishing user, and comprises: determining a plurality of core words in the published information; assigning a corresponding initial weight value to each of the plurality of core words according to features of the published information; and adjusting the corresponding initial weight value of each core word according to the historical behavior log of the information publishing user, to obtain a corresponding final weight value. This improves the accuracy with which core word importance is identified, and thereby the accuracy of searching and ranking relevant published information.

Description

Method and apparatus for identifying core word importance, and method and apparatus for ranking search results
Technical field
The application relates to the field of computer technology, and more particularly to a core word importance identification method and apparatus for identifying the importance of core words in information published by an information publishing user, and to a search result ranking method and apparatus.
Background
Although it is presented under the heading of background, the following disclosure also contains findings and ideas of the present inventors, so it should not all be regarded as prior art.
With the rapid development of the Internet, publishing and searching for information through network platforms has become part of people's daily lives. Search engine technology is therefore continually being innovated and developed to meet people's expectations and needs.
In this specification, an "information publishing user" is a user who publishes information on a network platform, and an "information search user" is a user who searches for information on a network platform.
Conventionally, search engines commonly use the following information retrieval model: an index is built over the information published by information publishing users (referred to as "published information"); the relevance of each piece of published information to the search term of an information search user is then determined by literal hit and matching; and the published information relevant to the search term is displayed in order of relevance.
However, in such a method, relying on literal hit and matching alone can recall a large amount of information that does not meet the need behind the search term. For example, suppose two pieces of published information exist on the network platform: (A) "Supply new-model Nokia smart mobile phone"; (B) "Supply Nokia mobile phone battery". When an information search user searches with "mobile phone" as the search term, the retrieval logic of a typical search engine uses literal hit and matching, determines that both pieces of published information are relevant to the search term, and recalls both. In fact, only (A) meets the need of the information search user; (B) is irrelevant to that user. The search accuracy is therefore low and user needs are not met well.
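The over-recall described above can be reproduced in a few lines of Python. This is only an illustrative sketch: `literal_match` and the two listing strings are hypothetical stand-ins, and a real engine would match against an index rather than scan strings.

```python
def literal_match(search_term: str, published: list[str]) -> list[str]:
    """Recall every piece of published information that literally contains the search term."""
    needle = search_term.lower()
    return [text for text in published if needle in text.lower()]


published = [
    "Supply new Nokia smart mobile phone",  # (A) what the searcher wants
    "Supply Nokia mobile phone battery",    # (B) an accessory, not a phone
]

# Both (A) and (B) contain the literal string, so both are recalled.
recalled = literal_match("mobile phone", published)
```

Nothing in the literal test distinguishes the product being sold from a word that merely appears in its description, which is the gap that core word identification is meant to close.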
To solve this problem, search engines at present generally analyze the published information to identify its core notional words or core product words (referred to as "core words") and use them as the basis for calculating the relevance of recalled results, rather than determining relevance by literally matching the whole piece of published information against the search term of the information search user. This reduces, to some extent, the recall of information that matches literally but is irrelevant in meaning.
In such a method, core words are usually identified as follows. The published information of the information publishing user is subjected to word segmentation and part-of-speech tagging: the published information is labeled by matching against a pre-prepared part-of-speech or attribute dictionary, or parts of speech or attributes are labeled automatically by a machine learning method. The core words of the published information are then identified according to a predetermined rule, such as treating words labeled as nouns as core words.
For published information whose description is non-standard or contains a lot of descriptive content, this method may identify multiple core words. For example, for the published information "Supply new-model notebook computer, 500G hard disk, 4G memory, 15-inch liquid crystal display", the above method identifies "notebook computer", "hard disk", "memory" and "liquid crystal display" all as core words. In such a case, the importance of these core words in the published information is usually identified by a method such as TF-IDF (term frequency-inverse document frequency); the relevance of the published information to the search term is then determined from the importance of the core words, and the published information relevant to the search term is displayed in order of relevance.
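For reference, a minimal sketch of TF-IDF scoring over already-identified core words follows. It uses one common variant (relative term frequency multiplied by log(N / df)); the patent does not say which variant is meant, and the function name `tf_idf` and the toy corpus are assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(term: str, doc: list[str], corpus: list[list[str]]) -> float:
    """One common TF-IDF variant: relative term frequency times log(N / df)."""
    tf = Counter(doc)[term] / len(doc)
    df = sum(1 for other in corpus if term in other)
    if df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)


# Each document is the list of core words found in one piece of published information.
corpus = [
    ["notebook computer", "hard disk", "memory"],
    ["hard disk", "enclosure"],
    ["memory", "hard disk"],
]
scores = {term: tf_idf(term, corpus[0], corpus) for term in corpus[0]}
```

In this toy corpus "hard disk" appears in every document, so its score is zero, while "notebook computer" scores highest. The score depends only on the text itself and the corpus, which is the limitation the inventors go on to discuss.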
However, the inventors have found that in this multiple-core-word case, that is, when the description of the published information is non-standard or contains a lot of descriptive content, search accuracy is often not high. A technique that can improve search accuracy is therefore desired to overcome this defect.
Summary of the invention
The inventors have noticed that, in the core word importance identification of conventional methods, the importance of a word increases in proportion to the number of times it appears in a document, but decreases in inverse proportion to the frequency with which it appears in the corpus. Only the descriptive content of the published information itself is used. Such identification is not accurate enough and leads to errors in the subsequent relevance calculation for results recalled by the search engine, so that relevance ranking is not accurate enough and user needs are not met well.
An object of the application is therefore to provide a technique for identifying the importance of core words in information published by an information publishing user.
A further object is to provide a technique for ranking search results of relevant published information.
According to an embodiment of one aspect of the application, a core word importance identification method is provided for identifying the importance of core words in information published by an information publishing user, comprising: determining a plurality of core words in the published information; assigning a corresponding initial weight value to each of the plurality of core words according to features of the published information; and adjusting the corresponding initial weight value of each core word according to the historical behavior log of the information publishing user, to obtain a corresponding final weight value.
According to an embodiment of one aspect of the application, a search result ranking method is also provided, comprising: receiving a search term input by an information query user; determining the relevance of published information to the search term based on the importance of core words in the information published by an information publishing user; and ranking and displaying the published information according to the relevance, wherein the importance of the core words is identified by the following steps: determining a plurality of core words in the published information; assigning a corresponding initial weight value to each of the plurality of core words according to features of the published information; and adjusting the corresponding initial weight value of each core word according to the historical behavior log of the information publishing user, to obtain a corresponding final weight value.
According to an embodiment of another aspect of the application, a core word importance identification apparatus is provided for identifying the importance of core words in information published by an information publishing user, comprising: a core word determining device for determining a plurality of core words in the published information; an assigning device for assigning a corresponding initial weight value to each of the plurality of core words according to features of the published information; and an adjusting device for adjusting the corresponding initial weight value of each core word according to the historical behavior log of the information publishing user, to obtain a corresponding final weight value.
According to an embodiment of another aspect of the application, a search result ranking apparatus is also provided, comprising: a search term receiving device for receiving a search term input by an information query user; a relevance determining device for determining the relevance of published information to the search term based on the importance of core words in the information published by an information publishing user; and a ranking and display device for ranking and displaying the published information according to the relevance, wherein the importance of the core words is identified by the following steps: determining a plurality of core words in the published information; assigning a corresponding initial weight value to each of the plurality of core words according to features of the published information; and adjusting the corresponding initial weight value of each core word according to the historical behavior log of the information publishing user, to obtain a corresponding final weight value.
Compared with the prior art, the technical solution of the application identifies the importance of core words in published information not only from features of the published information itself but also in combination with other suitable auxiliary information, such as the historical behavior log of the information publishing user, thereby improving the accuracy with which the importance of core words in published information is identified. Correspondingly, search accuracy can be improved, that is, the ranking accuracy of published information relevant to a search term is improved, so that user needs are better met.
Brief description of the drawings
The drawings described here are provided for further understanding of the application and form a part of it. The illustrative embodiments of the application and their description are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 shows a flowchart of a core word importance identification method for identifying the importance of core words in information published by an information publishing user, according to one embodiment of the application;
Fig. 2 shows a flowchart of a search result ranking method according to one embodiment of the application;
Fig. 3 shows a schematic block diagram of a core word importance identification apparatus for identifying the importance of core words in information published by an information publishing user, according to one embodiment of the application; and
Fig. 4 shows a schematic block diagram of a search result ranking apparatus according to one embodiment of the application.
Detailed description
As described above, the inventors have noticed that identifying the importance of core words solely from features of the published information itself does not give high identification accuracy. The inventors therefore conceived that suitable auxiliary information, beyond the features of the published information itself, can be used to optimize the result of core word importance identification.
The main idea of the application is that, in addition to features of the published information itself, the historical behavior log of the information publishing user is also considered when identifying the importance of core words in that user's published information, thereby improving the accuracy of importance identification.
The inventors have also noticed that feedback information from information query users is an important, high-quality source of information that can be used to optimize the result of core word importance identification. It was further observed that personal information of the information publishing user can also serve as a good basis for improving the accuracy of importance identification.
To make the objects, technical solutions and advantages of the application clearer, the application is described in further detail below with reference to the drawings and specific embodiments.
According to an embodiment of one aspect of the application, a core word importance identification method is provided for identifying the importance of core words in information published by an information publishing user.
Fig. 1 shows a flowchart of a core word importance identification method for identifying the importance of core words in information published by an information publishing user, according to one embodiment of the application.
As shown in Fig. 1, at step S110, a plurality of core words in the published information are determined.
Analysis such as word segmentation and part-of-speech tagging can be performed on the information published by the information publishing user, and the core words in the published information can be determined according to a predetermined rule.
In a specific embodiment, for example, word segmentation and part-of-speech tagging can be performed on the published information, and words labeled as nouns or as product words can be determined to be the core words of the published information.
A piece of published information may contain one or more nouns or product words. The application mainly addresses the case where a piece of published information contains multiple core words.
It should be pointed out that the core words in the published information can be determined in any suitable manner known in the art or developed in the future, and are not limited to the manner listed above.
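Step S110 can be sketched as follows, under the "nouns are core words" rule mentioned above. The lexicon `POS_LEXICON`, the function `determine_core_words` and the pre-segmented token list are all hypothetical; a real system would use an actual word segmenter and part-of-speech tagger in their place.

```python
# Hypothetical part-of-speech lexicon standing in for a real tagger.
POS_LEXICON = {
    "supply": "verb",
    "new": "adjective",
    "notebook computer": "noun",
    "500G": "quantity",
    "hard disk": "noun",
    "4G": "quantity",
    "memory": "noun",
}


def determine_core_words(tokens: list[str], lexicon: dict[str, str]) -> list[str]:
    """Step S110 sketch: keep the tokens tagged as nouns, in order, without duplicates."""
    core_words: list[str] = []
    for token in tokens:
        if lexicon.get(token) == "noun" and token not in core_words:
            core_words.append(token)
    return core_words


# The tokens are assumed to come from an upstream word segmenter.
tokens = ["supply", "new", "notebook computer", "500G", "hard disk", "4G", "memory"]
core_words = determine_core_words(tokens, POS_LEXICON)
```

With this input the sketch yields three core words, which is the multiple-core-word situation the rest of the method is designed to handle.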
Next, at step S120, a corresponding initial weight value Score_initial is assigned to each of the plurality of core words according to the features of the core word in the published information.
The initial weight value can be used for a preliminary identification of the importance of the core word in the published information.
The published information can include descriptive content such as the title, attributes and/or details of the published object.
The features of a core word in the published information can include the frequency (number of times) with which the core word appears in the published information and/or its position.
In a specific embodiment, for example, the more times a core word appears in the published information, the higher its weight value Score_initial. In addition, for example, if the core word appears in the title, its weight value Score_initial is high, whereas if it appears only in the detailed description, its weight value Score_initial is low. These features can be used alone or in combination. This can readily be implemented in any suitable manner known in the art or developed in the future, and is not described further here.
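One possible way to combine the frequency and position features is sketched below. The patent leaves the exact scheme open, so the multiplicative form, the `title_factor` value of 2.0 and the normalisation to a sum of 1 are all assumptions chosen only for illustration.

```python
def initial_weights(core_words, title, details, title_factor=2.0):
    """Step S120 sketch: score = occurrence count x position factor, normalised to sum to 1."""
    raw = {}
    for word in core_words:
        count = title.count(word) + details.count(word)
        # A core word that appears in the title is treated as more important.
        position = title_factor if word in title else 1.0
        raw[word] = count * position
    total = sum(raw.values())
    return {word: (score / total if total else 0.0) for word, score in raw.items()}


weights = initial_weights(
    ["notebook computer", "hard disk", "memory"],
    title="Supply new notebook computer",
    details="notebook computer with 500G hard disk and 4G memory",
)
```

Here "notebook computer" appears twice and in the title, so it receives the largest Score_initial, while "hard disk" and "memory" tie.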
As mentioned above, the inventors noticed that existing schemes assign a weight value to each core word based only on features of the published information itself and identify the importance of the multiple core words by this weight value. The accuracy of such importance identification is not high, so the relevance calculated during a search between the published information and the query term input by the user is not accurate. The inventors therefore conceived of adjusting the importance weight values of the core words in the published information using other suitable auxiliary information, so as to improve the accuracy of importance identification and, in turn, the accuracy of the relevance calculated during a search between the published information and the query term input by the user.
According to embodiments of the application, the initial weight value of a core word can be adjusted according to one or more kinds of auxiliary information, such as the historical behavior log of the information publishing user, feedback information of information query users, and personal information of the information publishing user, thereby improving the accuracy with which core word importance is identified, as described below in connection with step S130.
At step S130, the corresponding initial weight value Score_initial_i of each core word is adjusted according to the historical behavior log of the information publishing user, to obtain a corresponding final weight value Score_final_i.
In a specific embodiment, for a core word i in a piece of information published by the information publishing user, the ratio Score_key_i is calculated between the number of times Count_key_i that the core word appears in the historical behavior log of the information publishing user and the sum of the numbers of times that each core word in this piece of information appears in that log:
$$\mathrm{Score\_key}_i = \frac{\mathrm{Count\_key}_i}{\sum_j \mathrm{Count\_key}_j}$$
where i denotes the i-th core word in a piece of published information and the sum runs over all core words j in that piece of information.
The historical behavior log of the information publishing user can specifically include the keyword purchase log of that user. The keywords purchased by the information publishing user can include single segmented words and word combinations, a word combination being made up of multiple segmented words.
In one embodiment, the final weight value Score_final_i of each core word can be a weighted sum of Score_key_i and the initial weight value, as shown in formula (2):
$$\mathrm{Score\_final}_i = w_5 \cdot (w_7 \cdot \mathrm{Score\_key}_i) + w_6 \cdot \mathrm{Score\_initial}_i \tag{2}$$
where w_5, w_6 and w_7 can be empirical weights derived from experimental results, and each can take any value between 0 and 1.
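The keyword-log ratio and formula (2) can be sketched as below. How an "occurrence" in the purchase log is counted is not spelled out in the text; this sketch assumes a core word occurs in a purchased keyword when it is a substring of it. The weight defaults and the sample log are likewise illustrative assumptions.

```python
def score_key(core_words, purchased_keywords):
    """Score_key_i = Count_key_i / sum_j Count_key_j over the core words of one listing."""
    counts = {
        word: sum(1 for keyword in purchased_keywords if word in keyword)
        for word in core_words
    }
    total = sum(counts.values())
    return {word: (count / total if total else 0.0) for word, count in counts.items()}


def final_weight_from_log(key, initial, w5=0.5, w6=0.5, w7=1.0):
    """Formula (2): w5 * (w7 * Score_key_i) + w6 * Score_initial_i."""
    return w5 * (w7 * key) + w6 * initial


# Hypothetical keyword purchase log of the information publishing user.
purchase_log = ["notebook", "cheap notebook", "notebook bag", "hard disk"]
keys = score_key(["notebook", "hard disk", "memory"], purchase_log)
final_notebook = final_weight_from_log(keys["notebook"], initial=0.5)
```

A publisher who mostly buys "notebook" keywords pushes the weight of that core word up, even when the listing text alone would have weighted all core words equally.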
The above describes identifying the importance of core words according to the historical behavior log of the information publishing user. In fact, identification accuracy can be further improved using other suitable information on the information publishing user's side.
In one embodiment, the corresponding initial weight value of each core word can be adjusted according to personal information of the information publishing user, to obtain a corresponding final weight value.
According to embodiments of the application, the personal information includes at least one of a personal label, a summary and region information. The personal information can be obtained from the registration information of the information publishing user. For example, the registration information can include personal label information such as a name, summary information such as remarks, and region information such as an address.
In a specific embodiment, the initial weight value assigned to a core word can be adjusted according to the frequency (number of times) or position with which the core word appears in the above personal information. For example, the more often a core word appears in the personal label information, summary information and region information, the higher its weight value can be. This is similar to the case of considering features of the published information itself. Based on the disclosure here, those skilled in the art can readily implement this, so it is not described further.
The above describes identifying the importance of each core word in the published information according to the historical behavior log and/or personal information of the information publishing user. This can be regarded as adjusting, according to information on the information publishing user's side, the initial weight value assigned to a core word from features of the published information itself. In fact, the initial weight value can also be adjusted according to suitable information on the information query user's side.
In one embodiment, the corresponding initial weight value Score_initial_i of each core word determined in step S110 can be adjusted according to feedback information of information query users, to obtain a corresponding final weight value Score_final_i.
The feedback information of information query users includes at least one of query and click information, transaction information and evaluation behavior information of the information query users. This feedback information, such as the query and click information, post-click transaction information and evaluation behavior information of information query users, can be obtained from network logs. It should be understood that different types of feedback information of information query users can be combined to adjust the weight value derived from the features of the information publishing user's published information, thereby improving the accuracy of core word importance identification and, in turn, the accuracy of searching for relevant published information.
In a specific embodiment, the corresponding initial weight value Score_initial_i of each core word can be adjusted according to the query and click history of information query users, to obtain a corresponding final weight value Score_final_i. For example, for each core word, two ratios can be computed from the network log over a certain period of time (for example, 100 days). The first is the ratio of the number of times Count_show_i that the core word appears in query results within that period to the sum of the numbers of times that each of the plurality of core words appears in those query results:
$$\mathrm{Score\_show}_i = \frac{\mathrm{Count\_show}_i}{\sum_j \mathrm{Count\_show}_j}$$
The second is the ratio of the number of times Count_click_i that the core word appears in clicked query results within that period to the sum of the numbers of times that each core word appears in the clicked query results:
$$\mathrm{Score\_click}_i = \frac{\mathrm{Count\_click}_i}{\sum_j \mathrm{Count\_click}_j}$$
These ratios are used to adjust the initial weight value Score_initial_i so as to obtain the final weight value Score_final_i, where i denotes the i-th core word in a piece of published information and each sum runs over all core words j in that piece of information. In one embodiment, the final weight value of each core word can be a weighted sum of Score_show_i, Score_click_i and Score_initial_i, as shown in formula (1):
$$\mathrm{Score\_final}_i = w_1 \cdot (w_3 \cdot \mathrm{Score\_show}_i + w_4 \cdot \mathrm{Score\_click}_i) + w_2 \cdot \mathrm{Score\_initial}_i \tag{1}$$
where w_1, w_2, w_3 and w_4 can be preset according to experimental results, and each can take any value between 0 and 1.
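Formula (1) can be sketched in the same style. The helper `share`, the weight defaults of 0.5 and the impression and click counts are hypothetical; in practice the counts would be aggregated from the network log over the chosen window.

```python
def share(counts):
    """Turn per-core-word counts into ratios that sum to 1 (or all zeros if no data)."""
    total = sum(counts.values())
    return {word: (n / total if total else 0.0) for word, n in counts.items()}


def final_weight_from_feedback(show, click, initial, w1=0.5, w2=0.5, w3=0.5, w4=0.5):
    """Formula (1): w1 * (w3 * Score_show_i + w4 * Score_click_i) + w2 * Score_initial_i."""
    return w1 * (w3 * show + w4 * click) + w2 * initial


# Hypothetical counts over the chosen log window (for example, 100 days).
score_show = share({"notebook": 60, "hard disk": 40})
score_click = share({"notebook": 9, "hard disk": 1})
final_weights = {
    word: final_weight_from_feedback(score_show[word], score_click[word], initial=0.5)
    for word in score_show
}
```

Both core words start from the same initial weight of 0.5, but because searchers click far more often on results involving "notebook", its final weight ends up higher.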
The above describes adjusting the corresponding initial weight value of each core word according to the query and click history of information query users. In a similar way, the initial weight value of each core word can equally be adjusted according to post-click transaction information or evaluation behavior information, or according to any combination of the query and click information, post-click transaction information and evaluation behavior information of information query users. Those skilled in the art can implement these schemes based on the above disclosure, so for brevity their respective implementations are not described further here.
The above embodiments describe identifying the importance of core words in combination with only the historical behavior log of the information publishing user, only the personal information of the information publishing user, or only the feedback information of information query users. Each improves the accuracy with which the importance of core words in published information is identified, and thereby the accuracy of searching and ranking relevant published information. It should be understood that the application is not limited to these embodiments: the importance of core words can be identified using any combination of the above information, which can further improve the accuracy of core word importance identification and of searching for relevant published information.
For example, in another embodiment, the query and click feedback of information query users can be combined with the historical behavior log of the information publishing user to adjust the corresponding initial weight value Score_initial_i of each core word and obtain a corresponding final weight value Score_final_i, as shown in formula (3):
$$\mathrm{Score\_final}_i = w_1' \cdot \bigl(w_3' \cdot (w_5' \cdot \mathrm{Score\_show}_i + w_6' \cdot \mathrm{Score\_click}_i) + w_4' \cdot \mathrm{Score\_key}_i\bigr) + w_2' \cdot \mathrm{Score\_initial}_i \tag{3}$$
where w_1', w_2', w_3', w_4', w_5' and w_6' can be empirical weights derived from experimental results, and each can take any value between 0 and 1.
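Formula (3) nests the two kinds of auxiliary information, which is easier to see when written out step by step. In this sketch the parameters `w1` to `w6` stand for the primed weights of formula (3); their default values and the input scores are illustrative assumptions.

```python
def final_weight_combined(show, click, key, initial,
                          w1=0.5, w2=0.5, w3=0.5, w4=0.5, w5=0.5, w6=0.5):
    """Formula (3); the parameters w1..w6 stand for the primed weights w1'..w6'."""
    feedback = w5 * show + w6 * click      # information query user side
    auxiliary = w3 * feedback + w4 * key   # add the publisher's keyword log
    return w1 * auxiliary + w2 * initial


combined = final_weight_combined(show=0.6, click=0.9, key=0.75, initial=0.5)
```

Setting `w4` to 0 reduces the auxiliary term to query-side feedback only, and setting `w3` to 0 reduces it to the publisher's keyword log only, so the combined form covers both kinds of single-source adjustment up to a rescaling of the weights.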
At this point, the final weight value of each core word has been obtained from the features of the published information itself in combination with auxiliary information, so that the importance of each of the multiple core words in the published information is identified. This can significantly improve the accuracy of core word importance identification.
When a current query user inputs a search term, one or more pieces of information are retrieved by the search term; the relevance of each piece of information to the search term is calculated from the final weight values of the core words in that piece of information; and the information is ranked according to the calculated relevance.
In embodiments of the application, when a search is performed with the search term input by an information query user, the accuracy of relevance calculation and of filtering out irrelevant information can be improved, which improves the accuracy with which search results of relevant published information are ranked. This is described in detail below with reference to Fig. 2.
Fig. 2 shows a flowchart of a search result ranking method according to one embodiment of the application.
As shown in Fig. 2, at step S210, a search term input by an information query user is received.
In one embodiment, the search term input by the information query user can be analyzed to find the core word information in it. A search term is generally a short character string, so the core word information in it can readily be identified, by methods commonly used in the art or by a method similar to that described in connection with step S110, for use in the subsequent relevance calculation.
It will be appreciated that the application is not limited to this embodiment: the core word information in the search term need not be found, and the relevance calculation of the subsequent step S220 can instead be performed directly with the search term.
Next, at step S220, the relevance of published information to the search term is determined based on the importance of the core words in the information published by the information publishing user.
The importance of the core words in the information published by the information publishing user is obtained by the core word importance identification method of the application described above with reference to Fig. 1; for details, refer to the description above, which is not repeated here.
In one embodiment, the core words previously determined in the published information can be compared with the core word information in the search term received in step S210, to determine the relevance of the published information to the search term.
Specifically, if a core word with a high final weight value in a piece of published information matches the core word information of the search term, the published information is determined to be highly relevant to the search term. If a core word with a low final weight value in a piece of published information matches the core word information of the search term, the published information is determined to have low relevance to the search term. If none of the core words in a piece of published information matches the core word information of the search term, the published information is determined to be unrelated to the search term.
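The three cases above (high, low, unrelated) can be sketched with a single scoring function. The text does not define a numeric relevance score, so using the largest final weight among the matched core words is an assumption made for illustration, as are the function name `relevance` and the sample weights.

```python
def relevance(query_core_words, final_weights):
    """Step S220 sketch: 0.0 means unrelated; otherwise the best matched final weight."""
    matched = [final_weights[word] for word in query_core_words if word in final_weights]
    return max(matched, default=0.0)


# Hypothetical final weight values for three pieces of published information.
listing_a = {"mobile phone": 0.8, "Nokia": 0.2}
listing_b = {"battery": 0.7, "mobile phone": 0.3}
listing_c = {"charger": 1.0}

query = ["mobile phone"]
scores = {
    "A": relevance(query, listing_a),
    "B": relevance(query, listing_b),
    "C": relevance(query, listing_c),
}
```

The phone listing (A) now scores well above the battery listing (B), even though both contain the literal string "mobile phone", and the charger listing (C) is unrelated.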
Next, at step S230, the relevant published information is ranked and displayed according to the relevance determined at step S220.
That is, the published information relevant to the search term can be ranked and displayed according to the relevance between the published information and the search term determined in step S220. Specifically, in order of relevance, published information with higher relevance to the search term is displayed first, published information with lower relevance is displayed later, and published information unrelated to the search term is not displayed.
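The ranking rule of step S230 can be sketched as a filter followed by a sort. The function name `rank_for_display` and the convention that a relevance of 0 means "unrelated" are assumptions of this sketch.

```python
def rank_for_display(relevance_by_id):
    """Step S230 sketch: drop unrelated items, then sort by relevance, highest first."""
    related = [(item, score) for item, score in relevance_by_id.items() if score > 0]
    related.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in related]


display_order = rank_for_display({"B": 0.3, "C": 0.0, "A": 0.8})
```

Python's sort is stable, so items with equal relevance keep their original order; a production system would likely add an explicit tie-breaker.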
In other embodiments, the multiple core words in the published information may be ranked by importance according to the final weight value of each core word obtained above. In application scenarios that demand higher relevance, such as search or information classification, the published information may be determined to be relevant only when the search term matches the core word ranked first by importance in the published information.
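The stricter variant can be sketched as below. The helper name `is_relevant_strict` is hypothetical, and the handling of ties (the first core word with the maximum weight is taken as the top-ranked one) is an assumption, since the text does not say how equal weights are ordered.

```python
from typing import Dict, Iterable


def is_relevant_strict(final_weights: Dict[str, float],
                       query_core_words: Iterable[str]) -> bool:
    """Return True only if the search term matches the top-ranked core word.

    final_weights maps each core word of the published information to its
    final weight value. Published information without core words is never
    considered relevant.
    """
    if not final_weights:
        return False
    # Core word ranked first by importance (highest final weight value).
    top_word = max(final_weights, key=final_weights.get)
    return top_word in set(query_core_words)


listing = {"phone": 0.7, "case": 0.2}
print(is_relevant_strict(listing, ["phone"]))  # True
print(is_relevant_strict(listing, ["case"]))   # False
```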
With the above method, the importance of the core words in the published information is identified from both the features of the published information itself and the historical behavior log of the information publishing user. When a search is performed with the search term entered by an information querying user, this improves the accuracy of the correlation calculation for the published information and therefore the accuracy of its relevance ranking, which makes the system more convenient to use and improves the user experience.
Corresponding to the above core word importance identification method for identifying the importance of core words in information published by an information publishing user, the embodiments of the present application further provide core word importance identification equipment for identifying the importance of core words in information published by an information publishing user.
Fig. 3 shows a schematic block diagram of core word importance identification equipment 300 for identifying the importance of core words in information published by an information publishing user, according to one embodiment of the present application.
As shown in Fig. 3, the equipment 300 may include a core word determining device 310, a valuator device 320 and an adjusting device 330.
Specifically, the core word determining device 310 may be used to determine multiple core words in the published information. The valuator device 320 may be used to assign a corresponding initial weight value to each of the multiple core words according to the features of the published information. The adjusting device 330 may be used to adjust the initial weight value of each core word according to the historical behavior log of the information publishing user, so as to obtain a corresponding final weight value.
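The adjustment performed by the adjusting device 330 can be sketched in Python using the ratio recited in the claims: the number of times a core word appears in the publisher's historical behavior log, divided by the total number of times all core words of the same published information appear in that log. Several points are assumptions made only for illustration: the log is modeled as a flat list of words, the initial weight and the ratio are blended linearly with a hypothetical parameter `alpha`, and the weights are left unchanged when the log contains none of the core words. The application states only that the initial weight value is adjusted according to the ratio.

```python
from collections import Counter
from typing import Dict, List


def adjust_weights(initial_weights: Dict[str, float],
                   history_log: List[str],
                   alpha: float = 0.5) -> Dict[str, float]:
    """Adjust initial core word weights using the publisher's history log.

    initial_weights maps each core word to its initial weight value.
    history_log is a flat list of words taken from the publisher's
    historical behavior log (a simplifying assumption).
    alpha is a hypothetical blending factor between 0 and 1.
    """
    # Count only words that are core words of this published information.
    counts = Counter(w for w in history_log if w in initial_weights)
    total = sum(counts.values())
    if total == 0:
        # No evidence in the log: keep the initial weights (assumption).
        return dict(initial_weights)
    final_weights = {}
    for word, weight in initial_weights.items():
        ratio = counts[word] / total  # ratio described in the claims
        final_weights[word] = (1 - alpha) * weight + alpha * ratio
    return final_weights


initial = {"phone": 0.5, "case": 0.5}
log = ["phone", "phone", "phone", "case", "charger"]
print(adjust_weights(initial, log))
```

Here "phone" accounts for three of the four core word occurrences in the log, so its weight rises above that of "case", while "charger" is ignored because it is not a core word of this published information.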
Compared with the prior art, the equipment of the embodiments of the present application for identifying the importance of core words in information published by an information publishing user can significantly improve the accuracy of core word importance identification.
The processing of the core word importance identification equipment described above corresponds to that of the core word importance identification method described earlier. For further technical details, refer to the method described earlier, which is not repeated here.
In addition, corresponding to the search result ordering method described above, the embodiments of the present application further provide search result ordering equipment, which is described in detail below with reference to Fig. 4.
Fig. 4 shows a schematic block diagram of search result ordering equipment 400 according to one embodiment of the present application.
As shown in Fig. 4, the equipment 400 may include a search term receiving device 410, a correlation determining device 420 and a sorting and display device 430.
Specifically, the search term receiving device 410 may be used to receive the search term entered by an information querying user. The correlation determining device 420 may be used to determine the correlation between the published information and the search term based on the importance of the core words in the information published by the information publishing user, where that importance is obtained by the core word importance identification method of the present application described above with reference to Fig. 1. The sorting and display device 430 may be used to sort and display the published information according to the correlation.
Similarly, the search result ordering equipment of the embodiments of the present application can improve the accuracy of the correlation calculation for the published information and therefore the accuracy of its relevance ranking, which makes the system more convenient to use and improves the user experience.
The processing of the search result ordering equipment described above corresponds to that of the search result ordering method described earlier. For further technical details, refer to the method described earlier, which is not repeated here.
Those skilled in the art will understand that the embodiments of the present application may be provided as a method, a system or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.
The foregoing describes only embodiments of the present application and is not intended to limit the present application. Those skilled in the art may make various modifications and variations to the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.

Claims (12)

1. A core word importance identification method for identifying the importance of core words in information published by an information publishing user, characterized by comprising:
determining multiple core words in the published information;
assigning a corresponding initial weight value to each of the multiple core words according to the features of the published information; and
adjusting the initial weight value of each core word according to the historical behavior log of the information publishing user and the feedback information of an information querying user, to obtain a corresponding final weight value;
wherein adjusting the initial weight value of each core word according to the historical behavior log of the information publishing user comprises:
calculating the ratio of the number of times each core word appears in the historical behavior log of the information publishing user to the sum of the numbers of times all core words in the same piece of information appear in the historical behavior log of the information publishing user; and
adjusting the initial weight value according to the ratio.
2. The method according to claim 1, characterized in that the feedback information of the information querying user comprises at least one of query and click information, post-click transaction information and evaluation behavior information of the information querying user.
3. The method according to claim 1, characterized by further comprising:
adjusting the initial weight value of each core word according to the personal information of the information publishing user, to obtain a corresponding final weight value, the personal information comprising at least one of a personal label, a summary and regional information.
4. The method according to claim 1, characterized in that the step of determining multiple core words in the published information comprises:
performing word segmentation and part-of-speech tagging on the published information; and
determining the core words in the published information according to a predetermined rule, the predetermined rule being that words tagged as nouns or as product words are determined to be the core words in the published information.
5. The method according to claim 1, characterized in that the step of adjusting the initial weight value of each core word according to the feedback information of the information querying user to obtain a corresponding final weight value comprises:
adjusting the initial weight value to obtain the final weight value according to the ratio of the number of times each of the multiple core words appears in the query results within a certain period of time to the sum of the numbers of times the multiple core words appear, and the ratio of the number of times each of the multiple core words appears in the clicked query results within the certain period of time to the sum of the numbers of times the multiple core words appear.
6. A search result ordering method, characterized by comprising:
receiving a search term entered by an information querying user;
determining the correlation between published information and the search term based on the importance of the core words in the information published by an information publishing user; and
sorting and displaying the published information according to the correlation,
wherein the core word importance is identified through the following steps:
determining multiple core words in the published information;
assigning a corresponding initial weight value to each of the multiple core words according to the features of the published information; and
adjusting the initial weight value of each core word according to the historical behavior log of the information publishing user and the feedback information of the information querying user, to obtain a corresponding final weight value;
wherein adjusting the initial weight value of each core word according to the historical behavior log of the information publishing user to obtain a corresponding final weight value comprises:
calculating the ratio of the number of times each core word appears in the historical behavior log of the information publishing user to the sum of the numbers of times all core words in the same piece of information appear in the historical behavior log of the information publishing user; and
adjusting the initial weight value according to the ratio to obtain the final weight value.
7. Core word importance identification equipment for identifying the importance of core words in information published by an information publishing user, characterized by comprising:
a core word determining device for determining multiple core words in the published information;
a valuator device for assigning a corresponding initial weight value to each of the multiple core words according to the features of the published information; and
an adjusting device for adjusting the initial weight value of each core word according to the historical behavior log of the information publishing user and the feedback information of an information querying user, to obtain a corresponding final weight value;
wherein the adjusting device calculates the ratio of the number of times each core word appears in the historical behavior log of the information publishing user to the sum of the numbers of times all core words in the same piece of information appear in the historical behavior log of the information publishing user, and adjusts the initial weight value according to the ratio.
8. The equipment according to claim 7, characterized in that the feedback information of the information querying user comprises at least one of query and click information, post-click transaction information and evaluation behavior information of the information querying user.
9. The equipment according to claim 7, characterized in that the adjusting device further adjusts the initial weight value of each core word according to the personal information of the information publishing user to obtain a corresponding final weight value, the personal information comprising at least one of a personal label, a summary and regional information.
10. The equipment according to claim 7, characterized in that the core word determining device performs word segmentation and part-of-speech tagging on the published information and determines the core words in the published information according to a predetermined rule, the predetermined rule being that words tagged as nouns or as product words are determined to be the core words in the published information.
11. The equipment according to claim 7, characterized in that the adjusting device adjusts the initial weight value to obtain the final weight value according to the ratio of the number of times each of the multiple core words appears in the query results within a certain period of time to the sum of the numbers of times the multiple core words appear, and the ratio of the number of times each of the multiple core words appears in the clicked query results within the certain period of time to the sum of the numbers of times the multiple core words appear.
12. Search result ordering equipment, characterized by comprising:
a search term receiving device for receiving a search term entered by an information querying user;
a correlation determining device for determining the correlation between published information and the search term based on the importance of the core words in the information published by an information publishing user; and
a sorting and display device for sorting and displaying the published information according to the correlation,
wherein the core word importance is identified through the following steps:
determining multiple core words in the published information;
assigning a corresponding initial weight value to each of the multiple core words according to the features of the published information; and
adjusting the initial weight value of each core word according to the historical behavior log of the information publishing user and the feedback information of the information querying user, to obtain a corresponding final weight value;
wherein adjusting the initial weight value of each core word according to the historical behavior log of the information publishing user comprises:
calculating the ratio of the number of times each core word appears in the historical behavior log of the information publishing user to the sum of the numbers of times all core words in the same piece of information appear in the historical behavior log of the information publishing user; and
adjusting the initial weight value according to the ratio.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310109430.2A CN104077327B (en) 2013-03-29 2013-03-29 The recognition methods of core word importance and equipment and search result ordering method and equipment

Publications (2)

Publication Number Publication Date
CN104077327A CN104077327A (en) 2014-10-01
CN104077327B true CN104077327B (en) 2018-01-19

Family

ID=51598586

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310109430.2A Active CN104077327B (en) 2013-03-29 2013-03-29 The recognition methods of core word importance and equipment and search result ordering method and equipment

Country Status (1)

Country Link
CN (1) CN104077327B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102289436A (en) * 2010-06-18 2011-12-21 阿里巴巴集团控股有限公司 Method and device for determining weighted value of search term and method and device for generating search results
US8145618B1 (en) * 2004-02-26 2012-03-27 Google Inc. System and method for determining a composite score for categorized search results
CN102446174A (en) * 2010-10-09 2012-05-09 百度在线网络技术(北京)有限公司 Method for determining weights of key sub-words in network equipment and equipment adopting same

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101334796B (en) * 2008-02-29 2011-01-12 浙江师范大学 Personalized and synergistic integration network multimedia search and enquiry method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant