CN104361115B - Method and device for determining term weights based on co-clicks - Google Patents

Method and device for determining term weights based on co-clicks

Info

Publication number
CN104361115B
Authority
CN
China
Prior art keywords
term
weight
entry
query
gram
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410718382.1A
Other languages
Chinese (zh)
Other versions
CN104361115A (en)
Inventor
邹启波
周连强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd, Qizhi Software Beijing Co Ltd
Priority to CN201410718382.1A
Publication of CN104361115A
Application granted
Publication of CN104361115B
Legal status: Expired - Fee Related
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention provide a method and device for determining term weights based on co-clicks. Based on search log data, the set of input queries (the query set) corresponding to a co-clicked uniform resource locator (URL) is obtained; each query in the query set is segmented to obtain multiple basic terms; the frequency with which each term occurs in the query set is counted, and the term weight of each term is obtained from how high that frequency is. The method and device can accurately obtain co-click-based term weights, which play an important role in core-word extraction and document ranking for input queries, overcome the shortcomings of the existing TF-IDF technique, and thereby improve the accuracy of search results.

Description

Method and device for determining term weights based on co-clicks
Technical field
The present invention relates to the field of information technology, and in particular to a method and device for obtaining term weights based on co-clicks.
Background
With the rapid development of networks and information technology, the volume of information on the network has grown explosively, and obtaining correct information quickly and accurately from this massive data has become the key problem of current search engine technology. User input, however, varies greatly: people with different education and cultural backgrounds phrase the same question very differently. It is therefore necessary to score the terms of a user's input with term weights, which is an important technique for extracting the core words of a query, for document ranking, and so on.
The current TF-IDF (Term Frequency-Inverse Document Frequency) technique assesses how important a word is to one document in a document set or corpus. It is a weighting technique commonly used in information retrieval and text mining. It describes the weight of a term at the document level, but it is independent of context.
For example, the weight of the same word can differ significantly between queries because the context or semantic background differs. Suppose one query is "Beijing Forbidden City tickets" and another is "high-speed rail from Beijing to Wuhan". The word "Beijing" occurs in both queries, but its importance to the search results of the two queries is certainly different. The existing TF-IDF technique cannot describe this situation, which leads to errors in the final search results.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a method and device for determining term weights based on co-clicks that overcome the above problems or at least partially solve them.
A method for determining term weights based on co-clicks, comprising:
obtaining, based on search log data, the set of input queries (the query set) corresponding to a co-clicked uniform resource locator (URL);
segmenting each query in the query set to obtain multiple basic terms;
counting the frequency with which each term occurs in the query set, and obtaining the term weight of each term from how high that frequency is.
The present invention also provides a device for determining term weights based on co-clicks, the device comprising:
a query set acquisition unit, configured to obtain, based on search log data, the query set corresponding to a co-clicked uniform resource locator (URL);
a word segmentation unit, configured to segment each query in the query set obtained by the query set acquisition unit to obtain multiple basic terms;
a term weight acquisition unit, configured to count the frequency with which each term obtained by the word segmentation unit occurs in the query set, and to obtain the term weight of each term from how high that frequency is.
As can be seen from the above, the method and device can accurately obtain co-click-based term weights, which play an important role in core-word extraction and document ranking for input queries, overcome the shortcomings of the existing TF-IDF technique, and thereby improve the accuracy of search results.
The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features and advantages of the present invention are easier to understand, specific embodiments of the present invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to a person of ordinary skill in the art from reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 is a flow diagram of the method for determining term weights based on co-clicks provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the enumeration process provided by an embodiment of the present invention;
Fig. 3 is a flow diagram of retrieval performed according to user input, provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the device for determining term weights based on co-clicks provided by an embodiment of the present invention;
Fig. 5 is another schematic structural diagram of the device for determining term weights based on co-clicks provided by an embodiment of the present invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
The term weight acquisition method provided by an embodiment of the present invention is described below with reference to the drawings. Fig. 1 shows a flow diagram of the method for determining term weights based on co-clicks provided by an embodiment of the present invention. The method includes:
Step 11: Based on search log data, obtain the set of input queries (the query set) corresponding to a co-clicked uniform resource locator (URL);
In this step, the log data may be stored in a back-end search server.
Here, the input queries corresponding to a co-clicked URL are the queries for which the same URL was clicked. These queries can be considered potentially synonymous: their core should remain stable and only the wording changes. For example, "how much is a Beijing Forbidden City ticket", "how much is a Forbidden City ticket", "Beijing Forbidden City tickets", "Forbidden City ticket price" and so on all ask about Forbidden City tickets. As another example, take the following queries: {"360 search", "360 search website", "360", "360 search engine", "360 search web address"}. If users click the URL www.so.com for each of them, this group of queries is likewise considered co-clicked.
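Step 11 can be illustrated with a minimal Python sketch. The record layout (a pair of query string and clicked URL), the function name `build_coclick_query_sets` and the sample records are assumptions made for illustration; the patent does not specify the log format.

```python
from collections import defaultdict

def build_coclick_query_sets(log_records):
    """Group queries by the URL that was clicked for them.

    log_records: iterable of (query, clicked_url) pairs (assumed layout).
    Returns a dict mapping each URL to the set of queries that led to it.
    """
    query_sets = defaultdict(set)
    for query, clicked_url in log_records:
        query_sets[clicked_url].add(query)
    return dict(query_sets)

# Hypothetical log records, loosely following the example in the text.
log_records = [
    ("360 search", "www.so.com"),
    ("360 search website", "www.so.com"),
    ("360", "www.so.com"),
    ("360 search engine", "www.so.com"),
    ("360 search web address", "www.so.com"),
    ("Beijing Forbidden City tickets", "example.org/tickets"),
]
query_sets = build_coclick_query_sets(log_records)
```

Using a set removes repeated queries; whether repeated queries should instead count several times is a design choice the text leaves open.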
Step 12: Segment each query in the query set to obtain multiple basic terms;
In this step, the specific rules and manner of segmentation may follow existing word segmentation techniques. For example, each query in the query set may be segmented based on n-grams, that is, multiple segment grams are generated by multi-order enumeration, and the basic terms of the multiple grams are obtained.
For example, for Q = {T1, T2, T3, ..., Tn}, the order of the n-grams can be preset before enumeration, and the grams are then enumerated one by one. Preferably, in embodiments of the present invention, grams of orders 1 to 4 may be used. The enumeration process is shown in Fig. 2: with enumeration of orders 1 to 4, 1-grams to 4-grams are enumerated starting from the head (T1), which yields multiple segment grams.
For example, enumerating Q = {a, b, c, d} up to order 4 generates the following grams:
1-grams: a, b, c, d;
2-grams: ab, bc, cd;
3-grams: abc, bcd;
4-grams: abcd.
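The multi-order enumeration above can be sketched in Python as follows. The function name `enumerate_grams` and the tuple representation of a gram are illustrative assumptions, not part of the patent.

```python
def enumerate_grams(terms, max_order=4):
    """Enumerate all contiguous grams of order 1..max_order, starting from the head."""
    grams = []
    for order in range(1, max_order + 1):
        # A gram of this order can start at any position that leaves room for it.
        for start in range(len(terms) - order + 1):
            grams.append(tuple(terms[start:start + order]))
    return grams

grams = enumerate_grams(["a", "b", "c", "d"])
# Yields 4 one-grams, 3 two-grams, 2 three-grams and 1 four-gram, as in the listing above.
```

A query shorter than `max_order` simply produces no grams for the orders it cannot fill.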
Step 13: Count the frequency with which each term occurs in the query set, and obtain the term weight of each term from how high that frequency is.
In this step, the term weight of each term may be obtained from its frequency as follows: the occurrence count of the most frequent term is taken as the denominator and the occurrence count of each term in the query set as the numerator; the resulting ratio is the term weight of that term.
For example, if each query is segmented based on n-grams to obtain the basic terms of multiple grams, then for each gram the number of times each of its terms occurs in the query set is counted. Suppose the gram is "360 search". The query set is traversed and the count is incremented by 1 for each occurrence until the traversal ends. The resulting statistics are: the term "360" occurs 5 times in the query set, and the term "search" occurs 4 times in the query set. Following the above method, the ratios of the counts are "1, 0.8".
The value "360 search: 1, 0.8" above is computed for one query in the query set. Across the whole query collection (which contains an enormous number of varied queries), several values for "360 search" (similar to "1, 0.8") can be computed in the same way; averaging them for this gram over the whole query collection gives the term weight of each term in the gram "360 search".
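The counting and normalization in this step can be sketched in Python. This is one reading of the text, not a definitive implementation: the denominator is taken as the highest count among the terms of the gram, and the averaging step is read as a component-wise mean over several weight vectors computed for the same gram. All function names and the tokenized sample queries are assumptions.

```python
from collections import Counter

def term_counts(query_set):
    """Count how many times each term occurs across the tokenized queries."""
    counts = Counter()
    for terms in query_set:
        counts.update(terms)
    return counts

def gram_weights(gram, counts):
    """Weight of each term in a gram: its count divided by the highest count in the gram."""
    top = max(counts[t] for t in gram)
    return tuple(counts[t] / top for t in gram)

def average_weights(weight_vectors):
    """Component-wise mean of several weight vectors computed for the same gram."""
    n = len(weight_vectors)
    return tuple(sum(column) / n for column in zip(*weight_vectors))

# Tokenized form of the co-clicked queries from the example (tokenization assumed).
query_set = [
    ["360", "search"],
    ["360", "search", "website"],
    ["360"],
    ["360", "search", "engine"],
    ["360", "search", "address"],
]
counts = term_counts(query_set)                      # "360" -> 5, "search" -> 4
weights = gram_weights(("360", "search"), counts)    # (1.0, 0.8)
```

`gram_weights` assumes every term of the gram occurs at least once in the query set, which holds when the gram was itself enumerated from that set.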
In a specific implementation, after the term weight of each term is obtained, a weight dictionary may also be formed from each term and its term weight; the weight dictionary contains many entries like "360 search: 1, 0.8" that are available for lookup.
In addition, after the weight dictionary is formed, retrieval may be performed on user input and the corresponding results output. The specific operation is shown in Fig. 3; the retrieval process includes:
Step 31: First receive a query entered by the user and segment it to obtain multiple terms;
The specific segmentation method is as described in the above embodiment.
Step 32: Query the weight dictionary to obtain the term weight of each term;
Further, if step 31 segments the query based on n-grams to obtain the basic terms of multiple grams, then for each term the weight dictionary is queried using the grams that the term hits, which gives the term weight of that term in each of the grams it hits.
Specifically, the weight dictionary stores multiple grams and the term weight of each term in each gram. The following is an example of weight dictionary content:
360: 1;
360 search: 1, 0.8;
search: 0.8.
In the above fragment, "360", "360 search" and "search" are grams, and the numbers after each gram are the term weights of the terms in that gram. For example, in "360 search" the term weight of "360" is 1 and the term weight of "search" is 0.8.
The weight dictionary may be stored in a database or in another form of storage; the embodiments of the present invention do not limit this.
In a specific implementation, each term obtained by segmenting the query may hit one or more of the grams in the weight dictionary. Querying the weight dictionary with the hit grams therefore gives, for each gram a term hits, the weight value of that term in the gram.
Suppose the term looked up in the weight dictionary is "360". It hits the two grams "360" and "360 search" in the weight dictionary, which gives two term weights, namely 1 and 1.
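The dictionary lookup can be sketched in Python. The in-memory layout (a dict from a tuple of terms to a tuple of weights) and the name `lookup_term_weights` are assumptions; the text only says the dictionary may live in a database or other storage.

```python
# Weight dictionary from the example: gram (tuple of terms) -> weight of each term.
weight_dict = {
    ("360",): (1.0,),
    ("360", "search"): (1.0, 0.8),
    ("search",): (0.8,),
}

def lookup_term_weights(term, weight_dict):
    """Collect the weight of `term` in every gram of the dictionary that contains it."""
    hits = []
    for gram, weights in weight_dict.items():
        for gram_term, weight in zip(gram, weights):
            if gram_term == term:
                hits.append(weight)
    return hits

hits = lookup_term_weights("360", weight_dict)  # [1.0, 1.0]
```

The linear scan keeps the sketch short; with a large dictionary, an inverted index from term to grams would be the more natural structure.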
Since the weight dictionary holds an enormous number of grams and the term weights of their terms, each term obtained by segmenting the user's query yields several term weights. The final term weight of each term may then be calculated with either of the following two formulas:
[Formula 1: weighted average of X1 to Xm with weights W1 to Wm]
In Formula 1, score is the finally calculated term weight of the term, X1 to Xm are the term weights of the term in the grams it hits, obtained by querying the weight dictionary, and W1 to Wm are the weights assigned to each of those term weights.
[Formula 2: arithmetic mean of X1 to Xm]
Formula 2 calculates the term weight of the term as an arithmetic mean, where score and X1 to Xm have the same meaning as in Formula 1.
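The two formulas can be written out in LaTeX under the definitions just given. Formula 2 follows directly from the description as an arithmetic mean. Formula 1 is shown here in the usual normalized form of a weighted average; that normalization by the sum of the weights is an assumption, since the text only states that W1 to Wm weight the values X1 to Xm.

```latex
% Formula 1 (assumed normalized weighted average)
\mathrm{score} = \frac{\sum_{i=1}^{m} W_i X_i}{\sum_{i=1}^{m} W_i}

% Formula 2 (arithmetic mean)
\mathrm{score} = \frac{1}{m} \sum_{i=1}^{m} X_i
```

With all W_i equal, the first expression reduces to the second.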
It should be noted that the above two formulas are not the only way to implement the present invention; they are merely one implementation given as an embodiment. A person skilled in the art may adapt the formulas to business needs, for example by adding parameters or multipliers, and such variants still fall within the scope of protection of the present invention.
As an example, suppose the query entered by the user is "360 search web address". After segmentation it contains three terms, one of which is "360". The weight dictionary is queried for this term; suppose the grams it hits are: 360, 360 search, 360 search engine, 360 search engine web address, and 360 search website. "360" then has 5 term weights, one in each of the 5 grams. Computing the weighted average of these 5 term weights gives the final term weight of the term "360" in the user's query.
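The aggregation in this example can be sketched in Python. The function name, the normalization by the sum of the weights, and the five sample values are assumptions for illustration; the text gives no concrete numbers for the five grams.

```python
def weighted_average(values, weights=None):
    """Combine the term weights found in the hit grams into one score.

    With weights=None this is the arithmetic mean; otherwise it is the
    weighted average normalized by the sum of the weights (an assumption).
    """
    if not values:
        raise ValueError("at least one value is required")
    if weights is None:
        return sum(values) / len(values)
    if len(weights) != len(values):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total

# Hypothetical term weights of "360" in the 5 grams it hits.
x = [1.0, 1.0, 0.9, 0.8, 0.9]
score_mean = weighted_average(x)
# Hypothetical choice: trust shorter grams more than longer ones.
score_weighted = weighted_average(x, weights=[4, 3, 2, 1, 2])
```

How the weights W1 to Wm are chosen is left open in the text; weighting by gram length is only one possibility.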
Step 33: Compare the term weight of each term with a preset weight threshold, use the terms whose term weight is greater than or equal to the weight threshold as search keywords, and output the corresponding search results.
In this step, when the term weight of each term is compared with the preset weight threshold, terms whose term weight is below the weight threshold may be ignored. This benefits core-word extraction and document ranking for the input query and improves the accuracy of the search results.
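The threshold comparison in step 33 can be sketched in Python. The function name, the sample weights and the threshold value 0.5 are hypothetical.

```python
def select_keywords(term_weights, threshold):
    """Keep the terms whose weight is greater than or equal to the threshold."""
    return [term for term, weight in term_weights.items() if weight >= threshold]

# Hypothetical final term weights for the query "360 search web address".
term_weights = {"360": 0.95, "search": 0.8, "web address": 0.3}
keywords = select_keywords(term_weights, threshold=0.5)  # ["360", "search"]
```

The terms that survive the filter would then be passed to the retrieval back end as search keywords.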
Based on the above method, an embodiment of the present invention also provides a device for determining term weights based on co-clicks. Fig. 4 shows a schematic structural diagram of the device provided by an embodiment of the present invention. The device includes:
a query set acquisition unit 41, configured to obtain, based on search log data, the set of input queries (the query set) corresponding to a co-clicked uniform resource locator (URL);
a word segmentation unit 42, configured to segment each query in the query set obtained by the query set acquisition unit to obtain multiple basic terms;
a term weight acquisition unit 43, configured to count the frequency with which each term obtained by the word segmentation unit occurs in the query set, and to obtain the term weight of each term from how high that frequency is.
Fig. 5 shows another schematic structural diagram of the device for determining term weights based on co-clicks provided by an embodiment of the present invention. Referring to Fig. 5, in a specific implementation the device may further include:
a weight dictionary unit 44, configured to form a weight dictionary from each term and its term weight;
a user input receiving unit 45, configured to receive a query entered by the user and to segment the query to obtain multiple terms;
a term weight query unit 46, configured to query the weight dictionary unit to obtain the term weight of each term obtained by the user input receiving unit;
a search result output unit 47, configured to compare the term weight of each term obtained by the term weight query unit with a preset weight threshold, to use the terms whose term weight is greater than or equal to the weight threshold as search keywords, and to output the corresponding search results.
In a specific implementation, the word segmentation unit 42 may further include:
a word segmentation module 421, configured to segment each query in the query set based on n-grams to obtain the basic terms of multiple segment grams.
The term weight acquisition unit 43 may further include:
a weight calculation module 431, configured to take the occurrence count of the most frequent term as the denominator and to calculate the term weight of each term from the occurrence count of that term.
The specific implementation of each unit in the above device is as described in the above method embodiment.
In summary, the method and device provided by the embodiments of the present invention can accurately obtain co-click-based term weights, which play an important role in core-word extraction and document ranking for input queries, overcome the shortcomings of the existing TF-IDF technique, and thereby improve the accuracy of search results.
Numerous specific details are set forth in the description provided here. It should be understood, however, that embodiments of the present invention can be practiced without these specific details. In some instances, well-known methods, structures and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the present invention.
Furthermore, those skilled in the art will appreciate that, although some embodiments described here include certain features included in other embodiments and not other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of these. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the search system according to embodiments of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described here. Such a program implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second and third does not indicate any order; these words may be interpreted as names.
Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.

Claims (7)

1. A method for determining term weights based on co-clicks, characterized by comprising:
obtaining, based on search log data, the set of input queries (the query set) corresponding to a co-clicked uniform resource locator (URL);
segmenting each query in the query set based on n-grams to obtain the basic terms of multiple segment grams;
counting the frequency with which each term occurs in the query set and obtaining the term weight of each term from how high that frequency is, which specifically includes: for each gram, counting the number of times each term it contains occurs in the query set, and determining the term weight of each term in the gram from those counts;
forming a weight dictionary from each gram and the term weight of each term in each gram;
receiving a query entered by the user and segmenting the query to obtain multiple terms;
querying the weight dictionary to obtain the term weight of each term, which specifically includes: computing a weighted average of the term weights of the term in each gram it hits to obtain the term weight of the term;
comparing the term weight of each term with a preset weight threshold, using the terms whose term weight is greater than or equal to the weight threshold as search keywords, and outputting the corresponding search results.
2. The method of claim 1, characterized in that, when the term weight of each term is compared with the preset weight threshold, terms whose term weight is below the weight threshold are ignored.
3. The method of any one of claims 1-2, characterized in that the log data is stored in a back-end search server.
4. The method of any one of claims 1-2, characterized in that the n-grams are of order 4.
5. The method of any one of claims 1-2, characterized in that obtaining the term weight of each term from how high its frequency is specifically includes:
taking the occurrence count of the most frequent term as the denominator and calculating the term weight of each term from the occurrence count of that term.
6. A device for determining term weights based on co-clicks, characterized in that the device comprises:
a query set acquisition unit, configured to obtain, based on search log data, the set of input queries (the query set) corresponding to a co-clicked uniform resource locator (URL);
a word segmentation unit, configured to segment each query in the query set obtained by the query set acquisition unit based on n-grams to obtain the basic terms of multiple segment grams;
a term weight acquisition unit, configured to count the frequency with which each term obtained by the word segmentation unit occurs in the query set and to obtain the term weight of each term from how high that frequency is, which specifically includes: for each gram, counting the number of times each term it contains occurs in the query set, and determining the term weight of each term from those counts;
a weight dictionary unit, configured to form a weight dictionary from each gram and the term weight of each term in each gram;
a user input receiving unit, configured to receive a query entered by the user and to segment the query to obtain multiple terms;
a term weight query unit, configured to query the weight dictionary unit to obtain the term weight of each term obtained by the user input receiving unit, which specifically includes: computing a weighted average of the term weights of the term in each gram it hits to obtain the term weight of the term;
a search result output unit, configured to compare the term weight of each term obtained by the term weight query unit with a preset weight threshold, to use the terms whose term weight is greater than or equal to the weight threshold as search keywords, and to output the corresponding search results.
7. The device of claim 6, characterized in that the term weight acquisition unit further comprises:
a weight calculation module, configured to take the occurrence count of the most frequent term as the denominator and to calculate the term weight of each term from the occurrence count of that term.
CN201410718382.1A 2014-12-01 2014-12-01 Method and device for determining term weights based on co-clicks Expired - Fee Related CN104361115B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410718382.1A CN104361115B (en) 2014-12-01 2014-12-01 It is a kind of based on the entry Weight Determination clicked jointly and device

Publications (2)

Publication Number Publication Date
CN104361115A CN104361115A (en) 2015-02-18
CN104361115B true CN104361115B (en) 2018-07-27

Family

ID=52528375

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410718382.1A Expired - Fee Related CN104361115B (en) 2014-12-01 2014-12-01 Method and device for determining term weights based on co-clicks

Country Status (1)

Country Link
CN (1) CN104361115B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105183912B (en) * 2015-10-12 2019-03-01 北京百度网讯科技有限公司 Abnormal log determines method and apparatus
CN105528430B (en) * 2015-12-10 2019-05-31 北京奇虎科技有限公司 A kind of method and apparatus of the weight of determining search terms
CN105488209B (en) * 2015-12-11 2019-06-07 北京奇虎科技有限公司 A kind of analysis method and device of word weight
CN105528441A (en) * 2015-12-22 2016-04-27 北京奇虎科技有限公司 Automatic marking based head word extracting method and device
CN106919603B (en) * 2015-12-25 2020-12-04 北京奇虎科技有限公司 Method and device for calculating word segmentation weight in query word mode
CN106919649B (en) * 2017-01-19 2020-06-26 北京奇艺世纪科技有限公司 Entry weight calculation method and device
CN108804511B (en) * 2018-04-20 2022-04-22 北京奇艺世纪科技有限公司 Search recall method and device and electronic equipment
CN108897736B (en) * 2018-06-20 2022-04-12 大连诺道认知医学技术有限公司 Document sorting method and device based on Paper Rank algorithm
CN109815396B (en) * 2019-01-16 2021-09-21 北京搜狗科技发展有限公司 Search term weight determination method and device
CN110147421B (en) * 2019-05-10 2022-06-21 腾讯科技(深圳)有限公司 Target entity linking method, device, equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102043845A (en) * 2010-12-08 2011-05-04 百度在线网络技术(北京)有限公司 Method and equipment for extracting core keywords based on query sequence cluster

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080065617A1 (en) * 2005-08-18 2008-03-13 Yahoo! Inc. Search entry system with query log autocomplete
US20110137886A1 (en) * 2009-12-08 2011-06-09 Microsoft Corporation Data-Centric Search Engine Architecture
CN103425687A (en) * 2012-05-21 2013-12-04 阿里巴巴集团控股有限公司 Retrieval method and system based on queries
CN103150409B (en) * 2013-04-08 2017-04-12 深圳市宜搜科技发展有限公司 Method and system for recommending user search word

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102043845A (en) * 2010-12-08 2011-05-04 百度在线网络技术(北京)有限公司 Method and equipment for extracting core keywords based on query sequence cluster

Also Published As

Publication number Publication date
CN104361115A (en) 2015-02-18

Similar Documents

Publication Publication Date Title
CN104361115B (en) Entry weight determination method and device based on co-clicks
CA2626860C (en) Search over structured data
US9672251B1 (en) Extracting facts from documents
CN104376115B (en) Method and device for determining fuzzy words based on global search
US20120191745A1 (en) Synthesized Suggestions for Web-Search Queries
US8255414B2 (en) Search assist powered by session analysis
US8977625B2 (en) Inference indexing
EP2480995A1 (en) Searching for information based on generic attributes of the query
CN104462399B (en) Method and device for processing search results
WO2008106667A1 (en) Searching heterogeneous interrelated entities
CN107894986B (en) Enterprise relation division method based on vectorization, server and client
US20130006975A1 (en) System and method for matching entities and synonym group organizer used therein
US8825620B1 (en) Behavioral word segmentation for use in processing search queries
CN103530339A (en) Mobile application information push method and device
US11604794B1 (en) Interactive assistance for executing natural language queries to data sets
CN108572971B (en) Method and device for mining keywords related to search terms
CN105786910B (en) Entry weight calculation method and device
CN104268230A (en) Method for detecting objective points of Chinese micro-blogs based on heterogeneous graph random walk
JP5367632B2 (en) Knowledge amount estimation apparatus and program
CN111476026A (en) Statement vector determination method and device, electronic equipment and storage medium
Scharpf et al. ARQMath Lab: An incubator for semantic formula search in zbMATH Open?
CN104462556A (en) Method and device for recommending question and answer page related questions
KR20120038418A (en) Searching methods and devices
CN103186573B (en) Method for determining search demand intensity, demand recognition method, and devices thereof
Blanco et al. Supporting the automatic construction of entity aware search engines

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 2018-07-27; termination date: 2021-12-01)