CN104504070B - Search method and apparatus - Google Patents

Search method and apparatus

Info

Publication number
CN104504070B
CN104504070B CN201410806935.9A CN201410806935A CN104504070B CN 104504070 B CN104504070 B CN 104504070B CN 201410806935 A CN201410806935 A CN 201410806935A CN 104504070 B CN104504070 B CN 104504070B
Authority
CN
China
Prior art keywords
text
information
characteristic information
search
communication characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410806935.9A
Other languages
Chinese (zh)
Other versions
CN104504070A (en)
Inventor
王翀
陈进平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd, Qizhi Software Beijing Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201410806935.9A priority Critical patent/CN104504070B/en
Publication of CN104504070A publication Critical patent/CN104504070A/en
Publication of CN104504070B publication Critical patent/CN104504070B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/951 Indexing; Web crawling techniques

Abstract

Embodiments of the invention provide a search method and apparatus. The method comprises: receiving search keywords from a user; identifying one or more pieces of search information in the search keywords; and, when the search information includes a digit sequence with a specified number of digits, raising the weight of the search result items whose communication characteristic information matches that digit sequence. Embodiments of the invention thereby give display priority to web pages whose communication characteristic information matches the telephone number, which improves search accuracy. This in turn reduces the need to page through the search results or to re-enter search keywords, which simplifies operation, lowers the resource consumption of the search engine and the local system, lowers bandwidth consumption, and improves search efficiency.

Description

Search method and apparatus
Technical field
The present invention relates to the field of search technology, and in particular to a search method and a search apparatus.
Background
With the rapid development of networks, the amount of information on them has grown sharply. To find the information they need within this mass of information, users usually rely on search engines.
A search engine is a system that automatically collects information from the Internet, organizes it, and makes it available for users to query. Information on the Internet is vast, varied, and unordered: each piece of information is like an island in a wide sea, web page links are the criss-crossing bridges between the islands, and the search engine draws a clear map of that information for the user to consult at any time.
However, as shown in Fig. 1, when a user searches for a frequently used telephone number (such as 2223256), the search engine still produces results with its general-purpose algorithm. Because titles and links carry high weights, the top-ranked results are often those in which the query term appears in the title or link, and these are sometimes not what the user needs, so accuracy is low. When users do not find the required information, they usually page through the search results or re-enter search keywords. This is cumbersome to operate, consumes considerable search engine and local system resources and bandwidth, and makes search inefficient.
Summary of the invention
In view of the above problems, the present invention is proposed in order to provide a search method and a corresponding search apparatus that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a search method is provided, comprising:
Receiving search keywords from a user;
Identifying one or more pieces of search information in the search keywords;
When the search information includes a digit sequence with a specified number of digits, raising the weight of the search result items in the search results that have communication characteristic information matching that digit sequence.
Optionally, the method further comprises:
When the search information includes a communication identifier, raising the weight of the search result items that have communication characteristic information matching the communication identifier.
Optionally, the method further comprises:
Obtaining the area code of the current location;
When the area code matches the communication characteristic information, raising the weight of the search results that have the communication characteristic information.
Optionally, the method further comprises:
Sorting the one or more search result items according to the weights;
Returning the sorted search results to the client for display.
Optionally, the search result items include web page summary information, and the web page summary information includes the web page information at the position where the communication characteristic information appears in the web page.
Optionally, before the step of receiving search keywords from a user, the method further comprises:
Establishing a document index.
Optionally, the step of establishing a document index comprises:
Extracting the text information in a web page;
Judging whether the text information contains communication characteristic information; if so, extracting the communication characteristic information;
Establishing the document index using the communication characteristic information and the web page.
Optionally, the web page includes at least one of the following regions: page title, banner, header, footer, navigation, and body content;
The step of extracting the text information in a web page comprises:
Extracting the text information of at least one of the following regions of the web page: page title, header, footer, body content, functional area, and navigation area.
Optionally, the communication characteristic information includes a telephone number with the specified number of digits;
The step of judging whether the text information contains communication characteristic information comprises:
Performing word segmentation on the text information to obtain one or more text tokens;
When a text token matches a preset communication identifier, judging whether a first target text token is a digit sequence with the specified number of digits, the first target text token being the text token that follows the text token matching the communication identifier;
If so, determining that the first target text token is a telephone number with the specified number of digits.
Optionally, the communication characteristic information further includes an area code;
The step of judging whether the text information contains communication characteristic information further comprises:
Judging whether a second target text token contains an area code identifier; if so, determining that the text token corresponding to the target text token is the area code, the second target text token being the text token that follows the text token matching the communication identifier.
Optionally, the step of determining that the text token corresponding to the target text token is the area code comprises:
Determining that a text token contained in the target text token is the area code;
Or,
Determining that the text token preceding the target text token is the area code.
Optionally, the specified number of digits is 7 or 8.
Optionally, the step of establishing the document index using the communication characteristic information and the web page comprises:
Recording the position at which the communication characteristic information appears in the web page;
Recording the communication characteristic information and the position of appearance in the document index.
According to another aspect of the present invention, a search apparatus is provided, comprising:
A receiving module, adapted to receive search keywords from a user;
An identification module, adapted to identify one or more pieces of search information in the search keywords;
A first raising module, adapted to raise, when the search information includes a digit sequence with a specified number of digits, the weight of the search result items in the search results that have communication characteristic information matching that digit sequence.
Optionally, the apparatus further comprises:
A second raising module, adapted to raise, when the search information includes a communication identifier, the weight of the search result items that have communication characteristic information matching the communication identifier.
Optionally, the apparatus further comprises:
An obtaining module, adapted to obtain the area code of the current location;
A third raising module, adapted to raise, when the area code matches the communication characteristic information, the weight of the search results that have the communication characteristic information.
Optionally, the apparatus further comprises:
A sorting module, adapted to sort the one or more search result items according to the weights;
A returning module, adapted to return the sorted search results to the client for display.
Optionally, the search result items include web page summary information, and the web page summary information includes the web page information at the position where the communication characteristic information appears in the web page.
Optionally, the apparatus further comprises:
A document index establishing module, adapted to establish a document index.
Optionally, the document index establishing module is further adapted to:
Extract the text information in a web page;
Judge whether the text information contains communication characteristic information; if so, extract the communication characteristic information;
Establish the document index using the communication characteristic information and the web page.
Optionally, the web page includes at least one of the following regions: page title, banner, header, footer, navigation, and body content;
The document index establishing module is further adapted to:
Extract the text information of at least one of the following regions of the web page: page title, header, footer, body content, functional area, and navigation area.
Optionally, the communication characteristic information includes a telephone number with the specified number of digits; the document index establishing module is further adapted to:
Perform word segmentation on the text information to obtain one or more text tokens;
When a text token matches a preset communication identifier, judge whether a first target text token is a digit sequence with the specified number of digits, the first target text token being the text token that follows the text token matching the communication identifier;
If so, determine that the first target text token is a telephone number with the specified number of digits.
Optionally, the communication characteristic information further includes an area code; the document index establishing module is further adapted to:
Judge whether a second target text token contains an area code identifier; if so, determine that the text token corresponding to the target text token is the area code, the second target text token being the text token that follows the text token matching the communication identifier.
Optionally, the document index establishing module is further adapted to:
Determine that a text token contained in the target text token is the area code;
Or,
Determine that the text token preceding the target text token is the area code.
Optionally, the specified number of digits is 7 or 8.
Optionally, the document index establishing module is further adapted to:
Record the position at which the communication characteristic information appears in the web page;
Record the communication characteristic information and the position of appearance in the document index.
For the received search keywords, embodiments of the invention identify one or more pieces of search information. When the search information includes a digit sequence with a specified number of digits, the weight of the search result items that have communication characteristic information matching that digit sequence is raised, so that web pages whose communication characteristic information matches the telephone number are displayed first. This improves search accuracy, and in turn reduces the need to page through the search results or to re-enter search keywords, which simplifies operation, lowers the resource consumption of the search engine and the local system, lowers bandwidth consumption, and improves search efficiency.
When the search information includes a communication identifier, embodiments of the invention raise the weight of the search result items that have communication characteristic information matching the communication identifier, which further improves search accuracy.
When the area code of the current location matches the communication characteristic information, embodiments of the invention raise the weight of the search results that have that communication characteristic information, which further improves search accuracy.
Embodiments of the invention use the communication characteristic information and the position where it appears to produce the web page summary information, so that the telephone number and its owner can be read from the summary information in the search results. This reduces how often users click through to search results and reduces the consumption of web server resources, of the current electronic device's resources, and of bandwidth.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
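The weighting steps summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `ResultItem` structure, the boost factors, the identifier set, and the choice to multiply weights are all assumptions made for illustration, since the text only states that weights are raised and results are sorted by weight.

```python
import re
from dataclasses import dataclass, field

# Hypothetical identifiers and boost factors; the source gives no concrete values.
COMM_IDENTIFIERS = {"phone", "tel", "mobile"}
DIGIT_BOOST = 2.0   # query contains a matching 7- or 8-digit sequence
IDENT_BOOST = 1.5   # query contains a communication identifier
AREA_BOOST = 1.2    # area code of the current location matches


@dataclass
class ResultItem:
    url: str
    weight: float
    phone_numbers: set = field(default_factory=set)  # indexed communication characteristic information
    area_codes: set = field(default_factory=set)


def rank(query, items, local_area_code=None):
    """Raise the weights of matching result items, then sort by weight."""
    terms = query.lower().split()
    numbers = {t for t in terms if re.fullmatch(r"[0-9]{7,8}", t)}
    has_identifier = any(t in COMM_IDENTIFIERS for t in terms)
    for item in items:
        if numbers & item.phone_numbers:
            item.weight *= DIGIT_BOOST
        if has_identifier and item.phone_numbers:
            item.weight *= IDENT_BOOST
        if local_area_code is not None and local_area_code in item.area_codes:
            item.weight *= AREA_BOOST
    return sorted(items, key=lambda i: i.weight, reverse=True)
```

In this sketch a page that indexes the number 2223256 with area code 010 overtakes a page with a higher base weight when the query is "phone 2223256" and the current area code is 010.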
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows an example of search results;
Fig. 2 shows a flow chart of the steps of an embodiment of a method for establishing a document index according to an embodiment of the invention;
Fig. 3 shows an example of search results according to an embodiment of the invention;
Fig. 4 shows a flow chart of the steps of embodiment 1 of a search method according to an embodiment of the invention;
Fig. 5 shows a flow chart of the steps of embodiment 2 of a search method according to an embodiment of the invention;
Fig. 6 shows an example of search results according to an embodiment of the invention;
Fig. 7 shows a structural block diagram of an embodiment of an apparatus for establishing a document index according to an embodiment of the invention; and
Fig. 8 shows a structural block diagram of an embodiment of a search apparatus according to an embodiment of the invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
Referring to Fig. 2, a flow chart of the steps of an embodiment of a method for establishing a document index according to an embodiment of the invention is shown. The method may specifically include the following steps:
Step 201, extract the text information in a web page;
The processing flow of a search engine can generally be divided into two parts: the first handles user requests at the front end, and the second produces data at the back end.
One, the front-end user request processing may include:
1. Input: the user enters keywords;
2. Query analysis: the search engine segments the keywords into words;
3. Retrieval: according to the segmentation result, the relevant set of web pages is found in the document index built in advance;
4. Ranking: the candidate set of web pages is ranked along dimensions such as content relevance and timeliness;
5. Display: the ranked web pages are presented.
Two, the back-end data production may include:
1. Web crawling: the crawler follows the links between web pages to fetch and save web pages from the Internet;
2. Index building: the saved web pages are analyzed, the page titles and page text are segmented into words, and the document index is built from the segmentation result for use by front-end retrieval.
The web pages fetched by the crawler can be stored in a web page database to form a large pool of search resources, and the content of the web pages may include a large amount of text information. In embodiments of the invention, the text information in the web pages can therefore be extracted from the web page database.
In an optional example of an embodiment of the invention, the web page includes at least one of the following regions: page title, header, footer, body content, functional area, and navigation area. In embodiments of the invention, step 201 may then include the following sub-step:
Sub-step S11, extract the text information of at least one of the following regions of the web page: page title, header, footer, body content, functional area, and navigation area.
Websites of different natures and categories generally arrange the content of their web pages differently. The basic content of a typical web page, however, includes the title, header, footer, body content, functional area, navigation area, advertisement area, and so on. The arrangement of these elements on the page constitutes its overall layout.
Every web page has a piece of information at the very top that usually appears in the title bar of the browser rather than in the page itself, yet is still part of the page layout. This information indicates the main content of the page; it is the title.
The logo is a means by which the site owner presents its image to the outside.
The upper part of a web page is its header. Not every web page has a header. Because the header is often a prominent position on the page and easily draws the viewer's attention, many websites place promotional content about the site there, such as the site's purpose or its logo.
Body content is the most important element of a web page. Body content is often not complete in itself; it is typically made up of the titles and summaries of next-level content and hyperlinks to selected content. Through hyperlinks, the body content of one page can summarize the content of several pages, and the body content of the home page can even summarize the content of the entire website in a single page.
The bottom part of a web page is called the footer. The footer is usually used to present the site owner's details and contact information, such as name, address, contact method, and copyright information. Some of this content is made into title-style hyperlinks that guide the viewer to further details.
The functional area is where the main functions of a website are concentrated. It is usually located at the upper right of the page or in the right sidebar. The functional area includes content such as e-mail, information publishing, user registration, and site login. Some websites use IP-based positioning to locate the viewer and then show customized information in the functional area, such as local weather and news.
The navigation area uses certain technical means to give visitors to the website a path by which they can conveniently reach the content they need. Navigation areas are generally placed in one of four positions: the left side, the right side, the top, or the bottom. Most websites use a single navigation area, but several can be used, for example a left-side navigation combined with a bottom navigation. However many navigation areas are used, the position of the navigation area is fixed across every page of the website.
The advertisement area is where a website earns revenue or promotes itself. It is usually located in the header, on the right side, or at the bottom of the page. Its content consists mainly of text, images, and Flash animations, and it achieves its advertising effect by attracting viewers to click links. The advertisement area should be conspicuous, reasonable, and eye-catching, which matters greatly for the layout of the whole website.
It should be noted that the footer is generally not included in a document index. However, because the footer is usually used to present the site owner's details and contact information, it is relatively important in the scenario where a user searches for a telephone number. The footer can therefore be included in embodiments of the invention.
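Region-wise text extraction that keeps the footer, as in sub-step S11, might be sketched as follows. This is only an illustration using Python's standard `html.parser`; the assumption that regions are marked with HTML5 semantic tags (`title`, `header`, `nav`, `main`, `aside`, `footer`) is ours and does not come from the text, and real pages often need layout heuristics instead.

```python
from html.parser import HTMLParser

# Assumption: page regions are marked with HTML5 semantic tags.
REGION_TAGS = {"title", "header", "nav", "main", "aside", "footer"}


class RegionTextExtractor(HTMLParser):
    """Collects text fragments per region, including the footer."""

    def __init__(self):
        super().__init__()
        self.stack = []    # region tags that are currently open
        self.regions = {}  # region tag -> list of text fragments

    def handle_starttag(self, tag, attrs):
        if tag in REGION_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            # Attribute the text to the innermost open region.
            self.regions.setdefault(self.stack[-1], []).append(text)


def extract_region_text(html):
    parser = RegionTextExtractor()
    parser.feed(html)
    parser.close()
    return {region: " ".join(parts) for region, parts in parser.regions.items()}
```

The returned dictionary maps each region tag to its text, so a later step can scan the footer text for contact information.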
Step 202, judge whether the text information contains communication characteristic information; if so, perform step 203;
Communication characteristic information is information characterizing a means that can be used for communication.
In an optional embodiment of the invention, the communication characteristic information may include a telephone number with a specified number of digits;
A telephone number is a string of digits. Each number corresponds to a telephone line, and to call another party one dials that party's number. When telephone numbers first came into use they were short, only about two or three digits, and could reach only nearby subscribers. As telephone systems developed and their coverage extended worldwide, telephone numbers grew correspondingly longer. Besides telephones, telephone numbers can also connect computers and fax machines. Numbers are assigned to telephones by the telephone administration. They generally consist of 7 or 8 digits (mobile phone numbers have 11); 5- or 6-digit numbers were used in earlier times.
Then in embodiments of the invention, step 202 may include the following sub-steps:
Sub-step S21, perform word segmentation on the text information to obtain one or more text tokens;
Some common word segmentation methods are described below:
1. String-matching segmentation: the Chinese character string to be analyzed is matched, according to some strategy, against the entries of a preset machine dictionary; if a string is found in the dictionary, the match succeeds (a word is identified).
2. Segmentation based on feature scanning or marker-based splitting: words with obvious features are identified and cut out of the string to be analyzed first; using these words as breakpoints, the original string is divided into smaller strings that are then segmented mechanically, which reduces the matching error rate. Alternatively, segmentation is combined with part-of-speech tagging: the rich part-of-speech information helps the segmentation decisions, and the tagging process in turn checks and adjusts the segmentation result, improving its accuracy.
3. Understanding-based segmentation: the computer simulates human understanding of a sentence in order to identify words. The basic idea is to perform syntactic and semantic analysis while segmenting and to use syntactic and semantic information to resolve ambiguity. Such a system usually has three parts: a segmentation subsystem, a syntactic-semantic subsystem, and a master control part. Coordinated by the master control part, the segmentation subsystem obtains syntactic and semantic information about words and sentences to resolve segmentation ambiguity; that is, it simulates the way a person understands a sentence. This method requires a large amount of linguistic knowledge and information.
4. Statistics-based segmentation: in Chinese text, the frequency or probability with which characters occur adjacent to each other reflects fairly well how likely they are to form a word. The frequency of each combination of adjacent characters in a corpus can therefore be counted to compute their mutual information, based on the adjacent co-occurrence probability of two Chinese characters X and Y. Mutual information reflects how tightly the characters are bound together; when it exceeds a threshold, the character group can be taken to form a word. This method only needs to count character-group frequencies in the corpus and does not require a segmentation dictionary.
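As an illustration of the string-matching approach (method 1), the following sketch uses forward maximum matching, one common matching strategy. The text does not say which strategy is used, so this choice, the tiny dictionary in the usage note, and the `max_len` parameter are assumptions.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy left-to-right segmentation: at each position, take the
    longest dictionary entry of at most max_len characters."""
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when no entry matches.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens
```

With the hypothetical dictionary `{"拨打", "联系", "电话", "联系电话"}`, the string "请拨打联系电话" is split into "请", "拨打", and "联系电话": the longer entry "联系电话" wins over "联系" followed by "电话".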
Sub-step S22, when a text token matches a preset communication identifier, judge whether the first target text token is a digit sequence with the specified number of digits; if so, perform sub-step S23;
A communication identifier can be information that marks a telephone number, for example "please contact", "phone", "mobile phone", "Tel", "Mobile", and so on.
The first target text token can be the text token that follows the text token matching the communication identifier. For example, if the text token "phone" matches a communication identifier, the text token after "phone" can be the first target text token.
Sub-step S23, determine that the first target text token is a telephone number with the specified number of digits.
In embodiments of the invention, when the first target text token is a digit sequence with the specified number of digits, that digit sequence can be determined to be a telephone number with the specified number of digits.
For example, if the first target text token after the text token "phone" is "2223256", then because "2223256" is a 7-digit sequence, it can be determined that "2223256" is a 7-digit telephone number.
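Sub-steps S22 and S23 can be sketched in Python as a scan over adjacent tokens. The identifier set below reuses the examples from the text in lower case; treating the match as case-insensitive and restricting digits to ASCII are assumptions of this sketch.

```python
# Identifiers taken from the examples in the text; the set is illustrative only.
COMM_IDENTIFIERS = {"please contact", "phone", "mobile phone", "tel", "mobile"}
SPECIFIED_DIGITS = (7, 8)  # the specified number of digits is 7 or 8


def find_phone_numbers(tokens):
    """Return tokens that directly follow a communication identifier and
    are digit sequences with the specified number of digits."""
    found = []
    for prev, cur in zip(tokens, tokens[1:]):
        is_identifier = prev.lower() in COMM_IDENTIFIERS
        is_digit_sequence = cur.isascii() and cur.isdigit()
        if is_identifier and is_digit_sequence and len(cur) in SPECIFIED_DIGITS:
            found.append(cur)
    return found
```

Requiring the identifier first keeps ordinary 7- or 8-digit values, such as prices or serial numbers, from being indexed as telephone numbers.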
In an optional embodiment of the invention, the communication characteristic information may also include an area code. A telephone area code is the number assigned to the ordinary telephone zone of each administrative region; these numbers are mainly used for domestic and international long-distance calls. For example, the international code for mainland China is 86 and the area code for Chengdu is 28. For domestic calls, a 0 is dialed before the area code.
Then in embodiments of the invention, step 202 may further include the following sub-steps:
Sub-step S24, judge whether the second target text token contains an area code identifier; if so, perform sub-step S25;
An area code identifier can be information that marks a telephone area code. For example, the "()" in "(010) 2223256" is an area code identifier, and the "-" in "010-2223256" is an area code identifier.
The second target text token can be the text token that follows the text token matching the communication identifier. For example, if the text token "phone" matches a communication identifier, the text token after "phone" can be the second target text token.
Sub-step S25, determine that the text token corresponding to the target text token is the area code;
In embodiments of the invention, when the second target text token contains an area code identifier, the text token corresponding to that target text token can be determined to be the telephone area code.
In an optional example of an embodiment of the invention, sub-step S25 may include the following sub-steps:
Sub-step S251, determine that a text token contained in the target text token is the area code;
For example, the "()" in "(010) 2223256" is an area code identifier, so the text token "010" can be the area code.
Or,
Sub-step S252, determine that the text token preceding the target text token is the area code.
For example, the "-" in "010-2223256" is an area code identifier, so the text token "010" can be the area code.
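The two area code forms in the examples above can be recognized with regular expressions, as in the sketch below. The patterns and the assumed area code length of 2 to 4 digits are illustrative choices that the text does not specify.

```python
import re

# Assumed surface forms, following the two examples in the text:
#   "(010) 2223256"  the area code is enclosed by the identifier "()"
#   "010-2223256"    the area code precedes the identifier "-"
PAREN_FORM = re.compile(r"\(([0-9]{2,4})\)\s*([0-9]{7,8})(?![0-9])")
HYPHEN_FORM = re.compile(r"(?<![0-9])([0-9]{2,4})-([0-9]{7,8})(?![0-9])")


def split_area_code(text):
    """Return (area_code, number) if the text carries an area code
    identifier in one of the two assumed forms, otherwise None."""
    for pattern in (PAREN_FORM, HYPHEN_FORM):
        match = pattern.search(text)
        if match:
            return match.group(1), match.group(2)
    return None
```

Both "(010) 2223256" and "010-2223256" yield the area code "010" and the number "2223256", while a bare "2223256" yields no area code.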
Step 204, the communication characteristic information is extracted;
In embodiments of the present invention, it if judging that there is communication characteristic information in the text information in webpage, can extract The communication characteristic information, such as specify the telephone number of digit, area code etc., to establish document index.
Step 205, document index is established using the communication characteristic information and the webpage.
In the concrete realization, document index may include inverted index, forward index etc., and document index can be by rope Draw table and master file two parts are constituted.
Concordance list can be the table of corresponding relationship between an instruction logic record and physical record.Each in concordance list Referred to as index entry.Index entry is that key (or logic record number) sequence arranges.
In an optional embodiment of the present invention, step 205 may include the following sub-steps:
Sub-step S31, record the position at which the communication characteristic information appears in the webpage;
Sub-step S32, record the communication characteristic information and the position of its appearance in the document index.
In the embodiment of the present invention, the position of appearance can be written into the inverted index so that it can be displayed as web page summary information in a search result item.
In many scenarios, a user who searches for a telephone number mostly wants to confirm that it is a telephone number and to learn who it belongs to, such as a company or a shop. If the user obtains the telephone number and its owner from the summary information in a search result item, there is often no need to click the search result item for a detailed query.
By recording the communication characteristic information and the position of its appearance in the document index, the embodiment of the present invention allows the telephone number and its owner to be obtained from the web page summary information in the search results. This reduces how often users click search results and reduces the resource and bandwidth consumption of the web server and the current electronic device.
An inverted index arises from the practical need to look up records by attribute value. Each entry in such an index table contains an attribute value and the addresses of the records that have that attribute value. Because the attribute value determines the position of the record, rather than the record determining the attribute value, it is called an inverted index. A file with an inverted index is called an inverted index file, or inverted file for short.
In an inverted file (inverted index), the indexed objects are the words in a document or a set of documents (such as webpages). It stores the locations of these words within a document or a group of documents, and is one of the most commonly used indexing mechanisms for documents or document sets.
In a specific implementation, the position at which communication characteristic information appears may include the webpage in which it appears, or the webpage in which it appears together with its position within that webpage.
Taking English as an example, the following is the text information of the webpages to be indexed:
T1 = "it is what it is";
T2 = "what is it";
T3 = "it is a banana";
The corresponding inverted index is as follows:
"a":{(2,2)}
"banana":{(2,3)}
"is":{(0,1),(0,4),(1,1),(2,1)}
"it":{(0,0),(0,3),(1,2),(2,0)}
"what":{(0,2),(1,0)}
Here, "banana":{(2,3)} indicates that "banana" appears in the text information of the third webpage (T3, numbered 2 when counting from 0), at the fourth word of that webpage (address 3).
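The inverted index above can be reproduced with a short sketch. It assumes that webpages and word positions are both numbered from 0 and that words are separated by whitespace; the function name `build_inverted_index` is illustrative and not from the patent.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to a list of (document_number, word_position) pairs."""
    index = defaultdict(list)
    for doc_id, text in enumerate(documents):
        for position, word in enumerate(text.split()):
            index[word].append((doc_id, position))
    return dict(index)


pages = ["it is what it is", "what is it", "it is a banana"]
index = build_inverted_index(pages)
print(index["banana"])  # [(2, 3)]
print(index["is"])      # [(0, 1), (0, 4), (1, 1), (2, 1)]
```

For Chinese text, the whitespace split would have to be replaced by a real word segmentation step.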
Ordinary page analysis does not identify special items (such as telephone numbers and area codes). It may build the document index mainly from key parts such as the title or the keywords provided by the webmaster, and may leave much of the page content out of the document index, so that when a user wants to look up a telephone number, the needed results are not returned.
In addition, large commercial organizations such as banks and online shopping malls generally set up 5-digit telephone numbers or telephone numbers beginning with 400, and these organizations usually use paid bidding to promote such numbers to the first page of the search results.
By contrast, 7-digit or 8-digit telephone numbers generally belong to small organizations such as small companies and small shops, which usually cannot afford the bidding fees. Such numbers are also generally given lower importance than titles, URLs, and the like, so they are usually buried deep in the search results or cannot be found at all.
In the embodiment of the present invention, when the text information in a webpage contains communication characteristic information, a document index is established using the communication characteristic information and the webpage. Marking communication characteristic information in this way can be pictured as building a telephone directory over a wide area (for example, the whole country). It allows webpages whose communication characteristic information matches a telephone number to be shown preferentially when other users later search for that telephone number. This improves search accuracy, reduces the need to page through search results or re-enter search keywords, makes operation simpler, lowers the resource consumption of the search engine and the local system, reduces bandwidth consumption, and improves search efficiency.
Referring to Fig. 3, a flow chart of the steps of search method embodiment 1 according to an embodiment of the present invention is shown, which may specifically include the following steps:
Step 301, receive the user's search keyword;
In a specific implementation, the user can access the search engine from any electronic device, such as a mobile phone, a PDA (Personal Digital Assistant), a laptop computer, or a palmtop computer; the embodiment of the present invention places no restriction on this.
These electronic devices may support operating systems including Android, iOS, Windows Phone, or Windows, and can usually run a browser, or an application with a built-in mini browser, that accesses webpages over the Internet.
In an optional example of an embodiment of the present invention, the user can open the webpage of the search engine in a browser or in an application with a built-in mini browser. The webpage usually includes a search box, in which the user can enter a search keyword.
In another optional example of the embodiment of the present invention, a search plug-in (a plug-in that adds a search function to the browser or the application with a built-in mini browser by interacting with the search engine) may be installed in the browser or the application. The plug-in can provide a search box, in which the user can enter a search keyword.
The browser or the application with a built-in mini browser can assemble the search keyword entered by the user into a search request and send it to the search engine, requesting the search engine to search for information related to the search keyword.
In practical applications, the search request may be an HTTP (Hypertext Transfer Protocol) request. The content of the search request may include the identifier and/or the features of the webpage that the user requests to load. A webpage identifier may be information that represents one uniquely determined webpage, such as a Uniform Resource Identifier (URI); a URI may in turn be a Uniform Resource Locator (URL), a Uniform Resource Name (URN), and so on.
The browser or the application with a built-in mini browser can use DNS (Domain Name System) to resolve the domain name in the webpage URL into the IP (Internet Protocol) address to which it maps. After the IP address is obtained, the browser or the application can request a connection to the search engine at that IP address. Once connected, it can send the request header information over the HTTP protocol to the search engine at that IP address to initiate the search request.
When the search engine receives the search request, it can extract the search keyword from the request and then quickly retrieve search results from the document index according to the search keyword; the search results may include one or more search result items.
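As a rough illustration of how a search keyword might be assembled into the URL of an HTTP request and recovered on the search engine side, consider the sketch below. The endpoint `search.example.com/s` and the parameter name `q` are hypothetical; the patent does not specify either.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical search endpoint; the patent does not name one.
SEARCH_ENDPOINT = "http://search.example.com/s"


def build_search_url(keyword):
    """Assemble the search keyword into the URL of an HTTP GET request."""
    return SEARCH_ENDPOINT + "?" + urlencode({"q": keyword})


url = build_search_url("phone 2223256")
parts = urlsplit(url)
print(parts.hostname)              # the domain name that DNS would resolve
print(parse_qs(parts.query)["q"])  # the keyword as the engine would recover it
```

The sketch stops at building and parsing the URL; no DNS lookup or network request is performed.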
Step 302, identify one or more pieces of search information in the search keyword;
In embodiments of the present invention, one or more pieces of search information in the search keyword can be identified by means such as word segmentation.
For example, if the search keyword is "2223256", it contains one piece of search information, "2223256"; if the search keyword is "phone 2223256", it contains the search information "phone" and "2223256".
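The identification step can be sketched as below. Whitespace splitting stands in for word segmentation, and the identifier list and the labels attached to each piece are assumptions made for illustration only.

```python
# Assumed set of communication identifiers; the real list is not given in the text.
COMMUNICATION_IDENTIFIERS = {"phone", "tel", "telephone"}
SPECIFIED_DIGITS = (7, 8)


def identify_search_info(keyword):
    """Split a search keyword into pieces of search information and label each."""
    labelled = []
    for piece in keyword.split():
        if piece.isascii() and piece.isdigit() and len(piece) in SPECIFIED_DIGITS:
            kind = "digit_sequence"
        elif piece.lower() in COMMUNICATION_IDENTIFIERS:
            kind = "communication_identifier"
        else:
            kind = "other"
        labelled.append((piece, kind))
    return labelled


print(identify_search_info("phone 2223256"))
```

The labels are what the later weighting steps would consult to decide whether the query looks like a telephone number lookup.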
Step 303, when the search information includes a digit sequence with the specified number of digits, increase the weight of the search result items in the search results whose communication characteristic information matches that digit sequence.
In practical applications, if the user searches for a digit sequence with the specified number of digits (such as 7 or 8), the purpose of the query is likely to be looking up a telephone number.
When the communication characteristic information (such as a telephone number) in a search result item matches the digit sequence with the specified number of digits (such as 7 or 8), the weight of that search result item can be increased to move it up in the display order of the search results.
For example, as shown in Fig. 4, if the user searches for "2223256", the search result items containing the telephone number "2223256" can be promoted to the first few positions of the search results, making it easy for the user to find them quickly.
The embodiment of the present invention identifies one or more pieces of search information in the received search keyword and, when the search information includes a digit sequence with the specified number of digits, increases the weight of the search result items whose communication characteristic information matches that digit sequence. Webpages whose communication characteristic information matches the telephone number are therefore shown preferentially. This improves search accuracy, reduces the need to page through search results or re-enter search keywords, makes operation simpler, lowers the resource consumption of the search engine and the local system, reduces bandwidth consumption, and improves search efficiency.
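Step 303 can be sketched as a simple re-weighting pass. The result item layout (`weight`, `phones`), the multiplicative boost, and its default value are assumptions; the patent only states that the weight is increased.

```python
def boost_for_digit_sequence(results, search_info, boost=2.0):
    """Raise the weight of items whose phone numbers match a 7- or 8-digit query term."""
    digit_terms = {t for t in search_info if t.isdigit() and len(t) in (7, 8)}
    boosted = []
    for item in results:
        weight = item["weight"]
        if digit_terms & set(item["phones"]):
            weight *= boost  # assumed boost rule, not specified by the patent
        boosted.append({**item, "weight": weight})
    return boosted


results = [
    {"url": "a", "weight": 0.9, "phones": []},
    {"url": "b", "weight": 0.5, "phones": ["2223256"]},
]
print(boost_for_digit_sequence(results, ["2223256"]))
```

In this made-up data, item "b" starts below item "a" but overtakes it once the matching telephone number is taken into account.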
Referring to Fig. 5, a flow chart of the steps of search method embodiment 2 according to an embodiment of the present invention is shown, which may specifically include the following steps:
Step 501, establish a document index;
In an optional embodiment of the present invention, step 501 may include the following sub-steps:
Sub-step S41, extract the text information in a webpage;
In an optional example of an embodiment of the present invention, the webpage may include at least one of the following regions: page title, header, footer, body content, functional area, and navigation area. In this example, sub-step S41 may include the following sub-step:
Sub-step S411, extract the text information of at least one of the following regions of the webpage: page title, header, footer, body content, functional area, and navigation area.
Sub-step S42, determine whether the text information contains communication characteristic information; if so, execute sub-step S43;
In an optional embodiment of the present invention, the communication characteristic information may include a telephone number with a specified number of digits, and the specified number of digits may be 7 or 8. In embodiments of the present invention, sub-step S42 may then include the following sub-steps:
Sub-step S421, perform word segmentation on the text information to obtain one or more text tokens;
Sub-step S422, when a text token matches a preset communication identifier, determine whether the first target text token is a digit sequence with the specified number of digits; if so, execute sub-step S423;
The first target text token may be the text token that follows the text token matched with the communication identifier;
Sub-step S423, determine that the first target text token is a telephone number with the specified number of digits.
In an optional embodiment of the present invention, the communication characteristic information may also include a telephone area code. In embodiments of the present invention, sub-step S42 may then also include the following sub-steps:
Sub-step S424, determine whether the second target text token contains an area code marker; if so, execute sub-step S425;
The second target text token may be the text token that follows the text token matched with the communication identifier;
Sub-step S425, determine that the text token corresponding to the target text token is the area code.
In an optional example of an embodiment of the present invention, sub-step S425 may include the following sub-steps:
Sub-step S4251, determine that the text token contained in the target text token is the area code;
Alternatively,
Sub-step S4252, determine that the text token preceding the target text token is the area code.
Sub-step S43, extract the communication characteristic information;
Sub-step S44, establish a document index using the communication characteristic information and the webpage.
In an optional embodiment of the present invention, sub-step S44 may include the following sub-steps:
Sub-step S441, record the position at which the communication characteristic information appears in the webpage;
Sub-step S442, record the communication characteristic information and the position of its appearance in the document index.
In embodiments of the present invention, since step 501 is substantially similar to method embodiment 1, it is described only briefly; for related details, refer to the corresponding description of method embodiment 1, which is not repeated here.
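Sub-steps S421 to S423 above can be sketched as follows: after segmentation, a token that follows a communication identifier is treated as a telephone number if it is a 7- or 8-digit sequence. The identifier list and the function name are assumptions, and the input is taken to be already segmented.

```python
# Assumed identifier list; the patent calls these preset communication identifiers.
COMMUNICATION_IDENTIFIERS = {"phone", "tel", "telephone"}
SPECIFIED_DIGITS = (7, 8)


def find_phone_numbers(tokens):
    """Return (position, number) for each digit token that follows a communication identifier."""
    found = []
    for i in range(len(tokens) - 1):
        if tokens[i].lower() in COMMUNICATION_IDENTIFIERS:
            target = tokens[i + 1]  # the first target text token
            if target.isdigit() and len(target) in SPECIFIED_DIGITS:
                found.append((i + 1, target))
    return found


print(find_phone_numbers(["Contact", "phone", "2223256", "today"]))
```

The returned position is the kind of value that sub-steps S441 and S442 would record in the document index alongside the number itself.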
Step 502, receive the user's search keyword;
Step 503, identify one or more pieces of search information in the search keyword;
Step 504, when the search information includes a digit sequence with the specified number of digits, increase the weight of the search result items in the search results whose communication characteristic information matches that digit sequence.
Step 505, when the search information includes a communication identifier, increase the weight of the search result items whose communication characteristic information matches the communication identifier;
In a specific implementation, if the user searches with a communication identifier that matches a communication feature word, the purpose of the query is likely to be looking up a telephone number, so the weight of the search result items containing that communication feature word can be increased to move them further up in the display order of the search results.
When the search information includes a communication identifier, the embodiment of the present invention increases the weight of the search result items whose communication characteristic information matches the communication identifier, further improving search accuracy.
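A minimal sketch of step 505, assuming that a result item "matches" the communication identifier when it carries any recorded telephone number. That matching rule, the identifier list, the item layout, and the boost factor are all assumptions.

```python
# Assumed identifier list; the text does not enumerate the real one.
COMMUNICATION_IDENTIFIERS = {"phone", "tel", "telephone"}


def boost_for_identifier(results, search_info, boost=1.5):
    """Boost items that carry phone information when the query names a communication identifier."""
    asks_for_phone = any(term.lower() in COMMUNICATION_IDENTIFIERS for term in search_info)
    boosted = []
    for item in results:
        weight = item["weight"]
        if asks_for_phone and item["phones"]:
            weight *= boost  # assumed boost rule
        boosted.append({**item, "weight": weight})
    return boosted
```

This boost is independent of the digit-sequence boost in step 504, so a query such as "phone 2223256" could trigger both.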
Step 506, obtain the telephone area code of the current location;
In embodiments of the present invention, the user's current location can be obtained, and the telephone area code corresponding to that location can then be looked up.
If the user submits the search keyword from a mobile device such as a mobile phone, the current longitude and latitude can be determined, and the location at that longitude and latitude can be identified by means such as reverse geocoding.
If the user submits the search keyword from a fixed device such as a computer, the current IP address (Internet Protocol address) can be looked up, and the location of that IP address can then be identified.
Step 507, when the area code matches the communication characteristic information, increase the weight of the search results having that communication characteristic information.
In a specific implementation, if the area code of the user's current location matches a communication feature word (such as a telephone area code), the weight of the search result items containing that communication feature word can be increased to move them further up in the display order of the search results.
When the area code of the current location matches the communication characteristic information, the embodiment of the present invention increases the weight of the search results having that communication characteristic information, further improving search accuracy.
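Steps 506 and 507 can be sketched together. The sketch assumes the user's city has already been determined (by reverse geocoding or IP lookup, neither of which is shown) and uses a tiny hypothetical lookup table; the item field `area_code` and the boost factor are likewise assumptions.

```python
# Hypothetical lookup table; a real system would use a complete city-to-area-code mapping.
AREA_CODES = {"Beijing": "010", "Shanghai": "021"}


def boost_for_local_area_code(results, city, boost=1.5):
    """Raise the weight of items whose recorded area code matches the user's location."""
    local_code = AREA_CODES.get(city)
    boosted = []
    for item in results:
        weight = item["weight"]
        if local_code is not None and item.get("area_code") == local_code:
            weight *= boost  # assumed boost rule
        boosted.append({**item, "weight": weight})
    return boosted
```

If the city is unknown, no area code is available and every weight is left unchanged.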
In practical applications, a search result item may include web page summary information, and the web page summary information may include the webpage information at the position where the communication characteristic information (such as a telephone number or area code) appears in the webpage.
For example, as shown in Fig. 6, if the user searches for "phone 2223256", the search result items containing "phone" (communication identifier) and "2223256" (digit sequence with the specified number of digits) can be promoted to the first few positions of the search results, making it easy for the user to find them quickly.
In the embodiment of the present invention, the web page summary information is generated from the communication characteristic information and the position of its appearance, so the telephone number and its owner can be obtained from the summary information in the search results. This reduces how often users click search results and reduces the resource and bandwidth consumption of the web server and the current electronic device.
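One way the recorded position might be turned into summary information is to take a window of words around it. The window radius and the sample page text below are made up for illustration; the patent does not describe how the summary is cut.

```python
def make_summary(words, position, radius=3):
    """Return the words around `position` as a one-line summary."""
    start = max(0, position - radius)
    end = min(len(words), position + radius + 1)
    return " ".join(words[start:end])


# Made-up page text for illustration.
page_words = "Welcome to Sunshine Bakery order by phone 2223256 open daily until nine".split()
summary = make_summary(page_words, page_words.index("2223256"))
print(summary)  # order by phone 2223256 open daily until
```

Because the position is stored in the index, the summary can be cut without searching the page text for the number again.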
Step 508, sort the one or more search result items according to the weight;
In the embodiment of the present invention, the one or more search result items can be sorted by weight: the higher the weight of a search result item, the earlier it is ranked, and the lower the weight, the later it is ranked.
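The sorting in step 508 amounts to ordering the items by descending weight. A minimal sketch, assuming each item is a dictionary with a numeric `weight` field:

```python
def rank_results(items):
    """Order search result items from highest weight to lowest."""
    return sorted(items, key=lambda item: item["weight"], reverse=True)


ranked = rank_results([
    {"url": "a", "weight": 0.9},
    {"url": "b", "weight": 1.6},
    {"url": "c", "weight": 0.4},
])
print([item["url"] for item in ranked])  # ['b', 'a', 'c']
```

Python's `sorted` is stable, so items with equal weight keep their original relative order.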
Step 509, return the sorted search results to the client for display.
Under the HTTP protocol, the browser or the application with a built-in mini browser can receive a document of HTML (Hypertext Markup Language) type from the server hosting the search engine.
The browser or the application with a built-in mini browser can parse the HTML document and generate a tree of objects, that is, the DOM (Document Object Model). Each object is a node in the DOM, and these objects can represent web page resources such as text and pictures. The browser or the application can begin to display the HTML document and obtain the addresses of the web page resources embedded in it; it then sends further requests to the server to fetch these resources, and displays the search results in the HTML document.
For simplicity of description, the method embodiments are expressed as a series of action combinations. Those skilled in the art should understand, however, that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Referring to Fig. 7, a structural block diagram of an embodiment of an apparatus for establishing a document index according to an embodiment of the present invention is shown, which may specifically include the following modules:
First extraction module 701, adapted to extract the text information in a webpage;
Judgment module 702, adapted to determine whether the text information contains communication characteristic information and, if so, to call the second extraction module 703;
Second extraction module 703, adapted to extract the communication characteristic information;
Establishing module 704, adapted to establish a document index using the communication characteristic information and the webpage.
In an optional example of an embodiment of the present invention, the webpage may include at least one of the following regions: page title, header, footer, body content, functional area, and navigation area;
The first extraction module 701 may also be adapted to:
Extract the text information of at least one of the following regions of the webpage: page title, header, footer, body content, functional area, and navigation area.
In an optional embodiment of the present invention, the communication characteristic information may include a telephone number with a specified number of digits; the judgment module 702 may also be adapted to:
Perform word segmentation on the text information to obtain one or more text tokens;
When a text token matches a preset communication identifier, determine whether the first target text token is a digit sequence with the specified number of digits; the first target text token is the text token that follows the text token matched with the communication identifier;
If so, determine that the first target text token is a telephone number with the specified number of digits.
In an optional embodiment of the present invention, the communication characteristic information may also include a telephone area code; the judgment module 702 may also be adapted to:
Determine whether the second target text token contains an area code marker; if so, determine that the text token corresponding to the target text token is the area code; the second target text token is the text token that follows the text token matched with the communication identifier.
In an optional example of an embodiment of the present invention, the judgment module 702 may also be adapted to:
Determine that the text token contained in the target text token is the area code;
Alternatively,
Determine that the text token preceding the target text token is the area code.
In an optional example of an embodiment of the present invention, the specified number of digits may be 7 or 8.
In an optional embodiment of the present invention, the establishing module 704 may also be adapted to:
Record the position at which the communication characteristic information appears in the webpage;
Record the communication characteristic information and the position of its appearance in the document index.
Referring to Fig. 8, a structural block diagram of an embodiment of a search apparatus according to an embodiment of the present invention is shown, which may specifically include the following modules:
Receiving module 801, adapted to receive the user's search keyword;
Identification module 802, adapted to identify one or more pieces of search information in the search keyword;
First increasing module 803, adapted to increase, when the search information includes a digit sequence with the specified number of digits, the weight of the search result items in the search results whose communication characteristic information matches that digit sequence.
In an optional embodiment of the present invention, the apparatus may also include the following module:
Second increasing module, adapted to increase, when the search information includes a communication identifier, the weight of the search result items whose communication characteristic information matches the communication identifier.
In an optional embodiment of the present invention, the apparatus may also include the following modules:
Obtaining module, adapted to obtain the telephone area code of the current location;
Third increasing module, adapted to increase, when the area code matches the communication characteristic information, the weight of the search results having that communication characteristic information.
In an optional embodiment of the present invention, the apparatus may also include the following modules:
Sorting module, adapted to sort the one or more search result items according to the weight;
Return module, adapted to return the sorted search results to the client for display.
In an optional embodiment of the present invention, a search result item may include web page summary information, and the web page summary information may include the webpage information at the position where the communication characteristic information appears in the webpage.
In an optional embodiment of the present invention, the apparatus may also include the following module:
Document index establishing module, adapted to establish a document index.
In an optional embodiment of the present invention, the document index establishing module may also be adapted to:
Extract the text information in a webpage;
Determine whether the text information contains communication characteristic information; if so, extract the communication characteristic information;
Establish a document index using the communication characteristic information and the webpage.
In an optional example of an embodiment of the present invention, the webpage may include at least one of the following regions: page title, banner, header, footer, navigation, and body content;
The document index establishing module may also be adapted to:
Extract the text information of at least one of the following regions of the webpage: page title, header, footer, body content, functional area, and navigation area.
In an optional embodiment of the present invention, the communication characteristic information may include a telephone number with a specified number of digits; the document index establishing module may also be adapted to:
Perform word segmentation on the text information to obtain one or more text tokens;
When a text token matches a preset communication identifier, determine whether the first target text token is a digit sequence with the specified number of digits; the first target text token is the text token that follows the text token matched with the communication identifier;
If so, determine that the first target text token is a telephone number with the specified number of digits.
In an optional embodiment of the present invention, the communication characteristic information may also include a telephone area code; the document index establishing module may also be adapted to:
Determine whether the second target text token contains an area code marker; if so, determine that the text token corresponding to the target text token is the area code; the second target text token is the text token that follows the text token matched with the communication identifier.
In an optional example of an embodiment of the present invention, the document index establishing module may also be adapted to:
Determine that the text token contained in the target text token is the area code;
Alternatively,
Determine that the text token preceding the target text token is the area code.
In an optional example of an embodiment of the present invention, the specified number of digits may be 7 or 8.
In an optional embodiment of the present invention, the document index establishing module may also be adapted to:
Record the position at which the communication characteristic information appears in the webpage;
Record the communication characteristic information and the position of its appearance in the document index.
Since the apparatus embodiments are substantially similar to the method embodiments, they are described only briefly; for related details, refer to the corresponding description of the method embodiments.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such a system is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in various programming languages, and the above description of a specific language is given to disclose the best mode of the invention.
In the specification provided here, numerous specific details are set forth. It is to be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this specification.
Similarly, it should be understood that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together into a single embodiment, figure, or description thereof in the above description of exemplary embodiments of the invention. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single embodiment disclosed above. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components of an embodiment may be combined into one module or unit or component, and may furthermore be divided into a plurality of sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
In addition, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any one of the claimed embodiments can be used in any combination.
The various component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the search device according to embodiments of the present invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order; these words may be interpreted as names.

Claims (20)

1. A search method, comprising:
extracting text information from a web page;
determining whether the text information contains communication characteristic information;
wherein the step of determining whether the text information contains communication characteristic information comprises:
performing word segmentation on the text information to obtain one or more text segments;
when a text segment matches a preset communication identifier, determining whether a second target text segment contains an area-code identifier; wherein the communication identifier is information that identifies a telephone number, and the area-code identifier is information that identifies a telephone area code;
if so, determining that the text segment contained in the second target text segment is the area code;
or,
determining that the text segment preceding the second target text segment is the area code; wherein the second target text segment is the text segment following the text segment that matches the communication identifier;
extracting the communication characteristic information;
building a document index from the communication characteristic information and the web page, the communication characteristic information comprising an area code;
receiving a search keyword from a user;
identifying one or more pieces of search information in the search keyword by means of word segmentation; wherein the word segmentation means comprise: segmentation based on string matching, or segmentation based on feature scanning or marker cutting, or segmentation based on understanding, or segmentation based on statistics;
when the area code matches the communication characteristic information, increasing the weight of the search result having the communication characteristic information.
2. The search method of claim 1, characterized by further comprising:
when the search information includes a communication identifier, increasing the weight of the search result entry having communication characteristic information that matches the communication identifier.
3. The search method of claim 1, characterized by further comprising:
obtaining the area code of the current location;
when the area code matches the communication characteristic information, increasing the weight of the search result having the communication characteristic information.
4. The search method of claim 1, 2 or 3, characterized by further comprising:
sorting the one or more search result entries according to the weights;
returning the sorted search results to a client for display.
5. The search method of claim 1 or 2, characterized in that the search result entry includes web page summary information, the web page summary information including the web page information corresponding to the position at which the communication characteristic information appears in the web page.
6. The search method of claim 1, characterized in that the web page includes at least one of the following regions: page title, banner, header, footer, navigation, and body content;
the step of extracting text information from the web page comprises:
extracting text information from at least one of the following regions of the web page: page title, header, footer, body content, functional area, and navigation area.
7. The search method of claim 1, characterized in that the communication characteristic information includes a telephone number with a specified number of digits, the method further comprising:
when a text segment matches the preset communication identifier, determining whether a first target text segment is a digit sequence with the specified number of digits; wherein the first target text segment is the text segment following the text segment that matches the communication identifier;
if so, determining that the first target text segment is a telephone number with the specified number of digits.
8. The search method of claim 7, characterized in that the specified number of digits is 7 or 8.
9. The search method of claim 1, characterized in that the step of building a document index from the communication characteristic information and the web page comprises:
recording the position at which the communication characteristic information appears in the web page;
recording the communication characteristic information and the position of its appearance in the document index.
10. A search apparatus, comprising:
a document index building module, adapted to build a document index;
the document index building module being further adapted to:
extract text information from a web page;
when a text segment matches a preset communication identifier, determine whether the text information contains communication characteristic information; if so, extract the communication characteristic information;
build the document index from the communication characteristic information and the web page, the communication characteristic information comprising an area code;
the document index building module being further adapted to:
perform word segmentation on the text information to obtain one or more text segments;
the document index building module being further adapted to:
determine whether a second target text segment contains an area-code identifier; wherein the communication identifier is information that identifies a telephone number, and the area-code identifier is information that identifies a telephone area code; if so, determine that the text segment contained in the second target text segment is the area code;
or,
determine that the text segment preceding the second target text segment is the area code; wherein the second target text segment is the text segment following the text segment that matches the communication identifier;
a receiving module, adapted to receive a search keyword from a user;
an identification module, adapted to identify one or more pieces of search information in the search keyword by means of word segmentation; wherein the word segmentation means comprise: segmentation based on string matching, or segmentation based on feature scanning or marker cutting, or segmentation based on understanding, or segmentation based on statistics;
a first boosting module, adapted to, when the area code matches the communication characteristic information, increase the weight of the search result having the communication characteristic information.
11. The search apparatus of claim 10, characterized by further comprising:
a second boosting module, adapted to, when the search information includes a communication identifier, increase the weight of the search result entry having communication characteristic information that matches the communication identifier.
12. The search apparatus of claim 10, characterized by further comprising:
an obtaining module, adapted to obtain the area code of the current location;
a third boosting module, adapted to, when the area code matches the communication characteristic information, increase the weight of the search result having the communication characteristic information.
13. The search apparatus of claim 10, characterized by further comprising:
a sorting module, adapted to sort the one or more search result entries according to the weights;
a return module, adapted to return the sorted search results to a client for display.
14. The search apparatus of claim 10 or 11, characterized in that the search result entry includes web page summary information, the web page summary information including the web page information corresponding to the position at which the communication characteristic information appears in the web page.
15. The search apparatus of claim 10, characterized in that the web page includes at least one of the following regions: page title, banner, header, footer, navigation, and body content;
the document index building module being further adapted to:
extract text information from at least one of the following regions of the web page: page title, header, footer, body content, functional area, and navigation area.
16. The search apparatus of claim 10, characterized in that the communication characteristic information includes a telephone number with a specified number of digits; the document index building module being further adapted to:
when a text segment matches the preset communication identifier, determine whether a first target text segment is a digit sequence with the specified number of digits; wherein the first target text segment is the text segment following the text segment that matches the communication identifier;
if so, determine that the first target text segment is a telephone number with the specified number of digits.
17. The search apparatus of claim 16, characterized in that the specified number of digits is 7 or 8.
18. The search apparatus of claim 10, characterized in that the document index building module is further adapted to:
record the position at which the communication characteristic information appears in the web page;
record the communication characteristic information and the position of its appearance in the document index.
19. An electronic device, comprising:
a processor; and
a memory arranged to store computer-executable instructions which, when executed, cause the processor to perform the method of any one of claims 1-9.
20. A computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs which, when executed by an electronic device comprising a plurality of application programs, cause the electronic device to perform the method of any one of claims 1-9.
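
For illustration only, and not as part of the claims, the indexing and ranking steps recited in claims 1, 4, 7, 8 and 9 might be sketched in Python roughly as follows. The sketch rests on several assumptions that the claims do not fix: a regular-expression tokenizer stands in for real word segmentation, the communication identifiers are the English words "tel", "phone" and "telephone", the area-code identifiers are "-" and ")", and the weight is raised by a multiplicative factor. All function names, variable names and example URLs are hypothetical.

```python
import re
from collections import defaultdict

# Assumed values for illustration; the claims do not fix these lists.
COMM_IDENTIFIERS = {"tel", "phone", "telephone"}  # communication identifiers
AREA_CODE_MARKS = {"-", ")"}                      # area-code identifiers
PHONE_DIGITS = (7, 8)                             # specified digit counts (claim 8)


def segment(text):
    """Stand-in for word segmentation: runs of letters, runs of digits, marks."""
    return re.findall(r"[A-Za-z]+|\d+|[-()]", text)


def extract_features(text):
    """Collect the area code and number that follow each communication identifier."""
    tokens = segment(text)
    features = []
    for i, tok in enumerate(tokens):
        if tok.lower() not in COMM_IDENTIFIERS:
            continue
        area_code, number = None, None
        window = tokens[i + 1:i + 6]  # segments following the identifier
        for j, cand in enumerate(window):
            if cand in AREA_CODE_MARKS and j > 0 and window[j - 1].isdigit():
                area_code = window[j - 1]  # segment before the area-code mark
            elif cand.isdigit() and len(cand) in PHONE_DIGITS:
                number = cand              # digit sequence of the specified length
                break
        if area_code or number:
            features.append({"area_code": area_code, "number": number, "position": i})
    return features


def build_index(pages):
    """Map each area code to the (url, position) pairs where it appears."""
    index = defaultdict(list)
    for url, text in pages.items():
        for feature in extract_features(text):
            if feature["area_code"]:
                index[feature["area_code"]].append((url, feature["position"]))
    return index


def rank(results, query, index, boost=2.0):
    """Raise the weight of results whose indexed area code appears in the query."""
    boosted = {url for tok in segment(query) for url, _pos in index.get(tok, [])}
    weights = {url: w * boost if url in boosted else w for url, w in results.items()}
    return sorted(weights, key=weights.get, reverse=True)


pages = {
    "a.example": "Pizza place. Tel: 010-62345678",
    "b.example": "Pizza place. Phone (021) 5551234",
}
index = build_index(pages)
print(rank({"a.example": 1.0, "b.example": 1.2}, "pizza 010", index))
```

In this sketch the query "pizza 010" moves a.example ahead of b.example even though its base weight is lower, because the area code 010 was indexed for that page. A production system would presumably use a real word segmenter and a richer weighting scheme; the multiplicative boost here is only one possible choice.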
CN201410806935.9A 2014-12-22 2014-12-22 A kind of method and apparatus of search Active CN104504070B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410806935.9A CN104504070B (en) 2014-12-22 2014-12-22 A kind of method and apparatus of search

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410806935.9A CN104504070B (en) 2014-12-22 2014-12-22 A kind of method and apparatus of search

Publications (2)

Publication Number Publication Date
CN104504070A CN104504070A (en) 2015-04-08
CN104504070B true CN104504070B (en) 2019-06-04

Family

ID=52945468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410806935.9A Active CN104504070B (en) 2014-12-22 2014-12-22 A kind of method and apparatus of search

Country Status (1)

Country Link
CN (1) CN104504070B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107016602B (en) * 2017-04-19 2022-04-15 国网冀北电力有限公司物资分公司 Management method and management system for bid security
CN111914201B (en) * 2020-08-07 2023-11-07 腾讯科技(深圳)有限公司 Processing method and device of network page

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102110123A (en) * 2009-12-29 2011-06-29 中国人民解放军国防科学技术大学 Method for establishing inverted index
CN102368252A (en) * 2010-09-30 2012-03-07 微软公司 Applying search inquiry in content set
CN103970747A (en) * 2013-01-24 2014-08-06 爱帮聚信(北京)科技有限公司 Data processing method for network side computer to order search results

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103092856B (en) * 2011-10-31 2015-09-23 阿里巴巴集团控股有限公司 Search result ordering method and equipment, searching method and equipment

Also Published As

Publication number Publication date
CN104504070A (en) 2015-04-08

Similar Documents

Publication Publication Date Title
CN109033358B (en) Method for associating news aggregation with intelligent entity
CN101452453B (en) A kind of method of input method Web side navigation and a kind of input method system
CN102982117B (en) Information search method and device
US11055373B2 (en) Method and apparatus for generating information
US20090240638A1 (en) Syntactic and/or semantic analysis of uniform resource identifiers
US8086953B1 (en) Identifying transient portions of web pages
CN104102639B (en) Popularization triggering method based on text classification and device
CN100511230C (en) Webpage-text based image search and display method thereof
CN1902627A (en) Systems and methods for direct navigation to specific portion of target document
CN102306201B (en) Method and system for analyzing webpage title
CN104715064A (en) Method and server for marking keywords on webpage
CN103838862B (en) Video searching method, device and terminal
CN101114284B (en) Method for displaying web page content relevant information and system
CN104391978A (en) Method and device for storing and processing web pages of browsers
CN104881428A (en) Information graph extracting and retrieving method and device for information graph webpages
CN105808615A (en) Document index generation method and device based on word segment weights
CN103778156A (en) Method and device for searching for data and server for data search
KR100913733B1 (en) Method for Providing Search Result Using Template
CN103530389A (en) Method and device for improving stopword searching effectiveness
CN104778232B (en) Searching result optimizing method and device based on long query
CN105204806A (en) Individual display method and device for mobile terminal webpage
US8635205B1 (en) Displaying local site name information with search results
KR20090130364A (en) Method, apparatus and computer-readable recording medium for tagging image contained in web page and providing web search service using tagged result
CN110955855B (en) Information interception method, device and terminal
CN104504070B (en) A kind of method and apparatus of search

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220721

Address after: Room 801, 8th floor, No. 104, floors 1-19, building 2, yard 6, Jiuxianqiao Road, Chaoyang District, Beijing 100015

Patentee after: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Qizhi software (Beijing) Co.,Ltd.