CN104504070A - Search method and device - Google Patents


Info

Publication number
CN104504070A
Authority
CN
China
Prior art keywords
search
characteristic information
communication characteristic
information
webpage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410806935.9A
Other languages
Chinese (zh)
Other versions
CN104504070B (en)
Inventor
王翀
陈进平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd and Qizhi Software Beijing Co Ltd
Priority to CN201410806935.9A
Publication of CN104504070A
Application granted
Publication of CN104504070B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/951 Indexing; Web crawling techniques

Abstract

Embodiments of the invention provide a search method and device. The method comprises: receiving search keywords from a user; identifying one or more pieces of search information in the search keywords; and, when the search information includes a digit sequence with a specified number of digits, increasing the weight of search result items whose communication characteristic information matches that digit sequence. Because web pages whose communication characteristic information matches a telephone number are displayed preferentially, search accuracy improves. Users page through results and re-enter search keywords less often, which simplifies operation, reduces the resource consumption of the search engine and the local system, reduces bandwidth consumption, and increases search efficiency.

Description

A search method and device
Technical Field
The present invention relates to the field of search technology, and in particular to a search method and a search device.
Background
With the rapid development of networks, the amount of information on them has increased sharply. To find what they need in this mass of information, users usually turn to a search engine.
A search engine is a system that automatically gathers information from the Internet, organizes it, and makes it available for users to query. Information on the Internet is vast, varied, and unordered: individual pieces of information are like islands scattered across a wide sea, and hyperlinks are the bridges that criss-cross between them. The search engine draws a clear map of this information for users to consult at any time.
However, as shown in Fig. 1, when a user searches for an ordinary telephone number (such as 2223256), the search engine still produces results using its general-purpose algorithm. Because titles and links carry higher weight, the top-ranked results are often those in which the query term appears in the title or link. These results are sometimes not what the user needs, so accuracy is low. When users do not find the required information, they usually page through the search results or re-enter search keywords and search again. This is cumbersome, consumes considerable search engine and local system resources and bandwidth, and makes searching inefficient.
Summary of the Invention
In view of the above problems, the present invention is proposed to provide a search method and a corresponding search device that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a search method is provided, comprising:
receiving search keywords from a user;
identifying one or more pieces of search information in the search keywords;
when the search information includes a digit sequence with a specified number of digits, increasing the weight, in the search results, of search result items having communication characteristic information that matches the digit sequence.
Optionally, the method further comprises:
when the search information includes a communication identifier, increasing the weight of search result items having communication characteristic information that matches the communication identifier.
Optionally, the method further comprises:
obtaining the area code of the current location;
when the area code matches the communication characteristic information, increasing the weight of search results having that communication characteristic information.
Optionally, the method further comprises:
sorting the one or more search result items according to the weight;
returning the sorted search results to the client for display.
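The weighting and sorting steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `ResultItem` structure, the multiplicative `boost` factor, the identifier list, and the decision to apply the three boosts independently are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field

SPECIFIED_LENGTHS = (7, 8)                      # specified number of digits named in the text
COMM_IDENTIFIERS = ("phone", "tel", "mobile")   # hypothetical identifier list


@dataclass
class ResultItem:
    url: str
    weight: float                                      # base weight from the general algorithm
    phone_numbers: set = field(default_factory=set)    # extracted when the index was built
    area_codes: set = field(default_factory=set)


def find_digit_sequences(query: str) -> list:
    """Return digit runs in the query that have a specified number of digits."""
    return [d for d in re.findall(r"\d+", query) if len(d) in SPECIFIED_LENGTHS]


def rank(query: str, items: list, local_area_code: str = "", boost: float = 2.0) -> list:
    """Raise the weight of matching items in place, then sort by weight, highest first."""
    digits = find_digit_sequences(query)
    has_identifier = any(word in query.lower().split() for word in COMM_IDENTIFIERS)
    for item in items:
        if any(d in item.phone_numbers for d in digits):
            item.weight *= boost      # query digit sequence matches a stored phone number
        if has_identifier and item.phone_numbers:
            item.weight *= boost      # query carries a communication identifier
        if local_area_code and local_area_code in item.area_codes:
            item.weight *= boost      # area code of the current location matches
    return sorted(items, key=lambda it: it.weight, reverse=True)
```

A multiplicative boost keeps the base relevance in play, so a matching page with very low base weight does not automatically outrank everything else; an additive boost would be an equally plausible choice.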
Optionally, the search result item includes web page summary information, and the web page summary information includes the web page content at the position where the communication characteristic information appears in the web page.
Optionally, before the step of receiving search keywords from a user, the method further comprises:
building a document index.
Optionally, the step of building a document index comprises:
extracting text information from a web page;
judging whether the text information contains communication characteristic information, and if so, extracting the communication characteristic information;
building the document index using the communication characteristic information and the web page.
Optionally, the web page includes at least one of the following regions: page title, banner, header, footer, navigation, and body content;
the step of extracting text information from a web page comprises:
extracting text information from at least one of the page title, header, footer, body content, functional area, and navigation area of the web page.
Optionally, the communication characteristic information includes a telephone number with the specified number of digits;
the step of judging whether the text information contains communication characteristic information comprises:
performing word segmentation on the text information to obtain one or more text tokens;
when a text token matches a preset communication identifier, judging whether a first target text token is a digit sequence with the specified number of digits, the first target text token being the text token that follows the text token matching the communication identifier;
if so, judging that the first target text token is a telephone number with the specified number of digits.
Optionally, the communication characteristic information further includes an area code;
the step of judging whether the text information contains communication characteristic information further comprises:
judging whether a second target text token contains an area code identifier, and if so, judging that the text token corresponding to the target text token is an area code, the second target text token being the text token that follows the text token matching the communication identifier.
Optionally, the step of judging that the text token corresponding to the target text token is an area code comprises:
judging that the text token contained in the target text token is the area code;
or,
judging that the text token preceding the target text token is the area code.
Optionally, the specified number of digits is 7 or 8.
Optionally, the step of building the document index using the communication characteristic information and the web page comprises:
recording the position at which the communication characteristic information appears in the web page;
recording the communication characteristic information and that position in the document index.
According to another aspect of the present invention, a search device is provided, comprising:
a receiving module, adapted to receive search keywords from a user;
an identification module, adapted to identify one or more pieces of search information in the search keywords;
a first boosting module, adapted to increase, when the search information includes a digit sequence with a specified number of digits, the weight in the search results of search result items having communication characteristic information that matches the digit sequence.
Optionally, the device further comprises:
a second boosting module, adapted to increase, when the search information includes a communication identifier, the weight of search result items having communication characteristic information that matches the communication identifier.
Optionally, the device further comprises:
an acquisition module, adapted to obtain the area code of the current location;
a third boosting module, adapted to increase, when the area code matches the communication characteristic information, the weight of search results having that communication characteristic information.
Optionally, the device further comprises:
a sorting module, adapted to sort the one or more search result items according to the weight;
a returning module, adapted to return the sorted search results to the client for display.
Optionally, the search result item includes web page summary information, and the web page summary information includes the web page content at the position where the communication characteristic information appears in the web page.
Optionally, the device further comprises:
a document index building module, adapted to build a document index.
Optionally, the document index building module is further adapted to:
extract text information from a web page;
judge whether the text information contains communication characteristic information, and if so, extract the communication characteristic information;
build the document index using the communication characteristic information and the web page.
Optionally, the web page includes at least one of the following regions: page title, banner, header, footer, navigation, and body content;
the document index building module is further adapted to:
extract text information from at least one of the page title, header, footer, body content, functional area, and navigation area of the web page.
Optionally, the communication characteristic information includes a telephone number with the specified number of digits; the document index building module is further adapted to:
perform word segmentation on the text information to obtain one or more text tokens;
when a text token matches a preset communication identifier, judge whether a first target text token is a digit sequence with the specified number of digits, the first target text token being the text token that follows the text token matching the communication identifier;
if so, judge that the first target text token is a telephone number with the specified number of digits.
Optionally, the communication characteristic information further includes an area code; the document index building module is further adapted to:
judge whether a second target text token contains an area code identifier, and if so, judge that the text token corresponding to the target text token is an area code, the second target text token being the text token that follows the text token matching the communication identifier.
Optionally, the document index building module is further adapted to:
judge that the text token contained in the target text token is the area code;
or,
judge that the text token preceding the target text token is the area code.
Optionally, the specified number of digits is 7 or 8.
Optionally, the document index building module is further adapted to:
record the position at which the communication characteristic information appears in the web page;
record the communication characteristic information and that position in the document index.
For the received search keywords, embodiments of the invention identify one or more pieces of search information. When the search information includes a digit sequence with the specified number of digits, the weight of search result items whose communication characteristic information matches that digit sequence is increased, so that web pages whose communication characteristic information matches the telephone number are displayed preferentially. This improves search accuracy, which in turn reduces paging through search results and re-entering search keywords, simplifies operation, reduces the resource consumption of the search engine and the local system, reduces bandwidth consumption, and improves search efficiency.
When the search information includes a communication identifier, embodiments of the invention increase the weight of search result items having communication characteristic information that matches the communication identifier, further improving search accuracy.
When the area code of the current location matches the communication characteristic information, embodiments of the invention increase the weight of search results having that communication characteristic information, further improving search accuracy.
Embodiments of the invention generate web page summary information from the communication characteristic information and the position at which it appears, so that the telephone number and its owner can be read from the summary in a search result item. This reduces how often users click through search results and reduces the resource and bandwidth consumption of the web server and the current electronic device.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief Description of the Drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
Fig. 1 shows an example of search results;
Fig. 2 shows a flow chart of the steps of an embodiment of a method for building a document index according to an embodiment of the invention;
Fig. 3 shows an example of search results according to an embodiment of the invention;
Fig. 4 shows a flow chart of the steps of embodiment 1 of a search method according to an embodiment of the invention;
Fig. 5 shows a flow chart of the steps of embodiment 2 of a search method according to an embodiment of the invention;
Fig. 6 shows an example of search results according to an embodiment of the invention;
Fig. 7 shows a structural block diagram of an embodiment of a device for building a document index according to an embodiment of the invention; and
Fig. 8 shows a structural block diagram of an embodiment of a search device according to an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.
Referring to Fig. 2, a flow chart of the steps of an embodiment of a method for building a document index according to an embodiment of the invention is shown. The method may specifically comprise the following steps:
Step 201: extract text information from a web page;
The processing flow of a search engine can generally be divided into two parts: the first handles user requests at the front end, and the second produces data at the back end.
I. Front-end processing of a user request may include:
1. Input: the user enters keywords;
2. Query analysis: the search engine segments the keywords into words;
3. Retrieval: according to the segmentation result, the relevant set of web pages is found in the document index built in advance;
4. Ranking: the candidate web pages are sorted along dimensions such as content relevance and timeliness;
5. Presentation: the sorted web pages are displayed.
II. Back-end data production may include:
1. Web page crawling: a crawler follows the links between web pages, fetches pages from the Internet, and saves them;
2. Index building: the saved pages are analyzed, the page titles and page text are segmented into words, and a document index is built from the segmentation result for front-end retrieval.
Web pages fetched by the crawler can be stored in a web page database, forming a large pool of search resources, and the page content can contain a large amount of text information. In embodiments of the invention, the text information in a web page can therefore be extracted from the web page database.
In an optional example of the embodiment of the invention, the web page includes at least one of the following regions: page title, header, footer, body content, functional area, and navigation area. In that case, step 201 may comprise the following sub-step:
Sub-step S11: extract text information from at least one of the page title, header, footer, body content, functional area, and navigation area of the web page.
Websites of different natures and categories generally arrange their page content differently. The basic content of a typical web page nevertheless includes a title, logo, header, footer, body content, functional area, navigation area, advertisement area, and so on. The arrangement of these elements on the page constitutes its overall layout.
Every web page has a piece of information at the very top that usually appears in the browser's title bar rather than within the page itself, yet it is still part of the page layout. This information indicates the main content of the page; it is the title.
The logo is a means by which the site owner presents its image to the outside world.
The upper end of a web page is its header. Not every web page has a header, but it is often a prominent position on the page that readily attracts the viewer's attention, so many websites place promotional content there, such as the site's mission or logo.
Body content is the most important element of a web page. It is usually not complete in itself; rather, it consists of next-level titles, summaries, and selected and edited hyperlinks. Through hyperlinks, body content can use one page to summarize the content of several pages, and the body content of a home page can even summarize the whole website in one page.
The bottom part of a web page is called the footer. It is usually used to give specific information about the site owner and how to get in touch, such as name, address, contact details, and copyright information. Some of this content is made into title-style hyperlinks that guide viewers to more detail.
The functional area is where a website's main functions are concentrated. It is generally located at the upper right of the page or in the right sidebar. It includes items such as e-mail, information publishing, user registration, and website login. Some websites use IP-based positioning to locate the viewer and can then display personalized local information, such as weather and news, in the functional area.
The navigation area gives visitors a route, by some technical means, to reach the content they need easily. It generally occupies one of four positions: left, right, top, or bottom. Most websites use a single navigation area, but several can be combined, for example left-hand navigation together with bottom navigation. However many navigation areas are used, their position is fixed on every page of the site.
The advertisement area is where a website earns revenue or promotes itself. It is generally located in the header, on the right side, or at the bottom of the page. Its content consists mainly of text, images, and Flash animations, and it achieves its effect by attracting viewers to click links. An advertisement area should be conspicuous, reasonable, and eye-catching, which matters a great deal to the layout of the whole site.
Note that the footer is not normally included in a general document index. However, because the footer usually gives specific information about the site owner and how to get in touch, it is relatively important when a user searches for a telephone number. The footer can therefore be included in embodiments of the invention.
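Sub-step S11 can be illustrated with a short sketch that pulls text out of page regions, footer included. It assumes the regions are marked with HTML5 semantic tags (`title`, `header`, `footer`, `nav`, `main`); that mapping is an assumption made for the example, since the text does not say how regions are recognized, and real pages would typically need layout analysis instead.

```python
from html.parser import HTMLParser

# Assumed mapping from HTML5 tags to the regions named in the text.
REGION_TAGS = {
    "title": "page_title",
    "header": "header",
    "footer": "footer",
    "nav": "navigation",
    "main": "body_content",
}


class RegionTextExtractor(HTMLParser):
    """Collects text chunks per region while the page is parsed."""

    def __init__(self):
        super().__init__()
        self.stack = []       # region tags that are currently open
        self.regions = {}     # region name -> list of text chunks

    def handle_starttag(self, tag, attrs):
        if tag in REGION_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            # Attribute the text to the innermost open region.
            region = REGION_TAGS[self.stack[-1]]
            self.regions.setdefault(region, []).append(text)


def extract_region_text(html: str) -> dict:
    """Return a dict mapping region name to the text found in that region."""
    parser = RegionTextExtractor()
    parser.feed(html)
    parser.close()
    return {name: " ".join(chunks) for name, chunks in parser.regions.items()}
```

Keeping the footer as its own region makes it easy to feed contact details such as "Tel: 2223256" into the later telephone number detection step.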
Step 202: judge whether the text information contains communication characteristic information; if so, perform step 203;
Communication characteristic information is information that characterizes a means of communication.
In one embodiment of the invention, the communication characteristic information may include a telephone number with the specified number of digits;
A telephone number is a sequence of digits. Each number corresponds to a telephone line, and to call another party one dials that party's number. When telephone numbers first came into use they were short, only two or three digits, and could reach only nearby subscribers. As telephone systems developed and their coverage extended worldwide, telephone numbers grew longer accordingly. Besides telephones, a telephone number can also connect computers and fax machines. A telephone number is the number that the telephone administration assigns to a telephone set. It generally consists of 7 or 8 digits (mobile phone numbers have 11), and in earlier times some had 5 or 6.
In embodiments of the invention, step 202 may then comprise the following sub-steps:
Sub-step S21: perform word segmentation on the text information to obtain one or more text tokens;
Some commonly used segmentation methods are introduced below:
1. Segmentation based on string matching: the Chinese character string to be analyzed is matched, according to some strategy, against the entries of a preset machine dictionary. If a string is found in the dictionary, the match succeeds and a word has been identified.
2. Segmentation based on feature scanning or marker cutting: words with obvious features are identified and cut out of the string first. Using these words as breakpoints, the original string is divided into smaller strings that are then segmented mechanically, which reduces the matching error rate. Alternatively, segmentation is combined with part-of-speech tagging: rich part-of-speech information assists the segmentation decisions, and the tagging process in turn checks and adjusts the segmentation result, improving segmentation accuracy.
3. Segmentation based on understanding: the computer simulates human understanding of a sentence in order to recognize words. The basic idea is to perform syntactic and semantic analysis during segmentation and to use the syntactic and semantic information to resolve ambiguity. Such a system usually has three parts: a segmentation subsystem, a syntactic-semantic subsystem, and a master control part. Coordinated by the master control part, the segmentation subsystem obtains syntactic and semantic information about words and sentences to resolve segmentation ambiguity; that is, it simulates the way a person understands a sentence. This method requires a large amount of linguistic knowledge and information.
4. Segmentation based on statistics: in Chinese text, the frequency or probability with which characters occur next to each other is a good indication of how likely they are to form a word. The frequency of each combination of adjacent characters in a corpus can therefore be counted and their mutual information computed, for example from the adjacent co-occurrence probability of two Chinese characters X and Y. Mutual information reflects how tightly the characters are bound together. When it exceeds a certain threshold, the character group can be considered to form a word. This method only needs frequency statistics over character groups in the corpus and does not need a segmentation dictionary.
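As an illustration of the first method, string matching, here is a minimal sketch of forward maximum matching, one common matching strategy. The text does not say which strategy is used, so the choice of forward maximum matching, the tiny dictionary, the `max_len` limit, and the rule that a run of digits stays together as one token are all assumptions made for the example.

```python
def forward_max_match(text: str, dictionary: set, max_len: int = 4) -> list:
    """Segment text by repeatedly taking the longest dictionary word at the cursor."""
    tokens = []
    i = 0
    while i < len(text):
        # Assumption: consecutive digits form a single token (e.g. a phone number).
        if text[i].isdigit():
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append(text[i:j])
            i = j
            continue
        # Try the longest candidate first and shrink until a word is found;
        # a single character is always accepted as a fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical dictionary: "联系" (contact) and "电话" (telephone).
words = {"联系", "电话"}
print(forward_max_match("联系电话2223256", words))
```

With this dictionary the example string splits into "联系", "电话", and the digit run "2223256", which is the form the later telephone number check expects.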
Sub-step S22: when a text token matches a preset communication identifier, judge whether the first target text token is a digit sequence with the specified number of digits; if so, perform sub-step S23;
A communication identifier is information that marks a telephone number, for example "please contact", "phone", "mobile phone", "Tel", or "Mobile".
The first target text token can be the text token that follows the text token matching the communication identifier. For example, if the text token "phone" matches a communication identifier, the text token after "phone" can be the first target text token.
Sub-step S23: judge that the first target text token is a telephone number with the specified number of digits.
In embodiments of the invention, when the first target text token is a digit sequence with the specified number of digits, that digit sequence can be judged to be a telephone number with the specified number of digits.
For example, if the first target text token after the text token "phone" is "2223256", then because "2223256" is a 7-digit sequence, "2223256" can be judged to be a 7-digit telephone number.
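Sub-steps S21 to S23 can be sketched as below. The regex tokenizer is only a stand-in for real word segmentation, and the identifier set is a lowercased subset of the examples in the text; both are assumptions made for the illustration.

```python
import re

COMM_IDENTIFIERS = {"phone", "mobile", "tel"}   # subset of the examples, lowercased
SPECIFIED_LENGTHS = (7, 8)                      # specified number of digits


def tokenize(text: str) -> list:
    """Stand-in for word segmentation: keep runs of digits and runs of letters."""
    return re.findall(r"\d+|[^\W\d_]+", text)


def find_phone_numbers(text: str) -> list:
    """Return digit tokens that directly follow a communication identifier
    and have a specified number of digits."""
    tokens = tokenize(text)
    numbers = []
    for i, token in enumerate(tokens[:-1]):
        if token.lower() in COMM_IDENTIFIERS:
            target = tokens[i + 1]              # the first target text token
            if target.isdigit() and len(target) in SPECIFIED_LENGTHS:
                numbers.append(target)
    return numbers


print(find_phone_numbers("Tel: 2223256"))
```

Requiring an identifier before the digits is what separates a telephone number from other 7-digit or 8-digit strings such as order numbers.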
In one embodiment of the invention, the communication characteristic information may also include an area code. Area codes are the telephone zone numbers assigned to administrative regions and are used mainly for domestic and international long-distance calls. For example, the international code for mainland China is 86 and the area code for Chengdu is 28. For domestic long-distance calls, a 0 is dialed before the area code.
In embodiments of the invention, step 202 may then further comprise the following sub-steps:
Sub-step S24: judge whether the second target text token contains an area code identifier; if so, perform sub-step S25;
An area code identifier is information that marks a telephone area code. For example, the "()" in "(010) 2223256" is an area code identifier, as is the "-" in "010-2223256".
The second target text token can be the text token that follows the text token matching the communication identifier. For example, if the text token "phone" matches a communication identifier, the text token after "phone" can be the second target text token.
Sub-step S25: judge that the text token corresponding to the target text token is an area code;
In embodiments of the invention, when the target text token contains an area code identifier, the text token corresponding to that target text token can be judged to be an area code.
In an optional example of the embodiment of the invention, sub-step S25 may comprise the following sub-steps:
Sub-step S251: judge that the text token contained in the target text token is the area code;
For example, the "()" in "(010) 2223256" is an area code identifier, so the text token "010" can be the area code.
Or,
Sub-step S252: judge that the text token preceding the target text token is the area code.
For example, the "-" in "010-2223256" is an area code identifier, so the text token "010" can be the area code.
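The two area code forms in the examples, parentheses and hyphen, can be recognized with a short sketch. It works on the raw string with regular expressions rather than on segmented tokens, and it assumes an area code of 2 to 4 digits; both are simplifications made for the illustration, not details given in the text.

```python
import re
from typing import Optional, Tuple

# "(010) 2223256": the area code is enclosed by the identifier "()".
PAREN_FORM = re.compile(r"\((\d{2,4})\)\s*(\d{7,8})(?!\d)")
# "010-2223256": the area code precedes the identifier "-".
HYPHEN_FORM = re.compile(r"(?<!\d)(\d{2,4})-(\d{7,8})(?!\d)")


def extract_area_code(text: str) -> Optional[Tuple[str, str]]:
    """Return (area_code, phone_number) if an area code identifier is found, else None."""
    for pattern in (PAREN_FORM, HYPHEN_FORM):
        match = pattern.search(text)
        if match:
            return match.group(1), match.group(2)
    return None


print(extract_area_code("Phone: (010) 2223256"))
print(extract_area_code("Phone: 010-2223256"))
```

A token-based version would apply the same two checks to the token that follows a communication identifier, as sub-step S24 describes.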
Step 204: extract the communication characteristic information;
In embodiments of the invention, if the text information in a web page is judged to contain communication characteristic information, that information, for example a telephone number with the specified number of digits or an area code, can be extracted in order to build the document index.
Step 205: build the document index using the communication characteristic information and the web page.
In a specific implementation, the document index may include an inverted index, a forward index, and so on, and may consist of two parts: an index table and a main file.
The index table is a table that indicates the correspondence between logical records and physical records. Each item in the index table is called an index entry, and index entries are arranged in order of key (or logical record number).
In one embodiment of the invention, step 205 may comprise the following sub-steps:
Sub-step S31: record the position at which the communication characteristic information appears in the web page;
Sub-step S32: record the communication characteristic information and that position in the document index.
In embodiments of the invention, the recorded position can be written into the inverted index so that it can be displayed as web page summary information in a search result item.
In a lot of sight, user search telephone number, needs the information obtained to be determine that this is the ownership of telephone number and this telephone number, as company, shop etc. mostly.If the summary info of user in search result items gets the information of telephone number and ownership thereof, often carry out detailed inquiry without the need to clicking this search result items.
In the embodiment of the present invention, the position of communication characteristic information and appearance is recorded in document index, the information of telephone number and ownership thereof can be got by web page digest information in search result items, reduce the frequency that user clicks Search Results, reduce web page server, the resource of current electronic device and the consumption of bandwidth.
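As a rough illustration of sub-steps S31 and S32, the sketch below records each 7- or 8-digit number together with the page and character offset where it occurs, then cuts a summary around that offset. The data layout (a dict from URL to page text), the names `index_phone_numbers` and `snippet`, the example URL, and the use of character offsets as "positions" are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed rule: exactly 7 or 8 digits, not part of a longer run of digits.
PHONE = re.compile(r"(?<!\d)\d{7,8}(?!\d)")


def index_phone_numbers(pages):
    """pages: dict of URL -> text. Returns number -> list of (URL, offset)."""
    index = defaultdict(list)
    for url, text in pages.items():
        for match in PHONE.finditer(text):
            index[match.group()].append((url, match.start()))
    return dict(index)


def snippet(text, offset, radius=20):
    """Cut the text around a recorded position for use as a result summary."""
    return text[max(0, offset - radius): offset + radius].strip()


pages = {"http://example.com/shop": "Lucky Noodle Shop, phone 2223256, open daily"}
phone_index = index_phone_numbers(pages)
url, offset = phone_index["2223256"][0]
print(snippet(pages[url], offset))
```

Because the position is stored with the number, the summary can show the surrounding text (here the shop name) without re-scanning the whole page at query time.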
The inverted index arises from the practical need to look up records by attribute value. Each entry in such an index table contains an attribute value and the addresses of all records that have this attribute value. Because the position of a record is determined from the attribute value, rather than the attribute value being determined from the record, it is called an inverted index. A file with an inverted index is called an inverted index file, or inverted file for short.
In an inverted file (inverted index), the indexed objects are the words in a document or a collection of documents (such as webpages). It stores the positions of these words within a document or a group of documents, and is a commonly used indexing mechanism for documents or document collections.
In a specific implementation, the position of occurrence of the communication characteristic information can include the webpage in which it occurs, or that webpage together with the position within it.
Taking English as an example, the following is the text information in the webpages to be indexed:
T1="it is what it is";
T2="what is it";
T3="it is a banana";
The following is the inverted index:
"a":{(2,2)}
"banana":{(2,3)}
"is":{(0,1),(0,4),(1,1),(2,1)}
"it":{(0,0),(0,3),(1,2),(2,0)}
"what":{(0,2),(1,0)}
Here, "banana":{(2,3)} means that "banana" appears in the text information of the third webpage (T3), where it is the fourth word (address 3).
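The inverted index in the example above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `build_inverted_index` is hypothetical, and splitting on whitespace stands in for real word segmentation.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to a list of (document number, word position) pairs."""
    index = defaultdict(list)
    for doc_id, text in enumerate(documents):
        for position, word in enumerate(text.split()):
            index[word].append((doc_id, position))
    return dict(index)


pages = ["it is what it is", "what is it", "it is a banana"]  # T1, T2, T3
index = build_inverted_index(pages)
print(index["banana"])  # [(2, 3)]
print(index["what"])    # [(0, 2), (1, 0)]
```

Both the document numbers and the word positions are counted from 0, which is why T3 appears as document 2 and the fourth word has address 3.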
Conventional page analysis does not recognize special items (such as telephone numbers and area codes). It builds the document index mainly from key parts such as the title or the keywords supplied by the webmaster, so much of the content may be left out of the document index, and when a user wants to look up a telephone number the needed results are not returned.
In addition, large commercial institutions such as banks and online shopping malls generally set up 5-digit telephone numbers or telephone numbers beginning with 400, and the homepages of these institutions, where such numbers are displayed, are usually promoted to the top of the search results through paid bidding.
By contrast, 7- or 8-digit telephone numbers generally belong to small organizations such as small companies and small shops, which usually cannot afford the fees required for bidding. Such numbers are generally given lower importance than titles, URLs, and so on, so they are usually buried deep in the search results or cannot be found at all.
In the embodiments of the present invention, when the text information in a webpage contains communication characteristic information, the document index is established from the communication characteristic information and the webpage. Marking communication characteristic information in this way amounts, figuratively, to building a telephone directory over a large area (such as the whole country). When other users later search for a telephone number, webpages whose communication characteristic information matches that number are displayed preferentially. This improves search accuracy, reduces the need to page through the search results or re-enter search keywords, simplifies operation, lowers the resource consumption of the search engine and the local system, reduces bandwidth consumption, and improves search efficiency.
Referring to Fig. 3, a flowchart of the steps of embodiment 1 of a search method according to an embodiment of the present invention is shown, which can specifically include the following steps:
Step 301: receive a search keyword from a user;
In a specific implementation, the user can access the search engine from any electronic device, such as a mobile phone, a PDA (Personal Digital Assistant), a laptop computer, or a palmtop computer; the embodiments of the present invention do not limit this.
These electronic devices can support operating systems including Android, iOS, Windows Phone, or Windows, and can usually run a browser, or an application with a built-in mini browser, that accesses webpages over the Internet.
In an optional example of the embodiment of the present invention, the user can open the webpage of the search engine in the browser or in the application with a built-in mini browser. This webpage usually contains a search box, in which the user can enter a search keyword.
In another optional example of the embodiment of the present invention, a search plug-in (a plug-in that adds a search function to the browser, or to the application with a built-in mini browser, by interacting with the search engine) can be installed in the browser or in that application. The search plug-in can provide a search box, in which the user can enter a search keyword.
When the user enters a search keyword, the browser or the application with a built-in mini browser can assemble it into a search request and send the request to the search engine, asking the search engine to search for information related to the keyword.
In practice, the search request can be an HTTP (Hypertext Transfer Protocol) request. The content of the search request can include the identifier of the webpage that the user asks to load and/or features of the webpage. The webpage identifier can be information that uniquely identifies a webpage, such as a Uniform Resource Identifier (URI); a URI can in turn be a Uniform Resource Locator (URL), a Uniform Resource Name (URN), and so on.
The browser or the application with a built-in mini browser can use DNS (Domain Name System) to resolve the domain name in the webpage URL to the IP (Internet Protocol) address it maps to. After the IP address is obtained, the browser or application can request a connection to the search engine at that IP address. Once the connection is established, it can send the search request, together with its request header information, to that search engine over the HTTP protocol.
On receiving the search request, the search engine can extract the search keyword from it and then quickly retrieve search results from the document index according to the keyword. The search results can include one or more search result items.
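The request round trip described above (the client assembles the keyword into a request, the search engine extracts it again) might look like the sketch below. The endpoint `http://search.example.com/s` and the query parameter name `q` are hypothetical, and no network connection or DNS lookup is actually performed here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

SEARCH_ENDPOINT = "http://search.example.com/s"  # hypothetical address


def build_search_url(keyword):
    """Client side: assemble the search keyword into a request URL."""
    return SEARCH_ENDPOINT + "?" + urlencode({"q": keyword})


def extract_keyword(url):
    """Server side: pull the search keyword back out of the request URL."""
    query = parse_qs(urlsplit(url).query)
    return query.get("q", [""])[0]


url = build_search_url("phone 2223256")
print(url)                   # http://search.example.com/s?q=phone+2223256
print(extract_keyword(url))  # phone 2223256
```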
Step 302: identify one or more pieces of search information in the search keyword;
In an embodiment of the present invention, the one or more pieces of search information in the search keyword can be identified by means such as word segmentation.
For example, if the search keyword is "2223256", it contains one piece of search information, "2223256"; if the search keyword is "phone 2223256", it contains the search information "phone" and "2223256".
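A very rough stand-in for this word segmentation step is sketched below. It simply separates runs of digits from runs of other non-space characters; the function name `split_search_information` is hypothetical, and a real search engine would use a proper segmenter, especially for Chinese text.

```python
import re
from typing import List


def split_search_information(keyword: str) -> List[str]:
    """Split a search keyword into digit runs and non-digit runs."""
    return re.findall(r"\d+|[^\d\s]+", keyword)


print(split_search_information("2223256"))        # ['2223256']
print(split_search_information("phone 2223256"))  # ['phone', '2223256']
```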
Step 303: when the search information includes a digit sequence with the specified number of digits, raise the weight of the search result items in the search results whose communication characteristic information matches that digit sequence.
In practice, if a user searches for a digit sequence with the specified number of digits (such as 7 or 8), the purpose of the query may be to look up a telephone number.
When the communication characteristic information (such as a telephone number) in a search result item matches the digit sequence with the specified number of digits (such as 7 or 8), the weight of that search result item can be raised to move it up in the display order of the search results.
For example, as shown in Fig. 4, if the user searches for "2223256", the search result items containing the telephone number "2223256" can be promoted to the first few positions of the search results, making quick lookup easier for the user.
In the embodiments of the present invention, one or more pieces of search information are identified in the received search keyword. When the search information includes a digit sequence with the specified number of digits, the weight of the search result items whose communication characteristic information matches that digit sequence is raised, so that webpages whose communication characteristic information matches the telephone number are displayed preferentially. This improves search accuracy, reduces the need to page through the search results or re-enter search keywords, simplifies operation, lowers the resource consumption of the search engine and the local system, reduces bandwidth consumption, and improves search efficiency.
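Step 303 could be sketched as follows. The shape of a search result item (a dict with `weight` and `phone_numbers` keys), the function name `boost_phone_matches`, and the size of the boost are assumptions; the source does not say how much the weight is raised.

```python
import re

SPECIFIED_DIGITS = re.compile(r"\d{7,8}")  # specified number of digits: 7 or 8


def boost_phone_matches(search_information, items, boost=10.0):
    """Raise the weight of items whose indexed phone numbers match the query.

    items: list of dicts with 'weight' and 'phone_numbers' (a set) keys.
    """
    numbers = {s for s in search_information if SPECIFIED_DIGITS.fullmatch(s)}
    for item in items:
        if numbers & item["phone_numbers"]:
            item["weight"] += boost
    return items


items = [
    {"url": "a", "weight": 5.0, "phone_numbers": set()},
    {"url": "b", "weight": 1.0, "phone_numbers": {"2223256"}},
]
boost_phone_matches(["2223256"], items)
print([(item["url"], item["weight"]) for item in items])  # [('a', 5.0), ('b', 11.0)]
```

Search information that is not a 7- or 8-digit sequence, such as "phone" or "22232", leaves every weight unchanged in this sketch.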
Referring to Fig. 5, a flowchart of the steps of embodiment 2 of a search method according to an embodiment of the present invention is shown, which can specifically include the following steps:
Step 501: establish a document index;
In an embodiment of the present invention, step 501 can include the following sub-steps:
Sub-step S41: extract the text information in a webpage;
In an optional example of the embodiment of the present invention, the webpage can include at least one of the following regions: page title, header, footer, body content, functional area, and navigation area. In this example, sub-step S41 can include the following sub-step:
Sub-step S411: extract the text information in at least one of the page title, header, footer, body content, functional area, and navigation area of the webpage.
Sub-step S42: determine whether the text information contains communication characteristic information; if so, perform sub-step S43;
In an embodiment of the present invention, the communication characteristic information can include a telephone number with a specified number of digits, and the specified number of digits can be 7 or 8. In this embodiment, sub-step S42 can include the following sub-steps:
Sub-step S421: perform word segmentation on the text information to obtain one or more text tokens;
Sub-step S422: when a text token matches a preset communication identifier, determine whether the first target text token is a digit sequence with the specified number of digits; if so, perform sub-step S423;
The first target text token can be the text token that follows the text token matching the communication identifier;
Sub-step S423: determine that the first target text token is a telephone number with the specified number of digits.
In an embodiment of the present invention, the communication characteristic information can also include an area code. In this embodiment, sub-step S42 can also include the following sub-steps:
Sub-step S424: determine whether the second target text token contains an area code identifier; if so, perform sub-step S425;
The second target text token can be the text token that follows the text token matching the communication identifier;
Sub-step S425: determine that the text token corresponding to the target text token is an area code.
In an optional example of the embodiment of the present invention, sub-step S425 can include the following sub-steps:
Sub-step S4251: determine that the text token contained in the target text token is an area code;
Or,
Sub-step S4252: determine that the text token before the target text token is an area code.
Sub-step S43: extract the communication characteristic information;
Sub-step S44: establish the document index from the communication characteristic information and the webpage.
In an embodiment of the present invention, sub-step S44 can include the following sub-steps:
Sub-step S441: record the position at which the communication characteristic information occurs in the webpage;
Sub-step S442: record the communication characteristic information and the position of occurrence in the document index.
In the embodiments of the present invention, step 501 is basically similar to the application of method embodiment 1, so it is described only briefly; for related details, see the corresponding description of method embodiment 1, which is not repeated here.
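Sub-steps S421 to S423 can be sketched as below: segment the text, look for a token that matches a preset communication identifier, and accept the token that follows it if it is a 7- or 8-digit sequence. The identifier list, the crude regex-based tokenizer, and the function names are assumptions of this sketch; the source only gives "phone" as an example identifier.

```python
import re

# Assumed preset communication identifiers.
COMMUNICATION_IDENTIFIERS = {"phone", "tel", "telephone", "电话"}
SPECIFIED_DIGITS = re.compile(r"\d{7,8}")  # specified number of digits: 7 or 8


def tokenize(text):
    """Rough stand-in for word segmentation: digit runs and letter runs."""
    return re.findall(r"\d+|[^\W\d_]+", text.lower())


def extract_phone_numbers(text):
    """Return (token position, number) for each number that follows an identifier."""
    tokens = tokenize(text)                             # sub-step S421
    found = []
    for i, token in enumerate(tokens[:-1]):
        if token in COMMUNICATION_IDENTIFIERS:          # identifier matched
            candidate = tokens[i + 1]                   # first target text token
            if SPECIFIED_DIGITS.fullmatch(candidate):   # sub-step S422
                found.append((i + 1, candidate))        # sub-step S423
    return found


print(extract_phone_numbers("Phone: 2223256"))  # [(1, '2223256')]
print(extract_phone_numbers("phone 12345"))     # []
```

Requiring an identifier before the digits keeps ordinary 7- or 8-digit numbers, such as order or ticket numbers, out of the index.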
Step 502: receive a search keyword from a user;
Step 503: identify one or more pieces of search information in the search keyword;
Step 504: when the search information includes a digit sequence with the specified number of digits, raise the weight of the search result items in the search results whose communication characteristic information matches that digit sequence.
Step 505: when the search information includes a communication identifier, raise the weight of the search result items whose communication characteristic information matches the communication identifier;
In a specific implementation, if the user searches for a communication identifier that matches a communication feature word, the purpose of the query may be to look up a telephone number, so the weight of the search result items containing that communication feature word can be raised to further move them up in the display order of the search results.
In the embodiments of the present invention, when the search information includes a communication identifier, the weight of the search result items whose communication characteristic information matches the communication identifier is raised, which further improves search accuracy.
Step 506: obtain the area code of the current location;
In an embodiment of the present invention, the user's current location can be obtained, and the area code corresponding to that location can then be looked up.
If the user submits the search keyword from a mobile device such as a mobile phone, the current latitude and longitude can be determined, and the location of that latitude and longitude can be identified by means such as reverse geocoding.
If the user submits the search keyword from a fixed device such as a computer, the current IP address (Internet Protocol address) can be looked up, and the location of that IP address can then be identified.
Step 507: when the area code matches the communication characteristic information, raise the weight of the search results that have this communication characteristic information.
In a specific implementation, if the area code of the user's current location matches a communication feature word (such as an area code), the weight of the search result items containing that communication feature word can be raised to further move them up in the display order of the search results.
In the embodiments of the present invention, when the area code of the current location matches the communication characteristic information, the weight of the search results that have this communication characteristic information is raised, which further improves search accuracy.
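Steps 506 and 507 can be sketched as below. The lookup table from city to area code stands in for the reverse geocoding or IP lookup described above; the table, the item shape, the function name `boost_local_results`, and the boost value are all assumptions.

```python
# Hypothetical lookup table; a real system would resolve the location from
# latitude/longitude (reverse geocoding) or from the client's IP address.
AREA_CODE_BY_CITY = {"Beijing": "010", "Shanghai": "021", "Guangzhou": "020"}


def boost_local_results(city, items, boost=5.0):
    """items: list of dicts with 'weight' and 'area_codes' (a set) keys."""
    area_code = AREA_CODE_BY_CITY.get(city)      # step 506
    if area_code is None:
        return items                             # unknown location: no change
    for item in items:
        if area_code in item["area_codes"]:      # step 507
            item["weight"] += boost
    return items


items = [
    {"url": "a", "weight": 1.0, "area_codes": {"010"}},
    {"url": "b", "weight": 1.0, "area_codes": {"021"}},
]
boost_local_results("Beijing", items)
print([(item["url"], item["weight"]) for item in items])  # [('a', 6.0), ('b', 1.0)]
```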
In practice, a search result item can include webpage summary information, and the webpage summary information can include the webpage content at the position where the communication characteristic information (such as a telephone number or an area code) occurs in the webpage.
For example, as shown in Fig. 6, if the user searches for "phone 2223256", the search result items containing "phone" (the communication identifier) and "2223256" (the digit sequence with the specified number of digits) can be promoted to the first few positions of the search results, making quick lookup easier for the user.
In the embodiments of the present invention, the webpage summary information is built from the communication characteristic information and its position of occurrence, so the telephone number and its owner can be obtained from the summary information in a search result item. This reduces how often users click search results and reduces the consumption of resources and bandwidth on the web server and on the current electronic device.
Step 508: sort the one or more search result items according to the weight;
In the embodiments of the present invention, the one or more search result items can be sorted by weight: the higher the weight of a search result item, the earlier it is ranked, and the lower the weight, the later it is ranked.
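The sorting in step 508 can be expressed in one line of Python. The item shape (a dict with a `weight` key) and the function name `rank` are assumptions of this sketch.

```python
def rank(items):
    """Order search result items from highest to lowest weight.

    sorted() is stable, so items with equal weight keep their original order.
    """
    return sorted(items, key=lambda item: item["weight"], reverse=True)


results = [
    {"url": "a", "weight": 1.0},
    {"url": "b", "weight": 11.0},
    {"url": "c", "weight": 1.0},
]
print([item["url"] for item in rank(results)])  # ['b', 'a', 'c']
```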
Step 509: return the sorted search results to the client for display.
Under the HTTP protocol, the browser or the application with a built-in mini browser can receive a document of the HTML (Hypertext Markup Language) type from the server where the search engine is located.
The browser or the application with a built-in mini browser can parse the HTML document and generate a tree of objects, namely the DOM (Document Object Model). Each object is a node in the DOM, and these objects can represent webpage resources such as text and pictures. The browser or application can start displaying the HTML document and obtain the addresses of the webpage resources embedded in it; it then sends further requests to the server to obtain these resources and displays them in the HTML document of the search results.
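The step of finding the addresses of embedded webpage resources can be illustrated with Python's standard `html.parser` module. This sketch only scans start tags and does not build a full DOM tree as a browser would; the class name `ResourceCollector` and the choice of tags to inspect are assumptions.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collect the addresses of resources embedded in an HTML document."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("img", "script") and attributes.get("src"):
            self.resources.append(attributes["src"])
        elif tag == "link" and attributes.get("href"):
            self.resources.append(attributes["href"])


html = ('<html><head><link rel="stylesheet" href="/s.css"></head>'
        '<body><p>phone 2223256</p><img src="/logo.png"></body></html>')
collector = ResourceCollector()
collector.feed(html)
print(collector.resources)  # ['/s.css', '/logo.png']
```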
For simplicity of description, the method embodiments are all expressed as a series of combined actions. However, those skilled in the art should understand that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention some steps can be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Referring to Fig. 7, a structural block diagram of an embodiment of a device for establishing a document index according to an embodiment of the present invention is shown, which can specifically include the following modules:
First extraction module 701, adapted to extract the text information in a webpage;
Judging module 702, adapted to determine whether the text information contains communication characteristic information, and if so, to call the second extraction module 703;
Second extraction module 703, adapted to extract the communication characteristic information;
Establishing module 704, adapted to establish the document index from the communication characteristic information and the webpage.
In an optional example of the embodiment of the present invention, the webpage can include at least one of the following regions: page title, header, footer, body content, functional area, and navigation area;
The first extraction module 701 can also be adapted to:
Extract the text information in at least one of the page title, header, footer, body content, functional area, and navigation area of the webpage.
In an embodiment of the present invention, the communication characteristic information can include a telephone number with a specified number of digits; the judging module 702 can also be adapted to:
Perform word segmentation on the text information to obtain one or more text tokens;
When a text token matches a preset communication identifier, determine whether the first target text token is a digit sequence with the specified number of digits, where the first target text token is the text token that follows the text token matching the communication identifier;
If so, determine that the first target text token is a telephone number with the specified number of digits.
In an embodiment of the present invention, the communication characteristic information can also include an area code; the judging module 702 can also be adapted to:
Determine whether the second target text token contains an area code identifier; if so, determine that the text token corresponding to the target text token is an area code, where the second target text token is the text token that follows the text token matching the communication identifier.
In an optional example of the embodiment of the present invention, the judging module 702 can also be adapted to:
Determine that the text token contained in the target text token is an area code;
Or,
Determine that the text token before the target text token is an area code.
In an optional example of the embodiment of the present invention, the specified number of digits can be 7 or 8.
In an embodiment of the present invention, the establishing module 704 can also be adapted to:
Record the position at which the communication characteristic information occurs in the webpage;
Record the communication characteristic information and the position of occurrence in the document index.
Referring to Fig. 8, a structural block diagram of an embodiment of a search device according to an embodiment of the present invention is shown, which can specifically include the following modules:
Receiving module 801, adapted to receive a search keyword from a user;
Identification module 802, adapted to identify one or more pieces of search information in the search keyword;
First raising module 803, adapted to, when the search information includes a digit sequence with the specified number of digits, raise the weight of the search result items in the search results whose communication characteristic information matches that digit sequence.
In an embodiment of the present invention, the device can also include the following module:
Second raising module, adapted to, when the search information includes a communication identifier, raise the weight of the search result items whose communication characteristic information matches the communication identifier.
In an embodiment of the present invention, the device can also include the following modules:
Acquisition module, adapted to obtain the area code of the current location;
Third raising module, adapted to, when the area code matches the communication characteristic information, raise the weight of the search results that have this communication characteristic information.
In an embodiment of the present invention, the device can also include the following modules:
Sorting module, adapted to sort the one or more search result items according to the weight;
Returning module, adapted to return the sorted search results to the client for display.
In an embodiment of the present invention, a search result item can include webpage summary information, and the webpage summary information can include the webpage content at the position where the communication characteristic information occurs in the webpage.
In an embodiment of the present invention, the device can also include the following module:
Document index establishing module, adapted to establish a document index.
In an embodiment of the present invention, the document index establishing module can also be adapted to:
Extract the text information in a webpage;
Determine whether the text information contains communication characteristic information; if so, extract the communication characteristic information;
Establish the document index from the communication characteristic information and the webpage.
In an optional example of the embodiment of the present invention, the webpage can include at least one of the following regions: page title, banner, header, footer, navigation, and body content;
The document index establishing module can also be adapted to:
Extract the text information in at least one of the page title, header, footer, body content, functional area, and navigation area of the webpage.
In an embodiment of the present invention, the communication characteristic information can include a telephone number with a specified number of digits; the document index establishing module can also be adapted to:
Perform word segmentation on the text information to obtain one or more text tokens;
When a text token matches a preset communication identifier, determine whether the first target text token is a digit sequence with the specified number of digits, where the first target text token is the text token that follows the text token matching the communication identifier;
If so, determine that the first target text token is a telephone number with the specified number of digits.
In an embodiment of the present invention, the communication characteristic information can also include an area code; the document index establishing module can also be adapted to:
Determine whether the second target text token contains an area code identifier; if so, determine that the text token corresponding to the target text token is an area code, where the second target text token is the text token that follows the text token matching the communication identifier.
In an optional example of the embodiment of the present invention, the document index establishing module can also be adapted to:
Determine that the text token contained in the target text token is an area code;
Or,
Determine that the text token before the target text token is an area code.
In an optional example of the embodiment of the present invention, the specified number of digits can be 7 or 8.
In an embodiment of the present invention, the document index establishing module can also be adapted to:
Record the position at which the communication characteristic information occurs in the webpage;
Record the communication characteristic information and the position of occurrence in the document index.
Because the device embodiments are basically similar to the method embodiments, they are described only briefly; for related details, see the corresponding description of the method embodiments.
Intrinsic not relevant to any certain computer, virtual system or miscellaneous equipment with display at this algorithm provided.Various general-purpose system also can with use based on together with this teaching.According to description above, the structure constructed required by this type systematic is apparent.In addition, the present invention is not also for any certain programmed language.It should be understood that and various programming language can be utilized to realize content of the present invention described here, and the description done language-specific is above to disclose preferred forms of the present invention.
In instructions provided herein, describe a large amount of detail.But can understand, embodiments of the invention can be put into practice when not having these details.In some instances, be not shown specifically known method, structure and technology, so that not fuzzy understanding of this description.
Similarly, be to be understood that, in order to simplify the disclosure and to help to understand in each inventive aspect one or more, in the description above to exemplary embodiment of the present invention, each feature of the present invention is grouped together in single embodiment, figure or the description to it sometimes.But, the method for the disclosure should be construed to the following intention of reflection: namely the present invention for required protection requires feature more more than the feature clearly recorded in each claim.Or rather, as claims below reflect, all features of disclosed single embodiment before inventive aspect is to be less than.Therefore, the claims following embodiment are incorporated to this embodiment thus clearly, and wherein each claim itself is as independent embodiment of the present invention.
Those skilled in the art are appreciated that and adaptively can change the module in the equipment in embodiment and they are arranged in one or more equipment different from this embodiment.Module in embodiment or unit or assembly can be combined into a module or unit or assembly, and multiple submodule or subelement or sub-component can be put them in addition.Except at least some in such feature and/or process or unit be mutually repel except, any combination can be adopted to combine all processes of all features disclosed in this instructions (comprising adjoint claim, summary and accompanying drawing) and so disclosed any method or equipment or unit.Unless expressly stated otherwise, each feature disclosed in this instructions (comprising adjoint claim, summary and accompanying drawing) can by providing identical, alternative features that is equivalent or similar object replaces.
In addition, those skilled in the art can understand, although embodiments more described herein to comprise in other embodiment some included feature instead of further feature, the combination of the feature of different embodiment means and to be within scope of the present invention and to form different embodiments.Such as, in the following claims, the one of any of embodiment required for protection can use with arbitrary array mode.
All parts embodiment of the present invention with hardware implementing, or can realize with the software module run on one or more processor, or realizes with their combination.It will be understood by those of skill in the art that the some or all functions that microprocessor or digital signal processor (DSP) can be used in practice to realize according to the some or all parts in the equipment of the search of the embodiment of the present invention.The present invention can also be embodied as part or all equipment for performing method as described herein or device program (such as, computer program and computer program).Realizing program of the present invention and can store on a computer-readable medium like this, or the form of one or more signal can be had.Such signal can be downloaded from internet website and obtain, or provides on carrier signal, or provides with any other form.
It should be noted that the above embodiments illustrate rather than limit the present invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order. These words may be interpreted as names.

Claims (10)

1. A search method, comprising:
Receiving a search keyword from a user;
Identifying one or more items of search information in the search keyword;
When the search information comprises a digit sequence of a specified number of digits, increasing the weight, in the search results, of search result items having communication characteristic information that matches the digit sequence of the specified number of digits.
2. The method as claimed in claim 1, characterized by further comprising:
When the search information comprises a communication identifier, increasing the weight of search result items having communication characteristic information that matches the communication identifier.
3. The method as claimed in any one of claims 1-2, characterized by further comprising:
Obtaining the area code of the current location;
When the area code matches the communication characteristic information, increasing the weight of search results having the communication characteristic information.
4. The method as claimed in claim 1, 2, or 3, characterized by further comprising:
Sorting the one or more search result items according to the weight;
Returning the sorted search results to the client for display.
5. The method as claimed in any one of claims 1-3, characterized in that the search result item comprises web page summary information, and the web page summary information comprises the web page information corresponding to the position at which the communication characteristic information appears in the web page.
6. The method as claimed in any one of claims 1-5, characterized in that the step of building a document index comprises:
Extracting text information from a web page;
Determining whether communication characteristic information is present in the text information; if so, extracting the communication characteristic information;
Building a document index using the communication characteristic information and the web page.
7. A search device, comprising:
A receiving module, adapted to receive a search keyword from a user;
An identification module, adapted to identify one or more items of search information in the search keyword;
A first boosting module, adapted to, when the search information comprises a digit sequence of a specified number of digits, increase the weight, in the search results, of search result items having communication characteristic information that matches the digit sequence of the specified number of digits.
8. The device as claimed in claim 7, characterized by further comprising:
A second boosting module, adapted to, when the search information comprises a communication identifier, increase the weight of search result items having communication characteristic information that matches the communication identifier.
9. The device as claimed in any one of claims 7-8, characterized by further comprising:
An acquisition module, adapted to obtain the area code of the current location;
A third boosting module, adapted to, when the area code matches the communication characteristic information, increase the weight of search results having the communication characteristic information.
10. The device as claimed in any one of claims 7-9, characterized by further comprising:
A sorting module, adapted to sort the one or more search result items according to the weight;
A returning module, adapted to return the sorted search results to the client for display.
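
The index-building and weighting steps recited in the claims can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything concrete in it is an assumption made for illustration: the 7 to 12 digit range standing in for the "specified number of digits", the word list standing in for "communication identifiers", the multiplier `BOOST`, the prefix test used for area-code matching, and the names `ResultItem`, `extract_comm_info`, `boosted_weight`, and `rank`.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: a "specified number of digits" means 7 to 12 digits.
DIGIT_SEQUENCE = re.compile(r"(?<!\d)\d{7,12}(?!\d)")
# Assumption: phone-like text is digits optionally separated by spaces or dashes.
PHONE_LIKE = re.compile(r"\d[\d -]{5,14}\d")
# Assumption: these words stand in for "communication identifiers".
COMMUNICATION_IDENTIFIERS = {"phone", "tel", "fax", "hotline"}
BOOST = 2.0  # hypothetical multiplier applied for each matching rule


@dataclass
class ResultItem:
    url: str
    weight: float  # base relevance weight before any boosting
    comm_info: List[str] = field(default_factory=list)  # numbers found at index time


def extract_comm_info(text: str) -> List[str]:
    """Index-time step: pull phone-like numbers out of page text."""
    found = []
    for match in PHONE_LIKE.finditer(text):
        digits = re.sub(r"\D", "", match.group())  # drop spaces and dashes
        if 7 <= len(digits) <= 12:
            found.append(digits)
    return found


def boosted_weight(item: ResultItem, numbers: List[str],
                   has_identifier: bool, area_code: Optional[str]) -> float:
    """Return the item's weight after applying the three boosting rules."""
    weight = item.weight
    # Rule 1: a digit sequence from the query matches the item's number.
    if any(n in info for n in numbers for info in item.comm_info):
        weight *= BOOST
    # Rule 2: the query names a communication identifier and the item has a number.
    if has_identifier and item.comm_info:
        weight *= BOOST
    # Rule 3: the local area code matches (here: is a prefix of) the item's number.
    if area_code and any(info.startswith(area_code) for info in item.comm_info):
        weight *= BOOST
    return weight


def rank(query: str, results: List[ResultItem],
         area_code: Optional[str] = None) -> List[ResultItem]:
    """Sort result items by boosted weight, highest first."""
    numbers = DIGIT_SEQUENCE.findall(query)
    has_identifier = any(tok in COMMUNICATION_IDENTIFIERS
                         for tok in query.lower().split())
    return sorted(
        results,
        key=lambda item: boosted_weight(item, numbers, has_identifier, area_code),
        reverse=True,
    )
```

For example, given `pages = [ResultItem("a", 1.0), ResultItem("b", 0.8, extract_comm_info("Tel: 010-8888 7777"))]`, the call `rank("88887777", pages)` places item `b` first, while a query with no digits, no identifier, and no area code leaves the base order unchanged. The sketch computes boosted weights without modifying the stored ones, so the same result list can be re-ranked for different queries.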
CN201410806935.9A 2014-12-22 2014-12-22 Search method and device Active CN104504070B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410806935.9A CN104504070B (en) 2014-12-22 2014-12-22 Search method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410806935.9A CN104504070B (en) 2014-12-22 2014-12-22 Search method and device

Publications (2)

Publication Number Publication Date
CN104504070A true CN104504070A (en) 2015-04-08
CN104504070B CN104504070B (en) 2019-06-04

Family

ID=52945468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410806935.9A Active CN104504070B (en) 2014-12-22 2014-12-22 Search method and device

Country Status (1)

Country Link
CN (1) CN104504070B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107016602A (en) * 2017-04-19 2017-08-04 国网冀北电力有限公司物资分公司 The management method and management system of tender bond
CN111914201A (en) * 2020-08-07 2020-11-10 腾讯科技(深圳)有限公司 Network page processing method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102110123A (en) * 2009-12-29 2011-06-29 中国人民解放军国防科学技术大学 Method for establishing inverted index
CN102368252A (en) * 2010-09-30 2012-03-07 微软公司 Applying search inquiry in content set
US20130110829A1 (en) * 2011-10-31 2013-05-02 Alibaba Group Holding Limited Method and Apparatus of Ranking Search Results, and Search Method and Apparatus
CN103970747A (en) * 2013-01-24 2014-08-06 爱帮聚信(北京)科技有限公司 Data processing method for network side computer to order search results

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107016602A (en) * 2017-04-19 2017-08-04 国网冀北电力有限公司物资分公司 The management method and management system of tender bond
CN107016602B (en) * 2017-04-19 2022-04-15 国网冀北电力有限公司物资分公司 Management method and management system for bid security
CN111914201A (en) * 2020-08-07 2020-11-10 腾讯科技(深圳)有限公司 Network page processing method and device
CN111914201B (en) * 2020-08-07 2023-11-07 腾讯科技(深圳)有限公司 Processing method and device of network page

Also Published As

Publication number Publication date
CN104504070B (en) 2019-06-04

Similar Documents

Publication Publication Date Title
CN101452453B (en) Method for web-side navigation in an input method, and input method system
CN101918945B (en) Automatic expanded language search
CN101211364B (en) Method and system for social bookmarking of resources exposed in web pages
CN1902627B (en) Systems and methods for direct navigation to specific portion of target document
CN109033358B (en) Method for associating news aggregation with intelligent entity
CN100442283C (en) Extraction method and system of structured data of internet based on sample & faced to regime
CN102982117B (en) Information search method and device
CN113822067A (en) Key information extraction method and device, computer equipment and storage medium
CN104102639B (en) Popularization triggering method based on text classification and device
CN100511230C (en) Webpage-text based image search and display method thereof
CN102306201B (en) Method and system for analyzing webpage title
CN110222251B (en) Service packaging method based on webpage segmentation and search algorithm
CN102760150A (en) Webpage extraction method based on attribute reproduction and labeled path
CN102982118A (en) Searching method and device based on favorites
CN101894109A (en) Database building method and device
CN103530389A (en) Method and device for improving stopword searching effectiveness
CN105159885A (en) Point-of-interest name identification method and device
CN104778232B (en) Searching result optimizing method and device based on long query
CN105808615A (en) Document index generation method and device based on word segment weights
CN103617225A (en) Associated webpage searching method and system
CN102486792A (en) Method and system for reorganizing and displaying universal forum page
CN103631906A (en) Method and device for recognizing page number identification in webpage URL
CN104504070A (en) Search method and device
CN104504069A (en) Building method and device for file index
CN105138708A (en) Method and device for identifying names of points of interest (POI)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220721

Address after: Room 801, 8th floor, No. 104, floors 1-19, building 2, yard 6, Jiuxianqiao Road, Chaoyang District, Beijing 100015

Patentee after: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Qizhi software (Beijing) Co.,Ltd.
