CN1677389A - Mobile internet intelligent information retrieval engine based on key-word retrieval - Google Patents

Info

Publication number
CN1677389A
Authority
CN
China
Prior art keywords
information
network element
search
expression formula
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200410026674
Other languages
Chinese (zh)
Other versions
CN100357942C (en)
Inventor
张光强
张炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yulong Computer Telecommunication Scientific Shenzhen Co Ltd
Original Assignee
Yulong Computer Telecommunication Scientific Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yulong Computer Telecommunication Scientific Shenzhen Co Ltd
Priority to CNB2004100266745A
Publication of CN1677389A
Application granted
Publication of CN100357942C
Anticipated expiration
Expired - Lifetime

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a keyword-based intelligent information search engine for the mobile Internet. Given specified column classifications and target websites, the machine automatically samples and analyzes the target websites to generate search rules, and collects from the target websites according to those rules. Through an information processing flow, the network elements collected from the target websites are organized into a specific full-text index structure and cached, forming a full-text index information base. A search task processing module processes the search commands sent by a mobile device; a device and channel recognition module and a mobile Internet access module determine how the mobile device accesses the Internet and identify the device and the channel it uses; finally, the processing result is returned to the mobile device. The invention fills a gap in this kind of service in the mobile field and meets the needs of mobile users for obtaining information on the move.

Description

A mobile Internet intelligent information search engine based on keyword search
Technical field
The present invention relates to a search engine, and in particular to an intelligent search engine that lets mobile Internet terminals search Internet information by keyword.
Background technology
With the rapid development of the Internet, the amount of information on the network is growing quickly, and people increasingly rely on the network to find the information they need. Search engines are the tools people use to find web pages and websites. At present, a browser on a PC gives reasonably good access to information, for example through search engines such as google, sohu and yahoo. Users of mobile terminals can also use search engines such as google through the WAP browser or HTTP browser built into the terminal. In use, the search engine takes the keywords in the input command, finds the URLs of pages related to them, and returns the links to the mobile terminal for the user to choose from. However, each returned result consists of a title, keywords and a website link, and the information elements inside the linked web page are not separated out. Such results are poorly suited to current mobile terminals, which have small screens, weak computing power and limited network bandwidth.
Information extraction technology originally aimed to find specific information in natural-language documents. It is a particularly useful subfield of natural language processing, and the sharp increase in online text has drawn considerable attention to research in this area. An information extraction system usually decides what to extract according to extraction rules or patterns. Many extraction methods exist for different practical applications; the technology described here is a mobile Internet intelligent information search engine with its own specific method of information extraction.
Summary of the invention
The object of the present invention is to provide a mobile Internet intelligent information search engine based on keyword search that, on a mobile terminal, performs fast keyword-based searches over Internet sites within a specified target range, according to specified information classification columns, and at the same time converts the information into a presentation form suited to the characteristics of the terminal and of mobile operation.
The invention is realized as follows. In a mobile Internet intelligent information search engine based on keyword search, column classifications and target websites are specified; the machine automatically samples and analyzes the target websites, generates search rules, and collects from the target websites according to those rules. Then, through an information processing flow, the network elements collected from the target websites are organized into a specific full-text index structure and cached, forming a full-text index information base. A search task processing module processes the search commands sent by a mobile device; a device and channel recognition module and a mobile Internet access module determine how the mobile device accesses the Internet and identify the device and the channel it uses; the processing result is returned to the mobile device.
The search rules are generated as follows. The system automatically analyzes the structure of the target website, collects the HTML pages that share a similar layout, and automatically generates a link-acquisition expression for the content pages of the target website. As required, it generates a content-matching expression that precisely locates the target network elements. The target network elements obtained through the content-matching expression, together with the mapping between target network elements and column classifications, form a network element map, from which a content-acquisition expression is generated. Together these form the search rules.
The information processing flow of the search engine is as follows. Driven by the search rules and combined with the column classifications, the HTTP protocol data collected from the target websites passes through web page decomposition, match filtering, information formatting, information encoding and intelligent sentence deduplication. A feature code table is then used to process the information for display by deleting the feature code words to be filtered. The final output is plain text with whitespace and markup removed, free of illegal characters and of any other non-text content.
The information in the full-text index information base consists of the text content of the target network elements collected under the search rules and then processed. An incremental full-text index is built for newly entered information, and the index is organized by time sequence and by column classification.
The channel and device recognition module uses the access channel and protocol headers of the communication to identify the device type of the mobile terminal and thereby obtain the configuration information for that device type. The search results are then processed to suit the characteristics of the particular mobile terminal and the operating habits of mobile users, and the resulting mobile protocol data is output to the user's mobile terminal to display the search results.
With the above technical scheme, column classifications and target websites are set interactively; the machine automatically samples and analyzes the target websites, generates search rules, and collects from the target websites according to those rules. Then, through an information processing flow, the network elements collected from the target websites are organized into a specific full-text index structure and cached. A search task processing module processes the search commands sent by a mobile device; by determining how the mobile device accesses the Internet, the device and the channel it uses are identified, and the processing result is returned to the mobile device after passing through the corresponding presentation layer. Given that mobile devices currently have relatively small screens, weak computing power and limited network bandwidth, the invention fills a gap in this kind of service in the mobile field and meets the needs of mobile users for obtaining information on the move.
Description of drawings
Fig. 1 is a system flowchart of the present invention.
Fig. 2 is a schematic diagram of search rule generation in the present invention.
Fig. 3 is a schematic diagram of the information processing flow of the present invention.
Fig. 4 is a schematic diagram of search task processing in the present invention.
Fig. 5 is a schematic diagram of customized search tasks in the present invention.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawings:
As shown in Fig. 1, column classifications 4 and target websites 1 are set interactively; the machine automatically analyzes 2 the target websites to form search rules 3, and the acquisition engine 5 collects from the target websites 1 according to these rules. Then, through an information processing flow 6, the network elements collected from the target websites 1 are organized into a specific full-text index structure and cached, forming the full-text index information base 7. A search task processing module 8 processes the search commands sent by a mobile device; the device and channel recognition module 9 and the mobile Internet access module 10 determine how the mobile device accesses the Internet and identify the device and the channel it uses; the processing result is returned to the mobile device.
As shown in Fig. 2, the system automatically analyzes the structure of the target website and collects the HTML pages that share a similar layout. It automatically generates the link-acquisition expression 3.1 for content pages and, based on a manual decision, generates the content-matching expression 3.2 that precisely locates the target network elements. The target network elements obtained through the content-matching expression, together with the mapping between target network elements and column classifications, form a network element map, from which the content-acquisition expression 3.3 is generated. These constitute the search rules.
In Fig. 2, after the system automatically performs target website structure analysis 3.11, target web page Tag syntactic structure analysis 3.12 and target web page content structure analysis 3.13, the web pages collected from each column and each directory of the target website are classified, based on Tag syntax, by identical layout and identical directory, and the link-acquisition expression 3.1 for the relevant content pages of the corresponding target website is generated automatically.
The positions of all target network elements on the target web page are determined from the similarities and differences in the web page Tag syntactic structure and in the web page content structure among the similarly laid-out pages of each target website directory, and the content-matching expression 3.2 of the target web page is generated.
According to the information type of each target network element, the content-matching expression 3.2 is used to determine the target network element corresponding to each information element to be parsed in the web page, together with the mapping between target network elements and column classifications 4. That is, a manual decision step is provided to fix the position of each target network element on the target web page and the column it belongs to. This forms a network element map 3.31, and the content-acquisition expression 3.3 for the target network elements is generated.
The above steps produce the complete search rules 3 of the search engine.
As shown in Figs. 1 and 3, driven by the search rules 3 and combined with the column classifications 4, the HTTP protocol data 5.1 collected from the target websites 1 passes through the information processing flow 6: web page decomposition 6.1, match filtering 6.2, information formatting 6.3, information encoding 6.4 and intelligent sentence deduplication 6.5. Combined with the feature code table 6.7, display processing 6.6 is applied to the information, deleting the feature code words to be filtered. The final output, the target network element 6.8, is plain text with whitespace and markup removed, free of illegal characters and of any other non-text content. After processing, this text makes up the full-text index information base 7; an incremental full-text index is built for newly entered information, and the index is organized by time sequence and column classification.
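The clean-up stage of this flow can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `FILTER_WORDS` list is a hypothetical stand-in for the feature code table, the class and function names are invented for illustration, and whitespace is collapsed to single spaces rather than removed outright so that non-Chinese text stays readable.

```python
import re
from html.parser import HTMLParser

# Hypothetical stand-in for the feature code table: strings to delete.
FILTER_WORDS = ["[Advertisement]", "Click here"]

class TextExtractor(HTMLParser):
    """Collects text nodes and skips the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)

def to_plain_text(html, filter_words=FILTER_WORDS):
    """Strip markup, delete filtered words, drop control characters, tidy spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    for word in filter_words:
        text = text.replace(word, "")
    # Keep printable characters and whitespace; anything else counts as illegal here.
    text = "".join(ch for ch in text if ch.isprintable() or ch.isspace())
    return re.sub(r"\s+", " ", text).strip()
```

For text that contains no word spacing, such as Chinese, replacing the final substitution with `re.sub(r"\s+", "", text)` would match the patent's description of removing spaces more literally.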
The intelligent sentence deduplication mentioned above is a method for eliminating repeated information at the sentence level. The specific steps are: a) the information is split into complete sentences at punctuation marks; b) feature codes are extracted from the information: each item yields N feature codes from its first N natural sentences, any remaining sentences are ignored, and missing codes are padded with zeros; c) the feature codes are sorted, inserted, looked up and compared, each new item's feature codes being compared with those of the m closest items; d) items whose difference falls within a set threshold are eliminated as duplicates.
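Steps a) to d) can be sketched in Python as follows. Several choices here are assumptions, since the patent does not specify them: CRC32 as the feature code of a sentence, N = 4, a tolerance of one differing code, and a sorted list of code sums searched with `bisect` as the structure used to find the m closest items. All names are illustrative.

```python
import bisect
import re
import zlib
from collections import deque

def feature_codes(text, n=4):
    """Steps a-b: split into sentences, hash the first n, pad with zeros."""
    sentences = [s.strip() for s in re.split(r"[。！？；.!?;]", text) if s.strip()]
    codes = [zlib.crc32(s.encode("utf-8")) for s in sentences[:n]]
    return codes + [0] * (n - len(codes))

def similar(a, b, max_diff=1):
    """Step d: duplicates differ in at most max_diff code positions.

    At least one shared non-zero code is required, so that zero padding
    alone never makes two short items look alike.
    """
    differing = sum(1 for x, y in zip(a, b) if x != y)
    shared = any(x == y and x != 0 for x, y in zip(a, b))
    return shared and differing <= max_diff

class DedupBuffer:
    """Step c: compare a new item with its m nearest neighbours by code sum
    and with the m most recently added items."""

    def __init__(self, m=3):
        self.m = m
        self._sums = []                  # code sums, kept sorted
        self._codes = []                 # code lists, parallel to _sums
        self._recent = deque(maxlen=m)   # the m most recent items

    def add_if_new(self, codes):
        """Store codes and return True, or return False if a duplicate exists."""
        total = sum(codes)
        pos = bisect.bisect_left(self._sums, total)
        nearby = self._codes[max(0, pos - self.m):pos + self.m]
        if any(similar(codes, other) for other in nearby + list(self._recent)):
            return False
        self._sums.insert(pos, total)
        self._codes.insert(pos, codes)
        self._recent.append(codes)
        return True
```

Exact repeats always have identical sums and are found among the nearest neighbours. Whether near-duplicates also land close together depends on how the feature code is computed, which is why this sketch additionally checks the most recent items.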
Based on the full-text index information base 7, and as shown in Fig. 4, the search task processing module 8 processes the task after receiving a search command from a mobile terminal. It first performs user command processing 8.1: it combines the search conditions and, according to the column and time range specified in the user command, obtains the corresponding result set from the full-text index information base 7. It then performs query result set processing 8.2, which packages the result set. Using the access channel and protocol headers of the communication, access channel recognition 9.1 and device recognition 9.2 are performed on the mobile terminal connected through mobile Internet access 10, and the information for the device is obtained. The search results are then processed to suit the characteristics of the particular mobile terminal and the operating habits of mobile users, and the resulting mobile protocol data is output to the user's mobile terminal to display the search results.
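User command processing 8.1 and result set processing 8.2 can be pictured with the short Python sketch below. It assumes a plain in-memory list of records and a linear scan in place of the full-text index information base; the `Record` fields, the `search` function and the `limit` parameter are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Record:
    column: str          # column code, e.g. "001001"
    published: datetime  # collection or publication time
    text: str            # cleaned plain text

def search(records, keywords, column_prefix, start, end, limit=10):
    """Combine keyword, column and time-range conditions; newest results first."""
    hits = [
        r for r in records
        if r.column.startswith(column_prefix)
        and start <= r.published <= end
        and all(k in r.text for k in keywords)
    ]
    hits.sort(key=lambda r: r.published, reverse=True)
    return hits[:limit]
```

Truncating to `limit` is one simple way to package a result set for a small screen; the patent itself does not state how many results are returned.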
As shown in Fig. 5, the search task processing module 8 may also include a timer 8.3 and a customizer 8.4, which periodically check the search tasks customized by mobile terminal users. A customized search task generally consists of a keyword combination and a column. The system determines whether the information index base contains new information that satisfies the user's subscription conditions. If so, the information is automatically pushed to the mobile terminal; if not, the system continues to wait for the timer to trigger the next round of processing.
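One round of this timer-driven check might look like the following Python sketch. The `Subscription` fields, the `(timestamp, column, text)` item layout and the `push` callback are assumptions made for illustration; a real system would call `run_timer_tick` from whatever scheduler plays the role of timer 8.3.

```python
from dataclasses import dataclass

@dataclass
class Subscription:
    user: str
    keywords: list
    column_prefix: str
    last_seen: float = 0.0  # timestamp of the newest item already pushed

def run_timer_tick(subscriptions, items, push):
    """Push every item newer than last_seen that matches a subscription.

    items: iterable of (timestamp, column, text) tuples.
    push:  callable(user, text) that delivers one item to a terminal.
    """
    for sub in subscriptions:
        fresh = [
            (ts, text) for ts, column, text in items
            if ts > sub.last_seen
            and column.startswith(sub.column_prefix)
            and all(k in text for k in sub.keywords)
        ]
        for ts, text in sorted(fresh):
            push(sub.user, text)
            sub.last_seen = max(sub.last_seen, ts)
```

Recording `last_seen` per subscription is what keeps a later tick from pushing the same item twice.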
As an example, suppose a user wants to search via WAP on a mobile phone for information related to the keyword "sentiment undertone in the end of the year" in the "analysis expert" sub-column of the "financial column". The concrete realization and mode of implementation are as follows:
1. Generating the search rules
This part is completed through human-machine interaction and mainly comprises the following 2 steps:
A. Web analysis: by automatically analyzing the target website, the link-acquisition expression for content pages, the content-matching expression and the content-acquisition expression are generated, and finally the complete information collection rules are generated.
The content-matching expression of this example:
sTitle>{.+?}<.+?<br>{.+?}<br><br></td></tr><
The acquisition expression of this example:
ef=([\″′]|\b)*{[^<\″′]+?}(([\″′]|\b)[^>]*?>)|(>){[^<]+?}<1
B. Across diverse websites, the region of any network element (text content) on a target web page can be designated as the retrieval target, which improves the accuracy of the search.
The columns in this example are set as follows:
"financial column": code 001
"analysis expert": code 001001
These two steps produce the search rule expressions needed to drive the acquisition engine. The main approach is to compare two content pages on the target website that were generated from the same template: the page structure and content are analyzed, the page Tag structure is analyzed, and each network element's position in the source file and the Tag structure surrounding it are determined. The order mapping between each network element and the network elements defined in the database is analyzed. All page links are obtained, the content pages are identified, and the links to the content pages are determined. The link-acquisition expression, content-matching expression and content-acquisition expression are generated and verified. Together with the remaining parameters they form the complete search rules, which are then verified.
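To make the role of the content-matching expression more concrete, the Python sketch below applies the example expression quoted above to a made-up HTML fragment. It rests on an assumption that the patent does not state: that `{` and `}` in the expression mark the fields to capture, so they can be rewritten as regular-expression groups. The sample HTML and the function names are invented for illustration.

```python
import re

def to_python_regex(expr):
    """Rewrite the patent-style expression as a Python regular expression.

    Assumption: "{" and "}" delimit captured fields and are never quantifiers.
    """
    return expr.replace("{", "(").replace("}", ")")

# The content-matching expression quoted in this example.
PATENT_EXPR = "sTitle>{.+?}<.+?<br>{.+?}<br><br></td></tr><"
PATTERN = re.compile(to_python_regex(PATENT_EXPR), re.DOTALL)

def extract(html):
    """Return the captured fields as a tuple, or None if the page does not match."""
    match = PATTERN.search(html)
    return match.groups() if match else None

# Invented page fragment that follows the layout the expression expects.
SAMPLE = (
    "<tr><td class=sTitle>Year-end outlook</td></tr>"
    "<tr><td><br>Analysts expect a quiet market.<br><br></td></tr></table>"
)

print(extract(SAMPLE))
```

Under this reading, the first captured field would be the title and the second the body text, which fits the description of locating each target network element by the Tag structure around it.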
A unified, externally visible set of service columns is defined and classified into larger and smaller columns. The coding works as follows: each level takes 3 characters, so 001 is a first-level node, 001001 and 001002 are child nodes of node 001, 002 is a first-level node at the same level as 001, and so on. According to these settings, the search engine assigns the information from the target websites to the corresponding service columns and provides accurate content services.
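The 3-characters-per-level coding can be handled with a few small helpers, sketched here in Python. The function names are illustrative, and the check that codes consist of digits only is an assumption drawn from the examples 001, 001001 and 002.

```python
LEVEL_WIDTH = 3  # characters per tree level, as described above

def split_code(code):
    """Split a column code into its per-level parts, validating its shape."""
    if not code or len(code) % LEVEL_WIDTH != 0 or not code.isdigit():
        raise ValueError(f"invalid column code: {code!r}")
    return [code[i:i + LEVEL_WIDTH] for i in range(0, len(code), LEVEL_WIDTH)]

def depth(code):
    """Return the tree level of a code: 1 for "001", 2 for "001001", and so on."""
    return len(split_code(code))

def parent(code):
    """Return the parent code, or None for a first-level node."""
    split_code(code)  # validate
    return code[:-LEVEL_WIDTH] or None

def is_descendant(code, ancestor):
    """True if code lies strictly below ancestor in the column tree."""
    split_code(code)
    split_code(ancestor)
    return code != ancestor and code.startswith(ancestor)
```

Because every level has a fixed width, a prefix test is enough to select a column together with all of its sub-columns.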
2. Information collection and classification
This part is completed automatically, driven by the search rules that have been set, in the following steps.
A. Driven by the search rules generated in the previous step, the search rules are executed in a loop to collect from the configured set of target websites and target columns.
B. During collection, only the newest information appearing on the target websites is collected, in an iterative manner.
C. When information collection finishes, the original web page HTTP protocol data stream is output.
The search task execution module decomposes the task, first dividing the search rules into subtasks.
First, the home page and column pages are obtained; the handling of column classification is defined in the search rules. After the link-acquisition expression is executed, the links to content pages are obtained. Each content page is fetched and submitted to information processing, then the next content page is fetched, and the results are assigned to the "analysis expert" sub-column of the "financial column".
The collected information is assigned by rule to the corresponding classification column, so that users can be offered a manageable, unified information portal and the search result set becomes more accurate. The engine collects only newly appearing information and outputs it with near-real-time updates.
3. Information processing and caching
A. The HTTP protocol data obtained during collection is processed by the information processing module of the information acquisition engine.
That is, using the search rules generated by the web analysis module, the content-acquisition expression is applied to extract network elements, and the content-matching expression is applied to separate the network elements and extract the required ones. To reduce the complexity of the content-acquisition expression, extraction uses a two-level expression, that is, a second matching pass. If the content-acquisition expression fails, the error is written to the error log and an error code is returned. After network element decomposition, match filtering, information formatting, information encoding, deduplication and other steps, the final output is plain text with whitespace and markup removed, free of illegal characters and of any other non-text content.
Duplicate articles are removed during this process. The specific steps are as follows. The information is split into complete sentences at punctuation marks and feature codes are extracted: N feature codes are taken from each item, that is, from its first N natural sentences; extra sentences are ignored and missing codes are padded with 0. Whether two articles are similar depends on how many of their feature codes coincide. The feature code sum is the sum of all N feature codes of an item. Similar items have close feature code sums, while the sums of different items differ more; a hash table is used to sort, insert and look up feature codes and their sums. Each new item's feature codes are compared with those of the M items closest in sum and the M most recent items, and duplicates are eliminated. In other words, feature codes are extracted from each item and the buffer is searched for items with similar content; if one exists, the duplicate is discarded. Feature codes are generally extracted per natural sentence in order to speed up full-text comparison.
B. After the above processing, the information for the "analysis expert" sub-column of the "financial column" is cached in the full-text index information base by the full-text index module, and an incremental full-text index is built for newly entered information. This full-text index uses time sequence and column as its primary key and is sorted in descending order, so that by default the newest information comes first. Different columns can be stored in different physical tables to improve concurrent access speed and thus provide more efficient retrieval.
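An incremental index of this kind can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions, not the patent's storage design: postings are keyed by `(column, term)` to loosely mirror the idea of separate tables per column, and terms come from whitespace splitting, which would need to be replaced by a proper word segmenter for Chinese text. The class and method names are hypothetical.

```python
from collections import defaultdict

class IncrementalIndex:
    """Inverted index that grows one document at a time."""

    def __init__(self):
        self._docs = {}                     # doc_id -> (timestamp, column, text)
        self._postings = defaultdict(set)   # (column, term) -> set of doc ids
        self._next_id = 0

    def add(self, timestamp, column, text):
        """Index one newly collected item without rebuilding existing entries."""
        doc_id = self._next_id
        self._next_id += 1
        self._docs[doc_id] = (timestamp, column, text)
        for term in set(text.lower().split()):
            self._postings[(column, term)].add(doc_id)
        return doc_id

    def lookup(self, column, term):
        """Return matching documents for one column, newest first."""
        ids = self._postings.get((column, term.lower()), set())
        return sorted((self._docs[i] for i in ids), key=lambda d: d[0], reverse=True)
```

Sorting by timestamp in descending order at lookup time reproduces the default of showing the newest information first.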
4. Keyword-based search processing for mobile access
With the full-text index information base described above in place, the search task processing module processes the task after receiving a search command from a mobile terminal. Using the access channel and protocol headers of the communication, the device type of the mobile terminal is identified and the information for that device is obtained from a management repository. The search for information related to the keyword "sentiment undertone in the end of the year" is carried out from the WAP page in the "analysis expert" sub-column of the "financial column"; the search results are packaged as WAP protocol data according to the characteristics of the particular terminal, so that they display properly on it.
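The recognition and packaging step can be sketched in Python as below. The sketch assumes that the channel is inferred from the `Accept` request header (WML content is served as `text/vnd.wap.wml`) and that the device is inferred from the `User-Agent` header; the `PROFILES` table with its `max_chars` setting is a hypothetical stand-in for the management repository, and the function names are illustrative.

```python
from xml.sax.saxutils import escape, quoteattr

WML_PROLOG = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" '
    '"http://www.wapforum.org/DTD/wml_1.1.xml">\n'
)

# Hypothetical device profiles, keyed by a substring of the User-Agent header.
PROFILES = {"Nokia": {"max_chars": 60}}
DEFAULT_PROFILE = {"max_chars": 120}

def recognize(headers):
    """Return (channel, profile) derived from the request headers."""
    channel = "wap" if "text/vnd.wap.wml" in headers.get("Accept", "") else "http"
    agent = headers.get("User-Agent", "")
    for key, profile in PROFILES.items():
        if key in agent:
            return channel, profile
    return channel, DEFAULT_PROFILE

def to_wml(title, results, profile):
    """Package result strings as one WML card, truncated to fit the device."""
    body = "".join(f"<p>{escape(r[:profile['max_chars']])}</p>" for r in results)
    return f'{WML_PROLOG}<wml><card id="results" title={quoteattr(title)}>{body}</card></wml>'
```

Truncating each result to a per-device character budget is only one possible adaptation; paging across several cards would be another way to respect small screens.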

Claims (13)

1. A mobile Internet intelligent information search engine based on keyword search, characterized in that: according to specified column classifications and target websites, the machine automatically samples and analyzes the target websites, generates search rules, and collects from the target websites according to the search rules; then, through an information processing flow, the network elements collected from the target websites are organized into a specific full-text index structure and cached, forming a full-text index information base; a search task processing module processes the search commands sent by a mobile device; a device and channel recognition module and a mobile Internet access module determine how the mobile device accesses the Internet and identify the device and the channel it uses; and the processing result is returned to the mobile device.
2. The intelligent information search engine according to claim 1, characterized in that: the search rules are generated in that the system automatically analyzes the structure of the target website, collects the HTML pages that share a similar layout, and automatically generates a link-acquisition expression for the content pages of the target website; generates, as required, a content-matching expression that precisely locates the target network elements; and forms a network element map from the target network elements obtained through the content-matching expression and the mapping between target network elements and column classifications, from which a content-acquisition expression is generated, thereby forming the search rules.
3. The intelligent information search engine according to claim 1, characterized in that: in the information processing flow, driven by the search rules and combined with the column classifications, the HTTP protocol data collected from the target websites passes through web page decomposition, match filtering, information formatting, information encoding and intelligent sentence deduplication; combined with a feature code table, display processing is applied to the information, deleting the feature code words to be filtered; and the final output is plain text with whitespace and markup removed, free of illegal characters and of any other non-text content.
4. The intelligent information search engine according to claim 1, characterized in that: the information in the full-text index information base consists of the text content of the target network elements collected under the search rules and then processed; an incremental full-text index is built for newly entered information; and the index is organized by time sequence and by the column classification.
5. The intelligent information search engine according to claim 1, characterized in that: the channel and device recognition module uses the access channel and protocol headers of the communication to identify the device type of the mobile terminal and thereby obtain the configuration information for that device type; and the search results are processed to suit the characteristics of the particular mobile terminal and the operating habits of mobile users, after which the mobile protocol data is output to the user's mobile terminal to display the search results.
6. The intelligent information search engine according to claim 2, characterized in that: the link-acquisition expression for content pages is generated in that the system automatically analyzes the target website structure, the target web page Tag syntactic structure and the target web page content structure; classifies the web pages collected from each column and each directory of the target website, based on Tag syntax, by identical layout and identical directory; and automatically generates the link-acquisition expression for the relevant content pages of the corresponding target website.
7. The intelligent information search engine according to claim 2 or 6, characterized in that: the content-matching expression is generated in that the positions of all target network elements on the target web page are determined from the similarities and differences in the web page Tag syntactic structure and in the web page content structure among the similarly laid-out pages of each target website directory, and the content-matching expression of the target network elements is generated.
8. The intelligent information search engine according to claim 2 or 6, characterized in that: the content-acquisition expression is generated in that, according to the information type of each target network element, the content-matching expression is used to determine the target network element corresponding to each information element to be parsed in the target web page; the mapping between target network elements and column classifications forms a network element map that fixes the position of each target network element on the target web page; and the content-acquisition expression for the target network elements is generated.
9. The intelligent information search engine according to claim 7, characterized in that: the content-acquisition expression is generated in that, according to the information type of each target network element, the content-matching expression is used to determine the target network element corresponding to each information element to be parsed in the target web page; the mapping between target network elements and column classifications forms a network element map that fixes the position of each target network element on the target web page; and the content-acquisition expression for the target network elements is generated.
10. The intelligent information search engine according to claim 9, characterized in that: the intelligent sentence deduplication is a method for eliminating repeated information at the sentence level, specifically: a) the information is split into complete sentences at punctuation marks; b) feature codes are extracted from the information: each item yields N feature codes from its first N natural sentences, any remaining sentences are ignored, and missing codes are padded with zeros; c) the feature codes are sorted, inserted, looked up and compared, each new item's feature codes being compared with those of the m closest items; d) items whose difference falls within a set threshold are eliminated as duplicates.
11. The intelligent information search engine according to any one of claims 1 to 6, characterized in that: the search task processing module comprises a timer and a customizer.
12. The intelligent information search engine according to claim 8, characterized in that: the search task processing module comprises a timer and a customizer.
13. The intelligent information search engine according to claim 10, characterized in that: the search task processing module comprises a timer and a customizer.
CNB2004100266745A 2004-03-31 2004-03-31 Mobile internet intelligent information retrieval engine based on key-word retrieval Expired - Lifetime CN100357942C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2004100266745A CN100357942C (en) 2004-03-31 2004-03-31 Mobile internet intelligent information retrieval engine based on key-word retrieval

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2004100266745A CN100357942C (en) 2004-03-31 2004-03-31 Mobile internet intelligent information retrieval engine based on key-word retrieval

Publications (2)

Publication Number Publication Date
CN1677389A true CN1677389A (en) 2005-10-05
CN100357942C CN100357942C (en) 2007-12-26

Family

ID=35049908

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2004100266745A Expired - Lifetime CN100357942C (en) 2004-03-31 2004-03-31 Mobile internet intelligent information retrieval engine based on key-word retrieval

Country Status (1)

Country Link
CN (1) CN100357942C (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100485690C (en) * 2007-08-09 2009-05-06 姜边 Internet information acquisition method facing field and oriented by policy
CN101324887B (en) * 2007-06-11 2011-08-24 国际商业机器公司 Method and apparatus for searching information resource
CN102236691A (en) * 2010-05-04 2011-11-09 张文广 Precision guided searching tool system
CN102637177A (en) * 2011-02-14 2012-08-15 苏州巴米特信息科技有限公司 Characteristic method for browsing webpages on mobile phones
CN103020224A (en) * 2012-12-12 2013-04-03 百度在线网络技术(北京)有限公司 Method and device of intelligent search
US8417684B2 (en) 2008-09-08 2013-04-09 Huawei Technologies Co., Ltd. Method, system, and device for searching for information and method for registering vertical search engine
CN105793844A (en) * 2013-11-27 2016-07-20 微软技术许可有限责任公司 Contextual information lookup and navigation
CN108664606A (en) * 2018-05-10 2018-10-16 北京鼎泰智源科技有限公司 A kind of big data coverage rate capturing analysis method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2431762C (en) * 2000-12-18 2011-11-01 Kargo, Inc. A system and method for delivering content to mobile devices

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101324887B (en) * 2007-06-11 2011-08-24 国际商业机器公司 Method and apparatus for searching information resource
CN100485690C (en) * 2007-08-09 2009-05-06 姜边 Internet information acquisition method facing field and oriented by policy
US8417684B2 (en) 2008-09-08 2013-04-09 Huawei Technologies Co., Ltd. Method, system, and device for searching for information and method for registering vertical search engine
CN102236691A (en) * 2010-05-04 2011-11-09 张文广 Precision guided searching tool system
CN102637177A (en) * 2011-02-14 2012-08-15 苏州巴米特信息科技有限公司 Characteristic method for browsing webpages on mobile phones
CN103020224A (en) * 2012-12-12 2013-04-03 百度在线网络技术(北京)有限公司 Method and device of intelligent search
CN103020224B (en) * 2012-12-12 2019-01-15 百度在线网络技术(北京)有限公司 A kind of intelligent search method and device
CN105793844A (en) * 2013-11-27 2016-07-20 微软技术许可有限责任公司 Contextual information lookup and navigation
CN108664606A (en) * 2018-05-10 2018-10-16 北京鼎泰智源科技有限公司 A kind of big data coverage rate capturing analysis method

Also Published As

Publication number Publication date
CN100357942C (en) 2007-12-26

Similar Documents

Publication Publication Date Title
CN1240011C (en) File classifying management system and method for operation system
CN103226578B (en) Towards the website identification of medical domain and the method for webpage disaggregated classification
US7565350B2 (en) Identifying a web page as belonging to a blog
US8868621B2 (en) Data extraction from HTML documents into tables for user comparison
CN100498790C (en) Retrieving method and system
CN107885793A (en) A kind of hot microblog topic analyzing and predicting method and system
CN112749284B (en) Knowledge graph construction method, device, equipment and storage medium
CN101908071A (en) Method and device thereof for improving search efficiency of search engine
CN103544255A (en) Text semantic relativity based network public opinion information analysis method
CN109271477A (en) A kind of method and system by internet building taxonomy library
CN101814083A (en) Automatic webpage classification method and system
CN1858733A (en) Information searching system and searching method
CN1732451A (en) Methods and apparatus for summarizing document content for mobile communication devices
CN101261629A (en) Specific information searching method based on automatic classification technology
CN101650715A (en) Method and device for screening links on web pages
CN1702651A (en) Recognition method and apparatus for information files of specific types
CN110457579B (en) Webpage denoising method and system based on cooperative work of template and classifier
CN101630315B (en) Quick retrieval method and system
CN101310277B (en) Method of obtaining a representation of a text and system
CN109948154A (en) A kind of personage's acquisition and relationship recommender system and method based on name
US11334592B2 (en) Self-orchestrated system for extraction, analysis, and presentation of entity data
KR100671077B1 (en) Server, Method and System for Providing Information Search Service by Using Sheaf of Pages
CN100357942C (en) Mobile internet intelligent information retrieval engine based on key-word retrieval
CN1342942A (en) Computer recognizing and indexing method of Chinese names
KR20030069640A (en) System and method for geting information on hierarchical and conceptual clustering

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C57 Notification of unclear or unknown address
DD01 Delivery of document by public notice

Addressee: Zeng Xiaotong

Document name: Notification of the application for patent for invention to go through the substantive examination procedure

C57 Notification of unclear or unknown address
DD01 Delivery of document by public notice

Addressee: Zeng Xiaotong

Document name: Notice of correction

C14 Grant of patent or utility model
GR01 Patent grant
DD01 Delivery of document by public notice

Addressee: Li Jiaoling

Document name: Notification to Pay the Fees

CX01 Expiry of patent term

Granted publication date: 20071226

DD01 Delivery of document by public notice

Addressee: Li Jiaoling

Document name: Notice of Termination of Patent Rights