CN109299353A - Webpage information search method and device

Info

Publication number
CN109299353A
Authority
CN
China
Prior art keywords
webpage
webpage information
search
word segmentation
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811351819.7A
Other languages
Chinese (zh)
Inventor
何中
刘剑波
严伟
戴建峰
陈明敏
姚童
何登
王斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
JIANGSU ZHONGWEI TECHNOLOGY SOFTWARE SYSTEM Co Ltd
Original Assignee
JIANGSU ZHONGWEI TECHNOLOGY SOFTWARE SYSTEM Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by JIANGSU ZHONGWEI TECHNOLOGY SOFTWARE SYSTEM Co Ltd
Priority to CN201811351819.7A
Publication of CN109299353A
Legal status: Pending

Abstract

The invention discloses a webpage information search method and device. The method comprises: receiving a search request that carries a search term; looking up the target webpage information corresponding to the search term in a pre-established correspondence between webpage entries and webpage information; and returning the target webpage information. Because the invention is based on intelligent word segmentation of website information, users can quickly find results that meet their criteria: search time is shortened to the order of seconds, accurate results are obtained without requiring an exact match of the search term, and the processing load on the server is reduced.

Description

Webpage information search method and device
Technical field
The present invention relates to the field of computer technology, and in particular to a webpage information search method and device.
Background
With the development of computer network technology, people increasingly search the network for the information they need. A search engine searches according to the search term entered by the user and returns webpage information matching the search term to the user.
However, webpage data is currently mostly stored in relational databases, so webpage information cannot be searched quickly, and results are obtained only when the search keyword matches exactly, which increases the load on the server.
Accordingly, a more reliable or more effective method is needed to shorten search time and reduce the load on the server during search.
Summary of the invention
To solve the problems in the prior art, embodiments of the present invention provide a webpage information search method and device. The technical solution is as follows: in one aspect, a webpage information search method is provided, the method comprising: receiving a search request that carries a search term; looking up the target webpage information corresponding to the search term in a pre-established correspondence between webpage entries and webpage information; and returning the target webpage information.
Further, the method also includes establishing the correspondence between webpage entries and webpage information, which comprises: crawling webpage information using a crawler tool; performing word segmentation on the crawled webpage information to obtain webpage entries; and establishing the correspondence between the webpage entries and the webpage information.
Further, performing word segmentation on the crawled webpage information to obtain webpage entries comprises: preprocessing the crawled webpage information to obtain first webpage information; performing word segmentation on the first webpage information to obtain a segmented-word set; determining the relevance of each segmented word in the set to the webpage information; and determining the webpage entries according to the relevance of each segmented word to the webpage information.
Further, determining the relevance of each segmented word in the set to the webpage information comprises: determining a first weight value according to the part of speech of each segmented word in the set; calculating the recurrence rate of each segmented word in the set; and determining the relevance of the segmented word to the webpage information according to the first weight value and the recurrence rate.
Further, looking up the target webpage information corresponding to the search term in the pre-established correspondence between webpage segmented words and webpage information comprises: obtaining a search keyword from the search term; judging whether the search keyword matches a webpage segmented word in the correspondence; if the result of the judgment is yes, obtaining the correspondence between that webpage segmented word and webpage information; and obtaining the target webpage information according to the correspondence between the webpage segmented word and webpage information.
In another aspect, a webpage information search device is provided, the device comprising:
a receiving module, configured to receive a search request that carries a search term;
a lookup module, configured to look up the target webpage information corresponding to the search term in a pre-established correspondence between webpage entries and webpage information;
a return module, configured to return the target webpage information.
Further, the device also includes a correspondence establishing module, configured to establish the correspondence between webpage entries and webpage information, comprising:
a crawling module, configured to crawl webpage information using a crawler tool;
a word segmentation module, configured to perform word segmentation on the crawled webpage information to obtain webpage entries;
an establishing module, configured to establish the correspondence between the webpage entries and the webpage information.
Further, the word segmentation module comprises:
a preprocessing module, configured to preprocess the crawled webpage information to obtain first webpage information;
a cutting module, configured to perform word segmentation on the first webpage information to obtain a segmented-word set;
a relevance determining module, configured to determine the relevance of each segmented word in the set to the webpage information;
a webpage entry determining module, configured to determine the webpage entries according to the relevance of each segmented word to the webpage information.
Further, the relevance determining module comprises:
a first weight determining module, configured to determine a first weight value according to the part of speech of each segmented word in the set;
a calculating module, configured to calculate the recurrence rate of each segmented word in the set;
a determining submodule, configured to determine the relevance of the segmented word to the webpage information according to the first weight value and the recurrence rate.
Further, the lookup module comprises:
a first obtaining module, configured to obtain a search keyword from the search term;
a judging module, configured to judge whether the search keyword matches a webpage segmented word in the correspondence;
a second obtaining module, configured to obtain, if the result of the judgment is yes, the correspondence between that webpage segmented word and webpage information;
a third obtaining module, configured to obtain the target webpage information according to the correspondence between the webpage segmented word and webpage information.
In another aspect, an electronic device is provided, comprising:
a processor, adapted to execute one or more instructions; and
a memory storing one or more instructions, the one or more instructions being adapted to be loaded by the processor to execute the above webpage information search method.
The technical solution provided by the embodiments of the present invention has the following beneficial effects:
The present invention receives a search request that carries a search term, looks up the target webpage information corresponding to the search term in a pre-established correspondence between webpage entries and webpage information, and returns the target webpage information. Because this is based on intelligent word segmentation of website information, users can quickly find results that meet their criteria: search time is shortened to the order of seconds, accurate results are obtained without requiring an exact match of the search term, and the processing load on the server is reduced.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow diagram of a webpage information search method provided by an embodiment of the present invention;
Fig. 2 is a flow diagram of establishing the correspondence between webpage entries and webpage information provided by an embodiment of the present invention;
Fig. 3 is a flow diagram of performing word segmentation on the crawled webpage information to obtain webpage entries provided by an embodiment of the present invention;
Fig. 4 is a flow diagram of determining the relevance of each segmented word in the segmented-word set to the webpage information provided by an embodiment of the present invention;
Fig. 5 is a flow diagram of looking up the target webpage information corresponding to the search term in the pre-established correspondence between webpage segmented words and webpage information provided by an embodiment of the present invention;
Fig. 6 is a structural diagram of a webpage information search device provided by an embodiment of the present invention;
Fig. 7 is a structural diagram of another webpage information search device provided by an embodiment of the present invention;
Fig. 8 is a structural diagram of a word segmentation module provided by an embodiment of the present invention;
Fig. 9 is a structural diagram of a relevance determining module provided by an embodiment of the present invention;
Fig. 10 is a structural diagram of a lookup module provided by an embodiment of the present invention;
Fig. 11 is a structural diagram of an electronic device provided by an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the drawings.
Referring to Fig. 1, which shows a flow diagram of a webpage information search method provided by an embodiment of the present invention. It should be noted that this specification provides the method operation steps as described in the embodiments or flowcharts, but more or fewer operation steps may be included on the basis of routine, non-creative work. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only one. When executed in an actual system or product, the steps may be executed sequentially in the order shown in the embodiments or drawings, or in parallel (for example, in a parallel-processor or multi-threaded environment). As shown in Fig. 1, the method may include:
Step 101: receive a search request that carries a search term.
In the embodiments of this specification, when a user enters a search term through a client to search for webpage information, the client sends a search request to the server based on the search term entered by the user; the search request carries that search term.
Correspondingly, the server receives the search request sent by the client and obtains the search term from it.
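The patent does not say how the search term is carried in the request. As a minimal sketch only, assuming the request is an HTTP GET URL and the term travels in a query parameter (the parameter name `q` and the function name `extract_search_term` are hypothetical), the server-side extraction could look like this:

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_search_term(request_url: str, param: str = "q") -> Optional[str]:
    """Return the search term carried in a request URL, or None if absent.

    The parameter name "q" is an assumption for illustration; the patent
    does not specify the request format.
    """
    query = parse_qs(urlparse(request_url).query)
    values = query.get(param)
    if not values:
        return None
    term = values[0].strip()
    return term or None
```

Returning `None` for a missing or empty term lets the caller reject the request before any lookup is attempted.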
Step 103: look up the target webpage information corresponding to the search term in the pre-established correspondence between webpage entries and webpage information.
In the embodiments of this specification, the server pre-establishes the correspondence between webpage entries and webpage information; the correspondence may be stored in the form of a table.
In a specific embodiment, the server may pre-establish the correspondence between webpage entries and webpage information using the method shown in Fig. 2. Fig. 2 is a flow diagram of establishing the correspondence between webpage entries and webpage information provided by an embodiment of the present invention. As shown in Fig. 2, the method may include:
Step 201: crawl webpage information using a crawler tool.
Specifically, the website URL requested by a received JS request event can be obtained from that event; data in JSON format is then fetched from the requested URL and parsed to obtain the corresponding crawled webpage information.
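The parsing half of this step can be sketched as follows. This is an illustration under stated assumptions, not the patent's crawler: the JSON payload is assumed to have been fetched already, and the field names `url`, `title`, and `content` as well as the function name `parse_crawled_payload` are hypothetical.

```python
import json
from typing import Dict


def parse_crawled_payload(raw_json: str) -> Dict[str, str]:
    """Parse a JSON payload into a flat webpage-information record.

    The keys "url", "title" and "content" are assumed field names; a real
    site may use a different schema.
    """
    data = json.loads(raw_json)
    return {
        "url": str(data.get("url", "")),
        "title": str(data.get("title", "")),
        "text": str(data.get("content", "")),
    }
```

Missing fields fall back to empty strings so that later steps always receive the same record shape.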
Step 203: perform word segmentation on the crawled webpage information to obtain webpage entries.
Specifically, a preset word segmentation method may be applied to the crawled webpage information to obtain the webpage entries corresponding to that webpage information. A webpage entry may be a single segmented word or a combination of several segmented words; the present invention does not limit this.
The preset word segmentation method may use a word segmentation technique from the prior art; the present invention does not specifically limit this.
In the embodiments of this specification, performing word segmentation on the crawled webpage information to obtain webpage entries may use the method shown in Fig. 3. Fig. 3 is a flow diagram of performing word segmentation on the crawled webpage information to obtain webpage entries provided by an embodiment of the present invention. As shown in Fig. 3, the method may include:
Step 301: preprocess the crawled webpage information to obtain first webpage information.
In the embodiments of this specification, preprocessing may include removing stop words, auxiliary words, and the like from the crawled webpage information.
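A minimal sketch of such preprocessing is shown below. It assumes whitespace-separated English text and a tiny hand-written stop-word set (`STOP_WORDS` and `preprocess` are hypothetical names); Chinese text would first need a proper segmenter and a fuller stop-word list.

```python
# Hypothetical stop-word list for illustration; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "is"}


def preprocess(text: str) -> str:
    """Drop stop words from whitespace-separated text, keeping word order."""
    kept = [word for word in text.split() if word.lower() not in STOP_WORDS]
    return " ".join(kept)
```

For example, `preprocess("The history of the web")` keeps only `history web`.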
Step 303: perform word segmentation on the first webpage information to obtain a segmented-word set.
In the embodiments of this specification, the word segmentation may use a word segmentation technique from the prior art; the present invention does not specifically limit this.
Step 305: determine the relevance of each segmented word in the set to the webpage information.
In the embodiments of this specification, determining the relevance of each segmented word in the set to the webpage information may use the method shown in Fig. 4. Fig. 4 is a flow diagram of determining the relevance of each segmented word in the set to the webpage information provided by an embodiment of the present invention. As shown in Fig. 4, the method may include:
Step 401: determine a first weight value according to the part of speech of each segmented word in the set.
In the embodiments of this specification, parts of speech may include nouns, verbs, adjectives, numerals, and so on, and a weight value may be preset for each part of speech.
Step 403: calculate the recurrence rate of each segmented word in the set.
In the embodiments of this specification, the total number of segmented words in the set and the number of times each segmented word occurs in the set may be counted; the recurrence rate of each segmented word is then calculated from its number of occurrences and the total number, for example by taking the ratio of the number of occurrences of each segmented word to the total number of segmented words as its recurrence rate.
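The ratio described above can be sketched in a few lines. The function name `recurrence_rates` is hypothetical, and the input is assumed to be the segmented-word set represented as a list that keeps repeated words.

```python
from collections import Counter
from typing import Dict, List


def recurrence_rates(words: List[str]) -> Dict[str, float]:
    """Map each segmented word to (occurrences of the word) / (total words)."""
    total = len(words)
    if total == 0:
        return {}
    counts = Counter(words)
    return {word: count / total for word, count in counts.items()}
```

With the list `["search", "web", "search", "page"]`, the word `search` occurs 2 times out of 4, so its recurrence rate is 0.5.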
Step 405: determine the relevance of the segmented word to the webpage information according to the first weight value and the recurrence rate.
Specifically, the product of the first weight value and the recurrence rate may be used as the relevance value; the larger the relevance value, the more relevant the corresponding segmented word is to the webpage information.
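As a sketch of this product, assuming illustrative part-of-speech weights (the patent says only that weights are preset, so the values in `POS_WEIGHTS`, the fallback `DEFAULT_WEIGHT`, and the function name `relevance` are all hypothetical):

```python
# Hypothetical preset weights per part of speech; the patent gives no values.
POS_WEIGHTS = {"noun": 1.0, "verb": 0.5, "adjective": 0.25}
DEFAULT_WEIGHT = 0.125  # assumed fallback for parts of speech not listed


def relevance(part_of_speech: str, recurrence_rate: float) -> float:
    """Relevance value = first weight value * recurrence rate."""
    weight = POS_WEIGHTS.get(part_of_speech, DEFAULT_WEIGHT)
    return weight * recurrence_rate
```

Under these assumed weights, a noun and a verb with the same recurrence rate of 0.5 score 0.5 and 0.25 respectively, so the noun ranks as more relevant.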
Step 307: determine the webpage entries according to the relevance of each segmented word to the webpage information.
Specifically, the segmented words may be sorted by their relevance to the webpage information, for example in descending order of relevance, so that the first-ranked segmented word is the one most relevant to the webpage information; that segmented word may be used as the webpage entry of the webpage. Alternatively, the segmented words ranked within a first quantity from the top may be taken as the webpage entries of the corresponding webpage information, for example the top 3 segmented words in descending order of relevance.
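The ranking and top-N selection can be sketched as below. The function name `select_entries` and the default of 3 (taken from the example in the text) are illustrative; the relevance values are assumed to have been computed already.

```python
from typing import Dict, List


def select_entries(relevance_by_word: Dict[str, float], top_n: int = 3) -> List[str]:
    """Return the top_n segmented words in descending order of relevance."""
    ranked = sorted(relevance_by_word.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:top_n]]
```

Calling it with `top_n=1` gives the single most relevant segmented word, which corresponds to the first variant described above.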
Step 205: establish the correspondence between the webpage entries and the webpage information.
Specifically, the index relationship between webpage entries and webpage information may be stored in the form of a list.
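One way to picture this index relationship is an in-memory mapping from each webpage entry to the list of pages it was derived from. This is only a sketch under that assumption: the patent does not describe the storage layout, and `build_index` and the use of page URLs as webpage information are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_index(pages: Iterable[Tuple[str, List[str]]]) -> Dict[str, List[str]]:
    """Build entry -> list of page URLs from (page URL, webpage entries) pairs."""
    index: Dict[str, List[str]] = defaultdict(list)
    for page_url, entries in pages:
        for entry in entries:
            if page_url not in index[entry]:  # avoid duplicate postings
                index[entry].append(page_url)
    return dict(index)
```

Because one entry can map to several pages, a single lookup can return the webpage information of multiple webpages.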
In the embodiments of this specification, looking up the target webpage information corresponding to the search term in the pre-established correspondence between webpage segmented words and webpage information may use the method shown in Fig. 5. Fig. 5 is a flow diagram of looking up the target webpage information corresponding to the search term in the pre-established correspondence between webpage segmented words and webpage information provided by an embodiment of the present invention. As shown in Fig. 5, the method may include:
Step 501: obtain a search keyword from the search term.
Specifically, the search keyword is the key information that reflects the search content. It may be obtained by performing word segmentation on the search term, calculating a key index for each resulting segmented word, and taking the segmented word with the highest key index as the search keyword of the search term.
The key index may be determined from the part-of-speech weight of a segmented word and the weight of its position in the search term. Specifically, weights for each part of speech and position weights running from the first to the last position of the search term may be preset; the product of the part-of-speech weight and the position weight gives the key index of the segmented word.
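A sketch of the key-index calculation follows. The patent fixes neither the weights nor the shape of the position weighting, so everything concrete here is an assumption: the values in `POS_WEIGHTS`, the linearly decreasing `position_weight`, and the input format of (word, part of speech) pairs produced by some earlier segmentation step.

```python
from typing import List, Optional, Tuple

# Hypothetical preset weights; the patent does not give concrete values.
POS_WEIGHTS = {"noun": 1.0, "verb": 0.5, "adjective": 0.25}
DEFAULT_WEIGHT = 0.125


def position_weight(index: int, length: int) -> float:
    """Assumed scheme: weight falls linearly from the first to the last position."""
    return (length - index) / length


def search_keyword(tagged_words: List[Tuple[str, str]]) -> Optional[str]:
    """Return the segmented word with the highest key index, or None if empty."""
    best_word, best_score = None, -1.0
    for i, (word, pos) in enumerate(tagged_words):
        # Key index = part-of-speech weight * position weight.
        score = POS_WEIGHTS.get(pos, DEFAULT_WEIGHT) * position_weight(i, len(tagged_words))
        if score > best_score:
            best_word, best_score = word, score
    return best_word
```

With these assumed weights, the term "cheap flights book" tagged as adjective, noun, verb yields `flights`, since the noun weight outweighs its slightly lower position weight.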
Step 503: judge whether the search keyword matches a webpage segmented word in the correspondence.
Specifically, the search keyword is compared for similarity with the webpage entries in the pre-established correspondence between webpage entries and webpage information, and it is judged whether the similarity reaches a preset threshold; when it does, the webpage entry matching the search keyword is determined.
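The patent does not name a similarity measure. As a sketch only, the threshold test can be illustrated with the character-level ratio from Python's standard `difflib.SequenceMatcher`, which is a swapped-in stand-in rather than the patent's method; the function name `match_entries` and the threshold 0.7 are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Iterable, List


def match_entries(keyword: str, entries: Iterable[str], threshold: float = 0.7) -> List[str]:
    """Return the entries whose similarity to the keyword reaches the threshold."""
    matched = []
    for entry in entries:
        similarity = SequenceMatcher(None, keyword, entry).ratio()
        if similarity >= threshold:
            matched.append(entry)
    return matched
```

This is what allows a keyword such as `searching` to match the entry `search` without being identical to it, while an unrelated entry such as `weather` stays below the threshold.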
Step 505: obtain the correspondence between the webpage segmented word and webpage information.
Specifically, once the webpage entry matching the search keyword has been determined, the correspondence between that webpage entry and webpage information can be obtained; the correspondence may be stored on the server in the form of a list.
Step 507: obtain the target webpage information according to the correspondence between the webpage segmented word and webpage information.
In the embodiments of this specification, the target webpage information may include the webpage information of multiple webpages.
Step 105: return the target webpage information.
In the embodiments of this specification, after obtaining the target webpage information, the server returns it to the client so that the client can display it to the user.
In summary, the present invention receives a search request that carries a search term, looks up the target webpage information corresponding to the search term in a pre-established correspondence between webpage entries and webpage information, and returns the target webpage information. Because this is based on intelligent word segmentation of website information, users can quickly find results that meet their criteria: search time is shortened to the order of seconds, accurate results are obtained without requiring an exact match of the search term, and the processing load on the server is reduced.
Corresponding to the webpage information search method provided by the above embodiments, an embodiment of the present invention also provides a webpage information search device. Because the device corresponds to the method, the foregoing implementations of the method also apply to the device and are not described in detail in this embodiment.
Referring to Fig. 6, which shows a structural diagram of a webpage information search device provided by an embodiment of the present invention. As shown in Fig. 6, the device may include:
a receiving module 610, configured to receive a search request that carries a search term;
a lookup module 620, configured to look up the target webpage information corresponding to the search term in a pre-established correspondence between webpage entries and webpage information;
a return module 630, configured to return the target webpage information.
In a specific embodiment, as shown in Fig. 7, the device may also include a correspondence establishing module 640, configured to establish the correspondence between webpage entries and webpage information. Specifically, the correspondence establishing module 640 may include:
a crawling module 6410, configured to crawl webpage information using a crawler tool;
a word segmentation module 6420, configured to perform word segmentation on the crawled webpage information to obtain webpage entries;
an establishing module 6430, configured to establish the correspondence between the webpage entries and the webpage information.
Specifically, as shown in Fig. 8, the word segmentation module 6420 may include:
a preprocessing module 810, configured to preprocess the crawled webpage information to obtain first webpage information;
a cutting module 820, configured to perform word segmentation on the first webpage information to obtain a segmented-word set;
a relevance determining module 830, configured to determine the relevance of each segmented word in the set to the webpage information;
a webpage entry determining module 840, configured to determine the webpage entries according to the relevance of each segmented word to the webpage information.
Specifically, as shown in Fig. 9, the relevance determining module 830 may include:
a first weight determining module 8310, configured to determine a first weight value according to the part of speech of each segmented word in the set;
a calculating module 8320, configured to calculate the recurrence rate of each segmented word in the set;
a determining submodule 8330, configured to determine the relevance of the segmented word to the webpage information according to the first weight value and the recurrence rate.
In another embodiment, as shown in Fig. 10, the lookup module 620 may include:
a first obtaining module 6210, configured to obtain a search keyword from the search term;
a judging module 6220, configured to judge whether the search keyword matches a webpage segmented word in the correspondence;
a second obtaining module 6230, configured to obtain, if the result of the judgment is yes, the correspondence between that webpage segmented word and webpage information;
a third obtaining module 6240, configured to obtain the target webpage information according to the correspondence between the webpage segmented word and webpage information.
As the above embodiments show, the present invention receives a search request that carries a search term, looks up the target webpage information corresponding to the search term in a pre-established correspondence between webpage entries and webpage information, and returns the target webpage information. Because this is based on intelligent word segmentation of website information, users can quickly find results that meet their criteria: search time is shortened to the order of seconds, accurate results are obtained without requiring an exact match of the search term, and the processing load on the server is reduced.
It should be noted that the device provided by the above embodiment is illustrated only by the above division into functional modules. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
Referring to Fig. 11, which shows a structural diagram of an electronic device provided by an embodiment of the present invention; the electronic device is used to implement the webpage information search device provided in the above embodiments. The electronic device may be a terminal device such as a PC (Personal Computer), a mobile phone, or a PDA (tablet computer), or a service device such as an application server or a cluster server. Referring to Fig. 11, the internal structure of the electronic device may include, but is not limited to, a processor, a network interface, and a memory. The processor, network interface, and memory in the electronic device may be connected by a bus or in other ways; Fig. 11 takes connection by a bus as an example.
The processor (or CPU, Central Processing Unit) is the computing core and control core of the electronic device. The network interface may optionally include a standard wired interface and a wireless interface (such as Wi-Fi or a mobile communication interface). The memory is the storage device in the electronic device and is used to store programs and data. It can be understood that the memory here may be high-speed RAM or non-volatile memory, for example at least one disk storage device; optionally, it may also be at least one storage device located remotely from the processor. The memory provides storage space, which stores the operating system of the electronic device, including but not limited to Windows, Linux, Android (a mobile operating system), iOS (a mobile operating system), and so on; the present invention does not limit this. The storage space also stores one or more instructions suitable for being loaded and executed by the processor; these instructions may be one or more computer programs (including program code). In the embodiments of this specification, the processor loads and executes the one or more instructions stored in the memory to implement the webpage information search method provided by the above method embodiments.
An embodiment of the present invention also provides a storage medium. The storage medium may be disposed in an electronic device to store at least one instruction, at least one program, a code set, or an instruction set related to implementing the cross-file-format vector graphics drawing method of the method embodiments; the at least one instruction, the at least one program, the code set, or the instruction set can be loaded and executed by the processor of the electronic device to implement the webpage information search method provided by the above method embodiments.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants of them are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A webpage information search method, characterized in that the method comprises:
receiving a search request, the search request carrying a search term;
looking up target webpage information corresponding to the search term in a pre-established correspondence between webpage entries and webpage information;
returning the target webpage information.
2. The webpage information search method according to claim 1, characterized in that the method further comprises a step of establishing the correspondence between the webpage entries and the webpage information, comprising:
crawling webpage information using a crawler tool;
performing word segmentation on the crawled webpage information to obtain webpage entries;
establishing the correspondence between the webpage entries and the webpage information.
3. The webpage information search method according to claim 2, characterized in that performing word segmentation on the crawled webpage information to obtain webpage entries comprises:
preprocessing the crawled webpage information to obtain first webpage information;
performing word segmentation on the first webpage information to obtain a word segment set;
determining the degree of relevance between each word segment in the word segment set and the webpage information;
determining the webpage entries according to the degree of relevance between each word segment and the webpage information.
4. The webpage information search method according to claim 3, characterized in that determining the degree of relevance between each word segment in the word segment set and the webpage information comprises:
determining a first weight value according to the part of speech of each word segment in the word segment set;
calculating a fidelity factor for each word segment in the word segment set;
determining the degree of relevance between the word segment and the webpage information according to the first weight value and the fidelity factor.
5. The webpage information search method according to claim 1, characterized in that looking up the target webpage information corresponding to the search term in the pre-established correspondence between webpage word segments and webpage information comprises:
obtaining a search keyword according to the search term;
judging whether the search keyword matches a webpage word segment in the correspondence;
if the judgment result is yes, obtaining the correspondence between the webpage word segment and the webpage information;
obtaining the target webpage information according to the correspondence between the webpage word segment and the webpage information.
6. A webpage information search device, characterized in that the device comprises:
a receiving module, configured to receive a search request, the search request carrying a search term;
a lookup module, configured to look up target webpage information corresponding to the search term in a pre-established correspondence between webpage entries and webpage information;
a return module, configured to return the target webpage information.
7. The webpage information search device according to claim 6, characterized in that the device further comprises a correspondence establishing module, the correspondence establishing module being configured to establish the correspondence between the webpage entries and the webpage information, and comprising:
a crawling module, configured to crawl webpage information using a crawler tool;
a word segmentation module, configured to perform word segmentation on the crawled webpage information to obtain webpage entries;
an establishing module, configured to establish the correspondence between the webpage entries and the webpage information.
8. The webpage information search device according to claim 7, characterized in that the word segmentation module comprises:
a preprocessing module, configured to preprocess the crawled webpage information to obtain first webpage information;
a segmentation module, configured to perform word segmentation on the first webpage information to obtain a word segment set;
a relevance determining module, configured to determine the degree of relevance between each word segment in the word segment set and the webpage information;
a webpage entry determining module, configured to determine the webpage entries according to the degree of relevance between each word segment and the webpage information.
9. The webpage information search device according to claim 8, characterized in that the relevance determining module comprises:
a first weight determining module, configured to determine a first weight value according to the part of speech of each word segment in the word segment set;
a calculation module, configured to calculate a fidelity factor for each word segment in the word segment set;
a determining submodule, configured to determine the degree of relevance between the word segment and the webpage information according to the first weight value and the fidelity factor.
10. The webpage information search device according to claim 6, characterized in that the lookup module comprises:
a first obtaining module, configured to obtain a search keyword according to the search term;
a judging module, configured to judge whether the search keyword matches a webpage word segment in the correspondence;
a second obtaining module, configured to obtain, if the judgment result is yes, the correspondence between the webpage word segment and the webpage information;
a third obtaining module, configured to obtain the target webpage information according to the correspondence between the webpage word segment and the webpage information.
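
The flow recited in claims 1 to 5 (segment the crawled text, score each word by a part-of-speech weight and a second factor, keep the high-scoring words as webpage entries, then look up search keywords against those entries) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: whitespace splitting stands in for word segmentation, `POS_WEIGHTS` and `POS_LEXICON` are hypothetical tables, the fidelity factor is approximated by relative term frequency, the relevance is taken as the product of the two values, and the `threshold` of 0.1 is arbitrary.

```python
from collections import Counter, defaultdict

# Hypothetical part-of-speech weights (standing in for the "first weight value").
POS_WEIGHTS = {"noun": 1.0, "verb": 0.6, "other": 0.2}

# Hypothetical toy lexicon; a real system would use a part-of-speech tagger.
POS_LEXICON = {"patent": "noun", "search": "noun", "index": "noun",
               "crawl": "verb"}


def segment(text):
    """Stand-in for word segmentation: lowercase, split on whitespace."""
    words = (w.strip(".,;:!?") for w in text.lower().split())
    return [w for w in words if w]


def relevance_scores(text):
    """Assumed formula: POS weight times relative frequency of the word."""
    words = segment(text)
    counts = Counter(words)
    total = len(words)
    return {
        w: POS_WEIGHTS[POS_LEXICON.get(w, "other")] * (c / total)
        for w, c in counts.items()
    }


def build_index(pages, threshold=0.1):
    """Map each webpage entry to the URLs of the pages it came from."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word, score in relevance_scores(text).items():
            if score >= threshold:  # keep only sufficiently relevant words
                index[word].add(url)
    return index


def search(index, query):
    """Return pages whose entries match any keyword of the search term."""
    results = set()
    for keyword in segment(query):
        results |= index.get(keyword, set())
    return sorted(results)


# Stand-in for crawled pages (URL -> extracted text).
pages = {
    "https://example.com/a": "Patent search index",
    "https://example.com/b": "Crawl the patent pages",
}
index = build_index(pages)
```

With this toy data, `search(index, "patent")` returns both URLs, while a query made only of low-weight words such as "the pages" returns an empty list because those words never became entries. Because matching is done per keyword against precomputed entries, the query does not need to match the page text exactly.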
CN201811351819.7A 2018-11-14 2018-11-14 A kind of webpage information search method and device Pending CN109299353A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811351819.7A CN109299353A (en) 2018-11-14 2018-11-14 A kind of webpage information search method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811351819.7A CN109299353A (en) 2018-11-14 2018-11-14 A kind of webpage information search method and device

Publications (1)

Publication Number Publication Date
CN109299353A true CN109299353A (en) 2019-02-01

Family

ID=65146556

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811351819.7A Pending CN109299353A (en) 2018-11-14 2018-11-14 A kind of webpage information search method and device

Country Status (1)

Country Link
CN (1) CN109299353A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110362732A (en) * 2019-07-18 2019-10-22 江苏中威科技软件系统有限公司 A kind of method of information system content search
CN111444406A (en) * 2020-03-26 2020-07-24 安徽博约信息科技股份有限公司 Crawler text matching method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103246678A (en) * 2012-02-13 2013-08-14 腾讯科技(深圳)有限公司 Method and device for previewing web page contents
CN104063489A (en) * 2014-07-04 2014-09-24 百度在线网络技术(北京)有限公司 Method and device for determining webpage image relevancy and displaying retrieved result
CN104142945A (en) * 2013-05-08 2014-11-12 阿里巴巴集团控股有限公司 Search method and device based on search term

Similar Documents

Publication Publication Date Title
US20210342549A1 (en) Method for training semantic analysis model, electronic device and storage medium
CN104933100B (en) keyword recommendation method and device
CN113822067A (en) Key information extraction method and device, computer equipment and storage medium
CN109947902B (en) Data query method and device and readable medium
CN114036322A (en) Training method for search system, electronic device, and storage medium
CN112925883A (en) Search request processing method and device, electronic equipment and readable storage medium
US20230004613A1 (en) Data mining method, data mining apparatus, electronic device and storage medium
CN109299353A (en) A kind of webpage information search method and device
CN112948573B (en) Text label extraction method, device, equipment and computer storage medium
CN112528146B (en) Content resource recommendation method and device, electronic equipment and storage medium
CN112699237B (en) Label determination method, device and storage medium
CN114647739B (en) Entity chain finger method, device, electronic equipment and storage medium
CN111666417A (en) Method and device for generating synonyms, electronic equipment and readable storage medium
CN114818736B (en) Text processing method, chain finger method and device for short text and storage medium
CN113792230B (en) Service linking method, device, electronic equipment and storage medium
CN112528644B (en) Entity mounting method, device, equipment and storage medium
CN112560425B (en) Template generation method and device, electronic equipment and storage medium
CN114048315A (en) Method and device for determining document tag, electronic equipment and storage medium
CN114329210A (en) Information recommendation method and device and electronic equipment
CN113806483A (en) Data processing method and device, electronic equipment and computer program product
CN113407579A (en) Group query method and device, electronic equipment and readable storage medium
CN115248890A (en) User interest portrait generation method and device, electronic equipment and storage medium
CN112926297A (en) Method, apparatus, device and storage medium for processing information
CN113806660B (en) Data evaluation method, training device, electronic equipment and storage medium
CN115795023B (en) Document recommendation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190201