CN113468410A - System for intelligently optimizing search results and search engine - Google Patents

Info

Publication number
CN113468410A
Authority
CN
China
Prior art keywords
search
user
information
results
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110527628.7A
Other languages
Chinese (zh)
Inventor
姜伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Zhizhuo Technology Co ltd
Original Assignee
Hangzhou Zhizhuo Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Zhizhuo Technology Co ltd
Priority to CN202110527628.7A
Publication of CN113468410A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of the internet and discloses a system for intelligently optimizing search results and a search engine. Keywords are extracted from the search requirement description information and used for retrieval, which improves the accuracy of the search results, reduces the number of results irrelevant to what the user wants to find, and makes browsing more convenient. The cluster optimization module performs inter-class ordering and intra-class object ordering based on user behavior analysis: a user interest feature pattern is obtained by analyzing the user's past behavior, each cluster center is compared with the classes to which the user's interests belong, the clusters matching those interest classes are placed first, and the remaining clusters are arranged in order of decreasing user interest, which effectively improves the user's browsing experience.

Description

System for intelligently optimizing search results and search engine
Technical Field
The invention belongs to the technical field of internet, and particularly relates to an intelligent system for optimizing search results and a search engine.
Background
With the continuous development of information technology, the internet has become an important part of people's lives. To find information, a user only needs to open a webpage and enter relevant keywords in a search engine, and the required information can be found in a short time. As internet technology develops and the volume of information grows, people's requirements for using network information keep rising, and the search engine has become an important tool for obtaining it. The user inputs search requirement description information, such as keywords (a query) or images, and the search engine returns search results according to that information.
However, when a user enters too many terms in a search engine, the search engine tends to output too many results that are irrelevant to what the user wants to find. In addition, existing search results cannot be sorted according to the user's degree of interest, so the user has to turn pages frequently and the user experience suffers.
Through the above analysis, the problems and defects of the prior art are as follows:
(1) When too many terms are entered, the search engine outputs too many search results that are irrelevant to what the user wants to find.
(2) Existing search results cannot be sorted according to the user's degree of interest, so the user has to turn pages frequently and the user experience suffers.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a system for intelligently optimizing search results and a search engine.
The invention is realized as follows. A system for intelligently optimizing search results and a search engine includes:
a request acquisition module, a keyword extraction module, a retrieval module, a behavior detection module, a storage module, a behavior analysis module, a cluster optimization module and a sorting module;
the request acquisition module is used for acquiring the search requirement description information of a user and determining at least one search requirement according to the search requirement description information;
the keyword extraction module is used for extracting keywords from the search requirement description information;
the retrieval module is used for searching internet webpages for the keywords extracted by the keyword extraction module;
the behavior detection module is used for inducing and summarizing the user's interests and hobbies, the field the user belongs to and the user's search tendencies according to the user's historical input information and search information, and for detecting and learning the user's behavior;
the storage module is used for storing historical behavior information of the user, and an information database is built from the stored historical behavior information for subsequent behavior analysis;
the behavior analysis module is used for extracting and analyzing the historical behavior information stored in the storage module;
the cluster optimization module is used for optimizing the search results based on a clustering algorithm and the behavior analysis result;
and the sorting module is used for sorting the search results according to the optimization result.
Further, the cluster optimization module optimizes the search results based on a clustering algorithm and the behavior analysis result, specifically as follows:
(1) first, the number of classes c and the fuzziness exponent m are given, and the membership matrix U is initialized with random numbers between 0 and 1 so that it satisfies the constraint in the following formula:
Figure BDA0003066411360000021
where c is the given number of classes and U_ij represents the degree of membership of the ith webpage in the jth class;
(2) the c cluster centers C_i, i = 1, …, c, are calculated with the formula:
Figure RE-GDA0003201641800000031
(3) the objective function is calculated according to the formula:
Figure BDA0003066411360000032
(4) a new matrix U is calculated with the formula:
Figure BDA0003066411360000033
and the process returns to step (2);
(5) the output of the fuzzy C-means clustering algorithm is c cluster-center vectors and a c × n fuzzy partition matrix, which gives the degree of membership of each webpage sample in each class; the class of each webpage sample can then be determined from the partition matrix according to the maximum-membership principle of fuzzy sets.
Further, in step (3), if the objective function is smaller than a given threshold, or its change from the previous objective function value is smaller than a given threshold, the process goes to step (5) and stops.
Further, the extraction method adopted by the keyword extraction module specifically comprises the following steps:
performing word segmentation on all sentence information of the search requirement description information to obtain the word units of the sentence information;
acquiring the word features of the word units, the sentence features of the word units within their sentence information, and the text features of the word units within the search requirement description information;
building a machine learning model from a set number of analysis sentences with a machine learning algorithm, according to the acquired word features, sentence features and text features;
and, based on the machine learning model, extracting keywords from each piece of search requirement description information using the word features, sentence features and text features of the word units in each piece of sentence information.
Further, the request acquisition module includes:
the selection unit is used for showing the optimization information of the search requirement description information for the user to select;
the acquisition unit is used for acquiring a selection instruction of the user, where the selection instruction indicates the search requirement optimization information selected by the user;
and the determining unit is used for determining the search requirement optimization information selected by the user according to the selection instruction.
Further, the behavior analysis method adopted by the behavior analysis module comprises the following steps:
carrying out statistical measurement on user characteristics according to the browsing behavior of a user on a webpage to obtain a series of characteristic data;
analyzing the statistical measurement information, and fitting the statistical measurement information by using a regression analysis method;
analyzing the fitting function, and calculating to obtain a regression equation of the overall characteristics;
and checking the significance of the relationship with the correlation coefficient method, determining the reliability of the regression equation, and obtaining the behavior analysis result.
Further, the fitting by the regression analysis method uses a multiple linear regression model with the formula:
$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + \varepsilon$
where $x_1, x_2, \dots, x_k$ are the k variables; $\beta_0, \dots, \beta_k$ are coefficients; and $\varepsilon$ is a random variable.
Furthermore, the retrieval module also comprises a deduplication unit for removing duplicate search results; the deduplication unit filters out repeated web addresses from the extracted results, collates the data and outputs the results to the user's browser in HTML form.
Further, the method by which the deduplication unit determines duplicate results includes:
if the URLs of two query results are exactly the same, the results are judged to be duplicates;
if the two URLs differ only in the final file name and are otherwise identical, the results are judged to be the same;
if the URLs are completely different but the titles and abstracts are the same, the results are judged to be the same.
Further, the method by which the deduplication unit determines duplicate results also includes: if the URLs of two query results are completely different but the titles and abstracts are similar, the two results are judged to be the same.
Taken together, the above technical solutions give the invention the following advantages and positive effects:
the keyword extraction module extracts the keywords from the search requirement description information and retrieval is performed with the extracted keywords, which improves the accuracy of the search results, reduces the number of results irrelevant to what the user wants to find, and makes browsing more convenient; the cluster optimization module optimizes the search results based on the clustering algorithm and the behavior analysis result, so that inter-class ordering and intra-class object ordering can be based on user behavior analysis: a user interest feature pattern is obtained by analyzing the user's past behavior, each cluster center is compared with the classes to which the user's interests belong, the clusters matching the user's interest classes are placed first, and the remaining clusters are arranged in order of decreasing user interest, which effectively improves the user's browsing experience.
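The ordering of clusters by user interest described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the text does not say how a cluster center is compared with the user's interest classes, so the vector representation of the interest profile, the use of cosine similarity, and the names `cosine` and `order_clusters` are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def order_clusters(centers, interest):
    """Return cluster indices, the cluster closest to the interest vector first.

    centers  -- one vector per cluster center (assumed representation)
    interest -- a vector summarising the user's interests (assumed representation)
    """
    return sorted(range(len(centers)),
                  key=lambda i: cosine(centers[i], interest),
                  reverse=True)

# Three cluster centers in a two-feature space; the user mostly cares about feature 2.
print(order_clusters([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 1.0]))
```

Within each cluster, the results could be ordered in the same spirit, for example by their degree of membership in that cluster; that choice is likewise an assumption.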
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the embodiments of the present application will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a block diagram of a system for intelligently optimizing search results and a search engine according to an embodiment of the present invention.
Fig. 2 is a flowchart of the extraction method adopted by the keyword extraction module according to the embodiment of the present invention.
Fig. 3 is a flowchart of the behavior analysis method adopted by the behavior analysis module according to the embodiment of the present invention.
Fig. 4 is a flowchart of a method for determining a duplicate result by a deduplication unit according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In view of the problems in the prior art, the present invention provides a system for intelligently optimizing search results and search engines, and the present invention is described in detail below with reference to the accompanying drawings.
As shown in fig. 1, the system for intelligently optimizing search results and a search engine provided in the embodiment of the present invention includes: a request acquisition module 1, a keyword extraction module 2, a retrieval module 3, a behavior detection module 4, a storage module 5, a behavior analysis module 6, a cluster optimization module 7 and a sorting module 8.
The request acquisition module 1 is used for acquiring search requirement description information of a user and determining at least one search requirement according to the search requirement description information;
the request acquisition module 1 includes:
the selection unit 11 is configured to present optimization information of the search requirement description information for the user to select;
an obtaining unit 12, configured to obtain a selection instruction of a user, where the selection instruction indicates search requirement optimization information selected by the user;
and the determining unit 13 is configured to determine the search requirement optimization information selected by the user according to the selection instruction.
The keyword extraction module 2 is used for extracting keywords in the search requirement description information;
the retrieval module 3 is used for retrieving the keywords extracted by the keyword extraction module in the internet webpage;
the behavior detection module 4 is used for inducing and summarizing the user's interests and hobbies, the field the user belongs to and the user's search tendencies according to the user's historical input information and search information, and for detecting and learning the user's behavior;
the storage module 5 is used for storing historical behavior information of the user, and an information database is constructed through the stored historical behavior information and is used for subsequent behavior analysis;
the behavior analysis module 6 is used for extracting and analyzing the historical behavior information stored in the storage module;
the clustering optimization module 7 is used for optimizing the search result based on a clustering algorithm and a behavior analysis result;
and the sorting module 8 is used for sorting the search results according to the optimization results.
The cluster optimization module in the embodiment of the invention optimizes the search results based on a clustering algorithm and the behavior analysis result, specifically as follows:
(1) first, the number of classes c and the fuzziness exponent m are given, and the membership matrix U is initialized with random numbers between 0 and 1 so that it satisfies the constraint in the following formula:
Figure BDA0003066411360000071
where c is the given number of classes and U_ij represents the degree of membership of the ith webpage in the jth class;
(2) the c cluster centers C_i, i = 1, …, c, are calculated with the formula:
Figure RE-GDA0003201641800000072
(3) the objective function is calculated according to the formula:
Figure BDA0003066411360000073
(4) a new matrix U is calculated with the formula:
Figure BDA0003066411360000074
and the process returns to step (2);
(5) the output of the fuzzy C-means clustering algorithm is c cluster-center vectors and a c × n fuzzy partition matrix, which gives the degree of membership of each webpage sample in each class; the class of each webpage sample can then be determined from the partition matrix according to the maximum-membership principle of fuzzy sets.
In step (3) in the embodiment of the present invention, if the objective function is smaller than a given threshold, or its change from the previous objective function value is smaller than a given threshold, the process goes to step (5) and stops.
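Steps (1) to (5) can be sketched in a few lines of plain Python. The formula images are not reproduced in this text, so the sketch assumes they are the standard fuzzy C-means equations: each sample's memberships sum to 1, each center is the mean of the samples weighted by membership raised to the power m, the objective is the membership-weighted sum of squared distances, and memberships are updated from relative distances. The index convention `u[i][j]` (cluster i, sample j) follows the c × n partition matrix of step (5); the function name, the default values and the use of squared Euclidean distance are illustrative assumptions.

```python
import random

def fuzzy_c_means(points, c, m=2.0, tol=1e-5, max_iter=100, seed=0):
    """Minimal fuzzy C-means sketch; returns (centers, u, labels)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Step (1): random memberships, normalised so each sample's column sums to 1.
    u = [[rng.random() for _ in range(n)] for _ in range(c)]
    for j in range(n):
        s = sum(u[i][j] for i in range(c))
        for i in range(c):
            u[i][j] /= s
    prev, centers = None, []
    for _ in range(max_iter):
        # Step (2): each center is the mean of the samples weighted by u^m.
        centers = []
        for i in range(c):
            w = [u[i][j] ** m for j in range(n)]
            total = sum(w)
            centers.append([sum(w[j] * points[j][d] for j in range(n)) / total
                            for d in range(dim)])
        # Squared Euclidean distance from every center to every sample.
        d2 = [[sum((p - q) ** 2 for p, q in zip(points[j], centers[i]))
               for j in range(n)] for i in range(c)]
        # Step (3): objective value and the two stopping tests.
        obj = sum(u[i][j] ** m * d2[i][j] for i in range(c) for j in range(n))
        if obj < tol or (prev is not None and abs(prev - obj) < tol):
            break
        prev = obj
        # Step (4): membership update, then back to step (2).
        for j in range(n):
            zero = [i for i in range(c) if d2[i][j] == 0.0]
            if zero:  # sample sits exactly on one or more centers
                for i in range(c):
                    u[i][j] = 1.0 / len(zero) if i in zero else 0.0
                continue
            inv = [d2[i][j] ** (-1.0 / (m - 1.0)) for i in range(c)]
            s = sum(inv)
            for i in range(c):
                u[i][j] = inv[i] / s
    # Step (5): maximum-membership principle picks one class per sample.
    labels = [max(range(c), key=lambda i: u[i][j]) for j in range(n)]
    return centers, u, labels
```

In this sketch each webpage would first have to be turned into a numeric feature vector; how that is done is not described in the text and is left open here.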
As shown in fig. 2, the extraction method adopted by the keyword extraction module in the embodiment of the present invention specifically includes:
S101, performing word segmentation on all sentence information of the search requirement description information to obtain the word units of the sentence information;
S102, acquiring the word features of the word units, the sentence features of the word units within their sentence information, and the text features of the word units within the search requirement description information;
S103, building a machine learning model from a set number of analysis sentences with a machine learning algorithm, according to the acquired word features, sentence features and text features;
and S104, based on the machine learning model, extracting keywords from each piece of search requirement description information using the word features, sentence features and text features of the word units in each piece of sentence information.
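The flow of S101 to S104 can be illustrated with a toy sketch. Everything specific in it is an assumption: the text does not name the features or the learning algorithm, so the regex tokenizer (a real system for Chinese text would need a proper word segmenter), the four features, the stopword list and the hand-set `WEIGHTS`, which stand in for the model that S103 would train, are all hypothetical.

```python
import re
from collections import Counter

# Illustrative stopword list, not taken from the source.
STOPWORDS = {"a", "an", "the", "of", "for", "in", "to", "and", "is", "how"}

def segment(text):
    """S101: split into sentences, then into lowercase word units."""
    sentences = [s for s in re.split(r"[.!?;]+", text) if s.strip()]
    return [re.findall(r"[A-Za-z0-9]+", s.lower()) for s in sentences]

def features(sentences):
    """S102: one feature row per distinct word unit (first occurrence wins)."""
    tf = Counter(w for s in sentences for w in s)
    total = sum(tf.values())
    rows = {}
    for s in sentences:
        for pos, w in enumerate(s):
            if w in rows:
                continue
            rows[w] = {
                "length": len(w),                      # word feature
                "is_stopword": float(w in STOPWORDS),  # word feature
                "sent_pos": pos / max(len(s) - 1, 1),  # sentence feature
                "tf": tf[w] / total,                   # text feature
            }
    return rows

# Placeholder weights standing in for the model trained in S103 (assumed values).
WEIGHTS = {"length": 0.1, "is_stopword": -5.0, "sent_pos": 0.0, "tf": 3.0}

def extract_keywords(text, top_k=3):
    """S104: score every word unit with the (placeholder) model, keep the top_k."""
    rows = features(segment(text))

    def score(w):
        return sum(WEIGHTS[name] * value for name, value in rows[w].items())

    return sorted(rows, key=lambda w: (-score(w), w))[:top_k]

print(extract_keywords("How to train a fuzzy clustering model. "
                       "Fuzzy clustering of web pages.", top_k=2))
```

In a fuller version, the weights would be learned from labelled analysis sentences rather than set by hand.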
As shown in fig. 3, the behavior analysis method adopted by the behavior analysis module in the embodiment of the present invention includes:
S201, carrying out statistical measurement of user characteristics according to the user's browsing behavior on webpages to obtain a series of characteristic data;
S202, analyzing the statistical measurement information and fitting it with a regression analysis method;
S203, analyzing the fitting function and calculating the regression equation of the overall characteristics;
and S204, checking the significance of the relationship with the correlation coefficient method, determining the reliability of the regression equation, and obtaining the behavior analysis result.
In step S202 in the embodiment of the present invention, the fitting by the regression analysis method uses a multiple linear regression model with the formula:
$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + \varepsilon$
where $x_1, x_2, \dots, x_k$ are the k variables; $\beta_0, \dots, \beta_k$ are coefficients; and $\varepsilon$ is a random variable.
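As a rough sketch of S202 to S204, the coefficients of the multiple linear regression model can be estimated by ordinary least squares and the fit checked with the multiple correlation coefficient R. The text names neither the estimation method nor the exact correlation test, so least squares via the normal equations, the use of R, and the function names `solve`, `fit` and `multiple_r` are assumptions made for illustration.

```python
import math

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (the variables are not collinear).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def fit(xs, ys):
    """Least-squares estimate of [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk + e."""
    rows = [[1.0] + list(x) for x in xs]  # leading 1.0 carries the intercept b0
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)

def multiple_r(xs, ys, beta):
    """Multiple correlation coefficient R between observed and fitted y."""
    pred = [beta[0] + sum(b * v for b, v in zip(beta[1:], x)) for x in xs]
    mean = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, pred))
    ss_tot = sum((y - mean) ** 2 for y in ys)
    return math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))

# Toy data generated from y = 1 + 2*x1 + 3*x2 with no noise.
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [1 + 2 * a + 3 * b for a, b in xs]
beta = fit(xs, ys)
print(beta, multiple_r(xs, ys, beta))
```

A value of R close to 1 suggests the regression equation describes the measured behavior well; what threshold counts as reliable is not stated in the text.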
The retrieval module in the embodiment of the invention also comprises a deduplication unit for removing duplicate search results; the deduplication unit filters out repeated web addresses from the extracted results, collates the data and outputs the results to the user's browser in HTML (hypertext markup language) form.
As shown in fig. 4, the method by which the deduplication unit determines duplicate results in the embodiment of the present invention includes:
S301, if the URLs of two query results are exactly the same, the results are judged to be duplicates;
S302, if the two URLs differ only in the final file name and are otherwise identical, the results are judged to be the same;
S303, if the URLs are completely different but the titles and abstracts are the same, the results are judged to be the same;
and S304, if the URLs of the two query results are completely different but the titles and abstracts are similar, the two results are judged to be the same.
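Rules S301 to S304 translate fairly directly into code. The sketch below uses only the Python standard library; the result layout (a dict with `url`, `title` and `summary` keys), the similarity measure (`difflib.SequenceMatcher`) and the 0.9 threshold are assumptions, since the text does not define how "similar" is measured.

```python
from difflib import SequenceMatcher
from urllib.parse import urlsplit

def same_except_filename(url_a, url_b):
    """S302: same scheme, host and directory; only the last path segment may differ.

    Query strings and fragments are ignored in this sketch.
    """
    a, b = urlsplit(url_a), urlsplit(url_b)
    dir_a = a.path.rsplit("/", 1)[0]
    dir_b = b.path.rsplit("/", 1)[0]
    return (a.scheme, a.netloc, dir_a) == (b.scheme, b.netloc, dir_b)

def similar(text_a, text_b, threshold):
    """True if the two strings are at least `threshold` similar (ratio in 0..1)."""
    return SequenceMatcher(None, text_a, text_b).ratio() >= threshold

def is_duplicate(r1, r2, threshold=0.9):
    if r1["url"] == r2["url"]:                                          # S301
        return True
    if same_except_filename(r1["url"], r2["url"]):                      # S302
        return True
    if r1["title"] == r2["title"] and r1["summary"] == r2["summary"]:   # S303
        return True
    return (similar(r1["title"], r2["title"], threshold)                # S304
            and similar(r1["summary"], r2["summary"], threshold))

def deduplicate(results, threshold=0.9):
    """Keep the first result of every group of duplicates, in original order."""
    kept = []
    for r in results:
        if not any(is_duplicate(r, k, threshold) for k in kept):
            kept.append(r)
    return kept
```

Taken literally, S302 also treats two different pages in the same directory as one result; a stricter variant could combine it with a title comparison, but that would go beyond what the text describes.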
The above description is only for the purpose of illustrating the preferred embodiments of the present invention, and the scope of the present invention is not limited thereto, and any modification, equivalent replacement, and improvement made by those skilled in the art within the technical scope of the present invention disclosed herein, which is within the spirit and principle of the present invention, should be covered by the present invention.

Claims (10)

1. A system for intelligently optimizing search results and a search engine, the system comprising:
the request acquisition module is used for acquiring the search requirement description information of the user and determining at least one search requirement according to the search requirement description information;
the keyword extraction module is used for extracting keywords in the search requirement description information;
the retrieval module is used for retrieving the keywords extracted by the keyword extraction module in the internet webpage;
the behavior detection module is used for inducing and summarizing the interests and hobbies of the user, the field of the user and the search tendency of the user according to the historical input information and the search information of the user, and detecting and learning the user behaviors;
the storage module is used for storing historical behavior information of the user, and an information database is constructed through the stored historical behavior information and is used for subsequent behavior analysis;
the behavior analysis module is used for extracting and analyzing the historical behavior information stored in the storage module;
the cluster optimization module is used for optimizing the search result based on a cluster algorithm and a behavior analysis result;
and the sorting module is used for sorting the search results according to the optimization results.
2. The system for intelligently optimizing search results and search engines of claim 1, wherein the cluster optimization module is configured to optimize search results based on a clustering algorithm and behavioral analysis results, and specifically comprises:
(1) first, the number of classes c and the fuzziness exponent m are given, and the membership matrix U is initialized with random numbers between 0 and 1 so that it satisfies the constraint in the following formula:
Figure RE-FDA0003201641790000011
where c is the given number of classes and U_ij represents the degree of membership of the ith webpage in the jth class;
(2) the c cluster centers C_i, i = 1, …, c, are calculated with the formula:
Figure RE-FDA0003201641790000012
(3) the objective function is calculated according to the formula:
Figure RE-FDA0003201641790000021
(4) a new matrix U is calculated with the formula:
Figure RE-FDA0003201641790000022
and the process returns to step (2);
(5) the output of the fuzzy C-means clustering algorithm is c cluster-center vectors and a c × n fuzzy partition matrix, which gives the degree of membership of each webpage sample in each class; the class of each webpage sample can then be determined from the partition matrix according to the maximum-membership principle of fuzzy sets.
3. The system for intelligently optimizing search results and search engines of claim 1 wherein in step (3), if the objective function is less than a certain threshold, or its change from the last objective function value is less than a certain threshold, then the process goes to step (5).
4. The system for intelligently optimizing search results and search engines of claim 1, wherein the extraction method employed by the keyword extraction module specifically comprises:
performing word segmentation on all sentence information of the search requirement description information to obtain the word units of the sentence information;
acquiring word characteristics of word units, sentence characteristics of the word units in corresponding sentence information and text characteristics of the word units in search requirement description information;
establishing a machine learning model by using a set number of analysis sentences based on a machine learning algorithm according to the acquired word characteristics, sentence characteristics and text characteristics;
and performing keyword extraction operation on each piece of search requirement description information by using the word characteristics, the sentence characteristics and the text characteristics of the word unit in each piece of sentence information based on the machine learning model.
5. The system for intelligently optimizing search results and search engines of claim 1, wherein the request acquisition module comprises:
the selection unit is used for showing the optimization information of the search requirement description information for the user to select;
the acquisition unit is used for acquiring a selection instruction of the user, where the selection instruction indicates the search requirement optimization information selected by the user;
and the determining unit is used for determining the search requirement optimization information selected by the user according to the selection instruction.
6. The system for intelligently optimizing search results and search engines of claim 1, wherein the behavior analysis module employs a behavior analysis method comprising:
carrying out statistical measurement on user characteristics according to the browsing behavior of a user on a webpage to obtain a series of characteristic data;
analyzing the statistical measurement information, and fitting the statistical measurement information by using a regression analysis method;
analyzing the fitting function, and calculating to obtain a regression equation of the overall characteristics;
and checking the significance of the relationship with the correlation coefficient method, determining the reliability of the regression equation, and obtaining the behavior analysis result.
7. The system for intelligently optimizing search results and search engines of claim 6 wherein said fitting using regression analysis is using a multiple linear regression model having the formula:
$y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + \varepsilon$
where $x_1, x_2, \dots, x_k$ are the k variables; $\beta_0, \dots, \beta_k$ are coefficients; and $\varepsilon$ is a random variable.
8. The system for intelligently optimizing search results and search engines of claim 1, wherein the retrieval module further comprises a deduplication unit for performing result deduplication on the search results, the deduplication unit filtering out extracted duplicate web addresses, collating data, and outputting the results to the user browser in HTML.
9. The system for intelligently optimizing search results and search engines of claim 8, wherein the method for the deduplication unit to determine duplicate results comprises:
if the URLs of two query results are exactly the same, the results are judged to be duplicates;
if the two URLs differ only in the final file name and are otherwise identical, the results are judged to be the same;
if the URLs are completely different but the titles and abstracts are the same, the results are judged to be the same.
10. The system for intelligently optimizing search results and search engines of claim 9, wherein the method for the deduplication unit determining duplicate results further comprises: if the URLs of the two query results are completely different but the titles and abstracts are similar, the two query results are judged to be the same.
CN202110527628.7A 2021-05-14 2021-05-14 System for intelligently optimizing search results and search engine Withdrawn CN113468410A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110527628.7A CN113468410A (en) 2021-05-14 2021-05-14 System for intelligently optimizing search results and search engine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110527628.7A CN113468410A (en) 2021-05-14 2021-05-14 System for intelligently optimizing search results and search engine

Publications (1)

Publication Number Publication Date
CN113468410A (en) 2021-10-01

Family

ID=77870674

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110527628.7A Withdrawn CN113468410A (en) 2021-05-14 2021-05-14 System for intelligently optimizing search results and search engine

Country Status (1)

Country Link
CN (1) CN113468410A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115358214A (en) * 2022-08-23 2022-11-18 杭州达西信息技术有限公司 Keyword identification method and system based on user browsing and searching behaviors
CN115358214B (en) * 2022-08-23 2024-04-12 深圳铁磁数字科技有限公司 Keyword recognition method and system based on user browsing and searching behaviors

Similar Documents

Publication Publication Date Title
CN100592293C (en) Knowledge search engine based on intelligent noumenon and implementing method thereof
CN107229668B (en) Text extraction method based on keyword matching
JP4944406B2 (en) How to generate document descriptions based on phrases
JP5632124B2 (en) Rating method, search result sorting method, rating system, and search result sorting system
CN1758245B (en) Method and system for classifying display pages using summaries
US7260571B2 (en) Disambiguation of term occurrences
US20040049499A1 (en) Document retrieval system and question answering system
JP2004005668A (en) System and method which grade, estimate and sort reliability about document in huge heterogeneous document set
JP2004005667A (en) System and method which grade, estimate and sort reliability about document in huge heterogeneous document set
CN110543595B (en) In-station searching system and method
CN107506472B (en) Method for classifying browsed webpages of students
CN1983255A (en) Internet searching method
Kim et al. Learning implicit user interest hierarchy for context in personalization
KR101059557B1 (en) Computer-readable recording media containing information retrieval methods and programs capable of performing the information
EP1843257A1 (en) Methods and systems of indexing and retrieving documents
Sivakumar Effectual web content mining using noise removal from web pages
Kim et al. Personalized search results with user interest hierarchies learnt from bookmarks
CN115130601A (en) Two-stage academic data webpage classification method and system based on multi-dimensional feature fusion
CN116775972A (en) Remote resource arrangement service method and system based on information technology
CN113468410A (en) System for intelligently optimizing search results and search engine
CN111898034A (en) News content pushing method and device, storage medium and computer equipment
Rajkumar et al. Users’ click and bookmark based personalization using modified agglomerative clustering for web search engine
CN113516202A (en) Webpage accurate classification method for CBL feature extraction and denoising
CN113468339A (en) Label extraction method, system, electronic device and medium based on knowledge graph
Vishwakarma et al. Web user prediction by: integrating Markov model with different features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20211001
