CN114741627B - Internet-oriented auxiliary information searching method - Google Patents

Publication number
CN114741627B
Authority
CN
China
Prior art keywords
search
entity
terms
result
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210378394.9A
Other languages
Chinese (zh)
Other versions
CN114741627A (en
Inventor
许鲁彦
杨健
肖德政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
32802 Troops Of People's Liberation Army Of China
Original Assignee
32802 Troops Of People's Liberation Army Of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 32802 Troops Of People's Liberation Army Of China
Priority to CN202210378394.9A
Publication of CN114741627A
Application granted
Publication of CN114741627B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/954 Navigation, e.g. using categorised browsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/169 Annotation, e.g. comment data or footnotes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation

Abstract

The application discloses an internet-oriented auxiliary information search method. It relates to the processing and visualization of search history data in the technical field of information retrieval, and supports sharing of search logs among multiple users. The knowledge-graph-based visual interface displays the user's search path visually and provides, for each search term, a plurality of related entities for the user to explore. Both the knowledge-graph-based and the list-based search history visualization interfaces support functions such as user behavior marking and user annotation, so that a user can screen search logs conveniently and quickly. In this manner, a user can quickly review search contents by interacting with the search logs, and retrieval efficiency can be improved by sharing the search logs of other users.

Description

Internet-oriented auxiliary information searching method
Technical Field
The invention relates to the technical field of the internet, and in particular to an internet-oriented auxiliary information searching method.
Background
A difficult search (struggling search) occurs when a user querying a search engine lacks background knowledge about the query content, cannot supply accurate query keywords or identify the target search results, and therefore cannot find useful information in a timely manner.
In terms of user behavior, a difficult search has characteristic features: for example, the user repeatedly enters similar query keywords but rarely clicks through from the Search Engine Results Page (SERP) to view search results. Difficult searches are among the most common user behaviors in information retrieval, and the experience of a difficult search can leave the user dissatisfied or frustrated with the overall search experience, even if the search target is eventually found. Improving the user's search efficiency and resolving the user's search difficulties is therefore very important for the design of internet search systems.
With current internet search systems, it is difficult for users to resolve difficult search tasks merely by repeatedly interacting with the search engine interface. The search engine cannot automatically complete the search keywords or sentences the user needs to solve a difficult search task, cannot supply related background knowledge during the interaction, and cannot establish an effective search log for the user, so the user cannot search the log or extract keywords or sentences from it to form an effective search.
Disclosure of Invention
The invention mainly addresses the following problems: existing internet search systems cannot automatically complete the search keywords or sentences a user needs to solve a difficult search task, cannot supply related background knowledge during the interaction, and cannot establish an effective search log for the user, so the user cannot search the log or extract keywords or sentences from it to form an effective search. To address these problems, the invention discloses an internet-oriented auxiliary information searching method.
The internet-oriented auxiliary information searching method disclosed by the invention comprises the following steps:
grouping the search terms of the user by utilizing a grouping rule;
extracting entity information of the search terms from the grouped search terms, and constructing a search map for the entity information of the search terms by using a correlation criterion;
and carrying out classified display and sharing on the search results and the search maps of the search terms.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, grouping the user's search terms by using the grouping rule includes extracting the search terms from the user's search history data and grouping them according to time intervals or according to the contents of the search terms.
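The time-interval grouping described above can be sketched as follows. This is only an illustration: the patent does not state the interval, so the 30-minute `max_gap`, the function name `group_by_time_gap`, and the `(timestamp, term)` input format are all assumptions.

```python
from datetime import datetime, timedelta


def group_by_time_gap(history, max_gap=timedelta(minutes=30)):
    """Group (timestamp, term) pairs into sessions.

    A new group starts whenever the gap to the previous query exceeds
    max_gap. The 30-minute default is a hypothetical threshold.
    """
    groups = []
    previous_time = None
    for timestamp, term in sorted(history):
        if previous_time is None or timestamp - previous_time > max_gap:
            groups.append([])  # open a new group
        groups[-1].append(term)
        previous_time = timestamp
    return groups


history = [
    (datetime(2022, 4, 1, 9, 0), "knowledge graph"),
    (datetime(2022, 4, 1, 9, 5), "knowledge graph construction"),
    (datetime(2022, 4, 1, 14, 0), "jaccard coefficient"),
]
groups = group_by_time_gap(history)
```

Grouping by the contents of the search terms, the other option named in the text, would need a content similarity measure and is not shown here.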
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the search graph is a knowledge graph formed from the entities of the search terms; the search graph uses the entities of the search terms as nodes and the relationships between the nodes as edges.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the extracting entity information of the search term from the grouped search terms, and constructing the search graph by using the correlation criterion for the entity information of the search term includes extracting entity information of the search term from the grouped search terms by using an entity identification method, as candidate entities of the search term, obtaining search results of the search term, calculating a quality score of each candidate entity, obtaining a visualized entity of the search term by using a semantic association rule of the search term, using the visualized entity of the search term as a node in the search graph, using the search term and the corresponding visualized entity as elements to construct an entity set, calculating a correlation degree function between elements in the entity set, establishing a relationship between elements in the entity set according to the correlation degree function, and using the relationship as an edge in the search graph.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the calculating a quality score of each candidate entity, and obtaining a visualized entity of a search term by using a semantic association rule of the search term includes:
calculating the quality score of each candidate entity from the degree of correlation between the candidate entity and the search results of the search term. The degree of correlation between the i-th candidate entity e_i and the search results of the search term q is expressed jointly by the semantic similarity between e_i and the search results and the degree of correlation between q and the set of concepts describing e_i. The degree of correlation between the i-th candidate entity e_i and the k-th search result of the search term q is expressed as:
Figure BDA0003591116180000021
where
Figure BDA0003591116180000031
represents the degree of correlation between the descriptive concepts related to the candidate entity e_i and the search results of the search term q; {s_j} represents the set of phrases in the search result statements corresponding to e_i; s_j is the j-th phrase, i.e., the j-th descriptive concept, in the search result statements corresponding to e_i; and n is the number of phrases in the search result statements corresponding to e_i. CoO(s_j) is the co-occurrence relevance score between s_j and the search results of the search term q, expressed as:
Figure BDA0003591116180000032
where m is the number of search results in which s_j and q co-occur; frq_m(s_j, q) is the sum of the word frequencies of s_j and q in the m-th search result in which s_j and q co-occur; and con_m(art) is the number of words contained in the m-th search result in which s_j and q co-occur. Coh_J(e_i, q) is the text similarity between the online knowledge base articles corresponding to the candidate entity e_i and to the search term q, and is used to judge the correlation between the search term and different entities in the same search result title.
The correlation values between a candidate entity and all search results of the search term are averaged to obtain the degree of correlation between the candidate entity and the search results of the search term. The several candidate entities with the highest quality scores are then selected as the visualized entities of the search term.
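The averaging and screening step can be illustrated with a short sketch. It assumes the per-result correlation values have already been computed by the formula above (which survives here only as an image), and it treats the averaged value as the quality score; `select_visual_entities`, `top_k`, and the sample numbers are hypothetical.

```python
from statistics import mean


def select_visual_entities(per_result_scores, top_k=3):
    """Average each candidate's per-result correlation values and keep
    the top_k candidates by that average.

    per_result_scores maps a candidate entity to its list of correlation
    values, one per search result. top_k is an assumed parameter; the
    text only says "several".
    """
    quality = {
        entity: mean(scores)
        for entity, scores in per_result_scores.items()
        if scores  # skip candidates with no scored search results
    }
    # Sort by descending quality; break ties alphabetically for determinism.
    ranked = sorted(quality, key=lambda entity: (-quality[entity], entity))
    return ranked[:top_k], quality


scores = {
    "knowledge graph": [0.9, 0.7],
    "search log": [0.2, 0.4],
    "annotation": [0.1],
}
selected, quality = select_visual_entities(scores, top_k=2)
```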
Optionally, the text similarity is obtained by using a Jaccard coefficient.
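As a minimal sketch of the Jaccard coefficient on two texts: the coefficient is the size of the intersection of the two token sets divided by the size of their union. Whitespace tokenization and lowercasing are assumptions made for this example; Chinese text would need a word segmenter instead.

```python
def jaccard(text_a, text_b):
    """Jaccard coefficient of the token sets of two texts.

    Tokens are lowercase whitespace-separated words (an assumption).
    Two empty texts are given similarity 0.0 by convention here.
    """
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


similarity = jaccard("search log sharing", "log sharing interface")
```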
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the constructing an entity set by using the search term and the corresponding visual entity as elements, and calculating a correlation function between the elements in the entity set includes:
In the search results of the search terms, the correlation degree function between elements in the entity set,
Figure BDA0003591116180000033
is calculated as follows:
Figure BDA0003591116180000034
where
Figure BDA0003591116180000035
represents the number of times the elements e_p and e_q of the entity set occur simultaneously in the search results,
Figure BDA0003591116180000036
represents the number of simultaneous occurrences of the two entity set elements that co-occur most often in the search results, lambda represents a reconciliation parameter, num represents a discrimination threshold, and
Figure BDA0003591116180000041
is the context association degree between two entity set elements, which is calculated by the following formula:
Figure BDA0003591116180000042
where I_1 represents the result of judging whether e_p and e_q belong to the same category in the online knowledge base: if e_p and e_q belong to the same category, then I_1 = 1; otherwise, I_1 = 0. I_2 represents the result of judging whether e_p and e_q occur together in the same sentence or phrase in the online knowledge base: if e_p and e_q occur together in the same sentence or phrase, then I_2 = number of co-occurring sentences / number of co-occurring articles; otherwise, I_2 = 0.
The number of co-occurring articles is the number of articles in the online knowledge base in which e_p and e_q co-occur.
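The two indicators I_1 and I_2 can be sketched directly from their definitions. How they are combined into the context association degree is given only by the formula image, so the sketch stops at the indicators. The data layout (a category set per entity, articles as lists of sentences), substring matching, and treating "same category" as sharing at least one category are assumptions.

```python
def same_category_indicator(categories, e_p, e_q):
    """I_1: 1 if e_p and e_q share at least one knowledge base category, else 0."""
    shared = categories.get(e_p, set()) & categories.get(e_q, set())
    return 1 if shared else 0


def sentence_cooccurrence_indicator(articles, e_p, e_q):
    """I_2: co-occurring sentences / co-occurring articles, or 0 if the two
    entities never appear in the same sentence.

    articles is a list of articles, each a list of sentence strings.
    An entity "occurs" where its name is a substring (an assumption).
    """
    co_sentences = 0
    co_articles = 0
    for sentences in articles:
        has_p = any(e_p in sentence for sentence in sentences)
        has_q = any(e_q in sentence for sentence in sentences)
        if has_p and has_q:
            co_articles += 1
        co_sentences += sum(1 for s in sentences if e_p in s and e_q in s)
    if co_sentences == 0:
        return 0.0
    return co_sentences / co_articles


categories = {"Paris": {"city", "capital"}, "France": {"country"}, "Lyon": {"city"}}
articles = [
    ["Paris is the capital of France.", "Paris hosts museums.", "France includes Paris."],
    ["Paris is large.", "France is in Europe."],
    ["France borders Spain.", "Paris has a metro."],
    ["Berlin is in Germany."],
]
i1 = same_category_indicator(categories, "Paris", "Lyon")
i2 = sentence_cooccurrence_indicator(articles, "Paris", "France")
```

In this sample, the two entities share two sentences and appear together in three articles, so I_2 is 2/3.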
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the entity identification method includes a rule-based entity identification method, a dictionary-based entity identification method, and an online knowledge base method.
Optionally, the online knowledge base is Wikipedia, another online encyclopedia, or the like.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the classifying, displaying and sharing the search result of the search term and the search map includes displaying nodes and edges included in the search map, representing the strength of the relationship between the nodes by using the transparency of the edges, and displaying an entity concept and a node corresponding to the entity of the search term according to an input instruction of a user.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the classified display of the search results and the search map of the search terms includes: displaying the user's search terms and the corresponding search results as a list; marking and displaying the search results on which the user has operated; and, according to information the user inputs for a search result, adding the corresponding information to the search result and displaying it.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the sharing of the search result of the user search term includes: creating a retrieval theme, adding retrieval words or search results related to the retrieval theme into the retrieval theme, and sharing the related retrieval theme according to the retrieval requirements of users.
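One way to picture a shareable retrieval topic is a small record holding search terms, annotated results, and the users it is shared with. This is a hypothetical data model for illustration only; the class `RetrievalTopic`, its fields, and its methods are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class RetrievalTopic:
    """Hypothetical container for a retrieval topic and its shared content."""
    name: str
    owner: str
    search_terms: list = field(default_factory=list)
    results: list = field(default_factory=list)
    shared_with: set = field(default_factory=set)

    def add_term(self, term):
        # Keep each search term once, in insertion order.
        if term not in self.search_terms:
            self.search_terms.append(term)

    def add_result(self, url, annotation=""):
        # Store a search result with an optional user annotation.
        self.results.append({"url": url, "annotation": annotation})

    def share(self, user):
        self.shared_with.add(user)

    def visible_to(self, user):
        return user == self.owner or user in self.shared_with


topic = RetrievalTopic(name="knowledge graph basics", owner="alice")
topic.add_term("knowledge graph")
topic.add_term("knowledge graph")
topic.add_result("https://example.org/kg", annotation="good overview")
topic.share("bob")
```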
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the classified display and sharing of the search result and the search map of the search term includes displaying a search result page, a search result web page, a search map page of the search term, a search history interface based on a list, and a search result annotation page, respectively, where the search result page displays summary information of the search result of the search term, a link corresponding to the search result web page, a link of the annotation of the search result by the user, a link of the search map page of the search term, and a link of the search history interface based on the list, the search result web page is a page to which a user jumps after clicking the link of the search result web page on the search result page, and displays specific information included in each search result of the search term, the search result annotation page displays annotation information of the search result of the search term by the user, the search map page of the search term displays the search history interface of the search term based on the list.
Optionally, the relationship between the nodes includes a relationship between entities of the search term, a relationship between entity concepts, and a relationship between the search term entity and the entity concepts.
Optionally, whether any nodes of the knowledge graph are connected or not represents whether a semantic relationship exists between the two concepts, and the line segment transparency represents the strength of the semantic relationship between the concepts.
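The use of line transparency to convey relationship strength can be sketched as a mapping from a strength value to an edge opacity. The sketch assumes strengths are normalized to [0, 1], that stronger relationships are drawn more opaque, and that pairs with zero strength are left unconnected; `edge_opacity`, `build_graph_view`, and the opacity range are hypothetical.

```python
def edge_opacity(strength, min_alpha=0.15, max_alpha=1.0):
    """Map a relationship strength in [0, 1] to an opacity value.

    Values outside [0, 1] are clamped. min_alpha keeps weak edges faintly
    visible; both bounds are assumed values.
    """
    clamped = max(0.0, min(1.0, strength))
    return min_alpha + (max_alpha - min_alpha) * clamped


def build_graph_view(nodes, relevance):
    """Build a plain dict describing nodes and edges for a renderer.

    relevance maps (entity_a, entity_b) pairs to strengths; pairs with a
    strength of zero or less get no edge.
    """
    edges = [
        {"source": a, "target": b, "opacity": round(edge_opacity(weight), 3)}
        for (a, b), weight in sorted(relevance.items())
        if weight > 0
    ]
    return {"nodes": sorted(nodes), "edges": edges}


view = build_graph_view(
    {"knowledge graph", "entity", "search log"},
    {("entity", "knowledge graph"): 0.8, ("entity", "search log"): 0.0},
)
```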
Optionally, in the classified display of the search results of the user's search terms, the information the user inputs for the search results includes the user's marking information and annotation information for the search results.
The beneficial effects of the invention are as follows:
According to the visual design scheme for the search engine history interface, the search history is visualized based on a knowledge graph, the user's search path is presented in graph form, and functions such as concept association nodes provide the user with several authoritative concepts most relevant to the field of the search terms. This helps the user quickly acquire the domain background knowledge related to the information need, so that the user can efficiently locate the information need and find answers even when domain background knowledge is lacking and the information need cannot be clearly described through search terms.
The method provides a visual design scheme for the search engine history interface and supports functions such as search result annotation and collaborative search groups. When background knowledge is insufficient and a search task cannot be completed through individual exploration, search result annotation and the sharing of group members' search history let the user obtain direct clues from the search logs, search results, and search result annotations of other users who previously had the same information need, which greatly improves the efficiency of solving difficult search tasks.
Drawings
FIG. 1 is a flow chart of search term analysis and knowledge graph construction based on search history for a search engine according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a search engine history interface architecture according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram illustrating a design of interaction function of a search result page of a search engine according to an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating a design of interaction functions of a historical interface of a search engine according to an embodiment of the present application;
FIG. 5 is a schematic diagram illustrating an alternative search engine history interface interaction function design provided in an embodiment of the present application;
FIG. 6 is a flowchart illustrating interaction of a multi-user search engine history interface according to an embodiment of the present disclosure.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," and the like in the description and claims of the present invention and in the above-described drawings are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, apparatus, product, or apparatus that comprises a list of steps or elements is not limited to those listed but may alternatively include other steps or elements not listed or inherent to such process, method, product, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The following are detailed below.
Example one
The embodiment of the application provides a method for analyzing search terms and constructing a knowledge graph based on search results, wherein the knowledge graph related to the search terms is constructed through a heuristic algorithm based on the search terms of a user in a difficult search process, a Wikipedia knowledge base and the like, so that the user is helped to expand knowledge space, and the search efficiency is improved.
The internet-oriented auxiliary information searching method disclosed by the invention comprises the following steps:
grouping the search terms of the user by utilizing a grouping rule;
extracting entity information of the search terms from the grouped search terms, and constructing a search map for the entity information of the search terms by using a correlation criterion;
and carrying out classified display and sharing on the search results and the search maps of the search terms.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, grouping the user's search terms by using the grouping rule includes extracting the search terms from the user's search history data and grouping them according to time intervals or according to the contents of the search terms.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the search graph is a knowledge graph formed from the entities of the search terms; the search graph uses the entities of the search terms as nodes and the relationships between the nodes as edges.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the extracting entity information of the search term from the grouped search terms, and constructing the search graph by using the correlation criterion for the entity information of the search term includes extracting entity information of the search term from the grouped search terms by using an entity identification method, as candidate entities of the search term, obtaining search results of the search term, calculating a quality score of each candidate entity, obtaining a visualized entity of the search term by using a semantic association rule of the search term, using the visualized entity of the search term as a node in the search graph, using the search term and the corresponding visualized entity as elements to construct an entity set, calculating a correlation degree function between elements in the entity set, establishing a relationship between elements in the entity set according to the correlation degree function, and using the relationship as an edge in the search graph. The scheme can help the user to quickly acquire the field background knowledge related to the information demand, so that the user can efficiently position the information demand and search the answer under the condition that the field background knowledge is not enough in reserve and the information demand cannot be clearly described through the search word.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the calculating a quality score of each candidate entity, and obtaining a visualized entity of a search term by using a semantic association rule of the search term includes:
calculating the quality score of each candidate entity from the degree of correlation between the candidate entity and the search results of the search term. The degree of correlation between the i-th candidate entity e_i and the search results of the search term q is expressed jointly by the semantic similarity between e_i and the search results and the degree of correlation between q and the set of concepts describing e_i. The degree of correlation between the i-th candidate entity e_i and the k-th search result of the search term q is expressed as:
Figure BDA0003591116180000081
where
Figure BDA0003591116180000082
represents the degree of correlation between the descriptive concepts related to the candidate entity e_i and the search results of the search term q; {s_j} represents the set of phrases in the search result statements corresponding to e_i; s_j is the j-th phrase, i.e., the j-th descriptive concept, in the search result statements corresponding to e_i; and n is the number of phrases in the search result statements corresponding to e_i. CoO(s_j) is the co-occurrence relevance score between s_j and the search results of the search term q, expressed as:
Figure BDA0003591116180000083
where m is the number of search results in which s_j and q co-occur; frq_m(s_j, q) is the sum of the word frequencies of s_j and q in the m-th search result in which s_j and q co-occur; and con_m(art) is the number of words contained in the m-th search result in which s_j and q co-occur. Coh_J(e_i, q) is the text similarity between the online knowledge base articles corresponding to the candidate entity e_i and to the search term q, and is used to judge the correlation between the search term and different entities in the same search result title.
The correlation values between a candidate entity and all search results of the search term are averaged to obtain the degree of correlation between the candidate entity and the search results of the search term. The several candidate entities with the highest quality scores are then selected as the visualized entities of the search term.
Optionally, the text similarity is obtained by using a Jaccard coefficient.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the constructing an entity set by using the search term and the corresponding visual entity as elements, and calculating a correlation function between the elements in the entity set includes:
In the search results of the search terms, the correlation degree function between elements in the entity set,
Figure BDA0003591116180000091
is calculated as follows:
Figure BDA0003591116180000092
where
Figure BDA0003591116180000093
represents the number of times the elements e_p and e_q of the entity set occur simultaneously in the search results,
Figure BDA0003591116180000094
represents the number of simultaneous occurrences of the two entity set elements that co-occur most often in the search results, lambda represents a reconciliation parameter, num represents a discrimination threshold, and
Figure BDA0003591116180000095
is the context association degree between two entity set elements, which is calculated by the following formula:
Figure BDA0003591116180000096
where I_1 represents the result of judging whether e_p and e_q belong to the same category in the online knowledge base: if e_p and e_q belong to the same category, then I_1 = 1; otherwise, I_1 = 0. I_2 represents the result of judging whether e_p and e_q occur together in the same sentence or phrase in the online knowledge base: if e_p and e_q occur together in the same sentence or phrase, then I_2 = number of co-occurring sentences / number of co-occurring articles; otherwise, I_2 = 0.
The number of co-occurring articles is the number of articles in the online knowledge base in which e_p and e_q co-occur.
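The two co-occurrence quantities named in the definitions, the number of search results in which a pair of entity set elements appears together and the largest such count over all pairs, can be computed as in the sketch below. The ratio of the two is shown only as one plausible normalization. How the counts enter the correlation degree function together with lambda, num, and the context association degree is given by the formula images and is not reproduced. Substring matching and the function names are assumptions.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(entity_set, search_results):
    """Count, for each unordered pair of entities, the number of search
    results (text strings) that mention both. Matching is by substring."""
    counts = Counter()
    for text in search_results:
        present = sorted(entity for entity in entity_set if entity in text)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


def normalized_cooccurrence(counts):
    """Divide each pair's count by the largest pair count (assumed normalization)."""
    if not counts:
        return {}
    largest = max(counts.values())
    return {pair: count / largest for pair, count in counts.items()}


results = [
    "knowledge graph entity linking",
    "graph search with entity nodes",
    "search logs",
]
counts = cooccurrence_counts({"graph", "entity", "search"}, results)
normalized = normalized_cooccurrence(counts)
```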
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the entity recognition method includes a rule-based entity recognition method, a dictionary-based entity recognition method, and an online knowledge base method.
Optionally, the online knowledge base is Wikipedia, another online encyclopedia, or the like.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the classifying, displaying and sharing the search result of the search term and the search map includes displaying nodes and edges included in the search map, representing the strength of the relationship between the nodes by using the transparency of the edges, and displaying an entity concept and a node corresponding to the entity of the search term according to an input instruction of a user.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the classified display of the search results and the search map of the search terms includes: displaying the user's search terms and the corresponding search results as a list; marking and displaying the search results on which the user has operated; and, according to information the user inputs for a search result, adding the corresponding information to the search result and displaying it.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the sharing of the search result of the user search term includes: creating a retrieval theme, adding a retrieval word or a search result related to the retrieval theme into the retrieval theme, and sharing the related retrieval theme according to the retrieval requirements of a user.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the classified display and sharing of the search result and the search map of the search term includes displaying a search result page, a search result web page, a search map page of the search term, a search history interface based on a list, and a search result annotation page, respectively, where the search result page displays summary information of the search result of the search term, a link corresponding to the search result web page, a link of the annotation of the search result by the user, a link of the search map page of the search term, and a link of the search history interface based on the list, the search result web page is a page to which a user jumps after clicking the link of the search result web page on the search result page, and displays specific information included in each search result of the search term, the search result annotation page displays annotation information of the search result of the search term by the user, the search map page of the search term displays the search history interface of the search term based on the list.
Optionally, the relationship between the nodes includes a relationship between entities of the search term, a relationship between entity concepts, and a relationship between the search term entity and the entity concepts.
Optionally, whether any two nodes of the knowledge graph are connected represents whether a semantic relationship exists between the two concepts, and the transparency of the connecting line segment represents the strength of that semantic relationship.
Optionally, in the classified display of the search results of the user's search terms, the information the user inputs for a search result includes the user's label information and annotation information for that search result.
In the visualization design scheme for the search engine history interface, the search history is visualized on the basis of a knowledge graph and the user's search path is presented as a graph. Functions such as concept association nodes provide the user with several authoritative concepts most relevant to the field of the search terms, helping the user quickly acquire the domain background knowledge related to the information need, so that the user can efficiently pin down the information need and find answers even when domain background knowledge is insufficient and the information need cannot be clearly described through search terms.
In this optional embodiment, an embodiment of the present application provides a graph-based visualization design scheme for a search engine history interface, which is applied to the display of a search engine's search logs and search results and to interaction with the user. The design scheme includes a search log partition, a search map partition, and a quick review partition.
In the search log partition, the historical data generated by the user's searches is automatically grouped according to time intervals and presented on the interface as unit groups. In the search map partition, each group of search logs from the search log partition is presented visually as a knowledge graph.
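As an illustration of grouping by time interval, the following minimal sketch splits a search log into groups whenever the gap between consecutive search terms exceeds a threshold. The function name `group_by_time_gap`, the record layout, and the 30-minute threshold are assumptions made for this example; the text does not specify a particular interval.

```python
from datetime import datetime, timedelta


def group_by_time_gap(log, max_gap=timedelta(minutes=30)):
    """Split (timestamp, search_term) records into groups.

    A new group starts whenever the gap to the previous record exceeds
    max_gap. The 30-minute default is an assumed value.
    """
    groups = []
    for ts, term in sorted(log):
        if groups and ts - groups[-1][-1][0] <= max_gap:
            groups[-1].append((ts, term))   # continue the current group
        else:
            groups.append([(ts, term)])     # gap too large: open a new group
    return groups


log = [
    (datetime(2022, 4, 12, 9, 0), "knowledge graph"),
    (datetime(2022, 4, 12, 9, 10), "entity linking"),
    (datetime(2022, 4, 12, 14, 0), "graph visualization"),
]
groups = group_by_time_gap(log)
for number, group in enumerate(groups, start=1):
    print(number, [term for _, term in group])
```

Each resulting group would correspond to one unit group in the search log partition and to one knowledge graph in the search map partition.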
The knowledge graph visualization comprises the user's search term nodes and the concept association nodes connected to them. The search term nodes are connected to one another and represent the search path the user followed in looking for answers. Whether any two nodes of the knowledge graph are connected represents whether a semantic relationship exists between the two concepts, and the transparency of the connecting line segment represents the strength of that relationship. The search map partition lets the user interact with the knowledge graph with the mouse, specifically by dragging or pinning a node, hovering over a node to filter and view the sub-graph connected to it, clicking a node, and other interactions.
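The two display rules in this paragraph (transparency encodes relationship strength, and hovering filters the sub-graph around a node) can be sketched as follows. This is only an illustrative sketch: the linear mapping, the minimum opacity `floor`, and the function names are assumptions, since the text does not state how strength is converted to transparency.

```python
def edge_opacity(strength, max_strength, floor=0.15):
    """Map a relationship strength to an opacity in [floor, 1.0].

    Stronger relationships are drawn more opaque; `floor` (assumed) keeps
    weak edges faintly visible.
    """
    if max_strength <= 0:
        return floor
    ratio = min(max(strength / max_strength, 0.0), 1.0)
    return floor + (1.0 - floor) * ratio


def connected_subgraph(edges, node):
    """Return only the edges that touch `node`, as a hover filter might."""
    return {pair: weight for pair, weight in edges.items() if node in pair}


# Hypothetical edge weights between search term and concept nodes.
edges = {
    ("search engine", "knowledge graph"): 1.0,
    ("knowledge graph", "entity"): 0.5,
    ("search engine", "query log"): 0.25,
}
strongest = max(edges.values())
opacity = {pair: edge_opacity(w, strongest) for pair, w in edges.items()}
hovered = connected_subgraph(edges, "knowledge graph")
```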
The quick review partition contains the search result entries related to the user's search terms. Clicking any node of the knowledge graph in the search map partition shows, in the quick review partition, the entries that contain that node's entity, and clicking an entry jumps directly to the corresponding page.
In another aspect, an embodiment of the present application provides a list-form visualization scheme for the search engine history interface, which lets users quickly screen for useful search logs. In the list-form search log interface, search results the user has clicked are marked with a small hand icon, and search results the user has browsed are displayed as highlighted entries. The user can freely switch between the two search log presentation forms by clicking the page icon.
In another aspect, an embodiment of the present application provides a search result annotation function, applied to the list-based search result page, with which the user can add personal annotations such as comments and labels directly in the search result page. The annotation function lets the user add, edit, and delete text information on the web page content of a search result. Annotated search result pages are saved in PDF form and flagged with a bookmark on the corresponding entries of the search result pages (SERPs).
In another aspect, an embodiment of the present application provides a search log sharing function, applied to collaborative search among users; that is, it helps a user draw on the search term records and search result records of other users during a search. The function comprises a theme group creation module, a theme group search module, a theme group joining/quitting module, and a content adding module. The theme group creation module, the theme group search module, and the theme group joining/quitting module are used, respectively, for creating a search theme group, searching for a theme group, and joining or quitting a theme group; the content adding module is used by the user to add useful search results or search terms to a theme group the user has joined and thereby share information with the other members of the theme group.
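The four modules can be pictured with a small in-memory sketch. The class and method names (`ThemeGroup`, `ThemeGroupRegistry`, `create`, `search`, `join`, `leave`, `add_content`), the substring-based group search, and the membership check are all hypothetical choices for illustration; the text does not describe storage, matching, or permission rules.

```python
from dataclasses import dataclass, field


@dataclass
class ThemeGroup:
    name: str
    members: set = field(default_factory=set)
    shared_items: list = field(default_factory=list)  # (user, kind, content)


class ThemeGroupRegistry:
    """In-memory stand-in for the four modules (hypothetical design)."""

    def __init__(self):
        self._groups = {}

    def create(self, name, creator):
        # Theme group creation module: the creator becomes the first member.
        if name in self._groups:
            raise ValueError(f"theme group {name!r} already exists")
        self._groups[name] = ThemeGroup(name, {creator})
        return self._groups[name]

    def search(self, keyword):
        # Theme group search module: case-insensitive substring match (assumed).
        return sorted(n for n in self._groups if keyword.lower() in n.lower())

    def join(self, name, user):
        # Joining half of the joining/quitting module.
        self._groups[name].members.add(user)

    def leave(self, name, user):
        # Quitting half of the joining/quitting module.
        self._groups[name].members.discard(user)

    def add_content(self, name, user, kind, content):
        # Content adding module: only members may share (assumed rule).
        group = self._groups[name]
        if user not in group.members:
            raise PermissionError(f"{user!r} has not joined {name!r}")
        if kind not in ("search_term", "search_result"):
            raise ValueError("kind must be 'search_term' or 'search_result'")
        group.shared_items.append((user, kind, content))


registry = ThemeGroupRegistry()
registry.create("graph databases", "alice")
registry.join("graph databases", "bob")
registry.add_content("graph databases", "bob", "search_term", "property graph model")
```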
In another aspect, an embodiment of the present application provides a visualization scheme for search result pages (SERPs), in which functions for adding a search result to a group and for annotating a search result are added to each entry on the search result page. With these functions, users can tag search results differently according to their relevance to the search term (information need), add annotation labels, and share results with collaborative groups. The search result page also provides access to the search history interface.
In the visualization design scheme for the search engine history interface, the search history is visualized on the basis of a knowledge graph and the user's search path is presented as a graph. Functions such as concept association nodes provide the user with several authoritative concepts most relevant to the field of the search terms, helping the user quickly acquire the domain background knowledge related to the information need, so that the user can efficiently pin down the information need and find answers even when domain background knowledge is insufficient and the information need cannot be clearly described through search terms.
This embodiment provides a visualization design scheme for the search engine history interface that implements functions such as search result annotation and collaborative search groups. When background knowledge is insufficient and a search task cannot be completed through personal exploration, annotating search results and sharing the search histories of group members help the user obtain direct clues from the search logs, search results, and search result annotations of other users who previously had the same information need, which greatly improves the efficiency of solving difficult search tasks.
EXAMPLE II
The embodiment discloses an auxiliary information searching method facing to the Internet, which comprises the following steps:
grouping the search terms of the user by utilizing a grouping rule;
extracting entity information of the search terms from the grouped search terms, and constructing a search map for the entity information of the search terms by using a correlation criterion;
and carrying out classified display and sharing on the search results and the search maps of the search terms.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, grouping the user's search terms by using the grouping rule includes extracting the search terms from the user's search history data and grouping them according to the time intervals between them or according to their contents.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the search map is a knowledge graph formed from the entities of the search terms; the search map takes the entities of the search terms as nodes and the relationships between the nodes as edges.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, extracting the entity information of the search terms from the grouped search terms and constructing the search map from that entity information by using the correlation criterion includes: extracting entity information from the grouped search terms by an entity recognition method to serve as candidate entities of the search terms; obtaining the search results of the search terms; calculating a quality score for each candidate entity and obtaining the visualized entities of the search terms by using the semantic association rule of the search terms; taking the visualized entities of the search terms as nodes in the search map; constructing an entity set with the search terms and their corresponding visualized entities as elements; calculating a correlation function between the elements of the entity set; and establishing relationships between the elements of the entity set according to the correlation function and taking those relationships as edges in the search map.
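The last steps of this construction (entity set, pairwise correlation, edges) can be sketched as below. The function `build_search_map`, the `relevance` callable, and the `min_relevance` cut-off are assumptions for illustration only; the correlation function itself is supplied by the caller and is not defined here.

```python
def build_search_map(term_entities, relevance, min_relevance=0.0):
    """Build nodes and weighted edges for a search map.

    term_entities: {search_term: [visualized entities]}
    relevance:     callable (a, b) -> float, supplied by the caller
    min_relevance: assumed cut-off; pairs scoring at or below it get no edge
    """
    nodes, edges = {}, {}
    for term, entities in term_entities.items():
        nodes[term] = "search_term"
        for entity in entities:
            nodes.setdefault(entity, "entity")
        # The entity set: the search term plus its visualized entities.
        elements = [term] + [e for e in entities if e != term]
        for i, a in enumerate(elements):
            for b in elements[i + 1:]:
                weight = relevance(a, b)
                if weight > min_relevance:
                    edges[frozenset((a, b))] = weight
    return nodes, edges


# Hypothetical scores standing in for the correlation function.
scores = {
    frozenset(("jaguar speed", "Jaguar")): 0.8,
    frozenset(("Jaguar", "Cheetah")): 0.5,
}
nodes, edges = build_search_map(
    {"jaguar speed": ["Jaguar", "Cheetah"]},
    lambda a, b: scores.get(frozenset((a, b)), 0.0),
)
```

Using a `frozenset` as the edge key treats the relationship as undirected, which is a design choice of this sketch rather than something the text prescribes.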
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the calculating a quality score of each candidate entity, and obtaining a visualized entity of a search term by using a semantic association rule of the search term includes:
the quality score qe_relationship of a candidate entity is obtained by calculating the correlation between the candidate entity and the search term in the specific context, and the calculation formula is:
[Formula image BDA0003591116180000131]
wherein the correlation between the candidate entity e and the search term q is determined jointly by the frequency with which e appears in the search results of q and the semantic relevance between e and q; Freq is the frequency with which the candidate entity e appears in the search results, and avgSL_e is the semantic relevance between the candidate entity e and the search term q, calculated by the following formula:
[Formula image BDA0003591116180000132]
wherein e_i is the search result entry in which the entity e appears for the i-th time,
[Formula image BDA0003591116180000133]
is the semantic relevance score between e_i in the search results and the search term q, and n is the number of candidate entities. The several candidate entities with the highest quality scores are selected as the visualized entities of the search term.
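The scoring and screening step can be sketched as follows. Because the formula images are not reproduced in this text, the way Freq and avgSL_e are combined is an assumption: the sketch multiplies them by default and lets the caller pass a different `combine` function. It also assumes that Freq is the fraction of search results containing the entity and that the per-occurrence scores are non-negative with higher meaning more relevant. The function name and record layout are hypothetical.

```python
from collections import defaultdict


def select_visual_entities(occurrences, num_results, top_k=5,
                           combine=lambda freq, avg_sl: freq * avg_sl):
    """Rank candidate entities and keep the top_k as visualized entities.

    occurrences: iterable of (entity, result_index, score), one record per
                 occurrence of an entity in a search result entry
    combine:     assumed combination of Freq and avgSL_e (product by default)
    """
    scores = defaultdict(list)
    hits = defaultdict(set)
    for entity, result_index, score in occurrences:
        scores[entity].append(score)
        hits[entity].add(result_index)

    quality = {}
    for entity, entity_scores in scores.items():
        freq = len(hits[entity]) / num_results            # assumed definition of Freq
        avg_sl = sum(entity_scores) / len(entity_scores)  # average per-occurrence score
        quality[entity] = combine(freq, avg_sl)

    # Highest quality first; ties broken alphabetically for determinism.
    ranked = sorted(quality, key=lambda e: (-quality[e], e))
    return ranked[:top_k], quality


occurrences = [
    ("Python", 0, 0.9),
    ("Python", 1, 0.7),
    ("Snake", 2, 0.4),
    ("Monty Python", 1, 0.2),
]
selected, quality = select_visual_entities(occurrences, num_results=4, top_k=2)
print(selected)
```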
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the constructing an entity set by using the search term and the corresponding visual entity as elements, and calculating a correlation function between the elements in the entity set includes:
the correlation function between two visualized entities in the search results of the search term,
[Formula image BDA0003591116180000134]
is calculated as:
[Formula image BDA0003591116180000135]
wherein
[Formula image BDA0003591116180000141]
represents the number of times the candidate entities e_i and e_j occur together in the online knowledge base,
[Formula image BDA0003591116180000142]
represents the co-occurrence count of the two entities that occur together most often in the online knowledge base, and λ represents a reconciliation parameter.
As an optional implementation manner, in the first aspect of the embodiment of the present invention, the entity recognition method includes a rule-based entity recognition method, a dictionary-based entity recognition method, and an online knowledge base method.
Optionally, the online knowledge base is Wikipedia, Baidu Encyclopedia, or the like.
In the visualization design scheme for the search engine history interface provided by this embodiment, the search history is visualized on the basis of a knowledge graph and the user's search path is presented as a graph. Functions such as concept association nodes provide the user with several authoritative concepts most relevant to the field of the search terms, helping the user quickly acquire the domain background knowledge related to the information need, so that the user can efficiently pin down the information need and find answers even when domain background knowledge is insufficient and the information need cannot be clearly described through search terms.
This embodiment provides a visualization design scheme for the search engine history interface that implements functions such as search result annotation and collaborative search groups. When background knowledge is insufficient and a search task cannot be completed through personal exploration, annotating search results and sharing the search histories of group members help the user obtain direct clues from the search logs, search results, and search result annotations of other users who have had the same information need, which greatly improves the efficiency of solving difficult search tasks.
EXAMPLE III
Fig. 1 shows the search term analysis and search-history-based knowledge graph construction method of a search engine according to an embodiment of the present application, which extracts relevant entities from the user's search history data and constructs a knowledge graph. Given the user's search history, the data processing flow and the related algorithms are implemented as follows:
(1) Grouping of user search terms: the search log is divided into several groups of search terms according to the time intervals between search terms.
(2) Saving of search results: for a given search term, the first 20 result entries corresponding to the search term are archived to the system database.
(3) Extraction of entities related to the search term: for a given user search term, related candidate entity concepts in Wikipedia are obtained through FastEntityLinker, and a quality score is calculated for each candidate entity, so that the 5 entities most strongly associated with the meaning of the search term are selected for visualization. Here, the quality score qe_relationship of an entity, that is, the score of the degree of correlation between the entity and the search term in the specific context, is given by the following formula:
[Formula image BDA0003591116180000151]
The association between a given entity e and the search term q is determined by both the frequency with which e occurs in the search result entries and the semantic association between e and q; in general, the larger the qe_relationship value, the higher the degree of correlation between the entity e and the search term q. Freq is the frequency with which entity e appears in the top 20 search result entries (i.e., the entries archived in step (2)), and avgSL_e is the average score of the degree of association between the entity e and the search term q, namely:
[Formula image BDA0003591116180000152]
wherein e_i is the entry in which entity e occurs for the i-th time, and
[Formula image BDA0003591116180000153]
is the FastEntityLinker score between the entity e_i, in the context of that entry, and the search term q. The degree of association is computed for each occurrence of entity e (i.e., each e_i) and averaged to obtain the average degree of association between entity e and the search term q over the first 20 entries.
(4) Relevance score calculation: given a search term and its 5 related entities as an entity set, the relevance between any two related entities of the search term is further calculated in the specific search context, i.e., in the context of the entity set:
[Formula image BDA0003591116180000154]
[Formula image BDA0003591116180000155]
wherein
[Formula image BDA0003591116180000156]
represents the number of times the two entities occur together in Wikipedia, and
[Formula image BDA0003591116180000157]
represents the co-occurrence count of the two entities that occur together most often in the search results. A non-linear function is used to represent the relevance between any two entities in the entity set, so that the influence of common-sense entities (such as human, people) on the score calculation in the context of the set is avoided. Thus, even if the co-occurrence count of one pair of entities in the set is very high, above a threshold (1000), the other pairs of entities in the set can still obtain a meaningful relevance score in the context of the set.
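The idea in step (4), counting co-occurrences and then applying a non-linear function so that one very frequent pair does not flatten the scores of all other pairs, can be illustrated with the sketch below. The patent's own function is contained in formula images that are not reproduced here, so the log-scaled, capped normalization in `damped_relevance` is a stand-in chosen for this example and is not the patented formula; only the cap value of 1000 is taken from the threshold mentioned in the text. Documents are modeled as sets of entity names, which is also an assumption.

```python
import math
from itertools import combinations


def cooccurrence_counts(documents, entities):
    """Count, for each entity pair, the documents that contain both."""
    counts = {pair: 0 for pair in combinations(sorted(entities), 2)}
    for doc in documents:
        present = sorted(e for e in entities if e in doc)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


def damped_relevance(counts, cap=1000):
    """Stand-in non-linear score: log-scaled counts, capped and normalized.

    NOT the patent's formula; it only illustrates how a saturating function
    keeps moderately co-occurring pairs distinguishable.
    """
    top = max(counts.values(), default=0)
    if top == 0:
        return {pair: 0.0 for pair in counts}
    denom = math.log1p(min(top, cap))
    return {pair: math.log1p(min(c, cap)) / denom for pair, c in counts.items()}


documents = [{"a", "b"}, {"a", "b", "c"}, {"b", "c"}]
counts = cooccurrence_counts(documents, {"a", "b", "c"})
relevance = damped_relevance(counts)
```

With a linear normalization, a pair that co-occurs 10 times next to a pair that co-occurs 5000 times would score 0.002; with the capped logarithm it scores about 0.35, which is the kind of effect the paragraph describes.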
Fig. 2 is a schematic diagram of the search engine history interface architecture according to an embodiment of the present disclosure. The search-engine-history-related page design provided by this embodiment comprises search result pages (SERPs), search result web pages, a knowledge-graph-based search history interface, a list-based search history interface, and a search result annotation page. A search result web page is the web page reached by clicking a search result on the search result pages (SERPs); the search result annotation page is the page reached from a search result web page in order to annotate it. Jumps are possible between the search result pages (SERPs) and the search result web pages, between the search result web pages and the search result annotation page, between knowledge graph/list-based search history interfaces, and between the knowledge-graph-based search history interface and the list-based search history interface.
Fig. 3 is a schematic diagram of the interactive function design of the search result pages (SERPs). This embodiment provides a theme group addition function and a search result annotation viewing function on the search result pages (SERPs). (1) is the theme group addition button: clicking button (1) on a search result entry lets the user, via a pop-up prompt, add and save the search result entry and its corresponding search result web page to the corresponding theme group. (2) is the search result annotation label, which displays the user's annotation information for the search result entry; clicking (2) jumps to the search result annotation page, where the complete annotation information for the search result page can be viewed or a new annotation can be added and saved.
Fig. 4 is a schematic diagram of the interactive functional partitions of the knowledge-graph-based search history interface. The interface supports visualization of two versions of the search history: the user's personal search history and the theme group search history. The user switches between the two through button (7). (3) is the interactive search log session list: the area where (3) is located is the functional partition for the interactive search log session list, which displays the search term sessions of the user personally or of other users in the theme group; clicking any search session in the list displays the interactive knowledge graph corresponding to that search log session in the central area of the page, as shown in (5). In (5), the larger nodes are the visualized nodes of the search terms, and the smaller nodes connected to them are the visualized nodes of the entities related to each search term; the search term nodes are connected to form the user's search path, and whether any two nodes in the knowledge graph are connected depends on whether a semantic relationship exists between the two node terms. Clicking any node in (5) shows all search result entries containing that node's keyword in the functional area on the right side of the page. There, (2) is the search result annotation label (described with Fig. 3). (6) is the click mark and browse mark, which record the user's historical behavior: if the user clicked or browsed a search result entry during the search process, the entry carries mark (6) in the search history interface. (4) is the search history interface switch button; by clicking (4) the user switches to the list-based search history interface.
Fig. 5 is a schematic diagram of the interactive function design of another search engine history interface provided in an embodiment of the present application. The interface supports visualization of two versions of the search history: the user's personal search history and the theme group search history. The user switches between the two through button (7). (3) is the interactive search log session list: the area where (3) is located is the functional partition for that list, which displays the search term sessions of the user personally or of other users in the theme group; (2) is the search result annotation label, (6) marks the user's historical behavior, and (4) is the search history interface switch button. The functional details of each component are the same as those of the corresponding components in the knowledge-graph-based search history interface and are described in detail with reference to Fig. 3.
Fig. 6 is a schematic view illustrating an interaction flow between various user interaction interface functions according to an embodiment of the present application.
In the visualization design scheme for the search engine history interface provided by this embodiment, the search history is visualized on the basis of a knowledge graph and the user's search path is presented as a graph. Functions such as concept association nodes provide the user with several authoritative concepts most relevant to the field of the search terms, helping the user quickly acquire the domain background knowledge related to the information need, so that the user can efficiently pin down the information need and find answers even when domain background knowledge is insufficient and the information need cannot be clearly described through search terms.
This embodiment provides a visualization design scheme for the search engine history interface that implements functions such as search result annotation and collaborative search groups. When background knowledge is insufficient and a search task cannot be completed through personal exploration, annotating search results and sharing the search histories of group members help the user obtain direct clues from the search logs, search results, and search result annotations of other users who previously had the same information need, which greatly improves the efficiency of solving difficult search tasks.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not necessarily depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (9)

1. An internet-oriented auxiliary information searching method comprises the following steps:
grouping the search terms of the user by utilizing a grouping rule;
extracting entity information of the search terms from the grouped search terms, and constructing a search map for the entity information of the search terms by using a correlation criterion;
carrying out classified display and sharing on the search results and the search maps of the search terms;
the method for extracting the entity information of the search term from the grouped search terms and constructing the search map for the entity information of the search term by using the relevance criterion comprises the following steps:
extracting entity information of the search terms from the grouped search terms by adopting an entity identification method to serve as candidate entities of the search terms, obtaining search results of the search terms, calculating the quality score of each candidate entity, obtaining the visual entities of the search terms by utilizing the semantic association rule of the search terms, taking the visual entities of the search terms as nodes in a search map, taking the search terms and the corresponding visual entities as elements to construct an entity set, calculating a correlation function among the elements in the entity set, establishing the relationship among the elements in the entity set according to the correlation function, and taking the relationship as the edge in the search map;
the method for calculating the quality score of each candidate entity and obtaining the visual entity of the search term by utilizing the semantic association rule of the search term comprises the following steps:
calculating the quality score of the candidate entity by using the degree of correlation between the candidate entity and the search results of the search term; the degree of correlation between the i-th candidate entity e_i and a search result of the search term q is expressed jointly by the semantic similarity between e_i and the search result and by the degree of correlation between the set of concepts describing e_i and q; the expression of the degree of correlation between the i-th candidate entity e_i and the k-th search result of the search term q is:
[Formula image FDA0003933603710000011]
wherein
[Formula image FDA0003933603710000012]
represents the degree to which the descriptive concepts related to the candidate entity e_i are related to the search result of the search term q; <s_j> represents the set of phrases in the search result statement corresponding to the candidate entity e_i; s_j is the j-th phrase, i.e., the j-th descriptive concept, in the search result statement corresponding to e_i; n is the number of phrases in the search result statement corresponding to the candidate entity e_i; CoO(s_j) is the co-occurrence correlation score of s_j with the search results of the search term q, expressed as:
[Formula image FDA0003933603710000021]
wherein m is the number of search results in which s_j co-occurs with q; frq_m(s_j, q) is the sum of the word frequencies of s_j and q in the m-th search result in which s_j co-occurs with q; con_m(art) is the number of words contained in the m-th search result in which s_j co-occurs with q; Coh_J(e_i, q) is the text similarity between the online knowledge base articles corresponding to the candidate entity e_i and to the search term q, and is used to judge the correlation between different entities in the same search result title and the search term;
averaging the correlation degree values between the same candidate entity and all search results of the search term to obtain the degree of correlation between the candidate entity and the search results of the search term; and selecting the several candidate entities with the highest quality scores as the visualized entities of the search term.
2. The internet-oriented auxiliary information search method of claim 1, wherein the grouping of the user search terms by using the grouping rule comprises:
the search terms are extracted from the user search history data and grouped according to their time intervals or contents.
3. The internet-oriented auxiliary information search method as claimed in claim 1, wherein the search graph is a knowledge graph formed by using entities of the search terms, the search graph has the entities of the search terms as nodes, and the relationships between the nodes as edges.
4. The internet-oriented auxiliary information search method as claimed in claim 3, wherein the relationships between the nodes include relationships between entities of terms, relationships between concepts of entities, and relationships between entities of terms and concepts of entities.
5. The internet-oriented auxiliary information search method as claimed in claim 1, wherein the entity recognition method comprises a rule-based entity recognition method, a dictionary-based entity recognition method and an online knowledge base method.
6. The internet-oriented auxiliary information search method of claim 1, wherein the step of constructing an entity set by using the search terms and the corresponding visual entities as elements and calculating a correlation function between the elements in the entity set comprises:
in the search results of the search term, the correlation function between elements of the entity set,
[Formula image FDA0003933603710000031]
is calculated as:
[Formula image FDA0003933603710000032]
wherein
[Formula image FDA0003933603710000033]
represents the number of times the entity set elements e_p and e_q occur together in the search results;
[Formula image FDA0003933603710000034]
represents the co-occurrence count of the two entity set elements that occur together most often in the search results; λ represents a reconciliation parameter; num represents a discrimination threshold; and
[Formula image FDA0003933603710000035]
the context association degree between two entity set elements, is calculated by the following formula:
[Formula image FDA0003933603710000036]
wherein I_1 represents the result of judging whether e_p and e_q are of the same category in the online knowledge base: if e_p and e_q belong to the same category, then I_1 = 1; otherwise, I_1 = 0; I_2 represents the result of judging whether e_p and e_q appear together in sentences or phrases in the online knowledge base: if e_p and e_q appear in the same sentence or phrase, then I_2 = (number of co-occurring sentences) / (number of co-occurring articles); otherwise, I_2 = 0;
the number of co-occurring articles is the number of articles in the online knowledge base in which e_p and e_q occur together.
7. The internet-oriented auxiliary information search method as claimed in claim 1, wherein the classified display and sharing of the search result of the search term and the search map comprises:
and displaying the nodes and edges contained in the search map, expressing the strength degree of the relationship between the nodes by using the transparency of the edges, and displaying the entity concepts and the nodes corresponding to the search term entities according to the input instruction of a user.
8. The internet-oriented auxiliary information search method as claimed in claim 1, wherein the classified display and sharing of the search result of the search term and the search map comprises:
performing list display on the search terms of the user and the corresponding search results; marking and displaying the search result operated by the user; adding corresponding information to the search result according to the input information of the user to the search result and displaying the corresponding information; creating a retrieval theme, adding retrieval words or search results related to the retrieval theme into the retrieval theme, and sharing the related retrieval theme according to the retrieval requirements of users.
9. The internet-oriented auxiliary information search method according to claim 1, wherein the classified display and sharing of the search result of the search term and the search map comprises:
respectively displaying a search result page, a search result web page, a search map page of the search term, a list-based search history interface, and a search result annotation page; wherein the search result page shows summary information of the search results of the search term, links to the corresponding search result web pages, links to the users' annotations of the search results, a link to the search map page of the search term, and a link to the list-based search history interface; the search result web page is the page reached by clicking a search result web page link on the search result page, and shows the specific information contained in each search result of the search term; the search result annotation page shows the users' annotation information for the search results of the search term; the search map page of the search term shows the search map of the search term; and the list-based search history interface shows all the search results of the search term in list form.
CN202210378394.9A 2022-04-12 2022-04-12 Internet-oriented auxiliary information searching method Active CN114741627B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210378394.9A CN114741627B (en) 2022-04-12 2022-04-12 Internet-oriented auxiliary information searching method

Publications (2)

Publication Number Publication Date
CN114741627A CN114741627A (en) 2022-07-12
CN114741627B true CN114741627B (en) 2023-03-24

Family

ID=82282430

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210378394.9A Active CN114741627B (en) 2022-04-12 2022-04-12 Internet-oriented auxiliary information searching method

Country Status (1)

Country Link
CN (1) CN114741627B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116340468A (en) * 2023-05-12 2023-06-27 华北理工大学 Theme literature retrieval prediction method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104679783A (en) * 2013-11-29 2015-06-03 北京搜狗信息服务有限公司 Network searching method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103365876B (en) * 2012-03-29 2020-04-24 北京百度网讯科技有限公司 Method and equipment for generating network operation auxiliary information based on relational graph
US9703859B2 (en) * 2014-08-27 2017-07-11 Facebook, Inc. Keyword search queries on online social networks
JP7106077B2 (en) * 2016-09-22 2022-07-26 エヌフェレンス,インコーポレイテッド Systems, methods, and computer-readable media for visualization of semantic information and inference of temporal signals that indicate salient associations between life science entities
CN110929038B (en) * 2019-10-18 2023-07-21 平安科技(深圳)有限公司 Knowledge graph-based entity linking method, device, equipment and storage medium
CN111680207B (en) * 2020-03-11 2023-08-04 华中科技大学鄂州工业技术研究院 Method and device for determining search intention of user
CN113987155B (en) * 2021-11-25 2024-03-26 中国人民大学 Conversational retrieval method integrating knowledge graph and large-scale user log

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
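Research on intelligent information search technology based on knowledge graphs and semantic computing; Gao Long et al.; Information Studies: Theory & Application (《情报理论与实践》); 2018-05-10 (No. 07); full text *
A review of research on the application of knowledge graphs in entity retrieval; Ruan Guangce et al.; Library and Information Service (《图书情报工作》); 2020-07-20 (No. 14); full text *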

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant