CN113434696A - Knowledge graph-based search result updating method and device and computer equipment - Google Patents

Info

Publication number
CN113434696A
Authority
CN
China
Prior art keywords
content
entity
search
search content
knowledge graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110720141.0A
Other languages
Chinese (zh)
Inventor
王鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202110720141.0A
Publication of CN113434696A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to the technical field of correlation analysis and discloses a knowledge graph-based search result updating method, which comprises the following steps: acquiring a content entity queried in a knowledge graph by a user based on search content within a preset time period; acquiring the number of clicks of the user on the content entity; calculating a floating relevance weight of the content entity and the search content according to the number of clicks; acquiring an initial relevance weight of the content entity and the search content; calculating a total relevance weight according to the floating relevance weight and the initial relevance weight; replacing the initial relevance weight in the knowledge graph with the total relevance weight to complete the updating of the knowledge graph; and, when the user searches again, returning the search results corresponding to the search content based on the updated knowledge graph. The knowledge graph-based search result updating method, device, and computer equipment solve the technical problems in the prior art that search results in vertical fields are poorly relevant and the user click rate is low.

Description

Knowledge graph-based search result updating method and device and computer equipment
Technical Field
The present application relates to the field of correlation analysis technologies, and in particular, to a method and an apparatus for updating search results based on a knowledge graph, and a computer device.
Background
With the development of internet technology, search recommendation algorithms are applied in more and more fields. Most traditional search recommendation algorithms are based on full-text retrieval: an inverted index is built from the relation between words and content, and when a user initiates a search, keyword matching is performed against the index and the matched content is recommended to the user. Because platform content may be lacking, the keyword match may not be what the user wants, the recalled results may be insufficiently relevant, or the search intention may be unclear, the user click rate is low. In short, because the traditional algorithm only matches keywords against an index and recommends whatever matches, it cannot return accurate results, the relevance of the search results is poor, and the user click rate is low.
Disclosure of Invention
The application mainly aims to provide a method, a device and computer equipment for updating a search result based on a knowledge graph, and aims to solve the technical problems that in the prior art, the correlation of the search result is poor, and the click rate of a user is low.
The application provides a search result updating method based on a knowledge graph, which comprises the following steps:
acquiring a content entity queried in a knowledge graph by a user based on search content within a preset time period;
acquiring the number of clicks of the user on the content entity;
calculating the floating relevance weight of the content entity and the search content according to the number of clicks;
acquiring an initial relevance weight of the content entity and the search content;
calculating a total relevance weight of the content entity and the search content according to the floating relevance weight of the content entity and the search content and the initial relevance weight of the content entity and the search content;
replacing the initial relevance weight of the content entity and the search content in the knowledge-graph by the total relevance weight of the content entity and the search content to finish the updating of the knowledge-graph;
and when the user searches the search content again, returning a search result corresponding to the search content based on the updated knowledge graph.
Further, the step of obtaining the content entity queried by the user in the knowledge-graph based on the search content includes:
acquiring search content input by a user;
carrying out natural language processing on the search content to obtain the intention of the search content;
and querying in a knowledge graph by adopting a BM25 algorithm according to the intention of the search content to obtain a content entity.
Further, the step of performing natural language processing on the search content to obtain the intention of the search content includes:
performing word segmentation on the search content to obtain a word segmentation result of the search content;
performing entity recognition according to the word segmentation result of the search content to obtain an entity recognition result of the search content;
performing syntactic analysis according to the entity identification result of the search content to obtain a syntactic analysis result;
identifying an intent of the search content based on the entity recognition result and the syntactic analysis result.
Further, the step of replacing the initial relevance weight of the content entity and the search content in the knowledge-graph with the total relevance weight of the content entity and the search content comprises:
acquiring all content entities obtained based on the search content;
acquiring the total number of clicks of all content entities;
calculating the floating relevance weight of the content entity and the search content according to the number of clicks on the content entity and the total number of clicks on all the content entities; wherein the calculation formula is $W_{\text{float}} = C_1 / C_{\text{total}}$, where $W_{\text{float}}$ is the floating relevance weight of the content entity and the search content, $C_1$ is the number of clicks on the content entity, and $C_{\text{total}}$ is the total number of clicks on all content entities.
Further, after the step of adjusting the relevance weight of the content entity in the knowledge-graph and the search content according to the click behavior, the method further includes:
acquiring a newly added content entity obtained by the user based on the search content;
performing word segmentation on the newly added content entity to obtain a word segmentation result of the newly added content entity;
performing entity identification according to the word segmentation result of the newly added content entity to obtain an entity identification result of the newly added content entity;
and calculating the initial correlation weight of the newly added content entity and the search content according to the entity identification result of the newly added content entity.
Further, the step of calculating the initial relevance weight of the newly added content entity and the search content according to the entity identification result of the newly added content entity includes:
calculating the relevance scores of the entities in the newly-added content entities and the content entities in the knowledge graph by adopting a TF-IDF algorithm; wherein, the content entity in the knowledge graph is the content entity corresponding to the search content in the knowledge graph;
using the relevance score as an initial relevance weight of the newly added content entity to the search content.
Further, after the step of calculating the initial relevance weight of the newly added content entity and the search content according to the entity identification result of the newly added content entity, the method further includes:
judging whether the initial correlation weight of the newly added content entity and the search content is greater than a set value or not;
if so, establishing the correlation between the newly added content entity and the content entity in the knowledge graph, and storing the newly added content entity in the knowledge graph; and the content entity in the knowledge graph is the content entity corresponding to the search content in the knowledge graph.
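The scoring and threshold check for a newly added content entity can be sketched as follows. This is a minimal illustration, not the patented implementation: the source only states that TF-IDF is used to compute the relevance score, so the cosine comparison of TF-IDF vectors, the smoothed IDF variant, the token lists, and the threshold of 0.2 are all assumptions made for the example.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append({
            # Smoothed IDF (an assumption): ln((1 + N) / (1 + df)) + 1.
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def initial_weight(new_entity_tokens, graph_entity_tokens):
    """Relevance score used as the initial relevance weight (hypothetical helper)."""
    new_vec, graph_vec = tfidf_vectors([new_entity_tokens, graph_entity_tokens])
    return cosine(new_vec, graph_vec)

def should_store(weight, threshold=0.2):
    """Store the new entity only if its weight exceeds the set value (assumed 0.2)."""
    return weight > threshold

# Hypothetical, already segmented entities.
graph_entity = ["equity", "fund", "growth", "product"]
related = ["fund", "product", "index", "tracker"]
unrelated = ["car", "insurance", "claim"]

related_weight = initial_weight(related, graph_entity)
unrelated_weight = initial_weight(unrelated, graph_entity)
```

Under these assumptions, the entity that shares terms with the existing graph entity clears the threshold and would be linked and stored, while the unrelated one scores zero and would be discarded.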
The present application further provides a device for updating search results based on a knowledge graph, comprising:
the first acquisition module is used for acquiring a content entity inquired in the knowledge graph by a user based on search content in a preset time period;
the second acquisition module is used for acquiring the click times of the user on the content entity;
the first calculation module is used for calculating the floating relevance weight of the content entity and the search content according to the click times;
a third obtaining module, configured to obtain an initial relevance weight of the content entity and the search content;
a second calculation module, configured to calculate a total relevance weight of the content entity and the search content according to the floating relevance weight of the content entity and the search content and the initial relevance weight of the content entity and the search content;
a replacing module, configured to replace the initial relevance weight of the content entity and the search content in the knowledge graph with the total relevance weight of the content entity and the search content, so as to complete updating of the knowledge graph;
and the returning module is used for returning a search result corresponding to the search content based on the updated knowledge graph when the user searches the search content again.
The present application further provides a computer device comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the above method when executing the computer program.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the above-mentioned method.
The beneficial effects of the present application are as follows: a corresponding content entity is queried in the knowledge graph based on the user's search content; a floating relevance weight of the search content and the search result is calculated from the number of clicks of the user on the content entity; a total relevance weight is calculated from that floating relevance weight and the acquired initial relevance weight of the search content and the search result; and the initial relevance weight is replaced with the total relevance weight. As a result, when the user searches again, the user obtains results more relevant to the search content, the relevance of search content in the vertical field is improved, and the user click rate in search scenarios is increased, which in turn promotes transaction conversion, helps enterprises improve service quality, and improves customer satisfaction.
Drawings
Fig. 1 is a schematic flowchart of a method for updating search results based on a knowledge graph according to an embodiment of the present application.
Fig. 2 is a schematic structural diagram of a search result updating apparatus based on a knowledge graph according to an embodiment of the present application.
Fig. 3 is a schematic diagram of an internal structure of a computer device according to an embodiment of the present application.
The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
As shown in fig. 1, the present application provides a method for updating search results based on a knowledge graph, including:
S1, acquiring a content entity queried in the knowledge graph by a user based on the search content within a preset time period;
S2, acquiring the number of clicks of the user on the content entity;
S3, calculating the floating relevance weight of the content entity and the search content according to the number of clicks;
S4, acquiring the initial relevance weight of the content entity and the search content;
S5, calculating the total relevance weight of the content entity and the search content according to the floating relevance weight of the content entity and the search content and the initial relevance weight of the content entity and the search content;
S6, replacing the initial relevance weight of the content entity and the search content in the knowledge graph with the total relevance weight of the content entity and the search content, completing the updating of the knowledge graph;
and S7, when the user searches the search content again, returning the search result corresponding to the search content based on the updated knowledge graph.
As described in the above step S1, within a preset time period (for example, one week or one month; it is set according to the specific situation and is not limited here), the user inputs the same search content multiple times. Each time, after the intention of the search content is obtained, a query is performed in the constructed knowledge graph. The knowledge graph includes multiple content entities (a content entity is the content actually shown to the user), entity types, and the association relationships among them, and each content entity has a relevance weight with respect to the search content. Multiple content entities can be queried through this relationship network, and the queried content entities (i.e., the final results of the search) are returned to the front end and shown to the user.
As described in the above steps S2-S3, within a preset time period (for example, one week or one month; it is set according to the specific situation and is not limited here), the user inputs the same search content multiple times and clicks on the same result multiple times, which indicates that the result is strongly related to the search content. Therefore, the number of clicks of the user on one content entity (i.e., one search result) for the same search content is recorded, and a batch job over the user's click counts within the preset time period calculates the floating relevance weight of the current content entity and the user's search content. The floating relevance weight expresses the influence of the user's number of clicks on the relevance weight of the content entity and the search content.
As described in the above steps S4-S6, once the floating relevance weight of the content entity and the search content has been calculated, it reflects only the intention of one user. It cannot be used directly as the relevance weight of the content entity and the search content; it only influences the initial relevance weight. Therefore, the initial relevance weight of the content entity and the search content is acquired, and the total relevance weight is calculated from the floating relevance weight and the initial relevance weight. Because the total relevance weight incorporates the influence of the user's click behavior, it is used as the new initial relevance weight of the content entity and the search content and replaces the original one. When later click behavior has an effect again, the calculation starts from this new initial relevance weight (i.e., the previous total relevance weight).
As described in step S7, when the returned content entities include the content entity the user wants, the user clicks on it, and data such as the search content entered by the user and the click behavior are collected through front-end event tracking. The number of times the user clicks the content, and the association between the content entity and the user's search content, are then recorded, so that the relevance weight between the content entity and the search content in the constructed knowledge graph can be updated from the user's click behavior and content the user is not interested in is not returned. When the user clicks on results returned for the search content, the number of clicks is recorded. A search result (i.e., content entity) that the user clicks indicates that the user is more interested in it or that it is highly relevant to the search content, even though the relevance weight between that content entity and the search content in the knowledge graph may not be high. Adjusting the weight of the corresponding content entity in the knowledge graph according to the user's click behavior therefore improves the relevance of the search results and the user click rate.
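The click recording described above can be illustrated with a short sketch that aggregates tracked click events for one search content within the preset time period. The event layout (a dict with `query`, `entity`, and `time` fields), the function name `count_clicks`, and the one-week default window are assumptions for illustration; the source does not specify a log format.

```python
from collections import Counter
from datetime import datetime, timedelta

def count_clicks(events, query, window_end, window=timedelta(days=7)):
    """Count clicks per content entity for one query inside the time window."""
    window_start = window_end - window
    counts = Counter()
    for event in events:
        in_window = window_start <= event["time"] <= window_end
        if event["query"] == query and in_window:
            counts[event["entity"]] += 1
    return counts

# Hypothetical click log collected by front-end event tracking.
now = datetime(2021, 6, 28)
events = [
    {"query": "fund", "entity": "entity_1", "time": datetime(2021, 6, 25)},
    {"query": "fund", "entity": "entity_1", "time": datetime(2021, 6, 27)},
    {"query": "fund", "entity": "entity_2", "time": datetime(2021, 6, 26)},
    {"query": "fund", "entity": "entity_1", "time": datetime(2021, 6, 1)},        # outside the window
    {"query": "insurance", "entity": "entity_3", "time": datetime(2021, 6, 27)},  # different query
]
clicks = count_clicks(events, "fund", now)
```

The resulting per-entity counts are the inputs that the floating relevance weight calculation needs.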
The accuracy of traditional search recommendation algorithms depends heavily on data volume and on the click behavior of a large number of users, so they suit only general-purpose domains with abundant, varied content and large search volumes, such as the Baidu and Google search engines. Vertical-domain search, by contrast, is characterized by specialized content, a small amount of content, complex and changeable relevance, and a small user search volume, so existing general-domain search algorithms are not suitable for vertical-domain search (such as fund and insurance search), which demands stronger relevance between the search content and the search results. The present application can therefore be applied to vertical-domain search: the relevance between the search content and the search results can be adjusted in real time, improving the relevance of the search results and the user click rate.
In one embodiment, the step of obtaining the content entity queried by the user in the knowledge-graph based on the search content comprises:
S01, acquiring search content input by a user;
S02, carrying out natural language processing on the search content to obtain the intention of the search content;
and S03, querying in the knowledge graph by adopting a BM25 algorithm according to the intention of the search content to obtain a content entity.
As described in step S01 above, the search content input by the user can be a character, a word, or a sentence, i.e., a query: a message sent to a search engine or database to find a specific file, website, record, or series of records. For example, if the user has recently become interested in funds, the user may input "fund", "what fund", and the like.
As described in step S02, the search content input by the user is processed using NLP (Natural Language Processing). NLP is a sub-field of Artificial Intelligence (AI) that applies inference, probabilistic, and statistical methods; in particular, practical grammars are used to analyze long sentences that are prone to ambiguity, and disambiguation of highly ambiguous sentences commonly relies on corpora and Markov models. NLP is chiefly concerned with learning behavior within artificial intelligence, has evolved through machine learning and data mining, and is mainly used to study language problems in human-computer interaction. Natural language processing of the user's search content mainly comprises entity recognition, syntactic analysis, and intention recognition; it finally yields the intention of the search content, i.e., what result the user wants the search to return. In addition, before the search content is processed, it can be determined whether the search content can be recognized, and a monitoring dashboard tracks the recognition status of users' search content in real time. If the search content can be recognized, it is processed. If it cannot, the monitoring dashboard captures the unrecognized search content in real time and periodically sends a notification so that operations staff can supplement the content.
As described in step S03, after the intention of the search content input by the user is obtained, a query is performed in the constructed knowledge graph. The knowledge graph includes a plurality of content entities, entity types, and their association relationships, and each content entity has a relevance weight with respect to the search content. A plurality of content entities can be queried through this relationship network, and the queried content entities (i.e., the final results of the search) are returned to the front end and shown to the user. In some embodiments, the obtained content entities may further be sorted by their relevance weights with respect to the search content and presented to the user in descending order of relevance weight, so that the user can click on the content they want to view.
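The descending-order presentation can be sketched in a few lines. The function name `rank_entities`, the sample weights, and the alphabetical tie-break are assumptions added for the example; the source only requires sorting by relevance weight from largest to smallest.

```python
def rank_entities(weights: dict[str, float]) -> list[str]:
    """Return content entities ordered by relevance weight, largest first.

    Ties are broken alphabetically so the order is deterministic (an assumption).
    """
    return sorted(weights, key=lambda entity: (-weights[entity], entity))

# Hypothetical relevance weights of content entities for one search content.
weights = {"fund_overview": 0.42, "fund_company": 0.31, "fund_news": 0.42, "fund_faq": 0.10}
ranking = rank_entities(weights)
```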
In addition to the content entities themselves, other content entities related to them in the knowledge graph can be returned, as can the words that other users searched for besides the same search content as the current user. These words can be counted, and the most frequent ones displayed to the current user as target words. For example, when a user searches for the Golden Bull Award, the system returns an encyclopedia explanation of the award, associates the fund companies that have won it, dynamically brings in content of other entity types, such as the relevant funds of those companies, according to the weights, and returns the words that other users searched for after searching for the Golden Bull Award.
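The counting of words searched by other users can be sketched as below. The session representation (one list of queries per user), the function name `related_queries`, and the choice to count every other query in a session that contains the current search content are assumptions; the source does not define how co-searched words are grouped.

```python
from collections import Counter

def related_queries(sessions, current_query, top_n=3):
    """Return the top_n queries most often searched alongside current_query."""
    counts = Counter()
    for queries in sessions:
        if current_query in queries:
            # Count every other query in this user's session once per occurrence.
            counts.update(q for q in queries if q != current_query)
    return [query for query, _ in counts.most_common(top_n)]

# Hypothetical search histories of other users.
sessions = [
    ["award", "fund company", "fund ranking"],
    ["award", "fund company"],
    ["insurance", "fund company"],
    ["award", "fund ranking", "fund company"],
]
target_words = related_queries(sessions, "award", top_n=2)
```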
In one embodiment, before the step of obtaining the search content input by the user, the method further includes:
S001, constructing a knowledge graph of the vertical field; wherein the vertical field is a professional field.
As described in the step S001 above, the vertical fields include the fund field, the knowledge question-and-answer field, the refined marketing field, and other professional fields. Constructing the knowledge graph of a vertical field mainly involves acquiring a large amount of data in that field and building a relational network over the data to obtain the corresponding knowledge graph.
In one embodiment, the step of performing natural language processing on the search content to obtain the intention of the search content includes:
S021, performing word segmentation on the search content to obtain a word segmentation result of the search content;
S022, performing entity recognition according to the word segmentation result of the search content to obtain an entity recognition result of the search content;
S023, performing syntactic analysis according to the entity recognition result of the search content to obtain a syntactic analysis result;
S024, identifying the intention of the search content based on the entity recognition result and the syntactic analysis result.
As described in the above steps S021-S024, the search content is segmented using longest-match segmentation with HanLP, or using other modes such as finest-granularity Chinese segmentation, longest-match Chinese segmentation, finest-granularity full-pinyin segmentation, finest-granularity initial-letter segmentation, precise segmentation, single-character segmentation, or pinyin segmentation. Segmentation mainly targets the case where the content input by the user is a sentence. Entity recognition identifies the predefined entity types (person name, organization, place name, etc.) contained in the search content. Syntactic analysis analyzes the syntactic structure of the sentence (subject, predicate, and object) and the dependency relationships among words (coordinate, subordinate, and so on). Intention recognition analyzes the user's search intention based on the search content; the search intention may be navigational (navigating the user to the corresponding field or process), informational (providing information the user wants to know), transactional (providing the user with each step in a process), and the like. For example, if the content input by the user is "the concept of fund", the sentence is segmented into the three words "fund", "of", and "concept". Entity recognition on the segmentation result identifies the two entities "fund" and "concept". Syntactic analysis then mainly determines the order of the entities: because sentences usually take the subject-predicate-object form, the user's search intention can be determined from the order of the entities. In this example, the intention is the result of processing the input "the concept of fund".
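Steps S021-S024 can be pictured with a deliberately simple, rule-based sketch. Everything here is a stand-in chosen for illustration: whitespace splitting replaces a real segmenter such as HanLP, `ENTITY_LEXICON` and `INTENT_KEYWORDS` are hypothetical dictionaries, and the "syntactic analysis" is reduced to recording the order in which entities appear.

```python
# Hypothetical lexicons; a production system would use trained models instead.
ENTITY_LEXICON = {"fund": "product", "company": "organization", "concept": "attribute"}
INTENT_KEYWORDS = {"buy": "transactional", "open": "navigational"}

def recognize_intent(query: str) -> dict:
    # S021: word segmentation (whitespace split stands in for a real segmenter).
    tokens = query.lower().split()
    # S022: entity recognition by dictionary lookup.
    entities = [(token, ENTITY_LEXICON[token]) for token in tokens if token in ENTITY_LEXICON]
    # S023: simplified syntactic analysis that only keeps the entity order.
    entity_order = [token for token, _ in entities]
    # S024: intention type from keywords, defaulting to informational.
    intent = next(
        (INTENT_KEYWORDS[token] for token in tokens if token in INTENT_KEYWORDS),
        "informational",
    )
    return {"tokens": tokens, "entities": entities, "entity_order": entity_order, "intent": intent}

result = recognize_intent("concept of fund")
```

For the input "concept of fund", the sketch finds the entities "concept" and "fund" in that order and falls back to the informational intention type, mirroring the worked example in the text.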
In one embodiment, the step of querying in the knowledge graph by using the BM25 algorithm according to the intention of the search content to obtain the content entity includes:
S031, querying the entity type in the knowledge graph according to the intention of the search content;
S032, returning the content entity corresponding to the entity type.
As described in the above steps S031-S032, the BM25 algorithm is used to obtain the content entity directly from the intention of the user's search content. BM25 is an algorithm for evaluating the relevance between search terms and a document, and it is based on a probabilistic retrieval model. The BM25 formula consists essentially of three parts: 1. the relevance between each word t in the query and the document d; 2. the similarity between the word t and the query; 3. the weight of each word. For example, when searching for "the concept of fund", the user can obtain the desired result through the relationship between the fund and the explanation of the fund (i.e., the concept of fund). In some embodiments, an entity type in the knowledge graph is queried according to the intention of the search content input by the user. One entity type corresponds to a plurality of content entities (a content entity is the content actually shown to the user), and the corresponding content entities are returned according to the entity type. For example, if a user searches for "a company of the fund", the recognized and queried "company" is an entity type; there are many companies, and the specific name of each company is a content entity, so a plurality of entities can be queried through the entity type and returned to the front end for presentation.
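For reference, a common textbook form of BM25 scoring is sketched below. It is not taken from the patent: the parameters `k1=1.5` and `b=0.75`, the non-negative IDF variant, and the sample documents are assumptions, and the query-term-frequency factor (the second of the three parts listed above) is omitted, as is usual for short queries.

```python
import math
from collections import Counter

def bm25_scores(query_terms, documents, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms with Okapi BM25."""
    if not documents:
        return []
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term_freq[term] == 0:
                continue
            # Term weight: a non-negative inverse document frequency.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-document relevance with document length normalization.
            norm = term_freq[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical tokenized descriptions of three content entities.
documents = [
    ["fund", "concept", "explained"],
    ["fund", "company", "list"],
    ["insurance", "product"],
]
scores = bm25_scores(["fund", "concept"], documents)
```

With this data, the first document matches both query terms and scores highest, the second matches only "fund", and the third matches nothing and scores zero.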
In one embodiment, the step of replacing the initial relevance weight of the content entity to the search content in the knowledge-graph with the total relevance weight of the content entity to the search content comprises:
S321, acquiring all content entities obtained based on the search content;
S322, acquiring the total number of clicks on all content entities;
S323, calculating the floating relevance weight of the content entity and the search content according to the number of clicks on the content entity and the total number of clicks on all the content entities; wherein the calculation formula is $W_{\text{float}} = C_1 / C_{\text{total}}$, where $W_{\text{float}}$ is the floating relevance weight of the content entity and the search content, $C_1$ is the number of clicks on the content entity, and $C_{\text{total}}$ is the total number of clicks on all content entities.
As described in steps S321-S323 above, the user obtains a plurality of content entities (i.e., a plurality of search results) based on the input search content, and the user may click on more than one content entity. Until the user closes the search engine or inputs the next search content, the total number of clicks made by the user on the plurality of content entities is recorded. The floating relevance weight of a content entity and the search content is calculated from the total number of clicks and the number of clicks on that content entity, using the formula W_float = C_1 / C_total, where W_float is the floating relevance weight of the content entity and the search content, C_1 is the number of clicks on the content entity, and C_total is the total number of clicks on all content entities. For example, when the user inputs "products of fund", 10 content entities are obtained; the total number of clicks on the 10 content entities is 27, of which content entity No. 1 received 5 clicks, so the floating relevance weight of content entity No. 1 and the search content is 5/27.
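The floating relevance weight calculation can be sketched as follows. The function name `floating_weights` and the handling of a session with no clicks are assumptions; the click counts for entities 2 to 10 are invented so that the total matches the 27 clicks of the example, since the text only gives the count for content entity No. 1.

```python
def floating_weights(clicks):
    """Return W_float = C_1 / C_total for every content entity.

    `clicks` maps an entity identifier to its click count for one search
    session. Returning 0.0 for a session without clicks is an assumption;
    the specification does not cover that case.
    """
    total = sum(clicks.values())
    if total == 0:
        return {entity: 0.0 for entity in clicks}
    return {entity: count / total for entity, count in clicks.items()}


# Ten content entities with 27 clicks in total; entity_1 received 5 of them.
# The remaining counts are hypothetical.
clicks = {
    "entity_1": 5, "entity_2": 4, "entity_3": 3, "entity_4": 3, "entity_5": 3,
    "entity_6": 2, "entity_7": 2, "entity_8": 2, "entity_9": 2, "entity_10": 1,
}
weights = floating_weights(clicks)
```

Here `weights["entity_1"]` equals 5/27, and the floating weights of all entities sum to 1.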
In one embodiment, in the step of calculating the total relevance weight of the content entity and the search content according to the floating relevance weight of the content entity and the search content and the initial relevance weight of the content entity and the search content, the calculation formula is as follows: W_total = W_initial * A + W_float * B, and A + B = 1, wherein W_total is the total relevance weight of the content entity and the search content, W_initial is the initial relevance weight of the content entity and the search content, W_float is the floating relevance weight of the content entity and the search content, and A and B are weighted percentage factors.
As described above, the total relevance weight of the content entity and the search content is calculated as W_total = W_initial * A + W_float * B, with A + B = 1, where W_total is the relevance weight of the content entity and the search content after being influenced by the user's click behavior; W_initial is the initial relevance weight of the content entity and the search content before the user inputs the search content, and it may change after each click behavior of the user; and W_float is the influence of the user's click behavior on the relevance weight of the content entity and the search content, calculated according to steps S321 to S323 above. A and B are weighted percentage factors that set the respective proportions of the initial relevance weight and the floating relevance weight; they may be 50% and 50%, 40% and 60%, or other values, set as needed and not limited herein.
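The weighted combination can be sketched as below. The function name `total_weight`, the check that A and B sum to 1, and the sample initial weight of 0.4 are assumptions for illustration; the floating weight 5/27 is taken from the earlier example, and the 60%/40% split is only one possible choice of A and B.

```python
def total_weight(w_initial, w_float, a=0.5, b=0.5):
    """Return W_total = W_initial * A + W_float * B, requiring A + B = 1."""
    if abs(a + b - 1.0) > 1e-9:
        raise ValueError("A and B must sum to 1")
    return w_initial * a + w_float * b


# Hypothetical initial weight 0.4, floating weight 5/27, A = 60%, B = 40%.
w = total_weight(0.4, 5 / 27, a=0.6, b=0.4)
```

A larger B makes the stored weight react more strongly to recent clicks, while a larger A keeps it closer to the weight already held in the knowledge graph.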
In one embodiment, after the step of adjusting the relevance weights of the content entities in the knowledge-graph and the search content according to the click behavior, the method further comprises:
s4, acquiring a new content entity obtained by the user based on the search content;
s5, performing word segmentation on the newly added content entity to obtain a word segmentation result of the newly added content entity;
s6, performing entity recognition according to the word segmentation result of the newly added content entity to obtain an entity recognition result of the newly added content entity;
s7, calculating the initial correlation weight of the newly added content entity and the search content according to the entity identification result of the newly added content entity.
As described in steps S4-S7 above, the user obtains a plurality of search results based on the search content. For example, the user searches for "products of fund" and obtains a plurality of products; after a period of time, a new product (which is also a newly added content entity) is released. Although the user can find the new product in other ways based on the search content, the new product has not been added to the constructed knowledge graph, so it needs to be associated with the search content input by the user and added to the knowledge graph. Therefore, using the same method as that used to process the search content input by the user, word segmentation is performed on the newly added content entity to obtain a word segmentation result, and entity recognition is then performed according to the word segmentation result; the word segmentation and entity recognition methods are the same as those used for the search content input by the user and are not repeated herein. The initial relevance weight of the newly added content entity and the search content is calculated according to the entity recognition result of the newly added content entity, so as to judge, according to the initial relevance weight, whether the newly added content entity has a sufficiently large relevance with the search content input by the user and whether it can be added to the constructed knowledge graph.
In one embodiment, the step of calculating an initial relevance weight of the added content entity to the search content according to the entity identification result of the added content entity includes:
s71, calculating the relevance scores of the entities in the newly added content entities and the content entities in the knowledge graph by adopting a TF-IDF algorithm; wherein, the content entity in the knowledge graph is the content entity corresponding to the search content in the knowledge graph;
s72, taking the relevance score as the initial relevance weight of the newly-added content entity and the search content.
As described in steps S71-S72 above, after entity recognition is performed on the newly added content entity, a TF-IDF (Term Frequency-Inverse Document Frequency) algorithm is used to calculate a relevance score between the entities in the newly added content entity and the content entity in the knowledge graph, that is, a score of the relevance between the newly added content entity and the search content input by the user. The score is used as the initial relevance weight of the newly added content entity and the search content, and the initial relevance weight is then used to judge whether the newly added content entity has a sufficiently large relevance with the search content input by the user.
TF-IDF is a commonly used weighting technique for information retrieval and data mining. It is a statistical method for evaluating the importance of a word to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases in inverse proportion to the frequency with which it appears in the corpus.
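One way to turn TF-IDF into a relevance score between a newly added content entity and the content entities already in the knowledge graph is sketched below. The specification does not state how the score is derived from the TF-IDF values, so comparing TF-IDF vectors by cosine similarity, the smoothed IDF form, taking the maximum as the initial relevance weight, and the function names and sample tokens are all assumptions.

```python
import math
from collections import Counter


def tfidf_vector(tokens, corpus):
    """Build a TF-IDF vector (term -> weight) for one tokenized entity."""
    n_docs = len(corpus)
    vector = {}
    for term, count in Counter(tokens).items():
        doc_freq = sum(1 for doc in corpus if term in doc)
        # Smoothed inverse document frequency (an assumed variant).
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
        vector[term] = (count / len(tokens)) * idf
    return vector


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical entity recognition results, as token lists.
existing = [
    ["fund", "product", "equity"],
    ["fund", "product", "bond"],
    ["insurance", "claim"],
]
new_entity = ["fund", "product", "index"]
corpus = existing + [new_entity]

new_vector = tfidf_vector(new_entity, corpus)
scores = [cosine(new_vector, tfidf_vector(doc, corpus)) for doc in existing]
# Assumption: the best match is used as the initial relevance weight.
initial_weight = max(scores)
```

The new entity shares "fund" and "product" with the first two existing entities and nothing with the third, so only the first two receive a non-zero score.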
In one embodiment, after the step of calculating the initial relevance weight of the added content entity and the search content according to the entity identification result of the added content entity, the method further includes:
s8, judging whether the initial relevance weight of the newly added content entity and the search content is greater than a set value;
s9, if yes, establishing the correlation between the newly added content entity and the content entity in the knowledge graph, and storing the newly added content entity in the knowledge graph; and the content entity in the knowledge graph is the content entity corresponding to the search content in the knowledge graph.
As described in the above steps S8-S9, a setting value is set based on the initial correlation weight of the newly added content entity (the setting value is adjusted as needed, and is not limited herein), and when the calculated initial correlation weight of the newly added content entity is greater than the setting value, the newly added content entity is considered to have a greater correlation with the search content input by the user, so that the correlation between the newly added content entity and the entity corresponding to the search content input by the user in the knowledge graph is established, and the newly added content entity is stored in the knowledge graph, thereby achieving the purpose of updating the knowledge graph.
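Steps S8-S9 can be sketched with a small in-memory stand-in for the knowledge graph. The dictionary layout (search content mapped to entities and their weights), the function name `maybe_add_entity`, the set value of 0.3, and the sample entities are hypothetical; the specification leaves the set value and the storage format open.

```python
def maybe_add_entity(graph, search_content, new_entity, initial_weight, set_value=0.3):
    """Store the new entity only if its initial relevance weight exceeds the set value.

    `graph` maps a search content string to a dict of {entity: relevance weight}.
    Returns True when the entity was stored, False otherwise.
    """
    if initial_weight <= set_value:
        return False
    # Establish the association with the search content and store the weight.
    graph.setdefault(search_content, {})[new_entity] = initial_weight
    return True


graph = {"products of fund": {"fund_a": 0.5}}
added = maybe_add_entity(graph, "products of fund", "fund_new", 0.42)
rejected = maybe_add_entity(graph, "products of fund", "fund_unrelated", 0.10)
```

After these two calls the graph contains `fund_new` with weight 0.42, while `fund_unrelated` is left out because its weight does not exceed the set value.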
According to the method and the device, the corresponding content entity is queried in the knowledge graph based on the search content of the user, and the relevance weight of the content entity and the search content in the knowledge graph is adjusted according to the click behavior of the user on the content entity. This improves the relevance of search results in the vertical field and the click rate in the user's search scenario, thereby promoting transaction conversion, helping enterprises improve service quality, and improving customer satisfaction. Meanwhile, the method is dynamically extensible: newly added content entities can be imported into the knowledge graph to update it, so as to improve the click rate of the user on the search results.
As shown in fig. 2, the present application further provides a knowledge-graph-based search result updating apparatus, including:
the first acquisition module 1 is used for acquiring a content entity queried in a knowledge graph by a user based on search content within a preset time period;
the second obtaining module 2 is used for obtaining the click times of the user on the content entity;
the first calculating module 3 is used for calculating the floating correlation weight of the content entity and the search content according to the click times;
a third obtaining module 4, configured to obtain an initial relevance weight of the content entity and the search content;
a second calculating module 5, configured to calculate a total relevance weight of the content entity and the search content according to the floating relevance weight of the content entity and the search content and the initial relevance weight of the content entity and the search content;
a replacing module 6, configured to replace the initial relevance weight of the content entity and the search content in the knowledge graph with the total relevance weight of the content entity and the search content, so as to complete updating of the knowledge graph;
and a returning module 7, configured to, when the user searches the search content again, return a search result corresponding to the search content based on the updated knowledge graph.
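How the replacing module and the returning module might cooperate can be sketched as follows. The in-memory dictionary standing in for the knowledge graph, the function name `update_and_search`, leaving the weights unchanged when no clicks were recorded, and ranking results by descending weight are all assumptions; the specification only states that the total relevance weight replaces the initial one and that later searches use the updated graph.

```python
def update_and_search(graph, search_content, clicks, a=0.5, b=0.5):
    """Replace stored weights with total weights, then return ranked entities.

    `graph` maps search content to {entity: relevance weight};
    `clicks` maps entity to its click count within the preset time period.
    """
    weights = graph[search_content]
    total_clicks = sum(clicks.get(entity, 0) for entity in weights)
    if total_clicks:  # Assumption: no clicks means no update.
        for entity, w_initial in weights.items():
            w_float = clicks.get(entity, 0) / total_clicks
            # The total weight overwrites the initial weight in the graph.
            weights[entity] = w_initial * a + w_float * b
    # Assumption: results are returned in order of descending weight.
    return sorted(weights, key=weights.get, reverse=True)


graph = {"products of fund": {"fund_a": 0.5, "fund_b": 0.4}}
ranking = update_and_search(graph, "products of fund", {"fund_a": 1, "fund_b": 9})
```

In this example `fund_b` starts with the lower weight but receives 9 of the 10 clicks, so after the update it ranks ahead of `fund_a` when the same search content is queried again.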
In one embodiment, the first obtaining module 1 includes:
a search content acquisition unit for acquiring search content input by a user;
a search content processing unit, configured to perform natural language processing on the search content to obtain an intention of the search content;
and the content entity query unit is used for querying in the knowledge graph by adopting a BM25 algorithm according to the intention of the search content to obtain a content entity.
In one embodiment, the first obtaining module 1 further includes:
the building module is used for building a knowledge graph of the vertical field; wherein the vertical field is a professional field.
In one embodiment, a search content processing unit includes:
the word segmentation subunit is used for carrying out word segmentation on the search content to obtain a word segmentation result of the search content;
the entity identification subunit is used for carrying out entity identification according to the word segmentation result of the search content to obtain an entity identification result of the search content;
a syntax analysis subunit, configured to perform syntax analysis according to the entity identification result of the search content to obtain a syntax analysis result;
an intention identifying subunit configured to identify an intention of the search content based on the entity identification result and the syntax analysis result.
In one embodiment, the content entity query unit includes:
an entity type query subunit, configured to query an entity type in the knowledge graph according to the intention of the search content;
and the content entity returning subunit is used for returning the content entity corresponding to the entity type.
In one embodiment, the floating correlation weight calculation unit includes:
a content entity obtaining subunit, configured to obtain all content entities obtained based on the search content;
the total click count acquiring subunit is used for acquiring the total click times of all the content entities;
the formula calculation subunit is used for calculating the floating relevance weight of the content entity and the search content according to the click times of the content entity and the total click times of all the content entities; wherein, the calculation formula is: W_float = C_1 / C_total; wherein W_float is the floating relevance weight of the content entity and the search content, C_1 is the number of clicks of the content entity, and C_total is the total number of clicks of all content entities.
In one embodiment, in the total relevance weight calculation unit, the calculation formula is: W_total = W_initial * A + W_float * B, and A + B = 1, wherein W_total is the total relevance weight of the content entity and the search content, W_initial is the initial relevance weight of the content entity and the search content, W_float is the floating relevance weight of the content entity and the search content, and A and B are weighted percentage factors.
In one embodiment, further comprising:
a newly added content entity obtaining module, configured to obtain a newly added content entity obtained by the user based on the search content;
the newly added content entity word segmentation module is used for segmenting words of the newly added content entity to obtain word segmentation results of the newly added content entity;
the newly added content entity identification module is used for carrying out entity identification according to the word segmentation result of the newly added content entity to obtain an entity identification result of the newly added content entity;
and the newly added content entity initial correlation weight calculation module is used for calculating the initial correlation weight of the newly added content entity and the search content according to the entity identification result of the newly added content entity.
In one embodiment, the module for calculating the initial relevance weight of the newly added content entity includes:
a TF-IDF algorithm unit for calculating the relevance scores of the entities in the newly added content entities and the content entities in the knowledge graph by adopting a TF-IDF algorithm; wherein, the content entity in the knowledge graph is the content entity corresponding to the search content in the knowledge graph;
and an initial relevance weight determining unit, configured to take the relevance score as the initial relevance weight of the newly added content entity and the search content.
In one embodiment, further comprising:
the judging module is used for judging whether the initial correlation weight of the newly added content entity and the search content is greater than a set value or not;
the storage module is used for establishing the correlation between the newly added content entity and the content entity in the knowledge graph and storing the newly added content entity in the knowledge graph when the initial correlation weight of the newly added content entity and the search content is larger than a set value; and the content entity in the knowledge graph is the content entity corresponding to the search content in the knowledge graph.
The above units, modules, and sub-units are all used to correspondingly execute each step in the above search result updating method based on a knowledge graph, and specific implementation manners thereof are described with reference to the above method embodiments, and are not described herein again.
As shown in fig. 3, the present application also provides a computer device, which may be a server, and the internal structure of which may be as shown in fig. 3. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is used to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used to store all data required by the knowledge-graph-based search result updating method. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the knowledge-graph-based search result updating method.
Those skilled in the art will appreciate that the architecture shown in fig. 3 is only a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects may be applied.
An embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements any one of the above methods for updating search results based on a knowledge graph.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The computer program may be stored on a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, database, or other medium provided herein and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, apparatus, article, or method that includes the element.
The above description is only a preferred embodiment of the present application, and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application, or which are directly or indirectly applied to other related technical fields, are also included in the scope of the present application.

Claims (10)

1. A method for updating search results based on knowledge graph is characterized by comprising the following steps:
acquiring a content entity inquired in a knowledge graph by a user based on search content within a preset time period;
acquiring the number of clicks of the user on the content entity;
calculating the floating correlation weight of the content entity and the search content according to the click times;
acquiring an initial relevance weight of the content entity and the search content;
calculating a total relevance weight of the content entity and the search content according to the floating relevance weight of the content entity and the search content and the initial relevance weight of the content entity and the search content;
replacing the initial relevance weight of the content entity and the search content in the knowledge-graph by the total relevance weight of the content entity and the search content to finish the updating of the knowledge-graph;
and when the user searches the search content again, returning a search result corresponding to the search content based on the updated knowledge graph.
2. The method for updating knowledge-graph based search results according to claim 1, wherein the step of obtaining the content entities queried by the user in the knowledge-graph based on the search content comprises:
acquiring search content input by a user;
carrying out natural language processing on the search content to obtain the intention of the search content;
and querying in a knowledge graph by adopting a BM25 algorithm according to the intention of the search content to obtain a content entity.
3. The method of claim 2, wherein the step of performing natural language processing on the search content to obtain the intention of the search content comprises:
performing word segmentation on the search content to obtain a word segmentation result of the search content;
performing entity recognition according to the word segmentation result of the search content to obtain an entity recognition result of the search content;
performing syntactic analysis according to the entity identification result of the search content to obtain a syntactic analysis result;
identifying an intent of the search content based on the entity recognition result and the syntactic analysis result.
4. The method of claim 1, wherein the step of replacing the initial relevance weight of the content entity to the search content in the knowledge-graph with the total relevance weight of the content entity to the search content comprises:
acquiring all content entities obtained based on the search content;
acquiring the total number of clicks of all content entities;
and calculating the floating correlation weight of the content entity and the search content according to the click times of the content entity and the total click times of all the content entities.
5. The method of claim 1, wherein the step of adjusting relevance weights of content entities in the knowledge-graph to the search content according to the click behavior is followed by further comprising:
acquiring a newly added content entity obtained by the user based on the search content;
performing word segmentation on the newly added content entity to obtain a word segmentation result of the newly added content entity;
performing entity identification according to the word segmentation result of the newly added content entity to obtain an entity identification result of the newly added content entity;
and calculating the initial correlation weight of the newly added content entity and the search content according to the entity identification result of the newly added content entity.
6. The method of claim 5, wherein the step of calculating an initial relevance weight of the added content entity to the search content according to the entity identification result of the added content entity comprises:
calculating the relevance scores of the entities in the newly-added content entities and the content entities in the knowledge graph by adopting a TF-IDF algorithm; wherein, the content entity in the knowledge graph is the content entity corresponding to the search content in the knowledge graph;
using the relevance score as an initial relevance weight of the newly added content entity to the search content.
7. The method of claim 5, wherein the step of calculating an initial relevance weight of the added content entity to the search content according to the entity identification result of the added content entity is followed by further comprising:
judging whether the initial correlation weight of the newly added content entity and the search content is greater than a set value or not;
if so, establishing the correlation between the newly added content entity and the content entity in the knowledge graph, and storing the newly added content entity in the knowledge graph; and the content entity in the knowledge graph is the content entity corresponding to the search content in the knowledge graph.
8. A knowledge-graph-based search result updating apparatus, comprising:
the first acquisition module is used for acquiring a content entity inquired in the knowledge graph by a user based on search content in a preset time period;
the second acquisition module is used for acquiring the click times of the user on the content entity;
the first calculation module is used for calculating the floating relevance weight of the content entity and the search content according to the click times;
a third obtaining module, configured to obtain an initial relevance weight of the content entity and the search content;
a second calculation module, configured to calculate a total relevance weight of the content entity and the search content according to the floating relevance weight of the content entity and the search content and the initial relevance weight of the content entity and the search content;
a replacing module, configured to replace the initial relevance weight of the content entity and the search content in the knowledge graph with the total relevance weight of the content entity and the search content, so as to complete updating of the knowledge graph;
and the returning module is used for returning a search result corresponding to the search content based on the updated knowledge graph when the user searches the search content again.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202110720141.0A 2021-06-28 2021-06-28 Knowledge graph-based search result updating method and device and computer equipment Pending CN113434696A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110720141.0A CN113434696A (en) 2021-06-28 2021-06-28 Knowledge graph-based search result updating method and device and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110720141.0A CN113434696A (en) 2021-06-28 2021-06-28 Knowledge graph-based search result updating method and device and computer equipment

Publications (1)

Publication Number Publication Date
CN113434696A true CN113434696A (en) 2021-09-24

Family

ID=77754938

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110720141.0A Pending CN113434696A (en) 2021-06-28 2021-06-28 Knowledge graph-based search result updating method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN113434696A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101256596A (en) * 2008-03-28 2008-09-03 北京搜狗科技发展有限公司 Method and system for instation guidance
CN103034660A (en) * 2011-10-10 2013-04-10 阿里巴巴集团控股有限公司 Method, device and system for providing information
US20160034462A1 (en) * 2014-08-01 2016-02-04 Facebook, Inc. Search Results Based on User Biases on Online Social Networks
CN107180059A (en) * 2016-03-11 2017-09-19 北大方正集团有限公司 Data retrieval method and data retrieval system
CN107273476A (en) * 2017-06-08 2017-10-20 广州优视网络科技有限公司 A kind of article search method, device and server
CN109522465A (en) * 2018-10-22 2019-03-26 国家电网公司 The semantic searching method and device of knowledge based map
CN112380352A (en) * 2020-10-28 2021-02-19 中国商用飞机有限责任公司北京民用飞机技术研究中心 Interactive retrieval method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination