CN110263180B - Intention knowledge graph generation method, intention identification method and device - Google Patents

Intention knowledge graph generation method, intention identification method and device

Info

Publication number
CN110263180B
Authority
CN
China
Prior art keywords
intention
entity
mapping relation
search
resource data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910511702.9A
Other languages
Chinese (zh)
Other versions
CN110263180A (en)
Inventor
李然
卢佳俊
王灿
朱嘉琪
任可欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910511702.9A
Publication of CN110263180A
Application granted
Publication of CN110263180B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists

Abstract

The invention provides an intention knowledge graph generation method, an intention identification method and an intention identification device. During intention identification, for an entity-type search term input by the user, the corresponding entity is determined, all search terms under that entity are then determined, the search term closest to the one input by the user is selected, and the corresponding second mapping relation is determined, so that the related resource data is obtained. The present disclosure also provides a server and a computer readable medium.

Description

Intention knowledge graph generation method, intention identification method and device
Technical Field
The disclosure relates to the technical field of knowledge graphs, and in particular to an intention knowledge graph generation method and apparatus, an intention identification method and apparatus, a server and a computer readable medium.
Background
In a conventional search scenario, a search engine returns relevant results according to the search term (query) input by the user. Conventional search relies on an inverted index and string matching to return relevant web pages or information. For example, if the user's query is "how much is a teacup dog", pages matching "teacup dog" and "how much" can be found for the user in this way.
However, in an entity-type search (that is, the user's search term consists of a single entity term with no suffix carrying auxiliary information), the search term may not contain the user's search intention. For example, when the user searches for "teacup dog", the search engine cannot derive the intention from the literal meaning of the search term: it does not know what the user specifically wants to learn about teacup dogs, and so cannot accurately return results related to the user's potential intention. Accurately identifying the potential intention behind an entity-type search term, and displaying the information the user cares about most, is therefore key to meeting user needs in this scenario.
For identifying a user's search intention, prior-art schemes fall into three categories:
The first category: the user intention is parsed directly from the search term, mainly by one of the following two schemes:
(1) The user intention is parsed from online or historical search terms using rule templates. For a query such as "weather in Beijing", a template like [city] [weather] can be matched against the query to parse out the user intention. On the one hand, this scheme only suits well-formed queries describing a specific scenario; on the other hand, once the wording of the search term changes, a template has to be redefined to fit. For example, a template that parses "weather in Beijing" does not cover the same request phrased differently, which needs a newly defined template, so maintaining and managing the templates is complex.
(2) A machine learning or deep learning model is trained on search terms to identify the user intention. This scheme still depends heavily on how the search term is worded: the more complete the search term and the more intention it carries, the better the result the model learns. It is generally used in conversational scenarios, where user utterances are usually complete, and is not applicable to search scenarios, where search terms are usually short.
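The brittleness of the template approach in scheme (1) can be illustrated with a minimal sketch. The regular expression, the city list and the function name `parse_intent` are assumptions made for illustration; the source does not specify how templates are encoded.

```python
import re

# Hypothetical encoding of the "[city] [weather]" template; the city list is illustrative.
CITIES = ("Beijing", "Shanghai")
TEMPLATE = re.compile(rf"^(?P<city>{'|'.join(CITIES)}) weather$", re.IGNORECASE)


def parse_intent(query):
    """Return (city, intent) if the query fits the template, else None."""
    match = TEMPLATE.match(query.strip())
    if match is None:
        return None
    return match.group("city"), "weather"


print(parse_intent("Beijing weather"))     # fits the template
print(parse_intent("weather in Beijing"))  # same need, different wording: no match
```

Covering the second phrasing would require adding another template, which is the maintenance burden the text describes.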
The second category: an intention dictionary is constructed first, and the intention is then identified by the degree of match or similarity between the search term and the intention dictionary. The effectiveness of these schemes depends on the quality of the dictionary, which is generally constructed in one of the following ways:
(1) obtaining an intention dictionary composed of different categories through model training on a historical search term set; (2) clustering a historical search term set to obtain a category-based intention dictionary; (3) classifying or training on web page clicks in search result pages to obtain an intention classification model.
When intentions are identified with an intention dictionary, the intentions obtained are category-based rather than entity-granular. Furthermore, they are flat: no semantic intention system is formed, so the meaning of an obtained intention is unknown.
The third category: intentions are integrated using a knowledge graph approach to build an intention knowledge graph.
Existing intention knowledge graphs are likewise based on category granularity. For example, Li Bai and Du Fu are both persons, and current intention graphs are organized only at the granularity of "person"; they are not refined to the granularity of entities such as Li Bai or Du Fu, let alone to the level of the various search expressions for Li Bai (for example, the search terms "Li Bai", "Poet Immortal" and the like).
Disclosure of Invention
The present disclosure addresses the above-identified deficiencies in the prior art by providing an intention knowledge graph generation method and apparatus, an intention identification method and apparatus, a server, and a computer-readable medium.
In a first aspect, an embodiment of the present disclosure provides an intention knowledge graph generating method, including:
acquiring historical retrieval information of a user, and identifying an entity and an intention corresponding to a retrieval word according to the historical retrieval information of the user;
establishing a first mapping relation between a search term and an entity and a second mapping relation between the search term and an intention;
associating the entity with the concept at the bottom layer in a preset concept system, wherein the concept system at least comprises two layers of structures;
and acquiring content information of the resource data, and establishing a link between the resource data and the second mapping relation according to the content information and the identified intention so as to generate an intention knowledge graph.
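The four steps of the first aspect can be sketched as a small data structure. This is an illustrative sketch only: the class and field names are hypothetical, the (term, entity, intent) records are assumed to have already been identified from the historical search information, and the substring test used to match content against an intention is an assumption, since the source does not fix a matching rule.

```python
from dataclasses import dataclass, field


@dataclass
class IntentGraph:
    term_to_entity: dict = field(default_factory=dict)     # first mapping relation
    term_to_intents: dict = field(default_factory=dict)    # second mapping relation
    entity_to_concept: dict = field(default_factory=dict)  # entity -> bottom-layer concept
    links: dict = field(default_factory=dict)              # (term, intent) -> resource ids


def build_graph(history, entity_concepts, resources):
    """Build the graph from already-identified (term, entity, intent) records.

    entity_concepts maps each entity to a bottom-layer concept, and
    resources maps a resource id to its content text.
    """
    graph = IntentGraph()
    for term, entity, intent in history:
        graph.term_to_entity[term] = entity
        graph.term_to_intents.setdefault(term, set()).add(intent)
        graph.entity_to_concept[entity] = entity_concepts[entity]
    for term, intents in graph.term_to_intents.items():
        entity = graph.term_to_entity[term]
        for intent in sorted(intents):
            # Assumed matching rule: the content mentions both entity and intent.
            graph.links[(term, intent)] = [
                rid for rid, text in resources.items()
                if entity in text and intent in text
            ]
    return graph


history = [("teacup dog", "teacup dog", "price"),
           ("teacup dog", "teacup dog", "feeding")]
concepts = {"teacup dog": "pet dog"}
resources = {"r1": "teacup dog price guide",
             "r2": "feeding a teacup dog",
             "r3": "Siamese cat price"}
graph = build_graph(history, concepts, resources)
print(graph.links)
```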
Further, after the identifying the intent corresponding to the search term, the method further comprises:
and if the same intention corresponds to multiple intention expressions, establishing a third mapping relation between the same intention and the corresponding intention expressions.
Preferably, the establishing a link between the resource data and the second mapping relationship according to the content information and the identified intention specifically includes:
matching the content information with the identified intention, and if the content information is matched with the identified intention, establishing a link between the resource data and the second mapping relation; if not, determining a corresponding intention expression according to the third mapping relation, matching the content information with the intention expression, and if the content information is matched with at least one intention expression, establishing a link between the resource data and the second mapping relation;
after establishing the link between the resource data and the second mapping relationship, the method further comprises: and setting a first weight for the resource data according to the matching degree.
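The two-stage matching described above (match against the identified intention first, then fall back to the intention expressions from the third mapping relation) might look like the following sketch. The function name, the substring test and the way the matching degree is turned into a first weight are all assumptions; the source only states that the first weight is set according to the matching degree.

```python
def link_resource(content, intent, expressions):
    """Return a first weight if the resource should be linked, else None.

    expressions is the list of alternative expressions that the third
    mapping relation associates with this intent.
    """
    if intent in content:
        return 1.0  # direct match on the identified intention
    hits = [e for e in expressions if e in content]
    if hits:
        # Hypothetical matching degree: share of expressions found in the content.
        return len(hits) / len(expressions)
    return None  # no match: no link is established


price_expressions = ["cost", "how much", "how expensive"]
print(link_resource("teacup dog price list", "price", price_expressions))
print(link_resource("how much does a teacup dog cost", "price", price_expressions))
print(link_resource("teacup dog photos", "price", price_expressions))
```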
Further, in the first mapping relationship, one entity corresponds to at least one search term; the method further comprises the following steps:
if one entity corresponds to a plurality of search terms, respectively determining the corresponding intention of each search term of the entity;
and determining the importance of the intention according to the historical retrieval information of the user, and setting a second weight for a corresponding second mapping relation according to the importance of the intention.
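One possible reading of the second weight is sketched below. It assumes that the importance of an intention is its share of the logged searches for the entity; the source does not define importance beyond saying it is determined from the historical search information, so this measure and the function name `second_weights` are hypothetical.

```python
from collections import Counter


def second_weights(records):
    """records: iterable of (term, entity, intent) tuples, one per logged search.

    Returns {(term, intent): weight} for every second mapping relation seen.
    """
    per_entity = Counter()
    per_entity_intent = Counter()
    mappings = set()
    for term, entity, intent in records:
        per_entity[entity] += 1
        per_entity_intent[(entity, intent)] += 1
        mappings.add((term, entity, intent))
    return {
        (term, intent): per_entity_intent[(entity, intent)] / per_entity[entity]
        for term, entity, intent in mappings
    }


records = [
    ("Li Bai", "Li Bai (poet)", "poems"),
    ("Li Bai", "Li Bai (poet)", "poems"),
    ("Li Taibai", "Li Bai (poet)", "poems"),
    ("Li Bai", "Li Bai (poet)", "biography"),
]
weights = second_weights(records)
print(weights[("Li Bai", "poems")], weights[("Li Bai", "biography")])
```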
Further, after generating the intent knowledge-graph, the method further comprises the step of expanding the intent knowledge-graph, the step of expanding the intent knowledge-graph comprising:
judging whether non-common intentions exist among retrieval words corresponding to different entities under the same concept at the bottom layer;
if so, respectively supplementing the intentions corresponding to different entities under the same concept at the bottom layer in the intention knowledge graph according to the non-common intentions;
establishing a second mapping relation between the search terms under the entity supplemented with the intention and the supplemented intention;
and acquiring corresponding resource data according to the established second mapping relation, and establishing a link between the acquired resource data and the established second mapping relation.
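The expansion step can be sketched as follows, under the assumption that a "non-common intention" is one held by at least one entity under a bottom-layer concept but not by all of them. The function and parameter names are hypothetical, and the final step of fetching and linking resource data for each new mapping is left out.

```python
def expand(entity_intents, entity_concept, entity_terms):
    """Return the new (term, intent) second mapping relations to add.

    entity_intents: entity -> set of intents already in the graph
    entity_concept: entity -> bottom-layer concept
    entity_terms:   entity -> list of search terms under that entity
    """
    by_concept = {}
    for entity, concept in entity_concept.items():
        by_concept.setdefault(concept, []).append(entity)

    new_mappings = []
    for siblings in by_concept.values():
        # All intents seen for any entity under this bottom-layer concept.
        union = set().union(*(entity_intents[e] for e in siblings))
        for entity in siblings:
            for intent in sorted(union - entity_intents[entity]):
                for term in entity_terms[entity]:
                    new_mappings.append((term, intent))
    return new_mappings


intents = {"British Shorthair": {"price", "feeding"},
           "Siamese cat": {"price", "lifespan"}}
concept = {"British Shorthair": "pet cat", "Siamese cat": "pet cat"}
terms = {"British Shorthair": ["British Shorthair"],
         "Siamese cat": ["Siamese cat", "Siamese"]}
print(expand(intents, concept, terms))
```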
In another aspect, an embodiment of the present disclosure provides an intention identification method applied to an intention knowledge graph, the intention knowledge graph comprising: a first mapping relation between a search term and an entity, a second mapping relation between a search term and an intention, and a link between resource data and the second mapping relation; the method comprising:
judging whether the search word input by the user is an entity search word, if so, determining a corresponding entity according to the search word and the first mapping relation;
determining a search term corresponding to the entity according to the first mapping relation, and determining a search term closest to the search term input by the user from the search terms;
determining a second mapping relation corresponding to a retrieval word closest to the retrieval word input by the user;
and acquiring and returning corresponding resource data according to the determined second mapping relation and the link between the resource data and the second mapping relation.
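The identification steps above can be sketched end to end. The sample graph contents are illustrative, and using `difflib.get_close_matches` both to decide whether the input is an entity-type search term and to pick the closest stored term is an assumption; the source does not say how closeness is measured.

```python
import difflib

# Illustrative graph contents (assumed, not from the source).
FIRST = {"Li Bai": "E1", "Li Taibai": "E1", "Siamese cat": "E2"}  # term -> entity
SECOND = {"Li Bai": ["poems"], "Li Taibai": ["biography"]}        # term -> intents
LINKS = {("Li Bai", "poems"): ["r1", "r2"],
         ("Li Taibai", "biography"): ["r3"]}                      # mapping -> resources


def identify(query):
    """Return the resource ids for an entity-type query, or [] otherwise."""
    # Step 1: decide whether the query names a known entity (fuzzy lookup).
    known = difflib.get_close_matches(query, list(FIRST), n=1, cutoff=0.8)
    if not known:
        return []
    entity = FIRST[known[0]]
    # Step 2: all search terms under that entity, then the closest one.
    terms = [t for t, e in FIRST.items() if e == entity]
    closest = difflib.get_close_matches(query, terms, n=1, cutoff=0.0)[0]
    # Steps 3 and 4: second mapping relations of that term, then linked resources.
    results = []
    for intent in SECOND.get(closest, []):
        results.extend(LINKS.get((closest, intent), []))
    return results


print(identify("Li Bai"))
print(identify("Li Tai bai"))
```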
Further, a first weight of the resource data is also included in the intention knowledge graph;
the returning of the corresponding resource data specifically includes: and returning corresponding resource data according to the first weight in the intention knowledge graph.
Preferably, the second mapping relationship has a second weight;
the acquiring and returning corresponding resource data according to the determined second mapping relationship and the link between the resource data and the second mapping relationship specifically includes:
if a plurality of second mapping relations are determined, acquiring a second weight corresponding to each second mapping relation;
and acquiring and returning corresponding resource data according to the second weight corresponding to each second mapping relation.
Further, the intention identification method further includes:
and if the second mapping relation corresponding to the retrieval word closest to the retrieval word input by the user cannot be determined, determining second mapping relations corresponding to other retrieval words, wherein the other retrieval words are retrieval words except the retrieval word closest to the retrieval word input by the user in the determined retrieval words corresponding to the entity.
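The weighting and fallback rules might combine as in the sketch below: intentions are ordered by second weight, resources within an intention by first weight, and if the closest search term has no second mapping relation the entity's other search terms are tried. The function name, the data layout and the descending sort order are assumptions.

```python
def ranked_resources(closest, other_terms, second, links):
    """Return resource ids ordered by second weight, then first weight.

    second: term -> {intent: second_weight}
    links:  (term, intent) -> {resource_id: first_weight}
    """
    for term in [closest, *other_terms]:
        intents = second.get(term)
        if not intents:
            continue  # no second mapping for this term: try the entity's next term
        ordered = []
        for intent, _ in sorted(intents.items(), key=lambda kv: -kv[1]):
            resources = links.get((term, intent), {})
            ordered.extend(
                rid for rid, _ in sorted(resources.items(), key=lambda kv: -kv[1])
            )
        return ordered
    return []


second = {"Li Bai": {"poems": 0.75, "biography": 0.25}}
links = {("Li Bai", "poems"): {"r1": 0.4, "r2": 0.9},
         ("Li Bai", "biography"): {"r3": 1.0}}
# "Li Taibai" has no second mapping here, so the lookup falls back to "Li Bai".
print(ranked_resources("Li Taibai", ["Li Bai"], second, links))
```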
In yet another aspect, an embodiment of the present disclosure provides an intention knowledge graph generating apparatus, comprising: a first acquisition module, an identification module, an establishment module, an association module, a second acquisition module and a link module;
the first acquisition module is used for acquiring historical retrieval information of a user;
the identification module is used for identifying an entity and an intention corresponding to a search word according to the historical search information of the user;
the establishing module is used for establishing a first mapping relation between a search term and an entity and a second mapping relation between the search term and an intention;
the association module is used for associating the entity with the concept at the bottom layer in a preset concept system, wherein the concept system at least comprises two layers of structures;
the second acquisition module is used for acquiring the content information of the resource data;
and the link module is used for establishing a link between the resource data and the second mapping relation according to the content information and the identified intention so as to generate an intention knowledge graph.
Further, the establishing module is further configured to, after the identifying module identifies the intention corresponding to the search term, establish a third mapping relationship between the same intention and each corresponding intention expression when the same intention corresponds to multiple intention expressions.
Preferably, the link module is specifically configured to match the content information with the identified intention, and when the content information matches the identified intention, establish a link between the resource data and the second mapping relationship; when the content information is not matched with the identified intention, determining corresponding intention expressions according to the third mapping relation, matching the content information with the intention expressions, and when the content information is matched with at least one intention expression, establishing a link between the resource data and the second mapping relation;
the intention knowledge graph generating device further comprises a first setting module, wherein the first setting module is used for setting a first weight for the resource data according to the matching degree when the content information is matched with the identified intention.
Preferably, in the first mapping relationship, one entity corresponds to at least one search term; the intention knowledge graph generating device further comprises a second setting module, wherein the second setting module comprises a second determining unit, a processing unit and a setting unit;
the second determining unit is used for respectively determining the corresponding intention of each search term of an entity when the entity corresponds to a plurality of search terms;
the processing unit is used for determining the importance of the intention according to the user history retrieval information;
the setting unit is used for setting a second weight for the corresponding second mapping relation according to the importance of the intention.
Further, the intention knowledge graph generating device further comprises an extension module, wherein the extension module comprises a second judgment unit, an intention supplement unit, a second mapping relation supplement unit, a resource obtaining unit and a link supplement unit;
the second judging unit is used for judging whether the search terms corresponding to different entities under the same concept at the bottom layer have non-common intentions;
the intention supplementing unit is used for respectively supplementing the intentions corresponding to the different entities belonging to the same concept at the bottommost layer in the intention knowledge graph according to the non-shared intention when the second judging unit judges that the non-shared intention exists among the retrieval words corresponding to the different entities belonging to the same concept at the bottommost layer;
the second mapping relation supplementing unit is used for establishing a second mapping relation between the search terms under the entity supplemented with the intention and the supplemented intention;
the resource obtaining unit is used for obtaining corresponding resource data according to the established second mapping relation;
and the link supplementing unit is used for establishing a link between the acquired resource data and the established second mapping relation.
In still another aspect, an embodiment of the present disclosure further provides an intention identification apparatus applied to an intention knowledge graph, where the intention knowledge graph includes: a first mapping relation between a search term and an entity, a second mapping relation between a search term and an intention, and a link between resource data and the second mapping relation; the intention identification apparatus includes: a judging module, a determining module and a resource obtaining module;
the judging module is used for judging whether the search word input by the user is an entity search word;
the determining module is used for determining a corresponding entity according to the search term and the first mapping relation when the judging module judges that the search term input by the user is the entity search term; determining a search term corresponding to the entity according to the first mapping relation, and determining a search term closest to the search term input by the user from the search terms; determining a second mapping relation corresponding to a retrieval word closest to the retrieval word input by the user;
and the resource obtaining module is used for acquiring and returning corresponding resource data according to the determined second mapping relation and the link between the resource data and the second mapping relation.
Further, a first weight of the resource data is also included in the intention knowledge graph;
the resource obtaining module is specifically configured to return corresponding resource data according to the first weight in the intention knowledge graph.
Preferably, the second mapping relationship has a second weight;
the resource obtaining module is specifically configured to, when the determined second mapping relationships are multiple, obtain second weights corresponding to the second mapping relationships; and acquiring and returning corresponding resource data according to the second weight corresponding to each second mapping relation.
Further, the determining module is further configured to determine a second mapping relationship corresponding to another search term when the second mapping relationship corresponding to the search term closest to the search term input by the user cannot be determined, where the another search term is a search term other than the search term closest to the search term input by the user in the search terms corresponding to the determined entity.
Yet another embodiment of the present disclosure further provides a server, including:
one or more processors;
a storage device having one or more programs stored thereon;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement an intent knowledge graph generation method or an intent recognition method as previously described.
Another embodiment of the present disclosure also provides a computer readable medium having a computer program stored thereon, wherein the program when executed implements the intention knowledge graph generating method or the intention identifying method as described above.
According to the embodiments of the disclosure, an entity and an intention corresponding to a search term are identified from the user's historical search information, a first mapping relation between the search term and the entity and a second mapping relation between the search term and the intention are established, the entity is associated with a bottom-layer concept in a preset concept system, and a link between resource data and the second mapping relation is established according to the content information of the resource data and the identified intention, thereby generating an intention knowledge graph. During intention identification, for an entity-type search term input by the user, the corresponding entity is determined, all search terms under that entity are then determined, the search term closest to the one input by the user is selected, and the corresponding second mapping relation is determined, so that the related resource data is obtained.
The user intention is neither parsed directly from the search term nor matched from an intention dictionary; instead, the user's intention toward the entity is obtained from the entity-granular intention knowledge graph through the entity-type search term. Compared with existing intention identification methods, the intention obtained is not based on category granularity but is refined to entity granularity, so the intention is expressed more accurately. Compared with existing clustering and similar schemes, the intention knowledge graph links precisely to the entity and thereby yields the entity's intention set; it also exposes the semantic relations between an intention and the entity's other intentions, the entity's upper-level concepts, lower-level sub-intentions and so on, so that the obtained intention set is semantically interpretable and intention identification and satisfaction are more accurate.
Drawings
FIG. 1 is a schematic diagram of the structure of an intent knowledge graph in accordance with an embodiment of the present disclosure;
FIG. 2 is one of the flow diagrams of an intent knowledge graph generation method of an embodiment of the present disclosure;
FIG. 3 is a second flowchart of an intent knowledge graph generation method in accordance with an embodiment of the present disclosure;
FIG. 4 is a flowchart of establishing a link between resource data and a second mapping relationship according to an embodiment of the disclosure;
FIG. 5 is a flow chart of setting a second weight according to an embodiment of the present disclosure;
FIG. 6 is an expanded flow diagram of an intent knowledge graph of an embodiment of the present disclosure;
FIG. 7 is a flowchart of an intent recognition method of an embodiment of the present disclosure;
FIG. 8 is one of the schematic structural diagrams of an intent knowledge graph generating apparatus of an embodiment of the present disclosure;
FIG. 9 is a second schematic structural diagram of the intent knowledge graph generating apparatus according to the embodiment of the present disclosure;
FIG. 10 is a schematic structural diagram of a second setup module according to an embodiment of the disclosure;
FIG. 11 is a schematic structural diagram of an expansion module according to an embodiment of the present disclosure;
FIG. 12 is a schematic structural diagram of an intention identifying apparatus according to an embodiment of the present disclosure.
Detailed Description
Example embodiments will be described more fully hereinafter with reference to the accompanying drawings; they may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Embodiments described herein may be described with reference to plan and/or cross-sectional views in light of idealized schematic illustrations of the disclosure. Accordingly, the example illustrations can be modified in accordance with manufacturing techniques and/or tolerances. Accordingly, the embodiments are not limited to the embodiments shown in the drawings, but include modifications of configurations formed based on a manufacturing process. Thus, the regions illustrated in the figures have schematic properties, and the shapes of the regions shown in the figures illustrate specific shapes of regions of elements, but are not intended to be limiting.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The terminology to which this disclosure relates is to be interpreted as follows:
An entity (or concept) refers to an actual physical body or abstract concept that exists or has existed in the real world, such as a person, an article, a structure, a product, a building, a place, a country, an organization, an event, an art work, a scientific technology, a scientific theorem, and the like.
A knowledge graph is a database that represents the relationships between entities and the attributes of those entities. In a knowledge graph, entities are nodes; entities are connected to each other by edges, and each entity is connected by edges to the values of its attributes (attribute-value), forming a structured, network-shaped database. A connection (edge) between entities represents a relationship between them, for example that the entity Zhang San is the father of the entity Li Si; a connection (edge) between an entity and an attribute value indicates that the attribute of the entity has that value, for example that the height attribute of Zhang San (a person) has the value 172 cm.
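The node-and-edge structure described above is often stored as subject-predicate-object triples. The sketch below takes that representation as an assumption (the source does not prescribe a storage format), and the predicate names `father_of` and `height_cm` are hypothetical; the two edges mirror the father relationship and the 172 cm height from the example.

```python
# Each edge is a (subject, predicate, object) triple.
TRIPLES = [
    ("Zhang San", "father_of", "Li Si"),  # entity-to-entity relationship
    ("Zhang San", "height_cm", 172),      # entity-to-attribute-value edge
]


def edges_from(node):
    """Return all (predicate, object) pairs leaving the given node."""
    return [(p, o) for s, p, o in TRIPLES if s == node]


print(edges_from("Zhang San"))
```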
The intention knowledge graph is a knowledge graph containing intentions corresponding to search terms, and can be used for identifying intention expressions of users.
As shown in FIG. 1, the intent knowledge graph of the present disclosure includes 5 layers: a concept layer, an entity layer, a search term layer, an intention layer and a resource layer.
The entity layer indicates which entity a search term corresponds to. It is composed of a plurality of entities, each of which has a unique ID (identity) in the intention knowledge graph. The entity layer shown in FIG. 1 comprises a "British Shorthair cat" entity and a "Siamese cat" entity.
The search term layer represents the set of search terms that each entity in the entity layer may correspond to; for example, the search terms for the entity "Li Bai" may be "Li Bai", "Poet Immortal" and the like. The search term layer comprises a plurality of search term sets, each corresponding to one entity in the entity layer; that is, every search term in a set corresponds to the same entity, and each search term has a unique ID in the intention knowledge graph. In the present disclosure, the mapping relation between a search term and an entity is defined as the first mapping relation.
It should be noted that, in general, entities and search terms may be in a many-to-many relation. For example, for the entity Li Bai, the Tang dynasty poet, the search terms users enter to find him may be: Li Bai, Li Taibai, Poet Immortal, Tang dynasty Poet Immortal, and so on. Conversely, one search term may correspond to multiple entities; the search term "Li Bai", for example, may refer to the Tang dynasty poet or to a singer. In the entity search scenario, however, the following rule is imposed: a search term can correspond to only one entity. There can of course be several choices for the term-entity correspondence rule; for example, it may be set according to the search habits and behavior of most users, under which a user who enters the search term "Li Bai" is taken to be searching for the entity Li Bai, the Tang dynasty poet.
The present disclosure specifies that one search term corresponds to only one entity and that the correspondence is determined by the search habits and behavior of most users, so the first mapping relation between search terms and entities in the intention knowledge graph is also established according to those habits and behavior. Continuing the example above, the search term "Li Bai" corresponds to the entity "Li Bai, the Tang dynasty poet" in the intention knowledge graph (this is the first mapping relation), and the first mapping relation can be realized by a subgraph association technique.
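The "one search term, one entity" rule based on majority behavior can be illustrated with a simple vote over logged searches. This is a stand-in for illustration only: it is not the subgraph association technique the text mentions, and the function name and the form of the observations are assumptions.

```python
from collections import Counter


def first_mapping(observations):
    """observations: iterable of (term, entity) pairs, one per logged search.

    Returns term -> the entity most users meant by that term.
    """
    counts = {}
    for term, entity in observations:
        counts.setdefault(term, Counter())[entity] += 1
    return {term: c.most_common(1)[0][0] for term, c in counts.items()}


observations = [("Li Bai", "Tang dynasty poet")] * 3 + [("Li Bai", "singer")]
print(first_mapping(observations))
```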
In the intention knowledge graph, the search term layer is associated with the entity layer, that is, the first mapping relationship between search terms and entities is set, for the following purposes:
First, it resolves the ambiguity of search terms. In the prior art, if the search term is "Li Bai", the intention of users searching for the Tang dynasty poet Li Bai and the intention of users searching for the singer Li Bai may be mixed together and cannot be distinguished, so the resources returned to the user may include resources related to both. With the intention knowledge graph of the present disclosure, however, the entity "Li Bai, the Tang dynasty poet" corresponding to the search term "Li Bai" can first be found by the subgraph association technique, and the intentions corresponding to that entity can then be obtained.
Second, intentions are shared within the same entity. For example, the search terms corresponding to the entity Li Bai, the Tang dynasty poet, may be: Li Bai, Li Taibai, and poetry. Suppose that, in the intention knowledge graph, the search term "poetry" has no corresponding intention information, while the search term "Li Bai" corresponds to the intention "why Li Bai is called Li Bai". When the user enters the search term "poetry", both "poetry" and "Li Bai" map to the entity "Li Bai", so the intention "why Li Bai is called Li Bai" is returned to the user. In this way, search terms share intentions with one another at entity granularity.
The concept layer is a higher-level generalization and aggregation of the entity layer, and may be formed according to a concept system (System of Concepts) having a structure of at least two layers. A concept system is a set of related concepts arranged according to their degrees of abstraction and the relationships among them, and it has two dimensions: a vertical dimension and a horizontal dimension. The vertical dimension reflects the different levels formed by different degrees of abstraction of things, referred to as the overall level, the basic level, the generic level, and so on. For example, animal, pet, and pet cat form a vertical structure from high to low. Each concept occupies an exact position in the concept system. The concept layer is made up of a plurality of concepts, each of which has a unique ID in the intention knowledge graph.
The intention layer is the set of user search intentions and represents the real intentions behind users' search terms. Each intention has a unique ID in the intention knowledge graph. For example, the search term "Li Bai" corresponds to the entity "Li Bai, the Tang dynasty poet", and its intention may be "famous poems of Li Bai" or the like. In the intention knowledge graph, an intention is linked to a search term, that is, a second mapping relationship is formed between the search term and the intention. Note that the name of an entity is itself a search term; for example, the search term "Li Bai" is one search expression of the entity "Li Bai, the Tang dynasty poet". Note also that each intention in the intention layer may correspond to multiple intention expressions. For example, the "purchase" intention may have intention expressions such as "how much money is one", "price", and the like.
The resource layer records the resource links that satisfy a second mapping relationship between a search term and an intention. The resource data includes document data and/or multimedia data, where a document is a file generated by text editing software such as Word or Excel. The resource data is stored in a database and may be a simple piece of information, a complex article, a video, and so on. For example, for the second mapping relationship between the search term "Li Bai" and the intention "famous poems of Li Bai", a matching resource may be an article titled "Li Bai's ten famous poems and their analysis". The resource data is indexed by the second mapping relationship, which is uniquely represented by an edgeID, and multiple resources may be attached under each edgeID. Each piece of resource data has a first weight that represents the degree of matching between the resource data and the intention: the higher the degree of matching, the larger the first weight. When resource data satisfying an intention is returned to the user, resources with a larger first weight are returned first.
In the intention knowledge graph, each second mapping relationship (i.e., edgeID) has a second weight indicating the importance of the corresponding user intention. For example, suppose the second weight of the second mapping relationship (edgeID2) between the search term "British shorthair cat" under the entity "British shorthair cat" and the intention "difference from blue cat" is 0.9, and the second weight of the second mapping relationship (e.g., edgeID6) between the search term "British shorthair cat" and the intention "purchase" is 0.8. Then, when resources satisfying the user's intention are acquired, the resources with the higher second weight, i.e., those corresponding to "difference between British shorthair cat and blue cat", are acquired first.
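The interplay of the two weights can be sketched as a small data structure. This is only an illustration under assumptions: the class names `Resource` and `Edge`, the function `ranked_links`, the resource links, and the first-weight values are hypothetical; only edgeID2, edgeID6, and the second weights 0.9 and 0.8 come from the example above.

```python
from dataclasses import dataclass, field

@dataclass
class Resource:
    link: str             # hypothetical resource link
    first_weight: float   # degree of matching between resource and intention

@dataclass
class Edge:
    edge_id: str          # unique ID of a second mapping relationship
    term: str             # search term
    intent: str           # intention
    second_weight: float  # importance of the intention
    resources: list = field(default_factory=list)

def ranked_links(edges):
    """Order links by the edge's second weight, then by each resource's first weight."""
    ordered = []
    for edge in sorted(edges, key=lambda e: e.second_weight, reverse=True):
        for res in sorted(edge.resources, key=lambda r: r.first_weight, reverse=True):
            ordered.append(res.link)
    return ordered

edges = [
    Edge("edgeID6", "British shorthair cat", "purchase", 0.8,
         [Resource("res/price-guide", 0.7)]),
    Edge("edgeID2", "British shorthair cat", "difference from blue cat", 0.9,
         [Resource("res/blue-cat-short", 0.4), Resource("res/blue-cat-article", 0.95)]),
]
links = ranked_links(edges)
```

Here the resources under edgeID2 come before those under edgeID6, and within edgeID2 the better-matching article comes first.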
The intention knowledge graph generation method provided by one embodiment of the present disclosure is described in detail below with reference to fig. 2. As shown in fig. 2, the intention knowledge graph generating method includes the steps of:
Step 11, acquiring user historical retrieval information.
Preferably, the user historical retrieval information includes at least one of: historical search logs, fuzzy matching information based on search terms, and the preceding and following search terms within the same user session. The specific way of acquiring the user historical retrieval information belongs to the prior art and is not described here.
It should be noted that concept system information may also be acquired in this step, so that entities can subsequently be associated with concepts in the concept system.
Step 12, identifying the entities and intentions corresponding to the search terms according to the user historical retrieval information.
Specifically, for the acquired user historical retrieval information, the target entity and the search intention corresponding to each historical search term are identified by techniques such as word segmentation, named entity recognition, and proper noun recognition.
Step 13, establishing a first mapping relationship between the search terms and the entities, and a second mapping relationship between the search terms and the intentions.
Referring to fig. 1, the three search terms "British shorthair cat", "foreign shorthair cat", and "English short" all correspond to the entity "British shorthair cat", and the search term "Siamese cat" corresponds to the entity "Siamese cat". Each second mapping relationship is represented in fig. 1 by an arrow and can be uniquely identified by an edgeID.
Step 14, associating the entity with a bottom-layer concept in the concept system.
The concept system comprises a structure of at least two layers, and the entity is associated with a concept at its bottom layer.
Step 15, acquiring content information of the resource data.
Preferably, the resource data may include document data and/or multimedia data. The document data includes structured information and article information. Structured information is information that has been analyzed and decomposed into a plurality of interrelated components with a definite hierarchical structure; its use and maintenance are managed through a database and follow certain operating specifications. The multimedia data includes pictures, video data, and the like.
Acquiring the content information of the resource data specifically includes:
Determining the type of the resource data. If the resource data is document data, the entities and/or keywords contained in the document data are determined, preferably by word segmentation, named entity recognition, proper noun recognition, a document topic model, or other techniques.
If the resource data is multimedia data, the subject content tags of the multimedia data are determined, preferably by techniques such as content tagging and tag mining.
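The type dispatch in step 15 can be sketched as follows. A plain vocabulary lookup stands in for word segmentation, named entity recognition, and topic models, and pre-assigned tags stand in for content tagging and tag mining. The function name `content_info`, the resource dictionary layout, and the vocabularies are hypothetical.

```python
def content_info(resource, known_entities, known_keywords):
    """Return content information for a resource, depending on its type."""
    kind = resource["type"]
    if kind == "document":
        # Stand-in for NER / keyword extraction: substring lookup in the text.
        text = resource["text"].lower()
        return {
            "entities": [e for e in known_entities if e.lower() in text],
            "keywords": [k for k in known_keywords if k.lower() in text],
        }
    if kind == "multimedia":
        # Stand-in for content tagging / tag mining: use tags already attached.
        return {"tags": sorted(resource.get("tags", []))}
    raise ValueError(f"unsupported resource type: {kind}")

doc = {"type": "document", "text": "How much money is one Siamese cat? A price guide."}
video = {"type": "multimedia", "tags": ["siamese cat", "grooming"]}

doc_info = content_info(doc, ["Siamese cat", "British shorthair cat"], ["price", "ear mites"])
video_info = content_info(video, [], [])
```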
Step 16, establishing a link between the resource data and the second mapping relationship according to the content information and the identified intention, so as to generate the intention knowledge graph.
The specific process of establishing the link between the resource data and the second mapping relationship is described in detail later with reference to fig. 4.
As can be seen from steps 11 to 16, according to the present disclosure, the entities and intentions corresponding to search terms are identified from user historical retrieval information; a first mapping relationship between search terms and entities and a second mapping relationship between search terms and intentions are established; each entity is associated with a bottom-layer concept in a preset concept system; and links between resource data and the second mapping relationships are established according to the content information of the resource data and the identified intentions, thereby generating the intention knowledge graph. During intention identification, for an entity-type search term entered by the user, the corresponding entity is determined first, then all search terms under that entity are determined, the search term closest to the user's input is selected, and the corresponding second mapping relationship is determined, so that the related resource data is obtained.
Rather than analyzing the user intention directly from the search term or matching it against an intention dictionary, the present disclosure obtains the user's intention toward an entity from an entity-granularity intention knowledge graph, using the entity-type search term entered by the user. Compared with existing intention identification methods, the intention obtained is not based on category granularity but is refined to entity granularity, so the intention is expressed more accurately. Compared with existing schemes such as clustering, the intention knowledge graph links precisely to the entity, so the entity's intention set is obtained together with the semantic relationships between an intention and the entity's other intentions, the entity's upper concepts, the entity's lower sub-intentions, and so on. The resulting intention set is semantically interpretable, and intention identification and satisfaction are more accurate.
In the present disclosure, the second mapping relationship is a mapping between a search term and an intention rather than between an entity and an intention, because different search terms for the same entity carry different potential search intentions, and the resources that satisfy those intentions differ accordingly. For example, for the entity "Liu Shan", the corresponding search terms may be "Liu Shan" or "A Dou". The search intention corresponding to "Liu Shan" may be "Liu Shan too happy to think of Shu" or "why Liu Shan went straight", whereas the search intention corresponding to "A Dou" may be "why A Dou cannot be propped up".
Further, in order to reduce the amount of data operations and improve the processing efficiency, in the intention knowledge graph generating method provided in another embodiment of the present disclosure, after the entity and the intention corresponding to the search term are identified (i.e., step 12), before the first mapping relationship between the search term and the entity is established and the second mapping relationship between the search term and the intention is established (i.e., step 13), a step of data cleaning may be further included, that is, at least one of the following operations is performed on the identified entity and intention: cleaning, denoising, disambiguation and normalization, and optimization.
Further, in the intention knowledge graph generating method provided in another embodiment of the present disclosure, as shown in fig. 3, after establishing the link of the resource data and the second mapping relationship (i.e. step 16), the method further includes:
Step 17, setting a first weight for the resource data according to the degree of matching between the resource data and the intention.
By setting a first weight for the resource data, resources can be returned according to the first weight when resources satisfying the search intention are returned to the user: resources that better match the search intention are returned first, which improves the user experience.
After step 12, if the same intention corresponds to a plurality of intention expressions, a third mapping relationship between the same intention and the corresponding intention expressions is further established.
The following describes in detail a process of establishing a link between the resource data and the second mapping relationship with reference to fig. 4. As shown in fig. 4, the establishing a link between the resource data and the second mapping relationship according to the content information and the identified intention specifically includes the following steps:
Step 41, matching the content information of the resource data against the identified intention.
Step 42, judging whether the content information matches the identified intention; if so, executing step 16; otherwise, executing step 43.
Specifically, if the content information of the resource data matches the identified intention, a link between the resource data and the second mapping relationship is established according to the content information and the identified intention; if it does not match, the other intention expressions corresponding to the identified intention are determined, and the content information of the resource data is matched against those intention expressions (i.e., step 43 is executed).
Step 43, determining the corresponding intention expressions according to the third mapping relationship, and matching the content information against the intention expressions.
Step 44, judging whether the content information matches at least one intention expression; if so, executing step 16; otherwise, ending the flow.
Specifically, if the content information of the resource data matches at least one of the intention expressions, the resource data also satisfies the identified intention, and a link between the resource data and the second mapping relationship is established according to the content information and the identified intention (i.e., step 16 is executed); if the content information matches none of the intention expressions, the resource does not satisfy the intention, and the flow ends.
As can be seen from steps 41 to 44, a third mapping relationship is established between an intention and each of its intention expressions, and when the content information of the resource data does not match the intention itself, it is further matched against those intention expressions. The various intention expressions thus widen the range of resources that can potentially satisfy the demand when searching for resources that satisfy the intention.
For clarity, the above technical solution is described below through a specific example with reference to the structure of the intention knowledge graph of fig. 1. Suppose resources are to be found for the second mapping relationship between "Siamese cat" and "purchase", and there is an article that describes the price of Siamese cats but phrases its content as "how much money is one Siamese cat". Because the intention expression "how much money is one" exists under the "purchase" intention in the intention knowledge graph, it can be determined that the article satisfies the "purchase" intention, and the resource can be linked to the second mapping relationship.
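Steps 41 to 44 and the Siamese cat example can be sketched as follows. Simple case-insensitive substring matching stands in for whatever matching the implementation actually uses; the function names, the edge and article identifiers, and the second article are hypothetical.

```python
def satisfies_intent(content_text, intent, third_mapping):
    """Steps 41-44: match the intention itself, then its intention expressions."""
    text = content_text.lower()
    if intent.lower() in text:                    # steps 41-42: direct match
        return True
    expressions = third_mapping.get(intent, [])   # step 43: third mapping lookup
    return any(expr.lower() in text for expr in expressions)  # step 44

def link_resource(links, edge_id, resource_id, content_text, intent, third_mapping):
    """Attach the resource to the edge (step 16) only if it satisfies the intention."""
    if satisfies_intent(content_text, intent, third_mapping):
        links.setdefault(edge_id, []).append(resource_id)
        return True
    return False

# Third mapping relationship: intention -> intention expressions.
third_mapping = {"purchase": ["how much money is one", "price"]}
links = {}

linked = link_resource(links, "edge_siamese_purchase", "article_1",
                       "How much money is one Siamese cat?", "purchase", third_mapping)
skipped = link_resource(links, "edge_siamese_purchase", "article_2",
                        "Grooming tips for Siamese cats", "purchase", third_mapping)
```

The first article never mentions "purchase" but is still linked through the intention expression; the second matches neither and is left out.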
Further, in an intention knowledge graph generating method provided in another embodiment of the present disclosure, as shown in fig. 5, the method further includes the steps of:
Step 51, if an entity corresponds to a plurality of search terms, determining the intention corresponding to each search term of the entity.
Step 52, determining the importance of each intention according to the user historical retrieval information, and setting a second weight for the corresponding second mapping relationship according to the importance of each intention.
As can be seen from steps 51 and 52, by setting a second weight for each second mapping relationship, resources with a higher second weight are acquired and returned first when resources satisfying the user's intention are acquired. When the number of resources returned to the user is limited, the more important resource information is therefore returned first, which better matches the search habits of most users and improves the user experience.
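The disclosure does not specify how importance is derived from historical retrieval information. One possible sketch, purely as an assumption, uses the frequency of each (search term, intention) pair in the history, normalized by the most frequent pair; the function name `second_weights` and the counts are hypothetical.

```python
from collections import Counter

def second_weights(history):
    """Assumed scheme: weight = pair frequency / highest pair frequency.

    history: non-empty list of (search_term, intention) pairs for one entity.
    """
    counts = Counter(history)
    top = max(counts.values())
    return {pair: n / top for pair, n in counts.items()}

history = (
    [("British shorthair cat", "difference from blue cat")] * 4
    + [("British shorthair cat", "purchase")] * 2
)
weights = second_weights(history)
```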
Further, in an intention knowledge graph generating method provided by another embodiment of the present disclosure, as shown in fig. 6, after the intention knowledge graph is generated, the method further includes a step of expanding the intention knowledge graph, the step of expanding the intention knowledge graph including:
Step 61, judging whether non-shared intentions exist among the search terms corresponding to different entities under the same bottom-layer concept; if so, executing step 62; otherwise, ending the flow.
Specifically, if non-shared intentions exist among the search terms corresponding to different entities under the same bottom-layer concept, intentions are supplemented for those entities in the intention knowledge graph (i.e., step 62 is executed); if no non-shared intention exists, that is, the intentions corresponding to the different entities are the same, the intention knowledge graph does not need to be expanded, and the flow ends.
Step 62, supplementing, according to the non-shared intentions, the corresponding intentions for the different entities that belong to the same bottom-layer concept in the intention knowledge graph.
Specifically, each non-shared intention is supplemented under the corresponding entities.
Step 63, establishing a second mapping relationship between the search terms under each entity supplemented with an intention and the supplemented intention.
The specific implementation of this step is the same as that of step 13 and is not repeated here.
Step 64, acquiring corresponding resource data according to the established second mapping relationship, and establishing a link between the acquired resource data and the established second mapping relationship.
The specific implementation of this step is the same as that of step 16 and is not repeated here.
As can be seen from steps 61 to 64, by judging whether different entities under the same concept have non-shared intentions, supplementing the non-shared intentions under the other entities, and acquiring the corresponding resource data, the intentions corresponding to the search terms of different entities under the same concept can be shared to a certain extent, so that the intention knowledge graph is expanded. Compared with existing intention dictionary schemes, the present disclosure can expand the intentions of an existing intention knowledge graph, so that intention identification and satisfaction cover a wider open domain. Expanding entity intentions in the intention knowledge graph widens the range over which resources are attached to intentions and further improves the intention knowledge graph, so that intention identification based on it is more accurate.
For clarity, the above technical solution is described below through a specific example with reference to the structure of the intention knowledge graph of fig. 1. The entities "British shorthair cat" and "Siamese cat" both belong to the concept "pet cat". The entity "British shorthair cat" corresponds to three search terms: "British shorthair cat", "foreign shorthair cat", and "English short". The intentions corresponding to the search term "British shorthair cat" are "purchase" and "difference from blue cat", and the intention corresponding to the search terms "foreign shorthair cat" and "English short" is "reason for lacrimation". The entity "Siamese cat" corresponds to one search term, "Siamese cat", whose intention is "ear mites". Non-shared intentions therefore exist between the search terms of the two entities. For the entity "British shorthair cat", the intention "ear mites" needs to be supplemented: second mapping relationships are established between each of its three search terms ("British shorthair cat", "foreign shorthair cat", and "English short") and the intention "ear mites"; the resource data for "British shorthair cat ear mites", "foreign shorthair cat ear mites", and "English short ear mites" is acquired; and links between each piece of resource data and the corresponding second mapping relationship are established, thereby expanding the intention knowledge graph. For the entity "Siamese cat", the intentions "reason for lacrimation", "purchase", and "difference from blue cat" and their resources can be expanded in the same way, which is not described in detail here.
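Steps 61 to 63 can be sketched on the pet cat example. The sketch assumes the graph is available as plain dictionaries; the function name `expand_intents` and the data layout are hypothetical, and step 64 (acquiring and linking resource data for the new edges) is omitted.

```python
def expand_intents(entities, entity_terms, term_intents):
    """Return the new (search term, intention) edges for entities under one concept."""
    # Intentions currently reachable from each entity through its search terms.
    entity_intents = {
        e: set().union(*(term_intents.get(t, set()) for t in entity_terms[e]))
        for e in entities
    }
    all_intents = set().union(*entity_intents.values())
    new_edges = []
    for e in entities:
        missing = sorted(all_intents - entity_intents[e])  # steps 61-62
        for intent in missing:
            for term in entity_terms[e]:                   # step 63
                new_edges.append((term, intent))
    return new_edges

entity_terms = {
    "British shorthair cat": ["British shorthair cat", "foreign shorthair cat", "English short"],
    "Siamese cat": ["Siamese cat"],
}
term_intents = {
    "British shorthair cat": {"purchase", "difference from blue cat"},
    "foreign shorthair cat": {"reason for lacrimation"},
    "English short": {"reason for lacrimation"},
    "Siamese cat": {"ear mites"},
}
new_edges = expand_intents(list(entity_terms), entity_terms, term_intents)
```

Each of the three British shorthair cat search terms gains an "ear mites" edge, and "Siamese cat" gains the three intentions it lacked.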
An intention identifying method is also provided in the embodiments of the present disclosure, as shown in fig. 7, the intention identifying method includes the following steps:
Step 71, acquiring the search term entered by the user.
Step 72, judging whether the search term entered by the user is an entity-type search term; if so, executing step 73; otherwise, ending the flow.
Specifically, an existing entity discrimination model can be used to determine whether the search term is an entity-type search term; the specific implementation is not described here.
Step 73, determining the corresponding entity according to the search term and the first mapping relationship.
In this step, the determined entity is the one that conforms to the search habits of most users, that is, the entity that most users mean when they enter the search term.
Step 74, determining the search terms corresponding to the entity according to the first mapping relationship, and determining from them the search term closest to the search term entered by the user.
Step 75, determining the second mapping relationship corresponding to the search term closest to the search term entered by the user.
Step 76, acquiring and returning the corresponding resource data according to the determined second mapping relationship and the link between the resource data and the second mapping relationship.
Preferably, the corresponding resource data is returned according to the first weight in the intention knowledge graph, so that resources that closely match the user's intention are returned first.
If multiple second mapping relationships are determined, the second weight corresponding to each second mapping relationship is acquired, and the corresponding resource data is acquired and returned according to these second weights, so that the more important resources are returned first.
It should be noted that the resource data may also be returned in descending order of confidence according to the type of the resource.
After the related resources are obtained, all of them or the first n may be taken, depending on product requirements, and placed on the product side (the entity card) for use.
Because of limited product space, an entity card can display only a limited amount of content, so the returned resources need to be filtered and screened and are finally displayed according to resource priority. Preferably, the priority of a returned resource may be determined according to the following formula:
(Formula: Figure GDA0002151512660000191)
where $I_{priority}$ denotes the priority of the resource; $\lambda_1$ and $\lambda_2$ are coefficients; and $n$ denotes the number of all resources under a given intention.
$C$ denotes the complexity of the intention. It can be set according to the number and size of the resources corresponding to the intention (word count for articles, duration for videos) and can be calculated and recorded in the intention knowledge graph.
$m_i$ denotes the importance of the resource to the entity and can be determined according to the type of the entity: for a person entity, structured information is more important; for pet and cultivated plant entities, articles and video resources are more important.
$s_i$ denotes the quality of the resource. For example, the quality of an article resource is calculated from the number of articles published by the author, the length of the article, the number of likes, and the like; it can be calculated and recorded in the intention knowledge graph.
As can be seen from steps 71 to 76, during intention identification, for an entity-type search term entered by the user, the corresponding entity is determined first, then all search terms under that entity are determined, the search term closest to the user's input is selected, and the corresponding second mapping relationship is determined, so that the related resource data is obtained. Rather than analyzing the user intention directly from the search term or matching it against an intention dictionary, the present disclosure obtains the user's intention toward an entity from an entity-granularity intention knowledge graph, using the entity-type search term entered by the user. Compared with existing intention identification methods, the intention obtained is not based on category granularity but is refined to entity granularity, so the intention is expressed more accurately. Compared with existing schemes such as clustering, the intention knowledge graph links precisely to the entity, so the entity's intention set is obtained together with the semantic relationships between an intention and the entity's other intentions, the entity's upper concepts, the entity's lower sub-intentions, and so on. The resulting intention set is semantically interpretable, and intention identification and satisfaction are more accurate.
For clarity, the above technical solution is described below through a specific example with reference to the structure of the intention knowledge graph of fig. 1. A search term entered by the user (for example, "English short") is acquired, and an existing entity search term discrimination model determines whether it is an entity-type search term. If "English short" is determined to be an entity-type search term, the entity corresponding to it (here, the entity "British shorthair cat" in the entity layer) is found in the intention knowledge graph by an entity linking technique (for example, word-pattern association). The search terms under the entity "British shorthair cat" are then looked up, and the search term identical to the input "English short" is selected preferentially according to string similarity. The second mapping relationship corresponding to the search term "English short" is determined to be edgeID4, and the resource link corresponding to edgeID4, such as the link "why the British shorthair cat lacrimates 3", is obtained from the resource layer. If there are multiple resource links, the resources are sorted by their first weights. If there are multiple intentions under the search term, the corresponding resources are acquired in order of the second weights of the second mapping relationships (i.e., edgeIDs).
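The lookup in steps 71 to 76 can be sketched as follows. Two simplifications are assumed: a search term counts as an entity-type search term when it appears in the first mapping (standing in for the entity discrimination model), and string similarity is computed with Python's `difflib.SequenceMatcher`. The function name `recognize`, the data layout, the link names, and all weights are hypothetical.

```python
from difflib import SequenceMatcher

def recognize(query, first_mapping, edges, resources):
    """Return resource links for a query, best intention and best match first."""
    entity = first_mapping.get(query)                  # steps 72-73 (simplified)
    if entity is None:
        return []
    terms = [t for t, e in first_mapping.items() if e == entity]   # step 74
    closest = max(terms, key=lambda t: SequenceMatcher(None, query, t).ratio())
    matched = [e for e in edges if e["term"] == closest]           # step 75
    matched.sort(key=lambda e: e["second_weight"], reverse=True)
    result = []
    for edge in matched:                                           # step 76
        ranked = sorted(resources.get(edge["edge_id"], []),
                        key=lambda r: r[1], reverse=True)
        result.extend(link for link, _ in ranked)
    return result

first_mapping = {
    "British shorthair cat": "British shorthair cat",
    "foreign shorthair cat": "British shorthair cat",
    "English short": "British shorthair cat",
}
edges = [
    {"edge_id": "edgeID2", "term": "British shorthair cat",
     "intent": "difference from blue cat", "second_weight": 0.9},
    {"edge_id": "edgeID4", "term": "English short",
     "intent": "reason for lacrimation", "second_weight": 0.7},
]
# edge_id -> list of (link, first_weight)
resources = {
    "edgeID4": [("link_b", 0.5), ("link_a", 0.8)],
    "edgeID2": [("link_c", 0.9)],
}
found = recognize("English short", first_mapping, edges, resources)
not_entity = recognize("weather today", first_mapping, edges, resources)
```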
In step 75, if no second mapping relationship can be determined for the search term closest to the search term entered by the user, the intention identification method may further include the following step: determining the second mapping relationships corresponding to the other search terms, where the other search terms are the search terms corresponding to the determined entity other than the one closest to the user's input. Through this step, intentions are shared within the same entity.
For clarity, the above technical solution is described below through a specific example. The search terms corresponding to the entity Li Bai, the Tang dynasty poet, may be: Li Bai, Li Taibai, and poetry. Suppose that, in the intention knowledge graph, the search term "poetry" has no corresponding intention information, while the search term "Li Bai" corresponds to the intention "why Li Bai is called Li Bai". When the user enters the search term "poetry", both "poetry" and "Li Bai" map to the entity "Li Bai", so the intention "why Li Bai is called Li Bai" is returned to the user. In this way, search terms share intentions with one another at entity granularity.
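The fallback described above can be sketched with two dictionaries. The function name `intents_for`, the entity ID, and the dictionary layout are hypothetical.

```python
def intents_for(query, first_mapping, second_mapping):
    """Return the query's own intentions, or those of sibling terms of the same entity."""
    entity = first_mapping[query]
    own = second_mapping.get(query, [])
    if own:
        return own
    shared = []
    for term, ent in first_mapping.items():
        if ent == entity and term != query:
            shared.extend(second_mapping.get(term, []))
    return shared

first_mapping = {
    "Li Bai": "poet_li_bai",
    "Li Taibai": "poet_li_bai",
    "poetry": "poet_li_bai",
}
second_mapping = {"Li Bai": ["why Li Bai is called Li Bai"]}

shared_intents = intents_for("poetry", first_mapping, second_mapping)
```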
According to the method, the preset intention knowledge graph is introduced, the entity-level intention analysis is carried out on the search words input by the user, the user intention is met according to various resources which are pre-hooked to the intention knowledge graph, and finally the intention knowledge graph is displayed on a product.
The establishment of the intention knowledge graph is an entity mapping realized by using knowledge graph correlation technology, namely, the user intention is cleaned, disambiguated, preferred and the like according to the historical retrieval information of the user, fuzzy matching of the retrieval words, the front and back links of the retrieval words under the same conversation, a concept library and the like, so that a complete user intention knowledge graph is established, and meanwhile, the corresponding resources are hung on the intention pairs according to the intention pair granularity of the retrieval words and the intention.
Intention identification and satisfaction are implemented by using the intention knowledge graph to analyze the entity corresponding to the user's search term and the user's search intention, finding the linked resources in the intention knowledge graph, and displaying the corresponding resources in products such as entity cards.
Based on the same technical concept, the embodiment of the present disclosure also provides an intention knowledge graph generating apparatus, as shown in fig. 8, including: a first acquisition module 81, an identification module 82, an establishment module 83, an association module 84, a second acquisition module 85, and a linking module 86.
The first obtaining module 81 is configured to obtain user history retrieval information.
The identification module 82 is configured to identify an entity and an intention corresponding to the search term according to the user history search information.
The establishing module 83 is configured to establish a first mapping relationship between the search term and the entity, and a second mapping relationship between the search term and the intention.
The associating module 84 is configured to associate the entity with a concept at a bottom layer in a preset concept hierarchy, where the concept hierarchy includes at least two layers of structures.
The second obtaining module 84 is configured to obtain content information of the resource data.
The link module 85 is configured to establish a link between the resource data and the second mapping relationship according to the content information and the identified intention, so as to generate an intention knowledge graph.
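As a non-limiting sketch of what the establishing and associating modules produce, the following assumes that entity and intention recognition has already labeled each historical search record. The names `Record`, `build_mappings`, and `concept_of` are hypothetical and do not appear in the disclosure.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set, Tuple


class Record(NamedTuple):
    """One historical search, already labeled (assumed input)."""
    term: str
    entity: str
    intent: str


def build_mappings(
    records: List[Record],
    concept_of: Dict[str, str],  # entity -> bottom-level concept (assumed lookup)
) -> Tuple[Dict[str, str], Dict[str, Set[str]], Dict[str, str]]:
    """Build the first mapping (term -> entity), the second mapping
    (term -> intentions), and the entity -> bottom-level concept association."""
    first: Dict[str, str] = {}
    second: Dict[str, Set[str]] = defaultdict(set)
    entity_concept: Dict[str, str] = {}
    for r in records:
        first[r.term] = r.entity
        second[r.term].add(r.intent)
        entity_concept[r.entity] = concept_of[r.entity]
    return first, dict(second), entity_concept
```

Linking resource data to the second mapping relationship is a separate step and is not covered by this sketch.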
Further, the establishing module 83 is further configured to, after the identifying module identifies the intention corresponding to the search term, establish a third mapping relationship between the same intention and each corresponding intention expression when the same intention corresponds to multiple intention expressions.
Preferably, the link module 85 is specifically configured to match the content information with the identified intention, and when the content information matches the identified intention, establish a link between the resource data and the second mapping relationship; and when the content information is not matched with the identified intention, determining corresponding intention expressions according to the third mapping relation, matching the content information with the intention expressions, and when the content information is matched with at least one intention expression, establishing a link between the resource data and the second mapping relation.
Further, as shown in fig. 9, the intention knowledge graph generating apparatus according to another embodiment of the present disclosure further includes a first setting module 86, where the first setting module 86 is configured to set a first weight for the resource data according to a matching degree when the content information matches the identified intention.
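The matching performed by the link module and the first weight can be sketched as below. The disclosure does not specify how the matching degree is computed; the word-overlap (Jaccard) score, the `threshold` parameter, and the function names are assumptions chosen only to make the sketch runnable. Using the best intention-expression score as the weight in the fallback case is likewise an assumption.

```python
from typing import Dict, List, Optional


def match_degree(content: str, phrase: str) -> float:
    """Jaccard overlap of lowercase word sets (a stand-in matcher)."""
    a, b = set(content.lower().split()), set(phrase.lower().split())
    return len(a & b) / len(a | b) if (a | b) else 0.0


def link_resource(
    content: str,
    intent: str,
    expressions: Dict[str, List[str]],  # third mapping: intent -> expressions
    threshold: float = 0.5,             # assumed cut-off for "matches"
) -> Optional[float]:
    """Return a first weight if the resource should be linked, else None."""
    score = match_degree(content, intent)
    if score >= threshold:
        return score                    # direct match with the intention
    # Fall back to the intention expressions from the third mapping.
    best = max(
        (match_degree(content, e) for e in expressions.get(intent, [])),
        default=0.0,
    )
    return best if best >= threshold else None
```

A returned value means a link between the resource data and the second mapping relationship would be established with that weight; `None` means no link.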
Preferably, in the first mapping relationship, one entity corresponds to at least one search term.
Further, an intention knowledge graph generating apparatus according to another embodiment of the present disclosure further includes a second setting module. As shown in fig. 10, the second setting module includes a second determining unit 871, a processing unit 872, and a setting unit 873.
The second determining unit 871 is configured to determine, when an entity corresponds to a plurality of search terms, an intention corresponding to each search term of the entity, respectively.
The processing unit 872 is configured to determine the importance of the intent according to the user history retrieval information.
The setting unit 873 is configured to set a second weight for the corresponding second mapping relationship according to the importance of the intention.
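One possible way to derive the second weight is sketched below. The disclosure only states that importance is determined from the user history retrieval information; taking importance as the share of historical searches for each (search term, intention) pair is an assumption, as are the function and parameter names.

```python
from collections import Counter
from typing import Dict, List, Tuple


def second_weights(
    history: List[Tuple[str, str]],  # (search term, intention) per past search
    terms_of_entity: List[str],      # search terms mapped to one entity
) -> Dict[Tuple[str, str], float]:
    """Weight each (term, intention) pair of one entity by its share of
    that entity's historical searches (assumed importance measure)."""
    counts = Counter((t, i) for t, i in history if t in terms_of_entity)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}
```

With this choice the weights of one entity sum to 1, so they can be compared directly when several second mapping relationships are candidates.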
Further, the intention knowledge graph generating apparatus provided by another embodiment of the present disclosure further includes an extension module. As shown in fig. 11, the extension module includes a second judging unit 881, an intention supplementing unit 882, a second mapping relationship supplementing unit 883, a resource obtaining unit 884, and a link supplementing unit 885.
The second judging unit 881 is configured to judge whether there is a non-shared intention among the search terms corresponding to different entities that belong to the same bottom-level concept.
The intention supplementing unit 882 is configured to, when the second judging unit 881 judges that such a non-shared intention exists, supplement, according to the non-shared intention, the intentions corresponding to the different entities belonging to the same bottom-level concept in the intention knowledge graph.
The second mapping relationship supplementing unit 883 is configured to establish a second mapping relationship between the search terms under the entity supplemented with the intention and the supplemented intention.
The resource obtaining unit 884 is configured to obtain corresponding resource data according to the established second mapping relationship.
The link supplementing unit 885 is configured to establish a link between the acquired resource data and the established second mapping relationship.
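The extension described above can be sketched as follows, assuming intentions are generic labels (for example "poems" or "biography") that can be transferred between entities of the same bottom-level concept. The function name `extend_graph` and the dictionary layout are hypothetical, and the resource acquisition and link steps are left out.

```python
from typing import Dict, List, Set


def extend_graph(
    concept_entities: List[str],         # entities under one bottom-level concept
    entity_terms: Dict[str, List[str]],  # entity -> its search terms
    second_mapping: Dict[str, Set[str]], # search term -> intentions (updated in place)
) -> Dict[str, Set[str]]:
    """Supplement non-shared intentions and return what was added per term."""
    # Intentions currently reachable from each entity via its search terms.
    entity_intents = {
        e: set().union(*(second_mapping.get(t, set()) for t in entity_terms[e]))
        for e in concept_entities
    }
    all_intents = set().union(*entity_intents.values())
    added: Dict[str, Set[str]] = {}
    for e in concept_entities:
        missing = all_intents - entity_intents[e]  # non-shared intentions
        if not missing:
            continue
        for t in entity_terms[e]:
            second_mapping.setdefault(t, set()).update(missing)
            added[t] = set(missing)
    return added
```

Each entry in the returned dictionary corresponds to a newly established second mapping relationship for which resource data would then be obtained and linked.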
Based on the same technical concept, the embodiment of the present disclosure also provides an intention recognition apparatus applied to an intention knowledge graph, where the intention knowledge graph includes: a first mapping relationship between a search term and an entity, a second mapping relationship between the search term and an intention, and a link between resource data and the second mapping relationship.
As shown in fig. 12, the intention identifying means includes: a judging module 121, a determining module 122 and a resource obtaining module 123.
The judging module 121 is configured to judge whether the search term input by the user is an entity search term.
The determining module 122 is configured to, when the judging module 121 judges that the search term input by the user is an entity search term, determine a corresponding entity according to the search term and the first mapping relationship; determine the search terms corresponding to the entity according to the first mapping relationship, and determine, from those search terms, the search term closest to the search term input by the user; and determine a second mapping relationship corresponding to that closest search term.
The resource obtaining module 123 is configured to obtain and return corresponding resource data according to the determined second mapping relationship and the link between the resource data and the second mapping relationship.
Preferably, the intention knowledge graph further includes a first weight of the resource data, and the resource obtaining module 123 is specifically configured to return corresponding resource data according to the first weight in the intention knowledge graph.
Preferably, the second mapping relationship has a second weight, and the resource obtaining module 123 is specifically configured to, when a plurality of second mapping relationships are determined, obtain the second weight corresponding to each second mapping relationship, and to obtain and return corresponding resource data according to the second weight corresponding to each second mapping relationship.
Further, the determining module 122 is further configured to, when a second mapping relationship corresponding to a search term closest to the search term input by the user cannot be determined, determine a second mapping relationship corresponding to another search term, where the another search term is a search term other than the search term closest to the search term input by the user in the search terms corresponding to the determined entity.
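The cooperation of the judging, determining, and resource obtaining modules can be sketched end to end as below. Several simplifications are assumptions of the sketch rather than features of the disclosure: a search term counts as an entity search term if it appears in the first mapping, closeness is measured with `difflib.SequenceMatcher`, and resources are ordered by the second weight first and the first weight second. The function name `recognize` and the data layout are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def recognize(
    query: str,
    first_mapping: Dict[str, str],                      # term -> entity
    second_mapping: Dict[str, List[Tuple[str, float]]], # term -> (intent, second weight)
    links: Dict[Tuple[str, str], List[Tuple[str, float]]],  # (term, intent) -> (resource, first weight)
) -> List[str]:
    """Return resource ids for `query`, best first."""
    entity = first_mapping.get(query)
    if entity is None:
        return []                       # not an entity search term
    terms = [t for t, e in first_mapping.items() if e == entity]
    # Closest search term first; the remaining terms serve as fallback.
    ranked = sorted(
        terms,
        key=lambda t: SequenceMatcher(None, query, t).ratio(),
        reverse=True,
    )
    for term in ranked:
        pairs = second_mapping.get(term)
        if pairs:
            break
    else:
        return []                       # no term of this entity has an intention
    results: List[str] = []
    for intent, _w2 in sorted(pairs, key=lambda p: p[1], reverse=True):
        resources = links.get((term, intent), [])
        for res, _w1 in sorted(resources, key=lambda r: r[1], reverse=True):
            results.append(res)
    return results
```

In this sketch the fallback to other search terms of the same entity happens inside the loop over `ranked`, so a query without its own second mapping relationship still yields the resources of a sibling term.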
An embodiment of the present disclosure further provides a server, including one or more processors and a storage device. The storage device stores one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the intention knowledge graph generating method or the intention identifying method provided in the foregoing embodiments.
The disclosed embodiments also provide a computer readable medium on which a computer program is stored, wherein the computer program, when executed, implements the intention knowledge graph generating method or the intention identifying method provided by the foregoing embodiments.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods disclosed above, and the functional modules/units in the apparatus, may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.
In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, features, characteristics and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics and/or elements described in connection with other embodiments, unless expressly stated otherwise, as would be apparent to one skilled in the art. It will, therefore, be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

Claims (20)

1. An intention knowledge graph generation method, wherein,
acquiring historical retrieval information of a user, and identifying an entity and an intention corresponding to a retrieval word according to the historical retrieval information of the user;
establishing a first mapping relation between a search term and an entity and a second mapping relation between the search term and an intention;
associating the entity with the concept at the bottom layer in a preset concept system, wherein the concept system at least comprises two layers of structures;
acquiring content information of resource data, and establishing a link between the resource data and the second mapping relation according to the content information and the identified intention so as to generate an intention knowledge graph;
after generating the intent knowledge graph, the method further comprises:
judging whether non-common intentions exist among retrieval words corresponding to different entities under the same concept at the bottom layer;
if so, extending the intent knowledge graph.
2. The method of claim 1, wherein, after the identifying the intent corresponding to the term, the method further comprises:
and if the same intention corresponds to multiple intention expressions, establishing a third mapping relation between the same intention and the corresponding intention expressions.
3. The method of claim 2, wherein the establishing a link between the resource data and the second mapping relationship based on the content information and the identified intent comprises:
matching the content information with the identified intention, and if the content information is matched with the identified intention, establishing a link between the resource data and the second mapping relation; if not, determining a corresponding intention expression according to the third mapping relation, matching the content information with the intention expression, and if the content information is matched with at least one intention expression, establishing a link between the resource data and the second mapping relation;
after establishing the link of the resource data with the second mapping relationship, the method further comprises: and setting a first weight for the resource data according to the matching degree.
4. The method of claim 1, wherein in the first mapping relationship, one entity corresponds to at least one term; the method further comprises the following steps:
if one entity corresponds to a plurality of search terms, respectively determining the corresponding intention of each search term of the entity;
and determining the importance of the intention according to the historical retrieval information of the user, and setting a second weight for a corresponding second mapping relation according to the importance of the intention.
5. The method of claim 1, wherein the step of expanding the intent knowledge graph comprises:
according to the non-common intents, respectively supplementing intents corresponding to different entities belonging to the same concept at the bottom layer in the intention knowledge graph;
establishing a second mapping relation between the search terms under the entity supplemented with the intention and the supplemented intention;
and acquiring corresponding resource data according to the established second mapping relation, and establishing a link between the acquired resource data and the established second mapping relation.
6. An intent recognition method, wherein the method is applied to an intent knowledge graph comprising: a first mapping relationship between a term and an entity, a second mapping relationship between a term and an intent, and a link of resource data to the second mapping relationship, the method comprising:
judging whether the search word input by the user is an entity search word, if so, determining a corresponding entity according to the search word and the first mapping relation;
determining a search term corresponding to the entity according to the first mapping relation, and determining a search term closest to the search term input by the user from the search terms;
determining a second mapping relation corresponding to a retrieval word closest to the retrieval word input by the user, wherein intentions in the second mapping relation comprise intentions of retrieval words corresponding to other entities of the entity under the same lowest-level concept, and the intentions are non-shared intentions among the retrieval words corresponding to the entities;
and acquiring and returning corresponding resource data according to the determined second mapping relation and the link between the resource data and the second mapping relation.
7. The method of claim 6, wherein a first weight of resource data is further included in the intent knowledge-graph;
the returning of the corresponding resource data specifically includes: and returning corresponding resource data according to the first weight in the intention knowledge graph.
8. The method of claim 6, wherein the second mapping relationship has a second weight;
the acquiring and returning corresponding resource data according to the determined second mapping relationship and the link between the resource data and the second mapping relationship specifically includes:
if a plurality of second mapping relations are determined, acquiring a second weight corresponding to each second mapping relation;
and acquiring and returning corresponding resource data according to the second weight corresponding to each second mapping relation.
9. The method of any one of claims 6-8, wherein the method further comprises:
and if the second mapping relation corresponding to the retrieval word closest to the retrieval word input by the user cannot be determined, determining second mapping relations corresponding to other retrieval words, wherein the other retrieval words are retrieval words except the retrieval word closest to the retrieval word input by the user in the determined retrieval words corresponding to the entity.
10. An intent-knowledge-graph generating apparatus, comprising: a first acquisition module, an identification module, an establishing module, an association module, a second acquisition module, a link module and an expansion module;
the first acquisition module is used for acquiring historical retrieval information of a user;
the identification module is used for identifying an entity and an intention corresponding to a search word according to the historical search information of the user;
the establishing module is used for establishing a first mapping relation between a search term and an entity and a second mapping relation between the search term and an intention;
the association module is used for associating the entity with the concept at the bottom layer in a preset concept system, wherein the concept system at least comprises two layers of structures;
the second acquisition module is used for acquiring the content information of the resource data;
the link module is used for establishing a link between the resource data and the second mapping relation according to the content information and the identified intention so as to generate an intention knowledge graph;
the expansion module is used for judging whether non-shared intentions exist among the retrieval words corresponding to different entities under the same concept at the bottom layer, and if yes, expanding the intention knowledge graph.
11. The intention knowledge graph generating apparatus of claim 10, wherein the establishing module is further configured to establish a third mapping relationship between the same intention and each corresponding intention expression when the same intention corresponds to a plurality of intention expressions after the identifying module identifies the intention corresponding to the search word.
12. The intent knowledge graph generating apparatus of claim 11, wherein the linking module is specifically configured to match the content information with the identified intent, establish a link of the resource data with the second mapping relationship when the content information matches the identified intent; when the content information is not matched with the identified intention, determining corresponding intention expressions according to the third mapping relation, matching the content information with the intention expressions, and when the content information is matched with at least one intention expression, establishing a link between the resource data and the second mapping relation;
the intention knowledge graph generating device further comprises a first setting module, wherein the first setting module is used for setting a first weight for the resource data according to the matching degree when the content information is matched with the identified intention.
13. The intent knowledge graph generating apparatus of claim 10 wherein, in the first mapping, one entity corresponds to at least one term; the intention knowledge graph generating device further comprises a second setting module, wherein the second setting module comprises a second determining unit, a processing unit and a setting unit;
the second determining unit is used for respectively determining the corresponding intention of each search term of an entity when the entity corresponds to a plurality of search terms;
the processing unit is used for determining the importance of the intention according to the user history retrieval information;
the setting unit is used for setting a second weight for the corresponding second mapping relation according to the importance of the intention.
14. The intention knowledge graph generating apparatus of claim 10, wherein the extension module includes a second judging unit, an intention supplementing unit, a second mapping relationship supplementing unit, a resource acquiring unit, and a link supplementing unit;
the second judging unit is used for judging whether the search terms corresponding to different entities under the same concept at the bottom layer have non-common intentions;
the intention supplementing unit is used for respectively supplementing the intentions corresponding to the different entities belonging to the same concept at the bottommost layer in the intention knowledge graph according to the non-shared intention when the second judging unit judges that the non-shared intention exists among the retrieval words corresponding to the different entities belonging to the same concept at the bottommost layer;
the second mapping relation supplementing unit is used for establishing a second mapping relation between the search terms under the entity supplemented with the intention and the supplemented intention;
the resource obtaining unit is used for obtaining corresponding resource data according to the established second mapping relation;
and the link supplementing unit is used for establishing a link between the acquired resource data and the established second mapping relation.
15. An intent recognition apparatus, applied to an intent knowledge graph, wherein the intent knowledge graph comprises: a first mapping relation between the search term and the entity, a second mapping relation between the search term and the intention, and a link between the resource data and the second mapping relation; the intention recognition apparatus comprises: a judging module, a determining module and a resource obtaining module;
the judging module is used for judging whether the search word input by the user is an entity search word;
the determining module is used for determining a corresponding entity according to the search term and the first mapping relation when the judging module judges that the search term input by the user is the entity search term; determining a search term corresponding to the entity according to the first mapping relation, and determining a search term closest to the search term input by the user from the search terms; determining a second mapping relation corresponding to a retrieval word closest to the retrieval word input by the user, wherein intentions in the second mapping relation comprise intentions of retrieval words corresponding to other entities of the entity under the same lowest-level concept, and the intentions are non-shared intentions among the retrieval words corresponding to the entities;
and the resource acquisition module is used for acquiring and returning corresponding resource data according to the determined second mapping relation and the link between the resource data and the second mapping relation.
16. The intent recognition apparatus of claim 15, wherein a first weight of resource data is further included in the intent knowledge-graph;
the resource obtaining module is specifically configured to return corresponding resource data according to the first weight in the intention knowledge graph.
17. The intent recognition apparatus of claim 15, wherein the second mapping relationship has a second weight;
the resource obtaining module is specifically configured to, when the determined second mapping relationships are multiple, obtain second weights corresponding to the second mapping relationships; and acquiring and returning corresponding resource data according to the second weight corresponding to each second mapping relation.
18. The intention recognition apparatus according to any one of claims 15-17, wherein the determining module is further configured to, when the second mapping relationship corresponding to the search word closest to the search word input by the user cannot be determined, determine second mapping relationships corresponding to other search words, which are the search words other than the search word closest to the search word input by the user, in the determined search words corresponding to the entity.
19. A server, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the intent knowledge graph generation method of any of claims 1-5 or the intent recognition method of any of claims 6-9.
20. A computer-readable medium on which a computer program is stored, wherein the program when executed implements the intent knowledge graph generation method of any one of claims 1-5 or the intent recognition method of any one of claims 6-9.
CN201910511702.9A 2019-06-13 2019-06-13 Intention knowledge graph generation method, intention identification method and device Active CN110263180B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910511702.9A CN110263180B (en) 2019-06-13 2019-06-13 Intention knowledge graph generation method, intention identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910511702.9A CN110263180B (en) 2019-06-13 2019-06-13 Intention knowledge graph generation method, intention identification method and device

Publications (2)

Publication Number Publication Date
CN110263180A CN110263180A (en) 2019-09-20
CN110263180B true CN110263180B (en) 2021-06-04

Family

ID=67918157

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910511702.9A Active CN110263180B (en) 2019-06-13 2019-06-13 Intention knowledge graph generation method, intention identification method and device

Country Status (1)

Country Link
CN (1) CN110263180B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015079575A1 (en) * 2013-11-29 2015-06-04 株式会社 東芝 Interactive support system, method, and program
CN104866593A (en) * 2015-05-29 2015-08-26 中国电子科技集团公司第二十八研究所 Database searching method based on knowledge graph
CN107679186A (en) * 2017-09-30 2018-02-09 北京奇虎科技有限公司 The method and device of entity search is carried out based on entity storehouse
CN107688614A (en) * 2017-08-04 2018-02-13 平安科技(深圳)有限公司 It is intended to acquisition methods, electronic installation and computer-readable recording medium
CN107807957A (en) * 2017-09-30 2018-03-16 北京奇虎科技有限公司 entity library generating method and device
CN108153901A (en) * 2018-01-16 2018-06-12 北京百度网讯科技有限公司 The information-pushing method and device of knowledge based collection of illustrative plates
CN109145153A (en) * 2018-07-02 2019-01-04 北京奇艺世纪科技有限公司 It is intended to recognition methods and the device of classification
CN109739964A (en) * 2018-12-27 2019-05-10 北京拓尔思信息技术股份有限公司 Knowledge data providing method, device, electronic equipment and storage medium
CN109871450A (en) * 2019-01-11 2019-06-11 北京光年无限科技有限公司 Based on the multi-modal exchange method and system for drawing this reading
CN109871543A (en) * 2019-03-12 2019-06-11 广东小天才科技有限公司 A kind of intention acquisition methods and system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106649878A (en) * 2017-01-07 2017-05-10 陈翔宇 Artificial intelligence-based internet-of-things entity search method and system
CN109145200A (en) * 2018-07-13 2019-01-04 百度在线网络技术(北京)有限公司 Promote method, apparatus, equipment and the computer storage medium showed
CN109829039B (en) * 2018-12-13 2023-06-09 平安科技(深圳)有限公司 Intelligent chat method, intelligent chat device, computer equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
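"Cross-media search and mining in social networks based on user intention understanding" (基于用户意图理解的社交网络跨媒体搜索与挖掘); 崔婉秋 et al.; CAAI Transactions on Intelligent Systems (《智能系统学报》); 20171109; Vol. 12, No. 6; pp. 761-769 *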
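"Research on a knowledge graph-based method for mining user search intention" (一种基于知识图谱的用户搜索意图挖掘方法的研究); 石刚; China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库信息科技辑》); 20170215; No. 02; pp. I138-4519 *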

Also Published As

Publication number Publication date
CN110263180A (en) 2019-09-20

Similar Documents

Publication Publication Date Title
CN110263180B (en) Intention knowledge graph generation method, intention identification method and device
CN107748754B (en) Knowledge graph perfecting method and device
CN110609902B (en) Text processing method and device based on fusion knowledge graph
WO2022116537A1 (en) News recommendation method and apparatus, and electronic device and storage medium
CN103823824B (en) Method and system for automatically building a text classification corpus from the Internet
WO2019200752A1 (en) Semantic understanding-based point of interest query method, device and computing apparatus
US8082248B2 (en) Method and system for document classification based on document structure and written style
CN110502621A (en) Answering method, question and answer system, computer equipment and storage medium
US8095539B2 (en) Taxonomy-based object classification
CN101957816B (en) Webpage metadata automatic extraction method and system based on multi-page comparison
CN102890713B (en) Music recommendation method based on the user's current geographic location and physical environment
US8577938B2 (en) Data mapping acceleration
US10311374B2 (en) Categorization of forms to aid in form search
CN103955529A (en) Internet information searching and aggregating presentation method
CN112632397A (en) Personalized recommendation method based on multi-type academic achievement portrait and mixed recommendation strategy
CN108021715B (en) Heterogeneous label fusion system based on semantic structure feature analysis
CN105989056A (en) Chinese news recommending system
CN109002432A (en) Synonym mining method and device, computer-readable medium, and electronic equipment
CN103778206A (en) Method for providing network service resources
Tekli An overview of cluster-based image search result organization: background, techniques, and ongoing challenges
CN114997288A (en) Design resource association method
CN107908749B (en) Character retrieval system and method based on search engine
Patwardhan et al. ViTag: Automatic video tagging using segmentation and conceptual inference
Maynard et al. Change management for metadata evolution
Kang et al. Recognising informative Web page blocks using visual segmentation for efficient information extraction.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant