CN116010611A

CN116010611A - Knowledge graph construction and information recommendation method and device and computer equipment

Info

Publication number: CN116010611A
Application number: CN202111225908.9A
Authority: CN
Inventors: 徐朕燃; 单子非; 王成浩; 户保田
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2021-10-21
Filing date: 2021-10-21
Publication date: 2023-04-25

Abstract

The application relates to a knowledge graph construction method, a knowledge graph construction device, a computer device, a storage medium and a computer program product. The method comprises the following steps: acquiring a reading text of a target user identifier in a preset time period, and acquiring an instance entity set based on the reading text; expanding based on the instance entity set to obtain an expanded instance entity set and an expanded concept entity set; obtaining instance entity weights, calculating instance similarity between the extended instance entities and the instance entities, and calculating the extended instance entity weights by using the instance entity weights and the instance similarity; calculating the concept similarity degree of the expanded concept entity and the target instance entity, and calculating the expanded concept entity weight by using the instance entity weight, the expanded instance entity weight and the concept similarity degree; and establishing an interest knowledge graph corresponding to the target user identifier based on the target instance entity set, the expanded concept entity set, the instance entity weight, the expanded instance entity weight and the expanded concept entity weight. By adopting the method, the accuracy of interest characterization is improved.

Description

Knowledge graph construction and information recommendation method and device and computer equipment

Technical Field

The present application relates to the field of internet technologies, and in particular to a knowledge graph construction method, an information recommendation method, and corresponding apparatuses, computer devices, storage media, and computer program products.

Background

With the development of artificial intelligence technology, knowledge graph technology has emerged. A knowledge graph is a modern theory that combines the theories and methods of disciplines such as applied mathematics, graphics, information visualization, and information science with methods such as bibliometric citation analysis and co-occurrence analysis, and uses visualized graphs to vividly display the core structure, development history, frontier fields, and overall knowledge architecture of a discipline, thereby achieving multidisciplinary fusion. Currently, a user's interests are typically characterized by a user profile. However, characterizing interests by a user profile alone may miss some of the user's interests, so the accuracy of the interest characterization is low.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a knowledge graph construction method, an information recommendation method, and corresponding apparatuses, computer devices, storage media, and computer program products that can improve the accuracy of user interest characterization.

A knowledge graph construction method, the method comprising:

acquiring a reading text of a target user identifier in a preset time period, and carrying out entity linking based on the reading text to obtain an instance entity set corresponding to the reading text;

performing instance expansion based on each instance entity in the instance entity set to obtain an expanded instance entity set, obtaining a target instance entity set based on the expanded instance entity set and the instance entity set, and performing concept expansion based on each target instance entity in the target instance entity set to obtain an expanded concept entity set;

acquiring the weight of each instance entity, wherein the weight of each instance entity refers to the occurrence number of each instance entity in a reading text and a target reading text in a target time period before a preset time period;

calculating the instance similarity between each extended instance entity and the associated instance entity in each instance entity, and calculating the weight of the extended instance entity by using the weight of each instance entity and the instance similarity to obtain the weight of each extended instance entity;

calculating the concept similarity between each extended concept entity in the extended concept entity set and the associated target instance entity in each target instance entity, and calculating the extended concept entity weight by using the weight of each instance entity, the weight of each extended instance entity and the concept similarity to obtain the weight of each extended concept entity;

and establishing an interest knowledge graph corresponding to the target user identifier based on the target instance entity set, the expanded concept entity set, the weights of all instance entities, the weights of all expanded instance entities and the weights of all expanded concept entities.

A knowledge graph construction apparatus, the apparatus comprising:

the instance acquisition module is used for acquiring a reading text of the target user identifier in a preset time period, and carrying out entity linking based on the reading text to obtain an instance entity set corresponding to the reading text;

the expansion module is used for performing instance expansion based on each instance entity in the instance entity set to obtain an expanded instance entity set, obtaining a target instance entity set based on the expanded instance entity set and the instance entity set, and performing concept expansion based on each target instance entity in the target instance entity set to obtain an expanded concept entity set;

the weight acquisition module is used for acquiring the weight of each instance entity, wherein the weight of each instance entity refers to the number of occurrences of the instance entity in the reading text and in the target reading text of the target time period before the preset time period;

the instance weight calculation module is used for calculating the instance similarity between each extended instance entity and the associated instance entity among the instance entities, and calculating the extended instance entity weight by using the weight of each instance entity and the instance similarity to obtain the weight of each extended instance entity;

the concept weight calculation module is used for calculating the concept similarity between each extended concept entity in the extended concept entity set and the associated target instance entity among the target instance entities, and calculating the extended concept entity weight by using the weight of each instance entity, the weight of each extended instance entity and the concept similarity to obtain the weight of each extended concept entity;

the graph building module is used for establishing an interest knowledge graph corresponding to the target user identifier based on the target instance entity set, the expanded concept entity set, the instance entity weights, the expanded instance entity weights and the expanded concept entity weights.

A computer device comprising a memory storing a computer program and a processor which when executing the computer program performs the steps of:

A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:

A computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of:

According to the knowledge graph construction method, apparatus, computer device, storage medium and computer program product described above, the reading text of the target user identifier in the preset time period is acquired, and entity linking is performed based on the reading text to obtain an instance entity set corresponding to the reading text. Instance expansion is performed based on each instance entity in the instance entity set to obtain an expanded instance entity set, a target instance entity set is obtained based on the expanded instance entity set and the instance entity set, and concept expansion is performed based on each target instance entity in the target instance entity set to obtain an expanded concept entity set. The weight of each instance entity is then acquired, the weight of each expanded instance entity is calculated by using the instance entity weights and the instance similarity, and the weight of each expanded concept entity is calculated by using the expanded instance entity weights, the instance entity weights and the concept similarity. Finally, an interest knowledge graph corresponding to the target user identifier is established by using the target instance entity set, the expanded concept entity set, the instance entity weights, the expanded instance entity weights and the expanded concept entity weights, so that the established interest knowledge graph improves the accuracy of the user's interest characterization.

An information recommendation method, the method comprising:

receiving an information recommendation instruction, wherein the information recommendation instruction carries a user identifier and a query statement;

extracting query keywords from the query statement to obtain target keywords;

acquiring an interest knowledge graph corresponding to the user identifier, wherein the interest knowledge graph is established by: acquiring a reading text of a target user identifier in a preset time period; performing instance expansion and concept expansion based on each instance entity in the reading text to obtain an expanded entity set; acquiring the weight of each instance entity in the reading text; calculating the weight of each expanded entity in the expanded entity set based on the instance entity weights; and building the graph from the instance entities in the reading text, the expanded entity set, the instance entity weights and the expanded entity weights;

and determining a target interest entity from the interest knowledge graph, acquiring recommendation information based on the target keyword and the target interest entity, and returning the recommendation information to the terminal corresponding to the user identifier.

An information recommendation apparatus, the apparatus comprising:

the instruction receiving module is used for receiving an information recommendation instruction, wherein the information recommendation instruction carries a user identifier and a query statement;

the extraction module is used for extracting query keywords from the query statement to obtain target keywords;

the graph acquisition module is used for acquiring an interest knowledge graph corresponding to the user identifier, wherein the interest knowledge graph is established by: acquiring a reading text of the target user identifier in a preset time period; performing instance expansion and concept expansion based on each instance entity in the reading text to obtain an expanded entity set; acquiring the weight of each instance entity in the reading text; calculating the weight of each expanded entity in the expanded entity set based on the instance entity weights; and building the graph from the instance entities in the reading text, the expanded entity set, the instance entity weights and the expanded entity weights;

and the recommendation module is used for determining a target interest entity from the interest knowledge graph, acquiring recommendation information based on the target keyword and the target interest entity, and returning the recommendation information to the terminal corresponding to the user identifier.

In the information recommendation method, apparatus, computer device, storage medium and computer program product described above, an information recommendation instruction carrying a user identifier and a query statement is received; query keywords are extracted from the query statement to obtain target keywords; a corresponding target interest entity is determined from the interest knowledge graph corresponding to the user identifier; and recommendation information is then obtained by using the target keywords and the target interest entity.

Drawings

FIG. 1 is an application environment diagram of a knowledge graph construction method in one embodiment;

FIG. 2 is a flow chart of a knowledge graph construction method in one embodiment;

FIG. 3 is a flow diagram of obtaining an instance entity set in one embodiment;

FIG. 4 is a flow diagram of entity linking in one embodiment;

FIG. 5 is a flow diagram of entity disambiguation in one embodiment;

FIG. 6 is a schematic diagram of a framework for entity disambiguation in one embodiment;

FIG. 7 is a flow diagram of obtaining an expanded instance entity set in one embodiment;

FIG. 8 is a flow diagram of obtaining an expanded concept entity set in one embodiment;

FIG. 9 is a flow diagram of obtaining a first set of expanded concept entities and a second set of expanded concept entities in one embodiment;

FIG. 10 is a flow diagram of obtaining expanded instance entity weights in one embodiment;

FIG. 11 is a flow diagram of obtaining expanded concept entity weights in one embodiment;

FIG. 12 is a flow diagram of obtaining a first expanded concept entity weight and a second expanded concept entity weight in one embodiment;

FIG. 13 is a flow diagram of obtaining instance weight decay factors in one embodiment;

FIG. 14 is a graph of time interval versus instance weight decay factor in one embodiment;

FIG. 15 is a schematic diagram of an interest knowledge graph in one embodiment;

FIG. 16 is a schematic diagram of concept entity weights in the interest knowledge graph in the embodiment of FIG. 15;

FIG. 17 is a schematic diagram of another interest knowledge graph in the embodiment of FIG. 15;

FIG. 18 is a schematic diagram of concept entity weights in another interest knowledge graph in the embodiment of FIG. 15;

FIG. 19 is a flow chart of an information recommendation method according to an embodiment;

FIG. 20 is a flowchart of a knowledge graph construction method in an embodiment;

FIG. 21 is a flow chart of constructing a knowledge-graph of interest in one embodiment;

FIG. 22 is a schematic diagram of a knowledge graph of personal interests in an embodiment;

FIG. 23 is a block diagram showing a knowledge graph construction apparatus in one embodiment;

FIG. 24 is a block diagram showing an information recommending apparatus according to an embodiment;

FIG. 25 is an internal block diagram of a computer device in one embodiment;

FIG. 26 is an internal structural view of a computer device in another embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

Natural language processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies the theories and methods that enable effective communication between people and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics. Research in this field involves natural language, that is, the language people use every day, so it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.

The solution provided by the embodiments of the present application relates to artificial intelligence technologies such as knowledge graphs, and is described through the following embodiments:

The knowledge graph construction method provided by the embodiments of the present application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. A data storage system may store the data that the server 104 needs to process; it may be integrated on the server 104 or located on a cloud or other network server. The terminal 102 may send a knowledge graph construction instruction to the server 104. According to the instruction, the server 104 obtains from the data storage system the reading text of the target user identifier in a preset time period, and performs entity linking based on the reading text to obtain an instance entity set corresponding to the reading text. The server 104 performs instance expansion based on each instance entity in the instance entity set to obtain an expanded instance entity set, obtains a target instance entity set based on the expanded instance entity set and the instance entity set, and performs concept expansion based on each target instance entity in the target instance entity set to obtain an expanded concept entity set. The server 104 acquires the weight of each instance entity, where the weight of each instance entity refers to the number of occurrences of the instance entity in the reading text and in the target reading text of a target time period before the preset time period. The server 104 calculates the instance similarity between each extended instance entity and the associated instance entity among the instance entities, and calculates the extended instance entity weight by using the weight of each instance entity and the instance similarity, obtaining the weight of each extended instance entity. The server 104 calculates the concept similarity between each extended concept entity in the extended concept entity set and the associated target instance entity among the target instance entities, and calculates the extended concept entity weight by using the weight of each instance entity, the weight of each extended instance entity and the concept similarity, obtaining the weight of each extended concept entity. The server 104 establishes an interest knowledge graph corresponding to the target user identifier based on the target instance entity set, the extended concept entity set, the weights of the instance entities, the weights of the extended instance entities and the weights of the extended concept entities. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, Internet of Things device, or portable wearable device. The Internet of Things device may be a smart speaker, smart television, smart air conditioner, smart in-vehicle device, or the like. The portable wearable device may be a smart watch, smart bracelet, headset, or the like. The server 104 may be implemented as a stand-alone server or as a server cluster composed of multiple servers.

In one embodiment, as shown in FIG. 2, a knowledge graph construction method is provided. The method is described as applied to the server in FIG. 1 by way of illustration; it is to be understood that the method can also be applied to a terminal, or to a system including the terminal and the server and implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:

Step 202, obtaining a reading text of a target user identifier in a preset time period, and carrying out entity linking based on the reading text to obtain an instance entity set corresponding to the reading text.

The target user identifier uniquely identifies the user for whom the knowledge graph is to be established. The preset time period is a period of time set in advance, for example one day, one week, or one month. Reading text refers to text that the user has browsed, and may be of various types, such as news text, advertisement text, or chat text. Entity linking links entity words found in text to the corresponding entities in a knowledge base. An instance entity set is the set of instance entities in the reading text, and an instance entity is an entity of the instance type. A knowledge graph is a large knowledge network in which nodes represent entities and edges between nodes represent relationships between entities. Entities include both concepts and instances.

Specifically, the server may obtain the text browsed by the target user identifier in the preset time period from the database to obtain the reading text, and may also collect the text browsed by the target user identifier in the preset time period from the internet to obtain the reading text. The server can acquire all texts browsed by the target user identifier in a preset time period from the service server side, and takes all texts as reading texts. And then the server carries out entity linking on the reading text to obtain an instance entity set corresponding to the reading text. In one embodiment, the server may also perform named entity recognition on the reading text to obtain an instance entity set corresponding to the reading text.

Step 204, performing instance expansion based on each instance entity in the instance entity set to obtain an expanded instance entity set, obtaining a target instance entity set based on the expanded instance entity set and the instance entity set, and performing concept expansion based on each target instance entity in the target instance entity set to obtain an expanded concept entity set.

The extended instance entity set is the set of extended instance entities, where an extended instance entity is an instance entity obtained by expansion according to preset association relationships between instance entities. The target instance entity set is the set of target instance entities, which comprise the extended instance entities in the extended instance entity set and the instance entities in the instance entity set. The extended concept entity set is the set of extended concept entities, where an extended concept entity is a concept entity obtained by expansion from a target instance entity according to preset association relationships between concept entities and instance entities. A concept entity is an entity of the concept type.

Specifically, the server expands each instance entity in the instance entity set according to the preset association relationships between instance entities, obtaining the instance entities associated with each instance entity and thereby the expanded instance entity set. The server then obtains the target instance entity set based on the expanded instance entity set and the instance entity set. Finally, the server expands each target instance entity in the target instance entity set according to the preset association relationships between instance entities and concept entities, obtaining the concept entities associated with each target instance entity and thereby the expanded concept entity set.
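The two expansion passes of step 204 can be sketched as follows. This is a minimal illustration, not the implementation of the application: the association tables `INSTANCE_RELATIONS` and `CONCEPT_RELATIONS`, the entity names, and the choice to exclude already-read entities from the expanded instance set are all assumptions made for the example.

```python
from typing import Dict, Set, Tuple

# Hypothetical preset association tables; the source does not specify their contents.
INSTANCE_RELATIONS: Dict[str, Set[str]] = {
    "Entity A": {"Entity B"},
    "Entity C": {"Entity B", "Entity D"},
}
CONCEPT_RELATIONS: Dict[str, Set[str]] = {
    "Entity A": {"Concept X"},
    "Entity B": {"Concept X", "Concept Y"},
    "Entity D": {"Concept Y"},
}


def expand(entities: Set[str], relations: Dict[str, Set[str]]) -> Set[str]:
    """Collect every entity associated with any of the input entities."""
    expanded: Set[str] = set()
    for entity in entities:
        expanded |= relations.get(entity, set())
    return expanded


def build_entity_sets(
    instance_entities: Set[str],
) -> Tuple[Set[str], Set[str], Set[str]]:
    # Instance expansion. Assumption: entities the user already read
    # are not counted again as expanded instance entities.
    expanded_instances = expand(instance_entities, INSTANCE_RELATIONS) - instance_entities
    # Target instance entities = original instance entities + expanded ones.
    target_instances = instance_entities | expanded_instances
    # Concept expansion runs over every target instance entity.
    expanded_concepts = expand(target_instances, CONCEPT_RELATIONS)
    return expanded_instances, target_instances, expanded_concepts
```

Calling `build_entity_sets({"Entity A", "Entity C"})` with the tables above yields the expanded instances `Entity B` and `Entity D`, and the expanded concepts `Concept X` and `Concept Y`.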

Step 206, acquiring the weight of each instance entity, where the weight of each instance entity refers to the number of occurrences of the instance entity in the reading text and in the target reading text of a target time period before the preset time period.

The target time period before the preset time period is a period of time, set in advance, that precedes the preset time period, and the target reading text refers to all texts browsed by the user in the target time period. For example, if the preset time period is May 12, the target time period may be the 10 days before May 12, that is, May 2 to May 11, and all texts browsed by the user from May 2 to May 11 are acquired as the target reading text. The instance entity weight refers to the number of occurrences of the instance entity in the reading text and in the target reading text. Entity weights are used to characterize the user's degree of interest in an entity.

Specifically, the server acquires the occurrence number corresponding to each instance entity, and takes the occurrence number corresponding to each instance entity as the weight of each instance entity. The server may count the number of occurrences of the instance entity in the target reading text in the target time period in advance, then count the number of occurrences of the instance entity in the reading text, and finally calculate the sum of the number of occurrences, and take the sum of the final number of occurrences as the instance entity weight.
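The counting described above can be sketched with a `Counter`. The function name and the input shape (one list entry per linked occurrence of an entity) are assumptions for illustration; the source only states that the two occurrence counts are summed.

```python
from collections import Counter
from typing import Dict, Iterable


def instance_entity_weights(
    reading_mentions: Iterable[str],
    target_mentions: Iterable[str],
) -> Dict[str, int]:
    """Weight of each instance entity = occurrences in the reading text
    plus occurrences in the earlier target reading text.

    Each input holds one entry per linked occurrence of an entity.
    """
    current = Counter(reading_mentions)   # counts in the preset time period
    prior = Counter(target_mentions)      # counts precomputed for the target time period
    # Only entities found in the reading text belong to the instance entity set.
    return {entity: current[entity] + prior[entity] for entity in current}
```

For example, with `["A", "B", "A"]` as the reading-text mentions and `["A", "C"]` as the target-period mentions, entity `A` receives weight 3 and entity `B` receives weight 1; `C` is not in the instance entity set and gets no weight.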

Step 208, calculating the instance similarity between each extended instance entity and the associated instance entity among the instance entities, and calculating the extended instance entity weight by using the weight of each instance entity and the instance similarity to obtain the weight of each extended instance entity.

Wherein, the instance similarity is used for representing the similarity between the extended instance entity and the associated instance entity. The extended instance entity weight refers to the entity weight of the extended instance type.

Specifically, the server may vectorize each extended instance entity and each instance entity to obtain extended instance entity vectors and instance entity vectors. The server then calculates the similarity between each extended instance entity vector and the associated instance entity vector by using a similarity algorithm, such as a cosine similarity algorithm or a distance-based similarity algorithm, to obtain the instance similarity. Finally, the server weights the corresponding instance similarity by the weight of each instance entity to obtain the weight of each extended instance entity.
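A minimal sketch of this step, using cosine similarity over plain Python lists. The aggregation used here (summing instance weight times similarity over all associated instance entities) is one plausible reading of the "weighted calculation"; this passage does not fix the exact formula, and the function and parameter names are hypothetical.

```python
import math
from typing import Dict, Iterable, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extended_instance_weight(
    extended_vector: Sequence[float],
    associated_instances: Iterable[str],
    instance_weights: Dict[str, float],
    instance_vectors: Dict[str, Sequence[float]],
) -> float:
    # Assumed aggregation: sum of (instance weight x instance similarity)
    # over the instance entities associated with this extended entity.
    return sum(
        instance_weights[name] * cosine_similarity(extended_vector, instance_vectors[name])
        for name in associated_instances
    )
```

Under this reading, an extended entity that is very similar to a frequently read instance entity inherits most of that entity's weight, while a loosely related one inherits little.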

In one embodiment, after performing instance expansion based on each instance entity in the instance entity set to obtain the expanded instance entity set, the server directly calculates the instance similarity between each expanded instance entity and the associated instance entity among the instance entities, and calculates the expanded instance entity weight by using the weight of each instance entity and the instance similarity to obtain the weight of each expanded instance entity. After the weight of each expanded instance entity is obtained, the server obtains the target instance entity set based on the expanded instance entity set and the instance entity set, and performs concept expansion based on each target instance entity in the target instance entity set to obtain the expanded concept entity set.

Step 210, calculating the concept similarity degree between each extended concept entity in the extended concept entity set and the associated target instance entity in each target instance entity, and calculating the extended concept entity weight by using the weight of each instance entity, the weight of each extended instance entity and the concept similarity degree to obtain the weight of each extended concept entity.

The concept similarity characterizes the degree of similarity between an extended concept entity and the associated target instance entity. The extended concept entity weight refers to the weight of an entity of the extended concept type.

Specifically, the server may vectorize each target instance entity and each extended concept entity to obtain extended concept entity vectors and target instance entity vectors. The server calculates the similarity between each extended concept entity vector and the associated target instance entity vector by using a similarity algorithm, such as a cosine similarity algorithm or a distance-based similarity algorithm, to obtain the concept similarity. The server then performs a weighted calculation using the instance entity weights, the extended instance entity weights and the concept similarity to obtain the weight of each extended concept entity.
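The concept weight calculation can be sketched in the same spirit. The code assumes the concept similarities have already been computed and are supplied as a lookup keyed by (concept, instance) pairs. As with the instance case, summing weight times similarity over the associated target instance entities is an assumed aggregation, and all names are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def extended_concept_weight(
    concept: str,
    associated_targets: Iterable[str],
    instance_weights: Dict[str, float],
    extended_instance_weights: Dict[str, float],
    concept_similarity: Dict[Tuple[str, str], float],
) -> float:
    total = 0.0
    for entity in associated_targets:
        # A target instance entity is either an original instance entity
        # or an extended one; take its weight from the matching table.
        if entity in instance_weights:
            weight = instance_weights[entity]
        else:
            weight = extended_instance_weights.get(entity, 0.0)
        # Assumed aggregation: weight x concept similarity, summed.
        total += weight * concept_similarity[(concept, entity)]
    return total
```

With an instance entity `A` of weight 3, an extended instance entity `B` of weight 1.5, and similarities of 0.5 and 1.0 to concept `X` respectively, the concept weight under this reading is 3 × 0.5 + 1.5 × 1.0 = 3.0.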

Step 212, establishing an interest knowledge graph corresponding to the target user identifier based on the target instance entity set, the extended concept entity set, the weights of the instance entities, the weights of the extended instance entities and the weights of the extended concept entities.

The interest knowledge graph is a knowledge graph obtained from the reading text of the target user identifier in the preset time period. It comprises each target instance entity, each extended concept entity, each instance entity weight, each extended concept entity weight, and the association relationships among the entities.

Specifically, the server establishes an initial knowledge graph according to the association relationship among the target instance entity set, the expanded concept entity set and each entity, and then performs weight setting on the entities in the initial knowledge graph according to the weights of each instance entity, each expanded instance entity and each expanded concept entity to obtain an interest knowledge graph corresponding to the target user identifier.
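The two-stage construction (initial graph first, weights second) can be sketched with a small in-memory structure. The `InterestGraph` class, its fields, and the edge representation as pairs are assumptions for illustration; the source does not prescribe a storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set, Tuple


@dataclass
class InterestGraph:
    node_types: Dict[str, str] = field(default_factory=dict)      # entity -> "instance" | "concept"
    node_weights: Dict[str, float] = field(default_factory=dict)  # entity -> interest weight
    edges: Set[Tuple[str, str]] = field(default_factory=set)      # association relationships


def build_interest_graph(
    target_instances: Iterable[str],
    extended_concepts: Iterable[str],
    relations: Iterable[Tuple[str, str]],
    *weight_tables: Dict[str, float],
) -> InterestGraph:
    graph = InterestGraph()
    # Stage 1: initial graph from the entities and their associations.
    for entity in target_instances:
        graph.node_types[entity] = "instance"
    for concept in extended_concepts:
        graph.node_types[concept] = "concept"
    for source, target in relations:
        # Keep only associations whose endpoints are both in the graph.
        if source in graph.node_types and target in graph.node_types:
            graph.edges.add((source, target))
    # Stage 2: set node weights from the supplied weight tables
    # (instance, extended instance, extended concept).
    for table in weight_tables:
        for entity, weight in table.items():
            if entity in graph.node_types:
                graph.node_weights[entity] = weight
    return graph
```

Keeping weights in a separate mapping from the topology makes it straightforward to update weights later without rebuilding the graph structure.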

In the knowledge graph construction method described above, the reading text of the target user identifier in the preset time period is acquired, and entity linking is performed based on the reading text to obtain an instance entity set corresponding to the reading text. Instance expansion is performed based on each instance entity in the instance entity set to obtain an expanded instance entity set, a target instance entity set is obtained based on the expanded instance entity set and the instance entity set, and concept expansion is performed based on each target instance entity in the target instance entity set to obtain an expanded concept entity set. The weight of each instance entity is then acquired, the weight of each expanded instance entity is calculated by using the instance entity weights and the instance similarity, and the weight of each expanded concept entity is calculated by using the expanded instance entity weights, the instance entity weights and the concept similarity. Finally, an interest knowledge graph corresponding to the target user identifier is established by using the target instance entity set, the expanded concept entity set, the instance entity weights, the expanded instance entity weights and the expanded concept entity weights, so that the established interest knowledge graph improves the accuracy of the user's interest characterization.

In one embodiment, as shown in fig. 3, step 202, performing entity linking based on the reading text to obtain an instance entity set corresponding to the reading text includes:

step 302, performing entity word recognition based on the reading text to obtain each entity word.

Wherein, the entity word refers to the word obtained by recognition in the reading text, and the entity word is not linked to the node in the knowledge base.

Specifically, the server may perform a string search on the reading text using all names in a preset alias table, and quickly identify the boundaries of each entity word with a string matching algorithm. The string matching algorithm may be a multi-pattern matching algorithm such as Aho-Corasick, which preprocesses the pattern strings into a deterministic finite automaton and then scans the text. The alias table stores words whose names are not identical but whose semantics are consistent, such as russian and Zhou Shu. In a particular embodiment, the alias table may be derived from hyperlinks in Chinese wiki pages.
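As an illustration of the multi-pattern search described above, the following is a compact sketch of an Aho-Corasick automaton built over alias names. The function names `build_automaton` and `find_entity_words` and the sample aliases are assumptions for illustration; the source does not specify an implementation.

```python
from collections import deque


def build_automaton(patterns):
    """Build the goto, failure and output tables of an Aho-Corasick automaton."""
    goto, fail, out = [{}], [0], [[]]
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first pass: depth-1 states keep failure link 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit patterns that end at the failure state (proper suffixes).
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_entity_words(text, automaton):
    """Scan the text once and return (start_index, matched_alias) pairs."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits


# Hypothetical alias names, for illustration only.
automaton = build_automaton(["New York", "York", "Big Apple"])
hits = find_entity_words("I love New York, the Big Apple.", automaton)
```

The scan reports overlapping matches (here both "New York" and "York"), so a later step would still need to choose among overlapping entity word boundaries, for example by preferring the longest match.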

The server may also use a sequence tagging algorithm, or a named entity recognition model, to identify the individual entity words in the reading text.

Step 304, performing entity recall from a preset knowledge base based on each entity word to obtain the candidate entity set corresponding to each entity word.

The preset knowledge base refers to a preset database containing entities. The candidate entity set refers to all relevant entity sets of entity words in a preset knowledge base. The candidate entity set includes individual candidate entities.

Specifically, the server recalls each entity word from a preset knowledge base to obtain all relevant entity sets, and a candidate entity set corresponding to each entity word is obtained. The server can also use the alias table to carry out entity recall to obtain candidate entity sets corresponding to each entity word.
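Recall through the alias table can be pictured as a dictionary lookup from a surface name to the knowledge-base entities that name may refer to. The sketch below assumes a hypothetical in-memory `alias_table` and the hypothetical function `recall_candidates`; a real knowledge base would be queried through its own interface.

```python
def recall_candidates(entity_words, alias_table):
    """Return, for every entity word, the sorted list of candidate entities."""
    return {word: sorted(alias_table.get(word, ())) for word in entity_words}


# Hypothetical alias table: surface name -> set of knowledge-base entities.
alias_table = {
    "Apple": {"Apple Inc.", "apple (fruit)"},
    "Jobs": {"Steve Jobs"},
}
candidates = recall_candidates(["Apple", "Jobs", "Pear"], alias_table)
```

An entity word with no entry (here "Pear") simply receives an empty candidate set and cannot be linked.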

Step 306, performing entity disambiguation based on the candidate entity sets respectively corresponding to the entity words to obtain the entities respectively corresponding to the entity words, and obtaining the instance entity set corresponding to the reading text based on the entities respectively corresponding to the entity words.

The entity disambiguation is used for ranking the candidate entities in the candidate entity set and selecting the most similar candidate entity as the entity linking result.

Specifically, the server may perform entity disambiguation on the candidate entity set corresponding to each entity word, that is, select the most similar entity from the candidate entity set as the entity corresponding to the entity word. And traversing the candidate entity sets corresponding to all the entity words to obtain the entity corresponding to each entity word, and forming an instance entity set corresponding to the reading text. In a specific embodiment, as shown in fig. 4, a flow chart of entity linking is shown, where entity word recognition is performed on a reading text to obtain entity word labeling text, then each candidate entity corresponding to an entity word is recalled from a knowledge base, a score of each candidate entity is obtained by calculating a similarity score of the entity word and each candidate entity, and then a candidate entity with the largest score is selected to obtain a target entity corresponding to the entity word.

In one embodiment, as shown in fig. 5, step 306, performing entity disambiguation based on the candidate entity sets corresponding to each entity word respectively, to obtain the entities corresponding to each entity word respectively, includes:

step 502, determining a current entity word from all the entity words, and obtaining an entity text corresponding to the current entity word;

step 504, inputting the entity text and the corresponding candidate entity set into an entity disambiguation model, wherein the entity disambiguation model maps the entity text and the corresponding candidate entity set into a vector space respectively to obtain an entity word vector corresponding to the current entity word and a candidate entity vector set corresponding to the candidate entity set, calculates the similarity degree of the entity word vector and the candidate entity vector in the candidate entity vector set respectively, and determines the current entity corresponding to the current entity word from the candidate entity set based on the similarity degree.

Wherein, the current entity word refers to the entity word of the corresponding entity to be determined. The entity text refers to the context text corresponding to the current entity word. The entity disambiguation model is used for scoring the similarity of the input entity words and the candidate entities. The entity disambiguation model is trained in advance using a neural network algorithm, and may be a dual encoder model. The current entity refers to the entity corresponding to the current entity word.

Specifically, the server takes each entity word in turn as the current entity word and obtains the entity text corresponding to the current entity word from the reading text. The entity text and the corresponding candidate entity set are then input into the entity disambiguation model. The entity disambiguation model maps the entity text and the corresponding candidate entity set into a vector space to obtain the entity word vector corresponding to the current entity word and the candidate entity vector set corresponding to the candidate entity set, calculates the similarity degree between the entity word vector and each candidate entity vector in the candidate entity vector set, and then selects the candidate entity with the highest similarity degree as the current entity corresponding to the current entity word. In a specific embodiment, fig. 6 shows a framework diagram for entity disambiguation: the entity word context and the candidate entity context are each segmented to obtain word segmentation results, the word segmentation results are encoded by an encoder and input into a feedforward neural network for vectorization to obtain an entity word vector and a candidate entity vector, the similarity between the entity word vector and each candidate entity vector is calculated, and the candidate entity with the maximum similarity is selected as the entity corresponding to the entity word.
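The final selection step can be sketched as below. The encoders are out of scope here, so the vectors are hand-written stand-ins for what a dual encoder would produce. Cosine similarity is used as the similarity measure, which is an assumption, since the source does not name the measure used inside the disambiguation model. The names `cosine` and `disambiguate` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def disambiguate(mention_vec, candidate_vecs):
    """Score every candidate against the mention vector and return the best one."""
    scores = {name: cosine(mention_vec, vec) for name, vec in candidate_vecs.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Stand-in vectors; a trained encoder would produce these from context text.
mention_vec = [0.9, 0.1]
candidate_vecs = {"Apple Inc.": [1.0, 0.0], "apple (fruit)": [0.0, 1.0]}
best, scores = disambiguate(mention_vec, candidate_vecs)
```

Because the entity word and the candidates are encoded independently, candidate vectors can be computed once and reused across reading texts, which is the usual motivation for a dual encoder design.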

In the embodiment, the instance entity set is obtained through entity link, so that the accuracy of the obtained instance entity set is improved.

In one embodiment, as shown in fig. 7, step 204, that is, performing an instance expansion based on each instance entity in the instance entity set, obtains an expanded instance entity set, includes:

step 702, determining a current instance entity from the instance entity set, and searching associated candidate instance entities in a preset knowledge base according to a preset association relationship by using the current instance entity.

The current instance entity refers to the instance entity that currently needs to be expanded. The preset association relationship refers to a preset association relationship between instances, for example, instances expanded through a relatedTo relationship. Candidate instance entities refer to instance entities that require further confirmation.

Specifically, the server can expand through the upper level, lower level, related attributes and the like in the preset knowledge base. The server selects an instance entity from the instance entity set as a current instance entity, and then searches each candidate instance entity associated with the current instance entity in a preset knowledge base according to a preset association relation.

Step 704, calculating the instance similarity between the current instance entity and each candidate instance entity, and selecting the expansion instance entity associated with the current instance entity from each candidate instance entity based on the instance similarity.

The instance similarity is used for representing the similarity degree of the current instance entity and the candidate instance entity, and the higher the instance similarity degree is, the more similar the current instance entity and the candidate instance entity are.

Specifically, the server may obtain a current instance entity vector corresponding to the current instance entity and candidate instance entity vectors corresponding to the candidate instance entities, and then calculate the instance similarity between the current instance entity vector and each candidate instance entity vector using a cosine similarity algorithm. Each instance similarity is compared with a preset expansion stop threshold. When the instance similarity is not lower than the expansion stop threshold, the corresponding candidate instance entity is selected as an expansion instance entity. When the instance similarity is lower than the expansion stop threshold, expansion stops, that is, the corresponding candidate instance entity is not selected as an expansion instance entity.

Step 706, traversing each instance entity in the instance entity set to obtain an extended instance entity set.

Specifically, the server returns to determine the current instance entity from the instance entity set, and uses the current instance entity to search for each associated candidate instance entity in a preset knowledge base according to a preset association relationship, until all the instance entities in the instance entity set are traversed, and an extended instance entity set is obtained based on all the extended instance entities obtained by extension.
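Steps 702 to 706 can be sketched as a loop over the instance entities that keeps only the candidates whose cosine similarity reaches the expansion stop threshold. The `related` lookup table, the vectors and the threshold value are illustrative assumptions, and `expand_instances` is a hypothetical name.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_instances(instance_entities, related, vectors, stop_threshold):
    """Return {expanded entity: {source instance entity: instance similarity}}."""
    expanded = {}
    for current in instance_entities:            # step 706: traverse the set
        for cand in related.get(current, ()):    # step 702: associated candidates
            sim = cosine(vectors[current], vectors[cand])  # step 704
            if sim >= stop_threshold:            # "not lower than" the threshold
                expanded.setdefault(cand, {})[current] = sim
    return expanded


# Illustrative data only.
related = {"Python": ["Java", "Snake"]}
vectors = {"Python": [1.0, 0.0], "Java": [0.8, 0.6], "Snake": [0.0, 1.0]}
expanded = expand_instances(["Python"], related, vectors, stop_threshold=0.5)
```

Recording which instance entity each expanded entity came from, together with the similarity, keeps the information needed later when the expanded instance entity weights are calculated.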

In the embodiment, the accuracy of the obtained extended instance entity set is improved by calculating the instance similarity between the current instance entity and each candidate instance entity and selecting the extended instance entity associated with the current instance entity from each candidate instance entity based on the instance similarity.

In one embodiment, as shown in fig. 8, step 204, that is, performing concept expansion based on each target instance entity in the target instance entity set, to obtain an expanded concept entity set, includes:

step 802, obtaining an instance relation and a sub-class relation, and performing concept expansion based on each target instance entity in a preset knowledge base according to the instance relation to obtain a first expansion concept entity set.

The instance relationship refers to the relationship between an instance and a concept in the knowledge graph, namely an instanceOf relationship. The sub-class relationship refers to the relationship between a parent concept and a child concept in the knowledge graph, namely a subClassOf relationship, and the parent and child concepts can span multiple levels. The first extended concept entity set refers to the set of first extended concept entities obtained by extending through the instance relationship.

Specifically, the server acquires an instance relation and a sub-class relation from a preset knowledge base, and searches each concept entity associated with each target instance entity in the preset knowledge base according to the instance relation to obtain a first expanded concept entity set.

Step 804, performing concept expansion based on each first expansion concept entity in the first expansion concept entity set in a preset knowledge base according to the subclass relation to obtain a second expansion concept entity set.

Wherein the second extended concept entity set refers to the set of second extended concept entities obtained by extending through the subclass relationship.

Specifically, the server searches each concept entity associated with each first expansion concept entity in the first expansion concept entity set according to the subclass relation in the preset knowledge base to obtain a second expansion concept entity set.

Step 806, obtaining an extended concept entity set based on the first extended concept entity set and the second extended concept entity set.

Specifically, the server obtains an extended concept entity set according to all the first extended concept entities and all the second extended concept entities.
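Steps 802 to 806 amount to a two-hop traversal: first along instanceOf edges from the target instance entities, then along subClassOf edges from the concepts reached. The sketch below assumes the knowledge base is available as (head, relation, tail) triples and that subClassOf points from the child concept to the parent concept. Both are assumptions, and similarity filtering is deliberately left out. `expand_concepts` is a hypothetical name.

```python
def expand_concepts(target_instances, triples):
    """Return (first set, second set, union) of expanded concept entities."""
    instance_of, subclass_of = {}, {}
    for head, rel, tail in triples:
        if rel == "instanceOf":
            instance_of.setdefault(head, set()).add(tail)
        elif rel == "subClassOf":
            subclass_of.setdefault(head, set()).add(tail)

    first = set()
    for inst in target_instances:        # step 802: follow instanceOf
        first |= instance_of.get(inst, set())

    second = set()
    for concept in first:                # step 804: follow subClassOf
        second |= subclass_of.get(concept, set())

    return first, second, first | second  # step 806: combine both sets


# Illustrative triples only.
triples = [
    ("Python", "instanceOf", "programming language"),
    ("Java", "instanceOf", "programming language"),
    ("programming language", "subClassOf", "computer language"),
]
first, second, concepts = expand_concepts(["Python", "Java"], triples)
```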

In one embodiment, as shown in fig. 9, step 802, that is, performing concept expansion based on each target instance entity in a preset knowledge base according to an instance relationship, to obtain a first expanded concept entity set, includes:

step 902, determining a current target instance entity from all target instance entities, and searching associated first candidate expansion concept entities in a preset knowledge base according to instance relations by using the current target instance entity.

The current target instance entity refers to a target instance entity needing to expand concept entities currently. The first candidate expanded concept entity refers to a candidate first expanded concept entity.

Specifically, the server searches all associated concept entities in a preset knowledge base according to the instance relation by using the current target instance entity, and takes each concept entity as a first candidate expansion concept entity.

Step 904, calculating first concept similarity degrees of the current target instance entity and each first candidate expanded concept entity, and selecting a first expanded concept entity corresponding to the current target instance entity from each first candidate expanded concept entity based on the first concept similarity degrees.

The first concept similarity is used for representing the similarity between the current target instance entity and the first candidate expansion concept entity, and the higher the similarity is, the more similar the current target instance entity and the first candidate expansion concept entity are.

Specifically, the server may obtain a current target instance entity vector corresponding to the current target instance entity and first candidate expansion concept entity vectors corresponding to the first candidate expansion concept entities from a preset knowledge base, then calculate cosine similarity between the current target instance entity vector and the first candidate expansion concept entity vectors by using a cosine similarity algorithm, obtain each first concept similarity degree, compare each first concept similarity degree with a preset expansion stop threshold, and select the first candidate expansion concept entity corresponding to the first concept similarity degree as the first expansion concept entity when the first concept similarity degree is not lower than the expansion stop threshold. And stopping expanding when the first concept similarity degree is lower than an expansion stopping threshold value, namely, the first candidate expansion concept entity corresponding to the first concept similarity degree is not selected as the first expansion concept entity.

Step 906, traversing each target instance entity in the target instance entity set to obtain a first extended concept entity set.

Specifically, the server returns to determine a current target instance entity from all target instance entities, and searches for each associated first candidate expansion concept entity in a preset knowledge base according to an instance relation by using the current target instance entity until each target instance entity in the target instance entity set is traversed, and a first expansion concept entity set is obtained based on each selected first expansion concept entity.

In one embodiment, as shown in fig. 9, step 804, that is, performing concept expansion based on each first expansion concept entity in the first expansion concept entity set in the preset knowledge base according to the subclass relationship, to obtain a second expansion concept entity set, includes:

step 908, determining a current first extended concept entity from the first extended concept entity set, and searching each associated second candidate extended concept entity in a preset knowledge base according to a subclass relation by using the current first extended concept entity.

The current first expansion concept entity refers to a first expansion concept entity needing to be further expanded. The second candidate expanded concept entity refers to a candidate second expanded concept entity.

Specifically, the server selects a first expansion concept entity from the first expansion concept entity set to obtain a current first expansion concept entity, searches all concept entities associated with the current first expansion concept entity in a preset knowledge base according to a subclass relation, and takes each associated concept entity as each second candidate expansion concept entity.

Step 910, calculating the second concept similarity degree of the current first expanded concept entity and each second candidate expanded concept entity, and selecting the second expanded concept entity corresponding to the current first expanded concept entity from each second candidate expanded concept entity based on the second concept similarity degree.

The second concept similarity is used for representing the similarity between the current first expanded concept entity and the second candidate expanded concept entity. The higher the similarity, the more similar the current first expanded concept entity and the second candidate expanded concept entity are.

Specifically, the server may obtain, from a preset knowledge base, a current first expansion concept entity vector corresponding to the current first expansion concept entity and a second candidate expansion concept entity vector corresponding to each second candidate expansion concept entity, and calculate, using a cosine similarity algorithm, the similarity between the current first expansion concept entity vector and each second candidate expansion concept entity vector, so as to obtain each second concept similarity degree. Each second concept similarity degree is compared with a preset expansion stop threshold. When the second concept similarity degree is not lower than the expansion stop threshold, the corresponding second candidate expansion concept entity is selected as a second expansion concept entity. When the second concept similarity degree is lower than the expansion stop threshold, expansion stops, that is, the corresponding second candidate expansion concept entity is not selected as a second expansion concept entity.

Step 912, traversing each first extended concept entity in the first extended concept entity set to obtain a second extended concept entity set.

Specifically, the server returns to the step of determining the current first expanded concept entity from the first expanded concept entity set and searching, according to the subclass relation, for each associated second candidate expanded concept entity in the preset knowledge base using the current first expanded concept entity, and repeats this step until all the first expanded concept entities in the first expanded concept entity set have been traversed. A second expanded concept entity set is then obtained based on the selected second expanded concept entities.

In the above embodiment, the concept expansion is performed based on each target instance entity in the preset knowledge base according to the instance relation to obtain the first expanded concept entity set, and the concept expansion is performed based on each first expanded concept entity in the first expanded concept entity set in the preset knowledge base according to the sub-category relation to obtain the second expanded concept entity set, so that the expanded concept entity set is obtained, and the accuracy of the obtained expanded concept entity set is improved.

In one embodiment, as shown in fig. 10, step 208, calculating the instance similarity between each extended instance entity and the associated instance entity in each instance entity, and calculating the extended instance entity weight by using the weight of each instance entity and the instance similarity, to obtain the weight of each extended instance entity, including:

Step 1002, determining a current extended instance entity from the extended instance entities, and determining an instance entity associated with the current extended instance entity from the instance entities.

Step 1004, obtaining the associated instance entity weight corresponding to the associated instance entity from the instance entity weights, and determining the associated instance similarity between the current extended instance entity and the associated instance entity from the instance similarity.

The current expansion instance entity refers to an expansion instance entity needing to calculate weight currently.

Specifically, when the server needs to calculate the weight of an extended instance entity, it selects the current extended instance entity from the extended instance entities and then obtains, from the instance entities, the instance entities associated with the current extended instance entity. In one embodiment, there may be a plurality of instance entities associated with the current extended instance entity, for example at least two, because the current extended instance entity may have been extended from different instance entities. The server also obtains the weight of each instance entity associated with the current extended instance entity. It then obtains, from the instance similarities, the associated instance similarity between the current extended instance entity and each associated instance entity, where the instance similarity refers to the similarity between an instance entity and the current extended instance entity.

Step 1006, calculating the product of the entity weight of the associated instance and the similarity degree of the associated instance to obtain the corresponding weight of the expanded instance of the current expanded instance entity.

Specifically, when the current expansion instance entity has only one associated instance entity, the server calculates the product of the weight of the associated instance entity and the corresponding associated instance similarity, and takes the product as the expansion instance weight corresponding to the current expansion instance entity. When the current expansion instance entity has a plurality of associated instance entities, the server calculates the product of each associated instance entity weight and the corresponding associated instance similarity, and then calculates the sum of all the products to obtain the expansion instance weight corresponding to the current expansion instance entity. In a specific embodiment, the expanded instance weights may be calculated using equation (1) as shown below.

$$W_I(I_{related}) = \sum_{v \in N_i} W_I(v) \cdot W_E(v, I_{related}) \qquad (1)$$

Wherein $I_{related}$ represents an extended instance entity and $W_I(I_{related})$ represents the weight of the extended instance. $v$ denotes an associated instance entity, and $N_i$ represents the set of instance entities in the reading text that are associated with $I_{related}$. $W_I(v)$ represents the instance entity weight. $W_E(v, I_{related})$ represents the instance similarity between the instance entity $v$ associated with the extended instance entity $I_{related}$ and the extended instance entity $I_{related}$. In one particular embodiment, the instance similarity may be calculated using equation (2) as shown below.

$$W_E(v_i, v_j) = \frac{y_{v_i} \cdot y_{v_j}}{\lVert y_{v_i} \rVert \, \lVert y_{v_j} \rVert} \qquad (2)$$

Wherein $W_E(v_i, v_j)$ represents the instance similarity, which can also be understood as an edge weight of the knowledge graph. $v_i$ represents an instance entity, $y_{v_i}$ represents the word vector of the instance entity, $v_j$ represents the associated extended instance entity, and $y_{v_j}$ represents the word vector of the extended instance entity.

Step 1008, traversing each extended instance entity to obtain the weight of each extended instance entity.

Specifically, the server returns to the step of determining the current extended instance entity from the extended instance entities and determining, from the instance entities, the instance entity associated with the current extended instance entity, and repeats this step until all the extended instance entities have been traversed, so as to obtain the extended instance entity weight corresponding to each extended instance entity.
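The weight calculation of steps 1002 to 1008 (the sum, over the associated instance entities, of instance entity weight times instance similarity) can be sketched as follows. The dictionary layout and the sample numbers are illustrative assumptions, and `extended_instance_weights` is a hypothetical name.

```python
def extended_instance_weights(instance_weights, similarities):
    """Weight of each extended instance entity.

    instance_weights: {instance entity: weight}
    similarities:     {extended entity: {associated instance entity: similarity}}
    """
    return {
        ext: sum(instance_weights[v] * sim for v, sim in assoc.items())
        for ext, assoc in similarities.items()
    }


# Illustrative numbers only.
instance_weights = {"Python": 0.6, "Rust": 0.4}
similarities = {
    "Java": {"Python": 0.5},              # one associated instance entity
    "C++": {"Python": 0.5, "Rust": 0.5},  # expanded from two instance entities
}
weights = extended_instance_weights(instance_weights, similarities)
```

With these numbers "Java" receives 0.6 × 0.5 and "C++" receives 0.6 × 0.5 + 0.4 × 0.5, so an entity reachable from several things the user has read accumulates more weight.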

In one embodiment, as shown in fig. 11, step 210, namely calculating the concept similarity degree between each extended concept entity in the extended concept entity set and the associated target instance entity among the target instance entities, and calculating the extended concept entity weights by using the instance entity weights, the extended instance entity weights and the concept similarity degree, to obtain the extended concept entity weights, includes:

step 1102, determining a first extended concept entity set and a second extended concept entity set from the extended concept entities, wherein the first extended concept entity set is obtained by a preset instance relation based on the target instance entity set, and the second extended concept entity set is obtained by a preset sub-category relation based on the first extended concept entity set.

Specifically, the server divides each extended concept entity to obtain a first extended concept entity set and a second extended concept entity set, wherein the first extended concept entity set is obtained by a preset instance relation based on a target instance entity in a target instance entity set, and the second extended concept entity in the second extended concept entity set is obtained by a preset subclass relation based on the first extended concept entity in the first extended concept entity set.

Step 1104, calculating a first concept similarity degree between each first expanded concept entity in the first expanded concept entity set and an associated target instance entity in each target instance entity, determining a target instance entity weight associated with each first expanded concept entity from each instance entity weight and each expanded instance entity weight, and calculating a first expanded concept entity weight based on the first concept similarity degree and the associated target instance entity weight to obtain each first expanded concept entity weight.

Wherein the first concept similarity is used to characterize a degree of similarity between the first expanded concept entity and the associated target instance entity. The target instance entity weight refers to the weight of the instance entity associated with the first extended concept entity. The first extended concept entity weight refers to the weight of the first extended concept entity.

Specifically, the server may calculate, using a cosine similarity algorithm, a similarity between each first extended concept entity in the first extended concept entity set and an associated target instance entity in each target instance entity, so as to obtain each first concept similarity degree. And then determining the target instance entity weight corresponding to the target instance entity associated with each first expansion concept entity from the instance entity weights and the expansion instance entity weights, and calculating the first expansion concept entity weight by using the similarity of each first concept and the associated target instance entity weight to obtain each first expansion concept entity weight.

Step 1106, calculating a second concept similarity degree between each second expanded concept entity in the second expanded concept entity set and the associated first expanded concept entity among the first expanded concept entities, determining a first expanded concept entity weight associated with each second expanded concept entity from each first expanded concept entity weight, and calculating a second expanded concept entity weight based on the second concept similarity degree and the associated first expanded concept entity weight to obtain each second expanded concept entity weight.

Wherein the second concept similarity is used to characterize a degree of similarity between the second expanded concept entity and the associated first expanded concept entity. The second extended concept entity weight refers to the weight of the second extended concept entity.

Specifically, the server may calculate, using a cosine similarity algorithm, the similarity between each second extended concept entity in the second extended concept entity set and the associated first extended concept entity, so as to obtain each second concept similarity degree. The server then determines, from the first expanded concept entity weights, the first expanded concept entity weight corresponding to the first expanded concept entity associated with each second expanded concept entity, and calculates the second expanded concept entity weight by using each second concept similarity degree and the associated first expanded concept entity weight, to obtain each second expanded concept entity weight.

Step 1108, obtaining each extended concept entity weight based on each first extended concept entity weight and each second extended concept entity weight.

Specifically, after obtaining the weight of each first extended concept entity and the weight of each second extended concept entity, the server obtains the weight of the extended concept entity corresponding to each extended concept entity.
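Steps 1102 to 1108 propagate weights in two hops: from target instance entities to first extended concept entities, and from those to second extended concept entities. The sketch below is a minimal illustration. It assumes that, when a concept entity has several associated source entities, the products are summed in the same way as for extended instance entities; the text above does not state the combination rule at this point. All names and numbers are hypothetical.

```python
def propagate_weights(source_weights, similarities):
    """Weight of each target = sum of (source weight x similarity).

    Summing over several sources is an assumption made for this sketch.
    """
    return {
        target: sum(source_weights[src] * sim for src, sim in assoc.items())
        for target, assoc in similarities.items()
    }


# Target instance entities = instance entities plus extended instance entities.
instance_w = {"Python": 0.6}
extended_instance_w = {"Java": 0.3}
target_w = {**instance_w, **extended_instance_w}

# Illustrative concept similarities (for example, cosine similarities).
first_sims = {"programming language": {"Python": 0.5, "Java": 0.5}}
second_sims = {"computer language": {"programming language": 0.5}}

first_w = propagate_weights(target_w, first_sims)    # step 1104
second_w = propagate_weights(first_w, second_sims)   # step 1106
concept_w = {**first_w, **second_w}                  # step 1108
```

Because each hop multiplies by a similarity no greater than 1, a concept that is further from what the user actually read receives a smaller weight per path.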

In one embodiment, as shown in fig. 12, step 1104, namely calculating a first concept similarity degree between each first extended concept entity in the first extended concept entity set and an associated target instance entity in each target instance entity, determining a target instance entity weight associated with each first extended concept entity from each instance entity weight and each extended instance entity weight, and performing first extended concept entity weight calculation based on the first concept similarity degree and the associated target instance entity weight to obtain each first extended concept entity weight, including:

step 1202, determining a current first extended concept entity from the first extended concept entities, and determining a current target instance entity associated with the current first extended concept entity from the target instance entities.

The current first expansion concept entity refers to a first expansion concept entity needing weight calculation currently. The current target instance entity refers to a target instance entity associated with the current first extended concept entity, and there may be a plurality of current target instance entities, i.e. at least two current target instance entities associated with the current first extended concept entity are determined.

Specifically, the server may select the first extended concept entity from the first extended concept entities in turn, to obtain the current first extended concept entity. And then determining a target instance entity associated with the current first expanded concept entity from the target instance entities to obtain the current target instance entity. In one embodiment, a plurality of target instance entities associated with the current first expanded concept entity are determined, and at least two current target instance entities are obtained.

Step 1204, calculating the current first concept similarity between the current first expanded concept entity and the current target instance entity, and determining the current target instance entity weight corresponding to the current target instance entity from the instance entity weights and the expanded instance entity weights.

Wherein the current first concept similarity is used to characterize a similarity between the current first expanded concept entity and the current target instance entity.

Specifically, the server may perform word vectorization processing on the current first extended concept entity and the current target instance entity to obtain a current first extended concept entity vector and a current target instance entity vector, or may directly obtain the current first extended concept entity vector and the current target instance entity vector from the database, and then calculate cosine similarity between the current first extended concept entity vector and the current target instance entity vector by using a cosine similarity algorithm to obtain a current first concept similarity degree. When the number of the current target instance entities is multiple, the server calculates cosine similarity between the current first expansion concept entity vector and each current target instance entity vector respectively, and multiple current first concept similarity degrees are obtained. And then, the server determines the current target instance entity weight corresponding to the current target instance entity from the instance entity weights and the extended instance entity weights.

Step 1206, calculating the product of the current first concept similarity degree and the current target instance entity weight to obtain the first extended concept entity weight corresponding to the current first extended concept entity.

Specifically, the server multiplies the current first concept similarity degree by the current target instance entity weight and takes the result as the first extended concept entity weight corresponding to the current first extended concept entity. In a specific embodiment, the first extended concept entity weight corresponding to the current first extended concept entity may be calculated using the following formula (3).

Wherein, $C_{instance}$ represents a first extended concept entity; $W_C(C_{instance})$ represents the weight of the first extended concept entity; $N_i$ represents the set of target instance entities associated with $C_{instance}$, each of which may be an extended instance entity or an instance entity in the reading text; $V$ represents an associated target instance entity; $W_I(V)$ represents the target instance entity weight; and $W_E(V, C_{instance})$ represents the first concept similarity degree.

Step 1208, traversing each first extended concept entity to obtain each first extended concept entity weight.

Specifically, the server returns to the step of determining a current first extended concept entity from the first extended concept entities and determining a current target instance entity associated with it from the target instance entities, and repeats until every first extended concept entity has been traversed, thereby obtaining the first extended concept entity weight corresponding to each first extended concept entity.
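Steps 1202 to 1208 can be illustrated with a minimal Python sketch. Formula (3) is not reproduced in this text, so the sketch assumes that the per-entity products are summed over the associated set N_i, which is what the symbol definitions suggest. The function names, the toy vectors and the toy weights are hypothetical and serve only as an illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector is treated as dissimilar
    return dot / (norm_a * norm_b)


def first_concept_weights(concept_vectors, instance_vectors,
                          instance_weights, associations):
    """Weight each first extended concept entity.

    associations maps a concept entity to its associated target
    instance entities (the set N_i in the symbol definitions).
    """
    weights = {}
    for concept, neighbours in associations.items():  # steps 1202 and 1208
        total = 0.0
        for entity in neighbours:
            # Step 1204: similarity between concept and target instance.
            sim = cosine_similarity(concept_vectors[concept],
                                    instance_vectors[entity])
            # Step 1206: similarity times weight; summing is an assumption.
            total += sim * instance_weights[entity]
        weights[concept] = total
    return weights


# Hypothetical toy data.
concept_vectors = {"sport": [1.0, 0.0]}
instance_vectors = {"football": [1.0, 0.0], "basketball": [0.6, 0.8]}
instance_weights = {"football": 2.0, "basketball": 1.0}
associations = {"sport": ["football", "basketball"]}

weights = first_concept_weights(concept_vectors, instance_vectors,
                                instance_weights, associations)
print(weights)
```

In this toy case the concept "sport" receives 1.0 × 2.0 from "football" and 0.6 × 1.0 from "basketball".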

In one embodiment, as shown in fig. 12, step 1106, namely calculating a second concept similarity degree between each second extended concept entity in the second extended concept entity set and its associated first extended concept entity among the first extended concept entities, determining the first extended concept entity weight associated with each second extended concept entity from the first extended concept entity weights, and calculating the second extended concept entity weights based on the second concept similarity degree and the associated first extended concept entity weight, includes:

Step 1210, determining a current second extended concept entity from the second extended concept entities, and determining a current first extended concept entity associated with the current second extended concept entity from the first extended concept entities.

The current second extended concept entity is the second extended concept entity whose weight is currently to be calculated. The current first extended concept entity is a first extended concept entity associated with the current second extended concept entity; there may be several of them, that is, at least two current first extended concept entities associated with the current second extended concept entity may be determined.

Specifically, the server may select the second extended concept entities one at a time to obtain the current second extended concept entity, and then determine, from the first extended concept entities, the first extended concept entity associated with it to obtain the current first extended concept entity. In one embodiment, a plurality of first extended concept entities are associated with the current second extended concept entity, so that at least two current first extended concept entities are obtained.

Step 1212, calculating the current second concept similarity degree between the current second extended concept entity and the current first extended concept entity, and determining the current first extended concept entity weight corresponding to the current first extended concept entity from the first extended concept entity weights.

The current second concept similarity degree characterizes the similarity between the current second extended concept entity and the current first extended concept entity.

Specifically, the server may perform word vectorization on the current second extended concept entity and the current first extended concept entity to obtain a current second extended concept entity vector and a current first extended concept entity vector, or may obtain these vectors directly from the database, and then calculate the cosine similarity between the two vectors to obtain the current second concept similarity degree. When there are a plurality of current first extended concept entities, the server calculates the cosine similarity between the current second extended concept entity vector and each current first extended concept entity vector to obtain a plurality of current second concept similarity degrees. The server then determines the current first extended concept entity weight corresponding to the current first extended concept entity from the first extended concept entity weights.

Step 1214, calculating the product of the current second concept similarity degree and the current first extended concept entity weight to obtain the second extended concept entity weight corresponding to the current second extended concept entity.

Specifically, the server multiplies the current second concept similarity degree by the current first extended concept entity weight and takes the result as the second extended concept entity weight corresponding to the current second extended concept entity. In a specific embodiment, the second extended concept entity weight corresponding to the current second extended concept entity may be calculated using the following formula (4).

Wherein, $C_{subclass}$ represents a second extended concept entity; $W_C(C_{subclass})$ represents the weight of the second extended concept entity; $N_i$ represents the set of first extended concept entities associated with $C_{subclass}$; $V$ represents an associated first extended concept entity; $W_C(V)$ represents the first extended concept entity weight; and $W_E(V, C_{subclass})$ represents the second concept similarity degree.

Step 1216, traversing each second extended concept entity to obtain each second extended concept entity weight.

Specifically, the server returns to the step of determining a current second extended concept entity from the second extended concept entities and determining a current first extended concept entity associated with it from the first extended concept entities, and repeats until every second extended concept entity has been traversed, thereby obtaining the second extended concept entity weight corresponding to each second extended concept entity.

In the embodiment, the expanded concept entity weight is obtained by respectively calculating the first expanded concept entity weight and the second expanded concept entity weight, so that the accuracy of the obtained expanded concept entity weight is improved.
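The two-layer calculation can be sketched as one propagation routine applied twice: first from target instance entities to first extended concept entities, then from first to second extended concept entities. The sketch below assumes the similarity degrees are already known and that contributions from several associated entities are summed. Formulas (3) and (4) are not reproduced in this text, so the summation is an assumption, and all names and numbers are hypothetical.

```python
def propagate_weights(edges, source_weights):
    """Propagate weights one layer up.

    edges maps each target entity to a list of
    (source entity, similarity degree) pairs.
    """
    return {
        target: sum(source_weights[source] * similarity
                    for source, similarity in pairs)
        for target, pairs in edges.items()
    }


# Hypothetical target instance entity weights.
target_instance_weights = {"football": 2.0, "basketball": 1.0}

# Hypothetical instance relationship: concept <- (instance, similarity).
instance_edges = {"ball sport": [("football", 0.9), ("basketball", 0.8)]}

# Hypothetical subclass relationship: concept <- (first concept, similarity).
subclass_edges = {"sport": [("ball sport", 0.5)]}

first_weights = propagate_weights(instance_edges, target_instance_weights)
second_weights = propagate_weights(subclass_edges, first_weights)
print(first_weights, second_weights)
```

Because the second layer consumes the output of the first, any change in an instance entity weight flows through to both concept layers.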

In one embodiment, after step 208, that is, after calculating the degree of similarity between each extended instance entity and the instance entity associated with each instance entity, the method further includes the steps of:

acquiring an instance weight attenuation factor, and carrying out weight attenuation on the instance entity weight and the expansion instance entity weight based on the instance weight attenuation factor to obtain the instance entity attenuation weight and the expansion instance entity attenuation weight;

The instance weight attenuation factor is used to attenuate the instance entity weights and the extended instance entity weights, and represents how the entity weights change over time. The instance entity attenuation weight refers to the attenuated instance entity weight. The extended instance entity attenuation weight refers to the attenuated extended instance entity weight.

Specifically, the server may directly obtain the instance weight attenuation factor from the database, or may obtain the current time point, and calculate the instance weight attenuation factor based on the current time point. And then carrying out weight attenuation on the instance entity weights and the expansion instance entity weights by using instance weight attenuation factors, namely calculating the product of the instance weight attenuation factors and each instance entity weight to obtain the instance entity attenuation weights, and calculating the product of the instance weight attenuation factors and each expansion instance entity weight to obtain the expansion instance entity attenuation weights.

Step 210, namely, performing extended concept entity weight calculation by using the entity weights of each instance, the entity weights of each extended instance and the similarity degree of concepts to obtain the entity weights of each extended concept, including the steps of:

and carrying out expanded concept entity weight calculation based on the attenuation weights of the instance entities, the attenuation weights of the expanded instance entities and the similarity of concepts to obtain the attenuation weights of the expanded concept entities.

The attenuation weight of the expanded concept entity refers to the attenuated expanded concept entity weight.

Specifically, because the extended concept entity weights are calculated from the instance entity weights and the extended instance entity weights, attenuating those weights attenuates the resulting extended concept entity weights correspondingly, which yields the extended concept entity attenuation weights.

Step 212, namely, establishing an interest knowledge graph corresponding to the target user identifier based on the target instance entity set, the extended concept entity set, the weights of each instance entity, the weights of each extended instance entity and the weights of each extended concept entity, comprising the steps of:

and establishing a target interest knowledge graph corresponding to the target user identifier based on the target instance entity set, the expanded concept entity set, the attenuation weights of all instance entities, the attenuation weights of all expanded instance entities and the attenuation weights of all expanded concept entities.

Specifically, the server uses the attenuated weights to establish a knowledge graph, namely, establishes a target interest knowledge graph corresponding to the target user identifier by using the target instance entity set, the expanded concept entity set, the attenuation weights of all instance entities, the attenuation weights of all expanded instance entities and the attenuation weights of all expanded concept entities.

In the above embodiment, the entity weight is attenuated by using the example weight attenuation factor to obtain the attenuated entity weight, and then the target interest knowledge graph is established by using the attenuated entity weight, so that the obtained target interest knowledge graph can reflect the interest characteristics of the user more accurately.
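A small worked example may help show why attenuating the instance-level weights also attenuates the concept-level weights. The sketch assumes a concept weight formed as a sum of similarity-weighted contributions (the summation is an assumption). The entity names, factors and similarity degrees are hypothetical.

```python
def attenuate(weights, factors):
    """Multiply each entity weight by that entity's attenuation factor."""
    return {entity: weight * factors[entity]
            for entity, weight in weights.items()}


def concept_weight(similarities, weights):
    """Similarity-weighted sum over the associated entities (assumed form)."""
    return sum(similarities[entity] * weights[entity]
               for entity in similarities)


instance_weights = {"football": 2.0}            # from the reading text
extended_instance_weights = {"basketball": 1.0}  # obtained by expansion
factors = {"football": 0.5, "basketball": 0.5}   # hypothetical factors
similarities = {"football": 0.9, "basketball": 0.8}

target_weights = {**instance_weights, **extended_instance_weights}
decayed_weights = attenuate(target_weights, factors)

plain_concept = concept_weight(similarities, target_weights)
decayed_concept = concept_weight(similarities, decayed_weights)
print(plain_concept, decayed_concept)
```

With a common factor of 0.5 on both entities, the concept weight is halved as well; with different per-entity factors the concept weight would shift toward the more recently seen entities.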

In one embodiment, as shown in FIG. 13, obtaining the instance weight decay factor includes:

Step 1302, obtaining a first target time point corresponding to each instance entity and a second target time point corresponding to each extended instance entity, where the first target time point is the historical time point at which the instance entity was previously obtained as an extended instance entity, and the second target time point is the historical time point at which the extended instance entity was previously obtained as an extended instance entity.

Specifically, an instance entity having previously been obtained as an extended instance entity means that, when the interest knowledge graph was last generated, the instance entity appeared among the extended instance entities obtained by expanding the instance entities in the corresponding reading text. Likewise, an extended instance entity having previously been obtained as an extended instance entity means that, when the interest knowledge graph was last generated, it appeared among the extended instance entities obtained by expanding the instance entities in the corresponding reading text. The first target time point and the second target time point are thus the time points at which the corresponding entities were expanded from the corresponding reading text when the interest knowledge graph was last generated.

In step 1304, a current time point is obtained, a first time interval is determined based on the first target time point and the current time point, and a second time interval is determined based on the second target time point and the current time point.

The current time point refers to a time point when the interest knowledge graph is currently generated, and may be a specific time or a date.

Specifically, the server may acquire the system time to obtain a current time point, and then perform time interval calculation on the first target time point and the current time point corresponding to each instance entity, to obtain a time interval corresponding to each instance entity, that is, a first time interval. And simultaneously, calculating the time interval between a second target time point corresponding to each expansion instance entity and the current time point to obtain a time interval corresponding to each expansion instance entity, namely a second time interval.
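The interval calculation in step 1304 can be sketched with Python's standard `datetime` module, assuming the time points are stored as dates and the interval is measured in whole days. The entity names and dates are hypothetical.

```python
from datetime import date


def interval_days(target_time_point, current_time_point):
    """Number of whole days between a target time point and now."""
    return (current_time_point - target_time_point).days


# Hypothetical current time point and target time points.
current_time_point = date(2021, 5, 12)
first_target_time_points = {"football": date(2021, 5, 10)}
second_target_time_points = {"basketball": date(2021, 4, 20)}

first_intervals = {
    entity: interval_days(point, current_time_point)
    for entity, point in first_target_time_points.items()
}
second_intervals = {
    entity: interval_days(point, current_time_point)
    for entity, point in second_target_time_points.items()
}
print(first_intervals, second_intervals)
```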

Step 1306, calculating a first initial weight attenuation factor based on the first time interval and the preset weight attenuation speed parameter when the first time interval is within the preset weight attenuation time range, and normalizing the first initial weight attenuation factor to obtain an instance weight attenuation factor corresponding to each instance entity.

The preset weight attenuation time range is a preset time range within which entity weights are attenuated. When the time interval falls below this range, the entity weight is not attenuated; when the time interval exceeds this range, the entity weight tends to zero. The preset weight attenuation speed parameter is a preset parameter that controls the speed of weight attenuation.

Specifically, when the server determines that the first time interval is within the preset weight attenuation time range, it calculates a first initial weight attenuation factor based on the first time interval and the preset weight attenuation speed parameter, and normalizes the first initial weight attenuation factor to obtain the instance weight attenuation factor corresponding to each instance entity.

Step 1308, when the second time interval is within the preset weight attenuation time range, calculating a second initial weight attenuation factor based on the second time interval and the preset weight attenuation speed parameter, and normalizing the second initial weight attenuation factor to obtain an example weight attenuation factor corresponding to each extended example entity.

Specifically, when the server judges that the second time interval is within the preset weight attenuation time range, calculating a second initial weight attenuation factor based on the second time interval and the preset weight attenuation speed parameter, and normalizing the second initial weight attenuation factor to obtain an example weight attenuation factor respectively corresponding to each extended example entity.

In one embodiment, FIG. 14 plots the instance weight attenuation factor against the time interval. When the time interval is less than t1, the instance entity weight does not decay, that is, it remains unchanged. Between t1 and t2 the instance entity weight decays, and after t2 it approaches 0. The instance weight attenuation factor may then be calculated using formula (5) as shown below.

Where $x$ denotes the time interval, which may be measured in days. $t_1$ and $t_2$ are set empirically; for example, $t_1$ may be 7 days and $t_2$ may be 25 days. $\mathrm{scale}_x$ controls the speed of weight decay and is set empirically; it may be 0.25. $\mathrm{scale}_y$ keeps the entity weight within the range 0 to 1; because the instance entity weight does not decay between 0 and $t_1$, $\mathrm{scale}_y = 1/\mathrm{norm}(0, 7, 10)$. $\mathrm{norm}(x, \mu, \sigma)$ is defined as shown in the following formula (6):

In the above embodiment, the entity weight is attenuated by using the instance weight attenuation factor, so that the obtained attenuated entity weight is more accurate.
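The three regimes described for FIG. 14 (no decay before t1, decay between t1 and t2, near zero after t2) can be illustrated with the following Python sketch. Formulas (5) and (6) are not reproduced in this text, so the sketch does not implement them: it substitutes a hypothetical Gaussian-shaped curve, `bell`, normalized so that the factor equals 1 at t1, and it simplifies "approaches 0" to exactly 0 after t2. Only the values t1 = 7, t2 = 25 and the speed parameter 0.25 come from the text.

```python
import math

T1_DAYS = 7.0    # example value from the text
T2_DAYS = 25.0   # example value from the text
SCALE_X = 0.25   # example decay-speed parameter from the text


def bell(x):
    """Hypothetical Gaussian-shaped curve; a stand-in, not formula (6)."""
    return math.exp(-0.5 * x * x)


def attenuation_factor(interval_days):
    """Illustrative attenuation factor in [0, 1] for a time interval."""
    if interval_days <= T1_DAYS:
        return 1.0  # no decay before t1
    if interval_days > T2_DAYS:
        return 0.0  # simplification: the text says the weight approaches 0
    # Normalize so the curve starts at 1 when the interval equals t1.
    scale_y = 1.0 / bell(0.0)
    return scale_y * bell(SCALE_X * (interval_days - T1_DAYS))


for days in (3, 7, 10, 20, 25, 30):
    print(days, attenuation_factor(days))
```

With these example values the factor at 25 days is already below one thousandth, which is consistent with the described behaviour near t2.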

In one embodiment, after step 212, after establishing the interest knowledge graph corresponding to the target user in the current time period based on each current instance entity, each extended concept entity, the current instance weight, each extended instance weight, and the extended concept weight, the method further includes the steps of:

and obtaining the interest knowledge graphs corresponding to the target user identifier at each preset time point, and dynamically and visually displaying them.

The preset time point is a time point at which an interest knowledge graph is established, and may be set in advance; for example, the interest knowledge graph of the user may be established once a month or once a day.

Specifically, the server obtains the interest knowledge graphs corresponding to the target user at each preset time point, each of which is established by any embodiment of the above knowledge graph construction method. The server then dynamically and visually displays these interest knowledge graphs, where dynamic visualization means dynamic display, for example as an animation or as video playback. In a specific embodiment, fig. 15 shows the interest knowledge graph of user A on May 8: the size of each circle represents the weight of the corresponding node, unfilled circles are instance entity nodes obtained from the reading text, black circles are instance entity nodes obtained by expansion, and gray circles are concept entity nodes obtained by expansion. Fig. 16 is a schematic diagram of the weights of the concept entity nodes in the interest knowledge graph of user A on May 8. When the user clicks the play button, the interest knowledge graphs of user A for each day from May 4 to May 10 are played in sequence. Fig. 17 shows the interest knowledge graph of user A on May 21; combined with the schematic diagram of the concept entity node weights on May 21 shown in fig. 18, it can be clearly seen that the interests of the user have shifted from concept entity A to concept entity B and concept entity C.

In one embodiment, as shown in fig. 19, an information recommendation method is provided. The method is described by taking its application to the server in fig. 1 as an example; it can be understood that the method can also be applied to a terminal, or to a system including the terminal and the server, in which case it is implemented through interaction between the terminal and the server. In this embodiment, the method includes the steps of:

Step 1902, receiving an information recommendation instruction, where the information recommendation instruction carries a user identifier and a query sentence.

The user identifier uniquely identifies the user. The query sentence is a sentence used to query for recommendation information.

Specifically, the server receives the information recommendation instruction sent by the user terminal, where the instruction carries the user identifier and the query sentence.

Step 1904, extracting query keywords from the query sentence to obtain target keywords.

Specifically, the server may use a keyword extraction algorithm to extract query keywords from the query sentence, and obtain the target keywords. The target keywords are used to characterize the type of information to be recommended to the user, such as news, advertisements, merchandise, video, and the like.

Step 1906, obtaining an interest knowledge graph corresponding to the user identifier, where the interest knowledge graph is established by: obtaining a reading text of the target user identifier in a preset time period; performing instance and concept expansion based on each instance entity in the reading text to obtain an extended entity set; obtaining the weight of each instance entity in the reading text; calculating the weight of each extended entity in the extended entity set based on the instance entity weights; and building the graph from the instance entities in the reading text, the extended entity set, the instance entity weights and the extended entity weights.

Specifically, the server obtains the interest knowledge graph corresponding to the user identifier, which is the knowledge graph established from the reading text in the most recent time period. For example, if the current time point at which information recommendation is required is May 12, the interest knowledge graph of the user established from the reading text of May 11 can be obtained. The interest knowledge graph may be obtained using any embodiment of the above interest knowledge graph construction method, for example by the establishment procedure set out in step 1906.

Step 1908, determining a target interest entity from the interest knowledge graph, acquiring recommendation information based on the target keyword and the target interest entity, and returning the recommendation information to the terminal corresponding to the user identifier.

Specifically, the server determines a target interest entity from the interest knowledge graph, where the target interest entity may be an entity with the highest entity weight in the interest knowledge graph. And acquiring recommendation information according to the target keywords and the target interest entities, and returning the recommendation information to the terminal corresponding to the user identifier.

In the above information recommendation method, an information recommendation instruction carrying a user identifier and a query sentence is received; query keywords are extracted from the query sentence to obtain target keywords; the corresponding target interest entity is determined from the interest knowledge graph corresponding to the user identifier; and recommendation information is then obtained by using the target keywords and the target interest entity.
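Steps 1902 to 1908 can be sketched in Python as follows. The sketch assumes the interest knowledge graph is available as a mapping from entity to weight, that the target interest entity is the one with the highest weight, and that the target keyword names an information type. The keyword extractor, the vocabulary and the in-memory catalog are hypothetical stand-ins; the text does not specify a particular keyword extraction algorithm or content store.

```python
def extract_keywords(query_sentence, vocabulary):
    """Hypothetical extractor: keep query tokens found in a vocabulary."""
    return [token for token in query_sentence.lower().split()
            if token in vocabulary]


def top_interest_entity(graph_weights):
    """Entity with the highest weight in the interest knowledge graph."""
    return max(graph_weights, key=graph_weights.get)


def recommend(query_sentence, graph_weights, catalog, vocabulary):
    keywords = extract_keywords(query_sentence, vocabulary)  # step 1904
    entity = top_interest_entity(graph_weights)              # step 1908
    return [item["title"] for item in catalog
            if item["type"] in keywords and entity in item["entities"]]


# Hypothetical data.
vocabulary = {"news", "video"}
graph_weights = {"football": 2.0, "sport": 2.6, "cooking": 0.4}
catalog = [
    {"title": "Weekend sport roundup", "type": "news",
     "entities": {"sport"}},
    {"title": "Pasta in ten minutes", "type": "video",
     "entities": {"cooking"}},
    {"title": "Sport highlights", "type": "video",
     "entities": {"sport", "football"}},
]

results = recommend("show me news", graph_weights, catalog, vocabulary)
print(results)
```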

In a specific embodiment, as shown in fig. 20, a knowledge graph construction method is provided, which specifically includes the following steps:

Step 2002, obtaining a reading text of the target user identifier in a preset time period, and carrying out entity linking based on the reading text to obtain an instance entity set corresponding to the reading text.

Step 2004, determining a current instance entity from the instance entity set, searching each associated candidate instance entity in a preset knowledge base according to a preset association relationship by using the current instance entity, calculating the instance similarity between the current instance entity and each candidate instance entity, selecting an extended instance entity associated with the current instance entity from each candidate instance entity based on the instance similarity, traversing each instance entity in the instance entity set to obtain an extended instance entity set, and obtaining a target instance entity set based on the extended instance entity set and the instance entity set.

Step 2006, determining a current extended instance entity from the extended instance entities, and determining an instance entity associated with the current extended instance entity from the instance entities; obtaining the associated instance entity weight corresponding to the associated instance entity from the instance entity weights, and determining the associated instance similarity between the current extended instance entity and the associated instance entity from the instance similarity; calculating the product of the entity weight of the associated instance and the similarity degree of the associated instance to obtain the corresponding expanding instance weight of the current expanding instance entity; traversing each extended instance entity to obtain the weight of each extended instance entity.

Step 2008, obtaining an instance weight attenuation factor, and carrying out weight attenuation on the instance entity weight and the expansion instance entity weight based on the instance weight attenuation factor to obtain the instance entity attenuation weight and the expansion instance entity attenuation weight.

Step 2010, obtaining an instance relation and a sub-class relation, and performing concept expansion based on each target instance entity in a preset knowledge base according to the instance relation to obtain a first expansion concept entity set; and carrying out concept expansion on each first expansion concept entity in the first expansion concept entity set in a preset knowledge base according to the subclass relation to obtain a second expansion concept entity set.

Step 2012, calculating a first concept similarity degree between each first expanded concept entity in the first expanded concept entity set and an associated target instance entity in each target instance entity, determining a target instance entity attenuation weight associated with each first expanded concept entity from each instance entity attenuation weight and each expanded instance entity attenuation weight, and calculating a first expanded concept entity attenuation weight based on the first concept similarity degree and the associated target instance entity attenuation weight to obtain each first expanded concept entity attenuation weight;

Step 2014, calculating a second concept similarity degree between each second extended concept entity in the second extended concept entity set and its associated first extended concept entity among the first extended concept entities, determining the first extended concept entity attenuation weight associated with each second extended concept entity from the first extended concept entity attenuation weights, and calculating the second extended concept entity attenuation weights based on the second concept similarity degree and the associated first extended concept entity attenuation weight;

Step 2016, establishing a target interest knowledge graph corresponding to the target user identifier based on the target instance entity set, the extended concept entity set, the instance entity attenuation weights, the extended instance entity attenuation weights, the first extended concept entity attenuation weights and the second extended concept entity attenuation weights.
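The instance-level part of this flow (steps 2002 to 2006) can be sketched in Python as below. The sketch assumes that the instance entity weight is the number of times the entity occurs in the reading text, that candidates are kept only when their similarity reaches a threshold, and that a candidate reached from several instance entities accumulates their contributions. The accumulation rule, the threshold value, the candidate lists and the similarity degrees are hypothetical; in practice the candidates would come from a knowledge base lookup.

```python
from collections import Counter


def instance_weights_from_mentions(mentions):
    """Instance entity weight taken as the occurrence count (assumption)."""
    return dict(Counter(mentions))


def expand_instances(instance_weights, related, threshold):
    """Select extended instance entities and weight them.

    related maps an instance entity to (candidate, similarity) pairs.
    """
    extended = {}
    for instance, candidates in related.items():
        for candidate, similarity in candidates:
            if similarity < threshold:
                continue  # candidate is not similar enough to be kept
            # Step 2006: associated instance weight times similarity.
            contribution = instance_weights[instance] * similarity
            # Assumption: contributions from several instances are summed.
            extended[candidate] = extended.get(candidate, 0.0) + contribution
    return extended


# Hypothetical entity mentions linked from the reading text (step 2002).
mentions = ["football", "football", "chess"]
# Hypothetical knowledge base candidates with similarities (step 2004).
related = {
    "football": [("basketball", 0.8), ("opera", 0.1)],
    "chess": [("go", 0.5)],
}

instance_weights = instance_weights_from_mentions(mentions)
extended_weights = expand_instances(instance_weights, related, threshold=0.3)
print(instance_weights, extended_weights)
```

The resulting weights would then be attenuated (step 2008) and propagated to the concept layers (steps 2010 to 2014) before the graph is assembled.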

The application also provides an application scene, and the application scene applies the knowledge graph construction method. Specifically, in an information service platform of instant messaging application, an interest knowledge graph of each user may be established, specifically: as shown in fig. 21, a flow chart for building a knowledge graph is shown, in which an instant messaging application server obtains an information service article browsed by a user in an information service platform within a preset period of time, then links the information service article to obtain each instance entity, then develops an instance (instance) through a related to relationship based on the instance entity, and then develops a node of a concept (concept) through an instance of (instance relationship) and a subs of (sub-relationship) relationship by using all the instances, wherein if the word vector similarity of the developed entity and the original entity is lower than a threshold value, the expansion is stopped, and at this time, an initial knowledge graph is obtained according to all the instances and all the concepts. 
Setting weights of all nodes in the initial knowledge graph, wherein the weights change along with time, namely obtaining instance entity weights by obtaining the occurrence times corresponding to each instance entity, calculating the expanded instance entity weights according to the similarity degree and the instance entity weights by obtaining the similarity degree of the instance entity and the expanded instance entity, calculating the concept entity weights by expanding the instance entity weights, so as to obtain the weights of the nodes in each knowledge graph, finally obtaining the personal interest knowledge graph of the user in the preset time period, and dynamically displaying the personal interest knowledge graph of the user in the preset time period, wherein the personal interest knowledge graph is a partial schematic diagram of the personal interest knowledge graph after being landed, as shown in fig. 22. And recommending information service articles related to the interest entities to the user according to the interest entities with highest weights in the personal interest knowledge graph. Advertisements, goods, videos, live broadcasts and the like related to the interest entities can be recommended to the user according to the interest entities with the highest weights in the personal interest knowledge graph. In a specific embodiment, the information recommendation method can be applied to cold start recommendation of new services, specifically: the interest knowledge graph established according to the reading text of the user of the old service can be obtained, and then when the user uses the new service when the new service is online, the information of interest of the user in the new service is recommended to the user according to the interest knowledge graph of the old service. 
For example, a user interest knowledge graph established from the information service articles of the information service platform of an instant messaging application can be applied to information recommendation for a new service, such as live broadcast recommendation for a new live broadcast service. When a user uses a live broadcast platform that has just gone online, the user interest knowledge graph established from the information service articles is obtained, the interest entity with the highest weight in that graph is determined, and live broadcast information related to that interest entity is recommended to the user terminal. This makes the newly launched live broadcast platform easier to use and improves user experience.
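As a non-limiting illustration, the cold-start scenario above can be sketched as follows. The sketch assumes the interest knowledge graph has already been reduced to a mapping from entity name to weight, and that the new service exposes a catalog mapping each item to the entities it is tagged with. The names `top_interest_entities`, `cold_start_recommend` and the catalog layout are hypothetical and are not part of the application.

```python
from typing import Dict, List


def top_interest_entities(weights: Dict[str, float], k: int = 1) -> List[str]:
    """Return the k entities with the highest weight (ties broken by name)."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


def cold_start_recommend(
    weights: Dict[str, float],
    new_service_items: Dict[str, List[str]],
    k: int = 1,
) -> List[str]:
    """Pick items of the new service tagged with a top-weighted interest entity.

    new_service_items is a hypothetical catalog: item id -> entity tags.
    """
    targets = set(top_interest_entities(weights, k))
    return [item for item, tags in new_service_items.items() if targets & set(tags)]


# Graph built from the old service (weights are made-up example values).
old_service_weights = {"basketball": 5.0, "sports": 3.2, "cooking": 1.0}
live_catalog = {
    "live_1": ["basketball"],
    "live_2": ["cooking"],
    "live_3": ["sports", "basketball"],
}
recommended = cold_start_recommend(old_service_weights, live_catalog, k=1)
```

Here only the single highest-weighted entity is used; a larger `k` would widen the recommendation to several interest entities.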

It should be understood that, although the steps in the flowcharts of figs. 2-20 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts of figs. 2-20 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same time but may be performed at different times, and they are not necessarily performed in sequence but may be performed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 23, a knowledge graph construction apparatus 2300 is provided, which may be a software module or a hardware module, or a combination of both, and specifically includes: an instance obtaining module 2302, an expansion module 2304, a weight obtaining module 2306, an instance weight calculation module 2308, a concept weight calculation module 2310, and a map building module 2312, wherein:

An instance obtaining module 2302, configured to obtain a reading text of the target user identifier in a preset time period, and perform entity linking based on the reading text, so as to obtain an instance entity set corresponding to the reading text;

the expansion module 2304 is configured to perform instance expansion based on each instance entity in the instance entity set to obtain an expanded instance entity set, obtain a target instance entity set based on the expanded instance entity set and the instance entity set, and perform concept expansion based on each target instance entity in the target instance entity set to obtain an expanded concept entity set;

a weight obtaining module 2306, configured to obtain weights of instance entities, where the weights of instance entities refer to the number of occurrences of each instance entity in the reading text and the target reading text in a target time period before a preset time period;

an instance weight calculation module 2308, configured to calculate an instance similarity between each extended instance entity and its associated instance entity among the instance entities, and to calculate extended instance entity weights by using the instance entity weights and the instance similarity, so as to obtain the weight of each extended instance entity;

the concept weight calculation module 2310 is configured to calculate a concept similarity between each extended concept entity in the extended concept entity set and its associated target instance entity among the target instance entities, and to calculate extended concept entity weights by using the instance entity weights, the extended instance entity weights, and the concept similarity, so as to obtain the weight of each extended concept entity;

The map building module 2312 is configured to build an interest knowledge map corresponding to the target user identifier based on the target instance entity set, the extended concept entity set, the instance entity weights, the extended instance entity weights, and the extended concept entity weights.
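As a non-limiting illustration of the weight obtaining module 2306, the instance entity weight can be sketched as an occurrence count. The sketch assumes that entity linking has already produced a flat list of instance entities for the reading text of the preset time period and another for the target reading text of the earlier target time period. It also assumes that only entities appearing in the current reading text receive a weight. The function name and both assumptions are illustrative.

```python
from collections import Counter
from typing import Dict, Iterable


def instance_entity_weights(
    current_entities: Iterable[str],
    earlier_entities: Iterable[str] = (),
) -> Dict[str, int]:
    """Weight each current instance entity by how often it occurs overall.

    current_entities: linked entities from the reading text (preset period).
    earlier_entities: linked entities from the target reading text
                      (target period before the preset period).
    """
    current = Counter(current_entities)
    earlier = Counter(earlier_entities)
    # A Counter returns 0 for missing keys, so unseen entities add nothing.
    return {entity: count + earlier[entity] for entity, count in current.items()}


weights = instance_entity_weights(["NBA", "Lakers", "NBA"], ["NBA", "chess"])
```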

In one embodiment, the instance obtaining module 2302 includes:

the recognition unit is used for recognizing the entity words based on the reading text to obtain each entity word;

the recall unit is used for carrying out entity recall from a preset knowledge base based on each entity word to obtain candidate entity sets corresponding to each entity word respectively;

and the disambiguation unit is used for performing entity disambiguation based on the candidate entity sets respectively corresponding to the entity words to obtain the entities respectively corresponding to the entity words, and obtaining the instance entity set corresponding to the reading text based on the entities respectively corresponding to the entity words.

In one embodiment, the disambiguation unit is further configured to determine a current entity word from the entity words, and obtain an entity text corresponding to the current entity word; and inputting the entity text and the corresponding candidate entity set into an entity disambiguation model, wherein the entity disambiguation model maps the entity text and the corresponding candidate entity set into a vector space respectively to obtain an entity word vector corresponding to the current entity word and a candidate entity vector set corresponding to the candidate entity set, calculates the similarity degree of the entity word vector and the candidate entity vector in the candidate entity vector set respectively, and determines the current entity corresponding to the current entity word from the candidate entity set based on the similarity degree.
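As a non-limiting illustration, the similarity-based selection performed by the disambiguation unit can be sketched as follows. The sketch assumes the entity disambiguation model has already mapped the entity text and each candidate entity to vectors, and it uses cosine similarity as the similarity degree. The choice of cosine similarity, the function names, and the example vectors are assumptions; the application does not fix a particular measure here.

```python
import math
from typing import Dict, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def disambiguate(
    entity_word_vector: Sequence[float],
    candidate_vectors: Dict[str, Sequence[float]],
) -> str:
    """Return the candidate entity whose vector is most similar to the entity word vector."""
    return max(
        candidate_vectors,
        key=lambda name: cosine(entity_word_vector, candidate_vectors[name]),
    )


# Made-up two-dimensional vectors standing in for the model output.
candidates = {"Apple (company)": [0.9, 0.1], "apple (fruit)": [0.1, 0.9]}
chosen = disambiguate([1.0, 0.0], candidates)
```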

In one embodiment, the expansion module 2304 is further configured to determine a current instance entity from the instance entity set, and search each associated candidate instance entity in a preset knowledge base according to a preset association relationship using the current instance entity; calculating the instance similarity between the current instance entity and each candidate instance entity, and selecting the expansion instance entity associated with the current instance entity from each candidate instance entity based on the instance similarity; and traversing each instance entity in the instance entity set to obtain an extended instance entity set.
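As a non-limiting illustration, the instance expansion performed by the expansion module 2304 can be sketched as follows. The sketch assumes the preset knowledge base is available as a mapping from an entity to the entities it is associated with, that the instance similarity is supplied as a function, and that candidates are selected by comparing the similarity against a threshold. When a candidate is reachable from several instance entities, the sketch keeps the association with the highest similarity, which is an additional assumption. All names are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def expand_instances(
    instance_entities: Iterable[str],
    related_to: Dict[str, List[str]],
    similarity: Callable[[str, str], float],
    threshold: float = 0.5,
) -> Dict[str, Tuple[str, float]]:
    """Map each selected extended instance entity to (associated instance, similarity)."""
    originals = list(instance_entities)
    known = set(originals)
    expanded: Dict[str, Tuple[str, float]] = {}
    for current in originals:  # traverse every instance entity
        for candidate in related_to.get(current, []):
            if candidate in known:
                continue  # already an instance entity, nothing to extend
            score = similarity(current, candidate)
            if score < threshold:
                continue  # too dissimilar, expansion stops for this candidate
            best = expanded.get(candidate)
            if best is None or score > best[1]:
                expanded[candidate] = (current, score)
    return expanded


related = {"NBA": ["Lakers", "hot dog"]}
scores = {("NBA", "Lakers"): 0.8, ("NBA", "hot dog"): 0.2}
extended = expand_instances(["NBA"], related, lambda a, b: scores[(a, b)])
```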

In one embodiment, the expansion module 2304 includes:

the first concept expansion unit is used for acquiring an instance relation and a sub-category relation, and carrying out concept expansion on the basis of each target instance entity in a preset knowledge base according to the instance relation to obtain a first expansion concept entity set;

the second concept expansion unit is used for carrying out concept expansion based on each first expansion concept entity in the first expansion concept entity set in a preset knowledge base according to the subclass relation to obtain a second expansion concept entity set;

the expanded concept obtaining unit is used for obtaining an expanded concept entity set based on the first expanded concept entity set and the second expanded concept entity set.

In one embodiment, the first concept expansion unit is further configured to determine a current target instance entity from the target instance entities, and search each associated first candidate expansion concept entity in a preset knowledge base according to an instance relationship by using the current target instance entity; calculating first concept similarity degrees of the current target instance entity and each first candidate expanded concept entity respectively, and selecting a first expanded concept entity corresponding to the current target instance entity from each first candidate expanded concept entity based on the first concept similarity degrees; and traversing each target instance entity in the target instance entity set to obtain a first expansion concept entity set.

In one embodiment, the second concept extension unit is further configured to: determining a current first expansion concept entity from a first expansion concept entity set, and searching each associated second candidate expansion concept entity in a preset knowledge base according to a subclass relation by using the current first expansion concept entity; calculating second concept similarity degrees of the current first expanded concept entity and each second candidate expanded concept entity, and selecting a second expanded concept entity corresponding to the current first expanded concept entity from each second candidate expanded concept entity based on the second concept similarity degrees; and traversing each first expanded concept entity in the first expanded concept entity set to obtain a second expanded concept entity set.
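As a non-limiting illustration, the two-stage concept expansion described for the first and second concept expansion units can be sketched as one helper applied twice: first along the instance relation, then along the subclass relation. The sketch assumes each relation is a mapping from an entity to its related concept entities, and that selection is by similarity threshold. The helper names and the threshold rule are assumptions.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Association = Dict[str, Tuple[str, float]]  # concept -> (source entity, similarity)


def expand_by_relation(
    sources: Iterable[str],
    relation: Dict[str, List[str]],
    similarity: Callable[[str, str], float],
    threshold: float,
) -> Association:
    """Follow one relation from every source and keep sufficiently similar concepts."""
    selected: Association = {}
    for source in sources:
        for concept in relation.get(source, []):
            score = similarity(source, concept)
            if score >= threshold and (
                concept not in selected or score > selected[concept][1]
            ):
                selected[concept] = (source, score)
    return selected


def expand_concepts(
    target_instances: Iterable[str],
    instance_of: Dict[str, List[str]],
    subclass_of: Dict[str, List[str]],
    similarity: Callable[[str, str], float],
    threshold: float = 0.5,
) -> Tuple[Association, Association, Set[str]]:
    """Return the first set, the second set, and their union."""
    first = expand_by_relation(target_instances, instance_of, similarity, threshold)
    second = expand_by_relation(list(first), subclass_of, similarity, threshold)
    return first, second, set(first) | set(second)


instance_of = {"Lakers": ["basketball team"]}
subclass_of = {"basketball team": ["sports team", "organization"]}
sims = {
    ("Lakers", "basketball team"): 0.9,
    ("basketball team", "sports team"): 0.7,
    ("basketball team", "organization"): 0.3,
}
first, second, concepts = expand_concepts(
    ["Lakers"], instance_of, subclass_of, lambda a, b: sims[(a, b)]
)
```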

In one embodiment, the instance weight calculation module 2308 is further configured to determine a current extended instance entity from among the extended instance entities and determine an instance entity associated with the current extended instance entity from among the instance entities; obtaining the associated instance entity weight corresponding to the associated instance entity from the instance entity weights, and determining the associated instance similarity between the current extended instance entity and the associated instance entity from the instance similarity; calculating the product of the entity weight of the associated instance and the similarity degree of the associated instance to obtain the corresponding expanding instance weight of the current expanding instance entity; traversing each extended instance entity to obtain the weight of each extended instance entity.
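As a non-limiting illustration, the product rule used by the instance weight calculation module 2308 can be sketched as follows. The sketch assumes each extended instance entity is stored together with its associated instance entity and their instance similarity. The data layout and function name are hypothetical.

```python
from typing import Dict, Tuple


def extended_instance_weights(
    instance_weights: Dict[str, float],
    associations: Dict[str, Tuple[str, float]],
) -> Dict[str, float]:
    """Weight of an extended instance = associated instance weight * similarity.

    associations maps each extended instance entity to
    (associated instance entity, instance similarity).
    """
    return {
        extended: instance_weights[source] * sim
        for extended, (source, sim) in associations.items()
    }


ext_weights = extended_instance_weights({"NBA": 4.0}, {"Lakers": ("NBA", 0.5)})
```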

In one embodiment, concept weight calculation module 2310 includes:

the determining unit is used for determining a first extended concept entity set and a second extended concept entity set from various extended concept entities, wherein the first extended concept entity set is obtained through a preset instance relation based on a target instance entity set, and the second extended concept entity set is obtained through a preset subclass relation based on the first extended concept entity set;

the first computing unit is used for computing first concept similarity between each first expanded concept entity in the first expanded concept entity set and the associated target instance entity in each target instance entity, determining target instance entity weights associated with each first expanded concept entity from each instance entity weight and each expanded instance entity weight, and computing first expanded concept entity weights based on the first concept similarity and the associated target instance entity weights to obtain each first expanded concept entity weight;

The second computing unit is used for computing second concept similarity between each second expanded concept entity in the second expanded concept entity set and each first expanded concept entity associated with each first expanded concept entity, determining first expanded concept entity weights associated with each second expanded concept entity from each first expanded concept entity weight, and computing second expanded concept entity weights based on the second concept similarity and the associated first expanded concept entity weights to obtain each second expanded concept entity weight;

the weight obtaining unit is used for obtaining the weight of each extended concept entity based on the weight of each first extended concept entity and the weight of each second extended concept entity.

In one embodiment, the first computing unit is further configured to determine a current first extended concept entity from among the first extended concept entities, and determine a current target instance entity associated with the current first extended concept entity from among the target instance entities; calculating the current first concept similarity between the current first expanded concept entity and the current target instance entity, and determining the current target instance entity weight corresponding to the current target instance entity from the instance entity weights and the expanded instance entity weights; calculating the product of the similarity of the current first concept and the entity weight of the current target instance to obtain a first expanded concept entity weight corresponding to the current first expanded concept entity; and traversing each first expansion concept entity to obtain each first expansion concept entity weight.

In one embodiment, the second computing unit is further configured to determine a current second extended concept entity from the second extended concept entities, and determine a current first extended concept entity associated with the current second extended concept entity from the first extended concept entities; calculating the similarity degree of the current second concept between the current second expanded concept entity and the current first expanded concept entity, and determining the current first expanded concept entity weight corresponding to the current first expanded concept entity from the weights of the first expanded concept entities; calculating the product of the similarity of the current second concept and the current first expanded concept entity weight to obtain a second expanded concept entity weight corresponding to the current second expanded concept entity; and traversing each second expansion concept entity to obtain each second expansion concept entity weight.
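As a non-limiting illustration, the weight propagation performed by the first and second computing units can be sketched as two successive products. A first extended concept entity takes the weight of its associated target instance entity multiplied by the first concept similarity, and a second extended concept entity takes the weight of its associated first extended concept entity multiplied by the second concept similarity. The data layout and function name are hypothetical. If a concept appeared in both sets, this sketch would keep the second-stage weight, which is an assumption not taken from the application.

```python
from typing import Dict, Tuple

Association = Dict[str, Tuple[str, float]]  # concept -> (associated entity, similarity)


def concept_weights(
    instance_weights: Dict[str, float],
    extended_instance_weights: Dict[str, float],
    first_assoc: Association,
    second_assoc: Association,
) -> Dict[str, float]:
    """Compute weights for first and second extended concept entities."""
    # Target instance entities are the instance entities plus the extended ones.
    target_weights = {**instance_weights, **extended_instance_weights}
    first = {
        concept: sim * target_weights[source]
        for concept, (source, sim) in first_assoc.items()
    }
    second = {
        concept: sim * first[source]
        for concept, (source, sim) in second_assoc.items()
    }
    return {**first, **second}


all_concept_weights = concept_weights(
    {"NBA": 4.0},
    {"Lakers": 2.0},
    {"basketball team": ("Lakers", 0.5)},
    {"sports team": ("basketball team", 0.5)},
)
```

With these example values, weights shrink at each hop away from the entities the user actually read about, because every similarity is at most 1.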

In one embodiment, the knowledge graph construction apparatus 2300 further includes:

the weight attenuation module is used for acquiring an instance weight attenuation factor, and carrying out weight attenuation on the instance entity weight and the expansion instance entity weight based on the instance weight attenuation factor to obtain the instance entity attenuation weight and the expansion instance entity attenuation weight;

The concept weight calculation module 2310 is further configured to perform extended concept entity weight calculation based on the attenuation weights of the instance entities, the attenuation weights of the extended instance entities, and the similarity of the concepts, so as to obtain attenuation weights of the extended concept entities;

the map creation module 2312 is further configured to create a target interest knowledge map corresponding to the target user identifier based on the target instance entity set, the extended concept entity set, the attenuation weights of each instance entity, the attenuation weights of each extended instance entity, and the attenuation weights of each extended concept entity.

In one embodiment, the weight attenuation module is further configured to obtain a first target time point corresponding to each instance entity and a second target time point corresponding to each extended instance entity, where the first target time point is a historical time point when the current instance entity was previously used as the extended instance entity, and the second target time point is a historical time point when the extended instance entity was previously used as the extended instance entity; acquiring a current time point, determining a first time interval based on a first target time point and the current time point, and determining a second time interval based on a second target time point and the current time point; when the first time interval is within a preset weight attenuation time range, calculating a first initial weight attenuation factor based on the first time interval and a preset weight attenuation speed parameter, and normalizing the first initial weight attenuation factor to obtain example weight attenuation factors respectively corresponding to each example entity; when the second time interval is within the preset weight attenuation time range, calculating a second initial weight attenuation factor based on the second time interval and the preset weight attenuation speed parameter, and normalizing the second initial weight attenuation factor to obtain an example weight attenuation factor corresponding to each expansion example entity.
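As a non-limiting illustration, the weight attenuation described above can be sketched as follows. This passage does not give the attenuation formula or the normalization, so the sketch makes two loudly stated assumptions: the initial factor is an exponential decay `exp(-speed * interval)`, and normalization divides by the largest initial factor. Entities whose time interval falls outside the preset range are simply left without a factor, since the passage does not say how they are treated. All names are hypothetical.

```python
import math
from typing import Dict, Tuple


def decay_factors(
    intervals: Dict[str, float],
    speed: float,
    decay_range: Tuple[float, float],
) -> Dict[str, float]:
    """Compute normalized attenuation factors for entities inside the decay range.

    intervals: entity -> time elapsed since its target time point.
    speed: preset weight attenuation speed parameter.
    """
    low, high = decay_range
    # ASSUMPTION: exponential decay as the initial attenuation factor.
    raw = {
        entity: math.exp(-speed * dt)
        for entity, dt in intervals.items()
        if low <= dt <= high
    }
    if not raw:
        return {}
    # ASSUMPTION: normalize so the largest factor becomes 1.0.
    top = max(raw.values())
    return {entity: factor / top for entity, factor in raw.items()}


def apply_decay(weights: Dict[str, float], factors: Dict[str, float]) -> Dict[str, float]:
    """Multiply each weight by its factor; entities without a factor are unchanged."""
    return {entity: w * factors.get(entity, 1.0) for entity, w in weights.items()}


factors = decay_factors({"a": 0.0, "b": 10.0, "c": 100.0}, 0.1, (0.0, 30.0))
decayed = apply_decay({"a": 2.0, "c": 3.0}, factors)
```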

In one embodiment, the knowledge graph construction apparatus 2300 further includes:

the display module is used for acquiring the interest knowledge graphs corresponding to the target user identifier at each preset time point, and for dynamically and visually displaying the interest knowledge graphs corresponding to the target user identifier at each preset time point.

In one embodiment, as shown in fig. 24, there is provided an information recommendation apparatus 2400, which may employ software modules or hardware modules, or a combination of both, as part of a computer device, the apparatus specifically comprising: an instruction receiving module 2402, an extracting module 2404, a map obtaining module 2406, and a recommending module 2408, wherein:

the instruction receiving module 2402 is configured to receive an information recommendation instruction, where the information recommendation instruction carries a user identifier and an inquiry statement;

the extracting module 2404 is configured to extract an inquiry keyword from the inquiry sentence, so as to obtain a target keyword;

the map acquisition module 2406 is configured to acquire an interest knowledge map corresponding to the user identifier, where the interest knowledge map is created by acquiring a reading text of the target user identifier in a preset time period, performing instance and concept expansion based on each instance entity in the reading text to obtain an expanded entity set, acquiring each instance entity weight in the reading text, calculating each expanded entity weight in the expanded entity set based on each instance entity weight in the reading text, and using the instance entity, the expanded entity set, each instance entity weight and each expanded entity weight in the reading text;

The recommendation module 2408 is configured to determine a target interest entity from the interest knowledge graph, obtain recommendation information based on the target keyword and the target interest entity, and return the recommendation information to the terminal corresponding to the user identifier.
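As a non-limiting illustration, the cooperation of the extracting module 2404, the map acquisition module 2406 and the recommendation module 2408 can be sketched as follows. The sketch assumes the target keyword has already been extracted, the interest knowledge graph is given as a mapping from entity to weight, and the candidate information is a list of titles with tag sets. Ranking keyword matches so that items tagged with the highest-weighted interest entity come first is an assumption, as are all names.

```python
from typing import Dict, List, Set, Tuple


def recommend(
    target_keyword: str,
    interest_weights: Dict[str, float],
    items: List[Tuple[str, Set[str]]],
    limit: int = 3,
) -> List[str]:
    """Return titles matching the keyword, preferring the top interest entity."""
    target_entity = (
        max(interest_weights, key=interest_weights.get) if interest_weights else None
    )
    matches = [(title, tags) for title, tags in items if target_keyword in tags]
    # Stable sort: False sorts before True, so items tagged with the
    # target interest entity are placed first.
    matches.sort(key=lambda item: target_entity not in item[1])
    return [title for title, _ in matches[:limit]]


catalog = [
    ("Pasta night", {"video", "cooking"}),
    ("Finals recap", {"video", "basketball"}),
    ("Stock news", {"article"}),
]
result = recommend("video", {"basketball": 3.0, "cooking": 1.0}, catalog)
```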

For specific limitations of the knowledge graph construction apparatus and the information recommendation apparatus, reference may be made to the above limitations of the knowledge graph construction method and the information recommendation method, which are not repeated here. The modules in the above knowledge graph construction apparatus and information recommendation apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to the above modules.

In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 25. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used to store entity data, knowledge-graph data, and the like. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements a knowledge graph construction method or an information recommendation method.

In one embodiment, a computer device is provided, which may be a terminal, and the internal structure thereof may be as shown in fig. 26. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless mode can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program, when executed by a processor, implements a knowledge graph construction method or an information recommendation method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.

It will be appreciated by those skilled in the art that the structures shown in fig. 25 and 26 are merely block diagrams of portions of structures related to the present application and do not constitute a limitation of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.

In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.

In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, implements the steps of the method embodiments described above.

In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the steps in the above-described method embodiments.

It should be noted that the user information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or fully authorized by all parties. The user may also decline the recommended information, and may do so conveniently.

Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium, and that the program, when executed, may include the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, or the like. Volatile memory may include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take a variety of forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered to be within the scope of this description.

The above examples represent only a few embodiments of the present application. They are described in relative detail, but they are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, and these would fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims

1. The knowledge graph construction method is characterized by comprising the following steps of:

Acquiring the weight of each instance entity, wherein the weight of each instance entity refers to the occurrence times of each instance entity in the reading text and the target reading text in a target time period before the preset time period;

calculating the instance similarity between each extended instance entity and the associated instance entity in each instance entity, and calculating the extended instance entity weight by using the instance entity weights and the instance similarity to obtain each extended instance entity weight;

and establishing an interest knowledge graph corresponding to the target user identifier based on the target instance entity set, the expanded concept entity set, the instance entity weights, the expanded instance entity weights and the expanded concept entity weights.

2. The method of claim 1, wherein the entity linking based on the reading text to obtain the instance entity set corresponding to the reading text comprises:

performing entity word recognition based on the reading text to obtain each entity word;

performing entity recall from a preset knowledge base based on the entity words to obtain candidate entity sets corresponding to the entity words respectively;

and performing entity disambiguation based on the candidate entity sets respectively corresponding to the entity words to obtain the entities respectively corresponding to the entity words, and obtaining the instance entity set corresponding to the reading text based on the entities respectively corresponding to the entity words.

3. The method according to claim 2, wherein the performing entity disambiguation based on the candidate entity sets respectively corresponding to the entity words to obtain the entities respectively corresponding to the entity words includes:

determining a current entity word from the entity words, and acquiring an entity text corresponding to the current entity word;

and inputting the entity text and the corresponding candidate entity set into an entity disambiguation model, wherein the entity disambiguation model maps the entity text and the corresponding candidate entity set into a vector space respectively to obtain an entity word vector corresponding to the current entity word and a candidate entity vector set corresponding to the candidate entity set, calculates the similarity degree of the entity word vector and the candidate entity vector in the candidate entity vector set respectively, and determines the current entity corresponding to the current entity word from the candidate entity set based on the similarity degree.

4. The method of claim 1, wherein the expanding the instance based on each instance entity in the instance entity set to obtain an expanded instance entity set comprises:

determining a current instance entity from the instance entity set, and searching associated candidate instance entities in a preset knowledge base according to a preset association relation by using the current instance entity;

calculating the instance similarity between the current instance entity and each candidate instance entity, and selecting an expansion instance entity associated with the current instance entity from the candidate instance entities based on the instance similarity;

and traversing each instance entity in the instance entity set to obtain the extended instance entity set.

5. The method of claim 1, wherein the performing concept expansion based on each target instance entity in the target instance entity set to obtain an expanded concept entity set includes:

acquiring an instance relation and a sub-class relation, and performing concept expansion based on each target instance entity in a preset knowledge base according to the instance relation to obtain a first expansion concept entity set;

performing concept expansion based on each first expansion concept entity in the first expansion concept entity set in the preset knowledge base according to the sub-class relation to obtain the second expansion concept entity set;

And obtaining the extended concept entity set based on the first extended concept entity set and the second extended concept entity set.

6. The method of claim 5, wherein the performing concept expansion in a preset knowledge base according to the instance relation based on the target instance entities to obtain a first expanded concept entity set includes:

determining a current target instance entity from the target instance entities, and searching associated first candidate expansion concept entities in a preset knowledge base according to the instance relation by using the current target instance entity;

calculating first concept similarity degrees of the current target instance entity and each first candidate expanded concept entity respectively, and selecting a first expanded concept entity corresponding to the current target instance entity from each first candidate expanded concept entity based on the first concept similarity degrees;

and traversing each target instance entity in the target instance entity set to obtain a first extended concept entity set.

7. The method of claim 5, wherein the performing concept expansion in the preset knowledge base based on each first expansion concept entity in the first expansion concept entity set according to the sub-class relationship to obtain the second expansion concept entity set includes:

Determining a current first expansion concept entity from the first expansion concept entity set, and searching each associated second candidate expansion concept entity in the preset knowledge base by using the current first expansion concept entity according to the subclass relation;

calculating second concept similarity degrees of the current first expanded concept entity and each second candidate expanded concept entity, and selecting a second expanded concept entity corresponding to the current first expanded concept entity from each second candidate expanded concept entity based on the second concept similarity degrees;

and traversing each first expanded concept entity in the first expanded concept entity set to obtain a second expanded concept entity set.

8. The method of claim 1, wherein calculating the instance similarity between each expanded instance entity and its associated instance entity among the instance entities, and calculating the expanded instance entity weights by using the instance entity weights and the instance similarities, to obtain each expanded instance entity weight comprises:

determining a current expanded instance entity from the expanded instance entities, and determining, from the instance entities, the instance entity associated with the current expanded instance entity;

obtaining, from the instance entity weights, the associated instance entity weight corresponding to the associated instance entity, and determining, from the instance similarities, the associated instance similarity between the current expanded instance entity and the associated instance entity;

calculating the product of the associated instance entity weight and the associated instance similarity to obtain the expanded instance entity weight corresponding to the current expanded instance entity;

and traversing the expanded instance entities to obtain each expanded instance entity weight.
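By way of non-limiting illustration, the product rule of this claim may be sketched as follows. The data layout (one associated instance entity per expanded instance entity, stored together with its similarity) and all names and values are assumptions made for the example.

```python
from typing import Dict, Tuple


def expanded_instance_weights(
    instance_weights: Dict[str, float],
    associations: Dict[str, Tuple[str, float]],  # expanded instance -> (associated instance, similarity)
) -> Dict[str, float]:
    """Expanded weight = associated instance entity weight * associated instance similarity."""
    return {
        expanded: instance_weights[source] * sim
        for expanded, (source, sim) in associations.items()
    }


# Toy data, purely illustrative.
weights = expanded_instance_weights(
    {"Lionel Messi": 0.8},
    {"Neymar": ("Lionel Messi", 0.5)},
)
```

With similarities in [0, 1], an expanded instance entity never outweighs the instance entity it was derived from.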

9. The method of claim 1, wherein calculating the concept similarity between each expanded concept entity in the expanded concept entity set and its associated target instance entity among the target instance entities, and calculating the expanded concept entity weights by using the instance entity weights, the expanded instance entity weights, and the concept similarities, to obtain each expanded concept entity weight comprises:

determining a first expanded concept entity set and a second expanded concept entity set from the expanded concept entities, wherein the first expanded concept entity set is obtained from the target instance entity set through a preset instance relation, and the second expanded concept entity set is obtained from the first expanded concept entity set through a preset subclass relation;

calculating a first concept similarity between each first expanded concept entity in the first expanded concept entity set and its associated target instance entity among the target instance entities, determining, from the instance entity weights and the expanded instance entity weights, the target instance entity weight associated with each first expanded concept entity, and calculating each first expanded concept entity weight based on the first concept similarity and the associated target instance entity weight;

calculating a second concept similarity between each second expanded concept entity in the second expanded concept entity set and its associated first expanded concept entity among the first expanded concept entities, determining, from the first expanded concept entity weights, the first expanded concept entity weight associated with each second expanded concept entity, and calculating each second expanded concept entity weight based on the second concept similarity and the associated first expanded concept entity weight;

and obtaining the expanded concept entity weights based on the first expanded concept entity weights and the second expanded concept entity weights.

10. The method of claim 9, wherein calculating the first concept similarity between each first expanded concept entity in the first expanded concept entity set and its associated target instance entity among the target instance entities, determining, from the instance entity weights and the expanded instance entity weights, the target instance entity weight associated with each first expanded concept entity, and calculating each first expanded concept entity weight based on the first concept similarity and the associated target instance entity weight comprises:

determining a current first expanded concept entity from the first expanded concept entities, and determining, from the target instance entities, the current target instance entity associated with the current first expanded concept entity;

calculating a current first concept similarity between the current first expanded concept entity and the current target instance entity, and determining, from the instance entity weights and the expanded instance entity weights, the current target instance entity weight corresponding to the current target instance entity;

calculating the product of the current first concept similarity and the current target instance entity weight to obtain the first expanded concept entity weight corresponding to the current first expanded concept entity;

and traversing each first expanded concept entity to obtain each first expanded concept entity weight.

11. The method of claim 9, wherein calculating the second concept similarity between each second expanded concept entity in the second expanded concept entity set and its associated first expanded concept entity among the first expanded concept entities, determining, from the first expanded concept entity weights, the first expanded concept entity weight associated with each second expanded concept entity, and calculating each second expanded concept entity weight based on the second concept similarity and the associated first expanded concept entity weight comprises:

determining a current second expanded concept entity from the second expanded concept entities, and determining, from the first expanded concept entities, the current first expanded concept entity associated with the current second expanded concept entity;

calculating a current second concept similarity between the current second expanded concept entity and the current first expanded concept entity, and determining, from the first expanded concept entity weights, the current first expanded concept entity weight corresponding to the current first expanded concept entity;

calculating the product of the current second concept similarity and the current first expanded concept entity weight to obtain the second expanded concept entity weight corresponding to the current second expanded concept entity;

and traversing each second expanded concept entity to obtain each second expanded concept entity weight.
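By way of non-limiting illustration, the two-stage weight calculation of claims 9 to 11 may be sketched as a repeated product along the instance relation and then the subclass relation. The sketch assumes each expanded entity has exactly one associated source entity; the `propagate` helper, the link dictionaries, and all sample values are hypothetical.

```python
from typing import Dict, Tuple

# Each link maps an expanded entity to (its associated source entity, similarity).
Links = Dict[str, Tuple[str, float]]


def propagate(source_weights: Dict[str, float], links: Links) -> Dict[str, float]:
    """Weight of each expanded entity = source entity weight * similarity to that source."""
    return {entity: source_weights[src] * sim for entity, (src, sim) in links.items()}


# Toy data, purely illustrative.
instance_weights = {"Lionel Messi": 0.8}
expanded_instance_weights = {"Neymar": 0.4}
# Target instance entities may be original or expanded instance entities.
target_weights = {**instance_weights, **expanded_instance_weights}

first_links = {"football player": ("Lionel Messi", 0.5)}  # found via the instance relation
second_links = {"athlete": ("football player", 0.5)}      # found via the subclass relation

first_weights = propagate(target_weights, first_links)    # first expanded concept entity weights
second_weights = propagate(first_weights, second_links)   # second expanded concept entity weights
concept_weights = {**first_weights, **second_weights}     # combined expanded concept entity weights
```

Because each stage multiplies by a similarity, concepts further from the reading text receive smaller weights whenever the similarities are below 1.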

12. The method of claim 1, wherein, after calculating the instance similarity between each expanded instance entity and its associated instance entity among the instance entities and calculating the expanded instance entity weights by using the instance entity weights and the instance similarities to obtain each expanded instance entity weight, the method further comprises:

acquiring an instance weight attenuation factor, and attenuating the instance entity weights and the expanded instance entity weights based on the instance weight attenuation factor to obtain instance entity attenuation weights and expanded instance entity attenuation weights;

the calculating of the expanded concept entity weights by using the instance entity weights, the expanded instance entity weights, and the concept similarities to obtain each expanded concept entity weight comprises:

calculating the expanded concept entity weights based on the instance entity attenuation weights, the expanded instance entity attenuation weights, and the concept similarities to obtain expanded concept entity attenuation weights;

the establishing of an interest knowledge graph corresponding to the target user identifier based on the target instance entity set, the expanded concept entity set, the instance entity weights, the expanded instance entity weights, and the expanded concept entity weights comprises:

13. The method of claim 12, wherein acquiring the instance weight attenuation factor comprises:

acquiring a first target time point corresponding to each instance entity and a second target time point corresponding to each expanded instance entity, wherein the first target time point is the most recent historical time point at which the current instance entity was used as an expanded instance entity, and the second target time point is the most recent historical time point at which the expanded instance entity was used as an expanded instance entity;

acquiring a current time point, determining a first time interval based on the first target time point and the current time point, and determining a second time interval based on the second target time point and the current time point;

when the first time interval is within a preset weight attenuation time range, calculating a first initial weight attenuation factor based on the first time interval and a preset weight attenuation speed parameter, and normalizing the first initial weight attenuation factor to obtain the instance weight attenuation factor corresponding to each instance entity;

and when the second time interval is within the preset weight attenuation time range, calculating a second initial weight attenuation factor based on the second time interval and the preset weight attenuation speed parameter, and normalizing the second initial weight attenuation factor to obtain the instance weight attenuation factor corresponding to each expanded instance entity.
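By way of non-limiting illustration, the attenuation factor calculation may be sketched as follows. This claim does not fix the attenuation function or the normalization; the exponential form, the division by the largest factor, the day-based intervals, and the default parameter values are all assumptions. The claim is also silent on entities whose interval falls outside the preset range, so the sketch simply returns no factor for them.

```python
import math
from typing import Dict


def attenuation_factors(
    interval_days: Dict[str, float],  # entity -> time interval since its target time point (assumed unit: days)
    speed: float = 0.1,               # hypothetical weight attenuation speed parameter
    max_days: float = 30.0,           # hypothetical weight attenuation time range
) -> Dict[str, float]:
    """Exponential attenuation (assumed form), normalized by the largest factor (assumed rule)."""
    initial = {
        entity: math.exp(-speed * days)
        for entity, days in interval_days.items()
        if 0 <= days <= max_days      # only intervals inside the preset range
    }
    if not initial:
        return {}
    largest = max(initial.values())
    return {entity: value / largest for entity, value in initial.items()}


# Toy data, purely illustrative.
factors = attenuation_factors({"A": 0.0, "B": 10.0, "C": 45.0})
```

The same function can be applied once to the instance entities and once to the expanded instance entities, mirroring the two parallel steps of the claim.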

14. The method of claim 1, further comprising, after establishing the interest knowledge graph corresponding to the target user in the current time period based on each current instance entity, each expanded concept entity, the current instance weights, the expanded instance weights, and the expanded concept weights:

15. An information recommendation method, the method comprising:

extracting query keywords from the query sentences to obtain target keywords;

acquiring an interest knowledge graph corresponding to the user identifier, wherein the interest knowledge graph is established by: acquiring a reading text of a target user identifier in a preset time period; performing instance expansion and concept expansion based on each instance entity in the reading text to obtain an expanded entity set; acquiring each instance entity weight in the reading text; calculating each expanded entity weight in the expanded entity set based on the instance entity weights; and using the instance entities in the reading text, the expanded entity set, the instance entity weights, and the expanded entity weights;

16. A knowledge graph construction apparatus, characterized in that the apparatus comprises:

an instance obtaining module, configured to acquire a reading text of a target user identifier in a preset time period, and to perform entity linking based on the reading text to obtain an instance entity set corresponding to the reading text;

an instance weight calculation module, configured to calculate the instance similarity between each expanded instance entity and its associated instance entity among the instance entities, and to calculate the expanded instance entity weights by using the instance entity weights and the instance similarities to obtain each expanded instance entity weight;

17. An information recommendation device, characterized in that the device comprises:

a graph acquisition module, configured to acquire an interest knowledge graph corresponding to the user identifier, wherein the interest knowledge graph is established by: acquiring a reading text of a target user identifier in a preset time period; performing instance expansion and concept expansion based on each instance entity in the reading text to obtain an expanded entity set; acquiring each instance entity weight in the reading text; calculating each expanded entity weight in the expanded entity set based on the instance entity weights; and using the instance entities in the reading text, the expanded entity set, the instance entity weights, and the expanded entity weights;

and a recommendation module, configured to determine a target interest entity from the interest knowledge graph, acquire recommendation information based on the target keyword and the target interest entity, and return the recommendation information to the terminal corresponding to the user identifier.

18. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 15.

19. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method of any one of claims 1 to 15.

20. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method of any of claims 1 to 15.