CN106844603A - Entity popularity calculation method and device, and application method and device - Google Patents

Info

Publication number
CN106844603A
Authority
CN
China
Prior art keywords
popularity
entity
knowledge
basic attribute
answer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710029383.9A
Other languages
Chinese (zh)
Other versions
CN106844603B (en
Inventor
简仁贤
陈思聪
产文
贾陆华
叶俊杰
董彦均
袁皓
曹军
乔巍
靳颖超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuzhi Technology Beijing Co ltd
Original Assignee
Intelligent Technology (shanghai) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intelligent Technology (shanghai) Co Ltd filed Critical Intelligent Technology (shanghai) Co Ltd
Priority to CN201710029383.9A priority Critical patent/CN106844603B/en
Publication of CN106844603A publication Critical patent/CN106844603A/en
Application granted granted Critical
Publication of CN106844603B publication Critical patent/CN106844603B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a method and a device for calculating entity popularity in a knowledge graph, and a method and a device for applying that entity popularity in human-machine dialogue. By calculating the popularity of entities in the knowledge graph and applying it during human-machine dialogue, the scoring of knowledge-type question answering can be effectively quantified. The invention sets confidence scores for knowledge-type answers and reduces cases in which everyday expressions are pre-empted by knowledge-type answers when a chat-type answer is appropriate; it enables topic extension in dialogue between a person and an emotional chat robot, for example, when a topic comes up in the dialogue, the robot can proactively ask about related popular entries; and it handles ambiguous entity names in knowledge-type answers, outputting the answer for the default entity entry when no other clues appear in the dialogue context.

Description

Entity popularity calculation method and device, and application method and device
Technical Field
The invention relates to artificial intelligence dialogue systems, and in particular to a method and a device for calculating entity popularity in a knowledge graph and a method and a device for applying the entity popularity in the knowledge graph in human-machine dialogue.
Background
Compared with a traditional corpus-retrieval dialogue system, an artificial intelligence dialogue system that incorporates a knowledge graph has the advantage of being able to answer questions about knowledge and common sense, so that people chatting with it feel that the robot, like a person, can remember, understand, and discuss knowledge. The processing flow of such a system is as follows: the user inputs a sentence; chat-type answers and knowledge-graph-based knowledge-type answers are processed in parallel (each module gives candidate answers together with a confidence score, and the higher the score, the more strongly the module wants its result to be returned); finally, a final ranker selects the most appropriate answer from all candidate answers and returns it to the user.
When the number of entities (entries) in a knowledge graph reaches the order of millions or even billions, many entities (entries) coincide with everyday expressions, for example "Who Am I" (a movie title) and "How Good You Are" (a song title). Knowledge-graph-based knowledge-type answering therefore has to determine: whether the intent of the user's input sentence is to ask about knowledge; whether the entry being asked about is an everyday expression; whether the knowledge answering module should be triggered to pre-empt other answers; and how to set the confidence score of the answer. If these problems are not solved, knowledge-type answers will pre-empt cases in which a chat-type answer should have been triggered. In addition, the trigger priority among entities with the same name also needs to be solved.
Disclosure of Invention
The invention aims to provide a method and a device for calculating entity popularity in a knowledge graph and a method and a device for applying the entity popularity in the knowledge graph in human-machine dialogue, in order to solve two problems: when an existing artificial intelligence dialogue system encounters a same-name entity during human-machine dialogue, it cannot determine from the intent of the user's input sentence whether a knowledge-type answer or a chat-type answer should be triggered, and it cannot determine the trigger priority among same-name entities.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a method for calculating entity popularity in a knowledge graph comprises the following steps:
capturing an encyclopedia page of an entity in a knowledge graph, and counting basic attributes of the encyclopedia page of the entity to obtain a statistical result of the basic attributes; the basic attributes comprise one or more of: number of attributes, number of links, page length, production date/release time, encyclopedia page view count, encyclopedia page latest update time, and frequency of the entity in everyday expressions;
setting the initial popularity of each basic attribute according to the statistical result of the basic attributes;
normalizing the initial popularity of each basic attribute to obtain the normalized popularity of each basic attribute;
acquiring a weighting coefficient of each basic attribute;
and according to the weighting coefficient of each basic attribute, carrying out weighted summation on the normalized popularity of each basic attribute to obtain the entity popularity.
On the basis of the above embodiment, further, the method further includes:
the entity popularity is updated periodically.
On the basis of the foregoing embodiment, further, the step of periodically updating the entity popularity includes:
updating the initial popularity of each basic attribute;
updating the normalized popularity of each basic attribute according to the updated initial popularity of each basic attribute;
updating the entity popularity according to the updated normalized popularity of each basic attribute; or,
acquiring hot-search data from the hot-search lists of search websites, including rankings and ranking changes;
counting short comments and long comments of the community website according to a time sequence to obtain community data;
counting entities in the man-machine conversation record according to a time sequence to obtain conversation data;
taking the hot search data, the community data and the dialogue data as a calibration data set, and updating the weighting coefficient of each basic attribute according to the calibration data set;
and updating the entity popularity according to the updated weighting coefficient of each basic attribute.
On the basis of any of the above embodiments, further comprising:
and correcting the entity popularity of the adjacent entities in the knowledge graph.
A method for applying entity popularity in a knowledge graph in man-machine conversation comprises the following steps:
acquiring knowledge answers and chatting answers according to information input by a user; the knowledge answer comprises an entity;
calculating the entity popularity using the method for calculating entity popularity in a knowledge graph of any one of the above embodiments;
acquiring a knowledge answer score according to the entity popularity;
acquiring a chatting answer score;
sorting the knowledge answers and the chatting answers according to the knowledge answer scores and the chatting answer scores to obtain a sorting result;
and responding to the user according to the sorting result.
A device for calculating entity popularity in a knowledge graph, comprising:
the statistics module is used for capturing encyclopedia pages of entities in a knowledge graph, and counting basic attributes of the encyclopedia pages of the entities to obtain statistical results of the basic attributes; the basic attributes comprise one or more of: number of attributes, number of links, page length, production date/release time, encyclopedia page view count, encyclopedia page latest update time, and frequency of the entity in everyday expressions;
the setting module is used for setting the initial popularity of each basic attribute according to the statistical result of the basic attributes;
the normalization module is used for normalizing the initial popularity of each basic attribute to obtain the normalized popularity of each basic attribute;
the coefficient acquisition module is used for acquiring the weighting coefficient of each basic attribute;
and the calculation module is used for carrying out weighted summation on the normalized popularity of each basic attribute according to the weighting coefficient of each basic attribute to obtain the entity popularity.
On the basis of the above embodiment, further, the device further includes:
and the updating module is used for regularly updating the entity popularity.
On the basis of the foregoing embodiment, further, the update module is configured to:
updating the initial popularity of each basic attribute;
updating the normalized popularity of each basic attribute according to the updated initial popularity of each basic attribute;
updating the entity popularity according to the updated normalized popularity of each basic attribute; or,
acquiring hot-search data from the hot-search lists of search websites, including rankings and ranking changes;
counting short comments and long comments of the community website according to a time sequence to obtain community data;
counting entities in the man-machine conversation record according to a time sequence to obtain conversation data;
taking the hot search data, the community data and the dialogue data as a calibration data set, and updating the weighting coefficient of each basic attribute according to the calibration data set;
and updating the entity popularity according to the updated weighting coefficient of each basic attribute.
On the basis of any of the above embodiments, further comprising:
and the correction module is used for correcting the entity popularity of the adjacent entities in the knowledge graph.
An apparatus for applying entity popularity in a knowledge graph in man-machine conversation, comprising:
the answer obtaining module is used for obtaining knowledge answers and chatting answers according to information input by a user; the knowledge answer comprises an entity;
the device for calculating entity popularity in a knowledge graph of any one of the above embodiments;
the first score module is used for acquiring knowledge answer scores according to the entity popularity;
the second score module is used for acquiring chatting answer scores;
the sorting module is used for sorting the knowledge answers and the chatting answers according to the knowledge answer scores and the chatting answer scores to obtain a sorting result;
and the response module is used for responding to the user according to the sorting result.
The invention has the beneficial effects that:
the invention provides a method and a device for calculating entity popularity in a knowledge graph and a method and a device for applying the entity popularity in the knowledge graph in human-machine dialogue. The method and the device set confidence scores for knowledge-type answers and reduce cases in which everyday expressions are pre-empted by knowledge-type answers when a chat-type answer is appropriate; they enable topic extension in dialogue between a person and an emotional chat robot, for example, when a topic comes up in the dialogue, the robot can proactively ask about related popular entries; and they handle ambiguous entity names in knowledge-type answers, outputting the answer for the default (most popular) entity entry when no other clues appear in the dialogue context.
Drawings
The invention is further illustrated with reference to the following figures and examples.
FIG. 1 is a flow chart illustrating a method for calculating entity popularity in a knowledge graph provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a computing device for entity popularity in a knowledge graph according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
Embodiment 1
As shown in FIG. 1, the embodiment of the invention provides a method for calculating the popularity of entities in a knowledge graph, which comprises the following steps.
Step S101: capturing an encyclopedia page of an entity in a knowledge graph, counting basic attributes of the encyclopedia page of the entity, and obtaining a statistical result of the basic attributes. The embodiment of the present invention does not limit the basic attributes; they may include one or more of: number of attributes, number of links, page length, production date/release time, encyclopedia page view count, encyclopedia page latest update time, and frequency of the entity in everyday expressions.
Step S102: setting the initial popularity of each basic attribute according to the statistical result of the basic attributes.
Step S103: normalizing the initial popularity of each basic attribute to obtain the normalized popularity of each basic attribute.
Step S104: acquiring the weighting coefficient of each basic attribute.
Step S105: carrying out a weighted summation of the normalized popularity of each basic attribute according to the weighting coefficient of each basic attribute to obtain the entity popularity.
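Steps S103 to S105 can be sketched as follows. This is a minimal illustration, not the patented implementation: the source does not say which normalization is used, so min-max scaling across entities is an assumption, and the entity names, attribute values, and weights are hypothetical.

```python
from typing import Dict, List


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale one attribute's initial popularity into [0, 1] across entities.

    Min-max scaling is an assumption; the source only says "normalizing".
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def entity_popularity(
    initial: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> Dict[str, float]:
    """Weighted sum of normalized per-attribute popularity (steps S103-S105)."""
    entities = list(initial)
    normalized: Dict[str, Dict[str, float]] = {e: {} for e in entities}
    for attr in weights:
        column = min_max_normalize([initial[e][attr] for e in entities])
        for entity, value in zip(entities, column):
            normalized[entity][attr] = value
    return {
        e: sum(weights[a] * normalized[e][a] for a in weights)
        for e in entities
    }


# Hypothetical initial popularity values and weights, for illustration only.
initial = {
    "entity_a": {"attribute_count": 30, "link_count": 50, "page_length": 5000},
    "entity_b": {"attribute_count": 10, "link_count": 60, "page_length": 1000},
    "entity_c": {"attribute_count": 20, "link_count": 40, "page_length": 3000},
}
weights = {"attribute_count": 0.3, "link_count": 0.2, "page_length": 0.5}
scores = entity_popularity(initial, weights)
```

Because the weights in this sketch sum to 1 and each normalized value lies in [0, 1], the resulting popularity also lies in [0, 1], which matches the 0 to 1 scale used in the example later in this embodiment.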
The embodiment of the present invention does not limit how the weighting coefficients of the basic attributes are obtained in step S104. Preferably, a number of entities are sampled, each sample is manually labeled as hot or cold, and the weighting coefficient of each basic attribute is then trained on the labeled hot and cold samples using logistic regression, a machine learning algorithm.
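A minimal sketch of that training step is shown below, using plain batch gradient descent on the logistic loss. The feature vectors, labels, learning rate, and epoch count are all hypothetical; the source names logistic regression but gives no training details.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_weights(
    samples: List[List[float]],
    labels: List[int],
    lr: float = 0.5,
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Fit logistic regression by batch gradient descent.

    samples: normalized per-attribute popularity for each labeled entity.
    labels:  1 for a manually labeled hot sample, 0 for a cold sample.
    Returns the learned attribute weights and the bias term.
    """
    n_features = len(samples[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(samples)
    return w, b


# Hypothetical labeled samples: [normalized attribute count, normalized page views].
samples = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.95], [0.1, 0.2], [0.2, 0.1], [0.15, 0.3]]
labels = [1, 1, 1, 0, 0, 0]
w, b = train_weights(samples, labels)
```

The learned coefficients `w` could then serve as the weighting coefficients in step S105, possibly after rescaling so that they sum to 1; that rescaling is a design choice the source does not specify.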
The embodiment of the invention calculates the entity popularity in the knowledge graph and applies it during human-machine dialogue, so that the scoring of knowledge-type question answering can be effectively quantified.
In the embodiment of the present invention, the number of attributes refers to the number of attributes listed for the entry. Encyclopedia pages and community entry pages generally list some basic attributes of an entry; for example, if the entry is a movie, the attributes may include: Chinese title, English title, release date, director, cast, and rating. The number of attributes is positively correlated with the popularity of the entity entry.
In the embodiment of the present invention, the number of links refers to the count of links to other entity entry pages. For example, when the introductory content of an entity entry page mentions other entity entries, it may link to their pages, and the number of links is the count of such links. The number of links is positively correlated with the popularity of the entity entry.
In the embodiment of the invention, the page length refers to the number of words in the entity entry page. The word count covers the general introduction and category-specific sections: for example, a movie entry has a plot summary, reviews, and character introductions; a person entry has sections such as early life and first fortune; a tool entry has its scope of application and working principle. Page length has been found to be positively correlated with the popularity of the entity entry.
In the embodiment of the invention, statistics on production date/release time mostly apply to film and television works, books, and magazines. When the other basic attribute statistics are the same, the closer the date is to the current time, the higher the popularity.
In the embodiment of the invention, the encyclopedia page view count refers to the actual number of visits to the page. The page view count has been found to be positively correlated with the popularity of the entity entry.
In the embodiment of the invention, the encyclopedia page latest update statistic refers to the time at which the entity entry page was last updated. When the other basic attribute statistics are the same, a more recently updated entry is more likely to be a topical entry, that is, a more popular entity.
In the embodiment of the present invention, the frequency of an entity in everyday expressions refers to how often the entity's name occurs in everyday language. One direct use is to cut off the popularity when this frequency is high; another use, when applied in human-machine dialogue, is to adjust the score of the robot's answer in combination with the popularity. Assume two entries have the same popularity: "Close Your Eyes When It Gets Dark" (a social party game) and "Hello" (an everyday expression that is also the title of songs by several different artists and the name of a variety show). Clearly "Hello" occurs far more often in everyday language and is more likely to be meant as an everyday expression.
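The two uses of everyday-expression frequency described above can be sketched together. The cutoff value, the linear discount, and the example frequencies are assumptions made for illustration; the source does not give a formula.

```python
def knowledge_score_with_usage(
    popularity: float,
    daily_frequency: float,
    cutoff: float = 0.8,
) -> float:
    """Combine entity popularity with everyday-expression frequency.

    popularity:      entity popularity in [0, 1].
    daily_frequency: hypothetical normalized frequency of the entity name
                     in everyday language, in [0, 1].
    cutoff:          hypothetical threshold above which the knowledge
                     answer is suppressed entirely (the "cutoff" use).
    """
    if daily_frequency >= cutoff:
        return 0.0
    # Assumed linear discount: the more common the phrase, the lower the score.
    return popularity * (1.0 - daily_frequency)


# Two entries with the same popularity but different everyday frequency.
hello_score = knowledge_score_with_usage(0.6, daily_frequency=0.9)
game_score = knowledge_score_with_usage(0.6, daily_frequency=0.05)
```

With these illustrative numbers the greeting is cut off while the party-game title keeps most of its score, which is the behavior the paragraph describes.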
For example, on a certain encyclopedia site the entity entry "Yao Ming" has several same-name senses. (I) Yao Ming (vice chairman and general manager of Zhongzhilian). In the initial popularity calculation: number of attributes 29; number of links 50; page length 5533; encyclopedia edit count 984; page views about 106 million; and so on. Under the periodic update mechanism, this Yao Ming entry appears on hot-search lists such as the notable-people chart. Among its relations in the knowledge graph, the wife "Ye Li", the teammate "Yi Jianlian", and others are also highly popular entities. (II) Yao Ming (a national first-class composer in China). In the initial popularity calculation: number of attributes 11; number of links 53; page length 999; encyclopedia edit count 35; page views over 6 million. Under the periodic update mechanism, this Yao Ming entry is not on any hot-search list, and its related entities in the knowledge graph are not highly popular.
As a result, assuming a popularity scale of 0 to 1, Yao Ming (I) (vice chairman and general manager of Zhongzhilian) is highly popular, with a score of 0.98, while Yao Ming (II) (the national first-class composer) has a popularity score of 0.45.
Preferably, the embodiment of the present invention may further include step S106: periodically updating the entity popularity.
The embodiment of the present invention does not limit how the entity popularity is updated. Preferably, the step of periodically updating the entity popularity may be: updating the initial popularity of each basic attribute; updating the normalized popularity of each basic attribute according to the updated initial popularity; and updating the entity popularity according to the updated normalized popularity. Alternatively, it may be: acquiring hot-search data from the hot-search lists of search websites, including rankings and ranking changes; counting short comments and long comments on community websites in time order to obtain community data; counting entities in human-machine dialogue records in time order to obtain dialogue data; taking the hot-search data, the community data, and the dialogue data as a calibration data set, and updating the weighting coefficient of each basic attribute according to the calibration data set; and updating the entity popularity according to the updated weighting coefficients. The embodiment of the present invention does not limit the algorithm for updating the weighting coefficients; preferably, it may be a machine-learning-based re-ranking algorithm.
The embodiment does not limit how ranking changes in the hot-search data are used. Preferably, points are added to or subtracted from the initial popularity according to the hot-search data: for example, a rise in the hot-search ranking adds points and a fall subtracts points, with the size of the adjustment scaled dynamically to the degree of change.
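One way to read this rule is a signed adjustment proportional to the rank change. The sketch below assumes a linear scale with a hypothetical `step` per rank position; the source only says the size is adjusted dynamically according to the degree of change.

```python
def adjust_for_rank_change(
    initial_popularity: float,
    previous_rank: int,
    current_rank: int,
    step: float = 0.01,
) -> float:
    """Add or subtract points from the initial popularity by rank movement.

    Rank 1 is the top of the hot-search list, so a smaller current rank
    means the entity moved up. `step` (points per position) is hypothetical.
    """
    change = previous_rank - current_rank  # positive when the entity rose
    return initial_popularity + step * change


rising = adjust_for_rank_change(0.5, previous_rank=10, current_rank=4)
falling = adjust_for_rank_change(0.5, previous_rank=4, current_rank=10)
```

No clamping is applied here because the adjusted value is normalized again in the following step of the update procedure.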
In the embodiment of the invention, the community data mainly concerns film and television works and books, for which comments can be found on community websites. The comments are summed over time, distinguishing them by length and quality, with the time of each comment used as a reference for its coefficient in the weighted sum: specifically, the closer a comment is to the present, the larger its coefficient. For example, 10 comments from a year ago count differently from 10 comments from last night; 10 short comments from last night count differently from 10 long comments from last night; and 10 three-star short comments from last night count differently from 10 five-star long comments from last night. The counting result may be used in two ways: (1) adding points directly; or (2) using it as a reference in the calibration data set and introducing machine-learning re-ranking.
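The time-weighted comment count can be sketched as below. The exponential half-life decay, the factor for long comments, and the use of star rating as a quality proxy are all assumptions; the source states only that more recent comments get larger coefficients and that length and quality are distinguished.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Comment:
    age_days: float  # how long ago the comment was posted
    is_long: bool    # long comment vs. short comment
    stars: int       # rating from 1 to 5, used here as a quality proxy


def community_score(
    comments: Iterable[Comment],
    half_life_days: float = 30.0,
    long_factor: float = 2.0,
) -> float:
    """Weighted sum of comments; newer, longer, higher-rated ones count more.

    half_life_days and long_factor are hypothetical tuning parameters.
    """
    total = 0.0
    for c in comments:
        recency = 0.5 ** (c.age_days / half_life_days)
        length = long_factor if c.is_long else 1.0
        quality = c.stars / 5.0
        total += recency * length * quality
    return total


year_old = [Comment(365, False, 5)] * 10
last_night_short = [Comment(1, False, 5)] * 10
last_night_long = [Comment(1, True, 5)] * 10
```

Under this weighting the three comparisons in the paragraph hold: `last_night_short` outscores `year_old`, and `last_night_long` outscores `last_night_short`.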
In the embodiment of the invention, the dialogue data is obtained in the same way as the community data; only the data source needs to be replaced. The count can be a single count common to all users, or it can be customized for each user based on preferences and habits. Accordingly, the popularity calculation may be one scoring system shared by all users, or a scoring system customized for each user.
Preferably, the calculation method according to the embodiment of the present invention may further include step S107: correcting the entity popularity of adjacent entities in the knowledge graph. In the knowledge graph, a node is an entry entity and stores all attributes of that entity. An edge between two nodes stores the relationship between the two entities they represent, together with all attributes of that relationship. For example, entity A "Yao Ming" and entity B "Ye Li" are represented by two nodes in the graph, with attributes stored in each node (e.g., height, profile, major honors). Their relationships are directional: A points to B with relationship R1 "wife", and B points to A with relationship R2 "husband". In plain language, A -R1-> B reads "Yao Ming (A)'s wife (R1) is Ye Li (B)", and A <-R2- B reads "Ye Li (B)'s husband (R2) is Yao Ming (A)". Relationships are of course not limited to those between people and can take many forms, such as "Andy Lau (A)'s representative work (R) includes Infernal Affairs (B)", "Infernal Affairs (B)'s lead actors (R) include Andy Lau (A)", or "white (A) belongs to (R) color (B)". The goal of correcting popularity across related adjacent entities is to refine the popularity of each entity: for example, because the popularity of the entity "Yao Ming" is high, the popularity of the entity "Ye Li" connected by the "wife" relationship, and of the entity "Yao Qinlei" connected by the "daughter" relationship, is also high.
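The node and directed-relationship structure described above could be held in memory roughly as follows. The class and method names are hypothetical and the attribute values are placeholders; the source does not prescribe a storage format.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


class KnowledgeGraph:
    """Minimal directed, labeled graph: nodes hold entity attributes,
    edges hold a relationship name plus that relationship's attributes."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Dict[str, Any]] = {}
        # source entity -> list of (relationship, target entity, attributes)
        self.edges: Dict[str, List[Tuple[str, str, Dict[str, Any]]]] = defaultdict(list)

    def add_entity(self, name: str, **attributes: Any) -> None:
        self.nodes[name] = attributes

    def add_relation(self, source: str, relation: str, target: str, **attributes: Any) -> None:
        self.edges[source].append((relation, target, attributes))

    def related(self, source: str, relation: str) -> List[str]:
        """Targets reached from `source` through `relation`."""
        return [t for r, t, _ in self.edges[source] if r == relation]


kg = KnowledgeGraph()
kg.add_entity("Yao Ming", profile="placeholder profile text")
kg.add_entity("Ye Li", profile="placeholder profile text")
kg.add_relation("Yao Ming", "wife", "Ye Li")     # A -R1-> B
kg.add_relation("Ye Li", "husband", "Yao Ming")  # B -R2-> A
```

Storing each direction as its own edge mirrors the text, where "wife" and "husband" are two separate directed relationships rather than one undirected link.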
This entity popularity ranking problem is similar to PageRank's web page ranking problem: the popularity of an entity corresponds to the rank of a web page, and the relationships between entities correspond to link jumps between web pages (that is, a relationship from entity A to entity B corresponds to a jump from page A to page B). The problem can therefore be transformed into a further numerical correction and ranking of the popularity of all entities in the knowledge graph using an algorithm derived from PageRank. Experiments show that adjusting the popularity transfer threshold percentage achieves good convergence.
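A minimal sketch of such a PageRank-like correction follows. It is one possible derivation, not the patented algorithm: each entity keeps a share of its own base popularity and receives a share from entities that point to it, with the `transfer` ratio standing in for the "popularity transfer threshold percentage". The update rule and all numbers are assumptions.

```python
from typing import Dict, List, Tuple


def propagate_popularity(
    base: Dict[str, float],
    edges: List[Tuple[str, str]],
    transfer: float = 0.3,
    tol: float = 1e-9,
    max_iter: int = 100,
) -> Dict[str, float]:
    """Iteratively correct popularity along directed relationships.

    corrected[n] = (1 - transfer) * base[n]
                   + transfer * sum(corrected[src] / out_degree[src])
    summed over every edge src -> n. Stops once the largest change
    between iterations falls below `tol`.
    """
    out_degree = {n: 0 for n in base}
    for src, _ in edges:
        out_degree[src] += 1
    scores = dict(base)
    for _ in range(max_iter):
        incoming = {n: 0.0 for n in base}
        for src, dst in edges:
            incoming[dst] += scores[src] / out_degree[src]
        updated = {
            n: (1 - transfer) * base[n] + transfer * incoming[n] for n in base
        }
        delta = max(abs(updated[n] - scores[n]) for n in base)
        scores = updated
        if delta < tol:
            break
    return scores


# Hypothetical base popularity; only the first value comes from the example above.
base = {"Yao Ming": 0.98, "Ye Li": 0.30, "unrelated entity": 0.30}
edges = [("Yao Ming", "Ye Li"), ("Ye Li", "Yao Ming")]
corrected = propagate_popularity(base, edges)
```

In this toy graph "Ye Li" ends up above the unrelated entity with the same base popularity because it receives popularity from "Yao Ming". Since `transfer` is below 1 and each source splits its score across its outgoing edges, the iteration contracts and converges quickly.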
In the first embodiment, a method for calculating entity popularity in a knowledge graph is provided, and correspondingly, a device for calculating entity popularity in a knowledge graph is also provided. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
Embodiment 2
The embodiment of the invention provides an application method of entity popularity in a knowledge graph in man-machine conversation, which comprises the following steps: acquiring knowledge answers and chatting answers according to information input by a user; the knowledge answer comprises an entity; the method for calculating the entity popularity in the knowledge graph in any one of the specific embodiments is used for calculating to obtain the entity popularity; acquiring a knowledge answer score according to the entity popularity; acquiring a chatting answer score; sorting the knowledge answers and the chatting answers according to the knowledge answer scores and the chatting answer scores to obtain a sorting result; and responding to the user according to the sorting result.
The embodiment of the invention sets confidence scores for knowledge-type answers and reduces cases in which everyday expressions are pre-empted by knowledge-type answers when a chat-type answer is appropriate; it enables topic extension in dialogue between a person and an emotional chat robot, for example, when a topic comes up in the dialogue, the robot can proactively ask about related popular entries; and it handles ambiguous entity names in knowledge-type answers, outputting the answer for the default entity entry when no other clues appear in the dialogue context. The default entity entry may be the entity entry with the highest entity popularity.
While the user chats with the robot, knowledge-type answers are scored according to the entity popularity calculated as in Embodiment 1. The final ranker then makes a selection from the answers and scores given by all modules (both knowledge-type and chat-type answers) and sends the chosen answer back to the user. The score of a knowledge-type answer is therefore positively correlated with the popularity of its entry.
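The final ranking step can be sketched as a sort over scored candidates. The candidate texts and scores are hypothetical, and using the entity popularity directly as the knowledge answer score is an assumption; the source only says the two are positively correlated.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str
    kind: str     # "knowledge" or "chat"
    score: float  # confidence score reported by the answering module


def rank_answers(candidates: List[Candidate]) -> List[Candidate]:
    """Final ranker: order all candidate answers by score, best first."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)


# A cold entry yields a low knowledge score, so the chat answer wins.
candidates = [
    Candidate("It is a movie titled ...", kind="knowledge", score=0.2),
    Candidate("You are my favorite chat partner!", kind="chat", score=0.6),
]
reply = rank_answers(candidates)[0]
```

Keeping the ranker independent of how each module computes its score matches the parallel structure described in the Background section.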
Based on the popularity of the entity entry (together with another dimension: the frequency of the entity entry in everyday expressions), the scoring of knowledge-type answers is customized for the following categories:
(i) The user's sentence consists only of an entity entry or a synonym of an entity entry, for example the user says "Jay Chou" or an alias of his. This category is decided according to context:
(i.a) If the previous round of the dialogue history shows that the robot asked a question and the user is answering it in this round, for example the robot asks "Who is your favorite singer?" and the user answers "Jay Chou", the score is lowered relative to the popularity, to prevent an inappropriate knowledge-type answer.
(i.b) If the previous round of the dialogue history indicates that the user is initiating a topic, this is equivalent to the user wanting the robot to introduce the entity "Jay Chou". In this case the score is raised relative to the popularity, so that the knowledge-type introduction or the answer based on knowledge reasoning becomes a high-scoring result.
(i.c) If the previous round of the dialogue history does not support a confident judgment, the score is given according to the popularity of the entity entry. Because the popularity of a cold entry is low, the knowledge answer score is also low in this case, which to some extent prevents inappropriate answers for cold entries (or for entries that occur frequently in everyday expressions).
(ii) The user's sentence shows an intent to ask for an introduction to the entity entry, to ask about an attribute of the entity entry, or to ask about a relationship of the entity entry, such as "Do you know who Jay Chou is?", "Do you know Jay Chou's representative works?", or "Who is Jay Chou's wife?". The answer is scored by combining the confidence value of the knowledge-question intent classifier with the popularity of the entry.
(iii) The user asks about multiple entity entries, such as "What is the relationship between Jay Chou and Kun Ling?". The answer is scored by combining the confidence value of the knowledge-question intent classifier with the popularity of the entries in the sentence.
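Rules (i.a) through (iii) can be sketched as one scoring function. The case labels, the penalty and boost factors, the averaging over multiple entities, and the product used to combine intent confidence with popularity are all hypothetical; the source states only the direction of each adjustment.

```python
from typing import Optional, Sequence


def knowledge_answer_score(
    popularities: Sequence[float],
    case: str,
    intent_confidence: Optional[float] = None,
    penalty: float = 0.5,
    boost: float = 1.5,
) -> float:
    """Score a knowledge-type answer from entity popularity and context.

    popularities: popularity of each entity mentioned in the sentence.
    case: "answering_robot"   (i.a) user is answering the robot's question
          "user_opens_topic"  (i.b) user is initiating a topic
          "uncertain"         (i.c) context gives no confident judgment
          "knowledge_intent"  (ii)/(iii) explicit knowledge question
    """
    base = sum(popularities) / len(popularities)  # assumed: mean over entities
    if case == "answering_robot":
        score = base * penalty
    elif case == "user_opens_topic":
        score = base * boost
    elif case == "uncertain":
        score = base
    elif case == "knowledge_intent":
        if intent_confidence is None:
            raise ValueError("intent_confidence is required for knowledge_intent")
        score = base * intent_confidence
    else:
        raise ValueError(f"unknown case: {case}")
    return max(0.0, min(1.0, score))  # keep the score on the 0-1 scale


lowered = knowledge_answer_score([0.9], "answering_robot")
raised = knowledge_answer_score([0.9], "user_opens_topic")
combined = knowledge_answer_score([0.9, 0.7], "knowledge_intent", intent_confidence=0.8)
```

In practice the case label would come from a dialogue-context classifier, which is outside the scope of this sketch.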
In the first embodiment, taking the entity "Yao Ming" as an example, the application in man-machine interaction is roughly as follows:
(1) Confidence scores are set for knowledge answers: the answers of the two "Yao Ming" question-answer pairs are scored according to popularity to obtain the knowledge answer scores.
(2) Topic extension in a man-machine conversation: when a person chats about a certain topic, the robot can actively bring up related hot entries. For example, when the user asks about "Yao Ming", the robot may extend its answer using his neighboring hot entities, for example saying "He has recent news about XX" and then adding "By the way, his friend Yi Jianlian recently went to play for the Lakers."
(3) Handling of ambiguous entity words in knowledge-class answers: when no other clue is present in the dialog context, the answer for the default (most popular) entity entry is output. For example, when the user asks "Do you know Yao Ming?", the introduction of Yao Ming (the basketball league executive) or the related knowledge-reasoning answer is given.
After the invention is applied, the scores of knowledge-class answers can be effectively quantified in man-machine conversation, and the following problems can be solved:
(1) A confidence score is set for knowledge answers, which reduces the cases in which everyday expressions that belong to the chatting class receive knowledge answers. For example, for the cold entry "Who Am I" (a movie): if the user asks "Who am I?", the knowledge answer is scored low according to the popularity of the entry and rule (i) above, so the chatting answer is returned; if the user asks "Do you know the movie Who Am I?", the knowledge answer is scored high according to the popularity of the entry and rule (ii) above, so the knowledge answer is returned instead of the chatting answer.
(2) When a person chats about a certain topic with the emotional chatting robot, the robot can actively bring up related hot entries. For example, the user asks: "Do the Lakers have an NBA game today?" The entry "Yi Jianlian" has recently become hot because he went to play for the Lakers, and the knowledge graph contains a triple (entity A, relation R, entity B) linking "Yi Jianlian" and "Lakers", namely (Yi Jianlian, currently plays for, Lakers). The robot can therefore answer: "The Lakers have no game today; they play team XX tomorrow at XX o'clock. By the way, did you know that Yi Jianlian has gone to play for the Lakers?"
(3) Handling of ambiguous entity words in knowledge-class answers: when no other clue is present in the dialog context, the answer for the default (most popular) entity entry is output. For example, when the user asks "Do you know Yao Ming?", the entry with the highest popularity is returned, which is the knowledge answer about the former basketball player Yao Ming. (Of course, when a context clue is present, the entity is resolved according to that clue; for example, for "Do you know Maojing Quyao", the answer is the knowledge answer about the Maojing Quyao who is a first-level Quyao in China.)
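The default-entity behavior in item (3) can be illustrated with a short sketch. It assumes each candidate entity carries a popularity value and a short description, and it treats "the description occurs in the dialog context" as the context clue; that matching rule, the `Candidate` type, and the sample values in the usage note are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One entity sharing an ambiguous name (hypothetical structure)."""
    entity_id: str
    description: str
    popularity: float


def resolve_entity(candidates: Sequence[Candidate],
                   context_text: str = "") -> Optional[Candidate]:
    """Pick the entity to answer about.

    If the dialog context mentions a candidate's description (an assumed,
    very simple form of context clue), only those candidates are considered.
    Otherwise the default is the candidate with the highest popularity.
    """
    if not candidates:
        return None
    context = context_text.lower()
    matching = [c for c in candidates
                if c.description and c.description.lower() in context]
    pool = matching if matching else list(candidates)
    return max(pool, key=lambda c: c.popularity)
```

For instance, given two hypothetical candidates described as "basketball player" (popularity 0.9) and "musician" (popularity 0.2), an empty context resolves to the first, while a context containing the word "musician" resolves to the second.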
In the second embodiment, a method for applying entity popularity in a knowledge graph in a human-computer conversation is provided, and correspondingly, an application apparatus for applying entity popularity in a knowledge graph in a human-computer conversation is also provided. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
Detailed description of the preferred embodiment
As shown in FIG. 2, the embodiment of the invention provides a computing device for entity popularity in a knowledge graph, which comprises the following modules.
The statistical module 201 is configured to capture the encyclopedia page of an entity in a knowledge graph and compile statistics on the basic attributes of that page to obtain a statistical result of the basic attributes; the basic attributes comprise one or more of: attribute quantity, link quantity, page length, production date/release time, encyclopedia page view count statistics, encyclopedia page latest update statistics, and the frequency with which the entity appears in everyday expressions.
The setting module 202 is configured to set an initial popularity of each basic attribute according to the statistical result of the basic attributes.
The normalization module 203 is configured to perform normalization processing on the initial popularity of each basic attribute to obtain the normalized popularity of each basic attribute.
A coefficient obtaining module 204, configured to obtain a weighting coefficient of each basic attribute.
The calculating module 205 is configured to perform weighted summation on the normalized popularity of each basic attribute according to the weighting coefficient of each basic attribute, so as to obtain the entity popularity.
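The normalization and weighted-summation steps performed by modules 203 to 205 can be sketched as below. The text does not state which normalization is used, so min-max scaling across entities is an assumption here, and the function names, attribute names, and weights are hypothetical.

```python
from typing import Dict, List


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; min-max scaling is an assumed choice."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All entities share the same value, so the attribute carries no signal.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def entity_popularity(initial: Dict[str, Dict[str, float]],
                      weights: Dict[str, float]) -> Dict[str, float]:
    """Weighted sum of normalized per-attribute popularity for each entity.

    initial maps entity -> {attribute -> initial popularity};
    weights maps attribute -> weighting coefficient.
    """
    entities = list(initial)
    scores = {e: 0.0 for e in entities}
    for attr, w in weights.items():
        column = [initial[e][attr] for e in entities]
        for e, norm in zip(entities, min_max_normalize(column)):
            scores[e] += w * norm
    return scores
```

Normalizing each attribute before summing keeps attributes with large raw ranges, such as page views, from dominating attributes with small ranges, such as link counts.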
The method for acquiring the weighting coefficients of the basic attributes by the coefficient acquisition module 204 is not limited, and preferably, the coefficient acquisition module 204 may be configured to extract a plurality of entities as samples, manually mark the samples as hot samples or cold samples, and train the weighting coefficients of the basic attributes by using a logistic regression algorithm in machine learning for the marked hot samples and cold samples.
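As a rough illustration of training the weighting coefficients from manually marked hot and cold samples, the sketch below fits a logistic regression with plain batch gradient descent. The optimizer, the learning rate, the epoch count, and the bias term are assumptions; the text only states that a logistic regression algorithm is used.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_weights(samples: Sequence[Sequence[float]],
                  labels: Sequence[int],
                  lr: float = 0.5,
                  epochs: int = 2000) -> Tuple[List[float], float]:
    """Fit logistic regression on normalized attribute vectors.

    samples: one vector of normalized attribute popularity per entity.
    labels:  1 for a hot sample, 0 for a cold sample (manual marks).
    Returns (weights per attribute, bias).
    """
    n = len(samples)
    n_features = len(samples[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            # Gradient of the log loss for one sample is (prediction - label) * x.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b
```

The learned `w` can then serve as the per-attribute weighting coefficients in the weighted summation; whether the bias is kept or discarded at that point is a design choice the text leaves open.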
The embodiment of the invention calculates the entity popularity in the knowledge graph and applies it to the man-machine conversation process, so that the scoring of knowledge-class answers can be effectively quantified.
Preferably, the embodiment of the present invention may further include: an update module 206 for periodically updating the entity popularity.
The embodiment of the present invention does not limit the update module. Preferably, the update module may be configured to perform either of the following two alternatives. In the first alternative, the update module updates the initial popularity of each basic attribute; updates the normalized popularity of each basic attribute according to the updated initial popularity of each basic attribute; and updates the entity popularity according to the updated normalized popularity of each basic attribute. In the second alternative, the update module acquires hot search data according to the hot search list, the ranking and the ranking change of a search website; counts short comments and long comments of a community website in time order to obtain community data; counts entities in the man-machine conversation record in time order to obtain conversation data; takes the hot search data, the community data and the conversation data as a calibration data set, and updates the weighting coefficient of each basic attribute according to the calibration data set; and updates the entity popularity according to the updated weighting coefficient of each basic attribute.
Preferably, the embodiment of the present invention may further include a modification module 207, configured to modify entity popularity of adjacent entities in the knowledge graph.
Detailed description of the invention
The embodiment of the invention provides an application device for applying entity popularity in a knowledge graph to man-machine conversation, which comprises: an answer obtaining module, configured to obtain knowledge answers and chatting answers according to information input by a user, wherein the knowledge answer comprises an entity; the computing device for entity popularity in a knowledge graph of any of the above embodiments; a first score module, configured to obtain knowledge answer scores according to the entity popularity; a second score module, configured to obtain chatting answer scores; a sorting module, configured to sort the knowledge answers and the chatting answers according to the knowledge answer scores and the chatting answer scores to obtain a sorting result; and a response module, configured to respond to the user according to the sorting result.
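The sorting and response modules can be pictured with the following minimal sketch. The `Answer` type and the rule of replying with the single top-ranked answer are assumptions made for illustration; the text only requires that the response follow the sorting result.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Answer:
    """A candidate reply with its score (hypothetical structure)."""
    text: str
    kind: str      # "knowledge" or "chat"
    score: float


def rank_answers(knowledge: Sequence[Answer], chat: Sequence[Answer]) -> List[Answer]:
    """Sort knowledge and chatting answers together, highest score first."""
    return sorted([*knowledge, *chat], key=lambda a: a.score, reverse=True)


def respond(knowledge: Sequence[Answer], chat: Sequence[Answer]) -> Answer:
    """Reply with the top-ranked answer (an assumed response policy)."""
    ranked = rank_answers(knowledge, chat)
    if not ranked:
        raise ValueError("no candidate answers")
    return ranked[0]
```

Because Python's `sorted` is stable, a knowledge answer and a chatting answer with equal scores keep the order in which they were passed in, so in this sketch ties go to the knowledge answer.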
The embodiment of the invention sets confidence scores for knowledge answers and thereby reduces the cases in which everyday expressions that belong to the chatting class receive knowledge answers; it enables topic extension in conversations between a person and the emotional chatting robot, for example, when a certain topic comes up in the conversation, the robot can actively bring up related hot entries; and it handles ambiguous entity words in knowledge-class answers by outputting the answer for the default entity entry when no other clue appears in the conversation context. The default entity entry may be the entity entry with the highest entity popularity.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict. Although the present invention has been described to a certain extent, it is apparent that appropriate changes in the respective conditions may be made without departing from the spirit and scope of the present invention. It is to be understood that the invention is not limited to the described embodiments, but is to be accorded the scope consistent with the claims, including equivalents of each element described.

Claims (10)

1. A method for calculating entity popularity in a knowledge graph is characterized by comprising the following steps:
capturing an encyclopedia page of an entity in a knowledge graph, and compiling statistics on the basic attributes of the encyclopedia page of the entity to obtain a statistical result of the basic attributes; the basic attributes comprise one or more of: attribute quantity, link quantity, page length, production date/release time, encyclopedia page view count statistics, encyclopedia page latest update statistics, and the frequency with which the entity appears in everyday expressions;
setting the initial popularity of each basic attribute according to the statistical result of the basic attributes;
normalizing the initial popularity of each basic attribute to obtain the normalized popularity of each basic attribute;
acquiring a weighting coefficient of each basic attribute;
and according to the weighting coefficient of each basic attribute, carrying out weighted summation on the normalized popularity of each basic attribute to obtain the entity popularity.
2. The method of calculating the popularity of entities in a knowledge-graph of claim 1, further comprising:
the entity popularity is updated periodically.
3. The method for calculating entity popularity in a knowledge graph according to claim 2, wherein the step of periodically updating the entity popularity is specifically:
updating the initial popularity of each basic attribute;
updating the normalized popularity of each basic attribute according to the updated initial popularity of each basic attribute;
updating the entity popularity according to the updated normalized popularity of each basic attribute; or,
acquiring hot searching data according to a hot searching list, a ranking and ranking change of a searching website;
counting short comments and long comments of the community website according to a time sequence to obtain community data;
counting entities in the man-machine conversation record according to a time sequence to obtain conversation data;
taking the hot search data, the community data and the dialogue data as a calibration data set, and updating the weighting coefficient of each basic attribute according to the calibration data set;
and updating the entity popularity according to the updated weighting coefficient of each basic attribute.
4. The method for calculating the popularity of entities in a knowledge-graph according to claim 1 or 2, further comprising:
and correcting the entity popularity of the adjacent entities in the knowledge graph.
5. A method for applying entity popularity in a knowledge graph in man-machine conversation is characterized by comprising the following steps:
acquiring knowledge answers and chatting answers according to information input by a user; the knowledge answer comprises an entity;
calculating the entity popularity by the method for calculating entity popularity in a knowledge graph as claimed in any one of claims 1 to 4;
acquiring a knowledge answer score according to the entity popularity;
acquiring a chatting answer score;
sorting the knowledge answers and the chatting answers according to the knowledge answer scores and the chatting answer scores to obtain a sorting result;
and responding to the user according to the sorting result.
6. A computing device for entity popularity in a knowledge graph, comprising:
a statistical module, configured to capture an encyclopedia page of an entity in a knowledge graph and compile statistics on the basic attributes of the encyclopedia page of the entity to obtain a statistical result of the basic attributes; the basic attributes comprise one or more of: attribute quantity, link quantity, page length, production date/release time, encyclopedia page view count statistics, encyclopedia page latest update statistics, and the frequency with which the entity appears in everyday expressions;
the setting module is used for setting the initial popularity of each basic attribute according to the statistical result of the basic attributes;
the normalization module is used for normalizing the initial popularity of each basic attribute to obtain the normalized popularity of each basic attribute;
the coefficient acquisition module is used for acquiring the weighting coefficient of each basic attribute;
and the calculation module is used for carrying out weighted summation on the normalized popularity of each basic attribute according to the weighting coefficient of each basic attribute to obtain the entity popularity.
7. The apparatus for calculating popularity of entities in a knowledge-graph according to claim 6, further comprising:
and the updating module is used for regularly updating the entity popularity.
8. The apparatus for calculating popularity of entities in a knowledge-graph according to claim 7, wherein the update module is configured to:
updating the initial popularity of each basic attribute;
updating the normalized popularity of each basic attribute according to the updated initial popularity of each basic attribute;
updating the entity popularity according to the updated normalized popularity of each basic attribute; or,
acquiring hot searching data according to a hot searching list, a ranking and ranking change of a searching website;
counting short comments and long comments of the community website according to a time sequence to obtain community data;
counting entities in the man-machine conversation record according to a time sequence to obtain conversation data;
taking the hot search data, the community data and the dialogue data as a calibration data set, and updating the weighting coefficient of each basic attribute according to the calibration data set;
and updating the entity popularity according to the updated weighting coefficient of each basic attribute.
9. The apparatus for calculating entity popularity in a knowledge-graph according to claim 6 or 7, further comprising:
and the correction module is used for correcting the entity popularity of the adjacent entities in the knowledge graph.
10. An apparatus for applying entity popularity in a knowledge graph to human-computer interaction, comprising:
the answer obtaining module is used for obtaining knowledge answers and chatting answers according to information input by a user; the knowledge answer comprises an entity;
the computing device for entity popularity in a knowledge graph of any one of claims 6-9;
the first score module is used for acquiring knowledge answer scores according to the entity popularity;
the second score module is used for acquiring chatting answer scores;
the sorting module is used for sorting the knowledge answers and the chatting answers according to the knowledge answer scores and the chatting answer scores to obtain a sorting result;
and the response module is used for responding to the user according to the sorting result.
CN201710029383.9A 2017-01-16 2017-01-16 Entity popularity calculation method and device, and application method and device Active CN106844603B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710029383.9A CN106844603B (en) 2017-01-16 2017-01-16 Entity popularity calculation method and device, and application method and device

Publications (2)

Publication Number Publication Date
CN106844603A true CN106844603A (en) 2017-06-13
CN106844603B CN106844603B (en) 2021-05-11

Family

ID=59124809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710029383.9A Active CN106844603B (en) 2017-01-16 2017-01-16 Entity popularity calculation method and device, and application method and device

Country Status (1)

Country Link
CN (1) CN106844603B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1995669A1 (en) * 2007-05-24 2008-11-26 Deutsche Telekom AG Ontology-content-based filtering method for personalized newspapers
CN104504124A (en) * 2014-12-31 2015-04-08 合一网络技术(北京)有限公司 Method for presenting entity popularity through video searching and playing behaviors
CN105068661A (en) * 2015-09-07 2015-11-18 百度在线网络技术(北京)有限公司 Man-machine interaction method and system based on artificial intelligence
CN105989143A (en) * 2015-02-28 2016-10-05 科大讯飞股份有限公司 Network entity popular degree analysis method and system

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107329986A (en) * 2017-06-01 2017-11-07 竹间智能科技(上海)有限公司 The interactive method and device recognized based on language performance
CN107679103A (en) * 2017-09-08 2018-02-09 口碑(上海)信息技术有限公司 For entity attributes analysis method and system
CN109726846A (en) * 2017-10-30 2019-05-07 北京国双科技有限公司 Movable box office prediction technique and device
CN107943774A (en) * 2017-11-20 2018-04-20 北京百度网讯科技有限公司 article generation method and device
CN108154427A (en) * 2017-12-01 2018-06-12 上海富利通信息系统有限公司 A kind of data detection method, device and electronic equipment
CN108154427B (en) * 2017-12-01 2022-01-28 上海子午线新荣科技有限公司 Data detection method and device and electronic equipment
CN110110051A (en) * 2018-01-31 2019-08-09 阿里巴巴集团控股有限公司 A kind of dialogue configuration method and server
CN110309189B (en) * 2018-03-13 2023-04-18 深圳市腾讯计算机系统有限公司 Method and device for acquiring heat of entity words
CN110309189A (en) * 2018-03-13 2019-10-08 深圳市腾讯计算机系统有限公司 The temperature acquisition methods and device of entity word
CN110019840B (en) * 2018-07-20 2021-06-15 腾讯科技(深圳)有限公司 Method, device and server for updating entities in knowledge graph
CN110019840A (en) * 2018-07-20 2019-07-16 腾讯科技(深圳)有限公司 The method, apparatus and server that entity updates in a kind of knowledge mapping
CN109255037A (en) * 2018-08-31 2019-01-22 北京字节跳动网络技术有限公司 Method and apparatus for output information
CN109815320A (en) * 2018-12-26 2019-05-28 出门问问信息科技有限公司 Answer generation method, device, equipment and the storage medium of question answering system
CN109753557A (en) * 2018-12-26 2019-05-14 出门问问信息科技有限公司 Answer output method, device, equipment and the storage medium of question answering system
CN109800288A (en) * 2019-01-22 2019-05-24 杭州师范大学 A kind of the scientific research analysis of central issue and prediction technique of knowledge based map
CN109902161A (en) * 2019-02-01 2019-06-18 出门问问信息科技有限公司 Answer processing method, device, equipment and the storage medium of question answering system
CN109902161B (en) * 2019-02-01 2023-10-20 出门问问创新科技有限公司 Answer processing method, device, equipment and storage medium of question-answering system
CN110222156A (en) * 2019-06-14 2019-09-10 北京百度网讯科技有限公司 It was found that the method and apparatus of entity, electronic equipment, computer-readable medium
CN110674313A (en) * 2019-09-20 2020-01-10 四川长虹电器股份有限公司 Method for dynamically updating knowledge graph based on user log
WO2021164618A1 (en) * 2020-02-17 2021-08-26 京东方科技集团股份有限公司 Knowledge graph-based question answering method and apparatus, computer device, and medium
CN111309888A (en) * 2020-02-25 2020-06-19 百度在线网络技术(北京)有限公司 Man-machine conversation method, device, electronic equipment and storage medium
CN111309888B (en) * 2020-02-25 2023-10-24 百度在线网络技术(北京)有限公司 Man-machine conversation method and device, electronic equipment and storage medium
CN112650856A (en) * 2020-12-28 2021-04-13 上海卓繁信息技术股份有限公司 Consultation method and device for providing study direction in academic field and electronic equipment
WO2022204845A1 (en) * 2021-03-29 2022-10-06 深圳市欢太科技有限公司 Method and apparatus for generating entity popularity, and storage medium and electronic device

Also Published As

Publication number Publication date
CN106844603B (en) 2021-05-11


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240527

Address after: Room 122, First Floor, No. 2429 Xingang East Road, Haizhu District, Guangzhou City, Guangdong Province, 510000 (for office only)

Patentee after: Zhujian Intelligent Technology (Guangzhou) Co.,Ltd.

Country or region after: China

Address before: 200233 room 2075, 2 / F, building 1, 146 Fute East 1st Road, Pudong New Area Free Trade Zone, Shanghai

Patentee before: ZHUJIAN INTELLIGENT TECHNOLOGY (SHANGHAI) Co.,Ltd.

Country or region before: China

TR01 Transfer of patent right

Effective date of registration: 20240820

Address after: Room A228, 1st Floor, Building 3, No. 18 Keyuan Road, Economic Development Zone, Daxing District, Beijing 102600

Patentee after: Zhuzhi Technology (Beijing) Co.,Ltd.

Country or region after: China

Address before: Room 122, First Floor, No. 2429 Xingang East Road, Haizhu District, Guangzhou City, Guangdong Province, 510000 (for office only)

Patentee before: Zhujian Intelligent Technology (Guangzhou) Co.,Ltd.

Country or region before: China