CN116186197A - Topic recommendation method, device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN116186197A
Authority
CN
China
Prior art keywords
comment
topic
recall
posted
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111435283.9A
Other languages
Chinese (zh)
Inventor
陈小帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202111435283.9A priority Critical patent/CN116186197A/en
Publication of CN116186197A publication Critical patent/CN116186197A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/169 Annotation, e.g. comment data or footnotes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a topic recommendation method, a topic recommendation device, an electronic device and a storage medium, relates to the technical field of computers, and can be applied to various scenarios such as cloud technology and artificial intelligence. The method comprises the following steps: obtaining a comment to be posted; screening out, from the recall topics, at least one first candidate topic that meets the association degree screening condition, based on the target association degree between the comment to be posted and each recall topic; screening out, from the recall comments, at least one candidate comment that meets the similarity screening condition, based on the target similarity between the comment to be posted and each recall comment, and taking the topics associated with the at least one candidate comment as corresponding second candidate topics; and determining a recommended topic corresponding to the comment to be posted based on the obtained at least one first candidate topic and at least one second candidate topic. Because the recommended topic is determined using both the association degree between the comment to be posted and the recall topics and the similarity between the comment to be posted and the recall comments, accuracy can be improved.

Description

Topic recommendation method, device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a topic recommendation method, a device, an electronic device, and a storage medium.
Background
With the rapid development of internet information technology, more and more target objects browse media information on various multimedia application platforms and post comments on it.
At present, when a target object posts a comment on a multimedia application platform, the target object often selects a topic related to the comment to be posted from the recommended topics, and the selected topic is added to the comment and posted together with its content. A topic can increase the exposure of a comment and improve the efficiency with which target objects discuss comment content under the same topic, which shows how important the topic is to the comment to be posted.
However, topics are currently recommended based on which topics are popular at the moment, so the recommendation does not cover all topics on the multimedia application platform. As a result, the recommended topics may be only weakly associated with the comment to be posted, and the target object can only either pick one of the recommended topics to post with the comment or search again for a topic associated with the comment.
Therefore, heat-based topic recommendation has low accuracy, so the topics are poorly associated with the comments to be posted, which in turn affects the accuracy of subsequent big data summary analysis.
Disclosure of Invention
The application provides a topic recommendation method, a topic recommendation device, electronic equipment and a storage medium, which are used for improving the accuracy of topic recommendation.
In one aspect, an embodiment of the present application provides a topic recommendation method, including:
obtaining a comment to be posted;
screening out, from the recall topics, at least one first candidate topic that meets the association degree screening condition, based on the target association degree between the comment to be posted and each recall topic;
screening out, from the recall comments, at least one candidate comment that meets the similarity screening condition, based on the target similarity between the comment to be posted and each recall comment, and taking the topics associated with the at least one candidate comment as corresponding second candidate topics;
and determining a recommended topic corresponding to the comment to be posted based on the obtained at least one first candidate topic and at least one second candidate topic.
In one aspect, an embodiment of the present application provides a topic recommendation device, including:
an acquisition unit, configured to obtain a comment to be posted;
a first screening unit, configured to screen out, from the recall topics, at least one first candidate topic that meets the association degree screening condition, based on the target association degree between the comment to be posted and each recall topic;
a second screening unit, configured to screen out, from the recall comments, at least one candidate comment that meets the similarity screening condition, based on the target similarity between the comment to be posted and each recall comment, and to take the topics associated with the at least one candidate comment as corresponding second candidate topics;
a determining unit, configured to determine a recommended topic corresponding to the comment to be posted based on the obtained at least one first candidate topic and at least one second candidate topic.
In a possible embodiment, the first screening unit is further configured to perform at least one of the following steps before screening out, from the recall topics, at least one first candidate topic that meets the association degree screening condition based on the target association degree between the comment to be posted and each recall topic:
recalling each recall topic based on the comment entity tags of the comment to be posted;
recalling each recall topic based on the first comment semantic information of the comment to be posted.
In a possible embodiment, the first screening unit is specifically configured to:
determining the comment entity tags corresponding to the comment to be posted based on an entity recognition model, wherein the entity recognition model is obtained by training on comments annotated with comment entity tags;
determining at least one topic entity tag based on the comment entity tags corresponding to the comment to be posted, and determining at least one known topic corresponding to each topic entity tag based on the mapping relation between topic entity tags and topics;
recalling each recall topic from the at least one known topic based on the topic entity tags respectively matched by the at least one known topic.
In a possible embodiment, the first screening unit is specifically configured to:
performing the following operation for each of the at least one known topic: determining at least one target topic entity tag matched by the known topic, wherein a target topic entity tag is a topic entity tag whose content is consistent with a comment entity tag of the comment to be posted;
based on the at least one target topic entity tag respectively matched by the at least one known topic, recalling from the at least one known topic the known topics that meet the first recall condition, and taking the recalled known topics as recall topics.
In a possible embodiment, the first screening unit is specifically configured to:
determining a first number of matched target topic entity tags, and recalling the known topics for which the first number reaches a first threshold; or
determining the topic tag confidence corresponding to each of the at least one target topic entity tag, determining a topic screening value based on the topic tag confidences, and recalling the known topics whose topic screening value reaches a topic screening threshold.
In one possible embodiment, the topic tag confidence is determined based on a first number of occurrences, namely the number of times the target topic entity tag appears in all posted comments associated with the corresponding known topic, and a second number of occurrences, namely the number of times all comment entity tags appear in all posted comments associated with that known topic.
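The two recall conditions above can be illustrated with a minimal Python sketch. It rests on several assumptions that the application does not state: the confidence is taken as the ratio of the first number of occurrences to the second, the topic screening value is the sum of the confidences of the matched tags, and the names `tag_confidence`, `recall_topics`, `min_matches` and `min_score` are hypothetical.

```python
from collections import Counter


def tag_confidence(tag, topic_comment_tags):
    """Assumed confidence: occurrences of `tag` in the posted comments of a
    topic, divided by the occurrences of all entity tags in those comments."""
    counts = Counter(t for tags in topic_comment_tags for t in tags)
    total = sum(counts.values())
    return counts[tag] / total if total else 0.0


def recall_topics(comment_tags, topic_tags, topic_comment_tags,
                  min_matches=2, min_score=0.5):
    """Recall a known topic if enough of its tags match the comment's tags,
    or if the summed confidence of the matched tags is high enough."""
    recalled = []
    for topic, tags in topic_tags.items():
        matched = set(comment_tags) & set(tags)  # target topic entity tags
        score = sum(tag_confidence(t, topic_comment_tags.get(topic, []))
                    for t in matched)            # assumed screening value
        if len(matched) >= min_matches or score >= min_score:
            recalled.append(topic)
    return recalled
```

Here `topic_tags` maps each known topic to its topic entity tags, and `topic_comment_tags` maps each known topic to the entity tag lists of its associated posted comments; both thresholds are placeholders to be tuned.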
In a possible embodiment, the first screening unit is specifically configured to:
querying an approximate nearest neighbor search index based on the first comment semantic information of the comment to be posted, to obtain a corresponding known topic set;
recalling each recall topic from the known topic set based on the cosine similarity between the comment to be posted and each known topic in the known topic set.
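As a rough illustration of the semantic recall above, the following Python sketch ranks known topics by cosine similarity to the comment vector. It is a sketch under stated assumptions: a brute-force scan stands in for the approximate nearest neighbor index, the vectors are plain lists rather than model outputs, and `recall_by_semantics`, `top_k` and `min_cos` are hypothetical names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recall_by_semantics(comment_vec, topic_vecs, top_k=2, min_cos=0.5):
    """Return up to top_k (topic, similarity) pairs whose cosine similarity
    to the comment vector is at least min_cos, best match first."""
    # A real system would query an ANN index here instead of scanning all topics.
    scored = [(name, cosine(comment_vec, vec)) for name, vec in topic_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(name, sim) for name, sim in scored[:top_k] if sim >= min_cos]
```

The returned similarity could then serve as the recall association degree of each recall topic, although the application does not specify this.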
In a possible embodiment, the first screening unit is specifically configured to:
for each recall topic, the following operations are performed:
determining the semantic association degree between the comment to be posted and the recall topic based on the first comment semantic information of the comment to be posted and the topic semantic information of the recall topic;
determining the target association degree between the comment to be posted and the recall topic based on the semantic association degree and the recall association degree corresponding to the recall topic;
and screening out, from the recall topics, at least one first candidate topic that meets the association degree screening condition, based on the target association degree corresponding to each recall topic.
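The embodiment above does not say how the semantic association degree and the recall association degree are combined. The Python sketch below assumes a weighted sum and a fixed threshold as the association degree screening condition; `target_association`, `screen_first_candidates`, `weight` and `threshold` are hypothetical.

```python
def target_association(semantic, recall, weight=0.5):
    """Assumed combination: weighted sum of the two association degrees."""
    return weight * semantic + (1 - weight) * recall


def screen_first_candidates(recall_topics, threshold=0.6, weight=0.5):
    """recall_topics maps topic -> (semantic association, recall association).
    Returns the topics whose target association degree reaches the threshold."""
    scored = {topic: target_association(sem, rec, weight)
              for topic, (sem, rec) in recall_topics.items()}
    return {topic: value for topic, value in scored.items() if value >= threshold}
```

A top-N cut-off would be an equally plausible screening condition; the threshold form is chosen here only for brevity.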
In a possible embodiment, the second screening unit is further configured to perform at least one of the following steps before screening out at least one candidate comment meeting the similarity screening condition in each recall comment based on the target similarity between the comment to be issued and each recall comment:
recalling each recall comment based on the comment entity tags of the comment to be posted;
recalling each recall comment based on the first comment semantic information of the comment to be posted.
In a possible embodiment, the second screening unit is specifically configured to:
determining the comment entity tags corresponding to the comment to be posted based on an entity recognition model, wherein the entity recognition model is obtained by training on comments annotated with comment entity tags;
determining at least one posted comment corresponding to each comment entity tag based on the mapping relation between comment entity tags and posted comments;
recalling each recall comment from the at least one posted comment based on the comment entity tags respectively matched by the at least one posted comment.
In a possible embodiment, the second screening unit is specifically configured to:
performing the following operation for each of the at least one posted comment: determining at least one target comment entity tag matched by the posted comment, wherein a target comment entity tag is a comment entity tag that appears in both the comment to be posted and the posted comment;
based on the at least one target comment entity tag respectively matched by the at least one posted comment, recalling from the at least one posted comment the posted comments that meet the second recall condition, and taking the recalled posted comments as recall comments.
In a possible embodiment, the second screening unit is specifically configured to:
determining a second number of matched target comment entity tags, and recalling the posted comments for which the second number reaches a second threshold; or
determining a comment screening value based on the comment tag confidence of each matched target comment entity tag, and recalling the posted comments whose comment screening value reaches a comment screening threshold, wherein the comment tag confidences are determined based on the entity recognition model.
In a possible embodiment, the second screening unit is specifically configured to:
querying a depth similarity retrieval index based on the first comment semantic information of the comment to be posted, to obtain a corresponding posted comment set;
recalling each recall comment from the posted comment set based on the cosine similarity between the comment to be posted and each posted comment in the posted comment set.
In a possible embodiment, the second screening unit is specifically configured to:
for each recall comment, the following operations are performed:
determining the semantic similarity between the comment to be posted and the recall comment based on the first comment semantic information of the comment to be posted and the second comment semantic information of the recall comment;
determining the target similarity between the comment to be posted and the recall comment based on the semantic similarity and the recall similarity corresponding to the recall comment;
and screening out, from the recall comments, at least one candidate comment that meets the similarity screening condition, based on the target similarity corresponding to each recall comment.
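Once candidate comments are screened out, the topics associated with them become the second candidate topics. A minimal Python sketch of that step follows; it assumes a fixed threshold as the similarity screening condition and, as a further assumption, keeps for each topic the highest target similarity among its candidate comments. The name `second_candidate_topics` is hypothetical.

```python
def second_candidate_topics(recall_comments, comment_topics, threshold=0.7):
    """recall_comments maps comment id -> target similarity;
    comment_topics maps comment id -> topics associated with that comment.
    Returns topic -> best target similarity among its candidate comments."""
    best = {}
    for comment_id, similarity in recall_comments.items():
        if similarity < threshold:  # fails the similarity screening condition
            continue
        for topic in comment_topics.get(comment_id, []):
            best[topic] = max(best.get(topic, 0.0), similarity)
    return best
```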
In a possible embodiment, the determining unit is specifically configured to:
determining at least one co-occurrence candidate topic based on the at least one first candidate topic and the at least one second candidate topic, wherein the co-occurrence candidate topic is: candidate topics present in both the at least one first candidate topic and the at least one second candidate topic;
performing the following operation for each of the at least one co-occurrence candidate topic: determining a target screening value based on the target association degree, the target similarity and the heat corresponding to the co-occurrence candidate topic;
based on the target screening values, screening out at least one target topic from the at least one co-occurrence candidate topic, and taking the screened-out at least one target topic as the recommended topic corresponding to the comment to be posted.
In one possible embodiment, the heat is determined by the number of posted comments associated with the co-occurrence candidate topic and the number of exposures of those posted comments.
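The fusion step can be sketched in Python as follows. The application names the three inputs to the target screening value but not the formula, so this sketch assumes a weighted sum with the heat normalized to the hottest co-occurrence candidate, and a logarithmic heat built from the two counts; `heat`, `recommend`, `weights` and `top_n` are hypothetical.

```python
import math


def heat(num_associated_comments, num_exposures):
    """Assumed heat: log-damped sum of association count and exposure count."""
    return math.log1p(num_associated_comments) + math.log1p(num_exposures)


def recommend(first, second, heat_by_topic, top_n=3, weights=(0.4, 0.4, 0.2)):
    """first maps topic -> target association degree, second maps topic ->
    target similarity. Only topics present in both are scored and ranked."""
    co_occurring = set(first) & set(second)
    w_assoc, w_sim, w_heat = weights
    max_heat = max((heat_by_topic.get(t, 0.0) for t in co_occurring),
                   default=0.0) or 1.0  # avoid division by zero
    scored = {t: w_assoc * first[t] + w_sim * second[t]
                 + w_heat * heat_by_topic.get(t, 0.0) / max_heat
              for t in co_occurring}
    return sorted(scored, key=scored.get, reverse=True)[:top_n]
```

Keeping the heat weight small relative to the other two reflects the stated aim of not letting popular topics crowd out long-tail ones, but the specific weights are placeholders.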
In one aspect, an embodiment of the present application provides an electronic device, including: a memory and a processor, wherein the memory is used for storing a computer program; and the processor is used for executing the computer program to realize the steps of the topic recommendation method provided by the embodiment of the application.
In one aspect, embodiments of the present application provide a computer readable storage medium storing a computer program, which when executed by a processor, implements the steps of the topic recommendation method provided by the embodiments of the present application.
In one aspect, embodiments of the present application provide a computer program product comprising a computer program stored in a computer readable storage medium; when the processor of the electronic device reads the computer program from the computer-readable storage medium, the processor executes the computer program, so that the electronic device executes the steps of the topic recommendation method provided by the embodiment of the application.
Due to the adoption of the technical scheme, the embodiment of the application has at least the following technical effects:
In the scheme of the embodiment of the application, when a topic needs to be recommended for a comment to be posted, a set of recall topics and a set of recall comments are first coarsely recalled based on the comment to be posted. Then, based on the target association degree between the comment to be posted and each recall topic, at least one first candidate topic strongly associated with the comment to be posted is screened out from the recall topics; and based on the target similarity between the comment to be posted and each recall comment, at least one candidate comment strongly similar to the comment to be posted is screened out from the recall comments, and the topics respectively associated with the at least one candidate comment are taken as second candidate topics, so that topics strongly associated with the comment to be posted are found accurately. Finally, the first candidate topics and the second candidate topics are fused, and the recommended topic corresponding to the comment to be posted is determined.
In the scheme, the degree of association between the comment to be issued and the recall topic is adopted to directly realize topic recommendation, and the degree of similarity between the comment to be issued and the recall comment is adopted to indirectly realize topic recommendation, so that the degree of association between the recommended topic and the comment to be issued is improved, the condition that a target object searches again is avoided, the difficulty of selecting the topic to which the comment to be issued belongs by the target object is reduced, and the topic selection efficiency during comment issuing is improved; and when the recommended topics are determined, all topics are covered, so that the topic recommendation accuracy is improved, and the accuracy of subsequent big data summarization analysis is further improved.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of an application scenario of topic recommendation provided in an embodiment of the present application;
Fig. 2 is a flowchart of a topic recommendation method provided in an embodiment of the present application;
Fig. 3 is a schematic diagram of an editing interface for a comment to be posted provided in an embodiment of the present application;
Fig. 4 is a schematic diagram of a recommended topic display provided in an embodiment of the present application;
Fig. 5 is a schematic diagram of another recommended topic display provided in an embodiment of the present application;
Fig. 6 is a flowchart of a method for recalling recall topics provided in an embodiment of the present application;
Fig. 7 is a schematic diagram of obtaining comment entity tags based on an entity recognition model provided in an embodiment of the present application;
Fig. 8 is a flowchart of another method for recalling recall topics provided in an embodiment of the present application;
Fig. 9 is a schematic diagram of obtaining first comment semantic information provided in an embodiment of the present application;
Fig. 10 is a schematic diagram of determining the semantic association degree between a comment to be posted and a recall topic provided in an embodiment of the present application;
Fig. 11 is a flowchart of a method for recalling recall comments provided in an embodiment of the present application;
Fig. 12 is a schematic diagram of determining the semantic similarity between a comment to be posted and a recall comment provided in an embodiment of the present application;
Fig. 13 is a flowchart of another method for recalling recall comments provided in an embodiment of the present application;
Fig. 14 is a flowchart of a specific implementation of topic recommendation provided in an embodiment of the present application;
Fig. 15 is a block diagram of a topic recommendation device provided in an embodiment of the present application;
Fig. 16 is a block diagram of an electronic device provided in an embodiment of the present application;
Fig. 17 is a block diagram of another electronic device provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantageous effects of the present application more clear, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments, but not all embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
In order to facilitate a better understanding of the technical solutions of the present application, the following description will describe some of the concepts related to the present application.
Comment: a subjective or objective statement of one's impression of something, made to express one's own opinions, ideas and feelings more quickly. Comments are used in multimedia software: for example, in a video player to comment on a certain drama, in a reading application to comment on a certain article, or in a music player to comment on a song. Comments can also be used in shopping software to comment on purchased items, and on any platform that supports posting comments, including platforms that do not contain information such as video or reading material.
In the embodiment of the application, a comment can be a post published about a certain drama in the discussion community of a video player, and other users can go on to post their own views on that post, which improves interaction efficiency.
Topic: the subject of a conversation or the subject matter of a discussion. A topic is the center of a conversation but is not limited to it; it can also be a summary of the various events that people pay attention to in daily life.
Entity tag: tags are used to mark the target and classification of a product or piece of content, and are a tool that makes lookup and location easier for oneself and for others. In the embodiment of the application, an entity tag is a strongly relevant keyword, and entity tags mainly comprise comment entity tags and topic entity tags; that is, a comment entity tag is a keyword in a comment, and a topic entity tag is a keyword in a topic. For example, in the discussion community of a video player, a comment includes a drama name, a person name and the like, and the drama name and the person name are taken as comment entity tags; the same applies to topic entity tags.
Semantic information: semantic information is one form in which information is expressed, and refers to information that has a certain meaning and can eliminate uncertainty about things. In the embodiment of the application, semantic information refers to a semantic feature vector obtained through a pre-trained language model; the semantic feature vector is also called a deep representation or an encoded representation. Semantic information mainly comprises comment semantic information and topic semantic information; for example, a comment is input into the pre-trained language model to obtain its comment semantic information, and topic semantic information is obtained in the same way.
Pre-training language model: a model that gives a computer the ability to reason and make decisions using knowledge and common sense. Through self-supervised learning, a task-independent pre-trained language model is obtained from large-scale training data; it learns a context-dependent representation of each word of the input text, that is, the semantic representation of a word in a specific context, and implicitly learns general syntactic and semantic knowledge. The training data may be text. Pre-trained language models have been extended from single-language tasks to multi-language and multi-modal tasks. Pre-trained language models include, but are not limited to, the BERT model, the ALBERT model, and the XLNet model.
Entity recognition model: a deep learning network model obtained by training on a large amount of training data annotated with entity tags. The training data may be comments annotated with entity tags, topics annotated with entity tags, or a combination of the two.
Approximate nearest neighbor search (Approximate Nearest Neighbors, ANN): a method that exploits the clustered distribution that data tends to form as the data volume grows. The data in the database is classified or encoded by a clustering analysis method; for target data, the data category to which it belongs is predicted from its data features, and part or all of that category is returned as the retrieval result. It is mainly applied in document retrieval systems, where approximate nearest neighbor retrieval is used to search for similar document information.
Depth similarity retrieval: the data most similar to the target data is searched from the database according to the similarity of the data.
Inverted index: an index in which records are looked up according to the value of an attribute. Each entry in such an index table includes an attribute value and the records having that attribute value. Since the attribute value is not determined by the record, but rather the record is located by the attribute value, it is called an inverted index. For example, in the embodiment of the present application, searching for the corresponding known topics based on a topic entity tag and searching for the corresponding posted comments based on a comment entity tag are both retrievals based on an inverted index.
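A minimal Python sketch of such an inverted index is given below, mapping each entity tag to the items (known topics or posted comments) that carry it. The function names `build_inverted_index` and `lookup` and the in-memory dictionary layout are assumptions for illustration; the application does not describe how the index is stored.

```python
from collections import defaultdict


def build_inverted_index(item_tags):
    """item_tags maps item -> list of entity tags.
    Returns a mapping of tag -> set of items carrying that tag."""
    index = defaultdict(set)
    for item, tags in item_tags.items():
        for tag in tags:
            index[tag].add(item)
    return index


def lookup(index, tags):
    """Return every item that carries at least one of the given tags."""
    found = set()
    for tag in tags:
        found |= index.get(tag, set())
    return found
```

For example, looking up the comment entity tags of a comment to be posted in an index built over known topics would yield the known topics from which recall topics are then selected.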
The word "exemplary" is used hereinafter to mean "serving as an example, embodiment, or illustration". Any embodiment described as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The terms "first," "second," and the like herein are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more such features, and in the description of embodiments of the present application, unless otherwise indicated, "a plurality" means two or more.
The following briefly describes the design concept of the embodiment of the present application:
The embodiment of the application relates to the scenario of posting comments on a multimedia application platform. Adding a topic to a comment when posting it increases the discussion and exposure of the comment; therefore, the embodiment of the application specifically concerns how to recommend topics during comment posting.
In the related art, when a comment is posted, a topic related to the comment to be posted is usually selected from the recommended topics, added to the comment, and posted together with the comment content.
The recommended topics are recommended by heat; that is, the currently popular topics are recommended to the target object. Such recommendation does not cover all topics on the multimedia application platform and is not specifically optimized for long-tail topics and newly popular topics. The recommended topics may therefore be only weakly associated with the comment to be posted, so that the target object can only pick one of the recommended topics to post with the comment, or must actively search for a topic related to the comment.
Therefore, the accuracy of topic recommendation based on heat is low, so that the relevance of topics and comments to be issued is poor, the comment issuing efficiency is affected, and the accuracy of subsequent big data summarization analysis is further affected.
In view of this, the embodiments of the present application provide a topic recommendation method, apparatus, electronic device, and storage medium, so as to improve the accuracy of topic recommendation and the association between recommended topics and the comment to be posted, thereby improving comment posting efficiency and ensuring the accuracy of subsequent big data summary analysis.
In the embodiments of the present application, first, based on the obtained comment to be posted, associated recall topics are coarsely recalled from a topic library and similar recall comments are coarsely recalled from a comment library. Then, the recall topics are screened again based on the target association degree between the comment to be posted and each recall topic, yielding at least one first candidate topic that meets the association degree screening condition; at the same time, the recall comments are screened again based on the target similarity between the comment to be posted and each recall comment, yielding at least one candidate comment that meets the similarity screening condition, and the topics associated with the at least one candidate comment are taken as at least one second candidate topic. In this way, topics strongly associated with the comment to be posted are found accurately. Finally, the at least one first candidate topic and the at least one second candidate topic are fused to determine the recommended topics corresponding to the comment to be posted.
In the embodiment of the application, topic recommendation is performed not based on heat, but based on the association degree between the comment to be issued and the topics and the similarity determination between the comment to be issued, so that the association degree between the comment to be issued and the recommended topics is improved, the condition that a target object searches again is avoided, the difficulty of selecting the topic to which the comment to be issued belongs by the target object is reduced, and the topic selection efficiency during comment issuing, namely the interaction efficiency of comment issuing is improved; in the process of determining the recommended topics, all topics are covered, the long-tail topics and the new hot topics are recommended more accurately, the accuracy of topic recommendation is improved, and the accuracy of subsequent big data summarization analysis is further improved.
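The two-stage flow described above (coarse recall, fine screening on both branches, then fusion) can be illustrated with a minimal sketch. It is not the claimed implementation: the function names, the word-overlap score standing in for the learned association and similarity models, the thresholds, and the fusion rule (an order-preserving union) are all assumptions made for illustration.

```python
from typing import Callable, Iterable


def jaccard(a: str, b: str) -> float:
    """Toy stand-in score (assumption): word overlap between two texts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def recommend_topics(
    comment: str,
    recall_topics: Iterable[str],
    recall_comments: Iterable[tuple[str, list[str]]],
    association: Callable[[str, str], float] = jaccard,
    similarity: Callable[[str, str], float] = jaccard,
    association_threshold: float = 0.5,
    similarity_threshold: float = 0.5,
) -> list[str]:
    # Branch 1: keep recalled topics whose association with the comment
    # meets the screening condition (first candidate topics).
    first = [t for t in recall_topics
             if association(comment, t) >= association_threshold]

    # Branch 2: keep recalled comments similar enough to the comment and
    # collect the topics attached to them (second candidate topics).
    second: list[str] = []
    for text, topics in recall_comments:
        if similarity(comment, text) >= similarity_threshold:
            second.extend(topics)

    # Fusion (assumption): order-preserving union of both candidate lists.
    return list(dict.fromkeys(first + second))
```

In practice the two scoring callables would be replaced by the association degree and similarity models described later in this application; the sketch only fixes the order of the steps.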
In the embodiments of the present application, determining the target association degree, the target similarity, and the like involves artificial intelligence (Artificial Intelligence, AI) and machine learning technology, and is designed based on speech technology, natural language processing technology, and machine learning (Machine Learning, ML) in artificial intelligence.
Artificial intelligence is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence.
Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision. Artificial intelligence techniques mainly include computer vision techniques, natural language processing techniques, machine learning/deep learning, and other major directions. With research and progress of artificial intelligence technology, artificial intelligence is developed in various fields such as common smart home, intelligent customer service, virtual assistant, smart speaker, smart marketing, unmanned, automatic driving, robot, smart medical, etc., and it is believed that with the development of technology, artificial intelligence will be applied in more fields and with increasingly important value.
Machine learning is a multi-domain interdisciplinary, involving multiple disciplines such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and the like. It is specially studied how a computer simulates or implements learning behavior of a human to acquire new knowledge or skills, and reorganizes existing knowledge structures to continuously improve own performance. Compared with the data mining, which finds the mutual characteristics among big data, the machine learning is more focused on the design of an algorithm, so that a computer can automatically learn the rules from the data and predict unknown data by utilizing the rules.
Machine learning is the core of artificial intelligence and the fundamental way to give computers intelligence; it is applied throughout the various fields of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, and inductive learning. Reinforcement learning (Reinforcement Learning, RL), also known as re-excitation learning or evaluation learning, is one of the paradigms and methodologies of machine learning; it describes and solves the problem of an agent learning a strategy to maximize return or achieve a specific goal while interacting with an environment.
After the design concept of the embodiment of the present application is introduced, some simple descriptions are made below for application scenarios applicable to the technical solution of the embodiment of the present application, and it should be noted that the application scenarios described below are only used to illustrate the embodiment of the present application and are not limiting. In the specific implementation process, the technical scheme provided by the embodiment of the application can be flexibly applied according to actual needs.
Referring to fig. 1, fig. 1 is a schematic view of an application scenario in an embodiment of the present application. The application scenario includes a plurality of terminal devices 110 and a server 120, where the terminal devices 110 and the server 120 may communicate through a communication network.
In an alternative embodiment, the communication network may be a wired network or a wireless network. Accordingly, the terminal device 110 and the server 120 may be directly or indirectly connected through wired or wireless communication. For example, the terminal device 110 may be indirectly connected to the server 120 through a wireless access point, or the terminal device 110 may be directly connected to the server 120 through the internet, which is not limited herein.
In the embodiment of the present application, the terminal device 110 includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a desktop computer, an electronic book reader, an intelligent voice interaction device, an intelligent home appliance, a vehicle-mounted terminal, and the like; the terminal device can be provided with a client related to topic recommendation, wherein the client can be software (such as a browser, video software and the like) or a webpage, an applet and the like;
the server 120 is a background server corresponding to the software, web page, applet, etc., or a server dedicated to topic recommendation, which is not particularly limited in this application. The server 120 may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (Content Delivery Network, CDN), big data, and artificial intelligence platforms.
It should be noted that, the topic recommendation method in the embodiment of the present application may be performed by an electronic device, which may be the server 120 or the terminal device 110, that is, the method may be performed by the server 120 or the terminal device 110 separately, or may be performed by both the server 120 and the terminal device 110 together.
When the terminal device 110 performs alone, for example, the terminal device 110 may acquire the comment to be posted input by the user, and then, based on the acquired comment to be posted, recall a part of the recalled topics and a part of the recalled comments, and perform subsequent processing based on the target association degree between the comment to be posted and the recalled topics and the target similarity between the comment to be posted and the recalled comments.
When the server 120 performs alone, for example, the terminal device 110 may obtain the comment to be issued input by the user, and then send the comment to be issued to the server 120, where the server 120 performs subsequent processing based on the obtained comment to be issued, a part of the recalled topics and a part of the recalled comments, and based on the target association degree between the comment to be issued and the recalled topics, and the target similarity degree between the comment to be issued and the recalled comments.
When the server 120 and the terminal device 110 perform the method together, for example, the terminal device 110 may acquire the comment to be posted input by the user and send it to the server 120; the server 120 recalls a part of the recall topics and a part of the recall comments based on the acquired comment to be posted and returns them to the terminal device 110; and the terminal device 110 performs subsequent processing based on the target association degree between the comment to be posted and the recall topics and the target similarity between the comment to be posted and the recall comments.
In the following, execution by the server alone is mainly used as an example; the present application is not limited thereto.
In a specific implementation, a user may input a comment to be posted in the terminal device 110, the terminal device 110 sends the comment to be posted to the server 120, and the server 120 may determine the recommended topics associated with the comment to be posted by adopting the topic recommendation method in the embodiments of the present application.
It should be noted that fig. 1 is for illustration only; in practice, the number of terminal devices 110 and servers 120 is not limited, and is not specifically limited in the embodiments of the present application.
In the embodiment, when the number of the servers 120 is plural, the plural servers 120 may be configured as a blockchain, and the servers 120 are nodes on the blockchain; the topic recommendation method disclosed by the embodiment of the application, wherein the related topic data can be stored on a blockchain, for example, the topic data comprises each topic in a topic library, topic entity labels of each topic, topic semantic information and the like.
The topic recommendation method provided by the exemplary embodiments of the present application will be described below with reference to the accompanying drawings in conjunction with the application scenario described above, and it should be noted that the application scenario described above is only shown for the convenience of understanding the spirit and principles of the present application, and the embodiments of the present application are not limited in any way in this respect. Moreover, the embodiment of the application can be applied to various scenes, including not only topic recommendation scenes, but also various scenes such as cloud technology, artificial intelligence, intelligent traffic, auxiliary driving and the like.
Referring to fig. 2, fig. 2 is a flowchart of a method for topic recommendation in an embodiment of the present application, where a server is taken as an execution body to describe the method, and a specific implementation flow of the method is as follows:
Step S200, obtaining comments to be distributed.
The server receives comments to be distributed, which are sent by the terminal equipment.
In a possible embodiment, the terminal device receives the comment to be issued input by the user in the client, and sends the comment to be issued input by the user to the server after responding to the topic adding instruction triggered by the user.
For example, when a user needs to post a comment in a discussion community of a video player on a terminal device, the user triggers a comment posting instruction in the discussion community, and the terminal device jumps to an editing interface for the comment to be posted. Referring to fig. 3, fig. 3 exemplarily provides a schematic diagram of the editing interface for the comment to be posted in an embodiment of the present application.
As shown in fig. 3, the comment editing interface to be published includes, but is not limited to: comment writing area, picture adding instruction and topic adding instruction.
At this time, the user can input a comment to be issued in the comment writing area; when determining that the pictures need to be added, triggering a picture adding instruction, and enabling the terminal equipment to jump to a picture library in response to the picture adding instruction triggered by the user so as to enable the user to select the pictures needing to be added in the picture library; when determining that topics need to be added, triggering a topic adding instruction, and sending a to-be-issued comment input by a user in a comment writing area to a server by the terminal equipment in response to the topic adding instruction triggered by the user, so that the server determines recommended topics associated with the to-be-issued comments based on the to-be-issued comment, and returns the determined recommended topics to the terminal equipment, so that the terminal equipment displays the recommended topics, the user can conveniently select the recommended topics, the difficulty and cost of selecting the topics to which the to-be-issued comments belong by the user are reduced, and the comment issuing efficiency is improved.
Referring to fig. 4, fig. 4 exemplarily provides a schematic diagram for displaying recommended topics in an embodiment of the present application; or, referring to fig. 5, fig. 5 exemplarily provides another schematic diagram for displaying recommended topics in the embodiment of the present application.
Step S201, screening at least one first candidate topic which meets the relevance screening condition from all recall topics based on target relevance between comments to be distributed and all recall topics.
In order to reduce the workload, in the embodiments of the present application, the association between a topic and the comment to be posted is not determined one by one for every topic in the topic library; instead, the recall topics associated with the comment to be posted are first coarsely recalled from the topic library. The recall topics are then finely ranked, and at least one first candidate topic that meets the association degree screening condition is screened out of the recall topics.
Specifically, each recall topic is recalled in the topic library mainly based on at least one of comment entity tags of comments to be distributed and first comment semantic information of the comments to be distributed.
The recall topics based on the comment entity tags and the recall topics based on the first comment semantic information are described in detail.
Mode one: based on the comment entity tags of the comments to be sent, each recall topic is recalled.
Referring to fig. 6, fig. 6 is a flowchart for exemplarily providing a method for recalling various recalled topics in an embodiment of the present application, including the following steps:
Step S600, obtaining each comment entity tag corresponding to the comment to be posted.
In one possible embodiment, the comments to be issued are input into a pre-trained entity recognition model, and each comment entity label corresponding to the comments to be issued is obtained.
Referring to fig. 7, fig. 7 is an exemplary schematic diagram for obtaining comment entity tags based on an entity identification model in an embodiment of the present application. Inputting a starting mark (CLS) and comments to be issued into the entity identification model to obtain each comment entity label corresponding to the comments to be issued.
First, the comment to be posted is segmented into elements, each element containing at least one word of the comment to be posted. Then, the start flag and the elements are input into a pre-trained language model, such as a BERT model, to obtain the context information (Context), the average pooling information (Avg Pool), and the start flag information for the elements of the comment to be posted. Finally, each comment entity tag corresponding to the comment to be posted, and the comment tag confidence corresponding to each comment entity tag, are determined based on the context information, the average pooling information, the start flag information, and the width information (Width); the width encoding makes objects whose vectors are close to each other have similar meanings.
The entity recognition model is obtained by training data marked with entity labels, and the training data can be at least one of comments and topics.
Step S601, determining at least one topic entity tag based on each comment entity tag corresponding to the comment to be issued.
In one possible embodiment, the comment entity tag and the topic entity tag adopt the same tag system, namely the comment entity tag is consistent with the content of the topic entity tag; therefore, the obtained comment entity tag is directly used as a topic entity tag.
In another possible embodiment, after inputting the comment to be issued into the entity identification model, the confidence level of the comment label corresponding to each comment entity label, that is, the existence probability of the comment entity label, is obtained in addition to each comment entity label; then, screening comment entity tags based on the comment tag confidence, and taking the screened comment entity tags as topic entity tags;
generally, comment entity tags whose comment tag confidence reaches the corresponding confidence threshold are selected; or the comment entity tags are sorted in descending order of comment tag confidence, and at least one top-ranked comment entity tag is selected.
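The two screening options just described (a confidence threshold, or keeping the top-ranked tags) can be sketched as follows. The function names and the example confidence values are hypothetical and serve only to illustrate the selection logic.

```python
def screen_by_threshold(tag_conf: dict[str, float], threshold: float) -> list[str]:
    """Keep comment entity tags whose confidence reaches the threshold."""
    return [tag for tag, conf in tag_conf.items() if conf >= threshold]


def screen_top_k(tag_conf: dict[str, float], k: int) -> list[str]:
    """Sort tags by confidence in descending order and keep the first k."""
    ranked = sorted(tag_conf.items(), key=lambda item: item[1], reverse=True)
    return [tag for tag, _ in ranked[:k]]


# Hypothetical output of the entity recognition model for one comment.
example = {"A-certain": 0.9, "B-certain": 0.7, "H-bead": 0.2}
topic_entity_tags = screen_by_threshold(example, threshold=0.5)
```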
Step S602, determining at least one known topic corresponding to each of the at least one topic entity tag based on the mapping relationship between the topic entity tag and the topic.
Determining the known topics corresponding to a topic entity tag based on the mapping relationship between topic entity tags and topics amounts to using the topic entity tag as an index to look up the topics that match it; therefore, the mapping relationship between topic entity tags and topics needs to be constructed in advance.
Specifically, each known topic is respectively input into the pre-trained entity recognition model to obtain at least one corresponding topic entity tag; then, classifying the known topics based on topic entity labels corresponding to the known topics, namely dividing the known topics corresponding to the same topic entity label into the same group to form a mapping from topic entity labels to topics.
In the following, a table format is used to exemplarily describe that a known topic corresponds to at least one topic entity tag, and that a topic entity tag corresponds to at least one known topic.
Table 1 shows examples of the at least one topic entity tag corresponding to each known topic.
TABLE 1
Known topic | Topic entity tags
#A-certain B-certain xiu en ai# | "A-certain", "B-certain"
#One person gives a sentence of the best A-certain# | "A-certain"
#A-certain B-certain two-way love# | "A-certain", "B-certain"
#TV drama H-bead lady# | "H-bead"
#Update a personalized signature for the H-bead character# | "H-bead"
Table 2 shows examples of the at least one known topic corresponding to each topic entity tag.
TABLE 2
[Table provided as an image in the original publication: Figure BDA0003381566310000181]
Therefore, based on the determined mapping relation between the topic entity labels and the topics, each topic corresponding to each topic entity label can be accurately obtained.
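The mapping from topic entity tags to topics is, in effect, an inverted index. A minimal sketch of building it is given below; the function name is hypothetical, the topic strings are illustrative stand-ins for the known topics, and the per-topic tag lists are assumed to have been produced beforehand by the entity recognition model.

```python
from collections import defaultdict


def build_tag_to_topics(topic_tags: dict[str, list[str]]) -> dict[str, list[str]]:
    """Group known topics under each topic entity tag they carry."""
    index: dict[str, list[str]] = defaultdict(list)
    for topic, tags in topic_tags.items():
        for tag in tags:
            index[tag].append(topic)
    return dict(index)


# Illustrative known topics and their topic entity tags (assumed values).
known_topics = {
    "#A-certain B-certain xiu en ai#": ["A-certain", "B-certain"],
    "#A-certain B-certain two-way love#": ["A-certain", "B-certain"],
    "#TV drama H-bead lady#": ["H-bead"],
}
tag_index = build_tag_to_topics(known_topics)
```

Looking up `tag_index["A-certain"]` then returns every known topic carrying that tag, which corresponds to Step S602.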
Step S603, recalling each recall topic from the at least one known topic based on the topic entity tags matched by each of the at least one known topic.
Since there are multiple topic entity tags and each topic entity tag corresponds to at least one known topic, a large number of known topics are obtained from the topic library. To ensure the accuracy of the recall topics, these known topics are screened again to obtain the recall topics.
Moreover, each known topic corresponds to at least one topic entity tag, and among these, one or several topic entity tags may be consistent in content with the comment entity tags of the comment to be posted. The more of a known topic's topic entity tags are consistent with the comment entity tags of the comment to be posted, the stronger the association between the known topic and the comment to be posted is likely to be;
Therefore, in the embodiments of the present application, known topics that meet a first recall condition are recalled from the at least one known topic based on the at least one target topic entity tag matched by each known topic, and the recalled known topics are taken as recall topics, where a target topic entity tag is a topic entity tag whose content is consistent with a comment entity tag of the comment to be posted.
In one possible implementation, when recalling the recall topics, a first number of matched target topic entity tags is determined for each of the at least one known topic, known topics whose first number reaches a first threshold are recalled, and the recalled known topics are taken as recall topics.
In another possible implementation, when recalling the recall topics, for each of the at least one known topic: first, the at least one matched target topic entity tag and the topic tag confidence corresponding to each target topic entity tag are determined; then, a topic screening value is determined based on the topic tag confidences; finally, known topics whose topic screening value reaches a topic screening threshold are recalled, and the recalled known topics are taken as recall topics;
The topic tag confidence of a target topic entity tag is determined based on a first occurrence count, namely the number of times the target topic entity tag occurs in all posted comments associated with the corresponding known topic, and a second occurrence count, namely the total number of occurrences of all comment entity tags in all posted comments associated with the known topic.
Next, taking a known topic Ti as an example, recalling the known topics that meet the first recall condition from the at least one known topic is described.
First, recalling known topics whose first number reaches the first threshold is described:
assume that the comment entity tags corresponding to the input comment to be posted include: "A-certain", "B-certain", and "H-bead";
for example, the known topic Ti is #A-certain B-certain xiu en ai#, and the topic entity tags corresponding to it are "A-certain" and "B-certain", both of which are target topic entity tags, so the first number is 2; if the first threshold is set to 2, the first number reaches the first threshold and the known topic is recalled;
for another example, the known topic Ti is #One person gives a sentence of the best A-certain#, and the topic entity tag corresponding to it is "A-certain", so the first number is 1; if the first threshold is set to 2, the first number does not reach the first threshold and the known topic is discarded, that is, not recalled.
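The count-based recall in the two examples above can be reproduced with a short sketch. The function name is hypothetical and the topic strings are illustrative; the only logic shown is counting, for each known topic, how many of its topic entity tags also occur among the comment entity tags, and comparing that first number with the first threshold.

```python
def recall_by_match_count(
    comment_tags: list[str],
    topic_tags: dict[str, list[str]],
    first_threshold: int,
) -> list[str]:
    """Recall known topics whose number of target topic entity tags
    (tags shared with the comment) reaches the first threshold."""
    wanted = set(comment_tags)
    recalled = []
    for topic, tags in topic_tags.items():
        first_number = len(wanted & set(tags))
        if first_number >= first_threshold:
            recalled.append(topic)
    return recalled


comment_tags = ["A-certain", "B-certain", "H-bead"]
known_topics = {
    "#A-certain B-certain xiu en ai#": ["A-certain", "B-certain"],
    "#One person gives a sentence of the best A-certain#": ["A-certain"],
}
recalled = recall_by_match_count(comment_tags, known_topics, first_threshold=2)
```

With a first threshold of 2, only the first topic is recalled, matching the worked example.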
Next, description is given of known topics for which the recall topic screening value reaches the topic screening threshold:
assume that: the input comment entity label corresponding to the comment to be issued comprises the following components: "A-certain", "B-certain", "H-bead";
for example, the topic entity tags corresponding to the known topic Ti are "A-certain" and "B-certain", both of which are consistent in content with comment entity tags of the comment to be posted and are therefore target topic entity tags;
if the known topic has 100 associated posted comments and each of them contains "A-certain" and "B-certain", the first occurrence counts of "A-certain" and "B-certain" in the posted comments associated with the known topic are both 100; if 50 of the 100 posted comments also contain another comment entity tag, "H-bead", the occurrence count of "H-bead" is 50, and the second occurrence count, that is, the total over all comment entity tags in all posted comments associated with the known topic, is 100 + 100 + 50 = 250;
the topic tag confidence of "A-certain" is then 100/250 = 40%, and the topic tag confidence of "B-certain" is likewise 100/250 = 40%;
Then, summing the topic tag confidence of "A-certain" and the topic tag confidence of "B-certain" gives a topic screening value of 80%; if the topic screening threshold is 75%, the known topic is recalled;
or, taking the weighted average of the topic tag confidence of "A-certain" and the topic tag confidence of "B-certain" gives a topic screening value of 40%; if the topic screening threshold is 30%, the known topic is recalled.
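The arithmetic of this example can be checked with a small sketch. It assumes, consistently with the numbers above, that a topic tag confidence is the ratio of the first occurrence count to the second occurrence count, and that each tag is counted at most once per posted comment; the function name and the data layout are hypothetical.

```python
from collections import Counter


def topic_tag_confidences(
    posted_comment_tags: list[list[str]],
    target_tags: list[str],
) -> dict[str, float]:
    """Confidence of each target tag = its occurrence count over the
    total occurrence count of all tags in the topic's posted comments."""
    counts = Counter(tag for tags in posted_comment_tags for tag in set(tags))
    total = sum(counts.values())  # second occurrence count
    return {tag: (counts[tag] / total if total else 0.0) for tag in target_tags}


# 100 posted comments: all contain A-certain and B-certain, 50 also H-bead.
posted = [["A-certain", "B-certain", "H-bead"]] * 50 + [["A-certain", "B-certain"]] * 50
conf = topic_tag_confidences(posted, ["A-certain", "B-certain"])

screen_sum = sum(conf.values())                # summed screening value
screen_avg = sum(conf.values()) / len(conf)    # equal-weight average
recall_by_sum = screen_sum >= 0.75
recall_by_avg = screen_avg >= 0.30
```

Equal weights are used for the average here only for simplicity; the application does not specify the weights.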
In the embodiments of the present application, the two approaches may also be combined when recalling the recall topics: the known topics are first screened based on the first number of matched target topic entity tags; then the topic tag confidence corresponding to each target topic entity tag is determined, a topic screening value is determined based on the topic tag confidences, and recall is performed based on the topic screening threshold.
Mode two: based on the first comment semantic information of the comment to be sent, each recall topic is recalled.
Referring to fig. 8, fig. 8 is a flowchart illustrating another method for recalling individual recall topics in an embodiment of the present application, including the steps of:
Step S800, obtaining first comment semantic information of the comment to be posted.
In one possible embodiment, inputting comments to be issued into a pre-training language model to obtain first comment semantic information corresponding to the comments to be issued; for example, inputting the comment to be posted into the BERT model to obtain the first comment semantic information of the comment to be posted, referring to fig. 9, fig. 9 exemplarily provides a schematic diagram for obtaining the first comment semantic information in the embodiment of the present application.
Step S801, based on first comment semantic information of comments to be issued, inquiring an approximate nearest neighbor search index to obtain a corresponding known topic set.
The approximate nearest neighbor search index corresponds to a known topic set, and the approximate nearest neighbor search index can be central topic semantic information corresponding to the known topic set.
In one possible embodiment, topics in a topic library are clustered, a plurality of similar known topics are divided into the same known topic set, and central topic semantic information of each known topic set is determined; then, aiming at the central topic semantic information of each known topic set, respectively determining the association degree between the central topic semantic information and the first comment semantic information, and determining an approximate nearest neighbor search index based on the association degree; and finally, acquiring a known topic set corresponding to the queried approximate nearest neighbor retrieval index.
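The cluster-based query can be sketched as follows. This is a simplified illustration, not a production approximate nearest neighbor index: the topic sets and their centre vectors are assumed to have been computed beforehand, cosine similarity is assumed as the association measure, and all names and vectors are hypothetical.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def query_topic_set(
    comment_vec: list[float],
    topic_sets: list[tuple[list[float], list[str]]],
) -> list[str]:
    """Return the known topic set whose centre vector is closest to the
    comment vector. Each entry is (centre_vector, topics_in_set)."""
    _, topics = max(topic_sets, key=lambda entry: cosine(comment_vec, entry[0]))
    return topics


# Two hypothetical topic sets with 2-dimensional centre vectors.
sets = [([1.0, 0.0], ["topic-1", "topic-2"]), ([0.0, 1.0], ["topic-3"])]
selected = query_topic_set([1.0, 0.1], sets)
```

Only the centre vectors are compared at query time, so the cost grows with the number of sets rather than with the number of topics.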
In another possible embodiment, the query can also be performed in a tree form: the topics in the topic library are divided into a plurality of known topic sets, each known topic set contains at least one known topic subset, and the sets are refined layer by layer;
therefore, when querying the approximate nearest neighbor search index, first, a first association degree between the first comment semantic information and the central topic semantic information of each known topic set is determined, the central topic semantic information is screened based on the first association degree, the known topic set corresponding to the screened central topic semantic information is determined among the plurality of known topic sets, and the known topic subsets of that known topic set are determined; then, the sub-center topic semantic information of each known topic subset is determined, the sub-center topic semantic information is screened based on a second association degree between it and the first comment semantic information, and the known topic subset corresponding to the screened sub-center topic semantic information is determined among the known topic subsets; and so on, layer by layer, until the query ends.
It should be noted that, the manner of querying the approximate nearest neighbor index based on the clustering and tree form is merely illustrative, and in the embodiment of the present application, the query may be determined based on a hash manner or the like.
Step S802, recalling each recall topic from the known topic set based on the cosine similarity between the comment to be posted and each known topic in the known topic set.
When the known topic set is determined, it is selected based on the first comment semantic information of the comment to be posted and the central topic semantic information of the known topic set; it therefore cannot be guaranteed that the degree of association between every known topic in the set and the comment to be posted meets the condition;
therefore, to ensure accuracy and calculation speed, in the embodiments of the present application, first, for each known topic in the known topic set, the cosine similarity between the known topic and the comment to be posted is determined; then, the known topics whose cosine similarity reaches a set threshold are recalled from the known topic set, or the several known topics ranked highest by cosine similarity are recalled, and the recalled known topics are taken as recall topics.
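Step S802 can be sketched as a scoring pass over the selected known topic set. The sketch assumes that the comment and each known topic have already been encoded as vectors (for example by a pre-trained language model); the function names, the vectors, and the threshold are hypothetical.

```python
import math
from typing import Optional


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recall_from_set(
    comment_vec: list[float],
    topic_vecs: dict[str, list[float]],
    threshold: Optional[float] = None,
    top_k: Optional[int] = None,
) -> list[str]:
    """Rank known topics by cosine similarity to the comment, then keep
    those reaching the threshold and/or the top_k highest ranked."""
    scored = sorted(
        ((cosine_similarity(comment_vec, vec), topic)
         for topic, vec in topic_vecs.items()),
        reverse=True,
    )
    if threshold is not None:
        scored = [(score, topic) for score, topic in scored if score >= threshold]
    if top_k is not None:
        scored = scored[:top_k]
    return [topic for _, topic in scored]


vectors = {"t1": [1.0, 0.0], "t2": [1.0, 1.0], "t3": [0.0, 1.0]}
by_threshold = recall_from_set([1.0, 0.0], vectors, threshold=0.7)
by_rank = recall_from_set([1.0, 0.0], vectors, top_k=1)
```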
It should be noted that, the filtering based on cosine similarity is only one implementation manner, and in the embodiment of the present application, each recall topic may also be recalled in the known topic set by using a hamming distance between the known topic and the comment to be issued, which is not limited specifically herein.
In this embodiment of the present application, when each recall topic is recalled based on the degree of association between the comment to be issued and the topic, the above-described first mode may be adopted alone, the above-described second mode may be adopted alone, or the above-described first mode and the above-described second mode may be adopted together.
Because the recall topics are only coarsely recalled, the association between the comment to be posted and the recall topics may be relatively weak; therefore, in the embodiments of the present application, the recall topics are finely ranked, and at least one first candidate topic meeting the relevance screening condition is screened out.
In one possible embodiment, after recalling each recall topic, at least one first candidate topic meeting the relevancy screening condition is screened out of each recall topic based on the target relevancy between the comment to be issued and each recall topic.
Because the target association degree between each recall topic and the comment to be sent needs to be determined, taking one recall topic as an example, the determination of the target association degree between the comment to be sent and the recall topic is described as follows:
firstly, inputting comments to be posted and a recall topic into a relevancy prediction model to obtain semantic relevancy of the comments to be posted and the recall topic;
In the relevancy prediction model, the comment to be posted and the recall topic are each passed through a corresponding pre-trained language model, such as a BERT model, to obtain the corresponding first comment semantic information and topic semantic information; the semantic association degree between the comment to be posted and the recall topic is then determined based on the first comment semantic information of the comment to be posted and the topic semantic information of the recall topic. Referring to fig. 10, fig. 10 is an exemplary schematic diagram of determining the semantic association degree between a comment to be posted and a recall topic in an embodiment of the present application.
Secondly, determining target association degree between comments to be posted and a recall topic based on the semantic association degree and the recall association degree corresponding to the recall topic;
the recall association degree of the recall topics is determined based on the recall mode of the recall topics. If the recall is based on the comment entity label of the comment to be sent, the recall association is determined based on the topic label confidence, and the topic label confidence can also be used as the recall association; if recall is performed based on the first comment semantic information of the comment to be issued, the recall association is determined based on cosine similarity between the comment to be issued and the recall topic, and the cosine similarity can also be directly used as the recall association.
In one possible embodiment, the semantic association degree and the recall association degree are weighted to obtain the target association degree; that is, target association degree = α1 × semantic association degree + α2 × recall association degree, where α1 and α2 are the weights of the semantic association degree and the recall association degree, respectively, and α1 + α2 = 1.
In this way, after the target association degree between each recall topic and the comment to be posted is determined, at least one first candidate topic meeting the relevance screening condition is screened out of the recall topics based on the target association degree;
generally, at least one recall topic whose target association degree reaches the corresponding target association degree threshold is screened out; or the recall topics are sorted in descending order of target association degree, and at least one top-ranked recall topic is screened out; finally, the screened at least one recall topic is taken as the at least one first candidate topic.
Screening the first candidate topics from the recalled recall topics based on both the semantic association degree and the recall association degree makes the association between the screened first candidate topics and the comment to be posted stronger, which further ensures the accuracy of topic recommendation.
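The fine ranking of recall topics can be sketched as below, assuming the semantic association degree and the recall association degree have already been computed for each recall topic. The names `RecallTopic` and `fine_rank` and the default weight are hypothetical; only the weighted sum with α1 + α2 = 1 and the threshold or top-ranked screening come from the text.

```python
from dataclasses import dataclass


@dataclass
class RecallTopic:
    topic: str
    semantic: float  # semantic association degree from the relevancy prediction model
    recall: float    # recall association degree (tag confidence or cosine similarity)


def fine_rank(recalls, alpha1=0.6, threshold=None, top_k=None):
    """Score each recall topic as alpha1*semantic + alpha2*recall, then screen."""
    alpha2 = 1.0 - alpha1  # enforces alpha1 + alpha2 = 1
    scored = [(r.topic, alpha1 * r.semantic + alpha2 * r.recall) for r in recalls]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    if threshold is not None:
        scored = [pair for pair in scored if pair[1] >= threshold]
    if top_k is not None:
        scored = scored[:top_k]
    return scored
```

The same shape applies to recall comments, with β1, β2 and the similarity values in place of α1, α2 and the association degrees.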
Step S202, screening out at least one candidate comment meeting the similarity screening condition in each recall comment based on the target similarity between the comment to be issued and each recall comment, and taking topics associated with the at least one candidate comment as corresponding second candidate topics respectively.
In order to ensure the accuracy of the recommended topics, in the embodiments of the present application, in addition to directly determining topics related to the comment to be posted based on the target association degree between the comment to be posted and the topics, recall comments are coarsely recalled based on the similarity between the comment to be posted and posted comments, and topics related to the comment to be posted are then indirectly determined based on the association relationship between comments and topics.
Therefore, recall comments similar to the comment to be posted are also coarsely recalled from the comment library. The recall comments are then finely ranked, and at least one candidate comment meeting the similarity screening condition is screened out. Since each candidate comment is associated with a topic, the topics respectively associated with the at least one candidate comment are taken as the corresponding second candidate topics.
Specifically, each recall comment is recalled in the comment library mainly based on at least one of comment entity tags of comments to be sent and first comment semantic information of the comments to be sent.
Recalling each recall comment based on the comment entity tag and recalling each recall comment based on the first comment semantic information are described in detail below.
Mode one: based on the comment entity tags of the comments to be sent, each recall comment is recalled.
Referring to fig. 11, fig. 11 is a flowchart for exemplarily providing a method for recalling each recall comment in an embodiment of the present application, including the following steps:
step S1100, obtaining each comment entity tag corresponding to the comment to be issued.
The specific implementation of this step may refer to step S600, which is not described herein.
Step S1101, determining at least one posted comment corresponding to each comment entity tag based on the mapping relationship between the comment entity tag and the posted comments.
Because the posted comments corresponding to the comment entity tags are determined based on the mapping relation between the comment entity tags and the posted comments, namely, the comment entity tags are used as indexes, and the posted comments matched with the comment entity tags are searched; therefore, it is necessary to construct in advance a mapping relationship between comment entity tags and posted comments.
Specifically, each posted comment is respectively input into the pre-trained entity recognition model to obtain at least one corresponding comment entity label; and classifying the posted comments based on the comment entity tags corresponding to the posted comments, namely dividing the posted comments corresponding to the same comment entity tag into the same group to form a mapping from the comment entity tag to the posted comments.
In the following, a form of a table is adopted to exemplarily illustrate that the posted comment corresponds to at least one comment entity tag and that the comment entity tag corresponds to at least one posted comment.
Table 3 shows an example in which a posted comment corresponds to at least one comment entity tag.
Table 3
[Table image: Figure BDA0003381566310000251]
Table 4 shows an example in which a comment entity tag corresponds to at least one posted comment.
Table 4
[Table image: Figure BDA0003381566310000252]
Therefore, based on the determined mapping relation between the comment entity tags and the posted comments, each posted comment corresponding to each comment entity tag can be accurately obtained.
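The mapping from comment entity tags to posted comments behaves like an inverted index. The sketch below assumes the entity recognition model has already produced the tags of each posted comment; `build_tag_index`, `lookup`, and the sample identifiers are hypothetical.

```python
from collections import defaultdict


def build_tag_index(comment_tags):
    """Group posted comments by tag.

    comment_tags: dict mapping a posted comment id to its comment entity tags.
    Returns a dict mapping each tag to the set of posted comment ids carrying it.
    """
    index = defaultdict(set)
    for comment_id, tags in comment_tags.items():
        for tag in tags:
            index[tag].add(comment_id)
    return dict(index)


def lookup(index, query_tags):
    """Return every posted comment matching at least one tag of the comment to be posted."""
    hits = set()
    for tag in query_tags:
        hits |= index.get(tag, set())
    return hits
```

Building the index once in advance keeps the per-comment lookup cost proportional to the number of tags rather than to the size of the comment library.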
Step S1102, recall each recall comment from the at least one posted comment based on the comment entity tags respectively matched by the at least one posted comment.
Because there are a plurality of comment entity tags and each comment entity tag corresponds to at least one posted comment, a large number of posted comments are obtained from the comment library; to ensure the accuracy of the recall comments, these posted comments are screened again to obtain the recall comments.
In addition, each posted comment corresponds to at least one comment entity tag, and among these tags there may be one, or more than one, whose content is consistent with a comment entity tag of the comment to be posted. The more comment entity tags of a posted comment are consistent with the comment entity tags of the comment to be posted, the more similar the posted comment is to the comment to be posted.
Therefore, in the embodiments of the present application, based on the at least one target comment entity tag matched by each of the at least one posted comment, the posted comments meeting the second recall condition are recalled from the at least one posted comment and taken as recall comments, where a target comment entity tag is a comment entity tag matched by both the comment to be posted and the posted comment.
In one possible implementation manner, in the process of recalling each recalled comment, a second number of matched target comment entity tags is respectively determined for at least one posted comment, the posted comments with the second number reaching a second threshold value are recalled, and the recalled posted comments are taken as the recalled comments.
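A count-based recall of this kind can be sketched as follows; the function name and the default threshold are hypothetical.

```python
def recall_by_match_count(query_tags, comment_tags, second_threshold=2):
    """Recall posted comments sharing at least `second_threshold` tags with the query.

    query_tags: comment entity tags of the comment to be posted.
    comment_tags: dict mapping a posted comment id to its comment entity tags.
    Returns a dict mapping each recalled comment id to its number of matched tags.
    """
    query = set(query_tags)
    recalled = {}
    for comment_id, tags in comment_tags.items():
        matched = query & set(tags)  # target comment entity tags
        if len(matched) >= second_threshold:
            recalled[comment_id] = len(matched)
    return recalled
```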
In another possible implementation, in the process of recalling each recall comment, the following is performed for each of the at least one posted comment: first, the at least one matched target comment entity tag and the comment tag confidence corresponding to each target comment entity tag are determined; then, a comment screening value is determined based on the comment tag confidences; finally, the posted comments whose comment screening value reaches the comment screening threshold are recalled and taken as recall comments. The comment tag confidence is determined based on the entity recognition model.
Note that, the implementation of the recall comment is similar to the implementation of the recall topic, and will not be illustrated here.
In the embodiments of the present application, the two implementations may also be combined when recalling the recall comments: screening is first performed based on the second number of matched target comment entity tags; then, the comment tag confidence corresponding to each target comment entity tag is determined, a comment screening value is determined based on the comment tag confidences, and recall is performed based on the comment screening threshold.
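The combined, two-stage recall might look like the sketch below. The text does not state how the comment screening value is derived from the individual confidences, so averaging them is purely an assumption made for illustration; all names and default values are hypothetical.

```python
def two_stage_recall(query_tags, comment_tag_conf, min_matches=1, screen_threshold=0.5):
    """Filter by match count, then by a confidence-based comment screening value.

    comment_tag_conf: dict mapping a posted comment id to {tag: confidence},
    where the confidences are assumed to come from the entity recognition model.
    Returns a dict mapping each recalled comment id to its screening value.
    """
    query = set(query_tags)
    recalled = {}
    for comment_id, tag_conf in comment_tag_conf.items():
        matched = [tag for tag in tag_conf if tag in query]
        # Stage 1: require enough matched target comment entity tags (at least one).
        if len(matched) < max(min_matches, 1):
            continue
        # Stage 2 (ASSUMPTION): screening value = mean confidence of the matched tags.
        value = sum(tag_conf[tag] for tag in matched) / len(matched)
        if value >= screen_threshold:
            recalled[comment_id] = value
    return recalled
```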
Mode two: based on the first comment semantic information of the comment to be sent, each recall comment is recalled.
Referring to fig. 13, fig. 13 is a flowchart illustrating another method for recalling individual recall comments in an embodiment of the present application, including the steps of:
Step S1300, obtaining the first comment semantic information of the comment to be posted.
The specific implementation of this step can be referred to as step S800, and will not be described herein.
Step S1301, based on the first comment semantic information of the comment to be posted, query the depth similarity retrieval index to obtain a corresponding posted comment set.
One depth similarity retrieval index corresponds to one posted comment set, and the depth similarity retrieval index may be the central comment semantic information corresponding to the posted comment set.
In one possible embodiment, the posted comments in the comment library are clustered so that similar posted comments are grouped into the same posted comment set, and the central comment semantic information of each posted comment set is determined; then, for the central comment semantic information of each posted comment set, the similarity between it and the first comment semantic information is determined, and the depth similarity retrieval index is determined based on the similarity; finally, the posted comment set corresponding to the queried depth similarity retrieval index is obtained.
It should be noted that querying the depth similarity retrieval index based on clustering is merely illustrative; in the embodiments of the present application, the retrieval index may also be determined in a tree-based or hash-based manner or the like.
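A cluster-based retrieval index of this kind can be sketched as follows, assuming the clustering itself has already been done. Taking the mean vector of a set as its central comment semantic information, and cosine similarity as the similarity measure, are assumptions; the function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_index(clusters, embeddings):
    """Pair each posted comment set with its center vector.

    clusters: list of lists of posted comment ids (one list per set).
    embeddings: dict mapping a posted comment id to its semantic vector.
    """
    return [(centroid([embeddings[c] for c in members]), members) for members in clusters]


def query_index(index, query_vec):
    """Return the posted comment set whose center is most similar to the query."""
    _, members = max(index, key=lambda entry: cosine(entry[0], query_vec))
    return members
```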
Step S1302, in the posted comment set, recall each recall comment based on the cosine similarity between the comment to be posted and each posted comment in the posted comment set.
Because the posted comment set is determined based only on the first comment semantic information of the comment to be posted and the central comment semantic information of the posted comment set, there is no guarantee that the similarity between every posted comment in the set and the comment to be posted meets the condition.
Therefore, to ensure both accuracy and calculation speed, in the embodiments of the present application, the cosine similarity between each posted comment in the posted comment set and the comment to be posted is determined first; then, the posted comments whose cosine similarity reaches a set threshold, or a plurality of posted comments whose cosine similarity ranks highest, are recalled from the posted comment set, and the recalled posted comments are taken as recall comments.
It should be noted that screening based on cosine similarity is only one implementation; in the embodiments of the present application, each recall comment may also be recalled from the posted comment set based on the Hamming distance between the posted comment and the comment to be posted.
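A Hamming-distance variant can be sketched as below. The text does not say how binary codes are obtained from the semantic information, so taking the sign of each vector component is an assumption; the function names and the default distance bound are hypothetical.

```python
def binarize(vec):
    """Pack the signs of a vector into an integer bit code (ASSUMED encoding)."""
    code = 0
    for x in vec:
        code = (code << 1) | (1 if x > 0 else 0)
    return code


def hamming(a, b):
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")


def recall_by_hamming(query_vec, candidates, max_distance=1):
    """Recall posted comments whose code is within `max_distance` bits of the query code.

    candidates: dict mapping a posted comment id to its semantic vector.
    """
    q = binarize(query_vec)
    return [cid for cid, vec in candidates.items() if hamming(q, binarize(vec)) <= max_distance]
```

Comparing bit codes is cheaper than computing cosine similarity on full vectors, at the cost of a coarser similarity signal.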
In this embodiment of the present application, when recall of each recall comment is performed based on the similarity between the comment to be posted and the comment that has been posted, the foregoing manner one may be adopted only, the foregoing manner two may be adopted only, or both the foregoing manner one and the foregoing manner two may be adopted together.
Because the recall comments are only coarsely recalled, the similarity between the comment to be posted and the recall comments may be relatively weak; therefore, in the embodiments of the present application, the recall comments are finely ranked, and at least one candidate comment meeting the similarity screening condition is screened out.
In one possible embodiment, after recalling each recall comment, at least one candidate comment meeting the similarity screening condition is screened out from each recall comment based on the target similarity between the comment to be posted and each recall comment.
Because the target similarity between each recall comment and the comment to be sent needs to be determined, taking one recall comment as an example, the determination of the target similarity between the comment to be sent and the recall comment is described as follows:
firstly, inputting comments to be posted and a recall comment into a similarity prediction model to obtain semantic similarity of the comments to be posted and the recall comment;
in the similarity prediction model, the comment to be posted and the recall comment are each passed through a corresponding pre-trained language model, such as a BERT model, to obtain the corresponding first comment semantic information and comment semantic information; the semantic similarity between the comment to be posted and the recall comment is then determined based on the first comment semantic information of the comment to be posted and the comment semantic information of the recall comment. Referring to fig. 12, fig. 12 is an exemplary schematic diagram of determining the semantic similarity between a comment to be posted and a recall comment in an embodiment of the present application.
Secondly, determining target similarity between the comment to be posted and a recall comment based on the semantic similarity and the recall similarity corresponding to the recall comment;
the recall similarity of the recall comment is determined based on a recall mode of the recall comment. If the recall is based on the comment entity tag of the comment to be sent, the recall similarity is determined based on the comment tag confidence, and the comment tag confidence can also be used as the recall similarity; if the recall is based on the first comment semantic information of the comment to be issued, the recall similarity is determined based on the cosine similarity between the comment to be issued and the recall comment, and the cosine similarity can also be directly used as the recall similarity.
In one possible embodiment, the semantic similarity and the recall similarity are weighted to obtain the target similarity; that is, target similarity = β1 × semantic similarity + β2 × recall similarity, where β1 and β2 are the weights of the semantic similarity and the recall similarity, respectively, and β1 + β2 = 1.
After the target similarity between each recall comment and the comment to be posted is determined, at least one candidate comment meeting the similarity screening condition is screened out of the recall comments based on the target similarity.
Generally, at least one recall comment whose target similarity reaches the corresponding target similarity threshold is screened out; or the recall comments are sorted in descending order of target similarity, and at least one top-ranked recall comment is screened out; finally, the screened at least one recall comment is taken as the at least one candidate comment.
Based on the semantic similarity and the recall similarity, candidate comments are screened in all recalled comments, so that the similarity between the screened candidate comments and the comments to be distributed is stronger, the association degree between topics corresponding to the candidate comments and the comments to be distributed is ensured, and the accuracy of topic recommendation is further ensured.
In one possible implementation manner, after determining the candidate comment, the topic corresponding to the candidate comment is used as a second candidate topic.
Step S203, determining a recommended topic corresponding to the comment to be issued based on the obtained at least one first candidate topic and at least one second candidate topic.
In one possible embodiment, the at least one first candidate topic and the at least one second candidate topic are first fused, and at least one co-occurrence candidate topic is retained, where a co-occurrence candidate topic is a candidate topic present in both the at least one first candidate topic and the at least one second candidate topic; then, the retained at least one co-occurrence candidate topic is taken as the recommended topic corresponding to the comment to be posted.
When a relatively large number of co-occurrence candidate topics is determined from the at least one first candidate topic and the at least one second candidate topic, taking all of them as recommended topics would make it harder for the user to select the topic to which the comment to be posted belongs. Therefore, the embodiments of the present application further provide another implementation for determining the recommended topic corresponding to the comment to be posted based on the at least one first candidate topic and the at least one second candidate topic.
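Retaining the co-occurrence candidate topics amounts to an intersection of the two candidate lists. A minimal sketch follows; the function name is hypothetical, and preserving the order of the first list is a design choice rather than something the text requires.

```python
def co_occurring(first_candidates, second_candidates):
    """Keep topics present in both lists, in the order of the first list, without duplicates."""
    second = set(second_candidates)
    seen, result = set(), []
    for topic in first_candidates:
        if topic in second and topic not in seen:
            seen.add(topic)
            result.append(topic)
    return result
```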
In another possible embodiment, first, at least one first candidate topic and at least one second candidate topic are fused, and at least one co-occurrence candidate topic is reserved, where the co-occurrence candidate topic is: candidate topics present in both the at least one first candidate topic and the at least one second candidate topic;
then, for each of the at least one co-occurrence candidate topic, the following is performed: determining the target screening value of a co-occurrence candidate topic T' based on the target relevance, the target similarity, and the heat corresponding to the co-occurrence candidate topic T'; that is, the target screening value of T' combines w1 × target relevance of T' and w2 × target similarity of T' with the heat of T', where w1 and w2 are weights;
The heat is determined by the number of times the co-occurrence candidate topic T' is associated by comments and the number of exposures of the posted comments associated with the co-occurrence candidate topic T'; specifically, heat of T' = (number of times T' is associated by comments / total number of times all topics are associated by comments) × (number of exposures of the posted comments associated with T' / number of exposures of all posted comments).
Finally, at least one target topic is screened out of the at least one co-occurrence candidate topic based on the target screening values, and the screened at least one target topic is taken as the recommended topic corresponding to the comment to be posted;
generally, at least one target topic whose target screening value reaches the corresponding target screening threshold is screened out; or the co-occurrence candidate topics are sorted in descending order of target screening value, and at least one top-ranked target topic is screened out; finally, the screened at least one target topic is taken as the recommended topic corresponding to the comment to be posted.
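The scoring and selection of co-occurrence candidate topics can be sketched as below. Two points are assumptions rather than statements of the text: the heat is read as the product of an association share and an exposure share, and the heat is assumed to enter the target screening value as an additive term with a hypothetical weight `w3`. All names and default weights are illustrative.

```python
def heat(topic, assoc_counts, topic_exposures, total_exposures):
    """Heat of a topic = association share * exposure share.

    assoc_counts: dict mapping a topic to the number of times comments associate it.
    topic_exposures: dict mapping a topic to the exposures of its associated posted comments.
    total_exposures: number of exposures of all posted comments.
    """
    total_assoc = sum(assoc_counts.values())
    if total_assoc == 0 or total_exposures == 0:
        return 0.0
    assoc_share = assoc_counts.get(topic, 0) / total_assoc
    exposure_share = topic_exposures.get(topic, 0) / total_exposures
    return assoc_share * exposure_share


def target_screening_value(relevance, similarity, heat_value, w1=0.4, w2=0.4, w3=0.2):
    """Weighted score. ASSUMPTION: heat is added with a hypothetical weight w3."""
    return w1 * relevance + w2 * similarity + w3 * heat_value


def select_recommended(scores, threshold=None, top_k=None):
    """Pick topics whose score reaches `threshold` and/or rank in the top `top_k`.

    scores: dict mapping a co-occurrence candidate topic to its target screening value.
    """
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    if threshold is not None:
        ranked = [pair for pair in ranked if pair[1] >= threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [topic for topic, _ in ranked]
```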
The topic recommendation method according to the embodiment of the present application will be described below with reference to fig. 14.
Referring to fig. 14, fig. 14 is a flowchart for illustrating a specific implementation method of topic recommendation in an embodiment of the present application, including the following steps:
Step S140, the server acquires comments to be posted.
Step S141, the server recommends topics based on comments to be issued;
in step S141, mainly the following steps are performed:
step S1410, the server screens out at least one first candidate topic from each recall topic based on the target association degree between the comment to be posted and each recall topic;
step S1411, the server screens out at least one candidate comment from the recall comments based on the target similarity between the comment to be posted and each recall comment, and takes the topics respectively associated with the at least one candidate comment as the corresponding second candidate topics;
step S1412, the server fuses the at least one first candidate topic and the at least one second candidate topic, and determines the recommended topic corresponding to the comment to be posted.
Step S142, the server returns the recommended topic corresponding to the comment to be posted to the terminal device, so that the determined recommended topic is recommended to the user through the terminal device for the user to select and use.
In the scheme of the embodiments of the present application, when topics need to be recommended for a comment to be posted, first, based on the obtained comment to be posted, associated recall topics are coarsely recalled from the topic library and similar recall comments are coarsely recalled from the comment library. Then, the recall topics are screened again based on the target association degree between the comment to be posted and each recall topic, and at least one first candidate topic meeting the relevance screening condition is screened out; meanwhile, the recall comments are screened again based on the target similarity between the comment to be posted and each recall comment, at least one candidate comment meeting the similarity screening condition is screened out, and the topics respectively associated with the at least one candidate comment are taken as at least one second candidate topic. In this process, topics strongly associated with the comment to be posted are accurately found. Finally, the obtained at least one first candidate topic and at least one second candidate topic are fused to determine the recommended topic corresponding to the comment to be posted.
In this scheme, topics are recommended directly, using the association degree between the comment to be posted and the recall topics, and indirectly, using the similarity between the comment to be posted and the recall comments. This spares the target object from searching for a topic again, reduces the difficulty of selecting the topic to which the comment to be posted belongs, and improves the efficiency of topic selection when posting a comment. In addition, all topics are covered when the recommended topics are determined, which improves the accuracy of topic recommendation and, in turn, the accuracy of subsequent big data summarization and analysis.
The topic recommendation device solves the problem on a principle similar to that of the method embodiments above, so the implementation of the device may refer to the implementation of the method; repeated details are omitted.
Referring to fig. 15, fig. 15 exemplarily provides a topic recommendation device in an embodiment of the present application, where the topic recommendation device 1500 includes an obtaining unit 1501, a first filtering unit 1502, a second filtering unit 1503, and a determining unit 1504.
An obtaining unit 1501, configured to obtain comments to be distributed;
A first screening unit 1502, configured to screen, based on a target relevance between a comment to be issued and each recall topic, at least one first candidate topic that meets a relevance screening condition in each recall topic;
a second screening unit 1503, configured to screen, based on the target similarity between the comment to be issued and each recall comment, at least one candidate comment meeting the similarity screening condition in each recall comment, and use topics associated with each at least one candidate comment as corresponding second candidate topics respectively;
a determining unit 1504, configured to determine a recommended topic corresponding to the comment to be posted based on the obtained at least one first candidate topic and at least one second candidate topic.
In a possible embodiment, the first filtering unit 1502 is further configured to perform at least one of the following steps before filtering out at least one first candidate topic that meets the relevance filtering condition in each recall topic based on the target relevance between the comment to be issued and each recall topic:
recall each recall topic based on the comment entity tag for which a comment is to be posted;
based on the first comment semantic information of the comment to be sent, each recall topic is recalled.
In a possible embodiment, the first screening unit 1502 is specifically configured to:
determining each comment entity label corresponding to the comment to be issued based on the entity identification model; the entity identification model is obtained by training comments marked with comment entity labels;
determining at least one topic entity tag based on each comment entity tag corresponding to the comment to be posted, and determining at least one known topic corresponding to each topic entity tag based on the mapping relation between topic entity tags and topics;
recalling each recall topic from the at least one known topic based on the topic entity tags respectively matched by the at least one known topic.
In a possible embodiment, the first screening unit 1502 is specifically configured to:
for at least one known topic, the following operations are performed: determining at least one target topic entity tag that matches a known topic, wherein the target topic entity tag is: topic entity tags consistent with the comment entity tag content in the comment to be issued;
recalling, based on the at least one target topic entity tag matched by each of the at least one known topic, the known topics meeting the first recall condition from the at least one known topic, and taking the recalled known topics as recall topics.
In a possible embodiment, the first screening unit 1502 is specifically configured to:
determining a first number of matched target topic entity tags, and recalling the known topics whose first number reaches a first threshold; or
determining the topic tag confidence corresponding to each of the at least one target topic entity tag, determining a topic screening value based on the topic tag confidences, and recalling the known topics whose topic screening value reaches the topic screening threshold.
In one possible embodiment, the topic tag confidence is determined based on a first number of times the target topic entity tag occurs in all posted comments associated with the corresponding known topic, and a second number of times all comment entity tags occur in all posted comments associated with the corresponding known topic.
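One way to combine the two counts is as a ratio, sketched below. The passage above only says that the confidence is determined from the first and second numbers of occurrences, so the ratio is an assumption made for illustration; the function name and the input layout are hypothetical.

```python
from collections import Counter


def topic_tag_confidence(target_tag, comment_tags_for_topic):
    """ASSUMED form: occurrences of the target tag / occurrences of all tags.

    comment_tags_for_topic: list of tag lists, one per posted comment
    associated with the known topic.
    """
    counts = Counter(tag for tags in comment_tags_for_topic for tag in tags)
    total = sum(counts.values())  # second number of occurrences
    return counts[target_tag] / total if total else 0.0
```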
In a possible embodiment, the first screening unit 1502 is specifically configured to:
based on first comment semantic information of comments to be issued, inquiring an approximate nearest neighbor search index to obtain a corresponding known topic set;
in the known topic set, each recalled topic is recalled based on cosine similarity between the comment to be issued and each known topic in the known topic set.
In a possible embodiment, the first screening unit 1502 is specifically configured to:
for each recall topic, the following operations are performed:
determining the semantic association degree between the comment to be posted and a recall topic based on the first comment semantic information of the comment to be posted and the topic semantic information of the recall topic;
based on the semantic association degree and the recall association degree corresponding to one recall topic, determining the target association degree between the comment to be posted and the one recall topic;
and screening at least one first candidate topic which meets the relevance screening condition from the recall topics based on the target relevance corresponding to each recall topic.
In a possible embodiment, the second filtering unit 1503 is further configured to perform at least one of the following steps before filtering out at least one candidate comment meeting the similarity filtering condition in each recall comment based on the target similarity between the comment to be issued and each recall comment:
recall each recall comment based on a comment entity tag for which a comment is to be posted;
based on the first comment semantic information of the comment to be sent, each recall comment is recalled.
In a possible embodiment, the second screening unit 1503 is specifically configured to:
determining each comment entity tag corresponding to the comment to be posted based on an entity recognition model, where the entity recognition model is trained on comments annotated with comment entity tags;
determining at least one posted comment corresponding to each comment entity tag based on the mapping relation between comment entity tags and posted comments;
recalling each recall comment from the at least one posted comment based on the comment entity tags matched by each of the at least one posted comment.
In a possible embodiment, the second screening unit 1503 is specifically configured to:
for each of the at least one posted comment, performing the following operation: determining at least one target comment entity tag matched with the one posted comment, where a target comment entity tag is a comment entity tag matched by both the comment to be posted and the one posted comment;
recalling, from the at least one posted comment, the posted comments that meet the second recall condition based on the at least one target comment entity tag matched by each posted comment, and taking the recalled posted comments as recall comments.
In a possible embodiment, the second screening unit 1503 is specifically configured to:
determining a second number of matched target comment entity tags, and recalling the posted comments for which the second number reaches a second threshold; or
determining the comment tag confidence corresponding to each of the at least one target comment entity tag, determining a comment screening value based on the comment tag confidences, and recalling the posted comments whose comment screening value reaches a comment screening threshold, where the comment tag confidence is determined based on the entity recognition model.
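The entity-tag recall of posted comments can be sketched with an inverted index from tags to comments. This is an illustration under stated assumptions: the comment screening value is taken to be the sum of the confidences of the matched tags, both alternative recall conditions are checked in one function, and all names, thresholds, and sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_tag_index(posted: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each comment entity tag to the posted comments that carry it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for comment_id, tags in posted.items():
        for tag in tags:
            index[tag].add(comment_id)
    return index


def recall_comments(
    new_tags: Dict[str, float],
    posted: Dict[str, Set[str]],
    min_count: int = 2,
    min_score: float = 0.9,
) -> List[str]:
    """Recall posted comments sharing entity tags with the comment to be posted.

    new_tags maps each tag of the comment to be posted to its confidence,
    as it might be produced by an entity recognition model.
    """
    index = build_tag_index(posted)
    matched: Dict[str, Set[str]] = defaultdict(set)
    for tag in new_tags:
        for comment_id in index.get(tag, ()):
            matched[comment_id].add(tag)
    recalled = []
    for comment_id, tags in matched.items():
        # Assumed screening value: sum of confidences of the matched tags.
        score = sum(new_tags[tag] for tag in tags)
        if len(tags) >= min_count or score >= min_score:
            recalled.append(comment_id)
    return sorted(recalled)


posted = {"c1": {"camera", "battery"}, "c2": {"camera"}, "c3": {"hotel"}}
recall_set = recall_comments({"camera": 0.95, "battery": 0.6}, posted)
```

Here `c1` is recalled because two tags match, and `c2` because its single matched tag has a high confidence; `c3` shares no tag and is never considered.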
In a possible embodiment, the second screening unit 1503 is specifically configured to:
querying a depth similarity retrieval index based on the first comment semantic information of the comment to be posted, to obtain a corresponding posted comment set;
recalling each recall comment from the posted comment set based on the cosine similarity between the comment to be posted and each posted comment in the posted comment set.
In a possible embodiment, the second screening unit 1503 is specifically configured to:
for each recall comment, the following operations are performed:
determining the semantic similarity between the comment to be posted and one recall comment based on the first comment semantic information of the comment to be posted and the second comment semantic information of the one recall comment;
determining the target similarity between the comment to be posted and the one recall comment based on the semantic similarity and the recall similarity corresponding to the one recall comment;
and screening at least one candidate comment meeting the similarity screening condition in each recall comment based on the target similarity corresponding to each recall comment.
In a possible embodiment, the determining unit 1504 is specifically configured to:
determining at least one co-occurrence candidate topic based on the at least one first candidate topic and the at least one second candidate topic, wherein the co-occurrence candidate topic is: candidate topics present in both the at least one first candidate topic and the at least one second candidate topic;
for each of the at least one co-occurrence candidate topic, performing the following operation: determining a target screening value based on the target association degree, the target similarity, and the popularity corresponding to the one co-occurrence candidate topic;
screening at least one target topic from the at least one co-occurrence candidate topic based on each target screening value, and taking the screened at least one target topic as the recommended topic corresponding to the comment to be posted.
In one possible embodiment, the popularity is determined by the number of associations corresponding to one co-occurrence candidate topic and the number of exposures of posted comments associated with one co-occurrence candidate topic.
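The final selection can be sketched as an intersection followed by a scoring step. The application names the inputs of the target screening value but not the formula, so the weighted sum, the weights, and the reading of popularity as associations divided by exposures are assumptions; the function names and sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def popularity(n_associations: int, n_exposures: int) -> float:
    """Assumed popularity: associations per exposure of the topic's comments."""
    return n_associations / n_exposures if n_exposures else 0.0


def recommend(
    first: Dict[str, float],
    second: Dict[str, float],
    stats: Dict[str, Tuple[int, int]],
    top_n: int = 1,
    weights: Tuple[float, float, float] = (0.5, 0.3, 0.2),
) -> List[str]:
    """Pick recommended topics from the co-occurrence candidate topics.

    first maps first candidate topics to their target association degree,
    second maps second candidate topics to their target similarity, and
    stats maps a topic to (number of associations, number of exposures).
    """
    w_assoc, w_sim, w_pop = weights
    co_occurring = first.keys() & second.keys()
    scored = []
    for topic in co_occurring:
        n_assoc, n_expo = stats.get(topic, (0, 0))
        value = (
            w_assoc * first[topic]
            + w_sim * second[topic]
            + w_pop * popularity(n_assoc, n_expo)
        )
        scored.append((topic, value))
    # Highest screening value first; ties broken by topic name.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [topic for topic, _ in scored[:top_n]]


first = {"phones": 0.87, "gadgets": 0.64}
second = {"phones": 0.8, "travel": 0.9}
recommended = recommend(first, second, {"phones": (50, 1000)})
```

Only `phones` appears in both candidate sets, so it is the single co-occurrence candidate topic and is returned as the recommendation.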
In the scheme of the embodiments of the present application, when recommending a topic for a comment to be posted, a number of recall topics and a number of recall comments are first recalled based on the comment to be posted. Then, based on the target association degree between the comment to be posted and each recall topic, at least one first candidate topic strongly associated with the comment to be posted is screened from the recall topics. Based on the target similarity between the comment to be posted and each recall comment, at least one candidate comment highly similar to the comment to be posted is screened from the recall comments, and the topics respectively associated with the at least one candidate comment are taken as second candidate topics. Finally, the recommended topic corresponding to the comment to be posted is screened from the first candidate topics and the second candidate topics.
In the embodiments of the present application, the recommended topic is determined from both the association degree between the comment to be posted and the topics and the similarity between the comment to be posted and posted comments, which improves the association between the comment to be posted and the recommended topic. The target object therefore does not need to search again, the difficulty of selecting the topic to which the comment to be posted belongs is reduced, and the efficiency of topic selection during comment posting, that is, the interaction efficiency of comment posting, is improved. In the process of determining the recommended topic, all topics are covered, and long-tail topics and newly popular topics are recommended more accurately, which improves the accuracy of topic recommendation and, in turn, the accuracy of subsequent big-data summary analysis.
For convenience of description, the above parts are described as being functionally divided into modules (or units) respectively. Of course, the functions of each module (or unit) may be implemented in the same piece or pieces of software or hardware when implementing the present application.
Having described the topic recommendation method and apparatus of an exemplary embodiment of the present application, next, an apparatus for topic recommendation according to another exemplary embodiment of the present application is described.
Those skilled in the art will appreciate that the various aspects of the present application may be implemented as a system, method, or program product. Accordingly, aspects of the present application may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit," "module," or "system."
Based on the same inventive concept as the method embodiments described above, an electronic device is also provided in the embodiments of the present application, and in one embodiment, the electronic device may be a server, such as the server 120 shown in fig. 1. In this embodiment, the electronic device may be configured as shown in fig. 16, including a memory 1601, a communication module 1603, and one or more processors 1602.
A memory 1601 for storing a computer program executed by the processor 1602. The memory 1601 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, a program required for running an instant messaging function, and the like; the storage data area can store various instant messaging information, operation instruction sets and the like.
The memory 1601 may be a volatile memory, such as a random-access memory (RAM); the memory 1601 may also be a non-volatile memory, such as a read-only memory, a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or the memory 1601 may be any other medium that can be used to carry or store a desired computer program in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 1601 may also be a combination of the above memories.
The processor 1602 may include one or more central processing units (CPUs), digital processing units, or the like. The processor 1602 is configured to implement the topic recommendation method described above when invoking the computer program stored in the memory 1601.
The communication module 1603 is used for communicating with terminal devices and other servers.
The specific connection medium between the memory 1601, the communication module 1603, and the processor 1602 is not limited in the embodiments of the present application. In fig. 16, the memory 1601 and the processor 1602 are connected by a bus 1604, which is drawn as a bold line; the connections between other components are merely illustrative and not limiting. The bus 1604 may be divided into an address bus, a data bus, a control bus, and the like. For ease of description, only one bold line is drawn in fig. 16, but this does not mean that there is only one bus or only one type of bus.
The memory 1601 has stored therein a computer storage medium having stored therein computer executable instructions for implementing the topic recommendation method of an embodiment of the present application. The processor 1602 is configured to perform the topic recommendation method described above, as shown in FIG. 2.
In another embodiment, the electronic device may also be other electronic devices, such as terminal device 110 shown in fig. 1. In this embodiment, the structure of the electronic device may include, as shown in fig. 17: communication component 1710, memory 1720, display unit 1730, camera 1740, sensor 1750, audio circuit 1760, bluetooth module 1770, processor 1780, and the like.
The communication component 1710 is used for communicating with a server. In some embodiments, it may include a wireless fidelity (WiFi) module. WiFi is a short-range wireless transmission technology, and the electronic device can help the user send and receive information through the WiFi module.
Memory 1720 may be used to store software programs and data. The processor 1780 performs various functions and data processing of the terminal device 110 by executing software programs or data stored in the memory 1720. Memory 1720 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. Memory 1720 stores an operating system that enables terminal device 110 to operate. The memory 1720 may store an operating system and various application programs, and may also store code for executing the topic recommendation method according to the embodiment of the present application.
The display unit 1730 may be used to display information input by a user or information provided to the user, as well as a graphical user interface (GUI) of various menus of the terminal device 110. In particular, the display unit 1730 may include a display screen 1732 provided at the front of the terminal device 110. The display screen 1732 may be configured in the form of a liquid crystal display, light emitting diodes, or the like. In the embodiments of the present application, the display unit 1730 may be used to display an editing interface for the comment to be posted, and the like.
The display unit 1730 may also be used to receive input numeric or character information and to generate signal inputs related to user settings and function control of the terminal device 110. In particular, the display unit 1730 may include a touch screen 1731 provided at the front of the terminal device 110, which may collect touch operations by the user on or near it, such as clicking buttons and dragging scroll boxes.
The touch screen 1731 may cover the display screen 1732, or the touch screen 1731 and the display screen 1732 may be integrated to implement the input and output functions of the terminal device 110; the integrated unit may be simply referred to as a touch display screen. The display unit 1730 in the present application may display an application program and the corresponding operation steps.
Camera 1740 may be used to capture still images and the user may comment on the image captured by camera 1740 through the application. The camera 1740 may be one or more. The object generates an optical image through the lens and projects the optical image onto the photosensitive element. The photosensitive element may be a charge coupled device (charge coupled device, CCD) or a Complementary Metal Oxide Semiconductor (CMOS) phototransistor. The photosensitive elements convert the optical signals to electrical signals, which are then transferred to a processor 1780 for conversion to digital image signals.
The terminal device may also comprise at least one sensor 1750, such as an acceleration sensor 1751, a distance sensor 1752, a fingerprint sensor 1753, a temperature sensor 1754. The terminal device may also be configured with other sensors such as gyroscopes, barometers, hygrometers, thermometers, infrared sensors, light sensors, motion sensors, and the like.
Audio circuitry 1760, speaker 1761, microphone 1762 may provide an audio interface between the user and terminal device 110. The audio circuit 1760 may transmit the received electrical signal converted from audio data to the speaker 1761, where the electrical signal is converted to a sound signal by the speaker 1761. The terminal device 110 may also be configured with a volume button for adjusting the volume of the sound signal. On the other hand, microphone 1762 converts the collected sound signals into electrical signals, which are received by audio circuitry 1760 and converted into audio data, which are output to communication component 1710 for transmission to, for example, another terminal device 110, or to memory 1720 for further processing.
The bluetooth module 1770 is configured to interact with other bluetooth devices having bluetooth modules via a bluetooth protocol. For example, the terminal device may establish a bluetooth connection with a wearable electronic device (e.g., a smart watch) that also has a bluetooth module through the bluetooth module 1770, so as to perform data interaction.
The processor 1780 is a control center of the terminal device, connects various parts of the entire terminal using various interfaces and lines, and performs various functions of the terminal device and processes data by running or executing software programs stored in the memory 1720, and calling data stored in the memory 1720. In some embodiments, the processor 1780 may include one or more processing units; the processor 1780 may also integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., and a baseband processor that primarily handles wireless communications. It will be appreciated that the baseband processor described above may not be integrated into the processor 1780. The processor 1780 may run an operating system, an application, a user interface display, and a touch response, as well as the topic recommendation method of the embodiments of the present application. In addition, a processor 1780 is coupled with the display unit 1730.
In some possible embodiments, aspects of the topic recommendation method provided herein may also be implemented in the form of a program product comprising a computer program for causing an electronic device to perform the steps in the topic recommendation method described herein above according to various exemplary embodiments of the present application when the program product is run on the electronic device, e.g. the electronic device may perform the steps as shown in fig. 2.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product of embodiments of the present application may employ a portable compact disc read only memory (CD-ROM) and comprise a computer program and may run on a computing device. However, the program product of the present application is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with a command execution system, apparatus, or device.
The readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave in which a readable computer program is embodied. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with a command execution system, apparatus, or device.
A computer program embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, the features and functions of two or more of the elements described above may be embodied in one element in accordance with embodiments of the present application. Conversely, the features and functions of one unit described above may be further divided into a plurality of units to be embodied.
Furthermore, although the operations of the methods of the present application are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in that particular order, or that all of the illustrated operations must be performed, in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having a computer-usable computer program embodied therein.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (20)

1. A topic recommendation method, the method comprising:
obtaining a comment to be posted;
screening, from recall topics, at least one first candidate topic that meets a relevance screening condition based on a target association degree between the comment to be posted and each recall topic;
screening, from recall comments, at least one candidate comment that meets a similarity screening condition based on a target similarity between the comment to be posted and each recall comment, and taking the topics respectively associated with the at least one candidate comment as corresponding second candidate topics;
and determining a recommended topic corresponding to the comment to be posted based on the obtained at least one first candidate topic and at least one second candidate topic.
2. The method of claim 1, wherein before the screening of at least one first candidate topic that meets the relevance screening condition from the recall topics based on the target association degree between the comment to be posted and each recall topic, the method further comprises at least one of:
recalling each recall topic based on the comment entity tags of the comment to be posted;
recalling each recall topic based on the first comment semantic information of the comment to be posted.
3. The method of claim 2, wherein the recalling each recall topic based on the comment entity tags of the comment to be posted comprises:
determining each comment entity tag corresponding to the comment to be posted based on an entity recognition model, the entity recognition model being trained on comments annotated with comment entity tags;
determining at least one topic entity tag based on each comment entity tag corresponding to the comment to be posted, and determining at least one known topic corresponding to each topic entity tag based on a mapping relation between topic entity tags and topics;
recalling each recall topic from the at least one known topic based on the topic entity tags matched by each of the at least one known topic.
4. The method of claim 3, wherein the recalling each recall topic from the at least one known topic based on the topic entity tags matched by each of the at least one known topic comprises:
for each of the at least one known topic, performing the following operation: determining at least one target topic entity tag matched with the one known topic, wherein a target topic entity tag is a topic entity tag whose content is consistent with a comment entity tag of the comment to be posted;
recalling, from the at least one known topic, the known topics that meet a first recall condition based on the at least one target topic entity tag matched by each known topic, and taking the recalled known topics as the recall topics.
5. The method of claim 4, wherein the recalling, from the at least one known topic, the known topics that meet the first recall condition based on the at least one target topic entity tag matched by each known topic comprises:
determining a first number of matched target topic entity tags, and recalling the known topics for which the first number reaches a first threshold; or
determining the topic tag confidence corresponding to each of the at least one target topic entity tag, determining a topic screening value based on the topic tag confidences, and recalling the known topics whose topic screening value reaches a topic screening threshold.
6. The method of claim 5, wherein the topic tag confidence is determined based on a first number of occurrences of a target topic entity tag in all posted comments associated with the corresponding known topic, and a second number of occurrences in all comment entity tags corresponding to all posted comments associated with that known topic.
7. The method of claim 2, wherein the recalling the respective recall topics based on the first comment semantic information of the comment to be posted comprises:
querying an approximate nearest neighbor search index based on the first comment semantic information of the comment to be posted, to obtain a corresponding known topic set;
and recalling each recall topic from the known topic set based on the cosine similarity between the comment to be posted and each known topic in the known topic set.
8. The method of any one of claims 1 to 7, wherein the screening of at least one first candidate topic that meets the relevance screening condition from the recall topics based on the target association degree between the comment to be posted and each recall topic comprises:
for each recall topic, performing the following operations respectively:
determining a semantic association degree between the comment to be posted and one recall topic based on the first comment semantic information of the comment to be posted and topic semantic information of the one recall topic;
determining the target association degree between the comment to be posted and the one recall topic based on the semantic association degree and a recall association degree corresponding to the one recall topic;
and screening at least one first candidate topic which meets the relevance screening condition from the recall topics based on the target relevance corresponding to each recall topic.
9. The method of claim 1, wherein before the screening of at least one candidate comment that meets the similarity screening condition from the recall comments based on the target similarity between the comment to be posted and each recall comment, the method further comprises at least one of:
recalling each recall comment based on the comment entity tags of the comment to be posted;
recalling each recall comment based on the first comment semantic information of the comment to be posted.
10. The method of claim 9, wherein the recalling each recall comment based on the comment entity tags of the comment to be posted comprises:
determining each comment entity tag corresponding to the comment to be posted based on an entity recognition model, the entity recognition model being trained on comments annotated with comment entity tags;
determining at least one posted comment corresponding to each comment entity tag based on a mapping relation between comment entity tags and posted comments;
recalling each recall comment from the at least one posted comment based on the comment entity tags matched by each of the at least one posted comment.
11. The method of claim 10, wherein the recalling each recall comment from the at least one posted comment based on the comment entity tags matched by each of the at least one posted comment comprises:
for each of the at least one posted comment, performing the following operation: determining at least one target comment entity tag matched with the one posted comment, wherein a target comment entity tag is a comment entity tag matched by both the comment to be posted and the one posted comment;
recalling, from the at least one posted comment, the posted comments that meet a second recall condition based on the at least one target comment entity tag matched by each posted comment, and taking the recalled posted comments as the recall comments.
12. The method of claim 11, wherein the recalling, from the at least one posted comment, the posted comments that meet the second recall condition based on the at least one target comment entity tag matched by each posted comment comprises:
determining a second number of matched target comment entity tags, and recalling the posted comments for which the second number reaches a second threshold; or
determining the comment tag confidence corresponding to each of the at least one target comment entity tag, determining a comment screening value based on the comment tag confidences, and recalling the posted comments whose comment screening value reaches a comment screening threshold, wherein the comment tag confidence is determined based on the entity recognition model.
13. The method of claim 9, wherein the recalling the respective recalled comments based on the first comment semantic information of the comment to be posted comprises:
querying a depth similarity retrieval index based on the first comment semantic information of the comment to be posted, to obtain a corresponding posted comment set;
and recalling each recall comment from the posted comment set based on the cosine similarity between the comment to be posted and each posted comment in the posted comment set.
14. The method of any one of claims 9 to 13, wherein the screening of at least one candidate comment that meets the similarity screening condition from the recall comments based on the target similarity between the comment to be posted and each recall comment comprises:
for each recall comment, performing the following operations respectively:
determining a semantic similarity between the comment to be posted and one recall comment based on the first comment semantic information of the comment to be posted and second comment semantic information of the one recall comment;
determining the target similarity between the comment to be posted and the one recall comment based on the semantic similarity and a recall similarity corresponding to the one recall comment;
and screening at least one candidate comment meeting the similarity screening condition from the recall comments based on the target similarity corresponding to each recall comment.
15. The method of claim 1, wherein the determining the recommended topic corresponding to the comment to be posted based on the obtained at least one first candidate topic and at least one second candidate topic comprises:
determining at least one co-occurrence candidate topic based on the at least one first candidate topic and the at least one second candidate topic, wherein a co-occurrence candidate topic is a candidate topic present in both the at least one first candidate topic and the at least one second candidate topic;
for each of the at least one co-occurrence candidate topic, performing the following operation: determining a target screening value based on the target association degree, the target similarity, and the popularity corresponding to the one co-occurrence candidate topic;
and screening at least one target topic from the at least one co-occurrence candidate topic based on each target screening value, and taking the screened at least one target topic as the recommended topic corresponding to the comment to be posted.
16. The method of claim 15, wherein the popularity is determined by a number of associations corresponding to the one co-occurrence candidate topic and a number of exposures of posted comments associated with the one co-occurrence candidate topic.
17. A topic recommendation device, the device comprising:
an acquisition unit, configured to obtain a comment to be posted;
a first screening unit, configured to screen, from recall topics, at least one first candidate topic that meets a relevance screening condition based on a target association degree between the comment to be posted and each recall topic;
a second screening unit, configured to screen, from recall comments, at least one candidate comment that meets a similarity screening condition based on a target similarity between the comment to be posted and each recall comment, and to take the topics respectively associated with the at least one candidate comment as corresponding second candidate topics;
a determining unit, configured to determine a recommended topic corresponding to the comment to be posted based on the obtained at least one first candidate topic and at least one second candidate topic.
18. An electronic device comprising a processor and a memory, wherein the memory stores a computer program which, when executed by the processor, causes the processor to perform the steps of the method of any of claims 1 to 16.
19. A computer readable storage medium, characterized in that it comprises a computer program for causing an electronic device to perform the steps of the method of any one of claims 1-16 when said computer program is run on the electronic device.
20. A computer program product, characterized in that it comprises a computer program, said computer program being stored in a computer readable storage medium; when a processor of an electronic device reads the computer program from the computer readable storage medium and executes it, the electronic device is caused to perform the steps of the method of any one of claims 1-16.
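
The screening described in claims 15 and 16 can be illustrated with a minimal Python sketch. The claims state only which quantities feed the target screening value and the popularity, not how they are combined, so the weighted sum, the associations-per-exposure ratio, the default weights, and the names `popularity`, `recommend_topics`, `topic_stats`, and `top_k` are all assumptions made for illustration rather than the claimed implementation.

```python
from typing import Dict, List, Tuple


def popularity(num_associations: int, num_exposures: int) -> float:
    """Hypothetical popularity: associations per exposure.

    Claim 16 names only the two inputs; the ratio is an assumption.
    """
    if num_exposures <= 0:
        return 0.0
    return num_associations / num_exposures


def recommend_topics(
    first_candidates: Dict[str, float],       # topic -> target association degree
    second_candidates: Dict[str, float],      # topic -> target similarity
    topic_stats: Dict[str, Tuple[int, int]],  # topic -> (associations, exposures)
    weights: Tuple[float, float, float] = (0.4, 0.4, 0.2),  # assumed weights
    top_k: int = 3,
) -> List[str]:
    # Co-occurrence candidate topics: present in both candidate sets.
    co_occurring = first_candidates.keys() & second_candidates.keys()

    w_assoc, w_sim, w_pop = weights
    screening_value = {}
    for topic in co_occurring:
        associations, exposures = topic_stats.get(topic, (0, 0))
        # Assumed combination: a weighted sum of the three quantities.
        screening_value[topic] = (
            w_assoc * first_candidates[topic]
            + w_sim * second_candidates[topic]
            + w_pop * popularity(associations, exposures)
        )

    # Keep the top_k topics by screening value; ties are broken by name.
    ranked = sorted(co_occurring, key=lambda t: (-screening_value[t], t))
    return ranked[:top_k]
```

In this sketch a topic that appears in only one of the two candidate sets is never recommended, which mirrors the co-occurrence requirement of claim 15; any other screening rule, such as a threshold on the screening value, would fit the claim wording equally well.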
CN202111435283.9A 2021-11-29 2021-11-29 Topic recommendation method, device, electronic equipment and storage medium Pending CN116186197A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111435283.9A CN116186197A (en) 2021-11-29 2021-11-29 Topic recommendation method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111435283.9A CN116186197A (en) 2021-11-29 2021-11-29 Topic recommendation method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116186197A true CN116186197A (en) 2023-05-30

Family

ID=86449477

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111435283.9A Pending CN116186197A (en) 2021-11-29 2021-11-29 Topic recommendation method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116186197A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116663505A (en) * 2023-07-31 2023-08-29 厦门起量科技有限公司 Comment area management method and system based on Internet
CN116663505B (en) * 2023-07-31 2023-10-13 厦门起量科技有限公司 Comment area management method and system based on Internet
CN117312500A (en) * 2023-11-30 2023-12-29 山东齐鲁壹点传媒有限公司 Semantic retrieval model building method based on ANN and BERT
CN117312500B (en) * 2023-11-30 2024-02-27 山东齐鲁壹点传媒有限公司 Semantic retrieval model building method based on ANN and BERT

Similar Documents

Publication Publication Date Title
CN111444428B (en) Information recommendation method and device based on artificial intelligence, electronic equipment and storage medium
Gabriel De Souza et al. Contextual hybrid session-based news recommendation with recurrent neural networks
US8886589B2 (en) Providing knowledge content to users
WO2023065211A1 (en) Information acquisition method and apparatus
CN102197394B (en) Digital image retrieval by aggregating search results based on visual annotations
US20150332672A1 (en) Knowledge Source Personalization To Improve Language Models
CN113254711B (en) Interactive image display method and device, computer equipment and storage medium
US11126682B1 (en) Hyperlink based multimedia processing
CN112163428A (en) Semantic tag acquisition method and device, node equipment and storage medium
CN116186197A (en) Topic recommendation method, device, electronic equipment and storage medium
CN115114395A (en) Content retrieval and model training method and device, electronic equipment and storage medium
CA2932865A1 (en) Pipeline computing architecture and methods for improving data relevance
Rawat et al. A comprehensive study on recommendation systems their issues and future research direction in e-learning domain
Meng et al. A survey of personalized news recommendation
CN114741587A (en) Article recommendation method, device, medium and equipment
CN116578729B (en) Content search method, apparatus, electronic device, storage medium, and program product
US20230214676A1 (en) Prediction model training method, information prediction method and corresponding device
CN115129885A (en) Entity chain pointing method, device, equipment and storage medium
CN116484085A (en) Information delivery method, device, equipment, storage medium and program product
Shah et al. Multimodal semantics and affective computing from multimedia content
Sathish et al. Evolving the User Graph: From unsupervised topic models to knowledge assisted networks
Yu et al. News recommendation model based on encoder graph neural network and bat optimization in online social multimedia art education
Liao et al. Crowd knowledge enhanced multimodal conversational assistant in travel domain
CN118035945B (en) Label recognition model processing method and related device
Chen et al. Expert2Vec: distributed expert representation learning in question answering community

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination