CN116932703B - User controllable content generation method, device, equipment and medium

Info

Publication number
CN116932703B
CN116932703B
Authority
CN
China
Prior art keywords: entity; information base; sampling; static data; session request
Prior art date
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number
CN202311207343.0A
Other languages
Chinese (zh)
Other versions
CN116932703A (en)
Inventor
Li Feng (李峰)
Zhang Xiaolan (张潇澜)
Current Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd filed Critical Suzhou Metabrain Intelligent Technology Co Ltd
Priority to CN202311207343.0A priority Critical patent/CN116932703B/en
Publication of CN116932703A publication Critical patent/CN116932703A/en
Application granted granted Critical
Publication of CN116932703B publication Critical patent/CN116932703B/en


Classifications

    • G06F16/3344: Information retrieval of unstructured textual data; query execution using natural language analysis
    • G06F16/322: Indexing of unstructured textual data; index structures; trees
    • G06F16/3329: Querying; natural language query formulation or dialogue systems
    • G06F16/3334: Query translation; selection or weighting of terms from queries, including natural language queries
    • G06F16/367: Creation of semantic tools, e.g. ontology or thesauri; ontology
    • G06N3/045: Neural networks; combinations of networks
    • G06N3/0455: Auto-encoder networks; encoder-decoder networks
    • G06N3/047: Probabilistic or stochastic networks
    • G06N3/048: Activation functions
    • G06N3/084: Backpropagation, e.g. using gradient descent
    • G06N3/088: Non-supervised learning, e.g. competitive learning
    • G06N3/092: Reinforcement learning
    • G06N3/096: Transfer learning
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a user-controllable content generation method, device, equipment and medium in the technical field of artificial intelligence. The method comprises the following steps: acquiring a session request from the current user and generating a corresponding session identification number from the session request; acquiring the corresponding information base according to the session identification number, the information base comprising a static data information base; sampling static data through an information sampling network according to the session request and the information base, and forming an input prefix; and splicing the input prefix with the session request and inputting the result into a pre-trained language model to obtain the corresponding output result. The method and device can accept user-specified controllable attributes on top of a pre-trained language model, realizing controllable text generation with specific content and a specific style.

Description

User controllable content generation method, device, equipment and medium
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a medium for generating user-controllable content.
Background
In recent years, with ever-increasing model sizes, large language models (LLMs) have brought revolutionary technological change to the field of natural language processing. LLMs are known for their excellent performance and strong adaptability across NLP (Natural Language Processing) downstream tasks. LLMs are therefore used as the base models of natural language processing and are being applied ever more widely to language processing and content generation tasks.
However, because an LLM is large and difficult to update, the content it outputs is strongly shaped by its training corpus, and it cannot respond correctly to, or refresh its knowledge of, content outside that corpus. In multi-turn dialogue, the limited input length and the uncertainty of an LLM's in-context learning cause the model's replies to lack consistency with earlier turns. And because an LLM cannot update its parameters in real time, even if the user corrects the model's output during a dialogue, the model still gives the wrong answer when the same question recurs. In addition, under the influence of its fixed training corpus, the model often cannot directly generate text that satisfies core-content constraints or specific style requirements. Exploring controllable text generation on top of large language models has therefore become a key problem in the field of natural language processing.
Disclosure of Invention
In order to solve at least one of the problems mentioned in the background, the application provides a user-controllable content generation method, device, equipment and medium that can accept user-specified controllable attributes on top of a pre-trained language model and realize controllable text generation with specific content and a specific style.
The specific technical scheme provided by the embodiment of the application is as follows:
in a first aspect, a method for generating user-controllable content is provided, including:
acquiring a session request of a current user, and generating a corresponding session identification number according to the session request;
acquiring a corresponding information base according to the session identification number, wherein the information base comprises a static data information base;
sampling static data through an information sampling network according to the session request and the information base, and forming an input prefix;
and splicing the input prefix with the session request and inputting the result into a pre-trained language model to obtain a corresponding output result.
Further, the information base also comprises a dynamic data information base.
Further, sampling the static data and forming the input prefix according to the session request and the information base through the information sampling network further includes:
sampling static data through the information sampling network according to the session request and the static data information base, and forming a first input prefix;
and sampling dynamic data through the information sampling network according to the session request and the dynamic data information base, and forming a second input prefix.
Further, the static data includes object entities and relationships between the object entities, and the static data information base is constructed by:
sampling the static data in the form of a binary tree structure, wherein the leaf nodes of the binary tree structure correspond to the object entities;
and storing the leaf nodes of the binary tree structure and the relations between them as triples, so as to store the static data related to the current session identification number.
Further, the triplet includes a first entity, a second entity, and a relationship between the first entity and the second entity, wherein the first entity and the second entity correspond to different leaf nodes in the binary tree structure, respectively, and the relationship between the first entity and the second entity corresponds to the relationship between the different leaf nodes.
Further, the step of completing the sampling of the static data and forming an input prefix according to the session request and the information base through the information sampling network includes:
and sampling, by the information sampling network, from the root node of the binary tree structure according to the session request and the constraint data in the static data information base, following the triple structure, so as to form an input prefix for the pre-trained language model.
Further, the method further comprises:
and if the current node is a non-leaf node, continuing to sample, through the information sampling network, all the relations and sub-entities it contains until a leaf node is reached.
Further, a leaf node is either a common node or a sampling end marker node.
Further, the method further comprises:
in response to detecting that a leaf node sampled by the information sampling network is the sampling end marker node, terminating sampling;
and in response to detecting that the leaf node sampled by the information sampling network is a common node, if the common node has an unsampled sibling node, continuing to sample the sibling node until its leaf nodes have been sampled.
Further, the step of completing the sampling of the dynamic data and forming a second input prefix according to the session request and the dynamic data information base through the information sampling network includes:
sampling dynamic data according to the session request through the information sampling network, so as to update the dynamic data information base in real time;
wherein the dynamic data information base stores the dialogue data of the most recent first preset number of turns of the current user's session.
Further, sampling the dynamic data and forming the second input prefix according to the session request and the dynamic data information base through the information sampling network further includes:
outputting, through the information sampling network, the importance of the dialogue data of the most recent first preset number of turns of the current session, sorting the turns by importance from high to low, and taking the top second preset number of turns as the second input prefix, which is spliced in reverse order and input into the pre-trained language model;
wherein the second preset number of turns is not greater than the first preset number.
Further, the information sampling network comprises a tokenizer layer, an embedding layer, an encoding layer, a mask layer and a softmax (normalized exponential function) layer.
Further, the tokenizer layer converts input text into computable tokens; the embedding layer converts the tokens into a word-vector matrix; the encoding layer applies nonlinear transformations and matrix computations to all word-vector matrices and outputs a vector whose length matches the number of input tokens; the mask layer masks out and filters the outputs unrelated to the objects being sampled; and the softmax layer passes the mask layer's output through a normalized exponential function to obtain a probability distribution over the sampled objects.
Further, the method further comprises:
and forming dialogue data from the output result and the session request, and performing verification and updating against the static data information base.
Further, the forming the output result and the session request into dialogue data and performing verification and update according to the static data information base includes:
and in response to detecting that, for any triple in the dialogue data, the first entity and the relation agree with a triple in the static data information base but the second entity does not, updating the second entity of the corresponding triple in the static data information base.
Further, the forming the output result and the session request into dialogue data and performing verification and update according to the static data information base includes:
in response to detecting that the relation between the first entity and the second entity of any triple in the dialogue data does not exist in the static data information base, adding, under the first entity in the static data information base, the corresponding relation and the corresponding second entity.
Further, the forming the output result and the session request into dialogue data and performing verification and update according to the static data information base includes:
and in response to detecting that the first entity of any triple in the dialogue data does not exist in the static data information base, adding the triple corresponding to the first entity to the static data information base.
In a second aspect, there is provided a user-controllable content generation apparatus, the apparatus comprising:
the communication module is used for acquiring a session request of a current user and generating a corresponding session identification number according to the session request;
the reading module is used for acquiring a corresponding information base according to the session identification number, wherein the information base comprises a static data information base;
the sampling module is used for sampling static data through an information sampling network according to the session request and the information base, and forming an input prefix;
and the control module is used for splicing the input prefix with the session request and inputting the result into a pre-trained language model to obtain a corresponding output result.
In a third aspect, an electronic device is provided comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the user controllable content generation method when executing the computer program.
In a fourth aspect, a computer-readable storage medium is provided, storing computer-executable instructions for performing the user-controllable content generation method.
The embodiment of the application has the following beneficial effects:
1. The user-controllable content generation method, device, equipment and medium can sample static data and provide a construction and retrieval technique for the static data information base, so that user-specified controllable attributes can be accepted on top of a pre-trained language model, realizing controllable text generation with user-specific content and style;
2. The method can also sample dynamic data and splice it into the pre-trained model as an input prefix, making the model output more accurate and improving multi-turn dialogue capability; at the same time, the information can be condensed and filtered so that excess information, which would raise the computation cost of the language model, is not added to the input prefix, reducing the features fed to the model and lowering its computational load;
3. Sampling static data through the binary tree structure and triple storage facilitates the subsequent consistency checking and updating of the static data, making model output more accurate and keeping memory consistent.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. It is apparent that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 shows a general flow chart of a user-controllable content generation method provided by an embodiment of the present application;
FIG. 2 illustrates an example of the static data binary tree structure of the user-controllable content generation method according to one embodiment of the present application;
FIG. 3 illustrates a schematic diagram of a sampling process based on the information sampling network of FIG. 2, according to one embodiment of the present application;
FIG. 4 illustrates a schematic diagram of an information sampling network architecture according to one embodiment of the present application;
fig. 5 shows a schematic structural diagram of a user-controllable content generating device provided in an embodiment of the present application;
FIG. 6 illustrates an exemplary system that may be used to implement various embodiments described herein.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
It should be understood that throughout the description of this application, unless the context clearly requires otherwise, the words "comprise," "comprising," and the like in the description and the claims are to be construed in an inclusive sense rather than an exclusive or exhaustive sense; that is, it is the meaning of "including but not limited to".
It should also be appreciated that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more.
Example 1
The application provides a user-controllable content generation method, referring to fig. 1, the method includes:
s1, acquiring a session request of a current user, and generating a corresponding session identification number according to the session request;
s2, acquiring a corresponding information base according to the session identification number, wherein the information base comprises a static data information base;
s3, sampling static data through the information sampling network according to the session request and the information base, and forming an input prefix;
s4, splicing the input prefix with the session request and inputting the result into a pre-trained language model to obtain a corresponding output result.
In some embodiments, the method further comprises:
S5, forming dialogue data from the output result and the session request, verifying and updating against the static data, and updating the information sampling network according to the dialogue data.
Specifically, the application is based on a natural-language-text human-machine interaction system, which may comprise a user interaction page, a front-end server, a back-end server and a general storage device. For example, the user interaction page may include a user input box, a presentation area for the user's input, and a dialogue display box; each reply carries like and dislike buttons so that the user can rate satisfaction with the answer to the posed question. The front-end server stores the code and interaction logic needed to develop the user interaction page. The back-end server is mainly a high-performance server containing acceleration chips, used to store and launch the system's background services and business logic and to run the update operations of the related models. The general storage device stores all code and information base data and may also take the form of cloud-server storage.
First, a user logs into the system interaction page through the client, and the system uniquely identifies the user by a user ID. A long-lived connection initiated between a user and the system is called a session, within which the user can interact with the system over multiple turns; for convenience of description, each turn of interaction is referred to as a dialogue. Within one session, the content the user sends to the system is called a session request (query), and the content the system returns is called a response. For example, when the user logs in for the first time, the system creates a new session from the user ID, generates a corresponding session identification number from the session request, and saves that session ID; the session ID is then used to load the information base associated with the session, providing the information required by subsequent dialogues. When a user who is not logging in for the first time logs in, the corresponding session ID is looked up by the user ID and the previously saved session is woken up. For each session, when it is created or reloaded, the count of consecutive dialogue turns (Ntc) in the current session is initialized to 0. If there is no interaction beyond a certain time, the session is automatically disconnected, and when the connection is re-established the session is considered reloaded. In particular, a returning user may still choose whether to re-create a session; creating a new session simultaneously initializes a corresponding blank information base, and interaction within it will not retrieve information stored in the user's previously created sessions.
The system obtains the corresponding information base from the session ID; the information base may comprise a static data information base. The static data information base stores static information, described below, which records object entities and the relations between them, for example the entity "chatbot" (chat robot), the relation "name", and the entity "Xiaoyuan".
By sampling the static data and providing a static data information base construction and retrieval technology for the static data, controllable specified attributes of a user can be received based on a pre-training language model, so that controllable text generation for specific content and specific style of the user is realized.
In some embodiments, the information store further comprises a dynamic data information store.
In some embodiments, S3 further comprises:
s31, completing sampling of static data and forming a first input prefix according to a session request and a static data information base through an information sampling network;
s32, completing the sampling of the dynamic data and forming a second input prefix according to the session request and the dynamic data information base through the information sampling network.
Specifically, the dynamic data information base stores dynamic information, namely the dialogue content of the current user over the most recent first preset number of turns, for example the user's request "let me give you a name, how about Xiaoyuan" and the returned reply "OK, my name is Xiaoyuan". In particular, the sampled static information is stored in the memory of the computer device, while the sampled dynamic information is stored in the cache. As the current user carries out multiple turns of dialogue, the dialogue content of a preset number of turns is continuously stored; taking 20 turns as an example, if the current user has carried out 21 turns of dialogue, only the most recent 20 turns are retained, i.e. the content of the first turn is dropped. This keeps the dynamic information timely and prevents oversampled information from complicating computation. It should be noted that there is no ordering requirement between forming the first input prefix and forming the second input prefix: either may be formed first, or both may be formed simultaneously. In particular, when splicing input prefixes, the first input prefix is generally placed before the second input prefix.
In particular, the system may also obtain corresponding constraint data of the information base, which may include header data. The information retrieval part mainly comprises the information base and the information sampling network. The information sampling network samples static and dynamic data according to the session request and then generates an input prefix, which can conveniently be input into the pre-trained language model to obtain an output result. After the input prefix generated by sampling is spliced with the user request (query), the two are fed together into the pre-trained language model to obtain the model output.
The pre-trained language model here is a large language model (LLM): a model based on machine learning and natural language processing techniques that learns to serve human language understanding and generation by training on large amounts of text data. The core idea of an LLM is to learn the patterns and structures of natural language through extensive unsupervised training, which can to some extent simulate human language cognition and generation. Compared with traditional NLP models, an LLM can better understand and generate natural text, and also exhibits a certain capacity for logical thinking and reasoning. Pre-training belongs to the category of transfer learning: the model parameters are no longer randomly initialized, but are first pre-trained on some tasks to obtain a set of parameters, with which the model is then initialized before further training. The pre-trained language model referred to here may be a large model with a Transformer-based decoder structure, such as the GPT family or Yuan 1.0; a model using an encoder stack, such as BERT; or a model using an encoder-plus-decoder structure, such as T5. No specific choice is imposed here.
In particular, a Transformer model is a neural network that learns context, and thus meaning, by tracking relationships in sequence data (e.g., the words in this sentence). The Transformer employs an evolving set of mathematical techniques, known as attention or self-attention, to detect even subtle ways in which distant data elements in a series influence and depend on each other. The Transformer uses a self-attention mechanism to compute the relationships between all elements of the sequence in parallel, which gives it significant advantages in computational efficiency and in capturing long-range dependencies. In addition, the multi-head self-attention (Multi-Head Attention) structure of the Transformer can capture a variety of different dependencies. Multi-head self-attention is a core component of the Transformer model: it comprises multiple parallel self-attention heads, each of which can learn different dependencies, and its output is the concatenation of the individual heads' outputs, which is then linearly transformed into the final result.
The results generated by the language model are returned in order by request ID and finally output to the reply display area of the user interaction page.
Finally, the model's generated result is combined with the user's session request to form a piece of dialogue data <Q, R>; the static data is checked against this dialogue data for consistency and, if inconsistent, updated; and the information sampling network is updated according to the consecutive dialogue turns and the user's single-dialogue satisfaction. Unlike a prefix-generation model trained on traditional offline data, the method samples dynamic as well as static data and provides construction, retrieval and update techniques for the static information base, so that text content in a specified style can be generated while the consistency of long- and short-term memory across multiple turns of dialogue is maintained. Whether the generated content or a dialogue reply meets the user's expectations can be evaluated, and the system updated, in real time, improving the user's willingness to interact and the user's rated satisfaction; the system thus has very strong transferability and adaptability, more accurate model output, better multi-turn dialogue capability, and higher adaptability to different scenes.
In some embodiments, the static data includes object entities and relationships between the object entities, and the static data information base is constructed by:
sampling static data in the form of a binary tree structure, wherein the leaf nodes of the binary tree structure correspond to object entities;
and storing the leaf nodes of the binary tree structure and the relations between them as triples, so as to store the static data related to the current session identification number.
Specifically, information sampling covers both dynamic and static data. For static data, the information base stores the knowledge graph related to the current session identification number (session ID) as triples over a binary tree structure; concretely, this may include conversational robot information (chatbot profile) and user information (user profile). Fig. 2 is an example of sampling the information base's static data for a conversational robot (chatbot). With these technical means, static data can be sampled and updated via the binary tree structure and triple storage: the information is condensed, the triple form is compact and easy for a model to recognize and convenient to update later, the consistency of the static data can be guaranteed, the data added to the input prefix stays small, and the computation cost of the language model is reduced.
In some embodiments, based thereon, S3 comprises:
s301, sampling, by the information sampling network, from the root node of the binary tree structure according to the session request and the constraint data in the static data information base, to form an input prefix for the pre-trained language model;
and S302, if the current node is a non-leaf node, continuing to sample the relations and sub-entities it contains through the information sampling network until a leaf node is reached.
In some embodiments, the leaf nodes include any one of a normal node and an end-of-sample marker node, based on which S3 further includes:
s303, in response to detecting that the leaf node sampled by the information sampling network is a sampling end marker node, terminating the sampling;
and S304, in response to detecting that the leaf node sampled by the information sampling network is a common node, if the common node has an unsampled sibling node, continuing to sample the sibling node until its leaf nodes have been sampled.
Illustratively, node a, left link i and left child node b represent a triple <entity a, relation i, entity b>, while node a's right link j and right child node c mark c as a sibling node (under relation j); the right child node c and its parent node d form another triple <entity d, relation j, entity c>. That is, each sampled relation between entities of the static data forms part of the language model's input prefix according to the triple structure "the relation between entity a and entity b is relation i". With the binary tree structure and triple storage, the relations between leaf nodes can be stored compactly as triples, so static data can be sampled in a simple manner; the information is condensed into a form a model easily recognizes and that is convenient to update later, the consistency of the static data is guaranteed while the data added to the input prefix stays small, and the computation cost of the language model is reduced. In particular, since a leaf node may be a sampling end marker node, and sampling ends when the output of the information sampling network is detected to fall in the interval corresponding to that node, the input length of the static-information part of the input prefix is determined by the current user's session situation, i.e. by the user's session intention, so this part of the input length is also controllable. Meanwhile, to prevent ring structures in the knowledge graph from making the sampling process non-terminating, it is stipulated that sampling also terminates when a newly sampled entity is an already-sampled entity.
In some embodiments, the triple includes a first entity, a second entity, and the relation between them, where the first and second entities correspond to different leaf nodes of the binary tree structure and the relation corresponds to the relation between those leaf nodes. Referring to fig. 2 and table 1, conversational robot information is sampled from the root node by the information sampling network, according to the session request and the constraint data in the information base, following the triple structure (first entity, second entity, and the relation between them). If the current node is a non-leaf node, the sampling network continues to sample the relations and sub-entities it contains until the sampled entity is a leaf node, or until the output of the sampling network falls in the interval corresponding to the end marker, at which point the sampling of conversational robot information terminates. It should be noted that, to prevent ring structures in the knowledge graph from making the sampling process non-terminating, it is provided that sampling ends when a newly sampled entity is an already-sampled entity. The information sampling network then continues to sample the user information in the order first entity, relation, second entity. With this technical scheme, static data can be sampled by traversing leaf and non-leaf nodes through the binary tree structure and triple storage: on the one hand, the information is condensed and filtered so that excess information is not added to the input prefix, which would raise the language model's computation cost; on the other hand, subsequent consistency checking and updating of the static data is made easier, so model output is more accurate and memory stays consistent.
Table 1. Examples of the static data triple representation

No. | First entity | Relation    | Second entity             | Root node?
0   | Chatbot      | name        | Yuan                      | yes
1   | Chatbot      | is          | dialogue AI               | no
2   | Yuan         | is          | large language model      | no
3   | Yuan         | released in | September 2021            | no
4   | dialogue AI  | can perform | chat dialogue             | no
5   | dialogue AI  | can perform | writing                   | no
6   | dialogue AI  | can perform | recommendation/suggestion | no
7   | User         | null        | null                      | yes
In some embodiments, S5 comprises:
and S501, in response to detecting that, for any triple in the dialogue data, the first entity and the relation agree with a corresponding triple in the static data information base but the second entity does not, updating the second entity of the corresponding triple in the static data information base.
In some embodiments, S5 further comprises:
s502, in response to detecting that the relation between the first entity and the second entity of any triple in the dialogue data does not exist in the static data information base, adding, under the first entity in the static data information base, the corresponding relation and the corresponding second entity.
In some embodiments, S5 further comprises:
s503, in response to detecting that the first entity of any triple in the dialogue data does not exist in the static data information base, adding the triple corresponding to the first entity to the static data information base.
Specifically, after the dialogue data <Q, R> is formed, an information extraction algorithm extracts the entities, relations, events, times and other information in it, forming triples <entity a, relation i, entity b> together with some related information. The triples obtained by information extraction are compared, entity by entity, with the static data list sampled for the user request, to judge whether the information is consistent; if consistent, the check passes, and if inconsistent, or if the same entity is not found, the static data update module is entered to update the static data. The information list is retrieved according to the <entity a> generated from the <Q, R> pair. For example, for first entity a: if relation i exists but entity b is inconsistent with the stored information, the information list is updated under <entity a, relation i> and the original sub-entity is replaced with entity b; if relation i does not exist, relation i and the corresponding sub-entity b are added under first entity a; and if first entity a does not exist in the static data sampled from the information base, the triple <entity a, relation i, entity b> is added directly to the static database. Because the triple structure comprises the first entity, the relation and the second entity, and stores every first entity and second entity together with the relation between them, the static data can be updated along different dimensions (second entity inconsistent, relation missing, or first entity missing); and thanks to the triple's compact, model-friendly form, the static database can be updated simply, quickly and comprehensively, guaranteeing the consistency of the static data.
In some embodiments, S32 further comprises:
s321, sampling dynamic data according to the session request through the information sampling network, so as to update the dynamic data information base in real time; the dynamic data information base stores the dialogue data of the most recent first preset number of turns of the current user's session.
In some embodiments, S32 further comprises:
s322, outputting, through the information sampling network, the importance of the dialogue data of the first preset number of turns of the current session, sorting the dialogue data by importance from high to low, and taking the top second preset number of turns as the second input prefix, spliced in reverse order and input into the pre-trained language model;
wherein the second preset number of turns is not greater than the first preset number.
In some embodiments, the information sampling network includes a tokenizer layer, an embedding layer, an encoding layer, a mask layer, and a softmax (normalized exponential function) layer. The tokenizer layer converts input text into computable tokens; the embedding layer converts the tokens into a word-vector matrix; the encoding layer applies nonlinear transformations and matrix computations to all word-vector matrices and outputs a vector whose length matches the number of input tokens; the mask layer masks out and filters the outputs unrelated to the objects being sampled; and the softmax layer passes the mask layer's output through a normalized exponential function to obtain a probability distribution over the sampled objects.
Specifically, the information sampling network may also sample dynamic data according to the user's session request. Unlike static data sampling, the dynamic data in the dynamic information base consists of the dialogue data of the first preset number of turns (e.g. the most recent K turns) of the user's current session, and the information sampling network outputs an importance for each of the K turns. The program then sorts the K turns by importance from high to low, takes the top second preset number of turns (e.g. the top M turns, with M ≤ K), splices them in reverse order, and inputs them into the pre-trained language model, i.e. the dialogue with the highest importance is spliced last; see fig. 3.
Referring to fig. 4, the information sampling network comprises a tokenizer layer, an embedding layer, an encoding layer, a mask layer and a softmax (normalized exponential function) layer. The tokenizer layer converts the input text into computable values (tokens); the embedding layer converts the input's tokens into a word-vector matrix; the encoding layer applies nonlinear transformations and matrix computations to all word vectors and outputs an N x 1 vector matching the number N of input tokens. Here a token is a minimal unit with independent semantics: each token represents an independent unit with a certain meaning that a model can process. The mask layer masks out outputs unrelated to the sampled objects so that the sampling probability distribution is concentrated on the set of sampled objects, and the softmax layer passes the mask layer's output through softmax to obtain a probability distribution over the sampled objects. The encoding layer, whose required output dimension can vary with the input dimension, may use an RNN (Recurrent Neural Network) type network or a Transformer-based network such as the encoder part of a BERT model; since the structure of the encoding layer is a technical solution known in the art, it is not elaborated here. Preferably, this scheme adopts a BERT network as the embedding and encoding layers. In addition, with this information sampling network structure, the model can be initialized from pre-trained BERT weights, and the information base and the sampler are then updated and learned online by the subsequent information base update module. These technical means allow dynamic data to be sampled, enrich the feature dimensions of the input prefix, and further improve the accuracy of multi-turn session output and the multi-turn session capability.
In some embodiments, the session data further includes at least one of a successive session round and a single session satisfaction of the current session of the current user, based on which S5 further includes:
s511, treating one generation of an input prefix as a policy, and defining a state value function to describe the expected value obtained by the sampling policy;
and S512, optimizing and updating the information sampling network with a gradient-ascent reinforcement learning algorithm according to the consecutive dialogue turns and the single-dialogue satisfaction, so as to maximize the state value function.
Specifically, in addition to updating the information base, the information sampling network itself may also be updated, using the consecutive dialogue turns of the current user's session and the single-dialogue satisfaction as indicators. Within the current session, the count of consecutive dialogue turns (turn count) is denoted N_tc; after a given reply, the user's evaluation of that reply is denoted V_el. N_tc characterizes how many turns the user is willing to interact with the model in one session: the larger N_tc, the more the user endorses the information the system returns and the more willing the user is to keep interacting with the system. Conversely, when the output is poor, the user generally has no interest in continuing to use it, and N_tc stays low. V_el characterizes the user's acceptance of a single reply returned by the system. By way of example only, and not by way of limitation, V_el may take values in {1, 0, -1}, where 1 indicates the user liked the current reply, 0 indicates no evaluation was made, and -1 indicates the current reply was disliked. In particular, V_el may also be another satisfaction score representing the current user's satisfaction with a single dialogue. Although V_el reflects the user's evaluation of the reply to the current request, in practical applications the user often evaluates the last reply based on the results of several questions and answers, so V_el also reflects a score for the information sampled by the sampler.
Specifically, for the information sampling network, each user request can be regarded as an initial state s_0; the sampling network's first sampling of information from the information base can be regarded as a single action a_0; and splicing the user's session request with the sampled entity/relation (for static data) or dialogue (for dynamic data) forms a new state s_1. After all sampling steps complete in turn, the states are written S = {s_0, s_1, ..., s_n} and the sampling actions output by the network in each state are written A = {a_0, a_1, ..., a_n}. The generation of one input prefix can then be regarded as a set of policies π_θ, where θ are the sampling network parameters. Since both N_tc and V_el characterize the user-experience score that a set of policies ultimately earns, this scheme adopts a policy-gradient reinforcement learning algorithm to update the information sampling network online.
Specifically, the goal for the information sampling network is that, starting from the state of any user input and after a series of sampling actions, the dialogue retention turns and the user evaluation are as good as possible, i.e. the final N_tc and V_el indicators are as good as possible. To this end a state value function is defined:

V^π(S) = E_π[ Q^π(S, A) ]   (1)

where the action value function Q^π(S, A) represents the return the information sampling network obtains, under policy π_θ, after taking the series of actions A in the state set S, and the state value function V^π(S) represents the expected value the sampling network can obtain over the state set S under the sampling policy π_θ.
The goal is to continuously optimize the sampling network's parameters θ according to consecutive dialogue turns and single-dialogue satisfaction, optimizing and updating the information sampling network with a gradient-ascent reinforcement learning algorithm so as to maximize the state value function. The objective function is:

θ* = argmax_θ V^{π_θ}(S)   (2)

i.e. the information sampling network is to learn a set of parameters θ such that, for any request (query) and any state formed by splicing entities/relations/dialogues, it outputs a sampling result that maximizes the value of the state value function. The parameters θ of the sampling network are thereby continuously optimized according to consecutive dialogue turns and single-dialogue satisfaction, so that the value of the state value function, i.e. the expected value obtained, reaches its observed maximum.
Formula (2) is optimized by gradient ascent, with the gradient of the objective J(θ) = V^{π_θ}(S) given by:

∇_θ J(θ) = E_{π_θ}[ Σ_{t=0}^{n} γ^t · Q^π(s_t, a_t) · ∇_θ ln π_θ(a_t | s_t) ]   (3)

where γ denotes the discount rate of the return and n the number of samples. Unlike conventional reinforcement learning, n is not fixed during each dialogue's sampling, so a constant cannot be used in place of a fitted return. To compute formula (3), the following loss function is defined:

L(θ) = - Σ y · ln π_θ   (4)

where y denotes the ideal sampling probability distribution over the information in the information base, its value determined by N_tc and V_el, and π_θ denotes the sampling probability distribution output by the sampling network. Formula (4) uses the cross-entropy loss to compute the difference between the output probability distribution and the desired optimal probability distribution when the information sampling network outputs the probability of sampling some entity/relation. Applying the back-propagation algorithm to formula (4) yields the value of formula (3).
In some embodiments, S512 comprises:
defining a reward function to describe the reward the information sampling network obtains after executing the corresponding action in any state, the reward function taking the consecutive dialogue turns and the single-dialogue satisfaction as parameters;
and optimizing and updating the information sampling network with the gradient-ascent reinforcement learning algorithm according to the output rewards.
In particular, since $Q^{\pi_{\theta}}(s_{t},a_{t})$ represents the return obtained after the sampling network performs action $a_{t}$ in state $s_{t}$, it can be obtained after completing a round of dialogue as:

$$Q^{\pi_{\theta}}(s_{t},a_{t})=\mathbb{E}\left[\sum_{k=t}^{n}\gamma^{\,k-t}\,r_{k}\right] \tag{5}$$

where $r_{k}$ represents the reward the sampling network obtains after performing action $a_{k}$ in state $s_{k}$. For simplicity, the reward $r$ may be considered equal at each sampling step within one round of dialogue, and the reward function is defined as follows:

$$r=f_{\alpha}(N_{tc})+V_{el} \tag{6}$$

where the adjustment factor $\alpha$ is a constant taking a value in $[0,1]$, and the first term $f_{\alpha}(N_{tc})\in[0,1]$ increases with the number of dialogue rounds. Illustratively, if the user's evaluation of the reply is a like, $V_{el}$ takes 1; if there is no evaluation, $V_{el}$ takes 0, in which case $r$ ranges over $[0,1]$ and is determined mainly by the dialogue rounds; if the user's evaluation of the reply is a dislike, $V_{el}$ takes $-1$, in which case $r$ ranges over $[-1,0]$. For the $t$-th action, accumulating all rewards from $t$ to $n$ gives the action value function of action $t$:

$$G_{t}=\sum_{k=t}^{n}\gamma^{\,k-t}\,r_{k} \tag{7}$$

The policy gradient is thus obtained as:

$$\nabla_{\theta}V^{\pi_{\theta}}\approx\sum_{t=1}^{n}G_{t}\,\nabla_{\theta}\log\pi_{\theta}(a_{t}\mid s_{t}) \tag{8}$$

According to equation (8), the parameters of the information sampling network are updated by stochastic gradient ascent:

$$\theta'=\theta+\eta\,\gamma^{t}\,G_{t}\,\nabla_{\theta}\log\pi_{\theta}(a_{t}\mid s_{t}) \tag{9}$$

where $\theta$ is the current sampling network parameter vector, $\theta'$ represents the updated sampling network parameters, $\eta$ is the step size of the gradient ascent, and the constant $\gamma$ is the discount rate.
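Equations (5)–(9) can likewise be sketched in a few lines. The helpers below assume the schematic reward form above (a round-count term plus $V_{el}$); the function names, the round-count term `round_term`, and the optimizer choice are illustrative assumptions, not the patented implementation.

```python
import torch

def round_rewards(n_steps: int, round_term: float, v_el: float) -> list[float]:
    # Equation (6), schematic form: every sampling step in one round gets the
    # same reward, a round-count term in [0, 1] plus the evaluation V_el
    # (1 = like, 0 = no evaluation, -1 = dislike).
    return [round_term + v_el] * n_steps

def discounted_returns(rewards: list[float], gamma: float) -> list[float]:
    # Equations (5)/(7): G_t = sum_{k=t..n} gamma^(k-t) * r_k, computed backward.
    returns: list[float] = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return list(reversed(returns))

def reinforce_update(log_probs: list[torch.Tensor], rewards: list[float],
                     optimizer: torch.optim.Optimizer, gamma: float = 0.99) -> None:
    # Equations (8)/(9): ascend sum_t gamma^t * G_t * grad log pi(a_t | s_t),
    # realized by descending the negated objective with a stochastic optimizer.
    returns = discounted_returns(rewards, gamma)
    loss = -sum((gamma ** t) * g * lp
                for t, (g, lp) in enumerate(zip(returns, log_probs)))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Whether or not the user leaves an evaluation, an update of this kind can be run after every round, matching the behaviour described in the examples below.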
By adopting the above technical means, online network updating based on the policy gradient algorithm can be realized for the information sampling network according to formulas (1) to (9), so that its sampling of the information base approaches the optimal sampling result, the accuracy of information sampling is improved, and the language model output becomes more accurate.
For example, the session request of the current user is first received. When the user inputs the request query = "Let me change your name; from now on you are called Xiaoyuan", the information sampling network first samples the static data according to the request and obtains "the chatbot's name is Yuan"; since there is no history dialogue in the dynamic data, no sampling information is returned from it. After the sampled information is spliced with the original request, the query becomes query = "The chatbot's name is Yuan <n> Let me change your name; from now on you are called Xiaoyuan", which is then passed into the pre-trained language model, and the language model generates the reply res = "OK, I am called Xiaoyuan.". After the user input and the language model reply are spliced, the single-round dialogue content "Let me change your name; from now on you are called Xiaoyuan <n> OK, I am called Xiaoyuan." is obtained. This content is sent to the verification module; the information extraction algorithm extracts the entity, which coreference disambiguation equivalently replaces with "chatbot", giving the information triplet <chatbot, name, Xiaoyuan>. This triplet is compared with the triplet <chatbot, name, Yuan> sampled by the information sampling network: the first entity and the relationship match but the second entity differs, so the second entity, i.e. the object entity, needs to be updated to "Xiaoyuan", and the 0th record in the static data table is updated accordingly. This round of dialogue is stored in the dynamic database. Whether or not the user evaluates the reply, the sampling network updates its parameters according to formulas (1)–(9).
Illustratively, when the user next inputs query = "What is your name?", the sampling network samples the static data according to the query and obtains "the chatbot's name is Xiaoyuan"; it outputs sampling probabilities for the dynamic data, which is arranged in reverse order of probability and spliced with the original request, so that it becomes query = "The chatbot's name is Xiaoyuan <n> Let me change your name; from now on you are called Xiaoyuan. <n> OK, I am called Xiaoyuan. <n> What is your name?". The query is finally passed into the language model, and the language model generates the reply res = "Hello, I am called Xiaoyuan.". The single-round dialogue content obtained after splicing the user input and the language model reply is "What is your name? <n> Hello, I am called Xiaoyuan.". The triplet extracted by the verification module is still <chatbot, name, Xiaoyuan>, so the static library is not updated. This round of dialogue is stored in the dynamic database. Whether or not the user evaluates the reply, the sampling network updates its parameters according to formulas (1)–(9). From the above examples it is easy to see that, with the technical scheme of the present application, multi-round dialogue capability based on a pre-trained language model can be remarkably improved; the system updates the static data and the information sampling network in the information base by means of reinforcement learning, so it has extremely strong migration and adaptation capability and high scene-adaptation capability. Meanwhile, sampling the static data through the information sampling network simplifies and screens the information, so that the computational cost of the language model is not increased by prefixing excessive information. Through reverse-order splicing, the information sampling network can avoid the confusion of multi-round dialogue intentions caused by excessive information, making the output of the language model more accurate. In the above process, the user can also add static data to the information base by inputting a query that specifies the model's session style, thereby realizing session customization in a specific style.
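The round just walked through can be summarized in code. The sketch below assumes placeholder callables `sample_static`, `language_model`, and `extract_triplet` standing in for the information sampling network, the pre-trained language model, and the verification module's extraction algorithm; none of these interfaces are fixed by the patent.

```python
from typing import Callable, Optional

SEP = " <n> "  # separator used when splicing prefix, request, and reply

def run_round(query: str,
              static_store: dict[tuple[str, str], str],
              dynamic_log: list[str],
              sample_static: Callable[[str, dict], str],
              language_model: Callable[[str], str],
              extract_triplet: Callable[[str], Optional[tuple[str, str, str]]]) -> str:
    prefix = sample_static(query, static_store)       # e.g. "the chatbot's name is Yuan"
    spliced = prefix + SEP + query if prefix else query
    reply = language_model(spliced)                   # pre-trained LM generates the reply

    dialogue = query + SEP + reply                    # single-round dialogue content
    triplet = extract_triplet(dialogue)               # verification module's extraction
    if triplet is not None:
        head, relation, tail = triplet
        if static_store.get((head, relation)) != tail:
            static_store[(head, relation)] = tail     # update or insert the object entity
    dynamic_log.append(dialogue)                      # store the round in dynamic data
    return reply
```

A plain dictionary keyed by (first entity, relation) collapses the three verification cases of the embodiment into one assignment: an existing key with a different object entity is updated, while a missing relation or a missing entity is inserted.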
In this embodiment, both static data and dynamic data can be sampled; an information base construction, retrieval, and updating technique is provided for the static data, so that text content in a specified style can be generated while the consistency of long- and short-term memory across multiple dialogues is ensured. Whether the generated content or dialogue reply meets the user's expectations can be evaluated, and updates made, in real time, which improves the user's willingness to interact and the user's evaluation satisfaction, gives the system extremely strong migration and adaptation capability, makes the model output more accurate, improves multi-round dialogue capability, and yields higher adaptability to different scenes.
It should be noted that terms such as "S1" and "S2" are used only for the purpose of naming steps; they are merely for convenience in describing the method of the present application and are not intended to limit the order or sequence of the steps or to limit the present application. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combined solution can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, that combination should be regarded as non-existent and outside the protection scope of the present application.
Example Two
Corresponding to the above embodiment, the present application further provides a user controllable content generating device, and referring to fig. 5, the device includes a communication module, a reading module, a sampling module, and a control module.
The communication module is used for acquiring a session request of a current user and generating a corresponding session identification number according to the session request; the reading module is used for acquiring a corresponding information base according to the session identification number, wherein the information base comprises a static data information base; the sampling module is used for completing the sampling of static data and forming an input prefix according to the session request and the information base through an information sampling network; and the control module is used for splicing the input prefix and the session request and inputting the input prefix and the session request into a pre-training language model to obtain a corresponding output result.
Further, the information base also comprises a dynamic data information base.
Furthermore, the sampling module is further used for completing the sampling of the static data and forming a first input prefix according to the session request and the static data information base through an information sampling network; and the second input prefix is used for completing the sampling of the dynamic data according to the session request and the dynamic data information base through an information sampling network.
Further, the static data includes object entities and relationships between the object entities, and the static data information base is constructed by:
sampling the static data in the form of a binary tree structure, wherein leaf nodes in the binary tree structure correspond to the object entities;
and storing leaf nodes in the binary tree structure and relations among the leaf nodes through triplets so as to store static data related to the current session identification number.
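A minimal sketch of such a store follows, assuming a simple two-level tree and a flat triplet list; the concrete classes `Node` and `StaticStore` and their fields are illustrative, not mandated by the patent.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    # Leaf nodes correspond to object entities; inner nodes group sub-entities.
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None and self.right is None

@dataclass
class StaticStore:
    root: Node
    # Triplets <first entity, relation, second entity>: the relations between
    # leaf nodes, stored for the current session identification number.
    triplets: list[tuple[str, str, str]] = field(default_factory=list)

    def add(self, head: str, relation: str, tail: str) -> None:
        self.triplets.append((head, relation, tail))

store = StaticStore(root=Node("root", left=Node("chatbot"), right=Node("Xiaoyuan")))
store.add("chatbot", "name", "Xiaoyuan")  # e.g. the 0th record in the static data table
```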
Further, the sampling module is further configured to sample, through the information sampling network, from the root node of the binary tree structure according to the session request, with the triplets serving as constraint data in the static data information base, so as to form the input prefix of the pre-training language model.
Further, the sampling module is further configured to, if the current node is a non-leaf node, continue to sample all relationships and all sub-entities contained in the current node through the information sampling network until the current node is sampled to the leaf node.
Further, the leaf nodes comprise any one of common nodes and sampling end mark nodes.
Further, the sampling module is further configured to terminate sampling in response to detecting that the leaf node sampled by the information sampling network is the sampling end marker node; and, in response to detecting that the leaf node sampled by the information sampling network is the common node, if the common node has an un-sampled sibling node, to continue sampling the sibling node until the leaf nodes under the sibling node have been sampled.
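The traversal rules just described (descend through non-leaf nodes, stop at the end-of-sampling marker, move on to un-sampled sibling nodes otherwise) could be sketched as follows, reusing the `Node` shape from the sketch above; `choose` is a hypothetical stand-in for the information sampling network's per-node decision.

```python
END_MARK = "<end-of-sampling>"

def sample_tree(root, choose) -> list[str]:
    # Depth-first walk from the root: at a non-leaf node keep sampling among
    # its children (relations / sub-entities); a common leaf is collected and
    # its un-sampled siblings remain on the stack, so they are visited next;
    # the end-of-sampling marker leaf terminates the walk immediately.
    sampled: list[str] = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.is_leaf:
            if node.label == END_MARK:
                break
            sampled.append(node.label)
            continue
        children = [c for c in (node.left, node.right) if c is not None]
        # `choose` stands in for the information sampling network's decision:
        # it returns the (possibly reordered, possibly filtered) children to visit.
        stack.extend(choose(children))
    return sampled

leaves = sample_tree(store.root, choose=lambda cs: cs)  # degenerate walk: visit all leaves
```

With `choose = lambda cs: cs` the walk degenerates to visiting every leaf; in the embodiment the network would instead keep only the relations and sub-entities relevant to the session request.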
Further, the triplet includes a first entity, a second entity, and a relationship between the first entity and the second entity, wherein the first entity and the second entity correspond to different leaf nodes in the binary tree structure, respectively, and the relationship between the first entity and the second entity corresponds to the relationship between the different leaf nodes.
Further, the sampling module is further used for sampling dynamic data according to the session request through an information sampling network so as to update the dynamic data information base in real time; the dynamic data information base stores dialogue data of the current session of the current user in a first preset round recently.
Further, the sampling module is further configured to output, through the information sampling network, the importance of the dialogue data of the latest first preset rounds of the current session, sort the dialogue data by importance from high to low, and intercept the top second-preset-number of rounds of dialogue data as the second input prefix, which is spliced in reverse order and input into the pre-training language model; wherein the second preset round count is not greater than the first preset round count.
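A minimal sketch of this rank-truncate-reverse step, assuming the importance scores have already been produced by the sampling network (the function name and separator are illustrative):

```python
def build_dynamic_prefix(history: list[str], importance: list[float],
                         second_preset: int, sep: str = " <n> ") -> str:
    # Rank the most recent rounds by the importance the sampling network
    # outputs, keep the top `second_preset` rounds, then splice them in
    # reverse order so the most important retained round ends up closest
    # to the current query.
    ranked = [turn for _, turn in sorted(zip(importance, history),
                                         key=lambda pair: pair[0], reverse=True)]
    return sep.join(reversed(ranked[:second_preset]))
```

Reversing after truncation places the most important retained round nearest the user's query, which is the reverse-order splicing the embodiment relies on to keep multi-round intent unambiguous.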
Further, the information sampling network comprises a word segmentation device layer, an embedding layer, a coding layer, a mask layer and a normalized index function layer.
Further, the word segmentation device (tokenizer) layer is used to convert input text into computable tokens; the embedding layer is used to convert the corresponding tokens into word vector matrices; the coding layer is used to perform nonlinear transformation and matrix calculation on all the word vector matrices and to output vectors consistent in number with the input tokens; the mask layer is used to mask and filter outputs irrelevant to the sampled object; and the normalized index function (softmax) layer is used to pass the mask layer output through a normalized exponential function to obtain the probability distribution of the sampled object.
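The five layers could be sketched as follows, with PyTorch as an illustrative framework; all dimensions, the toy vocabulary, and the scoring head are chosen for the example only, and the tokenizer layer is reduced to a dictionary lookup.

```python
import torch
import torch.nn.functional as F

class InfoSampler(torch.nn.Module):
    def __init__(self, vocab_size: int, dim: int = 64):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab_size, dim)      # embedding layer
        encoder_layer = torch.nn.TransformerEncoderLayer(
            d_model=dim, nhead=4, batch_first=True)
        self.encode = torch.nn.TransformerEncoder(encoder_layer, num_layers=2)  # coding layer
        self.score = torch.nn.Linear(dim, 1)

    def forward(self, token_ids: torch.Tensor, keep: torch.Tensor) -> torch.Tensor:
        x = self.encode(self.embed(token_ids))    # one output vector per input token
        logits = self.score(x).squeeze(-1)
        logits = logits.masked_fill(~keep, float("-inf"))  # mask layer filters irrelevant outputs
        return F.softmax(logits, dim=-1)          # normalized exponential function layer

# Toy tokenizer layer: a whitespace-vocabulary lookup, for illustration only.
vocab = {"<pad>": 0, "chatbot": 1, "name": 2, "Xiaoyuan": 3}
ids = torch.tensor([[vocab["chatbot"], vocab["name"], vocab["Xiaoyuan"], vocab["<pad>"]]])
keep = torch.tensor([[True, True, True, False]])     # filter out the padding position
probs = InfoSampler(vocab_size=len(vocab))(ids, keep)  # distribution over sampled objects
```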
Further, the device also comprises a verification updating module which is used for forming dialogue data between the output result and the session request and performing verification updating according to the static data information base.
Further, the verification updating module is further configured to, in response to detecting that, for any triplet in the dialogue data, the first entity and the relationship between the first entity and the second entity are consistent with a corresponding triplet in the static data information base but the second entity is inconsistent, update the second entity in the corresponding triplet of the static data information base.
Further, the verification updating module is further configured to, in response to detecting that a relationship between a first entity and a second entity of any triplet in the dialogue data does not exist in the static data information base, newly add, in the static data information base, a relationship between the first entity and the second entity corresponding to the first entity and the corresponding second entity.
Further, the verification updating module is further configured to, in response to detecting that the first entity of any triplet in the dialogue data does not exist in the static data information base, newly add a triplet corresponding to the first entity in the static data information base.
Further, the dialogue data further comprises at least one of the successive dialogue rounds and the single-dialogue satisfaction of the current session of the current user, and the verification updating module is further configured to take one generation of the input prefix as the policy and define a state value function describing the expectation of the value obtainable by the sampling policy; and to optimize and update the information sampling network based on a gradient-ascending reinforcement learning algorithm according to the successive dialogue rounds and the single-dialogue satisfaction, so as to maximize the state value function result.
Further, the verification updating module is further configured to describe a return obtained after the information sampling network performs a corresponding action in any state by defining a reward function, where the reward function uses the continuous dialogue round and the single dialogue satisfaction as parameters; and optimizing and updating the information sampling network based on the gradient ascending reinforcement learning algorithm according to the output rewards.
Further, the state value function is calculated using the following formula:

$$V^{\pi}(S)=\mathbb{E}_{A\sim\pi}\big[Q^{\pi}(S,A)\big]$$

wherein the action value function $Q^{\pi}(S,A)$ represents the return obtained after the information sampling network, using policy $\pi$, performs a series of actions $A$ in the state set $S$; and the state value function $V^{\pi}(S)$ represents the expectation of the value the information sampling network can obtain over the state set $S$ under sampling policy $\pi$.
Further, the reward function $r$ is calculated using the following formula:

$$r=f_{\alpha}(N_{tc})+V_{el}$$

wherein $r$ represents the reward the information sampling network obtains after performing an action in a given state; $\alpha$ is the adjustment factor; $N_{tc}$ is the number of successive dialogue turns, on which the first term $f_{\alpha}(N_{tc})\in[0,1]$ monotonically increases; and $V_{el}$ is the single-dialogue satisfaction.
For specific limitations of the user-controllable content generating device, reference may be made to the limitations in the user-controllable content generating method embodiment above, which are not repeated here. The respective modules in the above user-controllable content generating device may be implemented in whole or in part by software, by hardware, or by a combination thereof. The above modules may be embedded in hardware, may be independent of the processor in the computer device, or may be stored as software in a memory of the computer device, so that the processor can call them and execute the operations corresponding to the above modules.
Example Three
Corresponding to the above embodiment, the present application further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor may implement the above-mentioned user-controllable content generating method when executing the program.
As shown in fig. 6, in some embodiments, the system can serve as the above-described electronic device for the user-controllable content generation method of any of the described embodiments. In some embodiments, the system may include one or more computer-readable media (e.g., system memory or NVM/storage) having instructions, and one or more processors coupled with the one or more computer-readable media and configured to execute the instructions to implement the modules and perform the actions described herein.
For one embodiment, the system control module may include any suitable interface controller to provide any suitable interface to at least one of the processor(s) and/or any suitable device or component in communication with the system control module.
The system control module may include a memory controller module to provide an interface to the system memory. The memory controller modules may be hardware modules, software modules, and/or firmware modules.
The system memory may be used, for example, to load and store data and/or instructions for the system. For one embodiment, the system memory may include any suitable volatile memory, such as, for example, a suitable DRAM. In some embodiments, the system memory may comprise double data rate type four synchronous dynamic random access memory (DDR 4 SDRAM).
For one embodiment, the system control module may include one or more input/output (I/O) controllers to provide an interface to the NVM/storage device and the communication interface(s).
For example, NVM/storage may be used to store data and/or instructions. The NVM/storage may include any suitable nonvolatile memory (e.g., flash memory) and/or may include any suitable nonvolatile storage device(s) (e.g., one or more Hard Disk Drives (HDDs), one or more Compact Disc (CD) drives, and/or one or more Digital Versatile Disc (DVD) drives).
The NVM/storage may include a storage resource that is physically part of the device on which the system is installed or it may be accessed by the device without being part of the device. For example, the NVM/storage may be accessed over a network via the communication interface(s).
The communication interface(s) may provide an interface for the system to communicate over one or more networks and/or with any other suitable device. The system may wirelessly communicate with one or more components of a wireless network in accordance with any of one or more wireless network standards and/or protocols.
For one embodiment, at least one of the processor(s) may be packaged together with logic of one or more controllers (e.g., memory controller modules) of the system control module. For one embodiment, at least one of the processor(s) may be packaged together with logic of one or more controllers of the system control module to form a System In Package (SiP). For one embodiment, at least one of the processor(s) may be integrated on the same die as logic of one or more controllers of the system control module. For one embodiment, at least one of the processor(s) may be integrated on the same die with logic of one or more controllers of the system control module to form a system on chip (SoC).
In various embodiments, the system may be, but is not limited to being: a server, workstation, desktop computing device, or mobile computing device (e.g., laptop computing device, handheld computing device, tablet, netbook, etc.). In various embodiments, the system may have more or fewer components and/or different architectures. For example, in some embodiments, a system includes one or more cameras, a keyboard, a Liquid Crystal Display (LCD) screen (including a touch screen display), a non-volatile memory port, multiple antennas, a graphics chip, an Application Specific Integrated Circuit (ASIC), and a speaker.
It should be noted that the present application may be implemented in software and/or a combination of software and hardware, for example, using Application Specific Integrated Circuits (ASIC), a general purpose computer or any other similar hardware device. In one embodiment, the software programs of the present application may be executed by a processor to implement the steps or functions as described above. Likewise, the software programs of the present application (including associated data structures) may be stored on a computer readable recording medium, such as RAM memory, magnetic or optical drive or diskette and the like. In addition, some steps or functions of the present application may be implemented in hardware, for example, as circuitry that cooperates with the processor to perform various steps or functions.
Furthermore, portions of the present application may be implemented as a computer program product, such as computer program instructions, which when executed by a computer, may invoke or provide methods and/or techniques in accordance with the present application by way of operation of the computer. Those skilled in the art will appreciate that the form of computer program instructions present in a computer readable medium includes, but is not limited to, source files, executable files, installation package files, etc., and accordingly, the manner in which the computer program instructions are executed by a computer includes, but is not limited to: the computer directly executes the instruction, or the computer compiles the instruction and then executes the corresponding compiled program, or the computer reads and executes the instruction, or the computer reads and installs the instruction and then executes the corresponding installed program. Herein, a computer-readable medium may be any available computer-readable storage medium or communication medium that can be accessed by a computer.
Communication media includes media whereby a communication signal containing, for example, computer readable instructions, data structures, program modules, or other data, is transferred from one system to another. Communication media may include conductive transmission media such as electrical cables and wires (e.g., optical fibers, coaxial, etc.) and wireless (non-conductive transmission) media capable of transmitting energy waves, such as acoustic, electromagnetic, RF, microwave, and infrared. Computer readable instructions, data structures, program modules, or other data may be embodied as a modulated data signal, for example, in a wireless medium, such as a carrier wave or similar mechanism, such as that embodied as part of spread spectrum technology. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. The modulation may be analog, digital or hybrid modulation techniques.
An embodiment according to the present application comprises an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to operate a method and/or a solution according to the embodiments of the present application as described above.
Example Four
Corresponding to the above embodiments, the present application further provides a computer-readable storage medium storing computer-executable instructions for performing a user-controllable content generating method.
In this embodiment, computer-readable storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. For example, computer-readable storage media include, but are not limited to, volatile memory, such as random access memory (RAM, DRAM, SRAM); and nonvolatile memory such as flash memory, various read only memory (ROM, PROM, EPROM, EEPROM), magnetic and ferromagnetic/ferroelectric memory (MRAM, feRAM); and magnetic and optical storage devices (hard disk, tape, CD, DVD); or other now known media or later developed computer-readable information/data that can be stored for use by a computer system.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted to embrace the preferred embodiments and all such variations and modifications as fall within the scope of the embodiments herein.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (14)

1. A method of generating user-controllable content, comprising:
acquiring a session request of a current user, and generating a corresponding session identification number according to the session request;
acquiring a corresponding information base according to the session identification number, wherein the information base comprises a static data information base;
the static data is sampled and input prefixes are formed through an information sampling network according to the session request and the information base, the static data comprises object entities and relations among the object entities, and the static data information base is constructed in the following mode: sampling the static data in the form of a binary tree structure, wherein leaf nodes in the binary tree structure correspond to the object entities; storing leaf nodes in the binary tree structure and relations among the leaf nodes through triplets so as to store static data related to the current session identification number; the triplet comprises a first entity, a second entity and a relation between the first entity and the second entity, wherein the first entity and the second entity respectively correspond to different leaf nodes in the binary tree structure, and the relation between the first entity and the second entity corresponds to the relation between the different leaf nodes; the method further comprises the steps of:
Splicing the input prefix and the session request and inputting the spliced input prefix and the session request into a pre-training language model to obtain a corresponding output result;
forming dialogue data by the output result and the session request, and checking and updating according to the static data information base, wherein the method comprises the following steps:
in response to detecting that, for any triplet in the dialogue data, the first entity and the relationship between the first entity and the second entity are consistent with a corresponding triplet in the static data information base but the second entity is inconsistent, updating the second entity in the triplet corresponding to the static data information base;
in response to detecting that a relationship between a first entity and a second entity of any triplet of the conversation data does not exist in the static data information base, newly adding a relationship between the first entity and a second entity corresponding to the first entity and the corresponding second entity in the static data information base;
and in response to detecting that the first entity of any triplet in the dialogue data does not exist in the static data information base, newly adding the triplet corresponding to the first entity in the static data information base.
2. The method for generating user-controllable content according to claim 1, wherein said completing the sampling of static data and forming an input prefix according to the session request and the information base through the information sampling network comprises:
And sampling from the root node of the binary tree structure according to the session request and constraint data in the static data information base by an information sampling network according to the triples so as to form an input prefix of the pre-training language model.
3. The user-controllable content generation method of claim 2, wherein the method further comprises:
and if the current node is a non-leaf node, continuing to sample all relations and all sub-entities contained in the current node through the information sampling network until the current node is sampled to the leaf node.
4. A user-controllable content generation method as claimed in claim 3, wherein the leaf nodes comprise any one of a normal node and an end-of-sample marker node.
5. The user-controllable content generation method of claim 4, wherein the method further comprises:
responsive to detecting that a leaf node sampled by the information sampling network is the sampling end marker node, terminating sampling;
and in response to detecting that the leaf node sampled by the information sampling network is the common node, if the common node has an un-sampled sibling node, continuing to sample the sibling node until the leaf nodes under the sibling node are sampled.
6. The user-controllable content generation method of claim 1, wherein the information repository further comprises a dynamic data information repository.
7. The method for generating user-controllable content according to claim 6, wherein said completing the sampling of static data and forming an input prefix according to the session request and the information base through the information sampling network further comprises:
completing the sampling of static data and forming a first input prefix according to the session request and the static data information base through an information sampling network;
and completing the sampling of the dynamic data and forming a second input prefix according to the session request and the dynamic data information base through an information sampling network.
8. The method for generating user-controllable content according to claim 7, wherein said completing the sampling of dynamic data and forming a second input prefix according to the session request and the dynamic data information base through the information sampling network comprises:
sampling dynamic data according to the session request through an information sampling network to update the dynamic data information base in real time;
and the dynamic data information base stores dialogue data of the current session of the current user for the latest first preset round.
9. The method for generating user-controllable content according to claim 8, wherein said sampling of dynamic data and forming a second input prefix according to said session request and said dynamic data information base via an information sampling network, further comprises:
outputting the importance of the dialogue data of the latest first preset rounds of the current session through the information sampling network, sorting by importance from high to low, and intercepting the top second-preset-number of rounds of dialogue data as the second input prefix, which is spliced in reverse order and input into the pre-training language model;
wherein the second preset round is not greater than the first preset round.
10. The user-controllable content generation method of claim 1, wherein the information sampling network comprises a word segmentation layer, an embedding layer, a coding layer, a masking layer, and a normalized exponential function layer.
11. The method of claim 10, wherein the word segmentation layer is configured to convert an input text into a computable word element, the embedding layer is configured to convert the corresponding word element into a word vector matrix, the encoding layer is configured to perform nonlinear transformation and matrix calculation on all the word vector matrices, output a vector consistent with the number of the input word elements, the mask layer is configured to mask and filter an output independent of a sampled object, and the normalized exponential function layer is configured to subject the mask layer output to a normalized exponential function to obtain a probability distribution of the sampled object.
12. A user-controllable content generation apparatus, the apparatus comprising:
the communication module is used for acquiring a session request of a current user and generating a corresponding session identification number according to the session request;
the reading module is used for acquiring a corresponding information base according to the session identification number, wherein the information base comprises a static data information base;
the sampling module is used for completing the sampling of static data and forming an input prefix according to the session request and the information base through an information sampling network, wherein the static data comprises object entities and relations among the object entities, and the sampling module is also used for constructing a static data information base through the following modes: sampling the static data in the form of a binary tree structure, wherein leaf nodes in the binary tree structure correspond to the object entities; storing leaf nodes in the binary tree structure and relations among the leaf nodes through triplets so as to store static data related to the current session identification number; the triplet comprises a first entity, a second entity and a relation between the first entity and the second entity, wherein the first entity and the second entity respectively correspond to different leaf nodes in the binary tree structure, and the relation between the first entity and the second entity corresponds to the relation between the different leaf nodes;
The control module is used for splicing the input prefix and the session request and inputting the input prefix and the session request into a pre-training language model to obtain a corresponding output result;
the verification updating module is used for forming dialogue data by the output result and the session request and carrying out verification updating according to the static data information base; the verification updating module is further used for updating the second entity in the corresponding triplet of the static data information base in response to detecting that the relation between any triplet in the dialogue data and the first entity in the corresponding triplet of the static data information base is consistent but the relation between the first entity and the second entity is inconsistent; the verification updating module is further used for responding to the fact that the relation between the first entity and the second entity of any triplet in the dialogue data is detected to be not existing in the static data information base, and the relation between the first entity and the second entity corresponding to the first entity and the corresponding second entity are newly added in the static data information base; the verification updating module is further used for responding to the fact that a first entity of any triplet in the dialogue data does not exist in the static data information base, and the triplet corresponding to the first entity is newly added in the static data information base.
13. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the user controllable content generation method of any one of claims 1 to 11 when the computer program is executed by the processor.
14. A computer-readable storage medium storing computer-executable instructions for performing the user-controllable content generation method of any one of claims 1 to 11.
CN202311207343.0A 2023-09-19 2023-09-19 User controllable content generation method, device, equipment and medium Active CN116932703B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311207343.0A CN116932703B (en) 2023-09-19 2023-09-19 User controllable content generation method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN116932703A CN116932703A (en) 2023-10-24
CN116932703B true CN116932703B (en) 2024-01-23

Family

ID=88386548

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311207343.0A Active CN116932703B (en) 2023-09-19 2023-09-19 User controllable content generation method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN116932703B (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190155905A1 (en) * 2017-11-17 2019-05-23 Digital Genius Limited Template generation for a conversational agent
CN112084314A (en) * 2020-08-20 2020-12-15 电子科技大学 Knowledge-introducing generating type session system
CN112818105A (en) * 2021-02-05 2021-05-18 江苏实达迪美数据处理有限公司 Multi-turn dialogue method and system fusing context information
US20230018489A1 (en) * 2021-07-19 2023-01-19 Beijing Baidu Netcom Science Technology Co., Ltd. Method for acquiring structured question-answering model, question-answering method and corresponding apparatus
CN114547329A (en) * 2022-01-25 2022-05-27 阿里巴巴(中国)有限公司 Method for establishing pre-training language model, semantic analysis method and device
CN114443827A (en) * 2022-01-28 2022-05-06 福州大学 Local information perception dialogue method and system based on pre-training language model
CN114780694A (en) * 2022-03-28 2022-07-22 北京智谱华章科技有限公司 Zero-fine-tuning anthropomorphic session generation method and equipment based on pre-training language model
CN114970560A (en) * 2022-05-19 2022-08-30 深圳市优必选科技股份有限公司 Dialog intention recognition method and device, storage medium and intelligent device
CN116384412A (en) * 2023-02-24 2023-07-04 华院计算技术(上海)股份有限公司 Dialogue content generation method and device, computer readable storage medium and terminal
CN116521893A (en) * 2023-04-28 2023-08-01 苏州浪潮智能科技有限公司 Control method and control device of intelligent dialogue system and electronic equipment
CN116644170A (en) * 2023-06-28 2023-08-25 南京领行科技股份有限公司 Reply text generation method, device, communication equipment and storage medium

Also Published As

Publication number Publication date
CN116932703A (en) 2023-10-24

Similar Documents

Publication Publication Date Title
US20200301954A1 (en) Reply information obtaining method and apparatus
CN110366734B (en) Optimizing neural network architecture
CN108509463B (en) Question response method and device
CN109992773B (en) Word vector training method, system, device and medium based on multi-task learning
CN108921657B (en) Knowledge-enhanced memory network-based sequence recommendation method
CN110929515A (en) Reading understanding method and system based on cooperative attention and adaptive adjustment
WO2024011814A1 (en) Image-text mutual retrieval method, system and device, and nonvolatile readable storage medium
CN111382573A (en) Method, apparatus, device and storage medium for answer quality assessment
US20190130251A1 (en) Neural question answering system
CN110990555B (en) End-to-end retrieval type dialogue method and system and computer equipment
CN112699215B (en) Grading prediction method and system based on capsule network and interactive attention mechanism
CN111753076A (en) Dialogue method, dialogue device, electronic equipment and readable storage medium
CN109522561B (en) Question and sentence repeated recognition method, device and equipment and readable storage medium
CN116664719B (en) Image redrawing model training method, image redrawing method and device
US20230094730A1 (en) Model training method and method for human-machine interaction
CN116775807A (en) Natural language processing and model training method, equipment and storage medium
CN116757224A (en) Intent understanding method, apparatus, device, and medium
CN115186147A (en) Method and device for generating conversation content, storage medium and terminal
Tao et al. Multi‐head attention graph convolutional network model: End‐to‐end entity and relation joint extraction based on multi‐head attention graph convolutional network
CN112632267B (en) Global interaction and greedy selection combined search result diversification system
CN113420136A (en) Dialogue method, system, electronic equipment, storage medium and program product
CN117575008A (en) Training sample generation method, model training method, knowledge question-answering method and knowledge question-answering device
CN116932703B (en) User controllable content generation method, device, equipment and medium
CN116957047B (en) Sampling network updating method, device, equipment and medium
CN116910190A (en) Method, device and equipment for acquiring multi-task perception model and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant