WO2023061443A1 - Method and device for determining a reply sentence - Google Patents

Method and device for determining a reply sentence

Info

Publication number
WO2023061443A1
WO2023061443A1 (PCT/CN2022/125088)
Authority
WO
WIPO (PCT)
Prior art keywords
statement
user
dialog
dialogue
category
Prior art date
Application number
PCT/CN2022/125088
Other languages
English (en)
French (fr)
Inventor
何彬
王雅圣
李一同
糜飞
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2023061443A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/332: Query formulation
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks

Definitions

  • the present application relates to the field of artificial intelligence, in particular to a method and device for determining reply sentences.
  • Artificial intelligence is a theory, method, technology and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • In other words, artificial intelligence is the branch of computer science that attempts to understand the essence of intelligence and to produce a new class of intelligent machines that respond in a manner similar to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
  • The dialogue system has various dialogue types, such as the chat type (mainly for entertainment, companionship, etc.), the task type (used to fulfil user-specific needs, such as booking air tickets or hotels), the question-and-answer type (providing users with knowledge-related services and answering user questions), and so on.
  • In existing solutions, a corresponding dialogue model is trained separately for each dialogue type, and the different dialogue models are then assembled together in an integrated manner to construct a multifunctional dialogue system.
  • Such a dialogue system suffers from a complex system structure and a large storage footprint.
  • In view of this, the present application provides a method for determining reply sentences, which recognizes the dialogue category of the user dialogue through a state determination network and, for different dialogue types, reuses the same dialogue generation network to generate the corresponding reply sentences. This is equivalent to using the same model to process user statements of different dialogue types, which reduces the model complexity and model size of the dialogue system.
  • the first user sentence may be a text such as a question or a request input by the user to the question answering device.
  • the user may input a target question into the question answering device in text form, and in this case, the question answering device may directly obtain the first user statement in text form.
  • the user can also input a target question into the question answering device in voice form, in this case, the question answering device can convert the received voice information into text information, so as to obtain the first user statement in text form.
  • the user can also use body language to input a target question to the question answering device. In this case, the question answering device collects and analyzes the user's body movements to identify the first user statement in text form.
  • According to the first user statement, first state information of the first user statement is determined through a state determination network, where the first state information includes a first dialogue category of the first user statement, and the first dialogue category is a chat-type dialogue, a task-type dialogue, a question-and-answer dialogue or a retrieval-type dialogue;
  • The state determination network may be a trained network that has the ability to determine the corresponding dialogue type based on user sentences.
  • The present application does not require the state determination network to be able to identify all four dialogue types (chat-type, task-type, question-and-answer and retrieval-type); it is sufficient for the state determination network to be able to identify at least two of the four dialogue types.
  • The input of the state determination network may be the first user statement (and, optionally, other historical statements of the user), which is not limited here.
  • A dialogue type may also be called a dialogue belief state (belief state).
  • A chat-type dialogue may also be referred to as a chit-chat (small-talk) dialogue.
  • The state determination network can be a part of the GPT model or the complete GPT model, a part of the DialoGPT model or the complete DialoGPT model, a part of the BART model or the complete BART model, or a part of the T5 model or the complete T5 model.
  • Likewise, the sentence generation network may be a GPT model, a DialoGPT model, a BART model or a T5 model, or a part of any of these models.
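  • As an illustration only (not part of the disclosed embodiments), the following sketch shows how such a pretrained dialogue model could be instantiated and used to produce a reply, assuming the Hugging Face transformers library and the publicly available microsoft/DialoGPT-small checkpoint; the wrapper function generate_reply is an assumed name.

```python
# Minimal sketch: instantiating a sentence generation network from a pretrained
# DialoGPT checkpoint (Hugging Face "transformers" assumed installed). The
# checkpoint name and wrapper are illustrative assumptions, not the patent's design.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")

def generate_reply(user_sentence: str, max_new_tokens: int = 40) -> str:
    # DialoGPT expects dialogue turns separated by the EOS token.
    input_ids = tokenizer.encode(user_sentence + tokenizer.eos_token, return_tensors="pt")
    output_ids = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Strip the prompt tokens and decode only the newly generated reply.
    return tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)

print(generate_reply("Book an air ticket to Beijing."))
```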
  • the state determination network and the sentence generation network in this embodiment of the present application may be two parts of the same network, or may be different networks.
  • In addition to the first user statement, the statement generation network may also generate the reply statement based on other historical statements of the user, which is not limited here.
  • In the embodiments of the present application, the dialogue category of the user dialogue is identified through the state determination network, and the dialogue generation network is reused for different dialogue types to generate the corresponding reply sentences, which is equivalent to using the same model to process user statements of different dialogue types.
  • During training, the modes of multiple dialogue types can be unified so that multiple dialogue types can be trained at the same time; the trained dialogue system then has the capabilities of multiple dialogue types simultaneously, which reduces the model complexity and model size of the dialogue system.
  • the first status information may further include slot information, where the slot information may be a keyword in the first user sentence.
  • The determining of the first state information of the first user sentence through a state determination network according to the first user sentence includes: determining, through the state determination network, the first dialogue category of the first user statement from multiple dialogue types, where the multiple dialogue types include at least two of the chat-type dialogue, the task-type dialogue, the question-and-answer dialogue and the retrieval-type dialogue.
  • That is, the first dialogue category of the first user statement may be determined from multiple dialogue types through the state determination network, where the multiple dialogue types include at least two of the chat-type dialogue, the task-type dialogue, the question-and-answer dialogue and the retrieval-type dialogue.
  • the plurality of dialog types include chat-type dialogs and task-type dialogs.
  • the plurality of dialog types include chat-type dialogs and question-and-answer type dialogs.
  • the plurality of dialog types include chat-type dialogs and retrieval-type dialogs.
  • the plurality of dialog types include task-based dialogs and question-and-answer dialogs.
  • the plurality of dialog types include task dialogs and retrieval dialogs.
  • the plurality of dialog types include question-and-answer dialogs and retrieval dialogs.
  • the plurality of dialog types include chat-type dialogs, task-type dialogs, and question-and-answer dialogs.
  • the plurality of dialog types include chat-type dialogs, task-type dialogs, and retrieval-type dialogs.
  • the plurality of dialog types include task dialogs, question and answer dialogs, and retrieval dialogs.
  • the plurality of dialog types include chat-type dialogs, task-type dialogs, question-and-answer dialogs, and retrieval-type dialogs.
  • For different dialogue types, the dialogue generation network in the embodiments of the present application can be reused to generate the corresponding reply sentences.
  • For example, a second user statement to be replied can also be obtained; according to the second user statement, second state information of the second user statement is determined through the state determination network, where the second state information includes a second dialogue category of the second user statement, the second dialogue category is a chat-type dialogue, a task-type dialogue, a question-and-answer dialogue or a retrieval-type dialogue, and the second dialogue category is different from the first dialogue category; the second user statement and the second dialogue category are then input to the sentence generation network to obtain a reply sentence corresponding to the second user statement.
  • the state determination network and the sentence generation network are GPT models, DialoGPT models, BART models or T5 models.
  • The present application does not limit the state determination network and the statement generation network to being a complete GPT model, DialoGPT model, BART model or T5 model; the state determination network and the statement generation network may also be models whose network structure or performance is similar to that of the GPT model, the DialoGPT model, the BART model or the T5 model.
  • the state determination network and the statement generation network may be part of the GPT model, the DialoGPT model, the BART model or the T5 model respectively.
  • In a possible implementation, the dialogue system may obtain, according to the first user sentence, the keywords or key sentences required for constructing the reply sentence from the first user sentence or from a database, and input the first user sentence, the first dialogue category, and the keywords or key sentences into the sentence generation network to obtain the reply sentence corresponding to the first user sentence.
  • That is, dialogue-related data or text content can be obtained from external resources such as an external database, knowledge base or corpus and used as dialogue information (that is, the above-mentioned keywords or key sentences) in the dialogue process.
  • In other words, the first user statement and the first dialogue category can be input into the statement generation network to obtain the reply statement corresponding to the first user statement, or the first user statement, the first dialogue category, and the keywords or key sentences can be input into the sentence generation network to obtain the reply sentence corresponding to the first user sentence.
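  • As a rough sketch of how the generation-network input could be assembled from the first user sentence, the first dialogue category and any retrieved keywords or key sentences, the serialization below uses assumed separator tokens (<usr>, <state>, <db>) rather than the format actually used by the embodiments.

```python
# Illustrative sketch: serializing the first user sentence, the first dialogue
# category and optional keywords/key sentences into one input string for the
# sentence generation network. Token names (<usr>, <state>, <db>) are assumptions.
def build_generator_input(user_sentence: str,
                          dialogue_category: str,
                          keywords: list[str] | None = None) -> str:
    parts = [f"<usr> {user_sentence}", f"<state> {dialogue_category}"]
    if keywords:  # dialogue information retrieved from an external database/knowledge base
        parts.append("<db> " + " ; ".join(keywords))
    return " ".join(parts)

print(build_generator_input("i am looking for a cheap hotel", "hotel", ["price cheap"]))
# -> "<usr> i am looking for a cheap hotel <state> hotel <db> price cheap"
```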
  • the present application provides a method for determining a reply sentence, the method comprising:
  • the first dialog category is a chat dialog, a task dialog, a question-and-answer dialog or a retrieval dialog;
  • the sentence generation network is updated according to the difference between the first reply sentence and the second reply sentence.
  • the determining the first state information of the first user statement through a state determination network according to the first user statement includes:
  • the multiple dialogue types include at least two of the chat-type dialogue, the task-type dialogue, the question-and-answer dialogue and the retrieval-type dialogue.
  • the method also includes:
  • the second state information includes a fourth dialogue category of the second user statement, where the fourth dialogue category is different from said third dialogue category;
  • the sentence generation network is updated according to the difference between the fourth reply sentence and the third reply sentence.
  • the inputting the first user statement and the first dialog category into the statement generation network to obtain the second reply statement corresponding to the first user statement includes:
  • according to the first user statement, keywords or key sentences required for constructing the reply statement are obtained from the first user statement or the database;
  • the present application provides a device for determining a reply sentence, the device comprising:
  • a state generating module configured to determine first state information of the first user sentence through a state determination network according to the first user sentence, the first state information including a first dialog category of the first user sentence,
  • the first dialog category is a chat dialog, a task dialog, a question-and-answer dialog or a retrieval dialog;
  • the reply sentence generation module is used to input the first user sentence and the first dialogue category into the sentence generation network to obtain the reply sentence corresponding to the first user sentence.
  • the present application provides a device for determining a reply sentence
  • the device includes: an acquisition module, used to obtain the first user sentence to be replied; a state generation module, used to determine, through the state determination network according to the first user sentence, the first state information of the first user statement, where the first state information includes the first dialogue category of the first user statement, and the first dialogue category is a chat-type dialogue, a task-type dialogue, a question-and-answer dialogue or a retrieval-type dialogue;
  • a reply statement generating module configured to input the first user statement and the first dialogue category into the statement generation network to obtain a reply statement corresponding to the first user statement.
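  • Purely as an illustration, the three modules of such a device could be wired together as in the following sketch; the class name, the method names and the treatment of the two networks as opaque callables are assumptions introduced here, not the embodiments' implementation.

```python
# Sketch of the determining device: an acquisition module, a state generation
# module and a reply sentence generation module wired together.
# Names and signatures are illustrative assumptions.
class ReplyDeterminationDevice:
    def __init__(self, state_determination_network, sentence_generation_network):
        self.state_net = state_determination_network   # predicts the dialogue category
        self.gen_net = sentence_generation_network      # generates the reply sentence

    def acquire(self, raw_input: str) -> str:
        # Acquisition module: obtain the first user sentence to be replied (already text here).
        return raw_input.strip()

    def generate_state(self, user_sentence: str) -> dict:
        # State generation module: first state information = dialogue category (+ optional slots).
        return self.state_net(user_sentence)

    def generate_reply(self, user_sentence: str, state: dict) -> str:
        # Reply sentence generation module: sentence + category are fed to the generation network.
        return self.gen_net(user_sentence, state["category"])

    def respond(self, raw_input: str) -> str:
        sentence = self.acquire(raw_input)
        state = self.generate_state(sentence)
        return self.generate_reply(sentence, state)
```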
  • the dialogue category of the user dialogue is identified through the state determination network, and the dialogue generation network is reused to generate corresponding reply sentences for different dialogue types, which is equivalent to using the same model to process user statements of different dialogue types.
  • During model training, by unifying the modes of multiple dialogue types, multiple dialogue types can be trained at the same time, and the trained dialogue system has the capabilities of multiple dialogue types simultaneously, which reduces the model complexity and model size of the dialogue system.
  • the acquisition module is also used to:
  • the state generation module is further configured to determine the second state information of the second user statement through the state determination network according to the second user statement, where the second state information includes a second dialogue category of the second user statement, the second dialogue category is a chat-type dialogue, a task-type dialogue, a question-and-answer dialogue or a retrieval-type dialogue, and the second dialogue category is different from the first dialogue category;
  • the reply sentence generation module is further configured to input the second user sentence and the second dialog category into the sentence generation network to obtain a reply sentence corresponding to the second user sentence.
  • the state determination network and the sentence generation network are GPT models, DialoGPT models, BART models or T5 models.
  • the reply statement generating module is specifically used for:
  • according to the first user statement, keywords or key sentences required for constructing the reply statement are obtained from the first user statement or the database;
  • the present application provides a device for determining a reply sentence, the device comprising:
  • An acquisition module, configured to acquire a first user statement, a first dialogue category of the first user statement, and a first reply statement corresponding to the first user statement, where the first dialogue category is the true category of the first user statement, and the first dialogue category is a chat-type dialogue, a task-type dialogue, a question-and-answer dialogue or a retrieval-type dialogue;
  • a state generating module configured to determine the first state information of the first user statement through a state determination network according to the first user statement, the first state information including the second dialog category of the first user statement;
  • a reply statement generating module configured to input the first user statement and the first dialog category to the statement generation network, to obtain a second reply statement corresponding to the first user statement;
  • the sentence generation network is updated according to the difference between the first reply sentence and the second reply sentence.
  • This application uses the state determination network to identify the dialogue category of the user dialogue, and for different dialogue types, reuses the dialogue generation network to generate corresponding reply sentences, which is equivalent to using the same model to process user sentences of different dialogue types.
  • During model training, by unifying the modes of multiple dialogue types, multiple dialogue types can be trained at the same time, and the trained dialogue system has the capabilities of multiple dialogue types simultaneously, which reduces the model complexity and model size of the dialogue system.
  • the state generating module is specifically used for:
  • the multiple dialogue types include at least two of the chat-type dialogue, the task-type dialogue, the question-and-answer dialogue and the retrieval-type dialogue.
  • the model update module is further configured to update the state determination network according to the difference between the fourth dialog category and the third dialog category;
  • the state determination network and the sentence generation network are GPT models, DialoGPT models, BART models or T5 models.
  • An embodiment of the present application provides a device for determining a reply sentence, which may include a memory, a processor, and a bus system, where the memory is used to store programs, and the processor is used to execute the programs in the memory so as to perform any optional method of the above first aspect.
  • An embodiment of the present application provides a device for determining a reply sentence, which may include a memory, a processor, and a bus system, where the memory is used to store programs, and the processor is used to execute the programs in the memory so as to perform any optional method of the above second aspect.
  • An embodiment of the present application provides a computer-readable storage medium in which a computer program is stored; when the program is run on a computer, it enables the computer to execute any optional method of the above first aspect and any optional method of the second aspect.
  • An embodiment of the present application provides a computer program product, including code which, when executed, implements any optional method of the first aspect and any optional method of the second aspect.
  • The present application provides a chip system, which includes a processor configured to support an execution device or a training device in implementing the functions involved in the above aspects, for example, sending or processing the data and/or information involved in the above methods.
  • the chip system further includes a memory, and the memory is used for storing necessary program instructions and data of the execution device or the training device.
  • the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
  • An embodiment of the present application provides a method for determining a reply statement, the method comprising: acquiring a first user statement to be replied; determining, according to the first user statement, first state information of the first user statement through a state determination network, where the first state information includes a first dialogue category of the first user statement, and the first dialogue category is a chat-type dialogue, a task-type dialogue, a question-and-answer dialogue or a retrieval-type dialogue; and inputting the first user statement and the first dialogue category into a sentence generation network to obtain a reply sentence corresponding to the first user sentence.
  • the dialogue category of the user dialogue is identified through the state determination network, and the dialogue generation network is reused to generate corresponding reply sentences for different dialogue types, which is equivalent to using the same model to process user statements of different dialogue types.
  • During model training, by unifying the modes of multiple dialogue types, multiple dialogue types can be trained at the same time, and the trained dialogue system has the capabilities of multiple dialogue types simultaneously, which reduces the model complexity and model size of the dialogue system.
  • FIG. 2 is a schematic diagram of a system architecture provided by an embodiment of the present application.
  • Fig. 4 is a schematic diagram of an interface of a task-based dialogue;
  • Fig. 5 is a schematic diagram of a model provided by an embodiment of the present application;
  • FIG. 7 is a schematic diagram of a method for determining a reply sentence provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a device for determining a reply sentence provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a device for determining a reply sentence provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an execution device provided by an embodiment of the present application.
  • Fig. 11 is a schematic structural diagram of a training device provided by an embodiment of the present application.
  • Figure 1 shows a schematic structural diagram of the main framework of artificial intelligence.
  • The artificial intelligence theme framework is described below along two dimensions: the "intelligent information chain" (horizontal axis) and the "IT value chain" (vertical axis).
  • The "intelligent information chain" reflects a series of processes from data acquisition to processing, for example the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, and intelligent execution and output. In this process, the data undergoes a condensation process of "data-information-knowledge-wisdom".
  • The "IT value chain" reflects the value that artificial intelligence brings to the information technology industry, from the underlying infrastructure of artificial intelligence and information (provision and processing technology) up to the industrial ecological process of the system.
  • the infrastructure provides computing power support for the artificial intelligence system, realizes communication with the outside world, and realizes support through the basic platform.
  • The basic platform includes distributed computing frameworks, networks, and other related platform guarantees and support, and may include cloud storage and computing, interconnection networks, etc.
  • sensors communicate with the outside to obtain data, and these data are provided to the smart chips in the distributed computing system provided by the basic platform for calculation.
  • Data from the upper layer of the infrastructure is used to represent data sources in the field of artificial intelligence.
  • the data involves graphics, images, voice, text, and IoT data of traditional equipment, including business data of existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
  • Data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making, etc.
  • machine learning and deep learning can symbolize and formalize intelligent information modeling, extraction, preprocessing, training, etc. of data.
  • Reasoning refers to the process of simulating human intelligent reasoning in a computer or intelligent system, and using formalized information to carry out machine thinking and solve problems according to reasoning control strategies.
  • the typical functions are search and matching.
  • Decision-making refers to the process of decision-making after intelligent information is reasoned, and usually provides functions such as classification, sorting, and prediction.
  • Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields; they package the overall artificial intelligence solution, commercializing intelligent information decision-making and realizing practical applications. The application fields mainly include intelligent terminals, intelligent transportation, smart healthcare, autonomous driving, smart cities, etc.
  • the method and device provided in the embodiments of the present application can be applied in the scene of human-computer dialogue in natural language processing (natural language processing, NLP) technology.
  • NLP natural language processing
  • the embodiments of the present application can be applied in the scenario of building a dialogue robot and providing semantic understanding and dialogue services to end users.
  • the dialogue robot is, for example, a child accompanying education robot, an after-sales automatic answering application, a pre-sales consulting robot, an intelligent voice assistant on a terminal, and the like.
  • FIG. 2 is a schematic diagram of a system architecture provided by an embodiment of the present application.
  • the system architecture 500 includes an execution device 510 , a training device 520 , a database 530 , a client device 540 , a data storage system 550 and a data collection system 560 .
  • the execution device 510 includes a calculation module 511 , an I/O interface 512 , a preprocessing module 513 and a preprocessing module 514 .
  • the calculation module 511 may include the state determination network/rule 501, and the preprocessing module 513 and the preprocessing module 514 are optional.
  • the data collection device 560 is used to collect training samples.
  • the training samples may be text data, etc.
  • the training samples are the data used for training the state determination network and the sentence generation network. After collecting the training samples, the data collection device 560 stores these training samples in the database 530 .
  • The database 530 may also maintain pre-trained models, such as a state determination network and a sentence generation network, or a model obtained by fine-tuning a pre-trained model at least once.
  • the training device 520 can train the state determination network and the sentence generation network based on the training samples maintained in the database 530 to obtain the state determination network/rule 501 .
  • the state determination network/rule 501 may be a trained state determination network and a sentence generation network.
  • the training samples maintained in the database 530 are not necessarily collected by the data collection device 560, and may also be received from other devices.
  • The training device 520 does not necessarily perform the training of the state determination network/rule 501 entirely based on the training samples maintained in the database 530; it may also obtain training samples from the cloud or elsewhere for model training. The above description should not be construed as limiting the embodiments of the present application.
  • the training samples may be private data from the client device 540, and then the training device 520 may use the private data from the client device 540 as training samples to perform model fine-tuning on the state determination network and the sentence generation network.
  • the training device 520 can train the state determination network and the sentence generation network through the model training method in the embodiment of the present application, so as to obtain the trained state determination network and sentence generation network.
  • The state determination network/rule 501 obtained by the training of the training device 520 can be applied to different systems or devices, for example to the execution device 510 shown in FIG. 2, which may be a terminal such as a notebook computer, an augmented reality (AR)/virtual reality (VR) device or a vehicle-mounted terminal, or may be a server, a cloud, etc.
  • The execution device 510 is configured with an input/output (I/O) interface 512 for data interaction with external devices; the user can input data to the I/O interface 512 through the client device 540 (such as the first user statement and the second user statement in the embodiments of the present application).
  • I/O input/output
  • the preprocessing module 513 and the preprocessing module 514 are used to perform preprocessing according to the input data received by the I/O interface 512. It should be understood that there may be no preprocessing module 513 and preprocessing module 514 or only one preprocessing module. When the preprocessing module 513 and the preprocessing module 514 do not exist, the calculation module 511 may be used directly to process the input data.
  • When the execution device 510 preprocesses the input data, or when the calculation module 511 of the execution device 510 performs calculation and other related processing, the execution device 510 can call the data, code, etc. in the data storage system 550 for the corresponding processing, and the correspondingly processed data and instructions may also be stored in the data storage system 550.
  • the I/O interface 512 presents the processing result (such as a reply sentence) to the client device 540, thereby providing it to the user.
  • The client device 540 can also serve as a data collection terminal, collecting the input data of the I/O interface 512 and the output results of the I/O interface 512 as new sample data and storing them in the database 530.
  • FIG. 2 is only a schematic diagram of a system architecture provided by the embodiment of the present application, and the positional relationship between devices, devices, modules, etc. shown in the figure does not constitute any limitation.
  • In FIG. 2, the data storage system 550 is an external memory relative to the execution device 510; in other cases, the data storage system 550 may also be placed inside the execution device 510. It should be understood that the above execution device 510 may be deployed in the client device 540.
  • Specifically, the training device 520 can obtain code stored in a memory (not shown in FIG. 2; the memory can be integrated into the training device 520 or deployed separately from the training device 520) to implement the method for determining a reply sentence in the embodiments of the present application.
  • The training device 520 may include a hardware circuit (such as an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a general-purpose processor, a digital signal processor (DSP), a microprocessor or a microcontroller, etc.) or a combination of such hardware circuits. For example, the training device 520 may be a hardware system with the function of executing instructions, such as a CPU or a DSP, or a hardware system without the function of executing instructions, such as an ASIC or an FPGA, or a combination of a hardware system without the instruction-execution function and a hardware system with the instruction-execution function.
  • ASIC application specific integrated circuit
  • FPGA field-programmable gate array
  • DSP digital signal processor
  • microprocessor or microcontroller etc.
  • Specifically, the training device 520 may be a hardware system capable of executing instructions; the method for determining the reply sentence provided in the embodiments of the present application may be software code stored in a memory, and the training device 520 may obtain the software code from the memory and execute it to implement the method for determining the reply sentence provided by the embodiments of the present application.
  • execution device may be a server on the cloud side or an electronic device on the device side.
  • The neural network can be composed of neural units, where a neural unit can refer to an operation unit that takes xs (that is, input data) and an intercept of 1 as input, and the output of the operation unit can be: h_{W,b}(x) = f(W^T x + b) = f(∑_s W_s·x_s + b), where:
  • Ws is the weight of xs
  • b is the bias of the neural unit.
  • f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal in the neural unit into an output signal.
  • the output signal of the activation function can be used as the input of the next convolutional layer, and the activation function can be a sigmoid function.
  • a neural network is a network formed by connecting multiple above-mentioned single neural units, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the features of the local receptive field.
  • the local receptive field can be an area composed of several neural units.
  • Deep Neural Network also known as multi-layer neural network
  • DNN Deep Neural Network
  • The layers inside a DNN can be divided into three categories: input layer, hidden layers, and output layer.
  • the first layer is the input layer
  • the last layer is the output layer
  • the layers in the middle are all hidden layers.
  • the layers are fully connected, that is, any neuron in the i-th layer must be connected to any neuron in the i+1-th layer.
  • The coefficient from the k-th neuron of the (L-1)-th layer to the j-th neuron of the L-th layer is defined as W_jk^L. It should be noted that the input layer has no W parameter.
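  • Using this notation, the layer-to-layer computation of a fully connected DNN can be written compactly as follows (a standard formulation consistent with the coefficient definition above):

```latex
% Layer-to-layer computation of a fully connected DNN, consistent with the
% coefficient definition W^{L}_{jk} above.
a^{L}_{j} = f\!\left(\sum_{k} W^{L}_{jk}\, a^{L-1}_{k} + b^{L}_{j}\right),
\qquad\text{or in matrix form}\qquad
\vec{a}^{\,L} = f\!\left(W^{L}\vec{a}^{\,L-1} + \vec{b}^{\,L}\right).
```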
  • more hidden layers make the network more capable of describing complex situations in the real world. Theoretically speaking, a model with more parameters has a higher complexity and a greater "capacity", which means that it can complete more complex learning tasks.
  • Training the deep neural network is the process of learning the weight matrix, and its ultimate goal is to obtain the weight matrix of all layers of the trained deep neural network (the weight matrix formed by the vector W of many layers).
  • The error back propagation (BP) algorithm can be used to correct the parameters in the initial model during the training process, so that the error loss of the model becomes smaller and smaller. Specifically, the input signal is passed forward until the output produces an error loss, and the parameters in the initial model are updated by back-propagating the error loss information, so that the error loss converges.
  • the backpropagation algorithm is a backpropagation movement dominated by error loss, aiming to obtain the optimal model parameters, such as the weight matrix.
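  • The parameter correction performed after back-propagating the error loss amounts to the usual gradient step, sketched below; the learning rate symbol η is an assumption not defined in the text:

```latex
% Gradient-descent update after back-propagating the error loss \mathcal{L};
% \eta (learning rate) is an assumed symbol.
W \leftarrow W - \eta \,\frac{\partial \mathcal{L}}{\partial W},
\qquad
b \leftarrow b - \eta \,\frac{\partial \mathcal{L}}{\partial b}.
```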
  • the Transformer model is a commonly used model architecture for modeling dialogue.
  • the model consists of a transformer encoder and a decoder.
  • the encoder module is used to encode dialogue context information
  • the decoder module is used to generate the dialogue reply based on the dialogue context information.
  • One of the existing technologies proposes that multiple decoder modules can be used to model a dialog domain, and each decoder module corresponds to a dialog domain.
  • the encoder modules of each dialogue domain are learned through parameter sharing, and the data set of each dialogue domain is used to learn the corresponding decoder module of this domain.
  • the system learns a recurrent neural network-based module during the training process to discriminate the domain of the current dialogue context, and then uses the discriminated probability distribution to perform weighted integration of multiple decoder parameters to obtain a multi-domain dialogue system.
  • In the above prior art, each domain corresponds to a system sub-module, resulting in a significant increase in model complexity and training overhead.
  • In addition, larger sub-modules are required to carry the functions, and true unification of multiple domains is not achieved.
  • Therefore, the solution of the present invention proposes a unified end-to-end dialogue system framework that unifies different types of dialogue systems into the same dialogue mode and realizes unified training of different dialogue types, so that the model simultaneously has the ability to complete different types of dialogue.
  • the first user sentence may be a text such as a question or a request input by the user to the question answering device.
  • the user may input a target question into the question answering device in text form, and in this case, the question answering device may directly obtain the first user statement in text form.
  • the user can also input a target question to the question answering device in voice form.
  • the question answering device can convert the received voice information into text information, thereby obtaining the first user statement in text form.
  • the user can also use body language to input a target question to the question answering device.
  • the question answering device collects and analyzes the user's body movements to identify the first user statement in text form.
  • According to the first user statement, first state information of the first user statement is determined through a state determination network, where the first state information includes a first dialogue category of the first user statement, and the first dialogue category is a chat-type dialogue, a task-type dialogue, a question-and-answer dialogue or a retrieval-type dialogue.
  • first state information of the first user sentence needs to be determined, where the first state information may include the first dialog category.
  • the first state information of the first user sentence may be determined through a state determination network.
  • The state determination network can be a generative pre-trained transformer (GPT) model, a dialogue generative pre-trained transformer (DialoGPT) model, a BART (bidirectional and auto-regressive transformers) model or a T5 (transfer text-to-text transformer) model.
  • GPT generative pre-trained transformer
  • DialoGPT dialogue generative pre-trained transformer
  • BART bidirectional and auto-regressive transformers
  • T5 model transfer text-to-text transformer
  • Specifically, the first dialogue category of the first user statement may be determined from multiple dialogue types through the state determination network, where the multiple dialogue types include at least two of the chat-type dialogue, the task-type dialogue, the question-and-answer dialogue and the retrieval-type dialogue.
  • the plurality of dialog types include chat-type dialogs and task-type dialogs.
  • the plurality of dialog types include chat-type dialogs and question-and-answer type dialogs.
  • the plurality of dialog types include chat-type dialogs and retrieval-type dialogs.
  • the plurality of dialog types include task-based dialogs and question-and-answer dialogs.
  • the plurality of dialog types include task dialogs and retrieval dialogs.
  • the plurality of dialog types include chat-type dialogs, task-type dialogs, and question-and-answer dialogs.
  • the plurality of dialog types include chat-type dialogs, task-type dialogs, and retrieval-type dialogs.
  • the plurality of dialog types include chat-type dialogs, task-type dialogs, question-and-answer dialogs, and retrieval-type dialogs.
  • the state determination network may be trained and has the ability to determine the corresponding dialogue type based on user sentences.
  • the input of the state determination network may be the first user statement (optionally, may also include other historical statements of the user), which is not limited here.
  • dialogue type may also be called a dialogue belief state (belief state).
  • FIG. 4 is a schematic diagram of an application scenario of a task-based dialogue involved in the present application.
  • the user conducts a task-based dialogue with the dialogue system.
  • the right side of Figure 4 is the user sentence input by the user
  • the left side of Figure 4 is the reply sentence output by the dialogue system according to the user sentence.
  • the user enters the user sentence "Book an air ticket to Beijing.”
  • the dialog system outputs a reply sentence "Okay, I found a flight ticket from Shenzhen to Beijing at 8 o'clock tomorrow for you. Do you need to book it?".
  • the user enters the user sentence "Book it for me.”
  • the dialogue system outputs a reply sentence "Okay, I have booked a flight ticket for you. Is there anything else I can help you with?".
  • the user then inputs the user sentence "No, thank you”.
  • the application scenario shown in FIG. 4 is only an example of one of the task-type dialogues.
  • The task-type dialogue can also be a dialogue about making a phone call, a dialogue about querying a geographic location, a dialogue about ordering takeout, a dialogue about asking about the weather, a dialogue about booking a hotel, etc., which is not specifically limited here.
  • the task-type dialog may be represented by the user's intention behavior (or simply referred to as user behavior).
  • the user sentence input by the user usually includes user behavior.
  • the user behavior is a behavior in which the user makes a request to the dialogue system. Take the user sentence "book an air ticket to Beijing" as an example. This user sentence puts forward a request to the dialogue system to book an air ticket. Therefore, this user sentence includes the user behavior "book a flight ticket”.
  • The user behavior can also be "calling", "querying location", "ordering food", "asking about weather", "booking a hotel", etc.; no specific limitation is made here.
  • For example, the first dialogue category may be "hotel".
  • FIG. 1 is a schematic diagram of a possible application scenario of an embodiment of the present application.
  • the application scenario includes: a question answering device and users.
  • a user may ask a question to the question answering device, and the question answering device returns an appropriate answer to the user according to the user's question. For example: the user asks the question answering device "Where is the capital of China?", and the question answering device returns the answer "Beijing" to the user.
  • the question and answer here refers to one question and one answer, that is, to give accurate answers directly based on user questions, such as "how many degrees in Beijing today”. Question answering is more similar to information retrieval, although it may also involve context processing, such as "so how much is tomorrow".
  • the first user sentence may be a question input by the user, and the dialog system needs to determine the answer corresponding to the first user sentence from the knowledge base (or database).
  • the knowledge base is used to provide the knowledge needed to answer user questions.
  • the processing unit may be provided with a semantic matching model for retrieving the most appropriate answer from the knowledge base according to the user's question. It can be understood that the richer the knowledge in the knowledge base, the more questions the question answering device can answer.
  • The knowledge in the knowledge base is stored in the form of "question-answer pairs". A "question-answer pair" can also be referred to simply as a "question-and-answer (QA) pair".
  • Q represents a known question (or called a standard question)
  • A represents an answer corresponding to Q.
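  • A question-answer knowledge base of this form could be represented and queried as in the sketch below; the naive word-overlap matching is an assumption for illustration only and is not the semantic matching model described above.

```python
# Illustrative sketch of a "question-answer pair" (QA pair) knowledge base and a
# naive retrieval step; a real system would use a semantic matching model instead.
qa_pairs = [
    {"Q": "Where is the capital of China?", "A": "Beijing"},
    {"Q": "How high is Mt.Everest?",        "A": "8848.86 meters"},
]

def retrieve_answer(user_question: str) -> str | None:
    user_tokens = set(user_question.lower().split())
    # Pick the known question (Q) with the largest word overlap with the user question.
    best = max(qa_pairs, key=lambda p: len(user_tokens & set(p["Q"].lower().split())))
    overlap = len(user_tokens & set(best["Q"].lower().split()))
    return best["A"] if overlap > 0 else None

print(retrieve_answer("where is the capital of China"))  # -> Beijing
```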
  • chat-type conversations may include greetings and pleasantries, are characterized by no clear purpose, and do not necessarily answer the user's questions. Chat mainly plays the role of emotional companionship in the human-computer dialogue system.
  • the first user statement is "does money buy happiness?"
  • the state determination network can recognize that the corresponding first dialogue category is "chit", and the "chit" can indicate that the first user statement is a chat-type dialogue.
  • the status information may also include slot information "money happiness”.
  • the first user statement is "i am looking for a cheap hotel", and the state determination network can recognize that the corresponding first dialogue category is "hotel", and the "hotel” can indicate that the first user statement is a task-based dialogue , the first state information may also include slot information "price cheap”.
  • the first user statement is "how high is Mt.Everest?"
  • the first user statement is a chat-type dialogue, and the state determination network can recognize that the corresponding first dialogue category is "qa", and the "qa" can indicate
  • the first user statement is a question-and-answer dialogue, and the first status information may also include slot information "Mt.Everest high”.
  • the first user statement is "which is the best brand for basketball?"
  • the state determination network can recognize that the corresponding first dialogue category is "faq”, and the "faq" can indicate that the first user statement is a retrieval type
  • the first status information may also include slot information "brand basketball”.
  • the sentence generation network may be a GPT model, a DialoGPT model, a BART model or a T5 model.
  • the state determination network and the sentence generation network in this embodiment of the present application may be two parts of the same network, or may be different networks.
  • FIG. 6 is a schematic diagram when the state determination network and the sentence generation network can be two parts of the same network.
  • That is, dialogue-related data or text content can be obtained from external resources such as an external database, knowledge base or corpus and used as dialogue information (that is, the above-mentioned keywords or key sentences) in the dialogue process.
  • In other words, the first user statement and the first dialogue category can be input into the statement generation network to obtain the reply statement corresponding to the first user statement, or the first user statement, the first dialogue category, and the keywords or key sentences can be input into the sentence generation network to obtain the reply sentence corresponding to the first user sentence.
  • In addition to the first user statement, the statement generation network may also generate the reply statement based on other historical statements of the user, which is not limited here.
  • For different dialogue types, the dialogue generation network in the embodiments of the present application can be reused to generate the corresponding reply sentences.
  • For example, a second user statement to be replied can also be obtained; according to the second user statement, second state information of the second user statement is determined through the state determination network, where the second state information includes a second dialogue category of the second user statement, the second dialogue category is a chat-type dialogue, a task-type dialogue, a question-and-answer dialogue or a retrieval-type dialogue, and the second dialogue category is different from the first dialogue category; the second user statement and the second dialogue category are then input to the sentence generation network to obtain a reply sentence corresponding to the second user statement.
  • In the embodiments of the present application, the dialogue category of the user dialogue is identified through the state determination network, and the dialogue generation network is reused for different dialogue types to generate the corresponding reply sentences, which is equivalent to using the same model to process user statements of different dialogue types.
  • During training, the modes of multiple dialogue types can be unified so that multiple dialogue types can be trained at the same time; the trained dialogue system then has the capabilities of multiple dialogue types simultaneously, which reduces the model complexity and model size of the dialogue system.
  • Table 1 is a comparison of dialogues of different dialogue types in the dialogue system provided by this application:
  • Table 1 is an illustration of how the dialogue system handles dialogues of different dialogue types in a practical application:
  • With the same parameter scale, the unified dialogue system significantly outperforms the baseline on task-type dialogues and performs similarly to the baseline model on chit-chat dialogues. This shows that the integrated dialogue system has both task-type and chat-type dialogue capabilities.
  • the task type switching test is carried out, and two data types are designed for testing:
  • Switch-1: the proportion of cases in which the model completes the switch in the first round after the dialogue type is switched
  • Switch-2: the proportion of cases in which the model completes the switch in the second round after the dialogue type is switched
  • The integrated dialogue system can basically complete the dialogue type switching within the first two rounds after the dialogue type is switched, which shows that the integrated dialogue system has the ability to switch between task-type and chat-type dialogue.
  • This embodiment also carried out a task-type dialogue robustness test, simulating the noisy environment of a real dialogue scene (such as talking with a mobile phone assistant while watching TV, or a passenger chatting while the driver performs voice interaction): in a task-type multi-round dialogue, one or two rounds of small-talk dialogue are randomly inserted.
  • the experimental results in Table 6 show that the robustness of the all-in-one dialogue system is significantly better than that of task-specific dialogue models trained separately.
  • the integrated dialogue system provided by the embodiment of the present application can significantly reduce the overall parameter quantity on the premise that the performance of multiple single-type systems can be maintained or improved, and it has the ability to switch between different dialogue types.
  • the robustness of task-based dialogue has been greatly improved.
  • An embodiment of the present application provides a method for determining a reply statement, the method comprising: acquiring a first user statement to be replied; determining, according to the first user statement, first state information of the first user statement through a state determination network, where the first state information includes a first dialogue category of the first user statement, and the first dialogue category is a chat-type dialogue, a task-type dialogue, a question-and-answer dialogue or a retrieval-type dialogue; and inputting the first user statement and the first dialogue category into a sentence generation network to obtain a reply sentence corresponding to the first user sentence.
  • The dialogue category of the user dialogue is identified through the state determination network, and the dialogue generation network is reused to generate corresponding reply sentences for different dialogue types, which is equivalent to using the same model to process user statements of different dialogue types.
  • During model training, by unifying the modes of multiple dialogue types, multiple dialogue types can be trained at the same time, and the trained dialogue system has the capabilities of multiple dialogue types simultaneously, which reduces the model complexity and model size of the dialogue system.
  • FIG. 7 is a schematic diagram of an embodiment of a method for determining a reply sentence provided by an embodiment of the present application. As shown in FIG. 7, a method for determining a reply sentence provided by an embodiment of the present application includes:
  • the first dialog type is a chat dialog, a task dialog, a question-and-answer dialog or a retrieval dialog.
  • When the training device trains the state determination network and the statement generation network, it can obtain training samples.
  • The training samples can include a first user statement, a first dialogue category of the first user statement and a first reply statement corresponding to the first user statement, where the first dialogue category is the true category of the first user statement, and the first dialogue category is a chat-type dialogue, a task-type dialogue, a question-and-answer dialogue or a retrieval-type dialogue.
  • the state determination network and the sentence generation network are the models to be updated.
  • The state determination network and the sentence generation network can be initialized models at the beginning of model training, or pre-trained models that already have some basic capabilities in this field, or models obtained by fine-tuning a pre-trained model for functions other than the above-mentioned basic capabilities.
  • the state determination network and the sentence generation network are GPT models, DialoGPT models, BART models or T5 models.
  • the second dialog category may be a result obtained by the state determination network during one feedforward.
  • Specifically, the second dialogue category of the first user statement may be determined from multiple dialogue types through the state determination network, where the multiple dialogue types include at least two of the chat-type dialogue, the task-type dialogue, the question-and-answer dialogue and the retrieval-type dialogue.
  • the second reply sentence may be a result obtained by the sentence generation network during one feedforward.
  • In a possible implementation, a second user statement, a third dialog category of the second user statement, and a third reply statement corresponding to the second user statement may be acquired, the third dialog category being the true category of the second user statement; according to the second user statement, second state information of the second user statement is determined through the state determination network, the second state information including a fourth dialog category of the second user statement, the fourth dialog category being different from the third dialog category; the second user statement and the third dialog category are input into the sentence generation network to obtain a fourth reply statement corresponding to the second user statement; the state determination network is updated according to the difference between the fourth dialog category and the third dialog category; and the sentence generation network is updated according to the difference between the fourth reply statement and the third reply statement.
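  • A minimal sketch of one such update step is given below, assuming the state determination network exposes category logits and the sentence generation network exposes per-token logits; the batch layout, the network interfaces and the use of PyTorch are assumptions made here, not part of the application.
```python
import torch.nn.functional as F

def training_step(batch, state_net, generation_net, optimizer):
    # Category loss: difference between the predicted and the true dialog category.
    category_logits = state_net(batch["user_tokens"])               # (B, num_categories)
    loss_state = F.cross_entropy(category_logits, batch["true_category_id"])

    # Reply loss: difference between the generated and the true reply tokens,
    # with the true category fed to the generator (teacher forcing).
    reply_logits = generation_net(batch["user_tokens"], batch["true_category_id"])  # (B, T, V)
    loss_reply = F.cross_entropy(
        reply_logits.reshape(-1, reply_logits.size(-1)),
        batch["true_reply_tokens"].reshape(-1),
    )

    loss = loss_state + loss_reply
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```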
  • keywords or key sentences required for constructing the reply statement can be obtained, according to the first user statement, from the first user statement or from a database; the first user statement, the first dialog category and the keywords or key sentences are then input into the sentence generation network to obtain the second reply statement corresponding to the first user statement.
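  • A sketch of this optional retrieval step is shown below; the database interface and the input markup are placeholders for whatever external knowledge base or corpus the dialog system actually uses, and are not defined by the application.
```python
def build_generator_input(user_sentence: str, category: str, database) -> str:
    # `database.search` stands in for an external knowledge base / corpus lookup.
    keywords = database.search(user_sentence, top_k=3)   # e.g. ["Mt. Everest", "8848 m"]
    retrieved = " ; ".join(keywords) if keywords else ""
    return f"<user> {user_sentence} <state> {category} <db> {retrieved}"
```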
  • An embodiment of the present application provides a method for determining a reply statement, the method comprising: acquiring a first user statement, a first dialog category of the first user statement, and a first reply statement corresponding to the first user statement, where the first dialog category is the true category of the first user statement and is a chat-type dialog, a task-type dialog, a question-and-answer dialog or a retrieval-type dialog; determining, according to the first user statement, first state information of the first user statement through a state determination network, the first state information including a second dialog category of the first user statement; inputting the first user statement and the first dialog category into a sentence generation network to obtain a second reply statement corresponding to the first user statement; updating the state determination network according to the difference between the first dialog category and the second dialog category; and updating the sentence generation network according to the difference between the first reply statement and the second reply statement.
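  • The two update signals described in this training method can be summarized as one joint objective; the notation below is illustrative and is not taken from the application.
```latex
\mathcal{L}(\theta_s,\theta_g)
  = -\log p_{\theta_s}\!\bigl(c^{*}\mid x\bigr)
    \;-\; \sum_{t=1}^{T}\log p_{\theta_g}\!\bigl(y^{*}_{t}\mid y^{*}_{<t},\,x,\,c^{*}\bigr)
```
  • where x denotes the first user statement, c* its true dialog category (scored by the state determination network with parameters θ_s), and y* the true reply statement (scored token by token by the sentence generation network with parameters θ_g).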
  • An acquisition module 801, configured to acquire the first user statement to be replied to;
  • a state generating module 802 configured to determine first state information of the first user sentence through a state determination network according to the first user sentence, where the first state information includes a first dialog category of the first user sentence , the first dialog category is a chat dialog, a task dialog, a question-and-answer dialog or a retrieval dialog;
  • the state generating module is specifically used for:
  • determine, through the state determination network, the first dialog category of the first user statement from multiple dialog types, the multiple dialog types including at least two of the chat-type dialog, the task-type dialog, the question-and-answer dialog and the retrieval-type dialog.
  • the acquisition module is further configured to: acquire a second user statement to be replied to;
  • the state generation module is further configured to determine second state information of the second user statement through the state determination network according to the second user statement, the second state information including a second dialog category of the second user statement, where the second dialog category is a chat-type dialog, a task-type dialog, a question-and-answer dialog or a retrieval-type dialog, and the second dialog category is different from the first dialog category;
  • the reply statement generation module is further configured to input the second user statement and the second dialog category into the sentence generation network to obtain a reply statement corresponding to the second user statement.
  • the state determination network and the sentence generation network are GPT models, DialoGPT models, BART models or T5 models.
  • the reply statement generating module is specifically used for:
  • according to the first user statement, obtain keywords or key sentences required for constructing the reply statement from the first user statement or a database;
  • input the first user statement, the first dialog category and the keywords or key sentences into the sentence generation network to obtain the reply statement corresponding to the first user statement.
  • the present application provides a device for determining a reply sentence
  • the device includes: an acquisition module, configured to acquire a first user statement to be replied to; a state generation module, configured to determine, according to the first user statement, first state information of the first user statement through a state determination network, the first state information including a first dialog category of the first user statement, the first dialog category being a chat-type dialog, a task-type dialog, a question-and-answer dialog or a retrieval-type dialog;
  • a reply statement generating module configured to input the first user statement and the first dialogue category into the statement generation network to obtain a reply statement corresponding to the first user statement.
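  • As a sketch, the three modules of this apparatus could be composed as follows; the class and method names are illustrative only and do not come from the application.
```python
class ReplyDeterminationDevice:
    def __init__(self, state_net, generation_net):
        self.state_net = state_net                 # backs the state generation module
        self.generation_net = generation_net       # backs the reply statement generation module

    def acquire(self, raw_input: str) -> str:      # acquisition module
        return raw_input.strip()

    def generate_state(self, sentence: str) -> dict:    # state generation module
        return self.state_net.predict(sentence)         # {"category": ..., "slots": ...}

    def generate_reply(self, sentence: str, state: dict) -> str:  # reply statement generation module
        return self.generation_net.generate(f"<user> {sentence} <state> {state['category']}")

    def __call__(self, raw_input: str) -> str:
        sentence = self.acquire(raw_input)
        return self.generate_reply(sentence, self.generate_state(sentence))
```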
  • The dialog category of the user dialog is identified through the state determination network, and the same dialog generation network is reused to generate the corresponding reply statements for the different dialog types, which is equivalent to using one model to process user statements of different dialog types.
  • During model training, by unifying the modes of the multiple dialog types, the multiple dialog types can be trained at the same time, so that the trained dialog system has the capabilities of the multiple dialog types simultaneously, reducing the model complexity and model size of the dialog system.
  • FIG. 9 is a schematic structural diagram of a device for determining a reply sentence provided by an embodiment of the present application.
  • the device 900 includes: an acquisition module 902, configured to acquire a first user statement, a first dialog category of the first user statement and a first reply statement corresponding to the first user statement, the first dialog category being the true category of the first user statement and being a chat-type dialog, a task-type dialog, a question-and-answer dialog or a retrieval-type dialog;
  • a state generation module 904 configured to determine first state information of the first user sentence through a state determination network according to the first user sentence, the first state information including the second dialog category of the first user sentence ;
  • Reply statement generating module 901 configured to input the first user statement and the first dialog category into the statement generation network, and obtain a second reply statement corresponding to the first user statement;
  • a model updating module 903, configured to update the state determination network according to the difference between the first dialogue category and the second dialogue category;
  • the sentence generation network is updated according to the difference between the first reply sentence and the second reply sentence.
  • For a specific description of the model update module 903, reference may be made to the descriptions of steps 704 and 705 in the above embodiment, and details are not repeated here.
  • the state generating module is specifically used for:
  • determine, through the state determination network, the second dialog category of the first user statement from multiple dialog types, the multiple dialog types including at least two of the chat-type dialog, the task-type dialog, the question-and-answer dialog and the retrieval-type dialog.
  • the acquisition module is further configured to: acquire a second user statement, a third dialog category of the second user statement, and a third reply statement corresponding to the second user statement, the third dialog category being the true category of the second user statement;
  • the state generation module is further configured to determine second state information of the second user statement through the state determination network according to the second user statement, the second state information including a fourth dialog category of the second user statement, the fourth dialog category being different from the third dialog category;
  • the reply sentence generation module is also used to input the second user sentence and the third dialogue category into the sentence generation network to obtain a fourth reply sentence corresponding to the second user sentence;
  • the model update module is further configured to update the state determination network according to the difference between the fourth dialog category and the third dialog category;
  • the sentence generation network is updated according to the difference between the fourth reply sentence and the third reply sentence.
  • the state determination network and the sentence generation network are GPT models, DialoGPT models, BART models or T5 models.
  • the reply statement generation module is specifically configured to: obtain, according to the first user statement, keywords or key sentences required for constructing the reply statement from the first user statement or a database; and input the first user statement, the first dialog category and the keywords or key sentences into the sentence generation network to obtain the second reply statement corresponding to the first user statement.
  • The present application provides a device for determining a reply statement, the device comprising: an acquisition module, configured to acquire a first user statement, a first dialog category of the first user statement, and a first reply statement corresponding to the first user statement, where the first dialog category is the true category of the first user statement and is a chat-type dialog, a task-type dialog, a question-and-answer dialog or a retrieval-type dialog;
  • a state generation module, configured to determine, according to the first user statement, first state information of the first user statement through a state determination network, the first state information including a second dialog category of the first user statement;
  • a reply statement generation module, configured to input the first user statement and the first dialog category into a sentence generation network to obtain a second reply statement corresponding to the first user statement;
  • a model update module, configured to update the state determination network according to the difference between the first dialog category and the second dialog category, and to update the sentence generation network according to the difference between the first reply statement and the second reply statement.
  • This application identifies the dialog category of the user dialog through the state determination network and, for the different dialog types, reuses the same dialog generation network to generate the corresponding reply statements, which is equivalent to using one model to process user statements of different dialog types.
  • During model training, by unifying the modes of the multiple dialog types, the multiple dialog types can be trained at the same time, so that the trained dialog system has the capabilities of the multiple dialog types simultaneously, which reduces the model complexity and the model size of the dialog system.
  • FIG. 10 is a schematic structural diagram of the execution device provided by an embodiment of the application. The execution device 1000 may specifically be embodied as a mobile phone, a tablet, a notebook computer, a smart wearable device, a server, etc., which is not limited here.
  • the apparatus for determining the reply statement described in the embodiment corresponding to FIG. 8 may be deployed on the execution device 1000, so as to realize the function of determining the reply sentence in the embodiment corresponding to FIG. 8 .
  • the execution device 1000 includes: a receiver 1001, a transmitter 1002, a processor 1003, and a memory 1004 (the number of processors 1003 in the execution device 1000 may be one or more), where the processor 1003 may include an application processor 10031 and a communication processor 10032.
  • the receiver 1001 , the transmitter 1002 , the processor 1003 and the memory 1004 may be connected through a bus or in other ways.
  • the memory 1004 may include read-only memory and random-access memory, and provides instructions and data to the processor 1003 .
  • a part of the memory 1004 may also include a non-volatile random access memory (non-volatile random access memory, NVRAM).
  • the memory 1004 stores operating instructions, executable modules or data structures, or a subset or an extended set thereof, for use by the processor, where the operating instructions may include various operating instructions for implementing various operations.
  • the processor 1003 controls the operations of the execution device.
  • various components of the execution device are coupled together through a bus system, where the bus system may include not only a data bus, but also a power bus, a control bus, and a status signal bus.
  • the various buses are referred to as bus systems in the figures.
  • the methods disclosed in the foregoing embodiments of the present application may be applied to the processor 1003 or implemented by the processor 1003 .
  • the processor 1003 may be an integrated circuit chip, which has a signal processing capability.
  • each step of the above method may be completed by an integrated logic circuit of hardware in the processor 1003 or instructions in the form of software.
  • the above-mentioned processor 1003 may be a general-purpose processor, a digital signal processor (digital signal processing, DSP), a microprocessor or a microcontroller, a vision processing unit (vision processing unit, VPU), a tensor processing unit (tensor processing unit, TPU) and other processors suitable for AI computing, and can further include application specific integrated circuit (ASIC), field-programmable gate array (field-programmable gate array, FPGA) or other programmable logic devices, Discrete gate or transistor logic devices, discrete hardware components.
  • the processor 1003 may implement or execute various methods, steps, and logic block diagrams disclosed in the embodiments of the present application.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the receiver 1001 can be used to receive input digital or character information, and to generate signal input related to the relevant settings and function control of the execution device.
  • the transmitter 1002 can be used to output digital or character information through the first interface; the transmitter 1002 can also be used to send instructions to the disk group through the first interface to modify the data in the disk group; the transmitter 1002 can also include a display device such as a display screen .
  • FIG. 11 is a schematic structural diagram of the training device provided by an embodiment of the present application. The training device 1100 is implemented by one or more servers and may vary considerably depending on configuration or performance; it may include one or more central processing units (CPU) 1111 (for example, one or more processors), memory 1132, and one or more storage media 1130 (such as one or more mass storage devices) storing application programs 1142 or data 1144.
  • the memory 1132 and the storage medium 1130 may be temporary storage or persistent storage.
  • the program stored in the storage medium 1130 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the training device.
  • the central processing unit 1111 may be configured to communicate with the storage medium 1130 , and execute a series of instruction operations in the storage medium 1130 on the training device 1100 .
  • the training device 1100 can also include one or more power supplies 1126, one or more wired or wireless network interfaces 1150, one or more input and output interfaces 1158; or, one or more operating systems 1141, such as Windows ServerTM, Mac OS XTM , UnixTM, LinuxTM, FreeBSDTM and so on.
  • the training device may execute the method for determining a reply sentence in the embodiment corresponding to FIG. 7 .
  • An embodiment of the present application also provides a computer-readable storage medium storing a program for signal processing, which, when run on a computer, causes the computer to perform the steps performed by the aforementioned execution device, or causes the computer to perform the steps performed by the aforementioned training device.
  • the storage unit is a storage unit in the chip, such as a register or a cache; the storage unit may also be a storage unit located outside the chip in the radio access device, such as a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (RAM), etc.
  • FIG. 12 is a schematic structural diagram of a chip provided by the embodiment of the present application.
  • the chip can be embodied as a neural network processor NPU 1200; the NPU 1200 is mounted on the host CPU (Host CPU) as a coprocessor, and tasks are assigned by the Host CPU.
  • the core part of the NPU is the operation circuit 1203, and the operation circuit 1203 is controlled by the controller 1204 to extract matrix data in the memory and perform multiplication operations.
  • the NPU 1200 can implement the method for determining the reply sentence provided in the embodiment described in FIG. 3 through the cooperation between various internal devices, or perform inference on the model obtained from training.
  • the computing circuit 1203 in the NPU 1200 can perform the steps of acquiring the model and performing model training on the model.
  • the computing circuit 1203 in the NPU 1200 includes multiple processing units (Process Engine, PE).
  • arithmetic circuit 1203 is a two-dimensional systolic array.
  • the arithmetic circuit 1203 may also be a one-dimensional systolic array or other electronic circuits capable of performing mathematical operations such as multiplication and addition.
  • arithmetic circuit 1203 is a general-purpose matrix processor.
  • the operation circuit fetches the data corresponding to the matrix B from the weight memory 1202, and caches it in each PE in the operation circuit.
  • the operation circuit fetches the data of matrix A from the input memory 1201 and performs matrix operation with matrix B, and the obtained partial or final results of the matrix are stored in an accumulator (accumulator) 1208 .
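  • The dataflow of the operation circuit and the accumulator can be pictured with the following software analogy; it illustrates only the tiled multiply-accumulate pattern, not the actual NPU microarchitecture, and the tile size is an arbitrary choice for illustration.
```python
import numpy as np

def tiled_matmul(A: np.ndarray, B: np.ndarray, tile: int = 16) -> np.ndarray:
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    acc = np.zeros((M, N), dtype=np.float64)       # plays the role of accumulator 1208
    for k0 in range(0, K, tile):                   # B is consumed tile by tile, as if cached on the PEs
        acc += A[:, k0:k0 + tile] @ B[k0:k0 + tile, :]
    return acc
```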
  • the unified memory 1206 is used to store input data and output data.
  • the weight data is transferred to the weight memory 1202 through the storage unit access controller (Direct Memory Access Controller, DMAC) 1205.
  • the input data is also transferred to the unified memory 1206 through the DMAC.
  • the BIU is the Bus Interface Unit, that is, the bus interface unit 1210, which is used for the interaction between the AXI bus and the DMAC and the instruction fetch buffer (Instruction Fetch Buffer, IFB) 1209.
  • the bus interface unit 1210 (Bus Interface Unit, BIU) is used for the instruction fetch buffer 1209 to obtain instructions from the external memory, and is also used for the storage unit access controller 1205 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
  • the DMAC is mainly used to move the input data in the external memory DDR to the unified memory 1206 , move the weight data to the weight memory 1202 , or move the input data to the input memory 1201 .
  • the vector computing unit 1207 includes a plurality of computing processing units, and if necessary, further processes the output of the computing circuit 1203, such as vector multiplication, vector addition, exponent operation, logarithmic operation, size comparison and so on. It is mainly used for non-convolutional/fully connected layer network calculations in neural networks, such as Batch Normalization (batch normalization), pixel-level summation, and upsampling of feature planes.
  • the vector computation unit 1207 can store the vector of the processed output to unified memory 1206 .
  • the vector calculation unit 1207 can apply a linear function or a nonlinear function to the output of the operation circuit 1203, for example performing linear interpolation on the feature planes extracted by a convolutional layer, or, as another example, applying a function to a vector of accumulated values to generate activation values.
  • the vector computation unit 1207 generates normalized values, pixel-level summed values, or both.
  • the vector of processed outputs can be used as an activation input to operational circuitry 1203, eg, for use in subsequent layers in a neural network.
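  • The element-wise post-processing performed by the vector computation unit can be illustrated as follows (bias addition, batch normalization and an activation applied to the accumulator output); the real unit supports a richer operation set, and the shapes and the choice of ReLU here are assumptions for illustration only.
```python
import numpy as np

def vector_unit_postprocess(acc, bias, gamma, beta, eps=1e-5):
    x = acc + bias                                          # vector addition
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    x = gamma * (x - mean) / np.sqrt(var + eps) + beta      # batch normalization
    return np.maximum(x, 0.0)                               # activation (ReLU as an example)
```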
  • An instruction fetch buffer (instruction fetch buffer) 1209 connected to the controller 1204 is used to store instructions used by the controller 1204;
  • the unified memory 1206, the input memory 1201, the weight memory 1202 and the fetch memory 1209 are all On-Chip memories. External memory is private to the NPU hardware architecture.
  • the processor mentioned above can be a general-purpose central processing unit, microprocessor, ASIC, or one or more integrated circuits for controlling the execution of the above-mentioned programs.
  • the essence of the technical solution of this application or the part that contributes to the prior art can be embodied in the form of a software product, and the computer software product is stored in a readable storage medium, such as a floppy disk of a computer , U disk, mobile hard disk, ROM, RAM, magnetic disk or optical disk, etc., including several instructions to make a computer device (which can be a personal computer, training device, or network device, etc.) execute the instructions described in various embodiments of the present application method.
  • In the above embodiments, all or part of the implementation may be carried out by software, hardware, firmware or any combination thereof.
  • When implemented using software, it may be implemented in whole or in part in the form of a computer program product.
  • the computer-readable storage medium may be any available medium that can be stored by a computer, or a data storage device such as a training device or a data center integrated with one or more available media.
  • the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a DVD), or a semiconductor medium (such as a solid state disk (Solid State Disk, SSD)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

一种确定回复语句的方法,所述方法包括:获取待回复的第一用户语句(301);根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第一对话类别,并将所述第一用户语句以及所述第一对话类别,输入至语句生成网络,得到所述第一用户语句对应的回复语句(303)。该方法通过状态确定网络识别出用户对话的对话类别,并针对于不同的对话类型,复用对话生成网络来生成对应的回复语句,相当于可以采用同一个模型来处理不同对话类型的用户语句,降低了对话系统的模型复杂度以及模型大小。

Description

一种确定回复语句的方法及装置
本申请要求于2021年10月15日提交中国专利局、申请号为202111205658.2、发明名称为“一种确定回复语句的方法及装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及人工智能领域,尤其涉及一种确定回复语句的方法及装置。
背景技术
人工智能(artificial intelligence,AI)是利用数字计算机或者数字计算机控制的机器模拟、延伸和扩展人的智能,感知环境、获取知识并使用知识获得最佳结果的理论、方法、技术及应用系统。换句话说,人工智能是计算机科学的一个分支,它企图了解智能的实质,并生产出一种新的能以人类智能相似的方式作出反应的智能机器。人工智能也就是研究各种智能机器的设计原理与实现方法,使机器具有感知、推理与决策的功能。
对话系统有多种对话类型,例如闲聊型(主要面向娱乐、陪护等)、任务型(用于完成用户特定需求,例如订票、订酒店等)、问答型(给用户提供知识相关的服务,回答用户问题)等。随着深度学习的进步,对话系统取得了巨大的进步。
在现有的实现中,为了能够使得对话系统同时具备应对上述多种对话类型的用户对话,针对于每种对话类型,单独训练对应的对话模型,不同的对话模型通过集成的方式组织在一起来构造一个多功能的对话系统。然而,上述对话系统存在系统结构复杂、且占据存储空间较大的问题。
发明内容
本申请提供了一种确定回复语句的方法,通过状态确定网络识别出用户对话的对话类别,并针对于不同的对话类型,复用对话生成网络来生成对应的回复语句,相当于可以采用同一个模型来处理不同对话类型的用户语句,降低了对话系统的模型复杂度以及模型大小。
第一方面,本申请提供了一种确定回复语句的方法,所述方法包括:
获取待回复的第一用户语句;
在一种可能的实现中,该第一用户语句可以是用户向问答设备输入的问题、请求等文本。示例性的,用户可以采用文本形式向问答设备输入目标问题,该情况下,问答设备可以直接获取到文本形式的第一用户语句。用户还可以采用语音形式向问答设备输入目标问题,该情况下,问答设备可以将接收到的语音信息转换为文本信息,从而得到文本形式的第一用户语句。用户还可以采用肢体语言向问答设备输入目标问题,该情况下,问答设备通过对用户的肢体动作进行采集和分析,识别得到文本形式的第一用户语句。
根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第一对话类别,所述第一对话类别为聊天型对 话、任务型对话、问答型对话或检索型对话;
其中,状态确定网络可以为训练好的,具备基于用户语句确定对应的对话类型的能力。
应理解,本申请并不限定状态确定网络需要具备识别出四种对话类别(聊天型对话、任务型对话、问答型对话或检索型对话)的能力,状态确定网络可以具备识别出四种对话类别中至少两种的能力。
应理解,在确定对话类型时,状态确定网络的输入可以为第一用户语句(可选的,还可以包括用户的其他历史语句),这里并不限定。
应理解,上述对话类型也可以称之为对话置信状态(belief state)。
其中,聊天型对话也可以称之为闲聊型对话。
其中,状态确定网络可以为GPT模型的一部分或者完整的GPT模型,语句生成网络可以为DialoGPT模型的一部分或者完整的DialoGPT模型,语句生成网络可以为BART模型的一部分或者完整的BART模型,语句生成网络可以为T5模型的一部分或者完整的T5模型。
将所述第一用户语句以及所述第一对话类别,输入至语句生成网络,得到所述第一用户语句对应的回复语句。
在一种可能的实现中,所述语句生成网络可以为GPT模型、DialoGPT模型、BART模型或T5模型。其中,语句生成网络可以为GPT模型的一部分或者完整的GPT模型,语句生成网络可以为DialoGPT模型的一部分或者完整的DialoGPT模型,语句生成网络可以为BART模型的一部分或者完整的BART模型,语句生成网络可以为T5模型的一部分或者完整的T5模型。
可选的,本申请实施例中的状态确定网络和语句生成网络可以为同一个网络的两部分,也可以为不同的网络。
应理解,语句生成网络还可以基于除了第一用户语句之外的其他用户历史语句来生成第一用户语句的回复语句,这里并不限定。
应理解,针对于不同对话类别的用户语句,都可以作为同一个语句生成网络的输入,来得到回复语句。
本申请实施例中,通过状态确定网络识别出用户对话的对话类别,并针对于不同的对话类型,复用对话生成网络来生成对应的回复语句,相当于可以采用同一个模型来处理不同对话类型的用户语句,在模型训练时,可以通过统一多种对话类型的模式,使得多种对话类型可以同时进行训练,训练出的对话系统同时具备多种对话类型的能力,降低了对话系统的模型复杂度以及模型大小。
在一种可能的实现中,第一状态信息还可以包括槽位信息,其中,槽位信息可以为第一用户语句中的关键词。
在一种可能的实现中,所述根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,包括:通过状态确定网络,从多个对话类型中确定所述第一用户语句的第一对话类别,所述多个对话类型包括所述聊天型对话、任务型对话、问答型对话以及检索型对话中的至少两个。
在一种可能的实现中,可以通过状态确定网络,从多个对话类型中确定所述第一用户 语句的第一对话类别,所述多个对话类型包括所述聊天型对话、任务型对话、问答型对话以及检索型对话中的至少两个。
例如,多个对话类型包括聊天型对话以及任务型对话。
例如,多个对话类型包括聊天型对话以及问答型对话。
例如,多个对话类型包括聊天型对话以及检索型对话。
例如,多个对话类型包括任务型对话以及问答型对话。
例如,多个对话类型包括任务型对话以及检索型对话。
例如,多个对话类型包括问答型对话以及检索型对话。
例如,多个对话类型包括聊天型对话、任务型对话以及问答型对话。
例如,多个对话类型包括聊天型对话、任务型对话以及检索型对话。
例如,多个对话类型包括任务型对话、问答型对话以及检索型对话。
例如,多个对话类型包括聊天型对话、任务型对话、问答型对话以及检索型对话。
本申请实施例中,针对于不同的对话类型,可以复用本申请实施例中的对话生成网络来生成对应的回复语句,在一种可能的实现中,还可以获取到待回复的第二用户语句,根据所述第二用户语句,通过所述状态确定网络确定所述第二用户语句的第二状态信息,所述第二状态信息包括所述第二用户语句的第二对话类别,所述第二对话类别为聊天型对话、任务型对话、问答型对话或检索型对话,且所述第二对话类别和所述第一对话类别不同,进而将所述第二用户语句以及所述第二对话类别输入至所述语句生成网络,得到所述第二用户语句对应的回复语句。
在一种可能的实现中,所述状态确定网络和所述语句生成网络为GPT模型、DialoGPT模型、BART模型或T5模型。其中,本申请并不限定所述状态确定网络和所述语句生成网络分别为完整的GPT模型、DialoGPT模型、BART模型或T5模型,状态确定网络和所述语句生成网络可以分别为具备GPT模型、DialoGPT模型、BART模型或T5模型的相似网络结构或网络性能的模型。例如,状态确定网络和所述语句生成网络可以分别为GPT模型的部分、DialoGPT模型的部分、BART模型的部分或T5模型的部分。
在一种可能的实现中,对话系统可以根据所述第一用户语句,从所述第一用户语句或者数据库中得到用于构建所述回复语句所需的关键词或关键句,将所述第一用户语句、所述第一对话类别、所述关键词或关键句输入至所述语句生成网络,得到所述第一用户语句对应的回复语句。
本申请实施例中,可以根据第一用户语句,以及第一对话类别,在外部数据库/知识库/语料库等外部资源中获取对话相关的数据或文本内容作为对话信息(即上述关键词或关键句)加入对话过程。
在一种可能的实现中,可以将所述第一用户语句以及所述第一对话类别,输入至语句生成网络,得到所述第一用户语句对应的回复语句,或者,将所述第一用户语句、所述第 一对话类别、所述关键词或关键句输入至所述语句生成网络,得到所述第一用户语句对应的回复语句。
第二方面,本申请提供了一种确定回复语句的方法,所述方法包括:
获取第一用户语句、所述第一用户语句的第一对话类别以及所述第一用户语句对应的第一回复语句,所述第一对话类别为所述第一用户语句的真实类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;
根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第二对话类别;
将所述第一用户语句以及所述第一对话类别输入至语句生成网络,得到所述第一用户语句对应的第二回复语句;
根据所述第一对话类别和所述第二对话类别之间的差异,更新所述状态确定网络;
根据所述第一回复语句和所述第二回复语句之间的差异,更新所述语句生成网络。
本申请通过状态确定网络识别出用户对话的对话类别,并针对于不同的对话类型,复用对话生成网络来生成对应的回复语句,相当于可以采用同一个模型来处理不同对话类型的用户语句,在模型训练时,可以通过统一多种对话类型的模式,使得多种对话类型可以同时进行训练,训练出的对话系统同时具备多种对话类型的能力,降低了对话系统的模型复杂度以及模型大小。
在一种可能的实现中,所述根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,包括:
通过状态确定网络,从多个对话类型中确定所述第一用户语句的第二对话类别,所述多个对话类型包括所述聊天型对话、任务型对话、问答型对话以及检索型对话中的至少两个。
在一种可能的实现中,所述方法还包括:
获取第二用户语句、所述第二用户语句的第三对话类别以及所述第二用户语句对应的第三回复语句,所述第三对话类别为所述第二用户语句的真实类别;
根据所述第二用户语句,通过所述状态确定网络确定所述第二用户语句的第二状态信息,所述第二状态信息包括所述第二用户语句的第四对话类别,所述第四对话类别和所述第三对话类别不同;
将所述第二用户语句以及所述第三对话类别输入至所述语句生成网络,得到所述第二用户语句对应的第四回复语句;
根据所述第四对话类别和所述第三对话类别之间的差异,更新所述状态确定网络;
根据所述第四回复语句和所述第三回复语句之间的差异,更新所述语句生成网络。
在一种可能的实现中,所述状态确定网络和所述语句生成网络为GPT模型、DialoGPT 模型、BART模型或T5模型。
在一种可能的实现中,所述将所述第一用户语句以及所述第一对话类别输入至语句生成网络,得到所述第一用户语句对应的第二回复语句,包括:
根据所述第一用户语句,从所述第一用户语句或者数据库中得到用于构建所述回复语句所需的关键词或关键句;
将所述第一用户语句、所述第一对话类别、所述关键词或关键句输入至所述语句生成网络,得到所述第一用户语句对应的第二回复语句。
第三方面,本申请提供了一种确定回复语句的装置,所述装置包括:
获取模块,用于获取待回复的第一用户语句;
状态生成模块,用于根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第一对话类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;
回复语句生成模块,用于将所述第一用户语句以及所述第一对话类别,输入至语句生成网络,得到所述第一用户语句对应的回复语句。
本申请提供了一种确定回复语句的装置,所述装置包括:获取模块,用于获取待回复的第一用户语句;状态生成模块,用于根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第一对话类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;回复语句生成模块,用于将所述第一用户语句以及所述第一对话类别,输入至语句生成网络,得到所述第一用户语句对应的回复语句。通过状态确定网络识别出用户对话的对话类别,并针对于不同的对话类型,复用对话生成网络来生成对应的回复语句,相当于可以采用同一个模型来处理不同对话类型的用户语句,在模型训练时,可以通过统一多种对话类型的模式,使得多种对话类型可以同时进行训练,训练出的对话系统同时具备多种对话类型的能力,降低了对话系统的模型复杂度以及模型大小。
在一种可能的实现中,所述状态生成模块,具体用于:
通过状态确定网络,从多个对话类型中确定所述第一用户语句的第一对话类别,所述多个对话类型包括所述聊天型对话、任务型对话、问答型对话以及检索型对话中的至少两个。
在一种可能的实现中,所述获取模块,还用于:
获取待回复的第二用户语句;
所述状态生成模块,还用于根据所述第二用户语句,通过所述状态确定网络确定所述第二用户语句的第二状态信息,所述第二状态信息包括所述第二用户语句的第二对话类别,所述第二对话类别为聊天型对话、任务型对话、问答型对话或检索型对话,且所述第二对 话类别和所述第一对话类别不同;
所述回复语句生成模块,还用于将所述第二用户语句以及所述第二对话类别输入至所述语句生成网络,得到所述第二用户语句对应的回复语句。
在一种可能的实现中,所述状态确定网络和所述语句生成网络为GPT模型、DialoGPT模型、BART模型或T5模型。
在一种可能的实现中,所述回复语句生成模块,具体用于:
根据所述第一用户语句,从所述第一用户语句或者数据库中得到用于构建所述回复语句所需的关键词或关键句;
将所述第一用户语句、所述第一对话类别、所述关键词或关键句输入至所述语句生成网络,得到所述第一用户语句对应的回复语句。
第四方面,本申请提供了一种确定回复语句的装置,所述装置包括:
获取模块,用于获取第一用户语句、所述第一用户语句的第一对话类别以及所述第一用户语句对应的第一回复语句,所述第一对话类别为所述第一用户语句的真实类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;
状态生成模块,用于根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第二对话类别;
回复语句生成模块,用于将所述第一用户语句以及所述第一对话类别输入至语句生成网络,得到所述第一用户语句对应的第二回复语句;
模型更新模块,用于根据所述第一对话类别和所述第二对话类别之间的差异,更新所述状态确定网络;
根据所述第一回复语句和所述第二回复语句之间的差异,更新所述语句生成网络。
本申请提供了一种确定回复语句的装置,所述装置包括:获取模块,用于获取第一用户语句、所述第一用户语句的第一对话类别以及所述第一用户语句对应的第一回复语句,所述第一对话类别为所述第一用户语句的真实类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;状态生成模块,用于根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第二对话类别;回复语句生成模块,用于将所述第一用户语句以及所述第一对话类别输入至语句生成网络,得到所述第一用户语句对应的第二回复语句;模型更新模块,用于根据所述第一对话类别和所述第二对话类别之间的差异,更新所述状态确定网络;根据所述第一回复语句和所述第二回复语句之间的差异,更新所述语句生成网络。本申请通过状态确定网络识别出用户对话的对话类别,并针对于不同的对话类型,复用对话生成网络来生成对应的回复语句,相当于可以采用同一个模型来处理不同对话类型的用户语句,在模型训练时,可以通过统一多种对话类型的模式,使得多种对话类型可以同时进行训练,训练出的对话系统同时具备多种对话类型的能力,降低了对话系统的模型复杂度以及模型 大小。
在一种可能的实现中,所述状态生成模块,具体用于:
通过状态确定网络,从多个对话类型中确定所述第一用户语句的第二对话类别,所述多个对话类型包括所述聊天型对话、任务型对话、问答型对话以及检索型对话中的至少两个。
在一种可能的实现中,所述获取模块,还用于:
获取第二用户语句、所述第二用户语句的第三对话类别以及所述第二用户语句对应的第三回复语句,所述第三对话类别为所述第二用户语句的真实类别;
所述状态生成模块,还用于根据所述第二用户语句,通过所述状态确定网络确定所述第二用户语句的第二状态信息,所述第二状态信息包括所述第二用户语句的第四对话类别,所述第四对话类别和所述第三对话类别不同;
所述回复语句生成模块,还用于将所述第二用户语句以及所述第三对话类别输入至所述语句生成网络,得到所述第二用户语句对应的第四回复语句;
所述模型更新模块,还用于根据所述第四对话类别和所述第三对话类别之间的差异,更新所述状态确定网络;
根据所述第四回复语句和所述第三回复语句之间的差异,更新所述语句生成网络。
在一种可能的实现中,所述状态确定网络和所述语句生成网络为GPT模型、DialoGPT模型、BART模型或T5模型。
在一种可能的实现中,所述回复语句生成模块,具体用于:
根据所述第一用户语句,从所述第一用户语句或者数据库中得到用于构建所述回复语句所需的关键词或关键句;
将所述第一用户语句、所述第一对话类别、所述关键词或关键句输入至所述语句生成网络,得到所述第一用户语句对应的第二回复语句。
第五方面,本申请实施例提供了一种确定回复语句的装置,可以包括存储器、处理器以及总线系统,其中,存储器用于存储程序,处理器用于执行存储器中的程序,以执行如上述第一方面任一可选的方法。
第六方面,本申请实施例提供了一种确定回复语句的装置,可以包括存储器、处理器以及总线系统,其中,存储器用于存储程序,处理器用于执行存储器中的程序,以执行如上述第二方面任一可选的方法。
第七方面,本申请实施例提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有计算机程序,当其在计算机上运行时,使得计算机执行上述第一方面任一可选的方法、以及第二方面任一可选的方法。
第八方面,本申请实施例提供了一种计算机程序产品,包括代码,当代码被执行时,用于实现上述第一方面任一可选的方法、以及第二方面任一可选的方法。
第九方面,本申请提供了一种芯片系统,该芯片系统包括处理器,用于支持执行设备或训练设备实现上述方面中所涉及的功能,例如,发送或处理上述方法中所涉及的数据;或,信息。在一种可能的设计中,所述芯片系统还包括存储器,所述存储器,用于保存执行设备或训练设备必要的程序指令和数据。该芯片系统,可以由芯片构成,也可以包括芯片和其他分立器件。
本申请实施例提供了一种确定回复语句的方法,所述方法包括:获取待回复的第一用户语句;根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第一对话类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;将所述第一用户语句以及所述第一对话类别,输入至语句生成网络,得到所述第一用户语句对应的回复语句。通过状态确定网络识别出用户对话的对话类别,并针对于不同的对话类型,复用对话生成网络来生成对应的回复语句,相当于可以采用同一个模型来处理不同对话类型的用户语句,在模型训练时,可以通过统一多种对话类型的模式,使得多种对话类型可以同时进行训练,训练出的对话系统同时具备多种对话类型的能力,降低了对话系统的模型复杂度以及模型大小。
附图说明
图1为人工智能主体框架的一种结构示意图;
图2为本申请实施例提供的一种系统架构的示意图;
图3为本申请实施例提供的一种确定回复语句的方法的实施例示意;
图4为一种任务型对话的界面示意;
图5为本申请实施例提供的模型的示意;
图6为本申请实施例提供的模型的示意;
图7为本申请实施例提供的一种确定回复语句的方法的示意;
图8为本申请实施例提供的一种确定回复语句的装置的示意;
图9为本申请实施例提供的一种确定回复语句的装置的示意;
图10为本申请实施例提供的执行设备的一种结构示意图;
图11是本申请实施例提供的训练设备一种结构示意图;
图12为本申请实施例提供的芯片的一种结构示意图。
具体实施方式
下面结合本发明实施例中的附图对本发明实施例进行描述。本发明的实施方式部分使用的术语仅用于对本发明的具体实施例进行解释,而非旨在限定本发明。
下面结合附图,对本申请的实施例进行描述。本领域普通技术人员可知,随着技术的发展和新场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类 似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的术语在适当情况下可以互换,这仅仅是描述本申请的实施例中对相同属性的对象在描述时所采用的区分方式。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,以便包含一系列单元的过程、方法、系统、产品或设备不必限于那些单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它单元。
首先对人工智能系统总体工作流程进行描述,请参见图1,图1示出的为人工智能主体框架的一种结构示意图,下面从“智能信息链”(水平轴)和“IT价值链”(垂直轴)两个维度对上述人工智能主题框架进行阐述。其中,“智能信息链”反映从数据的获取到处理的一列过程。举例来说,可以是智能信息感知、智能信息表示与形成、智能推理、智能决策、智能执行与输出的一般过程。在这个过程中,数据经历了“数据—信息—知识—智慧”的凝练过程。“IT价值链”从人智能的底层基础设施、信息(提供和处理技术实现)到系统的产业生态过程,反映人工智能为信息技术产业带来的价值。
(1)基础设施
基础设施为人工智能系统提供计算能力支持,实现与外部世界的沟通,并通过基础平台实现支撑。通过传感器与外部沟通;计算能力由智能芯片(CPU、NPU、GPU、ASIC、FPGA等硬件加速芯片)提供;基础平台包括分布式计算框架及网络等相关的平台保障和支持,可以包括云存储和计算、互联互通网络等。举例来说,传感器和外部沟通获取数据,这些数据提供给基础平台提供的分布式计算系统中的智能芯片进行计算。
(2)数据
基础设施的上一层的数据用于表示人工智能领域的数据来源。数据涉及到图形、图像、语音、文本,还涉及到传统设备的物联网数据,包括已有系统的业务数据以及力、位移、液位、温度、湿度等感知数据。
(3)数据处理
数据处理通常包括数据训练,机器学习,深度学习,搜索,推理,决策等方式。
其中,机器学习和深度学习可以对数据进行符号化和形式化的智能信息建模、抽取、预处理、训练等。
推理是指在计算机或智能系统中,模拟人类的智能推理方式,依据推理控制策略,利用形式化的信息进行机器思维和求解问题的过程,典型的功能是搜索与匹配。
决策是指智能信息经过推理后进行决策的过程,通常提供分类、排序、预测等功能。
(4)通用能力
对数据经过上面提到的数据处理后,进一步基于数据处理的结果可以形成一些通用的能力,比如可以是算法或者一个通用系统,例如,翻译,文本的分析,计算机视觉的处理,语音识别,图像的识别等等。
(5)智能产品及行业应用
智能产品及行业应用指人工智能系统在各领域的产品和应用,是对人工智能整体解决方案的封装,将智能信息决策产品化、实现落地应用,其应用领域主要包括:智能终端、智能交通、智能医疗、自动驾驶、智慧城市等。
以下示例性介绍本申请的应用场景。
本申请实施例提供的方法及装置能够应用在自然语言处理(natural language processing,NLP)技术中人机对话的场景。具体而言,本申请实施例能够应用在构建对话机器人并向最终用户提供语义理解和对话服务的场景中。其中,对话机器人例如是儿童陪伴教育机器人、售后自动回答应用、售前咨询机器人、终端上的智能语音助手等。
接下来介绍本申请实施例的应用架构。
下面结合图2对本申请实施例提供的系统架构进行详细的介绍。图2为本申请一实施例提供的系统架构示意图。如图2所示,系统架构500包括执行设备510、训练设备520、数据库530、客户设备540、数据存储系统550以及数据采集系统560。
执行设备510包括计算模块511、I/O接口512、预处理模块513和预处理模块514。计算模块511中可以包括状态确定网络/规则501,预处理模块513和预处理模块514是可选的。
数据采集设备560用于采集训练样本。训练样本可以为文本数据等等,在本申请实施例中,训练样本为对状态确定网络以及语句生成网络进行训练时所采用的数据。在采集到训练样本之后,数据采集设备560将这些训练样本存入数据库530。
应理解,数据库530中还可以维护有状态确定网络以及语句生成网络等预训练模型或者对预训练模型进行至少一次微调(fine-tune)后得到的模型。
训练设备520可以基于数据库530中维护的训练样本对状态确定网络以及语句生成网络进行训练,以得到状态确定网络/规则501。本申请实施例中,状态确定网络/规则501可以为训练后的状态确定网络以及语句生成网络。
需要说明的是,在实际应用中,数据库530中维护的训练样本不一定都来自于数据采集设备560的采集,也有可能是从其他设备接收得到的。另外需要说明的是,训练设备520也不一定完全基于数据库530维护的训练样本进行状态确定网络/规则501的训练,也有可能从云端或其他地方获取训练样本进行模型训练,上述描述不应该作为对本申请实施例的限定。
具体的,训练样本可以为来自客户设备540的私有数据,进而训练设备520可以将来自客户设备540的私有数据作为训练样本对状态确定网络以及语句生成网络进行模型微调。
本申请实施例中,训练设备520可以通过本申请实施例中的模型训练方法对状态确定网络以及语句生成网络进行训练,以得到训练后的状态确定网络以及语句生成网络。
根据训练设备520训练得到的状态确定网络/规则501可以应用于不同的系统或设备中,如应用于图2所示的执行设备510,所述执行设备510可以是终端,如手机终端,平板电脑,笔记本电脑,增强现实(augmented reality,AR)/虚拟现实(virtual reality,VR)设备,车载终端等,还可以是服务器或者云端等。
在图2中,执行设备510配置输入/输出(input/output,I/O)接口512,用于与外部设备进行数据交互,用户可以通过客户设备540向I/O接口512输入数据(例如本申请实施例中的第一用户语句、第二用户语句)。
预处理模块513和预处理模块514用于根据I/O接口512接收到的输入数据进行预处 理。应理解,可以没有预处理模块513和预处理模块514或者只有的一个预处理模块。当不存在预处理模块513和预处理模块514时,可以直接采用计算模块511对输入数据进行处理。
在执行设备510对输入数据进行预处理,或者在执行设备510的计算模块511执行计算等相关的处理过程中,执行设备510可以调用数据存储系统550中的数据、代码等以用于相应的处理,也可以将相应处理得到的数据、指令等存入数据存储系统550中。
最后,I/O接口512将处理结果(例如回复语句)呈现给客户设备540,从而提供给用户。
在图2所示情况下,用户可以手动给定输入数据,该“手动给定输入数据”可以通过I/O接口512提供的界面进行操作。另一种情况下,客户设备540可以自动地向I/O接口512发送输入数据,如果要求客户设备540自动发送输入数据需要获得用户的授权,则用户可以在客户设备540中设置相应权限。用户可以在客户设备540查看执行设备510输出的结果,具体的呈现形式可以是显示、声音、动作等具体方式。客户设备540也可以作为数据采集端,采集如图所示输入I/O接口512的输入数据及输出I/O接口512的输出结果作为新的样本数据,并存入数据库530。当然,也可以不经过客户设备540进行采集,而是由I/O接口512直接将如图所示输入I/O接口512的输入数据及输出I/O接口512的输出结果,作为新的样本数据存入数据库530。
值得注意的是,图2仅是本申请实施例提供的一种系统架构的示意图,图中所示设备、器件、模块等之间的位置关系不构成任何限制,例如,在图2中,数据存储系统550相对执行设备510是外部存储器,在其它情况下,也可以将数据存储系统550置于执行设备510中。应理解,上述执行设备510可以部署于客户设备540中。
本申请实施例中,上述训练设备520可以获取到存储器(图2中未示出,可以集成于训练设备520或者与训练设备520分离部署)中存储的代码来实现本申请实施例中的确定回复语句的方法。
本申请实施例中,训练设备520可以包括硬件电路(如专用集成电路(application specific integrated circuit,ASIC)、现场可编程门阵列(field-programmable gate array,FPGA)、通用处理器、数字信号处理器(digital signal processing,DSP)、微处理器或微控制器等等)、或这些硬件电路的组合,例如,训练设备520可以为具有执行指令功能的硬件系统,如CPU、DSP等,或者为不具有执行指令功能的硬件系统,如ASIC、FPGA等,或者为上述不具有执行指令功能的硬件系统以及具有执行指令功能的硬件系统的组合。
具体的,训练设备520可以为具有执行指令功能的硬件系统,本申请实施例提供的确定回复语句的方法可以为存储在存储器中的软件代码,训练设备520可以从存储器中获取到软件代码,并执行获取到的软件代码来实现本申请实施例提供的确定回复语句的方法。
应理解,训练设备520可以为不具有执行指令功能的硬件系统以及具有执行指令功能的硬件系统的组合,本申请实施例提供的模型训练方法的部分步骤还可以通过训练设备520中不具有执行指令功能的硬件系统来实现,这里并不限定。
应理解,上述执行设备可以为云侧的服务器或者端侧的电子设备。
由于本申请实施例涉及大量神经网络的应用,为了便于理解,下面先对本申请实施例涉及的相关术语及神经网络等相关概念进行介绍。
(1)神经网络
神经网络可以是由神经单元组成的,神经单元可以是指以xs(即输入数据)和截距1为输入的运算单元,该运算单元的输出可以为:
$h_{W,b}(x)=f(W^{T}x)=f\left(\sum_{s=1}^{n}W_{s}x_{s}+b\right)$
其中,s=1、2、……n,n为大于1的自然数,Ws为xs的权重,b为神经单元的偏置。f为神经单元的激活函数(activation functions),用于将非线性特性引入神经网络中,来将神经单元中的输入信号转换为输出信号。该激活函数的输出信号可以作为下一层卷积层的输入,激活函数可以是sigmoid函数。神经网络是将多个上述单一的神经单元联结在一起形成的网络,即一个神经单元的输出可以是另一个神经单元的输入。每个神经单元的输入可以与前一层的局部接受域相连,来提取局部接受域的特征,局部接受域可以是由若干个神经单元组成的区域。
(2)深度神经网络
深度神经网络(Deep Neural Network,DNN),也称多层神经网络,可以理解为具有很多层隐含层的神经网络,这里的“很多”并没有特别的度量标准。从DNN按不同层的位置划分,DNN内部的神经网络可以分为三类:输入层,隐含层,输出层。一般来说第一层是输入层,最后一层是输出层,中间的层数都是隐含层。层与层之间是全连接的,也就是说,第i层的任意一个神经元一定与第i+1层的任意一个神经元相连。虽然DNN看起来很复杂,但是就每一层的工作来说,其实并不复杂,简单来说就是如下线性关系表达式:
$\vec{y}=\alpha(W\vec{x}+\vec{b})$,其中,$\vec{x}$是输入向量,$\vec{y}$是输出向量,$\vec{b}$是偏移向量,W是权重矩阵(也称系数),α()是激活函数。每一层仅仅是对输入向量$\vec{x}$经过如此简单的操作得到输出向量$\vec{y}$。由于DNN层数多,则系数W和偏移向量$\vec{b}$的数量也就很多了。这些参数在DNN中的定义如下所述:以系数W为例:假设在一个三层的DNN中,第二层的第4个神经元到第三层的第2个神经元的线性系数定义为$W_{24}^{3}$。上标3代表系数W所在的层数,而下标对应的是输出的第三层索引2和输入的第二层索引4。总结就是:第L-1层的第k个神经元到第L层的第j个神经元的系数定义为$W_{jk}^{L}$。
需要注意的是,输入层是没有W参数的。在深度神经网络中,更多的隐含层让网络更能够刻画现实世界中的复杂情形。理论上而言,参数越多的模型复杂度越高,“容量”也就越大,也就意味着它能完成更复杂的学习任务。训练深度神经网络的也就是学习权重矩阵的过程,其最终目的是得到训练好的深度神经网络的所有层的权重矩阵(由很多层的向量W形成的权重矩阵)。
(3)损失函数
在训练深度神经网络的过程中,因为希望深度神经网络的输出尽可能的接近真正想要预测的值,所以可以通过比较当前网络的预测值和真正想要的目标值,再根据两者之间的 差异情况来更新每一层神经网络的权重向量(当然,在第一次更新之前通常会有初始化的过程,即为深度神经网络中的各层预先配置参数),比如,如果网络的预测值高了,就调整权重向量让它预测低一些,不断的调整,直到深度神经网络能够预测出真正想要的目标值或与真正想要的目标值非常接近的值。因此,就需要预先定义“如何比较预测值和目标值之间的差异”,这便是损失函数(loss function)或目标函数(objective function),它们是用于衡量预测值和目标值的差异的重要方程。其中,以损失函数举例,损失函数的输出值(loss)越高表示差异越大,那么深度神经网络的训练就变成了尽可能缩小这个loss的过程。
(4)反向传播算法
可以采用误差反向传播(back propagation,BP)算法在训练过程中修正初始模型中参数的大小,使得模型的误差损失越来越小。具体地,前向传递输入信号直至输出会产生误差损失,通过反向传播误差损失信息来更新初始模型中的参数,从而使误差损失收敛。反向传播算法是以误差损失为主导的反向传播运动,旨在得到最优的模型参数,例如权重矩阵。
在开放领域,集成学习通常是一种集成多种功能模块或任务类型的学习方式。在对话系统的研究中,不同的对话类型、不同的对话领域也可以通过集成学习的方式来进行整合。Transformer模型是一种常用的建模对话的模型架构,该模型由transformer编码器和解码器组成,编码器这一模块用于对话上下文信息的编码,解码器这一模块基于对话上下文来进行生成。现有技术一提出可以利用多个解码器模块来建模对个对话领域,每个解码器模块对应一种对话领域。在模型训练过程中,各个对话领域的编码器模块通过参数共享的方式进行学习,且每个对话领域的数据集会用于学习该领域对应的解码器模块。此外,该系统在训练过程中学习一个基于循环神经网络的模块来对当前对话上下文的领域所属进行判别,进而使用判别的概率分布来进行多个解码器参数的加权整合,得到多领域对话系统。
集成学习的方法普遍存在模型相对复杂、部署开销大、更新代价大等不足之处。现有技术一每个领域对应一个系统子模块,导致了模型复杂度及训练开销大幅增加。在对话领域数量增加时需要更大的子模块来进行功能承载,多个领域间并没有达到真正的统一。
随着技术的演进,用户的需求总是会朝着一个系统解决所有问题发展。利用对话预训练技术使单模型能够支持各种对话类型、同时可以实现任务间切换是一种发展趋势。因此,本发明方案希望提出一种统一的端到端对话系统框架,将不同类型的对话系统统一到相同的对话模式,实现不同对话类型的统一训练,从而使模型同时具备完成不同类型的对话的能力。
首先以模型推理阶段为例对本申请实施例提供的一种确定回复语句的方法进行说明。
参照图3,图3为本申请实施例提供的一种确定回复语句的方法的实施例示意,如图3示出的那样,本申请实施例提供的一种确定回复语句的方法包括:
301、获取待回复的第一用户语句。
在一种可能的实现中,该第一用户语句可以是用户向问答设备输入的问题、请求等文本。示例性的,用户可以采用文本形式向问答设备输入目标问题,该情况下,问答设备可以直接获取到文本形式的第一用户语句。用户还可以采用语音形式向问答设备输入目标问 题,该情况下,问答设备可以将接收到的语音信息转换为文本信息,从而得到文本形式的第一用户语句。用户还可以采用肢体语言向问答设备输入目标问题,该情况下,问答设备通过对用户的肢体动作进行采集和分析,识别得到文本形式的第一用户语句。
302、根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第一对话类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话。
在一种可能的实现中,在获取到第一用户语句之后,需要确定出第一用户语句的第一状态信息,其中,第一状态信息可以包括第一对话类别。
在一种可能的实现中,可以通过状态确定网络确定所述第一用户语句的第一状态信息。
在一种可能的实现中,所述状态确定网络可以为生成式预训练变换(generative pre-trained transformer,GPT)模型、对话生成式预训练变换(dialogue generative pre-trained transformer,DialoGPT)模型、BART模型(bidirectional and auto-regressive transformers)或T5模型(transfer text-to-text transformer)。
在一种可能的实现中,可以通过状态确定网络,从多个对话类型中确定所述第一用户语句的第一对话类别,所述多个对话类型包括所述聊天型对话、任务型对话、问答型对话以及检索型对话中的至少两个。
例如,多个对话类型包括聊天型对话以及任务型对话。
例如,多个对话类型包括聊天型对话以及问答型对话。
例如,多个对话类型包括聊天型对话以及检索型对话。
例如,多个对话类型包括任务型对话以及问答型对话。
例如,多个对话类型包括任务型对话以及检索型对话。
例如,多个对话类型包括问答型对话以及检索型对话。
例如,多个对话类型包括聊天型对话、任务型对话以及问答型对话。
例如,多个对话类型包括聊天型对话、任务型对话以及检索型对话。
例如,多个对话类型包括任务型对话、问答型对话以及检索型对话。
例如,多个对话类型包括聊天型对话、任务型对话、问答型对话以及检索型对话。
其中,状态确定网络可以为训练好的,具备基于用户语句确定对应的对话类型的能力。
应理解,在确定对话类型时,状态确定网络的输入可以为第一用户语句(可选的,还可以包括用户的其他历史语句),这里并不限定。
应理解,上述对话类型也可以称之为对话置信状态(belief state)。
接下里介绍任务型对话:
参阅图4,图4是本申请涉及的一种任务型对话的应用场景的示意图。在该应用场景下,用户与对话系统进行任务型对话。如图4所示,图4的右边是用户输入的用户语句,图4的左边是对话系统根据用户语句输出的回复语句。首先,用户输入用户语句“订去北京的飞机票。”,然后,对话系统输出回复语句“好的,为您找到明天八点从深圳到北京的机票,您需要预订吗?”。然后,用户再输入用户语句“帮我预订吧。”,对话系统输出回复语句“好的,已为您预订机票。请问还有什么需要帮忙吗?”。用户再输入用户语句“没有了,谢谢”。 应理解,图4所示的应用场景仅仅是其中一种任务型对话的示例,在实际应用中,任务型对话还可以是关于打电话的对话、关于查询地理位置的对话、关于订外卖的对话、关于询问天气的对话以及关于订酒店的对话等等,此处不作具体限定。
由于任务型对话的复杂性,用户需要将需求分多轮进行描述。对话系统则需要给出每一轮的限制条件下的最佳决策,并且对当前状态(context)进行记录。
在状态确定网络识别出第一用户语句的对话类型为任务型对话时,可以通过用户的意图行为(或者简称为用户行为)来表示任务型对话。具体的,在任务型对话中,用户输入的用户语句中通常包含用户行为。其中,所述用户行为是用户向对话系统提出要求的行为。以用户语句“订去北京的机票”为例,该用户语句向对话系统提出了订机票的要求,因此,该用户语句中包含了用户行为“订机票”。应理解,上述用户行为的举例仅仅是作为一种示例,其他的实施例中,用户行为还可以是“打电话”、“查询地理位置”、“订外卖”、“询问天气”以及“订酒店”等等,此处不作具体限定。
例如,第一用户语句为“i am looking for a cheap hotel”,则第一对话类型可以为hotel。
用户行为可以是对话系统将用户语句输入状态确定网络识别得到的。在一具体的实施例中,状态确定网络可以根据对话系统支持的用户行为对分类进行划分。例如,对话系统支持的用户行为包括“订机票”、“打电话”、“订外卖”、“询问天气”以及“订酒店”,那么,该状态确定网络的分类就包括“订机票”、“打电话”、“订外卖”、“询问天气”以及“订酒店”。状态确定网络根据用户输入的用户语句“订去北京的机票”,可以确定应该归入“订机票”分类,从而识别该用户语句包含的用户行为是“订机票”。
问答系统(question answering system,QA)已经得到的广泛的应用。问答系统是信息检索系统的一种高级形式,它能用准确、简洁的自然语言回答用户用自然语言提出的问题。问答系统也可以称为人机对话系统等。目前,很多领域的智能客服系统都采用了问答系统。图1为本申请实施例的一种可能的应用场景示意图。如图1所示,该应用场景包括:问答设备和用户。示例性的,用户可以向问答设备提出问题,问答设备根据用户的问题,向用户返回合适的答案。例如:用户向问答设备提出问题“中国的首都是哪里?”,问答设备向用户返回答案“北京”。
接下里介绍问答型对话:
此处的问答,指的是一问一答,即直接根据用户问题给出精准答案,如”北京今天多少度“。问答更类似信息检索,虽然可能也涉及上下文处理,如”那么明天多少度“。
第一用户语句可以为用户输入的问题,对话系统需要从知识库(或者称之为数据库)中确定该第一用户语句对应的回答。其中,知识库用于提供回答用户问题所需的知识。处理单元中可以设置有语义匹配模型,用于根据用户的问题去知识库中检索最合适的答案。能够理解,知识库中的知识越丰富,问答设备能够回答的问题越多。一种可能的实施方式中,知识库中的知识以“问题-答案对”的形式存储。“问题-答案对”也可以简称为“问答(question and answering、QA)对”。其中,Q表示已知问题(或者称为标准问题),A表示Q对应的答案。问答设备接收到用户问题后,去知识库中寻找答案,本质上就是将用户问题与知识库中的已知问题进行匹配,返回最匹配的已知问题对应的答案。
接下里介绍聊天型对话:
在一种可能的实现中,聊天型对话可以包括问候和寒暄,其特点是没有明确目的,而且不一定回答用户的问题。聊天在人机对话系统中主要是起到情感陪伴的作用。
在一种可能的实现中,第一状态信息还可以包括槽位信息,其中,槽位信息可以为第一用户语句中的关键词。以任务型对话为例,对话系统可以将用户语句输入状态确定网络从而识别槽位信息。状态确定网络可以提取用户对话中提供的关键信息。例如,订机票槽位类型包括“出发地”和“目的地”,槽位识别模型需要提取“出发地”和“目的地”的信息。状态确定网络根据用户输入的用户语句“我要订北京到上海的机票”,识别得到结果是“出发地:北京”和“目的地:上海”,从而给对话系统提供槽位信息。
示例性的,第一用户语句为“does money buy happiness?”,状态确定网络可以识别出对应的第一对话类别为“chit”,该“chit”可以指示第一用户语句为聊天型对话,第一状态信息还可以包括槽位信息“money happiness”。
示例性的,第一用户语句为“i am looking for a cheap hotel”,状态确定网络可以识别出对应的第一对话类别为“hotel”,该“hotel”可以指示第一用户语句为任务型对话,第一状态信息还可以包括槽位信息“price cheap”。
示例性的,第一用户语句为“how high is Mt.Everest?”第一用户语句为聊天型对话,状态确定网络可以识别出对应的第一对话类别为“qa”,该“qa”可以指示第一用户语句为问答型对话,第一状态信息还可以包括槽位信息“Mt.Everest high”。
示例性的,第一用户语句为“which is the best brand for basketball?”,状态确定网络可以识别出对应的第一对话类别为“faq”,该“faq”可以指示第一用户语句为检索型对话,第一状态信息还可以包括槽位信息“brand basketball”。
303、将所述第一用户语句以及所述第一对话类别,输入至语句生成网络,得到所述第一用户语句对应的回复语句。
在一种可能的实现中,所述语句生成网络可以为GPT模型、DialoGPT模型、BART模型或T5模型。
可选的,本申请实施例中的状态确定网络和语句生成网络可以为同一个网络的两部分,也可以为不同的网络。参照图6,图6为状态确定网络和语句生成网络可以为同一个网络的两部分时的示意。
在一种可能的实现中,对话系统可以根据所述第一用户语句,从所述第一用户语句或者数据库中得到用于构建所述回复语句所需的关键词或关键句,将所述第一用户语句、所述第一对话类别、所述关键词或关键句输入至所述语句生成网络,得到所述第一用户语句对应的回复语句。
本申请实施例中,可以根据第一用户语句,以及第一对话类别,在外部数据库/知识库/语料库等外部资源中获取对话相关的数据或文本内容作为对话信息(即上述关键词或关键句)加入对话过程。
在一种可能的实现中,可以将所述第一用户语句以及所述第一对话类别,输入至语句生成网络,得到所述第一用户语句对应的回复语句,或者,将所述第一用户语句、所述第 一对话类别、所述关键词或关键句输入至所述语句生成网络,得到所述第一用户语句对应的回复语句。
应理解,语句生成网络还可以基于除了第一用户语句之外的其他用户历史语句来生成第一用户语句的回复语句,这里并不限定。
本申请实施例中,针对于不同的对话类型,可以复用本申请实施例中的对话生成网络来生成对应的回复语句,在一种可能的实现中,还可以获取到待回复的第二用户语句,根据所述第二用户语句,通过所述状态确定网络确定所述第二用户语句的第二状态信息,所述第二状态信息包括所述第二用户语句的第二对话类别,所述第二对话类别为聊天型对话、任务型对话、问答型对话或检索型对话,且所述第二对话类别和所述第一对话类别不同,进而将所述第二用户语句以及所述第二对话类别输入至所述语句生成网络,得到所述第二用户语句对应的回复语句。
本申请实施例中,通过状态确定网络识别出用户对话的对话类别,并针对于不同的对话类型,复用对话生成网络来生成对应的回复语句,相当于可以采用同一个模型来处理不同对话类型的用户语句,在模型训练时,可以通过统一多种对话类型的模式,使得多种对话类型可以同时进行训练,训练出的对话系统同时具备多种对话类型的能力,降低了对话系统的模型复杂度以及模型大小。
参照图5,在现有的实现中,针对于闲聊型对话的模型,仅包括上述对话生成网络,本申请提供的网络相当于在闲聊型对话的模型中增加“状态确定网络”和“状态检索模块”,“对话状态生成”可以捕捉当前轮的闲聊主题或关键内容,“状态检索”可作为相关话题或对话信息的输入接口,针对于问答型对话的模型,增加了“状态确定网络”,可以引导模型的回复内容,增加检索结果与回复内容的相关度,针对于检索型对话的模型,增加了“状态确定网络”,可以增强当前轮的对话上下文信息。
参照表1,表1为本申请提供的对话系统针对于不同对话类型的对话的对比示意:
表1
参照表2,表1为一种实际应用时,对话系统针对于不同对话类型的对话的处理示意:
表2
接下里结合具体的数据集对本申请进行有益效果验证,利用MultiWOZ 2.0作为任务型对话数据集,Reddit作为闲聊型对话数据集,并在两类对话数据上进行了一体化对话模型与已有模型的对话性能进行了对比,如表3所示。
表3
实验结果表明,在使用同等参数规模的前提下,一体化对话系统(UniDS)在任务型对话上的表现显著超过基线,在闲聊型对话上的表现与基线模型近似。这表明了一体化对话系统同时具备任务型和闲聊型的对话能力。
本实施例进行了任务类型切换测试,设计了两种数据类型来进行测试:
(a)任务型数据前随机插入两轮闲聊型数据:
Turn1-闲聊Turn2-闲聊Turn3-任务Turn4-任务…TurnN-任务;
(b):闲聊型数据前随机插入两轮任务型数据:
Turn1-任务Turn2-任务Turn3-闲聊Turn4-闲聊…TurnN-闲聊;
此外,设计了两个模型切换能力评价指标:
Switch-1:模型在对话类型切换后的第一轮完成切换的比例;
Switch-2:模型在对话类型切换后的第二轮就完成切换的比例;
测试结果如表4和表5所示。两种设置下一体化对话系统都可以在数据类型切换后的前两轮基本完成对话类型的切换,这表明了一体化对话系统具备任务型和闲聊型对话之间的切换能力。
表4一体化对话系统的任务类型(闲聊到任务)切换测试结果
表5一体化对话系统的任务类型(任务到闲聊)切换测试结果
本实施例进行了任务型对话鲁棒性测试,模拟了真实对话场景(例如看电视时与手机 助手对话、驾驶员进行语音交互时乘客聊天)的噪声环境,即在任务型多轮对话中随机插入一轮或两轮闲聊对话。表6中的实验结果表明,一体化对话系统的鲁棒性明显优于任务型单独训练的对话模型。
表6噪声环境下任务型对话鲁棒性测试
由上可知,本申请实施例提供的一体化对话系统较多个单类型系统可以保持性能不降或有所提升的前提下,明显减少整体参数量,且具备不同对话类型间的切换能力,对任务型对话的鲁棒性有较大的提升。
本申请实施例提供了一种确定回复语句的方法,所述方法包括:获取待回复的第一用户语句;根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第一对话类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;将所述第一用户语句以及所述第一对话类别,输入至语句生成网络,得到所述第一用户语句对应的回复语句。通过状态确定网络识别出用户对话的对话类别,并针对于不同的对话类型,复用对话生成网络来生成对应的回复语句,相当于可以采用同一个模型来处理不同对话类型的用户语句,在模型训练时,可以通过统一多种对话类型的模式,使得多种对话类型可以同时进行训练,训练出的对话系统同时具备多种对话类型的能力,降低了对话系统的模型复杂度以及模型大小。
首先以模型训练阶段为例对本申请实施例提供的一种确定回复语句的方法进行说明。
参照图7,图7为本申请实施例提供的一种确定回复语句的方法的实施例示意,如图7示出的那样,本申请实施例提供的一种确定回复语句的方法包括:
701、获取第一用户语句、所述第一用户语句的第一对话类别以及所述第一用户语句对应的第一回复语句,所述第一对话类别为所述第一用户语句的真实类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话。
其中,训练设备在对状态确定网络以及语句生成网络进行训练时,可以获取到训练样本,以一次迭代过程为例,训练样本可以包括第一用户语句、所述第一用户语句的第一对话类别以及所述第一用户语句对应的第一回复语句,所述第一对话类别为所述第一用户语句的真实类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话。
其中,状态确定网络以及语句生成网络为待更新的模型,其中,状态确定网络以及语句生成网络可以是模型训练开始阶段的初始化模型,或者是预训练模型,该模型具备所属领域的一些基础功能,或者是对预训练模型进行微调后得到的具有除上述基础功能之外的其他功能的模型。
在一种可能的实现中,所述状态确定网络和所述语句生成网络为GPT模型、DialoGPT模型、BART模型或T5模型。
702、根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第二对话类别;
其中,第二对话类别可以为状态确定网络在一次前馈时得到的结果。
在一种可能的实现中,可以通过状态确定网络,从多个对话类型中确定所述第一用户语句的第二对话类别,所述多个对话类型包括所述聊天型对话、任务型对话、问答型对话以及检索型对话中的至少两个。
703、将所述第一用户语句以及所述第一对话类别输入至语句生成网络,得到所述第一用户语句对应的第二回复语句;
其中,第二回复语句可以为语句生成网络在一次前馈时得到的结果。
704、根据所述第一对话类别和所述第二对话类别之间的差异,更新所述状态确定网络;
705、根据所述第一回复语句和所述第二回复语句之间的差异,更新所述语句生成网络。
在一种可能的实现中,可以获取第二用户语句、所述第二用户语句的第三对话类别以及所述第二用户语句对应的第三回复语句,所述第三对话类别为所述第二用户语句的真实类别;根据所述第二用户语句,通过所述状态确定网络确定所述第二用户语句的第二状态信息,所述第二状态信息包括所述第二用户语句的第四对话类别,所述第四对话类别和所述第三对话类别不同;将所述第二用户语句以及所述第三对话类别输入至所述语句生成网络,得到所述第二用户语句对应的第四回复语句;根据所述第四对话类别和所述第三对话类别之间的差异,更新所述状态确定网络;根据所述第四回复语句和所述第三回复语句之间的差异,更新所述语句生成网络。
在一种可能的实现中,可以根据所述第一用户语句,从所述第一用户语句或者数据库中得到用于构建所述回复语句所需的关键词或关键句;将所述第一用户语句、所述第一对话类别、所述关键词或关键句输入至所述语句生成网络,得到所述第一用户语句对应的第二回复语句。
本申请实施例提供了一种确定回复语句的方法,所述方法包括:获取第一用户语句、所述第一用户语句的第一对话类别以及所述第一用户语句对应的第一回复语句,所述第一对话类别为所述第一用户语句的真实类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第二对话类别;将所述第一用户语句以及所述第一对话类别输入至语句生成网络,得到所述第一用户语句对应的第二回复语句;根据所述第一对话类别和所述第二对话类别之间的差异,更新所述状态确定网络;根据所述第一回复语句和所述第二回复语句之间的差异,更新所述语句生成网络。通过状态确定网络识别出用户对话的对话类别,并针对于不同的对话类型,复用对话生成网络来生成对应的回复语句,相当于可以采用同一个模型来处理不同对话类型的用户语句,在模型训练时,可以通过统一多种对话类型的模式,使得多种对话类型可以同时进行训练,训练出的对话系统同时具备多种对话类型的能力,降低了对话系统的模型复杂度以及模型大小。
参照图8,图8为本申请实施例提供的一种确定回复语句的装置的结构示意,如图8所示,所述装置800包括:
获取模块801,用于获取待回复的第一用户语句;
其中,关于获取模块801的具体描述可以参照上述实施例中步骤301的描述,这里不再赘述。
状态生成模块802,用于根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第一对话类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;
其中,关于状态生成模块802的具体描述可以参照上述实施例中步骤302的描述,这里不再赘述。
回复语句生成模块803,用于将所述第一用户语句以及所述第一对话类别,输入至语句生成网络,得到所述第一用户语句对应的回复语句。
其中,关于回复语句生成模块803的具体描述可以参照上述实施例中步骤303的描述,这里不再赘述。
在一种可能的实现中,所述状态生成模块,具体用于:
通过状态确定网络,从多个对话类型中确定所述第一用户语句的第一对话类别,所述多个对话类型包括所述聊天型对话、任务型对话、问答型对话以及检索型对话中的至少两个。
在一种可能的实现中,所述获取模块,还用于:
获取待回复的第二用户语句;
所述状态生成模块,还用于根据所述第二用户语句,通过所述状态确定网络确定所述第二用户语句的第二状态信息,所述第二状态信息包括所述第二用户语句的第二对话类别,所述第二对话类别为聊天型对话、任务型对话、问答型对话或检索型对话,且所述第二对话类别和所述第一对话类别不同;
所述回复语句生成模块,还用于将所述第二用户语句以及所述第二对话类别输入至所述语句生成网络,得到所述第二用户语句对应的回复语句。
在一种可能的实现中,所述状态确定网络和所述语句生成网络为GPT模型、DialoGPT模型、BART模型或T5模型。
在一种可能的实现中,所述回复语句生成模块,具体用于:
根据所述第一用户语句,从所述第一用户语句或者数据库中得到用于构建所述回复语句所需的关键词或关键句;
将所述第一用户语句、所述第一对话类别、所述关键词或关键句输入至所述语句生成网络,得到所述第一用户语句对应的回复语句。
本申请提供了一种确定回复语句的装置,所述装置包括:获取模块,用于获取待回复的第一用户语句;状态生成模块,用于根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第一对话类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;回复语句 生成模块,用于将所述第一用户语句以及所述第一对话类别,输入至语句生成网络,得到所述第一用户语句对应的回复语句。通过状态确定网络识别出用户对话的对话类别,并针对于不同的对话类型,复用对话生成网络来生成对应的回复语句,相当于可以采用同一个模型来处理不同对话类型的用户语句,在模型训练时,可以通过统一多种对话类型的模式,使得多种对话类型可以同时进行训练,训练出的对话系统同时具备多种对话类型的能力,降低了对话系统的模型复杂度以及模型大小。
参照图9,图9为本申请实施例提供的一种确定回复语句的装置的结构示意,如图9所示,所述装置900包括:
获取模块902,用于获取第一用户语句、所述第一用户语句的第一对话类别以及所述第一用户语句对应的第一回复语句,所述第一对话类别为所述第一用户语句的真实类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;
其中,关于获取模块902的具体描述可以参照上述实施例中步骤701的描述,这里不再赘述。
状态生成模块904,用于根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第二对话类别;
其中,关于状态生成模块904的具体描述可以参照上述实施例中步骤702的描述,这里不再赘述。
回复语句生成模块901,用于将所述第一用户语句以及所述第一对话类别输入至语句生成网络,得到所述第一用户语句对应的第二回复语句;
其中,关于回复语句生成模块901的具体描述可以参照上述实施例中步骤703的描述,这里不再赘述。
模型更新模块903,用于根据所述第一对话类别和所述第二对话类别之间的差异,更新所述状态确定网络;
根据所述第一回复语句和所述第二回复语句之间的差异,更新所述语句生成网络。
其中,关于模型更新模块903的具体描述可以参照上述实施例中步骤704和705的描述,这里不再赘述。
在一种可能的实现中,所述状态生成模块,具体用于:
通过状态确定网络,从多个对话类型中确定所述第一用户语句的第二对话类别,所述多个对话类型包括所述聊天型对话、任务型对话、问答型对话以及检索型对话中的至少两个。
在一种可能的实现中,所述获取模块,还用于:
获取第二用户语句、所述第二用户语句的第三对话类别以及所述第二用户语句对应的第三回复语句,所述第三对话类别为所述第二用户语句的真实类别;
所述状态生成模块,还用于根据所述第二用户语句,通过所述状态确定网络确定所述第二用户语句的第二状态信息,所述第二状态信息包括所述第二用户语句的第四对话类别,所述第四对话类别和所述第三对话类别不同;
所述回复语句生成模块,还用于将所述第二用户语句以及所述第三对话类别输入至所述语句生成网络,得到所述第二用户语句对应的第四回复语句;
所述模型更新模块,还用于根据所述第四对话类别和所述第三对话类别之间的差异,更新所述状态确定网络;
根据所述第四回复语句和所述第三回复语句之间的差异,更新所述语句生成网络。
在一种可能的实现中,所述状态确定网络和所述语句生成网络为GPT模型、DialoGPT模型、BART模型或T5模型。
在一种可能的实现中,所述回复语句生成模块,具体用于:
根据所述第一用户语句,从所述第一用户语句或者数据库中得到用于构建所述回复语句所需的关键词或关键句;
将所述第一用户语句、所述第一对话类别、所述关键词或关键句输入至所述语句生成网络,得到所述第一用户语句对应的第二回复语句。
本申请提供了一种确定回复语句的装置,所述装置包括:获取模块,用于获取第一用户语句、所述第一用户语句的第一对话类别以及所述第一用户语句对应的第一回复语句,所述第一对话类别为所述第一用户语句的真实类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;状态生成模块,用于根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第二对话类别;回复语句生成模块,用于将所述第一用户语句以及所述第一对话类别输入至语句生成网络,得到所述第一用户语句对应的第二回复语句;模型更新模块,用于根据所述第一对话类别和所述第二对话类别之间的差异,更新所述状态确定网络;根据所述第一回复语句和所述第二回复语句之间的差异,更新所述语句生成网络。本申请通过状态确定网络识别出用户对话的对话类别,并针对于不同的对话类型,复用对话生成网络来生成对应的回复语句,相当于可以采用同一个模型来处理不同对话类型的用户语句,在模型训练时,可以通过统一多种对话类型的模式,使得多种对话类型可以同时进行训练,训练出的对话系统同时具备多种对话类型的能力,降低了对话系统的模型复杂度以及模型大小。
接下来介绍本申请实施例提供的一种执行设备,请参阅图10,图10为本申请实施例提供的执行设备的一种结构示意图,执行设备1000具体可以表现为手机、平板、笔记本电脑、智能穿戴设备、服务器等,此处不做限定。其中,执行设备1000上可以部署有图8对应实施例中所描述的确定回复语句的装置,用于实现图8对应实施例中确定回复语句的功能。具体的,执行设备1000包括:接收器1001、发射器1002、处理器1003和存储器1004(其中执行设备1000中的处理器1003的数量可以一个或多个),其中,处理器1003可以包括应用处理器10031和通信处理器10032。在本申请的一些实施例中,接收器1001、发射器1002、处理器1003和存储器1004可通过总线或其它方式连接。
存储器1004可以包括只读存储器和随机存取存储器,并向处理器1003提供指令和数据。存储器1004的一部分还可以包括非易失性随机存取存储器(non-volatile random access  memory,NVRAM)。存储器1004存储有处理器和操作指令、可执行模块或者数据结构,或者它们的子集,或者它们的扩展集,其中,操作指令可包括各种操作指令,用于实现各种操作。
处理器1003控制执行设备的操作。具体的应用中,执行设备的各个组件通过总线系统耦合在一起,其中总线系统除包括数据总线之外,还可以包括电源总线、控制总线和状态信号总线等。但是为了清楚说明起见,在图中将各种总线都称为总线系统。
上述本申请实施例揭示的方法可以应用于处理器1003中,或者由处理器1003实现。处理器1003可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器1003中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器1003可以是通用处理器、数字信号处理器(digital signal processing,DSP)、微处理器或微控制器、以及视觉处理器(vision processing unit,VPU)、张量处理器(tensor processing unit,TPU)等适用于AI运算的处理器,还可进一步包括专用集成电路(application specific integrated circuit,ASIC)、现场可编程门阵列(field-programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。该处理器1003可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器1004,处理器1003读取存储器1004中的信息,结合其硬件完成上述方法的步骤。
接收器1001可用于接收输入的数字或字符信息,以及产生与执行设备的相关设置以及功能控制有关的信号输入。发射器1002可用于通过第一接口输出数字或字符信息;发射器1002还可用于通过第一接口向磁盘组发送指令,以修改磁盘组中的数据;发射器1002还可以包括显示屏等显示设备。
本申请实施例还提供了一种训练设备,请参阅图11,图11是本申请实施例提供的训练设备一种结构示意图,具体的,训练设备1100由一个或多个服务器实现,训练设备1100可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上中央处理器(central processing units,CPU)1111(例如,一个或一个以上处理器)和存储器1132,一个或一个以上存储应用程序1142或数据1144的存储介质1130(例如一个或一个以上海量存储设备)。其中,存储器1132和存储介质1130可以是短暂存储或持久存储。存储在存储介质1130的程序可以包括一个或一个以上模块(图示没标出),每个模块可以包括对训练设备中的一系列指令操作。更进一步地,中央处理器1111可以设置为与存储介质1130通信,在训练设备1100上执行存储介质1130中的一系列指令操作。
训练设备1100还可以包括一个或一个以上电源1126,一个或一个以上有线或无线网络接口1150,一个或一个以上输入输出接口1158;或,一个或一个以上操作系统1141,例如Windows ServerTM,Mac OS XTM,UnixTM,LinuxTM,FreeBSDTM等等。
具体的,训练设备可以执行图7对应实施例中的确定回复语句的方法。
本申请实施例中还提供一种包括计算机程序产品,当其在计算机上运行时,使得计算机执行如前述执行设备所执行的步骤,或者,使得计算机执行如前述训练设备所执行的步骤。
本申请实施例中还提供一种计算机可读存储介质,该计算机可读存储介质中存储有用于进行信号处理的程序,当其在计算机上运行时,使得计算机执行如前述执行设备所执行的步骤,或者,使得计算机执行如前述训练设备所执行的步骤。
本申请实施例提供的执行设备、训练设备或终端设备具体可以为芯片,芯片包括:处理单元和通信单元,所述处理单元例如可以是处理器,所述通信单元例如可以是输入/输出接口、管脚或电路等。该处理单元可执行存储单元存储的计算机执行指令,以使执行设备内的芯片执行上述实施例描述的数据处理方法,或者,以使训练设备内的芯片执行上述实施例描述的数据处理方法。可选地,所述存储单元为所述芯片内的存储单元,如寄存器、缓存等,所述存储单元还可以是所述无线接入设备端内的位于所述芯片外部的存储单元,如只读存储器(read-only memory,ROM)或可存储静态信息和指令的其他类型的静态存储设备,随机存取存储器(random access memory,RAM)等。
具体的,请参阅图12,图12为本申请实施例提供的芯片的一种结构示意图,所述芯片可以表现为神经网络处理器NPU1200,NPU 1200作为协处理器挂载到主CPU(Host CPU)上,由Host CPU分配任务。NPU的核心部分为运算电路1203,通过控制器1204控制运算电路1203提取存储器中的矩阵数据并进行乘法运算。
NPU 1200可以通过内部的各个器件之间的相互配合,来实现图3所描述的实施例中提供的确定回复语句的方法,或者对训练得到的模型进行推理。
其中,NPU 1200中的运算电路1203可以执行获取模型以及对模型进行模型训练的步骤。
更具体的,在一些实现中,NPU 1200中的运算电路1203内部包括多个处理单元(Process Engine,PE)。在一些实现中,运算电路1203是二维脉动阵列。运算电路1203还可以是一维脉动阵列或者能够执行例如乘法和加法这样的数学运算的其它电子线路。在一些实现中,运算电路1203是通用的矩阵处理器。
举例来说,假设有输入矩阵A,权重矩阵B,输出矩阵C。运算电路从权重存储器1202中取矩阵B相应的数据,并缓存在运算电路中每一个PE上。运算电路从输入存储器1201中取矩阵A数据与矩阵B进行矩阵运算,得到的矩阵的部分结果或最终结果,保存在累加器(accumulator)1208中。
统一存储器1206用于存放输入数据以及输出数据。权重数据直接通过存储单元访问控制器(Direct Memory Access Controller,DMAC)1205,DMAC被搬运到权重存储器1202中。输入数据也通过DMAC被搬运到统一存储器1206中。
BIU为Bus Interface Unit即,总线接口单元1210,用于AXI总线与DMAC和取指存储器(Instruction Fetch Buffer,IFB)1209的交互。
总线接口单元1210(Bus Interface Unit,简称BIU),用于取指存储器1209从外部存储 器获取指令,还用于存储单元访问控制器1205从外部存储器获取输入矩阵A或者权重矩阵B的原数据。
DMAC主要用于将外部存储器DDR中的输入数据搬运到统一存储器1206或将权重数据搬运到权重存储器1202中或将输入数据数据搬运到输入存储器1201中。
向量计算单元1207包括多个运算处理单元,在需要的情况下,对运算电路1203的输出做进一步处理,如向量乘,向量加,指数运算,对数运算,大小比较等等。主要用于神经网络中非卷积/全连接层网络计算,如Batch Normalization(批归一化),像素级求和,对特征平面进行上采样等。
在一些实现中,向量计算单元1207能将经处理的输出的向量存储到统一存储器1206。例如,向量计算单元1207可以将线性函数;或,非线性函数应用到运算电路1203的输出,例如对卷积层提取的特征平面进行线性插值,再例如累加值的向量,用以生成激活值。在一些实现中,向量计算单元1207生成归一化的值、像素级求和的值,或二者均有。在一些实现中,处理过的输出的向量能够用作到运算电路1203的激活输入,例如用于在神经网络中的后续层中的使用。
控制器1204连接的取指存储器(instruction fetch buffer)1209,用于存储控制器1204使用的指令;
统一存储器1206,输入存储器1201,权重存储器1202以及取指存储器1209均为On-Chip存储器。外部存储器私有于该NPU硬件架构。
其中,上述任一处提到的处理器,可以是一个通用中央处理器,微处理器,ASIC,或一个或多个用于控制上述程序执行的集成电路。
另外需说明的是,以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。另外,本申请提供的装置实施例附图中,模块之间的连接关系表示它们之间具有通信连接,具体可以实现为一条或多条通信总线或信号线。
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到本申请可借助软件加必需的通用硬件的方式来实现,当然也可以通过专用硬件包括专用集成电路、专用CPU、专用存储器、专用元器件等来实现。一般情况下,凡由计算机程序完成的功能都可以很容易地用相应的硬件来实现,而且,用来实现同一功能的具体硬件结构也可以是多种多样的,例如模拟电路、数字电路或专用电路等。但是,对本申请而言更多情况下软件程序实现是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在可读取的存储介质中,如计算机的软盘、U盘、移动硬盘、ROM、RAM、磁碟或者光盘等,包括若干指令用以使得一台计算机设备(可以是个人计算机,训练设备,或者网络设备等)执行本申请各个实施例所述的方法。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。 当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。
所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、训练设备或数据中心通过有线(例如同轴电缆、光纤、数字用户线(DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、训练设备或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存储的任何可用介质或者是包含一个或多个可用介质集成的训练设备、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如固态硬盘(Solid State Disk,SSD))等。

Claims (23)

  1. 一种确定回复语句的方法,其特征在于,所述方法包括:
    获取待回复的第一用户语句;
    根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第一对话类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;
    将所述第一用户语句以及所述第一对话类别,输入至语句生成网络,得到所述第一用户语句对应的回复语句。
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,包括:
    通过状态确定网络,从多个对话类型中确定所述第一用户语句的第一对话类别,所述多个对话类型包括所述聊天型对话、任务型对话、问答型对话以及检索型对话中的至少两个。
  3. 根据权利要求1或2所述的方法,其特征在于,所述方法还包括:
    获取待回复的第二用户语句;
    根据所述第二用户语句,通过所述状态确定网络确定所述第二用户语句的第二状态信息,所述第二状态信息包括所述第二用户语句的第二对话类别,所述第二对话类别为聊天型对话、任务型对话、问答型对话或检索型对话,且所述第二对话类别和所述第一对话类别不同;
    将所述第二用户语句以及所述第二对话类别输入至所述语句生成网络,得到所述第二用户语句对应的回复语句。
  4. 根据权利要求1至3任一所述的方法,其特征在于,所述状态确定网络和所述语句生成网络为GPT模型、DialoGPT模型、BART模型或T5模型。
  5. 根据权利要求1至4任一所述的方法,其特征在于,所述将所述第一用户语句以及所述第一对话类别输入至语句生成网络,得到所述第一用户语句对应的回复语句,包括:
    根据所述第一用户语句,从所述第一用户语句和/或数据库中得到用于构建所述回复语句所需的关键词或关键句;
    将所述第一用户语句、所述第一对话类别、所述关键词或关键句输入至所述语句生成网络,得到所述第一用户语句对应的回复语句。
  6. 一种确定回复语句的方法,其特征在于,所述方法包括:
    获取第一用户语句、所述第一用户语句的第一对话类别以及所述第一用户语句对应的第一回复语句,所述第一对话类别为所述第一用户语句的真实类别,所述第一对话类别为 聊天型对话、任务型对话、问答型对话或检索型对话;
    根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第二对话类别;
    将所述第一用户语句以及所述第一对话类别输入至语句生成网络,得到所述第一用户语句对应的第二回复语句;
    根据所述第一对话类别和所述第二对话类别之间的差异,更新所述状态确定网络;
    根据所述第一回复语句和所述第二回复语句之间的差异,更新所述语句生成网络。
  7. 根据权利要求6所述的方法,其特征在于,所述根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,包括:
    通过状态确定网络,从多个对话类型中确定所述第一用户语句的第二对话类别,所述多个对话类型包括所述聊天型对话、任务型对话、问答型对话以及检索型对话中的至少两个。
  8. 根据权利要求6或7所述的方法,其特征在于,所述方法还包括:
    获取第二用户语句、所述第二用户语句的第三对话类别以及所述第二用户语句对应的第三回复语句,所述第三对话类别为所述第二用户语句的真实类别;
    根据所述第二用户语句,通过所述状态确定网络确定所述第二用户语句的第二状态信息,所述第二状态信息包括所述第二用户语句的第四对话类别,所述第四对话类别和所述第三对话类别不同;
    将所述第二用户语句以及所述第三对话类别输入至所述语句生成网络,得到所述第二用户语句对应的第四回复语句;
    根据所述第四对话类别和所述第三对话类别之间的差异,更新所述状态确定网络;
    根据所述第四回复语句和所述第三回复语句之间的差异,更新所述语句生成网络。
  9. 根据权利要求6至8任一所述的方法,其特征在于,所述状态确定网络和所述语句生成网络为GPT模型、DialoGPT模型、BART模型或T5模型。
  10. 根据权利要求6至9任一所述的方法,其特征在于,所述将所述第一用户语句以及所述第一对话类别输入至语句生成网络,得到所述第一用户语句对应的第二回复语句,包括:
    根据所述第一用户语句,从所述第一用户语句或者数据库中得到用于构建所述回复语句所需的关键词或关键句;
    将所述第一用户语句、所述第一对话类别、所述关键词或关键句输入至所述语句生成网络,得到所述第一用户语句对应的第二回复语句。
  11. 一种确定回复语句的装置,其特征在于,所述装置包括:
    获取模块,用于获取待回复的第一用户语句;
    状态生成模块,用于根据所述第一用户语句,通过状态确定网络确定所述第一用户语句的第一状态信息,所述第一状态信息包括所述第一用户语句的第一对话类别,所述第一对话类别为聊天型对话、任务型对话、问答型对话或检索型对话;
    回复语句生成模块,用于将所述第一用户语句以及所述第一对话类别,输入至语句生成网络,得到所述第一用户语句对应的回复语句。
  12. 根据权利要求11所述的装置,其特征在于,所述状态生成模块,具体用于:
    通过状态确定网络,从多个对话类型中确定所述第一用户语句的第一对话类别,所述多个对话类型包括所述聊天型对话、任务型对话、问答型对话以及检索型对话中的至少两个。
  13. 根据权利要求11或12所述的装置,其特征在于,所述获取模块,还用于:
    获取待回复的第二用户语句;
    所述状态生成模块,还用于根据所述第二用户语句,通过所述状态确定网络确定所述第二用户语句的第二状态信息,所述第二状态信息包括所述第二用户语句的第二对话类别,所述第二对话类别为聊天型对话、任务型对话、问答型对话或检索型对话,且所述第二对话类别和所述第一对话类别不同;
    所述回复语句生成模块,还用于将所述第二用户语句以及所述第二对话类别输入至所述语句生成网络,得到所述第二用户语句对应的回复语句。
  14. 根据权利要求11至13任一所述的装置,其特征在于,所述状态确定网络和所述语句生成网络为GPT模型、DialoGPT模型、BART模型或T5模型。
  15. The apparatus according to any one of claims 11 to 14, wherein the reply sentence generation module is specifically configured to:
    obtain, based on the first user sentence, a keyword or key sentence required for constructing the reply sentence from the first user sentence or a database;
    input the first user sentence, the first dialogue category, and the keyword or key sentence into the sentence generation network to obtain the reply sentence corresponding to the first user sentence.
  16. An apparatus for determining a reply sentence, wherein the apparatus comprises:
    an obtaining module, configured to obtain a first user sentence, a first dialogue category of the first user sentence, and a first reply sentence corresponding to the first user sentence, wherein the first dialogue category is a ground-truth category of the first user sentence, and the first dialogue category is a chat-type dialogue, a task-type dialogue, a question-and-answer-type dialogue, or a retrieval-type dialogue;
    a state generation module, configured to determine, based on the first user sentence, first state information of the first user sentence through a state determination network, wherein the first state information comprises a second dialogue category of the first user sentence;
    a reply sentence generation module, configured to input the first user sentence and the first dialogue category into a sentence generation network to obtain a second reply sentence corresponding to the first user sentence;
    a model update module, configured to update the state determination network based on a difference between the first dialogue category and the second dialogue category;
    and to update the sentence generation network based on a difference between the first reply sentence and the second reply sentence.
  17. The apparatus according to claim 16, wherein the state generation module is specifically configured to:
    determine, through the state determination network, the second dialogue category of the first user sentence from a plurality of dialogue types, wherein the plurality of dialogue types comprise at least two of the chat-type dialogue, the task-type dialogue, the question-and-answer-type dialogue, and the retrieval-type dialogue.
  18. The apparatus according to claim 16 or 17, wherein the obtaining module is further configured to:
    obtain a second user sentence, a third dialogue category of the second user sentence, and a third reply sentence corresponding to the second user sentence, wherein the third dialogue category is a ground-truth category of the second user sentence;
    the state generation module is further configured to determine, based on the second user sentence, second state information of the second user sentence through the state determination network, wherein the second state information comprises a fourth dialogue category of the second user sentence, and the fourth dialogue category is different from the third dialogue category;
    the reply sentence generation module is further configured to input the second user sentence and the third dialogue category into the sentence generation network to obtain a fourth reply sentence corresponding to the second user sentence;
    the model update module is further configured to update the state determination network based on a difference between the fourth dialogue category and the third dialogue category;
    and to update the sentence generation network based on a difference between the fourth reply sentence and the third reply sentence.
  19. The apparatus according to any one of claims 16 to 18, wherein the state determination network and the sentence generation network are a GPT model, a DialoGPT model, a BART model, or a T5 model.
  20. The apparatus according to any one of claims 16 to 19, wherein the reply sentence generation module is specifically configured to:
    obtain, based on the first user sentence, a keyword or key sentence required for constructing the reply sentence from the first user sentence or a database;
    input the first user sentence, the first dialogue category, and the keyword or key sentence into the sentence generation network to obtain the second reply sentence corresponding to the first user sentence.
  21. An apparatus for determining a reply sentence, wherein the apparatus comprises a memory and a processor; the memory stores code, and the processor is configured to obtain the code and perform the method according to any one of claims 1 to 10.
  22. A computer storage medium, wherein the computer storage medium stores one or more instructions which, when executed by one or more computers, cause the one or more computers to implement the method according to any one of claims 1 to 10.
  23. A computer program product comprising code, wherein, when the code is executed, the code is used to implement the method according to any one of claims 1 to 10.
PCT/CN2022/125088 2021-10-15 2022-10-13 Method and apparatus for determining a reply sentence WO2023061443A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111205658.2 2021-10-15
CN202111205658.2A CN115994201A (zh) 2021-10-15 2021-10-15 Method and apparatus for determining a reply sentence

Publications (1)

Publication Number Publication Date
WO2023061443A1 true WO2023061443A1 (zh) 2023-04-20

Family

ID=85987276

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/125088 WO2023061443A1 (zh) 2021-10-15 2022-10-13 Method and apparatus for determining a reply sentence

Country Status (2)

Country Link
CN (1) CN115994201A (zh)
WO (1) WO2023061443A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110209784A (zh) * 2019-04-26 2019-09-06 Tencent Technology (Shenzhen) Company Limited Message interaction method, computer device, and storage medium
CN110347792A (zh) * 2019-06-25 2019-10-18 Tencent Technology (Shenzhen) Company Limited Dialogue generation method and apparatus, storage medium, and electronic device
CN111611365A (zh) * 2020-05-19 2020-09-01 Shanghai Hongyi Software Technology Co., Ltd. Flow control method, apparatus, device, and storage medium for a dialogue system
CN112100353A (zh) * 2020-09-15 2020-12-18 BOE Technology Group Co., Ltd. Human-machine dialogue method and system, computer device, and medium
US20210043207A1 * 2018-04-24 2021-02-11 Microsoft Technology Licensing, Llc Session message processing


Also Published As

Publication number Publication date
CN115994201A (zh) 2023-04-21

Similar Documents

Publication Publication Date Title
CN111897941B (zh) Dialogue generation method, network training method, apparatus, storage medium, and device
WO2021047286A1 (zh) Method for training a text processing model, text processing method, and apparatus
WO2022007823A1 (zh) Text data processing method and apparatus
WO2022116933A1 (zh) Model training method, data processing method, and apparatus
WO2021233199A1 (zh) Method for training a search recommendation model, and method and apparatus for ranking search results
WO2022068627A1 (zh) Data processing method and related device
JP2021523464A (ja) Construction of virtual discourse trees to improve answers to convergent questions
CN111898636B (zh) Data processing method and apparatus
WO2021057884A1 (zh) Sentence paraphrasing method, and method and apparatus for training a sentence paraphrasing model
WO2020073533A1 (zh) Automatic question answering method and apparatus
WO2024083121A1 (zh) Data processing method and apparatus
WO2024002167A1 (zh) Operation prediction method and related apparatus
WO2020192523A1 (zh) Translation quality detection method and apparatus, machine translation system, and storage medium
CN115293227A (zh) Model training method and related device
WO2024120504A1 (zh) Data processing method and related device
WO2024114659A1 (zh) Summary generation method and related device
WO2024046473A1 (zh) Data processing method and apparatus
WO2024067779A1 (zh) Data processing method and related apparatus
WO2024012360A1 (zh) Data processing method and related apparatus
CN113420136A (zh) Dialogue method and system, electronic device, storage medium, and program product
WO2023279921A1 (zh) Method for training a neural network model, and data processing method and apparatus
WO2023284716A1 (zh) Neural network search method and related device
CN116910201A (zh) Dialogue data generation method and related device
WO2023061443A1 (zh) Method and apparatus for determining a reply sentence
CN115129863A (zh) Intent recognition method, apparatus, device, storage medium, and computer program product

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22880379

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022880379

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022880379

Country of ref document: EP

Effective date: 20240419

NENP Non-entry into the national phase

Ref country code: DE