CN117453899A - Intelligent dialogue system and method based on large model and electronic equipment - Google Patents

Intelligent dialogue system and method based on large model and electronic equipment

Info

Publication number
CN117453899A
CN117453899A (application CN202311797111.5A; granted publication CN117453899B)
Authority
CN
China
Prior art keywords
intention
information
current
decision
large model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311797111.5A
Other languages
Chinese (zh)
Other versions
CN117453899B (en)
Inventor
黄深广
朱甬翔
段嘉铭
吴国涛
朱泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Zhigangtong Technology Co ltd
Original Assignee
Zhejiang Zhigangtong Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Zhigangtong Technology Co ltd filed Critical Zhejiang Zhigangtong Technology Co ltd
Priority to CN202311797111.5A priority Critical patent/CN117453899B/en
Publication of CN117453899A publication Critical patent/CN117453899A/en
Application granted granted Critical
Publication of CN117453899B publication Critical patent/CN117453899B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a large-model-based intelligent dialogue system and method and to an electronic device. In the invention, a matching-based intent recognition module is used to obtain a corresponding reference intent and reference slot information according to a user request and the history information of the current session; a large model decision module is used to generate operation parameters according to the user request, the corresponding reference intent and reference slot information, and the history information of the current session, and to execute the corresponding calling program in a decision space based on the operation parameters to obtain the current decision information; a large model text generation module is used to generate reply content according to the historical decision information, the current decision information and the corresponding fixed reference information, and to return the reply content to the user. Because the intelligent dialogue system obtains the reference intent and reference slot information corresponding to the user request through the matching-based intent recognition module and executes the corresponding calling program based on them, it can call the correct external API accurately.

Description

Intelligent dialogue system and method based on large model and electronic equipment
Technical Field
The present invention relates to the field of intelligent dialogue, and in particular to a large-model-based intelligent dialogue system and method and to an electronic device.
Background
Intelligent dialogue systems have a long history of development, progressing from conventional dialogue systems (a typical dialogue process is shown in FIG. 5, comprising an intent recognition module, a dialogue management module and a text generation module), through rule-based automated dialogue systems, machine-learning-based intelligent dialogue systems and deep-learning-based intelligent dialogue systems, to today's state-of-the-art dialogue systems based on large models. Deep-learning-based intelligent dialogue systems benefit from recent advances in deep learning, for example systems based on Recurrent Neural Networks (RNNs) and Transformer models, which can better understand user intent and generate more natural and fluent responses. Dialogue systems based on large models attempt to simplify the dialogue system by exploiting the excellent language understanding and reasoning capabilities of large models: the former intent recognition module, dialogue management module and text generation module are restructured so that only two parts remain, a large model module and a filter module.
With the rapid development of deep learning, OpenAI introduced the concept of Large Language Models; in 2022 the dialogue system industry was rapidly disrupted by the GPT-3.5 model. By using the GPT-3.5 model, OpenAI greatly simplified the legacy dialogue system architecture. The simplified architecture consists of a large model (such as GPT-3.5) and a filter, and the specific dialogue process is shown in FIG. 6; it replaces the former intent recognition module, dialogue management module and text generation module. Here the filter is the component that screens and filters the output of the large model and is used to control the security and compliance of the dialogue. It can filter the answers generated by the large model according to specific requirements and rules to ensure that the output content meets expectations and the relevant criteria. The simplified dialogue system architecture provides more efficient and flexible dialogue generation capability, allowing the large model to take on more dialogue tasks directly; modules such as intent recognition and dialogue management in the original dialogue system are fused together, which reduces the system complexity caused by the linkage among multiple modules. Meanwhile, the filter helps to control and manage the output of the dialogue system, improving its usability and safety.
However, because of the intrinsic characteristics of the GPT (Generative Pre-trained Transformer) architecture, the input of the model is directly the user's utterance and the output is produced directly from that utterance, so the responses of the dialogue system suffer from hallucination, which greatly degrades the user experience. In existing solutions, introducing contextual guidance information through prompt engineering (using a retrieval knowledge base in the large model generation module to provide GPT with reference context, in the hope that its output will follow the supplied facts) can alleviate the hallucination problem to some extent. However, since this design does not change the underlying architecture of the algorithm, i.e. its core role in the dialogue system, it cannot completely guarantee that the generated content is free of hallucination or that it is correct. In addition, prompt engineering is often one of the main channels through which a malicious user attacks a large model; because the output of the system depends strongly on the user request, there remains a risk of malicious attack.
In addition, a conventional dialogue system performs classification based on the user request: the classification targets of the algorithm must be specified first, and data must then be collected for training and optimization. Whenever the targets change, new data must be collected and the algorithm retrained, i.e. tuned for the new targets, which results in high maintenance costs.
Disclosure of Invention
In order to solve the problem that the reply content output by a large-model-based dialogue system suffers from hallucination, and at the same time the problem that a conventional dialogue system requires additional training when its targets change, one aspect of the embodiments of the present invention provides a large-model-based intelligent dialogue system, which comprises: a matching-based intent recognition module and a large model;
the matching-based intent recognition module is configured to obtain a corresponding reference intent and reference slot information according to a user request and the history information of the current session;
the large model comprises a large model text generation module and a large model decision module, and the large model decision module comprises a decision space; the decision space comprises a plurality of calling programs corresponding to external APIs, and each calling program is defined with corresponding operation parameters and fixed reference information;
the large model decision module is configured to generate operation parameters according to the user request, the corresponding reference intent and reference slot information, and the history information of the current session, and to execute the corresponding calling program in the decision space based on the operation parameters to obtain the current decision information;
the large model text generation module is configured to generate reply content according to the historical decision information, the current decision information and the corresponding fixed reference information, and to return the reply content to the user.
In some embodiments, the matching-based intent recognition module specifically comprises:
a set definition unit, configured to define intent information; the intent information comprises intents and corresponding intent candidate sets, with the intents in one-to-one correspondence with the intent candidate sets;
a word segmentation unit, configured to segment the user request with a word segmentation algorithm to obtain a segmentation result; the segmentation result comprises phrases and the sentence;
a vectorization unit, configured to vectorize the segmentation result to obtain corresponding phrase vectors and a sentence vector;
an analysis unit, configured to obtain the intent candidate set corresponding to the intent of each historical dialogue round in the current session, and to search each obtained intent candidate set with the phrase vectors corresponding to the current user request to obtain a number of pre-selected intents corresponding to the current request; to search a slot vector knowledge base with the sentence vector and obtain by matching the reference slot information corresponding to the current request; and to input each dialogue information set into a cross-encoding algorithm to obtain a score for each intent candidate set, the intent candidate sets being in one-to-one correspondence with the dialogue information sets; each dialogue information set comprises: the pre-selected intent and reference slot information corresponding to the current request, an intent candidate set, the historical intent and historical slot information corresponding to that intent candidate set, and the current user request; the intent candidate set with the highest score is taken, and its corresponding pre-selected intent is set as the reference intent corresponding to the current request.
In some embodiments, the historical decision information is the decision information corresponding to the previous round of dialogue in the current session.
In order to solve the above problems, a further aspect of the embodiments of the present invention provides a large-model-based intelligent dialogue method applied to the large-model-based intelligent dialogue system described above; the intelligent dialogue method comprises the following steps:
S1: obtaining a user request and the history information of the current session;
S2: obtaining, through the matching-based intent recognition module, the corresponding reference intent and reference slot information from the user request and the history information of the current session;
S3: generating, through the large model decision module, operation parameters from the user request, the corresponding reference intent and reference slot information, and the history information of the current session, and executing the corresponding calling program in the decision space based on the operation parameters to obtain the current decision information;
S4: generating, through the large model text generation module, reply content from the historical decision information, the current decision information and the corresponding fixed reference information, and returning the reply content to the user.
In some embodiments, step S2 specifically comprises:
defining intent information; the intent information comprises intents and corresponding intent candidate sets, with the intents in one-to-one correspondence with the intent candidate sets;
segmenting the user request with a word segmentation algorithm to obtain a segmentation result; the segmentation result comprises phrases and the sentence;
vectorizing the segmentation result to obtain corresponding phrase vectors and a sentence vector;
obtaining the intent candidate set corresponding to the intent of each historical dialogue round in the current session, and searching each obtained intent candidate set with the phrase vectors corresponding to the current user request to obtain a number of pre-selected intents corresponding to the current request; searching a slot vector knowledge base with the sentence vector and obtaining by matching the reference slot information corresponding to the current request; inputting each dialogue information set into a cross-encoding algorithm to obtain a score for each intent candidate set, the intent candidate sets being in one-to-one correspondence with the dialogue information sets; each dialogue information set comprises: the pre-selected intent and reference slot information corresponding to the current request, an intent candidate set, the historical intent and historical slot information corresponding to that intent candidate set, and the current user request; the intent candidate set with the highest score is taken, and its corresponding pre-selected intent is set as the reference intent corresponding to the current request.
In some embodiments, the historical decision information is the decision information corresponding to the previous round of dialogue in the current session.
In order to solve the above problem, another aspect of an embodiment of the present invention provides an electronic device, including: a processor, and a memory storing a program comprising instructions that, when executed by the processor, cause the processor to perform the method described above.
To solve the above-described problems, in a further aspect of embodiments of the present invention, there is provided a non-transitory machine-readable medium storing computer instructions for causing the computer to perform the method described above.
The beneficial effects of the embodiments of the invention include:
(1) The intelligent dialogue system provided by the invention comprises: a matching-based intent recognition module configured to obtain a corresponding reference intent and reference slot information according to a user request and the history information of the current session; a large model decision module configured to generate operation parameters according to the user request, the corresponding reference intent and reference slot information, and the history information of the current session, and to execute the corresponding calling program in a decision space based on the operation parameters to obtain the current decision information; and a large model text generation module configured to generate reply content according to the historical decision information, the current decision information and the corresponding fixed reference information (auxiliary information), and to return the reply content to the user. Because the reference intent and reference slot information corresponding to the user request are obtained through the matching-based intent recognition module and the corresponding calling program is executed based on them, the intelligent dialogue system can call the correct external API (application programming interface), which avoids the hallucination problem of existing large-model dialogue systems, in which the model input is directly the user's utterance, no auxiliary information is available, and the result is output directly from that utterance;
(2) The invention executes the corresponding calling program based on the reference intent and reference slot information, which ensures that the intelligent dialogue system calls the correct external API and avoids the problem of existing large-model dialogue systems in which APIs are invoked from the user's utterance together with lengthy additional API descriptions: when there are many APIs, the amount of text required to define them is large, the context becomes too long, and the large model becomes unusable (i.e. it cannot call the APIs accurately);
(3) In the invention, because the large model text generation module does not interact directly with the content of the user request, prompt injection attacks by malicious users are avoided;
(4) The matching-based intent recognition module of the invention obtains the pre-selected intents, combines them with the reference slot information and the history information of the current session to obtain the reference intent corresponding to the current request, and, by working in a retrieval manner, avoids the excessive maintenance cost of conventional dialogue systems, which rely on classification and therefore require training and tuning.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the other features, objects, and advantages of the invention.
Drawings
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is evident that the drawings in the following description show only some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a block diagram of a large model-based intelligent dialog system in accordance with one embodiment of the present invention;
FIG. 2 is a schematic diagram of a matched intent recognition module in the intelligent dialog system of FIG. 1;
FIG. 3 is a flow chart of a large model based intelligent dialog method in accordance with another embodiment of the present invention;
FIG. 4 is a schematic structural diagram of the electronic device of this embodiment;
FIG. 5 is a schematic diagram of a conventional dialogue system in the prior art;
FIG. 6 is a schematic diagram of a prior-art dialogue system based on a large model.
Detailed Description
Embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present invention are illustrated in the accompanying drawings, it should be understood that the present invention may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the invention. It should be understood that the drawings and embodiments are presented for purposes of illustration only and are not intended to limit the scope of protection.
For a better understanding of the embodiments of the present application, the technical terms involved in the embodiments are explained below:
the large model refers to a large-scale pre-trained language model, such as GPT3.5, GPT4, and universal meaning thousand questions of Ababa of OpenAI, and the large model can have powerful semantic understanding and generating capability by pre-training on a large number of corpora, can automatically extract context information, understand user intention and generate accurate answers conforming to the context.
Byte Pair Encoding (BPE) is a subword segmentation algorithm for natural language processing. It aims to find an optimal way of merging characters so that, across the whole dataset, the different words are represented with as few character combinations as possible.
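As an illustrative, non-limiting sketch of the merge procedure described above, the following Python fragment trains a toy BPE model on a handful of words; the corpus, function names and number of merges are assumptions made for demonstration only and are not part of the claimed embodiment.

```python
from collections import Counter

def bpe_train(words, num_merges=10):
    """Toy BPE trainer: repeatedly merge the most frequent adjacent symbol pair."""
    # Represent each word as a tuple of single characters with its corpus frequency.
    vocab = Counter({tuple(w): c for w, c in Counter(words).items()})
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, count in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += count
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair in the dataset
        merges.append(best)
        merged_vocab = Counter()
        for symbols, count in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])  # merge the chosen pair
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged_vocab[tuple(out)] += count
        vocab = merged_vocab
    return merges

if __name__ == "__main__":
    corpus = ["low", "lower", "lowest", "newer", "newest"]
    print(bpe_train(corpus, num_merges=5))  # learned merge rules, e.g. ('l', 'o'), ('lo', 'w'), ...
```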
ULM (Unigram Language Model) word segmentation algorithm: the core idea of ULM is to use a language model to learn the characteristics of the language so that input text can be accurately segmented into words or phrases.
Sentence-BERT algorithm: Sentence-BERT is a siamese network built on pre-trained BERT that produces semantically meaningful sentence vectors.
In the prior art, because of the intrinsic characteristics of the GPT (Generative Pre-trained Transformer) architecture, the input of the model is directly the user's utterance and the output is produced directly from that utterance, so the responses of the dialogue system suffer from hallucination.
In order to solve the above technical problem, as shown in FIG. 1, an embodiment of the present invention provides a large-model-based intelligent dialogue system, which comprises: a matching-based intent recognition module and a large model.
The matching-based intent recognition module is configured to obtain a corresponding reference intent and reference slot information according to a user request and the history information of the current session.
as shown in fig. 2, the matching type intention recognition module specifically includes:
a set setting unit for defining intention information; the intention information comprises an intention and a corresponding intention candidate set; the intents are in one-to-one correspondence with the candidate sets of intents;
the word segmentation device is used for segmenting the user request by using a word segmentation algorithm to obtain a word segmentation result; the word segmentation result comprises a phrase and a sentence;
the word segmentation algorithm that can be used in this embodiment includes: byte Pair Encoding (BPE), ULM Word segmentation algorithm, word Piece segmentation algorithm, etc.
a vectorization unit, configured to vectorize the segmentation result to obtain corresponding phrase vectors and a sentence vector.
Specifically, the algorithm adopted by the vectorization unit is the Sentence-BERT algorithm.
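As a non-limiting illustration, the vectorization unit may be sketched with the open-source sentence-transformers library as follows; the specific model checkpoint, the example request and the segmented phrases are assumptions chosen for demonstration and are not prescribed by this embodiment.

```python
# pip install sentence-transformers  (assumed dependency, not mandated by the embodiment)
from sentence_transformers import SentenceTransformer

# A multilingual Sentence-BERT checkpoint chosen purely for illustration.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

user_request = "check the weather in Hangzhou"
phrases = ["check the weather", "Hangzhou"]  # output of the word segmentation unit

# Phrase vectors: one embedding per segmented phrase.
phrase_vectors = model.encode(phrases)
# Sentence vector: one embedding for the whole request.
sentence_vector = model.encode(user_request)

print(phrase_vectors.shape, sentence_vector.shape)  # e.g. (2, 384) and (384,)
```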
an analysis unit, configured to obtain the intent candidate set corresponding to the intent of each historical dialogue round in the current session, and to search each obtained intent candidate set with the phrase vectors corresponding to the current user request to obtain a number of pre-selected intents corresponding to the current request; to search a slot vector knowledge base with the sentence vector and obtain by matching the reference slot information corresponding to the current request; and to input each dialogue information set into a cross-encoding algorithm to obtain a score for each intent candidate set, the intent candidate sets being in one-to-one correspondence with the dialogue information sets; each dialogue information set comprises: the pre-selected intent and reference slot information corresponding to the current request, an intent candidate set, the historical intent and historical slot information corresponding to that intent candidate set, and the current user request; the intent candidate set with the highest score is taken, and its corresponding pre-selected intent is set as the reference intent corresponding to the current request.
For example, if the user's request is "check the weather in Hangzhou", then:
the reference slot information is: Hangzhou;
the reference intent is: check the weather.
In addition, it should be explained that a session includes one or more rounds of dialogue. When there is only one round of dialogue, no pre-selected intent is available for that round's request; in this case the corresponding reference intent is obtained by searching the intent vector knowledge base directly with the phrase vectors and matching. Likewise, when the current session contains only one round of dialogue, the historical decision information is empty; when it contains two rounds, the historical decision information of the second round is the decision information of the previous round in the current session.
An illustration is now given for a session comprising multiple rounds of dialogue (a condensed code sketch of the analysis unit follows this example):
in the first round, the user requests: "Help me check the container turnover rate of the terminal"; the reference intent obtained at this point is: container turnover rate of the terminal;
in the second round, the user requests: "Which unmanned vehicles are available at present?". Because the reference intent of the first round concerns the terminal, the question about unmanned vehicles most likely refers to the unmanned vehicles available at that terminal; the analysis unit therefore restricts the intent candidate sets used for the reference intent of the second round (i.e. to the terminal-related intent candidate sets) based on the intent of the first round.
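A condensed, non-limiting sketch of the analysis unit's flow (pre-selected intent retrieval, slot matching and cross-encoder scoring) is given below; the embedding and cross-encoder checkpoints, the toy intent candidate sets and the in-memory slot knowledge base are illustrative assumptions rather than requirements of the embodiment.

```python
import numpy as np
from sentence_transformers import SentenceTransformer, CrossEncoder

# Checkpoints are illustrative assumptions; the embodiment only requires a
# Sentence-BERT style bi-encoder and a cross-encoding scorer.
embedder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
scorer = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

intent_candidate_sets = {  # toy data: intent -> its candidate set
    "terminal container turnover rate": [
        "terminal container turnover rate", "terminal unmanned vehicle availability"],
    "check the weather": ["check the weather", "check the air quality"],
}
slot_kb = ["Hangzhou", "terminal A", "unmanned vehicle"]  # toy slot vector knowledge base

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def analyse(request, phrases, history):
    """history: list of (intent, slot) tuples from earlier rounds of the current session."""
    sent_vec = embedder.encode(request)
    phrase_vecs = embedder.encode(phrases)

    # Reference slot: nearest entry of the slot vector knowledge base to the sentence vector.
    ref_slot = max(slot_kb, key=lambda s: cos(sent_vec, embedder.encode(s)))

    best = None
    for hist_intent, hist_slot in history:
        candidates = intent_candidate_sets[hist_intent]
        # Pre-selected intent: the candidate closest to the request's phrase vectors.
        pre = max(candidates,
                  key=lambda c: max(cos(pv, embedder.encode(c)) for pv in phrase_vecs))
        # Score the whole dialogue information set with the cross-encoder.
        info = f"pre-intent: {pre}; slot: {ref_slot}; history: {hist_intent}/{hist_slot}"
        score = float(scorer.predict([(request, info)])[0])
        if best is None or score > best[0]:
            best = (score, pre)

    if best is None:
        # Single-round session: the embodiment falls back to searching the intent
        # vector knowledge base directly (omitted in this sketch).
        return None, ref_slot
    return best[1], ref_slot  # reference intent, reference slot

if __name__ == "__main__":
    intent, slot = analyse(
        "Which unmanned vehicles are available right now?",
        ["unmanned vehicles", "available"],
        history=[("terminal container turnover rate", "terminal A")])
    print(intent, slot)
```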
It should be noted that the matching-based intent recognition module is an important component of the intelligent dialogue system of the present invention: it understands the user's intent from the user's request (question) and extracts the key information, which is provided to the large model in the form of an intent and slots.
The matching-based intent recognition module of the invention obtains the pre-selected intents, combines them with the reference slot information and the history information of the current session to obtain the reference intent corresponding to the current request, and, by working in a retrieval manner, avoids the excessive maintenance cost of conventional dialogue systems, which rely on classification and therefore require training and tuning.
The large model comprises a large model text generation module and a large model decision module, and the large model decision module comprises a decision space; the decision space comprises a plurality of calling programs corresponding to external APIs, and each calling program is defined with corresponding operation parameters and fixed reference information.
the external API may be any application that a business party wants to access, such as: weather inquiry systems, logistics inquiry systems, route inquiry systems, and the like.
The large model decision module is configured to perform language understanding and reasoning on the user request, the corresponding reference intent and reference slot information, and the history information of the current session, so as to select the corresponding external API in the decision space, generate the operation parameters for that external API, and execute the corresponding calling program in the decision space based on the operation parameters to obtain the current decision information (i.e. the decision information of the current round).
In other words, in a related-art dialogue system based on a large model, the user's utterance is fed directly to the large model, which then searches the external APIs to obtain a call result. The present application first retrieves the intent and the slots and only then selects the external API, which improves the accuracy of the call.
The large model text generation module is configured to generate reply content according to the historical decision information, the current decision information and the corresponding fixed reference information, and to return the reply content to the user.
The historical decision information is the decision information corresponding to the previous round of dialogue in the current session.
In the invention, because the large model text generation module does not interact directly with the content of the user request, prompt injection attacks by malicious users are avoided (since there is no direct interaction, the reply text cannot be steered by malicious guidance).
The fixed reference information is illustrated in this embodiment as follows:
the user request is: query the container turnover rate of the dock vessel.
The reference intent corresponding to this request is: query the container turnover rate indicator, and the slot is: dock vessel. When the large model decision module calls the external API, the large model also produces a reasoning trace, for example: the content the user is asking about is the container turnover rate of the dock vessel, and no time period is provided, so the default 3-month period is used for the calculation. This default query period (3 months) is fixed reference information set in advance.
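A minimal, non-limiting sketch of the text generation step is given below: the reply prompt is assembled only from the historical decision information, the current decision information and the fixed reference information, never from the raw user request; the function names and the canned model reply are illustrative assumptions.

```python
def generate_reply(historical_decision, current_decision, fixed_reference, call_large_model):
    """Text generation module: builds the reply prompt without ever seeing the raw user request."""
    prompt = (
        "You are the reply-generation module of a dialogue system.\n"
        f"Decision of the previous round: {historical_decision}\n"
        f"Decision of the current round: {current_decision}\n"
        f"Fixed reference information: {fixed_reference}\n"
        "Write a concise, factual reply to the user based only on the information above."
    )
    return call_large_model(prompt)

if __name__ == "__main__":
    # A canned model is used here only so the sketch runs; the real system calls the deployed LLM.
    reply = generate_reply(
        historical_decision=None,
        current_decision={"intent": "query container turnover rate",
                          "result": {"vessel": "dock vessel", "turnover_rate": 0.12}},
        fixed_reference={"default_period_months": 3},
        call_large_model=lambda p: ("Over the default 3-month window, the container turnover "
                                    "rate of the dock vessel is 12%."))
    print(reply)
```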
The intelligent dialogue system provided by the invention comprises: a matching-based intent recognition module configured to obtain a corresponding reference intent and reference slot information according to a user request and the history information of the current session; a large model decision module configured to generate operation parameters according to the user request, the corresponding reference intent and reference slot information, and the history information of the current session, and to execute the corresponding calling program in a decision space based on the operation parameters to obtain the current decision information; and a large model text generation module configured to generate reply content according to the historical decision information, the current decision information and the corresponding fixed reference information (auxiliary information), and to return the reply content to the user. Because the intelligent dialogue system can call the correct external API, it solves the hallucination problem of existing large-model dialogue systems, in which the model input is directly the user's utterance, no auxiliary information is available, and the result is output directly from that utterance.
The embodiments of the present invention also provide a large-model-based intelligent dialogue method applied to the large-model-based intelligent dialogue system described above; as shown in FIG. 3, the intelligent dialogue method comprises the following steps:
S1: obtaining a user request and the history information of the current session;
S2: obtaining, through the matching-based intent recognition module, the corresponding reference intent and reference slot information from the user request and the history information of the current session.
Step S2 specifically comprises:
defining intent information; the intent information comprises intents and corresponding intent candidate sets, with the intents in one-to-one correspondence with the intent candidate sets;
segmenting the user request with a word segmentation algorithm to obtain a segmentation result; the segmentation result comprises phrases and the sentence;
vectorizing the segmentation result to obtain corresponding phrase vectors and a sentence vector;
obtaining the intent candidate set corresponding to the intent of each historical dialogue round in the current session, and searching each obtained intent candidate set with the phrase vectors corresponding to the current user request to obtain a number of pre-selected intents corresponding to the current request; searching a slot vector knowledge base with the sentence vector and obtaining by matching the reference slot information corresponding to the current request; inputting each dialogue information set into a cross-encoding algorithm to obtain a score for each intent candidate set, the intent candidate sets being in one-to-one correspondence with the dialogue information sets; each dialogue information set comprises: the pre-selected intent and reference slot information corresponding to the current request, an intent candidate set, the historical intent and historical slot information corresponding to that intent candidate set, and the current user request; the intent candidate set with the highest score is taken, and its corresponding pre-selected intent is set as the reference intent corresponding to the current request.
S3: generating, through the large model decision module, operation parameters from the user request, the corresponding reference intent and reference slot information, and the history information of the current session, and executing the corresponding calling program in the decision space based on the operation parameters to obtain the current decision information;
S4: generating, through the large model text generation module, reply content from the historical decision information, the current decision information and the corresponding fixed reference information, and returning the reply content to the user.
The historical decision information is the decision information corresponding to the previous round of dialogue in the current session.
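The four steps S1-S4 may be wired together as sketched below; all function names are illustrative assumptions, and the stub implementations stand in for the modules sketched in the preceding sections so that the fragment runs on its own.

```python
def intent_recognition(request, history):
    # Stub standing in for the matching-based intent recognition module sketched earlier.
    return "query container turnover rate", "dock vessel"

def decide_and_execute(request, intent, slot, history):
    # Stub standing in for the large model decision module and the decision space.
    return {"intent": intent, "result": {"vessel": slot, "turnover_rate": 0.12},
            "fixed_reference": {"default_period_months": 3}}

def generate_reply(historical_decision, current_decision, fixed_reference):
    # Stub standing in for the large model text generation module.
    return (f"Using the default {fixed_reference.get('default_period_months')}-month window, "
            f"the container turnover rate of {current_decision['result']['vessel']} is "
            f"{current_decision['result']['turnover_rate']:.0%}.")

def dialogue_round(user_request, session):
    """One round of the method, steps S1-S4; the session dict carries history across rounds."""
    history = session.setdefault("history", [])                                 # S1
    ref_intent, ref_slot = intent_recognition(user_request, history)            # S2
    decision = decide_and_execute(user_request, ref_intent, ref_slot, history)  # S3
    reply = generate_reply(session.get("last_decision"), decision,              # S4
                           decision.get("fixed_reference", {}))
    # Record this round so the next round can use its intent, slot and decision information.
    history.append({"intent": ref_intent, "slot": ref_slot})
    session["last_decision"] = decision
    return reply

if __name__ == "__main__":
    session = {}
    print(dialogue_round("Query the container turnover rate of the dock vessel", session))
```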
The invention executes the corresponding calling program based on the reference intent and reference slot information, which ensures that the intelligent dialogue system calls the correct external API and solves the problem of existing large-model dialogue systems in which APIs are invoked from the user's utterance together with lengthy additional API descriptions: when there are many APIs, the amount of text required to define them is large, the context becomes too long, and the large model becomes unusable (i.e. it cannot call the APIs accurately).
The embodiment of the invention also provides electronic equipment, which comprises: at least one processor; and a memory communicatively coupled to the at least one processor. The memory stores a computer program executable by the at least one processor, which when executed by the at least one processor is adapted to cause an electronic device to perform a method of an embodiment of the invention.
The embodiments of the present invention also provide a non-transitory machine-readable medium storing a computer program, wherein the computer program is configured to cause a computer to perform the method of the embodiments of the present invention when executed by a processor of the computer.
The embodiments of the present invention also provide a computer program product comprising a computer program, wherein the computer program, when being executed by a processor of a computer, is for causing the computer to perform the method of the embodiments of the present invention.
With reference to FIG. 4, a block diagram of an electronic device that may serve as a server or a client in an embodiment of the present invention is now described; it is an example of a hardware device to which aspects of the present invention may be applied. Electronic devices are intended to represent various forms of digital electronic computer devices, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only and are not meant to limit the implementations of the invention described and/or claimed herein.
As shown in fig. 4, the electronic device includes a computing unit 401 that can perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 402 or a computer program loaded from a storage unit 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data required for the operation of the electronic device can also be stored. The computing unit 401, ROM 402, and RAM 403 are connected to each other by a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
A number of components in the electronic device are connected to the I/O interface 405, including: an input unit 406, an output unit 407, a storage unit 408, and a communication unit 409. The input unit 406 may be any type of device capable of inputting information to an electronic device, and the input unit 406 may receive input numeric or character information and generate key signal inputs related to user settings and/or function controls of the electronic device. The output unit 407 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, video/audio output terminals, vibrators, and/or printers. Storage unit 408 may include, but is not limited to, magnetic disks, optical disks. The communication unit 409 allows the electronic device to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers and/or chipsets, such as bluetooth devices, wiFi devices, wiMax devices, cellular communication devices, and/or the like.
The computing unit 401 may be a variety of general purpose and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 401 include, but are not limited to, a CPU, a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 401 performs the methods and processes described above. For example, in some embodiments, method embodiments of the present invention may be implemented as a computer program tangibly embodied on a machine-readable medium, such as storage unit 408. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device via the ROM 402 and/or the communication unit 409. In some embodiments, the computing unit 401 may be configured to perform the above-described methods by any other suitable means (e.g., by means of firmware).
A computer program for implementing the methods of embodiments of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of embodiments of the present invention, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable signal medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
It should be noted that the term "comprising" and its variants as used in the embodiments of the present invention are open-ended, i.e. "including but not limited to". The term "based on" means based at least in part on. The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". References to "a" or "an" in the embodiments of the invention are intended to be illustrative rather than limiting; those skilled in the art will understand that they should be read as "one or more" unless the context clearly indicates otherwise.
The term "embodiment" in this specification means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the invention. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive. The various embodiments in this specification are described in a related manner, with identical and similar parts being referred to each other. In particular, for apparatus, devices, system embodiments, the description is relatively simple as it is substantially similar to method embodiments, see for relevant part of the description of method embodiments.
The above examples merely represent a few embodiments of the present invention; although they are described in considerable detail, they are not to be construed as limiting the scope of the patent. It should be noted that a person skilled in the art can make several variations and modifications without departing from the spirit of the invention, and these all fall within the scope of protection of the invention. Accordingly, the scope of protection of the invention should be determined by the appended claims.

Claims (8)

1. A large-model-based intelligent dialogue system, the intelligent dialogue system comprising: a matching-based intent recognition module and a large model;
the matching-based intent recognition module is configured to obtain a corresponding reference intent and reference slot information according to a user request and the history information of the current session;
the large model comprises a large model text generation module and a large model decision module, and the large model decision module comprises a decision space; the decision space comprises a plurality of calling programs corresponding to external APIs, and each calling program is defined with corresponding operation parameters and fixed reference information;
the large model decision module is configured to generate operation parameters according to the user request, the corresponding reference intent and reference slot information, and the history information of the current session, and to execute the corresponding calling program in the decision space based on the operation parameters to obtain the current decision information;
the large model text generation module is configured to generate reply content according to the historical decision information, the current decision information and the corresponding fixed reference information, and to return the reply content to the user.
2. The large-model-based intelligent dialogue system according to claim 1, wherein the matching-based intent recognition module specifically comprises:
a set definition unit, configured to define intent information; the intent information comprises intents and corresponding intent candidate sets, with the intents in one-to-one correspondence with the intent candidate sets;
a word segmentation unit, configured to segment the user request with a word segmentation algorithm to obtain a segmentation result; the segmentation result comprises phrases and the sentence;
a vectorization unit, configured to vectorize the segmentation result to obtain corresponding phrase vectors and a sentence vector;
an analysis unit, configured to obtain the intent candidate set corresponding to the intent of each historical dialogue round in the current session, and to search each obtained intent candidate set with the phrase vectors corresponding to the current user request to obtain a number of pre-selected intents corresponding to the current request; to search a slot vector knowledge base with the sentence vector and obtain by matching the reference slot information corresponding to the current request; and to input each dialogue information set into a cross-encoding algorithm to obtain a score for each intent candidate set, the intent candidate sets being in one-to-one correspondence with the dialogue information sets; each dialogue information set comprises: the pre-selected intent and reference slot information corresponding to the current request, an intent candidate set, the historical intent and historical slot information corresponding to that intent candidate set, and the current user request; the intent candidate set with the highest score is taken, and its corresponding pre-selected intent is set as the reference intent corresponding to the current request.
3. The large-model-based intelligent dialogue system according to claim 2, wherein the historical decision information is the decision information corresponding to the previous round of dialogue in the current session.
4. A large-model-based intelligent dialogue method applied to the large-model-based intelligent dialogue system according to any one of claims 1 to 3, characterized in that the intelligent dialogue method comprises the following steps:
S1: obtaining a user request and the history information of the current session;
S2: obtaining, through the matching-based intent recognition module, the corresponding reference intent and reference slot information from the user request and the history information of the current session;
S3: generating, through the large model decision module, operation parameters from the user request, the corresponding reference intent and reference slot information, and the history information of the current session, and executing the corresponding calling program in the decision space based on the operation parameters to obtain the current decision information;
S4: generating, through the large model text generation module, reply content from the historical decision information, the current decision information and the corresponding fixed reference information, and returning the reply content to the user.
5. The large-model-based intelligent dialogue method according to claim 4, wherein step S2 specifically comprises:
defining intent information; the intent information comprises intents and corresponding intent candidate sets, with the intents in one-to-one correspondence with the intent candidate sets;
segmenting the user request with a word segmentation algorithm to obtain a segmentation result; the segmentation result comprises phrases and the sentence;
vectorizing the segmentation result to obtain corresponding phrase vectors and a sentence vector;
obtaining the intent candidate set corresponding to the intent of each historical dialogue round in the current session, and searching each obtained intent candidate set with the phrase vectors corresponding to the current user request to obtain a number of pre-selected intents corresponding to the current request; searching a slot vector knowledge base with the sentence vector and obtaining by matching the reference slot information corresponding to the current request; inputting each dialogue information set into a cross-encoding algorithm to obtain a score for each intent candidate set, the intent candidate sets being in one-to-one correspondence with the dialogue information sets; each dialogue information set comprises: the pre-selected intent and reference slot information corresponding to the current request, an intent candidate set, the historical intent and historical slot information corresponding to that intent candidate set, and the current user request; the intent candidate set with the highest score is taken, and its corresponding pre-selected intent is set as the reference intent corresponding to the current request.
6. The large-model-based intelligent dialogue method according to claim 5, wherein the historical decision information is the decision information corresponding to the previous round of dialogue in the current session.
7. An electronic device, comprising: a processor, and a memory storing a program, characterized in that the program comprises instructions which, when executed by the processor, cause the processor to perform the method according to any one of claims 4 to 6.
8. A non-transitory machine-readable medium storing computer instructions for causing a computer to perform the method according to any one of claims 4 to 6.
CN202311797111.5A 2023-12-26 2023-12-26 Intelligent dialogue system and method based on large model and electronic equipment Active CN117453899B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311797111.5A CN117453899B (en) 2023-12-26 2023-12-26 Intelligent dialogue system and method based on large model and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311797111.5A CN117453899B (en) 2023-12-26 2023-12-26 Intelligent dialogue system and method based on large model and electronic equipment

Publications (2)

Publication Number Publication Date
CN117453899A true CN117453899A (en) 2024-01-26
CN117453899B CN117453899B (en) 2024-03-29

Family

ID=89585971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311797111.5A Active CN117453899B (en) 2023-12-26 2023-12-26 Intelligent dialogue system and method based on large model and electronic equipment

Country Status (1)

Country Link
CN (1) CN117453899B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111026842A (en) * 2019-11-29 2020-04-17 微民保险代理有限公司 Natural language processing method, natural language processing device and intelligent question-answering system
CN111104504A (en) * 2019-12-25 2020-05-05 天津中科智能识别产业技术研究院有限公司 Natural language processing and knowledge graph based dialogue method
CN112632961A (en) * 2021-03-04 2021-04-09 支付宝(杭州)信息技术有限公司 Natural language understanding processing method, device and equipment based on context reasoning
JP2021174511A (en) * 2020-04-29 2021-11-01 バイドゥ オンライン ネットワーク テクノロジー (ベイジン) カンパニー リミテッド Query analyzing method, device, electronic equipment, program, and readable storage medium
CN116127011A (en) * 2022-11-22 2023-05-16 马上消费金融股份有限公司 Intention recognition method, device, electronic equipment and storage medium
CN116244418A (en) * 2023-05-11 2023-06-09 腾讯科技(深圳)有限公司 Question answering method, device, electronic equipment and computer readable storage medium
CN116521893A (en) * 2023-04-28 2023-08-01 苏州浪潮智能科技有限公司 Control method and control device of intelligent dialogue system and electronic equipment
CN116976502A (en) * 2023-07-10 2023-10-31 浙江智港通科技有限公司 Structured ship configuration method, system and medium for container ship
CN117056494A (en) * 2023-09-28 2023-11-14 腾讯科技(深圳)有限公司 Open domain question and answer method, device, electronic equipment and computer storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
杨晔: "Research on chatbots based on deep learning" (基于深度学习的聊天机器人的研究), Information Technology and Informatization, no. 03, 28 March 2020 (2020-03-28), pages 164-165 *
舒文韬 et al.: "Large language models: principles, implementation and development" (大型语言模型：原理、实现与发展), Journal of Computer Research and Development, 9 October 2023 (2023-10-09), pages 1-10 *

Also Published As

Publication number Publication date
CN117453899B (en) 2024-03-29

Similar Documents

Publication Publication Date Title
CN108847241B (en) Method for recognizing conference voice as text, electronic device and storage medium
US20230080671A1 (en) User intention recognition method and apparatus based on statement context relationship prediction
WO2022121251A1 (en) Method and apparatus for training text processing model, computer device and storage medium
CN114970522B (en) Pre-training method, device, equipment and storage medium of language model
CN115309877B (en) Dialogue generation method, dialogue model training method and device
CN111027292B (en) Method and system for generating limited sampling text sequence
CN111339309B (en) Corpus expansion method and system for user intention
CN116737908A (en) Knowledge question-answering method, device, equipment and storage medium
CN113010653B (en) Method and system for training and conversing conversation strategy model
EP4057283A2 (en) Method for detecting voice, method for training, apparatuses and smart speaker
CN112632248A (en) Question answering method, device, computer equipment and storage medium
JP2021081713A (en) Method, device, apparatus, and media for processing voice signal
JP2023002690A (en) Semantics recognition method, apparatus, electronic device, and storage medium
CN116187320A (en) Training method and related device for intention recognition model
CN115312034A (en) Method, device and equipment for processing voice signal based on automaton and dictionary tree
CN117453899B (en) Intelligent dialogue system and method based on large model and electronic equipment
CN112100339A (en) User intention recognition method and device for intelligent voice robot and electronic equipment
CN113792133B (en) Question judging method and device, electronic equipment and medium
CN114490969B (en) Question and answer method and device based on table and electronic equipment
CN113535930B (en) Model training method, device and storage medium
CN111091011B (en) Domain prediction method, domain prediction device and electronic equipment
CN111159339A (en) Text matching processing method and device
CN117333889A (en) Training method and device for document detection model and electronic equipment
CN117539984A (en) Method, device and equipment for generating reply text
CN115906797A (en) Text entity alignment method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant