CN114255750B - Data set construction and task-based dialogue method, electronic device and storage medium - Google Patents

Publication number
CN114255750B
Authority
CN
China
Prior art keywords
task
data set
dialog
intention
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111421284.8A
Other languages
Chinese (zh)
Other versions
CN114255750A (en)
Inventor
姜飞俊
胡于响
施晨
林兆江
徐鹏
冯雁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd
Priority to CN202111421284.8A
Publication of CN114255750A
Application granted
Publication of CN114255750B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/225 Feedback of the input speech

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The embodiments of the invention provide a data set construction method, a task-based dialog method, an electronic device, and a storage medium. The data set construction method comprises: constructing a user intention set; determining, from a feature data set, key data corresponding to each user intention in the user intention set; generating a task-based dialog outline of a target user intention according to the key data corresponding to the target user intention among the user intentions; and constructing a task-based dialog data set according to the task-based dialog outline of the target user intention. The scheme of the embodiments reduces the labor cost and time cost of constructing a dialog data set.

Description

Data set construction and task-based dialogue method, electronic device and storage medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a data set construction and task-based dialogue method, electronic equipment and a storage medium.
Background
Task-oriented dialog (ToD) is also known as task-based dialog. A system based on task-oriented dialog intelligently executes tasks according to instructions input by the user and returns the information the user wants.
To realize intelligent analysis and feedback, a reliable dialog data set is needed to construct a task-based dialog system. In the existing data acquisition process, the dialog must be produced manually, and information such as dialog states must then be annotated after the dialog is generated, so the time cost and labor cost are high.
Disclosure of Invention
In view of the above, embodiments of the present invention provide a data set construction and task-based dialog method, an electronic device, and a storage medium to at least partially solve the above problems.
According to a first aspect of the embodiments of the present invention, there is provided a data set construction method, including: constructing a user intention set; determining, from a feature data set, key data corresponding to each user intention in the user intention set; generating a task-based dialog outline of a target user intention according to the key data corresponding to the target user intention among the user intentions; and constructing a task-based dialog data set according to the task-based dialog outline of the target user intention.
According to a second aspect of the embodiments of the present invention, there is provided a data set construction apparatus, including: a first construction module configured to construct a user intention set; a determining module configured to determine, from a feature data set, key data corresponding to each user intention in the user intention set; a generating module configured to generate a task-based dialog outline of a target user intention according to the key data corresponding to the target user intention among the user intentions; and a second construction module configured to construct a task-based dialog data set according to the task-based dialog outline of the target user intention.
According to a third aspect of the embodiments of the present invention, there is provided a task-based dialog method, including: acquiring a task-based dialog request of a user through a human-computer interaction interface, wherein the task-based dialog request indicates a target user intention; sending the task-based dialog request to a task-based dialog system, wherein the task-based dialog system processes the target user intention using a machine learning model to obtain a task-based dialog response, and the machine learning model is trained on a task-based dialog data set constructed according to the method of the first aspect; receiving the task-based dialog response sent by the task-based dialog system; and displaying the task-based dialog response to the user through the human-computer interaction interface.
According to a fourth aspect of embodiments of the present invention, there is provided an electronic apparatus, including: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus; the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the corresponding operation of the method according to the first aspect.
According to a fifth aspect of embodiments of the present invention, there is provided a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the method as described in the first aspect.
In the scheme of the embodiments of the invention, the key data corresponding to each user intention in the user intention set is determined from the feature data set, which establishes an accurate correspondence between the key data and the user intentions. An accurate task-based dialog outline is then generated efficiently from the key data corresponding to the target user intention, and the task-based dialog data set is constructed from that outline. This improves the efficiency of data set construction and reduces the labor cost and time cost of constructing a dialog data set.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some of the embodiments of the present invention, and those skilled in the art can obtain other drawings from them.
FIG. 1 is a schematic block diagram of a task-based dialog system according to one example;
FIG. 2 is a flow diagram of the steps of a data set construction method according to one embodiment of the invention;
FIG. 3A is a schematic diagram of a method of constructing a feature data set according to another embodiment of the invention;
FIG. 3B is a diagram illustrating a method for constructing a dialog data set according to another embodiment of the present invention;
FIG. 3C is a diagram illustrating a task-based dialog method according to another embodiment of the present invention;
FIG. 4 is a block diagram of a data set constructing apparatus according to another embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to another embodiment of the invention.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the embodiments of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all the embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention shall fall within the scope of the protection of the embodiments of the present invention.
The following further describes specific implementation of the embodiments of the present invention with reference to the drawings.
FIG. 1 is a schematic block diagram of a task-based dialog system in accordance with one embodiment of the present invention. The task-based dialog system of FIG. 1 includes a speech recognition module 110, a natural language understanding module 120, a dialog manager 130, a knowledge base module 140, a natural language generation module 150, and a speech synthesis module 160.
For speech data from a user request, the speech recognition module 110 can perform Automatic Speech Recognition (ASR) on the speech data to obtain speech text, which is input to the natural language understanding module 120. The natural language understanding module 120 can perform Natural Language Understanding (NLU) on the speech text. It mainly processes a sentence input by the user or a speech recognition result, extracts the user's dialog intention and the information the user conveys, and generates semantic text carrying the key semantic information. The dialog manager 130 may include Dialog State Tracking (DST) and Dialog Policy Learning (DPL). Its main function is to update the system state based on the NLU result and generate the corresponding system action, for example by invoking data in the knowledge base module 140, to obtain dialog semantic text that conforms to the intention of the user request. The natural language generation module 150 applies Natural Language Generation (NLG) to the dialog semantic text, expressing the system action output by the dialog manager in text form to obtain complete semantic text. The speech synthesis module 160 then applies Text-To-Speech (TTS) to generate speech data that can be fed back to the user, completing a typical task-based dialog flow.
In other examples, the speech recognition module 110 and the speech synthesis module 160 may also be configured as one module for converting between speech data and text. In addition, the natural language understanding module 120 and the natural language generation module 150 may be configured as one module for semantic processing of speech text.
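The module chain of FIG. 1 can be pictured as a sequence of function calls. The following is a minimal sketch only: all function names, the toy knowledge base, and the rule-based logic are hypothetical placeholders chosen for illustration, and the "speech data" is simply UTF-8 text so that the example stays self-contained.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for modules 110-160 of FIG. 1. Every body below is a
# toy placeholder, not the implementation described in the patent.

KNOWLEDGE_BASE = {"restaurant": "X Street"}  # toy knowledge base (module 140)


def speech_recognition(speech_data: bytes) -> str:
    """Module 110 (ASR): the 'audio' here is just UTF-8 encoded text."""
    return speech_data.decode("utf-8")


def natural_language_understanding(text: str) -> dict:
    """Module 120 (NLU): extract a dialog intention from the text."""
    intent = "find_restaurant" if "restaurant" in text else "unknown"
    return {"intent": intent, "text": text}


@dataclass
class DialogManager:
    """Module 130: tracks state (DST) and chooses a system action (DPL)."""
    state: dict = field(default_factory=dict)

    def step(self, semantics: dict) -> dict:
        self.state.update(semantics)  # dialog state tracking
        if semantics["intent"] == "find_restaurant":
            # Invoke data in the knowledge base to build the system action.
            return {"action": "inform", "location": KNOWLEDGE_BASE["restaurant"]}
        return {"action": "clarify"}


def natural_language_generation(action: dict) -> str:
    """Module 150 (NLG): express the system action as text."""
    if action["action"] == "inform":
        return f"The restaurant is on {action['location']}."
    return "Could you tell me more about what you need?"


def speech_synthesis(text: str) -> bytes:
    """Module 160 (TTS): encode the text back into 'speech data'."""
    return text.encode("utf-8")


def handle_request(speech_data: bytes, manager: DialogManager) -> bytes:
    """Run one user request through ASR, NLU, dialog management, NLG, and TTS."""
    text = speech_recognition(speech_data)
    semantics = natural_language_understanding(text)
    action = manager.step(semantics)
    return speech_synthesis(natural_language_generation(action))
```

Calling `handle_request(b"I want to find a restaurant", DialogManager())` walks the request through all five processing stages in the order the description gives.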
FIG. 2 is a flow chart of the steps of a data set construction method according to another embodiment of the present invention. The solution of the present embodiment may be applied to any suitable electronic device with data processing capability, including but not limited to a server, a mobile terminal (such as a mobile phone or tablet), a PC, and the like. The data set construction method of FIG. 2 includes:
S210: Construct a user intention set.
It should be understood that each user intention in the user intention set indicates information such as the type and content of the service that the user currently needs. The service types include, but are not limited to, information services related to the user's daily-life needs such as clothing and housing, for example weather services, restaurant reservation services, ticket booking services such as air ticket reservation services, attraction query services, and the like.
S220: Determine, from the feature data set, key data corresponding to each user intention in the user intention set.
It should be appreciated that, in one example, the key data may be user targets that include respective user intents, i.e., operational targets that indicate user intent. The feature data set may be generated based on a data set, such as a bilingual data set. For example, bilingual internet data may be crawled or internet published bilingual data may be collected, resulting in a bilingual data set. The bilingual data set may be text data of chapters, paragraphs, sentences, and the like. The data in the feature data set has a feature representing a user's intent.
S230: Generate a task-based dialog outline of the target user intention according to the key data corresponding to the target user intention among the user intentions.
It should be understood that the task-based dialog outline indicates the dialog interaction characteristics, dialog key contents, etc. between the system side and the user side in the task-based dialog system. The interactive dialog may be generated based on a mapping of predetermined patterns. The dialog key content may include or correspond to the key data described above, and the mapping may indicate a correspondence of the key data in the dialog content when generating the interactive dialog based on the mapping.
S240: Construct a task-based dialog data set according to the task-based dialog outline of the target user intention.
It should be understood that sentences can be improved or rewritten based on the dialog key content described above, so that actual dialog data closer to real-world expression is obtained and used as the task-based dialog data set.
It should also be understood that the application scenario of the task-based dialog system to be built from the task-based dialog data set can be determined in advance. The same task-based dialog outline can then be used to construct, for different application scenarios, dialog data sets in the language style or language mode of each scenario.
In the scheme of the embodiments of the invention, the key data corresponding to each user intention in the user intention set is determined from the feature data set, which establishes an accurate correspondence between the key data and the user intentions. An accurate task-based dialog outline is then generated efficiently from the key data corresponding to the target user intention, and the task-based dialog data set is constructed from that outline. This improves the efficiency of data set construction and reduces the labor cost and time cost of constructing a dialog data set.
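Steps S210 to S240 can be read as a four-stage pipeline. The sketch below is illustrative only and rests on assumptions: the record fields (`intent`, `slot`, `value`), the `find_` naming of intentions, and the `inform(...)`/`confirm(...)` outline notation are all invented here and do not come from the patent.

```python
# Toy feature data set; records and field names are invented for illustration.
FEATURE_DATA = [
    {"intent": "find_restaurant", "slot": "cuisine", "value": "Chinese"},
    {"intent": "find_restaurant", "slot": "price", "value": "medium"},
    {"intent": "find_attraction", "slot": "area", "value": "X Street"},
]


def build_intent_set(service_types):
    """S210: construct the user intention set from the needed service types."""
    return [f"find_{service}" for service in service_types]


def determine_key_data(intent_set, feature_data):
    """S220: map each user intention to its key data in the feature data set."""
    return {
        intent: [(r["slot"], r["value"]) for r in feature_data if r["intent"] == intent]
        for intent in intent_set
    }


def generate_outline(target_intent, key_data):
    """S230: build a dialog outline, one user turn and one system turn per key datum."""
    outline = []
    for slot, value in key_data[target_intent]:
        outline.append(("user", f"inform({slot}={value})"))
        outline.append(("system", f"confirm({slot}={value})"))
    return outline


def build_dialog_dataset(outline):
    """S240: wrap each outline turn as a dialog record awaiting a natural-language rewrite."""
    return [{"speaker": speaker, "outline": act, "utterance": None}
            for speaker, act in outline]
```

The `utterance` field is left empty on purpose: in the described scheme the outline is later rewritten into natural language, so this sketch stops at the point where that rewriting would begin.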
Other exemplary schemes that are possible on the basis of the above-described embodiment schemes will be described below.
In other examples, the data set construction method further comprises: collecting a bilingual data set; and processing the bilingual data set based on the knowledge base rule to obtain a characteristic data set. The knowledge base rules can reliably construct accurate association relations and implicit association relations among various data, so that the obtained feature data set can more reliably reflect intention information required in the task-based dialog.
In other examples, determining, from the feature data set, key data corresponding to each user intention in the user intention set includes: determining intention constraint conditions corresponding to all user intentions in the user intention set; from the feature data set, key data that meets the intention constraints of the respective user's intentions is determined. The intention constraint condition reflects the completeness of the intention, and the key data meeting the constraint condition can reflect the specific intention more accurately and reliably.
In other examples, the intention constraints include an information type and an information type attribute, and determining, from the feature data set, key data that meets the intention constraints of each user intention includes: determining, from the feature data set, key data corresponding to the information type and the information type attribute, the key data serving respectively as the information slot value of the information type and the attribute value of the information type attribute. Using information types and their slot values allows the intention constraints to be quantified better, which improves data processing efficiency. Furthermore, information type attributes capture deeper features and dependencies between information types and their information slot values, improving the accuracy of the intention constraints without substantially reducing data processing efficiency.
In other examples, the attribute value of the information type attribute indicates a value range for the information slot value of the information type, which further improves the accuracy with which the intention constraints are quantified.
In other examples, generating a task-based dialog outline of the target user intention according to the key data corresponding to the target user intention among the user intentions includes: acquiring at least one first feature sentence indicating the target user intention; and calling a service interface according to the key data of the intention constraints of the target user intention to obtain at least one second feature sentence indicating a return result of the service interface, wherein the service interface is constructed based on the intention constraints, and the at least one first feature sentence and the at least one second feature sentence form the dialog outline. Because the service interface is constructed based on the intention constraints, the dialog outline is generated much more efficiently. It should be understood that an interface set comprising a series of service interfaces may be constructed, and a target service interface may be selected from the interface set to generate the corresponding dialog outline. When the first feature sentence does not satisfy the intention constraints required by the target service interface, a second feature sentence that prompts further first feature sentences may be constructed, until the intention constraints of the target service interface are satisfied. At that point the resulting dialog outline meets the accuracy expected by the target service interface, and the second feature sentence serving as feedback on the target user intention is output from the target service interface.
In other examples, constructing a task-based dialog data set according to the task-based dialog outline of the target user intention includes: constructing candidate actual dialog data according to the task-based dialog outline of the target user intention; and determining the task-based dialog data set according to the dialog selection frequency in the candidate actual dialog data. Compared with the dialog outline, the candidate actual dialog data reflects a form of expression that users accept more readily, so the feedback data appears more intelligent. The dialog selection frequency indicates the degree of match between the dialog outline and the actual dialog data, so the resulting task-based dialog data set is both more acceptable to users and more accurate.
A method of constructing the dialog data set according to another embodiment will be described in detail below with reference to FIG. 3A and FIG. 3B. FIG. 3A illustrates an example of a constructed bilingual feature data set. The bilingual feature data set is constructed according to knowledge base rules and serves as a bilingual knowledge base. For example, a bilingual data set may be obtained by crawling bilingual internet data or by collecting bilingual data published on the internet. The bilingual data set may be text data such as chapters, paragraphs, and sentences.
Bilingual travel information published by the internet may be collected as a bilingual dataset, for example, the travel information may include data for subway stations, attractions, hotels, and restaurants. The travel information is processed to obtain the English feature data shown on the left side of FIG. 3A and the Chinese feature data shown on the right side.
The bilingual feature data described above can then be used to construct a dialog data set following the construction flow shown in FIG. 3B.
Specifically, a dialog outline may be generated based on the constructed series of service interfaces, and the dialog outline may then be used to generate a dialog data set indicating actual dialogs. For example, a target service interface may be selected from the interface set to generate the corresponding dialog outline. When the first feature sentence does not satisfy the intention constraints required by the target service interface, a second feature sentence that prompts further first feature sentences may be constructed, until the intention constraints of the target service interface are satisfied. At that point the resulting dialog outline meets the accuracy expected by the target service interface, and the second feature sentence serving as feedback on the target user intention is output from the target service interface. The service interfaces correspond to user service types, including but not limited to a query service, a reservation service, a weather service, an information service, and the like. For a travel scenario, the service interfaces include, but are not limited to, searching for restaurants, booking hotels, searching for attractions, searching for modes of transportation, searching for weather, and the like.
In one example, the target user intention is a restaurant reservation service, and the target service interface returns the restaurant to be reserved. The intention constraints corresponding to the target service interface include at least the restaurant type, rating, and price, and the information returned by the target service interface is the restaurant location. Suppose the target service interface obtains the first feature sentence "I want to find a restaurant. I do not want to eat Italian food." This first feature sentence indicates the restaurant type but does not yet provide the rating and price information, so the second feature sentence "Is there a requirement for the restaurant's rating?" is generated. The user's feedback is "No rating requirement; the restaurant price is medium." The intention constraints corresponding to the target service interface are now satisfied, and the restaurant location, X Street, can be returned through the target service interface, thereby generating the dialog outline. In addition, a second feature sentence asking whether a reservation is required may be generated. After the user's intention to reserve the restaurant is obtained, a query for the reservation time may be further generated, or the system may return a restaurant location that accepts reservations, or a qualifying restaurant location together with the corresponding reservation times, reservation vacancy information, and the like. The information in the intention constraints, the first feature sentence, and the second feature sentence may be bilingual text.
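The restaurant example can be sketched as a short simulation in which the system side keeps asking guiding questions until every intention constraint is filled. This is a hedged illustration: the slot names, the `request(...)`/`inform(...)` notation, and the hard-coded location are assumptions made for the sketch.

```python
# Intention constraints of the hypothetical restaurant service interface.
REQUIRED_SLOTS = ("type", "rating", "price")


def next_system_turn(known: dict) -> str:
    """Return a guiding question for the first missing slot, or the final result."""
    for slot in REQUIRED_SLOTS:
        if slot not in known:
            return f"request({slot})"      # second feature sentence: ask for more
    return "inform(location=X Street)"     # constraints satisfied: return location


def simulate(user_turns):
    """Interleave user turns (first feature sentences) with system turns."""
    known, outline = {}, []
    for turn in user_turns:
        known.update(turn)
        outline.append(("user", dict(turn)))
        outline.append(("system", next_system_turn(known)))
    return outline


outline = simulate([
    {"type": "not Italian"},                 # "I do not want to eat Italian food."
    {"rating": "any", "price": "medium"},    # "No rating requirement; price is medium."
])
```

After the first user turn only the restaurant type is known, so the system requests the rating; after the second turn all three slots are filled and the location is returned, which mirrors the order of the sentences in the example.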
The logic followed to generate the dialog outline will be described in detail below.
A dialog outline may be generated by a dialog simulator based on a predetermined pattern. The dialog simulator may return second key data based on first key data from the bilingual feature data set. For example, "restaurant" and "non-Italian food" are examples of first key data, and "X Street" is an example of second key data.
In the flow of steps shown in FIG. 3B, two dialog simulators may be configured to generate a dialog outline, one for simulating the user side in a task-based dialog and the other for simulating the system side in a task-based dialog.
In step S310, first key data from the bilingual feature data set may be acquired as a search option.
In step S320, an intention constraint condition may be determined according to the first key data, and a service interface corresponding to the intention constraint condition may be called to perform a search process based on a search option.
In step S330, the service interface returns a search result matching the search option.
In step S340, it is determined whether the search results sufficiently meet the intention constraints. If so, the flow proceeds directly to step S350, returns the second key data, and ends. If not, the flow returns to step S310 and is executed again to obtain additional search results, until the search results are sufficient to meet the intention constraints.
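The loop formed by steps S310 to S350 can be sketched as follows. The restaurant records, the `REQUIRED` slot set, and the rule used in S340 (all required slots present and at least one match) are assumptions made for illustration; the patent does not fix how sufficiency is judged.

```python
# Toy knowledge-base records; all names and values are invented for illustration.
RESTAURANTS = [
    {"name": "A", "cuisine": "Chinese", "price": "medium", "location": "X Street"},
    {"name": "B", "cuisine": "Italian", "price": "medium", "location": "Y Road"},
    {"name": "C", "cuisine": "Chinese", "price": "high", "location": "Z Avenue"},
]
REQUIRED = {"cuisine", "price"}  # assumed intention constraints of the interface


def search_interface(options: dict) -> list:
    """S320/S330: call the service interface and return the matching records."""
    return [r for r in RESTAURANTS if all(r[k] == v for k, v in options.items())]


def run_flow(option_stream):
    """Feed search options one at a time until the constraints are satisfied."""
    options = {}
    for slot, value in option_stream:          # S310: acquire first key data
        options[slot] = value
        results = search_interface(options)    # S320 and S330
        if REQUIRED.issubset(options) and results:       # S340: sufficient?
            return [r["location"] for r in results]      # S350: second key data
    return None  # the stream ended before the constraints were satisfied
```

With the options `("cuisine", "Chinese")` followed by `("price", "medium")`, the first pass fails the S340 check because the price is missing, and the second pass returns the location of the single matching record.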
In other words, since the data in S310 and S350 actually represent important information in a dialog, the two dialog simulators can generate the subsequent dialog outline 3200 based on the body corresponding to the data in section 3100.
It should be understood that the key data may be implemented as a triple; that is, the key data may be abstracted into an information slot indicating the information type, a slot value of the information slot, and an information type attribute. The slot value indicates the information content, and the information type attribute may indicate the value range of the slot value. In one example, if the key data used as the search option is the rating of the restaurant to be reserved, the information type is the reserved-restaurant rating, the slot value is 4, and the information type attribute is "at least". That is, the triple indicates that the restaurant to be reserved is rated at least 4 points. As another example, to simulate more diverse user goals, the triple can be viewed as an information slot, a relationship, and a value, which distinguishes key data more clearly. For example, in a restaurant search scenario, a user who wants Chinese food (cuisine, equal_to, Chinese) and a user who does not want Chinese food (cuisine, not, Chinese) can be distinguished and represented by different relationships.
Thus, using information types and their slot values allows the intention constraints to be quantified better, which improves data processing efficiency. Furthermore, information type attributes capture deeper features and dependencies between information types and their information slot values, improving the accuracy of the intention constraints without substantially reducing data processing efficiency. Because the attribute value of the information type attribute indicates the value range of the information slot value of the information type, the accuracy with which the intention constraints are quantified is further improved.
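The triple representation lends itself to a small data structure. In the sketch below, the class name `KeyDatum`, the `at_least` spelling, and the `RELATIONS` table are hypothetical; only the relation names `equal_to` and `not` and the example values appear in the text.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Each relationship name maps to a predicate over (actual value, wanted value).
# "at_least" is an assumed spelling of the "at least" attribute in the text.
RELATIONS: Dict[str, Callable[[Any, Any], bool]] = {
    "equal_to": lambda actual, wanted: actual == wanted,
    "not": lambda actual, wanted: actual != wanted,
    "at_least": lambda actual, wanted: actual >= wanted,
}


@dataclass(frozen=True)
class KeyDatum:
    """Key data as a triple: information slot, relationship, value."""
    slot: str
    relation: str
    value: Any

    def matches(self, record: dict) -> bool:
        """Check whether a knowledge-base record satisfies this triple."""
        return self.slot in record and RELATIONS[self.relation](record[self.slot], self.value)


wants_chinese = KeyDatum("cuisine", "equal_to", "Chinese")
avoids_chinese = KeyDatum("cuisine", "not", "Chinese")
rated_four_plus = KeyDatum("rating", "at_least", 4)
```

Keeping the relationship as a separate field is what lets the two cuisine goals share the same slot and value while still selecting opposite sets of records.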
It should also be appreciated that for a recommended service, given a service interface, multiple result items may be queried that satisfy the condition. Recommendations may be made based on information such as user scores, and ranking the plurality of result items based on user scores may reduce the difficulty of recommending returned results for the service interface.
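A minimal sketch of score-based ranking for a recommendation follows. The `score` field name and the `top_k` parameter are assumptions; the text only states that result items may be ranked by user score.

```python
def recommend(results, top_k=1):
    """Rank result items by user score, highest first, and keep the top_k items."""
    return sorted(results, key=lambda r: r["score"], reverse=True)[:top_k]


candidates = [
    {"name": "A", "score": 4.2},
    {"name": "B", "score": 4.8},
    {"name": "C", "score": 3.9},
]
best = recommend(candidates)
```

Because `sorted` is stable, items with equal scores keep their original order, so the recommendation is deterministic for a given result list.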
It should also be understood that some result items in the knowledge base contain mixed-language information; for example, some restaurants have only English names. The data set of the embodiments of the invention provides dedicated conversion sentences, or the capability to produce them, so that cross-language entities can be handled better when the service interface is called for a query. For a restaurant that has only an English name, the names of such restaurant, hotel, or attraction entities can also be displayed flexibly in mixed-language form during Chinese dialog interaction.
It should also be appreciated that in the event that the values of some slots may not be fully specified, the present scheme may inherit the portion of the slot values from the previously invoked service interface results. In the case where the user has previously reserved a restaurant and then wants to recommend an attraction next to the restaurant, the location of the attraction is not specified, but can be inherited through the location of the restaurant just reserved.
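Slot inheritance can be sketched as filling unspecified slots from the previous service interface result. The function name `fill_slots` and the `inheritable` parameter are hypothetical, and treating `None` as "not specified" is an assumption of this sketch.

```python
def fill_slots(requested: dict, previous_result: dict, inheritable=("location",)) -> dict:
    """Fill slots left unspecified (None or absent) from the previous interface result."""
    filled = dict(requested)
    for slot in inheritable:
        if filled.get(slot) is None and slot in previous_result:
            filled[slot] = previous_result[slot]
    return filled


reserved_restaurant = {"name": "A", "location": "X Street"}
attraction_query = {"type": "attraction", "location": None}
completed = fill_slots(attraction_query, reserved_restaurant)
```

Restricting inheritance to an explicit list of slots keeps unrelated fields, such as the restaurant name, from leaking into the attraction query.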
Then, an actual dialog data set 3300 may be generated from the dialog outline 3200.
The construction from the dialog outline 3200 to the final dialog data set 3300 will be described in detail below.
Alternative actual dialogue data can be constructed according to the task-based dialog outline of the target user intention, and the task-based dialog data set is then determined according to the dialogue selection frequency in the alternative actual dialogue data. Specifically, the dialog outline can be converted, through crowdsourced manual annotation, into natural language dialogs that users accept more readily. The rewriting can be performed sentence by sentence according to the content of the dialog outline. After the rewriting is finished, an annotator can read through the rewritten dialog and answer some questions to obtain the dialogue selection frequency. The dialogue selection frequency indicates authenticity and validity: the more frequently a piece of alternative actual dialogue data is selected, the higher its authenticity and/or validity. Examples of such questions are: "Does this dialog look like a dialog between a user and a professional human assistant?" and "Does this dialog have the same meaning as the original dialog, and is the dialog itself reasonable and fluent?" The first question measures the authenticity of the rewritten dialog, and the second measures its validity.
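A minimal sketch of filtering by dialogue selection frequency follows. It assumes each candidate dialog carries the annotators' yes/no answers to the two questions, and it keeps a dialog only when both frequencies reach a threshold; the data layout, the threshold of 0.5, and the requirement that both questions pass are assumptions of this sketch rather than details given in the embodiment.

```python
from typing import Dict, List


def selection_frequency(votes: List[bool]) -> float:
    """Fraction of annotators who answered yes; 0.0 when there are no votes."""
    return sum(votes) / len(votes) if votes else 0.0


def select_dialogs(
    candidates: Dict[str, Dict[str, List[bool]]], threshold: float = 0.5
) -> List[str]:
    """Keep the ids of dialogs judged both authentic and valid often enough."""
    kept = []
    for dialog_id, answers in candidates.items():
        authentic = selection_frequency(answers["authentic"])
        valid = selection_frequency(answers["valid"])
        if authentic >= threshold and valid >= threshold:
            kept.append(dialog_id)
    return kept


# Hypothetical annotator answers for two rewritten dialogs.
candidates = {
    "d1": {"authentic": [True, True, False], "valid": [True, True, True]},
    "d2": {"authentic": [False, False, True], "valid": [True, True, True]},
}
selected = select_dialogs(candidates)
```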
Fig. 3C is a diagram illustrating a task-based dialog method according to another embodiment of the present invention. The task-based dialog method of fig. 3C includes:
S1100: acquiring a task-based dialog request of a user through a human-computer interaction interface, wherein the task-based dialog request indicates the target user intention.
S1200: sending the task-based dialog request to a task-based dialog system, wherein the task-based dialog system processes the target user intention using a machine learning model to obtain a task-based dialog response, and the machine learning model is trained on a task-based dialog data set constructed by the data set construction method.
S1300: receiving the task-based dialog response sent by the task-based dialog system.
S1400: presenting the task-based dialog response to the user through the human-computer interaction interface.
In the scheme of the embodiment, the machine learning model is obtained by training the task-based dialogue data set constructed by the data set construction method, so that when the machine learning model processes the intention of the target user, accurate task-based dialogue response can be obtained.
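Steps S1100 to S1400 can be sketched as a single dialog turn on the device side. The class `StubDialogSystem` is a hypothetical stand-in for the back-end task-based dialog system and simply echoes the request; a real system would run the trained machine learning model instead.

```python
from typing import Callable, List


class StubDialogSystem:
    """Hypothetical stand-in for the back-end task-based dialog system."""

    def respond(self, request_text: str) -> str:
        # A real system would process the user intention with the trained model.
        return f"[response to] {request_text}"


def run_dialog_turn(
    read_input: Callable[[], str],
    system: StubDialogSystem,
    show: Callable[[str], None],
) -> str:
    request = read_input()              # S1100: acquire the request from the interface
    response = system.respond(request)  # S1200 and S1300: send the request, receive the response
    show(response)                      # S1400: present the response to the user
    return response


# The "interface" here is a fixed input and a list that records what was shown.
shown: List[str] = []
reply = run_dialog_turn(lambda: "Book a table for two", StubDialogSystem(), shown.append)
```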
In particular, the task-based dialog method of fig. 3C will be described in conjunction with the task-based dialog system of fig. 1.
It should be understood that the task-based dialog method of fig. 3C may be applied to electronic devices such as smart devices, including embedded devices and Internet of Things devices, for example: smart home devices such as smart doorbells and smart speakers; smart office devices; wearable devices such as smart watches, smart glasses, and smart bracelets; and smart terminals such as smartphones and tablet computers. It should also be understood that the constructed data set can be used to train a machine learning model, which can be an end-to-end model. The input of the machine learning model can be dialog text in the task-based dialog data set, or other data based on the dialog text, for example, speech data corresponding to the dialog text. Likewise, the output of the machine learning model may be dialog text in the task-based dialog data set, or other data based on the dialog text, such as speech data corresponding to the dialog text. As one example, the input and output of the machine learning model may both be dialog text or may both be speech data; as another example, one of the input and output may be dialog text and the other may be speech data. It should also be understood that the machine learning model may be trained or run for inference using a machine learning software framework such as TensorFlow together with computing hardware such as a GPU, for example, to train on the task-based dialog data set or to conduct a task-based dialog.
In one example, the machine learning model may be deployed as a dialog manager 130. At this time, the input and output of the pre-trained machine learning model are dialog text, and the natural speech understanding module 120 and the natural language generation module 150 may be configured processing logic or other machine learning models.
In another example, the machine learning model may function as the speech recognition module 110, the natural speech understanding module 120, the dialog manager 130, the natural language generation module 150, and the speech synthesis module 160. In this case, the input and output of the pre-trained machine learning model are both speech data.
Specifically, the human-computer interaction interface of the smart device can acquire a task-based dialog request of the user. The task-based dialog request indicates the target user intention and can be a voice instruction, a text instruction, a touch instruction corresponding to a text instruction, or the like. It should be understood that voice instructions, text instructions, and touch instructions all correspond to dialog text; in other words, the dialog text indicates the target user intention. In addition, the smart device can receive a task-based dialog response from the task-based dialog system, and the human-computer interaction interface can present the task-based dialog response to the user in multimedia forms such as text display, voice response, and image response.
In one example, the human-computer interaction interface acquires a voice instruction comprising voice data and sends the voice data to a task-based dialog system deployed at the back end. The task-based dialog system processes the voice data through an end-to-end machine learning model to obtain a dialog response comprising voice data and returns the dialog response to the smart device at the front end, which then presents the dialog response to the user through the human-computer interaction interface. It should be appreciated that although the dialog response is voice data, the human-computer interaction interface may still present it in a multimedia form other than voice. For example, to acquire text information or image information matching the voice data, the smart device may perform local data processing to obtain image information or text information matching the voice response, or the smart device may send the voice data to a back-end server to obtain returned image information or text information. The back-end server may be the same as or different from the server used by the task-based dialog system.
In another example, the speech recognition module 110 and the speech synthesis module 160 may be modules in the smart device. In other words, the smart device acquires voice data through the human-computer interaction interface, converts it into dialog text, and transmits the dialog text to the dialog manager 130 deployed at the back end; the machine learning model in the dialog manager 130 processes the dialog text and returns a response text to the smart device. The smart device may then process the response text to obtain corresponding image information or voice data for multimedia presentation. Alternatively, the smart device may send the dialog text or the response text to the back-end server to obtain returned voice data or image information; after the human-computer interaction interface acquires the voice data, the smart device may also send the voice data to the back-end server for processing to obtain the returned dialog text. Alternatively, the dialog manager 130 (deployed on a first server) may send the response text directly to a second server for multimedia presentation processing, and the second server returns the multimedia presentation data corresponding to the response text to the smart device. In addition, the second server can be connected to other databases and can fuse the response text with relevant information collected from those databases to obtain the multimedia presentation data, which it returns to the smart device. The human-computer interaction interface of the smart device then renders the multimedia presentation data for multimedia presentation.
Fig. 4 is a block diagram of a data set constructing apparatus according to another embodiment of the present invention. The solution of the present embodiment may be applied to any suitable electronic device with data processing capability, including but not limited to: server, mobile terminal (such as mobile phone, PAD, etc.), PC, etc. The data set constructing apparatus of fig. 4 includes:
a first construction module 410, configured to construct a user intention set;
a determining module 420, configured to determine, from the feature data set, key data corresponding to each user intention in the user intention set;
a generating module 430, configured to generate a task-based dialog outline of the target user intention according to the key data corresponding to the target user intention among the user intentions; and
a second construction module 440, configured to construct a task-based dialog data set according to the task-based dialog outline of the target user intention.
In the scheme of the embodiment of the invention, the key data corresponding to each user intention in the user intention set is determined from the feature data set, which establishes an accurate correspondence between the key data and the user intentions. An accurate task-based dialog outline is then generated efficiently from the key data corresponding to the target user intention, and the task-based dialog data set is constructed from that outline. This improves the efficiency of data set construction and reduces the labor and time costs of constructing a dialog data set.
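The cooperation of the four modules can be sketched as one pipeline function. The three callables passed in (`match`, `make_outline`, `rewrite`) are hypothetical placeholders for the determining, generating, and second construction modules; the toy implementations below exist only to make the sketch runnable.

```python
from typing import Callable, Dict, List


def build_dataset(
    intents: List[str],
    feature_data: List[Dict[str, str]],
    match: Callable[[str, Dict[str, str]], bool],
    make_outline: Callable[[str, List[Dict[str, str]]], List[str]],
    rewrite: Callable[[List[str]], List[str]],
) -> List[List[str]]:
    """Run determine -> generate -> construct for every intent in the intention set."""
    dataset = []
    for intent in intents:                                        # user intention set
        key_data = [r for r in feature_data if match(intent, r)]  # determining module
        outline = make_outline(intent, key_data)                  # generating module
        dataset.append(rewrite(outline))                          # second construction module
    return dataset


# Toy stand-ins for the three modules (assumptions for illustration only).
feature_data = [{"domain": "restaurant", "name": "Example Bistro"}]
dataset = build_dataset(
    intents=["restaurant"],
    feature_data=feature_data,
    match=lambda intent, record: record["domain"] == intent,
    make_outline=lambda intent, data: [f"inform {intent}", f"offer {data[0]['name']}"],
    rewrite=lambda outline: [line.capitalize() for line in outline],
)
```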
In other examples, the apparatus further comprises an acquisition module configured to: acquire a bilingual data set; and process the bilingual data set based on a knowledge base rule to obtain the feature data set.
In other examples, the determining module is specifically configured to: determining intention constraint conditions corresponding to all user intentions in the user intention set; and determining key data meeting the intention constraint condition of each user intention from the characteristic data set.
In other examples, the intent constraint includes an information type and an information type attribute, and the determination module is specifically configured to: and determining key data corresponding to the information type and the information type attribute from a feature data set, wherein the key data are respectively used as an information slot value of the information type and an attribute value of the information type attribute.
In other examples, the attribute value of the information type attribute indicates a value range of an information slot value of the information type.
In other examples, the generation module is specifically configured to: obtaining at least one first feature statement indicating the target user intent; calling the service interface according to key data of an intention constraint condition of the target user intention to obtain at least one second characteristic statement indicating a return result of the service interface, wherein the service interface is constructed based on the intention constraint condition, and the at least one first characteristic statement and the at least one second characteristic statement form the dialog outline.
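The generation of a dialog outline can be sketched as follows: a first feature statement encodes the target user intention, and a second feature statement encodes the result returned by a service interface that is called with the key data of the intent constraints. The in-memory `KNOWLEDGE_BASE`, the function `restaurant_search`, the constraint names `area` and `food`, and the statement format are all hypothetical.

```python
from typing import Dict, List

# Hypothetical knowledge base standing in for the data behind the service interface.
KNOWLEDGE_BASE = [
    {"name": "Example Bistro", "area": "centre", "food": "italian"},
    {"name": "Sample Noodles", "area": "north", "food": "chinese"},
]


def restaurant_search(area: str, food: str) -> List[Dict[str, str]]:
    """Hypothetical service interface built from the intent constraints (area, food)."""
    return [r for r in KNOWLEDGE_BASE if r["area"] == area and r["food"] == food]


def build_outline(intent: str, key_data: Dict[str, str]) -> List[str]:
    """Return [first feature statement, second feature statement]."""
    slots = " ".join(f"{k}={v}" for k, v in sorted(key_data.items()))
    first = f"USER intent={intent} {slots}"
    results = restaurant_search(**key_data)
    if results:
        second = "SYSTEM result=" + ", ".join(r["name"] for r in results)
    else:
        second = "SYSTEM result=none"
    return [first, second]


outline = build_outline("find_restaurant", {"area": "centre", "food": "italian"})
```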
In other examples, the second construction module is specifically configured to: construct alternative actual dialogue data according to the task-based dialog outline of the target user intention; and determine the task-based dialog data set according to the dialogue selection frequency in the alternative actual dialogue data.
The apparatus of this embodiment is used to implement the corresponding method in the foregoing method embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described herein again. In addition, the functional implementation of each module in the apparatus of this embodiment can refer to the description of the corresponding part in the foregoing method embodiments, and is not repeated herein.
Referring to fig. 5, a schematic structural diagram of an electronic device according to another embodiment of the present invention is shown, and the specific embodiment of the present invention does not limit the specific implementation of the electronic device.
As shown in fig. 5, the electronic device may include: a processor 502, a communication interface 504, a memory 506, and a communication bus 508.
Wherein:
the processor 502, communication interface 504, and memory 506 communicate with one another via a communication bus 508.
A communication interface 504 for communicating with other electronic devices or servers.
The processor 502 is configured to execute the program 510, and may specifically perform the relevant steps in the above method embodiments.
In particular, program 510 may include program code that includes computer operating instructions.
The processor 502 may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement an embodiment of the present invention. The smart device comprises one or more processors, which can be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 506 is used for storing the program 510. The memory 506 may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory.
The program 510 may be specifically configured to cause the processor 502 to perform the following operations: constructing a user intention set; determining key data corresponding to each user intention in the user intention set from a characteristic data set; generating a task type dialog outline of the target user intention according to key data corresponding to the target user intention in the user intentions; and constructing a task-based dialog data set according to the task-based dialog outline intended by the target user.
Alternatively, the program 510 may be specifically configured to cause the processor 502 to perform the following operations: acquiring a task-based dialogue request of a user through a human-computer interaction interface, wherein the task-based dialogue request indicates the intention of a target user; sending the task type dialogue request to a task type dialogue system, wherein the task type dialogue system processes the intention of the target user by using a machine learning model to obtain a task type dialogue response, and the machine learning model is obtained by training a task type dialogue data set constructed according to a data set construction method; receiving a task-based dialog response sent by the task-based dialog system; and displaying the task type dialogue response to a user through the man-machine interaction interface.
In addition, for specific implementation of each step in the program 510, reference may be made to corresponding steps and corresponding descriptions in units in the foregoing method embodiments, which are not described herein again. It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described devices and modules may refer to the corresponding process descriptions in the foregoing method embodiments, and are not described herein again.
It should be noted that, according to the implementation requirement, each component/step described in the embodiment of the present invention may be divided into more components/steps, and two or more components/steps or partial operations of the components/steps may also be combined into a new component/step to achieve the purpose of the embodiment of the present invention.
The above-described methods according to embodiments of the present invention may be implemented in hardware or firmware, or as software or computer code storable in a recording medium such as a CD-ROM, a RAM, a floppy disk, a hard disk, or a magneto-optical disk, or as computer code that is originally stored in a remote recording medium or a non-transitory machine-readable medium, downloaded through a network, and stored in a local recording medium, so that the methods described herein can be carried out by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware such as an ASIC or FPGA. It will be appreciated that a computer, processor, microprocessor controller, or programmable hardware includes memory components (e.g., RAM, ROM, flash memory, etc.) that can store or receive software or computer code that, when accessed and executed by the computer, processor, or hardware, implements the methods described herein. Further, when a general-purpose computer accesses code for implementing the methods illustrated herein, execution of the code transforms the general-purpose computer into a special-purpose computer for performing the methods illustrated herein.
Those of ordinary skill in the art will appreciate that the various illustrative elements and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present embodiments.
The above embodiments are only for illustrating the embodiments of the present invention and not for limiting the embodiments of the present invention, and those skilled in the art can make various changes and modifications without departing from the spirit and scope of the embodiments of the present invention, so that all equivalent technical solutions also belong to the scope of the embodiments of the present invention, and the scope of patent protection of the embodiments of the present invention should be defined by the claims.

Claims (8)

1. A data set construction method, comprising:
constructing a user intention set;
determining intention constraint conditions corresponding to all user intentions in the user intention set;
determining key data meeting intention constraint conditions of the user intentions from the feature data set;
acquiring at least one first feature sentence indicating the intention of a target user;
calling a service interface according to key data of an intention constraint condition of the intention of the target user to obtain at least one second characteristic statement indicating a return result of the service interface, wherein the service interface is constructed based on the intention constraint condition, and the at least one first characteristic statement and the at least one second characteristic statement form a task-based dialog outline of the intention of the target user;
and constructing a task-based dialog data set according to the task-based dialog outline intended by the target user.
2. The method of claim 1, wherein the method further comprises:
collecting a bilingual data set;
and processing the bilingual data set based on a knowledge base rule to obtain the characteristic data set.
3. The method of claim 1, wherein the intent constraint comprises an information type and an information type attribute,
determining key data meeting the intention constraint condition of each user intention from the characteristic data set, wherein the key data comprises the following steps:
and determining key data corresponding to the information type and the information type attribute from a feature data set, wherein the key data are respectively used as an information slot value of the information type and an attribute value of the information type attribute.
4. The method of claim 3, wherein the attribute value of the information type attribute indicates a range of values of an information slot value of the information type.
5. The method of claim 1, wherein constructing a task-based dialog data set from the task-based dialog outline of the target user intent comprises:
constructing alternative actual dialogue data according to the task-type dialogue outline intended by the target user;
and determining the task-based dialogue data set according to dialogue selection frequency in the alternative actual dialogue data.
6. A task-based dialog method, comprising:
acquiring a task-based dialogue request of a user through a human-computer interaction interface, wherein the task-based dialogue request indicates the intention of a target user;
sending the task type dialogue request to a task type dialogue system, wherein the task type dialogue system processes the intention of the target user by using a machine learning model to obtain a task type dialogue response, and the machine learning model is obtained by training a task type dialogue data set constructed according to the method of any one of claims 1-5;
receiving a task-based dialog response sent by the task-based dialog system;
and displaying the task type dialogue response to a user through the man-machine interaction interface.
7. An electronic device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus; the memory is used for storing at least one executable instruction, which causes the processor to perform the operations corresponding to the method according to any one of claims 1-6.
8. A computer storage medium having stored thereon a computer program which, when executed by a processor, carries out the method of any one of claims 1-6.
CN202111421284.8A 2021-11-26 2021-11-26 Data set construction and task-based dialogue method, electronic device and storage medium Active CN114255750B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111421284.8A CN114255750B (en) 2021-11-26 2021-11-26 Data set construction and task-based dialogue method, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111421284.8A CN114255750B (en) 2021-11-26 2021-11-26 Data set construction and task-based dialogue method, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN114255750A CN114255750A (en) 2022-03-29
CN114255750B true CN114255750B (en) 2022-09-27

Family

ID=80793383

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111421284.8A Active CN114255750B (en) 2021-11-26 2021-11-26 Data set construction and task-based dialogue method, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN114255750B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115064170B (en) * 2022-08-17 2022-12-13 广州小鹏汽车科技有限公司 Voice interaction method, server and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10904169B2 (en) * 2017-08-08 2021-01-26 International Business Machines Corporation Passing chatbot sessions to the best suited agent
CN111966803B (en) * 2020-08-03 2024-04-12 深圳市欢太科技有限公司 Dialogue simulation method and device, storage medium and electronic equipment
CN112527969B (en) * 2020-12-22 2022-11-15 上海浦东发展银行股份有限公司 Incremental intention clustering method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN114255750A (en) 2022-03-29

Similar Documents

Publication Publication Date Title
CN109791550B (en) Generating contextual search suggestions
CN107818781B (en) Intelligent interaction method, equipment and storage medium
CN110111780B (en) Data processing method and server
CN110223695B (en) Task creation method and mobile terminal
US20230024457A1 (en) Data Query Method Supporting Natural Language, Open Platform, and User Terminal
CN109710935B (en) Museum navigation and knowledge recommendation method based on cultural relic knowledge graph
US10713288B2 (en) Natural language content generator
US11410643B2 (en) Response generation for conversational computing interface
CN116127020A (en) Method for training generated large language model and searching method based on model
CN114595686B (en) Knowledge extraction method, and training method and device of knowledge extraction model
CN112084315A (en) Question-answer interaction method, device, storage medium and equipment
CN116501960B (en) Content retrieval method, device, equipment and medium
CN114255750B (en) Data set construction and task-based dialogue method, electronic device and storage medium
CN117932022A (en) Intelligent question-answering method and device, electronic equipment and storage medium
CN112784024B (en) Man-machine conversation method, device, equipment and storage medium
CN110249326B (en) Natural language content generator
CN116662495A (en) Question-answering processing method, and method and device for training question-answering processing model
CN110705308A (en) Method and device for recognizing field of voice information, storage medium and electronic equipment
CN112069267A (en) Data processing method and device
CN115440223A (en) Intelligent interaction method and device, robot and computer readable storage medium
JP4795452B2 (en) Search system and search program
CN114020245A (en) Page construction method and device, equipment and medium
CN113505293A (en) Information pushing method and device, electronic equipment and storage medium
US10649739B2 (en) Facilitating application development
CN111046161A (en) Intelligent dialogue method and device for commodity marketing scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 554, 5 / F, building 3, 969 Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province

Applicant after: Alibaba (China) Co.,Ltd.

Address before: Room 508, 5 / F, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant before: Alibaba (China) Co.,Ltd.

GR01 Patent grant