CN110209446B - Method and device for configuring combined slot in man-machine conversation system - Google Patents

Info

Publication number
CN110209446B
Authority
CN
China
Prior art keywords
slot
bot platform
user
training
entity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910330314.0A
Other languages
Chinese (zh)
Other versions
CN110209446A (en
Inventor
张晴
胡仁林
刘畅
张轶博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201910330314.0A priority Critical patent/CN110209446B/en
Publication of CN110209446A publication Critical patent/CN110209446A/en
Priority to PCT/CN2020/085234 priority patent/WO2020216134A1/en
Application granted granted Critical
Publication of CN110209446B publication Critical patent/CN110209446B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application provides a method and a device for configuring a combined slot in a man-machine conversation system, relating to the technical field of artificial intelligence (AI). Slot extraction remains possible even when a user utterance expresses the entity types of a user-configured combined slot in a different order. The specific scheme is as follows: a Bot platform receives a first slot configured by a user on a first interface (that is, an interface for setting a slot for a first intent in a first skill of the Bot platform); the first slot is a combined slot comprising N entity types arranged in an order set by the user, where N is a positive integer and N ≥ 2; the Bot platform recombines the N entity types to obtain M second slots, where the M second slots comprise the slots obtained by arranging k of the N entity types in any order, k ∈ {1, 2, ..., N}; and the Bot platform performs training according to one or more training corpora and the M second slots, so that the Bot platform is able to extract the M second slots from user utterances.

Description

Method and device for configuring combined slot in man-machine conversation system
Technical Field
The embodiment of the application relates to the technical field of Artificial Intelligence (AI), in particular to a configuration method and a configuration device of a combined slot position in a man-machine conversation system.
Background
AI is an indispensable technology for implementing artificial intelligence. Human-computer interaction (e.g., human-computer voice interaction) is a common commercialized AI capability, and a product offering it may be referred to as a chat robot (ChatBot, Bot for short). ChatBots can be divided into two categories: open-domain (Open Domain) chat products and task-oriented (Task Oriented) chat products. A task-oriented chat product is a man-machine interaction product oriented to a single task such as "booking an air ticket", "ordering a meal", or "inquiring about the weather".
On the Bot platform of a task-oriented chat product, the user can configure one or more skills (Skill), intents (Intent), and slots (Slot) to realize human-computer interaction. For example, a skill may be "order", "buy ticket", or the like. Each skill may include one or more intents; for example, the skill "order" may include intents such as "buy hamburgers" and "buy drinks". Each intent may include one or more slots. For example, the intent "buy hamburgers" may include the slot "@quantity" and the slot "@hamburger type".
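The skill, intent, and slot hierarchy described above can be pictured as a small nested data structure. The following is only a sketch: the class names `Slot`, `Intent`, and `Skill` and their fields are assumptions made for illustration and are not taken from any actual Bot platform.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Slot:
    # Name of the slot as written in the text, e.g. "@quantity".
    name: str


@dataclass
class Intent:
    name: str
    slots: List[Slot] = field(default_factory=list)


@dataclass
class Skill:
    name: str
    intents: List[Intent] = field(default_factory=list)


# The "order" skill from the text, with two intents; only the first
# intent has slots configured in this illustration.
order = Skill(
    name="order",
    intents=[
        Intent("buy hamburgers", [Slot("@quantity"), Slot("@hamburger type")]),
        Intent("buy drinks"),
    ],
)

slot_names = [slot.name for slot in order.intents[0].slots]
```

Here `slot_names` lists the slots configured for the intent "buy hamburgers".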
Referring to FIG. 1, taking human-machine voice interaction as an example, a task-oriented voice interaction process may include: (1) The microphone of the Bot collects voice data 101. (2) The automatic speech recognition (ASR) module performs speech recognition on the voice data 101 and obtains the corresponding text information 1. (3) A natural language understanding (NLU) module determines the intent expressed by the text information 1. The intent identifies the type of event that the text information 1 instructs the Bot platform to perform. For example, assuming that the text information 1 is "I want one spicy chicken leg burger", the NLU module can derive from the text information 1 that the intent is "buy hamburger", that is, the text information 1 instructs the Bot to buy a hamburger for the user. (4) After the intent is determined, the NLU module parses the text information 1 to extract the core content, that is, the slots. In the above example, the NLU module may extract the slot "@quantity" = "one" and the slot "@hamburger type" = "spicy chicken leg burger". The symbol @ merely marks an entity type and has no other meaning. (5) The Bot platform responds to the user according to the extracted slots.
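The five-step flow above can be sketched end to end with stand-in functions. Everything in this sketch is an assumption made for illustration: `recognize_speech`, `understand`, and `respond` are hypothetical names, the "ASR" step simply decodes bytes, and the "NLU" step uses hard-coded rules where a real system would use trained models.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class NluResult:
    intent: str
    slots: Dict[str, str]


def recognize_speech(voice_data: bytes) -> str:
    # Stand-in for the ASR module; a real system would decode audio here.
    return voice_data.decode("utf-8")


def understand(text: str) -> NluResult:
    # Stand-in for the NLU module: toy rules instead of a trained model.
    intent = "buy hamburger" if "burger" in text else "unknown"
    slots: Dict[str, str] = {}
    if intent == "buy hamburger":
        if "one" in text.split():
            slots["@quantity"] = "one"
        if "spicy chicken leg burger" in text:
            slots["@hamburger type"] = "spicy chicken leg burger"
    return NluResult(intent, slots)


def respond(result: NluResult) -> str:
    # Stand-in for the response step, driven by the extracted slots.
    if result.intent == "buy hamburger":
        quantity = result.slots.get("@quantity", "?")
        burger = result.slots.get("@hamburger type", "?")
        return f"Ordering {quantity} x {burger}."
    return "Sorry, I did not understand."


text = recognize_speech(b"I want one spicy chicken leg burger")
result = understand(text)
reply = respond(result)
```

The point of the sketch is the order of the stages: the intent is determined first, and slot extraction happens only within that intent.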
Currently, a Bot platform can define slots that include only one entity type, such as the slot "@quantity" and the slot "@hamburger type"; it can also support combined slots. A combined slot is a slot that includes at least two entity types. For example, the user may combine the two slots above into one combined slot "@quantity @hamburger type", which includes the two entity types '@quantity' and '@hamburger type'. The NLU module can then extract from the text information 1: the slot "@quantity @hamburger type" = "one spicy chicken leg burger".
However, the related art only supports extracting a combined slot from text information that follows the configured combination order. If the text information expresses the entity types of the user-configured combined slot in a different order, the Bot platform cannot recognize it. For example, assume that the combined slot "@quantity @hamburger type" is defined in the Bot platform for the intent "buy hamburger". The order of the entity types in the combined slot is '@quantity' first and '@hamburger type' second. The Bot platform can then only recognize and respond to text information such as "I want one spicy chicken leg burger" or "give me two crispy cod burgers", in which '@quantity' comes before '@hamburger type'. When the text information places '@hamburger type' before '@quantity', for example "I want crispy cod burgers, two portions" or "spicy chicken leg burger, one portion", the Bot cannot extract the corresponding slot.
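The limitation can be illustrated with a deliberately simple fixed-order matcher. This is not how the related art is implemented; it is a regular-expression sketch, with a hypothetical vocabulary, that only shows why an extractor tied to one order misses the swapped expression.

```python
import re

# Hypothetical toy vocabularies for the two entity types.
QUANTITY = r"(?P<quantity>one|two|three)"
BURGER = r"(?P<burger>spicy chicken leg burgers?|crispy cod burgers?)"

# Fixed-order pattern: "@quantity" must come before "@hamburger type".
FIXED_ORDER = re.compile(rf"{QUANTITY}\s+{BURGER}")


def extract_fixed_order(text: str):
    """Return the matched entity values, or None when the order differs."""
    match = FIXED_ORDER.search(text)
    return match.groupdict() if match else None


in_order = extract_fixed_order("give me two crispy cod burgers")
swapped = extract_fixed_order("crispy cod burgers, two")
```

With the configured order the match succeeds, while the swapped utterance yields `None` even though it contains both entities.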
Disclosure of Invention
The application provides a method and a device for configuring a combined slot in a man-machine conversation system, so that the slot can still be extracted even when a user utterance expresses the entity types of a user-configured combined slot in a different order.
The technical scheme is as follows:
In a first aspect, the present application provides a method for configuring a combined slot in a man-machine conversation system, where the method is applied to a Bot platform. The method may include the following steps. The Bot platform receives a first slot configured by a user on a first interface, the first interface being used for setting a slot for a first intent in a first skill of the Bot platform. The first slot is a combined slot comprising N entity types, where N is a positive integer and N ≥ 2; the N entity types are arranged in the first slot in an order set by the user. The Bot platform then recombines the N entity types to obtain M second slots; the M second slots comprise the slots obtained by arranging k of the N entity types in any order, where k ∈ {1, 2, ..., N}. Finally, the Bot platform performs training according to one or more training corpora and the M second slots, so that the Bot platform is able to extract the M second slots from user utterances.
In the application, the plurality of second slot positions obtained by recombination comprise slot positions obtained by arranging one or more entity types in a plurality of entity types according to any sequence, namely, the Bot platform can recombine to obtain slot positions obtained by arranging the entity types according to any sequence; therefore, even if the sequence of the entity types is changed in the user utterance or the user utterance only includes one entity type, the Bot platform can extract the corresponding slot position and reply to the user utterance.
With reference to the first aspect, in a possible design manner, the M second slots include the slots in which k of the N entity types are arranged in any order, where k ∈ {1, 2, ..., N}. That is, k may be any positive integer in [1, N].
For example, when k = 1, the Bot platform may select 1 (that is, k) entity type from the N entity types, and the selected entity type serves as one second slot. The Bot platform has N choices, so it can obtain $A_N^1 = N$ second slots. Here $A_N^k = \frac{N!}{(N-k)!}$ denotes the number of ordered arrangements of k entity types chosen from N.
When k = 2, the Bot platform may select 2 (that is, k) entity types from the N entity types and combine the 2 selected entity types into one second slot (that is, a combined slot). The Bot platform has N × (N-1) choices, so it can obtain $A_N^2 = N(N-1)$ second slots.
When k = 3, the Bot platform may select 3 (that is, k) entity types from the N entity types and combine the 3 selected entity types into one second slot (that is, a combined slot). The Bot platform has N × (N-1) × (N-2) choices, so it can obtain $A_N^3 = N(N-1)(N-2)$ second slots.
When k = N-1, the Bot platform may select N-1 (that is, k) entity types from the N entity types and combine the N-1 selected entity types into one second slot (that is, a combined slot). The Bot platform has N × (N-1) × (N-2) × ... × 2 choices, so it can obtain $A_N^{N-1} = N(N-1)(N-2)\cdots 2$ second slots.
When k = N, the Bot platform may select N (that is, k) entity types from the N entity types and combine the N selected entity types into one second slot (that is, a combined slot). The Bot platform has N × (N-1) × (N-2) × ... × 2 × 1 choices, so it can obtain $A_N^N = N(N-1)(N-2)\cdots 2 \cdot 1 = N!$ second slots.
From the above description it follows that:
$$M = \sum_{k=1}^{N} A_N^k = \sum_{k=1}^{N} \frac{N!}{(N-k)!}$$
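The counting argument above can be checked numerically. The sketch below assumes the reading that M is the sum, over k from 1 to N, of the number of ordered arrangements of k entity types chosen from N, that is, N!/(N-k)!. The function names are hypothetical.

```python
from itertools import permutations
from math import factorial


def count_second_slots(n: int) -> int:
    """Closed-form count: sum over k = 1..n of n! / (n - k)!."""
    if n < 2:
        raise ValueError("a combined slot has at least two entity types")
    return sum(factorial(n) // factorial(n - k) for k in range(1, n + 1))


def count_by_enumeration(n: int) -> int:
    """Cross-check by listing every ordered arrangement explicitly."""
    return sum(
        len(list(permutations(range(n), k))) for k in range(1, n + 1)
    )


m_two = count_second_slots(2)    # 2 + 2
m_three = count_second_slots(3)  # 3 + 6 + 6
```

For N = 2 this gives 4 second slots, and for N = 3 it gives 15; the count grows quickly with N, which matters for how many slots the platform has to store and train on.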
With reference to the first aspect, in another possible design manner, the method may further include: the Bot platform receives a slot name configured by the user on the first interface, where the slot name is used to identify the first slot. For example, the first interface may include a "slot name" setting box. The Bot platform may receive "CompositeNE" entered by the user in the "slot name" setting box of the first interface. That is, the slot name may be "CompositeNE".
With reference to the first aspect, in another possible design manner, after the Bot platform recombines the N entity types to obtain the M second slots, and before the Bot platform performs training according to the one or more training corpora and the M second slots, the method of the present application may further include: the Bot platform stores the slot name in association with the M second slots and marks the M second slots corresponding to the slot name with a combined-entity feature. The combined-entity feature indicates that the M second slots are of a recombined combined entity type, that is, a recombined, normalized combined entity (Composite Entity) type. For example, assume that the first slot is "@quantity @hamburger type". The Bot platform recombines the CompositeNE slot "@quantity @hamburger type" and obtains four second slots: "@quantity", "@hamburger type", "@quantity @hamburger type", and "@hamburger type @quantity". The Bot platform may store the slot name (e.g., CompositeNE) in association with the four second slots and then mark the four second slots corresponding to CompositeNE (that is, the slot name) with the Composite Entity feature.
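The recombination and the associated storage can be sketched as follows. The sketch uses `itertools.permutations` for brevity, which is a substitute chosen here and not the dynamic programming approach the application mentions elsewhere. The `registry` dictionary, its keys, and the function names are assumptions for illustration.

```python
from itertools import permutations
from typing import Dict, List, Tuple

SecondSlot = Tuple[str, ...]


def recombine(entity_types: List[str]) -> List[SecondSlot]:
    """List every ordered arrangement of k entity types, k = 1..N."""
    second_slots: List[SecondSlot] = []
    for k in range(1, len(entity_types) + 1):
        second_slots.extend(permutations(entity_types, k))
    return second_slots


# Hypothetical in-memory store: slot name -> second slots plus a marker.
registry: Dict[str, dict] = {}


def register(slot_name: str, entity_types: List[str]) -> None:
    registry[slot_name] = {
        "second_slots": recombine(entity_types),
        # Marker standing in for the combined-entity feature.
        "feature": "Composite Entity",
    }


register("CompositeNE", ["@quantity", "@hamburger type"])
stored = registry["CompositeNE"]["second_slots"]
```

For the two entity types of the example, `stored` holds the same four second slots listed in the text, keyed by the slot name.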
With reference to the first aspect, in another possible design manner, the training performed by the Bot platform according to the one or more training corpora and the M second slots includes: the Bot platform receives a real label added by the user for each word in the one or more training corpora; the Bot platform adds features to each word in the one or more training corpora according to the M second slots; and the Bot platform trains on the one or more training corpora according to the real labels and the features by using a deep learning algorithm, or by using a deep learning algorithm in combination with a conditional random field (CRF) algorithm.
With reference to the first aspect, in another possible design manner, the deep learning algorithm includes a long short-term memory (LSTM) algorithm.
With reference to the first aspect, in another possible design manner, the Bot platform may receive one or more training corpora, input by the user, that relate to the M second slots. The one or more training corpora are possible user utterances for the first intent (e.g., buy hamburgers). For example, for the first intent "buy hamburgers", possible user utterances may include "I want to buy one spicy chicken leg burger", "give me three cod burgers", "I want four cod burgers", or "spicy chicken leg burgers, three". The Bot platform may then receive the real label that the user adds for each word in the training corpora, and add features to each word. The real label of a word indicates the slot name of the slot to which the word corresponds and the position of the word in that slot. The feature of a word indicates the combined-entity feature of the slot to which the word corresponds and the position of the word in that slot. Finally, the Bot platform may use a deep learning algorithm to train on the one or more training corpora according to the real labels added by the user, the features added by the Bot platform, and the M second slots configured in the Bot platform. Alternatively, the Bot platform may use a deep learning algorithm in combination with the CRF algorithm to train on the one or more training corpora according to the same real labels, features, and M second slots.
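A label that encodes both the slot name and the position of a word inside the slot can be pictured with a begin/inside/outside scheme. The application does not state which labelling scheme it uses, so the `B-`/`I-`/`O` convention below is an assumption, as are the function name and the span format.

```python
from typing import List, Tuple


def label_tokens(
    tokens: List[str],
    spans: List[Tuple[int, int]],
    slot_name: str,
) -> List[str]:
    """Label each token with the slot name and its position in the slot.

    spans holds (start, end) token indices with end exclusive.
    "B-" marks the first word of a slot, "I-" a later word,
    and "O" a word outside any slot.
    """
    labels = ["O"] * len(tokens)
    for start, end in spans:
        labels[start] = f"B-{slot_name}"
        for index in range(start + 1, end):
            labels[index] = f"I-{slot_name}"
    return labels


tokens = "give me three cod burgers".split()
labels = label_tokens(tokens, [(2, 5)], "CompositeNE")
```

Here "three cod burgers" is treated as one occurrence of the slot named CompositeNE, so "three" opens the slot and the next two words continue it.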
Take the training corpus "I want one apple and two pears" as an example. The user may add a real label to each word in the training corpus under different combination modes, and the Bot platform may add features to each word in the training corpus. In implementation (1), the Bot platform adds features to each word in a "correct combination" manner. In implementation (2), the Bot platform adds features to each word in a "correct combination" + "fine-grained combination" manner. In implementation (3), the Bot platform adds features to each word in an "exhaustive combination" + "fine-grained combination" manner. In implementation (4), the Bot platform adds features to each word in an "exhaustive combination" manner.
For each implementation, the specific method by which the user adds a real label to each word in the training corpus, and the specific method by which the Bot platform adds a feature to each word, are described in the detailed description of the embodiments of the present application and are not repeated here.
With reference to the first aspect, in another possible design manner, the recombining, by the Bot platform, N entity types to obtain M second slots includes: and recombining the N entity types by the Bot platform by adopting a dynamic programming algorithm to obtain M second slot positions.
With reference to the first aspect, in another possible design manner, the training of one or more corpus by the Bot platform according to the one or more corpus and the M second slots includes: and the Bot platform trains one or more training corpora according to one or more training corpora and the M second slot positions by adopting a single-point classification algorithm and a probability-based dynamic programming algorithm.
With reference to the first aspect, in another possible design manner, the single-point classification algorithm at least includes any one of a Support Vector Machine (SVM) model, a maximum entropy model, a fast text classification algorithm (fasttext) model, a Convolutional Neural Network (CNN) model, an n-gram model, or a Recurrent Neural Network (RNN) model.
With reference to the first aspect, in another possible design manner, the single-point classification algorithm is a bidirectional encoder representations from transformers (BERT) model, and the BERT model is a bidirectional language model.
With reference to the first aspect, in another possible design manner, the Bot platform trains one or more corpus according to the one or more corpus and the M second slots by using a single-point classification algorithm and a probability-based dynamic programming algorithm, including: aiming at each training corpus in one or more training corpuses, a Bot platform cuts one training corpus into one or more candidate entities at different positions by scanning from right to left by adopting a probability-based dynamic programming algorithm to obtain a cut normalized combination entity and the number of normalized combination entities; the Bot platform acquires the confidence of the candidate entity corresponding to each position according to the M second slot positions by adopting a single-point classification algorithm; and determining a cutting mode of a training corpus by the Bot platform according to the normalized combined entity number corresponding to each position and the confidence coefficient of the candidate entity.
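One way to picture the right-to-left, confidence-driven cutting is the dynamic program below. It is a sketch under stated assumptions and not the application's algorithm: the scoring rule (sum of confidences minus a fixed penalty per entity, so that both the entity count and the confidences influence the cut), the `entity_penalty` and `max_len` parameters, and the lookup table standing in for the single-point classifier are all hypothetical.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def segment(
    tokens: List[str],
    confidence: Callable[[List[str]], float],
    entity_penalty: float = 0.5,
    max_len: int = 4,
) -> List[Span]:
    """Choose candidate entities by dynamic programming, right to left."""
    n = len(tokens)
    # best[i] = (score, spans) for the suffix tokens[i:].
    best: List[Tuple[float, List[Span]]] = [(0.0, [])] * (n + 1)
    for i in range(n - 1, -1, -1):
        # Option 1: leave token i outside any entity.
        choice = best[i + 1]
        # Option 2: start a candidate entity at i that ends before j.
        for j in range(i + 1, min(n, i + max_len) + 1):
            score = confidence(tokens[i:j])
            if score <= 0.0:
                continue  # the classifier rejects this candidate
            tail_score, tail_spans = best[j]
            total = tail_score + score - entity_penalty
            if total > choice[0]:
                choice = (total, [(i, j)] + tail_spans)
        best[i] = choice
    return best[0][1]


# Toy confidences standing in for a trained single-point classifier.
TOY_CONFIDENCE = {
    "two": 0.6,
    "cod burgers": 0.7,
    "cod burgers two": 0.9,
}


def toy_confidence(candidate: List[str]) -> float:
    return TOY_CONFIDENCE.get(" ".join(candidate), 0.0)


spans = segment("i want cod burgers two".split(), toy_confidence)
```

With these toy numbers the single three-word candidate (0.9 - 0.5) outscores the two shorter candidates taken together (0.7 + 0.6 - 2 × 0.5), so the utterance is cut into one combined entity covering "cod burgers two".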
With reference to the first aspect, in another possible design manner, after the Bot platform recombines N entity types to obtain M second slots, the method of the present application may further include: the Bot platform displays a second interface, the second interface comprises a training start button, and the training start button is used for triggering the Bot platform to train the one or more training corpuses; and responding to the clicking operation of the user on the start training button, and training one or more training corpora by the Bot platform according to one or more training corpora and the M second slots, so that the Bot platform has the capability of extracting the M second slots in the user's speech. That is, in the present application, the Bot platform may trigger the Bot platform to perform training in response to a click operation of a start training button by a user.
In a second aspect, the present application provides a Bot platform, which may include a processor, a memory, and a display. The memory and the display are coupled to the processor. The memory is configured to store computer program code comprising computer instructions; when the processor executes the computer instructions, the Bot platform performs the following operations. The display is configured to display a first interface, the first interface being used for setting a slot for a first intent in a first skill of the Bot platform. The processor is configured to receive a first slot configured by a user on the first interface displayed by the display; the first slot is a combined slot comprising N entity types, where N is a positive integer and N ≥ 2, and the N entity types are arranged in the first slot in an order set by the user. The processor is further configured to recombine the N entity types to obtain M second slots, where the M second slots comprise the slots obtained by arranging k of the N entity types in any order, k ∈ {1, 2, ..., N}. The processor is further configured to perform training according to one or more training corpora and the M second slots, so that the Bot platform is able to extract the M second slots from user utterances.
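With reference to the second aspect, in one possible design manner, $M = \sum_{k=1}^{N} A_N^k = \sum_{k=1}^{N} \frac{N!}{(N-k)!}$, where $A_N^k$ denotes the number of ordered arrangements of k entity types chosen from N.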
With reference to the second aspect, in another possible design manner, the processor is further configured to receive a slot name configured by the user on the first interface displayed by the display, where the slot name is used to identify the first slot.
With reference to the second aspect, in another possible design manner, the processor is further configured to: after recombining the N entity types to obtain the M second slots, and before performing training according to the one or more training corpora and the M second slots, store the slot name in association with the M second slots in the memory, and mark the M second slots corresponding to the slot name with a combined-entity feature, where the combined-entity feature indicates that the M second slots are of the recombined combined entity type.
With reference to the second aspect, in another possible design manner, that the processor is configured to perform training according to the one or more training corpora and the M second slots includes: the processor is specifically configured to receive a real label added by the user for each word in the one or more training corpora, and to add features to each word in the one or more training corpora according to the M second slots; the processor is further configured to train on the one or more training corpora according to the real labels and the features by using a deep learning algorithm, or by using a deep learning algorithm in combination with a conditional random field (CRF) algorithm.
In another possible design form, in combination with the second aspect, the deep learning algorithm includes an LSTM algorithm.
With reference to the second aspect, in another possible design manner, the processor is configured to recombine the N entity types to obtain M second slots, and includes: and the processor is specifically used for recombining the N entity types by adopting a dynamic programming algorithm to obtain M second slot positions.
With reference to the second aspect, in another possible design manner, the processor is configured to train one or more corpus according to the one or more corpus and the M second slots, and includes: and the processor is specifically used for training the one or more training corpora according to the one or more training corpora and the M second slot positions by adopting a single-point classification algorithm and a probability-based dynamic programming algorithm.
With reference to the second aspect, in another possible design manner, the single-point classification algorithm at least includes any one of an SVM model, a maximum entropy model, a fasttext model, a CNN model, an n-gram model, or an RNN model.
In combination with the second aspect, in another possible design manner, the single-point classification algorithm is a BERT model, and the BERT model is a bidirectional language model.
With reference to the second aspect, in another possible design manner, that the processor is configured to train on the one or more training corpora according to the one or more training corpora and the M second slots by using a single-point classification algorithm and a probability-based dynamic programming algorithm includes: the processor is specifically configured to, for each training corpus in the one or more training corpora, cut the training corpus into one or more candidate entities at different positions by scanning from right to left with the probability-based dynamic programming algorithm, to obtain the cut normalized combined entities and the number of normalized combined entities; obtain, by using the single-point classification algorithm, the confidence of the candidate entity corresponding to each position according to the M second slots; and determine the cutting mode of the training corpus according to the number of normalized combined entities corresponding to each position and the confidence of the candidate entities.
With reference to the second aspect, in another possible design manner, the display is further configured to display a second interface after the processor recombines the N entity types to obtain M second slots, where the second interface includes a training start button, and the training start button is used to trigger the Bot platform to train the one or more training corpora. And the processor is also used for responding to the clicking operation of the user on the training starting button displayed by the display, training one or more training corpora according to one or more training corpora and the M second slots, and enabling the Bot platform to have the capability of extracting the M second slots in the user's speech.
In a third aspect, the present application provides a computer storage medium, where the computer storage medium includes computer instructions, and when the computer instructions are executed on a Bot platform, the Bot platform is enabled to execute the configuration method for a combined slot in a human-computer dialog system according to the first aspect and any possible design manner thereof.
In a fourth aspect, the present application provides a computer program product, which when run on a computer, causes the computer to execute the method for configuring a combination slot in a human-machine dialog system according to the first aspect and any one of the possible designs thereof.
It should be understood that, for the beneficial effects that the Bot platform according to the second aspect, the computer storage medium according to the third aspect, and the computer program product according to the fourth aspect can achieve, reference may be made to the beneficial effects in the first aspect and any possible design manner thereof, and details are not described here again.
Drawings
FIG. 1 is an example of a task-oriented voice interaction flow;
FIG. 2 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a skill configuration interface of a Bot platform according to an embodiment of the present application;
FIG. 4A is a schematic diagram of a skill configuration interface of another Bot platform according to an embodiment of the present application;
FIG. 4B is a schematic diagram of a skill configuration interface of another Bot platform according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an intent configuration interface of a Bot platform according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an intent configuration interface of another Bot platform according to an embodiment of the present application;
FIG. 7 is a flowchart of a method for configuring a combined slot in a man-machine conversation system according to an embodiment of the present application;
FIG. 8A is a block diagram of a system for implementing configuration of a combined slot in a man-machine conversation system according to an embodiment of the present application;
FIG. 8B is a block diagram of another system for implementing configuration of a combined slot in a man-machine conversation system according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a model training interface of a Bot platform according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a model training interface of another Bot platform according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a model training interface of another Bot platform according to an embodiment of the present application;
FIG. 12 is a schematic diagram of a model training interface of another Bot platform according to an embodiment of the present application;
FIG. 13 is a schematic diagram of a model training interface of another Bot platform according to an embodiment of the present application;
FIG. 14 is a schematic diagram of a model training interface of another Bot platform according to an embodiment of the present application;
FIG. 15 is a schematic structural diagram of a Bot platform according to an embodiment of the present application.
Detailed Description
The embodiment of the application provides a configuration method of a combined slot position in a man-machine conversation system, and the method can be applied to a Bot platform. The method and the device can be particularly applied to the slot configuration process of the Bot platform.
The "combined slot" in the embodiments of the present application refers to a slot that includes at least two entity types. For example, the slots extracted from the corpus "I want one apple and two pears" include "@ one @ apple" and "@ two @ pears". Each of "@ one @ apple" and "@ two @ pears" includes the entity types '@ quantity' and '@ hamburger type'. Therefore, "@ one @ apple" and "@ two @ pears" are combined slots.
Illustratively, the Bot platform in the embodiment of the present application may be an electronic device integrated with a Chat Bot conversational application. Alternatively, the Bot platform may be an electronic device capable of logging in to a web page of a Chat Bot conversational application service provided by a server. For example, the Bot platform may be an electronic device that has the above functions, such as a personal computer (PC), a notebook computer, a portable computer (e.g., a mobile phone), a wearable electronic device (e.g., a smart watch), a tablet computer, an augmented reality (AR)/virtual reality (VR) device, or an in-vehicle computer. The following embodiments do not particularly limit the specific form of the Bot platform.
By way of example, fig. 2 shows a schematic structural diagram of the Bot platform in the embodiment of the present application. As shown in fig. 2, the Bot platform 200 may include a processor 210, an external memory interface 220, an internal memory 221, a Universal Serial Bus (USB) interface 230, a charging management module 240, a power management module 241, a battery 242, an antenna 1, an antenna 2, a mobile communication module 250, a wireless communication module 260, an audio module 270, a speaker 270A, a receiver 270B, a microphone 270C, an earphone interface 270D, a sensor module 280, keys 290, a motor 291, an indicator 292, a camera 293, a display 294, and a Subscriber Identification Module (SIM) card interface 295, and the like. The sensor module 280 may include a pressure sensor, a gyroscope sensor, an air pressure sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity light sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, etc.
It is to be understood that the illustrated structure of the embodiment of the present application does not specifically limit the Bot platform 200. In other embodiments, Bot platform 200 may include more or fewer components than shown, or combine certain components, or split certain components, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Processor 210 may include one or more processing units, such as: the processor 210 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a memory, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), among others. The different processing units may be separate devices or may be integrated into one or more processors.
The controller may be the nerve center and command center of the Bot platform 200. The controller can generate an operation control signal according to the instruction operation code and the timing signal to complete the control of instruction fetching and instruction execution.
A memory may also be provided in processor 210 for storing instructions and data. In some embodiments, the memory in the processor 210 is a cache memory. The memory may hold instructions or data that the processor 210 has just used or uses repeatedly. If the processor 210 needs to use the instructions or data again, it can fetch them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 210, thereby increasing the efficiency of the system.
In some embodiments, processor 210 may include one or more interfaces. The interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a Pulse Code Modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a general-purpose input/output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and/or a Universal Serial Bus (USB) interface, etc.
It should be understood that the interface connection relationship between the modules according to the embodiment of the present invention is only an exemplary illustration, and does not constitute a structural limitation on the Bot platform 200. In other embodiments of the present application, the Bot platform 200 may also adopt different interface connection manners or a combination of a plurality of interface connection manners in the above embodiments.
The charge management module 240 is configured to receive a charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 240 may receive charging input from a wired charger via the USB interface 230. In some wireless charging embodiments, the charging management module 240 may receive wireless charging input through the wireless charging coil of the Bot platform 200. The charging management module 240 may also supply power to the electronic device through the power management module 241 while charging the battery 242.
The power management module 241 is used to connect the battery 242, the charging management module 240 and the processor 210. The power management module 241 receives input from the battery 242 and/or the charging management module 240, and provides power to the processor 210, the internal memory 221, the external memory, the display 294, the camera 293, and the wireless communication module 260.
The wireless communication function of the Bot platform 200 may be implemented by the antenna 1, the antenna 2, the mobile communication module 250, the wireless communication module 260, a modem processor, a baseband processor, and the like. The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in Bot platform 200 may be used to cover a single or multiple communication bands. Different antennas can also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module 250 may provide a solution including 2G/3G/4G/5G wireless communication applied on the Bot platform 200. The mobile communication module 250 may include at least one filter, a switch, a power amplifier, a Low Noise Amplifier (LNA), and the like. The mobile communication module 250 may receive the electromagnetic wave from the antenna 1, filter, amplify, etc. the received electromagnetic wave, and transmit the electromagnetic wave to the modem processor for demodulation. The mobile communication module 250 may also amplify the signal modulated by the modem processor, and convert the signal into electromagnetic wave through the antenna 1 to radiate the electromagnetic wave. In some embodiments, at least some of the functional modules of the mobile communication module 250 may be disposed in the processor 210. In some embodiments, at least some of the functional modules of the mobile communication module 250 may be disposed in the same device as at least some of the modules of the processor 210.
The modem processor may include a modulator and a demodulator. In some embodiments, the modem processor may be a stand-alone device. In other embodiments, the modem processor may be separate from the processor 210, and may be disposed in the same device as the mobile communication module 250 or other functional modules.
The wireless communication module 260 may provide a solution for wireless communication applied to the Bot platform 200, including Wireless Local Area Networks (WLANs) (e.g., wireless fidelity (Wi-Fi) networks), Bluetooth (BT), Global Navigation Satellite System (GNSS), Frequency Modulation (FM), Near Field Communication (NFC), Infrared (IR), and the like. The wireless communication module 260 may be one or more devices integrating at least one communication processing module. The wireless communication module 260 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering processing on electromagnetic wave signals, and transmits the processed signals to the processor 210. The wireless communication module 260 may also receive a signal to be transmitted from the processor 210, frequency-modulate and amplify the signal, and convert the signal into electromagnetic waves via the antenna 2 to radiate the electromagnetic waves.
In some embodiments, antenna 1 of Bot platform 200 is coupled to mobile communication module 250 and antenna 2 is coupled to wireless communication module 260, such that Bot platform 200 may communicate with networks and other devices via wireless communication techniques. The wireless communication technology may include global system for mobile communications (GSM), General Packet Radio Service (GPRS), code division multiple access (CDMA), Wideband Code Division Multiple Access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), Long Term Evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies, etc. The GNSS may include a Global Positioning System (GPS), a global navigation satellite system (GLONASS), a BeiDou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS), and/or a Satellite Based Augmentation System (SBAS).
The Bot platform 200 implements display functions via the GPU, display screen 294, and application processor, among other things. The GPU is a microprocessor for image processing, and is connected to the display screen 294 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 210 may include one or more GPUs that execute program instructions to generate or alter display information.
The display screen 294 is used to display images, video, and the like. The display screen 294 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, Bot platform 200 may include 1 or N display screens 294, N being a positive integer greater than 1.
The Bot platform 200 may implement the shooting function through the ISP, the camera 293, the video codec, the GPU, the display screen 294, the application processor, and the like.
The ISP is used to process the data fed back by the camera 293. For example, when a photo is taken, the shutter is opened, light is transmitted to the camera photosensitive element through the lens, the optical signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing and converting into an image visible to naked eyes. The ISP can also carry out algorithm optimization on the noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, the ISP may be provided in camera 293.
The camera 293 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image to the photosensitive element. The photosensitive element may be a Charge Coupled Device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The light sensing element converts the optical signal into an electrical signal, which is then passed to the ISP where it is converted into a digital image signal. And the ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into image signal in standard RGB, YUV and other formats. In some embodiments, Bot platform 200 may include 1 or N cameras 293, N being a positive integer greater than 1.
The NPU is a Neural-Network (NN) computing processor, which processes input information quickly by using a biological Neural Network structure, for example, by using a transfer mode between neurons of a human brain, and can also learn by itself continuously. The NPU can realize applications such as intelligent cognition of the Bot platform 200, for example: image recognition, face recognition, speech recognition, text understanding, and the like. For example, the NPU may run the AI model in the embodiment of the present application, and perform the services such as image recognition, face recognition, speech recognition, text understanding, and the like.
The external memory interface 220 may be used to connect an external memory card, such as a MicroSD card, to extend the storage capability of the Bot platform 200. The external memory card communicates with the processor 210 through the external memory interface 220 to implement a data storage function. For example, files such as music, video, etc. are saved in an external memory card.
Internal memory 221 may be used to store computer-executable program code, including instructions. The processor 210 executes various functional applications and data processing of the Bot platform 200 by executing instructions stored in the internal memory 221. The internal memory 221 may include a program storage area and a data storage area. The storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required by at least one function, and the like. The storage data area may store data (such as audio data, a phonebook, etc.) created during use of the Bot platform 200, and the like. In addition, the internal memory 221 may include a high-speed random access memory, and may further include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (UFS), and the like. For example, a memory (e.g., internal memory 221) may be used to hold model code for the AI model.
Bot platform 200 may implement audio functions, such as music playing and recording, via audio module 270, speaker 270A, receiver 270B, microphone 270C, earphone interface 270D, an application processor, and the like. Audio module 270 is used to convert digital audio information into an analog audio signal output and also to convert an analog audio input into a digital audio signal. Audio module 270 may also be used to encode and decode audio signals. In some embodiments, the audio module 270 may be disposed in the processor 210, or some functional modules of the audio module 270 may be disposed in the processor 210.
The keys 290 include a power-on key, a volume key, etc. The keys 290 may be mechanical keys or touch keys. Bot platform 200 may receive key inputs and generate key signal inputs related to user settings and function control of Bot platform 200.
The motor 291 may generate a vibration cue. The motor 291 can be used for both incoming call vibration prompting and touch vibration feedback. Indicator 292 may be an indicator light that may be used to indicate a state of charge, a change in charge, or may be used to indicate a message, missed call, notification, etc.
The SIM card interface 295 is used to connect a SIM card. The SIM card can be attached to and detached from the Bot platform 200 by being inserted into the SIM card interface 295 or being pulled out from the SIM card interface 295. The Bot platform 200 may support 1 or N SIM card interfaces, N being a positive integer greater than 1. The SIM card interface 295 may support a Nano SIM card, a Micro SIM card, a SIM card, etc. Multiple cards can be inserted into the same SIM card interface 295 at the same time. The types of the plurality of cards may be the same or different. The SIM card interface 295 may also be compatible with different types of SIM cards. The SIM card interface 295 may also be compatible with external memory cards. The Bot platform 200 interacts with the network through the SIM card to implement functions such as call and data communication. In some embodiments, Bot platform 200 employs an eSIM, namely an embedded SIM card. The eSIM card can be embedded in the Bot platform 200 and cannot be separated from the Bot platform 200.
The Bot platform in the embodiment of the application is a Bot platform of a task-oriented chat product. A user can configure one or more skills (Skill), intentions (Intent) and slots (Slot) for the Bot platform in the Bot platform to realize human-computer interaction. The embodiment of the application describes a process for creating skills, intentions and slots in a Bot platform by a user by combining the drawings.
Referring to fig. 3, a skill configuration interface 301 of a Bot platform provided in an embodiment of the present application is shown. As shown in fig. 3, skill configuration interface 301 may include: a basic configuration item, a skill name setting box, a skill classification setting box, an icon setting item, a skill request word setting box, a skill welcome word setting box, a skill description setting box and the like.
The "skill name" setting box is used to input the name of the skill that the user wants to create. For example, a skill name such as "order" or "ticket purchase" may be input in the "skill name" setting box. The "skill classification" setting box is used to set the type of the skill whose name is input in the "skill name" setting box. For example, the skill names "order" and "ticket purchase" both belong to the tool assistant classification shown in fig. 3. The "icon" setting item is used to set an icon for the skill whose name is input in the "skill name" setting box. The "skill request word" setting box is used to input a request word that controls the Bot platform to start the skill. For example, as shown in fig. 3, "hi minze" may be input in the "skill request word" setting box. The "order" skill may be started when the Bot platform recognizes this request word in the user's voice data or text data. The "skill welcome" setting box is used to input a skill welcome message, which is broadcast automatically to introduce the skill to the user and, when the system cannot recognize the user's utterance, to guide the user on how to phrase a request so that it can be recognized. The "skill description" setting box is used to input an introduction to the skill, such as the problems that the skill can solve and the services that it can provide to the user.
In response to the user clicking the "save" button in the "basic configuration" item shown in fig. 3, the Bot platform may display an intent configuration interface 401 shown in fig. 4A, which guides the user to configure one or more intentions for the skills (e.g., "order food") configured by the user in the skill configuration interface 301.
As shown in fig. 4A, the intent configuration interface 401 includes: an "intention name" setting box, an "intention Chinese name" setting box, a "multi-turn dialog" setting box, a "user utterance" setting box, and the like.
The "intention name" setting box is used to input the name of the intention that the user wants to create (such as an English name, or a name including letters and/or digits). For example, "buyHum" is input in the "intention name" setting box shown in fig. 4A. The "intention Chinese name" setting box is used to input the Chinese name of the intention that the user wants to create. For example, "buy hamburger" is entered in the "intention Chinese name" setting box shown in fig. 4A. The "multi-turn dialog" setting box is used to set the manner of multiple rounds of dialog between the Bot platform and the user. The "user utterance" setting box is used to input the ways in which a user may express the intention "buy hamburgers" (i.e., user utterances). For example, the user may input user utterances such as "I want to buy one spicy drumstick", "give me three cod burgers" or "three spicy drumstick burgers" in the "user utterance" setting box. A user utterance input in the "user utterance" setting box may also be referred to as a training corpus. The training corpora are used for training, for example artificial intelligence (AI) model training, on the one or more slots (including combined slots) that the user sets for the intention "buy hamburgers".
As shown in fig. 4A, the intent configuration interface 401 may further include an "intent to save" item 403 and a "configuration selection bar" 402, among other things. In response to the user entering the intent name in the "intent name" setup box, the Bot platform may display the user-entered intent name in an "intent save" item 403. For example, in response to the user inputting "buyHum" in the "intention name" setting box, as shown in fig. 4A, the Bot platform displays "buyHum" in the "intention save" item 403. In response to a user clicking on the "save" button in the "intent save" item 403, the Bot platform may save the user's intent and related parameters set at the intent configuration interface 401.
The "configuration selection bar" 402 may include a plurality of selections, such as a "base configuration" selection, an "intent" selection, and a "slot type" selection, and a "training" selection. After the user creates an intent in the intent configuration interface 401 shown in FIG. 4A, the user may click on the "slot type" selection in the "configuration selection bar" 402. In response to a user clicking on the "slot type" selection in the "configuration selection bar" 402, the Bot platform may display a slot configuration interface 404 shown in fig. 4B. A "slot type" setting item 405 is included in the slot configuration interface 404. The "slot type" setting item 405 includes a "system slot type" option and a "new slot type" option. In response to a click operation of the user on the "system slot type" option, the Bot platform may determine that the slot type that the user wants to configure is the system slot type, that is, a slot type that is preconfigured in the Bot platform. In response to the click operation of the user on the option of the new slot type, the Bot platform may determine that the slot type that the user wants to configure is the new slot type, that is, the slot type defined by the user. As shown in fig. 4B, in the embodiment of the present application, for example, a user selects a "new slot type" option. In response to a user clicking on the "new slot type" option, the Bot platform may display a slot configuration interface 501 shown in fig. 5, which guides the user to configure one or more slots for the user's intent (e.g., "buy hamburger") configured in the intent configuration interface 401.
As shown in fig. 5, the slot configuration interface 501 may include: a "slot name" setting box 503, a "slot chinese name" setting box 504, and an "entity type" setting box 505, and the like.
The "slot name" setting box is used to input the name (such as an English name, or a name including letters and/or digits) of the slot that the user wants to configure for the intention "buy hamburger" above. For example, as shown in fig. 5, "CompositeNE" is input in the "slot name" setting box. The "slot Chinese name" setting box is used to input the Chinese name of the slot that the user wants to configure. For example, as shown in fig. 5, the Chinese name "combination slot 1" is input in the "slot Chinese name" setting box.
It is understood that a slot (including a combined slot) may be made up of one or more entity types. The "entity type" setting box 505 is used to configure the entity type in the slot "combination slot 1". In this embodiment, the entity types that the user can configure in the "entity type" setting box 505 may include: a system entity type (namely an entity type pre-configured in a Bot platform) and a new entity type (namely a user-defined entity type).
As shown in fig. 5, the "entity type" setting box 505 includes a "system entity type" button 506, a "new entity type" button 507, an "instance type" input box 508, and an "Add entity type" button 509. The "instance type" input box 508 is used to input one or more entity types that the user wants to configure for the slot "combination slot 1" described above. In response to a user clicking on the "system entity type" button 506, the Bot platform may determine that the entity type that the user wants to configure in the "instance type" input box 508 is a system entity type. In response to a user clicking on the "new entity type" button 507, the Bot platform may determine that the entity type that the user wants to configure in the "instance type" input box 508 is a new entity type. As shown in fig. 5, the embodiment of the present application takes the case in which the user selects the "new entity type" button 507 as an example. The "Add entity type" button 509 is used to trigger the Bot platform to display an "instance type" input box 508 for configuring a new entity type.
As shown in fig. 5, the slot configuration interface 501 may also include a "slot save" item 502. In response to the user entering a slot name in the "slot name" setup box, the Bot platform may display the slot name entered by the user in the "slot save" item 502. For example, in response to the user entering "CompositeNE" in the "slot name" setting box, the Bot platform displays "CompositeNE" in the "slot save" item 502 as shown in fig. 6.
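As a rough illustration of the information gathered by the slot configuration interface, the following Python sketch models one configured slot as a small record. The class name `SlotConfig`, its field names, and the `label` rendering are hypothetical and are not part of the application; the values are the example values used in this description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SlotConfig:
    """Hypothetical record of what the slot configuration interface collects."""
    slot_name: str           # value typed into the "slot name" box
    slot_display_name: str   # value typed into the "slot Chinese name" box
    entity_types: List[str] = field(default_factory=list)  # in the order entered

    def is_combined(self) -> bool:
        # A combined slot includes at least two entity types.
        return len(self.entity_types) >= 2

    def label(self) -> str:
        # Render the slot as a sequence of @-prefixed entity types.
        return " ".join("@" + t for t in self.entity_types)


slot = SlotConfig("CompositeNE", "combination slot 1",
                  ["quantity", "hamburger type"])
print(slot.label(), slot.is_combined())
```

Keeping the entity types in a list preserves the order in which the user entered them, which is the order that matters in the discussion that follows.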
As shown in fig. 6, the user inputs the slot Chinese name "combined slot 1" in the "slot Chinese name" setting box 504, and configures two entity types, '@ quantity' and '@ hamburger type', in the "instance type" input box 508. In general, if the user clicks the "save" button in the "slot save" item 502 shown in FIG. 6, the Bot platform may generate combined slot 1 "@ quantity @ hamburger type" in response to the click operation. The order of the entity types in this combined slot is: '@ quantity' first and '@ hamburger type' second. Thus, the Bot platform can recognize and respond only to utterances in which '@ quantity' precedes '@ hamburger type', such as "I want one spicy chicken leg burger" or "give me two crispy cod burgers". When the user's utterance places '@ hamburger type' before '@ quantity', as in "crispy cod burger, I want two" or "spicy chicken leg burger, three for me", the Bot platform cannot extract the corresponding slot.
To extract slots from utterances in which '@ hamburger type' precedes '@ quantity', such as "crispy cod burger, I want two" or "spicy chicken leg burger, three for me", the user also needs to configure a combined slot "@ hamburger type @ quantity". That is, for user utterances with the same meaning, the user has to configure multiple combined slots, each with a specified order, which makes the user operation cumbersome.
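The order sensitivity described above can be reproduced with a toy extractor. The sketch below is only an illustration under stated assumptions: the dictionary `ENTITY_VALUES`, the functions `slot_pattern` and `extract`, and the use of regular expressions are all hypothetical stand-ins, since the application trains a model rather than matching patterns.

```python
import re

# Hypothetical entity dictionaries; values come from the example utterances.
ENTITY_VALUES = {
    "quantity": ["one", "two", "three"],
    "hamburger type": ["spicy chicken leg burger", "crispy cod burger"],
}


def slot_pattern(entity_order):
    """Build a regex that matches entity values only in the given order."""
    parts = []
    for entity in entity_order:
        alternatives = "|".join(re.escape(v) for v in ENTITY_VALUES[entity])
        parts.append("(" + alternatives + ")")
    # Consecutive entities must be separated by whitespace only.
    return re.compile(r"\s+".join(parts))


def extract(utterance, entity_order):
    """Return the matched entity values as a tuple, or None if no match."""
    match = slot_pattern(entity_order).search(utterance.lower())
    return match.groups() if match else None


fixed_order = ["quantity", "hamburger type"]
print(extract("I want one spicy chicken leg burger", fixed_order))
print(extract("Spicy chicken leg burger, three for me", fixed_order))
```

With a single fixed order, the first utterance yields a match and the second does not, even though both express the same request. Covering the second utterance in this toy setup would require a second pattern with the entity types reversed, which mirrors the extra configuration burden described above.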
In order to solve the above problem, an embodiment of the present application provides a method for configuring a combination slot in a human-computer interaction system, where the method may be applied to a robot Bot platform. In the method, a Bot platform may receive a user-configured combined slot, which may include multiple entity types. Although the plurality of entity types are arranged in the combination slot according to the sequence set by the user; however, in this embodiment of the present application, the Bot platform may recombine the plurality of entity types to obtain a plurality of combined slots. The plurality of combined slots comprise slots obtained by arranging one or more entity types in the plurality of entity types according to any sequence. And finally, the Bot platform can perform model training according to one or more training corpora input by the user and a plurality of slots obtained by recombination.
Because the combined slots obtained by recombination include slots in which one or more of the plurality of entity types are arranged in any order, the Bot platform can extract the corresponding slot even if the order of the entity types changes in the user's utterance.
In addition, in the embodiment of the application, the user only needs to configure which entity types are included in the combined slot, and the user does not need to specify the sequence of the entity types in the combined slot.
For convenience of understanding, the following describes in detail a configuration method of a combination slot in a human-machine conversation system according to an embodiment of the present application with reference to the drawings.
An embodiment of the present application provides a method for configuring a combination slot in a human-computer dialog system, where as shown in fig. 7, the method may include S701 to S703:
S701. The Bot platform receives a first slot configured by a user on a first interface. The first slot is a combined slot that includes N entity types, where N is a positive integer and N ≥ 2.
The N entity types are arranged in the first slot in the order set by the user. For example, the first slot may be the combined slot 1 "@ quantity @ hamburger type" described above. This first slot includes two entity types, '@ quantity' and '@ hamburger type', that is, N = 2. The order of the entity types in the first slot is set by the user, and is specifically: '@ quantity' first and '@ hamburger type' second.
The first interface is used to set a slot for a first intent in a first skill of the Bot platform. For example, the first interface may be the slot configuration interface shown in fig. 5 or 6. Accordingly, the first skill may be the above skill "order a meal", and the first intent may be the above intent "buy hamburgers".
S702: The Bot platform recombines the N entity types to obtain M second slots. The M second slots include every slot obtained by arranging k of the N entity types in any order, where k ∈ {1, 2, …, N}.
For example, continuing the above example, after the Bot platform receives the combined slot 1 "@ quantity @ hamburger type", it may recombine the two entity types '@ quantity' and '@ hamburger type' to obtain four slots: "@ quantity", "@ hamburger type", "@ quantity @ hamburger type", and "@ hamburger type @ quantity". Therefore, even if the user says something like "I want crispy cod burgers, two of them" or "spicy chicken leg burgers, three for me", where '@ quantity' comes after '@ hamburger type', the Bot platform can still extract the corresponding slot.
It should be noted that the symbol @ in the embodiments of the present application only identifies an entity type and has no other meaning. In the embodiments of the present application, entity types and slots are distinguished by the quotation marks used: single quotation marks denote an entity type and double quotation marks denote a slot. For example, '@ quantity' represents an entity type, while "@ quantity" represents a slot.
That is, k may be any positive integer in [1, N], and for each value of k the M second slots include all ordered arrangements of k of the N entity types.
For example, when k = 1, the Bot platform may select 1 (i.e., k) entity type from the N entity types, and the selected entity type serves as one second slot. The Bot platform has N choices, i.e., it can obtain $A_N^1 = N$ second slots, where $A_N^k$ denotes the number of ordered arrangements of k items chosen from N.
When k = 2, the Bot platform may select 2 (i.e., k) entity types from the N entity types and combine them into one second slot (i.e., a combined slot). The Bot platform has N × (N-1) choices, i.e., it can obtain $A_N^2 = N(N-1)$ second slots.
When k = 3, the Bot platform may select 3 (i.e., k) entity types from the N entity types and combine them into one second slot (i.e., a combined slot). The Bot platform has N × (N-1) × (N-2) choices, i.e., it can obtain $A_N^3 = N(N-1)(N-2)$ second slots.
When k = N-1, the Bot platform may select N-1 (i.e., k) entity types from the N entity types and combine them into one second slot (i.e., a combined slot). The Bot platform has N × (N-1) × (N-2) × … × 2 choices, i.e., it can obtain $A_N^{N-1} = N(N-1)(N-2)\cdots 2$ second slots.
When k = N, the Bot platform may select N (i.e., k) entity types from the N entity types and combine them into one second slot (i.e., a combined slot). The Bot platform has N × (N-1) × (N-2) × … × 2 × 1 choices, i.e., it can obtain $A_N^N = N(N-1)(N-2)\cdots 2 \cdot 1 = N!$ second slots.
From the above description it follows that:
$$M = \sum_{k=1}^{N} A_N^k = \sum_{k=1}^{N} \frac{N!}{(N-k)!}$$
For example, take N = 3, that is, the first slot includes 3 entity types (such as the entity types '@ A', '@ B' and '@ C'). By recombining these 3 entity types, the Bot platform can obtain the following 15 (i.e., M = 15) second slots. Specifically, when k = 1, the Bot platform may obtain 3 second slots: "@ A", "@ B", and "@ C". When k = 2, the Bot platform may obtain 6 second slots: "@ A @ B" and "@ A @ C"; "@ B @ A" and "@ B @ C"; and "@ C @ A" and "@ C @ B". When k = 3, the Bot platform may obtain 6 second slots: "@ A @ B @ C" and "@ A @ C @ B"; "@ B @ A @ C" and "@ B @ C @ A"; and "@ C @ A @ B" and "@ C @ B @ A". That is, M = 3 + 6 + 6 = 15.
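For illustration only, the recombination of S702 may be sketched in Python as follows. The function name `recombine` and the representation of a slot as a space-joined string are assumptions made for this sketch; the embodiment does not prescribe them, and the sketch uses plain enumeration of ordered arrangements rather than any particular algorithm of the Bot platform.

```python
from itertools import permutations


def recombine(entity_types):
    """Return every ordered arrangement of k entity types, for k = 1..N."""
    slots = []
    for k in range(1, len(entity_types) + 1):
        # permutations(..., k) yields each ordered selection of k distinct items
        for arrangement in permutations(entity_types, k):
            slots.append(" ".join(arrangement))
    return slots


second_slots = recombine(["@A", "@B", "@C"])
print(len(second_slots))  # 3 + 6 + 6 = 15
print(second_slots)
```

With two entity types, such as '@ quantity' and '@ hamburger type', the same function yields the four second slots listed in the earlier example.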
It should be noted that the user may not set the order of the entity types in the first slot. In this case, the user may set only which entity types are included in the first slot without setting the order of the entity types in the first slot. That is to say, in this embodiment of the application, regardless of whether the user sets the order of the entity types in the first slot, the Bot platform may recombine the N entity types to obtain the M second slots.
In some embodiments, the Bot platform may adopt a dynamic programming algorithm to recombine the N entity types to obtain the M second slots.
S703: The Bot platform performs model training according to one or more training corpora and the M second slots, so that the Bot platform has the capability of extracting the M second slots from user utterances.
The Bot platform may display a second interface. The second interface includes a start-training button for triggering the Bot platform to train. In response to the user clicking the start-training button, the Bot platform may perform model training according to the one or more training corpora input by the user and the M second slots, so that the Bot platform has the capability of extracting the M second slots from user utterances.
For example, in response to the user clicking on the "train" selection item in the "configuration selection bar" 601 as shown in fig. 6, the Bot platform may display a second interface 901 as shown in fig. 9, guiding the user to trigger the Bot platform to train. The second interface 901 includes a start training button 902. In response to the user clicking the start training button 902, the Bot platform may perform S703 for training.
In some embodiments, the Bot platform may further receive a slot name configured by the user for the first slot on the first interface. The slot name is used to identify the first slot. For example, the Bot platform may receive "CompositeNE" entered by the user in the "slot name" setting box of the first interface shown in fig. 6, and "combined slot 1" entered by the user in the "slot Chinese name" setting box of the first interface shown in fig. 6. That is, the slot name may include "CompositeNE" and "combined slot 1".
In some embodiments, the Bot platform may further store the slot name in association with the M second slots, and mark the M second slots corresponding to the slot name with a Composite Entity feature. The Composite Entity feature indicates that the M second slots are normalized to the Composite Entity type.
Continuing the above example, the Bot platform recombines the entity types of CompositeNE "@ quantity @ hamburger type" and obtains four second slots: "@ quantity", "@ hamburger type", "@ quantity @ hamburger type", and "@ hamburger type @ quantity". The Bot platform may store the slot name (e.g., combined slot 1) in association with the four second slots and then mark the four second slots corresponding to the slot name with the Composite Entity feature.
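As a hedged sketch of the association described above, the slot name, the M second slots, and the Composite Entity mark might be kept together in one record. The class name `CombinedSlotRecord` and all of its field names are hypothetical and chosen only for this illustration; the embodiment does not specify a storage format.

```python
from dataclasses import dataclass, field
from itertools import permutations


@dataclass
class CombinedSlotRecord:
    slot_name: str            # e.g. "CompositeNE" (hypothetical field name)
    display_name: str         # e.g. "combined slot 1" (hypothetical field name)
    entity_types: list        # entity types configured by the user
    feature: str = "Composite Entity"  # mark shared by all M second slots
    second_slots: list = field(init=False, default_factory=list)

    def __post_init__(self):
        # Derive the M second slots: every ordered arrangement of k types.
        self.second_slots = [
            " ".join(p)
            for k in range(1, len(self.entity_types) + 1)
            for p in permutations(self.entity_types, k)
        ]


record = CombinedSlotRecord(
    "CompositeNE", "combined slot 1", ["@quantity", "@hamburger type"]
)
print(record.second_slots, record.feature)
```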
In some embodiments, the Bot platform may receive one or more training corpora, input by the user, that are associated with the M second slots. For example, the Bot platform may receive one or more training corpora entered by the user in the "user says" setting box of the intent configuration interface 401 shown in fig. 4A. The one or more training corpora are possible user utterances for the first intent (e.g., buy hamburgers). For example, for the first intent "buy hamburgers", possible user utterances may include: "I want to buy a spicy chicken leg burger", "give me three cod burgers", "I want four cod burgers", or "spicy chicken leg burgers, three", etc. Then, the Bot platform may receive the real tags added by the user for each character in the training corpus, and add features for each character in the training corpus according to the M second slots. The real tag of a character indicates the slot name of the slot to which the character corresponds and the position of the character in that slot. The feature of a character indicates the combined-entity feature of the slot to which the character corresponds and the position of the character in that slot. The Bot platform may determine the features matching each character in the training corpus according to the M second slots, such as "@ quantity", "@ fruit type", "@ quantity @ fruit type" and "@ fruit type @ quantity", and then add the features for each character. Finally, the Bot platform may perform training according to the real tags added to the training corpora by the user, the features added to the training corpora by the Bot platform, and the M second slots configured in the Bot platform. In one implementation, the Bot platform may use a deep learning algorithm to perform training according to the added real tags and the added features.
In another implementation, the Bot platform may use a deep learning algorithm in combination with the CRF algorithm to perform training according to the added real tags and the added features. Specifically, by comparing the features that the Bot platform added for each character in a training corpus (e.g., "I want one apple and two pears") with the real tags that the user added for each character, the Bot platform may learn the correspondence between the added features and the real tags. Thus, after the Bot platform receives a corresponding user utterance (for example, "I want one apple and two pears"), the correct slots can be extracted.
Optionally, the one or more training corpora may all be input by the user. Alternatively, the one or more training corpora may include a training corpus input by the user and one or more training corpora obtained by the Bot platform by generalizing the training corpus input by the user. For example, assuming that the Bot platform receives the training corpus "I want to buy a spicy chicken leg burger" input by the user, the Bot platform can generalize it to obtain one or more similar training corpora, such as "I want to buy three crispy cod burgers" and "I want to buy five spicy chicken leg burgers".
Take the first slot "@ quantity @ fruit type" as an example. Then, the M second slots may include: "@ quantity", "@ fruit type", "@ quantity @ fruit type", and "@ fruit type @ quantity". These four second slots may be normalized to the "Composite Entity" entity type, i.e., the combined-entity feature of these four second slots is Composite Entity.
Take the training corpus "I want one apple and two pears" as an example. The user can add a real tag to each character in the training corpus, and the Bot platform can add features to each character in the training corpus. In both the real tags and the features, "O" indicates that the character does not belong to any second slot, "B" indicates that the character is the first character of a second slot, and "I" indicates that the character is a character other than the first in a second slot (such as the second or third character).
Specifically, if the Bot platform extracts the correct slots from "I want one apple and two pears", the extracted slots are "@ one @ apple" and "@ two @ pears", both of which are instances of "@ quantity @ fruit type". Based on this, the user can add the real tags shown in any one of fig. 10, 11, 12 or 13 to "I want one apple and two pears" according to the combined slot "@ quantity @ fruit type". For example, as shown in any one of fig. 10, 11, 12 or 13, the user may add the real tag "O" to the characters that precede the first slot, such as "I" and "want"; the real tag "B-Composite Slot" to the "one" character; the real tag "I-Composite Slot" to the measure-word character that follows it; the real tag "I-Composite Slot" to each of the two characters of "apple"; the real tag "B-Composite Slot" to the "two" character; the real tag "I-Composite Slot" to the measure-word character that follows it; and the real tag "I-Composite Slot" to the "pear" character.
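The O/B/I tagging scheme described above can be illustrated with a short Python sketch. The helper name `bio_tags`, the romanized character list (with "MW" standing for a measure-word character and "ap"/"ple" for the two characters of "apple"), and the span indices are all assumptions introduced for this illustration; they approximate the character-level corpus and are not taken from the figures.

```python
def bio_tags(chars, spans, name):
    """Tag each character: B-<name> starts a slot, I-<name> continues it, O is outside."""
    tags = ["O"] * len(chars)
    for start, end in spans:  # end index is exclusive
        tags[start] = f"B-{name}"
        for i in range(start + 1, end):
            tags[i] = f"I-{name}"
    return tags


# Hypothetical character-level rendering of "I want one apple and two pears".
chars = ["I", "want", "one", "MW", "ap", "ple", "two", "MW", "pear"]
# Two instances of the combined slot "@ quantity @ fruit type".
spans = [(2, 6), (6, 9)]

for ch, tag in zip(chars, bio_tags(chars, spans, "Composite Slot")):
    print(ch, tag)
```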
Some training corpora or user utterances may include multiple slots (which may be combined slots). In this case, the training corpus or user utterance is said to include a List slot. For example, the slots extracted from the training corpus "I want one apple and two pears" are "@ one @ apple" and "@ two @ pears", both of which correspond to the combined slot "@ quantity @ fruit type". That is, the training corpus "I want one apple and two pears" contains two instances of the slot "@ quantity @ fruit type", so it includes a List slot.
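A List slot can be pictured as the list of all tagged spans recovered from one utterance. The following Python sketch decodes B/I/O tags back into slot instances; the function name `extract_slots` and the romanized character list are assumptions for illustration only and do not reflect how the Bot platform actually decodes its model output.

```python
def extract_slots(chars, tags):
    """Collect every B-/I- tagged span; more than one span means a List slot."""
    slots, current = [], []
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            if current:
                slots.append(current)  # close the previous span
            current = [ch]
        elif tag.startswith("I-") and current:
            current.append(ch)
        else:
            if current:
                slots.append(current)
            current = []
    if current:
        slots.append(current)
    return slots


chars = ["I", "want", "one", "MW", "ap", "ple", "two", "MW", "pear"]
tags = ["O", "O", "B-S", "I-S", "I-S", "I-S", "B-S", "I-S", "I-S"]
print(extract_slots(chars, tags))
```

Here two spans are returned, one per instance of "@ quantity @ fruit type", which is the List slot situation described above.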
For example, the Bot platform may add features to each character in the training corpus using different combination manners.
In implementation (1), the Bot platform may add features to each character in the training corpus in a "correct combination" manner.
Specifically, if the Bot platform extracts the correct slots from "I want one apple and two pears", the extracted slots are "@ one @ apple" and "@ two @ pears", both of which are instances of "@ quantity @ fruit type". Based on this, the Bot platform can add feature a shown in fig. 10 to "I want one apple and two pears" according to the combined slot "@ quantity @ fruit type". For example, as shown in fig. 10, the Bot platform may add the feature "B-Composite Entity" to the "one" character; the feature "I-Composite Entity" to the measure-word character that follows it; the feature "I-Composite Entity" to each of the two characters of "apple"; the feature "B-Composite Entity" to the "two" character; the feature "I-Composite Entity" to the measure-word character that follows it; and the feature "I-Composite Entity" to the "pear" character.
In implementation (2), the Bot platform may add features to each character in the training corpus in a "correct combination" + "fine-grained combination" manner.
If the Bot platform extracts the correct slots from "I want one apple and two pears", the extracted slots are "@ one @ apple" and "@ two @ pears", both of which are instances of "@ quantity @ fruit type". Based on this, the Bot platform can add features to each character in "I want one apple and two pears" according to the combined slot "@ quantity @ fruit type", that is, in the "correct combination" manner. However, the entity types '@ one' and '@ apple' in "@ one @ apple" may each also be extracted as a slot, namely "@ one" and "@ apple"; likewise, the entity types '@ two' and '@ pears' in "@ two @ pears" may each be extracted as a slot, namely "@ two" and "@ pears". That is, the combined slots "@ one @ apple" and "@ two @ pears" can be divided at a finer granularity. For example, as shown in fig. 11, the Bot platform may add feature a and feature b to "I want one apple and two pears". Compared with fig. 10, as shown in fig. 11, the Bot platform may additionally add the feature "#B-Composite Entity" to the "one" character; the feature "#I-Composite Entity" to the measure-word character that follows it; the feature "#B-Composite Entity" to the first character of "apple"; the feature "#I-Composite Entity" to the second character of "apple"; the feature "#B-Composite Entity" to the "two" character; the feature "#I-Composite Entity" to the measure-word character that follows it; and the feature "#B-Composite Entity" to the "pear" character.
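The difference between the "correct combination" feature (feature a) and the "fine-grained combination" feature (feature b) can be sketched as two groupings of the same candidate entities. In the Python sketch below, the function name `composite_features`, the assumed character lengths of the four candidate entities (2, 2, 2 and 1), and the grouping lists are illustrative assumptions, not the Bot platform's implementation.

```python
def composite_features(entity_lengths, groups, prefix=""):
    """Emit one B/I feature per character for each group of candidate entities."""
    feats = []
    for group in groups:
        n_chars = sum(entity_lengths[i] for i in group)
        feats.append(prefix + "B-Composite Entity")
        feats.extend([prefix + "I-Composite Entity"] * (n_chars - 1))
    return feats


# Assumed character lengths of the candidate entities (one), (apple), (two), (pear).
lengths = [2, 2, 2, 1]

# Feature a: "correct combination", i.e. (one)(apple) and (two)(pear).
feature_a = composite_features(lengths, [[0, 1], [2, 3]])
# Feature b: "fine-grained combination", i.e. each candidate entity on its own.
feature_b = composite_features(lengths, [[0], [1], [2], [3]], prefix="#")

print(feature_a)
print(feature_b)
```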
In implementation (3), the Bot platform may add features to each character in the training corpus in an "exhaustive combination" + "fine-grained combination" manner.
In the training corpus "I want one apple and two pears", "@ apple @ two" matches "@ fruit type @ quantity" among the M second slots. Therefore, on top of the "correct combination" manner described above, the Bot platform may also add, to "I want one apple and two pears", the features corresponding to the combined slot "@ apple @ two".
As shown in fig. 12, the Bot platform may add feature a, feature b, and feature c to "I want one apple and two pears". Compared with fig. 11, as shown in fig. 12, the Bot platform may additionally add the feature "#B-Composite Entity" to the first character of "apple"; the feature "#I-Composite Entity" to the second character of "apple"; the feature "#B-Composite Entity" to the "two" character; the feature "#I-Composite Entity" to the measure-word character that follows it; and the feature "#B-Composite Entity" to the "pear" character.
In implementation (4), the Bot platform may add features to each character in the training corpus in an "exhaustive combination" manner. For example, as shown in fig. 13, the Bot platform may add feature a and feature c to "I want one apple and two pears". For details of feature a and feature c, refer to the description in the foregoing implementations; they are not repeated here.
It should be noted that, because the Bot platform adds features to the training corpus in the "exhaustive combination" manner, it can learn all the combinations of the characters in the training corpus. The "fine-grained combination" in fig. 12 introduces unnecessary uncertainty about how characters are combined and affects the accuracy of the extraction result, whereas the ambiguous combined feature of the combined slot "@ apple @ two" corresponding to feature c shown in fig. 13 can enhance the learning effect of the Bot platform and improve the accuracy of its extraction result. Therefore, when the Bot platform adds features to each character in the training corpus using the "exhaustive combination" manner of implementation (4), the number of ambiguous enumerations after the characters are combined can be reduced and the accuracy of slot extraction can be improved.
In some scenarios, some words in the user utterance may be ambiguous. In particular, some words in the user utterance may belong to multiple entity types. For example, the "apple" in the user utterance "I want to buy apple" may be a product name or a fruit type. That is, an entity in the user utterance may itself be ambiguous. To improve the accuracy of slot extraction, during training the user may also add, in the Bot platform, training corpora that include ambiguous entities, and add entity tags to the ambiguous entities in those training corpora, so that the Bot platform can learn the ability to identify ambiguous entities.
For example, the training corpus may be "I want to buy apple, I want to buy an apple". Features and entity tags can be added for each character in "I want to buy apple, I want to buy an apple". The entity tag indicates that the corresponding character is a character in a product name. For example, as shown in fig. 14, the user may add the entity tag "B-product Name" to the first character of "apple" and the entity tag "I-product Name" to the second character of "apple".
It can be understood that adding training corpora that include entity ambiguity during model training improves the accuracy of slot extraction when the Bot platform extracts slots from user utterances that include entity ambiguity.
In another implementation, the Bot platform may use a single-point classification algorithm (i.e., a single-point classification model) and a probability-based dynamic programming algorithm to perform training according to the one or more training corpora input by the user and the M second slots, so that the Bot platform has the capability of extracting the M second slots from user utterances.
For example, the Bot platform may use the probability-based dynamic programming algorithm to scan a training corpus from right to left and cut it into one or more candidate entities at different positions, so as to obtain the number of normalized combined entities after cutting. Then, the Bot platform uses the single-point classification algorithm to obtain, according to the M second slots, the confidence of the candidate entity corresponding to each position. Finally, the Bot platform determines the cutting mode of the training corpus according to the number of normalized combined entities corresponding to each position and the confidence of the candidate entities.
In some embodiments, the Bot platform may use the probability-based dynamic programming algorithm to perform a minimum cut on the training corpus to obtain multiple combined entities and their number. In this embodiment of the application, the goal of the minimum cut is to combine the entities in the training corpus to the greatest extent. Take "one apple and two pears" as an example. It includes four candidate entities: (one), (apple), (two) and (pears). The Bot platform can cut "one apple and two pears" in several ways, such as "(one)(apple)" "(two)(pears)", or "(one)" "(apple)" "(two)(pears)". In the first cutting mode, two combined entities, "(one)(apple)" and "(two)(pears)", are obtained, so the number of normalized combined entities is 2. In the second cutting mode, one combined entity, "(two)(pears)", is obtained, so the number of normalized combined entities is 1. Because the first cutting mode yields more normalized combined entities than the second, the first cutting mode is superior to the second.
The dynamic programming process in this embodiment of the application aims to minimize the number of combined entities obtained after cutting, and proceeds in two stages:
Max comparison stage:
F(i+1)' = Max{ p(i-1)*F(i-1)+1, p(i-1)*F(i-1),
               ...,
               p(i-k)*F(i-k)+1, p(i-k)*F(i-k) },
               k = 1, 2, 3, ..., Ki
F(i+1) assignment stage:
F(i+1) = F(i+1)' / p, where p is the probability value of the term selected in the Max comparison
Here Ki is the "maximum word length + 1" in the list of entities extending to the left of the i-th position. For example, for the "pear" character in "one apple and two pears", when the index i starts from 0, the index of "pear" is 6. When the Bot platform calculates F(6), it searches for candidate entities leftward from position i = 6; the candidate entities are "pear" and "two pears". For both candidate entities, it must also be determined whether the candidate entity corresponds to the target slot, because one entity type may correspond to multiple slots. For example, a location may correspond to both an origin and a destination. This generally also holds for "normalized" combined entity types, so the certainty of the candidate entities themselves needs to be considered in the dynamic programming process. In this embodiment of the application, a single-point classification algorithm may be used to calculate the confidence p(i-1) of a candidate entity, where p(i-1) measures the candidate entity corresponding to F(i-1).
F(i) is the number of normalized combined entities formed by "joining" candidate entities after cutting up to the i-th position. A normalized combined entity formed by "joining" includes at least two candidate entities. For example, for "one apple and two pears" under this first (Max) meaning of F(i): with the cut "(one)" "(apple)(two)" "(pears)", the number of normalized combined entities formed by "joining" is 1, and the normalized combined entity is "(apple)(two)"; with the cut "(one)(apple)" "(two)(pears)", the number of normalized combined entities formed by "joining" is 2, and the normalized combined entities are "(one)(apple)" and "(two)(pears)".
The at least two candidate entities included in a normalized combined entity formed by "joining" may correspond to different entity types. For example, the normalized combined entity "(one)(apple)" includes the candidate entity (one) and the candidate entity (apple). The candidate entity (one) corresponds to the entity type '@ quantity', and the candidate entity (apple) corresponds to the entity type '@ fruit type'.
Under the first (Max) meaning of F(i), the Bot platform may cut the training corpus using the cutting mode for which the value of F(i) is largest. For example, cutting "one apple and two pears" as "(one)(apple)" "(two)(pears)" yields the largest number of normalized combined entities (namely 2); thus, the Bot platform may cut "one apple and two pears" in this way to obtain the normalized combined entities "(one)(apple)" and "(two)(pears)". It should be noted that the above formula takes the Max mode as an example.
If Max in the formula is changed to Min, F(i) can be defined as the number of all combined entities after cutting, where a combined entity may include one or more candidate entities. For example, for "one apple and two pears" under this second (Min) meaning of F(i): with the cut "(one)" "(apple)(two)" "(pears)", the number of combined entities is 3, and the combined entities are "(one)", "(apple)(two)" and "(pears)"; with the cut "(one)(apple)" "(two)(pears)", the number of combined entities is 2, and the combined entities are "(one)(apple)" and "(two)(pears)".
Under the second (Min) meaning of F(i), the Bot platform may cut the training corpus using the cutting mode for which the value of F(i) is smallest. For example, cutting "one apple and two pears" as "(one)(apple)" "(two)(pears)" yields the smallest number of combined entities (namely 2); thus, the Bot platform may cut "one apple and two pears" in this way to obtain the combined entities "(one)(apple)" and "(two)(pears)".
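The two meanings of F(i) can be checked on the two cutting modes discussed above with a short Python sketch. The function names `count_normalized` and `count_all`, and the representation of a cut as a list of groups of candidate entities, are assumptions made for this illustration.

```python
def count_normalized(cut):
    """Max meaning: count groups formed by joining at least two candidate entities."""
    return sum(1 for group in cut if len(group) >= 2)


def count_all(cut):
    """Min meaning: count every group, including single candidate entities."""
    return len(cut)


cut_1 = [["one"], ["apple", "two"], ["pears"]]    # "(one)" "(apple)(two)" "(pears)"
cut_2 = [["one", "apple"], ["two", "pears"]]      # "(one)(apple)" "(two)(pears)"

print(count_normalized(cut_1), count_all(cut_1))  # 1 3
print(count_normalized(cut_2), count_all(cut_2))  # 2 2
```

Under either meaning the second cut is preferred: it has the larger count in the Max sense and the smaller count in the Min sense.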
In the dynamic programming process, for the i-th position there are two cases for each candidate entity to its left: in case (1) the entity is cut, expressed as +1 in the formula; in case (2) the entity is not cut, expressed as +0 in the formula. The next position in the leftward recursion differs between the two cases. In case (1), the next position in the leftward recursion is one position to the left of the starting index of the candidate entity. For example, in "one apple and two pears", when recursing leftward from the 6th position (i.e., "pear"), the next position is the measure-word character immediately to the left of the starting index of the candidate entity "pear", and the formula is expressed as +1. In case (2), the next position in the leftward recursion is one position to the left of the last index of the candidate entity. For example, in "one apple and two pears", when recursing leftward from the 3rd position (i.e., the second character of "apple"), the next position is the measure-word character immediately to the left of the starting index of the candidate entity "apple", and the formula is expressed as +0.
The purpose of F(i+1) = F(i+1)'/p is to divide the maximum value F(i+1)' by the probability p that was used in the Max{} comparison, restoring the counting meaning of F. In this way F(i+1) retains its numerical meaning in the next iteration, and the probability only takes effect in the Max{} comparison. The initial value is F(0) = 0.
In the above probability-based dynamic programming framework, the probability p(i) is introduced to measure the uncertainty of a candidate entity: for the i-th position, the confidence of the candidate entity is predicted. In this embodiment an SVM model is used to predict the confidence of an entity; any other model may be used instead, such as a single-point classification model (maximum entropy, fastText, CNN, etc.) or a language model (n-gram, RNN, etc.). The advantages of a single-point classification model are that, in practical engineering, prediction is fast and effective training requires only a small number of samples.
However, a single-point model has the disadvantage that relationships within the sequence are not considered. To solve this problem while keeping the advantages of fast prediction and few training samples, the probability-based dynamic programming method is proposed. With this method, the prediction results of the single-point model can be combined with a constraint applied during decoding (maximum combination), achieving the effect of a sequence labeling model. In this framework, the single-point classification model can be extended to a model that fuses sentence-level sequence context features, such as a BERT-based single-point classification model. Although such a model still makes a single-point prediction for each candidate entity, it is a bidirectional language model that takes sentence-level context information into account when predicting the confidence of a candidate entity.
For example, in this embodiment of the application, the calculation process is described by taking the calculation of F(6) in "one apple and two pears" as an example:
Step 1: Scan leftward to generate the candidate entities "pear" and "two pears", and determine k in F(i-k). Each candidate entity contributes a term with +1 (the candidate entity is cut) and a term without +1 (the candidate entity is not cut). The character immediately to the left of "pear" is the measure-word character at index 5, so k = 1, because the candidate entity "pear" that follows it is one character long. The character immediately to the left of "two pears" is the second character of "apple" at index 3, so k = 3, because the candidate entity "two pears" that follows it is three characters long.
Step 2: Substitute the determined k to obtain F(6) = Max{F(5)+1, F(3)+1, F(5)}, and calculate the corresponding probabilities p("pear") and p("two pears"), i.e., p(5) and p(3). Each is the probability of the candidate entity whose operation leads the leftward recursion to position i-k.
Step 3: Evaluate the Max by combining the calculated probability values with the terms F(5)+1, F(3)+1 and F(5) obtained in the previous step, thereby obtaining F(6).
In this way, F(6) is finally obtained by iterative computation from F(5), F(4), F(3), F(2) and F(1). By recording the operation selected by each Max comparison during dynamic programming, whether to cut and where to cut (i.e., the cutting mode) can be recovered by backtracking.
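The general idea of choosing a cutting mode by dynamic programming with entity confidences can be illustrated with a deliberately simplified Python sketch. It is not the recurrence of this embodiment: it works on candidate entities rather than characters, scans left to right, and adds a confidence to the score for each joined group instead of multiplying and then dividing by p. The function name `best_cut`, the `allowed` set, and the confidence values are all hypothetical. The sketch also assumes every single entity type is itself an allowed slot, so a valid cut always exists.

```python
from itertools import permutations


def best_cut(entities, allowed, confidence):
    """entities: list of (text, entity_type). Returns (score, cut)."""
    n = len(entities)
    best = [(0.0, [])] + [None] * n  # best[i] covers entities[:i]
    for i in range(1, n + 1):
        candidates = []
        for k in range(1, i + 1):  # try a last group of k entities
            group = entities[i - k:i]
            types = tuple(t for _, t in group)
            if types not in allowed:
                continue
            prev_score, prev_cut = best[i - k]
            # Only groups joining at least two entities add to the score.
            gain = confidence(types) if k >= 2 else 0.0
            candidates.append((prev_score + gain, prev_cut + [group]))
        best[i] = max(candidates, key=lambda c: c[0])
    return best[n]


types = ["@quantity", "@fruit type"]
allowed = {p for k in (1, 2) for p in permutations(types, k)}
# Hypothetical confidences standing in for a single-point classifier.
scores = {("@quantity", "@fruit type"): 0.9, ("@fruit type", "@quantity"): 0.4}

entities = [("one", "@quantity"), ("apple", "@fruit type"),
            ("two", "@quantity"), ("pears", "@fruit type")]
score, cut = best_cut(entities, allowed, lambda t: scores[t])
print([[text for text, _ in group] for group in cut])
```

With these assumed confidences the sketch selects the cut "(one)(apple)" "(two)(pears)", which matches the preferred cutting mode in the example above.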
As shown in fig. 8A, the method of the embodiment of the present application may include: a slot creation and configuration process and a model training process.
The "slot creation and configuration" flow shown in fig. 8A corresponds to S701 shown in fig. 7 and to step 1 shown in fig. 8B: user interface configuration.
As shown in fig. 8B, in step 1, the Bot platform may perform user interface configuration (i.e., execute S701). For example, the user may perform the user interface configuration at the slot configuration interface (i.e., the first interface) shown in fig. 5 or 6. Specifically, step 1 may include: the user selecting the entity types in the combined slot and configuring the combined slot. For example, the user may configure a plurality of entity types in the "entity type" setting box 505 shown in fig. 5, which indicates that the user selects "combination slot". The user may configure the entity types in the combination slot "combination slot 1" in the "entity type" setting box 505 shown in fig. 5, such as the '@ number' and '@ hamburger type' shown in fig. 6.
The "model training" flow shown in fig. 8A corresponds to S702-S703 shown in fig. 7 and to step 2 shown in fig. 8B: model training.
In the "model training" process shown in fig. 8A or step 2 shown in fig. 8B, the Bot platform may recombine the N entity types to obtain the M second slots. Also, the Bot platform may train on one or more training corpora using a deep learning algorithm (e.g., the LSTM algorithm shown in fig. 8A or 8B) in combination with the CRF algorithm, or using the SVM algorithm and the probability-based dynamic programming algorithm shown in fig. 8A or 8B.
As shown in fig. 8A, the method of the embodiment of the present application may further include a "model prediction" process. The "model prediction" flow shown in fig. 8A corresponds to step 3 shown in fig. 8B: model prediction.
After the "model training" process, the Bot platform may receive a user utterance (e.g., voice information or text information input by the user) and understand the intention expressed by the user utterance; after the intention is determined, the Bot platform may perform slot extraction on the user utterance. If the user utterance received by the Bot platform is voice information, the Bot platform may convert the voice information into text information, and then perform intention understanding and slot extraction on the text information.
Illustratively, after completing the AI model training, the Bot platform may receive a user utterance entered by the user in the "user utterance" setting box shown in fig. 4A. Then, the Bot platform can perform intention understanding and slot extraction on the user utterance entered in the "user utterance" setting box. Finally, the Bot platform may receive a confirmation operation of the user on the extracted slot; that is, the Bot platform may confirm the extracted slot by interacting with the user. As shown in fig. 8B, step 3 may include "slot extraction for user utterance" and "user interaction confirmation".
After the "model prediction" process, the Bot platform can be put into formal use. The way the Bot platform is formally used is similar to the way it is used in the "model prediction" process, and details are not repeated here in the embodiments of the present application. It should be noted that the user in the above "slot creation and configuration" process, "model training" process, and "model prediction" process may be referred to as a first user, and the user during formal use of the Bot platform may be referred to as a second user.
In some embodiments, the first user may be a developer or a tester of the Bot platform. The developer or tester can configure "skill", "intention" and "slot" for the Bot platform according to the requirements of the owner of the Bot platform, and perform model training. The owner of the Bot platform may be, for example, a staff member of a restaurant, a convenience store, or a shopping mall.
In other embodiments, the first user may be an owner of the Bot platform (e.g., a restaurant, convenience store, or store employee, etc.). Staff in restaurants, convenience stores or shopping malls can configure 'skill', 'intention' and 'slot' for the Bot platform according to requirements, and perform model training.
In some embodiments, the second user may be an owner of the Bot platform, such as a staff member of a restaurant, convenience store, or store. In this embodiment, a user utterance (e.g., voice information or text information) is input to the Bot platform by the owner of the Bot platform (e.g., a restaurant, convenience store, or store clerk). The Bot platform can receive the user utterance and understand the intention expressed by the user utterance; after the intention is determined, the Bot platform may perform slot extraction on the user utterance; finally, the Bot platform can give feedback according to the extracted slot.
In other embodiments, the second user may be a consumer of the restaurant, convenience store, or store. In this embodiment, a user utterance (e.g., voice information or text information) is input to the Bot platform by the consumer. The Bot platform can receive the user utterance and understand the intention expressed by the user utterance; after the intention is determined, the Bot platform may perform slot extraction on the user utterance; finally, the Bot platform can give feedback according to the extracted slot, so that self-service consumption by the consumer is realized.
In the method for configuring a combined slot in a man-machine conversation system provided by the embodiments of the present application, the plurality of second slots obtained by recombination include slots obtained by arranging one or more of the plurality of entity types in any order. Therefore, even if the order of the entity types changes in the user utterance, or the user utterance includes only one of the entity types, the Bot platform can extract the corresponding slot and reply to the user utterance.
Moreover, for complex information in a user utterance, such as "one apple and two pears", the Bot platform can still correctly extract the combined slots "one apple" and "two pears". In this way, the Bot platform's ability to understand complex user information is improved, and so is the user's experience of using the Bot platform.
For example, the above "slot creation and configuration" flow, "model training" flow, and "model prediction" flow are illustrated here by taking the following as an example: the "skill" is shopping, the "intention" is "buy fruit", the combined slot configured by the user on the Bot platform is "@ quantity @ fruit type", and the training corpus is "I want to buy one apple and two pears".
Firstly, after receiving a combined slot position "@ quantity @ fruit type" configured by a first user, the Bot platform can recombine two entity types '@ quantity' and '@ fruit type' in the combined slot position "@ quantity @ fruit type" to obtain four slot positions: "@ quantity", "@ fruit type", "@ quantity @ fruit type", and "@ fruit type @ quantity".
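The recombination step can be sketched in a few lines of Python. This is only an illustration under the stated reading of the text (every ordered arrangement of k of the N entity types, for k from 1 to N); the function name `recombine` and the space-joined string form of a slot are assumptions, not part of the embodiment.

```python
from itertools import permutations


def recombine(entity_types):
    """Return every ordered arrangement of k entity types, for k = 1..N."""
    slots = []
    for k in range(1, len(entity_types) + 1):
        for arrangement in permutations(entity_types, k):
            slots.append(" ".join(arrangement))
    return slots


second_slots = recombine(["@quantity", "@fruit type"])
print(second_slots)
```

For N = 2 this yields the four slots listed above. Under the same reading, the number of second slots is M = N!/(N-1)! + N!/(N-2)! + ... + N!/0!, for example 15 when N = 3.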
Then, the Bot platform can train on one or more training corpora, such as the training corpus "I want to buy one apple and two pears", according to the slots "@ quantity", "@ fruit type", "@ quantity @ fruit type" and "@ fruit type @ quantity", so that the Bot platform has the ability to extract the slots in the user utterance "I want to buy one apple and two pears".
For example, the Bot platform trains on one or more training corpora using the deep learning algorithm. As shown in fig. 10, the Bot platform may receive the real label added by the user for each word in the training corpus, and add the feature a to each word in the training corpus in the manner of "correct combination" in the above implementation (1). Thus, the Bot platform can learn the ability to add features corresponding to the real labels added by the user for the training corpus "I want to buy one apple and two pears". For example, as shown in fig. 10, the Bot platform may add the feature "B-CompositeEntity" to the first character of "one apple" and the feature "I-CompositeEntity" to each of its remaining characters; likewise, it may add the feature "B-CompositeEntity" to the first character of "two pears" and the feature "I-CompositeEntity" to each of its remaining characters. That is, the Bot platform can learn that the correct slots in the training corpus "I want to buy one apple and two pears" are "@ one @ apple" and "@ two @ pears". Therefore, after the Bot platform receives the user utterance "I want one apple and two pears", it can extract the slots "@ one @ apple" and "@ two @ pears", that is, the correct slots for that utterance.
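The feature-adding step can be sketched as follows. This is a hedged illustration, not the embodiment's code: the function name `bio_features` is hypothetical, the tags are written here as "B-CompositeEntity" and "I-CompositeEntity" following the B-/I- features described above, the "O" tag for tokens outside any combined entity is the usual BIO convention and an assumption here, and word-level tokens are used in place of the characters of the original utterance.

```python
def bio_features(tokens, entity_spans, label="CompositeEntity"):
    """Tag tokens with B-/I-<label> inside each entity span and "O" elsewhere.

    entity_spans holds (start, end) token indices, with end exclusive.
    """
    tags = ["O"] * len(tokens)
    for start, end in entity_spans:
        tags[start] = f"B-{label}"          # first token of the combined entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags


tokens = ["I", "want", "to", "buy", "one", "apple", "and", "two", "pears"]
spans = [(4, 6), (7, 9)]  # "one apple" and "two pears"
features = bio_features(tokens, spans)
for token, tag in zip(tokens, features):
    print(token, tag)
```

A sequence model such as an LSTM, optionally with a CRF layer, would then be trained on pairs of tokens and tags of this kind.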
It is understood that, in order to implement the above functions, the Bot platform includes corresponding hardware structures and/or software modules for executing each function. Those of skill in the art will readily appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or as a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the embodiments of the present application.
In the embodiment of the present application, the Bot platform may be divided into functional modules according to the method example, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. It should be noted that, in the embodiment of the present application, the division of the module is schematic, and is only one logic function division, and there may be another division manner in actual implementation.
The embodiment of the application also provides a Bot platform. As shown in fig. 15, the Bot platform 1500 may include: slot configuration module 1501, model training module 1502, and model prediction module 1503. The slot configuration module 1501 is configured to execute the slot creation and configuration process, the model training module 1502 is configured to execute the model training process, and the model prediction module 1503 is configured to execute the model prediction process.
It is understood that part of the functions of the slot configuration module 1501, as well as the functions of the model training module 1502 and the model prediction module 1503 may be implemented in the processor 210 shown in fig. 2. The functions of the slot configuration module 1501 for displaying the first interface may be implemented in the display 294 shown in fig. 2.
An embodiment of the present application further provides a computer storage medium, where the computer storage medium includes computer instructions, and when the computer instructions are run on the Bot platform, the Bot platform is enabled to execute each function executed by the Bot platform in the description of the foregoing embodiment.
Embodiments of the present application further provide a computer program product, which, when running on a computer, causes the computer to execute the functions performed by the Bot platform in the description of the above embodiments.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions. For the specific working processes of the system, the apparatus and the unit described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described here again.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division of the modules or units is only one logical division, and there may be other divisions in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling, direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, each functional unit in the embodiments of the present embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present embodiment essentially, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to execute all or part of the steps of the method described in the embodiments. The aforementioned storage medium includes: flash memory, removable hard drive, read only memory, random access memory, magnetic or optical disk, and the like.
The above descriptions are only specific embodiments of the present embodiment, but the scope of the present embodiment is not limited thereto, and any changes or substitutions within the technical scope of the present embodiment should be covered by the scope of the present embodiment. Therefore, the protection scope of the present embodiment shall be subject to the protection scope of the claims.

Claims (22)

1. A method for configuring a combined slot in a man-machine conversation system, wherein the method is applied to a robot (Bot) platform and comprises:
the Bot platform receives a first slot configured by a user on a first interface, wherein the first slot is a combined slot that comprises N entity types, N ≥ 2, and N is a positive integer; the N entity types are arranged in the first slot in an order set by the user; and the first interface is used for setting a slot for a first intention in a first skill of the Bot platform;
the Bot platform recombines the N entity types to obtain M second slots, wherein the M second slots are recombined normalized combined entity types, the M second slots comprise slots obtained by arranging k of the N entity types in any order, and k ∈ {1, 2, ..., N}; and
the Bot platform trains on one or more training corpora according to the one or more training corpora and the M second slots, so that the Bot platform has the capability of extracting the M second slots from a user utterance after understanding the intention expressed by the user utterance.
2. The method of claim 1,
Figure FDA0003150890880000011
3. The method according to claim 1 or 2, wherein the method further comprises:
the Bot platform receives a slot name configured by the user on the first interface, wherein the slot name is used for identifying the first slot.
4. The method according to claim 3, wherein after the Bot platform recombines the N entity types to obtain the M second slots, and before the Bot platform trains on the one or more training corpora according to the one or more training corpora and the M second slots, the method further comprises:
the Bot platform stores the slot name in association with the M second slots, and marks a combined-entity feature for the M second slots corresponding to the slot name, wherein the combined-entity feature is used for indicating that the M second slots are recombined combined entity types.
5. The method of claim 4, wherein the Bot platform trains the one or more corpora according to the one or more corpora and the M second slots, including:
the Bot platform receives a real label added by a user for each word in the one or more training corpuses;
the Bot platform adds features to each word in the one or more corpus according to the M second slots;
the Bot platform trains the one or more training corpora according to the real labels and the characteristics by adopting a deep learning algorithm; or the Bot platform trains the one or more training corpora according to the real labels and the features by adopting a deep learning algorithm in combination with a conditional random field CRF algorithm.
6. The method of claim 5, wherein the deep learning algorithm comprises a long short term memory network (LSTM) algorithm.
7. The method of claim 1 or 2, wherein the Bot platform trains the one or more corpora according to the one or more corpora and the M second slots, including:
and the Bot platform trains the one or more training corpuses according to the one or more training corpuses and the M second slot positions by adopting a single-point classification algorithm and a probability-based dynamic planning algorithm.
8. The method of claim 7, wherein the single point classification algorithm comprises at least any one of a Support Vector Machine (SVM) model, a maximum entropy model, a fast text classification algorithm model, a Convolutional Neural Network (CNN) model, an n-gram model, or a Recurrent Neural Network (RNN) model.
9. The method of claim 7, wherein the single point classification algorithm is a BERT model, and wherein the BERT model is a bi-directional language model.
10. The method of claim 7, wherein the Bot platform trains the one or more corpora according to the one or more corpora and the M second slots using a single point classification algorithm and a probability-based dynamic programming algorithm, including:
for each corpus of the one or more corpuses,
the Bot platform, using the probability-based dynamic programming algorithm, scans from right to left and cuts the training corpus into one or more candidate entities at different positions, to obtain the number of cut normalized combined entities and the normalized combined entities;
the Bot platform, using the single-point classification algorithm, obtains the confidence of the candidate entity corresponding to each position according to the M second slots; and
the Bot platform determines a cutting mode of the training corpus according to the number of normalized combined entities corresponding to each position and the confidence of the candidate entities.
11. The method of claim 1 or 2, wherein after the Bot platform reassembles the N entity types, resulting in M second slots, the method further comprises:
the Bot platform displays a second interface, the second interface comprising a training start button, the training start button being used for triggering the Bot platform to train the one or more training corpora;
in response to a click operation by the user on the start training button, the Bot platform trains on the one or more training corpora according to the one or more training corpora and the M second slots, so that the Bot platform has the capability of extracting the M second slots from a user utterance.
12. A robot (Bot) platform, comprising: a processor, a memory, and a display, wherein the memory, the display, and the processor are coupled; the memory is configured to store computer program code comprising computer instructions; and when the computer instructions are executed by the processor, the Bot platform performs the following:
the display is used for displaying a first interface, and the first interface is used for setting a slot for a first intention in a first skill of the Bot platform;
the processor is used for receiving a first slot configured by a user on the first interface displayed by the display, wherein the first slot is a combined slot that comprises N entity types, N ≥ 2, and N is a positive integer; and the N entity types are arranged in the first slot in an order set by the user;
the processor is further configured to recombine the N entity types to obtain M second slots, wherein the M second slots are recombined normalized combined entity types, the M second slots comprise slots obtained by arranging k of the N entity types in any order, and k ∈ {1, 2, ..., N}; and
the processor is further configured to train on one or more training corpora according to the one or more training corpora and the M second slots, so that the Bot platform has the capability of extracting the M second slots from a user utterance after understanding the intention expressed by the user utterance.
13. The Bot platform of claim 12,
Figure FDA0003150890880000021
14. The Bot platform of claim 12 or 13, wherein the processor is further configured to receive a slot name configured by the user on the first interface displayed by the display, the slot name identifying the first slot.
15. The Bot platform according to claim 14, wherein the processor is further configured to: after recombining the N entity types to obtain the M second slots, and before training on the one or more training corpora according to the one or more training corpora and the M second slots, store the slot name in association with the M second slots in the memory, and mark a combined-entity feature for the M second slots corresponding to the slot name, wherein the combined-entity feature is used to indicate that the M second slots are recombined combined entity types.
16. The Bot platform of claim 15, wherein the processor, configured to train one or more corpora according to the one or more corpora and the M second slots, includes:
the processor receives a real label added by a user for each word in the one or more training corpuses; adding features to each word in the one or more corpus according to the M second slots;
the processor is further configured to train the one or more training corpora according to the real labels and the features by using a deep learning algorithm; or training the one or more training corpora according to the real labels and the characteristics by adopting a deep learning algorithm and a conditional random field CRF algorithm.
17. The Bot platform of claim 16, in which the deep learning algorithm includes a long short term memory network (LSTM) algorithm.
18. The Bot platform according to claim 12 or 13, wherein the processor is configured to train the one or more corpora according to the one or more corpora and the M second slots, including:
the processor is specifically configured to train the one or more corpus according to the one or more corpus and the M second slots by using a single-point classification algorithm and a probability-based dynamic programming algorithm.
19. The Bot platform of claim 18, wherein the single point classification algorithm includes at least any one of a Support Vector Machine (SVM) model, a maximum entropy model, a fast text classification algorithm model, a Convolutional Neural Network (CNN) model, an n-gram model, or a Recurrent Neural Network (RNN) model.
20. The Bot platform of claim 18, in which the single point classification algorithm is a BERT model, the BERT model being a bi-directional language model.
21. The Bot platform of claim 18, wherein the processor, using the single point classification algorithm and the probability-based dynamic programming algorithm, is configured to train the one or more corpora according to the one or more corpora and the M second slots, including:
the processor is specifically configured to: for each training corpus of the one or more training corpora, scan from right to left using the probability-based dynamic programming algorithm and cut the training corpus into one or more candidate entities at different positions, to obtain the number of cut normalized combined entities and the normalized combined entities; obtain, using the single-point classification algorithm, the confidence of the candidate entity corresponding to each position according to the M second slots; and determine a cutting mode of the training corpus according to the number of normalized combined entities corresponding to each position and the confidence of the candidate entities.
22. The Bot platform according to claim 12 or 13, wherein the display is further configured to display a second interface after the processor has recombined the N entity types to obtain the M second slots, the second interface including a start training button, the start training button being configured to trigger the Bot platform to train the one or more training corpora;
the processor is further configured to, in response to a click operation by the user on the start training button displayed by the display, train on the one or more training corpora according to the one or more training corpora and the M second slots, so that the Bot platform has the capability of extracting the M second slots from a user utterance.
CN201910330314.0A 2019-04-23 2019-04-23 Method and device for configuring combined slot in man-machine conversation system Active CN110209446B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910330314.0A CN110209446B (en) 2019-04-23 2019-04-23 Method and device for configuring combined slot in man-machine conversation system
PCT/CN2020/085234 WO2020216134A1 (en) 2019-04-23 2020-04-17 Configuration method and device for combination slots in human-machine dialogue system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910330314.0A CN110209446B (en) 2019-04-23 2019-04-23 Method and device for configuring combined slot in man-machine conversation system

Publications (2)

Publication Number Publication Date
CN110209446A CN110209446A (en) 2019-09-06
CN110209446B true CN110209446B (en) 2021-10-01

Family

ID=67786173

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910330314.0A Active CN110209446B (en) 2019-04-23 2019-04-23 Method and device for configuring combined slot in man-machine conversation system

Country Status (2)

Country Link
CN (1) CN110209446B (en)
WO (1) WO2020216134A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110209446B (en) * 2019-04-23 2021-10-01 华为技术有限公司 Method and device for configuring combined slot in man-machine conversation system
CN111143561B (en) * 2019-12-26 2023-04-07 北京百度网讯科技有限公司 Intention recognition model training method and device and electronic equipment
CN111274823B (en) * 2020-01-06 2021-08-27 科大讯飞(苏州)科技有限公司 Text semantic understanding method and related device
CN113806469B (en) * 2020-06-12 2024-06-11 华为技术有限公司 Statement intention recognition method and terminal equipment
CN112767942B (en) * 2020-12-31 2023-04-07 北京云迹科技股份有限公司 Speech recognition engine adaptation method and device, electronic equipment and storage medium
CN113326367B (en) * 2021-06-30 2023-06-16 四川启睿克科技有限公司 Task type dialogue method and system based on end-to-end text generation
CN114881046B (en) * 2022-05-23 2023-07-25 平安科技(深圳)有限公司 Training method and device for task session model, computer equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10191999B2 (en) * 2014-04-30 2019-01-29 Microsoft Technology Licensing, Llc Transferring information across language understanding model domains
CN107767861B (en) * 2016-08-22 2021-07-02 科大讯飞股份有限公司 Voice awakening method and system and intelligent terminal
CN107247706B (en) * 2017-06-16 2021-06-25 中国电子技术标准化研究院 Text sentence-breaking model establishing method, sentence-breaking method, device and computer equipment
CN108549656B (en) * 2018-03-09 2022-06-28 北京百度网讯科技有限公司 Statement analysis method and device, computer equipment and readable medium
CN108959257B (en) * 2018-06-29 2022-11-22 北京百度网讯科技有限公司 Natural language parsing method, device, server and storage medium
CN109325103B (en) * 2018-10-19 2020-12-04 北京大学 Dynamic identifier representation method, device and system for sequence learning
CN110209446B (en) * 2019-04-23 2021-10-01 华为技术有限公司 Method and device for configuring combined slot in man-machine conversation system

Also Published As

Publication number Publication date
WO2020216134A1 (en) 2020-10-29
CN110209446A (en) 2019-09-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant