CN114020898B

CN114020898B - Man-machine automatic dialogue method, device, electronic equipment and storage medium

Info

Publication number: CN114020898B
Application number: CN202210014358.4A
Authority: CN
Inventors: 肖伟翼; 侯永华; 胡德意; 李幼萍; 朱富昆
Original assignee: Workway Shenzhen Information Technology Co ltd
Current assignee: Workway Shenzhen Information Technology Co ltd
Priority date: 2022-01-07
Filing date: 2022-01-07
Publication date: 2022-04-19
Anticipated expiration: 2042-01-07
Also published as: CN114020898A

Abstract

The application discloses a man-machine automatic dialogue method, a device, electronic equipment and a storage medium, wherein the method comprises the following steps: when an input text obtained based on user operation is acquired, determining a target skill triggered by the input text; acquiring slot extraction pattern strings matched with the target skill to form a candidate pattern set; for each candidate pattern string, determining a target slot position name corresponding to an entity based on a target slot position attribute of the entity in the candidate pattern string, obtaining a target character string conforming to the grammar of the regular expression based on the target slot position name and the entity words contained in the entity, and replacing the entity in the candidate pattern string with the target character string to obtain the regular expression corresponding to the candidate pattern string; matching the obtained regular expression with the input text, and extracting slot position key value pairs from the input text based on the matching result; performing slot filling processing on the slot positions corresponding to the target skill based on the slot position key value pairs extracted from the input text; and acquiring reply content corresponding to the input text based on the slot filling result of the target skill.

Description

Man-machine automatic dialogue method, device, electronic equipment and storage medium

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for human-machine automatic conversation, an electronic device, and a storage medium.

Background

A man-machine automatic dialogue system is a system that understands a person's language and intent and gives a reasonable answer. Before the system answers or responds to certain user intents, information relevant to the intent must be clarified. For example, for the intent of inquiring about weather conditions, the system first needs to obtain two pieces of slot position information, time and place, so as to present the weather conditions for that time and place. Regular expressions are generally used to extract the slot position information of an intent from the man-machine dialogue content. However, regular expressions are relatively fixed: all situations that each slot position may involve must be exhausted as far as possible to handle dialogues in various scenarios, and the regular expressions corresponding to each intent and its slot positions must be maintained separately. Regular expressions therefore have low flexibility and high maintenance cost, which makes them unsuitable for large-scale man-machine dialogue systems.

Disclosure of Invention

The embodiment of the application provides a man-machine automatic conversation method, a man-machine automatic conversation device, electronic equipment and a storage medium, so that the flexibility of a conversation system is improved, and the maintenance cost is reduced.

In one aspect, an embodiment of the present application provides a human-computer automatic conversation method, including:

when an input text obtained based on user operation is acquired, determining a target skill triggered by the input text;

acquiring slot extraction pattern strings matched with the target skill to form a candidate pattern set;

for each candidate pattern string in the candidate pattern set, determining a target slot position name corresponding to an entity based on a target slot position attribute of the entity in the candidate pattern string, obtaining a target character string conforming to the grammar of a regular expression based on the target slot position name and an entity word contained in the entity, and replacing the entity in the candidate pattern string with the target character string to obtain the regular expression corresponding to the candidate pattern string;

matching the obtained regular expression with the input text, and extracting slot position key value pairs from the input text based on a matching result, wherein each slot position key value pair comprises a slot position name and a slot position value;

based on the slot position key value pair extracted from the input text, slot filling processing is carried out on the slot position corresponding to the target skill;

and acquiring reply content corresponding to the input text based on the slot filling result of the target skill.

Optionally, the determining a target slot name corresponding to the entity based on the target slot attribute of the entity in the candidate pattern string includes:

and selecting a slot position name corresponding to the slot position configured for the target skill from the target slot position attributes of the entity as the target slot position name corresponding to the entity, wherein the target slot position attribute of each entity comprises at least one slot position name.
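The selection step above can be sketched as follows. This is a minimal illustration only: the function name, the data layout, and the choice of taking the first matching name are assumptions, not details stated in the application.

```python
def select_target_slot(entity_target_slots, skill_slots):
    """Return the first slot name of the entity that the skill has configured.

    Taking the *first* match is an assumption made for this sketch.
    """
    for name in entity_target_slots:
        if name in skill_slots:
            return name
    return None


# Hypothetical data: a place entity whose target slot attribute lists two names.
place_entity_slots = ["departure", "city"]
weather_skill_slots = {"date", "city"}
target = select_target_slot(place_entity_slots, weather_skill_slots)
```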

Optionally, the method further comprises:

acquiring a dictionary tree generated based on entity words contained in the entity related to the target skill;

searching the dictionary tree for entity words contained in the input text by using the Aho-Corasick (AC) automaton algorithm, and adding the found entity words as candidate words into a candidate word set;

the obtaining of the target character string conforming to the regular expression grammar based on the target slot names and the entity words contained in the entities includes:

selecting, from the entity words contained in the entity, the entity words appearing in the candidate word set to serve as target entity words of the entity;

and converting the target slot position name and the target entity word into a target character string conforming to the regular expression grammar.
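The candidate-word filtering described above might look like the following sketch. A plain substring search stands in for the dictionary tree and AC automaton lookup, and the named-group syntax `(?P<name>...)` is Python's; both choices, and all function names, are assumptions made for illustration.

```python
import re


def find_candidates(text, entity_words):
    """Stand-in for the dictionary tree / AC automaton lookup: substring search."""
    return {w for w in entity_words if w in text}


def build_target_string(slot_name, entity_words, candidates):
    """Keep only entity words seen in the input, then emit a named group."""
    target_words = [w for w in entity_words if w in candidates]
    if not target_words:
        return None
    # Longest first, so a longer word wins over a shorter word that is its prefix.
    target_words.sort(key=len, reverse=True)
    alternatives = "|".join(re.escape(w) for w in target_words)
    return f"(?P<{slot_name}>{alternatives})"


place_words = ["Beijing", "Shanghai", "Guangzhou", "Shenzhen"]
text = "how is the weather in Shenzhen today"
candidates = find_candidates(text, place_words)
target_string = build_target_string("city", place_words, candidates)
```

Because only words that actually occur in the input survive the filter, the generated alternation stays short even when an entity has many entity words.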

Optionally, before obtaining the reply content corresponding to the input text based on the slot filling result of the target skill, the method includes:

acquiring a slot position modification triggering condition corresponding to the target skill;

and judging whether the slot position modification triggering condition is met, if so, updating the slot position value of the slot position to be modified of the target skill according to a preset slot position assignment statement, wherein the slot position assignment statement comprises the slot position to be modified and a modification mode of the slot position value.

Optionally, the updating the slot value of the slot to be modified of the target skill according to the preset slot assignment statement includes:

acquiring a slot position to be modified and a corresponding slot position default value from a preset slot position assignment statement, and modifying the slot position value of the slot position to be modified of the target skill into the corresponding slot position default value; or

obtaining a slot position to be modified from a preset slot position assignment statement, collecting relevant data based on the slot position to be modified, determining a slot position correction value of the slot position to be modified based on the collected relevant data, and modifying the slot position value of the slot position to be modified of the target skill into the corresponding slot position correction value.
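The two modification modes above (a default value, or a correction value derived from collected data) can be sketched as below. The clause keys `slot`, `default` and `collect`, the `collectors` mapping, and the trigger callback are hypothetical names introduced for this example.

```python
def modify_slots(slots, trigger, statement, collectors):
    """Update slots in place when the trigger condition holds.

    Each clause names the slot and carries either a 'default' value or a
    'collect' key selecting a data-collecting function (hypothetical keys).
    """
    if not trigger(slots):
        return slots
    for clause in statement:
        slot = clause["slot"]
        if "default" in clause:
            slots[slot] = clause["default"]
        elif "collect" in clause:
            slots[slot] = collectors[clause["collect"]]()
    return slots


slots = {"date": None, "city": None}
statement = [
    {"slot": "date", "default": "today"},
    {"slot": "city", "collect": "user_location"},
]
collectors = {"user_location": lambda: "Shenzhen"}  # placeholder data source
modify_slots(slots, lambda s: s["date"] is None, statement, collectors)
```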

Optionally, the determining the target skill triggered by the input text comprises:

determining target skills triggered by the input text in a mode of pattern matching, wherein the pattern matching refers to matching the input text with regular expressions corresponding to each skill;

if the target skill is not determined through pattern matching, determining the target skill triggered by the input text through utterance matching, wherein the utterance matching process comprises the following steps: acquiring the feature vector of the input text and the feature vector of each preset utterance, respectively calculating the similarity between the feature vector of the input text and the feature vector of each preset utterance, and, when the maximum similarity exceeds a preset threshold, determining the skill to which the corresponding preset utterance belongs as the target skill triggered by the input text.
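A minimal sketch of this two-stage skill determination follows. The application does not say how feature vectors are produced or which similarity measure is used; the toy bag-of-words vector, cosine similarity, the threshold of 0.8 and all names here are assumptions.

```python
import math
import re


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def determine_skill(text, skill_patterns, utterances, embed, threshold=0.8):
    # Stage 1: match the input against each skill's regular expressions.
    for skill, patterns in skill_patterns.items():
        if any(re.search(p, text) for p in patterns):
            return skill
    # Stage 2: fall back to similarity against each preset phrase.
    query = embed(text)
    best_skill, best_score = None, 0.0
    for utterance, skill in utterances:
        score = cosine(query, embed(utterance))
        if score > best_score:
            best_skill, best_score = skill, score
    return best_skill if best_score > threshold else None


def bag_of_words(text, vocab=("weather", "rain", "ticket", "train")):
    """Toy feature vector: word counts over a tiny fixed vocabulary."""
    words = text.lower().split()
    return [words.count(v) for v in vocab]


skill_patterns = {"weather": [r"weather"]}
utterances = [("will it rain", "weather"), ("book a train ticket", "ticket")]
```

In practice `embed` would be replaced by whatever sentence-vector model the system uses; only the control flow is meant to carry over.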

Optionally, the method further comprises:

when a to-be-processed event automatically triggered by a client is received, searching a target skill matched with the to-be-processed event in preconfigured skills based on the event name and the event parameter of the to-be-processed event, wherein the corresponding event name and the event parameter are configured for the skill in advance;

based on a target slot position configured for an event parameter in advance, filling a parameter value of the event parameter in the event to be processed into the slot position of the target skill;

and acquiring the reply content corresponding to the input text based on the slot value of the target skill.
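The event-triggered path above can be illustrated with the sketch below. The dictionary layout (`event`, `param_slots`) and the rule that a skill matches when its configured event name is equal and its configured parameters are all present are assumptions for this example.

```python
def find_skill_for_event(event, skills):
    """Return the first skill whose configured event name and parameters match."""
    for skill in skills:
        cfg = skill["event"]
        if cfg["name"] == event["name"] and set(cfg["param_slots"]) <= set(event["params"]):
            return skill
    return None


def fill_slots_from_event(event, skill):
    """Copy each event parameter value into the slot configured for that parameter."""
    slots = {}
    for param, slot in skill["event"]["param_slots"].items():
        slots[slot] = event["params"][param]
    return slots


# Hypothetical configuration: the weather skill listens for a "page_opened" event.
skills = [{
    "name": "weather",
    "event": {"name": "page_opened", "param_slots": {"location": "city"}},
}]
event = {"name": "page_opened", "params": {"location": "Shenzhen"}}
skill = find_skill_for_event(event, skills)
slots = fill_slots_from_event(event, skill)
```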

In one aspect, an embodiment of the present application provides a human-machine automatic dialog apparatus, including:

the skill matching module is used for determining a target skill triggered by an input text when the input text obtained based on user operation is obtained;

the translation module is used for acquiring slot extraction pattern strings matched with the target skill to form a candidate pattern set; for each candidate pattern string in the candidate pattern set, determining a target slot position name corresponding to an entity based on a target slot position attribute of the entity in the candidate pattern string, obtaining a target character string conforming to the grammar of a regular expression based on the target slot position name and an entity word contained in the entity, and replacing the entity in the candidate pattern string with the target character string to obtain the regular expression corresponding to the candidate pattern string;

the slot extracting module is used for matching the obtained regular expression with the input text and extracting slot position key value pairs from the input text based on a matching result, wherein each slot position key value pair comprises a slot position name and a slot position value;

the slot filling module is used for performing slot filling processing on the slot corresponding to the target skill based on the slot key value pair extracted from the input text;

and the reply module is used for acquiring reply content corresponding to the input text based on the slot filling result of the target skill.

In one aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of any one of the methods when executing the computer program.

In one aspect, an embodiment of the present application provides a computer-readable storage medium having stored thereon computer program instructions, which, when executed by a processor, implement the steps of any of the above-described methods.

In one aspect, an embodiment of the present application provides a computer program product or a computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in any of the various alternative implementations of the man-machine automatic dialogue method described above.

According to the man-machine automatic dialogue method, device, electronic equipment and storage medium provided by the embodiments of the application, the customizable target slot position attribute of an entity allows a pattern string to be dynamically translated into a regular expression during slot extraction. Only pattern strings, which have simpler rules and more flexible expression, need to be maintained, and complex regular expressions do not, so the user learning cost and the system maintenance cost are greatly reduced and the usability and matching efficiency of the system are improved. In addition, the method provides an entity definition function for the optional words in a pattern string: an entity is a group of similar optional words, which are finally translated into a part of the regular expression and given a group name identical to the entity name, so that the extensibility of pattern string definition is improved and the matching efficiency is optimized.

Drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

Fig. 1 is a schematic view of an application scenario of a man-machine automatic conversation method according to an embodiment of the present application;

fig. 2 is a schematic flowchart of a man-machine automatic conversation method according to an embodiment of the present application;

fig. 3 is a schematic flowchart of a man-machine automatic conversation method according to an embodiment of the present application;

fig. 4 is a schematic structural diagram of a human-machine automatic conversation apparatus according to an embodiment of the present application;

fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The embodiments of the present application will be described in detail below with reference to the accompanying drawings.

It should be noted that, in the case of no conflict, the features in the following embodiments and examples may be combined with each other; moreover, based on the embodiments in the present application, all other embodiments obtained by a person of ordinary skill in the art without any creative effort belong to the protection scope of the present application.

It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the present application, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.

For convenience of understanding, terms referred to in the embodiments of the present application are explained below:

Skill: a technical means by which, for a certain preset intent, the information necessary for replying to the intent is collected through guided dialogue, multi-turn dialogue, multi-modal information collection and the like, and a reasonable reply is made. In the application, customizable skill attributes are configured for skills; the skill attributes comprise a skill name, a reference name, a skill description and related slots, and a user can set the skill attributes as required. The skill name is used for display; for example, the name of the weather skill can be "weather". The reference name follows the reference name naming rule and is unique, that is, it cannot duplicate the reference name of any other skill; for example, the reference name of the weather skill can be "weather". The user can describe the content of the skill by editing the skill description. The slots in the skill attributes are the slots required to complete the skill; for example, the slots of the weather skill include a time slot and a location slot.

Slot position: the defined information that must be obtained before a reasonable reply can be made for a certain skill; the specific value of this information is called the slot position value. For example, the necessary information required by the weather skill includes a place and a time, that is, the weather skill has two slots, time and place; the slot value of the time may be yesterday, today, tomorrow or a specific time value, and the slot value of the place may be a specific place such as Beijing or Shenzhen. In the application, a user can define the attributes of a slot, including a slot name, a slot value, a slot reference name and a slot description. The slot name is the name of the slot; for example, the name of the weather date slot can be "date". The slot reference name is used for referencing the slot value, for which the system presets a variable' kill.

Entity: a set of similar selectable words. The attributes of an entity include an entity name, an entity reference name, an entity type, a scope, multiplexed entities, entries, a wildcard sentence, a target slot, and the like. The entity name is used for display. The entity reference name is used for reference in pattern strings and follows the reference name naming rule; for example, "place" in the pattern string "[place] how is the weather" is the reference name of the place entity.

The entity types include an entry type and a wildcard type. An entity of the entry type consists of a finite, enumerable list of entity words, and all optional entity words must be listed; for example, the entity words of the place entity can include Beijing, Shanghai, Guangzhou, Shenzhen and the like. An entity of the entry type must define the entry attribute or the multiplexed entity attribute. The entity words of a wildcard-type entity are represented by a regular expression that serves as the matching pattern of the selectable words; for example, the entity words of a number entity range from 1 to infinity, cannot be enumerated, and can only be represented by the regular expression "\d+". An entity of the wildcard type must define the wildcard sentence attribute or the multiplexed entity attribute.

The attribute value of an entry is a list in which a group of similar entity words is recorded, and the attribute becomes effective only if the type of the entity is the entry type.

The scope types of entities include general and skill, general entities can be referenced in all skills, and skill entities are used only in specified skills.

The multiplexed entity attribute means that an entity can be derived from other entities so that entities are reused; the value of the multiplexed entity attribute is a list of reference names of other entities. An entity of the entry type can only multiplex entities of the entry type. When an entity specifies multiplexed entities, the entity words of the entity are the union of the entity words of all multiplexed entities and the entity's own entry attribute values. For example, assume that there is a Guangdong place entity with the entity reference name guangdong and the entity words Guangzhou and Shenzhen, and a Fujian place entity with the entity reference name fujian and the entity words Fuzhou and Quanzhou. If an entity of the entry type is defined with the reference name place, the entry attribute value ["Beijing", "Shanghai"] and the multiplexed entity attribute ["guangdong", "fujian"], the entity words of place include Guangzhou, Shenzhen, Fuzhou, Quanzhou, Beijing and Shanghai. An entity of the wildcard type can only multiplex entities of the wildcard type. If the entity does not define the wildcard sentence attribute, the wildcard sentence of the entity is the wildcard sentence of the first entity in the multiplexed entity attribute; if the entity defines a wildcard sentence, the multiplexed entities do not take effect.
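The union rule for entry-type entities can be sketched as follows, using the Guangdong and Fujian example. The dictionary keys `entries` and `reuse`, the recursive resolution of nested reuse, and the guard against circular references are assumptions added for this illustration.

```python
def resolve_entity_words(name, entities, _seen=None):
    """Union of an entity's own entries and those of every entity it multiplexes."""
    seen = _seen if _seen is not None else set()
    if name in seen:  # guard against circular reuse (an assumption of this sketch)
        return []
    seen.add(name)
    entity = entities[name]
    words = []
    for parent in entity.get("reuse", []):
        for w in resolve_entity_words(parent, entities, seen):
            if w not in words:
                words.append(w)
    for w in entity.get("entries", []):
        if w not in words:
            words.append(w)
    return words


entities = {
    "guangdong": {"entries": ["Guangzhou", "Shenzhen"]},
    "fujian": {"entries": ["Fuzhou", "Quanzhou"]},
    "place": {"entries": ["Beijing", "Shanghai"], "reuse": ["guangdong", "fujian"]},
}
place_words = resolve_entity_words("place", entities)
```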

The wildcard sentence attribute is used for defining the wildcard sentence, namely the regular expression, of a wildcard-type entity. For example, the wildcard sentence of the number entity is "\d+". This attribute takes effect only if the entity is of the wildcard type.

The target slot is one of the slot reference names defined for the skill. Its value is used as the group name when an entity in a pattern string is translated into a regular expression group, and the default attribute value is the entity reference name.

Pattern string: a predefined character string, similar to a regular expression, that may reference entities; it is a simplified way of writing a regular expression, such as "[place] how is the weather today", where "[place]" references the place entity. In this application, the pattern string uses only the symbols "[]", "()" and "|", so it has a lower learning cost than regular expressions. The two symbols "()" and "|" are used as in regular expressions and represent grouping and alternation respectively. For example, "(Beijing|Shanghai|Guangzhou|Shenzhen)" indicates that the character string to be matched contains one of the four words Beijing, Shanghai, Guangzhou and Shenzhen; when the character string to be matched is "Shenzhen", the pattern is judged to match the character string successfully. Only one entity reference name can be used inside a "[]" symbol. Taking an entry-type entity as an example, referencing the entity with "[]" matches any entity word of the specified entity; the matching result carries a group name, the matched character string can be extracted from the matching result through the group name, and the group name is the target slot of the entity. For example, assuming that there is a place entity whose entity words include Beijing, Shanghai, Guangzhou and Shenzhen and whose target slot attribute is city, the skill engine translates "[place]" into a standard regular expression containing a group named city and performs regular matching. When the character string to be matched is "Beijing", the pattern is judged to match the character string successfully, and the character string with the group name city can be extracted from the matching result using a standard interface of the programming language.
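As an illustration of this translation, the sketch below replaces each "[entity]" reference with a named group. The exact regular expression produced by the skill engine is not reproduced in the text, so Python's `(?P<name>...)` syntax, the entity dictionary layout, and the function name `translate` are all assumptions.

```python
import re

# An entity reference is a single reference name in square brackets.
ENTITY_REF = re.compile(r"\[\s*(\w+)\s*\]")


def translate(pattern, entities):
    """Translate a pattern string into a Python regular expression.

    Text outside [entity] references is kept as-is, so '()' and '|'
    keep their regular-expression meaning.
    """
    def replace(match):
        entity = entities[match.group(1)]
        # Group name is the target slot; default is the entity reference name.
        group = entity.get("target_slot", match.group(1))
        if entity["type"] == "wildcard":
            body = entity["wildcard"]
        else:
            body = "|".join(re.escape(w) for w in entity["entries"])
        return f"(?P<{group}>{body})"

    return ENTITY_REF.sub(replace, pattern)


entities = {
    "place": {"type": "entry", "target_slot": "city",
              "entries": ["Beijing", "Shanghai", "Guangzhou", "Shenzhen"]},
    "number": {"type": "wildcard", "wildcard": r"\d+"},
}
regex = translate("[place] weather", entities)
match = re.search(regex, "Beijing weather")
slots = match.groupdict() if match else {}
```

Here `groupdict()` plays the role of the "standard interface of the programming language" for reading named groups, yielding slot name and slot value pairs directly.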

Pattern set: a collection of similar pattern strings, with a name attribute that follows the reference name naming rule. A pattern set groups pattern strings, and when the conditions of a slot extraction or slot filling rule are defined, the pattern set name may be used in place of specific pattern strings. For example, the slot filling condition [{"left": "word.rule", "operator": "=", "right": "c_place", "relation": "and"}] references the pattern set named c_place and indicates that the condition holds when the pattern set named c_place is hit. The pattern strings grouped in a pattern set are similar. For example, in the weather skill, the pattern strings that match places, such as "what about [place]" and "I want to know the weather in [place]", can be placed under the place pattern set c_place; the pattern strings that match times, such as "what about [date]" and "I want to know the weather on [date]", can be placed under the time pattern set c_date.

Session state data: data used for temporarily recording the session state of the user in the real-time response stage. This data is updated in real time according to the user's session state, and a data expiration clearing mechanism is required so that the engine can exit a skill state after a long period without interaction. The session state data of each round may include the triggered skill, the slot data and the hit slot extraction rule, which share the same life cycle: they are created at the same time and expire at the same time. The session state data is stored in a cache (e.g., Redis) in the form of key-value pairs, and a user identifier is added to the key name to isolate the session state data of different users. The key of the triggered skill is skill, and the value is the reference name of the most recently triggered skill. The slot data stores the latest slot filling state; the key is slot, and the value is a set of key-value pairs mapping each slot name to its latest slot value. The slot extraction rule record stores the hit slot extraction rule; the key is rule, and the value is the name of the slot extraction rule.
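The storage scheme described above can be sketched with an in-memory stand-in for the cache. The class and function names, the "user:key" key format, and the injectable clock are assumptions; the key names "skill", "slot" and "rule" follow the description, and a real deployment would use the cache's own expiry feature instead.

```python
import time


class SessionStore:
    """In-memory stand-in for a cache such as Redis, with per-key expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.data = {}

    def set(self, user, key, value):
        # Prefix the key with the user identifier to isolate users.
        self.data[f"{user}:{key}"] = (value, self.clock() + self.ttl)

    def get(self, user, key):
        full_key = f"{user}:{key}"
        item = self.data.get(full_key)
        if item is None:
            return None
        value, expires_at = item
        if self.clock() >= expires_at:
            del self.data[full_key]  # expired: the engine leaves the skill state
            return None
        return value


def save_round(store, user, skill, slots, rule):
    """Write the three records of one round together so they share a life cycle."""
    store.set(user, "skill", skill)
    store.set(user, "slot", slots)
    store.set(user, "rule", rule)
```

Passing the clock in as a parameter is only a convenience that lets expiry be exercised without waiting for real time to pass.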

The man-machine dialogue system also provides, in relation to the session state data, a skill state variable of the current round and a skill state variable of the previous round. The skill state variable is obtained by comparing the slots in the session with the defined slots: when all the slots in the session are filled, the value is True, indicating that the slots are filled or the skill is completed; when a slot in the session is not filled, the value is False, indicating that the slots are not all filled or the skill is not completed. The variable name of the skill state of the current round is stick.
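The comparison that yields the skill state can be expressed in a few lines. The function name and the convention that an unfilled slot is absent or None are assumptions of this sketch.

```python
def skill_state(defined_slots, session_slots):
    """True when every slot defined for the skill has a value in the session."""
    return all(session_slots.get(name) is not None for name in defined_slots)


weather_slots = ["date", "city"]
complete = skill_state(weather_slots, {"date": "today", "city": "Beijing"})
incomplete = skill_state(weather_slots, {"date": "today"})
```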

Dictionary tree: also known as a word lookup tree or Trie, a tree structure that is a variant of the hash tree. It is typically used for counting, sorting and storing large numbers of character strings (but is not limited to character strings), and is therefore often used by search engine systems for text word frequency statistics. The advantage of the dictionary tree is that it uses the common prefixes of character strings to reduce query time, minimizing unnecessary character string comparisons, so its query efficiency is higher than that of a hash tree.
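A minimal dictionary tree might be written as follows; it is a generic textbook sketch rather than the structure used by the application, and the class and method names are chosen for illustration. Words sharing a prefix share the nodes for that prefix.

```python
class Trie:
    """Each node maps a character to a child node and marks word endings."""

    def __init__(self):
        self.children = {}
        self.is_word = False

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def contains(self, word):
        node = self
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word


trie = Trie()
for entity_word in ["Shenzhen", "Shanghai", "Beijing"]:
    trie.insert(entity_word)
```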

The AC automaton algorithm: the Aho-Corasick algorithm, a string search algorithm invented by Alfred V. Aho and Margaret J. Corasick for matching the substrings of a finite set (the "dictionary") within an input character string. Unlike ordinary string matching, it matches against all dictionary strings simultaneously. Under amortized analysis, the algorithm has approximately linear time complexity, about the length of the character string plus the number of all matches.
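The following is a compact sketch of the Aho-Corasick algorithm: a dictionary tree is built from the words, failure links are computed breadth-first, and the text is scanned once. The list-based node layout and the function names are choices made for this example, not details from the application.

```python
from collections import deque


def build_automaton(words):
    # Parallel lists per node: outgoing edges, failure link, words ending here.
    goto, fail, out = [{}], [0], [[]]
    for word in words:
        node = 0
        for ch in word:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(word)
    # Breadth-first pass: depth-1 nodes fail to the root; deeper nodes
    # follow their parent's failure chain.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            queue.append(child)
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            out[child] = out[child] + out[fail[child]]
    return goto, fail, out


def search(text, automaton):
    """Return (start_index, word) for every dictionary word found in text."""
    goto, fail, out = automaton
    node, found = 0, []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for word in out[node]:
            found.append((i - len(word) + 1, word))
    return found


automaton = build_automaton(["Beijing", "Shenzhen"])
hits = search("from Beijing to Shenzhen", automaton)
```

The text is read once from left to right regardless of how many dictionary words there are, which is why the approach suits entities with large word lists.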

Client: also called the user side, a program that corresponds to a server and provides local services to users. Apart from some application programs that run only locally, client programs are generally installed on ordinary user machines and need to operate together with a server. Since the development of the internet, common clients include web browsers used on the World Wide Web, email clients for receiving and sending emails, and client software for instant messaging. For this kind of application, a corresponding server and service program are required in the network to provide corresponding services, such as database services and email services, so a specific communication connection needs to be established between the client and the server to ensure the normal operation of the application program.

Conditional syntax: the application designs a JSON-based conditional syntax, and conditional statements written according to this syntax can be translated into conditional statements of a programming language for execution. For example, the following conditional statement:

[{"left": "word.slot.date", "operator": "=", "right": "today", "relation": "and"}, {"left": "word.slot.city", "operator": "=", "right": "Beijing", "relation": "and"}]

indicates that the date slot value is equal to "today" and the city slot value is equal to "Beijing"; the result is of Boolean type. Specifically, the conditional syntax includes:

(1) A conditional statement is a JSON-formatted character string and is an ordered sequence consisting of a plurality of conditional clauses.

(2) The conditional clause is a JSON object including four key values of a left value (left), an operator (operator), a right value (right), and a connector (relation).

(3) The result of the execution of the conditional clause is a boolean value, resulting in True or False.

(4) Operators include equal to ("="), greater than (">"), greater than or equal to (">="), less than ("<"), less than or equal to ("<="), contains ("in"), and empty ("empty"). Operators are of a single-operand type or a double-operand type. An operator that operates only on the left value is of the single-operand type; for example, the empty operator operates only on the left value and tests whether it is empty. Operators whose operation requires both a left value and a right value are of the double-operand type, such as the equal to, greater than, less than and contains operators.

(5) For integer or floating point types, the equal to operation compares whether the left value and the right value are numerically equal; for the character string type, it compares whether the characters and lengths of the two character strings are exactly the same.

(6) When the greater than, greater than or equal to, less than and less than or equal to operators are applied to character strings, the ASCII codes of the characters at the same index positions of the left value and the right value are compared in order from left to right.

(7) The contains operation requires the right value to be data of the list type and determines whether the left value is in the list given by the right value; the comparison of the left value with each element in the list follows the rules of the equal to operator.

(8) The left value can only be a variable built in the system, such as a variable related to the session state data and a variable related to the input data.

(9) The right value is string, integer, floating point, boolean, or list type data.

(10) The connector is the relation between the current clause and the previous clause in the conditional statement sequence. There are two relations, "and" and "or", represented by "and" and "or" respectively, and the connector of the first clause must be "and".
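The rules above can be sketched as a small evaluator. This is an illustration under stated assumptions: clauses are combined strictly from left to right with no precedence between "and" and "or", an unset left value fails every double-operand comparison, and the `variables` mapping stands in for the system's built-in variables.

```python
import json
import operator

BINARY = {
    "=": operator.eq, ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
    "in": lambda left, right: left in right,
}


def evaluate(statement, variables):
    """Evaluate a JSON conditional statement against built-in variable values."""
    result = True  # neutral start, since the first connector must be "and"
    for clause in json.loads(statement):
        left = variables.get(clause["left"])
        if clause["operator"] == "empty":
            value = left is None or left == ""
        elif left is None:
            value = False  # assumption: an unset variable satisfies no comparison
        else:
            value = BINARY[clause["operator"]](left, clause["right"])
        # Assumption: clauses combine left to right, without precedence.
        if clause["relation"] == "and":
            result = result and value
        else:
            result = result or value
    return result


statement = json.dumps([
    {"left": "word.slot.date", "operator": "=", "right": "today", "relation": "and"},
    {"left": "word.slot.city", "operator": "=", "right": "Beijing", "relation": "and"},
])
session = {"word.slot.date": "today", "word.slot.city": "Beijing"}
holds = evaluate(statement, session)
```

Python compares strings by code point, which coincides with the ASCII ordering in rule (6) for ASCII text.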

Assignment syntax: when a skill slot filling rule is defined, slot position values need to be modified by assignment statements, so a JSON-based assignment syntax is designed; statements conforming to the assignment syntax are called assignment statements. In the real-time response stage, the skill engine parses the assignment statements defined by the assignment syntax and executes the assignment process. The rules of the assignment syntax are as follows:

(1) An assignment statement is an ordered sequence of multiple assignment clauses. The skill engine executes the clauses in the assignment statement one by one in sequence.

(2) The assignment clause comprises a left value, a right value type and a right value.

(3) The left value is the variable to be assigned. There are two kinds of variables: system predefined variables and temporary variables. System predefined variables include variables related to input data and variables related to session state data, such as the variable that represents the value of the place slot. Temporary variables do not need to be defined in advance; they are created automatically when the skill engine executes the assignment, and their names follow the reference-name naming rule.

(4) The right value type is either a value type or a variable type. For a value type, the skill engine parses the right value into a concrete value; for a variable type, the skill engine replaces the right-value variable with the value of that variable.

(5) The right value is a specific value or variable depending on the right value type. When the right value type is a value type, the value is any one of the types defined in the skills engine data type system. When the right value type is a variable type, the variables of the right value can only be variables that the skill engine can recognize, including system predefined variables and temporary variables.

(6) Temporary variables are valid only within the assignment statement, and the corresponding in-memory data is destroyed after the statement has been executed. For example, assuming that the right value type is represented by 1 for a value type and by 2 for a variable type, the assignment statement [{"left": "word.slot.place", "right_type": 1, "right": "Beijing"}, {"left": "temp", "right_type": 2, "right": "word.slot.place"}] sets the value of the place slot to "Beijing", then defines the temporary variable temp and sets its value to the place slot value "Beijing".
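The assignment rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name run_assignment, the flat dictionary used to hold system predefined variables, and the error handling are hypothetical; only the clause fields left, right_type and right and the 1/2 encoding come from the example above.

```python
VALUE_TYPE = 1     # right value is a concrete value (encoding from the example above)
VARIABLE_TYPE = 2  # right value names a variable whose value is substituted


def run_assignment(statement, predefined):
    """Execute the clauses of an assignment statement in order.

    `predefined` is a hypothetical dict of system predefined variables and is
    updated in place. Temporary variables are kept in a local dict, so they
    stop existing once the statement has been executed.
    """
    temp = {}
    for clause in statement:
        if clause["right_type"] == VALUE_TYPE:
            value = clause["right"]
        elif clause["right_type"] == VARIABLE_TYPE:
            name = clause["right"]
            if name in predefined:
                value = predefined[name]
            elif name in temp:
                value = temp[name]
            else:
                raise KeyError(f"unknown variable: {name}")
        else:
            raise ValueError(f"unknown right_type: {clause['right_type']}")
        # A left value that is not predefined is treated as a temporary variable.
        if clause["left"] in predefined:
            predefined[clause["left"]] = value
        else:
            temp[clause["left"]] = value
    return dict(temp)  # returned only so that the example can be inspected


state = {"word.slot.place": None}
statement = [
    {"left": "word.slot.place", "right_type": 1, "right": "Beijing"},
    {"left": "temp", "right_type": 2, "right": "word.slot.place"},
]
temporaries = run_assignment(statement, state)
```

After the call, state holds "Beijing" for the place slot, and temporaries shows that temp received the same value before being discarded.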

Any number of elements in the drawings are by way of example and not by way of limitation, and any nomenclature is used solely for differentiation and not by way of limitation.

The man-machine automatic dialogue process is as follows: the system determines the skill (that is, the intention) involved in the conversation based on the user input, and then guides the user step by step to supply the slot information the skill requires, so as to clarify the user's intention and then give a voice answer, display text and picture information, or operate a robot client to respond with an action. The key to a dialogue system is therefore how to extract the slot information required by a skill from the user input. In the prior art, regular expressions are generally used to extract the slot information of an intention from the content of a man-machine conversation. Taking a weather skill as an example, a regular expression is defined in which a group named place enumerates the optional words Beijing, Shanghai, Guangzhou and Shenzhen, followed by the text "how is the weather today". When the user input is "Beijing, how is the weather today", the input is judged to match the defined regular expression, and the slot value "Beijing" of the slot place can be extracted through a standard interface of a programming language (such as the re standard library of the Python programming language). In practical applications, however, the places a user may ask about are not limited to Beijing, Shanghai, Guangzhou and Shenzhen; they may include every city in the country or even in the world. For a skill to cope with more scenarios, the optional words in the regular expression must be continuously expanded. Standard regular expressions, however, have so many symbols and rules that the learning cost is high, and they can be expanded only by enumeration. When the system involves many skills and slots, maintaining the regular expressions of each skill separately is inflexible and costly.
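The prior-art approach can be illustrated with Python's re module. The pattern below is a hypothetical English rendering, since the exact expression is not reproduced in the text; only the group name place and the four enumerated cities are taken from the description.

```python
import re

# Hypothetical pattern: the optional words are enumerated in a group named "place".
WEATHER_PATTERN = re.compile(
    r"(?P<place>Beijing|Shanghai|Guangzhou|Shenzhen), how is the weather today"
)


def extract_place(text):
    """Return the value of the place slot, or None if the text does not match."""
    match = WEATHER_PATTERN.fullmatch(text)
    return match.group("place") if match else None


beijing = extract_place("Beijing, how is the weather today")
chengdu = extract_place("Chengdu, how is the weather today")
```

Here beijing is "Beijing", while chengdu is None because Chengdu was never enumerated, which is the maintenance problem described above.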

To this end, the present application proposes the following solution: a large number of entities are predefined according to the scenarios the system involves, each storing all optional words the entity may cover, and a large number of pattern strings are predefined according to users' input habits for extracting slot information. When the background server acquires an input text, it determines the triggered target skill based on the content of the input text, and obtains the slot-extraction pattern strings matched with the target skill to form a candidate pattern set. It then translates each candidate pattern string into a regular expression based on the target slot corresponding to the entity in the candidate pattern string and the entity words contained in the entity, and matches the translated regular expression against the input text to extract the slot information required by the target skill. Finally, it generates the reply content corresponding to the input text based on the slot information of the target skill. Because the target slot attribute of an entity can be customized, a pattern string can be dynamically translated into a regular expression during slot extraction. Only pattern strings, which have simpler rules and are more flexible to express, need to be maintained instead of complex regular expressions, which reduces the system maintenance cost and improves the matching efficiency. In addition, the customizable target slot attribute allows one entity to be shared by several skills, which improves the extensibility of entity words and pattern strings and reduces the maintenance cost.

After introducing the design concept of the embodiment of the present application, some simple descriptions are provided below for application scenarios to which the technical solution of the embodiment of the present application can be applied, and it should be noted that the application scenarios described below are only used for describing the embodiment of the present application and are not limited. In specific implementation, the technical scheme provided by the embodiment of the application can be flexibly applied according to actual needs.

Fig. 1 is a schematic view of an application scenario of a man-machine automatic conversation method according to an embodiment of the present application. The physical architecture of the automatic human-machine dialog system generally comprises three parts: a user 101, a client 102 (such as a robot, a web page, a mobile app, etc.), and a background server 103, the data processing process is generally: the user 101 inputs contents such as voice or characters through the client 102, the client 102 transmits the input contents to the background server 103, the background server 103 analyzes the intention of the input contents by using the semantic recognition engine and makes a reply, the reply is finally returned to the client 102, and the client 102 displays the reply contents to the user 101 in a display mode including but not limited to characters, voice, animation, actions, expressions and the like. The background server 103 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, a big data and artificial intelligence platform, and the like.

Of course, the method provided in the embodiment of the present application is not limited to be used in the application scenario shown in fig. 1, and may also be used in other possible application scenarios, and the embodiment of the present application is not limited. The functions that can be implemented by each device in the application scenario shown in fig. 1 will be described in the following method embodiments, and will not be described in detail herein.

To further illustrate the technical solutions provided by the embodiments of the present application, the following detailed description is made with reference to the accompanying drawings and the detailed description. Although the embodiments of the present application provide the method operation steps as shown in the following embodiments or figures, more or less operation steps may be included in the method based on the conventional or non-inventive labor. In steps where no necessary causal relationship exists logically, the order of execution of the steps is not limited to that provided by the embodiments of the present application.

The following describes the technical solution provided in the embodiment of the present application with reference to the application scenario shown in fig. 1.

Referring to fig. 2, an embodiment of the present application provides a human-machine automatic conversation method, which can be applied to the background server shown in fig. 1, and specifically includes the following steps:

S201: When an input text obtained based on a user operation is acquired, determine the target skill triggered by the input text.

In practical application, a user can input data to a client in modes of voice, characters, gestures and the like, and the client sends the input data to a background server. When the background server receives the character type information, the background server directly takes the character information as an input text. When input data of voice, gesture and the like are received, the background server converts the input data into text information, namely input text, through the technologies of voice recognition, motion recognition and the like.

In particular implementation, the target skill triggered by the input text can be determined by at least one of the following modes:

Mode 1: determine the target skill triggered by the input text by pattern matching.

Pattern matching refers to matching the input text against the regular expressions corresponding to each skill. The process is essentially regular expression matching: the input text is matched using pattern strings defined by the pattern syntax to judge whether the skill to which a regular expression belongs is triggered. In practical applications, several regular expressions can be configured for each skill in advance. During matching, the input text is matched against the regular expressions of all skills; if it matches a regular expression, the skill to which that regular expression belongs is determined as the target skill triggered by the input text.

Further, to reduce the maintenance cost of regular expressions, several pattern strings may be configured for each skill in advance. To distinguish them from the pattern strings used in slot extraction, the pattern strings used for skill matching are called entry pattern strings. During matching, the entry pattern strings of all skills are translated into regular expressions, and the input text is matched against each of them; if the input text matches a regular expression, the skill to which that regular expression belongs is determined as the target skill triggered by the input text. For the translation of entry pattern strings, refer to the implementation of step S203; the details are not repeated here. Standard regular expressions have a high learning cost and too many symbols and rules, whereas skill matching needs only a small subset of those rules.

Mode 2: determine the target skill triggered by the input text by utterance matching.

Utterance matching refers to matching the input text against the preset utterances of each skill, selecting, from the preset utterances whose matching degree exceeds a matching degree threshold, the one with the highest matching degree, and taking the skill to which it belongs as the target skill triggered by the input text. Preset utterances are texts that a user might input to trigger each skill.

In specific implementation, the feature vector of the input text and the feature vector of each preset utterance are obtained, and the similarity (such as the cosine similarity) between the feature vector of the input text and that of each preset utterance is calculated. The maximum similarity is selected from the similarities exceeding a preset threshold, and the skill to which the corresponding preset utterance belongs is determined as the target skill triggered by the input text. If several preset utterances share the maximum similarity exceeding the preset threshold, one of them is selected at random and the skill to which it belongs is determined as the target skill. The feature vectors of the input text and of the preset utterances may be obtained with a natural language processing algorithm, such as the BERT model released by Google.

To improve the response speed, the feature vector of each preset utterance (referred to herein as a candidate vector) may be generated in advance to form a candidate vector set, which is stored in a database or a cache so that the feature vectors of the preset utterances can be retrieved directly during matching.
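The similarity-based selection can be sketched as follows, assuming the feature vectors have already been computed (for instance by a sentence encoder) and are plain lists of floats. The function names and the (skill, vector) pair layout of the candidate vector set are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_skill(input_vec, candidates, threshold):
    """Pick the skill of the most similar preset utterance above the threshold.

    `candidates` is a list of (skill_name, candidate_vector) pairs generated
    in advance. Ties at the maximum similarity are broken at random.
    """
    scored = [(cosine(input_vec, vec), skill) for skill, vec in candidates]
    above = [(score, skill) for score, skill in scored if score > threshold]
    if not above:
        return None
    best = max(score for score, _ in above)
    return random.choice([skill for score, skill in above if score == best])


candidate_vectors = [("weather", [1.0, 0.0]), ("ticket", [0.0, 1.0])]
matched = match_skill([0.9, 0.1], candidate_vectors, threshold=0.5)
unmatched = match_skill([1.0, 1.0], candidate_vectors, threshold=0.99)
```

With these toy vectors, matched is "weather", and unmatched is None because neither similarity exceeds 0.99.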

In one possible implementation, the target skill triggered by the input text is first determined by pattern matching. If pattern matching yields a unique matching skill, that skill is determined as the target skill; if it yields several matching skills, one of them is selected at random as the target skill; and if it yields no target skill, the target skill is determined by utterance matching. If utterance matching cannot determine the target skill either, the background server may select a fallback utterance from a fallback utterance set and return it to the client, for example, "What service do you need?" or "I did not hear that clearly, please say it again".
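The order of the two matching modes and the fallback can be sketched as below. The two matcher callables are placeholders supplied by the caller, and the function name and return convention are assumptions made for illustration.

```python
import random


def determine_target_skill(text, match_by_pattern, match_by_utterance, fallbacks):
    """Return (skill, fallback_reply); exactly one of the two is None.

    `match_by_pattern(text)` is assumed to return a list of matching skills,
    and `match_by_utterance(text)` a single skill name or None.
    """
    skills = match_by_pattern(text)
    if skills:
        # A unique match is returned as is; several matches are resolved at random.
        return random.choice(skills), None
    skill = match_by_utterance(text)
    if skill is not None:
        return skill, None
    return None, random.choice(fallbacks)


FALLBACKS = ["What service do you need?", "I did not hear that clearly, please say it again"]
by_pattern = determine_target_skill("x", lambda t: ["weather"], lambda t: "ticket", FALLBACKS)
by_utterance = determine_target_skill("x", lambda t: [], lambda t: "ticket", FALLBACKS)
by_fallback = determine_target_skill("x", lambda t: [], lambda t: None, FALLBACKS)
```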

In a multi-turn conversation scene, the background server stores the target skill triggered by the input text into the conversation state data of the current turn of conversation to provide support for the next turn of conversation.

S202: Obtain the slot-extraction pattern strings matched with the target skill to form a candidate pattern set.

In specific implementation, according to users' input habits and the text grammar related to each skill, several pattern strings for extracting slot information are configured for each skill in advance; these pattern strings are called slot-extraction pattern strings.

S203: For each candidate pattern string in the candidate pattern set, determine the target slot name corresponding to the entity based on the target slot attribute of the entity in the candidate pattern string, obtain a target character string conforming to regular expression syntax based on the target slot name and the entity words contained in the entity, and replace the entity in the candidate pattern string with the target character string to obtain the regular expression corresponding to the candidate pattern string.

Step S203 is the process of translating a pattern string into a regular expression. Since a pattern string uses the notation and semantics of regular expressions everywhere except in its entity references, the translation essentially replaces the entities in the pattern string with regular expressions. Entities are of two types, an entry type and a wildcard type, and the translation process differs between the two.

For an entity of the entry type, the entity words it contains are the union of the entity words in its entry attribute and the entity words of the entities listed in its multiplexed-entity attribute. An entity of the entry type is translated as follows: all entity words of the entity are taken out and joined with "|" to form a group whose name is the target slot name in the target slot attribute of the entity, and the regular expression so formed replaces the entity reference in the pattern string. For example, if the entity words of the place entity include Beijing and Shanghai and the target slot of the place entity is city, the pattern string "[place] how is the weather" is translated into a regular expression in which [place] is replaced by a group named city with the alternatives Beijing|Shanghai, followed by the unchanged text "how is the weather".
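As a minimal sketch of this translation, the function below replaces each [name] reference with a named group. The [name] reference form follows the examples in the text; the dictionary layout of ENTITIES, the field names type, words, wildcard and slot, and the use of Python's (?P<name>...) named-group syntax are assumptions. The sketch also carries a branch for wildcard-type entities, whose translation is described in the paragraph that follows.

```python
import re

# Assumed reference syntax: an entity name in square brackets, e.g. [place].
# A real pattern syntax would need to tell these apart from character classes.
ENTITY_REF = re.compile(r"\[(\w+)\]")


def translate(pattern_string, entities):
    """Translate a pattern string into a regular expression string."""
    def replace(match):
        entity = entities.get(match.group(1))
        if entity is None:
            return match.group(0)  # not a known entity: leave the text unchanged
        if entity["type"] == "entry":
            body = "|".join(re.escape(word) for word in entity["words"])
        else:  # wildcard type: the wildcard expression is used as the group body
            body = entity["wildcard"]
        return f"(?P<{entity['slot']}>{body})"
    return ENTITY_REF.sub(replace, pattern_string)


ENTITIES = {
    "place": {"type": "entry", "words": ["Beijing", "Shanghai"], "slot": "city"},
    "number": {"type": "wildcard", "wildcard": r"\d+", "slot": "volume"},
}
weather_regex = translate("[place] how is the weather", ENTITIES)
volume_regex = translate("turn the volume up [number]", ENTITIES)
```

Under these assumptions weather_regex is "(?P<city>Beijing|Shanghai) how is the weather", and volume_regex wraps \d+ in a group named volume.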

For an entity of the wildcard type, the wildcard expression is first wrapped in a group whose name is the target slot name in the target slot attribute of the entity, and the entity reference in the pattern string is then replaced with the grouped wildcard expression. For example, if the number entity is a wildcard-type entity whose wildcard expression is "\d+" and whose target slot is volume, the pattern string "turn the volume up [number]" is translated into a regular expression in which [number] is replaced by a group named volume that wraps \d+. The present application supports defining entities of the wildcard type and specifying the wildcard expression of such an entity, which improves the expressive power of entities.

S204: Match the obtained regular expression against the input text, and extract slot key-value pairs from the input text based on the matching result, where each slot key-value pair comprises a slot name and a slot value.

In specific implementation, a standard interface of a programming language (for example, the re package built into the Python programming language) may first be used to match the regular expressions obtained in step S203 against the input text, forming a matching result set. If several results match successfully, one of them is selected at random, the group names and the corresponding group strings are extracted from the selected matching result to form a key-value pair result set that maps each group name to its group string, and the rule reference name of the pattern set to which the result belongs is stored in the session data of the current round. For example, the key-value pair result set extracted using Python is dictionary-type data such as {"date": "today", "place": "Beijing"}. If no result matches successfully and no slot is filled, an empty key-value pair result set is formed and an empty rule reference name is stored in the session data of the current round.

Then, the key-value pair result set is searched for keys whose names are the same as the slot names required by the target skill, the values of those keys are used as the slot values of the slots with the same names, and a slot extraction result whose keys are slot names and whose values are slot values, namely the slot key-value pairs, is formed. For example, the weather skill has the two slots date and place and the key-value pair result set is {"place": "Beijing"}; the result set contains the key place, which has the same name as a slot, so the slot extraction result is {"place": "Beijing"}.
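The two stages of step S204 can be sketched with re.fullmatch and groupdict. Whether the engine requires a full match or a partial match is not stated in the text, so the use of fullmatch is an assumption, as are the function name and the omission of the rule reference name bookkeeping.

```python
import random
import re


def extract_slots(regexes, text, skill_slots):
    """Match every regular expression, pick one successful result at random,
    and keep only the groups whose names are slots of the target skill."""
    results = []
    for regex in regexes:
        match = re.fullmatch(regex, text)
        if match:
            # groupdict() maps each group name to its matched string.
            pairs = {k: v for k, v in match.groupdict().items() if v is not None}
            results.append(pairs)
    if not results:
        return {}  # empty key-value pair result set
    chosen = random.choice(results)
    return {name: value for name, value in chosen.items() if name in skill_slots}


regexes = [
    r"(?P<place>Beijing|Shanghai) how is the weather",
    r"(?P<date>today|tomorrow) how is the weather",
]
slots = extract_slots(regexes, "Beijing how is the weather", {"date", "place"})
```

For this input only the first expression matches, so slots is {"place": "Beijing"}.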

S205: Based on the slot key-value pairs extracted from the input text, perform slot filling on the slots corresponding to the target skill.

In specific implementation, according to the slot names of the extracted slot key value pairs, the slot positions with the same slot names in the target skill are found, and the slot values in the slot key value pairs are filled in the target skill under the slot positions.

In a multi-turn dialogue scenario, the target skill of the previous round of conversation may be acquired first. If the target skill of the current round differs from that of the previous round, the current round opens a conversation about a new intention, and the slot data of the target skill of the current round is filled based on the slot key-value pairs extracted from the input text. If the target skill of the current round is the same as that of the previous round, the current round continues the previous round: the slot data of the target skill after the previous round is acquired and updated based on the slot key-value pairs extracted from the input text, and the updated slot data is taken as the slot filling result of the current round. The update proceeds as follows: the slot key-value pairs extracted in the current round are merged into the slot data of the previous round, and a slot value in the previous round's slot data is replaced by the slot value in the key-value pair of the same name. For example, if the slot data of the previous round is {"date": "tomorrow", "place": "Guangzhou"} and the slot extraction result of the current round is {"place": "Beijing"}, the final slot data of the current round after merging is {"date": "tomorrow", "place": "Beijing"}.

After slot filling, the slot data of the target skill can be stored in the session state data of the current round of conversation.

S206: Acquire the reply content corresponding to the input text based on the slot filling result of the target skill.
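The update rule amounts to a dictionary merge in which the current round takes precedence. A minimal sketch, with a hypothetical function name and argument layout:

```python
def fill_slots(previous_skill, previous_slots, current_skill, extracted):
    """Return the slot data of the current round."""
    if current_skill != previous_skill:
        # A new intention starts from the freshly extracted pairs only.
        return dict(extracted)
    merged = dict(previous_slots)  # copy so the stored session data is not mutated
    merged.update(extracted)       # current-round values replace previous ones
    return merged


continued = fill_slots(
    "weather", {"date": "tomorrow", "place": "Guangzhou"},
    "weather", {"place": "Beijing"},
)
restarted = fill_slots(
    "ticket", {"station": "Guangzhou"},
    "weather", {"place": "Beijing"},
)
```

Here continued reproduces the merged example from the text, and restarted contains only the newly extracted place slot.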

Step S206 is essentially a process of selecting reply content according to reply conditions, where the reply content includes, but is not limited to, data types such as text, voice and instructions. The specific implementation is as follows: the conditional statements are parsed one by one, and the execution result of each is judged to be True or False. If the result is True, the reply content under that condition is acquired; if it is False, the next conditional statement is parsed and executed; and if there is no next conditional statement, the default reply content of the system is acquired. If the reply content defines a text, the reference variables in the text are parsed and replaced with their values, and the replaced text is placed in the text field of the response message. If the reply content defines a voice, the reference variables in the voice string are parsed and replaced, and the replaced voice string is placed in the voice field of the response message. If the reply content defines an instruction, the instruction name and instruction parameters are obtained and placed in the instruction field of the response message. The response message is then sent to the client, which completes the whole real-time response process. The present application supports custom reply rules to meet complex and changing business requirements.

In the man-machine automatic dialogue method described above, because the target slot attribute of an entity can be customized, a pattern string can be dynamically translated into a regular expression during slot extraction. Only pattern strings, which have simpler rules and are more flexible to express, need to be maintained instead of complex regular expressions, which greatly reduces the user learning cost and the system maintenance cost and improves the usability and matching efficiency of the system. In addition, the method provides an entity definition function for the optional words in a pattern string: an entity is a group of similar optional words that is finally translated into part of the regular expression and given the group name of its target slot, which improves the extensibility of pattern string definitions and optimizes the matching efficiency.
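The reply selection can be sketched as below. The text does not give the syntax of reference variables or the layout of a reply rule, so the ${name} placeholder form, the (condition, reply) pairs with callable conditions, and the field names are all hypothetical.

```python
import re

VAR_REF = re.compile(r"\$\{([\w.]+)\}")  # hypothetical reference-variable syntax


def render(template, variables):
    """Replace every ${name} reference with the value of that variable."""
    return VAR_REF.sub(lambda m: str(variables[m.group(1)]), template)


def select_reply(rules, default_reply, variables):
    """Evaluate conditions in order and build the response message."""
    chosen = default_reply
    for condition, reply in rules:
        if condition(variables):  # first condition that evaluates to True wins
            chosen = reply
            break
    message = {}
    if "text" in chosen:
        message["text"] = render(chosen["text"], variables)
    if "voice" in chosen:
        message["voice"] = render(chosen["voice"], variables)
    if "instruction" in chosen:
        message["instruction"] = dict(chosen["instruction"])
    return message


rules = [
    (lambda v: v.get("slot.place") is None, {"text": "Which city do you mean?"}),
    (lambda v: True, {"text": "Here is the weather for ${slot.place}.",
                      "instruction": {"name": "show_weather", "params": {}}}),
]
default = {"text": "Sorry, I cannot help with that."}
reply = select_reply(rules, default, {"slot.place": "Beijing"})
```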

In particular, an entity may correspond to slots under one or more skills to increase the multiplexing rate of the entity. Slots of one or more skills may be added in the target slot attributes of the same entity. For example, the target slot property of the entity place may include a slot city in a weather skill, and may also include a slot station in a ticket purchasing skill, so that both the weather skill and the ticket purchasing skill may use an entity word in the entity place. Through the self-defined setting of the target slot position attribute to the entity, a plurality of skills can share one entity, the expansibility of entity words and mode strings is improved, and the maintenance cost is reduced.

Therefore, in step S203, determining a target slot name corresponding to the entity based on the target slot attribute of the entity in the candidate pattern string specifically includes: and selecting a slot position name corresponding to a slot position configured for the target skill from the target slot position attributes of the entity as the target slot position name corresponding to the entity, wherein the target slot position attribute of each entity comprises at least one slot position name.

For example, the pattern string "[place] how is the weather" includes the entity place, and the target slot attribute obtained for the entity place is: city and station. The target skill of the current round of conversation is weather, and the slots of the weather skill include city, so city is used as the target slot name of the entity place, and the pattern string is finally translated into a regular expression in which [place] is replaced by a group named city, followed by the unchanged text "how is the weather".
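Selecting the target slot name is a lookup of the entity's target slot attribute against the slots configured for the target skill. A minimal sketch, with hypothetical names and taking the first name in attribute order when several qualify:

```python
def target_slot_name(entity_target_slots, skill_slots):
    """Return the first slot name of the entity that the target skill defines."""
    for name in entity_target_slots:
        if name in skill_slots:
            return name
    return None  # the entity has no slot under this skill


place_target_slots = ["city", "station"]
for_weather = target_slot_name(place_target_slots, {"city", "date"})
for_ticket = target_slot_name(place_target_slots, {"station", "time"})
```

The same place entity thus resolves to city under the weather skill and to station under the ticket purchasing skill.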

For an entity of the entry type, an entity in a pattern string may contain a very large number of entity words; for example, place may contain all place names in the country. Exhaustively translating all of them produces a very long regular expression, and the excess of optional words makes matching inefficient. The method therefore adopts dynamic translation: the entity words contained in the input text are searched for, and only those words, rather than all entity words, are included in the translation. This optimization greatly shortens the regular expression and thereby improves the matching efficiency.

In specific implementation, a dictionary tree (trie) generated from the entity words contained in the entities related to the target skill can be obtained; the entity words contained in the input text are then searched for in the dictionary tree using the Aho-Corasick (AC) automaton algorithm, and the entity words found are added to a candidate word set as candidate words.

Correspondingly, in step S203, obtaining the target character string conforming to regular expression syntax based on the target slot name and the entity words contained in the entity specifically includes: selecting, from the entity words contained in the entity, those that appear in the candidate word set as the target entity words of the entity; and converting the target slot name and the target entity words into a target character string conforming to regular expression syntax.

To improve processing efficiency, all entity words of the entities within the scope of the target skill can be taken out in advance, and a dictionary tree can be built from them and stored in a cache so that it can be called during translation. When a pattern string is translated, all entity words contained in the input text are searched for in the dictionary tree using the AC automaton algorithm. For example, for the pattern string "[city] (today|tomorrow) how is the weather" and the input text "Beijing today how is the weather", the AC automaton algorithm finds the city entity word "Beijing" in the input text. Since the input text clearly contains no other entity word of city, a pattern formed from the other entity words cannot match the input text. Only "Beijing" therefore needs to be included during translation, and the resulting regular expression contains a group named city whose only alternative is "Beijing".
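The dynamic translation can be sketched with a small Aho-Corasick automaton written from scratch. The function names, the tuple representation of the automaton, the Python named-group syntax, and the decision to return None when no entity word is found are assumptions made for illustration.

```python
import re
from collections import deque


def build_automaton(words):
    """Build trie transitions, failure links and output lists for the words."""
    goto, fail, out = [{}], [0], [[]]
    for word in words:
        node = 0
        for ch in word:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(word)
    queue = deque(goto[0].values())  # depth-1 nodes fail to the root
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            out[child] = out[child] + out[fail[child]]  # inherit suffix matches
            queue.append(child)
    return goto, fail, out


def find_words(text, automaton):
    """Return the set of dictionary words that occur in the text."""
    goto, fail, out = automaton
    found, node = set(), 0
    for ch in text:
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        found.update(out[node])
    return found


def translate_dynamic(pattern_string, entity, slot, entity_words, text):
    """Translate [entity] using only the entity words present in the text."""
    candidates = find_words(text, build_automaton(entity_words))
    targets = [w for w in entity_words if w in candidates]
    if not targets:
        return None  # assumption: without any entity word the pattern cannot match
    group = "(?P<%s>%s)" % (slot, "|".join(re.escape(w) for w in targets))
    return pattern_string.replace("[%s]" % entity, group)
```

In practice the automaton would be built once and cached, as the text suggests, rather than rebuilt on every call as in this sketch.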

In the process of determining the target skill triggered by the input text in a mode of pattern matching, the translation of the pattern string is also involved, the translation optimization mode can be also adopted in the translation process, the length of the regular expression is shortened, the matching efficiency is further improved, the specific process is similar, and repeated description is omitted.

On the basis of any of the above embodiments, before executing step S206, the method for human-machine automatic conversation according to the embodiment of the present application further includes the following steps: acquiring a slot position modification triggering condition corresponding to the target skill, and judging whether the slot position modification triggering condition is met; if so, updating the slot position value of the slot position to be modified of the target skill according to a preset slot position assignment statement; if not, the slot bit value is not modified.

The slot modification trigger condition follows the condition syntax, but its left value can only be a system predefined variable, including input data variables and session state data variables. For example, if a pattern set c_city is defined in the slot extraction stage and the slot value needs to be changed in the custom slot filling stage when c_city is matched, the slot modification trigger condition may be defined as follows:

[{"left": "skill.rule", "operator": "=", "right": "c_city", "relation": "and"}]
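A conditional statement of this form can be evaluated as sketched below. Only the "=" operator symbol and the and/or connectors appear in the text; the "in" name for the contain operator, the flat variable dictionary, and the strict left-to-right combination of clauses without precedence are assumptions.

```python
import operator

OPERATORS = {
    "=": operator.eq,                         # the symbol shown in the example
    "in": lambda left, right: left in right,  # hypothetical name for the contain operator
}


def evaluate(condition, variables):
    """Evaluate the clauses from left to right and combine them by connector."""
    result = None
    for clause in condition:
        left = variables.get(clause["left"])
        value = OPERATORS[clause["operator"]](left, clause["right"])
        if result is None:
            result = value  # first clause: its connector must be "and"
        elif clause["relation"] == "and":
            result = result and value
        else:
            result = result or value
    return bool(result)


trigger = [{"left": "skill.rule", "operator": "=", "right": "c_city", "relation": "and"}]
hit = evaluate(trigger, {"skill.rule": "c_city"})
miss = evaluate(trigger, {"skill.rule": "c_date"})
```

With the trigger condition above, hit is True when the matched pattern set is c_city, and miss is False otherwise.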

The slot assignment statement specifies the slot to be modified and how its slot value is modified. The assignment statement follows the assignment syntax; for example, [{"left": "word.slot.place", "right_type": 1, "right": "Beijing"}, {"left": "word.slot.date", "right_type": 1, "right": "today"}] sets the value of the place slot of the target skill to "Beijing" and the value of the date slot to "today".

On the basis of automatic slot lifting, the self-defined slot position modifying mode is provided, the slot position value can be modified into required data according to the hit slot position modifying triggering condition and the slot position assignment statement, and the capability of the system for adapting to service requirement change is improved.

For example, when an external interface needs to be called to obtain relevant data, in order to ensure that the slot information of the target technology meets the requirement of the external interface on input data, the slot information of the target technology needs to be modified based on the rule requirement of the external interface. Therefore, the slot position to be modified and the corresponding slot position default value in the slot position assignment statement can be configured in advance according to the rule requirement of the external interface required to be called when the skill is executed. When the slot position value of the target skill is updated, the slot position to be modified and the corresponding slot position default value can be obtained from the preset slot position assignment statement, and the slot position value of the slot position to be modified of the target skill is modified into the corresponding slot position default value. For example, peripheral skill searching is triggered based on user input, the extracted slot position value of the search type slot is 'industrial and commercial bank', at this time, a search interface of external map software needs to be called to query relevant data, the search interface of the external map software only supports searching of a 'bank' type, the slot position value of the search type slot for the peripheral skill searching needs to be modified into a default value 'bank' of the slot position, and then the search interface can be smoothly called to query peripheral bank information and feed the information back to the client.

In some application scenarios, the system can obtain the slot information required by the target skill through a network or an external device, and repeatedly asking the user would reduce communication efficiency. For example, when the user asks "How is the weather?", the user generally means the current weather at the user's location. For such skills, default slots, namely slots to be modified, can be configured in the slot assignment statement. When the slot values of these slots are not obtained, relevant data can be collected through channels such as the network, external devices of the client, and user information, so as to complete the modification of the slots. Specifically, the slot assignment statement of the target skill is obtained, the slot to be modified is obtained from the preset slot assignment statement, relevant data is collected based on the slot to be modified, a slot correction value of the slot to be modified is determined based on the collected data, and the slot value of the slot to be modified of the target skill is modified into the corresponding slot correction value. Taking the input text "How is the weather?" as an example, the background server may obtain the user's current position as the slot value of the place slot and use the current time as the slot value of the time slot.
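The correction-value variant can be sketched in the same way. In the following Python sketch the collector functions, the context keys `client_city` and `today`, and the function name `fill_missing_slots` are all assumptions made for illustration; a real deployment would collect the data from the network, the client device or user information.

```python
from datetime import date
from typing import Callable, Dict, List, Optional

# Hypothetical collector: returns a value for one slot, or None if unavailable.
Collector = Callable[[Dict[str, str]], Optional[str]]


def fill_missing_slots(slots: Dict[str, Optional[str]],
                       to_modify: List[str],
                       collectors: Dict[str, Collector],
                       context: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Fill each slot to be modified that has no value, using collected data."""
    filled = dict(slots)
    for name in to_modify:
        if filled.get(name) is not None:
            continue  # keep values already extracted from the user input
        collector = collectors.get(name)
        value = collector(context) if collector else None
        if value is not None:
            filled[name] = value  # slot correction value
    return filled


collectors: Dict[str, Collector] = {
    "place": lambda ctx: ctx.get("client_city"),
    "date": lambda ctx: ctx.get("today"),
}
context = {"client_city": "Shenzhen", "today": date.today().isoformat()}
result = fill_missing_slots({"place": None, "date": None},
                            ["place", "date"], collectors, context)
```

Slots that the user has already filled are left untouched, so the collected data only acts as a fallback.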

In the slot filling process, the self-defined correction process can be carried out on the slot position value of the target skill according to the slot position modification triggering condition and the slot position assignment statement so as to meet different business requirements.

In order to meet various types of service requirements, in the embodiment of the application, input data uploaded to a background server by a client is divided into a common type and an event type according to the source of the input data.

The common type of input data is input by the user actively performing related operations on the client, for example, data input by voice, text, or gesture. The input text shown in fig. 2 is obtained from input data of the common type.

The input data of the event type is data actively triggered and reported by the client; for example, the client detects that there are users nearby and immediately reports the related data. The client reports data of the event type in the form of an event, and the input data of the event type is a predefined standard message that has an event name and event parameters. For example, a WELCOME event is predefined whose event name is WELCOME and whose event parameter carries an employee id; when the client recognizes that a company employee has entered the company and identifies the specific employee, it sends the WELCOME event together with the employee id parameter to the background server.

The background server stores the input data uploaded by the client in a cache (e.g., Redis). The background server allows input data to be referenced through variables: for input data of the common type, the variable name of the input data is input; for input data of the event type, the variable name of the event name is input.

After receiving input data uploaded by the client, the background server firstly judges the data type of the input data, and executes the steps shown in fig. 2 when the input data is of a common type; when the input data is of the event type, the steps shown in fig. 3 are performed.

When a pending event (i.e. input data is an event type) automatically triggered by a client is received, the man-machine automatic dialogue method according to the embodiment of the application further includes the following steps:

S301, searching the preconfigured skills for a target skill matched with the event to be processed, based on the event name and the event parameters of the event to be processed.

In specific implementation, a corresponding preset event is configured for each skill in advance, and each preset event comprises an event name and event parameters. The skill of a preset event is judged to be triggered by the event to be processed only if the event to be processed and the preset event have the same event name and the same event parameters. If the event to be processed successfully matches a plurality of preset events, one preset event is randomly selected, and the skill of that preset event is taken as the target skill. If no preset event matches the event to be processed, it is determined that no skill is triggered; in this case, the background server may select one utterance from a preset utterance set and return it to the client, for example, "What service do you need?" or "I did not hear clearly, please say it again."
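Step S301 can be sketched as follows. The `PresetEvent` class, the skill name `greet_employee` and the fallback replies are hypothetical, and the message fields `type`, `input` and `args` follow the sample event message given in this description. The sketch also assumes that "the same event parameters" means every parameter name configured in the preset event is present in the reported event.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PresetEvent:
    skill: str
    event_name: str
    param_slots: Dict[str, str]  # event parameter name -> target slot name


FALLBACK_REPLIES = ["What service do you need?",
                    "I did not hear clearly, please say it again."]


def match_event(event: dict, presets: List[PresetEvent]) -> Optional[PresetEvent]:
    """Return one preset event hit by `event`, or None if no skill is triggered."""
    args = event.get("args", {})
    hits = [p for p in presets
            if p.event_name == event.get("input")
            and all(name in args for name in p.param_slots)]
    # Several hits: pick one at random, as in the description.
    return random.choice(hits) if hits else None


presets = [PresetEvent("greet_employee", "WELCOME", {"name": "employee_name"})]
event = {"type": 2, "input": "WELCOME", "args": {"id": 10000, "name": "Zhang San"}}
hit = match_event(event, presets)
reply_or_skill = hit.skill if hit else random.choice(FALLBACK_REPLIES)
```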

S302, based on the target slot position configured for the event parameter in advance, filling the parameter value of the event parameter in the event to be processed into the slot position of the target skill.

In specific implementation, from the input event parameters, the event parameters whose parameter names are the same as those in the hit rule are selected together with their parameter values to form original key-value pairs. For example, the client inputs {"type": 2, "input": "WELCOME", "args": {"id": 10000, "name": "Zhang San"}}, that is, the input event name is WELCOME and the event has two event parameters, id and name. If the hit event rule has only the parameter name "name", the original key-value pair is {"name": "Zhang San"}. The key name of each original key-value pair is then replaced with the target slot attribute configured in the hit rule for the parameter of that name, forming the slot extraction result. For example, if the original key-value pair is {"name": "Zhang San"} and the target slot attribute of the name event parameter in the hit rule is employee_name, the slot extraction result is {"employee_name": "Zhang San"}. According to the slot extraction result, the slot with the same slot name is found in the target skill, and the slot value from the key-value pair is filled into that slot.

In a multi-turn dialog scenario, the target skill of the previous turn of dialog may be acquired first. If the target skill of the current turn is not consistent with the target skill of the previous turn, the slot data of the target skill of the current turn is filled based on the slot extraction result of the event to be processed. If the target skill of the current turn is consistent with the target skill of the previous turn, the slot data of the target skill after the previous turn is acquired, the slot data is updated based on the slot extraction result of the event to be processed, and the updated slot data is taken as the slot filling result of the current turn.
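Step S302 and the multi-turn update can be sketched together. The function names and the sample values below are illustrative assumptions; only the renaming of parameter keys to slot names and the merge with the previous turn's slot data are taken from the description.

```python
from typing import Any, Dict, Optional


def extract_event_slots(args: Dict[str, Any],
                        param_slots: Dict[str, str]) -> Dict[str, Any]:
    """Keep only parameters named in the hit rule and rename keys to slot names."""
    return {slot: args[param] for param, slot in param_slots.items() if param in args}


def fill_slots(current_skill: str,
               extracted: Dict[str, Any],
               previous_skill: Optional[str] = None,
               previous_slots: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Start from the previous turn's slots only when the skill is unchanged."""
    if previous_skill == current_skill and previous_slots is not None:
        merged = dict(previous_slots)
        merged.update(extracted)  # newly extracted values overwrite older ones
        return merged
    return dict(extracted)


args = {"id": 10000, "name": "Zhang San"}
extracted = extract_event_slots(args, {"name": "employee_name"})
same_skill = fill_slots("greet", extracted, "greet", {"floor": "3"})
new_skill = fill_slots("greet", extracted, "weather", {"place": "Beijing"})
```

When the skill changes between turns, the previous slot data is discarded, so slots of one skill never leak into another.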

And S303, acquiring the reply content corresponding to the input text based on the slot position value of the target skill.

The specific implementation manner of step S303 is similar to that of step S302, and is not described again.

The man-machine automatic dialogue method can support different types of input data, adopts different analysis and processing modes aiming at the different types of input data, enables the processing process to be more flexible, and improves the data processing efficiency.

In an application scenario of multi-turn dialog, before skill matching is performed, the skill triggered in the previous turn is obtained from the session state data, and whether that skill has finished is judged. If the skill triggered in the previous turn has not finished, skill matching is not performed; the skill triggered in the previous turn is directly used as the target skill of the current turn, and the flow proceeds directly to slot extraction. If the skill triggered in the previous turn has finished, a new skill needs to be triggered: skill matching is performed first, and after the target skill of the current turn is determined, slot extraction, slot filling and the subsequent processes are performed. Generally, the skill can be executed once all the slots required by the target skill have been filled.
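The decision between continuing the previous skill and matching a new one can be sketched as below. The layout of the session state (`skill`, `slots`), the table of required slots and the toy matcher are assumptions; the sketch treats a skill as finished once all of its required slots hold a value.

```python
from typing import Callable, Dict, List, Optional


def is_finished(skill: str, slots: Dict[str, Optional[str]],
                required: Dict[str, List[str]]) -> bool:
    """A skill is treated as finished once every required slot has a value."""
    return all(slots.get(name) is not None for name in required.get(skill, []))


def select_target_skill(state: Dict, required: Dict[str, List[str]],
                        match_skill: Callable[[str], Optional[str]],
                        text: str) -> Optional[str]:
    previous = state.get("skill")
    if previous is not None and not is_finished(previous, state.get("slots", {}), required):
        return previous  # unfinished: keep the skill and go straight to slot extraction
    return match_skill(text)  # finished or no previous skill: match a new skill


required = {"weather": ["place", "date"]}


def toy_matcher(text: str) -> Optional[str]:
    # Hypothetical matcher used only to make the sketch runnable.
    return "nearby_search" if "bank" in text else None


unfinished = {"skill": "weather", "slots": {"place": "Beijing", "date": None}}
finished = {"skill": "weather", "slots": {"place": "Beijing", "date": "today"}}
kept = select_target_skill(unfinished, required, toy_matcher, "tomorrow")
switched = select_target_skill(finished, required, toy_matcher, "find a bank")
```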

Conversational systems typically serve multiple users and therefore require a data isolation mechanism to ensure that the sessions of different users do not interfere with each other. Data isolation applies to session state data: each piece of session state data carries a user identifier, the session state data of each user is stored separately, and the system only acquires the session state data of a given user when processing that user's session request. For example, the skill slot data of the user with id 10000 is one kind of session state data and is temporarily stored in Redis as a key-value pair; prefixing the key name with the user id, for example "10000:slot", realizes data isolation, so that different users can use the system independently.

In order to enable a user to flexibly trigger and quit skills, an expiration mechanism for session state data is introduced into the system: when the session state data expires, the skill is automatically quit. The system can also judge whether a skill has finished according to the slot completion condition of the triggered skill, and automatically re-enters skill selection when the skill has finished, so that the session does not remain stuck in one skill.
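The isolation and expiration mechanisms can be illustrated with an in-memory stand-in for the cache. This is a sketch only: the `SessionStore` class is hypothetical and replaces Redis with a Python dictionary, and the injectable clock exists purely so that expiry can be demonstrated without waiting.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class SessionStore:
    """In-memory stand-in for a cache with per-user keys and expiry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[str, Tuple[float, Any]] = {}

    @staticmethod
    def key(user_id: int, name: str) -> str:
        return f"{user_id}:{name}"  # e.g. "10000:slot"

    def set(self, user_id: int, name: str, value: Any) -> None:
        self._data[self.key(user_id, name)] = (self._clock() + self._ttl, value)

    def get(self, user_id: int, name: str) -> Optional[Any]:
        k = self.key(user_id, name)
        entry = self._data.get(k)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[k]  # expired: the skill is effectively quit
            return None
        return value


now = [0.0]  # fake clock so expiry can be simulated
store = SessionStore(ttl_seconds=60, clock=lambda: now[0])
store.set(10000, "slot", {"place": "Beijing"})
store.set(10001, "slot", {"place": "Shanghai"})
```

Advancing `now[0]` past 60 makes both entries expire, after which `get` returns `None` and the next request would go through skill matching again.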

As shown in fig. 4, based on the same inventive concept as the above-mentioned man-machine automatic dialogue method, the embodiment of the present application further provides a man-machine automatic dialogue device 40, including:

the skill matching module 401 is configured to determine a target skill triggered by an input text when the input text obtained based on user operation is obtained;

a translation module 402, configured to obtain the slot extraction pattern strings matched with the target skill to form a candidate pattern set; and, for each candidate pattern string in the candidate pattern set, determine a target slot name corresponding to an entity based on a target slot attribute of the entity in the candidate pattern string, obtain a target character string conforming to regular expression grammar based on the target slot name and the entity words contained in the entity, and replace the entity in the candidate pattern string with the target character string to obtain the regular expression corresponding to the candidate pattern string;

a slot extracting module 403, configured to match the obtained regular expression with the input text, and extract slot position key value pairs from the input text based on a matching result, where each slot position key value pair includes a slot position name and a slot position value;

a slot filling module 404, configured to perform slot filling processing on a slot corresponding to the target skill based on the slot key value pair extracted from the input text;

and a reply module 405, configured to obtain reply content corresponding to the input text based on the slot filling result of the target skill.

Optionally, the translation module 402 is specifically configured to: and selecting a slot position name corresponding to the slot position configured for the target skill from the target slot position attributes of the entity as the target slot position name corresponding to the entity, wherein the target slot position attribute of each entity comprises at least one slot position name.

Optionally, the translation module 402 is further configured to: acquire a dictionary tree generated based on the entity words contained in the entities related to the target skill; search the dictionary tree for entity words contained in the input text by using the Aho-Corasick (AC) automaton algorithm, and add the found entity words to a candidate word set as candidate words;

accordingly, the translation module 402 is specifically configured to: select, from the entity words contained in the entity, the entity words that appear in the candidate word set as the target entity words of the entity; and convert the target slot name and the target entity words into a target character string conforming to regular expression grammar.
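The candidate-word search and the translation into a regular expression can be sketched as follows. The sketch makes several assumptions that are not taken from this section: entities are written in the pattern string as `{entity_name}`, the target character string is a Python named group `(?P<slot>word1|word2)`, and the helper names are hypothetical. The automaton itself is a standard Aho-Corasick construction over a dictionary tree.

```python
import re
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Automaton = Tuple[List[Dict[str, int]], List[int], List[Set[str]]]


def build_automaton(words: Iterable[str]) -> Automaton:
    """Build a dictionary tree with Aho-Corasick failure links."""
    goto: List[Dict[str, int]] = [{}]
    fail: List[int] = [0]
    out: List[Set[str]] = [set()]
    for word in words:
        node = 0
        for ch in word:
            if ch not in goto[node]:
                goto.append({}); fail.append(0); out.append(set())
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].add(word)
    queue = deque(goto[0].values())  # depth-1 nodes fail to the root
    while queue:
        node = queue.popleft()
        for ch, nxt in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, out


def find_words(text: str, automaton: Automaton) -> Set[str]:
    """Return every dictionary word that occurs in `text` (single pass)."""
    goto, fail, out = automaton
    found: Set[str] = set()
    node = 0
    for ch in text:
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        found |= out[node]
    return found


def to_regex(pattern: str, entities: Dict[str, List[str]],
             slot_names: Dict[str, str], candidates: Set[str]) -> str:
    """Replace each {entity} placeholder with a named group of candidate words."""
    def repl(m: "re.Match[str]") -> str:
        entity = m.group(1)
        words = sorted((w for w in entities[entity] if w in candidates),
                       key=len, reverse=True)
        body = "|".join(re.escape(w) for w in words) if words else "(?!)"
        return f"(?P<{slot_names[entity]}>{body})"
    return re.sub(r"\{(\w+)\}", repl, pattern)


entities = {"city": ["Beijing", "Shanghai", "Shenzhen"]}
slot_names = {"city": "place"}
text = "what is the weather in Beijing today"
automaton = build_automaton(w for ws in entities.values() for w in ws)
candidates = find_words(text, automaton)
regex = to_regex("weather in {city}", entities, slot_names, candidates)
match = re.search(regex, text)
```

Restricting each group to the candidate words keeps the generated regular expression short even when an entity contains a large number of entity words, which appears to be the purpose of the filtering step.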

Optionally, the slot filling module 404 is further configured to: acquire a slot modification triggering condition corresponding to the target skill before the reply content corresponding to the input text is acquired based on the slot filling result of the target skill; and judge whether the slot modification triggering condition is met, and if so, update the slot value of the slot to be modified of the target skill according to a preset slot assignment statement, wherein the slot assignment statement comprises the slot to be modified and a modification mode of the slot value.

Optionally, the slot filling module 404 is specifically configured to: acquiring a slot position to be modified and a corresponding slot position default value from a preset slot position assignment statement, and modifying the slot position value of the slot position to be modified of the target skill into the corresponding slot position default value; or acquiring the slot position to be modified from a preset slot position assignment statement, acquiring related data based on the slot position to be modified, determining a slot position correction value of the slot position to be modified based on the acquired related data, and modifying the slot position value of the slot position to be modified of the target skill into a corresponding slot position correction value.

Optionally, the skill matching module 401 is specifically configured to: determine the target skill triggered by the input text by pattern matching, wherein pattern matching refers to matching the input text against the regular expressions corresponding to each skill; and, if no target skill is determined by pattern matching, determine the target skill triggered by the input text by utterance matching, wherein the utterance matching process comprises: acquiring the feature vector of the input text and the feature vector of each preset utterance, calculating the similarity between the feature vector of the input text and the feature vector of each preset utterance, and, if the maximum similarity exceeds a preset threshold, determining the skill of the corresponding preset utterance as the target skill triggered by the input text.
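The two-stage matching can be sketched as below. This section does not specify how feature vectors are computed, so the sketch substitutes character-bigram counts with cosine similarity as a simple stand-in; the skill names, patterns, utterances and the threshold of 0.5 are likewise hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Optional


def embed(text: str) -> Counter:
    """Stand-in feature vector: counts of character bigrams."""
    t = text.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def match_skill(text: str, skill_patterns: Dict[str, str],
                skill_utterances: Dict[str, List[str]],
                threshold: float = 0.5) -> Optional[str]:
    # Stage 1: pattern matching against each skill's regular expression.
    for skill, pattern in skill_patterns.items():
        if re.search(pattern, text):
            return skill
    # Stage 2: utterance matching; keep the best similarity above the threshold.
    query = embed(text)
    best_skill, best_score = None, threshold
    for skill, utterances in skill_utterances.items():
        for utterance in utterances:
            score = cosine(query, embed(utterance))
            if score > best_score:
                best_skill, best_score = skill, score
    return best_skill


patterns = {"weather": r"weather in \w+"}
utterances = {"weather": ["how is the weather"],
              "nearby_search": ["find a bank near me"]}
by_pattern = match_skill("weather in Beijing", patterns, utterances)
by_utterance = match_skill("how is the weather today", patterns, utterances)
no_skill = match_skill("zzz qqq", patterns, utterances)
```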

Optionally, the man-machine automatic conversation apparatus 40 further includes: an event slot position module;

the skill matching module 401 is further configured to, when a to-be-processed event automatically triggered by a client is received, find a target skill matched with the to-be-processed event in preconfigured skills based on an event name and an event parameter of the to-be-processed event, where the corresponding event name and the event parameter are preconfigured for the skill;

the event slot position module is used for filling the parameter value of the event parameter in the event to be processed into the slot position of the target skill based on a target slot position configured for the event parameter in advance;

and the reply module 405 is further configured to obtain reply content corresponding to the input text based on the slot position value of the target skill.

The human-computer automatic conversation device provided by the embodiment of the application and the human-computer automatic conversation method adopt the same inventive concept, can obtain the same beneficial effects, and are not described again.

Based on the same inventive concept as the man-machine automatic conversation method, an embodiment of the present application further provides an electronic device. The electronic device may be, for example, a control device or control system inside an intelligent device, or an external device communicating with the intelligent device, and may specifically be a desktop computer, a portable computer, a smart phone, a tablet computer, a Personal Digital Assistant (PDA), a server, or the like. As shown in fig. 5, the electronic device 50 may include a processor 501 and a memory 502.

The Processor 501 may be a general-purpose Processor, such as a Central Processing Unit (CPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, a discrete Gate or transistor logic device, or a discrete hardware component, which may implement or execute the methods, steps, and logic blocks disclosed in the embodiments of the present Application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware processor, or may be implemented by a combination of hardware and software modules in a processor.

Memory 502, as a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory may include at least one type of storage medium, for example, a flash memory, a hard disk, a multimedia card, a card-type memory, a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Programmable Read Only Memory (PROM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a magnetic memory, a magnetic disk, an optical disk, and so on. The memory may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 502 in the embodiments of the present application may also be circuitry or any other device capable of performing a storage function for storing program instructions and/or data.

Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be implemented by hardware related to program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The computer storage medium may be any available medium or data storage device that can be accessed by a computer, including but not limited to various media that can store program code, such as a removable memory device, a Random Access Memory (RAM), a magnetic memory (e.g., a flexible disk, a hard disk, a magnetic tape, a magneto-optical disk (MO)), an optical memory (e.g., a CD, a DVD, a BD, an HVD), and a semiconductor memory (e.g., a ROM, an EPROM, an EEPROM, a non-volatile memory (NAND FLASH), a Solid State Disk (SSD)).

Alternatively, the integrated units described above in the present application may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a removable memory device, a Random Access Memory (RAM), a magnetic memory (e.g., a flexible disk, a hard disk, a magnetic tape, a magneto-optical disk (MO)), an optical memory (e.g., a CD, a DVD, a BD, an HVD), and a semiconductor memory (e.g., a ROM, an EPROM, an EEPROM, a non-volatile memory (NAND FLASH), a Solid State Disk (SSD)).

The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A human-machine automatic conversation method, comprising:

acquiring reply content corresponding to the input text based on the slot filling result of the target skill;

the method further comprises: acquiring a dictionary tree generated based on the entity words contained in the entities related to the target skill; searching the dictionary tree for entity words contained in the input text by using the Aho-Corasick (AC) automaton algorithm, and adding the found entity words to a candidate word set as candidate words;

the obtaining of the target character string conforming to regular expression grammar based on the target slot name and the entity words contained in the entity comprises: selecting, from the entity words contained in the entity, the entity words that appear in the candidate word set as the target entity words of the entity; and converting the target slot name and the target entity words into a target character string conforming to regular expression grammar.

2. The method of claim 1, wherein determining a target slot name for an entity based on a target slot attribute for the entity in the candidate pattern string comprises:

3. The method according to claim 1, wherein before the reply content corresponding to the input text is acquired based on the slot filling result of the target skill, the method further comprises:

4. The method of claim 3, wherein updating the slot value of the slot to be modified of the target skill according to a preset slot assignment statement comprises:

5. The method of any of claims 1 to 4, wherein determining the target skill triggered by the input text comprises:

determining target skills triggered by the input text in a mode of pattern matching;

6. The method according to any one of claims 1 to 4, further comprising:

7. An automatic human-computer dialog device, comprising:

the translation module is used for acquiring the slot extraction pattern strings matched with the target skill to form a candidate pattern set; acquiring a dictionary tree generated based on the entity words contained in the entities related to the target skill; searching the dictionary tree for entity words contained in the input text by using the Aho-Corasick (AC) automaton algorithm, and adding the found entity words to a candidate word set as candidate words; and, for each candidate pattern string in the candidate pattern set, determining a target slot name corresponding to an entity based on a target slot attribute of the entity in the candidate pattern string, selecting, from the entity words contained in the entity, the entity words that appear in the candidate word set as the target entity words of the entity, converting the target slot name and the target entity words into a target character string conforming to regular expression grammar, and replacing the entity in the candidate pattern string with the target character string to obtain the regular expression corresponding to the candidate pattern string;

8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 6 are implemented when the computer program is executed by the processor.

9. A computer-readable storage medium having computer program instructions stored thereon, which, when executed by a processor, implement the steps of the method of any one of claims 1 to 6.