CN110473530B - Instruction classification method and device, electronic equipment and computer-readable storage medium - Google Patents

Instruction classification method and device, electronic equipment and computer-readable storage medium

Info

Publication number
CN110473530B
CN110473530B (application CN201910773117.6A)
Authority
CN
China
Prior art keywords: analysis, synonym, voice instruction, instruction, voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910773117.6A
Other languages
Chinese (zh)
Other versions
CN110473530A (en
Inventor
孙俊岭
高兵兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910773117.6A priority Critical patent/CN110473530B/en
Publication of CN110473530A publication Critical patent/CN110473530A/en
Application granted granted Critical
Publication of CN110473530B publication Critical patent/CN110473530B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/08 - Speech classification or search
    • G10L15/18 - Speech classification or search using natural language modelling
    • G10L15/1822 - Parsing for meaning understanding
    • G10L15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue

Abstract

The application discloses an instruction classification method and apparatus, an electronic device, and a computer-readable storage medium, relating to the field of computer technology. The specific implementation scheme is as follows: acquire a voice instruction that failed to parse; re-parse the voice instruction multiple times to obtain multiple parsing results; and classify the cause of the parsing failure of the voice instruction according to those results. Because the cause of a failed instruction parse is determined automatically, the method saves labor and time compared with determining faults manually.

Description

Instruction classification method and device, electronic equipment and computer-readable storage medium
Technical Field
The application relates to the field of computer technology, and in particular to the field of voice technology.
Background
With the development of artificial intelligence technology, devices and applications controlled by voice are increasingly common. Metrics for such devices and applications include whether an instruction can be accurately recognized and whether the instruction is supported.
For instructions that fail to parse, manual investigation is currently required to locate the cause. When many instructions fail to parse, manual checking consumes manpower and financial resources, and its quality cannot be guaranteed.
Disclosure of Invention
The embodiments of the invention provide an instruction classification method and apparatus, an electronic device, and a computer-readable storage medium, aiming to solve one or more technical problems in the prior art.
According to a first aspect, an embodiment of the invention provides an instruction classification method, including:
acquiring a voice instruction that a natural language understanding model failed to parse;
re-parsing the voice instruction multiple times with the natural language understanding model to obtain multiple parsing results;
and classifying the cause of the parsing failure of the voice instruction according to the multiple parsing results.
In this way, the cause of a failed instruction parse is determined automatically, which saves labor and time compared with classifying failure causes manually.
In one embodiment, classifying the cause of the parsing failure of the voice instruction according to the multiple parsing results includes:
determining the cause of the parsing failure as a first-type fault when the multiple parsing results show a probabilistic failure (some attempts succeed and some fail).
In this way, a probabilistic failure is classified as a first-type fault, accurately locating the fault type as parsing instability.
In one embodiment, classifying the cause of the parsing failure of the voice instruction according to the multiple parsing results further includes:
acquiring a synonym of the voice instruction when all of the multiple parsing attempts fail;
parsing the synonym to obtain a parsing result of the synonym;
and determining the cause of the parsing failure as a second-type fault when the synonym parses successfully.
In this way, when the synonym of the voice instruction can be parsed but the instruction itself cannot, the fault type is accurately located as a parsing-capability fault.
In one embodiment, classifying the cause of the parsing failure of the voice instruction according to the multiple parsing results further includes:
determining the cause of the parsing failure as a third-type fault when the synonym also fails to parse.
In this way, once the first and second fault types are excluded, the fault type is accurately located as a third-type fault, which corresponds to an unsupported instruction.
In one embodiment, acquiring a synonym of the voice instruction includes:
recognizing the voice instruction to obtain text information;
performing word segmentation on the text information to obtain multiple word segments;
converting each word segment into a synonym, obtaining a synonym for each segment;
and combining the segment synonyms in the original order of the segments to obtain the synonym of the voice instruction.
Compared with converting the whole voice instruction at once, segmenting the instruction, obtaining a synonym for each segment, and then recombining the segments yields more accurate synonyms.
In one embodiment, the method further includes:
querying the behavior log for field information adjacent to the voice instruction;
and inputting the field information into a scene classification model to obtain the scene corresponding to the voice instruction.
Once the scene corresponding to the voice instruction is known, the cause of the fault can be refined and the problem located more quickly. For example, if "I want to go to the airport from the train station" corresponds to a third-type fault in the navigation scene, it can quickly be determined that the navigation scene does not support instructions for going from the train station to the airport, which facilitates subsequent maintenance and upgrades.
One embodiment of the above application has the following advantage or benefit: the cause of a failed instruction parse is determined automatically, which saves labor and time compared with determining failure causes manually. Because the failed voice instruction is re-parsed multiple times and the fault type is classified according to the re-parsing results, the technical problem of requiring manual judgment is solved, achieving the technical effect of saving labor and time.
According to a second aspect, an embodiment of the invention provides an instruction classification apparatus, including:
a failed-instruction acquisition module, configured to acquire a voice instruction that the natural language understanding model failed to parse;
a voice instruction parsing module, configured to re-parse the voice instruction multiple times with the natural language understanding model to obtain multiple parsing results;
and a classification module, configured to classify the cause of the parsing failure of the voice instruction according to the multiple parsing results.
In one embodiment, the classification module includes:
a first classification sub-module, configured to determine the cause of the parsing failure as a first-type fault when the multiple parsing results show a probabilistic failure.
In one embodiment, the classification module further includes:
a synonym acquisition sub-module, configured to acquire a synonym of the voice instruction when all of the multiple parsing attempts fail;
the voice instruction parsing module, further configured to parse the synonym to obtain a parsing result of the synonym;
and a second classification sub-module, configured to determine the cause of the parsing failure as a second-type fault when the synonym parses successfully.
In one embodiment, the classification module further includes:
a third classification sub-module, configured to determine the cause of the parsing failure as a third-type fault when the synonym also fails to parse.
In one embodiment, the synonym acquisition sub-module includes:
a voice instruction recognition unit, configured to recognize the voice instruction and obtain text information;
a word segmentation unit, configured to perform word segmentation on the text information to obtain multiple word segments;
a synonym acquisition execution unit, configured to convert each word segment into a synonym, obtaining a synonym for each segment;
and a synonym combination unit, configured to combine the segment synonyms in the original order of the segments to obtain the synonym of the voice instruction.
In one embodiment, the apparatus further includes:
a query module, configured to query the behavior log for field information adjacent to the voice instruction;
and a scene determination module, configured to input the field information into the scene classification model to obtain the scene corresponding to the voice instruction.
According to a third aspect, an embodiment of the invention provides an electronic device whose functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above.
In one possible design, the electronic device includes a processor and a memory; the memory stores a program supporting execution of the instruction classification method, and the processor is configured to execute the program stored in the memory. The device may further include a communication interface for communicating with other devices or with a communication network.
According to a fourth aspect, an embodiment of the invention provides a non-transitory computer-readable storage medium storing computer software instructions for the instruction classification apparatus, including a program for executing the instruction classification method.
Other effects of the above alternatives will be described below in conjunction with specific embodiments.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present application;
FIG. 2 is a schematic diagram according to a first embodiment of the present application;
FIG. 3 is a schematic diagram according to a first embodiment of the present application;
FIG. 4 is a schematic diagram according to a first embodiment of the present application;
FIG. 5 is a schematic diagram according to a first embodiment of the present application;
FIG. 6 is a schematic diagram according to a second embodiment of the present application;
FIG. 7 is a schematic diagram according to a second embodiment of the present application;
FIG. 8 is a schematic diagram according to a second embodiment of the present application;
FIG. 9 is a schematic diagram according to a second embodiment of the present application;
FIG. 10 is a block diagram of an electronic device for implementing a method of instruction classification of an embodiment of the present application.
Detailed Description
The following describes exemplary embodiments of the present application with reference to the accompanying drawings, including various details of the embodiments to aid understanding; these should be considered exemplary only. Those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the application. Likewise, descriptions of well-known functions and constructions are omitted below for clarity and conciseness.
FIG. 1 shows a flowchart of an instruction classification method according to an embodiment of the invention. As shown in FIG. 1, the method includes the following steps:
S101: acquire a voice instruction that the natural language understanding model failed to parse.
A voice instruction can be parsed by a natural language understanding model; for example, a natural language understanding server or application may parse the user's voice instruction. The voice instruction is converted into text information, and the user's intent, word slots, and so on are extracted from the text. For example, in the application scenario of a voice map, the user's voice instruction may be "I want to go to the airport from the train station". After converting the instruction into text, the natural language understanding model parses the intent as navigation, with word slots "train station" (the navigation start point) and "airport" (the navigation destination).
When the natural language understanding model fails to parse a voice instruction, it may output blank information or information indicating that parsing was impossible, such as "unable to parse", "parsing failed", or "FAIL".
The set of parsing results of the natural language understanding model can be collected with big-data technology, yielding the set of voice instructions that failed to parse (FailedOrders). A big-data processing framework such as Spark can also be used to collect this set.
The voice instructions that failed to parse may then be prioritized. For example, identical voice instructions are counted, and a certain number of the top-ranked instructions are selected. "Identical" may mean the same text; alternatively, a similarity threshold may be preset, the texts of voice instructions compared, and instructions whose text similarity exceeds the threshold treated as the same instruction.
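The counting-and-ranking step can be sketched in Python. This is a minimal illustration; the function name and the toy instruction list are assumptions, and a real system would read the FailedOrders set from a big-data store rather than an in-memory list.

```python
from collections import Counter

def top_failed_instructions(failed_orders, top_n=1000):
    """Count identical failed instructions and return the most frequent
    ones, so that high-impact failures are investigated first."""
    counts = Counter(failed_orders)
    return [text for text, _ in counts.most_common(top_n)]

failed = [
    "i want to go to the airport from the train station",
    "play some music",
    "i want to go to the airport from the train station",
]
print(top_failed_instructions(failed, top_n=2))
```

Similarity-based grouping (the preset-threshold variant described above) would replace the exact-match `Counter` with a fuzzy-matching step before counting.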
S102: re-parse the voice instruction multiple times with the natural language understanding model to obtain multiple parsing results.
The failed voice instruction is input into the natural language understanding model multiple times, producing repeated parsing results for the same voice instruction.
S103: classify the cause of the parsing failure of the voice instruction according to the multiple parsing results.
The multiple parsing results may show complete success, partial success, or complete failure. The cause of the parsing failure of the voice instruction is classified according to these results.
In one embodiment, step S103 includes:
determining the cause of the parsing failure as a first-type fault when the multiple parsing results show a probabilistic failure.
When the repeated re-parses succeed completely or partially, the failure observed in step S101 was probabilistic. A probabilistic failure corresponds to a first-type fault, which may include instability of the natural language understanding model, such as jitter of the natural language understanding server (or application) or network delay.
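The probabilistic-failure check above can be sketched as follows. The stub parser, the function names, and the attempt count are assumptions; in practice `parse_fn` would call the natural language understanding service and return `None` on failure.

```python
def is_probabilistic_failure(instruction, parse_fn, attempts=5):
    """Re-parse a previously failed instruction several times.  If any
    attempt succeeds, the original failure was probabilistic and the
    cause is classified as a first-type fault (parsing instability)."""
    return any(parse_fn(instruction) is not None for _ in range(attempts))

# Stub parser that fails on odd-numbered calls, simulating server jitter.
state = {"calls": 0}
def flaky_parse(text):
    state["calls"] += 1
    return {"intent": "navigate"} if state["calls"] % 2 == 0 else None

print(is_probabilistic_failure("i want to go to the airport", flaky_parse))
```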
As shown in FIG. 2, in one embodiment, step S103 may further include:
S1031: acquire a synonym of the voice instruction when all of the multiple parsing attempts fail.
S1032: parse the synonym to obtain a parsing result of the synonym.
S1033: determine the cause of the parsing failure as a second-type fault when the synonym parses successfully.
When all of the multiple parsing attempts fail, the natural language understanding model cannot recognize the voice instruction at all, so the first-type fault can be excluded. In this case a synonym of the voice instruction is obtained. For example, for the unrecognized instruction "I want to go to the airport from the train station", a synonym may be "I want to go from the high-speed rail station to the airport".
The synonym is input into the natural language understanding model to obtain its parsing result, and it is judged whether parsing succeeded. If it did, the intent, word slots, and so on of the synonym are obtained: the model can recognize the synonym but not the voice instruction it corresponds to. In this case, the cause of the parsing failure is classified as a second-type fault, which may include deficiencies in the model's parsing capability, such as poor generalization or low parsing accuracy.
In one embodiment, if the parsing result obtained in step S1032 is also a failure, the cause of the parsing failure of the voice instruction is determined as a third-type fault.
A third-type fault may include an unsupported instruction. For example, if the synonym "I want to go from the high-speed rail station to the airport" also fails to parse, the voice map service does not support the instruction. This may mean that the navigation function of the voice map is not yet online, or that the voice map cannot search for high-speed rail stations, airports, and so on.
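Steps S1032 and S1033, together with the third-type fault above, reduce to one branch. This is a sketch under stated assumptions: `stub_parse` stands in for the natural language understanding service, and the fault labels are illustrative.

```python
def classify_by_synonym(synonym, parse_fn):
    """Classify a failure that survived the re-parse check: if the
    synonym parses, the model's parsing capability is at fault
    (second type); otherwise the instruction is unsupported (third type)."""
    if parse_fn(synonym) is not None:
        return "second-type fault"
    return "third-type fault"

# Stub: the service understands "high-speed rail station" but nothing else.
def stub_parse(text):
    return {"intent": "navigate"} if "high-speed rail station" in text else None

print(classify_by_synonym("i want to go from the high-speed rail station to the airport", stub_parse))
print(classify_by_synonym("beam me up", stub_parse))
```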
As shown in FIG. 3, in one embodiment, acquiring the synonym of the voice instruction in step S1031 may include:
S10311: recognize the voice instruction to obtain text information.
S10312: perform word segmentation on the text information to obtain multiple word segments.
S10313: convert each word segment into a synonym, obtaining a synonym for each segment.
S10314: combine the segment synonyms in the original order of the segments to obtain the synonym of the voice instruction.
For example, an acoustic model may be used to recognize the voice instruction and obtain text information. A Chinese word segmentation tool then segments the text; such tools include Jieba Chinese word segmentation and THULAC (THU Lexical Analyzer for Chinese). After segmentation, a natural language processing model performs synonym conversion on each word segment; such models include the Chinese synonym library Synonyms and word vector models such as Word2vec. The resulting segment synonyms are combined in the original segment order to obtain the synonym of the voice instruction.
For example, the text corresponding to the voice instruction may be "I want to go to the airport from the train station", and the word segments obtained by segmentation are: "I want", "from", "train station", "go", "airport". The synonyms obtained with a natural language processing model may be: "I want" -> "I want", "from" -> "from", "train station" -> "high-speed rail station", "go" -> "to", "airport" -> "airport". Combining these in segment order gives the synonym "I want to go from the high-speed rail station to the airport".
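The segment-convert-recombine procedure (S10312 to S10314) can be sketched with a toy synonym table. The table, the token list, and the space-joining are illustrative assumptions; a real pipeline would use a segmenter such as Jieba and a synonym model such as Synonyms or Word2vec, and Chinese text would be joined without spaces.

```python
def build_synonym(segments, synonym_table):
    """Replace each word segment with its synonym (falling back to the
    segment itself) and rejoin the segments in their original order."""
    return " ".join(synonym_table.get(seg, seg) for seg in segments)

segments = ["i want", "from", "train station", "go", "airport"]
table = {"train station": "high-speed rail station", "go": "to"}
print(build_synonym(segments, table))
```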
As shown in fig. 4, in one embodiment, the method further comprises:
s401: and inquiring field information adjacent to the voice instruction in the behavior log.
S402: and inputting the field information into a scene classification model to obtain a scene corresponding to the voice command.
The behavior log records the behaviors the user generates each time the target application is used, such as accessing, browsing, searching, and clicking. Whenever the user uses the target application, a behavior log associated with it is generated, and user behavior is recorded in the log as field information. In this embodiment, the target application may be a voice map.
After receiving the user's voice instruction, the target application generates corresponding first field information in the behavior log. The first field information may include the parsing result of the voice instruction, the user's identification information, and so on. The identification information may include the user's ID, the version of the voice map the user accessed, the user's location, and so on.
The behavior log is then processed: all field information in the target application's log is sorted by time, and the user's field information is filtered out according to the user's identification information. Second field information adjacent to the first field information is then selected from the user's field information. "Adjacent" may mean that the second field information occurs before or after the first field information with a time interval smaller than a preset threshold.
The second fields are cleaned; for example, missing or duplicate fields require cleaning. The cleaned second fields are then selected according to the user's identification information, for example selecting the second fields of users accessing the highest version of the voice map, or of users located in Beijing.
The cleaned and/or selected second fields are input into the scene classification model to obtain the scene corresponding to the voice instruction. Scenes may include navigation, public transportation, retrieval, and so on. The scene classification model may be trained in advance on fields from the behavior log.
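The adjacency query of step S401 can be sketched as follows. The record layout (`user`, `ts`, `action` keys) and the 30-second window are assumptions; real behavior-log fields would differ.

```python
def adjacent_fields(log, target_ts, user_id, window_s=30.0):
    """Return this user's log records whose timestamp lies within
    `window_s` seconds of the failed instruction's record (before or
    after it), sorted by time."""
    rows = sorted((r for r in log if r["user"] == user_id),
                  key=lambda r: r["ts"])
    return [r for r in rows
            if r["ts"] != target_ts and abs(r["ts"] - target_ts) <= window_s]

log = [
    {"user": "u1", "ts": 100.0, "action": "search"},
    {"user": "u1", "ts": 110.0, "action": "voice_instruction"},
    {"user": "u1", "ts": 125.0, "action": "click"},
    {"user": "u2", "ts": 111.0, "action": "browse"},
    {"user": "u1", "ts": 500.0, "action": "browse"},
]
print(adjacent_fields(log, target_ts=110.0, user_id="u1"))
```

The filtered records would then be cleaned and fed to the scene classification model, as described above.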
Once the scene corresponding to the voice instruction is known, the cause of the fault can be refined and the problem located more quickly. For example, if "I want to go to the airport from the train station" corresponds to a third-type fault in the navigation scene, it can quickly be determined that the navigation scene does not support instructions for going from the train station to the airport, which facilitates subsequent maintenance and upgrades.
As shown in FIG. 5, in one embodiment, the instruction classification method includes the steps of:
by using big data or Spark technology, a failed speech instruction set (FailedOrders) on a Natural Language understanding model (NLP) is obtained, then the number of the speech instructions is counted, priority ranking is performed according to the number of the speech instructions, and a Top instruction (such as Top1000) is extracted.
And acquiring a behavior log corresponding to the voice instruction set failed in analysis on a server of the voice map by using big data or Spark technology. The acquisition process comprises the standardized processing of fields (data) in the behavior log, and the processed fields are transmitted to the scene classification model. The fields transmitted to the scene classification model also include storage, cleaning, and sample selection. And inputting the selected samples into a scene classification model, and carrying out scene division on the voice command failed in analysis. The scene may include: navigation, public transportation, retrieval, etc.
Within each scene, the failed voice instructions then undergo a secondary classification by fault type. The secondary classification may include:
Input the failed voice instruction into the natural language understanding model multiple times and judge whether the failure is probabilistic. If it is, classify it as a first-type fault, which includes instability of the natural language understanding model, such as jitter or network delay of the server or application hosting it.
For failed instructions not belonging to the first-type fault, split the instruction. For example, the failed instruction "A to B" is split into "A", "to", and "B". A natural language processing model generates the synonyms A -> A1, to -> to, B -> B1, producing the synonym "A1 to B1", which is sent to the natural language understanding model. If the model can parse it, classify the fault as a second-type fault, which may include deficiencies in the model's parsing capability such as poor generalization or low parsing accuracy.
If the failed instruction belongs to neither the first-type nor the second-type fault, classify it as a third-type fault: an unsupported instruction.
Finally, generate an analysis report from the classification results. The report may include each failed voice instruction together with its scene type, fault type, and so on, and can be synchronized to the engineering team for targeted instruction optimization, improving the online parsing success rate for real voice users. The instruction classification flow may be executed periodically in a loop.
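The whole secondary-classification flow described above can be condensed into one function. This is a sketch under stated assumptions: `parse_fn` and `synonym_fn` stand in for the NLU service and the synonym generator, and the stubs below exist only to exercise the branches.

```python
def classify_failure(instruction, parse_fn, synonym_fn, attempts=5):
    """Three-way fault classification for a voice instruction that
    initially failed to parse."""
    # First-type fault: some re-parse attempts succeed (instability).
    if any(parse_fn(instruction) is not None for _ in range(attempts)):
        return "first-type"
    # Second-type fault: the synonym parses but the instruction does not.
    if parse_fn(synonym_fn(instruction)) is not None:
        return "second-type"
    # Third-type fault: the instruction is simply not supported.
    return "third-type"

def stub_parse(text):  # understands only the synonym wording
    return {"intent": "navigate"} if "high-speed rail" in text else None

def stub_synonym(text):
    return text.replace("train station", "high-speed rail station")

print(classify_failure("i want to go from the train station to the airport",
                       stub_parse, stub_synonym))
```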
FIG. 6 is a block diagram of an instruction classification apparatus according to an embodiment of the invention. The apparatus includes:
a failed-instruction acquisition module 601, configured to acquire a voice instruction that the natural language understanding model failed to parse;
a voice instruction parsing module 602, configured to re-parse the voice instruction multiple times with the natural language understanding model to obtain multiple parsing results;
and a classification module 603, configured to classify the cause of the parsing failure of the voice instruction according to the multiple parsing results.
As shown in FIG. 7, in one embodiment, the classification module 603 includes:
a first classification sub-module 6031, configured to determine the cause of the parsing failure as a first-type fault when the multiple parsing results show a probabilistic failure.
In one embodiment, the classification module 603 further includes:
a synonym acquisition sub-module 6032, configured to acquire a synonym of the voice instruction when all of the multiple parsing attempts fail.
The voice instruction parsing module 602 is further configured to parse the synonym to obtain a parsing result of the synonym.
A second classification sub-module 6033 is configured to determine the cause of the parsing failure as a second-type fault when the synonym parses successfully.
In one embodiment, the classification module 603 further includes:
a third classification sub-module 6034, configured to determine the cause of the parsing failure as a third-type fault when the synonym also fails to parse.
As shown in fig. 8, in one embodiment, synonym acquisition sub-module 6032 includes:
The voice instruction recognition unit 60321 is configured to recognize the voice instruction to obtain text information.
The word segmentation unit 60322 is configured to perform word segmentation on the text information to obtain multiple tokens.
The synonym obtaining execution unit 60323 is configured to perform synonym conversion on each of the multiple tokens to obtain synonyms corresponding to the multiple tokens.
The synonym combining unit 60324 is configured to combine the synonyms corresponding to the multiple tokens according to the order of the multiple tokens, so as to obtain the synonym of the voice instruction.
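The four units above form a simple pipeline: recognize the instruction as text, segment it into tokens, replace each token with a synonym, and rejoin the replacements in the original token order. A minimal sketch, assuming whitespace tokenization and a toy synonym table in place of the real word segmenter and thesaurus:

```python
# Toy synonym table; a real system would use a thesaurus or embedding lookup.
SYNONYMS = {"turn": "switch", "light": "lamp"}

def synonym_instruction(text):
    """Build a synonym variant of recognized instruction text."""
    tokens = text.split()                             # word segmentation (sketch)
    replaced = [SYNONYMS.get(t, t) for t in tokens]   # per-token synonym lookup
    return " ".join(replaced)                         # recombine in token order
```

Tokens without a table entry pass through unchanged, so the output stays aligned with the original token order. Chinese text would require a real segmenter (e.g. a tool such as jieba) rather than `str.split`.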
As shown in fig. 9, in one embodiment, the apparatus further comprises:
The query module 901 is configured to query the behavior log for field information adjacent to the voice instruction.
The scene determining module 902 is configured to input the field information into a scene classification model to obtain the scene corresponding to the voice instruction.
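How the query module 901 and the scene determining module 902 cooperate can be sketched as follows. The list-based log, the window size, and `scene_model` are all assumptions standing in for the real behavior-log storage and the trained scene classification model:

```python
def infer_scene(behavior_log, instruction, scene_model, window=1):
    """Predict the scene of an instruction from adjacent behavior-log fields."""
    idx = behavior_log.index(instruction)
    # Field information recorded just before and just after the instruction.
    adjacent = (behavior_log[max(0, idx - window):idx]
                + behavior_log[idx + 1:idx + 1 + window])
    return scene_model(adjacent)
```

Knowing the scene (e.g. navigation, media playback) gives context for why a particular instruction failed to parse, which complements the fault classification above.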
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 10 is a block diagram of an electronic device for the instruction classification method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 10, the electronic device includes: one or more processors 1001, a memory 1002, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information for a graphical user interface (GUI) on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used together with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing part of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 10, one processor 1001 is taken as an example.
The memory 1002 is a non-transitory computer readable storage medium provided herein. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method of instruction classification provided herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the method of instruction classification provided herein.
The memory 1002, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the method of instruction classification in the embodiments of the present application (e.g., the failed parsing voice instruction obtaining module 601, the voice instruction parsing module 602, and the classification module 603 shown in fig. 6). The processor 1001 executes various functional applications of the server and data processing, i.e., a method of implementing instruction classification in the above-described method embodiments, by executing non-transitory software programs, instructions, and modules stored in the memory 1002.
The memory 1002 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by use of the electronic device according to the method of instruction classification, and the like. Further, the memory 1002 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 1002 may optionally include memory located remotely from the processor 1001, which may be connected through a network to an electronic device that performs the method of instruction classification. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the method of instruction classification may further include: an input device 1003 and an output device 1004. The processor 1001, the memory 1002, the input device 1003, and the output device 1004 may be connected by a bus or other means, and the bus connection is exemplified in fig. 10.
The input device 1003 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device of the instruction classification method, and may be, for example, a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, or a joystick. The output device 1004 may include a display device, auxiliary lighting devices (e.g., LEDs), tactile feedback devices (e.g., vibrating motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, Integrated circuitry, Application Specific Integrated Circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (Cathode Ray Tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solutions of the embodiments of the present application, the cause of an instruction parsing failure is determined automatically, which saves labor and time compared with determining the failure cause manually.
It should be understood that the various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, which is not limited herein as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (14)

1. An instruction classification method, comprising:
acquiring a voice instruction that a natural language understanding model failed to analyze;
re-analyzing the voice instruction multiple times by using the natural language understanding model to obtain multiple analysis results, wherein the multiple analysis results comprise all successful analysis, partially successful analysis, or all failed analysis; and
classifying the cause of the analysis failure of the voice instruction according to the multiple analysis results.
2. The method according to claim 1, wherein the classifying the cause of the analysis failure of the voice instruction according to the multiple analysis results comprises:
determining the cause of the analysis failure of the voice instruction as an instability fault of the natural language understanding model when the multiple analysis results are all or partially successful.
3. The method according to claim 1 or 2, wherein the classifying the cause of the analysis failure of the voice instruction according to the multiple analysis results further comprises:
acquiring a synonym of the voice instruction when the multiple analysis results are all failures;
analyzing the synonym to obtain an analysis result of the synonym; and
determining the cause of the analysis failure of the voice instruction as an analysis capability fault of the natural language understanding model when the analysis result of the synonym is an analysis success.
4. The method according to claim 3, wherein the classifying the cause of the analysis failure of the voice instruction according to the multiple analysis results further comprises:
determining the cause of the analysis failure of the voice instruction as an unsupported-instruction fault when the analysis result of the synonym is an analysis failure.
5. The method of claim 3, wherein obtaining the synonym for the voice instruction comprises:
recognizing the voice instruction to obtain text information;
performing word segmentation on the text information to obtain multiple tokens;
performing synonym conversion on each of the multiple tokens to obtain synonyms corresponding to the multiple tokens; and
combining the synonyms corresponding to the multiple tokens according to the order of the multiple tokens to obtain the synonym of the voice instruction.
6. The method of claim 1, further comprising:
querying, in a behavior log, second field information adjacent to first field information corresponding to the voice instruction, wherein the field information is used for recording user behavior;
and inputting the second field information into a scene classification model to obtain a scene corresponding to the voice instruction.
7. An instruction sorting apparatus, comprising:
the analysis-failure voice instruction acquisition module is used for acquiring a voice instruction that the natural language understanding model failed to analyze;
the voice instruction analysis module is used for re-analyzing the voice instruction multiple times by using the natural language understanding model to obtain multiple analysis results, wherein the multiple analysis results comprise all successful analysis, partially successful analysis, or all failed analysis;
and the classification module is used for classifying the analysis failure reasons of the voice instruction according to the multiple analysis results.
8. The apparatus of claim 7, wherein the classification module comprises:
the first classification submodule is used for determining the cause of the analysis failure of the voice instruction as an instability fault of the natural language understanding model when the multiple analysis results are all or partially successful.
9. The apparatus of claim 7 or 8, wherein the classification module further comprises:
the synonym obtaining sub-module is used for obtaining a synonym of the voice instruction when the multiple analysis results are all failures;
the voice instruction analysis module is further used for analyzing the synonym to obtain an analysis result of the synonym; and
the second classification submodule is used for determining the cause of the analysis failure of the voice instruction as an analysis capability fault of the natural language understanding model when the analysis result of the synonym is an analysis success.
10. The apparatus of claim 9, wherein the classification module further comprises:
the third classification submodule is used for determining the cause of the analysis failure of the voice instruction as an unsupported-instruction fault when the analysis result of the synonym is an analysis failure.
11. The apparatus of claim 10, wherein the synonym acquisition sub-module comprises:
the voice instruction identification unit is used for identifying the voice instruction to obtain text information;
the word segmentation unit is used for performing word segmentation on the text information to obtain multiple tokens;
the synonym acquisition execution unit is used for performing synonym conversion on each of the multiple tokens to obtain synonyms corresponding to the multiple tokens; and
the synonym combination unit is used for combining the synonyms corresponding to the multiple tokens according to the order of the multiple tokens to obtain the synonym of the voice instruction.
12. The apparatus of claim 7, further comprising:
the query module is used for querying, in a behavior log, second field information adjacent to first field information corresponding to the voice instruction, wherein the field information is used for recording user behavior;
and the scene determining module is used for inputting the second field information into a scene classification model to obtain a scene corresponding to the voice instruction.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-6.
CN201910773117.6A 2019-08-21 2019-08-21 Instruction classification method and device, electronic equipment and computer-readable storage medium Active CN110473530B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910773117.6A CN110473530B (en) 2019-08-21 2019-08-21 Instruction classification method and device, electronic equipment and computer-readable storage medium


Publications (2)

Publication Number Publication Date
CN110473530A CN110473530A (en) 2019-11-19
CN110473530B true CN110473530B (en) 2021-12-07

Family

ID=68513292

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910773117.6A Active CN110473530B (en) 2019-08-21 2019-08-21 Instruction classification method and device, electronic equipment and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN110473530B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110968895B (en) * 2019-11-29 2022-04-05 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and storage medium
CN111767691A (en) * 2020-06-30 2020-10-13 北京百度网讯科技有限公司 Calculation method, device, equipment and storage medium
CN112017663A (en) * 2020-08-14 2020-12-01 博泰车联网(南京)有限公司 Voice generalization method and device and computer storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1491412A (en) * 2001-02-13 2004-04-21 汤姆森许可贸易公司 Method, module, device and server for voice recognition
CN1692406A (en) * 2003-02-03 2005-11-02 三菱电机株式会社 Vehicle mounted controller
US8976941B2 (en) * 2006-10-31 2015-03-10 Samsung Electronics Co., Ltd. Apparatus and method for reporting speech recognition failures
CN107665710A (en) * 2016-07-27 2018-02-06 上海博泰悦臻网络技术服务有限公司 Mobile terminal sound data processing method and device
CN109817207A (en) * 2018-12-20 2019-05-28 珠海格力电器股份有限公司 A kind of sound control method, device, storage medium and air-conditioning
CN109918673A (en) * 2019-03-14 2019-06-21 湖北亿咖通科技有限公司 Semantic referee method, device, electronic equipment and computer readable storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6791825B2 (en) * 2017-09-26 2020-11-25 株式会社日立製作所 Information processing device, dialogue processing method and dialogue system

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1491412A (en) * 2001-02-13 2004-04-21 汤姆森许可贸易公司 Method, module, device and server for voice recognition
CN1692406A (en) * 2003-02-03 2005-11-02 三菱电机株式会社 Vehicle mounted controller
US8976941B2 (en) * 2006-10-31 2015-03-10 Samsung Electronics Co., Ltd. Apparatus and method for reporting speech recognition failures
US9530401B2 (en) * 2006-10-31 2016-12-27 Samsung Electronics Co., Ltd Apparatus and method for reporting speech recognition failures
CN107665710A (en) * 2016-07-27 2018-02-06 上海博泰悦臻网络技术服务有限公司 Mobile terminal sound data processing method and device
CN109817207A (en) * 2018-12-20 2019-05-28 珠海格力电器股份有限公司 A kind of sound control method, device, storage medium and air-conditioning
CN109918673A (en) * 2019-03-14 2019-06-21 湖北亿咖通科技有限公司 Semantic referee method, device, electronic equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN110473530A (en) 2019-11-19

Similar Documents

Publication Publication Date Title
US20210224568A1 (en) Method and apparatus for recognizing text
CN110473530B (en) Instruction classification method and device, electronic equipment and computer-readable storage medium
US20210248060A1 (en) Method and apparatus for testing map service
WO2020108063A1 (en) Feature word determining method, apparatus, and server
CN111125435B (en) Video tag determination method and device and computer equipment
CN112560912A (en) Method and device for training classification model, electronic equipment and storage medium
EP3832488A2 (en) Method and apparatus for generating event theme, device and storage medium
JP2021131528A (en) User intention recognition method, device, electronic apparatus, computer readable storage media and computer program
KR20210132578A (en) Method, apparatus, device and storage medium for constructing knowledge graph
US20220027575A1 (en) Method of predicting emotional style of dialogue, electronic device, and storage medium
EP4096226A1 (en) Fault detection method and apparatus for live broadcast service, electronic device, and readable storage medium
CN113553414A (en) Intelligent dialogue method and device, electronic equipment and storage medium
CN110990057A (en) Extraction method, device, equipment and medium of small program sub-chain information
CN111858905A (en) Model training method, information identification method, device, electronic equipment and storage medium
US20220318681A1 (en) System and method for scalable, interactive, collaborative topic identification and tracking
CN111177462B (en) Video distribution timeliness determination method and device
CN112380131A (en) Module testing method and device and electronic equipment
CN111899731A (en) Method, device and equipment for testing stability of voice function and computer storage medium
US20210216710A1 (en) Method and apparatus for performing word segmentation on text, device, and medium
CN111984876A (en) Interest point processing method, device, equipment and computer readable storage medium
CN111310481A (en) Speech translation method, device, computer equipment and storage medium
CN116303951A (en) Dialogue processing method, device, electronic equipment and storage medium
CN114860872A (en) Data processing method, device, equipment and storage medium
CN115759100A (en) Data processing method, device, equipment and medium
CN115292506A (en) Knowledge graph ontology construction method and device applied to office field

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant