CN111144128A

CN111144128A - Semantic parsing method and device

Info

Publication number: CN111144128A
Application number: CN201911361568.5A
Authority: CN
Inventors: 黄炼楷; 徐威; 林英展; 黄世维
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2019-12-26
Filing date: 2019-12-26
Publication date: 2020-05-12
Anticipated expiration: 2039-12-26
Also published as: CN111144128B

Abstract

The embodiment of the disclosure discloses a semantic parsing method and a semantic parsing device. One embodiment of the method comprises: acquiring an offline vertical class set; detecting a current network state in response to receiving a sentence to be recognized; for each vertical class in the vertical class set, performing off-line semantic analysis in a heuristic derivation mode and off-line semantic analysis in a sample generalization model mode according to the vertical class to obtain a first analysis result and a second analysis result of the vertical class respectively, if the first analysis result meets a preset condition, taking the first analysis result as an off-line candidate analysis result of the vertical class, and otherwise, performing off-line fusion on the first analysis result and the second analysis result to generate an off-line candidate analysis result of the vertical class; and if the current network state is disconnected or the network is unstable, selecting a final analysis result from the offline candidate analysis results of different vertical classes according to the priority of the vertical class and/or the historical conversation information. The embodiment improves the accuracy and flexibility of semantic parsing.

Description

Semantic parsing method and device

Technical Field

The embodiment of the disclosure relates to the technical field of computers, in particular to a semantic parsing method and a semantic parsing device.

Background

With the rapid development and popularization of AIoT (AI + Internet of Things), more and more intelligent devices, such as intelligent vehicle-mounted devices, intelligent wearable devices and intelligent speakers, have voice interaction capability. The interaction capability of an intelligent device depends on intelligent services in the cloud, which include voice wake-up, speech recognition, semantic parsing, speech synthesis and the like. Voice wake-up, speech recognition and speech synthesis all have mature offline technical solutions; only semantic parsing lacks a mature and feasible offline method and has to rely entirely on the cloud service. In an intelligent vehicle-mounted scenario, a car may enter a tunnel or a remote area. When the network is unstable or disconnected, online semantic parsing cannot serve normally, so the voice interaction capability of the intelligent device is missing an important link: the device cannot understand what the user says and cannot reply correctly, and the whole service fails.

In the prior art, only the online semantic analysis service is used when the network connection is good, and when the network is unstable or disconnected, only a small portion of key requirements are met by simple and inflexible methods such as regular expressions or text matching. The prior art therefore has the following disadvantages:

1. Inflexibility and poor generalization. A large number of rules and corpora must be configured manually, and the diversity of conversations cannot be handled;

2. Poor maintainability. As the number of rules grows, the maintenance cost keeps rising and the rule base cannot be managed effectively;

3. Poor performance. As the number of rules grows, the efficiency of regular-expression and text matching drops markedly, which puts performance pressure on the intelligent device;

4. The online analysis service cannot be reused. The offline analysis capability is implemented differently from the online analysis service, so it cannot reuse the models and data of the online analysis service. The offline analysis capability has to be developed again from scratch, and its effect differs greatly from that of the online analysis service;

5. Poor extensibility and high migration cost. There is no mature technical framework and each implementation is ad hoc, so it cannot be quickly extended to a new domain vertical class, and the offline analysis capability cannot be migrated to other intelligent platforms.

Disclosure of Invention

The embodiment of the disclosure provides a semantic parsing method and a semantic parsing device.

In a first aspect, an embodiment of the present disclosure provides a semantic parsing method, including: acquiring an offline vertical class set; detecting a current network state in response to receiving a sentence to be recognized; for each vertical class in the vertical class set, performing off-line semantic analysis in a heuristic derivation mode and off-line semantic analysis in a sample generalization model mode according to the vertical class to obtain a first analysis result and a second analysis result of the vertical class respectively, if the first analysis result meets a preset condition, taking the first analysis result as an off-line candidate analysis result of the vertical class, and otherwise, performing off-line fusion on the first analysis result and the second analysis result to generate an off-line candidate analysis result of the vertical class; and if the current network state is disconnected or the network is unstable, selecting a final analysis result from the offline candidate analysis results of different vertical classes according to the priority of the vertical class and/or the historical conversation information.

In some embodiments, the analysis result includes an intention and a word slot set; and performing offline fusion on the first analysis result and the second analysis result to generate the offline candidate analysis result of the vertical class includes: reversely acquiring, according to the intention in the second analysis result obtained in the sample generalization model mode, a second word slot set that needs to be identified; and fusing the first word slot set of the first analysis result obtained in the heuristic derivation mode with the second word slot set, and finally deriving a complete intention and word slot analysis result as the offline candidate analysis result of the vertical class.
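The offline fusion described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `ParseResult` type, the `INTENT_SLOTS` schema, the `threshold` value and the rule that a heuristic slot value wins over a model slot value are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ParseResult:
    intent: Optional[str]
    slots: Dict[str, str] = field(default_factory=dict)
    score: float = 0.0


# Hypothetical schema: the word slots that each intention needs.
INTENT_SLOTS: Dict[str, List[str]] = {
    "listen_song": ["singer", "song"],
    "navigate": ["destination"],
}


def offline_fuse(first: ParseResult, second: ParseResult,
                 threshold: float = 0.9) -> ParseResult:
    """Fuse the heuristic result (first) with the model result (second)."""
    # Assumed form of the preset condition: the heuristic result has an
    # intention and a score at or above the threshold, so it is used as-is.
    if first.intent is not None and first.score >= threshold:
        return first
    if second.intent is None:
        return first
    # Reversely look up the slots to identify from the model's intention.
    required = INTENT_SLOTS.get(second.intent, [])
    merged: Dict[str, str] = {}
    for name in required:
        if name in first.slots:        # assumption: heuristic value preferred
            merged[name] = first.slots[name]
        elif name in second.slots:
            merged[name] = second.slots[name]
    return ParseResult(second.intent, merged, second.score)


first = ParseResult(None, {"singer": "Zhou Jilun"}, 0.3)
second = ParseResult("listen_song", {"song": "some song", "mood": "x"}, 0.7)
fused = offline_fuse(first, second)
print(fused.intent, fused.slots)
```

Slots that the looked-up intention does not need (here the hypothetical `mood` slot) are dropped, which is one way to read "reversely acquiring" the slot set from the intention.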

In some embodiments, the method further comprises: if the network is detected to be normal, sending an online semantic analysis request including a statement to be identified to a semantic understanding server while performing offline semantic analysis; for each offline vertical class in the offline vertical class set, if the offline candidate analysis result of the vertical class is a first analysis result, directly adopting the offline candidate analysis result of the vertical class as a fusion result of the vertical class, otherwise, fusing the offline candidate analysis result of the vertical class and the received online candidate analysis result to obtain a fusion result of the vertical class; and selecting a final analysis result from the fusion results of different vertical classes according to the priority of the vertical classes and/or the historical conversation information.

In some embodiments, the method further comprises: if online candidate analysis results that do not belong to any offline vertical class are received, selecting the final analysis result from the fusion results of the different vertical classes and those online candidate analysis results according to the priorities of the vertical classes and/or the historical conversation information.

In some embodiments, the analysis result includes an intention and a word slot set; and fusing the offline candidate analysis result of the vertical class with the received online candidate analysis result to obtain the fusion result of the vertical class includes: when the intention in the offline candidate analysis result of the vertical class is inconsistent with the intention in the online candidate analysis result, directly adopting the online candidate analysis result as the fusion result of the vertical class; otherwise, taking the word slot set in the offline candidate analysis result as a reference, checking and merging the word slot results in the online candidate analysis result, and taking the processed online candidate analysis result as the fusion result of the vertical class.
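As a rough sketch of the online/offline fusion rule for a single vertical class, the following Python function may help. The dictionary layout (`intent`, `slots`) and the `offline_is_first` flag are hypothetical. The text does not specify what "checking and merging" does on a conflict, so this sketch assumes online slot values are kept and slots missing online are filled from the offline result.

```python
from typing import Any, Dict


def online_fuse(offline: Dict[str, Any], online: Dict[str, Any],
                offline_is_first: bool) -> Dict[str, Any]:
    """Fuse one vertical class's offline and online candidate results."""
    # If the offline candidate is the heuristic (first) result, use it directly.
    if offline_is_first:
        return offline
    # Inconsistent intentions: adopt the online candidate as it is.
    if offline["intent"] != online["intent"]:
        return online
    # Same intention: use the offline slots as a reference and fill in
    # slots that the online result is missing (assumed merge policy).
    slots = dict(online["slots"])
    for name, value in offline["slots"].items():
        slots.setdefault(name, value)
    return {"intent": online["intent"], "slots": slots}


offline = {"intent": "listen_song", "slots": {"singer": "Zhou Jilun"}}
online = {"intent": "listen_song", "slots": {"song": "some song"}}
print(online_fuse(offline, online, offline_is_first=False))
```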

In some embodiments, performing offline semantic analysis in a heuristic derivation manner and offline semantic analysis in a sample generalization model manner on the sentence to be recognized according to the vertical category to obtain a first analysis result and a second analysis result of the vertical category, respectively, includes: performing word segmentation, part of speech tagging, named entity recognition and the like on a sentence to be recognized; performing syntactic structure analysis and disambiguation on the sentence to be recognized; carrying out named entity recognition of vertical specialization on the sentence to be recognized; according to word segmentation results, part-of-speech information and entity information, carrying out off-line semantic analysis in a heuristic derivation mode based on the constructed template, word slots and key word dictionary of the vertical class to obtain an intention and word slot set of the vertical class as a first analysis result of the vertical class; and according to word segmentation results, part-of-speech information and entity information, executing a sequence tagging task to identify a word slot set based on the trained sample generalization model of the vertical class, and executing a classification task to identify an intention as a second analysis result of the vertical class.

In some embodiments, the method further comprises: decrypting the encrypted keyword dictionary and loading it directly into memory.

In some embodiments, sending an online semantic resolution request including a sentence to be recognized to a semantic understanding server includes: and sending an online semantic analysis request comprising an offline analysis capability identifier and a statement to be recognized to a semantic understanding server, so that the semantic understanding server discards the online semantic analysis request under the condition of high load pressure.
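The load-shedding idea above can be sketched as follows. The request field names (`query`, `offline_capable`), the normalized load measure and the `0.8` threshold are assumptions for illustration only; the text does not define a request format or a load metric.

```python
from typing import Any, Dict


def build_request(sentence: str, has_offline_capability: bool) -> Dict[str, Any]:
    """Client side: attach the offline analysis capability identifier."""
    return {"query": sentence, "offline_capable": has_offline_capability}


def should_discard(request: Dict[str, Any], current_load: float,
                   high_load_threshold: float = 0.8) -> bool:
    """Server side: under high load, drop requests the client can parse offline."""
    return bool(request.get("offline_capable")) and current_load >= high_load_threshold


request = build_request("home-returning route", has_offline_capability=True)
print(should_discard(request, current_load=0.95))
```

A client without the identifier is never shed in this sketch, since it has no offline result to fall back on.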

In a second aspect, an embodiment of the present disclosure provides a semantic parsing apparatus, including: an obtaining unit configured to obtain an offline vertical class set; a detection unit configured to detect a current network state in response to receiving a sentence to be recognized; the offline analysis unit is configured to perform offline semantic analysis in a heuristic derivation mode and offline semantic analysis in a sample generalization model mode on the statements to be recognized according to the vertical classes for each vertical class in the vertical class set to respectively obtain a first analysis result and a second analysis result of the vertical class, if the first analysis result meets a preset condition, the first analysis result is used as an offline candidate analysis result of the vertical class, and otherwise, the first analysis result and the second analysis result are subjected to offline fusion to generate the offline candidate analysis result of the vertical class; and the selecting unit is configured to select a final analysis result from the offline candidate analysis results of different vertical classes according to the priorities of the vertical classes and/or the historical conversation information if the current network state is disconnected or unstable.

In some embodiments, the analysis result includes an intention and a word slot set; and the offline parsing unit is further configured to: reversely acquire, according to the intention in the second analysis result obtained in the sample generalization model mode, a second word slot set that needs to be identified; and fuse the first word slot set of the first analysis result obtained in the heuristic derivation mode with the second word slot set, and finally derive a complete intention and word slot analysis result as the offline candidate analysis result of the vertical class.

In some embodiments, the apparatus further comprises an online parsing unit configured to: if the network is detected to be normal, sending an online semantic analysis request including a statement to be identified to a semantic understanding server while performing offline semantic analysis; for each offline vertical class in the offline vertical class set, if the offline candidate analysis result of the vertical class is a first analysis result, directly adopting the offline candidate analysis result of the vertical class as a fusion result of the vertical class, otherwise, fusing the offline candidate analysis result of the vertical class and the received online candidate analysis result to obtain a fusion result of the vertical class; and selecting a final analysis result from the fusion results of different vertical classes according to the priority of the vertical classes and/or the historical conversation information.

In some embodiments, the online parsing unit is further configured to: if online candidate analysis results that do not belong to any offline vertical class are received, select the final analysis result from the fusion results of the different vertical classes and those online candidate analysis results according to the priorities of the vertical classes and/or the historical conversation information.

In some embodiments, the analysis result includes an intention and a word slot set; and the online parsing unit is further configured to: when the intention in the offline candidate analysis result of the vertical class is inconsistent with the intention in the online candidate analysis result, directly adopt the online candidate analysis result as the fusion result of the vertical class; otherwise, taking the word slot set in the offline candidate analysis result as a reference, check and merge the word slot results in the online candidate analysis result, and take the processed online candidate analysis result as the fusion result of the vertical class.

In some embodiments, the offline parsing unit is further configured to: performing word segmentation, part of speech tagging, named entity recognition and the like on a sentence to be recognized; performing syntactic structure analysis and disambiguation on the sentence to be recognized; carrying out named entity recognition of vertical specialization on the sentence to be recognized; according to word segmentation results, part-of-speech information and entity information, carrying out off-line semantic analysis in a heuristic derivation mode based on the constructed template, word slots and key word dictionary of the vertical class to obtain an intention and word slot set of the vertical class as a first analysis result of the vertical class; and according to word segmentation results, part-of-speech information and entity information, executing a sequence tagging task to identify a word slot set based on the trained sample generalization model of the vertical class, and executing a classification task to identify an intention as a second analysis result of the vertical class.

In some embodiments, the apparatus further comprises a decryption unit configured to: decrypt the encrypted keyword dictionary and load it directly into memory.

In some embodiments, the online parsing unit is further configured to: and sending an online semantic analysis request comprising an offline analysis capability identifier and a statement to be recognized to a semantic understanding server, so that the semantic understanding server discards the online semantic analysis request under the condition of high load pressure.

In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; a storage device having one or more programs stored thereon which, when executed by one or more processors, cause the one or more processors to implement a method as in any one of the first aspects.

In a fourth aspect, embodiments of the disclosure provide a computer readable medium having a computer program stored thereon, wherein the program when executed by a processor implements a method as in any one of the first aspect.

The semantic parsing method and the semantic parsing device provided by the embodiment of the disclosure have the following advantages:

1. the generalization performance is strong. Based on a mature and efficient unordered template engine as heuristic analysis capability and a semantic understanding model with strong sample generalization capability as high-order analysis capability, different dialogue expression modes can be well dealt with;

2. convenient maintenance and high development efficiency. The offline analysis capability directly reuses the model and dictionary of the online analysis service, so the offline analysis capability is optimized as the online analysis service is developed and optimized, without additional development work;

3. the frame expansibility is strong. The offline analysis framework supports a plurality of field vertical classes, can schedule and sort each vertical class, has low cost for adding/deleting one vertical class, and does not influence other vertical classes.

Drawings

Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:

FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;

FIG. 2 is a flow diagram for one embodiment of a semantic parsing method according to the present disclosure;

FIG. 3 is a flow diagram of yet another embodiment of a semantic parsing method according to the present disclosure;

FIGS. 4a, 4b are schematic diagrams of an application scenario of a semantic parsing method according to the present disclosure;

FIG. 5 is a schematic diagram of a semantic parsing apparatus according to one embodiment of the present disclosure;

FIG. 6 is a schematic block diagram of a computer system suitable for use with an electronic device implementing embodiments of the present disclosure.

Detailed Description

The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.

It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the semantic parsing method or semantic parsing apparatus of the present disclosure may be applied.

As shown in fig. 1, the system architecture 100 may include a microphone 101, a controller 102, a speech recognition server 103, and a semantic understanding server 104. The network serves as a medium for providing a communication link between the controller 102, the speech recognition server 103, and the semantic understanding server 104. The network may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.

A user can input voice to the controller 102 using the microphone 101. The controller 102 then interacts with the speech recognition server 103 and the semantic understanding server 104 over the network to receive or send messages. The microphone 101 may be a voice input device mounted on a mobile device such as an unmanned vehicle, or may be built into a device such as a mobile phone or a computer. The controller 102 may be a vehicle-mounted device, or may be built into a device such as a mobile phone or a computer. The controller 102 has the function of receiving and transmitting information.

The speech recognition server 103 is used for receiving the voice sent by the controller 102 and converting the vocabulary content of the voice into computer-readable input, such as keys, binary codes or character sequences. Speech recognition differs from speaker recognition and speaker verification, which attempt to recognize or verify the speaker who uttered the speech rather than the vocabulary content contained in it. A speech recognition system is installed on the speech recognition server 103. Speech recognition systems generally have two stages, training and decoding. Training means training an acoustic model on a large amount of labeled speech data. Decoding means recognizing speech data outside the training set as text through the acoustic model and a language model; the quality of the trained acoustic model directly affects the recognition accuracy.

The semantic understanding server 104 is used for receiving the text result sent by the controller 102 and performing semantic analysis on it. Semantic analysis refers to learning and understanding the semantic content represented by a text using various methods, and any understanding of language can be classified as semantic analysis. A text is usually composed of words, sentences and paragraphs, and semantic analysis can be further divided into vocabulary-level, sentence-level and chapter-level semantic analysis according to the language unit being understood. Generally speaking, vocabulary-level semantic analysis focuses on how to obtain or distinguish the semantics of words, sentence-level semantic analysis attempts to analyze the semantics expressed by an entire sentence, and chapter-level semantic analysis aims at studying the inherent structure of natural language text and understanding the semantic relationships between text elements (which may be clauses or paragraphs). In short, the goal of semantic analysis is to achieve automatic semantic analysis at each language unit (including vocabulary, sentences, chapters, etc.) by establishing effective models and systems, thereby understanding the true semantics expressed by the whole text.

The controller 102 may also perform speech recognition locally and offline. The sentence to be recognized obtained by offline speech recognition may be subjected to offline semantic parsing locally, or may be sent to the semantic understanding server 104 for semantic parsing. Finally, the offline analysis result and the online analysis result are fused to obtain a final analysis result.

For developers, the offline and online semantic analysis technologies are both built from the same dialogue templates and samples, and the approach is mainly divided into two parts: a heuristic derivation technique based on unordered templates and a generalization model based on samples. The heuristic derivation technique based on unordered templates generates derivation strategies from labeled templates and dictionaries and places low demands on computing power and runtime memory, so in this respect the offline analysis technology can achieve the same effect as the online one. The sample-based generalization model is a deep model trained on labeled samples; it places higher demands on computing power and runtime memory and may even require a graphics processing unit (GPU). The online analysis technology can run such complex computation on a cloud server, whereas the offline analysis technology can only compute on the intelligent device (such as a mobile phone), where the available computing power and runtime memory are very limited. The deep model therefore has to be pruned and compressed, at some cost to the model effect.

The speech recognition server 103 and the semantic understanding server 104 may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., multiple pieces of software or software modules used to provide distributed services), or as a single piece of software or software module. This is not specifically limited herein.

It should be noted that the semantic parsing method provided by the embodiment of the disclosure is generally executed by the controller 102, and accordingly, the semantic parsing apparatus is generally disposed in the controller 102.

It should be understood that the numbers of microphones, controllers, speech recognition servers and semantic understanding servers in fig. 1 are merely illustrative. There may be any number of microphones, controllers, speech recognition servers and semantic understanding servers, as required by the implementation.

With continued reference to FIG. 2, a flow 200 of one embodiment of a semantic parsing method according to the present disclosure is shown. The semantic parsing method comprises the following steps:

Step 201, acquiring an offline vertical class set.

In this embodiment, an execution subject (e.g., the controller shown in fig. 1) of the semantic parsing method may select a set of verticals to be called, such as a navigation vertical and a music vertical, according to the design and configuration of the overall requirement, where the navigation vertical has both an online parsing service and an offline parsing capability, and the music vertical has only the offline parsing capability.

Step 202, detecting a current network state in response to receiving a sentence to be recognized.

In this embodiment, the controller may receive a sentence to be recognized that has been converted from speech, or may receive a sentence to be recognized sent by a third-party server. The controller then detects the current network state to determine whether it can connect to the semantic understanding server in the cloud. In case of a network outage, only the offline semantic parsing capability is invoked.
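One possible way to classify the network state is sketched below. The text does not say how the state is detected, so everything here is an assumption: the `probe` helper (a TCP connection attempt with a short timeout), the three-attempt policy, and the rule that partial success counts as "unstable".

```python
import socket
from typing import Callable


def probe(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to the server opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, DNS failures and refused connections
        return False


def network_state(probe_fn: Callable[[], bool], attempts: int = 3) -> str:
    """Classify the network as 'normal', 'unstable' or 'disconnected'."""
    successes = sum(1 for _ in range(attempts) if probe_fn())
    if successes == attempts:
        return "normal"
    if successes == 0:
        return "disconnected"
    return "unstable"


def choose_mode(state: str) -> str:
    """Offline parsing always runs; the online request is added only when normal."""
    return "offline_and_online" if state == "normal" else "offline_only"


# Usage with a stand-in probe, so the example runs without network access.
state = network_state(lambda: False)
print(state, choose_mode(state))
```

In a real deployment `probe_fn` might be something like `lambda: probe(server_host, 443)`, where `server_host` is a placeholder for the address of the semantic understanding server.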

Step 203, for each vertical class in the vertical class set, performing offline semantic analysis in a heuristic derivation mode and offline semantic analysis in a sample generalization model mode according to the vertical class to obtain a first analysis result and a second analysis result of the vertical class, if the first analysis result meets a predetermined condition, taking the first analysis result as an offline candidate analysis result of the vertical class, otherwise, performing offline fusion on the first analysis result and the second analysis result to generate an offline candidate analysis result of the vertical class.

In this embodiment, when the network is suddenly disconnected or unstable, the online semantic analysis service cannot return a result normally, so the result of the offline semantic analysis capability is adopted directly. As shown in fig. 4b, a specific vertical class comprises, from bottom to top, a basic service layer, a semantic understanding layer and a fusion derivation layer. The specific process is as follows:

1. Basic service layer: includes, but is not limited to, general lexical analysis, general component analysis and vertical-class-specific named entity recognition. General lexical analysis performs word segmentation, part-of-speech tagging, named entity recognition and the like on the dialog text. General component analysis mainly performs syntactic structure analysis and disambiguation on the dialog text. Vertical-class-specific named entity recognition is a specialized named entity recognition model built from the word slots and dictionaries constructed by the developer. It supplements general lexical analysis so that proper nouns specific to the vertical class can be recognized. For example, the 'band' word slot of the music vertical class can recognize specific words such as 'panther band', 'Xinle band', 'May-day' and 'compass', whereas for general lexical analysis 'compass' is only a 'tool' word slot.

2. Semantic understanding layer: according to the word segmentation result, part-of-speech information, entity information and the like provided by the basic service layer, the semantic understanding layer derives the semantic result, namely the intention and the corresponding word slot set. The semantic understanding layer comprises three parts: a template-based heuristic derivation technique, a sample-based semantic generalization model, and fusion derivation:

1) The template-based heuristic derivation technique generates a series of derivation strategies from the templates, word slots and keyword dictionaries constructed by the developer. For example, the template 'I want to listen to a song by [singer (word slot)]' can match dialogue requests such as 'I want to listen to a song by Zhou Jilun' and 'I want to listen to a song by a country honor', yielding the intention 'listen to song' and the 'singer' word slot. This technique is fast and controllable for developers, but its generalization capability is limited: when the user instead says 'I want to listen to a piece of music by Zhou Jilun', the template cannot recall the same intention, and the developer can only build richer templates;

2) The sample-based semantic generalization model aims to solve the generalization problem of the heuristic derivation technique. By using a large number of labeled samples (for example, 'I want to listen to a piece of music by Zhou Jilun' is labeled with the intention 'listen to song', and 'Zhou Jilun' is labeled as the 'singer' word slot), the generalization model learns the dialogue patterns of such requests, so that a new expression such as 'I want to listen to a song of Zhou Jilun' can be recalled successfully. The sample-based generalization model abstracts word slot recognition as a sequence labeling task and intention recognition as a classification task, and is mainly implemented with deep learning (recurrent neural networks, convolutional neural networks), multi-task learning and the like.

3) Fusion derivation: because a deep model requires a large number of labeled samples, the semantic generalization model may not reach its best effect when manual labeling is expensive and samples are insufficient, so it needs to be combined with the heuristic derivation technique to improve the overall effect. Generally, the classification task is more accurate than the sequence labeling task, so the main strategy of fusion derivation is to reversely obtain the word slot set to be identified from the intention predicted by the generalization model, fuse the word slot set matched by the heuristic derivation technique with the word slot set identified by the generalization model, and finally derive a complete intention and word slot analysis result. If the first analysis result meets the predetermined condition, it is used directly without fusion. The predetermined condition may be that the score (probability) of the semantic analysis result is higher than a predetermined value, that is, the first analysis result is likely to satisfy the user.
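The template-based heuristic derivation in item 1) can be illustrated with a deliberately simplified Python sketch. It is a stand-in, not the template engine of the disclosure: the `TEMPLATES` layout, the `SLOT_DICTS` keyword dictionary and the plain substring matching are assumptions, and keyword fragments are checked without regard to their order as a loose approximation of template matching.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical keyword dictionary for the "singer" word slot.
SLOT_DICTS: Dict[str, Set[str]] = {"singer": {"Zhou Jilun"}}

# Hypothetical templates: (intention, required keyword fragments, required slots).
TEMPLATES: List[Tuple[str, List[str], List[str]]] = [
    ("listen_song", ["i want to listen to", "song"], ["singer"]),
]


def match_slots(sentence: str) -> Dict[str, str]:
    """Fill word slots by looking up dictionary entries in the sentence."""
    lowered = sentence.lower()
    found: Dict[str, str] = {}
    for slot, words in SLOT_DICTS.items():
        for word in words:
            if word.lower() in lowered:
                found[slot] = word
                break
    return found


def heuristic_parse(sentence: str) -> Tuple[Optional[str], Dict[str, str]]:
    """Return (intention, slots); the intention is None if no template matches."""
    lowered = sentence.lower()
    slots = match_slots(sentence)
    for intent, keywords, required in TEMPLATES:
        if all(k in lowered for k in keywords) and all(s in slots for s in required):
            return intent, {s: slots[s] for s in required}
    return None, slots


print(heuristic_parse("I want to listen to a song by Zhou Jilun"))
print(heuristic_parse("I want to listen to a piece of music by Zhou Jilun"))
```

The second call shows the limited generalization mentioned in item 1): no template matches the rephrased request, so no intention is returned, but the 'singer' slot is still recognized and remains available for fusion derivation.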

Because the offline semantic parsing technology runs on the intelligent device, where computing power and memory are relatively limited, the basic service layer and the sample-based semantic generalization model need to be pruned and compressed, at a slight loss of effect compared with the online semantic parsing service. In addition, the vertical class parsing capability constructed by a developer comes with dictionaries, such as the dictionary corresponding to the 'band' word slot. A complete dictionary is a very valuable resource for the developer; if it were placed in plaintext on a user's intelligent device (such as a mobile phone), it could easily be stolen by competitors. Such dictionaries therefore need to be encrypted, and the semantic understanding layer decrypts an encrypted dictionary and loads it directly into memory, so that it cannot be obtained from outside.

Step 204, if the current network state is disconnected or unstable, selecting a final analysis result from the offline candidate analysis results of different vertical classes according to the priorities of the vertical classes and/or the historical conversation information.

In this embodiment, analysis results for different vertical classes (e.g., the navigation vertical class and the music vertical class) have been obtained at this point, and the most suitable one is selected according to the priorities of the vertical classes and/or the historical conversation information. For example, when the user inputs "home-returning route", the navigation vertical class yields the analysis result "navigate home" and the music vertical class yields the analysis result song "home-returning route". The final analysis result "navigate home" can be selected because the preset priority of the navigation vertical class is higher than that of the music vertical class. Alternatively, the historical conversation information can be used: if the previous analysis result was "go to the workplace", for example, "home-returning route" most likely belongs to the navigation vertical class. The priority and the historical conversation information can also be combined to recall and rank the offline candidate analysis results of different vertical classes, with the top-ranked candidate selected as the final analysis result. The final analysis result is returned to the user in voice or text form.
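One way to combine priority and history when ranking candidates is sketched below in Python. The priority values, the `history_bonus` weight, and the candidate dictionary layout are hypothetical; the text only states that priority and/or historical conversation information are used.

```python
# Hypothetical preset priorities: a larger number means a higher priority.
VERTICAL_PRIORITY = {"navigation": 2, "music": 1}


def select_final(candidates, last_vertical=None, history_bonus=10):
    """Pick the final analysis result among per-vertical candidates.

    candidates: list of dicts with "vertical" and "intent" keys.
    last_vertical: vertical class of the previous turn, if known.
    """
    def rank(candidate):
        score = VERTICAL_PRIORITY.get(candidate["vertical"], 0)
        # Assumed rule: a candidate in the same vertical class as the
        # previous turn is boosted above any preset priority.
        if candidate["vertical"] == last_vertical:
            score += history_bonus
        return score

    valid = [c for c in candidates if c["intent"] is not None]
    return max(valid, key=rank) if valid else None
```

With the "home-returning route" example, the navigation candidate wins on priority alone, while a previous turn in the music vertical class would tip the choice to the music candidate under this assumed weighting.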

With further reference to FIG. 3, a flow 300 of yet another embodiment of a semantic parsing method is illustrated. The process 300 of the semantic parsing method includes the following steps:

Step 301, acquiring an offline vertical class set.

Step 302, in response to receiving the sentence to be recognized, detecting the current network state.

Step 303, for each vertical class in the vertical class set, performing offline semantic analysis in a heuristic derivation mode and offline semantic analysis in a sample generalization model mode on the sentence to be recognized according to the vertical class, to obtain a first analysis result and a second analysis result of the vertical class respectively; if the first analysis result meets a predetermined condition, taking the first analysis result as the offline candidate analysis result of the vertical class; otherwise, performing offline fusion on the first analysis result and the second analysis result to generate the offline candidate analysis result of the vertical class.

Step 304, if the current network state is disconnected or unstable, selecting a final analysis result from the offline candidate analysis results of different vertical classes according to the priorities of the vertical classes and/or the historical conversation information.

Steps 301-304 are substantially the same as steps 201-204 and are therefore not described again.

Step 305, if the network is detected to be normal, sending an online semantic analysis request including the statement to be identified to the semantic understanding server while performing offline semantic analysis.

In this embodiment, when the network is available, the controller may invoke the online semantic parsing service of a vertical class and concurrently invoke the offline semantic parsing capability of that vertical class; the offline and online results may then be fused.
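The concurrent invocation can be sketched in Python with the standard `concurrent.futures` module. The function name `parse_concurrently`, its parameters, and the default timeout are assumptions for illustration; the two parsers are passed in as plain callables.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_concurrently(sentence, offline_parse, online_parse, timeout_s=0.5):
    """Run the online request in the background while parsing offline.

    Returns (offline_result, online_result); online_result is None when the
    online call fails or does not finish within timeout_s seconds.
    """
    executor = ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(online_parse, sentence)
        # The offline parse runs locally and needs no network round trip.
        offline_result = offline_parse(sentence)
        try:
            online_result = future.result(timeout=timeout_s)
        except Exception:
            # Covers a timeout as well as errors raised by the online call.
            online_result = None
        return offline_result, online_result
    finally:
        # Do not block on a slow online request when shutting down.
        executor.shutdown(wait=False)
```

A caller that already holds a heuristic-derived offline result could skip waiting on the future altogether; this sketch always waits up to the timeout for simplicity.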

Optionally, when the controller calls the online semantic parsing service, it may attach a "has offline parsing capability" mark to the request, depending on the actual application scenario. When the load on the cloud server is high, the server can selectively discard requests carrying this mark, which reduces the pressure on the cloud server and improves the overall quality of service for online parsing requests. In addition, the dialogue central control module can directly adopt the result of the offline parsing capability, which avoids timeouts of the online parsing service and improves local service performance.
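The mark and the server-side discard decision might look like the following Python sketch. The field name `has_offline_capability`, the load metric in the range 0 to 1, and the 0.8 threshold are hypothetical; the text does not define a request format.

```python
def build_request(sentence, has_offline_capability):
    """Client side: attach the offline-capability mark to the request."""
    return {"query": sentence, "has_offline_capability": has_offline_capability}


def should_discard(request, server_load, load_threshold=0.8):
    """Server side: under high load, drop only requests whose sender can
    fall back to its own offline parsing capability."""
    overloaded = server_load > load_threshold
    return overloaded and request.get("has_offline_capability", False)
```

Requests without the mark are never discarded in this sketch, so clients that lack an offline capability keep receiving online results even under load.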

Step 306, for each offline vertical class in the offline vertical class set, if the offline candidate analysis result of the vertical class is the first analysis result, directly adopting the offline candidate analysis result of the vertical class as the fusion result of the vertical class, otherwise, fusing the offline candidate analysis result of the vertical class and the received online candidate analysis result to obtain the fusion result of the vertical class.

In the present embodiment, the following processing is performed according to the network status:

a. if the network is suddenly disconnected or unstable and the online semantic parsing service cannot return a result normally, the result of the offline semantic parsing capability is adopted directly;

b. because the offline semantic parsing capability incurs no network latency, its result is returned faster. If the offline analysis result comes from the heuristic derivation technique (whose effect is consistent with that of online parsing), the offline result is adopted directly, without waiting for the online parsing service to return;

c. if the offline analysis result comes from the sample generalization model and the online parsing service also returns an analysis result, the two are fused. The main fusion strategy treats the online analysis result as primary and the offline analysis result as auxiliary: I) when the offline intention is inconsistent with the online intention, the online analysis result is adopted directly; II) otherwise, the word slots of the offline analysis result are used as a reference to check and merge the word slots of the online analysis result, and the processed online analysis result is returned.
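Cases a to c can be summarized in a short Python sketch. The `source` key that records where the offline result came from, and the reading of "checking and combining" as "keep online slot values and fill in slots that only the offline result found", are assumptions; the text does not specify the merge rule in detail.

```python
def fuse_online_offline(offline, online):
    """Return the fusion result for one vertical class.

    offline: dict with "source" ("heuristic" or "model"), "intent", "slots".
    online:  dict with "intent" and "slots", or None if nothing was returned.
    """
    # Case a: the online service returned nothing; use the offline result.
    if online is None:
        return offline
    # Case b: a heuristic-derived offline result is adopted as is.
    if offline["source"] == "heuristic":
        return offline
    # Case c-I: intentions disagree, so the online result takes precedence.
    if offline["intent"] != online["intent"]:
        return online
    # Case c-II: same intention; online slots stay primary and offline slots
    # only fill gaps (assumed merge rule).
    slots = dict(online["slots"])
    for name, value in offline["slots"].items():
        slots.setdefault(name, value)
    return {**online, "slots": slots}
```

In a real client, case b would normally be decided before waiting for the online response at all; it is folded into one function here only to keep the three cases side by side.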

Step 307, selecting a final analysis result from the fusion results of different vertical classes according to the priorities of the vertical classes and/or the historical conversation information.

In this embodiment, similar to step 204, the final parsing result is selected from the offline and online merged results.

As shown in fig. 4a, the developer only needs to label the dialogue templates and samples once, and different parsing capabilities are obtained through different workflows. For the online parsing technology, online training and service deployment produce an online parsing service that clients can call; for the offline parsing technology, offline training is performed first to obtain the corresponding offline SDK, including the model and dictionary data, and the offline parsing capability is then integrated into the intelligent device. This workflow ensures, to the greatest extent, that the offline and online parsing capabilities behave consistently, and greatly reduces the cost of developing the offline parsing capability. The models and dictionary data generated by offline training are the same for different intelligent platforms and only the access methods differ, so the cost for developers of migrating an intelligent service to another intelligent platform is very low.

The bottom layer of the offline semantic parsing technology can be written in efficient C++, and the same code can be ported to different platforms, for example deployed on a server as a pure C++ service application, or called from a high-level language (Java, Python, etc.) as a dynamic library. The online semantic parsing service is deployed on a cloud server and can be called by any networked client; the offline semantic parsing capability is integrated into the intelligent device and depends on the operating environment, such as the hardware architecture and operating system. See Table one for details, which covers, without limitation, Android devices, iOS devices, and embedded devices. The following takes Android devices as an example to describe the access method and the offline parsing technology in detail.

Table one: Access methods for the offline semantic parsing capability on different intelligent platforms

Interpretation of terms:

User: a person who uses the intelligent device, such as someone using a mobile phone voice assistant.

Developer: a person who develops vertical class parsing capabilities, or who uses and integrates the offline/online parsing capabilities.

APP: mobile phone software, mainly software installed on a smartphone that makes up for the shortcomings of the original system and adds personalization.

SDK: a Software Development Kit is generally a collection of development tools used by software engineers to build application software for a particular software package, software framework, hardware platform, operating system, etc.

NDK: the Android Native Development Kit is a set of tools that allows parts of an application to be implemented in native-code languages (e.g., C and C++). This helps reuse codebases written in these languages when developing certain types of applications.

JNI: the Java Native Interface is the Java feature for calling native code. Through JNI, Java can interoperate with native C/C++ code: Java code can call code written in C/C++ and other languages, and C/C++ code can call Java code.

SO File: the SO file is a shared library file under Linux, and the file format of the SO file is called ELF file format. Since the bottom layer of the Android operating system is based on a Linux system, the SO file can run on an Android platform.

Vertical class: for a specific field or a specific need, provides all in-depth information and related services for that field or need. For example, the navigation vertical class refers to the parsing capability for map-navigation-related requirements such as navigation, positioning, and distance measurement.

Intention: the real wish expressed by the user's utterance; for example, "play a Zhou Jilun song for me" expresses the intention "listen to song".

Word slot: the information or conditions associated with an intention; for example, "play a Zhou Jilun song for me" expresses the intention "listen to song", which has a "singer" word slot whose value here is "Zhou Jilun".

Semantic parsing results: semantic parsing receives a dialog as input and the result of the parsing is an intent and corresponding word slot (value).

The Android device access method mainly consists of compiling the C++ code into SO files with the NDK tools and having the Java code in the SDK call the C++ code through JNI. The underlying parsing capability directly reuses the C++ code, which can be ported to multiple platforms, so the offline parsing effect is preserved to the greatest extent. In addition, the parsing logic is complex and computationally intensive, and executing it in C++ improves service performance. From a security standpoint, Java code is easy to decompile, which would leak technical details, whereas SO files compiled from C++ code are very difficult to decompile.

With further reference to fig. 5, as an implementation of the methods shown in the above-mentioned figures, the present disclosure provides an embodiment of a semantic parsing apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.

As shown in fig. 5, the semantic analysis device 500 of the present embodiment includes: an acquisition unit 501, a detection unit 502, an offline analysis unit 503, and a selection unit 505. The acquiring unit 501 is configured to acquire an offline vertical class set; a detecting unit 502 configured to detect a current network state in response to receiving a sentence to be recognized; the offline analysis unit 503 is configured to perform, for each vertical class in the vertical class set, according to the vertical class, offline semantic analysis in a heuristic derivation manner and offline semantic analysis in a sample generalization model manner on the statement to be recognized, to obtain a first analysis result and a second analysis result of the vertical class, respectively, if the first analysis result meets a predetermined condition, the first analysis result is used as an offline candidate analysis result of the vertical class, otherwise, the first analysis result and the second analysis result are subjected to offline fusion to generate an offline candidate analysis result of the vertical class; the selecting unit 505 is configured to select a final analysis result from the offline candidate analysis results of different vertical classes according to the priorities of the vertical classes and/or the historical dialogue information if the current network state is a network outage or network instability.

In this embodiment, the specific processing of the acquiring unit 501, the detecting unit 502, the offline analyzing unit 503 and the selecting unit 505 of the semantic analyzing apparatus 500 may refer to step 201, step 202, step 203 and step 204 in the corresponding embodiment of fig. 2.

In some optional implementations of this embodiment, the parsing result includes: an intent and word slot set; and the offline parsing unit 503 is further configured to: reversely acquiring a second word slot set needing to be identified according to the intention in a second analysis result obtained in a sample generalization model mode; and fusing the first word slot set and the second word slot set of the first analysis result obtained in the heuristic derivation mode, and finally deriving a complete intention and a word slot analysis result as the vertical off-line candidate analysis result.

In some optional implementations of this embodiment, the apparatus 500 further includes an online parsing unit 504 configured to: if the network is detected to be normal, sending an online semantic analysis request including a statement to be identified to a semantic understanding server while performing offline semantic analysis; for each offline vertical class in the offline vertical class set, if the offline candidate analysis result of the vertical class is a first analysis result, directly adopting the offline candidate analysis result of the vertical class as a fusion result of the vertical class, otherwise, fusing the offline candidate analysis result of the vertical class and the received online candidate analysis result to obtain a fusion result of the vertical class; and selecting a final analysis result from the fusion results of different vertical classes according to the priority of the vertical classes and/or the historical conversation information.

In some optional implementations of this embodiment, the online parsing unit 504 is further configured to: and if the received online candidate analysis results which do not belong to the offline vertical classes are obtained, selecting the final analysis result from the fusion results and the online candidate analysis results of different vertical classes according to the priorities and/or the historical conversation information of the vertical classes.

In some optional implementations of this embodiment, the parsing result includes: an intent and word slot set; and online parsing unit 504 is further configured to: when the intention in the offline candidate analysis result of the vertical class is inconsistent with the intention in the online candidate analysis result, directly adopting the online candidate analysis result as the fusion result of the vertical class; otherwise, taking the word slot set in the offline candidate analysis result as a reference, checking and combining the word slot results in the online candidate analysis result, and taking the processed online candidate analysis result as the fusion result of the vertical class.

In some optional implementations of this embodiment, the offline parsing unit 503 is further configured to: perform word segmentation, part-of-speech tagging, named entity recognition, and the like on the sentence to be recognized; perform syntactic structure analysis and disambiguation on the sentence to be recognized; perform vertical-class-specific named entity recognition on the sentence to be recognized; according to the word segmentation result, part-of-speech information, and entity information, perform offline semantic analysis in a heuristic derivation mode based on the constructed templates, word slots, and keyword dictionary of the vertical class, to obtain the intention and word slot set of the vertical class as the first analysis result of the vertical class; and, according to the word segmentation result, part-of-speech information, and entity information, execute a sequence labeling task to identify a word slot set and a classification task to identify an intention, based on the trained sample generalization model of the vertical class, as the second analysis result of the vertical class.

In some optional implementations of this embodiment, the apparatus 500 further comprises a decryption unit (not shown in the drawings) configured to: decrypt the encrypted keyword dictionary and load it directly into memory.

In some optional implementations of this embodiment, the online parsing unit 504 is further configured to: and sending an online semantic analysis request comprising an offline analysis capability identifier and a statement to be recognized to a semantic understanding server, so that the semantic understanding server discards the online semantic analysis request under the condition of high load pressure.

Referring now to FIG. 6, a schematic diagram of an electronic device (e.g., the controller of FIG. 1) 600 suitable for use in implementing embodiments of the present disclosure is shown. The controller shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.

As shown in fig. 6, electronic device 600 may include a processing means (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or may represent multiple devices as desired.

In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of embodiments of the present disclosure. It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.

The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring an offline vertical class set; detecting a current network state in response to receiving a sentence to be recognized; for each vertical class in the vertical class set, performing off-line semantic analysis in a heuristic derivation mode and off-line semantic analysis in a sample generalization model mode according to the vertical class to obtain a first analysis result and a second analysis result of the vertical class respectively, if the first analysis result meets a preset condition, taking the first analysis result as an off-line candidate analysis result of the vertical class, and otherwise, performing off-line fusion on the first analysis result and the second analysis result to generate an off-line candidate analysis result of the vertical class; and if the current network state is disconnected or the network is unstable, selecting a final analysis result from the offline candidate analysis results of different vertical classes according to the priority of the vertical class and/or the historical conversation information.

Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor including an acquisition unit, a detection unit, an offline parsing unit, and a selection unit. The names of these units do not, in some cases, limit the units themselves; for example, the acquisition unit may also be described as "a unit that acquires an offline vertical class set".

The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the present disclosure is not limited to the specific combination of the above-mentioned features, but also encompasses other embodiments formed by any combination of the above-mentioned features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) features having similar functions disclosed in this disclosure.

Claims

1. A semantic parsing method, comprising:

acquiring an offline vertical class set;

detecting a current network state in response to receiving a sentence to be recognized;

for each vertical class in the vertical class set, performing off-line semantic analysis in a heuristic derivation mode and off-line semantic analysis in a sample generalization model mode on the statement to be recognized according to the vertical class to respectively obtain a first analysis result and a second analysis result of the vertical class, if the first analysis result meets a preset condition, taking the first analysis result as an off-line candidate analysis result of the vertical class, and otherwise, performing off-line fusion on the first analysis result and the second analysis result to generate an off-line candidate analysis result of the vertical class;

and if the current network state is disconnected or the network is unstable, selecting a final analysis result from the offline candidate analysis results of different vertical classes according to the priority of the vertical class and/or the historical conversation information.

2. The method of claim 1, wherein the parsing result comprises: an intent and a word slot set; and

the offline fusion of the first analysis result and the second analysis result to generate the vertical offline candidate analysis result includes:

reversely acquiring a second word slot set needing to be identified according to the intention in the second analysis result obtained in the sample generalization model mode;

and fusing the first word slot set and the second word slot set of the first analysis result obtained in a heuristic derivation mode, and finally deriving a complete intention and a word slot analysis result as the vertical off-line candidate analysis result.

3. The method of claim 1, wherein the method further comprises:

if the network is detected to be normal, sending an online semantic analysis request including the statement to be identified to a semantic understanding server while performing offline semantic analysis;

for each offline vertical class in the offline vertical class set, if the offline candidate analysis result of the vertical class is the first analysis result, directly adopting the offline candidate analysis result of the vertical class as the fusion result of the vertical class, otherwise, fusing the offline candidate analysis result of the vertical class and the received online candidate analysis result to obtain the fusion result of the vertical class;

and selecting a final analysis result from the fusion results of different vertical classes according to the priority of the vertical classes and/or the historical conversation information.

4. The method of claim 3, wherein the method further comprises:

and if the received online candidate analysis results which do not belong to the offline vertical classes are obtained, selecting the final analysis result from the fusion results and the online candidate analysis results of different vertical classes according to the priorities and/or the historical conversation information of the vertical classes.

5. The method of claim 3, wherein the parsing result comprises: an intent and a word slot set; and

the fusing the offline candidate analysis result of the vertical class with the received online candidate analysis result to obtain a fused result of the vertical class includes:

when the intention in the offline candidate analysis result of the vertical class is inconsistent with the intention in the online candidate analysis result, directly adopting the online candidate analysis result as the fusion result of the vertical class;

otherwise, taking the word slot set in the offline candidate analysis result as a reference, checking and combining the word slot results in the online candidate analysis result, and taking the processed online candidate analysis result as the fusion result of the vertical class.
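A minimal Python sketch of this online/offline fusion rule follows. The claim does not spell out what "checking and combining" involves. This sketch assumes it means keeping the online slot values and adding any offline slots the online result lacks. The result layout and function name are likewise hypothetical.

```python
def fuse_online(offline, online):
    """Fuse the offline and online candidates of one vertical class."""
    # Inconsistent intentions: adopt the online candidate directly.
    if offline["intent"] != online["intent"]:
        return online
    # Consistent intentions: start from a copy of the online slots.
    slots = dict(online["slots"])
    for name, value in offline["slots"].items():
        # Assumption: offline slots only fill gaps in the online result.
        slots.setdefault(name, value)
    return {"intent": online["intent"], "slots": slots}
```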

6. The method according to one of claims 1 to 5, wherein the performing of offline semantic analysis in a heuristic derivation mode and offline semantic analysis in a sample generalization model mode on the sentence to be recognized according to the vertical class, to obtain a first analysis result and a second analysis result of the vertical class respectively, comprises:

performing word segmentation, part-of-speech tagging, named entity recognition and the like on the sentence to be recognized;

performing syntactic structure analysis and disambiguation on the sentence to be recognized;

performing vertical-class-specific named entity recognition on the sentence to be recognized;

according to the word segmentation results, part-of-speech information and entity information, performing offline semantic analysis in the heuristic derivation mode based on the templates, word slots and keyword dictionary constructed for the vertical class, to obtain an intention and a word slot set of the vertical class as the first analysis result of the vertical class;

and, according to the word segmentation results, part-of-speech information and entity information, executing, based on the trained sample generalization model of the vertical class, a sequence tagging task to identify a word slot set and a classification task to identify an intention, the word slot set and the intention together serving as the second analysis result of the vertical class.
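The template-and-dictionary side of the heuristic derivation can be illustrated with a short Python sketch. It deliberately omits word segmentation, part-of-speech tagging and syntactic analysis, and shows only how a template with named word slots and a keyword dictionary might yield an intention and a word slot set. The template, the dictionary contents and all names are hypothetical.

```python
import re

# Hypothetical template and keyword dictionary for a "music" vertical class.
TEMPLATES = {"play_music": re.compile(r"play (?P<song>.+) by (?P<artist>.+)")}
KEYWORDS = {"artist": {"jay chou"}}


def heuristic_parse(sentence):
    """Return an intention and word slot set, or None if no template fits."""
    text = sentence.lower().strip()
    for intent, pattern in TEMPLATES.items():
        match = pattern.fullmatch(text)
        if match is None:
            continue
        slots = match.groupdict()
        # A slot backed by a dictionary must hold a known keyword.
        known = all(
            slots[name] in words
            for name, words in KEYWORDS.items()
            if name in slots
        )
        if known:
            return {"intent": intent, "slots": slots}
    return None
```

A sentence that matches no template, or whose slot value is absent from the dictionary, yields no first analysis result, which is the case where the model-based result would be needed.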

7. The method of claim 6, wherein the method further comprises:

decrypting the encrypted keyword dictionary and loading the decrypted keyword dictionary directly into memory.

8. The method according to one of claims 3-5, wherein the sending an online semantic analysis request including the sentence to be recognized to a semantic understanding server comprises:

sending an online semantic analysis request comprising an offline analysis capability identifier and the sentence to be recognized to the semantic understanding server, so that the semantic understanding server can discard the online semantic analysis request when it is under high load.
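The load-shedding behavior this enables on the server side might look like the sketch below. The request field `offline_capable`, the load measure and the threshold of 0.8 are assumptions for illustration. The claim only states that the request carries an offline analysis capability identifier and that the server may discard such a request under high load.

```python
def build_request(sentence, offline_capable):
    """Client side: attach the offline analysis capability identifier."""
    return {"sentence": sentence, "offline_capable": offline_capable}


def should_discard(request, current_load, load_threshold=0.8):
    """Server side: drop only requests the client can handle offline."""
    # Assumption: load is a fraction in [0, 1] compared with a threshold.
    return bool(request.get("offline_capable")) and current_load > load_threshold
```

A client without offline capability is never shed under this rule, since it has no local fallback.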

9. A semantic parsing apparatus comprising:

an obtaining unit configured to obtain an offline vertical class set;

a detection unit configured to detect a current network state in response to receiving a sentence to be recognized;

an offline parsing unit configured to, for each vertical class in the vertical class set, perform offline semantic analysis in a heuristic derivation mode and offline semantic analysis in a sample generalization model mode on the sentence to be recognized according to the vertical class, to obtain a first analysis result and a second analysis result of the vertical class respectively; if the first analysis result meets a preset condition, take the first analysis result as an offline candidate analysis result of the vertical class; and otherwise perform offline fusion on the first analysis result and the second analysis result to generate the offline candidate analysis result of the vertical class;

and a selecting unit configured to, if the current network state is disconnected or unstable, select a final analysis result from the offline candidate analysis results of the different vertical classes according to the priority of the vertical classes and/or the historical conversation information.

10. The apparatus of claim 9, wherein an analysis result comprises an intention and a word slot set; and

the offline parsing unit is further configured to:

11. The apparatus of claim 9, wherein the apparatus further comprises an online parsing unit configured to:

12. The apparatus of claim 11, wherein the online parsing unit is further configured to:

13. The apparatus of claim 11, wherein an analysis result comprises an intention and a word slot set; and

the online parsing unit is further configured to:

14. The apparatus according to one of claims 9-13, wherein the offline parsing unit is further configured to:

15. The apparatus of claim 14, wherein the apparatus further comprises a decryption unit configured to:

16. The apparatus according to one of claims 11-13, wherein the online parsing unit is further configured to:

17. An electronic device, comprising:

one or more processors;

a storage device having one or more programs stored thereon,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.

18. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-8.