WO2018153273A1 - Semantic parsing method and apparatus, and storage medium - Google Patents

Publication number
WO2018153273A1
WO2018153273A1 PCT/CN2018/075795 CN2018075795W WO2018153273A1 WO 2018153273 A1 WO2018153273 A1 WO 2018153273A1 CN 2018075795 W CN2018075795 W CN 2018075795W WO 2018153273 A1 WO2018153273 A1 WO 2018153273A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
state machine
steps
voice
connection relationship
Prior art date
Application number
PCT/CN2018/075795
Other languages
French (fr)
Chinese (zh)
Inventor
冯晓冰
廖玲
王飞
徐浩
Original Assignee
腾讯科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2018153273A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Definitions

  • The present application relates to voice analysis technology, and in particular to a state machine based semantic parsing method, apparatus, and storage medium.
  • A voice assistant is an intelligent terminal application that helps users solve problems, mainly everyday life problems, through intelligent dialogue and instant question-and-answer interaction.
  • The voice assistant is a voice-controlled application (App).
  • The voice produced by the user is collected by the sound collection hardware on the terminal, recognized by speech recognition technology, and the recognized text is then semantically analyzed. The assistant can then respond quickly in the foreground; it can also hold a voice chat with the user through the microphone, or help the user operate the smart terminal according to the user's commands.
  • The voice assistant is an application that, through voice interaction, can replace all or part of the user's queries and operations on a terminal such as a mobile phone. Such voice applications can greatly improve the convenience of operating mobile phones in different business scenarios.
  • The embodiments of the present application provide a state machine based semantic parsing method, apparatus, and storage medium that solve at least one problem existing in the prior art and can enhance the scalability of the voice platform.
  • An embodiment of the present application provides a state machine-based semantic parsing method, where the method is applied to a server, including:
  • determining, according to a function of the voice product, a step set of the voice product in semantic parsing, where the step set includes at least two steps, and the steps are used to complete at least the following operations: preprocessing a voice instruction input by a user, parsing the voice instruction, and invoking a corresponding function according to the parsing result;
  • forming the nodes into a state machine of the voice product, so that the server parses the voice instruction input by the user according to the state machine and provides the user with a function corresponding to the voice instruction according to the parsing result.
  • An embodiment of the present application provides a state machine-based semantic parsing method, where the method is applied to a server, including:
  • each node of the state machine corresponds to a step in a step set of semantic parsing; the step set is determined according to the functions provided by the voice product and includes at least two steps; the steps are used to complete at least the following operations: preprocessing the user's voice instruction, parsing the voice instruction, and invoking a corresponding function according to the parsing result;
  • outputting the output result.
  • An embodiment of the present application provides a state machine based semantic parsing apparatus applied to a server. The apparatus includes a processor and a memory connected to the processor; the memory stores machine-readable instruction units executable by the processor; the machine-readable instruction units comprise: a first determining unit, a second determining unit, a third determining unit, a first forming unit, and a second forming unit, wherein:
  • the first determining unit is configured to determine a function of the voice product;
  • the second determining unit is configured to determine, according to the function of the voice product, a step set of the voice product in semantic parsing, where the step set includes at least two steps, and the steps are used to complete at least the following operations: preprocessing a voice instruction input by the user, parsing the voice instruction, and invoking a corresponding function according to the parsing result;
  • the third determining unit is configured to determine a node of the corresponding state machine for each step in the step set;
  • the first forming unit is configured to form a node set according to the determined nodes;
  • the second forming unit is configured to form the node set into a state machine of the voice product, so that the server parses a voice instruction input by the user according to the state machine and provides the user with a function corresponding to the voice instruction according to the parsing result.
  • An embodiment of the present application provides a state machine based semantic parsing apparatus applied to a server. The apparatus includes a processor and a memory connected to the processor; the memory stores machine-readable instruction units executable by the processor; the machine-readable instruction units comprise: a third obtaining unit, an input unit, a fourth obtaining unit, and an output unit, wherein:
  • the third obtaining unit is configured to acquire a to-be-parsed statement of the voice product;
  • the input unit is configured to input the to-be-parsed statement into a first node of a preset state machine, where each node of the state machine corresponds to a step in a step set of semantic parsing; the step set is determined according to the functions provided by the voice product and includes at least two steps; the steps are used to complete at least the following operations: preprocessing the user's voice instruction, parsing the voice instruction, and invoking a corresponding function according to the parsing result;
  • the fourth obtaining unit is configured to obtain an output result from a last node of the state machine;
  • the output unit is configured to output the output result.
  • A non-transitory computer readable storage medium storing one or more programs, the one or more programs comprising instructions that, when executed by a computing device, cause the computing device to perform the method of the first aspect or the second aspect described above.
  • FIG. 1A is a schematic structural diagram of a system applicable to implementation of a semantic parsing method according to an embodiment of the present application
  • FIG. 1B is a schematic flowchart of a semantic analysis method based on a state machine according to an embodiment of the present application
  • FIG. 2 is a state diagram of a finite state machine of an elevator door in an embodiment of the present application
  • FIG. 3 is a state diagram of a state machine configuration in the embodiment
  • FIG. 5 is a schematic flowchart of semantic analysis of an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of an implementation process of a semantic analysis method based on a state machine according to an embodiment of the present application
  • FIG. 7 is a schematic structural diagram of a semantic parsing apparatus based on a state machine according to an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a semantic parsing apparatus based on a state machine according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a network architecture according to an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • A semantic parser is built for each service in the voice platform.
  • For example, when Company A wants to launch a music service (such as QQ Music), the company also needs to build a semantic parser for the music service, so that users can search for content of interest in instant messaging (QQ).
  • The following embodiments of the present application apply a finite state machine to semantic parsing: every possible step in the semantic parsing process is abstracted into a node of the state machine. This makes it convenient for developers to add or delete a step, or to customize each step when a product is accessed, so as to generate a semantic parsing model suited to the business. Researchers can thus flexibly update the parsing methods, and the parsing process can be flexibly customized when a voice product is accessed. The technical solution provided by the embodiments therefore improves the voice platform: resources are used more rationally, and constructing a semantic parser for a newly accessed service is no longer difficult.
  • Voice assistant: software that provides users with corresponding services based on their voice input.
  • Voice platform: an improvement on the existing voice platform that can provide semantic parsing services for multiple products.
  • Scene: the domain a sentence belongs to; for example, "I want to listen to music" belongs to the music scene, and a request for a joke belongs to the joke scene.
  • Semantic parsing: parsing a sentence into a scene, an intent, and parameters that the computer can recognize. For example, for "I want to listen to the ice rain", the scene is the music scene, the intent is to listen, and the parameter is "ice rain".
  • Micro Desktop: a desktop product from the Intelligent Platforms Division.
  • Finite-state machine (FSM, referred to as a state machine): a mathematical model that represents a finite number of states together with behaviors such as transitions between those states and actions.
  • NER: named entity recognition.
  • An FSM is composed of a finite set of states and the transitions between them, and can only be in one of those states at any time.
  • The state machine produces an output that may be accompanied by a state transition.
  • The finite state machine includes the following components:
  • Transition: the process by which an object moves from one state to another (e.g., from the preprocessing state to the semantic-algorithm parsing state);
  • Transition condition: the event or condition that causes the object's state to change (e.g., the "parsing needed" condition);
  • Action: the action taken by the object before the state transition (e.g., the preprocessing action).
  • FIG. 1A is a schematic diagram showing the structure of a system to which the semantic analysis method of the example of the present application is applied.
  • the system includes at least a terminal device 101, a first server 102, a second server 103, and a network 104.
  • The terminal device 101 is a terminal device with data computation and processing capability, including but not limited to a smartphone, handheld computer, tablet computer, or personal computer with a communication module installed.
  • Operating systems are installed on these terminal devices 101, including but not limited to: Android operating system, Symbian operating system, Windows mobile operating system, and Apple iPhone OS operating system.
  • The terminal device 101 is installed with an application client (for example, a voice assistant app). Through the network 104, the application client exchanges information with the application server software corresponding to the application client (for example, voice assistant application server software) installed on the first server 102 (for example, a voice assistant server), so as to implement the application client's intelligent dialogue and instant question-and-answer interaction.
  • The second server 103 is installed with application server software (for example, a voice platform server) for constructing a state machine. The constructed state machine is sent to the first server 102 through the network 104, so that the first server 102 can parse and reply to the user voice sent by the client according to the state machine.
  • The network 104 can be a wired network or a wireless network.
  • In the voice platform server, when a new voice service needs to be added, different parsing steps can be customized according to requirements so as to construct a new semantic parser for the service.
  • The voice platform server provided by the embodiments of the present application can rapidly extend to new services, and solves the problem that a separate server has to be created for each voice service because the data scale and domain differ greatly across service scenarios. Therefore, for an information service provider, the voice platform provided by the embodiments can integrate the voice services of all its businesses in a practical sense.
  • the flow of the state machine-based semantic parsing method provided by the embodiment of the present application is shown in FIG. 1B.
  • The user says "I want to listen to the ice rain" in the voice product (i.e., the audio or music client). The workflow of the background (such as the server corresponding to the music client) includes: detecting the statement "I want to listen to the ice rain" spoken by the user, assigning a state machine according to the statement source, inputting the statement into the assigned state machine, acquiring the output result of the state machine, and returning the output result to the user.
  • The background work is completely controlled by different state machines.
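  • The background workflow just described (assign a state machine by statement source, feed the statement in, return the output) can be sketched roughly as follows. This is only an illustration under assumptions: the registry `MACHINES`, the source names, and the stand-in machine functions are hypothetical and do not come from the application.

```python
from typing import Callable, Dict


# Hypothetical stand-ins: each "state machine" is modeled as a plain function
# that takes the statement to be parsed and returns an output result.
def music_machine(statement: str) -> str:
    return f"music: {statement}"


def browser_machine(statement: str) -> str:
    return f"browser: {statement}"


# Assumed registry mapping a statement source (the voice product) to its machine.
MACHINES: Dict[str, Callable[[str], str]] = {
    "music_client": music_machine,
    "browser_assistant": browser_machine,
}


def handle(source: str, statement: str) -> str:
    """Assign a state machine by statement source, run it, and return its output."""
    machine = MACHINES[source]   # assign the state machine according to the source
    return machine(statement)    # input the statement and obtain the output result
```

  • Keeping the per-product machines behind one lookup means a new product only adds a registry entry; the dispatch code itself does not change.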
  • FIG. 2 shows a state diagram of a finite state machine for an elevator door. It includes two states: state 1 is open and state 2 is closed. The action on entering state 1 is opening the door, and the action on entering state 2 is closing the door; the transition conditions between state 1 and state 2 are the open and close events.
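  • A minimal sketch of the elevator-door state machine, assuming the door starts closed and that an event which is not valid in the current state is simply ignored; the identifiers are illustrative only.

```python
# (current state, event) -> next state, following the two-state door description.
TRANSITIONS = {
    ("closed", "open"): "opened",
    ("opened", "close"): "closed",
}

# Action performed when a state is entered.
ENTRY_ACTIONS = {"opened": "open the door", "closed": "close the door"}


class ElevatorDoor:
    def __init__(self, state="closed"):  # assumed initial state
        self.state = state
        self.log = []  # entry actions performed so far

    def fire(self, event):
        """Apply an event; stay in the current state if the event does not apply."""
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is None:
            return self.state
        self.state = next_state
        self.log.append(ENTRY_ACTIONS[next_state])
        return self.state
```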
  • the following describes in detail how to apply the FSM model to the semantic parsing process.
  • the semantic parsing state machine implementation process of the embodiment of the present application is as follows:
  • a language that uses a uniform format and is identifiable from each other.
  • the speech resolution process includes the following steps:
  • Step S1 a preprocessing process
  • The voice input by the user is recognized by the voice server through speech recognition as the text to be processed (that is, the statement to be parsed). It is then determined whether the statement to be parsed needs further analysis: if it does, the process proceeds to step S2, in which the statement to be parsed is parsed by the semantic parsing method; if it does not, the process proceeds to step S3 and the vertical service is called.
  • Speech recognition technology converts the speech signal into computer-readable text symbols, addressing the problem of enabling the machine to understand human speech.
  • For example, in a music client (e.g., a music app), the voice information is preprocessed and recognized, and the pending text "I want to listen to the ice rain" is obtained. The voice server needs to further semantically parse this pending text to understand the user's intention; since further analysis is needed, the preprocessing state transitions to the semantic parsing algorithm state, that is, step S2, in which the statement to be parsed is parsed by the semantic parsing method.
  • Step S2 parsing the parsing statement by a semantic parsing method
  • The semantic parsing method includes a deep learning method, a multi-scene analysis template, a NER + vocabulary template, and a regular template.
  • If the parsing succeeds, the process proceeds to step S3, that is, the vertical service is called; if the parsing fails, the process proceeds to step S4, that is, the general answer (Frequently Asked Questions, FAQ) is called.
  • Successful parsing means that the voice server can parse out the user's intention (for example, the voice server parses the pending text "I want to listen to the ice rain" as the user wanting to listen to the song "ice rain"); the process then proceeds to step S3, that is, the vertical service is called.
  • If the voice server cannot parse out the user's intention (for example, it cannot determine that the pending text "I want to listen to the ice rain" means the user wants to listen to the song "ice rain"), the process proceeds to step S4, that is, the general answer (FAQ) is called.
  • Step S3 calling a vertical service
  • the vertical service may include: a music scene service, a map scene service, a video scene service, and an a la carte scene service.
  • If the vertical service that was called is not the correct one, the process returns to step S2 and the pending text is re-parsed; if the call to the vertical service fails, the process re-enters step S3 to call the vertical service again; if the call succeeds, the process ends (enters the end state).
  • Step S4 calling a general answer (FAQ);
  • When the music server does not find a result when searching for a song, it returns a general answer to the music client, for example causing the music client to say "No song found." If the statement to be parsed cannot be recognized, a general answer may also be returned to the user, for example the voice reply "unrecognizable". After the general answer is returned to the user, the process ends (i.e., enters the end state).
  • Step S5 performing a local search on the parsing statement
  • The process proceeds from step S4 to step S5 without any condition; after the local search is performed, the process ends (enters the end state).
  • Step S6, the process ends.
  • Each of the above steps corresponds to one state in FIG. 3; for example, steps S1 to S6 correspond to state 31 to state 36, respectively, and the relationships between steps S1 to S6 correspond to the state transition conditions between state 31 and state 36. For example, the relationship (that is, the connection relationship) between step S1 and step S2 is: it is determined whether the statement to be parsed needs further analysis, and if it does, the process proceeds to step S2; correspondingly, the state transition condition between state 31 and state 32 is that parsing is needed.
  • The relationship between step S1 and step S3 is: it is determined whether the statement to be parsed needs further parsing, and if it does not, the process proceeds to step S3; correspondingly, the state transition condition between state 31 and state 33 is: the analysis is successful.
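  • The step-to-state correspondence can be illustrated with a small transition table. The sketch below is an assumption-laden reading of steps S1 to S6: the event names are hypothetical labels, and the unconditional S4 to S5 edge follows the statement that no condition is required for that transition.

```python
# (step, event) -> next step; event names are hypothetical labels for the
# transition conditions described in the text.
TRANSITIONS = {
    ("S1", "needs_parsing"): "S2",
    ("S1", "no_parsing_needed"): "S3",
    ("S2", "parse_ok"): "S3",
    ("S2", "parse_failed"): "S4",
    ("S3", "wrong_service"): "S2",   # re-parse the pending text
    ("S3", "call_failed"): "S3",     # call the vertical service again
    ("S3", "call_ok"): "S6",
    ("S4", "always"): "S5",          # no condition required
    ("S5", "always"): "S6",
}


def walk(events, start="S1", end="S6"):
    """Follow the events from the start step and return the visited steps."""
    state, path = start, [start]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
        if state == end:
            break
    return path
```

  • For instance, `walk(["needs_parsing", "parse_ok", "call_ok"])` visits S1, S2, S3, and S6 in that order.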
  • FIG. 4 is a schematic flowchart of semantic analysis of an embodiment of the present application. As shown in FIG. 4, the semantic parsing process may further include the following steps:
  • Step S401 preprocessing
  • See step S1 in the above embodiment.
  • Step S402 calling a semantic analysis method for semantic analysis
  • the semantic analysis method includes a deep learning method, a multi-scene analysis template, a NER+ vocabulary template, and a regular template.
  • Step S404 adapting logic
  • Semantic disambiguation is performed in step S403. After disambiguation, the adaptation logic is applied in step S404 to determine the user's true intention, so that the scene can be selected in step S405.
  • Step S405 searching for a vertical scene
  • The vertical scene includes removing the phone scene, removing the SMS scene, the music scene, the joke scene, the eating scene, the à la carte scene, the purchase scene, the cooking scene, and the like.
  • Step S406 the bottom operation, wherein the bottom operation generally includes a FAQ, an encyclopedia search, a jump search page, and an open domain search.
  • Regarding step S403: many words have several meanings, and a word takes on a definite meaning only in a specific context. Considered apart from context, the meaning of a word is generally ambiguous. The task of disambiguation is to determine which sense a polysemous word takes in a particular context; the specific sense can be determined by considering the context in which the word is used.
  • A relatively simple method is to look up a word's definition in a dictionary to determine its sense. For most words, however, senses and usage cannot simply be read off dictionary definitions. Some of the senses listed in a dictionary are clearly distinguishable, but most are not clear-cut and blend into one another. Worse, a dictionary can list only a limited number of senses for each word, and the sense a word takes in an actual context may not be among them. Moreover, a word may have different parts of speech. Determining the specific part of speech of a word is the task of part-of-speech tagging, which is not covered here; it suffices to note that determining the part of speech of a word can effectively reduce lexical ambiguity.
  • Taking the browser voice assistant as an example, its flow is a subset of the flow in FIG. 4. As shown in FIG. 5, the browser voice parsing process includes:
  • Step S501 preprocessing
  • Step S502 calling a semantic analysis method for semantic analysis
  • the semantic analysis method includes a deep learning method, a multi-scene analysis template, and a NER+ vocabulary template.
  • Step S504 adapting logic
  • Step S505 searching for a vertical scene
  • the vertical scene includes removing the phone scene and removing the short message scene.
  • Step S506 the bottom operation, wherein the bottom operation generally includes an encyclopedia search and a jump search page.
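  • The "subset" relation between the browser flow (FIG. 5) and the full flow (FIG. 4) can be checked mechanically. The sketch below uses the parsing methods and bottom operations listed for each flow; the dictionary layout, key names, and function names are assumptions made for illustration.

```python
# Options listed for the full flow (steps S402 and S406).
FULL_FLOW = {
    "parse_methods": {"deep_learning", "multi_scene_template",
                      "ner_vocabulary_template", "regular_template"},
    "bottom_ops": {"faq", "encyclopedia_search", "jump_search_page",
                   "open_domain_search"},
}

# Options listed for the browser voice assistant (steps S502 and S506).
BROWSER_FLOW = {
    "parse_methods": {"deep_learning", "multi_scene_template",
                      "ner_vocabulary_template"},
    "bottom_ops": {"encyclopedia_search", "jump_search_page"},
}


def is_subset_flow(product, full):
    """True if every option the product uses also exists in the full flow."""
    return all(key in full and options <= full[key]
               for key, options in product.items())


def dropped_options(product, full):
    """Options of the full flow that the product does not use, per step."""
    return {key: sorted(full[key] - product.get(key, set())) for key in full}
```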
  • an embodiment of the present application provides a state machine based semantic parsing method, which is applied to a first computing device, and the functions implemented by the method may be implemented by a processor calling program code in the first computing device.
  • the program code can be stored in a computer storage medium.
  • the first computing device includes at least a processor and a storage medium.
  • FIG. 6 is a schematic diagram of an implementation process of a semantic parsing method based on a state machine according to an embodiment of the present application. As shown in FIG. 6, the method may be applied to a voice server, where the method includes:
  • Step S601 determining a function of the voice product
  • For example, for a music product, the function of the voice product is to search for songs according to the user's voice command and play them; for an air-conditioner product, the function is to determine working parameters such as the temperature, humidity, and duration of the air conditioner according to the user's voice command, and to work according to the determined working parameters; for the browser voice assistant, the function is to search according to the user's voice command and return the result; for the voice chat assistant, the function is to hold a dialogue according to the user's voice.
  • Step S602 determining, according to a function of the voice product, a step set of the voice product in semantic parsing, where the step set includes at least two or more steps;
  • Step S603 determining a node of the corresponding state machine for each step in the step set;
  • Step S604 forming a node set according to the determined node
  • Step S605 the node set is formed into a state machine of the voice product.
  • <machine></machine> defines a state machine.
  • The content under <state></state> is the state name and the action corresponding to the state, where the action is implemented by a class that implements a unified interface.
  • <transmition></transmition> defines a transition.
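  • To make the configuration format more concrete, the following sketch loads a small state machine definition. Only the element names (machine, state, transmition) come from the text; the attribute names (`name`, `action`, `from`, `to`, `condition`), the sample values, and the loader function are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration; attribute names and values are illustrative only.
CONFIG = """
<machine name="music">
  <state name="preprocess" action="PreprocessAction"/>
  <state name="parse" action="ParseAction"/>
  <state name="end" action="EndAction"/>
  <transmition from="preprocess" to="parse" condition="needs_parsing"/>
  <transmition from="parse" to="end" condition="parse_ok"/>
</machine>
"""


def load_machine(xml_text):
    """Return (machine name, {state: action class name}, {(state, condition): next state})."""
    root = ET.fromstring(xml_text)
    states = {s.get("name"): s.get("action") for s in root.findall("state")}
    transitions = {
        (t.get("from"), t.get("condition")): t.get("to")
        for t in root.findall("transmition")
    }
    return root.get("name"), states, transitions
```

  • With a layout like this, customizing the parsing flow for a product amounts to editing the state and transition entries rather than the code.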
  • the state machine may be run on the first computing device; or the state machine is output to the second computing device, and then the second computing device runs the state machine. Based on this, whether the first computing device or the second computing device runs the state machine, the method further includes:
  • Step S606 acquiring a to-be-analyzed statement of the voice product
  • Step S607 input the statement to be parsed into a first node of a preset state machine
  • Step S608 obtaining an output result from a last node of the state machine
  • Step S609 outputting the output result.
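  • Steps S606 to S609 can be sketched as a loop that feeds the statement into the first node and reads the result after the last node has run. Everything here is a toy under assumptions: the `Node` interface, the node classes, and the prefix-matching "parser" are hypothetical stand-ins for the real steps.

```python
from typing import Dict, Optional, Protocol


class Node(Protocol):
    """Assumed unified interface: every node exposes the same run() method."""

    def run(self, context: dict) -> Optional[str]:
        """Work on the shared context; return the next node id, or None to stop."""


class Preprocess:
    def run(self, context):
        context["text"] = context["statement"].strip().lower()
        return "parse"


class Parse:
    def run(self, context):
        prefix = "i want to listen to "
        if context["text"].startswith(prefix):
            context["result"] = {"scene": "music", "intent": "listen",
                                 "parameter": context["text"][len(prefix):]}
        else:
            context["result"] = {"scene": "faq"}
        return None  # last node: nothing follows


def run_machine(nodes: Dict[str, Node], first: str, statement: str):
    context = {"statement": statement}          # S606: statement to be parsed
    current: Optional[str] = first              # S607: enter the first node
    while current is not None:
        current = nodes[current].run(context)
    return context["result"]                    # S608/S609: output of the last node
```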
  • accessing any new voice product does not require re-encoding, and only needs to customize different parsing processes according to product requirements, which is simple, flexible, and efficient, and the user experience is good.
  • For example, when the user says the name of person A (for example, Li Xiaopeng) in the browser voice assistant, the action the user sees is a jump to the search page, where the browser searches for the keyword A.
  • The encyclopedia information of person A is returned directly.
  • In the first mode, step S603, "determining a node of the corresponding state machine for each step in the step set", includes: determining, according to the connection relationship between each step and the other steps in the step set, the transition condition between the node corresponding to that step and the nodes corresponding to the other steps; correspondingly, step S605 includes: forming the node set into a state machine of the voice product according to the transition conditions.
  • In the second mode, forming the node set into a state machine of the voice product comprises: determining, according to the connection relationship between every two steps in the step set, the connection relationship between the nodes corresponding to those two steps; and forming the state machine of the voice product according to the connection relationships between the nodes in the node set.
  • "Every two steps" refers to all possible pairs of steps in the step set. Assuming that the step set includes steps a, b, c, and d, every two steps includes: step a and step b, step a and step c, step a and step d, step b and step c, step b and step d, and step c and step d.
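  • The enumeration of step pairs corresponds to taking all 2-element combinations of the step set. A short sketch, with `step_pairs` as a hypothetical helper name:

```python
from itertools import combinations


def step_pairs(steps):
    """All unordered pairs of steps, in the order the steps are listed."""
    return list(combinations(steps, 2))


pairs = step_pairs(["a", "b", "c", "d"])
# pairs == [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")]
```

  • These pairs are unordered; if the direction of a connection matters (for example, S3 back to S2), `itertools.permutations(steps, 2)` would enumerate ordered pairs instead.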
  • The connection relationship between every two steps can be understood with reference to step S1 and step S2 described above. The relationship between step S1 and step S2 is: it is determined whether the statement to be parsed needs further parsing, and if it does, the process proceeds to step S2; correspondingly, the state transition condition between state 31 and state 32 is that parsing is needed.
  • The relationship between step S1 and step S3 is: it is determined whether the statement to be parsed needs further parsing, and if it does not, the process proceeds to step S3; correspondingly, the state transition condition between state 31 and state 33 is: the analysis is successful.
  • Forming the node set into a state machine of the voice product may also include: acquiring the identifier of the node corresponding to each step; and forming the state machine of the voice product from a preset state map according to the identifiers of the nodes corresponding to the steps.
  • the process of forming a preset state map includes:
  • Step SA1 determining a complete set of steps in semantic parsing, the complete set of steps comprising at least two or more steps, the step set being a subset of the complete set of steps;
  • The complete set of steps may include the same number of steps as the step set, or more steps than the step set; "subset" here also covers the case where the complete set of steps includes the same number of steps as the step set.
  • Step SA2 encapsulating each step of the step set into a node of the state machine
  • Step SA3 determining, according to the connection relationship between each two steps in the step set, the connection relationship (or transition condition) between the nodes corresponding to each two steps;
  • step SA4 a state diagram is formed based on the connection relationship (or transition condition) between the nodes.
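  • Steps SA2 to SA4, together with forming a product's state machine from the preset state map by node identifiers, can be sketched as follows. The sample connections, the node identifiers (`state31` and so on), and both function names are assumptions chosen for illustration.

```python
# Hypothetical step-level connections: (from step, to step) -> condition.
STEP_LINKS = {
    ("S1", "S2"): "needs_parsing",
    ("S1", "S3"): "no_parsing_needed",
    ("S2", "S3"): "parse_ok",
    ("S2", "S4"): "parse_failed",
}

# SA2: each step is encapsulated as one node (identifiers are illustrative).
STEP_TO_NODE = {"S1": "state31", "S2": "state32", "S3": "state33", "S4": "state34"}


def build_state_map(step_links, step_to_node):
    """SA3/SA4: turn step connections into node transitions, grouped by source node."""
    state_map = {}
    for (src_step, dst_step), condition in step_links.items():
        src, dst = step_to_node[src_step], step_to_node[dst_step]
        state_map.setdefault(src, {})[condition] = dst
    return state_map


def restrict(state_map, node_ids):
    """Keep only transitions whose source and target are both among node_ids."""
    keep = set(node_ids)
    return {
        src: {cond: dst for cond, dst in edges.items() if dst in keep}
        for src, edges in state_map.items() if src in keep
    }
```

  • A product that does not use step S4 would then obtain its state machine with `restrict(build_state_map(STEP_LINKS, STEP_TO_NODE), ["state31", "state32", "state33"])`.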
  • In step A2, determining, for each step in the step set, the node of the corresponding state machine comprises: acquiring association information between steps and nodes; and determining, according to the association information, the node of the corresponding state machine for each step in the step set.
  • The association information represents the correspondence between steps and nodes.
  • The association information may be implemented as a correspondence list: the list is queried by the identifier of a step to obtain the corresponding node.
  • To ensure the correspondence between steps and nodes (the states of the state machine), the embodiment of the present application further includes determining whether a step and a node match, that is, the method in this embodiment also includes:
  • Step SB1 Acquire a first connection relationship, where the first connection relationship is a connection relationship between a step in the step set and any other step in the step set;
  • Step SB2 obtaining a second connection relationship, where the second connection relationship is a connection relationship (or a transition condition) between a node corresponding to one step in the step set and a node corresponding to any other step in the state machine;
  • Step SB3 if the first connection relationship matches the second connection relationship, determine a node corresponding to the one step as one node in the node set;
  • step SB4 if the first connection relationship does not match the second connection relationship, the node is determined again for the one step.
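The correspondence list and the check in steps SB1 to SB4 might look like the sketch below. It assumes that connection relationships are plain successor lists and that "matching" means the two successor sets are equal once steps are translated to nodes; the application does not define the comparison, and the function and identifier names are hypothetical.

```python
def check_step_nodes(step_links, association, node_links):
    """Return (node_set, steps_to_redo) after comparing step and node links.

    step_links:  step id -> successor step ids (first connection relationship)
    association: step id -> node id (the correspondence list)
    node_links:  node id -> successor node ids (second connection relationship)
    """
    node_set, steps_to_redo = set(), []
    for step, successors in step_links.items():
        node = association[step]
        first = {association[s] for s in successors}   # SB1, expressed as nodes
        second = set(node_links.get(node, ()))         # SB2, from the state machine
        if first == second:
            node_set.add(node)                         # SB3: accept the node
        else:
            steps_to_redo.append(step)                 # SB4: node must be re-determined
    return node_set, steps_to_redo


step_links = {"preprocess": ["parse"], "parse": ["dispatch"], "dispatch": []}
association = {"preprocess": "n1", "parse": "n2", "dispatch": "n3"}

good_links = {"n1": ["n2"], "n2": ["n3"], "n3": []}
bad_links = {"n1": ["n2"], "n2": [], "n3": []}   # n2 lost its transition to n3

ok_nodes, ok_redo = check_step_nodes(step_links, association, good_links)
bad_nodes, bad_redo = check_step_nodes(step_links, association, bad_links)
```

In the second call only `parse` is flagged, because its node `n2` no longer has the transition that the step relationship requires.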
  • An embodiment of this application provides a state machine-based semantic parsing method and apparatus, in which: a function of a voice product is determined; a step set of the voice product in semantic parsing is determined according to the function of the voice product, the step set including at least two steps; a corresponding node of the state machine is determined for each step in the step set; a node set is formed from the determined nodes; and the node set is formed into the state machine of the voice product. This enhances the scalability of the voice platform.
  • An embodiment of this application provides a state machine-based semantic parsing apparatus; each unit included in the apparatus, and each module included in each unit, can be implemented by a processor in the first computing device.
  • The functions implemented by the processor may of course also be implemented by specific logic circuits; in a specific embodiment, the processor may be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), or a field programmable gate array (FPGA).
  • The first computing device may be any of various electronic devices with information processing capabilities; for example, the electronic device may be a smartphone, a notebook computer, a desktop computer, a server cluster, or the like.
  • FIG. 7 is a schematic structural diagram of a semantic parsing apparatus based on a state machine according to an embodiment of the present application.
  • The apparatus 700 includes a processor and a memory connected to the processor; the memory stores machine-readable instruction units executable by the processor; the machine-readable instruction units comprise: a first determining unit 701, a second determining unit 702, a third determining unit 703, a first forming unit 704, and a second forming unit 705, wherein:
  • the first determining unit 701 is configured to determine a function of the voice product;
  • the second determining unit 702 is configured to determine, according to the function of the voice product, a step set of the voice product in semantic parsing, where the step set includes at least two steps;
  • the third determining unit 703 is configured to determine a corresponding node of the state machine for each step in the step set;
  • the first forming unit 704 is configured to form a node set according to the determined nodes;
  • the second forming unit 705 is configured to form the node set into a state machine of the voice product.
  • The second forming unit 705 includes a first determining module 7051 and a first forming module 7052, wherein: the first determining module 7051 is configured to determine, according to the connection relationship between every two steps in the step set, the connection relationship (or transition condition) between the nodes corresponding to those two steps; and the first forming module 7052 is configured to form the state machine of the voice product according to the connection relationships (or transition conditions) between the nodes in the node set.
  • The second forming unit 705 includes an obtaining module 7053 and a second forming module 7054, wherein: the obtaining module 7053 is configured to acquire an identifier of the node corresponding to each step; and the second forming module 7054 is configured to form the state machine of the voice product from a preset state diagram according to the identifiers of the nodes corresponding to the steps.
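One way to read "forming the state machine from a preset state diagram according to node identifiers" is as selecting a sub-diagram. The sketch below assumes the preset diagram is a mapping from node identifier to successor identifiers; `select_state_machine`, `PRESET`, and the node names are hypothetical and used only for illustration.

```python
def select_state_machine(preset_transitions, node_ids):
    """Keep the chosen nodes and only the transitions that stay among them."""
    chosen = set(node_ids)
    unknown = chosen - set(preset_transitions)
    if unknown:
        raise KeyError(f"nodes not in the preset state diagram: {sorted(unknown)}")
    return {
        node: [nxt for nxt in preset_transitions[node] if nxt in chosen]
        for node in node_ids
    }


# Hypothetical preset state diagram covering every available step.
PRESET = {
    "preprocess": ["template_parse", "deep_parse"],
    "template_parse": ["dispatch", "fallback_search"],
    "deep_parse": ["dispatch"],
    "dispatch": [],
    "fallback_search": [],
}

# A product that needs neither deep parsing nor a search fallback.
watch_machine = select_state_machine(
    PRESET, ["preprocess", "template_parse", "dispatch"]
)
```

A limitation of this simple selection is that dropping an intermediate node also drops the paths through it, so the preset diagram has to contain direct transitions for every combination a product may choose.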
  • The apparatus 700 further includes a third forming unit 706 for forming the preset state diagram, where the third forming unit 706 includes a second determining module 7061, an encapsulating module 7062, a third determining module 7063, and a third forming module 7064, wherein:
  • the second determining module 7061 is configured to determine a complete set of steps in semantic parsing, where the complete set includes at least two steps, and the step set is a subset of the complete set;
  • the encapsulating module 7062 is configured to encapsulate each step of the step set into a node of the state machine;
  • the third determining module 7063 is configured to determine, according to the connection relationship between every two steps in the step set, the connection relationship (or transition condition) between the nodes corresponding to those two steps;
  • the third forming module 7064 is configured to form a state diagram according to the connection relationships between the nodes.
  • In the second approach, the encapsulating module 7062 further includes an obtaining submodule and a determining submodule, wherein:
  • the obtaining submodule is configured to acquire association information between steps and nodes;
  • the determining submodule is configured to determine, according to the association information, the corresponding state-machine node for each step in the step set.
  • The apparatus further includes a first obtaining unit, a second obtaining unit, a matching unit, and a non-matching unit, wherein:
  • the first obtaining unit is configured to acquire a first connection relationship, where the first connection relationship is the connection relationship between one step in the step set and any other step in the step set;
  • the second obtaining unit is configured to acquire a second connection relationship, where the second connection relationship is the connection relationship (or transition condition), in the state machine, between the node corresponding to the one step and the node corresponding to the other step;
  • the matching unit is configured to determine, if the first connection relationship matches the second connection relationship, the node corresponding to the one step as one node in the node set;
  • the non-matching unit is configured to determine a node again for the one step if the first connection relationship does not match the second connection relationship.
  • The apparatus further includes a judging unit, configured to judge whether the first connection relationship matches the second connection relationship and obtain a judgment result; if the judgment result indicates that the first connection relationship matches the second connection relationship, the node is determined as one node in the node set; if the judgment result indicates that they do not match, a node is determined again for the one step.
  • The apparatus may further include: a statement obtaining unit 707, which acquires a statement to be parsed of the voice product; a statement input unit 708, which inputs the statement to be parsed into the first node of the state machine; a result obtaining unit 709, which obtains an output result from the last node of the state machine; and a result output unit 710, which outputs the output result.
  • An embodiment of this application provides a state machine-based semantic parsing apparatus; each unit included in the apparatus, and each module included in each unit, can be implemented by a processor in the second computing device.
  • The functions implemented by the processor may of course also be implemented by specific logic circuits; in a specific embodiment, the processor may be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), or a field programmable gate array (FPGA).
  • The second computing device may be any of various electronic devices with information processing capabilities; for example, the electronic device may be a smartphone, a notebook computer, a desktop computer, a server cluster, or the like.
  • The apparatus 800 includes a processor and a memory connected to the processor.
  • The machine-readable instruction units comprise: a third obtaining unit 801, an input unit 802, a fourth obtaining unit 803, and an output unit 804, wherein:
  • the third obtaining unit 801 is configured to acquire a statement to be parsed of the voice product;
  • the input unit 802 is configured to input the statement to be parsed into the first node of a preset state machine;
  • the fourth obtaining unit 803 is configured to obtain an output result from the last node of the state machine;
  • the output unit 804 is configured to output the output result.
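The flow of units 801 to 804 (feed the statement to the first node, follow transitions, read the result from the last node) can be sketched as below. The representation of a transition as a `(condition, target)` pair, the `max_hops` guard, and all node names are assumptions made for this illustration; the application only speaks of connection relationships or transition conditions.

```python
def run_state_machine(nodes, transitions, first_node, statement, max_hops=100):
    """Feed the statement to the first node and return the last node's output."""
    current, data = first_node, statement
    for _ in range(max_hops):                  # guard against accidental cycles
        data = nodes[current](data)
        next_node = next(
            (target for condition, target in transitions.get(current, [])
             if condition(data)),
            None,
        )
        if next_node is None:                  # no transition fires: last node
            return data
        current = next_node
    raise RuntimeError("state machine did not reach a final node")


def parse(text):
    """Toy parser: recognises only 'play <something>'."""
    if text.startswith("play "):
        return {"intent": "play_music", "query": text[len("play "):]}
    return {"intent": None, "query": text}


def always(_):
    return True


NODES = {
    "preprocess": lambda text: text.strip().lower(),
    "parse": parse,
    "dispatch": lambda r: f"playing {r['query']}",
    "fallback": lambda r: f"searching the web for {r['query']}",
}
TRANSITIONS = {
    "preprocess": [(always, "parse")],
    "parse": [(lambda r: r["intent"] is not None, "dispatch"), (always, "fallback")],
}

music = run_state_machine(NODES, TRANSITIONS, "preprocess", "  Play Jazz ")
other = run_state_machine(NODES, TRANSITIONS, "preprocess", "weather")
```

Here `music` is produced by the `dispatch` node and `other` by the `fallback` node, which corresponds to the fallback behaviour the description mentions for products such as a browser voice assistant.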
  • The apparatus includes a first determining unit, a second determining unit, a third determining unit, a first forming unit, and a second forming unit, wherein:
  • the first determining unit is configured to determine a function of the voice product;
  • the second determining unit is configured to determine, according to the function of the voice product, a step set of the voice product in semantic parsing, where the step set includes at least two steps;
  • the third determining unit is configured to determine a corresponding node of the state machine for each step in the step set;
  • the first forming unit is configured to form a node set according to the determined nodes;
  • the second forming unit is configured to form the node set into a state machine of the voice product.
  • The state machine formed by the first computing device may run on the first computing device, or may run as a functional module on the second computing device.
  • the second computing device may be a server of the voice product or a terminal of the voice product.
  • the state machine formed by the first computing device may be output to the server of the voice product or may be output to the terminal of the voice product.
  • the system 900 of the first mode includes a first computing device 901, a second computing device 902, and a terminal 903, where:
  • the first computing device 901 is configured to form a state machine (such as the foregoing method or the embodiment shown in Figure 8), and then output the formed state machine to the second computing device 902;
  • A client of the voice product (for example, a mobile phone voice assistant or a browser voice assistant) is installed on the terminal 903. The user opens the client on the terminal and speaks a sentence, and the client sends the received voice to the second computing device 902.
  • The second computing device 902 is a server serving the terminal 903 and is configured to run the state machine output by the first computing device 901. It is further configured to receive the voice sent by the client of the terminal 903, perform speech recognition preprocessing on the voice to obtain a statement to be parsed, input the statement to be parsed into the state machine running on the second computing device 902, obtain the output result of the state machine, and return the output result to the client of the terminal 903. Finally, the client of the terminal 903 outputs the result to the user.
  • the second mode system 900 includes a first computing device 901 and a second computing device 902, where:
  • the first computing device 901 is configured to form a state machine (such as the foregoing method or the embodiment shown in Figure 8), and then output the formed state machine to the second computing device 902;
  • The second computing device 902 serves as the terminal, and a client of the voice product (for example, a mobile phone voice assistant such as Apple's Siri, or a browser voice assistant) is installed on it. The user opens the client on the terminal and speaks a sentence, and the client sends the received voice to the state machine running on the second computing device 902. After the state machine runs, it sends its output result to the client, and the client outputs the result to the user.
  • the state machine can be independent of the client, wherein the client includes detection means for detecting what the user said, and then the detecting means sends the voice to the state machine running on the second computing device 902.
  • If the foregoing state machine-based semantic parsing method is implemented in the form of a software function module and sold or used as a standalone product, it may also be stored in a computer-readable storage medium.
  • The technical solutions of the embodiments of this application may, in essence, be embodied in the form of a software product that is stored in a storage medium and includes a plurality of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of this application.
  • The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc.
  • An embodiment of this application further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, perform the state machine-based semantic parsing method of the embodiments of this application.
  • An embodiment of this application further provides a computing device, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the program, the processor implements the state machine-based semantic parsing method of the embodiments of this application.
  • FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • The computing device 1000 may include: at least one processor 1001, at least one communication bus 1002, a user interface 1003, at least one external communication interface 1004, and at least one memory 1005.
  • the communication bus 1002 is used to implement connection communication between these components.
  • the user interface 1003 can include a display screen and a keyboard.
  • External communication interface 1004 can include standard wired and wireless interfaces.
  • the disclosed apparatus and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • In actual implementation there may be other divisions; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • The coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
  • the units described above as separate components may or may not be physically separated, and the components displayed as the unit may or may not be physical units; they may be located in one place or distributed on multiple network units; Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • The functional units in the embodiments of this application may all be integrated into one processing unit, or each unit may serve separately as one unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • The foregoing program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the foregoing method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM), a magnetic disk, or an optical disc.
  • the above-described integrated unit of the present application may be stored in a computer readable storage medium if it is implemented in the form of a software function module and sold or used as a stand-alone product.
  • The technical solutions of the embodiments of this application may, in essence, be embodied in the form of a software product that is stored in a storage medium and includes a plurality of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of this application.
  • the foregoing storage medium includes various media that can store program codes, such as a mobile storage device, a ROM, a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Abstract

A semantic parsing method based on a state machine. The method comprises: determining a function of a speech product (S601); determining a step set of the speech product in semantic parsing according to the function of the speech product, the step set comprising at least two steps (S602); determining a corresponding node of a state machine for each step in the step set (S603); forming a node set according to the determined nodes (S604); and forming the state machine of the speech product by using the node set (S605). Also disclosed are a semantic parsing apparatus based on a state machine, and a storage medium.

Description

Semantic parsing method and apparatus, and storage medium
This application claims priority to Chinese Patent Application No. 201710099405.9, entitled "State machine-based semantic parsing method, apparatus, and device", filed with the Chinese Patent Office on February 23, 2017, which is incorporated herein by reference in its entirety.
Technical field
This application relates to speech parsing technology, and in particular to a state machine-based semantic parsing method and apparatus, and a storage medium.
Background of the invention
A voice assistant is an intelligent terminal application that helps users solve problems, mainly everyday-life problems, through intelligent dialogue and instant question answering. A voice assistant is a voice-controlled application (App): it collects the user's speech through the sound-collection hardware on the terminal, recognizes the speech using speech recognition technology, performs semantic judgment on the recognized speech, and then responds promptly in the foreground. It can also hold a voice chat with the user through the microphone, or follow the user's commands to help the user operate the smart terminal. As can be seen, a voice assistant is a type of application that uses voice interaction to replace all or part of the user's queries and operations on a terminal such as a mobile phone. With such voice applications, users can operate their mobile phones much more conveniently in different service scenarios.
Summary of the invention
To solve at least one problem existing in the prior art, the embodiments of this application provide a state machine-based semantic parsing method and apparatus, and a storage medium, which can enhance the scalability of a voice platform.
The technical solutions of the embodiments of this application are implemented as follows:
An embodiment of this application provides a state machine-based semantic parsing method. The method is applied to a server and includes:
determining a function of a voice product;
determining, according to the function of the voice product, a step set of the voice product in semantic parsing, where the step set includes at least two steps, and the at least two steps are used to complete at least the following operations: preprocessing a voice instruction input by a user, parsing the voice instruction, and invoking a corresponding function according to the parsing result;
determining a corresponding node of the state machine for each step in the step set;
forming a node set according to the determined nodes; and
forming the node set into a state machine of the voice product, so that the server parses the voice instruction input by the user according to the state machine and, according to the parsing result, provides the user with the function corresponding to the voice instruction.
An embodiment of this application provides a state machine-based semantic parsing method. The method is applied to a server and includes:
acquiring a statement to be parsed of a voice product;
inputting the statement to be parsed into the first node of a preset state machine, where each node of the state machine corresponds to one step in a step set of semantic parsing; the step set is determined according to the function provided by the voice product and includes at least two steps; and the at least two steps are used to complete at least the following operations: preprocessing the voice instruction, parsing the user's voice instruction, and invoking a corresponding function according to the parsing result;
obtaining an output result from the last node of the state machine; and
outputting the output result.
An embodiment of this application provides a state machine-based semantic parsing apparatus. The apparatus is applied to a server and includes a processor and a memory connected to the processor; the memory stores machine-readable instruction units executable by the processor; the machine-readable instruction units include a first determining unit, a second determining unit, a third determining unit, a first forming unit, and a second forming unit, wherein:
the first determining unit is configured to determine a function of a voice product;
the second determining unit is configured to determine, according to the function of the voice product, a step set of the voice product in semantic parsing, where the step set includes at least two steps, and the at least two steps are used to complete at least the following operations: preprocessing a voice instruction input by a user, parsing the voice instruction, and invoking a corresponding function according to the parsing result;
the third determining unit is configured to determine a corresponding node of the state machine for each step in the step set;
the first forming unit is configured to form a node set according to the determined nodes;
the second forming unit is configured to form the node set into a state machine of the voice product, so that the server parses the voice instruction input by the user according to the state machine and, according to the parsing result, provides the user with the function corresponding to the voice instruction.
An embodiment of this application provides a state machine-based semantic parsing apparatus. The apparatus is applied to a server and includes a processor and a memory connected to the processor; the memory stores machine-readable instruction units executable by the processor; the machine-readable instruction units include a third obtaining unit, an input unit, a fourth obtaining unit, and an output unit, wherein:
the third obtaining unit is configured to acquire a statement to be parsed of a voice product;
the input unit is configured to input the statement to be parsed into the first node of a preset state machine, where each node of the state machine corresponds to one step in a step set of semantic parsing; the step set is determined according to the function provided by the voice product and includes at least two steps; and the at least two steps are used to complete at least the following operations: preprocessing the voice instruction, parsing the user's voice instruction, and invoking a corresponding function according to the parsing result;
the fourth obtaining unit is configured to obtain an output result from the last node of the state machine;
the output unit is configured to output the output result.
A non-volatile computer-readable storage medium stores one or more programs, the one or more programs including instructions that, when executed by a computing device, cause the computing device to perform the method of the first aspect or the second aspect described above.
Brief description of the drawings
FIG. 1A is a schematic structural diagram of a system to which the semantic parsing method of an embodiment of this application is applicable;
FIG. 1B is a schematic flowchart of an implementation of the state machine-based semantic parsing method of an embodiment of this application;
FIG. 2 is a state diagram of a finite state machine of an elevator door in an embodiment of this application;
FIG. 3 is a state diagram of a state machine configuration in this embodiment;
FIG. 4 is a schematic flowchart of semantic parsing in an embodiment of this application;
FIG. 5 is a schematic flowchart of semantic parsing in an embodiment of this application;
FIG. 6 is a schematic flowchart of an implementation of the state machine-based semantic parsing method of an embodiment of this application;
FIG. 7 is a schematic structural diagram of a state machine-based semantic parsing apparatus of an embodiment of this application;
FIG. 8 is a schematic structural diagram of a state machine-based semantic parsing apparatus of an embodiment of this application;
FIG. 9 is a schematic diagram of a network architecture of an embodiment of this application;
FIG. 10 is a schematic structural diagram of an electronic device of an embodiment of this application.
实施方式Implementation
下面结合附图和具体实施例对本申请的技术方案进一步详细阐述。在本申请实施例的一种语义解析方案中,假设甲公司开设有浏览器业务和视频业务,其中这两项业务都需要进行语义解析,因为都嵌入有语音助手,以帮助那些不喜欢进行文字输入或者不具有写能力的用户。这样,用户可以在该甲公司视频业务的web页面上搜索自己感兴趣的电影,在浏览器业务的web页面上搜索自己感兴趣的关键词。由于开展视频业务和开展浏览器业务都需要用到语音解析器,因此,该甲公司将这两项业务整合在一个语音平台上;但是由于视频业务的业务数据规模、字段与浏览器业务的业务数据规模、字段都存在很大的差异,因此,在语音平台中分别为每一业务搭建一个语义解析器。当甲公司要开展一项音乐业务(如QQ音乐)时,该甲公司还需要为该音乐业务搭建一个适用于音乐业务的语义解析器,以便用户可以在即时通讯(QQ)上搜索自己感兴趣的音乐。由此可见,该语音平台虽然将各个业务放置在一起,但是并没有做到实际意义上的整合。The technical solutions of the present application are further elaborated below in conjunction with the accompanying drawings and specific embodiments. In a semantic analysis solution of the embodiment of the present application, it is assumed that Company A has a browser service and a video service, and both of these services require semantic analysis because voice assistants are embedded to help those who do not like to perform text. Enter or not have the ability to write. In this way, the user can search for the movie of interest on the web page of the company's video service, and search for the keyword of interest on the web page of the browser service. Since the voice parser is required for both the video service and the browser business, the company integrates the two services on one voice platform; however, due to the business data size, field and browser service of the video service There are big differences in data size and fields. Therefore, a semantic parser is built for each service in the voice platform. When Company A wants to launch a music business (such as QQ music), the company also needs to build a semantic parser for the music business, so that users can search for their interest in instant messaging (QQ). Music. It can be seen that although the voice platform puts various businesses together, it does not integrate in a practical sense.
In addition, a background service performing semantic parsing can choose among many concrete parsing methods, such as traditional regular-expression templates and deep learning. Different products also need different scenes and corresponding services. A speaker, for example, only needs to parse a limited set of scenes such as music, weather, and reminders, whereas making calls and sending text messages are essential scenes for the voice assistant of Micro Desktop. Voice products also differ in their requirements for front-end adaptation and fallback handling: for a browser voice assistant, jumping to a search page is a reasonable choice when no parsed semantics can be provided, but that logic does not suit a watch voice assistant. With so many variables in the parsing process, hard-coding all of the logic means that code must be rewritten whenever a new method or a new product is connected, which is very inflexible.
To use resources more rationally, the following embodiments apply a finite state machine to semantic parsing: every possible step in the semantic parsing flow is abstracted as a node in the state machine. This makes it easy for developers to add or delete a step, and lets each step be customized when a product is connected, producing a semantic parsing model suited to the service. Researchers can thus update parsing methods flexibly, and a voice product can customize its parsing flow when it is connected. With the technical solution provided by the embodiments, the voice platform is improved so that resources are used more rationally and building a semantic parser for a newly connected service is no longer difficult.
For a better understanding of the embodiments of the present application, the following terms are explained:
Voice assistant: software that provides services to a user based on the user's voice input.
Voice platform: the voice platform in this embodiment is an improvement on existing voice platforms and can provide semantic parsing services for multiple products.
Scene: the domain to which an utterance belongs. For example, "I want to listen to music" belongs to the music scene, and "Tell me a joke" belongs to the joke scene.
Semantic parsing: parsing an utterance into a scene, an intent, and parameters that a computer can recognize. For example, for "I want to listen to Ice Rain" (冰雨), the scene is music, the intent is listen, and the parameter is Ice Rain.
Micro Desktop: a desktop product of the Intelligent Platform Department.
Finite state machine: a finite-state machine (FSM, or state machine for short) is a mathematical model of a finite number of states and of behaviors such as the transitions and actions among those states.
Named entity (NER): for example, "Ice Rain".
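To make the "semantic parsing" entry concrete, the following Python sketch shows one possible way to represent a parse result as a scene, an intent, and parameters. The class name `ParseResult`, its field names, and the toy `parse` rule are assumptions made for this illustration; the application does not define them.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ParseResult:
    """Hypothetical container for the output of semantic parsing."""
    scene: str
    intent: str
    params: Dict[str, str] = field(default_factory=dict)


def parse(utterance: str) -> Optional[ParseResult]:
    """Toy rule: treat 'I want to listen to X' as a music request for X."""
    prefix = "I want to listen to "
    if utterance.startswith(prefix) and len(utterance) > len(prefix):
        return ParseResult("music", "listen", {"song": utterance[len(prefix):]})
    return None  # the utterance is outside this toy rule


result = parse("I want to listen to Ice Rain")
```

A real parser would use the methods named in the description (templates, NER, deep learning); the single string rule here only illustrates the shape of the output.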
The technical solutions of the present application are described in further detail below with reference to the accompanying drawings and specific embodiments.
Before the embodiments are described, some background on state machines is given. An FSM consists of a finite set of states and the transitions among them, and at any moment it is in exactly one of those states. When it receives an input event, the state machine produces an output, possibly accompanied by a state transition. A finite state machine has the following components:
State: the basic building block of the behavioral model, reflecting the stage and activity of an object in the system (for example, the preprocessing state);
Transition: the process by which an object moves from one state to another (for example, from the preprocessing state to the semantic-algorithm parsing state);
Transition condition: the event or condition that triggers a state change (for example, the "parsing needed" condition);
Action: the operation the object performs before a state transition (for example, the preprocessing action).
FIG. 1A is a schematic structural diagram of a system to which the semantic parsing method of an example of the present application applies. The system includes at least a terminal device 101, a first server 102, a second server 103, and a network 104.
The terminal device 101 is a device with data computing and processing capability, including but not limited to a smartphone (with a communication module installed), a handheld computer, a tablet computer, and a PC. An operating system is installed on each terminal device 101, including but not limited to Android, Symbian, Windows Mobile, and Apple iPhone OS.
An application client (for example, a voice assistant app) is installed on the terminal device 101. Through the network 104, the application client exchanges information with the corresponding application server software (for example, voice assistant application server software) installed on the first server 102 (for example, a voice assistant server), providing the client's intelligent interaction functions such as intelligent dialogue and instant question answering.
Application server software for building state machines is installed on the second server 103 (for example, a voice platform server). The second server 103 sends each state machine it builds to the first server 102 through the network 104, so that the first server 102 uses the state machine to parse and answer the user speech sent by the application client.
The network 104 may be a wired network or a wireless network.
According to the solution provided by the embodiments, when a new voice service needs to be added, the voice platform server can customize the parsing steps as required and thereby build a new semantic parser for that service. The voice platform server can therefore be extended quickly to new services. This solves the problem that, because the data scale and fields of different service scenarios differ greatly, a separate voice platform server had to be created for each voice service. For an information service provider, the voice platform provided by the embodiments integrates the voice services of the various businesses in a practical sense.
FIG. 1B shows the flow of the state-machine-based semantic parsing method provided by the embodiments. For example, when a user says "I want to listen to Ice Rain" to a voice product (a speaker or a music client), the background (for example, the server corresponding to the music client) works as follows: it detects the utterance, assigns a state machine according to the source of the utterance, feeds the utterance into the assigned state machine, obtains the state machine's output, and returns the output to the user. The background work is thus controlled entirely by the different state machines.
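The background flow of FIG. 1B (assign a state machine by source, feed in the utterance, return the output) can be sketched as a simple dispatch table. This is a minimal Python sketch under stated assumptions: the source keys `"speaker"` and `"browser"` and the stand-in machine functions are hypothetical, and each "state machine" is reduced to a plain callable.

```python
from typing import Callable, Dict

# A state machine is modeled here as a callable from statement to reply.
StateMachine = Callable[[str], str]


def speaker_machine(statement: str) -> str:
    """Stand-in for the state machine configured for a speaker product."""
    return f"[speaker] handled: {statement}"


def browser_machine(statement: str) -> str:
    """Stand-in for the state machine configured for a browser assistant."""
    return f"[browser] handled: {statement}"


# Hypothetical registry: one state machine per utterance source.
MACHINES: Dict[str, StateMachine] = {
    "speaker": speaker_machine,
    "browser": browser_machine,
}


def handle(source: str, statement: str) -> str:
    """Assign a machine by source, run it, and return its output."""
    machine = MACHINES[source]
    return machine(statement)


reply = handle("speaker", "I want to listen to Ice Rain")
```

Connecting a new product then amounts to registering another entry in the table rather than changing `handle`.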
FIG. 2 shows the state diagram of a finite state machine for an elevator door. The diagram contains two states: state 1 is "opened" and state 2 is "closed". The action on entering state 1 is opening the door, and the action on entering state 2 is closing the door. The transition conditions between state 1 and state 2 are "open" and "close".
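The elevator-door machine of FIG. 2 is small enough to write out in full. The Python sketch below is one possible encoding; the class name, the event strings `"open"` and `"close"`, and the choice to ignore events that have no matching transition are assumptions of this example.

```python
class ElevatorDoor:
    """Two-state machine: 'opened' (state 1) and 'closed' (state 2)."""

    # (current state, transition condition) -> next state
    TRANSITIONS = {
        ("closed", "open"): "opened",
        ("opened", "close"): "closed",
    }
    # Action performed when a state is entered.
    ENTRY_ACTIONS = {
        "opened": "opening the door",
        "closed": "closing the door",
    }

    def __init__(self, state: str = "closed") -> None:
        self.state = state
        self.log = []  # entry actions performed so far

    def handle(self, event: str) -> str:
        """Apply an event; events with no matching transition are ignored."""
        next_state = self.TRANSITIONS.get((self.state, event))
        if next_state is not None:
            self.log.append(self.ENTRY_ACTIONS[next_state])
            self.state = next_state
        return self.state


door = ElevatorDoor()
states = [door.handle(e) for e in ("open", "open", "close")]
```

The second `"open"` event leaves the door in the opened state and performs no action, because no transition is defined for it.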
The following describes in detail how the FSM model is applied to semantic parsing. The semantic parsing state machine of the embodiments is implemented as follows:
First, a unified interface is designed for the states, transitions, conditions, and actions of the state machine.
For example, they use a uniform format and a language that each can recognize.
Second, every step in semantic parsing inherits from the unified interface and is encapsulated as a node in the state machine.
Finally, all nodes are connected into a state diagram, and the state machine containing all semantic parsing steps is run.
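One way to read "inherits from the unified interface" is as an abstract base class that every parsing step subclasses. The Python sketch below assumes a hypothetical interface with a single `run` method that returns a condition name and the data to pass on; the application does not specify the interface's actual shape.

```python
from abc import ABC, abstractmethod
from typing import Any, Tuple


class Node(ABC):
    """Hypothetical unified interface for every step in the parsing flow."""

    name: str

    @abstractmethod
    def run(self, data: Any) -> Tuple[str, Any]:
        """Perform the step's action; return (condition, output data)."""


class Preprocess(Node):
    name = "preprocess"

    def run(self, data: Any) -> Tuple[str, Any]:
        # Toy action: normalize whitespace, then always request parsing.
        return "need_parse", data.strip()


class Faq(Node):
    name = "faq"

    def run(self, data: Any) -> Tuple[str, Any]:
        # Toy action: return a fixed general answer.
        return "answered", "Sorry, I did not understand."


# Steps become interchangeable nodes that can be looked up by name.
nodes = {node.name: node for node in (Preprocess(), Faq())}
```

Because every step exposes the same method, a runner can drive any combination of nodes without knowing what each one does internally.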
In general, the speech parsing process includes the following steps:
Step S1: preprocessing.
Generally, the voice server converts the user's speech into text to be processed (the statement to be parsed) through speech recognition, and then determines whether the statement needs further parsing. If it does, the flow enters step S2, in which the statement is parsed by a semantic parsing method; if it does not, the flow enters step S3, in which a vertical service is called.
Speech recognition converts a speech signal into text symbols that a computer can recognize; it is the technology that lets a machine understand what a person says.
For example, a music client (such as a music app) captures the user's speech through the local sound pickup device of the mobile terminal (such as a built-in microphone) and sends it to the voice server. The voice server preprocesses the speech and recognizes the text to be processed, "I want to listen to Ice Rain". If the voice server must perform semantic recognition on this text to understand the user's intent, further parsing is needed, and the state machine moves from the preprocessing state to the semantic-parsing-algorithm state, that is, step S2. If the voice server can already determine the user's intent in the preprocessing state, no further parsing is needed and the flow goes directly to step S3, in which a vertical service is called.
Step S2: parse the statement to be parsed by a semantic parsing method.
The semantic parsing methods include deep learning methods, multi-scene parsing templates, NER plus vocabulary templates, and regular-expression templates.
If parsing succeeds, the flow enters step S3 and a vertical service is called; if parsing fails, the flow enters step S4 and a general answer (Frequently Asked Questions, FAQ) is called.
Parsing succeeds when the voice server can determine the user's intent (for example, it determines from the text "I want to listen to Ice Rain" that the user wants to hear the song "Ice Rain"); the flow then enters step S3.
Parsing fails when the voice server cannot determine the user's intent (for example, it cannot determine from the text "I want to listen to Ice Rain" that the user wants to hear the song "Ice Rain"); the flow then enters step S4.
Step S3: call a vertical service.
Vertical services may include a music scene service, a map scene service, a video scene service, a food-ordering scene service, and so on.
If the wrong vertical service was called, the flow returns to step S2 and the text to be processed is parsed again; if the call fails, the flow re-enters step S3 and the vertical service is called again; if the call succeeds, the flow ends (enters the end state).
Step S4: call a general answer (FAQ).
For example, if the music server finds no result when looking for a song, it returns a general answer to the music client, which may then say "Song not found". Likewise, if the user's statement cannot be recognized, a general answer such as the spoken message "Unable to recognize" may be returned. After the general answer is returned to the user, the flow ends (enters the end state).
Step S5: perform a local search on the statement to be parsed.
Some voice products also need a search service. In that case the flow enters step S5 from step S4 unconditionally; after the local search, the flow ends (enters the end state).
Step S6: the flow ends.
Taking the six steps above as an example, each step corresponds to one state in FIG. 3: steps S1 to S6 correspond to states 31 to 36, respectively, and the relationships among steps S1 to S6 correspond to the state transition conditions among states 31 to 36. For example, the relationship (connection) between step S1 and step S2 is: determine whether the statement to be parsed needs further parsing, and if it does, enter step S2; the corresponding transition condition between state 31 and state 32 is "parsing needed". Likewise, the relationship between step S1 and step S3 is: determine whether the statement needs further parsing, and if it does not, enter step S3; the corresponding transition condition between state 31 and state 33 is "parsing succeeded".
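The mapping from steps S1 to S6 onto states 31 to 36 can be written as a transition table. The Python sketch below is an illustration only: the condition names are hypothetical labels for the conditions described in the text, FIG. 3 itself is not reproduced here, and the unconditional S4 to S5 move is modeled with an explicit `"to_search"` label so that products without a search step can still go from state 34 to state 36.

```python
# (current state, condition) -> next state; states follow FIG. 3 numbering.
TRANSITIONS = {
    (31, "need_parse"): 32,        # S1 -> S2: further parsing needed
    (31, "parse_succeeded"): 33,   # S1 -> S3: label as given in the text
    (32, "parse_succeeded"): 33,   # S2 -> S3
    (32, "parse_failed"): 34,      # S2 -> S4
    (33, "wrong_service"): 32,     # S3 -> S2: parse again
    (33, "call_failed"): 33,       # S3 -> S3: call again
    (33, "call_succeeded"): 36,    # S3 -> S6
    (34, "answered"): 36,          # S4 -> S6
    (34, "to_search"): 35,         # S4 -> S5 (products with search)
    (35, "searched"): 36,          # S5 -> S6
}


def walk(conditions, start=31):
    """Return the list of states visited for a sequence of conditions."""
    path = [start]
    for condition in conditions:
        path.append(TRANSITIONS[(path[-1], condition)])
    return path


happy_path = walk(["need_parse", "parse_succeeded", "call_succeeded"])
fallback_path = walk(["need_parse", "parse_failed", "to_search", "searched"])
```

An undefined (state, condition) pair raises `KeyError`, which makes a missing connection in the table visible immediately.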
In other embodiments of the present application, FIG. 4 is a schematic flowchart of semantic parsing. As shown in FIG. 4, the semantic parsing flow may include the following steps:
Step S401: preprocessing.
See step S1 in the foregoing embodiment.
Step S402: call a semantic parsing method to perform semantic parsing.
The semantic parsing methods include deep learning methods, multi-scene parsing templates, NER plus vocabulary templates, and regular-expression templates.
Step S403: semantic disambiguation.
Step S404: adaptation logic.
When the semantics parsed in step S402 are ambiguous or have multiple meanings, step S403 is performed to disambiguate them. After disambiguation, the adaptation logic of step S404 determines the user's true intent, so that the scene can be selected in step S405.
Step S405: search the vertical scenes.
The vertical scenes include a remove-phone scene, a remove-SMS scene, a music scene, a joke scene, an eating scene, a food-ordering scene, a purchasing scene, a meal-preparation scene, a cooking scene, and so on.
Step S406: fallback operation. Fallback operations generally include FAQ, encyclopedia search, jumping to a search page, and open-domain search.
Regarding step S403: many words have several meanings, while in a specific context a word has one particular meaning. When a word is considered independently of its context, semantic ambiguity generally arises. The task of disambiguation is to determine which sense a polysemous word takes in a particular context; the specific sense can be determined by considering the context in which the word is used.
A relatively simple approach is to determine the senses of a word from its definitions in a dictionary. For most words, however, senses and usages cannot simply be enumerated from dictionary definitions: some of the senses listed in a dictionary can be clearly distinguished, but most are indeterminate and blend into one another. Worse, a dictionary can list only a limited number of senses for each word, and the sense a word takes in an actual context may not be among them. A word may also have different parts of speech. Determining the part of speech is a tagging task and is not covered here, but determining it can effectively help eliminate lexical ambiguity. Three disambiguation approaches are introduced: 1. Supervised disambiguation, based on an annotated training set. 2. Dictionary-based disambiguation, built on dictionary resources. 3. Unsupervised disambiguation, in which unannotated text is used for training.
A product does not need all of the steps in FIG. 4: it selects only the semantic parsing methods that suit it, and one or two fallback operations are enough. Taking a browser voice assistant as an example, its flow is a subset of FIG. 4. As shown in FIG. 5, the browser speech parsing flow includes:
Step S501: preprocessing.
Step S502: call a semantic parsing method to perform semantic parsing.
The semantic parsing methods include deep learning methods, multi-scene parsing templates, and NER plus vocabulary templates.
Step S503: semantic disambiguation.
Step S504: adaptation logic.
Step S505: search the vertical scenes.
The vertical scenes include a remove-phone scene and a remove-SMS scene.
Step S506: fallback operation. Fallback operations generally include encyclopedia search and jumping to a search page.
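The relationship between the full flow of FIG. 4 and the browser flow of FIG. 5 can be checked mechanically as a subset test. The Python sketch below is an assumption-laden illustration: the option identifiers are invented English labels for the parsing methods and fallback operations listed in the text, and the dictionary layout is hypothetical.

```python
# Options available in the full flow (FIG. 4), grouped by step.
FULL_FLOW = {
    "parsing_methods": {
        "deep_learning", "multi_scene_template",
        "ner_vocab_template", "regex_template",
    },
    "fallbacks": {
        "faq", "encyclopedia_search",
        "jump_to_search_page", "open_domain_search",
    },
}

# Options selected for the browser voice assistant (FIG. 5).
BROWSER_FLOW = {
    "parsing_methods": {
        "deep_learning", "multi_scene_template", "ner_vocab_template",
    },
    "fallbacks": {"encyclopedia_search", "jump_to_search_page"},
}


def is_valid_product_flow(product, full):
    """A product flow is valid if every option it picks exists in the full flow."""
    return all(
        key in full and options <= full[key]
        for key, options in product.items()
    )


browser_ok = is_valid_product_flow(BROWSER_FLOW, FULL_FLOW)
```

A product that names an option absent from the full flow would fail this check before any state machine is built.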
Based on the foregoing embodiments, an embodiment of the present application provides a state-machine-based semantic parsing method applied to a first computing device. The functions of the method can be implemented by a processor in the first computing device calling program code, and the program code can be stored in a computer storage medium; the first computing device therefore includes at least a processor and a storage medium.
FIG. 6 is a schematic flowchart of the implementation of the state-machine-based semantic parsing method according to an embodiment of the present application. As shown in FIG. 6, the method can be applied to a voice server and includes:
Step S601: determine the function of the voice product;
For a speaker, the function of the voice product is to search for and play songs according to the user's voice commands; for an air conditioner, the function is to set operating parameters such as temperature, humidity, and duration according to the user's voice commands and to operate with those parameters; for a browser voice assistant, the function is to search according to the user's voice commands and return results; for a voice chat assistant, the function is to converse according to the user's speech.
Step S602: determine, according to the function of the voice product, the step set of the voice product in semantic parsing, the step set including at least two steps;
Step S603: determine, for each step in the step set, a corresponding node of the state machine;
Step S604: form a node set from the determined nodes;
Step S605: form the state machine of the voice product from the node set.
In implementation, the functions or steps in the embodiments can be expressed in a configuration file. For example, <machine></machine> delimits the definition of one state machine; the content under <state></state> gives each state name and the action corresponding to that state, where the action is implemented by a class of the unified interface; and the content under <transmition></transmition> defines the transitions. Each definition has the format: transition = current state | condition | next state.
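The text gives only the tag names and the transition format, not a complete file, so the configuration below is a guess at what such a file might look like. In this Python sketch the `name=value` line layout inside each tag, the state names, the action class names, and the condition names are all assumptions; only the tags `<machine>`, `<state>`, `<transmition>` and the `current state | condition | next state` format come from the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration; the line layout inside the tags is assumed.
CONFIG = """
<machine>
  <state>
    preprocess=PreprocessAction
    parse=ParseAction
    vertical=VerticalServiceAction
    faq=FaqAction
    end=EndAction
  </state>
  <transmition>
    t1=preprocess|need_parse|parse
    t2=preprocess|no_parse_needed|vertical
    t3=parse|parse_ok|vertical
    t4=parse|parse_failed|faq
    t5=vertical|call_ok|end
    t6=faq|answered|end
  </transmition>
</machine>
"""


def _entries(text):
    """Yield (key, value) pairs from 'key=value' lines, skipping blanks."""
    for line in text.splitlines():
        line = line.strip()
        if line:
            key, _, value = line.partition("=")
            yield key.strip(), value.strip()


def load_machine(config_text):
    """Return (states, transitions) parsed from the configuration text."""
    root = ET.fromstring(config_text)
    states = dict(_entries(root.findtext("state", default="")))
    transitions = {}
    for _name, spec in _entries(root.findtext("transmition", default="")):
        current, condition, nxt = (part.strip() for part in spec.split("|"))
        if current not in states or nxt not in states:
            raise ValueError(f"transition uses an undefined state: {spec}")
        transitions[(current, condition)] = nxt
    return states, transitions


states, transitions = load_machine(CONFIG)
```

Keeping the flow in a file like this is what allows a product's parsing flow to change without re-coding: only the configuration is edited.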
It should be noted that after the first computing device forms the state machine, it may run the state machine itself, or output the state machine to a second computing device, which then runs it. Whichever device runs the state machine, the method further includes:
Step S606: obtain the statement to be parsed of the voice product;
Step S607: input the statement to be parsed into the first node of the preset state machine;
Step S608: obtain the output result from the last node of the state machine;
Step S609: output the output result.
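Steps S606 to S609 describe a run loop: feed the statement into the first node, follow transitions, and read the result from the last node. The Python sketch below illustrates that loop with toy nodes; the node functions, the condition names, the `"play "` rule, and the `max_steps` guard are all assumptions of this example rather than details from the application.

```python
# Each toy node takes data and returns (condition, output data).
def preprocess(text):
    return "need_parse", text.strip()


def parse(text):
    if text.startswith("play "):
        return "parse_ok", {"song": text[len("play "):]}
    return "parse_failed", text


def vertical(parsed):
    return "done", f"Playing {parsed['song']}"


def faq(_text):
    return "done", "Sorry, I did not understand."


def end(result):
    return "", result  # last node: passes the result through


NODES = {"preprocess": preprocess, "parse": parse,
         "vertical": vertical, "faq": faq, "end": end}

TRANSITIONS = {
    ("preprocess", "need_parse"): "parse",
    ("parse", "parse_ok"): "vertical",
    ("parse", "parse_failed"): "faq",
    ("vertical", "done"): "end",
    ("faq", "done"): "end",
}


def run_machine(statement, first="preprocess", last="end", max_steps=20):
    """Feed the statement into the first node; return the last node's output."""
    state, data = first, statement
    for _ in range(max_steps):  # guard against cycles in a bad configuration
        condition, data = NODES[state](data)
        if state == last:
            return data
        state = TRANSITIONS[(state, condition)]
    raise RuntimeError("state machine did not reach its last node")


answer = run_machine("  play Ice Rain ")
```

The runner knows nothing about individual steps, so swapping the tables changes the product's behavior without touching `run_machine`.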
With the technical solution of the embodiments, connecting any new voice product requires no re-coding: a parsing flow is simply customized to the product's requirements, which is simple, flexible, and efficient, and gives a good user experience. For example, when a user asks the browser voice assistant "Who is person A (for example, Li Xiaopeng)?", the user sees a jump to the search page, where the browser searches for the keyword "person A". In Micro Desktop, by contrast, the encyclopedia entry for person A is returned directly.
Several ways of implementing step S605, "form the state machine of the voice product from the node set", are given below:
Manner 1: step S603, "determine, for each step in the step set, a corresponding node of the state machine", includes: determining, according to the connection relationship between each step and the other steps in the step set, the transition conditions from the node corresponding to that step to the nodes corresponding to the other steps. Correspondingly, step S605 includes: forming the state machine of the voice product from the node set according to the transition conditions.
Manner 2: forming the state machine of the voice product from the node set includes: determining, according to the connection relationship between every two steps in the step set, the connection relationship between the nodes corresponding to those two steps; and forming the state machine of the voice product according to the connection relationships among the nodes in the node set.
Here, "every two steps" means every possible pair of steps in the step set. If the step set includes steps a, b, c, and d, the pairs are: a and b, a and c, a and d, b and c, b and d, and c and d.
For the connection relationship (association) between two steps, see steps S1 and S2 above. For example, the relationship between step S1 and step S2 is: determine whether the statement to be parsed needs further parsing, and if it does, enter step S2; the corresponding transition condition between state 31 and state 32 is "parsing needed". Likewise, the relationship between step S1 and step S3 is: determine whether the statement needs further parsing, and if it does not, enter step S3; the corresponding transition condition between state 31 and state 33 is "parsing succeeded".
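Manner 2 can be sketched in two moves: enumerate every pair of steps, then carry each pair's connection over to the corresponding nodes. In the Python sketch below, the step names a to d follow the text, while `STEP_LINKS`, `NODE_OF`, and the condition labels are hypothetical values chosen for illustration.

```python
from itertools import combinations

steps = ["a", "b", "c", "d"]

# Every possible pair of steps, in the order listed in the text.
pairs = list(combinations(steps, 2))

# Hypothetical connection relationships: only some pairs are connected.
STEP_LINKS = {("a", "b"): "need_parse", ("b", "c"): "parse_ok"}

# Hypothetical step-to-node correspondence.
NODE_OF = {step: f"node_{step}" for step in steps}

# Connection between two nodes = connection between their two steps.
node_links = {
    (NODE_OF[s], NODE_OF[t]): STEP_LINKS[(s, t)]
    for (s, t) in pairs
    if (s, t) in STEP_LINKS
}
```

`combinations` yields unordered pairs, which matches the enumeration in the text; a flow with backward transitions (such as S3 returning to S2) would also need the reversed pairs, for which `itertools.permutations` could be used instead.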
Manner 3: forming the state machine of the voice product from the node set includes: obtaining the identifier of the node corresponding to each step; and forming the state machine of the voice product from those node identifiers according to a preset state diagram.
Manner 3 includes a process of forming the preset state diagram, which includes:
Step SA1: determine the full set of steps in semantic parsing, the full set including at least two steps and the step set being a subset of the full set;
Here, the full set and the step set may contain the same number of steps, or the full set may contain more steps than the step set; "subset" therefore also covers the case in which the step set contains as many steps as the full set.
Step SA2: encapsulate each step in the full set as a node of the state machine;
Step SA3: determine, according to the connection relationship between every two steps in the full set, the connection relationship (or transition condition) between the nodes corresponding to those two steps;
Step SA4: form the state diagram according to the connection relationships (or transition conditions) among the nodes.
Here, in step A2, determining, for each step in the step set, a corresponding node of the state machine includes: obtaining association information between steps and nodes; and determining, according to the association information, the corresponding node of the state machine for each step in the step set.
The association information represents the correspondence between steps and nodes. In implementation it can be a correspondence list: the list is queried with the identifier of a step to obtain the corresponding node.
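Manner 3 can be read as: build one state diagram over the full set of steps once, then, for a given product, look up the node identifier of each selected step and keep only the part of the diagram that touches those nodes. The Python sketch below follows that reading; the diagram contents, the step and node identifiers, and the rule of keeping a transition only when both endpoints are selected are assumptions of this example.

```python
# Preset state diagram over the full set of steps (hypothetical contents).
PRESET_DIAGRAM = {
    ("n_preprocess", "need_parse"): "n_parse",
    ("n_parse", "parse_ok"): "n_vertical",
    ("n_parse", "parse_failed"): "n_faq",
    ("n_faq", "to_search"): "n_search",
}

# Correspondence list: step identifier -> node identifier.
STEP_TO_NODE = {
    "preprocess": "n_preprocess",
    "parse": "n_parse",
    "vertical": "n_vertical",
    "faq": "n_faq",
    "search": "n_search",
}


def build_machine(step_set):
    """Keep the transitions whose source and target nodes are both selected."""
    node_ids = {STEP_TO_NODE[step] for step in step_set}
    return {
        (src, condition): dst
        for (src, condition), dst in PRESET_DIAGRAM.items()
        if src in node_ids and dst in node_ids
    }


# A product that does not use the search step.
machine = build_machine({"preprocess", "parse", "vertical", "faq"})
```

When the step set equals the full set, `build_machine` returns the whole preset diagram, which corresponds to the case where the subset and the full set coincide.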
In other embodiments of the present application, to guarantee the correspondence between steps and nodes (that is, the states of the state machine), the embodiments further include checking whether a step and its node match. That is, in this embodiment the method further includes:
Step SB1: obtain a first connection relationship, where the first connection relationship is the connection relationship between one step in the step set and any other step in the step set;
Step SB2: obtain a second connection relationship, where the second connection relationship is the connection relationship (or transition condition) between the node corresponding to the one step in the step set and the node corresponding to any other step in the state machine;
Step SB3: if the first connection relationship matches the second connection relationship, determine the node corresponding to the one step as a node in the node set;
Here, whether the first connection relationship matches the second connection relationship is judged to obtain a judgment result; if the judgment result indicates that the two match, the node is determined as a node in the node set;
Step SB4: if the first connection relationship does not match the second connection relationship, determine a node for the one step again.
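One possible reading of steps SB1 to SB4 is sketched below. It assumes that connection relationships are directed pairs and that "re-determining" a node means trying the next candidate node; neither detail is fixed by the application, and all names here are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

Edge = Tuple[str, str]


def step_relations(step: str, step_edges: Set[Edge]) -> Set[Edge]:
    """SB1: connections between `step` and the other steps in the step set."""
    return {edge for edge in step_edges if step in edge}


def node_relations(node: str, node_edges: Set[Edge], node_to_step: Dict[str, str]) -> Set[Edge]:
    """SB2: connections of `node`, expressed through the steps the nodes stand for."""
    return {
        (node_to_step[a], node_to_step[b])
        for a, b in node_edges
        if node in (a, b)
    }


def confirm_node(
    step: str,
    candidates: Iterable[str],
    step_edges: Set[Edge],
    node_edges: Set[Edge],
    node_to_step: Dict[str, str],
) -> Optional[str]:
    expected = step_relations(step, step_edges)
    for node in candidates:
        if node_relations(node, node_edges, node_to_step) == expected:
            return node  # SB3: relationships match, so the node joins the node set
        # SB4: mismatch, so a node is determined again (here: the next candidate)
    return None


step_edges = {("preprocess", "parse"), ("parse", "invoke")}
node_edges = {("n0", "n2"), ("n2", "n3"), ("n9", "n3")}
node_to_step = {"n0": "preprocess", "n2": "parse", "n3": "invoke", "n9": "parse"}

chosen = confirm_node("parse", ["n9", "n2"], step_edges, node_edges, node_to_step)
```

In the example data, candidate `n9` lacks the incoming connection from the preprocessing node, so it is rejected and `n2` is selected.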
The embodiments of the present application provide a state machine-based semantic parsing method and apparatus, in which: a function of a voice product is determined; a step set of the voice product in semantic parsing is determined according to the function of the voice product, the step set including at least two steps; a corresponding state machine node is determined for each step in the step set; a node set is formed from the determined nodes; and the node set is formed into the state machine of the voice product. In this way, the extensibility of the voice platform can be enhanced.
Based on the foregoing embodiments, an embodiment of the present application provides a state machine-based semantic parsing apparatus. The units included in the apparatus, and the modules included in each unit, may all be implemented by a processor in a first computing device; the functions implemented by the processor may of course also be implemented by specific logic circuits. In specific embodiments, the processor may be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), or the like.
In implementation, the first computing device may be any of various electronic devices with information processing capability, for example a smartphone, a notebook computer, a desktop computer, or a server cluster.
FIG. 7 is a schematic structural diagram of a state machine-based semantic parsing apparatus according to an embodiment of the present application. As shown in FIG. 7, the apparatus 700 includes a processor and a memory connected to the processor; the memory stores machine-readable instruction units executable by the processor; the machine-readable instruction units include a first determining unit 701, a second determining unit 702, a third determining unit 703, a first forming unit 704, and a second forming unit 705, wherein:
The first determining unit 701 is configured to determine a function of a voice product;
The second determining unit 702 is configured to determine, according to the function of the voice product, a step set of the voice product in semantic parsing, the step set including at least two steps;
The third determining unit 703 is configured to determine a corresponding state machine node for each step in the step set;
The first forming unit 704 is configured to form a node set from the determined nodes;
The second forming unit 705 is configured to form the node set into the state machine of the voice product.
Two manners of implementing the second forming unit 705 are provided below:
Manner 1: the second forming unit 705 includes a first determining module 7051 and a first forming module 7052, wherein: the first determining module 7051 is configured to determine the connection relationship (or transition condition) between the nodes corresponding to every two steps according to the connection relationship between those two steps in the step set; and the first forming module 7052 is configured to form the state machine of the voice product according to the connection relationships (or transition conditions) between the nodes in the node set.
Manner 2: the second forming unit 705 includes an obtaining module 7053 and a second forming module 7054, wherein: the obtaining module 7053 is configured to obtain the identifier of the node corresponding to each step; and the second forming module 7054 is configured to form the state machine of the voice product according to a preset state diagram, based on the identifiers of the nodes corresponding to the steps.
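As an illustration of Manner 1, the sketch below derives node-level transitions from step-level connections and treats each transition condition as a predicate over a shared context. The context keys, the node identifiers, and the helpers `node_transitions` and `next_node` are assumptions for the example, not part of the application.

```python
from typing import Callable, Dict, List, Optional, Tuple

Context = Dict[str, object]
Condition = Callable[[Context], bool]

# Hypothetical step-level connections, each with a transition condition.
STEP_LINKS: List[Tuple[str, str, Condition]] = [
    ("preprocess", "parse", lambda ctx: bool(ctx.get("text"))),
    ("parse", "invoke", lambda ctx: ctx.get("intent") is not None),
]

# Hypothetical mapping from each step to its state machine node.
STEP_TO_NODE: Dict[str, str] = {"preprocess": "n0", "parse": "n1", "invoke": "n2"}


def node_transitions(
    links: List[Tuple[str, str, Condition]], step_to_node: Dict[str, str]
) -> Dict[str, List[Tuple[str, Condition]]]:
    """Role of module 7051: turn step connections into node connections."""
    table: Dict[str, List[Tuple[str, Condition]]] = {}
    for src, dst, condition in links:
        table.setdefault(step_to_node[src], []).append((step_to_node[dst], condition))
    return table


def next_node(
    table: Dict[str, List[Tuple[str, Condition]]], current: str, ctx: Context
) -> Optional[str]:
    """Follow the first outgoing transition whose condition holds, if any."""
    for target, condition in table.get(current, []):
        if condition(ctx):
            return target
    return None


# Role of module 7052: the transition table is the formed state machine.
transitions = node_transitions(STEP_LINKS, STEP_TO_NODE)
```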
In other embodiments of the present application, in Manner 2, the apparatus 700 further includes a third forming unit 706 for forming the preset state diagram. The third forming unit 706 includes a second determining module 7061, an encapsulating module 7062, a third determining module 7063, and a third forming module 7064, wherein:
The second determining module 7061 is configured to determine the complete set of steps in semantic parsing, where the complete set of steps includes at least two steps and the step set is a subset of the complete set of steps;
The encapsulating module 7062 is configured to encapsulate each step in the complete set of steps as a node of a state machine;
The third determining module 7063 is configured to determine the connection relationship (or transition condition) between the nodes corresponding to every two steps according to the connection relationship between those two steps in the complete set of steps;
The third forming module 7064 is configured to form the state diagram according to the connection relationships between the nodes.
In other embodiments of the present application, the encapsulating module 7062 in Manner 2 further includes an obtaining submodule and a determining submodule, wherein:
The obtaining submodule is configured to obtain association information between steps and nodes;
The determining submodule is configured to determine, according to the association information, the corresponding state machine node for each step in the step set.
In other embodiments of the present application, the apparatus further includes a first obtaining unit, a second obtaining unit, a matching unit, and a non-matching unit, wherein:
The first obtaining unit is configured to obtain a first connection relationship, where the first connection relationship is the connection relationship between one step in the step set and any other step in the step set;
The second obtaining unit is configured to obtain a second connection relationship, where the second connection relationship is the connection relationship (or transition condition) between the node corresponding to the one step in the step set and the node corresponding to any other step in the state machine;
The matching unit is configured to determine the node corresponding to the one step as a node in the node set if the first connection relationship matches the second connection relationship;
The non-matching unit is configured to determine a node for the one step again if the first connection relationship does not match the second connection relationship.
Here, the apparatus further includes a judging unit, configured to judge whether the first connection relationship matches the second connection relationship to obtain a judgment result; if the judgment result indicates that the two match, the node is determined as a node in the node set; if the two do not match, a node is determined for the step again.
In other embodiments of the present application, the apparatus may further include: a sentence obtaining unit 707, which obtains a sentence to be parsed of the voice product; a sentence input unit 708, which inputs the sentence to be parsed into the first node of the state machine; a result obtaining unit 709, which obtains an output result from the last node of the state machine; and a result output unit 710, which outputs the output result.
It should be noted that the description of the above apparatus embodiment is similar to the description of the above method embodiments and has similar beneficial effects, so it is not repeated. For technical details not disclosed in the apparatus embodiments of the present application, refer to the description of the method embodiments of the present application; they are not repeated here to save space.
Based on the foregoing embodiments, an embodiment of the present application provides a state machine-based semantic parsing apparatus. The units included in the apparatus, and the modules included in each unit, may all be implemented by a processor in a second computing device; the functions implemented by the processor may of course also be implemented by specific logic circuits. In specific embodiments, the processor may be a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), or the like.
In implementation, the second computing device may be any of various electronic devices with information processing capability, for example a smartphone, a notebook computer, a desktop computer, or a server cluster.
FIG. 8 is a schematic structural diagram of a state machine-based semantic parsing apparatus according to an embodiment of the present application. As shown in FIG. 8, the apparatus 800 includes a processor and a memory connected to the processor; the memory stores machine-readable instruction units executable by the processor; the machine-readable instruction units include a third obtaining unit 801, an input unit 802, a fourth obtaining unit 803, and an output unit 804, wherein:
The third obtaining unit 801 is configured to obtain a sentence to be parsed of a voice product;
The input unit 802 is configured to input the sentence to be parsed into the first node of a preset state machine;
The fourth obtaining unit 803 is configured to obtain an output result from the last node of the state machine;
The output unit 804 is configured to output the output result.
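The flow handled by units 801 to 804 (feed the sentence into the first node, read the result from the last node) can be sketched as a chain of node handlers. The three handlers, the "play ..." intent, and the context keys are invented for illustration; the application does not specify what the individual nodes compute.

```python
from typing import Callable, Dict, List

Context = Dict[str, object]
Handler = Callable[[Context], Context]


def preprocess(ctx: Context) -> Context:
    # Hypothetical preprocessing node: normalise whitespace and case.
    return {**ctx, "text": str(ctx["sentence"]).strip().lower()}


def parse(ctx: Context) -> Context:
    # Hypothetical parsing node: recognise a single toy intent.
    text = str(ctx["text"])
    if text.startswith("play "):
        return {**ctx, "intent": "play_music", "slot": text[len("play "):]}
    return {**ctx, "intent": "unknown", "slot": ""}


def invoke(ctx: Context) -> Context:
    # Hypothetical final node: call the function matching the parsed intent.
    if ctx["intent"] == "play_music":
        return {**ctx, "result": f"playing {ctx['slot']}"}
    return {**ctx, "result": "sorry, not understood"}


# A linear preset state machine: first node, intermediate node, last node.
STATE_MACHINE: List[Handler] = [preprocess, parse, invoke]


def run_state_machine(sentence: str, nodes: List[Handler]) -> str:
    ctx: Context = {"sentence": sentence}  # unit 802: input to the first node
    for node in nodes:
        ctx = node(ctx)
    return str(ctx["result"])              # unit 803: output of the last node


output = run_state_machine("  Play Jazz ", STATE_MACHINE)
```

A linear list is the simplest case; a state machine with branching would select the next node through transition conditions instead of iterating in order.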
In other embodiments of the present application, the apparatus includes a first determining unit, a second determining unit, a third determining unit, a first forming unit, and a second forming unit, wherein:
The first determining unit is configured to determine a function of a voice product;
The second determining unit is configured to determine, according to the function of the voice product, a step set of the voice product in semantic parsing, the step set including at least two steps;
The third determining unit is configured to determine a corresponding state machine node for each step in the step set;
The first forming unit is configured to form a node set from the determined nodes;
The second forming unit is configured to form the node set into the state machine of the voice product.
It should be noted that the description of the above apparatus embodiment is similar to the description of the above method embodiments and has similar beneficial effects, so it is not repeated. For technical details not disclosed in the apparatus embodiments of the present application, refer to the description of the method embodiments of the present application; they are not repeated here to save space.
In other embodiments of the present application, the first computing device in the foregoing embodiments is used to form the state machine. The state machine formed by the first computing device may run on the first computing device, or may run on a second computing device as a functional module. The second computing device may be a server of the voice product or a terminal of the voice product; in other words, the state machine formed by the first computing device may be output to the server of the voice product or to the terminal of the voice product. On this basis, the embodiments of the present application further provide a state machine-based semantic parsing system, which has several implementation modes:
First mode: as shown in part A of FIG. 9, the system 900 in the first mode includes a first computing device 901, a second computing device 902, and a terminal 903, wherein:
The first computing device 901 is configured to form the state machine (as in the foregoing method or the embodiment shown in FIG. 8) and then output the formed state machine to the second computing device 902;
A client of the voice product (for example, a mobile phone voice assistant or a browser voice assistant) is installed on the terminal 903. The user opens the client on the terminal and speaks a sentence, and the client sends the received voice to the second computing device 902;
The second computing device 902 serves as the server of the terminal 903, and the state machine output by the first computing device 901 runs on it. The second computing device 902 is further configured to receive the voice output by the client of the terminal 903, perform speech recognition preprocessing on the voice to obtain a sentence to be parsed, input the sentence to be parsed into the state machine running on the second computing device 902, obtain the output result from the state machine, and return the output result to the client of the terminal 903; finally, the client of the terminal 903 outputs the output result to the user.
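The message flow of the first mode can be mimicked in a few lines. Everything below is a stand-in: `fake_speech_recognition` simply decodes bytes instead of recognising audio, the state machine is a toy callable, and the class names are labels for the roles of the devices rather than an actual interface from the application.

```python
from typing import Callable


def fake_speech_recognition(audio: bytes) -> str:
    """Placeholder for speech recognition preprocessing (decodes UTF-8 bytes)."""
    return audio.decode("utf-8")


class SecondComputingDevice:
    """Plays the role of the server 902 that runs the received state machine."""

    def __init__(self, state_machine: Callable[[str], str]) -> None:
        # The state machine is assumed to have been produced by device 901.
        self._state_machine = state_machine

    def handle_voice(self, audio: bytes) -> str:
        sentence = fake_speech_recognition(audio)  # sentence to be parsed
        return self._state_machine(sentence)       # result returned to the client


class TerminalClient:
    """Plays the role of the voice product client on terminal 903."""

    def __init__(self, server: SecondComputingDevice) -> None:
        self._server = server

    def speak(self, audio: bytes) -> str:
        # Forward the captured voice to the server and hand back its result.
        return self._server.handle_voice(audio)


def toy_state_machine(sentence: str) -> str:
    return f"parsed: {sentence.strip()}"


client = TerminalClient(SecondComputingDevice(toy_state_machine))
reply = client.speak(b" what is the weather ")
```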
Second mode: as shown in part B of FIG. 9, the system 900 in the second mode includes a first computing device 901 and a second computing device 902, wherein:
The first computing device 901 is configured to form the state machine (as in the foregoing method or the embodiment shown in FIG. 8) and then output the formed state machine to the second computing device 902;
The second computing device 902 serves as the terminal, and a client of the voice product (for example, a mobile phone voice assistant such as Apple's Siri, or a browser voice assistant) is installed on it. The user opens the client on the terminal and speaks a sentence, and the client sends the received voice to the state machine running on the second computing device 902. After the state machine runs, it sends the output result to the client; the client obtains the output result from the state machine and finally outputs it to the user. In implementation, the state machine may be independent of the client, in which case the client includes a detection apparatus that detects what the user says and then sends the voice to the state machine running on the second computing device 902.
It should be noted that, in the embodiments of the present application, if the above state machine-based semantic parsing method is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present application. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, or an optical disc. Thus, the embodiments of the present application are not limited to any specific combination of hardware and software.
Correspondingly, an embodiment of the present application further provides a computer-readable storage medium that stores computer-executable instructions which, when executed by a processor, perform the state machine-based semantic parsing method of the embodiments of the present application.
Correspondingly, an embodiment of the present application further provides a computing device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor, when executing the program, implements the state machine-based semantic parsing method of the embodiments of the present application.
It should be noted that the description of the above computing device embodiment is similar to the description of the above method and has the same beneficial effects as the method embodiments. For technical details not disclosed in the computing device embodiments of the present application, those skilled in the art should refer to the description of the method embodiments of the present application.
In implementation, the first computing device, the second computing device, and the terminal may each be implemented by an electronic device. FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in FIG. 10, the computing device 1000 may include at least one processor 1001, at least one communication bus 1002, a user interface 1003, at least one external communication interface 1004, and at least one memory 1005. The communication bus 1002 implements connection and communication among these components. The user interface 1003 may include a display screen and a keyboard. The external communication interface 1004 may include standard wired and wireless interfaces.
It should be understood that references throughout the specification to "one embodiment" or "an embodiment" mean that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present application. Therefore, occurrences of "in one embodiment" or "in an embodiment" throughout the specification do not necessarily refer to the same embodiment. Furthermore, these particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present application, the sequence numbers of the above processes do not imply an order of execution; the order of execution of the processes should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of the present application in any way. The sequence numbers of the above embodiments are for description only and do not indicate the relative merits of the embodiments.
It should be noted that, in this document, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, article, or apparatus that includes the element.
In the several embodiments provided in the present application, it should be understood that the disclosed devices and methods may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function; in actual implementation there may be other divisions: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may all be integrated into one processing unit, each unit may serve as a separate unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be completed by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM), a magnetic disk, or an optical disc.
Alternatively, if the above integrated unit of the present application is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present application. The storage medium includes various media that can store program code, such as a removable storage device, a ROM, a magnetic disk, or an optical disc.
The above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (17)

  1. A state machine-based semantic parsing method, applied to a server, the method comprising:
    determining a function of a voice product;
    determining, according to the function of the voice product, a step set of the voice product in semantic parsing, the step set including at least two steps, the at least two steps being used to complete at least the following operations: preprocessing a voice instruction input by a user, parsing the voice instruction, and invoking a corresponding function according to the parsing result;
    determining a corresponding state machine node for each step in the step set;
    forming a node set from the determined nodes;
    forming the node set into a state machine of the voice product, so that the server parses the voice instruction input by the user according to the state machine and provides the user with the function corresponding to the voice instruction according to the parsing result.
  2. 根据权利要求1所述的方法,所述将所述节点集合形成所述语音产品的状态机,包括:The method of claim 1, the grouping the nodes to form a state machine of the voice product, comprising:
    根据所述步骤集合中每两个步骤之间的连接关系确定各每两个步骤对应的节点之间的连接关系;Determining a connection relationship between nodes corresponding to each two steps according to a connection relationship between each two steps in the step set;
    根据所述节点集合中各节点之间的连接关系形成所述语音产品的状态机。Forming a state machine of the voice product according to a connection relationship between nodes in the node set.
  3. The method according to claim 1, wherein forming the node set into the state machine of the voice product comprises:
    obtaining the identifier of the node corresponding to each step;
    forming the state machine of the voice product according to a preset state diagram, based on the identifier of the node corresponding to each step.
  4. The method according to claim 3, wherein forming the preset state diagram comprises:
    determining a complete set of steps in semantic parsing, the complete set of steps comprising at least two steps, the step set being a subset of the complete set of steps;
    encapsulating each step in the complete set of steps as a node of a state machine;
    determining, according to the connection relationship between every two steps in the complete set of steps, the connection relationship between the nodes corresponding to those two steps;
    forming the state diagram according to the connection relationships between the nodes.
  5. The method according to claim 4, wherein determining a corresponding node of a state machine for each step in the step set comprises:
    obtaining association information between steps and nodes;
    determining, according to the association information, the corresponding node of the state machine for each step in the step set.
  6. The method according to any one of claims 1 to 5, the method further comprising:
    obtaining a first connection relationship, the first connection relationship being a connection relationship between one step in the step set and any other step in the step set;
    obtaining a second connection relationship, the second connection relationship being a connection relationship between the node corresponding to the one step in the step set and the node corresponding to any other step in the state machine;
    if the first connection relationship matches the second connection relationship, determining the node corresponding to the one step as a node in the node set;
    if the first connection relationship does not match the second connection relationship, re-determining a node for the one step.
  7. The method according to claim 1, further comprising:
    obtaining a to-be-parsed statement of the voice product;
    inputting the to-be-parsed statement into the first node of the state machine;
    obtaining an output result from the last node of the state machine;
    outputting the output result.
  8. A state machine based semantic parsing method, the semantic parsing method being applied to a server, the semantic parsing method comprising:
    obtaining a to-be-parsed statement of a voice product;
    inputting the to-be-parsed statement into the first node of a preset state machine, wherein each node of the state machine corresponds to one step in a step set of semantic parsing; the step set is determined according to the functions provided by the voice product and comprises at least two steps; the at least two steps are used to complete at least the following operations: preprocessing the voice instruction, parsing the voice instruction of the user, and invoking a corresponding function according to the result of the parsing;
    obtaining an output result from the last node of the state machine;
    outputting the output result.
  9. A state machine based semantic parsing apparatus, the apparatus being applied to a server, the apparatus comprising:
    a processor and a memory connected to the processor, the memory storing machine-readable instruction units executable by the processor, the machine-readable instruction units comprising:
    a first determining unit, a second determining unit, a third determining unit, a first forming unit, and a second forming unit, wherein:
    the first determining unit is configured to determine the functions of a voice product;
    the second determining unit is configured to determine, according to the functions of the voice product, a step set of the voice product in semantic parsing, the step set comprising at least two steps, the at least two steps being used to complete at least the following operations: preprocessing a voice instruction input by a user, parsing the voice instruction, and invoking a corresponding function according to the result of the parsing;
    the third determining unit is configured to determine a corresponding node of a state machine for each step in the step set;
    the first forming unit is configured to form a node set from the determined nodes;
    the second forming unit is configured to form the node set into a state machine of the voice product, so that the server parses the voice instruction input by the user according to the state machine and, according to the parsing result, provides the user with a function corresponding to the voice instruction.
  10. The apparatus according to claim 9, wherein the second forming unit comprises:
    a first determining module, configured to determine, according to the connection relationship between every two steps in the step set, the connection relationship between the nodes corresponding to those two steps;
    a first forming module, configured to form the state machine of the voice product according to the connection relationships between the nodes in the node set.
  11. The apparatus according to claim 9, wherein the second forming unit comprises:
    an obtaining module, configured to obtain the identifier of the node corresponding to each step;
    a second forming module, configured to form the state machine of the voice product according to a preset state diagram, based on the identifier of the node corresponding to each step.
  12. The apparatus according to claim 11, the apparatus further comprising a third forming unit configured to form the preset state diagram;
    the third forming unit comprises a second determining module, an encapsulation module, a third determining module, and a third forming module, wherein:
    the second determining module is configured to determine a complete set of steps in semantic parsing, the complete set of steps comprising at least two steps, the step set being a subset of the complete set of steps;
    the encapsulation module is configured to encapsulate each step in the complete set of steps as a node of a state machine;
    the third determining module is configured to determine, according to the connection relationship between every two steps in the complete set of steps, the connection relationship between the nodes corresponding to those two steps;
    the third forming module is configured to form the state diagram according to the connection relationships between the nodes.
  13. The apparatus according to claim 12, wherein the encapsulation module comprises:
    an obtaining submodule, configured to obtain association information between steps and nodes;
    a determining submodule, configured to determine, according to the association information, the corresponding node of the state machine for each step in the step set.
  14. The apparatus according to any one of claims 10 to 13, the apparatus further comprising a first obtaining unit, a second obtaining unit, a matching unit, and a mismatching unit, wherein:
    the first obtaining unit is configured to obtain a first connection relationship, the first connection relationship being a connection relationship between one step in the step set and any other step in the step set;
    the second obtaining unit is configured to obtain a second connection relationship, the second connection relationship being a connection relationship between the node corresponding to the one step in the step set and the node corresponding to any other step in the state machine;
    the matching unit is configured to, if the first connection relationship matches the second connection relationship, determine the node corresponding to the one step as a node in the node set;
    the mismatching unit is configured to, if the first connection relationship does not match the second connection relationship, re-determine a node for the one step.
  15. The apparatus according to claim 10, the apparatus further comprising:
    a statement obtaining unit, configured to obtain a to-be-parsed statement of the voice product;
    a statement input unit, configured to input the to-be-parsed statement into the first node of the state machine;
    a result obtaining unit, configured to obtain an output result from the last node of the state machine;
    a result output unit, configured to output the output result.
  16. A state machine based semantic parsing apparatus, the semantic parsing apparatus being applied to a server, the semantic parsing apparatus comprising:
    a processor and a memory connected to the processor, the memory storing machine-readable instruction units executable by the processor, the machine-readable instruction units comprising:
    a third obtaining unit, an input unit, a fourth obtaining unit, and an output unit, wherein:
    the third obtaining unit is configured to obtain a to-be-parsed statement of a voice product;
    the input unit is configured to input the to-be-parsed statement into the first node of a preset state machine, wherein each node of the state machine corresponds to one step in a step set of semantic parsing; the step set is determined according to the functions provided by the voice product and comprises at least two steps; the at least two steps are used to complete at least the following operations: preprocessing the voice instruction, parsing the voice instruction of the user, and invoking a corresponding function according to the result of the parsing;
    the fourth obtaining unit is configured to obtain an output result from the last node of the state machine;
    the output unit is configured to output the output result.
  17. A non-volatile computer-readable storage medium storing one or more programs, the one or more programs comprising instructions that, when executed by a computing device, cause the computing device to perform the method according to any one of claims 1 to 8.
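
As a minimal illustration of the method claims, the following Python sketch maps each step of a step set to a node, checks the step connections against a preset state diagram, and passes a to-be-parsed statement from the first node to the last node. It is a sketch under stated assumptions, not the implementation disclosed in the application: the names `Node`, `ALL_NODES`, `STATE_DIAGRAM`, `build_state_machine`, and `run`, the keyword-based parsing, and the strictly linear node order are all hypothetical. Where claim 6 re-determines a node on a mismatch, the sketch simply raises an error.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass(frozen=True)
class Node:
    node_id: str                 # identifier of the node (hypothetical naming)
    run: Callable[[dict], dict]  # the step encapsulated by this node


def preprocess(ctx: dict) -> dict:
    # Preprocessing step: normalise the raw statement.
    return {**ctx, "text": ctx["text"].strip().lower()}


def parse(ctx: dict) -> dict:
    # Parsing step: a toy keyword rule standing in for real semantic parsing.
    intent = "play_music" if ctx["text"].startswith("play") else "unknown"
    return {**ctx, "intent": intent}


def invoke(ctx: dict) -> dict:
    # Invocation step: pretend to call the function matching the intent.
    return {**ctx, "result": f"invoked:{ctx['intent']}"}


# Complete set of steps, each encapsulated as a node. The dict also serves
# as the step-to-node association information.
ALL_NODES: Dict[str, Node] = {
    "preprocess": Node("n_pre", preprocess),
    "parse": Node("n_parse", parse),
    "invoke": Node("n_invoke", invoke),
}

# Preset state diagram: allowed connections between node identifiers.
STATE_DIAGRAM: Set[Tuple[str, str]] = {
    ("n_pre", "n_parse"),
    ("n_parse", "n_invoke"),
}


def build_state_machine(steps: List[str],
                        step_edges: List[Tuple[str, str]]) -> List[Node]:
    """Form a (linear) state machine for a product's step set.

    Assumes `steps` is already listed in execution order.
    """
    nodes = [ALL_NODES[s] for s in steps]
    for a, b in step_edges:
        edge = (ALL_NODES[a].node_id, ALL_NODES[b].node_id)
        # The step connection must match a node connection in the diagram.
        if edge not in STATE_DIAGRAM:
            raise ValueError(f"no node connection for steps {a} -> {b}")
    return nodes


def run(machine: List[Node], statement: str) -> str:
    """Feed the statement to the first node and read the last node's output."""
    ctx = {"text": statement}
    for node in machine:
        ctx = node.run(ctx)
    return ctx["result"]


machine = build_state_machine(
    ["preprocess", "parse", "invoke"],
    [("preprocess", "parse"), ("parse", "invoke")],
)
print(run(machine, "  Play some jazz "))  # prints "invoked:play_music"
```

Keeping the nodes and the state diagram in shared tables means a different voice product could reuse the same nodes by passing a different step set, which is one plausible reading of why the step set is described as a subset of the complete set of steps.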
PCT/CN2018/075795 2017-02-23 2018-02-08 Semantic parsing method and apparatus, and storage medium WO2018153273A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710099405.9 2017-02-23
CN201710099405.9A CN106874259B (en) 2017-02-23 2017-02-23 A kind of semantic analysis method and device, equipment based on state machine

Publications (1)

Publication Number Publication Date
WO2018153273A1 true WO2018153273A1 (en) 2018-08-30

Family

ID=59168429

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/075795 WO2018153273A1 (en) 2017-02-23 2018-02-08 Semantic parsing method and apparatus, and storage medium

Country Status (2)

Country Link
CN (1) CN106874259B (en)
WO (1) WO2018153273A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106874259B (en) * 2017-02-23 2019-07-16 腾讯科技(深圳)有限公司 A kind of semantic analysis method and device, equipment based on state machine
CN107633526B (en) * 2017-09-04 2022-10-14 腾讯科技(深圳)有限公司 Image tracking point acquisition method and device and storage medium
CN107861951A (en) * 2017-11-17 2018-03-30 康成投资(中国)有限公司 Session subject identifying method in intelligent customer service
CN110019303A (en) * 2017-12-11 2019-07-16 佛山市顺德区美的电热电器制造有限公司 Exchange method, device, system and storage medium in cooking process
CN108735201B (en) * 2018-06-29 2020-11-17 广州视源电子科技股份有限公司 Continuous speech recognition method, device, equipment and storage medium
CN109670025B (en) * 2018-12-19 2023-06-16 北京小米移动软件有限公司 Dialogue management method and device
CN109683897B (en) * 2018-12-29 2022-05-10 广州华多网络科技有限公司 Program processing method, device and equipment
CN110688859A (en) * 2019-09-18 2020-01-14 平安科技(深圳)有限公司 Semantic analysis method, device, medium and electronic equipment based on machine learning
CN112307167A (en) * 2020-10-30 2021-02-02 广州华多网络科技有限公司 Text sentence cutting method and device, computer equipment and storage medium
CN113035200B (en) * 2021-03-03 2022-08-05 科大讯飞股份有限公司 Voice recognition error correction method, device and equipment based on human-computer interaction scene

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080262848A1 (en) * 2005-01-06 2008-10-23 Eric Shienbrood Applications Server and Method
CN105589848A (en) * 2015-12-28 2016-05-18 百度在线网络技术(北京)有限公司 Dialog management method and device
CN105845137A (en) * 2016-03-18 2016-08-10 中国科学院声学研究所 Voice communication management system
CN106325515A (en) * 2016-08-26 2017-01-11 北京零秒科技有限公司 Service-oriented human-computer interaction system and implementation method
CN106874259A (en) * 2017-02-23 2017-06-20 腾讯科技(深圳)有限公司 A kind of semantic analysis method and device, equipment based on state machine

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3876231A1 (en) * 2020-03-04 2021-09-08 Beijing Baidu Netcom Science and Technology Co., Ltd Method and apparatus for recognizing speech
US11416687B2 (en) 2020-03-04 2022-08-16 Apollo Intelligent Connectivity (Beijing) Technology Co., Ltd. Method and apparatus for recognizing speech

Also Published As

Publication number Publication date
CN106874259B (en) 2019-07-16
CN106874259A (en) 2017-06-20

Similar Documents

Publication Publication Date Title
WO2018153273A1 (en) Semantic parsing method and apparatus, and storage medium
CN107924483B (en) Generation and application of generic hypothesis ranking model
RU2699399C2 (en) System and method for detecting orphan utterances
JP2021018797A (en) Conversation interaction method, apparatus, computer readable storage medium, and program
JP2022539138A (en) Systems and methods for performing semantic search using a natural language understanding (NLU) framework
CN108369580B (en) Language and domain independent model based approach to on-screen item selection
CN114424185A (en) Stop word data augmentation for natural language processing
US10860289B2 (en) Flexible voice-based information retrieval system for virtual assistant
US11586689B2 (en) Electronic apparatus and controlling method thereof
US11164562B2 (en) Entity-level clarification in conversation services
US20170011114A1 (en) Common data repository for improving transactional efficiencies of user interactions with a computing device
TW201606750A (en) Speech recognition using a foreign word grammar
JP2023519713A (en) Noise Data Augmentation for Natural Language Processing
WO2021159904A1 (en) Voice data processing method and device for intelligent voice conversation system
Inupakutika et al. Integration of NLP and Speech-to-text Applications with Chatbots
CN113051895A (en) Method, apparatus, electronic device, medium, and program product for speech recognition
KR20210042520A (en) An electronic apparatus and Method for controlling the electronic apparatus thereof
US11966562B2 (en) Generating natural languages interface from graphic user interfaces
WO2022226811A1 (en) Method and system for constructing voice recognition model, and voice processing method and system
US11983464B2 (en) Neural network-based message communication framework with summarization and on-demand audio output generation
CN111104118A (en) AIML-based natural language instruction execution method and system
US20210109960A1 (en) Electronic apparatus and controlling method thereof
TWI836856B (en) Message mapping and combination for intent classification
JP2019109424A (en) Computer, language analysis method, and program
US20240169979A1 (en) Action topic ontology

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18757668; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 18757668; Country of ref document: EP; Kind code of ref document: A1)