WO2019218820A1 - Method and apparatus for determining a control object, storage medium, and electronic device - Google Patents
Method and apparatus for determining a control object, storage medium, and electronic device
- Publication number
- WO2019218820A1 (PCT/CN2019/082348)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- controlled
- control instruction
- determining
- target object
- state information
- Prior art date
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0803—Configuration setting
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B15/00—Systems controlled by a computer
- G05B15/02—Systems controlled by a computer electric
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B19/00—Programme-control systems
- G05B19/02—Programme-control systems electric
- G05B19/418—Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM]
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/20—Pc systems
- G05B2219/26—Pc applications
- G05B2219/2642—Domotique, domestic, home control, automation, smart house
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16Y—INFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR THE INTERNET OF THINGS [IoT]
- G16Y40/00—IoT characterised by the purpose of the information processing
- G16Y40/30—Control
Definitions
- the present disclosure relates to the field of communications, and in particular, to a method and apparatus for determining a control object, a storage medium, and an electronic device.
- the dialogue management mechanism of that solution is determined by the scenario; when a new scenario is added, a management mechanism must be re-customized, the implementation process is complicated, and it cannot be expanded quickly.
- scene recognition identifies the domain of the current message only at a shallow level and does not deeply understand the user's true intention.
- existing solutions apply only to pure voice/text intelligent interactive devices; artificial intelligence technology has not yet reached a state where it can be applied freely.
- scenes may be switched incorrectly or the command may be misunderstood. For example, the user first presses a switch to turn on the bedroom light and then says "too dark". The user actually wants the light brightened, but the intelligent central control cannot correctly understand the command.
- Embodiments of the present disclosure provide a method and apparatus for determining a control object, a storage medium, and an electronic device.
- a method for determining a control object, comprising: acquiring, on a first device, a first control instruction and state information of an object to be controlled, wherein a communication connection is established between the first device and the object to be controlled; and determining, from the object to be controlled according to the state information, the target object that the first control instruction requests to control.
- an apparatus for determining a control object, comprising: an acquiring module configured to acquire, on a first device, a first control instruction and state information of an object to be controlled, wherein a communication connection is established between the first device and the object to be controlled; and a determining module configured to determine, from the object to be controlled according to the state information, the target object that the first control instruction requests to control.
- a storage medium having stored therein a computer program, wherein the computer program is configured to perform the steps of any one of the method embodiments described above at runtime.
- an electronic device comprising a memory and a processor, wherein the memory stores a computer program and the processor is configured to execute the computer program to perform the steps in any one of the above method embodiments.
- the state information of the object to be controlled is obtained, and the target object that the first control instruction requests to control is determined according to that state information. This solves the technical problem in the related art that the steps for determining the target object are too cumbersome, reduces the number of interactions between the central control and the user, makes the central control more intelligent, and improves the user experience.
- FIG. 1 is a network architecture diagram of an embodiment of the present disclosure
- FIG. 2 is a flowchart of a method of determining a control object according to an embodiment of the present disclosure
- FIG. 3 is a structural block diagram of an apparatus for determining a control object according to an embodiment of the present disclosure;
- FIG. 4 is an overall system architecture diagram of an embodiment of the present disclosure;
- FIG. 5 is a flowchart of the deep semantic understanding module of an embodiment of the present disclosure;
- FIG. 6 is a schematic diagram of storage of user history data of a memory module according to an embodiment of the present disclosure
- FIG. 7 is a framework diagram of a domain identification model of an embodiment of the present disclosure.
- FIG. 8 is a diagram of an intent recognition model framework of an embodiment of the present disclosure.
- FIG. 9 is a framework diagram of the home service robot in Example 1;
- FIG. 10 is a flowchart of the home service robot in Example 1;
- FIG. 11 is a framework diagram of the smart set-top box in Example 2;
- FIG. 12 is a flowchart of the smart set-top box in Example 2;
- FIG. 13 is a framework diagram of the intelligent conference control in Example 3;
- FIG. 14 is a flowchart of the intelligent conference control in Example 3;
- FIG. 15 is a framework diagram of the smart vehicle in Example 4;
- FIG. 16 is a flowchart of the smart vehicle in Example 4.
- FIG. 1 is a network architecture diagram of an embodiment of the present disclosure. As shown in FIG. 1, the network architecture includes a central control and the objects controlled by the central control, where the central control controls each object according to control instructions.
- FIG. 2 is a flowchart of a method for determining a control object according to an embodiment of the present disclosure. As shown in FIG. 2, the process includes the following steps:
- Step S202 acquiring, by the first device, a first control instruction and status information of the object to be controlled, where a communication connection is established between the first device and the object to be controlled;
- Step S204 Determine, according to the state information, a target object that is requested to be controlled by the first control instruction from the object to be controlled.
- through the above steps, the state information of the object to be controlled is obtained, and the target object that the first control instruction requests to control is determined according to that state information. This solves the technical problem in the related art that the steps for determining the target object are too cumbersome, reduces the number of interactions between the central control and the user, makes the central control more intelligent, and improves the user experience.
- the execution body of the foregoing steps, i.e. the first device, may be a central control (control unit), such as a speaker, a mobile phone, a set-top box, a robot, an in-vehicle device, or a smart housekeeper, but is not limited thereto.
- the first control instruction and the state information of the object to be controlled may also be acquired elsewhere than on the first device, that is, obtained directly; in that case the execution body is not the first device but a communication device connected to the first device, such as a control device of the first device.
- determining, according to the state information, the target object that is requested to be controlled by the first control instruction from the object to be controlled includes:
- the state information of the object to be controlled is parsed, and the target object is determined from the object to be controlled according to a predetermined correspondence, where the predetermined correspondence is used to describe the correspondence between state information and target objects. For example, if the state information of a first object is the on state or the standby state, it is a target object; if the state information of a second object is the off state, it is a non-target object; if the state information of a third object is the foreground display state, it is a target object; and if the state information of a fourth object is the background running state, it is a non-target object.
- determining the target object from the object to be controlled according to the predetermined correspondence includes the following examples (a minimal sketch follows the list):
- Example 1: an object to be controlled whose switch state is on is determined as the target object;
- Example 2: the object to be controlled whose turn-on time is closest to the current time is determined as the target object; the shortest interval between the turn-on time and the current time can be understood as identifying the object the user has just turned on.
- In other examples, an object whose frequency of use is greater than a predetermined value (or is the highest) may be determined as the target object, or an object whose working state changed within a predetermined time (e.g., an application that switched from background running to foreground display 3 s ago) may be determined as the target object.
- the status information includes at least one of the following: a switch state, an open time, a use frequency, and the like.
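- A minimal sketch of the selection rules above (field names such as switch_on, opened_at, and use_frequency are illustrative assumptions, not from the source):

```python
import time

# Hypothetical device-state records; all field names are illustrative only.
devices = [
    {"name": "bedroom light",   "switch_on": True,  "opened_at": time.time() - 5,   "use_frequency": 12},
    {"name": "living room tv",  "switch_on": True,  "opened_at": time.time() - 900, "use_frequency": 40},
    {"name": "air conditioner", "switch_on": False, "opened_at": 0.0,               "use_frequency": 3},
]

def pick_target(devices):
    """Apply the rules in order: prefer switched-on devices (Example 1),
    then the one opened most recently (Example 2)."""
    candidates = [d for d in devices if d["switch_on"]]       # switch state is on
    if not candidates:
        return None                                           # no target: ask the user to confirm
    return max(candidates, key=lambda d: d["opened_at"])      # shortest time since turn-on

print(pick_target(devices)["name"])  # -> "bedroom light"
```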
- determining, according to the state information, the target object that is requested to be controlled by the first control instruction from the object to be controlled includes:
- S11: determining specified state information of the object to be controlled according to the first control instruction; S12: determining an object to be controlled whose state information matches the specified state information as the target object. For example, if the first control instruction is "turn on", the specified state information of the object to be controlled is the off state, because the user cannot turn on an object that is already on; if the first control instruction is "turn up the volume", the specified state information of the object to be controlled is a state in which the current volume is below a predetermined threshold, and so on.
- determining an object to be controlled whose state information matches the specified state information as the target object includes: determining an object to be controlled for which the similarity between its working state and the specified state information is higher than a preset threshold as the target object, where the state information includes the working state. Alternatively, an object to be controlled for which the similarity between its working state and the specified state information is lower than the preset threshold may be determined as the target object.
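- A sketch of this matching step, assuming working states are represented as numeric feature vectors and similarity is cosine similarity (both the representation and the metric are assumptions; the source does not specify them):

```python
def cosine(a, b):
    # Cosine similarity between two equal-length numeric vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def match_targets(candidates, specified_state, threshold=0.8):
    # Keep objects whose working state is similar enough to the state
    # specified by the first control instruction.
    return [c for c in candidates if cosine(c["state_vec"], specified_state) > threshold]

# "turn up the volume" -> specified state: device on, volume currently low
candidates = [
    {"name": "tv",      "state_vec": [1.0, 0.2]},   # on, volume low
    {"name": "speaker", "state_vec": [0.0, 0.9]},   # off, volume high
]
print([c["name"] for c in match_targets(candidates, [1.0, 0.2])])  # -> ['tv']
```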
- the method further includes:
- in a case where the target object is determined from the object to be controlled, a second control instruction is sent to the target object through the first device, where the second control instruction instructs the target object to execute the operation requested by the first control instruction;
- in a case where no target object is determined from the object to be controlled, feedback information for confirming the first control instruction is returned through the first device.
- acquiring the first control instruction from the first device includes the following acquisition modes:
- voice information is collected by the first device, where the voice information carries feature information, and the first control instruction is generated according to the feature information;
- a text message is received from the first device, where the text message carries feature information, and the first control instruction is generated according to the feature information;
- a remote-control instruction is received from the first device, and the first control instruction is generated according to the remote-control instruction;
- a control gesture is received from the first device, feature information is extracted from the control gesture, and the first control instruction is generated according to the feature information.
- the first control instruction may also be recognized, and the target object then determined according to the first control instruction. This may be used together with determining the target object according to the state information: either one of the two methods is selected to determine the target object, or, when one method yields many target objects, the other method further narrows the range. Determining the target object according to the first control instruction includes:
- S21: recognizing the first control instruction and determining the control domain of the first control instruction; S22: determining an object to be controlled whose domain is the same as the control domain as the target object.
- recognizing the first control instruction includes one of the following: recognizing the first control instruction using a data model preset on the first device, where the data model includes databases of multiple domains; or recognizing the first control instruction online through a network server.
- before the first control instruction is recognized using the data model preset on the first device, the data model may also be trained through a neural network; when training the data model, the domain and the state information need to be input into the data model as label vectors.
- an apparatus for determining a control object is provided, which is used to implement the above embodiments and preferred implementations; what has already been described is not repeated.
- the term "module" may be a combination of software and/or hardware that implements a predetermined function.
- although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
- FIG. 3 is a structural block diagram of a determining apparatus for controlling an object according to an embodiment of the present disclosure. As shown in FIG. 3, the apparatus includes:
- the obtaining module 30 is configured to acquire, on the first device, the first control instruction and the state information of the object to be controlled, wherein the first device establishes a communication connection with the object to be controlled;
- the determining module 32 is configured to determine, from the object to be controlled, the target object requested by the first control instruction according to the status information.
- the determining module includes: a first determining unit configured to parse the state information of the object to be controlled and determine the target object from the object to be controlled according to a predetermined correspondence, where the predetermined correspondence is used to describe the correspondence between state information and target objects.
- the determining module includes: a second determining unit configured to determine specified state information of the object to be controlled according to the first control instruction; and a third determining unit configured to determine an object to be controlled whose state information matches the specified state information as the target object.
- the apparatus of this embodiment further includes: a sending module configured to, after the determining module determines, from the object to be controlled according to the state information, the target object that the first control instruction requests to control, and in a case where the target object is determined from the object to be controlled, send a second control instruction to the target object through the first device, where the second control instruction instructs the target object to execute the operation requested by the first control instruction.
- each of the above modules may be implemented by software or hardware. For the latter, this may be achieved in, but is not limited to, the following ways: the above modules are all located in the same processor; or the above modules are located in different processors in any combination.
- This embodiment provides a multi-scene collaborative-interaction intelligent semantic understanding system, which suits multiple scenarios and can be embedded in various voice/text interaction devices such as smart speakers, smart phones, and smart set-top boxes. It involves natural language processing, semantic analysis and understanding, artificial intelligence, and other fields.
- the multi-device (multi-scene) collaborative-interaction semantic understanding system provided by this embodiment can be applied to various smart device interaction systems such as smart homes, smart phones, and smart vehicles.
- the semantic understanding system can receive voice and text input and receive, in real time, scene status messages from an indefinite number of smart devices.
- the semantic understanding platform fuses the multiple kinds of information and, through multiple rounds of interaction, deeply understands the user's intention, converting the user's control instructions into service instructions to be scheduled and executed by the smart devices.
- This embodiment includes four modules: a preprocessing module, a deep semantic understanding module, a result feedback module, and a data model management module.
- Pre-processing module: pre-processes messages, including text error correction, pinyin-to-Chinese-character conversion, Chinese numeral conversion, and so on.
- the deep semantic understanding module consists of three sub-modules: the domain recognition module, the intent recognition module, and the information extraction module.
- Domain recognition module: initially identifies, in combination with the device states, the domain of the user message; the result may be one or multiple domains.
- Intent recognition module: initially determines the user's intent, including action intents such as "listen", "watch", and "open", as well as domain-specific intents; for example, weather consultation includes "general query" and "focused query".
- Information extraction module: when the domain and intent of the user message are clear, extracts information, including dates, places, singers, actors, etc., to deeply understand the user's intention.
- the result feedback module consists of two sub-modules: the interaction module and the instruction generation module.
- Interaction module: when the domain and intent of the user message are unclear, actively guides interaction to determine the user's intent.
- Instruction generation module: for instruction-type messages, returns the operation to be performed for the user as a JSON string.
- Data model management module: configured to maintain the algorithm libraries, rule bases, databases, and so on required by the pre-processing module and the deep semantic understanding module.
- FIG. 4 is an overall system architecture diagram of an embodiment of the present disclosure. As shown in FIG. 4, the semantic understanding platform mainly collects voice/text messages and the states of an indefinite number of devices.
- the system mainly consists of two parts: the semantic understanding system and the data model.
- the semantic understanding system contains three modules: the pre-processing module, the deep semantic understanding module, and the result feedback module.
- the purpose of the pre-processing module is to make the user message text more standardized, preparing for the subsequent deep semantic understanding module.
- the result feedback module is used to respond to user messages.
- the deep semantic understanding module is the core functional module of the system.
- the deep semantic understanding module is a set of common scene semantic understanding frameworks that support multidimensional scene expansion.
- the new scenario extension only needs to maintain the corresponding corpus without redefining the new framework.
- compared with existing solutions in the industry, the system is more intelligent and user-friendly, reduces system maintenance costs, and can be applied to various intelligent interactive devices.
- FIG. 5 is a flowchart of a deep semantic understanding module according to an embodiment of the present disclosure.
- the module is a set of general scene semantic understanding frameworks; expanding to a new scene only requires maintaining the corresponding corpus, without redefining a new framework, making the system more intelligent.
- the module also provides a function for receiving device scene state messages, can be used in smart devices where multiple interaction modes coexist, and better realizes context understanding; it is therefore one of the core modules of the disclosure.
- the system can be used in a multi-device control system, such as a smart home, where the domains are the individual devices and the intents are the actions controlling each device; or in a single-device multi-scene control system, such as a smart set-top box, where there is only one device, the television, the scenes include albums, movies, music, and so on, the domains are the television-related scenes, and the intents are the actions controlling each scene.
- corpus preparation mainly includes three parts: the domain library, the device library, and the domain lexicon.
- the domain library is composed of multiple sub-libraries. Taking the smart set-top box as an example, the domain library includes a music library, a movie/TV library, and an album library.
- Music library: I want to listen to music, play a song, ...
- Movie/TV library: watch a movie, I want to watch a war film, ...
- Album library: open the album, slideshow, ...
- the device library mainly refers to the states of the devices involved in the semantic understanding system. Taking the smart set-top box as an example:
- Television: music, movies, albums, ...
- Music: listen, open, close, fast-forward, ...
- Album: open, close, zoom in, ...
- Movies: watch, search, ...
- Taking the smart home as an example:
- Light: turn on, turn off, ...
- Air conditioner: turn on, turn off, cool, heat, dehumidify, ...
- the domain lexicon is mainly used for information extraction, covering special domain vocabulary such as the locations of home devices and movie/TV titles; the specific format is as follows:
- Devide_location: master bedroom, living room, kitchen, ...
- Music_name: Ode to Joy, Childhood, Crossing the Ocean to See You, ...
- Video_name: Ode to Joy, The Best of Us, Emergency Doctor, ...
- the json message collection module mainly includes a voice/text message and a device status message, and the specific format is as follows:
- the "zxvca_text” is the content of the message after the text message or the voice recognition
- the "zxvca_device” is the device state, which is an array form, which can be adjusted according to the number of real devices.
- the memory module is one of the core modules protected by the patent; it mainly stores user history message data in a mesh structure.
- the specific storage format is shown in FIG. 6, which is a schematic diagram of user history data storage in the memory module according to an embodiment of the present disclosure; the content includes the voice/text message, the domain of the current message, the intent, the message time, and so on.
- subsequently, based on user habits, big data analysis, mining, and reasoning can determine the user's true intention, reducing the number of interactions and making the system more intelligent.
- a new user's intent can also be inferred from the data of most users. For example, if users A and B say "Ode to Joy" and interaction reveals that they want to listen to music, then when user C also says "Ode to Joy", it can be directly inferred that user C wants to listen to the song Ode to Joy rather than to music in general.
- this module can also be used in other product businesses such as recommendation systems and user-profile analysis.
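- A sketch of one stored history record, assuming the four fields named above (message, domain, intent, time); the exact mesh storage layout of FIG. 6 is not reproduced here:

```python
from datetime import datetime

# One illustrative history record; the field names mirror the content
# described above but the concrete schema is an assumption.
history_record = {
    "message": "Ode to Joy",              # voice/text message
    "domain": "music",                    # domain of the current message
    "intent": "listen",                   # recognized intent
    "time": datetime.now().isoformat(),   # message time
}
# Records from many users can later be mined to infer a new user's intent,
# e.g. most users who say "Ode to Joy" intend to listen to the song.
print(history_record)
```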
- the domain identification module is one of the core modules protected by this patent.
- the domain identification framework is shown in FIG. 7, which is a framework diagram of the domain recognition model of the embodiment of the present disclosure.
- the domain classification model framework is shown in FIG. 7, where the parameter set of the network structure is the domain model; it is implemented with multiple binary classification RANN models, split into an offline training part and an online use part.
- the model framework supports continuous expansion of domains, i.e. device scenes, avoiding repeatedly retraining models on big data when new corpora are added and reducing training time cost.
- the algorithm mainly includes the following five parts, described below in the context of the smart set-top box application scenario.
- the device is a television numbered 1, and the scene states music, video, and album are numbered 1 0 0, 0 1 0, and 0 0 1 respectively.
- the user message is "play a song" and the device state is "television, album".
- Input layer: the user message text and the various device states.
- Vectorization: mainly includes two parts, sentence vectorization and device state vectorization.
- Sentence vectorization: the user message is segmented into words, and the word2vec vectors of all words are summed to give the sentence vector.
- Device state vectorization consists of two parts: the device number vector and the scene state vector.
- The current device scene state is therefore: 1 0 0 1.
- Hidden layer: b_h = f(W_ih x_t + W_h'h b_{h-1}) + b, where f is the activation function, W_ih is the weight between the input layer and the hidden layer, and W_h'h is the recurrent weight of the hidden layer. As the black box of deep learning, the hidden layer mainly concerns the activation function, the number of hidden-layer neurons, and the number of hidden layers; these parameters can be adjusted for the specific application scenario, and there is no uniform standard.
- Output layer: multiple logistic regression functions are applied to the hidden-layer output, giving N groups of binary vectors, where position 0 represents not belonging to the domain and position 1 represents belonging to it.
- In this scenario the output layer consists of three logistic regression models: L1 (music or not), L2 (movie/TV or not), and L3 (album or not).
- The final output of the output layer is three groups of binary vectors: 0.1 0.9, 0.8 0.2, and 0.9 0.1.
- Label normalization: the N binary vectors of the output layer are converted into an N-ary vector by extracting the position of the maximum value of each binary vector.
- The final output value for the current scenario is 1 0 0, i.e. the message belongs to the music domain.
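- A minimal numpy sketch of this five-part pipeline for the worked example (television numbered 1, scene "album" = 0 0 1, so the state vector is 1 0 0 1); the embeddings and weights are random stand-ins rather than a trained model, and tanh is an assumed activation:

```python
import numpy as np

rng = np.random.default_rng(0)
word2vec = {w: rng.normal(size=8) for w in ["play", "a", "song"]}   # stand-in embeddings

def sentence_vector(tokens):
    return np.sum([word2vec[t] for t in tokens], axis=0)            # word2vec sum

# Input layer + vectorization: sentence vector, then device number + scene state.
x = np.concatenate([sentence_vector(["play", "a", "song"]), [1, 0, 0, 1]])

W_ih = rng.normal(size=(16, x.size))                                # input -> hidden weights
b_h = np.tanh(W_ih @ x)                                             # hidden layer, f = tanh

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Three independent binary heads: L1 music? L2 movie/TV? L3 album?
heads = [rng.normal(size=(2, 16)) for _ in range(3)]
pairs = [sigmoid(W @ b_h) for W in heads]                           # N groups of binary vectors

# Label normalization: position of the maximum in each binary pair.
label = [int(np.argmax(p) == 1) for p in pairs]
print(label)   # e.g. [1, 0, 0] would mean the message belongs to the music domain
```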
- Offline training: the training corpus format is device status + text + label, separated by "|", for example:
- TV movie|play a song|1 0 0
- TV music|Ode to Joy|1 0 0
- TV album|Ode to Joy|1 1 0
- the label length equals the number of domains: position 1 represents "music", position 2 represents "movie/TV", and position 3 represents "album".
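- A small sketch of parsing one training line in the "device status|text|label" format described above (the sample line mirrors the examples given later in the description):

```python
def parse_corpus_line(line):
    # Format: device status | text | label, "|"-separated.
    state, text, label = line.split("|")
    return state.strip(), text.strip(), [int(b) for b in label.split()]

state, text, label = parse_corpus_line("TV movie|play a song|1 0 0")
print(state, text, label)   # TV movie  play a song  [1, 0, 0] -> position 1 = "music"
```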
- Online use: after the user message is segmented, the results of the multiple binary classification models are used to determine which domains the message belongs to.
- the result can be single or multiple, for example:
- Single-domain result: for the user message "play Ode to Joy" with device state "television, music", the model yields the label 1 0 0, i.e. the message belongs to the music domain.
- Multi-domain result: for the user message "Ode to Joy" with device state "television, album", the model yields the label 1 1 0, i.e. the message belongs to both the music domain and the movie/TV domain.
- the intent identification module is one of the core modules protected by this patent.
- compared with domains, intents are relatively stable, so this patent implements intent recognition with a multi-class classification algorithm. The intents in the device library are converted into multiple labels, and the function is implemented with the multi-class RANN algorithm, split into offline training and online use.
- the intent recognition model framework is shown in FIG. 8.
- FIG. 8 is a framework diagram of the intent recognition model of an embodiment of the present disclosure, where the parameter set of the network structure is the intent model. It is similar to the domain recognition model, except that the output layer is changed to the softmax function and the model structure is changed to a multi-class model.
- the algorithm mainly includes the following four parts, described below in the context of the smart set-top box application scenario.
- the device is a television numbered 1, and the scene states music, video, and album are numbered 1 0 0, 0 1 0, and 0 0 1 respectively. Considering that some queries involve no action, i.e. no intent, the smart set-top box is assumed to have the following intents: open, watch, listen, and other (no intent).
- 1 0 0 0 represents "open", 0 1 0 0 represents "watch", 0 0 1 0 represents "listen", and 0 0 0 1 represents "other".
- the user message is "play a song" and the device state is "television, album".
- Input layer: the user message text and the various device states.
- Vectorization: mainly includes two parts, sentence vectorization and device state vectorization.
- Sentence vectorization: the user message is segmented into words, and the word2vec vectors of all words are summed to give the sentence vector.
- Device state vectorization consists of two parts: the device number vector and the scene state vector.
- The current device scene state is therefore: 1 0 0 1.
- Hidden layer: b_h = f(W_ih x_t + W_h'h b_{h-1}) + b, where f is the activation function, W_ih is the weight between the input layer and the hidden layer, and W_h'h is the recurrent weight of the hidden layer. As the black box of deep learning, the hidden layer mainly concerns the activation function, the number of hidden-layer neurons, and the number of hidden layers; these parameters can be adjusted for the specific application scenario, and there is no uniform standard.
- Output layer: softmax normalization is applied to the hidden-layer output, y = softmax(W_hk b_h), where W_hk is the weight between the hidden layer and the output layer.
- In this scenario the output layer is a 4-ary vector, and the position corresponding to the maximum value is the current user's true intent.
- The model result is 0.02 0.05 0.9 0.03, i.e. the intent is "listen".
- Offline training: the training corpus format is device status + text + label, separated by "|".
- Training yields the intent recognition model, where 1 0 0 0 represents "open", 0 1 0 0 represents "watch", 0 0 1 0 represents "listen", and 0 0 0 1 represents "other".
- Online use: after the user message is segmented, the multi-class model is loaded to obtain the prediction. For example, for the user message "play an Andy Lau song" with device state "television, album", the model result is 0.02 0.05 0.9 0.03, i.e. the intent is "listen".
- Module 205 (is the domain intent clear?): this module is one of the core modules protected by the patent. It is mainly configured to decide whether the process needs to enter an interactive state, accurately determining the user's intent while adding a human-like interaction mechanism. It mainly judges the cases of multiple domains, no intent, or both domain and intent missing.
- the interactive content is returned in the JSON message together with the instruction parsing result; in specific business applications, whether to interact can be chosen flexibly.
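- A sketch of the decision this module makes, assuming domain results arrive as a list and "other" marks the no-intent case (the function name is illustrative):

```python
def needs_interaction(domains, intent):
    # Enter the interactive state when the domain is ambiguous (multiple hits),
    # missing (no hits), or the intent is missing ("other").
    return len(domains) != 1 or intent == "other"

print(needs_interaction(["music", "video"], "search"))  # True  -> ask the user
print(needs_interaction(["music"], "listen"))           # False -> generate instruction
```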
- the information extraction module is a necessary module of this semantic understanding suite.
- General knowledge, mainly including dates, places, person names, etc., is implemented with the sequence labeling algorithm LSTM+CRF, a classic algorithm in the industry.
- Domain knowledge, such as singers, actors, film/TV production regions, and music styles, requires the corresponding domain lexicon and uses index matching.
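- A sketch of the index-matching step for domain knowledge, assuming a lexicon keyed by slot name; the LSTM+CRF tagger for general knowledge is not reproduced here:

```python
# Slot names and sample entries follow the domain lexicon described earlier.
DOMAIN_LEXICON = {
    "Devide_location": ["master bedroom", "living room", "kitchen"],
    "Video_name": ["Ode to Joy", "The Best of Us", "Emergency Doctor"],
}

def extract_domain_slots(text):
    # Longest-match lookup of lexicon entries inside the message text.
    hits = {}
    for slot, entries in DOMAIN_LEXICON.items():
        for entry in sorted(entries, key=len, reverse=True):
            if entry.lower() in text.lower():
                hits[slot] = entry
                break
    return hits

print(extract_domain_slots("search Ode to Joy in the living room"))
# -> {'Devide_location': 'living room', 'Video_name': 'Ode to Joy'}
```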
- the output module generates a semantic JSON instruction message; it is a core module of the patent and makes information collection by log capture convenient.
- the message format is as follows:
- "zxvca_text" is the text message content, or the message content after speech recognition;
- "zxvca_result" is the domain intent recognition result, in array form, including the domain, the intent, and the score corresponding to that domain;
- "zxvca_info" is the information extraction result, in array form, including person names, times, places, etc.; other content to be extracted can be added according to product needs.
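- The concrete JSON instruction message is not reproduced in this text; a sketch consistent with the three field names described above (all values illustrative) could be:

```python
import json

# Only the three field names come from the text above; values are illustrative.
result_message = {
    "zxvca_text": "play an Andy Lau song",
    "zxvca_result": [                      # domain, intent, and the domain's score
        {"domain": "music", "intent": "listen", "score": 0.90},
    ],
    "zxvca_info": [                        # extracted information, extensible per product
        {"name": "Andy Lau"},
    ],
}
print(json.dumps(result_message, ensure_ascii=False, indent=2))
```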
- the present disclosure provides multiple implementations and steps using a home service robot, a smart set-top box, intelligent conference control, and a smart vehicle as specific examples.
- FIG. 9 is a framework diagram of the home service robot in Example 1;
- FIG. 10 is a flowchart of the home service robot in Example 1.
- This example mainly describes the following application scenario: multiple devices and multiple scenes, not currently in an interaction, where the instruction parsing result requires interaction.
- the home service robot scene is set to lights, air conditioners, curtains, and the like.
- the home intelligent central control collects user messages and home device status messages. Operations here include, but are not limited to, voice commands, remote-control commands, smart-terminal touch-screen operations, gesture commands, and the like.
- the smart central control separately collects user messages and device status messages.
- In FIG. 9, via data stream 2, the semantic understanding platform receives user messages and home device status messages, for example:
- domain recognition is performed according to module 702 of FIG. 10, and the result is "light" or "television"; intent recognition is performed according to module 703 of FIG. 10, and the result is "brighten".
- In FIG. 9, via data stream 3, the semantic understanding platform sends an instruction message to the home intelligent central control; the message content is as follows:
- In FIG. 9, via data stream 4, the intelligent central control chooses to interact as required or directly distributes the instruction to the corresponding device to operate it.
- FIG. 11 is a framework diagram of the smart set-top box in Example 2;
- FIG. 12 is a flowchart of the smart set-top box in Example 2.
- This example mainly describes the following application scenario: a single device with multiple scenes, not currently in an interaction, where the instruction parsing result requires interaction.
- the smart set-top box scene is set to video, music, photo album, and so on.
- the smart set top box collects user messages as well as TV interface status messages.
- Operations herein include, but are not limited to, voice commands, remote control commands, smart terminal touch screen operations, gesture commands, and the like.
- the smart set top box collects user messages and device status messages, respectively.
- In FIG. 11, via data stream 2, the semantic understanding platform receives user messages and device status messages and performs context understanding, for example:
- domain recognition is performed according to module 902 of FIG. 12, and the result is "music" or "movie/TV"; intent recognition is performed according to module 903 of FIG. 12, and the result is "search".
- In FIG. 11, via data stream 3, the semantic understanding platform sends an instruction message to the smart set-top box; the message content is as follows:
- In FIG. 11, via data stream 4, the smart set-top box chooses to interact as required or directly sends the instruction to the television to operate it.
- FIG. 13 is a framework diagram of the intelligent conference control in Example 3;
- FIG. 14 is a flowchart of the intelligent conference control in Example 3.
- This example mainly describes the following application scenario: multiple devices and multiple scenes, not currently in an interaction, where the instruction parsing result requires no interaction.
- the intelligent conference control scene is set to command operation and fault diagnosis.
- the intelligent conference control terminal collects user messages. Operations herein include, but are not limited to, voice commands, remote control commands, smart terminal touch screen operations, gesture commands, and the like.
- the intelligent conference control terminal separately collects user messages and device status messages.
- not in an interaction: domain recognition according to module 1102 of FIG. 14 yields "microphone", and intent recognition according to module 1103 of FIG. 14 yields "volume up"; according to module 1104 of FIG. 14, the domain intent is judged to be clear.
- information extraction is performed according to module 1105 of FIG. 14; there is no content to extract.
- In FIG. 13, via data stream 3, the semantic understanding platform sends an instruction message to the intelligent conference control terminal; the message format is as follows:
- In FIG. 13, via data stream 4, the intelligent conference control terminal distributes the instruction to the corresponding device to operate it.
- FIG. 15 is a framework diagram of the smart vehicle in Example 4;
- FIG. 16 is a flowchart of the smart vehicle in Example 4.
- This example mainly describes the following application scenario: multiple devices and multiple scenes, currently in an interaction, where the instruction parsing result requires no interaction.
- the smart vehicle scene is set to making calls, listening to music, navigation, and the like. The smart vehicle collects user messages.
- Operations herein include, but are not limited to, voice commands, remote control commands, smart terminal touch screen operations, gesture commands, and the like.
- the smart vehicle collects user messages and device status messages, respectively.
- In FIG. 15, via data stream 4, the smart vehicle distributes the instruction to the corresponding device to operate it.
- Embodiments of the present disclosure also provide a storage medium having stored therein a computer program, wherein the computer program is configured to execute the steps of any one of the method embodiments described above.
- the above storage medium may be configured to store a computer program for performing the following steps:
- S1: the first control instruction and the state information of the object to be controlled are acquired on the first device, where a communication connection is established between the first device and the object to be controlled;
- S2: the target object that the first control instruction requests to control is determined from the object to be controlled according to the state information.
- the foregoing storage medium may include, but is not limited to, various media that can store a computer program, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
- Embodiments of the present disclosure also provide an electronic device including a memory and a processor having a computer program stored therein, the processor being configured to execute a computer program to perform the steps of any one of the method embodiments described above.
- the electronic device may further include a transmission device and an input and output device, wherein the transmission device is connected to the processor, and the input and output device is connected to the processor.
- the processor may be configured to perform the following steps by using a computer program:
- S1: the first control instruction and the state information of the object to be controlled are acquired on the first device, where a communication connection is established between the first device and the object to be controlled;
- S2: the target object that the first control instruction requests to control is determined from the object to be controlled according to the state information.
- the modules or steps of the present disclosure described above can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they may be implemented by program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from that herein, or they may be made into individual integrated circuit modules, or multiple modules or steps among them may be made into a single integrated circuit module. As such, the disclosure is not limited to any specific combination of hardware and software.
- the method and apparatus for determining a control object, the storage medium, and the electronic device provided by the embodiments of the present invention have the following beneficial effects: they solve the technical problem in the related art that the steps for determining the target object are too cumbersome, reduce the number of interactions between the central control and the user, make the central control more intelligent, and improve the user experience.
Abstract
The present disclosure provides a method and apparatus for determining a control object, a storage medium, and an electronic device. The method includes: acquiring, on a first device, a first control instruction and state information of an object to be controlled, where a communication connection is established between the first device and the object to be controlled; and determining, from the object to be controlled according to the state information, the target object that the first control instruction requests to control. The present disclosure solves the technical problem in the related art that the steps for determining the target object are too cumbersome.
Description
The present disclosure relates to the field of communications, and in particular to a method and apparatus for determining a control object, a storage medium, and an electronic device.

In the related art, various intelligent interactive devices are growing explosively, such as JD's DingDong speaker, the Amazon Echo, and smart set-top boxes. Semantic understanding is one of the key points and difficulties of current intelligent interactive devices, mainly reflected in multidimensional scene expansion and the level of context understanding.

For multidimensional scene expansion, the related art mainly extends scene parsers continually through business customization. In that solution the dialogue management mechanism is determined by the scenario: when a new scenario is added, a management mechanism must be re-customized, the implementation process is complicated, and rapid expansion is impossible. In addition, scene recognition understands the domain of the current message only at a shallow level and cannot deeply understand the user's true intention.

In the related art, existing solutions apply only to pure voice/text intelligent interactive devices; artificial intelligence technology has not yet reached a state where it can be applied with true freedom.

Thus, if current solutions are handled by the dialogue management module of a semantic understanding system, scenes may be switched incorrectly or commands may not be understood. For example, the user first presses a switch to turn on the bedroom light and then says "too dark". The user actually wants the light brightened, but the intelligent central control cannot correctly understand this instruction.

For the above problems in the related art, no effective solution has yet been found.
Summary
Embodiments of the present disclosure provide a method and apparatus for determining a control object, a storage medium, and an electronic device.

According to one embodiment of the present disclosure, a method for determining a control object is provided, including: acquiring, on a first device, a first control instruction and state information of an object to be controlled, where a communication connection is established between the first device and the object to be controlled; and determining, from the object to be controlled according to the state information, the target object that the first control instruction requests to control.

According to another embodiment of the present disclosure, an apparatus for determining a control object is provided, including: an acquiring module configured to acquire, on a first device, a first control instruction and state information of an object to be controlled, where a communication connection is established between the first device and the object to be controlled; and a determining module configured to determine, from the object to be controlled according to the state information, the target object that the first control instruction requests to control.

According to yet another embodiment of the present disclosure, a storage medium is further provided, in which a computer program is stored, where the computer program is configured to perform the steps in any one of the above method embodiments when run.

According to yet another embodiment of the present disclosure, an electronic device is further provided, including a memory and a processor, where the memory stores a computer program and the processor is configured to run the computer program to perform the steps in any one of the above method embodiments.

Through the present disclosure, the state information of the object to be controlled is acquired, and the target object that the first control instruction requests to control is determined according to that state information. This solves the technical problem in the related art that the steps for determining the target object are too cumbersome, reduces the number of interactions between the central control and the user, makes the central control more intelligent, and improves the user experience.
The drawings described here are used to provide a further understanding of the present disclosure and constitute a part of this application. The exemplary embodiments of the present disclosure and their descriptions are used to explain the present disclosure and do not constitute an improper limitation of the present disclosure. In the drawings:
FIG. 1 is a network architecture diagram of an embodiment of the present disclosure;

FIG. 2 is a flowchart of a method for determining a control object according to an embodiment of the present disclosure;

FIG. 3 is a structural block diagram of an apparatus for determining a control object according to an embodiment of the present disclosure;

FIG. 4 is an overall system architecture diagram of an embodiment of the present disclosure;

FIG. 5 is a flowchart of the deep semantic understanding module of an embodiment of the present disclosure;

FIG. 6 is a schematic diagram of user history data storage in the memory module of an embodiment of the present disclosure;

FIG. 7 is a framework diagram of the domain recognition model of an embodiment of the present disclosure;

FIG. 8 is a framework diagram of the intent recognition model of an embodiment of the present disclosure;

FIG. 9 is a framework diagram of the home service robot in Example 1;

FIG. 10 is a flowchart of the home service robot in Example 1;

FIG. 11 is a framework diagram of the smart set-top box in Example 2;

FIG. 12 is a flowchart of the smart set-top box in Example 2;

FIG. 13 is a framework diagram of the intelligent conference control in Example 3;

FIG. 14 is a flowchart of the intelligent conference control in Example 3;

FIG. 15 is a framework diagram of the smart vehicle in Example 4;

FIG. 16 is a flowchart of the smart vehicle in Example 4.
The present disclosure is described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that the embodiments in this application, and the features in the embodiments, may be combined with each other without conflict.

It should be noted that the terms "first", "second", and the like in the specification, claims, and drawings of the present disclosure are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.
Embodiment 1
The embodiments of this application can run on the network architecture shown in FIG. 1. FIG. 1 is a network architecture diagram of an embodiment of the present disclosure. As shown in FIG. 1, the network architecture includes a central control and the objects controlled by the central control, where the central control controls each object according to control instructions.

This embodiment provides a method for determining a control object that runs on the above network architecture. FIG. 2 is a flowchart of the method for determining a control object according to an embodiment of the present disclosure. As shown in FIG. 2, the process includes the following steps:

Step S202: a first control instruction and state information of an object to be controlled are acquired on a first device, where a communication connection is established between the first device and the object to be controlled;

Step S204: the target object that the first control instruction requests to control is determined from the object to be controlled according to the state information.

Through the above steps, the state information of the object to be controlled is acquired, and the target object that the first control instruction requests to control is determined according to that state information. This solves the technical problem in the related art that the steps for determining the target object are too cumbersome, reduces the number of interactions between the central control and the user, makes the central control more intelligent, and improves the user experience.

Optionally, the execution body of the above steps, i.e. the above first device, may be a central control (control unit) such as a speaker, a mobile phone, a set-top box, a robot, an in-vehicle device, or a smart housekeeper, but is not limited thereto. Of course, the first control instruction and the state information of the object to be controlled may also be acquired elsewhere than on the first device, i.e. acquired directly; in that case the execution body is not the first device but a communication device connected to the first device, such as a control device of the first device.
In one implementation, determining, from the object to be controlled according to the state information, the target object that the first control instruction requests to control includes:

parsing the state information of the object to be controlled, and determining the target object from the object to be controlled according to a predetermined correspondence, where the predetermined correspondence is used to describe the correspondence between state information and target objects. For example, if the state information of a first object is the on state or the standby state, it is a target object; if the state information of a second object is the off state, it is a non-target object; if the state information of a third object is the foreground display state, it is a target object; and if the state information of a fourth object is the background running state, it is a non-target object.

Optionally, determining the target object from the object to be controlled according to the predetermined correspondence includes the following examples:

Example 1: an object to be controlled whose switch state is on is determined as the target object;

Example 2: the object to be controlled whose turn-on time is closest to the current time is determined as the target object; the shortest interval between the turn-on time and the current time can be understood as identifying the object the user has just turned on. In other examples, an object whose frequency of use is greater than a predetermined value (or is the highest) may be determined as the target object, or an object whose working state changed within a predetermined time (e.g., an application that switched from background running to foreground display 3 s ago) may be determined as the target object.

The state information includes at least one of the following: switch state, turn-on time, frequency of use, and so on.

In one implementation, determining, from the object to be controlled according to the state information, the target object that the first control instruction requests to control includes:

S11: determining specified state information of the object to be controlled according to the first control instruction;

S12: determining an object to be controlled whose state information matches the specified state information as the target object. For example, if the first control instruction is "turn on", the specified state information of the object to be controlled is the off state, because the user cannot turn on an object that is already on; if the first control instruction is "turn up the volume", the specified state information of the object to be controlled is a state in which the current volume is below a predetermined threshold, and so on.

Optionally, determining an object to be controlled whose state information matches the specified state information as the target object includes: determining an object to be controlled for which the similarity between its working state and the specified state information is higher than a preset threshold as the target object, where the state information includes the working state. Alternatively, an object to be controlled for which the similarity between its working state and the specified state information is lower than the preset threshold may be determined as the target object.
Optionally, after the target object that the first control instruction requests to control is determined from the object to be controlled according to the state information, the method further includes:

in a case where the target object is determined from the object to be controlled, sending a second control instruction to the target object through the first device, where the second control instruction instructs the target object to execute the operation requested by the first control instruction; and in a case where no target object is determined from the object to be controlled, returning, through the first device, feedback information for confirming the first control instruction.

In this embodiment, acquiring the first control instruction from the first device includes the following acquisition modes:

voice information is collected by the first device, where the voice information carries feature information, and the first control instruction is generated according to the feature information;

a text message is received from the first device, where the text message carries feature information, and the first control instruction is generated according to the feature information;

a remote-control instruction is received from the first device, and the first control instruction is generated according to the remote-control instruction;

a control gesture is received from the first device, feature information is extracted from the control gesture, and the first control instruction is generated according to the feature information.
In this embodiment, after the first control instruction is acquired on the first device, the first control instruction may also be recognized, and the target object then determined according to the first control instruction. This may be used together with determining the target object according to the state information, including selecting one of the two to determine the target object, or, when one determination method yields many target objects, using the other determination method to further narrow the range of target objects. Determining the target object according to the first control instruction includes:

S21: recognizing the first control instruction and determining the control domain of the first control instruction;

S22: determining an object to be controlled whose domain is the same as the control domain as the target object;

Optionally, recognizing the first control instruction includes one of the following: recognizing the first control instruction using a data model preset on the first device, where the data model includes databases of multiple domains; or recognizing the first control instruction online through a network server. Before the first control instruction is recognized using the data model preset on the first device, the data model may also be trained through a neural network; when training the data model, the domain and the state information need to be input into the data model as label vectors.

Through the description of the above implementations, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present disclosure, in essence or in the part contributing to the prior art, can be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc), including several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present disclosure.
Embodiment 2
This embodiment further provides an apparatus for determining a control object. The apparatus is used to implement the above embodiments and preferred implementations; what has already been explained is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

FIG. 3 is a structural block diagram of the apparatus for determining a control object according to an embodiment of the present disclosure. As shown in FIG. 3, the apparatus includes:

an acquiring module 30 configured to acquire, on a first device, a first control instruction and state information of an object to be controlled, where a communication connection is established between the first device and the object to be controlled; and

a determining module 32 configured to determine, from the object to be controlled according to the state information, the target object that the first control instruction requests to control.

Optionally, the determining module includes: a first determining unit configured to parse the state information of the object to be controlled and determine the target object from the object to be controlled according to a predetermined correspondence, where the predetermined correspondence is used to describe the correspondence between state information and target objects.

Optionally, the determining module includes: a second determining unit configured to determine specified state information of the object to be controlled according to the first control instruction; and a third determining unit configured to determine an object to be controlled whose state information matches the specified state information as the target object.

Optionally, the apparatus of this embodiment further includes: a sending module configured to, after the determining module determines, from the object to be controlled according to the state information, the target object that the first control instruction requests to control, and in a case where the target object is determined from the object to be controlled, send a second control instruction to the target object through the first device, where the second control instruction instructs the target object to execute the operation requested by the first control instruction.

It should be noted that each of the above modules may be implemented by software or hardware. For the latter, this may be achieved in, but is not limited to, the following ways: the above modules are all located in the same processor; or the above modules are located in different processors in any combination.
Embodiment 3
This embodiment explains and illustrates the solution of this application in detail with examples in different scenarios:

This embodiment provides a multi-scene collaborative-interaction intelligent semantic understanding system. The system suits multiple scenarios and can be embedded in various voice/text interaction devices such as smart speakers, smart phones, and smart set-top boxes. It involves natural language processing, semantic analysis and understanding, artificial intelligence, and other fields. The multi-device (multi-scene) collaborative-interaction semantic understanding system provided by this embodiment can be applied to various smart device interaction systems such as smart homes, smart phones, and smart vehicles. The semantic understanding system can receive voice and text input and receive, in real time, scene status messages from an indefinite number of smart devices. Finally, the semantic understanding platform fuses the multiple kinds of information and, through multiple rounds of interaction, deeply understands the user's intention, converting the user's control instructions into service instructions scheduled and executed by the smart devices.

This embodiment includes four major modules: a pre-processing module, a deep semantic understanding module, a result feedback module, and a data model management module.

Pre-processing module: pre-processes messages, including text error correction, pinyin-to-Chinese-character conversion, Chinese numeral conversion, and so on.

The deep semantic understanding module consists of three sub-modules: the domain recognition module, the intent recognition module, and the information extraction module.

Domain recognition module: initially identifies, in combination with the device states, the domain of the user message; the result may be one or multiple domains.

Intent recognition module: initially determines the user's intent, including action intents such as "listen", "watch", and "open", as well as domain-specific intents; for example, weather consultation includes "general query" and "focused query".

Information extraction module: when the domain and intent of the user message are clear, extracts information, including dates, places, singers, actors, etc., to deeply understand the user's intention.

The result feedback module consists of two sub-modules: the interaction module and the instruction generation module.

Interaction module: when the domain and intent of the user message are unclear, actively guides interaction to determine the user's intent.

Instruction generation module: for instruction-type messages, returns the operation to be performed for the user as a JSON string.

Data model management module: configured to maintain the algorithm libraries, rule bases, databases, and so on required by the pre-processing module and the deep semantic understanding module.
FIG. 4 is an overall system architecture diagram of an embodiment of the present disclosure. As shown in FIG. 4, the semantic understanding platform mainly collects voice/text messages and the states of an indefinite number of devices. The system mainly consists of two parts: the semantic understanding system and the data model. The semantic understanding system contains three modules: the pre-processing module, the deep semantic understanding module, and the result feedback module. The purpose of the pre-processing module is to make the user message text more standardized, preparing for the subsequent deep semantic understanding module. The result feedback module is used to respond to user messages. The deep semantic understanding module is the core functional module of the system.

The deep semantic understanding module is a set of general scene semantic understanding frameworks supporting multidimensional scene expansion. Expanding to a new scene only requires maintaining the corresponding corpus, without redefining a new framework.

Compared with existing solutions in the industry, this system is more intelligent and user-friendly, reduces system maintenance costs, and can be applied to various intelligent interactive devices.

FIG. 5 is a flowchart of the deep semantic understanding module of an embodiment of the present disclosure. As shown in FIG. 5, the module is a set of general scene semantic understanding frameworks; expanding to a new scene only requires maintaining the corresponding corpus, without redefining a new framework, making the system more intelligent. In addition, the module provides a function for receiving device scene state messages, can be used in smart devices where multiple interaction modes coexist, and better realizes context understanding; it is therefore one of the core modules of the present disclosure.

The system can be used in a multi-device control system, such as a smart home, where the domains are the individual devices and the intents are the actions controlling each device; or in a single-device multi-scene control system, such as a smart set-top box, where there is only one device, the television, the scenes include albums, movies, music, and so on, the domains are the television-related scenes, and the intents are the actions controlling each scene.
Corpus preparation mainly includes three parts: the domain library, the device library, and the domain lexicon. The domain library is composed of multiple sub-libraries. Taking the smart set-top box as an example, the domain library includes a music library, a movie/TV library, and an album library.

Music library: I want to listen to music, play a song, ...

Movie/TV library: watch a movie, I want to watch a war film, ...

Album library: open the album, slideshow, ...

The device library mainly refers to the states of the devices involved in the semantic understanding system. Taking the smart set-top box as an example:

Television: music, movies, albums, ...

Music: listen, open, close, fast-forward, ...

Album: open, close, zoom in, ...

Movies: watch, search, ...

Taking the smart home as an example:

Light: turn on, turn off, ...

Air conditioner: turn on, turn off, cool, heat, dehumidify, ...

The domain lexicon is mainly used for information extraction, covering special domain vocabulary such as the locations of home devices and movie/TV titles. The specific format is as follows:

Devide_location: master bedroom, living room, kitchen, ...

Music_name: Ode to Joy, Childhood, Crossing the Ocean to See You, ...

Video_name: Ode to Joy, The Best of Us, Emergency Doctor, ...
Each sub-module of FIG. 5 is described in detail below:

Module 201: the JSON message collection module mainly includes voice/text messages and device status messages; the specific format is as follows:

"zxvca_text" is the text message content, or the message content after speech recognition; "zxvca_device" is the device state, in array form, which can be adjusted according to the number of real devices.

Module 202: the memory module is one of the core modules protected by this patent. It mainly stores user history message data in a mesh structure; the specific storage format is shown in FIG. 6, which is a schematic diagram of user history data storage in the memory module of an embodiment of the present disclosure, covering the voice/text message, the domain of the current message, the intent, the message time, and so on. Subsequently, based on the memory module and user habits, big data analysis, mining, and reasoning can determine the user's true intention, reducing the number of interactions and making the system more intelligent. Meanwhile, a new user's intent can be inferred from the data of most users: for example, if users A and B say "Ode to Joy" and interaction shows that they want to listen to music, then when user C also says "Ode to Joy", it can be directly inferred that user C wants to listen to the song Ode to Joy rather than to music in general. This module can also be used in other product businesses such as recommendation systems and user-profile analysis.
Module 203: the domain recognition module is one of the core modules protected by this patent. The domain recognition framework is shown in FIG. 7, which is a framework diagram of the domain recognition model of an embodiment of the present disclosure.

It is implemented with multiple binary classification RANN models, split into an offline training part and an online use part. The domain classification model framework is shown in FIG. 7, where the parameter set of the network structure is the domain model. The model framework supports continuous expansion of domains, i.e. device scenes, avoiding repeatedly retraining models on big data when new corpora are added and reducing training time cost. The algorithm mainly includes the following five parts, described in detail below in the context of the smart set-top box application scenario.

The device is a television numbered 1, and the scene states music, video, and album are numbered 1 0 0, 0 1 0, and 0 0 1 respectively. The user message is "play a song" and the device state is "television, album".

Input layer: the user message text and the various device states.

Vectorization: mainly includes two parts, sentence vectorization and device state vectorization.

Sentence vectorization: the user message is segmented into words, and the word2vec vectors of all words are summed to give the sentence vector. Device state vectorization consists of two parts, the device number vector and the scene state vector, so the current device scene state is: 1 0 0 1.
Hidden layer: b_h = f(W_ih x_t + W_h'h b_{h-1}) + b, where f is the activation function, W_ih is the weight between the input layer and the hidden layer, and W_h'h is the recurrent weight of the hidden layer. As the black box of deep learning, the hidden layer mainly concerns the activation function, the number of hidden-layer neurons, and the number of hidden layers; these parameters can be adjusted for the specific application scenario, and there is no uniform standard.
Output layer: multiple logistic regression functions are applied to the hidden-layer output, giving N groups of binary vectors, where position 0 represents not belonging to the domain and position 1 represents belonging to it. In this scenario the output layer consists of three logistic regression models: L1 (music or not), L2 (movie/TV or not), and L3 (album or not). The final output of the output layer is three groups of binary vectors: 0.1 0.9, 0.8 0.2, and 0.9 0.1.

Label normalization: the N binary vectors of the output layer are converted into an N-ary vector by extracting the position of the maximum value of each binary vector. The final output value for the current scenario is 1 0 0, i.e. the message belongs to the music domain.

The offline training corpus and online use of the domain model are introduced below:
Offline training: the training corpus format is device status + text + label, separated by "|", as follows:

TV movie|play a song|1 0 0

TV music|Ode to Joy|1 0 0

TV movie|Ode to Joy|0 1 0

TV album|Ode to Joy|1 1 0

TV movie|open music|1 0 0

TV music|open album|0 0 1

TV music|watch a movie|0 1 0

The label length equals the number of domains: position 1 represents "music", position 2 represents "movie/TV", and position 3 represents "album".
Online use: after the user message is segmented, the results of the multiple binary classification models are used to determine which domains the message belongs to; the result can be single or multiple. Examples:

Single-domain result:

For the user message "play Ode to Joy" with device state "television, music", the model yields the label 1 0 0, i.e. the message belongs to the music domain.

Multi-domain result:

For the user message "Ode to Joy" with device state "television, album", the model yields the label 1 1 0, i.e. the message belongs to both the music domain and the movie/TV domain.
Module 204: the intent recognition module is one of the core modules protected by this patent. Compared with domains, intents are relatively stable, so this patent implements intent recognition with a multi-class classification algorithm. The intents in the device library are converted into multiple labels, and the function is implemented with the multi-class RANN algorithm, split into offline training and online use. The intent recognition model framework is shown in FIG. 8, which is a framework diagram of the intent recognition model of an embodiment of the present disclosure, where the parameter set of the network structure is the intent model. It is similar to the domain recognition model, except that the output layer is changed to the softmax function and the model architecture is changed to a multi-class model. The algorithm mainly includes the following four parts, described in detail below in the context of the smart set-top box application scenario.

The device is a television numbered 1, and the scene states music, video, and album are numbered 1 0 0, 0 1 0, and 0 0 1 respectively. Considering that some queries involve no action, i.e. no intent, it is assumed here that the smart set-top box has the following intents: open, watch, listen, and other (no intent). 1 0 0 0 represents "open", 0 1 0 0 represents "watch", 0 0 1 0 represents "listen", and 0 0 0 1 represents "other". The user message is "play a song" and the device state is "television, album".

Input layer: the user message text and the various device states.

Vectorization: mainly includes two parts, sentence vectorization and device state vectorization.

Sentence vectorization: the user message is segmented into words, and the word2vec vectors of all words are summed to give the sentence vector. Device state vectorization consists of two parts, the device number vector and the scene state vector, so the current device scene state is: 1 0 0 1.
Hidden layer: b_h = f(W_ih x_t + W_h'h b_{h-1}) + b, where f is the activation function, W_ih is the weight between the input layer and the hidden layer, and W_h'h is the recurrent weight of the hidden layer. As the black box of deep learning, the hidden layer mainly concerns the activation function, the number of hidden-layer neurons, and the number of hidden layers; these parameters can be adjusted for the specific application scenario, and there is no uniform standard.

Output layer: softmax normalization is applied to the hidden-layer output, y = softmax(W_hk b_h), where W_hk is the weight between the hidden layer and the output layer. In this scenario the output layer is a 4-ary vector, and the position corresponding to the maximum value is the current user's true intent. The model result is 0.02 0.05 0.9 0.03, i.e. the intent is "listen".
The offline training corpus and online use of the intent model are introduced below:

Offline training: the training corpus format is device status + text + label, separated by "|"; specific examples are as follows:

TV movie|hello|0 0 0 1

TV movie|listen to music|0 0 1 0

TV music|open album|1 0 0 0

TV album|watch an Andy Lau movie|0 1 0 0

Training yields the intent recognition model, where 1 0 0 0 represents "open", 0 1 0 0 represents "watch", 0 0 1 0 represents "listen", and 0 0 0 1 represents "other".

Online use: after the user message is segmented, the multi-class model is loaded to obtain the prediction result. For example:

For the user message "play an Andy Lau song" with device state "television, album", the model result is 0.02 0.05 0.9 0.03, i.e. the intent is "listen".
Module 205: the "is the domain intent clear" module is one of the core modules protected by this patent. It is mainly configured to judge whether the process needs to enter an interactive state, accurately determining the user's intent while adding a human-like interaction mechanism. It mainly judges the cases of multiple domains, no intent, or both domain and intent missing.

Multi-domain problem: for example, the user says "search for Ode to Joy", and the domain recognition result is "music" or "movie/TV". Since the intent is unclear, interaction with the user is needed to determine what the user means.

No-intent problem: for example, the user says "Ode to Joy", and the intent recognition result is "other", i.e. no intent. At this point the system can interact with the user: "Do you want to play Ode to Joy or search for Ode to Joy video resources?"

Missing domain and intent: for example, the user says "hello"; both the domain and the intent are missing. At this point the system can interact with the user: "I can help you browse photos, watch movies, and listen to music."

The interactive content is returned in the JSON message together with the instruction parsing result; in specific business applications, whether to interact can be chosen flexibly.
Module 206: the information extraction module is a necessary module of this semantic understanding suite. General knowledge, mainly including dates, places, person names, etc., is implemented with the sequence labeling algorithm LSTM+CRF, a classic algorithm in the industry. Domain knowledge, such as singers, actors, film/TV production regions, and music styles, requires the corresponding domain lexicon and uses index matching.

Module 207: the output module generates a semantic JSON instruction message. It is a core module of this patent and makes information collection by log capture convenient. The message format is as follows:

"zxvca_text" is the text message content, or the message content after speech recognition; "zxvca_result" is the domain intent recognition result, in array form, including the domain, the intent, and the score corresponding to that domain; "zxvca_info" is the information extraction result, in array form, including person names, times, places, etc.; other content to be extracted can be added according to product needs.
The present disclosure provides multiple implementations and steps using a home service robot, a smart set-top box, intelligent conference control, and a smart vehicle as specific examples.
Example 1
For the home service robot, refer to FIG. 9 and FIG. 10. FIG. 9 is a framework diagram of the home service robot in Example 1, and FIG. 10 is a flowchart of the home service robot in Example 1.

This example mainly illustrates the following application scenario: multiple devices and multiple scenes, not currently in an interaction, where the instruction parsing result requires interaction.

The home service robot scene is set to lights, air conditioners, curtains, and the like. The home intelligent central control collects user messages and home device status messages. Operations here include, but are not limited to, voice commands, remote-control commands, smart-terminal touch-screen operations, gesture commands, and the like.

Via data streams 1A and 1B in FIG. 9, the intelligent central control collects user messages and device status messages respectively.

3) Via data stream 2 in FIG. 9, the semantic understanding platform receives user messages and home device status messages, for example:

4) Not in an interaction: domain recognition according to module 702 of FIG. 10 yields "light" or "television"; intent recognition according to module 703 of FIG. 10 yields "brighten".

5) According to module 704 of FIG. 10, the multi-domain intent is judged to be unclear, and interaction is needed to confirm the user's intent. Interactive content is generated: "Do you want to brighten the light or the television screen?"

6) Via data stream 3 in FIG. 9, the semantic understanding platform sends an instruction message to the home intelligent central control; the message content is as follows:

7) Via data stream 4 in FIG. 9, the intelligent central control chooses to interact as required or directly distributes the instruction to the corresponding device to operate it.
Example 2
For the smart set-top box, refer to FIG. 11 and FIG. 12. FIG. 11 is a framework diagram of the smart set-top box in Example 2, and FIG. 12 is a flowchart of the smart set-top box in Example 2.

This example mainly illustrates the following application scenario: a single device with multiple scenes, not currently in an interaction, where the instruction parsing result requires interaction.

The smart set-top box scene is set to movies/TV, music, albums, and the like. The smart set-top box collects user messages and television interface status messages. Operations here include, but are not limited to, voice commands, remote-control commands, smart-terminal touch-screen operations, gesture commands, and the like.

2) Via data streams 1A and 1B in FIG. 11, the smart set-top box collects user messages and device status messages respectively.

3) Via data stream 2 in FIG. 11, the semantic understanding platform receives user messages and home device status messages and performs context understanding, for example:

4) Not in an interaction: domain recognition according to module 902 of FIG. 12 yields "music" or "movie/TV"; intent recognition according to module 903 of FIG. 12 yields "search".

5) According to module 904 of FIG. 12, the multi-domain intent is judged to be unclear, and interaction is needed to confirm the user's intent. Interactive content is generated: "Do you want to watch a movie/TV program or listen to music?"

6) Via data stream 3 in FIG. 11, the semantic understanding platform sends an instruction message to the smart set-top box; the message content is as follows:

7) Via data stream 4 in FIG. 11, the smart set-top box chooses to interact as required or directly sends the instruction to the television to operate it.
Example 3
For intelligent conference control, refer to FIG. 13 and FIG. 14. FIG. 13 is a framework diagram of the intelligent conference control in Example 3, and FIG. 14 is a flowchart of the intelligent conference control in Example 3.

This example mainly illustrates the following application scenario: multiple devices and multiple scenes, not currently in an interaction, where the instruction parsing result requires no interaction.

The intelligent conference control scene is set to instruction operations and fault diagnosis. The intelligent conference control terminal collects user messages. Operations here include, but are not limited to, voice commands, remote-control commands, smart-terminal touch-screen operations, gesture commands, and the like.

2) Via data streams 1A and 1B in FIG. 13, the intelligent conference control terminal collects user messages and device status messages respectively.

3) Via data stream 2 in FIG. 13, the semantic understanding platform receives user messages and video-conference device status messages and performs context understanding, for example:

4) Not in an interaction: domain recognition according to module 1102 of FIG. 14 yields "microphone"; intent recognition according to module 1103 of FIG. 14 yields "volume up".

5) According to module 1104 of FIG. 14, the domain intent is judged to be clear. Information extraction is performed according to module 1105 of FIG. 14; there is no content to extract.

6) Via data stream 3 in FIG. 13, the semantic understanding platform sends an instruction message to the intelligent conference control terminal; the message format is as follows:

7) Via data stream 4 in FIG. 13, the intelligent conference control terminal distributes the instruction to the corresponding device to operate it.
Example 4
For the smart vehicle, refer to FIG. 15 and FIG. 16. FIG. 15 is a framework diagram of the smart vehicle in Example 4, and FIG. 16 is a flowchart of the smart vehicle in Example 4.

This example mainly illustrates the following application scenario: multiple devices and multiple scenes, currently in an interaction, where the instruction parsing result requires no interaction.

The smart vehicle scene is set to making calls, listening to music, navigation, and the like. The smart vehicle collects user messages. Operations here include, but are not limited to, voice commands, remote-control commands, smart-terminal touch-screen operations, gesture commands, and the like.

2) Via data streams 1A and 1B in FIG. 15, the smart vehicle collects user messages and device status messages respectively.

3) Via data stream 2 in FIG. 15, the semantic understanding platform receives user messages and vehicle status messages, for example:

4) In an interaction: according to module 1302 of FIG. 16, the domain and intent in memory are extracted; the result is the domain "telephone" and the intent "call".

5) According to module 1303 of FIG. 16, the domain intent is judged to be clear. Information extraction is performed according to module 1304 of FIG. 16; the result is the person name "Zhang San".

6) Via data stream 3 in FIG. 15, the semantic understanding platform sends an instruction message to the smart vehicle; the message format is as follows:

7) Via data stream 4 in FIG. 15, the smart vehicle distributes the instruction to the corresponding device to operate it.
Embodiment 4
Embodiments of the present disclosure further provide a storage medium in which a computer program is stored, where the computer program is configured to perform the steps in any one of the above method embodiments when run.

Optionally, in this embodiment, the above storage medium may be configured to store a computer program for performing the following steps:

S1: a first control instruction and state information of an object to be controlled are acquired on a first device, where a communication connection is established between the first device and the object to be controlled;

S2: the target object that the first control instruction requests to control is determined from the object to be controlled according to the state information.

Optionally, in this embodiment, the above storage medium may include, but is not limited to, various media that can store a computer program, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.

Embodiments of the present disclosure further provide an electronic device including a memory and a processor, where the memory stores a computer program and the processor is configured to run the computer program to perform the steps in any one of the above method embodiments.

Optionally, the above electronic device may further include a transmission device and an input/output device, where the transmission device is connected to the processor and the input/output device is connected to the processor.

Optionally, in this embodiment, the above processor may be configured to perform the following steps through a computer program:

S1: a first control instruction and state information of an object to be controlled are acquired on a first device, where a communication connection is established between the first device and the object to be controlled;

S2: the target object that the first control instruction requests to control is determined from the object to be controlled according to the state information.

Optionally, for specific examples in this embodiment, reference may be made to the examples described in the above embodiments and optional implementations; they are not repeated here.
Obviously, those skilled in the art should understand that the modules or steps of the present disclosure described above can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented by program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases, the steps shown or described can be performed in an order different from that here, or they can be made into individual integrated circuit modules, or multiple modules or steps among them can be made into a single integrated circuit module. In this way, the present disclosure is not limited to any specific combination of hardware and software.

The above are only preferred embodiments of the present disclosure and are not intended to limit the present disclosure; for those skilled in the art, the present disclosure may have various modifications and changes. Any modification, equivalent substitution, improvement, and the like made within the principles of the present disclosure shall be included within the protection scope of the present disclosure.

As described above, the method and apparatus for determining a control object, the storage medium, and the electronic device provided by the embodiments of the present invention have the following beneficial effects: they solve the technical problem in the related art that the steps for determining the target object are too cumbersome, reduce the number of interactions between the central control and the user, make the central control more intelligent, and improve the user experience.
Claims (16)
- A method for determining a control object, comprising: acquiring, on a first device, a first control instruction and state information of an object to be controlled, wherein a communication connection is established between the first device and the object to be controlled; and determining, from the object to be controlled according to the state information, a target object that the first control instruction requests to control.
- The method according to claim 1, wherein determining, from the object to be controlled according to the state information, the target object that the first control instruction requests to control comprises: parsing the state information of the object to be controlled, and determining the target object from the object to be controlled according to a predetermined correspondence, wherein the predetermined correspondence is used to describe the correspondence between state information and target objects.
- The method according to claim 2, wherein determining the target object from the object to be controlled according to the predetermined correspondence comprises one of the following: determining an object to be controlled whose switch state is on as the target object; determining the object to be controlled whose turn-on time is closest to the current time as the target object; wherein the state information comprises at least one of the following: switch state, turn-on time.
- The method according to claim 1, wherein determining, from the object to be controlled according to the state information, the target object that the first control instruction requests to control comprises: determining specified state information of the object to be controlled according to the first control instruction; and determining an object to be controlled whose state information matches the specified state information as the target object.
- The method according to claim 4, wherein determining an object to be controlled whose state information matches the specified state information as the target object comprises: determining an object to be controlled for which the similarity between its working state and the specified state information is higher than a preset threshold as the target object, wherein the state information comprises the working state.
- The method according to claim 1, wherein after determining, from the object to be controlled according to the state information, the target object that the first control instruction requests to control, the method further comprises: in a case where the target object is determined from the object to be controlled, sending a second control instruction to the target object through the first device, wherein the second control instruction is used to instruct the target object to execute the operation requested by the first control instruction.
- The method according to claim 1, wherein after determining, from the object to be controlled according to the state information, the target object that the first control instruction requests to control, the method further comprises: in a case where no target object is determined from the object to be controlled, returning, through the first device, feedback information for confirming the first control instruction.
- The method according to claim 1, wherein acquiring the first control instruction from the first device comprises at least one of the following: collecting voice information through the first device, wherein the voice information carries feature information, and generating the first control instruction according to the feature information; receiving a text message from the first device, wherein the text message carries feature information, and generating the first control instruction according to the feature information; receiving a remote-control instruction from the first device, and generating the first control instruction according to the remote-control instruction; receiving a control gesture from the first device, extracting feature information from the control gesture, and generating the first control instruction according to the feature information.
- The method according to claim 1, wherein after acquiring the first control instruction on the first device, the method further comprises: recognizing the first control instruction and determining a control domain of the first control instruction; and determining an object to be controlled whose domain is the same as the control domain as the target object.
- The method according to claim 9, wherein recognizing the first control instruction comprises one of the following: recognizing the first control instruction using a data model preset on the first device, wherein the data model comprises databases of multiple domains; recognizing the first control instruction online through a network server.
- An apparatus for determining a control object, comprising: an acquiring module configured to acquire, on a first device, a first control instruction and state information of an object to be controlled, wherein a communication connection is established between the first device and the object to be controlled; and a determining module configured to determine, from the object to be controlled according to the state information, a target object that the first control instruction requests to control.
- The apparatus according to claim 11, wherein the determining module comprises: a first determining unit configured to parse the state information of the object to be controlled and determine the target object from the object to be controlled according to a predetermined correspondence, wherein the predetermined correspondence is used to describe the correspondence between state information and target objects.
- The apparatus according to claim 11, wherein the determining module comprises: a second determining unit configured to determine specified state information of the object to be controlled according to the first control instruction; and a third determining unit configured to determine an object to be controlled whose state information matches the specified state information as the target object.
- The apparatus according to claim 11, wherein the apparatus further comprises: a sending module configured to, after the determining module determines, from the object to be controlled according to the state information, the target object that the first control instruction requests to control, and in a case where the target object is determined from the object to be controlled, send a second control instruction to the target object through the first device, wherein the second control instruction is used to instruct the target object to execute the operation requested by the first control instruction.
- A storage medium in which a computer program is stored, wherein the computer program is configured to perform the method according to any one of claims 1 to 10 when run.
- An electronic device comprising a memory and a processor, wherein the memory stores a computer program and the processor is configured to run the computer program to perform the method according to any one of claims 1 to 10.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP19804019.8A EP3796110A4 (en) | 2018-05-14 | 2019-04-12 | METHOD AND DEVICE FOR DETERMINING A CONTROLLED OBJECT AND STORAGE MEDIUM AND ELECTRONIC DEVICE |
US17/051,482 US20210160130A1 (en) | 2018-05-14 | 2019-04-12 | Method and Apparatus for Determining Target Object, Storage Medium, and Electronic Device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810455771.8 | 2018-05-14 | ||
CN201810455771.8A CN108646580A (zh) | 2018-05-14 | 2018-05-14 | Method and apparatus for determining control object, storage medium, and electronic device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019218820A1 (zh) | 2019-11-21 |
Family
ID=63755190
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/082348 WO2019218820A1 (zh) | Method and apparatus for determining control object, storage medium, and electronic device | 2018-05-14 | 2019-04-12 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20210160130A1 (zh) |
EP (1) | EP3796110A4 (zh) |
CN (1) | CN108646580A (zh) |
WO (1) | WO2019218820A1 (zh) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108646580A (zh) | 2018-05-14 | 2018-10-12 | 中兴通讯股份有限公司 | Method and apparatus for determining control object, storage medium, and electronic device |
CN111210824B (zh) * | 2018-11-21 | 2023-04-07 | 深圳绿米联创科技有限公司 | Voice information processing method and apparatus, electronic device, and storage medium |
CN111599355A (zh) * | 2019-02-19 | 2020-08-28 | 珠海格力电器股份有限公司 | Voice control method, voice control apparatus, and air conditioner |
CN112002311A (zh) * | 2019-05-10 | 2020-11-27 | Tcl集团股份有限公司 | Text error correction method and apparatus, computer-readable storage medium, and terminal device |
CN112786022B (zh) * | 2019-11-11 | 2023-04-07 | 青岛海信移动通信技术股份有限公司 | Terminal, first voice server, second voice server, and voice recognition method |
CN111588884A (zh) * | 2020-05-18 | 2020-08-28 | 上海明略人工智能(集团)有限公司 | Object disinfection system and method, storage medium, and electronic apparatus |
CN112767937B (zh) * | 2021-01-15 | 2024-03-08 | 宁波方太厨具有限公司 | Multi-device voice control method, system, device, and readable storage medium |
CN114442536A (zh) * | 2022-01-29 | 2022-05-06 | 北京声智科技有限公司 | Interaction control method, system, device, and storage medium |
CN114694644A (zh) * | 2022-02-23 | 2022-07-01 | 青岛海尔科技有限公司 | Voice intent recognition method and apparatus, and electronic device |
CN115001885B (zh) * | 2022-04-22 | 2024-01-26 | 青岛海尔科技有限公司 | Device control method and apparatus, storage medium, and electronic apparatus |
CN115373283A (zh) * | 2022-07-29 | 2022-11-22 | 青岛海尔科技有限公司 | Method and apparatus for determining a control instruction, storage medium, and electronic apparatus |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104122806A (zh) * | 2013-04-28 | 2014-10-29 | 海尔集团公司 | Control method and system for household electrical appliances |
KR102411619B1 (ko) * | 2015-05-11 | 2022-06-21 | 삼성전자주식회사 | Electronic apparatus and control method thereof |
CN106292558A (zh) * | 2015-05-25 | 2017-01-04 | 中兴通讯股份有限公司 | Control method and apparatus for smart household appliances |
DK179588B1 (en) * | 2016-06-09 | 2019-02-22 | Apple Inc. | INTELLIGENT AUTOMATED ASSISTANT IN A HOME ENVIRONMENT |
JP6683893B2 (ja) * | 2016-10-03 | 2020-04-22 | グーグル エルエルシー | Processing voice commands based on device topology |
CN106647311B (zh) * | 2017-01-16 | 2020-10-30 | 上海智臻智能网络科技股份有限公司 | Smart central control system, device, server, and smart device control method |
CN107290974A (zh) * | 2017-08-18 | 2017-10-24 | 三星电子(中国)研发中心 | Smart home interaction method and apparatus |
CN107390598B (zh) * | 2017-08-31 | 2020-10-09 | 广东美的制冷设备有限公司 | Device control method, electronic device, and computer-readable storage medium |
CN107731226A (zh) * | 2017-09-29 | 2018-02-23 | 杭州聪普智能科技有限公司 | Voice-recognition-based control method and apparatus, and electronic device |
CN107886952B (zh) * | 2017-11-09 | 2020-03-17 | 珠海格力电器股份有限公司 | Method, apparatus, system, and electronic device for voice control of smart household appliances |
2018
- 2018-05-14 CN CN201810455771.8A patent/CN108646580A/zh active Pending
2019
- 2019-04-12 WO PCT/CN2019/082348 patent/WO2019218820A1/zh unknown
- 2019-04-12 EP EP19804019.8A patent/EP3796110A4/en not_active Withdrawn
- 2019-04-12 US US17/051,482 patent/US20210160130A1/en not_active Abandoned
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104538030A (zh) * | 2014-12-11 | 2015-04-22 | 科大讯飞股份有限公司 | Control system and method capable of controlling household appliances by voice |
CN105511287A (zh) * | 2016-01-27 | 2016-04-20 | 珠海格力电器股份有限公司 | Smart household appliance control method, apparatus, and system |
CN105739321A (zh) * | 2016-04-29 | 2016-07-06 | 广州视声电子实业有限公司 | KNX-bus-based voice control system and method |
US20180026942A1 (en) * | 2016-07-25 | 2018-01-25 | Honeywell International Inc. | Industrial process control using ip communications with publisher subscriber pattern |
CN107612968A (zh) * | 2017-08-15 | 2018-01-19 | 北京小蓦机器人技术有限公司 | Method, device, and system for controlling connected devices through a smart terminal |
CN108646580A (zh) * | 2018-05-14 | 2018-10-12 | 中兴通讯股份有限公司 | Method and apparatus for determining control object, storage medium, and electronic device |
Non-Patent Citations (1)
Title |
---|
See also references of EP3796110A4 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114040324A (zh) * | 2021-11-03 | 2022-02-11 | 北京普睿德利科技有限公司 | Communication control method, apparatus, terminal, and storage medium |
CN114040324B (zh) * | 2021-11-03 | 2024-01-30 | 北京普睿德利科技有限公司 | Communication control method, apparatus, terminal, and storage medium |
CN114024996A (zh) * | 2022-01-06 | 2022-02-08 | 广东电网有限责任公司广州供电局 | Method and system for managing large-scale heterogeneous intelligent terminal containers |
Also Published As
Publication number | Publication date |
---|---|
EP3796110A4 (en) | 2021-07-07 |
US20210160130A1 (en) | 2021-05-27 |
EP3796110A1 (en) | 2021-03-24 |
CN108646580A (zh) | 2018-10-12 |
Similar Documents
Publication | Title
---|---
WO2019218820A1 (zh) | Method and apparatus for determining control object, storage medium, and electronic device
US11488576B2 (en) | Artificial intelligence apparatus for generating text or speech having content-based style and method for the same
CN113762322B (zh) | Video classification method, apparatus and device based on multimodal representation, and storage medium
US10341461B2 (en) | System and method for automatically recreating personal media through fusion of multimodal features
CN110364146B (zh) | Speech recognition method and apparatus, speech recognition device, and storage medium
CN111258995B (zh) | Data processing method, apparatus, storage medium, and device
US11393465B2 (en) | Artificial intelligence apparatus for speech interaction and method for the same
CN109101545A (zh) | Natural language processing method, apparatus, device, and medium based on human-computer interaction
CN111209440A (zh) | Video playback method, apparatus, and storage medium
CN106202165B (zh) | Intelligent learning method and apparatus for human-computer interaction
CN111372109B (zh) | Smart television and information interaction method
CN112328849A (zh) | User profile construction method, and user-profile-based dialogue method and apparatus
WO2020253064A1 (zh) | Speech recognition method and apparatus, computer device, and storage medium
CN115114395B (zh) | Content retrieval and model training methods and apparatus, electronic device, and storage medium
CN107589828A (zh) | Human-computer interaction method and system based on a knowledge graph
CN110992937B (zh) | Offline language recognition method, terminal, and readable storage medium
JP7247442B2 (ja) | Information processing method and apparatus in user interaction, electronic device, and storage medium
CN115080836A (zh) | Artificial-intelligence-based information recommendation method and apparatus, electronic device, and storage medium
CN116737883A (zh) | Human-computer interaction method, apparatus, device, and storage medium
WO2019228140A1 (zh) | Instruction execution method and apparatus, storage medium, and electronic device
CN116994169A (zh) | Label prediction method and apparatus, computer device, and storage medium
KR20210027991A (ko) | Electronic device and control method thereof
CN114822598A (zh) | Server and speech emotion recognition method
CN112165626B (zh) | Image processing method, resource acquisition method, related device, and medium
CN115062131A (zh) | Multimodal human-computer interaction method and apparatus
Legal Events
Code | Title | Description
---|---|---
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19804019; Country of ref document: EP; Kind code of ref document: A1
NENP | Non-entry into the national phase | Ref country code: DE
ENP | Entry into the national phase | Ref document number: 2019804019; Country of ref document: EP; Effective date: 20201214