WO2014023257A1 - Instruction processing method, apparatus, and system - Google Patents

Instruction processing method, apparatus, and system

Info

Publication number
WO2014023257A1
WO2014023257A1 PCT/CN2013/081131
Authority
WO
WIPO (PCT)
Prior art keywords
voice
commands
command
source
voice command
Prior art date
Application number
PCT/CN2013/081131
Other languages
English (en)
French (fr)
Inventor
梅敬青
薛国栋
Original Assignee
华为终端有限公司 (Huawei Device Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为终端有限公司 (Huawei Device Co., Ltd.)
Priority to EP13827606.8A priority Critical patent/EP2830044B1/en
Publication of WO2014023257A1 publication Critical patent/WO2014023257A1/zh
Priority to US14/520,575 priority patent/US9704503B2/en

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Definitions

  • the present invention relates to communication technologies, and in particular, to an instruction processing method, apparatus, and system.
  • Voice control technology has gradually been recognized by the industry. More and more electronic devices, such as smartphones, tablets, and smart TVs, provide voice control functions and are appearing in people's daily lives at the same time. It is foreseeable that voice control functions will become increasingly diverse, and that more and more consumer electronics, office equipment, and the like will support voice control. As the computing power of terminal devices grows and devices become more intelligent, a terminal device can support more and more functions, and these functions may overlap; for example, a user can post to Twitter from a smartphone, a smart TV, and other devices.
  • Current voice command technology mainly includes traditional voice control technology and intelligent voice control technology.
  • With traditional voice control technology, users must issue commands according to a specific syntax and command vocabulary.
  • With intelligent voice control technology, users can freely issue commands in natural language.
  • The implementation mechanism of traditional voice control technology is relatively simple and accurate, but the user experience is relatively poor.
  • The implementation mechanism of intelligent voice control technology is complex, but the user experience is relatively good.
  • Some voice control functions always run monitoring in the background of the electronic device.
  • For example, Samsung's Smart Interaction TV monitors the user's spoken operation instructions in real time so that it can execute them quickly.
  • In this case, the same voice command issued by the user may be simultaneously picked up by multiple devices.
  • For example, when the user issues a command through device A that is intended for device B, the command may simultaneously be picked up by device B itself.
  • Device B then executes both the instruction forwarded by device A and the instruction it received directly from the user, so the volume of device B is reduced twice: the voice command is executed repeatedly, and a control error may even occur.
  • The embodiments of the present invention provide a command processing method, apparatus, and system, which prevent multiple voice control devices from repeatedly executing a voice command collected at the same time and eliminate control errors caused by repeated command execution.
  • a first aspect of the embodiments of the present invention provides a method for processing an instruction, including:
  • the plurality of voice commands respectively carry the collection time information of the source voice command corresponding to each voice command, and the instruction content of each voice command;
  • the determining whether any two of the plurality of voice commands are similar instructions includes:
  • when the collection times of the source voice commands corresponding to any two voice commands among the plurality of voice commands overlap and the two voice commands repeat in content, it is determined that the two voice commands are similar instructions.
  • the method further includes:
  • the new voice command and the related voice command are used as the plurality of voice commands.
  • according to the collection time information of the source voice commands, it is determined whether the collection times of the source voice commands corresponding to any two voice commands among the plurality of voice commands overlap;
  • the plurality of voice commands further carry priority parameters of the source voice commands corresponding to the voice commands respectively;
  • the method further includes:
  • discarding one of the two voice commands includes:
  • the voice command with the higher priority of the two similar voice commands is returned to the corresponding voice control device, and the voice command with the lower priority is discarded.
  • the instruction processing method of the embodiment of the present invention further includes:
  • when a newly received voice command is a similar command to a voice command that has already been returned to another voice control device, the new voice command is discarded.
  • the instruction processing method of the embodiment of the present invention further includes:
  • the voice parsing server performs time synchronization with each voice control device
  • the voice parsing server respectively receives the source voice commands sent by the voice control devices.
  • the instruction processing method of the embodiment of the present invention further includes:
  • the local voice control gateway performs time synchronization with each voice control device
  • the local voice control gateway respectively receives the source voice commands sent by the voice control devices, and sends each of the source voice commands to the voice parsing server.
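The first-aspect method above (receive parsed voice commands, judge any two of them for similarity by collection-time overlap plus content repetition, and discard one of each similar pair) can be sketched as follows. This is a minimal illustration under assumptions, not the patented implementation: the `VoiceCommand` fields, the equality test on content, and the fixed 1-second threshold are all invented for the example.

```python
from dataclasses import dataclass

@dataclass
class VoiceCommand:
    device_id: str    # voice control device that collected the source command
    start_ts: float   # start timestamp of the source voice command (seconds)
    end_ts: float     # cutoff timestamp of the source voice command (seconds)
    content: str      # instruction content produced by the parsing server

def are_similar(a: VoiceCommand, b: VoiceCommand, threshold: float = 1.0) -> bool:
    """Similar instructions: different devices, overlapping collection
    times, and repeated instruction content."""
    if a.device_id == b.device_id:
        return False  # commands from the same device are never compared
    starts_close = abs(a.start_ts - b.start_ts) < threshold
    ends_close = abs(a.end_ts - b.end_ts) < threshold
    return starts_close and ends_close and a.content == b.content

def deduplicate(commands: list[VoiceCommand]) -> list[VoiceCommand]:
    """Keep one command of each similar pair, discarding the redundant one."""
    kept: list[VoiceCommand] = []
    for cmd in commands:
        if not any(are_similar(cmd, k) for k in kept):
            kept.append(cmd)
    return kept
```

A real system would compare content semantically rather than by string equality; exact matching is used here only to keep the sketch self-contained.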
  • a second aspect of the embodiments of the present invention provides an instruction processing apparatus, including:
  • a receiving module configured to receive a plurality of voice commands sent by the voice parsing server, where the multiple voice commands are generated by the voice parsing server parsing the source voice commands from different voice control devices;
  • a determining module configured to determine whether any two voice commands of the plurality of voice commands received by the receiving module are similar commands, where similar commands are voice commands corresponding to source voice commands obtained by different voice control devices collecting the same voice information;
  • a redundant instruction processing module configured to discard one of the two similar voice commands when the judgment result of the determining module is that two voice commands among the plurality of voice commands are similar commands.
  • the multiple voice commands received by the receiving module respectively carry the collection time information of the source voice command corresponding to each voice instruction and the instruction content of each voice instruction;
  • the determining module includes:
  • a first determining unit configured to determine, according to the collection time information of the source voice commands corresponding to the multiple voice commands received by the receiving module, whether the collection times of the source voice commands corresponding to any two voice commands among the multiple voice commands overlap;
  • a second determining unit configured to determine, according to the instruction content of the plurality of voice commands received by the receiving module, whether any two voice commands of the plurality of voice commands repeat in content; and a similar command determining unit configured to determine that the two voice commands are similar commands when the judgment results of the first determining unit and the second determining unit are that the collection times of the source voice commands corresponding to the two voice commands overlap and that their content repeats.
  • the apparatus further includes:
  • a recording module configured to record the collection time information of a new voice command when the new voice command is received from the voice parsing server;
  • a voice command determining module configured to compare the collection time of the new voice command with the collection times of the voice commands recorded by the recording module, and to determine the related voice commands whose collection times differ from that of the new voice command by less than a predetermined threshold; the new voice command and the related voice commands are then used as the plurality of voice commands.
  • the first determining unit includes: a first determining subunit configured to determine, according to the start timestamps and cutoff timestamps of the source voice commands corresponding to the multiple voice commands received by the receiving module, whether the difference between the start timestamps and the difference between the cutoff timestamps of the source voice commands corresponding to any two voice commands are both less than a preset threshold; if both differences are less than the preset threshold, it is determined that the collection times of the source voice commands corresponding to the two voice commands overlap; or
  • a second determining subunit configured to obtain the durations of the multiple voice commands according to the start timestamps and cutoff timestamps of the source voice commands corresponding to the multiple voice commands received by the receiving module, and to determine whether the durations of any two voice commands among the multiple voice commands overlap; if the durations have an overlapping portion, it is determined that the collection times of the source voice commands corresponding to the two voice commands overlap.
  • the plurality of voice commands received by the receiving module further carry priority parameters of the source voice commands corresponding to the multiple voice commands respectively;
  • the device also includes:
  • an acquiring module configured to determine, according to the priority parameters of the source voice commands corresponding to the voice commands received by the receiving module, the voice command with the higher priority and the voice command with the lower priority among two similar voice commands;
  • the redundant instruction processing module is specifically configured to: when the judgment result of the determining module is that two voice commands among the plurality of voice commands are similar commands, return the voice command with the higher priority of the two similar voice commands to the corresponding voice control device, and discard the voice command with the lower priority.
  • the redundant instruction processing module is further configured to discard a new voice command when the receiving module receives the new voice command and it is a similar command to a voice command that has already been returned to another voice control device.
  • A third aspect of the embodiments of the present invention provides an instruction processing system, including a voice parsing server, a plurality of voice control devices, and the foregoing instruction processing apparatus;
  • the plurality of voice control devices are respectively configured to collect a plurality of source voice commands and to send the plurality of source voice commands to the voice parsing server;
  • the voice parsing server is configured to receive the plurality of source voice commands sent by the plurality of voice control devices, parse the plurality of source voice commands to generate a plurality of voice commands corresponding to the plurality of source voice commands, and transmit the plurality of voice commands to the instruction processing apparatus;
  • the voice parsing server is further configured to perform time synchronization with the multiple voice control devices.
  • A fourth aspect of the embodiments of the present invention provides an instruction processing system, including a voice parsing server, a plurality of voice control devices, and a local voice control gateway, where the local voice control gateway includes the above instruction processing apparatus;
  • the plurality of voice control devices are configured to separately collect multiple source voice commands and to send the multiple source voice commands to the local voice control gateway;
  • the voice parsing server is configured to receive the plurality of source voice commands sent by the local voice control gateway, parse the plurality of source voice commands to generate a plurality of voice commands corresponding to the plurality of source voice commands, and return the plurality of voice commands to the local voice control gateway;
  • the local voice control gateway is further configured to perform time synchronization with the multiple voice control devices.
  • The technical effect of the embodiments of the present invention is as follows: a plurality of voice commands sent by the voice parsing server are received, and it is determined whether any two voice commands among them are similar commands, where similar commands are voice commands corresponding to source voice commands obtained by different voice control devices collecting the same voice information; when two voice commands are similar commands, one of the voice commands is discarded.
  • The embodiments thus prevent multiple voice control devices from repeatedly executing a voice command collected at the same time, and eliminate control errors caused by repeated command execution.
  • FIG. 2 is a signaling diagram of Embodiment 2 of an instruction processing method according to the present invention.
  • FIG. 3 is a schematic diagram of a system architecture in Embodiment 2 of an instruction processing method according to the present invention.
  • FIG. 4 is a signaling diagram of Embodiment 3 of an instruction processing method according to the present invention.
  • FIG. 5 is a schematic structural diagram of a system in Embodiment 3 of an instruction processing method according to the present invention.
  • FIG. 6 is a schematic structural diagram of Embodiment 1 of an instruction processing apparatus according to the present invention
  • FIG. 7 is a schematic structural diagram of Embodiment 2 of an instruction processing apparatus according to the present invention
  • FIG. 8 is a schematic structural diagram of Embodiment 3 of an instruction processing apparatus according to the present invention.
  • FIG. 9 is a schematic structural diagram of an embodiment of a computer system according to the present invention.
  • FIG. 10 is a schematic structural diagram of Embodiment 1 of an instruction processing system according to the present invention.
  • FIG. 11 is a schematic structural diagram of Embodiment 2 of an instruction processing system according to the present invention.

Detailed Description
  • FIG. 1 is a flowchart of a first embodiment of an instruction processing method according to the present invention. As shown in FIG. 1, the embodiment provides an instruction processing method, which may specifically include the following steps:
  • Step 101: Receive multiple voice commands sent by the voice parsing server.
  • In this embodiment, a Redundant voicE Command identification and Handling (hereinafter: RECH) mechanism is proposed, and an RECH functional entity may be added to an existing voice control system; the RECH functional entity can be a stand-alone device or a module integrated into an existing device.
  • The RECH functional entity in this embodiment may be deployed on the network side together with the voice parsing server, or directly inside the voice parsing server as a module; alternatively, it may be deployed locally, that is, together with the local voice control gateway, or directly inside the local voice control gateway as a module.
  • The RECH functional entity receives multiple voice commands sent by the voice parsing server; the multiple voice commands may be sent by the voice parsing server in sequence, and specifically may be the voice commands generated and sent by the voice parsing server within a preset time period.
  • The purpose of the preset time period is to process voice commands received at different times differently. When the time difference between two received voice commands is large, the earlier voice command may be returned directly to the corresponding voice control device without waiting for the later voice command, and no similarity judgment is performed on the two. Therefore, this embodiment may set a preset time period and perform a pairwise similarity judgment on the voice commands received within that period.
  • The multiple voice commands are generated by the voice parsing server parsing the source voice commands from different voice control devices.
  • the two voice commands that need to perform the similarity judgment are voice commands respectively from different voice control devices, without performing similarity judgment on voice commands from the same voice control device.
  • Each voice command is generated by the voice parsing server parsing the source voice commands from different voice control devices, and the voice parsing server parses each source voice command to generate a voice command corresponding to each source voice command.
  • Step 102: Determine whether any two voice commands among the plurality of voice commands are similar commands; if so, perform step 103; otherwise, perform step 104.
  • The multiple voice commands received in step 101 refer to voice commands that meet a preset time condition, where the preset time condition limits the collection times of the source voice commands corresponding to the voice commands on which the similarity judgment is performed.
  • For example, a similarity judgment is needed only for voice commands whose collection times are close together; voice commands separated by a long interval (for example, more than 2 minutes) are essentially impossible to be similar instructions.
  • This embodiment may further include the following steps: when a new voice command is received from the voice parsing server, recording the collection time information of the new voice command; comparing the collection time of the new voice command with the collection times of the previously recorded voice commands, and determining the related voice commands whose collection times differ from that of the new voice command by less than a predetermined threshold; and using the new voice command and the related voice commands as the plurality of voice commands.
  • the collection time of the voice instruction is the start time stamp of the source voice command corresponding to the voice instruction.
  • an instruction schedule may be set, and the collection time of the received voice instruction is recorded in the instruction schedule.
  • This embodiment may also set a timer for the instruction schedule; the timer times the collection time information stored in the schedule.
  • When a piece of collection time information has timed out, the corresponding voice command is essentially impossible to be a similar instruction to any subsequently received voice command, so that collection time information can be deleted from the instruction schedule, and the timed-out voice command will no longer be obtained from the schedule.
  • In this way, the collection times stored in the instruction schedule limit which voice commands need to undergo the similarity judgment, that is, they determine whether a given voice command requires the similarity judgment.
  • The related voice commands whose collection times differ from that of the new voice command by less than the predetermined threshold are obtained from the instruction schedule; the related voice commands obtained here, together with the new voice command, are the multiple voice commands on which the similarity judgment currently needs to be performed.
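The instruction schedule described above (record the collection time of each incoming command, prune timed-out entries, and look up the related commands whose collection times fall within a threshold of the new command's) might look like the following sketch. The class name, method name, and the 2-second and 120-second values are illustrative assumptions; the embodiment does not prescribe concrete values.

```python
class InstructionSchedule:
    """Records voice commands with their collection times and returns, for a
    new command, the previously recorded commands whose collection times
    differ from the new one's by less than `threshold` seconds."""

    def __init__(self, threshold: float = 2.0, timeout: float = 120.0):
        self.threshold = threshold
        self.timeout = timeout
        self._entries: list[tuple[str, float]] = []  # (command, collection_time)

    def add_and_find_related(self, command: str, collect_time: float,
                             now: float) -> list[str]:
        # Prune timed-out entries: a command whose collection time is older
        # than `timeout` cannot be similar to any later command.
        self._entries = [(c, t) for c, t in self._entries
                         if now - t <= self.timeout]
        # Related commands: collection time within `threshold` of the new one.
        related = [c for c, t in self._entries
                   if abs(t - collect_time) < self.threshold]
        self._entries.append((command, collect_time))
        return related
```

A real implementation would also store each command's device and parsed content; plain strings stand in for full commands here.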
  • Step 103: Discard one of the two similar voice commands.
  • When two voice commands are similar instructions, one voice command can be selected from the two similar voice commands and discarded, thereby avoiding redundant instructions and effectively preventing repeated execution of the same command.
  • The other voice command is sent to the voice control device corresponding to it; after receiving the voice command, the voice control device can perform the operation indicated by the voice command in response to the source voice command issued by the user.
  • Specifically, the two voice commands may be redundantly processed according to the priorities of the source voice commands corresponding to the two similar voice commands. The priority of a voice command may be obtained from the priority parameter of the source voice command, and the priority parameter of the source voice command may be carried in the voice command. The priority parameter may be set according to the actual situation; for example, the volume at which the voice control device received the source voice command can be set as the priority parameter, where the higher the volume value, the higher the priority of the corresponding voice command.
  • The voice command with the higher priority is returned to the corresponding voice control device, where the corresponding voice control device is specifically the device that sent the source voice command corresponding to that voice command to the voice parsing server.
  • the voice control device can perform the operation indicated by the voice command in response to the source voice command issued by the user.
  • The voice command with the lower priority is discarded, and a redundant command indication is sent to the voice control device corresponding to it, notifying that device that the source voice command it monitored is a redundant command. This effectively avoids repeated execution of the same command.
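Using the received volume as the priority parameter, as suggested above, the redundant handling of two similar commands might be sketched like this. The dict layout and the callback names `send_to_device` and `notify_redundant` are hypothetical stand-ins for the messages to the voice control devices.

```python
def handle_similar_pair(cmd_a: dict, cmd_b: dict,
                        send_to_device, notify_redundant) -> dict:
    """Return the higher-priority command to its device; tell the other
    device that its monitored source command was redundant.
    Each cmd is a dict with 'device', 'content', and 'priority'
    (e.g. the volume at which the source command was received)."""
    if cmd_a["priority"] >= cmd_b["priority"]:
        high, low = cmd_a, cmd_b
    else:
        high, low = cmd_b, cmd_a
    send_to_device(high["device"], high["content"])  # winner executes the command
    notify_redundant(low["device"])                  # loser suppresses its copy
    return high
```

The ">=" tie-break (first command wins on equal priority) is an arbitrary choice for the sketch; any deterministic rule would do.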
  • Step 104: Return each voice command to the corresponding voice control device.
  • When no two voice commands are similar commands, each voice command can be returned directly to the corresponding voice control device, where the corresponding voice control device is specifically the device that sent the source voice command corresponding to the voice command to the voice parsing server.
  • Each voice command corresponds to one voice control device; after receiving its voice command, the voice control device can perform the operation indicated by the voice command in response to the source voice command issued by the user.
  • This embodiment provides an instruction processing method: a plurality of voice commands sent by the voice parsing server are received, and it is determined whether any two voice commands among them are similar commands, where similar commands are voice commands corresponding to source voice commands obtained by different voice control devices collecting the same voice information; when two voice commands are similar commands, one of the voice commands is discarded.
  • This prevents multiple voice control devices from repeatedly executing a voice command collected at the same time, and eliminates control errors caused by repeated command execution.
  • FIG. 2 is a signaling diagram of the second embodiment of the instruction processing method of the present invention.
  • the embodiment provides an instruction processing method.
  • the RECH functional entity is set on the network side
  • FIG. 3 is a schematic diagram of the system architecture in the second embodiment of the instruction processing method of the present invention. As shown in FIG. 3, it is assumed that device A and device B are two voice control devices, both of which have voice control functions.
  • the voice control device is used as an example to describe the solution of the present invention.
  • the RECH function entity is a device independent of the voice resolution server.
  • the RECH function entity may also be integrated in the voice resolution server.
  • the instruction processing method provided in this embodiment may specifically include the following steps:
  • Step 201: Device A performs time synchronization with the voice parsing server.
  • Step 202: Device B performs time synchronization with the voice parsing server.
  • The foregoing steps time-synchronize device A and device B, which have voice control functions, with the network side, so that the voice parsing server can subsequently obtain accurate collection time information carried in the source voice commands.
  • Step 203: Device A sends source voice command A to the voice parsing server.
  • For example, the source voice command may be "reduce the volume of device B by one level". Before the command can be executed, device A needs to send it to the voice parsing server for parsing. In this step, device A sends source voice command A to the voice parsing server, where source voice command A refers to the source voice command reported by device A.
  • Source voice command A carries the start timestamp A and the cutoff timestamp A of source voice command A, as well as a priority parameter (priority value A). The start timestamp of source voice command A indicates the start time of the source voice command monitored by device A, and the cutoff timestamp of source voice command A indicates the cutoff time of the source voice command monitored by device A. The priority parameter is a parameter set by the user or the device to identify the priority of the device or command when a similar instruction occurs.
  • Step 204: The voice parsing server authenticates device A.
  • After receiving the source voice command reported by device A, the voice parsing server performs identity verification and authentication on device A, and performs subsequent parsing only after the verification and authentication pass.
  • Step 205: Device B sends source voice command B to the voice parsing server.
  • The source voice command may likewise be "reduce the volume of device B by one level". Before the command can be executed, device B needs to send it to the voice parsing server for parsing. In this step, device B sends source voice command B to the voice parsing server, where source voice command B refers to the source voice command reported by device B.
  • Source voice command B carries the start timestamp B and the cutoff timestamp B of source voice command B, as well as a priority parameter (priority value B). The start timestamp of source voice command B indicates the start time of the source voice command monitored by device B, and the cutoff timestamp of source voice command B indicates the cutoff time of the source voice command monitored by device B. The priority parameter is a parameter set by the user to identify the priority of the device or command when a similar instruction occurs.
  • Step 206: The voice parsing server authenticates device B.
  • After receiving the source voice command reported by device B, the voice parsing server performs identity verification and authentication on device B, and performs subsequent parsing only after the verification and authentication pass.
  • Step 207: The voice parsing server sends voice command A, generated by parsing source voice command A, to the RECH functional entity.
  • After receiving source voice command A reported by device A and authenticating device A, the voice parsing server parses source voice command A into voice command A, which a device can understand and execute and which corresponds to source voice command A.
  • The voice parsing server sends the parsed voice command A to the RECH functional entity, where voice command A carries the start timestamp, the cutoff timestamp, and the priority parameter of the source voice command A corresponding to voice command A, so that the RECH functional entity can perform a similarity judgment on voice command A against other voice commands.
  • Step 208: The voice parsing server sends voice command B, generated by parsing source voice command B, to the RECH functional entity.
  • After receiving source voice command B reported by device B and authenticating device B, the voice parsing server parses source voice command B into voice command B, which a device can understand and execute and which corresponds to source voice command B.
  • The voice parsing server sends the parsed voice command B to the RECH functional entity, where voice command B carries the start timestamp, the cutoff timestamp, and the priority parameter of the source voice command B corresponding to voice command B, so that the RECH functional entity can perform a similarity judgment on voice command B against other voice commands.
• Step 209: The RECH function entity determines, according to the start timestamps and cutoff timestamps of the source voice commands corresponding to the voice command A and the voice command B respectively, whether the acquisition times of the source voice command A corresponding to the voice command A and the source voice command B corresponding to the voice command B overlap; if yes, step 210 is performed, otherwise step 214 is performed.
• The collection time information carried in the voice commands may include a start timestamp and a cutoff timestamp, according to which the RECH function entity determines whether the acquisition times of the source voice command A corresponding to the voice command A and the source voice command B corresponding to the voice command B overlap, that is, performs the time similarity judgment. Specifically, the RECH function entity may determine whether the difference between the start timestamp of the source voice command A and the start timestamp of the source voice command B is less than a preset threshold, and whether the difference between the cutoff timestamp of the source voice command A and the cutoff timestamp of the source voice command B is less than the preset threshold. If both the difference between the start timestamps and the difference between the cutoff timestamps are less than the preset threshold, the acquisition times of the source voice command A and the source voice command B overlap, and step 210 is performed; if the difference between the start timestamps or the difference between the cutoff timestamps is greater than or equal to the preset threshold, the acquisition times of the source voice command A corresponding to the voice command A and the source voice command B corresponding to the voice command B do not overlap, and step 214 is performed.
• Alternatively, the RECH function entity may obtain the durations of the voice command A and the voice command B according to the start timestamps and cutoff timestamps of the source voice commands corresponding to the voice command A and the voice command B respectively, and determine whether the duration of the voice command A overlaps with the duration of the voice command B. If the two durations overlap, the acquisition times of the source voice command A corresponding to the voice command A and the source voice command B corresponding to the voice command B overlap, and step 210 is performed; if the two durations do not overlap, the acquisition times do not overlap, and step 214 is performed.
• Before the above judgments, the RECH function entity may first determine whether the difference between the start timestamp of the voice command A and the start timestamp of the voice command B is less than a preset time threshold; if yes, step 209 is performed on the two voice commands, otherwise this process can be ended.
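• As a non-normative illustration, the two time-similarity tests described above (timestamp differences below a preset threshold, or overlapping durations) can be sketched as follows. Millisecond timestamps and the 500 ms threshold are assumptions made for the example; the embodiment does not fix concrete values:

```python
# Sketch of the time-similarity judgment of step 209 (illustrative only).
# Timestamps are assumed to be in milliseconds; THRESHOLD_MS is an assumed
# value for the preset threshold, which the embodiment leaves unspecified.
THRESHOLD_MS = 500

def timestamps_similar(start_a, end_a, start_b, end_b, threshold=THRESHOLD_MS):
    """First test: both the start-timestamp difference and the
    cutoff-timestamp difference must be below the preset threshold."""
    return (abs(start_a - start_b) < threshold
            and abs(end_a - end_b) < threshold)

def durations_overlap(start_a, end_a, start_b, end_b):
    """Alternative test: the [start, cutoff] intervals of the two
    source voice commands overlap."""
    return max(start_a, start_b) < min(end_a, end_b)

# Device A hears the utterance slightly before device B does:
print(timestamps_similar(1000, 2400, 1120, 2510))  # True: both diffs < 500 ms
print(durations_overlap(1000, 2400, 1120, 2510))   # True: intervals overlap
```

• Either test alone suffices in the embodiment; the interval-overlap form is the more tolerant of devices that begin monitoring the utterance at noticeably different times.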
• Step 210: The RECH function entity determines, according to the instruction contents of the voice command A and the voice command B, whether the voice command A and the voice command B are repeated in content; if yes, step 211 is performed, otherwise step 214 is performed.
• When the RECH function entity determines in the preceding step that the voice command A and the voice command B overlap in time, the RECH function entity further determines, according to the instruction contents of the voice command A and the voice command B, whether the voice command A and the voice command B are repeated in content. Specifically, the voice features of the user may also be compared, so as to determine whether the source voice commands corresponding to the two voice commands were sent by the same user.
• If the instruction contents of the two largely overlap, the commands are repeated. For example, a threshold may be set; if the percentage of overlapping content between the instruction contents of the two is greater than the threshold, the voice command A and the voice command B are repeated in content, that is, the voice command A and the voice command B are similar instructions, and step 211 is performed. If the instruction contents of the two are not the same, the voice command A and the voice command B are not repeated in content, the voice command A and the voice command B are not similar instructions, and step 214 is performed.
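• The content-repetition judgment of step 210 can be sketched with a simple word-overlap ratio. The tokenization and the 0.8 threshold are purely illustrative assumptions; the embodiment only requires that the percentage of overlapping content exceed some preset threshold:

```python
# Sketch of the content-repetition judgment of step 210 (illustrative only).
# Word-level tokenization and the 0.8 threshold are assumptions.
OVERLAP_THRESHOLD = 0.8

def content_repeated(content_a, content_b, threshold=OVERLAP_THRESHOLD):
    """True when the fraction of overlapping words between the two
    instruction contents exceeds the preset threshold."""
    words_a = set(content_a.lower().split())
    words_b = set(content_b.lower().split())
    if not words_a or not words_b:
        return False
    overlap = len(words_a & words_b)
    return overlap / max(len(words_a), len(words_b)) > threshold

# Both devices picked up the same utterance, so the contents coincide:
print(content_repeated("decrease the volume of device B by one level",
                       "decrease the volume of device B by one level"))  # True
```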
• Step 211: The RECH function entity acquires the priorities of the voice command A and the voice command B according to the priority parameters of the source voice commands corresponding to the voice command A and the voice command B.
• The RECH function entity obtains the priorities of the voice command A and the voice command B respectively according to the priority parameters of the source voice commands corresponding to the voice command A and the voice command B. For example, when the priority parameter is set to the volume value at which the device received the source voice command, the volume value at which the device A received the source voice command A is compared with the volume value at which the device B received the source voice command B. A larger volume value means that the device is closer to the user and may be the device the user is facing. Here, the device with the larger volume value can be regarded as the device with the higher priority, defined as the primary source voice command collection terminal, and the device with the smaller volume value is regarded as the device with the lower priority.
• Correspondingly, the priority of the voice command corresponding to the device with the higher priority is also higher, and the priority of the voice command corresponding to the device with the lower priority is also lower. It is assumed in this embodiment that the priority of the voice command A is higher than the priority of the voice command B.
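• The priority judgment of step 211 can be sketched as a ranking over the reported volume values. The device names and the dB-style numbers below are illustrative assumptions:

```python
# Sketch of the priority judgment of step 211 (illustrative only): the
# device that received the source voice command at the higher volume is
# treated as the primary source voice command collection terminal.

def pick_primary(commands):
    """commands: list of (device_id, voice_command, volume_value) tuples.
    Returns (primary_command, redundant_commands), highest volume first."""
    ranked = sorted(commands, key=lambda cmd: cmd[2], reverse=True)
    return ranked[0], ranked[1:]

primary, redundant = pick_primary([
    ("device_A", "voice command A", -12.0),  # louder: the user faces device A
    ("device_B", "voice command B", -27.5),
])
print(primary[0])  # device_A keeps its command; device_B's is redundant
```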
• Step 212: The RECH function entity returns the high-priority voice command A to the device A, and discards the low-priority voice command B.
• The high-priority voice command A is considered to have been sent by the primary source voice command collection terminal, and the low-priority voice command B is considered a redundant command. The RECH function entity therefore directly returns the high-priority voice command A to the device A, and discards the low-priority voice command B.
• Step 213: The RECH function entity sends a redundant instruction indication to the device B.
• The RECH function entity may also send a redundant instruction indication to the device B, to notify the device B that the source voice command it monitored is a redundant command and that the source voice command does not need to be executed.
• Step 214: The RECH function entity returns the voice command A to the device A, and returns the voice command B to the device B.
• When the voice command A and the voice command B are not similar instructions, the RECH function entity directly returns the voice command A to the device A and the voice command B to the device B, and the device A and the device B execute the voice command A and the voice command B respectively.
• When the RECH function entity subsequently receives a new voice command, it may also perform a similarity judgment between the new voice command and voice commands that have already been returned to other voice control devices. For example, after the RECH function entity returns the voice command A to the device A, if the RECH function entity receives a new voice command from the device B from the voice parsing server, the RECH function entity can also perform the similarity judgment between the new voice command and the voice command A that has been returned to the device A; when the new voice command and the voice command A are similar instructions, the new voice command does not need to be returned to the device B and is directly discarded.
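• The bookkeeping this implies, comparing a newly arrived voice command against commands already returned to other devices, might look like the following sketch. The 5-second retention window and the exact-match similarity test are stand-ins (assumptions) for the time and content similarity judgments of steps 209 and 210:

```python
# Sketch of the post-return redundancy check (illustrative only). The
# 5-second retention window and exact-content matching are assumptions;
# a real RECH entity would reuse the time/content similarity judgment.
import time

RETENTION_S = 5.0  # assumed window during which returned commands are kept

class ReturnedCommandCache:
    def __init__(self):
        self._entries = []  # (return_time, device_id, instruction_content)

    def remember(self, device_id, content, now=None):
        self._entries.append((now if now is not None else time.monotonic(),
                              device_id, content))

    def is_redundant(self, content, now=None):
        now = now if now is not None else time.monotonic()
        # Drop returned commands older than the retention window.
        self._entries = [e for e in self._entries if now - e[0] < RETENTION_S]
        return any(content == stored for _, _, stored in self._entries)

cache = ReturnedCommandCache()
cache.remember("device_A", "voice command A", now=100.0)
print(cache.is_redundant("voice command A", now=101.0))  # True: discard it
print(cache.is_redundant("voice command A", now=200.0))  # False: window passed
```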
• This embodiment provides an instruction processing method. The RECH function entity receives the voice command A and the voice command B sent by the voice parsing server, and determines, according to the start timestamps and cutoff timestamps of the source voice commands corresponding to the voice command A and the voice command B and the instruction contents of the voice command A and the voice command B, whether the voice command A and the voice command B are similar instructions. When the voice command A and the voice command B are similar instructions, according to the priority parameters of the source voice commands corresponding to the voice command A and the voice command B, the voice command with the higher priority is returned to the corresponding voice control device, and the voice command with the lower priority is discarded.
• This embodiment prevents multiple voice control devices from repeatedly executing a voice command collected at the same time, and eliminates control errors caused by repeated execution of the command.
• FIG. 4 is a signaling diagram of Embodiment 3 of the instruction processing method of the present invention.
• This embodiment provides an instruction processing method in which the RECH function entity is deployed locally.
• FIG. 5 is the system architecture diagram of Embodiment 3 of the present invention. As shown in FIG. 5, it is assumed that the device A and the device B are two voice control devices, both of which have voice control functions.
• Two voice control devices are used as an example to describe the solution of the present invention.
• In this embodiment, the RECH function entity is a module integrated in the local voice control gateway; in practice, the RECH function entity may also be a local device independent of the local voice control gateway.
  • the instruction processing method provided in this embodiment may specifically include the following steps:
  • Step 401 Device A synchronizes time with the local voice control gateway.
  • Step 402 Device B synchronizes time with the local voice control gateway.
• The foregoing steps time-synchronize the device A and the device B, both of which have the voice control function, with the local voice control gateway, so that the local voice control gateway can subsequently obtain accurate collection time information carried in the source voice commands.
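• The embodiment does not specify a synchronization mechanism for steps 401-402. As one hedged illustration only, the gateway could estimate each device's clock offset with a classic NTP-style exchange, after which device timestamps can be mapped onto the gateway's clock; the protocol and sample values below are assumptions, not part of the claimed method:

```python
# Illustrative NTP-style clock-offset estimate for steps 401-402; the
# protocol and the sample values are assumptions, not part of the method.

def clock_offset(t1, t2, t3, t4):
    """Estimate the gateway clock minus the device clock.
    t1: device send time (device clock), t2: gateway receive time,
    t3: gateway reply time, t4: device receive time (device clock)."""
    return ((t2 - t1) + (t3 - t4)) / 2.0

# A device whose clock runs 250 ms ahead, with symmetric 20 ms latency:
offset = clock_offset(t1=1000.250, t2=1000.020, t3=1000.021, t4=1000.291)
print(round(offset, 3))  # -0.25 s: add the offset to a device timestamp
                         # to express it on the gateway clock
```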
  • Step 403 Device A sends a source voice command A to the local voice control gateway.
• The source voice command can be, for example: "Decrease the volume of device B by one level".
• The device A sends the source voice command to the local voice control gateway.
• The source voice command A here refers specifically to the source voice command reported by the device A.
• The source voice command A carries the start timestamp A, the cutoff timestamp A, and the priority parameter A of the source voice command A. The start timestamp of the source voice command A is used to indicate the start time of the source voice command monitored by the device A, and the cutoff timestamp of the source voice command A is used to indicate the cutoff time of the source voice command monitored by the device A; the priority parameter is a parameter set by the user for identifying the priority of a device or command when similar instructions occur.
• Step 404: The local voice control gateway performs identity verification and authentication on the device A.
• After receiving the source voice command reported by the device A, the local voice control gateway performs identity verification and authentication on the device A, and performs subsequent processing after the identity verification and authentication are passed.
  • Step 405 Device B sends a source voice command B to the local voice control gateway.
• Because the device A and the device B collect the same voice information, the source voice command monitored by the device B is the same utterance as in step 403.
• The source voice command B here refers to the source voice command reported by the device B.
• The source voice command B carries the start timestamp B, the cutoff timestamp B, and the priority parameter B of the source voice command B. The start timestamp of the source voice command B is used to indicate the start time of the source voice command monitored by the device B, and the cutoff timestamp of the source voice command B is used to indicate the cutoff time of the source voice command monitored by the device B; the priority parameter is a parameter set by the user for identifying the priority of a device or command when similar instructions occur.
• Step 406: The local voice control gateway performs identity verification and authentication on the device B.
• After receiving the source voice command reported by the device B, the local voice control gateway performs identity verification and authentication on the device B, and performs subsequent processing after the identity verification and authentication are passed.
• Step 407: The local voice control gateway sends the source voice command A to the voice parsing server.
• Step 408: The local voice control gateway sends the source voice command B to the voice parsing server. It should be noted that there is no timing relationship between steps 407 and 408 in this embodiment; that is, the two steps may be performed simultaneously or in any order.
• Step 409: The voice parsing server sends the voice command A generated by parsing the source voice command A to the local voice control gateway.
• After the voice parsing server receives the source voice command A reported by the device A and the identity verification and authentication of the device A are passed, the voice parsing server parses the source voice command A; the parsing generates a voice command A that the device can understand and execute, which corresponds to the source voice command A. The voice parsing server sends the parsed voice command A to the local voice control gateway, where the voice command A carries the start timestamp, the cutoff timestamp, and the priority parameter of the source voice command A corresponding to the voice command A, so that the RECH function entity in the local voice control gateway can perform a similarity judgment on the voice command A against other voice commands.
• Step 410: The voice parsing server sends the voice command B generated by parsing the source voice command B to the local voice control gateway.
• After the voice parsing server receives the source voice command B reported by the device B and the identity verification and authentication of the device B are passed, the voice parsing server parses the source voice command B; the parsing generates a voice command B that the device can understand and execute, which corresponds to the source voice command B.
• The voice parsing server sends the parsed voice command B to the local voice control gateway, where the voice command B carries the start timestamp, the cutoff timestamp, and the priority parameter of the source voice command B corresponding to the voice command B, so that the RECH function entity in the local voice control gateway can perform a similarity judgment on the voice command B against other voice commands.
• Step 411: The local voice control gateway determines, according to the start timestamps and cutoff timestamps of the source voice commands corresponding to the voice command A and the voice command B respectively, whether the acquisition times of the source voice command A corresponding to the voice command A and the source voice command B corresponding to the voice command B overlap; if yes, step 412 is performed, otherwise step 416 is performed.
• The collection time information carried in the voice commands may include a start timestamp and a cutoff timestamp, according to which the local voice control gateway determines whether the acquisition times of the source voice command A corresponding to the voice command A and the source voice command B corresponding to the voice command B overlap, that is, performs the time similarity judgment. Specifically, the RECH function entity in the local voice control gateway may determine whether the difference between the start timestamp of the source voice command A and the start timestamp of the source voice command B is less than a preset threshold, and whether the difference between the cutoff timestamp of the source voice command A and the cutoff timestamp of the source voice command B is less than the preset threshold. If both the difference between the start timestamps and the difference between the cutoff timestamps are less than the preset threshold, the acquisition times of the source voice command A and the source voice command B overlap, and step 412 is performed; if the difference between the start timestamps or the difference between the cutoff timestamps is greater than or equal to the preset threshold, the acquisition times of the source voice command A and the source voice command B do not overlap, and step 416 is performed.
• Alternatively, the RECH function entity in the local voice control gateway may obtain the durations of the voice command A and the voice command B according to the start timestamps and cutoff timestamps of the source voice commands corresponding to the voice command A and the voice command B respectively, and determine whether the duration of the voice command A overlaps with the duration of the voice command B. If the two durations overlap, the acquisition times of the source voice command A and the source voice command B overlap, and step 412 is performed; if the two durations do not overlap, the voice command A and the voice command B do not satisfy the time similarity condition, and step 416 is performed.
• Before the above judgments, the RECH function entity may first determine whether the difference between the start timestamp of the voice command A and the start timestamp of the voice command B is less than a preset time threshold; if yes, step 411 is performed on the two voice commands, otherwise this process can be ended.
• Step 412: The RECH function entity in the local voice control gateway determines, according to the instruction contents of the voice command A and the voice command B, whether the voice command A and the voice command B are repeated in content; if yes, step 413 is performed, otherwise step 416 is performed.
• When the RECH function entity in the local voice control gateway determines that the acquisition time of the source voice command A corresponding to the voice command A overlaps with that of the source voice command B corresponding to the voice command B, the RECH function entity further determines, according to the instruction contents of the voice command A and the voice command B, whether the voice command A and the voice command B are repeated in content.
• Specifically, the voice features of the user can also be compared to determine whether the source voice commands corresponding to the two voice commands were sent by the same user. If the instruction contents of the two largely overlap, the commands are repeated. For example, a threshold may be set; if the percentage of overlapping content between the instruction contents of the two is greater than the threshold, the voice command A and the voice command B are repeated in content, that is, the voice command A and the voice command B are similar instructions, and step 413 is performed. If the instruction contents of the two are not the same, the voice command A and the voice command B are not repeated in content, the voice command A and the voice command B are not similar instructions, and step 416 is performed.
• Step 413: The RECH function entity in the local voice control gateway acquires the priorities of the voice command A and the voice command B according to the priority parameters of the source voice commands corresponding to the voice command A and the voice command B.
• The RECH function entity in the local voice control gateway obtains the priorities of the voice command A and the voice command B respectively according to the priority parameters of the source voice commands corresponding to the voice command A and the voice command B. For example, when the priority parameter is set to the volume value at which the device received the source voice command, the volume value at which the device A received the source voice command A is compared with the volume value at which the device B received the source voice command B. A larger volume value means that the device is closer to the user and may be the device the user is facing. Here, the device with the larger volume value can be regarded as the device with the higher priority, defined as the primary source voice command collection terminal, and the device with the smaller volume value is regarded as the device with the lower priority.
• Correspondingly, the priority of the voice command corresponding to the device with the higher priority is also higher, and the priority of the voice command corresponding to the device with the lower priority is also lower. It is assumed in this embodiment that the priority of the voice command A is higher than the priority of the voice command B.
• Step 414: The local voice control gateway returns the high-priority voice command A to the device A, and discards the low-priority voice command B.
• The high-priority voice command A is considered to have been sent by the primary source voice command collection terminal, and the low-priority voice command B is considered a redundant command. The local voice control gateway therefore directly returns the high-priority voice command A to the device A, and discards the low-priority voice command B.
• Step 415: The local voice control gateway sends a redundant instruction indication to the device B.
• The local voice control gateway may also send a redundant instruction indication to the device B, to notify the device B that the source voice command it monitored is a redundant command and that the source voice command does not need to be executed.
• Step 416: The local voice control gateway returns the voice command A to the device A, and returns the voice command B to the device B.
• When the voice command A and the voice command B are not similar instructions, the local voice control gateway directly returns the voice command A to the device A and the voice command B to the device B, and the device A and the device B execute the voice command A and the voice command B respectively.
• When it subsequently receives a new voice command, the RECH function entity in the local voice control gateway may also perform similarity judgments between the new voice command and voice commands that have already been returned to other voice control devices. For example, after the RECH function entity returns the voice command A to the device A, if the RECH function entity receives a new voice command from the device B from the voice parsing server, the RECH function entity can also perform the similarity judgment between the new voice command and the voice command A that has been returned to the device A; when the new voice command and the voice command A are similar instructions, the new voice command does not need to be returned to the device B and is directly discarded.
• This embodiment provides an instruction processing method. The RECH function entity receives the voice command A and the voice command B sent by the voice parsing server, and determines, according to the start timestamps and cutoff timestamps of the source voice commands corresponding to the voice command A and the voice command B and the instruction contents of the voice command A and the voice command B, whether the voice command A and the voice command B are similar instructions. When the voice command A and the voice command B are similar instructions, according to the priority parameters of the source voice commands corresponding to the voice command A and the voice command B, the voice command with the higher priority is returned to the corresponding voice control device, and the voice command with the lower priority is discarded.
• This embodiment prevents multiple voice control devices from repeatedly executing a voice command collected at the same time, and eliminates control errors caused by repeated execution of the command.
• Persons of ordinary skill in the art may understand that all or part of the steps of the foregoing method embodiments may be implemented by a program instructing relevant hardware; the aforementioned program can be stored in a computer-readable storage medium.
• When executed, the program performs the steps of the foregoing method embodiments; the foregoing storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
• FIG. 6 is a schematic structural diagram of Embodiment 1 of an instruction processing apparatus according to the present invention. As shown in FIG. 6, this embodiment provides an instruction processing apparatus that can specifically perform the steps in Embodiment 1 of the foregoing method, and details are not described herein again.
• The instruction processing apparatus provided in this embodiment may specifically include a receiving module 601, a determining module 602, and a redundant instruction processing module 603.
• The receiving module 601 is configured to receive multiple voice commands sent by the voice parsing server, where the multiple voice commands are generated by the voice parsing server parsing source voice commands from different voice control devices.
• The determining module 602 is configured to determine whether any two voice commands among the multiple voice commands received by the receiving module 601 are similar commands, where similar commands are voice commands corresponding to source voice commands obtained by different voice control devices collecting the same voice information.
• The redundant instruction processing module 603 is configured to discard one of the two similar voice commands when the judgment result of the determining module 602 is that two voice commands among the multiple voice commands are similar commands.
• FIG. 7 is a schematic structural diagram of Embodiment 2 of an instruction processing apparatus according to the present invention. As shown in FIG. 7, this embodiment provides an instruction processing apparatus that can specifically perform the steps in Embodiment 2 or Embodiment 3 of the foregoing method, and details are not described herein again.
• On the basis of the apparatus shown in FIG. 6, in the instruction processing apparatus provided in this embodiment, the multiple voice commands received by the receiving module 601 each carry the collection time information of the source voice command corresponding to the voice command and the instruction content of the voice command.
  • the determining module 602 may specifically include a first determining unit 612, a second determining unit 622, and a similar instruction determining unit 632.
• The first determining unit 612 is configured to determine, according to the collection time information of the source voice commands corresponding to the multiple voice commands received by the receiving module 601, whether the acquisition times of the source voice commands corresponding to any two voice commands among the multiple voice commands overlap.
• The second determining unit 622 is configured to determine, according to the instruction contents of the multiple voice commands received by the receiving module 601, whether any two voice commands among the multiple voice commands are repeated in content.
• The similar instruction determining unit 632 is configured to determine that the two voice commands are similar instructions when the judgment results of the first determining unit 612 and the second determining unit 622 are that the acquisition times of the source voice commands corresponding to the two voice commands overlap and that the two voice commands are repeated in content.
  • the instruction processing apparatus may further include a recording module 604 and a voice instruction determining module 605.
• The recording module 604 is configured to record the collection time information of a new voice command when the new voice command is received from the voice parsing server.
• The voice command determining module 605 is configured to compare the collection time of the new voice command with the collection times of the voice commands recorded by the recording module 604, determine the related voice commands whose collection times differ from the collection time of the new voice command by less than a preset time threshold, and use the new voice command and the related voice commands as the multiple voice commands.
  • the first determining unit 612 may specifically include a first determining subunit 6121 and a second determining subunit 6122.
• The first determining subunit 6121 is configured to determine, according to the start timestamps and cutoff timestamps of the source voice commands corresponding to the multiple voice commands received by the receiving module 601, whether the difference between the start timestamps of the source voice commands corresponding to any two voice commands among the multiple voice commands, and the difference between the cutoff timestamps, are less than a preset threshold; if both the difference between the start timestamps and the difference between the cutoff timestamps are less than the preset threshold, it is determined that the acquisition times of the source voice commands corresponding to the two voice commands overlap.
• The second determining subunit 6122 is configured to obtain the durations of the multiple voice commands according to the start timestamps and cutoff timestamps of the source voice commands corresponding to the multiple voice commands, and determine whether the durations of any two voice commands among the multiple voice commands overlap; if the durations overlap, it is determined that the acquisition times of the source voice commands corresponding to the two voice commands overlap.
• The instruction processing apparatus can also include an obtaining module 606.
• The obtaining module 606 is configured to determine, according to the priority parameters of the source voice commands corresponding to the voice commands received by the receiving module 601, the higher-priority voice command and the lower-priority voice command of the two similar voice commands.
• The redundant instruction processing module 603 is specifically configured to: when the judgment result of the determining module 602 is that two voice commands among the multiple voice commands are similar commands, return the higher-priority voice command of the two similar voice commands to the corresponding voice control device, and discard the lower-priority voice command.
• The redundant instruction processing module 603 in the instruction processing apparatus is further configured to discard a new voice command received by the receiving module 601 when the new voice command and a voice command already returned to another voice control device are similar instructions.
  • this embodiment provides an instruction processing apparatus that receives multiple voice commands sent by the voice parsing server and determines, for any two of them, whether they are similar commands, similar commands being the voice commands corresponding to source voice commands obtained when different voice control devices collect the same voice information; when two voice commands are similar commands, one of them is discarded.
  • this prevents multiple voice control devices from repeatedly executing one voice command collected at the same time, eliminating the control errors caused by repeated command execution.
  • FIG. 8 is a schematic structural diagram of Embodiment 3 of the instruction processing apparatus of the present invention.
  • the instruction processing apparatus provided in this embodiment may specifically include a memory 801, a receiver 802, and a processor 803.
  • the receiver 802 is configured to receive multiple voice commands sent by the voice parsing server, the multiple voice commands being generated by the parsing server after parsing source voice commands from different voice control devices.
  • the memory 801 is used to store program instructions.
  • Processor 803 is coupled to memory 801 and receiver 802. The processor 803 is configured to determine, according to the program instructions in the memory 801, whether any two of the multiple voice commands received by the receiver 802 are similar commands, similar commands being the voice commands corresponding to source voice commands obtained when different voice control devices collect the same voice information.
  • the plurality of voice commands received by the receiver 802 respectively carry the collection time information of the source voice command corresponding to each voice instruction and the instruction content of each voice instruction.
  • the processor 803 is configured to determine, according to the collection time information of the source voice commands corresponding to the multiple voice commands, whether the collection times of the source voice commands corresponding to any two voice commands overlap; to determine, according to the instruction content of the multiple voice commands, whether any two of the multiple voice commands are repeated in content; and, when the collection times of the corresponding source voice commands overlap and the content is repeated, to determine that the two voice commands are similar commands.
  • the processor 803 is further configured to: when receiving a new voice command from the voice parsing server, record the collection time information of the new voice command; compare the collection time of the new voice command with the collection times of previously recorded voice commands, and determine the related voice commands whose collection times differ from that of the new voice command by less than a predetermined threshold; and use the new voice command and the related voice commands as the multiple voice commands.
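The record-and-filter step above can be sketched as a small in-memory table. This is an illustrative assumption: the class name, list representation, and the 120-second threshold are not specified by the patent.

```python
# Illustrative sketch of the instruction time table described above: each
# newly received voice command is recorded with its collection time, and
# only previously recorded commands whose collection time lies within a
# predetermined threshold of the new one are selected for the similarity
# check. All names and the threshold are assumptions.

PREDETERMINED_THRESHOLD = 120.0  # seconds; assumed value

class InstructionTimeTable:
    def __init__(self):
        self._records = []  # list of (collection_time, instruction)

    def add_and_find_related(self, instruction, collection_time,
                             threshold=PREDETERMINED_THRESHOLD):
        """Record the new instruction and return the previously recorded
        instructions close enough in time to need a similarity check."""
        related = [inst for t, inst in self._records
                   if abs(t - collection_time) < threshold]
        self._records.append((collection_time, instruction))
        return related
```

A production version would also expire stale entries (the description elsewhere mentions a timer that deletes records after, for example, 5 minutes); that is omitted here for brevity.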
  • the processor 803 is configured to determine, according to the start timestamps and end timestamps of the source voice commands corresponding to the multiple voice commands, whether the difference between the start timestamps and the difference between the end timestamps of the source voice commands corresponding to any two of the multiple voice commands are both less than a preset threshold; if both differences are less than the preset threshold, it determines that the collection times of the source voice commands corresponding to the two voice commands overlap.
  • the processor 803 is configured to obtain the durations of the multiple voice commands according to the start timestamps and end timestamps of the corresponding source voice commands, and to determine whether the durations of any two of the multiple voice commands overlap; if the durations have an overlapping part, it determines that the collection times of the source voice commands corresponding to the two voice commands overlap.
  • the plurality of voice commands received by the receiver 802 further carry priority parameters of the source voice commands corresponding to the voice commands.
  • the processor 803 is further configured to determine, according to the priority parameters of the source voice commands corresponding to the voice commands, which of two similar voice commands has the higher priority and which has the lower priority; and, when two of the multiple voice commands are similar commands, to return the higher-priority one to the corresponding voice control device and discard the lower-priority one.
  • the processor 803 is further configured to discard a newly received voice command when it is similar to a voice command that has already been returned to another voice control device.
  • FIG. 9 is a schematic structural diagram of an embodiment of a computer system according to the present invention.
  • this embodiment provides a computer system, which may be a microprocessor-based computer, such as a general-purpose PC, a customized PC, a desktop computer, or a portable device such as a smart phone; the scope of the present invention is not limited to these examples.
  • the computer system includes a processor 901 to which an input device 902 and an output device 903 are coupled.
  • the processor 901 can be a general purpose CPU, an application specific integrated circuit (Application Specific Integrated Circuit; ASIC) or one or more integrated circuits configured to control the execution of the program of the present invention.
  • Input device 902 includes a keyboard and mouse, a keypad, a touch screen input device, a voice input module, and the like.
  • the output device 903 includes a screen display unit and a voice module.
  • the computer system further includes a memory 904, which may include one or more of the following storage devices: a Read-Only Memory (ROM), a Random Access Memory (RAM), and a hard drive.
  • the memory is coupled to the processor via a signal bus 905.
  • the computer system further includes a communication interface 906 for communicating with a communication network, such as Ethernet, a Radio Access Network (RAN), or a Wireless Local Area Network (WLAN).
  • the above-mentioned memory 904 (such as RAM) stores an operating system 914, application software 924, a program 934, and the like, where the operating system 914 is an application that controls the processing executed by the processor, and the application software 924 may be a word processor, an email program, or the like, used to display data to the user on the output device.
  • the program 934 may be specifically a program corresponding to the instruction processing method provided by the present invention.
  • the computer system further includes a receiver 907 configured to receive multiple voice commands sent by the voice parsing server, where the multiple voice commands are generated by the parsing server after parsing source voice commands from different voice control devices.
  • the processor 901 in this embodiment is configured to execute instructions stored in the memory 904, where the processor 901 is configured to: determine, for any two of the multiple voice commands, whether they are similar commands, similar commands being the voice commands corresponding to source voice commands obtained when different voice control devices collect the same voice information; and, when two of the multiple voice commands are similar commands, discard one of the two similar voice commands.
  • the plurality of voice commands received by the receiver 907 respectively carry the acquisition time information of the source voice command corresponding to each voice instruction and the instruction content of each voice instruction.
  • the processor 901 is configured to determine, according to the collection time information of the source voice commands corresponding to the multiple voice commands, whether the collection times of the source voice commands corresponding to any two voice commands overlap; to determine, according to the instruction content of the multiple voice commands, whether any two of the multiple voice commands are repeated in content; and, when the collection times of the corresponding source voice commands overlap and the content is repeated, to determine that the two voice commands are similar commands.
  • the processor 901 is further configured to: when receiving a new voice command from the voice parsing server, record the collection time information of the new voice command; compare the collection time of the new voice command with the collection times of previously recorded voice commands, and determine the related voice commands whose collection times differ from that of the new voice command by less than a predetermined threshold; and use the new voice command and the related voice commands as the multiple voice commands.
  • the processor 901 is configured to determine, according to the start timestamps and end timestamps of the source voice commands corresponding to the multiple voice commands, whether the difference between the start timestamps and the difference between the end timestamps of the source voice commands corresponding to any two of the multiple voice commands are both less than a preset threshold; if both differences are less than the preset threshold, it determines that the collection times of the source voice commands corresponding to the two voice commands overlap.
  • the processor 901 is configured to obtain the durations of the multiple voice commands according to the start timestamps and end timestamps of the corresponding source voice commands, and to determine whether the durations of any two of the multiple voice commands overlap; if the durations have an overlapping part, it determines that the collection times of the source voice commands corresponding to the two voice commands overlap.
  • the plurality of voice commands received by the receiver 907 further carry priority parameters of the source voice commands corresponding to the voice commands.
  • the processor 901 is further configured to determine, according to the priority parameters of the source voice commands corresponding to the voice commands, which of two similar voice commands has the higher priority and which has the lower priority; and, when two of the multiple voice commands are similar commands, to return the higher-priority one to the corresponding voice control device and discard the lower-priority one.
  • the processor 901 is further configured to discard a newly received voice command when it is similar to a voice command that has already been returned to another voice control device.
  • FIG. 10 is a schematic structural diagram of Embodiment 1 of the instruction processing system of the present invention.
  • the instruction processing system provided in this embodiment may specifically include a voice parsing server 1, multiple voice control devices 2, and an instruction processing device 3.
  • the instruction processing device 3 may specifically be the instruction processing device shown in FIG. 6, FIG. 7, or FIG. 8. The instruction processing device 3 may be a device independent of the voice parsing server 1, or it may be set in the voice parsing server 1 (not shown), depending on the actual situation.
  • the plurality of voice control devices 2 are respectively configured to collect a plurality of source voice commands, and respectively send the plurality of source voice commands to the voice resolution server 1.
  • the voice parsing server 1 is configured to receive the multiple source voice commands sent by the multiple voice control devices 2, parse the multiple source voice commands to generate multiple voice commands corresponding to them, and send the multiple voice commands to the instruction processing device 3.
  • the voice parsing server 1 in this embodiment is also used for time synchronization with the multiple voice control devices 2.
  • FIG. 11 is a schematic structural diagram of Embodiment 2 of the instruction processing system of the present invention.
  • the instruction processing system provided in this embodiment may specifically include a voice parsing server 1, multiple voice control devices 2, and a local voice control gateway 4.
  • the local voice control gateway 4 may include the instruction processing apparatus 3 shown in Fig. 6, Fig. 7, or Fig. 8 described above.
  • the multiple voice control devices 2 are configured to separately collect multiple source voice commands and send the multiple source voice commands to the local voice control gateway 4.
  • the voice parsing server 1 is configured to receive the multiple source voice commands sent by the local voice control gateway 4, parse them to generate multiple voice commands corresponding to the multiple source voice commands, and return the multiple voice commands to the local voice control gateway 4.
  • the local voice control gateway 4 in this embodiment is also used for time synchronization with the plurality of voice control devices 2.


Abstract

Embodiments of the present invention provide an instruction processing method, apparatus, and system. The method includes: receiving multiple voice instructions sent by a voice parsing server, the multiple voice instructions being generated by the voice parsing server after parsing source voice commands from different voice control devices; determining, for any two of the multiple voice instructions, whether they are similar instructions, similar instructions being the voice instructions corresponding to source voice commands obtained by different voice control devices collecting the same voice information; and, when two of the multiple voice instructions are similar instructions, discarding one of the two similar voice instructions. Embodiments of the present invention further provide an instruction processing apparatus and system. The embodiments eliminate the control errors caused by repeated execution of a command.

Description

Instruction Processing Method, Apparatus, and System. This application claims priority to Chinese Patent Application No. 201210282268.X, filed with the Chinese Patent Office on August 9, 2012 and entitled "Instruction Processing Method, Apparatus, and System", which is incorporated herein by reference in its entirety.
Technical Field: The present invention relates to communications technologies, and in particular, to an instruction processing method, apparatus, and system. Background: As a relatively simple and user-friendly control method, voice control technology has gradually been recognized by the industry. More and more electronic devices, such as smart phones, tablets, and smart TVs, will have voice control functions and will appear together in people's daily lives. It is foreseeable that voice control functions will become increasingly diverse, and more and more consumer electronics and office devices supporting voice control will emerge. With the increasing computing power of terminal devices and the trend toward intelligence, the functions supported by some terminal devices are becoming richer and may overlap; for example, a user can access Twitter through a Smart Phone or a Smart TV. In addition, with the popularization of home automation, other devices in a home network can be controlled through various intelligent terminals; for example, in Moto's 4Home Service, a user can remotely control various home appliances with a mobile phone. Therefore, voice control technology will not only become an important human-machine interaction method, but can also be understood and executed by different intelligent terminals. Current voice command technologies mainly include conventional voice control and intelligent voice control. In conventional voice control, a user needs to issue commands according to a specific grammar and command vocabulary; in intelligent voice control, a user can issue commands freely in natural language. By comparison, conventional voice control has a relatively simple implementation mechanism and high accuracy but a relatively poor user experience, whereas intelligent voice control has a complex implementation mechanism but a relatively good user experience. The industry generally believes that intelligent voice control has broader development prospects; companies such as Apple and Google are increasing research and development in this area. Because of its large computational overhead, intelligent voice control usually adopts a cloud processing model, which can reduce both the complexity of local processing on the device and energy consumption. In the prior art, one implementation of voice control is voice monitoring that keeps running in the background of an electronic device; for example, Samsung's Smart Interaction TV monitors a user's operation instructions in real time so as to execute them quickly.
However, the same voice command issued by a user may be monitored by multiple devices at the same time. For example, when the user issues the instruction "lower the volume of device B by one level" to device A, the command may also be monitored by device B; device B then executes both the instruction delivered by device A and the instruction received directly from the user, so that the volume of device B is lowered twice, resulting in repeated execution of the voice command and even control errors. Summary: Embodiments of the present invention provide an instruction processing method, apparatus, and system, to prevent multiple voice control devices from repeatedly executing one voice command collected at the same time and to eliminate control errors caused by repeated command execution. A first aspect of the embodiments of the present invention provides an instruction processing method, including:
receiving multiple voice instructions sent by a voice parsing server, where the multiple voice instructions are generated by the voice parsing server after parsing source voice commands from different voice control devices;

determining, for any two of the multiple voice instructions, whether they are similar instructions, where similar instructions are the voice instructions corresponding to source voice commands obtained by different voice control devices collecting the same voice information; and

when two of the multiple voice instructions are similar instructions, discarding one of the two similar voice instructions.
In a first possible implementation of the first aspect, the multiple voice instructions respectively carry collection time information of the source voice command corresponding to each voice instruction and the instruction content of each voice instruction;

the determining whether any two of the multiple voice instructions are similar instructions includes:

determining, according to the collection time information of the source voice commands corresponding to the multiple voice instructions, whether the collection times of the source voice commands corresponding to any two of the multiple voice instructions overlap; determining, according to the instruction content of the multiple voice instructions, whether any two of the multiple voice instructions are repeated in content; and

when the collection times of the source voice commands corresponding to any two of the multiple voice instructions overlap and the content is repeated, determining that the two voice instructions are similar instructions.
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the method further includes:

when a new voice instruction is received from the voice parsing server, recording the collection time information of the new voice instruction;

comparing the collection time of the new voice instruction with the collection times of previously recorded voice instructions, and determining the related voice instructions whose collection times differ from the collection time of the new voice instruction by less than a predetermined threshold; and

using the new voice instruction and the related voice instructions as the multiple voice instructions.
With reference to the first or second possible implementation of the first aspect, in a third possible implementation of the first aspect, the determining, according to the collection time information of the source voice commands corresponding to the multiple voice instructions, whether the collection times of the source voice commands corresponding to any two of the multiple voice instructions overlap includes:

determining, according to the start timestamps and end timestamps of the source voice commands corresponding to the multiple voice instructions, whether the difference between the start timestamps and the difference between the end timestamps of the source voice commands corresponding to any two of the multiple voice instructions are both less than a preset threshold; and, if both differences are less than the preset threshold, determining that the collection times of the source voice commands corresponding to the two instructions overlap; or

obtaining the durations of the multiple voice instructions according to the start timestamps and end timestamps of the corresponding source voice commands, and determining whether the durations of any two of the multiple voice instructions have an overlapping part; and, if the durations have an overlapping part, determining that the collection times of the source voice commands corresponding to the two instructions overlap.
With reference to the first aspect or the first to third possible implementations of the first aspect, in a fourth possible implementation of the first aspect, the multiple voice instructions further respectively carry the priority parameters of the source voice commands corresponding to the voice instructions;

the method further includes:

determining, according to the priority parameters of the source voice commands corresponding to the voice instructions, which of two similar voice instructions has the higher priority and which has the lower priority; and the discarding one of the two similar voice instructions when two of the multiple voice instructions are similar instructions includes:

when two of the multiple voice instructions are similar instructions, returning the higher-priority one of the two similar voice instructions to the corresponding voice control device and discarding the lower-priority one.
With reference to the first aspect or the first to fourth possible implementations of the first aspect, in a fifth possible implementation of the first aspect, the instruction processing method of the embodiment of the present invention further includes:

when a newly received voice instruction is similar to a voice instruction already returned to another voice control device, discarding the new voice instruction.
With reference to the first aspect, in a sixth possible implementation of the first aspect, the instruction processing method of the embodiment of the present invention further includes:

performing time synchronization between the voice parsing server and each voice control device; and

receiving, by the voice parsing server, the source voice commands sent by the voice control devices.
With reference to the first aspect, in a seventh possible implementation of the first aspect, the instruction processing method of the embodiment of the present invention further includes:

performing time synchronization between a local voice control gateway and each voice control device; and

receiving, by the local voice control gateway, the source voice commands sent by the voice control devices, and sending the source voice commands to the voice parsing server.
A second aspect of the embodiments of the present invention provides an instruction processing apparatus, including:

a receiving module, configured to receive multiple voice instructions sent by a voice parsing server, the multiple voice instructions being generated by the voice parsing server after parsing source voice commands from different voice control devices;

a determining module, configured to determine, for any two of the multiple voice instructions received by the receiving module, whether they are similar instructions, similar instructions being the voice instructions corresponding to source voice commands obtained by different voice control devices collecting the same voice information; and

a redundant instruction processing module, configured to discard one of two similar voice instructions when the determining module determines that two of the multiple voice instructions are similar instructions.
In a first possible implementation of the second aspect, the multiple voice instructions received by the receiving module respectively carry collection time information of the source voice command corresponding to each voice instruction and the instruction content of each voice instruction;

the determining module includes:

a first determining unit, configured to determine, according to the collection time information of the source voice commands corresponding to the multiple voice instructions received by the receiving module, whether the collection times of the source voice commands corresponding to any two of the multiple voice instructions overlap;

a second determining unit, configured to determine, according to the instruction content of the multiple voice instructions received by the receiving module, whether any two of the multiple voice instructions are repeated in content; and a similar instruction determining unit, configured to determine that two voice instructions are similar instructions when the first and second determining units determine that the collection times of the source voice commands corresponding to the two voice instructions overlap and the content is repeated.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the apparatus further includes:

a recording module, configured to record the collection time information of a new voice instruction when the new voice instruction is received from the voice parsing server; and

a voice instruction determining module, configured to compare the collection time of the new voice instruction with the collection times of voice instructions previously recorded by the recording module, determine the related voice instructions whose collection times differ from the collection time of the new voice instruction by less than a predetermined threshold, and use the new voice instruction and the related voice instructions as the multiple voice instructions.
With reference to the first or second possible implementation of the second aspect, in a third possible implementation of the second aspect, the first determining unit includes: a first determining sub-unit, configured to determine, according to the start timestamps and end timestamps of the source voice commands corresponding to the multiple voice instructions received by the receiving module, whether the difference between the start timestamps and the difference between the end timestamps of the source voice commands corresponding to any two of the multiple voice instructions are both less than a preset threshold, and, if both differences are less than the preset threshold, determine that the collection times of the source voice commands corresponding to the two instructions overlap; or

a second determining sub-unit, configured to obtain the durations of the multiple voice instructions according to the start timestamps and end timestamps of the source voice commands corresponding to the multiple voice instructions received by the receiving module, determine whether the durations of any two of the multiple voice instructions have an overlapping part, and, if the durations have an overlapping part, determine that the collection times of the source voice commands corresponding to the two instructions overlap.
With reference to the second aspect or the first to third possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the multiple voice instructions received by the receiving module further respectively carry the priority parameters of the source voice commands corresponding to the multiple voice instructions;

the apparatus further includes:

an obtaining module, configured to determine, according to the priority parameters of the source voice commands corresponding to the voice instructions received by the receiving module, which of two similar voice instructions has the higher priority and which has the lower priority;

where the redundant instruction processing module is specifically configured to: when the determining module determines that two of the multiple voice instructions are similar instructions, return the higher-priority one of the two similar voice instructions to the corresponding voice control device and discard the lower-priority one.
With reference to the second aspect or the first to fourth possible implementations of the second aspect, in a fifth possible implementation of the second aspect, the redundant instruction processing module is further configured to discard a new voice instruction received by the receiving module when the new voice instruction is similar to a voice instruction already returned to another voice control device.
A third aspect of the embodiments of the present invention provides an instruction processing system, including a voice parsing server, multiple voice control devices, and the instruction processing apparatus described above;

the multiple voice control devices are respectively configured to collect multiple source voice commands and send the multiple source voice commands to the voice parsing server; and

the voice parsing server is configured to receive the multiple source voice commands sent by the multiple voice control devices, parse the multiple source voice commands to generate multiple voice instructions corresponding to them, and send the multiple voice instructions to the instruction processing apparatus.
In a first possible implementation of the third aspect, the voice parsing server is further configured to perform time synchronization with the multiple voice control devices. A fourth aspect of the embodiments of the present invention provides an instruction processing system, including a voice parsing server, multiple voice control devices, and a local voice control gateway, where the local voice control gateway includes the instruction processing apparatus described above;
the multiple voice control devices are configured to respectively collect multiple source voice commands and send the multiple source voice commands to the local voice control gateway; and

the voice parsing server is configured to receive the multiple source voice commands sent by the local voice control gateway, parse the multiple source voice commands to generate multiple voice instructions corresponding to them, and return the multiple voice instructions to the local voice control gateway.

In a first possible implementation of the fourth aspect, the local voice control gateway is further configured to perform time synchronization with the multiple voice control devices.
The technical effect of the embodiments of the present invention is as follows: multiple voice instructions sent by a voice parsing server are received, and it is determined, for any two of the multiple voice instructions, whether they are similar instructions, similar instructions being the voice instructions corresponding to source voice commands obtained by different voice control devices collecting the same voice information; when two voice instructions are similar instructions, one of them is discarded. The embodiments prevent multiple voice control devices from repeatedly executing one voice command collected at the same time, eliminating the control errors caused by repeated command execution. Brief Description of the Drawings: To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the accompanying drawings in the following description are some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these accompanying drawings without creative efforts. FIG. 1 is a flowchart of Embodiment 1 of the instruction processing method of the present invention;
FIG. 2 is a flowchart of Embodiment 2 of the instruction processing method of the present invention;

FIG. 3 is a schematic diagram of the system architecture in Embodiment 2 of the instruction processing method of the present invention;

FIG. 4 is a signaling diagram of Embodiment 3 of the instruction processing method of the present invention;

FIG. 5 is a schematic diagram of the system architecture in Embodiment 3 of the instruction processing method of the present invention;

FIG. 6 is a schematic structural diagram of Embodiment 1 of the instruction processing apparatus of the present invention; FIG. 7 is a schematic structural diagram of Embodiment 2 of the instruction processing apparatus of the present invention;

FIG. 8 is a schematic structural diagram of Embodiment 3 of the instruction processing apparatus of the present invention;

FIG. 9 is a schematic structural diagram of an embodiment of the computer system of the present invention;

FIG. 10 is a schematic structural diagram of Embodiment 1 of the instruction processing system of the present invention;

FIG. 11 is a schematic structural diagram of Embodiment 2 of the instruction processing system of the present invention. Detailed Description of the Embodiments
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings. Apparently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts shall fall within the protection scope of the present invention.
FIG. 1 is a flowchart of Embodiment 1 of the instruction processing method of the present invention. As shown in FIG. 1, this embodiment provides an instruction processing method, which may specifically include the following steps:

Step 101: Receive multiple voice instructions sent by a voice parsing server.
This embodiment proposes a Redundant voicE Command identification and Handling (RECH) mechanism. Specifically, a RECH functional entity may be added to an existing voice control system; the RECH functional entity may be an independent device or a module integrated into an existing device. Specifically, the RECH functional entity in this embodiment may be deployed together with the voice parsing server on the network side, or set directly in the voice parsing server as a module; it may also be deployed locally, that is, together with the local voice control gateway, or set directly in the local voice control gateway as a module.
In this step, the RECH functional entity receives multiple voice instructions sent by the voice parsing server. The multiple voice instructions may be sent by the voice parsing server in sequence, and may be generated and sent by the voice parsing server within a preset time period. The purpose of setting a preset time period here is to process voice instructions received at different times differently: when the time difference between two received voice instructions is large, the earlier-received voice instruction may be returned directly to the corresponding voice control device without waiting for the later voice instruction and performing a similarity judgment on the two. Therefore, this embodiment may specifically set a preset time period and perform pairwise similarity judgment on the voice instructions received within the preset time period. The multiple voice instructions are generated by the voice parsing server after parsing source voice commands from different voice control devices. In this embodiment, the two voice instructions on which similarity judgment is performed come from different voice control devices; there is no need to perform similarity judgment on voice instructions from the same voice control device. The voice parsing server parses each source voice command and generates the voice instruction corresponding to it.
Step 102: Determine, for any two of the multiple voice instructions, whether they are similar instructions; if so, perform Step 103; otherwise, perform Step 104.
After the voice instructions are received, it is determined, for any two of the multiple voice instructions, whether they are similar instructions, similar instructions here being the voice instructions corresponding to source voice commands obtained by different voice control devices collecting the same voice information. Specifically, this embodiment may perform similarity judgment on any two of the voice instructions, determining for each pair whether the two instructions are voice instructions corresponding to source voice commands obtained by different voice control devices collecting the same voice information; if so, that is, two of the voice instructions are similar instructions, Step 103 is performed; otherwise, that is, no two of the voice instructions are similar instructions, Step 104 is performed. Specifically, in the similarity judgment, whether two voice instructions overlap in time may be determined according to the collection time information of the corresponding source voice commands, and whether two voice instructions are repeated in content may be determined according to their instruction content.
Specifically, the multiple voice instructions received in Step 101 are voice instructions that satisfy a preset time condition; the preset time condition here is used to constrain the collection times of the source voice commands corresponding to the voice instructions on which similarity judgment is performed. For example, similarity judgment only needs to be performed on voice instructions whose collection times are close to each other, while voice instructions whose collection times are far apart (for example, more than 2 minutes) are essentially impossible to be similar instructions. After Step 101, this embodiment may further include the following steps: when a new voice instruction is received from the voice parsing server, recording the collection time information of the new voice instruction; comparing the collection time of the new voice instruction with the collection times of previously recorded voice instructions, and determining the related voice instructions whose collection times differ from the collection time of the new voice instruction by less than a predetermined threshold; and using the new voice instruction and the related voice instructions as the multiple voice instructions. The collection time of a voice instruction is the start timestamp of the source voice command corresponding to the voice instruction.
Correspondingly, this embodiment may set up an instruction time table and record the collection times of received voice instructions in it. This embodiment may also set a timer for the instruction time table to time the collection time information stored in it. When a piece of collection time information has been stored in the instruction time table for longer than a preset time, for example 5 minutes, the voice instruction corresponding to that collection time information has timed out and is essentially impossible to be similar to other voice instructions received later; the instruction time information may then be deleted from the instruction time table, so that the corresponding timed-out voice instruction will no longer be obtained from the table.
To avoid the large amount of computation caused by performing similarity judgment on all voice instructions, this embodiment limits the collection times of the voice instructions that need similarity judgment; that is, whether two voice instructions need similarity judgment is decided by the collection times of the voice instructions stored in the instruction time table. Specifically, the related voice instructions whose collection times differ from the collection time of the new voice instruction by less than the predetermined threshold are obtained according to the instruction time table; these related voice instructions, together with the new voice instruction, are the multiple voice instructions that currently need similarity judgment.
Step 103: Discard one of the two similar voice instructions.

After the above similarity judgment, when two of the voice instructions are similar instructions, one voice instruction may be selected from the two similar voice instructions and discarded, thereby avoiding redundant instructions and effectively preventing repeated execution of the same command. Meanwhile, the other voice instruction is sent to the voice control device corresponding to it; after receiving its voice instruction, the voice control device can execute the operation indicated by the voice instruction in response to the source voice command issued by the user.
Specifically, this embodiment may also perform redundancy processing on the two similar voice instructions according to the priorities of the source voice commands corresponding to them. The priority of a voice instruction may be obtained according to a default-set priority of the source voice command, or the voice instruction may carry a priority parameter of the source voice command. The priority parameter may be set according to the actual situation; for example, the volume value at which the voice control device receives the source voice command may be set as the priority parameter: the higher the volume value, the higher the priority of the corresponding voice instruction. Specifically, the higher-priority one of the two similar voice instructions is returned to the corresponding voice control device, the corresponding voice control device here being the device that sent the source voice command corresponding to the voice instruction to the voice parsing server; after receiving its voice instruction, the voice control device can execute the operation indicated by the voice instruction in response to the source voice command issued by the user. Meanwhile, the lower-priority voice instruction is discarded, and a redundant instruction indication may be sent to the voice control device corresponding to the lower-priority voice instruction to notify it that the source voice command it monitored is a redundant command, thereby effectively preventing repeated execution of the same command.
Step 104: Return the voice instructions to the corresponding voice control devices. After the above similarity judgment, when there are no similar instructions among the multiple voice instructions, the voice instructions may be returned directly to the corresponding voice control devices, the corresponding voice control device here being the device that sent the source voice command corresponding to a voice instruction to the voice parsing server; each voice instruction corresponds to one voice control device. After receiving its voice instruction, each voice control device can execute the operation indicated by the voice instruction in response to the source voice command issued by the user.
This embodiment provides an instruction processing method: multiple voice instructions sent by a voice parsing server are received, and it is determined, for any two of the multiple voice instructions, whether they are similar instructions, similar instructions being the voice instructions corresponding to source voice commands obtained by different voice control devices collecting the same voice information; when two voice instructions are similar instructions, one of them is discarded. This embodiment prevents multiple voice control devices from repeatedly executing one voice command collected at the same time, eliminating the control errors caused by repeated command execution.
FIG. 2 is a signaling diagram of Embodiment 2 of the instruction processing method of the present invention. As shown in FIG. 2, this embodiment provides an instruction processing method in which the RECH functional entity is deployed on the network side. FIG. 3 is a schematic diagram of the system architecture in Embodiment 2 of the instruction processing method of the present invention. As shown in FIG. 3, assume that device A and device B are two voice control devices, both with voice control functions; this embodiment describes the solution of the present invention using these two voice control devices in the network as an example. The RECH functional entity is a device independent of the voice parsing server; of course, the RECH functional entity may also be integrated in the voice parsing server. Specifically, the instruction processing method provided in this embodiment may include the following steps:
Step 201: Device A performs time synchronization with the voice parsing server.

Step 202: Device B performs time synchronization with the voice parsing server.

In the above steps, device A and device B, both with voice control functions, first perform time synchronization with the voice parsing server on the network side, so that the voice parsing server can subsequently obtain accurately the collection time information carried in the source voice commands.

It should be noted that there is no ordering restriction between Step 201 and Step 202 in this embodiment; that is, the two steps may be performed at the same time or in any order.
Step 203: Device A sends source voice command A to the voice parsing server.

After device A monitors and collects a source voice command issued by the user, which may be, for example, "lower the volume of device B by one level", device A needs to send the source voice command to the voice parsing server for parsing before executing it. In this step, device A sends source voice command A to the voice parsing server. Source voice command A here refers to the source voice command reported by device A; it carries the start timestamp (initial time stamp A), end timestamp (end time stamp A), and priority parameter (priority re-value A) of source voice command A. The start timestamp of source voice command A indicates the start time of the source voice command as monitored by device A, the end timestamp indicates the end time of the source voice command as monitored by device A, and the priority parameter is a parameter set by the user or the device to indicate the priority of a device or command when similar instructions occur.
Step 204: The voice parsing server performs identity verification and authentication on device A.

After receiving the source voice command reported by device A, the voice parsing server first performs identity verification and authentication on device A; subsequent parsing is performed only after the verification and authentication succeed.
Step 205: Device B sends source voice command B to the voice parsing server.

After device B monitors and collects a source voice command issued by the user, which may be, for example, "lower the volume of device B by one level", device B needs to send the source voice command to the voice parsing server for parsing before executing it. In this step, device B sends source voice command B to the voice parsing server. Source voice command B here refers to the source voice command reported by device B; it carries the start timestamp (initial time stamp B), end timestamp (end time stamp B), and priority parameter (priority re-value B) of source voice command B. The start timestamp of source voice command B indicates the start time of the source voice command as monitored by device B, the end timestamp indicates the end time of the source voice command as monitored by device B, and the priority parameter is a parameter set by the user to indicate the priority of a device or command when similar instructions occur.
Step 206: The voice parsing server performs identity verification and authentication on device B.

After receiving the source voice command reported by device B, the voice parsing server first performs identity verification and authentication on device B; subsequent parsing is performed only after the verification and authentication succeed.

It should be noted that there is no ordering restriction between Step 204 and Step 206 in this embodiment; that is, the two steps may be performed at the same time or in any order.
Step 207: The voice parsing server sends, to the RECH functional entity, voice instruction A generated by parsing source voice command A.

After receiving source voice command A reported by device A and completing identity verification and authentication for device A, the voice parsing server parses source voice command A and, through the parsing, generates voice instruction A that a device can understand and execute; voice instruction A corresponds to source voice command A. The voice parsing server sends the generated voice instruction A to the RECH functional entity; voice instruction A carries the start timestamp, end timestamp, and priority parameter of source voice command A corresponding to it, and the RECH functional entity performs similarity judgment between voice instruction A and other voice instructions.
Step 208: The voice parsing server sends, to the RECH functional entity, voice instruction B generated by parsing source voice command B.

After receiving source voice command B reported by device B and completing identity verification and authentication for device B, the voice parsing server parses source voice command B and, through the parsing, generates voice instruction B that a device can understand and execute; voice instruction B corresponds to source voice command B. The voice parsing server sends the generated voice instruction B to the RECH functional entity; voice instruction B carries the start timestamp, end timestamp, and priority parameter of source voice command B corresponding to it, and the RECH functional entity performs similarity judgment between voice instruction B and other voice instructions.

It should be noted that there is no ordering restriction between Step 207 and Step 208 in this embodiment; that is, the two steps may be performed at the same time or in any order.
Step 209: The RECH functional entity determines, according to the start timestamps and end timestamps of the source voice commands corresponding to voice instruction A and voice instruction B, whether the collection times of source voice command A corresponding to voice instruction A and source voice command B corresponding to voice instruction B overlap; if so, Step 210 is performed; otherwise, Step 214 is performed.

After receiving voice instruction A and voice instruction B from the voice parsing server, the RECH functional entity determines, according to the collection time information carried in them, which may include start timestamps and end timestamps, whether the collection times of source voice command A corresponding to voice instruction A and source voice command B corresponding to voice instruction B overlap; that is, a time similarity judgment is performed. Specifically, in the time similarity judgment, the RECH functional entity may determine whether the difference between the start timestamp of source voice command A and the start timestamp of source voice command B is less than a preset threshold, and whether the difference between the end timestamp of source voice command A and the end timestamp of source voice command B is less than a preset threshold. If both the difference between the start timestamps and the difference between the end timestamps are less than the preset thresholds, the collection times of source voice command A and source voice command B overlap, and Step 210 is performed; if the difference between the start timestamps or the difference between the end timestamps is greater than or equal to the preset threshold, the collection times of source voice command A corresponding to voice instruction A and source voice command B corresponding to voice instruction B do not overlap, and Step 214 is performed.

Alternatively, in the time similarity judgment, the RECH functional entity may obtain the durations of voice instruction A and voice instruction B according to the start timestamps and end timestamps of their corresponding source voice commands, and determine whether the duration of voice instruction A and the duration of voice instruction B have an overlapping part. If the two durations have an overlapping part, the collection times of source voice command A corresponding to voice instruction A and source voice command B corresponding to voice instruction B overlap, and Step 210 is performed; if the two durations have no overlapping part, the collection times do not overlap, and Step 214 is performed.
Further, in this embodiment, before Step 209, the RECH functional entity may first determine whether the difference between the start timestamp of voice instruction A and the start timestamp of voice instruction B is greater than a preset time threshold; if so, Step 209 is then performed; otherwise, this procedure may end.
Step 210: The RECH functional entity determines, according to the instruction content of voice instruction A and voice instruction B, whether voice instruction A and voice instruction B are repeated in content; if so, Step 211 is performed; otherwise, Step 214 is performed.

After the above judgment step, when the RECH functional entity determines that voice instruction A and voice instruction B overlap in time, the RECH functional entity further determines, according to the instruction content of voice instruction A and voice instruction B, whether they are repeated in content; specifically, the voice features of the user may be compared to determine whether the source voice commands corresponding to the two voice instructions were issued by the same user. If the instruction content of the two has a large overlapping part, for example, a threshold may be set, and if the percentage of the overlapping part in the instruction content of the two is greater than this threshold, voice instruction A and voice instruction B are repeated in content, voice instruction A and voice instruction B are similar instructions, and Step 211 is performed; if the instruction content of the two is not the same, voice instruction A and voice instruction B are not repeated in content, they are not similar instructions, and Step 214 is performed.

It should be noted that whether voice instruction A and voice instruction B are repeated in content may also be determined first; when the condition is not met, Step 214 is performed, and when the content is repeated, whether the collection times of the source voice commands corresponding to voice instruction A and voice instruction B overlap is then determined; when the collection times do not overlap, Step 214 is performed, and when the collection times overlap, Step 211 is performed.
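The content-repetition test described above (content counts as repeated when the overlapping portion exceeds a set percentage) can be sketched as follows. Word-level comparison and the 0.8 threshold are illustrative assumptions; the patent does not fix either the granularity of comparison or the threshold value.

```python
# Minimal sketch of the content-repetition test: two instructions count
# as repeated when the overlapping portion of their content exceeds a set
# percentage. Word-level set overlap and the 0.8 threshold are assumed.

def content_repeated(text_a, text_b, threshold=0.8):
    words_a, words_b = set(text_a.split()), set(text_b.split())
    if not words_a or not words_b:
        return False
    overlap = len(words_a & words_b)
    # Ratio of shared words to the shorter instruction's word count.
    ratio = overlap / min(len(words_a), len(words_b))
    return ratio > threshold
```

In a real system the comparison could also incorporate speaker voice features, as the passage above suggests, to confirm the two commands came from the same user.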
Step 211: The RECH functional entity obtains the priorities of voice instruction A and voice instruction B according to the priority parameters of their corresponding source voice commands.

Through the above time similarity and content similarity judgments, when voice instruction A and voice instruction B are determined to be similar instructions, the RECH functional entity obtains the priorities of voice instruction A and voice instruction B according to the priority parameters of the source voice commands corresponding to them. For example, when the priority parameter is set to the volume value at which a device receives the source voice command, the volume value at which device A received source voice command A is compared with the volume value at which device B received source voice command B; a larger volume value means the device is closer to the user and is likely the device the user is facing. The device with the larger volume value may be treated as the higher-priority device, that is, defined as the primary source voice command collection terminal, and the device with the smaller volume value as the lower-priority device; correspondingly, the voice instruction corresponding to the higher-priority device has the higher priority, and the voice instruction corresponding to the lower-priority device has the lower priority. In this embodiment, assume that the priority of voice instruction A is higher than that of voice instruction B.
Step 212: The RECH functional entity returns the higher-priority voice instruction A to device A and discards the lower-priority voice instruction B.

After the priorities of voice instruction A and voice instruction B are obtained, in this embodiment the higher-priority voice instruction A is considered to come from the primary source voice command collection terminal, and the lower-priority voice instruction B is considered a redundant instruction; the RECH functional entity therefore returns the higher-priority voice instruction A directly to device A and discards the lower-priority voice instruction B.
Step 213: The RECH functional entity sends a redundant instruction indication to device B.

In this embodiment, the RECH functional entity may also send a redundant instruction indication to device B to notify device B that the source voice command it monitored is a redundant command and does not need to be executed.
Step 214: The RECH functional entity returns voice instruction A to device A and voice instruction B to device B.

Through the above judgments, if voice instruction A and voice instruction B do not satisfy the time similarity condition or do not satisfy the content similarity condition, voice instruction A and voice instruction B are not similar instructions; the RECH functional entity then returns voice instruction A directly to device A and voice instruction B to device B, and device A and device B execute voice instruction A and voice instruction B respectively.
In this embodiment, after the above steps are completed, if the RECH functional entity receives a new voice instruction from the voice parsing server, the RECH functional entity may also perform similarity judgment between the new voice instruction and the voice instructions already returned to other voice control devices. For example, after the RECH functional entity returns voice instruction A to device A, if the RECH functional entity then receives from the voice parsing server a new voice instruction originating from device B, the RECH functional entity may perform similarity judgment between the new voice instruction and voice instruction A already returned to device A. When the new voice instruction and voice instruction A are similar instructions, the new instruction does not need to be returned to device B and is discarded directly.
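The post-return check above (a late-arriving instruction similar to one already returned is dropped rather than delivered) can be sketched as follows. `is_similar` stands in for the time/content similarity test; all names are illustrative assumptions.

```python
# Sketch of the post-return check: a newly received instruction that is
# similar to one already returned to another device is discarded instead
# of being delivered. `is_similar` is a stand-in for the combined
# time/content similarity test; all names are assumed.

def process_new_instruction(new_cmd, returned_cmds, is_similar):
    """Return new_cmd if it should be delivered, or None to discard it."""
    for prev in returned_cmds:
        if is_similar(new_cmd, prev):
            return None  # redundant: already handled by another device
    returned_cmds.append(new_cmd)
    return new_cmd
```

This covers the case where the two similar instructions do not arrive at the RECH functional entity within the same judgment window.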
This embodiment provides an instruction processing method: the RECH functional entity receives voice instruction A and voice instruction B sent by the voice parsing server, and determines whether voice instruction A and voice instruction B are similar instructions according to the start timestamps and end timestamps of the source voice commands corresponding to them and the instruction content of voice instruction A and voice instruction B; when voice instruction A and voice instruction B are similar instructions, according to the priority parameters of the corresponding source voice commands, the higher-priority voice instruction is returned to the corresponding voice control device and the lower-priority voice instruction is discarded. This embodiment prevents multiple voice control devices from repeatedly executing one voice command collected at the same time, eliminating the control errors caused by repeated command execution.
FIG. 4 is a signaling diagram of Embodiment 3 of the instruction processing method of the present invention. As shown in FIG. 4, this embodiment provides an instruction processing method in which the RECH functional entity is deployed locally. FIG. 5 is a schematic diagram of the system architecture in Embodiment 3 of the instruction processing method of the present invention. As shown in FIG. 5, assume that device A and device B are two voice control devices, both with voice control functions; this embodiment describes the solution of the present invention using these two voice control devices in the network as an example. The RECH functional entity is a module integrated in the local voice control gateway; of course, the RECH functional entity may also be a local device independent of the local voice control gateway. Specifically, the instruction processing method provided in this embodiment may include the following steps:
Step 401: Device A performs time synchronization with the local voice control gateway.

Step 402: Device B performs time synchronization with the local voice control gateway.

In the above steps, device A and device B, both with voice control functions, first perform time synchronization with the local voice control gateway, so that the local voice control gateway can subsequently obtain accurately the collection time information carried in the source voice commands.

It should be noted that there is no ordering restriction between Step 401 and Step 402 in this embodiment; that is, the two steps may be performed at the same time or in any order.
Step 403: Device A sends source voice command A to the local voice control gateway.

After device A monitors and collects a source voice command issued by the user, which may be, for example, "lower the volume of device B by one level", in this step device A sends source voice command A to the local voice control gateway. Source voice command A here refers to the source voice command reported by device A; it carries the start timestamp (initial time stamp A), end timestamp (end time stamp A), and priority parameter (priority re-value A) of source voice command A. The start timestamp of source voice command A indicates the start time of the source voice command as monitored by device A, the end timestamp indicates the end time of the source voice command as monitored by device A, and the priority parameter is a parameter set by the user to indicate the priority of a device or command when similar instructions occur.
Step 404: The local voice control gateway performs identity verification and authentication on device A.

After receiving the source voice command reported by device A, the local voice control gateway first performs identity verification and authentication on device A; subsequent processing is performed only after the verification and authentication succeed.
Step 405: Device B sends source voice command B to the local voice control gateway.

After device B monitors and collects a source voice command issued by the user, which may be, for example, "lower the volume of device B by one level", in this step device B sends source voice command B to the local voice control gateway. Source voice command B here refers to the source voice command reported by device B; it carries the start timestamp (initial time stamp B), end timestamp (end time stamp B), and priority parameter (priority re-value B) of source voice command B. The start timestamp of source voice command B indicates the start time of the source voice command as monitored by device B, the end timestamp indicates the end time of the source voice command as monitored by device B, and the priority parameter is a parameter set by the user to indicate the priority of a device or command when similar instructions occur.
Step 406: The local voice control gateway performs identity verification and authentication on device B.

After receiving the source voice command reported by device B, the local voice control gateway first performs identity verification and authentication on device B; subsequent processing is performed only after the verification and authentication succeed.

It should be noted that there is no ordering restriction between Step 404 and Step 406 in this embodiment; that is, the two steps may be performed at the same time or in any order.

Step 407: The local voice control gateway sends source voice command A to the voice parsing server. Step 408: The local voice control gateway sends source voice command B to the voice parsing server. It should be noted that there is no ordering restriction between Step 407 and Step 408 in this embodiment; that is, the two steps may be performed at the same time or in any order.
Step 409: The voice parsing server sends, to the local voice control gateway, voice instruction A generated by parsing source voice command A.

After receiving source voice command A reported by device A and completing identity verification and authentication for device A, the voice parsing server parses source voice command A and, through the parsing, generates voice instruction A that a device can understand and execute; voice instruction A corresponds to source voice command A. The voice parsing server sends the generated voice instruction A to the local voice control gateway; voice instruction A carries the start timestamp, end timestamp, and priority parameter of source voice command A corresponding to it, and the RECH functional entity in the local voice control gateway performs similarity judgment between voice instruction A and other voice instructions.
Step 410: The voice parsing server sends, to the local voice control gateway, voice instruction B generated by parsing source voice command B.

After receiving source voice command B reported by device B and completing identity verification and authentication for device B, the voice parsing server parses source voice command B and, through the parsing, generates voice instruction B that a device can understand and execute; voice instruction B corresponds to source voice command B. The voice parsing server sends the generated voice instruction B to the local voice control gateway; voice instruction B carries the start timestamp, end timestamp, and priority parameter of source voice command B corresponding to it, and the RECH functional entity in the local voice control gateway performs similarity judgment between voice instruction B and other voice instructions.

It should be noted that there is no ordering restriction between Step 409 and Step 410 in this embodiment; that is, the two steps may be performed at the same time or in any order.
Step 411: The local voice control gateway judges, according to the initial time stamps and end time stamps of the source voice commands corresponding to voice instruction A and voice instruction B, whether the collection times of source voice command A (corresponding to voice instruction A) and source voice command B (corresponding to voice instruction B) overlap; if they do, step 412 is performed; otherwise, step 416 is performed.
After receiving voice instruction A and voice instruction B from the voice parsing server, the local voice control gateway judges, according to the collection time information carried in them (which may include the initial time stamp and the end time stamp), whether the collection times of source voice command A corresponding to voice instruction A and source voice command B corresponding to voice instruction B overlap, that is, performs a time-similarity judgment. Specifically, during the time-similarity judgment, the RECH functional entity in the local voice control gateway may judge whether the difference between the initial time stamps of source voice command A and source voice command B is smaller than a preset threshold, and whether the difference between their end time stamps is smaller than the preset threshold. If both the initial time stamp difference and the end time stamp difference are smaller than the preset threshold, the collection times of source voice command A and source voice command B overlap, and step 412 is performed; if either the initial time stamp difference or the end time stamp difference is greater than or equal to the preset threshold, the collection times of source voice command A and source voice command B do not overlap, and step 416 is performed.
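The first time-similarity test described above can be sketched as follows; the function name and the tuple representation of a command's timestamps are assumptions for illustration, not part of the embodiment.

```python
def timestamps_overlap(cmd_a, cmd_b, threshold):
    """First overlap test sketched from the embodiment: the collection
    times overlap when both the start-timestamp difference and the
    end-timestamp difference are below the preset threshold.

    cmd_a and cmd_b are (initial_time_stamp, end_time_stamp) pairs.
    """
    start_diff = abs(cmd_a[0] - cmd_b[0])
    end_diff = abs(cmd_a[1] - cmd_b[1])
    return start_diff < threshold and end_diff < threshold
```

With a 0.5 s threshold, two devices hearing the same utterance report nearly identical stamps and are judged overlapping, while commands a second apart are not.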
Alternatively, during the time-similarity judgment, the RECH functional entity in the local voice control gateway may obtain the durations of voice instruction A and voice instruction B according to the initial time stamps and end time stamps of their corresponding source voice commands, and judge whether the duration of voice instruction A and the duration of voice instruction B have an overlapping portion. If the two durations have an overlapping portion, the collection times of source voice command A and source voice command B overlap, and step 412 is performed; if the two durations do not overlap in time, voice instruction A and voice instruction B do not satisfy the time-similarity condition, and step 416 is performed.
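The alternative duration-based test amounts to an interval intersection check. Again the representation of a command as a (start, end) pair is an illustrative assumption.

```python
def durations_overlap(cmd_a, cmd_b):
    """Alternative overlap test sketched from the embodiment: derive each
    command's duration interval from its start and end timestamps and
    check whether the two intervals intersect."""
    latest_start = max(cmd_a[0], cmd_b[0])
    earliest_end = min(cmd_a[1], cmd_b[1])
    return latest_start < earliest_end
```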
Further, in this embodiment, before step 411, the RECH functional entity may first judge whether the difference between the initial time stamp of voice instruction A and the initial time stamp of voice instruction B is greater than a preset time threshold; if so, step 411 is then performed, and otherwise this procedure may end.
Step 412: The RECH functional entity in the local voice control gateway judges, according to the instruction content of voice instruction A and voice instruction B, whether voice instruction A and voice instruction B are repetitive in content; if so, step 413 is performed; otherwise, step 416 is performed.
After the foregoing judgment step, when the RECH functional entity in the local voice control gateway determines that the collection times of source voice command A corresponding to voice instruction A and source voice command B corresponding to voice instruction B overlap, the RECH functional entity judges, according to the instruction content of voice instruction A and voice instruction B, whether the two are repetitive in content; specifically, the user's voice features may be compared to judge whether the source voice commands corresponding to the two voice instructions were issued by the same user. If a large portion of the two instruction contents overlaps, for example, a threshold may be set and, when the percentage of overlapping content in the two instruction contents exceeds this threshold, voice instruction A and voice instruction B are repetitive in content and are similar instructions, and step 413 is performed; if the two instruction contents differ, voice instruction A and voice instruction B are not repetitive in content and are not similar instructions, and step 416 is performed.
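The content-repetition test leaves the exact similarity measure open; a minimal sketch using token-level Jaccard overlap as the "percentage of overlapping content" follows. The 0.8 threshold and the tokenization are assumptions, not values from the embodiment.

```python
def contents_repeat(text_a, text_b, ratio_threshold=0.8):
    """Content-similarity sketch: two instructions are treated as
    repetitive when the overlapping portion of their contents exceeds a
    threshold. Jaccard overlap over whitespace tokens is one concrete
    (hypothetical) choice of measure."""
    tokens_a, tokens_b = set(text_a.split()), set(text_b.split())
    if not tokens_a or not tokens_b:
        return False
    overlap = len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
    return overlap > ratio_threshold
```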
It should be noted that it is also feasible to first judge whether voice instruction A and voice instruction B are repetitive in content, performing step 416 when they are not; when they are repetitive in content, it is then judged whether the collection times of the source voice commands corresponding to voice instruction A and voice instruction B overlap, performing step 416 when the collection times do not overlap and step 413 when they do.
Step 413: The RECH functional entity in the local voice control gateway obtains the priorities of voice instruction A and voice instruction B according to the priority parameters of the source voice commands corresponding to voice instruction A and voice instruction B.
When the foregoing time-similarity and content-similarity judgments determine that voice instruction A and voice instruction B are similar instructions, the RECH functional entity in the local voice control gateway obtains the priorities of voice instruction A and voice instruction B according to the priority parameters of their corresponding source voice commands. For example, when the priority parameter is set to the volume level at which a device received the source voice command, the volume level at which device A received source voice command A is compared with the volume level at which device B received source voice command B; a higher volume level means the device is closer to the user and is likely the device the user is facing. The device with the higher volume level may be regarded as the higher-priority device, that is, defined as the primary source-voice-command collection terminal, and the device with the lower volume level as the lower-priority device; correspondingly, the voice instruction corresponding to the higher-priority device has a higher priority, and the voice instruction corresponding to the lower-priority device has a lower priority. This embodiment assumes that the priority of voice instruction A is higher than that of voice instruction B.
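The priority comparison above can be sketched as a simple selection between two similar instructions. The dictionary layout and key names are illustrative assumptions.

```python
def pick_primary(instr_a, instr_b):
    """When two instructions are similar, keep the one whose source
    command carried the higher priority parameter (e.g. the received
    volume level, so the device the user is facing wins) and mark the
    other for discarding."""
    if instr_a["priority"] >= instr_b["priority"]:
        return instr_a, instr_b  # (keep, discard)
    return instr_b, instr_a

keep, drop = pick_primary({"id": "A", "priority": 0.9},
                          {"id": "B", "priority": 0.4})
```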
Step 414: The local voice control gateway returns the higher-priority voice instruction A to device A and discards the lower-priority voice instruction B.
After the priorities of voice instruction A and voice instruction B are obtained, in this embodiment the higher-priority voice instruction A is regarded as issued by the primary source-voice-command collection terminal and the lower-priority voice instruction B is regarded as a redundant instruction; the local voice control gateway therefore returns the higher-priority voice instruction A directly to device A and discards the lower-priority voice instruction B.
Step 415: The local voice control gateway sends a redundant-instruction indication to device B.
In this embodiment, the local voice control gateway may further send a redundant-instruction indication to device B, to notify device B that the source voice command it heard is a redundant command that does not need to be executed.
Step 416: The local voice control gateway returns voice instruction A to device A and voice instruction B to device B.
If, through the foregoing judgments, voice instruction A and voice instruction B do not satisfy the time-similarity condition or do not satisfy the content-similarity condition, voice instruction A and voice instruction B are not similar instructions; the local voice control gateway then returns voice instruction A directly to device A and voice instruction B to device B, and device A and device B execute voice instruction A and voice instruction B respectively.
In this embodiment, after the foregoing steps are completed, if the local voice control gateway receives a new voice instruction from the voice parsing server, the RECH functional entity in the local voice control gateway may further perform similarity judgment between the new voice instruction and voice instructions already returned to other voice control devices. For example, after the RECH functional entity returns voice instruction A to device A, if the RECH functional entity then receives from the voice parsing server a new voice instruction originating from device B, the RECH functional entity may perform similarity judgment between the new voice instruction and voice instruction A already returned to device A. When the new voice instruction and voice instruction A are similar instructions, the new instruction does not need to be returned to device B and is discarded directly.
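The late-arrival handling above can be sketched by replaying the similarity tests against instructions that have already been returned. The 0.5 s threshold, exact-content comparison, and data layout are all illustrative assumptions.

```python
def handle_new_instruction(new_instr, returned_instrs, threshold=0.5):
    """After instructions have been returned, a newly parsed instruction
    is still compared against them; a similar one is discarded instead
    of being returned (sketch combining a timestamp test with a content
    test, both hypothetical simplifications)."""
    for old in returned_instrs:
        time_similar = (abs(new_instr["start"] - old["start"]) < threshold
                        and abs(new_instr["end"] - old["end"]) < threshold)
        if time_similar and new_instr["content"] == old["content"]:
            return "discard"
    return "return"
```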
This embodiment provides a command handling method. The RECH functional entity receives voice instruction A and voice instruction B sent by the voice parsing server, and judges whether voice instruction A and voice instruction B are similar instructions according to the initial time stamps and end time stamps of their corresponding source voice commands and according to the instruction content of voice instruction A and voice instruction B. When voice instruction A and voice instruction B are similar instructions, according to the priority parameters of their corresponding source voice commands, the higher-priority voice instruction is returned to the corresponding voice control device and the lower-priority voice instruction is discarded. This embodiment prevents multiple voice control devices from repeatedly executing one voice command collected simultaneously, thereby eliminating control errors caused by repeated command execution.
A person of ordinary skill in the art may understand that all or some of the steps of the foregoing method embodiments may be implemented by a program instructing relevant hardware. The foregoing program may be stored in a computer-readable storage medium. When the program runs, the steps of the foregoing method embodiments are performed; the foregoing storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
FIG. 6 is a schematic structural diagram of a first embodiment of a command handling apparatus according to the present invention. As shown in FIG. 6, this embodiment provides a command handling apparatus that may specifically perform the steps of the first method embodiment above, which are not described here again. The command handling apparatus provided by this embodiment may specifically include a receiving module 601, a judging module 602, and a redundant-instruction processing module 603. The receiving module 601 is configured to receive multiple voice instructions sent by the voice parsing server, where the multiple voice instructions are generated by the parsing server by parsing source voice commands from different voice control devices. The judging module 602 is configured to judge whether any two of the multiple voice instructions received by the receiving module 601 are similar instructions, where similar instructions are voice instructions corresponding to source voice commands obtained by different voice control devices collecting the same voice information. The redundant-instruction processing module 603 is configured to discard one of two similar voice instructions when the judgment result of the judging module 602 is that two of the multiple voice instructions are similar instructions.
FIG. 7 is a schematic structural diagram of a second embodiment of a command handling apparatus according to the present invention. As shown in FIG. 7, this embodiment provides a command handling apparatus that may specifically perform the steps of the second or third method embodiment above, which are not described here again. On the basis of FIG. 6, in the command handling apparatus of this embodiment, the multiple voice instructions received by the receiving module 601 respectively carry the collection time information of the source voice command corresponding to each voice instruction and the instruction content of each voice instruction. The judging module 602 may specifically include a first judging unit 612, a second judging unit 622, and a similar-instruction determining unit 632. The first judging unit 612 is configured to judge, according to the collection time information of the source voice commands corresponding to the multiple voice instructions received by the receiving module 601, whether the collection times of the source voice commands corresponding to any two of the multiple voice instructions overlap. The second judging unit 622 is configured to judge, according to the instruction content of the multiple voice instructions received by the receiving module 601, whether any two of the multiple voice instructions are repetitive in content. The similar-instruction determining unit 632 is configured to determine that two voice instructions are similar instructions when the judgment results of the first judging unit 612 and the second judging unit 622 are that the collection times of the source voice commands corresponding to the two voice instructions overlap and that the two voice instructions are repetitive in content.
Further, the command handling apparatus provided by this embodiment may also include a recording module 604 and a voice instruction determining module 605. The recording module 604 is configured to record the collection time information of a new voice instruction when the new voice instruction is received from the voice parsing server. The voice instruction determining module 605 is configured to compare the collection time of the new voice instruction with the collection times of voice instructions previously recorded by the recording module 604, determine related voice instructions whose collection times differ from the collection time of the new voice instruction by less than a predetermined threshold, and use the new voice instruction and the related voice instructions as the multiple voice instructions.
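The filtering performed by the recording and determining modules can be sketched as selecting the previously recorded instructions whose collection times fall within the predetermined threshold of the new instruction's. The record layout and parameter names are assumptions for illustration.

```python
def related_instructions(new_time, recorded, max_gap):
    """Sketch of the voice instruction determining module 605: keep the
    previously recorded instructions whose collection time differs from
    the new instruction's by less than the predetermined threshold;
    these, together with the new instruction, form the set of multiple
    voice instructions checked for similarity."""
    return [r for r in recorded if abs(r["time"] - new_time) < max_gap]
```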
Specifically, the first judging unit 612 may include a first judging subunit 6121 and a second judging subunit 6122. The first judging subunit 6121 is configured to judge, according to the initial time stamps and end time stamps of the source voice commands corresponding to the multiple voice instructions received by the receiving module 601, whether the difference between the initial time stamps and the difference between the end time stamps of the source voice commands corresponding to any two of the multiple voice instructions are both smaller than a preset threshold; if both the initial time stamp difference and the end time stamp difference are smaller than the preset threshold, it is determined that the collection times of the source voice commands corresponding to the two instructions overlap. The second judging subunit 6122 is configured to obtain the durations of the multiple voice instructions according to the initial time stamps and end time stamps of the source voice commands corresponding to the multiple voice instructions received by the receiving module 601, and judge whether the durations of any two of the multiple voice instructions have an overlapping portion; if the durations have an overlapping portion, it is determined that the collection times of the source voice commands corresponding to the two instructions overlap.
Furthermore, the multiple voice instructions received by the receiving module 601 in this embodiment also respectively carry the priority parameters of the source voice commands corresponding to the multiple voice instructions. The command handling apparatus may further include an obtaining module 606. The obtaining module 606 is configured to determine, according to the priority parameters of the source voice commands corresponding to the voice instructions received by the receiving module 601, the higher-priority voice instruction and the lower-priority voice instruction of the two similar voice instructions. The redundant-instruction processing module 603 is specifically configured to, when the judgment result of the judging module 602 is that two of the multiple voice instructions are similar instructions, return the higher-priority one of the two similar voice instructions to the corresponding voice control device and discard the lower-priority one.
Furthermore, the redundant-instruction processing module 603 in the command handling apparatus is further configured to discard a new voice instruction received by the receiving module 601 when the new voice instruction and a voice instruction already returned to another voice control device are similar instructions.
This embodiment provides a command handling apparatus that receives multiple voice instructions sent by a voice parsing server and judges whether any two of the multiple voice instructions are similar instructions, where similar instructions are voice instructions corresponding to source voice commands obtained by different voice control devices collecting the same voice information; when two voice instructions are similar instructions, one of them is discarded. This embodiment prevents multiple voice control devices from repeatedly executing one voice command collected simultaneously, thereby eliminating control errors caused by repeated command execution.
FIG. 8 is a schematic structural diagram of a third embodiment of a command handling apparatus according to the present invention. As shown in FIG. 8, the command handling apparatus provided by this embodiment may specifically include a memory 801, a receiver 802, and a processor 803. The receiver 802 is configured to receive multiple voice instructions sent by the voice parsing server, where the multiple voice instructions are generated by the parsing server by parsing source voice commands from different voice control devices. The memory 801 is configured to store program instructions. The processor 803 is coupled to the memory 801 and the receiver 802. The processor 803 is configured to, according to the program instructions in the memory 801, judge whether any two of the multiple voice instructions received by the receiver 802 are similar instructions, where similar instructions are voice instructions corresponding to source voice commands obtained by different voice control devices collecting the same voice information, and, when two of the multiple voice instructions are similar instructions, discard one of the two similar voice instructions.
Specifically, the multiple voice instructions received by the receiver 802 respectively carry the collection time information of the source voice command corresponding to each voice instruction and the instruction content of each voice instruction. The processor 803 is specifically configured to judge, according to the collection time information of the source voice commands corresponding to the multiple voice instructions, whether the collection times of the source voice commands corresponding to any two of the multiple voice instructions overlap; judge, according to the instruction content of the multiple voice instructions, whether any two of the multiple voice instructions are repetitive in content; and, when the collection times of the source voice commands corresponding to any two of the multiple voice instructions overlap and the two voice instructions are repetitive in content, determine that the two voice instructions are similar instructions.
Further, the processor 803 is further configured to record the collection time information of a new voice instruction when the new voice instruction is received from the voice parsing server; compare the collection time of the new voice instruction with the collection times of previously recorded voice instructions; determine related voice instructions whose collection times differ from the collection time of the new voice instruction by less than a predetermined threshold; and use the new voice instruction and the related voice instructions as the multiple voice instructions.
More specifically, the processor 803 is configured to judge, according to the initial time stamps and end time stamps of the source voice commands corresponding to the multiple voice instructions, whether the difference between the initial time stamps and the difference between the end time stamps of the source voice commands corresponding to any two of the multiple voice instructions are both smaller than a preset threshold; if both the initial time stamp difference and the end time stamp difference are smaller than the preset threshold, it is determined that the collection times of the source voice commands corresponding to the two instructions overlap. Alternatively, the processor 803 is configured to obtain the durations of the multiple voice instructions according to the initial time stamps and end time stamps of the source voice commands corresponding to the multiple voice instructions, and judge whether the durations of any two of the multiple voice instructions have an overlapping portion; if the durations have an overlapping portion, it is determined that the collection times of the source voice commands corresponding to the two instructions overlap.
Further, the multiple voice instructions received by the receiver 802 also respectively carry the priority parameters of the source voice commands corresponding to the voice instructions. The processor 803 is further configured to determine, according to the priority parameters of the source voice commands corresponding to the voice instructions, the higher-priority voice instruction and the lower-priority voice instruction of two similar voice instructions, and, when similar instructions exist among the multiple voice instructions, return the higher-priority one of the two similar voice instructions to the corresponding voice control device and discard the lower-priority one.
Furthermore, the processor 803 is further configured to discard a received new voice instruction when the new voice instruction and a voice instruction already returned to another voice control device are similar instructions.
FIG. 9 is a schematic structural diagram of an embodiment of a computer system according to the present invention. As shown in FIG. 9, this embodiment provides a computer system, which may specifically be a microprocessor computer, such as a general-purpose PC, a customized PC, or a portable device such as a desktop computer or a smartphone; however, the scope of the present invention is not limited to these examples. The computer system includes a processor 901, an input device 902, and an output device 903, where the input device 902 and the output device 903 are coupled to the processor 901.
The processor 901 may be a general-purpose CPU, an application-specific integrated circuit (hereinafter referred to as ASIC), or one or more integrated circuits configured to control the execution of the program of the present invention. The input device 902 includes a keyboard, a mouse, a keypad, a touchscreen input device, a voice input module, and the like. The output device 903 includes a screen display unit and a voice module.
The computer system further includes a memory 904, which may also include one or more of the following storage devices: a read-only memory (hereinafter referred to as ROM), a random access memory (hereinafter referred to as RAM), and a hard disk. The memory 904 is coupled to the processor by a signal bus 905.
The computer system further includes a communication interface 906 for communicating with a communication network, such as an Ethernet, a radio access network (hereinafter referred to as RAN), or a wireless local area network (hereinafter referred to as WLAN).
The memory 904 (such as the RAM) stores an operating system 914, application software 924, a program 934, and the like, where the operating system 914 is an application program that controls the processing performed by the processor, the application software 924 may be a word processor, an email program, or the like, used to display data on the output device to the user, and the program 934 may specifically be the program corresponding to the command handling method provided by the present invention.
The computer system further includes a receiver 907, configured to receive multiple voice instructions sent by the voice parsing server, where the multiple voice instructions are generated by the parsing server by parsing source voice commands from different voice control devices. The processor 901 in this embodiment is configured to execute instructions stored in the memory 904, where the processor 901 is configured to judge whether any two of the multiple voice instructions are similar instructions, where similar instructions are voice instructions corresponding to source voice commands obtained by different voice control devices collecting the same voice information, and, when two of the multiple voice instructions are similar instructions, discard one of the two similar voice instructions.
Specifically, the multiple voice instructions received by the receiver 907 respectively carry the collection time information of the source voice command corresponding to each voice instruction and the instruction content of each voice instruction. The processor 901 is specifically configured to judge, according to the collection time information of the source voice commands corresponding to the multiple voice instructions, whether the collection times of the source voice commands corresponding to any two of the multiple voice instructions overlap; judge, according to the instruction content of the multiple voice instructions, whether any two of the multiple voice instructions are repetitive in content; and, when the collection times of the source voice commands corresponding to any two of the multiple voice instructions overlap and the two voice instructions are repetitive in content, determine that the two voice instructions are similar instructions.
Specifically, the processor 901 is further configured to record the collection time information of a new voice instruction when the new voice instruction is received from the voice parsing server; compare the collection time of the new voice instruction with the collection times of previously recorded voice instructions; determine related voice instructions whose collection times differ from the collection time of the new voice instruction by less than a predetermined threshold; and use the new voice instruction and the related voice instructions as the multiple voice instructions.
More specifically, the processor 901 is configured to judge, according to the initial time stamps and end time stamps of the source voice commands corresponding to the multiple voice instructions, whether the difference between the initial time stamps and the difference between the end time stamps of the source voice commands corresponding to any two of the multiple voice instructions are both smaller than a preset threshold; if both the initial time stamp difference and the end time stamp difference are smaller than the preset threshold, it is determined that the collection times of the source voice commands corresponding to the two instructions overlap. Alternatively, the processor 901 is configured to obtain the durations of the multiple voice instructions according to the initial time stamps and end time stamps of the source voice commands corresponding to the multiple voice instructions, and judge whether the durations of any two of the multiple voice instructions have an overlapping portion; if the durations have an overlapping portion, it is determined that the collection times of the source voice commands corresponding to the two instructions overlap.
Further, the multiple voice instructions received by the receiver 907 also respectively carry the priority parameters of the source voice commands corresponding to the voice instructions. The processor 901 is further configured to determine, according to the priority parameters of the source voice commands corresponding to the voice instructions, the higher-priority voice instruction and the lower-priority voice instruction of two similar voice instructions, and, when two of the multiple voice instructions are similar instructions, return the higher-priority one of the two similar voice instructions to the corresponding voice control device and discard the lower-priority one.
Furthermore, the processor 901 is further configured to discard a received new voice instruction when the new voice instruction and a voice instruction already returned to another voice control device are similar instructions.
FIG. 10 is a schematic structural diagram of a first embodiment of a command handling system according to the present invention. As shown in FIG. 10, the command handling system provided by this embodiment may specifically include a voice parsing server 1, multiple voice control devices 2, and a command handling apparatus 3. The command handling apparatus 3 may specifically be the command handling apparatus shown in FIG. 6, FIG. 7, or FIG. 8; in the figure, the command handling apparatus 3 is a device independent of the voice parsing server 1, and the command handling apparatus 3 may also be disposed in the voice parsing server 1 as required (not shown in the figure). The multiple voice control devices 2 are respectively configured to collect multiple source voice commands and send the multiple source voice commands to the voice parsing server 1. The voice parsing server 1 is configured to receive the multiple source voice commands sent by the multiple voice control devices 2, parse the multiple source voice commands to generate multiple voice instructions corresponding to the multiple source voice commands, and send the multiple voice instructions to the command handling apparatus 3.
The voice parsing server 1 in this embodiment is further configured to perform time synchronization with the multiple voice control devices 2.
FIG. 11 is a schematic structural diagram of a second embodiment of a command handling system according to the present invention. As shown in FIG. 11, the command handling system provided by this embodiment may specifically include a voice parsing server 1, multiple voice control devices 2, and a local voice control gateway 4. The local voice control gateway 4 may include the command handling apparatus 3 shown in FIG. 6, FIG. 7, or FIG. 8. The multiple voice control devices 2 are configured to respectively collect multiple source voice commands and send the multiple source voice commands to the local voice control gateway 4. The voice parsing server 1 is configured to respectively receive the multiple source voice commands sent by the local voice control gateway 4, parse the multiple source voice commands to generate multiple voice instructions corresponding to the multiple source voice commands, and respectively return the multiple voice instructions to the local voice control gateway 4.
The local voice control gateway 4 in this embodiment is further configured to perform time synchronization with the multiple voice control devices 2.
Finally, it should be noted that the foregoing embodiments are merely intended to describe the technical solutions of the present invention rather than limit them. Although the present invention is described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that modifications may still be made to the technical solutions described in the foregoing embodiments, or equivalent replacements may be made to some or all of the technical features thereof; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A command handling method, comprising:
receiving multiple voice instructions sent by a voice parsing server, wherein the multiple voice instructions are generated by the voice parsing server by parsing source voice commands from different voice control devices;
judging whether any two of the multiple voice instructions are similar instructions, wherein the similar instructions are voice instructions corresponding to source voice commands obtained by different voice control devices collecting the same voice information; and
when two of the multiple voice instructions are similar instructions, discarding one of the two similar voice instructions.
2. The method according to claim 1, wherein the multiple voice instructions respectively carry collection time information of the source voice command corresponding to each voice instruction and instruction content of each voice instruction; and the judging whether any two of the multiple voice instructions are similar instructions comprises:
judging, according to the collection time information of the source voice commands corresponding to the multiple voice instructions, whether collection times of the source voice commands corresponding to any two of the multiple voice instructions overlap;
judging, according to the instruction content of the multiple voice instructions, whether any two of the multiple voice instructions are repetitive in content; and
when the collection times of the source voice commands corresponding to any two of the multiple voice instructions overlap and the two voice instructions are repetitive in content, determining that the two voice instructions are similar instructions.
3. The method according to claim 2, further comprising:
when a new voice instruction is received from the voice parsing server, recording collection time information of the new voice instruction;
comparing the collection time of the new voice instruction with collection times of previously recorded voice instructions, and determining related voice instructions whose collection times differ from the collection time of the new voice instruction by less than a predetermined threshold; and
using the new voice instruction and the related voice instructions as the multiple voice instructions.
4. The method according to claim 2 or 3, wherein the judging, according to the collection time information of the source voice commands corresponding to the multiple voice instructions, whether collection times of the source voice commands corresponding to any two of the multiple voice instructions overlap comprises:
judging, according to initial time stamps and end time stamps of the source voice commands corresponding to the multiple voice instructions, whether a difference between the initial time stamps and a difference between the end time stamps of the source voice commands corresponding to any two of the multiple voice instructions are both smaller than a preset threshold; and if the difference between the initial time stamps and the difference between the end time stamps are both smaller than the preset threshold, determining that the collection times of the source voice commands corresponding to the two instructions overlap; or
obtaining durations of the multiple voice instructions according to the initial time stamps and end time stamps of the source voice commands corresponding to the multiple voice instructions, and judging whether durations of any two of the multiple voice instructions have an overlapping portion; and if the durations have an overlapping portion, determining that the collection times of the source voice commands corresponding to the two instructions overlap.
5. The method according to any one of claims 1 to 4, wherein the multiple voice instructions further respectively carry priority parameters of the source voice commands corresponding to the voice instructions;
the method further comprises:
determining, according to the priority parameters of the source voice commands corresponding to the voice instructions, a higher-priority voice instruction and a lower-priority voice instruction of two similar voice instructions; and
the discarding one of the two similar voice instructions when two of the multiple voice instructions are similar instructions comprises:
when two of the multiple voice instructions are similar instructions, returning the higher-priority one of the two similar voice instructions to the corresponding voice control device and discarding the lower-priority one.
6. The method according to any one of claims 1 to 5, further comprising:
when a received new voice instruction and a voice instruction already returned to another voice control device are similar instructions, discarding the new voice instruction.
7. The method according to claim 1, further comprising:
performing, by the voice parsing server, time synchronization with each voice control device; and
receiving, by the voice parsing server, the source voice commands sent by each voice control device.
8. The method according to claim 1, further comprising:
performing, by a local voice control gateway, time synchronization with each voice control device; and
receiving, by the local voice control gateway, the source voice commands sent by each voice control device, and sending each source voice command to the voice parsing server.
9. A command handling apparatus, comprising:
a receiving module, configured to receive multiple voice instructions sent by a voice parsing server, wherein the multiple voice instructions are generated by the voice parsing server by parsing source voice commands from different voice control devices;
a judging module, configured to judge whether any two of the multiple voice instructions received by the receiving module are similar instructions, wherein the similar instructions are voice instructions corresponding to source voice commands obtained by different voice control devices collecting the same voice information; and
a redundant-instruction processing module, configured to discard one of two similar voice instructions when the judgment result of the judging module is that two of the multiple voice instructions are similar instructions.
10. The apparatus according to claim 9, wherein the multiple voice instructions received by the receiving module respectively carry collection time information of the source voice command corresponding to each voice instruction and instruction content of each voice instruction;
the judging module comprises:
a first judging unit, configured to judge, according to the collection time information of the source voice commands corresponding to the multiple voice instructions received by the receiving module, whether collection times of the source voice commands corresponding to any two of the multiple voice instructions overlap;
a second judging unit, configured to judge, according to the instruction content of the multiple voice instructions received by the receiving module, whether any two of the multiple voice instructions are repetitive in content; and
a similar-instruction determining unit, configured to determine that two voice instructions are similar instructions when the judgment results of the first judging unit and the second judging unit are that the collection times of the source voice commands corresponding to the two voice instructions overlap and that the two voice instructions are repetitive in content.
11. The apparatus according to claim 10, further comprising:
a recording module, configured to record collection time information of a new voice instruction when the new voice instruction is received from the voice parsing server; and
a voice instruction determining module, configured to compare the collection time of the new voice instruction with collection times of voice instructions previously recorded by the recording module, determine related voice instructions whose collection times differ from the collection time of the new voice instruction by less than a predetermined threshold, and use the new voice instruction and the related voice instructions as the multiple voice instructions.
12. The apparatus according to claim 10 or 11, wherein the first judging unit comprises:
a first judging subunit, configured to judge, according to initial time stamps and end time stamps of the source voice commands corresponding to the multiple voice instructions received by the receiving module, whether a difference between the initial time stamps and a difference between the end time stamps of the source voice commands corresponding to any two of the multiple voice instructions are both smaller than a preset threshold; and if the difference between the initial time stamps and the difference between the end time stamps are both smaller than the preset threshold, determine that the collection times of the source voice commands corresponding to the two instructions overlap; or
a second judging subunit, configured to obtain durations of the multiple voice instructions according to the initial time stamps and end time stamps of the source voice commands corresponding to the multiple voice instructions received by the receiving module, and judge whether durations of any two of the multiple voice instructions have an overlapping portion; and if the durations have an overlapping portion, determine that the collection times of the source voice commands corresponding to the two instructions overlap.
13. The apparatus according to any one of claims 9 to 12, wherein the multiple voice instructions received by the receiving module further respectively carry priority parameters of the source voice commands corresponding to the multiple voice instructions;
the apparatus further comprises:
an obtaining module, configured to determine, according to the priority parameters of the source voice commands corresponding to the voice instructions received by the receiving module, a higher-priority voice instruction and a lower-priority voice instruction of two similar voice instructions; and
the redundant-instruction processing module is specifically configured to, when the judgment result of the judging module is that two of the multiple voice instructions are similar instructions, return the higher-priority one of the two similar voice instructions to the corresponding voice control device and discard the lower-priority one.
14. The apparatus according to any one of claims 9 to 13, wherein the redundant-instruction processing module is further configured to discard a new voice instruction received by the receiving module when the new voice instruction and a voice instruction already returned to another voice control device are similar instructions.
15. A command handling system, comprising a voice parsing server, multiple voice control devices, and the command handling apparatus according to any one of claims 9 to 14;
wherein the multiple voice control devices are respectively configured to collect multiple source voice commands and send the multiple source voice commands to the voice parsing server; and
the voice parsing server is configured to receive the multiple source voice commands sent by the multiple voice control devices, parse the multiple source voice commands to generate multiple voice instructions corresponding to the multiple source voice commands, and send the multiple voice instructions to the command handling apparatus.
16. The system according to claim 15, wherein the voice parsing server is further configured to perform time synchronization with the multiple voice control devices.
17. A command handling system, comprising a voice parsing server, multiple voice control devices, and a local voice control gateway, wherein the local voice control gateway comprises the command handling apparatus according to any one of claims 9 to 14;
the multiple voice control devices are configured to respectively collect multiple source voice commands and send the multiple source voice commands to the local voice control gateway; and
the voice parsing server is configured to respectively receive the multiple source voice commands sent by the local voice control gateway, parse the multiple source voice commands to generate multiple voice instructions corresponding to the multiple source voice commands, and respectively return the multiple voice instructions to the local voice control gateway.
18. The system according to claim 17, wherein the local voice control gateway is further configured to perform time synchronization with the multiple voice control devices.
PCT/CN2013/081131 2012-08-09 2013-08-09 Command handling method, apparatus, and system WO2014023257A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP13827606.8A EP2830044B1 (en) 2012-08-09 2013-08-09 Instruction processing method, apparatus, and system
US14/520,575 US9704503B2 (en) 2012-08-09 2014-10-22 Command handling method, apparatus, and system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210282268.XA CN102831894B (zh) 2012-08-09 2012-08-09 指令处理方法、装置和系统
CN201210282268.X 2012-08-09

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/520,575 Continuation US9704503B2 (en) 2012-08-09 2014-10-22 Command handling method, apparatus, and system

Publications (1)

Publication Number Publication Date
WO2014023257A1 true WO2014023257A1 (zh) 2014-02-13

Family

ID=47334993

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/081131 WO2014023257A1 (zh) 2012-08-09 2013-08-09 指令处理方法、装置和系统

Country Status (4)

Country Link
US (1) US9704503B2 (zh)
EP (1) EP2830044B1 (zh)
CN (1) CN102831894B (zh)
WO (1) WO2014023257A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109074808A (zh) * 2018-07-18 2018-12-21 深圳魔耳智能声学科技有限公司 语音控制方法、中控设备和存储介质
CN113162964A (zh) * 2020-01-23 2021-07-23 丰田自动车株式会社 代理系统、终端装置以及代理程序

Families Citing this family (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831894B (zh) 2012-08-09 2014-07-09 华为终端有限公司 指令处理方法、装置和系统
CN104035814A (zh) * 2013-03-07 2014-09-10 联想(北京)有限公司 一种数据处理的方法及电子设备
US10204622B2 (en) * 2015-09-10 2019-02-12 Crestron Electronics, Inc. Acoustic sensory network
US10748539B2 (en) * 2014-09-10 2020-08-18 Crestron Electronics, Inc. Acoustic sensory network
CN106469040B (zh) 2015-08-19 2019-06-21 华为终端有限公司 通信方法、服务器及设备
US10783888B2 (en) * 2015-09-10 2020-09-22 Crestron Electronics Inc. System and method for determining recipient of spoken command in a control system
US9653075B1 (en) * 2015-11-06 2017-05-16 Google Inc. Voice commands across devices
US10026401B1 (en) 2015-12-28 2018-07-17 Amazon Technologies, Inc. Naming devices via voice commands
US10127906B1 (en) * 2015-12-28 2018-11-13 Amazon Technologies, Inc. Naming devices via voice commands
US10185544B1 (en) 2015-12-28 2019-01-22 Amazon Technologies, Inc. Naming devices via voice commands
US10049670B2 (en) 2016-06-06 2018-08-14 Google Llc Providing voice action discoverability example for trigger term
CN106357525A (zh) * 2016-08-29 2017-01-25 珠海格力电器股份有限公司 智能网关控制方法和装置及智能网关
US10515632B2 (en) 2016-11-15 2019-12-24 At&T Intellectual Property I, L.P. Asynchronous virtual assistant
CN106796790B (zh) * 2016-11-16 2020-11-10 深圳达闼科技控股有限公司 机器人语音指令识别的方法及相关机器人装置
US10102868B2 (en) * 2017-02-17 2018-10-16 International Business Machines Corporation Bot-based honeypot poison resilient data collection
US10757058B2 (en) 2017-02-17 2020-08-25 International Business Machines Corporation Outgoing communication scam prevention
US10810510B2 (en) 2017-02-17 2020-10-20 International Business Machines Corporation Conversation and context aware fraud and abuse prevention agent
CN107039041B (zh) * 2017-03-24 2020-10-20 广东美的制冷设备有限公司 语音扩展的方法与语音助手
CN107707436A (zh) * 2017-09-18 2018-02-16 广东美的制冷设备有限公司 终端控制方法、装置及计算机可读存储介质
CN107655154A (zh) * 2017-09-18 2018-02-02 广东美的制冷设备有限公司 终端控制方法、空调器及计算机可读存储介质
US10424299B2 (en) * 2017-09-29 2019-09-24 Intel Corporation Voice command masking systems and methods
US10887351B2 (en) * 2018-05-02 2021-01-05 NortonLifeLock Inc. Security for IoT home voice assistants
KR102506361B1 (ko) * 2018-05-03 2023-03-06 구글 엘엘씨 오디오 쿼리들의 오버랩핑 프로세싱의 조정
US10783886B2 (en) * 2018-06-12 2020-09-22 International Business Machines Corporation Cognitive agent disambiguation
CN109308897B (zh) * 2018-08-27 2022-04-26 广东美的制冷设备有限公司 语音控制方法、模块、家电设备、系统和计算机存储介质
CN111063344B (zh) * 2018-10-17 2022-06-28 青岛海信移动通信技术股份有限公司 一种语音识别方法、移动终端以及服务器
US10885912B2 (en) 2018-11-13 2021-01-05 Motorola Solutions, Inc. Methods and systems for providing a corrected voice command
CN109541953A (zh) * 2018-11-27 2019-03-29 深圳狗尾草智能科技有限公司 拓展辅助设备、基于智能机器人的拓展平台及方法
CN109671431A (zh) * 2018-12-14 2019-04-23 科大国创软件股份有限公司 一种基于机器人语音交互的管廊平台监控系统
US11183185B2 (en) * 2019-01-09 2021-11-23 Microsoft Technology Licensing, Llc Time-based visual targeting for voice commands
KR20200098025A (ko) * 2019-02-11 2020-08-20 삼성전자주식회사 전자 장치 및 그 제어 방법
JP2020140431A (ja) * 2019-02-28 2020-09-03 富士ゼロックス株式会社 情報処理装置、情報処理システム、及び情報処理プログラム
CN110299152A (zh) * 2019-06-28 2019-10-01 北京猎户星空科技有限公司 人机对话的输出控制方法、装置、电子设备及存储介质
US20210065719A1 (en) * 2019-08-29 2021-03-04 Comcast Cable Communications, Llc Methods and systems for intelligent content controls
CN113129878A (zh) * 2019-12-30 2021-07-16 富泰华工业(深圳)有限公司 声控方法及终端装置
KR20210106806A (ko) * 2020-02-21 2021-08-31 현대자동차주식회사 차량의 음성인식 장치 및 방법
CN111399910B (zh) * 2020-03-12 2022-06-07 支付宝(杭州)信息技术有限公司 用户指令的处理方法及装置
CN111524529B (zh) * 2020-04-15 2023-11-24 广州极飞科技股份有限公司 音频数据处理方法、装置和系统、电子设备及存储介质
CN112233672A (zh) * 2020-09-30 2021-01-15 成都长虹网络科技有限责任公司 分布式语音控制方法、系统、计算机设备和可读存储介质
US20220179619A1 (en) * 2020-12-03 2022-06-09 Samsung Electronics Co., Ltd. Electronic device and method for operating thereof
CN112837686A (zh) * 2021-01-29 2021-05-25 青岛海尔科技有限公司 唤醒响应操作的执行方法、装置、存储介质及电子装置
CN113470638B (zh) * 2021-05-28 2022-08-26 荣耀终端有限公司 槽位填充的方法、芯片、电子设备和可读存储介质
CN113990298B (zh) * 2021-12-24 2022-05-13 广州小鹏汽车科技有限公司 语音交互方法及其装置、服务器和可读存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1215658A2 (en) * 2000-12-05 2002-06-19 Hewlett-Packard Company Visual activation of voice controlled apparatus
JP2008003371A (ja) * 2006-06-23 2008-01-10 Alpine Electronics Inc 車載用音声認識装置及び音声コマンド登録方法
CN101317416A (zh) * 2005-07-14 2008-12-03 雅虎公司 内容路由器
CN102831894A (zh) * 2012-08-09 2012-12-19 华为终端有限公司 指令处理方法、装置和系统

Family Cites Families (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20010054622A (ko) * 1999-12-07 2001-07-02 서평원 음성 인식 시스템의 음성 인식률 향상 방법
JP2001319045A (ja) * 2000-05-11 2001-11-16 Matsushita Electric Works Ltd 音声マンマシンインタフェースを用いたホームエージェントシステム、及びプログラム記録媒体
US7647374B2 (en) * 2001-07-03 2010-01-12 Nokia Corporation Method for managing sessions between network parties, methods, network element and terminal for managing calls
GB0213255D0 (en) * 2002-06-10 2002-07-17 Nokia Corp Charging in communication networks
US7379978B2 (en) * 2002-07-19 2008-05-27 Fiserv Incorporated Electronic item management and archival system and method of operating the same
US20070128899A1 (en) * 2003-01-12 2007-06-07 Yaron Mayer System and method for improving the efficiency, comfort, and/or reliability in Operating Systems, such as for example Windows
US20080177994A1 (en) * 2003-01-12 2008-07-24 Yaron Mayer System and method for improving the efficiency, comfort, and/or reliability in Operating Systems, such as for example Windows
US7752050B1 (en) * 2004-09-03 2010-07-06 Stryker Corporation Multiple-user voice-based control of devices in an endoscopic imaging system
DE602004015987D1 (de) * 2004-09-23 2008-10-02 Harman Becker Automotive Sys Mehrkanalige adaptive Sprachsignalverarbeitung mit Rauschunterdrückung
US20060136220A1 (en) * 2004-12-22 2006-06-22 Rama Gurram Controlling user interfaces with voice commands from multiple languages
JP4542974B2 (ja) * 2005-09-27 2010-09-15 株式会社東芝 音声認識装置、音声認識方法および音声認識プログラム
EP1955458B1 (en) * 2005-11-29 2012-07-11 Google Inc. Social and interactive applications for mass media
DE602006010505D1 (de) * 2005-12-12 2009-12-31 Gregory John Gadbois Mehrstimmige Spracherkennung
CN101444078B (zh) * 2006-04-13 2012-10-31 京瓷株式会社 群组通信方法和通信终端
EP1850593A1 (fr) * 2006-04-27 2007-10-31 Nagravision S.A. Procédé de génération de paquets à destination d'au moins un récepteur mobile
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
EP2080272B1 (en) * 2006-10-17 2019-08-21 D&M Holdings, Inc. Unification of multimedia devices
US20090055379A1 (en) * 2007-08-22 2009-02-26 Richard Murtagh Systems and Methods for Locating Contact Information
US8724600B2 (en) * 2008-01-07 2014-05-13 Tymphany Hong Kong Limited Systems and methods for providing a media playback in a networked environment
WO2009086599A1 (en) * 2008-01-07 2009-07-16 Avega Systems Pty Ltd A user interface for managing the operation of networked media playback devices
US8411880B2 (en) 2008-01-29 2013-04-02 Qualcomm Incorporated Sound quality by intelligently selecting between signals from a plurality of microphones
US8725492B2 (en) * 2008-03-05 2014-05-13 Microsoft Corporation Recognizing multiple semantic items from single utterance
KR101631496B1 (ko) * 2008-06-03 2016-06-17 삼성전자주식회사 로봇 장치 및 그 단축 명령 등록 방법
WO2010113438A1 (ja) * 2009-03-31 2010-10-07 日本電気株式会社 音声認識処理システム、および音声認識処理方法
EP2478661A4 (en) * 2009-09-17 2014-10-01 Royal Canadian Mint Monnaie Royale Canadienne CONFIDENTIAL MESSAGE STORAGE AND PROTOCOL AND TRANSFER SYSTEM
CN102262879B (zh) * 2010-05-24 2015-05-13 乐金电子(中国)研究开发中心有限公司 语音命令竞争处理方法、装置、语音遥控器和数字电视
US9152634B1 (en) * 2010-06-23 2015-10-06 Google Inc. Balancing content blocks associated with queries
US20120311090A1 (en) * 2011-05-31 2012-12-06 Lenovo (Singapore) Pte. Ltd. Systems and methods for aggregating audio information from multiple sources
US20130018895A1 (en) * 2011-07-12 2013-01-17 Harless William G Systems and methods for extracting meaning from speech-to-text data
US8340975B1 (en) * 2011-10-04 2012-12-25 Theodore Alfred Rosenberger Interactive speech recognition device and system for hands-free building control
US20130317827A1 (en) * 2012-05-23 2013-11-28 Tsung-Chun Fu Voice control method and computer-implemented system for data management and protection

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1215658A2 (en) * 2000-12-05 2002-06-19 Hewlett-Packard Company Visual activation of voice controlled apparatus
CN101317416A (zh) * 2005-07-14 2008-12-03 雅虎公司 内容路由器
JP2008003371A (ja) * 2006-06-23 2008-01-10 Alpine Electronics Inc 車載用音声認識装置及び音声コマンド登録方法
CN102831894A (zh) * 2012-08-09 2012-12-19 华为终端有限公司 指令处理方法、装置和系统

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2830044A4 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109074808A (zh) * 2018-07-18 2018-12-21 深圳魔耳智能声学科技有限公司 语音控制方法、中控设备和存储介质
CN113162964A (zh) * 2020-01-23 2021-07-23 丰田自动车株式会社 代理系统、终端装置以及代理程序
CN113162964B (zh) * 2020-01-23 2024-03-19 丰田自动车株式会社 代理系统、终端装置以及代理程序

Also Published As

Publication number Publication date
EP2830044A4 (en) 2015-06-03
CN102831894B (zh) 2014-07-09
US20150039319A1 (en) 2015-02-05
CN102831894A (zh) 2012-12-19
EP2830044A1 (en) 2015-01-28
EP2830044B1 (en) 2016-05-25
US9704503B2 (en) 2017-07-11

Similar Documents

Publication Publication Date Title
WO2014023257A1 (zh) 指令处理方法、装置和系统
US11615794B2 (en) Voice recognition system, server, display apparatus and control methods thereof
CN111566730B (zh) 低功率设备中的语音命令处理
WO2019085073A1 (zh) 接口测试方法、装置、计算机设备和存储介质
US9479911B2 (en) Method and system for supporting a translation-based communication service and terminal supporting the service
CN107277272A (zh) 一种基于软件app的蓝牙设备语音交互方法及系统
CN110992955A (zh) 一种智能设备的语音操作方法、装置、设备及存储介质
WO2014176894A1 (zh) 一种语音处理的方法和终端
CN108028044A (zh) 使用多个识别器减少延时的语音识别系统
WO2020038145A1 (zh) 一种业务数据处理方法、装置以及相关设备
CN107592250B (zh) 基于航空fc总线多速率自适应测试设备
CN111933149A (zh) 语音交互方法、穿戴式设备、终端及语音交互系统
EP3496094B1 (en) Electronic apparatus and method for controlling the same
CN114244821B (zh) 数据处理方法、装置、设备、电子设备和存储介质
CN101656744B (zh) 一种出钞机的通讯协议转发装置及方法
CN113346973A (zh) 事件提示方法及装置、电子设备、计算机可读存储介质
WO2018120853A1 (zh) 一种总线信号协议解码方法
Nan et al. One solution for voice enabled smart home automation system
JP2019091012A (ja) 情報認識方法および装置
US20140337038A1 (en) Method, application, and device for audio signal transmission
CN103401742B (zh) 家庭网关sip协议配置生效方法及系统
CN113852835A (zh) 直播音频处理方法、装置、电子设备以及存储介质
US9213695B2 (en) Bridge from machine language interpretation to human language interpretation
CN112689112A (zh) 视频交流系统的耗时分析及优化方法、装置、设备及介质
US11483085B1 (en) Device time synchronization by networking device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13827606

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2013827606

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE