CN110660389A - Voice response method, device, system and equipment - Google Patents

Voice response method, device, system and equipment

Info

Publication number
CN110660389A
CN110660389A
Authority
CN
China
Prior art keywords
controlled device
arrival time
voice
controlled
voice instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910859111.0A
Other languages
Chinese (zh)
Inventor
刘道宽
李肇中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd filed Critical Beijing Xiaomi Mobile Software Co Ltd
Priority to CN201910859111.0A priority Critical patent/CN110660389A/en
Publication of CN110660389A publication Critical patent/CN110660389A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/2803 Home automation networks
    • H04L12/2816 Controlling appliance services of a home automation network by calling their functionalities
    • H04L12/282 Controlling appliance services of a home automation network by calling their functionalities based on user interaction within the home
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/104 Peer-to-peer [P2P] networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/225 Feedback of the input speech

Abstract

The disclosure relates to a voice response method, apparatus, system, and device. The voice response system includes at least two controlled devices in the same local area network, the controlled devices including a first controlled device and at least one second controlled device. The first controlled device is configured to detect a voice instruction, determine a first arrival time at which the voice instruction arrives at the first controlled device, compare the first arrival time with a second arrival time sent by the second controlled device, and decide which controlled device responds to the voice instruction. The second controlled device is configured to detect the voice instruction, determine a second arrival time at which the voice instruction arrives at the second controlled device, and send the second arrival time to the first controlled device. The embodiments can automatically select the controlled device closest to the user to process the user's voice instruction, providing a high degree of intelligence.

Description

Voice response method, device, system and equipment
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a voice response method, apparatus, system, and device.
Background
With the development of Internet of Things technology, smart homes are becoming increasingly popular, and it is now common for a single user to buy several smart devices of the same type and place them in different rooms of the same house. This introduces a new problem: when the user issues an instruction, if multiple smart devices placed at different positions all receive the instruction and process it simultaneously, the responses become confused and the user experience suffers.
To handle this, the user can designate the device nearer to them to process the command, but doing so adds extra user operations.
Disclosure of Invention
To overcome the problems in the related art, the present disclosure provides a voice response method, apparatus, system, and device.
According to a first aspect of the embodiments of the present disclosure, a voice response system is provided, where the voice response system includes at least two controlled devices in the same local area network, where the controlled devices include a first controlled device and at least one second controlled device;
the first controlled device is used for detecting a voice instruction, determining first arrival time of the voice instruction to the first controlled device, comparing the first arrival time with second arrival time sent by the second controlled device, and deciding the controlled device responding to the voice instruction;
the second controlled device is configured to detect a voice instruction, determine a second arrival time at which the voice instruction arrives at the second controlled device, and send the second arrival time to the first controlled device.
Optionally, the at least two controlled devices communicate with each other through a P2P network.
Optionally, the first controlled device is a device determined by negotiation between the at least two controlled devices according to a specified negotiation rule, and the second controlled device includes a controlled device in the local area network except the first controlled device.
Optionally, the controlled device that responds to the voice instruction and is decided by the first controlled device is the controlled device corresponding to the shortest arrival time of the voice instruction.
Optionally, the first controlled device is specifically configured to, when at least two identical shortest arrival times exist in the first arrival time and/or the second arrival time, obtain the usage frequency of the controlled device corresponding to the identical shortest arrival time, and use the controlled device with the largest usage frequency as the controlled device that responds to the voice instruction.
Optionally, the controlled device responding to the voice instruction is configured to send the voice instruction to a server, receive an operation command returned by the server after performing voice recognition on the voice instruction, and execute an operation corresponding to the operation command.
According to a second aspect of the embodiments of the present disclosure, there is provided a voice response method, the method including:
the method comprises the steps that a first controlled device detects a voice command and determines first arrival time of the voice command to the first controlled device;
receiving a second arrival time sent by at least one second controlled device and located in the same local area network as the first controlled device, wherein the second arrival time is the time when the voice command detected by the second controlled device arrives at the second controlled device;
and according to the first arrival time and the second arrival time, deciding a controlled device responding to the voice instruction.
Optionally, the deciding, according to the first arrival time and the second arrival time, a controlled device responding to the voice instruction includes:
and selecting the controlled equipment corresponding to the shortest arrival time from the first arrival time and the second arrival time as the controlled equipment responding to the voice instruction.
Optionally, the deciding, according to the first arrival time and the second arrival time, a controlled device responding to the voice instruction further includes:
and when at least two same shortest arrival times exist in the first arrival time and/or the second arrival time, acquiring the use frequency of the controlled equipment corresponding to the same shortest arrival time, and using the controlled equipment with the largest use frequency as the controlled equipment responding to the voice instruction.
Optionally, the first controlled device is a device determined by negotiation between the at least two controlled devices according to a specified negotiation rule, and the second controlled device includes a controlled device in the local area network except the first controlled device.
According to a third aspect of the embodiments of the present disclosure, there is provided a voice response method, the method including:
the second controlled equipment detects the voice command and determines second arrival time of the voice command to the second controlled equipment;
and sending the second arrival time to a first controlled device in the same local area network, so that the first controlled device can decide the controlled device responding to the voice instruction.
Optionally, the first controlled device is a device determined by negotiation of the at least two controlled devices according to a specified negotiation rule, and the second controlled device includes a controlled device in the local area network except the first controlled device.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a voice response apparatus, the apparatus being provided in a first controlled device, the apparatus including:
a first voice detection module configured to detect a voice command and determine a first arrival time at which the voice command arrives at the first controlled device;
the arrival time receiving module is configured to receive a second arrival time sent by at least one second controlled device in the same local area network as the first controlled device, where the second arrival time is the time when the voice instruction detected by the second controlled device arrives at the second controlled device;
and the decision module is configured to decide the controlled equipment responding to the voice instruction according to the first arrival time and the second arrival time.
According to a fifth aspect of the embodiments of the present disclosure, there is provided a voice response apparatus provided in a second controlled device, the apparatus including:
a second voice detection module configured to detect a voice command and determine a second arrival time at which the voice command arrives at the second controlled device;
and the arrival time sending module is configured to send the second arrival time to a first controlled device in the same local area network, so that the first controlled device can decide the controlled device responding to the voice instruction.
According to a sixth aspect of embodiments of the present disclosure, there is provided a voice response device including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
detecting a voice instruction, and determining a first arrival time of the voice instruction to a current device;
receiving second arrival time sent by at least one second controlled device and located in the same local area network with the current device, wherein the second arrival time is the time when the voice command detected by the second controlled device arrives at the second controlled device;
and according to the first arrival time and the second arrival time, deciding a controlled device responding to the voice instruction.
According to a seventh aspect of the embodiments of the present disclosure, there is provided a voice response device including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
detecting a voice instruction and determining a second arrival time of the voice instruction to the current device;
and sending the second arrival time to a first controlled device in the same local area network, so that the first controlled device can decide the controlled device responding to the voice instruction.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
in the voice response system provided by the embodiments of the disclosure, the first controlled device can automatically determine, from the arrival times at which each controlled device detected the voice instruction, the controlled device closest to the user to process the user's voice instruction. This avoids the response confusion caused by multiple controlled devices in the same local area network processing the user's request simultaneously; at the same time, the user does not need to manually designate the device closest to them, so the degree of intelligence is high.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a schematic diagram of a smart home shown in accordance with an exemplary embodiment of the present disclosure;
FIG. 2 is a block diagram of a voice response system shown in accordance with an exemplary embodiment of the present disclosure;
FIG. 3 is a schematic diagram illustrating a response process of a voice instruction according to an exemplary embodiment of the present disclosure;
FIG. 4 is a flow chart illustrating one voice response method embodiment of the present disclosure according to an exemplary embodiment;
FIG. 5 is a flow chart diagram illustrating one voice response method embodiment of the present disclosure in accordance with another exemplary embodiment;
FIG. 6 is a block diagram illustrating one embodiment of a voice response apparatus according to an exemplary embodiment of the present disclosure;
FIG. 7 is a block diagram illustrating one embodiment of a voice response apparatus according to another exemplary embodiment of the present disclosure;
FIG. 8 is a block diagram illustrating a device for voice response according to an exemplary embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "upon", "when", or "in response to determining", depending on the context.
The present disclosure provides a voice response system that may include at least two controlled devices in the same local area network. As an example, a controlled device may be any device capable of receiving voice signals and having a communication module, such as a smart speaker or a smart television.
For example, as shown in the smart home schematic diagram of fig. 1, as a single user buys more smart speakers, several speakers end up placed in different rooms of the same house, and these speakers (such as speaker A and speaker B in fig. 1) can then be set up under the same local area network.
In the present disclosure, the controlled devices in the local area network may establish a Peer-to-Peer (P2P) network for communication; for example, in fig. 1, speaker A and speaker B communicate via a decentralized P2P network. In a P2P environment, the interconnected controlled devices have peer status: each controlled device has the same functions and there is no master-slave distinction. A controlled device can act as a server offering shared resources to the other controlled devices in the network and can also act as a workstation, and the network as a whole depends on neither a dedicated centralized server nor dedicated workstations. Each controlled device in the network can request network services and can also provide resources, services, and content in response to the requests of other controlled devices.
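To make the peer-to-peer arrangement concrete, the following Python sketch (the patent specifies no implementation language) shows a controlled device that both accepts messages from peers and sends messages to them, with no central server involved. The port number and the newline-delimited JSON message format are assumptions introduced for illustration.

```python
# Minimal sketch (not from the patent) of a peer that both serves and sends
# messages, illustrating the decentralized P2P arrangement described above.
# The port number and newline-delimited JSON message format are assumptions.
import json
import socket
import threading

PEER_PORT = 50000  # assumed port shared by all controlled devices


class Peer:
    def __init__(self, host="0.0.0.0", port=PEER_PORT):
        self._server = socket.create_server((host, port))
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        # Act as a "server" for other peers: accept connections, print messages.
        while True:
            conn, _ = self._server.accept()
            with conn:
                line = conn.makefile().readline()
                if line:
                    print("received:", json.loads(line))

    def send(self, peer_ip, message: dict):
        # Act as a "workstation": connect to another peer and deliver a message.
        with socket.create_connection((peer_ip, PEER_PORT), timeout=2) as s:
            s.sendall((json.dumps(message) + "\n").encode())
```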
Referring to fig. 2, a block diagram of a voice response system is shown in accordance with an exemplary embodiment, wherein the voice response system may include at least two controlled devices in the same local area network. As shown in fig. 2, the controlled devices may include a first controlled device 10 and at least one second controlled device 20.
In an optional embodiment, the first controlled device may be a device determined by negotiation between the at least two controlled devices according to a specified negotiation rule, and the second controlled device includes a controlled device other than the first controlled device in the local area network.
Specifically, the at least two controlled devices in the local area network may automatically negotiate, according to the specified negotiation rule, which of them acts as the master device, i.e. the first controlled device.
In practice, a controlled device may discover the other controlled devices in the local area network by broadcast or multicast. For example, a controlled device broadcasts in the local area network, and once the other controlled devices reply to the broadcast, the initiating device knows they exist. After device discovery, each controlled device locally maintains a list of the IP addresses of all controlled devices in the local area network for the subsequent negotiation.
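As an illustration of the discovery step, here is a minimal Python sketch of broadcast-based discovery. The UDP port, the timeout, and the literal "HELLO"/"HERE" probe messages are assumptions; the patent only states that devices find each other by broadcast or multicast and keep a list of peer IPs.

```python
# Sketch of LAN device discovery by UDP broadcast, as described above.
import socket

DISCOVERY_PORT = 50001  # assumed discovery port


def discover_peers(timeout=2.0):
    """Broadcast a probe and collect the IPs of peers that reply."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.settimeout(timeout)
    sock.sendto(b"HELLO", ("<broadcast>", DISCOVERY_PORT))
    peers = set()
    try:
        while True:
            data, (ip, _) = sock.recvfrom(1024)
            if data == b"HERE":
                peers.add(ip)
    except socket.timeout:
        pass
    finally:
        sock.close()
    return sorted(peers)  # local IP list used for the later negotiation


def answer_probes():
    """Run on every controlled device: reply to discovery probes."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", DISCOVERY_PORT))
    while True:
        data, addr = sock.recvfrom(1024)
        if data == b"HELLO":
            sock.sendto(b"HERE", addr)
```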
The present disclosure does not limit the particular negotiation rule, which may include, for example, the Paxos algorithm. Paxos is a message-passing consensus algorithm that addresses how a distributed system reaches agreement on a certain value (a resolution).
It should be noted that, when a new controlled device is added to the local area network, all controlled devices in the local area network need to renegotiate to determine the latest first controlled device 10.
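The patent names Paxos only as an example and gives no protocol details. As a much simpler stand-in, the sketch below elects the first controlled device with a deterministic rule (lowest IP address wins) applied to the IP list every device maintains; because every device applies the same rule to the same list, they reach the same result without extra messages, and the election can simply be re-run whenever a device joins or leaves.

```python
# Simplified stand-in for the negotiation rule (NOT Paxos): lowest IP wins.
import ipaddress


def elect_first_controlled_device(peer_ips, my_ip):
    """Return (is_master, master_ip) for this device."""
    members = sorted(set(peer_ips) | {my_ip}, key=ipaddress.ip_address)
    master_ip = members[0]
    return master_ip == my_ip, master_ip


# Example: three speakers in one LAN all elect 192.168.1.10 as the master.
is_master, master = elect_first_controlled_device(
    ["192.168.1.23", "192.168.1.10"], my_ip="192.168.1.41")
print(is_master, master)  # False 192.168.1.10
```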
In this embodiment of the present disclosure, the first controlled device 10 is configured to detect a voice instruction, determine a first arrival time at which the voice instruction arrives at the first controlled device, compare the first arrival time with a second arrival time sent by the second controlled device, and decide a controlled device that responds to the voice instruction;
the second controlled device 20 is configured to detect a voice command, determine a second arrival time at which the voice command arrives at the second controlled device, and send the second arrival time to the first controlled device.
The controlled device that the first controlled device decides should respond to the voice instruction is the controlled device corresponding to the shortest arrival time of the voice instruction.
Specifically, the voice instruction issued by the user may be a voice signal in an agreed format, for example a voice signal beginning with an agreed phrase such as "xx classmate, please xxxxxx". When a controlled device detects a voice signal, it may first determine whether the signal is in the agreed format; if so, it performs the subsequent processing, otherwise it simply ignores the signal.
In one implementation, simple voice recognition logic may be provided in the controlled device to recognize whether the detected voice signal is a voice instruction in the agreed format.
When the controlled device detects that a voice signal sent by the user is a voice instruction, it records the time at which the instruction was detected as the arrival time of the voice instruction; the arrival time is recorded in Coordinated Universal Time (UTC).
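A minimal sketch of this detection-and-timestamp step, assuming a hypothetical on-device recognizer that supplies the transcribed prefix; the wake phrase is taken from the example above, and only the UTC timestamping follows directly from the description.

```python
# Sketch of recording the UTC arrival time of a detected voice instruction.
from datetime import datetime, timezone

WAKE_PHRASE = "xx classmate"  # agreed prefix from the example above


def on_audio_detected(transcribed_prefix: str):
    """Return the UTC arrival time if the signal matches the agreed format."""
    if not transcribed_prefix.startswith(WAKE_PHRASE):
        return None  # not a voice instruction: ignore it
    return datetime.now(timezone.utc)  # arrival time, recorded in UTC
```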
In order to facilitate distinguishing the arrival times recorded by the first controlled device 10 and the second controlled device 20, the embodiment of the present disclosure refers to the arrival time recorded by the first controlled device 10 as a first arrival time, and refers to the arrival time recorded by the second controlled device 20 as a second arrival time.
After the second controlled device 20 obtains the second arrival time of the voice command, the second arrival time may be sent to the first controlled device 10, so that the first controlled device 10 can decide the controlled device which finally responds to the voice command.
In practice, besides the second arrival time, the message the second controlled device 20 sends to the first controlled device may also carry other information, such as heartbeat information and physical location information.
After receiving the second arrival times sent by the second controlled devices 20, the first controlled device 10 compares the first arrival time at which the voice command is detected with the received second arrival times, determines the shortest arrival time, and uses the controlled device corresponding to the shortest arrival time as the controlled device responding to the voice command.
In an embodiment, the first controlled device is specifically configured to, when at least two identical shortest arrival times exist in the first arrival time and/or the second arrival time, acquire the usage frequency of the controlled device corresponding to the identical shortest arrival time, and use the controlled device with the largest usage frequency as the controlled device responding to the voice instruction.
In a specific implementation, if two or more arrival times are equally the shortest, i.e. there are at least two identical shortest arrival times, the usage frequency of the devices may additionally be considered. Specifically, the frequencies with which the user uses the controlled devices corresponding to the identical shortest arrival times may be obtained, and the controlled device with the highest usage frequency is then chosen to respond to the user's voice instruction.
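The decision rule described above (earliest arrival wins, ties broken by usage frequency) can be sketched as follows; the usage_frequency mapping is hypothetical, since the patent does not say how usage statistics are stored.

```python
# Sketch of the decision made by the first controlled device.
def decide_responder(arrival_times: dict, usage_frequency: dict) -> str:
    """arrival_times maps device id -> UTC datetime; returns the chosen id."""
    earliest = min(arrival_times.values())
    candidates = [dev for dev, t in arrival_times.items() if t == earliest]
    if len(candidates) == 1:
        return candidates[0]
    # Tie-break: the controlled device the user interacts with most often.
    return max(candidates, key=lambda dev: usage_frequency.get(dev, 0))
```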
After deciding the controlled device that responds to the voice instruction, the first controlled device may send the decision result to that controlled device to notify it to respond to the user's voice instruction. For the other controlled devices in the local area network, the first controlled device may send a notification that no response is needed, or a device that does not receive any notification from the first controlled device within a preset time period simply does not respond to the voice instruction.
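A sketch of this notification step, reusing the Peer.send helper from the earlier P2P sketch; the message fields and the one-second wait on the non-chosen devices are assumptions, since the patent allows either an explicit notice or a silent timeout.

```python
# Sketch: master notifies the winner; others respond only if told to in time.
import time


def notify_decision(peer, winner_ip, all_peer_ips):
    peer.send(winner_ip, {"type": "respond", "to": winner_ip})
    for ip in all_peer_ips:
        if ip != winner_ip:
            peer.send(ip, {"type": "stay_silent"})


def wait_for_decision(received_messages, timeout_s=1.0):
    """On a non-master device: respond only if notified within the timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        for msg in received_messages:
            if msg.get("type") == "respond":
                return True
        time.sleep(0.05)
    return False  # no notification arrived: do not respond
```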
In an embodiment, the controlled device responding to the voice instruction is configured to send the voice instruction to a server, receive an operation command returned by the server after performing voice recognition on the voice instruction, and execute an operation corresponding to the operation command.
After the controlled device that is to respond receives the decision result from the first controlled device, it may send the voice instruction to the cloud server. The cloud server performs processing such as voice and semantic recognition on the instruction to obtain a corresponding operation command and returns the command to the controlled device, which then executes the corresponding operation.
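A sketch of this forwarding step; the endpoint URL, the raw-audio payload format, and the execute_command helper are hypothetical, as the description only specifies sending the audio, receiving an operation command, and executing it.

```python
# Sketch of the chosen device forwarding audio to the cloud and acting on
# the returned operation command. URL and payload format are assumptions.
import urllib.request


def forward_to_cloud(audio_bytes: bytes, url="https://cloud.example.com/asr"):
    req = urllib.request.Request(
        url, data=audio_bytes,
        headers={"Content-Type": "application/octet-stream"})
    with urllib.request.urlopen(req, timeout=5) as resp:
        operation_command = resp.read().decode()
    execute_command(operation_command)


def execute_command(command: str):
    print("executing:", command)  # placeholder for the device-specific action
```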
For example, in the scenario of fig. 1, the response process of the voice instruction is as shown in fig. 3. The local area network contains speaker A and speaker B, placed in different rooms of the same house, and the two speakers negotiate so that speaker A becomes the first controlled device that makes the decision. After user C issues a voice instruction, the arrival time of the instruction at speaker A is T_CA and its arrival time at speaker B is T_CB. Speaker B sends T_CB to speaker A, and speaker A compares T_CA with T_CB. If T_CA is smaller than T_CB, speaker A is judged to be closer to the user, so speaker A responds to the voice instruction issued by the user and speaker B stays silent. Speaker A then uploads the voice instruction to the cloud server, the server performs voice and semantic recognition on it and returns an operation command, and speaker A makes the corresponding response to the user after receiving the command.
It should be noted that the result of each negotiation is valid only for the current session and becomes invalid once that session ends; after the next session is established, negotiation is performed again before responding to user instructions. Sessions can be separated by a preset time interval: when the interval between a new instruction and the previous session exceeds the preset time, they are treated as two independent sessions.
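The session rule can be sketched as a simple time-gap check; the 30-second gap is an assumed value, since the patent only speaks of a preset time interval.

```python
# Sketch of the session-boundary check: a gap larger than the preset
# interval starts a new session and therefore a new negotiation.
SESSION_GAP_SECONDS = 30.0  # assumed preset interval


def is_new_session(last_instruction_time, current_time,
                   gap=SESSION_GAP_SECONDS):
    """Both arguments are UTC datetimes; True means renegotiation is needed."""
    if last_instruction_time is None:
        return True
    return (current_time - last_instruction_time).total_seconds() > gap
```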
In the voice response system provided by the embodiments of the disclosure, the first controlled device can automatically determine, from the arrival times at which each controlled device detected the voice instruction, the controlled device closest to the user to process the user's voice instruction. This avoids the response confusion caused by multiple controlled devices in the same local area network processing the user's request simultaneously; at the same time, the user does not need to manually designate the device closest to them, so the degree of intelligence is high.
The technical features in the above embodiments can be combined arbitrarily as long as the combinations contain no conflict or contradiction; for brevity they are not described one by one, but any such combination of features also falls within the scope disclosed in this specification.
Corresponding to the embodiment of the voice response system, the present disclosure also provides an embodiment of a voice response method.
As shown in fig. 4, fig. 4 is a flowchart of an embodiment of a voice response method according to an exemplary embodiment of the present disclosure, and the embodiment of the present disclosure is explained from a first controlled device side, which may specifically include the following steps:
step 401, a first controlled device detects a voice instruction and determines a first arrival time of the voice instruction to the first controlled device;
step 402, receiving a second arrival time sent by at least one second controlled device in the same local area network as the first controlled device, where the second arrival time is a time when the voice command detected by the second controlled device arrives at the second controlled device;
and step 403, deciding the controlled equipment responding to the voice command according to the first arrival time and the second arrival time.
In an optional embodiment of the present disclosure, step 403 may specifically include the following sub-steps:
and selecting the controlled equipment corresponding to the shortest arrival time from the first arrival time and the second arrival time as the controlled equipment responding to the voice instruction.
In an optional embodiment of the present disclosure, step 403 may further include the following sub-steps:
and when at least two same shortest arrival times exist in the first arrival time and/or the second arrival time, acquiring the use frequency of the controlled equipment corresponding to the same shortest arrival time, and using the controlled equipment with the largest use frequency as the controlled equipment responding to the voice instruction.
In an optional embodiment of the present disclosure, the first controlled device is a device determined by negotiation between the at least two controlled devices according to a specified negotiation rule, and the second controlled device includes a controlled device in the local area network except for the first controlled device.
In the embodiments of the present disclosure, the first controlled device, acting as the master device, not only detects the voice instruction and determines its first arrival time but also receives the second arrival times sent by the other second controlled devices in the same local area network. The first controlled device then compares the arrival times and thereby automatically determines the controlled device closest to the user to respond to the user's voice instruction. This avoids the response confusion caused by multiple controlled devices in the same local area network processing the user's request simultaneously; at the same time, the user does not need to manually designate the nearer device, so the degree of intelligence is high.
As shown in fig. 5, fig. 5 is a flowchart of an embodiment of a voice response method according to another exemplary embodiment shown in the present disclosure, and the embodiment of the present disclosure is explained from a second controlled device side, which may specifically include the following steps:
step 501, a second controlled device detects a voice instruction and determines a second arrival time of the voice instruction to the second controlled device;
step 502, sending the second arrival time to a first controlled device in the same local area network, so that the first controlled device can decide a controlled device responding to the voice instruction.
The first controlled device is a device determined by negotiation of the at least two controlled devices according to a specified negotiation rule, and the second controlled device includes a controlled device in the local area network except the first controlled device.
In the embodiments of the present disclosure, after detecting the voice instruction, the second controlled device, acting as a slave device, determines the second arrival time at which the instruction arrives at the second controlled device and sends it to the first controlled device acting as the master device, so that the first controlled device can decide, based on the second arrival time, which controlled device responds. This avoids the resource consumption of a second controlled device far from the user responding to the instruction, and also avoids the response confusion caused by multiple controlled devices in the same local area network processing the user's request simultaneously, so the degree of intelligence is high.
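The second-controlled-device side of the method reduces to reporting the recorded arrival time to the first controlled device. The sketch below reuses the Peer.send helper from the earlier P2P sketch; the message fields (device id, ISO-formatted UTC timestamp) are assumptions, and heartbeat or physical location fields could be added as mentioned in the system description.

```python
# Sketch of the second controlled device reporting its arrival time to the
# first controlled device over the LAN. Message fields are assumptions.
def report_arrival_time(peer, master_ip, device_id, arrival_time_utc):
    peer.send(master_ip, {
        "type": "arrival_time",
        "device": device_id,
        "arrival_time": arrival_time_utc.isoformat(),
    })
```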
Corresponding to the embodiments of the voice response system and the method, the present disclosure also provides an embodiment of a voice response apparatus.
As shown in fig. 6, fig. 6 is a block diagram of an embodiment of a voice response apparatus shown in the present disclosure according to an exemplary embodiment, where the apparatus in the embodiment of the present disclosure is disposed on a first controlled device side, and the apparatus may specifically include the following modules:
a first voice detection module 601 configured to detect a voice command and determine a first arrival time of the voice command at the first controlled device;
an arrival time receiving module 602, configured to receive a second arrival time sent by at least one second controlled device in the same local area network as the first controlled device, where the second arrival time is a time when the voice instruction detected by the second controlled device arrives at the second controlled device;
a decision module 603 configured to decide a controlled device responding to the voice instruction according to the first arrival time and the second arrival time.
As can be seen from the foregoing embodiments, the voice response apparatus detects the voice instruction through the first voice detection module 601 and determines the first arrival time of the instruction at the first controlled device. It receives the second arrival times sent by the other second controlled devices through the arrival time receiving module 602, and the decision module 603 then decides, from the first and second arrival times, which controlled device responds to the voice instruction. In this way the controlled device closest to the user is automatically determined to respond to the user's voice instruction, which avoids the response confusion caused by multiple controlled devices in the same local area network processing user requests simultaneously; at the same time, the user does not need to manually designate the nearer device, so the degree of intelligence is high.
In an optional embodiment of the present disclosure, the decision module 603 may be specifically configured to:
and selecting the controlled equipment corresponding to the shortest arrival time from the first arrival time and the second arrival time as the controlled equipment responding to the voice instruction.
As can be seen from the above embodiments, when making a decision, the decision module 603 automatically determines the controlled device corresponding to the shortest arrival time as the controlled device closest to the user, so as to respond to the voice instruction without manually setting the device closest to the user, thereby saving user operations and achieving higher intelligence.
In another optional embodiment of the present disclosure, the decision module 603 may be specifically configured to:
and when at least two same shortest arrival times exist in the first arrival time and/or the second arrival time, acquiring the use frequency of the controlled equipment corresponding to the same shortest arrival time, and using the controlled equipment with the largest use frequency as the controlled equipment responding to the voice instruction.
As can be seen from the above embodiments, the decision module 603 also considers the factor of the use frequency when making a decision, so that the decided controlled device better meets the user requirement.
In another optional embodiment of the present disclosure, the first controlled device is a device determined by negotiation between the at least two controlled devices according to a specified negotiation rule, and the second controlled device includes a controlled device other than the first controlled device in the local area network.
As can be seen from the above embodiments, the controlled devices in the same local area network automatically negotiate the first controlled device that performs the decision making, and the first controlled device completes the subsequent decision process, effectively avoiding the response confusion caused by multiple controlled devices in the same local area network processing user requests simultaneously.
As shown in fig. 7, fig. 7 is a block diagram of an embodiment of a voice response apparatus shown in the present disclosure according to another exemplary embodiment, where the apparatus in the embodiment of the present disclosure is disposed on a second controlled device side, and the apparatus may specifically include the following modules:
a second voice detection module 701 configured to detect a voice command and determine a second arrival time at which the voice command arrives at the second controlled device;
an arrival time sending module 702 configured to send the second arrival time to a first controlled device in the same local area network, so that the first controlled device can decide a controlled device responding to the voice instruction.
The first controlled device is a device determined by negotiation of the at least two controlled devices according to a specified negotiation rule, and the second controlled device includes a controlled device in the local area network except the first controlled device.
As can be seen from the foregoing embodiments, the voice response apparatus detects the voice instruction through the second voice detection module 701 and determines the second arrival time of the instruction at the second controlled device. The arrival time sending module 702 sends the second arrival time to the first controlled device in the same local area network so that the first controlled device can decide which controlled device responds to the voice instruction. This avoids the resource consumption of a second controlled device far from the user responding to the instruction, and also avoids the response confusion caused by multiple controlled devices in the same local area network processing user requests simultaneously, so the degree of intelligence is high.
The detailed details of the implementation process of the functions and actions of the units in the apparatus are described in the above system embodiments, and are not described herein again.
For the device embodiment, since it basically corresponds to the system embodiment, the relevant points can be referred to the partial description of the system embodiment. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the disclosed solution. One of ordinary skill in the art can understand and implement it without inventive effort.
As shown in fig. 8, fig. 8 is a block diagram of a voice response device 800 shown in accordance with an exemplary embodiment of the present disclosure. The device 800 may be a mobile telephone with routing capability, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to fig. 8, device 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation at the device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 806 provides power to the various components of the device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the device 800. For example, the sensor assembly 814 may detect the open/closed state of the device 800 and the relative positioning of components, such as the display and keypad of the device 800; it may also detect a change in the position of the device 800 or one of its components, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in its temperature. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact, and may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
Communications component 816 is configured to facilitate communications between device 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, executable by the processor 820 of the device 800 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Wherein the instructions in the storage medium, when executed by the processor, enable the apparatus 800 to perform a voice response method comprising: detecting a voice instruction, and determining a first arrival time of the voice instruction to a current device; receiving second arrival time sent by at least one second controlled device and located in the same local area network with the current device, wherein the second arrival time is the time when the voice command detected by the second controlled device arrives at the second controlled device; and according to the first arrival time and the second arrival time, deciding a controlled device responding to the voice instruction.
Further, the instructions in the storage medium, when executed by the processor, enable the apparatus 800 to perform a voice response method comprising: detecting a voice instruction and determining a second arrival time of the voice instruction to the current device; and sending the second arrival time to a first controlled device in the same local area network, so that the first controlled device can decide the controlled device responding to the voice instruction.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
The above description is only exemplary of the present disclosure and should not be taken as limiting the disclosure, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

Claims (16)

1. A voice response system is characterized by comprising at least two controlled devices in the same local area network, wherein the controlled devices comprise a first controlled device and at least one second controlled device;
the first controlled device is used for detecting a voice instruction, determining first arrival time of the voice instruction to the first controlled device, comparing the first arrival time with second arrival time sent by the second controlled device, and deciding the controlled device responding to the voice instruction;
the second controlled device is configured to detect a voice instruction, determine a second arrival time at which the voice instruction arrives at the second controlled device, and send the second arrival time to the first controlled device.
2. The system according to claim 1, wherein the at least two controlled devices communicate with each other via a P2P network.
3. The system according to claim 1 or 2, wherein the first controlled device is a device determined by negotiation between the at least two controlled devices according to a specified negotiation rule, and the second controlled device comprises a controlled device other than the first controlled device in the local area network.
4. The system according to claim 1 or 2, wherein the controlled device that responds to the voice command and is decided by the first controlled device is the controlled device corresponding to the shortest arrival time of the voice command.
5. The system according to claim 1 or 2, wherein the first controlled device is specifically configured to, when at least two identical shortest arrival times exist in the first arrival time and/or the second arrival time, obtain the usage frequency of the controlled device corresponding to the identical shortest arrival time, and use the controlled device with the largest usage frequency as the controlled device that responds to the voice instruction.
6. The system according to claim 1, wherein the controlled device responding to the voice instruction is configured to send the voice instruction to a server, receive an operation command returned by the server after performing voice recognition on the voice instruction, and execute an operation corresponding to the operation command.
7. A voice response method, characterized in that the method comprises:
the method comprises the steps that a first controlled device detects a voice command and determines first arrival time of the voice command to the first controlled device;
receiving a second arrival time sent by at least one second controlled device and located in the same local area network as the first controlled device, wherein the second arrival time is the time when the voice command detected by the second controlled device arrives at the second controlled device;
and according to the first arrival time and the second arrival time, deciding a controlled device responding to the voice instruction.
8. The method of claim 7, wherein the deciding the controlled device responding to the voice command according to the first arrival time and the second arrival time comprises:
and selecting the controlled equipment corresponding to the shortest arrival time from the first arrival time and the second arrival time as the controlled equipment responding to the voice instruction.
9. The method of claim 8, wherein the deciding a controlled device responding to the voice command based on the first arrival time and the second arrival time further comprises:
and when at least two same shortest arrival times exist in the first arrival time and/or the second arrival time, acquiring the use frequency of the controlled equipment corresponding to the same shortest arrival time, and using the controlled equipment with the largest use frequency as the controlled equipment responding to the voice instruction.
10. The method according to any one of claims 7 to 9, wherein the first controlled device is a device determined by negotiation between the at least two controlled devices according to a specified negotiation rule, and the second controlled device comprises a controlled device other than the first controlled device in the local area network.
11. A voice response method, characterized in that the method comprises:
the second controlled equipment detects the voice command and determines second arrival time of the voice command to the second controlled equipment;
and sending the second arrival time to a first controlled device in the same local area network, so that the first controlled device can decide the controlled device responding to the voice instruction.
12. The method according to claim 11, wherein the first controlled device is a device determined by negotiation between the at least two controlled devices according to a specified negotiation rule, and the second controlled device comprises a controlled device other than the first controlled device in the local area network.
13. A voice response apparatus, provided in a first controlled device, comprising:
a first voice detection module configured to detect a voice command and determine a first arrival time at which the voice command arrives at the first controlled device;
the arrival time receiving module is configured to receive a second arrival time sent by at least one second controlled device in the same local area network as the first controlled device, where the second arrival time is the time when the voice instruction detected by the second controlled device arrives at the second controlled device;
and the decision module is configured to decide the controlled equipment responding to the voice instruction according to the first arrival time and the second arrival time.
14. A voice response apparatus, provided in a second controlled device, comprising:
a second voice detection module configured to detect a voice command and determine a second arrival time at which the voice command arrives at the second controlled device;
and the arrival time sending module is configured to send the second arrival time to a first controlled device in the same local area network, so that the first controlled device can decide the controlled device responding to the voice instruction.
15. A voice response device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
detecting a voice instruction, and determining a first arrival time of the voice instruction to a current device;
receiving second arrival time sent by at least one second controlled device and located in the same local area network with the current device, wherein the second arrival time is the time when the voice command detected by the second controlled device arrives at the second controlled device;
and according to the first arrival time and the second arrival time, deciding a controlled device responding to the voice instruction.
16. A voice response device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
detecting a voice instruction and determining a second arrival time of the voice instruction to the current device;
and sending the second arrival time to a first controlled device in the same local area network, so that the first controlled device can decide the controlled device responding to the voice instruction.
CN201910859111.0A 2019-09-11 2019-09-11 Voice response method, device, system and equipment Pending CN110660389A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910859111.0A CN110660389A (en) 2019-09-11 2019-09-11 Voice response method, device, system and equipment

Publications (1)

Publication Number Publication Date
CN110660389A true CN110660389A (en) 2020-01-07

Family

ID=69037165

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910859111.0A Pending CN110660389A (en) 2019-09-11 2019-09-11 Voice response method, device, system and equipment

Country Status (1)

Country Link
CN (1) CN110660389A (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103428008A (en) * 2013-08-28 2013-12-04 浙江大学 Big data distribution strategy oriented to multiple user groups
CN108351872A (en) * 2015-09-21 2018-07-31 亚马逊技术股份有限公司 Equipment selection for providing response
CN106094550A (en) * 2016-07-07 2016-11-09 镇江惠通电子有限公司 Intelligent home device control system and method
CN107452386A (en) * 2017-08-16 2017-12-08 联想(北京)有限公司 A kind of voice data processing method and system
CN109391528A (en) * 2018-08-31 2019-02-26 百度在线网络技术(北京)有限公司 Awakening method, device, equipment and the storage medium of speech-sound intelligent equipment
CN109410943A (en) * 2018-12-10 2019-03-01 珠海格力电器股份有限公司 Sound control method, system and the intelligent terminal of equipment
CN109917663A (en) * 2019-03-25 2019-06-21 北京小米移动软件有限公司 The method and apparatus of equipment control

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114143133A (en) * 2021-11-26 2022-03-04 深圳康佳电子科技有限公司 Decentralized intelligent household appliance and voice management system thereof

Similar Documents

Publication Publication Date Title
US10098166B2 (en) Method and device for wireless connection establishment
US10608988B2 (en) Method and apparatus for bluetooth-based identity recognition
JP6259158B2 (en) Smart device detection method, apparatus, program, and recording medium
CN106210797B (en) Network live broadcast method and device
JP6383109B2 (en) Network connection method, network connection device, terminal, communication device, network connection system, program, and recording medium
EP3226432B1 (en) Method and devices for sharing media data between terminals
CN106507437B (en) Intelligent equipment networking method and device
CN107204883B (en) Network fault processing method and device
WO2020097845A1 (en) Method and device for using network slice
CN104301308B (en) Call control method and device
CN109496439A (en) Based on the direct-connected method and device for establishing unicast connection of object object
WO2020132868A1 (en) Direct-connection resource configuration method and apparatus
WO2021046674A1 (en) Data processing method and apparatus, and electronic device and computer readable storage medium
CN108810866B (en) Method and device for connecting intelligent equipment and storage medium
CN113365153B (en) Data sharing method and device, storage medium and electronic equipment
CN112037787A (en) Wake-up control method, device and computer readable storage medium
CN106371327A (en) Control right sharing method and device
JP2017530489A (en) Terminal test method, apparatus, program, and recording medium
CN111031002B (en) Broadcast discovery method, broadcast discovery device, and storage medium
JP6126754B2 (en) Message transmission method, message transmission device, electronic device, program, and recording medium
CN107272896B (en) Method and device for switching between VR mode and non-VR mode
CN105163391A (en) Data transmission method, terminal and wireless access point
CN105578391B (en) Information processing method, device and system and terminal equipment
CN110913276B (en) Data processing method, device, server, terminal and storage medium
CN112702803A (en) Channel determination method and device, terminal equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200107