CN112700780A - Voice processing method and system based on multiple devices - Google Patents

Voice processing method and system based on multiple devices

Info

Publication number
CN112700780A
CN112700780A
Authority
CN
China
Prior art keywords
voice recognition
state information
voice
recognition result
modulation data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011501007.3A
Other languages
Chinese (zh)
Inventor
王云华
王妍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen TCL New Technology Co Ltd
Original Assignee
Shenzhen TCL New Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen TCL New Technology Co Ltd filed Critical Shenzhen TCL New Technology Co Ltd
Priority to CN202011501007.3A priority Critical patent/CN112700780A/en
Publication of CN112700780A publication Critical patent/CN112700780A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G10L 15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/28: Constructional details of speech recognition systems
    • G10L 15/30: Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G10L 15/34: Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention relates to the technical field of voice processing, and discloses a voice processing method and system based on multiple devices. The method comprises the following steps: acquiring voice instruction information, and extracting corresponding pulse code modulation data from the voice instruction information; acquiring device identifiers of a plurality of pre-associated intelligent devices, and acquiring the running state information of each intelligent device according to the device identifiers; performing voice recognition processing on the pulse code modulation data through a cloud server according to the running state information to obtain a voice recognition result; and selecting a target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to it. Because the voice instruction information is recognized through the cloud server according to the running state information of the intelligent devices, and the corresponding target intelligent device is then selected according to the recognition result to respond, the intelligent devices and the cloud work cooperatively, which improves both the response speed of online voice processing and the voice interaction experience of the user.

Description

Voice processing method and system based on multiple devices
Technical Field
The invention relates to the technical field of voice processing, in particular to a voice processing method and system based on multiple devices.
Background
The intelligentization of Internet of Things (IoT) devices depends on many supporting technologies, among which online voice processing, as the interface for man-machine conversation, plays a very important role. In the prior art, voice instruction information is recognized locally by an intelligent device, a corresponding control instruction is generated based on the recognized text, and a corresponding operation is executed based on that control instruction to respond to the voice instruction information. Because local recognition is constrained by the computing power of the device, and a busy processor further slows recognition, both the accuracy and the response speed of voice interaction suffer. Therefore, how to improve the response speed of online voice processing, and thereby the voice interaction experience of the user, has become a problem to be solved urgently.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The invention mainly aims to provide a voice processing method and system based on multiple devices, and aims to solve the technical problem of how to improve the response speed of online voice processing so as to improve the voice interaction experience of a user.
In order to achieve the above object, the present invention provides a speech processing method based on multiple devices, the method comprising the steps of:
acquiring voice instruction information, and extracting corresponding pulse code modulation data from the voice instruction information;
acquiring device identifications of a plurality of intelligent devices which are associated in advance, and acquiring running state information of each intelligent device according to the device identifications;
performing voice recognition processing on the pulse modulation data through a cloud server according to the running state information to obtain a voice recognition result;
and selecting a target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to the voice recognition result.
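The four steps above can be sketched as a minimal, runnable outline. Every function name, threshold, and data structure below is a hypothetical illustration for readability, not the patent's implementation:

```python
# Hypothetical sketch of the four-step multi-device voice pipeline.

def extract_pcm(audio_samples, levels=256):
    # Step 1: quantize samples in [-1.0, 1.0] to unsigned PCM codes.
    return [min(levels - 1, int((s + 1.0) / 2.0 * levels)) for s in audio_samples]

def get_running_states(state_table, device_ids):
    # Step 2: look up each pre-associated device's state by its identifier.
    return {d: state_table[d] for d in device_ids}

def cloud_recognize(pcm, states):
    # Step 3: stand-in for the cloud server; returns fixed text when at
    # least one device processor is not busy.
    if any(s["cpu_occupancy"] <= 0.8 for s in states.values()):
        return "turn on the air conditioner"
    return None

def select_target(text, device_ids):
    # Step 4: pick the device whose identifier appears in the recognized text.
    return next((d for d in device_ids if d in text), None)

device_ids = ["television", "air conditioner", "humidifier"]
state_table = {d: {"cpu_occupancy": 0.3} for d in device_ids}

pcm = extract_pcm([0.0, 0.5, -0.5])
states = get_running_states(state_table, device_ids)
text = cloud_recognize(pcm, states)
print(select_target(text, device_ids))  # air conditioner
```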
Preferably, the step of performing voice recognition processing on the pulse modulation data through a cloud server according to the running state information to obtain a voice recognition result specifically includes:
judging whether the running state information meets the working state condition or not;
and when the running state information does not accord with the working state condition, uploading the pulse modulation data to a cloud server so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result.
Preferably, the running state information includes processor occupancy and/or processor idle rate;
correspondingly, the step of judging whether the running state information meets the working state condition specifically includes:
extracting the processor occupancy from the operating state information;
detecting whether the occupancy rate of the processor is greater than a preset occupancy rate or not, and judging whether the running state information meets working state conditions or not according to a detection result;
and/or;
extracting the processor idle rate from the running state information;
and detecting whether the idle rate of the processor is less than a preset idle rate or not, and judging whether the running state information meets the working state condition or not according to the detection result.
Preferably, when the running state information does not meet the working state condition, the pulse modulation data is uploaded to a cloud server, so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result, specifically including:
and when the running state information does not accord with the working state condition, uploading the pulse modulation data to a cloud server so that the cloud server acquires a corresponding cloud space characteristic value, and when detecting that the cloud space characteristic value accords with a preset storage condition, performing voice recognition processing on the pulse modulation data to obtain a voice recognition result.
Preferably, the step of selecting a target smart device from the plurality of smart devices according to the voice recognition result to respond to the voice recognition result specifically includes:
selecting a corresponding target intelligent device from the plurality of intelligent devices according to the voice recognition result;
and storing the voice recognition result into a memory of the target intelligent device so that the target intelligent device responds to the voice recognition result.
In addition, in order to achieve the above object, the present invention further provides a speech processing system based on multiple devices, wherein the system includes:
the data acquisition module is used for acquiring voice instruction information and extracting corresponding pulse code modulation data from the voice instruction information;
the state acquisition module is used for acquiring device identifications of a plurality of intelligent devices which are associated in advance and acquiring running state information of each intelligent device according to the device identifications;
the voice recognition module is used for carrying out voice recognition processing on the pulse modulation data through a cloud server according to the running state information to obtain a voice recognition result;
and the voice response module is used for selecting a target intelligent device from the plurality of intelligent devices according to the voice recognition result and responding to the voice recognition result.
Preferably, the voice recognition module is further configured to determine whether the running state information meets a working state condition;
the voice recognition module is further configured to upload the pulse modulation data to a cloud server when the operating state information does not meet the operating state condition, so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result.
Preferably, the running state information includes processor occupancy and/or processor idle rate;
the voice recognition module is further configured to extract the processor occupancy rate from the operating state information;
the voice recognition module is also used for detecting whether the occupancy rate of the processor is greater than a preset occupancy rate or not and judging whether the running state information meets the working state condition or not according to the detection result;
the voice recognition module is further used for extracting the processor idle rate from the running state information;
the voice recognition module is further used for detecting whether the processor idle rate is smaller than a preset idle rate or not and judging whether the running state information meets working state conditions or not according to a detection result.
Preferably, the voice recognition module is further configured to upload the pulse modulation data to a cloud server when the operating state information does not meet the operating state condition, so that the cloud server obtains a corresponding cloud space characteristic value, and perform voice recognition processing on the pulse modulation data when detecting that the cloud space characteristic value meets a preset storage condition, so as to obtain a voice recognition result.
Preferably, the voice response module is further configured to select a corresponding target smart device from the plurality of smart devices according to the voice recognition result;
the voice response module is further configured to store the voice recognition result in a memory of the target smart device, so that the target smart device responds to the voice recognition result.
According to the method, voice instruction information is acquired, corresponding pulse code modulation data is extracted from it, device identifiers of a plurality of pre-associated intelligent devices are acquired, the running state information of each intelligent device is acquired according to the device identifiers, voice recognition processing is performed on the pulse code modulation data through a cloud server according to the running state information to obtain a voice recognition result, and a target intelligent device is selected from the intelligent devices according to the voice recognition result to respond to it. In the prior art, the voice instruction information is recognized locally by the intelligent device, a corresponding control instruction is generated based on the recognized text, and a corresponding operation is executed based on that instruction; this limits the accuracy and response speed of voice interaction, slows voice processing, and degrades the voice interaction experience of the user. The invention instead acquires the pulse code modulation data corresponding to the voice instruction information and the running state information of the plurality of pre-associated intelligent devices, performs voice recognition processing on the pulse code modulation data through the cloud server according to the running state information to obtain a voice recognition result, and selects the corresponding target intelligent device from the plurality of intelligent devices according to that result to respond to it. The intelligent devices and the cloud thus work cooperatively, which improves the response speed of online voice processing and, in turn, the voice interaction experience of the user.
Drawings
FIG. 1 is a flowchart illustrating a first embodiment of a speech processing method based on multiple devices according to the present invention;
FIG. 2 is a schematic structural diagram of an intelligent device in a hardware operating environment according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a second embodiment of a multi-device based speech processing method according to the present invention;
FIG. 4 is a block diagram of a multi-device based speech processing system according to a first embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
An embodiment of the present invention provides a speech processing method based on multiple devices, and referring to fig. 1, fig. 1 is a flowchart illustrating a first embodiment of the speech processing method based on multiple devices according to the present invention.
In this embodiment, the method for processing a voice based on multiple devices includes the following steps:
step S10: acquiring voice instruction information, and extracting corresponding pulse code modulation data from the voice instruction information;
It is easy to understand that the execution subject of this embodiment is an intelligent control end, which can be understood as an interaction medium between a user and a plurality of intelligent devices. Referring to fig. 2, fig. 2 is a schematic structural diagram of an intelligent device in a hardware operating environment according to an embodiment of the present invention. As shown in fig. 2, the intelligent device may include: a processor 1001, such as a Central Processing Unit (CPU) or a Micro Controller Unit (MCU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connective communication between these components. The user interface 1003 may comprise a display screen and an input unit such as a keyboard, and optionally may also comprise a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM) or a Non-Volatile Memory (NVM) such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 2 does not constitute a limitation of smart devices and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 2, a memory 1005, which is a storage medium, may include therein an operating system, a data storage module, a network communication module, a user interface module, and a multi-device based voice processing program.
In the smart device shown in fig. 2, the network interface 1004 is mainly used for data communication with a network server; the user interface 1003 is mainly used for data interaction with a user; the processor 1001 and the memory 1005 in the intelligent device of the present invention may be disposed in the intelligent device, or disposed in the intelligent control end, and when the processor 1001 and the memory 1005 are disposed in the intelligent device, the intelligent control end calls the multi-device based voice processing program stored in the memory 1005 through the processor 1001 in the intelligent device, and executes the multi-device based voice processing method provided in the embodiment of the present invention; when the processor 1001 and the memory 1005 are disposed in the intelligent control terminal, the intelligent control terminal directly calls the multi-device based voice processing program stored in the memory 1005 through the processor 1001 and executes the multi-device based voice processing method provided by the embodiment of the present invention.
It should be noted that when the intelligent control end obtains the voice instruction information, it may extract corresponding Pulse Code Modulation (PCM) data from it. The voice instruction information may be a voice command issued by a user or command information generated by an input unit (e.g. a keyboard). In a specific implementation, the voice instruction information is first represented as an analog signal that is continuous in both time and value; this analog signal is then converted into a time-discrete digital signal for transmission over a channel. That is, the analog signal corresponding to the voice instruction information is sampled, and the amplitude of each sample value is quantized and encoded to obtain the pulse code modulation data, which is then saved to memory 00 for later use.
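The sample, quantize, and encode sequence described above can be illustrated with a toy uniform quantizer. This is a simplification: real PCM front ends use fixed sample rates such as 16 kHz and often non-uniform A-law or mu-law coding, and the function name and bit depth here are illustrative assumptions:

```python
def pcm_encode(samples, bits=8):
    """Uniformly quantize samples in [-1.0, 1.0] to signed integer PCM codes."""
    levels = 2 ** bits
    codes = []
    for s in samples:
        s = max(-1.0, min(1.0, s))                         # clip to analog range
        code = int(round((s + 1.0) / 2.0 * (levels - 1)))  # quantize amplitude
        codes.append(code - levels // 2)                   # centre around zero
    return codes

print(pcm_encode([0.0, 1.0, -1.0], bits=8))  # [0, 127, -128]
```

The resulting integer codes stand in for the pulse code modulation data that the text says is stored in memory 00.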
Step S20: acquiring device identifications of a plurality of intelligent devices which are associated in advance, and acquiring running state information of each intelligent device according to the device identifications;
It is easy to understand that when the intelligent control end obtains the pulse code modulation data corresponding to the voice instruction information, it can also obtain the pre-associated device identifiers of the plurality of running intelligent devices, and then control those devices to start their state detection function so as to obtain their running state information. A device identifier may indicate the type of device, such as a television, a humidifier, or an air conditioner. The running state information may be used to represent the state of a processor (e.g., a micro control unit) of the intelligent device, including but not limited to the processor occupancy rate and the processor idle rate.
In a specific implementation, when the intelligent devices are controlled to start the state detection function, the corresponding interface identifier may be set to 1 and the identifier result stored in memory 01; the intelligent control end then accesses memory 01 and reads the identifier result 1. Next, the device identifiers of the plurality of intelligent devices are obtained and stored in memory 11, memory 22, memory 33, and so on, the corresponding interface identifier is set to 1, and the identifier result is stored in memory 02. The intelligent control end accesses memory 02, reads the identifier result 1, accesses the memories in which the device identifiers are stored, obtains each device identifier, and queries the running state information of the corresponding intelligent device according to that identifier.
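The flag-and-memory handshake described above can be sketched with a plain dictionary standing in for the numbered memories. The slot names (00, 01, 02, 11, 22, 33) follow the text's own convention; everything else is a hypothetical illustration:

```python
# Hypothetical sketch of the flag-gated device-state lookup.
memory = {}

def start_state_detection(device_ids):
    memory["01"] = 1                        # flag: state detection started
    for slot, dev in zip(("11", "22", "33"), device_ids):
        memory[slot] = dev                  # one device identifier per slot
    memory["02"] = 1                        # flag: identifiers stored

def query_running_states(state_table):
    # The control end proceeds only after reading back both flags as 1.
    if memory.get("01") != 1 or memory.get("02") != 1:
        return {}
    ids = [memory[s] for s in ("11", "22", "33") if s in memory]
    return {dev: state_table[dev] for dev in ids}

state_table = {"television": 0.35, "humidifier": 0.10, "air conditioner": 0.85}
start_state_detection(list(state_table))
print(query_running_states(state_table))
```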
Step S30: performing voice recognition processing on the pulse modulation data through a cloud server according to the running state information to obtain a voice recognition result;
It is easy to understand that after the running state information of the intelligent devices is acquired, in order to improve the voice interaction experience of the user, the pulse code modulation data is uploaded to the cloud server and the cloud server is controlled to perform voice recognition processing on it according to the running state information, obtaining a voice recognition result. The voice recognition result includes, but is not limited to, voice-related feature information such as voiceprint features, speech speed, frequency, duration, and emotion, as well as text information obtained by semantic recognition during the voice recognition process; the text information can in turn be understood as text-related feature information, including but not limited to text length, specific characters, and the difficulty of semantic parsing.
In a specific implementation, in order to improve the response speed of online voice processing, semantic recognition in the voice recognition processing may be performed only on pulse modulation data to obtain a voice recognition result only containing text information, and then a target smart device is selected from a plurality of smart devices according to the text information and the text information is stored in a memory of the target smart device, so that the target smart device responds to the voice recognition result.
Step S40: and selecting a target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to the voice recognition result.
It should be noted that, after obtaining the voice recognition result, the device identifier (such as a device name) may be extracted from the text information in the voice recognition result, and then, the corresponding target smart device is selected from the plurality of smart devices according to the device identifier, and the voice recognition result is stored in the memory of the target smart device, so that the target smart device responds to the voice recognition result.
In a specific implementation, in order to improve the voice interaction experience of the user, feature information such as voiceprint features can be extracted from the voice recognition result to determine the initiating object of the voice instruction information, the corresponding device preference settings are obtained according to the identity of the initiating object, and the response is further optimized according to those settings. The device preference settings are the habitual settings of the initiating object for each intelligent device, retrieved from a preset instruction database according to the identity of the initiating object. For example, if the target intelligent device determined from the voice recognition result is an air conditioner and the initiating object's habitual air-conditioner setting is 26 °C, then when the initiating object says "turn on the air conditioner", the air conditioner is started and its temperature is set to 26 °C according to the initiating object's identity.
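The preference-aware response just described can be sketched as a lookup keyed by speaker identity. The database contents, function name, and dictionary layout are all illustrative assumptions, not the patent's data model:

```python
# Hypothetical preset instruction database: speaker identity (derived from
# voiceprint features) -> habitual per-device settings.
preference_db = {
    "user_a": {"air conditioner": {"temperature_c": 26}},
}

def respond(recognition_result):
    speaker = recognition_result["voiceprint_identity"]
    target = recognition_result["target_device"]
    # Look up the speaker's habitual settings for the target device;
    # fall back to no extra settings for unknown speakers or devices.
    settings = preference_db.get(speaker, {}).get(target, {})
    return {"device": target, "action": "on", **settings}

result = respond({"voiceprint_identity": "user_a",
                  "target_device": "air conditioner"})
print(result)  # {'device': 'air conditioner', 'action': 'on', 'temperature_c': 26}
```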
It should be understood that the above is only an example, and the technical solution of the present invention is not limited in any way, and in a specific application, a person skilled in the art may set the technical solution as needed, and the present invention is not limited thereto.
In this embodiment, voice instruction information is acquired, corresponding pulse code modulation data is extracted from it, device identifiers of a plurality of pre-associated intelligent devices are acquired, the running state information of each intelligent device is acquired according to the device identifiers, voice recognition processing is performed on the pulse code modulation data through a cloud server according to the running state information to obtain a voice recognition result, and a target intelligent device is selected from the intelligent devices according to the voice recognition result to respond to it. In the prior art, the voice instruction information is recognized locally by the intelligent device, a corresponding control instruction is generated based on the recognized text, and a corresponding operation is executed based on that instruction; this limits the accuracy and response speed of voice interaction, slows voice processing, and degrades the voice interaction experience of the user. In this embodiment, by contrast, the pulse code modulation data corresponding to the voice instruction information and the running state information of the plurality of pre-associated intelligent devices are obtained, voice recognition processing is performed on the pulse code modulation data through the cloud server according to the running state information to obtain a voice recognition result, and the corresponding target intelligent device is selected from the plurality of intelligent devices according to that result to respond to it. The intelligent devices and the cloud thus work cooperatively, which improves the response speed of online voice processing and, in turn, the voice interaction experience of the user.
Referring to fig. 3, fig. 3 is a flowchart illustrating a speech processing method based on multiple devices according to a second embodiment of the present invention.
Based on the first embodiment described above, in the present embodiment, the step S30 includes:
step S301: judging whether the running state information meets the working state condition or not;
it should be noted that the operating status information may be used to indicate the status of a processor (e.g., a micro control unit) of the smart device, including but not limited to processor occupancy and processor idle. In a specific implementation, the processor occupancy may be extracted from the running state information, and then whether the running state information meets the working state condition is determined according to whether the processor occupancy is greater than a preset occupancy, where the preset occupancy may be set according to an actual requirement, for example, 80%, and this embodiment is not limited thereto. And/or; the processor idle rate is extracted from the running state information, and whether the running state information meets the working state condition is determined according to whether the processor idle rate is smaller than a preset idle rate, where the preset idle rate may be set according to an actual requirement, for example, 10%, and this embodiment is not limited thereto.
In a specific implementation, when the running state information of an intelligent device (such as a television) is queried, if the processor occupancy rate of the television is greater than 80% and/or its processor idle rate is less than 10% (that is, the processor corresponding to the television is in a busy state), the corresponding interface identifier is set to 1 and the identifier result is stored in memory 02; otherwise, the interface identifier is set to 2 and the identifier result is stored in memory 03. The intelligent control end accesses memory 03, reads the identifier result 2, accesses memory 00 to obtain the PCM data, controls the television to upload the PCM data to the cloud server, sets the corresponding interface identifier to 1, and stores the identifier result in memory 04.
In another implementation, when the running state information of an intelligent device (such as a humidifier) is queried, if the processor occupancy rate of the humidifier is greater than 80% and/or its processor idle rate is less than 10% (that is, the processor corresponding to the humidifier is in a busy state), the corresponding interface identifier is set to 1 and the identifier result is stored in memory 05; otherwise, the interface identifier is set to 2 and the identifier result is stored in memory 05. The intelligent control end accesses memory 05, reads the identifier result 2, accesses memory 00 to obtain the PCM data, controls the humidifier to upload the PCM data to the cloud server, sets the corresponding interface identifier to 1, and stores the identifier result in memory 06.
In another implementation, when the running state information of an intelligent device (such as an air conditioner) is queried, if the processor occupancy rate of the air conditioner is greater than 80% and/or its processor idle rate is less than 10% (that is, the processor corresponding to the air conditioner is in a busy state), the corresponding interface identifier is set to 1 and the identifier result is stored in memory 07; otherwise, the interface identifier is set to 2 and the identifier result is stored in memory 07. The intelligent control end accesses memory 07, reads the identifier result 2, accesses memory 00 to obtain the PCM data, controls the air conditioner to upload the PCM data to the cloud server, sets the corresponding interface identifier to 1, and stores the identifier result in memory 08.
And repeating the operation to traverse other intelligent devices to obtain the running state information of each intelligent device, and judging whether the running state information of each intelligent device meets the working state condition or not.
Step S302: and when the running state information does not accord with the working state condition, uploading the pulse modulation data to a cloud server so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result.
It is easy to understand that when the running state information does not meet the working state condition (for example, the processor occupancy rate is less than or equal to 80% and the processor idle rate is greater than or equal to 10%, that is, the processor is not in a busy state), the pulse modulation data can be uploaded to the cloud server, so that the cloud server obtains a corresponding cloud space characteristic value and, when detecting that the cloud space characteristic value meets a preset storage condition, performs voice recognition processing on the pulse modulation data to obtain a voice recognition result; the corresponding target intelligent device is then selected from the intelligent devices according to the voice recognition result to respond to the voice recognition result. The cloud space characteristic value may be understood as a characteristic value representing the usage condition of the storage space corresponding to each intelligent device in the cloud server, including but not limited to the cloud space occupancy rate and the cloud space idle rate. Correspondingly, the preset storage condition may be set to judge whether the cloud space idle rate is greater than a preset cloud idle rate, where the preset cloud idle rate may be set according to actual requirements, such as 60%, which is not limited in this embodiment; alternatively, it may be set to judge whether the cloud space occupancy rate is less than a preset cloud occupancy rate, where the preset cloud occupancy rate may likewise be set according to actual requirements, such as 40%, which is not limited in this embodiment.
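The preset storage condition just described can be sketched as a small predicate. The thresholds (a 60% preset cloud idle rate, a 40% preset cloud occupancy rate) follow the examples in the text; the function name and the choice to accept either criterion as keyword arguments are illustrative assumptions.

```python
# Hedged sketch: the cloud server proceeds with recognition only when
# the per-device cloud storage space has enough headroom.
PRESET_CLOUD_IDLE = 0.60       # preset cloud idle rate (example value)
PRESET_CLOUD_OCCUPANCY = 0.40  # preset cloud occupancy rate (example value)

def meets_storage_condition(idle_rate=None, occupancy=None):
    """Either criterion from the text may be used on its own."""
    if idle_rate is not None:
        return idle_rate > PRESET_CLOUD_IDLE
    if occupancy is not None:
        return occupancy < PRESET_CLOUD_OCCUPANCY
    raise ValueError("need idle_rate or occupancy")

print(meets_storage_condition(idle_rate=0.72))   # True: 72% idle
print(meets_storage_condition(occupancy=0.55))   # False: 55% occupied
```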
In a specific implementation, when the running state information of an intelligent device (such as the television) is processed, the intelligent control end accesses the memory 04, obtains the identifier result of 1, and controls the cloud server to obtain the PCM data corresponding to the television, so that the cloud server obtains the cloud space characteristic value corresponding to the television, such as the cloud space idle rate. If the cloud idle rate corresponding to the television is greater than 60%, the cloud server is controlled to perform semantic recognition on the PCM data corresponding to the television, and the obtained text information is stored in the memory 44 so that the television executes a corresponding operation to respond to the text information; if the cloud idle rate corresponding to the television is less than 40%, the corresponding interface identifier is set to 1 and the identifier result is stored in the memory 08.
In another implementation, when the running state information of an intelligent device (such as the humidifier) is processed, the intelligent control end accesses the memory 06, obtains the identifier result of 1, and controls the cloud server to obtain the PCM data corresponding to the humidifier, so that the cloud server obtains the cloud space characteristic value corresponding to the humidifier, such as the cloud space idle rate. If the cloud idle rate corresponding to the humidifier is greater than 60%, the cloud server is controlled to perform semantic recognition on the PCM data corresponding to the humidifier, and the obtained text information is stored in the memory 55 so that the humidifier executes a corresponding operation to respond to the text information; if the cloud idle rate corresponding to the humidifier is less than 40%, the corresponding interface identifier is set to 1 and the identifier result is stored in the memory 09.
In another implementation, when the running state information of an intelligent device (such as the air conditioner) is processed, the intelligent control end accesses the memory 08, obtains the identifier result of 1, and controls the cloud server to obtain the PCM data corresponding to the air conditioner, so that the cloud server obtains the cloud space characteristic value corresponding to the air conditioner, such as the cloud space idle rate. If the cloud idle rate corresponding to the air conditioner is greater than 60%, the cloud server is controlled to perform semantic recognition on the PCM data corresponding to the air conditioner, and the obtained text information is stored in the memory 66 so that the air conditioner executes a corresponding operation to respond to the text information; if the cloud idle rate corresponding to the air conditioner is less than 40%, the corresponding interface identifier is set to 1 and the identifier result is stored in the memory 10.
Based on the above implementation, the intelligent control end accesses the memory 44, the memory 55, or the memory 66 to obtain the voice recognition result; further, the above operations may be repeated to traverse the other intelligent devices and obtain the voice recognition result of each intelligent device.
It should be understood that the above is only an example, and the technical solution of the present invention is not limited in any way, and in a specific application, a person skilled in the art may set the technical solution as needed, and the present invention is not limited thereto.
In this embodiment, whether the running state information meets the working state condition is judged, and when it does not, the pulse modulation data is uploaded to the cloud server so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result. Different from the prior art, in which voice recognition is performed only through a locally stored voice recognition database, this embodiment controls the cloud server to perform voice recognition processing on the pulse modulation data of each intelligent device according to the running state information of that device, such as its processor occupancy rate and processor idle rate. This improves the efficiency and accuracy of voice recognition, and further improves the response speed of online voice processing and the voice interaction experience of the user.
Referring to fig. 4, fig. 4 is a block diagram illustrating a first embodiment of a speech processing system based on multiple devices according to the present invention.
As shown in fig. 4, a speech processing system based on multiple devices according to an embodiment of the present invention includes:
the data acquisition module 10 is configured to acquire voice instruction information and extract corresponding pulse code modulation data from the voice instruction information;
the state obtaining module 20 is configured to obtain device identifiers of a plurality of pieces of intelligent equipment which are associated in advance, and obtain operation state information of each piece of intelligent equipment according to the device identifiers;
the voice recognition module 30 is configured to perform voice recognition processing on the pulse modulation data through a cloud server according to the running state information to obtain a voice recognition result;
and the voice response module 40 is configured to select a target intelligent device from the plurality of intelligent devices according to the voice recognition result and respond to the voice recognition result.
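The four modules listed above can be given a minimal structural sketch. The patent specifies only each module's responsibility, so every class and method name below, the recognizer stub, and the substring-based target selection are assumptions for illustration, not the claimed implementation.

```python
# Hedged sketch of the four-module system (fig. 4), with stubs in
# place of audio decoding and cloud recognition.
class DataAcquisitionModule:
    def extract_pcm(self, voice_instruction: bytes) -> bytes:
        # In practice this would decode the captured audio to PCM.
        return voice_instruction

class StateAcquisitionModule:
    def __init__(self, devices):
        self.devices = devices  # device_id -> running-state dict

    def get_states(self):
        return self.devices

class SpeechRecognitionModule:
    def recognize(self, pcm: bytes, states) -> str:
        # Placeholder for uploading PCM to the cloud server when the
        # running state calls for it; returns the recognized text.
        return "turn on the humidifier"

class VoiceResponseModule:
    def respond(self, result: str, states):
        # Toy selection rule: pick the device named in the result.
        for dev in states:
            if dev in result:
                return dev
        return None

states = StateAcquisitionModule({"tv": {}, "humidifier": {}}).get_states()
pcm = DataAcquisitionModule().extract_pcm(b"\x00\x01")
text = SpeechRecognitionModule().recognize(pcm, states)
target = VoiceResponseModule().respond(text, states)
print(target)  # humidifier
```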
In this embodiment, voice instruction information is acquired; corresponding pulse code modulation data is extracted from the voice instruction information; device identifiers of a plurality of pre-associated intelligent devices are acquired; running state information of each intelligent device is acquired according to the device identifiers; voice recognition processing is performed on the pulse modulation data through the cloud server according to the running state information to obtain a voice recognition result; and a target intelligent device is selected from the intelligent devices according to the voice recognition result to respond to the voice recognition result. In the prior art, the intelligent device locally recognizes the voice instruction information, generates a corresponding control instruction based on the recognized text, and executes a corresponding operation based on the control instruction to respond to the voice instruction information; this limits the accuracy and response speed of voice interaction, slows down voice processing, and degrades the voice interaction experience of the user. In this embodiment, by contrast, the pulse code modulation data corresponding to the voice instruction information and the running state information of the plurality of pre-associated intelligent devices are obtained, the cloud server performs voice recognition processing on the pulse modulation data according to the running state information to obtain a voice recognition result, and the corresponding target intelligent device is then selected from the plurality of intelligent devices according to the voice recognition result to respond to it. Cooperative work between the intelligent devices and the cloud is thus realized, which improves the response speed of online voice processing and, further, the voice interaction experience of the user.
Based on the above-mentioned first embodiment of the multi-device based speech processing system of the present invention, a second embodiment of the multi-device based speech processing system of the present invention is proposed.
In this embodiment, the voice recognition module 30 is further configured to determine whether the running state information meets a working state condition;
the voice recognition module 30 is further configured to upload the pulse modulation data to a cloud server when the operating state information does not meet the operating state condition, so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result.
The running state information comprises processor occupancy rate and/or processor idle rate;
the speech recognition module 30 is further configured to extract the processor occupancy rate from the running state information;
the voice recognition module 30 is further configured to detect whether the occupancy rate of the processor is greater than a preset occupancy rate, and determine whether the running state information meets the working state condition according to the detection result;
the speech recognition module 30 is further configured to extract the processor idle rate from the running state information;
the voice recognition module 30 is further configured to detect whether the processor idle rate is less than a preset idle rate, and determine whether the running state information meets the working state condition according to a detection result.
The voice recognition module 30 is further configured to upload the pulse modulation data to a cloud server when the operating state information does not meet the operating state condition, so that the cloud server obtains a corresponding cloud space characteristic value, and perform voice recognition processing on the pulse modulation data when detecting that the cloud space characteristic value meets a preset storage condition, so as to obtain a voice recognition result.
The voice response module 40 is further configured to select a corresponding target intelligent device from the plurality of intelligent devices according to the voice recognition result;
the voice response module 40 is further configured to store the voice recognition result in a memory of the target smart device, so that the target smart device responds to the voice recognition result.
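The response path just described, in which the voice recognition result is stored in the target device's memory so that the device can respond to it, can be sketched as follows. The `Device` class, the dictionary standing in for the device memory, and the command string are illustrative assumptions.

```python
# Hedged sketch of the voice response module's delivery step.
class Device:
    def __init__(self, name):
        self.name = name
        self.memory = {}  # stands in for the target device's memory

    def respond(self):
        result = self.memory.get("voice_result")
        return f"{self.name} executing: {result}"

def deliver(result, devices, target_name):
    target = devices[target_name]
    target.memory["voice_result"] = result  # store result in device memory
    return target.respond()                 # device responds to the result

devices = {"air_conditioner": Device("air_conditioner")}
print(deliver("set temperature to 24C", devices, "air_conditioner"))
```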
Other embodiments or specific implementation manners of the speech processing system based on multiple devices of the present invention may refer to the above method embodiments, and are not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., a rom/ram, a magnetic disk, an optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A method for speech processing based on multiple devices, the method comprising the steps of:
acquiring voice instruction information, and extracting corresponding pulse code modulation data from the voice instruction information;
acquiring device identifications of a plurality of intelligent devices which are associated in advance, and acquiring running state information of each intelligent device according to the device identifications;
performing voice recognition processing on the pulse modulation data through a cloud server according to the running state information to obtain a voice recognition result;
and selecting a target intelligent device from the plurality of intelligent devices according to the voice recognition result to respond to the voice recognition result.
2. The method according to claim 1, wherein the step of performing voice recognition processing on the pulse modulation data through a cloud server according to the running state information to obtain a voice recognition result specifically includes:
judging whether the running state information meets the working state condition or not;
and when the running state information does not accord with the working state condition, uploading the pulse modulation data to a cloud server so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result.
3. The method of claim 2, wherein the operational status information comprises processor occupancy and/or processor idle rate;
correspondingly, the step of judging whether the running state information meets the working state condition specifically includes:
extracting the processor occupancy from the operating state information;
detecting whether the occupancy rate of the processor is greater than a preset occupancy rate or not, and judging whether the running state information meets working state conditions or not according to a detection result;
and/or;
extracting the processor idle rate from the running state information;
and detecting whether the idle rate of the processor is less than a preset idle rate or not, and judging whether the running state information meets the working state condition or not according to the detection result.
4. The method according to claim 2, wherein the step of uploading the pulse modulation data to a cloud server when the operation state information does not meet the operation state condition, so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result specifically includes:
and when the running state information does not accord with the working state condition, uploading the pulse modulation data to a cloud server so that the cloud server acquires a corresponding cloud space characteristic value, and when detecting that the cloud space characteristic value accords with a preset storage condition, performing voice recognition processing on the pulse modulation data to obtain a voice recognition result.
5. The method of claim 1, wherein the step of selecting a target smart device from the plurality of smart devices to respond to the speech recognition result according to the speech recognition result comprises:
selecting corresponding target intelligent equipment from the intelligent equipment according to the voice recognition result;
and storing the voice recognition result into a memory of the target intelligent device so that the target intelligent device responds to the voice recognition result.
6. A multi-device based speech processing system, the system comprising:
the data acquisition module is used for acquiring voice instruction information and extracting corresponding pulse code modulation data from the voice instruction information;
the state acquisition module is used for acquiring device identifications of a plurality of intelligent devices which are associated in advance and acquiring running state information of each intelligent device according to the device identifications;
the voice recognition module is used for carrying out voice recognition processing on the pulse modulation data through a cloud server according to the running state information to obtain a voice recognition result;
and the voice response module is used for selecting a target intelligent device from the plurality of intelligent devices according to the voice recognition result and responding to the voice recognition result.
7. The system of claim 6, wherein the speech recognition module is further configured to determine whether the operational status information meets an operational status condition;
the voice recognition module is further configured to upload the pulse modulation data to a cloud server when the operating state information does not meet the operating state condition, so that the cloud server performs voice recognition processing on the pulse modulation data to obtain a voice recognition result.
8. The system of claim 7, wherein the operational status information includes processor occupancy and/or processor idle rate;
the voice recognition module is further configured to extract the processor occupancy rate from the operating state information;
the voice recognition module is also used for detecting whether the occupancy rate of the processor is greater than a preset occupancy rate or not and judging whether the running state information meets the working state condition or not according to the detection result;
the voice recognition module is further used for extracting the processor idle rate from the running state information;
the voice recognition module is further used for detecting whether the processor idle rate is smaller than a preset idle rate or not and judging whether the running state information meets working state conditions or not according to a detection result.
9. The system of claim 7, wherein the voice recognition module is further configured to upload the pulse modulation data to a cloud server when the operating state information does not meet the operating state condition, so that the cloud server obtains a corresponding cloud space feature value, and perform voice recognition processing on the pulse modulation data when it is detected that the cloud space feature value meets a preset storage condition, so as to obtain a voice recognition result.
10. The system of claim 6, wherein the voice response module is further configured to select a corresponding target smart device from the plurality of smart devices according to the voice recognition result;
the voice response module is further configured to store the voice recognition result in a memory of the target smart device, so that the target smart device responds to the voice recognition result.
CN202011501007.3A 2020-12-17 2020-12-17 Voice processing method and system based on multiple devices Pending CN112700780A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011501007.3A CN112700780A (en) 2020-12-17 2020-12-17 Voice processing method and system based on multiple devices


Publications (1)

Publication Number Publication Date
CN112700780A true CN112700780A (en) 2021-04-23

Family

ID=75508897


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113759869A (en) * 2021-08-16 2021-12-07 深圳Tcl新技术有限公司 Intelligent household appliance testing method and device
CN113759869B (en) * 2021-08-16 2024-04-02 深圳Tcl新技术有限公司 Intelligent household appliance testing method and device
CN114244879A (en) * 2021-12-15 2022-03-25 北京声智科技有限公司 Industrial control system, industrial control method and electronic equipment

Similar Documents

Publication Publication Date Title
US6694295B2 (en) Method and a device for recognizing speech
CN112700780A (en) Voice processing method and system based on multiple devices
CN104346127B (en) Implementation method, device and the terminal of phonetic entry
KR102004872B1 (en) Electronic device, server and control methods thereof
CN105791931A (en) Smart television and voice control method of the smart television
CN107544272B (en) Terminal control method, device and storage medium
CN103456296A (en) Method for providing voice recognition function and electronic device thereof
CN106504748A (en) A kind of sound control method and device
CN108320748B (en) Cooker sound control method, cooker and computer readable storage medium
CN108039173B (en) Voice information input method, mobile terminal, system and readable storage medium
CN105529025B (en) Voice operation input method and electronic equipment
CN111081005B (en) Directional remote control method, device, equipment and storage medium
CN109308898B (en) Dialect voice recognition method, dialect voice recognition device, terminal and storage medium of terminal
CN111341310A (en) System, method and device for controlling mobile phone based on smart sound box and storage medium
CN106531168A (en) Voice recognition method and voice recognition device
EP3059731A1 (en) Method and apparatus for automatically sending multimedia file, mobile terminal, and storage medium
CN111897601A (en) Application starting method and device, terminal equipment and storage medium
CN110780625A (en) Automatic control method, device and equipment for pre-assembly line and storage medium
CN106371905B (en) Application program operation method and device and server
CN106210002B (en) Control method and device and electronic equipment
CN104809753A (en) File treatment method
CN112265879B (en) Elevator control system, debugging method thereof, debugging equipment and readable storage medium
CN108632957B (en) Network connection method, equipment and storage medium based on voice control
CN113314115A (en) Voice processing method of terminal equipment, terminal equipment and readable storage medium
CN112149425A (en) Terminal control method, device, equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination