CN114155851A

CN114155851A - Terminal device based on voice control and voice control system

Info

Publication number: CN114155851A
Application number: CN202111446783.2A
Authority: CN
Inventors: 高向阳; 程俊; 任子良; 张锲石; 康宇航; 郭海光
Original assignee: Shenzhen Institute of Advanced Technology of CAS
Current assignee: Shenzhen Institute of Advanced Technology of CAS
Priority date: 2021-11-30
Filing date: 2021-11-30
Publication date: 2022-03-08
Also published as: WO2023097761A1

Abstract

The application belongs to the technical field of speech recognition and provides a terminal device based on voice control and a voice control system. The terminal device based on voice control includes: a voice acquisition unit, configured to acquire voice signals; a wakeup word recognition unit, connected to the voice acquisition unit and configured to recognize the voice signal and to send an interrupt signal to a control unit when it recognizes that the voice signal includes a preset wakeup word; and the control unit, connected to the wakeup word recognition unit and configured to start monitoring for voice instructions from the wakeup word recognition unit when the interrupt signal is received, and to generate a control instruction corresponding to a voice instruction when the voice instruction is received. Because the control unit does not itself have to monitor continuously for the wakeup word, fewer of its processing resources are occupied, its resource utilization rate is improved, and its power consumption is reduced.

Description

Terminal device based on voice control and voice control system

Technical Field

The application belongs to the technical field of voice recognition, and particularly relates to a terminal device and a voice control system based on voice control.

Background

With the rapid development of speech recognition technology, speech control functions based on speech recognition technology are widely used in various fields. For example, a voice recognition algorithm is built in a control unit of a terminal device (e.g., a robot) to realize voice control over the terminal device, so that intelligent control over the terminal device is realized, and control efficiency of the terminal device is improved.

In order to improve the accuracy of voice recognition of a terminal device, a user usually needs to input a specific wake-up word to wake up the voice recognition function of the terminal device, and the terminal device starts voice recognition operation after the voice recognition function of the terminal device is woken up. However, the control unit in the existing terminal device needs to constantly monitor whether the user inputs the wakeup word, which occupies more processing resources of the control unit, reduces the resource utilization rate of the control unit, and increases the power consumption of the control unit.

Disclosure of Invention

In view of this, embodiments of the present application provide a terminal device based on voice control and a voice control system, so as to solve the technical problem that the control unit in an existing terminal device needs to constantly monitor whether a wakeup word is input, which occupies considerable processing resources of the control unit and results in a low resource utilization rate and high power consumption of the control unit.

In a first aspect, an embodiment of the present application provides a terminal device based on voice control, including:

a voice acquisition unit, configured to acquire voice signals;

a wakeup word recognition unit, connected to the voice acquisition unit and configured to recognize the voice signal and to send an interrupt signal to a control unit when it recognizes that the voice signal includes a preset wakeup word; and

the control unit, connected to the wakeup word recognition unit and configured to start monitoring for voice instructions from the wakeup word recognition unit when receiving the interrupt signal, and to generate a control instruction corresponding to a voice instruction when receiving the voice instruction.

Optionally, the terminal device further includes a voice signal processing unit connected between the voice acquisition unit and the wakeup word recognition unit; the voice signal processing unit is used for preprocessing the voice signal and sending the preprocessed voice signal to the wakeup word recognition unit; the preprocessing includes filtering processing and signal amplification processing.

Optionally, the terminal device further includes a communication unit connected to the control unit; the terminal equipment is connected with at least one controlled equipment through the communication unit;

the control unit is used for sending the control instruction to the communication unit when the control instruction is the control instruction for the controlled equipment;

the communication unit is used for receiving the control instruction from the control unit and sending the control instruction to the controlled equipment.

Optionally, the terminal device further includes a motor driving unit, a motor, and a motion assembly; the motor driving unit is connected with the control unit, and the motor is connected with the motor driving unit and the motion assembly;

the control unit is used for sending the control instruction to the motor driving unit when the control instruction is a motion instruction for the terminal equipment where the control unit is located;

the motor driving unit is used for driving the motor to operate based on the control instruction so as to drive the motion assembly to correspondingly move.

Optionally, the terminal device further includes an audio output unit connected to the control unit; the control unit is used for generating an audio signal carrying a reply phrase when receiving the interrupt signal and sending the audio signal to the audio output unit;

the audio output unit is used for receiving the audio signal and playing the reply phrase.

Optionally, the terminal device further includes a state indicating unit connected to the control unit;

the state indicating unit is used for indicating, through an indicator lamp, the state of the terminal device in which the control unit is located.

Optionally, the voice acquisition unit is a microphone array composed of a plurality of microphones.

Optionally, the microphones are arranged linearly, and a preset distance is provided between two adjacent microphones.

Optionally, the wakeup word recognition unit includes an analog-to-digital conversion unit and a digital signal processing unit;

the analog-to-digital conversion unit is used for converting the voice signal into a voice instruction in a digital signal form; the voice instruction carries beam information corresponding to the microphone array; the beam information is used for describing the time when each microphone receives the voice signal and the position of each microphone;

the digital signal processing unit is used for determining the position range of the sound source corresponding to the voice signal based on the beam information, performing voice enhancement processing on the voice instruction based on the position range, and sending the voice instruction after the voice enhancement processing to the control unit.

In a second aspect, an embodiment of the present application provides a voice control system, which includes at least one controlled device and a terminal device based on voice control according to the first aspect or any optional manner of the first aspect, where the terminal device is connected to the at least one controlled device.

The implementation of the terminal device and the voice control system based on voice control provided by the embodiment of the application has the following beneficial effects:

In the terminal device based on voice control provided by the embodiments of the present application, a wakeup word recognition unit for recognizing voice signals is arranged between the voice acquisition unit and the control unit. When the wakeup word recognition unit recognizes that the voice signal includes the preset wakeup word, it sends an interrupt signal to the control unit, and the control unit starts monitoring for voice instructions from the wakeup word recognition unit only after receiving the interrupt signal. In other words, wakeup word monitoring is performed by the wakeup word recognition unit, and the control unit starts its voice recognition function only after the wakeup word recognition unit has detected the preset wakeup word. The control unit therefore does not have to spend processing resources on continuous wakeup word monitoring, which improves its resource utilization rate and reduces its power consumption.

Drawings

To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic structural diagram of a terminal device based on voice control according to an embodiment of the present application;

Fig. 2 is a schematic structural diagram of a terminal device based on voice control according to another embodiment of the present application;

Fig. 3 is a schematic structural diagram of a voice control system according to an embodiment of the present application.

Detailed Description

It is noted that the terminology used in the description of the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to limit the present application. In the description of the embodiments of the present application, unless otherwise specified, "/" means "or"; for example, A/B may mean A or B. "And/or" herein merely describes an association between associated objects and indicates that three relationships are possible; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone. In addition, in the description of the embodiments of the present application, unless otherwise specified, "a plurality" means two or more, and "at least one" and "one or more" mean one, two, or more.

In the following, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of such features.

Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.

An embodiment of the present application first provides a terminal device based on voice control. The terminal device may be a robot, an audio device, or the like. Please refer to Fig. 1, which is a schematic structural diagram of a terminal device based on voice control according to an embodiment of the present application. As shown in Fig. 1, the terminal device 100 may include a voice acquisition unit 11, a wakeup word recognition unit 12, and a control unit 13. The wakeup word recognition unit 12 is connected to both the voice acquisition unit 11 and the control unit 13.

In the embodiment of the present application, the voice acquisition unit 11 is used for collecting voice signals.

In a specific application, the voice acquisition unit 11 may include at least one microphone. The voice acquisition unit 11 may collect a voice signal through the at least one microphone and transmit the collected voice signal to the wakeup word recognition unit 12. The number and arrangement of the microphones can be set according to actual requirements.

In the embodiment of the present application, the wakeup word recognition unit 12 is configured to recognize the voice signal and to send an interrupt signal to the control unit 13 when it recognizes that the voice signal includes a preset wakeup word.

In a specific application, the wakeup word recognition unit 12 may recognize the voice signal from the voice acquisition unit 11 based on a voice recognition algorithm, and determine whether the voice signal includes a preset wakeup word.

The preset wake-up words are used for waking up the voice control function of the terminal device 100. The preset wake-up words may be a word, a sentence, or the like, and may be specifically set according to actual requirements, which is not particularly limited herein. For example, the preset wake-up word may be "small Q".

In one embodiment of the present application, the wakeup word recognition unit 12 may send an interrupt signal to the control unit 13 when determining that the voice signal includes the preset wakeup word. In another embodiment of the present application, when determining that the voice signal does not include the preset wakeup word, the wakeup word recognition unit 12 may ignore the voice signal and continue to receive voice signals from the voice acquisition unit 11 until it recognizes a voice signal that includes the preset wakeup word, at which point it sends an interrupt signal to the control unit 13.

In a specific application, the wakeup word recognition unit 12 may send the interrupt signal to the control unit 13 by means of either a hardware interrupt or a software interrupt. The manner may be chosen according to actual requirements and is not particularly limited herein.

After sending the interrupt signal to the control unit 13, if the wakeup word recognition unit 12 receives a further voice signal from the voice acquisition unit 11, it may convert that voice signal into a corresponding voice instruction and send the voice instruction to the control unit 13. The voice signal is an analog signal and the voice instruction is the digital signal corresponding to it; the wakeup word recognition unit 12 may obtain the voice instruction by performing analog-to-digital conversion on the voice signal.

In the embodiment of the present application, the control unit 13 is configured to start monitoring the voice instruction from the wakeup word recognition unit 12 when receiving the interrupt signal, and generate a control instruction corresponding to the voice instruction when receiving the voice instruction.
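The interaction between the two units can be pictured as a small state machine. The following is only a minimal sketch under stated assumptions: the class names, method names, and the instruction table are hypothetical, and voice signals and voice instructions are modeled as plain text strings rather than as analog or digital audio.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()        # control unit is not monitoring voice instructions
    LISTENING = auto()   # interrupt received; monitoring voice instructions


class ControlUnit:
    """Hypothetical model of control unit 13."""

    def __init__(self, instruction_table):
        # Maps a recognized voice instruction to a control instruction
        # (illustrative contents only).
        self.instruction_table = instruction_table
        self.state = State.IDLE

    def on_interrupt(self):
        """Called when the wakeup word recognition unit raises the interrupt."""
        self.state = State.LISTENING

    def on_voice_instruction(self, voice_instruction):
        """Return a control instruction, or None if idle or unrecognized."""
        if self.state is not State.LISTENING:
            return None  # before the interrupt, nothing is processed
        return self.instruction_table.get(voice_instruction)


class WakeupWordRecognitionUnit:
    """Hypothetical model of wakeup word recognition unit 12."""

    def __init__(self, wakeup_word, control_unit):
        self.wakeup_word = wakeup_word
        self.control_unit = control_unit
        self.awake = False

    def on_voice_signal(self, text):
        if not self.awake:
            # Only this unit watches for the wakeup word.
            if self.wakeup_word in text:
                self.awake = True
                self.control_unit.on_interrupt()
            return None
        # After the interrupt, later signals are forwarded as voice instructions.
        return self.control_unit.on_voice_instruction(text)
```

In this sketch the control unit does no work until `on_interrupt` is called, which mirrors the division of labor described above.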

In one possible implementation, the control instruction may include a control instruction for the terminal device itself, for example, a motion instruction. The motion instruction may include a motion parameter of the terminal device, for example, a motion route, a motion speed, and/or a rotation angular speed of the terminal device. In this implementation, the terminal device 100 can implement control of itself by executing the control instruction.

In another possible implementation, the control instruction may include a control instruction for another device connected to the terminal device. In this implementation, the control unit 13 may send a control instruction to the other device to implement control of the other device.

In a specific application, the control unit 13 may include a microcontroller unit (MCU), a single-chip microcomputer (SCM), an Advanced RISC Machine (ARM) processor, or the like, and may be specifically set according to actual requirements, which is not particularly limited herein.

As can be seen from the above, in the terminal device based on voice control provided in this embodiment, a wakeup word recognition unit for recognizing voice signals is arranged between the voice acquisition unit and the control unit. When the wakeup word recognition unit recognizes that the voice signal includes the preset wakeup word, it sends an interrupt signal to the control unit, and the control unit starts monitoring for voice instructions from the wakeup word recognition unit only after receiving the interrupt signal. In other words, wakeup word monitoring is performed by the wakeup word recognition unit, and the control unit starts its voice recognition function only after the wakeup word recognition unit has detected the preset wakeup word. The control unit therefore does not have to spend processing resources on continuous wakeup word monitoring, which improves its resource utilization rate and reduces its power consumption.

Please refer to Fig. 2, which is a schematic structural diagram of a terminal device based on voice control according to another embodiment of the present application. As shown in Fig. 2, this embodiment differs from the embodiment corresponding to Fig. 1 in that the voice acquisition unit 11 may be a microphone array composed of a plurality of microphones (microphone 1 to microphone n). For example, the plurality of microphones may be arranged linearly, with every two adjacent microphones spaced apart by a certain distance. Of course, the plurality of microphones may also be arranged in other ways.

It is understood that the difference in the arrangement positions of the different microphones may cause the voice signals from the same sound source to arrive at the different microphones at different times. Therefore, the position range where the sound source is located can be calculated using the time information when the voice signal reaches each microphone and the position information of each microphone.

In this embodiment, each microphone transmits the voice signal to the wakeup word recognition unit 12 together with the time at which it acquired the voice signal. The wakeup word recognition unit 12 can calculate the position range of the sound source from the acquisition time of each microphone and the position information of each microphone, and then perform voice enhancement processing on voice signals from within that position range and filter out voice signals from outside it. The position information of each microphone may be stored in the wakeup word recognition unit 12.
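The application does not specify how the position range is computed. As one possible illustration, the sketch below estimates a direction of arrival for a single pair of adjacent microphones from the difference in arrival times, under a far-field (plane-wave) assumption, and widens it into an angular range using an assumed timing uncertainty. The function names, the speed-of-sound constant, and the timing-error parameter are all assumptions introduced for this example.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at about 20 °C


def arrival_angle_deg(t_first, t_second, spacing_m):
    """Estimate the direction of arrival for two microphones.

    t_first and t_second are the arrival times (in seconds) at the two
    microphones, and spacing_m is the distance between them. The result is
    the angle in degrees measured from the perpendicular to the array line;
    positive angles lie on the side of the first microphone.
    """
    delay = t_second - t_first
    ratio = SPEED_OF_SOUND_M_S * delay / spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    return math.degrees(math.asin(ratio))


def arrival_angle_range_deg(t_first, t_second, spacing_m, timing_error_s):
    """Turn an assumed timing uncertainty into an angular range (low, high)."""
    low = arrival_angle_deg(t_first, t_second - timing_error_s, spacing_m)
    high = arrival_angle_deg(t_first, t_second + timing_error_s, spacing_m)
    return low, high
```

With more than two microphones, the estimates from several pairs could be combined to narrow the range; that step is omitted here.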

In this embodiment, a microphone array is used to collect voice signals, and the strong directivity of the microphone array is used to locate the sound source. Voice signals from within the position range of the sound source are enhanced and signals from outside that range are filtered out, which reduces the interference of environmental noise with the voice signal, improves the accuracy of voice recognition by the terminal device, and makes voice control more accurate.

In a further embodiment of the present application, the terminal device 100 based on voice control further includes a voice signal processing unit 14 connected between the voice acquisition unit 11 and the wakeup word recognition unit 12.

The voice signal processing unit 14 is configured to pre-process a voice signal and send the pre-processed voice signal to the wakeup word recognition unit 12.

In a particular application, the pre-processing may include filtering processing and signal amplification processing. Based on this, the voice signal processing unit 14 may include a filter circuit 141 and a signal amplification circuit 142.

The filter circuit 141 may be a hardware filter circuit (for example, a filter circuit built from components such as resistors and capacitors), or may be an off-the-shelf filter, and is not particularly limited herein.

The signal amplification circuit 142 may be a hardware signal amplification circuit. In an example, the signal amplification circuit 142 may include a low noise amplifier.

In this embodiment, the voice signal processing unit is arranged between the voice acquisition unit and the wakeup word recognition unit, so that noise (such as environmental noise) in the voice signal can be filtered out and the voice signal can be amplified, which improves the voice recognition accuracy of the wakeup word recognition unit and the control unit.

In a further embodiment of the present application, the terminal device 100 based on voice control further includes a communication unit 15 connected to the control unit 13. The terminal device may be connected with at least one controlled device through the communication unit 15. In a specific application, the controlled device may be a smart home device, including but not limited to a smart lamp, an air conditioner, a refrigerator, a washing machine, a clothes hanger, a curtain, a television, and a video monitor. The number of controlled devices can be set according to actual requirements and is not particularly limited herein.

In one possible implementation, the communication unit 15 may be a wireless communication unit, for example, a communication unit based on the wireless fidelity (Wi-Fi) protocol, the ZigBee protocol, or the Bluetooth protocol.

In another possible implementation, the communication unit 15 may be a wired communication unit, for example, a Universal Serial Bus (USB) interface unit.

In this embodiment, the control unit 13 is configured to send a control instruction to the communication unit 15 when the control instruction is a control instruction for a controlled device. The communication unit 15 is configured to receive a control instruction from the control unit 13 and transmit the control instruction to the controlled device.

In this embodiment, since the control unit 13 needs to send the control instruction to the controlled device connected to the terminal device, the control unit 13 must follow the communication protocol between the terminal device and the controlled device when generating the control instruction; that is, the data structure of the control instruction generated by the control unit 13 needs to meet the requirements of the communication protocol.

In one possible approach, the data structure of the control instruction may be as shown in Table 1 below.

TABLE 1

Data head	Data length	Function code	Data bit	Check bit
Byte0, Byte1	Byte2	Byte3	Byte4 to Byte n	Byte n+1

The data header consists of the start bytes of the control instruction and indicates the start of the control instruction. Illustratively, the data header may be represented by two bytes (i.e., Byte0 and Byte1). By way of example and not limitation, both Byte0 and Byte1 may be the hexadecimal number 0xF8 (i.e., the binary number 11111000).

The data length indicates the effective data length of the control instruction, i.e., the total length in bytes of the fields in Table 1, including the data header.

The function code is used to indicate the type of function that the control instruction implements. The functions of different classes are uniquely identified by the function code. Illustratively, the definition of the function code may be as follows:

When the function code is the hexadecimal number 0x00, the control instruction is used for implementing the motion control function of the terminal device.

When the function code is the hexadecimal number 0x01, the control instruction is used for implementing the control function of the smart lamp.

When the function code is the hexadecimal number 0x02, the control instruction is used for implementing the control function of the air conditioner.

When the function code is the hexadecimal number 0x03, the control instruction is used for implementing the control function of the refrigerator.

When the function code is the hexadecimal number 0x04, the control instruction is used for implementing the control function of the washing machine.

When the function code is the hexadecimal number 0x05, the control instruction is used for implementing the control function of the clothes hanger.

When the function code is the hexadecimal number 0x06, the control instruction is used for implementing the control function of the curtain.

When the function code is the hexadecimal number 0x07, the control instruction is used for implementing the control function of the television.

When the function code is the hexadecimal number 0x08, the control instruction is used for implementing the control function of the video monitor.
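The illustrative function code assignments above can be captured as an enumeration. This is a sketch only: the identifier names and the `describe` helper are assumptions introduced for this example, while the numeric values are the ones listed in the text.

```python
from enum import IntEnum


class FunctionCode(IntEnum):
    """Illustrative function codes as listed in the description."""
    TERMINAL_MOTION = 0x00   # motion control of the terminal device itself
    SMART_LAMP = 0x01
    AIR_CONDITIONER = 0x02
    REFRIGERATOR = 0x03
    WASHING_MACHINE = 0x04
    CLOTHES_HANGER = 0x05
    CURTAIN = 0x06
    TELEVISION = 0x07
    VIDEO_MONITOR = 0x08


def describe(code_byte):
    """Return the name for a function code byte, or "UNKNOWN" if unassigned."""
    try:
        return FunctionCode(code_byte).name
    except ValueError:
        return "UNKNOWN"
```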

The data bits are used to record the valid control content. The effective control content is used to describe the manner of control over the target device, i.e., how the target device is controlled. The length of the data bits is different according to different control contents, and may be determined according to actual requirements, and the length of the data bits is not particularly limited herein. The target device may be the terminal device 100 itself, or may be a controlled device connected to the terminal device 100.

The check code (the check bit in Table 1) is used for verifying the validity of the control instruction. The check code may be generated based on a preset check code generation policy. For example, the policy may be: starting from the first byte of the control instruction, perform an XOR operation on the first byte and the second byte to obtain a first XOR value; perform an XOR operation on the first XOR value and the third byte to obtain a second XOR value; and so on, until the (n-1)th XOR value is obtained, which is taken as the check code.
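The frame layout of Table 1 and the XOR policy can be sketched as follows. Several points are assumptions made for this example rather than statements of the application: the function names are hypothetical, the length byte is taken to count every byte of the frame (header, length, function code, data, and check byte), the check code is computed over all bytes that precede it, and the single payload byte 0x01 has no defined meaning in the source.

```python
from functools import reduce

HEADER = bytes([0xF8, 0xF8])  # example header value given in the text


def xor_check(data):
    """XOR all bytes together, starting from the first byte."""
    return reduce(lambda acc, b: acc ^ b, data, 0)


def build_frame(function_code, payload):
    """Assemble header, length, function code, data, and check code."""
    # Assumption: the length byte counts the whole frame:
    # header (2) + length (1) + function code (1) + payload + check (1).
    total_length = len(HEADER) + 1 + 1 + len(payload) + 1
    if total_length > 0xFF:
        raise ValueError("payload too long for a one-byte length field")
    body = HEADER + bytes([total_length, function_code]) + bytes(payload)
    # The check code is the running XOR of every preceding byte.
    return body + bytes([xor_check(body)])


# Hypothetical smart lamp instruction (function code 0x01, payload 0x01).
frame = build_frame(0x01, [0x01])
print(frame.hex(" "))  # f8 f8 06 01 01 06
```

A convenient property of this policy is that XORing every byte of a valid frame, check code included, yields zero.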

After receiving data, the controlled device may identify a control instruction by its data header, and then determine from the function code whether the control instruction is intended for the controlled device itself. If it is, the controlled device verifies the validity of the control instruction based on the check code and, after determining that the control instruction is valid, performs the corresponding control based on the data bits.
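The receiving side can be sketched in the same way. The function names and return convention below are hypothetical, and the sketch assumes that the length byte counts the whole frame and that the check code is the XOR of all preceding bytes; the length check is an extra step that the text does not mention.

```python
HEADER = bytes([0xF8, 0xF8])  # example header value given in the text


def xor_check(data):
    """XOR all bytes together, starting from the first byte."""
    value = 0
    for b in data:
        value ^= b
    return value


def handle_frame(frame, own_function_code):
    """Return the data bytes if the frame is valid and meant for this device.

    Returns None when the frame is not a control instruction, is meant for
    another device, or fails validation.
    """
    if len(frame) < 5 or frame[:2] != HEADER:
        return None                  # data header not recognized
    if frame[3] != own_function_code:
        return None                  # intended for a different device
    if frame[2] != len(frame):
        return None                  # assumed length convention violated
    if xor_check(frame[:-1]) != frame[-1]:
        return None                  # check code mismatch
    return bytes(frame[4:-1])        # data bits describing the control content
```

For example, a smart lamp using function code 0x01 would accept the frame `f8 f8 06 01 01 06` and receive the single data byte 0x01, while an air conditioner using 0x02 would ignore it.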

In this embodiment, the communication unit is added to the terminal device, so that the terminal device can be connected to the controlled device through the communication unit, and further, the voice control of the controlled device is realized.

In still another embodiment of the present application, the terminal device 100 based on voice control further includes a motor driving unit 16, a motor 17, and a motion assembly 18. The motor driving unit 16 is connected to the control unit 13, and the motor 17 is connected to the motor driving unit 16 and the motion assembly 18.

In this embodiment, the control unit 13 is configured to send the control instruction to the motor driving unit 16 when the control instruction is a motion instruction for the terminal device itself. The control command may carry a motion parameter of the terminal device, for example, a motion route, a motion speed, and/or a rotation angular speed of the terminal device.

The motor driving unit 16 is used for driving the motor 17 to operate based on the control instruction so as to drive the motion assembly 18 to perform corresponding motion.
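Taken together with the communication unit embodiment, the control unit effectively routes each control instruction to one of two destinations. The sketch below shows one way this routing could look, assuming the illustrative function code 0x00 marks a motion instruction for the terminal device itself; the `Recorder` class and the `dispatch` function are hypothetical stand-ins, not part of the application.

```python
TERMINAL_MOTION = 0x00  # illustrative function code for the terminal's own motion


class Recorder:
    """Hypothetical stand-in for motor driving unit 16 or communication unit 15."""

    def __init__(self):
        self.received = []

    def send(self, instruction):
        self.received.append(instruction)


def dispatch(function_code, instruction, motor_driving_unit, communication_unit):
    """Route a control instruction and report which unit received it."""
    if function_code == TERMINAL_MOTION:
        # Motion instruction for the terminal device itself.
        motor_driving_unit.send(instruction)
        return "motor"
    # Any other function code targets a controlled device.
    communication_unit.send(instruction)
    return "communication"
```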

According to the embodiment, the motor driving unit, the motor and the motion assembly are arranged in the terminal equipment, so that the motion of the terminal equipment can be controlled through voice, and the convenience of controlling the terminal equipment is improved.

In a further embodiment of the present application, the terminal device 100 based on voice control further includes an audio output unit 19 connected to the control unit 13. In this embodiment, the control unit 13 is configured to generate an audio signal carrying a reply phrase when receiving the interrupt signal, and to send the audio signal to the audio output unit 19. The audio output unit 19 is configured to receive the audio signal and play the reply phrase carried in it.

In this embodiment, the purpose of playing the reply phrase through the audio output unit 19 is to inform the user that the voice monitoring function of the terminal device has been turned on and that the user can start controlling the terminal device by voice.

The reply phrase may be set according to actual requirements and is not particularly limited herein. For example, the reply phrase may be "Lingxi".

In a specific application, the audio output unit 19 may include a signal amplification circuit and a speaker (not shown). The signal amplification circuit is connected to the control unit 13 and the speaker. The signal amplification circuit is used for amplifying the audio signal carrying the reply phrase and sending the amplified audio signal to the speaker. The speaker is used for playing the reply phrase carried in the audio signal.

In this embodiment, the audio output unit is arranged in the terminal device, and the reply phrase responding to the voice signal sent by the user is output through the audio output unit, so that the user can learn the state of the terminal device in a timely manner.

In a further embodiment of the present application, the voice control based terminal device 100 further comprises a status indication unit 20 connected to the control unit 13. The status indication unit 20 is configured to indicate the status of the terminal device 100 through an indicator lamp. Illustratively, the state of the terminal device 100 includes, but is not limited to, a voice listening state of the control unit 13 in the terminal device 100.

In a specific application, the status indication unit 20 may include a light-emitting diode (LED), and indicates different statuses of the terminal device by controlling the LED to emit light with different colors.

In yet another embodiment of the present application, the wakeup word recognition unit 12 may include an analog-to-digital conversion unit 121 and a digital signal processing unit 122.

The analog-to-digital conversion unit 121 is configured to convert a voice signal into a voice instruction in the form of a digital signal; the voice instruction carries beam information corresponding to the microphone array. The beam information may be used to describe the time at which the speech signal was received by each microphone and the location of each microphone.

The digital signal processing unit 122 is configured to determine a position range where a sound source corresponding to the voice signal is located based on the beam information, perform voice enhancement processing on the voice instruction based on the position range, and send the voice instruction after the voice enhancement processing to the control unit 13.

In a specific application, the number of the analog-to-digital conversion units 121 may be equal to the number of microphones, that is, each analog-to-digital conversion unit 121 corresponds to one microphone, and each analog-to-digital conversion unit 121 is used for converting a voice signal from its corresponding microphone into a voice instruction in a digital form.

In this embodiment, after the digital signal processing unit 122 determines the position range where the sound source is located, the voice enhancement processing can be performed on the voice signal within the position range, and the voice signal outside the position range is filtered, so that the interference of the environmental noise to the voice signal can be reduced, the accuracy of the voice recognition of the terminal device is improved, and the voice control is more accurate.
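The application does not name a specific voice enhancement algorithm. One widely used technique that fits the description is delay-and-sum beamforming, shown below purely as an illustration and not as the method of the application: each channel is shifted by the delay expected for the chosen source direction and the channels are averaged, so that sound from that direction adds coherently while sound from other directions is attenuated. The function name and the integer-sample delays are assumptions.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its delay (in samples) and average them.

    channels: list of equal-length sample lists, one per microphone.
    delays:   arrival delay of each channel in samples, relative to the
              microphone that hears the source first (non-negative integers).
    """
    length = len(channels[0]) - max(delays)
    output = []
    for i in range(length):
        # Read each channel at the sample where the same wavefront arrived.
        total = sum(ch[i + d] for ch, d in zip(channels, delays))
        output.append(total / len(channels))
    return output


# Three microphones hearing the same source one sample apart.
source = [1.0, 2.0, 3.0, 4.0]
mics = [
    source + [0.0, 0.0],          # arrives first
    [0.0] + source + [0.0],       # one sample later
    [0.0, 0.0] + source,          # two samples later
]
aligned = delay_and_sum(mics, [0, 1, 2])  # steered at the source
```

When the delays match the true arrival pattern, `aligned` reproduces the source samples; steering with mismatched delays, such as `[0, 0, 0]`, averages misaligned samples and reduces the output level.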

In still another embodiment of the present application, the terminal device 100 may further include a storage unit connected to the control unit 13, a power supply unit that supplies power to the respective units, and the like.

The embodiment of the application also provides a voice control system. Please refer to fig. 3, which is a schematic structural diagram of a voice control system according to an embodiment of the present application. As shown in fig. 3, the voice control system may include at least one controlled device and the terminal device 100 based on voice control in the corresponding embodiment of fig. 1 or fig. 2. The terminal device 100 is connected to at least one controlled device.

It should be noted that, for the description of the terminal device 100, reference may be specifically made to fig. 1 and fig. 2 and the related description in the embodiment corresponding to fig. 1 and fig. 2, and details thereof are not repeated here.

It is obvious to those skilled in the art that, for convenience and simplicity of description, the above-mentioned division of functional units is merely used as an example. In practical applications, the above-mentioned functions may be allocated to different functional units as needed, that is, the internal structure of the terminal device may be divided into different functional units to perform all or part of the above-mentioned functions. The functional units in the embodiments may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit, and the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units are only used for distinguishing one functional unit from another, and are not used for limiting the protection scope of the application. The specific working process of the units in the system may refer to the corresponding process in the foregoing embodiments, and is not described herein again.

In the above embodiments, the description of each embodiment has its own emphasis, and parts that are not described or illustrated in a certain embodiment may refer to the description of other embodiments.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims

1. A terminal device based on voice control, comprising:

the voice acquisition unit is used for acquiring voice signals;

2. The terminal device according to claim 1, further comprising a voice signal processing unit connected between the voice acquisition unit and the wake-up language recognition unit; the voice signal processing unit is used for preprocessing the voice signal and sending the preprocessed voice signal to the wake-up language recognition unit; the preprocessing comprises filtering processing and signal amplification processing.

3. The terminal device according to claim 1, further comprising a communication unit connected to the control unit; the terminal device is connected to at least one controlled device through the communication unit;

4. The terminal device according to claim 1, further comprising a motor driving unit, a motor, and a moving assembly; the motor driving unit is connected with the control unit, and the motor is connected with the motor driving unit and the moving assembly;

5. The terminal device according to claim 1, further comprising an audio output unit connected to the control unit; the control unit is used for generating an audio signal carrying a reply word when receiving the interrupt signal and sending the audio signal to the audio output unit;

6. The terminal device according to claim 1, further comprising a status indication unit connected to the control unit;

7. The terminal device according to any one of claims 1 to 6, wherein the voice acquisition unit is a microphone array composed of a plurality of microphones.

8. The terminal device according to claim 7, wherein the plurality of microphones are arranged linearly, and a preset distance is provided between two adjacent microphones.

9. The terminal device according to claim 7, wherein the wake-up language recognition unit comprises an analog-to-digital conversion unit and a digital signal processing unit;

10. A voice control system, characterized by comprising at least one controlled device and a terminal device based on voice control according to any one of claims 1 to 9, the terminal device being connected to the at least one controlled device.