WO2022041319A1

WO2022041319A1 - Voice control optimization system and method

Info

Publication number: WO2022041319A1
Application number: PCT/CN2020/114184
Authority: WO
Inventors: 汤智文; 刘胜利; 唐韧; 叶鑫
Original assignee: 广东奥科伟业科技发展有限公司
Priority date: 2020-08-28
Filing date: 2020-09-09
Publication date: 2022-03-03
Also published as: CN111986672A

Abstract

A voice control optimization system and a voice control optimization method. The voice control optimization system comprises a voice recognition module (1), a voice control module (2) and a voice editing module (3). The voice control module (2) and the voice editing module (3) are respectively connected to the voice recognition module (1). The voice recognition module (1) is used to receive voice information, perform command recognition on the voice information according to a recognition mode, and convert the voice information recognized as a command into control information. The voice control module (2) receives the control information, and executes an action according to the control information. The voice editing module (3) is used to perform editing with respect to the recognition mode of the voice recognition module (1). Provision of the voice editing module (3) enables editing to be performed with respect to the recognition mode of the voice recognition module (1), and realizes optimization of voice control, such that voice information used by an operator in daily life can be used as custom control information to be recognized, thereby improving the convenience of voice control and operation experience.

Description

System and method for optimizing voice control

Technical field

The invention relates to the technical field of automatic curtain opening and closing control, in particular to a system and method for optimizing voice control.

Background art

Electric curtains are widely used in various buildings, and with the development of science and technology, voice-controlled electric curtains have appeared. In the prior art, voice control of an electric curtain requires a specific language and accent. However, people's everyday language and accent differ from region to region, so an operator often has to adjust language and accent several times during voice control, and control succeeds only when the speech conforms to the specific language and tone set for the electric curtain. This makes voice control inconvenient and degrades the operator's experience.

SUMMARY OF THE INVENTION

Aiming at the deficiencies of the prior art, the present invention provides a system and method for optimizing voice control.

A system for optimizing voice control includes:

a voice recognition module, which is used for receiving voice information, performing command recognition on the voice information according to a recognition mode, and converting the voice information recognized as a command into control information;

a voice control module connected to the voice recognition module; the voice control module receives the control information and performs an action according to the control information; and

a voice editing module connected to the voice recognition module; the voice editing module is used to edit the recognition mode of the voice recognition module.

According to an embodiment of the present invention, it further includes a terminal module; the terminal module is connected to the voice editing module; the terminal module is used for sending editing information to the voice editing module, and the voice editing module edits the recognition mode of the voice recognition module according to the editing information.

According to an embodiment of the present invention, the voice recognition module includes a voice recognition unit and a voice processing unit; the voice recognition unit is connected to the voice processing unit; the voice processing unit stores the recognition mode; the voice recognition unit receives the voice information and transmits it to the voice processing unit; the voice processing unit performs command recognition on the voice information according to the recognition mode, and converts the voice information recognized as a command into control information.

According to an embodiment of the present invention, the voice control module includes a voice control unit and an action execution unit; the voice control unit is connected to the action execution unit; the voice control unit receives the control information, forms a control instruction, and transmits it to the action execution unit; the action execution unit performs an action according to the control instruction.

According to an embodiment of the present invention, the voice editing module includes a voice editing unit and a first wireless communication unit; the voice editing unit is connected to the first wireless communication unit; the terminal module includes an input unit and a second wireless communication unit; the input unit is connected to the second wireless communication unit; the first wireless communication unit is wirelessly connected to the second wireless communication unit; the editing information input through the input unit is wirelessly transmitted to the voice editing unit through the cooperation of the second wireless communication unit and the first wireless communication unit; the voice editing unit changes the recognition mode of the voice recognition module according to the editing information.

A method for optimizing voice control includes:

the voice editing module edits the recognition mode of the voice recognition module;

the voice recognition module receives voice information, performs command recognition on the voice information according to the edited recognition mode, and converts the voice information recognized as a command into control information; and

the voice control module performs an action according to the control information.

According to an embodiment of the present invention, before the voice editing module edits the recognition mode of the voice recognition module, the method further includes:

The terminal module sends editing information to the voice editing module.

According to an embodiment of the present invention, before the terminal module sends editing information to the voice editing module, it further includes:

The terminal module sends confirmation information and verification information for the operator to confirm.

According to an embodiment of the present invention, before the terminal module sends confirmation information and verification information for the operator to confirm, the method further includes:

The operator inputs editing information through the terminal module.

According to an embodiment of the present invention, the recognition mode includes language type and voice tone.

Compared with the prior art, providing the voice editing module makes it possible to edit the recognition mode of the voice recognition module and thereby optimize voice control, so that the voice information the operator habitually uses can be customized as the control information to be recognized, which greatly increases the convenience and operating experience of voice control.

Description of drawings

The drawings described herein are used to provide further understanding of the present application and constitute a part of the present application. The schematic embodiments and descriptions of the present application are used to explain the present application and do not constitute an improper limitation of the present application. In the drawings:

FIG. 1 is a schematic structural diagram of a system for optimizing voice control in Embodiment 1;

FIG. 2 is a flowchart of a method for optimizing voice control in Embodiment 2.

Detailed description

Various embodiments of the present invention will be disclosed in the drawings below, and for the sake of clarity, many practical details will be described together in the following description. It should be understood, however, that these practical details should not be used to limit the invention. That is, in some embodiments of the invention, these practical details are unnecessary. In addition, for the purpose of simplifying the drawings, some well-known structures and components will be shown in a simple schematic manner in the drawings.

It should be noted that all directional indications (such as up, down, left, right, front, back...) in this embodiment of the present invention are only used to explain the difference between the various components under a certain posture (as shown in the accompanying drawings). If the specific posture changes, the directional indication also changes accordingly.

In addition, descriptions such as "first" and "second" in the present invention are only for the purpose of description. They do not denote order or sequence, nor are they used to limit the present invention; they only distinguish components or operations described by the same technical terms, and should not be construed as indicating or implying relative importance or the quantity of the indicated technical features. Thus, a feature delimited with "first" or "second" may expressly or implicitly include at least one of that feature. In addition, the technical solutions of the various embodiments can be combined with each other, but only on the basis that the combination can be realized by those of ordinary skill in the art. When a combination of technical solutions is contradictory or cannot be realized, the combination should be considered not to exist and is not within the scope of protection claimed by the present invention.

In order to further understand the content, features and effects of the present invention, the following embodiments are given as examples, and are described in detail as follows in conjunction with the accompanying drawings:

Embodiment 1

Referring to FIG. 1, FIG. 1 is a schematic structural diagram of a system for optimizing voice control in the first embodiment. The system for optimizing voice control in this embodiment includes a voice recognition module 1, a voice control module 2 and a voice editing module 3. The voice recognition module 1 is used for receiving voice information, performing command recognition on the voice information according to the recognition mode, and converting the voice information recognized as a command into control information. The voice control module 2 is connected to the voice recognition module 1; the voice control module 2 receives the control information and performs an action according to the control information. The voice editing module 3 is connected to the voice recognition module 1, and is used to edit the recognition mode of the voice recognition module 1.

By providing the voice editing module 3, the recognition mode of the voice recognition module 1 can be edited and voice control is optimized, so that the voice information the operator habitually uses can be customized as the control information to be recognized, which greatly increases the convenience and operating experience of voice control.

Referring back to FIG. 1, the system for optimizing voice control in this embodiment further includes a terminal module 4. The terminal module 4 is connected to the voice editing module 3. The terminal module 4 is used for sending editing information to the voice editing module 3, and the voice editing module 3 edits the recognition mode of the voice recognition module 1 according to the editing information. Providing the terminal module 4 makes it convenient for the operator to input editing information and makes the operation of optimizing voice control more convenient.

Referring back to FIG. 1, the voice recognition module 1 includes a voice recognition unit 11 and a voice processing unit 12. The voice recognition unit 11 is connected to the voice processing unit 12. The recognition mode is stored in the voice processing unit 12. The voice recognition unit 11 receives the voice information and transfers it to the voice processing unit 12. The voice processing unit 12 performs command recognition on the voice information according to the recognition mode, and converts the voice information recognized as a command into control information.

Specifically, the voice recognition unit 11 can use an existing voice recognizer or voice recognition circuit, such as a microphone, which can pick up and input human voices. The voice information in this embodiment is human speech. The voice processing unit 12 may use an MCU chip with storage and voice processing functions. After the voice recognition unit 11 transmits the voice information to the voice processing unit 12, the voice processing unit 12 first performs command recognition on the voice information according to the recognition mode, that is, it parses and judges the voice information, and converts the voice information into control information only when the voice information is judged to be a command.

It can be understood that follow-up operations are needed only if the operator's voice is a valid instruction; otherwise the voice uttered by the operator is a misoperation and subsequent operations are meaningless. This is the purpose of performing command recognition on the voice information. The recognition mode here is the command voice for which the language type and voice tone have been set. The setting is made by the operator through the voice editing module 3, and the language type and voice tone are those the operator uses daily and is familiar with and accustomed to. The language type can be a national language, such as "Mandarin", "English", or "German", or a local language, such as "Cantonese", "Hokkien", "Henan dialect", or "Shanghai dialect". The voice tone refers to the different accents of the above language types.
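As a minimal sketch, the recognition mode described above can be pictured as a small record that holds a language type, a voice tone, and the command voices set by the operator. The class name `RecognitionMode`, its field names, the accent label, and the use of text strings to stand in for recorded command voices are all assumptions made for illustration; the embodiment does not specify a storage format.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RecognitionMode:
    """Hypothetical record for a recognition mode (layout not given in the source)."""
    language_type: str   # e.g. "Mandarin" or "Cantonese"
    voice_tone: str      # accent label within the language type (assumed encoding)
    # Maps a command voice (represented by text here) to a command name.
    command_voices: Dict[str, str] = field(default_factory=dict)


# Example: a mode set by an operator who habitually speaks Cantonese.
cantonese_mode = RecognitionMode(
    language_type="Cantonese",
    voice_tone="everyday accent of the operator",
    command_voices={"open the curtain": "OPEN", "close the curtain": "CLOSE"},
)
```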

In this embodiment, command recognition is performed on the voice information in the following manner: preset command information is stored in the voice processing unit 12; if the voice information matches the preset command information, the voice information is judged to be a command, and otherwise it is judged not to be a command. For example, the recognition mode stored in the voice processing unit 12 contains command words such as "open the curtain" or "close the curtain" as the preset command information, that is, the preset command voices. After the voice information input by the voice recognition unit 11 is parsed, if it matches a command voice such as "open the curtain" or "close the curtain", the voice processing unit 12 judges the voice information to be a command. If it does not match any command voice such as "open the curtain" or "close the curtain", the voice processing unit 12 judges that the voice information is not a command.

After judging the voice information as a command, the voice processing unit 12 converts the voice information judged as a command into control information according to the recognition mode. For example, after the voice information of "open the curtains" or "close the curtains" is recognized as a command, the voice processing unit 12 converts the above voice information into control information suitable for "open the curtains" or "close the curtains".
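The matching and conversion just described can be sketched as a lookup against preset command information. This is only an illustration under stated assumptions: the parsed voice information is represented as a text string, and the names `PRESET_COMMANDS` and `recognize_command` and the values "OPEN" and "CLOSE" are hypothetical. The embodiment does not say how the MCU actually compares voices.

```python
from typing import Optional

# Assumed preset command information. A real device would compare audio
# features; text strings are only a stand-in for command voices here.
PRESET_COMMANDS = {
    "open the curtain": "OPEN",
    "close the curtain": "CLOSE",
}


def recognize_command(voice_information: str) -> Optional[str]:
    """Return control information if the input matches a preset command, else None."""
    # Normalize case and whitespace before matching.
    key = " ".join(voice_information.lower().split())
    return PRESET_COMMANDS.get(key)
```

Returning `None` for unmatched input corresponds to the case in which the voice information is judged not to be a command, so no control information is produced.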

Preferably, the voice control module 2 includes a voice control unit 21 and an action execution unit 22. The voice control unit 21 is connected to the action execution unit 22. The voice control unit 21 receives the control information, forms a control instruction, and transmits it to the action execution unit 22. The action execution unit 22 executes the action according to the control instruction.

Specifically, the voice control unit 21 is connected to the voice processing unit 12. The voice processing unit 12 judges the voice information as a command, converts the voice information judged as a command into control information according to the recognition mode, and then transmits the control information to the voice control unit 21. The voice control unit 21 in this embodiment is an MCU chip with a control function, such as a motor control chip. The action execution unit 22 is a device having an action execution function, such as a motor for a rolling shutter. After the voice processing unit 12 transmits the control information to the voice control unit 21, the voice control unit 21 forms a control instruction, such as forward rotation or reverse rotation, and sends it to the action execution unit 22, and the action execution unit 22 completes the corresponding action, thereby opening or closing the curtains.
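A minimal sketch of this hand-off follows, assuming the control information arrives as the strings "OPEN" and "CLOSE" and that opening maps to forward rotation. Both are assumptions, since the source does not state the encoding or the direction. The class names mirror the units in the text but are otherwise hypothetical, and the motor is replaced by a stand-in that only records what it was asked to do.

```python
from enum import Enum
from typing import List


class MotorInstruction(Enum):
    FORWARD = "forward rotation"
    REVERSE = "reverse rotation"


# Assumed mapping; the source does not say which direction opens the curtain.
INSTRUCTION_FOR = {
    "OPEN": MotorInstruction.FORWARD,
    "CLOSE": MotorInstruction.REVERSE,
}


class ActionExecutionUnit:
    """Stand-in for the motor: records the instructions it executes."""

    def __init__(self) -> None:
        self.executed: List[MotorInstruction] = []

    def execute(self, instruction: MotorInstruction) -> None:
        self.executed.append(instruction)


class VoiceControlUnit:
    def __init__(self, executor: ActionExecutionUnit) -> None:
        self.executor = executor

    def receive(self, control_information: str) -> None:
        # Form the control instruction and pass it to the action execution unit.
        self.executor.execute(INSTRUCTION_FOR[control_information])
```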

Referring back to FIG. 1, the voice editing module 3 includes a voice editing unit 31 and a first wireless communication unit 32. The voice editing unit 31 is connected to the first wireless communication unit 32. The terminal module 4 includes an input unit 41 and a second wireless communication unit 42. The input unit 41 is connected to the second wireless communication unit 42. The first wireless communication unit 32 is wirelessly connected to the second wireless communication unit 42. The editing information input through the input unit 41 is wirelessly transmitted to the voice editing unit 31 through the cooperation of the second wireless communication unit 42 and the first wireless communication unit 32. The voice editing unit 31 changes the recognition mode of the voice recognition module 1 according to the editing information.

The voice editing unit 31 is connected to the voice processing unit 12. After the operator inputs the editing information through the input unit 41, the voice editing unit 31 edits and changes the recognition mode in the voice processing unit 12 according to the editing information. Specifically, the input unit 41 may be an APP on a smartphone, such as a WeChat applet. When the operator needs to edit the recognition mode of the voice recognition module 1, that is, when the operator needs to change the language type and voice tone of the control voice, the input unit 41 is activated. After the input unit 41 is activated, the wireless connection between the second wireless communication unit 42 and the first wireless communication unit 32 is activated. In this embodiment, the second wireless communication unit 42 is the built-in Bluetooth module of the smartphone, and the first wireless communication unit 32 is also a Bluetooth module. After the two are wirelessly connected, information can be exchanged between the input unit 41 and the voice editing unit 31 through Bluetooth signals.

The voice editing unit 31 in this embodiment is an MCU chip with an editing function, such as a chip with a burning function, or a burner can be used instead. It can change the recognition mode stored in the voice processing unit 12, that is, edit the language type and voice tone used for voice recognition. For example, the input unit 41 has an editing input button. After the input unit 41 and the voice editing unit 31 are connected for information exchange, the operator presses and holds the editing input button and speaks the corresponding commands in his everyday language and accent. For example, the operator speaks "open the curtains" and "close the curtains" in Cantonese. After the input is completed, the input unit 41 transmits the Cantonese voices of "open the curtains" and "close the curtains" to the voice editing unit 31 through the cooperation of the second wireless communication unit 42 and the first wireless communication unit 32. The voice editing unit 31 deletes the recognition mode originally stored in the voice processing unit 12 and records the new Cantonese recognition mode of "open the curtains" and "close the curtains" into the voice processing unit 12, forming a new recognition mode. The editing of the recognition mode is thus completed, and the recognition mode of the voice processing unit 12 is changed.
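The delete-then-record step performed by the voice editing unit can be sketched as follows. This is an illustration only: the recognition mode is modelled as a dictionary from command voice to command, the text keys stand in for recorded audio, and the method name `apply` is hypothetical. The Bluetooth transfer is left out.

```python
from typing import Dict


class VoiceProcessingUnit:
    """Holds the stored recognition mode (modelled as command voice -> command)."""

    def __init__(self, recognition_mode: Dict[str, str]) -> None:
        self.recognition_mode = dict(recognition_mode)


class VoiceEditingUnit:
    def __init__(self, processing_unit: VoiceProcessingUnit) -> None:
        self.processing_unit = processing_unit

    def apply(self, editing_information: Dict[str, str]) -> None:
        # Delete the originally stored recognition mode, then record the new one.
        self.processing_unit.recognition_mode.clear()
        self.processing_unit.recognition_mode.update(editing_information)


# Usage: replace a Mandarin mode with the operator's Cantonese command voices.
processing_unit = VoiceProcessingUnit({"open the curtains (Mandarin)": "OPEN"})
VoiceEditingUnit(processing_unit).apply({
    "open the curtains (Cantonese)": "OPEN",
    "close the curtains (Cantonese)": "CLOSE",
})
```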

Preferably, in order to ensure the accuracy of editing information input, the input unit 41 may further add editing information confirmation and verification functions. Editing information confirmation displays and plays back, on the smartphone APP, the editing information entered by the operator for the operator to confirm. Editing information verification performs matching verification on the editing information, and the verification is completed if the match is consistent.
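One possible reading of the confirmation and verification functions is sketched below. The text does not say what the editing information is matched against, so this sketch assumes it is compared with a second entry of the same content; that assumption, the function name `confirm_and_verify`, and the callback used to model the operator's confirmation are all hypothetical.

```python
from typing import Callable


def confirm_and_verify(
    editing_information: str,
    repeated_entry: str,
    operator_confirms: Callable[[str], bool],
) -> bool:
    """Return True only if the operator confirms and the two entries are consistent."""
    # Confirmation: present the entry to the operator and ask for acceptance.
    if not operator_confirms(editing_information):
        return False
    # Verification (assumed form): match the entry against a repeated entry.
    return editing_information == repeated_entry
```

Only editing information that passes both checks would then be sent on to the voice editing unit.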

In this way, the operator uses the input unit 41 together with the voice editing unit 31 to edit and modify the recognition mode of the voice processing unit 12, so that the operator can customize the control voice according to his own habits and preferences, which greatly increases the convenience and operating experience of voice control.

Embodiment 2

Referring to FIG. 2, FIG. 2 is a flowchart of a method for optimizing voice control in the second embodiment. The method for optimizing voice control in this embodiment can be implemented based on the system for optimizing voice control in Embodiment 1, and specifically includes the following steps:

S1, the voice editing module 3 edits the recognition mode of the voice recognition module 1.

S2, the voice recognition module 1 receives the voice information, performs command recognition on the voice information according to the edited recognition mode, and converts the voice information recognized as a command into control information.

S3, the voice control module 2 performs an action according to the control information.

The recognition mode of the voice recognition module 1 is edited by the voice editing module 3, so that the recognition mode in the voice recognition module 1 can be customized and modified according to the operator's habits and preferences, which greatly increases the convenience and operating experience of voice control.

Preferably, before the voice editing module 3 edits the recognition mode of the voice recognition module 1 in step S1, the method further includes the following step:

S0, the terminal module 4 sends the editing information to the voice editing module 3.

The operator sends the editing information to the voice editing module 3 through the terminal module 4, which greatly increases the convenience of editing.

Preferably, before the terminal module 4 sends the editing information to the voice editing module 3 in step S0, the method further includes:

S00, the terminal module 4 sends confirmation information and verification information for the operator to confirm. By sending the confirmation information and verification information for the operator to confirm, the terminal module 4 ensures the accuracy of the editing information.

Before the terminal module 4 sends the confirmation information and verification information for the operator to confirm in step S00, the method further includes:

S000, the operator inputs editing information through the terminal module 4.

The recognition mode in this embodiment includes the language type, and preferably, the recognition mode also includes the voice tone.

For the implementation of the above steps S000, S00, S0, S1, S2, and S3, reference may be made to the system for optimizing voice control in Embodiment 1, and details are not repeated here.
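The order of steps S1 to S3 can be sketched end to end as below. This is a simplified illustration under stated assumptions: the three modules are collapsed into one hypothetical class `VoiceControlSystem`, voice information is represented as text, and performing an action is modelled by appending to a list. Steps S000 to S0 (operator input, confirmation, and transmission) are not modelled.

```python
from typing import Dict, List, Optional


class VoiceControlSystem:
    def __init__(self) -> None:
        self.recognition_mode: Dict[str, str] = {}  # command voice -> control information
        self.actions: List[str] = []                # actions performed so far

    def edit_recognition_mode(self, editing_information: Dict[str, str]) -> None:
        # S1: the recognition mode is replaced by the edited one.
        self.recognition_mode = dict(editing_information)

    def recognize(self, voice_information: str) -> Optional[str]:
        # S2: command recognition against the edited recognition mode.
        return self.recognition_mode.get(voice_information)

    def handle(self, voice_information: str) -> bool:
        control_information = self.recognize(voice_information)
        if control_information is None:
            return False  # not a command, so no action is performed
        # S3: perform the action according to the control information.
        self.actions.append(control_information)
        return True
```

In this sketch a voice input is acted on only after the recognition mode has been edited to include it, which is the dependency between S1 and S2 that the method describes.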

In summary, by providing the voice editing module, the recognition mode of the voice recognition module can be edited and voice control is optimized, so that the voice information used by the operator can be customized as control information to be recognized, which greatly increases the convenience and operating experience of voice control.

The above description is merely an embodiment of the present invention, and is not intended to limit the present invention. Various modifications and variations of the present invention are possible for those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention shall be included within the scope of the claims of the present invention.

Claims

1. A system for optimizing voice control, comprising:

a voice recognition module (1), which is used for receiving voice information, performing command recognition on the voice information according to a recognition mode, and converting the voice information recognized as a command into control information;

a voice control module (2) connected to the voice recognition module (1); the voice control module (2) receives the control information and performs an action according to the control information; and

a voice editing module (3) connected to the voice recognition module (1); the voice editing module (3) is used for editing the recognition mode of the voice recognition module (1).

2. The system for optimizing voice control according to claim 1, characterized in that it further comprises a terminal module (4); the terminal module (4) is connected to the voice editing module (3); the terminal module (4) is used to send editing information to the voice editing module (3), and the voice editing module (3) edits the recognition mode of the voice recognition module (1) according to the editing information.

3. The system for optimizing voice control according to claim 1, wherein the voice recognition module (1) comprises a voice recognition unit (11) and a voice processing unit (12); the voice recognition unit (11) is connected to the voice processing unit (12); the voice processing unit (12) stores the recognition mode; the voice recognition unit (11) receives the voice information and transmits the voice information to the voice processing unit (12); the voice processing unit (12) performs command recognition on the voice information according to the recognition mode, and converts the voice information recognized as a command into control information.

4. The system for optimizing voice control according to claim 1, wherein the voice control module (2) comprises a voice control unit (21) and an action execution unit (22); the voice control unit (21) is connected to the action execution unit (22); the voice control unit (21) receives the control information, forms a control instruction, and transmits it to the action execution unit (22); the action execution unit (22) performs an action according to the control instruction.

5. The system for optimizing voice control according to any one of claims 2-4, wherein the voice editing module (3) comprises a voice editing unit (31) and a first wireless communication unit (32); the voice editing unit (31) is connected to the first wireless communication unit (32); the terminal module (4) includes an input unit (41) and a second wireless communication unit (42); the input unit (41) is connected to the second wireless communication unit (42); the first wireless communication unit (32) is wirelessly connected to the second wireless communication unit (42); the editing information input through the input unit (41) is wirelessly transmitted to the voice editing unit (31) through the cooperation of the second wireless communication unit (42) and the first wireless communication unit (32); the voice editing unit (31) changes the recognition mode of the voice recognition module (1) according to the editing information.

6. A method for optimizing voice control, comprising:

the voice editing module (3) edits the recognition mode of the voice recognition module (1);

the voice recognition module (1) receives voice information, performs command recognition on the voice information according to the edited recognition mode, and converts the voice information recognized as a command into control information; and

the voice control module (2) performs an action according to the control information.

7. The method for optimizing voice control according to claim 6, wherein before the voice editing module (3) edits the recognition mode of the voice recognition module (1), the method further comprises:

the terminal module (4) sends editing information to the voice editing module (3).

8. The method for optimizing voice control according to claim 7, wherein before the terminal module (4) sends editing information to the voice editing module (3), the method further comprises:

the terminal module (4) sends confirmation information and verification information for the operator to confirm.

9. The method for optimizing voice control according to claim 8, wherein before the terminal module (4) sends confirmation information and verification information for the operator to confirm, the method further comprises:

the operator inputs the editing information through the terminal module (4).

10. The method for optimizing voice control according to claim 6, wherein the recognition mode includes language type and voice tone.