CN111724797A - Voice control method and system based on image and voiceprint recognition and vehicle

Info

Publication number: CN111724797A
Application number: CN201910220610.5A
Authority: CN
Inventors: 阮洲; 叶将涛
Original assignee: BYD Co Ltd
Current assignee: BYD Co Ltd
Priority date: 2019-03-22
Filing date: 2019-03-22
Publication date: 2020-09-29

Abstract

The invention provides a voice control method, system and vehicle based on image and voiceprint recognition. The method comprises the following steps: collecting a voice instruction of a user; performing voiceprint recognition on the user according to the voice instruction to obtain a voiceprint recognition result; acquiring image information of the user; identifying the user according to the voiceprint recognition result and the image information to obtain an identification result, which comprises judging the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user; and executing the voice instruction according to the identification result, which comprises executing the voice instruction according to the priority result. Because the corresponding voice instructions are executed according to user priority, disordered execution is avoided when a plurality of users issue voice instructions simultaneously and those instructions conflict, and voice control becomes more orderly and safer.

Description

Voice control method and system based on image and voiceprint recognition and vehicle

Technical Field

The invention belongs to the technical field of voice control, and particularly relates to a voice control method and system based on image and voiceprint recognition and a vehicle.

Background

At present, by combining image recognition and voice recognition technologies, the voice control can be realized freely and conveniently without depending on a handheld remote controller or using a close-range pickup module, so that the interference of sound output by multimedia equipment, environmental background sound and a non-control instruction voice signal of a user on control instruction voice recognition is effectively avoided, and a control command sent by the user is accurately recognized.

In the above technology, when a plurality of users issue voice instructions simultaneously and those instructions conflict with one another, execution may become disordered, which poses a potential safety hazard.

Disclosure of Invention

The present invention aims to solve at least one of the problems set forth above.

To this end, a first object of the present invention is to propose a voice control method based on image and voiceprint recognition, so that voice control becomes more orderly and safer.

A second object of the present invention is to propose a voice control system based on image and voiceprint recognition.

A third object of the invention is to propose a vehicle.

In order to achieve the above object, a voice control method based on image and voiceprint recognition according to an embodiment of the first aspect of the present invention includes the following steps:

collecting a voice instruction of a user;

performing voiceprint recognition on the user according to the voice command to obtain a voiceprint recognition result;

acquiring image information of a user;

identifying the user according to the voiceprint recognition result and the image information to obtain an identification result, which comprises: judging the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user;

executing the voice instruction according to the identification result, which comprises: executing the voice instruction according to the priority result.

According to the voice control method based on image and voiceprint recognition, the corresponding voice instruction is executed according to the priority of the user. This avoids disordered execution when a plurality of users issue voice instructions at the same time and those instructions conflict, making voice control more orderly and safer.

According to one embodiment of the invention, the priorities include: a driver priority and a passenger priority, wherein the driver priority is higher than the passenger priority.

According to an embodiment of the present invention, executing the voice instruction according to the priority result specifically includes: when the priority result comprises the driver priority and the passenger priority, only the voice instruction of the driver is executed, and the voice instruction of the passenger is not executed.

According to one embodiment of the invention, the priorities include: a manager priority, an authorized user priority and a common user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the common user priority.

According to an embodiment of the present invention, executing the voice instruction according to the priority result specifically includes:

when the priority result comprises at least two priorities, if the voice instructions do not conflict, executing all the voice instructions; if the voice instructions conflict and the priorities are different, executing the voice instruction of the user with the higher priority; if the voice instructions conflict and the priorities are the same, executing the most recently collected voice instruction;

when the priority result comprises only one priority, if the voice instructions do not conflict, executing all the voice instructions; and if the voice instructions conflict, executing the most recently collected voice instruction.

According to an embodiment of the present invention, identifying the user according to the voiceprint recognition result and the image information to obtain an identification result further includes: judging the age and emotion of the user according to the voiceprint recognition result and the image information.

According to one embodiment of the invention, the voice instruction comprises: playing audio or video; and executing the voice instruction according to the identification result further comprises: selecting, according to the age and emotion of the user, audio or video suitable for that age and emotion for playing.

In order to achieve the above object, a voice control system based on image and voiceprint recognition according to an embodiment of a second aspect of the present invention includes:

the voice acquisition module is used for acquiring a voice instruction of a user;

the voice processing module is used for carrying out voiceprint recognition on the user according to the voice command to obtain a voiceprint recognition result;

the image acquisition module is used for acquiring image information of a user;

the identification module is used for identifying the user according to the voiceprint recognition result and the image information to obtain an identification result, which comprises: judging the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user;

the execution module is used for executing the voice instruction according to the identification result, which comprises: executing the voice instruction according to the priority result.

According to the voice control system based on image and voiceprint recognition, the corresponding voice instruction is executed according to the priority of the user. This avoids disordered execution when a plurality of users issue voice instructions at the same time and those instructions conflict, making voice control more orderly and safer.

According to an embodiment of the present invention, the execution module is specifically configured to: when the priority result comprises the driver priority and the passenger priority, the execution module only executes the voice instruction of the driver and does not execute the voice instruction of the passenger.

According to one embodiment of the invention, the priorities include: a manager priority, an authorized user priority and a common user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the common user priority.

According to an embodiment of the present invention, the execution module is specifically configured to:

when the priority result comprises at least two priorities, if the voice instructions do not conflict, the execution module executes all the voice instructions; if the voice instructions conflict and the priorities are different, the execution module executes the voice instruction of the user with the higher priority; if the voice instructions conflict and the priorities are the same, the execution module executes the most recently acquired voice instruction;

when the priority result comprises only one priority, if the voice instructions do not conflict, the execution module executes all the voice instructions; and if the voice instructions conflict, the execution module executes the most recently acquired voice instruction.

According to an embodiment of the present invention, the system further comprises: a selection module for selecting between a first priority and a second priority, wherein the first priority comprises a driver priority and a passenger priority, the driver priority being higher than the passenger priority; and the second priority comprises a manager priority, an authorized user priority and a common user priority, the manager priority being higher than the authorized user priority and the authorized user priority being higher than the common user priority.

According to an embodiment of the present invention, the identification module is further configured to determine the age and emotion of the user according to the voiceprint recognition result and the image information.

According to one embodiment of the invention, the voice instruction comprises: playing audio or video; the execution module is also used for selecting the video or audio suitable for the age and the emotion to play according to the age and the emotion of the user.

In order to achieve the above object, an embodiment of a third aspect of the present invention provides a vehicle including the above voice control system based on image and voiceprint recognition.

According to the vehicle provided by the embodiment of the invention, the corresponding voice command is executed according to the priority of the user, the situation of execution confusion when a plurality of users send the voice command at the same time and the voice commands are in conflict is avoided, and the voice control becomes more orderly and safer.

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

FIG. 1 is a schematic diagram of a method for voice control based on image and voiceprint recognition according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a voice control system based on image and voiceprint recognition according to a second embodiment of the present invention;

FIG. 3 is a schematic view of a vehicle according to a third embodiment of the present invention.

Detailed Description

In order to make the technical problems, technical solutions and advantageous effects solved by the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

A voice control method, system, and vehicle based on image and voiceprint recognition according to an embodiment of the present invention will be described with reference to the accompanying drawings.

FIG. 1 is a schematic diagram of a voice control method based on image and voiceprint recognition in accordance with one embodiment of the present invention.

As shown in fig. 1, the voice control method based on image and voiceprint recognition includes the following steps:

Collecting a voice instruction of a user.

Specifically, after a user sends a voice instruction, the voice instruction sent by the user is collected.

And carrying out voiceprint recognition on the user according to the voice command to obtain a voiceprint recognition result.

Specifically, voiceprint recognition is carried out on the user according to the collected voice command, and a voiceprint recognition result is obtained.

Image information of a user is acquired.

Identifying the user according to the voiceprint recognition result and the image information to obtain an identification result, which comprises: judging the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user.

Specifically, the user is identified according to the voiceprint recognition result and the image information to obtain an identification result; this includes judging the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user.

Executing the voice instruction according to the identification result, which comprises: executing the voice instruction according to the priority result.

Specifically, the voice instruction is executed according to the above identification result. In this embodiment, the identification result includes the priority result of the user, and the corresponding voice instruction is executed according to that priority result.
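The flow of the steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `VoiceInstruction`, `Identification` and `handle_instruction` are hypothetical, the recognizers are passed in as plain callables because the source does not specify any recognition algorithm, and the convention that a larger integer means a higher priority is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class VoiceInstruction:
    text: str     # recognized command text (speech-to-text is outside this sketch)
    audio: bytes  # raw audio used for voiceprint recognition


@dataclass(frozen=True)
class Identification:
    user_id: str
    priority: int  # assumed convention: larger value = higher priority


def handle_instruction(
    instruction: VoiceInstruction,
    image: bytes,
    recognize_voiceprint: Callable[[bytes], str],
    identify: Callable[[str, bytes], Identification],
    execute: Callable[[str, Identification], None],
) -> Identification:
    """Run the steps once for a single collected voice instruction."""
    voiceprint = recognize_voiceprint(instruction.audio)  # voiceprint recognition
    identification = identify(voiceprint, image)          # identification incl. priority
    execute(instruction.text, identification)             # execution per priority result
    return identification


# Usage with stand-in callables (real recognizers would replace the lambdas).
executed = []
result = handle_instruction(
    VoiceInstruction(text="open the window", audio=b"\x00\x01"),
    image=b"\x02\x03",
    recognize_voiceprint=lambda audio: "voiceprint-A",
    identify=lambda voiceprint, image: Identification("user-A", priority=1),
    execute=lambda text, ident: executed.append((ident.user_id, text)),
)
```

The sketch handles one instruction at a time; how several instructions are weighed against each other depends on the priority scheme in use.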

In some embodiments, the corresponding voice command is executed according to the priority of the user, so that the situation of execution confusion when a plurality of users send the voice command simultaneously and the voice command conflicts is avoided, and the voice control becomes more orderly and safer.

In some embodiments, the priorities include: a driver priority and a passenger priority, wherein the driver priority is higher than the passenger priority.

Further, executing the voice instruction according to the priority result specifically includes: when the priority result includes the driver priority and the passenger priority, only the voice instruction of the driver is executed, and the voice instruction of the passenger is not executed.

Specifically, when the priority result of the user is judged according to the voiceprint recognition result and the image information, and the priority result is the driver priority and the passenger priority, only the voice instruction of the driver is executed at the moment, and the voice instruction of the passenger is not executed. For example, when two users send out voice commands, according to the voiceprint recognition results and the image information of the two users, it is determined that the first user is a driver and the second user is a passenger, and at this time, only the voice command of the first user (namely, the driver) is executed, and the voice command of the second user (namely, the passenger) is not executed. When only one user sends a voice instruction, if the user is judged to be a driver according to the voiceprint recognition result and the image information of the user, the voice instruction of the user is executed, and if the user is judged to be a passenger according to the voiceprint recognition result and the image information of the user, the voice instruction of the user is not executed.
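As a hedged illustration of the driver/passenger case, the sketch below keeps only the instructions whose speaker has been judged to be the driver. The `Role` enum, the `Command` record and the `driver_only` function are hypothetical names introduced for this example; the role is assumed to have been determined already from the voiceprint recognition result and the image information.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Role(IntEnum):
    PASSENGER = 0
    DRIVER = 1  # driver priority is higher than passenger priority


@dataclass(frozen=True)
class Command:
    speaker: str
    role: Role  # assumed to be the already-determined priority result
    text: str


def driver_only(commands: List[Command]) -> List[Command]:
    """Return the commands to execute: those issued by the driver."""
    return [c for c in commands if c.role == Role.DRIVER]


# Two users speak: the first is judged to be the driver, the second a passenger.
both = [
    Command("first user", Role.DRIVER, "open the sunroof"),
    Command("second user", Role.PASSENGER, "close the sunroof"),
]
to_run = driver_only(both)  # only the driver's command remains
```

With a single passenger speaking, `driver_only` returns an empty list, matching the case in which the passenger's instruction is not executed.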

Because this priority scheme distinguishes only between the driver priority and the passenger priority, and only the voice instruction of the driver is executed while that of the passenger is not, voice control becomes safer and driving safety is ensured.

In some embodiments, the priorities include: a manager priority, an authorized user priority and a common user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the common user priority.

Further, executing the voice instruction according to the priority result specifically includes:

when the priority result comprises at least two priorities, if the voice instructions do not conflict, executing all the voice instructions; if the voice instructions conflict and the priorities are different, executing the voice instruction of the user with the higher priority; if the voice instructions conflict and the priorities are the same, executing the most recently collected voice instruction;

when the priority result comprises only one priority, if the voice instructions do not conflict, executing all the voice instructions; and if the voice instructions conflict, executing the most recently collected voice instruction.

Specifically, examples are given below. It should be noted that the present invention is not limited to these examples.

Suppose three users issue voice instructions at the same time, and the priority result judged from the voiceprint recognition results and image information of the three users is that the first user is a manager, the second user is an authorized user, and the third user is a common user. When the voice instructions of the three users do not conflict, all three voice instructions are executed. When the voice instructions of the three users conflict, the first user (the manager) has the highest priority, so the voice instruction of the first user is executed and the voice instructions of the second user (the authorized user) and the third user (the common user) are not executed.

Suppose instead that the priority result is that the first user is an authorized user, the second user is an authorized user, and the third user is a common user. When the voice instructions of the three users do not conflict, all three voice instructions are executed. When the voice instructions of the three users conflict, the first user and the second user have the same priority, which is higher than that of the third user; the voice instruction of the third user is therefore not executed, and of the voice instructions of the first user and the second user, the one collected most recently is executed.

When two users issue voice instructions at the same time and both the first user and the second user are judged to be common users according to their voiceprint recognition results and image information, all the voice instructions are executed if they do not conflict, and the most recently collected voice instruction is executed if they conflict.
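The conflict rules for the manager, authorized user and common user priorities can be sketched as follows. This is an illustrative reading, not the patented implementation. The source does not define when two voice instructions conflict, so the sketch assumes that instructions conflict when they address the same `target` with different `action` values; the `Priority` enum, the `Command` record, its fields and the `arbitrate` function are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, List


class Priority(IntEnum):
    COMMON = 0
    AUTHORIZED = 1
    MANAGER = 2


@dataclass(frozen=True)
class Command:
    speaker: str
    priority: Priority
    target: str          # hypothetical: the device the instruction addresses
    action: str          # hypothetical: what to do with that device
    collected_at: float  # seconds; a larger value means collected more recently


def arbitrate(commands: List[Command]) -> List[Command]:
    """Return the commands to execute among those issued 'simultaneously'."""
    by_target: Dict[str, List[Command]] = defaultdict(list)
    for cmd in commands:
        by_target[cmd.target].append(cmd)

    to_execute: List[Command] = []
    for group in by_target.values():
        if len({c.action for c in group}) <= 1:
            # No conflict (assumed definition): execute every command.
            to_execute.extend(group)
        else:
            # Conflict: highest priority wins; ties go to the latest collected.
            to_execute.append(max(group, key=lambda c: (c.priority, c.collected_at)))
    return to_execute


# Second example from the text: two authorized users and one common user conflict.
conflicting = [
    Command("first", Priority.AUTHORIZED, "window", "open", collected_at=0.2),
    Command("second", Priority.AUTHORIZED, "window", "close", collected_at=0.9),
    Command("third", Priority.COMMON, "window", "half-open", collected_at=1.5),
]
winner = arbitrate(conflicting)  # the second user's command
```

Sorting on the pair `(priority, collected_at)` covers both rules at once: a different priority decides first, and collection time only matters among equal priorities. The same function handles the single-priority case, where every comparison falls through to collection time.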

It should be noted that "simultaneously" above may mean that the voice instructions are issued within the same 1-second interval, or that they are issued within a preset time period; for example, if the time period is 3 seconds, voice instructions issued by users within those 3 seconds are regarded as issued simultaneously. It is understood that the values of 1 second and 3 seconds are merely illustrative and do not limit the present invention.
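One possible way to treat instructions within a time period as simultaneous is sketched below. The function name `group_simultaneous` is hypothetical, and the choice to start each period at the first instruction of a batch is an assumption; the source only states that instructions issued within the period are regarded as simultaneous.

```python
from typing import List


def group_simultaneous(timestamps: List[float], window: float = 3.0) -> List[List[float]]:
    """Group collection times (in seconds) into batches treated as simultaneous.

    Assumption: a batch opens at its first instruction and stays open for
    `window` seconds; later instructions start a new batch.
    """
    batches: List[List[float]] = []
    for t in sorted(timestamps):
        if batches and t - batches[-1][0] <= window:
            batches[-1].append(t)
        else:
            batches.append([t])
    return batches


batches = group_simultaneous([0.0, 1.0, 2.9, 3.5, 7.0])
# -> [[0.0, 1.0, 2.9], [3.5], [7.0]]
```

Each resulting batch would then be passed to whatever conflict handling is in use, so that only instructions in the same batch are compared with one another.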

Dividing users into the manager priority, the authorized user priority and the common user priority gives a finer-grained priority division, so that more users, including users with different priorities, can perform voice control without execution becoming disordered.

In some embodiments, identifying the user according to the voiceprint recognition result and the image information to obtain the identification result further comprises: judging the age and emotion of the user according to the voiceprint recognition result and the image information.

Further, the voice instruction includes: playing audio or video; and executing the voice instruction according to the identification result further comprises: selecting, according to the age and emotion of the user, audio or video suitable for that age and emotion for playing.

Specifically, when a user issues the voice instruction "play music", voiceprint recognition is performed according to the voice instruction and image information of the user is acquired. If, according to the voiceprint recognition result and the image information, the user is judged to be 5 years old and currently happy, music suitable for a 5-year-old in a happy, relaxed mood is selected and played.
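The selection step in this example might look like the following sketch. The `Track` record, its age-range and mood fields, the catalog entries and the `select_tracks` function are all hypothetical; the source does not describe how media is tagged or matched, and the age and emotion are assumed to have been judged already.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Track:
    title: str
    min_age: int
    max_age: int
    mood: str  # hypothetical mood tag, e.g. "happy" or "calm"


def select_tracks(catalog: List[Track], age: int, emotion: str) -> List[Track]:
    """Return tracks whose age range contains `age` and whose mood matches."""
    return [t for t in catalog if t.min_age <= age <= t.max_age and t.mood == emotion]


# Illustrative catalog; titles and tags are made up for this sketch.
catalog = [
    Track("Nursery Rhyme Medley", 2, 8, "happy"),
    Track("Lullaby", 0, 6, "calm"),
    Track("Rock Anthem", 14, 99, "happy"),
]
picked = select_tracks(catalog, age=5, emotion="happy")
```

For a user judged to be 5 years old and happy, only the first entry matches both the age range and the mood.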

The age and emotion of the user are judged directly according to the voiceprint recognition result and the image information, and proper music or video is automatically selected to be played, so that voice control is more humanized.

Fig. 2 is a schematic diagram of a voice control system 100 based on image and voiceprint recognition according to an embodiment of the present invention.

As shown in fig. 2, the voice control system 100 based on image and voiceprint recognition includes:

the voice acquisition module 101, the voice acquisition module 101 is used for acquiring the voice instruction of the user;

the voice processing module 102, the voice processing module 102 is used for performing voiceprint recognition on the user according to the voice instruction to obtain a voiceprint recognition result;

the image acquisition module 103, the image acquisition module 103 is used for acquiring image information of a user;

an identification module 104, where the identification module 104 is configured to identify the user according to the voiceprint recognition result and the image information to obtain an identification result, and includes: judging the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user;

the execution module 105, the execution module 105 is configured to execute the voice instruction according to the identification result, including: executing the voice instruction according to the priority result.

Specifically, the voice acquisition module 101 acquires a voice instruction of a user, and the image acquisition module 103 acquires image information of the user. The voice processing module 102 performs voiceprint recognition on the user according to the voice instruction acquired by the voice acquisition module 101 to obtain a voiceprint recognition result of the user. The identification module 104 identifies the user according to the voiceprint recognition result of the voice processing module 102 and the image information acquired by the image acquisition module 103 to obtain an identification result, which includes judging the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user. The execution module 105 executes the voice instruction according to the identification result of the identification module 104, which includes executing the voice instruction according to the priority result.

According to the voice control system 100 based on image and voiceprint recognition, the corresponding voice command is executed according to the priority of the user, the situation that when a plurality of users send the voice command at the same time and the voice commands conflict with each other, execution is disordered is avoided, and voice control becomes more orderly and safer.

Further, the execution module 105 is specifically configured to: when the priority result includes the driver priority and the passenger priority, the execution module 105 executes only the voice instruction of the driver and does not execute the voice instruction of the passenger.

Specifically, when the identification module 104 determines the priority result of the user according to the voiceprint recognition result of the voice processing module 102 and the image information acquired by the image acquisition module 103, and the priority result includes the driver priority and the passenger priority, the execution module 105 executes only the voice instruction of the driver and does not execute the voice instruction of the passenger. For example, when two users issue voice instructions, the voice acquisition module 101 acquires the voice instructions of the two users, the voice processing module 102 performs voiceprint recognition on the acquired voice instructions, the image acquisition module 103 acquires image information of the two users, and the identification module 104 determines, according to the voiceprint recognition results and the image information of the two users, that the first user is the driver and the second user is a passenger. The execution module 105 then executes only the voice instruction of the first user (i.e., the driver) and does not execute the voice instruction of the second user (i.e., the passenger). When only one user issues a voice instruction, if the identification module 104 determines that the user is the driver according to the voiceprint recognition result and the image information of the user, the execution module 105 executes the voice instruction of the user; if the identification module 104 determines that the user is a passenger, the execution module 105 does not execute the voice instruction of the user.

Because this priority scheme distinguishes only between the driver priority and the passenger priority, voice control becomes safer and driving safety is ensured.

In some embodiments, the priorities include: a manager priority, an authorized user priority and a common user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the common user priority.

Further, the execution module 105 is specifically configured to:

when the priority result includes at least two priorities, if the voice instructions do not conflict, the execution module 105 executes all the voice instructions; if the voice instructions conflict and the priorities are different, the execution module 105 executes the voice instruction of the user with the higher priority; if the voice instructions conflict and the priorities are the same, the execution module 105 executes the most recently acquired voice instruction;

when the priority result comprises only one priority, if the voice instructions do not conflict, the execution module 105 executes all the voice instructions; if the voice instructions conflict, the execution module 105 executes the most recently acquired voice instruction.

Specifically, examples are given below. It should be noted that the present invention is not limited to these examples.

Suppose three users issue voice instructions at the same time, and the identification module 104 judges from the voiceprint recognition results and image information of the three users that the first user is a manager, the second user is an authorized user, and the third user is a common user. When the voice instructions of the three users do not conflict, the execution module 105 executes the voice instructions of all three users. When the voice instructions of the three users conflict, the first user (i.e., the manager) has the highest priority, so the execution module 105 executes the voice instruction of the first user and does not execute the voice instructions of the second user (i.e., the authorized user) and the third user (i.e., the common user).

Suppose instead that the identification module 104 judges that the first user is an authorized user, the second user is an authorized user, and the third user is a common user. When the voice instructions of the three users do not conflict, the execution module 105 executes the voice instructions of all three users. When the voice instructions of the three users conflict, the first user and the second user have the same priority, which is higher than that of the third user; the execution module 105 therefore does not execute the voice instruction of the third user and, of the voice instructions of the first user and the second user, executes the one acquired most recently.

When two users issue voice instructions at the same time and the identification module 104 judges, according to the voiceprint recognition results and image information of the two users, that the first user and the second user are both common users, the execution module 105 executes all the voice instructions if they do not conflict, and executes the most recently acquired voice instruction if they conflict.

It should be noted that "simultaneously" in the simultaneous uttering of the voice command is explained above, and will not be described here.

In some embodiments, the system further comprises: a selection module 106 for selecting between a first priority and a second priority, wherein the first priority comprises a driver priority and a passenger priority, the driver priority being higher than the passenger priority; and the second priority comprises a manager priority, an authorized user priority and a common user priority, the manager priority being higher than the authorized user priority and the authorized user priority being higher than the common user priority.

Specifically, after the selection module 106 selects the first priority or the second priority, the identification module 104 identifies the user according to that selection to obtain a priority result. For example, if the selection module 106 selects the first priority, the identification module 104 classifies users only as the driver priority or the passenger priority when identifying the user to obtain the priority result; if the selection module 106 selects the second priority, the identification module 104 classifies users only as the manager priority, the authorized user priority or the common user priority.

The first priority may be selected when driving safety needs to be guaranteed, and the second priority may be selected when voice control needs to be available to more users. The selection module 106 allows the user to select a suitable priority according to actual needs and to perform voice control under that priority, making voice control more user-friendly.
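A minimal sketch of how the selected priority could constrain identification is given below. The `Scheme` enum, the `RANKS` table, the role strings and the `priority_of` function are hypothetical names for illustration; the numeric ranks only encode the orderings stated in the text.

```python
from enum import Enum


class Scheme(Enum):
    FIRST = "first"    # driver priority / passenger priority
    SECOND = "second"  # manager / authorized user / common user priority


# Larger number = higher priority (assumed convention for this sketch).
RANKS = {
    Scheme.FIRST: {"passenger": 0, "driver": 1},
    Scheme.SECOND: {"common": 0, "authorized": 1, "manager": 2},
}


def priority_of(scheme: Scheme, role: str) -> int:
    """Rank a role under the selected scheme; roles of the other scheme are rejected."""
    try:
        return RANKS[scheme][role]
    except KeyError:
        raise ValueError(f"role {role!r} is not distinguished under {scheme.name}") from None


selected = Scheme.FIRST  # e.g. chosen when driving safety must be guaranteed
driver_rank = priority_of(selected, "driver")
passenger_rank = priority_of(selected, "passenger")
```

Keeping the two schemes in separate tables mirrors the statement that, once a priority is selected, users are classified only into the roles of that priority.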

In some embodiments, the authentication module 104 is further configured to determine the age and emotion of the user according to the voiceprint recognition result and the image information.

Further, the voice instruction includes: the execution module 105 is further configured to select a video or audio suitable for the age and emotion of the user to be played according to the age and emotion of the user.

Specifically, when a user issues the voice command "play music", the voice acquisition module 101 acquires the voice command, the voice processing module 102 performs voiceprint recognition on the voice command to obtain a voiceprint recognition result, the image acquisition module 103 acquires image information of the user, the authentication module 104 then judges the age and emotion of the user according to the voiceprint recognition result and the image information, and the execution module 105 selects music suitable for that age and emotion to play.
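The last step of this flow, choosing media from the judged age and emotion, might look like the following Python sketch. The description does not specify age groups, emotion labels or a media catalogue, so `UserState`, `CATALOGUE`, the age threshold of 13 and the playlist names are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserState:
    """Assumed result of the age and emotion judgment."""
    age: int      # estimated age in years (hypothetical representation)
    emotion: str  # e.g. "happy", "sad", "neutral" (hypothetical labels)


# Hypothetical catalogue mapping (age group, emotion) to a playlist.
CATALOGUE = {
    ("child", "happy"): "upbeat children's songs",
    ("child", "sad"): "soothing lullabies",
    ("adult", "happy"): "pop hits",
    ("adult", "sad"): "calm acoustic",
}
DEFAULT_PLAYLIST = "general favourites"


def age_group(age: int) -> str:
    """Bucket the age; the threshold of 13 is an arbitrary assumption."""
    return "child" if age < 13 else "adult"


def select_media(state: UserState) -> str:
    """Return a playlist for the age and emotion, or a default if none matches."""
    return CATALOGUE.get((age_group(state.age), state.emotion), DEFAULT_PLAYLIST)


print(select_media(UserState(age=8, emotion="sad")))
print(select_media(UserState(age=35, emotion="neutral")))
```

A lookup table with a default keeps the sketch short; a real system would presumably use richer user and media descriptions than two coarse labels.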

Fig. 3 is a schematic diagram of a vehicle according to an embodiment of the present invention, and as shown in fig. 3, the vehicle 200 includes the voice control system 100 based on image and voiceprint recognition.

The vehicle 200 executes the corresponding voice instruction according to the priority of the user, which avoids disordered execution when a plurality of users issue conflicting voice instructions simultaneously, so that voice control becomes more orderly and safer.
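How conflicting simultaneous instructions could be resolved by priority is sketched below in Python. The claims state that, between a driver and a passenger, only the driver's instruction is executed; extending "execute only the highest-priority instruction" to any set of ranks is an assumption of this sketch, as are the names `Command` and `arbitrate` and the numeric `rank` encoding.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Command:
    """A recognized voice instruction with its speaker's priority (hypothetical)."""
    speaker: str
    text: str
    rank: int  # larger value means higher priority (assumed encoding)


def arbitrate(commands: List[Command]) -> Optional[Command]:
    """Return the instruction of the highest-priority speaker, or None if empty.

    If several speakers share the top rank, the earliest one in the list wins.
    """
    if not commands:
        return None
    return max(commands, key=lambda c: c.rank)


# Two conflicting instructions issued at the same time.
batch = [
    Command("passenger", "open the window", rank=1),
    Command("driver", "close the window", rank=2),
]
winner = arbitrate(batch)
print(winner.speaker, "->", winner.text)
```

Only the selected instruction would be passed on for execution; the others are dropped, which is what prevents the window in this example from being opened and closed in quick succession.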

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

In the description of the present invention, it is to be understood that the terms "center", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "axial", "radial", "circumferential", and the like indicate orientations and positional relationships based on those shown in the drawings. They are used merely for convenience and simplicity of description, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they should therefore not be considered as limiting the present invention. Further, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless otherwise specified.

In the description of the present invention, it should be noted that, unless otherwise explicitly stated or limited, the terms "mounted", "connected" and "coupled" are to be construed broadly. For example, a connection may be a fixed connection, a removable connection, or an integral connection; it may be mechanical or electrical; and it may be direct, indirect through an intervening medium, or an internal communication between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific case.

In the description of the specification, reference to the terms "one embodiment", "some embodiments", "an illustrative embodiment", "an example", "a specific example", or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims

1. A voice control method based on image and voiceprint recognition is characterized by comprising the following steps:

collecting a voice instruction of a user;

acquiring image information of a user;

2. The method of claim 1, wherein the priority comprises: a driver priority and a passenger priority, wherein the driver priority is higher than the passenger priority.

3. The voice control method based on image and voiceprint recognition according to claim 2, wherein

executing the voice instruction according to the priority result specifically includes:

when the priority result comprises the driver priority and the passenger priority, only the voice instruction of the driver is executed, and the voice instruction of the passenger is not executed.

4. The voice control method based on image and voiceprint recognition according to claim 1, wherein

the priority comprises: a manager priority, an authorized user priority and a normal user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the normal user priority.

5. The voice control method based on image and voiceprint recognition according to claim 4, wherein

6. The voice control method based on image and voiceprint recognition according to claim 1, wherein

identifying the user according to the voiceprint recognition result and the image information to obtain an authentication result further comprises: judging the age and emotion of the user according to the voiceprint recognition result and the image information.

7. The method of claim 6, wherein the voice command comprises: playing audio or video;

executing the voice instruction according to the authentication result further comprises: selecting, according to the age and emotion of the user, the video or audio suitable for that age and emotion for playing.

8. A voice control system based on image and voiceprint recognition comprising:

the image acquisition module is used for acquiring image information of a user;

9. The voice control system based on image and voiceprint recognition according to claim 8, wherein the priority comprises: a driver priority and a passenger priority, wherein the driver priority is higher than the passenger priority.

10. The voice control system based on image and voiceprint recognition according to claim 9, wherein the execution module is specifically configured to: when the priority result comprises the driver priority and the passenger priority, execute only the voice instruction of the driver and not the voice instruction of the passenger.

11. The voice control system based on image and voiceprint recognition according to claim 8, wherein the priority comprises: a manager priority, an authorized user priority and a normal user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the normal user priority.

12. The voice control system based on image and voiceprint recognition according to claim 11, wherein the execution module is specifically configured to:

13. The voice control system based on image and voiceprint recognition according to claim 8, further comprising: a selection module configured to select a first priority or a second priority, wherein the first priority comprises a driver priority and a passenger priority, and the driver priority is higher than the passenger priority; and the second priority comprises a manager priority, an authorized user priority and a normal user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the normal user priority.

14. The voice control system based on image and voiceprint recognition according to claim 8, wherein the authentication module is further configured to judge the age and emotion of the user according to the voiceprint recognition result and the image information.

15. The voice control system based on image and voiceprint recognition according to claim 14,

the voice instruction includes: playing audio or video;

the execution module is further configured to select, according to the age and emotion of the user, the video or audio suitable for that age and emotion for playing.

16. A vehicle comprising a voice control system based on image and voiceprint recognition according to any one of claims 8 to 15.