CN111724797A - Voice control method and system based on image and voiceprint recognition and vehicle - Google Patents

Voice control method and system based on image and voiceprint recognition and vehicle

Info

Publication number
CN111724797A
Authority
CN
China
Prior art keywords
priority
voice
user
result
voiceprint recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910220610.5A
Other languages
Chinese (zh)
Inventor
阮洲
叶将涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BYD Co Ltd
Original Assignee
BYD Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BYD Co Ltd
Priority to CN201910220610.5A
Publication of CN111724797A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • General Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The invention provides a voice control method and system based on image and voiceprint recognition, and a vehicle. The method comprises the following steps: collecting a voice instruction of a user; performing voiceprint recognition on the user according to the voice instruction to obtain a voiceprint recognition result; acquiring image information of the user; authenticating the user according to the voiceprint recognition result and the image information to obtain an authentication result, which includes determining the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user; and executing the voice instruction according to the authentication result, which includes executing the voice instruction according to the priority result. Because voice instructions are executed according to user priority, disordered execution is avoided when several users issue conflicting voice instructions at the same time, and voice control becomes more orderly and safer.

Description

Voice control method and system based on image and voiceprint recognition and vehicle
Technical Field
The invention belongs to the technical field of voice control, and particularly relates to a voice control method and system based on image and voiceprint recognition and a vehicle.
Background
At present, combining image recognition with voice recognition allows voice control to be performed freely and conveniently, without relying on a handheld remote controller or a close-range pickup module. This effectively avoids interference with control-instruction recognition from sound output by multimedia equipment, environmental background sound, and the user's non-control speech, so that the control instruction issued by the user is recognized accurately.
With the above technology, however, when several users issue voice instructions at the same time and those instructions conflict, execution may become disordered, which can create a safety hazard.
Disclosure of Invention
The present invention aims to solve one of the technical problems described above.
To this end, a first object of the present invention is to propose a voice control method based on image and voiceprint recognition, so that voice control becomes more orderly and safer.
A second object of the present invention is to propose a voice control system based on image and voiceprint recognition.
A third object of the invention is to propose a vehicle.
To achieve the above objects, a voice control method based on image and voiceprint recognition according to an embodiment of the first aspect of the present invention includes the following steps:
collecting a voice instruction of a user;
performing voiceprint recognition on the user according to the voice instruction to obtain a voiceprint recognition result;
acquiring image information of the user;
authenticating the user according to the voiceprint recognition result and the image information to obtain an authentication result, which includes: determining the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user;
executing the voice instruction according to the authentication result, which includes: executing the voice instruction according to the priority result.
With this voice control method based on image and voiceprint recognition, voice instructions are executed according to user priority, so disordered execution is avoided when several users issue conflicting voice instructions at the same time, and voice control becomes more orderly and safer.
According to one embodiment of the invention, the priorities include: a driver priority and a passenger priority, wherein the driver priority is higher than the passenger priority.
According to an embodiment of the present invention, executing the voice instruction according to the priority result specifically includes: when the priority result includes both the driver priority and the passenger priority, executing only the voice instruction of the driver and not the voice instruction of the passenger.
According to one embodiment of the invention, the priorities include: a manager priority, an authorized user priority and an ordinary user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the ordinary user priority.
According to an embodiment of the present invention, executing the voice instruction according to the priority result specifically includes:
when the priority result includes at least two priorities: if the voice instructions do not conflict, executing all of them; if the voice instructions conflict and the priorities differ, executing the voice instruction of the user with the highest priority; if the voice instructions conflict and the priorities are the same, executing the last-collected voice instruction;
when the priority result includes only one priority: if the voice instructions do not conflict, executing all of them; if the voice instructions conflict, executing the last-collected voice instruction.
According to an embodiment of the present invention, authenticating the user according to the voiceprint recognition result and the image information to obtain an authentication result further includes: determining the age and emotion of the user according to the voiceprint recognition result and the image information.
According to one embodiment of the invention, the voice instruction includes playing audio or video, and executing the voice instruction according to the authentication result further includes: selecting, according to the age and emotion of the user, video or audio suited to that age and emotion for playback.
To achieve the above objects, a voice control system based on image and voiceprint recognition according to an embodiment of the second aspect of the present invention includes:
a voice acquisition module, configured to collect a voice instruction of a user;
a voice processing module, configured to perform voiceprint recognition on the user according to the voice instruction to obtain a voiceprint recognition result;
an image acquisition module, configured to acquire image information of the user;
an authentication module, configured to authenticate the user according to the voiceprint recognition result and the image information to obtain an authentication result, which includes: determining the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user;
an execution module, configured to execute the voice instruction according to the authentication result, which includes: executing the voice instruction according to the priority result.
With this voice control system based on image and voiceprint recognition, voice instructions are executed according to user priority, so disordered execution is avoided when several users issue conflicting voice instructions at the same time, and voice control becomes more orderly and safer.
According to one embodiment of the invention, the priorities include: a driver priority and a passenger priority, wherein the driver priority is higher than the passenger priority.
According to an embodiment of the present invention, the execution module is specifically configured to: when the priority result includes both the driver priority and the passenger priority, execute only the voice instruction of the driver and not the voice instruction of the passenger.
According to one embodiment of the invention, the priorities include: a manager priority, an authorized user priority and an ordinary user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the ordinary user priority.
According to an embodiment of the present invention, the execution module is specifically configured to:
when the priority result includes at least two priorities: if the voice instructions do not conflict, execute all of them; if the voice instructions conflict and the priorities differ, execute the voice instruction of the user with the highest priority; if the voice instructions conflict and the priorities are the same, execute the last-collected voice instruction;
when the priority result includes only one priority: if the voice instructions do not conflict, execute all of them; if the voice instructions conflict, execute the last-collected voice instruction.
According to an embodiment of the present invention, the system further includes a selection module configured to select between a first priority and a second priority, wherein the first priority includes a driver priority and a passenger priority, the driver priority being higher than the passenger priority; and the second priority includes a manager priority, an authorized user priority and an ordinary user priority, the manager priority being higher than the authorized user priority, and the authorized user priority being higher than the ordinary user priority.
According to an embodiment of the present invention, the authentication module is further configured to determine the age and emotion of the user according to the voiceprint recognition result and the image information.
According to one embodiment of the invention, the voice instruction includes playing audio or video, and the execution module is further configured to select, according to the age and emotion of the user, video or audio suited to that age and emotion for playback.
To achieve the above objects, an embodiment of the third aspect of the present invention provides a vehicle including the above voice control system based on image and voiceprint recognition.
With the vehicle of this embodiment, voice instructions are executed according to user priority, so disordered execution is avoided when several users issue conflicting voice instructions at the same time, and voice control becomes more orderly and safer.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a schematic diagram of a voice control method based on image and voiceprint recognition according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a voice control system based on image and voiceprint recognition according to an embodiment of the present invention;
FIG. 3 is a schematic view of a vehicle according to an embodiment of the present invention.
Detailed Description
To make the technical problems solved by the present invention, its technical solutions, and its advantageous effects clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein merely illustrate the invention and are not intended to limit it.
A voice control method, system, and vehicle based on image and voiceprint recognition according to an embodiment of the present invention will be described with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of a voice control method based on image and voiceprint recognition in accordance with one embodiment of the present invention.
As shown in FIG. 1, the voice control method based on image and voiceprint recognition includes the following steps:
Collecting a voice instruction of a user.
Specifically, after the user issues a voice instruction, that instruction is collected.
Performing voiceprint recognition on the user according to the voice instruction to obtain a voiceprint recognition result.
Specifically, voiceprint recognition is performed on the user according to the collected voice instruction, and a voiceprint recognition result is obtained.
Acquiring image information of the user.
Authenticating the user according to the voiceprint recognition result and the image information to obtain an authentication result, which includes determining the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user.
Executing the voice instruction according to the authentication result, which includes executing the voice instruction according to the priority result.
Specifically, the voice instruction is executed according to the authentication result. In this embodiment, the authentication result includes the priority result of the user, and the corresponding voice instruction is executed according to that priority result.
In some embodiments, voice instructions are executed according to user priority, so disordered execution is avoided when several users issue conflicting voice instructions at the same time, and voice control becomes more orderly and safer.
In some embodiments, the priorities include: a driver priority and a passenger priority, wherein the driver priority is higher than the passenger priority.
Further, executing the voice instruction according to the priority result specifically includes: when the priority result includes both the driver priority and the passenger priority, only the voice instruction of the driver is executed, and the voice instruction of the passenger is not executed.
Specifically, when the priority result determined from the voiceprint recognition result and the image information includes both the driver priority and the passenger priority, only the driver's voice instruction is executed and the passenger's is not. For example, when two users issue voice instructions and, according to their voiceprint recognition results and image information, the first user is determined to be the driver and the second user a passenger, only the voice instruction of the first user (the driver) is executed, and the voice instruction of the second user (the passenger) is not. When only one user issues a voice instruction, the instruction is executed if the user is determined to be the driver, and is not executed if the user is determined to be a passenger.
Because this scheme distinguishes only the driver priority and the passenger priority, and executes only the driver's voice instructions, voice control is safer and driving safety is protected.
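The driver-only rule above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the `Role` enum, the `Instruction` record and the `driver_only` function are hypothetical names, and the role is assumed to have already been determined from the voiceprint recognition result and the image information.

```python
from dataclasses import dataclass
from enum import IntEnum


class Role(IntEnum):
    """Two-level priority: a larger value means a higher priority."""
    PASSENGER = 0
    DRIVER = 1


@dataclass(frozen=True)
class Instruction:
    speaker: str  # hypothetical speaker label
    role: Role    # assumed output of the authentication step
    text: str     # the spoken instruction


def driver_only(instructions):
    """Keep only the instructions whose speaker was authenticated as the driver."""
    return [ins for ins in instructions if ins.role is Role.DRIVER]
```

With one driver and one passenger speaking, only the driver's instruction survives; with a single passenger speaking, the result is empty and nothing is executed.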
In some embodiments, the priorities include: a manager priority, an authorized user priority and an ordinary user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the ordinary user priority.
Further, executing the voice instruction according to the priority result specifically includes:
when the priority result includes at least two priorities: if the voice instructions do not conflict, executing all of them; if the voice instructions conflict and the priorities differ, executing the voice instruction of the user with the highest priority; if the voice instructions conflict and the priorities are the same, executing the last-collected voice instruction;
when the priority result includes only one priority: if the voice instructions do not conflict, executing all of them; if the voice instructions conflict, executing the last-collected voice instruction.
Specifically, this is explained below by way of example. It should be noted that the present invention is not limited to these examples.
Suppose three users issue voice instructions at the same time. If the priority result determined from the voiceprint recognition results and image information of the three users is that the first user is a manager, the second user is an authorized user, and the third user is an ordinary user, then: when the three users' voice instructions do not conflict, all three are executed; when they conflict, the first user (the manager) has the highest priority, so the first user's voice instruction is executed and those of the second user (the authorized user) and the third user (the ordinary user) are not. If instead the priority result is that the first and second users are authorized users and the third user is an ordinary user, then: when the voice instructions do not conflict, all three are executed; when they conflict, the first and second users have the same priority, which is higher than the third user's, so the third user's voice instruction is not executed and, of the first and second users' voice instructions, the one collected last is executed.
Suppose two users issue voice instructions at the same time. If the priority result determined from the voiceprint recognition results and image information of the two users is that both are ordinary users, then all voice instructions are executed when they do not conflict, and the last-collected voice instruction is executed when they conflict.
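The arbitration rules and worked examples above can be sketched as follows. This is a sketch under stated assumptions rather than the method as claimed: the text does not define when two instructions conflict, so the code assumes that instructions addressed to the same `target` conflict. The names `Priority`, `Instruction`, `target`, `collected_at` and `arbitrate` are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    """Three-level priority: a larger value means a higher priority."""
    ORDINARY = 1
    AUTHORIZED = 2
    MANAGER = 3


@dataclass(frozen=True)
class Instruction:
    user: str
    priority: Priority   # assumed output of the authentication step
    target: str          # assumption: the function the instruction acts on
    action: str
    collected_at: float  # collection time in seconds; larger = collected later


def arbitrate(batch):
    """Pick the instructions to execute from one batch of simultaneous instructions.

    Assumption: instructions conflict when they share a target. For each
    target, the winner has the highest priority; ties on priority go to
    the instruction collected last. Non-conflicting instructions all pass.
    """
    groups = {}
    for ins in batch:
        groups.setdefault(ins.target, []).append(ins)
    return [
        max(group, key=lambda i: (i.priority, i.collected_at))
        for group in groups.values()
    ]
```

Sorting on the pair `(priority, collected_at)` covers both the "different priorities" and the "same priority" branches with one comparison, and it also handles the mixed case of two authorized users and one ordinary user described above.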
It should be noted that "at the same time" above may mean that the voice instructions are issued within the same second, or it may refer to a time window: voice instructions issued within that window are regarded as issued at the same time. For example, if the window is 3 seconds, voice instructions issued by users within those 3 seconds are regarded as simultaneous. It is understood that the 1-second and 3-second values are only illustrative and do not limit the present invention.
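One way to realize such a time window is to batch instructions by collection time. The sketch below is illustrative only: the text does not say where the window starts, so the code assumes each window opens at the first instruction of a batch. The function name `group_simultaneous` and its `window` parameter are hypothetical, and the 3-second default simply mirrors the illustrative value above.

```python
def group_simultaneous(timestamps, window=3.0):
    """Group collection times (in seconds) into batches treated as simultaneous.

    Assumption: a batch opens at its first timestamp, and every later
    timestamp within `window` seconds of that first one joins the batch.
    """
    batches = []
    for t in sorted(timestamps):
        if batches and t - batches[-1][0] <= window:
            batches[-1].append(t)
        else:
            batches.append([t])
    return batches
```

Each resulting batch can then be handed to the priority arbitration step as one set of simultaneous instructions.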
Dividing users into the manager priority, the authorized user priority and the ordinary user priority gives a finer-grained priority scheme: more users, with different priorities, can perform voice control without execution becoming disordered.
In some embodiments, authenticating the user according to the voiceprint recognition result and the image information to obtain the authentication result further includes: determining the age and emotion of the user according to the voiceprint recognition result and the image information.
Further, the voice instruction includes playing audio or video, and executing the voice instruction according to the authentication result further includes: selecting, according to the age and emotion of the user, video or audio suited to that age and emotion for playback.
Specifically, when a user issues the voice instruction "play music", voiceprint recognition is performed according to the voice instruction and image information of the user is acquired. If, according to the voiceprint recognition result and the image information, the user is determined to be 5 years old and currently happy, music that is suitable for a 5-year-old and has a cheerful, relaxed mood is selected and played.
Because the age and emotion of the user are determined directly from the voiceprint recognition result and the image information, and suitable music or video is selected and played automatically, voice control becomes more user-friendly.
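The selection step in the "play music" example can be sketched as a simple catalog filter. Everything here is an assumption made for illustration: the `Track` record, its age-range and `mood` fields, and the `select_tracks` function do not come from the source, which leaves open how media is tagged or matched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    title: str    # placeholder title
    min_age: int  # hypothetical tag: youngest suitable listener
    max_age: int  # hypothetical tag: oldest suitable listener
    mood: str     # hypothetical tag, e.g. "happy" or "calm"


def select_tracks(catalog, age, emotion):
    """Return the tracks whose age range covers `age` and whose mood equals `emotion`."""
    return [
        track for track in catalog
        if track.min_age <= age <= track.max_age and track.mood == emotion
    ]
```

For a user estimated to be 5 years old and happy, only tracks tagged for that age range and mood are returned; a real system would likely need a fallback when the filter returns nothing.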
Fig. 2 is a schematic diagram of a voice control system 100 based on image and voiceprint recognition according to an embodiment of the present invention.
As shown in FIG. 2, the voice control system 100 based on image and voiceprint recognition includes:
a voice acquisition module 101, configured to collect a voice instruction of a user;
a voice processing module 102, configured to perform voiceprint recognition on the user according to the voice instruction to obtain a voiceprint recognition result;
an image acquisition module 103, configured to acquire image information of the user;
an authentication module 104, configured to authenticate the user according to the voiceprint recognition result and the image information to obtain an authentication result, which includes: determining the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user;
an execution module 105, configured to execute the voice instruction according to the authentication result, which includes: executing the voice instruction according to the priority result.
Specifically, the voice acquisition module 101 collects the voice instruction of the user, and the image acquisition module 103 acquires the image information of the user. The voice processing module 102 performs voiceprint recognition on the user according to the voice instruction collected by the voice acquisition module 101 to obtain the user's voiceprint recognition result. The authentication module 104 authenticates the user according to the voiceprint recognition result from the voice processing module 102 and the image information acquired by the image acquisition module 103 to obtain an authentication result, which includes determining the priority of the user to obtain a priority result. The execution module 105 executes the voice instruction according to the authentication result of the authentication module 104, which includes executing the voice instruction according to the priority result.
With the voice control system 100 based on image and voiceprint recognition, voice instructions are executed according to user priority, so disordered execution is avoided when several users issue conflicting voice instructions at the same time, and voice control becomes more orderly and safer.
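The data flow between modules 101 to 105 can be sketched as a small composition of callables. This is only an illustration of the wiring described above: the class name `VoiceControlSystem`, the field names, the `handle_once` method and all data types (bytes for audio and image, strings for results) are assumptions, and the real recognition steps are left as pluggable functions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceControlSystem:
    """Hypothetical wiring of modules 101-105 as plain callables."""
    acquire_voice: Callable[[], bytes]            # voice acquisition module 101
    recognize_voiceprint: Callable[[bytes], str]  # voice processing module 102
    acquire_image: Callable[[], bytes]            # image acquisition module 103
    authenticate: Callable[[str, bytes], str]     # authentication module 104
    execute: Callable[[bytes, str], str]          # execution module 105

    def handle_once(self):
        """Run one instruction through the pipeline and return the execution result."""
        voice = self.acquire_voice()
        image = self.acquire_image()
        voiceprint = self.recognize_voiceprint(voice)
        priority = self.authenticate(voiceprint, image)  # the priority result
        return self.execute(voice, priority)
```

Keeping each module behind a callable makes it possible to swap in stubs for testing, or different recognizers, without changing the flow.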
In some embodiments, the priorities include: a driver priority and a passenger priority, wherein the driver priority is higher than the passenger priority.
Further, the execution module 105 is specifically configured to: when the priority result includes the driver priority and the passenger priority, the execution module 105 executes only the voice instruction of the driver and does not execute the voice instruction of the passenger.
Specifically, when the authentication module 104 determines the priority result of the user according to the voiceprint recognition result from the voice processing module 102 and the image information acquired by the image acquisition module 103, and the priority result includes both the driver priority and the passenger priority, the execution module 105 executes only the driver's voice instruction and not the passenger's. For example, when two users issue voice instructions, the voice acquisition module 101 collects the two users' voice instructions, the voice processing module 102 performs voiceprint recognition on the collected voice instructions, the image acquisition module 103 acquires image information of the two users, and the authentication module 104 determines from the voiceprint recognition results and image information that the first user is the driver and the second user is a passenger. The execution module 105 then executes only the voice instruction of the first user (the driver) and does not execute the voice instruction of the second user (the passenger). When only one user issues a voice instruction, the execution module 105 executes it if the authentication module 104 determines that the user is the driver, and does not execute it if the authentication module 104 determines that the user is a passenger.
Because this scheme distinguishes only the driver priority and the passenger priority, voice control is safer and driving safety is protected.
In some embodiments, the priorities include: the system comprises a manager priority, an authorized user priority and a common user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the common user priority.
Further, the execution module 105 is specifically configured to:
when the priority result includes at least two priorities, if the voice commands do not conflict, the execution module 105 executes all the voice commands; if the voice instructions conflict and the priorities are different, the execution module 105 executes the voice instruction of the user with the higher priority; if the voice commands conflict and the priorities are the same, the execution module 105 executes the finally acquired voice command;
when the priority result only comprises one priority, if the voice commands do not conflict, the execution module 105 executes all the voice commands; if the voice command conflicts, the execution module 105 executes the last collected voice command.
In particular, the following description is illustrative, and it is to be understood that the present invention is illustrative only and is not to be construed as limited thereto.
When a three-bit user sends a voice command at the same time, if the authentication module 104 judges the priority result of the user according to the voiceprint recognition result and the image information of the three-bit user, the priority result is: the first user is a manager, the second user is an authorized user, the third user is a common user, and when the voice commands of the three-bit user are not in conflict, the execution module 105 executes the voice commands of the three-bit user; when the voice commands of the three-bit user conflict, the priority of the first user (i.e. the manager) is the highest according to the priority, and the execution module 105 executes the voice command of the first user and does not execute the voice commands of the second user (i.e. the authorized user) and the third user (i.e. the ordinary user). If the authentication module 104 judges the priority result of the user according to the voiceprint recognition result and the image information of the three-dimensional user, the priority result is: the first user is an authorized user, the second user is an authorized user, the third user is a common user, and when the voice commands of the three-bit user are not in conflict, the execution module 105 executes the voice commands of the three-bit user; when the voice commands of the three-user conflict, the priority of the first user is the same as that of the second user, and is higher than that of the third user, at this time, the execution module 105 does not execute the voice command of the third user, and executes the voice command acquired last in the first user and the second user.
When two users issue voice commands at the same time, and the authentication module 104 judges from the voiceprint recognition results and image information of the two users that the first user and the second user are both ordinary users, the execution module 105 executes all the voice commands if they do not conflict, and executes the last-collected voice command if they conflict.
It should be noted that the meaning of "simultaneously" in issuing voice commands simultaneously has been explained above and is not repeated here.
Dividing users into manager priority, authorized user priority and ordinary user priority gives a finer-grained priority division: more users, with different priorities, can use voice control without execution becoming confused.
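The arbitration rules described above can be illustrated with a short sketch. It is not the patented implementation: the names `Priority`, `Command` and `arbitrate` are hypothetical, and the sketch assumes that two commands "conflict" when they address the same vehicle function with different actions, which this section does not define. When commands on one function conflict, the sketch executes a single winning command for that function.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    # Larger value = higher priority (ordering taken from the description).
    ORDINARY = 1
    AUTHORIZED = 2
    MANAGER = 3


@dataclass(frozen=True)
class Command:
    user: str
    priority: Priority
    target: str          # hypothetical: vehicle function addressed, e.g. "window"
    action: str          # hypothetical: requested action, e.g. "open"
    collected_at: float  # hypothetical: collection time in seconds


def arbitrate(commands):
    """Return the commands to execute, ordered by collection time."""
    # Assumption: commands can only conflict if they address the same target.
    by_target = {}
    for cmd in commands:
        by_target.setdefault(cmd.target, []).append(cmd)

    selected = []
    for group in by_target.values():
        if len({c.action for c in group}) == 1:
            # No conflict on this target: execute every command.
            selected.extend(group)
            continue
        # Conflict: keep only the highest priority present ...
        top = max(c.priority for c in group)
        candidates = [c for c in group if c.priority == top]
        # ... and among equal priorities, the command collected last.
        selected.append(max(candidates, key=lambda c: c.collected_at))
    return sorted(selected, key=lambda c: c.collected_at)
```

With a manager, an authorized user and an ordinary user issuing conflicting commands on the same function, only the manager's command is returned; with two authorized users and one ordinary user, the later of the two authorized users' commands is returned.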
In some embodiments, the system further comprises a selection module 106 for selecting between a first priority and a second priority, wherein the first priority comprises a driver priority and a passenger priority, and the driver priority is higher than the passenger priority; the second priority comprises a manager priority, an authorized user priority and an ordinary user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the ordinary user priority.
Specifically, after the selection module 106 selects either the first priority or the second priority, the authentication module 104 authenticates the user according to that selection to obtain a priority result. For example, if the selection module 106 selects the first priority, the authentication module 104 classifies users only into the driver priority and the passenger priority when obtaining the priority result; if the selection module 106 selects the second priority, the authentication module 104 classifies users only into the manager priority, the authorized user priority and the ordinary user priority.
The first priority may be selected when driving safety needs to be guaranteed, and the second priority may be selected when more users are to be allowed to use voice control. The selection module 106 lets the user choose a suitable priority according to actual needs, and voice control is then performed using that priority, which makes voice control more user-friendly.
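The effect of the selection module on authentication can be sketched as follows. This is a minimal illustration under stated assumptions: the names `Scheme`, `DriverPassenger`, `Role` and `priority_for` are hypothetical, and the inputs `is_driver` and `role` are assumed to have already been derived from the voiceprint recognition result and the image information.

```python
from enum import Enum, IntEnum


class DriverPassenger(IntEnum):
    # First priority: driver outranks passenger.
    PASSENGER = 1
    DRIVER = 2


class Role(IntEnum):
    # Second priority: manager > authorized user > ordinary user.
    ORDINARY = 1
    AUTHORIZED = 2
    MANAGER = 3


class Scheme(Enum):
    FIRST = "first"    # driver / passenger
    SECOND = "second"  # manager / authorized user / ordinary user


def priority_for(scheme, is_driver, role):
    """Classify a user under the selected scheme only.

    `is_driver` and `role` are hypothetical outputs of authentication.
    """
    if scheme is Scheme.FIRST:
        return DriverPassenger.DRIVER if is_driver else DriverPassenger.PASSENGER
    return role
```

Under `Scheme.FIRST` a manager sitting in a passenger seat is treated simply as a passenger, which matches the stated purpose of the first priority (driving safety); under `Scheme.SECOND` the seat is ignored and only the role counts.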
In some embodiments, the authentication module 104 is further configured to determine the age and emotion of the user according to the voiceprint recognition result and the image information.
Further, the voice command includes playing audio or video, and the execution module 105 is further configured to select, according to the age and emotion of the user, a video or audio suitable for that age and emotion to be played.
Specifically, when a user issues the voice command "play music", the voice acquisition module 101 collects the voice command, the voice processing module 102 performs voiceprint recognition on the voice command to obtain a voiceprint recognition result, the image acquisition module 103 acquires image information of the user, the authentication module 104 then judges the age and emotion of the user according to the voiceprint recognition result and the image information, and the execution module 105 then selects music suitable for that age and emotion to play.
Because the age and emotion of the user are judged directly from the voiceprint recognition result and the image information, and suitable music or video is selected automatically for playback, voice control becomes more user-friendly.
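The final selection step can be sketched as follows. The description does not say how suitable media is chosen, so everything here is an assumption made for illustration: the `Track` type, the tiny `CATALOG`, the mood labels, and the fallback to an age-only match are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    title: str
    min_age: int
    max_age: int
    moods: frozenset  # emotions this track is considered suitable for


# Hypothetical catalog; a real system would query a media library.
CATALOG = [
    Track("Nursery Rhymes", 0, 8, frozenset({"happy", "calm"})),
    Track("Upbeat Pop", 9, 40, frozenset({"happy"})),
    Track("Soft Piano", 9, 120, frozenset({"sad", "calm"})),
]


def select_track(age, emotion, catalog=CATALOG):
    """Pick the first track matching age and emotion.

    Falls back to the first age-appropriate track, or None if there is none.
    """
    age_ok = [t for t in catalog if t.min_age <= age <= t.max_age]
    for track in age_ok:
        if emotion in track.moods:
            return track
    return age_ok[0] if age_ok else None
```

Here `age` and `emotion` stand in for the values that the authentication module 104 derives from the voiceprint recognition result and the image information.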
Fig. 3 is a schematic diagram of a vehicle according to an embodiment of the present invention. As shown in Fig. 3, the vehicle 200 includes the voice control system 100 based on image and voiceprint recognition described above.
The vehicle 200 executes the corresponding voice commands according to the priority of the users, which avoids confused execution when multiple users issue conflicting voice commands at the same time, so that voice control becomes more orderly and safer.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.
In the description of the present invention, it is to be understood that the terms "center", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "axial", "radial", "circumferential", and the like indicate orientations and positional relationships based on those shown in the drawings. They are used merely for convenience and simplicity of description, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; they should therefore not be considered as limiting the present invention. Further, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless otherwise specified.
In the description of the present invention, it should be noted that, unless otherwise explicitly stated or limited, the terms "mounted" and "connected" are to be construed broadly: a connection may be fixed, removable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intervening medium, or an internal communication between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
In the description of the specification, reference to the terms "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (16)

1. A voice control method based on image and voiceprint recognition, characterized by comprising the following steps:
collecting a voice instruction of a user;
performing voiceprint recognition on the user according to the voice instruction to obtain a voiceprint recognition result;
acquiring image information of the user;
authenticating the user according to the voiceprint recognition result and the image information to obtain an authentication result, which comprises: judging the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user;
executing the voice instruction according to the authentication result, which comprises: executing the voice instruction according to the priority result.
2. The method of claim 1, wherein the priority comprises: a driver priority and a passenger priority, wherein the driver priority is higher than the passenger priority.
3. The method of claim 2, wherein
executing the voice instruction according to the priority result specifically includes:
when the priority result comprises the driver priority and the passenger priority, only the voice instruction of the driver is executed, and the voice instruction of the passenger is not executed.
4. The method of claim 1, wherein
the priority comprises: a manager priority, an authorized user priority and an ordinary user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the ordinary user priority.
5. The method of claim 4, wherein
executing the voice instruction according to the priority result specifically includes:
when the priority result comprises at least two priorities, if the voice instructions do not conflict, executing all the voice instructions; if the voice instructions conflict and the priorities are different, executing the voice instruction of the user with the highest priority; if the voice instructions conflict and the priorities are the same, executing the last-collected voice instruction;
when the priority result comprises only one priority, if the voice instructions do not conflict, executing all the voice instructions; if the voice instructions conflict, executing the last-collected voice instruction.
6. The method of claim 1, wherein
authenticating the user according to the voiceprint recognition result and the image information to obtain an authentication result further comprises: judging the age and emotion of the user according to the voiceprint recognition result and the image information.
7. The method of claim 6, wherein the voice command comprises: playing audio or video;
executing the voice instruction according to the authentication result further comprises: selecting, according to the age and emotion of the user, a video or audio suitable for that age and emotion for playing.
8. A voice control system based on image and voiceprint recognition, comprising:
a voice acquisition module for collecting a voice instruction of a user;
a voice processing module for performing voiceprint recognition on the user according to the voice instruction to obtain a voiceprint recognition result;
an image acquisition module for acquiring image information of the user;
an authentication module for authenticating the user according to the voiceprint recognition result and the image information to obtain an authentication result, which comprises: judging the priority of the user according to the voiceprint recognition result and the image information to obtain a priority result of the user;
an execution module for executing the voice instruction according to the authentication result, which comprises: executing the voice instruction according to the priority result.
9. The voice control system based on image and voiceprint recognition according to claim 8, wherein the priority comprises: a driver priority and a passenger priority, wherein the driver priority is higher than the passenger priority.
10. The voice control system based on image and voiceprint recognition according to claim 9, wherein the execution module is specifically configured to: when the priority result comprises the driver priority and the passenger priority, execute only the voice instruction of the driver and not execute the voice instruction of the passenger.
11. The voice control system based on image and voiceprint recognition according to claim 8, wherein the priority comprises: a manager priority, an authorized user priority and an ordinary user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the ordinary user priority.
12. The voice control system based on image and voiceprint recognition according to claim 11, wherein the execution module is specifically configured to:
when the priority result comprises at least two priorities, if the voice instructions do not conflict, execute all the voice instructions; if the voice instructions conflict and the priorities are different, execute the voice instruction of the user with the highest priority; if the voice instructions conflict and the priorities are the same, execute the last-collected voice instruction;
when the priority result comprises only one priority, if the voice instructions do not conflict, execute all the voice instructions; if the voice instructions conflict, execute the last-collected voice instruction.
13. The voice control system based on image and voiceprint recognition according to claim 8, further comprising: a selection module for selecting between a first priority and a second priority, wherein the first priority comprises a driver priority and a passenger priority, and the driver priority is higher than the passenger priority; the second priority comprises a manager priority, an authorized user priority and an ordinary user priority, wherein the manager priority is higher than the authorized user priority, and the authorized user priority is higher than the ordinary user priority.
14. The voice control system based on image and voiceprint recognition according to claim 8, wherein the authentication module is further configured to judge the age and emotion of the user according to the voiceprint recognition result and the image information.
15. The voice control system based on image and voiceprint recognition according to claim 14, wherein
the voice instruction includes: playing audio or video;
the execution module is also used for selecting the video or audio suitable for the age and the emotion to play according to the age and the emotion of the user.
16. A vehicle comprising a voice control system based on image and voiceprint recognition according to any one of claims 8 to 15.
CN201910220610.5A 2019-03-22 2019-03-22 Voice control method and system based on image and voiceprint recognition and vehicle Pending CN111724797A (en)

Publications (1)

Publication Number Publication Date
CN111724797A true CN111724797A (en) 2020-09-29

Family

ID=72562741

Country Status (1)

Country Link
CN (1) CN111724797A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200929