CN114093354A - Method and system for improving recognition accuracy of vehicle-mounted voice assistant - Google Patents


Publication number
CN114093354A
Authority
CN
China
Prior art keywords
vehicle
instruction
language instruction
voice assistant
recognition accuracy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111245199.0A
Other languages
Chinese (zh)
Inventor
邱安崇
钟启兴
唐侨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huizhou Desay SV Intelligent Transport Technology Research Institute Co Ltd
Original Assignee
Huizhou Desay SV Intelligent Transport Technology Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huizhou Desay SV Intelligent Transport Technology Research Institute Co Ltd
Priority to CN202111245199.0A
Publication of CN114093354A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/24 Speech recognition using non-acoustical features
    • G10L15/25 Speech recognition using non-acoustical features using position of the lips, movement of the lips or face analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/57 Mechanical or electrical details of cameras or camera modules specially adapted for being embedded in other devices
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)
  • Fittings On The Vehicle Exterior For Carrying Loads, And Devices For Holding Or Mounting Articles (AREA)

Abstract

The invention provides a method and a system for improving the recognition accuracy of a vehicle-mounted voice assistant.

Description

Method and system for improving recognition accuracy of vehicle-mounted voice assistant
Technical Field
The invention relates to the technical field of vehicle-mounted terminal control, in particular to a method and a system for improving the recognition accuracy of a vehicle-mounted voice assistant.
Background
With the rapid development of the automobile industry, car ownership has risen markedly, in-vehicle technology has matured, and automotive technology is moving toward intelligent and driverless operation. The vehicle-mounted voice assistant is an important part of current automotive technology: through it, a driver or passenger can give instructions to the in-vehicle entertainment system by voice alone, such as playing a certain song, querying the route to a certain place, or adjusting the volume. After receiving an instruction, the system performs the operation the user requested, which makes for a more comfortable driving experience. At present, most voice assistants in the industry use microphones as the only sensors for acquiring voice information, in single-microphone, dual-microphone, four-microphone and similar schemes. Voice assistant technology is widely applied and mature in the consumer industry. However, while driving or in some noisy scenes, speech recognition accuracy can be greatly affected, and the system may fail to recognize useful voice instructions accurately. In addition, if other passengers in the vehicle want to adjust system settings through the voice assistant, the identity, position and other attributes of the sender of a voice command are difficult to determine with microphones alone. The system is also easily disturbed by other noise, so that it is falsely triggered, or even controlled by a hacker through voice control.
Patent No. 201910072525.9 discloses a method, an electronic device and a storage medium for improving the accuracy of speech recognition, aimed mainly at consumer application scenarios. Voice information is acquired through a sound acquisition device, mouth shape identification information is acquired through a camera image sensor, the system compares the two, and one of the two is selected as the instruction data. However, that system needs to collect voice instruction data and mouth shape data in advance and only compares against the information in its database, so a person outside the database cannot issue a voice instruction. In addition, the system selects between the judgment results according to a self-defined recognition rate, and a single judgment of this kind still has a relatively high error rate and false detection rate, so it does not effectively improve speech recognition accuracy.
Disclosure of Invention
To address the above technical problems, the invention provides a method and a system for improving the recognition accuracy of a vehicle-mounted voice assistant. The microphone and the camera are fused, that is, video information and sound information are fused, and the authenticity of a language instruction collected by the language instruction collecting end is judged from the features of the persons in the video, so as to ensure the accuracy of the language instruction.
Specifically, the method of the invention comprises the following steps:
S1: Start the system and let it run normally;
S2: When a language instruction is received, record the time period T1 during which the language instruction was issued;
S3: Retrieve the image data information corresponding to the time period T1;
S4: Analyze the image data information to judge whether the consistency between the language instruction and the image data information is greater than a preset value, the preset value preferably being set to 80%. If yes, go to S5; otherwise, go to S6;
S5: The language instruction is valid; execute the instruction program corresponding to the language instruction, and feed back the information of the instruction issuer;
S6: Ask whether the specific instruction is to be executed. If yes, go to S5; otherwise, go to S2.
The language instruction is collected through the vehicle-mounted microphone terminal, which records the time period T1 of the language instruction and feeds T1 back to the system controller end.
The image data information comprises at least a video and a time stamp.
Step S3 further includes: the system controller end acquires the image data information collected in real time, analyzes the face posture and mouth changes of the driver or a passenger, identifies the corresponding control instruction information, and judges whether the control instruction information is consistent with the language instruction.
The information of the instruction issuer comprises at least the identity and the location of the instruction issuer.
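As a rough illustration of the branching in steps S4 to S6, the following Python sketch treats the consistency as a number between 0 and 1 and the query of S6 as a callback. The names `handle_instruction`, `confirm` and `PRESET_VALUE` are hypothetical and do not come from the patent, which also does not specify how the consistency value is computed.

```python
from typing import Callable

# Preferred preset value from step S4 (80%), expressed as a fraction.
PRESET_VALUE = 0.80


def handle_instruction(
    consistency: float,
    confirm: Callable[[], bool],
    preset: float = PRESET_VALUE,
) -> str:
    """Return "execute" (step S5) or "wait" (back to step S2).

    consistency: assumed score in [0, 1] describing how well the image
                 data matches the language instruction.
    confirm:     hypothetical callback that asks the occupants whether the
                 instruction should be executed anyway (step S6).
    """
    # S4: the instruction is valid only if consistency exceeds the preset value.
    if consistency > preset:
        return "execute"  # S5
    # S6: otherwise ask whether the specific instruction should be executed.
    if confirm():
        return "execute"  # S6 -> S5
    return "wait"  # S6 -> S2, wait for the next language instruction
```

For example, `handle_instruction(0.9, lambda: False)` takes the S5 branch directly, while a score of exactly 0.80 is not "greater than" the preset value and therefore falls through to the query of S6.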
As another preferred embodiment, the present invention further provides a system for improving recognition accuracy of a vehicle-mounted voice assistant, including:
at least one camera module, used for acquiring image data information in the vehicle in real time;
at least one vehicle-mounted microphone terminal, used for collecting the language instruction, recording the time period T1 of the language instruction, and feeding T1 back to the system controller end;
a system controller end, responsible for judging the consistency between the language instruction and the image data information; when they are consistent, the language instruction is valid, the instruction program corresponding to the language instruction is executed, and the information of the instruction issuer is fed back; otherwise, a query is issued to re-confirm or re-judge.
Data is transmitted among the camera module, the vehicle-mounted microphone terminal and the system controller end wirelessly or through a USB data cable.
The system controller further includes a memory, a processor, and a computer program stored on the memory and executable on the processor.
The computer program, when executed by the processor, implements the method for improving recognition accuracy of an in-vehicle voice assistant as described above.
In summary, the invention provides a method and a system for improving the recognition accuracy of a vehicle-mounted voice assistant. With the microphone and the camera fused, when the system receives a voice assistant instruction it compares the opening and closing of the mouths and the facial expressions of the driver and passengers over the same time period. When the relevant image features match to a certain degree, the voice instruction is judged valid, and the identity and position of its sender are identified, thereby improving the recognition accuracy of the voice assistant.
Drawings
Fig. 1 is a flowchart of the method for improving the recognition accuracy of the vehicle-mounted voice assistant according to the present invention.
Fig. 2 shows the installation positions of the camera module and the vehicle-mounted microphone terminals of the invention in the vehicle.
Fig. 3 is a diagram of the communication process between the systems of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, the method for improving the recognition accuracy of the vehicle-mounted voice assistant according to the present invention includes the following steps:
S1: Start the system and let it run normally;
S2: When a language instruction is received, record the time period T1 during which the language instruction was issued;
S3: Retrieve the image data information corresponding to the time period T1;
S4: Analyze the image data information to judge whether the consistency between the language instruction and the image data information is greater than a preset value, the preset value preferably being set to 80%. If yes, go to S5; otherwise, go to S6;
S5: The language instruction is valid; execute the instruction program corresponding to the language instruction, and feed back the information of the instruction issuer;
S6: Ask whether the specific instruction is to be executed. If yes, go to S5; otherwise, go to S2.
The language instruction is collected through the vehicle-mounted microphone terminal, which records the time period T1 of the language instruction and feeds T1 back to the system controller end.
The image data information comprises at least a video and a time stamp.
Step S3 further includes: the system controller end acquires the image data information collected in real time, analyzes the face posture and mouth changes of the driver or a passenger, identifies the corresponding control instruction information, and judges whether the control instruction information is consistent with the language instruction. For example, suppose the time period T1 of the language instruction covers 8:00, 8:01, 8:03, 8:04 and 8:05 a.m., during which a passenger issues the language instruction, for instance: "Moxa, please turn on the air conditioner." After the vehicle-mounted microphone terminal collects the language instruction, it records the time period of the instruction and sends it to the system controller end. On receiving this information, the system controller end retrieves the video information collected by the camera, which includes real-time video of the faces and mouth shapes of everyone in the vehicle, and searches for the video at the same time nodes according to T1. If the face posture and mouth shape changes of the front passenger match at that time, starting at 8:00 a.m. and continuing until 8:05 a.m., it is preliminarily judged that the language instruction was issued by the person in the front passenger seat and that the information is accurate; the next step can then ask whether the specific instruction needs to be executed. If the video shows no change in the face posture or mouth of anyone in the vehicle, it can be judged that the language instruction was not issued by a person in the vehicle and may be noise from outside the vehicle or sound from a mobile phone; the language instruction is invalid and no further operation is performed.
The information of the instruction issuer comprises at least the identity and the location of the instruction issuer.
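The matching in the example above can be sketched in Python as follows. This is only an illustration under stated assumptions: the patent does not define how the consistency is measured, so the sketch assumes it is the fraction of video samples inside T1 in which an occupant's mouth is moving. `MouthSample`, `consistency`, `find_issuer` and the seat labels are hypothetical names, and the mouth-movement flags are assumed to come from some upstream face analysis that is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class MouthSample:
    """One video sample for one occupant (hypothetical structure)."""
    timestamp: float  # seconds, on the same clock as the microphone terminal
    moving: bool      # True if the mouth is judged to be moving in this sample


def consistency(samples: List[MouthSample], t_start: float, t_end: float) -> float:
    """Assumed measure: share of samples within T1 that show mouth movement."""
    in_window = [s for s in samples if t_start <= s.timestamp <= t_end]
    if not in_window:
        return 0.0
    return sum(s.moving for s in in_window) / len(in_window)


def find_issuer(
    per_seat: Dict[str, List[MouthSample]],
    t_start: float,
    t_end: float,
    preset: float = 0.80,
) -> Optional[Tuple[str, float]]:
    """Return (seat, score) for the best-matching occupant, or None.

    None corresponds to the case where nobody's mouth movement matches T1,
    e.g. noise from outside the vehicle or sound from a mobile phone.
    """
    scores = {seat: consistency(samples, t_start, t_end)
              for seat, samples in per_seat.items()}
    if not scores:
        return None
    best = max(scores, key=lambda seat: scores[seat])
    return (best, scores[best]) if scores[best] > preset else None
```

The seat label returned here only covers the location part of the issuer information; identifying who the occupant is would need a separate face recognition step, which this sketch leaves out.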
As another preferred embodiment, the present invention further provides a system for improving recognition accuracy of a vehicle-mounted voice assistant, including:
at least one camera module, used for acquiring image data information in the vehicle in real time;
at least one vehicle-mounted microphone terminal, used for collecting the language instruction, recording the time period T1 of the language instruction, and feeding T1 back to the system controller end;
a system controller end, responsible for judging the consistency between the language instruction and the image data information; when they are consistent, the language instruction is valid, the instruction program corresponding to the language instruction is executed, and the information of the instruction issuer is fed back; otherwise, a query is issued to re-confirm or re-judge.
Data is transmitted among the camera module, the vehicle-mounted microphone terminal and the system controller end wirelessly or through a USB data cable.
The system controller further includes a memory, a processor, and a computer program stored on the memory and executable on the processor.
The computer program, when executed by the processor, implements the method for improving recognition accuracy of an in-vehicle voice assistant as described above.
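One way the system controller end might retrieve the image data for the time period T1 (step S3) is to keep recent camera frames in a timestamped buffer and slice it by time. The patent does not describe how frames are stored, so the class `FrameBuffer`, its methods and the default capacity below are assumptions made purely for illustration.

```python
from bisect import bisect_left, bisect_right
from collections import deque
from typing import Any, Deque, List, Tuple


class FrameBuffer:
    """Bounded buffer of (timestamp, frame) pairs from the camera module."""

    def __init__(self, max_frames: int = 3000) -> None:
        # Oldest frames are dropped automatically once max_frames is reached.
        self._frames: Deque[Tuple[float, Any]] = deque(maxlen=max_frames)

    def add(self, timestamp: float, frame: Any) -> None:
        """Store a frame; timestamps must arrive in non-decreasing order."""
        if self._frames and timestamp < self._frames[-1][0]:
            raise ValueError("timestamps must be non-decreasing")
        self._frames.append((timestamp, frame))

    def window(self, t_start: float, t_end: float) -> List[Any]:
        """Return the frames whose timestamps fall within [t_start, t_end]."""
        items = list(self._frames)
        stamps = [t for t, _ in items]
        lo = bisect_left(stamps, t_start)   # first frame at or after t_start
        hi = bisect_right(stamps, t_end)    # one past the last frame at or before t_end
        return [frame for _, frame in items[lo:hi]]
```

In this sketch the microphone terminal would report T1 as a start and end time, and the controller would call `window(t_start, t_end)` to obtain the frames to analyze. It assumes the camera and microphone timestamps share a common clock.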
As another embodiment, Fig. 2 shows the installation positions of the camera module and the vehicle-mounted microphone terminals in the vehicle. Preferably, the camera module, i.e. the in-vehicle monitoring camera, is installed in front of the passenger seat, for example at any position below the windshield, so as to give the camera the widest possible angle and a clear view of everything in the vehicle. The vehicle-mounted microphone terminals may be arranged two in the front row and two in the rear row, for example one microphone sensor at the left front of the driver's seat and one at the right front of the front passenger seat. In the rear row, they are likewise installed at the left front and right front of the passengers, on the back of the front seats. The arrangement is not limited to this and can be adjusted as appropriate for different requirements, vehicle models and the like.
The communication process between the systems is specifically as follows:
The camera module and the vehicle-mounted microphone terminal communicate with the system controller end wirelessly or are connected through USB to complete data transmission. The system controller end controls playing and stopping of the vehicle-mounted multimedia system, as well as the selection, addition or deletion of any multimedia information. The vehicle-mounted multimedia system is thus controlled through the system controller end, with playback through the vehicle-mounted speakers. When any information needs to be announced, it is delivered directly through the vehicle audio system; for example, when the video data information and the voice control instruction are inconsistent, a spoken announcement or query is made.
The above-mentioned embodiments only express several embodiments of the present invention, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A method for improving the recognition accuracy of a vehicle-mounted voice assistant is characterized by comprising the following steps:
S1: starting the system and letting it run normally;
S2: when a language instruction is received, recording the time period T1 during which the language instruction was issued;
S3: retrieving the image data information corresponding to the time period T1;
S4: judging, by analyzing the image data information, whether the consistency between the language instruction and the image data information is greater than a preset value; if so, going to S5, otherwise going to S6;
S5: the language instruction is valid; executing the instruction program corresponding to the language instruction, and feeding back the information of the instruction issuer;
S6: asking whether the specific instruction is to be executed; if so, going to S5, otherwise going to S2.
2. The method for improving the recognition accuracy of the vehicle-mounted voice assistant according to claim 1, further comprising: acquiring the language instruction through the vehicle-mounted microphone terminal, recording the time period T1 of the language instruction, and feeding T1 back to the system controller end.
3. The method for improving the recognition accuracy of the vehicle-mounted voice assistant according to claim 2, wherein the image data information comprises at least a video and a time stamp.
4. The method for improving recognition accuracy of the vehicle-mounted voice assistant according to claim 3, wherein the step S3 further comprises: the system controller end acquires the image data information collected in real time, analyzes the face posture and mouth changes of a driver or a passenger, identifies the corresponding control instruction information, and judges whether the control instruction information is consistent with the language instruction.
5. The method for improving the recognition accuracy of the vehicle-mounted voice assistant according to claim 4, wherein the preset value is set to 80%.
6. The method for improving the recognition accuracy of the vehicle-mounted voice assistant according to claim 5, wherein the information of the instruction issuer comprises at least the identity and location of the instruction issuer.
7. A system for improving recognition accuracy of a vehicle-mounted voice assistant is characterized by comprising:
at least one camera module, used for acquiring image data information in a vehicle in real time;
at least one vehicle-mounted microphone terminal, used for collecting the language instruction, recording the time period T1 of the language instruction, and feeding T1 back to the system controller end;
a system controller end, responsible for judging the consistency between the language instruction and the image data information; when they are consistent, the language instruction is valid, the instruction program corresponding to the language instruction is executed, and the information of the instruction issuer is fed back; otherwise, a query is issued to re-confirm or re-judge.
8. The system of claim 7, wherein data is transmitted among the camera module, the vehicle-mounted microphone terminal and the system controller end wirelessly or through a USB data cable.
9. The system of claim 8, wherein the system controller further comprises a memory, a processor, and a computer program stored on the memory and executable on the processor.
10. The system according to claim 9, wherein the computer program, when executed by a processor, implements a method for improving recognition accuracy of a vehicle-mounted voice assistant according to any of claims 1-6.
CN202111245199.0A 2021-10-26 2021-10-26 Method and system for improving recognition accuracy of vehicle-mounted voice assistant Pending CN114093354A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111245199.0A CN114093354A (en) 2021-10-26 2021-10-26 Method and system for improving recognition accuracy of vehicle-mounted voice assistant

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111245199.0A CN114093354A (en) 2021-10-26 2021-10-26 Method and system for improving recognition accuracy of vehicle-mounted voice assistant

Publications (1)

Publication Number Publication Date
CN114093354A (en) 2022-02-25

Family

ID=80297591

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111245199.0A Pending CN114093354A (en) 2021-10-26 2021-10-26 Method and system for improving recognition accuracy of vehicle-mounted voice assistant

Country Status (1)

Country Link
CN (1) CN114093354A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102324035A (en) * 2011-08-19 2012-01-18 广东好帮手电子科技股份有限公司 Method and system of applying lip posture assisted speech recognition technique to vehicle navigation
CN108128310A (en) * 2017-12-25 2018-06-08 芜湖皖江知识产权运营中心有限公司 A kind of vehicle audio input control method for intelligent travel
US20180286404A1 (en) * 2017-03-23 2018-10-04 Tk Holdings Inc. System and method of correlating mouth images to input commands
CN109545219A (en) * 2019-01-09 2019-03-29 北京新能源汽车股份有限公司 Vehicle-mounted voice exchange method, system, equipment and computer readable storage medium
CN109584871A (en) * 2018-12-04 2019-04-05 北京蓦然认知科技有限公司 Method for identifying ID, the device of phonetic order in a kind of vehicle
CN111326152A (en) * 2018-12-17 2020-06-23 南京人工智能高等研究院有限公司 Voice control method and device
CN112164395A (en) * 2020-09-18 2021-01-01 北京百度网讯科技有限公司 Vehicle-mounted voice starting method and device, electronic equipment and storage medium
CN113129893A (en) * 2019-12-30 2021-07-16 Oppo(重庆)智能科技有限公司 Voice recognition method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
US6285924B1 (en) On-vehicle input and output apparatus
EP3067827A1 (en) Driver distraction detection system
WO2019201304A1 (en) Face recognition-based voice processing method, and device
US10809802B2 (en) Line-of-sight detection apparatus, computer readable storage medium, and line-of-sight detection method
CN110082726B (en) Sound source positioning method and device, positioning equipment and storage medium
CN111547063A (en) Intelligent vehicle-mounted emotion interaction device for fatigue detection
US11044566B2 (en) Vehicle external speaker system
CN112109647A (en) Vehicle and method and device for realizing vehicle contextual model
CN107730902A (en) Method for recording, picture pick-up device and the storage medium of vehicle video recording
JP2006193002A (en) Engine tone quality control system
US20200143810A1 (en) Control apparatus, control method, agent apparatus, and computer readable storage medium
CN114093354A (en) Method and system for improving recognition accuracy of vehicle-mounted voice assistant
CN114446322A (en) Emotion adjustment system and emotion adjustment method
CN113665514A (en) Vehicle service system and service method thereof
KR102537879B1 (en) Active Control System of Dual Mic for Car And Method thereof
JPH11352987A (en) Voice recognition device
US20200180533A1 (en) Control system, server, in-vehicle control device, vehicle, and control method
CN111119645A (en) Vehicle window control method, device, equipment and computer readable storage medium
WO2022242589A1 (en) Method and apparatus for controlling rear-view mirror of vehicle, and vehicle and storage medium
US11673512B2 (en) Audio processing method and system for a seat headrest audio system
CN109922397A (en) Audio intelligent processing method, storage medium, intelligent terminal and smart bluetooth earphone
US10997442B2 (en) Control apparatus, control method, agent apparatus, and computer readable storage medium
CN113665511A (en) Vehicle control method and device and computer readable storage medium
KR102561458B1 (en) Voice recognition based vehicle control method and system therefor
WO2023112114A1 (en) Communication system, information processing device, information processing method, program, and recording medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination