CN115243104A - Method and system for automatically adjusting vehicle-mounted multimedia volume - Google Patents


Publication number
CN115243104A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111438420.4A
Other languages
Chinese (zh)
Inventor
庞健宇
李太华
刘涵昱
张亚
陈俊伊
于成龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Automobile Group Co Ltd
Original Assignee
Guangzhou Automobile Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Automobile Group Co Ltd
Priority to CN202111438420.4A
Publication of CN115243104A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213 Monitoring of end-user related data
    • H04N21/44218 Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program

Abstract

The invention discloses a method and a system for automatically adjusting the volume of vehicle-mounted multimedia. The method comprises the following steps: S1, when a passenger monitoring system detects the opening and closing of a passenger's mouth, it acquires corresponding characteristic audio according to the real-time mouth shape and sends the characteristic audio, together with a monitoring signal, to a vehicle-mounted voice recognition system; S2, the vehicle-mounted voice recognition system receives the monitoring signal sent by the passenger monitoring system and compares the coincidence degree of the in-vehicle real-time audio and the characteristic audio; and S3, when the coincidence degree of the in-vehicle real-time audio and the characteristic audio reaches a preset threshold, the vehicle-mounted voice recognition system triggers a reduction of the volume of the audio played by the vehicle-mounted multimedia device. The invention does not need to recognize the sentences that a specific combination of mouth shapes might produce; it only needs to output characteristic audio according to the real-time mouth shape for comparison with the in-vehicle real-time audio, so it can respond quickly and automatically to the need for volume adjustment and improve the riding experience.

Description

Method and system for automatically adjusting vehicle-mounted multimedia volume
Technical Field
The invention belongs to the technical field of intelligent networked automobiles, and particularly relates to a method and a system for automatically adjusting vehicle-mounted multimedia volume.
Background
In current intelligent cockpit technology, the OMS (Occupant Monitoring System) is gradually becoming widespread, but it offers relatively few functions, such as automatically opening a window when an occupant smokes, reminding occupants of articles left in the vehicle, monitoring passenger emotion, and basic gesture recognition, so its feature set is not rich. Many vehicle manufacturers fit an OMS only as reserved hardware, without a rich set of related functions or a complete development plan.
During a ride, if a passenger talks with the driver while music is playing in the vehicle, the first sentence often cannot be heard clearly. The music or multimedia volume has to be turned down manually to carry on the conversation, and turned up again manually after the conversation ends; the whole process degrades both the conversation between occupants and the music listening experience. Although in-vehicle voice recognition is already in use, current in-vehicle voice recognition technology cannot correctly determine whether a sound in the cabin comes from passengers talking, and it cannot monitor speech content in real time before it is woken by a wake word. Therefore, automatically lowering and raising the volume of background music or media cannot be achieved by in-vehicle voice recognition alone.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is to provide a method and a system for automatically adjusting the volume of a vehicle-mounted multimedia, so as to improve the riding experience.
In order to solve the technical problem, the invention provides a method for automatically adjusting the volume of vehicle-mounted multimedia, which comprises the following steps:
S1, when a passenger monitoring system detects the opening and closing of a passenger's mouth, acquiring corresponding characteristic audio according to the real-time mouth shape, and sending the characteristic audio to a vehicle-mounted voice recognition system together with a monitoring signal;
S2, the vehicle-mounted voice recognition system receiving the monitoring signal sent by the passenger monitoring system and comparing the coincidence degree of the in-vehicle real-time audio and the characteristic audio;
and S3, when the coincidence degree of the in-vehicle real-time audio and the characteristic audio reaches a preset threshold, the vehicle-mounted voice recognition system triggering a reduction of the volume of the audio played by the vehicle-mounted multimedia device.
Further, in step S1, acquiring the corresponding characteristic audio according to the real-time mouth shape specifically includes inputting the real-time mouth shape into a trained neural network and outputting the characteristic audio corresponding to the real-time mouth shape, where the neural network is trained using, as inputs, a mouth-shape feature data set obtained by processing standard pronunciation videos and voice features extracted from the standard pronunciation videos.
Further, in step S2, the vehicle-mounted voice recognition system comparing the coincidence degree of the in-vehicle real-time audio and the characteristic audio specifically means comparing the coincidence degree of the waveform of the in-vehicle real-time audio and the waveform of the characteristic audio, including stretching and shrinking the waveform of the characteristic audio to match the waveform of the in-vehicle real-time audio; in step S3, when the coincidence degree of the wave peaks reaches a preset threshold, the vehicle-mounted voice recognition system recognizes that a passenger in the vehicle is speaking and triggers a reduction of the volume of the audio played by the vehicle-mounted multimedia device.
Further, step S2 further includes: the vehicle-mounted voice recognition system comparing the coincidence degree of the audio played by the vehicle-mounted multimedia device and the characteristic audio; step S3 further includes: when the coincidence degree of the audio played by the vehicle-mounted multimedia device and the characteristic audio reaches a preset threshold, the vehicle-mounted voice recognition system triggering a reduction of the volume of the vocals in the audio played by the vehicle-mounted multimedia device.
Further, if the opening and closing of a passenger's mouth is not detected in step S1, or if the coincidence degree of the in-vehicle real-time audio and the characteristic audio compared in step S2 does not reach the preset threshold, step S3 further includes: the vehicle-mounted voice recognition system comparing the in-vehicle noise energy with the energy of the audio played by the vehicle-mounted multimedia device, and triggering an increase of the volume of the audio played by the vehicle-mounted multimedia device when the in-vehicle noise energy is greater than the energy of the played audio.
Further, if in step S1 a plurality of passenger monitoring systems detect the opening and closing of the mouths of their corresponding passengers, then in step S2 the vehicle-mounted voice recognition system superimposes the characteristic audio sent by each passenger monitoring system and compares the coincidence degree of the in-vehicle real-time audio and the superimposed characteristic audio.
Further, after step S3, the method further includes: when the passenger monitoring system no longer detects the opening and closing of a passenger's mouth, the vehicle-mounted voice recognition system triggering an increase of the volume of the audio played by the vehicle-mounted multimedia device back to the initial volume.
The invention also provides a system for automatically adjusting the volume of vehicle-mounted multimedia, which comprises a passenger monitoring system and a vehicle-mounted voice recognition system, wherein:
the passenger monitoring system is used for acquiring corresponding characteristic audio according to the real-time mouth shape when the opening and closing of a passenger's mouth is detected, and sending the characteristic audio to the vehicle-mounted voice recognition system together with a monitoring signal;
the vehicle-mounted voice recognition system is used for receiving the monitoring signal sent by the passenger monitoring system and comparing the coincidence degree of the in-vehicle real-time audio and the characteristic audio, and, when the coincidence degree of the in-vehicle real-time audio and the characteristic audio reaches a preset threshold, triggering a reduction of the volume of the audio played by the vehicle-mounted multimedia device.
Further, the vehicle-mounted voice recognition system is also used for comparing the coincidence degree of the audio played by the vehicle-mounted multimedia device and the characteristic audio, and triggering a reduction of the volume of the vocals in the audio played by the vehicle-mounted multimedia device when that coincidence degree reaches a preset threshold.
Further, if the passenger monitoring system does not detect the opening and closing of a passenger's mouth, or if the compared coincidence degree of the in-vehicle real-time audio and the characteristic audio does not reach the preset threshold, the vehicle-mounted voice recognition system is also used for comparing the in-vehicle noise energy with the energy of the audio played by the vehicle-mounted multimedia device, and triggering an increase of the volume of the played audio when the in-vehicle noise energy is greater than the energy of the played audio.
Implementing the invention has the following beneficial effects: there is no need to recognize the sentences that a specific combination of mouth shapes might produce; it is only necessary to output characteristic audio according to the real-time mouth shape and compare its coincidence degree with the in-vehicle real-time audio, which reduces the computing power required, allows a quick and automatic response to the need for volume adjustment, and improves the riding experience. The audio volume can also be adjusted adaptively according to the in-vehicle noise level, reducing the influence of noise. After the passengers finish talking, the volume of the audio played by the vehicle-mounted multimedia device is restored, so that passengers can continue listening to it without interruption.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a method for automatically adjusting the volume of a vehicle-mounted multimedia according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments refers to the accompanying drawings, which are included to illustrate specific embodiments in which the invention may be practiced.
Referring to fig. 1, an embodiment of the present invention provides a method for automatically adjusting a volume of a vehicle-mounted multimedia, including:
S1, when a passenger monitoring system detects the opening and closing of a passenger's mouth, acquiring corresponding characteristic audio according to the real-time mouth shape, and sending the characteristic audio to a vehicle-mounted voice recognition system together with a monitoring signal;
S2, the vehicle-mounted voice recognition system receiving the monitoring signal sent by the passenger monitoring system and comparing the coincidence degree of the in-vehicle real-time audio and the characteristic audio;
and S3, when the coincidence degree of the in-vehicle real-time audio and the characteristic audio reaches a preset threshold, the vehicle-mounted voice recognition system triggering a reduction of the volume of the audio played by the vehicle-mounted multimedia device.
Specifically, in step S1, the passenger monitoring system (OMS) monitors the passenger's mouth shape in real time, and when the passenger's mouth opens and closes, the system acquires the likely characteristic audio according to the real-time mouth shape. Obtaining characteristic audio from a real-time mouth shape requires prior machine learning and belongs to the field of image recognition. The pre-learning method is as follows: process the standard pronunciation videos so that they all have the same frame rate, for example 30 frames per second; track the face in each video and extract the mouth regions; resize all mouth regions to the same size; splice them into a mouth-shape feature data set using 15 frames as one sample unit; and input the mouth-shape feature data set into a coupled 3D convolutional neural network. At the same time, voice features are extracted from the standard pronunciation videos using the FFmpeg framework, aligned with the mouth-shape features over the required duration, and input into the coupled 3D convolutional neural network; training then yields the trained neural network. In application, when the passenger monitoring system detects the opening and closing of a passenger's mouth, it inputs the real-time mouth shape into the trained neural network, which outputs the characteristic audio corresponding to the real-time mouth shape as the comparison object for the subsequent steps.
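The sample-unit step in the pre-learning method can be illustrated with a minimal sketch. It only shows how a sequence of extracted mouth-region frames might be grouped into 15-frame samples and how long each sample lasts at 30 frames per second; face tracking, resizing, and the neural network itself are out of scope. The function names and the choice to drop an incomplete trailing sample are assumptions, not details from the source.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")

FRAMES_PER_SAMPLE = 15  # sample unit size given in the description
FRAME_RATE = 30         # example frame rate given in the description (frames per second)


def split_into_samples(frames: Sequence[Frame], size: int = FRAMES_PER_SAMPLE) -> List[List[Frame]]:
    """Group consecutive mouth-region frames into fixed-size samples.

    Trailing frames that do not fill a whole sample are dropped (an assumption;
    the source does not say how a remainder is handled).
    """
    if size <= 0:
        raise ValueError("size must be positive")
    usable = len(frames) - len(frames) % size
    return [list(frames[i:i + size]) for i in range(0, usable, size)]


def sample_duration_seconds(size: int = FRAMES_PER_SAMPLE, fps: int = FRAME_RATE) -> float:
    """Duration of audio that one sample of mouth frames corresponds to."""
    return size / fps
```

With these example values, each sample covers half a second of video, which is the window over which the voice features would need to be aligned with the mouth-shape features.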
It should be noted that existing lip-reading models usually convert a sequence of consecutive lip image frames into the Chinese character sequence of a sentence (as an intermediate step, the lip image frames may first be mapped to the pinyin sequence of the sentence, which is then translated into the Chinese character sequence). Such a model can be applied to this embodiment by changing its input data so that voice features are used in place of the Chinese character sequence. Because the embodiment of the invention does not need to output sentences from the mouth shape, the computing power required is reduced.
In step S2, after receiving the signal from the OMS, the vehicle-mounted voice recognition system starts monitoring the sound in the vehicle to obtain the in-vehicle real-time audio. The vehicle-mounted voice recognition system then compares the coincidence degree of the in-vehicle real-time audio and the characteristic audio, specifically the coincidence degree of the waveform of the in-vehicle real-time audio and the waveform of the characteristic audio provided by the OMS. The waveform of the characteristic audio is stretched and shrunk to match the waveform of the in-vehicle real-time audio, and the highest coincidence degree is found. If the coincidence degree of the wave peaks reaches a preset threshold (for example, 70%), the system recognizes that a passenger in the vehicle is speaking, and the vehicle-mounted voice recognition system triggers a reduction of the vehicle-mounted multimedia volume.
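The source does not define how the coincidence degree of wave peaks is computed, so the following is only one possible sketch. It assumes waveforms are plain lists of amplitude samples, treats a peak as a strict local maximum, stretches or shrinks the characteristic waveform by linear interpolation over a few hypothetical scale factors, and scores each candidate as the fraction of characteristic peaks that have an in-vehicle peak within a small tolerance. Only the 70% threshold comes from the description; all names, scale factors, and the tolerance are illustrative.

```python
from typing import List, Sequence


def resample(wave: Sequence[float], factor: float) -> List[float]:
    """Stretch (factor > 1) or shrink (factor < 1) a waveform by linear interpolation."""
    if not wave:
        return []
    n_out = max(1, round(len(wave) * factor))
    if n_out == 1 or len(wave) == 1:
        return [wave[0]] * n_out
    out = []
    for i in range(n_out):
        pos = i * (len(wave) - 1) / (n_out - 1)
        lo = int(pos)
        hi = min(lo + 1, len(wave) - 1)
        frac = pos - lo
        out.append(wave[lo] * (1 - frac) + wave[hi] * frac)
    return out


def peak_indices(wave: Sequence[float]) -> List[int]:
    """Indices of strict local maxima."""
    return [i for i in range(1, len(wave) - 1) if wave[i - 1] < wave[i] > wave[i + 1]]


def peak_coincidence(feature: Sequence[float], cabin: Sequence[float], tolerance: int = 1) -> float:
    """Fraction of feature peaks that have a cabin peak within `tolerance` samples."""
    f_peaks = peak_indices(feature)
    if not f_peaks:
        return 0.0
    c_peaks = peak_indices(cabin)
    hits = sum(1 for p in f_peaks if any(abs(p - q) <= tolerance for q in c_peaks))
    return hits / len(f_peaks)


def best_coincidence(feature: Sequence[float], cabin: Sequence[float],
                     factors: Sequence[float] = (0.8, 0.9, 1.0, 1.1, 1.25)) -> float:
    """Highest peak coincidence over all stretch/shrink factors tried."""
    return max(peak_coincidence(resample(feature, f), cabin) for f in factors)


def occupant_is_speaking(feature: Sequence[float], cabin: Sequence[float],
                         threshold: float = 0.70) -> bool:
    """True when the best coincidence reaches the preset threshold (70% in the example)."""
    return best_coincidence(feature, cabin) >= threshold
```

A production system would more likely compare envelopes or spectral features and use a proper time-alignment method; the sketch is only meant to make the "stretch, match, take the highest coincidence, compare with a threshold" sequence concrete.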
Further, step S2 further includes: the vehicle-mounted voice recognition system comparing the coincidence degree of the audio played by the vehicle-mounted multimedia device and the characteristic audio; step S3 further includes: when the coincidence degree of the audio played by the vehicle-mounted multimedia device and the characteristic audio reaches a preset threshold, the vehicle-mounted voice recognition system triggering a reduction of the volume of the vocals in the audio played by the vehicle-mounted multimedia device. During a ride, the vehicle-mounted multimedia device may be playing music. If a passenger starts to sing along, the passenger monitoring system acquires the corresponding characteristic audio according to the passenger's real-time mouth shape and sends it to the vehicle-mounted voice recognition system, which compares the coincidence degree of the audio played by the vehicle-mounted multimedia device and the characteristic audio. When that coincidence degree reaches the preset threshold, the vehicle-mounted voice recognition system triggers a reduction of the volume of the vocals in the played audio, creating an accompaniment scene in which the passenger sings over the backing track.
If the opening and closing of a passenger's mouth is not detected in step S1, or if the coincidence degree of the in-vehicle real-time audio and the characteristic audio sent by the OMS, as compared in step S2, does not reach the preset threshold, then in step S3 the vehicle-mounted voice recognition system can adjust the volume of the audio played by the vehicle-mounted multimedia device according to the relation between the in-vehicle noise energy and the energy of the played audio. Specifically, the vehicle-mounted voice recognition system compares the in-vehicle noise energy with the energy of the audio played by the vehicle-mounted multimedia device, and triggers an increase of the playback volume when the in-vehicle noise energy is greater than the energy of the played audio. That is, in this scenario the voice recognition system of this embodiment adjusts the volume of the audio (for example, music or songs) played by the vehicle-mounted multimedia device according to the in-vehicle noise level, so as to reduce the influence of noise.
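The noise-based branch can be sketched as a simple energy comparison. The source does not say how energy is measured or by how much the volume is raised, so this sketch assumes mean squared amplitude as the energy measure, a hypothetical one-step increase, and a hypothetical maximum volume level.

```python
from typing import Sequence


def mean_energy(samples: Sequence[float]) -> float:
    """Mean squared amplitude of a block of samples (assumed energy measure)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def adjust_for_noise(volume: int, noise: Sequence[float], media: Sequence[float],
                     step: int = 1, max_volume: int = 30) -> int:
    """Raise the playback volume by one step when cabin noise is louder than the media.

    `step` and `max_volume` are illustrative values, not taken from the source.
    """
    if mean_energy(noise) > mean_energy(media):
        return min(max_volume, volume + step)
    return volume
```

Calling `adjust_for_noise` once per audio block would raise the volume gradually while noise dominates; how the in-vehicle noise is separated from the played audio in the microphone signal is not addressed by the source and is not modelled here.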
In addition, because each passenger in the vehicle is monitored individually by a passenger monitoring system, if the corresponding passenger monitoring systems detect that several passengers are opening and closing their mouths, then in step S2 the vehicle-mounted voice recognition system superimposes the characteristic audio sent by each passenger monitoring system and compares the coincidence degree of the in-vehicle real-time audio and the superimposed characteristic audio. The advantage is as follows: if several passengers are chatting in low voices, the coincidence degree between the in-vehicle real-time audio and the characteristic audio obtained from a single passenger's mouth shape may not reach the preset threshold, in which case the volume reduction cannot be triggered. By superimposing the characteristic audio obtained from each passenger's mouth shape and comparing the superimposed characteristic audio with the in-vehicle real-time audio, the preset threshold is reached more easily, so the volume of the audio played by the vehicle-mounted multimedia device is reduced and its influence on the passengers' conversation is lessened.
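Superimposing the characteristic audio from several passengers can be sketched as a sample-wise sum. The sketch assumes the tracks are already time-aligned lists of amplitude samples and treats a shorter track as silent after its end; the source does not specify either point, and the function name is hypothetical.

```python
from typing import List, Sequence


def superimpose(tracks: Sequence[Sequence[float]]) -> List[float]:
    """Sum several characteristic-audio tracks sample by sample.

    Shorter tracks contribute nothing past their last sample (an assumption).
    """
    if not tracks:
        return []
    length = max(len(t) for t in tracks)
    return [sum(t[i] for t in tracks if i < len(t)) for i in range(length)]
```

The superimposed track would then be compared with the in-vehicle real-time audio in the same way as a single passenger's characteristic audio.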
It can be understood that once the passengers finish talking, the passenger monitoring system no longer detects the opening and closing of a passenger's mouth, and the method further includes: the vehicle-mounted voice recognition system triggering an increase of the volume of the audio played by the vehicle-mounted multimedia device back to the initial volume. In other words, the volume that was reduced when the coincidence degree of the in-vehicle real-time audio and the characteristic audio reached the preset threshold is later raised to its level before adjustment, so the whole volume adjustment process completes automatically: the playback volume is lowered so that the passengers' conversation is not disturbed, and restored after the conversation ends so that passengers can continue listening to the audio without interruption.
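The lower-then-restore behaviour can be sketched as a small state holder that remembers the volume before adjustment. The class name, the method names, and the lowered volume level are hypothetical; the source only states that the volume is reduced when speech is recognized and returned to the initial volume once mouth movement is no longer detected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VolumeController:
    volume: int
    ducked_volume: int = 5          # hypothetical lowered level; not given in the source
    _initial: Optional[int] = None  # volume before adjustment, set while lowered

    def on_speech_detected(self) -> None:
        """Step S3: lower the volume once, remembering the level before adjustment."""
        if self._initial is None:
            self._initial = self.volume
            self.volume = min(self.volume, self.ducked_volume)

    def on_mouth_closed(self) -> None:
        """After the conversation: restore the volume to its initial level."""
        if self._initial is not None:
            self.volume = self._initial
            self._initial = None
```

Storing the initial volume only on the first detection keeps repeated detections during one conversation from overwriting it with the already lowered value.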
Corresponding to the method for automatically adjusting the volume of vehicle-mounted multimedia in the first embodiment of the invention, the second embodiment of the invention provides a system for automatically adjusting the volume of vehicle-mounted multimedia, which comprises a passenger monitoring system and a vehicle-mounted voice recognition system, wherein:
the passenger monitoring system is used for acquiring corresponding characteristic audio according to the real-time mouth shape when the opening and closing of a passenger's mouth is detected, and sending the characteristic audio to the vehicle-mounted voice recognition system together with a monitoring signal;
the vehicle-mounted voice recognition system is used for receiving the monitoring signal sent by the passenger monitoring system and comparing the coincidence degree of the in-vehicle real-time audio and the characteristic audio, and, when the coincidence degree of the in-vehicle real-time audio and the characteristic audio reaches a preset threshold, triggering a reduction of the volume of the audio played by the vehicle-mounted multimedia device.
Further, the vehicle-mounted voice recognition system is also used for comparing the coincidence degree of the audio played by the vehicle-mounted multimedia device and the characteristic audio, and triggering a reduction of the volume of the vocals in the audio played by the vehicle-mounted multimedia device when that coincidence degree reaches a preset threshold.
Further, if the passenger monitoring system does not detect the opening and closing of a passenger's mouth, or if the compared coincidence degree of the in-vehicle real-time audio and the characteristic audio does not reach the preset threshold, the vehicle-mounted voice recognition system is also used for comparing the in-vehicle noise energy with the energy of the audio played by the vehicle-mounted multimedia device, and triggering an increase of the volume of the played audio when the in-vehicle noise energy is greater than the energy of the played audio.
For the working principle and process of the present embodiment, please refer to the description of the first embodiment of the present invention, which is not repeated herein.
As can be seen from the above description, compared with the prior art, the beneficial effects of the present invention are as follows: there is no need to recognize the sentences that a specific combination of mouth shapes might produce; it is only necessary to output characteristic audio according to the real-time mouth shape and compare its coincidence degree with the in-vehicle real-time audio, which reduces the computing power required, allows a quick and automatic response to the need for volume adjustment, and improves the riding experience. The audio volume can also be adjusted adaptively according to the in-vehicle noise level, reducing the influence of noise. After the passengers finish talking, the volume of the audio played by the vehicle-mounted multimedia device is restored, so that passengers can continue listening to it without interruption.
The above discloses only preferred embodiments of the present invention and is not intended to limit the scope of the claims of the present invention.

Claims (10)

1. A method for automatically adjusting the volume of vehicle-mounted multimedia is characterized by comprising the following steps:
S1, when a passenger monitoring system detects the opening and closing of a passenger's mouth, acquiring corresponding characteristic audio according to the real-time mouth shape, and sending the characteristic audio to a vehicle-mounted voice recognition system together with a monitoring signal;
S2, the vehicle-mounted voice recognition system receiving the monitoring signal sent by the passenger monitoring system and comparing the coincidence degree of the in-vehicle real-time audio and the characteristic audio;
and S3, when the coincidence degree of the in-vehicle real-time audio and the characteristic audio reaches a preset threshold, the vehicle-mounted voice recognition system triggering a reduction of the volume of the audio played by the vehicle-mounted multimedia device.
2. The method according to claim 1, wherein in step S1, the obtaining of the corresponding characteristic audio according to the real-time mouth shape includes inputting the real-time mouth shape to a trained neural network, and outputting the characteristic audio corresponding to the real-time mouth shape, and the neural network is trained by using a mouth shape characteristic data set obtained by processing a standard pronunciation video and a voice characteristic extracted from the standard pronunciation video as input.
3. The method according to claim 1, wherein in step S2, the comparing of the coincidence of the in-vehicle real-time audio and the characteristic audio by the in-vehicle voice recognition system specifically comprises comparing the coincidence of the waveform of the in-vehicle real-time audio and the waveform of the characteristic audio, including stretching and shrinking the waveform of the characteristic audio to match the waveform of the in-vehicle real-time audio; in the step S3, when the contact ratio of the wave peaks reaches a preset threshold, the vehicle-mounted voice recognition system recognizes that the vehicle-mounted passenger is speaking, and triggers to reduce the volume of the vehicle-mounted multimedia device for playing the audio.
4. The method of claim 3, wherein the step S2 further comprises: the vehicle-mounted voice recognition system compares the coincidence degree of the audio played by the vehicle-mounted multimedia equipment with the characteristic audio; the step S3 further includes: when the coincidence degree of the audio played by the vehicle-mounted multimedia equipment and the characteristic audio reaches a preset threshold value, the vehicle-mounted voice recognition system triggers and reduces the volume of the voice in the audio played by the vehicle-mounted multimedia equipment.
5. The method according to claim 3, wherein if the opening and closing of the mouth of the occupant is not monitored in step S1, or when the coincidence ratio between the in-vehicle real-time audio and the characteristic audio is compared in step S2, the coincidence ratio between the in-vehicle real-time audio and the characteristic audio does not reach a preset threshold, the step S3 further comprises: the vehicle-mounted voice recognition system compares the in-vehicle noise energy with the audio energy played by the vehicle-mounted multimedia equipment, and triggers and increases the audio volume played by the vehicle-mounted multimedia equipment when the in-vehicle noise energy is larger than the audio energy played by the vehicle-mounted multimedia equipment.
6. The method according to claim 1, wherein if, in step S1, a plurality of passenger monitoring systems each detect opening and closing of the mouth of a corresponding passenger, then in step S2 the vehicle-mounted voice recognition system superimposes the characteristic audio sent by each passenger monitoring system and compares the coincidence degree of the in-vehicle real-time audio with the superimposed characteristic audio.
7. The method according to claim 1, wherein after step S3 the method further comprises: when the passenger monitoring system does not detect opening and closing of the passenger's mouth, the vehicle-mounted voice recognition system triggers an increase of the volume of the audio played by the vehicle-mounted multimedia device back to the initial volume.
8. A system for automatically adjusting vehicle-mounted multimedia volume, characterized by comprising a passenger monitoring system and a vehicle-mounted voice recognition system, wherein:
the passenger monitoring system is configured to, upon detecting opening and closing of a passenger's mouth, acquire a corresponding characteristic audio according to the real-time mouth shape and send the characteristic audio together with a monitoring signal to the vehicle-mounted voice recognition system;
the vehicle-mounted voice recognition system is configured to receive the monitoring signal sent by the passenger monitoring system, compare the coincidence degree of the in-vehicle real-time audio with the characteristic audio, and, when the coincidence degree of the in-vehicle real-time audio and the characteristic audio reaches a preset threshold, trigger a reduction of the volume of the audio played by the vehicle-mounted multimedia device.
9. The system according to claim 8, wherein the vehicle-mounted voice recognition system is further configured to compare the coincidence degree of the audio played by the vehicle-mounted multimedia device with the characteristic audio, and to trigger a reduction of the volume of the voice in the audio played by the vehicle-mounted multimedia device when that coincidence degree reaches a preset threshold.
10. The system according to claim 8, wherein if the passenger monitoring system does not detect opening and closing of the passenger's mouth, or if the compared coincidence degree of the in-vehicle real-time audio and the characteristic audio does not reach the preset threshold, the vehicle-mounted voice recognition system is further configured to compare the in-vehicle noise energy with the energy of the audio played by the vehicle-mounted multimedia device, and to trigger an increase of the volume of the audio played by the vehicle-mounted multimedia device when the in-vehicle noise energy is greater than the energy of the played audio.
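
The decision flow recited in the claims above can be illustrated with a minimal sketch. The claims do not specify how the coincidence degree is computed, so this example assumes a normalized correlation taken after linearly resampling the characteristic waveform to the length of the real-time waveform (one simple way to "stretch and shrink" it). The function names, the 0.8 threshold, the mean-square energy measure and the "hold" action are all hypothetical choices for illustration, not taken from the patent.

```python
import math
from typing import Sequence


def resample(wave: Sequence[float], n: int) -> list[float]:
    """Stretch or shrink a waveform to n samples by linear interpolation."""
    if n == 1 or len(wave) == 1:
        return [wave[0]] * n
    scale = (len(wave) - 1) / (n - 1)
    out = []
    for i in range(n):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(wave) - 1)
        frac = pos - lo
        out.append(wave[lo] * (1 - frac) + wave[hi] * frac)
    return out


def superimpose(waves: Sequence[Sequence[float]]) -> list[float]:
    """Sum equal-length characteristic waveforms sample by sample (claim 6)."""
    return [sum(samples) for samples in zip(*waves)]


def coincidence(real_time: Sequence[float], characteristic: Sequence[float]) -> float:
    """Assumed coincidence measure: normalized correlation in [-1, 1]."""
    fitted = resample(characteristic, len(real_time))
    dot = sum(a * b for a, b in zip(real_time, fitted))
    norm = math.sqrt(sum(a * a for a in real_time) * sum(b * b for b in fitted))
    return dot / norm if norm else 0.0


def energy(wave: Sequence[float]) -> float:
    """Mean-square energy of a waveform (an assumed energy measure)."""
    return sum(x * x for x in wave) / len(wave)


def decide_volume_action(
    mouth_moving: bool,
    real_time: Sequence[float],
    characteristic: Sequence[float],
    noise: Sequence[float],
    playback: Sequence[float],
    threshold: float = 0.8,  # hypothetical preset threshold
) -> str:
    """Return "reduce", "increase" or "hold" following claims 3, 5, 8 and 10."""
    # Passenger is judged to be speaking: lower the multimedia volume.
    if mouth_moving and coincidence(real_time, characteristic) >= threshold:
        return "reduce"
    # No speech recognized: raise the volume only if cabin noise dominates.
    if energy(noise) > energy(playback):
        return "increase"
    return "hold"
```

In this sketch, a caller would pass the microphone capture as `real_time` and the waveform derived from the mouth shape as `characteristic`; when several passengers are detected, the waveforms from each monitoring system could first be combined with `superimpose` before the comparison.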

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111438420.4A CN115243104A (en) 2021-11-30 2021-11-30 Method and system for automatically adjusting vehicle-mounted multimedia volume

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111438420.4A CN115243104A (en) 2021-11-30 2021-11-30 Method and system for automatically adjusting vehicle-mounted multimedia volume

Publications (1)

Publication Number Publication Date
CN115243104A true CN115243104A (en) 2022-10-25

Family

ID=83665934

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111438420.4A Pending CN115243104A (en) 2021-11-30 2021-11-30 Method and system for automatically adjusting vehicle-mounted multimedia volume

Country Status (1)

Country Link
CN (1) CN115243104A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102324035A (en) * 2011-08-19 2012-01-18 广东好帮手电子科技股份有限公司 Method and system of applying lip posture assisted speech recognition technique to vehicle navigation
CN107516534A (en) * 2017-08-31 2017-12-26 广东小天才科技有限公司 Voice information comparison method, device and terminal device
CN108146360A (en) * 2017-12-25 2018-06-12 出门问问信息科技有限公司 Method, apparatus, vehicle-mounted device and readable storage medium for vehicle control
CN109147820A (en) * 2018-08-30 2019-01-04 深圳市元征科技股份有限公司 Vehicle audio control method, device, electronic equipment and storage medium
CN109743461A (en) * 2019-01-29 2019-05-10 广州酷狗计算机科技有限公司 Audio data processing method, device, terminal and storage medium
CN112163547A (en) * 2020-10-13 2021-01-01 霍雨佳 Spoken language evaluation method based on deep learning
CN112397084A (en) * 2020-11-04 2021-02-23 佛吉亚歌乐电子(丰城)有限公司 Method for adaptively adjusting multimedia volume, vehicle-mounted terminal and computer storage medium
CN113157080A (en) * 2020-01-07 2021-07-23 宝马股份公司 Instruction input method for vehicle, storage medium, system and vehicle

Similar Documents

Publication Publication Date Title
CN108564942B (en) Voice emotion recognition method and system based on adjustable sensitivity
CN105161093B (en) Method and system for determining the number of speakers
CN101354887B (en) Ambient noise injection method for use in speech recognition
CN108146360A (en) Method, apparatus, vehicle-mounted device and readable storage medium for vehicle control
US6411927B1 (en) Robust preprocessing signal equalization system and method for normalizing to a target environment
CN112397065A (en) Voice interaction method and device, computer readable storage medium and electronic equipment
US20120197637A1 (en) Speech processing responsive to a determined active communication zone in a vehicle
DE102008062542A1 (en) In-vehicle condition-aware speech recognition
DE102017102392A1 (en) AUTOMATIC LANGUAGE RECOGNITION BY VOICE CHANNELS
CN113345433B (en) Voice interaction system outside vehicle
DE102017121059A1 (en) IDENTIFICATION AND PREPARATION OF PREFERRED EMOJI
US8438030B2 (en) Automated distortion classification
DE102019107624A1 (en) System and method for fulfilling a voice request
CN113035227A (en) Multi-modal voice separation method and system
CN105390136A (en) Vehicle control device and method used for user-adaptable service
DE102018103188A1 (en) Improved task completion in speech recognition
US20230169983A1 (en) Speech recognition
CN110696756A (en) Vehicle volume control method and device, automobile and storage medium
CN111261145B (en) Voice processing device, equipment and training method thereof
CN112382310A (en) Human voice audio recording method and device
CN113593601A (en) Audio-visual multi-modal voice separation method based on deep learning
CN112185357A (en) Device and method for simultaneously recognizing human voice and non-human voice
CN115243104A (en) Method and system for automatically adjusting vehicle-mounted multimedia volume
US11715457B1 (en) Real time correction of accent in speech audio signals
WO2020073839A1 (en) Voice wake-up method, apparatus and system, and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination