CN111402925B - Voice adjustment method, device, electronic equipment, vehicle-mounted system and readable medium


Info

Publication number
CN111402925B
CN111402925B (application CN202010172637.4A)
Authority
CN
China
Prior art keywords
voice
vehicle
information
person
environmental information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010172637.4A
Other languages
Chinese (zh)
Other versions
CN111402925A
Inventor
李黎萍 (Li Liping)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Apollo Zhilian Beijing Technology Co Ltd
Original Assignee
Apollo Zhilian Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Apollo Zhilian Beijing Technology Co Ltd
Priority to CN202010172637.4A
Publication of CN111402925A
Application granted
Publication of CN111402925B
Legal status: Active

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48: Speech or voice analysis techniques specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques specially adapted for particular use, for comparison or discrimination
    • G10L25/63: Speech or voice analysis techniques for comparison or discrimination, for estimating an emotional state
    • G08: SIGNALLING
    • G08G: TRAFFIC CONTROL SYSTEMS
    • G08G1/00: Traffic control systems for road vehicles
    • G08G1/09: Arrangements for giving variable traffic instructions
    • G08G1/0962: Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages

Abstract

Embodiments of the present disclosure provide a voice adjustment method and device. One embodiment comprises: acquiring environment information of a vehicle; acquiring state information of at least one person on the vehicle; determining a voice adjustment strategy based on the environment information and the state information; and adjusting parameters of the voice to be played on the vehicle according to the strategy. The embodiment judges how urgent the situation is from conditions inside and outside the vehicle and from road conditions, then determines from the emotional state of the driver or passengers whether the occupants need to be calmed, and applies pitch- and speed-changing processing to the voice accordingly. While preserving the integrity of a single voice persona, the voice is given appropriate emotional feedback, yielding safer in-vehicle voice interaction. The method can be applied to assisted-driving and driverless scenarios.

Description

Voice adjustment method, device, electronic equipment, vehicle-mounted system and readable medium
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular to a voice adjustment method and device.
Background
With the rapid development of computer technology and artificial intelligence, vehicle navigation and assisted driving are ever more widely used in automobile driving. Early vehicle-mounted voice systems commonly employed a single, mechanical voice prompt, i.e., one fixed voice (fixed speech rate, pitch, and intonation), which can give poor prompting results and annoy users. Surveys show, for example, that some users prefer a sweet, gentle female voice (slow speech, high pitch, varied intonation), while others prefer broadcasts in the voice of a particular celebrity. To meet such personalized needs, speech synthesis can be used to imitate a specific person's voice: a limited amount of that person's speech is collected first, and artificial-intelligence-based speech synthesis then produces target speech with that person's timbre.
It has further been found that for warning prompts during driving, a more serious and urgent intonation (fast speech rate, low pitch, stern intonation) elicits a stronger response from the user. The following solutions have been envisaged:
(1) Different functions call different voice packets; for example, voice packet 1 is called when waking up the voice dialogue system, and voice packet 2 when broadcasting navigation information. However, a function does not fully reflect the urgency of a scene, and large timbre differences between voice packets make the voice persona feel fractured.
(2) A real person records specific corpora; some vehicle-mounted voice systems record a larger corpus in an attempt to vary pitch and speed naturally across scenarios. However, the recording must anticipate a wide range of situations, so the recording workload is large.
It should be noted that the approaches described in this section are not necessarily approaches that have previously been conceived or pursued. Unless otherwise indicated, it should not be assumed that any approach described in this section qualifies as prior art merely by its inclusion here. Similarly, unless otherwise indicated, the problems mentioned in this section should not be assumed to have been recognized in any prior art.
Disclosure of Invention
Embodiments of the present disclosure provide a voice adjustment method and a voice adjustment device.
According to a first aspect of the present disclosure, embodiments provide a voice adjustment method, the method comprising: acquiring environment information of a vehicle; acquiring state information of at least one person on the vehicle; determining a voice adjustment strategy based on the environment information and the state information; and adjusting parameters of the voice to be played on the vehicle according to the voice adjustment strategy.
In some embodiments, the environment information includes at least one of in-vehicle environment information and out-of-vehicle environment information, and the out-of-vehicle environment information includes road condition information and/or advanced driver-assistance system (ADAS) information.
In some embodiments, the parameters include at least one of pitch, speech rate, and intonation.
In some embodiments, determining a voice adjustment strategy based on the environment information and the state information comprises: judging, from the environment information, the urgency of the prompt event corresponding to the voice to be played, and judging, from the state information, the emotional state of the at least one person on the vehicle.
In some embodiments, where the environment information includes in-vehicle environment information, the in-vehicle environment information is obtained by collecting vehicle component status information.
In some embodiments, where the environment information includes road condition information, the road condition information is acquired in one of the following ways: obtaining real-time high-precision road condition information from the cloud; or sensing conditions near the vehicle with a sensing camera and/or radar.
In some embodiments, the at least one person on the vehicle comprises the driver of the vehicle, and the state information is obtained in at least one of the following ways: capturing the facial expression of at least one person on the vehicle with a camera; collecting the speech of at least one person on the vehicle with a voice receiver; collecting the driver's driving actions with a driving-action collector; or recording the duration of the driver's current drive with a clock record.
In some embodiments, the method further comprises: establishing a voice adjustment strategy model in advance, the model comprising the correspondence between urgency, emotional state, and voice adjustment strategy, the strategy comprising a combination of frequency, speed, and intonation model curve.
In some embodiments, the intonation model curves include a serious intonation curve with a warning effect, a mild intonation curve with a pacifying effect, and an active intonation curve with an exciting effect.
In some embodiments, adjusting parameters of the voice to be played on the vehicle according to the voice adjustment strategy comprises: correspondingly adjusting the frequency and speech rate of the voice to be played, and correspondingly adjusting its intonation model, according to the combination of frequency, speed, and intonation model curve included in the determined strategy.
According to a second aspect of the present disclosure, embodiments provide a voice adjustment apparatus, comprising: a first acquisition unit configured to acquire environment information of a vehicle; a second acquisition unit configured to acquire state information of at least one person on the vehicle; a determining unit configured to determine a voice adjustment strategy based on the environment information and the state information; and an adjusting unit configured to adjust parameters of the voice to be played on the vehicle according to the voice adjustment strategy.
In some embodiments, the environment information includes at least one of in-vehicle environment information and out-of-vehicle environment information, and the out-of-vehicle environment information includes road condition information and/or ADAS information.
In some embodiments, the parameters include at least one of pitch, speech rate, and intonation.
In some embodiments, the determining unit is configured to judge, from the environment information, the urgency of the prompt event corresponding to the voice to be played, and to judge, from the state information, the emotional state of at least one person on the vehicle.
In some embodiments, where the environment information includes in-vehicle environment information, the first acquisition unit is configured to obtain it by collecting vehicle component status information; and where the environment information includes road condition information, the first acquisition unit is configured to acquire it in at least one of the following ways: obtaining real-time high-precision road condition information from the cloud; or sensing conditions near the vehicle with a sensing camera and/or radar.
In some embodiments, where the at least one person on the vehicle comprises the driver, the second acquisition unit is configured to acquire the state information in at least one of the following ways: capturing the facial expression of at least one person on the vehicle with a camera; collecting the speech of at least one person on the vehicle with a voice receiver; collecting the driver's driving actions with a driving-action collector; or recording the duration of the driver's current drive with a clock record.
In some embodiments, the determining unit is further configured to determine the voice adjustment strategy from a pre-established voice adjustment strategy model comprising the correspondence between urgency, emotional state, and strategy, the strategy comprising a combination of frequency, speed, and intonation model curve.
According to a third aspect of the present disclosure, embodiments of the present disclosure provide an electronic device comprising: one or more processors; a storage device having one or more programs stored thereon; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as described in any of the implementations of the first aspect.
According to a fourth aspect of the present disclosure, embodiments of the present disclosure provide an in-vehicle system comprising an electronic device as described in the third aspect.
According to a fifth aspect of the present disclosure, embodiments of the present disclosure provide a computer readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements a method as described in any of the implementations of the first aspect.
The voice adjustment method and apparatus provided by the embodiments of the present disclosure judge how urgent the situation is from conditions inside and outside the vehicle and from road conditions, then determine from the emotional state of the driver or passengers whether the occupants need to be calmed, and apply pitch- and speed-changing processing to the voice accordingly. While preserving the integrity of a single voice persona, the voice is given appropriate emotional feedback, yielding safer in-vehicle voice interaction. The disclosure is applicable to assisted-driving and driverless scenarios.
Drawings
Other features, objects and advantages of the present disclosure will become more apparent upon reading of the detailed description of non-limiting embodiments made with reference to the following drawings.
Fig. 1 is a flowchart of one embodiment of a voice adjustment method according to the present disclosure.
Fig. 2 is a schematic structural diagram of one embodiment of a voice adjustment apparatus according to the present disclosure.
Fig. 3 is a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.
Description of the embodiments
The present disclosure is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be combined with each other. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 shows the flow 100 of one embodiment of a voice adjustment method according to the present disclosure. The voice adjustment method comprises the following steps:
in step 101, environmental information of a vehicle is acquired.
In this step, the executing body of the voice adjustment method may acquire the environment information through a wired or wireless connection. In some embodiments, the environment information may include in-vehicle environment information and/or out-of-vehicle environment information. The in-vehicle environment information may include vehicle condition information, i.e., data related to the running state of the vehicle, such as data on the vehicle's own components: tire pressure, water temperature, fuel level, battery level, vehicle speed, and the like. As an example, these component states may be acquired by a tire pressure sensor, a temperature sensor, a fuel-level sensor, a battery-level sensor, a vehicle speed sensor, and so on. The out-of-vehicle environment information may include road condition information and advanced driver-assistance system (ADAS) information. Road condition information may, for example, describe traffic congestion; real-time high-precision road condition information can be obtained from the cloud and used in later steps to determine whether there is a traffic emergency (bad weather, road collapse, an ambulance needing right of way, etc.) or a route problem (a wrong turn, overspeed, etc.). ADAS information may describe the vehicle's surroundings, such as the distance to surrounding obstacles; the surroundings may be perceived by a sensing camera, radar, and the like, and the obstacle distance obtained, for example, by an ultrasonic radar. In some embodiments, where the disclosure is applied to an intelligent cockpit system, the sensing cameras and radars of the cockpit can be used to judge whether the vehicle is about to have an accident (colliding with another vehicle, a person, or an object) or is committing a driving violation (running a red light, crossing a lane line, departing from the lane), so as to warn the user or request a manual takeover. The specific contents and acquisition methods above are merely examples, and the disclosure is not limited thereto; those skilled in the art will appreciate that both may be extended according to specific needs.
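As an illustration only (the patent prescribes no programming interface, so the adapter objects and field names below are assumptions), the step-101 inputs might be aggregated as follows:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class EnvironmentInfo:
    # In-vehicle environment information: vehicle component states.
    tire_pressure_kpa: Optional[float] = None
    water_temp_c: Optional[float] = None
    fuel_level_pct: Optional[float] = None
    battery_pct: Optional[float] = None
    speed_kmh: Optional[float] = None
    # Out-of-vehicle environment information: road conditions and ADAS perception.
    traffic_congested: Optional[bool] = None      # from the cloud road-condition service
    obstacle_distance_m: Optional[float] = None   # from ultrasonic radar / sensing camera

def collect_environment(sensors, cloud, adas) -> EnvironmentInfo:
    """Gather step-101 data; `sensors`, `cloud`, and `adas` are hypothetical adapters."""
    return EnvironmentInfo(
        tire_pressure_kpa=sensors.tire_pressure(),
        water_temp_c=sensors.water_temperature(),
        fuel_level_pct=sensors.fuel_level(),
        battery_pct=sensors.battery_level(),
        speed_kmh=sensors.vehicle_speed(),
        traffic_congested=cloud.traffic_congested_nearby(),
        obstacle_distance_m=adas.nearest_obstacle_distance(),
    )
```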
In step 102, status information of at least one person on the vehicle is obtained.
In this step, the executing body of the voice adjustment method may acquire the state information through a wired or wireless connection, for example, state information of the current vehicle's driver or of a passenger. One or more persons on the vehicle may be regarded as users of the voice adjustment function; for simplicity, "user" hereinafter denotes a person on the vehicle, whether the driver or a passenger. It should be appreciated that the at least one person on the vehicle may include at least the driver of the vehicle.
In some embodiments, the user's state information may be any information useful for analyzing the user's emotional state, preferably information that makes it possible to detect particular emotional states, such as panic, fatigue, or sadness (low mood), effectively. As an example, the emotional state may be analyzed from the user's facial expression, the words the user speaks, the user's driving actions, the duration of the current drive, and so on. The facial expression may be captured by a camera, the spoken words collected by a voice receiver, the driver's driving actions collected by a driving-action collector, and the duration of the current drive obtained from a clock record. In some embodiments, the intelligent cockpit system to which the disclosure is applied may be equipped with such cameras, voice receivers, and driving-action collectors. The specific contents and acquisition methods above are merely examples, and the disclosure is not limited thereto; both may be extended according to specific needs.
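By way of a sketch (the field names are assumptions; the patent only names the signal sources), the step-102 signals could be bundled per person as:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PersonState:
    facial_expression: Optional[str] = None   # from the in-cabin camera
    utterance: Optional[str] = None           # from the voice receiver
    driving_actions: List[str] = field(default_factory=list)  # from the driving-action collector
    driving_minutes: Optional[float] = None   # from the clock record
    is_driver: bool = False
```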
Although step 101 is described before step 102, this does not limit the order of the two steps: as those skilled in the art will appreciate, they may be performed in the order described, in the reverse order, or simultaneously.
In step 103, a speech adjustment strategy is determined based on the context information and the status information.
In this step, the information acquired in the preceding steps is analyzed: the urgency of the situation is judged from the environment information, and the user's emotional state is judged from the user's state information. A background server may be relied on for more powerful analysis.
In some embodiments, the degree of urgency is divided into multiple levels in advance, the number of levels depending on need. As a simple example, it can be divided into urgent and not urgent. This is merely an example, and the disclosure is not limited thereto.
In some embodiments, the urgency level corresponding to a particular situation is set in advance. For example, the user may need to be prompted to refuel when the fuel level falls below a first threshold or below a smaller second threshold; the prompt event for the larger first threshold can be assigned a lower urgency level, and that for the smaller second threshold a higher one. Likewise, when a vehicle component fails and a fault prompt is required, the prompt event can be assigned a higher urgency level if the failed component is important enough to cause a traffic accident. A warning event for driving too fast without keeping a safe distance from the vehicle ahead should be set to a high urgency level, whereas a road-condition prompt about congestion can be set to a low one. In some embodiments, a database is established in advance that categorizes the various prompt events and records the corresponding urgency levels. When analyzing urgency from the vehicle's environment information, the urgency of the current situation, that is, of the prompt event corresponding to the voice to be played, can then be determined by querying this pre-established database.
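A minimal sketch of such a query (the event categories, level names, and default level are illustrative assumptions, not taken from the patent):

```python
# Hypothetical pre-built database: prompt-event category -> urgency level.
URGENCY_DB = {
    "fuel_below_first_threshold": "not_urgent",   # larger threshold, milder prompt
    "fuel_below_second_threshold": "urgent",      # smaller threshold, severe prompt
    "critical_component_failure": "urgent",
    "unsafe_following_distance": "urgent",
    "traffic_congestion": "not_urgent",
}

def urgency_of(event_category: str) -> str:
    # Unknown events default to the lower level (an assumption made for this sketch).
    return URGENCY_DB.get(event_category, "not_urgent")
```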
In this step, the user's emotional state is analyzed. It can be judged from the collected facial expression, speech, driving actions, driving duration, and the like: from a single item (for example, fatigue driving can be inferred from the driving duration alone), from several items together, or even in combination with other information (for example, that the user is currently anxious can be inferred jointly from the user's facial expression, specific utterances, and road condition information indicating congestion). In some embodiments, the analysis may additionally draw on a user-characteristics database on top of the collected facial expression, speech, driving actions, and/or driving duration, so that the emotional state can be determined more effectively and accurately. Based on the state information obtained in the previous step, it is judged in a predetermined manner whether the user is panicked, fatigued, normal, happy, sad, and so on. This processing may run in the cockpit intelligence system or be assisted by a cloud system.
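As a toy illustration of this signal fusion (the keyword list and the four-hour fatigue threshold are assumptions; a production system would use trained classifiers), the emotional state might be estimated like so:

```python
from typing import Optional

PANIC_KEYWORDS = ("watch out", "help", "brake")  # illustrative utterance cues

def estimate_emotion(facial_expression: Optional[str],
                     utterance: Optional[str],
                     driving_minutes: Optional[float],
                     traffic_congested: Optional[bool] = None) -> str:
    """Fuse face, speech, driving duration, and road conditions into one label."""
    if driving_minutes is not None and driving_minutes > 240:
        return "fatigue"  # long continuous driving suggests fatigue
    said_panic = bool(utterance) and any(k in utterance.lower() for k in PANIC_KEYWORDS)
    if facial_expression == "panic" or said_panic:
        return "panic"
    if facial_expression == "anxious" and traffic_congested:
        return "anxious"  # expression plus congestion, per the example above
    return "normal"
```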
In this step, a voice adjustment strategy is determined based on the urgency and the emotional state. The strategy comprises a combination of frequency, speed, and an intonation model curve: the frequency adjusts the pitch of the voice, the speed adjusts the speech rate, and the intonation model curve determines the intonation. Intonation is a combination of pitch movement and rhythm; in a happy, lively intonation the sentence-final pitch rises, whereas in a sedate intonation it falls, and a sentence typically has two pitch peaks and three low points. Applying different intonation model curves to the same voice therefore makes it express different intonations. As an example, the curves may include a serious intonation curve with a warning effect, a mild intonation curve with a pacifying effect, an active intonation curve with an exciting effect, and so on.
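One possible representation (purely illustrative; the patent does not define the curves mathematically) treats an intonation model curve as a pitch-offset function over normalized sentence position, shaped to give roughly two peaks and the sentence-final behavior described above:

```python
import math

# Each curve maps normalized sentence position t in [0, 1] to a pitch offset
# (in semitones). The amplitudes and sentence-final slopes are assumptions:
# the serious/warning curve falls at the end, the mild/pacifying curve stays
# narrow and nearly flat, and the active/exciting curve rises at the end.
INTONATION_CURVES = {
    "serious": lambda t: 1.0 * math.sin(2 * math.pi * 2 * t) - 2.0 * t,
    "mild":    lambda t: 0.5 * math.sin(2 * math.pi * 2 * t) - 0.5 * t,
    "active":  lambda t: 2.0 * math.sin(2 * math.pi * 2 * t) + 1.0 * t,
}
```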
In some embodiments, a voice adjustment strategy model is established in advance; it may be linear, non-linear, or hierarchical. As an example, the model includes a voice adjustment strategy table recording the correspondence between urgency, emotional state, and voice adjustment strategy. Table 1 below is an example of such a table, in which the urgency is either urgent or not urgent and the emotional states include normal, panic, fatigue, and so on. For instance, when the situation is not urgent but the user is in a panic state, the voice is processed with low pitch, slow speed, and the mild intonation curve; when the situation is urgent and the user is fatigued, the voice is processed with high pitch, fast speed, and the serious intonation curve.
TABLE 1

Urgency | Emotional state | Voice adjustment strategy
Urgent | Normal | Low pitch, fast, serious intonation
Urgent | Fatigue | High pitch, fast, serious intonation
Urgent | Panic | Low pitch, slow, mild intonation
Not urgent | Normal | No adjustment
Not urgent | Fatigue | High pitch, fast, serious intonation
Not urgent | Panic | Low pitch, slow, mild intonation
… | … | …
In some embodiments, voice adjustment strategies with suitable prompting effects may also be defined for further emotional states (e.g., anxious, sad, happy), and further intonation model curves may be developed, such as a soothing curve with a calming effect. The above is merely an example, and the disclosure is not limited thereto.
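Encoded as a lookup, Table 1 might read as follows (the tuple layout and the use of `None` for 'no adjustment' are conventions chosen for this sketch, not specified by the patent):

```python
from typing import Dict, Optional, Tuple

# (urgency, emotional state) -> (pitch, speed, intonation curve); None means no adjustment.
Policy = Optional[Tuple[str, str, str]]

POLICY_TABLE: Dict[Tuple[str, str], Policy] = {
    ("urgent", "normal"):      ("low", "fast", "serious"),
    ("urgent", "fatigue"):     ("high", "fast", "serious"),
    ("urgent", "panic"):       ("low", "slow", "mild"),
    ("not_urgent", "normal"):  None,
    ("not_urgent", "fatigue"): ("high", "fast", "serious"),
    ("not_urgent", "panic"):   ("low", "slow", "mild"),
}

def pick_policy(urgency: str, emotion: str) -> Policy:
    # States outside the table fall back to no adjustment (an assumption).
    return POLICY_TABLE.get((urgency, emotion))
```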
In step 104, according to the voice adjustment strategy, parameters of the voice to be played on the vehicle are adjusted.
In this step, one or more parameters of the voice to be played, such as pitch, intonation, and speech rate, are adjusted according to the determined voice adjustment strategy. In some embodiments, the frequency and speech rate of the voice to be played are adjusted, and the corresponding intonation model processing is applied, according to the combination of frequency, speed, and intonation model curve included in the determined strategy.
In some embodiments, the voice to be played may be the voice corresponding to a prompt event, called from a prefabricated voice packet. For example, when the fuel level falls below a certain threshold, a prompt event reminding the user to refuel is generated, and voice data matching its content is called from the prefabricated voice packet as the voice to be played. In other embodiments, the voice to be played may be voice data obtained from a background server according to user preference and the like. In still other embodiments, it may be a predetermined voice in an existing playing program, not necessarily associated with the environment information or user state acquired in real time. The contents and acquisition methods above are merely examples, and the disclosure is not limited thereto.
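Bringing the pieces together, here is a sketch of the step-104 adjustment using librosa as one possible signal-processing backend (the patent names no implementation; the semitone offsets and stretch rates are arbitrary assumptions):

```python
import librosa
import numpy as np

def apply_policy(wav_path: str, policy) -> np.ndarray:
    """Pitch-shift and time-stretch the prompt audio per the chosen strategy."""
    y, sr = librosa.load(wav_path, sr=None)
    if policy is None:          # the "no adjustment" row of Table 1
        return y
    pitch, speed, _curve = policy
    n_steps = -2.0 if pitch == "low" else 2.0   # semitones; assumed magnitudes
    rate = 0.85 if speed == "slow" else 1.15    # <1 slows playback, >1 speeds it up
    y = librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)
    y = librosa.effects.time_stretch(y, rate=rate)
    # Applying the intonation model curve would require time-varying pitch
    # modification (e.g., frame-wise shifting or TTS prosody control), omitted here.
    return y
```

In a deployed system this processing would more likely live inside the speech synthesizer's prosody controls than in post-processing of recorded audio.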
Voice processed with the determined voice adjustment strategy carries appropriate emotional feedback when played, and can therefore achieve a better prompting effect.
With further reference to fig. 2, as an implementation of the method shown in fig. 1, the present disclosure provides an embodiment of a speech-modifying apparatus, which corresponds to the embodiment of the method shown in fig. 1, and which is particularly applicable in various electronic devices.
As shown in fig. 2, the voice adjustment apparatus 200 provided in this embodiment includes: a first acquisition unit 201 configured to acquire environment information of a vehicle; a second acquisition unit 202 configured to acquire state information of at least one person on the vehicle; a determining unit 203 configured to determine a voice adjustment strategy based on the environment information and the state information; and an adjusting unit 204 configured to adjust parameters of the voice to be played on the vehicle according to the voice adjustment strategy.
In the present embodiment, in the voice adjusting apparatus 200: the specific processes of the first obtaining unit 201, the second obtaining unit 202, the determining unit 203, and the adjusting unit 204 and the technical effects thereof may refer to the descriptions related to step 101, step 102, step 103, and step 104 in the corresponding embodiment of fig. 1, and are not repeated herein.
In some optional implementations of this embodiment, the environment information includes at least one of in-vehicle environment information and out-of-vehicle environment information, and the out-of-vehicle environment information includes road condition information and/or ADAS information.
In some optional implementations of this embodiment, the parameters include at least one of pitch, speech rate, and intonation.
In some optional implementations of this embodiment, the determining unit 203 may be configured to judge, from the environment information, the urgency of the prompt event corresponding to the voice to be played, and to judge, from the state information, the emotional state of at least one person on the vehicle.
In some optional implementations of this embodiment, where the environment information includes in-vehicle environment information, the first acquisition unit 201 may be configured to obtain it by collecting vehicle component status information; and where the environment information includes road condition information, the first acquisition unit 201 may be configured to acquire it in at least one of the following ways: obtaining real-time high-precision road condition information from the cloud; or sensing conditions near the vehicle with a sensing camera and/or radar.
In some optional implementations of this embodiment, the at least one person on the vehicle includes a driver of the vehicle, and wherein the second obtaining unit 202 may be configured to obtain the status information in at least one of: acquiring facial expressions of at least one person on the vehicle through a camera; collecting the language of at least one person on the vehicle through a voice receiver; collecting the driving action of the driver through a driving action collector; or collecting the duration of the current driving of the driver through clock record.
In some optional implementations of this embodiment, the determining unit 203 may be further configured to determine a speech adjustment policy according to a pre-established speech adjustment policy model, the speech adjustment policy model including the correspondence of the urgency level, the emotional state, and the speech adjustment policy, the speech adjustment policy including a combination of frequency, speed, and intonation model curves. The adjusting unit 204 may be configured to adjust the frequency and the speech speed of the speech to be played according to the combination of the frequency, the speed and the intonation model curve included in the determined speech adjustment strategy, and adjust the intonation model of the speech to be played accordingly.
In some alternative implementations of this embodiment, the intonation model curves may include serious intonation model curves with alert effects, mild intonation model curves with pacifying effects, and active intonation model curves with exciting effects.
The voice adjustment method and apparatus provided by the embodiments of the present disclosure can process the voice to be played based on the in-vehicle situation: they judge how urgent the situation is from conditions inside and outside the vehicle and from road conditions, then determine from the emotional state of the driver or passengers whether the occupants need to be calmed, and apply pitch- and speed-changing processing to the voice accordingly. While preserving the integrity of a single voice persona, the voice is given appropriate emotional feedback, yielding safer in-vehicle voice interaction. The approach can be applied to assisted-driving and driverless scenarios.
Referring now to fig. 3, a schematic diagram of an electronic device 300 suitable for implementing embodiments of the present disclosure is shown. The electronic device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as a vehicle-mounted terminal (e.g., a car navigation terminal), a mobile phone, a notebook computer, a tablet computer (PAD), and the like. The electronic device shown in fig. 3 is merely an example and should not impose any limitation on the functionality or scope of use of embodiments of the present disclosure.
As shown in fig. 3, the electronic device 300 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 301 that may perform various suitable actions and processes in accordance with a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage means 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data required for the operation of the electronic apparatus 300 are also stored. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
In general, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touchpad, camera, accelerometer, and gyroscope; output devices 307 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; storage devices 308 including, for example, flash memory; and communication devices 309, which may allow the electronic device 300 to communicate wirelessly or by wire with other devices to exchange data. While fig. 3 shows an electronic device 300 with various components, it should be understood that not all of the illustrated components are required; more or fewer may be implemented or provided instead. Each block shown in fig. 3 may represent one device or several devices, as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via a communication device 309, or installed from a storage device 308, or installed from a ROM 302. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing means 301.
It should be noted that the computer-readable medium of the embodiments of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction-execution system, apparatus, or device. A computer-readable signal medium, by contrast, may comprise a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction-execution system, apparatus, or device. Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to electrical wire, optical cable, RF (radio frequency), and the like, or any suitable combination thereof.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring environment information of a vehicle; acquiring state information of at least one person on the vehicle; determining a voice adjustment strategy based on the environmental information and the status information; and adjusting parameters of the voice to be played on the vehicle according to the voice adjusting strategy.
The voice adjustment apparatus may be part of an in-vehicle system or a driver-assistance system, for example part of an advanced driver-assistance system (ADAS), and may be realized as a function of that system.
Computer program code for carrying out operations of embodiments of the present disclosure may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the C language or similar. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the remote-computer case, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments described in the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The described units may also be provided in a processor, for example, described as: a processor includes a first acquisition unit, a second acquisition unit, a determination unit, and an adjustment unit. The names of these units do not constitute a limitation on the unit itself in some cases, and for example, the first acquisition unit may also be described as "a unit that acquires environmental information of a vehicle".
The foregoing description covers only the preferred embodiments of the present disclosure and explains the technical principles employed. Those skilled in the art will appreciate that the scope of the invention in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features; it also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, solutions in which the above features are interchanged with (but not limited to) features having similar functions disclosed in the embodiments of the present disclosure.

Claims (16)

1. A voice adjustment method, the method comprising:
pre-establishing a first database, the first database comprising correspondences among vehicle environmental information, prompt event categories, and emergency degrees of prompt events;
pre-establishing a voice adjustment strategy model, the voice adjustment strategy model comprising correspondences among the emergency degree of a prompt event, the emotional state of a person on a vehicle, and the voice adjustment strategy, the voice adjustment strategy comprising a combination of frequency, speed, and intonation model curves;
acquiring first environmental information of a first vehicle;
acquiring first status information of at least one person on the first vehicle;
determining a target voice adjustment strategy based on the first environmental information and the first status information, comprising:
querying the first database according to the first environmental information to determine a first prompt event corresponding to the voice to be played and a first emergency degree of the first prompt event;
judging a first emotion state of the at least one person according to the first status information, the first environmental information, and the user characteristic data of each of the at least one person recorded in a user characteristic database; and
determining a target voice adjustment strategy corresponding to the first emergency degree and the first emotion state according to the voice adjustment strategy model, wherein the first emergency degree comprises urgent or not urgent; the first emotion state comprises panic; and the target voice adjustment strategy comprises the frequency of the voice to be played being lower than a preset frequency threshold, the speed of the voice to be played being lower than a preset speed threshold, and the intonation model curve applying a mild intonation model curve with a pacifying effect;
and adjusting parameters of the voice to be played on the first vehicle according to the target voice adjustment strategy.
2. The voice adjustment method of claim 1, wherein the first environmental information includes at least one of in-vehicle environmental information and out-of-vehicle environmental information, and wherein the out-of-vehicle environmental information includes road condition information and/or advanced driver-assistance system (ADAS) information.
3. The voice adjustment method of claim 1 or 2, wherein the parameters include at least one of pitch, speech rate, and intonation.
4. The voice adjustment method of claim 2, wherein, in the case where the first environmental information includes in-vehicle environmental information, the in-vehicle environmental information is acquired by collecting vehicle component status information.
5. The voice adjustment method of claim 2, wherein, in the case where the first environmental information includes road condition information, the road condition information is acquired in one of the following ways:
acquiring real-time high-precision road condition information through a cloud; or
sensing conditions near the vehicle by a sensing camera and/or radar.
6. The voice adjustment method of claim 1, wherein the at least one person includes a driver of the vehicle, and wherein the first status information is acquired in at least one of the following ways:
collecting facial expressions of the at least one person through a camera;
collecting speech of the at least one person through a voice receiver;
collecting driving actions of the driver through a driving action collector; or
collecting the duration of the driver's current drive through a clock record.
7. The voice adjustment method of claim 1, wherein the intonation model curves include a serious intonation model curve with a warning effect, a mild intonation model curve with a pacifying effect, and an active intonation model curve with an exciting effect.
8. The voice adjustment method of claim 1 or 7, wherein adjusting the parameters of the voice to be played on the first vehicle according to the target voice adjustment strategy comprises:
adjusting the frequency and speech rate of the voice to be played, and correspondingly adjusting its intonation model, according to the combination of frequency, speed, and intonation model curve included in the target voice adjustment strategy.
9. A voice adjustment apparatus, comprising:
a first acquisition unit configured to acquire first environmental information of a first vehicle;
a second acquisition unit configured to acquire first status information of at least one person on the first vehicle;
a determining unit configured to determine a target voice adjustment strategy based on the first environmental information and the first status information, comprising:
querying a pre-established first database according to the first environmental information to determine a first prompt event corresponding to the voice to be played and a first emergency degree of the first prompt event, wherein the first database comprises correspondences among vehicle environmental information, prompt event categories, and emergency degrees of prompt events;
judging a first emotion state of the at least one person according to the first status information, the first environmental information, and the user characteristic data of each of the at least one person recorded in a user characteristic database; and
determining a target voice adjustment strategy corresponding to the first emergency degree and the first emotion state according to a pre-established voice adjustment strategy model, wherein the voice adjustment strategy model comprises correspondences among the emergency degree of a prompt event, the emotional state of a person on a vehicle, and the voice adjustment strategy; the voice adjustment strategy comprises a combination of frequency, speed, and intonation model curves; the first emergency degree comprises urgent or not urgent; the first emotion state comprises panic; and the target voice adjustment strategy comprises the frequency of the voice to be played being lower than a preset frequency threshold, the speed of the voice to be played being lower than a preset speed threshold, and the intonation model curve applying a mild intonation model curve with a pacifying effect;
and an adjusting unit configured to adjust parameters of the voice to be played on the first vehicle according to the target voice adjustment strategy.
10. The voice adjustment apparatus of claim 9, wherein the first environmental information includes at least one of in-vehicle environmental information and out-of-vehicle environmental information, and wherein the out-of-vehicle environmental information includes road condition information and/or ADAS information.
11. The voice adjustment apparatus of claim 9 or 10, wherein the parameters include at least one of pitch, speech rate, and intonation.
12. The voice adjustment apparatus of claim 10, wherein, in the case where the first environmental information includes in-vehicle environmental information, the first acquisition unit is configured to acquire the in-vehicle environmental information by collecting vehicle component status information; and wherein, in the case where the first environmental information includes road condition information, the first acquisition unit is configured to acquire the road condition information in at least one of the following ways:
acquiring real-time high-precision road condition information through a cloud; or
sensing conditions near the vehicle by a sensing camera and/or radar.
13. The voice adjustment apparatus of claim 9, wherein the at least one person includes a driver of the vehicle, and wherein the second acquisition unit is configured to acquire the first status information in at least one of the following ways:
collecting facial expressions of the at least one person through a camera;
collecting speech of the at least one person through a voice receiver;
collecting driving actions of the driver through a driving action collector; or
collecting the duration of the driver's current drive through a clock record.
14. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-8.
15. An in-vehicle system comprising the electronic device of claim 14.
16. A computer readable medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1-8.
Application CN202010172637.4A (priority date 2020-03-12, filed 2020-03-12): Voice adjustment method, device, electronic equipment, vehicle-mounted system and readable medium. Status: Active. Granted as CN111402925B.

Priority Applications (1)

CN202010172637.4A (priority date 2020-03-12, filed 2020-03-12): Voice adjustment method, device, electronic equipment, vehicle-mounted system and readable medium (granted as CN111402925B)

Applications Claiming Priority (1)

CN202010172637.4A (priority date 2020-03-12, filed 2020-03-12): Voice adjustment method, device, electronic equipment, vehicle-mounted system and readable medium (granted as CN111402925B)

Publications (2)

CN111402925A, published 2020-07-10
CN111402925B, published 2023-10-10

Family

Family ID: 71430758

Family Applications (1)

CN202010172637.4A (Active): Voice adjustment method, device, electronic equipment, vehicle-mounted system and readable medium (granted as CN111402925B)

Country Status (1)

Country Link
CN (1) CN111402925B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112033417B (en) * 2020-09-29 2021-08-24 北京深睿博联科技有限责任公司 Real-time navigation method and device for visually impaired people
CN112349299A (en) * 2020-10-28 2021-02-09 维沃移动通信有限公司 Voice playing method and device and electronic equipment
CN112418162B (en) * 2020-12-07 2024-01-12 安徽江淮汽车集团股份有限公司 Method, device, storage medium and apparatus for controlling vehicle
CN112837552A (en) * 2020-12-31 2021-05-25 北京梧桐车联科技有限责任公司 Voice broadcasting method and device and computer readable storage medium
CN112776710A (en) * 2021-01-25 2021-05-11 上汽通用五菱汽车股份有限公司 Sound effect adjusting method, sound effect adjusting system, vehicle-mounted device system and storage medium
CN114360241B (en) * 2021-12-10 2023-05-16 斑马网络技术有限公司 Vehicle interaction method, vehicle interaction device and storage medium
WO2023236691A1 (en) * 2022-06-08 2023-12-14 Pateo Connect+ Technology (Shanghai) Corporation Control method based on vehicle external audio system, vehicle intelligent marketing method, electronic apparatus, and storage medium
CN115460031B (en) * 2022-11-14 2023-04-11 深圳市听见时代科技有限公司 Intelligent sound control supervision system and method based on Internet of things

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102874259A (en) * 2012-06-15 2013-01-16 浙江吉利汽车研究院有限公司杭州分公司 Automobile driver emotion monitoring and automobile control system
CN105895095A (en) * 2015-02-12 2016-08-24 哈曼国际工业有限公司 Adaptive interactive voice system
CN106627589A (en) * 2016-12-27 2017-05-10 科世达(上海)管理有限公司 Vehicle driving safety auxiliary method and system and vehicle
CN106650633A (en) * 2016-11-29 2017-05-10 上海智臻智能网络科技股份有限公司 Driver emotion recognition method and device
CN106652378A (en) * 2015-11-02 2017-05-10 比亚迪股份有限公司 Driving reminding method and system for vehicle, server and vehicle
CN106803423A (en) * 2016-12-27 2017-06-06 智车优行科技(北京)有限公司 Man-machine interaction sound control method, device and vehicle based on user emotion state
CN107117174A (en) * 2017-03-29 2017-09-01 昆明理工大学 A kind of driver's mood monitoring active safety guide device circuit system and its control method
CN108847239A (en) * 2018-08-31 2018-11-20 上海擎感智能科技有限公司 Interactive voice/processing method, system, storage medium, engine end and server-side
CN108875682A (en) * 2018-06-29 2018-11-23 百度在线网络技术(北京)有限公司 Information-pushing method and device
CN109036405A (en) * 2018-07-27 2018-12-18 百度在线网络技术(北京)有限公司 Voice interactive method, device, equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8442832B2 (en) * 2008-12-08 2013-05-14 Electronics And Telecommunications Research Institute Apparatus for context awareness and method using the same
JP6466385B2 (en) * 2016-10-11 2019-02-06 本田技研工業株式会社 Service providing apparatus, service providing method, and service providing program


Also Published As

CN111402925A, published 2020-07-10


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
TA01: Transfer of patent application right
    Effective date of registration: 2021-10-14
    Address after: 100176 101, floor 1, building 1, yard 7, Ruihe West 2nd Road, Beijing Economic and Technological Development Zone, Daxing District, Beijing
    Applicant after: Apollo Zhilian (Beijing) Technology Co., Ltd.
    Address before: 2/F, Baidu Building, 10 Shangdi 10th Street, Haidian District, Beijing 100085
    Applicant before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co., Ltd.
GR01: Patent grant
GR01 Patent grant