CN116229957A

CN116229957A - Multi-voice information fusion method, system and equipment for automobile cabin system and storage medium

Info

Publication number: CN116229957A
Application number: CN202310504699.4A
Authority: CN
Inventors: 胡东阳; 刘峰学; 王爱春; 黄少堂
Original assignee: Jiangling Motors Corp Ltd
Current assignee: Jiangling Motors Corp Ltd
Priority date: 2023-05-08
Filing date: 2023-05-08
Publication date: 2023-06-06

Abstract

The invention discloses a multi-voice information fusion method, system, equipment and storage medium for an automobile cabin system. In the main technical scheme, the cloud integrates the natural language understanding technologies of semantic recognition systems from different providers, compares the credibility of the recognition results of each system, arbitrates among them, and issues the arbitrated result to the vehicle end, where a vehicle-end client calls the corresponding function. This improves the functional coverage and corpus richness of the overall voice control function, and can also improve the voice recognition rate, conversational freedom and the like.

Description

Multi-voice information fusion method, system and equipment for automobile cabin system and storage medium

Technical Field

The invention relates to the technical field of automobile manufacturing, in particular to a method, a system, equipment and a storage medium for fusing multi-voice information of an automobile cabin system.

Background

With the development of Internet of Vehicles technology, the intelligent cabin has become the most widely applied direction at present, and voice interaction is one of its key functions. Whether the voice interaction experience is excellent is evaluated against criteria covering the wake-up rate, the recognition rate, corpus richness, function coverage, conversational freedom and the like.

Voice interaction systems in the prior art adopt a single voice scheme, which has shortcomings in wake-up rate, recognition rate, corpus richness, function coverage and conversational freedom, and which has difficulty matching the different voice control requirements of client applications from different ecosystems.

Disclosure of Invention

The present invention aims to solve at least one of the technical problems existing in the prior art. Therefore, the invention provides a method, a system, equipment and a storage medium for fusing multi-voice information of an automobile cabin system.

According to an embodiment of the first aspect of the invention, the multi-voice information fusion method of the automobile cabin system comprises the following steps:

step one: the vehicle end acquires a preset audio signal and sends the preset audio signal to a vehicle end system;

step two: receiving an echo audio signal fed back by the vehicle-end system based on the preset audio signal;

step three: the vehicle-end system performs environmental noise reduction on the echo audio signal to obtain a noise reduction audio signal;

step four: the noise reduction audio signal is sent to a cloud system, and the cloud system performs voice recognition on the noise reduction audio signal to obtain voice recognition information;

step five: the cloud system performs semantic reading on the voice recognition information to obtain semantic information;

step six: the semantic information is sent to a vehicle-end system, and the vehicle-end system compiles the semantic information to obtain an action instruction program;

step seven: according to the action instruction program, the vehicle-end system screens out the client application end that best matches the action instruction program and sends the action instruction program to that client application end.

According to the multi-voice information fusion method for the automobile cabin system, the echo audio signal fed back by the vehicle-end system is noise-reduced to obtain the noise reduction audio signal, and voice recognition is performed on the noise reduction audio signal to obtain voice recognition information. At this stage the cloud system can only recognize the characters and cannot understand their meaning. Semantic reading is therefore performed on the voice recognition information to obtain semantic information whose meaning both the cloud system and the vehicle-end system can understand. The semantic information is then converted into an action instruction that the vehicle-end system can execute, and the action instruction is finally transmitted to the corresponding vehicle-end client application, so that the multi-voice information fusion system can meet the requirements of client applications from different ecosystems.
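The data flow of steps three to seven can be sketched as a chain of functions. The following is only a minimal illustration under stated assumptions: every function name, the `SemanticInfo` type and the dictionary-based action format are hypothetical, and the noise reduction, voice recognition and semantic reading stages are placeholders that stand in for the real vehicle-end and cloud processing.

```python
from dataclasses import dataclass


@dataclass
class SemanticInfo:
    intent: str  # hypothetical machine-readable meaning, e.g. "open_music"
    text: str    # the recognized text the intent was derived from


def denoise(echo_audio):
    # Step three placeholder: a real system would apply environmental noise reduction.
    return list(echo_audio)


def recognize_speech(noise_reduced_audio):
    # Step four placeholder: stands in for cloud-side voice recognition.
    return "open music"


def read_semantics(text):
    # Step five placeholder: stands in for cloud-side semantic reading.
    return SemanticInfo(intent=text.replace(" ", "_"), text=text)


def compile_action(info):
    # Step six: turn semantic information into an action instruction program.
    return {"action": info.intent, "source_text": info.text}


def select_client(action, clients):
    # Step seven: return the first client application that supports the action.
    for name, supported_actions in clients.items():
        if action["action"] in supported_actions:
            return name
    raise LookupError(f"no client application handles {action['action']!r}")


def run_pipeline(echo_audio, clients):
    noise_reduced = denoise(echo_audio)
    text = recognize_speech(noise_reduced)
    info = read_semantics(text)
    action = compile_action(info)
    return select_client(action, clients), action
```

Calling `run_pipeline` with a dictionary that maps each client application name to the set of actions it supports returns the chosen client together with the action instruction that would be sent to it.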

According to some embodiments of the present invention, in the semantic reading, a plurality of semantic reading technical ends process the voice recognition information at the same time and their results are arbitrated. The rule of arbitration is to prioritize the voice recognition information, and the parameters that determine the priority include the completeness, clarity and execution degree of the semantics. This avoids the inaccurate recognition that a single semantic reading can cause and makes the recognition result more accurate.
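One possible way to express the arbitration rule is a weighted priority score over the three parameters named above: completeness (integrity), clarity (definition) and execution degree. The sketch below is an assumption-laden illustration: the `Candidate` type, the 0.0 to 1.0 score ranges and the weights are all hypothetical, since the text does not specify how the parameters are combined.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    engine: str              # which semantic reading technical end produced the result
    intent: str              # the meaning that engine extracted
    completeness: float      # 0.0 to 1.0: is the utterance a complete command?
    clarity: float           # 0.0 to 1.0: how clearly was it understood?
    execution_degree: float  # 0.0 to 1.0: how well can a client application act on it?


# Hypothetical weights; the source does not state how the parameters are weighted.
WEIGHTS = (0.4, 0.3, 0.3)


def priority(candidate):
    w_completeness, w_clarity, w_execution = WEIGHTS
    return (
        w_completeness * candidate.completeness
        + w_clarity * candidate.clarity
        + w_execution * candidate.execution_degree
    )


def arbitrate(candidates):
    # The candidate with the highest priority becomes the semantic information.
    if not candidates:
        raise ValueError("no candidates to arbitrate")
    return max(candidates, key=priority)
```

With these weights, an incomplete reading such as "open" loses to a complete reading such as "open music" even if the incomplete one was heard slightly more clearly.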

According to some embodiments of the present invention, in the screening, the vehicle-end system selects the client application end most suitable for running the action instruction program according to the semantic understanding capability and expertise of each client application end, which can improve the accuracy of the executed action.
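The screening can be pictured as a two-stage selection: first keep the client application ends that are able to understand the action at all, then prefer the one with the greatest expertise. The sketch below assumes a hypothetical `ClientApp` record with a set of understood domains and a per-domain expertise score; neither structure is defined in the text.

```python
from dataclasses import dataclass, field


@dataclass
class ClientApp:
    name: str
    domains: set = field(default_factory=set)      # domains the app can understand
    expertise: dict = field(default_factory=dict)  # domain -> hypothetical score, 0.0 to 1.0


def screen_client(action_domain, apps):
    # Stage 1: semantic understanding capability (can the app handle this domain?).
    capable = [app for app in apps if action_domain in app.domains]
    if not capable:
        return None
    # Stage 2: expertise (which capable app is best at this domain?).
    return max(capable, key=lambda app: app.expertise.get(action_domain, 0.0))
```

Returning `None` when no application qualifies leaves the fallback behaviour to the caller, which the text does not describe.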

According to some embodiments of the invention, the vehicle-end system is an Android, Linux, HarmonyOS (Hongmeng) or Microsoft system; mobile-end systems of these types operate more stably than other systems.

According to some embodiments of the present invention, the voice recognition is automatic speech recognition (ASR), which offers a higher recognition speed and higher recognition accuracy than a conventional voice recognition process.

According to some embodiments of the invention, before the step in which the vehicle end obtains the preset audio signal and sends it to the vehicle-end system, the method further includes: acquiring the host receiving frequency of the vehicle-end system, acquiring the international standard frequency offset, and determining the analog waveform frequency according to the host receiving frequency and the international standard frequency offset. Determining the analog waveform frequency with the international standard frequency offset makes recognition more accurate.

According to a second aspect of the present invention, a multi-voice information fusion system for an automobile cabin system includes:

the acquisition module is used for acquiring a preset audio signal and sending the preset audio signal to the vehicle-end system;

the receiving module is used for receiving an echo audio signal fed back by the vehicle-end system based on the preset audio signal;

the noise reduction module is used for carrying out environmental noise reduction on the echo audio signal to obtain a noise reduction audio signal, and sending the noise reduction audio signal to a cloud system;

the voice recognition module is used for receiving and processing the noise reduction audio signals to obtain voice recognition information and sending the voice recognition information to the cloud system;

the semantic processing module is used for receiving the voice recognition information and arbitrating the voice recognition information to obtain semantic information and sending the semantic information to the vehicle-end system;

the compiling module is used for compiling the semantic information on the vehicle-end system at the vehicle end to obtain an action instruction program;

and the screening module is used for screening out the client application end which is most suitable for matching with the action instruction program by utilizing a vehicle end system according to the action instruction program, and sending the action instruction program to the client application end.

Modular processing makes the system easier to debug and test, which improves the reliability of the software that performs multi-voice information fusion for the cabin system.

In addition, to achieve the above object, a multi-voice information fusion apparatus for an automobile cabin system according to an embodiment of the present invention is characterized in that the multi-voice information fusion apparatus for an automobile cabin system includes: the system comprises a memory, a processor and a cabin system multi-voice information fusion program stored on the memory and capable of running on the processor, wherein the cabin system multi-voice information fusion program is configured to realize the steps of the cabin system multi-voice information fusion method.

By applying the multi-voice information fusion system of the automobile cabin system to the multi-voice information fusion equipment of the automobile cabin system, the system can be used in the equipment.

According to the storage medium of the fourth aspect of the embodiment of the present invention, a cabin system multi-voice information fusion program is stored on the storage medium, and the cabin system multi-voice information fusion program realizes the steps of the cabin system multi-voice information fusion method when being executed by a processor.

Through the use of the storage medium, the information in the automobile cabin multi-voice information fusion device can be stored in time, and the user experience is improved.

Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a method for fusion of multiple voice messages for an automotive cabin system according to an embodiment of the present invention;

FIG. 2 is a functional block diagram of the multi-voice information fusion method for an automobile cabin system according to an embodiment of the present invention.

Detailed Description

The following detailed description of embodiments of the present invention is exemplary, with reference to the accompanying drawings, it being understood that the specific embodiments described herein are merely illustrative of the application and not intended to limit the application.

It will be understood that when an element is referred to as being "fixed to" another element, it can be directly on the other element or intervening elements may also be present. When an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.

Example 1

The invention relates to a multi-voice information fusion method of an automobile cabin system, which comprises the following steps:

step five: the cloud system performs semantic reading on the voice recognition information to obtain semantic information.

According to the steps of the multi-voice information fusion method of the automobile cabin system, when a command such as "open music" is issued, an echo audio signal is fed back for the acquired command audio signal based on the preset audio signal. The echo audio signal is sent to the noise reduction module for noise reduction processing to obtain the noise reduction audio signal, and the noise reduction audio signal is sent to the voice recognition module for voice recognition processing to obtain the voice recognition information. At this point the voice recognition information contains only the text "open music", and its meaning cannot yet be read. The voice recognition information is therefore sent to the semantic processing module for semantic reading. The semantic processing module uses a plurality of semantic reading technical ends to perform semantic understanding on the text "open music", and arbitration finally yields semantic information that contains the meaning of "open music" together with the corresponding understanding. For example, among "network music", "QQ music" and the "vehicle-end built-in music player", arbitration selects the semantic result, that is, the client application end, that best fits the meaning of "open music" at that moment, for example "QQ music". The semantic information is then converted into an action instruction program, for example one that opens "QQ music" and plays music at random, and the action instruction program is sent to the client application end, which executes the action.
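The final conversion in the "open music" example, from arbitrated semantic information to an action instruction program, might look like the following sketch. The dictionary keys (`intent`, `target_app`, `song`) and the step names (`launch`, `play`, `play_random`) are hypothetical; the text does not define the format of the action instruction program.

```python
def compile_action_program(semantic_info):
    """Turn arbitrated semantic information into an ordered list of action steps."""
    # Every program starts by launching the client application chosen by arbitration.
    steps = [{"op": "launch", "app": semantic_info["target_app"]}]
    if semantic_info["intent"] == "open_music":
        song = semantic_info.get("song")
        if song is None:
            # No song was named in the command, so fall back to random playback.
            steps.append({"op": "play_random"})
        else:
            steps.append({"op": "play", "song": song})
    return steps
```

For the semantic information `{"intent": "open_music", "target_app": "QQ music"}`, the function returns a launch step followed by a random playback step, mirroring the example in the paragraph above.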

In this embodiment, the semantic reading is performed by a plurality of semantic reading technical ends that process the voice recognition information at the same time and whose results are arbitrated. Specifically, the semantic reading process includes at least two semantic reading technical ends (for example, semantic reading technical end 1, semantic reading technical end 2, semantic reading technical end 3, etc.) that process the voice recognition information simultaneously, and the arbitration result obtained is the semantic information. The rule of arbitration is to prioritize the voice recognition information, and the parameters that determine the priority include the completeness, clarity and execution degree of the semantics. For example, "open" is not complete, whereas "open music" is complete. Clarity is judged comprehensively according to the on-site environment. The execution degree depends on the meaning recognized by the system, with preference given to the client application end most suitable for executing the instruction; for example, "high-definition" is suitable for executing a navigation instruction. As shown in FIG. 2, this avoids the inaccurate recognition caused by processing at a single semantic reading technical end and makes the recognition result more accurate.
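The simultaneous processing by several semantic reading technical ends can be sketched as a fan-out over a thread pool followed by arbitration. In the sketch below, `engine_a` and `engine_b` are hypothetical stand-ins for real semantic reading technical ends, and the single `score` field is a simplifying assumption that collapses the priority parameters into one number.

```python
from concurrent.futures import ThreadPoolExecutor


def engine_a(text):
    # Hypothetical engine that only understands the leading verb.
    return {"engine": "a", "intent": text.split()[0], "score": 0.4}


def engine_b(text):
    # Hypothetical engine that understands the whole phrase.
    return {"engine": "b", "intent": text.replace(" ", "_"), "score": 0.9}


def read_semantics_concurrently(text, engines):
    if not engines:
        raise ValueError("at least one semantic reading engine is required")
    # Submit the same recognized text to every engine at the same time.
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = [pool.submit(engine, text) for engine in engines]
        results = [future.result() for future in futures]
    # Arbitration: keep the result with the highest priority score.
    return max(results, key=lambda result: result["score"])
```

A thread pool suits this case because each engine call would mostly wait on a network response; the arbitration step only runs once every engine has answered.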

According to the multi-voice information fusion method of the automobile cabin system, after the vehicle-end system receives the arbitration result, it screens according to the semantic understanding capability and expertise of the client application ends and distributes the result to the voice service module (such as voice service module 1, voice service module 2, voice service module 3, etc.) of the semantic reading technical end most suitable for running the arbitration result. The voice service module then outputs the semantic information to the matching client application end (such as client application 1, client application 2, client application 3, etc.).

According to some embodiments of the invention, the vehicle-end system is an Android, Linux, HarmonyOS (Hongmeng) or Microsoft system, because mobile-end systems of these types operate more stably than other systems.

According to some embodiments of the present invention, the noise reduction process is a dual-microphone environmental noise reduction process. Dual-microphone environmental noise reduction uses two microphones to eliminate external noise through signal processing, so that the processed sound is considerably clearer. The design is based on the principle that superposed sound waves cancel each other: because sound propagates through the vibration of a medium, two waves with opposite waveforms cancel under theoretical conditions. A single-microphone environmental noise reduction process cannot achieve this.
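The cancellation principle can be illustrated numerically. The sketch below makes a strongly idealized assumption that is not stated in the text: the primary microphone captures speech plus noise, while the reference microphone captures exactly the same noise and no speech. The sample rate, frequencies and amplitudes are arbitrary illustrative values.

```python
import math


def sine(freq_hz, n_samples, sample_rate=8000, amplitude=1.0):
    # Generate n_samples of a sine wave at the given frequency.
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n_samples)
    ]


def cancel_noise(primary, reference):
    # Superpose the inverted reference signal onto the primary signal.
    return [p + (-r) for p, r in zip(primary, reference)]


speech = sine(300, 64)
noise = sine(1000, 64, amplitude=0.5)
primary_mic = [s + n for s, n in zip(speech, noise)]  # speech plus cabin noise
reference_mic = noise                                 # idealized: noise only

cleaned = cancel_noise(primary_mic, reference_mic)
# Largest remaining difference between the cleaned signal and the pure speech.
residual = max(abs(c - s) for c, s in zip(cleaned, speech))
```

Under this idealization the residual is at the level of floating-point rounding. In a real cabin the two microphones hear the noise with different delays and gains, so practical systems would need adaptive filtering rather than a plain subtraction.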

According to some embodiments of the present invention, the voice recognition module is an automatic voice recognition module, and the automatic voice recognition process has a higher recognition speed and a higher recognition accuracy than the normal voice recognition process.

In order to achieve the above object, the present invention further provides a system for fusion of multiple voice information of an automobile cabin system, where the system for fusion of multiple voice information of an automobile cabin system includes:

the noise reduction module is used for receiving and processing the echo audio signals to obtain noise reduction audio signals and sending the noise reduction audio signals to the cloud system;

the voice recognition module is used for receiving and processing the noise reduction audio signals to obtain voice recognition information and sending the voice recognition information to the cloud system;

the semantic processing module is used for receiving and processing the voice recognition information to obtain semantic information and sending the semantic information to the vehicle-end system;

the compiling module is used for receiving and processing the semantic information sent by the cloud system, obtaining an action instruction program and sending the action instruction program to the vehicle-end system;

In the semantic reading performed by the semantic processing module, a plurality of semantic reading technical ends process the voice recognition information at the same time and their results are arbitrated; the arbitration result obtained is the semantic information, the rule of arbitration is to prioritize the voice recognition information, and the parameters that determine the priority include the completeness, clarity and execution degree of the semantics.

In order to achieve the above object, the present invention further provides a multi-voice information fusion device for an automobile cabin system, wherein the multi-voice information fusion device for an automobile cabin system includes: the system comprises a memory, a processor and a cabin system multi-voice information fusion program stored on the memory and capable of running on the processor, wherein the cabin system multi-voice information fusion program is configured to realize the steps of the cabin system multi-voice information fusion method.

In order to achieve the above objective, the present invention further provides a storage medium of a multi-voice information fusion device for an automobile cabin system, wherein a multi-voice information fusion program for the cabin system is stored in the storage medium, and the multi-voice information fusion program for the cabin system realizes the steps of the multi-voice information fusion method for the cabin system when being executed by a processor.

In the description of the present invention, it should be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial", "circumferential", etc. indicate orientations or positional relationships based on those shown in the drawings. They are used merely for convenience in describing the present invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation or be configured and operated in a specific orientation; they therefore should not be construed as limiting the invention.

Claims

1. The multi-voice information fusion method of the automobile cabin system is characterized by comprising the following steps of:

step three: performing environmental noise reduction on the echo audio signal to obtain a noise reduction audio signal;

step four: transmitting a noise reduction audio signal to a cloud system, and performing voice recognition on the noise reduction audio signal to obtain voice recognition information;

step five: semantic reading is carried out on the voice recognition information to obtain semantic information;

step seven: according to the action instruction program, the vehicle-end system screens out the client application end which is most in line with the action instruction program, and sends the action instruction program to the client application end, so as to complete the multi-voice information fusion process of the automobile cabin system.

2. The method for fusion of multiple voice messages in an automotive cabin system according to claim 1, wherein: in step five, a plurality of semantic reading technical ends process the voice recognition information at the same time and their results are arbitrated, the arbitration result obtained is the semantic information, the rule of arbitration is to prioritize the voice recognition information, and the parameters that determine the priority include the completeness, clarity and execution degree of the semantics.

3. The method for fusion of multiple voice messages in an automotive cabin system according to claim 1, wherein: the screening process in the seventh step comprises that the vehicle-side system screens out the client application side most suitable for running the action instruction program according to the semantic identification type and the capacity of the client application side.

4. The method for fusion of multiple voice messages in an automotive cabin system according to claim 1, wherein: in step six, the vehicle-end system is one of an Android, Linux, HarmonyOS (Hongmeng) and Microsoft system host.

5. The method for fusion of multiple voice messages in an automotive cabin system according to claim 1, wherein: in step four, the voice recognition is automatic speech recognition.

6. The method for fusion of multiple voice messages in a cabin system of a vehicle according to claim 1, wherein before the step of obtaining a preset audio signal at the vehicle end and transmitting the preset audio signal to the vehicle end system, the method further comprises:

and acquiring the host receiving frequency and the international standard frequency offset of the vehicle-end system, and determining the analog waveform frequency according to the host receiving frequency and the international standard frequency offset.

7. A multi-voice information fusion system for an automobile cabin system, comprising:

8. The system for fusion of multiple voice messages in a vehicle cabin system of claim 7, wherein: in the semantic reading performed by the semantic processing module, a plurality of semantic reading technical ends process the voice recognition information at the same time and their results are arbitrated, the arbitration result obtained is the semantic information, the rule of arbitration is to prioritize the voice recognition information, and the parameters that determine the priority include the completeness, clarity and execution degree of the semantics.

9. A multi-voice information fusion device for an automobile cabin system, comprising: a memory, a processor and a cabin system multi-voice information fusion program stored on the memory and operable on the processor, the cabin system multi-voice information fusion program being configured to implement the steps of the cabin system multi-voice information fusion method of any one of claims 1 to 6.

10. A storage medium, wherein a cabin system multi-voice information fusion program is stored on the storage medium, and the cabin system multi-voice information fusion program, when executed by a processor, implements the steps of the cabin system multi-voice information fusion method according to any one of claims 1 to 6.