WO2021027132A1 - Audio processing method and apparatus and computer storage medium - Google Patents


Info

Publication number
WO2021027132A1
WO2021027132A1 PCT/CN2019/117172 CN2019117172W WO2021027132A1 WO 2021027132 A1 WO2021027132 A1 WO 2021027132A1 CN 2019117172 W CN2019117172 W CN 2019117172W WO 2021027132 A1 WO2021027132 A1 WO 2021027132A1
Authority
WO
WIPO (PCT)
Prior art keywords
noise
signal
power
audio
signals
Prior art date
Application number
PCT/CN2019/117172
Other languages
French (fr)
Chinese (zh)
Inventor
王涛
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021027132A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech

Definitions

  • This application relates to the field of speech processing technology, and in particular to an audio processing method, device and computer storage medium.
  • With the development of the Internet, adding noise to audio is in demand in many industries. Take the currently popular song recognition by listening as an example: ideally, if the user records a piece of music without any interference, the music recognition system can correctly find the matching music as long as that music is stored in the music library. In practical applications, however, the music clips recorded by users contain obvious interference, including both system noise introduced by the playback and recording equipment and ambient noise around the recording. The music recognition system therefore needs to be trained in advance so that it can be applied in real environments, and this training requires audio to which noise has been added. In the prior art, a noise adding tool can add noise to audio, but only one type of noise at a time. When the user needs to add multiple different types of noise to an audio signal, the user has to run the tool multiple times on that audio, which makes the operation cumbersome, time-consuming, and inefficient.
  • The embodiments of the present application provide an audio processing method, device, and computer storage medium that can improve the processing efficiency of adding noise to audio.
  • An embodiment of the present application provides an audio processing method, which includes:
  • the electronic device acquires N audio signals, M noise signals, and P signal-to-noise ratios input by the user, where N, M, and P are all positive integers;
  • for a first audio signal among the N audio signals and a first signal-to-noise ratio among the P signal-to-noise ratios, the electronic device calculates, based on the power of the first audio signal and the first signal-to-noise ratio, the power of the noise signal to be added to the first audio signal;
  • the electronic device adjusts the power of the M types of noise signals according to the power of the noise signal to be added by the first audio signal;
  • the electronic device mixes the first audio signal and the M types of noise signals after power adjustment to obtain a noise-added signal corresponding to the first audio signal.
  • An embodiment of the present application also provides an audio processing device, including:
  • the first acquiring unit is configured to acquire N audio signals, M noise signals, and P signal-to-noise ratios input by the user, where N, M, and P are all positive integers;
  • a second acquiring unit configured to acquire the power of each audio signal in the N audio signals and the power of each noise signal in the M types of noise signals;
  • the calculation unit is configured to, for the first audio signal among the N audio signals and the first signal-to-noise ratio among the P signal-to-noise ratios, calculate the power of the noise signal to be added to the first audio signal according to the power of the first audio signal and the first signal-to-noise ratio;
  • an adjusting unit configured to adjust the power of the M types of noise signals according to the power of the noise signal to be added by the first audio signal;
  • the mixing unit is configured to mix the first audio signal and the M types of noise signals after power adjustment to obtain a noise-added signal corresponding to the first audio signal.
  • the embodiment of the present application also provides an electronic device, which includes a processor, an input device, an output device, and a memory, and the processor, the input device, the output device, and the memory are connected to each other.
  • the communication interface is used to communicate with other electronic devices (such as electronic devices)
  • the memory is used to store the implementation code of the foregoing audio processing method
  • the processor is used to execute the program code stored in the memory, that is, the foregoing audio processing method is executed.
  • The embodiments of the present application also provide a computer non-volatile readable storage medium storing instructions which, when run on a processor, cause the processor to execute the above audio processing method.
  • the embodiment of the present application also provides a computer program product containing instructions, which when running on a processor, causes the processor to execute the above audio processing method.
  • With the embodiments of the present application, the electronic device can add noise to one or more audio signals at one time, can add one or more different types of noise to an audio signal, and can obtain multiple output signals with different signal-to-noise ratios for one audio signal at a time. The user does not need to perform multiple operations to add noise to multiple audio signals, to add multiple noise types to one audio signal, or to obtain multiple output signals with different signal-to-noise ratios for the same audio signal. This saves user operations, reduces operation time, improves the efficiency of adding noise to audio, and enables batch audio processing.
  • FIG. 1 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the application.
  • FIG. 2 is a schematic flowchart of an audio processing method provided by an embodiment of this application.
  • FIG. 3 is a schematic diagram of a user input interface provided by an embodiment of the application.
  • FIG. 4 is a schematic diagram of parameters of an audio signal provided by an embodiment of the application.
  • FIG. 5A is a schematic diagram of another user input interface provided by an embodiment of the application.
  • FIG. 5B is a schematic diagram of another user input interface provided by an embodiment of the application.
  • FIG. 6 is a schematic structural diagram of an audio processing device provided by an embodiment of the application.
  • The electronic devices involved in the embodiments of this application may include various handheld devices with wireless communication functions, vehicle-mounted devices, wearable devices, computing devices or other processing devices connected to wireless modems, as well as various forms of user equipment (UE), mobile stations (MS), terminal devices, and so on.
  • For example, the electronic device can be a mobile terminal such as a smart phone or a tablet computer, or another terminal; there is no limitation here.
  • For convenience of description, the devices mentioned above are collectively referred to as electronic devices.
  • the embodiments of the present application are described below in conjunction with the drawings.
  • FIG. 1 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device 100 includes at least one processor 101, at least one input device 102, at least one output device 103, a memory 104, and at least one bus 105.
  • the bus 105 is used to implement connection and communication between these components.
  • the processor 101 may be a central processing unit (CPU) or a graphics processing unit (GPU). In some embodiments, it may also be referred to as an application processor (AP) to distinguish it from the baseband processor.
  • the processor 101 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the input device 102 may include a touch panel, a fingerprint sensor (used to collect user fingerprint information and fingerprint orientation information), a camera, a microphone, etc.
  • the output device 103 may include a display (LCD, etc.), a speaker, etc.
  • the memory 104 may include a read-only memory and a random access memory, and provides instructions and data to the processor 101.
  • the processor 101 can be used to read and execute computer readable instructions. Specifically, the processor 101 may be used to call data stored in the memory 104.
  • a part of the memory 104 may also include a non-volatile random access memory.
  • the processor 101, the input device 102, and the output device 103 described in the embodiments of the present application can execute part or all of the processes involved in the audio processing method shown in FIG. 2 below.
  • the electronic device 100 may further include a communication interface.
  • the communication interface may be a transceiver, a transceiver circuit, etc., where the communication interface is a general term and may include one or more interfaces, such as an interface between an electronic device and a server.
  • the communication interface may include a wired interface and a wireless interface, such as a standard interface, Ethernet, and a multi-machine synchronization interface.
  • when the processor 101 receives any message or data, it does so by driving or controlling the communication interface. The processor 101 can therefore be regarded as the control center that performs sending or receiving, and the communication interface as the specific performer of the sending and receiving operations.
  • the electronic device 100 may be a terminal, a server, a computer, a video playback device, etc., capable of computing or processing.
  • FIG. 2 provides an audio processing method related to an embodiment of the present application.
  • the audio processing method includes but is not limited to the following steps S201-S205.
  • S201 The electronic device obtains N audio signals, M noise signals, and P signal-to-noise ratios input by the user, where N, M, and P are all positive integers.
  • the audio signal input by the user may be one audio signal or multiple audio signals.
  • the noise signal input by the user may be one type of noise signal or multiple different types of noise signals.
  • the signal-to-noise ratio input by the user can be one signal-to-noise ratio or multiple signal-to-noise ratios.
  • the audio signal input by the user may be music, voice, and so on.
  • the types of noise signal input by the user include noise that can be generated by a signal generating device, such as white noise, Gaussian noise, pink noise, or colored noise, and may also include real environmental noise recorded by the user, such as the sound of flowing water or birdsong.
  • the signal-to-noise ratio input by the user refers to the ratio of the signal power and the noise power of the desired audio after adding noise to the audio signal.
  • the user input interface may be, for example, but is not limited to, the one shown in FIG. 3.
  • the user input interface includes an audio signal input box 301, a noise signal input box 302, a signal-to-noise ratio input box 303, and a confirm button 304. To input multiple audio signals, the user can click the "+" sign to the right of the audio signal input box 301. Similarly, to input multiple noise signals, the user can click the "+" sign to the right of the noise signal input box 302, and to input multiple signal-to-noise ratios, the user can click the "+" sign to the right of the signal-to-noise ratio input box 303.
  • the electronic device may receive the audio signal input by the user, such as voice or music, through a voice input device of the electronic device, such as a microphone.
  • the electronic device can display the files stored locally in the electronic device, and the user can select the audio signal from the files locally stored in the electronic device.
  • after the electronic device receives the user's instruction to click on the noise signal input box 302, it can receive the noise signal input by the user, such as the sound of flowing water or birdsong, through a voice input device of the electronic device, such as a microphone.
  • the electronic device may display the noise type, and the user may select the noise signal from the noise type.
  • the user can click the confirm button 304 after inputting the audio signal, the noise signal, and the signal-to-noise ratio.
  • the electronic device executes step S202.
  • For example, the user inputs 2 audio signals (audio signal 1 and audio signal 2), 2 noise signals (noise signal 1 and noise signal 2), and 2 signal-to-noise ratios (signal-to-noise ratio 1 and signal-to-noise ratio 2).
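  • The inputs collected in step S201 can be pictured as a small container that checks that N, M, and P are all positive. The following is only a minimal sketch: the class name `NoiseRequest`, its field names, and the 20 dB value used for signal-to-noise ratio 2 are hypothetical and do not come from the application.

```python
from dataclasses import dataclass


@dataclass
class NoiseRequest:
    """Hypothetical container for the user's S201 inputs."""
    audio_signals: list   # N entries
    noise_signals: list   # M entries
    snrs_db: list         # P entries, in dB

    def __post_init__(self):
        # N, M and P must all be positive integers, so no list may be empty.
        for name in ("audio_signals", "noise_signals", "snrs_db"):
            if len(getattr(self, name)) < 1:
                raise ValueError(f"{name} must contain at least one entry")

    @property
    def shape(self):
        """Return (N, M, P)."""
        return (len(self.audio_signals), len(self.noise_signals), len(self.snrs_db))


request = NoiseRequest(
    audio_signals=["audio signal 1", "audio signal 2"],
    noise_signals=["noise signal 1", "noise signal 2"],
    snrs_db=[30.0, 20.0],  # 20.0 is an assumed value for signal-to-noise ratio 2
)
print(request.shape)  # (2, 2, 2)
```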
  • S202 The electronic device obtains the power of each of the N audio signals and the power of each of the M types of noise signals.
  • the electronic device obtains the power of each audio signal, including:
  • the electronic device extracts the amplitude of each audio signal, and obtains the power of each audio signal according to its amplitude. If the user inputs the audio signal through a microphone, the electronic device can calculate the power of the audio signal according to the amplitude of the audio signal input by the user. If the user selects an audio file from the files stored locally in the electronic device, the electronic device can use a voice analysis tool to convert the audio file into the audio signal shown in FIG. 4, where the horizontal axis is time and the vertical axis is amplitude, and can then calculate the power of the audio signal according to its amplitude.
  • Electronic equipment obtains the power of each noise signal, including:
  • the electronic device extracts the amplitude of each noise signal, and obtains the power of each noise signal according to its amplitude. If the user inputs a noise signal through a microphone, the electronic device can calculate the power of the noise signal according to the amplitude of the noise signal input by the user. If the user selects a noise file from the files stored locally in the electronic device, the electronic device can use a voice analysis tool to convert the noise file into a noise signal of the kind shown in FIG. 4, where the horizontal axis is time and the vertical axis is amplitude, and can then calculate the power of the noise signal according to its amplitude.
  • the power of audio signal 1 is 10000W
  • the power of noise signal 1 is 9W
  • the power of noise signal 2 is 5W.
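  • The application states that power is obtained from amplitude but does not give the formula. One common choice for sampled signals, assumed here purely for illustration, is the mean of the squared sample amplitudes. The function name `signal_power` and the toy sample values are hypothetical, and the result is in arbitrary units rather than watts.

```python
def signal_power(samples):
    """Mean-square power of a sequence of amplitude samples (assumed definition)."""
    if not samples:
        raise ValueError("signal must contain at least one sample")
    return sum(s * s for s in samples) / len(samples)


# Toy stand-in for noise signal 1: a square wave of amplitude 3.
noise_1 = [3.0, -3.0, 3.0, -3.0]
print(signal_power(noise_1))  # 9.0
```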
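  • S203 For the first audio signal among the N audio signals and the first signal-to-noise ratio among the P signal-to-noise ratios, the electronic device calculates the power of the noise signal to be added to the first audio signal according to the power of the first audio signal and the first signal-to-noise ratio.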
  • the electronic device calculating the power of the noise signal to be added to the first audio signal according to the power of the first audio signal and the first signal-to-noise ratio includes:
  • the first audio signal is audio signal 1
  • the first signal-to-noise ratio is signal-to-noise ratio 1
  • the power value of audio signal 1 is 10000W
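  • the value of signal-to-noise ratio 1 is 30 dB.
  • The signal-to-noise ratio in decibels is $\mathrm{SNR} = 10 \log_{10}(A/B)$, where $A$ is the power of the audio signal and $B$ is the power of the noise signal to be added.
  • Substituting the values above gives $30 = 10 \log_{10}(10000/B)$, that is, $10000/B = 10^{3}$.
  • Therefore, the power of the noise signal to be added is 10 W.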
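  • The calculation above amounts to solving the decibel formula for the noise power. A minimal sketch, using the example values from the text; the function name `noise_power_for_snr` is hypothetical:

```python
def noise_power_for_snr(signal_power, snr_db):
    """Solve SNR(dB) = 10 * log10(signal_power / noise_power) for noise_power."""
    return signal_power / (10 ** (snr_db / 10))


# Audio signal 1 has power 10000 W and the target signal-to-noise ratio is 30 dB.
print(noise_power_for_snr(10000, 30))  # 10.0
```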
  • Step S203 can be used to calculate the power of the noise signal that needs to be added for each audio signal.
  • For example, the power of audio signal 1 and signal-to-noise ratio 1 can be used to calculate the power of the noise signal that audio signal 1 needs to add in one case, and the power of audio signal 1 and signal-to-noise ratio 2 can be used to calculate the power of the noise signal that audio signal 1 needs to add in another case.
  • Likewise, the power of audio signal 2 and signal-to-noise ratio 1 can be used to calculate the power of the noise signal that audio signal 2 needs to add in one case, and the power of audio signal 2 and signal-to-noise ratio 2 can be used to calculate the power of the noise signal that audio signal 2 needs to add in another case.
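  • Repeating step S203 for every pairing of an audio signal with a signal-to-noise ratio gives N × P required noise powers. The sketch below enumerates the pairings with `itertools.product`. Only the 10000 W power of audio signal 1 and the 30 dB value of signal-to-noise ratio 1 come from the text; the power of audio signal 2 and the value of signal-to-noise ratio 2 are assumed for illustration, as is the function name.

```python
from itertools import product


def required_noise_power(audio_power, snr_db):
    """Noise power that yields the requested SNR (dB) for the given audio power."""
    return audio_power / 10 ** (snr_db / 10)


audio_powers = {
    "audio signal 1": 10000.0,
    "audio signal 2": 2500.0,   # assumed value, not given in the text
}
snrs_db = {
    "signal-to-noise ratio 1": 30.0,
    "signal-to-noise ratio 2": 20.0,   # assumed value, not given in the text
}

# One entry per (audio signal, signal-to-noise ratio) pairing.
table = {
    (audio, snr): required_noise_power(audio_powers[audio], snrs_db[snr])
    for audio, snr in product(audio_powers, snrs_db)
}
for pairing, noise_power in table.items():
    print(pairing, round(noise_power, 6))
```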
  • S204 The electronic device adjusts the power of the M types of noise signals according to the power of the noise signal to be added by the first audio signal.
  • If the noise input by the user includes only one noise type, then after obtaining the power of the noise signal to be added to the first audio signal through step S203, the electronic device can determine the adjusted power value of the noise signal input by the user. For example, if the noise selected by the user is white noise, then based on the foregoing example the power of the noise signal corresponding to the white noise is 10 W.
  • If the noise input by the user includes multiple noise types, the user also needs to input the weights of the multiple noise types in the user input interface.
  • As shown in FIG. 5A, which is a schematic diagram of a user input interface, the user can click the weight input box 305 in the user input interface to input the weight of each noise signal. To input the weights of multiple noise signals, the user can click the "+" sign to the right of the weight input box 305.
  • For example, the noise types input by the user include white noise and pink noise, and the weights corresponding to white noise and pink noise are in the ratio 3:2.
  • the method further includes: the electronic device obtains the weights of the multiple noise types and, according to those weights, determines the noise signal power corresponding to each of the multiple noise signals.
  • the types of noise include white noise and pink noise.
  • the weights corresponding to white noise and pink noise are in the ratio 3:2. Since the total noise signal power is 10 W, the weights give a white noise signal power of 10 W × 3/5 = 6 W.
  • the signal power of pink noise is 10 W × 2/5 = 4 W.
  • After the electronic device determines the power of the noise signal corresponding to each noise type, it adjusts the power of each noise signal. For example, if the noise input by the user is white noise with a signal power of 9 W and pink noise with a signal power of 5 W, the electronic device adjusts the signal power of the white noise to 6 W and the signal power of the pink noise to 4 W.
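  • The weight-based split and the power adjustment in step S204 can be sketched as follows. The application does not say how the adjustment is carried out; scaling every sample by a constant gain is assumed here, which works because power grows with the square of the amplitude. The function names are hypothetical.

```python
import math


def allocate_noise_power(total_power, weights):
    """Split total_power across noise signals in proportion to their weights."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return [total_power * w / total_weight for w in weights]


def gain_for_target_power(current_power, target_power):
    """Amplitude scale factor that changes a signal's power to target_power."""
    return math.sqrt(target_power / current_power)


# Total noise power 10 W, white noise to pink noise weights 3:2.
targets = allocate_noise_power(10.0, [3, 2])
print(targets)  # [6.0, 4.0]

# White noise currently at 9 W and pink noise at 5 W, as in the example.
gains = [gain_for_target_power(cur, tgt) for cur, tgt in zip([9.0, 5.0], targets)]
print(gains)
```

  • Multiplying each sample of the white noise by the first gain and each sample of the pink noise by the second brings their powers to 6 W and 4 W respectively under this assumption.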
  • S205 The electronic device mixes the first audio signal and the power-adjusted M types of noise signals to obtain a noise-added signal corresponding to the first audio signal.
  • the electronic device After adjusting the power of each noise signal, the electronic device mixes the noise signal with the audio signal to obtain a noise-added signal.
  • For example, the power of audio signal 1 and signal-to-noise ratio 1 can be used to calculate the power of the noise signal that audio signal 1 needs to add in one case; the power of each noise signal is then adjusted according to that power, and finally the power-adjusted noise signals are mixed with audio signal 1 to obtain a noise-added output signal whose signal-to-noise ratio is signal-to-noise ratio 1.
  • Similarly, the power of audio signal 1 and signal-to-noise ratio 2 can be used to calculate the power of the noise signal that audio signal 1 needs to add in another case; the power of each noise signal is then adjusted according to that power, and finally the power-adjusted noise signals are mixed with audio signal 1 to obtain another noise-added output signal whose signal-to-noise ratio is signal-to-noise ratio 2.
  • The power of audio signal 2 and signal-to-noise ratio 1 can be used to calculate the power of the noise signal that audio signal 2 needs to add in one case; the power of each noise signal is then adjusted according to that power, and finally the power-adjusted noise signals are mixed with audio signal 2 to obtain a noise-added output signal whose signal-to-noise ratio is signal-to-noise ratio 1.
  • The power of audio signal 2 and signal-to-noise ratio 2 can be used to calculate the power of the noise signal that audio signal 2 needs to add in another case; the power of each noise signal is then adjusted according to that power, and finally the power-adjusted noise signals are mixed with audio signal 2 to obtain a noise-added output signal whose signal-to-noise ratio is signal-to-noise ratio 2.
  • In this way, when the user inputs 2 audio signals and 2 signal-to-noise ratios, 4 noise-added signals can be output.
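  • Putting steps S202 to S205 together for every audio signal and every signal-to-noise ratio might look like the sketch below. It rests on several assumptions that are not in the application: power is the mean of the squared samples, mixing is sample-wise addition, every noise signal has the same length as the audio, and the noise signals are mutually uncorrelated so that their powers add (the toy noise sequences are chosen to be exactly orthogonal). All names and sample values are hypothetical.

```python
import math


def power(samples):
    """Mean-square power (assumed definition)."""
    return sum(v * v for v in samples) / len(samples)


def add_noise(audio, noises, weights, snr_db):
    """Scale each noise to its weighted share of the required power and mix it in."""
    if any(len(noise) != len(audio) for noise in noises):
        raise ValueError("every noise signal must match the audio length")
    total_noise_power = power(audio) / 10 ** (snr_db / 10)
    mixed = list(audio)
    for noise, weight in zip(noises, weights):
        target = total_noise_power * weight / sum(weights)
        gain = math.sqrt(target / power(noise))
        mixed = [m + gain * n for m, n in zip(mixed, noise)]
    return mixed


def achieved_snr_db(audio, mixed):
    """SNR of the mixed signal, measured from the residual noise."""
    residual = [m - a for m, a in zip(mixed, audio)]
    return 10 * math.log10(power(audio) / power(residual))


audio_signals = {
    "audio signal 1": [2.0, 2.0, 2.0, 2.0],
    "audio signal 2": [1.0, -3.0, 1.0, -3.0],
}
noises = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]  # toy, orthogonal
weights = [3, 2]
snrs_db = [10.0, 20.0]

# One noise-added output per (audio signal, signal-to-noise ratio) pairing.
outputs = {
    (name, snr): add_noise(signal, noises, weights, snr)
    for name, signal in audio_signals.items()
    for snr in snrs_db
}
print(len(outputs))  # 4
```

  • With real recordings the noise signals are rarely exactly uncorrelated, so the achieved signal-to-noise ratio can deviate slightly from the requested one; `achieved_snr_db` gives a way to check it.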
  • the method further includes:
  • the electronic device performs feature marking on the noise-added audio signal, where the feature mark includes the signal-to-noise ratio of the noise-added audio signal, the type of noise added to it, and the noise power added to it.
  • the electronic device mixes noise of different types and in different proportions into multiple audio signals and, after obtaining multiple noise-added audio signals, marks their features. The feature mark can indicate the noise types in the mixture and the signal-to-noise ratio after noise addition, which makes the noise-added audio easy to distinguish.
  • the noise-added audio storage table can be, for example, but not limited to, as shown in Table 1:
  • For example, the noise-added versions of audio A are: audio A1, with a signal-to-noise ratio of 10 dB after white noise and pink noise are added, and audio A2, with a signal-to-noise ratio of 20 dB after white noise and pink noise are added.
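  • A feature mark holding the three items named above (signal-to-noise ratio, noise types, added noise power) could be represented as a small record, with the storage table as a mapping from output name to record. This is only a sketch: the class and function names are hypothetical, the two noise types are taken to be white and pink noise, and the 10000 W power of audio A is an assumed value used solely to fill in the noise power field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureMark:
    """Hypothetical feature mark attached to one noise-added audio signal."""
    snr_db: float
    noise_types: tuple
    noise_power: float


def mark_output(audio_power, snr_db, noise_types):
    """Build the mark, deriving the added noise power from the SNR."""
    return FeatureMark(snr_db, tuple(noise_types), audio_power / 10 ** (snr_db / 10))


AUDIO_A_POWER = 10000.0  # assumed value, not given in the text

storage_table = {
    "A1": mark_output(AUDIO_A_POWER, 10.0, ["white", "pink"]),
    "A2": mark_output(AUDIO_A_POWER, 20.0, ["white", "pink"]),
}
for name, mark in storage_table.items():
    print(name, mark)
```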
  • The foregoing embodiments are all described by taking as an example the case in which all the noise input by the user is used when adding noise.
  • The noise used by the electronic device may differ, however, and it is not necessary to use all of the noise input by the user.
  • the audio signal input by the user includes audio signal 1 and audio signal 2
  • the noise signal input by the user includes noise signal 1 and noise signal 2
  • the signal to noise ratio input by the user includes signal to noise ratio 1 and signal to noise ratio 2.
  • When the electronic device adds noise to audio signal 1, it can select only one of noise signal 1 and noise signal 2 to add to audio signal 1.
  • the noise-added audio storage table can be, for example, but not limited to, as shown in Table 2:
  • For example, the noise-added versions of audio A are: audio A1 with a signal-to-noise ratio of 10 dB after mixing with white noise, audio A2 with a signal-to-noise ratio of 20 dB after mixing with white noise, audio A3 with a signal-to-noise ratio of 10 dB after mixing with pink noise, and audio A4 with a signal-to-noise ratio of 10 dB after mixing with pink noise.
  • the method further includes:
  • the electronic device uses the noise-added signal corresponding to each of the N audio signals to train the music recognition system, so that the music recognition system can recognize noisy sounds in the real environment.
  • the electronic device can add noise to one or more audio signals at the same time, can mix noise of multiple types in a single pass, and can produce the noise-added signal-to-noise ratio required by actual needs. This enables batch processing, simplifies the noise adding operation, saves time, and allows the signal-to-noise ratio to be adjusted to meet the diverse needs of users.
  • Figure 6 shows a schematic structural diagram of an audio processing device.
  • the audio processing device 600 includes: a first acquisition unit 601, a second acquisition unit 602, a calculation unit 603, an adjustment unit 604, and a mixing unit 605.
  • the first obtaining unit 601 is configured to obtain N audio signals, M types of noise signals, and P signal-to-noise ratios input by the user, where N, M, and P are all positive integers;
  • the second acquiring unit 602 is configured to acquire the power of each audio signal in the N audio signals and the power of each noise signal in the M types of noise signals;
  • the calculation unit 603 is configured to, for the first audio signal among the N audio signals and the first signal-to-noise ratio among the P signal-to-noise ratios, calculate the power of the noise signal to be added to the first audio signal according to the power of the first audio signal and the first signal-to-noise ratio;
  • the adjusting unit 604 is configured to adjust the power of the M types of noise signals according to the power of the noise signal to be added by the first audio signal;
  • the mixing unit 605 is configured to mix the first audio signal and the M types of noise signals after power adjustment to obtain a noise-added signal corresponding to the first audio signal.
  • calculation unit 603 is specifically configured to:
  • when M is an integer greater than or equal to 2, the audio processing device further includes:
  • the third acquiring unit is configured to acquire the weights of the M types of noise signals input by the user;
  • the adjustment unit 604 includes:
  • An allocation unit configured to allocate the power of the noise signal to be added by the first audio signal to each of the M noise signals according to the weight of the M noise signals;
  • the processing unit is configured to adjust the power of each noise signal according to the allocated power of each noise signal in the M noise signals.
  • the audio processing device 600 further includes:
  • the training unit is used to train the music recognition system by using the noise-added signal corresponding to each of the N audio signals.
  • before the training unit uses the noise-added signal corresponding to each of the N audio signals to train the music recognition system, the audio processing device 600 further includes:
  • the marking unit is used to perform characteristic marking on the noise-added signal corresponding to each audio signal in the N audio signals to obtain the marked noise-added signal corresponding to each audio signal in the N audio signals, and the characteristic marking Including the signal-to-noise ratio of the noise-added signal, the type of noise added by the noise-added signal, and the noise power added by the noise-added signal;
  • the training unit is specifically configured to train the music recognition system by using the marked noise signal corresponding to each of the N audio signals.
  • the second acquiring unit 602 includes:
  • the first extraction unit is used to extract the amplitude of each audio signal and obtain the power of each audio signal according to its amplitude; the second extraction unit is used to extract the amplitude of each noise signal and obtain the power of each noise signal according to its amplitude.
  • the audio signal includes an audio signal input to the electronic device by the user through a voice input device.
  • In a specific implementation, the voice input device may be a microphone.
  • the noise signal includes a noise signal input to the electronic device by the user through a voice input device.
  • the noise signal may be the sound of water flow, the sound of birds, etc. recorded by the user.
  • the noise signal may also be white noise, pink noise, etc., and such noise may be generated by a signal generating device.
  • the second acquiring unit 602 is specifically configured to:
  • the amplitude of each noise signal is extracted, and the power of each noise signal is obtained according to the amplitude of each noise signal.
  • For the functions of each unit in the audio processing device 600, refer to the related description in the method embodiment shown in FIG. 2; they are not repeated here.
  • a computer non-volatile readable storage medium stores a computer program.
  • The computer program includes program instructions which, when executed by the processor, implement the audio processing method described above.
  • In the above embodiments, the implementation may be in whole or in part by software, hardware, firmware, or any combination thereof.
  • When implemented in software, it can be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer program instructions When the computer program instructions are loaded and executed on the computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • The computer instructions may be stored in a computer non-volatile readable storage medium, or transmitted from one computer non-volatile readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, wireless, or microwave).
  • the computer non-volatile readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), a semiconductor medium (for example, a solid state disk (SSD)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

An audio processing method and apparatus and a computer storage medium, the method comprising: an electronic device acquires N audio signals, M noise signals, and P signal-to-noise ratios inputted by a user, N, M, and P all being positive integers (S201); the electronic device acquires the power of each audio signal amongst the N audio signals and the power of each noise signal amongst the M noise signals (S202); for a first audio signal amongst the N audio signals and a first signal-to-noise ratio amongst the P signal-to-noise ratios, on the basis of the power of the first audio signal and the first signal-to-noise ratio, the electronic device calculates the power of a noise signal to be added to the first audio signal (S203); on the basis of the power of the noise signal to be added to the first audio signal, the electronic device adjusts the power of the M noise signals (S204); and the electronic device performs signal mixing of the first audio signal and the M noise signals after power adjustment to obtain a signal with noise added corresponding to the first audio signal (S205). The present method can increase the processing efficiency of adding noise to audio.

Description

Audio processing method, device and computer storage medium
This application claims priority to Chinese patent application No. 201910748281.1, entitled "An audio processing method, device and computer storage medium", filed with the Chinese Patent Office on August 12, 2019, the entire content of which is incorporated herein by reference.
Technical field
This application relates to the field of speech processing technology, and in particular to an audio processing method, a device, and a computer storage medium.
Background
With the development of the Internet, many industries now need to add noise to audio. Take the currently popular task of recognizing a song by listening to it: ideally, if the user records a music clip free of any interference, the music recognition system can find the matching music as long as that music is stored in the music library. In practice, however, clips recorded by users carry obvious interference, including system noise introduced by the playback and recording equipment as well as noise from the environment around the recording. The music recognition system therefore needs to be trained in advance so that it can be applied in real environments, and this training requires noise-added audio. In the prior art, a noise-adding tool can add noise to audio, but only one type of noise at a time. When the user needs to add several different types of noise to an audio clip, the user has to run the tool several times on that clip, which is cumbersome, time-consuming, and inefficient.
Summary of the invention
The embodiments of this application provide an audio processing method, a device, and a computer storage medium, which can improve the efficiency of adding noise to audio.
An embodiment of this application provides an audio processing method, which includes:
an electronic device acquires N audio signals, M types of noise signals, and P signal-to-noise ratios input by a user, where N, M, and P are all positive integers;
the electronic device acquires the power of each of the N audio signals and the power of each of the M types of noise signals;
for a first audio signal among the N audio signals and a first signal-to-noise ratio among the P signal-to-noise ratios, the electronic device calculates, according to the power of the first audio signal and the first signal-to-noise ratio, the power of the noise signal to be added to the first audio signal;
the electronic device adjusts the power of the M types of noise signals according to the power of the noise signal to be added to the first audio signal;
the electronic device mixes the first audio signal with the M types of power-adjusted noise signals to obtain a noise-added signal corresponding to the first audio signal.
An embodiment of this application also provides an audio processing device, including:
a first acquiring unit, configured to acquire N audio signals, M types of noise signals, and P signal-to-noise ratios input by a user, where N, M, and P are all positive integers;
a second acquiring unit, configured to acquire the power of each of the N audio signals and the power of each of the M types of noise signals;
a calculation unit, configured to calculate, for a first audio signal among the N audio signals and a first signal-to-noise ratio among the P signal-to-noise ratios, the power of the noise signal to be added to the first audio signal according to the power of the first audio signal and the first signal-to-noise ratio;
an adjusting unit, configured to adjust the power of the M types of noise signals according to the power of the noise signal to be added to the first audio signal;
a mixing unit, configured to mix the first audio signal with the M types of power-adjusted noise signals to obtain a noise-added signal corresponding to the first audio signal.
An embodiment of this application also provides an electronic device, including a processor, an input device, an output device, and a memory, which are connected to one another. A communication interface is used to communicate with other electronic devices, the memory is used to store the implementation code of the above audio processing method, and the processor is used to execute the program code stored in the memory, that is, to execute the above audio processing method.
An embodiment of this application also provides a computer non-volatile readable storage medium on which instructions are stored; when the instructions run on a processor, they cause the processor to execute the above audio processing method.
An embodiment of this application also provides a computer program product containing instructions which, when run on a processor, cause the processor to execute the above audio processing method.
By implementing the embodiments of this application, the electronic device can add noise to one or more audio signals in a single pass, can add one or more different types of noise to an audio signal, and can produce multiple output signals with different signal-to-noise ratios for one audio signal in a single pass. The user does not need to repeat operations to add noise to multiple audio signals, to add multiple noise types to one audio signal, or to obtain multiple output signals with different signal-to-noise ratios for the same audio signal. This saves user operations, reduces operation time, improves the efficiency of adding noise to audio, and enables batch audio processing.
Additional aspects and advantages of this application will be given in part in the following description; they will become apparent from that description or be learned through practice of this application.
Description of the drawings
The above and/or additional aspects and advantages of this application will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of this application;
FIG. 2 is a schematic flowchart of an audio processing method provided by an embodiment of this application;
FIG. 3 is a schematic diagram of a user input interface provided by an embodiment of this application;
FIG. 4 is a schematic diagram of the parameters of an audio signal provided by an embodiment of this application;
FIG. 5A is a schematic diagram of another user input interface provided by an embodiment of this application;
FIG. 5B is a schematic diagram of another user input interface provided by an embodiment of this application;
FIG. 6 is a schematic structural diagram of an audio processing device provided by an embodiment of this application.
Detailed description
To make the purpose, technical solutions, and advantages of this application clearer, this application is described in further detail below in conjunction with the accompanying drawings. The described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this application, without creative work, fall within the protection scope of this application.
The terms "first", "second", and so on in the specification, the claims, and the above drawings of this application are used to distinguish different objects, not to describe a specific order. In addition, the terms "including" and "having" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product, or device.
Reference to an "embodiment" herein means that a specific feature, structure, or characteristic described in conjunction with the embodiment may be included in at least one embodiment of this application. Occurrences of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to an independent or alternative embodiment that is mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
The electronic devices involved in the embodiments of this application may include various handheld devices with wireless communication functions, vehicle-mounted devices, wearable devices, computing devices or other processing devices connected to wireless modems, as well as various forms of user equipment (UE), mobile stations (MS), terminal devices, and so on. For example, the device may be a mobile terminal such as a smartphone or a tablet computer, or another kind of terminal; no limitation is imposed here. For ease of description, the devices mentioned above are collectively referred to as electronic devices. The embodiments of this application are described below in conjunction with the drawings.
Refer to FIG. 1, which is a schematic structural diagram of an electronic device provided by an embodiment of this application. As shown in FIG. 1, the electronic device 100 includes at least one processor 101, at least one input device 102, at least one output device 103, a memory 104, and at least one bus 105. The bus 105 is used to implement connection and communication between these components.
In the embodiments of this application, the processor 101 may be a central processing unit (CPU) or a graphics processing unit (GPU); in some implementations it may also be called an application processor (AP) to distinguish it from a baseband processor. The processor 101 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.
The input device 102 may include a touch panel, a fingerprint sensor (used to collect the user's fingerprint information and fingerprint orientation information), a camera, a microphone, and so on; the output device 103 may include a display (an LCD, etc.), a speaker, and so on.
The memory 104 may include a read-only memory and a random access memory, and provides instructions and data to the processor 101. The processor 101 can be used to read and execute computer-readable instructions. Specifically, the processor 101 can be used to call data stored in the memory 104. A part of the memory 104 may also include a non-volatile random access memory.
In a specific implementation, the processor 101, the input device 102, and the output device 103 described in the embodiments of this application can execute part or all of the processes involved in the audio processing method shown in FIG. 2 below.
Optionally, the electronic device 100 may further include a communication interface. The communication interface may be a transceiver, a transceiver circuit, and so on; "communication interface" is a general term and may cover one or more interfaces, such as an interface between the electronic device and a server. The communication interface may include wired and wireless interfaces, such as a standard interface, Ethernet, or a multi-machine synchronization interface. Optionally, when the processor 101 receives any message or data, it does so by driving or controlling the communication interface. The processor 101 can therefore be regarded as the control center that performs sending or receiving, and the communication interface as the component that actually carries out the sending and receiving operations.
In the embodiments of this application, the electronic device 100 may be a terminal, a server, a computer, a video playback device, or another device with computing or processing capability.
Based on the structure of the electronic device shown in FIG. 1, FIG. 2 provides an audio processing method according to an embodiment of this application. The audio processing method includes, but is not limited to, the following steps S201-S205.
S201: The electronic device acquires N audio signals, M types of noise signals, and P signal-to-noise ratios input by the user, where N, M, and P are all positive integers.
The user may input one audio signal or multiple audio signals, one type of noise signal or multiple different types of noise signals, and one signal-to-noise ratio or multiple signal-to-noise ratios.
Optionally, the audio signal input by the user may be music, speech, and so on.
Optionally, the types of noise signal input by the user include noise that a signal generating device can produce, such as white noise, Gaussian noise, pink noise, or colored noise, and may also include other types of noise recorded by the user, such as real environmental noise like the sound of flowing water or birdsong.
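As an illustration of noise produced by a signal generator, the following is a minimal sketch of generating Gaussian white noise in software. The application does not describe how generated noise is produced; the function name `white_gaussian_noise`, its parameters, and the sample count are assumptions made only for this example.

```python
import random


def white_gaussian_noise(num_samples, std=1.0, seed=None):
    """Return num_samples of zero-mean Gaussian white noise as a list of floats.

    Hypothetical helper for illustration; not part of the application.
    """
    rng = random.Random(seed)  # a private generator keeps the output reproducible
    return [rng.gauss(0.0, std) for _ in range(num_samples)]


# One second of noise at an assumed 16 kHz sample rate.
noise = white_gaussian_noise(16000, std=0.1, seed=0)
```

The mean-square value of such a sequence is close to `std ** 2`, which is the quantity the later steps rescale.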
The signal-to-noise ratio input by the user is the desired ratio of signal power to noise power in the audio after noise has been added to the audio signal.
The user input interface is explained below with reference to FIG. 3; the interface may be, for example but not limited to, the one shown in FIG. 3. As shown in FIG. 3, the user input interface includes an audio signal input box 301, a noise signal input box 302, a signal-to-noise ratio input box 303, and a confirm button 304. To input multiple audio signals, the user can click the "+" sign to the right of the audio signal input box 301. Similarly, to input multiple noise signals, the user can click the "+" sign to the right of the noise signal input box 302, and to input multiple signal-to-noise ratios, the user can click the "+" sign to the right of the signal-to-noise ratio input box 303.
Optionally, after receiving the user's instruction to click the audio signal input box 301, the electronic device may receive the audio signal input by the user, such as speech or music, through a voice input device of the electronic device, such as a microphone. Alternatively, after receiving the user's instruction to click the audio signal input box 301, the electronic device may display its locally stored files, and the user may select an audio signal from those files.
Similarly, after receiving the user's instruction to click the noise signal input box 302, the electronic device may receive the noise signal input by the user, such as the sound of flowing water or birdsong, through a voice input device of the electronic device, such as a microphone. Alternatively, after receiving the user's instruction to click the noise signal input box 302, the electronic device may display the available noise types, and the user may select a noise signal from those types.
After inputting the audio signals, the noise signals, and the signal-to-noise ratios, the user can click the confirm button 304. After receiving the user's click on the confirm button 304, the electronic device executes step S202. For example, the user inputs two audio signals, audio signal 1 and audio signal 2; two noise signals, noise signal 1 and noise signal 2; and two signal-to-noise ratios, signal-to-noise ratio 1 and signal-to-noise ratio 2.
S202: The electronic device acquires the power of each of the N audio signals and the power of each of the M types of noise signals.
Optionally, the electronic device acquiring the power of each audio signal includes:
the electronic device extracts the amplitude of each audio signal and obtains the power of each audio signal from that amplitude. If the user input the audio signal through a microphone, the electronic device can calculate the power of the audio signal from the amplitude of the signal input by the user. If the user selected an audio file from the files stored locally on the electronic device, the electronic device can use a speech analysis tool to convert the audio file into the audio signal shown in FIG. 4, where the horizontal axis is time and the vertical axis is amplitude, and can then calculate the power of the audio signal from its amplitude.
The electronic device acquiring the power of each type of noise signal includes:
the electronic device extracts the amplitude of each noise signal and obtains the power of each noise signal from that amplitude.
If the user input the noise signal through a microphone, the electronic device can calculate the power of the noise signal from the amplitude of the signal input by the user. If the user selected a noise file stored locally on the electronic device, the electronic device can use a speech analysis tool to convert the noise file into a noise signal of the form shown in FIG. 4, where the horizontal axis is time and the vertical axis is amplitude, and can then calculate the power of the noise signal from its amplitude.
For example, the power of audio signal 1 is 10000 W, the power of noise signal 1 is 9 W, and the power of noise signal 2 is 5 W.
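The application does not state the formula that maps amplitude to power. A common convention, assumed here, is the mean of the squared sample amplitudes. The sketch below uses that convention; `signal_power` is a hypothetical name.

```python
def signal_power(samples):
    """Mean-square power of a sequence of amplitude samples.

    Assumption: power is the average of the squared amplitudes; the
    application only says power is obtained from the amplitude.
    """
    if not samples:
        raise ValueError("signal must contain at least one sample")
    return sum(x * x for x in samples) / len(samples)


# A square wave of amplitude 3 has mean-square power 9.
square_wave = [3.0, -3.0, 3.0, -3.0]
square_wave_power = signal_power(square_wave)
```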
S203: For a first audio signal among the N audio signals and a first signal-to-noise ratio among the P signal-to-noise ratios, the electronic device calculates, according to the power of the first audio signal and the first signal-to-noise ratio, the power of the noise signal to be added to the first audio signal.
Optionally, the electronic device calculating the power of the noise signal to be added to the first audio signal according to the power of the first audio signal and the first signal-to-noise ratio includes:
the electronic device calculates the power of the noise signal to be added to the first audio signal according to the Shannon formula, where the Shannon formula is signal-to-noise ratio (dB) = 10*log10(A/B), A is the power of the first audio signal, and B is the power of the noise signal to be added to the first audio signal.
For example, the first audio signal is audio signal 1 and the first signal-to-noise ratio is signal-to-noise ratio 1; the power of audio signal 1 is 10000 W and signal-to-noise ratio 1 is 30 dB. From the Shannon formula, signal-to-noise ratio (dB) = 10*log10(A/B), so 30 = 10*log10(10000/B), which gives 10000/B = 1000 and B = 10. The power of the noise signal to be added to audio signal 1 is therefore 10 W.
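Solving the formula above for B gives B = A / 10^(SNR/10). The following sketch applies that rearrangement to the worked example; the function name `required_noise_power` is an assumption for illustration.

```python
def required_noise_power(audio_power, snr_db):
    """Noise power B that yields snr_db for a signal of power audio_power.

    Rearranged from SNR(dB) = 10 * log10(A / B), so B = A / 10 ** (SNR / 10).
    """
    return audio_power / (10 ** (snr_db / 10))


# Worked example from the text: A = 10000 W, SNR = 30 dB.
noise_power = required_noise_power(10000.0, 30.0)
```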
Step S203 can be used to calculate the power of the noise signal to be added to each audio signal.
For example, the power of audio signal 1 and signal-to-noise ratio 1 give the power of the noise signal to be added to audio signal 1 in one case, and the power of audio signal 1 and signal-to-noise ratio 2 give it in another case. Likewise, the power of audio signal 2 and signal-to-noise ratio 1 give the power of the noise signal to be added to audio signal 2 in one case, and the power of audio signal 2 and signal-to-noise ratio 2 give it in another case.
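The enumeration above covers every combination of audio signal and signal-to-noise ratio. A minimal sketch of that N x P table follows. The power of audio signal 1 and the 30 dB value come from the text; the power of audio signal 2 and the 20 dB value are hypothetical, as are the names used.

```python
from itertools import product


def noise_power_table(audio_powers, snrs_db):
    """Map (audio name, SNR in dB) to the noise power needed for that pair."""
    return {
        (name, snr): power / (10 ** (snr / 10))
        for (name, power), snr in product(audio_powers.items(), snrs_db)
    }


audio_powers = {"audio1": 10000.0, "audio2": 4000.0}  # audio2 value is hypothetical
snrs_db = [30.0, 20.0]                                # 20 dB is hypothetical
table = noise_power_table(audio_powers, snrs_db)      # 2 x 2 = 4 entries
```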
S204: The electronic device adjusts the power of the M types of noise signals according to the power of the noise signal to be added to the first audio signal.
In one possible case, the user inputs only one noise type. After obtaining, through step S203, the power of the noise signal to be added to the first audio signal, the electronic device can directly determine the adjusted power of the noise signal input by the user. For example, if the noise selected by the user is white noise, then based on the preceding example the power of the white noise signal is determined to be 10 W.
In another possible case, the user inputs multiple noise types. In this case the user also needs to enter the weights of these noise types in the user input interface. For example, see FIG. 5A, a schematic diagram of a user input interface. The user can click the weight input box 305 in the user input interface to enter the weight of each noise signal; to enter weights for multiple noise signals, the user can click the "+" sign to the right of the weight input box 305. For example, as shown in FIG. 5B, the noise types input by the user include white noise and pink noise, and the weights of white noise and pink noise are 3:2. After the electronic device determines, according to the power of the first audio signal and the first signal-to-noise ratio, the power of the noise signal to be added to the first audio signal, the method further includes: the electronic device acquires the weights of the multiple noise types and, according to those weights, determines the noise signal power corresponding to each of the multiple noise signals.
Taking FIG. 5A as an example, the noise types include white noise and pink noise with weights of 3:2. Since the total noise signal power is 10 W, dividing it according to the weights gives a signal power of 6 W for the white noise and 4 W for the pink noise.
After determining the noise signal power corresponding to each noise type, the electronic device adjusts the power of each noise signal. For example, if the noise input by the user is white noise with a signal power of 9 W and pink noise with a signal power of 5 W, the electronic device adjusts the power of the white noise to 6 W and the power of the pink noise to 4 W.
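The weight split and the power adjustment can be sketched as follows. The application does not say how the adjustment is carried out; the sketch assumes mean-square power and a single gain of sqrt(target / current) applied to every sample, which is one standard way to reach a target power. The function names are hypothetical.

```python
import math


def split_noise_power(total_power, weights):
    """Divide total_power among noise types in proportion to their weights."""
    total_weight = sum(weights)
    return [total_power * w / total_weight for w in weights]


def scale_to_power(samples, target_power):
    """Rescale samples so their mean-square power equals target_power.

    Assumption: power is the mean of squared amplitudes, so multiplying
    every sample by g multiplies the power by g squared.
    """
    current = sum(x * x for x in samples) / len(samples)
    if current == 0:
        raise ValueError("cannot rescale an all-zero signal")
    gain = math.sqrt(target_power / current)
    return [x * gain for x in samples]


# Example from the text: 10 W split 3:2 gives 6 W and 4 W.
white_target, pink_target = split_noise_power(10.0, [3, 2])

# A stand-in signal of power 9 (the text's white noise power) rescaled to 6.
adjusted = scale_to_power([3.0, -3.0, 3.0, -3.0], white_target)
adjusted_power = sum(x * x for x in adjusted) / len(adjusted)
```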
S205: The electronic device mixes the first audio signal with the M power-adjusted noise signals to obtain the noise-added signal corresponding to the first audio signal.
After adjusting the power of each noise signal, the electronic device mixes the noise signals with the audio signal to obtain the noise-added signal.
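A minimal sketch of the mixing step, assuming that mixing means sample-wise addition and that a noise clip shorter than the audio is looped to cover it. Neither detail is specified in the application, and the function name `mix` is hypothetical.

```python
from itertools import cycle, islice


def mix(audio, noises):
    """Add each noise clip to the audio sample by sample.

    Noise clips are repeated (or truncated) to match the audio length;
    this looping behaviour is an assumption of this sketch.
    """
    mixed = list(audio)
    for noise in noises:
        if not noise:
            raise ValueError("noise clips must not be empty")
        for i, n in enumerate(islice(cycle(noise), len(audio))):
            mixed[i] += n
    return mixed


noisy = mix([1.0, 2.0, 3.0], [[0.5, -0.5], [0.1]])
```

The output has the same length as the input audio, so it can be stored or played back in place of the clean signal.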
For example, from the power of audio signal 1 and signal-to-noise ratio 1, the electronic device can calculate the power of the noise signal to be added to audio signal 1 in one case, adjust the power of each noise signal according to that noise power, and finally mix the power-adjusted noise signals with audio signal 1 to obtain a noise-added output signal whose signal-to-noise ratio is signal-to-noise ratio 1. From the power of audio signal 1 and signal-to-noise ratio 2, it can calculate the power of the noise signal to be added to audio signal 1 in another case, adjust the power of each noise signal accordingly, and mix the power-adjusted noise signals with audio signal 1 to obtain another noise-added output signal whose signal-to-noise ratio is signal-to-noise ratio 2. In the same way, from the power of audio signal 2 and signal-to-noise ratio 1, it obtains a noise-added output signal for audio signal 2 whose signal-to-noise ratio is signal-to-noise ratio 1, and from the power of audio signal 2 and signal-to-noise ratio 2, it obtains a noise-added output signal for audio signal 2 whose signal-to-noise ratio is signal-to-noise ratio 2. Thus, when the user inputs 2 audio signals and 2 signal-to-noise ratios, 4 noise-added signals can be output.
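The batch behaviour described here amounts to processing every (audio signal, signal-to-noise ratio) pair, so N audio signals and P ratios yield N × P outputs. A small sketch of that enumeration follows; the function name `plan_jobs` and the string identifiers are illustrative only.

```python
from itertools import product


def plan_jobs(audio_names, snrs_db):
    """List every (audio, SNR) pair that needs its own noise-added output."""
    return list(product(audio_names, snrs_db))


# Two audio signals and two signal-to-noise ratios give four jobs.
jobs = plan_jobs(["audio_1", "audio_2"], [10, 20])
```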
Optionally, after the electronic device mixes the first audio signal with the M power-adjusted noise signals to obtain the noise-added signal corresponding to the first audio signal, the method further includes:
The electronic device applies feature marks to the noise-added audio signal. The feature marks include the signal-to-noise ratio of the noise-added audio signal, the type of noise added to it, and the power of the noise added to it.
Specifically, after the electronic device mixes different noise types in different proportions into multiple audio signals and obtains multiple noise-added audio signals, it applies feature marks that indicate the noise types in each mixture and the signal-to-noise ratio of each noise-added signal, which makes the noise-added audio easy to tell apart. The storage table for noise-added audio may be, for example but not limited to, as shown in Table 1:
Table 1
Figure PCTCN2019117172-appb-000001
Taking audio A as an example, the noise-added versions of audio A are: audio A1, with a signal-to-noise ratio of 10 dB after white noise and pink noise are added, and audio A2, with a signal-to-noise ratio of 20 dB after white noise and pink noise are added.
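One way to carry the feature marks alongside each noise-added signal is a small immutable record, sketched below. The class name `NoiseLabel`, its field names, and the noise power value of 10.0 are all assumptions made for illustration; the application only states which pieces of information the mark contains.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class NoiseLabel:
    """Feature mark for one noise-added signal (hypothetical layout)."""
    source_audio: str    # which clean audio the output was derived from
    snr_db: float        # signal-to-noise ratio of the noise-added signal
    noise_types: tuple   # names of the noise types that were mixed in
    noise_power: float   # total power of the added noise


def make_label(source_audio, snr_db, noise_types, noise_power):
    """Build a label, normalising the field types."""
    return NoiseLabel(source_audio, float(snr_db), tuple(noise_types), float(noise_power))


# Audio A1 from the example: audio A with white and pink noise at 10 dB.
# The noise power here is a placeholder, not a value taken from the application.
label_a1 = make_label("A", 10, ["white", "pink"], 10.0)
record = asdict(label_a1)
```

A record like `record` could be written as one row of a storage table, so that outputs can later be filtered by signal-to-noise ratio or noise type.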
It should be noted that the foregoing embodiments are described using the case in which all of the noise input by the user is used for noise addition. In practice, the electronic device may use different noise for different signals and need not use all of the noise input by the user. For example, suppose the audio signals input by the user include audio signal 1 and audio signal 2, the noise signals input by the user include noise signal 1 and noise signal 2, and the signal-to-noise ratios input by the user include signal-to-noise ratio 1 and signal-to-noise ratio 2. When adding noise to audio signal 1, the electronic device may select only one of noise signal 1 and noise signal 2. Similarly, when adding noise to audio signal 2, the electronic device may also select only one of noise signal 1 and noise signal 2. The storage table for noise-added audio may be, for example but not limited to, as shown in Table 2:
Table 2
Figure PCTCN2019117172-appb-000002
Taking audio A as an example, the noise-added versions of audio A are: audio A1, with a signal-to-noise ratio of 10 dB after white noise is mixed in; audio A2, with a signal-to-noise ratio of 20 dB after white noise is mixed in; audio A3, with a signal-to-noise ratio of 10 dB after pink noise is mixed in; and audio A4, with a signal-to-noise ratio of 10 dB after pink noise is mixed in.
After the electronic device mixes the first audio signal with the M power-adjusted noise signals to obtain the noise-added signal corresponding to the first audio signal, the method further includes:
The electronic device uses the noise-added signal corresponding to each of the N audio signals to train a music recognition system, so that the music recognition system can recognize noisy sounds in a real environment.
By implementing the embodiments of this application, the electronic device can add noise to one or more audio signals at the same time, can mix noise of multiple types in a single pass, and can produce noise-added signals at the signal-to-noise ratios actually required. This batch processing simplifies the noise-adding operation and saves time, and the adjustable signal-to-noise ratio meets diverse user needs.
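The whole flow for one audio signal and one signal-to-noise ratio can be sketched end to end as follows. This is an illustration under stated assumptions, not the applicant's implementation: power is taken as the mean square of the samples, mixing is sample-wise addition, and the names `power` and `add_noise` are hypothetical. The two toy noise sequences are chosen to be uncorrelated, because only then does the power of their sum equal the sum of their powers.

```python
import math


def power(samples):
    """Mean-square power of a list of samples (an assumed definition)."""
    return sum(x * x for x in samples) / len(samples)


def add_noise(audio, noises, weights, snr_db):
    """Return audio plus weighted noises at the requested SNR in dB."""
    # Total noise power from SNR(dB) = 10*log10(A/B), solved for B.
    total_noise_power = power(audio) / (10 ** (snr_db / 10))
    weight_sum = sum(weights)
    mixed = list(audio)
    for noise, weight in zip(noises, weights):
        # Share of the total noise power allocated to this noise type.
        target = total_noise_power * weight / weight_sum
        gain = math.sqrt(target / power(noise))
        for i in range(len(mixed)):
            mixed[i] += gain * noise[i % len(noise)]
    return mixed


audio = [2.0, 2.0, 2.0, 2.0]
white_like = [1.0, -1.0, 1.0, -1.0]   # toy stand-in for white noise
pink_like = [1.0, 1.0, -1.0, -1.0]    # toy stand-in, uncorrelated with the first
noisy = add_noise(audio, [white_like, pink_like], [3, 2], snr_db=10.0)

# Check the result: recover the added noise and measure the achieved SNR.
residual = [n - a for n, a in zip(noisy, audio)]
achieved_snr_db = 10 * math.log10(power(audio) / power(residual))
```

With real recordings the noise types are rarely perfectly uncorrelated, so the achieved signal-to-noise ratio may deviate slightly from the target; measuring it afterwards, as in the last two lines, is a simple way to detect that.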
Referring to FIG. 6, which is a schematic structural diagram of an audio processing apparatus, the audio processing apparatus 600 includes: a first acquisition unit 601, a second acquisition unit 602, a calculation unit 603, an adjustment unit 604, and a mixing unit 605.
The first acquisition unit 601 is configured to acquire N audio signals, M types of noise signals, and P signal-to-noise ratios input by a user, where N, M, and P are all positive integers;
The second acquisition unit 602 is configured to acquire the power of each of the N audio signals and the power of each of the M types of noise signals;
The calculation unit 603 is configured to, for a first audio signal among the N audio signals and a first signal-to-noise ratio among the P signal-to-noise ratios, calculate the power of the noise signal to be added to the first audio signal according to the power of the first audio signal and the first signal-to-noise ratio;
The adjustment unit 604 is configured to adjust the power of the M types of noise signals according to the power of the noise signal to be added to the first audio signal;
The mixing unit 605 is configured to mix the first audio signal with the M power-adjusted noise signals to obtain the noise-added signal corresponding to the first audio signal.
In one implementation, the calculation unit 603 is specifically configured to:
Calculate the power of the noise signal to be added to the first audio signal according to the Shannon formula, where the Shannon formula is signal-to-noise ratio (dB) = 10*log10(A/B), A is the power of the first audio signal, and B is the power of the noise signal to be added to the first audio signal.
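Solving the formula above for B gives B = A / 10^(SNR/10). A short Python sketch of that rearrangement follows; the function name `required_noise_power` is hypothetical, and the 100 W input is an arbitrary illustrative value.

```python
import math


def required_noise_power(signal_power, snr_db):
    """Noise power B that yields the given SNR for a signal of power A.

    Rearranged from SNR(dB) = 10*log10(A/B), so B = A / 10**(SNR/10).
    """
    if signal_power <= 0:
        raise ValueError("signal power must be positive")
    return signal_power / (10 ** (snr_db / 10))


# An audio signal with power 100 W at target SNRs of 10 dB and 20 dB.
noise_at_10db = required_noise_power(100.0, 10.0)
noise_at_20db = required_noise_power(100.0, 20.0)

# Round trip: plugging B back into the formula recovers the target SNR.
recovered_snr_db = 10 * math.log10(100.0 / noise_at_10db)
```

Each additional 10 dB of signal-to-noise ratio divides the required noise power by ten, so the 20 dB case needs one tenth of the noise power of the 10 dB case.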
In one implementation, M is an integer greater than or equal to 2, and the audio processing apparatus further includes:
A third acquisition unit, configured to acquire the weights of the M types of noise signals input by the user;
The adjustment unit 604 includes:
An allocation unit, configured to allocate the power of the noise signal to be added to the first audio signal among the M types of noise signals according to their weights;
A processing unit, configured to adjust the power of each of the M types of noise signals according to the power allocated to it.
In one implementation, the audio processing apparatus 600 further includes:
A training unit, configured to train a music recognition system using the noise-added signal corresponding to each of the N audio signals.
In one implementation, for use before the training unit trains the music recognition system with the noise-added signal corresponding to each of the N audio signals, the apparatus further includes:
A marking unit, configured to apply feature marks to the noise-added signal corresponding to each of the N audio signals to obtain a labeled noise-added signal corresponding to each of the N audio signals, where the feature marks include the signal-to-noise ratio of the noise-added signal, the type of noise added to the noise-added signal, and the power of the noise added to the noise-added signal;
The training unit is specifically configured to train the music recognition system using the labeled noise-added signal corresponding to each of the N audio signals.
In one implementation, the second acquisition unit 602 includes:
A first extraction unit, configured to extract the amplitude of each audio signal and obtain the power of each audio signal from its amplitude; and a second extraction unit, configured to extract the amplitude of each noise signal and obtain the power of each noise signal from its amplitude.
In one implementation, the audio signals include audio signals that the user inputs to the electronic device through a voice input device. For example, the voice input device may be a microphone.
In one implementation, the noise signals include noise signals that the user inputs to the electronic device through a voice input device. For example, a noise signal may be a user-recorded sound such as running water or birdsong. Optionally, the noise signal may also be white noise, pink noise, or the like, which can be generated by a signal generating device.
In one implementation, the second acquisition unit 602 is specifically configured to:
Extract the amplitude of each audio signal and obtain the power of each audio signal from its amplitude;
Extract the amplitude of each noise signal and obtain the power of each noise signal from its amplitude.
It should be noted that, for the functions and implementation of each unit in the audio processing apparatus 600, reference may be made to the related description of the method embodiment shown in FIG. 2; details are not repeated here.
Another embodiment of this application provides a computer non-volatile readable storage medium. The computer non-volatile readable storage medium stores a computer program, and the computer program includes program instructions which, when executed by a processor, implement the audio processing method described above.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer non-volatile readable storage medium, or transmitted from one computer non-volatile readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (for example, infrared, radio, or microwave). The computer non-volatile readable storage medium may be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), a semiconductor medium (for example, a solid state disk (SSD)), or the like.
The specific implementations described above explain the purpose, technical solutions, and beneficial effects of the embodiments of this application in further detail. It should be understood that the above are merely specific implementations of the embodiments of this application and are not intended to limit their scope of protection. Any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the embodiments of this application shall fall within the scope of protection of the embodiments of this application.

Claims (20)

  1. An audio processing method, characterized by comprising:
    acquiring, by an electronic device, N audio signals, M types of noise signals, and P signal-to-noise ratios input by a user, where N, M, and P are all positive integers;
    acquiring, by the electronic device, the power of each of the N audio signals and the power of each of the M types of noise signals;
    for a first audio signal among the N audio signals and a first signal-to-noise ratio among the P signal-to-noise ratios, calculating, by the electronic device, the power of the noise signal to be added to the first audio signal according to the power of the first audio signal and the first signal-to-noise ratio;
    adjusting, by the electronic device, the power of the M types of noise signals according to the power of the noise signal to be added to the first audio signal;
    mixing, by the electronic device, the first audio signal with the power-adjusted M types of noise signals to obtain a noise-added signal corresponding to the first audio signal.
  2. The method according to claim 1, wherein the calculating, by the electronic device, of the power of the noise signal to be added to the first audio signal according to the power of the first audio signal and the first signal-to-noise ratio comprises:
    calculating, by the electronic device, the power of the noise signal to be added to the first audio signal according to the Shannon formula, where the Shannon formula is signal-to-noise ratio (dB) = 10*log10(A/B), A is the power of the first audio signal, and B is the power of the noise signal to be added to the first audio signal.
  3. The method according to claim 1 or 2, wherein M is an integer greater than or equal to 2, and the method further comprises:
    acquiring, by the electronic device, the weights of the M types of noise signals input by the user;
    wherein the adjusting, by the electronic device, of the power of the M types of noise signals according to the power of the noise signal to be added to the first audio signal comprises:
    allocating, by the electronic device, the power of the noise signal to be added to the first audio signal among the M types of noise signals according to the weights of the M types of noise signals;
    adjusting, by the electronic device, the power of each of the M types of noise signals according to the power allocated to it.
  4. The method according to any one of claims 1 to 3, wherein after the electronic device mixes the first audio signal with the power-adjusted M types of noise signals to obtain the noise-added signal corresponding to the first audio signal, the method further comprises:
    training, by the electronic device, a music recognition system using the noise-added signal corresponding to each of the N audio signals.
  5. The method according to claim 4, wherein before the electronic device trains the music recognition system using the noise-added signal corresponding to each of the N audio signals, the method further comprises:
    applying, by the electronic device, feature marks to the noise-added signal corresponding to each of the N audio signals to obtain a labeled noise-added signal corresponding to each of the N audio signals, where the feature marks include one or more of the signal-to-noise ratio of the noise-added signal, the type of noise added to the noise-added signal, and the power of the noise added to the noise-added signal;
    wherein the training, by the electronic device, of the music recognition system using the noise-added signal corresponding to each of the N audio signals comprises:
    training, by the electronic device, the music recognition system using the labeled noise-added signal corresponding to each of the N audio signals.
  6. The method according to any one of claims 1 to 5, wherein the acquiring, by the electronic device, of the power of each of the N audio signals and the power of each of the M types of noise signals comprises:
    extracting, by the electronic device, the amplitude of each audio signal, and obtaining the power of each audio signal from its amplitude;
    extracting, by the electronic device, the amplitude of each noise signal, and obtaining the power of each noise signal from its amplitude.
  7. The method according to any one of claims 1 to 6, wherein the audio signals include audio signals input to the electronic device by the user through a voice input device.
  8. The method according to any one of claims 1 to 7, wherein the noise signals include noise signals input to the electronic device by the user through a voice input device.
  9. An audio processing apparatus, characterized by comprising:
    a first acquisition unit, configured to acquire N audio signals, M types of noise signals, and P signal-to-noise ratios input by a user, where N, M, and P are all positive integers;
    a second acquisition unit, configured to acquire the power of each of the N audio signals and the power of each of the M types of noise signals;
    a calculation unit, configured to, for a first audio signal among the N audio signals and a first signal-to-noise ratio among the P signal-to-noise ratios, calculate the power of the noise signal to be added to the first audio signal according to the power of the first audio signal and the first signal-to-noise ratio;
    an adjustment unit, configured to adjust the power of the M types of noise signals according to the power of the noise signal to be added to the first audio signal;
    a mixing unit, configured to mix the first audio signal with the power-adjusted M types of noise signals to obtain a noise-added signal corresponding to the first audio signal.
  10. The apparatus according to claim 9, wherein the calculation unit is specifically configured to:
    calculate the power of the noise signal to be added to the first audio signal according to the Shannon formula, where the Shannon formula is signal-to-noise ratio (dB) = 10*log10(A/B), A is the power of the first audio signal, and B is the power of the noise signal to be added to the first audio signal.
  11. The apparatus according to claim 9 or 10, wherein M is an integer greater than or equal to 2, and the apparatus further comprises:
    a third acquisition unit, configured to acquire the weights of the M types of noise signals input by the user;
    wherein the adjustment unit comprises:
    an allocation unit, configured to allocate the power of the noise signal to be added to the first audio signal among the M types of noise signals according to the weights of the M types of noise signals;
    a noise adjustment unit, configured to adjust the power of each of the M types of noise signals according to the power allocated to it.
  12. The apparatus according to any one of claims 9 to 11, further comprising, for use after the mixing unit mixes the first audio signal with the power-adjusted M types of noise signals to obtain the noise-added signal corresponding to the first audio signal:
    a training unit, configured to train a music recognition system using the noise-added signal corresponding to each of the N audio signals.
  13. The apparatus according to claim 12, further comprising, for use before the training unit trains the music recognition system using the noise-added signal corresponding to each of the N audio signals:
    a marking unit, configured to apply feature marks to the noise-added signal corresponding to each of the N audio signals to obtain a labeled noise-added signal corresponding to each of the N audio signals, where the feature marks include the signal-to-noise ratio of the noise-added signal, the type of noise added to the noise-added signal, and the power of the noise added to the noise-added signal;
    wherein the training unit is specifically configured to:
    train the music recognition system using the labeled noise-added signal corresponding to each of the N audio signals.
  14. The apparatus according to any one of claims 9 to 13, wherein the second acquisition unit comprises:
    a first extraction unit, configured to extract the amplitude of each audio signal and obtain the power of each audio signal from its amplitude;
    a second extraction unit, configured to extract the amplitude of each noise signal and obtain the power of each noise signal from its amplitude.
  15. The apparatus according to any one of claims 9 to 14, wherein the audio signals include audio signals input to the audio processing apparatus by the user through a voice input device.
  16. The method according to any one of claims 9 to 15, wherein the noise signals include noise signals input to the audio processing apparatus by the user through a voice input device.
  17. An electronic device, characterized by comprising:
    one or more processors;
    a memory;
    one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to perform the following steps:
    acquiring N audio signals, M types of noise signals, and P signal-to-noise ratios input by a user, where N, M, and P are all positive integers;
    acquiring the power of each of the N audio signals and the power of each of the M types of noise signals;
    for a first audio signal among the N audio signals and a first signal-to-noise ratio among the P signal-to-noise ratios, calculating the power of the noise signal to be added to the first audio signal according to the power of the first audio signal and the first signal-to-noise ratio;
    adjusting the power of the M types of noise signals according to the power of the noise signal to be added to the first audio signal;
    mixing the first audio signal with the power-adjusted M types of noise signals to obtain a noise-added signal corresponding to the first audio signal.
  18. The electronic device according to claim 17, wherein the calculating of the power of the noise signal to be added to the first audio signal according to the power of the first audio signal and the first signal-to-noise ratio comprises:
    calculating the power of the noise signal to be added to the first audio signal according to the Shannon formula, where the Shannon formula is signal-to-noise ratio (dB) = 10*log10(A/B), A is the power of the first audio signal, and B is the power of the noise signal to be added to the first audio signal.
  19. The electronic device according to claim 17 or 18, wherein M is an integer greater than or equal to 2, and the steps further comprise:
    acquiring the weights of the M types of noise signals input by the user;
    wherein the adjusting of the power of the M types of noise signals according to the power of the noise signal to be added to the first audio signal comprises:
    allocating the power of the noise signal to be added to the first audio signal among the M types of noise signals according to the weights of the M types of noise signals;
    adjusting the power of each of the M types of noise signals according to the power allocated to it.
  20. A non-volatile computer-readable storage medium, wherein the non-volatile computer-readable storage medium stores a computer program, and the computer program is executed by a processor to implement the method according to any one of claims 1 to 8.
PCT/CN2019/117172 2019-08-12 2019-11-11 Audio processing method and apparatus and computer storage medium WO2021027132A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910748281.1 2019-08-12
CN201910748281.1A CN110600022B (en) 2019-08-12 2019-08-12 Audio processing method and device and computer storage medium

Publications (1)

Publication Number Publication Date
WO2021027132A1 (en) 2021-02-18

Family

ID=68854167

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117172 WO2021027132A1 (en) 2019-08-12 2019-11-11 Audio processing method and apparatus and computer storage medium

Country Status (2)

Country Link
CN (1) CN110600022B (en)
WO (1) WO2021027132A1 (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102097100A (en) * 2011-01-07 2011-06-15 蔡镇滨 Device and method for reducing steady-state noises through adding noises
CN103280215A (en) * 2013-05-28 2013-09-04 北京百度网讯科技有限公司 Audio frequency feature library establishing method and device
US9564144B2 (en) * 2014-07-24 2017-02-07 Conexant Systems, Inc. System and method for multichannel on-line unsupervised bayesian spectral filtering of real-world acoustic noise
CN107481731A (en) * 2017-08-01 2017-12-15 百度在线网络技术(北京)有限公司 Speech data enhancement method and system
CN107680586A (en) * 2017-08-01 2018-02-09 百度在线网络技术(北京)有限公司 Far-field speech acoustic model training method and system
CN108133702A (en) * 2017-12-20 2018-06-08 重庆邮电大学 Deep neural network speech enhancement model based on the MEE optimality criterion
CN108899041A (en) * 2018-08-20 2018-11-27 百度在线网络技术(北京)有限公司 Method, device and storage medium for adding noise to a voice signal
CN108922517A (en) * 2018-07-03 2018-11-30 百度在线网络技术(北京)有限公司 Method, apparatus and storage medium for training a blind source separation model

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4933973A (en) * 1988-02-29 1990-06-12 Itt Corporation Apparatus and methods for the selective addition of noise to templates employed in automatic speech recognition systems
US6937980B2 (en) * 2001-10-02 2005-08-30 Telefonaktiebolaget Lm Ericsson (Publ) Speech recognition using microphone antenna array
ATE405925T1 (en) * 2004-09-23 2008-09-15 Harman Becker Automotive Sys MULTI-CHANNEL ADAPTIVE VOICE SIGNAL PROCESSING WITH NOISE CANCELLATION
US9799330B2 (en) * 2014-08-28 2017-10-24 Knowles Electronics, Llc Multi-sourced noise suppression
CN108022591B (en) * 2017-12-30 2021-03-16 北京百度网讯科技有限公司 Processing method and device for voice recognition in-vehicle environment and electronic equipment
CN109473094A (en) * 2018-11-12 2019-03-15 东风汽车有限公司 Vehicle-mounted control screen voice recognition rate testing method, electronic equipment and system

Also Published As

Publication number Publication date
CN110600022A (en) 2019-12-20
CN110600022B (en) 2024-02-27

Similar Documents

Publication Publication Date Title
US11240050B2 (en) Online document sharing method and apparatus, electronic device, and storage medium
JP6596173B1 (en) Incoming call management method and apparatus
WO2016184119A1 (en) Volume adjustment method, system and equipment, and computer storage medium
US11474775B2 (en) Sound effect adjustment method, device, electronic device and storage medium
EP3007163A1 (en) Asynchronous chorus method and device
US20200184948A1 (en) Speech playing method, an intelligent device, and computer readable storage medium
WO2018126613A1 (en) Method for playing audio data and dual-screen mobile terminal
WO2021012952A1 (en) Message processing method, device and electronic equipment
CN110246499B (en) Voice control method and device for household equipment
CN110827843A (en) Audio processing method and device, storage medium and electronic equipment
US11822854B2 (en) Automatic volume adjustment method and apparatus, medium, and device
CN110809214A (en) Audio playing method, audio playing device and terminal equipment
WO2016150191A1 (en) Data sharing method and device
WO2021072914A1 (en) Human-machine conversation processing method
US11741984B2 (en) Method and apparatus and telephonic system for acoustic scene conversion
CN108829370B (en) Audio resource playing method and device, computer equipment and storage medium
WO2020134547A1 (en) Fixed-point acceleration method and apparatus for data, electronic device and storage medium
WO2021027132A1 (en) Audio processing method and apparatus and computer storage medium
KR102157790B1 (en) Method, system, and computer program for operating live quiz show platform including characters
EP1783600A2 (en) Method for arbitrating audio data output apparatuses
CN112188342A (en) Equalization parameter determination method and device, electronic equipment and storage medium
US20070067169A1 (en) Method for arbitrating audio data output apparatuses
WO2021042538A1 (en) Method and device for audio processing, and computer storage medium
CN113393863B (en) Voice evaluation method, device and equipment
WO2023134328A1 (en) Electronic device control method and apparatus, and electronic device

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 19941352

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 19941352

Country of ref document: EP

Kind code of ref document: A1
