CN114449339B - Background sound effect conversion method and device, computer equipment and storage medium - Google Patents

Info

Publication number: CN114449339B
Authority: CN (China)
Prior art keywords: target, spectrogram, sound effect, frequency, audio
Legal status: Active
Application number: CN202210140971.0A
Other languages: Chinese (zh)
Other versions: CN114449339A (en)
Inventor: 彭宁
Current Assignee: Shenzhen Wondershare Software Co Ltd
Original Assignee: Shenzhen Wondershare Software Co Ltd
Application filed by: Shenzhen Wondershare Software Co Ltd
Priority to: CN202210140971.0A
Publication of CN114449339A
Application granted
Publication of CN114449339B

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439: Processing of audio elementary streams
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81: Monomedia components thereof
    • H04N21/8106: Monomedia components thereof involving special audio data, e.g. different tracks for different languages
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00: Circuits for transducers, loudspeakers or microphones

Abstract

The application relates to the technical field of image processing and discloses a background sound effect conversion method and device, computer equipment, and a storage medium. The method comprises: obtaining audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect, and a diving sound effect; framing the audio to be processed according to a preset sampling frequency to obtain basic audio; performing a short-time Fourier transform on the basic audio with a Hamming window of a preset size to obtain an initial spectrogram; modifying the frequency range of the initial spectrogram based on the target conversion type to obtain a target spectrogram; and restoring the target spectrogram to a time domain signal to obtain a target sound effect. By modifying the frequency range of the spectrogram according to the sound effect conversion type, the embodiments of the invention help improve the accuracy of the background sound effect conversion and make the background sound effect more recognizable.

Description

Background sound effect conversion method and device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of communications technologies, and in particular, to a method and apparatus for converting background sound effects, a computer device, and a storage medium.
Background
When creating a video, users can add background sound that matches the video content in order to simulate the scene's environment. In some scenes, however, this practice is ineffective, for example for a diving sound effect, a radio sound effect, or a telephone sound effect. Even if water sounds or electrical-current sounds that match the scene are added, the result is still not realistic: it sounds like nothing more than the sum of the background sound and the original audio.
Existing audio modulation and demodulation technology can generate radio and telephone sounds and is applied in various communication devices such as wireless interphones and mobile phones. In addition, a similar sound can be created in the audio editing software Audio Director, but the Audio Director result amounts to little more than frequency band suppression. The existing technology therefore lacks recognizability when converting background sound effects, so the accuracy of the conversion is low.
Disclosure of Invention
An objective of the embodiments of the present application is to provide a method, an apparatus, a computer device, and a storage medium for converting a background sound effect, so as to improve the accuracy of the converting effect of the background sound effect.
In order to solve the above technical problems, an embodiment of the present application provides a method for converting background sound effects, including:
acquiring audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect, and a diving sound effect;
framing the audio to be processed according to a preset sampling frequency to obtain basic audio;
performing short-time Fourier transform processing on the basic audio according to a Hamming window with a preset size to obtain an initial spectrogram;
based on the target conversion type, carrying out frequency range modification processing on the initial spectrogram to obtain a target spectrogram;
and restoring the target spectrogram to a time domain signal to obtain a target sound effect.
Further, the performing frequency range modification processing on the initial spectrogram based on the target conversion type to obtain a target spectrogram includes:
if the target conversion type is the radio sound effect, carrying out frequency range modification processing on the initial spectrogram according to a preset frequency band to obtain the target spectrogram;
if the target conversion type is the telephone sound effect, carrying out frequency range modification processing on the initial spectrogram by adopting a first preset formula and a second preset formula to obtain the target spectrogram;
and if the target conversion type is the diving sound effect, performing gradual change processing on the initial spectrogram by adopting a third preset formula so as to perform frequency range modification processing on the initial spectrogram to obtain the target spectrogram.
Further, if the target conversion type is the radio sound effect, performing frequency range modification processing on the initial spectrogram according to a preset frequency band to obtain the target spectrogram, including:
if the target conversion type is the radio sound effect, setting the signals with frequencies lower than 250 Hz and higher than 8000 Hz in the initial spectrogram to 0, setting the frequency band from 250 Hz to 2700 Hz to twice its original value, and setting the frequency band from 2700 Hz to 8000 Hz to one fifth of its original value, so as to obtain the target spectrogram.
Further, if the target conversion type is the phone sound effect, performing frequency range modification processing on the initial spectrogram by using a first preset formula and a second preset formula to obtain the target spectrogram, including:
if the target conversion type is the telephone sound effect, setting the signals with frequencies lower than 300Hz and higher than 8000Hz in the initial spectrogram to be 0, adopting the first preset formula to carry out frequency modification on the frequencies from 300Hz to 3000Hz, and adopting the second preset formula to carry out frequency modification on the frequencies from 3000Hz to 8000Hz, so as to obtain the target spectrogram.
Further, the first preset formula is $y = y_1 \cdot (1 - 0.125i)$;
Where $i$ denotes the $i$-th row, $i \in (0, 72)$, $y_1$ is the frequency domain value of the current row, and $y$ is the modified frequency domain value;
the second preset formula is $y = y_2 \cdot (1 - 0.0028j)$;
Where $j$ denotes the $j$-th row, $j \in (50, 400)$, $y_2$ is the frequency domain value of the current row, and $y$ is the modified frequency domain value.
Further, the third preset formula is that
Where $i$ denotes the $i$-th row, $i \in (8, 1025)$, $y_3$ is the frequency domain value of the current row, and $y$ is the modified frequency domain value.
Further, the restoring the target spectrogram to a time domain signal to obtain a target sound effect includes:
and performing an inverse short-time Fourier transform on the target spectrogram corresponding to each frame of the basic audio, so as to restore the target spectrogram to a time domain signal, until all target spectrograms have been restored to time domain signals, thereby obtaining the target sound effect.
In order to solve the above technical problem, an embodiment of the present application provides a device for converting background sound effects, including:
a to-be-processed audio acquisition module, configured to acquire audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect, and a diving sound effect;
the basic audio generation module is used for carrying out framing processing on the audio to be processed according to a preset sampling frequency to obtain basic audio;
the initial spectrogram generation module is used for carrying out short-time Fourier transform processing on the basic audio according to a Hamming window with a preset size to obtain an initial spectrogram;
the target spectrogram generation module is used for carrying out frequency range modification processing on the initial spectrogram based on the target conversion type to obtain a target spectrogram;
and the target sound effect generation module is used for restoring the target spectrogram into a time domain signal to obtain the target sound effect.
In order to solve the above technical problems, the invention adopts the following technical scheme: a computer device is provided, comprising one or more processors and a memory for storing one or more programs which, when executed, cause the one or more processors to implement the background sound effect conversion method described above.
In order to solve the technical problems, the invention adopts a technical scheme that: a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of converting a background sound effect of any one of the above.
The embodiments of the invention provide a background sound effect conversion method and device, computer equipment, and a storage medium. The method comprises: acquiring audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect, and a diving sound effect; framing the audio to be processed according to a preset sampling frequency to obtain basic audio; performing short-time Fourier transform processing on the basic audio according to a Hamming window with a preset size to obtain an initial spectrogram; carrying out frequency range modification processing on the initial spectrogram based on the target conversion type to obtain a target spectrogram; and restoring the target spectrogram to a time domain signal to obtain a target sound effect. Framing the audio to be processed and applying the Fourier transform with the preset Hamming window converts the audio into a spectrogram, which makes it possible to modify the frequency range of the audio accurately. The spectrogram is then modified according to the target conversion type and restored to a time domain signal to obtain the target sound effect. Because the frequency range of the spectrogram is adjusted precisely for each sound effect conversion type, the accuracy of the background sound effect conversion improves and the background sound effect becomes more recognizable.
Drawings
For a clearer description of the solution in the present application, a brief description will be given below of the drawings that are needed in the description of the embodiments of the present application, it being obvious that the drawings in the following description are some embodiments of the present application, and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a flowchart of an implementation of a sub-process in a background sound conversion method according to an embodiment of the present disclosure;
FIG. 2 is a flowchart of still another implementation of a sub-flow in a method for converting background sound effects provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of a device for converting background sound effects according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a computer device provided in an embodiment of the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the applications herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description and claims of the present application and in the description of the figures above are intended to cover non-exclusive inclusions. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
In order to better understand the technical solutions of the present application, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings.
The present invention will be described in detail with reference to the drawings and embodiments.
It should be noted that, the method for converting background sound effects provided in the embodiments of the present application is generally executed by a server, and accordingly, the device for converting background sound effects is generally configured in the server.
Referring to fig. 1, fig. 1 illustrates a specific embodiment of a method for converting background sound effects.
It should be noted that, provided substantially the same result is obtained, the method of the present invention is not limited to the flow sequence shown in fig. 1. The method includes the following steps:
S1: Acquire the audio to be processed and a target conversion type, wherein the target conversion type includes a radio sound effect, a telephone sound effect, and a diving sound effect.
Specifically, when creating a video, a user needs to add background sound that matches the video content in order to simulate the scene's environment. In some scenes, such as those calling for a diving, radio, or telephone sound effect, this approach is ineffective: even if water sounds, electrical-current sounds, or telephone sounds that match the scene are added, the result is not realistic and sounds like nothing more than the sum of the background sound and the original audio. For example, a diving video needs a background sound with a diving sound effect, but the existing method simply superimposes the diving sound on the diving video without processing the background sound, so the accuracy of the conversion is low and the background sound effect lacks recognizability. Therefore, when background sound needs to be added to specific video content, the embodiments of the present application first acquire the audio to be processed that corresponds to that video content, together with the sound effect conversion type the user requires, namely the target conversion type. The target conversion types include a radio sound effect, a telephone sound effect, and a diving sound effect.
The embodiments of the present application limit the frequency domain range of the audio for each environment (radio sound effect, telephone sound effect, and diving sound effect) and change how the audio varies in the frequency domain under each background sound. Because the audio is cut into frames, it can be converted in real time. The user can therefore switch sound effects freely in real time and achieve a realistic result.
S2: Frame the audio to be processed according to a preset sampling frequency to obtain basic audio.
Specifically, the audio to be processed is framed at the preset sampling frequency to obtain basic audio. The first frame is taken as the audio input; after the first frame of audio has been converted into the target sound effect, each subsequent frame is input in turn until all of the audio to be processed has undergone sound effect conversion.
The preset sampling frequency is set according to the actual situation and is not limited herein. Because 44100 Hz is the sampling rate of audio CDs and is also commonly used for MPEG-1 audio (VCD, SVCD, MP3), the preferred preset sampling frequency in the embodiments of the present application is 44100 Hz. The sampling frequency is the number of signal samples the computer collects per second; its inverse is the sampling period, or sampling time, which is the time interval between samples. The higher the sampling frequency, that is, the shorter the sampling interval, the more sample data the computer obtains per unit time and the more accurately the signal waveform is represented.
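The framing step and the relation between sampling frequency and sampling period can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names are hypothetical, and the frame length of 2048 samples is an assumption (the text does not state how long each frame is).

```python
def sampling_period(sample_rate_hz: float) -> float:
    """Return the time between consecutive samples, in seconds."""
    return 1.0 / sample_rate_hz


def split_into_frames(samples, frame_size):
    """Cut a sample sequence into consecutive frames, zero-padding the last one."""
    frames = []
    for start in range(0, len(samples), frame_size):
        frame = list(samples[start:start + frame_size])
        # Pad a short final frame so every frame has the same length.
        frame.extend([0.0] * (frame_size - len(frame)))
        frames.append(frame)
    return frames


SAMPLE_RATE_HZ = 44100  # preferred sampling frequency named in the text
FRAME_SIZE = 2048       # assumption: frame length is not given in the text

one_second_of_silence = [0.0] * SAMPLE_RATE_HZ
frames = split_into_frames(one_second_of_silence, FRAME_SIZE)
print(len(frames), "frames of", FRAME_SIZE, "samples")
print("sampling period:", sampling_period(SAMPLE_RATE_HZ), "s")
```

Each frame can then be passed on to the later steps one at a time, which is what allows the conversion to run incrementally.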
S3: Perform short-time Fourier transform processing on the basic audio according to a Hamming window with a preset size to obtain an initial spectrogram.
Specifically, the short-time Fourier transform is applied to the basic audio with a Hamming window of a fixed size, converting the basic audio into a spectrogram so that its frequency range can be modified accurately.
The short-time Fourier transform (STFT, also called the short-term Fourier transform) is a mathematical transform related to the Fourier transform that determines the frequency and phase of the sinusoidal components in local sections of a time-varying signal. In the embodiments of the present application, applying the short-time Fourier transform to the basic audio converts it into a spectrogram whose frequency range can be modified accurately, which in turn helps generate the background sound effect accurately.
It should be noted that, the preset size of the hamming window is set according to the actual situation, and is not limited herein, and in one embodiment, the preset size of the hamming window is 2048.
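As a rough illustration of what a window size of 2048 implies at 44100 Hz, the sketch below builds a Hamming window from its textbook definition and derives the width of one spectrogram row. The helper names are hypothetical, and the use of a one-sided spectrum (window size divided by two, plus one row) is an assumption; it is, however, consistent with the 21.5 Hz per row and the upper row index of 1025 that appear elsewhere in this description.

```python
import math


def hamming_window(size: int) -> list:
    """Symmetric Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (size - 1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (size - 1))
            for n in range(size)]


def row_width_hz(sample_rate_hz: float, window_size: int) -> float:
    """Frequency spacing between adjacent spectrogram rows."""
    return sample_rate_hz / window_size


def one_sided_row_count(window_size: int) -> int:
    """Rows in a one-sided spectrum of a real signal (0 Hz up to Nyquist)."""
    return window_size // 2 + 1


window = hamming_window(2048)
print("row width (Hz):", row_width_hz(44100, 2048))
print("rows per column:", one_sided_row_count(2048))
print("window edge value:", window[0])
```

Multiplying each frame by the window before the transform tapers the frame edges, which reduces spectral leakage between neighbouring rows.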
S4: Perform frequency range modification processing on the initial spectrogram based on the target conversion type to obtain a target spectrogram.
Specifically, the frequency range of the initial spectrogram is modified according to the target conversion type, so that the modified spectrogram corresponds to the sound effect of that target conversion type. The modified spectrogram is the target spectrogram.
Referring to fig. 2, fig. 2 shows a specific embodiment of step S4, which is described in detail as follows:
S41: If the target conversion type is the radio sound effect, perform frequency range modification processing on the initial spectrogram according to a preset frequency band to obtain the target spectrogram.
Further, step S41 includes: if the target conversion type is the radio sound effect, the signals with frequencies lower than 250 Hz and higher than 8000 Hz in the initial spectrogram are set to 0, the frequency band from 250 Hz to 2700 Hz is set to twice its original value, and the frequency band from 2700 Hz to 8000 Hz is set to one fifth of its original value, so that the target spectrogram is obtained.
Specifically, the short-time Fourier transform of the basic audio yields the initial spectrogram, in which each row represents one frequency band with a width of 21.5 Hz. In this embodiment, signals below 250 Hz and above 8000 Hz are all set to 0 to simulate the frequency range of a radio. The band from 250 Hz to 2700 Hz is set to twice its original value; the human ear is sensitive to this band, and the sound there is slightly clunky, which simulates the intermediate frequency signal in a radio. The band from 2700 Hz to 8000 Hz is set to one fifth of its original value, which reduces the influence of that band on the sound.
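The radio band shaping described above can be sketched as a per-row gain applied to one spectrogram column. This is a hedged illustration, not the patented code: the function names are hypothetical, the bands are read as amplitude scaling (zero, double, one fifth), the lower band edge is taken as 250 Hz, and assigning exactly 2700 Hz to the doubled band is an arbitrary choice.

```python
SAMPLE_RATE_HZ = 44100
WINDOW_SIZE = 2048
ROW_WIDTH_HZ = SAMPLE_RATE_HZ / WINDOW_SIZE  # about 21.5 Hz per row


def radio_gain(freq_hz: float) -> float:
    """Gain for one row: mute the extremes, boost the mids, cut the highs."""
    if freq_hz < 250.0 or freq_hz > 8000.0:
        return 0.0   # outside the simulated radio range
    if freq_hz <= 2700.0:
        return 2.0   # band the ear is most sensitive to
    return 0.2       # one fifth of the original value


def apply_radio_effect(column):
    """Scale every row of one spectrogram column by its band gain."""
    return [value * radio_gain(row * ROW_WIDTH_HZ)
            for row, value in enumerate(column)]


# One column of a flat spectrum, 1025 rows from 0 Hz to the Nyquist frequency.
flat_column = [1 + 0j] * (WINDOW_SIZE // 2 + 1)
shaped = apply_radio_effect(flat_column)
print(shaped[5], shaped[46], shaped[200], shaped[1000])
```

Because the gain is real and depends only on the row, the phase of each row is left untouched and only its magnitude changes.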
S42: If the target conversion type is the telephone sound effect, perform frequency range modification processing on the initial spectrogram by adopting a first preset formula and a second preset formula to obtain the target spectrogram.
Further, step S42 includes: if the target conversion type is the telephone sound effect, the signals with frequencies lower than 300 Hz and higher than 8000 Hz in the initial spectrogram are set to 0, the first preset formula is used to modify the frequencies from 300 Hz to 3000 Hz, and the second preset formula is used to modify the frequencies from 3000 Hz to 8000 Hz, so that the target spectrogram is obtained.
In particular, the telephone sound effect occupies a relatively low frequency band, mainly between 300 Hz and 3000 Hz. The embodiments of the present application therefore preserve as much of the original character of the sound as possible in the 300 Hz to 3000 Hz band, while tapering the 3000 Hz to 8000 Hz band gradually to 0 in order to reduce the effect of an abrupt drop to 0.
Further, the frequencies from 300 Hz to 3000 Hz are modified using the first preset formula, which is $y = y_1 \cdot (1 - 0.125i)$;
Where $i$ denotes the $i$-th row, $i \in (0, 72)$, $y_1$ is the frequency domain value of the current row, and $y$ is the modified frequency domain value;
further, the frequencies from 3000 Hz to 8000 Hz are modified using the second preset formula, which is $y = y_2 \cdot (1 - 0.0028j)$;
Where $j$ denotes the $j$-th row, $j \in (50, 400)$, $y_2$ is the frequency domain value of the current row, and $y$ is the modified frequency domain value.
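The two telephone formulas can be evaluated directly to see how the multiplier changes with the row index. The sketch below only transcribes the formulas as written; the function names are hypothetical, and how the indices i and j map onto actual spectrogram rows is not stated in the text, so no such mapping is assumed here.

```python
def first_preset_formula(y1: complex, i: int) -> complex:
    """y = y1 * (1 - 0.125 * i), stated for the 300 Hz to 3000 Hz band."""
    return y1 * (1 - 0.125 * i)


def second_preset_formula(y2: complex, j: int) -> complex:
    """y = y2 * (1 - 0.0028 * j), stated for the 3000 Hz to 8000 Hz band."""
    return y2 * (1 - 0.0028 * j)


# Multipliers for a unit-valued row at a few sample indices.
for i in (0, 4, 8):
    print("i =", i, "->", first_preset_formula(1.0, i))
for j in (50, 200, 357):
    print("j =", j, "->", second_preset_formula(1.0, j))
```

With these constants the first multiplier reaches zero at i = 8 and the second comes close to zero around j = 357, so the result depends heavily on how i and j are counted against the spectrogram rows.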
S43: If the target conversion type is the diving sound effect, perform gradual change processing on the initial spectrogram by adopting a third preset formula, so as to modify the frequency range of the initial spectrogram and obtain the target spectrogram.
In particular, the diving sound effect is relatively low-pitched and requires a very low frequency range, so the embodiments of the present application limit the frequency range to 1000 Hz.
Further, the third preset formula is
Where $i$ denotes the $i$-th row, $i \in (8, 1025)$, $y_3$ is the frequency domain value of the current row, and $y$ is the modified frequency domain value.
In this embodiment, the frequency range of the initial spectrogram is modified according to the target conversion type. For the radio sound effect, the initial spectrogram is modified according to the preset frequency bands; for the telephone sound effect, it is modified using the first and second preset formulas; and for the diving sound effect, it is gradually attenuated using the third preset formula. In each case the result is the target spectrogram. Adjusting the frequency range of the spectrogram precisely for each target conversion type helps improve the accuracy of the background sound effect conversion and makes the background sound effect more recognizable.
S5: Restore the target spectrogram to a time domain signal to obtain the target sound effect.
Further, step S5 includes: performing an inverse short-time Fourier transform on the target spectrogram corresponding to each frame of basic audio, so as to restore the target spectrogram to a time domain signal, until all target spectrograms have been restored to time domain signals, thereby obtaining the target sound effect.
Specifically, in the embodiments of the present application, the first frame of basic audio is input first and processed through steps S3 to S5 to obtain its target sound effect. Each subsequent frame of basic audio is then input and processed through steps S3 to S5 in the same way, until every frame of basic audio has its corresponding target sound effect. The inverse short-time Fourier transform restores each target spectrogram to a time domain signal, and once all target spectrograms have been restored, the target sound effect is obtained.
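The frame-by-frame loop (transform, modify, inverse transform, then move to the next frame) can be sketched with a plain discrete Fourier transform. This is a simplified, dependency-free illustration under several assumptions: the function names are hypothetical, a naive O(n²) DFT stands in for an FFT library, windowing and frame overlap are omitted, and frames are simply concatenated after the inverse transform.

```python
import cmath
import math


def dft(frame):
    """Naive forward discrete Fourier transform of one frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def inverse_dft(spectrum):
    """Naive inverse transform; keeps the real part as the time domain signal."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def process_frames(frames, modify):
    """Transform each frame, apply `modify` to its spectrum, and restore it."""
    output = []
    for frame in frames:
        spectrum = dft(frame)
        output.extend(inverse_dft(modify(spectrum)))
    return output


def identity(spectrum):
    """Placeholder modification that leaves the spectrum unchanged."""
    return spectrum


frames = [[0.0, 1.0, 0.0, -1.0], [0.5, 0.5, -0.5, -0.5]]
restored = process_frames(frames, identity)
print(restored)
```

With the identity modification the output matches the input up to rounding error, which is a convenient check before substituting a real band-shaping function for `modify`.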
In this embodiment, audio to be processed and a target conversion type are acquired, where the target conversion type includes a radio sound effect, a telephone sound effect, and a diving sound effect; the audio to be processed is framed according to a preset sampling frequency to obtain basic audio; short-time Fourier transform processing is performed on the basic audio according to a Hamming window with a preset size to obtain an initial spectrogram; the frequency range of the initial spectrogram is modified based on the target conversion type to obtain a target spectrogram; and the target spectrogram is restored to a time domain signal to obtain a target sound effect. Framing the audio to be processed and applying the Fourier transform with the preset Hamming window converts the audio into a spectrogram, which makes it possible to modify the frequency range of the audio accurately. The spectrogram is then modified according to the target conversion type and restored to a time domain signal to obtain the target sound effect. Because the frequency range of the spectrogram is adjusted precisely for each sound effect conversion type, the accuracy of the background sound effect conversion improves and the background sound effect becomes more recognizable.
Those skilled in the art will appreciate that all or part of the methods in the above embodiments may be implemented by a computer program stored in a computer-readable storage medium; when executed, the program may carry out the steps of the method embodiments described above. The storage medium may be a nonvolatile storage medium such as a magnetic disk, an optical disk, or a read-only memory (ROM), or it may be a random access memory (RAM).
Referring to fig. 3, as an implementation of the method shown in fig. 1, the present application provides an embodiment of a device for converting background sound effects, where the embodiment of the device corresponds to the embodiment of the method shown in fig. 1, and the device may be specifically applied to various electronic devices.
As shown in fig. 3, the device for converting background sound effects of the present embodiment includes: a to-be-processed audio acquisition module 61, a basic audio generation module 62, an initial spectrogram generation module 63, a target spectrogram generation module 64, and a target sound effect generation module 65, wherein:
a to-be-processed audio acquisition module 61, configured to acquire to-be-processed audio and a target conversion type, where the target conversion type includes a radio sound effect, a telephone sound effect, and a diving sound effect;
the basic audio generation module 62 is configured to perform framing processing on audio to be processed according to a preset sampling frequency to obtain basic audio;
the initial spectrogram generation module 63 is configured to perform short-time Fourier transform processing on the basic audio according to a Hamming window with a preset size, so as to obtain an initial spectrogram;
the target spectrogram generating module 64 is configured to perform frequency range modification processing on the initial spectrogram based on the target conversion type, so as to obtain a target spectrogram;
the target sound effect generating module 65 is configured to restore the target spectrogram to a time domain signal, so as to obtain a target sound effect.
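By way of a non-limiting illustration, the framing and Hamming-windowed short-time Fourier transform performed by modules 62 and 63 can be sketched as follows. The function names, the frame size of 16 samples, and the hop of 8 samples are assumptions chosen for the example and are not taken from the embodiment. A naive discrete Fourier transform stands in for a fast Fourier transform so that the sketch needs only the Python standard library.

```python
import cmath
import math


def hamming(size):
    """Hamming window: 0.54 - 0.46 * cos(2 * pi * n / (size - 1))."""
    if size < 2:
        return [1.0] * size
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (size - 1)) for n in range(size)]


def split_frames(samples, frame_size, hop):
    """Cut the samples into overlapping frames, zero-padding the tail."""
    frames = []
    for start in range(0, len(samples), hop):
        frame = list(samples[start:start + frame_size])
        frame.extend([0.0] * (frame_size - len(frame)))
        frames.append(frame)
    return frames


def rfft(frame):
    """Naive DFT that keeps only the non-negative frequency rows 0 .. N // 2."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def stft(samples, frame_size=16, hop=8):
    """Return spectrogram[frame][row] of complex frequency domain values."""
    window = hamming(frame_size)
    return [
        rfft([s * w for s, w in zip(frame, window)])
        for frame in split_frames(samples, frame_size, hop)
    ]


# A cosine with exactly two cycles per 16-sample frame peaks in row 2.
samples = [math.cos(2 * math.pi * 2 * n / 16) for n in range(32)]
spectrogram = stft(samples)
magnitudes = [abs(value) for value in spectrogram[0]]
print(magnitudes.index(max(magnitudes)))
```

In this sketch, row k of a frame corresponds to a frequency of k × (sampling frequency) / (frame size), which is the relation a later step would use to locate a band given in Hz.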
Further, the target spectrogram generating module 64 includes:
the radio sound effect modification unit is used for modifying the frequency range of the initial spectrogram according to a preset frequency band to obtain a target spectrogram if the target conversion type is radio sound effect;
the telephone sound effect modification unit is used for modifying the frequency range of the initial spectrogram by adopting a first preset formula and a second preset formula if the target conversion type is the telephone sound effect, so as to obtain a target spectrogram;
and the diving sound effect modifying unit is used for carrying out gradual change processing on the initial spectrogram by adopting a third preset formula if the target conversion type is diving sound effect so as to carry out frequency range modification processing on the initial spectrogram to obtain the target spectrogram.
Further, the radio sound effect modification unit includes:
and the radio sound effect frequency limiting subunit is used for, if the target conversion type is the radio sound effect, setting signals with frequencies lower than 250 Hz or higher than 8000 Hz in the initial spectrogram to 0, setting the frequency band from 50 Hz to 2700 Hz to twice its original value, and setting the frequency band from 2700 Hz to 8000 Hz to one fifth of its original value, so as to obtain the target spectrogram.
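As a hedged sketch of the radio sound effect described above, the band limits can be expressed as a per-row gain applied to every frame of the spectrogram. Several points are assumptions of this example rather than statements of the embodiment: "twice" and "one fifth" are read as scale factors on the frequency domain values, zeroing below 250 Hz takes precedence over the doubled band that the text starts at 50 Hz, and a row lying exactly at 2700 Hz is placed in the upper band. The names `radio_gain` and `apply_radio_effect` are hypothetical.

```python
def radio_gain(freq_hz):
    """Gain for a spectrogram row centred at freq_hz (assumed reading of the text)."""
    if freq_hz < 250.0 or freq_hz > 8000.0:
        return 0.0  # rows outside 250 Hz .. 8000 Hz are set to 0
    if freq_hz < 2700.0:
        return 2.0  # lower band is doubled
    return 0.2      # 2700 Hz .. 8000 Hz is reduced to one fifth


def apply_radio_effect(spectrogram, sample_rate, frame_size):
    """Scale every row of every frame; spectrogram[frame][row] holds complex values."""
    hz_per_row = sample_rate / frame_size
    return [
        [value * radio_gain(row * hz_per_row) for row, value in enumerate(frame)]
        for frame in spectrogram
    ]


# One frame of 33 rows at 500 Hz per row (32 kHz sampling, 64-sample frames).
flat = [[1 + 0j] * 33]
target = apply_radio_effect(flat, sample_rate=32000, frame_size=64)
print([abs(v) for v in target[0][:8]])
```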
Further, the telephone sound effect modification unit includes:
and the telephone sound effect frequency limiting subunit is used for, if the target conversion type is the telephone sound effect, setting signals with frequencies lower than 300 Hz or higher than 8000 Hz in the initial spectrogram to 0, modifying the frequencies from 300 Hz to 3000 Hz by the first preset formula, and modifying the frequencies from 3000 Hz to 8000 Hz by the second preset formula, so as to obtain the target spectrogram.
Further, the first preset formula is $y = y_1 \cdot (1 - 0.125\,i)$;
Where $i$ represents the $i$-th row, $i \in (0, 72)$, $y_1$ represents the frequency domain value of the current row, and $y$ is the modified frequency domain value;
the second preset formula is $y = y_2 \cdot (1 - 0.0028\,j)$;
Where $j$ represents the $j$-th row, $j \in (50, 400)$, $y_2$ represents the frequency domain value of the current row, and $y$ is the modified frequency domain value.
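The two formulas can be evaluated row by row as in the following sketch. It transcribes the formulas literally. The helper `scale_rows` and the choice to pass the absolute row number as i or j are assumptions of this example, since the text does not state how row indices map to the 300 Hz, 3000 Hz, and 8000 Hz band edges.

```python
def first_formula(y1, i):
    """y = y1 * (1 - 0.125 * i), stated for the 300 Hz to 3000 Hz band."""
    return y1 * (1 - 0.125 * i)


def second_formula(y2, j):
    """y = y2 * (1 - 0.0028 * j), stated for the 3000 Hz to 8000 Hz band."""
    return y2 * (1 - 0.0028 * j)


def scale_rows(frame, formula, rows):
    """Return a copy of frame with formula(value, row) applied to the given rows."""
    out = list(frame)
    for row in rows:
        out[row] = formula(frame[row], row)
    return out


# Rows 1..4 of a flat frame are attenuated linearly; rows 0 and 5 are untouched.
print(scale_rows([1.0] * 6, first_formula, range(1, 5)))
```

Taken literally, the factor of the first formula reaches 0 at i = 8 and is negative for larger i, so over most of the stated range it inverts the sign of the frequency domain value rather than only attenuating it.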
Further, the third preset formula is
Where $i$ represents the $i$-th row, $i \in (8, 1025)$, $y_3$ represents the frequency domain value of the current row, and $y$ is the modified frequency domain value.
Further, the target sound effect generating module 65 includes:
and the spectrogram conversion unit is used for performing an inverse short-time Fourier transform on the target spectrogram corresponding to each frame of the basic audio, so as to restore that target spectrogram to a time domain signal, until all target spectrograms have been restored to time domain signals, thereby obtaining the target sound effect.
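One possible way to restore the frames to a single time domain signal is weighted overlap-add, sketched below. The embodiment only states that an inverse short-time Fourier transform is applied per frame. The overlap-add step, the normalisation by the summed squared window, and the names `irfft` and `istft` are assumptions of this example, and a naive inverse DFT is used so that only the standard library is required.

```python
import cmath
import math


def irfft(rows, frame_size):
    """Inverse of a naive DFT that kept only rows 0 .. frame_size // 2."""
    # Rebuild the mirrored negative-frequency rows of a real signal's spectrum.
    full = list(rows) + [
        rows[frame_size - k].conjugate() for k in range(len(rows), frame_size)
    ]
    return [
        sum(full[k] * cmath.exp(2j * cmath.pi * k * t / frame_size)
            for k in range(frame_size)).real / frame_size
        for t in range(frame_size)
    ]


def istft(spectrogram, frame_size, hop, window):
    """Overlap-add the inverse transform of every frame into one signal."""
    length = hop * (len(spectrogram) - 1) + frame_size
    signal = [0.0] * length
    weight = [0.0] * length
    for index, rows in enumerate(spectrogram):
        start = index * hop
        for t, sample in enumerate(irfft(rows, frame_size)):
            signal[start + t] += sample * window[t]
            weight[start + t] += window[t] ** 2
    # Dividing by the summed squared window undoes the analysis windowing.
    return [s / w if w > 1e-12 else 0.0 for s, w in zip(signal, weight)]


# Round trip: window and transform three overlapping frames, then restore them.
window = [0.54 - 0.46 * math.cos(2 * math.pi * n / 7) for n in range(8)]
audio = [math.sin(0.3 * n) for n in range(16)]
spectrogram = [
    [sum(audio[s + t] * window[t] * cmath.exp(-2j * cmath.pi * k * t / 8)
         for t in range(8)) for k in range(5)]
    for s in (0, 4, 8)
]
restored = istft(spectrogram, frame_size=8, hop=4, window=window)
print(max(abs(a - b) for a, b in zip(audio, restored)))
```

Because the Hamming window never reaches zero, every sample covered by at least one frame has a positive weight, so an unmodified spectrogram is recovered up to rounding error.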
In order to solve the technical problems, the embodiment of the application also provides computer equipment. Referring specifically to fig. 4, fig. 4 is a basic structural block diagram of a computer device according to the present embodiment.
The computer device 7 comprises a memory 71, a processor 72, and a network interface 73 communicatively connected to each other via a system bus. It is noted that only a computer device 7 having the three components memory 71, processor 72, and network interface 73 is shown in the figure, but it should be understood that not all of the illustrated components are required and that more or fewer components may be implemented instead. It will be appreciated by those skilled in the art that the computer device herein is a device capable of automatically performing numerical calculations and/or information processing in accordance with predetermined or stored instructions, the hardware of which includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), embedded devices, etc.
The computer device may be a desktop computer, a notebook computer, a palm computer, a cloud server, or the like. The computer device can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory 71 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card memory (e.g., SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 71 may be an internal storage unit of the computer device 7, such as a hard disk or a memory of the computer device 7. In other embodiments, the memory 71 may also be an external storage device of the computer device 7, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the computer device 7. Of course, the memory 71 may also comprise both an internal storage unit of the computer device 7 and an external storage device. In the present embodiment, the memory 71 is typically used to store an operating system installed on the computer device 7 and various types of application software, such as the program code of the background sound effect conversion method. In addition, the memory 71 may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 72 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 72 is typically used to control the overall operation of the computer device 7. In this embodiment, the processor 72 is configured to execute the program code stored in the memory 71 or to process data, for example to run the program code of the background sound effect conversion method described above, so as to implement the various embodiments of that method.
The network interface 73 may comprise a wireless network interface or a wired network interface, which network interface 73 is typically used to establish a communication connection between the computer device 7 and other electronic devices.
The present application also provides another embodiment, namely, a computer readable storage medium, where a computer program is stored, where the computer program is executable by at least one processor, so that the at least one processor performs the steps of a method for converting a background sound effect as described above.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method of the embodiments of the present application.
A blockchain is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another by cryptographic methods, each data block containing a batch of network transaction information that is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
It is apparent that the embodiments described above are only some, not all, of the embodiments of the present application. The drawings show preferred embodiments of the present application, but they do not limit its patent scope. This application may be embodied in many different forms; the embodiments are provided so that the disclosure will be understood more thoroughly. Although the present application has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that the embodiments may be modified or that equivalents may be substituted for some of their elements. Any equivalent structure made using the content of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of protection of the present application.

Claims (7)

1. A method for converting background sound effects, comprising:
acquiring audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect and a diving sound effect;
framing the audio to be processed according to a preset sampling frequency to obtain basic audio;
performing short-time Fourier transform processing on the basic audio according to a Hamming window with a preset size to obtain an initial spectrogram;
based on the target conversion type, carrying out frequency range modification processing on the initial spectrogram to obtain a target spectrogram;
restoring the target spectrogram into a time domain signal to obtain a target sound effect;
wherein the carrying out frequency range modification processing on the initial spectrogram based on the target conversion type to obtain a target spectrogram comprises:
if the target conversion type is the radio sound effect, setting signals with frequencies lower than 250Hz and higher than 8000Hz in the initial spectrogram as 0, setting the frequency range from 50Hz to 2700Hz to be twice of the original frequency range, and setting the frequency range from 2700Hz to 8000Hz to be one fifth of the original frequency range to obtain the target spectrogram;
if the target conversion type is the telephone sound effect, setting the frequency of the signals with the frequency lower than 300Hz and the frequency higher than 8000Hz in the initial spectrogram to be 0, adopting a first preset formula to carry out frequency modification on the frequencies from 300Hz to 3000Hz, and adopting a second preset formula to carry out frequency modification on the frequencies from 3000Hz to 8000Hz, so as to obtain the target spectrogram;
and if the target conversion type is the diving sound effect, performing gradual change processing on the initial spectrogram by adopting a third preset formula so as to perform frequency range modification processing on the initial spectrogram to obtain the target spectrogram.
2. The method for converting background sound effects according to claim 1, wherein the first preset formula is $y = y_1 \cdot (1 - 0.125\,i)$;
wherein $i$ represents the $i$-th row, $i \in (0, 72)$, $y_1$ represents the frequency domain value of the current row, and $y$ is the modified frequency domain value;
the second preset formula is $y = y_2 \cdot (1 - 0.0028\,j)$;
wherein $j$ represents the $j$-th row, $j \in (50, 400)$, $y_2$ represents the frequency domain value of the current row, and $y$ is the modified frequency domain value.
3. The method for converting background sound effects according to claim 1, wherein the third preset formula is
Where $i$ represents the $i$-th row, $i \in (8, 1025)$, $y_3$ represents the frequency domain value of the current row, and $y$ is the modified frequency domain value.
4. A method for converting a background sound effect according to any one of claims 1 to 3, wherein the restoring the target spectrogram to a time domain signal to obtain the target sound effect comprises: performing an inverse short-time Fourier transform on the target spectrogram corresponding to each frame of the basic audio, so as to restore that target spectrogram to a time domain signal, until all target spectrograms have been restored to time domain signals, thereby obtaining the target sound effect.
5. A background sound effect conversion apparatus, comprising:
the to-be-processed audio acquisition module is used for acquiring audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect and a diving sound effect;
the basic audio generation module is used for carrying out framing processing on the audio to be processed according to a preset sampling frequency to obtain basic audio;
the initial spectrogram generation module is used for carrying out short-time Fourier transform processing on the basic audio according to a Hamming window with a preset size to obtain an initial spectrogram;
the target spectrogram generation module is used for carrying out frequency range modification processing on the initial spectrogram based on the target conversion type to obtain a target spectrogram;
the target sound effect generation module is used for restoring the target spectrogram into a time domain signal to obtain a target sound effect;
the target sound effect generation module comprises:
a radio sound effect frequency limiting subunit, configured to set a signal with a frequency lower than 250Hz and a frequency higher than 8000Hz in the initial spectrogram to be 0, set a frequency band from 50Hz to 2700Hz to be twice of an original frequency band, and set a frequency band from 2700Hz to 8000Hz to be one fifth of the original frequency band if the target conversion type is the radio sound effect, so as to obtain the target spectrogram;
a telephone sound effect frequency limiting subunit, configured to set a signal with a frequency lower than 300Hz and a signal higher than 8000Hz in the initial spectrogram to be 0 if the target conversion type is the telephone sound effect, perform frequency modification on the signal with a frequency ranging from 300Hz to 3000Hz by using a first preset formula, and perform frequency modification on the signal with a frequency ranging from 3000Hz to 8000Hz by using a second preset formula, so as to obtain the target spectrogram;
and the diving sound effect modification unit is used for carrying out gradual change processing on the initial spectrogram by adopting a third preset formula if the target conversion type is the diving sound effect so as to carry out frequency range modification processing on the initial spectrogram to obtain the target spectrogram.
6. A computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the method of converting background sound effects according to any one of claims 1 to 4 when the computer program is executed.
7. A computer-readable storage medium, on which a computer program is stored, which computer program, when being executed by a processor, implements a method of converting a background sound effect according to any one of claims 1 to 4.
CN202210140971.0A 2022-02-16 2022-02-16 Background sound effect conversion method and device, computer equipment and storage medium Active CN114449339B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210140971.0A CN114449339B (en) 2022-02-16 2022-02-16 Background sound effect conversion method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114449339A CN114449339A (en) 2022-05-06
CN114449339B true CN114449339B (en) 2024-04-12

Family

ID=81374456

Country Status (1)

Country Link
CN (1) CN114449339B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant