CN114449339A - Background sound effect conversion method and device, computer equipment and storage medium - Google Patents

Info

Publication number
CN114449339A
Authority
CN
China
Prior art keywords
target
spectrogram
sound effect
audio
conversion type
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210140971.0A
Other languages
Chinese (zh)
Other versions
CN114449339B (en
Inventor
彭宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Wondershare Software Co Ltd
Original Assignee
Shenzhen Wondershare Software Co Ltd
Application filed by Shenzhen Wondershare Software Co Ltd
Priority to CN202210140971.0A
Publication of CN114449339A
Application granted
Publication of CN114449339B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/8106 Monomedia components thereof involving special audio data, e.g. different tracks for different languages
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones

Abstract

The application relates to the technical field of image processing and discloses a background sound effect conversion method and device, computer equipment, and a storage medium. The method comprises: acquiring audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect, and a diving sound effect; performing framing processing on the audio to be processed according to a preset sampling frequency to obtain basic audio; performing short-time Fourier transform processing on the basic audio with a Hamming window of a preset size to obtain an initial spectrogram; modifying the frequency range of the initial spectrogram based on the target conversion type to obtain a target spectrogram; and restoring the target spectrogram to a time-domain signal to obtain a target sound effect. Because the frequency range of the spectrogram is modified according to the sound effect conversion type, the conversion of the background sound effect is more accurate and the resulting background sound effect is more recognizable.

Description

Background sound effect conversion method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of communications technologies, and in particular, to a method and an apparatus for converting a background sound effect, a computer device, and a storage medium.
Background
When a user creates a video, background sound is added according to the specific video content in order to simulate the current background environment. This approach fails in some scenes, such as diving, radio, and telephone sound effects: even when underwater sounds and current sounds that match the scene are added, the result is not realistic and is heard only as the sum of the background sound and the original audio.
Existing sound effect modulation and demodulation technology can generate radio and telephone sounds and is used in various communication devices such as wireless interphones and mobile phones. Similar sounds can also be created in the audio editing software Audio Director, but the Audio Director result amounts only to frequency band suppression. The prior art therefore produces background sound effects that are hard to recognize, so the conversion of the background sound effect has low accuracy.
Disclosure of Invention
An object of the embodiments of the present application is to provide a method and an apparatus for converting a background sound effect, a computer device, and a storage medium, so as to improve the accuracy of the conversion effect of the background sound effect.
In order to solve the above technical problem, an embodiment of the present application provides a method for converting a background sound effect, including:
acquiring audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect and a diving sound effect;
performing framing processing on the audio to be processed according to a preset sampling frequency to obtain a basic audio;
according to a Hamming window with a preset size, carrying out short-time Fourier transform processing on the basic audio to obtain an initial spectrogram;
modifying the frequency range of the initial spectrogram based on the target conversion type to obtain a target spectrogram;
and restoring the target spectrogram into a time domain signal to obtain a target sound effect.
Further, the modifying the frequency range of the initial spectrogram based on the target conversion type to obtain a target spectrogram includes:
if the target conversion type is the radio sound effect, performing frequency range modification processing on the initial spectrogram according to a preset frequency band to obtain the target spectrogram;
if the target conversion type is the telephone sound effect, performing frequency range modification processing on the initial spectrogram by adopting a first preset formula and a second preset formula to obtain a target spectrogram;
and if the target conversion type is the diving sound effect, performing gradual change processing on the initial spectrogram by adopting a third preset formula so as to perform frequency range modification processing on the initial spectrogram to obtain the target spectrogram.
Further, if the target conversion type is the radio sound effect, performing frequency range modification processing on the initial spectrogram according to a preset frequency band to obtain the target spectrogram, including:
if the target conversion type is the radio sound effect, setting signals with frequencies lower than 250 Hz and higher than 8000 Hz in the initial spectrogram to 0, setting the frequency band from 50 Hz to 2700 Hz to twice the original, and setting the frequency band from 2700 Hz to 8000 Hz to one fifth of the original, to obtain the target spectrogram.
Further, if the target conversion type is the telephone sound effect, performing frequency range modification processing on the initial spectrogram by using a first preset formula and a second preset formula to obtain the target spectrogram, including:
if the target conversion type is the telephone sound effect, setting signals with frequencies lower than 300Hz and higher than 8000Hz in the initial spectrogram as 0, carrying out frequency modification on the frequencies from 300Hz to 3000Hz by adopting the first preset formula, and carrying out frequency modification on the frequencies from 3000Hz to 8000Hz by adopting the second preset formula to obtain the target spectrogram.
Further, the first preset formula is $y = y_1 \cdot (1 - 0.125\,i)$,
where $i$ denotes the $i$-th row, $i \in (0, 72)$, $y_1$ is the frequency-domain value of the current row, and $y$ is the modified frequency-domain value;
the second preset formula is $y = y_2 \cdot (1 - 0.0028\,j)$,
where $j$ denotes the $j$-th row, $j \in (50, 400)$, $y_2$ is the frequency-domain value of the current row, and $y$ is the modified frequency-domain value.
Further, the third preset formula is given by
[formula image BDA0003506911290000031; not reproduced in this text]
where $i$ denotes the $i$-th row, $i \in (8, 1025)$, $y_3$ is the frequency-domain value of the current row, and $y$ is the modified frequency-domain value.
Further, restoring the target spectrogram to a time-domain signal to obtain a target sound effect includes:
performing an inverse short-time Fourier transform on the target spectrogram corresponding to each frame of the basic audio to restore it to a time-domain signal, until all target spectrograms have been restored to time-domain signals, thereby obtaining the target sound effect.
In order to solve the above technical problem, an embodiment of the present application provides a conversion device for a background sound effect, including:
a to-be-processed audio acquisition module, configured to acquire audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect and a diving sound effect;
the basic audio generating module is used for performing framing processing on the audio to be processed according to a preset sampling frequency to obtain basic audio;
the initial spectrogram generating module is used for performing short-time Fourier transform processing on the basic audio according to a Hamming window with a preset size to obtain an initial spectrogram;
a target spectrogram generating module, configured to perform frequency range modification processing on the initial spectrogram based on the target conversion type to obtain a target spectrogram;
and the target sound effect generation module is used for restoring the target spectrogram to a time-domain signal to obtain a target sound effect.
In order to solve the above technical problem, the invention adopts the following technical solution: a computer device is provided, comprising one or more processors and a memory for storing one or more programs that, when executed, cause the one or more processors to implement the background sound effect conversion method described above.
In order to solve the above technical problem, the invention adopts the following technical solution: a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the background sound effect conversion method described in any one of the above.
The embodiment of the invention provides a background sound effect conversion method and device, computer equipment and a storage medium. The method comprises the following steps: acquiring audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect and a diving sound effect; performing framing processing on the audio to be processed according to a preset sampling frequency to obtain basic audio; performing short-time Fourier transform processing on the basic audio with a Hamming window of a preset size to obtain an initial spectrogram; modifying the frequency range of the initial spectrogram based on the target conversion type to obtain a target spectrogram; and restoring the target spectrogram to a time-domain signal to obtain a target sound effect. In the embodiment of the invention, the audio to be processed is framed and Fourier-transformed with the preset Hamming window, which converts it into a spectrogram whose frequency range can be modified accurately. The frequency range of the spectrogram is then modified according to the target conversion type, and the spectrogram is restored to a time-domain signal to obtain the target sound effect. Because the frequency range is adjusted precisely for each sound effect conversion type, the conversion of the background sound effect is more accurate and the background sound effect is more recognizable.
Drawings
In order to more clearly illustrate the solution of the present application, the drawings needed for describing the embodiments of the present application will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.
FIG. 1 is a flowchart illustrating an implementation of a sub-process in a method for converting a background sound effect according to an embodiment of the present application;
FIG. 2 is a flowchart of another implementation of a sub-process in the method for converting a background sound effect according to the embodiment of the present application;
FIG. 3 is a schematic diagram of a background sound effect conversion apparatus according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a computer device provided in an embodiment of the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
The present invention will be described in detail below with reference to the accompanying drawings and embodiments.
It should be noted that the method for converting the background sound effect provided in the embodiment of the present application is generally executed by a server, and accordingly, the converting apparatus for the background sound effect is generally configured in the server.
Referring to fig. 1, fig. 1 shows an embodiment of a method for converting a background sound effect.
It should be noted that, provided the result is substantially the same, the method of the present invention is not limited to the flow sequence shown in FIG. 1. The method includes the following steps:
S1: acquiring audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect and a diving sound effect.
Specifically, when creating a video, a user adds background sound according to the specific video content in order to simulate the current background environment. This approach fails in some scenes, such as diving, radio, and telephone sound effects: even when underwater sounds, current sounds, or telephone sounds that match the scene are added, the result is not realistic and is heard only as the sum of the background sound and the original audio. For example, a diving video needs a diving background sound, but the existing approach simply superimposes the diving audio on the diving video without processing the background sound, so the converted background sound effect is imprecise and hard to recognize. Therefore, when background sound needs to be added to specific video content, the embodiment of the application first acquires the audio to be processed corresponding to that video content and the sound effect conversion type required by the user, namely the target conversion type. The target conversion types include the radio sound effect, the telephone sound effect, and the diving sound effect.
The embodiment of the application limits the frequency-domain range of the audio for each environment (radio, telephone, and diving sound effects) and changes how the frequency domain is modified for each background sound. Because the audio is cut into frames, it can be converted in real time, so the user can freely switch sound effects in real time and obtain a realistic result.
S2: performing framing processing on the audio to be processed according to a preset sampling frequency to obtain basic audio.
Specifically, the audio to be processed is divided into frames at the preset sampling frequency to obtain the basic audio. The first frame is taken as the audio input; after it has been converted into the target sound effect, each subsequent frame is input in turn until the sound effect conversion has been applied to all of the audio to be processed.
It should be noted that the preset sampling frequency is set according to the actual situation and is not limited here. Because 44100 Hz is the sampling rate of audio CDs and is also commonly used for MPEG-1 audio (VCD, SVCD, MP3), the preferred preset sampling frequency in the embodiment of the present application is 44100 Hz. The sampling frequency is the number of signal samples the computer takes per second, and its inverse is the sampling period, that is, the time interval between samples. The higher the sampling frequency, the shorter the interval between samples, the more sample data the computer obtains per unit time, and the more accurately the signal waveform is represented.
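As a minimal sketch of step S2, the following Python snippet computes the sampling period for 44100 Hz and splits a sample sequence into fixed-length frames. The frame length of 2048 samples, the absence of overlap between frames, the zero-padding of the last frame, and the name `split_into_frames` are all assumptions made for illustration; the text does not state a frame length or an overlap.

```python
SAMPLE_RATE_HZ = 44100   # preferred sampling frequency named in the text
FRAME_SIZE = 2048        # assumption: the text does not give a frame length

def split_into_frames(samples, frame_size=FRAME_SIZE):
    """Split samples into consecutive, non-overlapping frames.

    The final frame is zero-padded so that every frame has the same length.
    """
    frames = []
    for start in range(0, len(samples), frame_size):
        frame = list(samples[start:start + frame_size])
        frame.extend([0.0] * (frame_size - len(frame)))
        frames.append(frame)
    return frames

# The sampling period is the inverse of the sampling frequency.
sample_period_s = 1.0 / SAMPLE_RATE_HZ

# One second of silence stands in for real audio here.
one_second_of_silence = [0.0] * SAMPLE_RATE_HZ
frames = split_into_frames(one_second_of_silence)
```

At 44100 Hz the sampling period is roughly 22.7 microseconds, and one second of audio fills 22 frames of 2048 samples under these assumptions.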
S3: performing short-time Fourier transform processing on the basic audio with a Hamming window of a preset size to obtain an initial spectrogram.
Specifically, the size of the Hamming window is fixed and the basic audio is processed with a short-time Fourier transform, which converts the basic audio into a spectrogram so that its frequency range can be modified accurately.
The short-time Fourier transform (STFT) is a mathematical transform related to the Fourier transform that determines the frequency and phase of the sinusoidal content in local sections of a time-varying signal. In the embodiment of the present application, applying the short-time Fourier transform to the basic audio converts it into a spectrogram, which makes it easier to modify the frequency range of the spectrogram accurately later and therefore helps generate the background sound effect accurately.
It should be noted that the preset size of the Hamming window is set according to the actual situation and is not limited here; in a specific embodiment, the preset size of the Hamming window is 2048.
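To make the window size concrete, the sketch below builds a 2048-point Hamming window with the standard symmetric definition and derives the spectrogram geometry that follows from a 44100 Hz sampling frequency. The function name `hamming_window` is hypothetical, and the choice of the symmetric rather than the periodic form of the window is an assumption.

```python
import math

SAMPLE_RATE_HZ = 44100
WINDOW_SIZE = 2048

def hamming_window(size):
    """Symmetric Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (size - 1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (size - 1))
            for n in range(size)]

window = hamming_window(WINDOW_SIZE)

# A real-valued frame of N samples yields N/2 + 1 distinct frequency rows.
num_rows = WINDOW_SIZE // 2 + 1

# Each row covers sample_rate / N hertz.
hz_per_row = SAMPLE_RATE_HZ / WINDOW_SIZE
```

With these values there are 1025 rows of about 21.5 Hz each, which agrees with the row width and the upper row index quoted elsewhere in the description.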
S4: modifying the frequency range of the initial spectrogram based on the target conversion type to obtain a target spectrogram.
Specifically, the frequency range of the initial spectrogram is modified according to the required target conversion type, so that the modified spectrogram corresponds to the sound effect of that target conversion type; this modified spectrogram is the target spectrogram.
Referring to FIG. 2, FIG. 2 shows an embodiment of step S4, which is described in detail as follows:
S41: if the target conversion type is the radio sound effect, modifying the frequency range of the initial spectrogram according to a preset frequency band to obtain a target spectrogram.
Further, step S41 includes: if the target conversion type is the radio sound effect, setting signals with frequencies lower than 250 Hz and higher than 8000 Hz in the initial spectrogram to 0, setting the frequency band from 50 Hz to 2700 Hz to twice the original, and setting the frequency band from 2700 Hz to 8000 Hz to one fifth of the original, to obtain the target spectrogram.
Specifically, the short-time Fourier transform of the basic audio yields the initial spectrogram, in which each row represents one frequency band and one row spans 21.5 Hz. In the embodiment of the present application, signals below 250 Hz and above 8000 Hz are all set to 0 to simulate the frequency range of a radio. The band from 250 Hz to 2700 Hz is set to twice the original; the human ear is sensitive to this band, and boosting it gives the sound a slightly heavy quality that simulates the mid-frequency signal of a radio. The band from 2700 Hz to 8000 Hz is set to one fifth of the original to reduce the influence of this band on the sound.
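The radio rule can be sketched as a per-row gain applied to one column of the spectrogram. This is an illustration under stated assumptions, not the patented implementation: "twice" and "one fifth" are read as multiplying the complex value of each row by 2 and by 0.2, the row width is taken as 44100 / 2048 Hz, the boundary at 2700 Hz is assigned to the boosted band, and the names `radio_gain` and `apply_radio_effect` are hypothetical. The text gives the lower edge of the boosted band as 50 Hz in some passages and 250 Hz in others; since everything below 250 Hz is zeroed, the sketch uses 250 Hz.

```python
SAMPLE_RATE_HZ = 44100
WINDOW_SIZE = 2048
HZ_PER_ROW = SAMPLE_RATE_HZ / WINDOW_SIZE  # about 21.5 Hz per spectrogram row

def radio_gain(freq_hz):
    """Gain applied at a given frequency for the radio sound effect."""
    if freq_hz < 250.0 or freq_hz > 8000.0:
        return 0.0          # outside the simulated radio range
    if freq_hz <= 2700.0:
        return 2.0          # boosted mid band (assumed to mean doubled values)
    return 0.2              # attenuated band, one fifth of the original

def apply_radio_effect(spectrum_column):
    """Scale every row of one spectrogram column by its radio gain."""
    return [value * radio_gain(row * HZ_PER_ROW)
            for row, value in enumerate(spectrum_column)]

# Usage: a flat column of 1025 rows makes the gain pattern directly visible.
flat_column = [1.0 + 0.0j] * (WINDOW_SIZE // 2 + 1)
radio_column = apply_radio_effect(flat_column)
```

Under these assumptions rows 12 to 125 are doubled, rows 126 to 371 are scaled to one fifth, and all other rows are zeroed.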
S42: if the target conversion type is the telephone sound effect, performing frequency range modification processing on the initial spectrogram by adopting a first preset formula and a second preset formula to obtain a target spectrogram.
Further, step S42 includes: if the target conversion type is the telephone sound effect, setting signals with frequencies lower than 300 Hz and higher than 8000 Hz in the initial spectrogram to 0, modifying the frequencies from 300 Hz to 3000 Hz by adopting the first preset formula, and modifying the frequencies from 3000 Hz to 8000 Hz by adopting the second preset formula, to obtain a target spectrogram.
Specifically, the telephone sound effect occupies a relatively low frequency band, mainly between 300 Hz and 3000 Hz. The embodiment of the application therefore preserves the original characteristics of the 300 Hz to 3000 Hz band as far as possible and, to reduce the impact of an abrupt drop to 0, fades the signal gradually to 0 between 3000 Hz and 8000 Hz.
Further, the frequencies from 300 Hz to 3000 Hz are modified by the first preset formula, $y = y_1 \cdot (1 - 0.125\,i)$,
where $i$ denotes the $i$-th row, $i \in (0, 72)$, $y_1$ is the frequency-domain value of the current row, and $y$ is the modified frequency-domain value;
further, the frequencies from 3000 Hz to 8000 Hz are modified by the second preset formula, $y = y_2 \cdot (1 - 0.0028\,j)$,
where $j$ denotes the $j$-th row, $j \in (50, 400)$, $y_2$ is the frequency-domain value of the current row, and $y$ is the modified frequency-domain value.
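The intent described for the telephone sound effect (keep 300 Hz to 3000 Hz, fade gradually to 0 by 8000 Hz, zero everything else) can be illustrated with a simple gain curve. The sketch deliberately does not use the coefficients of the two preset formulas, because the text does not make clear how the indices i and j map onto spectrogram rows; instead it assumes a gain of 1 in the pass band and a straight-line fade above it. The function name `telephone_gain` is hypothetical.

```python
def telephone_gain(freq_hz):
    """Illustrative gain curve for the telephone sound effect.

    Assumptions: unity gain from 300 Hz to 3000 Hz and a linear fade
    from 1 at 3000 Hz down to 0 at 8000 Hz. These stand in for the
    preset formulas and are not taken from them.
    """
    if freq_hz < 300.0 or freq_hz > 8000.0:
        return 0.0
    if freq_hz <= 3000.0:
        return 1.0
    return (8000.0 - freq_hz) / (8000.0 - 3000.0)

# Usage: sample the curve at a few frequencies.
sampled_gains = {f: telephone_gain(f) for f in (200.0, 1000.0, 5500.0, 8000.0)}
```

A gradual fade of this kind avoids the abrupt spectral edge that a hard cutoff at 3000 Hz would introduce, which is the motivation the description gives for the second formula.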
S43: if the target conversion type is the diving sound effect, performing gradual change processing on the initial spectrogram by adopting a third preset formula, so as to modify the frequency range of the initial spectrogram and obtain a target spectrogram.
Specifically, the diving sound effect is relatively deep and requires a very low frequency range, so the embodiment of the present application limits the frequency range to 1000 Hz.
Further, the third preset formula is given by
[formula image BDA0003506911290000091; not reproduced in this text]
where $i$ denotes the $i$-th row, $i \in (8, 1025)$, $y_3$ is the frequency-domain value of the current row, and $y$ is the modified frequency-domain value.
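Because the third preset formula is not available in this text, the following sketch illustrates only the stated 1000 Hz limit, using a hard cutoff. It does not model the gradual change that the formula applies across rows 8 to 1025. The row width of 44100 / 2048 Hz and the function name `limit_to_1000_hz` are assumptions.

```python
SAMPLE_RATE_HZ = 44100
WINDOW_SIZE = 2048
HZ_PER_ROW = SAMPLE_RATE_HZ / WINDOW_SIZE
CUTOFF_HZ = 1000.0  # frequency limit stated for the diving sound effect

def limit_to_1000_hz(spectrum_column):
    """Zero every row above the cutoff (hard cutoff, not the gradual formula)."""
    return [value if row * HZ_PER_ROW <= CUTOFF_HZ else 0.0
            for row, value in enumerate(spectrum_column)]

# Usage: count how many rows of a flat column survive the cutoff.
flat_column = [1.0] * (WINDOW_SIZE // 2 + 1)
limited_column = limit_to_1000_hz(flat_column)
rows_kept = sum(1 for value in limited_column if value != 0.0)
```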
In this embodiment, the modification applied to the initial spectrogram depends on the target conversion type: for the radio sound effect, the frequency range is modified according to a preset frequency band; for the telephone sound effect, it is modified by the first preset formula and the second preset formula; and for the diving sound effect, it is modified by the third preset formula. In each case the result is the target spectrogram.
S5: restoring the target spectrogram to a time-domain signal to obtain a target sound effect.
Further, step S5 includes: performing an inverse short-time Fourier transform on the target spectrogram corresponding to each frame of basic audio to restore it to a time-domain signal, until all target spectrograms have been restored to time-domain signals, thereby obtaining the target sound effect.
Specifically, in the embodiment of the present application, the first frame of basic audio is taken as input and processed through steps S3 to S5 to obtain its target sound effect; each subsequent frame of basic audio is then taken as input and steps S3 to S5 are repeated until every frame of basic audio has its corresponding target sound effect. The inverse short-time Fourier transform restores each target spectrogram to a time-domain signal, and once all target spectrograms have been restored, the target sound effect is obtained.
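The frame-by-frame loop of steps S3 to S5 can be sketched with a direct discrete Fourier transform and its inverse. This is a simplified illustration: it uses an O(N²) full-spectrum DFT in place of a fast one-sided transform, and it omits the Hamming window and any overlap-add that a practical inverse short-time Fourier transform would need. The names `dft`, `inverse_dft`, and `process_frames`, and the `modify` callback, are hypothetical.

```python
import cmath
import math

def dft(frame):
    """Direct discrete Fourier transform of one frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def inverse_dft(spectrum):
    """Inverse transform; only the real part is kept for audio output."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def process_frames(frames, modify):
    """Transform each frame, apply `modify` to its spectrum, and restore it."""
    output = []
    for frame in frames:
        output.extend(inverse_dft(modify(dft(frame))))
    return output

# Usage: with an identity modification the original samples are recovered.
frames = [[0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0],
          [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]]
restored = process_frames(frames, lambda spectrum: spectrum)
```

Processing one frame at a time in this way is what allows the conversion to proceed incrementally as audio arrives, rather than waiting for the whole recording.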
In this embodiment, the audio to be processed and a target conversion type are acquired, wherein the target conversion type comprises a radio sound effect, a telephone sound effect and a diving sound effect; framing processing is performed on the audio to be processed according to a preset sampling frequency to obtain basic audio; short-time Fourier transform processing is performed on the basic audio with a Hamming window of a preset size to obtain an initial spectrogram; the frequency range of the initial spectrogram is modified based on the target conversion type to obtain a target spectrogram; and the target spectrogram is restored to a time-domain signal to obtain a target sound effect. In the embodiment of the invention, the audio to be processed is framed and Fourier-transformed with the preset Hamming window, which converts it into a spectrogram whose frequency range can be modified accurately. The frequency range of the spectrogram is then modified according to the target conversion type, and the spectrogram is restored to a time-domain signal to obtain the target sound effect. Because the frequency range is adjusted precisely for each sound effect conversion type, the conversion of the background sound effect is more accurate and the background sound effect is more recognizable.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, or a Read-Only Memory (ROM), or it may be a Random Access Memory (RAM).
Referring to fig. 3, as an implementation of the method shown in fig. 1, the present application provides an embodiment of a background sound effect conversion apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 1, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 3, the background sound effect conversion device of the present embodiment includes: a to-be-processed audio acquisition module 61, a basic audio generation module 62, an initial spectrogram generation module 63, a target spectrogram generation module 64, and a target sound effect generation module 65, wherein:
the to-be-processed audio acquisition module 61 is configured to acquire audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect and a diving sound effect;
the basic audio generating module 62 is configured to perform framing processing on the audio to be processed according to a preset sampling frequency to obtain basic audio;
the initial spectrogram generating module 63 is configured to perform short-time Fourier transform processing on the basic audio according to a Hamming window with a preset size to obtain an initial spectrogram;
a target spectrogram generating module 64, configured to perform frequency range modification processing on the initial spectrogram based on the target conversion type to obtain a target spectrogram;
and the target sound effect generating module 65 is configured to restore the target spectrogram to a time domain signal, so as to obtain a target sound effect.
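The framing and short-time Fourier transform performed by modules 62 and 63 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `stft_hamming`, the 44100 Hz sampling frequency, the hop of 512 samples, and the use of NumPy are all assumptions. The frame size of 2048 is likewise an assumption, chosen only because it yields 1025 frequency rows, the largest row index mentioned in the text.

```python
import numpy as np

def stft_hamming(signal, frame_size=2048, hop=512):
    """Frame `signal`, apply a Hamming window, and take the real FFT per frame.

    Returns an array of shape (frame_size // 2 + 1, n_frames):
    rows are frequency bins, columns are frames.
    """
    window = np.hamming(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([
        signal[k * hop : k * hop + frame_size] * window
        for k in range(n_frames)
    ])
    return np.fft.rfft(frames, axis=1).T

sample_rate = 44100                          # assumed preset sampling frequency
t = np.arange(sample_rate) / sample_rate     # one second of samples
audio = np.sin(2 * np.pi * 1000 * t)         # 1 kHz test tone (illustrative input)
spec = stft_hamming(audio)                   # the "initial spectrogram"

# Frequency of the strongest row in the first frame.
peak_bin = int(np.argmax(np.abs(spec[:, 0])))
peak_hz = peak_bin * sample_rate / 2048
```

Row `k` of the result corresponds to the frequency `k * sample_rate / frame_size`, which is what lets the later modules address the spectrogram by frequency band.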
Further, the target spectrogram generating module 64 includes:
a radio sound effect modification unit, configured to, if the target conversion type is the radio sound effect, perform frequency range modification processing on the initial spectrogram according to a preset frequency band to obtain the target spectrogram;
a telephone sound effect modification unit, configured to, if the target conversion type is the telephone sound effect, perform frequency range modification processing on the initial spectrogram by using a first preset formula and a second preset formula to obtain the target spectrogram;
and a diving sound effect modification unit, configured to, if the target conversion type is the diving sound effect, perform gradual change processing on the initial spectrogram by using a third preset formula, so as to perform frequency range modification processing on the initial spectrogram and obtain the target spectrogram.
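The selection among the three modification units can be pictured as a dispatch on the target conversion type. The sketch below is only an illustration of that structure: the names `ConversionType`, `scale_rows`, `MODIFIERS` and `generate_target_spectrogram` are hypothetical, and the constant gains are placeholders, not the values used by the patent.

```python
from enum import Enum

class ConversionType(Enum):
    RADIO = "radio"
    TELEPHONE = "telephone"
    DIVING = "diving"

def scale_rows(spectrogram, gain_for_row):
    """Multiply every value in row i by gain_for_row(i)."""
    return [[v * gain_for_row(i) for v in row]
            for i, row in enumerate(spectrogram)]

# Placeholder modifiers (hypothetical gains, NOT the patent's values).
# Each entry stands in for one modification unit.
MODIFIERS = {
    ConversionType.RADIO: lambda spec: scale_rows(spec, lambda i: 2.0),
    ConversionType.TELEPHONE: lambda spec: scale_rows(spec, lambda i: 0.5),
    ConversionType.DIVING: lambda spec: scale_rows(spec, lambda i: 0.25),
}

def generate_target_spectrogram(initial, conversion_type):
    """Pick the modifier for the requested type and apply it."""
    try:
        modify = MODIFIERS[conversion_type]
    except KeyError:
        raise ValueError(f"unsupported conversion type: {conversion_type!r}")
    return modify(initial)

target = generate_target_spectrogram([[1.0, 2.0], [4.0, 8.0]],
                                     ConversionType.TELEPHONE)
```

A table of modifiers keeps the per-effect logic separate, so a further effect could be added without touching the dispatch function.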
Further, the radio sound effect modification unit includes:
and a radio sound effect frequency limiting subunit, configured to, if the target conversion type is the radio sound effect, set the signals with frequencies lower than 250Hz and higher than 8000Hz in the initial spectrogram to 0, set the frequency band from 50Hz to 2700Hz to twice its original value, and set the frequency band from 2700Hz to 8000Hz to one fifth of its original value, to obtain the target spectrogram.
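As a rough sketch of the radio band limits, the per-row gain can be written as a function of the row's frequency. The function names are hypothetical, and two points are assumptions: the treatment of the boundary at exactly 2700Hz, and the reading of "twice" and "one fifth" as multiplying the spectrogram values. The text gives 250Hz as the cutoff and 50Hz as the start of the boosted band; because the cutoff is checked first here, rows below 250Hz end up at 0 either way.

```python
def radio_gain(freq_hz):
    """Gain for one spectrogram row under the radio effect, as stated in the text."""
    if freq_hz < 250 or freq_hz > 8000:
        return 0.0      # outside the kept range: set to 0
    if freq_hz < 2700:
        return 2.0      # low band: twice the original
    return 0.2          # 2700Hz to 8000Hz: one fifth of the original

def bin_frequencies(sample_rate, fft_size):
    """Centre frequency of each row of a real-FFT spectrogram."""
    return [k * sample_rate / fft_size for k in range(fft_size // 2 + 1)]

def apply_radio_effect(spectrogram, sample_rate, fft_size):
    """Scale every row of `spectrogram` (a list of rows) by its radio gain."""
    freqs = bin_frequencies(sample_rate, fft_size)
    return [[v * radio_gain(f) for v in row]
            for f, row in zip(freqs, spectrogram)]

# Tiny illustrative case: 5 rows at 0, 2000, 4000, 6000 and 8000 Hz.
freqs = bin_frequencies(16000, 8)
target = apply_radio_effect([[1.0]] * 5, 16000, 8)
```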
Further, the telephone sound effect modification unit includes:
and a telephone sound effect frequency limiting subunit, configured to, if the target conversion type is the telephone sound effect, set the signals with frequencies lower than 300Hz and higher than 8000Hz in the initial spectrogram to 0, modify the frequencies from 300Hz to 3000Hz by using the first preset formula, and modify the frequencies from 3000Hz to 8000Hz by using the second preset formula, to obtain the target spectrogram.
Further, the first preset formula is $y = y_1 \cdot (1 - 0.125i)$,
where $i$ denotes the $i$-th row, $i \in (0, 72)$, $y_1$ represents the frequency-domain value of the current row, and $y$ is the modified frequency-domain value;
the second preset formula is $y = y_2 \cdot (1 - 0.0028j)$,
where $j$ denotes the $j$-th row, $j \in (50, 400)$, $y_2$ represents the frequency-domain value of the current row, and $y$ is the modified frequency-domain value.
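The two telephone formulas can be evaluated directly as printed. The sketch below does only that; the function names are hypothetical, and treating the row ranges as integers strictly inside the stated intervals is an assumption. Taken literally, the first factor reaches zero at row 8 and the second turns negative from row 358 onward. The text does not say whether the indices are absolute rows or offsets within each band, so this is an observation about the printed formulas, not a statement about the intended behaviour.

```python
def first_formula(y1, i):
    """y = y1 * (1 - 0.125 i), as printed."""
    return y1 * (1 - 0.125 * i)

def second_formula(y2, j):
    """y = y2 * (1 - 0.0028 j), as printed."""
    return y2 * (1 - 0.0028 * j)

# Assumed integer rows strictly inside the printed intervals (0, 72) and (50, 400).
first_rows = range(1, 72)
second_rows = range(51, 400)

# First row at which each scaling factor is no longer positive.
first_zero = next(i for i in first_rows if 1 - 0.125 * i <= 0)
second_zero = next(j for j in second_rows if 1 - 0.0028 * j <= 0)

example = first_formula(2.0, 4)   # 2.0 * (1 - 0.5)
```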
Further, the third predetermined formula is
Figure BDA0003506911290000121
where $i$ denotes the $i$-th row, $i \in (8, 1025)$, $y_3$ represents the frequency-domain value of the current row, and $y$ is the modified frequency-domain value.
Further, the target sound effect generation module 65 includes:
and a spectrogram conversion unit, configured to perform an inverse short-time Fourier transform on the target spectrogram corresponding to each frame of the basic audio, so as to restore that target spectrogram to a time domain signal, until all target spectrograms have been restored to time domain signals, thereby obtaining the target sound effect.
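One common way to restore a spectrogram frame by frame is weighted overlap-add, sketched below with NumPy. This is an assumed technique for illustration; the text only states that an inverse short-time Fourier transform is applied to each frame. The function names, the frame size of 64, and the hop of 16 are all hypothetical.

```python
import numpy as np

def stft(signal, frame_size, hop):
    """Hamming-windowed STFT; rows are frequency bins, columns are frames."""
    window = np.hamming(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([
        signal[k * hop : k * hop + frame_size] * window
        for k in range(n_frames)
    ])
    return np.fft.rfft(frames, axis=1).T

def istft(spec, frame_size, hop):
    """Invert each frame, then overlap-add with squared-window normalisation."""
    window = np.hamming(frame_size)
    frames = np.fft.irfft(spec.T, n=frame_size, axis=1)
    length = frame_size + hop * (frames.shape[0] - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    for k, frame in enumerate(frames):
        start = k * hop
        out[start : start + frame_size] += frame * window
        norm[start : start + frame_size] += window ** 2
    # The Hamming window never reaches zero, so norm is positive everywhere.
    return out / norm

frame_size, hop = 64, 16
signal = np.random.default_rng(0).standard_normal(frame_size + hop * 10)
restored = istft(stft(signal, frame_size, hop), frame_size, hop)
```

With an unmodified spectrogram the round trip reproduces the input up to floating-point error, which is a useful sanity check before inserting any frequency range modification between the two calls.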
In order to solve the technical problem, an embodiment of the present application further provides a computer device. Referring to fig. 4, fig. 4 is a block diagram of a basic structure of a computer device according to the present embodiment.
The computer device 7 comprises a memory 71, a processor 72, and a network interface 73, communicatively connected to each other by a system bus. Note that only a computer device 7 having the components 71-73 is shown; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device may be a desktop computer, a notebook, a palmtop computer, a cloud server, or other computing device. The computer equipment can carry out man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or voice control equipment and the like.
The memory 71 includes at least one type of readable storage medium, including flash memory, hard disks, multimedia cards, card-type memory (e.g., SD or DX memory, etc.), Random Access Memory (RAM), Static Random Access Memory (SRAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Programmable Read Only Memory (PROM), magnetic memory, magnetic disks, optical disks, etc. In some embodiments, the memory 71 may be an internal storage unit of the computer device 7, such as a hard disk or internal memory of the computer device 7. In other embodiments, the memory 71 may be an external storage device of the computer device 7, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the computer device 7. The memory 71 may also comprise both an internal storage unit of the computer device 7 and an external storage device thereof. In this embodiment, the memory 71 is generally used for storing the operating system installed on the computer device 7 and various types of application software, such as the program code of the background sound effect conversion method. Further, the memory 71 may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 72 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 72 is typically used to control the overall operation of the computer device 7. In this embodiment, the processor 72 is configured to run the program code stored in the memory 71 or to process data, for example to run the program code of the above-mentioned background sound effect conversion method, so as to implement the various embodiments of that method.
The network interface 73 may comprise a wireless network interface or a wired network interface, and the network interface 73 is typically used to establish a communication connection between the computer device 7 and other electronic devices.
The present application further provides another embodiment, namely a computer-readable storage medium storing a computer program, wherein the computer program can be executed by at least one processor, so that the at least one processor executes the steps of the background sound effect conversion method described above.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method of the embodiments of the present application.
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
It is to be understood that the embodiments described above are only some, not all, of the embodiments of the present application, and that the accompanying drawings show preferred embodiments of the present application without limiting its scope. The present application can be embodied in many different forms; these embodiments are provided so that the disclosure of the present application will be understood thoroughly. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements for some of their technical features. All equivalent structures made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise fall within the protection scope of the present application.

Claims (10)

1. A method for converting background sound effect, characterized by comprising the following steps:
acquiring audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect and a diving sound effect;
performing framing processing on the audio to be processed according to a preset sampling frequency to obtain a basic audio;
according to a Hamming window with a preset size, carrying out short-time Fourier transform processing on the basic audio to obtain an initial spectrogram;
modifying the frequency range of the initial spectrogram based on the target conversion type to obtain a target spectrogram;
and restoring the target spectrogram into a time domain signal to obtain a target sound effect.
2. The method for converting a background sound effect according to claim 1, wherein the step of performing a frequency range modification process on the initial spectrogram based on the target conversion type to obtain a target spectrogram comprises:
if the target conversion type is the radio sound effect, performing frequency range modification processing on the initial spectrogram according to a preset frequency band to obtain the target spectrogram;
if the target conversion type is the telephone sound effect, performing frequency range modification processing on the initial spectrogram by adopting a first preset formula and a second preset formula to obtain a target spectrogram;
if the target conversion type is the diving sound effect, performing gradual change processing on the initial spectrogram by adopting a third preset formula so as to perform frequency range modification processing on the initial spectrogram to obtain the target spectrogram.
3. The method for converting background sound effect according to claim 2, wherein if the target conversion type is the radio sound effect, performing frequency range modification processing on the initial spectrogram according to a preset frequency band to obtain the target spectrogram, comprises:
if the target conversion type is the radio sound effect, setting the signals with frequencies lower than 250Hz and higher than 8000Hz in the initial spectrogram to 0, setting the frequency band from 50Hz to 2700Hz to twice its original value, and setting the frequency band from 2700Hz to 8000Hz to one fifth of its original value, to obtain the target spectrogram.
4. The method for converting a background sound effect according to claim 2, wherein if the target conversion type is the telephone sound effect, performing frequency range modification processing on the initial spectrogram by using a first preset formula and a second preset formula to obtain the target spectrogram, comprising:
and if the target conversion type is the telephone sound effect, setting the signals with the frequencies lower than 300Hz and higher than 8000Hz in the initial spectrogram as 0, carrying out frequency modification on the frequencies from 300Hz to 3000Hz by adopting the first preset formula, and carrying out frequency modification on the frequencies from 3000Hz to 8000Hz by adopting the second preset formula to obtain the target spectrogram.
5. The method for converting background sound effect according to claim 2, wherein the first preset formula is $y = y_1 \cdot (1 - 0.125i)$,
where $i$ denotes the $i$-th row, $i \in (0, 72)$, $y_1$ represents the frequency-domain value of the current row, and $y$ is the modified frequency-domain value;
the second preset formula is $y = y_2 \cdot (1 - 0.0028j)$,
where $j$ denotes the $j$-th row, $j \in (50, 400)$, $y_2$ represents the frequency-domain value of the current row, and $y$ is the modified frequency-domain value.
6. The method for converting background sound effect according to claim 2, wherein the third predetermined formula is
Figure FDA0003506911280000021
where $i$ denotes the $i$-th row, $i \in (8, 1025)$, $y_3$ represents the frequency-domain value of the current row, and $y$ is the modified frequency-domain value.
7. The method for converting a background sound effect according to any one of claims 1 to 6, wherein the step of restoring the target spectrogram into a time domain signal to obtain a target sound effect comprises:
performing an inverse short-time Fourier transform on the target spectrogram corresponding to each frame of the basic audio, so as to restore that target spectrogram to a time domain signal, until all target spectrograms have been restored to time domain signals, to obtain the target sound effect.
8. A background sound effect conversion device is characterized by comprising:
a to-be-processed audio acquisition module, configured to acquire audio to be processed and a target conversion type, wherein the target conversion type comprises a radio sound effect, a telephone sound effect and a diving sound effect;
the basic audio generating module is used for performing framing processing on the audio to be processed according to a preset sampling frequency to obtain basic audio;
the initial spectrogram generating module is used for performing short-time Fourier transform processing on the basic audio according to a Hamming window with a preset size to obtain an initial spectrogram;
a target spectrogram generating module, configured to perform frequency range modification processing on the initial spectrogram based on the target conversion type to obtain a target spectrogram;
and the target sound effect generation module is used for restoring the target spectrogram to a time domain signal to obtain a target sound effect.
9. A computer device characterized by comprising a memory in which a computer program is stored and a processor that implements the method of converting the background sound effect according to any one of claims 1 to 7 when the processor executes the computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored thereon, which when executed by a processor implements the method for converting background sound effects of any one of claims 1 to 7.
CN202210140971.0A 2022-02-16 2022-02-16 Background sound effect conversion method and device, computer equipment and storage medium Active CN114449339B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210140971.0A CN114449339B (en) 2022-02-16 2022-02-16 Background sound effect conversion method and device, computer equipment and storage medium


Publications (2)

Publication Number Publication Date
CN114449339A true CN114449339A (en) 2022-05-06
CN114449339B CN114449339B (en) 2024-04-12

Family

ID=81374456

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210140971.0A Active CN114449339B (en) 2022-02-16 2022-02-16 Background sound effect conversion method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114449339B (en)


Citations (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9318086B1 (en) * 2012-09-07 2016-04-19 Jerry A. Miller Musical instrument and vocal effects
CN105679331A (en) * 2015-12-30 2016-06-15 广东工业大学 Sound-breath signal separating and synthesizing method and system
WO2017054507A1 (en) * 2015-09-29 2017-04-06 广州酷狗计算机科技有限公司 Sound effect simulation method, apparatus and system
CN107333076A (en) * 2017-06-26 2017-11-07 青岛海信电器股份有限公司 The method of adjustment of television set and its audio signal intermediate frequency point data, device
CN107481727A (en) * 2017-06-23 2017-12-15 罗时志 A kind of acoustic signal processing method and system based on the control of electric sound keynote
WO2018077364A1 (en) * 2016-10-28 2018-05-03 Transformizer Aps Method for generating artificial sound effects based on existing sound clips
CN108269579A (en) * 2018-01-18 2018-07-10 厦门美图之家科技有限公司 Voice data processing method, device, electronic equipment and readable storage medium storing program for executing
CN108281152A (en) * 2018-01-18 2018-07-13 腾讯音乐娱乐科技(深圳)有限公司 Audio-frequency processing method, device and storage medium
CN108305603A (en) * 2017-10-20 2018-07-20 腾讯科技(深圳)有限公司 Sound effect treatment method and its equipment, storage medium, server, sound terminal
CN109346111A (en) * 2018-10-11 2019-02-15 广州酷狗计算机科技有限公司 Data processing method, device, terminal and storage medium
CN109410973A (en) * 2018-11-07 2019-03-01 北京达佳互联信息技术有限公司 Voice change process method, apparatus and computer readable storage medium
CN109545174A (en) * 2018-12-26 2019-03-29 广州华多网络科技有限公司 A kind of audio-frequency processing method, device and equipment
CN110580914A (en) * 2019-07-24 2019-12-17 安克创新科技股份有限公司 Audio processing method and equipment and device with storage function
CN111326132A (en) * 2020-01-22 2020-06-23 北京达佳互联信息技术有限公司 Audio processing method and device, storage medium and electronic equipment
CN111429942A (en) * 2020-03-19 2020-07-17 北京字节跳动网络技术有限公司 Audio data processing method and device, electronic equipment and storage medium
CN111739544A (en) * 2019-03-25 2020-10-02 Oppo广东移动通信有限公司 Voice processing method and device, electronic equipment and storage medium
CN111796790A (en) * 2019-04-09 2020-10-20 深圳市冠旭电子股份有限公司 Sound effect adjusting method and device, readable storage medium and terminal equipment
CN112102846A (en) * 2020-09-04 2020-12-18 腾讯科技(深圳)有限公司 Audio processing method and device, electronic equipment and storage medium
CN112435643A (en) * 2020-11-20 2021-03-02 腾讯音乐娱乐科技(深圳)有限公司 Method, device, equipment and storage medium for generating electronic style song audio
CN112511123A (en) * 2020-11-30 2021-03-16 广州朗国电子科技有限公司 Sound effect customizing method and device, electronic equipment and storage medium
CN112599142A (en) * 2020-12-14 2021-04-02 北京百瑞互联技术有限公司 Bluetooth transmission method, equipment and storage medium for adjusting background sound and human voice
CN113178183A (en) * 2021-04-30 2021-07-27 杭州网易云音乐科技有限公司 Sound effect processing method and device, storage medium and computing equipment
CN113223542A (en) * 2021-04-26 2021-08-06 北京搜狗科技发展有限公司 Audio conversion method and device, storage medium and electronic equipment
CN113241082A (en) * 2021-04-22 2021-08-10 杭州朗和科技有限公司 Sound changing method, device, equipment and medium
CN113539299A (en) * 2021-01-12 2021-10-22 腾讯科技(深圳)有限公司 Multimedia information processing method and device, electronic equipment and storage medium
CN113689837A (en) * 2021-08-24 2021-11-23 北京百度网讯科技有限公司 Audio data processing method, device, equipment and storage medium
CN113924620A (en) * 2019-06-05 2022-01-11 哈曼国际工业有限公司 Sound modification based on frequency composition


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
范勇冠: "Electronic music tagging algorithm based on Fourier transform and cepstral coefficients", 现代电子技术, no. 13 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117761393A (en) * 2024-02-22 2024-03-26 南京派格测控科技有限公司 Time domain signal acquisition method and device
CN117761393B (en) * 2024-02-22 2024-05-07 南京派格测控科技有限公司 Time domain signal acquisition method and device

Also Published As

Publication number Publication date
CN114449339B (en) 2024-04-12


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant