CN107481735A - Method for converting audio sound production, server and computer readable storage medium - Google Patents
- Publication number
- CN107481735A (application number CN201710752085.2A)
- Authority
- CN
- China
- Prior art keywords
- converted
- frequency spectrum
- voice data
- spectrum information
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G10L2021/0135—Voice conversion or morphing
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Electrophonic Musical Instruments (AREA)
Abstract
The invention discloses a method for converting audio vocalization, comprising the following steps: acquiring audio data to be converted and a conversion target object for the audio data to be converted, parsing the audio data to be converted to obtain a parsing result, and determining audio track information of the audio data to be converted according to the parsing result, wherein the audio track information comprises at least the timbre of the audio data to be converted; and determining the acoustic spectrum information of the conversion target object in a preset acoustic spectrum information database, converting the audio track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object, and determining the converted audio data. The invention also discloses a device for converting audio vocalization and a computer-readable storage medium.
Description
Technical field
The present invention relates to audio processing technology, and in particular to a method for converting audio vocalization, a server, and a computer-readable storage medium.
Background
In existing music apps, although the functions provided are increasingly rich, they mainly concern aspects other than music playback, such as social functions and music consumption functions. In the traditional field of music playback, the functions a music app can provide are still mainly tuning functions, such as adjusting the tune or the rhythm. Clearly, the main purpose of such functions is to give the user a better listening experience, and using these tuning functions also requires the user to have some musical knowledge and grounding, so the audience for the tuning functions that existing music apps provide is small. Overall, therefore, the functions provided by existing music apps are still somewhat lacking in entertainment value. In particular, for the most basic function of a music app, music playback, the functions provided are not entertaining enough.
In daily life, each user usually has one or several favorite singers. A user not only likes the songs these singers perform themselves, but may also wish that a favorite singer could sing other songs the user likes. However, the prior art currently offers no method for changing the singer of a song, so the functions provided by existing music apps cannot meet this user need.
Summary of the invention
In view of this, embodiments of the present invention aim to provide a method for converting audio vocalization, a server, and a computer-readable storage medium that can change the original singer of a selected audio file to a singer the user likes, thereby making the app more engaging and improving the user experience.
To achieve the above object, an embodiment of the present invention provides a method for converting audio vocalization:
Acquiring audio data to be converted and a conversion target object for the audio data to be converted, parsing the audio data to be converted to obtain a parsing result, and determining audio track information of the audio data to be converted according to the parsing result, wherein the audio track information comprises at least the timbre of the audio data to be converted;
Determining the acoustic spectrum information of the conversion target object in a preset acoustic spectrum information database, converting the audio track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object, and determining the converted audio data.
Wherein, before the above acquiring of the audio data to be converted and the conversion target object, the method further includes:
Acquiring the acoustic spectrum information of at least one conversion target object, associating the acoustic spectrum information of the conversion target object with the identification information of the conversion target object, and generating the acoustic spectrum information database.
Wherein, the above acquiring of the acoustic spectrum information of at least one object includes:
Collecting the sound of an object, performing analog-to-digital conversion on the collected sound to obtain digital audio data of the object, and analyzing the object according to the digital audio data to obtain the acoustic spectrum information of the object, wherein the acoustic spectrum information of the object comprises at least the syllable spectrum information of the object's pronunciation.
Wherein, the above converting of the audio track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object includes:
Tuning the timbre of the syllables in the audio track information of the audio data to be converted according to the timbre among the audio features in the acoustic spectrum information of the conversion target object.
An embodiment of the present invention provides a device for converting audio vocalization, the device including:
A parsing module, configured to acquire audio data to be converted and a conversion target object for the audio data to be converted, parse the audio data to be converted to obtain a parsing result, and determine audio track information of the audio data to be converted according to the parsing result, wherein the audio track information comprises at least the timbre of the audio data to be converted;
A conversion module, configured to determine the acoustic spectrum information of the conversion target object in a preset acoustic spectrum information database, convert the audio track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object, and determine the converted audio data.
Wherein, the above device further includes:
A generation module, configured to acquire the acoustic spectrum information of at least one conversion target object, associate the acoustic spectrum information of the conversion target object with the identification information of the conversion target object, and generate the acoustic spectrum information database.
Wherein, the above generation module is specifically configured to:
Collect the sound of an object, perform analog-to-digital conversion on the collected sound to obtain digital audio data of the object, and analyze the object according to the digital audio data to obtain the acoustic spectrum information of the object, wherein the acoustic spectrum information of the object comprises at least the syllable spectrum information of the object's pronunciation.
Wherein, the above conversion module is specifically configured to:
Tune the timbre of the syllables in the audio track information of the audio data to be converted according to the timbre among the audio features in the acoustic spectrum information of the conversion target object.
An embodiment of the present invention provides a server, including a processor and a memory for storing a computer program capable of running on the processor,
Wherein, the processor is configured to perform the following when running the computer program:
Acquiring audio data to be converted and a conversion target object for the audio data to be converted, parsing the audio data to be converted to obtain a parsing result, and determining audio track information of the audio data to be converted according to the parsing result, wherein the audio track information comprises at least the timbre of the audio data to be converted; determining the acoustic spectrum information of the conversion target object in a preset acoustic spectrum information database, converting the audio track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object, and determining the converted audio data.
An embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements:
Acquiring audio data to be converted and a conversion target object for the audio data to be converted, parsing the audio data to be converted to obtain a parsing result, and determining audio track information of the audio data to be converted according to the parsing result, wherein the audio track information comprises at least the timbre of the audio data to be converted; determining the acoustic spectrum information of the conversion target object in a preset acoustic spectrum information database, converting the audio track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object, and determining the converted audio data.
With the method for converting audio vocalization, the server, and the computer-readable storage medium provided by the embodiments of the present invention, audio data to be converted and a conversion target object for the audio data to be converted are acquired; the audio data to be converted is parsed to obtain a parsing result; audio track information of the audio data to be converted is determined according to the parsing result, wherein the audio track information comprises at least the timbre of the audio data to be converted; the acoustic spectrum information of the conversion target object is determined in a preset acoustic spectrum information database; the audio track information of the audio data to be converted is converted according to the acoustic spectrum information of the conversion target object; and the converted audio data is determined. In this way, the selected audio data is parsed to obtain its audio track information, and that audio track information is converted according to the acoustic spectrum information of the chosen conversion object, yielding audio data that has the audio features of the conversion object. This makes the music app more entertaining and gives the user a better experience.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the method for converting audio vocalization according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the device for converting audio vocalization according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of the first embodiment of the present invention.
Detailed description
To provide a more thorough understanding of the features and technical content of the embodiments of the present invention, the implementation of the embodiments is described in detail below.
Fig. 1 is a schematic flowchart of the method for converting audio vocalization according to an embodiment of the present invention. As shown in Fig. 1, the audio conversion method provided by the embodiment of the present invention includes the following steps:
Step 101: Acquire audio data to be converted and a conversion target object for the audio data to be converted, parse the audio data to be converted to obtain a parsing result, and determine audio track information of the audio data to be converted according to the parsing result;
Wherein, the audio track information comprises at least the timbre of the audio data to be converted.
Step 102: Determine the acoustic spectrum information of the conversion target object in a preset acoustic spectrum information database, convert the audio track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object, and determine the converted audio data.
Wherein, the timbre of the syllables in the audio track information of the audio data to be converted is tuned according to the timbre among the audio features in the acoustic spectrum information of the conversion target object.
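Steps 101 and 102 can be pictured with a minimal Python sketch. It is not the implementation disclosed here: the `Syllable` record, the `SPECTRUM_DB` dictionary, the function names, and the use of a plain string as a stand-in for timbre are all assumptions made for illustration, and real parsing of an audio file is replaced by a pre-built list of syllables.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Syllable:
    """Audio features of one syllable (hypothetical representation)."""
    loudness: float
    pitch_hz: float
    timbre: str  # placeholder label standing in for real timbre data


# Hypothetical preset database: target identifier -> acoustic spectrum info.
SPECTRUM_DB: Dict[str, Dict[str, str]] = {
    "singer_a": {"timbre": "timbre_of_singer_a"},
}


def parse_track(audio: List[Syllable]) -> List[Syllable]:
    """Stand-in for step 101: the input is assumed to be already split
    into syllables, so parsing only copies the list."""
    return list(audio)


def convert(audio: List[Syllable], target_id: str) -> List[Syllable]:
    """Stand-in for step 102: look up the target and swap in its timbre,
    leaving loudness and pitch untouched."""
    target = SPECTRUM_DB[target_id]  # KeyError if the target is unknown
    track = parse_track(audio)
    return [replace(s, timbre=target["timbre"]) for s in track]


song = [Syllable(0.8, 220.0, "original"), Syllable(0.6, 247.0, "original")]
converted = convert(song, "singer_a")
```

In this sketch only the timbre field changes, which mirrors the statement that the timbre of each syllable is tuned toward the target while the rest of the track information is kept.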
In practical applications, converting the audio track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object and determining the converted audio data can also be implemented as follows:
Acquiring audio data to be converted and a conversion target object for the audio data to be converted, parsing the audio data to be converted to obtain a parsing result, and determining, according to the parsing result, each piece of text information in the audio data to be converted and the pronunciation syllable corresponding to the text information;
Determining, in the preset acoustic spectrum information database, the conversion target object's pronunciation information for the text information and the spectrum information of that pronunciation information, arranging the spectrum information of the determined pronunciation information according to the character order of the text information and performing audio conversion, and determining the converted audio data.
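The text-based alternative above amounts to a lookup followed by an ordered arrangement. The sketch below illustrates that idea only; the `PRONUNCIATION_DB` layout, the syllable keys, and the two-element lists standing in for spectrum information are hypothetical, and the final audio conversion step is left out.

```python
from typing import Dict, List, Tuple

# Hypothetical database: target id -> syllable -> spectrum information.
# The short float lists are placeholders, not real spectra.
PRONUNCIATION_DB: Dict[str, Dict[str, List[float]]] = {
    "singer_a": {"ni": [0.1, 0.9], "hao": [0.7, 0.3]},
}


def assemble(text_syllables: List[str],
             target_id: str) -> List[Tuple[str, List[float]]]:
    """Arrange the target's syllable spectra in the order of the text."""
    voice = PRONUNCIATION_DB[target_id]
    missing = [s for s in text_syllables if s not in voice]
    if missing:
        # The target has never pronounced these syllables in the database.
        raise KeyError(f"no pronunciation for: {missing}")
    return [(s, voice[s]) for s in text_syllables]


sequence = assemble(["ni", "hao", "ni"], "singer_a")
```

A syllable that the target has no entry for cannot be arranged, which is why the sketch fails loudly instead of skipping it.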
Before step 101, the audio conversion method provided by the embodiment of the present invention further includes the following step:
Acquiring the acoustic spectrum information of at least one conversion target object, associating the acoustic spectrum information of the conversion target object with the identification information of the conversion target object, and generating the acoustic spectrum information database;
Wherein, the sound of an object is collected, analog-to-digital conversion is performed on the collected sound to obtain digital audio data of the object, and the object is analyzed according to the digital audio data to obtain the acoustic spectrum information of the object, wherein the acoustic spectrum information of the object comprises at least the syllable spectrum information of the object's pronunciation.
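As a rough illustration of building such a database, the sketch below digitises a sound, computes a magnitude spectrum, and stores it under the object's identifier. Everything concrete in it is an assumption: the recordings are synthetic sine tones rather than captured voices, the spectrum is a naive discrete Fourier transform, and the names `sample_tone`, `magnitude_spectrum`, and `build_database` are hypothetical.

```python
import cmath
import math
from typing import Dict, List


def sample_tone(freq_hz: float, rate_hz: int, n: int) -> List[float]:
    """Simulate digitised sound: n samples of a pure sine tone."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) for k in range(n)]


def magnitude_spectrum(samples: List[float]) -> List[float]:
    """Naive O(n^2) DFT magnitudes for bins 0 .. n/2."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * b * k / n)
                for k, x in enumerate(samples)))
        for b in range(n // 2 + 1)
    ]


def build_database(recordings: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Associate each object identifier with its spectrum."""
    return {obj_id: magnitude_spectrum(s) for obj_id, s in recordings.items()}


RATE, N = 800, 80  # bin width is RATE / N = 10 Hz
db = build_database({
    "singer_a": sample_tone(100.0, RATE, N),
    "singer_b": sample_tone(200.0, RATE, N),
})
# Index of the strongest bin for each stored object.
peak_bin = {k: max(range(len(v)), key=v.__getitem__) for k, v in db.items()}
```

With a 10 Hz bin width, the 100 Hz and 200 Hz tones peak in bins 10 and 20, so the two identifiers map to distinguishable spectra.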
Fig. 2 is a schematic structural diagram of the device for converting audio vocalization according to an embodiment of the present invention. The audio conversion device includes:
A parsing module 201, configured to acquire audio data to be converted and a conversion target object for the audio data to be converted, parse the audio data to be converted to obtain a parsing result, and determine audio track information of the audio data to be converted according to the parsing result, wherein the audio track information comprises at least the timbre of the audio data to be converted;
A conversion module 202, configured to determine the acoustic spectrum information of the conversion target object in a preset acoustic spectrum information database, convert the audio track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object, and determine the converted audio data.
Wherein, the above parsing module 201 is specifically configured to:
After parsing the audio data to be converted, determine the audio features of at least one syllable of the audio data to be converted, wherein the audio features include the loudness, pitch, and timbre of the syllable;
Synthesize the determined audio features of the syllables of the audio data to be converted to obtain the audio track information of the audio data to be converted.
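The three audio features named above can be estimated in many ways, and the text does not say which one is used. The following sketch picks common textbook estimators as assumptions: root-mean-square amplitude for loudness, the autocorrelation peak for pitch, and the relative strength of the first few harmonics as a crude stand-in for timbre. The function names and the synthetic test signal are hypothetical.

```python
import cmath
import math
from typing import Dict, List


def loudness_rms(x: List[float]) -> float:
    """Loudness proxy: root-mean-square amplitude."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def pitch_autocorr(x: List[float], rate: int, fmin: float, fmax: float) -> float:
    """Pitch proxy: lag with the largest autocorrelation in [fmin, fmax]."""
    lo, hi = int(rate / fmax), int(rate / fmin)

    def r(lag: int) -> float:
        return sum(x[k] * x[k + lag] for k in range(len(x) - lag))

    return rate / max(range(lo, hi + 1), key=r)


def harmonic_profile(x: List[float], rate: int, f0: float,
                     count: int = 3) -> List[float]:
    """Timbre proxy: magnitudes at f0, 2*f0, ... relative to f0."""
    mags = [
        abs(sum(v * cmath.exp(-2j * math.pi * h * f0 * k / rate)
                for k, v in enumerate(x)))
        for h in range(1, count + 1)
    ]
    return [m / mags[0] for m in mags]


RATE = 8000
# Synthetic "syllable": 100 Hz fundamental plus a half-strength 2nd harmonic.
syllable = [math.sin(2 * math.pi * 100 * k / RATE)
            + 0.5 * math.sin(2 * math.pi * 200 * k / RATE)
            for k in range(800)]

pitch = pitch_autocorr(syllable, RATE, fmin=50.0, fmax=400.0)
features: Dict[str, object] = {
    "loudness": loudness_rms(syllable),
    "pitch_hz": pitch,
    "timbre": harmonic_profile(syllable, RATE, pitch),
}
```

Collecting one such feature record per syllable and concatenating the records in time order would be one possible reading of "synthesizing" the features into track information.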
Wherein, the above device further includes:
A generation module 203, configured to acquire the acoustic spectrum information of at least one conversion target object, associate the acoustic spectrum information of the conversion target object with the identification information of the conversion target object, and generate the acoustic spectrum information database.
Wherein, the above generation module 203 is specifically configured to:
Collect the sound of an object, perform analog-to-digital conversion on the collected sound to obtain digital audio data of the object, and analyze the object according to the digital audio data to obtain the acoustic spectrum information of the object, wherein the acoustic spectrum information of the object comprises at least the syllable spectrum information of the object's pronunciation.
Wherein, the above conversion module 202 is specifically configured to:
Tune the timbre of the syllables in the audio track information of the audio data to be converted according to the timbre among the audio features in the acoustic spectrum information of the conversion target object.
An embodiment of the present invention provides a server, including a processor and a memory for storing a computer program capable of running on the processor,
Wherein, the processor is configured to perform the following when running the computer program:
Acquiring audio data to be converted and a conversion target object for the audio data to be converted, parsing the audio data to be converted to obtain a parsing result, and determining audio track information of the audio data to be converted according to the parsing result, wherein the audio track information comprises at least the timbre of the audio data to be converted; determining the acoustic spectrum information of the conversion target object in a preset acoustic spectrum information database, converting the audio track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object, and determining the converted audio data.
Wherein, the above determining of the audio track information of the audio data to be converted according to the parsing result includes:
After parsing the audio data to be converted, determining the audio features of at least one syllable of the audio data to be converted, wherein the audio features include the loudness, pitch, and timbre of the syllable;
Synthesizing the determined audio features of the syllables of the audio data to be converted to obtain the audio track information of the audio data to be converted.
Wherein, before the above acquiring of the audio data to be converted and the conversion object, the method further includes:
Acquiring the acoustic spectrum information of at least one conversion target object, associating the acoustic spectrum information of the conversion target object with the identification information of the conversion target object, and generating the acoustic spectrum information database.
Wherein, the above acquiring of the acoustic spectrum information of at least one object includes:
Collecting the sound of an object, performing analog-to-digital conversion on the collected sound to obtain digital audio data of the object, and analyzing the object according to the digital audio data to obtain the acoustic spectrum information of the object, wherein the acoustic spectrum information of the object comprises at least the syllable spectrum information of the object's pronunciation.
Wherein, the above converting of the audio track information of the audio data to be converted according to the acoustic spectrum information of the conversion object includes:
Tuning the timbre of the syllables in the audio track information of the audio data to be converted according to the timbre among the audio features in the acoustic spectrum information of the conversion target object.
An embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements:
Acquiring audio data to be converted and a conversion target object for the audio data to be converted, parsing the audio data to be converted to obtain a parsing result, and determining audio track information of the audio data to be converted according to the parsing result, wherein the audio track information comprises at least the timbre of the audio data to be converted; determining the acoustic spectrum information of the conversion target object in a preset acoustic spectrum information database, converting the audio track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object, and determining the converted audio data.
Wherein, the above determining of the audio track information of the audio data to be converted according to the parsing result includes:
After parsing the audio data to be converted, determining the audio features of at least one syllable of the audio data to be converted, wherein the audio features include the loudness, pitch, and timbre of the syllable;
Synthesizing the determined audio features of the syllables of the audio data to be converted to obtain the audio track information of the audio data to be converted.
Wherein, before the above acquiring of the audio data to be converted and the conversion object, the method further includes:
Acquiring the acoustic spectrum information of at least one conversion target object, associating the acoustic spectrum information of the conversion target object with the identification information of the conversion target object, and generating the acoustic spectrum information database.
Wherein, the above acquiring of the acoustic spectrum information of at least one object includes:
Collecting the sound of an object, performing analog-to-digital conversion on the collected sound to obtain digital audio data of the object, and analyzing the object according to the digital audio data to obtain the acoustic spectrum information of the object, wherein the acoustic spectrum information of the object comprises at least the syllable spectrum information of the object's pronunciation.
Wherein, the above converting of the audio track information of the audio data to be converted according to the acoustic spectrum information of the conversion object includes:
Tuning the timbre of the syllables in the audio track information of the audio data to be converted according to the timbre among the audio features in the acoustic spectrum information of the conversion target object.
The above generation module 203 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof. The non-volatile memory may be a read-only memory (ROM, Read Only Memory), a programmable read-only memory (PROM, Programmable Read-Only Memory), an erasable programmable read-only memory (EPROM, Erasable Programmable Read-Only Memory), an electrically erasable programmable read-only memory (EEPROM, Electrically Erasable Programmable Read-Only Memory), a ferromagnetic random access memory (FRAM, Ferromagnetic Random Access Memory), a flash memory (Flash Memory), a magnetic surface memory, an optical disc, or a compact disc read-only memory (CD-ROM, Compact Disc Read-Only Memory); the magnetic surface memory may be a magnetic disk memory or a magnetic tape memory. The volatile memory may be a random access memory (RAM, Random Access Memory), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM, Static Random Access Memory), synchronous static random access memory (SSRAM, Synchronous Static Random Access Memory), dynamic random access memory (DRAM, Dynamic Random Access Memory), synchronous dynamic random access memory (SDRAM, Synchronous Dynamic Random Access Memory), double data rate synchronous dynamic random access memory (DDRSDRAM, Double Data Rate Synchronous Dynamic Random Access Memory), enhanced synchronous dynamic random access memory (ESDRAM, Enhanced Synchronous Dynamic Random Access Memory), SyncLink dynamic random access memory (SLDRAM, SyncLink Dynamic Random Access Memory), and direct Rambus random access memory (DRRAM, Direct Rambus Random Access Memory). The generation module 203 described in the embodiments of the present invention is intended to include, without being limited to, these and any other suitable types of memory.
In an exemplary embodiment, the server may be implemented by one or more application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), general-purpose processors, controllers, microcontrollers (MCU, Micro Controller Unit), microprocessors (Microprocessor), or other electronic components, for performing the foregoing method.
The method for converting audio sound production of the embodiments of the present invention is further described below, taking a music app that changes the original singer of a song as an example.
Embodiment one
The first embodiment of the present invention provides a specific implementation of the method for converting audio sound production. As shown in Fig. 3, the method comprises the following steps:
Step 301: Collect acoustic spectrum information and establish an acoustic spectrum information library;
In practice, sound is a wave with a certain oscillation frequency, and a sound wave has physical parameters or characteristics such as oscillation frequency, amplitude, and waveform. It is these different parameters and characteristics that give sound its variety of auditory effects. Classified by the sound characteristics of various musical instruments, sound has four different forms of expression: pitch, volume, timbre, and figure. It is these forms that determine the characteristics of the sound of each instrument. Pitch is the form related to the oscillation frequency of the sound wave and rises with frequency: the higher the frequency, the higher the pitch, and the lower the frequency, the lower the pitch. Volume is the form related to the oscillation amplitude of the sound wave and grows with amplitude: the larger the amplitude, the louder the sound, and the smaller the amplitude, the quieter the sound. In terms of the auditory effect we perceive directly, a high-pitched sound is sharp and thin, while a low-pitched sound is deep and full.
Timbre refers to the perceptual quality of a sound, that is, the auditory impression a listener gets on hearing it; the voices of different people are distinguished by timbre. For example, Li Guyi and Song Zuying are both sopranos, yet even when they sing the same song, listeners can tell them apart as soon as they hear them. This is the effect of timbre. Timbre is determined by the waveform of the sound wave described above. The standard waveform is a sine wave; for example, the alternating current we use every day has a standard sinusoidal waveform. The human voice, the sounds of musical instruments, and the many sounds in nature, however, usually have more complex waveforms, and it is these differently shaped waveforms that determine the timbres of different sounds. Besides being represented by a waveform (the time-domain representation of the sound), the timbre of a sound can also be represented by a sound spectrum (the frequency-domain representation of the sound). Applying a Fourier transform to a short segment of the sound's waveform yields the sound spectrum corresponding to that segment.
Sounds of the same timbre may have many different waveforms, but their spectra are usually the same. The sound spectrum is therefore generally used as the main basis for distinguishing the timbres of different sounds.
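The point that different waveforms can share one spectrum can be checked numerically. The following is a minimal sketch, not the patent's implementation: it assumes a plain discrete Fourier transform written with the Python standard library, and the helper names `dft`, `amplitude_spectrum`, and `two_partial_wave` are hypothetical. Two test signals contain the same two partials, but the second partial is phase-shifted in one of them, so the waveforms differ while the amplitude spectra match.

```python
import cmath
import math


def dft(samples):
    """Plain O(n^2) discrete Fourier transform of a real-valued sequence."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def amplitude_spectrum(samples):
    """Magnitude of each DFT bin from 0 up to n/2, divided by n."""
    n = len(samples)
    return [abs(c) / n for c in dft(samples)[: n // 2 + 1]]


def two_partial_wave(n, phase):
    """A partial at bin 2 plus a weaker partial at bin 4, shifted by `phase`."""
    return [
        math.sin(2 * math.pi * 2 * t / n)
        + 0.5 * math.sin(2 * math.pi * 4 * t / n + phase)
        for t in range(n)
    ]


N = 64
wave_a = two_partial_wave(N, 0.0)
wave_b = two_partial_wave(N, math.pi / 2)  # same partials, different phase

spec_a = amplitude_spectrum(wave_a)
spec_b = amplitude_spectrum(wave_b)

# The waveforms differ noticeably sample by sample ...
max_wave_diff = max(abs(a - b) for a, b in zip(wave_a, wave_b))
# ... while the amplitude spectra agree up to rounding error.
max_spec_diff = max(abs(a - b) for a, b in zip(spec_a, spec_b))
print(max_wave_diff, max_spec_diff)
```

The amplitude spectrum discards phase, which is why it is a more stable description of timbre than the raw waveform in this toy case.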
In the embodiments of the present invention, to imitate the voices of different people, the sound of the person to be imitated must be collected in advance, and that person's acoustic spectrum information extracted from the collected sound data. Specifically, the music app may collect singers' voices in advance and extract those singers' acoustic spectrum information from the collected audio. Alternatively, the current user may record their own voice using a sound input device of the terminal, such as a microphone, and upload it to the server through the music app so that the server extracts the user's acoustic spectrum information.
In practice, only 20 basic acoustic spectra need to be collected. These 20 basic acoustic spectra can be combined into more than 400 acoustic spectrum combinations, and the user's voice can be simulated through these combinations.
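The text does not say how the 20 basic spectra are combined. One reading that is consistent with the figure of roughly 400, offered here only as an assumption, is that the combinations are ordered pairs of basic spectra (20 × 20 = 400), with the 20 single spectra on top bringing the total above 400. The labels `spectrum_1` to `spectrum_20` are placeholders.

```python
from itertools import product

# Hypothetical labels for the 20 basic acoustic spectra.
basic_spectra = [f"spectrum_{i}" for i in range(1, 21)]

# Assumption: a "combination" is an ordered pair of basic spectra.
ordered_pairs = list(product(basic_spectra, repeat=2))
pair_count = len(ordered_pairs)  # 20 * 20

# Assumption: the single spectra also count as usable units.
total_units = pair_count + len(basic_spectra)

print(pair_count, total_units)
```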
After the acoustic spectrum information of the user or of other singers has been collected, it can be stored in the acoustic spectrum information library of the server in association with the name of the user or singer.
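Storing spectrum information under a name can be sketched with a small in-memory structure. This is only an illustration under stated assumptions: the class `SpectrumLibrary`, its methods, and the name `singer_a` are hypothetical, each spectrum is taken to be a list of amplitudes, and a real server would presumably use persistent storage.

```python
from dataclasses import dataclass, field


@dataclass
class SpectrumLibrary:
    """Maps a user or singer name to that person's collected spectra."""

    entries: dict = field(default_factory=dict)

    def save(self, name, spectra):
        # Copy the data so later changes by the caller do not alter the library.
        self.entries[name] = [list(s) for s in spectra]

    def lookup(self, name):
        # Returns None when no spectra were collected for this name.
        return self.entries.get(name)


library = SpectrumLibrary()
library.save("singer_a", [[0.9, 0.4, 0.1], [0.8, 0.5, 0.2]])

found = library.lookup("singer_a")
missing = library.lookup("singer_without_recording")
print(found, missing)
```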
Step 302: Perform audio parsing on the song selected by the user whose singer is to be changed;
In practice, sound recorded or reproduced by analog equipment is analog audio, which becomes digital audio after digitization. The songs we usually listen to through a music app are digital audio. Audio parsing here means the process of extracting a series of time-domain and frequency-domain characteristics of a signal, taking the digital audio signal as the object of analysis and digital signal processing as the means.
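Digitization amounts to sampling the analog signal at regular intervals and quantizing each sample to an integer code. The sketch below assumes 16-bit signed PCM and an 8000 Hz sample rate purely for illustration; the function `digitize` and the test tone are hypothetical and do not come from the source.

```python
import math


def digitize(analog, duration_s, sample_rate, bits=16):
    """Sample a continuous signal analog(t) and quantize it to signed integers."""
    max_code = 2 ** (bits - 1) - 1           # 32767 for 16-bit audio
    n_samples = round(duration_s * sample_rate)
    pcm = []
    for i in range(n_samples):
        value = analog(i / sample_rate)      # sampling: read the signal at t = i / rate
        value = max(-1.0, min(1.0, value))   # clip to the representable range
        pcm.append(round(value * max_code))  # quantization: nearest integer code
    return pcm


def tone(t):
    """Stand-in for an analog source: a 440 Hz sine at half of full scale."""
    return 0.5 * math.sin(2 * math.pi * 440 * t)


pcm = digitize(tone, duration_s=0.01, sample_rate=8000)
print(len(pcm), pcm[:4])
```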
Audio parsing is implemented mainly with the Fourier transform and signal sampling techniques. The Fourier transform is the basis of spectrum analysis. Spectrum analysis of a signal means studying the frequency structure of the signal, finding how the amplitude, phase, and other properties of its components are distributed over frequency, and building various "spectra" with frequency as the horizontal axis, such as the amplitude spectrum and the phase spectrum.
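An amplitude spectrum and a phase spectrum with frequency on the horizontal axis can be sketched as follows. This is a minimal example under assumptions: a direct discrete Fourier transform over one block of samples, one-sided amplitude scaling, and a hypothetical function name `spectrum`. The test signal is a 500 Hz cosine with amplitude 0.8 and phase π/3, chosen so that it falls exactly on one frequency bin.

```python
import cmath
import math


def spectrum(samples, sample_rate):
    """Return (frequency_hz, amplitude, phase_rad) for bins 0 .. n/2."""
    n = len(samples)
    rows = []
    for k in range(n // 2 + 1):
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        # One-sided scaling: double every bin except DC and Nyquist.
        scale = 1.0 if k == 0 or 2 * k == n else 2.0
        rows.append((k * sample_rate / n, scale * abs(coeff) / n, cmath.phase(coeff)))
    return rows


SAMPLE_RATE = 8000
N = 80  # bin spacing is SAMPLE_RATE / N = 100 Hz
signal = [
    0.8 * math.cos(2 * math.pi * 500 * t / SAMPLE_RATE + math.pi / 3)
    for t in range(N)
]

rows = spectrum(signal, SAMPLE_RATE)
peak = max(rows, key=lambda row: row[1])  # bin with the largest amplitude
print(peak)
```

Phase values are only meaningful for bins that carry energy; for near-empty bins they are dominated by rounding error.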
In the embodiments of the present invention, audio parsing of the song selected by the user yields the relevant audio parameters of the song, such as track, loudness, pitch, and waveform. Each track defines its own attributes, such as its timbre attribute. Because timbre determines how a listener tells voices apart, the track obtained by parsing the song can be modified to change the sound of the song's singer.
Step 303: Determine the acoustic spectrum information corresponding to the singer selected by the user, and modify the track information of the song whose singer is to be changed according to the determined acoustic spectrum information;
In practice, the track information obtained in step 302 for the song whose singer is to be changed is modified according to the acoustic spectrum information of the singer selected by the user. Modifying the track changes the timbre of the song's singer, so that the singing voice is converted from that of the original singer to that of the singer the user selected.
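The source does not specify how the track is modified. As one hedged illustration of changing timbre while leaving pitch and loudness alone, the sketch below assumes a note is described by the amplitudes of its harmonics and that the selected singer's spectrum information reduces to a profile of relative harmonic weights. The function `retimbre` and all numbers are hypothetical; practical voice conversion is considerably more involved.

```python
import math


def retimbre(note_harmonics, target_profile):
    """Reshape a note's harmonic amplitudes to follow target_profile.

    The fundamental frequency is untouched (pitch is kept) and the overall
    energy of the harmonics is preserved (loudness is kept); only the
    balance between harmonics, which carries the timbre, is replaced.
    """
    n = min(len(note_harmonics), len(target_profile))
    energy = math.sqrt(sum(a * a for a in note_harmonics[:n]))
    norm = math.sqrt(sum(w * w for w in target_profile[:n]))
    if norm == 0:
        return list(note_harmonics[:n])  # empty profile: leave the note as is
    return [energy * w / norm for w in target_profile[:n]]


original = [1.0, 0.5, 0.25]        # harmonics 1..3 of the original singer's note
target_profile = [1.0, 1.0, 0.0]   # relative weights for the selected singer

converted = retimbre(original, target_profile)
print(converted)
```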
Embodiment two
The method for converting audio sound production of the embodiments of the present invention is illustrated below, taking a change of the singer of a specific song as an example:
Suppose the current user wishes to hear the song 《Love Is Very Simple》, originally sung by Tao Zhe, in Sun Yanzi's voice. First, the music app performs audio parsing on the audio file of 《Love Is Very Simple》 to obtain the track of the song. Second, Sun Yanzi's acoustic spectrum information is looked up on the server, and the vocal track of 《Love Is Very Simple》 is modified according to the acoustic spectrum information found for the singer Sun Yanzi, finally yielding the song sung in Sun Yanzi's voice. In this way, the voice in 《Love Is Very Simple》 is converted from Tao Zhe's voice to Sun Yanzi's voice.
In practice, modifying a track takes very little time: modifying a song of about 10 MB takes roughly 15 s to 30 s. The method provided by the present invention can therefore quickly achieve the effect of changing the singing voice of a song.
The method, server, and computer storage medium for converting audio sound production provided by the embodiments of the present invention obtain audio data to be converted and a conversion object; parse the audio data to be converted to generate a parsing result; determine the track information of the audio data to be converted according to the parsing result; determine the acoustic spectrum information of the conversion object in a preset acoustic spectrum information database; and convert the track information of the audio data to be converted according to the acoustic spectrum information of the conversion object to determine the converted audio data. In this way, in response to the needs of current users, a user can pick a song they like, then pick a singer they like, and have the song sung in that singer's voice, achieving the effect of hearing other songs sung in the voice of a favorite singer. This makes the music app more entertaining and gives users a better experience.
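The flow summarized above (parse, determine track information, look up the conversion object, convert) can be laid out as a toy pipeline. Everything here is an assumption made for illustration: audio is represented as a dictionary that already holds decoded pitch and timbre values, the database is a plain dictionary, and the names `parse_track_info`, `convert_voice`, `singer_b`, and `demo_song` are hypothetical.

```python
def parse_track_info(audio):
    """Stand-in for audio parsing: extract the track information."""
    return {"pitch": list(audio["pitch"]), "timbre": list(audio["timbre"])}


def convert_voice(audio, target_name, spectrum_db):
    """Replace the timbre in the track information with the target's spectrum."""
    track = parse_track_info(audio)
    if target_name not in spectrum_db:
        raise KeyError(f"no acoustic spectrum stored for {target_name!r}")
    track["timbre"] = list(spectrum_db[target_name])
    # Build a new record so the original audio data is left unchanged.
    return {**audio, **track}


spectrum_db = {"singer_b": [0.7, 0.6, 0.1]}
song = {"title": "demo_song", "pitch": [220.0, 247.0], "timbre": [1.0, 0.3, 0.2]}

converted_song = convert_voice(song, "singer_b", spectrum_db)
print(converted_song)
```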
It should be noted that the foregoing is only a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention.
Claims (10)
- 1. A method for converting audio sound production, characterized in that the method comprises: obtaining audio data to be converted and a conversion target object of the audio data to be converted, parsing the audio data to be converted to obtain a parsing result, and determining track information of the audio data to be converted according to the parsing result, wherein the track information comprises at least the timbre of the audio data to be converted; determining acoustic spectrum information of the conversion target object in a preset acoustic spectrum information database, and converting the track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object to determine the converted audio data.
- 2. The method according to claim 1, characterized in that, before the obtaining of the audio data to be converted and the conversion target object, the method further comprises: obtaining acoustic spectrum information of at least one conversion target object, associating the acoustic spectrum information of the conversion target object with identification information of the conversion target object, and generating the acoustic spectrum information database.
- 3. The method according to claim 2, characterized in that the obtaining of the acoustic spectrum information of at least one object comprises: collecting the sound of the object, performing digital-to-analog conversion on the collected sound of the object to obtain digital audio data of the object, and analyzing the object according to the digital audio data to obtain the acoustic spectrum information of the object, wherein the acoustic spectrum information of the object comprises at least syllable spectrum information of the object's pronunciation.
- 4. The method according to claim 1, characterized in that the converting of the track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object comprises: adjusting the timbre of the syllables in the track information of the audio data to be converted according to the timbre of the audio features in the acoustic spectrum information of the conversion target object.
- 5. A device for converting audio sound production, characterized in that the device comprises: a parsing module, configured to obtain audio data to be converted and a conversion target object of the audio data to be converted, parse the audio data to be converted to obtain a parsing result, and determine track information of the audio data to be converted according to the parsing result, wherein the track information comprises at least the timbre of the audio data to be converted; and a conversion module, configured to determine acoustic spectrum information of the conversion target object in a preset acoustic spectrum information database, and convert the track information of the audio data to be converted according to the acoustic spectrum information of the conversion target object to determine the converted audio data.
- 6. The device according to claim 5, characterized in that the device further comprises: a generation module, configured to obtain acoustic spectrum information of at least one conversion target object, associate the acoustic spectrum information of the conversion target object with identification information of the conversion target object, and generate the acoustic spectrum information database.
- 7. The device according to claim 6, characterized in that the generation module is specifically configured to: collect the sound of an object, perform digital-to-analog conversion on the collected sound of the object to obtain digital audio data of the object, and analyze the object according to the digital audio data to obtain the acoustic spectrum information of the object, wherein the acoustic spectrum information of the object comprises at least syllable spectrum information of the object's pronunciation.
- 8. The device according to claim 6, characterized in that the conversion module is specifically configured to: adjust the timbre of the syllables in the track information of the audio data to be converted according to the timbre of the audio features in the acoustic spectrum information of the conversion target object.
- 9. A server, characterized by comprising: a processor and a memory for storing a computer program capable of running on the processor, wherein the processor is configured to perform the steps of the method according to any one of claims 1 to 4 when running the computer program.
- 10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710752085.2A CN107481735A (en) | 2017-08-28 | 2017-08-28 | Method for converting audio sound production, server and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710752085.2A CN107481735A (en) | 2017-08-28 | 2017-08-28 | Method for converting audio sound production, server and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107481735A (en) | 2017-12-15 |
Family
ID=60602945
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710752085.2A Pending CN107481735A (en) | 2017-08-28 | 2017-08-28 | Method for converting audio sound production, server and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107481735A (en) |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101359473A (en) * | 2007-07-30 | 2009-02-04 | 国际商业机器公司 | Auto speech conversion method and apparatus |
JP5545935B2 (en) * | 2009-09-04 | 2014-07-09 | 国立大学法人 和歌山大学 | Voice conversion device and voice conversion method |
CN102881283A (en) * | 2011-07-13 | 2013-01-16 | 三星电子(中国)研发中心 | Method and system for processing voice |
CN103295574A (en) * | 2012-03-02 | 2013-09-11 | 盛乐信息技术(上海)有限公司 | Singing voice conversion device and method thereof |
US20150025892A1 (en) * | 2012-03-06 | 2015-01-22 | Agency For Science, Technology And Research | Method and system for template-based personalized singing synthesis |
CN102982809A (en) * | 2012-12-11 | 2013-03-20 | 中国科学技术大学 | Conversion method for sound of speaker |
CN105390141A (en) * | 2015-10-14 | 2016-03-09 | 科大讯飞股份有限公司 | Sound conversion method and sound conversion device |
CN106205623A (en) * | 2016-06-17 | 2016-12-07 | 福建星网视易信息系统有限公司 | A kind of sound converting method and device |
CN107093421A (en) * | 2017-04-20 | 2017-08-25 | 深圳易方数码科技股份有限公司 | A kind of speech simulation method and apparatus |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108364658A (en) * | 2018-03-21 | 2018-08-03 | 冯键能 | Cyberchat method and server-side |
CN110505496A (en) * | 2018-05-16 | 2019-11-26 | 腾讯科技(深圳)有限公司 | Live-broadcast control method and device, storage medium and electronic device |
CN109348274A (en) * | 2018-09-12 | 2019-02-15 | 咪咕音乐有限公司 | Live broadcast interaction method and device and storage medium |
CN109243477A (en) * | 2018-10-17 | 2019-01-18 | 杭州兆华电子有限公司 | A kind of audio repeating box |
TWI685835B (en) * | 2018-10-26 | 2020-02-21 | 財團法人資訊工業策進會 | Audio playback device and audio playback method thereof |
US11049490B2 (en) | 2018-10-26 | 2021-06-29 | Institute For Information Industry | Audio playback device and audio playback method thereof for adjusting text to speech of a target character using spectral features |
CN110062267A (en) * | 2019-05-05 | 2019-07-26 | 广州虎牙信息科技有限公司 | Live data processing method, device, electronic equipment and readable storage medium storing program for executing |
CN110162660A (en) * | 2019-05-28 | 2019-08-23 | 维沃移动通信有限公司 | Audio-frequency processing method, device, mobile terminal and storage medium |
CN110170170A (en) * | 2019-05-30 | 2019-08-27 | 维沃移动通信有限公司 | A kind of information display method and terminal device |
WO2021128256A1 (en) * | 2019-12-27 | 2021-07-01 | 深圳市优必选科技股份有限公司 | Voice conversion method, apparatus and device, and storage medium |
CN113259701A (en) * | 2021-05-18 | 2021-08-13 | 游艺星际(北京)科技有限公司 | Method and device for generating personalized timbre and electronic equipment |
CN113259701B (en) * | 2021-05-18 | 2023-01-20 | 游艺星际(北京)科技有限公司 | Method and device for generating personalized timbre and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20171215 |