CN108172210B - Singing harmony generation method based on singing voice rhythm - Google Patents
- Publication number
- CN108172210B (application number CN201810101219.9A)
- Authority
- CN
- China
- Prior art keywords
- singing
- bpm
- singing voice
- harmony
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H7/00—Instruments in which the tones are synthesised from a data store, e.g. computer organs
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/315—Sound category-dependent sound synthesis processes [Gensound] for musical use; Sound category-specific synthesis-controlling parameters or control means therefor
- G10H2250/455—Gensound singing voices, i.e. generation of human voices for musical applications, vocal singing sounds or intelligible words at a desired pitch or with desired vocal effects, e.g. by phoneme synthesis
Abstract
The invention relates to a singing harmony generation method based on singing voice rhythm. Starting from the application of singing harmony, the method performs rhythm detection on the singer's voice based on the spectral flux and adaptively adjusts the delay amount of the harmony part according to the singing voice rhythm to generate the harmony. This simplifies the beat extraction process, reduces the time complexity, and enriches the singer's musical expression. The method is simple, flexible to implement, and highly practical.
Description
Technical Field
The invention relates to the field of singing voice synthesis, in particular to a singing harmony generating method based on singing voice rhythm.
Background
Singing voice is a comparatively complex audio signal and form of artistic expression, and its analysis and study are of considerable importance. With the popularization of music entertainment, sound effect processing for musical voice has become a focus of research and application and is receiving wide attention from academia and industry. Although sound effect processing for karaoke applications is relatively mature, users find it difficult to add a harmony part to their own singing because of the limits of their voice and singing ability. It is therefore of practical value to study how to generate harmony from the singer's own vocal characteristics and how to adapt that harmony to the rhythm of the singing voice.
Disclosure of Invention
The invention aims to provide a singing harmony generation method based on singing voice rhythm that generates harmony adaptively according to the speed of the beat, so as to enrich the singer's musical expression.
To achieve this aim, the technical scheme of the invention is a singing harmony generation method based on singing voice rhythm, comprising the following steps:
step S1: preprocessing the input singing voice audio signal, the preprocessing comprising filtering, pre-emphasis and normalization;
step S2: framing the preprocessed singing voice audio x(n) and calculating the log spectrum of each frame;
step S3: calculating the spectral flux SF(n) of the singing voice signal from the sequence of log spectra; taking SF(n), after low-pass filtering and smoothing, as the endpoint intensity curve F(t); then calculating the autocorrelation sequence TG(τ) of the endpoint intensity curve, the τ at which TG(τ) is maximal being the beat period, from which the BPM characteristic value is calculated;
step S4: calculating the average BPM characteristic value of the whole input singing voice signal, denoted BPM, and calculating the delay amount of the harmony part from BPM;
step S5: making a copy of the preprocessed singing voice audio x(n), raising the pitch of the copy by a third, and then passing it through a delayer to generate the harmony part h(n);
step S6: superposing the original part x(n) and the harmony part h(n) in linear proportion and outputting y(n), which is the generated singing harmony.
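Step S1 names filtering, pre-emphasis and normalization without fixing their parameters. The following is a minimal sketch of the last two, assuming a first-order pre-emphasis with coefficient 0.97 and peak normalization. Both choices, and the function names, are illustrative assumptions rather than values taken from the patent; the filtering stage is left out because the text does not specify it.

```python
def pre_emphasize(x, alpha=0.97):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    if not x:
        return []
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]


def peak_normalize(x, eps=1e-12):
    """Scale the signal so that its largest absolute sample becomes 1."""
    peak = max((abs(v) for v in x), default=0.0)
    if peak < eps:
        return list(x)  # silent input: nothing to scale
    return [v / peak for v in x]


def preprocess(x, alpha=0.97):
    """Pre-emphasis followed by peak normalization (filtering stage omitted)."""
    return peak_normalize(pre_emphasize(x, alpha))
```

Peak normalization is only one possible reading of "normalization"; an RMS-based scaling would fit the text equally well.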
In one embodiment of the present invention, in step S2, the log spectrum of each frame is calculated according to the following steps:
step S21: dividing the singing voice audio into frames according to the frame length K and the frame shift hop to obtain x_i(n);
step S22: applying a short-time Fourier transform to x_i(n) to obtain the frequency-domain signal X_i(k), from which the log spectrum of the frame is computed.
In an embodiment of the present invention, the frame length K is the number of samples in 10 ms to 30 ms, i.e. K = frame duration × sampling frequency; the frame shift hop is the non-overlapping part of two adjacent frames, and hop = K/3.
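The framing rule above (K = frame duration × sampling frequency, hop = K/3) and the per-frame log spectrum can be sketched as follows. The log spectrum formula is not legible in this text, so the form log(1 + |X_i(k)|) used here is an assumption, as are the 20 ms default, the absence of an analysis window, and all function names.

```python
import cmath
import math


def frame_params(fs, frame_ms=20.0):
    """Frame length K in samples and frame shift hop = K/3 (rounded down)."""
    K = int(round(fs * frame_ms / 1000.0))
    hop = max(1, K // 3)
    return K, hop


def split_frames(x, K, hop):
    """Frames x_i(n) of length K; successive frames start hop samples apart."""
    return [x[s:s + K] for s in range(0, len(x) - K + 1, hop)]


def log_spectrum(frame):
    """Log magnitude spectrum of one frame via a direct DFT.

    ASSUMPTION: the log spectrum is taken as log(1 + |X(k)|).
    """
    K = len(frame)
    out = []
    for k in range(K // 2 + 1):
        X_k = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / K) for n in range(K))
        out.append(math.log1p(abs(X_k)))
    return out
```

For a 16 kHz signal and 20 ms frames this gives K = 320 and hop = 106. The direct DFT is quadratic in K and is used only to keep the sketch free of dependencies; an FFT would normally replace it.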
In an embodiment of the present invention, in step S3, the spectral flux SF(n) is defined by a formula (not reproduced here) in which n is the frame number, K is the frame length, and H(x) is the half-wave rectification function;
the autocorrelation sequence TG(τ) is:
TG(τ)=W(τ)∑F(t)F(t-τ);
wherein W(τ) is a Gaussian weighting function;
the BPM characteristic value is:
BPM=60*fs/(hop*τmax);
wherein fs is the sampling rate, hop is the frame shift, and τmax is the beat period.
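The tempo estimate described above can be sketched as follows. The spectral flux formula is missing from this text, so the sketch assumes the common definition SF(n) = Σ_k H(S_n(k) − S_{n−1}(k)) with H(x) = max(x, 0). The moving-average smoother, the Gaussian weight on a log-lag axis centred at 120 BPM, and the 40 to 240 BPM search range are likewise illustrative assumptions, not details taken from the patent.

```python
import math


def spectral_flux(log_spectra):
    """Assumed form: SF(n) = sum_k H(S_n(k) - S_{n-1}(k)), H(x) = max(x, 0); SF(0) = 0."""
    sf = [0.0]
    for prev, cur in zip(log_spectra, log_spectra[1:]):
        sf.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return sf


def smooth(sf, width=3):
    """Moving average, standing in for the unspecified low-pass filter."""
    half = width // 2
    out = []
    for n in range(len(sf)):
        window = sf[max(0, n - half):n + half + 1]
        out.append(sum(window) / len(window))
    return out


def estimate_bpm(F, fs, hop, tau0_bpm=120.0, sigma_octaves=1.0, bpm_range=(40.0, 240.0)):
    """Find the lag maximising TG(tau) = W(tau) * sum_t F(t) F(t - tau), then convert to BPM."""
    frames_per_min = 60.0 * fs / hop
    tau_min = max(1, int(math.ceil(frames_per_min / bpm_range[1])))
    tau_max = min(len(F) - 1, int(frames_per_min / bpm_range[0]))
    tau0 = frames_per_min / tau0_bpm  # lag (in frames) at the centre of the weighting
    best_tau, best_tg = None, float("-inf")
    for tau in range(tau_min, tau_max + 1):
        acf = sum(F[t] * F[t - tau] for t in range(tau, len(F)))
        w = math.exp(-0.5 * (math.log2(tau / tau0) / sigma_octaves) ** 2)
        if w * acf > best_tg:
            best_tau, best_tg = tau, w * acf
    if best_tau is None:
        return None  # signal too short for the requested range
    return frames_per_min / best_tau  # BPM = 60 * fs / (hop * tau_max)
```

With fs = 16000 and hop = 160 there are 100 frames per second, so a curve that peaks every 50 frames corresponds to a beat period of 0.5 s, i.e. 120 BPM.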
In an embodiment of the present invention, step S4 is implemented as follows:
step S41: extracting the BPM characteristic value of the singing voice signal every 2 seconds; the mean of the sequence of BPM characteristic values over the whole signal is the average value, denoted BPM;
step S42: setting the number of delayed beats D, and calculating the delay amount delay from D and BPM.
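Steps S41 and S42 can be sketched as follows. The delay formula itself is not legible in this text, so the sketch assumes the natural reading that D beats at a tempo of BPM last D × 60 / BPM seconds. That formula, the conversion to samples, and the function names are all assumptions.

```python
def average_bpm(bpm_values):
    """Mean of the per-window BPM estimates (one estimate per 2 s window)."""
    valid = [b for b in bpm_values if b is not None and b > 0]
    if not valid:
        raise ValueError("no valid BPM estimates")
    return sum(valid) / len(valid)


def delay_in_samples(bpm, D, fs):
    """ASSUMED formula: D beats at `bpm` last D * 60 / bpm seconds."""
    seconds = D * 60.0 / bpm
    return int(round(seconds * fs))
```

Under this assumption, half a beat (D = 0.5) at 120 BPM and fs = 16000 corresponds to 0.25 s, or 4000 samples.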
In an embodiment of the present invention, in step S5, the pitch is raised using a timbre-preserving pitch conversion method.
In one embodiment of the present invention, in step S5, the third is an imperfect-consonance third, i.e. a minor or major third, so that the pitch is 2^(3/12) or 2^(4/12) times the original pitch.
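The two ratios follow from twelve-tone equal temperament, in which one semitone multiplies the frequency by 2^(1/12): a minor third spans 3 semitones and a major third 4. A short worked example, with illustrative function names:

```python
def interval_ratio(semitones):
    """Frequency ratio of an equal-tempered interval spanning `semitones` semitones."""
    return 2.0 ** (semitones / 12.0)


MINOR_THIRD = interval_ratio(3)  # about 1.189
MAJOR_THIRD = interval_ratio(4)  # about 1.260


def harmony_pitch(f0_hz, semitones=4):
    """Pitch of the harmony note that lies `semitones` above a sung pitch f0_hz."""
    return f0_hz * interval_ratio(semitones)
```

A sung A4 at 440 Hz would thus be harmonized at about 523.25 Hz (minor third) or about 554.37 Hz (major third).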
In an embodiment of the present invention, in step S6, the linear proportional superposition formula is:
y(n)=x(n)+k*h(n);
where k is the dry/wet ratio; a preferable effect is obtained when k = 0.8.
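The superposition y(n) = x(n) + k*h(n) can be sketched as follows. Treating the shorter signal as zero-padded is an assumption made here so that a delayed, and therefore longer, harmony part can be mixed without truncation.

```python
def mix(x, h, k=0.8):
    """y(n) = x(n) + k * h(n); the shorter signal is treated as zero-padded."""
    n_out = max(len(x), len(h))
    y = []
    for n in range(n_out):
        dry = x[n] if n < len(x) else 0.0
        wet = h[n] if n < len(h) else 0.0
        y.append(dry + k * wet)
    return y
```

Because the two parts are added, the sum can exceed the range of the normalized input, so an implementation would likely rescale or limit y(n) before output.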
Compared with the prior art, the invention has the following beneficial effects: starting from the application of singing harmony, the proposed method simplifies the beat extraction process and reduces the time complexity, generates harmony adaptively according to the speed of the beat, and enriches the singer's musical expression. The method is simple, flexible to implement, and highly practical.
Drawings
Fig. 1 is a flow chart of a singing harmony generation method based on singing voice rhythm in the present invention.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
The invention provides a singing harmony generation method based on singing voice rhythm which, as shown in Fig. 1, is divided into three stages. In the rhythm detection stage, a flux filtering method suited to singing voice and to the characteristics of human hearing is used: the endpoint intensity curve is obtained from the spectral flux, and the BPM characteristic value is then extracted. In the harmony part generation stage, the delay amount of the harmony part is calculated dynamically from the BPM characteristic value, and a harmony part in the same singer's voice is generated with a timbre-preserving pitch conversion algorithm. In the superposition stage, the singing voice and the harmony are superposed in linear proportion, according to the delay amount and the dry/wet ratio, and output. The specific steps are as follows:
Step S1: calculating the log spectrum of the song audio signal: first, the whole song audio signal is preprocessed by filtering, pre-emphasis, normalization and the like. The resulting signal is then divided into short frames according to the frame length K and the frame shift hop to obtain x_i(n), where K = frame duration × sampling frequency and hop = K/3. Each frame is processed as follows: x_i(n) is short-time Fourier transformed to X_i(k) = STFT(x_i(n)), and the log spectrum of the frame is then computed from X_i(k), giving the log spectrum sequence.
Step S2: calculating the BPM characteristic value: the spectral flux SF(n) of the singing voice signal is calculated from the log spectrum sequence and, after low-pass filtering and smoothing, is taken as the endpoint intensity curve F(t). The autocorrelation sequence TG(τ) of the endpoint intensity curve is calculated and weighted with a Gaussian window function; the τ at which TG(τ) is maximal is the beat period τmax, and the BPM characteristic value is calculated from the formula BPM = 60*fs/(hop*τmax).
Step S3: calculating the average beat: the BPM characteristic value of the singing voice signal is extracted every 2 seconds, and the mean of the BPM characteristic value sequence over the whole signal is taken as the average value, denoted BPM.
Step S4: calculating the delay amount: if the average BPM lies within the processing range, the delay amount of the harmony part is calculated from it; otherwise, the BPM characteristic value exceeds the processing range and no processing is performed.
Step S5: generating the harmony part: the original signal is copied and its pitch is raised by an imperfect-consonance third (a minor or major third), i.e. to 2^(3/12) or 2^(4/12) times the original pitch, using a timbre-preserving pitch conversion method; a delayer then yields the harmony part signal h(n), delayed by delay relative to the main part.
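The delayer in step S5 can be sketched as a plain sample delay, assuming the delay amount has already been converted to a whole number of samples. The timbre-preserving pitch conversion itself is not shown, since the text does not describe its internals, and the function name is illustrative.

```python
def delay_signal(h, delay_samples):
    """Delay h by `delay_samples`: output[n] = h[n - delay], with zeros before onset."""
    if delay_samples < 0:
        raise ValueError("delay must be non-negative")
    return [0.0] * delay_samples + list(h)
```

The output is longer than the input by the delay, so the harmony part ends later than the main part.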
Step S6: linear proportional superposition: the original part x(n) and the harmony part h(n) are superposed linearly according to the formula y(n) = x(n) + k*h(n), and the output y(n) is the generated singing harmony. Here k is the dry/wet ratio; a preferable effect is obtained when k = 0.8.
The above are preferred embodiments of the present invention; all changes made according to the technical scheme of the present invention whose functional effects do not exceed the scope of that scheme fall within the protection scope of the present invention.
Claims (8)
1. A singing harmony generation method based on singing voice rhythm, characterized by comprising the following steps:
step S1: preprocessing the input singing voice audio signal, the preprocessing comprising filtering, pre-emphasis and normalization;
step S2: framing the preprocessed singing voice audio x(n) and calculating the log spectrum of each frame;
step S3: calculating the spectral flux SF(n) of the singing voice signal from the sequence of log spectra; taking SF(n), after low-pass filtering and smoothing, as the endpoint intensity curve F(t); then calculating the autocorrelation sequence TG(τ) of the endpoint intensity curve, the τ at which TG(τ) is maximal being the beat period, from which the BPM characteristic value is calculated; the BPM characteristic value is:
BPM=60*fs/(hop*τmax)
wherein fs is the sampling rate, hop is the frame shift, and τmax is the beat period;
step S4: calculating the average BPM characteristic value of the whole input singing voice signal, denoted BPM, and calculating the delay amount of the harmony part from BPM;
step S5: making a copy of the preprocessed singing voice audio x(n), raising the pitch of the copy by a third, and then generating, through a delayer, the harmony part h(n) delayed by delay relative to the singing voice audio x(n);
step S6: superposing the original part x(n) and the harmony part h(n) in linear proportion and outputting y(n), which is the generated singing harmony.
2. The singing harmony generation method based on singing voice rhythm as claimed in claim 1, wherein in step S2 the log spectrum of each frame is calculated according to the following steps:
step S21: dividing the singing voice audio into frames according to the frame length K and the frame shift hop to obtain x_i(n);
step S22: applying a short-time Fourier transform to x_i(n) to obtain the frequency-domain signal X_i(k), from which the log spectrum of the frame is computed.
3. The method of claim 2, wherein the frame length K is the number of samples in 10 ms to 30 ms, i.e. K = frame duration × sampling frequency; the frame shift hop is the non-overlapping part of two adjacent frames, and hop = K/3.
4. The singing harmony generation method based on singing voice rhythm as claimed in claim 1, wherein in step S3 the spectral flux SF(n) is defined by a formula (not reproduced here) in which n is the frame number, K is the frame length, and H(x) is the half-wave rectification function;
the autocorrelation sequence TG(τ) is:
TG(τ)=W(τ)∑F(t)F(t-τ);
wherein W(τ) is a Gaussian weighting function;
the BPM characteristic value is:
BPM=60*fs/(hop*τmax);
wherein fs is the sampling rate, hop is the frame shift, and τmax is the beat period.
5. The singing harmony generation method based on singing voice rhythm as claimed in claim 1, wherein step S4 is implemented as follows:
step S41: extracting the BPM characteristic value of the singing voice signal every 2 seconds; the mean of the sequence of BPM characteristic values over the whole signal is the average value, denoted BPM.
6. The singing harmony generation method based on singing voice rhythm as claimed in claim 1, wherein in step S5 the pitch is raised using a timbre-preserving pitch conversion method.
7. The method of claim 1, wherein in step S5 the third is an imperfect-consonance third, i.e. a minor or major third, so that the pitch is 2^(3/12) or 2^(4/12) times the original pitch.
8. The singing harmony generation method based on singing voice rhythm as claimed in claim 1, wherein in step S6 the linear proportional superposition formula is:
y(n)=x(n)+k*h(n);
where k is the dry/wet ratio; a preferable effect is obtained when k = 0.8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810101219.9A CN108172210B (en) | 2018-02-01 | 2018-02-01 | Singing harmony generation method based on singing voice rhythm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108172210A (en) | 2018-06-15 |
CN108172210B (en) | 2021-03-02 |
Family
ID=62512557
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810101219.9A Expired - Fee Related CN108172210B (en) | 2018-02-01 | 2018-02-01 | Singing harmony generation method based on singing voice rhythm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108172210B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109545176B (en) * | 2019-01-21 | 2022-03-04 | 北京小唱科技有限公司 | Dynamic echo processing method and device for audio |
CN109920449B (en) * | 2019-03-18 | 2022-03-04 | 广州市百果园网络科技有限公司 | Beat analysis method, audio processing method, device, equipment and medium |
CN110853604A (en) * | 2019-10-30 | 2020-02-28 | 西安交通大学 | Automatic generation method of Chinese folk songs with specific region style based on variational self-encoder |
CN112908289B (en) * | 2021-03-10 | 2023-11-07 | 百果园技术(新加坡)有限公司 | Beat determining method, device, equipment and storage medium |
CN113411663B (en) * | 2021-04-30 | 2023-02-21 | 成都东方盛行电子有限责任公司 | Music beat extraction method for non-woven engineering |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1134580A (en) * | 1995-02-02 | 1996-10-30 | 雅马哈株式会社 | Harmony chorus apparatus generating chorus sound derived from vocal sound |
CN1153964A (en) * | 1995-02-27 | 1997-07-09 | 雅马哈株式会社 | Karaoke apparatus creating virtual harmony voice over actual singing voice |
JP2001117578A (en) * | 1999-10-21 | 2001-04-27 | Yamaha Corp | Device and method for adding harmony sound |
US6816833B1 (en) * | 1997-10-31 | 2004-11-09 | Yamaha Corporation | Audio signal processor with pitch and effect control |
CN102568457A (en) * | 2011-12-23 | 2012-07-11 | 深圳市万兴软件有限公司 | Music synthesis method and device based on humming input |
CN102568454A (en) * | 2011-12-13 | 2012-07-11 | 北京百度网讯科技有限公司 | Method and device for analyzing music BPM (Beat Per Minutes) |
CN105070283A (en) * | 2015-08-27 | 2015-11-18 | 百度在线网络技术(北京)有限公司 | Singing voice scoring method and apparatus |
CN105659322A (en) * | 2013-09-19 | 2016-06-08 | 微软技术许可有限责任公司 | Recommending audio sample combinations |
CN106228973A (en) * | 2016-07-21 | 2016-12-14 | 福州大学 | Stablize the music voice modified tone method of tone color |
CN106373580A (en) * | 2016-09-05 | 2017-02-01 | 北京百度网讯科技有限公司 | Singing synthesis method based on artificial intelligence and device |
CN106653037A (en) * | 2015-11-03 | 2017-05-10 | 广州酷狗计算机科技有限公司 | Audio data processing method and device |
US20170221466A1 (en) * | 2012-10-19 | 2017-08-03 | Sing Trix Llc | Vocal processing with accompaniment music input |
Non-Patent Citations (3)
Title |
---|
"synchronization method for improving temporal harmony of music and video clips";Hayato Kumagai;《international conference on applied computing & information technology/international conference on computational science& intelligence 》;20151231;全文 * |
"tempo and beat estimation of musical signals";Alonso M.;《international symposium on music information retrieval》;20041231;全文 * |
"一种基于简单自相关的基音周期搜索算法";王孝欣;《工业控制计算机》;20151231;全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN108172210A (en) | 2018-06-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108172210B (en) | Singing harmony generation method based on singing voice rhythm | |
TW200412178A (en) | Apparatus and method for audio-signal-processing | |
Ikemiya et al. | Singing voice separation and vocal F0 estimation based on mutual combination of robust principal component analysis and subharmonic summation | |
Boersma et al. | Spectral characteristics of three styles of Croatian folk singing | |
JP2015515647A (en) | Automatic utterance conversion to songs, rap, or other audible expressions with the desired time signature or rhythm | |
DE102012103553A1 (en) | AUDIO SYSTEM AND METHOD FOR USING ADAPTIVE INTELLIGENCE TO DISTINCT THE INFORMATION CONTENT OF AUDIOSIGNALS IN CONSUMER AUDIO AND TO CONTROL A SIGNAL PROCESSING FUNCTION | |
EP2962299B1 (en) | Audio signal analysis | |
JPH09258787A (en) | Frequency band expanding circuit for narrow band voice signal | |
CN110136730B (en) | Deep learning-based piano and acoustic automatic configuration system and method | |
CN106653048B (en) | Single channel sound separation method based on voice model | |
CN106383676B (en) | Instant photochromic rendering system for sound and application thereof | |
CN110139206A (en) | A kind of processing method and system of stereo audio | |
Camacho | On the use of auditory models' elements to enhance a sawtooth waveform inspired pitch estimator on telephone-quality signals | |
Kumar et al. | Musical onset detection on carnatic percussion instruments | |
McFee et al. | Better beat tracking through robust onset aggregation | |
Benetos et al. | Auditory spectrum-based pitched instrument onset detection | |
Sofianos et al. | Towards effective singing voice extraction from stereophonic recordings | |
Xu et al. | The extraction and simulation of Mel frequency cepstrum speech parameters | |
Ellis et al. | Inharmonic speech: a tool for the study of speech perception and separation | |
Chanrungutai et al. | Singing voice separation for mono-channel music using non-negative matrix factorization | |
Bonjyotsna et al. | Analytical study of vocal vibrato and mordent of Indian popular singers | |
Chen et al. | Modified Perceptual Linear Prediction Liftered Cepstrum (MPLPLC) Model for Pop Cover Song Recognition. | |
JP2001249676A (en) | Method for extracting fundamental period or fundamental frequency of periodical waveform with added noise | |
Sharma et al. | Separating the source information in repetition-dependent music and enhancing it by real-time digital audio processing | |
JP2000003200A (en) | Voice signal processor and voice signal processing method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20210302; termination date: 20220201 |