WO2021179206A1 - Automatic mixing device - Google Patents

Automatic mixing device

Info

Publication number: WO2021179206A1
Authority: WO (WIPO (PCT))
Prior art keywords: music, mixing device, automatic mixing, melody, beat
Application number: PCT/CN2020/078803
Other languages: English (en), French (fr)
Inventor: 普莱斯·亚当
Original Assignee: 努音有限公司
Application filed by 努音有限公司 filed Critical 努音有限公司
Priority to PCT/CN2020/078803 priority Critical patent/WO2021179206A1/zh
Priority to US17/910,484 priority patent/US20230267899A1/en
Publication of WO2021179206A1 publication Critical patent/WO2021179206A1/zh

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H — ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 — Details of electrophonic musical instruments
    • G10H1/0008 — Associated control or indicating means
    • G10H1/0025 — Automatic or semi-automatic music composition, e.g. producing random music, applying rules from music theory or modifying a musical piece
    • G10H1/36 — Accompaniment arrangements
    • G10H1/38 — Chord
    • G10H1/40 — Rhythm
    • G10H1/46 — Volume control
    • G10H2210/00 — Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 — Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/066 — Pitch analysis as part of wider processing for musical purposes, e.g. transcription, musical performance evaluation; pitch recognition, e.g. in polyphonic sounds; estimation or use of missing fundamental
    • G10H2210/076 — Extraction of timing, tempo; beat detection
    • G10H2210/081 — Automatic key or tonality recognition, e.g. using musical rules or a knowledge base
    • G10H2210/375 — Tempo or beat alterations; music timing control
    • G10H2210/571 — Chords; chord sequences


Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The present invention provides an automatic mixing device, comprising a music feature calculator whose input music comprises melody, bass, percussion, and vocal tracks. The music feature calculator selects one or more of the melody, bass, percussion, and vocal tracks, and computes one or more of the following features of the input music: the beat-point times, the chord at each downbeat, the chroma vector at each downbeat, the sound energy at each downbeat, the key, and the tempo of the piece. The automatic mixing device of the present invention can compute the musical features of a piece from its individual tracks and automatically derive mix points from those features, automating the mixing process and solving the prior-art problems of low mixing efficiency and stiff-sounding mixes.

Description

Automatic Mixing Device

Technical Field

The present invention relates to the field of music mixing, and in particular to an automatic mixing device.

Background Art

Music mixing generally refers to the practice in which a disc jockey (DJ) selects and plays pre-recorded music (such as pop songs) and mixes it live on a computer to produce a distinctive piece that differs from the originals. Software that assists DJ mixing includes Traktor, Serato, and Mixed in Key. These programs are all based on the similarity of musical rhythm and key, and they help the DJ manually adjust the tempo and key of the music. In this style of DJ mixing, several pieces are chained together: at each mix point, one piece takes over from the previous one and playback continues.

Such manual mixing, however, is inefficient, costly, and suited to few scenarios. To improve efficiency, some commercial solutions have appeared on the market to help users pick songs for a medley. Most of them are based on the similarity of rhythm and key, and they replace one song wholesale with another. Although these designs give the user some assisting hints, the user must still manually choose the song to substitute and specify the substitution time; the substitution time points (mix points) cannot be computed fully automatically. Nor do they consider multi-track music: the replaced section of one song is swapped out wholesale for part of another song, so the result sounds too stiff. Some solutions additionally compare chords, but they give the vocal track no special treatment, and their chord detection error rate is high.
Summary of the Invention

In view of the above shortcomings of the prior art, an object of the present invention is to provide an automatic mixing device that takes a song selected by the user as the main song, selects several similar songs from a precomputed database, and finds the mix points at which sections of the main song and the similar songs can be substituted for one another. A further object of the present invention is to solve the prior-art problems that mix points cannot be computed automatically and that mixing results are stiff and error-prone.

To achieve the above and other related objects, the present invention provides an automatic mixing device, comprising a music feature calculator whose input music comprises melody, bass, percussion, and vocal tracks. The music feature calculator selects one or more of the melody, bass, percussion, and vocal tracks, and computes one or more of the following features of the input music: the beat-point times, the chord at each downbeat, the chroma vector at each downbeat, the sound energy at each downbeat, the key, and the tempo of the piece.

The automatic mixing device of the present invention can compute the musical features of a piece from its individual tracks and automatically derive mix points from those features. It automates mixing and solves the prior-art problems of low mixing efficiency and stiff-sounding results, and therefore has very high industrial application value.
Brief Description of the Drawings

Figure 1 is a flow chart of the music feature calculator of the present invention;

Figure 2 is a schematic diagram of the sections of a piece of music;

Figure 3 is a flow chart for computing mix points.
Detailed Description of the Embodiments

The embodiments of the present invention are described below by way of specific examples; those skilled in the art can readily understand other advantages and effects of the invention from what this specification discloses. The invention can also be implemented or applied through other, different embodiments, and the details in this specification can be modified or changed in various ways from different viewpoints and for different applications without departing from the spirit of the invention.

Please refer to the accompanying drawings. Note that the figures provided with this embodiment illustrate the basic concept of the invention only schematically, so they show only the components related to the invention rather than the component counts, shapes, and sizes of an actual implementation. In an actual implementation, the form, quantity, and proportions of the components may vary freely, and the component layout may be more complex.
The automatic mixing device of the present invention comprises a music feature calculator and a mix-point calculator, which are introduced in turn below with reference to the drawings.

Referring first to Figure 1, the flow chart of the music feature calculator: the music features defined by the calculator of the present invention include the downbeat times, the chord and the chroma vector at each downbeat, the sound energy at each downbeat, and the tempo and key of the music. The computed feature values are the key reference for finding mix points.

The input to the music feature calculator comprises four tracks: melody, bass, percussion, and vocals. Different feature computations use different combinations of tracks. Preferred implementations for computing each feature are described below.
Beat times and downbeat times: a downbeat is the first beat of each measure. A typical piece has 4 beats per measure, so one downbeat is taken every 4 beats. The time of the first downbeat must be computed; once the first beat point is obtained, a downbeat is taken every 4 beats thereafter. Beat points can be found with traditional methods, such as signal-processing techniques based on the temporal correlation of musical onsets. In this example, several deep-learning recurrent neural networks compute the beat times of the music, and a hidden Markov model then derives the time of the first downbeat from the computed beat times. Many tools implement this class of method; for example, the DBNDownBeatTrackingProcessor in the madmom package can compute the beat times of a piece. The input is the melody + bass + percussion tracks; the vocal track is not used for beat computation, so that vocals do not interfere with beat finding.
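A minimal sketch of this step with madmom's downbeat pipeline; the file name is illustrative, and mixing the melody, bass, and percussion tracks down to one file before tracking is an assumption:

```python
# Sketch: downbeat tracking with madmom (file name is illustrative).
from madmom.features.downbeats import (RNNDownBeatProcessor,
                                       DBNDownBeatTrackingProcessor)

# An ensemble of recurrent networks produces beat/downbeat activations.
activations = RNNDownBeatProcessor()('melody_bass_percussion.wav')

# A dynamic Bayesian network decodes beat times and positions in the bar.
tracker = DBNDownBeatTrackingProcessor(beats_per_bar=[4], fps=100)
beats = tracker(activations)  # rows of (time_in_seconds, position_in_bar)

downbeat_times = [t for t, pos in beats if pos == 1]  # position 1 = downbeat
```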
Chord at each downbeat: after the downbeat times are obtained, a convolutional neural network computes chord features of the music, with the melody and bass tracks as input. Once the chord features are obtained, a conditional random field recognizes the chord at each downbeat.
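Recent versions of madmom also ship a CNN chord-feature extractor with CRF decoding, which can stand in for this CNN + CRF step; a sketch, with the file name illustrative (the text above feeds in melody + bass rather than a full mix):

```python
from madmom.features.chords import (CNNChordFeatureProcessor,
                                    CRFChordRecognitionProcessor)

features = CNNChordFeatureProcessor()('melody_bass.wav')
segments = CRFChordRecognitionProcessor()(features)
# segments: list of (start_time, end_time, chord_label) entries

def chord_at(t, segments):
    """Chord label of the segment containing time t (a downbeat time)."""
    for start, end, label in segments:
        if start <= t < end:
            return label
    return 'N'  # no chord
```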
Chroma vector at each downbeat: a chroma vector is a multi-element vector representing the energy of each pitch class over a span of time (e.g., one frame). (The energy of a pitch class is proportional to the amplitude of the sound; its computation follows that of mechanical-wave energy and is not repeated here.) In this example the chroma vector has 12 elements, one per pitch class over the span (e.g., one frame), with the energies of the same pitch class in different octaves accumulated. For the vocal, melody, and bass tracks alike, the harmonic spectrum can be computed by a deep-neural-network method and the chroma vector extracted from it.
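A sketch of per-downbeat chroma extraction; librosa's harmonic-component chroma is used here as a conventional stand-in for the deep-network harmonic spectrum described above, and the file name is illustrative:

```python
import librosa

y, sr = librosa.load('melody.wav')
y_harm = librosa.effects.harmonic(y)  # keep the harmonic component
# 12 x n_frames matrix; octaves of the same pitch class are folded together.
chroma = librosa.feature.chroma_cqt(y=y_harm, sr=sr)

def chroma_at(time_s, chroma, sr, hop_length=512):
    """12-element chroma vector at a given downbeat time."""
    frame = librosa.time_to_frames(time_s, sr=sr, hop_length=hop_length)
    return chroma[:, frame]
```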
Sound energy at each downbeat: in this example, the root-mean-square of the waveform amplitude at the downbeat is computed as its energy.
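A short sketch of that energy measure; taking the RMS over a fixed window starting at the downbeat is an assumption, since the original does not state the window length:

```python
import numpy as np

def downbeat_energy(y, sr, t, window_s=0.5):
    """RMS amplitude of the audio in a window starting at downbeat time t."""
    start = int(t * sr)
    segment = y[start:start + int(window_s * sr)]
    return float(np.sqrt(np.mean(segment ** 2)))
```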
Key of the music: in this example a convolutional neural network computes the key of the entire piece, with the melody + bass tracks as input.
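Recent versions of madmom provide a CNN key classifier that can stand in for this step as well (file name illustrative):

```python
from madmom.features.key import (CNNKeyRecognitionProcessor,
                                 key_prediction_to_label)

prediction = CNNKeyRecognitionProcessor()('melody_bass.wav')
print(key_prediction_to_label(prediction))  # e.g. 'A major'
```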
Tempo of the piece: the tempo can be computed from the beats. The formula for the tempo is given in the original as an image (PCTCN2020078803-appb-000001, not reproduced here), where beat denotes a beat in the phrase and i is the index of the beat. The more intuitive way to compute a piece's tempo is from the duration and total beat count of the whole piece, but that is rather time-consuming. Experimental data show that a piece's tempo usually stabilizes after it has been running for a while; that is, if a sample is taken at a suitable position in the middle of the piece, the tempo computed at the sampling point approximates the value computed from the whole piece's duration and total beat count very closely, and computing from a sampling point is obviously far faster. Across a large body of experimental data, beats 20 to 90 of a piece are usually fairly stable; in this example, i is taken to be 70.
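The formula itself is not reproduced above; given that it maps a single beat index i to a tempo, a natural reading is the instantaneous tempo from the inter-beat interval at beat i. A sketch under that assumption:

```python
def tempo_bpm(beat_times, i=70):
    """Tempo in bpm from the interval between beats i and i+1.

    Assumes the original formula uses the local inter-beat interval;
    beats 20-90 are described as the stable region of a piece.
    """
    seconds_per_beat = beat_times[i + 1] - beat_times[i]
    return 60.0 / seconds_per_beat
```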
Once the music feature values are obtained, mix points can be computed from them. In this example, however, a music segmenter is preferably also included to segment the music before mix points are computed. The structure of a piece can be divided into intro, chorus, verse, bridge, and outro. Toolkits that compute musical sections already exist, such as the msaf package, which offers several algorithms for finding sections; this example uses the method based on structural features. Figure 2 is a schematic diagram of the sections of a piece: intro, verse, chorus, bridge, and so on. To find more mix points, each section is cut into phrases whose lengths are integer multiples of 4 bars, and the 4-, 8-, and 16-bar phrases are then compared with one another to find the music's mix points. Experiments show that cutting phrases at integer multiples of 4 bars gives the highest hit rate for mix points.
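A sketch of section finding with the msaf package named above; 'sf' selects its structural-features boundary algorithm, while the file name and label algorithm are illustrative:

```python
import msaf

boundaries, labels = msaf.process('song.mp3',
                                  boundaries_id='sf',  # structural features
                                  labels_id='fmc2d')   # section labeling
print(boundaries)  # section boundary times in seconds
print(labels)      # same label = same section type (verse, chorus, ...)
```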
The steps for computing mix points are detailed below with reference to Figure 2. Each phrase of the main song is compared with phrases of the same length in other songs, after determining that the two phrases have the same structural role; for example, a verse phrase of the main song is compared only with verse phrases of other pieces. Before comparing, it must be confirmed that both phrases have sufficient energy; a phrase's energy is computed from the previously computed per-beat energies. Only if both phrases have sufficient energy do the comparisons below proceed.
Percussion mix-point computation: comparing percussion requires no consideration of the music's harmony or other attributes, only of whether the rhythms of the two pieces differ too much. Rhythmic difference can be measured by the rhythm ratio, the ratio of the beats per minute (bpm) of the two pieces. When the rhythm ratio is too far from 1, changing a phrase's rhythm sounds abrupt and the phrase is not suitable for substitution. When the rhythm ratio is between 0.7 and 1.3, and the energies of both phrases exceed a preset value, the substitution can be made. The time point of the mix point is the start time of the phrase, and the duration is the length of the phrase. The rhythm ratio is recorded to facilitate the subsequent mixing.
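A sketch of the percussion gate just described, combining the energy check with the rhythm-ratio window; the energy threshold is a placeholder, since the original leaves the preset value unspecified:

```python
def percussion_mix_point(phrase_a, phrase_b, energy_threshold=0.1):
    """Return a mix-point record if two percussion phrases are compatible.

    Each phrase is a dict with 'start' (s), 'duration' (s), 'bpm', and
    'energy'; energy_threshold is a placeholder value.
    """
    ratio = phrase_a['bpm'] / phrase_b['bpm']
    if not (0.7 <= ratio <= 1.3):
        return None  # rhythms differ too much; substitution would be abrupt
    if min(phrase_a['energy'], phrase_b['energy']) <= energy_threshold:
        return None  # one phrase is too quiet to substitute for the other
    return {'time': phrase_a['start'],
            'duration': phrase_a['duration'],
            'rhythm_ratio': ratio}
```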
Melody and bass mix-point computation: a harmony-based comparison is used here, in two parts: a chord comparison and a chroma-vector comparison. The chord comparison compares the beat-by-beat chord sequence of one phrase with the beat-by-beat chord sequence of another. If only the root of each chord is considered, there are 12 chord types, each represented by a letter: C, C#, D, D#, E, F, F#, G, G#, A, A#, B; a beat with no chord is represented by N. Comparing chords is thus equivalent to comparing the phrases' chord strings. The local alignment method from bioinformatics is applied here to align the two chord strings. Local alignment measures the similarity of two sequences from the character differences between them: if the characters at corresponding positions differ greatly, the sequences' similarity is low; conversely, it is high. The difference between two chords is then the difference between the corresponding characters, and a score based on musical consonance can be used to compute the similarity of two phrases. In sequence alignment, two things directly affect the similarity score: the substitution matrix and the gap penalty. The substitution matrix uses the chord substitution scores shown in the following table:
Chord difference (in semitones)    Score
0                                   2.85
1                                  -2.85
2                                  -2.475
3                                  -0.825
4                                  -0.825
5                                   0
6                                  -1.8
The gap penalty is 0, and comparing N with any chord scores 0. The sum of a phrase's per-beat comparison scores is that phrase's chord score. For example, comparing CGFF with AGEF gives -0.825 + 2.85 - 2.85 + 2.85 = 2.025.
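A sketch of this scoring; the circular semitone distance between roots and the Smith-Waterman form of the local alignment are standard bioinformatics choices, and are assumptions about the exact variant used:

```python
NOTES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']
SCORE = {0: 2.85, 1: -2.85, 2: -2.475, 3: -0.825, 4: -0.825, 5: 0.0, 6: -1.8}

def sub_score(a, b):
    """Substitution score of two chord roots; N against anything scores 0."""
    if 'N' in (a, b):
        return 0.0
    d = abs(NOTES.index(a) - NOTES.index(b))
    return SCORE[min(d, 12 - d)]  # circular distance in semitones

def beatwise_score(s, t):
    """Beat-by-beat sum, as in the CGFF vs AGEF example (= 2.025)."""
    return sum(sub_score(a, b) for a, b in zip(s, t))

def local_alignment_score(s, t, gap=0.0):
    """Smith-Waterman local alignment with the chord substitution matrix."""
    h = [[0.0] * (len(t) + 1) for _ in range(len(s) + 1)]
    best = 0.0
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            h[i][j] = max(0.0,
                          h[i - 1][j - 1] + sub_score(s[i - 1], t[j - 1]),
                          h[i - 1][j] - gap,  # gap penalty is 0 in the text
                          h[i][j - 1] - gap)
            best = max(best, h[i][j])
    return best

assert abs(beatwise_score(list('CGFF'), list('AGEF')) - 2.025) < 1e-9
```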
The chroma-vector feature comparison computes the cosine similarity of the two phrases' chroma vectors. The two scores are weighted as needed and added. If the score is low, the compared phrase is transposed into the key of the main-song phrase and compared again. If the resulting score is high enough, the start time of the phrase is the mix-point time. The phrase length, the phrase's rhythm ratio, and the number of semitones of transposition must also be recorded to facilitate mixing. In this example, both scores are weighted 0.5.
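A sketch of the chroma comparison and the 0.5/0.5 combination; modeling transposition as a rotation of the 12 pitch-class bins is an assumption:

```python
import numpy as np

def cosine_similarity(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def phrase_score(chord_score, chroma_a, chroma_b, w=0.5):
    """Weighted sum of the chord score and the chroma cosine similarity."""
    return w * chord_score + (1 - w) * cosine_similarity(chroma_a, chroma_b)

def transpose_chroma(chroma, semitones):
    """Transposing by n semitones rotates the 12 pitch-class bins."""
    return np.roll(chroma, semitones)
```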
Vocal mix-point computation: vocal mix points share features with the melody and bass mix points but also differ from them. If the music (melody + bass) of the phrase in which the vocals appear has strong enough energy, the mix points of the corresponding melody-and-bass phrase are used directly. If the melody and bass energy is insufficient, the cosine similarity of the two vocal phrases' chroma vectors is compared directly. The phrase start time, phrase length, phrase rhythm ratio, and number of semitones of transposition are likewise recorded.
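The branch above as a sketch; both thresholds are placeholders, since the original specifies neither value:

```python
import numpy as np

def vocal_mix_point(vocal_a, vocal_b, backing_energy, backing_mix_point,
                    energy_threshold=0.1, similarity_threshold=0.9):
    """Reuse the melody+bass mix point when the backing is strong enough;
    otherwise compare the vocal phrases' chroma vectors directly.
    Both thresholds are placeholder values."""
    if backing_energy > energy_threshold:
        return backing_mix_point
    a, b = vocal_a['chroma'], vocal_b['chroma']
    similarity = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    if similarity > similarity_threshold:
        return {'time': vocal_a['start'], 'duration': vocal_a['duration']}
    return None
```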
When the automatic mixing device of the present invention is applied, it first preprocesses all the songs in the user's music library: taking each piece in the library in turn as the main song, it uses the feature-computation and mix-point-computation methods above to compute that piece's mix points with every other song, and stores them in a database. If a song, as the main song, yields enough mix points with other songs, and those other songs satisfy the two conditions that their rhythm ratio with the main song is between 0.7 and 1.3 and their key differs by no more than 3, the qualifying songs are treated as this song's similar songs, and these songs are called up directly during mixing.
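A sketch of that similar-song filter; the minimum mix-point count is a placeholder, since the original requires only that there be "enough" mix points:

```python
def similar_songs(main, candidates, min_mix_points=8):
    """Filter candidate songs into the main song's similar-song list.

    Each song is a dict with 'bpm', 'key' (0-11), and 'mix_points';
    min_mix_points is a placeholder value.
    """
    similar = []
    for song in candidates:
        ratio = song['bpm'] / main['bpm']
        key_diff = abs(song['key'] - main['key'])
        key_diff = min(key_diff, 12 - key_diff)  # circular key distance
        if (len(song['mix_points']) >= min_mix_points
                and 0.7 <= ratio <= 1.3
                and key_diff <= 3):
            similar.append(song)
    return similar
```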
In summary, the automatic mixing device of the present invention computes music features for multiple tracks separately and computes mix points from the computed features, achieving automatic mixing and solving the prior-art problems of low mixing efficiency, stiff mixing results, and high error rates.

The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone familiar with this technology may modify or change the above embodiments without departing from the spirit and scope of the invention. Accordingly, all equivalent modifications or changes completed by persons of ordinary skill in the art without departing from the spirit and technical ideas disclosed by the invention shall still be covered by the claims of the invention.

Claims (19)

  1. An automatic mixing device, comprising:
    a music feature calculator, wherein the input music of the music feature calculator comprises a plurality of tracks;
    wherein the music feature calculator selects one or more of the melody, bass, percussion, and vocal tracks, and computes one or more of the following features of the input music: the beat-point times, the chord at each downbeat, the chroma vector at each downbeat, the sound energy at each downbeat, the key, and the tempo of the piece.
  2. The automatic mixing device according to claim 1, further comprising a mix-point calculator.
  3. The automatic mixing device according to claim 2, wherein the mix-point calculator computes mix points separately for the vocal part, the melody-and-bass part, and the percussion part of the music.
  4. The automatic mixing device according to claim 3, wherein, when the rhythm ratio of two phrases is between 0.7 and 1.3, the starting points of the two phrases are taken as mix points of the percussion part.
  5. The automatic mixing device according to claim 3, wherein the mix points of the melody-and-bass part are computed based on a harmony comparison of the pieces, the harmony comparison comprising a chord comparison and a chroma-vector comparison.
  6. The automatic mixing device according to claim 5, wherein the harmony comparison comprises:
    representing chord roots with characters and converting phrases into character strings;
    comparing the strings and computing the difference for each character in the strings; and
    computing chord similarity from the differences.
  7. The automatic mixing device according to claim 6, wherein the difference for each character in the strings is computed using a substitution matrix and a gap penalty.
  8. The automatic mixing device according to claim 5, wherein the chroma-vector comparison comprises computing the cosine similarity of the chroma vectors of two phrases.
  9. The automatic mixing device according to claim 3, wherein computing the mix points of the vocal part comprises:
    determining whether the vocal part includes melody and bass; if so, directly using the mix points of the corresponding melody-and-bass phrases;
    if not, comparing the cosine similarity of the chroma vectors of the vocal phrases.
  10. The automatic mixing device according to claim 1, wherein the input music of the music feature calculator comprises melody, vocal, and percussion tracks.
  11. The automatic mixing device according to claim 1, wherein only the melody, bass, and percussion tracks are selected when computing the beat points of the music.
  12. The automatic mixing device according to claim 1, wherein, when computing the beat-point times of the music, a plurality of deep-learning recurrent neural networks are used to compute the beat-point times, or the beat points are found by a method based on the temporal correlation of musical onsets.
  13. The automatic mixing device according to claim 12, wherein the time of the first downbeat is computed from the computed beat times by means of a hidden Markov model.
  14. The automatic mixing device according to claim 1, wherein the melody and bass tracks are selected when computing the chord at each downbeat.
  15. The automatic mixing device according to claim 1, wherein the tempo of the piece is computed by the formula given in the original as an image (PCTCN2020078803-appb-100001, not reproduced here), where beat denotes a beat in the phrase and i is the index of the beat.
  16. The automatic mixing device according to claim 1, wherein the value of i ranges from 20 to 90.
  17. The automatic mixing device according to claim 1, further comprising a music segmenter for segmenting the music before the mix points are computed.
  18. The automatic mixing device according to claim 17, wherein the music segmenter segments the music using a method based on the structural features of the piece.
  19. The automatic mixing device according to claim 18, wherein the music segmenter cuts the music into phrases whose lengths are integer multiples of 4 bars.
PCT/CN2020/078803 2020-03-11 2020-03-11 Automatic mixing device WO2021179206A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/078803 WO2021179206A1 (zh) 2020-03-11 2020-03-11 Automatic mixing device
US17/910,484 US20230267899A1 (en) 2020-03-11 2020-03-11 Automatic audio mixing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/078803 WO2021179206A1 (zh) 2020-03-11 2020-03-11 Automatic mixing device

Publications (1)

Publication Number Publication Date
WO2021179206A1 (zh) 2021-09-16

Family

ID=77671114

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/078803 WO2021179206A1 (zh) 2020-03-11 2020-03-11 Automatic mixing device

Country Status (2)

Country Link
US (1) US20230267899A1 (zh)
WO (1) WO2021179206A1 (zh)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007066818A1 (ja) * 2005-12-09 2007-06-14 ソニー株式会社 Music editing device and music editing method
CN108831425A (zh) * 2018-06-22 2018-11-16 广州酷狗计算机科技有限公司 Mixing method and apparatus, and storage medium
CN110867174A (zh) * 2018-08-28 2020-03-06 努音有限公司 Automatic mixing device
CN109599083A (zh) * 2019-01-21 2019-04-09 北京小唱科技有限公司 Audio data processing method and apparatus for singing applications, electronic device, and storage medium

Also Published As

Publication number Publication date
US20230267899A1 (en) 2023-08-24

Similar Documents

Publication Publication Date Title
US8779268B2 (en) System and method for producing a more harmonious musical accompaniment
US9177540B2 (en) System and method for conforming an audio input to a musical key
US9251776B2 (en) System and method creating harmonizing tracks for an audio input
US9310959B2 (en) System and method for enhancing audio
EP3059886B1 (en) Virtual production of a musical composition by applying chain of effects to instrumental tracks.
US9263021B2 (en) Method for generating a musical compilation track from multiple takes
US7985917B2 (en) Automatic accompaniment for vocal melodies
US20120297958A1 (en) System and Method for Providing Audio for a Requested Note Using a Render Cache
US20120297959A1 (en) System and Method for Applying a Chain of Effects to a Musical Composition
US20070289432A1 (en) Creating music via concatenative synthesis
EP3063618A1 (en) System and method for enhancing audio, conforming an audio input to a musical key, and creating harmonizing tracks for an audio input
Cogliati et al. Transcribing Human Piano Performances into Music Notation.
WO2023040332A1 (zh) Music score generation method, electronic device, and readable storage medium
CN110867174A (zh) Automatic mixing device
JP2009282464A (ja) Chord detection device and chord detection program
CA2843438A1 (en) System and method for providing audio for a requested note using a render cache
Carpentier et al. Automatic orchestration in practice
JP3750547B2 (ja) Phrase analysis device and computer-readable recording medium storing a phrase analysis program
WO2021179206A1 (zh) Automatic mixing device
CN113140202B (zh) Information processing method and apparatus, electronic device, and storage medium
JP2007240552A (ja) Musical instrument sound recognition method, instrument annotation method, and music retrieval method
Beauguitte Music Information Retrieval for Irish Traditional Music
Maršík Cover Song Identification using Music Harmony Features, Model and Complexity Analysis
Ostercamp The Improvisational Style of Steve Lacy: Analyses of Selected Transcriptions (1957-1962)
Jones A computational composer's assistant for atonal counterpoint

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20924908

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20924908

Country of ref document: EP

Kind code of ref document: A1