JP6807461B2 - Music analysis device and music analysis program

Info

Publication number: JP6807461B2
Application number: JP2019533869A
Authority: JP
Inventors: 敬坂上
Original assignee: AlphaTheta Corp
Current assignee: AlphaTheta Corp
Priority date: 2017-08-04
Filing date: 2017-08-04
Publication date: 2021-01-06
Anticipated expiration: 2037-08-04
Also published as: JPWO2019026286A1; WO2019026286A1

Description

The present invention relates to a music analysis device and a music analysis program.

Music playback devices with a voice cancel function are known. Such a device erases the singing part from music data that includes singing so that the music data can be used for karaoke and similar purposes (see, for example, Patent Document 1 and Patent Document 2).
These devices rely on the fact that the singing part is localized midway between the L channel and the R channel and is contained in both channels at substantially the same phase and level. They remove the singing part from the input music data by taking the difference between the L channel and the R channel and then supplementing the resulting signal with low-frequency and high-frequency components.
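The channel-difference idea behind the prior-art voice cancel function can be illustrated with a minimal sketch. The function name `cancel_center` and the toy sample values are hypothetical, and the low-frequency and high-frequency supplementing step is omitted. Note that this only removes center-panned content; it does not tell where the singing occurs.

```python
def cancel_center(left, right):
    """Subtract R from L sample by sample; content identical in both channels cancels."""
    return [l - r for l, r in zip(left, right)]

# Hypothetical toy signals: the vocal is identical in both channels,
# the guitar appears only in the left channel.
vocal = [3, -2, 5, 0]
guitar = [1, 1, -1, 2]
left = [v + g for v, g in zip(vocal, guitar)]
right = list(vocal)

# Only the guitar survives the subtraction.
residual = cancel_center(left, right)
```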

Patent Document 1: Japanese Unexamined Patent Publication No. 2013-109207
Patent Document 2: Japanese Unexamined Patent Publication No. H03-21200

In DJ performance, a DJ frequently transitions from the music data being played to other music data with the tempos synchronized.
If the transition goes from a singing part of the music data being played into a singing part of the other music data, the listener perceives it as unnatural.
It is therefore necessary to analyze the music data, for example by executing a computer-readable program, and to detect the singing parts. The techniques described in Patent Document 1 and Patent Document 2, however, cannot detect singing parts.

An object of the present invention is to provide a music analysis device and a music analysis program that allow a transition from the music data being played to other music data without giving the listener a sense of discomfort.

The music analysis device of the present invention comprises:
a frequency conversion unit that converts music data into the frequency domain;
a smoothing processing unit that smooths the music data converted by the frequency conversion unit;
a peak detection unit that detects, in the music data smoothed by the smoothing processing unit, a peak value of the sound pressure level, a prominence given as the difference between the peak value and the local minima before and after it, and a peak width that is the width of the hill forming the peak; and
a singing section determination unit that determines whether the peak value, the prominence, and the peak width detected by the peak detection unit exceed a predetermined threshold, and determines a section in which they exceed the threshold to be a section of the music data that contains singing data.

The music analysis program of the present invention causes a computer to function as:
a frequency conversion unit that converts music data into the frequency domain;
a smoothing processing unit that smooths the music data converted by the frequency conversion unit;
a peak detection unit that detects, in the music data smoothed by the smoothing processing unit, a peak value of the sound pressure level, a prominence given as the difference between the peak value and the local minima before and after it, and a peak width that is the width of the hill forming the peak; and
a singing section determination unit that determines whether the peak value, the prominence, and the peak width detected by the peak detection unit exceed a predetermined threshold, and determines a section in which they exceed the threshold to be a section of the music data that contains singing data.

FIG. 1 is a graph for explaining the concept behind the realization method of the present invention.
FIG. 2 is a schematic diagram showing the configuration of an acoustic control system according to an embodiment of the present invention.
FIG. 3 is a block diagram showing the configuration of the music analysis device in the embodiment.
FIG. 4 is a graph showing the conversion into the frequency domain in the embodiment.
FIG. 5 is a graph showing the smoothing processing in the embodiment.
FIG. 6 is a graph showing the conversion into the frequency domain in the embodiment.
FIG. 7 is a graph showing the smoothing processing in the embodiment.
FIG. 8 is a flowchart showing the operation of the embodiment.

[1] Realization Method of the Present Invention
The music analysis of the present invention is based on the content of "Recent research on singing voice information processing" (Masataka Goto, Takeshi Saito, Tomoyasu Nakano, Hiromasa Fujihara (National Institute of Advanced Industrial Science and Technology), Journal of the Acoustical Society of Japan, Vol. 64, No. 10 (2008), pp. 616-623).

Specifically, it is based on the following passage.
"Compared with the speaking voice, the singing voice is known to vary over a wider range of pitch and intensity and to have more complex and dynamic characteristics. In particular, dynamic fluctuation components peculiar to the singing voice appear in the trajectory of the fundamental frequency (F0, the pitch of the voice), which changes along the melody of the piece. Among these, as shown in Fig. 1, four components, namely preparation, overshoot, fine fluctuation, and vibrato, have been shown to exist in common across a wide variety of singing voices, independent of singing style or singer, and to play an important role in the perception of the singing voice."
It follows that, when the spectrum of a singing voice is compared with the spectrum of an instrument sound, the peaks of the singing voice tend to be broader than the peaks of the instrument sound.
The music analysis of the present invention therefore determines whether a sound in the music data is a singing sound or an instrument sound from whether its peak is wide or narrow.

[2] Overall Configuration
FIG. 2 shows an acoustic control system 1 according to an embodiment of the present invention. The acoustic control system 1 is configured by connecting four digital players 2 serving as audio devices, a digital mixer 3 serving as an audio device, and a computer 4.
Each of the four digital players 2 has a function of outputting acoustic control information when operated, and is connected to the digital mixer 3 by a LAN cable 5. Connecting the four digital players 2 and the digital mixer 3 with the LAN cables 5 in this way makes it possible to link the five audio devices 2 and 3.
In this embodiment, the LAN cable 5 uses an interface conforming to the IEEE 1394 standard. The connection between the digital players 2 and the digital mixer 3 is not limited to this, and an interface conforming to the MIDI (Musical Instruments Digital Interface: registered trademark of the Association of Musical Electronics Industry) standard may be used instead.

The digital mixer 3 is communicably connected to the computer 4 via a USB cable 6. The digital mixer 3 functions as a master audio device (any one of the audio devices) that collectively outputs to the computer 4 both the acoustic control information input each time a digital player 2 is operated and the acoustic control information generated when the digital mixer 3 itself is operated. In this embodiment, the USB cable 6 uses an interface conforming to the USB 2.0 standard. The connection between the digital mixer 3 and the computer 4 is not limited to this, and an IEEE 1394 standard LAN cable may be used instead.

The computer 4 includes a CPU and storage such as a hard disk and a ROM. A program that manages music information is executable on the CPU. Based on the acoustic control information input from the digital mixer 3 via the USB cable 6, the program applies acoustic processing such as effects and mixing to the music being played.

[3] Configuration of the Music Analysis Device 4A
FIG. 3 shows a functional block diagram of the music analysis device 4A according to this embodiment. The music analysis device 4A is configured as a music analysis program executed on the CPU of the computer 4, and includes a frequency conversion unit 41, a smoothing processing unit 42, a peak detection unit 43, a singing section determination unit 44, and a music data switching control unit 45.

The frequency conversion unit 41 executes an FFT (Fast Fourier Transform) for every beat, starting from the first bar of the music data SD, and calculates the amplitude spectrum. In this embodiment, the music data SD is downsampled to 1/8 before use in order to reduce the amount of calculation. Specifically, as shown in FIG. 4, the frequency conversion unit 41 converts the music data SD into a sound pressure level (dB) for each frequency.

In this embodiment, the FFT interval is given by (length of one beat in samples) = (sampling frequency after 1/8 downsampling) × (duration of one beat) = (44100 / 8) × (60 / BPM) samples.
Although this embodiment performs the conversion into the frequency domain by FFT, the conversion is not limited to this and may be performed by, for example, DCT (Discrete Cosine Transform).
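As a worked illustration of the interval formula and the per-beat amplitude spectrum, the following minimal sketch uses only the Python standard library. A naive DFT stands in for the FFT to keep the example dependency-free (it yields the same spectrum, only more slowly). The function names, the 0 dB reference of amplitude 1.0, and the toy sine frame are assumptions made for illustration and do not come from the embodiment.

```python
import cmath
import math

def samples_per_beat(bpm, sample_rate=44100, decimation=8):
    """FFT interval in samples: (sample_rate / decimation) * (60 / bpm)."""
    return (sample_rate / decimation) * (60.0 / bpm)

def amplitude_spectrum_db(frame):
    """Naive DFT of one frame; returns the level in dB for bins 0..N/2.

    Levels are relative to an amplitude of 1.0 (an assumed reference).
    """
    n = len(frame)
    levels = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        magnitude = abs(acc) / n
        # Clamp to avoid log10(0) for empty bins.
        levels.append(20.0 * math.log10(max(magnitude, 1e-12)))
    return levels

# At 120 BPM one beat lasts 0.5 s, i.e. 5512.5 * 0.5 samples after decimation.
beat_len = samples_per_beat(120)

# Toy frame: a sine with exactly 4 cycles in 32 samples, so its energy falls in bin 4.
frame = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
levels = amplitude_spectrum_db(frame)
```

In practice the frame would be one beat of the downsampled music data, and the beat length would be rounded to an integer number of samples.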

The smoothing processing unit 42 smooths the amplitude spectrum produced by the frequency conversion unit 41 with a second-order IIR (Infinite Impulse Response) LPF (Low-Pass Filter). Specifically, the smoothing processing unit 42 applies the LPF while treating the amplitude spectrum as if it were a time waveform. Starting from the frequency-domain result shown in FIG. 4, the smoothing processing unit 42 removes fine noise components and produces a gentle curve, as shown in FIG. 5.
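The smoothing step can be sketched as a biquad low-pass filter run across the frequency bins, treating the bin index as the time axis. The source specifies only a second-order IIR LPF. The coefficient formulas below follow the widely used Audio EQ Cookbook low-pass design, and the cutoff of 0.05 cycles per bin, the Q value, and the function names are assumptions chosen for illustration.

```python
import math

def biquad_lowpass_coeffs(cutoff, q=1 / math.sqrt(2)):
    """Low-pass biquad coefficients; cutoff is in cycles per sample (0 < cutoff < 0.5)."""
    w0 = 2.0 * math.pi * cutoff
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1 - cos_w0) / 2 / a0, (1 - cos_w0) / a0, (1 - cos_w0) / 2 / a0]
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a

def smooth_spectrum(levels_db, cutoff=0.05):
    """Filter the spectrum bin by bin as if it were a time waveform."""
    b, a = biquad_lowpass_coeffs(cutoff)
    # Seed the filter state with the first value so the output does not start from zero.
    x1 = x2 = y1 = y2 = levels_db[0]
    out = []
    for x in levels_db:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

# Toy spectrum: a flat 10 dB level with bin-to-bin ripple of +/- 1 dB.
noisy = [10.0 + (1.0 if i % 2 else -1.0) for i in range(200)]
smoothed = smooth_spectrum(noisy)
```

A single forward pass shifts features slightly toward higher bins because of the filter's phase lag; whether the embodiment compensates for this is not stated.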

The peak detection unit 43 detects three quantities in the music data smoothed by the smoothing processing unit 42: the peak value, which is the sound pressure level at a peak (local maximum); the prominence, which is given as the difference between the peak value and the local minima before and after the peak; and the peak width, which is the width of the hill forming the peak.
In linguistics, prominence refers to pronouncing one part of a sentence so that it stands out from the other parts in order to convey an intention. In the present invention, prominence is defined as the difference between the sound pressure level at a frequency that stands out from other frequencies, that is, a frequency at which the level peaks (is a local maximum), and the sound pressure level at the frequencies before and after the peak at which the level is a local minimum.

Specifically, as shown in FIG. 5, the peak detection unit 43 first detects the peak value of each hill with a high sound pressure level across the frequencies of the smoothed music data. Next, the peak detection unit 43 detects the prominence by taking the difference in sound pressure level between the frequency at which the peak occurs and the frequencies before and after it at which the local minima occur. Finally, the peak detection unit 43 detects the width of the hill that gives the peak value as the peak width. In this embodiment, the peak width is detected as the frequency width at the sound pressure level corresponding to 1/2 of the prominence, but the width at the foot of the hill may be detected as the peak width instead.
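The three quantities can be computed along the following lines. This is a minimal sketch under stated assumptions: the prominence is taken relative to the higher of the two nearest local minima (the source does not say how the two minima are combined), the width is measured at the level half the prominence below the peak with linear interpolation between bins, and the names `find_peaks`, `_crossing`, and `bin_width` are hypothetical.

```python
def _crossing(levels, start, stop, step, height):
    """Walk from the peak toward a neighbouring minimum and return the
    linearly interpolated position where the curve falls to `height`."""
    j = start
    while j != stop and levels[j + step] > height:
        j += step
    if j == stop:
        return float(stop)
    hi, lo = levels[j], levels[j + step]
    return j + step * (hi - height) / (hi - lo)

def find_peaks(levels, bin_width=1.0):
    """Return index, value, prominence and width for each local maximum."""
    peaks = []
    n = len(levels)
    for i in range(1, n - 1):
        if not (levels[i - 1] < levels[i] >= levels[i + 1]):
            continue
        # Walk downhill on each side to the nearest local minimum.
        left = i
        while left > 0 and levels[left - 1] <= levels[left]:
            left -= 1
        right = i
        while right < n - 1 and levels[right + 1] <= levels[right]:
            right += 1
        # Assumption: measure prominence from the higher of the two minima.
        prominence = levels[i] - max(levels[left], levels[right])
        if prominence <= 0:
            continue
        # Width at the level half the prominence below the peak.
        height = levels[i] - prominence / 2.0
        width = (_crossing(levels, i, right, 1, height)
                 - _crossing(levels, i, left, -1, height))
        peaks.append({"index": i, "value": levels[i],
                      "prominence": prominence, "width": width * bin_width})
    return peaks

single = find_peaks([0.0, 1.0, 4.0, 1.0, 0.0])
double = find_peaks([0.0, 5.0, 2.0, 3.0, 1.0])
```

Passing the frequency spacing of the FFT bins as `bin_width` converts the width from bins to hertz.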

The singing section determination unit 44 determines whether the peak value, the prominence, and the peak width detected by the peak detection unit 43 exceed a predetermined threshold, and determines a section in which they exceed the threshold to be a section of the music data that contains singing data.

The singing section determination unit 44 judges a section to be a singing section when the peak value and the prominence exceed a certain sound pressure level and, in line with the observation above that the peaks of a singing voice are broader than those of an instrument sound, the peak is also wide. Specifically, as shown in FIGS. 4 and 5, when the peak value is high, the prominence is large, and the peak width is also large, the section is determined to be a singing section.
In contrast, as shown in FIGS. 6 and 7, a section in which the peak value is high and the prominence is large but the peak width is narrow is not determined to be a singing section, because effects peculiar to singing such as preparation, overshoot, fine fluctuation, and vibrato are absent and the peak is therefore narrow.
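The decision rule described above can be sketched as a conjunction of three threshold tests. The source gives no numeric thresholds and does not say whether a single qualifying peak is enough, so the threshold values, the use of one threshold per quantity, the "any peak" rule, and the name `is_singing_section` are all assumptions for illustration.

```python
def is_singing_section(peaks, min_level_db=-30.0,
                       min_prominence_db=10.0, min_width_hz=150.0):
    """Return True if any peak is high, prominent, and broad at the same time.

    `peaks` is a list of dicts with "value" (dB), "prominence" (dB)
    and "width" (Hz). All default thresholds are hypothetical.
    """
    return any(
        p["value"] > min_level_db
        and p["prominence"] > min_prominence_db
        and p["width"] > min_width_hz
        for p in peaks
    )

# Broad, prominent peak: corresponds to the case of FIGS. 4 and 5.
broad_peak = [{"value": -12.0, "prominence": 25.0, "width": 400.0}]
# Equally high and prominent but narrow peak: corresponds to FIGS. 6 and 7.
narrow_peak = [{"value": -10.0, "prominence": 28.0, "width": 40.0}]

broad_result = is_singing_section(broad_peak)
narrow_result = is_singing_section(narrow_peak)
```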

As shown in FIG. 3, the music data switching control unit 45 outputs to the digital mixer 3 a control command indicating whether switching from the music data being played to other music data is permitted, based on the determination result of the singing section determination unit 44. Specifically, when the singing section determination unit 44 determines that the music data SD being played is in a singing section, the music data switching control unit 45 outputs a control command that restricts switching of the music data SD; when it determines that the music data SD is not in a singing section, the unit outputs a control command that permits switching of the music data SD.

[4] Operation and Effects of the Embodiment
Next, the operation of the embodiment of the present invention is described with reference to the flowchart shown in FIG. 8.
First, the music analysis device 4A downsamples the input music data SD to 1/8 to reduce the amount of data (step S1).
The frequency conversion unit 41 executes an FFT for every beat and calculates the amplitude spectrum (step S2).

The smoothing processing unit 42 smooths the music data converted into the frequency domain with the second-order IIR LPF (step S3).
The peak detection unit 43 detects the peak value, the prominence, and the peak width from the smoothed music data (step S4).

The singing section determination unit 44 determines whether the peak value, the prominence, and the peak width detected by the peak detection unit 43 exceed a predetermined threshold (step S5).
If the music data SD currently being played is determined not to be in a singing section (S5: No), the music data switching control unit 45 outputs to the digital mixer 3 a control command that permits switching to other music data (step S6).

If the music data SD currently being played is determined to be in a singing section (S5: Yes), the music data switching control unit 45 outputs to the digital mixer 3 a control command that restricts switching to other music data (step S7).
The digital mixer 3 switches the music data in accordance with the control command from the music data switching control unit 45.
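The branch at step S5 and the commands issued in steps S6 and S7 can be sketched as follows. The command strings and function names are hypothetical. The source states only that a control command permitting or restricting the switch is sent to the digital mixer 3, not how it is encoded.

```python
ALLOW = "allow_switch"        # hypothetical encoding of the step S6 command
RESTRICT = "restrict_switch"  # hypothetical encoding of the step S7 command

def switching_command(is_singing):
    """Step S5 branch: restrict switching inside a singing section, allow it otherwise."""
    return RESTRICT if is_singing else ALLOW

def commands_for_beats(singing_flags):
    """Map a per-beat singing determination to the command sent for each beat."""
    return [switching_command(flag) for flag in singing_flags]

# Toy determination results for four consecutive beats.
flags = [False, True, True, False]
commands = commands_for_beats(flags)
```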

According to this embodiment, the singing section determination unit 44 determines whether a section is a singing section based on the peak value, the prominence, and the peak width detected by the peak detection unit 43. Switching to other music data is permitted when the section is not a singing section and restricted when it is. Switching to other music data during a singing section of the music data SD being played can therefore be prevented, so that in DJ performance the listener does not feel discomfort when the music data is switched.

In addition, because the conversion into the frequency domain is performed by FFT and the smoothing by a second-order IIR LPF, the embodiment relies on the frequency-domain conversion and smoothing commonly used in music analysis, which makes the conversion and processing easy to generalize and simplify.
Furthermore, downsampling the music data SD to 1/8 reduces the number of data points, which lowers the computational load of determining singing sections and speeds up the conversion and processing.

1 ... acoustic control system, 2 ... digital player, 3 ... digital mixer, 4 ... computer, 4A ... music analysis device, 5 ... LAN cable, 6 ... USB cable, 41 ... frequency conversion unit, 42 ... smoothing processing unit, 43 ... peak detection unit, 44 ... singing section determination unit, 45 ... music data switching control unit, SD ... music data.

Claims

1. A music analysis device comprising:
a frequency conversion unit that converts music data into the frequency domain;
a smoothing processing unit that smooths the music data converted by the frequency conversion unit;
a peak detection unit that detects, in the music data smoothed by the smoothing processing unit, a peak value of the sound pressure level, a prominence given as the difference between the peak value and the local minima before and after it, and a peak width that is the width of the hill forming the peak; and
a singing section determination unit that determines whether the peak value, the prominence, and the peak width detected by the peak detection unit exceed a predetermined threshold, and determines a section in which they exceed the threshold to be a section of the music data that contains singing data.

2. The music analysis device according to claim 1,
wherein the frequency conversion unit performs the conversion into the frequency domain by FFT (Fast Fourier Transform).

3. The music analysis device according to claim 1 or 2,
wherein the smoothing processing unit performs the smoothing by a second-order IIR (Infinite Impulse Response) LPF (Low-Pass Filter).

4. A music analysis program that causes a computer to function as:
a frequency conversion unit that converts music data into the frequency domain;
a smoothing processing unit that smooths the music data converted by the frequency conversion unit;
a peak detection unit that detects, in the music data smoothed by the smoothing processing unit, a peak value of the sound pressure level, a prominence given as the difference between the peak value and the local minima before and after it, and a peak width that is the width of the hill forming the peak; and
a singing section determination unit that determines whether the peak value, the prominence, and the peak width detected by the peak detection unit exceed a predetermined threshold, and determines a section in which they exceed the threshold to be a section of the music data that contains singing data.