JP4992295B2

JP4992295B2 - Information processing device

Info

Publication number: JP4992295B2
Application number: JP2006144125A
Authority: JP
Inventors: 千加志杉浦; 岳彦井阪
Original assignee: Fujitsu Mobile Communications Ltd
Current assignee: Fujitsu Mobile Communications Ltd
Priority date: 2006-05-24
Filing date: 2006-05-24
Publication date: 2012-08-08
Anticipated expiration: 2026-05-24
Also published as: JP2007316830A

Description

The present invention relates to an information processing apparatus, and in particular to processing for selecting desired content.

Processing in which an information processing apparatus stores content and plays back the stored content is known. Here, content means either a signal expressed as a function of time (a piece of music, human speech, a mechanical sound, a naturally occurring sound used, for example, as a background sound, a moving image, and so on) or a still image. The description below takes music as the example.

A user of the apparatus selects a desired piece of music, for example a favorite piece, by bibliographic items such as the title and the artist name. However, as the capacity of the storage unit that holds the music grows, a large number of pieces come to be stored, and the user finds it difficult to select the desired piece. The difficulty is especially pronounced in portable devices, whose small display screen can list the bibliographic items of only a few pieces at a time.

Furthermore, the only way for the user to learn whether an unfamiliar piece is a desired one is to listen to it, which takes a long time. As a result, a desired piece may be stored in the apparatus without the user knowing it, which makes it difficult to select.

To address this, the following processing is known (see, for example, Patent Document 1). Feature values are extracted from each piece of music stored in the apparatus, an impression value is calculated from the extracted feature values, and the impression value is stored in association with the piece. The user, in turn, inputs degrees of impression for the desired music. The apparatus calculates an impression value from the input degrees, searches the stored impression values for one at a small distance from the calculated value, and retrieves the piece associated with it as the desired music.
Patent Document 1: JP 2002-278547 A (pages 18-19, FIG. 1)

However, in the method disclosed in Patent Document 1, the degrees of impression cover only a small number of predetermined properties and cannot accurately express what the user wants. As a result, the desired music cannot be retrieved.

The present invention has been made to solve the above problem, and its object is to provide an information processing apparatus that searches for desired content based on selected examples of desired content and/or selected examples of undesired content.

To achieve the above object, an information processing apparatus of the present invention comprises: content storage means for storing content; collective feature amount calculation means for retrieving content of a first type from the content storage means, calculating a feature amount of a first group consisting of the first-type content from the feature vectors of the retrieved first-type content, retrieving from the content storage means content of a second type whose feature vector has the smallest distance, and/or a distance smaller than a predetermined value, from the vector that is symmetric to the feature vector of the retrieved first-type content about a representative vector of content feature vectors, and calculating a feature amount of a second group consisting of the second-type content from the feature vectors of the retrieved second-type content; and content search means for calculating a first distance between the feature vector of content stored in the content storage means and the feature amount of the first group, and a second distance between that feature vector and the feature amount of the second group, and retrieving content for which the first distance is smaller than the second distance as first-type content and/or retrieving content for which the first distance is larger than the second distance as second-type content.

According to the present invention, it is possible to provide an information processing apparatus that searches for desired content based on selected examples of desired content and/or selected examples of undesired content.

Embodiments of an information processing apparatus and a content search program according to the present invention are described below with reference to the drawings.

(First embodiment)
FIG. 1 is a block diagram showing the configuration of an information processing apparatus according to the first embodiment of the present invention, to which the content search program according to the first embodiment is applied. The information processing apparatus is a computer that operates under program control. It comprises a control unit 11 that controls the entire apparatus, a display unit 12, an input device 13, a music registration unit 21, a feature extraction unit 22, a content storage unit 23, a collective feature amount calculation unit 24, a dictionary storage unit 25, a music search unit 26, a music playback unit 27, and a speaker 27a for music playback.

The content storage unit 23 stores content 23a. As detailed later, content 23a consists of digitized content, information indicating the features of that content, and the like. The dictionary storage unit 25 stores a dictionary 25a. As detailed later, the dictionary 25a consists of information indicating the features of a group of desired content, and the like.

The operation of each unit of the information processing apparatus configured as described above is explained next.

Under the control of the control unit 11, the display unit 12 displays characters, numerals, and image data, including a cursor. The displayed data changes when the display unit receives an instruction from the control unit 11 in response to, for example, an input operation on the input device 13.

The input device 13 consists of keys: numeric keys and character keys for entering numerals, hiragana, alphabetic characters, and symbols, and a plurality of function keys including cursor movement keys and scroll keys. When a key of the input device 13 is pressed, the identifier of that key is reported to the control unit 11, which displays it as a character on the display unit 12, notifies the relevant units, or performs control. The input device 13 may also include input elements other than keys, such as a mouse or a touch panel.

The music registration unit 21 receives music data and has the feature extraction unit 22 extract the feature vector of that data. It then stores the received music data and the extracted feature vector in the content storage unit 23 as content 23a.

The feature extraction unit 22 receives music data, extracts the feature vector of that data, and transmits the extracted feature vector.

The collective feature amount calculation unit 24 reads from the content storage unit 23 the feature vectors of the content 23a designated as examples of desired content and/or the content 23a designated as examples of undesired content. From those feature vectors it calculates a vector and a matrix representing the group of desired content 23a and a vector and a matrix representing the group of undesired content 23a, and stores them in the dictionary storage unit 25 as the dictionary 25a.

The music search unit 26 refers to the dictionary 25a stored in the dictionary storage unit 25 and calculates, from the feature vector of content 23a, the degree to which that content 23a is desired. It then searches the content storage unit 23 for content 23a having a specified degree of desirability.

The music playback unit 27 reads from the content storage unit 23 the music data of the content 23a found by the music search unit 26, plays it back, and outputs it from the speaker 27a.

The following describes how the information processing apparatus according to this embodiment searches for content 23a with a specified degree of desirability.

FIG. 2 shows an example of the structure of content 23a. Content 23a consists of a content identifier 23b, content data 23c, a name 23d, an artist name 23e, an album name 23f, and a feature vector 23g.

The content identifier 23b uniquely identifies the content 23a. The content data 23c is the data of the content 23a, for example encoded digital data representing a piece of music. The data need not be encoded, however.

The name 23d is the name of the content 23a. When the content 23a is a piece of music, the artist name 23e is the name of its performer or singer; when the content 23a is an utterance, it is the name of the speaker. The album name 23f is the name of the album that contains the content 23a.

The feature vector 23g describes the features of the content data 23c and consists of M vectors: a first feature vector 23g1, a second feature vector 23g2, ..., and an M-th feature vector 23gM. Each i-th feature vector 23gi (1 ≤ i ≤ M) is an N-dimensional vector with elements Xi1, Xi2, ..., XiN.

Content 23a has been described as including the name 23d, the artist name 23e, and the album name 23f as bibliographic items, but it is not limited to these. It may include other bibliographic items, such as the composer, the lyricist, or the publisher. These bibliographic items are not mandatory and need not be stored. As explained later, the content identifier 23b, the content data 23c, and the feature vector 23g are mandatory, because they are needed to search for and play back content 23a by degree of desirability.
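The record structure of FIG. 2 can be sketched as a small Python data class. This is only an illustration of the fields listed above: the class name `Content`, the field names, and the validation in `__post_init__` are assumptions introduced here, not part of the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Content:
    """Hypothetical in-memory form of content 23a (FIG. 2)."""
    content_id: str                     # content identifier 23b (mandatory)
    data: bytes                         # content data 23c (mandatory)
    feature_vectors: List[List[float]]  # feature vector 23g: M vectors of N elements
    name: Optional[str] = None          # name 23d (optional bibliographic item)
    artist: Optional[str] = None        # artist name 23e (optional)
    album: Optional[str] = None         # album name 23f (optional)

    def __post_init__(self) -> None:
        # All M vectors must share the same dimension N.
        if not self.feature_vectors:
            raise ValueError("at least one feature vector is required")
        n = len(self.feature_vectors[0])
        if any(len(v) != n for v in self.feature_vectors):
            raise ValueError("all feature vectors must have the same dimension")

    @property
    def shape(self):
        """Return (M, N)."""
        return len(self.feature_vectors), len(self.feature_vectors[0])
```

The bibliographic fields default to `None`, which mirrors the statement that they need not be stored.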

FIG. 3 shows an example of the structure of the dictionary 25a. The dictionary 25a consists of a dictionary identifier 25b, a dictionary name 25c, a first centroid vector 25d, the inverse 25e of a first variance-covariance matrix, a second centroid vector 25f, and the inverse 25g of a second variance-covariance matrix.

The dictionary identifier 25b uniquely identifies the dictionary 25a. The dictionary name 25c is the name of the dictionary 25a. The first centroid vector 25d and the second centroid vector 25f are both N-dimensional vectors, with the same number of dimensions as the feature vectors 23g contained in content 23a. The inverse 25e of the first variance-covariance matrix and the inverse 25g of the second variance-covariance matrix are both N × N matrices.

The first centroid vector 25d and the inverse 25e of the first variance-covariance matrix represent the feature amount of the group of content 23a designated as desired: respectively, the centroid and the distribution of the vectors contained in the feature vectors 23g of that content.

The second centroid vector 25f and the inverse 25g of the second variance-covariance matrix represent the feature amount of the group of content 23a designated as undesired: respectively, the centroid and the distribution of the vectors contained in the feature vectors 23g of that content.
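The dictionary of FIG. 3 and the distance comparison stated in the summary (first distance smaller than second distance means desired) might be sketched as follows. This part of the description does not say which distance is computed from a centroid and an inverse variance-covariance matrix; the quadratic form below (a squared Mahalanobis-style distance) is an assumption chosen because it uses exactly those two quantities. The names `Dictionary`, `quadratic_distance`, and `is_desired` are hypothetical, and the sketch scores a single N-dimensional vector rather than all M vectors of a content item.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]
Matrix = List[List[float]]


@dataclass
class Dictionary:
    """Hypothetical in-memory form of dictionary 25a (FIG. 3)."""
    dictionary_id: str           # dictionary identifier 25b
    name: str                    # dictionary name 25c
    centroid_desired: Vector     # first centroid vector 25d
    inv_cov_desired: Matrix      # inverse 25e of the first variance-covariance matrix
    centroid_undesired: Vector   # second centroid vector 25f
    inv_cov_undesired: Matrix    # inverse 25g of the second variance-covariance matrix


def quadratic_distance(x: Vector, centroid: Vector, inv_cov: Matrix) -> float:
    """Assumed distance: (x - c)^T A (x - c), with A the inverse covariance."""
    d = [xi - ci for xi, ci in zip(x, centroid)]
    n = len(d)
    return sum(d[i] * inv_cov[i][j] * d[j] for i in range(n) for j in range(n))


def is_desired(x: Vector, dic: Dictionary) -> bool:
    """True when x is closer to the desired group than to the undesired group."""
    first = quadratic_distance(x, dic.centroid_desired, dic.inv_cov_desired)
    second = quadratic_distance(x, dic.centroid_undesired, dic.inv_cov_undesired)
    return first < second
```

Because both quantities are non-negative for a valid inverse covariance matrix, comparing the squared values gives the same ordering as comparing their square roots.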

The music registration unit 21 receives content data, a name, an artist name, and an album name, and sets them as the content data 23c, the name 23d, the artist name 23e, and the album name 23f, respectively. The content data is, for example, music data encoded in the MP3 or AAC format, but is not limited to these. If the content data is analog, the music registration unit 21 digitizes it, encodes it, and sets the result as the content data 23c.

As for the means of reception, the apparatus has, for example, a communication unit (not shown) that communicates with a cellular network and receives the data via that network, but reception is not limited to this. The apparatus may have an Internet communication unit (not shown) and receive the data via the Internet. It may have an e-mail transmission/reception unit (not shown) and receive the data as the body of a received e-mail or as a file attached to it. The data may also be received via a removable storage medium.

The name 23d, the artist name 23e, and the album name 23f may also be information entered by a predetermined operation on the input device 13.

Next, the music registration unit 21 invokes the feature extraction unit 22 with the content data 23c as a parameter and receives the M N-dimensional vectors that the feature extraction unit 22 transmits. It sets these vectors as the feature vector 23g, sets a content identifier 23b that uniquely identifies the content 23a, and stores the resulting content 23a in the content storage unit 23.

FIG. 4 is a flowchart of the operation in which the feature extraction unit 22 receives the content data 23c and creates the M N-dimensional vectors that make up the feature vector 23g.

The feature extraction unit 22 is invoked by the music registration unit 21 with the content data 23c as a parameter and starts operating (step S22a). Treating the content data 23c as divided at a predetermined time interval into a plurality of frames, it decides which M frames to select (step S22b). A time interval of about a few seconds is appropriate. As described later, a feature vector is calculated from each selected frame.

A larger M is better, because the M feature vectors calculated from the selected frames should accurately represent the features of the content data 23c as a whole. However, a large M increases the processing load of the operations described below. Although it depends on factors such as the playback time of the content data 23c, an M of about several tens is appropriate.

Because the feature vectors should represent the content data 23c as a whole, it is appropriate to select frames located at approximately equal time intervals among all frames of the content data 23c. However, the intro at the beginning and the outro at the end of a piece may differ in tone from the piece as a whole, so frames within a predetermined time of the beginning and the end may be excluded.
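The frame selection of step S22b can be sketched as below: pick M frame indices at roughly equal spacing, optionally skipping frames at the head and tail. The function name `select_frame_indices` and its parameters are hypothetical, and the patent does not specify how the spacing is rounded; rounding to the nearest index is an assumption.

```python
from typing import List


def select_frame_indices(total_frames: int, m: int,
                         skip_head: int = 0, skip_tail: int = 0) -> List[int]:
    """Choose up to m frame indices, approximately equally spaced.

    skip_head / skip_tail are the numbers of frames excluded at the
    beginning (intro) and end (outro) of the piece.
    """
    first = skip_head
    last = total_frames - skip_tail - 1
    usable = last - first + 1
    if usable <= 0 or m <= 0:
        raise ValueError("no frames available to select from")
    m = min(m, usable)           # cannot select more frames than exist
    if m == 1:
        return [first + usable // 2]
    step = (usable - 1) / (m - 1)
    # int(v + 0.5) rounds half up, avoiding Python's round-half-to-even.
    return [first + int(i * step + 0.5) for i in range(m)]
```

For example, with 100 frames, M = 5, and ten frames skipped at each end, the call returns indices spread between frame 10 and frame 89.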

Next, the feature extraction unit 22 decodes the portions of the content data 23c that correspond to the M frames selected in step S22b, applies preprocessing (conversion to monaural and low-pass filtering) to the decoded frames, and stores them within the feature extraction unit 22 (step S22c).

The content data 23c is encoded. To decode it, the data is, for example, Huffman-decoded to obtain frequency information, and that frequency information is then inverse-quantized and converted from the frequency domain to the time domain to obtain waveform data. The feature extraction unit 22 may instead store the obtained frequency information and use it in the processing described later. This approach requires no frequency-to-time conversion and so reduces the amount of computation.

The feature extraction unit 22 then reads the M stored frames one at a time (step S22d) and checks whether the frames have been exhausted so that none could be read (step S22e). If a frame was read, the feature extraction unit 22 calculates the mean and standard deviation of the zero-crossing count for that frame (step S22f).

Specifically, the feature extraction unit 22 divides the frame at a predetermined time interval into NN subframes. Let S(n, t) be the amplitude of the discrete waveform signal in the n-th subframe (0 ≤ n ≤ NN−1), where t is a variable denoting time within the subframe, 0 ≤ t ≤ T−1. The zero-crossing count Zc(n) of the n-th subframe is then calculated, as expressed by (Equation 1), as the number of times t (1 ≤ t ≤ T−1) at which the amplitude differs in sign from the amplitude at the preceding time t−1. The mean and standard deviation of the NN zero-crossing counts Zc(n) (0 ≤ n ≤ NN−1) are then calculated.
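The zero-crossing statistics of step S22f can be sketched directly from this description. Two details are assumptions made for the sketch: a sample equal to zero is treated as non-negative, and the standard deviation is the population standard deviation (the text does not say which form is used). The function names are hypothetical.

```python
from statistics import fmean, pstdev
from typing import List, Sequence, Tuple


def zero_crossings(subframe: Sequence[float]) -> int:
    """Zc(n): count of t in 1..T-1 where S(n, t) and S(n, t-1) differ in sign."""
    count = 0
    for t in range(1, len(subframe)):
        # Assumption: zero is grouped with the positive values.
        if (subframe[t] < 0) != (subframe[t - 1] < 0):
            count += 1
    return count


def zero_crossing_stats(frame: Sequence[float], nn: int) -> Tuple[float, float]:
    """Split a frame into nn equal subframes; return (mean, std) of Zc(n)."""
    size = len(frame) // nn
    if nn <= 0 or size < 2:
        raise ValueError("frame too short for the requested number of subframes")
    counts: List[int] = [
        zero_crossings(frame[n * size:(n + 1) * size]) for n in range(nn)
    ]
    return fmean(counts), pstdev(counts)
```

Crossings are counted inside each subframe only, so a sign change that falls exactly on a subframe boundary is not counted, which matches the range 1 ≤ t ≤ T−1 given above.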

Next, the feature extraction unit 22 calculates the mean and standard deviation of the mel-frequency cepstral coefficients for the frame read in step S22d (step S22g). The calculation of the mel-frequency cepstral coefficients is described later.

Next, the feature extraction unit 22 forms the pre-normalization feature vector x, whose elements are the mean and standard deviation of the zero-crossing count calculated in step S22f and the means and standard deviations of the mel-frequency cepstral coefficients calculated in step S22g, and normalizes this feature vector (step S22h). The dimension of the feature vector is determined by the number of mel-frequency cepstral coefficients, described later; this dimension is taken to be N.

For normalization, the feature extraction unit 22 stores a representative vector μ of the pre-normalization feature vectors (M per content item) for the frames of multiple content items 23a, as many as possible, together with the variance matrix Σ of those feature vectors about the representative vector μ. Here, the representative vector μ is the mean vector.

Alternatively, a predetermined number of code vectors may be obtained from the feature vectors by vector quantization, and the mean vector and variance matrix of those code vectors used as the representative vector μ and the variance matrix Σ. The code vectors are obtained as the predetermined number of vectors that minimize the total, over all feature vectors, of the distance from each feature vector to its nearest code vector, although the method is not limited to this.

The distance between two vectors is, for example, the Euclidean distance, that is, the square root of the sum of the squared differences between corresponding elements, but is not limited to this. A suitable number of code vectors is about one fiftieth to one twentieth (1/50 to 1/20) of the number of feature vectors, but this is not a limitation, and the number may be fixed in advance.
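The Euclidean distance and the quantity that the code vectors are chosen to minimize can be sketched as follows. The sketch only evaluates the total distance for a given codebook; it does not search for the minimizing codebook, for which the patent names no specific algorithm. The function names are hypothetical.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Square root of the sum of squared element-wise differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def total_distortion(vectors: Sequence[Sequence[float]],
                     codebook: Sequence[Sequence[float]]) -> float:
    """Sum, over all feature vectors, of the distance to the nearest code vector.

    Vector quantization as described above looks for the codebook of a
    predetermined size that makes this total as small as possible.
    """
    return sum(min(euclidean(v, c) for c in codebook) for v in vectors)
```

A candidate codebook with a smaller `total_distortion` represents the feature vectors better under this criterion.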

With this method of obtaining μ and Σ via vector quantization, suitable values can be obtained even from a biased set of content 23a, that is, values close to the μ and Σ that would be calculated from an unlimited number of content items 23a. The method is therefore especially effective when the number of feature vectors is small.

The representative vector μ and the variance matrix Σ may be calculated from the feature vectors of all content 23a stored in the content storage unit 23. They may also be calculated from feature vectors that the feature extraction unit 22 calculated from content 23a in the past.

The variance matrix Σ may be a diagonal matrix, that is, all off-diagonal components may be 0. In that case Σ can be represented as a vector, which reduces both the storage used and the amount of computation.

The representative vector μ and the variance matrix Σ may, for example, be received from outside the apparatus, or may have been stored in the apparatus at shipment. The mean vector of a plurality of vectors is the vector whose elements are the element-wise means of those vectors.

The feature extraction unit 22 obtains the normalized feature vector y, as in (Equation 2), by multiplying the inverse of the variance matrix Σ by the vector obtained by subtracting the representative vector μ from the pre-normalization feature vector x obtained in step S22h, that is, y = Σ^-1 (x − μ), and stores it within the feature extraction unit 22.
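A minimal sketch of this normalization, assuming the diagonal form of Σ that the description allows, is shown below. With a diagonal Σ the product with the inverse matrix reduces to an element-wise division by the variance. The function names are hypothetical, and the use of the population variance is an assumption.

```python
from typing import List, Sequence, Tuple


def fit_mean_and_diag_variance(
        vectors: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Return the mean vector (mu) and per-element variance (diagonal of Sigma)."""
    count = len(vectors)
    dim = len(vectors[0])
    mu = [sum(v[i] for v in vectors) / count for i in range(dim)]
    var = [sum((v[i] - mu[i]) ** 2 for v in vectors) / count for i in range(dim)]
    return mu, var


def normalize(x: Sequence[float], mu: Sequence[float],
              var_diag: Sequence[float]) -> List[float]:
    """y = Sigma^-1 (x - mu) for a diagonal Sigma stored as a vector.

    Every variance must be non-zero.
    """
    return [(xi - mi) / vi for xi, mi, vi in zip(x, mu, var_diag)]
```

Note that this follows the text literally and divides by the variance, not by the standard deviation as in the more common z-score normalization.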

The feature extraction unit 22 then returns to step S22d to read the next frame. When the frames are exhausted in step S22e, the feature extraction unit 22 transmits the M N-dimensional feature vectors obtained in step S22h and stored within the feature extraction unit 22 to the processing unit that invoked it (step S22i), and ends the operation (step S22j).

FIG. 5 is a flowchart of the operation in step S22g that calculates the mean and standard deviation of the mel-frequency cepstral coefficients. The feature extraction unit 22 starts this operation (step S22m), emphasizes the high-frequency range (step S22n), and multiplies by a Hanning window function and applies a discrete Fourier transform to calculate the discrete power spectrum P(n, f) (step S22o). Here, f is a variable denoting frequency, 0 ≤ f ≤ F.

Next, the feature extraction unit 22 divides the discrete power spectrum P(n, f) into I bands spaced at mel-frequency intervals, each overlapping its two neighbors by half, and calculates the logarithm of the sum of the discrete power spectrum within each band. It applies a discrete cosine transform to these I values to calculate, for each subframe, I mel-frequency cepstral coefficients of orders 0 through I−1 (step S22p). It then calculates the mean and standard deviation of the NN values of each coefficient across the subframes as features of the frame (step S22q), and ends the operation (step S22r).
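The band summation and discrete cosine transform of step S22p can be sketched as below for one subframe. Several points are assumptions made for the sketch: the band edges are supplied by the caller as I + 2 increasing bin indices (equally spaced on the mel scale in a real implementation), the bands are unweighted sums, the DCT is the unnormalized DCT-II, and a small floor guards the logarithm against zero power. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def log_band_energies(power: Sequence[float], edges: Sequence[int]) -> List[float]:
    """Log of the summed power in each of I half-overlapping bands.

    edges holds I + 2 increasing bin indices; band i covers bins
    edges[i] .. edges[i + 2] - 1, so it shares half its width with each neighbor.
    """
    energies = []
    for i in range(len(edges) - 2):
        total = sum(power[edges[i]:edges[i + 2]])
        energies.append(math.log(max(total, 1e-12)))  # floor avoids log(0)
    return energies


def dct2(values: Sequence[float]) -> List[float]:
    """Unnormalized DCT-II; output order k = 0 .. I-1."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (j + 0.5) / n) for j, v in enumerate(values))
        for k in range(n)
    ]


def cepstral_coefficients(power: Sequence[float], edges: Sequence[int]) -> List[float]:
    """I cepstral coefficients (orders 0 to I-1) for one subframe's power spectrum."""
    return dct2(log_band_energies(power, edges))
```

Applying `cepstral_coefficients` to each of the NN subframes and then taking the per-coefficient mean and standard deviation would give the frame features of step S22q.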

なお、特徴抽出部２２は、ステップＳ２２ｆで説明した零交差数の平均値と標準偏差を算出する処理と、ステップＳ２２ｇで説明したメル周波数ケプストラム係数の平均値と標準偏差を算出する処理との２つの処理のどの処理を先に行い、どの処理を後に行っても良いことは明らかである。 Note that the feature extraction unit 22 may clearly perform the two processes, namely the calculation of the average value and standard deviation of the number of zero crossings described in step S22f and the calculation of the average value and standard deviation of the mel frequency cepstrum coefficients described in step S22g, in either order.

以上説明した動作により、特徴抽出部２２は、以下に示す４種の特徴量から算出された特徴量ベクトルを送信する。ここで、零交差数は、コンテンツデータ２３ｃの中心周波数を、メル周波数ケプストラム係数は、スペクトルの傾き等を示し、また、これらの数の標準偏差は、これらの数の時間的な変化を示すので、この特徴量ベクトルは、コンテンツデータ２３ｃの特徴を適切に示すが、特徴量は、これらに限るものではない。

Through the operation described above, the feature extraction unit 22 transmits feature vectors calculated from the following four kinds of feature quantities. The number of zero crossings indicates the center frequency of the content data 23c, the mel frequency cepstrum coefficients indicate the slope of the spectrum and the like, and the standard deviations of these values indicate how they change over time. These feature vectors therefore represent the features of the content data 23c appropriately, although the feature quantities are not limited to these.

例えば、特徴抽出部２２は、上記特徴量に加えて、または代えて、離散パワースペクトルＰ（ｎ、ｆ）全域の、または上記Ｉ区間に分割された帯域毎の時間変化度を特徴量として用いても良い。 For example, in addition to or instead of the above feature quantities, the feature extraction unit 22 may use as a feature quantity the degree of temporal change of the discrete power spectrum P(n, f) over its entire range or in each of the I bands.

また、特徴抽出部２２は、正規化された特徴量ベクトルを送信するとしたが、これに限るものではなく、正規化前の特徴量ベクトルを送信するとしても良い。後述するように、例えば、集団特徴量算出部２４は、正規化された特徴量ベクトルの逆ベクトルを算出することによって、その特徴量ベクトルと、上記代表ベクトルμを中心に対称であるベクトルを算出する。この処理は、上記代表ベクトルμに２を乗じたベクトルから上記正規化前の特徴量ベクトルを減算する処理によって代えることができる。 Although the feature extraction unit 22 has been described as transmitting normalized feature vectors, it is not limited to this and may transmit the feature vectors before normalization. As described later, the collective feature quantity calculation unit 24, for example, calculates the inverse vector of a normalized feature vector and thereby obtains the vector that is symmetric to that feature vector about the representative vector μ. This processing can be replaced by subtracting the feature vector before normalization from the vector obtained by multiplying the representative vector μ by 2.
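
The equivalence stated above can be checked numerically. The sketch below assumes, purely for illustration, that normalization subtracts the representative vector μ and divides element-wise by a scale `sigma`. The exact normalization is defined elsewhere in the description, and the names `normalize` and `denormalize` are hypothetical.

```python
import numpy as np

def normalize(x, mu, sigma):
    # Assumed normalization: center on mu, then scale element-wise.
    return (x - mu) / sigma

def denormalize(y, mu, sigma):
    return y * sigma + mu

mu = np.array([1.0, -2.0, 0.5])     # representative vector
sigma = np.array([2.0, 0.5, 1.5])   # hypothetical per-element scale
x = np.array([3.0, -1.0, 2.0])      # feature vector before normalization

# Negating the normalized vector and mapping it back...
via_negation = denormalize(-normalize(x, mu, sigma), mu, sigma)
# ...gives the same point as reflecting x about mu directly.
via_reflection = 2.0 * mu - x
```

Under this assumption both routes yield the same vector, so a unit that holds unnormalized vectors can compute 2μ − x without normalizing first.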

また、上記分散行列Σの逆行列との乗算によれば、大きさが正規化された特徴量ベクトルを用いることになるが、その乗算をしないことにより、そのベクトルの正規化されていない大きさを用いるとしても良い。 Multiplying by the inverse of the variance matrix Σ means that feature vectors of normalized magnitude are used; by omitting this multiplication, the unnormalized magnitude of the vectors may be used instead.

次に、集団特徴量算出部２４の動作を説明する。図６は、集団特徴量算出部２４の動作のフローチャートを示す。集団特徴量算出部２４は、楽曲再生部２７によって起動され、動作を開始する（ステップＳ２４ａ）。そして、楽曲再生部２７から、所望のコンテンツ２３ａを識別する情報を受信し（ステップＳ２４ｂ）、所望でないコンテンツ２３ａを識別する情報を受信する（ステップＳ２４ｃ）。 Next, the operation of the collective feature quantity calculation unit 24 will be described. FIG. 6 shows a flowchart of this operation. The collective feature quantity calculation unit 24 is activated by the music playback unit 27 and starts operating (step S24a). It then receives from the music playback unit 27 information identifying desired content 23a (step S24b) and information identifying undesired content 23a (step S24c).

ここで、コンテンツ２３ａを識別する情報は、コンテンツ識別子２３ｂであるが、これに限るものではない。名称２３ｄ、アーチスト名２３ｅ、アルバム名２３ｆのいずれか、または、これらが組み合わされたものでも良い。この場合、１つの識別する情報によってコンテンツ２３ａを検索して複数のコンテンツ識別子２３ｂが得られれば、その複数のコンテンツ識別子２３ｂが受信されたものとする。 Here, the information for identifying the content 23a is the content identifier 23b, but is not limited thereto. Any of the name 23d, the artist name 23e, the album name 23f, or a combination thereof may be used. In this case, if a plurality of content identifiers 23b are obtained by searching the content 23a with one piece of identifying information, it is assumed that the plurality of content identifiers 23b are received.

また、コンテンツ２３ａを識別する情報には、所望の度合いまたは所望でない度合いが付加されていても良い。度合いは正の数であり、数が大きい程、度合いが大きいことを示す。 In addition, a desired degree or an undesired degree may be added to the information for identifying the content 23a. The degree is a positive number, and the larger the number, the greater the degree.

また、以下に説明するように、ステップＳ２４ｂで所望のコンテンツ２３ａを識別する情報が受信されず（空情報が受信される。）、または、ステップＳ２４ｃで所望でないコンテンツ２３ａを識別する情報が受信されなくとも良いが、いずれか一方は受信されることが必須である。 As described below, the information identifying desired content 23a need not be received in step S24b (empty information is received instead), or the information identifying undesired content 23a need not be received in step S24c, but at least one of the two must be received.

次に、集団特徴量算出部２４は、所望のコンテンツ２３ａの特徴を示すベクトルを特徴量ベクトル２３ｇを読み出すことによって、更に、新たに特徴量ベクトルを作成することによって得る（ステップＳ２４ｄ）。即ち、ステップＳ２４ｂで所望のコンテンツ２３ａを識別する情報が受信された場合、その情報を検索キーにコンテンツ２３ａを検索し、検索された特徴量ベクトル２３ｇを得る。 Next, the collective feature quantity calculation unit 24 obtains a vector indicating the feature of the desired content 23a by reading the feature quantity vector 23g and further creating a new feature quantity vector (step S24d). That is, when information for identifying the desired content 23a is received in step S24b, the content 23a is searched using that information as a search key, and the searched feature vector 23g is obtained.

更に、ステップＳ２４ｃで所望でないコンテンツ２３ａを識別する情報が受信された場合、集団特徴量算出部２４は、その情報を検索キーにコンテンツ２３ａを検索し、検索された特徴量ベクトル２３ｇの逆ベクトルを作成することによって特徴量ベクトルを得ても良い。 Further, when information for identifying the undesired content 23a is received in step S24c, the collective feature amount calculation unit 24 searches the content 23a using the information as a search key, and calculates an inverse vector of the searched feature amount vector 23g. A feature vector may be obtained by creating the feature vector.

ここで、特徴量ベクトル２３ｇは、Ｍ本のＮ次元ベクトルからなる。特徴量ベクトル２３ｇの逆ベクトルは、Ｍ本のベクトルそれぞれの逆ベクトルからなるＭ本のＮ次元の特徴量ベクトルである。なお、ステップＳ２４ｂで所望のコンテンツ２３ａを識別する情報が受信されなかった場合、上記逆ベクトルを作成することによって特徴量ベクトルを得ることは必須である。 Here, the feature vector 23g consists of M N-dimensional vectors. The inverse vector of the feature vector 23g consists of M N-dimensional feature vectors, each the inverse of one of the M vectors. When no information identifying desired content 23a is received in step S24b, obtaining feature vectors by creating these inverse vectors is mandatory.

なお、ステップＳ２４ｃで所望でないコンテンツ２３ａを識別する情報が受信された場合、集団特徴量算出部２４は、逆ベクトルを作成するとしたが、逆ベクトルに限るものではない。逆ベクトルを構成するベクトルの各要素に適宜四捨五入等を施しても良い。各要素が２進数で表現されている際、所定の下位ビットを０にしても良い。 Although the collective feature quantity calculation unit 24 has been described as creating inverse vectors when information identifying undesired content 23a is received in step S24c, the vectors are not limited to exact inverse vectors. Each element of the vectors constituting the inverse vector may be rounded or similarly adjusted as appropriate. When each element is expressed in binary, predetermined low-order bits may be set to 0.

また、逆ベクトルを構成するベクトルの各要素に正の数を乗算するとしても良い。１未満の正の数を乗算すれば、ステップＳ２４ｂで受信された所望のコンテンツ２３ａの特徴量ベクトル２３ｇと、乗算されたベクトルとの距離が小さくなるので、後述する楽曲検索部２６によって所望であると判断されるコンテンツ２３ａの範囲を狭くすることができる。また、１を超える数を乗算すれば、上記距離が大きくなるので、所望であると判断されるコンテンツ２３ａの範囲を広くすることができる。 Each element of the vectors constituting the inverse vector may also be multiplied by a positive number. Multiplying by a positive number less than 1 reduces the distance between the feature vector 23g of the desired content 23a received in step S24b and the multiplied vector, which narrows the range of content 23a that the music search unit 26, described later, judges to be desired. Multiplying by a number greater than 1 increases this distance, which widens the range of content 23a judged to be desired.

更に、検索された特徴量ベクトル２３ｇより上記逆ベクトルに近い、即ち、検索された特徴量ベクトル２３ｇからの距離が上記逆ベクトルからの距離より大きいベクトルであれば良い。 More generally, any vector may be used that is closer to the inverse vector than to the retrieved feature vector 23g, that is, any vector whose distance from the retrieved feature vector 23g is greater than its distance from the inverse vector.

なお、上記距離の算出は、Ｍ本のベクトルと、Ｍ本のベクトルとの間の距離の算出となる。そこで、例えば、一方のＭ本のそれぞれのベクトルと、他方のＭ本のそれぞれのベクトルとの距離、例えばユークリッド距離を算出し、Ｍ×Ｍ個のユークリッド距離の総和を上記距離とすれば良い。 Note that this distance is a distance between one set of M vectors and another set of M vectors. For example, the distance, such as the Euclidean distance, between each vector of one set and each vector of the other set may be calculated, and the sum of the M × M Euclidean distances may be used as the distance.

または、一方のＭ本のベクトルから１本、他方のＭ本のベクトルから１本ずつ選択して組み合わせ、それらの２本のベクトルの間の距離、例えばユークリッド距離を算出する。そして、算出されたＭ個のユークリッド距離の総和であって、上記組み合わせを変更することによって得られる最小の総和であるとしても良い。また、それぞれのＭ本のベクトルの平均ベクトル間の距離としても良い。ここで、平均ベクトルとは、ベクトルの要素ごとに平均値を算出し、算出された平均値を要素とするベクトルである。 Alternatively, the vectors may be paired by selecting one vector from each set of M vectors, and the distance, such as the Euclidean distance, between the two vectors of each pair may be calculated. The distance may then be the sum of the M Euclidean distances so calculated, minimized over all possible pairings. The distance between the average vectors of the two sets of M vectors may also be used. Here, the average vector is the vector whose elements are the element-wise average values of the vectors.
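
The three set-to-set distances described above can be sketched as follows. The function names are hypothetical, and the minimum-pairing variant is written as a brute-force search over all M! pairings, which is practical only for small M.

```python
import itertools
import numpy as np

def pairwise_distances(a, b):
    # Euclidean distance between every vector of a (M x N) and every vector of b (M x N).
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

def total_distance(a, b):
    # Sum of all M x M Euclidean distances.
    return float(pairwise_distances(a, b).sum())

def min_pairing_distance(a, b):
    # Smallest sum of M distances over all one-to-one pairings (brute force).
    d = pairwise_distances(a, b)
    m = len(a)
    return float(min(sum(d[i, p[i]] for i in range(m))
                     for p in itertools.permutations(range(m))))

def mean_vector_distance(a, b):
    # Distance between the element-wise average vectors of the two sets.
    return float(np.linalg.norm(a.mean(axis=0) - b.mean(axis=0)))

set_a = np.array([[0.0, 0.0], [1.0, 0.0]])
set_b = np.array([[1.0, 0.0], [0.0, 0.0]])
```

For larger M, the minimum-pairing distance is an assignment problem; a solver such as `scipy.optimize.linear_sum_assignment` could replace the brute-force search, although the text does not prescribe any particular method.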

そして、集団特徴量算出部２４は、ステップＳ２４ｄで得られた特徴量ベクトルから、重心ベクトルと、分散共分散行列の逆行列を算出する（ステップＳ２４ｅ）。重心ベクトルは、Ｎ次元のベクトルであり、各特徴量ベクトルにステップＳ２４ｂまたはステップＳ２４ｃで受信された度合いを重みとして乗じたベクトルの平均ベクトルであるが、これに限るものではない。例えば、読み出された特徴量ベクトル２３ｇに限って度合いを乗じるとしても良い。また、度合いを乗じず、重心ベクトルは、平均ベクトルであるとしても良い。分散共分散行列の逆行列は、Ｎ×Ｎ次元の行列である。 Then, the collective feature value calculation unit 24 calculates the centroid vector and the inverse matrix of the variance-covariance matrix from the feature value vector obtained in step S24d (step S24e). The center-of-gravity vector is an N-dimensional vector, and is an average vector of vectors obtained by multiplying each feature amount vector by the degree received in step S24b or step S24c, but is not limited thereto. For example, the degree may be multiplied only by the read feature value vector 23g. Further, the center-of-gravity vector may be an average vector without multiplying the degree. The inverse matrix of the variance-covariance matrix is an N × N-dimensional matrix.

なお、上記分散共分散行列は、対角成分のみからなる行列である、即ち、対角成分以外の成分は０であるとしても良い。この場合、上記分散共分散行列の逆行列は、対角成分のみからなる行列である、即ち、対角成分以外の成分は０である。この、分散共分散行列及び分散共分散行列の逆行列は対角成分のみからなる行列であるとする処理によれば、これらの行列は、ベクトルとして表現可能であり、使用する記憶容量の削減と、計算量の削減とが可能である。 The variance-covariance matrix may be treated as a matrix having only diagonal components, that is, all components other than the diagonal components may be taken as 0. In this case, the inverse of the variance-covariance matrix also has only diagonal components. When the variance-covariance matrix and its inverse are treated as diagonal in this way, both can be represented as vectors, which reduces both the storage capacity used and the amount of calculation.
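
Step S24e can be sketched as below. The sketch assumes that the weighted centroid is normalized by the sum of the weights and that the covariance is weighted in the same way; the text only says that the degrees are used as weights. The function name `group_statistics` and its parameters are hypothetical.

```python
import numpy as np

def group_statistics(vectors, weights=None, diagonal=False):
    """Return (centroid, inverse covariance) for a set of N-dimensional vectors."""
    vectors = np.asarray(vectors, dtype=float)
    if weights is None:
        weights = np.ones(len(vectors))       # unweighted case: plain average
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    centroid = (weights[:, None] * vectors).sum(axis=0) / total
    centered = vectors - centroid
    if diagonal:
        # Diagonal case: keep only N variances; the inverse is element-wise.
        # (A zero variance would need special handling, omitted here.)
        variances = (weights[:, None] * centered ** 2).sum(axis=0) / total
        return centroid, 1.0 / variances
    covariance = (weights[:, None] * centered).T @ centered / total
    return centroid, np.linalg.inv(covariance)

samples = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
centroid, inv_cov = group_statistics(samples)
_, inv_var = group_statistics(samples, diagonal=True)
```

With `diagonal=True` the second return value is a length-N vector rather than an N × N matrix, which is the storage and computation saving the paragraph above refers to.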

次に、集団特徴量算出部２４は、所望でないコンテンツ２３ａの特徴量ベクトルを特徴量ベクトル２３ｇを読み出すことによって、または、新たに作成することによって得る（ステップＳ２４ｆ）。この動作は、既に説明したステップＳ２４ｄの所望のコンテンツ２３ａの特徴量ベクトルを得る動作と同様である。 Next, the collective feature quantity calculation unit 24 obtains the feature quantity vector of the content 23a that is not desired by reading the feature quantity vector 23g or by newly creating it (step S24f). This operation is the same as the operation for obtaining the feature vector of the desired content 23a in step S24d already described.

即ち、ステップＳ２４ｂで受信された所望のコンテンツ２３ａを識別する情報に代えてステップＳ２４ｃで受信された所望でないコンテンツ２３ａを識別する情報を用い、ステップＳ２４ｃで受信された所望でないコンテンツ２３ａを識別する情報に代えてステップＳ２４ｂで受信された所望のコンテンツ２３ａを識別する情報を用いる点が相違する他は同一であるので、説明を省略する。 That is, the operation is identical except that the information identifying undesired content 23a received in step S24c is used in place of the information identifying desired content 23a received in step S24b, and the information identifying desired content 23a received in step S24b is used in place of the information identifying undesired content 23a received in step S24c. Its description is therefore omitted.

そして、集団特徴量算出部２４は、ステップＳ２４ｆで得られた特徴量ベクトルから、重心ベクトルと、分散共分散行列の逆行列を算出する（ステップＳ２４ｇ）。重心ベクトルと、分散共分散行列の逆行列の算出は、ステップＳ２４ｅで説明した通りであり、例えば、読み出された特徴量ベクトル２３ｇに限って度合いを乗じるとしても良い。 Then, the collective feature quantity calculation unit 24 calculates the centroid vector and the inverse matrix of the variance-covariance matrix from the feature quantity vector obtained in step S24f (step S24g). The calculation of the centroid vector and the inverse matrix of the variance-covariance matrix is as described in step S24e. For example, the degree may be multiplied only by the read feature vector 23g.

次に、集団特徴量算出部２４は、ステップＳ２４ｅで得られた重心ベクトルを第１の重心ベクトル２５ｄに、ステップＳ２４ｅで得られた分散共分散行列の逆行列を第１の分散共分散行列の逆行列２５ｅに、ステップＳ２４ｇで得られた重心ベクトルを第２の重心ベクトル２５ｆに、ステップＳ２４ｇで得られた分散共分散行列の逆行列を第２の分散共分散行列の逆行列２５ｇに設定する。 Next, the collective feature amount calculation unit 24 sets the centroid vector obtained in step S24e to the first centroid vector 25d, and the inverse matrix of the variance-covariance matrix obtained in step S24e to the first variance-covariance matrix. In the inverse matrix 25e, the centroid vector obtained in step S24g is set to the second centroid vector 25f, and the inverse matrix of the variance-covariance matrix obtained in step S24g is set to the inverse matrix 25g of the second variance-covariance matrix. .

更に、集団特徴量算出部２４は、辞書２５ａを識別する情報を入力装置１３の所定の操作によって入力し、入力された辞書２５ａを識別する情報を辞書名２５ｃに設定し、更に、辞書２５ａを一意に識別する辞書識別子２５ｂを設定することによって得られた辞書２５ａを辞書記憶部２５に記憶させて（ステップＳ２４ｈ）、動作を終了する（ステップＳ２４ｉ）。 Furthermore, the collective feature amount calculation unit 24 inputs information for identifying the dictionary 25a by a predetermined operation of the input device 13, sets the information for identifying the input dictionary 25a to the dictionary name 25c, and further stores the dictionary 25a. The dictionary 25a obtained by setting the uniquely identifying dictionary identifier 25b is stored in the dictionary storage unit 25 (step S24h), and the operation is terminated (step S24i).

なお、集団特徴量算出部２４は、ステップＳ２４ｂ、ステップＳ２４ｄ及びステップＳ２４ｅで説明した所望のコンテンツ２３ａに関する３段階からなる処理は、この順で行う必要がある。また、ステップＳ２４ｃ、ステップＳ２４ｆ及びステップＳ２４ｇで説明した所望でないコンテンツ２３ａに関する３段階からなる処理は、この順で行う必要がある。しかし、これらの２つの処理のどの処理を先に行い、どの処理を後に行っても良いことは明らかである。また、一方の処理のある段階の実行と、他方の処理のある段階の実行の後先が任意であることは明らかである。 Note that the collective feature quantity calculation unit 24 must perform the three stages of processing for desired content 23a described in steps S24b, S24d, and S24e in this order. Likewise, it must perform the three stages of processing for undesired content 23a described in steps S24c, S24f, and S24g in this order. However, either of these two processes may clearly be performed first. It is also clear that the relative order of any stage of one process and any stage of the other is arbitrary.

ここで、以上の説明で、所望であるか否かについて、単に好むか否かであると説明したが、これに限るものではない。例えば、使用者は、聴取した際の印象によってある種類に分類された楽曲や、ある状況において聴取することを好む楽曲を、それぞれ独立して所望であるか否かを指定し、それぞれによって異なる辞書２５ａを作成させるとしても良い。その場合、辞書名２５ｃは、例えば、「海を連想させるような感じ」、「通勤時にぴったりな感じ」等とすれば良い。更に、所望であることは好まないことに相当し、所望でないことは好むことに相当するとしても良く、何ら支障を生じない。 In the above description, whether content is desired has been explained simply as whether the user likes it, but this is not a limitation. For example, the user may specify independently whether music is desired with respect to a category assigned according to the impression it gives when heard, or with respect to a situation in which the user prefers to hear it, and have a different dictionary 25a created for each. In that case, the dictionary name 25c may be, for example, “feels reminiscent of the sea” or “feels just right for commuting”. Furthermore, being desired may correspond to being disliked and being undesired to being liked, without causing any problem.

上記の説明は、辞書２５ａが作成されていない場合に、集団特徴量算出部２４が辞書２５ａを作成する場合の動作の説明であったが、集団特徴量算出部２４の動作は、これに限るものではない。即ち、辞書２５ａが作成され、辞書記憶部２５に記憶されている場合、その記憶されている辞書２５ａの作成に用いられなかったコンテンツ２３ａを識別する情報をステップＳ２４ｂ及びステップＳ２４ｃで受信し、受信された情報によって、記憶されている辞書２５ａを修正する、学習処理を行っても良い。 The above description covers the operation in which the collective feature quantity calculation unit 24 creates a dictionary 25a when none has yet been created, but its operation is not limited to this. When a dictionary 25a has already been created and stored in the dictionary storage unit 25, the unit may perform a learning process: it receives, in steps S24b and S24c, information identifying content 23a that was not used to create the stored dictionary 25a, and corrects the stored dictionary 25a based on the received information.

この学習処理にあたり、集団特徴量算出部２４は、記憶されている辞書２５ａの作成に用いられたコンテンツ２３ａを識別する情報を集団特徴量算出部２４内に記憶し、または辞書記憶部２５に記憶させ、その記憶された情報と上記受信された情報とから新たに辞書２５ａを作成しても良い。 In this learning process, the collective feature quantity calculation unit 24 stores information for identifying the content 23 a used for creating the stored dictionary 25 a in the collective feature quantity calculation unit 24 or stores it in the dictionary storage unit 25. The dictionary 25a may be newly created from the stored information and the received information.

また、記憶されている辞書２５ａの作成の際に重心ベクトルの算出に用いられた重みの合計値と、分散共分散行列を集団特徴量算出部２４内に記憶し、または辞書記憶部２５に記憶させ、その記憶された合計値及び行列とを、上記受信された情報によって更新して辞書２５ａを作成しても良い。この更新処理によれば、新たに作成する処理に比較して、計算量の減少が可能となる。 Further, the total weight value used for calculating the centroid vector and the variance-covariance matrix when the stored dictionary 25 a is created are stored in the collective feature value calculation unit 24 or stored in the dictionary storage unit 25. The stored total value and matrix may be updated with the received information to create the dictionary 25a. According to this update process, the amount of calculation can be reduced as compared with a newly created process.
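
One way such an update could work is a running weighted mean and covariance, sketched below. This is an assumption about how the stored total weight and matrix might be used, not the method given in the text: the class `RunningGroup` is hypothetical, and it keeps a weighted scatter matrix (covariance times total weight) rather than the covariance itself, because that form updates exactly.

```python
import numpy as np

class RunningGroup:
    """Centroid and covariance that can be updated one weighted vector at a time."""

    def __init__(self, dim):
        self.total_weight = 0.0
        self.centroid = np.zeros(dim)
        self.scatter = np.zeros((dim, dim))  # weighted sum of outer products about the centroid

    def update(self, x, weight=1.0):
        x = np.asarray(x, dtype=float)
        self.total_weight += weight
        delta = x - self.centroid                      # offset from the old centroid
        self.centroid = self.centroid + (weight / self.total_weight) * delta
        # Uses the old offset and the new offset, so no pass over earlier data is needed.
        self.scatter += weight * np.outer(delta, x - self.centroid)

    def covariance(self):
        return self.scatter / self.total_weight

    def inverse_covariance(self):
        return np.linalg.inv(self.covariance())
```

Each `update` costs on the order of N² operations regardless of how many vectors were used before, which is consistent with the reduction in calculation that the paragraph above claims over rebuilding the dictionary from scratch.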

なお、集団特徴量算出部２４は、楽曲再生部２７によって起動されるとしたが、これに限るものではない。他の処理部によって起動されるとしても良い。その場合、集団特徴量算出部２４は、ステップＳ２４ｂ及びステップＳ２４ｃで受信されるコンテンツ２３ａを識別する情報を、起動した処理部から受信する。また、起動に際し、入力装置１３の所定のキー操作によって起動の了解が得られたと判断された場合に限って処理を行うとしても良い。更に、集団特徴量算出部２４は、入力装置１３の所定の操作によって起動されても良い。 Although the collective feature quantity calculation unit 24 has been described as being activated by the music playback unit 27, it is not limited to this and may be activated by another processing unit. In that case, it receives the information identifying content 23a in steps S24b and S24c from the processing unit that activated it. On activation, it may also proceed only when it determines that consent to the activation has been given through a predetermined key operation of the input device 13. The collective feature quantity calculation unit 24 may also be activated by a predetermined operation of the input device 13.

ここで、図７を参照して、集団特徴量算出部２４によって、受信された所望のコンテンツ２３ａの特徴量ベクトルと、所望でないコンテンツ２３ａの特徴量ベクトルとから、辞書２５ａが作成される際のデータの流れの概要を説明する。なお、この説明は、概要の説明であり、必ずしも全ての場合を網羅したものではない。 Here, referring to FIG. 7, the collective feature value calculation unit 24 creates a dictionary 25a from the received feature value vector of the desired content 23a and the feature value vector of the undesired content 23a. An outline of the data flow will be described. This description is an outline description and does not necessarily cover all cases.

ステップＳ２４ｂで受信された所望のコンテンツ２３ａの識別子からそのコンテンツ２３ａの特徴量ベクトル２３ｇがステップＳ２４ｄで検索される。その検索された特徴量ベクトル２３ｇに併せ、ステップＳ２４ｃで受信された所望でないコンテンツ２３ａの識別子からそのコンテンツ２３ａの特徴量ベクトル２３ｇを得て、そのベクトルの逆ベクトルがステップＳ２４ｄで作成されて、ステップＳ２４ｅの重心ベクトルと、分散共分散行列の逆行列の算出に用いられる。 From the identifier of the desired content 23a received in step S24b, the feature quantity vector 23g of the content 23a is searched in step S24d. Along with the retrieved feature vector 23g, the feature vector 23g of the content 23a is obtained from the identifier of the undesired content 23a received in step S24c, and an inverse vector of the vector is created in step S24d. It is used to calculate the centroid vector of S24e and the inverse matrix of the variance-covariance matrix.

また、ステップＳ２４ｃで受信された所望でないコンテンツ２３ａの識別子からそのコンテンツ２３ａの特徴量ベクトル２３ｇがステップＳ２４ｆで検索される。その検索された特徴量ベクトル２３ｇに併せ、ステップＳ２４ｂで受信された所望のコンテンツ２３ａの識別子からそのコンテンツ２３ａの特徴量ベクトル２３ｇを得て、そのベクトルの逆ベクトルがステップＳ２４ｆで作成されて、ステップＳ２４ｇの重心ベクトルと、分散共分散行列の逆行列の算出に用いられる。 Further, in step S24f, the feature vector 23g of the content 23a is retrieved from the identifier of the undesired content 23a received in step S24c. Along with the searched feature vector 23g, a feature vector 23g of the content 23a is obtained from the identifier of the desired content 23a received in step S24b, and an inverse vector of the vector is created in step S24f. It is used to calculate the centroid vector of S24g and the inverse matrix of the variance-covariance matrix.

ステップＳ２４ｅで算出された重心ベクトルと、分散共分散行列の逆行列、更に、ステップＳ２４ｇで算出された重心ベクトルと、分散共分散行列の逆行列とが辞書２５ａとして記憶される。 The centroid vector calculated in step S24e, the inverse matrix of the variance-covariance matrix, and the centroid vector calculated in step S24g and the inverse matrix of the variance-covariance matrix are stored as the dictionary 25a.

次に、楽曲検索部２６の動作を説明する。楽曲検索部２６は、入力装置１３の所定の操作に従い制御部１１によって起動されて動作を開始する。そして、入力装置１３の所定の操作に従って、名称２３ｄと、アーチスト名２３ｅと、アルバム名２３ｆとのいずれか、もしくは組合せを検索キーにコンテンツ２３ａを検索して、検索されたコンテンツ２３ａを順序付けする。 Next, the operation of the music search unit 26 will be described. The music search unit 26 is activated by the control unit 11 according to a predetermined operation of the input device 13 and starts operating. Then, in accordance with a predetermined operation of the input device 13, the content 23a is searched by using any one of the name 23d, the artist name 23e, and the album name 23f, or a combination thereof as a search key, and the searched content 23a is ordered.

または、辞書２５ａを参照してコンテンツ２３ａを所望の程度に従って順序付けする。更には、これらの２つの順序付けを組み合わせて順序付けする。そして、所定の順位に順序付けされたコンテンツ２３ａの名称２３ｄと、アーチスト名２３ｅと、アルバム名２３ｆとの１つまたは複数の組を表示部１２に表示する。 Alternatively, the contents 23a are ordered according to a desired degree with reference to the dictionary 25a. Furthermore, these two orderings are combined and ordered. Then, one or a plurality of sets of the content 23a name 23d, artist name 23e, and album name 23f ordered in a predetermined order are displayed on the display unit 12.

そして、入力装置１３の所定の操作に従って、表示された名称２３ｄと、アーチスト名２３ｅと、アルバム名２３ｆとの１つまたは複数の中の１つが選択されると、楽曲検索部２６は、楽曲再生部２７を起動し、その選択されたコンテンツ２３ａのコンテンツデータ２３ｃを楽曲再生部２７に送信して再生させる。 When one of one or more of the displayed name 23d, artist name 23e, and album name 23f is selected according to a predetermined operation of the input device 13, the music search unit 26 reproduces the music. The unit 27 is activated, and the content data 23c of the selected content 23a is transmitted to the music playback unit 27 for playback.

なお、表示部１２の表示画面の大きさに依存して、上記表示するものは、一部のみを表示し、残部は、スクロールによって表示されるとする。また、楽曲検索部２６は、表示するものから一部を乱数に基づいて、コンテンツ２３ａに要する記憶容量が所定の容量以内であるように、所定の再生時間以内であるように、または、所定の楽曲数を選択し、選択されたものを表示しても良い。 Depending on the size of the display screen of the display unit 12, only part of the items to be displayed is shown at once, and the rest is shown by scrolling. The music search unit 26 may also select some of the items to be displayed based on random numbers, so that the storage capacity required for the content 23a is within a predetermined capacity, so that the playback time is within a predetermined time, or so that a predetermined number of songs is chosen, and display the selected items.
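
The random selection under a limit might look like the following sketch. The record fields `seconds` and `bytes`, the function name, and the greedy strategy of skipping items that would exceed a limit are all assumptions made for illustration.

```python
import random

def pick_random_subset(candidates, max_count=None, max_seconds=None,
                       max_bytes=None, seed=None):
    """Randomly pick items without exceeding any limit that was given."""
    rng = random.Random(seed)
    shuffled = list(candidates)
    rng.shuffle(shuffled)                      # random order decides who is considered first
    chosen, seconds, size = [], 0.0, 0
    for item in shuffled:
        if max_count is not None and len(chosen) >= max_count:
            break                              # predetermined number of songs reached
        if max_seconds is not None and seconds + item["seconds"] > max_seconds:
            continue                           # would exceed the playback-time limit
        if max_bytes is not None and size + item["bytes"] > max_bytes:
            continue                           # would exceed the storage-capacity limit
        chosen.append(item)
        seconds += item["seconds"]
        size += item["bytes"]
    return chosen

tracks = [{"id": i, "seconds": 200.0, "bytes": 1000} for i in range(10)]
by_count = pick_random_subset(tracks, max_count=3, seed=1)
by_time = pick_random_subset(tracks, max_seconds=450.0, seed=1)
```

Here `by_count` holds three tracks and `by_time` holds two, since a third 200-second track would exceed the 450-second limit.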

また、表示部１２に表示することに限るものではない。表示するコンテンツ２３ａのコンテンツ識別子２３ｂの連結からなるプレイリストを作成し、コンテンツ記憶部２３に記憶させるとしても良い。また、通信部を制御して送信させても良い。 Further, the display is not limited to the display unit 12. A playlist composed of the concatenation of the content identifiers 23b of the content 23a to be displayed may be created and stored in the content storage unit 23. Further, the communication unit may be controlled to transmit.

コンテンツデータ２３ｃが楽曲再生部２７によって再生されている際、入力装置１３の所定の操作によって、そのコンテンツ２３ａが所望であると入力されると、楽曲検索部２６は、コンテンツ識別子２３ｂと、所望である旨とを楽曲検索部２６内に記憶する。所望の度合いが併せて入力された場合、その度合いを併せて記憶する。また、そのコンテンツ２３ａが所望でないと入力されると、楽曲検索部２６は、コンテンツ識別子２３ｂと、所望でない旨とを楽曲検索部２６内に記憶する。所望でない度合いが併せて入力された場合、その度合いを併せて記憶する。 While the content data 23c is being played back by the music playback unit 27, if a predetermined operation of the input device 13 indicates that the content 23a is desired, the music search unit 26 stores internally the content identifier 23b together with an indication that the content is desired. If a degree of desire is also input, that degree is stored as well. Likewise, if the operation indicates that the content 23a is not desired, the music search unit 26 stores internally the content identifier 23b together with an indication that the content is not desired, and stores the degree of undesirability if one is also input.

なお、コンテンツデータ２３ｃが楽曲再生部２７によって再生されている際、その再生が開始されてから所定の経過時間以内に入力装置１３の所定の操作によって異なるコンテンツデータ２３ｃの再生が指示された場合、楽曲検索部２６は、そのコンテンツ２３ａが所望でないと入力されたとみなしても良い。短時間で聴取を中止されたことは、所望でないと判断されるからである。 Note that, while the content data 23c is being played back by the music playback unit 27, if playback of different content data 23c is instructed by a predetermined operation of the input device 13 within a predetermined time after playback started, the music search unit 26 may treat the content 23a as having been input as undesired. This is because stopping listening after a short time is taken to indicate that the content is not desired.

図８は、楽曲検索部２６が、辞書２５ａを参照してコンテンツ２３ａを所望の程度に従って順序付けし、所定の順位に順序付けされたコンテンツ２３ａの名称２３ｄと、アーチスト名２３ｅと、アルバム名２３ｆとの１つまたは複数を表示部１２に表示する動作のフローチャートを示す。 FIG. 8 shows a flowchart of the operation in which the music search unit 26 refers to the dictionary 25a to order the content 23a according to the degree to which it is desired, and displays on the display unit 12 one or more of the name 23d, the artist name 23e, and the album name 23f of the content 23a ranked at predetermined positions.

楽曲検索部２６は、入力装置１３の所定の操作によって上記辞書２５ａを参照してコンテンツ２３ａを所望の程度に従って順序付けして、所定の順位に順序付けされたコンテンツ２３ａの名称２３ｄと、アーチスト名２３ｅと、アルバム名２３ｆとの１つまたは複数を表示部１２に表示する動作を開始する（ステップＳ２６ａ）。 The music search unit 26 refers to the dictionary 25a by a predetermined operation of the input device 13 and orders the content 23a according to a desired degree, and the name 23d of the content 23a, the artist name 23e, Then, the operation of displaying one or more of the album names 23f on the display unit 12 is started (step S26a).

そして、楽曲検索部２６は、入力装置１３の操作によって入力された辞書名と辞書名２５ｃとが等しい辞書２５ａを辞書記憶部２５から検索し、その検索された辞書２５ａの第１の重心ベクトル２５ｄと、第１の分散共分散行列の逆行列２５ｅと、第２の重心ベクトル２５ｆと、第２の分散共分散行列の逆行列２５ｇとを読み出す（ステップＳ２６ｂ）。 Then, the music search unit 26 searches the dictionary storage unit 25 for a dictionary 25a in which the dictionary name input by the operation of the input device 13 and the dictionary name 25c are the same, and the first centroid vector 25d of the searched dictionary 25a. Then, the inverse matrix 25e of the first variance-covariance matrix, the second centroid vector 25f, and the inverse matrix 25g of the second variance-covariance matrix are read (step S26b).

次に、楽曲検索部２６は、コンテンツ２３ａの特徴量ベクトル２３ｇを逐次コンテンツ記憶部２３から読み込む（ステップＳ２６ｃ）。ここで、コンテンツ２３ａが尽きて読み込めなかったか、尽きずに読み込めたかを調べる（ステップＳ２６ｄ）。 Next, the music search unit 26 sequentially reads the feature vector 23g of each content 23a from the content storage unit 23 (step S26c). It then checks whether the read failed because the content 23a was exhausted or succeeded because content remained (step S26d).

コンテンツ２３ａが尽きずに読み込まれた場合、楽曲検索部２６は、その特徴量ベクトル２３ｇを構成する第１〜第Ｍの特徴量ベクトル２３ｇ１〜２３ｇＭとのそれぞれと、ステップＳ２６ｂで読み込まれた第１の重心ベクトル２５ｄと、第１の分散共分散行列の逆行列２５ｅとによって示される集団（第１の集団と称する。）との距離（第１の距離と称する。）、及びその集団との類似度（第１の類似度と称する。）とを算出する。ここで、第１の類似度は、第１の距離が大きい程小さく、第１の距離が小さい程大きい。 When a feature vector 23g has been read, the music search unit 26 calculates, for each of the first to M-th feature vectors 23g1 to 23gM constituting it, the distance (referred to as the first distance) to the group (referred to as the first group) represented by the first centroid vector 25d and the inverse matrix 25e of the first variance-covariance matrix read in step S26b, and the similarity to that group (referred to as the first similarity). The first similarity is smaller the larger the first distance is, and larger the smaller the first distance is.

更に、楽曲検索部２６は、上記第１〜第Ｍの特徴量ベクトル２３ｇ１〜２３ｇＭとのそれぞれと、ステップＳ２６ｂで読み込まれた第２の重心ベクトル２５ｆと、第２の分散共分散行列の逆行列２５ｇとによって示される集団（第２の集団と称する。）との距離（第２の距離と称する。）、及びその集団との類似度（第２の類似度と称する。）とを算出する。ここで、第２の類似度は、第２の距離が大きい程小さく、第２の距離が小さい程大きい（ステップＳ２６ｅ）。 Furthermore, the music search unit 26 calculates, for each of the first to M-th feature vectors 23g1 to 23gM, the distance (referred to as the second distance) to the group (referred to as the second group) represented by the second centroid vector 25f and the inverse matrix 25g of the second variance-covariance matrix read in step S26b, and the similarity to that group (referred to as the second similarity). The second similarity is smaller the larger the second distance is, and larger the smaller the second distance is (step S26e).

例えば、上記第ｍの特徴量ベクトル２３ｇｍをｘｍ（１≦ｍ≦Ｍ）とし、第ｉの集団との距離をマハラノビス距離とし、第ｉの類似度は、そのマハラノビス距離に１を加えた値の逆数とする（ｉ＝１、２）。即ち、第１の重心ベクトル２５ｄをｖｇ１、第１の分散共分散行列の逆行列２５ｅをＤ１、第２の重心ベクトル２５ｆをｖｇ２、第２の分散共分散行列の逆行列２５ｇをＤ２とすると、第ｉの類似度Ｓｉ（ｉ＝１、２）は、以下の（式３）によって算出される。

For example, let the m-th feature vector 23gm be xm (1 ≦ m ≦ M), let the distance to the i-th group be the Mahalanobis distance, and let the i-th similarity be the reciprocal of the value obtained by adding 1 to that Mahalanobis distance (i = 1, 2). That is, with the first centroid vector 25d denoted vg1, the inverse matrix 25e of the first variance-covariance matrix denoted D1, the second centroid vector 25f denoted vg2, and the inverse matrix 25g of the second variance-covariance matrix denoted D2, the i-th similarity Si (i = 1, 2) is calculated by the following (Formula 3).
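
The per-vector part of this calculation can be sketched as follows. Formula 3 itself is not reproduced in this text, so the sketch stops at the similarity of a single vector xm to one group and does not show how the M values are combined. The use of the square-root form of the Mahalanobis distance is also an assumption, as are the function names.

```python
import numpy as np

def mahalanobis(x, centroid, inv_cov):
    # Distance from vector x to the group described by (centroid, inverse covariance).
    diff = np.asarray(x, dtype=float) - centroid
    return float(np.sqrt(diff @ inv_cov @ diff))

def similarity(x, centroid, inv_cov):
    # Reciprocal of (Mahalanobis distance + 1): 1 at the centroid, approaching 0 far away.
    return 1.0 / (1.0 + mahalanobis(x, centroid, inv_cov))

vg1 = np.zeros(2)          # stands in for the first centroid vector 25d
D1 = np.eye(2)             # stands in for the inverse covariance matrix 25e
s_near = similarity(np.array([0.0, 0.0]), vg1, D1)
s_far = similarity(np.array([3.0, 4.0]), vg1, D1)
```

Adding 1 before taking the reciprocal keeps the similarity finite when a vector coincides with the centroid.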

なお、上記説明では、第ｉの集団との第ｉの距離としてマハラノビス距離を用いると説明したが、これに限るものではない。例えば、第ｉの距離は、上記第ｍの特徴量ベクトル２３ｇｍと、ｖｇｉ（第１の重心ベクトル２５ｄまたは第２の重心ベクトル２５ｆ。）との間の距離、例えばユークリッド距離であるとしても良い（ｉ＝１、２）。 In the above description, the Mahalanobis distance is used as the i-th distance to the i-th group, but the present invention is not limited to this. For example, the i-th distance may be a distance between the m-th feature vector 23gm and vgi (the first centroid vector 25d or the second centroid vector 25f), for example, the Euclidean distance ( i = 1, 2).

第ｉの距離としてベクトル間の距離を用いると、その距離の算出に第１の分散共分散行列の逆行列２５ｅ及び第２の分散共分散行列の逆行列２５ｇが用いられないので、これらを算出しない。そこで、計算量の削減が可能になる。一方、マハラノビス距離等のベクトルと集団との間の距離を用いると、第１の分散共分散行列の逆行列２５ｅと、第２の分散共分散行列の逆行列２５ｇとが異なる場合、適切な第ｉの類似度Ｓｉ（ｉ＝１、２）を算出することができる。 When a distance between vectors is used as the i-th distance, the inverse matrix 25e of the first variance-covariance matrix and the inverse matrix 25g of the second variance-covariance matrix are not needed for the distance calculation and so are not calculated, which reduces the amount of calculation. On the other hand, when a distance between a vector and a group, such as the Mahalanobis distance, is used, an appropriate i-th similarity Si (i = 1, 2) can be calculated even when the inverse matrix 25e of the first variance-covariance matrix and the inverse matrix 25g of the second variance-covariance matrix differ.

また、上記距離は、ベクトル量子化を用いた距離であるとしても良い。楽曲検索部２６は、集団特徴量算出部２４のステップＳ２４ｄの動作で得られた所望のコンテンツの特徴量ベクトルをベクトル量子化して所定個数のコードベクトルを算出し、算出されたコードベクトルと第ｍの特徴量ベクトル２３ｇｍとの間のユークリッド距離の中で最小である距離を第１の距離とする。 The distance may also be a distance based on vector quantization. The music search unit 26 vector-quantizes the feature vectors of the desired content obtained in step S24d of the collective feature quantity calculation unit 24 to calculate a predetermined number of code vectors, and takes as the first distance the smallest of the Euclidean distances between the calculated code vectors and the m-th feature vector 23gm.

同様に、楽曲検索部２６は、集団特徴量算出部２４のステップＳ２４ｆの動作で得られた所望でないコンテンツの特徴量ベクトルをベクトル量子化して所定個数のコードベクトルを算出し、算出されたコードベクトルと第ｍの特徴量ベクトル２３ｇｍとの間のユークリッド距離の中で最小である距離を第２の距離とする。 Similarly, the music search unit 26 vector-quantizes the feature vectors of the undesired content obtained in step S24f of the collective feature quantity calculation unit 24 to calculate a predetermined number of code vectors, and takes as the second distance the smallest of the Euclidean distances between the calculated code vectors and the m-th feature vector 23gm.
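
A vector-quantization distance of this kind can be sketched with a plain k-means codebook. The text does not say which quantization algorithm is used, so k-means (Lloyd's algorithm), the iteration count, and the function names here are assumptions.

```python
import numpy as np

def train_codebook(vectors, num_codes, iterations=20, seed=0):
    """Derive num_codes code vectors from the training vectors with basic k-means."""
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)
    # Start from distinct training vectors chosen at random.
    codebook = vectors[rng.choice(len(vectors), size=num_codes, replace=False)].copy()
    for _ in range(iterations):
        d = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        for k in range(num_codes):
            members = vectors[nearest == k]
            if len(members) > 0:               # leave an empty cell's code vector unchanged
                codebook[k] = members.mean(axis=0)
    return codebook

def vq_distance(x, codebook):
    # Smallest Euclidean distance from x to any code vector.
    return float(np.linalg.norm(codebook - np.asarray(x, dtype=float), axis=1).min())

cluster = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]])
training = np.vstack([cluster, cluster + 10.0])
codebook = train_codebook(training, num_codes=2)
```

One codebook would be trained on the desired content's feature vectors and another on the undesired content's, giving the first and second distances respectively.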

Here, the number of code vectors used to calculate the first distance is assumed to equal the number used to calculate the second distance, but this is not a requirement. Whether a vector-quantization distance is used is independent of whether the feature extraction unit 22 calculated code vectors by vector quantization. Likewise, the number of code vectors used to calculate the distance is independent of the number of code vectors calculated by the feature extraction unit 22.

The distance may also be a probabilistic distance based on a Gaussian mixture model. That is, the music search unit 26 represents the feature vectors of the desired content obtained in step S24d of the collective feature quantity calculation unit 24 as a mixture of a plurality of Gaussian distributions. The reciprocal of the occurrence probability of the m-th feature vector 23gm under this mixture, or the reciprocal of the logarithm of that probability, is taken as the first distance.

Similarly, the feature vectors of the undesired content obtained in step S24f of the collective feature quantity calculation unit 24 are represented as a mixture of a plurality of Gaussian distributions. The reciprocal of the occurrence probability of the m-th feature vector 23gm under this mixture, or the reciprocal of the logarithm of that probability, is taken as the second distance.
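The Gaussian-mixture distance can be illustrated with the sketch below. It assumes a mixture whose parameters (weights, means, and per-element variances) have already been fitted, and it assumes diagonal covariances for brevity; neither assumption comes from the embodiment. The function names and the `eps` guard are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a diagonal-covariance Gaussian at vector x."""
    density = 1.0
    for xi, mu, v in zip(x, mean, var):
        density *= math.exp(-((xi - mu) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
    return density

def gmm_density(x, weights, means, variances):
    """Weighted sum of the component densities at x."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

def gmm_distance(x, weights, means, variances, eps=1e-300):
    """Reciprocal of the mixture density; eps avoids division by zero."""
    return 1.0 / max(gmm_density(x, weights, means, variances), eps)

# Hypothetical two-component mixture for one group.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

near = gmm_distance([0.0, 0.0], weights, means, variances)
far = gmm_distance([10.0, 10.0], weights, means, variances)
```

A vector near a component mean has a high occurrence probability and therefore a small distance; a vector far from every component has a large one.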

The distance may also be a probabilistic distance based on histograms. That is, the music search unit 26 creates a histogram for each element of the feature vectors of the desired content obtained in step S24d of the collective feature quantity calculation unit 24, and takes the reciprocal of the occurrence probability of the m-th feature vector 23gm in those histograms, or the reciprocal of the logarithm of that probability, as the first distance.

Similarly, a histogram is created for each element of the feature vectors of the undesired content obtained in step S24f of the collective feature quantity calculation unit 24, and the reciprocal of the occurrence probability of the m-th feature vector 23gm in those histograms, or the reciprocal of the logarithm of that probability, is taken as the second distance.
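A sketch of the histogram-based distance follows. The text does not state how the per-element histograms are combined into one occurrence probability; the sketch assumes the product of the per-element bin probabilities, which treats the elements as independent. The bin range, bin count, `floor` guard, and all names are hypothetical.

```python
def bin_index(value, low, high, bins):
    """Index of the bin containing value, clamped to the valid range."""
    idx = int((value - low) / (high - low) * bins)
    return min(max(idx, 0), bins - 1)

def build_histograms(vectors, low, high, bins):
    """One normalized histogram per vector element."""
    dims = len(vectors[0])
    counts = [[0] * bins for _ in range(dims)]
    for vec in vectors:
        for d, value in enumerate(vec):
            counts[d][bin_index(value, low, high, bins)] += 1
    return [[c / len(vectors) for c in row] for row in counts]

def histogram_distance(x, hists, low, high, bins, floor=1e-12):
    """Reciprocal of the (assumed) product of per-element bin probabilities."""
    prob = 1.0
    for d, value in enumerate(x):
        prob *= hists[d][bin_index(value, low, high, bins)]
    return 1.0 / max(prob, floor)

# Hypothetical feature vectors of one group, with elements in [0, 1).
group = [[0.1, 0.1], [0.2, 0.15], [0.9, 0.8]]
hists = build_histograms(group, 0.0, 1.0, 2)

near = histogram_distance([0.1, 0.1], hists, 0.0, 1.0, 2)
far = histogram_distance([0.9, 0.9], hists, 0.0, 1.0, 2)
```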

In the probabilistic distances based on a Gaussian mixture model and on histograms, the distance was described as the reciprocal of the occurrence probability or the reciprocal of the logarithm of that probability, but this is not a limitation. The distance may be any non-increasing function of the occurrence probability or of its logarithm.

With the methods that use a vector-quantization distance, a Gaussian-mixture probabilistic distance, or a histogram probabilistic distance, the centroid vectors and the inverses of the variance-covariance matrices described for steps S24e and S24g of the collective feature quantity calculation unit 24 need not be calculated. Instead, the collective feature quantity calculation unit 24 may calculate the code vectors, the mixture of Gaussian distributions, or the per-element histograms that correspond to the chosen method.

Note that when, during the operation of the collective feature quantity calculation unit 24, information identifying desired content 23a is input in step S24b of the flowchart shown in FIG. 6 and information identifying undesired content 23a is input in step S24c, the inverse 25e of the first variance-covariance matrix and the inverse 25g of the second variance-covariance matrix may differ greatly, so using a distance between a vector and a group, such as the Mahalanobis distance, is particularly effective.

Once the first similarity S1 and the second similarity S2 have been calculated for the M vectors xm as described above, the music search unit 26 calculates a similarity R, which expresses the degree to which the content 23a is desired by the user. The similarity R increases as the first similarity S1 increases and decreases as the second similarity S2 increases.

For example, the music search unit 26 obtains the number Cnt of vectors for which the first similarity S1 is greater than the second similarity S2 plus a predetermined constant T1 (0 ≤ Cnt ≤ M). T1 is, for example, 0. The unit then calculates the similarity R of the feature vector 23g of the content 23a read in step S26c so that R is large when Cnt is large and small when Cnt is small (step S26f), and returns to step S26c, in which the feature vectors 23g of the content 23a are read one after another.

Here, Cnt and the similarity R are calculated by, for example, the following (Equation 4-1) and (Equation 4-2), respectively.

The larger the similarity R, the more likely it is that the content 23a read in step S26c is content 23a desired by the user, judged according to the dictionary 25a read in step S26b.
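The counting rule described above can be sketched as follows. Equations 4-1 and 4-2 are not reproduced in this text, so the normalization R = Cnt / M used here is an assumption chosen only because it grows with Cnt as the description requires; the function name and sample values are hypothetical.

```python
def count_and_similarity(s1, s2, t1=0.0):
    """Count vectors with S1 > S2 + T1 and derive a similarity R.

    s1, s2: per-vector first and second similarities (length M).
    R = Cnt / M is an assumed form, not taken from the source equations.
    """
    cnt = sum(1 for a, b in zip(s1, s2) if a > b + t1)
    return cnt, cnt / len(s1)

# Hypothetical similarities for M = 4 feature vectors of one content item.
s1 = [0.9, 0.2, 0.7, 0.5]
s2 = [0.1, 0.6, 0.3, 0.5]
cnt, r = count_and_similarity(s1, s2)
```

With T1 = 0, a vector counts only when S1 strictly exceeds S2, so the tie in the last position is not counted.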

On the other hand, when in step S26d the feature vectors 23g of the content 23a have been exhausted and none is read, the music search unit 26 orders, that is, sorts, the information identifying the content 23a according to the similarity R calculated in step S26f. It then displays on the display unit 12 the information identifying the content 23a whose similarity R falls within a predetermined range (step S26g) and ends the operation (step S26h).

The content 23a whose identification information is displayed on the display unit 12 in step S26g is, for example, content with a large similarity R, content with a small similarity R, or content whose similarity R lies within a predetermined range of values. Displaying content with a large similarity R suits a user who wants to listen to music that he or she usually desires, that is, favorite music. Displaying content with a small similarity R suits a user who wants to sample music that he or she does not usually desire, that is, music that is rarely played. Displaying content whose similarity R lies within a predetermined range suits a user who wants to sample music that is not usually desired but does not differ greatly in desirability from the desired music.
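The sorting and range selection of steps S26f to S26g can be sketched as below. The dictionary of similarities, the content identifiers, and the range bounds are hypothetical values chosen for illustration.

```python
def select_by_similarity(similarities, low, high):
    """Return content ids with low <= R <= high, sorted by descending R."""
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    return [cid for cid, r in ranked if low <= r <= high]

# Hypothetical similarity R per content identifier.
similarities = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.7}

favorites = select_by_similarity(similarities, 0.6, 1.0)     # large R
rarely_played = select_by_similarity(similarities, 0.0, 0.2)  # small R
in_between = select_by_similarity(similarities, 0.4, 0.8)     # mid range
```

The three calls correspond to the three display policies in the text: large R, small R, and R within a predetermined range.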

Note that in step S26e, when calculating the distance between each of the first to M-th feature vectors 23g1 to 23gM constituting the feature vector 23g and the first group, and the distance between each of those vectors and the second group, the music search unit 26 may use fewer than N elements of the N-dimensional first to M-th feature vectors 23g1 to 23gM.

For example, when the distance is calculated with the i-th element (1 ≤ i ≤ N) removed, the unit uses the first centroid vector 25d and the second centroid vector 25f with their i-th elements removed, and the inverse 25e of the first variance-covariance matrix and the inverse 25g of the second variance-covariance matrix with their i-th rows and i-th columns removed.

The i-th element is removed when the values in the i-th row and i-th column of the inverse 25e of the first variance-covariance matrix and/or the inverse 25g of the second variance-covariance matrix are small. Such elements have a large variance and therefore contribute little to the distance to the first group and/or the second group, so removing them reduces the amount of computation while having little effect on the similarity R. The reduction is particularly significant when many items of content 23a are stored in the content storage unit 23.
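The element-removal step can be sketched as follows. The sketch uses the squared Mahalanobis form (x − μ)ᵀ Σ⁻¹ (x − μ) as the vector-to-group distance, which is an assumption about the exact form; all names and numeric values are hypothetical. As in the text, the i-th row and column are deleted from the stored inverse matrix itself rather than recomputing an inverse.

```python
def drop_index(vec, i):
    """Copy of vec without its i-th element."""
    return vec[:i] + vec[i + 1:]

def drop_row_col(mat, i):
    """Copy of a square matrix without its i-th row and i-th column."""
    return [drop_index(row, i) for k, row in enumerate(mat) if k != i]

def mahalanobis_sq(x, mean, inv_cov):
    """Squared Mahalanobis-style distance using a stored inverse matrix."""
    d = [a - b for a, b in zip(x, mean)]
    n = len(d)
    return sum(d[r] * inv_cov[r][c] * d[c] for r in range(n) for c in range(n))

# Hypothetical group description; element 1 has a tiny inverse-matrix entry,
# i.e. a large variance, so it barely affects the distance.
mean = [0.0, 0.0, 0.0]
inv_cov = [[1.0, 0.0, 0.0],
           [0.0, 0.001, 0.0],
           [0.0, 0.0, 2.0]]
x = [1.0, 10.0, 2.0]

full = mahalanobis_sq(x, mean, inv_cov)
reduced = mahalanobis_sq(drop_index(x, 1), drop_index(mean, 1),
                         drop_row_col(inv_cov, 1))
```

Dropping the element changes the distance only slightly while shrinking the quadratic form from 3 × 3 to 2 × 2 terms.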

Next, with reference to FIG. 9, the concept by which desired music is retrieved when the music search unit 26 orders the content 23a by degree of desirability with reference to the dictionary 25a is described. This is a conceptual explanation and does not necessarily correspond to the values actually calculated.

As shown in FIG. 9, assume that the character of a piece of music is expressed by one of seven ordered levels, "very dark", "dark", "slightly dark", "neither bright nor dark", "slightly bright", "bright", and "very bright", or by a position midway between two adjacent levels. Music is divided into "relatively dark" music and "relatively bright" music according to which side of the average music, whose character is "neither bright nor dark", it falls on.

Assume that the user's desired music is "slightly bright". When desired music that has already been listened to is given to the collective feature quantity calculation unit 24, the first centroid vector 25d in the dictionary 25a, which represents the group of desired music, indicates "slightly bright", and the second centroid vector 25f, which represents the group of undesired music, indicates "slightly dark".

When the music search unit 26 is then made to search for desired music, it retrieves music whose character is close to the "slightly bright" indicated by the first centroid vector 25d and far from the "slightly dark" indicated by the second centroid vector 25f, yielding the first search result shown in FIG. 9. The first search result therefore differs from the second search result, which covers music in a range symmetric in both the bright and dark directions about the desired "slightly bright" music. It contains no "relatively dark" music that the user does not want, and the search result the user intends is obtained.

The second search result, which covers music in a range symmetric in both the bright and dark directions about the desired "slightly bright" music and includes "relatively dark" music outside the user's search intent, is the result obtained when the concept of the second centroid vector 25f, that is, the concept of music the user does not desire, is not used.
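The difference between the two search results can be illustrated on a one-dimensional brightness scale. The numeric positions (−3 to +3), the radius of 2, and the selection rules below are hypothetical stand-ins for the concept in FIG. 9, not values calculated by the apparatus.

```python
# Hypothetical positions of the seven levels on a brightness axis.
songs = {
    "very dark": -3, "dark": -2, "slightly dark": -1,
    "neither bright nor dark": 0,
    "slightly bright": 1, "bright": 2, "very bright": 3,
}
desired_centroid = 1     # "slightly bright"
undesired_centroid = -1  # "slightly dark"

# First search result: closer to the desired group than to the undesired one.
first_result = [name for name, x in songs.items()
                if abs(x - desired_centroid) < abs(x - undesired_centroid)]

# Second search result: symmetric range around the desired centroid only.
second_result = [name for name, x in songs.items()
                 if abs(x - desired_centroid) <= 2]
```

The symmetric range pulls in "slightly dark", whereas the rule that also uses the undesired centroid keeps only the bright side.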

Next, with reference to FIG. 10, an outline is given of the flow of the content 23a (shown by double-line arrows in the figure), the flow of feature quantities (shown by solid arrows), and the flow of designations of the content 23a (shown by broken arrows) in the information processing apparatus according to this embodiment. This is an outline of the flow of information and does not show every detail of the operation.

Content is received by the music registration unit 21, which stores the received content in the content 23a as content data 23c and also sends it to the feature extraction unit 22. The feature extraction unit 22 extracts feature vectors from the received content data 23c and stores them in the content 23a as the feature vector 23g.

When a designation of desired content 23a entered from the input device 13 is received, the collective feature quantity calculation unit 24 refers to the feature vectors 23g of the content 23a, retrieves and/or creates the feature vectors needed to search for the desired content 23a, and stores feature quantities such as centroid vectors in the dictionary 25a. The music search unit 26 refers to the feature quantities stored in the dictionary 25a, searches the content 23a for the desired content 23a, and sends the retrieved content data 23c to the music playback unit 27. The music playback unit 27 plays back the received content data 23c and outputs sound from the speaker 27a.

In the above description, the content data 23c included in the content 23a is encoded data, but this is not a limitation. It may be decoded data.

The above description used as its example the retrieval of desired music by the music search unit 26 with reference to the dictionary 25a, but the search is not limited to music. For example, when the content storage unit 23 stores content 23a that is music and content 23a that consists of human speech, the collective feature quantity calculation unit 24 may be made to create the dictionary 25a with one of the two as the desired content 23a and the other as the undesired content 23a. With a dictionary 25a created in this way, the music search unit 26 can be made to retrieve either the content 23a that is music or the content 23a that consists of human speech.

It is known that content 23a that is music and content 23a that consists of human speech differ in the distribution of the frequency components they contain. The group features of the two kinds of content 23a (the centroid vector and the inverse of the variance-covariance matrix) therefore differ, which makes the above search possible.

In the above description, the music search unit 26 orders the content 23a stored in the content storage unit 23 by degree of desirability, but this is not a limitation. The music search unit 26 may calculate, with reference to the dictionary 25a, the similarity R of content data being received by the communication unit. When that similarity R is at least a first predetermined value and/or at most a second predetermined value, the music search unit 26 may control the music playback unit 27 to output a predetermined sound from the speaker 27a as a notification.

In this case, the communication unit may receive a broadcast, and the music search unit 26 may calculate the similarity R of content data that is broadcast music. The similarity R may also be calculated from only the beginning of the received music, for example the first few seconds of data. When the beginning of the data is used, M feature vectors may be calculated from that content data, or fewer than M feature vectors may be calculated and used to obtain the similarity R.

The notification is not limited to outputting a predetermined sound from the speaker 27a. For example, the apparatus may have a vibrator (not shown) and give the notification by vibrating it. With this notification operation, when, for example, a radio broadcast is being received and a piece of music the user likes starts to be broadcast, the user can be notified that a favorite piece of music is on the air.
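The notification condition described above, a similarity R that is at least a first predetermined value and/or at most a second predetermined value, can be sketched as a simple predicate. The function name, parameter names, and threshold values are hypothetical; how the notification itself is delivered (speaker or vibrator) is outside the sketch.

```python
def should_notify(r, upper=None, lower=None):
    """True when R >= upper or R <= lower; an unset bound is ignored."""
    hit_upper = upper is not None and r >= upper
    hit_lower = lower is not None and r <= lower
    return hit_upper or hit_lower

# Hypothetical thresholds: notify for clearly liked or clearly disliked music.
liked = should_notify(0.9, upper=0.8)
neutral = should_notify(0.5, upper=0.8, lower=0.2)
disliked = should_notify(0.1, lower=0.2)
```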

In the above description, the dictionary 25a is created by the collective feature quantity calculation unit 24, but this is not a limitation. For example, it may be received from outside the apparatus, or it may be included in the apparatus at the time of shipment.

In the above description, the dictionary 25a is stored in the dictionary storage unit 25, but this is not a limitation. It may be created by the collective feature quantity calculation unit 24 and sent directly to the music search unit 26.

(Second Embodiment)
The second embodiment differs from the first embodiment in the operation of the collective feature quantity calculation unit 24. The operation of the collective feature quantity calculation unit 24 of the second embodiment is therefore described with reference to the drawings. FIG. 11 is a flowchart of the operation of the collective feature quantity calculation unit 24 according to the second embodiment. Operation steps that are the same as those of the collective feature quantity calculation unit 24 according to the first embodiment are given the same reference symbols, and their description is omitted.

In step S24d of the first embodiment, the collective feature quantity calculation unit 24 obtains the feature vectors 23g of the desired content 23a and, in addition, uses the inverse vectors of the feature vectors 23g of the undesired content 23a.

The collective feature quantity calculation unit 24 according to the second embodiment first creates a provisional dictionary 25a, in the same way as the collective feature quantity calculation unit 24 according to the first embodiment, from the inverse vectors of the feature vectors 23g of the undesired content 23a. It then controls the music search unit 26 to retrieve the content 23a whose first distance to the group represented by the first centroid vector 25d and the inverse 25e of the first variance-covariance matrix in the provisional dictionary 25a is the smallest, that is, the content 23a whose first similarity S1 is the largest.

In addition to, or instead of, the above condition, a condition that the first distance be at most a predetermined value, that is, that the first similarity S1 be at least a predetermined value, may be used. Under this condition the search may return no content 23a at all, which prevents inappropriate content from being retrieved as desired content 23a.

Then, instead of using the inverse vectors of the feature vectors 23g of the undesired content 23a as the collective feature quantity calculation unit 24 according to the first embodiment does, the collective feature quantity calculation unit 24 according to the second embodiment creates the dictionary 25a using the feature vectors 23g of the retrieved content 23a as feature vectors 23g of desired content 23a (step S24m).

Similarly, instead of the operation of step S24f of the first embodiment, the collective feature quantity calculation unit 24 according to the second embodiment creates a provisional dictionary 25a from the inverse vectors of the feature vectors 23g of the desired content 23a. It then controls the music search unit 26 to retrieve the content 23a whose second similarity S2 to the group represented by the second centroid vector 25f and the inverse 25g of the second variance-covariance matrix in the provisional dictionary 25a is the largest, and creates the dictionary 25a using the feature vectors 23g of the retrieved content 23a as feature vectors 23g of undesired content 23a (step S24n).

With this processing, all of the feature vectors 23g used in subsequent processing are ones stored in the content 23a, and none is fictitious. The first centroid vector 25d, the inverse 25e of the first variance-covariance matrix, the second centroid vector 25f, and the inverse 25g of the second variance-covariance matrix calculated thereafter therefore reflect the stored content 23a more appropriately.

Instead of the above processing, in step S24m the unit may search the content 23a for feature vectors 23g whose distance from the inverse vectors of the feature vectors 23g of the undesired content 23a is small, and use the retrieved feature vectors 23g as feature vectors 23g of desired content 23a.

Likewise, in step S24n the unit may search the content 23a for feature vectors 23g whose distance from the inverse vectors of the feature vectors 23g of the desired content 23a is small, and use the retrieved feature vectors 23g as feature vectors 23g of undesired content 23a.

This processing reduces the load of creating the provisional dictionary 25a. The distance calculated here is a distance between one set of M vectors and another set of M vectors; such distances are as already described. In these cases too, a condition that the distance be at most a predetermined value may be added or used instead. Under this condition the search may return no content 23a at all, which prevents inappropriate content from being retrieved as undesired content 23a.
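The alternative processing of steps S24m and S24n can be sketched as follows. The sketch takes the inverse vector to be the reflection of a feature vector about a representative vector, following the wording of the claims, and it simplifies each content item to a single feature vector instead of M vectors. Both simplifications, together with all names and numeric values, are assumptions made for illustration.

```python
import math

def reflect(x, center):
    """Vector symmetric to x about the representative vector center."""
    return [2.0 * c - xi for xi, c in zip(x, center)]

def nearest_content(target, library, max_distance=None):
    """Id of the stored vector closest to target, or None if too far."""
    best_id, best_d = None, math.inf
    for cid, vec in library.items():
        d = math.dist(target, vec)
        if d < best_d:
            best_id, best_d = cid, d
    if max_distance is not None and best_d > max_distance:
        return None  # nothing suitable stored; avoid an inappropriate pick
    return best_id

# Hypothetical stored feature vectors, one per content item.
library = {"a": [1.0, 1.0], "b": [3.0, 3.0], "c": [-1.0, -1.2]}
center = [0.0, 0.0]            # assumed representative vector
undesired = library["a"]       # content designated as undesired

target = reflect(undesired, center)
found = nearest_content(target, library)
rejected = nearest_content(target, library, max_distance=0.1)
```

Because the result is always a stored vector, no fictitious feature vector enters the dictionary, and the optional threshold lets the search return nothing rather than a poor match.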

In the collective feature quantity calculation unit 24 according to the second embodiment, the processing for the desired content 23a described in steps S24b, S24m, and S24e must be performed in that order, and the processing for the undesired content 23a described in steps S24c, S24n, and S24g must be performed in that order. Apart from this constraint, either sequence may be performed before the other, as explained in the description of the operation of the collective feature quantity calculation unit 24 according to the first embodiment.

(Other Embodiments)
In each of the above embodiments, the content data 23c of the content 23a is music data, but this is not a limitation. It may be any kind of audio data, such as human speech, mechanical sounds, or sounds occurring in nature that are used, for example, as background sound, and the apparatus naturally operates in the same way.

When the content data 23c is a moving image, the same processing is possible if the temporal change in the luminance or color difference of a specific pixel is handled in the same way as the temporal change in audio. The same processing is also possible by applying a discrete cosine transform to a macroblock of arbitrary size and extracting its frequency components; the macroblock may be the entire image. Furthermore, when the content data 23c is a still image, the same processing is possible if the change in luminance or color difference along a specific line of pixels is handled in the same way as a temporal change.

The information processing apparatus of the present invention may be either a stationary apparatus or a portable apparatus. It can naturally be applied to any apparatus that stores content, such as a moving-image playback apparatus with a hard disk, a video camera, a video playback apparatus, a personal computer, a portable music player, or a mobile communication terminal. The elements described in the above embodiments may be combined as appropriate. The present invention is not limited to the configurations described above, and various modifications are possible.

FIG. 1 is a block diagram showing the configuration of an information processing apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the structure of content according to the embodiment of the present invention.
FIG. 3 is a diagram showing an example of the structure of a dictionary according to the embodiment of the present invention.
FIG. 4 is a flowchart showing the operation of the feature extraction unit according to the embodiment of the present invention.
FIG. 5 is a flowchart showing the feature quantity calculation operation relating to cepstrum coefficients in the feature extraction unit according to the embodiment of the present invention.
FIG. 6 is a flowchart showing the operation of the collective feature quantity calculation unit according to the first embodiment of the present invention.
FIG. 7 is a flow diagram showing the flow of feature quantities resulting from the operation of the collective feature quantity calculation unit according to the embodiment of the present invention.
FIG. 8 is a flowchart showing the operation of the music search unit according to the embodiment of the present invention (the operation of searching, with reference to the dictionary, for content of a predetermined degree of desirability).
FIG. 9 is a diagram showing the concept of the range of music retrieved by the music search unit according to the embodiment of the present invention.
FIG. 10 is a flow diagram showing the flow of content and other information resulting from the operation of the information processing apparatus according to the embodiment of the present invention.
FIG. 11 is a flowchart showing the operation of the collective feature quantity calculation unit according to the second embodiment of the present invention.

Explanation of symbols

21 music registration unit
22 feature extraction unit
23 content storage unit
23a content
23b content identifier
23c content data
23g feature quantity vector
23g1 first feature quantity vector
23g2 second feature quantity vector
23gi i-th feature quantity vector
23gm m-th feature quantity vector
23gM M-th feature quantity vector
24 group feature quantity calculation unit
25 dictionary storage unit
25a dictionary
25b dictionary identifier
25c dictionary name
25d first centroid vector
25e inverse of the first variance-covariance matrix
25f second centroid vector
25g inverse of the second variance-covariance matrix
26 music search unit

Claims

Content storage means for storing content;
Group feature quantity calculating means for searching the content storage means for content of a first type, calculating a feature quantity of a first group consisting of the first-type content from the feature quantity vectors of the first-type content obtained by the search, searching the content storage means for content of a second type, namely content having a feature quantity vector whose distance from a vector symmetric to the feature quantity vector of the first-type content obtained by the search, about a representative vector of the feature quantity vectors of the content, is the shortest and/or is smaller than a predetermined value, and calculating a feature quantity of a second group consisting of the second-type content from the feature quantity vectors of the second-type content thus found;
Content search means for calculating a first distance between the feature quantity vector of content stored in the content storage means and the feature quantity of the first group calculated by the group feature quantity calculating means, and a second distance between the feature quantity vector of that content and the feature quantity of the second group calculated by the group feature quantity calculating means, and for retrieving content whose first distance is smaller than the second distance as content of the first type and/or content whose first distance is larger than the second distance as content of the second type.
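
The processing recited in the claim above can be illustrated with a minimal sketch. It rests on several assumptions that the claim does not state: the representative vector is taken to be the mean of all stored feature quantity vectors, the nearest-to-mirror search uses Euclidean distance, each group's feature quantity is a centroid vector paired with an inverse variance-covariance matrix (as the symbol list suggests for dictionary 25a), and the first and second distances are squared Mahalanobis distances. All function names and the small `ridge` term are hypothetical and serve only this illustration. Only the shortest-distance variant of the second-type search is shown.

```python
from typing import List, Optional, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]
Group = Tuple[Vector, Matrix]  # (centroid, inverse variance-covariance matrix)

def mean_vector(vectors: Sequence[Vector]) -> Vector:
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def reflect(v: Vector, center: Vector) -> Vector:
    # Vector symmetric to v about center: 2 * center - v.
    return [2.0 * c - x for x, c in zip(v, center)]

def sq_euclid(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))

def covariance(vectors: Sequence[Vector], mean: Vector, ridge: float = 1e-6) -> Matrix:
    d = len(mean)
    cov = [[0.0] * d for _ in range(d)]
    for v in vectors:
        diff = [x - m for x, m in zip(v, mean)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j] / len(vectors)
    for i in range(d):
        cov[i][i] += ridge  # assumption: small ridge keeps the matrix invertible
    return cov

def invert(m: Matrix) -> Matrix:
    # Gauss-Jordan elimination with partial pivoting.
    d = len(m)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(d)]
           for i, row in enumerate(m)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(d):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[d:] for row in aug]

def group_feature(vectors: Sequence[Vector]) -> Group:
    centroid = mean_vector(vectors)
    return centroid, invert(covariance(vectors, centroid))

def build_groups(all_vectors: Sequence[Vector], first_ids: Sequence[int]) -> Tuple[Group, Group]:
    rep = mean_vector(all_vectors)  # assumption: representative vector = overall mean
    candidates = [i for i in range(len(all_vectors)) if i not in first_ids]
    second_ids = set()
    for i in first_ids:
        target = reflect(all_vectors[i], rep)
        # Second-type content: the stored vector closest to the mirrored vector.
        second_ids.add(min(candidates, key=lambda j: sq_euclid(all_vectors[j], target)))
    first = [all_vectors[i] for i in first_ids]
    second = [all_vectors[j] for j in sorted(second_ids)]
    return group_feature(first), group_feature(second)

def mahalanobis_sq(v: Vector, group: Group) -> float:
    centroid, inv_cov = group
    diff = [x - c for x, c in zip(v, centroid)]
    n = len(diff)
    return sum(diff[i] * inv_cov[i][j] * diff[j] for i in range(n) for j in range(n))

def classify(v: Vector, g1: Group, g2: Group) -> Optional[str]:
    d1, d2 = mahalanobis_sq(v, g1), mahalanobis_sq(v, g2)
    if d1 < d2:
        return "first"
    if d1 > d2:
        return "second"
    return None  # the claim leaves the equal-distance case undefined

# Toy data: indices 0-3 are the first-type (e.g. liked) items found by the initial search.
vectors = [[1.0, 1.0], [1.4, 1.0], [1.0, 1.4], [1.4, 1.4],
           [-1.0, -1.0], [-1.4, -1.0], [-1.0, -1.4], [-1.4, -1.4], [0.0, 0.0]]
g1, g2 = build_groups(vectors, [0, 1, 2, 3])
print(classify([0.9, 1.0], g1, g2))
print(classify([-1.1, -1.3], g1, g2))
```

In this sketch the second group is derived purely from the first, so only the first-type content needs to be designated; the and/or variant with a predetermined distance threshold would replace the `min` selection with a filter on `sq_euclid`.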