JP4021851B2 - Method for characterizing an audio signal - Google Patents
Method for characterizing an audio signal
- Publication number
- JP4021851B2
- Authority
- JP
- Japan
- Prior art keywords
- signal
- audio signal
- classification
- energy
- parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/63—Querying
- G06F16/632—Query formulation
- G06F16/634—Query by example, e.g. query by humming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/683—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Library & Information Science (AREA)
- Mathematical Physics (AREA)
- Acoustics & Sound (AREA)
- Human Computer Interaction (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Stereophonic System (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
- Auxiliary Devices For Music (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Description
φ(1, k, t) = P(1, k, t)
φ(j, k, t) = P(j, k, t) − P(1, k, t)·f(k)/f(l), (for k > 1)
where f(k) is the center frequency of channel k.
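Claims (6)
- A method for characterizing, according to specific parameters, an audio signal x(t) that varies as a function of time t over a duration D in different frequency bands k and is therefore denoted x(k, t), the method comprising:
storing the signal x(t);
computing and storing an energy signal E(k, t) of the signal x(k, t) for each frequency band k, k varying from 1 to K, according to a time window h(t) of duration 2N; and
in a second step, using a time window h'(t) of duration 2N', computing and storing the energy F(j, k, t) of the energy signal E(k, t) over the duration 2N' in frequency band j, j varying from 1 to J, and the phase φ(j, k, t) of the energy signal E(k, t) for frequency band j,
wherein the resulting J×K values of the energy F(j, k, t) and of the phase φ(j, k, t) constitute the specific parameters extracted over the duration 2N' of the audio signal x(t),
the method further comprising repeating the above computation at regular intervals to obtain all the specific parameters for the duration D of the audio signal x(t).
- The method according to claim 1, further comprising:
computing, for each frequency band j, the mean value of the energy signal E(k, t) over 2N' seconds;
repeating the above computation at regular intervals to obtain all the specific parameters for the duration D of the audio signal x(t); and
including the resulting mean values in the specific parameters of the audio signal x(t).
- The method according to claim 1 or 2, comprising:
treating the specific parameters of the audio signal x(t) as the components of a vector representing x(t);
defining classes that group together the closest vectors; and
recording the classes.
- The method according to claim 3, wherein the classes have inter-class distances and intra-class distances, the method comprising:
selecting, from the specific parameters, those parameters that yield inter-class distances that are relatively large compared with the intra-class distances; and
recording the selected parameters.
- An audio signal identification device comprising a database server having means for implementing the method for characterizing an audio signal according to specific parameters as claimed in any one of claims 1 to 4, and search means for searching for said signal in a database.
- The device according to claim 5 in combination with claim 3 or 4, wherein the search means comprises means for recognizing the class to which an audio signal belongs, and means for comparing the specific parameters of an unknown audio signal with the specific parameters in the database using a nearest-neighbor algorithm.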
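The extraction steps of claims 1 and 2 and the nearest-neighbor comparison of claim 6 can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation. The function names (`band_energies`, `modulation_features`, `nearest_neighbor`), the Hann windows, the use of individual FFT bins as the second-stage frequency bands j, the Euclidean distance, and every numeric value are assumptions introduced here. The phase is taken as the raw FFT angle, and the phase correction given in the Description is not applied.

```python
import numpy as np

def band_energies(x, sr, band_edges, win_len, hop):
    """First stage: energy signal E(k, t) of x in each frequency band k.

    Assumption: a Hann window stands in for h(t), and the band energy is
    the sum of squared FFT magnitudes inside [band_edges[k], band_edges[k+1]).
    """
    window = np.hanning(win_len)
    freqs = np.fft.rfftfreq(win_len, d=1.0 / sr)
    n_bands = len(band_edges) - 1
    n_frames = 1 + (len(x) - win_len) // hop
    E = np.zeros((n_bands, n_frames))
    for t in range(n_frames):
        frame = x[t * hop : t * hop + win_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        for k in range(n_bands):
            in_band = (freqs >= band_edges[k]) & (freqs < band_edges[k + 1])
            E[k, t] = power[in_band].sum()
    return E

def modulation_features(E, win2_len, hop2, J):
    """Second stage: energy F(j, k, t) and phase phi(j, k, t) of E(k, t).

    Assumption: a Hann window stands in for h'(t), and FFT bins 1..J of the
    energy signal play the role of the frequency bands j. The mean of E over
    each block (claim 2) is appended to the parameter vector.
    """
    if J > win2_len // 2:
        raise ValueError("J must not exceed win2_len // 2")
    window = np.hanning(win2_len)
    vectors = []
    # Repeat the computation at regular intervals (every hop2 energy frames).
    for start in range(0, E.shape[1] - win2_len + 1, hop2):
        block = E[:, start : start + win2_len]              # shape (K, win2_len)
        spec = np.fft.rfft(block * window, axis=1)[:, 1 : J + 1]  # skip DC bin
        F = np.abs(spec) ** 2                                # J*K energies
        phi = np.angle(spec)                                 # J*K raw phases
        mean = block.mean(axis=1)                            # K mean values
        vectors.append(np.concatenate([F.ravel(), phi.ravel(), mean]))
    return np.array(vectors)

def nearest_neighbor(query, database):
    """Index of the database vector closest to query (Euclidean distance)."""
    distances = np.linalg.norm(database - query, axis=1)
    return int(np.argmin(distances))

# Synthetic test signal: a 1 kHz tone with a 4 Hz amplitude modulation.
sr = 8000
t = np.arange(2 * sr) / sr
x = (1 + 0.5 * np.cos(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)

E = band_energies(x, sr, band_edges=[0, 500, 1500, 4000], win_len=256, hop=128)
feats = modulation_features(E, win2_len=64, hop2=32, J=8)
print(E.shape, feats.shape)
print(nearest_neighbor(feats[0], feats))
```

Each row of `feats` is one parameter vector in the sense of claim 3. Class definition and parameter selection (claims 3 and 4) are left out of this sketch; in practice the raw phases and energies would likely need scaling before a Euclidean distance is meaningful.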
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR0116949A FR2834363B1 (fr) | 2001-12-27 | 2001-12-27 | Method for characterizing a sound signal |
PCT/FR2002/004549 WO2003056455A1 (fr) | 2001-12-27 | 2002-12-24 | Method for characterizing a sound signal |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2005513576A (ja) | 2005-05-12 |
JP4021851B2 (ja) | 2007-12-12 |
Family
ID=8871036
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2003556905A Expired - Lifetime JP4021851B2 (ja) | 2001-12-27 | 2002-12-24 | Method for characterizing an audio signal |
Country Status (8)
Country | Link |
---|---|
US (1) | US20050163325A1 (ja) |
EP (1) | EP1459214B1 (ja) |
JP (1) | JP4021851B2 (ja) |
AT (1) | ATE498163T1 (ja) |
AU (1) | AU2002364878A1 (ja) |
DE (1) | DE60239155D1 (ja) |
FR (1) | FR2834363B1 (ja) |
WO (1) | WO2003056455A1 (ja) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8918316B2 (en) * | 2003-07-29 | 2014-12-23 | Alcatel Lucent | Content identification system |
DE102004021404B4 (de) * | 2004-04-30 | 2007-05-10 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Watermark embedding |
JP4665836B2 (ja) * | 2006-05-31 | 2011-04-06 | 日本ビクター株式会社 | Music classification device, music classification method, and music classification program |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS57147695A (en) * | 1981-03-06 | 1982-09-11 | Fujitsu Ltd | Voice analysis system |
JPS6193500A (ja) * | 1984-10-12 | 1986-05-12 | 松下電器産業株式会社 | Speech recognition device |
JPH0519782A (ja) * | 1991-05-02 | 1993-01-29 | Ricoh Co Ltd | Speech feature extraction device |
JP3336619B2 (ja) * | 1991-07-12 | 2002-10-21 | ソニー株式会社 | Signal processing device |
US5536902A (en) * | 1993-04-14 | 1996-07-16 | Yamaha Corporation | Method of and apparatus for analyzing and synthesizing a sound by extracting and controlling a sound parameter |
US5918223A (en) * | 1996-07-22 | 1999-06-29 | Muscle Fish | Method and article of manufacture for content-based analysis, storage, retrieval, and segmentation of audio information |
US6201176B1 (en) * | 1998-05-07 | 2001-03-13 | Canon Kabushiki Kaisha | System and method for querying a music database |
JP2000114976A (ja) * | 1998-10-07 | 2000-04-21 | Nippon Columbia Co Ltd | Quantization noise reduction device and bit length extension device |
NL1013500C2 (nl) * | 1999-11-05 | 2001-05-08 | Huq Speech Technologies B V | Device for estimating the frequency content or the spectrum of a sound signal in a noisy environment. |
JP3475886B2 (ja) * | 1999-12-24 | 2003-12-10 | 日本電気株式会社 | Pattern recognition device and method, and recording medium |
US6657117B2 (en) * | 2000-07-14 | 2003-12-02 | Microsoft Corporation | System and methods for providing automatic classification of media entities according to tempo properties |
-
2001
- 2001-12-27 FR FR0116949A patent/FR2834363B1/fr not_active Expired - Lifetime
-
2002
- 2002-12-24 EP EP02801177A patent/EP1459214B1/fr not_active Expired - Lifetime
- 2002-12-24 US US10/500,441 patent/US20050163325A1/en not_active Abandoned
- 2002-12-24 WO PCT/FR2002/004549 patent/WO2003056455A1/fr active Application Filing
- 2002-12-24 DE DE60239155T patent/DE60239155D1/de not_active Expired - Lifetime
- 2002-12-24 JP JP2003556905A patent/JP4021851B2/ja not_active Expired - Lifetime
- 2002-12-24 AU AU2002364878A patent/AU2002364878A1/en not_active Abandoned
- 2002-12-24 AT AT02801177T patent/ATE498163T1/de not_active IP Right Cessation
Also Published As
Publication number | Publication date |
---|---|
FR2834363A1 (fr) | 2003-07-04 |
DE60239155D1 (de) | 2011-03-24 |
ATE498163T1 (de) | 2011-02-15 |
JP2005513576A (ja) | 2005-05-12 |
FR2834363B1 (fr) | 2004-02-27 |
US20050163325A1 (en) | 2005-07-28 |
AU2002364878A1 (en) | 2003-07-15 |
EP1459214B1 (fr) | 2011-02-09 |
EP1459214A1 (fr) | 2004-09-22 |
WO2003056455A1 (fr) | 2003-07-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6995309B2 (en) | System and method for music identification | |
JP5826291B2 (ja) | Method for extracting and matching characteristic fingerprints from audio signals | |
Lu et al. | Content-based audio classification and segmentation by using support vector machines | |
CN109493881B (zh) | Audio tagging processing method, apparatus, and computing device | |
CN101292280B (zh) | Method for deriving a feature set of an audio input signal | |
Zhang | Automatic singer identification | |
EP1760693B1 (en) | Extraction and matching of characteristic fingerprints from audio signals | |
US8352259B2 (en) | Methods and apparatus for audio recognition | |
JP2005522074A (ja) | Video indexing system and method based on speaker identification | |
US20060155399A1 (en) | Method and system for generating acoustic fingerprints | |
KR100676863B1 (ko) | System and method for providing a music search service | |
WO2006132596A1 (en) | Method and apparatus for audio clip classification | |
CN113327626A (zh) | Speech noise reduction method, apparatus, device, and storage medium | |
Dong et al. | A novel representation of bioacoustic events for content-based search in field audio data | |
Dong et al. | Similarity-based birdcall retrieval from environmental audio | |
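CN106098081A (zh) | Method and apparatus for identifying the sound quality of a sound file | |
CN109271501B (zh) | Audio database management method and system | |
JP4021851B2 (ja) | Method for characterizing an audio signal | |
CN117409761A (zh) | Frequency-modulation-based vocal synthesis method, apparatus, device, and storage medium | |
KR100766170B1 (ko) | Music summarization apparatus and method using multi-level quantization | |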
Joshi et al. | Extraction of feature vectors for analysis of musical instruments | |
Chu et al. | Peak-Based Philips Fingerprint Robust to Pitch-Shift for Audio Identification | |
Dong et al. | Birdcall retrieval from environmental acoustic recordings using image processing | |
Liang et al. | A Histogram Algorithm for Fast Audio Retrieval. | |
Gruhne | Robust audio identification for commercial applications |
Legal Events
Code | Title | Description |
---|---|---|
A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20061017 |
A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20070116 |
RD02 | Notification of acceptance of power of attorney | Free format text: JAPANESE INTERMEDIATE CODE: A7422; Effective date: 20070116 |
A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A821; Effective date: 20070116 |
A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20070403 |
A601 | Written request for extension of time | Free format text: JAPANESE INTERMEDIATE CODE: A601; Effective date: 20070629 |
A602 | Written permission of extension of time | Free format text: JAPANESE INTERMEDIATE CODE: A602; Effective date: 20070706 |
A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20070730 |
TRDD | Decision of grant or rejection written | |
A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20070904 |
A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20070927 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20101005; Year of fee payment: 3 |
R150 | Certificate of patent or registration of utility model | Ref document number: 4021851; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20111005; Year of fee payment: 4 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20121005; Year of fee payment: 5 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20121005; Year of fee payment: 5 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20131005; Year of fee payment: 6 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
EXPY | Cancellation because of completion of term | |