JP2023531850A - Audio data identification device - Google Patents
Audio data identification device
- Publication number
- JP2023531850A (application JP2022554581A)
- Authority
- JP
- Japan
- Prior art keywords
- audio data
- audio
- identification information
- unit
- resource
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/65—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Artificial Intelligence (AREA)
- Signal Processing (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Databases & Information Systems (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Library & Information Science (AREA)
- Quality & Reliability (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Telephonic Communication Services (AREA)
Claims (8)
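- 1. An audio data identification device comprising:
a communication unit (100) that collects and transmits arbitrary audio data; and
a control unit (200) that identifies the collected audio data,
wherein the control unit (200) comprises:
a parsing unit (210) that parses the collected audio data in predetermined units;
an extraction unit (220) that selects any one of the plurality of parsed sections of the audio data as an audio resource;
a matching unit (230) that matches identification information to the audio resource by means of a preinstalled artificial intelligence algorithm; and
a verification unit (240) that verifies the identification information matched to the audio resource.
- 2. The audio data identification device according to claim 1, wherein the artificial intelligence algorithm learns by receiving, as input, the determination result for the identification information determined by the verification unit (240).
- 3. The audio data identification device according to claim 2, wherein the verification unit (240) determines the identification information based on a user's input via an external terminal.
- 4. The audio data identification device according to claim 2, wherein the verification unit (240) determines the identification information based on inputs from an unspecified number of people via external terminals, and discards the audio resource when the error range of those people's determination results is large.
- 5. The audio data identification device according to claim 3 or claim 4, wherein the external terminal receives an input of true or false for the matched identification information and transmits it to the verification unit (240).
- 6. The audio data identification device according to claim 3 or claim 4, wherein the external terminal receives an input selecting any one of a plurality of identifiers provided in advance, determines whether the selected identifier is identical to the identification information matched to the audio resource, and transmits the result to the verification unit (240).
- 7. The audio data identification device according to claim 3 or claim 4, wherein the matching unit (230) matches identification information within a predesignated category range and treats the audio resource as unclassified data when it is not recognized within that range, and
wherein, for an audio resource treated as unclassified data, the identification information is received from the external terminal as a free-form input and transmitted to the verification unit (240).
- 8. The audio data identification device according to claim 1, wherein the arbitrary audio data is collected by a predesignated keyword.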
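The claimed data flow (parsing unit 210, extraction unit 220, matching unit 230, verification unit 240) can be illustrated with a minimal Python sketch. It is not the patent's implementation. The function names, the `UNCLASSIFIED` label, the highest-energy selection rule, and the `min_agreement` threshold are all assumptions made for illustration. The claims only say that "any one" section is selected and that a resource is discarded when the evaluators' error range is "large".

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Set

# Hypothetical label standing in for the "unclassified data" of claim 7.
UNCLASSIFIED = "unclassified"


@dataclass
class AudioResource:
    samples: List[float]
    label: Optional[str] = None


def parse(samples: Sequence[float], unit: int) -> List[List[float]]:
    """Parsing unit (210): split the audio into sections of `unit` samples."""
    if unit <= 0:
        raise ValueError("unit must be positive")
    return [list(samples[i:i + unit]) for i in range(0, len(samples), unit)]


def extract(sections: List[List[float]]) -> AudioResource:
    """Extraction unit (220): select one section as the audio resource.

    Assumption: choose the section with the highest energy. The claims
    do not specify a selection rule.
    """
    best = max(sections, key=lambda s: sum(x * x for x in s))
    return AudioResource(samples=best)


def match(resource: AudioResource,
          classifier: Callable[[List[float]], str],
          categories: Set[str]) -> AudioResource:
    """Matching unit (230): attach identification information.

    `classifier` stands in for the preinstalled AI algorithm. Labels
    outside `categories` are treated as unclassified (claim 7).
    """
    label = classifier(resource.samples)
    resource.label = label if label in categories else UNCLASSIFIED
    return resource


def verify(votes: Sequence[bool], min_agreement: float = 0.7) -> Optional[bool]:
    """Verification unit (240): aggregate true/false inputs (claims 4 and 5).

    Returns True or False when the majority share reaches `min_agreement`,
    and None when evaluators disagree too much, meaning the resource
    should be discarded. The threshold is an illustrative assumption.
    """
    if not votes:
        return None
    yes = sum(votes)
    majority_share = max(yes, len(votes) - yes) / len(votes)
    if majority_share < min_agreement:
        return None
    return yes * 2 > len(votes)
```

A verdict returned by `verify` could then be fed back to the classifier as a training label, which is the learning loop that claim 2 describes. That step is left out here because the claims do not say how the algorithm is trained.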
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020200031064A KR102400903B1 (ko) | 2020-03-13 | 2020-03-13 | Audio data identification device |
KR10-2020-0031064 | 2020-03-13 | ||
PCT/KR2021/002496 WO2021182782A1 (ko) | 2021-02-26 | 2021-02-26 | Audio data identification device |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2023531850A (ja) | 2023-07-26 |
JP7470336B2 (ja) | 2024-04-18 |
Family
ID=77670727
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2022554581A Active JP7470336B2 (ja) | 2020-03-13 | 2021-02-26 | Audio data identification device |
Country Status (6)
Country | Link |
---|---|
US (1) | US20230178096A1 (ja) |
EP (1) | EP4120098A4 (ja) |
JP (1) | JP7470336B2 (ja) |
KR (1) | KR102400903B1 (ja) |
CN (1) | CN115298661A (ja) |
WO (1) | WO2021182782A1 (ja) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102501623B1 (ko) * | 2021-11-24 | 2023-02-21 | 주식회사 원아이디랩 | Royalty distribution method and system for fair and transparent settlement and distribution of royalties by verifying music |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1889172A (zh) * | 2005-06-28 | 2007-01-03 | 松下电器产业株式会社 | Sound classification system and method capable of adding and correcting sound classes |
JP6767322B2 (ja) | 2017-08-18 | 2020-10-14 | ヤフー株式会社 | Output control device, output control method, and output control program |
KR101986905B1 (ko) * | 2017-10-31 | 2019-06-07 | 전자부품연구원 | Audio volume control method and system based on signal analysis and deep learning |
KR102635811B1 (ko) * | 2018-03-19 | 2024-02-13 | 삼성전자 주식회사 | System for processing sound data and method of controlling the system |
US10832672B2 (en) * | 2018-07-13 | 2020-11-10 | International Business Machines Corporation | Smart speaker system with cognitive sound analysis and response |
KR20200016111A (ko) * | 2018-08-06 | 2020-02-14 | 주식회사 코클리어닷에이아이 | Audio information collection device and control method thereof |
US11069334B2 (en) | 2018-08-13 | 2021-07-20 | Carnegie Mellon University | System and method for acoustic activity recognition |
US11367438B2 (en) * | 2019-05-16 | 2022-06-21 | Lg Electronics Inc. | Artificial intelligence apparatus for recognizing speech of user and method for the same |
KR20190106902A (ko) * | 2019-08-29 | 2019-09-18 | 엘지전자 주식회사 | Sound analysis method and device |
-
2020
- 2020-03-13 KR KR1020200031064A patent/KR102400903B1/ko active IP Right Grant
-
2021
- 2021-02-26 EP EP21766949.8A patent/EP4120098A4/en active Pending
- 2021-02-26 JP JP2022554581A patent/JP7470336B2/ja active Active
- 2021-02-26 CN CN202180019728.7A patent/CN115298661A/zh active Pending
- 2021-02-26 WO PCT/KR2021/002496 patent/WO2021182782A1/ko active Application Filing
- 2021-02-26 US US17/911,078 patent/US20230178096A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
EP4120098A1 (en) | 2023-01-18 |
CN115298661A (zh) | 2022-11-04 |
WO2021182782A1 (ko) | 2021-09-16 |
US20230178096A1 (en) | 2023-06-08 |
EP4120098A4 (en) | 2024-03-20 |
JP7470336B2 (ja) | 2024-04-18 |
KR20210115379A (ko) | 2021-09-27 |
KR102400903B1 (ko) | 2022-05-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
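KR102100976B1 (ko) | | Digital assistant processing with a stack data structure background |
CN108335695B (zh) | | Voice control method and apparatus, computer device, and storage medium |
US11194893B2 (en) | | Authentication of audio-based input signals |
WO2020119448A1 (zh) | | Voice information verification |
CN106294774A (zh) | | User personalized data processing method and apparatus based on dialogue service |
PH12020551830A1 (en) | | Computerized systems and methods for determining authenticity using micro expressions |
CN107147618A (zh) | | User registration method and apparatus, and electronic device |
CN109271533A (zh) | | Multimedia file retrieval method |
CN110047481A (zh) | | Method and apparatus for speech recognition |
US11757870B1 (en) | | Bi-directional voice authentication |
CN111368098B (zh) | | Scenario-based legal consultation evaluation system |
CN107678287A (zh) | | Device control method and apparatus, and computer-readable storage medium |
CN109729067A (zh) | | Voice check-in method, apparatus, device, and computer storage medium |
CN109949798A (zh) | | Audio-based advertisement detection method and apparatus |
CN107666536A (zh) | | Method and apparatus for finding a terminal, and apparatus for finding a terminal |
WO2019101099A1 (zh) | | Video program identification method, device, terminal, system, and storage medium |
CN107451185B (zh) | | Recording method, read-aloud system, computer-readable storage medium, and computer apparatus |
CN112509586A (zh) | | Telephone channel voiceprint recognition method and apparatus |
JP7470336B2 (ja) | | Audio data identification device |
TWI823055B (zh) | | Electronic resource push method and system |
CN106782498A (zh) | | Voice information playback method, apparatus, and terminal |
CN111161759B (zh) | | Audio quality evaluation method and apparatus, electronic device, and computer storage medium |
KR20200070783A (ko) | | Method for controlling an alarm of a user terminal and method for determining an alarm release mission of a server |
CN113571063A (zh) | | Speech signal recognition method and apparatus, electronic device, and storage medium |
CN107610697B (zh) | | Audio processing method and electronic device |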
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 2022-09-07 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 2023-05-23 |
| | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 2023-10-24 |
| | A711 | Notification of change in applicant | Free format text: JAPANESE INTERMEDIATE CODE: A711; Effective date: 2023-11-20 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A821; Effective date: 2023-11-21 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 2024-01-24 |
| | TRDD | Decision of grant or rejection written | |
| | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 2024-03-05 |
| | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 2024-03-28 |
| | R150 | Certificate of patent or registration of utility model | Ref document number: 7470336; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |