JPWO2021230180A5 - Google Patents
- Publication number
- JPWO2021230180A5
- Authority
- JP
- Japan
- Prior art keywords
- text image
- arrival
- information processing
- estimated
- sound
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Claims (17)
An information processing device comprising:
means for acquiring sound collected by a plurality of microphones;
means for estimating a direction of arrival of the acquired sound;
means for extracting, by a beamforming process based on the estimated direction of arrival, the sound corresponding to that direction of arrival;
means for generating a text image corresponding to the extracted sound;
means for determining a presentation mode of the text image by referring to the estimated direction of arrival; and
means for presenting the text image in the determined presentation mode.
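The direction-of-arrival estimation and beamforming elements recited above can be illustrated with a minimal sketch. It is not taken from the patent: it assumes exactly two microphones (named `left` and `right` here for illustration), a cross-correlation time-delay estimate, and a plain delay-and-sum beamformer. The function names, the speed-of-sound constant, and the sign convention for the angle are all assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def best_lag(left, right, max_lag):
    """Return the lag (in samples) by which `left` trails `right`."""
    def score(lag):
        # Cross-correlation of the two channels at the given lag.
        return sum(left[n + lag] * right[n]
                   for n in range(len(right))
                   if 0 <= n + lag < len(left))
    return max(range(-max_lag, max_lag + 1), key=score)


def estimate_doa(left, right, mic_distance, sample_rate):
    """Estimate the arrival angle in radians.

    Assumed convention: 0 is straight ahead, positive angles point
    toward the right microphone (sound reaches it first).
    """
    max_lag = math.ceil(mic_distance / SPEED_OF_SOUND * sample_rate)
    lag = best_lag(left, right, max_lag)
    sin_theta = lag * SPEED_OF_SOUND / (sample_rate * mic_distance)
    return math.asin(max(-1.0, min(1.0, sin_theta)))


def delay_and_sum(left, right, theta, mic_distance, sample_rate):
    """Steer a two-microphone delay-and-sum beam toward `theta`."""
    lag = round(mic_distance * math.sin(theta) / SPEED_OF_SOUND * sample_rate)
    out = []
    for n, sample in enumerate(left):
        m = n - lag  # delay the right channel so both channels line up
        delayed = right[m] if 0 <= m < len(right) else 0.0
        out.append(0.5 * (sample + delayed))
    return out
```

Sound from the estimated direction adds coherently after alignment, while sound from other directions is partially cancelled. A practical device would likely use more microphones and sub-sample delays; this sketch only shows how the estimated direction feeds the extraction step.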
The information processing device according to any one of claims 1 to 5, further comprising means for estimating a speaker attribute by analyzing the acquired sound,
wherein the means for determining the presentation mode determines the presentation mode of the text image by referring to the estimated speaker attribute.
The information processing device according to any one of claims 1 to 6, further comprising means for acquiring, from a sensor, a sensing signal relating to the area in which sound is collected by the plurality of microphones,
wherein the means for determining the presentation mode determines the presentation mode of the text image by referring to the acquired sensing signal.
The information processing device according to claim 7, wherein the sensing signal is an imaging signal obtained by imaging the area with an image sensor.
The information processing device according to any one of claims 1 to 7, further comprising:
means for acquiring an imaging signal obtained by imaging the area; and
means for converting the acquired imaging signal into a captured image,
wherein the means for presenting the text image presents the text image superimposed on the captured image.
The information processing device according to claim 8 or claim 9, further comprising means for estimating a speaker attribute by analyzing the imaging signal,
wherein the means for determining the presentation mode determines the presentation mode of the text image by referring to the estimated speaker attribute.
The information processing device according to any one of claims 1 to 10, further comprising means for extracting, from the acquired sound, speech uttered by a person,
wherein the means for estimating the direction of arrival estimates the direction of arrival of the extracted speech, and
the means for generating the text image generates a text image corresponding to the extracted speech.
A display device comprising:
means for acquiring sound collected by a plurality of microphones;
means for estimating a direction of arrival of the acquired sound;
means for extracting, by a beamforming process based on the estimated direction of arrival, the sound corresponding to that direction of arrival;
means for generating a text image corresponding to the extracted sound;
means for determining a presentation mode of the text image by referring to the estimated direction of arrival; and
a display that presents the text image in the determined presentation mode.
The display device according to any one of claims 12 to 14, wherein the means for acquiring the sound acquires, via the means for communicating, the sound collected by the plurality of microphones included in the microphone module.
A method comprising:
acquiring sound collected by a plurality of microphones;
estimating a direction of arrival of the acquired sound;
extracting, by a beamforming process based on the estimated direction of arrival, the sound corresponding to that direction of arrival;
generating a text image corresponding to the extracted sound;
determining a presentation mode of the text image by referring to the estimated direction of arrival; and
presenting the text image in the determined presentation mode.
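The step of determining a presentation mode from the estimated direction of arrival is stated abstractly in the claim. One possible reading, offered only as a hedged sketch, is to place the text image horizontally on the display according to the arrival angle. The `Presentation` type, the `decide_presentation` function, the 1280-pixel display width, and the 90-degree field of view are all hypothetical and do not come from the patent.

```python
import math
from dataclasses import dataclass


@dataclass
class Presentation:
    x: int         # horizontal pixel position of the text image
    in_view: bool  # False when the source is outside the assumed field of view


def decide_presentation(theta, display_width=1280, fov=math.radians(90)):
    """Map an arrival angle (radians, 0 = straight ahead) to a screen position.

    Angles outside the assumed field of view are pinned to the nearest
    display edge and flagged, so a caller could present them differently.
    """
    half = fov / 2
    in_view = -half <= theta <= half
    clamped = max(-half, min(half, theta))
    x = round((clamped + half) / fov * (display_width - 1))
    return Presentation(x=x, in_view=in_view)
```

Under these assumptions, a speaker straight ahead gets text near the center of the display, while a speaker far to one side gets text pinned to that edge with `in_view` set to `False`.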
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2020082945 | 2020-05-11 | ||
PCT/JP2021/017640 WO2021230180A1 (en) | 2020-05-11 | 2021-05-10 | Information processing device, display device, presentation method, and program |
Publications (2)
Publication Number | Publication Date |
---|---|
JPWO2021230180A1 (en) | 2021-11-18 |
JPWO2021230180A5 (en) | 2024-05-21 |
Family
ID=78525808
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2022521892A (Pending; published as JPWO2021230180A1) | | 2020-05-11 | 2021-05-10 |
Country Status (2)
Country | Link |
---|---|
JP (1) | JPWO2021230180A1 (en) |
WO (1) | WO2021230180A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7399413B1 (en) * | 2022-02-21 | 2023-12-18 | ピクシーダストテクノロジーズ株式会社 | Information processing device, information processing method, and program |
WO2023249073A1 (en) * | 2022-06-23 | 2023-12-28 | ピクシーダストテクノロジーズ株式会社 | Information processing device, display device, information processing method, and program |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5666219B2 (en) * | 2010-09-10 | 2015-02-12 | ソフトバンクモバイル株式会社 | Glasses-type display device and translation system |
US9848260B2 (en) * | 2013-09-24 | 2017-12-19 | Nuance Communications, Inc. | Wearable communication enhancement device |
WO2016075782A1 (en) * | 2014-11-12 | 2016-05-19 | 富士通株式会社 | Wearable device, display control method, and display control program |
JP2019057047A (en) * | 2017-09-20 | 2019-04-11 | 株式会社東芝 | Display control system, display control method and program |
- 2021-05-10: JP application JP2022521892A (published as JPWO2021230180A1), status: pending
- 2021-05-10: WO application PCT/JP2021/017640 (published as WO2021230180A1), status: application filing
Similar Documents
Publication | Title | Publication Date |
---|---|---|
JP6520878B2 (en) | Voice acquisition system and voice acquisition method | |
JP2019531538A5 (en) | ||
WO2019206186A1 (en) | Lip motion recognition method and device therefor, and augmented reality device and storage medium | |
JP7100824B2 (en) | Data processing equipment, data processing methods and programs | |
US20170188173A1 (en) | Method and apparatus for presenting to a user of a wearable apparatus additional information related to an audio scene | |
JP2012220959A (en) | Apparatus and method for determining relevance of input speech | |
JP2007147762A (en) | Speaker predicting device and speaker predicting method | |
JP2013527947A5 (en) | ||
JPWO2021230180A5 (en) | ||
JP2009060394A (en) | Imaging device, image detector and program | |
WO2010010736A1 (en) | Conference image creating method, conference system, server device, conference device, and so forth | |
JP7427408B2 (en) | Information processing device, information processing method, and information processing program | |
KR101508092B1 (en) | Method and system for supporting video conference | |
WO2021017096A1 (en) | Method and installation for entering facial information into database | |
JP2013042356A (en) | Image processor, image processing method and program | |
WO2016159938A1 (en) | Locating individuals using microphone arrays and voice pattern matching | |
JP7388188B2 (en) | Speaker recognition system, speaker recognition method, and speaker recognition program | |
EP2503545A1 (en) | Arrangement and method relating to audio recognition | |
JP2011004007A (en) | Television conference device, television conference method, and program for the same | |
TW200411627A (en) | Robottic vision-audition system | |
WO2021230180A1 (en) | Information processing device, display device, presentation method, and program | |
JP2010148132A (en) | Imaging device, image detector and program | |
US10665243B1 (en) | Subvocalized speech recognition | |
KR101976937B1 (en) | Apparatus for automatic conference notetaking using mems microphone array | |
JP6798258B2 (en) | Generation program, generation device, control program, control method, robot device and call system |