JP4657532B2

JP4657532B2 - Shape transfer device

Info

Publication number: JP4657532B2
Application number: JP2001275845A
Authority: JP
Inventors: 幸治石突; 美穂細野
Original assignee: Line Media Research Co Ltd
Current assignee: Line Media Research Co Ltd
Priority date: 2001-09-12
Filing date: 2001-09-12
Publication date: 2011-03-23
Anticipated expiration: 2021-09-12
Also published as: JP2003084784A

Abstract

PROBLEM TO BE SOLVED: To provide a shape transmission device that converts the shape and features of image data read by an image reader into sound data and outputs that sound data. SOLUTION: Image data is read, and its outline information 2 is extracted as the shape and features of the image data. The X axis of the outline information is replaced with sound position information 4, which relies on sound source localization (the human auditory function that judges the direction a sound comes from), and the Y axis is replaced with a sound frequency 5. The image data is thereby converted into sound data and output, so that the shape and features of the image data can be recognized by hearing.

Description

【０００１】
【発明の属する技術分野】
本発明は、画像読取装置により認識され読取られた画像データ（１）の形状及び特徴を音データに変換し出力する手法である。
【０００２】
【従来の技術】
従来の画像読取装置より認識され読み取られた画像データ（１）の情報を使用者に伝達する為の出力方法として、イメージスキャナやプリンタに見られる、入力された画像データ（１）をモニター又は用紙に出力し、視覚に伝達する方法が採られている。また、視覚障害者向けの歩行に見られる、認識された画像データ（１）の形状を振動および電気的刺激を発生させる事により、触覚に伝達を行なう方法、および入力された画像データ（１）の形状を、音を用いて聴覚に伝達を行なう方法が用いられている。
【０００３】
【発明が解決しようとする課題】
従来の方法において、認識され読み取られた画像データ（１）の情報をモニター又は用紙へ出力する場合、暗所などの視界不良な場所で活動する場合をはじめとした視界および視覚に何らかの障害がある状態では、画像データ（１）を視覚にて認識する事は難しいという問題がある。また振動又は電気的刺激を発生させ触覚に伝達する方法、及び音を用いて聴覚に伝達する方法を用いた場合、読み取られた画像データ（１）の形状及び特徴を把握する事が難しいという問題がある。
【０００４】
【課題を解決するための手段】
【０００５】
ＣＣＤカメラやスキャナなどの画像読取装置より読取られた画像データ（１）の輪郭情報（２）を抽出する事で、画像データ（１）の形状及び特徴を抽出する。具体的には、画像データ（１）の横方向をＸ軸とし、画像データ（１）の縦方向をＹ軸とし、Ｘ軸方向に画像読取を行ない、画像データ（１）の形状及び特徴として抽出した輪郭情報（２）を音データに変換する。この時、Ｘ軸を音の出力される位置情報（４）に置き換え、Ｙ軸を音の周波数（５）に置き換える事により画像データ（１）の形状及び特徴を音データとして出力し、視覚にて認識する事が困難であった視覚に障害を持つ者が画像データ（１）の形状及び特徴を聴覚にて認識する事が可能となる。
【０００６】
【発明の実施の形態】
物体そのものが２次元もしくは３次元の場合でも、物体の輪郭情報（２）は２次元で表現することが可能である。又、人間の聴覚は音源定位（３）と呼ばれる出力される音がどの方向から聞こえるか位置を判定する機能にて、出力される音の方向を認識し、出力される音圧レベルによる音の音量の大小、及び音の周波数（５）により出力される音の音程の高低差及び音色を認識している。そこで、画像データ（１）の横方向をＸ軸とし、画像データ（１）の縦方向をＹ軸とし、Ｘ軸方向に画像読取を行なう事で画像データ（１）の輪郭情報（２）を画像データ（１）の形状及び特徴として抽出する。抽出された輪郭情報（２）を、音データに変換し出力する事で画像データ（１）の形状及び特徴を聴覚にて認識する事を可能にする。音データ変換時においてはＸ軸を音の出力される位置情報（４）に置き換え、Ｙ軸を音の周波数（５）に置き換える。しかし、人間の聴覚能力において複数同時に出力する同一周波数の認識能力は無く、又、複数同時に出力する異なる周波数の認識能力には個人差がある。さらに出力される音の位置を認識する音源定位（３）においては、後から出力される音の位置の認識能力よりも、前及び左右から出力される音の位置の認識能力が高いことから、音の出力される位置の情報（４）を左側よりＸ軸方向に時間走査的に出力する事により、画像データ（１）の特徴を音データとして出力する。例として、画像データ（１）を音データとして出力を開始してよりｎポイント（６）ｎ＋１ポイント（７）ｎ＋２ポイント（８）経過した各ポイントでは、出力される音データが異なる。ｎポイント（６）では、左側より異なる周波数の音が２音同時出力される。ｎ+１ポイント（７）では中央より異なる周波数の音が２音同時出力されるが、ｎポイント（６）時に比べ周波数の高い音と周波数の低い音が出力される。ｎ＋２ポイント（８）では、音は出力されない。
【０００７】
図１１において、スキャナ入力部より画像データ（１）を画像読取後、画像変換（９）にて二値化処理（１０）を行なう。画像読取された画像データ（１）は特徴抽出処理（１１）として輪郭線追跡処理（１２）を行ない、画像データ（１）の特徴を抽出する。輪郭情報（２）画像計測（１３）後、画像データ（１）から音データにテーブルを置き換え（１４）、画像読取を行なったＸ方向を音の出力される位置情報（４）、輪郭情報（２）の縦方向をＹ座標とし、ソート（１５）を行なう。ソート（１５）された画像データ（１）は、音データとして出力する。
【０００８】
図１２において、画像データと特定のポイントを音データに変換する例を示す。
図では、取り込まれた画像（２１）と、特定のポイント（２０）が画面上に混在する。特定のポイント（２０）の位置は、特定のポイントが位置する周波数（２２）と、特定のポイントが位置する音量（２３）により解析される。
図内の（２１）Ｌ．ｃｈと（２１）Ｒ．ｃｈは、取り込まれた画像（２１）を解析した時の左右の出力レベルと波形をイメージした波形であり、取り込まれた画像（２１）の形状により周波数と左右の出力レベルが変化する。
図内の（２０）Ｌ．ｃｈＡと（２０）Ｒ．ｃｈＡは、特定のポイント（２０）の位置を解析し特定のポイントが位置する周波数（２２）と、特定のポイントが位置する音量（２３）によって周辺の画像データと聞き分けが出きるように一定間隔で発信と無発信を繰り返した例であり、図内の（２０）Ｌ．ｃｈＢと（２０）Ｒ．ｃｈＢは、特定のポイント（２０）の位置を解析し特定のポイントが位置する周波数（２２）を、特定のポイントが位置する音量（２３）によって、特定のポイント（２０）が位置する横方向を再生する時間に一定間隔で発信した例である。
そして、図内の（２１）Ｌ．ｃｈと（２０）Ｌ．ｃｈＡ又は（２０）Ｌ．ｃｈＢの波形と、（２１）Ｒ．ｃｈと（２０）Ｒ．ｃｈＡ又は（２０）Ｒ．ｃｈＢの波形は各々合成して出力する。
【０００９】
【図面の簡単な説明】
【図１】画像データの一例を表す
【図２】画像データの輪郭情報を抽出した図
【図３】画像データを音データに変換する定義を表した図
【図４】２次元図形の輪郭情報を抽出し、２軸データとして表す
【図５】３次元図形の輪郭情報を抽出し、２軸データとして表す
【図６】音源定位（３）を表わした図。
【図７】取り込んだ画像に対して出力される音データを表わす図。
【図８】出力される音データ（６）を表わす図。
【図９】出力される音データ（７）を表わす図。
【図１０】出力される音データを表わす図。
【図１１】画像データを音データに変換するフローの例
【図１２】画像データと特定のポイントを音データに変換する例
【符号の説明】
１画像データ
２輪郭情報
３音源定位
４音の出力される位置情報
５音の周波数
６ｎポイント
７ｎ＋１ポイント
８ｎ＋２ポイント
９画像変換
１０二値化処理
１１特徴抽出処理
１２輪郭線追跡処理
１３画像計測
１４テーブル置換
１５ソート
２０特定のポイント
２１取りこまれた画像
２２特定のポイントが位置する周波数
２３特定のポイントが位置する音量
[0001]
[Technical Field of the Invention]
The present invention relates to a technique for converting the shape and features of image data (1), recognized and read by an image reading device, into sound data and outputting that sound data.
[0002]
[Prior art]
To convey the information in image data (1) recognized and read by a conventional image reading device to the user, one established output method, seen in image scanners and printers, is to output the input image data (1) to a monitor or to paper and convey it visually. Other methods, seen in walking support for visually impaired people, convey the shape of the recognized image data (1) to the sense of touch by generating vibration or electrical stimulation, or convey the shape of the input image data (1) to the sense of hearing by using sound.
[0003]
[Problems to be solved by the invention]
In the conventional methods, when the information of the recognized and read image data (1) is output to a monitor or to paper, it is difficult to recognize the image data (1) visually whenever sight or the field of view is impaired in some way, for example when working in a dark place or another location with poor visibility. Likewise, with the methods that generate vibration or electrical stimulation for the sense of touch, or that use sound for the sense of hearing, it is difficult to grasp the shape and features of the read image data (1).
[0004]
[Means for Solving the Problems]
[0005]
The shape and features of the image data (1) read by an image reading device such as a CCD camera or a scanner are obtained by extracting the contour information (2) of that image data. Specifically, the horizontal direction of the image data (1) is taken as the X axis and the vertical direction as the Y axis, the image is read in the X-axis direction, and the contour information (2) extracted as the shape and features of the image data (1) is converted into sound data. In this conversion, the X axis is replaced with the position information (4) of the output sound and the Y axis is replaced with the sound frequency (5), so that the shape and features of the image data (1) are output as sound data. A visually impaired person, for whom visual recognition has been difficult, can thereby recognize the shape and features of the image data (1) by hearing.
[0006]
[Embodiments of the Invention]
Whether the object itself is two-dimensional or three-dimensional, its contour information (2) can be expressed in two dimensions. Human hearing, meanwhile, recognizes the direction of an output sound through sound source localization (3), the function that judges from which direction a sound is heard; it recognizes loudness from the output sound pressure level, and pitch and timbre from the sound frequency (5). Accordingly, the horizontal direction of the image data (1) is taken as the X axis and the vertical direction as the Y axis, and the image is read in the X-axis direction so that the contour information (2) is extracted as the shape and features of the image data (1). Converting the extracted contour information (2) into sound data and outputting it makes it possible to recognize the shape and features of the image data (1) by hearing. In the conversion to sound data, the X axis is replaced with the position information (4) of the output sound and the Y axis with the sound frequency (5).
However, human hearing cannot distinguish multiple sounds of the same frequency that are output at the same time, and the ability to distinguish multiple sounds of different frequencies that are output at the same time varies between individuals. Moreover, in sound source localization (3), the position of a sound coming from the front, the left, or the right is recognized better than the position of a sound coming from behind. The position information (4) of the output sound is therefore output from the left side along the X-axis direction in a time-scanning manner, and the features of the image data (1) are output as sound data.
As an example, the output sound data differs at point n (6), point n + 1 (7), and point n + 2 (8) after output of the image data (1) as sound data has started. At point n (6), two sounds of different frequencies are output simultaneously from the left side. At point n + 1 (7), two sounds of different frequencies are output simultaneously from the center, one higher and one lower in frequency than the sounds at point n (6). At point n + 2 (8), no sound is output.
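The axis replacement described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `contour_to_events`, the frequency range of 200 to 2000 Hz, the logarithmic pitch scale, and the 0.0 (left) to 1.0 (right) pan convention are all assumptions chosen for illustration. Each X coordinate becomes one time step with a pan position, and each contour Y coordinate in that column becomes one frequency.

```python
from collections import defaultdict


def contour_to_events(points, width, height, f_low=200.0, f_high=2000.0):
    """Turn contour points (x, y) into one sound event per image column.

    Returns a list of (x, pan, frequencies) ordered left to right.
    Assumptions (not from the patent): pan is 0.0 at the left edge and 1.0
    at the right edge; y = 0 is the top row and maps to the highest pitch;
    pitch is spaced logarithmically between f_low and f_high.
    """
    if width < 1 or height < 2:
        raise ValueError("need width >= 1 and height >= 2")
    columns = defaultdict(set)  # a set merges duplicate y values in a column
    for x, y in points:
        columns[x].add(y)
    events = []
    for x in range(width):  # left-to-right time scan along the X axis
        pan = x / (width - 1) if width > 1 else 0.5
        freqs = sorted(
            f_low * (f_high / f_low) ** ((height - 1 - y) / (height - 1))
            for y in columns.get(x, ())
        )
        events.append((x, pan, freqs))
    return events


# Three scan steps resembling points n, n + 1 and n + 2 in the text.
events = contour_to_events([(0, 1), (0, 3), (1, 0), (1, 4)], width=3, height=5)
for x, pan, freqs in events:
    print(x, pan, [round(f, 1) for f in freqs])
```

In this example the first column yields two tones on the left, the second yields a higher and a lower tone at the center, and the third is silent. Storing each column as a set reflects the remark that identical simultaneous frequencies cannot be told apart.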
[0007]
In FIG. 11, after the image data (1) is read from the scanner input unit, binarization processing (10) is performed as the image conversion (9). The read image data (1) then undergoes contour tracing processing (12) as the feature extraction processing (11), which extracts the features of the image data (1). After image measurement (13) of the contour information (2), the table is converted from image data (1) to sound data (14): the X direction in which the image was read becomes the position information (4) of the output sound, the vertical direction of the contour information (2) becomes the Y coordinate, and sorting (15) is performed. The sorted (15) image data (1) is output as sound data.
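The flow of FIG. 11 can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the patented implementation: the function names, the threshold of 128, and the convention that dark pixels are foreground are hypothetical, and the contour tracing step is replaced by a simpler boundary-pixel test (a foreground pixel with at least one background 4-neighbour).

```python
def binarize(gray, threshold=128):
    """Binarization: pixels darker than the threshold become foreground (1)."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def contour_pixels(binary):
    """Collect foreground pixels that touch the background or the image edge.

    This boundary test stands in for the contour tracing step; it finds the
    same kind of outline pixels but does not follow them in order.
    """
    height, width = len(binary), len(binary[0])

    def is_background(x, y):
        if not (0 <= x < width and 0 <= y < height):
            return True  # outside the image counts as background
        return binary[y][x] == 0

    points = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] and (
                is_background(x - 1, y) or is_background(x + 1, y)
                or is_background(x, y - 1) or is_background(x, y + 1)
            ):
                points.append((x, y))
    return points


def image_to_sorted_contour(gray, threshold=128):
    """Binarize, extract contour pixels, then sort by X first and Y second."""
    return sorted(contour_pixels(binarize(gray, threshold)))


# A 5 x 5 white image with a dark 3 x 3 square in the middle.
gray = [[255] * 5 for _ in range(5)]
for y in range(1, 4):
    for x in range(1, 4):
        gray[y][x] = 0
contour = image_to_sorted_contour(gray)
print(contour)
```

Sorting by X first puts the points in the order in which a left-to-right scan would reach them, so each run of equal X values can be played as one time step.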
[0008]
FIG. 12 shows an example of converting image data and a specific point into sound data.
In the figure, the captured image (21) and the specific point (20) coexist on the screen. The position of the specific point (20) is analyzed in terms of the frequency (22) at which the specific point is located and the volume (23) at which the specific point is located.
(21) L.ch and (21) R.ch in the figure are waveforms illustrating the left and right output levels and waveforms obtained when the captured image (21) is analyzed; the frequency and the left and right output levels change with the shape of the captured image (21).
(20) L.chA and (20) R.chA in the figure show an example in which the position of the specific point (20) is analyzed and, using the frequency (22) and the volume (23) at which the specific point is located, sound emission and silence are repeated at regular intervals so that the point can be told apart from the surrounding image data. (20) L.chB and (20) R.chB in the figure show an example in which the position of the specific point (20) is analyzed and the frequency (22) at which the specific point is located is emitted, at the volume (23) at which the specific point is located, at regular intervals during the time in which the horizontal position of the specific point (20) is being played back.
The waveform of (21) L.ch is then combined with that of (20) L.chA or (20) L.chB, and the waveform of (21) R.ch with that of (20) R.chA or (20) R.chB, and each combined waveform is output.
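The combination of an image tone with an intermittent marker tone can be sketched as below. This is a hypothetical Python illustration, not the method of FIG. 12 itself: the helper names, the 8000 Hz sample rate, and the constant-power panning law used to derive left and right levels are assumptions. The marker is gated on and off at a fixed interval, in the manner of the chA example, and then summed into each channel.

```python
import math


def pan_gains(pan):
    """Constant-power left/right gains for pan in [0.0 (left), 1.0 (right)]."""
    angle = pan * math.pi / 2
    return math.cos(angle), math.sin(angle)


def tone(freq, n_samples, rate=8000, amplitude=1.0):
    """Sine tone as a list of float samples."""
    return [amplitude * math.sin(2 * math.pi * freq * i / rate)
            for i in range(n_samples)]


def gate(samples, on_len, off_len):
    """Keep on_len samples, silence off_len samples, and repeat."""
    period = on_len + off_len
    return [s if (i % period) < on_len else 0.0 for i, s in enumerate(samples)]


def mix_stereo(image_mono, image_pan, marker_mono, marker_pan):
    """Pan both signals and sum them per channel; returns (left, right)."""
    il, ir = pan_gains(image_pan)
    ml, mr = pan_gains(marker_pan)
    left = [il * a + ml * b for a, b in zip(image_mono, marker_mono)]
    right = [ir * a + mr * b for a, b in zip(image_mono, marker_mono)]
    return left, right


rate = 8000
image = tone(440.0, 800, rate)                       # tone for the image contour
marker = gate(tone(880.0, 800, rate, amplitude=0.5), on_len=100, off_len=100)
left, right = mix_stereo(image, 0.25, marker, 0.75)  # marker sits further right
```

The resulting `left` and `right` lists could be written to a stereo audio file; scaling may be needed first, because the summed samples can exceed the range of -1.0 to 1.0.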
[0009]
[Brief description of the drawings]
FIG. 1 shows an example of image data.
FIG. 2 is a diagram in which the contour information of the image data has been extracted.
FIG. 3 is a diagram showing the definition used to convert image data into sound data.
FIG. 4 shows the contour information of a two-dimensional figure extracted and expressed as two-axis data.
FIG. 5 shows the contour information of a three-dimensional figure extracted and expressed as two-axis data.
FIG. 6 is a diagram showing sound source localization (3).
FIG. 7 is a diagram showing the sound data output for a captured image.
FIG. 8 is a diagram showing the output sound data (6).
FIG. 9 is a diagram showing the output sound data (7).
FIG. 10 is a diagram showing the output sound data.
FIG. 11 is an example of a flow for converting image data into sound data.
FIG. 12 is an example of converting image data and a specific point into sound data.
[Description of symbols]
1 Image data
2 Contour information
3 Sound source localization
4 Position information of the output sound
5 Sound frequency
6 Point n
7 Point n + 1
8 Point n + 2
9 Image conversion
10 Binarization processing
11 Feature extraction processing
12 Contour tracing processing
13 Image measurement
14 Table replacement
15 Sort
20 Specific point
21 Captured image
22 Frequency at which the specific point is located
23 Volume at which the specific point is located

Claims

1. A shape transmission device that outputs, as sound, the shape and features of image data (1) recognized and read by an image reading device such as a CCD camera or a scanner, wherein:
the horizontal direction of the image data (1) is taken as the X axis and the vertical direction of the image data (1) as the Y axis, the image is read in the X-axis direction, and contour information (2) comprising the X coordinate and Y coordinate of the contour is extracted from the contour of the figure present at the X coordinate being read on the image data (1);
the X coordinate of the contour information (2) is replaced with position information (4) of the output sound, using sound source localization (3), the human auditory function that judges from which direction an output sound is heard, and the Y coordinate is replaced with a sound frequency (5); and
the image data (1) is scanned in the X-axis direction, and the sound specified by the position information (4) and the sound frequency (5) corresponding to the X coordinate being scanned is output,
whereby the shape and features of the image data (1) can be recognized by hearing.

2. The shape transmission device according to claim 1, wherein a specific point is determined from the captured image and, at the frequency and volume corresponding to the coordinates of that point, sound emission and silence are repeated at regular intervals or the tone is changed, whereby the positional relationship between the specific point and its surroundings can be recognized by hearing.