WO2007105927A1 - Method and apparatus for converting image to sound - Google Patents

Method and apparatus for converting image to sound

Info

Publication number
WO2007105927A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound
image
pixel
color
frequency range
Prior art date
Application number
PCT/KR2007/001309
Other languages
French (fr)
Inventor
Gil Ho Kim
Original Assignee
Harmonicolor System Co., Ltd.
Priority date
Filing date
Publication date
Priority claimed from KR1020060024537A external-priority patent/KR20070094207A/en
Priority claimed from KR1020070023986A external-priority patent/KR100893223B1/en
Application filed by Harmonicolor System Co., Ltd. filed Critical Harmonicolor System Co., Ltd.
Publication of WO2007105927A1 publication Critical patent/WO2007105927A1/en


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/02 Means for controlling the tone frequencies, e.g. attack or decay; Means for producing special musical effects, e.g. vibratos or glissandos
    • G10H1/04 Means for controlling the tone frequencies, e.g. attack or decay; Means for producing special musical effects, e.g. vibratos or glissandos by additional modulation
    • G10H1/053 Means for controlling the tone frequencies, e.g. attack or decay; Means for producing special musical effects, e.g. vibratos or glissandos by additional modulation during execution only
    • G10H1/055 Means for controlling the tone frequencies, e.g. attack or decay; Means for producing special musical effects, e.g. vibratos or glissandos by additional modulation during execution only by switches with variable impedance elements
    • G10H1/0553 Means for controlling the tone frequencies, e.g. attack or decay; Means for producing special musical effects, e.g. vibratos or glissandos by additional modulation during execution only by switches with variable impedance elements using optical or light-responsive means
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/36 Accompaniment arrangements
    • G10H1/38 Chord
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61F FILTERS IMPLANTABLE INTO BLOOD VESSELS; PROSTHESES; DEVICES PROVIDING PATENCY TO, OR PREVENTING COLLAPSING OF, TUBULAR STRUCTURES OF THE BODY, e.g. STENTS; ORTHOPAEDIC, NURSING OR CONTRACEPTIVE DEVICES; FOMENTATION; TREATMENT OR PROTECTION OF EYES OR EARS; BANDAGES, DRESSINGS OR ABSORBENT PADS; FIRST-AID KITS
    • A61F9/00 Methods or devices for treatment of the eyes; Devices for putting-in contact lenses; Devices to correct squinting; Apparatus to guide the blind; Protective devices for the eyes, carried on the body or in the hand
    • A61F9/08 Devices or methods enabling eye-patients to replace direct visual perception by another kind of perception
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2220/00 Input/output interfacing specifically adapted for electrophonic musical tools or instruments
    • G10H2220/155 User input interfaces for electrophonic musical instruments
    • G10H2220/441 Image sensing, i.e. capturing images or optical patterns for musical purposes or musical control purposes
    • G10H2220/455 Camera input, e.g. analyzing pictures from a video camera and using the analysis results as control data
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/471 General musical sound synthesis principles, i.e. sound category-independent synthesis methods

Definitions

  • the present invention relates to a method and an apparatus for converting an image into sound, and more particularly to a method and an apparatus for converting an image into sound by associating each color element of the colors (or lights) of a color image with a respective sound element of the converted sound, so that a color image can be converted into sound while most of the sensible features of the color image are reflected in the converted sound.
  • the present invention relates to a method and an apparatus for converting image into sound using harmonics, a law of harmony.
  • Light and sound can be defined as a kind of wave in physical sense and they both have features of wave such as amplitude and wavelength (or frequency) .
  • for light, the amplitude determines the brightness and the wavelength (or frequency) determines the hue.
  • for sound, the amplitude determines the magnitude (volume) and the wavelength (or frequency) determines the pitch.
  • Human ears usually detect sound within a frequency range of 20 Hz ~ 20 kHz (or a wavelength range of 20 m ~ 20 mm), which corresponds to ten octaves. Human eyes normally recognize light of 792 ~ 396 nm wavelength, which corresponds to only one octave. Therefore a full-range conversion between sound and light (or color) may not be obtained by a simple one-to-one match of frequencies between the audible and visible frequency ranges.
  • each color in the visible frequency range having a given brightness level was associated with an audible musical note (or pitch) of a reference octave level and each color in the visible frequency range having a higher brightness level was associated with an audible musical note of a higher octave level.
  • the full-range conversion between the full visible frequency range and the full audible frequency range became possible by associating each of ten sections of the visible frequency range with one of the ten octaves in the audible frequency range.
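  • As a quick numeric check of the octave counts cited above, the following short Python sketch (added here purely for illustration) computes the number of octaves spanned by each range.

```python
import math

# Audible range cited above: 20 Hz to 20 kHz.
audible_octaves = math.log2(20_000 / 20)   # about 9.97, i.e. roughly ten octaves

# Visible range cited above: 792 nm to 396 nm (wavelength halves, so frequency doubles).
visible_octaves = math.log2(792 / 396)     # exactly 1.0, a single octave

print(f"audible span: {audible_octaves:.2f} octaves")
print(f"visible span: {visible_octaves:.2f} octave")
```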
  • an object of the present invention is to solve the problems involved in the prior art, and to provide a method and apparatus for analyzing color information of light of a pixel or a pixel block constituting a given image and converting the color information of light or color pixel into corresponding information of sound.
  • Another object of the present invention is to provide a method and an apparatus for converting an image into sound wherein the sensible features of the image are maintained in the converted sound by associating the values of the three elements of light, that is, hue, brightness and saturation, with the values of musical note, octave and timbre of the converted sound, respectively.
  • Another object of the present invention is to provide a method and an apparatus for converting an image into sound by associating the three elements of light, that is, the hue, brightness and saturation of light, as well as the position of the light source, that is, the pixel within an image, with the musical note (or pitch), magnitude or volume of sound, noise mixture ratio and octave level of the converted sound, respectively, so that the converted sound can be reconstructed into an image in the end.
  • Still another object of the present invention is to provide a method and an apparatus for converting light or an image across the full visible frequency range having arbitrary hue, brightness and saturation values into a sound across the full audible frequency range.
  • Still another object of the present invention is to provide a method and an apparatus for converting an image into a sound according to harmonics, a law of harmony.
  • Still another object of the present invention is to provide a method and an apparatus for converting a visible frequency signal into an audible frequency signal using harmonics.
  • Still another object of the present invention is to provide a method and an apparatus for converting a color image, not a black and white image, into a sound so that the image recognition level of blind persons can be greatly enhanced.
  • the image-sound conversion method comprises the steps of:
  • the first and second color systems are formed based on the color elements of hue, brightness and saturation, and the wavelength ratio of red, green and blue colors in the second color system corresponds to the wavelength ratio of the musical notes Do, Mi and Sol in the audible frequency range.
  • the image-sound conversion method further comprises, prior to step (a), the step of (d) associating all color elements of at least one pixel or pixel block of a third color system formed based on the color elements of predetermined reference colors with the at least one color element of the at least one pixel or pixel block in the first color system.
  • the first color system is HLS color system and the third color system is RGB color system.
  • the image-sound conversion method further comprises the step of (e) setting image scan variables relating to input of the pixel or pixel block and sound generation variables relating to sound generation for generating sound using the second frequency or wavelength in the audible frequency range.
  • the image-sound conversion method may be implemented by a computer program.
  • the image-sound conversion method comprises the steps of:
  • step (c) dividing the frequency spectrum or wavelength spectrum in the audible frequency range into a number of octave sections, dividing the range of brightness in the first and second color systems into the same number of levels, and determining an octave section in the audible frequency range in accordance with the level of brightness of the pixel or pixel block, of which the first frequency or wavelength has been determined in step (b) ;
  • step (d) within the octave section determined in step (c) , associating the first frequency or wavelength determined in step (b) with a second frequency or wavelength within the audible frequency range.
  • the image-sound conversion method further comprises the step of (e) adding a noise frequency or wavelength within the octave section to the second frequency or wavelength within the audible frequency range in step (d) in association with the value of saturation in the first and the second color systems.
  • the image-sound conversion method further comprises, prior to step (a), the step of (f) associating the values of hue, brightness and saturation of the pixel or pixel block of a third color system having color elements of predetermined reference colors with the values of hue, brightness and saturation of the first color system, respectively.
  • the image-sound conversion method further comprises the step of (g) generating sound using the second frequency or wavelength within the audible frequency range, wherein the position of the generated sound source corresponds to the position of the pixel or pixel block within the input image.
  • the image-sound conversion method further comprises the steps of:
  • the image-sound conversion method further comprises the step of (l) setting image scan variables and sound generation variables, the image scan variables being related to the image scanning process for input of the pixel or pixel block and the sound generation variables being related to the sound generation process for generating sound using the frequencies or wavelengths within the audible frequency range.
  • the image-sound conversion method comprises the steps of:
  • step (b) determining a first frequency or wavelength within a visible frequency range corresponding to the value of hue of the second color system; (c) dividing the frequency spectrum or wavelength spectrum in the audible frequency range into a number of octave sections, dividing the input image having the pixel or pixel block into the same number of sections along a predetermined direction, and determining an octave section in the audible frequency range in accordance with the image section containing the pixel or pixel block, of which the first frequency or wavelength has been determined in step (b) ; and (d) within the octave section determined in step (c) , associating the first frequency or wavelength determined in step (b) with a second frequency or wavelength within the audible frequency range.
  • the image-sound conversion method further comprises the step of (e) adding noise frequencies or wavelengths within the octave section to the second frequency or wavelength within the audible frequency range in step (d) in association with the value of saturation in the first and the second color systems.
  • the step (d) of the image-sound conversion method further comprises the step of (f) determining the amplitude of the second frequency or wavelength in the audible frequency range in accordance with the value of brightness in the first and the second color system.
  • the image-sound conversion method further comprises the step of (g) generating sound using the second frequency or wavelength within the audible frequency range, the amount of noise added to the generated sound corresponding to the position of the pixel or pixel block along the predetermined direction, and the position of the generated sound source corresponding to the position of the pixel or pixel block along a direction different from the predetermined direction within the input image.
  • the image-sound conversion method further comprises the steps of:
  • the image-sound conversion method further comprises, prior to step (a), the step of (l) associating the values of hue, brightness and saturation of the pixel or pixel block of a third color system having color elements of predetermined reference colors with the values of hue, brightness and saturation in the first color system, respectively.
  • the image-sound conversion method further comprises the step of (m) setting image scan variables and sound generation variables, the image scan variables being related to image scanning process for input of the pixel or pixel block and the sound generation variables being related to sound generation process for generating sound using the frequencies or wavelengths within the audible frequency range.
  • the image-sound conversion method according to yet another aspect of the present invention comprises the step of converting a pixel or image related information that is increasing or decreasing at a certain rate in a visible frequency range into a pitch or sound related information that is increasing or decreasing at the same rate in an audible frequency range.
  • the image-sound conversion method comprises the step of converting a pitch or sound related information that is increasing or decreasing at a certain rate in an audible frequency range into a pixel or image related information that is increasing or decreasing at the same rate in a visible frequency range.
  • the image-sound conversion method comprises the steps of: (a) detecting the values of hue, brightness, saturation and position within an image of each pixel in the image; and
  • step (b) associating the values of hue, brightness, saturation and position of each pixel within the image detected at step (a) with the values of pitch, octave, timbre and position of sound, respectively, and then generating the sound having the values of pitch, octave, timbre and position of sound corresponding to the values of hue, brightness, saturation and position of each pixel.
  • the step (a) of the image-sound conversion method comprises steps of: (a-1) determining the scan resolution of the image;
  • step (a-2) scanning the image along a predetermined direction with a scan resolution determined at step (a-1) ; and (a-3) analyzing the image data generated at step (a-2) to obtain the values of hue, brightness, saturation and position of each pixel in the image.
  • the direction or type of scan is selected from a group comprising Left-Right, Up-Down, Ellipsoid, Lattice and Three-dimensional space.
  • the step (b) generates the sound in accordance with the predetermined direction and speed of scan. In the image-sound conversion method, the step (b) generates the sound based on sine waves.
  • the step (b) generates the sound according to a law of harmony.
  • In the image-sound conversion method, the step (b) generates the sound that corresponds to major harmonious chords. In the image-sound conversion method, the step (b) selects the types and arrangements of musical instruments and then generates the sound based on the sound source of the selected musical instruments. Some or all steps of the aforementioned methods for image-sound conversion may be implemented by a computer program.
  • the image-sound conversion apparatus comprises:
  • a sound synthesis means for synthesizing sound
  • the image analysis/tuning means associates the value of hue of at least one pixel or pixel block of a first color system with the value of hue of the pixel or pixel block of a second color system, and determines a first frequency or wavelength within a visible frequency range corresponding to the value of hue of the second color system, the first and second color systems having color elements of hue, brightness and saturation, and the wavelength ratio of red, green and blue colors in the second color system corresponding to the wavelength ratio of the musical notes Do, Mi and Sol in an audible frequency range; wherein the visible/audible frequency association means associates the determined first frequency or wavelength in the visible frequency range with a second frequency or wavelength in the audible frequency range; and wherein the sound synthesis means adds noise frequencies or wavelengths in the audible frequency range to the second frequency or wavelength in the audible frequency range in association with the value of saturation in the first and the second color systems.
  • the image analysis/tuning means associates all color elements of at least one pixel or pixel block of a third color system formed based on the color elements of predetermined reference colors with the at least one color element of the at least one pixel or pixel block in the first color system.
  • the first color system is the HLS color system and the third color system is the RGB color system.
  • the image-sound conversion apparatus further comprises: (d) an image input means for providing the image to the image analysis/tuning means; and
  • the image-sound conversion apparatus further comprises (f) a reference image DB for storing reference images, wherein the information regarding correspondence between the input image provided to the image analysis/tuning means and the reference images stored in the reference image DB is reflected on the synthesized sound.
  • the image-sound conversion apparatus further comprises (g) a reference sound source DB for storing reference sound sources, wherein the information regarding correspondence between the sound synthesized by the sound synthesis means and the reference sound sources stored in the reference sound source DB is reflected on the synthesized sound.
  • the image-sound conversion apparatus comprises: a means for detecting the values of hue, brightness, saturation and position within an image of each pixel in the image; and a means for associating the values of hue, brightness, saturation and position of each pixel within the image detected by the detecting means with the values of pitch, octave, timbre and position of sound, respectively, and then generating the sound having the values of pitch, octave, timbre and position of sound corresponding to the values of hue, brightness, saturation and position of each pixel.
  • the detecting means comprises: a means for determining the scan resolution of the image; a means for scanning the image along a predetermined direction with a scan resolution determined by the determining means; and a means for analyzing the image data generated by the scanning means to obtain the values of hue, brightness, saturation and position of each pixel in the image.
  • the direction or type of scan of the scanning means is selected from a group comprising Left-Right, Up-Down, Ellipsoid, Lattice and Three- dimensional space.
  • the sound generation means generates the sound in accordance with the predetermined direction and speed of scan.
  • the sound generation means generates the sound based on sine waves.
  • the sound generation means generates the sound according to a law of harmony.
  • the sound generation means generates the sound that corresponds to major harmonious chords .
  • the sound generation means selects the types and arrangements of musical instruments and then generates the sound based on the sound source of the selected musical instruments.
  • the image-sound conversion apparatus further comprises an image input means for scanning or photographing a source image and then providing the obtained image to the detecting means.
  • the image-sound conversion apparatus further comprises a sound source DB for storing at least one sound source, wherein the sound generation means generates the sound by modifying the sound source.
  • the image-sound conversion apparatus further comprises an output means comprising multiple speakers connected to the sound generation means, wherein the output means outputs the sound with a type of sound output selected from a group comprising Left-Right, Up-Down, Ellipsoid, Lattice and Three-dimensional space.
  • the output means outputs the sound for a predetermined time period. Meanwhile at least some parts of the image-sound conversion apparatus can be replaced by a computer program.
  • according to the present invention, there are provided a method and an apparatus for analyzing the color information of a pixel or a pixel block constituting a given image and converting that color information into corresponding sound information. There are also provided a method and an apparatus for converting an image into sound wherein the sensible features of the image are maintained in the converted sound by associating the three elements of light, that is, hue, brightness and saturation, with the musical note, octave and timbre of the converted sound, respectively.
  • there are also provided a method and an apparatus for converting an image into sound by associating the three elements of color, that is, hue, brightness and saturation, as well as the position of a pixel within the image, with the musical note, volume of sound, noise mixture ratio and octave level of the converted sound, respectively, so that the converted sound can be reconstructed into an image in the end.
  • a method and an apparatus for converting light or an image across the full visible frequency range having arbitrary hue, brightness and saturation value into a sound across the full audible frequency range are provided.
  • there is also provided a handheld image-sound conversion apparatus and a method thereof.
  • there are also provided a method and an apparatus for converting an image into sound and vice versa that are optimally adapted to the visual and aural characteristics of human beings, wherein the frequency/wavelength correspondence between the image and the sound is set according to the frequency/wavelength ratios between the colors within an image and between the musical notes of the sound.
  • the time needed by a user to adapt to the function of the present invention is minimized, since the method and apparatus for converting an image into sound are constructed to optimally fit the visual and aural characteristics of human beings.
  • Fig. 1 shows a frequency range covering the visible and audible frequency ranges .
  • Fig. 2 is a flow chart according to the method for converting image into sound according to the present invention.
  • Fig. 3 is a detailed flow chart of the method for converting image into sound according to a first embodiment of the present invention.
  • Fig. 4 is a detailed flow chart of the method for converting image into sound according to a second embodiment of the present invention.
  • Fig. 5 is a detailed flow chart of a sound generation process of the method for converting image into sound according to an embodiment of the present invention.
  • Fig. 6 is a schematic diagram of the apparatus for converting image into sound according to an embodiment of the present invention.
  • Fig. 7 shows an exemplary image for explaining the method for converting image into sound according to an embodiment of the present invention.
  • Fig. 8 is a graphical view of RGB color system.
  • Fig. 9 is a graphical view of HLS color system.
  • Fig. 10 shows a frequency conversion range according to an embodiment of the present invention.
  • Fig. 11 shows a color-sound conversion table according to an embodiment of the present invention.
  • Fig. 12 is a schematic diagram showing the musical note determination process according to an embodiment of the present invention.
  • Fig. 13 shows energy distribution of clear sound according to an embodiment of the present invention.
  • Fig. 14 shows energy distribution of noisy sound according to an embodiment of the present invention.
  • Fig. 15 shows exemplary scanning process according to an embodiment of the present invention.
  • Fig. 16 shows exemplary scanning process according to other embodiment of the present invention.
  • Fig. 17 shows an application of the apparatus for converting image into sound according to an embodiment of the present invention.
  • Fig. 18 shows a block diagram of the apparatus for converting image into sound according to an embodiment of the present invention.
  • Fig. 19 shows an operation flow chart of the apparatus for converting image into sound shown in Fig. 18. <Explanation of selected reference numerals>
  • 610: an image input device, 620: an image analysis/tuning means
  • 660: a reference image/sound source DB (database), 670: a sound generation device
  • Fig. 2 is a flow chart of the method for converting an image into sound according to the present invention.
  • a step S210 for setting an image analysis option and sound generation option can be performed.
  • the image analysis option is used to set image-scanning variables such as input sequence of the pixel or pixel block contained in an image, size of pixel block, etc.
  • the sound generation option is used to set sound-generation variables such as number of channels for generating the converted sound, bit rate, generation speed, etc.
  • This step S210 can be inserted into any other position between the steps of the above method.
  • a color system conversion step S220 is performed, where information in a color-oriented coordinate system, such as RGB color information, is converted into information of a different color system that uses hue, brightness and saturation as reference values, such as the HLS color system.
  • each color in the visible frequency range should be arranged in the same way as each musical note in the audible frequency range. So a color value tuning step S230 for adjusting or tuning each color value in accordance with the frequency ratio of each color of the visible frequency range is performed.
  • a frequency association step S240 for associating the frequency or wave in the visible frequency range corresponding to the tuned color value with the frequency or wave in the audible frequency range is performed.
  • the frequency association step S240 should be performed so that color information such as hue, brightness and saturation contained in the visible frequency signal is effectively transferred to the audible frequency signal.
  • the effective way of frequency association will be explained with reference to Figs. 3 and 4.
  • a sound generation step S250 for generating the audible frequency signal determined by the frequency association is performed.
  • Fig. 3 is a detailed flow chart of the first embodiment of the frequency association step S240 of the present invention shown in Fig. 2.
  • the hue, brightness, saturation and position information of a pixel or pixel block contained in the visible frequency signal is converted so as to correspond to the note, octave, noise mixture ratio and position, respectively, of the major sound (the words major sound, major tone, fundamental tone, fundamental note, pure tone, pure sound, etc. are used interchangeably in this document) contained in the audible frequency signal.
  • a brightness-octave determination step S310 is performed, where the value of brightness of a pixel or pixel block is classified according to the octave classification of audible sound and the octave of the major sound is determined based on the value of brightness. Then, a hue-note determination step S320 for determining the note of the major sound in accordance with the hue value is performed, where the note of the major sound is allocated to the octave section determined in the brightness-octave determination step S310.
  • a saturation-noise mixture ratio determination step S330 for adding a complex noise to the major sound in order to represent the saturation value of the pixel or pixel block is performed.
  • the octave section of the noise is typically the same as the octave section of the major sound and the mixture ratio of the major sound and the noise is determined by the value of saturation.
  • the major sound means the sound determined by the hue value of the pixel or pixel block and the noise means the sound other than the major sound that is contained in the output sound.
  • the noise representing the saturation of the pixel or pixel block is separately generated during the sound generation process.
  • a pixel position-sound source position determination step S340 is performed to associate the position of the pixel or pixel block in the input image with the position of the sound source recognized in the generated sound.
  • Fig. 4 is a detailed flow chart of the second embodiment of the frequency association step S240 of the present invention shown in Fig. 2.
  • the hue, brightness, saturation and position information of the pixel or pixel block contained in the visible frequency signal is converted so as to correspond to note of major sound, volume of total generated sound, noise mixture ratio and octave of the major sound, respectively, contained in the audible frequency signal.
  • the brightness value of the pixel or pixel block is used for determining the volume of the sound to be generated.
  • the brightness-volume determination step S410 for classifying and associating the value of brightness and volume of sound can be performed regardless of the sequence of the other steps or together with one of the other steps.
  • a hue-note determination step S420 for determining the note of the major sound in accordance with the hue value of the pixel or pixel block is performed, where the note of major sound is allocated to the octave section determined in the octave determination step S440.
  • a saturation-noise mixture ratio determination step S430 for adding a complex noise to the major sound in order to represent the saturation value of the pixel or pixel block is performed.
  • the octave section of the noise is typically determined to be the same as the octave section of the major sound and the mixture ratio of the major sound and the noise is determined by the value of saturation.
  • the major difference of the second embodiment from the first embodiment is that a pixel position-octave determination step S440 is performed, where the position of the pixel or pixel block within the input image is used for determining the octave of the major sound and/or noise in the generated sound.
  • the rectangular image is divided into several sections along the vertical axis, and the vertical section in which the pixel or pixel block is located corresponds to the value of octave in the generated sound. If the position information of the pixel in the horizontal direction is also used, then this step can be represented as a pixel position-sound source position determination step S440.
  • FIG. 5 is a detailed flow chart of an exemplary sound generation process of the method for converting image into sound according to an embodiment of the present invention.
  • Fig. 5 illustrates a sound generating process using the major sound and noise information that have been derived from the pixel or pixel block information.
  • the major sound and noise information determined from the steps explained with reference to Figs. 2, 3 and 4 are information about analog sound having a frequency (and wave) and amplitude. While this analog sound can directly be generated by an analog audio device, it usually is generated by a digital audio device and thus an analog-digital conversion is needed.
  • Fig. 5 relates to the digital information conversion process.
  • digital data of a major sound (that is, a sound that determines the note) and digital data of noise (that is, a background sound) are generated separately.
  • the major sound data signal and the noise data signal are synthesized in the fundamental tone-noise synthesis step S530.
  • output sound is generated using the synthesized data signal in the sound generation step S540.
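  • A minimal, self-contained Python sketch of this digital generation process is given below. It assumes a sine wave for the major sound and uniformly distributed random samples for the noise; the mixture ratio, frequency, duration and output file name are illustrative values chosen for this sketch, not parameters defined by the patent.

```python
import math
import random
import struct
import wave

SAMPLE_RATE = 44_100   # illustrative sound-generation variables
DURATION = 0.5         # seconds of sound for one pixel or pixel block
MAJOR_FREQ = 440.0     # example major-sound frequency
MIX_RATIO = 0.8        # share of major sound vs. noise (derived from saturation)

frames = bytearray()
for n in range(int(SAMPLE_RATE * DURATION)):
    t = n / SAMPLE_RATE
    major = math.sin(2 * math.pi * MAJOR_FREQ * t)         # major sound data
    noise = random.uniform(-1.0, 1.0)                       # noise (background sound) data
    sample = MIX_RATIO * major + (1 - MIX_RATIO) * noise    # fundamental tone-noise synthesis (S530)
    frames += struct.pack("<h", int(max(-1.0, min(1.0, sample)) * 32767))

with wave.open("converted_pixel.wav", "wb") as out:         # sound generation (S540)
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(SAMPLE_RATE)
    out.writeframes(bytes(frames))
```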
  • Fig. 6 is a schematic diagram of the apparatus for converting image into sound according to an embodiment of the present invention.
  • the apparatus or device for converting image into sound 600 of the present invention basically comprises an image analysis/tuning means 620, a visible/audible frequency association means 630, and a sound synthesis means 640.
  • the image analysis/tuning means 620 performs the color system conversion step S220 and color value tuning step S230 of the Fig. 2, and the visible/audible frequency association means 630 performs the frequency association step S240 in Fig. 2 and other steps in Fig. 3 or 4.
  • the sound synthesis means 640, a component for synthesizing the generated sound based on the determined octave, note, noise mixture ratio and volume information, performs the steps illustrated in Fig. 5, for example.
  • the sound synthesis means 640 may only provide information for sound synthesis such as the octave, note, noise mixture ratio and volume, and a separate media device may perform the function of sound synthesis based on the above information from the sound synthesis means 640.
  • the sound synthesis means 640 is often called a sound mixing means in the art.
  • the options for image analysis and sound generation may be set by the option input means 650 of the image-sound conversion device 600. Otherwise the option setting may be accomplished by a computer program or other device connected to the image-sound conversion device 600 of the present invention.
  • the image-sound conversion device or apparatus 600 may be an independent device having the above essential functions; it can also be embedded in or configured as parts or components of a conventional image input device 610 such as a camera or scanner and/or a conventional sound generation device 670 such as a speaker or speaker-embedded lighting equipment. Further, as shown in Fig. 17, a handheld image-sound conversion system can be realized by a combination of a handheld image input device 1710 having an image sensor in its center portion, a handheld sound generation device 1730 and a handheld image-sound conversion device 1720.
  • the image-sound conversion device 600 can be equipped with a reference image DB 660 storing reference images and the information regarding the correspondence between the input image of the image analysis/tuning means 620 and the predetermined image stored in the reference image DB 660 can be reflected in the synthesized sound output.
  • the reference image information such as traffic signs and warning signs is stored in the reference image DB 660 in advance, and then an alarm or warning notice is added to the generated sound when the input image corresponds to one of the pre-stored reference images.
  • the image-sound conversion device 600 may further comprise a reference sound source DB 660 incorporated in or separate from the reference image DB 660.
  • the information regarding the correspondence between the synthesized sound from the sound synthesis means 640 and the predetermined sound stored in the reference sound source DB 660 can be reflected on the generated sound output.
  • the resulting sound converted from raw images of nature by the image-sound conversion apparatus 600 was found to make a natural harmony, that is, a musical chord according to the law of musical harmony.
  • the image-sound conversion apparatus 600 can stop or reduce the volume of the generated sound, or add to the generated sound an additional sound containing a harmonious chord to compensate for the discord, when the color harmony of the input image is not desirable and the generated sound contains a discord.
  • the above use of the reference image/sound source DB 660 is an example, and those skilled in the art may use the reference image/sound source DB 660 in various ways without departing from the spirit of the present invention. Further, some or all components shown in Fig. 6 can be implemented as a computer program by those skilled in the art.
  • the method of the present invention may begin with the image analysis step for analyzing the input image and identifying the hue, brightness, saturation and position of pixels or pixel blocks in the input image.
  • a step for setting image analysis and/or sound generation options is provided before starting the method of the present invention with the image analysis step.
  • the option can be manually set by a user before the image analysis step by the apparatus or computer program of the present invention, or it can be automatically set by a computer program based on the characteristic of the input image.
  • the image analysis option of the apparatus or computer program (that is, as easily understood by those skilled in the art, at least part of the method and apparatus of the present invention can be implemented as a computer program) of the present invention may comprise (1) image scan mode, (2) image analysis mode, (3) size of pixel block, (4) unit of each scan, (5) scanning speed and (6) resolution (overall resolution and/or zone resolution within an image).
  • the image scan mode is preferably configured to simulate the image recognizing pattern of human eyes and includes, for example, a left-right mode, an up-down mode, a difference resolution mode (lattice type, ellipsoid type, etc.) and a motion picture mode.
  • the left-right mode repeatedly scans the image from left to right side or from right to left side.
  • the up-down mode repeatedly scans the image upwardly and downwardly.
  • the difference resolution mode scans each zone in the image with different resolutions.
  • the zone may be formed as lattice type or ellipsoid type and each lattice or ellipsoid may have different scan resolution from each other.
  • in the lattice type, as shown in Fig. 16, the overall image is divided into several zones of concentric lattices of different sizes, and the scanning is performed with a higher resolution in the center zone and a lower resolution in the outer zones.
  • the ellipsoid type is a kind of lattice type where the lattice shape is an ellipsoid.
  • the lattice type and ellipsoid type can reflect the visual characteristic of human eyes, which analyze the center of the visual field in more detail.
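  • One possible realization of such a lattice-type differential resolution, sketched below in Python with hypothetical zone boundaries and steps, is to assign a coarser sampling step to pixels farther from the image center.

```python
def scan_step_for(x, y, width, height):
    """Return the scan step (in pixels) for a position, coarser toward the edges.

    The three concentric zones and their steps (1, 4, 8) are illustrative
    choices for this sketch, not values prescribed by the patent.
    """
    dx = (x - width / 2) / (width / 2)    # normalized offset from the center
    dy = (y - height / 2) / (height / 2)
    distance = max(abs(dx), abs(dy))      # square (lattice-like) concentric zones

    if distance < 0.33:
        return 1      # center zone: full resolution
    elif distance < 0.66:
        return 4      # middle zone: every 4th pixel
    return 8          # outer zone: every 8th pixel


# Example: collect the sampled positions for a 640x480 image.
width, height = 640, 480
sampled = [(x, y) for y in range(height) for x in range(width)
           if x % scan_step_for(x, y, width, height) == 0
           and y % scan_step_for(x, y, width, height) == 0]
print(len(sampled), "positions sampled out of", width * height)
```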
  • the motion picture mode is used when a motion picture is input instead of a still image.
  • the motion picture mode scans each image that constitutes the motion picture to make an image frame and acquires the information about the difference between each frame.
  • At least one of the above scan modes can be employed at a time.
  • Other scan modes may be employed to the present invention by those skilled in the art.
  • the image analysis mode includes a two-dimensional analysis and a three-dimensional analysis; the three-dimensional analysis (x, y, z axes) uses two images of the same object.
  • two pictures are taken (for example, L/R images) for a single object and the difference (for example, L/R difference) between the two pictures is represented.
  • for the two-dimensional analysis, two speakers (that is, L and R speakers) are used, whereas for the three-dimensional analysis at least four-channel speakers are used and a time-delay technique can be used to represent an object at a distance.
  • the two-dimensional analysis and the three-dimensional analysis techniques are already known in the art and other analysis method may be employed by those skilled in the art.
  • the image-sound conversion can be performed on a pixel-block basis rather than a unit-pixel basis; that is, the basic unit of scan is a pixel block of a predetermined size.
  • a given image can be divided into multiple 8*8 or 16*16 pixel blocks, and the scanning process is performed on a pixel-block basis.
  • the hue, brightness, saturation and position of each pixel in a pixel block should be detected, and representative values, such as mean values, of hue, brightness, saturation and position representing the overall pixel block should be taken.
  • Image processing methods for pixel blocks are widely known in the image processing industry, and thus a detailed explanation is omitted. While the present invention is explained in terms of a unit pixel, a pixel block can be used in the same way. In a majority of cases the image will preferably be processed on a pixel-block basis, considering the volume of data to be processed, the visual/aural resolution of human beings, etc.
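  • As a sketch of such block-based processing (assuming 8*8 RGB blocks with 8-bit channel values and mean values as the representative statistic), the per-block hue, lightness and saturation could be computed as follows in Python.

```python
import colorsys

def block_hls(block):
    """Return mean (hue, lightness, saturation) for a block of (R, G, B) pixels.

    `block` is any iterable of 8-bit RGB tuples. Using the mean as the
    representative value is one of the options mentioned in the text; note
    that a plain mean of hue is only safe away from the red wrap-around.
    """
    hls_values = [colorsys.rgb_to_hls(r / 255, g / 255, b / 255) for r, g, b in block]
    count = len(hls_values)
    mean_h = sum(h for h, _, _ in hls_values) / count
    mean_l = sum(l for _, l, _ in hls_values) / count
    mean_s = sum(s for _, _, s in hls_values) / count
    return mean_h, mean_l, mean_s


# Example: an 8*8 block filled with a slightly varying reddish color.
block = [(200 + (i % 8), 40, 40) for i in range(64)]
print(block_hls(block))
```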
  • the pixel blocks are not scanned on a one-by-one basis.
  • scanning is performed from the left-hand side to the right-hand side as shown by an arrow 1520, and a column 1510 of pixel blocks located on the same vertical line is scanned at the same time.
  • a scanning scheme that simultaneously scans pixels or pixel blocks contained in a predetermined area, such as a row or column, is desirable.
  • Well known technologies may be employed regarding the scan speed and scan resolution.
  • the apparatus or program of the present invention may employ a sound generation option including, but not limited to, (1) number of channels, (2) bit rate, (3) bit resolution, (4) sampling rate, (5) generation speed, etc.
  • the step of setting the above options is usually performed before the image analysis step, but sometimes it may be inserted in an arbitrary position among the steps of the present invention by a user command or computer program.
  • the exemplary rectangular image 700 is composed of x*y pixels and divided into the object image 701 at the center and the background image 702.
  • each pixel of the image 700 has its own color information including the three color elements of hue, brightness and saturation. In order to convert an image into sound, information regarding the hue, brightness, saturation and position of each pixel in the image should be detected.
  • RGB (Red, Green, Blue)
  • CMYK (Cyan, Magenta, Yellow, Key/black)
  • HLS (Hue, Lightness, Saturation)
  • HSV (Hue, Saturation, Value)
  • the RGB color system represents colors by the addition of three reference colors of red, green and blue, which are the three primary colors of light.
  • the RGB system is currently used as a basic color model in computer graphic industry and almost all kinds of image files and video cards use the RGB system.
  • the RGB coordinate system for representing colors according to the RGB color system is shown in Fig. 8.
  • the HLS color system represents colors by their hue, brightness and saturation, which are the three primary elements of color in the HLS color system. Since a color is recognized by its hue, brightness and saturation by the human eye, the HLS system is the color model that is most similar to human vision. So the HLS system is used for the detection of colors for use in the image-sound conversion method of the present invention.
  • the HLS coordinate system for representing colors according to the HLS color system is shown in Fig. 9.
  • HSV is very similar to HLS, and the explanation regarding HLS in the present invention can be equally or very similarly applied to HSV by those skilled in the art without departing from the teaching of the present invention.
  • Other color systems that use hue, brightness and saturation or their direct derivatives as basic elements can also be employed to the present invention.
  • the method of converting RGB coordinates into HLS coordinates is widely known and any of them may be employed in the present invention.
  • the R, G, B values for each pixel can be converted into corresponding H, L, S values by the coordinate conversion.
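  • In Python, for example, the standard colorsys module already provides such a conversion; the short sketch below converts a single 8-bit RGB pixel to H, L, S values in the 0-1 range.

```python
import colorsys

r, g, b = 255, 0, 0                                        # a pure red pixel in the RGB color system
h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)   # coordinate conversion to HLS
print(h * 360, l, s)                                       # hue 0 degrees (red), lightness 0.5, saturation 1.0
```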
  • the values need to be further tuned by a method according to the present invention for the one-to-one association between color and sound.
  • the audible and visible frequency ranges may slightly vary according to each person or system that detects colors.
  • the standard wavelengths of the three basic colors of red, green and blue also vary according to the development of color technology for representing colors.
  • human eyes have inferior discrimination for reddish colors of longer wavelength.
  • arbitrary frequency ranges are selected within the standard wavelength ranges presented so far for the application of the principle of the present invention. Then, in consideration of the inferior discrimination of reddish colors of longer wavelength, the visible frequency range is shifted toward shorter wavelengths, within the bluish color recognition boundary, in order to set an applicable visible frequency range having enhanced discrimination for reddish colors.
  • One embodiment of the present invention deals with an aural frequency range of 20 ~ 28,160 Hz comprising the audible frequency range of human beings and a visual frequency range of 4.6×10⁸ ~ 8.8×10⁸ MHz (equal to 650 ~ 340 nm) set for the above-mentioned reason regarding the discrimination of color recognition.
  • the audible and visible frequency ranges according to this embodiment of the present invention are shown in Fig. 10.
  • the 12-tone equal temperament is a representative musical temperament that is currently used as the standard temperament worldwide, wherein 12 notes are spaced evenly inside an octave; one unit interval of the 12 notes is called a semitone and two unit intervals are called a whole tone.
  • using the 12-tone equal temperament, one can easily associate colors with their corresponding notes based on the frequency ratios of the 12 notes of the 12-tone equal temperament, even though there might be some error at the red-blue boundary depending on the selection or setting of the applicable visible frequency range.
  • Fig. 11 is a color-sound conversion table for the frequency ranges shown in Fig. 10.
  • the basic note "Do" in the 12-tone equal temperament corresponds to the basic color "Red (650 nm)", and the remaining notes in the 12-tone equal temperament correspond to respective colors accordingly.
  • when the frequencies of colors are arranged on a logarithmic scale, the colors of red, orange, yellow, green, blue and purple can be arranged at equal distances in the clockwise direction starting from the uppermost position of the color-sound conversion table of Fig. 11.
  • the twelve notes of an octave on the logarithmic musical scale, Do(C), Do#(C#), Re(D), Re#(Eb), Mi(E), Fa(F), Fa#(F#), Sol(G), Sol#(Ab), La(A), La#(Bb) and Ti(B), are also equally spaced on the color-sound conversion table.
  • the notes are equally divided on the logarithmic scale since the frequencies of the notes in the 12-tone equal temperament constitute a geometric series, wherein the frequency increases by a fixed multiple (approximately 1.059 times) as the note rises by a semitone, and the frequency doubles after 12 semitone steps.
  • the colors corresponding to the frequencies increasing (or decreasing) by the above ratio are arranged by equal distance on the color-sound conversion table in Fig. 11.
  • the inventor of the present invention found that the frequencies of colors and of musical notes are both equally spaced on the logarithmic scale, and that the ratio between the audible frequencies of "Do, Mi, Sol" and the ratio between the visible frequencies of "Red, Green, Blue" are equal.
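  • This ratio equality can be checked with a few lines of arithmetic. The sketch below uses the just-intonation frequency ratios Do : Mi : Sol = 4 : 5 : 6 implied by the 1 : 4/5 : 2/3 wavelength figures quoted below, together with a 650 nm red reference; both choices are made here only for illustration.

```python
# Frequency ratios of Do, Mi, Sol (just intonation: 4 : 5 : 6).
do, mi, sol = 1.0, 5 / 4, 3 / 2

# Wavelength is inversely proportional to frequency, so the wavelength ratios are:
wl_ratio = (1 / do, 1 / mi, 1 / sol)        # (1.0, 0.8, 0.666...) = 1 : 4/5 : 2/3

# Applying the same ratios to a 650 nm red reference gives the matching green and blue.
red = 650.0
green, blue = red * wl_ratio[1], red * wl_ratio[2]
print(wl_ratio)           # (1.0, 0.8, 0.666...)
print(red, green, blue)   # 650.0, 520.0, ~433.3 nm
```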
  • the inventor of the present invention could obtain color combinations corresponding to or associated with each of harmonious musical chords and found that the color combinations from the musical chords would make more harmonious color arrangements than those corresponding to musical discords.
  • on the color wheel of the HLS color system, red, green and blue are spaced at 120-degree intervals, and these distances can be calculated as a wavelength ratio of 1 : 1/3 : 2/3.
  • since the wavelength ratio of the real visible frequencies of these colors is 1 : 4/5 : 2/3, the hue values on the color wheel of the HLS color system should be tuned or adjusted to correspond to the wavelength ratio of the real colors.
  • for example, the frequencies and wavelengths of the red, green and blue colors in the HLS color system should be tuned so that they are arranged at 0 : 120 : 210 degrees rather than 0 : 120 : 240 degrees, and those for the other colors should also be tuned accordingly.
  • this is because the ratio of the RGB colors in the HLS color system is 1 : 1/3 : 2/3, whereas the ratio between the real colors is 1 : 4/5 : 2/3.
  • accordingly, the hue values of the bluish green colors between green and blue and the hue values of the purple colors between blue and red have to be tuned in the opposite direction on the color wheel to obtain the tuned ratio of 1 : 4/5 : 2/3 for red, green and blue.
  • the above tuning may be accomplished by the association of hue values of each pixel or pixel block with new values of visible frequencies reflecting the above tuning scheme.
  • the visible frequencies for the colors of the pixels or pixel blocks obtained through this tuning scheme exactly correspond to, or are proportionate to, the frequencies and/or wavelengths of the audible frequencies of each musical note, and thus each color in the visible frequency range can have a one-to-one matching relationship with each musical note in the audible frequency range.
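  • One way to realize this tuning, sketched below with hypothetical anchor values, is to map the HLS hue angle directly to a visible frequency interpolated in log scale between red, green and blue anchors whose frequencies keep the Do : Mi : Sol ratio of 1 : 5/4 : 3/2.

```python
import math

C = 3.0e17   # speed of light in nm/s, used to convert wavelength to frequency

# Anchor points of the tuning: HLS hue angles of the reference colors mapped to
# wavelengths whose ratio (1 : 4/5 : 2/3) matches the Do : Mi : Sol ratio. The
# 360-degree anchor closes the hue circle one octave above red. These anchors are
# an illustrative reading of the tuning scheme, not the patent's exact table.
ANCHORS = [(0.0, 650.0), (120.0, 520.0), (240.0, 650.0 * 2 / 3), (360.0, 325.0)]

def tuned_visible_frequency(hue_deg):
    """Map an HLS hue angle (degrees) to a tuned visible frequency in Hz."""
    hue_deg %= 360.0
    for (h0, wl0), (h1, wl1) in zip(ANCHORS, ANCHORS[1:]):
        if h0 <= hue_deg <= h1:
            f0, f1 = C / wl0, C / wl1
            t = (hue_deg - h0) / (h1 - h0)
            # Interpolating in log-frequency keeps equal hue steps as equal musical intervals.
            return f0 * (f1 / f0) ** t
    return C / ANCHORS[0][1]

print(tuned_visible_frequency(0))     # red reference frequency
print(tuned_visible_frequency(120))   # green: 5/4 of the red frequency (a "Mi" ratio)
print(tuned_visible_frequency(240))   # blue: 3/2 of the red frequency (a "Sol" ratio)
```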
  • the brightness of color and octave of sound are associated in one-to-one relationship and thus are arranged in the same color ring.
  • the brightness of a given color and the octave of a given note increase in the radial direction toward the center of the table, and vice versa. Therefore the same notes with different octaves (for example, "Do" in a lower octave and "Do" in a higher octave) may be represented by shifting the value of brightness for the same hue value, and vice versa.
  • saturation originally represents the intensity of hue and is defined as ratio or proportion of pure color against the total color consisting of the pure color and the remaining colors (that is, achromatic color) .
  • in the present invention, the saturation value corresponds to the timbre of sound, and the timbre is represented by the mixture ratio of the major sound against the noise.
  • the timbre of sound corresponding to the saturation of color can be newly defined as the fundamental wave (that is, a sine wave) divided by the combination of the fundamental wave and the noise of a given aural wave.
  • a clear sound that has a higher portion of the fundamental sound can be represented by a pure color having a highly saturated hue, and an unclear sound having more noise can be represented by an impure color having lower saturation.
  • those skilled in the art may select an arbitrary HarmonyHue frequency range, prepare a color-sound conversion table corresponding to the frequency ratios of the twelve musical notes of the 12 tone equal temperament, and perform an image to sound conversion.
  • a reference color corresponding to the note "Do(C)" can be selected by a system designer according to the optical and/or musical characteristics of the hardware of the color-sound conversion system and the HarmonyHue frequency range, and the frequencies for the other colors and notes may be associated with one another based on the selected frequency and the frequency ratios between colors of the present invention.
  • Image-sound association step (the first embodiment)
  • Brightness-octave determination step S310: As described above, the audible frequency range is divided into ten sections, that is, ten octaves, for the optimal image-sound conversion. Accordingly, the L values (that is, values of lightness or brightness) are also divided into ten sections. In the present invention, each octave is associated with a corresponding section of the frequency range based on the L value (that is, lightness) of the HLS color coordinates. The octave and note determination process will be explained in detail with reference to Fig. 12.
  • the audible frequency range is divided into ten sections. Taking the lowest reference note as "Lower La (A)", the audible frequency range is determined based on the corresponding audible frequency set to 27.5 Hz. Since the frequency of the "one octave higher La (A')" is double the frequency of the reference "Lower La (A)", the corresponding audible frequency doubles to 55 Hz. In this way, an audible frequency range of 27.5 Hz ~ 28.16 kHz corresponding to ten octaves is set.
  • brightness sections need to be set within the boundary of recognition of human beings according to the purpose of use or hardware environment of the image-sound conversion system. The brightness sections are normally set to ten levels according to the number of octaves in the audible frequency range .
  • the audible frequency range is tuned to correspond to the ten octaves based on the reference note "Do(C)", and the octave of the sound is determined within the tuned audible frequency range according to the brightness level of a given color. Let us assume that the color in Fig. 12 has its brightness value within the second octave range; then the second octave is also determined for the corresponding sound.
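  • A minimal sketch of this brightness-to-octave step, assuming the lightness value is normalized to the range 0-1 and that the ten octave sections are of equal width, could look like the following.

```python
NUM_OCTAVES = 10
BASE_FREQ = 27.5   # lowest reference frequency ("Lower La (A)"), as in the text

def octave_from_brightness(lightness):
    """Map an HLS lightness value in [0, 1] to one of the ten octave sections.

    Returns the octave index (0..9) and the frequency at the bottom of that
    octave. Splitting the lightness range into equal tenths is one simple
    reading of step S310, assumed here for illustration.
    """
    level = min(int(lightness * NUM_OCTAVES), NUM_OCTAVES - 1)
    return level, BASE_FREQ * (2 ** level)

print(octave_from_brightness(0.05))   # (0, 27.5)    -> darkest colors, lowest octave
print(octave_from_brightness(0.15))   # (1, 55.0)    -> second octave, as in the Fig. 12 example
print(octave_from_brightness(0.95))   # (9, 14080.0) -> brightest colors, highest octave
```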
  • the color of pixel or pixel block is associated with the note of converted sound according to the present invention.
  • the determined octave section is again divided into twelve note sections and a note corresponding to the frequency of the given color is determined. This is a process for determining the position of input frequency within the determined octave in accordance with the position of color in the HarmonyHue system resulting from the image analysis step.
  • x represents the position of color within the given octave section.
  • the value of "x" can be determined by finding the visible frequency in the HarmonyHue color system corresponding to the hue value of the given pixel in the HLS color system.
  • the frequency of the fundamental tone associated with the hue of color in the visible frequency range can be determined through the above processes.
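  • Continuing the sketch, and assuming for simplicity that the base frequency of the octave chosen from the brightness is treated as the "Do" of that octave, the note can be picked as the nearest of twelve logarithmically equal divisions.

```python
import math

NOTE_NAMES = ["Do", "Do#", "Re", "Re#", "Mi", "Fa", "Fa#", "Sol", "Sol#", "La", "La#", "Ti"]

def note_in_octave(visible_freq, red_ref_freq, octave_base_freq):
    """Place a tuned visible frequency onto a note of the chosen audible octave.

    `x` is the position of the color within the visible octave that starts at
    the red reference frequency; quantizing x into twelfths follows the 12-tone
    equal temperament described above. All argument names are illustrative.
    """
    x = math.log2(visible_freq / red_ref_freq) % 1.0   # position within the octave, 0 <= x < 1
    note_index = round(x * 12) % 12                    # nearest of the twelve note sections
    audible_freq = octave_base_freq * 2 ** (note_index / 12)
    return NOTE_NAMES[note_index], audible_freq

red_ref = 3.0e17 / 650.0                               # visible frequency of the 650 nm red reference
print(note_in_octave(red_ref, red_ref, 55.0))          # ('Do', 55.0)  in the example octave
print(note_in_octave(red_ref * 5 / 4, red_ref, 55.0))  # green -> ('Mi', ~69.3)
print(note_in_octave(red_ref * 3 / 2, red_ref, 55.0))  # blue  -> ('Sol', ~82.4)
```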
  • Saturation-noise mixture ratio determination step S330 The ratio of fundamental tone against noise or remaining tone is used to represent the saturation of color in the present invention.
  • the saturation means the mixture ratio between pure color and achromatic color (that is, white, gray and black having only brightness and without hue) .
  • achromatic color that is, white, gray and black having only brightness and without hue
  • the ratio of pure color against achromatic color is represented as pure tone against noise.
  • white light is composed of all the waves in the visible frequency range, and the more of this achromatic light is contained in a color, the less saturated the color becomes.
  • similarly, noise consists of all the waves within a given frequency range, and the more noise is mixed with the pure tone, the less clearly the tone is heard.
  • the value of saturation of given pixel is found in the HLS color system and then noise is added to the converted pure tone according to the position of the resulting saturation value in the overall saturation range.
  • the octave value of noise is equal to that of pure tone and these values are both determined by the value of brightness of the given pixel.
  • the converted sound of the pure color has higher value for pure tone-to-noise ratio and sounds as clear and pure tone as shown in Fig. 13, while the converted sound of the impure color has lower value for pure tone-to-noise ratio and sounds as unclear and noisy tone as shown in Fig. 14.
  • the total energy of the sounds shown in Figs 13 and 14 are equal to each other and only the pure tone-to-noise ratio that is signal-to-noise ratio (SNR) is different from each other.
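A minimal mixing sketch is given below. It assumes the saturation value has been normalized to [0, 1] and is used directly as the pure tone's share of a fixed total energy budget; it also uses plain white noise, whereas the description above restricts the noise to the same octave section as the pure tone. NumPy is used for brevity.

    import numpy as np

    def mix_tone_and_noise(fundamental_hz, saturation, duration_s=0.5, sample_rate=44100):
        """Sketch: keep the total energy fixed and split it between the pure tone
        and noise according to saturation (1.0 = fully saturated, purely tonal)."""
        t = np.arange(int(duration_s * sample_rate)) / sample_rate
        tone = np.sin(2 * np.pi * fundamental_hz * t)
        noise = np.random.uniform(-1.0, 1.0, t.shape)
        tone /= np.sqrt(np.mean(tone ** 2))      # normalize each part to unit RMS
        noise /= np.sqrt(np.mean(noise ** 2))
        return np.sqrt(saturation) * tone + np.sqrt(1.0 - saturation) * noise

    clear = mix_tone_and_noise(440.0, saturation=0.95)   # clear, almost pure tone (cf. Fig. 13)
    noisy = mix_tone_and_noise(440.0, saturation=0.20)   # unclear, noisy tone (cf. Fig. 14)
    print(np.mean(clear ** 2), np.mean(noisy ** 2))      # total energies roughly equal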
  • Position determination step S340: In order to convert the color of a single pixel or pixel block, or the light of a given frequency, into sound, the sound or tone can be composed from the information on hue, brightness and saturation alone. However, when a group of pixels forms an image, the size, shape and character of the image are determined by the relative positions of the pixels, and thus the position information should be reflected in the sound for a complete conversion of the image into sound. For this, the position of the image along the horizontal axis (i.e., x-axis) may be converted into the left-right position of the sound source in a stereo sound field, and the position of the image along the vertical axis (i.e., y-axis) may be converted into the up-down position of the sound source. Considering that the human aural system senses the left-right position of a sound source better than its up-down position, the vertical position of the image may advantageously be implemented as the front-rear position of the sound source in the aural space.
  • Therefore, the horizontal and vertical positions of a pixel in an image can preferably be converted into the left-right and front-rear positions of the sound source, respectively.
  • Those skilled in the art may reflect the left, right, up and down positions of a pixel in the resulting sound in various ways and may define the position of a pixel using coordinate axes other than the horizontal and vertical axes; a simple stereo panning case is sketched below.
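The sketch below illustrates only the simplest case named above: the normalized horizontal pixel position is mapped to a left-right stereo position with constant-power panning. The constant-power law and the x_norm convention (0 = left edge, 1 = right edge) are illustrative assumptions; front-rear placement would require a multichannel or binaural setup and is not shown.

    import numpy as np

    def pan_left_right(mono, x_norm):
        """Sketch: constant-power stereo panning from the normalized horizontal
        pixel position x_norm in [0, 1] (0 = left edge, 1 = right edge)."""
        angle = x_norm * (np.pi / 2.0)
        left = np.cos(angle) * mono
        right = np.sin(angle) * mono
        return np.stack([left, right], axis=-1)            # shape (samples, 2)

    mono = np.sin(2 * np.pi * 440.0 * np.arange(4410) / 44100.0)
    stereo = pan_left_right(mono, x_norm=0.8)              # pixel near the right edge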
  • Brightness-volume determination step S410: In the second embodiment, the L value (i.e., lightness) of the HLS color system is associated with the volume of the converted sound.
  • Since the brightness is associated with volume, there is no need to limit the number of brightness levels, and those skilled in the art may freely set the levels of brightness and volume according to the system environment or the designer's concept.
  • The process of determining the musical note is as follows.
  • The octave level should be determined first, and the note should then be determined within that octave level.
  • In the first embodiment, the octave is determined in accordance with the brightness of a pixel and the note is determined in accordance with the hue value, whereas in this second embodiment the octave is determined according to the position of the pixel within the image along a given direction (e.g., the y-axis direction) and the note is determined according to the hue value.
  • In other words, the vertical shape of an image, or the up-down position of a pixel within the image, is associated with the octave value of the converted sound for clearer discrimination.
  • For this, the vertical height of the input image needs to be divided into ten sections corresponding to the ten octave sections of the audible frequency range.
  • The vertical section containing a given pixel is determined among the ten sections of the image and is associated with the octave value of the converted sound.
  • Then the note within the resulting octave should be determined; the note determination can be carried out using the hue value in the HLS color system, as in the first embodiment.
  • That is, the note is determined after the hue values of the HLS color system are tuned according to the HarmonyHue color system. A sketch of the vertical-section-to-octave mapping follows.
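A sketch of the vertical-section-to-octave mapping used in this second embodiment is given below. The assumption that row 0 is the top row and that the bottom band of the image maps to the lowest octave is ours; the description above fixes only the ten-band division.

    def octave_from_row(row, image_height, num_octaves=10):
        """Sketch: divide the image height into num_octaves horizontal bands and
        return the octave index (0 = lowest) of the band containing `row`.
        Row 0 is assumed to be the top row of the image."""
        band = min(row * num_octaves // image_height, num_octaves - 1)
        return (num_octaves - 1) - band                    # top band -> highest octave

    print(octave_from_row(row=5, image_height=100))        # near the top    -> 9
    print(octave_from_row(row=95, image_height=100))       # near the bottom -> 0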
  • Saturation-noise mixture ratio determination step S430: The second embodiment also uses the pure tone-to-noise ratio to represent the saturation of the color.
  • Here, however, the octave value of the noise is determined by the position of the pixel within the input image along the predetermined direction (e.g., the y-axis direction).
  • The association between the position of the image along the horizontal axis (i.e., x-axis) and the left-right position of the sound source in a stereo audio environment is the same as in the first embodiment.
  • The position of the image along the vertical axis (i.e., y-axis), however, is used to determine the octave values of the fundamental tone and the noise.
  • The process of determining the octave value for the noise is the same as that for the fundamental tone, and the octave levels or values are usually identical for both, except in those cases where high-octave noise is added to a lower fundamental tone, or low-octave noise is added to a higher fundamental tone, in order to represent special or functional information about the image.
  • 4-1. Fundamental tone data generation S510: PCM data corresponding to the wave of the note of the fundamental tone can be produced by performing a PAM (Pulse Amplitude Modulation) step followed by a PCM (Pulse Code Modulation) step.
  • The PAM step converts the analog signal (usually a sine wave) corresponding to the frequency of the note of the fundamental tone into a pulse series having different heights.
  • The PCM step converts the pulse series into binary code strings, the length of each code string corresponding to the number of bits used to represent the digital level of each pulse, and then converts the binary code strings into a pulse series to be transmitted as base-band signals. A minimal sampling-and-quantization sketch follows.
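The sampling-and-quantization sketch below follows this description: the PAM step samples the sine wave of the fundamental tone into amplitude values, and the PCM step quantizes each value into a binary code. A 16-bit code and a 44.1 kHz sampling rate are assumed here; the actual values would come from the sound generation options.

    import numpy as np

    def sine_to_pcm16(freq_hz, duration_s=0.1, sample_rate=44100):
        """Sketch: PAM samples the analog sine wave into a pulse series of
        varying heights; PCM quantizes each pulse into a 16-bit binary code."""
        t = np.arange(int(duration_s * sample_rate)) / sample_rate
        pam = np.sin(2 * np.pi * freq_hz * t)              # pulse amplitudes
        pcm = np.round(pam * 32767).astype(np.int16)       # binary code words
        return pcm

    pcm_data = sine_to_pcm16(55.0)                         # e.g. a 55 Hz fundamental tone
    print(pcm_data[:8])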
  • The PCM data may be generated according to the predetermined sound generation options.
  • 4-2. Noise data generation S520: PCM data for the noise are likewise generated according to the predetermined sound generation options, by the same process as for the fundamental tone.
  • 4-3. Fundamental tone-noise synthesis S530: The PCM data for the fundamental tone acquired in 4-1 and the PCM data for the noise acquired in 4-2 are synthesized according to the fundamental tone-to-noise ratio determined in the saturation-noise mixture ratio determination step of 2-3 or 3-3.
  • 4-4. Sound synthesis by addition of individual data S540
  • The PCM data generation and synthesis in 4-1, 4-2 and 4-3 are carried out for an individual pixel or pixel block, so the PCM data for the whole image need to be obtained by adding the PCM data of each pixel or pixel block, taking into account the scan mode, block size, etc. set in the image analysis options; see the mixing sketch below.
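A minimal sketch of this addition step is shown below; it simply sums the per-pixel-block waveforms sample by sample and rescales the result to the 16-bit PCM range. The peak normalization is an assumption made here to avoid clipping and is not specified in the description.

    import numpy as np

    def mix_blocks(block_waveforms):
        """Sketch: add the waveforms of all pixel blocks sample by sample and
        rescale the sum so that it stays within the 16-bit PCM range."""
        total = np.sum(np.stack(block_waveforms, axis=0), axis=0).astype(np.float64)
        peak = np.max(np.abs(total))
        if peak > 0:
            total = total / peak * 32767
        return total.astype(np.int16)

    # Usage with two hypothetical block waveforms of equal length:
    blocks = [np.sin(2 * np.pi * f * np.arange(4410) / 44100.0) for f in (110.0, 220.0)]
    whole_image_pcm = mix_blocks(blocks)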
  • The resulting sound synthesized by the above processes is generated by output means such as speakers, and the output environment is preferably determined by the option values set in the sound generation option step S210.
  • Since the steps in 4-1 and 4-4 cover sound generation using digital devices such as computer systems, those steps may be omitted when an analog audio system capable of generating analog signals based on the fundamental tone and noise signal information is used.
  • Figs. 18 and 19 are a block diagram and an operation flow chart, respectively, of the apparatus for converting an image into sound according to another embodiment of the present invention.
  • The image-sound conversion apparatus in Fig. 18 comprises an image acquiring/inputting device 1100, an image frequency and pattern analyzing device 1200, an image/sound association device 1300, a reference sound source DB 1400, an audio frequency synthesis and output device 1500, an input device 1600 and a speaker 1700.
  • The image acquiring/inputting device 1100 generates image data by scanning or photographing image sources such as photos, pictures or motion pictures using a scanner (not shown) or a digital camera. If the image source itself is in a digital format, the image data can be inputted through an ordinary data input interface. It will therefore be well understood by those skilled in the art that the present invention is not limited or restricted to a specific way of generating data, a specific image resolution, a specific image file format, etc.
  • The image/sound association device 1300 converts the sound source from the reference sound source DB 1400 so as to correspond to the frequency and pattern of the image analyzed by the image frequency and pattern analyzing device 1200, and outputs the converted sound to the audio frequency synthesis and output device 1500.
  • The image/sound association device 1300 preferably associates the values of hue, brightness, saturation and position of each pixel in the source image with the pitch, octave, timber and position of the converted sound as shown in Fig. 3, and modifies the sound source accordingly.
  • Other embodiments of the present invention may also be employed for the conversion of the hue, brightness, saturation and position of a pixel into the pitch, octave, timber and position of the generated sound, respectively.
  • The image/sound association device 1300 preferably modifies the sound source based on harmonics, that is, the law of harmony.
  • Preferably, the image/sound association device 1300 modifies the sound source to correspond to major musical chords.
  • The image/sound association device 1300 reads from the reference sound source DB 1400 the sound source selected by the user through the input device 1600.
  • The image/sound association device 1300 may add harmonics to the sine wave as noise in order to convert the saturation of the image into the timber of the converted sound.
  • The audio frequency synthesis/output device 1500 synthesizes the audio frequencies, that is, the converted sounds from the image/sound association device 1300, and generates the resulting sound using at least one speaker 1700.
  • The audio frequency synthesis/output device 1500 can output the synthesized audio frequencies using multiple speakers arranged in a left-right, up-down, ellipsoid, lattice or three-dimensional layout in order to reflect the spatial form of the input image in the output sound.
  • The audio frequency synthesis/output device 1500 may output the synthesized sound for a period of time set by the user. The overall operation will now be explained with reference to Fig. 19.
  • In steps S1000 and S1100, the image acquisition/input device 1100 scans or acquires the image based on the resolution and scan direction set by the user and provides the resulting image data to the image frequency/pattern analysis device 1200.
  • In step S1200, when the image data are transmitted from the image acquisition/input device 1100, the image frequency/pattern analysis device 1200 analyzes the frequencies and pattern of the image data to acquire information on the hue, brightness, saturation, shape, left/right/up/down position and pattern of each pixel or pixel block in the image, as well as the size of the whole image, and provides the acquired information to the image/sound association/conversion device 1300.
  • The image/sound association/conversion device 1300 then modifies the sound source by associating the information on the hue, brightness, saturation, left/right/up/down position and pattern of each pixel and the size of the whole image with the pitch, octave, timber, wave shape, sound source position, sound pattern and volume of the sound, and provides the modified sound to the audio frequency synthesis/output device 1500.
  • Lastly, the audio frequency synthesis/output device 1500 synthesizes the modified sound based on the output option selected by the user and generates the resulting sound through the speaker 1700.
  • As described above, a color image can be converted into a beautiful sound based on the law of harmony, and new musical content can be automatically produced by converting natural landscapes, places of interest, famous paintings, etc. according to the present invention.
  • In an exhibition of works of art such as famous pictures, more effective appreciation can be achieved by providing music related to the displayed works of art.
  • The method and apparatus of the present invention can be employed in digital devices such as mobile terminals, digital cameras, digital camcorders, synesthesia education devices for color and music, and automatic music composing devices.
  • The present invention may also be used to enhance the effect of music and color therapy by simultaneously stimulating the visual and aural senses in medical, educational and healthcare facilities for the purpose of emotional balance or medical treatment.
  • Furthermore, the present invention may be adapted to all kinds of industrial and household applications that require matching of image and sound.


Abstract

The present invention relates to a method and an apparatus for converting an image into sound, and more particularly to a method and an apparatus for converting an image into sound by associating each color element of the colors (or lights) of a color image with a respective sound element of the converted sound, so that a color image can be converted into sound while most of the sensible features of the color image are reflected in the converted sound. The method for converting an image into sound according to the present invention comprises the steps of: associating at least one color element of at least one pixel or pixel block having color elements of a first color system with at least one corresponding color element of a second color system; determining a first frequency or wavelength within a visible frequency range corresponding to the value of the color elements of the second color system; and associating the determined first frequency or wavelength within the visible frequency range with a second frequency or wavelength within an audible frequency range.

Description

[DESCRIPTION]
[invention Title]
METHOD AND APPARATUS FOR CONVERTING IMAGE TO SOUND
[Technical Field] The present invention relates to a method and an apparatus for converting an image into sound, and more particularly to a method and an apparatus for converting an image into sound by associating each color element of the colors (or lights) of a color image with a respective sound element of the converted sound, so that a color image can be converted into sound while most of the sensible features of the color image are reflected in the converted sound.
Further, the present invention relates to a method and an apparatus for converting image into sound using harmonics, a law of harmony.
[Background Art]
Light and sound can both be defined as waves in the physical sense, and they share the basic features of waves, such as amplitude and wavelength (or frequency). For light, the amplitude determines brightness and the wavelength (or frequency) determines hue. For sound, the amplitude determines loudness and the wavelength (or frequency) determines pitch. Based on these features, there have been technical attempts, such as LED level displays in audio equipment or the visualizers of audio file players, to convert light into sound and sound into light by associating the amplitude and wavelength of sounds with those of lights. However, since only a limited number of sound features (that is, amplitude and/or wavelength) were made to correspond to those of light in these attempts, it was impossible to reflect the hue, brightness and saturation of a color image in the converted sound, or to perform a full-range mutual conversion of light (or a visual image) to and from sound across the overall visible and audible frequency ranges. Consequently, the above-mentioned conventional attempts had many limitations in terms of mutual conversion between light (or image) and sound.
In principle, one of the reasons that a perfect mutual conversion of light and sound is difficult is that light and sound have fundamentally different octave structures. This will be explained with reference to Fig. 1, which shows a frequency range covering the audible and visible frequency ranges of human beings.
Human ears usually detect sound within a frequency range of 20Hz~20kHz (or a wavelength range of roughly 20m~20mm), which corresponds to about ten octaves. Human eyes normally recognize light of 792~396nm wavelength, which corresponds to only a single octave (see the short check below). Therefore a full-range conversion between sound and light (or color) cannot be obtained by a simple one-to-one match of frequencies between the audible and visible frequency ranges.
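A short numeric check of the octave counts quoted above, using the 20Hz~20kHz and 792~396nm limits stated in this paragraph:

    import math

    audible_octaves = math.log2(20_000 / 20)   # ~9.97, i.e. about ten octaves of sound
    visible_octaves = math.log2(792 / 396)     # exactly 1.0, a single octave of light
    print(audible_octaves, visible_octaves)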
Considering the above limitations, the applicant of the present invention disclosed a novel sound-image conversion method in U.S. Patent No. 6,686,529, issued on Feb. 3, 2004, wherein the audible frequency range is divided into ten sections, i.e., octaves, and each octave of the audible frequency range is associated with a visible frequency range having a different brightness of light. In that invention, each color in the visible frequency range having a given brightness level was associated with an audible musical note (or pitch) of a reference octave level, and each color in the visible frequency range having a higher brightness level was associated with an audible musical note of a higher octave level. Full-range conversion between the full visible frequency range and the full audible frequency range became possible by associating each of ten sections in the visible frequency range with one of the ten octaves in the audible frequency range.
However, even in the above invention, only the conversion of sound into light was disclosed, while the method for converting light into sound was not disclosed in detail. Further, the above invention did not disclose any method for naturally converting an image itself, having a given dimension, shape and other features, into sound by analyzing the image and converting the color or light of each pixel of the image into a respective sound.
[Disclosure] [Technical Problem]
Therefore, an object of the present invention is to solve the problems involved in the prior art, and to provide a method and apparatus for analyzing color information of light of a pixel or a pixel block constituting a given image and converting the color information of light or color pixel into corresponding information of sound.
Another object of the present invention is to provide a method and an apparatus for converting an image into sound wherein the sensible features of the image are maintained in the converted sound by associating the values of the three elements of light, that is, hue, brightness and saturation, with the values of the musical note, octave and timber of the converted sound, respectively. Another object of the present invention is to provide a method and an apparatus for converting an image into sound by associating the three elements of light, that is, the hue, brightness and saturation of light, as well as the position of the light source (that is, the pixel) within an image, with the musical note (or pitch), magnitude or volume of sound, noise mixture ratio and octave level of the converted sound, respectively, so that the converted sound can in the end be reconstructed into an image.
Still another object of the present invention is to provide a method and an apparatus for converting light or an image across the full visible frequency range having arbitrary hue, brightness and saturation values into a sound across the full audible frequency range.
Still another object of the present invention is to provide a method and an apparatus for converting an image into a sound according to harmonics, a law of harmony.
Still another object of the present invention is to provide a method and an apparatus for converting a visible frequency signal into an audible frequency signal using harmonics.
Still another object of the present invention is to provide a method and an apparatus for converting a color image, not a black and white image, into a sound so that the image recognition level of blind persons can be greatly enhanced. [Technical Solution]
The image-sound conversion method according to one aspect of the present invention comprises the steps of:
(a) associating at least one color elements of at least one pixel or pixel block having color elements of a first color system with at least one corresponding color elements of a second color system;
(b) determining a first frequency or wavelength within a visible frequency range corresponding to the value of the color elements of the second color system; and
(c) associating the determined first frequency or wavelength within the visible frequency range with a second frequency or wavelength within an audible frequency range.
The first and second color systems are formed based on the color elements of hue, brightness and saturation, and the wavelength ratio of red, green and blue colors in the second color system corresponds to the wavelength ratio of the musical notes Do, Mi and Sol in the audible frequency range.
The image-sound conversion method further comprises, in advance to step (a) , the step of (d) associating all color elements of at least one pixel or pixel block of a third color system formed based on the color elements of predetermined reference colors with the at least one color elements of the at least one pixel or pixel block in the first color system. The first color system is HLS color system and the third color system is RGB color system.
The image-sound conversion method further comprises the step of (e) setting image scan variables relating to input of the pixel or pixel block and sound generation variables relating to sound generation for generating sound using the second frequency or wavelength in the audible frequency range.
The image-sound conversion method may be implemented by a computer program.
The image-sound conversion method according to another aspect of the present invention comprises the steps of:
(a) associating the value of hue of at least one pixel or pixel block of a first color system with the value of hue of the pixel or pixel block of a second color system, the first and second color systems having color elements of hue, brightness and saturation, and the wavelength ratio of red, green and blue colors in the second color system being corresponding to the wavelength ratio of the musical notes Do, Mi and Sol in an audible frequency range;
(b) determining a first frequency or wavelength within a visible frequency range corresponding to the value of hue of the second color system;
(c) dividing the frequency spectrum or wavelength spectrum in the audible frequency range into a number of octave sections, dividing the range of brightness in the first and second color systems into the same number of levels, and determining an octave section in the audible frequency range in accordance with the level of brightness of the pixel or pixel block, of which the first frequency or wavelength has been determined in step (b) ; and
(d) within the octave section determined in step (c) , associating the first frequency or wavelength determined in step (b) with a second frequency or wavelength within the audible frequency range.
The image-sound conversion method further comprises the step of (e) adding a noise frequency or wavelength within the octave section to the second frequency or wavelength within the audible frequency range in step (d) in association with the value of saturation in the first and the second color systems.
The image-sound conversion method further comprises, in advance to step (a) , the step of (f) associating the values of hue, brightness and saturation of the pixel or pixel block of a third color system having color elements of predetermined reference colors with the value of hue, brightness and saturation of the first color system, respectively.
The image-sound conversion method further comprises the step of (g) generating sound using the second frequency or wavelength within the audible frequency range, wherein the position of the generated sound source being corresponding to the position of the pixel or pixel block within the input image.
The image-sound conversion method further comprises the steps of:
(h) generating fundamental tone data in accordance with the second frequency or wavelength in the audible frequency range corresponding to the value of hue and brightness in the first and the second color systems; (i) generating noise data in accordance with noise frequencies or wavelengths in the audible frequency range corresponding to the value of saturation in the first and the second color systems; (j) synthesizing the fundamental tone data and the noise data; and
(k) generating sound in accordance with the synthesized data. The image-sound conversion method further comprises the step of (l) setting image scan variables and sound generation variables, the image scan variables being related to image scanning process for input of the pixel or pixel block and the sound generation variables being related to sound generation process for generating sound using the frequencies or wavelengths within the audible frequency range.
The image-sound conversion method according to yet other aspect of the present invention comprises the steps of:
(a) associating the value of hue of at least one pixel or pixel block of a first color system with the value of hue of the pixel or pixel block of a second color system, the first and second color systems having color elements of hue, brightness and saturation, and the wavelength ratio of red, green and blue colors in the second color system being corresponding to the wavelength ratio of the musical notes Do, Mi and Sol in an audible frequency range;
(b) determining a first frequency or wavelength within a visible frequency range corresponding to the value of hue of the second color system; (c) dividing the frequency spectrum or wavelength spectrum in the audible frequency range into a number of octave sections, dividing the input image having the pixel or pixel block into the same number of sections along a predetermined direction, and determining an octave section in the audible frequency range in accordance with the image section containing the pixel or pixel block, of which the first frequency or wavelength has been determined in step (b) ; and (d) within the octave section determined in step (c) , associating the first frequency or wavelength determined in step (b) with a second frequency or wavelength within the audible frequency range.
The image-sound conversion method further comprises the step of (e) adding noise frequencies or wavelengths within the octave section to the second frequency or wavelength within the audible frequency range in step (d) in association with the value of saturation in the first and the second color systems.
The step (d) of the image-sound conversion method further comprises the step of (f) determining the amplitude of the second frequency or wavelength in the audible frequency range in accordance with the value of brightness in the first and the second color system.
The image-sound conversion method further comprises the step of (g) generating sound using the second frequency or wavelength within the audible frequency range, the amount of noise added to the generated sound being corresponding to the position of the pixel or pixel block along the predetermined direction, and the position of the generated sound source being corresponding to the position of the pixel or pixel block along a direction different from the predetermined direction within the input image. The image-sound conversion method further comprises the steps of:
(h) generating fundamental tone data in accordance with the second frequency or wavelength in the audible frequency range corresponding to the values of hue and brightness in the first and the second color systems;
(i) generating noise data in accordance with noise frequencies or wavelengths in octave section of the audible frequency range corresponding to the value of saturation in the first and the second color systems;
(j) synthesizing the fundamental tone data and the noise data; and
(k) generating sound in accordance with the synthesized data. The image-sound conversion method further comprises, in advance to step (a), the step of (l) associating the values of hue, brightness and saturation of the pixel or pixel block of a third color system having color elements of predetermined reference colors with the value of hue, brightness and saturation in the first color system, respectively.
The image-sound conversion method further comprises the step of (m) setting image scan variables and sound generation variables, the image scan variables being related to image scanning process for input of the pixel or pixel block and the sound generation variables being related to sound generation process for generating sound using the frequencies or wavelengths within the audible frequency range. The image-sound conversion method according to yet another aspect of the present invention comprises the step of converting a pixel or image related information that is increasing or decreasing at a certain rate in a visible frequency range into a pitch or sound related information that is increasing or decreasing at the same rate in an audible frequency range.
The image-sound conversion method according to yet another aspect of the present invention comprises the step of converting a pitch or sound related information that is increasing or decreasing at a certain rate in an audible frequency range into a pixel or image related information that is increasing or decreasing at the same rate in a visible frequency range.
The image-sound conversion method according to a further aspect of the present invention comprises the steps of: (a) detecting the values of hue, brightness, saturation and position within an image of each pixel in the image; and
(b) associating the values of hue, brightness, saturation and position of each pixel within the image detected at step (a) with the values of pitch, octave, timber and position of sound, respectively, and then generating the sound having the values of pitch, octave, timber and position of sound corresponding to the values of hue, brightness, saturation and position of each pixel. The step (a) of the image-sound conversion method comprises steps of: (a-1) determining the scan resolution of the image;
(a-2) scanning the image along a predetermined direction with a scan resolution determined at step (a-1) ; and (a-3) analyzing the image data generated at step (a-2) to obtain the values of hue, brightness, saturation and position of each pixel in the image.
In the image-sound conversion method, the direction or type of scan is selected from a group comprising Left-Right, Up-Down, Ellipsoid, Lattice and Three-dimensional space.
In the image-sound conversion method, the step (b) generates the sound in accordance with the predetermined direction and speed of scan. In the image-sound conversion method, the step (b) generates the sound based on sine waves.
In the image-sound conversion method, the step (b) generates the sound according to a law of harmony.
In the image-sound conversion method, the step (b) generates the sound that corresponds to major harmonious chords. In the image-sound conversion method, the step (b) selects the types and arrangements of musical instruments and then generates the sound based on the sound source of the selected musical instruments. Some or all steps of the aforementioned methods for image- sound conversion may be implemented by a computer program.
The image-sound conversion apparatus according to an aspect of the present invention comprises:
(a) an image analysis/tuning means for analyzing and tuning the input image;
(b) a visible/audible frequency association means for associating the visible frequency with the audible frequency; and
(c) a sound synthesis means for synthesizing sound, wherein the image analysis/tuning means associates the value of hue of at least one pixel or pixel block of a first color system with the value of hue of the pixel or pixel block of a second color system, and determines a first frequency or wavelength within a visible frequency range corresponding to the value of hue of the second color system, the first and second color systems having color elements of hue, brightness and saturation, and the wavelength ratio of red, green and blue colors in the second color system being corresponding to the wavelength ratio of the musical notes Do, Mi and Sol in an audible frequency range; wherein the visible/audible frequency association means associates the determined first frequency or wavelength in the visible frequency range with a second frequency or wavelength in the audible frequency range; and wherein the sound synthesis means adds noise frequencies or wavelengths in the audible frequency range to the second frequency or wavelength in the audible frequency range in association with the value of saturation in the first and the second color systems .
In the image-sound conversion apparatus, the image analysis/tuning means associates all color elements of at least one pixel or pixel block of a third color system formed based on the color elements of predetermined reference colors with the at least one color elements of the at least one pixel or pixel block in the first color system. In the image-sound conversion apparatus, the first color system is HLS color system and the third color system is RGB color system.
The image-sound conversion apparatus further comprises: (d) an image input means for providing the image to the image analysis/tuning means; and
(e) a sound generation means for generating sound provided by the sound synthesis means.
The image-sound conversion apparatus further comprises (f) a reference image DB for storing reference images, wherein the information regarding correspondence between the input image provided to the image analysis/tuning means and the reference images stored in the reference image DB is reflected on the synthesized sound. The image-sound conversion apparatus further comprises (g) a reference sound source DB for storing reference sound sources, wherein the information regarding correspondence between the sound synthesized by the sound synthesis means and the reference sound sources stored in the reference sound source DB is reflected on the synthesized sound.
The image-sound conversion apparatus according to other aspect of the present invention comprises: a means for detecting the values of hue, brightness, saturation and position within an image of each pixel in the image; and a means for associating the values of hue, brightness, saturation and position of each pixel within the image detected by the detecting means with the values of pitch, octave, timber and position of sound, respectively, and then generating the sound having the values of pitch, octave, timber and position of sound corresponding to the values of hue, brightness, saturation and position of each pixel. In the image-sound conversion apparatus, the detecting means comprises: a means for determining the scan resolution of the image; a means for scanning the image along a predetermined direction with a scan resolution determined by the determining means; and a means for analyzing the image data generated by the scanning means to obtain the values of hue, brightness, saturation and position of each pixel in the image.
In the image-sound conversion apparatus, the direction or type of scan of the scanning means is selected from a group comprising Left-Right, Up-Down, Ellipsoid, Lattice and Three- dimensional space.
In the image-sound conversion apparatus, the sound generation means generates the sound in accordance with the predetermined direction and speed of scan.
In the image-sound conversion apparatus, the sound generation means generates the sound based on sine waves.
In the image-sound conversion apparatus, the sound generation means generates the sound according to a law of harmony.
In the image-sound conversion apparatus, the sound generation means generates the sound that corresponds to major harmonious chords . In the image-sound conversion apparatus, the sound generation means selects the types and arrangements of musical instruments and then generates the sound based on the sound source of the selected musical instruments. The image-sound conversion apparatus further comprises an image input means for scanning or photographing a source image and then providing the obtained image to the detecting means.
The image-sound conversion apparatus further comprises a sound source DB for storing at least one sound source, wherein the sound generation means generates the sound by modifying the sound source.
The image-sound conversion apparatus further comprises an output means consisting of a multiple of speakers connected to the sound generation means, wherein the output means outputs the sound with a type of sound output selected from a group comprising Left-Right, Up-Down, Ellipsoid, Lattice and Three- dimensional space.
In the image-sound conversion apparatus, the output means outputs the sound for a predetermined time period. Meanwhile at least some parts of the image-sound conversion apparatus can be replaced by a computer program.
[Advantageous Effects]
According to the present invention, there are provided a method and an apparatus for analyzing the color information of a pixel or a pixel block constituting a given image and converting the color information into corresponding information of sound. According to the present invention, there are also provided a method and an apparatus for converting an image into sound wherein the sensible features of the image are maintained in the converted sound by associating the three elements of light, that is, hue, brightness and saturation, with the musical note, octave and timber of the converted sound, respectively.
According to the present invention, there are provided a method and an apparatus for converting image into sound by associating three elements of color, that is hue, brightness and saturation as well as the position of pixel within an image with musical note, volume of sound, noise mixture ratio and octave level of converted sound, respectively, so that the converted sound can be reconstructed to an image in the end.
According to the present invention, there are provided a method and an apparatus for converting light or an image across the full visible frequency range having arbitrary hue, brightness and saturation value into a sound across the full audible frequency range.
According to the present invention, there is provided a handheld image-sound conversion apparatus and method thereof.
According to the present invention, there is provided a method and an apparatus for converting image into sound for the blind people.
According to the present invention, there is provided a method and an apparatus for converting image into sound and vice versa that are optimally adapted to the visual and aural characteristic of human beings, where the corresponding relationship regarding frequency/wavelength between the image and sound is set in correspondence to the frequency/wavelength ratio between colors within an image and the frequency/wavelength ratio between musical notes of the sound.
When used in an application where sound is reconstructed from the image or vice versa, the time period needed by a user to adapt to the function of the present invention shall be minimized since the method and apparatus for converting image into sound have been constructed to optimally fit to visual and aural characteristic of human beings .
[Description of Drawings]
Fig. 1 shows a frequency range covering the visible and audible frequency ranges .
Fig. 2 is a flow chart according to the method for converting image into sound according to the present invention.
Fig. 3 is a detailed flow chart of the method for converting image into sound according to a first embodiment of the present invention.
Fig. 4 is a detailed flow chart of the method for converting image into sound according to a second embodiment of the present invention.
Fig. 5 is a detailed flow chart of a sound generation process of the method for converting image into sound according to an embodiment of the present invention.
Fig. 6 is a schematic diagram of the apparatus for converting image into sound according to an embodiment of the present invention.
Fig. 7 shows an exemplary image for explaining the method for converting image into sound according to an embodiment of the present invention.
Fig. 8 is a graphical view of RGB color system.
Fig. 9 is a graphical view of HLS color system.
Fig. 10 shows a frequency conversion range according to an embodiment of the present invention.
Fig. 11 shows a color-sound conversion table according to an embodiment of the present invention.
Fig. 12 is a schematic diagram showing the musical note determination process according to an embodiment of the present invention.
Fig. 13 shows energy distribution of clear sound according to an embodiment of the present invention.
Fig. 14 shows energy distribution of noisy sound according to an embodiment of the present invention.
Fig. 15 shows exemplary scanning process according to an embodiment of the present invention.
Fig. 16 shows exemplary scanning process according to other embodiment of the present invention.
Fig. 17 shows an application of the apparatus for converting image into sound according to an embodiment of the present invention.
Fig. 18 shows a block diagram of the apparatus for converting image into sound according to an embodiment of the present invention.
Fig. 19 shows an operation flow chart of the apparatus for converting image into sound shown in Fig. 18.
< Explanation of selected reference numerals >
600: an image-sound conversion device
610: an image input device
620: an image analysis/tuning means
630: a visible/audible frequency association means
640: a sound synthesis means
650: an option input means
660: a reference image/sound source DB (database)
670: a sound generation device
[Best Mode]
Hereinafter the present invention will be explained in more detail with reference to the accompanying drawings illustrating a preferred embodiment of the present invention.
Fig. 2 is a flow chart of the method for converting an image into sound according to the present invention.
In Fig. 2, in advance to the image analysis or sound generation, a step S210 for setting an image analysis option and sound generation option can be performed. The image analysis option is used to set image-scanning variables such as input sequence of the pixel or pixel block contained in an image, size of pixel block, etc. The sound generation option is used to set sound-generation variables such as number of channels for generating the converted sound, bit rate, generation speed, etc.
This step S210, of course, can be inserted at any other position between the steps of the above method. Once an image is inputted, a color system conversion step S220 is performed, where information in a color-oriented coordinate system such as RGB is converted into information of a different color system having hue, brightness and saturation as reference values, such as the HLS color system; a conversion sketch is given below.
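A minimal sketch of this color system conversion step, using the HLS conversion from the Python standard library, is shown below. The exact conversion used by the invention is not prescribed here; this is only the common RGB-to-HLS formula, and the helper name is ours.

    import colorsys

    def rgb_to_hls_pixel(r, g, b):
        """Sketch: convert an 8-bit RGB pixel into HLS (hue, lightness, saturation),
        each normalized to [0, 1]."""
        return colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)

    h, l, s = rgb_to_hls_pixel(200, 80, 40)     # a reddish-orange pixel, for example
    print(f"hue={h:.3f}  lightness={l:.3f}  saturation={s:.3f}")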
In order to associate the visible frequency with the audible frequency, each color in the visible frequency range should be arranged in the same way as each musical note in the audible frequency range. So a color value tuning step S230 is performed for adjusting or tuning each color value in accordance with the frequency ratios of the colors in the visible frequency range; a numerical illustration follows.
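The tuning constraint can be illustrated numerically. The sketch below assumes a just-intonation major triad (Do : Mi : Sol frequency ratio 4 : 5 : 6) and a nominal red wavelength of 700 nm; neither value is fixed by this description, so the resulting green and blue wavelengths only illustrate the stated ratio constraint, not the actual HarmonyHue values.

    # Wavelength is inversely proportional to frequency, so matching the R, G, B
    # wavelength ratio to the Do : Mi : Sol frequency ratio gives:
    DO_MI_SOL = (4, 5, 6)                       # assumed just-intonation triad ratio
    RED_NM = 700.0                              # hypothetical anchor wavelength for red

    red_nm, green_nm, blue_nm = (RED_NM * DO_MI_SOL[0] / f for f in DO_MI_SOL)
    print(red_nm, green_nm, blue_nm)            # 700.0, 560.0, ~466.7 nm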
Then, a frequency association step S240 is performed for associating the frequency or wave in the visible frequency range corresponding to the tuned color value with a frequency or wave in the audible frequency range. The frequency association step S240 should be performed so that color information such as hue, brightness and saturation contained in the visible frequency signal is effectively transferred to the audible frequency signal. Effective ways of frequency association will be explained with reference to Figs. 3 and 4.
Lastly, a sound generation step S250 for generating the audible frequency signal determined by the frequency association is performed. Fig. 3 is a detailed flow chart of the first embodiment of the frequency association step S240 of the present invention shown in Fig. 2. In Fig. 3, the hue, brightness, saturation and position information of a pixel or pixel block contained in the visible frequency signal is converted so as to correspond to the note, octave, noise mixture ratio and position, respectively, of the major sound (the terms major sound, major tone, fundamental tone, fundamental note, pure tone and pure sound are used interchangeably in this document) contained in the audible frequency signal.
First, a brightness-octave determination step S310 is performed, where the brightness value of a pixel or pixel block is classified according to the octave classification of audible sound and the octave of the major sound is determined based on the brightness value. Then, a hue-note determination step S320 for determining the note of the major sound in accordance with the hue value is performed, where the note of the major sound is allocated within the octave section determined in the brightness-octave determination step S310.
Further, a saturation-noise mixture ratio determination step S330 for adding a complex noise to the major sound in order to represent the saturation value of the pixel or pixel block is performed. The octave section of the noise is typically the same as the octave section of the major sound, and the mixture ratio of the major sound and the noise is determined by the value of saturation. The major sound means the sound determined by the hue value of the pixel or pixel block, and the noise means the sound other than the major sound that is contained in the output sound. In the present invention, the noise representing the saturation of the pixel or pixel block is separately generated during the sound generation process.
Further, following or in parallel to the above steps, a pixel position-sound source position determination step S340 is performed to associate the position of the pixel or pixel block in the input image with the position of the sound source recognized in the generated sound.
In the aforementioned steps, the sequential order of the steps is not important and the sequence can easily be varied by those skilled in the art.
Fig. 4 is a detailed flow chart of the second embodiment of the frequency association step S240 of the present invention shown in Fig. 2.
In Fig. 4, the hue, brightness, saturation and position information of the pixel or pixel block contained in the visible frequency signal is converted so as to correspond to note of major sound, volume of total generated sound, noise mixture ratio and octave of the major sound, respectively, contained in the audible frequency signal. In the second embodiment in Fig. 4, the brightness value of the pixel or pixel block is used for determining the volume of the sound to be generated. The brightness-volume determination step S410 for classifying and associating the value of brightness and volume of sound can be performed regardless of the sequence of the other steps or together with one of the other steps.
Then, a hue-note determination step S420 for determining the note of the major sound in accordance with the hue value of the pixel or pixel block is performed, where the note of major sound is allocated to the octave section determined in the octave determination step S440.
Further, a saturation-noise mixture ratio determination step S430 for adding a complex noise to the major sound in order to represent the saturation value of the pixel or pixel block is performed. The octave section of the noise is typically determined to be the same as the octave section of the major sound, and the mixture ratio of the major sound and the noise is determined by the value of saturation. The major difference of the second embodiment from the first embodiment is that a pixel position-octave determination step S440 is performed, where the position of the pixel or pixel block within the input image is used for determining the octave of the major sound and/or noise in the generated sound. For example, in order to associate the vertical position (in the direction of height) of a pixel or pixel block with the value of octave, the rectangular image is divided into several sections along the vertical axis, and the vertical section in which the pixel or pixel block is located corresponds to the value of octave in the generated sound. If the position information of the pixel in the horizontal direction is also used, then this step can be represented as a pixel position-sound source position determination step S440.
In the aforementioned steps, the sequential order of the steps is not important and the sequence can easily be varied by those skilled in the art. Fig. 5 is a detailed flow chart of an exemplary sound generation process of the method for converting image into sound according to an embodiment of the present invention.
Fig. 5 illustrates a sound generating process using the major sound and noise information that have been derived from the pixel or pixel block information.
The major sound and noise information determined from the steps explained with reference to Figs. 2, 3 and 4 are information about analog sound having a frequency (and wave) and amplitude. While this analog sound can directly be generated by an analog audio device, it usually is generated by a digital audio device and thus an analog-digital conversion is needed.
Fig. 5 relates to the digital information conversion process. First, digital data of a major sound (that is, a sound that determines the note) is produced in the fundamental note data generation step S510 using the information about the major sound signal. Secondly, digital data of noise (that is, a background sound) is produced in the noise data generation step S520 using the information about the noise signal. Thirdly, the major sound data signal and the noise data signal are synthesized in the fundamental tone-noise synthesis step S530. Lastly, output sound is generated using the synthesized data signal in the sound generation step S540.
Fig. 6 is a schematic diagram of the apparatus for converting image into sound according to an embodiment of the present invention.
The apparatus or device for converting image into sound 600 of the present invention basically comprises an image analysis/tuning means 620, a visible/audible frequency association means 630, and a sound synthesis means 640.
The image analysis/tuning means 620 performs the color system conversion step S220 and color value tuning step S230 of the Fig. 2, and the visible/audible frequency association means 630 performs the frequency association step S240 in Fig. 2 and other steps in Fig. 3 or 4. The sound synthesis means 640, a component for synthesizing the generated sound based on the determined octave, note, noise mixture ratio and volume information, performs the steps illustrated in the Fig. 5, for example .
When a media device for sound synthesis is available outside the image-sound conversion device or apparatus 600, the sound synthesis means 640 may only provide information for sound synthesis, such as the octave, note, noise mixture ratio and volume, and the separate media device may perform the function of sound synthesis based on the above information from the sound synthesis means 640. The sound synthesis means 640 is often called a sound mixing means in the art. The options for image analysis and sound generation may be set by the option input means 650 of the image-sound conversion device 600. Otherwise the option setting may be accomplished by a computer program or another device connected to the image-sound conversion device 600 of the present invention. While the image-sound conversion device or apparatus 600 may be an independent device having the above essential functions, it can also be embedded in or configured as a part or component of a conventional image input device 610 such as a camera or scanner and/or a conventional sound generation device 670 such as a speaker or speaker-embedded lighting equipment. Further, as shown in Fig. 17, a handheld image-sound conversion system can be realized by a combination of a handheld image input device 1710 having an image sensor in its center portion, a handheld sound generation device 1730 and a handheld image-sound conversion device 1720.
Further, the image-sound conversion device 600 can be equipped with a reference image DB 660 storing reference images and the information regarding the correspondence between the input image of the image analysis/tuning means 620 and the predetermined image stored in the reference image DB 660 can be reflected in the synthesized sound output.
For example, in the case of an image-sound conversion device 600 for the blind people, the reference image information such as traffic signs and warning signs is stored in the reference image DB 660 in advance, and then an alarm or warning notice is added to the generated sound when the input image corresponds to one of the pre-stored reference images. The image-sound conversion device 600 may further comprise a reference sound source DB 660 incorporated in or separate from the reference image DB 660. Herein the information regarding the correspondence between the synthesized sound from the sound synthesis means 640 and the predetermined sound stored in the reference sound source DB 660 can be reflected on the generated sound output.
For example, according to experiments by the inventor of the present invention, the sound converted from raw images of nature by the image-sound conversion apparatus 600 was found to form a natural harmony, that is, a musical chord according to the law of musical harmony. Using this result, if information on predetermined chords based on the law of harmony is stored in the reference sound source DB 660, then the image-sound conversion apparatus 600 can stop or reduce the volume of the generated sound, or add an additional sound containing a harmonious chord to compensate for the discord, when the color harmony of the input image is not desirable and the generated sound is therefore discordant.
The above explained usage of the reference image/sound source DB 660 is an example and those skilled in the art may use the reference image/sound source DB 660 in various ways without departing from the spirit of the present invention. Further, some or all components shown in the Fig. 6 can be implemented by a computer program by those skilled in the art.
Hereinafter, each step of image-sound conversion method according to a preferred embodiment of the present invention will be explained in more detail with reference to Fig. 7 showing an example of input image.
0. A step for setting options for image analysis/sound generation S210
The method of the present invention may begin with the image analysis step for analyzing the input image and identifying the hue, brightness, saturation and position of pixels or pixel blocks in the input image. Preferably, a step for setting image analysis and/or sound generation options is provided before starting the method of the present invention with the image analysis step. The option can be manually set by a user before the image analysis step by the apparatus or computer program of the present invention, or it can be automatically set by a computer program based on the characteristic of the input image.
The image analysis options of the apparatus or computer program of the present invention (as easily understood by those skilled in the art, at least part of the method and apparatus of the present invention can be implemented as a computer program) may comprise (1) the image scan mode, (2) the image analysis mode, (3) the size of the pixel block, (4) the unit of each scan, (5) the scanning speed and (6) the resolution (overall resolution and/or zone resolution within an image).
The image scan mode is preferably configured to simulate the image recognition pattern of human eyes, and it includes, for example, a left-right mode, an up-down mode, a difference resolution mode (lattice type, ellipsoid type, etc.) and a motion picture mode. The left-right mode repeatedly scans the image from the left side to the right side or from the right side to the left side. Likewise, the up-down mode repeatedly scans the image upward and downward.
The difference resolution mode scans each zone in the image with a different resolution. The zones may be formed as a lattice type or an ellipsoid type, and each lattice or ellipsoid may have a scan resolution different from the others. In the lattice type, as shown in Fig. 16, the overall image is divided into several zones of concentric lattices of different sizes, and the scanning is performed with a higher resolution in the center zone and a lower resolution in the outer zones. The ellipsoid type is a kind of lattice type in which the lattice shape is an ellipsoid. The lattice type and ellipsoid type can reflect the visual characteristic of human eyes, which analyze the center of the visual field in more detail; a zoning sketch is given below.
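A sketch of such concentric-lattice zoning is given below: each pixel block is assigned to a zone by its distance from the image center, and inner zones would then be scanned at a higher resolution. The three-zone count and the Chebyshev distance are illustrative choices, not values taken from the description.

    def zone_of_block(bx, by, blocks_x, blocks_y, num_zones=3):
        """Sketch: assign a pixel block (bx, by) to one of num_zones concentric
        lattice zones; zone 0 is the center zone (highest scan resolution)."""
        cx, cy = (blocks_x - 1) / 2.0, (blocks_y - 1) / 2.0
        # Chebyshev distance from the center, normalized to [0, 1]
        d = max(abs(bx - cx) / max(cx, 1.0), abs(by - cy) / max(cy, 1.0))
        return min(int(d * num_zones), num_zones - 1)

    print(zone_of_block(8, 8, 16, 16))          # central block -> zone 0
    print(zone_of_block(0, 0, 16, 16))          # corner block  -> zone 2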
The motion picture mode is used when a motion picture is inputted instead of a still image. The motion picture mode scans each image that constitutes the motion picture to make an image frame and acquires information about the difference between the frames.
At least one of the above scan modes can be employed at a time, and other scan modes may be added by those skilled in the art.
Regarding the image analysis mode, the two-dimensional analysis (x, y axes) uses one image, whereas the three-dimensional analysis (x, y, z axes) uses two images of the same object. In the three-dimensional analysis, like the working principle of human eyes, two pictures are taken of a single object (for example, L/R images) and the difference between them (for example, the L/R difference) is represented. In the two-dimensional analysis, two speakers (that is, L and R speakers) are used for generating the converted sound; in the three-dimensional analysis, at least four channel speakers are used and a time-delay technique can be applied to represent an object at a distance. The two-dimensional and three-dimensional analysis techniques are already known in the art, and other analysis methods may be employed by those skilled in the art.
Meanwhile, the image-sound conversion can be performed on a pixel block basis rather than a pixel basis, that is, the basic unit of scan is a pixel block of a predetermined size. For example, a given image can be divided into a multiple of 8*8 or 16*16 pixel blocks and the scanning process is performed block by block. For the pixel block-based operation, the hue, brightness, saturation and position of each pixel in a pixel block are detected, and representative values, such as mean values, of hue, brightness, saturation and position for the overall pixel block are taken.
Image processing methods for pixel blocks are widely known in the image processing industry, and thus the detailed explanation will be omitted. While the present invention will be explained in terms of a unit pixel, a pixel block can be used in the same way. In most cases, the image will preferably be processed on a pixel block basis, considering the volume of data to be processed, the visual/audio resolution of human beings, etc.
Usually, the pixel blocks are not scanned on a one-by-one basis. For example, in the left-right scan mode shown in Fig. 15, scanning is performed from the left-hand side to the right-hand side as shown by an arrow 1520, and a column 1510 of pixel blocks located on the same vertical line is scanned at the same time. Considering the process in the human brain for reconstructing images from the generated sound, a scanning scheme that simultaneously scans the pixels or pixel blocks contained in a predetermined area, such as a row or column, is desirable. Well-known technologies may be employed regarding the scan speed and scan resolution.
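As an illustration only (not part of the original disclosure), the block-based representative values and the column-wise left-right scan described above might be sketched as follows in Python; the array layout and function names are assumptions.

```python
import numpy as np

def block_means(hls_image: np.ndarray, block: int = 8) -> np.ndarray:
    """Reduce an (H, W, 3) array of per-pixel H, L, S values to per-block mean values.

    Averaging hue directly is a simplification (hue is circular), but it is
    enough to illustrate the representative-value idea described above.
    """
    h, w, _ = hls_image.shape
    h, w = h - h % block, w - w % block                        # crop to whole blocks
    tiles = hls_image[:h, :w].reshape(h // block, block, w // block, block, 3)
    return tiles.mean(axis=(1, 3))                             # (h//block, w//block, 3)

def scan_left_right(block_values: np.ndarray):
    """Yield one whole column of pixel blocks at a time, as in the left-right scan mode."""
    for col in range(block_values.shape[1]):
        yield block_values[:, col, :]
```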
Meanwhile, the apparatus or program of the present invention may employ sound generation options including, but not limited to, (1) number of channels, (2) bit rate, (3) bit resolution, (4) sampling rate, (5) generation speed, etc.
The concepts of number of channels, bit rate, bit resolution, sampling rate, etc. are well known in the art and their explanation will be omitted. The generation speed can be defined as a "Time per Tone" value, that is, the number of notes or tones generated in a given unit time.
The step of setting the above options is usually performed before the image analysis step, but it may also be inserted at an arbitrary position among the steps of the present invention by a user command or computer program.
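For illustration, the analysis and generation options listed above could be grouped as plain configuration records; the field names and default values below are assumptions, not part of the disclosed method.

```python
from dataclasses import dataclass

@dataclass
class ImageAnalysisOptions:
    scan_mode: str = "left-right"   # "left-right", "up-down", "lattice", "ellipsoid", "motion"
    analysis_mode: str = "2d"       # "2d" (x, y) or "3d" (x, y, z from an L/R image pair)
    block_size: int = 8             # edge length of a pixel block, e.g. 8 or 16
    scan_unit: str = "column"       # pixels or pixel blocks scanned together in one step
    scan_speed: float = 10.0        # scan steps per second
    resolution: float = 1.0         # overall (or per-zone) scan resolution factor

@dataclass
class SoundGenerationOptions:
    channels: int = 2               # stereo for 2-D analysis, four or more for 3-D analysis
    sample_rate: int = 44100        # sampling rate in Hz
    bit_depth: int = 16             # bit resolution of the PCM quantizer
    tones_per_second: float = 4.0   # "Time per Tone": notes or tones generated per unit time
```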
1. Image analysis step
1-1. RGB-HLS conversion step S220
In Fig. 7, the exemplary rectangular image 700 is composed of x*y pixels and divided into the object image 701 at the center and the background image 702. Each pixel 703, 704 in the image 700 has its own color information including the three color elements of hue, brightness and saturation. In order to convert an image into sound, information regarding the hue, brightness, saturation and position of each pixel in the image should be detected.
Generally there are many kinds of color systems for representing color information. One group of color systems, such as RGB (Red, Green, Blue) and CMYK (Cyan, Magenta, Yellow, Key black), is based on major reference colors, and another group, such as HLS (Hue, Lightness, Saturation) and HSV (Hue, Saturation, Value), is based on the three elements of hue, brightness and saturation. In the present invention, a color system of the second group, such as HLS or HSV, employing the three elements of hue, brightness and saturation may be used, and an additional step of coordinate conversion is needed for an RGB or CMYK system.
The RGB color system represents colors by the addition of the three reference colors red, green and blue, which are the three primary colors of light. The RGB system is currently used as a basic color model in the computer graphics industry, and almost all kinds of image files and video cards use the RGB system. The RGB coordinate system for representing colors according to the RGB color system is shown in Fig. 8. Meanwhile, the HLS color system represents colors by hue, brightness and saturation, which are the three primary elements of color in the HLS color system. Since a color is recognized by its hue, brightness and saturation by the human eye, the HLS system is the color model most similar to human vision, and so the HLS system is used for the detection of colors in the image-sound conversion method of the present invention. The HLS coordinate system for representing colors according to the HLS color system is shown in Fig. 9.
HSV is very similar to HLS, and the explanation regarding HLS in the present invention can be equally or very similarly applied to HSV by those skilled in the art without departing from the teaching of the present invention. Other color systems that use hue, brightness and saturation, or their direct derivatives, as basic elements can also be employed in the present invention.
When the present invention is implemented with currently existing analog/digital devices such as digital cameras, CCTVs and scanners that usually represent pixels in RGB format, an additional step for converting the RGB coordinates of each pixel into HLS coordinates needs to be performed. A block-based coordinate conversion should be performed in consideration of the resolution of the image, the performance of the coordinate conversion process, the aural resolution of human ears for the converted sound, etc.
Methods of converting RGB coordinates into HLS coordinates are widely known, and any of them may be employed in the present invention. The R, G, B values for each pixel can be converted into corresponding H, L, S values by the coordinate conversion.
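As a minimal sketch of this coordinate conversion, Python's standard colorsys module already provides an RGB-to-HLS conversion; the 8-bit input scaling shown here is an assumption about the pixel format.

```python
import colorsys

def rgb_to_hls(r8: int, g8: int, b8: int):
    """Convert 8-bit R, G, B pixel values to H, L, S values in the range 0..1."""
    return colorsys.rgb_to_hls(r8 / 255.0, g8 / 255.0, b8 / 255.0)

print(rgb_to_hls(255, 0, 0))   # a saturated red pixel -> (0.0, 0.5, 1.0)
```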
1-2. A step for tuning hue value S230
After the H, L and S values are determined, the values need to be further tuned by the method of the present invention for a one-to-one association between color and sound.
In the hue wheel of the HLS coordinate system, red, green and blue are arranged at 120-degree intervals; however, their actual visible frequencies correspond to different angles. This will be explained in detail below.
The audible and visible frequency ranges may vary slightly according to the person or system that detects colors. The standard wavelengths of the three basic colors red, green and blue also vary according to the development of color technology for representing colors. In addition, human eyes have inferior discrimination for reddish colors of longer wavelength. In the present invention, arbitrary frequency ranges are selected within the standard wavelength ranges presented so far for the application of the principle of the present invention. Then, in consideration of the inferior discrimination of reddish colors of longer wavelength, the visible frequency range is shifted toward shorter wavelengths within the bluish color recognition boundary in order to set an applicable visible frequency range having enhanced discrimination for reddish colors.
One embodiment of the present invention deals with an aural frequency range of 20~28,160Hz, comprising the audible frequency range of human beings, and a visible frequency range of 4.6*10^8 ~ 8.8*10^8 MHz (equal to 650~340nm), set for the above-mentioned reason regarding discrimination of color recognition. The audible and visible frequency ranges according to this embodiment of the present invention are shown in Fig. 10.
Meanwhile, in this embodiment of the present invention, the 12 tone equal temperament (a representative musical temperament currently used as the standard temperament worldwide, in which 12 notes are spaced evenly within an octave, one unit of the 12 notes is called a semitone and two units are called a whole tone) is used for representing the musical notes. Using the 12 tone equal temperament, one can easily associate colors with their corresponding notes based on the frequency ratios of the 12 notes, even though there might be some error at the red-blue boundary according to the selection or setting of the applicable visible frequency range.
The color-sound conversion table according to the above embodiment in Fig. 11 will be used for further explanation. Fig. 11 is a color-sound conversion table for the frequency ranges shown in Fig. 10. In Fig. 11, the basic note "Do" in the 12 tone equal temperament corresponds to the basic color "Red (650nm)", and the remaining notes in the 12 tone equal temperament correspond to their respective colors accordingly. When the frequencies of colors are arranged in logarithmic scale, the colors red, orange, yellow, green, blue and purple can be arranged at equal distances in the clockwise direction starting from the uppermost position of the color-sound conversion table of Fig. 11. The logarithmic musical scale of an octave including the twelve notes Do(C), Do#(C#), Re(D), Re#(Eb), Mi(E), Fa(F), Fa#(F#), Sol(G), Sol#(Ab), La(A), La#(Bb) and Ti(B) is also equally spaced on the color-sound conversion table.
The notes are equally divided in logarithmic scale since the frequencies of the notes in the 12 tone equal temperament constitute a geometric series: the frequency increases by a fixed multiple (approx. 1.059 times) as the note rises by a semitone, and the frequency doubles after 12 semitone increases. Likewise, the colors whose frequencies increase (or decrease) by the same ratio (approx. 1.059 times) are arranged at equal distances on the color-sound conversion table in Fig. 11. From this arrangement of colors and musical notes, the inventor of the present invention found that the frequencies of colors and musical notes are both equally spaced in logarithmic scale and that the ratio between the audible frequencies of "Do, Mi, Sol" and the visible frequencies of "Red, Green, Blue" is equal. In terms of wavelength, the wavelength ratio is Do(C) : Mi(E) : Sol(G) = Red(R) : Green(G) : Blue(B) = 1 : 4/5 : 2/3. Therefore the lights of red, green and blue can be associated with the musical chord Do-Mi-Sol. This kind of relationship was found to hold across the overall audible and visible frequency ranges. Through a number of experiments, the inventor of the present invention obtained color combinations corresponding to, or associated with, harmonious musical chords and found that the color combinations derived from musical chords make more harmonious color arrangements than those corresponding to musical discords.
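A quick numerical check of the stated relationship, assuming 12 tone equal temperament frequency ratios, can be written as follows; it only illustrates that the equal-tempered Mi and Sol ratios are close to 5/4 and 3/2, so the corresponding wavelength ratios are roughly 1 : 4/5 : 2/3.

```python
# Frequency ratios of Mi (4 semitones) and Sol (7 semitones) above Do in 12 tone equal temperament:
mi = 2 ** (4 / 12)     # about 1.2599, close to the just ratio 5/4
sol = 2 ** (7 / 12)    # about 1.4983, close to the just ratio 3/2

# Wavelength is inversely proportional to frequency, so the relative wavelengths are:
print(1.0, 1.0 / mi, 1.0 / sol)   # about 1 : 0.794 : 0.667, i.e. roughly 1 : 4/5 : 2/3
```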
In the color wheel of the HLS color system, red, green and blue are spaced at 120-degree intervals, and these distances can be expressed as a wavelength ratio of 1 : 1/3 : 2/3. However, since the wavelength ratio of the real visible frequencies of these colors is 1 : 4/5 : 2/3, the hue values on the color wheel of the HLS color system should be tuned or adjusted to correspond to the wavelength ratio of the real colors.
That is, the frequencies and wavelengths of the red, green and blue colors in the HLS color system should be tuned so that they are arranged at 0 : 120 : 210 degrees rather than 0 : 120 : 240 degrees, and those of the other colors should also be tuned accordingly. The ratio of the RGB colors in the HLS color system is 1 : 1/3 : 2/3, whereas the ratio between the real colors is 1 : 4/5 : 2/3. In order to have hue values corresponding to the frequency ratio, the hue values of the bluish-green colors between green and blue and the hue values of the purple colors between blue and red have to be tuned in the opposite direction on the color wheel so that red, green and blue have the tuned ratio of 1 : 4/5 : 2/3.
In the apparatus or computer program of the present invention, the above tuning may be accomplished by the association of hue values of each pixel or pixel block with new values of visible frequencies reflecting the above tuning scheme.
The visible frequencies for the colors of the pixels or pixel blocks obtained through this tuning scheme exactly correspond or are proportional to the frequencies and/or wavelengths of the audible frequencies of the musical notes, and thus each color in the visible frequency range can have a one-to-one matching relationship with a musical note in the audible frequency range.
This new color system, established by tuning the hue values according to the logarithmic frequency ratio of the visible frequencies, is defined as "HarmonyHue" in the present invention. Consequently, Fig. 11 can be called a color-sound conversion table according to the HarmonyHue color system.
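A minimal sketch of this one-to-one matching, assuming red at 650nm is the reference color for Do(C) and that tuned wavelengths follow the same logarithmic (semitone) spacing, might look like this; the function name and the rounding are illustrative assumptions.

```python
import math

RED_NM = 650.0   # reference wavelength assumed here for the note Do(C)

def wavelength_to_semitone(wavelength_nm: float) -> float:
    """Semitone offset x from Do for a tuned (HarmonyHue) wavelength.

    Shorter wavelengths give higher frequencies and therefore higher x, using
    the same logarithmic spacing as the 12 tone equal temperament.
    """
    return 12.0 * math.log2(RED_NM / wavelength_nm)

print(round(wavelength_to_semitone(650.0)))          # 0 -> Do (red)
print(round(wavelength_to_semitone(650.0 * 4 / 5)))  # 4 -> Mi (green region)
print(round(wavelength_to_semitone(650.0 * 2 / 3)))  # 7 -> Sol (blue region)
```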
In the color-sound conversion table of the HarmonyHue color system in Fig. 11, the brightness of color and the octave of sound are associated in a one-to-one relationship and are thus arranged in the same color ring. The brightness of a given color and the octave of a given note increase in the radial direction toward the center of the table, and vice versa. Therefore the same note in different octaves (for example, "Do" in a lower octave and "Do" in a higher octave) may be represented by shifting the value of brightness for the same hue value, and vice versa.
Meanwhile, saturation originally represents the intensity of hue and is defined as the ratio or proportion of pure color in the total color consisting of the pure color and the remaining (that is, achromatic) color. In one embodiment of the present invention, the saturation value corresponds to the timber of sound, and the timber is represented by the mixture ratio of the major sound against noise.
Since the timber of sound varies with the amount of noise, such as harmonics, relative to the fundamental frequency, the timber of sound corresponding to the saturation of color can be newly defined as the fundamental wave (that is, a sine wave) divided by the combination of the fundamental wave and the noise of a given aural wave. According to this scheme, a clear sound that has a higher portion of fundamental sound is represented by a pure color having a highly saturated hue, and an unclear sound having more noise is represented by an impure color having lower saturation.
Since achromatic colors do not contain pure color, the timber of the converted sound is represented by noise only without any fundamental wave (i.e., sine wave).
According to the explanations above, those skilled in the art may select an arbitrary HarmonyHue frequency range, prepare a color-sound conversion table corresponding to the frequency ratios of the twelve musical notes of the 12 tone equal temperament, and perform an image-to-sound conversion. Here, the reference color corresponding to the note "Do(C)" can be selected by a system designer according to the optical and/or musical characteristics of the hardware of the color-sound conversion system and the HarmonyHue frequency range, and the frequencies of the other colors and notes may be associated with one another based on the selected frequency and the frequency ratios between colors of the present invention.
The aforementioned processes, including image frequency analysis, conversion of RGB coordinates into HLS coordinates and conversion of HLS coordinates into the HarmonyHue color system, can be implemented by a computer program or a set of computer programs.
2. Image-sound association step (the first embodiment)
2-1. Brightness-octave determination step S310
As described above, the audible frequency range is divided into ten sections, that is, ten octaves, for the optimal image-sound conversion. Accordingly, the L values (that is, values of lightness or brightness) are also divided into ten sections. In the present invention, each octave is associated with a corresponding section of the frequency range based on the L value (that is, lightness) of the HLS color coordinates. The note determination process will be explained in detail with reference to Fig. 12.
In the first step (STEP 1), the audible frequency range is divided into ten sections. Taking the lowest reference note as "Lower La(A)", the audible frequency range is determined based on its audible frequency, set to 27.5Hz. Since the frequency of the "one octave higher La(A')" is double the frequency of the reference "Lower La(A)", the corresponding audible frequency doubles to 55Hz. In this way, an audible frequency range of 27.5Hz~28.16kHz corresponding to ten octaves is set. In addition, the brightness sections need to be set within the boundary of recognition of human beings according to the purpose of use or the hardware environment of the image-sound conversion system. The brightness sections are normally set to ten levels according to the number of octaves in the audible frequency range.
In the second step (STEP 2), the audible frequency range is tuned to correspond to the ten octaves based on the reference note "Do(C)", and the octave of the sound is determined within the tuned audible frequency range according to the brightness level of a given color. Assuming that the color in Fig. 12 has its brightness value within the second octave range, the second octave is also determined for the corresponding sound.
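As an illustrative sketch, assuming the lightness value is normalized to 0..1 and the ten octave sections start from the lowest reference La(A) at 27.5Hz as described above:

```python
def octave_from_brightness(lightness: float, n_octaves: int = 10) -> int:
    """Map an HLS lightness value in 0..1 to one of the ten octave sections."""
    return min(int(lightness * n_octaves), n_octaves - 1)

def octave_base_frequency(octave: int, lowest_a: float = 27.5) -> float:
    """Lower boundary of the octave section, doubling from the lowest reference La(A)."""
    return lowest_a * (2 ** octave)

print(octave_from_brightness(0.18), octave_base_frequency(1))   # 1 55.0
```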
2-2. Color-note determination step S320
The color of a pixel or pixel block is associated with the note of the converted sound according to the present invention. In the third step (STEP 3) in Fig. 12, the determined octave section is again divided into twelve note sections and the note corresponding to the frequency of the given color is determined. This is a process for determining the position of the input frequency within the determined octave in accordance with the position of the color in the HarmonyHue system resulting from the image analysis step.
The "reference note Do(C) within the given octave section"?and the "reference note Do(C") within the next octave section"?has below relationship according to the 12 tone equal temperament . n"= π ' * 2 x//^
Here "x" represents the position of color within the given octave section. The value of "x"?can be determined by finding the visible frequency in the HarmonyHue color system corresponding to the hue value of the given pixel in the HLS color system. The frequency of the fundamental tone associated with the hue of color in the visible frequency range can be determined through the above processes.
2-3. Saturation-noise mixture ratio determination step S330
The ratio of fundamental tone against noise or remaining tone is used to represent the saturation of color in the present invention.
Saturation means the mixture ratio between pure color and achromatic color (that is, white, gray and black, which have only brightness and no hue). As the proportion of achromatic color decreases, the color becomes more saturated and approaches a pure color, and vice versa. In the present invention, the ratio of pure color against achromatic color is represented as pure tone against noise. When it comes to light, white light is composed of all the waves in the visible frequency range, and the more white light is mixed into a colored light, the less saturated it becomes. In the case of sound, noise consists of all the waves within a given frequency range, and the more noise is mixed with the pure tone, the less clear the tone is heard.
Therefore, in order to represent the saturation, the saturation value of the given pixel is found in the HLS color system, and then noise is added to the converted pure tone according to the position of the resulting saturation value within the overall saturation range. The octave value of the noise is equal to that of the pure tone, and both values are determined by the brightness value of the given pixel.
According to the above-mentioned process, the converted sound of a pure color has a higher pure tone-to-noise ratio and sounds like a clear, pure tone as shown in Fig. 13, while the converted sound of an impure color has a lower pure tone-to-noise ratio and sounds like an unclear, noisy tone as shown in Fig. 14. The total energies of the sounds shown in Figs. 13 and 14 are equal to each other; only the pure tone-to-noise ratio, that is, the signal-to-noise ratio (SNR), differs.
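A minimal sketch of this saturation-to-timber mapping, assuming a sine fundamental, white noise and power weights chosen so that the total energy stays roughly constant (as in Figs. 13 and 14), could be:

```python
import numpy as np

def tone_with_noise(freq: float, saturation: float, duration: float = 0.25,
                    sample_rate: int = 44100) -> np.ndarray:
    """Mix a pure sine (fundamental) and white noise according to the saturation value.

    saturation = 1.0 gives a pure tone, saturation = 0.0 gives noise only; the
    square-root weights keep the total signal power roughly constant, so only
    the pure tone-to-noise ratio (SNR) changes.
    """
    t = np.arange(int(duration * sample_rate)) / sample_rate
    tone = np.sqrt(2.0) * np.sin(2 * np.pi * freq * t)   # unit-power sine wave
    noise = np.random.randn(t.size)                      # unit-power white noise
    return np.sqrt(saturation) * tone + np.sqrt(1.0 - saturation) * noise
```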
2-4. Position determination step S340
In order to convert the color of a single pixel or pixel block, or light of a given frequency, into sound, the sound or tone can be composed from the information regarding hue, brightness and saturation alone. However, when a group of pixels forms an image, the size, shape and characteristics of the image are determined by the relative positions of the pixels, and thus the position information should be reflected in the sound for a complete conversion of the image into sound. For this, the position of a pixel along the horizontal axis of the image (i.e., the x-axis) may be converted into the left-right position of the sound source in a stereo-type sound, and the position along the vertical axis (i.e., the y-axis) may be converted into the up-down position of the sound source. Considering the characteristic of the human aural system, which is superior in sensing the left-right position of a sound source compared to the up-down position, the vertical position of the image may be advantageously implemented as the front-rear position of the sound source in the aural space.
For example, in a surround sound environment or a limited sound environment such as the inside of a car, the horizontal and vertical positions of a pixel in an image can preferably be converted into the left-right and front-rear positions of the sound source, respectively. Those skilled in the art may reflect the left, right, up and down position of a pixel in the resulting sound in various ways and may define the position of a pixel using various kinds of coordinate axes other than the horizontal and vertical axes.
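For the left-right placement, a constant-power stereo pan driven by the normalized x position is one possible sketch; the panning law is an assumption, not something specified in the disclosure.

```python
import numpy as np

def pan_stereo(mono: np.ndarray, x_norm: float) -> np.ndarray:
    """Place a mono signal between the L and R channels from the pixel's x position.

    x_norm = 0.0 is the left edge of the image and 1.0 the right edge; a
    constant-power panning law keeps the perceived loudness roughly even.
    """
    angle = x_norm * (np.pi / 2.0)
    return np.stack([np.cos(angle) * mono, np.sin(angle) * mono], axis=-1)  # (samples, 2)
```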
3. Image-sound association step (the second embodiment)
3-1. Brightness-volume determination step S410
In the second embodiment, the L value (i.e., lightness) of the HLS color system is associated with the volume of the converted sound. Unlike in the first embodiment, since the brightness is associated with volume, the number of brightness levels does not need to be limited, and those skilled in the art may freely set the levels of brightness and volume according to the system environment or the designer's concept.
3-2. Hue-note determination step S420
In the second embodiment, the process of determining the musical note is as follows. For the determination of the note, the octave level should be determined first and the note should then be determined within the determined octave level. In the first embodiment, the octave is determined in accordance with the brightness of a pixel and the note is determined in accordance with the hue value, whereas in this second embodiment the octave is determined according to the position of the pixel within the image along a given direction (e.g., the y-axis direction) and the note is determined according to the hue value.
As aforementioned, associating the left, right, up and down positions of a pixel or image with the left, right, up and down positions of a sound source is not desirable, since the human aural system has poor discrimination of the up-down position of a sound source. Therefore, in the second embodiment, the vertical shape of an image, or the up-down position of a pixel in the image, is associated with the octave value of the converted sound for clearer discrimination.
First, the vertical height of the input image is divided into ten sections corresponding to the ten octave sections of the audible frequency range. During the image analysis, the position of a given pixel among the ten vertical sections of the image is determined and associated with the octave value of the converted sound.
The process of dividing the audible frequency range into ten sections to acquire ten octave sections is equal to that in the first embodiment and is clearly shown in steps 1 and 2 in Fig. 12.
When an octave section or octave value has been determined, the note within the resulting octave should be determined, and the note determination can be carried out using the hue value in the HLS color system as in the first embodiment. As in the first embodiment, the note is determined after the hue values of the HLS color system are tuned according to the HarmonyHue color system.
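A sketch of the second embodiment's note determination, assuming the top of the image maps to the highest of the ten octave sections (the mapping direction is an assumption) and x_semitones comes from the tuned hue value as in the first embodiment:

```python
def octave_from_y(y: int, image_height: int, n_octaves: int = 10) -> int:
    """Octave section from the pixel's vertical position.

    The top of the image is assumed to map to the highest octave section;
    the opposite convention works equally well.
    """
    section = int((1.0 - y / max(image_height - 1, 1)) * n_octaves)
    return min(max(section, 0), n_octaves - 1)

def second_embodiment_frequency(y: int, image_height: int, x_semitones: float,
                                lowest_a: float = 27.5) -> float:
    """Fundamental frequency: octave from the y position, note from the tuned hue value."""
    do_freq = lowest_a * (2 ** octave_from_y(y, image_height))
    return do_freq * (2 ** (x_semitones / 12.0))
```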
3-3. Saturation-noise mixture ratio determination step S430
The second embodiment also uses the pure tone-to-noise ratio to represent the saturation of color. As aforementioned, unlike in the first embodiment, the octave value of the noise is determined by the position within the input image along a predetermined direction (e.g., the y-axis direction).
3-4. Position-octave determination step S440
In the second embodiment, the association between the position of image in the horizontal axis (i.e., x-axis) and the left-right position of sound source in a stereo-type audio environment is the same as that in the first embodiment.
However, as above-explained, the position of image in the vertical axis (i.e., y-axis) is used to determine the octave values of fundamental tone and noise.
The process of determining the octave value for the noise is the same as that for the fundamental tone, and the octave levels or octave values are usually identical for the fundamental tone and the noise, except for cases where high-tone noise is added to a lower fundamental tone, or low-tone noise is added to a higher fundamental tone, so as to represent special or functional information on the image.
4. Sound generation step
4-1. Fundamental tone data generation S510
From the frequency of the note determined through the color-note determination step of 2-2 or the hue-note determination step of 3-2, digital data are produced for the generation of sound according to the sound generation options.
In order to convert the analog signal corresponding to the frequency of the note into a digital signal, a quantization process is required that converts the analog signal into a discrete stepwise signal, using a sampling rate that allows the analog signal to be reconstructed without error according to the sampling theorem. In order to prevent distortion of the original analog signal, a proper rate of quantization error should be maintained through the control of the number of digital quantization levels according to the property and state of the original signal.
Various methods of converting the analog signal corresponding to the frequency of the note into a digital signal are well known in the art, and only PCM (Pulse Code Modulation) will be explained in connection with an embodiment of the present invention.
PCM data corresponding to the wave of the note of the fundamental tone can be produced by performing a PAM (Pulse Amplitude Modulation) step followed by PCM coding. The PAM step converts the analog signals (which usually have the shape of sine waves) corresponding to the frequency of the note of the fundamental tone into a pulse series having different heights. The PCM coding converts the pulse series into binary code strings, where the length of each binary code string corresponds to the number of bits of the digital level of the pulse series, and then converts the binary code strings into a pulse series to be transmitted as base-band signals. The PCM data may be generated according to the predetermined sound generation options.
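As an illustrative sketch of the PAM sampling followed by 16-bit PCM quantization (the duration, sampling rate and bit depth are assumed option values):

```python
import numpy as np

def to_pcm16(freq: float, duration: float = 0.25, sample_rate: int = 44100) -> np.ndarray:
    """Sample a sine at the note frequency (PAM step) and quantize it to 16-bit PCM codes."""
    t = np.arange(int(duration * sample_rate)) / sample_rate   # sampling instants
    pulses = np.sin(2 * np.pi * freq * t)                      # pulse amplitudes in -1..1
    return np.round(pulses * 32767).astype(np.int16)           # binary-coded 16-bit levels
```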
4-2. Noise data generation S520
For the noise, PCM data are generated according to the predetermined sound generation options with the same process as in the generation of the PCM data for the fundamental tone.
4-3. Fundamental tone-noise synthesis S530
The PCM data for the fundamental tone acquired in 4-1 above and the PCM data for the noise acquired in 4-2 above are synthesized according to the fundamental tone-to-noise ratio determined by the saturation-noise mixture ratio determination step of 2-3 or 3-3.
4-4. Sound synthesis by addition of individual data S540
The process of PCM data generation and synthesis in 4-1, 4-2 and 4-3 is for an individual pixel or pixel block, and thus PCM data for the whole image need to be obtained by the addition of the PCM data for each pixel or pixel block, in consideration of the scan mode, block size, etc. set in the image analysis options.
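A minimal sketch of the addition step, assuming all per-pixel (or per-block) PCM waveforms have the same length and are simply summed and rescaled to avoid clipping; the normalization choice is an assumption:

```python
import numpy as np

def synthesize_image_sound(per_block_waves) -> np.ndarray:
    """Add the 16-bit PCM data of each pixel or pixel block (assumed equal length)
    and rescale the sum so the whole-image sound does not clip."""
    mixed = np.sum([np.asarray(w, dtype=np.float64) for w in per_block_waves], axis=0)
    peak = np.max(np.abs(mixed)) or 1.0
    return np.round(mixed / peak * 32767).astype(np.int16)
```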
4-5. Sound generation S540
The resulting sound synthesized by the above processes is generated by output means such as speakers, and the output environment is preferably determined by the option values set in the sound generation option step S210.
Since the steps in 4-1 to 4-4 are for the case of sound generation using a digital device such as a computer system, those steps may be omitted when using an analog audio system capable of generating analog signals based on the fundamental tone and noise signal information.
Figs. 18 and 19 are a block diagram and an operation flow chart of the apparatus for converting an image into sound according to another embodiment of the present invention.
The image-sound conversion apparatus in Fig. 18 comprises an image acquiring/inputting device 1100, an image frequency and pattern analyzing device 1200, an image/sound association device 1300, a reference sound source DB 1400, an audio frequency synthesis and output device 1500 and an input device 1600.
The image acquiring/inputting device 1100 generates image data by scanning or photographing an image source such as photos, pictures or motion pictures using a scanner (not shown) or a digital camera. If the image source itself is in a digital format, the image data can be inputted through an ordinary data input interface. Therefore it will be well understood by those skilled in the art that the present invention is not limited or restricted to a specific way of generating the data, a specific image resolution, a specific file format of the image data, etc.
The image/sound association device 1300 converts the sound source from the reference sound source DB 1400 to correspond to the frequency and pattern of the image analyzed in the image frequency and pattern analyzing device 1200, and outputs the converted sound to the audio frequency synthesis and output device 1500. The image/sound association device 1300 preferably associates the values of hue, brightness, saturation and position of each pixel in the source image with the pitch, octave, timber and position of the converted sound as shown in Fig. 3, and modifies the sound source accordingly to correspond to the values of hue, brightness, saturation and position of each pixel. Other embodiments of the present invention may be employed for the conversion of the hue, brightness, saturation and position of a pixel into the pitch, octave, timber and position of the generated sound, respectively.
Further, the image/sound association device 1300 preferably modifies the sound source based on harmonics and the law of harmony. The image/sound association device 1300 modifies the sound source to correspond to major musical chords. In an embodiment of the present invention, the image/sound association device reads from the reference sound source DB 1400 the sound source selected by the user through the input device 1600. For example, when a sound source comprises a pure sine wave, the image/sound association device 1300 may add harmonics to the sine wave as noise in order to convert the saturation of the image into the timber of the converted sound.
The audio frequency synthesis/output device 1500 synthesizes the audio frequencies, that is, the converted sounds from the image/sound association device 1300, to generate the resulting sound using at least one speaker 1700. For example, the audio frequency synthesis/output device 1500 can output the synthesized audio frequencies using multiple speakers in a left-right, up-down, ellipsoid, lattice or three-dimensional space format in order to reflect the spatial form of the input image in the output sound. Further, the audio frequency synthesis/output device 1500 may output the synthesized sound for a certain period of time set by the user.
The overall operation will be explained with reference to Fig. 19. First, when the user selects and inputs the image resolution, source sound and output type into the apparatus, the image acquisition/input device 1100 scans or acquires the image based on the resolution and scan direction set by the user and provides the resulting image data to the image frequency/pattern analysis device 1200 in steps S1000 and S1100.
In step S1200, the image data are transmitted from the image input/acquisition device 1100 to the image frequency/pattern analysis device 1200, which analyzes the frequencies and pattern of the image data to acquire information regarding the hue, brightness, saturation, shape, left/right/up/down position and pattern of each pixel or pixel block in the image, as well as the size of the whole image, and provides the acquired information to the image/sound association/conversion device 1300.
In steps S1300 and S1400, the image/sound association/conversion device 1300 modifies the sound source by associating the information regarding the hue, brightness, saturation, left/right/up/down position and pattern of each pixel and the size of the whole image with the pitch, octave, timber, wave shape, sound source position, sound pattern and volume of the sound, respectively, and provides the modified sound to the audio frequency synthesis/output device 1500. In step S1500, the audio frequency synthesis/output device 1500 synthesizes the modified sound based on the output option selected by the user and generates the resulting sound through the speaker 1700.
Although the present invention has been described with reference to the drawings, it is understood that the above description is not to limit the invention to the embodiments shown in the drawings but simply to explain the invention. Those skilled in the art will understand that various changes and modifications can be made from the embodiments disclosed in the specification. Therefore, the scope of the present invention should be defined by the appended claims.
[Industrial Applicability]
According to the present invention, a color image can be converted into a beautiful sound based on the law of harmony, and new musical contents can be automatically produced by the conversion of natural landscapes, places of interest, famous paintings, etc. according to the present invention. In an exhibition of works of art such as famous pictures, more effective appreciation could be made available by providing music related to the displayed works of art.
The method and apparatus of the present invention can be employed in digital devices such as mobile terminals, digital cameras, digital camcorders, synesthesia education devices for color and music, and automatic music composing devices.
Further, the present invention may be used to enhance the effect of music and color therapy by simultaneously stimulating the visual and aural senses in medical, educational and healthcare facilities for the purpose of emotional balance or medical treatment.
The present invention may be adapted to all kinds of industrial and household applications that require matching of image and sound.

Claims

[CLAIMS]
[Claim 1]
A method for converting an image into sound comprising the steps of: (a) associating at least one color elements of at least one pixel or pixel block having color elements of a first color system with at least one corresponding color elements of a second color system;
(b) determining a first frequency or wavelength within a visible frequency range corresponding to the value of the color elements of the second color system; and
(c) associating the determined first frequency or wavelength within the visible frequency range with a second frequency or wavelength within an audible frequency range.
[Claim 2]
The method for converting an image into sound according to claim 1, wherein the first and second color systems are formed based on the color elements of hue, brightness and saturation, and the wavelength ratio of red, green and blue colors in the second color system corresponds to the wavelength ratio of the musical notes Do, Mi and Sol in the audible frequency range.
[Claim 3]
The method for converting an image into sound according to claim 2, further comprising, in advance to step (a) , the step of: (d) associating all color elements of at least one pixel or pixel block of a third color system formed based on the color elements of predetermined reference colors with the at least one color elements of the at least one pixel or pixel block in the first color system.
[Claim 4]
The method for converting an image into sound according to claim 3, wherein the first color system is HLS color system and the third color system is RGB color system.
[Claim 5]
The method for converting an image into sound according to claim 4, further comprising the step of: (e) setting image scan variables relating to input of the pixel or pixel block and sound generation variables relating to sound generation for generating sound using the second frequency or wavelength in the audible frequency range.
[Claim 6]
A computer program product comprising computer executable code for converting an image into sound, the code being stored on a computer readable medium, comprising code means for implementing the steps of the method according to one of claims 1 to 5.
[Claim 7]
A method for converting an image into sound comprising the steps of:
(a) associating the value of hue of at least one pixel or pixel block of a first color system with the value of hue of the pixel or pixel block of a second color system, the first and second color systems having color elements of hue, brightness and saturation, and the wavelength ratio of red, green and blue colors in the second color system being corresponding to the wavelength ratio of the musical notes Do, Mi and Sol in an audible frequency range;
(b) determining a first frequency or wavelength within a visible frequency range corresponding to the value of hue of the second color system;
(c) dividing the frequency spectrum or wavelength spectrum in the audible frequency range into a number of octave sections, dividing the range of brightness in the first and second color systems into the same number of levels, and determining an octave section in the audible frequency range in accordance with the level of brightness of the pixel or pixel block, of which the first frequency or wavelength has been determined in step (b) ; and
(d) within the octave section determined in step (c) , associating the first frequency or wavelength determined in step (b) with a second frequency or wavelength within the audible frequency range.
[Claim 8]
The method for converting an image into sound according to claim 7, further comprising the step of:
(e) adding a noise frequency or wavelength within the octave section to the second frequency or wavelength within the audible frequency range in step (d) in association with the value of saturation in the first and the second color systems.
[Claim 9]
The method for converting an image into sound according to claim 8, further comprising, in advance to step (a) , the step of:
(f) associating the values of hue, brightness and saturation of the pixel or pixel block of a third color system having color elements of predetermined reference colors with the value of hue, brightness and saturation of the first color system, respectively.
[Claim 10]
The method for converting an image into sound according to claim 9, further comprising the step of: (g) generating sound using the second frequency or wavelength within the audible frequency range, the position of the generated sound source being corresponding to the position of the pixel or pixel block within the input image.
[Claim 11]
The method for converting an image into sound according to claim 10, further comprising the steps of: (h) generating fundamental tone data in accordance with the second frequency or wavelength in the audible frequency range corresponding to the value of hue and brightness in the first and the second color systems; (i) generating noise data in accordance with noise frequencies or wavelengths in the audible frequency range corresponding to the value of saturation in the first and the second color systems;
(j) synthesizing the fundamental tone data and the noise data; and
(k) generating sound in accordance with the synthesized data.
[Claim 12]
The method for converting an image into sound according to claim 11, further comprising the step of:
(1) setting image scan variables and sound generation variables, the image scan variables being related to image scanning process for input of the pixel or pixel block and the sound generation variables being related to sound generation process for generating sound using the frequencies or wavelengths within the audible frequency range.
[Claim 13]
A computer program product comprising computer executable code for converting an image into sound, the code being stored on a computer readable medium, comprising code means for implementing the steps of the method according to one of claims 7 to 12.
[Claim 14]
A method for converting an image into sound comprising the steps of:
(a) associating the value of hue of at least one pixel or pixel block of a first color system with the value of hue of the pixel or pixel block of a second color system, the first and second color systems having color elements of hue, brightness and saturation, and the wavelength ratio of red, green and blue colors in the second color system being corresponding to the wavelength ratio of the musical notes Do, Mi and Sol in an audible frequency range; (b) determining a first frequency or wavelength within a visible frequency range corresponding to the value of hue of the second color system;
(c) dividing the frequency spectrum or wavelength spectrum in the audible frequency range into a number of octave sections, dividing the input image having the pixel or pixel block into the same number of sections along a predetermined direction, and determining an octave section in the audible frequency range in accordance with the image section containing the pixel or pixel block, of which the first frequency or wavelength has been determined in step (b) ; and
(d) within the octave section determined in step (c) , associating the first frequency or wavelength determined in step (b) with a second frequency or wavelength within the audible frequency range.
[Claim 15]
The method for converting an image into sound according to claim 14, further comprising the step of:
(e) adding noise frequencies or wavelengths within the octave section to the second frequency or wavelength within the audible frequency range in step (d) in association with the value of saturation in the first and the second color systems .
[Claim 16]
The method for converting an image into sound according to claim 15, wherein the step (d) further comprises the step of: (f) determining the amplitude of the second frequency or wavelength in the audible frequency range in accordance with the value of brightness in the first and the second color system.
[Claim 17]
The method for converting an image into sound according to claim 16, further comprising the step of:
(g) generating sound using the second frequency or wavelength within the audible frequency range, the amount of noise added to the generated sound being corresponding to the position of the pixel or pixel block along the predetermined direction, and the position of the generated sound source being corresponding to the position of the pixel or pixel block along a direction different from the predetermined direction within the input image.
[Claim 18]
The method for converting an image into sound according to claim 17, further comprising the steps of: (h) generating fundamental tone data in accordance with the second frequency or wavelength in the audible frequency range corresponding to the values of hue and brightness in the first and the second color systems;
(i) generating noise data in accordance with noise frequencies or wavelengths in octave section of the audible frequency range corresponding to the value of saturation in the first and the second color systems;
(j) synthesizing the fundamental tone data and the noise data; and (k) generating sound in accordance with the synthesized data.
[Claim 19]
The method for converting an image into sound according to claim 18, further comprising, in advance to step (a) , the step of:
(1) associating the values of hue, brightness and saturation of the pixel or pixel block of a third color system having color elements of predetermined reference colors with the value of hue, brightness and saturation in the first color system, respectively.
[Claim 20]
The method for converting an image into sound according to claim 19, further comprising the step of:
(m) setting image scan variables and sound generation variables, the image scan variables being related to image scanning process for input of the pixel or pixel block and the sound generation variables being related to sound generation process for generating sound using the frequencies or wavelengths within the audible frequency range.
[Claim 21]
A computer program product comprising computer executable code for converting an image into sound, the code being stored on a computer readable medium, comprising code means for implementing the steps of the method according to one of claims 14 to 20.
[Claim 22]
An image-sound conversion apparatus comprising:
(a) an image analysis/tuning means for analyzing and tuning the input image;
(b) a visible/audible frequency association means for associating the visible frequency with the audible frequency; and
(c) a sound synthesis means for synthesizing sound, wherein the image analysis/tuning means associates the value of hue of at least one pixel or pixel block of a first color system with the value of hue of the pixel or pixel block of a second color system, and determines a first frequency or wavelength within a visible frequency range corresponding to the value of hue of the second color system, the first and second color systems having color elements of hue, brightness and saturation, and the wavelength ratio of red, green and blue colors in the second color system being corresponding to the wavelength ratio of the musical notes Do, Mi and Sol in an audible frequency range; wherein the visible/audible frequency association means associates the determined first frequency or wavelength in the visible frequency range with a second frequency or wavelength in the audible frequency range; and wherein the sound synthesis means adds noise frequencies or wavelengths in the audible frequency range to the second frequency or wavelength in the audible frequency range in association with the value of saturation in the first and the second color systems .
[Claim 23]
The image-sound conversion apparatus according to claim 22, wherein the image analysis/tuning means associates all color elements of at least one pixel or pixel block of a third color system formed based on the color elements of predetermined reference colors with the at least one color elements of the at least one pixel or pixel block in the first color system.
[Claim 24]
The image-sound conversion apparatus according to claim 23, wherein the first color system is HLS color system and the third color system is RGB color system.
[Claim 25]
The image-sound conversion apparatus according to claim 24, further comprising: (d) an image input means for providing the image to the image analysis/tuning means; and
(e) a sound generation means for generating sound provided by the sound synthesis means.
[Claim 26]
The image-sound conversion apparatus according to one of claims 22 to 24, further comprising:
(f) a reference image DB for storing reference images, wherein the information regarding correspondence between the input image provided to the image analysis/tuning means and the reference images stored in the reference image DB is reflected on the synthesized sound.
[Claim 27]
The image-sound conversion apparatus according to one of claims 22 to 24, further comprising:
(g) a reference sound source DB for storing reference sound sources, wherein the information regarding correspondence between the sound synthesized by the sound synthesis means and the reference sound sources stored in the reference sound source DB is reflected on the synthesized sound.
[Claim 28]
An image-sound conversion method comprising the step of converting a pixel or image related information that is increasing or decreasing at a certain rate in a visible frequency range into a pitch or sound related information that is increasing or decreasing at the same rate in an audible frequency range.
[Claim 29]
An image-sound conversion method comprising the step of converting a pitch or sound related information that is increasing or decreasing at a certain rate in an audible frequency range into a pixel or image related information that is increasing or decreasing at the same rate in a visible frequency range .
[Claim 30]
A computer program product comprising computer executable code for converting an image into sound, the code being stored on a computer readable medium, comprising code means for implementing the steps of the method according to one of claims 28 and 29.
[Claim 31]
An image-sound conversion method comprising the steps of: (a) detecting the values of hue, brightness, saturation and position within an image of each pixel in the image; and
(b) associating the values of hue, brightness, saturation and position of each pixel within the image detected at step (a) with the values of pitch, octave, timber and position of sound, respectively, and then generating the sound having the values of pitch, octave, timber and position of sound corresponding to the values of hue, brightness, saturation and position of each pixel.
[Claim 32]
The image-sound conversion method according to claim 31, wherein the step (a) comprises steps of:
(a-1) determining the scan resolution of the image; (a-2) scanning the image along a predetermined direction with a scan resolution determined at step (a-1) ; and
(a-3) analyzing the image data generated at step (a-2) to obtain the values of hue, brightness, saturation and position of each pixel in the image.
[Claim 33]
The image-sound conversion method according to claim 32, wherein the direction or type of scan is selected from a group comprising Left-Right, Up-Down, Ellipsoid, Lattice and Three- dimensional space.
[Claim 34]
The image-sound conversion method according to claim 33, wherein the step (b) generates the sound in accordance with the predetermined direction and speed of scan.
[Claim 35]
The image-sound conversion method according to one of claims 31 to 34, wherein the step (b) generates the sound based on sine waves.
[Claim 36]
The image-sound conversion method according to claim 35, wherein the step (b) generates the sound according to a law of harmony.
[Claim 37]
The image-sound conversion method according to claim 35, wherein the step (b) generates the sound that corresponds to major harmonious chords.
[Claim 38]
The image-sound conversion method according to one of claims 31 to 34, wherein the step (b) selects the types and arrangements of musical instruments and then generates the sound based on the sound source of the selected musical instruments.
[Claim 39]
An image-sound conversion apparatus comprising: a means for detecting the values of hue, brightness, saturation and position within an image of each pixel in the image; and a means for associating the values of hue, brightness, saturation and position of each pixel within the image detected by the detecting means with the values of pitch, octave, timber and position of sound, respectively, and then generating the sound having the values of pitch, octave, timber and position of sound corresponding to the values of hue, brightness, saturation and position of each pixel.
[Claim 40]
The image-sound conversion apparatus according to claim 39, wherein the detecting means comprises: a means for determining the scan resolution of the image; a means for scanning the image along a predetermined direction with a scan resolution determined by the determining means; and a means for analyzing the image data generated by the scanning means to obtain the values of hue, brightness, saturation and position of each pixel in the image.
[Claim 41]
The image-sound conversion apparatus according to claim 40, wherein the direction or type of scan of the scanning means is selected from a group comprising Left-Right, Up-Down, Ellipsoid, Lattice and Three-dimensional space.
[Claim 42]
The image-sound conversion apparatus according to claim 41, wherein the sound generation means generates the sound in accordance with the predetermined direction and speed of scan.
[Claim 43]
The image-sound conversion apparatus according to one of claims 39 to 42, wherein the sound generation means generates the sound based on sine waves.
[Claim 44]
The image-sound conversion apparatus according to claim 43, wherein the sound generation means generates the sound according to a law of harmony.
[Claim 45]
The image-sound conversion apparatus according to claim 44, wherein the sound generation means generates the sound that corresponds to major harmonious chords.
[Claim 46]
The image-sound conversion apparatus according to one of claims 39 to 42, wherein sound generation means selects the types and arrangements of musical instruments and then generates the sound based on the sound source of the selected musical instruments.
[Claim 47]
The image-sound conversion apparatus according to one of claims 39 to 42, further comprising an image input means for scanning or photographing a source image and then providing the obtained image to the detecting means.
[Claim 48]
The image-sound conversion apparatus according to one of claims 39 to 42, further comprising a sound source DB for storing at least one sound source, wherein the sound generation means generates the sound by modifying the sound source.
[Claim 49]
The image-sound conversion apparatus according to one of claims 39 to 42, further comprising an output means consisting of a multiple of speakers connected to the sound generation means, wherein the output means outputs the sound with a type of sound output selected from a group comprising Left-Right, Up-Down, Ellipsoid, Lattice and Three-dimensional space.
[Claim 50]
The image-sound conversion apparatus according to claim 49, wherein the output means outputs the sound for a predetermined time period.
PCT/KR2007/001309 2006-03-16 2007-03-16 Method and apparatus for converting image to sound WO2007105927A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2006-0024537 2006-03-16
KR1020060024537A KR20070094207A (en) 2006-03-16 2006-03-16 Method and apparatus for converting image into sound
KR10-2007-0023986 2007-03-12
KR1020070023986A KR100893223B1 (en) 2007-03-12 2007-03-12 Method and apparatus for converting image to sound

Publications (1)

Publication Number Publication Date
WO2007105927A1 true WO2007105927A1 (en) 2007-09-20

Family

ID=38509708

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2007/001309 WO2007105927A1 (en) 2006-03-16 2007-03-16 Method and apparatus for converting image to sound

Country Status (1)

Country Link
WO (1) WO2007105927A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5097326A (en) * 1989-07-27 1992-03-17 U.S. Philips Corporation Image-audio transformation system
US6686529B2 (en) * 1999-08-18 2004-02-03 Harmonicolor System Co., Ltd. Method and apparatus for selecting harmonic color using harmonics, and method and apparatus for converting sound to color or color to sound

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MEIJER P.B.L. ET AL.: "An Experimental System for Auditory Image Representation", IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, vol. 39, no. 2, February 1992 (1992-02-01), XP000246185 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009065424A1 (en) * 2007-11-22 2009-05-28 Nokia Corporation Light-driven music
ITFI20080231A1 (en) * 2008-11-26 2010-05-27 Lorenzo Alocci MUSICAL SYNTHESIZER WITH LUMINOUS RADIATION
ES2373153A1 (en) * 2009-03-04 2012-02-01 Universitat De Les Illes Balears Device for detecting, identifying and converting the colors of a surface into sound (machine translation by Google Translate, not legally binding)
US9579236B2 (en) 2009-11-03 2017-02-28 Yissum Research Development Company Of The Hebrew University Of Jerusalem Ltd. Representing visual images by alternative senses
WO2011055309A1 (en) * 2009-11-03 2011-05-12 Yissum Research Development Company Of The Hebrew University Of Jerusalem Ltd. Representing visual images by alternative senses
DE102009059821A1 (en) * 2009-12-21 2011-06-22 Dietrich, Karl Werner, Dr., 51519 Method for the acoustic perception of points and/or lines, e.g. in a residential area for a blind person, in which points and/or lines in a plane or space are described acoustically by the coordinates of stereophony, sound dynamics and pitch
US9281793B2 (en) 2012-05-29 2016-03-08 uSOUNDit Partners, LLC Systems, methods, and apparatus for generating an audio signal based on color values of an image
CN103077706A (en) * 2013-01-24 2013-05-01 南京邮电大学 Method for extracting and representing music fingerprint characteristic of music with regular drumbeat rhythm
CN103077706B (en) * 2013-01-24 2015-03-25 南京邮电大学 Method for extracting and representing music fingerprint characteristic of music with regular drumbeat rhythm
RU2722279C1 (en) * 2016-05-27 2020-05-28 Цзы Хао ЦЮ Method and apparatus for converting colour data into musical notes
CN107320296A (en) * 2017-06-23 2017-11-07 重庆锦上医疗器械有限公司 The space three-dimensional acoustic expression system and method for visual signal
CN111862932A (en) * 2020-07-02 2020-10-30 北京科技大学 Wearable blind assisting system and method for converting image into sound
CN111862932B (en) * 2020-07-02 2022-07-19 北京科技大学 Wearable blind assisting system and method for converting image into sound
IT202000022453A1 (en) * 2020-09-23 2022-03-23 Andrea Vitaletti SYSTEM FOR CONVERTING IMAGES INTO SOUND SPECTRUM
WO2022064416A1 (en) * 2020-09-23 2022-03-31 Andrea Vitaletti System for converting images into sound spectrum
CN114791784A (en) * 2022-04-01 2022-07-26 北京一人一亩田网络科技有限公司 Blind person auxiliary image browsing method and system

Similar Documents

Publication Publication Date Title
WO2007105927A1 (en) Method and apparatus for converting image to sound
US6791568B2 (en) Electronic color display instrument and method
KR100736434B1 (en) Method and apparatus for harmonizing colors by harmonic sound and converting sound into colors mutually
US7525034B2 (en) Method and apparatus for image interpretation into sound
KR101657975B1 (en) music-generation method based on real-time image
CN1151485C (en) Sound and beat image display method and equipment
KR100893223B1 (en) Method and apparatus for converting image to sound
KR101896193B1 (en) Method for converting image into music
KR20070094207A (en) Method and apparatus for converting image into sound
Cavaco et al. From pixels to pitches: Unveiling the world of color for the blind
GB2268862A (en) Video playback apparatus
JP2002218263A (en) Image processing method and device for performing color/ texture conversion
KR20110052824A (en) Color reproduction device
US20110187718A1 (en) Method for converting sounds characterized by five parameters in tridimensional moving images
Pun et al. Sonification of colour and depth in a mobility aid for blind people
JP2016212412A (en) Method of generating contrast image of object structure, and related devices
JPH0659657A (en) Image processing device
JP5598185B2 (en) Conspicuous image generating apparatus and conspicuous image generating program
CN1917813A (en) Method and system for tissue differentiation
JP2008224841A (en) Performance device and performance method of music utilizing coloring tool
JP2996522B2 (en) Image processing method
JP2500923B2 (en) Color image conversion method
JPH0449775A (en) Color picture processor
JP2005326467A (en) Color-sound conversion system and color-sound conversion apparatus
CN108961142A Method for improving the hiding performance of a sub-image profile

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07715694

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1), EPO FORM 1205A SENT ON 22/12/08.

122 Ep: pct application non-entry in european phase

Ref document number: 07715694

Country of ref document: EP

Kind code of ref document: A1