WO2021101460A1 - Navigational assistance system with auditory augmented reality for visually impaired persons - Google Patents

Navigational assistance system with auditory augmented reality for visually impaired persons

Info

Publication number
WO2021101460A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
assistance device
auditory
cameras
stereo
Prior art date
Application number
PCT/TR2019/050976
Other languages
French (fr)
Inventor
Cihan TOPAL
Original Assignee
Eskisehir Teknik Universitesi
Priority date
Filing date
Publication date
Application filed by Eskisehir Teknik Universitesi filed Critical Eskisehir Teknik Universitesi
Priority to PCT/TR2019/050976 priority Critical patent/WO2021101460A1/en
Publication of WO2021101460A1 publication Critical patent/WO2021101460A1/en

Classifications

    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B - EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B21/00 - Teaching, or communicating with, the blind, deaf or mute
    • G09B21/001 - Teaching or communicating with blind persons
    • G09B21/006 - Teaching or communicating with blind persons using audible presentation of the information


Abstract

The invention relates to a navigation assistance system for persons with partial or complete visual impairment and the method of operation for this device. With this system, all the objects in the physical environment where the user is located, and the obstacles between the user and the destination, are detected, and audio-locational information related to these objects and obstacles is generated in real time. This three dimensional sound information, generated to guide the user, is transmitted to the user, allowing the user to recognize the physical environment where he/she is located.

Description

NAVIGATIONAL ASSISTANCE SYSTEM WITH AUDITORY AUGMENTED REALITY FOR VISUALLY IMPAIRED PERSONS
Field of the Invention
The invention is directed to a navigational assistance system developed for assisting individuals with partial or complete visual impairment in finding direction.
Known State of the Art
Visual impairment is generally considered one of the sensory disabilities that most affect a person's daily life. Visually impaired persons, or persons with limited visual ability (partial visual impairment), face many challenges in their daily lives, especially with regard to mobility and the ability to perform navigational tasks. In daily life they encounter many negative situations, such as the inability to perceive approaching objects in traffic or to see and recognize the persons they meet.
For example, a person with visual impairment has tremendous difficulty in tasks such as crossing from one room to another, finding an object in a place or reaching a destination. A person with visual impairment tends to perform tasks that he/she is unable to perform because of the impairment by relying on other senses. For instance, he/she may discern a person approaching by the sound of footsteps. The lack of visual information may be compensated for by information fed by the auditory or tactile system. However, this often proves insufficient and causes visually impaired persons to be slow in recognizing the obstacles in front of them, or to fail to reach a desired destination.

Furthermore, navigation systems are user-managed systems developed on the basic idea of responding to user requests. There are various options for managing and controlling a navigation system. One of them is a simple controller device comprising several buttons to change modes and provide other inputs. This option is appropriate for simple processes such as switching modes; however, it is difficult to input the names of objects to be searched. For example, when a user activates the search mode to find his cell phone or air conditioner remote control, it would be challenging to input this information with a few buttons. Also, a button-based control device design leads to mechanical parts that are relatively bulky and prone to hardware malfunctions. Thus, there is a need for a more appropriate way to provide more complex inputs to the system.
There are many apparatuses, guidance systems and navigation systems in the art developed for assisting visually impaired persons.
US2018189567A is a document of the current state of the art, disclosing a device, system and assistance method for users with visual impairment. The system of said document comprises a user-worn haptic band, generally mounted on the head, consisting of computer processors and associated support devices and algorithms configured for computer vision, a plurality of far-range haptic transducers, and a plurality of video cameras. These haptic bands are worn such that the user's hands are free for other tasks. The spatial location of each object considered important is output to the user by varying the outputs of the haptic transducers. Generic objects, identified objects and potential obstacle objects are identified and reported. This system can also optionally provide audio information or tactile graphics display information relating to these objects. However, this document does not include an object or address searching option at the user's request and does not mention any augmented auditory data.
A paper from the prior art, "A depth-based head-mounted visual display to aid navigation in partially sighted individuals" (S.L. Hicks et al. 2013. PLOS ONE, 8(7), e67695), suggests a system for persons with visual impairment consisting of a depth sensor and a 2D LED array used as an ultra-high-contrast screen. A depth map of the user's point of view is constructed, and vision is augmented via the resulting image. This system can only assist partially sighted persons. Furthermore, another paper from the prior art, "Navigation assistance for the visually impaired using RGB-D sensor with range expansion" (Aladren et al. 2016. IEEE Systems Journal, 10(3), pp. 922-932), describes a navigation system providing sound commands by using visual and range information. The system delivers sound commands at varying frequencies to the left ear, the right ear or both ears, according to the distances and locations of objects. Although it is usable by blind people, this system provides only the relative direction of objects to the left or right, rather than more accurate cues.
As a result, in order to overcome the aforementioned drawbacks, there is a need to develop a system that protects the person with visual impairment from the obstacles in the environment, guides the person towards the desired object/location, and warns/guides the person in an auditory manner.
Detailed Description of the Invention
The invention is directed to a navigational assistance system developed for assisting persons with partial or complete visual impairment in finding direction.
In this system, it is proposed to add auditory information using augmented reality technology; this technology may be referred to as auditory augmented reality technology.
By overcoming the drawbacks of the systems in the prior art, the system is appropriate both for persons with complete visual impairment and for persons with partial visual impairment, and it also aims to enhance the physical mobility of its users.
An important object of this invention is to obtain three dimensional location information of the obstacles and objects surrounding the person with visual impairment and to enable the users to perceive this information in an auditory manner. In the developed system, in order to obtain the geometric structure in the immediate vicinity of the user, the images taken are processed and a real-time audio-positional expression is generated. In other words, instead of supporting objects in the environment with virtual visual stimuli, it is aimed to help visually impaired people by transforming each object into a virtual sound source.
Apart from the sense of sight, people can obtain spatial information by processing sound waves, estimating the source location of a sound by using the phase difference between the signals arriving at the two ears. An object of the invention is to make use of this ability of the visually impaired individual to provide spatial location information of the surrounding objects and obstacles. The system developed by the invention processes images with a computer to obtain the geometric structure of the user's immediate environment through computer vision techniques and produces the scene as a spatial presentation in real time. As such, in contrast to many augmented reality applications, the proposed assistance system augments reality with sound rather than with visual information. This auditory augmentation alters the phase and amplitude of the generated sound to represent the direction and distance of the objects present in the environment. Accordingly, all the objects and obstacles surrounding the user turn into virtual sound sources at the output of the system and are presented as specific three dimensional spatial sounds, and thus the user perceives the objects in the physical environment.
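To make the phase-and-amplitude idea concrete, below is a minimal numpy sketch of how a mono cue could be rendered as a stereo signal whose interaural time and level differences encode an object's direction, and whose attenuation encodes its distance. The head width, panning law and attenuation model are illustrative assumptions, not values from the patent.

```python
import numpy as np

SAMPLE_RATE = 44100        # Hz
HEAD_WIDTH = 0.18          # metres between the ears (assumed)
SPEED_OF_SOUND = 343.0     # m/s

def spatialize(mono: np.ndarray, azimuth_deg: float, distance_m: float) -> np.ndarray:
    """Render a mono cue as stereo; azimuth 0 = straight ahead, + = right."""
    az = np.radians(azimuth_deg)
    # Interaural time difference: the ear away from the source hears it later.
    itd = HEAD_WIDTH * np.sin(az) / SPEED_OF_SOUND        # seconds, + = right
    shift = int(round(abs(itd) * SAMPLE_RATE))            # delay in samples
    # Interaural level difference via a simple equal-power panning law.
    pan = np.sin(az)
    left_gain = np.sqrt((1.0 - pan) / 2.0)
    right_gain = np.sqrt((1.0 + pan) / 2.0)
    # Louder when near, quieter when far (inverse-distance, clamped near zero).
    attenuation = 1.0 / max(distance_m, 0.3)
    left = np.concatenate([np.zeros(shift if itd > 0 else 0), mono])
    right = np.concatenate([np.zeros(shift if itd < 0 else 0), mono])
    n = max(left.size, right.size)
    left = np.pad(left, (0, n - left.size)) * left_gain * attenuation
    right = np.pad(right, (0, n - right.size)) * right_gain * attenuation
    return np.stack([left, right], axis=1)

# Example: a 0.2 s, 880 Hz beep placed 40 degrees to the right, 2 m away.
t = np.linspace(0.0, 0.2, int(0.2 * SAMPLE_RATE), endpoint=False)
beep = 0.5 * np.sin(2 * np.pi * 880.0 * t)
stereo = spatialize(beep, azimuth_deg=40.0, distance_m=2.0)
```

Played through the earphone output (4), such a signal makes the beep appear to originate to the user's right at roughly two metres.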
The system of the invention, in its broadest sense, includes an assistance device comprising at least one camera (1), at least one depth sensor (2), a plurality of microphones (3) and a multi-channel/stereo earphone output (4), together with a processing and controlling unit (6) operating in connection with this assistance device.
The system of the invention, in more detail, comprises:
- an assistance device comprising:
- at least one camera (1) for taking an image to detect objects/people/walls surrounding the user and to estimate their distances to the user,
- at least one microphone (3) for receiving sounds from the user's surroundings,
- at least one multi-channel/stereo sound output for delivering three dimensional auditory expressions to the user in order to guide/inform the user,
and, operating in connection with the assistance device:
- a processing and controlling unit (6), wherein the obtained visual, locational and auditory information are processed by software and converted into three dimensional real-time auditory expressions, comprising:
- a user interface detecting the user's voice commands, and
- a software module comprising an object detection algorithm, a face recognition technique and a written text recognition/detection technique.
In an embodiment of the invention, there is a depth sensor (2) for obtaining the structural features of the user's surroundings.
In an embodiment of the invention, there are a camera (1) and a depth sensor (2).
In an embodiment of the invention, there are two cameras (1) that are stereo-calibrated. In this embodiment, there is no depth sensor (2) and the distance of the objects to the user is measured via the stereo-calibrated cameras (1).
In another embodiment of the invention, there is a depth sensor (2) in addition to the two stereo-calibrated cameras (1).
Stereo cameras are systems inspired by the human eyes. A person sees a nearby object at different locations in the right and left eye; as the distance to the object increases, however, the image approaches the same location in both eyes, and from a sufficiently distant location it is perceived at the same location in both eyes. In the system of the invention, objects are detected by the two cameras (1) separately, and by analysing the difference between the positions at which they are detected, the distance of the object to the cameras (1) is calculated, as in the sketch below. Coloured and grey stereo cameras (1) may be used: in an embodiment of the invention there is an RGB (red-green-blue) and/or grey stereo camera (1), and in another embodiment there are two RGB/grey stereo-calibrated cameras (1).
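A minimal sketch of this triangulation, with an assumed focal length and camera baseline (in practice both come from the stereo calibration):

```python
FOCAL_LENGTH_PX = 700.0   # assumed focal length of the cameras, in pixels
BASELINE_M = 0.12         # assumed distance between the two cameras (1)

def depth_from_disparity(x_left: float, x_right: float) -> float:
    """Distance of an object seen by a stereo-calibrated pair, from the
    horizontal pixel positions of the same object detected separately
    in the left and right images (Z = f * B / disparity)."""
    disparity = x_left - x_right     # large for near objects, small for far
    if disparity <= 0.0:
        return float("inf")          # beyond the resolvable stereo range
    return FOCAL_LENGTH_PX * BASELINE_M / disparity

print(depth_from_disparity(420.0, 350.0))  # near object: ~1.2 m
print(depth_from_disparity(402.0, 400.0))  # far object:  ~42 m
```

In a full pipeline, the per-pixel version of this computation is what a dense stereo matcher (for example OpenCV's `StereoBM`) produces as a disparity map.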
In an embodiment of the invention, there are two microphones (3) or one stereo-calibrated microphone (3).
The device of the invention comprises a multi-channel/stereo earphone output (4), which requires the user to have binaural (stereophonic) hearing in order to use the suggested system. Since the basic motivation of the system is to provide directional information by exploiting the stereophonic hearing ability of humans, the suggested system is difficult to use for people with no sense of binaural hearing.
In an embodiment of the invention, there is a mixer that combines the audio information generated by the software with the external auditory information and transmits the result to the user via the earphone output (4). This mixer operates with an audio processor (DSP, digital signal processor) and software running thereon.
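As a sketch of what that mixer stage does (the gains are illustrative assumptions):

```python
import numpy as np

def mix(augmented: np.ndarray, ambient: np.ndarray,
        aug_gain: float = 0.7, amb_gain: float = 0.5) -> np.ndarray:
    """Blend the synthesized auditory augmentation with the ambient
    sound picked up by the microphones (3). Inputs are float stereo
    buffers in [-1, 1]; the sum is clipped to keep the output clean."""
    n = min(len(augmented), len(ambient))
    out = aug_gain * augmented[:n] + amb_gain * ambient[:n]
    return np.clip(out, -1.0, 1.0)
```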
In an embodiment of the invention, the processing and controlling unit (6) includes a software module comprising an object detection algorithm, a face recognition technique and a written text recognition/detection technique, and a user interface that detects the user's voice commands with its speech recognition feature.
In an embodiment of the invention, the processing and controlling unit (6) comprises an information display (7), at least one button (8) for transmitting the user's voice commands and also eliminating ambient sound, and a command microphone (9) for inputting the user's voice commands.
In an embodiment of the invention, the command microphone (9) has a noise blocking feature.
In an embodiment of the invention, the processing and controlling unit (6) is an external unit operating in connection with the device of the invention. This external device, in an embodiment of the invention, is a smart phone. In this embodiment, the device of the invention comprises Bluetooth connection technology or a wireless internet connection feature to connect with the smart phone, and the speech recognition infrastructure of the smart phone is utilized.
If the processing and controlling unit (6) is a smart phone, the noise blocking feature is realized by the smart phone. Also in this case, the smart phone's screen is used as the information display (7) and the smart phone's buttons or touch screen are used as the button (8).
The operating method of the system of the invention is also within the scope of protection of the present invention. This method is characterized by comprising the steps of:
- processing the visual information of the physical environment generated by the cameras (1) and/or depth sensor (2) and obtaining the coarse geometrical structure (CGS),
- transmitting the CGS information to the processing and controlling unit (6) in order to generate the auditory augmentation,
- transmitting the generated augmented auditory information to the mixer and mixing it in the mixer with the audio information of the environment from the microphones (3),
- transmitting the generated auditory-locational expressions to the person with visual impairment via the earphone outputs (4).
In an embodiment, the system of the invention comprises at least one operating mode. The operating modes are navigation mode, object searching mode and address searching mode.
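Read as code, one pass of the method above might look like the skeleton below; every callable is an injected, hypothetical stand-in for the corresponding hardware or software module rather than an API defined by the patent.

```python
def assistance_step(read_stereo_frames, read_depth_map, read_ambient,
                    estimate_cgs, render_auditory_scene, mix, play):
    """One iteration of the operating method: images and depth in,
    mixed three dimensional auditory expression out."""
    frames = read_stereo_frames()                # images from the cameras (1)
    depth_map = read_depth_map()                 # readings from the depth sensor (2)
    cgs = estimate_cgs(frames, depth_map)        # coarse geometrical structure
    augmentation = render_auditory_scene(cgs)    # 3D sound effects
    ambient = read_ambient()                     # sound from the microphones (3)
    play(mix(augmentation, ambient))             # out via the earphone outputs (4)
```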
In an embodiment of the invention, there is a navigation mode. In the navigation mode, a three dimensional audible expression of the current environment is generated so that the user can recognize obstacles or walls that may be encountered on the route from one point to another. In order to construct the three dimensional structure, stereo cameras (1) are used in addition to the depth sensor (2).
In this mode, in the system of the invention, the physical environment where the user is located is monitored at a predetermined angle and distance range. This is done in order to monitor the entire physical environment where the user is located while not overloading the user with excessive sound information. In other words, inserting a sound effect for an obstacle that is 10 meters away from the user does not assist navigation; rather, it creates an unpleasant experience for the user.
In an embodiment of the invention, the distance that the device scans within the physical environment is no more than 5 meters. This distance can be adjusted to adapt to the user's speed, as sketched below.
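A minimal sketch of that range limit; only the 5 m cap comes from the description, the speed scaling is an assumption:

```python
import numpy as np

MAX_RANGE_M = 5.0   # the description's upper bound on the scanned distance

def scan_range(user_speed_mps: float) -> float:
    """A faster-moving user needs earlier warnings, so the sonified
    range grows with speed up to the 5 m cap (scaling assumed)."""
    return min(MAX_RANGE_M, 2.0 + 2.0 * user_speed_mps)

def within_range(depth_map: np.ndarray, user_speed_mps: float) -> np.ndarray:
    """Mask out depth readings beyond the scan range so distant objects
    never become sound sources and never overload the user."""
    limit = scan_range(user_speed_mps)
    return np.where(depth_map <= limit, depth_map, np.inf)
```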
In an embodiment of the invention, navigation is carried out at two levels in the navigation mode. The first is the general navigation step, wherein the system delivers route and direction information to the user by using a web service providing maps; the other is the local navigation step, which allows the user to recognize the objects around him while moving in accordance with these directives and to proceed without hitting them. This allows the user to move from point A to point B as easily as a person with no visual disability.
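For the general navigation step, the directive to announce can be derived from the user's position and the next route waypoint returned by the mapping web service. The sketch below computes the initial compass bearing between two coordinates; the waypoint source and the spoken phrasing are assumptions, only the bearing math is standard.

```python
import math

def bearing_to_waypoint(lat: float, lon: float,
                        wp_lat: float, wp_lon: float) -> float:
    """Initial compass bearing in degrees (0 = north, 90 = east) from
    the user's position to the next route waypoint."""
    phi1, phi2 = math.radians(lat), math.radians(wp_lat)
    dlon = math.radians(wp_lon - lon)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0

# e.g. a bearing near 90 relative to the user's heading -> "turn right"
```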
In navigation mode, the system provides an audible spatial representation of the surrounding environment to facilitate user navigation. Since the main object during navigation is to reach from point A to point B, the system does not need to recognize the objects in the environment, but it does need to estimate their three dimensional structures. The estimated three dimensional structure is used to create a three dimensional sound representation of the environment. Thus, the user can detect walls and other obstacles via directional sound and avoid them. In order to generate the three dimensional structure, stereo cameras (1) are used along with the depth sensor (2).
An embodiment of the invention comprises an object searching mode. In object searching mode, the system detects an object in the environment via object detection algorithms and guides the user towards this object. In this mode, in addition to the depth sensor (2), stereo cameras (1) detecting objects and their locations are also used.
When the object is detected by the object recognition algorithms in the stereo image pair, the system accurately calculates the depth and direction of the object, as in the sketch below. Then, the system performs an auditory augmentation of the detected object, pairing a sound effect with the sound information; in other words, a sound effect processed to convey the spatial location is delivered. As such, it is possible to guide the user to the said object. The searched object in object searching mode may also be a person; in this case the invention uses face recognition techniques.
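Combining the two detections into a depth and a direction is a small extension of the earlier disparity sketch; the image width and focal length below are again assumed calibration values:

```python
import numpy as np

IMG_WIDTH_PX = 1280        # assumed image width in pixels
FOCAL_LENGTH_PX = 700.0    # assumed focal length in pixels
BASELINE_M = 0.12          # assumed camera baseline

def object_cue(x_left: float, x_right: float) -> tuple[float, float]:
    """Depth (m) and azimuth (degrees, + = right) of an object whose
    bounding-box centre was found at x_left/x_right in the stereo pair."""
    disparity = max(x_left - x_right, 1e-6)
    depth_m = FOCAL_LENGTH_PX * BASELINE_M / disparity
    offset_px = (x_left + x_right) / 2.0 - IMG_WIDTH_PX / 2.0
    azimuth_deg = float(np.degrees(np.arctan2(offset_px, FOCAL_LENGTH_PX)))
    return depth_m, azimuth_deg

# The cue can then drive the spatializer sketched earlier:
# depth, az = object_cue(700.0, 640.0)
# stereo = spatialize(beep, azimuth_deg=az, distance_m=depth)
```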
An embodiment of the invention comprises an address searching mode. In address searching mode, the system detects and identifies written texts on signboards or door numbers.
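The patent does not name a specific text recognition technique; as one possibility, a sketch using OpenCV and the Tesseract OCR engine (via `pytesseract`, which must be installed separately):

```python
import cv2
import pytesseract  # Python wrapper for the Tesseract OCR engine

def read_signboard(frame_bgr) -> str:
    """Read the text on a signboard or door number in a camera frame.
    Otsu thresholding first tends to help on high-contrast signs."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary).strip()
```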
There are many objects in the physical environment where the user is located. If the user wants to reach a specific object, he/she uses the object searching mode. All objects falling within the scope of the system of the invention are scanned, but a different sound effect is created for the object that the user wants to reach. In cases where other objects obstruct the user's way to the object of interest, the sound effect generated for the said objects differs from the sound effect generated for the object that the user seeks to reach.
In the system of the invention, the objects in the vicinity of the user are basically converted to virtual audio sources so that the user can notice them. Accordingly, in address searching mode the objects are converted to sound and reported to the user, while for the object that is searched a different sound is generated according to its location information, allowing the user to reach it. Since the sound scheme is updated in real time as the user moves forward and changes direction, the updated sound scheme creates the sense that the sounds really come from the objects. A sketch of this per-object sonification follows.
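A minimal sketch of that scheme, reusing the `spatialize` helper from the earlier sketch; the two cue frequencies are illustrative assumptions:

```python
import numpy as np

def render_objects(objects, target_label, make_beep, spatialize):
    """Sonify every detected object as a virtual source; the searched
    object gets a distinctive higher pitch. `objects` is a list of
    (label, azimuth_deg, distance_m) tuples, `make_beep(freq)` returns
    a mono cue and `spatialize` is the earlier sketch."""
    scene = np.zeros((0, 2))
    for label, azimuth_deg, distance_m in objects:
        freq = 1760.0 if label == target_label else 440.0  # assumed cues
        cue = spatialize(make_beep(freq), azimuth_deg, distance_m)
        n = max(len(scene), len(cue))
        scene = (np.pad(scene, ((0, n - len(scene)), (0, 0)))
                 + np.pad(cue, ((0, n - len(cue)), (0, 0))))
    return scene

# e.g. render_objects([("door", -30, 2.0), ("chair", 10, 1.2)], "door", ...)
```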
In an embodiment of the invention, there is a speech recognition feature for controlling and managing the device. This feature allows the user to easily command the system by speech. The user transmits the object or address he/she wants to find to the assistance system by voice; giving voice commands is a very practical option for persons with visual impairment. In order to prevent the user's voice from mixing with the ambient sounds, the user presses a button (8) on the device before speaking to the system. When the user presses this button (8), the ambient sounds are eliminated and only the user's voice is received through the microphones (3). In this embodiment of the invention, microphones (3) with a noise blocking feature are used.
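The patent leaves the speech recognition stack open (in one embodiment it is the smart phone's). As one possible sketch, using the third-party SpeechRecognition package, with ambient noise sampled first in the spirit of the button-triggered noise elimination:

```python
import speech_recognition as sr  # third-party "SpeechRecognition" package

def take_voice_command() -> str:
    """Capture one spoken command after the button (8) is pressed."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as mic:
        # Sample the room first so steady ambient noise can be subtracted.
        recognizer.adjust_for_ambient_noise(mic, duration=0.5)
        audio = recognizer.listen(mic, timeout=5)
    return recognizer.recognize_google(audio)  # e.g. "find my phone"
```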
In an embodiment of the invention, the assistance device is used by placing it on the user's head. In an embodiment of the invention, the assistance device is a pair of eyeglasses (5); Figure 1 relates to this embodiment. The assistance device in the form of glasses (5) shown in this figure comprises two cameras (1), one on each eyeglass (5), a depth sensor (2) located on the part of the eyeglasses (5) that rests on the nose, two microphones (3), one on each temple of the eyeglasses (5), and two earphone outputs (4), one on each temple of the eyeglasses (5). As seen in the figure, the assistance device is connected to an external processing and controlling unit (6), which consists of an information display (7), buttons (8) and a command microphone (9).
In an embodiment of the invention, the assistance device has a structure suitable to be placed on the user's forehead; Figure 2 relates to this embodiment. As seen in the figure, an assistance device comprising two cameras (1) or a stereo-calibrated camera (1), a depth sensor (2), two microphones (3) or a stereo-calibrated microphone (3), and a multi-channel/stereo earphone output (4) is placed on a headband (10). There is also a processing and controlling unit (6) to which the assistance device is connected.
In summary, the invention is a navigational assistance system based on auditory augmented reality technology for persons with partial or complete visual impairment, characterized in that it comprises an assistance device consisting of a depth sensor (2) for obtaining the structural characteristics of the physical environment where the user is located, at least one camera (1) for detecting objects/persons/walls in the vicinity of the user's location, at least one microphone (3) for receiving ambient sounds and/or voice commands from the user, and at least one multi-channel/stereo earphone output (4) for delivering to the user three dimensional auditory expressions that guide/inform the user, with the assistance device operating in connection with a user interface that detects the voice commands from the user and has a speech recognition feature, and with a processing and controlling unit (6) comprising a detection unit comprising an object detection algorithm, a face recognition technique and a written text detection/recognition technique, wherein the obtained visual, spatial and auditory information are processed and converted in real time into three dimensional audio expressions. An operating method for this device is also covered by the scope of the present invention.
With the system of the invention, provided is a system developed for enabling persons with complete or partial visual impairment to move in a physical environment as easily as persons with no visual disability, to reach a location of interest, to perceive the objects in the environment and to reach the said objects, comprising an assistance device that obtains the geometrical structure of the person's surroundings, processes images and, as a result, generates real-time audio-spatial expressions and transmits the said expressions to the user; the said system can be controlled by the user.
Description of the Figures
Figure 1 A View of the Assistance Device and Processing and Controlling unit (6) from An Embodiment of the Navigation Assistance System of Invention
Figure 2 A View of the Assistance Device and Processing and Controlling unit (6) from An Embodiment of the Navigation Assistance System of Invention
Descriptions of Reference Numbers in Figures
1. Camera
2. Depth sensor
3. Microphone
4. Earphone output
5. Eyeglasses
6. Processing and controlling unit
7. Information display
8. Button
9. Command microphone
10. Headband

Claims

1. A navigational assistance system for persons with partial or complete visual impairment, characterized by comprising;
- an assistance device comprising;
- at least one camera (1) for taking an image to detect objects/people/walls surrounding user,
- at least one microphone (3) for receiving sounds from the user's surrounding,
- at least one multi-channel/stereo earphone output (4) for delivering three dimensional auditory expressions guiding/informing the user, with the assistance device operating in connection with the following;
- a processing and controlling unit (6), wherein the obtained visual, locational and auditory information are processed and converted into three dimensional real-time auditory-locational expressions, comprising;
- a user interface detecting voice commands of a user and having speech recognition feature, and
- a software module comprising object detection algorithm, face recognition technique and written text recognition/detection technique.
2. The system according to claim 1, wherein the processing and controlling unit (6) comprises at least one button (8) for inputting the user's voice commands as well as eliminating ambient sound.
3. The system according to claim 1, wherein the processing and controlling unit (6) comprises a command microphone (9) for inputting voice commands of the user, having a noise blocking feature.
4. The system according to claim 1, wherein the processing and controlling unit (6) comprises an information display (7).
5. The system according to claim 1, wherein the processing and controlling unit (6) is a smart phone.
6. The system according to claim 5, wherein the assistance device has a Bluetooth or wireless internet feature for connecting to the smart phone.
7. The system according to claim 1, wherein the assistance device comprises two microphones (3) or a stereo-calibrated microphone (3).
8. The system according to claim 1, wherein the assistance device comprises at least one stereo-calibrated camera (1).
9. The system according to claim 1, wherein the assistance device comprises a depth sensor (2) that constructs a depth map for obtaining the structural characteristics of the physical environment where the user is located.
10. The system according to claim 9, wherein the assistance device comprises two stereo-calibrated cameras (1) and a depth sensor (2).
11. The system according to claim 1, wherein it comprises an RGB (red-green-blue) camera and/or a grey stereo camera (1).
12. The system according to claim 1, wherein the assistance device comprises two earphone outputs (4), right and left.
13. The system according to claim 1, wherein the assistance device is a pair of eyeglasses (5).
14. The system according to claim 13, comprising two cameras (1), one on each eyeglass (5), a depth sensor (2) located between the two cameras (1), two microphones (3), one on each temple of the eyeglasses (5), and a multi-channel/stereo earphone output (4) on a temple of the eyeglasses (5).
15. The system according to claim 1, wherein the assistance device has a structure suitable to be placed on a person's forehead.
16. The system according to claim 15, wherein the assistance device is located on a headband (10).
17. An operating method of the system according to any one of the preceding claims, characterized by comprising the steps of:
- processing the visual information of the physical environment obtained from cameras (1) and/or depth sensor (2) via algorithms and obtaining coarse geometrical structure (CGS) information,
- transmitting CGS information to processing and controlling unit (6) in order to generate auditory augmentation,
- transmitting augmented auditory information to the mixer,
- mixing it in the mixer by using the audio information of environment from microphones (3),
- transmitting the generated auditory-locational expressions to the person with visual impairment via earphone outputs (4).
18. The method according to claim 17, wherein, if the device comprises two cameras (1), the objects are detected by the two cameras (1) separately and the distance of the object to the cameras (1) is calculated by analysing the difference between the positions at which they are detected.
19. The method according to claim 17, comprising at least one of navigation mode, object searching mode and address searching mode.
20. The method according to claim 19, wherein, in the navigation mode, the current physical environment is monitored by the depth sensor (2) and stereo cameras (1) and a three dimensional auditory expression of the environment is created in order to enable the user to recognize the obstacles or walls that may be encountered on the route from one point to another.
21. The method according to claim 20, wherein the physical environment where the user is located is monitored at a predetermined angle and distance range.
22. The method according to claim 21, wherein the distance is at most 5 meters.
23. The method according to claim 20, comprising the two steps of a general navigation step, in which the system delivers route and direction information to the user by using a web service providing maps, and a local navigation step, in which the user is enabled to recognize surrounding objects and keep moving without hitting them while moving as per these directives.
24. The method according to claim 19, wherein the object searching mode comprises the steps of detecting the object via object recognition algorithms on the stereo camera (1) images, calculating the depth and direction of the object via the depth sensor (2), creating an auditory augmentation for the object with a sound effect, and transmitting this auditory augmentation to the user.
25. The method according to claim 24, wherein if the object that is searched is a person, the person is recognized by face recognition technique.
26. The method according to claim 24, wherein, if there are other objects forming obstacles along the direction in which the user moves to reach the desired object, the sound effect generated for these objects is different from the sound effect generated for the object that the user wants to reach.
27. The method according to claim 19, wherein, in address searching mode, written texts on signboards or door numbers are detected and identified, and the user is guided toward this location by an informing sound.
PCT/TR2019/050976 2019-11-20 2019-11-20 Navigational assistance system with auditory augmented reality for visually impaired persons WO2021101460A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/TR2019/050976 WO2021101460A1 (en) 2019-11-20 2019-11-20 Navigational assistance system with auditory augmented reality for visually impaired persons

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/TR2019/050976 WO2021101460A1 (en) 2019-11-20 2019-11-20 Navigational assistance system with auditory augmented reality for visually impaired persons

Publications (1)

Publication Number Publication Date
WO2021101460A1 (en)

Family

ID=69591701

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/TR2019/050976 WO2021101460A1 (en) 2019-11-20 2019-11-20 Navigational assistance system with auditory augmented reality for visually impaired persons

Country Status (1)

Country Link
WO (1) WO2021101460A1 (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102009043252A1 (en) * 2009-09-28 2011-03-31 Siemens Aktiengesellschaft Device and method for assistance for visually impaired persons with three-dimensionally spatially resolved object detection
US20150211858A1 (en) * 2014-01-24 2015-07-30 Robert Jerauld Audio navigation assistance
EP3058926A1 (en) * 2015-02-18 2016-08-24 Technische Universität München Method of transforming visual data into acoustic signals and aid device for visually impaired or blind persons

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115470420A (en) * 2022-10-31 2022-12-13 北京智源人工智能研究院 Health and safety prompting method based on knowledge graph, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
KR101646503B1 (en) Device, system and method for informing about 3D obstacle or information for blind person
US10638251B2 (en) Customizing head-related transfer functions based on monitored responses to audio content
KR20160001178A (en) Glass type terminal and control method thereof
JP2016208348A (en) Display device, control method for display device, and program
US10848891B2 (en) Remote inference of sound frequencies for determination of head-related transfer functions for a user of a headset
WO2019160953A1 (en) Intercom system for multiple users
US11178481B2 (en) Ear-plug assembly for hear-through audio systems
JP2022549548A (en) A method and system for adjusting the level of haptic content when presenting audio content
KR20160017593A (en) Method and program for notifying emergency exit by beacon and wearable glass device
WO2020050186A1 (en) Information processing apparatus, information processing method, and recording medium
WO2021101460A1 (en) Navigational assistance system with auditory augmented reality for visually impaired persons
Al-Shehabi et al. An obstacle detection and guidance system for mobility of visually impaired in unfamiliar indoor environments
WO2022004130A1 (en) Information processing device, information processing method, and storage medium
TR2022008128T2 (en) A NAVIGATION ASSISTANCE SYSTEM WITH AUDITORY AUGMENTED REALITY FOR THE VISUALLY IMPAIRED
Bellotto A multimodal smartphone interface for active perception by visually impaired
US20200387221A1 (en) Method, Computer Program and Head Mountable Arrangement for Assisting a Subject to Acquire Spatial Information about an Environment
JP2022548811A (en) Method and system for controlling haptic content
Scalvini et al. Visual-auditory substitution device for indoor navigation based on fast visual marker detection
WO2015140586A1 (en) Method for operating an electric blind guiding device based on real-time image processing and the device for implementing the method
KR20140080740A (en) Apparatus and method for visual information converting
JP7252313B2 (en) Head-mounted information processing device
US20230196765A1 (en) Software-based user interface element analogues for physical device elements
EP3609199A1 (en) Customizing head-related transfer functions based on monitored responses to audio content
KR20150061741A (en) Infrared depth camera, and depth image-based wearable obstacle detection device
JP2024056580A (en) Information processing device, control method thereof, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19850824

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19850824

Country of ref document: EP

Kind code of ref document: A1