TR2022008128T2

TR2022008128T2 - A NAVIGATION ASSISTANCE SYSTEM WITH AUDITORY AUGMENTED REALITY FOR THE VISUALLY IMPAIRED

Info

Publication number: TR2022008128T2
Application number: TR2022/008128
Authority: TR
Inventors: Topal Cihan
Original assignee: Eskişehir Teknik Üniversitesi İdari ve Mali İşler Daire Başkanlığı
Filing date: 2019-11-20
Publication date: 2022-06-21

Abstract

The invention relates to a navigation assistance system developed for people with partial or total visual impairment and to the method of operation of this device. The system detects all objects in the user's physical environment, as well as the obstacles between the user and the location the user wants to reach, and generates spatial-auditory information about these objects and obstacles in real time. This three-dimensional sound information, generated to guide the user, is transmitted to the user so that the user can perceive the surrounding physical environment.

Description

A NAVIGATION ASSISTANCE SYSTEM WITH AUDITORY AUGMENTED REALITY FEATURE FOR THE VISUALLY IMPAIRED

Subject of the Invention

The invention relates to a navigation assistance system developed to help people with partial or total visual impairment find their way.

State of the Art

Visual impairment is generally regarded as one of the sensory disabilities that most affects a person's daily life. People who are blind or have low vision (partial visual impairment) face many difficulties in daily life, particularly with mobility and with carrying out navigation tasks. While walking along a road, and especially in traffic, they may be unable to perceive approaching objects or to see or recognize the people they encounter. For example, a visually impaired person has great difficulty moving from one room to another, finding an object in an environment, or reaching a desired destination.

A visually impaired person tends to rely on the other senses to carry out tasks that the impairment makes difficult. For example, such a person can tell from footsteps that someone is approaching. The missing visual information can be compensated for by information from hearing or touch. This compensation is often insufficient, however, and can cause the person to notice obstacles too late or to fail to reach an intended destination.

Navigation systems, moreover, are operated by the user and designed to respond to user requests. There are various options for operating and controlling a navigation system. One is a simple controller with a few buttons for changing modes and providing other inputs. This option suits simple operations such as changing modes, but entering the names of objects to search for is difficult.
For example, a user who starts the search mode to find a mobile phone or an air-conditioner remote control would find it difficult to enter this information with a few buttons. A button-based controller also implies mechanical parts that are bulkier and prone to hardware failure. A more suitable way of providing complex inputs to the system is therefore needed.

Many devices, guidance systems and navigation systems have been developed in the art to assist visually impaired people.

Document US2018189567A is a prior-art document describing a device, system and method of assistance for visually impaired users. The system in that document comprises a plurality of video cameras, usually head-mounted; computer processors and associated support devices, with algorithms configured for computer vision; and user-worn haptic bands containing a plurality of widely spaced haptic transducers. The haptic bands are worn so that the user's hands remain free for other tasks. The document states that the spatial position of each important object is conveyed to the user through outputs that vary across the haptic transducers. Generic objects, identified objects and potential obstacles are identified and reported. The system can also optionally provide audio information or tactile graphical display information about these objects. However, the document offers no option to search for an object or an address at the user's request and does not mention augmented auditory data.

The prior-art article "A depth-based head-mounted visual display to aid navigation in partially sighted individuals" proposes a system for the visually impaired consisting of a depth sensor and a 2D LED array used as a very high-contrast display.
The article states that a depth map of the user's field of view is extracted and that the user's vision is augmented through the display. This system can help only partially sighted people.

Another prior-art article, whose title ends "sensor with range expansion" (A. Aladren et al., 2016, IEEE Systems Journal, 10(3), pp. 922-932), describes a navigation system that provides audio commands using the visual and range information it acquires. The system delivers audio commands at different frequencies to the left ear, the right ear or both ears, according to the distances and positions of objects. Although blind people can use this system, it conveys only the relative direction of objects, left or right, rather than more precise cues.

Consequently, to overcome the disadvantages described above, there is a need for a system that protects the visually impaired person from obstacles in the physical environment, guides the person to the object or location the person wants to reach, and warns and guides the person by auditory means.

Detailed Description of the Invention

The invention relates to a navigation assistance system developed to help people with partial or total visual impairment find their way. The system proposes adding auditory information by means of augmented reality technology. This technology may be called auditory augmented reality. By eliminating the disadvantages of existing systems, the system is suitable for both totally and partially visually impaired people and aims to increase users' physical mobility.

An important aim of the invention is to obtain three-dimensional position information for the obstacles and objects around the visually impaired person and to enable users to perceive this information through hearing.
In the developed system, captured images are processed to obtain the geometric structure of the user's immediate surroundings, and a real-time auditory-spatial representation of that structure is generated. In other words, instead of augmenting the objects in the environment with virtual visual stimuli, the system aims to help visually impaired people by turning each object into a virtual sound source. Besides sight, humans can obtain spatial information by processing sound waves and can estimate the location of a sound source from the phase difference between the sounds reaching the two ears. One aim of the invention is to make use of this ability of the visually impaired individual to obtain spatial position information about surrounding objects and obstacles.

The system developed with the invention processes images using computer vision techniques to obtain the geometric structure of the user's immediate surroundings and renders the scene as a spatial representation in real time. Unlike many augmented reality applications, the proposed assistance system therefore augments reality with sound rather than with visual information. This auditory augmentation varies the phase and amplitude of the generated sound to represent the direction and distance of objects in the environment. Every object and obstacle around the user thus becomes a virtual sound source at the system's output and is delivered as a dedicated three-dimensional spatial sound, so that the user perceives the objects in the physical environment.

In its most general form, the invention comprises an assistive device containing at least one camera (1), at least one depth sensor (2), more than one microphone (3) and a multi-channel/stereo headphone output (4), together with a processing and control unit (6) connected to the device.
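
The description says that the phase and amplitude of the generated sound are varied to represent direction and distance, but it does not give formulas. The following is a minimal sketch of one common way to do this, not the patented implementation: the Woodworth approximation for the interaural time difference, inverse-distance attenuation and constant-power panning are all assumptions made here for illustration, as are the function name `spatial_cue` and the constants.

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius (illustrative value)
SPEED_OF_SOUND = 343.0   # m/s in air at about 20 degrees Celsius


def spatial_cue(azimuth_deg, distance_m, ref_distance_m=1.0):
    """Return (itd_seconds, left_gain, right_gain) for a virtual source.

    azimuth_deg: 0 is straight ahead, positive is to the user's right;
    values are limited here to the frontal range [-90, 90].
    A positive itd means the sound reaches the right ear first.
    """
    theta = math.radians(max(-90.0, min(90.0, azimuth_deg)))
    # Phase cue: Woodworth approximation of the interaural time difference.
    itd = (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))
    # Amplitude cue for distance: inverse-distance attenuation, capped at 1.
    attenuation = min(1.0, ref_distance_m / max(distance_m, 1e-6))
    # Amplitude cue for direction: constant-power panning.
    pan = (theta + math.pi / 2) / 2   # maps [-90, 90] degrees to [0, pi/2]
    left_gain = attenuation * math.cos(pan)
    right_gain = attenuation * math.sin(pan)
    return itd, left_gain, right_gain
```

With this sketch, a source straight ahead produces zero time difference and equal gains in both ears, while a source to the right is louder and earlier in the right ear, and a more distant source is quieter in both.
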

In more detail, the assistance system of the invention comprises an assistive device containing:

- at least one camera (1) that captures images for detecting the objects, people and walls in the user's surroundings and estimating their distance from the user,
- at least one microphone (3) for picking up the sounds in the user's surroundings,
- at least one multi-channel/stereo audio output for delivering to the user the three-dimensional auditory representations that guide and inform the user,

and, operating in connection with the assistive device:

- a user interface that detects the user's voice commands,
- a software module containing an object detection algorithm, a face recognition technique and a written-text detection technique,
- a processing and control unit (6) in which the acquired visual, positional and auditory information is processed by software and converted into real-time three-dimensional auditory representations.

In one embodiment of the invention, a depth sensor (2) is provided to obtain the structural features of the user's surroundings. In one embodiment, there is one camera (1) and one depth sensor (2). In one embodiment, there are two stereo-calibrated cameras (1); in this embodiment there is no depth sensor (2), and the distance of objects from the user is measured by the stereo-calibrated cameras (1). In another embodiment, a depth sensor (2) is provided in addition to the two stereo-calibrated cameras (1).

Stereo cameras are inspired by human eyes. A person sees a nearby object at different positions in the right and left eyes; as the object moves away, its image converges toward the same position in both eyes, and at a sufficient distance it is perceived at the same position in both.
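
The stereo principle described above, where the positional difference between the two views shrinks as an object moves away, corresponds to the standard pinhole relation Z = f * B / d for rectified cameras. The sketch below assumes rectified, stereo-calibrated cameras; the function name and the example focal length and baseline are hypothetical and do not come from the patent.

```python
def depth_from_disparity(x_left_px, x_right_px, focal_length_px, baseline_m):
    """Estimate the distance (in meters) of a point seen by two rectified cameras.

    x_left_px, x_right_px: horizontal pixel position of the same object in
    the left and right images.
    focal_length_px: focal length in pixels (assumed equal for both cameras).
    baseline_m: distance between the two camera centers in meters.
    """
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        # No measurable shift: the object is too far away to range.
        return float("inf")
    return focal_length_px * baseline_m / disparity
```

For example, with an assumed focal length of 700 pixels and a 6 cm baseline, `depth_from_disparity(420, 400, 700, 0.06)` gives about 2.1 m, and the result grows as the disparity shrinks.
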
In the system of the invention, objects are likewise detected separately in each of the two cameras (1), and the distance of an object from the cameras (1) is calculated from the difference between the positions at which it is detected. Color or grayscale stereo cameras (1) may be used for this purpose. Accordingly, in one embodiment there is an RGB (red-green-blue) and/or grayscale stereo camera (1). In another embodiment, there are two stereo-calibrated RGB/grayscale cameras (1).

In one embodiment, there are two microphones (3) or one stereo-calibrated microphone (3).

The device of the invention includes a multi-channel/stereo headphone output (4), which means that the user must have binaural (stereophonic) hearing in order to use the proposed system. Because the system is fundamentally based on conveying directional information through human stereophonic hearing, it is clearly difficult to use for people without binaural hearing.

In one embodiment, a mixer element combines the audio generated by the software with the audio coming from the surroundings and delivers the result to the user through the headphone output (4). This mixer element operates with a digital signal processor (DSP) and software running on it.

In one embodiment, the processing and control unit (6) includes a software module containing an object detection algorithm, a face recognition technique and a written-text detection technique, and a user interface that detects the user's voice commands for the speech recognition feature. In one embodiment, the processing and control unit (6) has an information screen (7), at least one button (8) that the user presses to give voice commands and that also suppresses ambient sound, and a command microphone (9) for the input of the user's voice commands.
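
The mixer element described above combines software-generated audio with ambient audio, but the description does not specify how. As a rough illustration only, the sketch below mixes two stereo streams sample by sample with fixed gains and clips the result; the function name `mix_stereo`, the gain values and the sample format are assumptions, and a real DSP implementation would work on buffered audio frames.

```python
def mix_stereo(synth, ambient, synth_gain=0.7, ambient_gain=0.3):
    """Mix two equal-length lists of (left, right) samples in [-1.0, 1.0].

    synth: samples of the generated spatial audio.
    ambient: samples captured by the environment microphones.
    The gains are illustrative defaults, not values from the patent.
    """
    if len(synth) != len(ambient):
        raise ValueError("streams must have the same number of samples")

    def clip(x):
        # Keep the mixed sample inside the valid output range.
        return max(-1.0, min(1.0, x))

    return [
        (clip(synth_gain * sl + ambient_gain * al),
         clip(synth_gain * sr + ambient_gain * ar))
        for (sl, sr), (al, ar) in zip(synth, ambient)
    ]
```

Keeping the ambient stream audible at a lower gain is one possible design choice: it preserves real environmental sounds, such as traffic, alongside the virtual sound sources.
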
In one embodiment, the command microphone (9) has a noise-cancelling feature.

In one embodiment, the processing and control unit (6) is an external unit operating in connection with the device of the invention. In one embodiment, this external unit is a smartphone. In this embodiment, the device includes Bluetooth or wireless internet connectivity for pairing with the smartphone, and the smartphone's speech recognition infrastructure is used. When the processing and control unit (6) is a smartphone, noise cancellation is performed by the smartphone; in addition, the smartphone's screen serves as the information screen (7) and its keys or touch screen serve as the button (8).

The method of operation of the system is also within the scope of protection of the invention. The method is characterized in that it comprises the steps of:

- processing, by means of algorithms, the visual information about the physical environment obtained from the cameras (1) and/or the depth sensor (2), and obtaining the coarse geometrical structure (CGS) information,
- transmitting the CGS information to the processing and control unit (6) so that the auditory augmentation can be generated,
- transmitting the generated augmented auditory information to the mixer,
- mixing it in the mixer with the ambient audio coming from the microphones (3),
- delivering the resulting auditory-spatial representations to the visually impaired person through the headphone outputs (4).

In one embodiment, the system has at least one operating mode. The operating modes are navigation mode, object search mode and address search mode.

In one embodiment, the invention has a navigation mode. In navigation mode, a three-dimensional auditory representation of the current surroundings is generated so that the user can recognize obstacles or walls encountered while moving from one point to another.
In addition to the depth sensor (2), stereo cameras (1) are used to build the three-dimensional structure. In this mode, the system monitors the user's physical environment within a predetermined angle and distance range. This avoids monitoring the entire environment and overloading the user with auditory information. In other words, adding a sound effect for an obstacle 10 meters away does not help navigation and may instead create a poor experience for the user. In one embodiment, the device scans the physical environment up to a distance of at most 5 meters. This distance can be adjusted adaptively according to the user's speed.

In one embodiment, navigation in navigation mode is carried out at two levels. The first is the global navigation stage, in which the system uses a web service that provides maps to give the user route and direction information. The second is the local navigation stage, which lets the user notice surrounding objects and move without colliding with them while following these directions. In this way, the user can travel from point A to point B as comfortably as a person without visual impairment.

In navigation mode, the system provides an auditory spatial representation of the surroundings to make navigation easier. Because the main goal during navigation is to get from point A to point B, the system does not need to recognize the objects in the environment; it needs only to estimate the three-dimensional structure of the environment. The estimated three-dimensional structure is used to generate the three-dimensional audio representation of the surroundings. The user can thus perceive walls and other obstacles through directional sound and avoid them. Stereo cameras (1) are used together with the depth sensor (2) to obtain the three-dimensional structure.

In one embodiment, the invention includes an object search mode.
In object search mode, the system detects an object in the surroundings using object detection algorithms and guides the user toward it. In this mode, stereo cameras (1) that detect objects and their positions are used in addition to the depth sensor (2). Once the object has been detected in the stereo image pair by object recognition algorithms, the system accurately calculates its depth and direction. The system then performs the auditory augmentation of the object with a sound effect carrying spatial audio information; in other words, it delivers a sound effect processed so as to convey spatial position. This makes it possible to guide the user toward the object. The object searched for in this mode may also be a person, in which case face recognition techniques are used.

In one embodiment, the invention includes an address search mode. In address search mode, the system detects and recognizes written text on signs or door numbers.

The user's physical environment contains many objects. A user who wants to reach a particular object uses object search mode. All objects within the system's range are scanned, but a distinct sound effect is generated for the object the user wants to reach. If other objects lie as obstacles along the direction the user must take to reach the object, a sound effect different from that of the searched object is generated for them.

In essence, the system turns the objects around the user into virtual sound sources so that the user notices them. In search mode, objects are converted to sound in the same way, while a different sound, generated according to position information, is produced for the searched object so that the user can reach it.
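
The range limiting and the distinct target sound described above can be sketched as a small filtering step. The 5 m upper bound comes from the description; everything else here is an assumption for illustration, including the names `Detection`, `scan_range` and `assign_cues`, the 3-second look-ahead rule for adapting the range to walking speed, the 1.5 m minimum and the 60-degree half field of view. For simplicity, the sketch applies the same range filter to the searched object as to obstacles.

```python
from dataclasses import dataclass

MAX_RANGE_M = 5.0  # upper bound on the scanned distance stated in the description


@dataclass
class Detection:
    label: str          # name from a (hypothetical) object detector
    azimuth_deg: float  # 0 is straight ahead, positive is to the right
    distance_m: float


def scan_range(speed_m_s, lookahead_s=3.0, min_range_m=1.5):
    """Hypothetical rule: cover the distance walked in lookahead_s seconds."""
    return max(min_range_m, min(MAX_RANGE_M, speed_m_s * lookahead_s))


def assign_cues(detections, speed_m_s, target_label=None, half_fov_deg=60.0):
    """Keep detections inside the monitored zone and tag each with a cue kind.

    Returns a list of (kind, detection) pairs, where kind is "target" for
    the searched object and "obstacle" for everything else.
    """
    limit = scan_range(speed_m_s)
    cues = []
    for d in detections:
        if d.distance_m > limit or abs(d.azimuth_deg) > half_fov_deg:
            continue  # outside the monitored distance or angle: stay silent
        kind = "target" if d.label == target_label else "obstacle"
        cues.append((kind, d))
    return cues
```

Each returned pair could then be rendered with a different sound effect per kind, so that the searched object stands out from the obstacles along the way.
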
Because the sound scheme is updated in real time as the user moves and changes direction, the user gets the sense that the generated sounds really come from the objects.

In one embodiment, the device has a speech recognition feature for user control and operation. This feature lets the user control the system easily by speaking. The user tells the assistance system by voice which object or address to find. Voice commands are a very convenient option for visually impaired people. To keep ambient sounds from interfering with the user's voice, the user presses a button (8) on the device before speaking to the system. When this button (8) is pressed, ambient sounds are suppressed so that only the user's voice is picked up by the microphones (3). In this embodiment, microphones (3) with a noise-cancelling feature are used.

In one embodiment, the assistive device is worn on the user's head. In one embodiment, the assistive device takes the form of glasses (5), as shown in Figure 1. The glasses-type (5) assistive device shown in this figure has two cameras (1), one above each lens of the glasses (5); one depth sensor (2) above the bridge of the glasses (5); two microphones (3), one on each temple; and two headphone outputs (4), one on each temple. The figure shows the assistive device connected to an external processing and control unit (6), which carries an information screen (7), buttons (8) and a command microphone (9).
Bulusun bir yapilandirmasinda, yardim cihazi, kullanicinin alin kismina yerlesecek bir yapiya sahip olmaktadir. Bu yapilandirmaya iliskin sekil, 2 numarali sekildir. Bu sekilde, iki kamera (1) ya da stereo kalibre edilmis bir kamera (1), bir derinlik algilayici (2), iki mikrofon (3) ya da stereo kalibre edilmis bir ndkrofon (3), bir adet çok kanalli/stereo kulaklik çikisinin (4) bulundugu bir yardim cihazinin bir bas bandi (lO) üzerine yerlestirildigi görülmektedir. Bu sekilde ayrica, yardim cihazinin bagli oldugu bir islem ve kontrol ünitesi (6) yer almaktadir. Bulus, özetle, kismi ya da tam görme engeline sahip kisiler için isitsel artirilmis gerçeklik teknolojisine dayanan bir navigasyon yardim cihazi olup, kullanicinin bulundugu fiziksel ortamin yapisal özelliklerinin elde edilmesi için derinlik haritasi olusturan bir derinlik algilayici (2), kullanicinin bulundugu çevrede yer alan objelerin/ insanlarin/ duvarlarin tespit edilmesi için görüntü alan en az bir kamera (1), kullanicinin bulundugu çevredeki seslerin ve/veya kullanicinin sesli komutlarinin alinmasi için en az bir mikrofon (3), kullaniciyi yönlendiren/bilgilendiren üç boyutlu isitsel ifadelerin kullaniciya iletilmesi için en az bir adet çok kanalli/stereo kulaklik çikisi (4) içeren bir yardim cihazi ve yardim cihazinin. baglantili olarak çalistigi; kullanicinin sesli komutlarini algilayan, konusma tanima özelligine sahip bir kullanici arayüzü ve obje algilama algoritmasi, yüz tanima teknigi ve yazili metin tespit etme/algilama teknigi içeren bir tespit ünitesi içeren, elde edilen görsel, konumsal ve isitsel bilgilerin islendigi ve gerçek zamanli üç boyutlu isitsel ifadelere dönüstürüldügü bir islem ve kontrol ünitesi (6) içermesi ile karakterize edilmektedir. Bu cihaza iliskin bir çalisma yöntemi de bulusun koruma kapsamindadir. Basvuruya konu bulus sayesinde, tam ya da kismi görme engeli olan kisilerin. bulunduklari fiziksel ortamda. görme engeli olmayan. diger insanlar` gibi rahatça. 
hareket edebilmeleri, istedikleri bir konuma ulasabilmeleri, ortamda bulunan cisimleri algilayabilmeleri ve bu çisimlere ulasabilmeleri adina gelistirilmis, kisinin yakin çevresindeki geometrik yapinin elde edilerek görüntülerin islenmesi sonucu gerçek zamanli isitsel-konumsal ifadelerin olusturulmasini ve bu ifadelerin kullaniçiya iletilmesini saglayan ve kullanici tarafindan kontrol edilebilen bir yardim cihazini haiz bir sistem sunulmaktadir. TR TR TR DESCRIPTION A NAVIGATION ASSISTANCE SYSTEM WITH AUDITORY AUGMENTED REALITY FEATURE FOR THE VISUALLY IMPAIRED. Subject of the Invention. The invention is about a navigation aid system developed to help people with partial or total visual impairment find their way. State of the Art Visual impairment is generally considered one of the sensory disabilities that most affects a person's daily life. People who are blind or have low vision (partially vision impaired) face many difficulties in daily life, especially regarding their mobility and the ability to perform navigational tasks. There are many negative situations in daily life, such as not being able to perceive approaching objects while walking on the road, especially in traffic, and not being able to see or recognize people they encounter. For example, a visually impaired person has great difficulty in doing tasks such as moving from one room to another, finding an object in an environment, or reaching a desired goal. A visually impaired person tends to perform tasks that he cannot perform due to his visual impairment by relying on his other senses. For example, it can recognize that it is coming from the footsteps of an approaching person. Compensation for the lack of visual information can be made with information from the hearing or touch system. This situation is not very sufficient and causes the visually impaired person to be late in noticing the obstacles in front of him or to not be able to reach a goal he wants to achieve. 
Apart from this, navigation systems are systems managed by the user and developed to respond to user requests. There are various options for managing and controlling a navigation system. One is a simple controller with a few buttons for changing modes and other inputs. This option is suitable for simple operations such as changing modes, but it is difficult to enter the object names to be searched. For example, if a user runs the search mode to find his mobile phone or air conditioner remote control, it will be difficult to enter this information with several buttons. Additionally, a button-based controller design means mechanical parts that are bulkier and prone to hardware failures. Therefore, a more appropriate way needs to be found to provide complex inputs to the system. In technology, there are many devices, guidance systems and navigation systems developed to assist visually impaired people. Document numbered USZOl8l89567A is a state-of-the-art document covering a device, system and assistance method for visually impaired users. The system subject to this document includes a plurality of video cameras and haptic patches worn by a user, usually head-mounted, containing computer processors and associated support devices and algorithms configured for computer vision, and a plurality of far-range haptic transducers. These tactile patches are worn so that the user's hands are free to do other tasks. It is stated that the spatial positions of each important object are given to the user with variable outputs according to tactile transducers. General objects, identified objects and potential obstacle objects are identified and reported. The system can also optionally provide audio information or tactile graphical display information about these objects. However, in this document, there is no option to search for an object or address according to the user's wishes, and augmented audio data is not mentioned. 
In the previous article "A depth-based head-mounted visual display to aid navigation in partially sighted individuals", a system for the visually impaired is proposed, consisting of a depth sensor and a 2D LED array as a very high contrast display. It has been stated that the depth map of the user's perspective is created and the vision is increased through the image. This system can only help partially blind people. In addition, an article titled "sensor with range expansion" (A Aladren et al. 2016. IEEE Systems Journal, lO;3, pp.922-932), which is included in the prior art, describes a navigation system that provides voice commands using the visual and range information obtained. The system provides voice commands at different frequencies to the left, right, or both ears based on the distance and position of objects, respectively. Although this system is used by blind people, it only provides relative orientations of objects to the left or right, rather than providing more accurate cues. As a result, in order to eliminate the disadvantages mentioned above, there is a need to develop a system that will protect the visually impaired person from the obstacles in his physical environment, direct the person to the object/position he wants to reach, and warn/guide the person audibly. Detailed Description of the Invention The invention can be used partially or fully. It is about a navigation assistance system developed to help visually impaired people find their way. In this system, it is recommended to add audio information using augmented reality technology. This technology can be called auditory augmented reality technology. This system eliminates the disadvantages of current state-of-the-art systems and is suitable for both completely visually impaired and partially visually impaired people, and aims to increase the physical mobility of users. 
An important purpose of the invention is to obtain three-dimensional location information of obstacles and objects around the visually impaired person and to ensure that this information is perceived by users in an auditory manner. In the developed system, the images taken are processed to obtain the geometric structure in the user's immediate environment and a real-time audio-spatial expression is created. In other words, instead of supporting the objects in the environment with virtual visual stimuli, it is aimed to help visually impaired people by turning each object into a virtual sound source. In addition to the sense of sight, humans can obtain spatial information by processing sound waves and estimate the source location of a sound using the phase difference of the acquired sonance transmitted to both ears. One purpose of the invention is to benefit from the visually impaired individual's ability to provide spatial location information of objects and obstacles in his or her environment. The system developed with the invention is used to obtain the geometric structure of the user's immediate environment through visual techniques. It processes images with a computer and produces the scene as a spatial representation in real time. In this way, unlike many augmented reality applications, the proposed assistance system reinforces reality with sound, a method other than visual information. This audio enhancement changes the phase and amplitude of the sound produced to represent the direction and distance of objects in the environment. Therefore, all objects and obstacles around the user become a virtual sound source at the output of the system and are provided in the form of a special three-dimensional spatial sound. and thus the user perceives the objects in his physical environment. 
The assistive device subject to the invention, in its most general form, is an assistive device containing at least one camera (1), at least one depth sensor (2), multiple microphones (3), multi-channel/stereo headphone output (4) and a device connected to this device. It contains a working processing and control unit (6). The help system subject to the invention is in more detail; - at least one that takes images to detect objects/people/walls in the user's environment and estimate their distance to the user, -at least one microphone (3) to pick up sounds in the user's environment, -3-dimensional audio expressions that will guide/inform the user. An auxiliary device containing at least one multi-channel/stereo audio output for transmission and the auxiliary device working in conjunction; - a user interface that detects the user's voice commands and - a software module containing an object detection algorithm, face recognition technique and written text detection/detection technique - where the visual, spatial and audio information obtained is processed through a software and converted into real-time three-dimensional auditory expressions. It contains a processing and control unit (6). In one embodiment of the invention, there is a depth sensor (2) to obtain the structural features of the user's environment. In one embodiment of the invention, there is a camera (1) and a depth sensor (2). In one embodiment of the invention, there are two cameras (1) that are stereo calibrated. In this configuration, there is no depth sensor (2) and the distance of the objects to the user is measured by stereo calibrated cameras Hj. In another embodiment of the invention, there is a depth sensor (2) in addition to two stereo calibrated cameras HJ. Stereo cameras are systems inspired by people's eyes. 
A person sees an object close to him or her at different positions in the right and left eyes, but as the object moves away, the image approaches the same position in both eyes, and at a sufficiently distant point it is perceived at the same position in both eyes. In the system subject to the invention, objects are detected separately in the two cameras (1), and the distance of the object from the cameras (1) is calculated by examining the difference between the positions at which it is detected. Color and gray stereo cameras (1) can be used for this purpose. Accordingly, in an embodiment of the invention, there is an RGB (red-green-blue) and/or gray stereo camera (1). In another embodiment of the invention, there are two RGB/gray stereo calibrated cameras (1).

In one embodiment of the invention, there are two microphones (3) or a stereo calibrated microphone (3).

The device contains a multi-channel/stereo headphone output (4), and this requires the user to have binaural (stereophonic) hearing ability in order to use the proposed system. The reason is that the system provides directional information by relying on human stereophonic hearing, so it is difficult to use for people who do not have binaural hearing.

In one embodiment of the invention, there is a mixing element that combines the sound information produced by the software with the sound information coming from outside and transmits it to the user via the headphone output (4). This mixing element works with a sound processor (DSP, digital signal processor) and software running on it.

In an embodiment of the invention, the processing and control unit (6) includes a software module containing an object detection algorithm, a face recognition technique and a written text detection technique, and a user interface that detects the user's voice commands for the speech recognition feature.
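The two-camera distance calculation described above can be illustrated with the standard pinhole stereo relation, depth = focal length × baseline / disparity. The sketch below assumes rectified (stereo calibrated) cameras and a focal length expressed in pixels; the function name and parameters are hypothetical and are not taken from the source.

```python
def depth_from_disparity(x_left_px, x_right_px, focal_length_px, baseline_m):
    """Estimate the distance (in meters) of an object seen by two rectified cameras.

    x_left_px / x_right_px: horizontal pixel position of the same object
    in the left and right images.
    focal_length_px: focal length in pixels (assumed equal for both cameras).
    baseline_m: distance between the two camera centers in meters.
    Returns None when the disparity is not positive (object too far away
    or the two detections do not match).
    """
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        return None
    return focal_length_px * baseline_m / disparity
```

As in human vision, the disparity shrinks as the object moves away, so the estimated depth grows; at zero disparity the object is effectively at infinity and no distance is returned.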
In an embodiment of the invention, the processing and control unit (6) has an information screen (7), at least one button (8) used by the user to give voice commands and at the same time to eliminate ambient sound, and a command microphone (9) for input of the user's voice commands. In one embodiment of the invention, the command microphone (9) has a noise canceling feature.

In an embodiment of the invention, the processing and control unit (6) is an external unit working in connection with the device subject to the invention. This external unit is, in one embodiment of the invention, a smartphone. In this configuration, the device subject to the invention includes Bluetooth connection technology or a wireless internet connection feature to pair with the smartphone, and the speech recognition infrastructure of the smartphone is used. When the processing and control unit (6) is a smartphone, noise canceling is performed via the smartphone. Additionally, in this case, the screen of the smartphone is used as the information screen (7), and the buttons or touch screen of the smartphone are used as the button (8).

The operating method of the system subject to the invention is also within the scope of protection of the invention. This method is characterized by the stages of:
- processing the visual information of the physical environment obtained from the cameras (1) and/or depth sensor (2) through algorithms and obtaining the coarse geometric structure (CGS, Coarse Geometrical Structure) information,
- transmitting the CGS information to the processing and control unit (6) in order to create the auditory augmentation,
- transmitting the created augmented audio information to the mixer,
- mixing it in the mixer with the ambient sound information coming from the microphones (3), and
- transmitting the resulting audio-spatial expressions to the visually impaired person through the headphone outputs (4).
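The mixing stage of the method above can be sketched as follows. This is only an illustration under stated assumptions: audio is represented as lists of (left, right) samples in the range [-1, 1], and the `ambient_gain` parameter and the clamping step are hypothetical choices made here, not details given in the source.

```python
def mix(generated, ambient, ambient_gain=0.5):
    """Mix generated spatial audio with ambient microphone audio.

    generated, ambient: equal-length lists of (left, right) samples in [-1, 1].
    ambient_gain: assumed scaling applied to the ambient sound before mixing.
    Returns a list of (left, right) samples clamped to [-1, 1].
    """
    if len(generated) != len(ambient):
        raise ValueError("generated and ambient must have the same length")

    def clamp(sample):
        # Keep the mixed sample inside the valid output range.
        return max(-1.0, min(1.0, sample))

    mixed = []
    for (gen_l, gen_r), (amb_l, amb_r) in zip(generated, ambient):
        mixed.append((clamp(gen_l + ambient_gain * amb_l),
                      clamp(gen_r + ambient_gain * amb_r)))
    return mixed
```

Keeping part of the ambient sound in the output lets the user continue to hear the real environment alongside the virtual sound sources.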
In one embodiment, the system subject to the invention includes at least one operating mode. The operating modes are navigation mode, object search mode and address search mode.

In one embodiment of the invention, there is a navigation mode. In navigation mode, a three-dimensional audio expression of the current environment is created so that the user can recognize obstacles or walls that he or she may encounter while moving from one point to another. To create the three-dimensional structure, stereo cameras (1) are used in addition to the depth sensor (2). In this mode, the physical environment where the user is located is monitored within a predetermined angle and distance range. This is done so that the system does not monitor the entire physical environment and burden the user with excessive amounts of audio information. In other words, adding a sound effect for an obstacle 10 meters away from the user does not help with navigation, and may instead create a bad experience for the user. In one embodiment of the invention, the distance that the device scans within the physical environment is at most 5 meters. This distance can be adjusted adaptively according to the user's speed.

In one embodiment of the invention, navigation in navigation mode is performed on two levels. The first is the general navigation phase, in which the system conveys route and direction information to the user by using a web service that provides a map. The second is the local navigation phase, which enables the user to recognize the objects around him or her while following these directions and to move forward without hitting them. In this way, the user can go from point A to point B as easily as a person without visual impairment. In navigation mode, the system provides an audio-spatial representation of the surrounding environment to facilitate user navigation.
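The restricted monitoring range described above for navigation mode could be sketched as a simple filter. The 5 meter upper limit comes from the embodiment described; the speed-based rule (`lookahead_s`, `min_range_m`), the angular limit `half_angle_deg` and the function names are assumptions introduced here, since the source does not state how the range is adapted.

```python
MAX_RANGE_M = 5.0  # upper limit stated for one embodiment


def scan_range(speed_m_s, lookahead_s=3.0, min_range_m=1.5):
    """Assumed rule: scan as far as the user travels in lookahead_s seconds,
    but never less than min_range_m and never more than MAX_RANGE_M."""
    return min(MAX_RANGE_M, max(min_range_m, speed_m_s * lookahead_s))


def audible_obstacles(obstacles, speed_m_s, half_angle_deg=60.0):
    """Keep only obstacles inside the monitored angle and distance range.

    obstacles: list of (azimuth_deg, distance_m) pairs, azimuth 0 = straight ahead.
    """
    limit = scan_range(speed_m_s)
    return [(az, dist) for az, dist in obstacles
            if dist <= limit and abs(az) <= half_angle_deg]
```

Under these assumptions, a slow-moving user hears only nearby obstacles, and an obstacle 10 meters away is never turned into a sound.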
Since the main goal during navigation is to get from point A to point B, the system does not need to recognize objects in the environment; it only needs to estimate their three-dimensional structure. The estimated three-dimensional structure is used to create a three-dimensional sound representation of the environment. The user can therefore detect walls or other obstacles through directional sound and avoid them. A stereo camera (1) is used together with the depth sensor (2) to obtain the three-dimensional structure.

In one embodiment, the invention includes an object search mode. In object search mode, the system detects an object in the environment with object detection algorithms and directs the user towards that object. In this mode, stereo cameras (1) are used in addition to the depth sensor (2) to detect objects and their positions. When the object is detected in the stereo image pair by object recognition algorithms, the system calculates the depth and direction of the object. The system then performs auditory augmentation of the object with a sound effect that carries spatial audio information. In other words, a sound effect processed to convey spatial location is transmitted. In this way, the user can be directed towards the object in question. The object searched for in object search mode can also be a person. In this case, facial recognition techniques are used.

In one embodiment, the invention includes an address search mode. In address search mode, written texts on signs or door numbers are detected and identified by the system.

There are many objects in the physical environment where the user is located. If the user specifically wants to reach an object, he or she uses the object search mode. All objects that enter the field of view of the system are scanned, but a different sound effect is created for the object that the user wants to reach.
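The search-mode behaviour described above, where every scanned object becomes a sound but the searched object gets a distinct one, could be sketched as below. The dictionary layout, the effect names `"target_tone"` and `"obstacle_tone"`, and the function name are hypothetical; the source does not specify how sound effects are chosen or represented.

```python
def assign_sound_effects(detections, target_label):
    """Attach a sound effect to each detected object.

    detections: list of dicts with keys 'label', 'azimuth_deg', 'distance_m'.
    target_label: label of the object the user asked for.
    The searched object gets a distinct effect; all other objects share one.
    """
    cues = []
    for det in detections:
        effect = "target_tone" if det["label"] == target_label else "obstacle_tone"
        cues.append({
            "effect": effect,
            "azimuth_deg": det["azimuth_deg"],
            "distance_m": det["distance_m"],
        })
    return cues
```

Each resulting cue keeps the direction and distance of its object, so it can then be rendered as a spatial sound at that location.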
If there are other objects that may obstruct the user's path to the object he or she is looking for, the sound effect created for these objects is different from the sound effect created for the searched object. In the system subject to the invention, objects around the user are turned into virtual sound sources so that the user can notice them. In search mode, objects are converted into sounds in the same way, while a different sound is created for the searched object according to its location, allowing the user to reach it. Since the sound scheme is updated in real time as the user moves and changes direction, the updated sound scheme creates the feeling that the sounds are really coming from the objects.

In one embodiment of the invention, the device has a speech recognition feature for control and management by the user. This feature allows the user to control the system easily by speaking. The user tells the assistance system by voice which object or address he or she wants to find. Giving voice commands is a very useful option for visually impaired people. To prevent the user's voice from mixing with ambient sounds, the user presses a button (8) on the device before speaking to the system. When this button (8) is pressed, ambient sounds are eliminated and only the user's voice is captured through the microphones (3). In this embodiment of the invention, microphones (3) with a noise canceling feature are used.

In one embodiment, the assistive device subject to the invention is worn on the user's head. In an embodiment of the invention, the assistive device is a pair of glasses (5). This configuration is shown in Figure 1.
The assistive device in the form of glasses (5) shown in this figure contains two cameras (1), one on each side of the glasses (5), a depth sensor (2) located on the part of the glasses (5) that rests on the nose, two microphones (3), one on each stem of the glasses (5), and two headphone outputs (4), one on each stem of the glasses (5). The figure also shows that the assistive device is connected to an external processing and control unit (6). On this processing and control unit (6), there is an information screen (7), buttons (8) and a command microphone (9).

In one embodiment of the invention, the assistive device has a structure that can be placed on the user's forehead. This configuration is shown in Figure 2. In this figure, an assistive device with two cameras (1) or a stereo calibrated camera (1), a depth sensor (2), two microphones (3) or a stereo calibrated microphone (3), and a multi-channel/stereo headphone output (4) is placed on a head band (10). Here, too, there is a processing and control unit (6) to which the assistive device is connected.

In summary, the invention is a navigation aid system based on auditory augmented reality technology for people with partial or complete visual impairment. It is characterized by containing an assistive device with:
- a depth sensor (2) that creates a depth map to obtain the structural features of the physical environment where the user is located,
- at least one camera (1) that takes images to detect objects/people/walls,
- at least one microphone (3) to receive sounds from the user's environment and/or the user's voice commands, and
- at least one multi-channel/stereo headphone output (4) to transmit to the user three-dimensional auditory expressions that guide/inform the user,

and a processing and control unit (6), working in conjunction with the assistive device, which contains a user interface with a speech recognition feature that detects the user's voice commands and a detection unit that includes an object detection algorithm, a face recognition technique and a written text detection technique, and in which the obtained visual, spatial and auditory information is processed and converted into real-time three-dimensional auditory expressions. An operating method related to this device is also within the scope of protection of the invention.

The invention subject to the application provides a user-controllable system with an assistive device developed to enable people with complete or partial visual impairment to move in their physical environment as comfortably as people without visual impairment, to reach a desired position, to perceive objects in the environment and to reach these objects. The system creates real-time audio-positional expressions by processing images to obtain the geometric structure of the person's immediate environment, and transmits these expressions to the user.

Claims

1. A navigation assistance system for people with partial or total visual impairment, characterized by containing an assistive device with:
- at least one camera (1) that takes images to detect objects/people/walls in the user's environment,
- at least one microphone (3) to capture sounds in the user's environment,
- at least one multi-channel/stereo headphone output (4) to transmit to the user three-dimensional audio expressions that guide/inform the user,
and a processing and control unit (6), working in conjunction with the assistive device, which contains:
- a user interface with a speech recognition feature that detects the user's voice commands, and
- a detection unit with an object detection algorithm, a face recognition technique and a written text detection technique,
and in which the visual, spatial and audio information obtained is processed and converted into real-time three-dimensional audio-positional expressions.

2. A system according to claim 1, characterized in that the processing and control unit (6) contains at least one button (8) used for entering user voice commands and also for eliminating ambient sound.

3. A system according to claim 1, characterized in that the processing and control unit (6) contains a command microphone (9) with a noise canceling feature for entering user voice commands.

4. A system according to claim 1, characterized in that the processing and control unit (6) contains an information screen (7).

5. A system according to claim 1, characterized in that the processing and control unit (6) is a smartphone.

6. A system according to claim 5, characterized in that the assistive device has a Bluetooth or wireless internet feature for connecting to the smartphone.

7. A system according to claim 1, characterized in that the assistive device contains two microphones (3) or a stereo calibrated microphone (3).

8. A system according to claim 1, characterized in that the assistive device contains at least one stereo calibrated camera (1).

9. A system according to claim 1, characterized in that the assistive device contains a depth sensor (2) that creates a depth map to obtain the structural features of the physical environment where the user is located.

10. A system according to claim 9, characterized in that the assistive device contains two stereo calibrated cameras (1) and a depth sensor (2).

11. A system according to claim 1, characterized in that the camera (1) is an RGB (red-green-blue) and/or gray stereo camera (1).

12. A system according to claim 1, characterized in that the assistive device contains two headphone outputs (4), right and left.

13. A system according to claim 1, characterized in that the assistive device is a pair of glasses (5).

14. A system according to claim 13, characterized by containing two cameras (1), one on each side of the glasses (5), a depth sensor (2) located midway between the two cameras (1), two microphones (3), one on each stem of the glasses (5), and a multi-channel/stereo headphone output (4) on the stems of the glasses (5).

15. A system according to claim 1, characterized in that the assistive device has a structure suitable for placement on the person's forehead.

16. A system according to claim 15, characterized in that the assistive device is located on a head band (10).

17. An operating method of a system according to any of the above claims, characterized by the stages of:
- processing the visual information of the physical environment obtained from the cameras (1) and/or depth sensor (2) through algorithms and obtaining coarse geometric structure (CGS, Coarse Geometrical Structure) information,
- transmitting the CGS information to the processing and control unit (6) to create the auditory augmentation,
- transmitting the created augmented audio information to the mixer,
- mixing it in the mixer with the ambient sound information coming from the microphones (3),
- transmitting the real-time audio-spatial expressions created to the visually impaired person through the headphone outputs (4).

18. A method according to claim 17, characterized in that, if the device contains two cameras (1), the objects are detected separately in both cameras (1) and the distance of the object from the cameras (1) is calculated by examining the difference between the positions at which it is detected.

19. A method according to claim 17, characterized by containing at least one of the navigation mode, object search mode and address search mode.

20. A method according to claim 19, characterized in that, in navigation mode, the current physical environment is monitored through depth sensors (2) and stereo cameras (1), and a three-dimensional audio expression is created so that the user can detect obstacles or walls that he or she may encounter while moving from one point to another.

21. A method according to claim 20, characterized in that the physical environment where the user is located is monitored within a predetermined angle and distance range.

22. A method according to claim 21, characterized in that the maximum distance is 5 meters.

23. A method according to claim 20, characterized by including two stages: a general navigation stage, in which the system conveys route and direction information to the user by using a web service that provides a map, and a local navigation stage, which enables the user to recognize the objects around him or her while following these directions and to move forward without hitting them.

24. A method according to claim 19, characterized in that, in object search mode, the object is detected in the stereo camera (1) image by means of object recognition algorithms, the depth and direction of the object are calculated by means of the depth sensor (2), an auditory augmentation is created for the object with a sound effect, and this auditory augmentation is transmitted to the user.

25. A method according to claim 24, characterized in that, if the searched object is a person, the person is recognized by a face recognition technique.

26. A method according to claim 24, characterized in that, if there are other objects that obstruct the user on the way to the object he or she is looking for, the sound effect created for these objects is different from the sound effect created for the object the person wants to reach.

27. A method according to claim 19, characterized in that, in address search mode, written texts on signs or door numbers are detected and identified, and an audible notification is given to direct the user to that location.