EP4000579A1 - Camera-based assistance system with artificial intelligence for the blind - Google Patents
Camera-based assistance system with artificial intelligence for the blind
- Publication number
- EP4000579A1 (application EP21000319.0A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- assistance system
- walking direction
- blind person
- blind
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61H—PHYSICAL THERAPY APPARATUS, e.g. DEVICES FOR LOCATING OR STIMULATING REFLEX POINTS IN THE BODY; ARTIFICIAL RESPIRATION; MASSAGE; BATHING DEVICES FOR SPECIAL THERAPEUTIC OR HYGIENIC PURPOSES OR SPECIFIC PARTS OF THE BODY
- A61H3/00—Appliances for aiding patients or disabled persons to walk about
- A61H3/06—Walking aids for blind persons
- A61H3/061—Walking aids for blind persons with electronic detecting or guiding means
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61H—PHYSICAL THERAPY APPARATUS, e.g. DEVICES FOR LOCATING OR STIMULATING REFLEX POINTS IN THE BODY; ARTIFICIAL RESPIRATION; MASSAGE; BATHING DEVICES FOR SPECIAL THERAPEUTIC OR HYGIENIC PURPOSES OR SPECIFIC PARTS OF THE BODY
- A61H3/00—Appliances for aiding patients or disabled persons to walk about
- A61H3/06—Walking aids for blind persons
- A61H3/068—Sticks for blind persons
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61H—PHYSICAL THERAPY APPARATUS, e.g. DEVICES FOR LOCATING OR STIMULATING REFLEX POINTS IN THE BODY; ARTIFICIAL RESPIRATION; MASSAGE; BATHING DEVICES FOR SPECIAL THERAPEUTIC OR HYGIENIC PURPOSES OR SPECIFIC PARTS OF THE BODY
- A61H3/00—Appliances for aiding patients or disabled persons to walk about
- A61H3/06—Walking aids for blind persons
- A61H3/061—Walking aids for blind persons with electronic detecting or guiding means
- A61H2003/063—Walking aids for blind persons with electronic detecting or guiding means with tactile perception
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61H—PHYSICAL THERAPY APPARATUS, e.g. DEVICES FOR LOCATING OR STIMULATING REFLEX POINTS IN THE BODY; ARTIFICIAL RESPIRATION; MASSAGE; BATHING DEVICES FOR SPECIAL THERAPEUTIC OR HYGIENIC PURPOSES OR SPECIFIC PARTS OF THE BODY
- A61H2201/00—Characteristics of apparatus not provided for in the preceding codes
- A61H2201/01—Constructive details
- A61H2201/0188—Illumination related features
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61H—PHYSICAL THERAPY APPARATUS, e.g. DEVICES FOR LOCATING OR STIMULATING REFLEX POINTS IN THE BODY; ARTIFICIAL RESPIRATION; MASSAGE; BATHING DEVICES FOR SPECIAL THERAPEUTIC OR HYGIENIC PURPOSES OR SPECIFIC PARTS OF THE BODY
- A61H2201/00—Characteristics of apparatus not provided for in the preceding codes
- A61H2201/12—Driving means
- A61H2201/1207—Driving means with electric or magnetic drive
- A61H2201/1215—Rotary drive
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61H—PHYSICAL THERAPY APPARATUS, e.g. DEVICES FOR LOCATING OR STIMULATING REFLEX POINTS IN THE BODY; ARTIFICIAL RESPIRATION; MASSAGE; BATHING DEVICES FOR SPECIAL THERAPEUTIC OR HYGIENIC PURPOSES OR SPECIFIC PARTS OF THE BODY
- A61H2201/00—Characteristics of apparatus not provided for in the preceding codes
- A61H2201/50—Control means thereof
- A61H2201/5007—Control means thereof computer controlled
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61H—PHYSICAL THERAPY APPARATUS, e.g. DEVICES FOR LOCATING OR STIMULATING REFLEX POINTS IN THE BODY; ARTIFICIAL RESPIRATION; MASSAGE; BATHING DEVICES FOR SPECIAL THERAPEUTIC OR HYGIENIC PURPOSES OR SPECIFIC PARTS OF THE BODY
- A61H2201/00—Characteristics of apparatus not provided for in the preceding codes
- A61H2201/50—Control means thereof
- A61H2201/5058—Sensors or detectors
- A61H2201/5084—Acceleration sensors
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61H—PHYSICAL THERAPY APPARATUS, e.g. DEVICES FOR LOCATING OR STIMULATING REFLEX POINTS IN THE BODY; ARTIFICIAL RESPIRATION; MASSAGE; BATHING DEVICES FOR SPECIAL THERAPEUTIC OR HYGIENIC PURPOSES OR SPECIFIC PARTS OF THE BODY
- A61H2201/00—Characteristics of apparatus not provided for in the preceding codes
- A61H2201/50—Control means thereof
- A61H2201/5058—Sensors or detectors
- A61H2201/5092—Optical sensor
Definitions
- the invention relates to an assistance system according to the preamble of claim 1.
- Such assistance systems are known, for example, from AU2020101563A4. That document proposes scanning the area in front of the blind person with a horizontally rotating ultrasonic sensor. In addition, it proposes validating an obstacle detected by the ultrasonic sensor via a camera image processed by a neural network and alerting the blind person to it via earphones.
- the known assistance system has the disadvantage that the blind person—since they only receive an acoustic alarm—neither knows what type of obstacle was detected, in which direction the obstacle is located, nor in which direction they should avoid the obstacle. In addition, continuous audio signals and the wearing of headphones are perceived as annoying and dangerous by blind people, since dangers such as approaching vehicles are easily ignored.
- A further assistance system for the blind is known in which several ultrasonic sensors scan the area in front of the blind person in the walking direction. Depending on the position of a detected obstacle, a tactile signal recognizable by the blind person is emitted on the control panel of the assistance system.
- the distance from which the assistance system should respond can be set individually by the blind person.
- Although this assistance system shows a blind person in which direction an obstacle in front of them is to be avoided, it remains unclear to them what type of obstacle it is.
- Another disadvantage of this assistance system is that its user receives information about the existence of obstacles in front of him, but no information about their dimensions or type.
- The reliability and accuracy of the signals is considerably limited by the use of ultrasonic sensors, which partly fail to respond to weakly reflective objects and, owing to their wavelength-related lower angular resolution, do not detect obstacles correctly.
- this assistance system has a drive system in contact with the ground, through which the user is actively guided during his forward movement in such a way that he can safely avoid obstacles.
- TOF (time-of-flight) sensor
- LiDAR sensor
- a mirror reaching down to the floor should not be recognized as a passage but as an obstacle.
- a distinction is also important, for example, for a footpath that runs at the same height as an adjacent lane, since both represent a common plane for the TOF sensor.
- This assistance system also has the further disadvantage that the blind person still does not know what type of obstacle was recognized and in which direction they should avoid the obstacle, since they only receive information on the distance to an obstacle in front of them.
- A further object of the invention is to design this known assistance system in such a way that it can recognize structures located in front of the user that do not stand out from their surroundings by their dimensions.
- the assistance system according to the invention has the advantage that, particularly in more complex situations, users are able to pass through narrow passages such as, for example, at subway stations or inside trains, by precisely specifying the walking direction to be taken.
- The invention solves the problem that the known assistance systems were not able to reliably signal a walking direction if there were no spatial obstacles in front of the blind person, as is the case, for example, in large rooms, pedestrian zones or on field paths and sidewalks.
- This capability of the assistance system according to the invention considerably expands its area of application.
- A further advantage of the invention is that the assistance system according to the invention makes it possible to recognize different categories of obstacles and objects in front of the blind person (for example people, vehicles or furniture) and to signal them to the blind person.
- With the assistance system it is also possible to recognize situations in which the user is located from the arrangement of objects recognized by the neural network. Examples are that the user has left the sidewalk and is already on the street, or that he is standing in front of a door that still has to be opened, which assistance systems based on the prior art simply interpret as a wall.
- the artificial neural network is trained on the largest possible data set of example images, which have been manually provided with the correct walking direction and the useful information contained in the example images.
- These can be in particular: camera lens dirty; caution required: walking direction still unclear; zebra crossing present; caution, step; stairs down; stairs up; the line number and destination of an incoming tram or bus; door present; platform edge present; station present.
- the artificial neural network can also be trained for complex situations, so that it can also signal a recommended walking direction in a train station underpass or on a market square with any number of people walking around, so that collisions with passers-by are prevented.
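As an illustration of the manually labelled training data described above, one example image could be paired with its correct walking direction and useful information roughly as follows (a minimal Python sketch; all field names, file names and values are hypothetical, not taken from the patent):

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical record for one manually labelled training example: an image,
# the correct walking direction, and the useful information it contains.
@dataclass
class LabeledExample:
    image_path: str
    walking_direction_deg: float          # e.g. -90 (left) .. +90 (right)
    info_flags: List[str] = field(default_factory=list)

example = LabeledExample(
    image_path="images/underpass_0001.png",   # hypothetical file name
    walking_direction_deg=-15.0,              # slightly to the left
    info_flags=["zebra_crossing", "stairs_down"],
)
```

A large set of such records, covering complex scenes like station underpasses and crowded squares, would form the training set for the network.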
- the development of the invention according to claim 2 offers the advantage that the generation of a depth image provides significantly more detailed information in relation to the area in front of the blind person.
- the accuracy of the recommended walking direction can be improved in that a depth image is generated as an intermediate step by depth estimation, from which the recommended walking direction is obtained together with the color image. This improves the accuracy, especially in situations where a large number of people are standing in front of the blind person (e.g. in a marketplace) or in which there is no clear path but many objects are in the area (e.g. indoors).
- The development of the invention according to claim 4 offers the advantage that, in situations with moving people or objects in the area, it can be determined from several consecutively recorded images whether the respective people or objects are moving towards the blind person, away from them, or sideways relative to them. People walking in front of the blind person in the same direction do not constitute an obstacle to the blind person's movement, but people moving towards the blind person do.
- the assistance system can recommend a direction G for oncoming passers-by, which causes the blind person P to avoid the oncoming person.
- a more accurate recommended walking direction can be obtained from the movement information obtained from consecutive color or depth images by the artificial neural network when there are many people walking in different directions in front of the blind person.
- A vehicle parked to the side of the walking direction taken by the blind person is not an obstacle for them, but if the same vehicle drives towards the center of the camera image, it poses a potential danger to the blind person.
- The development of the invention according to claim 5 offers the advantage that significantly less time is required to calculate the output values of the neural network (G, I) than if the same calculation were carried out on the main processing unit (CPU). This reduces the total latency, i.e. the total time from the recording of the last image in a series of images to the output of the recommended walking direction.
- Another advantage is that the tensor processing unit (“TPU”) uses significantly less energy for the same calculation than would be the case if the main processing unit was used, which increases the battery life of the assistance system.
- the recorded information is also passed on to the blind person haptically by a rotating pointer, which is attached to a handle belonging to the assistance system.
- the blind person is not only given an approximate direction (as with an acoustic right/left signal), but always an exact direction that can be felt with the fingers.
- the blind person can increase their speed because they are continuously shown an accurate direction signal, so that small deviations of the blind person from the recommended walking direction are constantly but gently corrected.
- the development of the invention according to claim 7 offers the advantage that the blind person is informed of existing doors, stairs and dangers by means of different vibration patterns generated by a vibration motor and acoustically via voice signals which can also be deactivated. This is helpful insofar as with a conventional white cane a closed door cannot be distinguished from a wall by touch. In addition, steps are only recognized with a conventional white cane at a distance of about 60 cm, which means that the blind person has to reduce their walking speed in order to constantly be prepared for steps that suddenly appear.
- the extension according to claim 8 offers the advantage that blind people who want to slowly get used to the assistance system according to the invention do not have to do without the cane for the blind at the beginning.
- obstacles can be felt with a long cane in addition to the optical detection by the assistance system.
- the ability to keep the long cane for the blind is also advantageous in that other road users perceive it as a well-known indicator of a blind person on the road.
- The development of the invention according to claim 9 offers the advantage that navigation to a specific destination entered by the blind person P is taken into account in the walking direction recommended for the blind person. In this way, for example, a street crossing at a suitable place with traffic lights or a zebra crossing can be prioritized. If, during active GPS navigation, there are several possible directions in which movement would be conceivable, the artificial neural network can take into account the direction in which the GPS route leads as additional information, in order to enable the blind person to follow the calculated route more easily.
- When crossing a street, radar sensors can be used to check whether it is possible to cross safely or whether a vehicle is approaching. If the latter is the case and crossing the street would be too dangerous, this can be signaled to the blind person by a vibration pattern. If the blind person moves onto the road despite such a warning, a warning tone can also be emitted.
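The described radar check could look roughly like this in software (a hypothetical sketch; the function name, sensor interface and speed threshold are invented for the example and not specified in the patent):

```python
def safe_to_cross(radar_speeds_kmh, max_speed_kmh=1.0):
    """Illustrative check: the street is considered safe to cross only if
    neither the left- nor the right-pointing radar sensor reports a vehicle
    moving faster than the (assumed) threshold. Negative values stand for
    vehicles approaching from the other direction, hence abs()."""
    return all(abs(v) <= max_speed_kmh for v in radar_speeds_kmh)

# No vehicles approaching from either side: crossing is considered safe.
# A car at 30 km/h on one side: warn via vibration pattern instead.
```

A `False` result would then trigger the vibration pattern, and stepping onto the road anyway would trigger the warning tone.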
- the light source used to illuminate the surroundings can emit light in the infrared range, so that passers-by are not dazzled.
- The development of the invention according to claim 12 offers the advantage that when the blind person rotates the assistance system quickly, the displayed walking direction can be adjusted very quickly by the rotation angle, which is measured by integrating the output of a rotation rate sensor. As a result, until the recalculation of the recommended walking direction by the neural network has been completed, an interim walking direction can be displayed which in most cases already corresponds to the correct walking direction. The indication of the walking direction thus appears clearer to the blind person and confuses them much less frequently.
- The assistance system A according to figure 1 includes a housing manufactured using the 3D printing process. This has slots for a battery 3 and a cane 9 for the blind. If the battery 3 is pushed into the housing, it is connected to a plug-in connection placed in the middle, via which the system A is supplied with power. With the connection used, the orientation of the battery 3 is irrelevant, so that the battery 3 can also be rotated and pushed into the housing. Once the battery 3 is inserted into the system A, it is locked in place, for example by closing the battery compartment.
- If the foldable cane 9 is pushed into the assistance system A, the holding mechanism 10 prevents it from slipping out.
- a magnet is used for this, which attracts the magnet in the end of the white cane to hold it in position.
- The cane 9 for the blind consists of a red and white lacquered carbon tube which is divided into several sections that, as shown in figure 5, are connected to each other so that no transition between the sections can be felt when they are plugged together; gap dimensions are thus minimized, which improves the general stability of the cane. This allows the cane to be folded up to save space when not in use.
- the connection shown is also robust for loads that occur due to the swinging of the cane 9 and can be separated by the blind person P pulling it apart.
- The so-called tip 11 of the cane 9 can, for example, have ball bearings for easier and quieter swinging.
- Also conceivable are a motor in the tip that can give the cane an impulse to the left or right, as well as sensors to detect vibrations from the ground and obstacles H.
- The sensor values can be used as a further source of information for the neural network. It is also conceivable to design the ball-bearing tip of the cane 9 as an omnidirectional wheel in order to further reduce the force required for swinging and walking.
- A docking station can be provided which is attached to a wall; the system A is automatically deactivated and charged via conductive contact points or wireless charging when it is hung in the docking station.
- a docking station attached to the wall takes up very little space, so the assistance system A is always tidy and can prevent any tripping. In this way, the station can be installed near the entrance door, so that the assistance system A is always fully charged and ready to hand.
- a USB type C charging connection is also integrated, which enables the battery to be charged with conventional smartphone chargers.
- The housing of the assistance system A is ergonomically shaped so that the handle, which contains a servo motor 5 with a pointer 6, is held at a flat angle, so that the blind person P does not have to bend their wrist as much compared to a standard cane for the blind.
- the handle is adapted to the shape of a hand, making it more comfortable to hold.
- the pointer 6 is about 15 to 25 millimeters long, so that the blind person P can feel the direction G in which the pointer 6 is directed with their thumb.
- This direction corresponds to the recommended walking direction G, which the assistance system A, based on the obstacles H present, suggests as being the most suitable for further movement.
- the pointer is fastened to the shaft of a micro-servo motor with a screw, which has the highest possible angular velocity.
- This motor is controlled via a PWM signal by the main processing unit, which is implemented as a Raspberry Pi Compute Module.
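As a rough sketch of how such a PWM signal could encode the pointer angle, assuming the common hobby-servo convention of 1.0 to 2.0 ms pulses at 50 Hz (the patent does not specify these values):

```python
def servo_duty_cycle(angle_deg, freq_hz=50.0,
                     min_pulse_ms=1.0, max_pulse_ms=2.0):
    """Map a pointer angle in [-90, +90] degrees to a PWM duty cycle in
    percent for a typical hobby micro servo. The pulse range is a common
    default, not a value taken from the patent; real servos vary."""
    angle_deg = max(-90.0, min(90.0, angle_deg))  # clamp to servo range
    pulse_ms = min_pulse_ms + (angle_deg + 90.0) / 180.0 * (max_pulse_ms - min_pulse_ms)
    period_ms = 1000.0 / freq_hz
    return 100.0 * pulse_ms / period_ms

# Straight ahead (0 deg) corresponds to a 1.5 ms pulse, i.e. 7.5 % duty.
```

On a Raspberry Pi, the returned duty cycle could for instance be fed to a software PWM channel via RPi.GPIO's `ChangeDutyCycle`, though the actual driver used by the assistance system is not stated.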
- the camera 4 is installed in the housing in such a way that when using the system A it is inclined slightly towards the floor, so that as many optical features of the environment as possible can be recorded.
- the position of the camera 4 was selected in such a way that neither the handle G of the system A nor the cane 9 or the corresponding end piece 11 can be seen on the camera image.
- the camera is connected to the single-board computer via a "MIPI CSI" camera interface and has the widest possible viewing angle of 180 degrees, for example.
- a 3D map of the immediate surroundings can also be created from the camera images (Point Cloud, obtained through SLAM, Simultaneous Localization And Mapping), which can serve as a further source of information for the neural network.
- the camera image can be expanded beyond the field of view of the camera. This would create a kind of panorama picture.
- a vibration motor 7 and a loudspeaker 8 are installed in the housing. These are controlled by the single-board computer 1.
- the intensity of the vibration motor 7 can be controlled via pulse width modulation, so that not only long and short vibrations, but also strong and weak vibrations and vibration patterns (for example triple short vibration) can be generated.
- the distance to an obstacle H or a spatial condition can be indicated by the intensity of the vibration.
- Vibration patterns can also indicate whether a spatial condition, e.g. B. a door or stairs was detected.
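One way such patterns could be encoded in software (an illustrative sketch; the pattern names, intensities and timings are invented for the example, and `set_pwm_duty` stands in for the actual transistor-driven PWM output):

```python
import time

# Hypothetical vibration patterns as (intensity %, duration s) steps.
PATTERNS = {
    "door":   [(100, 0.1), (0, 0.1), (100, 0.1), (0, 0.1), (100, 0.1)],
    "stairs": [(60, 0.4)],
}

def play_pattern(name, set_pwm_duty, sleep=time.sleep):
    """Drive the vibration motor through a named pattern by setting the
    PWM duty cycle (0-100 %) for each step, then switch the motor off."""
    for duty, duration in PATTERNS[name]:
        set_pwm_duty(duty)
        sleep(duration)
    set_pwm_duty(0)
```

Distance to an obstacle could analogously be mapped to the intensity value, as the preceding paragraph describes.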
- A transistor is used to control the vibration motor 7 so that the 5V motor can be driven with the 3.3V control voltage of the single-board computer or system-on-module (e.g. a Raspberry Pi Compute Module 4).
- The vibration generator and the tone generator can be designed as one component.
- Such a so-called exciter is used to reproduce speech by using the resonance of the plastic housing.
- The exciter can also be used to reproduce a vibration signal that is particularly pleasant to the touch, similar to the "Taptic Engine" signal known from the iPhone.
- Speech that is generated synthetically is output via the loudspeaker 8.
- a machine learning-based method is used as a synthesizer, which creates a natural speech image. It is also possible to store finished audio files on the Raspberry Pi memory that are used frequently, such as "stairs" or "crosswalks". For example, information about route guidance or recognized spatial conditions can be output via speech.
- the captured camera image can be described in words by another artificial neural network, and bus routes and the destination of a bus or other text can be read out.
- the deep learning model processes a color image with the dimensions [224,224,3] for a single color image or correspondingly extended dimensions for further input data, which contain, for example, depth or movement information. With another channel for depth information, the input data of the machine learning model then have the dimensions [224,224,4].
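The extension of the [224, 224, 3] color image by a depth channel to the [224, 224, 4] input tensor described above can be illustrated with NumPy (synthetic zero arrays stand in for real camera and depth data):

```python
import numpy as np

# Color image and single-channel depth image, as described in the text.
rgb = np.zeros((224, 224, 3), dtype=np.float32)    # camera image
depth = np.zeros((224, 224, 1), dtype=np.float32)  # estimated depth image

# Concatenate along the channel axis to form the 4-channel input tensor.
x = np.concatenate([rgb, depth], axis=-1)
```

Further channels, such as a semantic segmentation map, would extend the last dimension in the same way.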
- The information "next turn right/left" can be added to the input data, provided that GPS navigation is activated on the assistance system (via a GPS receiver integrated in the assistance system), or the directions from the blind person's smartphone are transmitted to the assistance system via Bluetooth through an app. In the same way, the machine learning model is provided with the speed readings of the radar sensors pointing to the left and right, so that approaching vehicles can be detected.
- a representation of the camera image called semantic segmentation can also be used as an additional channel of the input tensor, which was previously obtained by another neural image-to-image network.
- the input data is processed within the network by one or, depending on the input data, several parallel network structures, through which significant features in the data are recognized and localized. This is followed by a network structure that interprets the recognized features and from this concludes the output of the network (Fully Connected Layers, also called Dense Layer).
- the output of the network consists of an angle and several values ranging from 0 to 1. A value approaches 1 when the specific situation that the value reflects can be seen in the camera image.
- a so-called “ResNet50” or “MobileNetv2” can be used as a convolutional neural network, the parameters of which have been trained on the example images contained in the data set.
- The last layers of the "MobileNet" network architecture are replaced by dense layers, which generate an output tensor with the appropriate number of output values to output the recommended walking direction and other useful information (stairs, zebra crossings).
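A dense output head of this kind, producing one angle and several values between 0 and 1 as described above, could be sketched in pure NumPy as follows (illustrative only: the tanh scaling of the direction output, the feature size of 1280 and the count of 8 info outputs are assumptions, not values from the patent):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def output_head(features, w_dir, w_info):
    """Map pooled backbone features to (walking direction in degrees,
    info values in [0, 1]). One linear unit squashed by tanh yields the
    angle; sigmoid units yield one value per piece of useful information
    (stairs, zebra crossing, ...)."""
    direction_deg = 90.0 * np.tanh(features @ w_dir)
    info = sigmoid(features @ w_info)
    return direction_deg, info

rng = np.random.default_rng(0)
features = rng.normal(size=1280)             # e.g. pooled MobileNetV2 features
w_dir = rng.normal(size=1280) * 0.01         # placeholder weights
w_info = rng.normal(size=(1280, 8)) * 0.01   # 8 hypothetical info outputs
direction, info = output_head(features, w_dir, w_info)
```

In a trained network these weights would of course come from training on the labelled data set rather than from a random generator.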
- a ResNet 50 encoder is used for the monocular depth estimation and a decoder that is also networked with the encoder via so-called skip connections in order to generate a depth image.
- the 3D data set required for this is previously recorded with a lidar sensor or stereo camera that is as precise as possible.
- PyTorch or TensorFlow is used as a machine learning framework.
- the output values are calculated on an "AI accelerator” connected to the Raspberry Pi compute module via PCI Express.
- a "Coral Accelerator Module” is used for this.
- a circuit board containing both the tensor and the main processing unit can also be used.
- An i.MX 8M Plus processor with an integrated neural processing unit (NPU/TPU) can also be used as the main processing unit instead of a separate machine learning accelerator.
- the position of the pointer on the handle G can be updated using information about the rotation of the assistance system, which is measured with a rotation rate sensor. The fact is used that if the recommended walking direction pointed straight ahead and the assistance system was turned 20 degrees to the right in the meantime, the new recommended walking direction is very likely to point 20 degrees to the left. Since the measurement of the yaw rate and its integration over time takes less time than taking a new image and evaluating it through the neural network, the latency time is reduced.
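The interim correction described above can be sketched as follows (illustrative; the sample interval and the sign convention, right turn positive, are assumptions):

```python
def integrate_yaw(rate_samples_deg_s, dt_s):
    """Integrate rotation-rate samples over time to estimate how far the
    assistance system has turned since the last camera image."""
    return sum(r * dt_s for r in rate_samples_deg_s)

def interim_pointer_angle(last_recommended_deg, rotation_deg):
    """If the system was turned by rotation_deg to the right, move the
    pointer by the same amount further to the left, until the neural
    network has recalculated the recommended walking direction."""
    return last_recommended_deg - rotation_deg

# 20 deg/s to the right for 1 s, sampled every 10 ms:
turned = integrate_yaw([20.0] * 100, 0.010)
```

Because this integration is far cheaper than capturing and evaluating a new image, the pointer can be updated almost instantly between network inferences.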
- the Structure From motion or "Depth Map from Stereo Images” method can also be used. Two depth images obtained one after the other are now subtracted from one another in order to obtain an image in which it is possible to recognize which objects have moved relative to the assistance system. This image can now also be processed by a neural network to improve the accuracy of the recommended walking direction.
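The subtraction of two consecutively obtained depth images can be sketched with NumPy (synthetic 4x4 depth maps; the 0.5 m noise threshold is an assumed value, not from the patent):

```python
import numpy as np

# Two consecutive depth images: everything is 5 m away at t0, then an
# object in the centre comes 1 m closer by t1.
depth_t0 = np.full((4, 4), 5.0, dtype=np.float32)
depth_t1 = depth_t0.copy()
depth_t1[1:3, 1:3] = 4.0

motion = depth_t0 - depth_t1   # positive where something approached
approaching = motion > 0.5     # threshold out depth-estimation noise
```

The resulting mask marks regions moving towards the assistance system and could be fed to the neural network as an additional channel to refine the recommended walking direction.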
- Light-emitting diodes (LEDs) are attached to various parts of the housing and are activated by a built-in ambient light sensor in the dark to make the assistance system more visible.
- these light-emitting diodes illuminate the surroundings in order to be able to record a better camera image.
- Several wide-angle cameras pointing in different directions can also be integrated in order to cover a larger area at the same time and to detect obstacles located above the assistance system.
Abstract
A deep learning-based assistance system for blind people is specified, which reliably supports them with regard to their further walking direction even in more complex situations. For this purpose, at least one ultra-wide-angle camera is preferably used, from whose recordings a 3D image is generated by depth estimation. The instructions of any active GPS navigation are also included in the determination of the recommended walking direction. The information obtained is evaluated together with the camera image using deep learning in order to recommend a walking direction to the user via a pointer moved by an electric servomotor. Dangers and other important information about the surroundings are signaled to the user through different vibration patterns or speech.
A cane for the blind can also be attached to the assistance system, so that no long familiarization phase is necessary for the user. The deep learning approach also makes it possible to recognize shiny, reflective, black and transparent objects, as well as structures that only appear to be obstacles and do not have to be avoided by the user at all.
Description
Although this assistance system also shows a blind person in which direction an obstacle in front of them is to be avoided, it remains unclear to them what type of obstacle it is. Another disadvantage of this assistance system is that its user receives information about the existence of obstacles in front of him, but no information about their dimensions or type.
In addition, the reliability and accuracy of the signals is considerably limited by the use of ultrasonic sensors, which partly fail to respond to weakly reflective objects and, owing to their wavelength-related lower angular resolution, do not detect obstacles correctly.
With this assistance system, however, there are sometimes situations in relation to the user's surroundings that do not correspond to the standard data for structural conditions obtained using ultrasonic and infrared sensors and stored in the microcontroller, which means that the user can be guided in the wrong way.
This assistance system also has the further disadvantage that the blind person still does not know what type of obstacle was recognized and in which direction they should avoid the obstacle, since they only receive information on the distance to an obstacle in front of them.
It was therefore an object of the invention to develop an assistance system for blind people according to the preamble of patent claim 1 in such a way that, when an obstacle appears, it gives the blind person not only a simple "right" or "left" signal as a general directional instruction, but also an exact walking direction.
Weitere Aufgabe der Erfindung ist es, dieses bekannte Assistenzsystem so auszugestalten, dass es vor dem Benutzer sich befindliche, nicht durch ihre Abmessungen von der Umgebung sich abzeichnende Strukturen als solche erkennen kann.A further object of the invention is to design this known assistance system in such a way that it can recognize as such structures located in front of the user and whose dimensions cannot be distinguished from the surroundings.
Diese Aufgabe wird durch die kennzeichnenden Merkmale des Patentanspruchs 1 gelöst.This object is solved by the characterizing features of
Das erfindungsgemäße Assistenzsystem weist den Vorteil auf, dass Benutzer insbesondere in komplexeren Situationen durch genaue Angabe der einzuschlagenden Gehrichtung in die Lage versetzt werden, schmale Durchgänge wie zum Beispiel an U-Bahn Haltestellen oder innerhalb von Zügen zielsicher zu passieren.The assistance system according to the invention has the advantage that, particularly in more complex situations, users are able to pass through narrow passages such as, for example, at subway stations or inside trains, by precisely specifying the walking direction to be taken.
Zusätzlich löst die Erfindung das Problem, dass die bekannten Assistenzsysteme nicht in der Lage waren, eine Gehrichtung zuverlässig zu signalisieren, wenn keine räumlichen Hindernisse vor der blinden Person lagen, wie es beispielsweise in großen Räumen, Fußgängerzonen oder auf Feld- und Gehwegen der Fall ist. Durch diese Fähigkeit des erfindungsgemäßen Assistenzsystems erweitert sich dessen Anwendungsbereich beträchtlich.In addition, the invention solves the problem that the known assistance systems were not able to reliably signal a walking direction if there were no spatial obstacles in front of the blind person, as is the case, for example, in large rooms, pedestrian zones or on field paths and sidewalks . This capability of the assistance system according to the invention considerably expands its area of application.
Ein weiterer Vorteil der Erfindung besteht darin, dass es mit dem erfindungsgemäßen Assistenzsystem möglich ist, verschiedene Kategorien (zum Beispiel Personen, Fahrzeuge oder Möbel) von vor der blinden Person liegenden Hindernissen und Gegenständen zu erkennen und der blinden Person zu signalisieren.A further advantage of the invention is that it is possible with the assistance system according to the invention to recognize different categories (for example people, vehicles or furniture) of obstacles and objects in front of the blind person and to signal the blind person.
Mit dem erfindungsgemäßen Assistenzsystem ist es auch möglich, mithilfe der vom neuronalen Netzwerk erkannten Anordnung von Gegenständen Situationen zu erkennen, in denen sich der Benutzer befindet. Beispiele dafür sind, dass der Benutzer den Gehweg verlassen hat und sich bereits auf der Straße befindet oder dass er vor einer noch zu öffnenden Türe steht, die bei Assistenzsystemen nach dem Stand der Technik schlicht als Wand interpretiert werden. Das künstliche neuronale Netzwerk wird dazu auf einen möglichst großen Datensatz an Beispielbildern, die manuell mit der korrekten Gehrichtung, sowie den in den Beispielbildern enthaltenen nützlichen Informationen versehen wurden, trainiert.With the assistance system according to the invention, it is also possible to recognize situations in which the user is located using the arrangement of objects recognized by the neural network. Examples of this are that the user has left the sidewalk and is already on the street or that he is standing in front of a door that still has to be opened, which is simply interpreted as a wall in the case of assistance systems based on the prior art. For this purpose, the artificial neural network is trained on the largest possible data set of example images, which have been manually provided with the correct walking direction and the useful information contained in the example images.
As regards the latter, this information can in particular be:
camera lens dirty; caution required: walking direction still unclear; zebra crossing present; caution, step; stairs down; stairs up; the line number and destination of an approaching tram or bus; door present; platform edge present; stop present
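The annotation scheme described above (a manually assigned walking direction plus the listed items of useful information) could be represented for one training example roughly as follows; the flag names and the label layout are illustrative assumptions, not taken from the patent:

```python
from dataclasses import dataclass, field

# Illustrative flag names derived from the list above; the actual label
# set and its encoding are not specified in this document.
INFO_FLAGS = [
    "lens_dirty", "direction_unclear", "zebra_crossing", "step",
    "stairs_down", "stairs_up", "door", "platform_edge", "stop",
]

@dataclass
class TrainingLabel:
    """Manual annotation for one example image in the training data set."""
    walking_direction_deg: float              # correct walking direction
    info: dict = field(default_factory=dict)  # flag name -> 0/1 target

    def target_vector(self):
        # One regression target (the angle) followed by one 0/1
        # classification target per item of useful information.
        return [self.walking_direction_deg] + [
            float(self.info.get(name, 0.0)) for name in INFO_FLAGS
        ]

label = TrainingLabel(walking_direction_deg=15.0, info={"zebra_crossing": 1})
print(label.target_vector()[:4])   # [15.0, 0.0, 0.0, 1.0]
```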
The artificial neural network can also be trained for complex situations, so that it can signal a recommended walking direction even in a station underpass or on a market square with many people walking about at random, preventing collisions with passers-by.
The development of the invention according to claim 2 offers the advantage that generating a depth image provides considerably more detailed information about the area in front of the blind person. The accuracy of the recommended walking direction can be improved by generating, as an intermediate step, a depth image via depth estimation, from which, together with the color image, the recommended walking direction is derived. This improves accuracy above all in situations in which a large number of people are standing in front of the blind person (for example on a market square) or in which no clear path is discernible but many objects are present in the surroundings (for example indoors).
The development of the invention according to claim 3 offers the advantage that using stereo images not only produces a depth image but also allows the distance of particular points in the image to be measured even more precisely, on an absolute scale. In addition, the distance measurement can be improved in situations in which reference objects of known dimensions, such as people or vehicles, are absent.
The development of the invention according to claim 4 offers the advantage that, in situations with moving people or objects in the surroundings, it can be determined from several consecutively recorded images whether the respective people or objects are moving towards the blind person, away from them, or laterally with respect to them. People walking ahead of the blind person in the same direction do not constitute an obstacle to the blind person's movement, but people moving towards the blind person do. The assistance system can therefore, when passers-by approach, recommend a direction G that causes the blind person P to evade the oncoming person. Moreover, from the motion information obtained from consecutive color or depth images, the artificial neural network can derive a more accurate recommended walking direction when many people walking in different directions are in front of the blind person. Likewise, a vehicle parked to the side of the blind person's chosen walking direction does not constitute an obstacle for them, but if the same vehicle moves towards the center of the recorded camera image, it represents a potential danger to the blind person.
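The distinctions made above (approaching versus receding movement, and an object drifting towards the image center) can be sketched as simple rules over per-object tracks across consecutive frames. This is a minimal illustration of the idea only, not the patent's implementation, and the thresholds are invented:

```python
# Minimal sketch: classify an object's radial motion relative to the user
# from distance estimates in two consecutive frames. Thresholds are
# illustrative assumptions.
def classify_motion(prev_dist_m, curr_dist_m, threshold_m=0.1):
    delta = curr_dist_m - prev_dist_m
    if delta < -threshold_m:
        return "approaching"        # e.g. an oncoming passer-by: evade
    if delta > threshold_m:
        return "receding"           # e.g. a person walking ahead: no obstacle
    return "stationary_or_lateral"

def drifting_to_image_center(prev_x, curr_x, center_band=0.2):
    """True if an object's horizontal image position (-1 .. +1, 0 = image
    center) converges on the center, like the vehicle example above."""
    return abs(curr_x) < abs(prev_x) and abs(curr_x) < center_band

print(classify_motion(3.0, 2.5))            # approaching
print(drifting_to_image_center(0.8, 0.1))   # True
```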
The development of the invention according to claim 5 offers the advantage that calculating the output values of the neural network (G, I) takes considerably less time than if the same calculation were performed on the main processing unit (CPU). This reduces the overall latency, that is, the total time from the capture of the last image of an image series until the recommended walking direction can be output. A further advantage is that the tensor processing unit ("TPU") consumes considerably less energy for the same calculation than the main processing unit would, which increases the battery life of the assistance system.
According to claim 6, the captured information is additionally passed on to the blind person haptically by a rotating pointer attached to a handle belonging to the assistance system. This enables the blind person to make their own decisions about how to react to conditions such as a staircase, a door, or a zebra crossing. At the same time, the indicated direction always guides them around obstacles and keeps them on walkable paths. The blind person is thus given not merely an approximate direction (as with an acoustic right/left signal) but always an exact direction indication that can be felt with the fingers. The blind person can therefore increase their speed, since a precise direction signal is indicated continuously, so that small deviations from the recommended walking direction are corrected constantly but gently.
The development of the invention according to claim 7 offers the advantage that the blind person is informed about doors, stairs, and dangers present by different vibration patterns generated by a vibration motor, as well as acoustically via speech signals, which can also be deactivated. This is helpful because, with a conventional white cane, a closed door cannot be distinguished from a wall by touch. Moreover, with a conventional white cane, steps are only detected at a distance of roughly 60 cm, so the blind person has to reduce their walking speed in order to be constantly prepared for suddenly appearing steps.
The extension according to claim 8 offers the advantage that people who have lost their sight and want to become accustomed to the assistance system gradually do not have to give up the long white cane from the start. For safety, obstacles can additionally be felt with the long cane alongside the optical detection by the assistance system. Being able to keep the long cane is also advantageous in that other road users perceive it as the familiar sign of a blind person in traffic.
The development of the invention according to claim 9 offers the advantage that navigation to a specific destination entered by the blind person P is factored into the walking direction recommended to them. In this way, for example, a street crossing at a suitable point with traffic lights or a zebra crossing can be prioritized. If, during active GPS navigation, there are several possible directions in which movement would be conceivable, the artificial neural network can take into account, as additional information, the direction in which the GPS route leads, making it easier for the blind person to follow the calculated route.
When crossing a street, the development of the invention according to claim 10 allows radar sensors to check whether the street can be crossed safely or whether a vehicle is approaching. If the latter is the case, so that crossing the street would be too dangerous, this can be signaled to the blind person by a vibration pattern. If the blind person moves onto the road despite such a warning, a warning tone can additionally be emitted.
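The crossing check described for claim 10 could be reduced to logic like the following sketch; the speed threshold and the signal names are assumptions for illustration, not values from the patent:

```python
# Hedged sketch of the claim-10 logic: side-facing radar readings decide
# whether the street can be crossed safely, and which warning to issue.
def crossing_warning(left_speed_mps, right_speed_mps,
                     user_on_road=False, danger_speed_mps=0.5):
    """Radial speeds measured by the side-facing radar sensors, in m/s;
    positive values mean a vehicle approaching the user."""
    if max(left_speed_mps, right_speed_mps) <= danger_speed_mps:
        return "safe_to_cross"
    # A vehicle is approaching: signal via vibration pattern; add a
    # warning tone only if the user steps onto the road anyway.
    return "vibration+warning_tone" if user_on_road else "vibration"

print(crossing_warning(0.0, 0.2))                      # safe_to_cross
print(crossing_warning(4.0, 0.0, user_on_road=True))   # vibration+warning_tone
```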
In order to guarantee that the camera system clearly captures the surroundings even in the dark and that the blind person is easily noticed by other road users, several light sources are installed in the assistance system according to claim 11. These can be switched on automatically, and their intensity can be controlled.
The light source used to illuminate the surroundings can emit light in the infrared range so that passers-by are not dazzled by it.
The development of the invention according to claim 12 offers the advantage that, when the blind person rotates the assistance system quickly, the indicated walking direction can be adjusted very quickly by the rotation angle obtained by integrating the readings of a yaw-rate sensor. As a result, until the neural network has finished recalculating the recommended walking direction, an interim walking direction can be indicated that in most cases already corresponds to the correct one. The indication of the walking direction thus appears clearer to the blind person and therefore causes confusion far less often.
An embodiment of the invention is described in more detail below. The figures show:
- Fig. 1 a cross section of the assistance system A,
- Fig. 2 a possible direction recommendation G with various obstacles H,
- Fig. 3 a possible direction recommendation G at a street crossing (with zebra crossing),
- Fig. 4 a possible direction recommendation G on a platform,
- Fig. 5 the plug-in connection of the foldable and modular long cane,
- Fig. 6 the structural composition of the neural network,
- Fig. 7 the data flow of the assistance system A, and
- Fig. 8 an overview showing a typical application situation of the assistance system.
The assistance system A according to
If the foldable long white cane 9 is pushed into the assistance system A, the holding mechanism 10 can prevent it from slipping out. For this purpose, a magnet is used that attracts the magnet in the end of the cane in order to hold it in position.
For weight reasons, the long white cane 9 consists of a red-and-white painted carbon-fiber tube divided into several sections which, as in
In addition to the exchangeable battery 3, a docking station is also conceivable, which is mounted on a wall and in which the system A is automatically deactivated and charged, via conductive contact points or wireless charging, when it is hung into the charging station.
The advantage of a wall-mounted docking station is that it takes up very little space, so that the assistance system A is always stowed tidily and possible tripping over it is prevented. The station can be mounted near the entrance door so that the assistance system A is always fully charged and ready to hand. Alternatively, an integrated USB Type-C charging port allows the battery to be charged with conventional smartphone chargers.
The housing of the assistance system A is ergonomically shaped so that the handle, which contains a servomotor 5 with pointer 6, is at a flat angle of about 10° to the horizontal during use, which results from the natural position of the human hand; the blind person P therefore has to bend their wrist less than with a conventional long white cane. In addition, the handle is shaped to fit the hand, making it more comfortable to hold.
Depending on the preference of the blind person P, the pointer 6 is about 15 to 25 millimeters long, so that the blind person P can easily feel with their thumb the direction G in which the pointer 6 points. This direction corresponds to the recommended walking direction G that the assistance system A, based on the obstacles H present, suggests as most suitable for further movement. For this purpose, the pointer is fastened by a screw to the shaft of a micro servo motor with the highest possible angular velocity. This motor is controlled via a PWM signal by the main processing unit, which is implemented as a Raspberry Pi Compute Module.
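The pointer drive described above (a micro servo driven by a PWM signal from the Raspberry Pi) could map the recommended walking direction G to a duty cycle as in the following sketch; the 50 Hz / 1.0 to 2.0 ms timing is the common hobby-servo convention and an assumption here, not a value taken from the patent:

```python
# Map the recommended walking direction G (degrees, negative = left) to a
# PWM duty cycle for a standard hobby servo: 50 Hz period (20 ms), 1.5 ms
# pulse for center, 1.0-2.0 ms for -90 to +90 degrees.
SERVO_FREQ_HZ = 50
PERIOD_MS = 1000.0 / SERVO_FREQ_HZ   # 20 ms

def direction_to_duty_cycle(direction_deg):
    direction_deg = max(-90.0, min(90.0, direction_deg))  # clamp to range
    pulse_ms = 1.5 + (direction_deg / 90.0) * 0.5
    return pulse_ms / PERIOD_MS * 100.0   # duty cycle in percent

print(direction_to_duty_cycle(0.0))    # 7.5  (pointer straight ahead)
print(direction_to_duty_cycle(90.0))   # 10.0 (fully right)
```

On a Raspberry Pi, the resulting value could then be applied with a PWM library such as `RPi.GPIO` (`ChangeDutyCycle`); the pin assignment of the device described here is not specified in the document.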
The camera 4 is installed in the housing in such a way that, when the system A is in use, it is inclined slightly towards the ground, so that as many optical features of the surroundings as possible can be captured.
The position of the camera 4 was chosen so that neither the handle G of the system A nor the long white cane 9 or its end piece 11 is visible in the camera image. The camera is connected to the single-board computer via a "MIPI CSI" camera interface and has the widest possible field of view, for example 180 degrees. A 3D map of the immediate surroundings can also be created from the camera images (a point cloud, obtained by SLAM, Simultaneous Localization And Mapping), which can serve as a further source of information for the neural network. Likewise, by using a yaw-rate sensor and an accelerometer, the camera image can be extended beyond the camera's field of view while the assistance system A is swung back and forth, which would produce a kind of panoramic image.
In the housing, according to
Synthetically generated speech is output via the loudspeaker 8. A machine-learning-based method is used as the synthesizer, which produces a natural speech pattern. It is also possible to store frequently needed, ready-made audio files, such as "stairs" or "zebra crossing", in the memory of the Raspberry Pi. Information about route guidance or detected spatial conditions, for example, can be output as speech. In addition, the captured camera image can be described in words by a further artificial neural network, and the line number and destination of a bus, or other text, can be read aloud. Searching for objects is also possible, with the user being notified acoustically as soon as the object sought (for example a mobile phone or a mailbox) is recognized in the camera image by a machine learning model for object classification (MobileNet v2 can be used as the model for this). Recognizing and reading out the color of a piece of clothing or other object in front of the system A is also conceivable. To make it easier for the user to learn the meaning of the vibration patterns, the corresponding meaning can initially also be output acoustically with each vibration signal, until the user no longer needs the speech output to recognize the meaning of the vibration signals.
The deep learning model (an artificial neural network, a deep convolutional neural network) processes a color image with the dimensions [224, 224, 3] for a single color image, or correspondingly extended dimensions for further input data containing, for example, depth or motion information. With an additional channel for depth information, the input data of the machine learning model then have the dimensions [224, 224, 4].
In the form of further channels (the third value of the dimensions represents the number of channels of the input tensor), the information "turn right/left soon" can additionally be added to the input data, provided GPS navigation is activated on the assistance system (via a GPS receiver integrated in the assistance system) or the direction instructions from an app on the blind person's smartphone are transmitted to the assistance system via Bluetooth. In the same way, the machine learning model is provided with the speed readings of the left- and right-facing radar sensors, so that approaching vehicles can be detected. A representation of the camera image known as a semantic segmentation, obtained beforehand by a further neural image-to-image network, can also be used as an additional channel of the input tensor. Within the network, the input data are processed by one or, depending on the input data, several parallel network structures, through which significant features in the data are recognized and localized. This is followed by a network structure that interprets the recognized features and infers the network's output from them (fully connected layers, also called dense layers).
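One way to assemble the multi-channel input tensor described above (color image, depth channel, and scalar context such as the turn hint and the radar speeds) is sketched below with NumPy; the channel order and the idea of broadcasting scalars into constant channels are illustrative assumptions:

```python
import numpy as np

H = W = 224  # input resolution of the model, as stated in the text

def build_input_tensor(rgb, depth, turn_hint=0.0,
                       radar_left=0.0, radar_right=0.0):
    """rgb: [224, 224, 3] floats; depth: [224, 224] floats. Each scalar
    context value is broadcast into a constant full-resolution channel."""
    channels = [rgb.astype(np.float32),
                depth[..., np.newaxis].astype(np.float32)]
    for value in (turn_hint, radar_left, radar_right):
        channels.append(np.full((H, W, 1), value, dtype=np.float32))
    return np.concatenate(channels, axis=-1)

x = build_input_tensor(np.zeros((H, W, 3)), np.zeros((H, W)),
                       turn_hint=1.0)   # e.g. "turn right soon" from GPS
print(x.shape)   # (224, 224, 7)
```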
It is also conceivable to route the first three color channels (the color image) into a neural network that calculates a depth image from them (also known as monocular depth estimation); this depth image, together with the color image itself and any further channels, which are forwarded as a so-called bypass, is then processed by a further neural network structure. This is again followed by dense layers, which interpret the recognized features and finally calculate the output values.
The output of the network consists of an angle and several values between 0 and 1. A value approaches 1 when the particular situation it reflects is visible in the camera image.
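Decoding that output vector could look like the sketch below; the situation names and the 0.5 threshold are illustrative assumptions, not values given in the document:

```python
# The first output value is the recommended walking direction (an angle);
# each further value is a score in [0, 1] for one specific situation.
SITUATIONS = ["zebra_crossing", "stairs_down", "stairs_up",
              "door", "platform_edge"]

def decode_output(output, threshold=0.5):
    angle_deg = output[0]
    detected = [name for name, score in zip(SITUATIONS, output[1:])
                if score > threshold]
    return angle_deg, detected

angle, info = decode_output([12.0, 0.1, 0.9, 0.0, 0.7, 0.2])
print(angle, info)   # 12.0 ['stairs_down', 'door']
```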
Als Convolutional Neural network kann beispielsweise ein sogenanntes "ResNet50" oder auch "MobileNetv2" verwendet werden, dessen Parameter auf die im Datensatz enthaltenen Beispielbilder trainiert worden sind. Die letzten Layer der "MobileNet" Netzwerkarchitektur müssen dadurch durch Dense Layer ersetzt werden, die einen Output-Tensor erzeugen der die Passende Anzahl an ausgabewerte hat um die empfohlene Gehrichtung sowie die weiteren Nützlichen Informationen (Treppe, Zebrastreifen ...) auszugeben.For example, a so-called "ResNet50" or "MobileNetv2" can be used as a convolutional neural network, the parameters of which have been trained on the example images contained in the data set. The last layers of the "MobileNet" network architecture must be replaced by dense layers, which generate an output tensor that has the appropriate number of output values to output the recommended walking direction and other useful information (stairs, zebra crossings ...).
For the monocular depth estimation, a ResNet50 encoder is used together with a decoder that is also connected to the encoder via so-called skip connections in order to generate a depth image. The 3D data set required for this is recorded beforehand with a lidar sensor or stereo camera that is as accurate as possible. To train the respective neural networks on the data set, either PyTorch or TensorFlow is used as the machine learning framework.
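The role of the skip connections is to carry fine spatial detail from each encoder stage to the matching decoder stage, so the predicted depth map keeps the input resolution. A toy NumPy mirror of that layout (pooling and nearest-neighbour upsampling stand in for the trained ResNet50 encoder and decoder stages; fusion by averaging is illustrative):

```python
import numpy as np

def down(x):
    """2x2 average pooling, a stand-in for one encoder stage."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(x):
    """Nearest-neighbour upsampling, a stand-in for one decoder stage."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def encoder_decoder(img, stages=3):
    """Encoder-decoder with skip connections: each decoder stage is fused
    with the matching encoder activation so detail survives the bottleneck."""
    skips, x = [], img
    for _ in range(stages):
        skips.append(x)               # remember the activation for the skip
        x = down(x)
    for skip in reversed(skips):
        x = (up(x) + skip) / 2        # skip connection: fuse encoder features
    return x                          # same resolution as the input

depth_map = encoder_decoder(np.random.rand(8, 8))
print(depth_map.shape)  # (8, 8)
```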
In the embodiment described, the output values are computed on an "AI accelerator" connected to the Raspberry Pi Compute Module via PCI Express; a "Coral Accelerator Module" is used for this. However, a board containing both the tensor processing unit and the main processing unit can also be used. Likewise, an i.MX 8M Plus processor with an integrated neural processing unit (NPU / TPU) can be used as the main processing unit instead of a separate machine learning accelerator.
To further reduce latency while also lowering energy consumption, the position of the pointer on the handle, which indicates the recommended walking direction G, can be updated using information about the rotation of the assistance system, measured with a rotation rate sensor. This exploits the fact that if the recommended walking direction pointed straight ahead and the assistance system has meanwhile been turned 20 degrees to the right, the new recommended walking direction very likely points 20 degrees to the left. Since measuring the rotation rate and integrating it over time takes less time than capturing a new image and evaluating it with the neural network, latency is reduced.
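The dead-reckoning step between camera frames is a simple integration of the gyroscope samples, subtracted from the last network output. A sketch under assumed sign conventions (0 degrees = straight ahead, positive = to the right):

```python
def update_direction(angle_deg, yaw_rates_dps, dt):
    """Dead-reckon the recommended walking direction between camera frames.

    angle_deg     : last direction from the neural network, in degrees
    yaw_rates_dps : gyroscope samples in degrees/second since that frame
    dt            : gyroscope sampling interval in seconds

    Rotating the assistance system to the right makes the previously
    recommended direction appear rotated to the left, hence the subtraction.
    """
    rotation = sum(r * dt for r in yaw_rates_dps)   # integrate the yaw rate
    return angle_deg - rotation

# The example from the text: direction was straight ahead (0 degrees),
# then the system was turned 20 degrees to the right (here: 10 samples
# of 200 deg/s at 10 ms intervals).
print(update_direction(0.0, [200.0] * 10, 0.01))  # -20.0 (20 degrees left)
```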
As an alternative to monocular depth estimation, structure from motion or a "depth map from stereo images" can also be used.
Two depth images obtained one after the other are then subtracted from one another to obtain an image that shows which objects have moved relative to the assistance system. This image can likewise be processed by a neural network to improve the accuracy of the recommended walking direction.
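The depth-difference step above can be sketched in a few lines; the threshold value is an assumption chosen for illustration:

```python
import numpy as np

def motion_map(depth_prev, depth_curr, threshold=0.1):
    """Subtract two consecutive depth images; pixels whose depth changed
    by more than `threshold` (in the depth unit, e.g. metres) are marked
    as having moved relative to the assistance system."""
    diff = depth_curr - depth_prev
    return np.abs(diff) > threshold

prev = np.ones((4, 4))
curr = np.ones((4, 4))
curr[1, 1] = 0.5                     # one object moved closer
print(motion_map(prev, curr).sum())  # 1
```

The resulting boolean mask (or the raw difference image) is what would be fed to the further neural network.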
In addition, LEDs are mounted at various points on the housing; in darkness they are activated by a built-in ambient light sensor to make the assistance system more visible. These LEDs also illuminate the surroundings so that a better camera image can be captured.
Several wide-angle cameras pointing in different directions can also be integrated in order to cover a larger area simultaneously and to detect obstacles located above the assistance system.
- 1 = main processing unit
- 2 = tensor processing unit
- 3 = replaceable battery
- 4a, 4b = camera system
- 5 = motor
- 6 = pointer
- 7 = vibration generator
- 8 = tone generator
- 9 = foldable long cane for the blind
- 10 = holding mechanism
- 11 = ball
- 12a, 12b = carbon tubes
- 13 = stopper
- 14 = stretchable rubber
- 15 = handle
- 16 = rotation rate sensor
- 17 = light source
- 18 = radar sensor
- A = assistance system
- G = recommended walking direction
- H = obstacle
- I = useful information
- P = blind person
- U = environment
Claims (12)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE102020006971.1A DE102020006971A1 (en) | 2020-11-13 | 2020-11-13 | Camera-based assistance system with artificial intelligence for blind people |
Publications (1)
Publication Number | Publication Date |
---|---|
EP4000579A1 true EP4000579A1 (en) | 2022-05-25 |
Family
ID=78617120
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP21000319.0A Withdrawn EP4000579A1 (en) | 2020-11-13 | 2021-11-09 | Camera-based assistance system with artificial intelligence for the blind |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP4000579A1 (en) |
DE (1) | DE102020006971A1 (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101321187B1 (en) | 2013-03-27 | 2013-10-22 | 전북대학교산학협력단 | Walking guide device for a blind man |
US8922759B2 (en) | 2010-09-24 | 2014-12-30 | Mesa Imaging Ag | White cane with integrated electronic travel aid using 3D TOF sensor |
DE102017001476A1 (en) | 2016-02-19 | 2017-08-24 | Alexander Bayer | Assistance system for blind people |
WO2018046964A1 (en) * | 2016-09-12 | 2018-03-15 | Ucl Business Plc | Predicting depth from image data using a statistical model |
WO2018156549A1 (en) * | 2017-02-21 | 2018-08-30 | Brathwaite Haley | Personal navigation system |
CN109645633A (en) * | 2019-01-21 | 2019-04-19 | 珠海格力电器股份有限公司 | A kind of walking stick, air navigation aid, device |
US20190122378A1 (en) * | 2017-04-17 | 2019-04-25 | The United States Of America, As Represented By The Secretary Of The Navy | Apparatuses and methods for machine vision systems including creation of a point cloud model and/or three dimensional model based on multiple images from different perspectives and combination of depth cues from camera motion and defocus with various applications including navigation systems, and pattern matching systems as well as estimating relative blur between images for use in depth from defocus or autofocusing applications |
CN111109786A (en) * | 2019-12-25 | 2020-05-08 | 李嘉伦 | Intelligent obstacle early warning crutch based on deep learning and early warning method thereof |
AU2020101563A4 (en) | 2020-07-29 | 2020-09-10 | Boddu, Raja Sarath Kumar Dr | An artificial intelligence based system to assist blind person |
CN111840016A (en) * | 2020-07-23 | 2020-10-30 | 宁波工程学院 | Flexible and configurable intelligent navigation device for blind people |
Also Published As
Publication number | Publication date |
---|---|
DE102020006971A1 (en) | 2022-05-19 |
Legal Events

- PUAI: Public reference made under article 153(3) EPC to a published international application that has entered the European phase (original code: 0009012)
- STAA: Status: the application has been published
- AK: Designated contracting states. Kind code of ref document: A1. Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
- STAA: Status: the application is deemed to be withdrawn
- 18D: Application deemed to be withdrawn (effective date: 20221126)