FR3138944A1

FR3138944A1 - Device and method for estimating the traversability of terrain by a mobile system

Info

Publication number: FR3138944A1
Application number: FR2208427A
Authority: FR
Inventors: Michel MOUKARI; Ahmed Nasreddinne BENAICHOUCHE
Original assignee: Safran SA
Current assignee: Safran SA
Priority date: 2022-08-22
Filing date: 2022-08-22
Publication date: 2024-02-23

Abstract

The invention relates to a terrain mapping device intended to be carried on board a mobile system, comprising a traversability estimation unit configured to: determine (E1) a depth map of the terrain; determine (E3) a height map of the terrain from the depth map and from image data of the terrain acquired by a camera on board the mobile system; determine (E4) a first mask of traversability of the terrain by the mobile system by thresholding the height map; determine (E5) a second mask of traversability of the terrain by the mobile system by semantic segmentation of the image data of the terrain acquired by the camera on board the mobile system; determine (E6) a map (CT) of traversability of the terrain by the mobile system by merging the first and second traversability masks. Figure for the abstract: Figure 1

Description

Device and method for estimating the traversability of terrain by a mobile system

The field of the invention is that of navigation aids for a mobile system such as a robot or an autonomous vehicle. The invention relates more particularly to estimating the traversability of terrain by the mobile system, that is to say identifying the areas of the terrain over which the mobile system is able to move.

In the field of mobile system navigation, methods are known that aim to detect the presence of a road in images acquired by a camera on board the mobile system. These methods use visual cues such as vanishing points, textures or relief to delineate the contours of a road in an image, or pose the problem directly as one of segmenting the road in the image. However, these methods do not address paths in the broadest sense of the term, which may in particular be off-road paths that are not necessarily paved or clearly delimited, and still less the more general question of traversability, which corresponds to identifying, in the acquired images, the areas of the terrain over which the mobile system is able to move.

There are also methods that identify the type of ground on which the vehicle is travelling. For example, document US 2019/0129435 A1 aims to predict the traversability of terrain for a vehicle by means of sensors on board the vehicle, such as an inertial measurement unit, a GPS and a lidar, in order to identify the topography of the ground surface and generate a 3D map of the terrain. The position of the vehicle relative to this map is determined by means of a simultaneous localization and mapping (SLAM) algorithm. The path to be followed by the vehicle is then determined as a function of terrain characteristics extracted from the sensor data, for example data from an accelerometer or a gyroscope.

The invention aims to provide a technique for estimating the traversability of terrain by a mobile system that is both effective and simple to implement, whether or not there is a road on the terrain (for example in the case of off-road navigation or navigation in an unstructured environment).

To this end, the invention proposes a terrain mapping device intended to be carried on board a mobile system, comprising a traversability estimation unit configured to:

determine a depth map of the terrain;
determine a height map of the terrain from the depth map and from image data of the terrain acquired by a camera on board the mobile system;
determine a first mask of traversability of the terrain by the mobile system by thresholding the height map;
determine a second mask of traversability of the terrain by the mobile system by semantic segmentation of the image data of the terrain acquired by the camera on board the mobile system;
determine a map of traversability of the terrain by the mobile system by merging the first and second traversability masks.

Some preferred but non-limiting aspects of this device are as follows:

the traversability estimation unit is configured to determine the depth map of the terrain from the image data of the terrain acquired by the camera;
to determine the depth map, the traversability estimation unit uses a deep learning model;
the traversability estimation unit comprises a pre-trained neural network to implement the deep learning model;
the neural network is also pre-trained to perform the semantic segmentation of the image data of the terrain;
the neural network comprises an encoder, a first decoder dedicated to depth inference and a second decoder dedicated to semantic segmentation;
the thresholding of the height map is carried out so that the first traversability mask retains, when applied to the image data of the terrain, points of the terrain having a height between a first threshold and a second threshold greater than the first threshold;
the traversability estimation unit is further configured to determine a third traversability mask which retains, when applied to the image data of the terrain, points of the terrain having a height below the first threshold, and determining the traversability map further comprises adding the third traversability mask to the merger of the first and second traversability masks.

Other aspects, aims, advantages and characteristics of the invention will become more apparent on reading the following detailed description of preferred embodiments thereof, given by way of non-limiting example and made with reference to the appended drawings, in which:

- Figure 1 is a diagram illustrating the different steps implemented in a method according to the invention for estimating the traversability of terrain by a mobile system;

- Figure 2 is an image acquired by a camera;

- Figure 3 is a depth map of the image of Figure 2;

- Figure 4 is a height map determined from the depth map of Figure 3 and the image of Figure 2;

- Figure 5 illustrates a first traversability mask;

- Figure 6 illustrates a semantic segmentation of the image of Figure 2;

- Figure 7 is another representation of the first traversability mask;

- Figure 8 illustrates the result of merging the first and second traversability masks.

DETAILED DESCRIPTION OF PARTICULAR EMBODIMENTS

The invention relates to a terrain mapping device intended to be carried on board a mobile system, for example an all-terrain land mobile system such as a robot, a drone or an autonomous vehicle. As will be detailed below, this device comprises a traversability estimation unit configured to fuse a geometric navigation solution, which takes into account the height of the obstacles encountered on the terrain, with a semantic navigation solution, which identifies the areas traversable by the mobile system in image data of the terrain acquired by a camera on board the mobile system, typically a monocular camera. The traversability estimation unit outputs a traversability map, for example a binary map in which each point of the terrain imaged by the camera is identified as traversable or not by the mobile system, or a map in which a probability of traversability is associated with each point of the terrain.

The images acquired by the camera are typically RGB images of the terrain. The device then operates in visible light. In a variant embodiment, the device is extended to night-time operation by using another wavelength range (typically infrared).

With reference to Figure 1, the traversability estimation unit is configured to determine a depth map of the terrain during a step E1.

In a preferred embodiment, the depth map of the terrain is determined during this step E1 directly from the image data It of the terrain acquired by the camera. More particularly, this determination can be obtained by means of a supervised learning model, such as a model based on a deep neural network.

Figures 2 to 8 are given as an illustrative example of the invention. Figure 2 thus shows an image from the public NYU Depth v2 dataset, while Figure 3 shows the corresponding depth map.

In a variant embodiment, the depth map can be determined by the traversability estimation unit by direct measurement using a dedicated emissive sensor (such as a LiDAR or a structured-light 3D scanner). In another variant, the determination is still carried out passively, but using several cameras rather than one, for example the cameras of a stereovision system. It should be noted that using a single monocular camera has the advantages of low power consumption for an on-board system, of reduced cost compared with a stereovision system or a LiDAR-type system, and of allowing a non-emissive solution.

The depth map thus determined supplements the information on the terrain obtained by the camera and thereby provides knowledge of the 3D structure of the terrain. Assuming that the mobile system moves on the ground, the traversable areas of the terrain then correspond to the ground plane, as opposed to the obstacles encountered.

One of the common approaches in mobile robotics for addressing the navigability problem of a system is to define an occupancy grid of the robot's environment. The basic idea of the occupancy grid is to represent a map of the environment as a uniformly spaced field of random variables, each representing the probability that an obstacle is present at that location in the environment. Starting from this principle, the invention treats the image acquired by the camera as a map of the environment in which each pixel corresponds to a point of the grid. Knowing the 3D structure of the terrain, the invention associates with each point of the grid the height of the element present at that point in order to decide whether or not this element is an obstacle.

To do this, the traversability estimation unit is configured to determine, during a step E3, a height map of the terrain from the depth map and the image data of the terrain. Together with the image data of the terrain, the depth map makes it possible to compute the 3D coordinates of each point of the terrain; these coordinates can then be projected onto the axis corresponding to the height of the points in order to compute the height map. Determining the height map may in particular comprise an inverse projective transformation based on the projective model of the camera, for example in the form of a linear matrix operation. Figure 4 illustrates the height map determined from the depth map of Figure 3 and the image of Figure 2.

This determination of the height map can be carried out as follows. Consider the coordinates of a pixel in the original image, which exhibits geometric distortions. In order to determine the elevation (height) of this pixel, it is necessary to know the intrinsic parameters of the camera, namely the projection matrix of the camera (focal length and optical centre) and the distortion parameters. These parameters are conventionally determined by calibrating the camera. A first step aims to correct the distortion of the pixel by applying the distortion model determined during calibration. For example, in the case of a radial distortion model, the distortion correction is carried out by an equation that involves the centre of the distortion.
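As a minimal sketch of this first step, the snippet below corrects a single pixel with a polynomial radial model. The exact equation is not reproduced in this text, so the form of the model, the function name `undistort_pixel` and the coefficients `k1` and `k2` are assumptions chosen for illustration; only the use of a distortion centre comes from the description.

```python
def undistort_pixel(u, v, center, k1, k2=0.0):
    """Correct one pixel with a polynomial radial model (assumed form).

    center is the (cu, cv) centre of the distortion; k1 and k2 are
    hypothetical radial coefficients obtained from camera calibration.
    """
    cu, cv = center
    du, dv = u - cu, v - cv
    r2 = du * du + dv * dv                 # squared distance to the centre
    scale = 1.0 + k1 * r2 + k2 * r2 * r2   # radial scaling factor
    return cu + du * scale, cv + dv * scale
```

With all coefficients at zero the pixel is returned unchanged, and a pixel located at the distortion centre is never moved, which is the behaviour expected of any radial model.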

A second step consists in determining the 3D position of each pixel in 3D space by applying the inverse projection and using the estimated depth. This computation relates the following quantities:

the position of the 3D point in the camera frame,

the camera parameters (focal length and optical centre), and

the estimated depth.
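The inverse projection can be sketched as follows for a pinhole camera without skew. The names `backproject`, `fx`, `fy` (focal lengths in pixels) and `cx`, `cy` (optical centre) are illustrative assumptions, and the depth is assumed to be measured along the optical axis of the camera.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Return an (H, W, 3) array of camera-frame points from an (H, W) depth map.

    Assumes an undistorted pinhole model: x to the right, y downward,
    z along the optical axis (the depth).
    """
    depth = np.asarray(depth, dtype=float)
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]        # row (v) and column (u) index of each pixel
    x = (u - cx) * depth / fx        # inverse of u = fx * x / z + cx
    y = (v - cy) * depth / fy        # inverse of v = fy * y / z + cy
    return np.stack([x, y, depth], axis=-1)
```

Because the operation is the same for every pixel, it is applied to the whole depth map at once rather than pixel by pixel.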

Once the point has been determined in the camera frame, its height relative to the ground can be determined, because the position of the camera in the vehicle is known a priori (distance between the camera and the wheels of the vehicle).
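A minimal sketch of this last step is given below. It assumes that the camera y axis points downward, that the optical axis is parallel to flat ground, and that `camera_height` is the known mounting height of the camera above the point where the wheels touch the ground; the function name `height_map` is hypothetical.

```python
import numpy as np

def height_map(points_cam, camera_height):
    """Height above the ground of each point of an (H, W, 3) array.

    Assumes y points downward and the camera is level, so a point lying
    on the ground has y == camera_height and therefore a height of zero.
    """
    points_cam = np.asarray(points_cam, dtype=float)
    return camera_height - points_cam[..., 1]
```

When the camera is not level, the points would first have to be rotated into a gravity-aligned frame before this subtraction is meaningful.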

In a possible embodiment in which the mobile system carries an inertial measurement unit, the traversability estimation unit can, in a step E2 prior to step E3, adapt the projective model to the pose of the camera as estimated by the inertial measurement unit.

As indicated above, the height to be estimated is relative to the ground and is not expressed in the camera frame. The pose of the camera must therefore be taken into account. To do so, it suffices to take flat ground as a reference and to estimate, from the inertial data, the position of the camera at each image capture relative to an initial position. This amounts to estimating the relative poses of the camera with respect to an initial pose.

In particular, this adaptation makes it possible to normalize the view of the terrain so as to eliminate biases related to the orientation of the vehicle relative to the ground. A change of frame, from the camera frame to the frame of the imaged terrain, can in particular be carried out using the various angles provided by the inertial measurement unit.
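The change of frame can be sketched as a rotation built from two of the angles reported by the inertial measurement unit. The axis convention (x right, y down, z forward), the order of the rotations, the sign of the angles and the names `rotation_from_pitch_roll` and `to_level_frame` are all assumptions; a real system would follow the conventions of its own sensor.

```python
import numpy as np

def rotation_from_pitch_roll(pitch, roll):
    """Rotation matrix from a pitch about x and a roll about z (radians)."""
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    rz = np.array([[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]])
    return rz @ rx

def to_level_frame(points_cam, rotation):
    """Rotate an (..., 3) array of camera-frame points into the level frame."""
    # Row-vector form of p_level = rotation @ p_cam, applied to every point.
    return np.asarray(points_cam, dtype=float) @ rotation.T
```

The heights are then read along the vertical axis of the rotated points instead of the raw camera axis.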

In a step E4, the traversability estimation unit is configured to determine a first mask of traversability of the terrain by the mobile system by thresholding the height map. This thresholding makes it possible to determine, for each pixel of the height map, whether or not the point of the terrain associated with this pixel is traversable by the mobile system, that is to say whether it is part of the ground or whether, on the contrary, its height indicates that it is an obstacle. The thresholding is thus set according to the characteristics of the mobile system, in particular the dimensions of the obstacles it has to avoid.

In one possible embodiment, the first traversability mask retains, when applied to the image data of the terrain, the points of the terrain whose height is below a threshold height. Figure 5 illustrates in this respect the thresholding of the height map of Figure 4 to determine the first traversability mask, which retains, when applied to the image data of the terrain, the points of the terrain associated with the areas shown in white.
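This single-threshold embodiment amounts to one comparison per pixel, as in the sketch below. The function name `first_mask` and the 0.10 m value used in the usage note are illustrative assumptions; the description only says that the threshold depends on the mobile system.

```python
import numpy as np

def first_mask(height, max_height):
    """Boolean mask that is True where the height is below the threshold."""
    height = np.asarray(height, dtype=float)
    return height < max_height
```

For example, `first_mask(height, 0.10)` would keep every pixel whose estimated height is under 10 cm, a value that a small wheeled robot might tolerate.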

In another possible embodiment, the first traversability mask is determined so that it retains, when applied to the image data of the terrain, the points of the terrain whose height lies between a first threshold and a second threshold greater than the first threshold. In this other embodiment, a further traversability mask, referred to below as the third traversability mask, can be determined which retains, when applied to the image data of the terrain, the points of the terrain whose height is below the first threshold.
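The two-threshold embodiment can be sketched as follows. The function name `band_masks`, the parameter names `low` and `high`, and the choice of inclusive bounds for the band are assumptions; the description does not state how points exactly at a threshold are handled.

```python
import numpy as np

def band_masks(height, low, high):
    """Return (first, third) boolean masks for thresholds low < high.

    first keeps heights within [low, high]; third keeps heights below low.
    """
    if not low < high:
        raise ValueError("the second threshold must exceed the first")
    height = np.asarray(height, dtype=float)
    first = (height >= low) & (height <= high)
    third = height < low
    return first, third
```

The two masks never overlap, so each pixel is either clearly low, in the intermediate band, or above the second threshold and kept by neither mask.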

The traversability estimation unit is also configured to determine, during a step E5, a second mask of traversability of the terrain by the mobile system by semantic segmentation of the image data of the terrain acquired by the camera on board the mobile system. Figure 6 illustrates in this respect the result of a semantic segmentation of the image of Figure 2, in which the areas of the terrain assigned by the semantic segmentation to the category "traversable by the mobile system" bear the reference Tsm. This semantic segmentation can be obtained by means of a supervised learning model, such as a model based on a deep neural network. In one possible embodiment, the segmentation is binary, the pixels of the image being segmented into two distinct classes: traversable by the mobile system and non-traversable by the mobile system.
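When the segmentation model outputs a map of class labels rather than a binary result, the second mask can be derived from it as sketched below. The function name `second_mask` and the class identifiers in the usage note are hypothetical; the description does not specify the classes of the network.

```python
import numpy as np

def second_mask(labels, traversable_ids):
    """Boolean mask that is True where the label is a traversable class.

    labels is an (H, W) integer array produced by a segmentation model;
    traversable_ids lists the class identifiers treated as traversable.
    """
    return np.isin(np.asarray(labels), list(traversable_ids))
```

For example, with hypothetical identifiers 1 for floor and 4 for grass, `second_mask(labels, {1, 4})` marks those two classes as traversable and everything else as non-traversable.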

The geometric information represented by the first traversability mask and the semantic information represented by the second mask prove to be complementary.

On the one hand, objects that are small, or that have a part whose height falls within the height range allowed for determining the first traversability mask, are not treated as obstacles when only the geometric information is used. By coupling the geometric information with semantic information indicating that these objects or parts of objects are not traversable, however, it becomes possible to correctly identify them as obstacles.

On the other hand, semantic segmentation alone cannot solve the traversability estimation problem, since it is based solely on the appearance of the image. Traversable areas form a potentially complex class, in particular because of their varied textures (earth, grass and asphalt are three potentially traversable ground types, yet they differ greatly in appearance and texture). Moreover, being decoupled from any geometric notion, semantic segmentation can produce aberrant results, such as “traversable” pixels at improbable locations whose homogeneous texture is locally similar to that of the ground (for example the section of wall between the bed and the desk, identified in the zone bearing the reference ZA in the figure). By coupling the semantic information with geometric information indicating that these locations are not traversable, however, it becomes possible to correctly identify them as obstacles.

In accordance with the invention, the traversability estimation unit is advantageously configured to determine, during a step E6, a traversability map CT by fusing the first and second traversability masks. In this way, only the pixels of the environment grid that are retained by both the first and the second traversability mask are ultimately considered traversable by the mobile system. This fusion typically comprises applying a logical AND between the first and second traversability masks, and thus retains only those pixels that are geometrically viable and whose semantics are confirmed.
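The logical AND fusion of step E6 can be sketched as follows. This is a minimal illustration that assumes both masks are boolean arrays defined on the same pixel grid; the function name `fuse_masks` and the sample masks are hypothetical.

```python
import numpy as np

def fuse_masks(m1, m2) -> np.ndarray:
    """Return the traversability map: True only where both masks are True."""
    m1 = np.asarray(m1, dtype=bool)
    m2 = np.asarray(m2, dtype=bool)
    if m1.shape != m2.shape:
        raise ValueError("masks must be defined on the same pixel grid")
    return np.logical_and(m1, m2)

# Hypothetical masks: geometric (height thresholding) and semantic (segmentation).
m1_geometric = [[1, 1, 0],
                [1, 0, 0]]
m2_semantic = [[1, 0, 0],
               [1, 1, 0]]
ct = fuse_masks(m1_geometric, m2_semantic)
```

A pixel kept by only one of the two masks, such as the second pixel of the first row, is discarded by the fusion.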

This resolves the problem of small obstacles falling within the range of values used to determine the first traversability mask, while also removing the aberrant pixels whose predicted semantic class is “traversable” at improbable locations in the image. In this regard, one figure illustrates the first traversability mask M1, while another illustrates the fusion of that first mask with the second mask to provide the traversability map CT. The traversability estimate is visibly refined. In particular, the legs of the chair are outlined and are no longer considered traversable pixels, whereas they were before the fusion. Furthermore, no pixel located too high in the image appears in the final estimate (see the section of wall in zone ZA of the figure).

In the previously mentioned embodiment, where the first traversability mask is determined so as to retain, when applied to the terrain image data, the terrain points whose height lies between the first and second thresholds, and where a third traversability mask is determined so as to retain, when applied to the terrain image data, the terrain points whose height is below the first threshold, only the first mask is fused with the second mask, and the third mask is then added to the result of this fusion. In such an embodiment, all pixels below the first height threshold (those of the third mask) are considered traversable, whereas over the rest of the image the pixels considered traversable are those resulting from the fusion of the first mask (geometric approach) and the second mask (semantic approach).
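The three-mask variant can be sketched as below. This is an illustrative reading only: it assumes a height map and a semantic mask on the same grid, treats the bounds of the first mask as inclusive, and interprets “adding” the third mask as a logical OR. The function name, the threshold values and the sample data are hypothetical.

```python
import numpy as np

def traversability_map(height, semantic, t_low: float, t_high: float) -> np.ndarray:
    """Combine height thresholding with a semantic mask.

    m1: heights between the two thresholds (inclusive bounds are an assumption).
    m3: heights strictly below the first threshold, always kept.
    Result: (m1 AND semantic) OR m3.
    """
    height = np.asarray(height, dtype=float)
    semantic = np.asarray(semantic, dtype=bool)
    m1 = (height >= t_low) & (height <= t_high)
    m3 = height < t_low
    return (m1 & semantic) | m3

# Hypothetical heights in metres and a hypothetical semantic mask.
height = [[0.01, 0.05, 0.30],
          [0.08, 0.02, 0.12]]
semantic = [[False, True, True],
            [False, False, True]]
ct3 = traversability_map(height, semantic, t_low=0.03, t_high=0.15)
```

In this example the pixels at 0.01 m and 0.02 m are kept although the semantic mask rejects them, because they lie below the first threshold, while the pixel at 0.30 m is rejected although the semantic mask accepts it.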

As seen above, the traversability estimation unit can use a deep learning model to determine the depth map.

This deep learning model can use a neural network with an encoder-decoder architecture. The encoder extracts features, which are then supplied to the decoder to generate the depth maps. The feature extractor comprises successive convolution layers followed by non-linearities such as data normalization functions, dimension reduction functions, or non-linear reprojection functions such as, among others, the sigmoid or the rectified linear unit (convolutional network). The reduced-dimension features extracted from the image are then delivered to the decoder (itself typically a convolutional network), which recovers the spatial dimension of the image while computing the features needed for decoding, thus producing a depth map.

Training of the deep learning model can be carried out as follows:

Acquisition of a training database comprising images of scenes together with their corresponding depth maps. These ground-truth depth maps are obtained by a third-party method, for example a depth map acquired by LIDAR, by structured light, by stereovision, etc.
Calculation of a value representative of the performance of the learning model by comparing the depth map of the scene output by the neural network with the ground-truth depth map.
The weights of the connections of the neural network are then adjusted so as to reduce the prediction error of the depth map. For example, the gradient of the error can be calculated to determine a direction of variation, and a step is then taken in the direction opposite to the gradient.
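The three training steps above can be illustrated with a deliberately simplified sketch. Everything specific here is an assumption made for illustration: the “network” is reduced to a two-parameter per-pixel model (depth = a * intensity + b), the error is a mean squared error, the data are synthetic, and the learning rate and step count are arbitrary. It is not the model described in the source, only an instance of gradient descent against ground-truth depth maps.

```python
import numpy as np

def mse(pred: np.ndarray, target: np.ndarray) -> float:
    """Performance value: mean squared error between predicted and true depth."""
    return float(np.mean((pred - target) ** 2))

def train_step(params, images, depth_gt, lr=0.1):
    """One gradient-descent update of the toy model depth = a * image + b."""
    a, b = params
    err = (a * images + b) - depth_gt
    grad_a = float(np.mean(2.0 * err * images))  # d(MSE)/da
    grad_b = float(np.mean(2.0 * err))           # d(MSE)/db
    # Move in the direction opposite to the gradient.
    return (a - lr * grad_a, b - lr * grad_b)

# Step 1: synthetic training set (images and ground-truth depth maps).
rng = np.random.default_rng(0)
images = rng.random((8, 4, 4))
depth_gt = 2.0 * images + 1.0

# Steps 2 and 3: evaluate the error, then adjust the parameters repeatedly.
params = (0.0, 0.0)
loss_before = mse(params[0] * images + params[1], depth_gt)
for _ in range(200):
    params = train_step(params, images, depth_gt)
loss_after = mse(params[0] * images + params[1], depth_gt)
```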

Thus, in one possible embodiment, the traversability estimation unit may comprise a neural network pre-trained to implement the deep learning model. In a possible variant, this neural network may also be pre-trained to perform the semantic segmentation of the terrain image data. In particular, the neural network may comprise an encoder and, at the output of the encoder, a first decoder dedicated to determining the depth map and a second decoder dedicated to the semantic segmentation.
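The shared-encoder, two-decoder layout can be sketched structurally as follows. This toy example replaces the convolutional encoder and decoders with per-pixel linear maps and uses random, untrained weights, so its outputs are meaningless as predictions; it only shows one set of features feeding two task-specific heads. The class name, dimensions and the zero threshold on the segmentation score are assumptions.

```python
import numpy as np

def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)

class TwoHeadNetwork:
    """Toy shared encoder with a depth head and a segmentation head."""

    def __init__(self, in_dim: int, feat_dim: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.w_enc = rng.normal(size=(in_dim, feat_dim))   # shared encoder
        self.w_depth = rng.normal(size=(feat_dim, 1))      # first decoder
        self.w_seg = rng.normal(size=(feat_dim, 1))        # second decoder

    def forward(self, pixels: np.ndarray):
        """pixels: array of shape (H, W, in_dim). Returns (depth, traversable)."""
        feats = relu(pixels @ self.w_enc)                  # computed once, used twice
        depth = relu(feats @ self.w_depth)[..., 0]         # non-negative depth map
        seg_score = (feats @ self.w_seg)[..., 0]
        traversable = seg_score > 0.0                      # binary semantic mask
        return depth, traversable

pixels = np.random.default_rng(1).random((4, 5, 3))        # hypothetical RGB image
net = TwoHeadNetwork(in_dim=3, feat_dim=8)
depth, traversable = net.forward(pixels)
```

Sharing the encoder means the image features are computed once for both outputs, which is the practical interest of the variant described above.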

The invention is not limited to the device described above but also extends to a method in which the steps described above are implemented by computer, as well as to a computer program product comprising instructions which, when the program is executed by a computer, cause the computer to implement the steps of such a method.

Claims

Terrain mapping device intended to be embedded on a mobile system, comprising a traversability estimation unit configured to:

determine (E1) a depth map of the terrain;
determine (E3) a height map of the terrain from the depth map and image data of the terrain acquired by a camera on board the mobile system;
determine (E4) a first mask of the traversability of the terrain by the mobile system by means of thresholding of the height map;
determine (E5) a second mask of the traversability of the terrain by the mobile system by means of a semantic segmentation of the terrain image data acquired by the camera on board the mobile system;
determine (E6) a map (CT) of the traversability of the terrain by the mobile system by fusing the first and second traversability masks.

Device according to claim 1, wherein the traversability estimation unit is configured to determine the terrain depth map from the terrain image data acquired by the camera.

Device according to claim 2, in which to determine the depth map the traversability estimation unit uses a deep learning model.

Device according to claim 3, wherein the traversability estimation unit comprises a neural network pre-trained to implement the deep learning model.

Device according to claim 4, wherein the neural network is also pre-trained to perform semantic segmentation of the terrain image data.

Device according to claim 5, in which the neural network comprises an encoder, a first decoder dedicated to depth inference and a second decoder dedicated to semantic segmentation.

Device according to one of claims 1 to 6, in which the thresholding of the height map is carried out in such a way that the first traversability mask retains, when applied to the terrain image data, points of the terrain having a height between a first threshold and a second threshold greater than the first threshold.

Device according to claim 7, wherein the traversability estimation unit is further configured to determine a third traversability mask which retains, when applied to the terrain image data, terrain points having a height lower than the first threshold, and in which determining the traversability map further comprises adding the third traversability mask to the fusion of the first and second traversability masks.

Computer-implemented method for estimating the traversability of terrain by a mobile system, comprising the steps of:

determining a depth map of the terrain;
determining a height map of the terrain from the depth map and from image data of the terrain acquired by a camera on board the mobile system;
determining a first mask of the traversability of the terrain by the mobile system by means of thresholding of the height map;
determining a second mask of the traversability of the terrain by the mobile system by means of a semantic segmentation of the terrain image data acquired by the camera on board the mobile system;
determining a map of the traversability of the terrain by the mobile system by fusing the first and second traversability masks.

Computer program product comprising instructions which, when the program is executed by a computer, lead it to implement the steps of the method according to claim 9.