EP3555802A1 - Object recognition system based on an adaptive 3d generic model - Google Patents

Object recognition system based on an adaptive 3d generic model

Info

Publication number
EP3555802A1
Authority
EP
European Patent Office
Prior art keywords
objects
generic
model
images
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP17811644.8A
Other languages
German (de)
French (fr)
Inventor
Loïc LECERF
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Marelli Europe SpA
Original Assignee
Magneti Marelli SpA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Magneti Marelli SpA
Publication of EP3555802A1
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/28 Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/04 Texture mapping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20 Finite element generation, e.g. wire-frame surface description, tesselation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/772 Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries

Definitions

  • the invention relates to mobile object recognition systems, including systems based on machine learning.
  • the isolation and tracking of a moving object in a sequence of images can be performed by relatively unsophisticated generic algorithms based, for example, on background subtraction.
  • it is more difficult to classify the objects thus isolated into the categories one wishes to detect, that is to say, to recognize whether the object is a person, a car, a bicycle, an animal, etc.
  • the objects can have a great variety of morphologies in the images of the sequence (position, size, orientation, distortion, texture, configuration of possible appendages and articulated elements, etc.).
  • the morphologies also depend on the viewing angle and the lens of the camera filming the scene to be monitored. Sometimes one also wishes to recognize subclasses (car model, gender of a person).
  • to classify and detect objects, a machine learning system is generally used. The classification is then based on a knowledge base or a data set developed by learning.
  • An initial data set is generally generated during a so-called supervised learning phase, where an operator views sequences of images produced in context and manually annotates the image areas corresponding to the objects to be recognized. This phase is generally long and tedious, because one seeks ideally to capture all the possible variants of the morphology of the objects of the class, at least enough variants to obtain a satisfactory recognition rate.
  • a characteristic of these techniques is that they generate many synthesized images that, although they conform to the parameters and constraints of the 3D models, have improbable morphologies. This clutters the dataset with unnecessary images and can slow recognition.
  • a method of automatically configuring a recognition system for a class of objects of variable morphology, comprising the steps of: providing a machine learning system with an initial data set sufficient to recognize instances of objects of the class in a sequence of images of a target scene; providing a generic three-dimensional model specific to the class of objects, the morphology of which can be defined by a set of parameters; acquiring a sequence of images of the scene using a camera; recognizing image instances of objects of the class in the acquired image sequence using the initial data set; conforming the generic three-dimensional model to recognized image instances; recording ranges of variation of the parameters resulting from the conformations of the generic model; synthesizing multiple three-dimensional objects from the generic model by varying the parameters within the recorded ranges of variation; and completing the data set of the learning system with projections of the synthesized objects in the plane of the images.
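The core of the loop above, recording per-parameter ranges of variation from conformed models and synthesizing new objects only within those ranges, can be sketched as follows; the function names, the dict representation of model parameters, and the uniform sampling are illustrative assumptions, not the patent's implementation:

```python
import random

def record_ranges(conformed_params):
    """For each parameter, record the range of variation observed across
    all conformations of the generic model (one dict per recognized instance)."""
    keys = conformed_params[0].keys()
    return {k: (min(p[k] for p in conformed_params),
                max(p[k] for p in conformed_params)) for k in keys}

def synthesize(ranges, n, rng=random):
    """Synthesize n parameter sets by varying each parameter only inside
    its recorded range, avoiding improbable morphologies."""
    return [{k: rng.uniform(lo, hi) for k, (lo, hi) in ranges.items()}
            for _ in range(n)]

# Example: lengths and widths (metres) measured from three conformed models.
models = [{"length": 3.9, "width": 1.7},
          {"length": 4.6, "width": 1.8},
          {"length": 4.2, "width": 1.75}]
ranges = record_ranges(models)
synthetic = synthesize(ranges, 100)
```

Projections of such synthesized parameter sets into the image plane would then be added, self-annotated, to the data set.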
  • the method may comprise the following steps: defining parameters of the generic three-dimensional model by the relative positions of landmarks of a mesh of the model, the positions of the other nodes of the mesh being bound to the landmarks by constraints; and performing conformations of the generic three-dimensional model by positioning landmarks of a projection of the model in the plane of the images.
  • the method may further include the steps of: recording textures from areas of the recognized image instances; and mapping onto each synthesized object a texture among the recorded textures.
  • the initial data set of the learning system can be obtained by supervised learning involving at least two objects of the class whose morphologies are at opposite ends of a domain of observed variation of the morphologies.
  • Figure 1 shows a schematic three-dimensional generic model of an object, projected in different positions of a scene seen by a camera
  • FIG. 2 schematically illustrates a configuration phase of a machine learning system for recognizing objects according to the generic model of FIG. 1.
  • Figure 1 illustrates a simplified generic model of an object, for example a car, projected onto an image in different positions of an example of a scene monitored by a fixed camera.
  • the scene here is, for simplicity's sake, a street crossing the field of view of the camera horizontally.
  • in the background, the model is projected in three aligned positions: in the center and near the left and right edges of the image. In the foreground, the model is projected in a position slightly to the left. All these projections come from the same model in terms of dimensions and show the morphological variations of the projections in the image as a function of position in the scene. In a more complex scene, for example a curved street, variations of morphology depending on the orientation of the model would also be seen.
  • the variations of morphology as a function of the position are defined by the projection of the plane on which the objects evolve, here the street.
  • the projection of the plane of evolution is defined by equations that depend on the characteristics of the camera (angle of view, focal length and distortion of the lens). Edges perpendicular to the camera axis change size homothetically as a function of distance from the camera, and edges parallel to the camera axis follow vanishing lines.
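The homothetic size change of an edge perpendicular to the camera axis follows directly from the pinhole projection model; a minimal sketch, where the focal length and object sizes are arbitrary illustrative values:

```python
def projected_size(real_size_m, distance_m, focal_px):
    """Pinhole model: an edge perpendicular to the camera axis projects
    to a pixel size inversely proportional to its distance from the
    camera (homothetic scaling)."""
    return focal_px * real_size_m / distance_m

h10 = projected_size(1.5, 10.0, 800.0)  # a 1.5 m high car at 10 m
h20 = projected_size(1.5, 20.0, 800.0)  # the same car at 20 m
```

Doubling the distance halves the projected size; the lens distortion mentioned above would add a further, non-linear correction.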
  • the projections of the same object in the image at different positions or orientations have a variable morphology, even if the real object has a fixed morphology.
  • real objects can also have a variable morphology, whether from one object to another (between two cars of different models), or during the movement of the same object (pedestrian).
  • Learning systems are well suited to this situation when they have been configured with enough data to represent the range of most likely projected morphologies.
  • the envisaged three-dimensional generic model, for example of the Point Distribution Model (PDM) type, may comprise a mesh of nodes linked to each other by constraints, that is to say parameters that establish the relative displacements between adjacent nodes or the deformations of the mesh caused by the displacements of certain nodes, known as landmarks.
  • the landmarks are chosen so that their displacements make it possible to reach all the desired morphologies of the model within the defined constraints.
  • a simplified generic car model may include, for the bodywork, a mesh of 16 nodes defining 10 rectangular surfaces and having 10 landmarks.
  • eight landmarks K0 to K7 define one of the side faces of the car, and the two remaining landmarks K8 and K9, on the other side face, define the width of the car.
  • a single landmark would be enough to define the width of the car, but the presence of two or more landmarks makes it possible to conform the model to a projection of a real object while taking into account the deformations of the projection.
  • the wheels are a characteristic element of a car and can be assigned a specific set of landmarks, not shown, defining the wheelbase, the diameter and the points of contact with the road.
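A minimal data structure for such a landmark-driven mesh might look as follows; the coordinates and the mirror constraint are hypothetical simplifications (the patent's two landmarks K8 and K9 on the far side face correspond here to two of the mirrored nodes):

```python
# Side-face landmarks K0..K7 of a box-like car profile, as (x, y, z)
# in metres, with the near side face lying in the plane y = 0.
side = {
    "K0": (0.0, 0.0, 0.0), "K1": (0.0, 0.0, 0.7), "K2": (1.2, 0.0, 0.9),
    "K3": (1.8, 0.0, 1.4), "K4": (3.3, 0.0, 1.4), "K5": (4.1, 0.0, 0.9),
    "K6": (4.3, 0.0, 0.7), "K7": (4.3, 0.0, 0.0),
}

def build_mesh(side_landmarks, width):
    """Derive the full 16-node mesh: the non-landmark nodes of the far
    side face are bound to the landmarks by a mirror constraint, i.e. a
    translation by the car's width along y."""
    mesh = dict(side_landmarks)
    for name, (x, y, z) in side_landmarks.items():
        mesh[name + "'"] = (x, y + width, z)
    return mesh

mesh = build_mesh(side, width=1.75)
```

Moving a single landmark (say K6, toward the rear) then deforms the model while the constrained nodes follow.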
  • the illustrated generic 3D model is simplistic, to clarify the presentation.
  • in practice, the model will include a finer mesh to define edges and curved surfaces.
  • FIG. 2 schematically illustrates a configuration phase of a machine learning system for recognizing cars according to the generic model of FIG. 1, by way of example.
  • the learning system comprises a data set 10 associated with a camera 12 installed for filming a scene to be monitored, for example that of FIG. 1.
  • the configuration phase can start from an existing dataset, which may be rudimentary and offer only a low recognition rate.
  • this existing dataset may have been produced by quick and uncomplicated supervised learning. The following steps are used to complete the dataset to achieve a satisfactory recognition rate.
  • the recognition system is started and begins recognizing and tracking cars in successive images captured by the camera.
  • an image instance of each recognized car is extracted at 14.
  • the camera typically produces multiple images that each contain an instance of the same car at different positions. One can choose the largest instance, which will have the best resolution for subsequent operations.
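Selecting the best instance of a tracked car can be as simple as keeping the largest bounding box; a sketch, with boxes given as hypothetical (x0, y0, x1, y1) pixel coordinates:

```python
def largest_instance(instances):
    """Among the image instances of one tracked car, keep the one with
    the largest bounding-box area: it offers the best resolution for
    the subsequent conformation and texture sampling."""
    return max(instances, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]))

# The same car tracked over three frames.
track = [(10, 40, 50, 60), (100, 30, 180, 80), (200, 45, 230, 62)]
best = largest_instance(track)
```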
  • the generic 3D model is conformed to each image instance thus extracted.
  • this can be done by a conventional conformation ("fitting") algorithm which seeks, for example, the best matches between the image and the landmarks of the model as projected in the plane of the image.
  • algorithms based on landmark detection can be used, as described, for example, in ["One Millisecond Face Alignment with an Ensemble of Regression Trees", Vahid Kazemi et al., IEEE CVPR 2014].
  • it is preferable that other faces of the cars be visible in the instances so that the model can be defined completely.
  • the conformation operations produce 3D models that are meant to be at the scale of the real objects.
  • the conformation operations can use the equations of the projection of the plane of evolution of the object. These equations can be determined manually from the camera characteristics and the configuration of the scene, or estimated by the system in a calibration phase using, if necessary, adapted tools such as depth cameras. Knowing that objects evolve on a plane, the equations can be deduced from the variation of the size according to the position of the instances of a tracked object.
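As an illustration of the last point, if one assumes (as a simplification) that the apparent size of instances grows linearly with the image row of their base, the law can be fitted from a tracked object by closed-form least squares; the data below are synthetic:

```python
def fit_line(xs, ys):
    """Closed-form least squares for s = a*x + b, used here to estimate
    how apparent size varies with image row for objects moving on the
    ground plane."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

# Image row of each instance's base vs its pixel width, one tracked car.
rows = [120, 200, 280, 360]
sizes = [30.0, 50.0, 70.0, 90.0]
a, b = fit_line(rows, sizes)  # size = a * row + b
```

A real calibration would use the full projection equations rather than a linear fit, but the principle of deducing them from tracked instances is the same.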
  • the result is a 3D model 16 representing the recognized car at scale.
  • the models are illustrated in two dimensions, in correspondence with the extracted lateral faces 14. (Note that the generic 3D model used is conformable to cars as well as vans or even buses, so the system here is rather intended to recognize any four-wheeled vehicle.)
  • during the conformation step, it is also possible to sample the image zones corresponding to the different faces of the car, and to store these image zones in the form of textures at 18. After a certain acquisition period, the system will have collected, without supervision, a multitude of 3D models representing different cars, as well as their textures. If the recognition rate offered by the initial dataset is low, it suffices to extend the acquisition time to reach a collection with a satisfactory number of models.
  • the models are compared with each other at 20 and a range of variation is established for each landmark.
  • an example of a range of variation for landmark K6 is illustrated.
  • the ranges of variation can define relative variations affecting the shape of the model itself, or absolute variations such as the position and orientation of the model.
  • one of the landmarks, for example K0, can serve as an absolute reference. It can be assigned ranges of absolute variation that determine the possible positions and orientations of the car in the image. These ranges are in fact not directly deducible from the recorded models, since a recorded model may come from a single instance chosen among a multitude of instances produced during the movement of a car. The variations of position and orientation can be estimated by deducing them from the multiple instances of a tracked car, without having to carry out a complete conformation of the generic model on each instance.
  • for the landmark diametrically opposed to K0, a range of variation relative to landmark K0 can be established, which determines the length of the car.
  • for landmark K8, a range of variation relative to landmark K0 can be established, which determines the width of the car.
  • the range of variation of each of the other landmarks can be established relative to one of its adjacent landmarks.
  • the variations of the landmarks can be random, incremental, or a combination of both.
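The two variation strategies can be sketched as follows; the grid step count and the jitter amplitude are arbitrary illustrative choices:

```python
import itertools
import random

def incremental(ranges, steps):
    """Sweep each landmark parameter through evenly spaced values inside
    its recorded range (incremental variation)."""
    axes = {k: [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
            for k, (lo, hi) in ranges.items()}
    keys = sorted(axes)
    return [dict(zip(keys, combo))
            for combo in itertools.product(*(axes[k] for k in keys))]

def jitter(params, ranges, rel=0.05, rng=random):
    """Add a small random perturbation, clamped to the recorded range
    (random variation, combinable with the grid above)."""
    out = {}
    for k, v in params.items():
        lo, hi = ranges[k]
        out[k] = min(hi, max(lo, v + rng.uniform(-rel, rel) * (hi - lo)))
    return out

ranges = {"length": (3.9, 4.6), "width": (1.7, 1.8)}
grid = incremental(ranges, steps=3)         # 3 x 3 = 9 combinations
varied = [jitter(p, ranges) for p in grid]  # combined strategy
```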
  • each synthesized car is projected in the camera image plane to form a self-annotated image instance for completing the data set 10 of the training system.
  • these projections also use the equations of the projection of the plane of evolution of the cars.
  • the same synthesized car can be projected in several different positions and orientations, according to the absolute ranges of variation previously determined. In general, the orientations are correlated with the positions, so the two parameters will not be varied independently, unless abnormal situations are to be detected, such as a car across the road.
  • the dataset completed by this procedure could still have gaps preventing the detection of certain car models. In this case, the automatic configuration phase can be reiterated starting from the completed dataset.
  • this dataset normally offers a recognition rate higher than the initial set, which will lead to the constitution of a more varied collection of models 16, making it possible to refine the parameter variation ranges and to synthesize models 22 that are both more accurate and more varied to feed the dataset 10 again.
  • the initial dataset can be produced by simple and fast supervised learning.
  • an operator views images of the filmed scene and, using a graphical interface, annotates the image areas corresponding to instances of the objects to be recognized. Since the subsequent configuration procedure is based on the morphological variations of the generic model, the operator may wish to annotate the objects exhibiting the largest variations. The operator can thus annotate at least two objects whose morphologies are at opposite ends of a range of variation that he or she has visually observed.
  • the interface can be designed to establish the equations of the projection of the plane of evolution with the assistance of the operator. The interface can then propose that the operator manually conform the generic model to image areas, providing both an annotation and the creation of the first models in the collection 16.
  • this annotation phase is brief and rough, the objective being to obtain a restricted initial dataset allowing the start of the automatic configuration phase that will complete the dataset.

Abstract

The invention relates to a method for automatically configuring a system for recognizing a class of objects of variable morphology, comprising the following steps: providing a machine learning system with an initial data set (10) sufficient to recognize instances of objects of the class in a sequence of images of a target scene; providing a generic three-dimensional model specific to the class of objects, whose morphology is definable by a set of parameters; acquiring a sequence of images of the scene with the aid of a camera (12); recognizing image instances (14) of objects of the class in the acquired sequence of images using the initial data set; mapping the generic three-dimensional model (16) with recognized image instances (14); recording ranges of variation of the parameters (20) resulting from the mappings of the generic model; synthesizing multiple three-dimensional objects (22) on the basis of the generic model by varying the parameters in the recorded ranges of variation; and completing the data set (10) of the learning system by projections of the synthesized objects (24) in the plane of the images.

Description

OBJECT RECOGNITION SYSTEM BASED ON AN ADAPTIVE 3D GENERIC MODEL
Technical field
The invention relates to mobile object recognition systems, and in particular to systems based on machine learning.
Background
The isolation and tracking of a moving object in a sequence of images can be performed by relatively unsophisticated generic algorithms based, for example, on background subtraction. On the other hand, it is more difficult to classify the objects thus isolated into the categories one wishes to detect, that is to say, to recognize whether the object is a person, a car, a bicycle, an animal, etc. Indeed, the objects can have a great variety of morphologies in the images of the sequence (position, size, orientation, distortion, texture, configuration of possible appendages and articulated elements, etc.). The morphologies also depend on the viewing angle and the lens of the camera filming the scene to be monitored. Sometimes one also wishes to recognize subclasses (car model, gender of a person).
To classify and detect objects, a machine learning system is generally used. The classification is then based on a knowledge base or a data set developed by learning. An initial data set is generally generated during a so-called supervised learning phase, where an operator views sequences of images produced in context and manually annotates the image areas corresponding to the objects to be recognized. This phase is generally long and tedious, because one ideally seeks to capture all the possible variants of the morphology of the objects of the class, or at least enough variants to obtain a satisfactory recognition rate.
To lighten this initial supervised learning task, machine learning techniques have been proposed where, rather than feeding the dataset with annotated real images, it is fed with self-annotated synthesized images generated from three-dimensional models of the objects to be recognized. Such a technique is described for configuring a pedestrian detector in the article ["Learning Scene-Specific Pedestrian Detectors without Real Data", Hironori Hattori et al., 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)]. A similar technique is described for configuring a car detector in the article ["Teaching 3D Geometry to Deformable Part Models", Bojan Pepik et al., 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)].
A characteristic of these techniques is that they generate many synthesized images that, although they conform to the parameters and constraints of the 3D models, have improbable morphologies. This clutters the dataset with unnecessary images and can slow recognition.
In addition, some objects have such a variable morphology that it is difficult to realistically reproduce all possibilities with a 3D model having a manageable number of parameters and constraints. This results in gaps in the data set and failures to detect certain objects.
Summary
There is generally provided a method of automatically configuring a recognition system for a class of objects of variable morphology, comprising the steps of: providing a machine learning system with an initial data set sufficient to recognize instances of objects of the class in a sequence of images of a target scene; providing a generic three-dimensional model specific to the class of objects, the morphology of which can be defined by a set of parameters; acquiring a sequence of images of the scene using a camera; recognizing image instances of objects of the class in the acquired image sequence using the initial data set; conforming the generic three-dimensional model to recognized image instances; recording ranges of variation of the parameters resulting from the conformations of the generic model; synthesizing multiple three-dimensional objects from the generic model by varying the parameters within the recorded ranges of variation; and completing the data set of the learning system with projections of the synthesized objects in the plane of the images.
The method may comprise the following steps: defining parameters of the generic three-dimensional model by the relative positions of landmarks of a mesh of the model, the positions of the other nodes of the mesh being bound to the landmarks by constraints; and performing conformations of the generic three-dimensional model by positioning landmarks of a projection of the model in the plane of the images. The method may further include the steps of: recording textures from areas of the recognized image instances; and mapping onto each synthesized object a texture among the recorded textures.
The initial data set of the learning system can be obtained by supervised learning involving at least two objects of the class whose morphologies are at opposite ends of an observed domain of variation of the morphologies.
Brief description of the drawings
Embodiments will be set forth in the following description, given by way of non-limiting example in relation to the appended figures, among which:
• Figure 1 shows a schematic three-dimensional generic model of an object, projected in different positions of a scene seen by a camera; and
• Figure 2 schematically illustrates a configuration phase of a machine learning system for recognizing objects according to the generic model of Figure 1.
Description of embodiments
To simplify the initial configuration phase of an object detector, it is proposed, as in the aforementioned article by Hironori Hattori, to configure a machine learning system using images synthesized and self-annotated from three-dimensional models. However, to improve the recognition rate, the three-dimensional models are obtained from a parameterizable generic model that has previously been conformed to images of real objects filmed in context.
Figure 1 illustrates a simplified generic model of an object, for example a car, projected onto an image in different positions of an example of a scene monitored by a fixed camera. The scene here is, for simplicity's sake, a street crossing the field of view of the camera horizontally.
À l'arrière-plan, le modèle est projeté en trois positions alignées, au centre et près des bords gauche et droit de l'image. Au premier plan, le modèle est projeté dans une position légèrement à gauche. Toutes ces projections sont issues d'un même modèle du point de vue des dimensions et montrent les variations de morphologie des projections dans l'image en fonction de la position dans la scène. Dans une scène plus complexe, par exemple une rue courbe, on verrait également des variations de morphologie en fonction de l'orientation du modèle. In the background, the model is projected at three aligned positions: in the center and near the left and right edges of the image. In the foreground, the model is projected at a position slightly to the left. All these projections come from the same model as far as dimensions are concerned, and show how the morphology of the projections in the image varies with the position in the scene. In a more complex scene, for example a curved street, morphological variations would also be seen as a function of the orientation of the model.
Les variations de morphologie en fonction de la position sont définies par la projection du plan sur lequel évoluent les objets, ici la rue. La projection du plan d'évolution est définie par des équations qui dépendent des caractéristiques de la caméra (angle de vue, focale et distorsion de l'objectif). Les arêtes perpendiculaires à l'axe de la caméra changent de taille homothétiquement en fonction de la distance à la caméra, et les arêtes parallèles à l'axe de la caméra suivent des lignes de fuite. Il en résulte que, lors d'un déplacement latéral d'une voiture dans la vue illustrée, la face avant de la voiture, initialement visible, est cachée à partir du centre de l'image, tandis que la face arrière, initialement cachée, devient visible à partir du centre de l'image. La face supérieure de la voiture, toujours visible dans cet exemple, se déforme en cisaillement selon des lignes de fuite. The variations of morphology as a function of position are defined by the projection of the plane on which the objects move, here the street. The projection of this plane of evolution is defined by equations that depend on the characteristics of the camera (angle of view, focal length and lens distortion). Edges perpendicular to the camera axis scale homothetically with the distance to the camera, and edges parallel to the camera axis follow vanishing lines. As a result, during a lateral displacement of a car in the illustrated view, the front face of the car, initially visible, is hidden from the center of the image onward, while the rear face, initially hidden, becomes visible from the center of the image onward. The upper face of the car, always visible in this example, deforms in shear along vanishing lines.
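The homothetic scaling described above follows directly from the pinhole projection model. Here is a minimal sketch; the focal length and principal point are illustrative assumptions, not values from the patent:

```python
import numpy as np

def project_point(p_world, f=800.0, cx=320.0, cy=240.0):
    """Project a 3D point (x, y, z), z > 0, through a pinhole camera.

    f is the focal length in pixels and (cx, cy) the principal point;
    both are purely illustrative values.
    """
    x, y, z = p_world
    return np.array([f * x / z + cx, f * y / z + cy])

# An edge perpendicular to the optical axis scales as 1/z: the same
# 1 m lateral offset appears half as wide when seen twice as far away.
near = project_point((1.0, 0.0, 10.0))
far = project_point((1.0, 0.0, 20.0))
```

Doubling the depth halves the pixel offset from the principal point, which is exactly the size variation the projections of Figure 1 exhibit across the scene.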
En résumé, les projections d'un même objet dans l'image à des positions ou orientations différentes ont une morphologie variable, même si l'objet réel a une morphologie fixe. Bien entendu, les objets réels peuvent également avoir une morphologie variable, que ce soit d'un objet à l'autre (entre deux voitures de modèles différents), ou au cours du déplacement d'un même objet (piéton). Les systèmes à apprentissage sont bien adaptés à cette situation lorsqu'ils ont été configurés avec suffisamment de données pour représenter l'éventail de morphologies projetées les plus probables. In summary, the projections of the same object in the image at different positions or orientations have a variable morphology, even if the real object has a fixed morphology. Of course, real objects can also have a variable morphology, whether from one object to another (between two cars of different models), or during the movement of the same object (pedestrian). Learning systems are well suited to this situation when they have been configured with enough data to represent the range of most likely projected morphologies.
Le modèle générique tridimensionnel envisagé, par exemple de type PDM (« Point Distribution Model » ou modèle à distribution de points), peut comporter un maillage de noeuds liés les uns aux autres par des contraintes, c'est-à-dire des paramètres qui établissent les déplacements relatifs entre des noeuds adjacents ou les déformations du maillage que provoquent les déplacements de certains noeuds, dits amers (« landmarks »). Les amers sont choisis pour que leurs déplacements permettent d'atteindre toutes les morphologies souhaitées du modèle à l'intérieur des contraintes définies. The envisaged three-dimensional generic model, for example of the Point Distribution Model (PDM) type, may comprise a mesh of nodes linked to each other by constraints, that is to say parameters that establish the relative displacements between adjacent nodes, or the deformations of the mesh caused by the displacements of certain nodes, known as landmarks. The landmarks are chosen so that their displacements make it possible to reach all the desired morphologies of the model within the defined constraints.
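A PDM-style generic model of this kind can be sketched as follows. This is a simplified illustration under assumed data structures, not the patent's implementation: a set of 3D nodes, a list of landmark indices, and constraints expressed as callables that adjust the remaining nodes after the landmarks move.

```python
import numpy as np

class GenericModel3D:
    """Minimal PDM-like mesh: landmark nodes drive the morphology,
    constraints propagate their displacements to the other nodes."""

    def __init__(self, nodes, landmark_ids, constraints=()):
        self.nodes = np.asarray(nodes, dtype=float)  # (N, 3) node positions
        self.landmark_ids = list(landmark_ids)       # indices of landmark nodes
        self.constraints = list(constraints)         # e.g. keep faces parallel

    def set_landmarks(self, positions):
        """Move the landmarks, then enforce the mesh constraints."""
        for i, p in zip(self.landmark_ids, positions):
            self.nodes[i] = p
        for constrain in self.constraints:
            constrain(self.nodes)

# Toy example: two nodes, node 0 is a landmark and node 1 must mirror
# it across the x = 0 plane (a crude "keep side faces symmetric" rule).
def mirror(nodes):
    nodes[1] = nodes[0] * np.array([-1.0, 1.0, 1.0])

model = GenericModel3D([[1, 0, 0], [-1, 0, 0]], landmark_ids=[0],
                       constraints=[mirror])
model.set_landmarks([[2.0, 1.0, 0.0]])
```

Moving the single landmark here repositions the dependent node through the constraint, which is the mechanism the text describes at full mesh scale.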
Comme le représente la figure 1, dans le premier plan, un modèle générique simplifié de voiture peut comporter, pour la carrosserie, un maillage de 16 noeuds définissant 10 surfaces rectangulaires et comportant 10 amers. Huit amers K0 à K7 définissent l'une des faces latérales de la voiture, et les deux amers restants K8 et K9, situés sur l'autre face latérale, définissent la largeur de la voiture. As shown in Figure 1, in the foreground, a simplified generic car model may comprise, for the bodywork, a mesh of 16 nodes defining 10 rectangular surfaces and including 10 landmarks. Eight landmarks K0 to K7 define one of the side faces of the car, and the two remaining landmarks K8 and K9, located on the other side face, define the width of the car.
Un seul amer suffirait à définir la largeur de la voiture, mais la présence de deux amers ou plus permettra de conformer le modèle à une projection d'un objet réel en tenant compte des déformations de la projection. Les roues sont un élément caractéristique d'une voiture et on peut leur attribuer un jeu d'amers spécifique, non représenté, définissant l'entre-axe, le diamètre et les points de contact avec la route. Diverses contraintes peuvent être affectées à ces amers pour que le modèle soit conforme à la gamme de voitures à détecter, par exemple, maintenir le parallélisme entre les deux faces latérales ; maintenir le parallélisme entre les faces avant et arrière ; maintenir le parallélisme entre les arêtes perpendiculaires aux faces latérales ; assurer que les noeuds K3 et K5 soient au-dessus des noeuds K1 et K6 ; assurer que les noeuds K3 et K4 soient au-dessus des noeuds K2 et K5 ; assurer que les noeuds K2 et K3 soient à droite des noeuds K1 et K2 ; assurer que les noeuds K4 et K5 soient à gauche des noeuds K5 et K6, etc. A single landmark would be enough to define the width of the car, but the presence of two or more landmarks makes it possible to conform the model to a projection of a real object while taking the deformations of the projection into account. The wheels are a characteristic element of a car and can be assigned a specific set of landmarks, not shown, defining the wheelbase, the diameter and the points of contact with the road. Various constraints can be assigned to these landmarks so that the model conforms to the range of cars to be detected, for example: maintain the parallelism between the two side faces; maintain the parallelism between the front and rear faces; maintain the parallelism between the edges perpendicular to the side faces; ensure that nodes K3 and K5 are above nodes K1 and K6; ensure that nodes K3 and K4 are above nodes K2 and K5; ensure that nodes K2 and K3 are to the right of nodes K1 and K2; ensure that nodes K4 and K5 are to the left of nodes K5 and K6, etc.
Comme on l'a précédemment indiqué, le modèle 3D générique illustré est simpliste, cela pour clarifier l'exposé. Dans la pratique, le modèle comprendra un maillage plus fin et permettant de définir des arêtes et des surfaces courbes. As previously indicated, the generic 3D model illustrated is simplistic, for clarity of presentation. In practice, the model will comprise a finer mesh, making it possible to define curved edges and surfaces.
La figure 2 illustre schématiquement une phase de configuration d'un système d'apprentissage machine pour reconnaître des voitures selon le modèle générique de la figure 1, à titre d'exemple. Le système d'apprentissage comprend un jeu de données 10 associé à une caméra 12 installée pour filmer une scène à surveiller, par exemple celle de la figure 1. FIG. 2 schematically illustrates a configuration phase of a machine learning system for recognizing cars according to the generic model of FIG. 1, by way of example. The learning system comprises a data set 10 associated with a camera 12 installed to film a scene to be monitored, for example that of FIG. 1.
La phase de configuration peut démarrer à partir d'un jeu de données 10 existant, qui peut être sommaire et n'offrir qu'un faible taux de reconnaissance. Ce jeu de données existant peut avoir été produit par un apprentissage supervisé rapide et peu contraignant. Les étapes qui suivent servent à compléter le jeu de données pour atteindre un taux de reconnaissance satisfaisant. The configuration phase can start from an existing data set 10, which may be rudimentary and offer only a low recognition rate. This existing data set may have been produced by quick, low-effort supervised learning. The following steps serve to complete the data set so as to reach a satisfactory recognition rate.
Le système de reconnaissance est mis en marche et se met à reconnaître et suivre des voitures dans les images successives capturées par la caméra. Une instance d'image de chaque voiture reconnue est extraite en 14. Pour simplifier l'exposé, seule une face latérale des voitures est illustrée dans les instances - en réalité chaque instance est une projection en perspective dans laquelle d'autres faces sont le plus souvent visibles. La caméra produit généralement plusieurs images qui contiennent chacune une instance de la même voiture, à des positions différentes. On peut choisir l'instance la plus grande, qui aura la meilleure résolution pour les opérations ultérieures. The recognition system is started and begins to recognize and track cars in the successive images captured by the camera. An image instance of each recognized car is extracted at 14. To simplify the presentation, only one side face of the cars is shown in the instances - in reality each instance is a perspective projection in which other faces are most often visible. The camera generally produces several images, each containing an instance of the same car at different positions. The largest instance, which will have the best resolution for subsequent operations, can be chosen.
Ensuite, le modèle 3D générique est conformé à chaque instance d'image ainsi extraite. Cela peut être effectué par un algorithme classique de conformation (« fitting ») qui cherche, par exemple, les meilleures correspondances entre l'image et les amers du modèle tel que projeté dans le plan de l'image. On peut également utiliser des algorithmes basés sur la détection d'amers, comme cela est décrit, par exemple, dans ["One Millisecond Face Alignment with an Ensemble of Regression Trees", Vahid Kazemi et al., IEEE CVPR 2014]. Bien entendu, il est préférable que d'autres faces des voitures soient visibles dans les instances pour que le modèle puisse être défini de manière complète. Then, the generic 3D model is conformed to each image instance thus extracted. This can be done by a conventional fitting algorithm which seeks, for example, the best matches between the image and the landmarks of the model as projected into the image plane. Algorithms based on landmark detection can also be used, as described, for example, in ["One Millisecond Face Alignment with an Ensemble of Regression Trees", Vahid Kazemi et al., IEEE CVPR 2014]. Of course, it is preferable that other faces of the cars be visible in the instances so that the model can be defined completely.
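The fitting step can be pictured, in a drastically reduced form, as finding a transform that best aligns the projected model landmarks with points detected in the image. The sketch below fits only a scale and a translation by least squares; the actual conformation described above adjusts the full 3D model, so this is an assumption-laden stand-in, not the patented method.

```python
import numpy as np

def fit_scale_offset(model_2d, detected_2d):
    """Least-squares scale s and translation t such that
    s * model + t best approximates the detected landmark positions."""
    m = np.asarray(model_2d, dtype=float)
    d = np.asarray(detected_2d, dtype=float)
    mc = m - m.mean(axis=0)                # center both point sets
    dc = d - d.mean(axis=0)
    s = (mc * dc).sum() / (mc * mc).sum()  # optimal common scale
    t = d.mean(axis=0) - s * m.mean(axis=0)
    return s, t

# Projected generic-model landmarks vs. points "detected" in an image
# that is an exact scaled-and-shifted copy of the model.
model_pts = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 2.0], [0.0, 2.0]])
detected = 2.0 * model_pts + np.array([3.0, 4.0])
s, t = fit_scale_offset(model_pts, detected)
```

On this synthetic input the recovered transform is exact; with real detections the residual of the fit measures how well the generic model conforms to the instance.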
Ces opérations de conformation produisent des modèles 3D censés être à l'échelle des objets réels. Pour cela, les opérations de conformation peuvent utiliser les équations de la projection du plan d'évolution de l'objet. Ces équations peuvent être déterminées manuellement à partir des caractéristiques de la caméra et de la configuration de la scène, ou bien estimées par le système dans une phase d'étalonnage utilisant, le cas échéant, des outils adaptés comme des caméras de profondeur. En sachant que les objets évoluent sur un plan, les équations peuvent être déduites à partir de la variation de la taille en fonction de la position des instances d'un objet suivi. These conformation operations produce 3D models that are meant to be at the scale of the real objects. To that end, the conformation operations can use the equations of the projection of the object's plane of evolution. These equations can be determined manually from the camera characteristics and the configuration of the scene, or estimated by the system in a calibration phase using, where appropriate, suitable tools such as depth cameras. Knowing that the objects move on a plane, the equations can be deduced from the variation of size as a function of the position of the instances of a tracked object.
A l'issue de chaque conformation, on produit en 16 un modèle 3D représentant la voiture reconnue à l'échelle. Les modèles sont illustrés en deux dimensions, en correspondance avec les faces latérales extraites 14. (On remarque que le modèle 3D générique utilisé est conformable aussi bien à des voitures qu'à des fourgonnettes, voire à des autobus. Ainsi, le système est ici plutôt prévu pour reconnaître tout véhicule à quatre roues.) At the end of each conformation, a 3D model representing the recognized car at scale is produced at 16. The models are illustrated in two dimensions, in correspondence with the extracted side faces 14. (Note that the generic 3D model used is conformable to cars as well as to vans or even buses. The system here is thus rather intended to recognize any four-wheeled vehicle.)
Pendant l'étape de conformation, on peut également échantillonner les zones d'image correspondant aux différentes faces de la voiture, et stocker ces zones d'image sous forme de textures en 18. Au bout d'une certaine durée d'acquisition, on aura collectionné sans supervision une multitude de modèles 3D représentant des voitures différentes, ainsi que leurs textures. Si le taux de reconnaissance offert par le jeu de données initial est faible, il suffit de prolonger la durée d'acquisition pour atteindre une collection ayant un nombre satisfaisant de modèles. During the conformation step, the image areas corresponding to the different faces of the car can also be sampled, and these image areas stored as textures at 18. After a certain acquisition period, a multitude of 3D models representing different cars, together with their textures, will have been collected without supervision. If the recognition rate offered by the initial data set is low, it suffices to extend the acquisition period to reach a collection with a satisfactory number of models.
Lorsque la collection de modèles 16 est jugée satisfaisante, on compare les modèles entre eux en 20 et on relève pour chaque amer une plage de variation. On a illustré un exemple de plage de variation pour l'amer K6. When the collection of models 16 is judged satisfactory, the models are compared with each other at 20 and a range of variation is recorded for each landmark. An example of a range of variation is illustrated for landmark K6.
Les plages de variation peuvent définir des variations relatives affectant la forme du modèle lui-même, ou des variations absolues comme la position et l'orientation du modèle. L'un des amers, par exemple K0, peut servir de référence absolue. On peut lui attribuer des plages de variation absolue qui déterminent les positions et orientations possibles de la voiture dans l'image. Ces plages ne sont en fait pas directement déductibles des modèles enregistrés, puisqu'un modèle enregistré peut être issu d'une seule instance choisie sur une multitude d'instances produites au cours du déplacement d'une voiture. On peut estimer les variations de position et d'orientation en les déduisant des multiples instances d'une voiture suivie, sans pour cela devoir effectuer une conformation complète du modèle générique à chaque instance. The ranges of variation can define relative variations affecting the shape of the model itself, or absolute variations such as the position and orientation of the model. One of the landmarks, for example K0, can serve as an absolute reference. It can be assigned absolute variation ranges that determine the possible positions and orientations of the car in the image. These ranges are in fact not directly deducible from the recorded models, since a recorded model may come from a single instance chosen among a multitude of instances produced while a car moves. The variations in position and orientation can be estimated by deducing them from the multiple instances of a tracked car, without having to carry out a complete conformation of the generic model on each instance.
Pour un amer diamétralement opposé à K0, par exemple K6, on pourra établir une plage de variation relativement à l'amer K0, qui détermine la longueur de la voiture. Pour un autre amer diamétralement opposé, par exemple K8, on pourra établir une plage de variation relativement à l'amer K0, qui détermine la largeur de la voiture. La plage de variation de chacun des autres amers peut être établie relativement à l'un de ses amers adjacents. For a landmark diametrically opposite K0, for example K6, a range of variation relative to landmark K0 can be established, which determines the length of the car. For another diametrically opposite landmark, for example K8, a range of variation relative to landmark K0 can be established, which determines the width of the car. The range of variation of each of the other landmarks can be established relative to one of its adjacent landmarks.
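Recording the ranges of variation over the collected models can be sketched as a per-landmark minimum and maximum over the collection. The function name and array layout below are assumptions for illustration:

```python
import numpy as np

def landmark_ranges(models):
    """models: array-like of shape (M, K, 3) — M fitted models, each
    described by K landmark positions in 3D.
    Returns per-landmark lower and upper bounds over the collection."""
    arr = np.asarray(models, dtype=float)
    return arr.min(axis=0), arr.max(axis=0)

# Two fitted models with two landmarks each (say K0 and K6): the second
# landmark's x-coordinate encodes the car's length, 4.0 m vs. 5.2 m.
collection = [
    [[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [5.2, 0.0, 0.0]],
]
lo, hi = landmark_ranges(collection)
```

Here the range recorded for the length-defining landmark spans 4.0 to 5.2 m, exactly the kind of per-landmark interval used in the synthesis step.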
Une fois que les plages de variation sont établies, on synthétise en 22 une multitude de voitures 3D virtuelles à partir du modèle générique en faisant varier les amers dans leurs plages de variation respectives. Sur chaque voiture virtuelle on peut en outre plaquer l'une des textures 18. Les variations des amers peuvent être aléatoires, incrémentales, ou une combinaison des deux. Once the ranges of variation are established, a multitude of virtual 3D cars is synthesized at 22 from the generic model by varying the landmarks within their respective ranges of variation. One of the textures 18 can furthermore be mapped onto each virtual car. The variations of the landmarks can be random, incremental, or a combination of both.
En 24, chaque voiture synthétisée est projetée dans le plan de l'image de la caméra pour former une instance d'image auto-annotée servant à compléter le jeu de données 10 du système d'apprentissage. Ces projections utilisent également les équations de la projection du plan d'évolution des voitures. Une même voiture synthétisée peut être projetée en plusieurs positions et orientations différentes, selon les plages de variation absolues précédemment déterminées. En général, les orientations sont corrélées aux positions, de sorte qu'on ne fera pas varier les deux paramètres de façon indépendante, sauf si on souhaite détecter des situations anormales, comme une voiture en travers de la route. At 24, each synthesized car is projected into the camera image plane to form a self-annotated image instance used to complete the data set 10 of the learning system. These projections also use the equations of the projection of the cars' plane of evolution. The same synthesized car can be projected at several different positions and orientations, according to the absolute ranges of variation previously determined. In general, the orientations are correlated with the positions, so the two parameters will not be varied independently, unless one wishes to detect abnormal situations, such as a car lying across the road.
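The synthesis and auto-annotation steps can be sketched together: sample virtual landmark sets within the recorded ranges, project them into the image plane, and take the bounding box of the projection as the annotation. The sampling strategy, projection helper and camera parameters below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize(lo, hi, n):
    """Draw n virtual landmark sets uniformly within [lo, hi]
    (lo and hi have shape (K, 3), the per-landmark variation bounds)."""
    lo = np.asarray(lo, dtype=float)
    hi = np.asarray(hi, dtype=float)
    return lo + rng.random((n,) + lo.shape) * (hi - lo)

def auto_annotate(points_3d, f=800.0, cx=320.0, cy=240.0):
    """Project 3D points (z > 0) through an assumed pinhole camera and
    return the 2D bounding box: a self-annotated training instance."""
    p = np.asarray(points_3d, dtype=float)
    uv = np.stack([f * p[:, 0] / p[:, 2] + cx,
                   f * p[:, 1] / p[:, 2] + cy], axis=1)
    return uv.min(axis=0), uv.max(axis=0)

# Bounds for two landmarks of a virtual object sitting 10-12 m away.
lo = np.array([[-1.0, 0.0, 10.0], [1.0, 0.0, 10.0]])
hi = np.array([[-1.0, 1.0, 12.0], [1.0, 1.0, 12.0]])
virtual = synthesize(lo, hi, n=5)
box_min, box_max = auto_annotate(virtual[0])
```

Each (bounding box, class) pair produced this way needs no human labeling, which is what makes the completed data set "auto-annotated".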
Si le jeu de données initial était insuffisant, le jeu de données complété par cette procédure pourrait encore comporter des lacunes empêchant la détection de certains modèles de voiture. Dans ce cas, on peut réitérer une phase de configuration automatique en partant du jeu de données complété. Ce jeu de données offre normalement un taux de reconnaissance supérieur au jeu initial, ce qui conduira à la constitution d'une collection de modèles 16 plus variée, permettant d'affiner les plages de variation de paramètres et de synthétiser des modèles 22 à la fois plus précis et variés pour alimenter de nouveau le jeu de données 10. If the initial data set was insufficient, the data set completed by this procedure could still have gaps preventing the detection of certain car models. In that case, an automatic configuration phase can be repeated starting from the completed data set. This data set normally offers a recognition rate higher than the initial one, which will lead to the constitution of a more varied collection of models 16, making it possible to refine the ranges of variation of the parameters and to synthesize models 22 that are both more accurate and more varied, to feed the data set 10 again.
Comme on l'a précédemment indiqué, le jeu de données initial peut être produit par un apprentissage supervisé simple et rapide. Dans une telle procédure, un opérateur visionne des images de la scène filmée et, à l'aide d'une interface graphique, annote les zones d'image correspondant à des instances des objets à reconnaître. Comme la procédure de configuration ultérieure est basée sur les variations morphologiques du modèle générique, l'opérateur peut avoir intérêt à annoter les objets exhibant les variations les plus importantes. Il peut ainsi annoter au moins deux objets dont les morphologies sont à des extrêmes opposés d'un domaine de variation qu'il aurait constaté visuellement. L'interface peut être conçue pour établir les équations de la projection du plan d'évolution avec l'assistance de l'opérateur. L'interface peut ensuite proposer à l'opérateur de conformer manuellement le modèle générique à des zones d'image, offrant à la fois une annotation et la création des premiers modèles dans la collection 16. As previously indicated, the initial data set can be produced by simple and fast supervised learning. In such a procedure, an operator views images of the filmed scene and, using a graphical interface, annotates the image areas corresponding to instances of the objects to be recognized. Since the subsequent configuration procedure is based on the morphological variations of the generic model, the operator may find it advantageous to annotate the objects exhibiting the largest variations. He can thus annotate at least two objects whose morphologies are at opposite extremes of a range of variation he has observed visually. The interface can be designed to establish the equations of the projection of the plane of evolution with the assistance of the operator. The interface can then propose that the operator manually conform the generic model to image areas, providing both an annotation and the creation of the first models in the collection 16.
Cette phase d'annotation est sommaire et rapide, l'objectif étant d'obtenir un jeu de données initial restreint permettant le démarrage de la phase de configuration automatique qui complétera le jeu de données. This annotation phase is brief and quick, the objective being to obtain a small initial data set allowing the start of the automatic configuration phase, which will complete the data set.
De nombreuses variantes et modifications des modes de réalisation décrits ici apparaîtront à l'homme du métier. Bien que ces modes de réalisation concernent essentiellement la détection de voitures, la voiture n'est présentée qu'à titre d'exemple d'objet que l'on souhaite reconnaître. Les principes décrits sont applicables à tout objet que l'on peut modéliser de façon générique, notamment des objets déformables, comme des animaux ou des humains. Many variations and modifications of the embodiments described herein will be apparent to those skilled in the art. Although these embodiments relate essentially to the detection of cars, the car is presented only as an example of an object that one wishes to recognize. The principles described are applicable to any object that can be modeled generically, including deformable objects, such as animals or humans.

Claims

Revendications claims
1. Procédé de configuration automatique d'un système de reconnaissance d'une classe d'objets de morphologie variable, comprenant les étapes suivantes :  A method of automatically configuring a system for recognizing a class of objects of variable morphology, comprising the following steps:
• prévoir un système d'apprentissage machine avec un jeu de données initial (10) suffisant pour reconnaître des instances d'objets de la classe dans une séquence d'images d'une scène cible ; Providing a machine learning system with an initial data set (10) sufficient to recognize instances of objects of the class in a sequence of images of a target scene;
• prévoir un modèle tridimensionnel générique spécifique à la classe d'objets, dont la morphologie est définissable par un jeu de paramètres ; • provide a generic three-dimensional model specific to the class of objects, the morphology of which can be defined by a set of parameters;
• acquérir une séquence d'images de la scène à l'aide d'une caméra (12) ; • acquire a sequence of images of the scene using a camera (12);
• reconnaître des instances d'image (14) d'objets de la classe dans la séquence d'images acquise en utilisant le jeu de données initial ; • recognize image instances (14) of objects of the class in the image sequence acquired using the initial dataset;
• conformer le modèle tridimensionnel générique (16) à des instances d'image reconnues (14) ; Conforming the generic three-dimensional model (16) to recognized image instances (14);
• enregistrer des plages de variation des paramètres (20) résultant des conformations du modèle générique ; • record ranges of variation of the parameters (20) resulting from the conformations of the generic model;
• synthétiser de multiples objets tridimensionnels (22) à partir du modèle générique en faisant varier les paramètres dans les plages de variation enregistrées ; et • synthesizing multiple three-dimensional objects (22) from the generic model by varying the parameters in the recorded ranges of variation; and
• compléter le jeu de données (10) du système d'apprentissage par des projections des objets synthétisés (24) dans le plan des images. • complete the data set (10) of the learning system by projections of the synthesized objects (24) in the plane of the images.
2. Procédé selon la revendication 1 , comprenant les étapes suivantes : The method of claim 1, comprising the steps of:
• définir des paramètres du modèle tridimensionnel générique par les positions relatives d'amers (K0-K9) d'un maillage du modèle, les positions des autres noeuds du maillage étant liées aux amers par des contraintes ; et • defining parameters of the generic three-dimensional model by the relative positions of landmarks (K0-K9) of a mesh of the model, the positions of the other nodes of the mesh being linked to the landmarks by constraints; and
• opérer des conformations du modèle tridimensionnel générique en positionnant des amers d'une projection du modèle dans le plan des images. • carrying out conformations of the generic three-dimensional model by positioning landmarks of a projection of the model in the plane of the images.
3. Procédé selon la revendication 1 , comprenant les étapes suivantes : The method of claim 1, comprising the steps of:
• enregistrer des textures (18) à partir de zones des instances d'image reconnues ; et Registering textures (18) from areas of the recognized image instances; and
• plaquer sur chaque objet synthétisé (22) une texture parmi les textures enregistrées. • mapping onto each synthesized object (22) one of the recorded textures.
4. Procédé selon la revendication 1 , dans lequel le jeu de données initial du système d'apprentissage est obtenu par un apprentissage supervisé impliquant au moins deux objets de la classe dont les morphologies sont à des extrêmes opposés d'un domaine de variation constaté des morphologies. The method of claim 1, wherein the initial set of data of the learning system is obtained by supervised learning involving at least two objects of the class whose morphologies are at opposite ends of a domain of variation found in morphologies.
EP17811644.8A 2016-12-14 2017-11-21 Object recognition system based on an adaptive 3d generic model Withdrawn EP3555802A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR1662455A FR3060170B1 (en) 2016-12-14 2016-12-14 OBJECT RECOGNITION SYSTEM BASED ON AN ADAPTIVE 3D GENERIC MODEL
PCT/FR2017/053191 WO2018109298A1 (en) 2016-12-14 2017-11-21 Object recognition system based on an adaptive 3d generic model

Publications (1)

Publication Number Publication Date
EP3555802A1 true EP3555802A1 (en) 2019-10-23

Family

ID=58501514

Family Applications (1)

Application Number Title Priority Date Filing Date
EP17811644.8A Withdrawn EP3555802A1 (en) 2016-12-14 2017-11-21 Object recognition system based on an adaptive 3d generic model

Country Status (9)

Country Link
US (1) US11036963B2 (en)
EP (1) EP3555802A1 (en)
JP (1) JP7101676B2 (en)
KR (1) KR102523941B1 (en)
CN (1) CN110199293A (en)
CA (1) CA3046312A1 (en)
FR (1) FR3060170B1 (en)
IL (1) IL267181B2 (en)
WO (1) WO2018109298A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR3060170B1 (en) * 2016-12-14 2019-05-24 Smart Me Up OBJECT RECOGNITION SYSTEM BASED ON AN ADAPTIVE 3D GENERIC MODEL
US11462023B2 (en) 2019-11-14 2022-10-04 Toyota Research Institute, Inc. Systems and methods for 3D object detection
US11736748B2 (en) * 2020-12-16 2023-08-22 Tencent America LLC Reference of neural network model for adaptation of 2D video for streaming to heterogeneous client end-points
KR20230053262A (en) 2021-10-14 2023-04-21 주식회사 인피닉 A 3D object recognition method based on a 2D real space image and a computer program recorded on a recording medium to execute the same

Family Cites Families (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5397985A (en) * 1993-02-09 1995-03-14 Mobil Oil Corporation Method for the imaging of casing morphology by twice integrating magnetic flux density signals
DE10252298B3 (en) * 2002-11-11 2004-08-19 Mehl, Albert, Prof. Dr. Dr. Process for the production of tooth replacement parts or tooth restorations using electronic tooth representations
US7853085B2 (en) 2003-03-06 2010-12-14 Animetrics, Inc. Viewpoint-invariant detection and identification of a three-dimensional object from two-dimensional imagery
EP1811456B1 (en) * 2004-11-12 2011-09-28 Omron Corporation Face feature point detector and feature point detector
JP4653606B2 (en) * 2005-05-23 2011-03-16 株式会社東芝 Image recognition apparatus, method and program
JP2007026400A (en) 2005-07-15 2007-02-01 Asahi Engineering Kk Object detection/recognition system at place with sharp difference in illuminance using visible light and computer program
JP4991317B2 (en) * 2006-02-06 2012-08-01 株式会社東芝 Facial feature point detection apparatus and method
JP4585471B2 (en) * 2006-03-07 2010-11-24 株式会社東芝 Feature point detection apparatus and method
JP4093273B2 (en) * 2006-03-13 2008-06-04 オムロン株式会社 Feature point detection apparatus, feature point detection method, and feature point detection program
JP4241763B2 (en) * 2006-05-29 2009-03-18 株式会社東芝 Person recognition apparatus and method
JP4829141B2 (en) * 2007-02-09 2011-12-07 株式会社東芝 Gaze detection apparatus and method
US7872653B2 (en) * 2007-06-18 2011-01-18 Microsoft Corporation Mesh puppetry
US20100123714A1 (en) * 2008-11-14 2010-05-20 General Electric Company Methods and apparatus for combined 4d presentation of quantitative regional parameters on surface rendering
JP5361524B2 (en) * 2009-05-11 2013-12-04 キヤノン株式会社 Pattern recognition system and pattern recognition method
EP2333692A1 (en) * 2009-12-11 2011-06-15 Alcatel Lucent Method and arrangement for improved image matching
KR101697184B1 (en) * 2010-04-20 2017-01-17 삼성전자주식회사 Apparatus and Method for generating mesh, and apparatus and method for processing image
KR101681538B1 (en) * 2010-10-20 2016-12-01 삼성전자주식회사 Image processing apparatus and method
JP6026119B2 (en) * 2012-03-19 2016-11-16 株式会社東芝 Biological information processing device
GB2515510B (en) * 2013-06-25 2019-12-25 Synopsys Inc Image processing method
US9299195B2 (en) * 2014-03-25 2016-03-29 Cisco Technology, Inc. Scanning and tracking dynamic objects with depth cameras
US9633483B1 (en) * 2014-03-27 2017-04-25 Hrl Laboratories, Llc System for filtering, segmenting and recognizing objects in unconstrained environments
FR3021443B1 (en) * 2014-05-20 2017-10-13 Essilor Int METHOD FOR CONSTRUCTING A MODEL OF THE FACE OF AN INDIVIDUAL, METHOD AND DEVICE FOR ANALYZING POSTURE USING SUCH A MODEL
CN104182765B (en) * 2014-08-21 2017-03-22 南京大学 Internet image driven automatic selection method of optimal view of three-dimensional model
US10559111B2 (en) * 2016-06-23 2020-02-11 LoomAi, Inc. Systems and methods for generating computer ready animation models of a human head from captured data images
FR3060170B1 (en) * 2016-12-14 2019-05-24 Smart Me Up OBJECT RECOGNITION SYSTEM BASED ON AN ADAPTIVE 3D GENERIC MODEL

Also Published As

Publication number Publication date
CN110199293A (en) 2019-09-03
IL267181B (en) 2022-11-01
IL267181A (en) 2019-08-29
FR3060170B1 (en) 2019-05-24
US20190354745A1 (en) 2019-11-21
IL267181B2 (en) 2023-03-01
KR102523941B1 (en) 2023-04-20
WO2018109298A1 (en) 2018-06-21
FR3060170A1 (en) 2018-06-15
CA3046312A1 (en) 2018-06-21
JP7101676B2 (en) 2022-07-15
KR20190095359A (en) 2019-08-14
JP2020502661A (en) 2020-01-23
US11036963B2 (en) 2021-06-15

Similar Documents

Publication Publication Date Title
EP3555802A1 (en) Object recognition system based on an adaptive 3d generic model
FR2998401A1 (en) METHOD FOR 3D RECONSTRUCTION AND PANORAMIC 3D MOSAICKING OF A SCENE
CA2701698A1 (en) Method for synchronising video streams
EP3614306B1 (en) Method for facial localisation and identification and pose determination, from a three-dimensional view
WO2012007382A1 (en) Method for detecting a target in stereoscopic images by learning and statistical classification on the basis of a probability law
WO2005010820A2 (en) Automated method and device for perception associated with determination and characterisation of borders and boundaries of an object of a space, contouring and applications
FR3025918A1 (en) METHOD AND SYSTEM FOR AUTOMATED MODELING OF A PART
FR3002673A1 (en) METHOD AND DEVICE FOR THREE-DIMENSIONAL IMAGING OF A PARTIAL REGION OF THE ENVIRONMENT OF A VEHICLE
WO2019166743A1 (en) 3D scene modelling system by multi-view photogrammetry
WO2012117210A1 (en) Method and system for estimating a similarity between two binary images
EP3145405A1 (en) Method of determining at least one behavioural parameter
FR3083352A1 (en) METHOD AND DEVICE FOR FAST DETECTION OF REPETITIVE STRUCTURES IN THE IMAGE OF A ROAD SCENE
FR3033913A1 (en) METHOD AND SYSTEM FOR RECOGNIZING OBJECTS BY ANALYZING DIGITAL IMAGE SIGNALS OF A SCENE
EP3504683B1 (en) Method for determining the placement of the head of a vehicle driver
WO2018206331A1 (en) Method for calibrating a device for monitoring a driver in a vehicle
FR3039919A1 (en) TRACKING A TARGET IN A CAMERA NETWORK
WO2018011498A1 (en) Method and system for locating and reconstructing in real time the posture of a moving object using embedded sensors
WO2018015654A1 (en) Method and device for aiding the navigation of a vehicle
WO2014053437A1 (en) Method for counting people for a stereoscopic appliance and corresponding stereoscopic appliance for counting people
FR3015099A1 (en) SYSTEM AND DEVICE FOR ASSISTING AUTOMATED DETECTION AND MONITORING OF MOVING OBJECTS OR PEOPLE
FR3138951A1 (en) System and method for aiding the navigation of a mobile system by means of a model for predicting terrain traversability by the mobile system
EP1958157A1 (en) Method of bringing stereoscopic images into correspondence
FR3127295A1 (en) METHOD FOR AUTOMATIC IDENTIFICATION OF TARGET(S) WITHIN IMAGE(S) PROVIDED BY A SYNTHETIC APERTURE RADAR
FR3112401A1 (en) Method and system for stereoscopic vision of a celestial observation scene
WO2000004503A1 (en) Method for modelling three-dimensional objects or scenes

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20190619

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20210621

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20211103