EP4150508A1 - Détection améliorée d'objets - Google Patents
Détection améliorée d'objets
- Publication number
- EP4150508A1 (application number EP21723981.3A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- image
- surroundings
- resolution
- areas
- environment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- detection method — title, description (12)
- method — claims, abstract, description (32)
- artificial neural network — claims, abstract, description (16)
- convolutional neural network — claims, abstract, description (11)
- deep learning — claims, abstract, description (8)
- environmental effect — claims, description (18)
- processing — description (15)
- function — description (9)
- optical effect — description (9)
- static effect — description (3)
- colorants — description (2)
- matrix material — description (2)
- transition — description (2)
- approach — description (1)
- behavior — description (1)
- dependent — description (1)
- diagram — description (1)
- differentiation — description (1)
- favourable — description (1)
- installation — description (1)
- monitoring process — description (1)
- training — description (1)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/254—Fusion techniques of classification results, e.g. of results related to same input data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/50—Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Definitions
- the present invention relates to a method for recognizing objects in an image of the surroundings using a neural network, in particular a convolutional neural network using deep learning, for a driving support system of a vehicle.
- the image of the surroundings is received and encoded to provide a two-dimensional grid with image information.
- Object recognition is then carried out based on the image information.
- the present invention also relates to a driving assistance system for a vehicle, in particular as an improved driver assistance system, with at least one camera-based environment sensor for providing an image of the surroundings and a control unit which receives the image of the surroundings from the at least one camera-based environment sensor, the driving assistance system being designed to carry out the above method.
- An important system parameter is a grid size of the grid, i.e. a size of cells that are defined by the grid.
- This grid size defines a total number of objects that can be recognized and classified. It also determines the spatial accuracy of the detection and classification of the objects.
- FIG. 1a) shows an image of the surroundings 100 that was recorded with a camera-based surroundings sensor of a vehicle.
- the image of the surroundings 100 shows a roadway 102 with two lateral footpaths 104.
- several objects 106, which are pedestrians here, can be seen in the image of the surroundings 100.
- FIG. 1b) shows a uniform grid placed over the image of the surroundings 100 with a plurality of regular cells 108.
- the cells 108 define a resolution for which the image of the surroundings 100 is encoded and image information is provided.
- the grid is selected to be fine, so that distant objects 106 can also be reliably detected and classified in the image of the surroundings 100. For nearby objects 106, however, the identification and classification of the objects 106 requires a comparatively large amount of processing.
- FIG. 1c) also shows a uniform grid placed over the image of the surroundings 100 with a plurality of regular cells 108. In FIG. 1c), the grid is selected to be coarse compared to the representation in FIG. 1b), so that objects in the vicinity can be recognized and classified very efficiently.
- the fine grid of FIG. 1b also enables an improved differentiation of objects that extend over several cells. However, this goes hand in hand with an increased processing effort, with an increased number of regression steps typically being required.
- the invention is therefore based on the object of developing a method for recognizing objects in an image of the surroundings using a neural network, in particular a convolutional neural network using deep learning, for a driving support system of a vehicle, as well as to specify a corresponding driving support system for carrying out the method, which enables a reliable and efficient detection of objects in images of the surroundings.
- a method for recognizing objects in an image of the surroundings using a neural network, in particular a convolutional neural network using deep learning, for a driving support system of a vehicle is specified, comprising the steps of receiving the image of the surroundings, encoding the image of the surroundings to provide a two-dimensional grid, which has a first resolution, with image information, subdividing the image of the surroundings into a plurality of image areas with at least one first image area and at least one second image area, performing a decoding step in the at least one second image area to provide a two-dimensional grid, which has a second resolution that is lower than the first resolution, with image information, and performing an object recognition based on the image information of the plurality of image areas, wherein the at least one first image area has the first resolution and the at least one second image area has the second resolution.
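As a purely illustrative sketch of this processing chain (not taken from the patent), the following Python snippet encodes an assumed 2-megapixel image into a fine grid, splits it along a horizontal line and reduces the lower area to a coarser grid; the block-averaging function is a trivial stand-in for both the CNN encoder and the decoding step described further below, and the image size and split position are assumptions.

```python
import numpy as np

def block_mean(a, f):
    """Average f x f blocks; stands in here for the encoding (pixels -> cells)
    and for the decoding step (fine cells -> coarse cells)."""
    h, w = a.shape[:2]
    return a[:h - h % f, :w - w % f].reshape(
        h // f, f, w // f, f, -1).mean(axis=(1, 3))

# Assumed ~2-megapixel RGB image of the surroundings (random stand-in data).
image = np.random.rand(1088, 1920, 3)

grid_fine = block_mean(image, 16)               # first resolution: 16x16 px cells -> 68 x 120
split = grid_fine.shape[0] // 2                 # subdivide along a horizontal line
first_area = grid_fine[:split]                  # upper area keeps the first resolution
second_area = block_mean(grid_fine[split:], 2)  # lower area: second resolution (32x32 px cells)

print(first_area.shape, second_area.shape)      # (34, 120, 3) (17, 60, 3)
```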
- a driving support system for a vehicle, in particular as an improved driver assistance system, with at least one camera-based environment sensor for providing an image of the surroundings and a control unit that receives the image of the surroundings from the at least one camera-based environment sensor is also specified, the driving assistance system being designed to carry out the above method.
- the basic idea of the present invention is therefore to provide the image information of an image of the surroundings with different degrees of detail so that, on the one hand, the entire image of the surroundings can be processed efficiently and, on the other hand, no important detailed information is lost.
- the image of the surroundings is first encoded in order to provide the image information with the first resolution.
- the image information of the first resolution is preprocessed in the decoding step in order to provide the at least one second image area with a lower resolution of the image information.
- At least one first image area remains with the resolution of the image information as it is present after encoding and can be decoded, without the need for additional processing, in order to recognize and classify objects.
- the image information is provided with a resolution depending on the respective image area, so that it can be processed efficiently in a correspondingly adapted network structure of the neural network.
- the detection of the objects can be carried out optimally for each of the image areas, since there are comparable ratios of the objects in relation to the cells of the grid.
- distant objects, which are relatively small in the surrounding image, can thus be recognized with high reliability. Close objects that are relatively large in the image of the surroundings can also be recognized well.
- the training for recognizing the objects is made easier, since the objects are represented similarly in each of the image areas and are therefore easy to recognize.
- meta-knowledge about the information to be expected in the image of the surroundings is preferably used in order to define the different image areas and to divide the image of the surroundings accordingly.
- the meta-knowledge relates, for example, to knowledge about an assembly and / or alignment of the at least one camera-based environmental sensor on the vehicle.
- the grid defines an arrangement of cells with image information.
- the grid can, for example, define a cell size of 16 x 16 pixels or 32 x 32 pixels for the first resolution, which here, too, strikes a balance between a desired level of detail in the detection and classification of the objects and the processing speed.
- the grid can, for example, define a corresponding cell size of 32 x 32 pixels or 64 x 64 pixels for the second resolution.
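As a purely illustrative calculation (the 1920 x 1088 pixel image size is an assumption, not a value from the patent), the two example cell sizes lead to the following grid dimensions:

```python
width, height = 1920, 1088          # assumed image size of roughly 2 megapixels

for name, cell in (("first resolution", 16), ("second resolution", 32)):
    cols, rows = width // cell, height // cell
    print(f"{name}: {cell}x{cell} px cells -> {cols} x {rows} = {cols * rows} cells")

# first resolution: 16x16 px cells -> 120 x 68 = 8160 cells
# second resolution: 32x32 px cells -> 60 x 34 = 2040 cells
```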
- the grid can define cells with any dimensions, it not being necessary for the cells to include the same number of pixels in each spatial direction. This applies to each resolution independently.
- the cells for one resolution can have a square shape, while the cells for another resolution have a rectangular shape.
- the cells for another resolution can have a different rectangular shape.
- the cells of the at least one second image area with the second resolution each combine a plurality of cells of the at least one image area with the first resolution.
- the cells for the grid can be newly formed with the second resolution and, for example, comprise non-integer multiples of cells with the first resolution.
- the at least one second image area relates to an area of the environmental image that is defined by the second resolution.
- the image of the surroundings can thus have a plurality of independent second image areas, which can be contiguous or non-contiguous. The same applies to the at least one first image area.
- the subdivision of the surrounding image into the plurality of image areas can accordingly take place with a high degree of freedom.
- the detection of objects in the image of the surroundings relates to a detection of the objects with their position and a classification of the object, for example as a pedestrian, car, truck, tree, house, dog or the like.
- the neural network is designed in particular as a convolutional neural network using deep learning.
- Convolutional Neural Networks are widespread in the field of object recognition and are highly reliable.
- the driving assistance system is designed, for example, as an improved driver assistance system.
- improved driver assistance systems are known, for example, as ADAS (Advanced Driver Assistance Systems) and can include various functions. These functions can include, for example, a blind spot assistant, a lane departure warning system and / or a collision warning and protection system.
- the detection of objects is also relevant for other driving support functions through to the implementation of functions for autonomous driving of vehicles.
- the environment image is an image provided by the camera-based environment sensor. It contains a matrix with image points (pixels) which at least partially reproduce the surroundings of the vehicle.
- the surrounding image can include only brightness information for the individual pixels, i.e. the image of the surroundings is an image in the manner of a black/white image, or brightness information for a plurality of colors, for example in the RGB format or another format.
- the camera-based environment sensor can be designed as a camera that only provides a brightness value for individual pixels, or the camera-based environment sensor is, for example, a camera for providing color information, ie a brightness value for each color that can be perceived by the camera.
- the environment image can be provided by a single camera-based environment sensor alone or as a combination of several individual images from a plurality of camera-based environment sensors together.
- the latter usually relates to a combination of the multiple individual images in a horizontal direction in order to create the image of the surroundings in the manner of a panorama image.
- the camera-based environment sensor can be designed as an optical camera.
- For example, wide-angle cameras through to cameras with fish-eye lenses are used in current vehicles for monitoring the surroundings.
- the optical camera can be designed for visible light and / or for light in wavelengths that are not visible to humans, for example for ultraviolet light or for infrared light.
- the encoding of the image of the surroundings to provide a two-dimensional grid with image information includes providing image information for each cell formed by the grid. Different image information can be provided for this purpose, as set out below.
- the encoding of the image of the surroundings includes, in particular, an encoding of the image of the surroundings with a CNN encoder to provide the image information for the entire image of the surroundings in accordance with the grid with the first resolution.
- the resolution of the image information relates to a level of detail in the image information. More image information means a higher resolution, less image information means a lower resolution. Accordingly, a higher resolution means providing a finer grid, i.e. with smaller cells, whereas a lower resolution means providing a coarser grid, i.e. with larger cells.
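A minimal sketch, using PyTorch, of what a CNN encoder with an overall stride of 16 could look like, so that each output position corresponds to one 16 x 16 pixel cell of the first resolution; the layer layout, channel counts and input size are assumptions for illustration and not the encoder used in the patent.

```python
import torch
import torch.nn as nn

# Four stride-2 stages give an overall stride of 16: each output position
# corresponds to one 16 x 16 pixel cell of the first-resolution grid.
encoder = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
)

image = torch.randn(1, 3, 1088, 1920)   # assumed ~2-megapixel RGB input
grid = encoder(image)                   # shape: (1, 128, 68, 120)
print(grid.shape)
```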
- Carrying out the decoding step in the at least one second image area to provide a two-dimensional grid, which has a second resolution that is lower than the first resolution, with image information relates to processing the image information with the first resolution in the area of the at least one second image area in order to provide therefrom the two-dimensional grid with the image information with the second resolution.
- the image information in the at least one first image area is adopted unchanged for the subsequent object recognition.
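One conceivable realization of this decoding step, sketched with PyTorch: the at least one first image area is passed on unchanged, while a single stride-2 convolution halves the resolution of the at least one second image area. The choice of a learned stride-2 convolution is an assumption; any operation that merges 2 x 2 fine cells into one coarse cell would match the description.

```python
import torch
import torch.nn as nn

grid = torch.randn(1, 128, 68, 120)                    # encoded grid with the first resolution
first_area, second_area = grid[:, :, :34], grid[:, :, 34:]

# Assumed decoding step: a stride-2 convolution merges 2x2 fine cells into one
# coarse cell; the first image area is adopted unchanged.
decode = nn.Conv2d(128, 128, kernel_size=3, stride=2, padding=1)
second_area_coarse = decode(second_area)

print(first_area.shape, second_area_coarse.shape)
# torch.Size([1, 128, 34, 120]) torch.Size([1, 128, 17, 60])
```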
- An object recognition is carried out based on the image information of the plurality of image areas.
- Various approaches are known as such in the prior art to perform object recognition, as further specified below.
- the image of the surroundings is provided by the at least one camera-based surroundings sensor and transmitted to the control unit.
- After receiving the image of the surroundings, the control unit carries out the encoding of the image of the surroundings, the subdivision of the image of the surroundings into the image areas, the decoding step in the at least one second image area to provide the two-dimensional grid, which has a second resolution that is lower than the first resolution, with image information, as well as the object recognition based on the image information of the plurality of image areas.
- the control unit is also referred to as an ECU (Electronic Control Unit).
- the control unit is preferably designed as an embedded device and is provided in the vehicle.
- subdividing the environmental image into a plurality of image areas includes subdividing the environmental image into a plurality of image areas with at least one third image area, and the method includes an additional step of performing a decoding step in the at least one third image area to provide a two-dimensional grid that has a third resolution that is lower than the first resolution and is different from the second resolution, with image information.
- the surrounding image can therefore be divided into three image areas, whereby the same principles can be applied as with the division into only two image areas.
- a division into four or more image areas with different resolutions is also conceivable.
- the individual image areas can be arranged contiguously or distributed and not contiguous.
- dividing the image of the surroundings into a plurality of image areas includes dividing the image of the surroundings into a plurality of image areas with at least one fourth image area, and the method includes an additional step for discarding image information in the at least one fourth image area.
- the meta-knowledge relates to knowledge of the installation and alignment of the at least one camera-based environment sensor on the vehicle, whereby, for example, areas in the environment image can be identified that are obscured or that overlap with the field of view of another camera and therefore do not have to be processed twice.
- areas with strong distortions can be excluded from further processing, as can sometimes occur when using wide-angle optics through to fish-eye lenses.
- the subdivision of the environmental image into a plurality of image areas with the at least one fourth image area is preferably carried out as a static subdivision, in particular when the at least one fourth image area is based on meta-knowledge about the assembly and alignment of the at least one camera-based environmental sensor on the vehicle, i.e. on static information.
- the fourth image area can also be defined dynamically, for example by determining the horizon or an area with sky in previous images of the surroundings.
- the method comprises an additional step for identifying a horizon of the image of the surroundings, and the subdivision of the image of the surroundings into a plurality of image areas with at least one fourth image area is carried out based on the horizon.
- objects are usually located below or only slightly above a horizon plane in the image of the surroundings.
- image information can be discarded from an upper edge of the image downwards, but at a distance above the horizon, since no relevant objects are to be expected there on the road, ie no objects that are relevant for driving the vehicle.
- objects in the air are usually of little relevance.
- an upper row of cells with the encoded image information in the grid with the first resolution, located above the horizon, can be discarded.
- several rows with cells can also be discarded.
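A small illustration of discarding cell rows above the horizon; the horizon row index and the margin of rows kept above it are assumed inputs, for example from a static calibration or from a horizon detected in previous images.

```python
import numpy as np

grid = np.random.rand(68, 120, 128)   # first-resolution grid of encoded cells

horizon_row = 30                      # assumed cell row of the horizon
margin = 4                            # keep a few rows above the horizon

# Discard everything from the upper image edge down to 'margin' rows above
# the horizon; only the remaining rows are processed further.
kept = grid[max(horizon_row - margin, 0):]
print(kept.shape)                     # (42, 120, 128)
```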
- dividing the image of the surroundings into a plurality of image areas comprises dividing the image of the surroundings into two image areas along at least one horizontal line, with the at least one second image area and/or the at least one third image area being arranged, based on the orientation of the image of the surroundings, below the at least one horizontal line.
- the subdivision of the image of the surroundings to form the different image areas along the horizontal line or along a plurality of horizontal lines is based on a typical image division of the at least one camera-based environmental sensor. In particular when driving outside of built-up areas, closer objects are typically located below a horizontal line in the image of the surroundings compared to more distant objects.
- the size of objects within the image of the surroundings typically depends on their vertical position in the image of the surroundings. This can be taken into account by dividing the image of the surroundings along the at least one horizontal line. Image areas below a horizontal line preferably have a grid with a lower resolution than image areas above the corresponding horizontal line.
- performing an object recognition based on the image information of the plurality of image areas includes performing an independent object recognition in the plurality of image areas and merging the object recognition of the plurality of image areas for object recognition in the surrounding image.
- the same principles can therefore be used for each of the image areas to detect and recognize objects.
- the object recognition for the different image areas can also be carried out with the same decoder, since there are no or only minor differences in principle for the objects with regard to the resolution of the raster in the different image areas.
- performing an independent object recognition in the plurality of image areas includes an independent object recognition using at least one regression layer of a deep neural network, YOLO and / or SSD.
- Each image area of the respective environment image can be processed in the same way or in a different way.
- the same deep neural network can also be used to process the image information of different image areas, since objects have the same properties regardless of their position.
- YOLO is an abbreviation for "You only look once”
- SSD is an abbreviation for "Single Shot multibox Detector". Both YOLO and SSD are known as such in the prior art and are therefore not explained in detail at this point.
- YOLO as well as SSD are well suited for real-time object recognition, especially in embedded systems.
- the merging of the object recognition of the plurality of image areas for object recognition in the surrounding image includes providing a uniform resolution space for providing a list with merged object recognitions.
- the recognized objects can be made available for further processing in a uniform manner.
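A sketch of such a merging into a uniform resolution space: detections from each image area, given in cell coordinates of that area's grid, are rescaled to a common pixel coordinate space and collected in one list. The tuple layout, the example values and the handling of the vertical offset of the coarse area are illustrative assumptions.

```python
def to_pixel_space(detections, cell_size, row_offset_px=0):
    """Convert (row, col, height, width, score) detections given in cell units
    of one image area into a common pixel coordinate space."""
    merged = []
    for row, col, h, w, score in detections:
        merged.append((row * cell_size + row_offset_px, col * cell_size,
                       h * cell_size, w * cell_size, score))
    return merged

# Detections from the fine-grid area (16 px cells) and from the coarse-grid
# area (32 px cells, assumed to start 544 px below the image top).
fine = [(10.0, 55.0, 2.0, 1.0, 0.91)]
coarse = [(3.0, 20.0, 6.0, 3.0, 0.87)]

merged = to_pixel_space(fine, 16) + to_pixel_space(coarse, 32, row_offset_px=544)
print(merged)
```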
- encoding the image of the surroundings to provide a two-dimensional grid with image information and/or performing a decoding step in the at least one second image area to provide a two-dimensional grid with image information includes providing, as image information for each cell defined by the grid and for each recognized object, an object trustworthiness value, a position of a bounding box enclosing the object, the dimensions of the bounding box, and an object class probability for each object class to be recognized.
- object trustworthiness specifies how high the trust in the existence of an object is.
- the object class probability for each possible object class indicates the probability that the recognized object belongs to the corresponding object class. Further information is the position and dimensions of the bounding box that encloses the object, which enables easy handling of the recognized object. Objects can also lie at the borders of cells and extend over several of these cells, in which case they can appear as recognized objects in several cells.
- the image information preferably includes information that relates not only to the respective cell, but also to neighboring cells or other cells located in the vicinity.
- the objects can be recognized with a high degree of reliability, in particular in the case of objects that extend over more than a single cell.
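A minimal sketch of the per-cell image information described above, in the style of YOLO/SSD grid outputs; the channel layout (one objectness value, four bounding-box values and C class probabilities per cell) is a common convention and an assumption here, not a definition taken from the patent.

```python
import numpy as np

num_classes = 4
grid = np.random.rand(68, 120, 5 + num_classes)   # per cell: objectness, box(4), classes

objectness = grid[..., 0]            # trust in the existence of an object
box_xywh = grid[..., 1:5]            # bounding-box centre and dimensions (cell-relative)
class_probs = grid[..., 5:]          # one probability per object class

# Cells whose objectness exceeds a threshold are treated as object candidates.
rows, cols = np.where(objectness > 0.999)
for r, c in zip(rows, cols):
    cls = int(np.argmax(class_probs[r, c]))
    print(f"cell ({r},{c}): class {cls}, box {box_xywh[r, c].round(2)}")
```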
- FIG. 1 shows a view of an image of the surroundings with a road with lateral footpaths and a plurality of people, alone (FIG. 1a), with a fine uniform grid (FIG. 1b) and with a coarse uniform grid (FIG. 1c),
- FIG. 2 shows a view of a vehicle with a driving assistance system, in particular as an improved driver assistance system, with a camera-based environment sensor for providing an image of the environment and a control unit that receives the environmental image from the camera-based environmental sensor, according to a first, preferred embodiment,
- FIG. 3 shows a view of an image of the surroundings with a road with lateral footpaths and a plurality of people, alone and with a grid comprising an image area with a fine grid and an image area with a coarse grid, in accordance with the first embodiment,
- FIG. 4 shows a system illustration of the driving assistance system from FIG. 2,
- FIG. 5 shows a view of an image of the surroundings with a road with sidewalks and a person extending over several cells of an image area with a fine grid and several cells of an image area with a coarse grid, in accordance with the first embodiment
- FIG. 6 shows a flow diagram of a method for recognizing objects in an image of the surroundings using a neural network in accordance with the first embodiment.
- FIG. 2 shows a vehicle 10 with a driving support system 12 according to a first, preferred embodiment.
- the driving assistance system 12 is designed, for example, as an improved driver assistance system.
- improved driver assistance systems are known, for example, as ADAS (Advanced Driver Assistance Systems) and can include various functions. These functions can include, for example, a blind spot assistant, a lane departure warning system and / or a collision warning and protection system.
- the driving support system 12 can support functions up to and including autonomous driving of the vehicle 10.
- the driving assistance system 12 is shown by way of example in FIG. 2 with a camera-based environment sensor 14.
- the camera-based environment sensor 14 is an optical camera in this exemplary embodiment.
- the optical camera 14 has, for example, a resolution of approximately 2 megapixels.
- the driving support system 12 also includes a control unit 16.
- the control unit 16 is also referred to as an ECU (Electronic Control Unit) in the field of vehicles.
- the control unit 16 is embodied as an embedded device and is provided in the vehicle 10.
- the optical camera 14 is connected to the control unit 16 via a data bus 18.
- the optical camera 14 detects the surroundings 20 of the vehicle 10 and records images of the surroundings 30, which are transmitted to the control unit 16 via the data bus 18.
- the surroundings images 30 each contain a matrix with image points (pixels) which at least partially reproduce the surroundings 20 of the vehicle 10.
- the image of the surroundings 30 comprises brightness information for a plurality of colors for each pixel, for example in the RGB or other format, which is provided by the optical camera 14.
- With reference to FIGS. 3 to 6, a method for recognizing objects 36 in the image of the surroundings 30 using a neural network is described below.
- As objects 36, pedestrians are shown in the image of the surroundings 30 by way of example.
- the neural network is a convolutional neural network using deep learning. The method is carried out with the driving support system 12 described above.
- The method begins with step S100, which relates to receiving the image of the surroundings 30.
- the image of the surroundings 30 is recorded by the optical camera 14 and transmitted to the control unit 16 via the data bus 18.
- Step S110 relates to an encoding of the image of the surroundings 30 in order to provide a two-dimensional grid 38, which has a first resolution, with image information.
- a plurality of cells 40 is formed by the grid 38, image information being provided for each of the cells 40 by the encoding.
- the grid 38 thus defines an arrangement of the cells 40 with image information, the cells 40 in the exemplary embodiment described having a cell size of 16 x 16 pixels for the first resolution.
- the encoding of the image of the surroundings 30 includes an encoding of the image of the surroundings 30 with a CNN encoder 42, which is shown in FIG. 4, for providing the image information for the entire image of the surroundings 30 according to the grid 38 with the first resolution.
- the control unit 16 includes the encoder 42 and carries out the encoding of the environmental image 30.
- An object trustworthiness value, a position of a bounding box 44 which encloses the object 36, dimensions of the bounding box 44 and an object class probability for each object class to be recognized are determined as image information for each cell 40 defined by the grid 38 for each recognized object 36.
- the object trustworthiness specifies how high the trust in the existence of an object 36 is.
- the object class probability for each possible object class indicates the probability that the recognized object 36 belongs to the corresponding object class.
- Further information is the position and dimensions of the bounding box 44 which encloses the object 36.
- the image information includes information that relates not only to the respective cell 40, but also to neighboring cells 40 or other cells 40 located in the vicinity.
- Step S120 relates to subdividing the image of the surroundings 30 into a plurality of image areas 46a, 46b, 46c.
- the image of the surroundings is first divided into an upper half 50 and a lower half 52 of the image.
- the upper half of the image 50 is then divided into a first image area 46a and a fourth image area 46c.
- the fourth image area 46c is formed on the upper edge of the image 30 of the surroundings.
- the lower half of the image 52 forms a second image area 46b.
- the image of the surroundings 30 is divided along horizontal lines 48.
- a horizon of the image of the surroundings 30 lies parallel to the two horizontal lines 48, as a result of which the subdivision of the image areas 46a, 46b, 46c and in particular the establishment of the fourth image area 46c takes place based on the horizon.
- the subdivision of the surrounding image 30 into the image areas 46a, 46b, 46c takes place as a static subdivision.
- Step S120 can also be carried out at an earlier point in time, for example during the configuration of the driving support system 12.
- the subdivision of the environmental image into the image areas 46a, 46b, 46c is therefore identical for all environmental images 30 of the same type, i.e. for all environmental images 30 of the optical camera 14.
- Step S130 relates to discarding image information in the fourth image area 46c.
- the image information at the upper edge of the environmental image 30 is discarded, i.e. the image information from an upper image edge of the environmental image 30 downwards, but at a distance above the horizon, is discarded, since no relevant objects 36 are to be expected there on the road 32.
- Step S140 relates to carrying out a decoding step in the second image area 46b in order to provide a two-dimensional grid 38, which has a second resolution that is lower than the first resolution, with image information.
- the image information is fed to a decoder 54, which is implemented in the control unit 16 and carries out the decoding step.
- the decoding step comprises processing the image information with the first resolution of the surrounding image 30 in the area of the second image area 46b in order to provide therefrom the two-dimensional grid 38 with the image information with the second resolution, which is lower than the first resolution.
- This is indicated in FIG. 4 by the fact that the lower half of the image 52 at the end of the decoding step, ie after passing through the decoder 54, has a smaller size than before passing through the decoder 54.
- the grid 38 in the second image area 46b has a cell size of 32 x 32 pixels.
- the image information of the image areas 46a, 46b, 46c is then combined and further processed together, as shown in FIG. 4. This results in a combination of the two different grids 38 for one image of the surroundings 30, as is also shown in FIG. 3b).
- Step S150 relates to performing an object recognition based on the image information of the image areas 46a, 46b, 46c.
- the image information is adopted unchanged in the first image area 46a.
- the detection of objects 36 in the image of the surroundings 30 relates to a detection of the objects 36 with their position and a classification of the respective object 36, for example as a pedestrian, car, truck, tree, house, dog or the like.
- An object recognition is carried out based on the image information of the plurality of image areas 46a, 46b, 46c. In this case, an independent object recognition is carried out in the first and second image areas 46a, 46b. The object recognition of the first and second image areas 46a, 46b is then merged in order to complete the object recognition in the environmental image 30. The same principles are used for the first and second image areas 46a, 46b in order to detect and recognize the objects 36. The object recognition for the first and second image areas 46a, 46b can in principle be carried out with the same decoder 54.
- the object recognition is carried out in detail using at least one regression layer of a deep neural network, YOLO and / or SSD.
- YOLO is an abbreviation for "You only look once”
- SSD is an abbreviation for "Single Shot multibox Detector”.
- the merging of the object recognition of the first and second image areas 46a, 46b for object recognition in the environment image 30 includes providing a uniform resolution space for providing a list with merged object recognitions. As a result, the recognized objects 36 are made available in a uniform manner for further processing.
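Putting the concrete numbers of this embodiment together, the following sketch runs through steps S110 to S140 on an assumed 1920 x 1088 pixel camera image; the exact image size, the 50 % split into upper and lower halves and the four-row top band are assumptions for illustration, and the block-averaging function is only a stand-in for the CNN encoder 42 and the decoder 54.

```python
import numpy as np

image = np.random.rand(1088, 1920, 3)        # assumed ~2 MP image from camera 14

def cells(a, f):
    """Group f x f pixel (or cell) blocks by averaging -- stand-in for encoder/decoder."""
    h, w = a.shape[:2]
    return a.reshape(h // f, f, w // f, f, -1).mean(axis=(1, 3))

grid = cells(image, 16)                      # S110: first resolution, 68 x 120 cells
upper, lower = grid[:34], grid[34:]          # S120: upper half 50 / lower half 52
area_46c, area_46a = upper[:4], upper[4:]    # S120: top band 46c / first area 46a
_ = area_46c                                 # S130: image information in 46c discarded
area_46b = cells(lower, 2)                   # S140: second resolution, 32x32 px cells

print(area_46a.shape, area_46b.shape)        # (30, 120, 3) (17, 60, 3)
```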
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Software Systems (AREA)
- Medical Informatics (AREA)
- Biodiversity & Conservation Biology (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- General Engineering & Computer Science (AREA)
- Image Analysis (AREA)
- Traffic Control Systems (AREA)
Abstract
The invention relates to a method for recognizing objects (36) in an image of the surroundings (30) using a neural network, in particular a convolutional neural network using deep learning, for a driving support system (12) of a vehicle (10), comprising the steps of: receiving the image of the surroundings (30); encoding the image of the surroundings (30) to provide a two-dimensional grid (38), which has a first resolution, with image information; subdividing the image of the surroundings (30) into a plurality of image areas (46a, 46b, 46c) with at least one first image area (46a) and at least one second image area (46b); performing a decoding step in the at least one second image area (46b) to provide a two-dimensional grid (38), which has a second resolution that is lower than the first resolution, with image information; and performing an object recognition based on the image information of the plurality of image areas (46a, 46b, 46c), wherein the at least one first image area (46a) has the first resolution and the at least one second image area (46b) has the second resolution. The invention also relates to a driving support system (12) for a vehicle (10) comprising at least one camera-based environment sensor (14) for providing an image of the surroundings (30) and a control unit (16) which receives the image of the surroundings (30) from the at least one camera-based environment sensor (14), the driving support system (12) being designed to carry out the above method.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE102020112860.6A DE102020112860A1 (de) | 2020-05-12 | 2020-05-12 | Verbesserte Detektion von Objekten |
PCT/EP2021/062026 WO2021228686A1 (fr) | 2020-05-12 | 2021-05-06 | Détection améliorée d'objets |
Publications (1)
Publication Number | Publication Date |
---|---|
EP4150508A1 true EP4150508A1 (fr) | 2023-03-22 |
Family
ID=75850207
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP21723981.3A Pending EP4150508A1 (fr) | 2020-05-12 | 2021-05-06 | Détection améliorée d'objets |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP4150508A1 (fr) |
DE (1) | DE102020112860A1 (fr) |
WO (1) | WO2021228686A1 (fr) |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102011121473A1 (de) | 2011-12-17 | 2013-06-20 | Valeo Schalter Und Sensoren Gmbh | Verfahren zum Anzeigen von Bildern auf einer Anzeigeeinrichtung eines Kraftfahrzeugs,Fahrerassistenzeinrichtung, Kraftfahrzeug und Computerprogramm |
EP2696310B1 (fr) | 2012-08-10 | 2017-10-18 | Delphi Technologies, Inc. | Procédé destiné à identifier un bord de route |
DE102013201545A1 (de) | 2013-01-30 | 2014-07-31 | Bayerische Motoren Werke Aktiengesellschaft | Erstellen eines Umfeldmodells für ein Fahrzeug |
US9305214B1 (en) * | 2013-10-29 | 2016-04-05 | The United States Of America, As Represented By The Secretary Of The Navy | Systems and methods for real-time horizon detection in images |
DE102015212771A1 (de) | 2015-07-08 | 2017-01-12 | Bayerische Motoren Werke Aktiengesellschaft | Vorrichtung zur Erkennung von teilverdeckten beweglichen Objekten für ein Umfelderfassungssystem eines Kraftfahrzeugs |
DE102017130488A1 (de) | 2017-12-19 | 2019-06-19 | Valeo Schalter Und Sensoren Gmbh | Verfahren zur Klassifizierung von Parklücken in einem Umgebungsbereich eines Fahrzeugs mit einem neuronalen Netzwerk |
US10949711B2 (en) * | 2018-04-23 | 2021-03-16 | Intel Corporation | Non-maximum suppression of features for object detection |
DE102018114229A1 (de) | 2018-06-14 | 2019-12-19 | Connaught Electronics Ltd. | Verfahren zum Bestimmen eines Bewegungszustands eines Objekts in Abhängigkeit einer erzeugten Bewegungsmaske und eines erzeugten Begrenzungsrahmens, Fahrerassistenzsystem sowie Kraftfahrzeug |
- 2020
  - 2020-05-12 DE DE102020112860.6A patent/DE102020112860A1/de active Pending
- 2021
  - 2021-05-06 WO PCT/EP2021/062026 patent/WO2021228686A1/fr unknown
  - 2021-05-06 EP EP21723981.3A patent/EP4150508A1/fr active Pending
Also Published As
Publication number | Publication date |
---|---|
DE102020112860A1 (de) | 2021-11-18 |
WO2021228686A1 (fr) | 2021-11-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP2394234B1 (fr) | Procédé et dispositif de détermination d'un marquage de voie de circulation en vigueur | |
EP2179381B1 (fr) | Procédé et dispositif servant à la reconnaissance de panneaux de signalisation routière | |
DE102013205950B4 (de) | Verfahren zum Detektieren von Straßenrändern | |
DE69624980T2 (de) | Objektüberwachungsverfahren und -gerät mit zwei oder mehreren Kameras | |
DE112013001858T5 (de) | Mehrfachhinweis-Objekterkennung und -Analyse | |
DE19955919C1 (de) | Verfahren zur Erkennung von Objekten in Bildern auf der Bildpixelebene | |
DE102017203276B4 (de) | Verfahren und Vorrichtung zur Ermittlung einer Trajektorie in Off-road-Szenarien | |
EP2396746A2 (fr) | Procédé de détection d'objets | |
DE102011111440A1 (de) | Verfahren zur Umgebungsrepräsentation | |
WO2014032904A1 (fr) | Procédé et dispositif de détection de la position d'un véhicule sur une voie de circulation | |
DE102016210534A1 (de) | Verfahren zum Klassifizieren einer Umgebung eines Fahrzeugs | |
EP3520023B1 (fr) | Détection et validation d'objets provenant d'images séquentielles d'une caméra | |
DE102012000459A1 (de) | Verfahren zur Objektdetektion | |
DE102018121008A1 (de) | Kreuzverkehrserfassung unter verwendung von kameras | |
DE102013012930A1 (de) | Verfahren zum Bestimmen eines aktuellen Abstands und/oder einer aktuellen Geschwindigkeit eines Zielobjekts anhand eines Referenzpunkts in einem Kamerabild, Kamerasystem und Kraftfahrzeug | |
DE102009022278A1 (de) | Verfahren zur Ermittlung eines hindernisfreien Raums | |
WO2020020654A1 (fr) | Procédé pour faire fonctionner un système d'aide à la coduite doté deux dispositifs de détection | |
DE102020209605A1 (de) | Fahrzeug und verfahren zu dessen steuerung | |
DE102020204840A1 (de) | Prozessierung von Mehrkanal-Bilddaten einer Bildaufnahmevorrichtung durch einen Bilddatenprozessor | |
WO2021228686A1 (fr) | Détection améliorée d'objets | |
WO2019057252A1 (fr) | Procédé et dispositif de détection de voies de circulation, système d'aide à la conduite et véhicule | |
DE102019132012B4 (de) | Verfahren und System zur Detektion von kleinen unklassifizierten Hindernissen auf einer Straßenoberfläche | |
DE102015112389A1 (de) | Verfahren zum Erfassen zumindest eines Objekts auf einer Straße in einem Umgebungsbereich eines Kraftfahrzeugs, Kamerasystem sowie Kraftfahrzeug | |
WO2015074915A1 (fr) | Ensemble de filtres et procédé de fabrication d'un ensemble de filtres | |
DE102017115475A1 (de) | Verfahren zum Erkennen eines Hindernisses in einem Umgebungsbereich eines Kraftfahrzeugs, Auswerteeinrichtung, Fahrerassistenzsystem sowie Kraftfahrzeug |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: UNKNOWN |
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
| PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
| 17P | Request for examination filed | Effective date: 20221108 |
| AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
| DAV | Request for validation of the european patent (deleted) | |
| DAX | Request for extension of the european patent (deleted) | |