WO2024028242A1 - Determination of a region of interest from camera images - Google Patents

Determination of a region of interest from camera images

Info

Publication number
WO2024028242A1
Authority
WO
WIPO (PCT)
Prior art keywords
camera
image
computer
images
interest
Prior art date
Application number
PCT/EP2023/071105
Other languages
English (en)
Inventor
Rekha RAMACHANDRA
Jonathan Horgan
Prashanth Viswanath
Ciaran Hughes
Original Assignee
Connaught Electronics Ltd.
Priority date
Filing date
Publication date
Application filed by Connaught Electronics Ltd. filed Critical Connaught Electronics Ltd.
Publication of WO2024028242A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/12 Edge-based segmentation
    • G06T7/174 Segmentation; Edge detection involving the use of two or more images
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30248 Vehicle exterior or interior
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle

Definitions

  • the present invention is directed to a computer-implemented method for determining a region of interest from a sequence of camera images.
  • the invention is further directed to a computer-implemented method for computer vision, to a method for guiding a vehicle at least in part automatically, to a computer vision system, to an electronic vehicle guidance system, to a computer program and to a computer-readable storage medium.
  • In advanced driver assistance systems, ADAS, and other systems for guiding a vehicle at least in part automatically, camera images from one or more cameras mounted to the vehicle represent an essential basis for perceiving the vehicle's environment.
  • the camera images are for example automatically processed or analyzed and/or computer vision tasks, such as object detection, depth estimation, semantic segmentation, optical flow detection, stereo disparity matching, object tracking and so forth, make use of the camera images.
  • the vehicle may be guided at least in part automatically, for example by assisting a driver by means of an ADAS and/or by automatically or semi-automatically affecting the lateral and/or longitudinal control of the vehicle.
  • the data of multiple cameras of the vehicle and/or of further sensor systems are processed individually or in combination with each other.
  • the higher the pixel resolution of the camera images, the more detailed the results that may be obtained from them. All this leads to a vast amount of sensor data, which has to be processed by the embedded computing units of the vehicle. Consequently, it is a general objective to reduce the computational load, in particular in terms of runtime and/or memory requirements, when processing the sensor inputs, in particular the camera images.
  • non-rectilinear cameras, in particular fisheye cameras, are commonly used, for example, as rear-facing cameras or other cameras of surround view systems of the vehicle.
  • One drawback of such cameras is that they may capture relatively large portions of the vehicle body and/or the camera mount housing themselves. These are static components in the scene and do not provide any useful information about the dynamic scene to be captured and analyzed. Therefore, capturing and processing these components leads to an unnecessary increase of the system resources like runtime and memory without providing a real benefit.
  • the vehicle body in the camera image may lead to false positive object detections, for example due to light reflection effects. This may lead to issues particularly for safety critical applications such as automated parking or driving.
  • Document WO 2019/072911 A1 describes a method for determining a region of interest in an image.
  • a mean region of interest given by a plurality of reference points is provided and a plurality of displacement parameters is determined to displace the mean region of interest in order to obtain an improved region of interest.
  • the mean region of interest and the plurality of reference points are obtained in a training phase based on manually annotated training images.
  • the invention is based on the idea to sum up individual edge images obtained from a sequence of camera images by applying an edge detection algorithm and to determine a contour of maximum length based on the summed up edge images.
  • the region of interest is determined as an interior of a convex polygon, which approximates a convex hull of the contour of maximum length.
  • a computer-implemented method for determining a region of interest from a sequence of camera images of a camera of a vehicle, in particular a motor vehicle is provided. For each camera image of the sequence of camera images, a respective individual edge image is generated by applying an edge detection algorithm, in particular to the respective camera image or to a pre-processed image depending on the respective camera image.
  • a joint edge image is generated by summing the individual edge images, in particular by summing all of the individual edge images for the sequence of camera images.
  • a contour of maximum length is determined depending on the joint edge image and a convex hull of the contour is computed.
  • a convex polygon approximating the convex hull is determined, wherein an interior of the convex polygon corresponds to the region of interest or, in other words, the region of interest is determined as the interior of the convex polygon.
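As an illustration of the overall pipeline just described, the following is a minimal end-to-end sketch assuming OpenCV (cv2) and NumPy; the function name, the LoG-style edge detector and all parameter values are illustrative choices, not prescribed by the text.

```python
import cv2
import numpy as np

def region_of_interest_polygon(frames, n_max_corners=30):
    # One individual edge image per camera image (a LoG-style detector
    # is used here; the text also allows e.g. a Canny edge detector).
    edge_sum = None
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        blurred = cv2.GaussianBlur(gray, (3, 3), 0)
        edges = cv2.convertScaleAbs(cv2.Laplacian(blurred, cv2.CV_16S, ksize=3))
        edge_sum = edges.astype(np.float64) if edge_sum is None else edge_sum + edges

    # Joint edge image: pixel-wise sum, renormalized to 8 bit.
    joint = cv2.normalize(edge_sum, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

    # Binary image by auto-thresholding, then noise filtering.
    _, binary = cv2.threshold(joint, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    binary = cv2.medianBlur(binary, 5)

    # Contour of maximum length, its convex hull, and a polygon fit.
    contours, _ = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
    longest = max(contours, key=lambda c: cv2.arcLength(c, True))
    hull = cv2.convexHull(longest)
    eps = 0.002 * cv2.arcLength(hull, True)
    poly = cv2.approxPolyDP(hull, eps, True)
    while len(poly) > n_max_corners:
        eps *= 1.5
        poly = cv2.approxPolyDP(hull, eps, True)
    return poly  # the interior of this convex polygon is the region of interest
```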
  • all steps of the computer-implemented method may be performed by at least one computing unit, in particular of the vehicle, which may also be denoted as a data processing apparatus.
  • the at least one computing unit comprises at least one processing circuit, which is configured or adapted to perform the steps of the computer-implemented method.
  • the at least one computing unit may for example store a computer program comprising instructions which, when executed by the at least one computing unit, in particular the at least one processing circuit, cause the at least one computing unit to execute the computer-implemented method.
  • a corresponding method, which is not purely computer-implemented, is directly obtained by including the step of generating the sequence of camera images by means of the camera.
  • a computing unit may in particular be understood as a data processing device, which comprises processing circuitry.
  • the computing unit can therefore in particular process data to perform computing operations. This may also include operations to perform indexed accesses to a data structure, for example a look-up table, LUT.
  • the computing unit may include one or more computers, one or more microcontrollers, and/or one or more integrated circuits, for example, one or more application-specific integrated circuits, ASIC, one or more field-programmable gate arrays, FPGA, and/or one or more systems on a chip, SoC.
  • the computing unit may also include one or more processors, for example one or more microprocessors, one or more central processing units, CPU, one or more graphics processing units, GPU, and/or one or more signal processors, in particular one or more digital signal processors, DSP.
  • the computing unit may also include a physical or a virtual cluster of computers or other of said units.
  • the computing unit includes one or more hardware and/or software interfaces and/or one or more memory units.
  • a memory unit may be implemented as a volatile data memory, for example a dynamic random access memory, DRAM, or a static random access memory, SRAM, or as a nonvolatile data memory, for example a read-only memory, ROM, a programmable read-only memory, PROM, an erasable programmable read-only memory, EPROM, an electrically erasable programmable read-only memory, EEPROM, a flash memory or flash EEPROM, a ferroelectric random access memory, FRAM, a magnetoresistive random access memory, MRAM, or a phase-change random access memory, PCRAM.
  • the sequence of camera images is generated by the camera, which is mounted to the vehicle, and received from the camera, in particular by the at least one computing unit.
  • the sequence of camera images may for example be considered as a part of a camera image stream generated by the camera. Consequently, the camera images of the sequence correspond to a plurality of subsequent frames of the camera image stream.
  • the total number of camera images of the sequence may be two or more, for example five or more, preferably ten or more.
  • the sequence of camera images is, in particular, generated by the camera while the vehicle is moving with respect to its environment.
  • the region of interest may be considered as a part of the camera images, which depicts an environment of the vehicle rather than components of the camera or the vehicle itself.
  • the edge detection algorithm which may be a known edge detection algorithm such as a Canny edge detector or a Laplacian-of-Gaussian filter, also denoted as Laplacian-over-Gaussian filter or LoG filter, may be applied directly to the respective camera image.
  • a pre-processed image may be computed and the edge detection algorithm may be applied to the pre-processed image. Consequently, the individual edge image is a representation of the edges present in the respective camera image or pre-processed image.
  • An edge may for example be understood as a separation or separation line of areas of different tonal value, brightness and/or color in the camera image or pre-processed image, respectively.
  • the individual edge images may correspond to single channel images or, in other words, monochrome images, wherein a pixel value of an individual edge image is the larger, the larger the local gradient of tonal value, brightness or another suitable quantity is.
  • the edge detection algorithm computes the respective pixel value for each pixel of the camera image or pre-processed image, respectively.
  • Summing the individual edge images to generate the joint edge image may for example comprise a pixel-wise summation of the pixel values of the individual edge images.
  • a pixel value of the joint edge image is then given by the sum of all pixel values at the respective pixel position of all of the individual edge images.
  • the summation may also involve a renormalization to limit the pixel values of the joint edge image to a certain maximum value.
  • the pixel value of the joint edge image corresponds to the sum of pixel values at the respective pixel position of the individual edge images, if this sum is smaller than a predefined maximum value, and to the maximum value otherwise.
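A minimal sketch of this clamped pixel-wise summation, assuming NumPy arrays of equal shape; the maximum value of 255 is an assumed choice.

```python
import numpy as np

def sum_edge_images(edge_images, max_value=255):
    # Pixel-wise sum of the individual edge images, clamped at the
    # predefined maximum value.
    total = np.zeros(edge_images[0].shape, dtype=np.int64)
    for edges in edge_images:
        total += edges
    return np.clip(total, 0, max_value).astype(np.uint8)
```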
  • the contour of maximum length may be a contour of maximum length of the joint edge image.
  • the joint edge image may be processed, for example by applying one or more filters, and the contour may then be a contour of maximum length of the processed joint edge image.
  • a contour may be understood as a plurality of connected pixels, which have the same pixel values or whose pixel values differ by less than a predefined tolerance range.
  • the contour could correspond to a plurality of connected white pixels.
  • the contour of maximum length is the longest of all such contours, for example of the joint edge image or the processed joint edge image, respectively.
  • each pixel has a pixel position defined by a row and a column
  • each pixel except boundary pixels of the first or last row and of the first or last column, has eight neighboring pixels.
  • the boundary pixels have only three or five neighboring pixels, depending on whether they are located in a corner of the array or not.
  • the neighborhood of a pixel can be defined differently as well. For example only top, bottom, right and left neighbors may be considered but not diagonal neighbors, et cetera. Two pixels can be considered to be connected, if one of them is a neighboring pixel of the other.
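For illustration only, the two neighborhood conventions can be written as offset tables; the helper below is hypothetical and not part of the patent.

```python
# 8-connectivity includes diagonal neighbors; 4-connectivity does not.
NEIGHBORS_8 = [(-1, -1), (-1, 0), (-1, 1),
               ( 0, -1),          ( 0, 1),
               ( 1, -1), ( 1, 0), ( 1, 1)]
NEIGHBORS_4 = [(-1, 0), (0, -1), (0, 1), (1, 0)]

def neighbors(row, col, height, width, offsets=NEIGHBORS_8):
    # Boundary pixels receive fewer neighbors (five on an edge, three in a corner).
    return [(row + dr, col + dc) for dr, dc in offsets
            if 0 <= row + dr < height and 0 <= col + dc < width]
```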
  • the convex polygon is determined by fitting a polygon with a predefined number of corners or with a number of corners within a predefined range to the convex hull.
  • the convex hull which is also denoted as a convex envelope or convex enclosure, may be considered to represent the smallest convex set that contains all pixels of the contour of maximum length.
  • Various known algorithms for computing the convex hull may be used. For example, for computing the convex hull of a set of discrete points in two dimensions, the Graham scan algorithm or another suitable algorithm may be used.
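As one such "other suitable algorithm", the following is a minimal sketch of Andrew's monotone chain construction, a close relative of the Graham scan, for points given as (x, y) tuples; the function names are illustrative.

```python
def _cross(o, a, b):
    # z-component of the cross product (OA x OB); > 0 means a left turn
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    # Andrew's monotone chain: sort points, then build lower and upper hull.
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # counter-clockwise hull, no duplicate endpoints
```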
  • the convex polygon is, in particular, a polygon with N corners, wherein N is preferably greater than 4.
  • N may be a number from 5 to 50, for example from 10 to 30.
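One plausible way to obtain a polygon whose corner count lies in such a range is to coarsen a Douglas-Peucker approximation of the convex hull until the count drops below the upper bound; the search strategy and constants below are assumptions, assuming OpenCV.

```python
import cv2

def fit_convex_polygon(hull, n_max=30):
    # Increase the approximation tolerance until the polygon has at
    # most n_max corners; starting epsilon and growth factor are assumed.
    eps = 0.001 * cv2.arcLength(hull, True)
    poly = cv2.approxPolyDP(hull, eps, True)
    while len(poly) > n_max:
        eps *= 1.5
        poly = cv2.approxPolyDP(hull, eps, True)
    return poly
```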
  • the computer-implemented method according to the invention allows for a fully automated detection of the region of interest for a camera, which is mounted to a vehicle.
  • the computer-implemented method requires only the sequence of camera images as input data but not a preliminary version of the region of interest, which would have to be determined at least in part manually.
  • the invention exploits the fact that the boundary in the camera images defining the region of interest separates static components in the camera images, in particular components of the vehicle or the camera mount itself, from potentially dynamic objects in the environment of the vehicle, which move with respect to the camera when the vehicle is moving.
  • the edges in the individual edge images indicating the region of interest do not change their position, or do not change it significantly, across subsequent camera images, while the edges in the active region, in other words in the portion of the camera images depicting the environment of the vehicle, potentially change their position from one camera image to the next.
  • the edges defining the contour are amplified, while the edges inside the region of interest are effectively washed out. In this way, it becomes possible to define the convex polygon, whose interior represents the region of interest, depending on the convex hull of the contour computed based on the joint edge image.
  • an image mask may be generated and stored depending on the convex polygon.
  • the image mask may be a binary image, wherein pixels inside and outside of the region of interest or, in other words in the interior and exterior of the convex polygon, have different pixel values.
  • each pixel of the image mask is associated with either a first predefined value, for example 0, or a second predefined value, for example 1 or another non-zero integer. All pixels in the interior of the convex polygon have the second value and all pixels in the exterior of the convex polygon have the first value, or vice versa.
  • the image mask may then be used for example to generate a masked further camera image by combining a further camera image of the camera with the image mask in a known way.
  • the masked further camera image may then be used for further processing, for example for carrying out a computer vision task and/or guiding the vehicle at least in part automatically.
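A minimal sketch of generating such a binary image mask from the convex polygon and applying it to a further camera image, assuming OpenCV; the function names are illustrative.

```python
import cv2
import numpy as np

def make_mask(image_shape, polygon):
    # Binary image mask: second value (255) inside the convex polygon,
    # first value (0) outside.
    mask = np.zeros(image_shape[:2], dtype=np.uint8)
    cv2.fillPoly(mask, [np.asarray(polygon, dtype=np.int32)], 255)
    return mask

def apply_mask(image, mask):
    # Masked further camera image: pixels outside the region of interest
    # are zeroed out and can be ignored by downstream computer vision tasks.
    return cv2.bitwise_and(image, image, mask=mask)
```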
  • a binary image is generated based on the joint edge image by thresholding and the contour is determined depending on the binary image.
  • While the joint edge image is, for example, a monochrome image, it is not necessarily a black-and-white or binary image; it may, for example, be represented as a grayscale image or the like.
  • the binary image obtained by thresholding the joint edge image consists of pixels, which are associated either with the first value or the second value.
  • the binary image may be represented as a black and white image.
  • Thresholding the joint edge image may include assigning the second value to each pixel of the joint edge image whose pixel value is equal to or greater than a predefined threshold value, and assigning the first value to all other pixels, or vice versa.
  • the predefined threshold value may be a fixed predefined number or may be determined automatically based on the joint edge image, for example depending on a difference between a maximum pixel value and a minimum pixel value of all pixel values of the joint edge image. In other words, the thresholding may be implemented as auto-thresholding.
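A sketch of such an auto-threshold based on the dynamic range of the joint edge image; the fraction alpha is an assumed tuning parameter, not a value from the text.

```python
import numpy as np

def auto_threshold(joint_edge, alpha=0.25):
    # Place the threshold at a fixed fraction of the range between the
    # minimum and maximum pixel value, then binarize: second value (255)
    # at or above the threshold, first value (0) below.
    lo, hi = float(joint_edge.min()), float(joint_edge.max())
    threshold = lo + alpha * (hi - lo)
    return np.where(joint_edge >= threshold, 255, 0).astype(np.uint8)
```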
  • the contour may be determined in a more consistent and reliable way.
  • the contour may be defined as a plurality of connected pixels having the second value assigned. Consequently, there is no additional parameter involved that would define a tolerance range for pixel values being similar enough to be considered as belonging to a common contour. In this way, the robustness and reliability are further improved.
  • the binary image is filtered by applying a noise filter, for example a median filter, in particular to the binary image.
  • the contour is determined depending on the filtered binary image.
  • the contour may be determined as contour of maximum length of the filtered binary image.
  • the noise filter can be a known filter for image processing with the objective to reduce noise, in particular salt and pepper noise, which may result from the thresholding process to generate the binary image.
  • a median filter has proven to be particularly suitable in the present application.
  • the median filter may also be denoted as a median blur filter.
  • the contour is determined as a plurality of connected pixels of the filtered binary image with the second value.
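A sketch of this filtering and contour-extraction step, assuming OpenCV; the median kernel size of 5 is an assumption.

```python
import cv2

def longest_contour(binary):
    # Median filtering removes salt-and-pepper noise left by thresholding;
    # afterwards the contour of maximum length (perimeter) is selected.
    filtered = cv2.medianBlur(binary, 5)
    contours, _ = cv2.findContours(filtered, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
    return max(contours, key=lambda c: cv2.arcLength(c, True))
```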
  • the edge detection algorithm comprises a Laplacian filter.
  • the Laplacian filter may for example be applied by means of a convolutional operation with a discrete convolutional kernel, for example a 3x3 kernel.
  • the following kernels may be used, the second of which is particularly sensitive also to edges of 45°; the kernel matrices themselves are not reproduced in this text, but standard choices are sketched below.
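The original kernel matrices are not reproduced here; the standard 4-neighbor discrete Laplacian and its 8-neighbor variant, the latter of which also responds to diagonal (45°) edges, are plausible candidates for the kernels meant.

```latex
K_4 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix},
\qquad
K_8 = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1 \end{pmatrix}
```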
  • the edge detection algorithm comprises a Laplacian-of-Gaussian or LoG filter.
  • a Gaussian filter also denoted as Gaussian blur filter or Gaussian noise filter, is applied prior to applying the Laplacian filter.
  • a single convolutional kernel which represents the Gaussian as well as the Laplacian filter may be applied.
  • the accuracy of the edge detection may be improved by using the LoG filter compared to a Laplacian filter without a Gaussian blur.
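A minimal LoG sketch along these lines, assuming OpenCV; the blur and kernel sizes are assumed values.

```python
import cv2

def log_edges(gray):
    # Gaussian blur first (noise suppression), then the Laplacian
    # (second derivative); together they form the LoG filter.
    blurred = cv2.GaussianBlur(gray, (3, 3), 0)
    return cv2.convertScaleAbs(cv2.Laplacian(blurred, cv2.CV_16S, ksize=3))
```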
  • the edge detection algorithm comprises a Canny edge detector.
  • a Canny edge detector with a kernel size of 9x9 may be used.
  • a lower threshold for the Canny edge detector may be for example 50 and an upper threshold may be for example 150.
  • these values may also be tuned to achieve optimum results.
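A sketch with these threshold values, assuming OpenCV; note that OpenCV's Canny implementation limits the Sobel aperture to 3, 5 or 7, so the 9x9 kernel is read here, as an assumption, as a Gaussian smoothing kernel applied beforehand.

```python
import cv2

def canny_edges(gray):
    # Smooth with a 9x9 Gaussian kernel (assumed interpretation of the
    # "kernel size"), then apply Canny with the stated thresholds.
    blurred = cv2.GaussianBlur(gray, (9, 9), 0)
    return cv2.Canny(blurred, 50, 150)
```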
  • the edge detection algorithm is applied to each camera image of the sequence to generate the individual edge images.
  • alternatively, a respective pre-processed image may be generated and the edge detection algorithm is applied to the pre-processed images.
  • the pre-processing may for example comprise subtracting subsequent camera images from each other, in particular subtracting the respective pixel values from each other in a pixel-wise manner.
  • For each pair of subsequent camera images of the sequence, a respective difference image is generated by subtracting the camera images of the respective pair from each other.
  • the edge detection algorithm is applied to each of the difference images to generate the individual edge images.
  • each pair of subsequent images comprises a first and a second camera image and the first camera image is subtracted from the second or vice versa, wherein the second camera image directly follows the first camera image.
  • the subtraction is for example carried out for each channel separately.
  • if the sequence contains N camera images, the total number of pairs of subsequent camera images of the sequence is equal to N-1.
  • the respective camera image may be subtracted from its preceding camera image of the sequence or vice versa.
  • the initial camera image may be subtracted from an additional camera image, which precedes the sequence or vice versa.
  • an additional difference image is generated, such that the total number of difference images and, consequently, the total number of individual edge images, is equal to N.
  • a reference image may be subtracted from the initial camera image or vice versa.
  • the reference image may for example be an image, whose pixel values are all constant.
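A sketch of this difference-image pre-processing, assuming OpenCV; absolute differences are used here, which is one plausible reading of "subtracting ... from each other".

```python
import cv2

def difference_images(frames, reference=None):
    # One difference image per pair of subsequent frames; optionally an
    # additional one for the initial frame against a reference image
    # (e.g. a constant image), so that N frames yield N difference images.
    diffs = []
    if reference is not None:
        diffs.append(cv2.absdiff(frames[0], reference))
    for prev, curr in zip(frames, frames[1:]):
        diffs.append(cv2.absdiff(curr, prev))
    return diffs
```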
  • for each camera image of a further sequence of camera images of the camera, a respective further individual edge image is generated by applying the edge detection algorithm.
  • a further joint edge image is generated by summing the further individual edge images.
  • At least one further contour of maximum length is determined depending on the further joint edge image.
  • a further convex hull of the further contour is computed and a further convex polygon approximating the further convex hull is determined.
  • the further convex polygon is compared to the convex polygon, and a validation message is generated depending on a result of the comparison.
  • the camera is mounted at a fixed position of the motor vehicle. Consequently, up to certain tolerances, the convex polygon should not change over time or, in other words, the further convex polygon should be approximately identical to the convex polygon. If this is not the case, however, this may indicate a change in the mounting position of the camera with respect to the vehicle, for example due to a crash or degradation of mechanical components et cetera.
  • the camera is mounted such that it can be moved between two or more different positions with respect to the vehicle, for example in a side mirror of the vehicle. In this case, a deviation between the convex polygon and the further convex polygon could also indicate a change in position of the camera due to this reason.
  • the validation message may for example contain the information that, up to the predefined tolerances, the convex polygon matches the further convex polygon or the information that it does not match the further convex polygon. Based on this information, a subsequent process, for example a computer vision algorithm or another process for guiding the vehicle in part automatically or automatically, may decide not to take into account the camera images generated by the particular camera or assign a reduced confidence to those camera images.
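One plausible way to compare the two polygons "up to the predefined tolerances" is to rasterize both and check the intersection-over-union of the resulting masks; the helper and the tolerance value below are assumptions, assuming OpenCV.

```python
import cv2
import numpy as np

def polygons_match(poly_a, poly_b, image_shape, iou_min=0.98):
    # Rasterize both convex polygons and compare the resulting masks
    # via their intersection-over-union.
    mask_a = np.zeros(image_shape[:2], dtype=np.uint8)
    mask_b = np.zeros(image_shape[:2], dtype=np.uint8)
    cv2.fillPoly(mask_a, [np.asarray(poly_a, dtype=np.int32)], 255)
    cv2.fillPoly(mask_b, [np.asarray(poly_b, dtype=np.int32)], 255)
    inter = np.logical_and(mask_a > 0, mask_b > 0).sum()
    union = np.logical_or(mask_a > 0, mask_b > 0).sum()
    return bool(union) and inter / union >= iou_min
```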
  • a computer-implemented method for computer vision comprises carrying out a computer-implemented method for determining a region of interest according to the invention.
  • a further camera image, in particular a second further camera image, is received from the camera, and the further camera image is masked depending on the region of interest, in particular depending on the image mask.
  • a computer vision task is carried out based on the masked further camera image.
  • the computer vision task may ignore information contained in the masked further camera image outside of the region of interest.
  • a computer vision task which may also be denoted as a visual perception task, may for example be understood as a task for extracting visually perceivable information from image data.
  • the visual perception task may, in principle, be carried out by a human, who is able to visually perceive an image corresponding to the image data. In the present context, however, visual perception tasks are performed automatically without requiring the support by a human.
  • Typical visual perception tasks include object detection tasks, detecting bounding boxes for objects, semantic segmentation tasks, size regression of objects, height regression of objects et cetera.
  • a method for guiding a vehicle at least in part automatically is provided.
  • a computer-implemented method for determining a region of interest according to the invention is carried out, in particular by at least one computing unit of the vehicle.
  • a further camera image, in particular a second further camera image is received from the camera and the further camera image, in particular the second further camera image, is masked depending on the region of interest, in particular depending on the image mask.
  • At least one control signal for guiding the vehicle at least in part automatically is generated depending on the masked further camera image, in particular by the at least one computing unit.
  • the at least one computing unit may carry out a computer-implemented method for computer vision according to the invention and the at least one control signal is generated depending on a result of the computer vision task.
  • the at least one control signal may be used by an electronic vehicle guidance system for assisting a driver of the vehicle and/or for implementing another function for guiding the vehicle at least in part automatically.
  • the at least one control signal may be transmitted to one or more actuators of the vehicle, which affect a lateral and/or longitudinal control of the vehicle depending on the at least one control signal.
  • if a use case or use situation arises in the method which is not explicitly described here, an error message and/or a prompt for user feedback may be output and/or a default setting and/or a predetermined initial state may be set.
  • a computer vision system comprising at least one computing unit.
  • the at least one computing unit is configured to carry out a computer-implemented method for determining a region of interest according to the invention and/or a computer-implemented method for computer vision according to the invention.
  • the computer vision system further comprises the camera, which is configured to generate the sequence of camera images.
  • the camera is a non-rectilinear camera.
  • a non-gnomonic or non-rectilinear camera can be understood as a camera with a non-gnomonic or non-rectilinear lens unit.
  • a non-gnomonic or non-rectilinear lens unit can be understood as a lens unit, that is one or more lenses, having a non-gnomonic, that is non-rectilinear, mapping function, also denoted as curvilinear mapping function.
  • fisheye cameras represent non- gnomonic or non-rectilinear cameras.
  • the mapping function of the lens unit can be understood as a function r(θ) mapping an angle θ from the center axis of radial distortion of the lens unit to a radial shift r out of the image center.
  • the function depends parametrically on the focal length f of the lens unit.
  • a rectilinear lens unit maps straight lines in the real world to straight lines in the image, at least up to lens imperfections.
  • a non-rectilinear or curvilinear lens unit does, in general, not map straight lines to straight lines in the image.
  • the mapping function of a non-rectilinear camera can be stereographic, equidistant, equisolid angle or orthographic. Mapping functions of non- rectilinear lens units may also be given, at least approximately, by polynomial functions.
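For reference, these are the standard closed forms of the named mapping functions, with f the focal length and θ the angle from the center axis of radial distortion; the rectilinear case is included for comparison.

```latex
r_{\mathrm{rectilinear}}(\theta) = f\,\tan\theta, \qquad
r_{\mathrm{stereographic}}(\theta) = 2f\,\tan(\theta/2), \qquad
r_{\mathrm{equidistant}}(\theta) = f\,\theta,
```
```latex
r_{\mathrm{equisolid}}(\theta) = 2f\,\sin(\theta/2), \qquad
r_{\mathrm{orthographic}}(\theta) = f\,\sin\theta
```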
  • an electronic vehicle guidance system which comprises a computer vision system according to the invention.
  • the at least one computing unit is configured to receive a further camera image and mask the further camera image depending on the region of interest and to generate at least one control signal for guiding the vehicle at least in part automatically depending on the masked further camera image.
  • An electronic vehicle guidance system may be understood as an electronic system, configured to guide a vehicle in a fully automated or a fully autonomous manner and, in particular, without a manual intervention or control by a driver or user of the vehicle being necessary.
  • the vehicle carries out all required functions, such as steering maneuvers, deceleration maneuvers and/or acceleration maneuvers as well as monitoring and recording the road traffic and corresponding reactions automatically.
  • the electronic vehicle guidance system may implement a fully automatic or fully autonomous driving mode according to level 5 of the SAE J3016 classification.
  • An electronic vehicle guidance system may also be implemented as an advanced driver assistance system, ADAS, assisting a driver for partially automatic or partially autonomous driving.
  • the electronic vehicle guidance system may implement a partly automatic or partly autonomous driving mode according to levels 1 to 4 of the SAE J3016 classification.
  • SAE J3016 refers to the respective standard dated June 2018.
  • Guiding the vehicle at least in part automatically may therefore comprise guiding the vehicle according to a fully automatic or fully autonomous driving mode according to level 5 of the SAE J3016 classification. Guiding the vehicle at least in part automatically may also comprise guiding the vehicle according to a partly automatic or partly autonomous driving mode according to levels 1 to 4 of the SAE J3016 classification.
  • a first computer program comprising first instructions.
  • When the first instructions are executed by at least one computing unit, for example by at least one computing unit of a computer vision system according to the invention, the first instructions cause the at least one computing unit to carry out a computer-implemented method for determining a region of interest according to the invention and/or a computer-implemented method for computer vision according to the invention.
  • a second computer program comprising second instructions.
  • When the second instructions are executed by an electronic vehicle guidance system according to the invention, in particular by the at least one computing unit of the electronic vehicle guidance system, the second instructions cause the electronic vehicle guidance system to carry out a method for guiding a vehicle at least in part automatically according to the invention.
  • a computer readable storage medium storing a first and/or second computer program according to the invention is provided.
  • Fig. 1 shows schematically a vehicle with an exemplary implementation of an electronic vehicle guidance system according to the invention
  • Fig. 2 shows schematically a camera image
  • Fig. 3 shows schematically the camera image of Fig. 2 with a rectangular region of interest
  • Fig. 4 shows schematically the camera image of Fig. 2 with a further rectangular region of interest
  • Fig. 5 shows schematically a contour of maximum length generated in the course of an exemplary implementation of a computer-implemented method for determining a region of interest according to the invention.
  • Fig. 6 shows schematically an image mask with a region of interest determined by means of a further exemplary implementation of a computer-implemented method for determining a region of interest according to the invention.
  • Fig. 1 shows schematically a motor vehicle 1 with an exemplary implementation of an electronic vehicle guidance system 3 according to the invention.
  • the electronic vehicle guidance system 3 comprises an exemplary implementation of a computer vision system 2 according to the invention.
  • the computer vision system 2 comprises a computing unit 4 and a camera 5, for example a fisheye camera or another non-rectilinear camera, which is mounted to the vehicle 1.
  • the computing unit 4 may also represent two or more computing units, such as electronic control units, ECUs, of the vehicle 1 in some implementations.
  • the computer vision system 2 or the vehicle 1 may optionally comprise further cameras 5.
  • the electronic vehicle guidance system 3 may carry out an exemplary implementation of a method for guiding the vehicle 1 at least in part automatically according to the invention.
  • the computing unit 4 is configured to carry out a computer-implemented method for determining a region of interest 12 (see Fig. 6) from a sequence of camera images 6 (see Fig. 2) of the camera 5 according to the invention.
  • the computing unit 4 may determine the region of interest 12 as an interior region of a convex polygon 7, as shown in Fig. 6.
  • Fig. 6 shows an image mask 11, which consists of white or transparent pixels inside the region of interest 12 and black pixels outside of the region of interest 12.
  • the computing unit 4 may generate the image mask 11 depending on the convex polygon 7 and store it to a storage device (not shown) of the computing unit 4 or of the computer vision system 2 or of the vehicle 1.
  • the computing unit 4 may receive a further camera image from the camera 5 and use the image mask 11 to generate a masked camera image. The computing unit 4 may then carry out a computer vision task, for example but not limited to an object detection task or a semantic segmentation task, based on the masked camera image. The computing unit 4 may generate one or more control signals for one or more actuators (not shown) of the vehicle 1, for example steering actuators, driving actuators and/or braking actuators. The actuators affect a longitudinal and/or lateral control of the vehicle 1 depending on the control signals.
  • To carry out the computer-implemented method for determining the region of interest 12, the camera 5 generates a sequence of camera images 6, for example as frames of a camera image stream, in particular while the vehicle 1 is driving.
  • the camera images 6 depict an environment 13 of the vehicle 1 as well as components 14 of the vehicle 1 itself or a camera mount of the camera 5, for example. While the depicted environment 13 potentially changes between subsequent camera images 6, the components 14 remain static.
  • For each camera image 6, the computing unit 4 generates a respective individual edge image by applying an edge detection algorithm, for example directly to the respective camera image 6 or to a processed image, which depends on the respective camera image 6.
  • the computing unit 4 generates a joint edge image by summing all the individual edge images in a pixel-wise manner. Therein, an appropriate normalization may be used.
  • the individual edge images will have positive values mainly in the environment 13.
  • the pixel values corresponding to the environment 13 may reach a buffer maximum while the components 14 remain fairly dark. This creates a clear demarcation between the environment 13 and the components 14.
  • the computing unit 4 determines a contour 8 of maximum length, as depicted schematically in Fig. 5, depending on the joint edge image.
  • the computing unit 4 may for example generate a binary image by thresholding the joint edge image.
  • the computing unit 4 may further filter the binary image by applying a noise filter, in particular a median filter, to the binary image. Then, the computing unit 4 may determine the contour 8 depending on the filtered binary image, in particular as the longest chain of connected white pixels in the filtered binary image.
  • the computing unit 4 may compute a convex hull of the contour 8 and determine the convex polygon 7 to approximate the convex hull.
  • the corners of the convex polygon 7 are marked with crosses for convenience only.
  • the edge detection algorithm may for example be a LoG filter.
  • a 3x3 discrete Laplacian filter kernel in particular may be used. Since the Laplacian method uses second derivative values, it is sensitive to noise.
  • a Gaussian blur is used as a noise removal mechanism.
  • the Gaussian blur is applied to the respective camera image 6 first, and then the Laplacian filter kernel is applied to obtain the respective individual edge image.
  • the values obtained from the edge detection are added and normalized to obtain the joint edge image.
  • the joint edge image may then be auto-thresholded to obtain the binary image, for example as a black and white image.
  • Such thresholding may be prone to salt and pepper noise.
  • a median blur filtering may be run to get rid of the noise in the filtered binary image.
  • the filtered binary image is then for example passed as an input into an algorithm for contour detection and fitting the convex hull to produce the convex polygon 7.
  • a Canny edge detector with a kernel size of 9x9, a lower threshold of 50 and an upper threshold of 150 may be used.
  • In Figs. 3 and 4, the camera image 6 is shown together with simple rectangles 7', 7" as potential regions of interest, which are not generated according to the invention.
  • the rectangle 7' of Fig. 3 excludes the components 14 but also significant parts of the environment 13.
  • the rectangle 7" of Fig. 4 includes the environment 13 but also significant parts of the components 14. Such unfavorable trade-offs are avoided by the convex polygon 7 of Fig. 6.
  • the invention makes it possible to generate an appropriate region of interest for a vehicle camera fully automatically, in a reliable and accurate manner.
  • False detections, errors and artifacts in computer vision tasks may be avoided by taking into account the region of interest, which is particularly important in safety critical applications such as automated driving or parking.
  • the invention yields a polygonal region of interest, which may exclude the body of the vehicle and the camera housing.
  • the region of interest can then be read by various computer vision algorithms, thus relieving applications of unnecessary processing and enabling them to provide accurate detections that are important for safety critical applications.
  • the invention also enables ascertaining certain non-functional states, for example due to an unintended camera movement, in particular caused by a crash, or a folded state of a side mirror carrying the camera. Such states may be detected if the region of interest changes significantly during run-time of the camera and can thereby be used to provide feedback to the system to degrade the confidence of outputs from the affected camera. Even under normal driving circumstances, although the camera position may be fixed, there can be slight movements due to steering, weight or other external factors or degradation. In some implementations, the invention also makes it possible to account for such variations and to take them into consideration when defining the polygonal region of interest.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

According to a computer-implemented method for determining a region of interest (12) from a sequence of camera images (6) of a camera (5) of a vehicle (1), for each camera image (6) of the sequence, a respective individual edge image is generated by applying an edge detection algorithm. A joint edge image is generated by summing the individual edge images. A contour (8) is determined depending on the joint edge image, and a convex polygon (7) approximating the contour (8) is determined, wherein an interior of the convex polygon (7) corresponds to the region of interest (12).
PCT/EP2023/071105 2022-08-05 2023-07-31 Determination of a region of interest from camera images WO2024028242A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102022119751.4 2022-08-05
DE102022119751.4A DE102022119751A1 (de) 2022-08-05 2022-08-05 Determining a region of interest from camera images

Publications (1)

Publication Number Publication Date
WO2024028242A1 (fr)

Family

ID=87560965

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2023/071105 WO2024028242A1 (fr) 2022-08-05 2023-07-31 Determination of a region of interest from camera images

Country Status (2)

Country Link
DE (1) DE102022119751A1 (fr)
WO (1) WO2024028242A1 (fr)


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5177651B2 (ja) 2008-06-09 2013-04-03 株式会社Ihi Object recognition device and method
DE102020107383A1 (de) 2020-03-18 2021-09-23 Connaught Electronics Ltd. Object detection and guiding of a vehicle

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019072911A1 (fr) 2017-10-11 2019-04-18 Valeo Schalter Und Sensoren Gmbh Method for determining a region of interest in an image captured by a camera of a motor vehicle, control system, camera system and motor vehicle
WO2021009524A1 (fr) * 2019-07-17 2021-01-21 Aimotive Kft. Method, computer program product and computer-readable medium for generating a mask for a camera stream
US20220012893A1 (en) * 2020-07-07 2022-01-13 Aurora Flight Sciences Corporation, a subsidiary of The Boeing Company Sky segmentor using canny lines

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ILIAS DAHI ET AL: "An edge-based method for effective abandoned luggage detection in complex surveillance videos", COMPUTER VISION AND IMAGE UNDERSTANDING, vol. 158, 19 January 2017 (2017-01-19), US, pages 141 - 151, XP055672141, ISSN: 1077-3142, DOI: 10.1016/j.cviu.2017.01.008 *

Also Published As

Publication number Publication date
DE102022119751A1 (de) 2024-02-08

Similar Documents

Publication Publication Date Title
CN106952308B Position determination method and system for a moving object
US11393126B2 (en) Method and apparatus for calibrating the extrinsic parameter of an image sensor
US10776946B2 (en) Image processing device, object recognizing device, device control system, moving object, image processing method, and computer-readable medium
US20150206318A1 Method and apparatus for image enhancement and edge verification using at least one additional image
EP2610778A1 Method for detecting an obstacle and driver assist system
US11443151B2 (en) Driving assistant system, electronic device, and operation method thereof
CN111144315 Target detection method and apparatus, electronic device and readable storage medium
CN114170826B Automatic driving control method and apparatus, electronic device and storage medium
CN111627001 Image detection method and apparatus
CN112766135 Target detection method and apparatus, electronic device and storage medium
CN112597846 Lane line detection method and apparatus, computer device and storage medium
WO2006040552A1 Detection apparatus and method for vehicles
KR20170106823 Image processing device identifying object of interest based on partial depth map
KR101877741 Apparatus for detecting contours taking image blur into account
CN112529011 Target detection method and related device
WO2024028242A1 Determination of a region of interest from camera images
US20240020798A1 (en) Method and device for generating all-in-focus image
CN113450335B Road edge detection method, road edge detection apparatus and road construction vehicle
CN111656404 Image processing method, system and movable platform
CN116309628 Lane line recognition method and apparatus, electronic device and computer-readable storage medium
EP3714427A1 Sky determination in environment detection for mobile platforms, and associated systems and methods
US11295465B2 (en) Image processing apparatus
CN112287731B Method and apparatus for constructing ternary image of target, and detection method and apparatus
CN113920490 Vehicle obstacle detection method, apparatus and device
CN110827205 Device for processing image blur and method therefor

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23751573

Country of ref document: EP

Kind code of ref document: A1