WO2023120988A1 - Vehicle camera occlusion classification device using deep learning-based object detector and method thereof - Google Patents

Vehicle camera occlusion classification device using deep learning-based object detector and method thereof

Info

Publication number
WO2023120988A1
Authority
WO
WIPO (PCT)
Prior art keywords
box
frame
size
value
deep learning
Prior art date
Application number
PCT/KR2022/017979
Other languages
French (fr)
Korean (ko)
Inventor
한동석
유민우
성재호
Original Assignee
경북대학교 산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from KR1020220005390A external-priority patent/KR20230095747A/en
Application filed by 경북대학교 산학협력단
Publication of WO2023120988A1 publication Critical patent/WO2023120988A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle

Definitions

  • The present invention relates to a vehicle camera occlusion classification apparatus and method using a deep learning-based object detector, and more particularly, to an apparatus and method that use a deep learning object detector to classify, from a frame of a camera image, whether the camera is occluded.
  • A self-driving vehicle is a vehicle that reaches its destination on its own without the driver operating the steering wheel, accelerator pedal, or brakes; it refers to a smart vehicle incorporating the autonomous navigation technologies already applied to aircraft and ships.
  • Autonomous driving systems and advanced driver assistance systems installed in such vehicles automatically control the driving of the vehicle from a starting point to an end point on the road, or assist the driver, using GPS location information and signals acquired from various sensors on top of road map information, enabling safe driving.
  • An autonomous driving system requires a sensor capable of recognizing surrounding objects and a graphics processing unit in order to recognize and judge, in real time, the driving environment of a vehicle moving at high speed.
  • The sensor measures the distance to surrounding objects and detects dangers, helping to cover all areas without blind spots.
  • The graphics processing unit grasps the surroundings of the vehicle through multiple cameras and analyzes the images so that the vehicle can travel safely.
  • The technical problem to be achieved by the present invention is to provide a vehicle camera occlusion classification apparatus and method using a deep learning-based object detector, which classifies from a frame of a camera image whether the camera is occluded by an object.
  • A vehicle camera occlusion classification apparatus using a deep learning-based object detector for achieving this technical problem includes: an input unit for receiving an original image captured by a camera in frame units; a first feature extractor that reduces the size of an input frame and inputs it to a convolutional neural network (CNN) to extract features of the frame; a second feature extractor that extracts features of objects included in the frame received from the input unit using an object detection algorithm; a calculation unit that mixes the frame features and the object features and inputs them to an artificial neural network (ANN) for computation; and a determination unit that determines whether the camera is occluded based on the computation result.
  • The first feature extractor may reduce the size of the frame to 100x100 using nearest-neighbor interpolation, input it to the convolutional neural network, and unfold the extracted frame features into one dimension.
  • The second feature extractor may extract, for each object, features including the object's location information, class information, and reliability information using a deep learning-based object detection algorithm; calculate a box value for each object from the location information; extract a set number of confidence values in descending order of the confidence values contained in the reliability information; and, using the box values of the objects corresponding to the extracted confidence values, combine the features of those objects into one dimension for the fully-connected (FC) layer operation.
  • The second feature extractor calculates the box value using the x coordinate, the y coordinate, the bounding-box width, and the bounding-box height (relative to the image size) contained in the object's location information; the box value can be calculated by Equation 1 below (published as an equation image), where:
  • BOX is the box value
  • V_size is the size of the box
  • V_ratio is the aspect ratio of the box
  • w_0 is the horizontal pixel count of the detection box
  • w_t is the horizontal pixel count of the image
  • h_0 is the vertical pixel count of the detection box
  • h_t is the vertical pixel count of the image
  • axis_min is the minimum of the detection box's horizontal and vertical pixel counts
  • axis_max is the maximum of the detection box's horizontal and vertical pixel counts
  • λ_size is a size-adjustment constant parameter
  • λ_ratio is a ratio-adjustment constant parameter
  • The determination unit may determine whether the camera is occluded according to the class classification result contained in the computation result, using a softmax function.
  • A vehicle camera occlusion classification method using a deep learning-based object detector includes: receiving an original image captured by a camera in frame units; reducing the size of the input frame and inputting it to a convolutional neural network (CNN) to extract features of the frame; extracting features of objects included in the input frame using an object detection algorithm; mixing the frame features and the object features and inputting them to an artificial neural network (ANN) for computation; and determining whether the camera is occluded according to the computation result.
  • In the frame feature extraction step, the size of the frame is reduced to 100x100 using nearest-neighbor interpolation, and the features extracted by the convolutional neural network are unfolded into one dimension.
  • In the object feature extraction step, features including each object's location information, class information, and reliability information are extracted using a deep learning-based object detection algorithm; a box value is calculated for each object from the location information; a set number of confidence values are extracted in descending order of the confidence values contained in the reliability information; and, using the box values of the objects corresponding to the extracted confidence values, the features of those objects are combined into one dimension for the fully-connected (FC) layer operation.
  • The object feature extraction step calculates the box value using the x coordinate, the y coordinate, the bounding-box width, and the bounding-box height (relative to the image size) contained in the object's location information; the box value can be calculated by Equation 1 below (published as an equation image), where:
  • BOX is the box value
  • V_size is the size of the box
  • V_ratio is the aspect ratio of the box
  • w_0 is the horizontal pixel count of the detection box
  • w_t is the horizontal pixel count of the image
  • h_0 is the vertical pixel count of the detection box
  • h_t is the vertical pixel count of the image
  • axis_min is the minimum of the detection box's horizontal and vertical pixel counts
  • axis_max is the maximum of the detection box's horizontal and vertical pixel counts
  • λ_size is a size-adjustment constant parameter
  • λ_ratio is a ratio-adjustment constant parameter
  • Whether the camera is occluded may be determined according to the class classification result contained in the computation result, using a softmax function.
  • According to the present invention, classifying from a frame of the camera image whether the camera is occluded by an object, using the deep learning object detector, prevents the camera from failing to detect or erroneously detecting objects, thereby reducing the probability of an accident.
  • Because the present invention detects whether the camera sensor is occluded, it can play an important role in autonomous vehicles; it can also be applied to various systems that use camera sensors besides vehicles, so it can be used universally in many fields.
  • FIG. 1 is a block diagram showing a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention.
  • FIGS. 2 and 3 are exemplary diagrams for explaining the second feature extraction unit of FIG. 1.
  • FIG. 4 is a diagram showing a clear image by way of example in a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention.
  • FIG. 5 is a diagram exemplarily illustrating a blurred image in a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention.
  • FIG. 6 is an example of arranging box values and confidence values calculated in the second feature extraction unit of FIG. 1 .
  • FIG. 7 is an exemplary diagram for explaining an artificial neural network calculation process in the calculation unit of FIG. 1 .
  • FIG. 8 is a flowchart illustrating an operation flow of a vehicle camera occlusion classification method using a deep learning-based object detector according to an embodiment of the present invention.
  • First, a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention will be described with reference to FIGS. 1 to 7.
  • FIG. 1 is a block diagram showing a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention.
  • As shown in FIG. 1, the vehicle camera occlusion classification apparatus 100 using the deep learning-based object detector includes an input unit 110, a first feature extraction unit 120, a second feature extraction unit 130, a calculation unit 140, and a determination unit 150.
  • First, the input unit 110 receives an original image captured by a camera (not shown) in units of frames.
  • The camera may be a camera used in an autonomous vehicle or the like, or a sensor camera used in various other systems.
  • The first feature extraction unit 120 reduces the size of the frame input through the input unit 110 and inputs it to a convolutional neural network (CNN) to extract features of the frame.
  • In detail, the first feature extraction unit 120 reduces the size of the frame input through the input unit 110 to 100x100 using nearest-neighbor interpolation, inputs it to the convolutional neural network, and unfolds the extracted frame features into one dimension (1-D) for the artificial neural network (ANN) computation, as sketched below.
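A minimal sketch of this preprocessing step, assuming OpenCV and NumPy are available; the CNN itself is left abstract since the patent does not specify its architecture, and the names `extract_frame_features` and `cnn` are hypothetical:

```python
import cv2
import numpy as np

def extract_frame_features(frame: np.ndarray, cnn) -> np.ndarray:
    """Shrink the frame to 100x100 with nearest-neighbor interpolation,
    run it through a CNN, and unfold the resulting feature map into a
    one-dimensional vector for the later ANN stage."""
    small = cv2.resize(frame, (100, 100), interpolation=cv2.INTER_NEAREST)
    feature_map = cnn(small)  # any callable mapping an image to a feature map
    return np.asarray(feature_map).ravel()
```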
  • The second feature extraction unit 130 extracts features of the objects included in the frame received from the input unit 110 using a deep learning-based object detection algorithm.
  • At this time, the second feature extractor 130 extracts, for each object, features including the object's location information, class information, and reliability information.
  • FIGS. 2 and 3 are exemplary diagrams for explaining the second feature extraction unit of FIG. 1.
  • As shown in FIG. 2, the second feature extraction unit 130 inputs the frame received from the input unit 110 to the deep learning-based object detection algorithm and extracts the features of the objects included in the frame, object by object.
  • The extracted object features include, as shown in FIG. 3, the object's location information (x, y, w, h), class information (c), and reliability information (p). Taking FIG. 3 as an example, features can be extracted per object: object 1 (x1, y1, w1, h1, c1, p1), object 2 (x2, y2, w2, h2, c2, p2), and object 3 (x3, y3, w3, h3, c3, p3).
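Purely for illustration, the per-object feature tuple of FIG. 3 could be held in a structure like the following; the `Detection` name and all numeric values are invented for this example:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    x: float   # box x coordinate
    y: float   # box y coordinate
    w: float   # box width
    h: float   # box height
    c: str     # class information
    p: float   # reliability (confidence) information

# the three objects of FIG. 3, with made-up values
objects = [
    Detection(x=120, y=80, w=60, h=40, c="car", p=0.91),
    Detection(x=300, y=90, w=80, h=60, c="truck", p=0.80),
    Detection(x=50, y=200, w=30, h=70, c="person", p=0.45),
]
```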
  • Accordingly, the second feature extractor 130 calculates a box value for each object using the location information (x, y, w, h), extracts a set number of confidence values in descending order of the confidence values contained in the reliability information, and, using the box values of the objects corresponding to the extracted confidence values, combines the features of those objects into one dimension for the fully-connected (FC) layer operation.
  • FIG. 4 is a diagram showing a clear image by way of example in a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention.
  • As shown in FIG. 4, when the frame input from the input unit 110 is a clear image, the second feature extractor 130 can extract features for all objects (objects 1, 2, and 3 in FIG. 4).
  • At this time, a box value (Box_i) is calculated using the location information contained in each extracted object feature, and a confidence value (Conf_i) for the classified class is extracted.
  • The class and confidence values are produced automatically by the deep learning-based object detector. Taking FIG. 4 as an example, if object 1 is classified by the object detector as a truck with a confidence value (Conf_i) of 80%, the probability that object 1 is in fact a truck is high.
  • The second feature extractor 130 then uses the box values of the objects corresponding to the extracted confidence values to combine the features of those objects into one dimension for the FC layer operation.
  • FIG. 5 is a diagram exemplarily illustrating a blurred image in a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention.
  • Unlike FIG. 4, when the frame received from the input unit 110 is an unclear (blurred) image as in FIG. 5, features are extracted for all objects (objects 1, 2, and 3 in FIG. 5) through the deep learning-based object detector in the same way; however, object 3 in the image of FIG. 5 cannot be extracted, so its box value and confidence value come out as 0. The second feature extractor 130 again uses the box values of the objects corresponding to the extracted confidence values to combine the features of those objects into one dimension for the FC layer operation.
  • FIG. 6 is an example of arranging box values and confidence values calculated in the second feature extraction unit of FIG. 1 .
  • As shown in FIG. 6, the second feature extractor 130 sorts the confidence values in descending order and then, using the box values of the objects corresponding to the ten highest confidence values, merges the features of those objects into one dimension for the FC layer operation, as in the sketch below.
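A sketch of this selection step, assuming each detection has already been reduced to a (box value, confidence) pair and that missing detections contribute zeros, as in the blurred-image case above; the interleaved layout is an illustrative choice, not something the patent specifies:

```python
import numpy as np

def combine_object_features(box_values, confidences, top_k=10):
    """Sort detections by confidence in descending order, keep the top_k,
    and lay their (box value, confidence) pairs out as one 1-D vector for
    the FC layer; unused slots stay zero."""
    order = np.argsort(confidences)[::-1][:top_k]
    feat = np.zeros(2 * top_k)
    for slot, idx in enumerate(order):
        feat[2 * slot] = box_values[idx]
        feat[2 * slot + 1] = confidences[idx]
    return feat

# e.g. the FIG. 5 case, where the undetected object contributes zeros
print(combine_object_features([0.42, 0.31, 0.0], [0.91, 0.80, 0.0]))
```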
  • At this time, the second feature extractor 130 calculates the box value using the x coordinate (x), the y coordinate (y), the bounding-box width (w), and the bounding-box height (h) contained in the object's location information; the box value is calculated by Equation 1 below (published as an equation image), where:
  • BOX is the box value
  • V_size is the size of the box
  • V_ratio is the aspect ratio of the box
  • w_0 is the horizontal pixel count of the detection box
  • w_t is the horizontal pixel count of the image
  • h_0 is the vertical pixel count of the detection box
  • h_t is the vertical pixel count of the image
  • axis_min is the minimum of the detection box's horizontal and vertical pixel counts
  • axis_max is the maximum of the detection box's horizontal and vertical pixel counts
  • λ_size is a size-adjustment constant parameter
  • λ_ratio is a ratio-adjustment constant parameter
  • That is, the box value is calculated as the product of V_size, determined by the size of the box, and V_ratio, determined by the aspect ratio of the box.
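The published Equation 1 exists only as an image in this text, so only its product structure and the variable list are certain. A plausible reading consistent with those definitions, offered strictly as an assumption rather than the patented formula (the question-marked equalities are guesses from the variable names alone):

```latex
\mathrm{BOX} \;=\; V_{\mathrm{size}} \cdot V_{\mathrm{ratio}},
\qquad
V_{\mathrm{size}} \;\overset{?}{=}\; \lambda_{\mathrm{size}}\,\frac{w_0\,h_0}{w_t\,h_t},
\qquad
V_{\mathrm{ratio}} \;\overset{?}{=}\; \lambda_{\mathrm{ratio}}\,\frac{\mathrm{axis}_{\min}}{\mathrm{axis}_{\max}}
```

Only the outer product is stated in the text; the normalized-area and min/max aspect terms, scaled by the λ constants, are hedged reconstructions.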
  • The operation unit 140 concatenates the frame features extracted by the first feature extractor 120 with the object features extracted by the second feature extractor 130 and inputs them to the artificial neural network for computation.
  • FIG. 7 is an exemplary diagram for explaining an artificial neural network calculation process in the calculation unit of FIG. 1 .
  • As shown in FIG. 7, the calculation unit 140 mixes the frame features (a) extracted by the first feature extraction unit 120 with the object features (b) extracted by the second feature extraction unit 130, performs the computation using the artificial neural network, and outputs the computation result (Normal or Abnormal), as in the sketch below.
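A minimal NumPy sketch of this mixing and computation, assuming placeholder layer sizes; the patent specifies neither the network dimensions nor the training procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def ann_forward(frame_feat, object_feat, w1, b1, w2, b2):
    """Concatenate (mix) the frame and object feature vectors and run a
    small fully-connected network that yields two logits: (Normal, Abnormal)."""
    x = np.concatenate([frame_feat, object_feat])
    hidden = relu(w1 @ x + b1)
    return w2 @ hidden + b2

# toy shapes for illustration only: 64-d frame features, 20-d object features
frame_feat, object_feat = rng.normal(size=64), rng.normal(size=20)
w1, b1 = rng.normal(size=(32, 84)), np.zeros(32)
w2, b2 = rng.normal(size=(2, 32)), np.zeros(2)
logits = ann_forward(frame_feat, object_feat, w1, b1, w2, b2)
```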
  • Finally, the determination unit 150 determines whether the camera is occluded according to the computation result of the operation unit 140.
  • At this time, the determination unit 150 may determine whether the camera is occluded according to the class classification result contained in the computation result, using a softmax function.
  • Here, the softmax function is a multi-dimensional generalization of the logistic function; it is used in multinomial logistic regression and is widely used as the final activation function for obtaining a probability distribution from an artificial neural network. Contrary to its name, it is a smooth approximation not of the max function itself but of the one-hot arg max function, which indicates the argument of the maximum value.
  • It is computed by exponentiating each input value with base e (the base of the natural logarithm) and dividing by the sum of those exponentials.
  • In other words, the softmax function makes the K class scores produced by the artificial neural network interpretable as probabilities.
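That calculation, written out as a short NumPy sketch; the max is subtracted first for numerical stability, which does not change the result:

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Exponentiate each input with base e and divide by the sum of the
    exponentials, turning K class scores into a probability distribution."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

print(softmax(np.array([2.0, -1.0])))  # -> approximately [0.95, 0.05]
```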
  • Accordingly, when the final computation result of the calculation unit 140 has a high probability of being Normal, the determination unit 150 judges the camera to be in a normal state; when it has a high probability of being Abnormal, it judges that occlusion of the camera has occurred.
  • In the embodiment of the present invention, if only the original frame were used to classify camera occlusion, accuracy would drop when the original frame is blurry, as in FIG. 5, even when the camera really is occluded by an object, because the ability to extract information about objects in the frame is poor from the original frame alone.
  • The embodiment therefore feeds the object detection results extracted by the deep learning-based object detector, together with the feature information extracted from the original frame, into the artificial neural network to make the final occlusion judgment, improving judgment accuracy.
  • FIG. 8 is a flowchart illustrating the operation flow of a vehicle camera occlusion classification method using a deep learning-based object detector according to an embodiment of the present invention; the detailed operation of the invention is described with reference to it.
  • According to an embodiment of the present invention, the input unit 110 first receives an original image captured by the camera in units of frames (S10).
  • Next, the first feature extractor 120 reduces the size of the frame received in step S10 and inputs it to the convolutional neural network to extract the frame features (S20).
  • In step S20, the first feature extraction unit 120 reduces the size of the input frame to 100x100 using nearest-neighbor interpolation, inputs it to the convolutional neural network, and unfolds the extracted frame features into one dimension for the artificial neural network computation.
  • Next, the second feature extraction unit 130 extracts the features of the objects included in the frame received in step S10 using the object detection algorithm (S30).
  • In step S30, the second feature extractor 130 extracts, for each object, features including the object's location information, class information, and reliability information using the deep learning-based object detection algorithm.
  • In detail, the second feature extractor 130 calculates a box value for each object using the location information, extracts a set number of confidence values in descending order of the confidence values contained in the reliability information, and, using the box values of the objects corresponding to the extracted confidence values, combines the features of those objects into one dimension for the FC layer operation.
  • At this time, the box value is calculated by Equation 1 above.
  • Next, the calculation unit 140 mixes the frame features extracted in step S20 with the object features extracted in step S30 and inputs them to the artificial neural network for computation (S40).
  • In step S40, the calculation unit 140 performs the computation using the artificial neural network and outputs the computation result (Normal or Abnormal).
  • Finally, the determination unit 150 determines whether the camera is occluded according to the computation result of step S40 (S50).
  • At this time, the determination unit 150 may determine whether the camera is occluded according to the class classification result contained in the computation result, using the softmax function.
  • The determination unit 150 judges the camera to be in a normal state when the final computation result of the calculation unit 140 has a high probability of being Normal, and judges that camera occlusion has occurred when it has a high probability of being Abnormal.
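Tying steps S40 and S50 together, the final decision reduces to comparing the two softmax probabilities; the (Normal, Abnormal) index order here is an assumption for illustration:

```python
def classify_occlusion(probs) -> str:
    """S50: report occlusion when the Abnormal probability dominates."""
    return "Abnormal (occluded)" if probs[1] > probs[0] else "Normal"

# e.g. classify_occlusion(softmax(logits)) using the sketches above
```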
  • Such a vehicle camera occlusion classification apparatus and method using a deep learning-based object detector may be implemented as an application or implemented in the form of program instructions that can be executed through various computer components and recorded on a computer-readable recording medium.
  • the computer readable recording medium may include program instructions, data files, data structures, etc. alone or in combination.
  • Program instructions recorded on the computer-readable recording medium may be those specially designed and configured for the present invention, or those known and usable to those skilled in the art of computer software.
  • Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.
  • Examples of program instructions include not only machine language code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like.
  • the hardware device may be configured to act as one or more software modules to perform processing according to the present invention.
  • As described above, the vehicle camera occlusion classification apparatus and method using a deep learning-based object detector classify, from a frame of the camera image, whether the camera is occluded by an object using a deep learning object detector; this prevents the camera from failing to detect or erroneously detecting objects, thereby reducing the probability of an accident.
  • Moreover, because they detect whether the camera sensor is occluded, they can play an important role in autonomous vehicles, and they can be applied to various systems that use camera sensors besides vehicles, so they can be used universally in many fields.
  • 100: vehicle camera occlusion classification device, 110: input unit
  • 120: first feature extraction unit, 130: second feature extraction unit
  • 140: calculation unit, 150: determination unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a vehicle camera occlusion classification device using a deep learning-based object detector and a method thereof. The vehicle camera occlusion classification device using the deep learning-based object detector according to the present invention comprises: an input unit which receives a captured original image as input from a camera frame by frame; a first feature extraction unit which extracts features of the input frames by reducing the size of the frames and then inputting same to a convolutional neural network (CNN); a second feature extraction unit which uses an object detection algorithm to extract features of objects included in the frames input from the input unit; a calculation unit which performs a calculation by mixing the features of the frames and the features of the objects and then inputting same to an artificial neural network (ANN); and a determination unit which determines whether the camera is occluded according to the result of the calculation.

Description

Vehicle camera occlusion classification apparatus and method using a deep learning-based object detector
The present invention relates to a vehicle camera occlusion classification apparatus and method using a deep learning-based object detector, and more particularly, to an apparatus and method that use a deep learning object detector to classify, from a frame of a camera image, whether the camera is occluded.
Recently, much research related to autonomous driving, particularly in the automotive field, has been conducted in connection with unmanned autonomous driving systems.
A self-driving vehicle is a vehicle that reaches its destination on its own without the driver operating the steering wheel, accelerator pedal, or brakes; it refers to a smart vehicle incorporating the autonomous navigation technologies already applied to aircraft and ships.
Autonomous driving systems and advanced driver assistance systems installed in such vehicles automatically control the driving of the vehicle from a starting point to an end point on the road, or assist the driver, using GPS location information and signals acquired from various sensors on top of road map information, enabling safe driving.
To perform autonomous driving smoothly in this way, it must be possible to control the movement of the vehicle by collecting and processing various sensor data.
In particular, an autonomous driving system requires a sensor capable of recognizing surrounding objects and a graphics processing unit in order to recognize and judge, in real time, the driving environment of a vehicle moving at high speed.
At this time, the sensor measures the distance to surrounding objects and detects dangers, helping to cover all areas without blind spots, while the graphics processing unit grasps the surroundings of the vehicle through multiple cameras and analyzes the images so that the vehicle can travel safely.
However, when the camera sensor is occluded by an object for any of various reasons, the probability that it fails to detect or erroneously detects objects increases, which can lead to serious accidents.
Therefore, for safe and smooth autonomous driving, it is necessary to develop technology for accurately detecting whether the camera sensor is occluded.
The background technology of the present invention is disclosed in Korean Patent Registration No. 10-2253989 (published May 20, 2021).
The technical problem to be achieved by the present invention is to provide a vehicle camera occlusion classification apparatus and method using a deep learning-based object detector, which classifies from a frame of a camera image whether the camera is occluded by an object.
To achieve this technical problem, a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention includes: an input unit for receiving an original image captured by a camera in frame units; a first feature extractor that reduces the size of an input frame and inputs it to a convolutional neural network (CNN) to extract features of the frame; a second feature extractor that extracts features of objects included in the frame received from the input unit using an object detection algorithm; a calculation unit that mixes the frame features and the object features and inputs them to an artificial neural network (ANN) for computation; and a determination unit that determines whether the camera is occluded based on the computation result.
At this time, the first feature extractor may reduce the size of the frame to 100x100 using nearest-neighbor interpolation, input it to the convolutional neural network, and unfold the extracted frame features into one dimension.
In addition, the second feature extractor may extract, for each object, features including the object's location information, class information, and reliability information using a deep learning-based object detection algorithm; calculate a box value for each object from the location information; extract a set number of confidence values in descending order of the confidence values contained in the reliability information; and, using the box values of the objects corresponding to the extracted confidence values, combine the features of those objects into one dimension for the fully-connected (FC) layer operation.
In addition, the second feature extractor calculates the box value using the x coordinate, the y coordinate, the bounding-box width, and the bounding-box height (relative to the image size) contained in the object's location information; the box value can be calculated by the following formula.
[Equation 1: box-value formula, published as an equation image and not reproduced here]
Here, BOX is the box value, V_size is the size of the box, V_ratio is the aspect ratio of the box, w_0 is the horizontal pixel count of the detection box, w_t is the horizontal pixel count of the image, h_0 is the vertical pixel count of the detection box, h_t is the vertical pixel count of the image, axis_min is the minimum of the detection box's horizontal and vertical pixel counts, axis_max is the maximum of the detection box's horizontal and vertical pixel counts, λ_size is a size-adjustment constant parameter, and λ_ratio is a ratio-adjustment constant parameter.
In addition, the determination unit may determine whether the camera is occluded according to the class classification result contained in the computation result, using a softmax function.
A vehicle camera occlusion classification method using a deep learning-based object detector according to another embodiment of the present invention includes: receiving an original image captured by a camera in frame units; reducing the size of the input frame and inputting it to a convolutional neural network (CNN) to extract features of the frame; extracting features of objects included in the input frame using an object detection algorithm; mixing the frame features and the object features and inputting them to an artificial neural network (ANN) for computation; and determining whether the camera is occluded according to the computation result.
In addition, the frame feature extraction step may reduce the size of the frame to 100x100 using nearest-neighbor interpolation, input it to the convolutional neural network, and unfold the extracted frame features into one dimension.
In addition, the object feature extraction step may include: extracting, for each object, features including the object's location information, class information, and reliability information using a deep learning-based object detection algorithm; calculating a box value for each object from the location information; extracting a set number of confidence values in descending order of the confidence values contained in the reliability information; and, using the box values of the objects corresponding to the extracted confidence values, combining the features of those objects into one dimension for the fully-connected (FC) layer operation.
In addition, the object feature extraction step calculates the box value using the x coordinate, the y coordinate, the bounding-box width, and the bounding-box height (relative to the image size) contained in the object's location information; the box value can be calculated by the following formula.
[Equation 1: box-value formula, published as an equation image and not reproduced here]
Here, BOX is the box value, V_size is the size of the box, V_ratio is the aspect ratio of the box, w_0 is the horizontal pixel count of the detection box, w_t is the horizontal pixel count of the image, h_0 is the vertical pixel count of the detection box, h_t is the vertical pixel count of the image, axis_min is the minimum of the detection box's horizontal and vertical pixel counts, axis_max is the maximum of the detection box's horizontal and vertical pixel counts, λ_size is a size-adjustment constant parameter, and λ_ratio is a ratio-adjustment constant parameter.
In addition, the step of determining whether the camera is occluded may determine occlusion according to the class classification result contained in the computation result, using a softmax function.
As described above, according to the present invention, classifying from a frame of the camera image whether the camera is occluded by an object, using the deep learning object detector, prevents the camera from failing to detect or erroneously detecting objects, thereby reducing the probability of an accident.
In addition, according to the present invention, because it detects whether the camera sensor is occluded, it can play an important role in autonomous vehicles, and it can be applied to various systems that use camera sensors besides vehicles, so it can be used universally in many fields.
FIG. 1 is a block diagram showing a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention.
FIGS. 2 and 3 are exemplary diagrams for explaining the second feature extraction unit of FIG. 1.
FIG. 4 is a diagram showing, by way of example, a clear image in a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention.
FIG. 5 is a diagram showing, by way of example, a blurred image in a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention.
FIG. 6 is an example of sorting the box values and confidence values calculated by the second feature extraction unit of FIG. 1.
FIG. 7 is an exemplary diagram for explaining the artificial neural network computation process in the calculation unit of FIG. 1.
FIG. 8 is a flowchart illustrating the operation flow of a vehicle camera occlusion classification method using a deep learning-based object detector according to an embodiment of the present invention.
Hereinafter, preferred embodiments according to the present invention will be described in detail with reference to the accompanying drawings. In this process, the thicknesses of lines and the sizes of components shown in the drawings may be exaggerated for clarity and convenience of description.
In addition, the terms described below are defined in consideration of their functions in the present invention and may vary according to the intention or custom of a user or operator. Definitions of these terms should therefore be made based on the content throughout this specification.
Hereinafter, preferred embodiments of the present invention will be described in more detail with reference to the drawings.
First, a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention will be described with reference to FIGS. 1 to 7.
FIG. 1 is a block diagram showing a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention.
As shown in FIG. 1, the vehicle camera occlusion classification apparatus 100 using the deep learning-based object detector according to an embodiment of the present invention includes an input unit 110, a first feature extraction unit 120, a second feature extraction unit 130, a calculation unit 140, and a determination unit 150.
First, the input unit 110 receives an original image captured by a camera (not shown) in units of frames.
At this time, the camera may be a camera used in an autonomous vehicle or the like, or a sensor camera used in various other systems.
The first feature extraction unit 120 reduces the size of the frame input through the input unit 110 and inputs it to a convolutional neural network (CNN) to extract features of the frame.
In detail, the first feature extraction unit 120 reduces the size of the frame input through the input unit 110 to 100x100 using nearest-neighbor interpolation, inputs it to the convolutional neural network, and unfolds the extracted frame features into one dimension (1-D) for the artificial neural network (ANN) computation.
At this time, the convolutional neural network in the first feature extractor 120 operates in the same way as a commonly used convolutional neural network, so a detailed description is omitted.
The second feature extraction unit 130 extracts features of the objects included in the frame received from the input unit 110 using a deep learning-based object detection algorithm.
At this time, the second feature extractor 130 extracts, for each object, features including the object's location information, class information, and reliability information.
FIGS. 2 and 3 are exemplary diagrams for explaining the second feature extraction unit of FIG. 1.
As shown in FIG. 2, the second feature extraction unit 130 inputs the frame received from the input unit 110 to the deep learning-based object detection algorithm and extracts the features of the objects included in the frame, object by object.
At this time, the extracted object features include, as shown in FIG. 3, the object's location information (x, y, w, h), class information (c), and reliability information (p). Taking FIG. 3 as an example, the deep learning-based object detection algorithm can extract features per object: object 1 (x1, y1, w1, h1, c1, p1), object 2 (x2, y2, w2, h2, c2, p2), and object 3 (x3, y3, w3, h3, c3, p3).
Accordingly, the second feature extractor 130 calculates a box value for each object using the location information (x, y, w, h), extracts a set number of confidence values in descending order of the confidence values contained in the reliability information, and, using the box values of the objects corresponding to the extracted confidence values, combines the features of those objects into one dimension for the fully-connected (FC) layer operation.
FIG. 4 is a diagram showing, by way of example, a clear image in a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention.
As shown in FIG. 4, when the frame input from the input unit 110 is a clear image, the second feature extractor 130 can extract features for all objects (objects 1, 2, and 3 in FIG. 4). At this time, a box value (Box_i) is calculated using the location information contained in each extracted object feature, and a confidence value (Conf_i) for the classified class is extracted.
At this time, the class and confidence values are extracted automatically by the deep learning-based object detector. Taking FIG. 4 as an example, if object 1 is classified by the object detector as a truck with a confidence value (Conf_i) of 80%, the probability that object 1 is in fact a truck is high.
In addition, the second feature extractor 130 uses the box values of the objects corresponding to the extracted confidence values to combine the features of those objects into one dimension for the FC layer operation.
FIG. 5 is a diagram showing, by way of example, a blurred image in a vehicle camera occlusion classification apparatus using a deep learning-based object detector according to an embodiment of the present invention.
Unlike FIG. 4, when the frame received from the input unit 110 is an unclear (blurred) image as in FIG. 5, features can be extracted for all objects (objects 1, 2, and 3 in FIG. 5) through the deep learning-based object detector in the same way as in FIG. 4. However, object 3 in the image of FIG. 5 cannot be extracted, so its box value and confidence value are extracted as 0.
In addition, the second feature extractor 130 uses the box values of the objects corresponding to the extracted confidence values to combine the features of those objects into one dimension for the FC layer operation.
FIG. 6 is an example of sorting the box values and confidence values calculated by the second feature extraction unit of FIG. 1.
As shown in FIG. 6, the second feature extractor 130 sorts the confidence values in descending order and then, using the box values of the objects corresponding to the ten highest confidence values, merges the features of those objects into one dimension for the FC layer operation.
At this time, the second feature extractor 130 calculates the box value using the x coordinate (x), the y coordinate (y), the bounding-box width (w), and the bounding-box height (h) contained in the object's location information; the box value is calculated by Equation 1 below.
[Equation 1: box-value formula, published as an equation image and not reproduced here]
Here, BOX is the box value, V_size is the size of the box, V_ratio is the aspect ratio of the box, w_0 is the horizontal pixel count of the detection box, w_t is the horizontal pixel count of the image, h_0 is the vertical pixel count of the detection box, h_t is the vertical pixel count of the image, axis_min is the minimum of the detection box's horizontal and vertical pixel counts, axis_max is the maximum of the detection box's horizontal and vertical pixel counts, λ_size is a size-adjustment constant parameter, and λ_ratio is a ratio-adjustment constant parameter.
That is, the box value is calculated as the product of V_size, determined by the size of the box, and V_ratio, determined by the aspect ratio of the box.
The operation unit 140 concatenates the frame features extracted by the first feature extractor 120 with the object features extracted by the second feature extractor 130 and inputs them to the artificial neural network for computation.
FIG. 7 is an exemplary diagram for explaining the artificial neural network computation process in the calculation unit of FIG. 1.
As shown in FIG. 7, the calculation unit 140 mixes the frame features (a) extracted by the first feature extraction unit 120 with the object features (b) extracted by the second feature extraction unit 130, performs the computation using the artificial neural network, and outputs the computation result (Normal or Abnormal).
At this time, the artificial neural network in the calculation unit 140 operates in the same way as a commonly used artificial neural network, so a detailed description is omitted.
Finally, the determination unit 150 determines whether the camera is occluded according to the computation result of the operation unit 140.
At this time, the determination unit 150 may determine whether the camera is occluded according to the class classification result contained in the computation result, using a softmax function.
Here, the softmax function is a multi-dimensional generalization of the logistic function; it is used in multinomial logistic regression and is widely used as the final activation function for obtaining a probability distribution from an artificial neural network. Contrary to its name, it is a smooth approximation not of the max function itself but of the one-hot arg max function, which indicates the argument of the maximum value. It is computed by exponentiating each input value with base e and dividing by the sum of those exponentials.
In other words, the softmax function makes the K class scores produced by the artificial neural network interpretable as probabilities.
Accordingly, when the final computation result of the calculation unit 140 has a high probability of being Normal, the determination unit 150 judges the camera to be in a normal state; when it has a high probability of being Abnormal, it judges that occlusion of the camera has occurred.
In the embodiment of the present invention, if only the original frame were used to classify camera occlusion, accuracy would drop when the original frame is blurry, as in FIG. 5, even when the camera really is occluded by an object, because the ability to extract information about objects in the frame is poor from the original frame alone.
Therefore, in the embodiment of the present invention, the object detection results extracted by the deep learning-based object detector are fed, together with the feature information extracted from the original frame, into the artificial neural network to make the final occlusion judgment, improving judgment accuracy.
A vehicle camera occlusion classification method using a deep learning-based object detector according to an embodiment of the present invention is described below with reference to FIG. 8.
FIG. 8 is a flowchart illustrating the operation of the vehicle camera occlusion classification method using a deep learning-based object detector according to an embodiment of the present invention; the specific operation of the invention is described with reference to it.
According to an embodiment of the present invention, the input unit 110 first receives the original image captured by the camera in units of frames (S10).
Next, the first feature extraction unit 120 reduces the size of the frame received in step S10 and inputs it to the convolutional neural network to extract the features of the frame (S20).
In step S20, the first feature extraction unit 120 reduces the size of the input frame to 100x100 using nearest-neighbor interpolation, inputs the reduced frame to the convolutional neural network, and flattens the extracted frame features into one dimension for the artificial neural network computation.
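A minimal sketch of step S20, under an assumed input resolution and an assumed convolutional stack (the patent does not specify the CNN architecture), might look like this:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    frame = torch.randn(1, 3, 720, 1280)                           # one RGB camera frame
    small = F.interpolate(frame, size=(100, 100), mode="nearest")  # nearest-neighbor resize

    cnn = nn.Sequential(                                           # illustrative layers only
        nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    )
    frame_feat = cnn(small).flatten(start_dim=1)                   # flatten to one dimension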
Next, the second feature extraction unit 130 extracts the features of the objects included in the frame received in step S10 using an object detection algorithm (S30).
In step S30, the second feature extraction unit 130 uses a deep learning-based object detection algorithm to extract, for each object in the frame, object features comprising location information, class information, and confidence information.
Specifically, the second feature extraction unit 130 calculates a box value for each object from its location information, extracts a set number of confidence values in descending order of the confidence values contained in the confidence information, and, using the box values of the objects corresponding to the extracted confidence values, merges the features of those objects into one dimension for the FC layer computation. The box value is calculated by Equation 1 above.
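The post-processing of step S30 can be sketched as below: keep the set number (K) of most confident detections and flatten their features into a single one-dimensional vector. Since Equation 1 appears only as an image, box_value here is a hypothetical stand-in consistent with the stated variable definitions (V_size tied to the detection box's share of the image, V_ratio to axis_min/axis_max, each scaled by a λ constant); the actual formula may differ.

    def box_value(w0, h0, wt, ht, lam_size=1.0, lam_ratio=1.0):
        # Hypothetical reading of Equation 1 (image not reproduced in the text)
        v_size = lam_size * (w0 * h0) / (wt * ht)        # box area relative to the image
        v_ratio = lam_ratio * min(w0, h0) / max(w0, h0)  # axis_min / axis_max
        return v_size * v_ratio

    # Illustrative detections: (confidence, class_id, x, y, box_w, box_h)
    detections = [
        (0.92, 2, 0.31, 0.55, 120, 80),
        (0.75, 0, 0.66, 0.40, 60, 160),
        (0.41, 5, 0.12, 0.70, 30, 30),
    ]
    K = 2  # the "set number" of confidence values to keep
    top = sorted(detections, key=lambda d: d[0], reverse=True)[:K]
    obj_feat = [v for (conf, cls, x, y, w, h) in top
                for v in (x, y, box_value(w, h, 1280, 720), cls, conf)]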
Next, the calculation unit 140 mixes the frame features extracted in step S20 with the object features extracted in step S30 and inputs the mixture to the artificial neural network for computation (S40).
In step S40, the calculation unit 140 performs the computation with the artificial neural network and outputs the computation result (Normal or Abnormal).
Finally, the determination unit 150 determines whether the camera is occluded according to the computation result of step S40 (S50).
Specifically, in step S50, the determination unit 150 may use the softmax function to determine whether the camera is occluded according to the class classification result contained in the computation result of the calculation unit 140.
More specifically, when the final computation result of the calculation unit 140 has a higher probability of being Normal, the determination unit 150 judges the camera to be in a normal state, and when the probability of being Abnormal is higher, it judges that camera occlusion has occurred.
The vehicle camera occlusion classification apparatus using a deep learning-based object detector and its method described above may be implemented as an application, or in the form of program instructions executable through various computer components and recorded on a computer-readable recording medium. The computer-readable recording medium may contain program instructions, data files, data structures, and the like, alone or in combination.
The program instructions recorded on the computer-readable recording medium may be ones specially designed and configured for the present invention, or ones known and available to those skilled in the computer software field.
Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.
Examples of program instructions include not only machine code, such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware device may be configured to operate as one or more software modules to perform the processing according to the present invention.
As described above, the vehicle camera occlusion classification apparatus and method using a deep learning-based object detector according to an embodiment of the present invention classify, from the frames of the camera image, whether the camera is occluded by an object using a deep learning object detector, thereby preventing the camera from failing to detect objects or detecting them erroneously and reducing the probability of an accident.
Furthermore, according to an embodiment of the present invention, detecting whether the camera sensor is occluded can be of significant value in autonomous vehicles, and the technique can be applied to various systems that use camera sensors besides vehicles, so it can be used broadly across many fields.
The present invention has been described with reference to the embodiments shown in the drawings, but these are merely illustrative, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible therefrom. Therefore, the true technical scope of protection of the present invention should be determined by the technical spirit of the claims below.
[Description of Reference Numerals]
100: vehicle camera occlusion classification apparatus    110: input unit
120: first feature extraction unit    130: second feature extraction unit
140: calculation unit    150: determination unit

Claims (10)

  1. A vehicle camera occlusion classification apparatus using a deep learning-based object detector, comprising:
    an input unit that receives an original image captured by a camera in units of frames;
    a first feature extraction unit that reduces the size of the received frame and inputs it to a convolutional neural network (CNN) to extract features of the frame;
    a second feature extraction unit that extracts features of an object included in the frame received from the input unit using an object detection algorithm;
    a calculation unit that mixes the features of the frame and the features of the object and inputs them to an artificial neural network (ANN) for computation; and
    a determination unit that determines whether the camera is occluded according to the computation result.
  2. The vehicle camera occlusion classification apparatus using a deep learning-based object detector of claim 1, wherein the first feature extraction unit reduces the size of the frame to 100x100 using nearest-neighbor interpolation, inputs the reduced frame to the convolutional neural network, and flattens the extracted frame features into one dimension.
  3. The vehicle camera occlusion classification apparatus using a deep learning-based object detector of claim 1, wherein the second feature extraction unit extracts, for each object, object features comprising location information, class information, and confidence information of the objects in the frame using a deep learning-based object detection algorithm, and
    calculates a box value for each object using the location information, extracts a set number of confidence values in descending order of the confidence values contained in the confidence information, and merges, using the box values of the objects corresponding to the extracted confidence values, the features of those objects into one dimension for a fully-connected (FC) layer computation.
  4. The vehicle camera occlusion classification apparatus using a deep learning-based object detector of claim 3, wherein the second feature extraction unit calculates the box value using an x-coordinate value and a y-coordinate value contained in the location information of the object, a bounding-box width relative to the image size, and a bounding-box height relative to the image size,
    the box value being calculated by the following equation:
    [Equation — Figure PCTKR2022017979-appb-img-000004, not reproduced in the text]
    where BOX is the box value, V_size is the size of the box, V_ratio is the aspect ratio of the box, w_0 is the detection box width in pixels, w_t is the image width in pixels, h_0 is the detection box height in pixels, h_t is the image height in pixels, axis_min is the smaller of the detection box's horizontal and vertical pixel dimensions, axis_max is the larger of the two, λ_size is a size-adjustment constant parameter, and λ_ratio is a ratio-adjustment constant parameter.
  5. The vehicle camera occlusion classification apparatus using a deep learning-based object detector of claim 1, wherein the determination unit uses a softmax function to determine whether the camera is occluded according to a class classification result contained in the computation result.
  6. A vehicle camera occlusion classification method performed by a vehicle camera occlusion classification apparatus using a deep learning-based object detector, the method comprising:
    receiving an original image captured by a camera in units of frames;
    reducing the size of the received frame and inputting it to a convolutional neural network (CNN) to extract features of the frame;
    extracting features of an object included in the received frame using an object detection algorithm;
    mixing the features of the frame and the features of the object and inputting them to an artificial neural network (ANN) for computation; and
    determining whether the camera is occluded according to the computation result.
  7. The vehicle camera occlusion classification method of claim 6, wherein extracting the features of the frame comprises reducing the size of the frame to 100x100 using nearest-neighbor interpolation, inputting the reduced frame to the convolutional neural network, and flattening the extracted frame features into one dimension.
  8. The vehicle camera occlusion classification method of claim 6, wherein extracting the features of the object comprises extracting, for each object, object features comprising location information, class information, and confidence information of the objects in the frame using a deep learning-based object detection algorithm, and further comprises:
    calculating a box value for each object using the location information;
    extracting a set number of confidence values in descending order of the confidence values contained in the confidence information; and
    merging, using the box values of the objects corresponding to the extracted confidence values, the features of those objects into one dimension for a fully-connected (FC) layer computation.
  9. The vehicle camera occlusion classification method of claim 8, wherein extracting the features of the object comprises calculating the box value using an x-coordinate value and a y-coordinate value contained in the location information of the object, a bounding-box width relative to the image size, and a bounding-box height relative to the image size,
    the box value being calculated by the following equation:
    [Equation — Figure PCTKR2022017979-appb-img-000005, not reproduced in the text]
    where BOX is the box value, V_size is the size of the box, V_ratio is the aspect ratio of the box, w_0 is the detection box width in pixels, w_t is the image width in pixels, h_0 is the detection box height in pixels, h_t is the image height in pixels, axis_min is the smaller of the detection box's horizontal and vertical pixel dimensions, axis_max is the larger of the two, λ_size is a size-adjustment constant parameter, and λ_ratio is a ratio-adjustment constant parameter.
  10. The vehicle camera occlusion classification method of claim 6, wherein determining whether the camera is occluded comprises using a softmax function to determine whether the camera is occluded according to a class classification result contained in the computation result.
PCT/KR2022/017979 2021-12-22 2022-11-15 Vehicle camera occlusion classification device using deep learning-based object detector and method thereof WO2023120988A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2021-0184967 2021-12-22
KR20210184967 2021-12-22
KR1020220005390A KR20230095747A (en) 2021-12-22 2022-01-13 Apparatus for classifying occlusion of a vehicle camera using object detector algorithm based on deep learning and method thereof
KR10-2022-0005390 2022-01-13

Publications (1)

Publication Number Publication Date
WO2023120988A1 true WO2023120988A1 (en) 2023-06-29

Family

ID=86902932

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/017979 WO2023120988A1 (en) 2021-12-22 2022-11-15 Vehicle camera occlusion classification device using deep learning-based object detector and method thereof

Country Status (1)

Country Link
WO (1) WO2023120988A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20170034226A (en) * 2015-09-18 2017-03-28 삼성전자주식회사 Method and apparatus of object recognition, Method and apparatus of learning for object recognition
KR20190026116A (en) * 2017-09-04 2019-03-13 삼성전자주식회사 Method and apparatus of recognizing object
KR20190047243A (en) * 2017-10-27 2019-05-08 현대자동차주식회사 Apparatus and method for warning contamination of camera lens
KR20200039043A (en) * 2018-09-28 2020-04-16 한국전자통신연구원 Object recognition device and operating method for the same
US20210192745A1 (en) * 2019-12-18 2021-06-24 Clarion Co., Ltd. Technologies for detection of occlusions on a camera


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22911626

Country of ref document: EP

Kind code of ref document: A1