KR101986592B1

KR101986592B1 - Method for recognizing a license plate number using anchor boxes and a CNN, and apparatus using the same

Info

Publication number: KR101986592B1
Application number: KR1020190046508A
Authority: KR
Inventors: 오정훈; 고대경; 이강훈
Original assignee: 주식회사 펜타게이트
Priority date: 2019-04-22
Filing date: 2019-04-22
Publication date: 2019-06-10

Abstract

A license plate number recognition apparatus according to the present invention comprises: an image input unit for receiving a first image; a second image generating unit for generating a second image by detecting a region of interest including characters from the first image; a feature component extracting unit for extracting feature components from the second image; an anchor box generating unit for generating, from the second image, a plurality of anchor boxes having the same vertical size as the second image and a horizontal size different from that of the second image; a bounding box generating unit for comparing the feature components of the second image corresponding to the areas of the anchor boxes with character detection learning data and selecting the anchor box having the maximum matching value as the bounding box for each character area; and a character recognition unit for comparing the feature components of the area corresponding to the bounding box with character recognition learning data to recognize the character corresponding to the anchor box. The anchor boxes include: end anchor boxes, a in number (where a is a natural number), positioned to overlap one another at predetermined intervals over the entire area of the second image and having equal horizontal and vertical sizes; base anchor boxes, also a in number, each positioned at one side of an end anchor box and having a horizontal size smaller than its vertical size; and a plurality of sub-anchor boxes, each including the area of a base anchor box and having a horizontal size that is larger than that of the base anchor box by a predetermined increment while remaining smaller than that of the end anchor box. The license plate number can thus be recognized more accurately.

Description

Apparatus and method for recognizing a vehicle number using anchor boxes and a CNN feature map

The present invention relates to a vehicle number recognition apparatus and a vehicle number recognition method using artificial intelligence. More particularly, it relates to a vehicle number recognition apparatus and method that use anchor boxes and a CNN feature map to accurately recognize characters from an image.

A vehicle license plate is a plate attached to the front or rear of a vehicle on which standardized characters (including letters, numerals, and symbols) are written in a specific pattern. A license plate recognition apparatus is an apparatus that automatically recognizes and reads, from an image, a license plate and the characters written on it. A typical license plate recognition apparatus photographs the vehicle, extracts the plate, performs preprocessing to compensate for shading and tilt, and recognizes the plate against predetermined reference characters. Because such an apparatus necessarily involves a vehicle number recognition apparatus or method for recognizing the characters written on the plate, a more accurate vehicle number recognition method is a very important element of a license plate recognition apparatus.

Conventional license plate recognition uses template matching and syntactic methods. Template matching, an improvement on geometric matching, recognizes characters by matching the input image against standard patterns. However, its recognition rate drops when the image is tilted or noisy, and the standard patterns must be rebuilt whenever the environment changes. Syntactic methods use structural information, such as the interrelation or connectivity between character features, and are robust to variations in character size and tilt. However, the structural information must be quantified before it can be extracted, which is difficult, and obtaining accurate structural information between features is not easy.

Recently, much attention has focused on license plate recognition methods that use neural networks. A neural network, or artificial neural network (ANN), is a statistical learning algorithm inspired by biological neural networks. The term refers broadly to models in which artificial neurons (nodes), networked through synaptic connections, acquire problem-solving ability by changing the strength of those connections through learning. Deep learning is a branch of machine learning based on multi-layered artificial neural networks; it is a set of machine learning algorithms that attempt a high level of abstraction through a combination of several nonlinear transformation techniques.

A neural network method divides the input image into regions, feeds those regions to the neural network, and uses the network to recognize characters. The choice of input regions is therefore critical and largely determines performance, so methods for selecting the regions occupied by characters require considerable study.

Korean Patent Registration No. 10-1931804 (Dec. 17, 2018)

The technical problem addressed by the present invention was conceived in view of the above. An object of the present invention is to provide a vehicle number recognition apparatus using anchor boxes and a CNN feature map that can accurately recognize characters from an image by using anchor boxes and an artificial neural network.

Another object of the present invention is to provide a vehicle number recognition method using anchor boxes and a CNN feature map that can recognize a vehicle number more accurately by using anchor boxes and an artificial neural network.

To achieve the above object, a vehicle number recognition apparatus using anchor boxes and a CNN feature map includes: an image input unit that receives a first image; a second image generation unit that detects a region of interest containing characters in the first image and generates a second image; a feature component extraction unit that extracts feature components from the second image; an anchor box generation unit that generates, from the second image, a plurality of anchor boxes having the same vertical size as the second image and horizontal sizes that differ from one another; a bounding box generation unit that compares the feature components of the second image corresponding to the areas of the anchor boxes with character detection learning data and selects the anchor box with the maximum matching value as the bounding box of each character area; and a character recognition unit that compares the feature components of the area corresponding to the bounding box with character recognition learning data and recognizes the character corresponding to the anchor box. The anchor boxes include: end anchor boxes, a in number (a is a natural number), that overlap one another at regular intervals across the entire area of the second image and whose horizontal and vertical sizes are equal; base anchor boxes, also a in number, each positioned at one side of an end anchor box and having a horizontal size smaller than its vertical size; and a plurality of sub-anchor boxes, each containing the area of a base anchor box and having a horizontal size that is larger than that of the base anchor box by a fixed increment while remaining smaller than that of the end anchor box.

In one embodiment of the present invention, the anchor box generation unit may generate the anchor boxes by: generating a base anchor box positioned at one side of the second image; generating sub-anchor boxes, each containing the area of the immediately preceding anchor box and wider than it by a fixed increment, until the horizontal size equals the vertical size; generating the end anchor box, which contains the area of the immediately preceding anchor box; generating a new base anchor box whose horizontal position is shifted by a fixed amount from the previous base anchor box; performing the sub-anchor box generation and end anchor box generation steps again; and repeating the new base anchor box generation step and the re-performing step until the end anchor box located at the far side of the second image, opposite the starting side, has been generated.

In one embodiment of the present invention, the sizes of the anchor boxes are expressed in pixels; the horizontal size (number of pixels) of the base anchor box is 1/r times the vertical size (number of pixels) of the end anchor box; the fixed horizontal increment of the sub-anchor boxes is p pixels; and the fixed interval between the start positions of adjacent end anchor boxes among the a end anchor boxes is q pixels (q is a natural number).
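The generation procedure and the parameters r, p, and q described above can be illustrated with a short Python sketch. This is a minimal illustration rather than the patented implementation: the function name `generate_anchor_boxes`, the `(x, y, width, height)` tuple layout, the choice of the left edge as the starting side, and the use of integer division for N/r are all assumptions made for this example.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels; layout is an assumption


def generate_anchor_boxes(N: int, M: int, r: int = 2, p: int = 2, q: int = 4) -> List[Box]:
    """Enumerate base, sub-, and end anchor boxes for an N-high, M-wide image.

    Assumptions (not stated in the source): boxes start at the left edge,
    N // r is used for the base width, and every box spans the full height.
    """
    if N > M:
        raise ValueError("image must be at least as wide as it is high")
    boxes: List[Box] = []
    x = 0
    while x + N <= M:                    # one group of boxes per end-anchor start position
        w = N // r                       # base anchor box width
        while w < N:                     # base box first, then sub-anchor boxes widened by p
            boxes.append((x, 0, w, N))
            w += p
        boxes.append((x, 0, N, N))       # end anchor box: width equals height
        x += q                           # shift the next base anchor box by q pixels
    return boxes
```

With N = 8, M = 16, r = 2, p = 2, and q = 4, each group consists of boxes of width 4, 6, and 8, and the groups start at x = 0, 4, and 8. If M - N is not a multiple of q, this sketch stops short of the far edge; how the source handles that case is not stated.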

In one embodiment of the present invention, when r is 2, the total number of anchor boxes generated from one second image may be calculated by the following equation.

Here, N is the vertical size of the second image and M is the horizontal size of the second image.
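The equation itself is not reproduced in this text. As a rough cross-check only, the count can be derived directly from the generation procedure. The sketch below is an assumption-based reconstruction, not the equation from the source: it assumes base widths of N // r, sub-anchor boxes strictly narrower than N, and end anchor box start positions at 0, q, 2q, and so on while the box still fits inside the image. The function name `count_anchor_boxes` is hypothetical.

```python
def count_anchor_boxes(N: int, M: int, p: int, q: int, r: int = 2) -> int:
    """Count anchor boxes implied by the construction (assumption-based sketch)."""
    if M < N:
        return 0
    # Base box plus sub-anchor boxes: widths N//r, N//r + p, ... strictly below N,
    # then one end anchor box of width N.
    per_group = len(range(N // r, N, p)) + 1
    # One group per end-anchor start position 0, q, 2q, ... with x + N <= M.
    groups = (M - N) // q + 1
    return per_group * groups
```

For example, N = 8, M = 16, p = 2, q = 4 gives 3 boxes per group and 3 groups, or 9 boxes in total.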

In one embodiment of the present invention, the vehicle number recognition apparatus may further include a vehicle detection unit that detects a vehicle area in the first image using the SSD (Single Shot MultiBox Detector) algorithm, and the second image generation unit may generate the second image by detecting the license plate area, which is the region of interest, within the detected vehicle area using the SSD algorithm.

In one embodiment of the present invention, the apparatus may further include a preprocessing unit that improves image quality by performing intensity balancing and morphology operations on the second image. The morphology operations include closing and opening. Intensity balancing computes a histogram of the image gray levels, computes from the histogram the pixel values corresponding to the lower and upper quartiles, and performs an affine transform according to the following equation.

Here, x is the pixel value at a given position, max is the maximum pixel value in the region, min is the minimum pixel value in the region, Vmin is the pixel value corresponding to the lower quartile, and Vmax is the pixel value corresponding to the upper quartile.
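The affine transform equation is not reproduced in this text, so the following Python sketch substitutes one common form of linear contrast stretch that uses the same five quantities: it maps Vmin to min and Vmax to max and saturates values outside that range. That mapping, the function names, and the flat list-of-pixels input are all assumptions for illustration; the source's actual formula may differ.

```python
def quartile_levels(pixels):
    """Return (Vmin, Vmax): gray levels at the lower and upper quartiles,
    read off the cumulative 256-bin histogram."""
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    v_min = v_max = None
    cumulative = 0
    for level, count in enumerate(hist):
        cumulative += count
        if v_min is None and cumulative >= 0.25 * total:
            v_min = level
        if cumulative >= 0.75 * total:
            v_max = level
            break
    return v_min, v_max


def intensity_balance(pixels):
    """Affine stretch x -> (x - Vmin) * (max - min) / (Vmax - Vmin) + min,
    clipped to [min, max]. ASSUMED form; not taken from the source."""
    if not pixels:
        return []
    lo, hi = min(pixels), max(pixels)
    v_min, v_max = quartile_levels(pixels)
    if v_max == v_min:                      # flat distribution: nothing to stretch
        return list(pixels)
    scale = (hi - lo) / (v_max - v_min)
    return [min(hi, max(lo, round((x - v_min) * scale + lo))) for x in pixels]
```

The input is a flat list of 8-bit gray values (0 to 255); a 2-D image would be flattened first and reshaped afterwards.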

In one embodiment of the present invention, the vehicle number recognition apparatus may further include a projection unit that calculates the upper and lower positions of the second image using horizontal projection.
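Horizontal projection is commonly implemented by summing each row of a binarized image and taking the first and last rows whose sum exceeds a threshold. The sketch below illustrates that general technique; the binarization convention (1 for character pixels), the `min_ink` threshold, and the function name `text_row_bounds` are assumptions, since the source does not give these details.

```python
def text_row_bounds(binary, min_ink=1):
    """Return (top, bottom) row indices bounding the character band.

    `binary` is a list of rows, each a list of 0/1 values where 1 marks a
    character pixel (assumed convention). Returns None if no row qualifies.
    """
    row_sums = [sum(row) for row in binary]            # horizontal projection profile
    rows = [i for i, s in enumerate(row_sums) if s >= min_ink]
    if not rows:
        return None
    return rows[0], rows[-1]
```

Raising `min_ink` makes the bounds less sensitive to isolated noise pixels above or below the characters.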

In one embodiment of the present invention, the feature component extraction unit may extract a convolution feature map of the second image using a convolutional neural network (CNN) and use it as the feature components, and the CNN may include an input layer, a convolution layer, and a pooling layer.

In one embodiment of the present invention, the character recognition unit may recognize the character corresponding to the anchor box using a fully connected network.

To achieve the above object, a vehicle number recognition method using anchor boxes and a CNN feature map includes: receiving, by an image input unit, a first image; detecting, by a second image generation unit, a region of interest containing characters in the first image and generating a second image; extracting, by a feature component extraction unit, feature components from the second image; generating, by an anchor box generation unit, from the second image a plurality of anchor boxes having the same vertical size as the second image and horizontal sizes that differ from one another; comparing, by a bounding box generation unit, the feature components of the second image corresponding to the areas of the anchor boxes with character detection learning data and selecting the anchor box with the maximum matching value as the bounding box of each character area; and comparing, by a character recognition unit, the feature components of the area corresponding to the bounding box with character recognition learning data and recognizing the character corresponding to the anchor box. The anchor boxes include: end anchor boxes, a in number (a is a natural number), that overlap one another at regular intervals across the entire area of the second image and whose horizontal and vertical sizes are equal; base anchor boxes, also a in number, each positioned at one side of an end anchor box and having a horizontal size smaller than its vertical size; and a plurality of sub-anchor boxes, each containing the area of a base anchor box and having a horizontal size that is larger than that of the base anchor box by a fixed increment while remaining smaller than that of the end anchor box.

In one embodiment of the present invention, generating the anchor boxes may include: generating a base anchor box positioned at one side of the second image; generating sub-anchor boxes, each containing the area of the immediately preceding anchor box and wider than it by a fixed increment, until the horizontal size equals the vertical size; generating the end anchor box, which contains the area of the immediately preceding anchor box; generating a new base anchor box whose horizontal position is shifted by a fixed amount from the previous base anchor box; performing the sub-anchor box generation and end anchor box generation steps again; and repeating the new base anchor box generation step and the re-performing step until the end anchor box located at the far side of the second image, opposite the starting side, has been generated.

In one embodiment of the present invention, the sizes of the anchor boxes are expressed in pixels; the horizontal size (number of pixels) of the base anchor box is 1/r times the vertical size (number of pixels) of the end anchor box; the fixed horizontal increment of the sub-anchor boxes is p pixels; and the fixed interval between the start positions of adjacent end anchor boxes among the a end anchor boxes is q pixels.

In one embodiment of the present invention, the vehicle number recognition method may further include detecting, by a vehicle detection unit, a vehicle area in the first image using the SSD (Single Shot MultiBox Detector) algorithm, and the step of generating the second image may generate the second image by detecting the license plate area, which is the region of interest, within the detected vehicle area using the SSD algorithm.

In one embodiment of the present invention, the vehicle number recognition method may further include improving image quality by having a preprocessing unit perform intensity balancing and morphology operations on the second image. Improving the image quality may include: performing a closing operation; performing an opening operation after the closing operation; computing a histogram of the image gray levels; computing from the histogram the pixel values corresponding to the lower and upper quartiles; and performing an affine transform according to the following equation.
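The closing-then-opening order described above can be sketched with standard grayscale morphology. The sketch assumes a 3x3 square structuring element and windows truncated at the image border, neither of which is specified in the source; the function names are hypothetical.

```python
def _filter3x3(img, pick):
    """Apply `pick` (min or max) over each pixel's 3x3 neighbourhood.
    Windows are truncated at the border (an assumption of this sketch)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = pick(window)
    return out


def dilate(img):
    return _filter3x3(img, max)


def erode(img):
    return _filter3x3(img, min)


def closing(img):
    """Dilation followed by erosion: fills small dark holes."""
    return erode(dilate(img))


def opening(img):
    """Erosion followed by dilation: removes small bright specks."""
    return dilate(erode(img))


def clean(img):
    """Closing first, then opening, in the order the text describes."""
    return opening(closing(img))
```

Images are lists of rows of gray values. In practice an image library would replace these loops; the pure-Python version is only meant to make the order of operations explicit.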

In one embodiment of the present invention, the vehicle number recognition method may further include calculating, by a projection unit, the upper and lower positions of the second image using horizontal projection.

In one embodiment of the present invention, the step of extracting the feature components may extract a convolution feature map of the second image using a convolutional neural network (CNN) and use it as the feature components, and the CNN may include an input layer, a convolution layer, and a pooling layer.

In one embodiment of the present invention, the step of recognizing the character may recognize the character corresponding to the anchor box using a fully connected network.

According to embodiments of the present invention, the vehicle number recognition apparatus and method using anchor boxes and a CNN feature map include an image input unit, a second image generation unit, a feature component extraction unit, an anchor box generation unit, a bounding box generation unit, and a character recognition unit. Anchor boxes are generated from the image, and the feature components of the portion corresponding to each anchor box are compared with character recognition learning data, so that the vehicle number can be recognized more accurately.

In addition, because the present invention generates the bounding boxes, that is, the areas where characters are located, from many anchor boxes of various sizes, the area can be selected variably according to the characteristics of each character. This enables efficient selection of appropriately sized areas and, in turn, more accurate vehicle number recognition.

In addition, because the present invention uses artificial neural network technology to extract character areas and recognize characters, it can provide high scalability and accuracy in character recognition.

FIG. 1 is a configuration diagram illustrating a vehicle number recognition apparatus using anchor boxes and a CNN feature map according to an embodiment of the present invention.
FIG. 2 is a configuration diagram illustrating a vehicle number recognition apparatus using anchor boxes and a CNN feature map according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a vehicle number recognition method using anchor boxes and a CNN feature map according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating a vehicle number recognition method using anchor boxes and a CNN feature map according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating the step of improving image quality in the vehicle number recognition method using anchor boxes and a CNN feature map according to an embodiment of the present invention.
FIG. 6 is a flowchart illustrating the step of generating anchor boxes in the vehicle number recognition apparatus and method using anchor boxes and a CNN feature map according to an embodiment of the present invention.
FIG. 7 is a view showing anchor boxes of the vehicle number recognition apparatus and method using anchor boxes and a CNN feature map according to an embodiment of the present invention.
FIG. 8 is a view showing anchor boxes of the vehicle number recognition apparatus and method using anchor boxes and a CNN feature map according to an embodiment of the present invention.
FIG. 9 is a view showing an implementation example of the vehicle number recognition apparatus and method using anchor boxes and a CNN feature map according to an embodiment of the present invention.
FIG. 10 is a view showing a general SSD method.
FIG. 11 is a view showing an implementation example of the second image in the vehicle number recognition apparatus and method using anchor boxes and a CNN feature map according to an embodiment of the present invention.
FIG. 12 is a view showing an implementation example of the CNN in the vehicle number recognition apparatus and method using anchor boxes and a CNN feature map according to an embodiment of the present invention.
FIG. 13 is a view showing an implementation example of the fully connected network in the vehicle number recognition apparatus and method using anchor boxes and a CNN feature map according to an embodiment of the present invention.

The present invention may be modified in various ways and may take various forms, and embodiments are described in detail herein. This is not intended to limit the invention to the particular forms disclosed; it should be understood to cover all modifications, equivalents, and substitutes falling within the spirit and technical scope of the invention. Like reference numerals are used for like elements in describing the drawings. Terms such as first and second may be used to describe various elements, but the elements should not be limited by these terms.

These terms are used only to distinguish one element from another. The terminology used in this application is used only to describe particular embodiments and is not intended to limit the invention. Singular expressions include plural expressions unless the context clearly indicates otherwise.

In this application, terms such as "comprise" or "consist of" are intended to specify the presence of the features, numbers, steps, operations, elements, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which this invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in this application.

Hereinafter, preferred embodiments of the present invention are described in more detail with reference to the drawings.

In the present invention, characters include letters, numerals, and symbols.

FIG. 1 is a configuration diagram illustrating a vehicle number recognition apparatus using anchor boxes and a CNN feature map according to an embodiment of the present invention.

Referring to FIG. 1, a vehicle number recognition apparatus using anchor boxes and a CNN feature map according to an embodiment of the present invention includes an image input unit 100, a second image generation unit 200, a feature component extraction unit 300, an anchor box generation unit 400, a bounding box generation unit 500, and a character recognition unit 600.

The image input unit 100 may receive a first image. For example, the image input unit 100 may be a camera, a CCTV, or an image sensor. For example, the image input unit 100 may be an IP camera. For example, the image input unit 100 may be a camera installed over a road to photograph the front or rear of a vehicle. The first image may contain characters of a uniform standard, and these characters may be arranged in a row. For example, the first image may contain numerals, letters, or symbols. For example, as in FIG. 11, the first image may contain a vehicle license plate.

The second image generation unit 200 may detect a region of interest containing characters in the first image and generate a second image. For example, as in FIG. 11, the region of interest may be the license plate area. A deep learning technique may be used to detect the region of interest. For example, as in FIG. 9, the region of interest is the license plate area, the deep learning technique is SSD (Single Shot MultiBox Detector), and the license plate may be detected using license plate detection learning data. The SSD technique used for license plate detection is a known technique, as shown in FIG. 10, so a detailed description is omitted.

The feature component extracting unit 300 may extract a feature component from the second image. For example, the feature component extracting unit 300 may extract a convolution feature map of the second image using a convolutional neural network (CNN) and use that map as the feature component.

The convolutional neural network may include an input layer, a convolution layer, a pooling layer, and a feature map. A plurality of convolution layers and pooling layers may be used. The convolution layer may transform the image by applying a filter to the input data and then applying an activation function. The pooling layer may be a subsampling layer that reduces the dimensions of the image. Accordingly, the pooling layer may be omitted.

For example, as shown in FIG. 12, the input layer may normalize the second image to 112x112 using a single gray-level channel (@1, or depth 1). The convolution layer may convert the normalized second image to 56x56 using 32 gray-level channels, and the pooling layer may convert that result to 28x28 using 32 gray-level channels. The next convolution layer may convert the pooled image to 14x14 using 64 gray-level channels, and the next pooling layer may convert that result to 7x7 using 128 gray-level channels. The final 7x7 image with a depth of 128 may be used as the feature map.
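The layer sizes in this example can be checked with a small sketch. Only the sizes and channel counts come from the description of FIG. 12; the stage names (`conv1`, `pool1`, and so on) and the helper functions are hypothetical labels introduced here for illustration.

```python
# Layer plan read off the description of FIG. 12:
# (hypothetical stage name, output side length, output channels).
STAGES = [
    ("input", 112, 1),
    ("conv1", 56, 32),
    ("pool1", 28, 32),
    ("conv2", 14, 64),
    ("pool2", 7, 128),
]


def check_halving(stages):
    """Return True if every stage halves the side length of the previous one."""
    return all(prev[1] == 2 * cur[1] for prev, cur in zip(stages, stages[1:]))


def feature_map_size(stages):
    """Number of values in the final feature map (side * side * channels)."""
    _, side, channels = stages[-1]
    return side * side * channels
```

Under these figures each stage halves the spatial size, and the final 7x7x128 feature map holds `feature_map_size(STAGES)` values.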

The anchor box generating unit 400 may generate a plurality of anchor boxes from the second image. The anchor boxes may have the same vertical size as the second image and a horizontal size different from that of the second image. An anchor box may be an arbitrary partition region used for character recognition.

The anchor boxes may include an end anchor box 403, a base anchor box 401, and a sub anchor box 402.

The end anchor box 403 may have equal horizontal and vertical sizes. For example, as shown in FIG. 7, it may be an anchor box whose horizontal and vertical sizes are both N. There may be a plurality of end anchor boxes 403. For example, the number of end anchor boxes 403 may be a, where a is a natural number. The end anchor boxes 403 may be positioned to overlap one another at regular intervals across the entire area of the second image. For example, as shown in FIG. 8, when the second image has horizontal size M and vertical size N and the end anchor boxes 403 are arranged at an interval of 1 (r=1), the number a of end anchor boxes 403 may be (M-N+1).

The base anchor box 401 may have a horizontal size smaller than its vertical size. For example, as shown in FIG. 7, the horizontal size of the base anchor box 401 may be 1/r times its vertical size. For example, the horizontal size of the base anchor box 401 may be 1/2 of its vertical size. The base anchor box 401 may be positioned at one side of the end anchor box 403. Base anchor boxes 401 may be generated in the same number as the end anchor boxes 403. For example, as shown in FIG. 8, when the second image has horizontal size M and vertical size N and the base anchor boxes 401 are arranged at an interval of 1 (r=1), the number a of base anchor boxes 401 may be (M-N+1).

The sub anchor box 402 may have a horizontal size that is smaller than that of the end anchor box 403 and larger than that of the base anchor box 401 by a predetermined increment. A plurality of sub anchor boxes 402 may be generated for one base anchor box 401. For example, as shown in FIG. 7, the horizontal size of the sub anchor boxes 402 may be 1/r of the vertical size plus n times p.

The sub anchor box 402 may contain the area of the base anchor box 401. For example, as shown in FIG. 7, the sub anchor boxes 402 may have a horizontal size larger than N/r, the horizontal size of the base anchor box 401, and a vertical size equal to N, the vertical size of the base anchor box 401, and may be positioned so that they fully contain the base anchor box 401 at one side. The sub anchor boxes 402 may be positioned at locations shifted horizontally by p from one another.

Anchor box sizes may be expressed in pixels. The horizontal size (number of pixels) of the base anchor box 401 may be 1/r times the vertical size (number of pixels) of the end anchor box 403. For example, the number of pixels in the vertical size may be even, and r may be a divisor of the vertical size. For example, as shown in FIG. 7, the base anchor boxes 401 may have N vertical pixels and N/r horizontal pixels.

The fixed increment by which the horizontal size of the sub anchor box 402 grows may be p pixels. For example, as shown in FIG. 7, the sub anchor boxes 402 may have N vertical pixels and (N/r)+n*p horizontal pixels (n is a natural number).

Among the a end anchor boxes 403, the regular interval between the start positions of adjacent end anchor boxes may be q pixels. Among the a base anchor boxes 401, the regular interval between the start positions of adjacent base anchor boxes 401 may be q pixels. For example, as shown in FIG. 8(b), base anchor boxes 401 having N/r horizontal pixels and N vertical pixels may be arranged at intervals of q pixels. For example, as shown in FIG. 8(c), end anchor boxes 403 having N horizontal pixels and N vertical pixels may be arranged at intervals of q pixels.

The anchor box generating unit 400 may change r and q according to the size of the second image. For example, when the vertical size of the second image is odd, the horizontal size is even, and r is 2, the anchor box generating unit 400 may set q to a value expressed in units of 0.5. The values of r, q, and p may be changed by a device user or a device administrator.

As shown in FIG. 6, the method by which the anchor box generating unit 400 generates the anchor boxes may include generating a base anchor box (S410), generating sub anchor boxes (S420), generating an end anchor box (S430), generating a new base anchor box (S440), performing again (S450), and repeating until the end anchor box located at the farthest opposite side is generated (S460).

In generating the base anchor box (S410), a base anchor box 401 located at one side of the second image may be generated. For example, as shown in FIG. 8(a), a base anchor box 401 with vertical size N and horizontal size N/r may be generated at one end of the second image.

In generating the sub anchor boxes (S420), sub anchor boxes 402 may be generated until the horizontal size equals the vertical size, each containing the area of the immediately preceding anchor box and having a horizontal size increased by a fixed increment from that of the immediately preceding anchor box. For example, as shown in FIG. 8(a), sub anchor boxes 402 with vertical size N and horizontal size (N/r)+n*p may be generated starting from one end of the second image (n is a natural number).

In generating the end anchor box (S430), an end anchor box 403 containing the area of the immediately preceding anchor box may be generated. For example, as shown in FIG. 8(a), an end anchor box 403 with vertical size N and horizontal size N may be generated starting from one end of the second image.

In generating the new base anchor box (S440), a new base anchor box 401 may be generated whose horizontal position is shifted by a fixed amount from the immediately preceding base anchor box 401. For example, as shown in FIG. 8(b), a base anchor box 401 with vertical size N and horizontal size N/r, shifted by q from one end of the second image, may be generated.

In performing again (S450), generating the sub anchor boxes (S420) and generating the end anchor box (S430) may be performed again. For example, as shown in FIG. 8(b), sub anchor boxes 402 with vertical size N and horizontal size (N/r)+n*p and end anchor boxes 403 with vertical size N and horizontal size N, each shifted by l*q from one end of the second image, may be generated (n and l are natural numbers).

In repeating until the end anchor box located at the farthest opposite side is generated (S460), generating the new base anchor box (S440) and performing again (S450) may be repeated until the end anchor box 403 located at the side farthest from the one side of the second image is generated. For example, as shown in FIG. 8(c), an end anchor box 403 with vertical size N and horizontal size N, located at the end opposite the one side of the second image, may be generated.

Accordingly, the number of anchor boxes generated from a second image with vertical size N and horizontal size M, with r equal to 2, may be calculated by Equation 1 below.

Equation 1
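The generation steps S410 to S460 can be sketched as a loop over start positions. This is a minimal illustration, assuming integer pixel values, that r divides N, and that boxes are laid out left to right; the function name `generate_anchor_boxes` and the tuple layout are hypothetical, and the sketch makes no claim about the closed form of Equation 1.

```python
def generate_anchor_boxes(m_width, n_height, r=2, p=1, q=1):
    """Return anchor boxes as (x_start, width, kind) tuples.

    Every box has the full image height N, so only the horizontal
    start position and the width are recorded.
    """
    if n_height % r != 0:
        raise ValueError("this sketch assumes r divides the height N")
    boxes = []
    x = 0
    # S440/S460: shift the start position by q until an end box
    # would no longer fit inside the image width M.
    while x + n_height <= m_width:
        boxes.append((x, n_height // r, "base"))   # S410 / S440: width N/r
        w = n_height // r + p
        while w < n_height:                        # S420: widen by p each time
            boxes.append((x, w, "sub"))
            w += p
        boxes.append((x, n_height, "end"))         # S430: width N
        x += q
    return boxes
```

For instance, `generate_anchor_boxes(20, 8, r=2, p=2, q=1)` visits M-N+1 = 13 start positions and produces a base, one sub, and an end box at each of them.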

The bounding box generating unit 500 may compare the feature components of the second image corresponding to the areas of the anchor boxes with character detection learning data and select the anchor box with the maximum matching value as the bounding box of each character area.

The character recognition unit 600 may recognize the character corresponding to the anchor box by comparing the feature component of the area corresponding to the bounding box with character recognition learning data. The character recognition unit 600 may recognize the character corresponding to the anchor box using a fully connected network. The fully connected network may include an input layer, a fully connected layer, and an output layer. For example, as shown in FIG. 13, the fully connected network may include an input layer, a first fully connected layer, and a second fully connected layer, and the feature component of the area corresponding to the bounding box may be input to the input layer. Here, a fully connected network refers to an artificial neural network that uses layers in which each neuron is connected to all neurons of the previous layer.

FIG. 2 is a block diagram of a license plate number recognition apparatus using anchor boxes and a CNN feature map according to an embodiment of the present invention.

Except for the vehicle detection unit 700, the preprocessing unit 800, and the projection unit 900, the license plate number recognition apparatus using anchor boxes and a CNN feature map according to this embodiment is substantially the same as the apparatus of FIG. 1. Accordingly, components identical to those of the apparatus of FIG. 1 are given the same reference numerals, and repeated description is omitted.

Referring to FIG. 2, the license plate number recognition apparatus using anchor boxes and a CNN feature map according to an embodiment of the present invention may include a vehicle detection unit 700, a preprocessing unit 800, and a projection unit 900.

The vehicle detection unit 700 may detect a vehicle region in the first image using the SSD (Single Shot MultiBox Detector) algorithm. The vehicle detection unit 700 may detect the vehicle region using the SSD technique and vehicle detection learning data. The SSD technique used for vehicle detection is already known, as shown in FIG. 10, so a detailed description is omitted. The second image generating unit 200 may generate the second image by detecting, within the detected vehicle region, the license plate region that is the region of interest, using the SSD algorithm.

The preprocessing unit 800 may improve image quality by performing intensity balancing and morphology operations on the second image.

Intensity balancing may compute a histogram of the image's gray levels, compute from the histogram the pixel values corresponding to the lower quartile and the upper quartile, and perform an affine transform using Equation 2 below.

Equation 2

Here, x is the pixel value at a given position, max is the maximum pixel value in the region, min is the minimum pixel value in the region, Vmin is the pixel value corresponding to the lower quartile, and Vmax is the pixel value corresponding to the upper quartile.
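The histogram and quartile steps of intensity balancing can be sketched as follows. The body of Equation 2 is not reproduced in this text, so the mapping in `balance` is an assumed linear stretch chosen only for illustration and is not the patent's formula; the function names, the 8-bit gray range, and the `out_min`/`out_max` parameters are likewise assumptions.

```python
def quartile_values(pixels, levels=256):
    """Gray levels at the lower and upper quartiles, read from a histogram."""
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    v_min = v_max = None
    cumulative = 0
    for level, count in enumerate(hist):
        cumulative += count
        if v_min is None and cumulative >= 0.25 * total:
            v_min = level      # lower-quartile pixel value (Vmin)
        if v_max is None and cumulative >= 0.75 * total:
            v_max = level      # upper-quartile pixel value (Vmax)
    return v_min, v_max


def balance(pixels, out_min=0, out_max=255):
    """ASSUMED linear stretch (not Equation 2): map [Vmin, Vmax] onto
    [out_min, out_max] and clip values that fall outside that range."""
    v_min, v_max = quartile_values(pixels)
    if v_max == v_min:
        return list(pixels)    # flat image: nothing to stretch
    scale = (out_max - out_min) / (v_max - v_min)
    return [min(out_max, max(out_min, round((x - v_min) * scale + out_min)))
            for x in pixels]
```

Here `pixels` is a flat list of integer gray levels; a two-dimensional image would be flattened first and reshaped afterwards.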

The morphology operation may be a method for removing noise from the image and emphasizing the character portions. The morphology operation may include closing morphology and opening morphology, and the preprocessing unit 800 may perform the closing morphology followed by the opening morphology. The closing morphology may be an erosion operation followed by a dilation operation. The opening morphology may be a dilation operation followed by an erosion operation. The closing morphology removes noise, and the opening morphology may emphasize the pixels of the character portions.
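The two operation orders described above can be sketched on a small binary image. The 3x3 window and the function names are assumptions made for this illustration. Note that in common image-processing terminology, erosion followed by dilation is usually called opening and dilation followed by erosion is usually called closing; the sketch names the functions by operation order to avoid ambiguity.

```python
def _neighborhood(img, y, x):
    """Values in the 3x3 window around (y, x); positions outside the image are skipped."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - 1), min(h, y + 2))
            for i in range(max(0, x - 1), min(w, x + 2))]


def erode(img):
    """Each pixel becomes the minimum of its 3x3 neighborhood."""
    return [[min(_neighborhood(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img):
    """Each pixel becomes the maximum of its 3x3 neighborhood."""
    return [[max(_neighborhood(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def erode_then_dilate(img):
    """Removes isolated bright specks (the order the text gives for its first step)."""
    return dilate(erode(img))


def dilate_then_erode(img):
    """Fills small dark gaps inside strokes (the order the text gives for its second step)."""
    return erode(dilate(img))
```

Applying `erode_then_dilate` and then `dilate_then_erode` follows the sequence the preprocessing unit 800 is described as performing.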

The projection unit 900 may calculate the upper position and the lower position of the second image using horizontal projection. Horizontal projection may be a method that binarizes the second image, performs a horizontal projection on each row to count its pixels, examines the pixel counts starting from the bottom, sets the part with the minimum count as the lower position, and then sets the next part with the minimum count as the upper position.
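One possible reading of this horizontal projection is sketched below. The description leaves several details open, so the sketch assumes that "minimum count" means the global minimum of the row counts, that the lower position is the minimum-count row closest to the text band when scanning up from the bottom, and that the upper position is the first minimum-count row reached after the text band. The function name is hypothetical.

```python
def upper_lower_positions(binary):
    """Return (upper, lower) row indices for a binarized image.

    `binary` is a list of rows containing 0/1 values, with row 0 at the top.
    Either value may be None if no matching row is found.
    """
    counts = [sum(row) for row in binary]      # foreground pixels per row
    lowest = min(counts)
    lower = upper = None
    seen_text = False
    for y in range(len(counts) - 1, -1, -1):   # scan from the bottom row upward
        if counts[y] == lowest:
            if not seen_text:
                lower = y                      # keep moving up toward the text band
            else:
                upper = y                      # first minimum row above the text band
                break
        elif lower is not None:
            seen_text = True                   # rows with more pixels: the text band
    return upper, lower
```

The rows strictly between the two returned indices would then bound the character band vertically under these assumptions.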

The license plate number recognition apparatus may further include a power supply unit that supplies power to the apparatus and, where any component of the apparatus requires a storage function, a storage unit.

The license plate number recognition apparatus may further include a number output unit that combines the recognized characters in order and outputs the vehicle's license plate number.
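Combining the recognized characters in order can be sketched as a sort on the horizontal start position of each bounding box. This assumes that "in order" means left to right and that each recognition result carries its box position; the function name and the input format are hypothetical.

```python
def assemble_plate_number(recognized):
    """Join recognized characters from left to right.

    `recognized` is a list of (x_start, character) pairs, one per bounding box.
    """
    ordered = sorted(recognized, key=lambda item: item[0])
    return "".join(ch for _, ch in ordered)
```

For example, `assemble_plate_number([(40, "3"), (0, "1"), (20, "2")])` returns `"123"`.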

The configuration of the license plate number recognition apparatus may be modified in various ways. All components of the apparatus may be implemented as a single device or as separate devices. For example, the image input unit 100, the second image generating unit 200, the feature component extracting unit 300, the anchor box generating unit 400, the bounding box generating unit 500, and the character recognition unit 600 may be implemented as a single device. For example, the second image generating unit 200, the feature component extracting unit 300, the anchor box generating unit 400, the bounding box generating unit 500, and the character recognition unit 600 may be implemented as a single processor. For example, the second image generating unit 200, the feature component extracting unit 300, the anchor box generating unit 400, the bounding box generating unit 500, and the character recognition unit 600 may be implemented as a server that is a separate device, and a plurality of image input units 100 may be connected to the server by wire or wirelessly. The apparatus may further include a communication unit that transmits data to other components or to other systems.

FIG. 3 is a flowchart of a license plate number recognition method using anchor boxes and a CNN feature map according to an embodiment of the present invention.

The license plate number recognition method according to this embodiment is performed by the license plate number recognition apparatus and is substantially the same as the apparatus of FIGS. 1 and 2, differing only in claim category. Accordingly, components identical to those of the apparatus of FIGS. 1 and 2 are given the same reference numerals, and repeated description is omitted.

Referring to FIG. 3, the license plate number recognition method according to an embodiment of the present invention includes receiving a first image (S100), generating a second image (S200), extracting a feature component (S300), extracting a feature component (S400), selecting a bounding box (S500), and recognizing characters (S600).

In receiving the first image (S100), the image input unit 100 may receive a first image. The first image may contain characters of a uniform standard. The first image may contain characters of a uniform standard arranged in a row. For example, the first image may contain digits, letters, or symbols. For example, as shown in FIG. 11, the first image may contain a vehicle license plate.

In generating the second image (S200), the second image generating unit 200 may generate a second image by detecting, in the first image, a region of interest that contains characters. For example, as shown in FIG. 11, the region of interest may be the region of a vehicle license plate. A deep learning technique may be used to detect the region of interest. For example, the region of interest is the region of a vehicle license plate, the deep learning technique is SSD (Single Shot MultiBox Detector), and the license plate region may be detected using license plate detection learning data. The SSD technique used for license plate detection is already known, as shown in FIG. 10, so a detailed description is omitted.

In extracting the feature component (S300), the feature component extracting unit 300 may extract a feature component from the second image. For example, the feature component extracting unit 300 may extract a convolution feature map of the second image using a convolutional neural network (CNN) and use that map as the feature component.

In extracting the feature component (S400), the anchor box generating unit 400 may generate a plurality of anchor boxes from the second image. The anchor boxes may have the same vertical size as the second image and a horizontal size different from that of the second image. An anchor box may be an arbitrary partition region used for character recognition.

The anchor box generating unit 400 may change r and q according to the size of the second image. For example, when the vertical size of the second image is odd, the horizontal size is even, and r is 2, the anchor box generating unit 400 may set q to a value expressed in units of 0.5.

Extracting the feature component (S400) may include generating a base anchor box (S410), generating sub anchor boxes (S420), generating an end anchor box (S430), generating a new base anchor box (S440), performing again (S450), and repeating until the end anchor box located at the farthest opposite side is generated (S460).

Accordingly, the number of anchor boxes generated from a second image with vertical size N and horizontal size M, with r equal to 2, may be calculated by Equation 3 below.

Equation 3

In selecting the bounding box (S500), the bounding box generating unit 500 may compare the feature components of the second image corresponding to the areas of the anchor boxes with character detection learning data and select the anchor box with the maximum matching value as the bounding box of each character area.

In recognizing the characters (S600), the character recognition unit 600 may recognize the character corresponding to the anchor box by comparing the feature component of the area corresponding to the bounding box with character recognition learning data. The character recognition unit 600 may recognize the character corresponding to the anchor box using a fully connected network. The fully connected network may include an input layer, a fully connected layer, and an output layer. For example, as shown in FIG. 13, the fully connected network may include an input layer, a first fully connected layer, and a second fully connected layer, and the feature component of the area corresponding to the bounding box may be input to the input layer. Here, a fully connected network refers to an artificial neural network that uses layers in which each neuron is connected to all neurons of the previous layer.

FIG. 4 is a flowchart of a license plate number recognition method using anchor boxes and a CNN feature map according to an embodiment of the present invention. FIG. 5 is a flowchart of the image quality improvement step of the license plate number recognition method according to an embodiment of the present invention.

Except for detecting a vehicle region (S110), improving image quality (S210), and calculating the upper and lower positions of the second image (S220), the license plate number recognition method according to this embodiment is substantially the same as the method of FIG. 3. Accordingly, components identical to those of the method of FIG. 3 are given the same reference numerals, and repeated description is omitted.

Referring to FIGS. 4 and 5, the license plate number recognition method according to an embodiment of the present invention may include detecting a vehicle region (S110), improving image quality (S210), and calculating the upper and lower positions of the second image (S220).

In detecting the vehicle region (S110), the vehicle detection unit 700 may detect a vehicle region in the first image using the SSD (Single Shot MultiBox Detector) algorithm. For example, as shown in FIG. 9, the vehicle may be detected in this step using SSD and vehicle detection learning data. The SSD technique used for vehicle detection is already known, as shown in FIG. 10, so a detailed description is omitted. In generating the second image, the second image may be generated by detecting, within the detected vehicle region, the license plate region that is the region of interest, using the SSD algorithm.

In the step of improving image quality (S210), the preprocessing unit 800 may improve image quality by performing intensity balancing and morphology on the second image.

The step of improving image quality (S210) may include a step of calculating a histogram (S211), a step of calculating pixel values (S212), a step of performing an affine transform (S213), a step of performing a closing morphology (S214), and a step of performing an opening morphology (S215).

In the step of calculating the histogram (S211), a histogram of the image gray levels may be calculated from the second image. In the step of calculating pixel values (S212), the pixel values corresponding to the lower quartile and the upper quartile may be calculated from the histogram. In the step of performing the affine transform (S213), the affine transform may be performed using Equation 4 below.

Equation 4
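The intensity balancing steps S211 to S213 can be sketched as follows. This is a minimal illustration, not the patented implementation: the histogram and quartile steps follow the text, but the exact form of Equation 4 is not shown in this text, so the linear mapping in `stretch` (lower quartile to `out_min`, upper quartile to `out_max`, with clipping) is an assumption. All function and parameter names are hypothetical.

```python
def gray_histogram(image, levels=256):
    """Count how many pixels take each gray level (step S211)."""
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    return hist


def quartile_levels(hist):
    """Return the gray levels at the lower and upper quartiles (step S212)."""
    total = sum(hist)
    v_min = v_max = None
    cumulative = 0
    for level, count in enumerate(hist):
        cumulative += count
        if v_min is None and cumulative >= 0.25 * total:
            v_min = level
        if v_max is None and cumulative >= 0.75 * total:
            v_max = level
    return v_min, v_max


def stretch(image, v_min, v_max, out_min=0, out_max=255):
    """ASSUMED stand-in for Equation 4: linear map v_min -> out_min,
    v_max -> out_max, with results clipped to [out_min, out_max]."""
    if v_max == v_min:
        return [row[:] for row in image]  # flat image: nothing to stretch
    scale = (out_max - out_min) / (v_max - v_min)
    result = []
    for row in image:
        new_row = []
        for x in row:
            y = round((x - v_min) * scale + out_min)
            new_row.append(max(out_min, min(out_max, y)))
        result.append(new_row)
    return result


image = [[10, 20, 30, 40], [50, 60, 70, 80]]
v_min, v_max = quartile_levels(gray_histogram(image))
balanced = stretch(image, v_min, v_max)
```

Anchoring the mapping at the quartiles rather than at the absolute minimum and maximum makes the stretch less sensitive to a few very dark or very bright pixels.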

Morphology may be a method for removing noise from an image and emphasizing the character portions. The morphology may include a closing morphology and an opening morphology. The preprocessing unit 800 may perform the closing morphology and then the opening morphology. In the step of performing the closing morphology (S214), an erosion operation may be performed, followed by a dilation operation. In the step of performing the opening morphology (S215), a dilation operation may be performed, followed by an erosion operation. The closing morphology removes noise, and the opening morphology may emphasize the pixels of the character portions.
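The sequence of steps S214 and S215 can be sketched on a binary image as below. This is an illustrative sketch only: the 3x3 square neighbourhood and the handling of image borders (out-of-bounds cells are ignored) are assumptions, and the function names are hypothetical. The operations are applied in the order the text gives them. Note that image processing references commonly define closing as dilation followed by erosion and opening as erosion followed by dilation, so the names in the text are swapped relative to that convention, while the described effects (noise removal first, then reinforcement of character pixels) match the order used here.

```python
def _neighbourhood(image, r, c):
    """Collect the 3x3 neighbourhood of (r, c), ignoring out-of-bounds cells."""
    rows, cols = len(image), len(image[0])
    return [
        image[rr][cc]
        for rr in range(max(0, r - 1), min(rows, r + 2))
        for cc in range(max(0, c - 1), min(cols, c + 2))
    ]


def erode(image):
    """Binary erosion: a pixel stays 1 only if every neighbour is 1."""
    return [[min(_neighbourhood(image, r, c)) for c in range(len(image[0]))]
            for r in range(len(image))]


def dilate(image):
    """Binary dilation: a pixel becomes 1 if any neighbour is 1."""
    return [[max(_neighbourhood(image, r, c)) for c in range(len(image[0]))]
            for r in range(len(image))]


def preprocess_morphology(image):
    """Apply the operations in the order described for S214, then S215."""
    step_s214 = dilate(erode(image))       # erosion, then dilation
    step_s215 = erode(dilate(step_s214))   # dilation, then erosion
    return step_s215


# A 3x3 block of foreground pixels plus one isolated noise pixel at (0, 6).
block = [[1 if 2 <= r <= 4 and 2 <= c <= 4 else 0 for c in range(7)]
         for r in range(7)]
noisy = [row[:] for row in block]
noisy[0][6] = 1
cleaned = preprocess_morphology(noisy)
```

In this example the isolated pixel disappears during the first erosion, while the 3x3 block survives both steps unchanged.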

In the step of calculating the upper and lower positions of the second image (S220), the projection unit 900 may calculate the upper and lower positions of the second image using horizontal projection. Horizontal projection may be a method in which the second image is binarized, a horizontal projection is performed on each row to count its pixels, the pixel counts are then examined starting from the bottom, the portion with the minimum count is set as the lower position, and the next portion with the minimum count is set as the upper position.
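A rough sketch of the horizontal projection in step S220 follows. The text does not fully specify how the two minimum-count portions are chosen, so this sketch makes explicit assumptions: foreground pixels are those at or above a threshold, the lower position is the minimum-count row found when scanning the lower half from the bottom, and the upper position is the minimum-count row found when continuing the scan through the upper half. The threshold value and all names are hypothetical.

```python
def binarize(image, threshold=128):
    """Map gray values to 1 (foreground) or 0 using an assumed threshold."""
    return [[1 if value >= threshold else 0 for value in row] for row in image]


def row_counts(binary):
    """Horizontal projection: number of foreground pixels in each row."""
    return [sum(row) for row in binary]


def upper_lower_positions(binary):
    """Return (upper_row, lower_row) under the assumptions stated above.

    Requires at least two rows. Ties are resolved in scan order
    (bottom to top).
    """
    counts = row_counts(binary)
    mid = len(counts) // 2
    # Scan from the bottom row up to the middle for the lower position.
    lower = min(range(len(counts) - 1, mid - 1, -1), key=lambda r: counts[r])
    # Continue upward from the middle for the upper position.
    upper = min(range(mid - 1, -1, -1), key=lambda r: counts[r])
    return upper, lower


gray = [
    [200, 200, 200, 0],   # 3 foreground pixels
    [0, 0, 0, 0],         # 0
    [200, 200, 200, 200], # 4
    [200, 200, 200, 200], # 4
    [0, 0, 0, 0],         # 0
    [200, 200, 0, 0],     # 2
]
upper, lower = upper_lower_positions(binarize(gray))
```

Here the rows between `upper` and `lower` form the band that contains the characters.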

Although the invention has been described above with reference to embodiments, those of ordinary skill in the art will understand that the invention may be variously modified and changed without departing from the spirit and scope of the invention as set forth in the claims below.

100: Image input unit
200: Second image generating unit
300: Feature component extracting unit
400: Anchor box generating unit
500: Bounding box generating unit
600: Character recognition unit

Claims

1. A license plate number recognition apparatus using anchor boxes and a CNN feature map, comprising:
An image input unit for receiving a first image;
A second image generating unit for generating a second image by detecting a region of interest including characters from the first image;
A feature component extracting unit for extracting a feature component from the second image;
An anchor box generating unit for generating, from the second image, a plurality of anchor boxes having the same vertical size as the second image and horizontal sizes different from one another;
A bounding box generating unit for comparing the feature component of the second image corresponding to the areas of the anchor boxes with character detection learning data and selecting the anchor box having the maximum matching value as the bounding box of each character area; and
A character recognition unit for recognizing the character corresponding to the anchor box by comparing the feature component of the area corresponding to the bounding box with character recognition learning data,
Wherein the anchor boxes include:
End anchor boxes, a in number (a being a natural number), positioned to overlap one another at predetermined intervals over the entire area of the second image and having equal horizontal and vertical sizes;
A base anchor box positioned at one side of each end anchor box and having a horizontal size smaller than its vertical size; and
A plurality of sub-anchor boxes, each including the area of the base anchor box and having a horizontal size that is larger than that of the base anchor box by a predetermined size and smaller than that of the end anchor box.

2. The apparatus of claim 1, wherein the anchor box generating unit generates the anchor boxes by:
Creating a base anchor box located at one side of the second image;
Creating sub-anchor boxes, each including the area of the anchor box created immediately before and having a horizontal size increased by a predetermined size, while the horizontal size remains smaller than that of the end anchor box;
Creating an end anchor box including the area of the anchor box created immediately before;
Creating a new base anchor box whose horizontal position is shifted by a predetermined size from the base anchor box created immediately before;
Creating the sub-anchor boxes and the end anchor box again; and
Repeating the creating of the new base anchor box and the subsequent steps until the end anchor box located at the side of the second image opposite to the one side is created.

3. The apparatus of claim 1, wherein the sizes of the anchor boxes are expressed in pixels,
The horizontal size (number of pixels) of the base anchor box is 1/r times the size (number of pixels) of the end anchor box,
The predetermined size by which the horizontal size of the sub-anchor boxes increases is p pixels, and
The predetermined interval between the start positions of the most adjacent end anchor boxes among the a end anchor boxes is q pixels.

4. The apparatus of claim 3, wherein, when r is 2, the total number of anchor boxes generated from one second image is calculated by the following equation.

Here, N is the vertical size of the second image and M is the horizontal size of the second image.

5. The apparatus of claim 1, further comprising a vehicle detection unit for detecting a vehicle area from the first image using an SSD (Single Shot MultiBox Detector) algorithm,
Wherein the second image generating unit generates the second image by detecting, from the detected vehicle area, the license plate area, which is the region of interest, using the SSD algorithm.

6. The apparatus of claim 1,
Further comprising a preprocessing unit for improving image quality by performing intensity balancing and morphology on the second image,
Wherein the morphology includes a closing morphology and an opening morphology, and
The intensity balancing calculates a histogram of the image gray levels, calculates the pixel values corresponding to the lower quartile and the upper quartile from the histogram, and performs an affine transform using the following equation.

Here, x is the pixel value at the corresponding position, max is the maximum pixel value in the corresponding region, min is the minimum pixel value in the corresponding region, Vmin is the pixel value corresponding to the lower quartile, and Vmax is the pixel value corresponding to the upper quartile.

7. A license plate number recognition method using anchor boxes and a CNN feature map, comprising:
Receiving, by an image input unit, a first image;
Generating a second image by detecting a region of interest including characters from the first image;
Extracting a feature component from the second image;
Generating, by an anchor box generating unit, from the second image a plurality of anchor boxes having the same vertical size as the second image and horizontal sizes different from one another;
Comparing, by a bounding box generating unit, the feature component of the second image corresponding to the areas of the anchor boxes with character detection learning data and selecting the anchor box having the maximum matching value as the bounding box of each character area; and
Recognizing the character corresponding to the anchor box by comparing the feature component of the area corresponding to the bounding box with character recognition learning data,
Wherein the anchor boxes include:
End anchor boxes, a in number (a being a natural number), positioned to overlap one another at predetermined intervals over the entire area of the second image and having equal horizontal and vertical sizes;
A base anchor box positioned at one side of each end anchor box and having a horizontal size smaller than its vertical size; and
A plurality of sub-anchor boxes, each including the area of the base anchor box and having a horizontal size that is larger than that of the base anchor box by a predetermined size and smaller than that of the end anchor box.

8. The method of claim 7, wherein the generating of the anchor boxes comprises:
Creating a base anchor box located at one side of the second image;
Creating sub-anchor boxes, each including the area of the anchor box created immediately before and having a horizontal size increased by a predetermined size, while the horizontal size remains smaller than that of the end anchor box;
Creating an end anchor box including the area of the anchor box created immediately before;
Creating a new base anchor box whose horizontal position is shifted by a predetermined size from the base anchor box created immediately before;
Creating the sub-anchor boxes and the end anchor box again; and
Repeating the creating of the new base anchor box and the subsequent steps until the end anchor box located at the side of the second image opposite to the one side is created.

9. The method of claim 7, wherein the sizes of the anchor boxes are expressed in pixels,
The horizontal size (number of pixels) of the base anchor box is 1/r times the size (number of pixels) of the end anchor box,
The predetermined size by which the horizontal size of the sub-anchor boxes increases is p pixels, and
The predetermined interval between the start positions of the most adjacent end anchor boxes among the a end anchor boxes is q pixels.

10. The method of claim 9, wherein the total number of anchor boxes generated from one second image when r is 2 is calculated by the following equation.

11. The method of claim 7,
Further comprising detecting, by a vehicle detection unit, a vehicle area from the first image using an SSD (Single Shot MultiBox Detector) algorithm,
Wherein, in the generating of the second image, the second image is generated by detecting, from the detected vehicle area, the license plate area, which is the region of interest, using the SSD algorithm.

12. The method of claim 7,
Further comprising improving, by a preprocessing unit, image quality by performing intensity balancing and morphology on the second image,
Wherein the improving of the image quality comprises:
Performing a closing morphology;
Performing an opening morphology following the closing morphology;
Calculating a histogram of the image gray levels;
Calculating the pixel values corresponding to the lower quartile and the upper quartile from the histogram; and
Performing an affine transform using the following equation.
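
The anchor box generation recited in the claims above (base anchor box, sub-anchor boxes widened by p pixels, end anchor box, then a shift by q pixels) can be sketched as follows. This is a hedged illustration, not the claimed implementation: it assumes the base anchor box shares the left edge of its end anchor box, that the shift of each new base anchor box equals q, and that integer division is acceptable for N/r. The function name and the default parameter values are hypothetical.

```python
def generate_anchor_boxes(M, N, r=2, p=4, q=4):
    """Return anchor boxes as (x_start, width) pairs; every box has height N.

    M: horizontal size of the second image in pixels.
    N: vertical size of the second image in pixels (also the end box width).
    r, p, q: parameters as named in the claims; defaults are illustrative only.
    """
    boxes = []
    x = 0
    # Keep creating groups until the end anchor box would pass the far side.
    while x + N <= M:
        width = N // r              # base anchor box: 1/r of the end box size
        while width < N:
            boxes.append((x, width))  # base box first, then sub-anchor boxes
            width += p                # each sub-anchor box is p pixels wider
        boxes.append((x, N))          # end anchor box: width equals height
        x += q                        # next base anchor box is shifted by q
    return boxes


boxes = generate_anchor_boxes(M=40, N=16, r=2, p=4, q=8)
```

With these example values each group holds a base box of width 8, one sub-anchor box of width 12, and an end box of width 16, and groups start at x = 0, 8, 16, and 24.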