KR102234936B1

KR102234936B1 - Apparatus and method for recognizing license plates in real time

Info

Publication number: KR102234936B1
Application number: KR1020190034324A
Authority: KR
Inventors: 황관홍
Original assignee: (주)아이에스인텍
Priority date: 2019-03-26
Filing date: 2019-03-26
Publication date: 2021-04-01
Also published as: KR20200119384A

Abstract

The present invention relates to recognizing vehicle numbers in real time from image data collected by CCTV, through license plate detection, character string extraction, and character string recognition. A vehicle license plate recognition apparatus according to an embodiment may include: a license plate detection unit that sets a license plate detection area in image data generated by a CCTV in consideration of the ratio of the license plate, and crops the license plate at a position predicted by applying convolution to the set detection area; a character string extraction unit that extracts the character string displayed on the cropped license plate in consideration of the ratio, and modifies the network based on the size of the characters included in the character string; and a character recognition unit that recognizes, in real time, each character included in the extracted character string according to a text classifier based on the modified network.

Description

Apparatus and method for recognizing license plates in real time

The present invention relates to technology for recognizing vehicle license plates in real time, and more particularly to recognizing vehicle numbers in real time from image data collected by CCTV through license plate detection, character string extraction, and character string recognition.

ALPR (Automatic License Plate Recognition) is a technology for recognizing vehicle license plates and is used in various fields such as vehicle access control, parking management, and enforcement against illegal vehicles.

In general, license plate recognition can be implemented in three stages: license plate detection, character string extraction, and character string recognition. For license plate detection, methods have been used that detect changes in intensity or geometric structure based on vertical edges, or that detect the plate based on its color or structural features.

To extract the character string from a license plate, techniques based on binarized blob labeling have been used.

To recognize the character string on a license plate, template-based matching and recognition techniques based on machine learning and neural networks have been used.

Nevertheless, these conventional methods have the disadvantage that they work only under tightly constrained conditions.

Specifically, accurate license plate recognition is in most cases possible only in an environment with constraints such as a constant amount of light and a fixed vehicle entry direction.

An environment where security CCTV is installed may have various external variables. In particular, inconsistent lighting, light levels that differ between day, night, and dawn, and weather phenomena such as rain and snow cause many license plate recognition errors.

In addition, geometric distortion arising from the CCTV installation position and the vehicle entry direction lowers the license plate recognition rate.

To solve these problems, technology has recently been developed with the aim of increasing the license plate recognition rate.

Korean Patent Registration No. 10-1949765, "Vehicle location information recognition apparatus and method using license plate recognition"
Korean Patent Laid-Open Publication No. 10-2016-0057356, "Vehicle license plate recognition method and system"

An object of the present invention is to increase the recognition rate of vehicle license plates in security CCTV.

Another object of the present invention is to increase the recognition rate of license plates despite variation in the amount of light and geometric distortion that depends on the CCTV installation position and the vehicle entry direction.

A further object of the present invention is to provide deep learning-based license plate detection and recognition technology whose performance can be improved through training when various variables are present.

A vehicle license plate recognition apparatus according to an embodiment may include: a license plate detection unit that sets a license plate detection area in image data generated by a CCTV in consideration of the ratio of the license plate, and crops the license plate at a position predicted by applying convolution to the set detection area; a character string extraction unit that extracts the character string displayed on the cropped license plate in consideration of the ratio, and modifies the network based on the size of the characters included in the character string; and a character recognition unit that recognizes, in real time, each character included in the extracted character string according to a text classifier based on the modified network.

The license plate detection unit according to an embodiment may set, as a region of interest, the common regions among the regions in which license plate characters are identified in previously learned image data, and may set the license plate detection area within the area masked with the set region of interest.

The license plate detection unit according to an embodiment may apply convolution to the set license plate detection area and determine the predicted position by applying convolution multiple times with a single type of filter while varying the stride at which the filter is applied, without computing the layers of the image data separately.

The license plate detection unit according to an embodiment may train on the generated image data using initialized weights instead of pre-trained weights, training all layers of the image data simultaneously and recalculating the anchor boxes corresponding to the training set.

The character string extraction unit according to an embodiment may train using the cropped license plate, applying initialized weights instead of pre-trained weights to the cropped license plate, and training all layers of the cropped license plate simultaneously to recalculate the anchor boxes corresponding to the training set.

The character string extraction unit according to an embodiment may construct the training set from training data, test data, and positive data in which an object exists in the image, where at least one of the training data, the test data, and the positive data may include an object label, a bounding box center x coordinate, a bounding box center y coordinate, a bounding box width, and a bounding box height.

The character recognition unit according to an embodiment may recognize the characters using previously learned character strings, and may learn the character strings by repeating a process of resizing each character included in previously collected character strings and then recognizing it through convolution.

The character recognition unit according to an embodiment may extract positive data by applying an early termination technique to the previously collected character strings, and may learn the character strings using the extracted positive data.

A vehicle license plate recognition method according to an embodiment may include: setting a license plate detection area in image data generated by a CCTV in consideration of the ratio of the license plate; cropping the license plate at a position predicted by applying convolution to the set license plate detection area; extracting the character string displayed on the cropped license plate in consideration of the ratio; modifying the network based on the size of the characters included in the character string; and recognizing, in real time, each character included in the extracted character string according to a text classifier based on the modified network.

Setting the license plate detection area according to an embodiment may include: setting, as a region of interest, the common regions among the regions in which license plate characters are identified in previously learned image data; and setting the license plate detection area within the area obtained by masking the image data with the set region of interest.

Setting the license plate detection area according to an embodiment may include applying convolution to the set license plate detection area and determining the predicted position by applying convolution multiple times with a single type of filter while varying the stride at which the filter is applied, without computing the layers of the image data separately.

Extracting the character string according to an embodiment may include: training using the cropped license plate while applying initialized weights instead of pre-trained weights to the cropped license plate; and training all layers of the cropped license plate simultaneously to recalculate the anchor boxes corresponding to the training set.

Recognizing the characters in real time according to an embodiment may include recognizing the characters using previously learned character strings, and learning the character strings by repeating a process of resizing each character included in previously collected character strings and then recognizing it through convolution.

Learning the character strings according to an embodiment may include: extracting positive data by applying an early termination technique to the previously collected character strings; and learning the character strings using the extracted positive data.

According to an embodiment, the recognition rate of vehicle license plates in security CCTV can be increased.

According to an embodiment, the recognition rate of license plates can be increased despite variation in the amount of light and geometric distortion that depends on the CCTV installation position and the vehicle entry direction.

According to an embodiment, deep learning-based license plate detection and recognition technology can be provided whose performance can be improved through training when various variables are present.

FIG. 1 is a diagram illustrating a vehicle license plate recognition apparatus according to an embodiment.
FIG. 2 is a diagram illustrating deep learning-based ALPR (Automatic License Plate Recognition) using a vehicle license plate recognition apparatus according to an embodiment.
FIG. 3 is a diagram illustrating an embodiment in which characters are recognized by extracting a region of interest from image data.
FIG. 4 is a diagram illustrating an embodiment in which the input size is set in consideration of the characteristics (ratio) of the license plate when detecting the license plate.
FIGS. 5 to 7 are diagrams illustrating a structure for reducing the amount of convolution computation caused by the increased input image size when detecting the license plate.
FIG. 8 is a diagram illustrating a model for extracting the character string of a vehicle license plate.
FIG. 9 is a diagram illustrating test data for character string extraction training.
FIG. 10 is a diagram illustrating a structure for recognizing character strings without using pre-trained weights.
FIG. 11 is a diagram illustrating a character string recognition data set according to an embodiment.
FIG. 12 is a diagram illustrating a vehicle license plate recognition method according to an embodiment.

The specific structural or functional descriptions of the embodiments according to the concept of the present invention disclosed herein are given only for the purpose of describing those embodiments. Embodiments according to the concept of the present invention may be implemented in various forms and are not limited to the embodiments described herein.

Since embodiments according to the concept of the present invention may be modified in various ways and may take various forms, the embodiments are illustrated in the drawings and described in detail herein. However, this is not intended to limit the embodiments to the specific disclosed forms; they include all changes, equivalents, and substitutes falling within the spirit and technical scope of the present invention.

Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of rights according to the concept of the present invention, a first component may be referred to as a second component, and similarly a second component may be referred to as a first component.

When a component is referred to as being "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that intervening components may be present. In contrast, when a component is referred to as being "directly connected" or "directly coupled" to another component, it should be understood that no intervening components are present. Expressions describing the relationship between components, such as "between" and "directly between" or "directly adjacent to", should be interpreted in the same way.

The terms used herein are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "comprise" or "have" are intended to specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless explicitly defined herein.

Hereinafter, embodiments will be described in detail with reference to the accompanying drawings. However, the scope of the patent application is not restricted or limited by these embodiments. Like reference numerals in the drawings denote like members.

FIG. 1 is a diagram illustrating a vehicle license plate recognition apparatus 100 according to an embodiment.

The vehicle license plate recognition apparatus 100 according to an embodiment may increase the recognition rate of vehicle license plates in security CCTV. It may also increase the recognition rate despite geometric distortion that depends on the CCTV installation position and the vehicle entry direction, and may provide deep learning-based license plate detection and recognition technology whose performance can be improved through training when various variables are present.

To this end, the vehicle license plate recognition apparatus 100 according to an embodiment may include a license plate detection unit 110, a character string extraction unit 120, and a character recognition unit 130.

First, the license plate detection unit 110 according to an embodiment may set a license plate detection area in image data generated by a CCTV in consideration of the ratio of the license plate. It may then crop the license plate at a position predicted by applying convolution to the set license plate detection area.

To this end, the license plate detection unit 110 may set, as a region of interest, the common regions among the regions in which license plate characters are identified in previously learned image data, and may set the license plate detection area within the area masked with the set region of interest.

The license plate detection unit 110 may also determine the predicted position by applying convolution to the set license plate detection area.

In particular, the license plate detection unit 110 may determine the predicted position by applying convolution multiple times with a single type of filter while varying the stride at which the filter is applied, without computing the layers of the image data separately. That is, the license plate detection unit 110 may determine the predicted position by applying a stride of 1 or 2 as appropriate, without applying max pooling to the image.
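The effect of varying the stride instead of using max pooling can be illustrated with a minimal sketch. The function `conv2d`, the 8 x 8 toy image, and the 3 x 3 averaging filter are illustrative assumptions and are not taken from the specification; the sketch only shows that the same filter applied with a stride of 2 halves the spatial resolution without a separate pooling layer.

```python
def conv2d(image, kernel, stride=1):
    """Valid (unpadded) 2-D convolution, in the CNN sense, over a list-of-lists image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            # Slide the same filter; only the step between positions changes.
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# Toy data (hypothetical): an 8x8 ramp image and one 3x3 averaging filter.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]

same_res = conv2d(image, kernel, stride=1)     # feature extraction at full resolution
downsampled = conv2d(image, kernel, stride=2)  # downsampling without max pooling
```

With these toy inputs the stride-1 output is 6 x 6 and the stride-2 output is 3 x 3, and each stride-2 value equals the stride-1 value at the corresponding even position.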

The license plate detection unit 110 according to an embodiment may train on the generated image data using initialized weights instead of pre-trained weights. During training, it may train all layers of the image data simultaneously and recalculate the anchor boxes corresponding to the training set.
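The specification does not state how the anchor boxes are recalculated. One approach commonly used with YOLO-family detectors is k-means clustering of the training-set box sizes with IoU as the similarity measure; the sketch below assumes that approach. The function names, the deterministic initialization, and the sample box sizes are all hypothetical.

```python
def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height), assumed to share the same center."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=50):
    """Cluster (width, height) pairs into k anchors, assigning each box by highest IoU."""
    # Deterministic start (an assumption): pick k boxes spread over the area-sorted list.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / k
    anchors = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda a: iou_wh(box, anchors[a]))
            clusters[best].append(box)
        new_anchors = []
        for idx, members in enumerate(clusters):
            if not members:  # keep the old anchor if a cluster becomes empty
                new_anchors.append(anchors[idx])
                continue
            mean_w = sum(m[0] for m in members) / len(members)
            mean_h = sum(m[1] for m in members) / len(members)
            new_anchors.append((mean_w, mean_h))
        if new_anchors == anchors:  # converged
            break
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Hypothetical normalized plate sizes: three distant (small) and three near (large) plates.
train_boxes = [(0.10, 0.03), (0.11, 0.03), (0.12, 0.04),
               (0.30, 0.08), (0.32, 0.09), (0.31, 0.08)]
anchors = kmeans_anchors(train_boxes, k=2)
```

Because license plates are wide and short, anchors fitted to the training set in this way tend to have a similar wide aspect ratio, unlike generic anchors fitted to a general-purpose data set.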

This training may be performed by a learning processing unit 140 under the control of the license plate detection unit 110.

Next, the character string extraction unit 120 may extract the character string displayed on the cropped license plate in consideration of the ratio, and may modify the network based on the size of the characters included in the character string.

For example, the character string extraction unit 120 may train using cropped license plates. Likewise, the character string extraction unit 120 may control the learning processing unit 140 to learn the process of extracting character strings from license plates.

For training, the character string extraction unit 120 according to an embodiment may apply initialized weights instead of pre-trained weights to the cropped license plate, and may train all layers of the cropped license plate simultaneously so that the anchor boxes corresponding to the training set are recalculated.

The character string extraction unit 120 according to an embodiment may construct the training set from training data, test data, and positive data in which an object exists in the image. In particular, at least one of the training data, the test data, and the positive data constituting the training set may include an object label, a bounding box center x coordinate, a bounding box center y coordinate, a bounding box width, and a bounding box height.
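The five fields listed above match the order of the annotation format widely used with YOLO-style detectors. The sketch below assumes such a record is stored as one whitespace-separated text line with coordinates normalized to the range 0 to 1; that storage format, the `BoxLabel` class, and `parse_label_line` are assumptions for illustration, not details given in the specification.

```python
from dataclasses import dataclass


@dataclass
class BoxLabel:
    label: int   # object label (class index)
    cx: float    # bounding box center x, normalized to [0, 1] (assumption)
    cy: float    # bounding box center y, normalized to [0, 1] (assumption)
    w: float     # bounding box width, normalized
    h: float     # bounding box height, normalized

    def to_corners(self, img_w, img_h):
        """Convert to integer pixel corners (x1, y1, x2, y2) for an image of the given size."""
        x1 = round((self.cx - self.w / 2) * img_w)
        y1 = round((self.cy - self.h / 2) * img_h)
        x2 = round((self.cx + self.w / 2) * img_w)
        y2 = round((self.cy + self.h / 2) * img_h)
        return x1, y1, x2, y2


def parse_label_line(line):
    """Parse 'label cx cy w h' into a BoxLabel."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError("expected 5 fields: label cx cy w h")
    return BoxLabel(int(parts[0]), *(float(p) for p in parts[1:]))


record = parse_label_line("0 0.5 0.5 0.25 0.1")
corners = record.to_corners(1920, 1080)
```

Storing the center and size in normalized form keeps the annotation valid when the image is resized to the network input size.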

Next, the character recognition unit 130 may recognize, in real time, each character included in the extracted character string according to a text classifier based on the modified network.

The character recognition unit 130 according to an embodiment may recognize the characters using previously learned character strings, and may control the learning processing unit 140 to learn the character strings by repeating a process of resizing each character included in previously collected character strings and then recognizing it through convolution.

In particular, the character recognition unit 130 may extract positive data by applying an early termination technique to the previously collected character strings, and may learn the character strings using the extracted positive data.

FIG. 2 is a diagram illustrating a deep learning-based ALPR (Automatic License Plate Recognition) procedure 200 using a vehicle license plate recognition apparatus according to an embodiment.

According to the present invention, deep learning-based license plate detection and recognition is possible whose performance can be improved through training even when various variables are present.

Reference numeral 210 denotes the input image. Because the input is, for example, an image collected from a CCTV, the image resolution may be 1920 x 1080 or 1280 x 720.

The image is mainly received over RTSP, although this may vary.

Because the hardware must run at relatively low performance compared with a desktop, the system must be implemented with algorithms that require little computation.

In particular, the license plate must be detected even when vehicle motion blurs the image and character information on the plate is lost. For example, even when recognition fails, reliable detection alone can be of great help in the field.

To this end, in the present invention a license plate detection area may be set as indicated by reference numeral 220, so that only the region of interest 221 is cropped.

For example, to apply a deep learning model that is robust to various variables to each ALPR module, the license plate detection process 230 may use the open-source YOLOv3-tiny.

The YOLOv3-tiny model offers high detection performance at a relatively low computational cost, and can take new images as input so that it can operate in various CCTV environments.

Based on the features analyzed through convolution, the present invention may extract the license plate region 241 from the region of interest, as indicated by reference numeral 240.

From the extracted license plate region 250, a character string 270 within the vehicle license plate may then be extracted through the process indicated by reference numeral 260.

The extracted character string 270 may be recognized as characters on the vehicle's license plate using deep learning through the process indicated by reference numeral 280. In this process, the characters may be recognized using a CNN.

The recognized characters may be identified as the recognized vehicle number, as indicated by reference numeral 290.
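The sequence of reference numerals 210 to 290 can be summarized as a minimal sketch. The three stage functions are passed in as parameters because the specification does not define a programming interface; `recognize_plate`, `crop`, the box convention (x1, y1, x2, y2), and the left-to-right ordering of characters (which assumes a single-row plate) are all illustrative assumptions.

```python
def crop(image, box):
    """Return the sub-image inside box = (x1, y1, x2, y2), end-exclusive."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def recognize_plate(frame, roi_box, detect_plate, extract_chars, classify_char):
    """Run detection, character string extraction, and character recognition in order.

    detect_plate(roi)      -> plate box inside the ROI, or None if no plate is found
    extract_chars(plate)   -> list of character boxes inside the plate
    classify_char(char_im) -> one recognized character as a string
    """
    roi = crop(frame, roi_box)           # 220/221: restrict to the region of interest
    plate_box = detect_plate(roi)        # 230/240: plate detection
    if plate_box is None:
        return None
    plate = crop(roi, plate_box)         # 250: cropped license plate
    char_boxes = extract_chars(plate)    # 260/270: character string extraction
    # Assumption: single-row plate, so characters are read left to right.
    char_boxes = sorted(char_boxes, key=lambda b: b[0])
    return "".join(classify_char(crop(plate, b)) for b in char_boxes)  # 280/290


# Stub stages for demonstration only; real stages would be trained networks.
frame = [[x for x in range(20)] for _ in range(10)]
result = recognize_plate(
    frame,
    roi_box=(5, 2, 20, 10),
    detect_plate=lambda roi: (2, 1, 8, 3),
    extract_chars=lambda plate: [(3, 0, 6, 2), (0, 0, 3, 2)],
    classify_char=lambda char_im: str(char_im[0][0]),
)
```

Keeping detection separate from recognition reflects the point made above: a plate can still be reported as detected when the later stages fail on a blurred image.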

FIG. 3 is a diagram illustrating an embodiment in which characters are recognized by extracting a region of interest from image data.

For license plate detection, a region of interest (ROI) may be set. That is, a region in which the characters of the license plate can be identified must be set, and the characters of the license plate must be identifiable within the image at the recognition stage.

In general, as indicated by reference numeral 310, problems may arise in the conventional case where regions outside the region of interest are not excluded.

Specifically, the input image size of a deep learning model for object detection is in most cases fixed.

For example, with a fully convolutional structure the input image size may vary, but the output size of the detection layer is then not constant, which can cause problems. Moreover, every image must be resized to the model's input size before being fed in.

또한, 관심영역을 비율의 고민 없이 크랍한 이미지를 입력 이미지로 사용할 경우, 관심영역의 Aspect Ratio, 크기가 다른 경우 구조적 정보의 차이가 크게 발생할 수 있다.In addition, when a cropped image is used as an input image without worrying about the ratio of the region of interest, when the aspect ratio and size of the region of interest are different, a difference in structural information may occur.

따라서, 본 발명에서는 도면부호 320과 같이 관심영역을 마스킹한 이미지(321)를 입력 이미지로 사용할 수 있다. 이로써, 관심영역의 크기, 비율과 상관 없이 항상 일정한 비율로 리사이즈가 가능하다. 뿐만 아니라, 다운 샘플링 및 업 샘플링으로 인한 구조적 정보로서, 비율 및 크기에 대한 왜곡을 최소화할 수 있다.Accordingly, in the present invention, an image 321 in which the region of interest is masked as shown by reference numeral 320 may be used as an input image. Accordingly, it is always possible to resize at a constant ratio regardless of the size and ratio of the region of interest. In addition, as structural information due to down-sampling and up-sampling, distortion with respect to the ratio and size can be minimized.
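The masking idea can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a rectangular region of interest given as a hypothetical (left, top, right, bottom) pixel box and fills everything outside it with zeros, so the frame keeps its original size and aspect ratio.

```python
import numpy as np

def mask_outside_roi(image, roi):
    """Zero out every pixel outside the ROI while keeping the frame size."""
    x0, y0, x1, y1 = roi  # hypothetical (left, top, right, bottom) box in pixels
    masked = np.zeros_like(image)
    masked[y0:y1, x0:x1] = image[y0:y1, x0:x1]
    return masked

# Dummy all-white FHD frame standing in for a CCTV image.
frame = np.full((1080, 1920, 3), 255, dtype=np.uint8)
masked = mask_outside_roi(frame, roi=(400, 500, 1500, 1000))
print(masked.shape)  # the frame size is unchanged by masking
```

Because the masked frame has the same shape as the original, every input can be resized by the same factor, which is the property the paragraph above relies on.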

FIG. 4 is a diagram illustrating an embodiment in which the input size is set in consideration of a characteristic (the ratio) of the license plate when detecting the license plate.

Reference numeral 410 corresponds to an input image with a resolution of 1920x1080. Reference numeral 420 denotes the input obtained by applying masking to the region of interest while keeping the ratio of the input image, and has a resolution of 864x480.

Because the masked input follows the ratio of the input image, distortion relative to the original is small. Conventionally, resizing used a ratio unrelated to that of the input image, which distorted the input and directly lowered the license plate recognition rate. For example, YOLOv3-tiny resizes its input to 416x416, which distorts the license plate and degrades the recognition rate.
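The difference between the two input sizes can be checked with simple arithmetic. The sketch below measures the relative change in aspect ratio when a 1920x1080 frame is resized to each target; the function name and the distortion measure are illustrative choices, not taken from the patent.

```python
def aspect_distortion(src_w, src_h, dst_w, dst_h):
    """Relative change in aspect ratio caused by resizing src to dst."""
    src_ratio = src_w / src_h
    dst_ratio = dst_w / dst_h
    return abs(dst_ratio - src_ratio) / src_ratio

wide = aspect_distortion(1920, 1080, 864, 480)    # 16:9 frame to 864x480
square = aspect_distortion(1920, 1080, 416, 416)  # 16:9 frame to 416x416
print(f"{wide:.4f} {square:.4f}")  # 0.0125 0.4375
```

Under this measure the 864x480 input changes the aspect ratio by about 1%, while the square 416x416 input changes it by about 44%.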

That is, the present invention may process the input after adjusting the input image size and aspect ratio to reflect the typical size (FHD, HD) and aspect ratio (16:9) of CCTV video.

In other words, processing a 1920x1080 image as is may require too much convolution computation and memory, whereas the present invention can minimize the loss of an object's structural information (high-frequency components) caused by down-sampling and up-sampling.

FIGS. 5 to 7 are diagrams illustrating a structure for reducing the amount of convolution computation caused by the increased input image size when detecting a license plate.

First, referring to FIG. 5, reference numeral 510 shows an example of a conventional layer (0 to 3) structure for license plate detection, and reference numeral 520 shows an example of the layer (0 to 3) structure according to the present invention.

As shown at reference numeral 510, the conventional approach applies convolution to the set license plate detection area and computes the layers of the image data separately in consideration of the max pooling scale. With the increased input image size, this requires a considerable amount of convolution computation.

In contrast, as shown at reference numeral 520, the license plate detection unit 110 according to the present invention applies convolution to the set license plate detection area without computing the layers of the image data separately. That is, it determines the predicted position by applying convolution multiple times with one type of filter while varying the stride at which the filter is applied. For example, the present invention performs convolution with filter 16 and filter 32 while varying the stride between 1 and 2.

The amount of computation in this process is shown in FIG. 6.

First, reference numeral 610 shows the case where the layers of the image data are computed separately; because max pooling is performed as a separate step, the amount of computation is not reduced.

In contrast, reference numeral 620 shows the case where the layers of the image data are not computed separately and the stride at which the filter is applied is varied instead, so the amount of computation is markedly reduced.
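The effect of moving the down-sampling into the convolution itself can be estimated by counting multiply-accumulate operations (MACs) for a single layer. The sketch below assumes a 3x3 kernel, "same" padding, and a 3-channel 864x480 input; only the filter count of 16 and the strides 1 and 2 come from the text, and the exact layer configuration of FIG. 5 is not reproduced here.

```python
def conv_macs(in_w, in_h, in_ch, out_ch, kernel=3, stride=1):
    """Multiply-accumulate count of one 'same'-padded convolution layer."""
    out_w = -(-in_w // stride)  # ceiling division gives the output width
    out_h = -(-in_h // stride)
    return out_w * out_h * out_ch * in_ch * kernel * kernel

# Stride-1 convolution whose output would be halved afterwards by max pooling.
pooled = conv_macs(864, 480, in_ch=3, out_ch=16, stride=1)
# Stride-2 convolution that produces the halved output directly.
strided = conv_macs(864, 480, in_ch=3, out_ch=16, stride=2)
print(pooled, strided, pooled / strided)  # 179159040 44789760 4.0
```

Both variants hand a 432x240x16 feature map to the next layer, but under these assumptions the strided layer needs a quarter of the MACs because it never computes the full-resolution output that pooling would discard.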

FIG. 7 shows the structure 700 of a network for learning according to the present invention.

For example, reference numeral 710 denotes the features extracted from each input by the feature extractor, and reference numeral 720 denotes the up-sampled results produced by the bounding box and class regressor.

That is, the structure 700 of the LPD Net (License Plate Detection Net), which reflects the characteristics of license plates appearing in CCTV video, makes it possible to implement a model that takes into account the size of the license plates detected in CCTV video.

Previously proposed deep learning models for object detection were designed around detection performance on datasets containing small, medium, and large objects. However, the width and height of most license plates detected in security CCTV video (FHD, 1920x1080) are 10% or less of the width and height of the full frame, which is very small relative to the overall image. Detecting license plates with the conventional approach is therefore inefficient for the computation it requires and is also less accurate.

A model suited to detecting small objects was needed, and the structure according to the present invention makes such a model possible.

Learning requires images masked with the region of interest. Training data and test data are provided, and the data format may include <object label> <bounding box center x coordinate> <bounding box center y coordinate> <bounding box width> <bounding box height>. The x and y coordinates and the width and height are normalized by the image width and height.
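A record in this format can be converted back to pixel coordinates as sketched below. The sketch assumes the five fields are whitespace-separated on one line and that the object label is an integer class index; neither detail is stated in the text.

```python
def parse_label(line, img_w, img_h):
    """Convert one normalized label line to a pixel-space corner box."""
    label, cx, cy, w, h = line.split()
    # Undo the normalization by the image width and height.
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    left, top = cx - w / 2, cy - h / 2
    return int(label), (round(left), round(top), round(left + w), round(top + h))

# Hypothetical record: class 0, centered box covering 10% x 5% of an 864x480 input.
parsed = parse_label("0 0.5 0.5 0.1 0.05", img_w=864, img_h=480)
print(parsed)
```

Keeping the stored values normalized means the same label file stays valid when the image is resized, for example from 1920x1080 to 864x480.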

As for the learning method, training can start from initialized values without using pre-trained weights. In addition, all layers (feature extractor + object detector) can be trained simultaneously, and the model parameters can be changed by recalculating the anchor boxes to fit the training set.
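The text does not say how the anchor boxes are recalculated. One common approach in YOLO-style detectors is k-means clustering of the training boxes' widths and heights with 1 - IoU as the distance, and the sketch below assumes that approach. The sample boxes and the deterministic initialization are illustrative only.

```python
def iou_wh(box, anchor):
    """IoU of two boxes that share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    return inter / (box[0] * box[1] + anchor[0] * anchor[1] - inter)

def kmeans_anchors(boxes, k, iters=50):
    """Cluster (width, height) pairs, assigning each box to its highest-IoU anchor."""
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    # Deterministic start: evenly spaced picks from the area-sorted boxes.
    anchors = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            groups[best].append(box)
        # Move each anchor to the mean of its group; keep it if the group is empty.
        updated = [
            (sum(b[0] for b in g) / len(g), sum(b[1] for b in g) / len(g)) if g else a
            for g, a in zip(groups, anchors)
        ]
        if updated == anchors:
            break
        anchors = updated
    return sorted(anchors, key=lambda a: a[0] * a[1])

# Hypothetical plate sizes in pixels: one group of far plates, one of near plates.
boxes = [(40, 10), (44, 11), (42, 10), (90, 22), (96, 24), (88, 21)]
anchors = kmeans_anchors(boxes, k=2)
print(anchors)
```

Anchors fitted this way follow the wide, small shapes of license plates instead of the general-purpose object shapes that default anchors are tuned for.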

FIG. 8 shows an embodiment 800 illustrating a model for extracting the character string of a vehicle license plate.

As shown in FIG. 8, the present invention proposes a model for extracting the character string of a vehicle license plate.

In particular, the input image 810 is masked at the same ratio and then resized, and the character string may be extracted by applying the open-source YOLO-T 830 to the resized image 820.

FIG. 9 shows an embodiment 900 illustrating test data for learning character string extraction.

As shown in FIG. 9, the extracted character string may be learned with 1,000 training images, 167 test images, and positive data in which an object is present in the image. The data format of the extracted character string may likewise include <object label> <bounding box center x coordinate> <bounding box center y coordinate> <bounding box width> <bounding box height>, and the x and y coordinates and the width and height may be normalized by the image width and height.

For character string extraction, the present invention may train from initialized values without using pre-trained weights. In addition, all layers (feature extractor + object detector) can be trained simultaneously, and anchor boxes recalculated to fit the training set can be used for training.

FIG. 10 shows an embodiment 1000 illustrating a structure that recognizes a character string without using pre-trained weights.

The input image 1010 may be converted to a size of 64x64x3 through the resizing process 1020. The resized input image may then be processed at reference numeral 1030 by a structure similar to VGG19 and recognized as a character string.
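The resizing step can be sketched with nearest-neighbor sampling. The interpolation method is an assumption, since the text only gives the 64x64x3 target size, and the 40x28 character crop is a made-up example.

```python
import numpy as np

def resize_nearest(image, size=(64, 64)):
    """Resize an H x W x C image to size (height, width) by nearest-neighbor sampling."""
    out_h, out_w = size
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return image[rows[:, None], cols]

# Hypothetical 40x28 character crop with three color channels.
char_crop = np.arange(40 * 28 * 3, dtype=np.uint8).reshape(40, 28, 3)
resized = resize_nearest(char_crop)
print(resized.shape)  # (64, 64, 3)
```

A fixed input size lets every character crop, whatever its original dimensions, pass through the same classifier network.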

FIG. 11 shows an embodiment 1100 illustrating a character string recognition dataset according to an embodiment.

According to the embodiment 1100, a dataset of recognized character strings may be used for training.

For example, the training data may consist of 1,000 images per digit and 50 images per letter, and the training data may be reused as the test data. An early termination technique may be used in this process. Because the data must be recognizable as characters, most of it may be positive data.
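The text names early termination without describing it. The sketch below reads it as conventional early stopping, where training halts once a monitored loss stops improving. That reading, the `patience` and `min_delta` parameters, and the loss values are all assumptions.

```python
def early_termination_epoch(losses, patience=3, min_delta=1e-3):
    """Return the index of the epoch at which training would stop."""
    best, waited = float("inf"), 0
    for epoch, loss in enumerate(losses):
        if loss < best - min_delta:
            best, waited = loss, 0  # meaningful improvement: reset the counter
        else:
            waited += 1
            if waited >= patience:
                return epoch        # no improvement for `patience` epochs
    return len(losses) - 1          # ran through every epoch

# Hypothetical per-epoch losses; improvement stalls after epoch 3.
history = [0.90, 0.60, 0.45, 0.44, 0.44, 0.45, 0.46, 0.40]
stop_epoch = early_termination_epoch(history)
print(stop_epoch)  # 6
```

Since the embodiment reuses the training data as test data, a stopping rule of this kind limits training time but does not by itself guard against overfitting.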

According to an embodiment, the present invention may randomly change the angle, size, and other properties of an input image before using it as input for training. This helps maintain the license plate recognition rate under a variety of variables.
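Random changes of angle and size can be expressed as a single affine transform about the image center. The sketch below only samples the transform matrix and applies it to a point; the ranges of 10 degrees and 0.8 to 1.2 are hypothetical, and warping the pixels themselves would be left to an image library.

```python
import math
import random

def random_affine(width, height, max_deg=10.0, scale_range=(0.8, 1.2), rng=random):
    """Sample a 2x3 affine matrix that rotates and scales about the image center."""
    angle = math.radians(rng.uniform(-max_deg, max_deg))
    scale = rng.uniform(*scale_range)
    a, b = scale * math.cos(angle), scale * math.sin(angle)
    cx, cy = width / 2, height / 2
    # The translation terms keep the image center fixed.
    return [[a, b, (1 - a) * cx - b * cy],
            [-b, a, b * cx + (1 - a) * cy]]

def apply_affine(matrix, x, y):
    """Map one (x, y) point through a 2x3 affine matrix."""
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])

matrix = random_affine(864, 480, rng=random.Random(0))
center = apply_affine(matrix, 432, 240)
print(center)  # the center stays at (432, 240) up to rounding error
```

The same matrix can be applied to the bounding box corners so that the labels stay aligned with the transformed image.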

Moreover, the present invention may provide training data (or training results) from area A to area B. For example, the appearance of license plates captured in area A is very likely to be uniform, because it is determined by the position and angle of the installed CCTV camera and by the lanes. In area B, the position and angle of the camera and the relative position of the lanes differ from those in area A, so the captured license plates are likely to look different from those in area A. Although a license plate is unlikely to be captured in area A the way it would be in area B, this can happen in exceptional cases. Area A would then be likely to fail to recognize the license plate. With this case in mind, the present invention collects and shares the results learned in different areas, which can help improve the recognition rate.

FIG. 12 is a diagram illustrating a vehicle license plate recognition method according to an embodiment.

The vehicle license plate recognition method according to an embodiment may set a license plate detection area from image data generated through CCTV, in consideration of the ratio of the license plate (step 1201).

For example, to set the license plate detection area, the regions common to the areas in which license plate characters were identified in previously learned image data may be set as the region of interest. The license plate detection area may then be set within the area obtained by masking the image data with the set region of interest.

In addition, convolution may be applied to the set license plate detection area. To this end, the predicted position may be determined by applying convolution multiple times with one type of filter while varying the stride at which the filter is applied, without computing the layers of the image data separately.

Next, the vehicle license plate recognition method according to an embodiment may crop the license plate at the position predicted by applying convolution to the set license plate detection area (step 1202).

The method may extract the character string displayed on the cropped license plate in consideration of the ratio of the license plate (step 1203), and may modify the network based on the size of the characters included in the character string (step 1204).

For example, training for character string extraction uses the cropped license plate, and initialized weights may be applied to the cropped license plate instead of pre-trained weights. In addition, all layers for the cropped license plate may be trained simultaneously, and the anchor boxes corresponding to the training set may be recalculated.

Each character included in the extracted character string may then be recognized in real time by a text classifier based on the modified network (step 1205).
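The inference-time order of steps 1201 to 1205 can be summarized as a small pipeline. This is only a structural sketch: the three stage functions are hypothetical callables supplied by the caller, and the string-based stand-ins exist solely to make the example runnable. Step 1204 (modifying the network) belongs to training and is not shown.

```python
def recognize_plate(frame, detect, extract, classify):
    """Run detection, character string extraction, and recognition in order."""
    plate = detect(frame)            # steps 1201-1202: locate and crop the plate
    if plate is None:
        return None                  # no plate found in this frame
    characters = extract(plate)      # step 1203: split the plate into characters
    return "".join(classify(c) for c in characters)  # step 1205: recognize each one

# Toy stand-ins; real stages would wrap the trained networks.
result = recognize_plate(
    frame="..12a3456..",
    detect=lambda f: f.strip(".") or None,
    extract=list,
    classify=str.upper,
)
print(result)  # 12A3456
```

Passing the stages in as arguments keeps the detector, extractor, and classifier independently replaceable, which matches the way the description trains each network separately.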

For example, to recognize characters in real time, the characters may be recognized using previously learned character strings. The character strings may be learned by repeating, for each character included in previously collected character strings, a process of resizing, convolving, and recognizing the character.

For example, to learn the character strings, positive data may be extracted by applying an early termination technique to the previously collected character strings, and the character strings may be learned using the extracted positive data.

As a result, the present invention can increase the recognition rate of vehicle license plates in security CCTV. In addition, according to an embodiment, the recognition rate can be increased even under geometric distortion caused by the amount of light, the CCTV installation location, or the vehicle's direction of entry. A deep-learning-based license plate detection and recognition technology can thus be provided whose performance improves through learning when various variables are present.

The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications executed on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, the description sometimes refers to a single processing device, but a person of ordinary skill in the art will appreciate that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as a parallel processor, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to the embodiment may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine code produced by a compiler as well as high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiment, and vice versa.

Although the embodiments have been described with reference to a limited set of drawings, a person of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in a different order from the described method, and/or the components of the described system, structure, device, circuit, and the like are coupled or combined in a different form from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims set forth below.

Claims

A vehicle license plate recognition apparatus comprising:
a license plate detection unit configured to set a license plate detection area from image data generated through CCTV in consideration of the ratio of the license plate, and to crop the license plate at a position predicted by applying convolution to the set license plate detection area;
a character string extraction unit configured to extract the character string displayed on the cropped license plate in consideration of the ratio, and to modify a network for learning based on the size of the characters included in the character string; and
a character recognition unit configured to recognize, in real time and based on the modified network, each character included in the character string extracted by the character string extraction unit,
wherein the license plate detection unit applies convolution to the set license plate detection area without computing the layers of the image data separately and without applying max pooling to the image data, and determines the predicted position by applying convolution a plurality of times using filter 16 and filter 32, which are one type of filter, while varying the stride at which the filter is applied between 1 and 2.

The apparatus of claim 1,
wherein the license plate detection unit
sets, as a region of interest, the regions common to the areas in which license plate characters were identified in previously learned image data, and sets the license plate detection area within the area masked with the set region of interest.

delete

The apparatus of claim 1,
wherein the license plate detection unit
learns the generated image data by applying initialized weights instead of pre-trained weights, and learns by training all layers for the image data simultaneously and recalculating the anchor boxes corresponding to the training set.

The apparatus of claim 1,
wherein the character string extraction unit
performs learning using the cropped license plate,
applies initialized weights to the cropped license plate instead of pre-trained weights, and recalculates the anchor boxes corresponding to a training set by simultaneously training a feature extractor and an object detector, which are all layers for the cropped license plate.

The apparatus of claim 5,
wherein the character string extraction unit
constructs the training set using training data, test data, and positive data in which an object is present in an image, and at least one of the training data, the test data, and the positive data includes an object label, a bounding box center point x coordinate, a bounding box center point y coordinate, a bounding box width, and a bounding box height.

The apparatus of claim 1,
wherein the character recognition unit
recognizes the characters using previously learned character strings, and
learns the character strings by repeating, for each character included in previously collected character strings, a process of resizing, convolving, and recognizing the character.

The apparatus of claim 7,
wherein the character recognition unit
extracts positive data by applying an early termination technique to the previously collected character strings, and learns the character strings using the extracted positive data.

A vehicle license plate recognition method comprising:
setting a license plate detection area from image data generated through CCTV in consideration of the ratio of the license plate;
cropping the license plate at a position predicted by applying convolution to the set license plate detection area;
extracting the character string displayed on the cropped license plate in consideration of the ratio;
modifying a network for learning based on the size of the characters included in the character string; and
recognizing, in real time and based on the modified network, each character included in the character string extracted by the character string extraction unit,
wherein the step of setting the license plate detection area
applies convolution to the set license plate detection area without computing the layers of the image data separately and without applying max pooling to the image data, and determines the predicted position by applying convolution a plurality of times using filter 16 and filter 32, which are one type of filter, while varying the stride at which the filter is applied between 1 and 2.

The method of claim 9,
wherein the step of setting the license plate detection area comprises:
setting, as a region of interest, the regions common to the areas in which license plate characters were identified in previously learned image data; and
setting the license plate detection area within the area in which the image data is masked with the set region of interest.

delete

The method of claim 9,
wherein the step of extracting the character string comprises:
performing learning using the cropped license plate while applying initialized weights to the cropped license plate instead of pre-trained weights; and
recalculating the anchor boxes corresponding to a training set by simultaneously training a feature extractor and an object detector, which are all layers for the cropped license plate.

The method of claim 9,
wherein the step of recognizing the characters in real time comprises:
recognizing the characters using previously learned character strings, and learning the character strings by repeating a process of resizing, convolving, and recognizing each character included in previously collected character strings.

The method of claim 13,
wherein the step of learning the character strings comprises:
extracting positive data by applying an early termination technique to the previously collected character strings; and
learning the character strings using the extracted positive data.