KR102653755B1

KR102653755B1 - System and method for collecting field image data sets for learning artificial intelligence image deep learning models

Info

Publication number: KR102653755B1
Application number: KR1020230092918A
Authority: KR
Inventors: 권문수; 김선중; 김정대
Original assignee: 메타빌드 주식회사
Priority date: 2023-07-18
Filing date: 2023-07-18
Publication date: 2024-04-03

Abstract

The present invention builds a field image data set from field images captured by collection image sensors installed in the field, so that an image deep learning fine-tuning model additionally trained on the field image data set can be generated to suit the domestic road environment, which can further improve the accuracy and reliability of object detection. In addition, because the field image data set is built by selecting images according to object shape, object size, presence or absence of the field background, capture time period, degree of overlap between objects, and degree of overlap between an object and the background, the reliability of the field image data set can be improved.

Description

System and method for collecting field image data sets for learning artificial intelligence image deep learning models

The present invention relates to a system and method for collecting field image data sets for learning artificial intelligence image deep learning models, and more specifically to a system and method that filter field images captured at domestic road sites according to preset selection criteria and build them into a field image data set.

In general, images from cameras installed on roads are used to identify traffic conditions and unexpected incidents on the road. However, because an administrator must check the camera images visually and then inform drivers, the process takes a long time.

Accordingly, there has recently been growing interest in technology that acquires images from cameras installed on roads and detects and tracks objects with deep learning models. However, when a pre-trained deep learning model is used, accuracy drops depending on the road environment.

Korean Patent No. 10-2453627

An object of the present invention is to provide a system and method for collecting field image data sets for learning artificial intelligence image deep learning models that can collect field image data sets reflecting the characteristics of the field.

A system for collecting field image data sets for learning artificial intelligence image deep learning models according to the present invention includes: image sensors installed at a plurality of preset sites to capture field images; and a server that filters the field images according to preset selection criteria, gathers the images that satisfy the selection criteria into a selected image data set, gathers at least some of the field images that were not selected into the selected image data set into a noise image data set, and mixes the selected image data set and the noise image data set at a preset ratio to build a field image data set.

The server includes a data set construction unit that compares the object shape, object size, presence or absence of the field background, capture time period, degree of overlap between objects, and degree of overlap between an object and the background in the field images against selection criteria set differently for each, and selects images into the selected image data set.

The server includes a model learning unit that learns a pre-stored general-purpose image data set to generate an initial image deep learning model that derives the type of object in an image, and additionally trains the initial image deep learning model on the field image data set to generate an image deep learning fine-tuning model that derives the type of object in an image while reflecting the characteristics of the field.

A method of collecting field image data sets for learning artificial intelligence image deep learning models according to the present invention includes: a field image collection step in which a server collects field images captured by image sensors installed at a plurality of preset sites; an image selection step of filtering the field images according to preset selection criteria and gathering the images that satisfy the selection criteria into a selected image data set; a data set mixing step in which the server mixes, at a preset ratio, the selected image data set and a noise image data set gathered from at least some of the images that were not selected into the selected image data set in the image selection step; and a data set storage step in which the server builds the data set mixed in the data set mixing step into a field image data set.

The image selection step compares the object shape, object size, presence or absence of the field background, capture time period, degree of overlap between objects, and degree of overlap between an object and the background in the field images against selection criteria set differently for each, and selects images into the selected image data set.

The image selection step includes a process of comparing the proportion of an object's shape that is visible in the field image with a preset set ratio and adopting the object if the visible proportion is greater than or equal to the set ratio.

The image selection step includes a process of excluding the image in which the object appears if the visible proportion of the object's shape is less than the set ratio and the size of the visible part of the object is greater than or equal to a preset set size, and excluding only that object from the image in which it appears if the visible proportion is less than the set ratio and the size of the visible part is less than the preset set size.

The image selection step includes a process of comparing the size of an object in the field image with a preset set size and adopting the object if its size is greater than or equal to the set size.

The image selection step includes a process of comparing the background in the field image with a set background preset for each site and adopting images that include the set background.

The image selection step includes a process of comparing the capture time period of the field image with a preset set time period and adopting the image if the capture time period falls within the set time period.

The image selection step includes a process of comparing the degree of overlap between objects in the field image with a preset overlap criterion, adopting the image in which the objects appear if the degree of overlap satisfies the overlap criterion, and excluding that image if the degree of overlap does not satisfy the overlap criterion.

The image selection step includes a process of comparing the degree of overlap between an object and the background in the field image with a preset set overlap ratio, adopting the image in which the object appears if the degree of overlap is less than the set overlap ratio, and excluding that image if the degree of overlap is greater than or equal to the set overlap ratio.

A method of collecting field image data sets for learning artificial intelligence image deep learning models according to another aspect of the present invention includes: a field image collection step in which a server collects field images captured by image sensors installed at a plurality of preset sites; an image selection step of filtering the field images according to preset selection criteria and gathering the images that satisfy the selection criteria into a selected image data set; a data set mixing step in which the server mixes, at a preset ratio, the selected image data set and a noise image data set gathered from at least some of the images that were not selected into the selected image data set in the image selection step; and a data set storage step in which the server builds the data set mixed in the data set mixing step into a field image data set. The image selection step includes: a process of comparing the proportion of an object's shape that is visible in the field image with a preset set ratio, adopting the object if the visible proportion is greater than or equal to the set ratio, excluding the image in which the object appears if the visible proportion is less than the set ratio and the size of the visible part of the object is greater than or equal to a preset set size, and excluding only that object from the image if the visible proportion is less than the set ratio and the size of the visible part is less than the preset set size; a process of comparing the size of an object in the field image with a preset set size and adopting the object if its size is greater than or equal to the set size; a process of comparing the background in the field image with a set background preset for each site and adopting images that include the set background; a process of comparing the capture time period of the field image with a preset set time period and adopting the image if the capture time period falls within the set time period; a process of comparing the degree of overlap between objects in the field image with a preset overlap criterion, adopting the image in which the objects appear if the degree of overlap satisfies the overlap criterion, and excluding that image if it does not; and a process of comparing the degree of overlap between an object and the background in the field image with a preset set overlap ratio, adopting the image in which the object appears if the degree of overlap is less than the set overlap ratio, and excluding that image if the degree of overlap is greater than or equal to the set overlap ratio.

The present invention builds a field image data set from field images captured by collection image sensors installed in the field, so that an image deep learning fine-tuning model additionally trained on the field image data set can be generated to suit the domestic road environment, which can further improve the accuracy and reliability of object detection.

In addition, because the field image data set is built by selecting images according to object shape, object size, presence or absence of the field background, capture time period, degree of overlap between objects, and degree of overlap between an object and the background, the reliability of the field image data set can be improved.

Figure 1 is a diagram schematically showing the configuration of a system for collecting field image data sets for learning an artificial intelligence image deep learning model according to an embodiment of the present invention.
Figure 2 is a flowchart showing the steps of learning an artificial intelligence image deep learning model according to an embodiment of the present invention.
Figure 3 is a flowchart showing a method of collecting field image data sets for learning an image deep learning model according to an embodiment of the present invention.
Figure 4 shows an example of selection according to object shape when collecting a field image data set according to an embodiment of the present invention.
Figure 5 shows an example of selection according to object size when collecting a field image data set according to an embodiment of the present invention.
Figure 6 shows an example of selection according to the degree of overlap between objects when collecting a field image data set according to an embodiment of the present invention.
Figure 7 shows an example of selection according to the degree of overlap between an object and the background when collecting a field image data set according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described with reference to the attached drawings.

Figure 1 is a diagram schematically showing the configuration of a system for collecting field image data sets for learning an artificial intelligence image deep learning model according to an embodiment of the present invention.

Referring to FIG. 1, a system for collecting field image data sets for learning an artificial intelligence image deep learning model according to an embodiment of the present invention includes a collection image sensor 12 and a server 30.

The collection image sensor 12 is an image sensor that is installed at a site preset by an administrator or the like and captures field images. A plurality of collection image sensors 12 are installed at a plurality of sites. As an example, the collection image sensor 12 is described as a CCTV camera installed on or near a road. In this embodiment, the collection image sensor 12 is described as being installed separately from a detection image sensor (not shown) that captures detection images to be input to the image deep learning fine-tuning model described later. However, the invention is not limited to this: some of the detection image sensors (not shown) may be used as collection image sensors 12, and it is of course also possible for the detection image sensors (not shown) and the collection image sensors 12 to be one and the same.

The server 30 is a learning server that includes a data set construction unit 31 and a model learning unit 32.

The data set construction unit 31 selects among the field images collected from the collection image sensors 12 and builds a field image data set. In this embodiment, the data set construction unit 31 is described as being included in the server 30, but it is not limited to this and may of course be configured separately from the server 30.

The field image data set is image data about the road environment, road conditions, and traveling vehicles at the sites where the collection image sensors 12 are installed.

The field image data set is a data set that mixes, at a preset ratio, selected images and noise images. The selected images are those field images collected through the collection image sensors 12 that satisfy the selection criteria when the object size, background, capture time period, degree of overlap between objects, and degree of overlap between an object and the background are each compared with preset selection criteria. The noise images are taken from the collected field images that remain after the selected images are removed. As an example, the noise images are described as a sample of some of those remaining images, and the mixing ratio of selected images to noise images is described as 1:1. The method of selecting the field images is described in detail later.
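The mixing described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_field_dataset`, the `noise_per_selected` parameter, and the use of uniform random sampling for the noise images are all assumptions. The source only states that some rejected images are sampled and that a 1:1 ratio is one example.

```python
import random


def build_field_dataset(selected, rejected, noise_per_selected=1.0, seed=0):
    """Mix selected images with a sample of rejected (noise) images.

    noise_per_selected=1.0 corresponds to the 1:1 example ratio in the text.
    Uniform random sampling is an assumption; the source does not say how
    the noise images are sampled.
    """
    rng = random.Random(seed)
    # Number of noise images needed for the requested ratio,
    # capped by how many rejected images are actually available.
    wanted = round(len(selected) * noise_per_selected)
    noise = rng.sample(rejected, min(wanted, len(rejected)))
    # Tag each image with its origin so the mix can be audited later.
    dataset = [(img, "selected") for img in selected]
    dataset += [(img, "noise") for img in noise]
    rng.shuffle(dataset)
    return dataset


if __name__ == "__main__":
    mixed = build_field_dataset(["a.jpg", "b.jpg"], ["x.jpg", "y.jpg", "z.jpg"])
    print(mixed)
```

Keeping the origin tag next to each image is a design choice for this sketch only; it makes it easy to check that the realized ratio matches the preset one.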

The model learning unit 32 learns a general-purpose image data set pre-stored in the server 30 to generate an initial image deep learning model that derives the type of object in an image.

Here, the general-purpose image data set is image data provided for general use by overseas external organizations and the like, and it is stored in advance in the server 30 or in a separate database (not shown).

In addition, after generating the initial image deep learning model, the model learning unit 32 additionally trains the initial image deep learning model on the field image data set to generate an image deep learning fine-tuning model.

Because the additional training on the field image data set reflects the characteristics of the sites where the collection image sensors 12 are installed, the type of object in an image can be derived more accurately.

A method of collecting field image data sets for learning an artificial intelligence image deep learning model according to an embodiment of the present invention, configured as described above, is described below.

Figure 2 is a flowchart showing the steps of learning an artificial intelligence image deep learning model according to an embodiment of the present invention.

Referring to FIG. 2, the artificial intelligence image deep learning model learning process according to an embodiment of the present invention includes an initial model learning step (S21), a field image data set generation step (S22), a fine-tuning model learning step (S23), and a verification step (S24).

The initial model learning step (S21) is a step in which the model learning unit 32 of the server 30 learns the pre-stored general-purpose image data set to generate an initial image deep learning model for deriving the type of object in an image.

The field image data set generation step (S22) is a step of collecting and generating the field image data set. In this embodiment, the field image data set generation step (S22) is described as following the initial model learning step (S21), but it is not limited to this and may of course be performed before the initial model learning step (S21).

Figure 3 is a flowchart showing a method of collecting field image data sets for learning an image deep learning model according to an embodiment of the present invention.

Referring to FIG. 3, the field image data set generation step (S22) includes a field image collection step (S22-1), an image selection step (S22-2), a data set mixing step (S22-3), and a field image data set storage step (S22-4).

The field image collection step (S22-1) is a process in which the server 30 collects field images captured by the collection image sensors 12 that a user or administrator has installed at a plurality of sites.

The image selection step (S22-2) is a process of comparing the collected field images with preset selection criteria and creating a selected image data set from the images that satisfy the selection criteria.

In the image selection step (S22-2), images are selected by checking, for each field image, the shape and size of the objects in it, the presence or absence of the field background, the capture time period, the degree of overlap between objects, and the degree of overlap between an object and the background. A separate selection criterion is preset for each of these: object shape, object size, presence or absence of the field background, capture time period, degree of overlap between objects, and degree of overlap between an object and the background.

The image selection process based on object shape compares the proportion of the object's shape that is visible in the field image with a preset set ratio, and adopts objects whose visible proportion is greater than or equal to the set ratio.

If the visible proportion of the object's shape is less than the set ratio, that is, if only part of the object's shape is visible, the size of the object is compared with a preset first set size.

That is, if only part of the object's shape is visible in the field image and the size of the object is greater than or equal to the first set size, the partially visible object is judged to be too large, the image is treated as a noise image, and the image in which the object appears is excluded.

On the other hand, if only part of the object's shape is visible in the field image and the size of the object is less than the first set size, the partially visible object is small, so only that object is excluded from the image.
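The three outcomes of the shape-based rule above can be sketched as a small decision function. The names `Decision` and `decide_by_shape` and the default thresholds (`set_ratio=0.8`, `first_set_size=10000`) are hypothetical placeholders: the source gives no numeric values and does not say how the visible proportion or the object size is measured.

```python
from enum import Enum


class Decision(Enum):
    ADOPT_OBJECT = "adopt_object"      # object is visible enough to keep
    EXCLUDE_IMAGE = "exclude_image"    # large partial object: drop the whole image
    EXCLUDE_OBJECT = "exclude_object"  # small partial object: drop only this object


def decide_by_shape(visible_ratio, object_size, set_ratio=0.8, first_set_size=10000):
    """Apply the shape rule to one object.

    visible_ratio: fraction of the object's shape visible in the image (0..1).
    object_size: size of the visible part, in whatever unit the thresholds use.
    The default thresholds are illustrative assumptions, not values from the source.
    """
    if visible_ratio >= set_ratio:
        return Decision.ADOPT_OBJECT
    # From here on, only part of the object is visible.
    if object_size >= first_set_size:
        return Decision.EXCLUDE_IMAGE
    return Decision.EXCLUDE_OBJECT


if __name__ == "__main__":
    print(decide_by_shape(0.95, 5000))   # mostly visible object
    print(decide_by_shape(0.30, 50000))  # large, partially visible object
    print(decide_by_shape(0.30, 500))    # small, partially visible object
```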

Figure 4 shows an example of selection according to object shape when collecting a field image data set according to an embodiment of the present invention.

Referring to FIG. 4, only part of the object, a car, was captured, and the object is very large within the image, so the image can be excluded and classified as a noise image.

In addition, the image selection process based on object size compares the size of the object with a preset second set size, adopts objects that are greater than or equal to the second set size, and excludes objects that are smaller than the second set size. If the object is smaller than the second set size, it is too small for features to be extracted, so the object is excluded from the image.

Here, the second set size is preset to a size at which an object can be recognized and its type distinguished. As an example, the second set size is described as being set differently from the first set size, but this is not limiting, and the first set size and the second set size may also be the same.
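A minimal sketch of the size-based filter follows. It assumes that objects are given as axis-aligned bounding boxes `(x1, y1, x2, y2)` in pixels and that "size" means box area; the source does not define the size measure, and the function name and the `32 * 32` default threshold are placeholders.

```python
def filter_objects_by_size(boxes, second_set_size=32 * 32):
    """Keep only the objects whose bounding-box area reaches the threshold.

    boxes: list of (x1, y1, x2, y2) tuples in pixel coordinates.
    second_set_size: minimum area in square pixels (illustrative assumption).
    """
    kept = []
    for x1, y1, x2, y2 in boxes:
        # Clamp to zero so malformed boxes count as empty rather than negative.
        area = max(0, x2 - x1) * max(0, y2 - y1)
        if area >= second_set_size:
            kept.append((x1, y1, x2, y2))
    return kept


if __name__ == "__main__":
    detections = [(0, 0, 100, 80), (200, 200, 210, 212), (50, 50, 82, 82)]
    print(filter_objects_by_size(detections))
```

Objects below the threshold are dropped individually while the image itself stays in play, which matches the per-object exclusion described above.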

Figure 5 shows an example of selection according to object size when collecting a field image data set according to an embodiment of the present invention.

Referring to FIG. 5, the field image contains a plurality of objects, and only the objects that are greater than or equal to the second set size can be adopted.

In addition, the image selection process based on the presence or absence of the field background compares the background in the field image with a set background preset for each site, adopts images that include the set background, and excludes images that do not. The set background is preset as a background that reflects the characteristics of each site.

For example, among the images collected from collection image sensors 12 installed at a site with an overpass, images that include the overpass are adopted preferentially, and among the images collected from collection image sensors 12 installed at a seaside site, images that include the sea are adopted preferentially. This allows the field image data set to include images with the varied backgrounds of the plurality of sites.
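The per-site background check could look like the sketch below. It assumes that some upstream step (not described in the source) has already produced a set of background labels for each image; the site identifiers, the label strings, and the names `SET_BACKGROUNDS` and `has_set_background` are all hypothetical.

```python
# Hypothetical mapping from site identifier to the background that
# characterizes that site, following the overpass and seaside examples.
SET_BACKGROUNDS = {
    "overpass_site": "overpass",
    "seaside_site": "sea",
}


def has_set_background(site_id, background_labels, set_backgrounds=SET_BACKGROUNDS):
    """Return True if the image shows the background preset for its site.

    background_labels: labels found in the image by an assumed upstream
    scene-analysis step. An unconfigured site raises KeyError so that a
    missing preset is noticed instead of silently rejecting every image.
    """
    required = set_backgrounds[site_id]
    return required in background_labels


if __name__ == "__main__":
    print(has_set_background("overpass_site", {"overpass", "road"}))
    print(has_set_background("seaside_site", {"road", "building"}))
```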

In addition, the image selection process according to capture time compares the capture time of the field image with a preset set time period, adopts the image if the capture time falls within the set time period, and excludes the image if the capture time falls outside it. For example, the set time period may be set as the time remaining after excluding a predetermined period from after sunset to before sunset. Accordingly, the field image data set includes images of a plurality of sites captured at various times, and more accurate image data can be used.
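A minimal sketch of the capture-time check is given below. The boundaries `SET_START` and `SET_END` are assumed values chosen for illustration; the description does not give concrete hours, and an actual system might derive them from local sunrise and sunset times.

```python
from datetime import datetime, time

# Assumed set time period; the description gives no concrete hours.
SET_START = time(6, 0)
SET_END = time(18, 0)


def within_set_period(captured_at: datetime,
                      start: time = SET_START,
                      end: time = SET_END) -> bool:
    """Return True if the capture time lies inside the set time period."""
    t = captured_at.time()
    if start <= end:
        return start <= t <= end
    # The period wraps past midnight (for example 22:00 to 04:00).
    return t >= start or t <= end
```

An image for which `within_set_period` returns `True` would be adopted, and any other image would be excluded.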

In addition, the image selection process according to the degree of overlap between objects determines, when a field image contains a plurality of objects, whether the degree of overlap between the objects satisfies a preset overlap criterion, and selects images accordingly. The degree of overlap between objects means the extent to which the objects overlap one another. This selection process may be performed by a preset analysis algorithm. In this example, the overlap criterion is that none of the four edges of the rectangular bounding box that marks a detected object in the image is completely hidden by an object in front of it. That is, if even one of the four edges of the bounding box of a first object is completely hidden by another object in front, the overlap criterion is judged not to be satisfied and the image in which that object appears is excluded. If only the first object were excluded from the image, the first object could be recognized as background, so the entire image is excluded for accurate learning.

FIG. 6 shows an example of selection according to the degree of overlap between objects when collecting a field image data set according to an embodiment of the present invention.

Referring to FIG. 6a, two objects, a first object (1) and a second object (2), are detected in the image, and one of the four edges of the first bounding box marking the first object (1) is completely hidden by the second object (2). In this case, the degree of overlap between the objects is judged not to satisfy the overlap criterion, and the image is excluded.

Referring to FIG. 6b, a plurality of objects (1)(2) are detected in the image, and none of the four edges of any object's bounding box is completely hidden. The degree of overlap between the objects is therefore judged to satisfy the overlap criterion, and the image is adopted.
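The edge-based overlap criterion illustrated in FIG. 6a and FIG. 6b can be sketched geometrically. The sketch below makes two assumptions that are not stated in the description: the occluding object is approximated by its own bounding box, and the boxes are supplied ordered from farthest to nearest so that only later boxes can hide earlier ones. All function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) with x1 < x2, y1 < y2
Point = Tuple[float, float]


def _contains(box: Box, point: Point) -> bool:
    x1, y1, x2, y2 = box
    x, y = point
    return x1 <= x <= x2 and y1 <= y <= y2


def _edges(box: Box) -> List[Tuple[Point, Point]]:
    """Return the four edges of the box as pairs of corner points."""
    x1, y1, x2, y2 = box
    return [((x1, y1), (x2, y1)), ((x2, y1), (x2, y2)),
            ((x2, y2), (x1, y2)), ((x1, y2), (x1, y1))]


def has_fully_hidden_edge(rear: Box, front: Box) -> bool:
    """True if any edge of `rear` lies entirely inside `front`.

    A rectangle is convex, so an edge is covered when both endpoints are.
    """
    return any(_contains(front, p) and _contains(front, q)
               for p, q in _edges(rear))


def satisfies_overlap_criterion(boxes_far_to_near: List[Box]) -> bool:
    """False as soon as a nearer box completely hides an edge of a farther box."""
    for i, rear in enumerate(boxes_far_to_near):
        for front in boxes_far_to_near[i + 1:]:
            if has_fully_hidden_edge(rear, front):
                return False
    return True
```

With a rear box `(0, 0, 10, 10)`, a front box `(8, -1, 20, 11)` covers the whole right edge and the image would be excluded, whereas a front box `(8, 5, 20, 15)` covers only part of two edges and the image would be adopted.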

In addition, the image selection process according to the degree of overlap between an object and the background applies when an object in the field image is hidden by the background: the degree of overlap between the object and the background, meaning the extent to which the object is hidden by the background, is compared with a preset set overlap ratio to select images.

If the degree of overlap between the object and the background is greater than or equal to the set overlap ratio, the image is excluded; if it is less than the set overlap ratio, the image is adopted. This selection process may be performed by a preset analysis algorithm. In this example, the set overlap ratio is 60%.

FIG. 7 shows an example of selection according to the degree of overlap between an object and the background when collecting a field image data set according to an embodiment of the present invention.

Referring to FIG. 7, an object (O) detected in the image is hidden by a road sign (S). If 60% or more of the object (O) is hidden by the road sign (S), the image is excluded.
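The comparison against the 60% set overlap ratio can be sketched as below. It is assumed here, for illustration only, that both the object and the occluding background element (such as the road sign) are approximated by axis-aligned bounding boxes and that the degree of overlap is the fraction of the object's box area that is covered; the description leaves the actual analysis algorithm open.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

SET_OVERLAP_RATIO = 0.60  # the 60% example value from the description


def occluded_fraction(obj: Box, occluder: Box) -> float:
    """Fraction of the object's box area covered by the occluder's box."""
    ox1, oy1, ox2, oy2 = obj
    bx1, by1, bx2, by2 = occluder
    inter_w = max(0.0, min(ox2, bx2) - max(ox1, bx1))
    inter_h = max(0.0, min(oy2, by2) - max(oy1, by1))
    area = (ox2 - ox1) * (oy2 - oy1)
    return (inter_w * inter_h) / area if area > 0 else 0.0


def adopt_by_background_overlap(obj: Box, occluder: Box,
                                threshold: float = SET_OVERLAP_RATIO) -> bool:
    """Adopt when the overlap is below the threshold; exclude at or above it."""
    return occluded_fraction(obj, occluder) < threshold
```

The strict `<` comparison mirrors the rule that an overlap equal to the set overlap ratio already leads to exclusion.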

In the image selection step (S22-2), images are selected according to selection items that include the shape of the object, the size of the object, the presence or absence of the field background, the capture time of the field image, the degree of overlap between objects in the field image, and the degree of overlap between an object and the background in the field image. The order in which these items are applied may be varied. In this embodiment an image is selected only when it satisfies all of the selection items, but the invention is not limited to this, and images that satisfy at least some of the selection items may of course be selected.
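One way to picture how the selection items combine is the sketch below. The dictionary-based image record, the check names, and the `required` parameter are all hypothetical. Reading "at least some of the selection items" as enforcing a chosen subset of the checks is an interpretation made for this sketch, not something the description specifies.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Check = Callable[[dict], bool]


def select_image(image: dict,
                 checks: Dict[str, Check],
                 required: Optional[Iterable[str]] = None
                 ) -> Tuple[bool, Dict[str, bool]]:
    """Apply the selection items to one image.

    With required=None every item must pass (the default embodiment).
    Otherwise only the named items are enforced. The outcome does not
    depend on the order in which the items are evaluated.
    """
    names = set(checks) if required is None else set(required)
    results = {name: checks[name](image) for name in checks if name in names}
    return all(results.values()), results
```

For example, an image that passes a size check but fails a capture-time check would be excluded when all items are required, and adopted when only the size item is enforced.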

The images selected and adopted through the image selection step (S22-2) configured as described above form the selected image data set. The adopted images may be further selected so that the proportions of the object types are balanced.
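The optional balancing of object types could, for example, be done by downsampling every type to the size of the rarest one. The sketch below assumes, for simplicity, that each adopted image carries a single object-type label; the description does not specify how balance is achieved, and images containing several object types would need a more elaborate scheme.

```python
import random
from collections import defaultdict
from typing import List, Tuple


def balance_by_class(images: List[Tuple[str, str]], seed: int = 0) -> List[Tuple[str, str]]:
    """Downsample (image_id, object_type) pairs so every type is equally frequent."""
    if not images:
        return []
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for image_id, label in images:
        by_class[label].append((image_id, label))
    # Every type is reduced to the count of the rarest type.
    target = min(len(items) for items in by_class.values())
    balanced = []
    for label in sorted(by_class):
        balanced.extend(rng.sample(by_class[label], target))
    return balanced
```

Downsampling discards data, so a real pipeline might instead collect more images of the rarer types; the choice here is only for brevity.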

Among the field images, the images excluded in the image selection step (S22-2) are classified into a noise image data set.

When the image selection step (S22-2) is complete, the data set mixing step (S22-3) is performed.

The data set mixing step (S22-3) mixes the selected image data set and the noise image data set at a preset ratio. In this example, the ratio is 1:1. However, the invention is not limited to this; if the selected image data set alone is judged to be large enough for additional learning, the field image data set may of course consist of the selected image data set only.
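The mixing step can be sketched as follows. The function name, the `noise_per_selected` parameter, and the choice to keep every selected image while randomly sampling the noise images are assumptions for illustration; the description specifies only that the two data sets are mixed at a preset ratio such as 1:1.

```python
import random
from typing import List


def mix_data_sets(selected: List[str], noise: List[str],
                  noise_per_selected: float = 1.0, seed: int = 0) -> List[str]:
    """Mix selected and noise images; 1.0 corresponds to the 1:1 example.

    All selected images are kept and noise images are sampled to match the
    ratio. If too few noise images exist, all of them are used.
    """
    rng = random.Random(seed)
    n_noise = min(len(noise), round(len(selected) * noise_per_selected))
    mixed = list(selected) + rng.sample(list(noise), n_noise)
    rng.shuffle(mixed)
    return mixed
```

Setting `noise_per_selected=0.0` yields a field image data set made of the selected images only, which corresponds to the alternative mentioned above.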

The field image data set storage step (S22-4) stores the data set mixed in the data set mixing step (S22-3) as the field image data set.

Once the field image data set has been generated in this way, the fine-tuning model learning step (S23) is performed using the field image data set.

In the fine-tuning model learning step (S23), the model learning unit (32) further trains the initial image deep learning model with the field image data set to generate an image deep learning fine-tuning model that derives the type of an object while reflecting the characteristics of the site. Through this additional training with the field image data set, the type of an object in an image can be derived more accurately, reflecting the characteristics of the site where the image sensor is installed.

In the verification step (S24), a pre-stored verification data set is input to both the initial image deep learning model and the image deep learning fine-tuning model, and the results derived from each are compared to verify the accuracy of the image deep learning fine-tuning model. If the accuracy of the image deep learning fine-tuning model is below a preset reference accuracy, the selection criteria and the like may be adjusted in the field image data set generation step.
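The verification logic can be sketched with the two models represented as plain prediction functions. The function names, the use of simple classification accuracy as the metric, and the reference accuracy of 0.9 are assumptions for illustration; the description does not state which metric or threshold is used.

```python
from typing import Callable, Dict, List, Tuple

Predict = Callable[[object], str]


def accuracy(predict: Predict, validation_set: List[Tuple[object, str]]) -> float:
    """Share of validation samples whose predicted type matches the label."""
    correct = sum(1 for image, label in validation_set if predict(image) == label)
    return correct / len(validation_set)


def verify(initial_predict: Predict, finetuned_predict: Predict,
           validation_set: List[Tuple[object, str]],
           reference_accuracy: float = 0.9) -> Dict[str, object]:
    """Compare both models on the same verification data set."""
    base = accuracy(initial_predict, validation_set)
    tuned = accuracy(finetuned_predict, validation_set)
    return {
        "initial": base,
        "fine_tuned": tuned,
        "improved": tuned > base,
        # Below the reference accuracy, the selection criteria may be revisited.
        "needs_criteria_adjustment": tuned < reference_accuracy,
    }
```

When `needs_criteria_adjustment` is `True`, the selection criteria used to build the field image data set would be candidates for adjustment before training again.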

As described above, the present invention constructs the field image data set from field images of the actual sites for which object information is to be derived, so that the image deep learning fine-tuning model generated by additional training with the field image data set can derive object information while reflecting the characteristics of the site more accurately.

The present invention has been described with reference to the embodiments shown in the drawings, but these are merely exemplary, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible from them. Therefore, the true scope of technical protection of the present invention should be determined by the technical spirit of the appended claims.

12: Collection image sensor 30: Server
31: Data set construction unit 32: Model learning unit

Claims

An image sensor installed at each of a plurality of preset sites to capture field images; and
A server that filters the field images according to preset selection criteria, collects images that satisfy the selection criteria into a selected image data set, collects at least some of the field images not selected for the selected image data set into a noise image data set, and mixes the selected image data set and the noise image data set at a preset ratio to construct a field image data set,
Wherein the server includes
A data set construction unit that selects the selected image data set by comparing, in the field images, the shape of an object, the size of the object, the presence or absence of a field background, the capture time, the degree of overlap between objects, and the degree of overlap between an object and the background with selection criteria set differently for each,
A system for collecting field image data sets for learning artificial intelligence image deep learning models.

(deleted)

An image sensor installed at each of a plurality of preset sites to capture field images; and
A server that filters the field images according to preset selection criteria, collects images that satisfy the selection criteria into a selected image data set, collects at least some of the field images not selected for the selected image data set into a noise image data set, and mixes the selected image data set and the noise image data set at a preset ratio to construct a field image data set,
Wherein the server includes
A model learning unit that generates an initial image deep learning model, which derives the type of an object in an image, by learning a pre-stored general-purpose image data set, and that further trains the initial image deep learning model with the field image data set to generate an image deep learning fine-tuning model, which derives the type of an object in an image while reflecting the characteristics of the site,
A system for collecting field image data sets for learning artificial intelligence image deep learning models.

A field image collection step in which a server collects field images captured by image sensors installed at a plurality of preset sites;
An image selection step in which the server filters the field images according to preset selection criteria and collects images satisfying the selection criteria into a selection image data set;
A data set mixing step in which the server mixes the selected image data set and a noise image data set obtained by collecting at least some of the images that were not selected as the selected image data set in the image selection step at a preset ratio;
A data set storage step in which the server constructs the data set mixed in the data set mixing step as a field image data set,
Wherein the image selection step
Selects the selected image data set by comparing, in the field images, at least some of the shape of an object, the size of the object, the presence or absence of a field background, the capture time, the degree of overlap between objects, and the degree of overlap between an object and the background with selection criteria set differently for each,
A method of collecting field image data sets for learning artificial intelligence image deep learning models.

(deleted)

In the method of claim 4,
The image selection step includes
A process of comparing the proportion of the object's shape that appears in the field image with a preset set ratio and,
If the proportion of the object's shape that appears is greater than or equal to the set ratio, adopting the image in which the object appears,
A method of collecting field image data sets for learning artificial intelligence image deep learning models.

In the method of claim 6,
The image selection step includes a process of
Excluding the image in which the object appears if the proportion of the object's shape that appears is less than the set ratio and the size of the exposed part of the object is greater than or equal to a preset set size, and
Excluding only the object in question from the image in which it appears and adopting the image if the proportion of the object's shape that appears is less than the set ratio and the size of the exposed part of the object is less than the preset set size,
A method of collecting field image data sets for learning artificial intelligence image deep learning models.

In the method of claim 4,
The image selection step includes
A process of comparing the size of the object in the field image with a preset set size and,
If the size of the object is greater than or equal to the set size, adopting the image in which the object appears,
A method of collecting field image data sets for learning artificial intelligence image deep learning models.

In the method of claim 4,
The image selection step includes
A process of comparing the background in the field image with a set background preset for each site and adopting an image that includes the set background,
A method of collecting field image data sets for learning artificial intelligence image deep learning models.

In the method of claim 4,
The image selection step includes
A process of comparing the capture time of the field image with a preset set time period and,
If the capture time is within the set time period, adopting the image,
A method of collecting field image data sets for learning artificial intelligence image deep learning models.

In the method of claim 4,
The image selection step includes a process of
Comparing the degree of overlap between objects in the field image with a preset overlap criterion,
Adopting the image in which the objects appear if the degree of overlap between the objects satisfies the overlap criterion, and
Excluding the image in which the objects appear if the degree of overlap between the objects does not satisfy the overlap criterion,
A method of collecting field image data sets for learning artificial intelligence image deep learning models.

In the method of claim 4,
The image selection step includes a process of
Comparing the degree of overlap between the object and the background in the field image with a preset set overlap ratio,
Adopting the image in which the object appears if the degree of overlap between the object and the background is less than the set overlap ratio, and
Excluding the image in which the object appears if the degree of overlap between the object and the background is greater than or equal to the set overlap ratio,
A method of collecting field image data sets for learning artificial intelligence image deep learning models.

A field image collection step in which a server collects field images captured by image sensors installed at a plurality of preset sites;
An image selection step in which the server filters the field images according to preset selection criteria and collects images satisfying the selection criteria into a selection image data set;
A data set mixing step in which the server mixes the selected image data set and a noise image data set obtained by collecting at least some of the images that were not selected as the selected image data set in the image selection step at a preset ratio;
A data set storage step in which the server constructs the data set mixed in the data set mixing step as a field image data set,
Wherein the image selection step includes:
A process of comparing the proportion of the object's shape that appears in the field image with a preset set ratio, adopting the image in which the object appears if that proportion is greater than or equal to the set ratio, excluding the image in which the object appears if that proportion is less than the set ratio and the size of the exposed part of the object is greater than or equal to a preset set size, and excluding only the object in question from the image in which it appears and adopting the image if that proportion is less than the set ratio and the size of the exposed part of the object is less than the preset set size;
A process of comparing the size of the object in the field image with a preset set size and adopting the image in which the object appears if the size of the object is greater than or equal to the set size;
A process of comparing the background in the field image with a set background preset for each site and adopting an image that includes the set background;
A process of comparing the capture time of the field image with a preset set time period and adopting the image if the capture time is within the set time period;
A process of comparing the degree of overlap between objects in the field image with a preset overlap criterion, adopting the image in which the objects appear if the degree of overlap between the objects satisfies the overlap criterion, and excluding the image in which the objects appear if the degree of overlap between the objects does not satisfy the overlap criterion; and
A process of comparing the degree of overlap between the object and the background in the field image with a preset set overlap ratio, adopting the image in which the object appears if the degree of overlap between the object and the background is less than the set overlap ratio, and excluding the image in which the object appears if the degree of overlap between the object and the background is greater than or equal to the set overlap ratio,
A method of collecting field image data sets for learning artificial intelligence image deep learning models.