KR101938491B1

KR101938491B1 - Deep learning-based streetscape safety score prediction method

Info

Publication number: KR101938491B1
Application number: KR1020170109793A
Authority: KR
Inventors: 강행봉
Original assignee: 가톨릭대학교 산학협력단
Priority date: 2017-08-30
Filing date: 2017-08-30
Publication date: 2019-01-14

Abstract

The present invention relates to a deep learning-based street safety score prediction method comprising: (a) a step of constructing a city image data set consisting of a plurality of city image data; (b) a step of applying safety base data assigned to the city image data to a preregistered ranking determination algorithm to calculate a safety score of the city image data; (c) a step of applying the city image data to a saliency estimation algorithm to extract an environmental context map; and (d) a step of applying the environmental context map and the safety score for the city image data to a convolutional neural network to train the convolutional neural network. Accordingly, environmental context which is an abstract high-level feature is extracted, and city safety can be accurately predicted by using the environmental context.

Description

DEEP LEARNING BASED STREETSCAPE SAFETY SCORE PREDICTION METHOD

The present invention relates to a deep learning-based street safety score prediction method.

Over the past several decades, various studies have sought to revitalize the local economies of cities. These studies focus on making cities attractive to tourists, for example by turning the city into a space of pleasure and consumption through the concept of the night-time economy (NTE), or by hosting local festivals to obtain economic benefits.

Creating safe spaces is known to have a positive effect on visitor numbers and consumer spending, so reducing the fear of crime has come to play an important role in revitalizing local economies. Accordingly, research on predicting crime occurrence or safety is under way in order to reduce the fear of crime.

In the past, such analysis required a great deal of information and cost, but recent advances in computer vision and big data have made it possible to predict the urban environment from images. In particular, with regard to crime prediction, the paper by Bachner et al., "Predictive policing: Preventing crime with data and analytics" (IBM Center for the Business of Government, 2013), and the article by Thompson et al., "The Santa Cruz Experiment: Can a City's Crime Be Predicted" (Popular Science, 2011-10), have attracted considerable public attention.

One approach to predicting city safety is based on the "broken windows theory" of Wilson and Kelling. The theory holds that if a single broken window on a street is left unrepaired, crime begins to spread outward from that point; in other words, minor disorder that is left unattended spreads into disorder across the whole city. It follows that a place in a city that appears visually disordered is likely to have a low level of safety in practice. On the basis of this theory, studies have evaluated the safety of urban street images using visual perception and have used machine learning algorithms to learn and predict those values.

Meanwhile, various studies on crime prediction and city safety prediction are being conducted. The above-mentioned paper by Bachner et al. used business techniques developed by retailers such as Netflix and WalMart to predict consumer behavior, and predicted crime on that basis.

The above-mentioned article by Thompson et al. describes work applied in Santa Cruz, California, that made effective and efficient crime response possible even with a small number of personnel, by using big data to predict the times and places where crime was likely to occur.

Most of these existing studies proceed in three stages. First, the data to be used for learning are collected; records of past crimes, questionnaire responses, SNS data, and images have mainly been used as training data. Next, features are extracted from the data: features of the time and place of crime occurrence are identified or, when images are used as data, global features such as color and gradient are extracted. Finally, a prediction model is generated from the extracted features.

Most existing studies obtain low-level features such as color, gradient, and texture from city images through global feature extraction. This limits their ability to predict abstract information such as the safety of an urban environment.

Accordingly, the present invention has been made to solve the above problems, and its object is to provide a deep learning-based street safety score prediction method that extracts the environmental context, which is an abstract high-level feature, and uses it to predict the safety of a city.

According to the present invention, the above object is achieved by a deep learning-based street safety score prediction method comprising: (a) a step in which a city image data set composed of a plurality of city image data is constructed; (b) a step in which safety base data assigned to each item of city image data is applied to a pre-registered ranking algorithm to calculate a safety score for each item of city image data; (c) a step in which each item of city image data is applied to a saliency estimation algorithm to extract an environmental context map; and (d) a step in which the environmental context map and the safety score for each item of city image data are applied to a convolutional neural network to train the convolutional neural network.

Here, in step (b), the safety base data may be assigned through the results of pairwise comparison experiments on two items of city image data, and the ranking algorithm may include the TrueSkill algorithm.

In step (c), based on the safety score, the upper 30% of the city image data may be extracted as safe image data and the lower 30% as dangerous image data, and these may be applied to the saliency estimation algorithm.

In step (d), the loss between the estimated safety score produced by the convolutional neural network and the safety score may be calculated and reflected in the training.

The loss may be calculated as a Euclidean loss.

With the above configuration, the present invention provides a deep learning-based street safety score prediction method that extracts the environmental context, which is an abstract high-level feature, and uses it to predict the safety of a city.

FIG. 1 is a diagram for explaining the deep learning-based street safety score prediction method according to the present invention.
FIG. 2 is a diagram for explaining the process of determining the safety score of city image data in the deep learning-based street safety score prediction method according to the present invention.
FIG. 3 is a diagram for explaining the process of extracting an environmental context map in the deep learning-based street safety score prediction method according to the present invention.
FIG. 4 is a diagram for explaining the C-S reconstruction model according to the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram for explaining the deep learning-based street safety score prediction method according to the present invention.

First, a plurality of city image data is collected to construct a city image data set (S10). In the present invention, the Place Pulse 1.0 data from the paper by P. Salesses et al., "The collaborative image of the city: mapping the inequality of urban perception" (PloS one 8.7 (2013): e68400), together with separately collected city image data, were used to construct a city image data set containing 2,286 city images.

Once the city image data set is constructed, pairwise comparison experiments are conducted with a number of participants to obtain safety base data for each item of city image data (S11). More specifically, as shown in FIG. 2(a), a participant is shown two city images and asked to choose which one looks safer (or whether they look the same). In the present invention, by way of example, a total of 32 participants performed 23,361 pairwise comparisons using the 2,286 city images described above, yielding safety base data based on visual perception.

Next, the safety base data obtained through the pairwise comparison experiments is applied to the TrueSkill algorithm to calculate the safety score of each item of city image data (S12). The TrueSkill algorithm uses a Bayesian graphical model to rank players in online games; in the present invention, the city image selected in response to the question posed in a pairwise comparison is treated as the winner of a one-on-one match.

The skill of each city image is modeled as a normal distribution N(μ, σ²) and is updated each time a comparison is performed. For two players, that is, two city images x and y, when player x beats y the update is performed through [Equation 1] and [Equation 2].

[Equation 1]

[Equation 2]

Here, the TrueSkill ratings of x and y are N(μ_x, σ_x²) and N(μ_y, σ_y²). β is a predefined constant denoting the per-game variance, and ε is the empirically estimated probability that the two city images x and y draw. The functions f and g are defined using the normal probability density function and the normal cumulative distribution function.

In the present invention, a fixed initial skill was applied to every city image, and fixed values were used for the constants. [Equation 1] updates the mean of a city image's skill, and [Equation 2] updates its variance. Of the TrueSkill ratings updated through [Equation 1] and [Equation 2], the mean μ is used to determine the final ranking.
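The update described above can be sketched in code. The following is a minimal illustration of the standard two-player TrueSkill win update, which matches the roles the text assigns to the mean, the variance, β, ε, f, and g; whether the patent's [Equation 1] and [Equation 2] take exactly this form is an assumption. The function names, the `draw_margin` parameter (standard TrueSkill derives a margin from the draw probability), and the starting values 25, 25/3, and 25/6 (common TrueSkill defaults) are illustrative and are not taken from the source.

```python
import math

def normal_pdf(t):
    """Standard normal probability density function."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def normal_cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def f(theta, margin=0.0):
    """Mean-update factor: pdf / cdf, shifted by the draw margin."""
    return normal_pdf(theta - margin) / normal_cdf(theta - margin)

def g(theta, margin=0.0):
    """Variance-update factor: f * (f + theta - margin)."""
    v = f(theta, margin)
    return v * (v + theta - margin)

def update_after_win(winner, loser, beta, draw_margin=0.0):
    """Return updated (mean, variance) pairs after `winner` beats `loser`.

    `winner` and `loser` are (mu, sigma_squared) tuples. `beta` is the
    per-game standard deviation; `draw_margin` is an assumed parameter.
    """
    mu_w, var_w = winner
    mu_l, var_l = loser
    c = math.sqrt(2.0 * beta ** 2 + var_w + var_l)
    theta = (mu_w - mu_l) / c
    m = draw_margin / c
    v, w = f(theta, m), g(theta, m)
    # Mean update: the winner moves up, the loser moves down.
    # Variance update: both variances shrink.
    new_winner = (mu_w + (var_w / c) * v, var_w * (1.0 - (var_w / c ** 2) * w))
    new_loser = (mu_l - (var_l / c) * v, var_l * (1.0 - (var_l / c ** 2) * w))
    return new_winner, new_loser

# Assumed starting values (common TrueSkill defaults, not from the patent).
prior = (25.0, (25.0 / 3.0) ** 2)
safer, riskier = update_after_win(prior, prior, beta=25.0 / 6.0)
```

In this sketch the image judged safer gains mean skill and the other loses the same amount when both start from the same prior, while both variances decrease, reflecting increased certainty after each comparison.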

Here, in the present invention, by way of example, the final ranking of the city image data is normalized to a value between 0 and 10, which is taken as the safety score of each item of city image data. FIG. 2(b) shows examples of the safety scores calculated for individual city images.
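As an illustration of the normalization step, the sketch below rescales a list of TrueSkill means to the range 0 to 10. The text states only that the ranking is normalized to that range; linear min-max scaling and the function name `normalize_scores` are assumptions made for this example.

```python
def normalize_scores(means, low=0.0, high=10.0):
    """Linearly rescale TrueSkill means to [low, high] (assumed min-max scaling)."""
    smallest, largest = min(means), max(means)
    if largest == smallest:
        # All images tied: place every score at the midpoint.
        return [(low + high) / 2.0 for _ in means]
    scale = (high - low) / (largest - smallest)
    return [low + (m - smallest) * scale for m in means]

scores = normalize_scores([20.0, 25.0, 30.0])
```

With this choice the lowest-ranked image receives 0 and the highest-ranked image receives 10.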

As described above, once the safety scores have been calculated for the city image data, the process of extracting an environmental context map from the city image data (S14) proceeds. In the present invention, by way of example, the top 30% of images by safety score are extracted as safe image data and the bottom 30% as dangerous image data (S13), and environmental context maps are extracted from these.
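The 30% selection can be sketched as follows. The function name `split_safe_dangerous`, the use of integer floor rounding for the group size, and the handling of ties by sort order are assumptions; the text specifies only the upper and lower 30%.

```python
def split_safe_dangerous(scores, percent=30):
    """Return (safe, dangerous) index lists: top and bottom `percent` by score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    k = len(scores) * percent // 100  # floor rounding is an assumption
    dangerous = order[:k]
    safe = order[len(order) - k:]
    return safe, dangerous

safe, dangerous = split_safe_dangerous([5, 1, 9, 3, 7, 2, 8, 4, 6, 0])
```

Under floor rounding, a data set of 2,286 images would yield 685 images in each group; the middle 40% is left out of context map extraction.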

FIG. 3 is a diagram for explaining the process of extracting an environmental context map in the deep learning-based street safety score prediction method according to the present invention. Referring to FIG. 3, in the present invention, by way of example, a saliency map is extracted using the saliency estimation algorithm disclosed in the paper by Xia et al., "Bottom-up visual saliency estimation with deep autoencoder-based sparse reconstruction" (IEEE Transactions on Neural Networks and Learning Systems 27.6 (2016): 1227-1240).

The saliency estimation algorithm randomly extracts 8,000 training images of size 16×16×3 from the input city image, and extracts 8,000 target images of size 8×8×3 from the center of each of these sample images. The C-S (center-surround) reconstruction model is then trained using the training images and target images.
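The sampling step can be illustrated with a short NumPy sketch. It draws random 16×16×3 windows from one image and takes the central 8×8×3 region of each as its target. Uniform random window positions, the fixed seed, and the function name `sample_patches` are assumptions for illustration.

```python
import numpy as np

def sample_patches(image, n_patches=8000, size=16, center=8, seed=0):
    """Sample random size x size patches and their central center x center targets."""
    rng = np.random.default_rng(seed)
    height, width, _ = image.shape
    # Top-left corners chosen uniformly so every patch lies inside the image.
    ys = rng.integers(0, height - size + 1, n_patches)
    xs = rng.integers(0, width - size + 1, n_patches)
    train = np.stack([image[y:y + size, x:x + size] for y, x in zip(ys, xs)])
    offset = (size - center) // 2
    target = train[:, offset:offset + center, offset:offset + center]
    return train, target

demo_image = np.zeros((64, 64, 3), dtype=np.float32)
train, target = sample_patches(demo_image, n_patches=10)
```

With the defaults, `train` has shape (8000, 16, 16, 3) and `target` has shape (8000, 8, 8, 3), matching the sizes given in the text.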

FIG. 4 is a diagram for explaining the C-S reconstruction model according to the present invention. Referring to FIG. 3 and FIG. 4, the C-S reconstruction model consists of four encoder layers, four decoder layers, and two inference layers.

As shown in FIG. 4(a), the encoder layers repeat convolution and max-pooling twice, and the decoder layers repeat max-unpooling and deconvolution twice. The inference layers infer the target image from the output of the autoencoder using fully connected layers which, as shown in FIG. 4(b), consist of 768 and 192 neurons, respectively. The resulting neurons are reshaped to 8×8×3 so that the output matches the size of the target image. ReLU and dropout are applied to the output of each layer, and kernels of size 3×3 and 5×5, respectively, are used for the convolutions.
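The layer widths can be checked with simple arithmetic. The sketch below shows that 192 neurons is exactly the number of values in an 8×8×3 target, and that 768 equals the number of values in a 16×16×3 training image; reading the 768-neuron layer as matching the input patch size is an observation, not something the text states. The helper names are hypothetical.

```python
def flat_size(height, width, channels):
    """Number of scalar values in a height x width x channels block."""
    return height * width * channels

def reshape_flat(values, height, width, channels):
    """Reshape a flat list into nested lists of shape height x width x channels."""
    assert len(values) == flat_size(height, width, channels)
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            start = (y * width + x) * channels
            row.append(values[start:start + channels])
        rows.append(row)
    return rows

first_layer = flat_size(16, 16, 3)   # 768
second_layer = flat_size(8, 8, 3)    # 192
output = reshape_flat(list(range(second_layer)), 8, 8, 3)
```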

Then, using the trained C-S reconstruction model, the difference between the predicted image and the original image (the residual map) is obtained over the entire input image as the saliency map. The saliency map is multiplied by the original image to remove unnecessary information and retain only the features that influence the judgment, yielding an environmental context map as shown in FIG. 4(c).
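A minimal NumPy sketch of this step follows. It takes an original image and a reconstruction (here supplied directly, standing in for the trained model's output) and returns the context map. Using the absolute difference averaged over color channels and scaling the residual to the range 0 to 1 are assumptions; the text says only that the residual is used as the saliency map and multiplied by the original.

```python
import numpy as np

def environmental_context_map(image, reconstruction):
    """Weight the original image by a saliency map built from the residual."""
    original = image.astype(np.float64)
    predicted = reconstruction.astype(np.float64)
    # Residual map: per-pixel reconstruction error, averaged over channels.
    residual = np.abs(original - predicted).mean(axis=2)
    span = residual.max() - residual.min()
    if span > 0:
        saliency = (residual - residual.min()) / span
    else:
        saliency = np.zeros_like(residual)
    # Multiply every channel of the original by the saliency weight.
    return original * saliency[..., np.newaxis]

demo = np.full((4, 4, 3), 10.0)
demo_reconstruction = demo.copy()
demo_reconstruction[1, 2] = 0.0  # one poorly reconstructed pixel
context = environmental_context_map(demo, demo_reconstruction)
```

In the demonstration, only the poorly reconstructed pixel survives in the context map, which is the intended effect: regions the model predicts well from their surroundings are suppressed.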

As described above, once extraction of the environmental context maps for the safe image data and the dangerous image data is complete (S14), the extracted environmental context maps are applied to a convolutional neural network (S15) and used to train it.

The convolutional neural network according to the present invention consists, by way of example, of two convolution layers and two max-pooling layers, with ReLU and dropout applied to the output of each layer. At the last fully connected layer of the convolutional neural network, the loss can be obtained from the difference between the safety score assigned by the participants in the experiment and the score predicted by the model.

More specifically, the first convolution layer of the convolutional neural network uses 11×11×3 kernels with stride 4 to obtain 96 feature maps, and the second convolution layer takes the 27×27×96 output of the max-pooling layer as input and uses 3×3×96 kernels with stride 2 to obtain 256 feature maps. Max-pooling is then applied again to the output of the second convolution layer.
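The spatial sizes in this paragraph can be checked with the usual output-size formula for convolution and pooling, floor((n − k + 2p) / s) + 1. The sketch below assumes a 227×227 input, no padding, and 3×3 max-pooling with stride 2; none of these are stated in the text, but together they reproduce the 27×27×96 size that is stated.

```python
def output_size(n, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (n - kernel + 2 * padding) // stride + 1

input_size = 227                                    # assumed input width and height
conv1 = output_size(input_size, kernel=11, stride=4)  # 96 feature maps
pool1 = output_size(conv1, kernel=3, stride=2)        # assumed 3x3 pooling, stride 2
conv2 = output_size(pool1, kernel=3, stride=2)        # 256 feature maps
pool2 = output_size(conv2, kernel=3, stride=2)        # assumed 3x3 pooling, stride 2
```

Under these assumptions the sizes run 227, 55, 27, 13, 6, so the second max-pooling layer would output a 6×6×256 volume.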

The fully connected layers following the convolution layers consist of 4,096 neurons and 1 neuron, respectively. The number of neurons in the last fully connected layer is set to 1 in order to predict the human-perceived safety score, and the loss is calculated, by way of example, as the Euclidean loss.
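For illustration, the Euclidean loss between predicted and participant-derived safety scores can be written as below. The 1/(2N) scaling follows a common deep learning framework convention and is an assumption; the text names the loss without giving its exact form.

```python
def euclidean_loss(predicted, target):
    """Sum of squared differences divided by 2N (assumed scaling convention)."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    n = len(predicted)
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / (2.0 * n)

loss = euclidean_loss([1.0, 3.0], [1.0, 1.0])
```

Because the network has a single output neuron, each element of `predicted` is one image's estimated safety score on the same 0 to 10 scale as the target.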

Through the extraction of environmental context maps described above and the training of the convolutional neural network on them, the deep learning-based street safety score prediction method of the present invention is constructed, allowing the safety of newly input city images to be predicted more accurately.

Although several embodiments of the present invention have been shown and described, those of ordinary skill in the art to which the present invention pertains will appreciate that these embodiments may be modified without departing from the principles or spirit of the invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims

1. A deep learning-based street safety score prediction method implemented by a computer program, comprising:
(a) registering a city image data set composed of a plurality of city image data in the computer program;
(b) calculating a safety score for each item of city image data by applying safety base data assigned to each item of city image data to a ranking algorithm pre-registered in the computer program;
(c) extracting an environmental context map by applying each item of city image data to a saliency estimation algorithm pre-registered in the computer program; and
(d) training a convolutional neural network pre-registered in the computer program by applying the environmental context map and the safety score for each item of city image data to the convolutional neural network.

2. The method according to claim 1,
wherein, in step (b), the safety base data is assigned through the results of pairwise comparison experiments on two items of city image data; and
wherein the ranking algorithm includes the TrueSkill algorithm.

3. The method according to claim 1,
wherein, in step (c), based on the safety score, the upper 30% of the city image data is extracted as safe image data and the lower 30% as dangerous image data, and these are applied to the saliency estimation algorithm.

4. The method according to claim 1,
wherein, in step (d), the loss between the estimated safety score produced by the convolutional neural network and the safety score is calculated and reflected in the training.

5. The method according to claim 4,
wherein the loss is calculated as a Euclidean loss.