KR101725629B1

KR101725629B1 - System and method for predicting vehicular traffic based on genetic programming using fitness function considering error magnitude

Info

Publication number: KR101725629B1
Application number: KR1020150059203A
Authority: KR
Inventors: 원동호; 강동우; 김지예; 문종호; 이동훈; 정재욱; 최윤성
Original assignee: 성균관대학교산학협력단
Priority date: 2015-04-27
Filing date: 2015-04-27
Publication date: 2017-04-12
Also published as: KR20160127896A

Abstract

A genetic programming based traffic volume prediction system according to embodiments of the present invention may include: a genetic programming database that stores genetic programming environment variables and training data; an evolution procedure processing unit that performs evolution procedures, including crossover and mutation, on individuals of the current generation composed of functions and terminals for genetic programming, thereby generating individuals of the next generation; and an individual selection unit that selects at least some of the next-generation individuals using a fitness function that assigns different weights to different error intervals, the error being the difference between a predicted value obtained from a next-generation individual and the training data, and that provides the selected individuals to the evolution procedure processing unit.

Description

System and method for predicting traffic volume based on genetic programming using a fitness function that considers error magnitude

The present invention relates to traffic volume prediction techniques and, more particularly, to traffic volume prediction techniques based on genetic programming.

Traffic volume prediction is an important part of the Intelligent Transport Systems (ITS) program, which is one of the national priorities. ITS is being studied as a way to make the operation and management of transportation systems more scientific and automated, and to improve the efficiency and stability of traffic, by applying advanced technologies such as electronics, control, and communications, together with traffic information, to means of transport and transport facilities.

Various prediction techniques are being tested for efficient traffic volume prediction, and techniques that apply genetic algorithms or genetic programming are being proposed one after another.

A genetic algorithm is a computational model built to derive an optimal alternative by imitating the evolutionary process of the natural world and the selection and culling that accompany it; it is one of the global optimization techniques.

Genetic programming combines programming techniques with genetic algorithms; its genes form structures such as trees, linear structures, graphs, or stacks. Each gene is either a function, which is itself a small program fragment, or a terminal, such as a state variable. Individuals that carry these functions and terminals as genes undergo predetermined evolution procedures in every generation, for example crossover, mutation, and replacement; some individuals of the offspring generation are selected, and the selected individuals go through the evolution procedures again. Repeating this process ultimately selects the most optimized individual. In this way, an optimized program can be generated automatically and dynamically.

Genetic algorithms and genetic programming select or cull gene combinations using fitness functions that quantify how good a particular gene combination is.

The quality of a gene combination can be quantified from the error, that is, the difference between data that actually existed in the past and the predicted data. Existing fitness functions include, for example, the standard deviation of error (SDE), root mean square error (RMSE), mean absolute percentage error (MAPE), mean absolute error (MAE), Theil's U statistic, and the Durbin-Watson statistic.
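Three of the conventional measures named above can be sketched as follows. This is a minimal illustration, not code from the patent: the function names are hypothetical, and the error is assumed to be the actual value minus the predicted value.

```python
import math

def errors(actual, predicted):
    """Per-sample errors, taken here as actual minus predicted (an assumption)."""
    return [a - p for a, p in zip(actual, predicted)]

def mae(actual, predicted):
    """Mean absolute error."""
    e = errors(actual, predicted)
    return sum(abs(x) for x in e) / len(e)

def rmse(actual, predicted):
    """Root mean square error."""
    e = errors(actual, predicted)
    return math.sqrt(sum(x * x for x in e) / len(e))

def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(terms) / len(terms)

# Hypothetical traffic counts, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

None of these measures distinguishes an over-prediction from an under-prediction of the same size, which is the limitation the weighted fitness functions described later are meant to address.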

An object of the present invention is to provide a genetic programming based traffic volume prediction system and method using a fitness function that considers error magnitude.

Another object of the present invention is to provide a genetic programming based traffic volume prediction system and method that can improve on the limited performance of conventional fitness functions, which rely on the statistical characteristics of errors.

Another object of the present invention is to provide a genetic programming based traffic volume prediction system and method that can produce faster and more accurate prediction results as the performance of the fitness function improves.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the following description.

A genetic programming based traffic volume prediction system according to one aspect of the present invention may include: a genetic programming database that stores genetic programming environment variables and traffic volume training data; an evolution procedure processing unit that performs evolution procedures, including crossover and mutation, on individuals of the current generation composed of functions and terminals for genetic programming, thereby generating individuals of the next generation; an individual selection unit that selects at least some of the next-generation individuals using a fitness function that assigns different weights to different error intervals, the error being the difference between a predicted value obtained from a next-generation individual and the training data, and that provides the selected individuals to the evolution procedure processing unit; and a traffic volume prediction unit that predicts traffic volume based on input data and on the program defined by the functions and terminals that make up the finally selected individual.

According to one embodiment, the fitness function may be set so that different weights are assigned to errors according to the sign of the error.

According to one embodiment, the fitness function may be set so that the weight for an error smaller than 0 is larger than the weight for an error larger than 0.

According to one embodiment, the fitness function may be given by the following equation:

[Equation: image not reproduced in the source text]

where one term is the error for the i-th of the n training data, and the other parameter may be a weight.

According to one embodiment, the fitness function may be set so that different weights are assigned to errors according to the absolute value of the error.

According to one embodiment, the fitness function may be given by the following equation:

[Equation: image not reproduced in the source text]

where three parameters are the weights for the respective error intervals, and the remaining term may be the absolute percentage error of the error for the i-th of the n training data.

According to one embodiment, the fitness function may be set so that the weight is 1 when the sign of the error is positive and, when the sign of the error is negative, different weights are assigned to the error according to its absolute value.

A computer-implemented, genetic programming based traffic volume prediction method according to another aspect of the present invention may include, by the computer: (a) performing evolution procedures, including crossover and mutation, on individuals of the current generation composed of functions and terminals for genetic programming, thereby generating individuals of the next generation; (b) selecting at least some of the next-generation individuals using a fitness function that assigns different weights to different error intervals, the error being the difference between a predicted value obtained from a next-generation individual and the training data; (c) repeating steps (a) and (b) until a predetermined termination condition is satisfied for the selected individuals; and (d) predicting traffic volume based on input data and on the program defined by the functions and terminals that make up the finally selected individual.

According to embodiments of the method, the fitness function may be given by the same equations as for the system: one defined in terms of a weight and the error for the i-th of the n training data, and one defined in terms of per-interval weights and the absolute percentage error of that error.

[Equations: images not reproduced in the source text]

According to the genetic programming based traffic volume prediction system and method of the present invention, a fitness function that considers error magnitude can improve on the limited performance of conventional fitness functions, which rely on the statistical characteristics of errors.

According to the genetic programming based traffic volume prediction system and method of the present invention, faster and more accurate prediction results can be obtained as the performance of the fitness function improves.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

FIG. 1 is a conceptual diagram illustrating a genetic programming based traffic volume prediction system according to an embodiment of the present invention.
FIG. 2 is a graph comparing the predicted traffic volume obtained from a genetic programming based traffic volume prediction system according to an embodiment of the present invention with the actual traffic volume.
FIG. 3 is a flowchart illustrating a genetic programming based traffic volume prediction method according to an embodiment of the present invention.

The specific structural and functional descriptions of the embodiments disclosed herein are given only to illustrate embodiments of the present invention. The embodiments may be practiced in various forms and should not be construed as limited to those described herein.

Hereinafter, preferred embodiments of the present invention are described in more detail with reference to the accompanying drawings. The same reference numerals are used for the same components in the drawings, and duplicate descriptions of the same components are omitted.

FIG. 1 is a conceptual diagram illustrating a genetic programming based traffic volume prediction system according to an embodiment of the present invention.

Referring to FIG. 1, the genetic programming based traffic volume prediction system (10) may include a genetic programming database (11), an evolution procedure processing unit (12), an individual selection unit (13), and a traffic volume prediction unit (14).

The genetic programming database (11) may store genetic programming environment variables and training data.

The genetic programming environment variables are, for example, terminal settings, tree depth settings, population size, number of generations, validation set, selection type, crossover rate, and mutation rate.

The training data may consist of, for example, traffic volume data measured over a predetermined period on a particular road section.

The evolution procedure processing unit (12) may perform evolution procedures, including crossover and mutation, on individuals of the current generation composed of functions and terminals for genetic programming, thereby generating individuals of the next generation.

Each individual organizes its functions and terminals in one of various structures, such as a tree, a stack, a linear structure, or a graph. In particular, when the functions are written in LISP, an artificial intelligence language, the individual may be structured as a tree.

Crossover is a procedure in which two individuals exchange subtrees, a subtree being a part of the tree that makes up each individual. In a genetic algorithm, the length of the genome normally stays the same before and after crossover. In genetic programming, however, the exchanged subtrees may differ in length, so the length of the genome may change across crossover.
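The subtree exchange described above can be sketched as follows. This is an illustrative sketch only: representing a tree as nested tuples of the form (function name, child, ...) with bare strings or numbers as terminals is an assumption, and all function names are hypothetical.

```python
import random

# Assumed representation: a terminal is a str or number; a function node is
# a tuple (function_name, child, child, ...).

def subtree_paths(tree, path=()):
    """Return the index path to every node of the tree, root first."""
    paths = [path]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            paths.extend(subtree_paths(child, path + (i,)))
    return paths

def get_subtree(tree, path):
    """Follow an index path down to a node."""
    for i in path:
        tree = tree[i]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of the tree with the node at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]

def size(tree):
    """Number of nodes in the tree."""
    return len(subtree_paths(tree))

def swap_at(a, path_a, b, path_b):
    """Exchange the subtree of `a` at path_a with the subtree of `b` at path_b."""
    sub_a, sub_b = get_subtree(a, path_a), get_subtree(b, path_b)
    return replace_subtree(a, path_a, sub_b), replace_subtree(b, path_b, sub_a)

def crossover(a, b, rng):
    """Subtree crossover at uniformly chosen nodes of each parent."""
    return swap_at(a, rng.choice(subtree_paths(a)), b, rng.choice(subtree_paths(b)))

parent_a = ("add", "x", ("mul", "y", 2))
parent_b = ("sub", ("add", "x", 1), "y")
child_a, child_b = swap_at(parent_a, (2,), parent_b, (2,))
print(child_a, child_b)
```

In this example both parents have five nodes, while the children have three and seven, which illustrates why genome length is not preserved in genetic programming.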

Mutation is a procedure that replaces a terminal, or changes the position of a subtree, within a single individual.

Genetic programming differs from genetic algorithms in that crossover and mutation take place only to the extent that the programming grammar allows.
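A terminal-replacement mutation that respects the grammar might look like the sketch below. The tuple-based tree representation, the terminal set, and the per-terminal mutation rate are all assumptions made for illustration; only terminals are swapped, so every function node keeps its name and number of arguments.

```python
import random

TERMINALS = ["x", "y", 1.0]  # hypothetical terminal set

def mutate_terminal(tree, rng, rate=0.2):
    """Return a copy of the tree in which each terminal is replaced by a
    random terminal with probability `rate`; function nodes are untouched."""
    if isinstance(tree, tuple):
        return (tree[0],) + tuple(mutate_terminal(c, rng, rate) for c in tree[1:])
    return rng.choice(TERMINALS) if rng.random() < rate else tree

tree = ("add", "x", ("mul", "y", 1.0))
print(mutate_terminal(tree, random.Random(0), rate=0.5))
```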

The next-generation individuals produced by crossover and mutation are screened through a selection procedure, for which there are various methods such as proportional selection, head-to-head selection, rank-based selection, uniform proportional roulette wheel selection, and tournament selection.

For example, in tournament selection, ten individuals compete in a tournament format, with the winner of each tournament decided by the value of the fitness function, so that only some of the individuals are selected.

The method of selection matters, but the efficiency of the selection procedure is fundamentally determined by the performance of the fitness function.
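Tournament selection as described above can be sketched as follows. The sketch assumes that a lower fitness value (a smaller weighted error) is better and that each tournament draws its participants at random without replacement; the function and parameter names are hypothetical.

```python
import random

def tournament_select(population, fitness, n_select, tournament_size, rng):
    """Run n_select tournaments; the winner of each is the participant
    with the lowest fitness value (assumed to mean the smallest error)."""
    winners = []
    for _ in range(n_select):
        group = rng.sample(population, tournament_size)
        winners.append(min(group, key=fitness))
    return winners

# Toy population of ten individuals whose fitness is their own value.
population = list(range(10))
selected = tournament_select(population, lambda ind: ind, 4, 3, random.Random(0))
print(selected)
```

A larger tournament size raises selection pressure, since the best individual in the population is more likely to take part in, and win, each tournament.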

The individual selection unit (13) may select at least some of the next-generation individuals using a fitness function that assigns different weights according to the error interval, the error being the difference between a predicted value obtained from a next-generation individual and the training data, and may provide the selected individuals to the evolution procedure processing unit (12).

With conventional fitness functions such as the standard deviation of error (SDE), root mean square error (RMSE), mean absolute percentage error (MAPE), mean absolute error (MAE), Theil's U statistic, and the Durbin-Watson statistic, the influence of error magnitude on the fitness function is linear. That is, if the error for one training datum is 10 times the error for another, its influence on the fitness function is also 10 times as large.

In traffic volume prediction, however, if the actual traffic volume is smaller than the predicted volume, drivers feel little inconvenience whether the prediction error is small or large. If the actual traffic volume is larger than the predicted volume, on the other hand, the inconvenience drivers feel can increase sharply as the prediction error grows.

To reflect in the fitness function this nonlinearity in the inconvenience that drivers perceive from traffic volume prediction errors, the individual selection unit (13) according to embodiments of the present invention may use a fitness function that assigns different weights according to the error interval.

According to an embodiment, the fitness function may be set so that different weights are assigned to errors according to the sign of the error.

For example, if the prediction error is negative, that is, if the actual traffic volume is smaller than the predicted volume, the weight may be set to 1 so that the magnitude of the error is reflected as it is in the value of the fitness function. If the prediction error is positive, that is, if the actual traffic volume was larger than the predicted volume, the weight may be set to a positive number greater than 1 so that the error contributes more to the value of the fitness function than its original absolute value.

According to an embodiment, the fitness function may be set so that the weight for an error smaller than 0 is larger than the weight for an error larger than 0.

For example, if the prediction error is negative, that is, if the actual traffic volume is smaller than the predicted volume, the weight may be set to a value smaller than 1 so that the magnitude of the error is reflected relatively weakly in the value of the fitness function. If the prediction error is positive, that is, if the actual traffic volume was larger than the predicted volume, the weight may be set to a positive number greater than 1 so that the magnitude of the error is reflected relatively strongly in the value of the fitness function.

According to an embodiment, the fitness function may be given by Equation 1 below.

[Equation 1: image not reproduced in the source text]

Here, one term is the error for the i-th of the n training data, and the other parameter is a weight.
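Because Equation 1 itself is not available in this text, the following is only a hedged sketch of a sign-dependent fitness function consistent with the worked example above: the error is assumed to be actual minus predicted, under-predictions (positive errors) receive a weight greater than 1, over-predictions receive weight 1, and the weighted absolute errors are averaged. The averaging, the parameter names, and the default weights are assumptions, not the patent's formula.

```python
def sign_weighted_fitness(actual, predicted, w_under=2.0, w_over=1.0):
    """Mean of weighted absolute errors; lower is better.

    w_under applies when actual > predicted (positive error),
    w_over applies otherwise. Both defaults are hypothetical."""
    total = 0.0
    for a, p in zip(actual, predicted):
        e = a - p  # assumed sign convention: actual minus predicted
        weight = w_under if e > 0 else w_over
        total += weight * abs(e)
    return total / len(actual)

# One under-prediction and one over-prediction of equal size.
print(sign_weighted_fitness([100.0, 100.0], [90.0, 110.0]))
```

With equal weights this reduces to the mean absolute error; with `w_under` greater than `w_over`, a model that tends to under-predict traffic scores worse than one that over-predicts by the same amounts.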

According to an embodiment, the fitness function may be set so that different weights are assigned to errors according to the absolute value of the error.

For example, the prediction errors may be ordered by magnitude and divided into a plurality of error intervals. For the smallest error interval, the magnitude of the error is reflected in the fitness function as it is, and larger weights are assigned to intervals of larger errors.

According to an embodiment, the fitness function may be given by Equation 2 below.

[Equation 2: image not reproduced in the source text]

Here, three parameters are the weights for the respective error intervals, and the remaining term is the absolute percentage error of the error for the i-th of the n training data.

In Equation 2, the range of the absolute percentage error is divided, by way of example, at 0.1%, 1%, and 10%, but the intervals may be divided in any other way.
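The interval-based weighting can be sketched as below. Equation 2 is not available in this text, so this is an assumption-laden illustration: the thresholds 0.1%, 1%, and 10% come from the description, while the weight values, the weight of 1 for the smallest interval, the treatment of each threshold as belonging to the higher interval, and the averaging of weighted absolute errors are all hypothetical.

```python
def band_weight(ape_percent, weights=(2.0, 4.0, 8.0)):
    """Weight for an absolute percentage error, by interval.

    Below 0.1% the error counts as is (weight 1); the three
    hypothetical weights apply from 0.1%, 1%, and 10% upward."""
    w1, w2, w3 = weights
    if ape_percent < 0.1:
        return 1.0
    if ape_percent < 1.0:
        return w1
    if ape_percent < 10.0:
        return w2
    return w3

def banded_fitness(actual, predicted, weights=(2.0, 4.0, 8.0)):
    """Mean of interval-weighted absolute errors; assumes no actual value is zero."""
    total = 0.0
    for a, p in zip(actual, predicted):
        e = a - p
        ape = 100.0 * abs(e) / abs(a)
        total += band_weight(ape, weights) * abs(e)
    return total / len(actual)

print(banded_fitness([1000.0, 1000.0], [1000.5, 900.0]))
```

Under this scheme a single large miss dominates the score far more than it would under a plain mean absolute error, which pushes selection toward individuals that avoid large errors.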

According to an embodiment, the fitness function may be set so that the weight is 1 when the sign of the error is positive and, when the sign of the error is negative, different weights are assigned to the error according to its absolute value.

For example, if the prediction error is negative, that is, if the actual traffic volume is smaller than the predicted volume, the error is reflected as it is. If the prediction error is positive, that is, if the actual traffic volume is larger than the predicted volume, the prediction errors may be ordered by magnitude and divided into a plurality of error intervals; for the smallest error interval the magnitude of the error is reflected in the fitness function as it is, while larger weights are assigned to intervals of larger errors.

Specifically, the fitness function may be given by Equation 3 below.

[Equation 3: image not reproduced in the source text]

Here, the weight parameters are the weights for the respective error intervals, and the remaining term is the absolute percentage error of the error for the i-th of the n training data.
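The combined scheme in the example above, in which over-predictions count as they are and under-predictions are weighted by interval, might be sketched as follows. Equation 3 is not available in this text; the thresholds reuse the 0.1%, 1%, and 10% example values, and the weights, the sign convention (actual minus predicted), and the averaging are assumptions.

```python
def combined_fitness(actual, predicted,
                     thresholds=(0.1, 1.0, 10.0), weights=(2.0, 4.0, 8.0)):
    """Mean of weighted absolute errors; lower is better.

    Errors with actual <= predicted get weight 1. Errors with
    actual > predicted get the weight of the highest threshold that
    their absolute percentage error reaches. Assumes actual != 0."""
    total = 0.0
    for a, p in zip(actual, predicted):
        e = a - p
        weight = 1.0
        if e > 0:
            ape = 100.0 * e / abs(a)
            for t, w in zip(thresholds, weights):
                if ape >= t:
                    weight = w
        total += weight * abs(e)
    return total / len(actual)

# Same 10% miss in both directions: only the under-prediction is penalized.
print(combined_fitness([1000.0, 1000.0], [1100.0, 900.0]))
```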

The evolution procedure processing unit (12) and the individual selection unit (13) repeat the evolution procedures and individual selection until a predetermined termination condition, such as a number of generations, a computation time, or an error level, is satisfied.

When the termination condition is satisfied, the individual selection unit (13) finally selects the single best-performing individual of the final generation, that is, the individual that is best by the criterion of the fitness function.

The traffic volume prediction unit (14) may predict traffic volume based on input data and on the program defined by the functions and terminals that make up the finally selected individual.

FIG. 2 is a graph comparing the predicted traffic volume obtained from a genetic programming based traffic volume prediction system according to an embodiment of the present invention with the actual traffic volume.

Referring to FIG. 2, the traffic volume prediction program was optimized using 297 days of training data taken from 327 days of traffic volume data measured between Gimcheon IC and Chupungnyeong on the Gyeongbu Expressway from January 1, 2013 to November 23, 2013.

Validation using the remaining 30 days of traffic volume data shows that the traffic volume prediction program produced predictions quite similar to the actual data.
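The 297-day and 30-day division described above could be produced as in the sketch below. The text does not say which 30 days were held out, so using the last 30 days in time order is an assumption, and the data here is a placeholder rather than the measured traffic volumes.

```python
def chronological_split(series, n_validation):
    """Split a time-ordered series into training data and the final
    n_validation points held out for validation."""
    if not 0 < n_validation < len(series):
        raise ValueError("n_validation must be between 1 and len(series) - 1")
    return series[:-n_validation], series[-n_validation:]

# Placeholder for 327 daily traffic counts (2013-01-01 through 2013-11-23).
daily_counts = list(range(327))
train, valid = chronological_split(daily_counts, 30)
print(len(train), len(valid))
```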

FIG. 3 is a flowchart illustrating a genetic programming based traffic volume prediction method according to an embodiment of the present invention.

Referring to FIG. 3, the computer-implemented, genetic programming based traffic volume prediction method according to an embodiment of the present invention may begin with step S31, in which the computer performs evolution procedures, including crossover and mutation, on individuals of the current generation composed of functions and terminals for genetic programming, thereby generating individuals of the next generation.

In step S32, the computer may select at least some of the next-generation individuals using a fitness function that assigns different weights to different error intervals, the error being the difference between a predicted value obtained from a next-generation individual and the training data.

Here, the fitness function may be set to assign different weights according to the error interval, so that the nonlinearity in the inconvenience that drivers perceive from traffic volume prediction errors is reflected in the fitness function.

According to an embodiment, the fitness function may be set so that different weights are assigned to errors according to the sign of the error.

Specifically, the fitness function may be given by any of Equations 1 to 3 described above.

In step S33, the computer may repeat steps S31 and S32 until a predetermined termination condition is satisfied for the selected individuals.

In step S33, if the termination condition is satisfied, the computer may finally select the single best-performing individual of the final generation, that is, the individual that is best by the criterion of the fitness function.

In step S34, the computer may predict traffic volume based on input data and on the program defined by the functions and terminals that make up the finally selected individual.
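The control flow of steps S31 to S33 can be sketched as a generic loop. This is a skeleton under stated assumptions rather than the patented implementation: the `vary`, `select`, and `fitness` callables are hypothetical stand-ins, selection here draws from parents and offspring together (an elitist choice the text does not specify), and the demonstration evolves plain numbers instead of program trees.

```python
import random

def evolve(population, vary, select, fitness, max_generations, target_fitness=0.0):
    """Repeat variation (S31) and selection (S32) until a termination
    condition holds (S33), then return the best individual."""
    for _ in range(max_generations):                 # termination: generation count
        offspring = vary(population)                 # S31: crossover and mutation
        population = select(population + offspring)  # S32: fitness-based selection
        if min(fitness(ind) for ind in population) <= target_fitness:
            break                                    # termination: error level
    return min(population, key=fitness)

# Toy stand-ins: individuals are numbers and the target value is 3.0.
rng = random.Random(0)
fitness = lambda x: abs(x - 3.0)
vary = lambda pop: [x + rng.uniform(-1.0, 1.0) for x in pop]
select = lambda pool: sorted(pool, key=fitness)[:5]

best = evolve([0.0] * 5, vary, select, fitness, max_generations=200, target_fitness=0.05)
print(best, fitness(best))
```

In a full system, step S34 would then evaluate the program encoded by `best` on new input data to produce the traffic volume forecast.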

The embodiments and the drawings attached to this specification merely illustrate a part of the technical idea included in the present invention. It is self-evident that all modifications and specific embodiments that those skilled in the art can readily infer within the scope of the technical idea contained in the specification and drawings of the present invention fall within the scope of the present invention.

Further, the apparatus according to the present invention can be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes all kinds of recording devices in which data readable by a computer system is stored. Examples of the recording medium include ROM, RAM, optical disks, magnetic tape, floppy disks, hard disks, and nonvolatile memory. The computer-readable recording medium may also be distributed over computer systems connected by a network, so that the computer-readable code is stored and executed in a distributed manner.

10 traffic volume prediction system
11 genetic programming database
12 evolution procedure processing unit
13 individual selection unit
14 traffic volume prediction unit

Claims

A genetic programming based traffic volume prediction system comprising:
a genetic programming database for storing genetic programming environment variables and traffic volume training data;
an evolution procedure processing unit for performing evolution procedures, including crossover and mutation, on current-generation entities composed of functions and terminals for genetic programming, to generate subsequent-generation entities;
an entity selection unit for selecting at least some of the subsequent-generation entities by using a fitness function that assigns different weights for each error section to an error, the error being the difference between a predicted value obtained from a subsequent-generation entity and the training data, and for providing the selected entities to the evolution procedure processing unit; and
a traffic volume prediction unit for predicting a traffic volume based on input data and a program defined by the functions and terminals constituting the finally selected entity,
wherein the fitness function is set such that different weights are assigned to the error according to the sign of the error.

delete

The system of claim 1,
wherein the weight for an error smaller than 0 is set to be larger than the weight for an error larger than 0.

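The system of claim 1,
wherein the fitness function is defined by an equation (rendered as an image in the original publication and not recoverable in this text) in terms of the error for the i-th of the n training data and a weight.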
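Because the equation recited for this fitness function is not legible in this text, the following Python sketch only illustrates the general idea of weighting errors by their sign. The functional form (a mean of weighted absolute errors), the sign convention (predicted value minus training value), and the name and default value of `negative_weight` are all assumptions made for illustration, not the formula of the claim.

```python
from typing import Sequence


def sign_weighted_fitness(predicted: Sequence[float],
                          observed: Sequence[float],
                          negative_weight: float = 2.0) -> float:
    """Mean of sign-weighted absolute errors; lower is better (assumed form)."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, o in zip(predicted, observed):
        error = p - o  # assumed convention: predicted minus observed
        # Negative errors (under-prediction under this convention) get the
        # larger, hypothetical weight; all other errors are weighted by 1.
        weight = negative_weight if error < 0 else 1.0
        total += weight * abs(error)
    return total / len(predicted)
```

With the assumed default, an under-prediction of 10 vehicles costs twice as much as an over-prediction of the same size, so selection favors entities that err on the high side.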

A genetic programming based traffic volume prediction system comprising:
a genetic programming database for storing genetic programming environment variables and traffic volume training data;
an evolution procedure processing unit for performing evolution procedures, including crossover and mutation, on current-generation entities composed of functions and terminals for genetic programming, to generate subsequent-generation entities;
an entity selection unit for selecting at least some of the subsequent-generation entities by using a fitness function that assigns different weights for each error section to an error, the error being the difference between a predicted value obtained from a subsequent-generation entity and the training data, and for providing the selected entities to the evolution procedure processing unit; and
a traffic volume prediction unit for predicting a traffic volume based on input data and a program defined by the functions and terminals constituting the finally selected entity,
wherein the fitness function is set such that different weights are assigned to the error according to the absolute value of the error.

The system of claim 5,
wherein the fitness function is defined by an equation (rendered as an image in the original publication and not recoverable in this text) in terms of the error for the i-th of the n training data and the weights of the respective error sections,
and wherein the error rate is the absolute error expressed as a percentage of the traffic volume.
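The section-wise weighting described in this claim can be sketched as follows. Since the recited equation is not legible in this text, the section boundaries (10 % and 20 %), the weights (1, 2, 4), the constant `ASSUMED_SECTIONS`, and the choice to average weighted percentage errors are hypothetical values chosen only to make the example concrete.

```python
from typing import Sequence, Tuple

# Hypothetical error sections: (upper bound of absolute % error, weight).
ASSUMED_SECTIONS: Tuple[Tuple[float, float], ...] = (
    (10.0, 1.0),
    (20.0, 2.0),
    (float("inf"), 4.0),
)


def section_weight(abs_pct_error: float,
                   sections: Sequence[Tuple[float, float]] = ASSUMED_SECTIONS
                   ) -> float:
    """Return the weight of the first section whose bound covers the error."""
    for upper_bound, weight in sections:
        if abs_pct_error <= upper_bound:
            return weight
    return sections[-1][1]


def sectioned_fitness(predicted: Sequence[float],
                      observed: Sequence[float],
                      sections: Sequence[Tuple[float, float]] = ASSUMED_SECTIONS
                      ) -> float:
    """Mean of section-weighted absolute percentage errors; lower is better."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, o in zip(predicted, observed):
        if o == 0:
            raise ValueError("percentage error is undefined for zero volume")
        # Absolute error expressed as a percentage of the observed volume.
        abs_pct = abs(p - o) * 100.0 / abs(o)
        total += section_weight(abs_pct, sections) * abs_pct
    return total / len(predicted)
```

Under these assumed sections, a 30 % miss is penalized four times as heavily per percentage point as a 5 % miss, which pushes the evolution away from entities that occasionally produce very large errors.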

A genetic programming based traffic volume prediction system comprising:
a genetic programming database for storing genetic programming environment variables and traffic volume training data;
an evolution procedure processing unit for performing evolution procedures, including crossover and mutation, on current-generation entities composed of functions and terminals for genetic programming, to generate subsequent-generation entities;
an entity selection unit for selecting at least some of the subsequent-generation entities by using a fitness function that assigns different weights for each error section to an error, the error being the difference between a predicted value obtained from a subsequent-generation entity and the training data, and for providing the selected entities to the evolution procedure processing unit; and
a traffic volume prediction unit for predicting a traffic volume based on input data and a program defined by the functions and terminals constituting the finally selected entity,
wherein the fitness function is set such that, if the sign of the error is positive, the weight is 1, and if the sign of the error is negative, different weights are assigned to the error according to the absolute value of the error.
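The combined rule of this claim (weight 1 for positive errors, magnitude-dependent weights for negative errors) might look like the sketch below. The sign convention (predicted minus observed), the percentage-based sections in `ASSUMED_NEGATIVE_SECTIONS`, their weights, and the averaging of weighted absolute errors are assumptions for illustration; the claim itself fixes only the weight of 1 for positive errors.

```python
from typing import Sequence, Tuple

# Hypothetical sections for negative errors: (upper bound of |% error|, weight).
ASSUMED_NEGATIVE_SECTIONS: Tuple[Tuple[float, float], ...] = (
    (10.0, 2.0),
    (20.0, 3.0),
    (float("inf"), 5.0),
)


def combined_weight(error: float, observed: float) -> float:
    """Weight 1 for non-negative errors; section-dependent weight otherwise."""
    if error >= 0:
        return 1.0  # a zero error contributes nothing, so its weight is moot
    if observed == 0:
        raise ValueError("percentage error is undefined for zero volume")
    abs_pct = abs(error) * 100.0 / abs(observed)
    for upper_bound, weight in ASSUMED_NEGATIVE_SECTIONS:
        if abs_pct <= upper_bound:
            return weight
    return ASSUMED_NEGATIVE_SECTIONS[-1][1]


def combined_fitness(predicted: Sequence[float],
                     observed: Sequence[float]) -> float:
    """Mean of weighted absolute errors; lower is better (assumed form)."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, o in zip(predicted, observed):
        error = p - o  # assumed convention: predicted minus observed
        total += combined_weight(error, o) * abs(error)
    return total / len(predicted)
```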

A method for predicting traffic volume based on genetic programming using a computer, the method comprising, by the computer:
(a) performing evolution procedures, including crossover and mutation, on current-generation entities composed of functions and terminals for genetic programming, to generate subsequent-generation entities;
(b) selecting at least some of the subsequent-generation entities by using a fitness function that assigns different weights for each error section to an error, the error being the difference between a predicted value obtained from a subsequent-generation entity and training data;
(c) repeating steps (a) and (b) on the selected entities until a predetermined termination condition is satisfied; and
(d) predicting a traffic volume based on input data and a program defined by the functions and terminals constituting the finally selected entity,
wherein the fitness function is set such that different weights are assigned to the error according to the sign of the error.
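Steps (a) to (d) of this method can be illustrated with a deliberately small genetic programming loop. Everything concrete in the sketch is an assumption: the function set (`add`, `sub`, `mul`), the terminal set (`x`, 1.0, 2.0), the depth limit, the population size, the mutation rate, the truncation selection, and the sign-weighted fitness with a hypothetical `negative_weight`. It is a toy instance of the technique, not the claimed implementation.

```python
import random

FUNCTIONS = {"add": lambda a, b: a + b,
             "sub": lambda a, b: a - b,
             "mul": lambda a, b: a * b}
TERMINALS = ("x", 1.0, 2.0)
MAX_DEPTH = 6  # assumed limit that keeps trees small and values finite

def random_tree(rng, depth):
    """Grow a random entity: a terminal or a (function, left, right) tuple."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    return (name, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    """Run the program encoded by the tree on input x."""
    if isinstance(tree, tuple):
        name, left, right = tree
        return FUNCTIONS[name](evaluate(left, x), evaluate(right, x))
    return float(x) if tree == "x" else tree

def depth_of(tree):
    if isinstance(tree, tuple):
        return 1 + max(depth_of(tree[1]), depth_of(tree[2]))
    return 0

def paths(tree, prefix=()):
    """Yield the index path of every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        yield from paths(tree[1], prefix + (1,))
        yield from paths(tree[2], prefix + (2,))

def subtree_at(tree, path):
    for index in path:
        tree = tree[index]
    return tree

def with_subtree(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = with_subtree(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(rng, a, b):
    """Graft a random subtree of b onto a random node of a."""
    target = rng.choice(list(paths(a)))
    donor = subtree_at(b, rng.choice(list(paths(b))))
    child = with_subtree(a, target, donor)
    return child if depth_of(child) <= MAX_DEPTH else a

def mutate(rng, tree):
    """Replace a random node with a freshly grown subtree."""
    child = with_subtree(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))
    return child if depth_of(child) <= MAX_DEPTH else tree

def fitness(tree, data, negative_weight=2.0):
    """Mean sign-weighted absolute error; lower is better (assumed form)."""
    total = 0.0
    for x, y in data:
        error = evaluate(tree, x) - y  # assumed: predicted minus observed
        total += (negative_weight if error < 0 else 1.0) * abs(error)
    return total / len(data)

def evolve(data, generations=20, pop_size=40, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    for _ in range(generations):
        # (a) crossover and mutation produce the subsequent generation
        offspring = []
        for _ in range(2 * pop_size):
            child = crossover(rng, rng.choice(population), rng.choice(population))
            if rng.random() < 0.2:
                child = mutate(rng, child)
            offspring.append(child)
        # (b) keep the entities with the best (lowest) weighted error
        population = sorted(offspring, key=lambda t: fitness(t, data))[:pop_size]
        # (c) assumed termination condition: a perfect fit or the generation cap
        if fitness(population[0], data) == 0.0:
            break
    return population[0]  # the finally selected entity
```

Step (d) then amounts to calling `evaluate(best, x_new)` on the entity returned by `evolve`, where `x_new` stands in for whatever input data the prediction uses.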

delete

The method of claim 8,
wherein the weight for an error smaller than zero is set to be larger than the weight for an error larger than zero.

The method of claim 8,
wherein the fitness function is defined by an equation (rendered as an image in the original publication and not recoverable in this text) in terms of the error for the i-th of the n training data and a weight.

A method for predicting traffic volume based on genetic programming using a computer, the method comprising, by the computer:
(a) performing evolution procedures, including crossover and mutation, on current-generation entities composed of functions and terminals for genetic programming, to generate subsequent-generation entities;
(b) selecting at least some of the subsequent-generation entities by using a fitness function that assigns different weights for each error section to an error, the error being the difference between a predicted value obtained from a subsequent-generation entity and training data;
(c) repeating steps (a) and (b) on the selected entities until a predetermined termination condition is satisfied; and
(d) predicting a traffic volume based on input data and a program defined by the functions and terminals constituting the finally selected entity,
wherein the fitness function is set such that different weights are assigned to the error according to the absolute value of the error.

The method of claim 12,
wherein the fitness function is defined by an equation (rendered as an image in the original publication and not recoverable in this text) in terms of the error for the i-th of the n training data and the weights of the respective error sections,
and wherein the error rate is the absolute error expressed as a percentage of the traffic volume.

A method for predicting traffic volume based on genetic programming using a computer, the method comprising, by the computer:
(a) performing evolution procedures, including crossover and mutation, on current-generation entities composed of functions and terminals for genetic programming, to generate subsequent-generation entities;
(b) selecting at least some of the subsequent-generation entities by using a fitness function that assigns different weights for each error section to an error, the error being the difference between a predicted value obtained from a subsequent-generation entity and training data;
(c) repeating steps (a) and (b) on the selected entities until a predetermined termination condition is satisfied; and
(d) predicting a traffic volume based on input data and a program defined by the functions and terminals constituting the finally selected entity,
wherein the fitness function is set such that, if the sign of the error is positive, the weight is 1, and if the sign of the error is negative, different weights are assigned to the error according to the absolute value of the error.

A computer program recorded on a recording medium so as to perform each step of a genetic programming based traffic volume prediction method according to any one of claims 8,