KR102310490B1 - The design of GRU-based cell structure robust to missing value and noise of time-series data in recurrent neural network - Google Patents

The design of GRU-based cell structure robust to missing value and noise of time-series data in recurrent neural network Download PDF

Info

Publication number
KR102310490B1
KR102310490B1
Authority
KR
South Korea
Prior art keywords
time
noise
neural network
observed
input value
Prior art date
Application number
KR1020180048801A
Other languages
Korean (ko)
Other versions
KR20190124846A (en)
Inventor
오혜연
박성준
박정국
Original Assignee
한국과학기술원 (Korea Advanced Institute of Science and Technology, KAIST)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한국과학기술원 (KAIST)
Priority to KR1020180048801A priority Critical patent/KR102310490B1/en
Priority to PCT/KR2019/004873 priority patent/WO2019208998A1/en
Publication of KR20190124846A publication Critical patent/KR20190124846A/en
Application granted granted Critical
Publication of KR102310490B1 publication Critical patent/KR102310490B1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Feedback Control In General (AREA)

Abstract

Provided is a recurrent artificial neural network model that can simultaneously impute missing values and mitigate noise in time-series data according to the prediction problem at hand. A single cell structure includes all of: (a) a step of mitigating noise by a weighted-average method using a noise-mitigation filter learnable from the time-series data; (b) a step of imputing missing values; and (c) a step of storing, via a GRU computation, the information to be remembered at the current time step in a hidden state vector. Further, in constructing the recurrent artificial neural network model, in step (a) the weight parameters for noise mitigation contained in the cell structure are learned to be optimized for the prediction task while the model itself is trained on that task. By this method, a recurrent artificial neural network model that simultaneously imputes missing values and mitigates noise in time-series data, without separate preprocessing, can be applied to a variety of machine learning tasks.

Description

The design of GRU-based cell structure robust to missing value and noise of time-series data in recurrent neural network

The present invention relates to a new recurrent neural network model based on the GRU (Gated Recurrent Unit), in which a Weighted Average Filter is added to the internal structure of the cell used when designing a Recurrent Neural Network model and the parameters of the filter are learned jointly when the neural network model is trained, so that the model can learn from time-series input data containing missing values and noise and perform prediction tasks.

In general, when time-series data is fed to a classifier such as a neural network model, the missing values and noise contained in the data are handled in a preprocessing step. In preprocessing, missing data is imputed using the overall mean or a weighted mean, or with algorithms such as linear regression or support-vector-machine-based regression. Noise in the data is mitigated by methods such as moving-average filters, wavelet filters, and fuzzy logic.

However, these missing-data and noise-handling techniques have the limitation of being applied independently of the neural network model's target system. Data that has already been preprocessed is never revised while the neural network model is trained on it, so the structure of the model and the characteristics of the target system cannot be effectively reflected in the missing-data and noise handling.

To overcome this limitation, approaches have been proposed that modify the cell structure inside the neural network model to impute missing data and mitigate noise. When data preprocessing is carried out through the cell structure, the parameters of the preprocessing functions can be learned jointly during training of the neural network model. However, no cell structure has yet been proposed that simultaneously imputes missing data and mitigates noise, and room remains to improve the accuracy of the target system.

KR 1020160102690 A
KR 1020180007657 A

Che, Zhengping, Sanjay Purushotham, Kyunghyun Cho, David Sontag, and Yan Liu. "Recurrent neural networks for multivariate time series with missing values." Scientific Reports 8, no. 1 (2018): 6085.

An object of the present invention is to solve the problems described above by using GRU-based cells with a built-in weighted average filter when designing a recurrent neural network model, and by learning the parameters of the filter in each cell while training the network, thereby providing a method that enables a recurrent neural network model to be trained on time-series data containing missing values and noise without separate preprocessing.

In particular, the cell structure proposed in the present invention can impute missing values while simultaneously mitigating noise through a learnable and flexible weighted average filter, and it provides a learning algorithm in which the parameters of the noise-mitigation filter are learned jointly, tailored to the prediction problem at hand.

To achieve the above object, the present invention provides a recurrent artificial neural network model that can simultaneously impute missing values and mitigate noise in time-series data according to the prediction problem: a single cell structure includes all of (a) a step of mitigating noise by a weighted-average method using a noise-mitigation filter learnable from the time-series data, (b) a step of imputing missing values, and (c) a step of storing, via a GRU computation, the information to be remembered at the current time step in a hidden state vector.

Further, in constructing the recurrent artificial neural network model, in step (a) the weight parameters for noise mitigation contained in the cell structure are learned to be optimized for the prediction task while the model is trained on that task.

As described above, when a recurrent neural network is built from the proposed GRU cells robust to missing data and noise, the performance on the target system is improved for any recurrent neural network operating on time-series data containing missing values and noise.

FIG. 1 is a block diagram of a conventional GRU cell structure.
FIG. 2 is a block diagram of the cell structure proposed in the present invention.

Hereinafter, specific embodiments of the present invention will be described with reference to the drawings. The cell structure proposed by the present invention is as follows.

1. Each time point $t$ of time-series data of length $T$ consists of an $N$-dimensional vector representing $N$ features, and may contain missing values and noise.

2. A denoising layer removes the noise. The value the layer outputs is $\tilde{x}_t^d$: the most recently observed input value as of time $t$, with the noise removed.

2.1. If the value $x_t^d$ of the $d$-th dimension of the $N$-dimensional vector at time $t$ was observed, this value is used as-is. Conversely, if $x_t^d$ was not observed, the most recently observed value $x_{t'}^d$ in that dimension is used. The mask $m_t^d$ indicates whether $x_t^d$ was observed; it has the value 1 if observed and 0 otherwise.

$$x'^d_t = m_t^d\, x_t^d + (1 - m_t^d)\, x_{t'}^d$$

2.2. $\tilde{x}_t^d$ is the weighted average of the values $x'^d_{t-i}$ from time $t-k+1$ to time $t$, that is, the weighted average of the $k$ most recent per-time-step values in which missing entries have been replaced by the most recently observed values, where $\theta \in \mathbb{R}^k$ is a learnable parameter holding the weight for each of the $k$ time steps:

$$\tilde{x}_t^d = \sum_{i=0}^{k-1} \theta_i\, x'^d_{t-i}$$

That is, the denoising layer outputs $\tilde{x}_t^d$: the most recently observed value as of time $t$, with the noise removed.
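As an illustration, the forward-fill and learnable weighted-average steps of the denoising layer can be sketched in NumPy. This is a minimal sketch, not the patented implementation: the function names (`forward_fill`, `denoise`), the softmax normalization of the weights, and the zero initialization before the first observation are assumptions of this example.

```python
import numpy as np

def forward_fill(x, m):
    """Replace missing entries (mask m == 0) with the last observed value.

    x: (T, N) inputs, m: (T, N) mask with 1 = observed, 0 = missing.
    Entries before the first observation are left at 0 (an assumption).
    """
    x = np.where(m == 1, x, np.nan)
    filled = np.zeros_like(x)
    last = np.zeros(x.shape[1])
    for t in range(x.shape[0]):
        last = np.where(np.isnan(x[t]), last, x[t])  # carry last observation
        filled[t] = last
    return filled

def denoise(x_filled, theta):
    """Weighted average of the k most recent forward-filled values.

    theta: (k,) learnable weights; softmax-normalized here so they sum to 1
    (the patent only states that the weights are learnable).
    """
    k = theta.shape[0]
    w = np.exp(theta) / np.exp(theta).sum()
    out = np.zeros_like(x_filled)
    for t in range(x_filled.shape[0]):
        for i in range(k):
            # clamp at the first time step when the window runs off the start
            out[t] += w[i] * x_filled[max(t - i, 0)]
    return out
```

With uniform weights this reduces to a moving average; training would adapt `theta` to the prediction task.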

3. Missing values are imputed based on the denoised values produced by the preceding layer. The value this layer outputs is $\hat{x}_t^d$, in which the noise has been removed and missing values have been taken into account. In detail:

$$\hat{x}_t^d = m_t^d\, \tilde{x}_t^d + (1 - m_t^d)\left[(1 - \gamma_t^d)\, \tilde{x}_{t'}^d + \gamma_t^d\, c^d\right]$$

3.1. If the value $x_t^d$ of the $d$-th dimension of the $N$-dimensional vector at time $t$ was observed, the denoised value $\tilde{x}_t^d$ is used as-is.

3.2. If the value $x_t^d$ of the $d$-th dimension at time $t$ was not observed, the value obtained by applying the decay rate $\gamma_t^d$ is used. That is, an exponential decay is applied in proportion to the time $\delta_t^d$ that has elapsed from the current time $t$ back to the time at which $x^d$ was last observed:

$$\gamma_t^d = 1 - \exp\{-\max(0,\; w_\gamma\, \delta_t^d + b_\gamma)\}$$

The decay rate $\gamma_t^d$ increases in proportion to $\delta_t^d$, and $w_\gamma$ and $b_\gamma$ are parameters that can be learned from the data to determine the decay rate. The decay rate $\gamma_t^d$ can take a value between 0 and 1: the closer it gets to 1, the more an overall mean or an arbitrary constant $c^d$ is used in place of the most recently observed value $\tilde{x}_{t'}^d$, so that every input value converges to $c^d$ when no observation has been provided for a sufficiently long time.
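The decay-based imputation can be sketched as follows. The description states only that the decay rate grows with the time gap, lies in [0, 1), and shifts the imputed value toward a constant; the expression `1 - exp(-max(0, w*delta + b))` and the function names are therefore assumptions of this sketch, not the patent's exact formula.

```python
import numpy as np

def decay_rate(delta, w, b):
    """Decay rate in [0, 1), increasing with the time gap delta.

    Reconstructed form (an assumption): gamma = 1 - exp(-max(0, w*delta + b)),
    where w and b are learnable scalars.
    """
    return 1.0 - np.exp(-np.maximum(0.0, w * delta + b))

def impute(x_denoised, m, delta, w, b, c):
    """Keep the denoised value where observed (m == 1); where missing,
    blend the last denoised observation toward the constant c as the
    gap since the last observation grows."""
    gamma = decay_rate(delta, w, b)
    return m * x_denoised + (1 - m) * ((1 - gamma) * x_denoised + gamma * c)
```

For a long gap the decay rate approaches 1 and the imputed value converges to `c`, matching the convergence behavior described above.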

4. The GRU computation is performed on the value $\hat{x}_t$, in which missing values and noise have been handled:

$$z_t = \sigma(W_z\, \hat{x}_t + U_z\, h_{t-1} + b_z)$$

$$r_t = \sigma(W_r\, \hat{x}_t + U_r\, h_{t-1} + b_r)$$

$$\tilde{h}_t = \tanh(W\, \hat{x}_t + U(r_t \odot h_{t-1}) + b)$$

$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$

The input vector $\hat{x}_t$ at time $t$ is used to compute the reset gate $r_t$ and the update gate $z_t$. Each gate is a value between 0 and 1 computed from the previous hidden state $h_{t-1}$ and the input $\hat{x}_t$: $r_t$ indicates how much of the previous hidden state $h_{t-1}$ to reflect when computing the candidate hidden state $\tilde{h}_t$, and $z_t$ indicates how much of the previous hidden state $h_{t-1}$ to reflect when computing the current hidden state $h_t$.
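A standard GRU update on the cleaned input, matching the gate descriptions above, might look like this in NumPy. The parameter-dictionary layout and variable names are assumptions of the example, not from the patent.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_hat, h_prev, p):
    """One standard GRU update on the cleaned input x_hat.

    p holds weight matrices/vectors: Wz, Wr, W of shape (H, D);
    Uz, Ur, U of shape (H, H); bz, br, b of shape (H,).
    """
    z = sigmoid(p["Wz"] @ x_hat + p["Uz"] @ h_prev + p["bz"])  # update gate
    r = sigmoid(p["Wr"] @ x_hat + p["Ur"] @ h_prev + p["br"])  # reset gate
    # candidate hidden state: previous state is scaled by the reset gate
    h_cand = np.tanh(p["W"] @ x_hat + p["U"] @ (r * h_prev) + p["b"])
    # current hidden state: convex combination of old state and candidate
    return (1 - z) * h_prev + z * h_cand
```

With all parameters at zero, both gates equal 0.5 and the candidate is 0, so the new state is simply half of the previous one, which is a quick sanity check of the update rule.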

5. The value finally produced by the cell structure of the present invention is the current hidden state $h_t$. It combines the information $h_{t-1}$ processed up to the previous time step with the raw data of the current time step, and expresses as a vector the information from the time series that must be remembered at this time step in order to perform the task.

$x_t^d$ : the value of the $d$-th dimension of the $N$-dimensional vector at time $t$.
$x_{t'}^d$ : the most recently observed value of the $d$-th dimension as of time $t$. If this value was last observed at time $t-1$ and the value $x_t^d$ at time $t$ is missing, then $x_{t'}^d = x_{t-1}^d$ (when the decay rate is not taken into account).
$m_t^d$ : a mask indicating whether the value of the $d$-th dimension at time $t$ is missing; it is 1 if $x_t^d$ is observed and 0 if it is missing.
$\gamma_t^d$ : the decay rate that determines, when a value is missing, how much of the most recently observed value $x_{t'}^d$ to reflect in $\tilde{x}_t^d$; it can take a value between 0 and 1.
$\delta_t^d$ : an input for determining the decay rate; the time gap between $t$ and the time at which the value was last observed, that is, the distance from the present back to the most recent observation. For example, if the current time is $t$ and $x^d$ was observed at time $t-1$, then $\delta_t^d = 1$.
$w_\gamma$ : a learnable parameter multiplied by $\delta_t^d$ to determine the decay rate.
$b_\gamma$ : a learnable parameter added to $w_\gamma \delta_t^d$ to determine the decay rate.
$\theta$ : the weights multiplied by each $x'^d_{t-i}$ in the denoising layer to compute $\tilde{x}_t^d$.
$\tilde{x}_t^d$ : the value of the $d$-th dimension at time $t$ with missing values imputed and noise removed, obtained by multiplying each time step's $x'^d_{t-i}$ by the weight $\theta_i$.
$\tilde{h}_t$ : the candidate hidden state, generated from the input arriving at the current time step as a candidate for the current hidden state.
$h_t$ : the current hidden state; a vector representation of the information to be remembered at the current time step, computed from the previous hidden state and the current candidate hidden state.
$z_t$ : the update gate used in the GRU computation; a value between 0 and 1 obtained by applying the sigmoid activation to $U_z h_{t-1} + W_z \hat{x}_t + b_z$, indicating how much of the previous hidden state $h_{t-1}$ to reflect when computing the current hidden state $h_t$.
$r_t$ : the reset gate used in the GRU computation; a value between 0 and 1 obtained by applying the sigmoid activation to $U_r h_{t-1} + W_r \hat{x}_t + b_r$, indicating how much of the previous hidden state $h_{t-1}$ to reflect when computing the candidate hidden state $\tilde{h}_t$.
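Putting the three layers together, an end-to-end sketch of one pass over a series might look as follows. All parameter initializations, the uniform denoising weights, and the zero convergence constant are illustrative assumptions, not values from the patent; in a real model they would be trained jointly with the prediction task.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def robust_gru_forward(x, m, k=3, H=4, seed=0):
    """Run the full cell over a (T, N) series: forward-fill, learnable
    weighted-average denoising, decay-based imputation, then a GRU update.
    Returns the (T, H) sequence of hidden states."""
    rng = np.random.default_rng(seed)
    T, N = x.shape
    theta = np.full(k, 1.0 / k)          # denoising weights (learnable)
    w_g, b_g = 0.1, 0.0                  # decay parameters (learnable)
    c = np.zeros(N)                      # convergence constant (e.g. mean)
    P = {n: rng.normal(0, 0.1, (H, N)) for n in ("Wz", "Wr", "W")}
    P.update({n: rng.normal(0, 0.1, (H, H)) for n in ("Uz", "Ur", "U")})
    P.update({n: np.zeros(H) for n in ("bz", "br", "b")})

    h = np.zeros(H)
    hs = np.zeros((T, H))
    last = np.zeros(N)                   # last observed value per feature
    delta = np.zeros(N)                  # time since last observation
    filled = np.zeros((T, N))
    for t in range(T):
        delta = np.where(m[t] == 1, 0.0, delta + 1.0)
        last = np.where(m[t] == 1, x[t], last)
        filled[t] = last
        # (a) denoise: weighted average over the k most recent filled values
        window = filled[max(t - k + 1, 0):t + 1]
        w = theta[:len(window)] / theta[:len(window)].sum()
        x_tilde = (w[::-1, None] * window).sum(axis=0)
        # (b) impute: decay toward c where the feature is missing
        gamma = 1.0 - np.exp(-np.maximum(0.0, w_g * delta + b_g))
        x_hat = m[t] * x_tilde + (1 - m[t]) * ((1 - gamma) * x_tilde + gamma * c)
        # (c) standard GRU update on the cleaned input
        z = sigmoid(P["Wz"] @ x_hat + P["Uz"] @ h + P["bz"])
        r = sigmoid(P["Wr"] @ x_hat + P["Ur"] @ h + P["br"])
        h_cand = np.tanh(P["W"] @ x_hat + P["U"] @ (r * h) + P["b"])
        h = (1 - z) * h + z * h_cand
        hs[t] = h
    return hs
```

Because the hidden state is always a convex combination of its previous value and a tanh output, every component of the returned states stays strictly inside (-1, 1).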

Claims (1)

In a cell structure of a recurrent artificial neural network model,
each time point $t$ of time-series data of length $T$ consists of an $N$-dimensional vector containing missing values and noise, and the cell comprises:
(a) a layer that mitigates noise by a weighted-average method using a noise-mitigation filter learnable from the time-series data;
(b) a layer that imputes missing values; and
(c) a layer that stores, in a hidden state vector via a GRU computation, the information to be remembered at the current time step;
wherein layer (a) uses the input value $x_t^d$ of the $d$-th dimension of the $N$-dimensional vector at time $t$ as-is when it is observed, and uses the most recently observed input value $x_{t'}^d$ in the $d$-th dimension when it is not observed, and outputs the denoised input value $\tilde{x}_t^d$ as the weighted average of the $k$ per-time-step input values $x'^d_{t-i}$, in which the most recently computed missing values have been imputed, for the most recently observed inputs as of time $t$;
wherein layer (b) imputes missing values based on the denoised input value $\tilde{x}_t^d$, using the denoised input value as-is when the input value $x_t^d$ is observed at time $t$, and using the input value to which the decay rate has been applied when it is not observed; and
wherein layer (c) performs the GRU computation on the input value $\hat{x}_t$, in which missing values and noise have been handled, and expresses the information to be remembered at the current time step as the hidden state vector: the current hidden state $h_t$ obtained from the GRU computation, which combines the previous hidden state $h_{t-1}$ with the raw data of the current time step.
KR1020180048801A 2018-04-27 2018-04-27 The design of GRU-based cell structure robust to missing value and noise of time-series data in recurrent neural network KR102310490B1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020180048801A KR102310490B1 (en) 2018-04-27 2018-04-27 The design of GRU-based cell structure robust to missing value and noise of time-series data in recurrent neural network
PCT/KR2019/004873 WO2019208998A1 (en) 2018-04-27 2019-04-23 Gru-based cell structure design robust to missing data and noise in time series data in recurrent neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020180048801A KR102310490B1 (en) 2018-04-27 2018-04-27 The design of GRU-based cell structure robust to missing value and noise of time-series data in recurrent neural network

Publications (2)

Publication Number Publication Date
KR20190124846A KR20190124846A (en) 2019-11-06
KR102310490B1 (en) 2021-10-08

Family

ID=68293629

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020180048801A KR102310490B1 (en) 2018-04-27 2018-04-27 The design of GRU-based cell structure robust to missing value and noise of time-series data in recurrent neural network

Country Status (2)

Country Link
KR (1) KR102310490B1 (en)
WO (1) WO2019208998A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111030889B (en) * 2019-12-24 2022-11-01 国网河北省电力有限公司信息通信分公司 Network traffic prediction method based on GRU model
CN111338385A (en) * 2020-01-22 2020-06-26 北京工业大学 Vehicle following method based on fusion of GRU network model and Gipps model
KR102443586B1 (en) 2020-06-29 2022-09-15 세종대학교산학협력단 Method and server for predicting missing data
CN111931849B (en) * 2020-08-11 2023-11-17 北京中水科水电科技开发有限公司 Hydropower unit operation data trend early warning method
CN112561118B (en) * 2020-10-29 2022-09-02 北京水慧智能科技有限责任公司 Municipal pipe network water flow prediction method based on GRU neural network
CN112967816B (en) * 2021-04-26 2023-08-15 四川大学华西医院 Acute pancreatitis organ failure prediction method, computer equipment and system
CN116861347A (en) * 2023-05-22 2023-10-10 青岛海洋地质研究所 Magnetic force abnormal data calculation method based on deep learning model

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5408424A (en) * 1993-05-28 1995-04-18 Lo; James T. Optimal filtering by recurrent neural networks
US9668699B2 (en) * 2013-10-17 2017-06-06 Siemens Healthcare Gmbh Method and system for anatomical object detection using marginal space deep neural networks
US9349105B2 (en) * 2013-12-18 2016-05-24 International Business Machines Corporation Machine learning with incomplete data sets
KR102449837B1 (en) 2015-02-23 2022-09-30 삼성전자주식회사 Neural network training method and apparatus, and recognizing method
KR102399548B1 (en) 2016-07-13 2022-05-19 삼성전자주식회사 Method for neural network and apparatus perform same method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhengping Che et al., "Recurrent Neural Networks for Multivariate Time Series with Missing Values," Scientific Reports volume 8 (2018.04.17.)*

Also Published As

Publication number Publication date
KR20190124846A (en) 2019-11-06
WO2019208998A1 (en) 2019-10-31

Similar Documents

Publication Publication Date Title
KR102310490B1 (en) The design of GRU-based cell structure robust to missing value and noise of time-series data in recurrent neural network
US9111375B2 (en) Evaluation of three-dimensional scenes using two-dimensional representations
KR102641116B1 (en) Method and device to recognize image and method and device to train recognition model based on data augmentation
WO2020176295A1 (en) Artificial neural network compression via iterative hybrid reinforcement learning approach
CN111310814A (en) Method and device for training business prediction model by utilizing unbalanced positive and negative samples
CN113692594A (en) Fairness improvement through reinforcement learning
CN110956148A (en) Autonomous obstacle avoidance method and device for unmanned vehicle, electronic device and readable storage medium
US10902311B2 (en) Regularization of neural networks
CN110942142B (en) Neural network training and face detection method, device, equipment and storage medium
Pfeiffer et al. Reward-modulated Hebbian learning of decision making
JP2021111399A (en) Processing model trained based on loss function
EP3996035A1 (en) Methods and systems for training convolutional neural networks
Suresh et al. A sequential learning algorithm for meta-cognitive neuro-fuzzy inference system for classification problems
Xu et al. A deep deterministic policy gradient algorithm based on averaged state-action estimation
CN107743071B (en) Enhanced representation method and device for network node
CN111930602A (en) Performance index prediction method and device
US20220138573A1 (en) Methods and systems for training convolutional neural networks
JP7073171B2 (en) Learning equipment, learning methods and programs
CN114611673A (en) Neural network compression method, device, equipment and readable storage medium
CN115168722A (en) Content interaction prediction method and related equipment
Salt et al. Differential evolution and bayesian optimisation for hyper-parameter selection in mixed-signal neuromorphic circuits applied to UAV obstacle avoidance
CN113807541A (en) Fairness repair method, system, equipment and storage medium for decision system
CN113035286A (en) Method for generating new chemical structure, neural network device, and non-transitory computer-readable medium
US20230076893A1 (en) Complementary learning system based experience replay (cls-er)
Spears et al. Scale-invariant temporal history (sith): optimal slicing of the past in an uncertain world

Legal Events

Date Code Title Description
N231 Notification of change of applicant
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right