WO2019208998A1 - GRU-based cell structure design robust to missing data and noise in time series data in recurrent neural network - Google Patents

GRU-based cell structure design robust to missing data and noise in time series data in recurrent neural network Download PDF

Info

Publication number
WO2019208998A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
noise
network model
time series
time
Prior art date
Application number
PCT/KR2019/004873
Other languages
French (fr)
Korean (ko)
Inventor
오혜연
박성준
박정국
Original Assignee
한국과학기술원
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한국과학기술원 filed Critical 한국과학기술원
Publication of WO2019208998A1 publication Critical patent/WO2019208998A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods


Abstract

Provided is a recurrent artificial neural network model capable of simultaneously imputing missing values and reducing noise in time series data in accordance with the problem being predicted, the model comprising, in a single cell structure, all of the steps of: (a) reducing noise in the time series data by a weighted average method using a learnable noise reduction filter; (b) imputing missing values; and (c) storing, in a hidden state vector, the information that must be memorized at the present time, through GRU computation. In addition, in configuring the recurrent artificial neural network model, the invention is characterized in that, in step (a), the weight parameters for noise reduction contained in the cell structure are learned so as to be optimized for the task while the model is trained to fit the prediction task. By this method, a recurrent artificial neural network model that simultaneously performs missing value imputation and noise reduction on time series data without separate preprocessing can be used for various machine learning tasks.

Description

GRU-based Cell Structure Design Robust to Missing Data and Noise in Time Series Data in Recurrent Neural Networks
The present invention relates to a new recurrent neural network model based on the Gated Recurrent Unit (GRU), in which a weighted average filter is added to the internal structure of the cell used in designing a recurrent neural network model and the filter's parameters are trained together with the neural network model, so that the model can learn from time series input data containing missing values and noise and perform prediction tasks.
In general, when time series data are used with a classifier such as a neural network model, the missing entries and noise contained in the data are handled in a preprocessing step. Missing data are imputed using the global mean or a weighted mean, or using algorithms such as linear regression or support vector machine-based regression. Noise in the data is likewise mitigated through methods such as moving average filters, wavelet filters, and fuzzy logic.
However, these missing-data and noise handling techniques have the limitation that they are applied independently of the neural network model's target task. Data that have already been preprocessed are not modified while the neural network model is trained on them, so neither the structure of the model nor the characteristics of the target task can be effectively reflected in the missing-data and noise handling.
To overcome this limitation, approaches have been proposed that modify the cell structure inside the neural network model so as to impute missing data and mitigate noise. Performing data preprocessing through the cell structure has the advantage that the parameters of the preprocessing functions can be learned together during the training of the neural network model. However, no cell structure has yet been proposed that performs missing-data imputation and noise mitigation simultaneously, and room remains for improving the accuracy of the target task.
SUMMARY OF THE INVENTION
An object of the present invention is to solve the problems described above by using GRU-based cells with a built-in weighted average filter when designing a recurrent neural network model, and by learning the filter parameters present in each cell together when training the network, thereby providing a method that makes it possible to train a recurrent neural network model on time series data containing missing data and noise without separate preprocessing.
In particular, the cell structure presented in the present invention can impute missing values while simultaneously mitigating noise through a learnable, flexible weighted average filter, and the invention provides a learning algorithm that trains the parameters of the noise mitigation filter together, tailored to the problem to be predicted.
To achieve the above object, the present invention provides a recurrent artificial neural network model capable of simultaneously imputing missing values and mitigating noise in time series data according to the problem to be predicted, characterized in that a single cell structure comprises all of the steps of (a) mitigating noise by a weighted average method using a noise mitigation filter learnable from the time series data, (b) imputing missing values, and (c) storing the information to be remembered at the current time step in a hidden state vector through a GRU operation.
Further, in constructing the recurrent artificial neural network model, the invention is characterized in that, in step (a), the weight parameters for noise mitigation contained in the cell structure are learned so as to be optimized for the task while the recurrent artificial neural network model is trained to fit the prediction task.
As described above, when a recurrent neural network is constructed from the missing-data- and noise-robust GRU cells of the present invention, the performance of any recurrent neural network operating on time series data containing missing data and noise is improved on its target task.
FIG. 1 is a block diagram of a conventional GRU cell structure.
FIG. 2 is a block diagram of the cell structure presented in the present invention.
DETAILED DESCRIPTION
Hereinafter, specific details for carrying out the present invention are described with reference to the drawings. The cell structure presented by the present invention is as follows.

1. Each time step t of a time series of length T consists of an N-dimensional vector representing N features, and may contain missing values and noise.
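For illustration only (this does not form part of the disclosure), the observation mask $m_t^d$ and the time gap $\delta_t^d$ used in the steps below can be derived from such a series as in the following minimal sketch, assuming NumPy, unit time steps, and NaN-encoded missing entries:

```python
import numpy as np

def mask_and_delta(x: np.ndarray):
    """x: (T, N) time series with missing entries encoded as NaN.
    Returns the observation mask m (1 = observed, 0 = missing) and, per
    dimension, the time gap delta since the last observation."""
    T, N = x.shape
    m = (~np.isnan(x)).astype(float)
    delta = np.zeros((T, N))
    for t in range(1, T):
        # the gap is 1 one step after an observation, otherwise it accumulates
        delta[t] = np.where(m[t - 1] == 1, 1.0, delta[t - 1] + 1.0)
    return m, delta
```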
2. Noise is removed using a denoising layer. The value the layer produces, $\tilde{x}_t$, is the noise-removed version of the most recently observed input values as of time step t.

2.1. If the value $x_t^d$ of the d-th dimension of the N-dimensional vector at time step t was observed, that value is used as-is. Conversely, if $x_t^d$ was not observed, the most recently observed value in that dimension, ${x'}_{t-1}^d$, is used. Here $m_t^d$ is a mask indicating whether $x_t^d$ was observed, taking the value 1 if it was observed and 0 otherwise:

$$ {x'}_t^d = m_t^d\, x_t^d + (1 - m_t^d)\, {x'}_{t-1}^d $$
2.2. $\tilde{x}_t^d$ is the weighted average of the ${x'}^d$ values from time step t−k to time step t, i.e., of the most recently computed, missing-value-imputed values ${x'}_{t-i}^d$ per time step, where $w$ is a learnable parameter giving the weight of each time step in the window:

$$ \tilde{x}_t^d = \sum_{i=0}^{k} w_i\, {x'}_{t-i}^d $$
That is, the value $\tilde{x}_t^d$ produced by the denoising layer is the noise-removed version of the most recently observed values as of time step t.
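As an illustrative sketch only (the patent does not prescribe an implementation), the denoising layer of step 2 could be written in PyTorch as follows; the class name `DenoisingLayer`, the softmax normalization of the weights $w$, and the zero initialization of the last observed values are assumptions:

```python
import torch
import torch.nn as nn

class DenoisingLayer(nn.Module):
    """Sketch of step 2: forward-fill each feature with its last observed
    value x'_t (step 2.1), then take a learnable weighted average over the
    k most recent x' values (step 2.2)."""

    def __init__(self, n_features: int, k: int):
        super().__init__()
        self.k = k
        self.w = nn.Parameter(torch.full((k,), 1.0 / k))  # per-time-step weights

    def forward(self, x: torch.Tensor, m: torch.Tensor) -> torch.Tensor:
        # x, m: (T, N) values and observation mask (1 = observed, 0 = missing)
        T, N = x.shape
        x_last = torch.zeros(N)      # x' before the first observation (assumption)
        history, out = [], []
        for t in range(T):
            x_last = torch.where(m[t] > 0, x[t], x_last)   # x'_t, step 2.1
            history.append(x_last)
            window = torch.stack(history[-self.k:])        # x'_{t-k+1} .. x'_t
            w = torch.softmax(self.w[-window.shape[0]:], dim=0)
            out.append((w.unsqueeze(1) * window).sum(dim=0))  # x~_t, step 2.2
        return torch.stack(out)
```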
3. Missing values are imputed based on the noise-removed values produced by the preceding layer. The value the layer produces is $\hat{x}_t^d$, in which the noise has been removed and missing values have been accounted for. In detail:

$$ \hat{x}_t^d = m_t^d\, \tilde{x}_t^d + (1 - m_t^d)\left[(1 - \gamma_t^d)\, {\tilde{x}'}_t^d + \gamma_t^d\, \bar{x}^d\right] $$
3.1. If the value $x_t^d$ of the d-th dimension of the N-dimensional vector at time step t was observed, the noise-removed value $\tilde{x}_t^d$ is used as-is.
3.2. If the value $x_t^d$ was not observed, ${\tilde{x}'}_t^d$ with the decay rate $\gamma_t^d$ applied is used. That is, an exponential decay rate is applied in proportion to the time $\delta_t^d$ that has elapsed from the current time step t back to the time step at which $x^d$ was last observed, for example:

$$ \gamma_t^d = 1 - \exp\left(-\max\left(0,\; w_\gamma\, \delta_t^d + b_\gamma\right)\right) $$
The decay rate $\gamma_t^d$ increases in proportion to $\delta_t^d$, where $w_\gamma$ and $b_\gamma$ are parameters that can be learned from the data to determine the decay rate. The decay rate takes a value between 0 and 1: the closer it is to 1, the more the overall mean or an arbitrary constant $\bar{x}^d$ is used in place of the most recently observed value ${\tilde{x}'}_t^d$, so that every input value converges to $\bar{x}^d$ when no observation has been provided for a sufficiently long time.
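The imputation of step 3 can likewise be sketched as below, again for illustration only: the exact decay form $1 - \exp(-\max(0, w_\gamma \delta + b_\gamma))$ is an assumption consistent with the description (a rate in [0, 1) that grows with $\delta$), and treating $\bar{x}$ as a learnable constant is also an assumption:

```python
import torch
import torch.nn as nn

class DecayImputation(nn.Module):
    """Sketch of step 3: keep the denoised value where x_t^d was observed;
    where it is missing, blend the last denoised observation toward the
    constant x_bar, shifting more weight to x_bar as the gap delta grows."""

    def __init__(self, n_features: int):
        super().__init__()
        self.w_gamma = nn.Parameter(torch.zeros(n_features))
        self.b_gamma = nn.Parameter(torch.zeros(n_features))
        self.x_bar = nn.Parameter(torch.zeros(n_features))  # overall mean / constant

    def forward(self, x_tilde, x_tilde_last, m, delta):
        # gamma in [0, 1), increasing with delta (functional form assumed)
        gamma = 1.0 - torch.exp(-torch.relu(self.w_gamma * delta + self.b_gamma))
        imputed = (1.0 - gamma) * x_tilde_last + gamma * self.x_bar
        return m * x_tilde + (1.0 - m) * imputed            # x^_t, step 3
```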
4. The GRU operation is performed on the value $\hat{x}_t$, in which missing values and noise have been handled:

$$ z_t = \sigma(W_z\, \hat{x}_t + U_z\, h_{t-1} + b_z) $$
$$ r_t = \sigma(W_r\, \hat{x}_t + U_r\, h_{t-1} + b_r) $$
$$ \tilde{h}_t = \tanh(W_h\, \hat{x}_t + U_h\,(r_t \odot h_{t-1}) + b_h) $$
$$ h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t $$
The input vector $\hat{x}_t$ at time step t is used to compute the reset gate $r_t$ and the update gate $z_t$. Each gate is a value between 0 and 1 computed from the previous hidden state $h_{t-1}$ and the input: $r_t$ indicates how much of the previous hidden state $h_{t-1}$ to reflect when computing the candidate hidden state $\tilde{h}_t$, and $z_t$ indicates how much of $h_{t-1}$ to reflect when computing the current hidden state $h_t$.
5. The value finally obtained from the cell structure of the present invention is the hidden state $h_t$ at the current time step. It combines the information processed up to the previous time step, $h_{t-1}$, with the raw data of the current time step, and is a vector representation of the information that must be remembered at this time step in order to perform the task on the time series data.
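For completeness, the GRU update of steps 4 and 5 applied to the processed input $\hat{x}_t$ is the standard GRU cell; the following sketch uses parameter names following the notation above:

```python
import torch
import torch.nn as nn

class RobustGRUCell(nn.Module):
    """Sketch of steps 4-5: one standard GRU step on the denoised,
    imputed input x^_t, returning the hidden state h_t."""

    def __init__(self, n_features: int, hidden_size: int):
        super().__init__()
        self.W_z = nn.Linear(n_features, hidden_size)              # includes b_z
        self.U_z = nn.Linear(hidden_size, hidden_size, bias=False)
        self.W_r = nn.Linear(n_features, hidden_size)              # includes b_r
        self.U_r = nn.Linear(hidden_size, hidden_size, bias=False)
        self.W_h = nn.Linear(n_features, hidden_size)              # includes b_h
        self.U_h = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, x_hat: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        z = torch.sigmoid(self.W_z(x_hat) + self.U_z(h_prev))        # update gate
        r = torch.sigmoid(self.W_r(x_hat) + self.U_r(h_prev))        # reset gate
        h_cand = torch.tanh(self.W_h(x_hat) + self.U_h(r * h_prev))  # candidate
        return (1 - z) * h_prev + z * h_cand                         # h_t
```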
The present invention thus provides a recurrent artificial neural network model capable of simultaneously imputing missing values and mitigating noise in time series data according to the problem to be predicted, characterized in that a single cell structure comprises all of the steps of (a) mitigating noise by a weighted average method using a noise mitigation filter learnable from the time series data, (b) imputing missing values, and (c) storing the information to be remembered at the current time step in a hidden state vector through a GRU operation. Further, in constructing the recurrent artificial neural network model, the invention is characterized in that, in step (a), the weight parameters for noise mitigation contained in the cell structure are learned to be optimized for the task while the model is trained to fit the prediction task. By this method, a recurrent artificial neural network model that simultaneously performs missing value imputation and noise mitigation on time series data without separate preprocessing can be used for a variety of machine learning tasks.
The notation used above is as follows.

$x_t^d$: the value of the d-th dimension of the N-dimensional vector at time step t.

${x'}_t^d$: the most recently observed value of the d-th dimension of the N-dimensional vector as of time step t. If this value was last observed at time step t−1 and the value at time step t ($x_t^d$) is missing, then ${x'}_t^d = x_{t-1}^d$ (when the decay rate is not considered).

$m_t^d$: a mask value indicating whether the value of the d-th dimension of the N-dimensional vector at time step t is missing; it is 1 if $x_t^d$ is not missing and 0 if $x_t^d$ is missing.

$\gamma_t^d$: the decay rate, between 0 and 1, that determines how much of the most recently observed value ${\tilde{x}'}_t^d$ is reflected in $\hat{x}_t^d$ when a value is missing.

$\delta_t^d$: the input value used to determine the decay rate, representing the time difference between $x_t^d$ and ${x'}_t^d$, i.e., the distance from the current time step to the last time step at which a value was observed. For example, if the current time step is t and the value was observed at time step t−1, then $\delta_t^d$ is 1.

$w_\gamma$: a learnable parameter multiplied by $\delta_t^d$ to determine the decay rate.

$b_\gamma$: a learnable parameter added to $w_\gamma \delta_t^d$ to determine the decay rate.

$w$: in the denoising layer, the weights multiplied by each ${x'}_{t-i}^d$ to compute $\tilde{x}_t^d$.

$\tilde{x}_t^d$: the value of the d-th dimension of the N-dimensional vector at time step t with missing values imputed and noise removed, obtained by multiplying the ${x'}^d$ value of each time step by the weights $w$.

$\tilde{h}_t$: the candidate hidden state; a candidate for the hidden state at the current time step, generated using the input that arrived at the current time step.

$h_t$: the hidden state at the current time step; a vector representation of the information that must be remembered at the current time step, computed from the hidden state of the previous time step and the candidate hidden state of the current time step.

$z_t$: the update gate used in the GRU operation; a value between 0 and 1 obtained by applying the sigmoid activation function to the sum of the inner product of the parameter $U_z$ with the previous hidden state $h_{t-1}$, the inner product of $W_z$ with the input $\hat{x}_t$, and $b_z$. It indicates how much of the previous hidden state $h_{t-1}$ to reflect when computing the current hidden state $h_t$.

$r_t$: the reset gate used in the GRU operation; a value between 0 and 1 obtained by applying the sigmoid activation function to the sum of the inner product of the parameter $U_r$ with the previous hidden state $h_{t-1}$, the inner product of $W_r$ with the input $\hat{x}_t$, and $b_r$. It indicates how much of the previous hidden state $h_{t-1}$ to reflect when computing the candidate hidden state $\tilde{h}_t$.
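Continuing the illustrative sketches above (all class and variable names come from those sketches, not from the patent), a whole sequence might be pushed through the cell as follows; for brevity the most recent denoised observation ${\tilde{x}'}_t$ is approximated here by the denoised value itself:

```python
import torch

T, N, H, K = 50, 8, 32, 5        # toy sizes: steps, features, hidden, window
x = torch.randn(T, N)            # toy series (missing entries zero-filled)
m = torch.ones(T, N)             # observation mask from preprocessing
delta = torch.zeros(T, N)        # time gaps from preprocessing

denoise, impute, cell = DenoisingLayer(N, K), DecayImputation(N), RobustGRUCell(N, H)

x_tilde = denoise(x, m)          # step 2
h = torch.zeros(H)
for t in range(T):
    x_hat = impute(x_tilde[t], x_tilde[t], m[t], delta[t])  # step 3
    h = cell(x_hat, h)                                      # steps 4-5
# h now holds h_T, the hidden state fed to the downstream prediction task
```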

Claims (1)

  1. A recurrent artificial neural network model capable of simultaneously imputing missing values and mitigating noise in time series data according to the problem to be predicted, wherein a single cell structure comprises all of the steps of:
    (a) mitigating noise by a weighted average method using a noise mitigation filter learnable from the time series data;
    (b) imputing missing values; and
    (c) storing the information to be remembered at the current time step in a hidden state vector through a GRU operation.
PCT/KR2019/004873 2018-04-27 2019-04-23 Gru-based cell structure design robust to missing data and noise in time series data in recurrent neural network WO2019208998A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020180048801A KR102310490B1 (en) 2018-04-27 2018-04-27 The design of GRU-based cell structure robust to missing value and noise of time-series data in recurrent neural network
KR10-2018-0048801 2018-04-27

Publications (1)

Publication Number Publication Date
WO2019208998A1

Family

ID=68293629

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2019/004873 WO2019208998A1 (en) 2018-04-27 2019-04-23 Gru-based cell structure design robust to missing data and noise in time series data in recurrent neural network

Country Status (2)

Country Link
KR (1) KR102310490B1 (en)
WO (1) WO2019208998A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111030889A (en) * 2019-12-24 2020-04-17 国网河北省电力有限公司信息通信分公司 Network traffic prediction method based on GRU model
CN111338385A (en) * 2020-01-22 2020-06-26 北京工业大学 Vehicle following method based on fusion of GRU network model and Gipps model
CN111931849A (en) * 2020-08-11 2020-11-13 北京中水科水电科技开发有限公司 Hydroelectric generating set operation data trend early warning method
CN112967816A (en) * 2021-04-26 2021-06-15 四川大学华西医院 Computer equipment and system for acute pancreatitis organ failure prediction
CN116861347A (en) * 2023-05-22 2023-10-10 青岛海洋地质研究所 Magnetic force abnormal data calculation method based on deep learning model

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102443586B1 (en) 2020-06-29 2022-09-15 세종대학교산학협력단 Method and server for predicting missing data
CN112561118B (en) * 2020-10-29 2022-09-02 北京水慧智能科技有限责任公司 Municipal pipe network water flow prediction method based on GRU neural network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5408424A (en) * 1993-05-28 1995-04-18 Lo; James T. Optimal filtering by recurrent neural networks
US20150238148A1 (en) * 2013-10-17 2015-08-27 Siemens Aktiengesellschaft Method and system for anatomical object detection using marginal space deep neural networks
US9349105B2 (en) * 2013-12-18 2016-05-24 International Business Machines Corporation Machine learning with incomplete data sets

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102449837B1 (en) 2015-02-23 2022-09-30 삼성전자주식회사 Neural network training method and apparatus, and recognizing method
KR102399548B1 (en) 2016-07-13 2022-05-19 삼성전자주식회사 Method for neural network and apparatus perform same method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5408424A (en) * 1993-05-28 1995-04-18 Lo; James T. Optimal filtering by recurrent neural networks
US20150238148A1 (en) * 2013-10-17 2015-08-27 Siemens Aktiengesellschaft Method and system for anatomical object detection using marginal space deep neural networks
US9349105B2 (en) * 2013-12-18 2016-05-24 International Business Machines Corporation Machine learning with incomplete data sets

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JO BEYEONG SEUNG: "MICROSOFTWARE THE CHECK POINT. LEARN & TRY & SHARE", vol. 391, 29 January 2018 (2018-01-29), pages 1 - 204 *
WEI WEI: "A Generic Neural Network Approach for Filling Missing data in Data Mining", 2003 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, CONFERENCE THEME - SYSTEM SECURITY AND ASSURANCE, 8 October 2003 (2003-10-08), pages 862 - 867, XP010666852, DOI: 10.1109/ICSMC.2003.1243923 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111030889A (en) * 2019-12-24 2020-04-17 国网河北省电力有限公司信息通信分公司 Network traffic prediction method based on GRU model
CN111030889B (en) * 2019-12-24 2022-11-01 国网河北省电力有限公司信息通信分公司 Network traffic prediction method based on GRU model
CN111338385A (en) * 2020-01-22 2020-06-26 北京工业大学 Vehicle following method based on fusion of GRU network model and Gipps model
CN111931849A (en) * 2020-08-11 2020-11-13 北京中水科水电科技开发有限公司 Hydroelectric generating set operation data trend early warning method
CN111931849B (en) * 2020-08-11 2023-11-17 北京中水科水电科技开发有限公司 Hydropower unit operation data trend early warning method
CN112967816A (en) * 2021-04-26 2021-06-15 四川大学华西医院 Computer equipment and system for acute pancreatitis organ failure prediction
CN112967816B (en) * 2021-04-26 2023-08-15 四川大学华西医院 Acute pancreatitis organ failure prediction method, computer equipment and system
CN116861347A (en) * 2023-05-22 2023-10-10 青岛海洋地质研究所 Magnetic force abnormal data calculation method based on deep learning model

Also Published As

Publication number Publication date
KR102310490B1 (en) 2021-10-08
KR20190124846A (en) 2019-11-06

Similar Documents

Publication Publication Date Title
WO2019208998A1 (en) Gru-based cell structure design robust to missing data and noise in time series data in recurrent neural network
Sihwail et al. Improved harris hawks optimization using elite opposition-based learning and novel search mechanism for feature selection
Ruck et al. Comparative analysis of backpropagation and the extended Kalman filter for training multilayer perceptrons
US5636326A (en) Method for operating an optimal weight pruning apparatus for designing artificial neural networks
JP7366274B2 (en) Adaptive search method and device for neural networks
CN113449864B (en) Feedback type impulse neural network model training method for image data classification
KR20180045635A (en) Device and method to reduce neural network
CN108683614B (en) Virtual reality equipment cluster bandwidth allocation device based on threshold residual error network
CN108122048B (en) Transportation path scheduling method and system
CN110942142B (en) Neural network training and face detection method, device, equipment and storage medium
Khan et al. Artificial neural network (ANNs)
US5107442A (en) Adaptive neural network image processing system
Geerts et al. Probabilistic successor representations with Kalman temporal differences
Abu Doush et al. Archive-based coronavirus herd immunity algorithm for optimizing weights in neural networks
CN113407820A (en) Model training method, related system and storage medium
CN115599296A (en) Automatic node expansion method and system for distributed storage system
Zhang et al. Learning efficient sparse structures in speech recognition
CN115220818A (en) Real-time dependency task unloading method based on deep reinforcement learning
CN113240430A (en) Mobile payment verification method and device
CN114943330A (en) Neural network model training method, device, equipment and storage medium
WO2002080563A2 (en) Scalable expandable system and method for optimizing a random system of algorithms for image quality
Ding et al. Adaptive training of radial basis function networks using particle swarm optimization algorithm
Abdulhameed et al. Potentials of reinforcement learning in contemporary scenarios
CN112734048A (en) Reinforced learning method
Lee Embedding Differentiable Sparsity into Deep Neural Network

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19792732

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19792732

Country of ref document: EP

Kind code of ref document: A1