KR19990016586A

KR19990016586A - Optimal Product Approximation Method of Probability Distribution by k-th Order Dependency and Multiple Decision Combining Method by Dependency

Info

Publication number: KR19990016586A
Application number: KR1019970039171A
Authority: KR
Inventors: 강희중; 김진형
Original assignee: 윤종용; 삼성전자 주식회사
Priority date: 1997-08-18
Filing date: 1997-08-18
Publication date: 1999-03-15

Abstract

An object of the present invention is to provide a method for optimal product approximation of a probability distribution based on dependencies of first order or higher.

Another object of the present invention is to provide a method for obtaining the component distribution terms in the optimal product approximation of a probability distribution based on dependencies of first order or higher.

Still another object of the present invention is to provide a method for combining multiple decisions based on dependency.

To achieve the first object, the method of the present invention for optimal product approximation of a probability distribution by k-th order dependency applies where:

the set of K decision makers is […],

the set of L decision candidates is […],

the input is […],

the order of dependency is […],

and […] holds;

the posterior probability […] is such that,

when […] = […],

[…] = […],

where […] denotes […];

when […] holds,

[…] is […], and

the relation […] = […] holds.

In a method for optimally approximating the above high-order probability distribution by a product of lower-order probability distributions according to k-th order dependency, the method comprises:

obtaining the first-order dependency;

obtaining the conditional independence assumption;

obtaining the second-order dependency;

obtaining the conditional first-order dependency;

...

obtaining the k-th order dependency; and

obtaining the conditional […]-th order dependency.

To achieve the second object, the method of the present invention for obtaining the component distribution terms in the optimal product approximation of a probability distribution by k-th order dependency applies where:

the set of K decision makers is […],

the set of L decision candidates is […],

the input is […],

the order of dependency is […],

[…], […],

the posterior probability […] is such that,

when […] = […],

[…] = […],

where […] denotes […];

when […] holds,

[…] is […], and

the relation […] = […] holds.

In a method for obtaining the component distribution terms in the optimal product approximation of a probability distribution by k-th order dependency,

given the actual probability distribution […] and the approximating probability distribution […],

the measure of closeness is defined as […], and

the […] that minimizes this measure of closeness is characterized as follows:

for ㉠ do /* first-order dependency */

/* here ㉠ is the first-order dependency: when […], it is given by […] and […] */

for ㉡ do /* second-order dependency */

/* here ㉡ is the second-order dependency: when […], it is given by […] and […] */

...

for ㉢ do /* (k-1)-th order dependency */

/* here ㉢ is the (k-1)-th order dependency: when 0 ≤ i_{k-1}(j), ..., i_1(j) < j, it is given by […] and […] */

while ( ㉣ ) do /* k-th order dependency */

/* here ㉣ is the k-th order dependency: when 0 ≤ i_k(j), ..., i_1(j) < j, it is given by […] and […] */

...

end

...

end

As shown above, it consists of one first-order dependency, one second-order dependency, ..., one (k-1)-th order dependency, and (K-k) k-th order dependencies, and it is obtained using (k-1) nested for loops and one while loop.

To achieve the third object, the method of the present invention for combining multiple decisions based on dependency applies where:

the set of K decision makers is […],

the set of L decision candidates is […],

the input is […],

the order of dependency is […],

and the relation […] holds. In a method that combines, with the ordinary Bayesian decision method, a decision method based on the optimal product approximation of a probability distribution by k-th order dependency and on the method of obtaining its component distribution terms, the method comprises:

a first step of obtaining, with reference to training sample data, the set of optimal product approximations of the probability distribution by k-th order dependency; and

a second step of applying the approximation set obtained in the first step to the combination formula with the Bayesian method, thereby probabilistically combining the decisions of the multiple decision makers.

The method does not require the independence assumption and takes into account the dependencies among the decision makers that produce the multiple decisions.

The fields of application and the effects of the present invention are as follows:

1) By not making the independence assumption, the present invention avoids the problems that the assumption can cause.

2) In approximating a high-order probability distribution by a product of lower-order probability distributions, the order of dependency can be varied, and the set of optimal product approximation distributions based on dependencies of that order can be obtained.

3) The storage complexity required is larger than under the independence assumption but smaller than that of the BKS method, because O(L^2) ≤ O(L^(k+1)) ≤ O(L^(K+1)).

4) By presenting a methodology that handles higher-order dependencies, not only first-order dependency, the invention extends existing research results on product approximation.

5) Combining the decisions of multiple decision makers on the basis of higher-order dependency was shown to give superior performance.

6) The merit of the invention was shown through experiments in the field of pattern recognition in which the decisions of multiple recognizers were combined.

7) Although only embodiments in the field of pattern recognition are described, the idea of the invention is also applicable to other fields such as group decision making and control.
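The storage ordering in item 3) can be made concrete with a small calculation. The sketch below is only illustrative: the function name `table_sizes` and the sample values of L, K, and k are assumptions, and the counts are the orders of magnitude of a single table, not an exact memory footprint.

```python
def table_sizes(L: int, K: int, k: int) -> dict:
    """Order-of-magnitude entry counts for the three schemes (illustrative)."""
    if not 1 <= k <= K:
        raise ValueError("expected 1 <= k <= K")
    return {
        "independence": L ** 2,     # second-order terms only
        "kth_order": L ** (k + 1),  # (k+1)-th order component distributions
        "bks": L ** (K + 1),        # full (K+1)-dimensional BKS table
    }


# Hypothetical setting: 10 classes, 5 decision makers, second-order dependency.
sizes = table_sizes(L=10, K=5, k=2)
print(sizes)
```

With these assumed values, the k-th order scheme sits between the two extremes, which is the trade-off item 3) describes.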

Description

Optimal Product Approximation Method of Probability Distribution by k-th Order Dependency and Multiple Decision Combining Method by Dependency

The present invention relates to a method for optimal product approximation of a probability distribution by second-order dependency, an extension of that method to k-th order dependency, a method for obtaining the component distribution terms in the optimal product approximation of a probability distribution by dependencies of first order or higher, and a method for combining multiple decisions based on dependency.

For decision makers that produce discrete decisions, the prior art includes voting methods (absolute majority or simple majority) and the Behavior-Knowledge Space (BKS) method.

The voting methods are well known to those skilled in the art, so their description is omitted. The BKS method, however, is briefly described below.

Suppose that each of K decision makers selects one of L decision candidates.

A decision vector of K elements is then observed. From the training sample data, the BKS method accumulates these K-dimensional decision vectors together with the true decision values and builds a (K+1)-dimensional table.

For an unknown input, the BKS method observes the decisions of the K decision makers. If the observed decision vector matches an entry in the table, the true decision with the highest frequency for that entry is assigned as the combined decision.

If no entry matches, no combined decision is made.

Operating the BKS method therefore theoretically requires storage for up to L^(K+1) entries.
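The BKS procedure described above can be sketched as follows. This is a minimal illustration, not the implementation of Huang and Suen: the function names `train_bks` and `combine_bks`, the sparse dictionary layout, and the toy data are all assumptions.

```python
from collections import Counter, defaultdict


def train_bks(decision_vectors, true_labels):
    """Accumulate, for each K-element decision vector, the counts of true labels."""
    table = defaultdict(Counter)
    for vec, label in zip(decision_vectors, true_labels):
        table[tuple(vec)][label] += 1
    return table


def combine_bks(table, vec):
    """Return the most frequent true label for this vector, or None (reject)."""
    counts = table.get(tuple(vec))
    if not counts:
        return None  # decision vector never seen in training: no combined decision
    return counts.most_common(1)[0][0]


# Hypothetical training data: K = 3 decision makers, classes "a" and "b".
train_vectors = [("a", "a", "b"), ("a", "a", "b"), ("a", "a", "b"), ("b", "b", "b")]
train_labels = ["a", "a", "b", "b"]
table = train_bks(train_vectors, train_labels)

seen = combine_bks(table, ("a", "a", "b"))    # matched entry
unseen = combine_bks(table, ("b", "a", "a"))  # unmatched entry is rejected
```

The dictionary stores only the vectors that actually occur, but in the worst case it still grows to the full table size, and every unseen vector is rejected.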

Many decision combination methods have been proposed for combining multiple recognizers, but most of them assume that the recognizers are independent of one another. Huang and Suen proposed the Behavior-Knowledge Space (BKS) method to avoid the independence assumption. However, the BKS method has the potential drawback that its computation and storage requirements grow exponentially, and it suffers from many rejections in decision combination.

In contrast, there are methods that consider the dependencies among recognizers. Such dependencies play an important role in the probabilistic combination of multiple decisions without the independence assumption.

That is, in such a method the high-order probability distribution obtained from the training samples is optimally approximated by a product of k-th order dependencies, i.e., (k+1)-th order distributions, and the multiple decisions are then combined using the approximated distribution with a Bayesian method.

Background on the Prior Art in the Field of Pattern Recognition

The background of the prior art in the field of pattern recognition, to which the present invention is applied, is now reviewed. The ultimate goal of research in character and pattern recognition is to improve recognition performance. Research toward this goal has proceeded in two main directions.

One direction was research on improving the performance of the recognizer itself.

The other was research on combining multiple recognizers by decision combination methods.

There are many recognizers and recognition algorithms aimed at these goals, but none of them has performed as well as expected, and no single recognizer or recognition algorithm yet offers both the accuracy and the robustness needed to meet the challenges of real-world problems.

The greatest difficulty in improving the recognizer itself is that, although many features of various types are available, it is not easy to bring them together in a single recognizer.

Combining multiple recognizers rests on the premise that two are better than one: when multiple complementary features and recognizers are used together, they improve recognition performance. Diverse knowledge sources are essential for solving real-world pattern recognition problems. Several researchers have reported improved performance on such recognition problems using a variety of alternative algorithms.

In most cases, multiple recognizers have been combined by combining their decisions (i.e., recognition results) in parallel or sequentially.

In general, as shown in Fig. 1, the form of a recognizer's decision falls into one of three levels of recognition result: a measurement score at the measurement level, a ranking at the rank level, and a top choice at the extraction level.

It is known that a ranking provides more information than a top choice and is less dependent on the recognizer than a measurement score.

It is well known that, in systems that make or support group decisions, the accuracy of the group decision matters more than the accuracy of individual decisions. Multiple experts with different backgrounds and diverse expertise and experience make group decisions cooperatively, and can therefore provide much better decisions and solutions than a single expert.

Many methods have been proposed for reaching a consensus when multiple researchers disagree with one another in their decisions. Examples include voting methods, group consensus functions, social choice functions, and social welfare functions.

In this setting, a recognizer corresponds to one expert in a group of experts. Combining multiple recognizers is also called combining multiple experts in other areas of pattern recognition. As an interdisciplinary approach, several group decision-making methods have been applied to combining multiple recognizers, and improved performance has been reported in several papers.

The voting method and the Borda count method are examples. The process of combining multiple recognizers proceeds as follows:

1) When an input x is given in parallel to K recognizers (C₁, C₂, ..., C_K), K decisions are observed, each in its own decision form.

2) The observed decisions are combined on the basis of a consensus that is expected to yield the best combined decision.

3) For convenient and consistent combination of the observed decisions, most decision combination methods assume that all decisions are expressed in the same form.

In summary, combining multiple recognizers means forming a consensus that achieves the best recognition performance.
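The two example methods named above, voting and the Borda count, can be sketched as follows. This is a hedged illustration: the function names, the tie handling (a tie yields no combined decision), and the sample rankings are assumptions, and the Borda scoring used here is the common convention of awarding n-1 points to the first of n ranked classes down to 0 for the last.

```python
from collections import Counter


def plurality_vote(top_choices):
    """Return the class named most often, or None when the top two tie."""
    ranked = Counter(top_choices).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: no combined decision
    return ranked[0][0]


def borda_count(rankings):
    """Sum Borda points per class: n-1 for first place down to 0 for last."""
    scores = Counter()
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] += n - 1 - position
    return dict(scores)


# Hypothetical rank-level decisions of K = 3 recognizers over classes "3", "5", "8".
rankings = [["3", "8", "5"], ["8", "3", "5"], ["8", "5", "3"]]

vote_winner = plurality_vote([r[0] for r in rankings])  # uses top choices only
borda_scores = borda_count(rankings)                    # uses the full rankings
borda_winner = max(borda_scores, key=borda_scores.get)
```

Both functions treat the recognizers symmetrically and ignore how their decisions depend on one another.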

Many decision combination methods have been proposed in the field of character recognition, but most of them did not focus on the dependencies among recognizers and assumed that the recognizers were independent.

Consequently, the combined performance of multiple recognizers degrades or becomes biased when highly dependent recognizers are added. Huang and Suen proposed the BKS (Behavior-Knowledge Space) method, which no longer requires the independence assumption. However, applying the BKS method to a combination of K recognizers requires a (K+1)-th order probability distribution over one decision variable and K decisions, which is exponentially complex and, according to theoretical analysis, unmanageable even for small K. An approximation method is therefore needed.

Lewis, using an extension concept due to Hartmanis, studied the problem of approximating an n-th order binary distribution by a product of several lower-order component distributions. This product approximation was based on an information measure of the closeness of two distributions and on the maximum entropy criterion.

Chow and Liu solved the optimization problem posed by Lewis and presented a procedure for optimally approximating an n-th order binary distribution by (n-1) second-order component distributions. This optimization procedure is based on maximizing the total mutual information between pairs of component variables and on the MWST (Maximum Weight Spanning Tree) algorithm.

Mutual information is defined as a quantitative measure of how much the occurrence of a particular event (i.e., a decision) tells about the possibility of some alternative. Dependency means the ability to predict one recognizer's decision from the decisions of other recognizers, and it is measured statistically by computing the average mutual information.
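The Chow and Liu procedure described above, pairwise mutual information followed by a maximum weight spanning tree, can be sketched in plain Python. This is an illustrative sketch under assumptions: the names `mutual_information` and `chow_liu_tree`, the use of Kruskal's algorithm with a union-find, and the toy decision columns are not taken from the source.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical average mutual information (in nats) between two columns."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        (c / n) * log((c * n) / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def chow_liu_tree(columns):
    """Kruskal's algorithm on pairwise mutual information, heaviest edges first."""
    names = list(columns)
    edges = sorted(
        ((mutual_information(columns[a], columns[b]), a, b)
         for i, a in enumerate(names) for b in names[i + 1:]),
        reverse=True,
    )
    parent = {name: name for name in names}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    tree = []
    for weight, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # adding the edge does not create a cycle
            parent[root_a] = root_b
            tree.append((a, b, weight))
    return tree


# Hypothetical decisions of three recognizers on eight training samples.
columns = {
    "C1": [0, 0, 1, 1, 0, 1, 0, 1],
    "C2": [0, 0, 1, 1, 0, 1, 0, 1],  # always agrees with C1
    "C3": [0, 1, 0, 1, 0, 1, 0, 1],  # only partly agrees with C1
}
tree = chow_liu_tree(columns)
```

The n-1 selected edges identify which second-order distributions enter the product approximation; the pair with the strongest dependency is always chosen first.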

A k-th order dependency is defined as the ability to predict one recognizer's decision from the decisions of k other recognizers. Because it is difficult to measure k-th order dependency among recognizers from decisions at the rank level, only decisions at the extraction level are assumed to take part in evaluating k-th order dependency.

This dependency is the theoretical basis for approximating a high-order probability distribution by a product of lower-order distributions. Because recognizers tend to be statistically dependent on one another, the assumption that recognizers perform independently is considered weak.

From this point of view, it is therefore desirable to take the dependencies among recognizers into account when combining multiple recognizers.

Most existing decision combination methods assumed that the recognizers perform independently of one another. In combining multiple recognizers, the estimation of the high-order probability distribution mostly assumed that the recognizers' decisions are conditionally independent given the input. The conditional independence assumption has the advantage of requiring only simple computation and little storage, but it often does not hold in real situations. Adding a recognizer that is highly dependent on the member recognizers causes the combined decision to be biased toward the decisions of the dependent recognizers, and the combined performance can be degraded by them. Moreover, because recognizers tend to be statistically dependent on one another, the assumption that they perform independently is undesirable. A decision combination method that does not make the independence assumption is therefore preferable.

The BKS method, on the other hand, has the problem that estimating and storing the high-order probability distribution is exponentially complex and unmanageable from a theoretical standpoint. In addition, in practical applications without a sufficient and representative training data set, the BKS method tends to have a high rejection rate for decision combinations not seen in training. To reduce the exponential complexity and the potentially high rejection rate, the high-order probability distribution should be approximated by a product of lower-order component distributions. Dependency helps identify such an optimal approximation of the high-order probability distribution.

As an intermediate between the methods above, a new methodology that takes dependencies into account is needed in order to overcome the drawbacks of the conventional methods while retaining their advantages.

Chow and Liu, like Lewis, proposed approximating an n-th order probability distribution by a product of second-order distributions, considering first-order tree dependencies. However, a decision often depends on two or more other recognizers.

In such cases, first-order dependency is not adequate for estimating the high-order probability distribution. A new method is therefore needed that optimally approximates the high-order probability distribution by a product of k-th order dependencies, i.e., (k+1)-th order distributions, where 1 ≤ k ≤ K.
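The step from first-order to k-th order dependency can be written out as follows. The notation is an assumption patterned on the Chow and Liu form, since the formulas of this text were not preserved: C_1, ..., C_K stand for the K recognizers' decisions, C_{K+1} for the decision variable, (m_1, ..., m_{K+1}) for an unknown permutation of the indices, and conditioning on C_{m_0} is taken to mean no conditioning.

```latex
% First-order dependency: each variable is conditioned on one predecessor.
\[
P_a(C_1,\dots,C_{K+1}) \;=\; \prod_{j=1}^{K+1} P\bigl(C_{m_j} \mid C_{m_{i(j)}}\bigr),
\qquad 0 \le i(j) < j .
\]

% k-th order dependency: each variable is conditioned on up to k predecessors,
% so every factor is at most a (k+1)-th order distribution.
\[
P_a(C_1,\dots,C_{K+1}) \;=\; \prod_{j=1}^{K+1}
P\bigl(C_{m_j} \mid C_{m_{i_k(j)}},\dots,C_{m_{i_1(j)}}\bigr),
\qquad 0 \le i_k(j),\dots,i_1(j) < j .
\]
```

Under this reading, k = 1 recovers the tree-dependency product, and larger k trades more storage per factor for a closer approximation.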

The following describes prior research on combining multiple decisions, together with the interdisciplinary background and expertise involved.

It covers research related to combining multiple decisions in pattern recognition and group decision making, and describes several studies on approximating high-order probability distributions based on dependency trees.

First, prior research on combining multiple decisions to improve decision quality is surveyed in the areas of pattern recognition and group support systems, including group decision making. Then research on approximating a high-order discrete probability distribution by a product of lower-order component distributions is introduced.

This research provides the theoretical basis and motivation for a dependency-based methodology that considers the dependencies among recognizers and dispenses with the independence assumption.

Conventional Methods for Combining Multiple Decisions

Decision Combination Methods for Multiple Recognizers

Studied by many researchers in pattern recognition, the combination of decisions from multiple recognizers has produced many decision combination methods aimed at improved performance. These methods can be classified into three levels according to the form of the decisions, which depends on one of three kinds of result: measurement scores of classes at the measurement level, rankings of classes at the rank level, and a single choice at the extraction level. From measurement-score decisions, a rank decision is obtained by ordering the classes by score and assigning each class a rank. A single-choice decision is made by selecting the one class with the best measurement score or the top rank. A recognizer that makes rank-level decisions can take part in decision combination at both the rank level and the extraction level.
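The relationship among the three decision forms can be sketched as a pair of conversions. The function names and sample scores are assumptions, and the sketch assumes that a higher measurement score is better.

```python
def scores_to_ranking(scores):
    """Measurement level -> rank level: order classes by descending score."""
    return sorted(scores, key=scores.get, reverse=True)


def ranking_to_top_choice(ranking):
    """Rank level -> extraction level: keep only the top-ranked class."""
    return ranking[0]


# Hypothetical measurement scores from one recognizer for classes "0", "6", "8".
scores = {"0": 0.1, "6": 0.7, "8": 0.2}
ranking = scores_to_ranking(scores)
top_choice = ranking_to_top_choice(ranking)
```

Each conversion discards information, which is why a recognizer producing scores or rankings can also take part in combination at the coarser levels but not the reverse.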

Decision combination methods at the measurement level include the averaged Bayes recognizer, integration of multiple recognizers using fuzzy integrals or fuzzy logic, the LCA (Linear Confidence Accumulation) method, the use of Dempster-Shafer theory, and the use of neural network combiners.

At the extraction level, the methods include the commonly used voting method, the Bayesian method with the independence assumption, the Dempster-Shafer formula used for evidential reasoning, and the BKS method.

In particular, in their use of the Bayesian method, Xu et al. assume that recognizers using independent feature sets, or recognizers trained independently, perform independently of one another. This approach causes problems when the recognizers are not functionally independent.
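The effect of the independence assumption can be seen in a generic sketch of Bayesian combination at the extraction level. This is not the exact formulation of Xu et al.: it is the plain conditional-independence form P(M | c_1, ..., c_K) ∝ P(M) ∏ P(c_i | M), with hypothetical function names, toy data, and no smoothing of zero counts.

```python
from collections import Counter, defaultdict


def train_independent_bayes(decision_vectors, true_labels):
    """Count class priors and, per recognizer, decisions given the true class."""
    prior = Counter(true_labels)
    K = len(decision_vectors[0])
    likelihood = [defaultdict(Counter) for _ in range(K)]
    for vec, label in zip(decision_vectors, true_labels):
        for i, decision in enumerate(vec):
            likelihood[i][label][decision] += 1
    return prior, likelihood


def combine_independent_bayes(prior, likelihood, vec):
    """Score each class by P(M) * prod_i P(c_i | M) and return the best class."""
    n = sum(prior.values())
    scores = {}
    for label, count in prior.items():
        p = count / n
        for i, decision in enumerate(vec):
            p *= likelihood[i][label][decision] / count
        scores[label] = p
    return max(scores, key=scores.get), scores


# Hypothetical training data: K = 2 recognizers, classes "a" and "b".
vectors = [("a", "a"), ("a", "b"), ("b", "b"), ("b", "b"), ("a", "a"), ("b", "a")]
labels = ["a", "a", "b", "b", "a", "b"]
prior, likelihood = train_independent_bayes(vectors, labels)
best, scores = combine_independent_bayes(prior, likelihood, ("a", "b"))
```

Because each factor involves one recognizer at a time, a recognizer that merely duplicates another contributes its evidence twice, which is the bias the text attributes to highly dependent recognizers.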

Decision combination methods at the rank level include the highest rank method, the Borda count method, the use of logistic regression, and the CAL (Candidate Appearance Likelihood) method.

Combination Methods at the Measurement Level

Fu used syntactic and statistical methods in pattern recognition. In his method, a set of pattern primitives is first extracted from the image data and represented as a symbol string. The attributes of each primitive are represented by a parameter vector associated with it. The (dis)similarity score between two descriptors is defined as the sum of the syntactic distance between the symbol strings and the semantic distance between the associated parameter vectors. Using attributes together with primitives is attractive, but it causes problems in measuring similarity.

The two distances are measured on incompatible scales and must be properly normalized before they are combined. In addition, because a parameter vector is always associated with a primitive, the information contained in the two types of descriptor lacks independence.

Xu et al. use Bayes recognizers and the posterior probabilities they compute for decision combination at the measurement level. Multiple measurement-score decisions are combined by averaging the scores or by taking their median. Their approach can be extended to k-NN (k-Nearest Neighbor) recognizers and distance-based recognizers.
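Averaging or taking the median of measurement scores can be sketched as below. The function name `combine_scores` and the sample posteriors are assumptions, and every recognizer is assumed to report a score for the same set of classes.

```python
from statistics import mean, median


def combine_scores(score_dicts, reducer=mean):
    """Reduce each class's scores across recognizers, then pick the best class."""
    labels = score_dicts[0].keys()
    combined = {
        label: reducer([scores[label] for scores in score_dicts])
        for label in labels
    }
    return max(combined, key=combined.get), combined


# Hypothetical posterior probabilities from three Bayes recognizers.
posteriors = [
    {"a": 0.6, "b": 0.4},
    {"a": 0.3, "b": 0.7},
    {"a": 0.8, "b": 0.2},
]
avg_label, avg_scores = combine_scores(posteriors, reducer=mean)
med_label, med_scores = combine_scores(posteriors, reducer=median)
```

The median is less sensitive than the mean to a single recognizer that reports an extreme score.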

Tahani et al. proposed a method based on the fuzzy integral that nonlinearly combines objective evidence, in the form of fuzzy membership functions, with a subjective evaluation of the worth of the decision sources.

The fuzzy integral differs from other paradigms in that it considers both the objective evidence supplied by the various sources and the expected worth of subsets of those sources in the fusion process.

The most important advantage of this method is that not only is the objective evidence combined, but the relative importance of the different sources is also taken into account. However, the method has the problem that the subjective evaluation is a weak point. Cho used multiple neural networks with mutually different structures.

The decisions of these neural networks were combined by the voting method, the Borda count method, and the fuzzy integral. His study claimed that the fuzzy integral gave better recognition performance on an online handwriting problem.

Huang and Suen introduced an approach to combining recognizers, called the LCA method, in which each recognizer supplies a class label and a corresponding measurement value. In preliminary experiments, the LCA method outperformed the voting and Bayesian methods.

Huang et al. used a multilayer neural network to accumulate likelihood measures obtained through a data transformation function. Their approach consists of two stages: data transformation and data recognition.

In the data transformation stage, the output of each recognizer is transformed into a likelihood measure.

In the data recognition stage, the neural network proved well suited to accumulating the transformed outputs to produce the final recognition result.

Yamaoka et al. presented a way of incorporating heuristics into the integration of recognition results, using a mathematical modeling method based on fuzzy logic. To fuse recognition results validly at the measurement level, the output of each recognizer must be renormalized. The fusion algorithm consists of two stages: a data normalization and qualification process, and a fuzzy integration process.

They favored fuzzy logic because it provides a means of representing vague and imperfect recognition results and is flexible in expressing expert and statistical knowledge, whereas the Bayesian method is more elaborate and complex because of the computation and structure of its confusion tables. However, renormalizing values measured on different scales is not easy.

Rogova presented a method of combining multiple neural network recognizers based on the Dempster-Shafer theory of evidence, together with two new sets of support functions for computing the evidence.

Lee and Srihari proposed a single-layer neural network that combines all three types of recognizers and uses all the information available in a mixed-type situation. By adding a control term to the root-mean-square error learning rule, the network can also eliminate redundant recognizers and handle correlated recognizers. To combine all types, the outputs of recognizers at the rank and abstract levels are renormalized and assigned to the input nodes.

Dimauro et al. presented a methodology for evaluating combination processes. They evaluated the performance of a combination process in light of the characteristics of the set of recognizers being combined, and introduced a well-defined recognizer similarity index for this purpose. Each recognizer's output was assumed to be a ranked list of candidates with a confidence value for each candidate. This systematic way of evaluating combination processes is a step toward supporting the design of advanced multi-recognizer systems.

Combination methods at the rank level

Hull, Ho, et al. advocated the use of ranks as recognition results and proposed rank-level methods for combining the decisions of different recognizers. In particular, Ho explained the advantages of using ranks over measurement scores and top choices, and proposed several methods for candidate set reduction and rank reordering. The general methods for candidate set reduction are the intersection and union methods, which aim to reduce the number of classes in a given set without losing the true class. Candidate set reordering includes useful decision combination methods such as the highest rank method, the Borda count method, and logistic regression.

The Borda count method is a well-known consensus function, one of the social choice functions, and is also known in group decision making as the Borda function.

The Borda count method assigns equal weight to every participating recognizer, whereas logistic regression allows unequal weights to be assigned to the recognizers according to how much each contributes to better performance. In their paper, they reported that logistic regression gave better recognition performance than the other methods.
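The Borda count described above can be sketched as follows. This is a minimal illustration: the function name `borda_count`, the class labels, and the weights are hypothetical. In particular, the optional weights are chosen by hand here, whereas the logistic regression approach would estimate them from training data.

```python
def borda_count(rankings, weights=None):
    """Combine rankings (best class first) with an optionally weighted Borda count.

    Each class receives, from every recognizer, as many points as there are
    classes ranked below it. Ties are broken alphabetically.
    """
    classes = sorted(set(rankings[0]))
    weights = weights or [1.0] * len(rankings)   # equal weights by default
    scores = {c: 0.0 for c in classes}
    for weight, ranking in zip(weights, rankings):
        n = len(ranking)
        for position, c in enumerate(ranking):
            scores[c] += weight * (n - 1 - position)
    order = sorted(classes, key=lambda c: (-scores[c], c))
    return order, scores

rankings = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["b", "c", "a"],
]
plain_order, plain_scores = borda_count(rankings)
weighted_order, _ = borda_count(rankings, weights=[4.0, 1.0, 1.0])
```

With equal weights, class "b" wins with 5 points against 3 for "a" and 1 for "c". Giving the first recognizer a hypothetical weight of 4 moves "a" to the top, which illustrates how unequal weights can change the combined ranking.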

Tsutsumida et al. studied multi-recognizer systems that achieve high recognition performance by combining complementary recognition algorithms, and examined the results of the character recognition competition held by IPTP (Institute for Posts and Telecommunications Policy, Ministry of Posts and Telecommunications, Tokyo, Japan). They investigated the CAL method under new experimental conditions and carried out new experiments applying three-layer back-propagation learning as a combination method that accumulates extended candidate appearances. To this end, the results of the various recognition algorithms were converted, by a data transformation, from ranked lists at the rank level into numeric lists at the measurement level.

Combination methods at the abstract level

Mazurov et al. discussed the application of the committee method to two-class decision problems in which the final decision is made by majority voting. They gave an existence theorem for a minimal committee and an algorithm for constructing such a committee. The method is not easy to extend to multi-class problems.

Suen et al. proposed using several recognizers for handwritten numeral recognition, with their decisions combined by majority voting. The voting mechanism takes the top-choice class of each recognizer. If a recognizer cannot discriminate among a set of candidate classes, its vote is split equally among those classes.
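The vote-splitting mechanism described above can be sketched as follows. This is a minimal illustration: the function name `split_vote` and the ballots are hypothetical, and the sketch simply picks the class with the largest tally rather than applying any particular majority threshold or rejection rule.

```python
from collections import defaultdict
from fractions import Fraction

def split_vote(ballots):
    """Tally votes where each ballot lists the class or classes a recognizer
    could not separate; one vote is shared equally among the listed classes."""
    tally = defaultdict(Fraction)
    for candidates in ballots:
        share = Fraction(1, len(candidates))
        for c in candidates:
            tally[c] += share
    # sorted() makes tie-breaking deterministic (smallest label wins a tie).
    winner = max(sorted(tally), key=tally.__getitem__)
    return winner, dict(tally)

# Four hypothetical recognizers; the third cannot decide between "3" and "8".
ballots = [["3"], ["8"], ["3", "8"], ["3"]]
winner, tally = split_vote(ballots)
```

Here class "3" collects two and a half votes and class "8" one and a half, so "3" is selected.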

Nadal et al. described several heuristic rules for combining the decisions of a set of similar recognizers. In multi-class problems such as character recognition, information from the recognizers beyond the top choice alone is useful.

Alpaydin proposed training multiple neural networks independently, using different learning algorithms or the same learning algorithm with different parameter sets, and then voting over the networks in the form of a weighted majority.

Voting was studied in static and dynamic forms; the dynamic form also takes the complexity of the networks into account. The system assigns each participating network a confidence weight according to its success and complexity.

Lam et al. discussed theoretical considerations concerning the use of majority voting to combine the results of multiple recognizers. In particular, they examined how the performance of the combined decision changes when a new recognizer is added, and derived the conditions under which group performance improves. Their study was motivated by the aim of understanding why and how combining recognizers' decisions by majority voting can produce improved recognition results, and under what assumptions this can be expected to happen.

Mandler et al. presented the advantages of using multiple recognition algorithms and described a method for combining the decisions of independent recognizers based on Bayes decision theory and the Dempster-Shafer theory of evidence. A statistical model and Bayes decision theory are used to transform the distance measures computed by the recognizers into confidence values associated with each class, and these values are then combined by the rules of the theory of evidence. The major difficulty of this method lies in transforming the distance measures into confidence values.
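The evidence combination step mentioned above relies on Dempster's rule of combination, which can be sketched as follows. This is a minimal illustration of the general rule only: the function name `dempster_combine` and the mass values are hypothetical, and the sketch does not reproduce how Mandler et al. derive confidence values from distance measures.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of classes (a focal element) to its
    mass; the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            common = a & b
            if common:
                combined[common] = combined.get(common, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b   # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("the two bodies of evidence are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {s: m / (1.0 - conflict) for s, m in combined.items()}

frame = frozenset({"0", "1", "2"})
# Hypothetical simple support functions from two recognizers.
m1 = {frozenset({"0"}): 0.6, frame: 0.4}
m2 = {frozenset({"1"}): 0.5, frame: 0.5}
fused = dempster_combine(m1, m2)
```

In this made-up example the conflicting mass is 0.3, so after renormalization class "0" receives 0.3/0.7, class "1" receives 0.2/0.7, and the remaining 0.2/0.7 stays on the whole frame.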

Franke et al. proposed two different methods for combining the results of different recognizers.

The first method is based on the Dempster-Shafer theory of evidence, and the second is a statistical method that makes a few assumptions about the input data. In this method, each recognizer's vote is regarded as an independent source of evidence and takes a binary value, pro or contra.

The second method assumes that the votes are binary and is formulated as an optimization problem that can be solved by the common mean-square-error approach. The methods were tested on user-dependent recognition of on-line handwritten characters.

Xu et al. proposed several methods for modeling multi-recognizer combination: the voting method, the Bayesian method, and the Dempster-Shafer theory of evidence. To apply the Dempster-Shafer theory efficiently, they collected the evidence into groups so that the evidence in each group bears on the same decision, and then combined the evidence separately within each group using Barnett's formula for simple support functions.

Consequently, when L is the number of candidate classes, the computation of the whole procedure is reduced to O(L). Several decision rules were also proposed for the voting method.

In the Bayesian method, the recognizers are assumed to be mutually independent, either because they use independent feature sets or because they are trained on independent training sets. Since each recognizer yields its own posterior probability (or confusion probability) for its top choice, all the posterior probabilities (or confusion probabilities) are multiplied together under the mutual independence assumption to combine the decisions. This approach runs into trouble when the recognizers are not mutually independent.
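The product rule under the independence assumption can be sketched as follows. This is a minimal illustration under stated assumptions: each recognizer is described by a hypothetical confusion table of counts, the function name `bayes_combine` is invented for this sketch, and every observed decision is assumed to have occurred at least once in the table so that no division by zero arises.

```python
def bayes_combine(confusions, decisions):
    """Multiply per-recognizer posteriors under the independence assumption.

    confusions[k][i][j] counts training samples of true class i that
    recognizer k labeled as class j; decisions[k] is recognizer k's label.
    Returns normalized beliefs over the true class.
    """
    n_classes = len(confusions[0])
    belief = [1.0] * n_classes
    for table, j in zip(confusions, decisions):
        column_total = sum(table[i][j] for i in range(n_classes))
        for i in range(n_classes):
            # Estimate of P(true class = i | recognizer k said j).
            belief[i] *= table[i][j] / column_total
    total = sum(belief)
    return [b / total for b in belief]

# Hypothetical confusion counts for two recognizers and two classes.
confusions = [
    [[8, 2], [3, 7]],
    [[9, 1], [4, 6]],
]
beliefs = bayes_combine(confusions, decisions=[0, 1])
```

In this made-up case the first recognizer says class 0 and the second says class 1; the product of the estimated posteriors favors class 1 with belief 18/26. If the two recognizers were in fact dependent, multiplying their posteriors in this way would count shared evidence twice, which is the weakness noted above.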

Lu et al. also used the Dempster-Shafer theory of evidence to integrate recognizers of unconstrained handwritten numerals. They proposed a multi-recognizer system consisting of three different recognizers: a neural network, a structural template-matching recognizer, and a polynomial recognizer. They presented algorithms for modeling and combining recognition results, and inference rules that yield more reliable and accurate recognition results based on the combined evidence. The system differs from those of Xu and Mandler in that its input includes class labels and associated confidence values.

Huang et al. proposed a new method, called the BKS (Behavior-Knowledge Space) method, that accumulates the decisions of individual recognizers to derive a final decision. The BKS method consists of a knowledge-modeling stage, which builds from the training data a high-order table of per-class frequencies for each joint decision of the recognizers, and an operation stage, which derives the final decision from the frequencies recorded for the observed joint decision. They claim that if the recognizers behave on real samples as they did on the training samples, the BKS method can be proved to be the best method when each recognizer supplies only a single class at the abstract level.

However, when a joint decision that was not seen in training is observed, the BKS method rejects the input instead of giving a final decision. Moreover, a larger number of recognizers theoretically requires more computation and more storage. These are the drawbacks of the BKS method.
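The two stages of the BKS method, and the rejection of unseen joint decisions, can be sketched as follows. This is a simplified illustration: the function names `bks_train` and `bks_decide` and the training samples are hypothetical, and the sketch omits any frequency threshold that a full implementation might apply before accepting a cell.

```python
from collections import Counter, defaultdict

def bks_train(samples):
    """Knowledge-modeling stage: count true classes for each joint decision."""
    table = defaultdict(Counter)
    for decisions, true_class in samples:
        table[tuple(decisions)][true_class] += 1
    return table

def bks_decide(table, decisions):
    """Operation stage: return the most frequent class of the matching cell,
    or None (rejection) when the joint decision never occurred in training."""
    cell = table.get(tuple(decisions))
    if not cell:
        return None
    return cell.most_common(1)[0][0]

# Hypothetical training data: (decisions of two recognizers, true class).
samples = [
    (("a", "a"), "a"),
    (("a", "b"), "a"),
    (("a", "b"), "b"),
    (("a", "b"), "b"),
    (("b", "b"), "b"),
]
table = bks_train(samples)
seen = bks_decide(table, ("a", "b"))
unseen = bks_decide(table, ("b", "a"))
```

The joint decision ("a", "b") was seen three times, twice with true class "b", so "b" is returned. The joint decision ("b", "a") never occurred in training and is rejected.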

Kittler and Hatef proposed a common theoretical framework for combining recognizers that use distinct pattern representations; that is, they provided a theoretical basis for existing recognizer combination schemes based on the Bayesian decision rule. They also analyzed the sensitivity of the combination schemes to estimation errors. Their experimental results showed that the sum rule, which is developed under the most restrictive assumptions, and the rules derived from it (the min rule, the max rule, the median rule, and the majority vote rule) consistently outperform other combination schemes.

Decision combination methods in other literature

Various methods for committee decision making have been proposed in other literature. These include the voting and Borda count methods, which are useful in multi-recognizer systems. There are also other combination models, including Bayes decision, the Dempster-Shafer theory of evidence, fuzzy theory, and production rules.

Combination methods in group decision making

Group consensus functions, social choice functions, and social welfare functions have been developed to combine the rankings given by a committee of decision makers. The majority voting function is one particular group consensus function.

Arrow discussed the conditions for these functions and for probability axioms based on various assumptions.

Black presented the history of the mathematical theory of committee decisions and elections, together with the logic of committee conditions and elections.

A formal description of social choice functions and social welfare functions is given by Hwang et al. The Borda count method is one kind of social choice function. In multi-class recognition problems it is beneficial to consider more than each recognizer's top choice, so combination methods that use ranks are more useful than majority voting.

A cognitive map is a graph structure that represents causal reasoning and is used to solve causal decision-making problems. When recognizers express their knowledge as cognitive maps, which consist of causal objects (concepts) and directed causal arcs (causal relations), a group cognitive map must be constructed from the individual cognitive maps for a group of recognizers. Constructing the group cognitive map corresponds to combining the multiple decisions of the individual recognizers. Regarding the representation of causalities, Kosko proposed expressing causality in fuzzy terms and combining it with fuzzy operators along the inferential links of the cognitive map. Zhang et al. proposed NPN (Negative-Positive-Neutral) logic, which can directly represent negative causality in a cognitive map. With NPN logic and NPN relations, fuzzy logic is replaced by NPN logic. They also proposed a combination function for combining multiple causal relations in NPN cognitive maps.

Combination methods in recognizer systems

In the field of recognizer systems there are many methods for combining knowledge from multiple sources or pieces of evidence. Duda et al. proposed several approaches to the problem of resolving a hypothesis on the basis of multiple uncertain pieces of evidence using probabilities. These approaches include ways of using multiple pieces of evidence to update the probability of a hypothesis when the relevant evidence is uncertain and the prior relevant evidence is inconsistent. To combine multiple pieces of evidence, they assume that the pieces of evidence are conditionally independent of one another given the hypothesis.

Konolige proposed an information-theoretic formulation of Bayesian theory for recognizer systems that must deal with uncertainty. A theoretically correct implementation of the Bayesian approach requires knowledge of the joint probability function over all events in the model. However, it is difficult to estimate the joint probability function even for a small subset of events. Therefore, computationally practical methods use some approximation to update probabilities.

However, the conditional independence assumption cannot correctly characterize the required joint distribution; that is, the independence assumption is incompatible with the case of multiple mutually exclusive and exhaustive hypotheses. Moreover, a recognizer is often more confident in estimating the probability of a hypothesis conditioned on several evidence events than on a single evidence event. One of his ideas is to apply the principle of minimum information to the joint distribution. After defining entropy and an information measure, he selects as the joint distribution the one that has the minimum information measure among all possible distributions. He also connects the minimum information principle with the concept of independence through a product extension theorem, which describes a distribution by a set of lower-order subdistributions on the basis of maximum entropy. To combine multiple pieces of evidence, the computed product extension is used for the joint probability of the evidence events.

Approximation methods for discrete probability distributions

In the design of most information systems, a central problem is to estimate the underlying n-dimensional probability distribution from a finite number of samples and to store the distribution in a limited amount of machine memory. Constraints on acceptable equipment complexity and on the available samples require that the distribution be approximated using simplifying assumptions such as normality or statistical independence of the random variables.

Lewis, using the notion of an extension due to Hartmanis, considered the problem of approximating an n-th order binary distribution by a product of several component distributions of lower order. He showed that, under suitably restricted conditions, the product approximation has the minimum information property. The approach is based on an information measure of how close two distributions are, called relative entropy or the closeness measure, and on the maximum entropy criterion. Since the information in the true distribution is, by the definition of the closeness measure, the same for every approximation, the best approximation is the one for which the sum of the information in the approximating distribution is greatest. Hence two or more proposed approximations can be compared, and the best one selected, without any knowledge of the true distribution beyond the distributions used in the approximations. The comparison amounts to choosing the approximation that contains the greatest amount of correlation. However, the problem of choosing the set of component distributions of a given complexity that yields the best approximation remained unsolved.

Brown presented an iterative method that gives the optimal approximation to the joint probability distribution of a set of binary variables, given the joint probability distributions of arbitrary subsets of the variables. Each step of the iteration gives an improved approximation, and the procedure converges to the approximation that is the minimum information extension of the component distributions employed.

To solve the problem left open by Lewis, Chow studied how best to approximate an n-th order distribution by a product of (n-1) second-order component distributions. A method was proposed for optimally approximating an n-th order discrete probability distribution by a product of second-order distributions, that is, by a distribution with first-order tree dependency. To find the optimal set of (n-1) first-order dependencies among the n variables, a procedure was derived that uses the closeness measure to give the approximation with the minimum difference in information. The optimal procedure is based on Kruskal's MWST (Maximum Weight Spanning Tree) algorithm and on maximizing the mutual information between pairs of variables. Minimizing the closeness measure is equivalent to maximizing the total weight of the first-order dependency tree.
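The first-order dependency tree procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: mutual information is estimated from sample frequencies, the function names `mutual_information` and `dependency_tree` and the sample data are hypothetical, and the sketch stops at the undirected tree without choosing a root or writing out the product of second-order distributions.

```python
from collections import Counter
from itertools import combinations
from math import log

def mutual_information(samples, i, j):
    """Estimate the mutual information between variables i and j from counts."""
    n = len(samples)
    count_i = Counter(s[i] for s in samples)
    count_j = Counter(s[j] for s in samples)
    count_ij = Counter((s[i], s[j]) for s in samples)
    # p(a,b) * log( p(a,b) / (p(a) p(b)) ), written in terms of counts.
    return sum((c / n) * log(c * n / (count_i[a] * count_j[b]))
               for (a, b), c in count_ij.items())

def dependency_tree(samples):
    """Kruskal's algorithm on edges sorted by decreasing mutual information."""
    n_vars = len(samples[0])
    edges = sorted(((mutual_information(samples, i, j), i, j)
                    for i, j in combinations(range(n_vars), 2)),
                   reverse=True)
    parent = list(range(n_vars))          # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:              # the edge joins two components
            parent[root_i] = root_j
            tree.append((i, j, weight))
    return tree

# Hypothetical samples of three variables; variable 1 always copies variable 0.
samples = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1), (0, 0, 0), (1, 1, 1)]
tree = dependency_tree(samples)
```

For these made-up samples the tree has two edges, and the first edge selected links variables 0 and 1, whose estimated mutual information equals log 2 because one variable determines the other.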

Wong and Liu presented a decision-based approach to recognizing discrete data. The distribution estimation used in this approach is a modification of the dependency tree procedure, based on the chi-square test of independence between features. Wang and Wong, working within a minimax error probability recognition scheme, dealt with high-order probability distributions in the recognition of discrete data and approximated them by products of low-order component distributions. To this end they defined the CP (Classes-Patterns) measure, which is used to approximate the distribution by minimizing the Bayes error. The approach proposed by Chow and Liu was compared with that of Wang and Wong.

Malvestuto presented a heuristic procedure for approximating an n-dimensional discrete probability distribution by a decomposable model of given complexity. Decomposable models are special interaction models with a number of desirable properties: they can be represented by Markov belief networks, and the approximations they generate have a product form. The dependency graph associated with a decomposable model of rank 2 is a forest of trees, known as a first-order dependency tree. However, to obtain the optimal approximation, all decomposable models of rank k must be evaluated.

As described above, for a decider such as a recognizer or a machine that selects one of a set of predefined decision candidates, the existing methods of combining multiple decisions include the voting method, the Bayesian method based on the independence assumption, and the Behavior-Knowledge Space (BKS) method, which is not based on the independence assumption.

The voting method and the Bayesian method, which are based on the independence assumption, have the problem that when dependent decisions are included, the combination of the decisions is biased and performance is expected to degrade as a result.

On the other hand, they have the advantage that the computation needed to combine the decisions is simple and the storage it requires is small.

The BKS method, in contrast, has the advantage of not needing the independence assumption, but it requires exponentially growing storage because it keeps the high-order probability distribution over the decisions to be combined as it is.

It also has the drawback that no combined result can be produced for patterns of decisions not seen in training, which raises the rejection rate.

For example, when K decisions are combined and there are L decision candidates, the storage complexity of the Bayesian method based on the independence assumption is [expression missing], and that of the BKS method is [expression missing].
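The contrast in storage between the two methods can be illustrated with a rough count. The figures below are assumptions made for this sketch and are not the expressions of the original text: the Bayesian method is assumed to keep one L-by-L confusion table per decider, and the BKS method is assumed to keep one count per class for every possible joint decision of the K deciders. The function names are hypothetical.

```python
def bayes_table_entries(K, L):
    """Assumed layout: K confusion tables of L x L counts each."""
    return K * L * L

def bks_table_entries(K, L):
    """Assumed layout: L**K joint decisions, each with L per-class counts."""
    return (L ** K) * L

# Hypothetical sizes: ten decision candidates, three or five deciders.
small = (bayes_table_entries(3, 10), bks_table_entries(3, 10))
large = (bayes_table_entries(5, 10), bks_table_entries(5, 10))
```

Under these assumptions, three deciders over ten candidates need 300 entries for the Bayesian method and 10,000 for the BKS method; with five deciders the figures become 500 and 1,000,000, which shows how quickly the joint table grows as deciders are added.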

Furthermore, as regards the approximation of high-order probability distributions, a method of determining the optimal approximating distribution by first-order dependency has already been presented, but no method of approximating a high-order probability distribution on the basis of second- or higher-order dependency has yet been presented.

Consequently, if one decider makes decisions that depend on the decisions of two or more other deciders, there is currently no method of approximating the probability distribution that takes this into account.

본 발명은 상기와 같은 문제점들을 해결하기 위하여 창안된 것으로,일 경우차 의존관계에 의한 확률 분포의 최적 곱 근사 방법을 제공하는 것을 목적으로 한다.The present invention has been made to solve the above-mentioned problems, If And to provide an optimal product approximation method of a probability distribution due to a differential dependency.

본 발명의 다른 목적은,차 의존관계에 의한 확률분포의 최적 곱 근사에서 구성 분포 항을 구하는 방법을 제공하는 것이다.Another object of the present invention is to provide And to provide a method for obtaining a constitutive distribution term in an optimal product approximation of a probability distribution by a differential dependence.

본 발명의 또 다른 목적은, 의존관계에 의한 다수 결정 결합 방법을 제공하는 것이다.It is still another object of the present invention to provide a method of combining multiple decisions by dependency.

FIG. 1 is a diagram showing the three levels of recognition results.

FIG. 2 is a block diagram of a general multiple recognizer system.

FIG. 3 shows, as equations, the method of minimizing the measure of closeness based on first-order dependency according to the present invention.

FIG. 4 shows, as equations, the method of minimizing the measure of closeness based on second-order dependency according to the present invention.

FIG. 5 shows the computed average mutual information values.

FIG. 6 shows the dependency tree obtained by first-order dependency.

FIG. 7 shows the product approximation results by second-order dependency and the sums of the average mutual information values.

FIG. 8 shows the first-order approximation obtained by minimizing the Bayes error rate.

Brief description of the symbols used in the detailed description of the invention

Set of $K$ decision-makers: $C = \{C_1, C_2, \ldots, C_K\}$

Set of $L$ decision candidates: $M = \{M_1, M_2, \ldots, M_L\}$

Input: $x$

Order of dependency: $k$


The operating principle and preferred embodiments of the present invention are now described in detail.

The present invention considers only the parallel combination of multiple decision-makers applied to the same input, under the assumption that each recognizer's decision is either a top-choice class or a ranking of classes. A multiple recognizer system is defined as a single unit consisting of a set of decision-makers arranged in parallel and a decision combination method, as shown in FIG. 2. Some social choice functions are applied to combining decision-makers that output a ranking of classes as the recognition result. The invention is intermediate between the conventional methods described above: it takes dependency into account in order to overcome their shortcomings while keeping their advantages.

The method of the present invention is based on a dependency-directed approximation, which optimally approximates a high-order probability distribution by a product of $(k+1)$-th-order distributions, and on a probabilistic combination method, which applies that optimal approximate distribution to the Bayesian method in order to combine multiple decisions.

The method requires more computation and more storage than the independence-based approaches, but it no longer needs the independence assumption, and it requires less computation and less storage than the BKS method. The product approximation also reduces the high rejection rate of the BKS method.

In this method, the dependency-directed approximation uses the measure of closeness proposed by Lewis to optimally approximate a high-order probability distribution by a product of $(k+1)$-th-order distributions that take $k$-th-order dependency into account.

The dependency is assumed to be determined probabilistically, solely by observing the decisions that the decision-makers produce on the training samples.

Because no method existed for optimally approximating a high-order probability distribution under higher-order dependency (that is, $k > 1$), the present invention systematically devises a method that includes an algorithm for higher-order dependency. The approximate distribution is then applied to the Bayesian method, and the combination formula is derived from the approximate distribution and Bayes' theorem.

Problem definition

To state the combination of multiple recognizers formally, some notation is defined using probability theory and the Bayesian method, and the process of combining multiple decision-makers is formulated as follows. When an input $x$ is given in parallel to $K$ decision-makers $C_1, C_2, \ldots, C_K$, the $K$-dimensional decision vector $D = (C_1(x) = M_1, C_2(x) = M_2, \ldots, C_K(x) = M_K)$ is observed.

Each $M_i$ ($1 \le i \le L$) is one of the $L$ decisions, or classes, in $M = \{M_1, M_2, \ldots, M_L\}$ at the abstract level; the subscript $i$ of a decision is unrelated to the index $i$ of the $i$-th recognizer $C_i$.

The main task in combining multiple decisions probabilistically is to determine the class $m^*$ that maximizes the posterior probability $P^*$:

$$P^* = P(m^* \mid D) = \max_{m \in M} P(m \mid C_1(x) = M_1, \ldots, C_K(x) = M_K).$$

To this end, the $(K+1)$-th-order probability distribution

$$P(m, C_1(x) = M_1, C_2(x) = M_2, \ldots, C_K(x) = M_K)$$

must be estimated from the training samples, where $m$ is the class variable of the input $x$.

In the present invention this is referred to as the $(K+1)$-th-order probability distribution.

To accomplish this task, the present invention provides a new method of computing the high-order probability for combining multiple decisions without the (conditional) independence assumption. However, estimating and storing the $(K+1)$-th-order probability distribution is exponentially complex and unmanageable (for example, $O(L^{K+1})$ in this formulation).

An approximation method is therefore required that optimally approximates the $(K+1)$-th-order probability distribution by a product of $(k+1)$-th-order distributions that take $k$-th-order dependency into account, where $1 \le k \le K$.

That is, the problem is to identify the optimal product approximation of the high-order probability distribution and to combine multiple decisions probabilistically with the Bayesian method, without the independence assumption.

A dependency-based methodology is proposed by taking the dependencies among recognizers into account and extending the work of Chow and Liu to higher-order dependencies. The methodology makes a weaker assumption than the combination methods based on the independence assumption, and it has a lower rejection rate, lower computational complexity, and smaller storage requirements than the BKS method.

To demonstrate the performance of the dependency-based methodology, recognition experiments with multiple recognizers were carried out on handwritten numerals collected at KAIST (Korea Advanced Institute of Science and Technology), on handwritten numerals from the standardized CENPARMI database of Concordia University, Canada, and on English letters collected at KAIST.

Various sets of multiple recognizers were run in each application domain and evaluated with various decision combination methods. The dependency-based combination methods of this methodology were compared with conventional methods: majority voting, the Borda count, the BKS method, and the Bayesian method based on the independence assumption.

The present invention described above consists of three main parts.

The first presents the theoretical basis for computing, with reference to sample data, the optimal product approximation set based on dependency of a specified order.

The second is an algorithm that actually obtains the optimal product approximation set on that basis.

The third applies the Bayesian method to the computed optimal product approximation set in order to combine the decisions of multiple decision-makers on the basis of dependency.

The theoretical part uses the definitions of the measure of closeness and of mutual information to show how the optimal product approximation distribution based on higher-order dependency can be obtained.

The algorithm part actually obtains the product approximation set according to that basis; the resulting set is then used in the Bayesian method so that multiple decisions can be combined probabilistically.

The present invention operates as follows.

First, with reference to the training sample data, the optimal product approximation set based on dependency of the specified order is computed using the theoretical basis and the algorithm.

Second, the Bayesian method is applied to the computed optimal product approximation set to combine the decisions of the multiple decision-makers.

In particular, the algorithm that computes the optimal product approximation set operates so that the high-order probability distribution is approximated by a product of lower-order distributions while the properties of a probability distribution are preserved.

For this purpose, a constraint is used in the defining equation of the product approximation and is applied when obtaining the possible sets of higher-order dependencies.

This is described in detail below.

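Let $C = \{C_1, C_2, \ldots, C_K\}$ be the set of decision-makers (recognizers), $M = \{M_1, M_2, \ldots, M_L\}$ the set of decision candidates, and $x$ the input. Write the class $M_i \in M$ as $C_{K+1}$, and abbreviate $C_j(x) = M_j$ as $C_j$, where $1 \le j \le K+1$.

The distribution $P(C_1, \ldots, C_{K+1})$ is then approximated in one of the following forms, where $(n_1, n_2, \ldots)$ is an unknown permutation of the variable indices and conditioning on index 0 means no conditioning, so that $P(C_{n_j} \mid C_0) = P(C_{n_j})$.

First-order dependency, where $0 \le i(j) < j$:

$$P_a(C_1, \ldots, C_{K+1}) = \prod_{j=1}^{K+1} P(C_{n_j} \mid C_{n_{i(j)}})$$

Conditional independence assumption:

$$P_a(C_1, \ldots, C_{K+1}) = P(C_{K+1}) \prod_{j=1}^{K} P(C_j \mid C_{K+1})$$

Second-order dependency, where $0 \le i2(j), i1(j) < j$:

$$P_a(C_1, \ldots, C_{K+1}) = \prod_{j=1}^{K+1} P(C_{n_j} \mid C_{n_{i2(j)}}, C_{n_{i1(j)}})$$

Conditional first-order dependency, where $0 \le i2(j) < j$:

$$P_a(C_1, \ldots, C_{K+1}) = P(C_{K+1}) \prod_{j=1}^{K} P(C_{n_j} \mid C_{n_{i2(j)}}, C_{K+1})$$

$k$-th-order dependency, where $0 \le ik(j), \ldots, i1(j) < j$:

$$P_a(C_1, \ldots, C_{K+1}) = \prod_{j=1}^{K+1} P(C_{n_j} \mid C_{n_{ik(j)}}, \ldots, C_{n_{i1(j)}})$$

Conditional $k$-th-order dependency, with the same index constraints:

$$P_a(C_1, \ldots, C_{K+1}) = P(C_{K+1}) \prod_{j=1}^{K} P(C_{n_j} \mid C_{n_{ik(j)}}, \ldots, C_{n_{i1(j)}}, C_{K+1})$$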

When the high-order probability distribution $P$ is approximated by a product of lower-order distributions as above, the ordering and conditioning indices that give the optimal approximate distribution $P_a$ must be determined from the permutations of $1, \ldots, K+1$ so that the difference from the actual distribution is small.

The measure of closeness proposed by Lewis is used to quantify the difference between the actual probability distribution and the approximate distribution.

The theoretical basis for determining the optimal approximate distribution based on second-order dependency with this measure is as follows.

In the expression below, the terms $M(\cdot)$ denote averaged mutual information, or simply mutual information.

To simplify the derivation, the subscript $n$ is omitted from the definition of the approximate distribution.

With $C = (C_1, \ldots, C_{K+1})$,

$$\begin{aligned}
I(P(C), P_a(C)) &= \sum_{C} P(C) \log \frac{P(C)}{P_a(C)} \\
&= \sum_{C} P(C) \log P(C) - \sum_{C} P(C) \sum_{j=1}^{K+1} \log P(C_j \mid C_{i2(j)}, C_{i1(j)}) \\
&= -\sum_{j=1}^{K+1} M(C_j; C_{i2(j)}, C_{i1(j)}) + \sum_{j=1}^{K+1} H(C_j) - H(C),
\end{aligned}$$

where $M(C_j; C_{i2(j)}, C_{i1(j)}) = \sum P(C_j, C_{i2(j)}, C_{i1(j)}) \log \dfrac{P(C_j \mid C_{i2(j)}, C_{i1(j)})}{P(C_j)}$, $H(C_j) = -\sum P(C_j) \log P(C_j)$, and $H(C) = -\sum_{C} P(C) \log P(C)$.

The optimal approximate distribution is obtained by minimizing $I(P(C), P_a(C))$.

This minimization is equivalent to maximizing the sum of the mutual information terms $M(C_j; C_{i2(j)}, C_{i1(j)})$, because the remaining terms are constants that do not depend on the choice of dependencies. Based on this theoretical result, the algorithm for obtaining the optimal approximate distribution based on second-order dependency is as follows.

In this algorithm, the mutual information values are used as weights for finding the optimal product approximation set.

Algorithm for obtaining the optimal product approximation distribution based on second-order dependency

Input: $s$ sample data $S^1, S^2, \ldots, S^s$.

Output: the optimal product approximation set based on second-order dependency, according to the measure of closeness.

Method:

1. Estimate the second-order and third-order distributions from the training sample data.

2. For every pair and every triple of variables obtained from the sample data, compute the weights $M(C_j; C_{i(j)})$ and $M(C_j; C_{i2(j)}, C_{i1(j)})$.

3. Compute the maximum total weight over the first-order and second-order dependencies, and obtain the associated optimal product approximation set.

max_total_weight = 0;
for n = 1 to (number of first-order dependencies) do
    total_weight = 0;
    select the n-th first-order dependency as the constraint;
    total_weight = weight of the selected first-order dependency;
    while (number of unselected decision-makers > 0) do
        select one of the unselected decision-makers;
        select one of the possible second-order dependencies associated with the selected decision-maker;
        total_weight = total_weight + weight of the selected second-order dependency;
    end
    if (total_weight > max_total_weight) then
        max_total_weight = total_weight;
        store max_total_weight and the associated set of first-order and second-order dependencies;
    end
end

The maximum total weight max_total_weight and the associated set of first-order and second-order dependencies are thus obtained.

4. Output the above result as the optimal product approximation set.
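The search in step 3 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the disclosure: the function name `best_second_order_set`, the callable `mi(target, given)` that returns a precomputed weight, and the greedy rule used inside the while loop (always take the heaviest admissible second-order dependency) are all assumptions introduced here; the text above does not specify how the second-order dependency is picked. The weights in `toy_mi` are made-up numbers for demonstration.

```python
from itertools import combinations

def best_second_order_set(n_vars, mi):
    """mi(target, given) returns the weight M(C_target; C_given...).
    Returns (max_total_weight, list of (target, conditioning variables))."""
    best_total, best_terms = float("-inf"), None
    # Try every first-order dependency (pair of variables) as the constraint.
    for a, b in combinations(range(n_vars), 2):
        selected = [a, b]
        terms = [(b, (a,))]
        total = mi(b, (a,))
        remaining = set(range(n_vars)) - {a, b}
        while remaining:
            # Assumption: greedily take the heaviest second-order dependency
            # whose two conditioning variables have already been selected.
            w, j, given = max(
                ((mi(j, pair), j, pair)
                 for j in remaining
                 for pair in combinations(selected, 2)),
                key=lambda t: t[0],
            )
            terms.append((j, given))
            total += w
            selected.append(j)
            remaining.remove(j)
        if total > best_total:
            best_total, best_terms = total, terms
    return best_total, best_terms

# Hypothetical weights for three variables (illustration only).
_TOY_WEIGHTS = {
    (1, (0,)): 0.5, (2, (0,)): 0.1, (2, (1,)): 0.2,
    (2, (0, 1)): 0.6, (1, (0, 2)): 0.55, (0, (1, 2)): 0.3,
}

def toy_mi(target, given):
    return _TOY_WEIGHTS[(target, given)]

total, terms = best_second_order_set(3, toy_mi)
```

With these toy weights the constraint pair (0, 1) wins, giving the term list `[(1, (0,)), (2, (0, 1))]`. In practice `mi` would be backed by mutual information values estimated from the training samples in step 2.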

An algorithm for obtaining the optimal product approximation set based on $k$-th-order dependency can be built on the second-order algorithm above, because the optimal product approximation set based on $k$-th-order dependency consists of one first-order dependency, one second-order dependency, ..., one $(k-1)$-th-order dependency, and $(K-k)$ $k$-th-order dependencies.

For example, an algorithm for obtaining the optimal product approximation set based on $k$-th-order dependency can be composed of $(k-1)$ nested for loops and one while loop.

for ... do  /* first-order dependency */
    for ... do  /* second-order dependency */
        ........
        for ... do  /* (k-1)-th-order dependency */
            while ( ... ) do  /* k-th-order dependency */
                ........
            end
        ........
end


Dependency-based methodology

The dependency-based methodology is outlined below, beginning with its theoretical background.

The dependency-based methodology for combining multiple decisions consists of two sequential steps: dependency-directed approximation, and probabilistic combination using the Bayesian formula. The dependency-directed approximation deals with the optimal product approximation of the $(K+1)$-th-order probability distribution by $k$-th-order dependency, where $1 \le k \le K$. The probabilistic combination then applies the optimal product approximation identified in the first step to the Bayesian formula in order to combine multiple decisions.
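The second step can be illustrated with a short Python sketch. It assumes that a dependency structure has already been chosen and is given as a hypothetical dictionary `parents` that maps each variable index to the indices it is conditioned on; the class variable is stored as the last element of each sample. The function names, the toy data, and the use of raw relative frequencies (no smoothing) are assumptions made for illustration.

```python
from collections import Counter

def train_product(samples, parents):
    """samples: tuples (c_1, ..., c_K, m). parents[j]: indices that
    variable j is conditioned on. Returns numerator and denominator counts."""
    num = {j: Counter() for j in parents}
    den = {j: Counter() for j in parents}
    for s in samples:
        for j, ps in parents.items():
            ctx = tuple(s[p] for p in ps)
            num[j][(s[j], ctx)] += 1
            den[j][ctx] += 1
    return num, den

def approx_prob(point, parents, num, den):
    """P_a(point) = product over j of P(point[j] | point[parents[j]])."""
    prob = 1.0
    for j, ps in parents.items():
        ctx = tuple(point[p] for p in ps)
        if den[j][ctx] == 0:
            return 0.0  # conditioning context never observed
        prob *= num[j][(point[j], ctx)] / den[j][ctx]
    return prob

def combine(decisions, classes, parents, num, den):
    """Return the class maximizing P_a(decisions, m), or None to reject."""
    scores = {m: approx_prob(tuple(decisions) + (m,), parents, num, den)
              for m in classes}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

# Toy data: two recognizers (indices 0, 1) and the class (index 2).
samples = [("a", "a", "a"), ("a", "a", "a"), ("a", "b", "a"),
           ("b", "b", "b"), ("b", "b", "b"), ("a", "b", "b")]
# Hypothetical structure: P(m) * P(C_0 | m) * P(C_1 | C_0, m).
parents = {2: (), 0: (2,), 1: (0, 2)}
num, den = train_product(samples, parents)
```

Because the posterior is proportional to the joint probability, taking the class that maximizes the approximate joint `P_a` is equivalent to maximizing the approximate posterior. A decision vector whose approximate probability is zero for every class is rejected.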

Recognizers are statistically dependent on one another, so assuming that they behave independently is undesirable. From this point of view, it is desirable to take the dependencies among recognizers into account when combining multiple recognizers.

First, background on the Bayesian formula is given so that probabilistic terminology can be used in the following sections, and the issues concerning dependency are explained. The advantages and disadvantages of the dependency-directed methodology are also discussed from the viewpoint of complexity analysis.

Background on the Bayesian method

For the Bayesian method of recognizing a given input pattern $x$, the following notation is needed.

The set of $L$ individual decisions, or classes, is denoted $M = \{M_1, M_2, \ldots, M_L\}$, and the set of $K$ recognizers is denoted $C = \{C_1, \ldots, C_K\}$. In the Bayesian method, the output of each recognizer is treated with the recognizer's own errors taken into account. The errors of each recognizer are described by its confusion matrix, so the Bayesian method requires a learning stage in which a confusion matrix is established for each recognizer. The element $n_{ij}^{(k)}$ of the confusion matrix is the number of samples of class $M_i$ to which recognizer $C_k$ assigned the label $M_j$. The total number of samples of class $M_i$ is $n_{i\cdot}^{(k)}$, and the total number of samples to which recognizer $C_k$ assigned the label $M_j$ is $n_{\cdot j}^{(k)}$:

$$n_{i\cdot}^{(k)} = \sum_{j=1}^{L+1} n_{ij}^{(k)}, \quad i = 1, \ldots, L,$$

$$n_{\cdot j}^{(k)} = \sum_{i=1}^{L} n_{ij}^{(k)}, \quad j = 1, \ldots, L+1.$$

For an event (that is, a decision) $C_k(x) = M_j$, the truth of the event is uncertain, and this uncertainty is described by the conditional probabilities that $x \in M_i$, $i = 1, \ldots, L$, is true.
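As a small illustration, the conditional probabilities can be estimated from a confusion matrix by dividing each element by its column total. The estimate $n_{ij}/n_{\cdot j}$ is the usual relative-frequency estimate and is an assumption here, since the text above does not spell out the formula; the function name and the example matrix are likewise hypothetical.

```python
def belief_from_confusion(confusion):
    """confusion[i][j]: number of samples of true class i that one
    recognizer labelled j. Returns bel[j][i], an estimate of
    P(x in M_i | C_k(x) = M_j) = n_ij / n_.j."""
    n_rows = len(confusion)
    n_cols = len(confusion[0])
    col_totals = [sum(confusion[i][j] for i in range(n_rows))
                  for j in range(n_cols)]
    return [
        [confusion[i][j] / col_totals[j] if col_totals[j] else 0.0
         for i in range(n_rows)]
        for j in range(n_cols)
    ]

# Two classes, no rejection column, made-up counts.
confusion = [[8, 2],
             [1, 9]]
bel = belief_from_confusion(confusion)
```

Here `bel[0]` holds the estimated probabilities of each true class given that the recognizer answered the first label. A column with no samples yields zeros rather than a division error.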

For $K$ recognizers, $K$ two-dimensional confusion matrices are obtained in the learning stage. When the same input pattern $x$ is recognized by the $K$ recognizers, $K$ events $C_k(x) = M_j$, $k = 1, \ldots, K$, $j = 1, \ldots, L$, occur. The problem is how to combine the $K$ individual events in order to select a winning class by group consensus. It is formulated by Equation 3, which determines the winning class $m^*$ that maximizes the posterior probability $P^*$:

$$P^* = P(m^* \mid D) = \max_{m \in M} P(m \mid C_1(x) = M_1, \ldots, C_K(x) = M_K). \qquad (3)$$

In Equation 3, the subscripts of the recognizers are unrelated to the subscripts of the classes. To estimate the $(K+1)$-th-order probabilities, the $K$ decisions obtained for each training sample must be accumulated and stored. However, estimating and storing the $(K+1)$-th-order probability distribution is exponentially complex and unmanageable. For this reason the (conditional) independence assumption is usually applied to simplify the computation of the $(K+1)$-th-order distribution. The present invention instead approximates the $(K+1)$-th-order distribution by lower-order component distributions on the basis of dependency, without the independence assumption.

Background on independence and product approximation

Xu proposed a Bayesian method under the assumption that recognizers behave independently of one another. With this method, if a new recognizer that is highly dependent on an existing member recognizer is added to the set, the combined decision becomes biased toward the dependent recognizers. Moreover, performance degrades when the dependent recognizers produce erroneous results. This is a significant weakness of combination methods based on the independence assumption.

Dependency means the ability of one recognizer to predict another. $k$-th-order dependency is defined as the ability of the decisions of $k$ recognizers to predict one recognizer. Such dependency can be measured statistically by computing the average mutual information, by measures of association in statistics, or by the interdependence redundancy measure of the event-covering method.

(Average) mutual information is chosen as the measure of dependency because it is derived from the measure of closeness and extends easily to higher-order dependency. It is defined as a quantitative measure of how much the occurrence of a particular event (that is, a decision) tells about the likelihood of some alternative.
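The following Python sketch estimates the average mutual information between one variable and one or more other variables from observed decision tuples, using relative frequencies. The function name and the toy samples are assumptions for illustration; natural logarithms are used, so values are in nats.

```python
from collections import Counter
from math import log

def mutual_information(samples, target, given):
    """Estimate M(C_target; C_given...) =
    sum P(t, g) * log(P(t, g) / (P(t) * P(g))) from decision tuples."""
    n = len(samples)
    joint = Counter()     # counts of (target value, conditioning values)
    p_target = Counter()  # counts of target value
    p_given = Counter()   # counts of conditioning values
    for s in samples:
        t = s[target]
        g = tuple(s[i] for i in given)
        joint[(t, g)] += 1
        p_target[t] += 1
        p_given[g] += 1
    total = 0.0
    for (t, g), c in joint.items():
        # (c/n) * log((c/n) / ((count_t/n) * (count_g/n)))
        total += (c / n) * log(c * n / (p_target[t] * p_given[g]))
    return total

# Variable 1 always agrees with variable 0; variable 2 is unrelated to it.
samples = [(0, 0, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1)]
mi_01 = mutual_information(samples, 0, (1,))
mi_02 = mutual_information(samples, 0, (2,))
mi_0_12 = mutual_information(samples, 0, (1, 2))
```

Passing a tuple with two indices as `given` yields the second-order weight $M(C_j; C_{i2(j)}, C_{i1(j)})$; longer tuples give higher-order weights in the same way.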

Dependency provides a theoretical basis for approximating a high-order probability distribution by a product of lower-order distributions. In the present invention, the dependency among recognizers is assumed to be measured probabilistically from the decisions of the $K$ recognizers on the training data set, and this dependency is used to approximate the high-order probability distribution. Two criteria can be applied when approximating a high-order distribution by $k$-th-order dependency. One criterion is to minimize the measure of closeness.

This criterion yields one product approximation for each class. The measure of closeness is defined mathematically as

$$I(P(C), P_a(C)) = \sum_{C} P(C) \log \frac{P(C)}{P_a(C)}.$$
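A minimal numerical sketch of the measure of closeness is given below. It assumes both distributions are supplied as dictionaries over the same outcomes and that the approximation is non-zero wherever the actual distribution is non-zero; the function name and the example distributions are hypothetical.

```python
from math import log

def closeness(p, p_a):
    """I(P, P_a) = sum over C of P(C) * log(P(C) / P_a(C)).
    Outcomes with P(C) = 0 contribute nothing and are skipped."""
    return sum(pc * log(pc / p_a[c]) for c, pc in p.items() if pc > 0)

# Actual joint distribution of two binary variables (made-up numbers).
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
# Approximation by the product of the marginals (each marginal is 0.5).
p_a = {c: 0.25 for c in p}

gap = closeness(p, p_a)
```

The measure is zero when the approximation equals the actual distribution and positive otherwise, so a smaller value indicates a closer approximation.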

The other criterion is to minimize the Bayes error rate. With the Bayes error rate, a single product approximation is generated for all classes. Minimizing the Bayes error rate $P_e$ amounts to minimizing the conditional entropy of the decision (class) variable given the recognizers, since

$$P_e \le \tfrac{1}{2} H(C_{K+1} \mid C_1, \ldots, C_K).$$

The measure of closeness is emphasized as the criterion for optimal approximation because it performs better than the Bayes error rate criterion. Without knowledge of the actual distribution, the high-order probability distribution is optimally approximated on the basis of dependency, and multiple decisions are combined probabilistically by applying the Bayesian method to the optimal approximate distribution. In the experiments, the proposed method was applied to the recognition of handwritten numerals and English letters.

Advantages and disadvantages of the method

Assume that there are $K$ recognizers and $L$ decisions.

The method deals with the $(K+1)$-th-order probability distribution and computes $(K+1)$-th-order probabilities from it. As explained above, storing and estimating this distribution without approximation is, in theory, exponentially complex and unmanageable.

If the $(K+1)$-th-order probability distribution is assumed to satisfy conditional independence, its storage complexity is $O(KL^2)$. That is,

$$P(C_1, \ldots, C_K, m) = P(m) \prod_{j=1}^{K} P(C_j \mid m).$$

Such an independence-based approximation needs only simple computation and little storage. However, the independence assumption is often too strong for practical applications.
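For comparison, the independence-based Bayesian combination can be sketched as below. This is a minimal illustration assuming the conditional independence factorization $P(m) \prod_j P(C_j \mid m)$ with relative-frequency estimates and no smoothing; the function names and toy data are hypothetical and are not taken from Xu's method as published.

```python
from collections import Counter

def train_independent(samples, labels):
    """Return class counts and, per recognizer, counts of (class, decision)."""
    class_counts = Counter(labels)
    cond_counts = [Counter() for _ in samples[0]]
    for decisions, m in zip(samples, labels):
        for k, d in enumerate(decisions):
            cond_counts[k][(m, d)] += 1
    return class_counts, cond_counts

def combine_independent(decisions, class_counts, cond_counts):
    """Pick the class maximizing P(m) * prod_k P(C_k = d_k | m)."""
    n = sum(class_counts.values())
    best, best_score = None, 0.0
    for m, cm in class_counts.items():
        score = cm / n
        for k, d in enumerate(decisions):
            score *= cond_counts[k][(m, d)] / cm
        if score > best_score:
            best, best_score = m, score
    return best  # None means every class scored zero

# Toy training data: decisions of two recognizers and the true class.
samples = [("a", "a"), ("a", "b"), ("b", "b"), ("b", "b")]
labels = ["a", "a", "b", "b"]
class_counts, cond_counts = train_independent(samples, labels)
```

Only one small table per recognizer is stored, which is why this approach is cheap; the price is that any dependency between the recognizers is ignored.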

The BKS method, on the other hand, must keep the $(K+1)$-th-order probability distribution without approximation. Its storage complexity is therefore $O(L^{K+1})$, which is exponential.

Although the BKS method needs no independence assumption, its drawback is that the storage grows exponentially even for small $K$. Moreover, the BKS method cannot make a decision when the observed joint $K$ decisions match no entry of the frequency table, so a high rejection rate is likely.
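The rejection behaviour described above can be seen in a minimal sketch of a BKS-style frequency table. The code assumes the table is indexed by the full tuple of decisions and that the most frequent class in a cell is returned; the function names and data are hypothetical.

```python
from collections import Counter, defaultdict

def train_bks(samples, labels):
    """Frequency table: joint decision tuple -> counts of true classes."""
    table = defaultdict(Counter)
    for decisions, m in zip(samples, labels):
        table[tuple(decisions)][m] += 1
    return table

def combine_bks(decisions, table):
    """Return the most frequent class in the matching cell,
    or None (rejection) when the joint decisions were never observed."""
    cell = table.get(tuple(decisions))
    if not cell:
        return None
    return cell.most_common(1)[0][0]

samples = [("a", "b"), ("a", "b"), ("a", "b"), ("b", "b")]
labels = ["a", "a", "b", "b"]
table = train_bks(samples, labels)
```

Any joint decision that does not appear in the training data, such as `("b", "a")` here, falls into an empty cell and is rejected, which is the source of the high rejection rate.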

As an intermediate between these two extremes, the dependency-based method approximates the $(K+1)$-th-order probability distribution by lower-order component distributions and combines multiple decisions without the independence assumption. When $k$-th-order dependency is considered, where $1 \le k \le K$, the storage complexity of the method lies between that of the independence-based method and that of the BKS method. The product approximation can also reduce the high rejection rate.
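To give a feel for the three storage requirements, the sketch below counts table entries. The counts for the independence-based and BKS methods follow directly from the tables they store; the count for the dependency-based method is only a rough upper bound assumed here (at most $K+1$ component tables, each over at most $k+1$ variables) and is not a figure taken from the text.

```python
def table_sizes(K, L, k):
    """Rough numbers of stored probabilities for K recognizers,
    L classes and dependency order k (upper-bound estimates)."""
    independent = K * L * L              # K tables P(C_j | m), each L x L
    bks = L ** (K + 1)                   # full joint table over K + 1 variables
    dependency = (K + 1) * L ** (k + 1)  # assumed bound: K + 1 component tables
    return independent, bks, dependency

independent, bks, dependency = table_sizes(K=5, L=10, k=2)
```

For five recognizers, ten classes, and second-order dependency, the three counts are 500, 1,000,000, and 6,000 respectively, so the dependency-based estimate sits between the two extremes.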

Dependency-Directed Optimal Approximation

When a high-order probability distribution is approximated, a criterion is needed to measure how close the approximation is to the true distribution. Such a criterion, based on an information-theoretic model, was developed by Lewis. It is called the measure of closeness, the measure of divergence, or relative entropy. The measure of closeness is defined as the difference between the information contained in the true distribution and the information contained in the approximate distribution. In the K-dimensional decision vector D, $C_j(x) = M_j$ is written simply as $C_j$.
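As a hedged illustration of the measure of closeness, the sketch below computes the relative entropy between a true distribution and an approximation, both given as dictionaries over the same outcomes. The function name and the dictionary representation are assumptions made for this example only.

```python
import math


def measure_of_closeness(p: dict, p_a: dict) -> float:
    """Relative entropy sum_C P(C) * log(P(C) / P_a(C)), in nats."""
    total = 0.0
    for outcome, prob in p.items():
        if prob == 0.0:
            continue  # terms with P(C) = 0 contribute nothing
        approx = p_a.get(outcome, 0.0)
        if approx == 0.0:
            return math.inf  # the approximation excludes an outcome that occurs
        total += prob * math.log(prob / approx)
    return total


if __name__ == "__main__":
    true_dist = {"a": 0.5, "b": 0.5}    # hypothetical true distribution
    approx = {"a": 0.25, "b": 0.75}     # hypothetical approximation
    print(measure_of_closeness(true_dist, approx))
```

The value is zero when the two distributions coincide and grows as the approximation departs from the true distribution.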

Without approximation, the (K+1)th-order probability distribution is converted into a product of lower-order distributions by the following chain rule.

$P(C_1, C_2, \dots, C_{K+1}) = P(C_1) P(C_2 \mid C_1) \cdots P(C_{K+1} \mid C_K, \dots, C_1) = \prod_{j=1}^{K+1} P(C_j \mid C_{j-1}, \dots, C_1)$

Chow and Liu addressed the optimization problem posed by Lewis and used Kruskal's maximum weight spanning tree (MWST) algorithm to best approximate an nth-order distribution of n binary variables by a first-order tree dependency, that is, by a product of (n-1) second-order component distributions. Because they concentrated only on distributions with first-order tree dependency, their method is not suited to higher-order dependencies. The present invention therefore proposes a dependency-directed method for approximating the (K+1)th-order probability distribution by an optimal product of a given kth-order dependency, where $1 \le k \le K$. The optimal product is then used to combine multiple decisions by the Bayesian method. The dependency is assumed to be determined probabilistically, solely by observing the recognizers' decisions on samples. The order of dependency can be raised to obtain a better approximation.

Using the measure of closeness, optimally approximating a probability distribution means minimizing the difference between the true distribution and the approximate distribution. The dependency-directed optimal approximation is discussed below in detail for the first-order, second-order, and kth-order dependencies.

First-Order Dependency Approximation

When the first-order dependency is considered, the approximate distribution is defined in terms of second-order distributions as follows.

$P_a(C_1, \dots, C_{K+1}) = \prod_{j=1}^{K+1} P(C_{n_j} \mid C_{n_{i(j)}})$,

where $0 \le i(j) < j$.

Thus $C_{n_{i(j)}}$ is the conditioning variable for $C_{n_j}$, where $n_1, n_2, \dots, n_K, n_{K+1}$ is a permutation of the integers 1, 2, ..., K, K+1. By definition, $P(C_{n_j} \mid C_0)$ is $P(C_{n_j})$.

For simplicity of notation, $n_j$ is written as j, and $C_1, C_2, \dots, C_K, C_{K+1}$ denote the (K+1) variables of the vector C.

When the first-order dependency-based method of Chow and Liu is applied to the approximation for combining K multiple decisions, the (K+1)th-order probability distribution P of the vector C is approximated by $P_a$, an optimal product of second-order or first-order distributions determined by minimizing the measure of closeness between P and $P_a$, as shown in FIG. 3.

As shown in FIG. 3, minimizing $I(P(C), P_a(C))$ is equivalent to maximizing the total average mutual information $\sum_j M(C_j; C_{i(j)})$, because all remaining terms are constants.
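FIG. 3 is not reproduced in this text. As a hedged reconstruction, the standard Chow and Liu argument behind this equivalence can be written as follows, assuming the first-order form of $P_a$, taking $M(C_j; C_0) = 0$, and using $H(\cdot)$ for entropy (a symbol introduced here only for illustration):

```latex
\begin{aligned}
I(P, P_a) &= \sum_{C} P(C) \log \frac{P(C)}{P_a(C)} \\
          &= \sum_{C} P(C) \log P(C) - \sum_{j=1}^{K+1} \sum_{C} P(C) \log P(C_j \mid C_{i(j)}) \\
          &= -H(C) - \sum_{j=1}^{K+1} M(C_j; C_{i(j)}) + \sum_{j=1}^{K+1} H(C_j)
\end{aligned}
```

Only the middle term depends on the choice of $i(j)$, so the closest approximation is the one with the largest total average mutual information.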

The next step is to identify the optimal set of first-order dependencies among the $(K+1)^{(K+1)-2}$ spanning trees of a graph G with (K+1) nodes. The average mutual information is computed for every pair of nodes in G and assigned as the edge weight; the MWST algorithm is then applied to G, and the best dependency tree is finally selected. From the selected best dependency tree, both the permutation $n_j$ and its conditioning permutation $n_{i(j)}$ can be determined.
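The tree selection described above can be sketched as follows. This is a minimal illustration, assuming the distribution is estimated from a list of sample tuples (one value per variable) and using a plain Kruskal procedure with a union-find structure; the function names are hypothetical and the code is not taken from the patent.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(samples, a, b):
    """Average mutual information M(C_a; C_b) estimated from sample counts (nats)."""
    n = len(samples)
    count_a = Counter(s[a] for s in samples)
    count_b = Counter(s[b] for s in samples)
    count_ab = Counter((s[a], s[b]) for s in samples)
    return sum((c / n) * math.log(c * n / (count_a[x] * count_b[y]))
               for (x, y), c in count_ab.items())


def best_dependency_tree(samples, num_vars):
    """Kruskal's algorithm on edges sorted by decreasing mutual information."""
    edges = sorted(((mutual_information(samples, a, b), a, b)
                    for a, b in combinations(range(num_vars), 2)), reverse=True)
    root = list(range(num_vars))

    def find(v):
        while root[v] != v:
            root[v] = root[root[v]]  # path halving
            v = root[v]
        return v

    tree = []
    for weight, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:          # the edge joins two components, so keep it
            root[ra] = rb
            tree.append((a, b, weight))
    return tree


if __name__ == "__main__":
    # Hypothetical data: variable 1 copies variable 0, variable 2 is unrelated.
    data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
    print(best_dependency_tree(data, 3))
```

Rooting the returned tree at any node and listing nodes in breadth-first order gives one valid permutation together with the conditioning variable of each node.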

On the other hand, if the conditioning variable $C_{i(j)}$ in Equation 9 is the same variable, $C_{K+1}$, for every j, that is, if the $C_j$ are assumed to be conditionally independent of one another given $C_{K+1}$, the approximate distribution is defined in terms of second-order distributions as follows.

$P_a(C_1, \dots, C_{K+1}) = P(C_{K+1}) \prod_{j=1}^{K} P(C_j \mid C_{K+1})$

Here, each $C_j$ is forced to be conditioned only on $C_{K+1}$.

This approximation is known as the conditional independence assumption and can be defined as a special case of the first-order dependency approximation. The details are as follows.

Second-Order Dependency Approximation

When the second-order dependency is considered, the approximate distribution is defined in terms of third-order distributions as follows.

$P_a(C_1, \dots, C_{K+1}) = \prod_{j=1}^{K+1} P(C_{n_j} \mid C_{n_{i_2(j)}}, C_{n_{i_1(j)}})$,

where $0 \le i_2(j), i_1(j) < j$.

Thus $C_{n_{i_2(j)}}$ and $C_{n_{i_1(j)}}$ are the conditioning variables for $C_{n_j}$, where $n_1, n_2, \dots, n_K, n_{K+1}$ is a permutation of the integers 1, 2, ..., K, K+1. By definition, $P(C_{n_j} \mid C_0, C_{n_{i_1(j)}})$ is $P(C_{n_j}, C_{n_{i_1(j)}})$.

The measure of closeness can likewise be used to optimally approximate the distribution by a product of second-order dependencies, as shown in FIG. 4.

In FIG. 4, minimizing $I(P(C), P_a(C))$ is equivalent to maximizing $\sum_j M(C_j; C_{i_2(j)}, C_{i_1(j)})$, the total average mutual information satisfying the given constraint, because all remaining terms are constants. The next step is therefore to identify the optimal second-order dependency-based product approximation set among all permissible product sets. The process for identifying this set is described algorithmically as follows.

Method:

max_total_weight = 0;

total_weight = 0;

end
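The listing above is incomplete in this text. As a rough sketch of the search it describes, and not a reconstruction of the original procedure, the code below exhaustively tries every permutation of the variables and, for each position, picks the pair of earlier variables with the largest average mutual information. The exhaustive strategy, the sample-based estimates, and all function names are assumptions; `max_total_weight` and `total_weight` merely echo the surviving lines.

```python
import math
from collections import Counter
from itertools import combinations, permutations


def mutual_information(samples, target, given):
    """M(C_target; C_given...), treating the conditioning variables as one composite."""
    n = len(samples)
    count_x = Counter(s[target] for s in samples)
    count_y = Counter(tuple(s[g] for g in given) for s in samples)
    count_xy = Counter((s[target], tuple(s[g] for g in given)) for s in samples)
    return sum((c / n) * math.log(c * n / (count_x[x] * count_y[y]))
               for (x, y), c in count_xy.items())


def best_second_order_product(samples, num_vars):
    """Exhaustive search for the second-order product set of maximum total weight."""
    max_total_weight, best_terms = 0.0, None
    for order in permutations(range(num_vars)):
        # The second variable can only depend on the first (one first-order term).
        terms = [(order[1], (order[0],))]
        total_weight = mutual_information(samples, order[1], (order[0],))
        for j in range(2, num_vars):
            # Best pair of earlier variables for the j-th variable in this order.
            pair = max(combinations(order[:j], 2),
                       key=lambda g: mutual_information(samples, order[j], g))
            terms.append((order[j], pair))
            total_weight += mutual_information(samples, order[j], pair)
        if best_terms is None or total_weight > max_total_weight:
            max_total_weight, best_terms = total_weight, terms
    return max_total_weight, best_terms


if __name__ == "__main__":
    # Hypothetical data: the fourth variable is the XOR of the first two.
    data = [(a, b, c, a ^ b) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
    print(best_second_order_product(data, 4))
```

The search visits all (K+1)! orderings, so it is practical only for a handful of recognizers; it is meant to show what is being maximized, not how to do so efficiently.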

On the other hand, if $C_{n_{i_1(j)}}$ in Equation 11 is the same for all j, that is, if $C_{n_j}$ is assumed to depend conditionally on $C_{n_{i_2(j)}}$ given $C_{K+1}$, the approximate distribution is defined in terms of third-order distributions as follows.

$P_a(C_1, \dots, C_{K+1}) = \prod_{j=1}^{K} P(C_{n_j} \mid C_{n_{i_2(j)}}, C_{K+1})$

Here, $0 \le i_2(j) < j$. $C_{n_{i_2(j)}}$ and $C_{K+1}$ are the conditioning variables for $C_{n_j}$, where $n_1, n_2, \dots, n_K$ is a permutation of the integers 1, 2, ..., K. By definition, $P(C_{n_j} \mid C_0, C_{K+1})$ is $P(C_{n_j}, C_{K+1})$.

This approximation is known as the conditional first-order dependency assumption and can be defined as a special case of the second-order dependency approximation. The details are as follows.

With the new dependency-directed approximation, the order of dependency considered can readily be extended, in a systematic way, up to the kth order for approximating a high-order probability distribution. A kth-order dependency product set consists of one first-order dependency, one second-order dependency, ..., one (k-1)th-order dependency, and (K-k) kth-order dependencies.

$P_a(C_1, \dots, C_{K+1}) = \prod_{j=1}^{K+1} P(C_{n_j} \mid C_{n_{i_k(j)}}, \dots, C_{n_{i_1(j)}})$,

where $0 \le i_k(j), \dots, i_1(j) < j$.

An algorithm for the optimal set of kth-order dependencies can be devised from the algorithm proposed for the second-order dependency by adding a recursive routine that finds one of the permissible kth-order dependencies associated with the selected recognizers, in place of a second-order dependency, and nested for loops that handle the second-order through (k-1)th-order dependencies.

Product Approximation by Second-Order Dependency

FIG. 5 shows the average mutual information values for computing the first-order and second-order dependencies from a fourth-order probability distribution composed of the three decision variables $C_1, C_2, C_3$ of three recognizers E1, E2, E3 and a hypothesis decision variable $C_4$.

Applying the conditional independence assumption to this fourth-order probability distribution gives the result in the first line below, and applying the dependency tree method to the same distribution, as shown in FIG. 6, gives the result in the second line.

M = 6.717977

P_a(C₁,C₂,C₃,C₄) = P(C₁,C₄)P(C₃|C₄)P(C₂|C₄₃) : M = 6.728495*

In the above expressions, * denotes the optimal result.

To take more of the available information into account, the proposed algorithm for the second-order dependency is applied to the same true distribution, as shown in FIG. 7, with the following result.

From the results of FIG. 7, any one of $P(C_4,C_2)P(C_1 \mid C_4,C_2)P(C_3 \mid C_4,C_2)$, $P(C_4,C_2)P(C_3 \mid C_4,C_2)P(C_1 \mid C_4,C_2)$, $P(C_4,C_3)P(C_2 \mid C_4,C_3)P(C_1 \mid C_4,C_2)$, $P(C_1,C_2)P(C_4 \mid C_1,C_2)P(C_3 \mid C_4,C_2)$, or $P(C_2,C_3)P(C_4 \mid C_2,C_3)P(C_1 \mid C_4,C_2)$ can be selected as the optimal product set for the second-order dependency, because these product sets have the maximum weight of second-order average mutual information.

First-Order Dependency Approximation by Minimizing the Bayes Error Rate

As another criterion, minimization of the Bayes error rate is now applied to approximating a high-order probability distribution by first-order dependency. For the approximation, the decision-classifiers (DC) mutual information is defined as follows.

$M(m; C) = \sum_{m} \sum_{C} P(m, C) \log \frac{P(m, C)}{P(m) P(C)}$,

where m is the decision variable and C is the set of decision variables of the recognizers. As mentioned above, minimizing the Bayes error rate amounts to maximizing the DC mutual information.

When the first-order dependency is considered in the product approximation by minimizing the Bayes error rate, the approximate distribution is defined in terms of third-order distributions as follows.

$P_a(m, C) = \prod_{j=1}^{K} P(C_{n_j} \mid C_{n_{i_2(j)}}, m)$

Here, $0 \le i_2(j) < j$. $C_{n_{i_2(j)}}$ and m are the conditioning variables for $C_{n_j}$, where $n_1, n_2, \dots, n_K$ is a permutation of the integers 1, 2, ..., K. By definition, $P(C_{n_j} \mid C_0, m)$ is $P(C_{n_j}, m)$.

In addition to the expression for $P_a$, the probability distribution of the recognizers' decision variables is defined in terms of second-order distributions as follows.

$P_a(C) = \prod_{j=1}^{K} P(C_{n_j} \mid C_{n_{i_2(j)}})$

Here, $0 \le i_2(j) < j$, and $C_{n_{i_2(j)}}$ is the conditioning variable for $C_{n_j}$, where $n_1, n_2, \dots, n_K$ is a permutation of the integers 1, 2, ..., K.

If the probability distributions $P_a(m, C)$ and $P_a(C)$ are substituted into Equation 15 in place of $P(m, C)$ and $P(C)$, respectively, the following expression is obtained. From FIG. 8, maximizing $M(m; C)$ is equivalent to maximizing the total average mutual information satisfying the given constraint, because all remaining terms are constants.

Meanwhile, the approximation for the second-order dependency is as follows.

Application to the Bayesian Method

To combine multiple decisions probabilistically, the approximate distribution obtained from the optimal product set of kth-order dependencies is applied to the Bayesian method. The combination formula is derived using Bayes' theorem and the approximate distribution.

Applying the First-Order Dependency to the Bayesian Method

For each class $M_i$, using the first-order dependency and the Bayesian method, and letting a class term $M_i \in M$ be expressed as $C_{K+1}(x) = M_{K+1}$, the following formula is obtained.

$Bel(M_{K+1}) = \eta \prod_{j=1}^{K+1} P(C_{n_j} \mid C_{n_{i(j)}})$

Here, $\eta$ is the constant that makes $\sum_i Bel(M_i) = 1$.

Here, $n_1, n_2, \dots, n_K, n_{K+1}$ is a permutation of the integers 1, 2, ..., K, K+1. Based on the $Bel(M_{K+1})$ values computed by combining the multiple decisions, the class with the maximum posterior probability is selected. The combined decision is assigned to class m according to the decision rule E(D), given as follows:

if $Bel(M_m) = \max_{M_i \in M} Bel(M_i)$, then E(D) = m;

otherwise, E(D) = L + 1 (which denotes rejection).
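A minimal sketch of this combination step follows. It assumes the dependency tree is given as a list of (child, parent) pairs over variable indices, with the hypothesized class as the last variable, and that each term has a lookup table of estimated probabilities. Rejecting when every belief is zero is an assumption of this sketch, since the exact rejection condition is not recoverable from this text; all names and probabilities are hypothetical.

```python
def belief(decisions, candidate, terms, tables):
    """Unnormalized Bel: product of P(child | parent) over the dependency terms."""
    values = list(decisions) + [candidate]  # last variable is the hypothesized class
    bel = 1.0
    for child, parent in terms:
        key = (values[child], None if parent is None else values[parent])
        bel *= tables[(child, parent)].get(key, 0.0)  # unseen entries count as 0
    return bel


def combine(decisions, classes, terms, tables, reject_label):
    """Return the class with maximum normalized belief, or reject_label."""
    scores = {m: belief(decisions, m, terms, tables) for m in classes}
    total = sum(scores.values())
    if total == 0.0:  # assumed rejection condition: no class has any support
        return reject_label, scores
    scores = {m: s / total for m, s in scores.items()}  # eta = 1 / total
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Two recognizers (variables 0 and 1) and the class (variable 2), two classes.
    terms = [(2, None), (0, 2), (1, 0)]
    tables = {
        (2, None): {(0, None): 0.5, (1, None): 0.5},
        (0, 2): {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.2, (1, 1): 0.8},
        (1, 0): {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.4, (1, 1): 0.6},
    }
    print(combine((0, 0), [0, 1], terms, tables, reject_label=2))
```

Because a missing table entry contributes zero rather than stopping the computation, the product form can still rank classes when the exact joint decision pattern was never observed in training.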

Applying the Second-Order Dependency to the Bayesian Method

If the optimal set of second-order dependencies is used as the higher-order dependency, the combination method is expressed as follows.

$Bel(M_{K+1}) = \eta \prod_{j=1}^{K+1} P(C_{n_j} \mid C_{n_{i_2(j)}}, C_{n_{i_1(j)}})$

Here, $\eta$ is the constant that makes $\sum_i Bel(M_i) = 1$.

Here, $n_1, n_2, \dots, n_K, n_{K+1}$ is a permutation of the integers 1, 2, ..., K, K+1. The decision rule E(D) above is also applied to the combination of multiple decisions based on the second-order dependency.

Embodiments of the present invention are described below.

Experiments

The first experiment used a standardized data set of totally unconstrained offline handwritten numerals distributed by the CENPARMI laboratory at Concordia University, Canada. The data set is divided into two different training data sets (A, B) and a test data set (T). Each data set contains 200 characters per class for the 10 numerals.

The second experiment used totally unconstrained online handwritten numerals and English letters collected at KAIST, in particular to demonstrate the effect of highly dependent recognizers. In these experiments, the dependency-based method is compared with other decision combination methods, and the performance of the proposed method is evaluated.

Recognition of Unconstrained Offline Handwritten Numerals

Experiments are discussed on the recognition results of the individual recognizers NN1, NN2, NN3, R1, and R2 and on the results of combining them. Rejected recognition results of the recognizers are excluded when identifying the optimal set of kth-order dependencies. Recognizers NN1, NN2, and NN3 are neural network recognizers; recognizers R1 and R2 are rule-based recognizers. These recognizers were developed at KAIST and Chonbuk National University. In particular, NN1 and NN2 were developed without rejection-type results.

Table 1 shows the performance and rejection rates of the individual recognizers on the test data set (T).

[Table 1]

Performance of individual recognizers on test set T

In the table, the Classifier column identifies the recognizer, and the Recognition rate and Rejection rate columns give its recognition rate and rejection rate.

Experiments were performed on the test data set (T) using combination methods such as the majority voting method, the Borda count method, the BKS method, and the conditional independence assumption-based Bayesian method, together with the proposed kth-order dependency-based Bayesian methods.

The first experiment of the present invention combined three recognizers selected from the five. There are 10 possible combinations of three recognizers.
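The combination counts used in these experiments can be checked by enumerating the subsets directly. The short sketch below uses the recognizer names given in the text; the variable names are illustrative only.

```python
from itertools import combinations

recognizers = ["NN1", "NN2", "NN3", "R1", "R2"]

# Every way of choosing three or four recognizers out of the five.
triples = list(combinations(recognizers, 3))
quadruples = list(combinations(recognizers, 4))

if __name__ == "__main__":
    for group in triples:
        print(group)
    print(len(triples), len(quadruples))
```

Enumerating the subsets this way also gives a convenient loop for evaluating each combination case in turn.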

As shown in Table 2, in every combination case the best recognition rate was obtained by the second-order dependency-based Bayesian method (that is, the second-order dependency), excluding the Borda count method, which is a rank-level combination method.

[Table 2]

Combination performance of three recognizers

The second experiment of the present invention combined four recognizers selected from the five. There are five possible combinations of four recognizers.

As shown in Table 3, in every combination case the best recognition rate was obtained by the conditional first-order dependency-based Bayesian method (that is, the conditional first-order dependency), again excluding the Borda count method. Interestingly, the performance of the BKS method degraded when four recognizers were combined. It can be inferred that insufficient data for building the BKS table increases the rejection rate.

[Table 3]

Combination performance of four recognizers

The third experiment of the present invention combined all five recognizers. As shown in Table 4, the best recognition rate was obtained by the conditional first-order dependency-based Bayesian method (that is, the conditional first-order dependency). The performance of the BKS method degraded because of its increased rejection rate.

[Table 4]

Combination performance of five recognizers

In all experimental results, the kth-order dependency-based Bayesian methods showed better recognition rates than the conditional independence-based Bayesian method and the BKS method. In most cases, the recognition rates obtained by the second-order dependency approximation, including the conditional first-order dependency, were higher than those obtained by the first-order dependency approximation. The degraded recognition rate of the BKS method was due to the training data set being neither representative nor large enough. Although the proposed method requires more storage than the conditional independence assumption-based Bayesian method, depending on the dependency order k considered, incorporating the product approximation by kth-order dependency into the Bayesian method contributes to improving the performance of multiple classifier combination without the independence assumption.

Of the 53 samples rejected by the BKS method in Table 2, 21 are correctly recognized by the conditional independence assumption-based Bayesian method or the first-order dependency-based Bayesian method, whereas 34 are correctly recognized by the conditional first-order dependency-based or second-order dependency-based Bayesian method. These results explain why the kth-order dependency-based Bayesian methods show good recognition rates. The best recognition rates were achieved mainly by the Borda count method. However, the Borda count method combines decisions at the rank level, whereas the dependency-based Bayesian method combines decisions at the abstract level. It is therefore a favorable result that the dependency-based Bayesian method sometimes achieves the best recognition rate.

Recognition of Unconstrained Online Handwritten Numerals

This experiment used a data set of totally unconstrained online handwritten numerals collected at KAIST. The training data set contains 4,088 characters written by 13 people, and the test data set contains 988 characters written by 10 people. The writers of the training data set did not contribute to the test data set. The recognizers (H1, H2, H3) were developed at KAIST using hidden Markov models, with the related data set and statistical modeling methods. Table 5 shows the performance of the individual recognizers on the test data set, together with their rejection rates.

[Table 5]

Performance of individual recognizers on the test set

Experiments were performed on the test data set to compare the dependency-directed approach with the same earlier methods. Rejected recognition results of the recognizers were included when determining the optimal product set of kth-order dependencies. Table 6 shows the performance of the combined recognizers.

As shown in Table 6, the best recognition rate was obtained by the second-order dependency-based Bayesian method or the conditional first-order dependency-based Bayesian method.

Of the 26 samples rejected by the BKS method in Table 6, 12 are correctly recognized by the conditional independence assumption-based Bayesian method and 13 by the first-order dependency-based Bayesian method, whereas 15 are correctly recognized by the conditional first-order dependency-based or second-order dependency-based Bayesian method. These results, too, explain why the kth-order dependency-based Bayesian methods show good recognition rates.

[Table 6]

Performance of combined recognizers on the test numeral set

From the experimental results in Table 6, the kth-order dependency-based Bayesian methods of the proposed dependency-based approach show better forced recognition rates than the conditional independence assumption-based Bayesian method and the BKS method. The rejection rate was reduced by the kth-order dependency-based Bayesian methods, whereas the error rate increased because the decision rule E(D) lacks a rejection criterion. Consequently, these methods did not fare well when performance was evaluated with the performance model (10*E+R) or (8*E+2*R), where E is the error rate and R is the rejection rate. The kth-order dependency-based Bayesian methods are therefore appropriate when forced recognition performance is the only major concern. Otherwise, a rejection criterion should be incorporated into the decision rule E(D) to minimize the error rate.
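The two performance models mentioned above are weighted sums of the error rate and the rejection rate. The sketch below evaluates them for made-up rates, which are placeholders and not results from the tables; the function name is hypothetical, and treating a lower score as better is an assumption of this example.

```python
def weighted_cost(error_rate, rejection_rate, error_weight=10.0, rejection_weight=1.0):
    """Performance model error_weight * E + rejection_weight * R."""
    return error_weight * error_rate + rejection_weight * rejection_rate


if __name__ == "__main__":
    # Hypothetical rates in percent, for illustration only.
    forced = {"E": 4.0, "R": 0.0}      # no rejection, more errors
    cautious = {"E": 2.0, "R": 5.0}    # some rejection, fewer errors
    for name, r in (("forced", forced), ("cautious", cautious)):
        print(name,
              weighted_cost(r["E"], r["R"]),             # 10*E + R
              weighted_cost(r["E"], r["R"], 8.0, 2.0))   # 8*E + 2*R
```

Because errors are weighted much more heavily than rejections, a method that never rejects can score worse under these models even when its forced recognition rate is higher.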

The degraded recognition rate of the BKS method is due to the lack of a sufficiently large and representative training data set. By contrast, the recognition rate of the kth-order dependency-based Bayesian methods improved because samples that would otherwise be rejected are handled and recognized correctly through the product approximation.

Although the proposed method requires more storage than the conditional independence assumption-based Bayesian method, depending on the dependency order k considered, it incorporates the product approximation by kth-order dependency into the Bayesian method and contributes to improving the performance of multiple recognizer combination without the independence assumption.

Adding a Dependent Recognizer to a Multiple Recognizer System

In addition to the above, the three recognizers H1, H2, and H3 were developed and used in experiments on the recognition of totally unconstrained online handwritten numerals and English letters. These recognizers are the components of the base system. To demonstrate the effectiveness of the dependency-based method, several multiple recognizer systems were built by adding to the base system a highly dependent recognizer produced by faking one of its component recognizers.

For example, the H1 faking system consists of the component recognizers plus a recognizer of the same kind as H1. To identify the optimal product set of kth-order dependencies, 4,088 numerals written by 13 people, 3,749 lowercase letters written by 19 people, and 2,464 uppercase letters were used as training data sets. As test data sets, 988 numerals written by 10 people, 1,684 lowercase letters written by 9 people, and 1,169 uppercase letters were used. In each application domain, the writers of the training data set differ from those of the test data set. Rejected recognition results of the recognizers were excluded when identifying the optimal product set of kth-order dependencies; that is, only valid recognition results were considered. Table 7 shows the recognition rates of the first candidate of the component recognizers on the test data sets.

[Table 7]

Recognition rates of the recognizers on the test data sets

From the experimental results on numeral data in Table 8, the kth-order dependency-based combination methods perform better than the conditional independence assumption-based Bayesian method and the BKS method. For the lowercase letter data in Table 9, the best recognition rate was obtained by the conditional first-order dependency-based Bayesian method for all multiple recognizer systems. For the uppercase letter data in Table 10, the best recognition rate was obtained by the conditional first-order dependency-based Bayesian method for all multiple recognizer systems except the H3 faking system.

[Table 8]

Recognition rates of the multiple recognizer systems on numeral data

For the H3 faking system, the best recognition rate was obtained by the second-order dependency-based Bayesian method.

When higher-order dependencies are considered, the recognition rate increases, as shown in Tables 8 through 10, but so does the complexity of identifying the optimal product set. In the conditional independence assumption-based Bayesian method, no identification step is needed, so the complexity is O(1) regardless of the number of recognizers. For the first-order dependency, identifying the optimal product set requires O(N log N), where N is the total number of edges of the graph G.
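As a small illustration of the first-order case, the sketch below counts the edges and spanning trees of the graph G, assuming G is the complete graph on the K+1 variables (every pair of nodes receives a mutual-information weight). The function names are hypothetical, and the sorting cost is only the usual N log N estimate for Kruskal's algorithm.

```python
import math


def num_edges(K: int) -> int:
    """Edges of a complete graph on K+1 nodes."""
    return math.comb(K + 1, 2)


def num_spanning_trees(K: int) -> int:
    """Cayley's formula for a complete graph on K+1 nodes: (K+1)^(K-1)."""
    return (K + 1) ** (K - 1)


def sort_cost(K: int) -> float:
    """Rough N log N comparison count for sorting the edges."""
    n = num_edges(K)
    return n * math.log2(n)


if __name__ == "__main__":
    for K in (3, 4, 5):  # three to five recognizers plus the class variable
        print(K, num_edges(K), num_spanning_trees(K), round(sort_cost(K), 1))
```

The number of candidate trees grows much faster than the number of edges, which is why sorting the edges once is preferable to enumerating the trees.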

또한, 2차 의존관계의 경우에, 최적 곱 집합에 대한 식별 복잡도로서 O(NN)이 필요하며, 여기서, 첫 번째 N은 상기 1차 의존관계에서와 동일하고, 두 번째 N은 선택된 1차 의존관계 기반 허용가능 2차 의존관계의 전체 숫자 즉,이다. 여기서 C(d,2)는 통계학적 조합 함수이다. 고차 의존관계를 고려하는 것이 복잡하더라도, 인식률은 증가될 수 있고 식별에 있어서 계산 복잡도는 훈련 단계에서 단 한번만 수행되므로, 고차 의존관계를 고려하는 것이 바람직하다. 하지만, 고차 의존관계 고려가 언제나 고 성능을 보장하진 않는다.Also, in the case of a second dependency relationship, O (NN) is required as the identification complexity for the optimal product set, where the first N is equal to the first dependency and the second N is the selected first dependency The total number of relationship-based allowable secondary dependencies, to be. Where C (d, 2) is a statistical combination function. Although it is complicated to consider higher order dependencies, it is desirable to consider the higher order dependence because the recognition rate can be increased and the computational complexity in identification is only performed once in the training phase. However, higher order dependency considerations do not always guarantee high performance.
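The O(N log N) cost quoted for the first-order case is what one gets from sorting N candidate edges before building a maximum-weight dependence tree. The following is a minimal sketch of that idea, not the procedure of the invention itself: it assumes that the edge weight is the (unconditional) mutual information estimated from training decisions and that the tree is built with Kruskal's algorithm. The names `mutual_information` and `max_weight_tree` and the sample data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(samples, a, b):
    """Empirical mutual information between columns a and b of the samples."""
    n = len(samples)
    joint = Counter((s[a], s[b]) for s in samples)
    count_a = Counter(s[a] for s in samples)
    count_b = Counter(s[b] for s in samples)
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def max_weight_tree(samples, num_vars):
    """Kruskal's algorithm on edges sorted by decreasing weight (assumed search)."""
    edges = sorted(
        ((mutual_information(samples, a, b), a, b)
         for a, b in combinations(range(num_vars), 2)),
        reverse=True,
    )  # sorting the N edges is the O(N log N) step
    parent = list(range(num_vars))

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for _, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # keep the edge only if it joins two components
            parent[root_a] = root_b
            tree.append((a, b))
    return tree


# Hypothetical decisions of three recognizers on six training samples.
samples = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1), (0, 0, 0), (1, 1, 1)]
tree = max_weight_tree(samples, 3)
print(tree)
```

In this toy data the first two recognizers always agree, so the edge between them carries the largest weight and is selected first.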

[Table 9]

Recognition rates of the multiple-recognizer systems on the English lowercase data

In summary, the performance of the BKS method does not change even when a highly dependent recognizer is added to the Base system. In most cases, the recognition rate obtained by a higher-order dependency-based Bayesian method is higher than that obtained by a lower-order dependency-based Bayesian method. The lower recognition rates obtained by the second-order dependency-based Bayesian method on the English lowercase and uppercase data are attributable to the lack of a sufficiently large and representative training data set.

[Table 10]

Recognition rates of the multiple-recognizer systems on the English uppercase data

Incorporating kth-order dependency into the Bayesian method improves the performance of multiple-recognizer combination, particularly when a highly dependent recognizer is included. The difference in recognition rate between the Bayesian method based on the conditional independence assumption and the Bayesian method based on kth-order dependency is generally statistically significant by a t-test at the 0.01 significance level. As an illustration, a t-test on the Base system is described below; it examines whether the improvement in correctness obtained by considering dependency is statistically significant for the English uppercase data. The degradation caused by adding a highly dependent recognizer is an instance of the problem noted above. Adding arbitrary recognizers to a multiple-recognizer system in a naive way, and applying voting methods or social choice functions to that system in order to reach good performance, therefore raises a significant problem.

H₀: The mean correctness of the second-order dependency-based Bayesian method is less than or equal to that of the Bayesian method based on the conditional independence assumption (that is, $\mu_D \le 0$, where D is the per-writer difference in recognition rate).

H_a: The alternative to H₀ (that is, $\mu_D > 0$).

Let n be the number of writers. Then

$$T = \frac{\bar{D}}{s_D / \sqrt{n}} = 2.4396,$$

where

$$\bar{D} = \frac{1}{n} \sum_{i=1}^{n} D_i = 0.8578$$

and

$$s_D^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( D_i - \bar{D} \right)^2 = 1.11272.$$

Since T = 2.4396 > t_{0.025} = 2.306, H₀ can be rejected with 8 degrees of freedom. This means that the improvement in the mean correctness of the second-order dependency-based Bayesian method is supported at a confidence level of 97.5%.
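The reported statistic can be checked numerically. The sketch below assumes the standard paired t statistic, T = D̄ / (s_D / √n), reads 0.8578 as the mean difference and 1.11272 as the sample variance of the differences, and takes n = 9 writers, inferred from the 8 degrees of freedom. These readings are assumptions, since the formulas themselves are not legible in the text; the function name is hypothetical.

```python
import math


def paired_t_statistic(mean_diff, var_diff, n):
    """T = mean_diff / sqrt(var_diff / n) for n paired differences."""
    return mean_diff / math.sqrt(var_diff / n)


n_writers = 9        # assumption: 8 degrees of freedom implies n = 9
mean_diff = 0.8578   # mean per-writer difference in recognition rate
var_diff = 1.11272   # assumed to be the sample variance of the differences
t_critical = 2.306   # t_{0.025} with 8 degrees of freedom, as quoted in the text

t_value = paired_t_statistic(mean_diff, var_diff, n_writers)
reject_h0 = t_value > t_critical
print(f"T = {t_value:.4f}, reject H0: {reject_h0}")
```

Under these assumptions the computed value agrees with the reported T to four decimal places. Reading 1.11272 as a standard deviation instead would give a different statistic.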

From the experiments described above, the following conclusions were drawn.

The recognition performance obtained by combining multiple decisions is superior to that of any individual recognizer.

The voting methods behave similarly to the social choice functions. Their recognition performance is severely degraded by a highly dependent recognizer.

The recognition performance of the BKS method declines as the number of recognizers increases, but it does not change when a highly dependent recognizer is added.

All combination methods that use the Bayesian method outperform the other combination methods.

Incorporating kth-order dependency into the Bayesian method improves the performance of multiple-recognizer combination, particularly when a highly dependent recognizer is included. For the English uppercase data, this experimental result is supported by a statistically significant t-test.

The recognition rate obtained with higher-order dependency is generally higher than that obtained with lower-order dependency.

The best recognition rates were obtained with the dependency-based combination method on the KAIST database and with the Borda count method on the CENPARMI database.

The present invention is applicable to fields in which multiple decision makers that make discrete decisions, such as recognizers, sensors, and machines, are run simultaneously, the decision of each is obtained, and the resulting decisions are combined probabilistically.

Examples include using multiple recognizers in pattern recognition and combining their decisions, combining the opinions or decisions of multiple participants in group decision making, and using multiple sensors and combining their outputs to reach a decision.

The decision maker may be a neural-network-based character recognizer, a knowledge-based recognizer, or a feature-based recognizer.

The invention is also applicable to the optimal approximation of a probability distribution by kth-order dependency and to the combination of multiple decisions: that is, to obtaining, from a high-order probability distribution composed of multiple decisions, the optimal approximating distribution as a product of low-order probability distributions based on kth-order dependency, and to combining multiple decisions on a dependency basis using that approximating distribution.

In addition, the invention can be applied wherever multiple decision makers that make discrete decisions, such as experts, sensors, and machines, are run and their decisions are to be combined, for example in image recognition and decision support systems (DSS). The invention is not limited to the embodiments described above, and it will be apparent to those skilled in the art that it may be applied to other fields as long as the spirit of the invention is maintained.

With the configuration described above, the present invention does not adopt the independence assumption and therefore avoids the problems that assumption can cause.

Furthermore, when a high-order probability distribution is approximated by a product of low-order probability distributions, the order of dependency can be varied and the optimal product approximation set based on the dependency of each order can be obtained.

The required storage is larger than under the independence assumption but smaller than that of the BKS method, because O(L²) ≤ O(L^{k+1}) ≤ O(L^{K+1}).
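To give a feel for this ordering, the short sketch below counts the entries of a single probability table over the relevant number of L-valued variables in each case. The values L = 10, k = 2, and K = 5 and the function name are hypothetical and chosen only for illustration.

```python
def table_entries(num_candidates, num_conditioned):
    """Entries in one table over (num_conditioned + 1) variables with L values each."""
    return num_candidates ** (num_conditioned + 1)


L, k, K = 10, 2, 5  # hypothetical: 10 candidates, 2nd-order dependency, 5 decision makers

independence = table_entries(L, 1)  # O(L^2): one decision and the true class
kth_order = table_entries(L, k)     # O(L^(k+1)): one decision and its k conditioning variables
bks = table_entries(L, K)           # O(L^(K+1)): all K decisions and the true class jointly

print(independence, kth_order, bks)
```

The gap between the kth-order table and the BKS table grows exponentially with K - k, which is why a low dependency order keeps storage manageable as recognizers are added.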

In addition, by presenting a methodology that handles higher-order dependencies and not only first-order dependency, the invention extends existing results on product approximation.

It also shows that combining the decisions of multiple decision makers on the basis of higher-order dependency yields superior performance.

The superiority of the invention was demonstrated through experiments in pattern recognition in which the decisions of multiple recognizers were combined.

Claims

1. A method of optimal product approximation of a probability distribution by kth-order dependency, in which, given a set of K decision makers, a set of L decision candidates, an input, a dependency order k, and the posterior probability of each decision candidate given the decisions of the K decision makers, a high-order probability distribution is optimally approximated by a product of low-order probability distributions by kth-order dependency, the method comprising the steps of:

obtaining first-order dependency;

obtaining the conditional independence assumption;

obtaining second-order dependency;

obtaining conditional first-order dependency;

...

obtaining kth-order dependency; and

obtaining conditional (k-1)th-order dependency.

2. The method of claim 1, wherein the difference between the actual probability distribution and the approximate probability distribution is measured using the measure of closeness defined for the two distributions.

3. The method of claim 2, wherein the measure of closeness is expressed for the case in which the optimal product approximation of the probability distribution is by second-order dependency.

4. The method of claim 3, wherein the optimal product approximation of the probability distribution is obtained when the measure of closeness is minimized.

5. The method of claim 3, wherein the optimal product approximation of the probability distribution is obtained when the corresponding term of the measure of closeness is maximized.

6. The method of claim 1,

if

= and

the method of optimal product approximation of a probability distribution by kth-order dependency.

7. The method of claim 6,

= the method of optimal product approximation of a probability distribution by kth-order dependency.

8. The method of claim 7,

If

= And

9. The method of claim 8,

If

= And

10. The method of claim 9, wherein the kth-order dependency is given as follows:

If

= And

11. The method of claim 10, wherein the kth-order dependency is given as follows:

If

= And

12. A method of finding the constituent distribution terms in the optimal product approximation of a probability distribution by kth-order dependency, given a set of K decision makers, a set of L decision candidates, an input, a dependency order k, and the posterior probability of each decision candidate given the decisions of the K decision makers, wherein, for the actual probability distribution and the approximate probability distribution, the measure of closeness defined between them is minimized by the following procedure:

for ㉠ do /* first-order dependency */

/* where ㉠ is the condition given for the first-order dependency */

for ㉡ do /* second-order dependency */

/* where ㉡ is the condition given for the second-order dependency */

........

for ㉢ do /* (k-1)th-order dependency */

/* where ㉢ is the condition given for the (k-1)th-order dependency, with 0 ≤ i_{k-1}(j), ..., i_1(j) < j */

while (㉣) do /* kth-order dependency */

/* where ㉣ is the condition given for the kth-order dependency, with 0 ≤ i_k(j), ..., i_1(j) < j */

.......

end

.......

end

the procedure comprising one first-order dependency, one second-order dependency, ..., one (k-1)th-order dependency, and (K-k) kth-order dependencies, and being composed of (k-1) nested for loops and one while loop.

13. The method of claim 12, wherein, when k = 2, finding the constituent distribution terms of the optimal product approximation comprises the steps of:

reading s sample data S^1, S^2, ..., S^s as input;

calculating second-order and third-order probability distributions from the training sample data;

calculating the weight values M(C_j; C_{i(j)}) and M(C_j; C_{i_2(j)}, C_{i_1(j)}) for all pairs and all triples obtained from the sample data;

calculating the maximum weighted sum over the first-order and second-order dependencies and obtaining the associated optimal product approximation set; and

outputting the result obtained in the preceding step as the optimal product approximation set.

14. The method of claim 13, wherein obtaining the associated optimal product approximation set comprises:

max_total_weight = 0;

for n = 1 to the number of first-order dependencies do

    total_weight = 0;

    select one of the first-order dependencies as a constraint;

    total_weight = weight of the selected first-order dependency;

    while (number of unselected decision makers > 0) do

        select one of the unselected decision makers;

        select one of the admissible second-order dependencies associated with the selected decision maker;

        total_weight = total_weight + weight of the selected second-order dependency;

    end

    max_total_weight = MAX(max_total_weight, total_weight);

    store max_total_weight and the associated sets of first-order and second-order dependencies;

end

thereby obtaining the maximum weighted sum max_total_weight and its associated set of first-order and second-order dependencies.
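The search loop of claim 14 can be sketched as follows. This is an illustrative reading rather than the claimed procedure: the claim does not say how the next decision maker and its second-order dependency are chosen, so the sketch assumes a greedy choice of the largest admissible weight, and it treats a second-order dependency as admissible when both conditioning decision makers have already been selected. The function name, the dictionary layouts, and all weight values are hypothetical.

```python
from itertools import combinations


def best_dependency_set(nodes, first_weights, second_weights):
    """Greedy search over first- and second-order dependencies.

    first_weights maps frozenset({a, b}) to the weight of that pair.
    second_weights maps (child, frozenset({p, q})) to the weight of
    child depending on the pair p, q.
    """
    max_total_weight = float("-inf")
    best = None
    for first_dep, first_weight in first_weights.items():
        selected = set(first_dep)      # the constraint fixes two decision makers
        total_weight = first_weight
        chosen = []
        while len(selected) < len(nodes):
            # Admissible: an unselected child whose two parents are selected.
            candidates = [
                (second_weights[(child, frozenset(pair))], child, pair)
                for child in nodes if child not in selected
                for pair in combinations(sorted(selected), 2)
                if (child, frozenset(pair)) in second_weights
            ]
            if not candidates:
                break                  # this starting constraint cannot be completed
            weight, child, pair = max(candidates)  # assumed greedy choice
            total_weight += weight
            chosen.append((child, pair))
            selected.add(child)
        if len(selected) == len(nodes) and total_weight > max_total_weight:
            max_total_weight = total_weight
            best = (tuple(sorted(first_dep)), chosen)
    return max_total_weight, best


# Hypothetical weights for four decision makers.
nodes = [1, 2, 3, 4]
first_weights = {frozenset({1, 2}): 0.5, frozenset({1, 3}): 0.2}
second_weights = {
    (3, frozenset({1, 2})): 0.4,
    (4, frozenset({1, 2})): 0.1,
    (4, frozenset({2, 3})): 0.3,
    (2, frozenset({1, 3})): 0.6,
    (4, frozenset({1, 3})): 0.05,
}
weight, dependencies = best_dependency_set(nodes, first_weights, second_weights)
print(weight, dependencies)
```

With these weights, starting from the pair (1, 2) yields the larger total, so that constraint and its two second-order dependencies are retained.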

15. The method of claim 13, wherein n_1, n_2, ..., n_K, n_{K+1} is a permutation of the integers 1, 2, ..., K, K+1, P( · | C_0, · ) is written as P( · | · ), n_j is denoted by j, and, for C_1, C_2, ..., C_K, C_{K+1}, the indices satisfy i_2(j), i_1(j) < j.

16. The method of claim 13, wherein n_1, n_2, ..., n_K, n_{K+1} is a permutation of the integers 1, 2, ..., K, K+1, P( · | C_0, · ) is written as P( · | · ), n_j is denoted by j, and, for C_1, C_2, ..., C_K, C_{K+1}, the index satisfies i_2(j) < j.

17. A method of combining multiple decisions by dependency, in which, given a set of K decision makers, a set of L decision candidates, an input, and a dependency order k, the method of obtaining the optimal product approximation of a probability distribution by kth-order dependency and its constituent distribution terms is combined with the ordinary Bayesian decision method, the method comprising:

a first step of obtaining, with reference to training sample data, the optimal product approximation set of the probability distribution by kth-order dependency; and

a second step of probabilistically combining the decisions of the multiple decision makers by applying the approximation set obtained in the first step to the combination formula with the Bayesian method,

whereby the dependency among the decision makers that make the multiple decisions is taken into account without requiring an independence assumption.

18. The method of claim 17, wherein the dependency is observed only from the decisions of the decision makers obtained from the training samples.

19. The method of claim 18, wherein, when the Bayesian method is combined with second-order dependency, a value is computed for each decision candidate from the multiple decisions by the combination formula, and the multiple decisions are combined by selecting the decision candidate that maximizes the posterior probability according to the computed values.
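The selection step of claim 19 can be illustrated with a small sketch. The combination formula itself is not legible in the text, so the sketch assumes a generic form: the score of a candidate is the product of conditional probabilities, one per decision maker, each conditioned on at most two other decisions and on the candidate, with a uniform prior over candidates. The function name, the table layout, the floor value for unseen events, and all probabilities are hypothetical.

```python
import math


def combine(decisions, factors, tables, candidates, floor=1e-9):
    """Return the candidate with the largest approximate posterior score.

    decisions: {decision_maker: decided_label}
    factors: list of (child, parents) pairs from the product approximation
    tables: {(child, parents): {(child_label, parent_labels, candidate): prob}}
    """
    def log_score(candidate):
        total = 0.0
        for child, parents in factors:
            key = (
                decisions[child],
                tuple(decisions[p] for p in parents),
                candidate,
            )
            # Unseen combinations get a small floor probability (assumption).
            total += math.log(tables[(child, parents)].get(key, floor))
        return total

    return max(candidates, key=log_score)


# Hypothetical example: three decision makers, two candidates "a" and "b".
factors = [("e1", ()), ("e2", ("e1",)), ("e3", ("e1", "e2"))]
tables = {
    ("e1", ()): {("a", (), "a"): 0.9, ("a", (), "b"): 0.2},
    ("e2", ("e1",)): {("a", ("a",), "a"): 0.8, ("a", ("a",), "b"): 0.5},
    ("e3", ("e1", "e2")): {
        ("b", ("a", "a"), "a"): 0.3,
        ("b", ("a", "a"), "b"): 0.9,
    },
}
decisions = {"e1": "a", "e2": "a", "e3": "b"}
winner = combine(decisions, factors, tables, ["a", "b"])
print(winner)
```

Here two decision makers vote "a" and one votes "b". The product of table entries is 0.9 × 0.8 × 0.3 for candidate "a" and 0.2 × 0.5 × 0.9 for candidate "b", so "a" is selected. Working in log space avoids underflow when K is large.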

20. The method of claim 19, wherein the decision maker is one of a neural-network-based character recognizer, a knowledge-based recognizer, and a feature-based recognizer.

21. The method of claim 20, wherein, when the rejection results of the recognizers are used, the optimal set of kth-order dependencies is identified either by excluding the rejected classes and using only the valid classes, or by including both the valid classes and the rejected classes.