CN103858135A

CN103858135A - Fast supervised learning method and classification system for discrete-valued feature vectors

Info

Publication number: CN103858135A
Application number: CN201380003066.XA
Authority: CN
Inventors: Name withheld at the inventor's request
Original assignee: Individual
Current assignee: Individual
Priority date: 2013-05-12
Filing date: 2013-05-12
Publication date: 2014-06-11
Also published as: WO2014183244A1

Abstract

The invention relates to a fast supervised learning method and classification system for discrete-valued feature vectors. The method comprises: a first step of constructing a hypercube data structure for the training vector set; a second step of assigning a value to each cell of the hypercube using a pattern classification method; and a third step of classifying a test vector by looking up the value of the corresponding cell of the hypercube. The time complexity of the test stage is thereby reduced at the cost of increased space complexity in the training stage.

Description

A fast supervised learning method and classification system for discrete-valued feature vectors

Technical field

The present invention relates to a pattern classification method and system, and in particular to a fast supervised learning method and classification system for discrete-valued feature vectors.

Background technology

Pattern recognition refers to the processing and analysis of the various forms of information (numerical, textual and logical) that characterize things or phenomena, in order to describe, identify, classify and explain them. It is an important part of information science and artificial intelligence. Pattern recognition is also often called pattern classification. Depending on the nature of the problem and the way it is solved, it is divided into supervised classification and unsupervised classification.

Supervised learning is divided into two stages:

1. Model construction stage

A. Each tuple/sample is assumed to belong to a predefined class; the classes are defined by a class label attribute.

B. The set of tuples/samples used to construct the model is called the training set.

C. The model is generally expressed as classification rules, a decision tree, or a mathematical formula.

2. Model usage stage: estimating the accuracy of the model

A. The known class labels of a test set are compared with the results produced by the model.

B. The proportion of test samples for which the two results agree is called the accuracy.

C. The test set and the training set must be independent of each other.

In many existing applications, the time complexity of pattern classification algorithms is high, so the test stage of supervised learning takes too long, which has hindered the widespread use of pattern classification.

Summary of the invention

The object of the invention is to propose a fast supervised learning method and classification system for discrete-valued feature vectors that solve the problem of the test stage of supervised learning taking too long in pattern classification.

To achieve the above object, the embodiments of the present invention are implemented as follows:

A fast supervised learning method for discrete-valued feature vectors, characterized by comprising the following steps:

First step: construct a hypercube data structure for the training vector set;

Second step: assign a value to each cell of the hypercube using a pattern classification method;

Third step: classify a test vector by looking up the value of the corresponding cell of the hypercube.

Each training vector in the training vector set is an m-dimensional vector in which the i-th dimension can take Ri distinct values, i = 1, 2, ..., m. The hypercube has size R1 × R2 × ... × Rm. The indices of each cell are the feature values of the corresponding vector in each dimension, and the value of each cell is the class label of the corresponding vector.
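As an illustration only, the hypercube described above could be laid out in memory as in the following Python sketch. The flat row-major list, the names `make_hypercube` and `cell_index`, and the assumption that every feature starts at the value 1 are choices made for this example and are not specified by the invention.

```python
from math import prod


def make_hypercube(ranges):
    """Return (cells, strides) for a hypercube of size R1 * R2 * ... * Rm.

    `ranges` lists the number of distinct values Ri of each feature.
    Cells live in one flat list; None marks "no class label yet".
    """
    strides = []
    step = 1
    for r in reversed(ranges):  # row-major layout: last feature varies fastest
        strides.append(step)
        step *= r
    strides.reverse()
    cells = [None] * prod(ranges)
    return cells, strides


def cell_index(vector, strides, lowest=1):
    """Map a feature vector to its cell; `lowest` is the smallest feature value (assumed 1)."""
    return sum((v - lowest) * s for v, s in zip(vector, strides))


cells, strides = make_hypercube([9, 9])    # two features, nine values each
cells[cell_index([2, 6], strides)] = 2     # a training vector [2,6] with class label 2
print(len(cells), strides, cells[cell_index([2, 6], strides)])
```

The number of cells is the product of the Ri, which is the space cost that the method accepts in exchange for a cheap lookup in the test stage.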

Preferably, the pattern classification method in the second step comprises the following steps:

Step 1: assign a value to the hypercube cell corresponding to each training vector in the training vector set;

Step 2: if the number of unassigned cells in the hypercube is 0, finish; otherwise go to step 3;

Step 3: find the next unassigned cell and, for each class, count the assigned cells of that class within its hypercube neighborhood;

Step 4: if the number of assigned cells in this neighborhood is not 0, go to step 5; if it is 0, go to step 3;

Step 5: set the value of this unassigned cell to the class label with the largest cell count, and go to step 2.
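The five steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `fill_hypercube`, the dictionary keyed by 0-based index tuples, the use of the axis neighbors (the 4-neighborhood in two dimensions), and the rule that ties go to the smallest label are all assumptions of this example, since the text does not fix them.

```python
from collections import Counter
from itertools import product


def fill_hypercube(shape, training):
    """Assign a class label to every cell of a hypercube of the given shape.

    `training` maps 0-based index tuples to class labels.
    """
    if not training:
        raise ValueError("at least one training vector is required")
    cube = dict(training)                          # step 1
    all_cells = list(product(*(range(r) for r in shape)))
    while len(cube) < len(all_cells):              # step 2
        for cell in all_cells:                     # step 3: next unassigned cell
            if cell in cube:
                continue
            votes = Counter()
            for axis in range(len(shape)):         # axis neighbors of the cell
                for delta in (-1, 1):
                    nb = cell[:axis] + (cell[axis] + delta,) + cell[axis + 1:]
                    if nb in cube:
                        votes[cube[nb]] += 1
            if votes:                              # step 4
                # step 5; ties go to the smallest label (assumption)
                cube[cell] = min(votes, key=lambda c: (-votes[c], c))
    return cube


cube = fill_hypercube((3, 3), {(0, 0): 1, (2, 2): 2})
print(len(cube), cube[(1, 1)])
```

Because this sketch writes each label back immediately, the result depends on the order in which unassigned cells are visited.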

Another object of the embodiments of the present invention is to provide a fast supervised learning classification system for discrete-valued feature vectors, characterized by comprising:

A hypercube construction module, for constructing a hypercube data structure for the training vector set;

A hypercube assignment module, for assigning a value to each cell of the hypercube using a pattern classification method;

A test vector classification module, for classifying a test vector by looking up the value of the corresponding cell of the hypercube.

Preferably, the hypercube assignment module comprises:

A training vector assignment module, for assigning a value to the hypercube cell corresponding to each training vector in the training vector set;

An unassigned-cell count judgment module, for determining the number of unassigned cells in the hypercube; if that number is 0, control passes to the end module, otherwise to the search-and-count module;

A search-and-count module, for finding the next unassigned cell and, for each class, counting the assigned cells of that class within its hypercube neighborhood;

A neighborhood judgment module, for checking the number of assigned cells in the neighborhood of this unassigned cell; if that number is not 0, control passes to the cell assignment module, and if it is 0, control passes to the search-and-count module;

A cell assignment module, for assigning a value to the unassigned cell, namely the class label having the largest number of assigned cells in the neighborhood, after which control passes to the unassigned-cell count judgment module;

An end module, for ending the operation of the whole classification system.

The present invention has the following beneficial effect:

In the test stage, a test sample only needs to have its class label looked up in the hypercube, which greatly speeds up pattern classification.

Brief description of the drawings

Fig. 1 is a block diagram of the fast supervised learning method for discrete-valued feature vectors;

Fig. 2 is a schematic diagram of a hypercube in which only the training vectors have been assigned;

Fig. 3 is a schematic diagram of a hypercube in which all possible feature vectors have been assigned;

Fig. 4 is a block diagram of the pattern classification method used in the second step of the fast supervised learning method for discrete-valued feature vectors;

Fig. 5 is a schematic diagram of a hypercube after the first round of iteration;

Fig. 6 is a schematic diagram of a hypercube after the second round of iteration;

Fig. 7 is a block diagram of the fast supervised learning classification system for discrete-valued feature vectors;

Fig. 8 is a block diagram of the hypercube assignment module in the fast supervised learning classification system for discrete-valued feature vectors.

Embodiment

The present invention is further described below with reference to the drawings and embodiments.

Using a set of samples of known class to adjust the parameters of a classifier until it reaches the required performance is called supervised training, or learning with a teacher. Supervised learning is divided into two stages:

1. Model construction stage

2. Model usage stage: estimating the accuracy of the model. The proportion of test samples for which the known class label and the model's result agree is called the accuracy. The test set and the training set must be independent of each other.

Just as people learn diagnostic techniques from known cases, a computer must learn before it can recognize various things and phenomena. The material used for learning is a limited number of samples of the same kind as the objects to be recognized. In supervised learning, the computer is given the learning samples and is also told the class to which each sample belongs. If the learning samples are given without class information, the learning is unsupervised. Any learning has a goal; for pattern recognition, the goal is to learn from a limited number of samples so that the classifier's probability of error is minimized when it classifies an unlimited number of patterns.

Supervised learning is currently one of the most widely studied kinds of machine learning; for example, neural network propagation algorithms and decision tree learning algorithms have been applied successfully in many fields.

The training stage and the test stage of supervised learning can be independent of each other, so we consider reducing the time complexity of the test stage at the cost of increasing the space complexity of the training stage.

Fig. 1 shows the flow of the fast supervised learning method for discrete-valued feature vectors provided by an embodiment of the present invention. In the figure, dashed boxes represent the steps of the flow and solid boxes represent the data structures involved. The details are as follows:

In this way, the time complexity of the test stage is reduced at the cost of increased space complexity in the training stage.

Here we illustrate the algorithm with a two-class classifier with m = 2, in which both dimensions take values from 1 to 9, corresponding to the 81 grid cells in Fig. 2. The 9 × 9 grid corresponds to the hypercube in the flow, and each grid cell corresponds to one cell of the hypercube.

There are 8 training vectors. The feature values are given in square brackets, and the class label is either 1 or 2. The feature values and class labels are:

[2,6]=2; [2,7]=2; [3,6]=2; [3,7]=2;

[6,2]=1; [7,3]=1; [8,3]=1; [8,4]=1;

These class labels are assigned to the corresponding 8 grid cells in Fig. 2.
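For illustration, the state of the grid at this point could be reproduced in Python as follows. The nested-list layout, the `render` helper, and the choice to print one row per value of the first feature are assumptions of this sketch; the orientation of Fig. 2 itself is not known from the text.

```python
# Training vectors from the example: (feature 1, feature 2) -> class label.
TRAINING = {
    (2, 6): 2, (2, 7): 2, (3, 6): 2, (3, 7): 2,
    (6, 2): 1, (7, 3): 1, (8, 3): 1, (8, 4): 1,
}

SIZE = 9  # both features take the values 1..9

# grid[a - 1][b - 1] holds the label of vector [a, b]; None means "not assigned".
grid = [[None] * SIZE for _ in range(SIZE)]
for (a, b), label in TRAINING.items():
    grid[a - 1][b - 1] = label


def render(grid):
    """Return the grid as text, one row per value of feature 1, '.' for empty cells."""
    return "\n".join(
        " ".join("." if cell is None else str(cell) for cell in row) for row in grid
    )


print(render(grid))
assigned = sum(cell is not None for row in grid for cell in row)
print(assigned, "of", SIZE * SIZE, "cells assigned")
```

Only 8 of the 81 cells carry a label at this stage; the remaining 73 are the empty cells that the second step of the method has to fill.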

If, in the training stage, some pattern recognition method is used to compute the class labels of the feature vectors corresponding to all the empty cells (the unassigned cells of the hypercube), as shown in Fig. 3, then, because the whole value range of the discrete-valued feature vectors is contained in the hypercube, the test stage only needs to look up the corresponding cell value in the hypercube to obtain the class label of a discrete-valued feature vector, which completes the classification.

Because the test stage needs only the I/O operations of a lookup and no computation, the time complexity of the test stage is reduced at the cost of increased space complexity in the training stage.
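A minimal Python sketch of the test-stage lookup is given below. The function name `classify`, the nested-list table, and the small 3 × 3 table of labels are hypothetical and serve only to show that classification reduces to one index operation per feature.

```python
def classify(table, vector, lowest=1):
    """Return the stored class label for a discrete feature vector.

    `table` is a nested list with one nesting level per feature;
    `lowest` is the smallest feature value (assumed to be 1).
    """
    cell = table
    for value in vector:
        cell = cell[value - lowest]  # one index operation per dimension
    return cell


# Hypothetical, already-filled 3 x 3 table (both features take the values 1..3).
TABLE = [
    [2, 2, 2],
    [1, 2, 2],
    [1, 1, 2],
]

print(classify(TABLE, [1, 3]))
print(classify(TABLE, [3, 1]))
```

The cost of a lookup grows with the number of features m but not with the number of training vectors. The sketch does not validate that the feature values lie inside the table.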

Fig. 4 shows the flow of the pattern classification method used in the second step of the fast supervised learning method for discrete-valued feature vectors provided by an embodiment of the present invention. The details are as follows:

Here we again illustrate the algorithm with the two-class classifier with m = 2, in which both dimensions take values from 1 to 9, corresponding to the 81 grid cells in Fig. 2. The 9 × 9 grid corresponds to the hypercube in the flow, and each grid cell corresponds to one cell of the hypercube.

The training vectors are the same as above.

Step 1: we build a hypercube as shown in Fig. 2 and, according to the known training vectors, assign class labels to 8 of its cells.

Step 2: if the number of unassigned cells in the hypercube is not 0, go to step 3.

Steps 3, 4 and 5: for ease of description, the neighborhood chosen here is the 4-neighborhood. Neighborhoods defined by other distances and shapes are equally applicable, and for higher-dimensional vectors, hypercube neighborhoods of the various distances and shapes are likewise applicable to the method of the present invention.
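To make the notion of a neighborhood in m dimensions concrete, the following Python sketch enumerates the index offsets of two common choices. The function name `neighborhood_offsets`, the `radius` parameter, and the selection of the Manhattan and Chebyshev distances are assumptions of this example; the text only states that neighborhoods of various distances and shapes may be used.

```python
from itertools import product


def neighborhood_offsets(m, radius=1, metric="manhattan"):
    """List the index offsets of an m-dimensional neighborhood, excluding the center."""
    offsets = []
    for off in product(range(-radius, radius + 1), repeat=m):
        if not any(off):
            continue  # skip the cell itself
        if metric == "manhattan":
            dist = sum(abs(d) for d in off)
        elif metric == "chebyshev":
            dist = max(abs(d) for d in off)
        else:
            raise ValueError(f"unknown metric: {metric}")
        if dist <= radius:
            offsets.append(off)
    return offsets


print(neighborhood_offsets(2))                            # the 4-neighborhood
print(len(neighborhood_offsets(2, metric="chebyshev")))   # the 8-neighborhood
print(len(neighborhood_offsets(3)))                       # face neighbors in 3-D
```

With radius 1, the Manhattan distance gives the 4-neighborhood in two dimensions and 2m neighbors in general, while the Chebyshev distance gives the 8-neighborhood in two dimensions.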

Fig. 5 shows the assignment of the 81 grid cells after the first round of iteration, and Fig. 6 shows the assignment after the second round of iteration. By analogy, after multiple rounds of iteration every possible feature vector in the hypercube obtains a class label, which lays the foundation for improving the efficiency of the subsequent test stage.
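The round-by-round growth of the labelled region can be sketched in Python on the 8 training vectors of the example. This sketch assumes that a "round" labels every empty cell that had at least one labelled 4-neighbor at the start of the round, and that ties go to the smaller label; neither assumption is stated in the text, so the intermediate grids need not match Figs. 5 and 6 exactly.

```python
TRAINING = {
    (2, 6): 2, (2, 7): 2, (3, 6): 2, (3, 7): 2,
    (6, 2): 1, (7, 3): 1, (8, 3): 1, (8, 4): 1,
}
SIZE = 9  # both features take the values 1..9


def one_round(labels):
    """Label each empty cell that has a labelled 4-neighbor, using a snapshot
    of the labels taken at the start of the round."""
    updated = dict(labels)
    for a in range(1, SIZE + 1):
        for b in range(1, SIZE + 1):
            if (a, b) in labels:
                continue
            votes = {}
            for nb in ((a - 1, b), (a + 1, b), (a, b - 1), (a, b + 1)):
                if nb in labels:
                    votes[labels[nb]] = votes.get(labels[nb], 0) + 1
            if votes:
                # majority label; ties go to the smaller label (assumption)
                updated[(a, b)] = min(votes, key=lambda c: (-votes[c], c))
    return updated


labels, rounds = dict(TRAINING), 0
while len(labels) < SIZE * SIZE:
    labels = one_round(labels)
    rounds += 1
    print("after round", rounds, ":", len(labels), "cells labelled")
```

Under these assumptions the first round labels the 17 empty cells that touch a training cell, giving 25 labelled cells in total, and later rounds extend the two regions outward until all 81 cells are labelled.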

The pattern classification method proposed for the second step searches for similar samples only within the neighborhood rather than globally, which also helps to reduce the time complexity of the training stage.

Fig. 7 shows the structure of the fast supervised learning classification system for discrete-valued feature vectors provided by an embodiment of the present invention. The details are as follows:

The fast supervised learning classification system comprises:

Fig. 8 shows the structure of the hypercube assignment module in the fast supervised learning classification system for discrete-valued feature vectors provided by an embodiment of the present invention. The details are as follows:

An end module, for ending the operation of the whole classification system.

Those skilled in the art can make various other corresponding changes and modifications according to the technical solutions and concepts described above, and all such changes and modifications shall fall within the scope of protection of the claims of the present invention.

Claims

1. A fast supervised learning method for discrete-valued feature vectors, characterized by comprising the following steps:

Third step: classify a test vector by looking up the value of the corresponding cell of the hypercube;

wherein each training vector in the training vector set is an m-dimensional vector in which the i-th dimension can take Ri distinct values, i = 1, 2, ..., m; the hypercube has size R1 × R2 × ... × Rm; the indices of each cell of the hypercube are the feature values of the corresponding vector in each dimension; and the value of each cell of the hypercube is the class label of the corresponding vector.

2. The fast supervised learning method for discrete-valued feature vectors according to claim 1, characterized in that

the pattern classification method in the second step comprises the following steps:

Step 5: set the value of this unassigned cell to the class label having the largest number of assigned cells in the neighborhood, and go to step 2.

3. A fast supervised learning classification system for discrete-valued feature vectors, characterized by comprising:

A test vector classification module, for classifying a test vector by looking up the value of the corresponding cell of the hypercube;

4. The fast supervised learning classification system for discrete-valued feature vectors according to claim 3, characterized in that the hypercube assignment module comprises:

An end module, for ending the operation of the whole classification system.