CN103858135A - Quick supervised learning method of discrete value feature vectors, and classification system - Google Patents

Quick supervised learning method of discrete value feature vectors, and classification system

Info

Publication number
CN103858135A
Authority
CN
China
Prior art keywords
assignment
hypercube
unit
vector
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201380003066.XA
Other languages
Chinese (zh)
Inventor
不公告发明人
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Publication of CN103858135A publication Critical patent/CN103858135A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Abstract

The invention relates to a quick supervised learning method and a classification system for discrete-valued feature vectors. The method comprises: a first step of constructing a hypercube data structure for a set of training vectors; a second step of assigning a value to each unit of the hypercube using a pattern classification method; and a third step of classifying a test vector by looking up the value of the corresponding unit of the hypercube. The time complexity of the test stage is thereby reduced at the cost of higher space complexity in the training stage.

Description

A quick supervised learning method and classification system for discrete-valued feature vectors
Technical field
The present invention relates to a pattern classification method and system, and in particular to a quick supervised learning method and classification system for discrete-valued feature vectors.
Background
Pattern recognition refers to processing and analyzing the various forms of information (numerical, textual and logical) that characterize things or phenomena, in order to describe, identify, classify and explain them. It is an important part of information science and artificial intelligence. Pattern recognition is often also called pattern classification. Depending on the nature of the problem and the method used to solve it, pattern recognition is divided into supervised classification and unsupervised classification.
Supervised learning is divided into two stages:
1. Model construction stage
a. Each tuple/sample is assumed to belong to a predefined class; the classes are given by a class label attribute.
b. The set of tuples/samples used to construct the model is called the training set.
c. The model is generally expressed as classification rules, a decision tree or a mathematical formula.
2. Model usage stage: estimating the accuracy of the model
a. The known class labels of a test set are compared with the results produced by the model.
b. The proportion of samples for which the two agree is called the accuracy.
c. The test set and the training set must be independent of each other.
In many existing applications, the time complexity of the pattern classification algorithm is high, so the test stage of supervised learning takes too long, which has hindered the wide application of pattern classification.
Summary of the invention
The object of the invention is to propose a quick supervised learning method and classification system for discrete-valued feature vectors that solve the problem of the test stage of supervised learning taking too long in pattern classification.
To achieve this object, the embodiments of the invention are implemented as follows.
A quick supervised learning method for discrete-valued feature vectors, characterized by comprising the following steps:
First step: construct a hypercube data structure for the training vector set;
Second step: assign a value to each unit of the hypercube using a pattern classification method;
Third step: classify a test vector by looking up the value of the corresponding unit of the hypercube.
Each training vector in the training vector set is an m-dimensional vector in which the value range of the i-th dimension has length Ri, i = 1, 2, ..., m. The hypercube has size R1*R2*...*Rm; the index of a unit along each dimension is the feature value of the corresponding vector in that dimension, and the value of each unit is the class label of the corresponding vector.
Preferably, the pattern classification method in the second step comprises the following steps:
Step 1: for each training vector in the training vector set, assign its class label to the corresponding unit of the hypercube;
Step 2: if the number of unassigned units in the hypercube is 0, end; otherwise go to step 3;
Step 3: find the next unassigned unit and, for each class, count the assigned units of that class in its hypercube neighborhood;
Step 4: if the number of assigned units in this neighborhood is not 0, go to step 5; if it is 0, go to step 3;
Step 5: set the value of this unassigned unit to the class label with the largest count among the assigned units in its neighborhood, and go to step 2.
Another object of the embodiments of the invention is to provide a quick supervised learning classification system for discrete-valued feature vectors, characterized by comprising:
a hypercube construction module, for constructing a hypercube data structure for the training vector set;
a hypercube assignment module, for assigning a value to each unit of the hypercube using a pattern classification method;
a test vector classification module, for classifying a test vector by looking up the value of the corresponding unit of the hypercube.
Each training vector in the training vector set is an m-dimensional vector in which the value range of the i-th dimension has length Ri, i = 1, 2, ..., m. The hypercube has size R1*R2*...*Rm; the index of a unit along each dimension is the feature value of the corresponding vector in that dimension, and the value of each unit is the class label of the corresponding vector.
Preferably, the hypercube assignment module comprises:
a training vector assignment module, for assigning, for each training vector in the training vector set, its class label to the corresponding unit of the hypercube;
an unassigned-unit count judging module, for determining the number of unassigned units in the hypercube; if that number is 0, control jumps to the end module, otherwise to the search-and-count module;
a search-and-count module, for finding the next unassigned unit and, for each class, counting the assigned units of that class in its hypercube neighborhood;
a neighborhood judging module, for checking the number of assigned units in the neighborhood of this unassigned unit; if that number is not 0, control jumps to the unit assignment module; if it is 0, control jumps to the search-and-count module;
a unit assignment module, for setting the value of the unassigned unit to the class label with the largest count among the assigned units in its neighborhood, after which control jumps to the unassigned-unit count judging module;
an end module, for ending the operation of the whole classification system.
The invention has the following beneficial effect:
In the test stage, a test sample only needs to look up its class label in the hypercube, which greatly speeds up pattern classification.
Brief description of the drawings
Fig. 1 is a flow block diagram of the quick supervised learning method for discrete-valued feature vectors;
Fig. 2 is a schematic diagram of a hypercube in which only the training vectors have been assigned;
Fig. 3 is a schematic diagram of a hypercube in which all possible feature vectors have been assigned;
Fig. 4 is a flow block diagram of the pattern classification method in the second step of the quick supervised learning method for discrete-valued feature vectors;
Fig. 5 is a schematic diagram of the hypercube after the first round of iteration;
Fig. 6 is a schematic diagram of the hypercube after the second round of iteration;
Fig. 7 is a functional block diagram of the quick supervised learning classification system for discrete-valued feature vectors;
Fig. 8 is a functional block diagram of the hypercube assignment module in the quick supervised learning classification system for discrete-valued feature vectors.
Embodiment
The invention is described further below with reference to the accompanying drawings and an embodiment.
Pattern recognition refers to processing and analyzing the various forms of information (numerical, textual and logical) that characterize things or phenomena, in order to describe, identify, classify and explain them. It is an important part of information science and artificial intelligence. Pattern recognition is often also called pattern classification. Depending on the nature of the problem and the method used to solve it, pattern recognition is divided into supervised classification and unsupervised classification.
Supervised learning, also called supervised training or learning with a teacher, is the process of using a set of samples of known class to adjust the parameters of a classifier until it reaches the required performance. Supervised learning is divided into two stages:
1. Model construction stage
a. Each tuple/sample is assumed to belong to a predefined class; the classes are given by a class label attribute.
b. The set of tuples/samples used to construct the model is called the training set.
c. The model is generally expressed as classification rules, a decision tree or a mathematical formula.
2. Model usage stage: estimating the accuracy of the model
a. The known class labels of a test set are compared with the results produced by the model.
b. The proportion of samples for which the two agree is called the accuracy.
c. The test set and the training set must be independent of each other.
Just as people learn diagnostic techniques from known cases, a computer must learn in order to acquire the ability to recognize various things and phenomena. The material used for learning is a limited number of samples of the same kinds as the objects to be recognized. In supervised learning, the computer is given the learning samples and is also told the class to which each sample belongs. If the learning samples come without class information, the learning is unsupervised. All learning has a goal; for pattern recognition, the goal is to learn from a limited number of samples so that the classifier has the lowest possible error probability when classifying an unlimited number of patterns.
Supervised learning is currently one of the more widely studied machine learning approaches; for example, neural network propagation algorithms and decision tree learning algorithms have been applied successfully in many fields.
The training stage and the test stage of supervised learning can be independent of each other, so we consider reducing the time complexity of the test stage at the cost of increasing the space complexity of the training stage.
In the embodiment of the invention, Fig. 1 shows the flow of the quick supervised learning method for discrete-valued feature vectors. In the figure, dashed boxes represent the steps of the flow and solid boxes represent the data structures involved. The details are as follows:
First step: construct a hypercube data structure for the training vector set;
Second step: assign a value to each unit of the hypercube using a pattern classification method;
Third step: classify a test vector by looking up the value of the corresponding unit of the hypercube.
Each training vector in the training vector set is an m-dimensional vector in which the value range of the i-th dimension has length Ri, i = 1, 2, ..., m. The hypercube has size R1*R2*...*Rm; the index of a unit along each dimension is the feature value of the corresponding vector in that dimension, and the value of each unit is the class label of the corresponding vector.
In this way, the time complexity of the test stage is reduced at the cost of increasing the space complexity of the training stage.
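A rough accounting of this trade-off can be written as follows. This is only a sketch: the symbol N and the O(m) cost are not in the original and assume the hypercube is stored as a plain array with one slot per unit.

```latex
N = \prod_{i=1}^{m} R_i
\qquad \text{(number of units stored after training)}

T_{\text{test}} = O(m)
\qquad \text{(one index computation per test vector, no distance calculations)}
```

For m = 2 and R1 = R2 = 9 this gives N = 81 units. Since N is a product over all dimensions, storage grows quickly as m or the Ri increase.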
We illustrate the algorithm with a two-class classifier with m = 2, in which both dimensions take values from 1 to 9, corresponding to the 81 grid cells in Fig. 2. The 9*9 grid corresponds to the hypercube in the flow, and each grid cell corresponds to one unit of the hypercube.
There are 8 training vectors. The feature values are given in [], and the class label is either 1 or 2. The feature values and class labels are:
[2,6]=2;[2,7]=2;[3,6]=2;[3,7]=2;
[6,2]=1;[7,3]=1;[8,3]=1;[8,4]=1;
These class labels are assigned to the 8 corresponding grid cells in Fig. 2.
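The first step for this example can be sketched in Python as below. The flat-list layout, the `UNASSIGNED` marker and the helper names `build_hypercube` and `flat_index` are assumptions made for illustration; the patent does not prescribe a storage layout.

```python
from math import prod

UNASSIGNED = 0  # assumption: 0 marks a unit that has no class label yet


def build_hypercube(ranges):
    """Return a flat list with one slot per unit of an R1*R2*...*Rm hypercube."""
    return [UNASSIGNED] * prod(ranges)


def flat_index(vector, ranges, lows):
    """Map a feature vector to its slot in the flat list (row-major order)."""
    index = 0
    for value, size, low in zip(vector, ranges, lows):
        offset = value - low
        if not 0 <= offset < size:
            raise ValueError(f"feature value {value} is outside its range")
        index = index * size + offset
    return index


ranges = (9, 9)  # R1, R2: each feature takes the values 1..9
lows = (1, 1)    # smallest value of each feature

# The 8 training vectors from the text, as {feature vector: class label}.
training = {
    (2, 6): 2, (2, 7): 2, (3, 6): 2, (3, 7): 2,
    (6, 2): 1, (7, 3): 1, (8, 3): 1, (8, 4): 1,
}

cube = build_hypercube(ranges)
for vector, label in training.items():
    cube[flat_index(vector, ranges, lows)] = label

assigned = sum(1 for value in cube if value != UNASSIGNED)
print(len(cube), assigned)  # 81 8
```

At this point only the 8 training units carry a label, which is the state the text attributes to Fig. 2.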
Suppose that in the training stage we use some pattern recognition method to compute the class labels of the feature vectors corresponding to all the blank positions (the unassigned units of the hypercube), as shown in Fig. 3. Because the value range of the discrete-valued feature vectors is entirely contained in the hypercube, in the test stage we only need to look up the value of the corresponding unit of the hypercube to obtain the class label of a discrete-valued feature vector, which completes the classification.
Because the test stage involves only a lookup I/O operation and no computation, the time complexity of the test stage is reduced at the cost of increasing the space complexity of the training stage.
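A minimal sketch of the test-stage lookup is shown below. The 3*3 table and its labels are made up purely for illustration (they are not the contents of Fig. 3), and the function name `classify` is hypothetical.

```python
# Hypothetical, fully labelled hypercube for two features with values 1..3.
cube = [
    [1, 1, 2],
    [1, 2, 2],
    [1, 2, 2],
]
lows = (1, 1)  # smallest value of each feature


def classify(cube, vector, lows):
    """Return the class label stored in the unit indexed by the feature values."""
    unit = cube
    for value, low in zip(vector, lows):
        offset = value - low
        if not 0 <= offset < len(unit):
            raise KeyError(f"feature value {value} is outside the hypercube")
        unit = unit[offset]
    return unit


print(classify(cube, (1, 3), lows))  # 2
print(classify(cube, (3, 1), lows))  # 1
```

The lookup touches one unit per test vector, so its cost depends on the number of dimensions and not on the number of training vectors.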
Fig. 4 shows the flow of the pattern classification method in the second step of the quick supervised learning method for discrete-valued feature vectors provided by the embodiment of the invention. The details are as follows:
Step 1: for each training vector in the training vector set, assign its class label to the corresponding unit of the hypercube;
Step 2: if the number of unassigned units in the hypercube is 0, end; otherwise go to step 3;
Step 3: find the next unassigned unit and, for each class, count the assigned units of that class in its hypercube neighborhood;
Step 4: if the number of assigned units in this neighborhood is not 0, go to step 5; if it is 0, go to step 3;
Step 5: set the value of this unassigned unit to the class label with the largest count among the assigned units in its neighborhood, and go to step 2.
Here we again illustrate the algorithm with a two-class classifier with m = 2, in which both dimensions take values from 1 to 9, corresponding to the 81 grid cells in Fig. 2. The 9*9 grid corresponds to the hypercube in the flow, and each grid cell corresponds to one unit of the hypercube.
The training vectors are the same as above.
Step 1: we build the hypercube shown in Fig. 2 and, according to the known training vectors, assign class labels to 8 of its units.
Step 2: if the number of unassigned units in the hypercube is not 0, go to step 3.
Steps 3, 4 and 5: for ease of description, the neighborhood chosen here is the 4-neighborhood. Neighborhoods defined by other distances and shapes are equally applicable and, for multidimensional vectors, hypercube neighborhoods of various distances and shapes are likewise applicable to the method of the invention.
Fig. 5 shows the assignment of the 81 grid cells after the first round of iteration, and Fig. 6 shows the assignment of the 81 grid cells after the second round of iteration. By analogy, after multiple rounds of iteration, all possible feature vectors in the hypercube have obtained class labels, which lays the foundation for improving the efficiency of the subsequent test stage.
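To make the choice of neighborhood concrete, the sketch below generates index offsets for two common shapes in m dimensions. The function names are hypothetical, and the second shape (all units within a given per-axis distance) is only one possible reading of "other distances and shapes".

```python
from itertools import product


def axis_offsets(m):
    """Offsets to the 2*m units that differ by 1 along exactly one axis.

    For m = 2 this is the 4-neighborhood used in the example.
    """
    offsets = []
    for axis in range(m):
        for step in (-1, 1):
            offset = [0] * m
            offset[axis] = step
            offsets.append(tuple(offset))
    return offsets


def box_offsets(m, radius=1):
    """Offsets to every unit within `radius` along each axis, excluding the unit itself.

    For m = 2 and radius = 1 this is the 8-neighborhood.
    """
    span = range(-radius, radius + 1)
    return [offset for offset in product(span, repeat=m) if any(offset)]


print(len(axis_offsets(2)), len(box_offsets(2)), len(box_offsets(3)))  # 4 8 26
```

A larger neighborhood labels more units per round but counts more neighbors per unit, so the choice trades rounds against work per unit.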
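One possible reading of steps 1 to 5 is sketched below. Three points are assumptions, not statements of the patent: units are labelled round by round (every unassigned unit with at least one labelled neighbor at the start of a round is labelled in that round), ties between classes go to the smaller label, and indices are 0-based. The names `neighbours` and `fill_hypercube` are hypothetical.

```python
from collections import Counter
from itertools import product


def neighbours(unit, ranges):
    """Yield the in-range units that differ from `unit` by 1 along one axis."""
    for axis, size in enumerate(ranges):
        for step in (-1, 1):
            coordinate = unit[axis] + step
            if 0 <= coordinate < size:
                yield unit[:axis] + (coordinate,) + unit[axis + 1:]


def fill_hypercube(ranges, training):
    """Label every unit of the hypercube, starting from the training units."""
    if not training:
        raise ValueError("at least one training vector is required")
    cube = dict(training)                              # step 1
    all_units = list(product(*(range(size) for size in ranges)))
    while len(cube) < len(all_units):                  # step 2
        updates = {}
        for unit in all_units:                         # step 3
            if unit in cube:
                continue
            counts = Counter(cube[n] for n in neighbours(unit, ranges) if n in cube)
            if counts:                                 # step 4
                best = max(counts.values())
                # Step 5; assumption: ties go to the smaller class label.
                updates[unit] = min(l for l, c in counts.items() if c == best)
        cube.update(updates)                           # end of one round
    return cube


# The 8 training vectors from the text (1-based), shifted to 0-based indices.
training = {
    (2, 6): 2, (2, 7): 2, (3, 6): 2, (3, 7): 2,
    (6, 2): 1, (7, 3): 1, (8, 3): 1, (8, 4): 1,
}
zero_based = {(a - 1, b - 1): label for (a, b), label in training.items()}
cube = fill_hypercube((9, 9), zero_based)
print(len(cube))  # 81
```

Because each round only inspects the immediate neighbors of an unassigned unit, no global search over the training set is needed. Whether the rounds produced here match Fig. 5 and Fig. 6 exactly depends on the tie-breaking and scan order, which the text does not specify.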
The pattern classification method proposed for the second step searches for similar samples only within the neighborhood rather than globally, which also helps reduce the time complexity of the training stage.
Fig. 7 shows the architecture of the quick supervised learning classification system for discrete-valued feature vectors provided by the embodiment of the invention. The details are as follows:
The quick supervised learning classification system comprises:
a hypercube construction module, for constructing a hypercube data structure for the training vector set;
a hypercube assignment module, for assigning a value to each unit of the hypercube using a pattern classification method;
a test vector classification module, for classifying a test vector by looking up the value of the corresponding unit of the hypercube.
Each training vector in the training vector set is an m-dimensional vector in which the value range of the i-th dimension has length Ri, i = 1, 2, ..., m. The hypercube has size R1*R2*...*Rm; the index of a unit along each dimension is the feature value of the corresponding vector in that dimension, and the value of each unit is the class label of the corresponding vector.
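The three modules could be organized roughly as in the sketch below. The class and method names are hypothetical. To show that the assignment module can use any pattern classification method, the example plugs in a 1-nearest-neighbor rule with Manhattan distance; that rule is a stand-in chosen for brevity and is not the neighborhood-voting procedure described in this document.

```python
from itertools import product


class HypercubeClassifier:
    """Sketch of the construction, assignment and classification modules."""

    def __init__(self, value_ranges):
        # Construction module: one unit per combination of feature values.
        self.value_ranges = [tuple(values) for values in value_ranges]
        self.units = {unit: None for unit in product(*self.value_ranges)}

    def assign(self, label_of):
        # Assignment module: label every unit with the supplied classifier.
        for unit in self.units:
            self.units[unit] = label_of(unit)

    def classify(self, vector):
        # Classification module: a single table lookup, no further computation.
        return self.units[tuple(vector)]


# The 8 training vectors from the text, as {feature vector: class label}.
training = {
    (2, 6): 2, (2, 7): 2, (3, 6): 2, (3, 7): 2,
    (6, 2): 1, (7, 3): 1, (8, 3): 1, (8, 4): 1,
}


def nearest_training_label(unit):
    """Stand-in classifier: label of the closest training vector (Manhattan distance)."""
    def distance(vector):
        return sum(abs(a - b) for a, b in zip(unit, vector))
    return training[min(training, key=distance)]


model = HypercubeClassifier([range(1, 10), range(1, 10)])
model.assign(nearest_training_label)
print(model.classify([2, 6]), model.classify([9, 1]))  # 2 1
```

All classifier work happens once in `assign`; `classify` never calls the classifier again, which is the point of the design.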
Fig. 8 shows the architecture of the hypercube assignment module in the quick supervised learning classification system for discrete-valued feature vectors provided by the embodiment of the invention. The details are as follows:
a training vector assignment module, for assigning, for each training vector in the training vector set, its class label to the corresponding unit of the hypercube;
an unassigned-unit count judging module, for determining the number of unassigned units in the hypercube; if that number is 0, control jumps to the end module, otherwise to the search-and-count module;
a search-and-count module, for finding the next unassigned unit and, for each class, counting the assigned units of that class in its hypercube neighborhood;
a neighborhood judging module, for checking the number of assigned units in the neighborhood of this unassigned unit; if that number is not 0, control jumps to the unit assignment module; if it is 0, control jumps to the search-and-count module;
a unit assignment module, for setting the value of the unassigned unit to the class label with the largest count among the assigned units in its neighborhood, after which control jumps to the unassigned-unit count judging module;
an end module, for ending the operation of the whole classification system.
A person skilled in the art can make various other corresponding changes and modifications based on the technical solutions and concepts described above, and all such changes and modifications fall within the scope of protection of the claims of the invention.

Claims (4)

1. A quick supervised learning method for discrete-valued feature vectors, characterized by comprising the following steps:
First step: construct a hypercube data structure for the training vector set;
Second step: assign a value to each unit of the hypercube using a pattern classification method;
Third step: classify a test vector by looking up the value of the corresponding unit of the hypercube;
wherein each training vector in the training vector set is an m-dimensional vector in which the value range of the i-th dimension has length Ri, i = 1, 2, ..., m; the hypercube has size R1*R2*...*Rm; the index of a unit along each dimension is the feature value of the corresponding vector in that dimension; and the value of each unit is the class label of the corresponding vector.
2. The quick supervised learning method for discrete-valued feature vectors as claimed in claim 1, characterized in that
the pattern classification method in the second step comprises the following steps:
Step 1: for each training vector in the training vector set, assign its class label to the corresponding unit of the hypercube;
Step 2: if the number of unassigned units in the hypercube is 0, end; otherwise go to step 3;
Step 3: find the next unassigned unit and, for each class, count the assigned units of that class in its hypercube neighborhood;
Step 4: if the number of assigned units in this neighborhood is not 0, go to step 5; if it is 0, go to step 3;
Step 5: set the value of this unassigned unit to the class label with the largest count among the assigned units in its neighborhood, and go to step 2.
3. A quick supervised learning classification system for discrete-valued feature vectors, characterized by comprising:
a hypercube construction module, for constructing a hypercube data structure for the training vector set;
a hypercube assignment module, for assigning a value to each unit of the hypercube using a pattern classification method;
a test vector classification module, for classifying a test vector by looking up the value of the corresponding unit of the hypercube;
wherein each training vector in the training vector set is an m-dimensional vector in which the value range of the i-th dimension has length Ri, i = 1, 2, ..., m; the hypercube has size R1*R2*...*Rm; the index of a unit along each dimension is the feature value of the corresponding vector in that dimension; and the value of each unit is the class label of the corresponding vector.
4. The quick supervised learning classification system for discrete-valued feature vectors as claimed in claim 3, characterized in that the hypercube assignment module comprises:
a training vector assignment module, for assigning, for each training vector in the training vector set, its class label to the corresponding unit of the hypercube;
an unassigned-unit count judging module, for determining the number of unassigned units in the hypercube; if that number is 0, control jumps to the end module, otherwise to the search-and-count module;
a search-and-count module, for finding the next unassigned unit and, for each class, counting the assigned units of that class in its hypercube neighborhood;
a neighborhood judging module, for checking the number of assigned units in the neighborhood of this unassigned unit; if that number is not 0, control jumps to the unit assignment module; if it is 0, control jumps to the search-and-count module;
a unit assignment module, for setting the value of the unassigned unit to the class label with the largest count among the assigned units in its neighborhood, after which control jumps to the unassigned-unit count judging module;
an end module, for ending the operation of the whole classification system.
CN201380003066.XA 2013-05-12 2013-05-12 Quick supervised learning method of discrete value feature vectors, and classification system Pending CN103858135A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2013/075525 WO2014183244A1 (en) 2013-05-12 2013-05-12 Rapid supervision and learning method for characteristic vector of discrete value

Publications (1)

Publication Number Publication Date
CN103858135A true CN103858135A (en) 2014-06-11

Family

ID=50864338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201380003066.XA Pending CN103858135A (en) 2013-05-12 2013-05-12 Quick supervised learning method of discrete value feature vectors, and classification system

Country Status (2)

Country Link
CN (1) CN103858135A (en)
WO (1) WO2014183244A1 (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7034670B2 (en) * 2003-12-30 2006-04-25 Lear Corporation Method of occupancy classification in a vehicle seat
CN101211341A (en) * 2006-12-29 2008-07-02 上海芯盛电子科技有限公司 Image intelligent mode recognition and searching method
CN101364239B (en) * 2008-10-13 2011-06-29 中国科学院计算技术研究所 Method for auto constructing classified catalogue and relevant system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
孟德宇 等,: ""一种新的有监督流形学习方法"", 《计算机研究与发展》 *

Also Published As

Publication number Publication date
WO2014183244A1 (en) 2014-11-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 1199126; Country of ref document: HK)
WD01 Invention patent application deemed withdrawn after publication (Application publication date: 20140611)
WD01 Invention patent application deemed withdrawn after publication
REG Reference to a national code (Ref country code: HK; Ref legal event code: WD; Ref document number: 1199126; Country of ref document: HK)