CN116842454A

CN116842454A - Financial asset classification method and system based on support vector machine algorithm

Info

Publication number: CN116842454A
Application number: CN202310661037.8A
Authority: CN
Inventors: 刘梦迪; 许凤
Original assignee: Nanjing University of Finance and Economics
Current assignee: Nanjing University of Finance and Economics
Priority date: 2023-06-06
Filing date: 2023-06-06
Publication date: 2023-10-03
Anticipated expiration: 2043-06-06
Also published as: CN116842454B

Abstract

The invention provides a financial asset classification method and system, and in particular a financial asset classification method and system based on a support vector machine (SVM) algorithm, with the aim of improving the accuracy and stability of financial asset classification. Drawing on the specific characteristics and data of the finance and accounting field, the invention constructs a financial asset classification model using the SVM algorithm and classifies and predicts financial assets by learning from and optimizing on training data. The model is applicable to financial investment decision-making and risk management, where it can provide support and guidance.

Description

Financial asset classification method and system based on support vector machine algorithm

Technical Field

The invention relates to the field of finance and accounting, and in particular to a financial asset classification method and system based on a support vector machine algorithm.

Background

In the financial market, accurate classification and prediction of financial assets is critical to investment decision-making and risk management. Only by accurately understanding the nature and trends of financial assets can investors make informed decisions, reduce risk and achieve better returns on investment. Ensuring proper classification and reliable prediction of financial assets is therefore critical to investors' success in the market.

SVMs have strong learning and generalization ability, but the prediction performance of an SVM model depends closely on the choice of parameters. Finding an effective method to search for the optimal SVM parameters, and thereby obtain higher classification accuracy, is therefore an active research topic.

Although existing SVMs can handle nonlinear classification through a kernel function, large-scale data analysis in fields such as finance, biomedicine and astronomical measurement involves data sets that are more complex or abstract and places higher demands on the classification accuracy of the classical SVM, so another approach is needed. Researchers have shown empirically that the fruit fly optimization algorithm (FOA) is easy to implement and has strong local search capability. FOA can be used to optimize the parameters of an SVM model and thereby improve its prediction performance. However, most samples in the financial field are unbalanced; in asset classification in particular, most assets fall into the category measured at amortized cost, and this imbalance limits how well FOA can optimize the SVM.

The unbalanced-sample processing method proposed by the invention, HSMOTE (Hierarchical Synthesis Minority Oversampling Technique), combines well with the fruit fly optimization algorithm. Building on SMOTE, it reconstructs the samples by processing the sample feature vectors hierarchically, which removes the constraint that unbalanced samples place on model optimization and lets FOA find hyperparameters that fit the model better.

Based on the HSMOTE method for unbalanced samples and the FOA parameter optimization method, the invention combines HSMOTE and FOA with the SVM model to construct a financial asset classification model, FOA-HSMOTE-SVM, and compares it with an SVM whose parameters are obtained by grid search. The experimental results show that the FOA-HSMOTE-SVM model improves the accuracy of financial asset classification while also being better suited to unbalanced samples.

Disclosure of Invention

The technical problem to be solved by the invention is to overcome the insufficient accuracy of financial asset classification in the prior art by providing an SVM optimized by the fruit fly algorithm that can adapt to unbalanced samples, with the aim of improving the accuracy and stability of financial asset classification.

The model adopts the core principles and advantages of the support vector machine, optimizes the kernel function with the fruit fly algorithm and, drawing on the specific characteristics and data of the financial field, classifies and predicts financial assets by learning from and optimizing on training data.

Consider the classification of financial assets, in which the second category of assets accounts for the largest share of the financial assets. HSMOTE performs well on this imbalance problem: it improves the effectiveness and soundness of the parameters and has excellent noise resistance.

HSMOTE is proposed on the basis of SMOTE and is designed around the different data types in the sample feature vectors. Because the method uses one-hot encoding, Boolean and floating-point data are mixed in the vectors, and applying SMOTE directly tends to make the Boolean components fluctuate over too wide a range, which affects model construction.

Combined with FOA, which is simpler to apply and has stronger local search capability, the SVM parameters are optimized and an improved SVM financial asset classification model is constructed. The verification results show that the FOA-HSMOTE-SVM financial asset classification model performs very well, and the method can give enterprises, investors and related financial support institutions an auxiliary means of classifying financial assets accurately in batches.

The technical solution of the invention is realized as follows: a method for classifying financial assets based on an SVM algorithm, comprising the following steps:

step 1, collecting data: collecting financial market related data including corporate financial statement data, bond stock related data, corporate business data, and industry and market data;

step 2, preprocessing data: performing data cleaning, sample reconstruction, feature extraction and feature selection operations on the relevant data collected in the step 1 to extract effective features related to the classification of the financial assets;

step 3, constructing a financial asset classification model:

optimizing the SVM parameters based on the fruit fly optimization algorithm, wherein, in the discriminant function, $K(x_i, y_i)$ is the kernel function, $x_i$ and $y_i$ respectively represent different feature values of the sample, $b$ is a constant, and $a_i$ ($i = 1, 2, \dots, n$) is the Lagrange factor;

performing global optimization on the kernel function $K(x_i, y_i)$ to construct the financial asset classification model;

step 4, feeding the data preprocessed in step 2 into the financial asset classification model constructed in step 3, and comparing its accuracy, recall and F1 value with those of a model constructed without the sample reconstruction of step 2;

step 5, selecting the financial asset classification model with the better indicators from the comparison in step 4.
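To make the roles of the kernel function, the Lagrange factors and the constant b in step 3 more concrete, the following is a minimal sketch of a kernel decision function. It assumes the textbook binary form sign(sum_i a_i * t_i * K(s_i, x) + b) with an RBF kernel; the class labels `t_i`, the support vectors `s_i`, the RBF choice and the names `rbf_kernel` and `decide` are illustrative assumptions and are not taken from the patent.

```python
import math
from typing import Sequence


def rbf_kernel(u: Sequence[float], v: Sequence[float], g: float) -> float:
    """RBF kernel exp(-g * ||u - v||^2); the RBF form is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-g * sq_dist)


def decide(x, support_vectors, labels, alphas, b: float, g: float) -> int:
    """Return +1 or -1 from sign(sum_i a_i * t_i * K(s_i, x) + b)."""
    score = b + sum(
        a * t * rbf_kernel(s, x, g)
        for s, t, a in zip(support_vectors, labels, alphas)
    )
    return 1 if score >= 0 else -1


# Hand-picked toy values (not trained): one support vector per class.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]
alphas = [1.0, 1.0]
print(decide([1.9, 2.1], support_vectors, labels, alphas, b=0.0, g=0.5))
print(decide([0.1, -0.1], support_vectors, labels, alphas, b=0.0, g=0.5))
```

The sketch is binary for brevity; a three-class asset classifier would typically combine several such decision functions, for example one-vs-rest.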

Preferably, in step 1 and step 2, the following data are collected:

(1) Corporate financial statement data:

Income statement: includes business revenue, net profit, etc., and can be used to evaluate the company's business model and profitability. Cash flow statement: in particular operating cash flows, which can be used to analyze the company's cash inflows and outflows.

(2) Bond stock related data:

Bond and stock issuance documents: include bond terms and conditions, and can be used to understand the bond repayment arrangements and cash flow provisions. Bond repayment plan: records the repayment schedule and amounts of the bond, and can be used to analyze the contractual cash flows of the bond.

(3) Company management data:

Sales contract data: include sales contract amounts, collection terms, etc., and can be used to evaluate the contractual cash flows between the company and its customers. Vendor contract data: include purchase contract amounts, payment terms, etc., and can be used to evaluate the contractual cash flows between the company and its suppliers.

(4) Industry and market data:

Industry reports and studies: provide an understanding of the common business models and cash flow characteristics of the industry, as a reference for comparison and analysis. Market indices and competition: provide an understanding of the competitive environment and industry trends, in order to evaluate the company's strengths and weaknesses in managing financial assets.

Preferably, in step 2, the collected data are cleaned: outliers and missing data are removed, and records designated as non-trading equity instrument investments measured at fair value with changes recognized in other comprehensive income are removed, leaving 200 valid records. Feature engineering, comprising feature extraction and feature selection, is then performed to extract the effective features related to financial asset classification. The most representative and important features are selected, namely the business model under which the company manages the financial assets and the contractual cash flow characteristics of the financial assets, in order to reduce model complexity and improve classification accuracy.
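The cleaning described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout, the field names `cash_flow` and `designated_fvoci_equity`, and the z-score rule used to detect outliers are all hypothetical, since the patent does not say how outliers are identified.

```python
from statistics import mean, pstdev


def clean(records: list[dict], value_key: str = "cash_flow",
          z_max: float = 3.0) -> list[dict]:
    """Drop incomplete rows, designated equity instruments, and outliers."""
    # 1) Remove rows that contain any missing value.
    complete = [r for r in records if all(v is not None for v in r.values())]
    # 2) Remove rows designated as non-trading equity instrument investments
    #    measured at fair value through other comprehensive income.
    kept = [r for r in complete if not r["designated_fvoci_equity"]]
    # 3) Remove outliers with a z-score rule (an assumed criterion).
    values = [r[value_key] for r in kept]
    if len(values) < 2:
        return kept
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return kept
    return [r for r in kept if abs(r[value_key] - mu) / sigma <= z_max]


sample = [
    {"cash_flow": 1.2, "designated_fvoci_equity": False},
    {"cash_flow": None, "designated_fvoci_equity": False},  # missing value
    {"cash_flow": 0.8, "designated_fvoci_equity": True},    # designated
]
print(clean(sample))
```

A z-score threshold only becomes effective with enough rows; for small samples a rule based on quantiles may be the more sensible choice.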

Table 1 Business models under which the company manages financial assets

(1) The business model and the contractual cash flow are selected as the feature values x, and the final output variable is the classification of the financial asset: 1 represents financial assets measured at amortized cost, 2 represents financial assets measured at fair value with changes recognized in other comprehensive income, and 3 represents financial assets measured at fair value with changes recognized in current profit or loss. For stocks, to reduce deviations caused by special cases, stocks designated as non-trading equity instrument investments measured at fair value with changes recognized in other comprehensive income are removed in the data preprocessing stage, and the remaining stocks are classified directly as class 3 assets.

Table 2 Classification of the financial assets managed by the company

(2) Financial asset classification judgment is performed on the collected stocks and bonds. For example, for a bond, using one-hot encoding: if the cash flow test is passed and the business model is 1, the vector is x = [1, 0, …]; if the cash flow test is passed and the business model is 2, the vector is x = [0, 1, 0, …]; likewise, if the bond does not meet these conditions and its business model is one of the other business models, the vector is x = [0, 1, …].
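The encoding in (2) can be illustrated with a short sketch. The three-slot vector layout is an assumption, because the vectors in the text are truncated with "…"; the mapping from the active slot to classes 1, 2 and 3 follows the class numbering given in (1), and the function names are hypothetical.

```python
def encode_bond(passes_cash_flow_test: bool, business_model: int) -> list[int]:
    """One-hot encode the (cash flow test, business model) condition."""
    if passes_cash_flow_test and business_model == 1:
        return [1, 0, 0]
    if passes_cash_flow_test and business_model == 2:
        return [0, 1, 0]
    # Every other combination falls into the remaining slot (assumed layout).
    return [0, 0, 1]


def asset_class(x: list[int]) -> int:
    """Map the active slot to class 1 (amortized cost), 2 (fair value through
    other comprehensive income) or 3 (fair value through profit or loss)."""
    return x.index(1) + 1


for passed, model in [(True, 1), (True, 2), (False, 1), (True, 3)]:
    x = encode_bond(passed, model)
    print(passed, model, x, asset_class(x))
```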

(3) Preferably, in step 3, unbalanced samples are processed with the HSMOTE algorithm. HSMOTE inserts artificial samples among the minority-class samples to reduce the skew of the data and thereby improve the prediction accuracy of the model. The steps of the HSMOTE algorithm are as follows:

for each sample $X$ in the minority class, calculate the Euclidean distance from $X$ to every other sample in the minority class, find its $K$ nearest neighbors, and record the neighbor indices; set a sampling rate vector $N = (\omega_1, \omega_2, \omega_3, \dots, \omega_n)$ for the minority class according to the imbalance ratio between the minority and majority classes; for every minority-class sample $X$, randomly select samples $X_i$ ($i = 1, 2, \dots, N$) from its $K$ nearest neighbors, where the feature vector of the $i$-th sample is denoted $x_i = (x_{i1}, x_{i2}, x_{i3}, \dots, x_{in})$;

The variability between samples is taken into account, where $\bar{x}_j$ denotes the mean of the $j$-th feature over all samples. The conflict between samples is taken into account, where $r_{ij}$ is the correlation coefficient between the $i$-th and $j$-th indicators.

The fluctuation coefficients $f$ of the different data types are taken into account; through testing, the fluctuation coefficients of the Boolean and floating-point components of a sample are set to 0.5 and 2, respectively.

The weights are determined from these quantities.

1) Each neighbor sample $X_i$ is combined with the original sample $X$ according to $X_{new} = X + \mathrm{rand}(0,1) \times N \odot (X_i - X)$ to synthesize a new sample, where $N \odot (X_i - X)$ is the Hadamard product of the vectors;
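The neighbor search and the synthesis formula above can be sketched as follows. This is a simplified illustration, not the patented procedure: the vector `rate` stands in for N and is passed in directly, and using 0.5 for one-hot components and 2 for floating-point components is an assumption, because the weighting formula that produces N is not available in the text. The function names are hypothetical.

```python
import math
import random


def k_nearest(samples: list[list[float]], index: int, k: int) -> list[int]:
    """Indices of the k minority samples closest (Euclidean) to samples[index]."""
    x = samples[index]
    others = [i for i in range(len(samples)) if i != index]
    others.sort(key=lambda i: math.dist(x, samples[i]))
    return others[:k]


def synthesize(x, neighbor, rate, rng: random.Random) -> list[float]:
    """X_new = X + rand(0,1) * N (Hadamard product) (X_i - X)."""
    r = rng.random()
    return [xj + r * nj * (nbj - xj) for xj, nbj, nj in zip(x, neighbor, rate)]


def hsmote(minority, rate, k: int, n_new: int, seed: int = 0):
    """Generate n_new synthetic minority samples."""
    rng = random.Random(seed)
    synthetic = []
    while len(synthetic) < n_new:
        i = rng.randrange(len(minority))
        j = rng.choice(k_nearest(minority, i, k))
        synthetic.append(synthesize(minority[i], minority[j], rate, rng))
    return synthetic


# Two one-hot components followed by one floating-point feature.
minority = [[1, 0, 0.2], [1, 0, 0.4], [1, 0, 0.3], [0, 1, 0.9]]
rate = [0.5, 0.5, 2.0]  # assumed per-component coefficients
for row in hsmote(minority, rate, k=2, n_new=3):
    print(row)
```

With a coefficient below 1 the one-hot components move only part of the way toward the neighbor, which is the damping effect the text attributes to HSMOTE; a coefficient above 1 lets a floating-point component overshoot the neighbor.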

FOA is an optimization procedure that simulates the foraging behavior of a fruit fly swarm and is based on the swarm's cooperation mechanism. The algorithm consists of two parts, visual search and olfactory search, and its only key parameters are the population size and the maximum number of iterations. Compared with other intelligent algorithms, FOA is easier to understand and operate and has stronger local search capability; it has been applied to multi-knapsack problems, financial crisis early warning, neural network parameter optimization, logistics services and other fields. The specific operation flow is as follows:

1) Set the population size sizepop, the maximum number of iterations maxgen, the position range LR of the fruit fly swarm, the single-flight range FR of the fruit flies, and the other relevant parameter values. The position of each individual in the swarm is given by the coordinates (X, Y), and the initial position is: $X_{axis} = \mathrm{rand}(LR)$, $Y_{axis} = \mathrm{rand}(LR)$.

2) Give each fruit fly in the swarm a random flight direction and distance; the new position of individual $i$ is: $X_i = X_{axis} + \mathrm{rand}(FR)$, $Y_i = Y_{axis} + \mathrm{rand}(FR)$.

3) Calculate the distance $DIST_i$ between the position of each fruit fly and the origin: $DIST_i = \sqrt{X_i^2 + Y_i^2}$.

4) Calculate the smell concentration judgment value $S_i$ and the smell concentration value $Smell_i$ of each fruit fly in the swarm: $S_i = 1/DIST_i$, $Smell_i = \mathrm{fitness}(S_i)$, where fitness is the fitness function or objective function;

5) Select the fruit fly with the best smell concentration in the current population and record its smell concentration value and position:

[bestSmell, bestIndex] = min(Smell)

6) The other fruit flies in the swarm move toward that position according to the best smell concentration value and the corresponding position:

SmellBest = bestSmell

$X_{axis} = X(bestIndex)$

$Y_{axis} = Y(bestIndex)$

7) Repeat sub-steps 2) to 6) until the number of iterations reaches maxgen.
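Steps 1) to 7) can be sketched as a small one-dimensional FOA loop. Several details are assumptions made for illustration: rand(LR) is read as a uniform draw from [0, LR], rand(FR) as a uniform draw from [-FR, FR], the swarm moves only when a generation improves on the best value found so far, and the toy fitness function is hypothetical. In the patent, the fitness would come from the SVM.

```python
import math
import random
from typing import Callable


def foa(fitness: Callable[[float], float], sizepop: int = 20,
        maxgen: int = 100, lr: float = 1.0, fr: float = 1.0, seed: int = 0):
    """Minimise fitness(S) with a basic fruit fly optimization loop."""
    rng = random.Random(seed)
    # Step 1: initial swarm position (assumed uniform in [0, lr]).
    x_axis, y_axis = rng.uniform(0, lr), rng.uniform(0, lr)
    best_smell, best_s = math.inf, None
    history = []
    for _ in range(maxgen):
        flies = []
        for _ in range(sizepop):
            # Step 2: random flight around the swarm position.
            x = x_axis + rng.uniform(-fr, fr)
            y = y_axis + rng.uniform(-fr, fr)
            # Step 3: distance to the origin (floored to avoid dividing by 0).
            dist = max(math.hypot(x, y), 1e-12)
            # Step 4: judgment value S and smell value.
            s = 1.0 / dist
            flies.append((fitness(s), s, x, y))
        # Step 5: best fly of this generation, as in min(Smell).
        smell, s, x, y = min(flies)
        # Step 6: move the swarm only if this generation improved (assumed).
        if smell < best_smell:
            best_smell, best_s = smell, s
            x_axis, y_axis = x, y
        history.append(best_smell)  # Step 7: loop until maxgen.
    return best_smell, best_s, history


# Toy objective whose minimum is at S = 0.5 (hypothetical, for illustration).
best_smell, best_s, history = foa(lambda s: (s - 0.5) ** 2)
print(best_smell, best_s)
```

To tune two SVM parameters such as C and g, one common arrangement is to keep one coordinate pair per parameter and let the fitness function train and score the SVM; that wiring is left out here to keep the sketch free of external libraries.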

In the final FOA-HSMOTE-SVM computation, assets measured at amortized cost form the majority-class samples $S_{maj}$, and assets measured at fair value with changes recognized in other comprehensive income form the minority-class samples $S_{min}$. The specific steps are as follows:

1) For each sample point $(X_{smin}, Y_{smin})$ in $S_{min}$, randomly draw one of its neighbors, the number of samples to be synthesized being $|S_{maj} - S_{min}|/2$; multiply the difference between the neighbor and the original sample point $(X_{smin}, Y_{smin})$ by a random number $\delta$ in $[0, 1]$ and add the original sample point $(X_{smin}, Y_{smin})$, which gives a new credit risk sample;

2) Repeat 1) until the number of artificially synthesized credit risk samples reaches $|S_{maj} - S_{min}|/2$;

3) Initialize the model parameters: select the kernel function parameter g and the penalty coefficient C of the SVM, determine the objective function used as the fruit fly smell concentration judgment function, and set the number of iterations maxgen and the population size sizepop of the fruit fly optimization algorithm together with the termination parameters such as bestSmell, where maxgen is 100 and sizepop is 20;

4) Optimize the parameters of the SVM early warning model with FOA: calculate the smell concentration judgment value $Smell_i$ of each fruit fly according to the two formulas $DIST_i = \sqrt{X_i^2 + Y_i^2}$ and $S_i = 1/DIST_i$, and iterate;

5) Terminate the algorithm when bestSmell is smaller than the specified value, obtain the parameters with the best concentration value, and feed the optimal parameters together with $x_{new}$ as the sample set into the FOA-HSMOTE-SVM financial asset classification model.

(4) Preferably, in step 4, the data set of 200 records is divided into a training set and a test set in a 70% proportion: the training set contains 140 records and the test set contains 60 records. For the linearly inseparable case, the improved support vector machine algorithm considers slack variables and introduces a penalty factor C.
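For orientation, the slack-variable formulation mentioned above corresponds, in textbook form, to the soft-margin SVM primal problem shown below. The symbols $w$, $b$, $\xi_i$, the class labels $t_i$ and the feature map $\phi$ are the standard ones and are assumptions here; the patent's own notation for this formula is not available in the text.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad
t_i\bigl(w^{\top}\phi(x_i)+b\bigr)\ \ge\ 1-\xi_i,\qquad
\xi_i\ \ge\ 0,\qquad i=1,\dots,n
```

A larger C penalizes margin violations more heavily, which is why C is one of the parameters that the search procedure has to tune.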

Meanwhile, Bayesian optimization can be incorporated into the model, where w ranges over all possible models, x is the input of the model, y is the output of the model, m is the mean of the model parameters, and Σ is the covariance matrix between models. Using a Gaussian process:

$P(y \mid x, D) = \int P(y \mid x, w)\,P(w \mid D)\,dw \sim N(m, \Sigma)$

the final hyperparameters are then further searched and determined. Finally, the model is evaluated and optimized using methods such as cross-validation, with indicators including accuracy, recall and F1 score. The model is compared with an SVM whose hyperparameters are obtained by grid search and with an FOA-SVM, and the results show that it generalizes better to unseen data.
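The three indicators named above can be computed for the three asset classes as in the sketch below. Macro averaging over the classes is an assumption, since the patent does not state how per-class recall and F1 are aggregated; the function name `evaluate` and the toy labels are hypothetical.

```python
def evaluate(y_true: list[int], y_pred: list[int],
             classes: tuple[int, ...] = (1, 2, 3)) -> dict[str, float]:
    """Accuracy plus macro-averaged recall and F1 over the given classes."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    recalls, f1s = [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        recalls.append(recall)
        f1s.append(f1)
    return {
        "accuracy": accuracy,
        "recall": sum(recalls) / len(classes),
        "f1": sum(f1s) / len(classes),
    }


# Toy labels for illustration only; these are not results from the patent.
y_true = [1, 1, 1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 1, 2, 2, 2, 3, 1]
print(evaluate(y_true, y_pred))
```

On unbalanced data, accuracy alone can look good even when minority classes are misclassified, which is why recall and F1 are reported alongside it.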

Table 3 comparison of test results of three classification models

Preferably, in step 5, the trained SVM model is used to predict and classify the financial asset data in the validation set. The prediction and classification results show that the accuracy of the model reaches 98%, which indicates an excellent classification effect.

Drawings

FIG. 1 is a detailed workflow diagram of the financial asset classification method.

FIG. 2 is a flow chart of the financial asset classification of stocks and bonds.

Detailed Description

The invention is further described below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present invention, and are not intended to limit the scope of the present invention.


(2) Financial asset classification judgment is performed on the collected stocks and bonds. For example, for a bond: if the cash flow test is passed and the business model is 1, the vector is x = [1, 0, …]; if the cash flow test is passed and the business model is 2, the vector is x = [0, 1, 0, …]; likewise, if the bond does not meet these conditions and its business model is one of the other business models, the vector is x = [0, 1, …].

(3) Preferably, in step 3, unbalanced samples are processed with the HSMOTE algorithm. HSMOTE inserts artificial samples among the minority-class samples to reduce the skew of the data and thereby improve the prediction accuracy of the model. HSMOTE is proposed on the basis of SMOTE and is designed around the different data types in the sample feature vectors. Because the method uses one-hot encoding, Boolean and floating-point data are mixed in the vectors, and applying SMOTE directly tends to make the Boolean components fluctuate over too wide a range, which affects model construction.

The steps of the HSMOTE algorithm are as follows:

for each sample $X$ in the minority class, calculate the Euclidean distance from $X$ to every other sample in the minority class, find its $K$ nearest neighbors, and record the neighbor indices;

1) Set a sampling rate vector $N = (\omega_1, \omega_2, \omega_3, \dots, \omega_n)$ for the minority class according to the imbalance ratio between the minority and majority classes; for every minority-class sample $X$, randomly select samples $X_i$ ($i = 1, 2, \dots, N$) from its $K$ nearest neighbors, where the feature vector of the $i$-th sample can be expressed as $x_i = (x_{i1}, x_{i2}, x_{i3}, \dots, x_{in})$. The variability between samples is taken into account, where $\bar{x}_j$ denotes the mean of the $j$-th feature over all samples.

2) The conflict between samples is taken into account, where $r_{ij}$ is the correlation coefficient between the $i$-th and $j$-th indicators.

3) The fluctuation coefficients $f$ of the different data types are taken into account; through testing, the fluctuation coefficients of the Boolean and floating-point components of a sample are set to 0.5 and 2, respectively. The weights are determined from these quantities.

4) Each neighbor sample $X_i$ is combined with the original sample $X$ according to $X_{new} = X + \mathrm{rand}(0,1) \times N \odot (X_i - X)$ to synthesize a new sample, where $N \odot (X_i - X)$ is the Hadamard product of the vectors;

5) The synthesized new samples and the original training sample set are combined into a new training sample set, which is used to train the model.


1) For each sample point $(X_{smin}, Y_{smin})$ in $S_{min}$, randomly draw one of its neighbors, the number of samples to be synthesized being $|S_{maj} - S_{min}|/2$; multiply the difference between the neighbor and the original sample point $(X_{smin}, Y_{smin})$ by a random number $\delta$ in $[0, 1]$ and add the original sample point $(X_{smin}, Y_{smin})$, which gives a new credit risk sample;

2) Repeat 1) until the number of artificially synthesized credit risk samples reaches $|S_{maj} - S_{min}|/2$;


The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims

1. A financial asset classification method based on a support vector machine algorithm, characterized by comprising the following steps:

step 3, constructing a financial asset classification model:

2. The method of claim 1, wherein reconstructing the samples in step 2 comprises the sub-steps of:

step 2-a-1, processing unbalanced samples with the HSMOTE algorithm: for each sample $X$ in the minority class, calculating the Euclidean distance from $X$ to every other sample, finding its $K$ nearest neighbors, and recording the neighbor indices;

step 2-a-2, setting a sampling rate vector $N = (\omega_1, \omega_2, \omega_3, \dots, \omega_n)$ for the minority class according to the imbalance ratio between the minority and majority classes; for every minority-class sample $X$, randomly selecting samples $X_i$ ($i = 1, 2, \dots, N$) from its $K$ nearest neighbors, where the feature vector of the $i$-th sample is denoted $x_i = (x_{i1}, x_{i2}, x_{i3}, \dots, x_{in})$;

step 2-a-3, taking into account the variability between samples, where $\bar{x}_j$ denotes the mean of the $j$-th feature over all samples;

step 2-a-4, taking into account the conflict between samples, where $r_{ij}$ is the correlation coefficient between the $i$-th and $j$-th indicators;

step 2-a-5, taking into account the fluctuation coefficients $f$ of the different data types, the fluctuation coefficients of the Boolean and floating-point components of a sample being set through testing to 0.5 and 2, respectively, and determining the weights from these quantities;

step 2-a-6, combining each neighbor sample $X_i$ with the original sample $X$ according to $X_{new} = X + \mathrm{rand}(0,1) \times N \odot (X_i - X)$ to synthesize a new sample, where $N \odot (X_i - X)$ is the Hadamard product of the vectors.

3. The method of claim 1, wherein the collected data comprise:

1) Corporate financial statement data, including indicators such as business revenue and net profit;

2) Bond- and stock-related data, including bond terms and conditions and bond repayment plans;

3) Company management data, including sales contract data and vendor contract data;

4) Industry and market data, including industry reports and research.

4. A method of classifying a financial asset according to claim 1, wherein said data cleansing in step 2 comprises the sub-steps of:

step 2-b-1, performing data cleaning on the collected data, removing outliers and missing data, and selecting 200 pieces of valid data, which are divided in a 70% proportion into a training set of 140 pieces of data and a test set of 60 pieces of data;

step 2-b-2, selecting the business model and the contractual cash flow as input variables, the final output variable being the classification of the financial asset into three types: 1 represents financial assets measured at amortized cost, 2 represents financial assets measured at fair value with changes recognized in other comprehensive income, and 3 represents financial assets measured at fair value with changes recognized in current profit or loss.

5. The method of claim 1, wherein, for the linearly inseparable case, the support vector machine algorithm considers slack variables and introduces a penalty factor C.

6. The financial asset classification method according to any one of claims 1 to 4, wherein step 3 comprises the following sub-steps:

step 3-1, setting the population size sizepop, the maximum number of iterations maxgen, the position range LR of the fruit fly swarm, the single-flight range FR of the fruit flies and the other parameter values; the position of each individual in the swarm is given by the coordinates (X, Y), and the initial position is: $X_{axis} = \mathrm{rand}(LR)$, $Y_{axis} = \mathrm{rand}(LR)$;

step 3-2, giving each fruit fly in the swarm a random flight direction and distance, the new position of individual $i$ being: $X_i = X_{axis} + \mathrm{rand}(FR)$, $Y_i = Y_{axis} + \mathrm{rand}(FR)$;

step 3-3, calculating the distance $DIST_i$ between the position of each fruit fly and the origin: $DIST_i = \sqrt{X_i^2 + Y_i^2}$;

step 3-4, calculating the smell concentration judgment value $S_i$ and the smell concentration value $Smell_i$ of each fruit fly in the swarm: $S_i = 1/DIST_i$, $Smell_i = \mathrm{fitness}(S_i)$, wherein fitness is the discriminant function of claim 3;

step 3-5, selecting the fruit fly with the best smell concentration in the current population and recording its smell concentration value and position:

[bestSmell, bestIndex] = min(Smell)

step 3-6, moving the other fruit flies in the swarm toward that position according to the best smell concentration value and the corresponding position:

SmellBest = bestSmell

$X_{axis} = X(bestIndex)$

$Y_{axis} = Y(bestIndex)$

step 3-7, repeating sub-steps 3-2 to 3-6 until the number of iterations reaches maxgen.

7. The method of claim 5, wherein assets measured at amortized cost form the majority-class samples $S_{maj}$, assets measured at fair value with changes recognized in other comprehensive income form the minority-class samples $S_{min}$, and the method for constructing the financial asset classification model FOA-SMOTE-SVM comprises the following steps:

4) Optimizing the parameters of the SVM early warning model with FOA: calculating the smell concentration judgment value $Smell_i$ of each fruit fly according to the two formulas $DIST_i = \sqrt{X_i^2 + Y_i^2}$ and $S_i = 1/DIST_i$, and iterating;

5) Terminating the algorithm when bestSmell is smaller than the specified value, obtaining the parameters with the best concentration value, and feeding the optimal parameters together with $x_{new}$ as the sample set into the FOA-SMOTE-SVM financial asset classification model.

8. The method of claim 6, wherein Bayesian optimization is incorporated into the constructed financial asset classification model, where w ranges over all possible models, x is the input of the model, y is the output of the model, m is the mean of the model parameters, and Σ is the covariance matrix between models, using a Gaussian process:

$P(y \mid x, D) = \int P(y \mid x, w)\,P(w \mid D)\,dw \sim N(m, \Sigma)$

and the final hyperparameters are further searched and determined.

9. The method according to claim 1, wherein step 4 includes feeding the test-set data into the financial asset classification method to evaluate the classification effect, and comparing its accuracy, recall and F1 value with those of an SVM whose hyperparameters are obtained by grid search and of an FOA-SVM, respectively.

10. The financial asset classification method according to claim 3 or 8, characterized in that, in order to reduce deviations caused by special cases among stocks, stocks designated as non-trading equity instrument investments measured at fair value with changes recognized in other comprehensive income are removed in the data preprocessing stage, and the remaining stocks are classified directly as class 3 assets.

11. A financial asset classification system, comprising:

the data collection unit is used for collecting relevant data of financial markets, including company financial statement data, bond stock relevant data, company management data and industry and market data;

the data preprocessing unit is used for performing data cleaning, sample reconstruction, feature extraction and feature selection operations on the related data collected by the data collection unit so as to extract effective features related to financial asset classification;

the financial asset classification model construction unit is used for constructing a financial asset classification model based on the FOA-HSMOTE-SVM;

and the evaluation unit is used for feeding the data preprocessed by the data preprocessing unit into the constructed financial asset classification model and evaluating it against a model constructed from samples that have not been balanced.

12. A computer readable storage medium having stored therein at least one executable instruction that when executed on an electronic device causes the electronic device to perform the operations of the financial asset classification method of any one of claims 1-4.