CN109766443A - Text classification method and system based on non-smooth function type - Google Patents

Publication number
CN109766443A
Authority
CN
China
Prior art keywords
type
function
variable step
sample text
weight vectors
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910023612.5A
Other languages
Chinese (zh)
Other versions
CN109766443B (en)
Inventor
陶卿
程禹嘉
陈萍
袁广林
刘欣
秦晓燕
袁友宏
鲍蕾
王秀珍
王晓芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pla Artillery Air Defense Force Academy
PLA Army Academy of Artillery and Air Defense
Original Assignee
Pla Artillery Air Defense Force Academy
Application filed by Pla Artillery Air Defense Force Academy
Priority to CN201910023612.5A
Publication of CN109766443A
Application granted
Publication of CN109766443B
Legal status: Active

Abstract

The invention discloses a text classification method and system based on a non-smooth function type. The text classification method comprises: obtaining a loss function and a regularization term for sample text; constructing a first objective function from the loss function and the regularization term; obtaining the number of iterations and determining variable step sizes according to the number of iterations; constructing a K-iteration model according to the variable step sizes; determining an optimized weight vector according to the K-iteration model and the variable step sizes; determining a second objective function according to the optimized weight vector; and determining the sample text type according to the second objective function. The text classification method and system provided by the invention can distinguish texts more accurately and more quickly in a text classification setting.

Description

Text classification method and system based on non-smooth function type
Technical field
The present invention relates to the field of text classification, and in particular to a text classification method and system based on a non-smooth function type.
Background
When traditional text classification methods are applied to text classification problems with non-smooth functions, they are relatively slow and their classification results are inaccurate. In addition, most traditional text classification methods output their solution as a weighted average, and their objective function has no regularization term. This destroys the sparsity the solution would otherwise have and leads to poor generalization, so the methods cannot be applied well to practical problems and the accuracy of the predicted text type drops considerably.
Summary of the invention
The object of the present invention is to provide a text classification method and system based on a non-smooth function type, in order to solve the problem that traditional text classification methods determine the text type with low accuracy.
To achieve the above object, the present invention provides the following scheme:
A text classification method based on a non-smooth function type, comprising:
obtaining a loss function and a regularization term for sample text;
constructing a first objective function according to the loss function and the regularization term;
obtaining the number of iterations, and determining variable step sizes according to the number of iterations;
constructing a K-iteration model according to the variable step sizes;
determining an optimized weight vector according to the K-iteration model and the variable step sizes;
determining a second objective function according to the optimized weight vector;
determining the sample text type according to the second objective function.
Optionally, constructing the first objective function according to the loss function and the regularization term specifically comprises:
constructing the first objective function according to the formula [formula not reproduced in this text], where f(w_i, y) is the loss function, λR(x) is the regularization term, and m is an integer greater than 0.
Optionally, constructing the K-iteration model according to the variable step sizes specifically comprises:
constructing the K-iteration model according to the formula [formula not reproduced in this text], where the gradient term [symbol not reproduced] is the gradient at x = x_k; x is the optimized weight vector for the training sample set w_i; x_k is the optimized weight vector at the k-th iteration step; x_{k+1} is the optimized weight vector at the (k+1)-th iteration step; ψ(x) = λR(x); the variable step sizes comprise a first variable step size and a second variable step size; the first variable step size is [formula not reproduced in this text]; the second variable step size is [formula not reproduced in this text]; and K is the number of iterations.
Optionally, after determining the sample text type according to the second objective function, the method further comprises:
judging whether the sample text type reaches the expected classification type, to obtain a first judgment result;
if the first judgment result indicates that the sample text type reaches the expected classification type, determining that the sample text type is the correct sample text type;
if the first judgment result indicates that the sample text type does not reach the expected classification type, readjusting the optimized weight vector.
A text classification system based on a non-smooth function type, comprising:
a parameter acquisition module, for obtaining a loss function and a regularization term for sample text;
a first objective function construction module, for constructing a first objective function according to the loss function and the regularization term;
an iteration number acquisition module, for obtaining the number of iterations and determining variable step sizes according to the number of iterations;
a K-iteration model construction module, for constructing a K-iteration model according to the variable step sizes;
an optimized weight vector determination module, for determining an optimized weight vector according to the K-iteration model and the variable step sizes;
a second objective function determination module, for determining a second objective function according to the optimized weight vector;
a sample text type determination module, for determining the sample text type according to the second objective function.
Optionally, the first objective function construction module specifically comprises:
an objective function construction unit, for constructing the first objective function according to the formula [formula not reproduced in this text], where f(w_i, y) is the loss function, λR(x) is the regularization term, and m is an integer greater than 0.
Optionally, the K-iteration model construction module specifically comprises:
a K-iteration model construction unit, for constructing the K-iteration model according to the formula [formula not reproduced in this text], where the gradient term [symbol not reproduced] is the gradient at x = x_k; x is the optimized weight vector for the training sample set w_i; x_k is the optimized weight vector at the k-th iteration step; x_{k+1} is the optimized weight vector at the (k+1)-th iteration step; ψ(x) = λR(x); the variable step sizes comprise a first variable step size and a second variable step size; the first variable step size is [formula not reproduced in this text]; the second variable step size is [formula not reproduced in this text]; and K is the number of iterations.
Optionally, the system further comprises:
a first judgment module, for judging whether the sample text type reaches the expected classification type, to obtain a first judgment result;
a sample text type determination module, for determining that the sample text type is the correct sample text type if the first judgment result indicates that the sample text type reaches the expected classification type;
an optimized weight vector adjustment module, for readjusting the optimized weight vector if the first judgment result indicates that the sample text type does not reach the expected classification type.
According to the specific embodiments provided by the present invention, the invention discloses the following technical effects. The present invention provides a text classification method and system based on a non-smooth function type. Most traditional text classification methods output their solution as a weighted average, which destroys sparsity, leads to poor generalization, prevents the methods from being applied well to practical problems, and gives unsatisfactory classification accuracy. The present invention chooses suitable variable step sizes and directly obtains the optimal solution as an individual output, which better preserves sparsity and avoids the damage to sparsity caused by the weighted-average output of traditional optimization methods.
Brief description of the drawings
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below are obviously only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without any creative effort.
Fig. 1 is a flow chart of the text classification method based on a non-smooth function type provided by the present invention;
Fig. 2 is a flow chart of the text classification method based on a non-smooth function type provided by the present invention, taking binary classification as an example;
Fig. 3 is a structure diagram of the text classification system based on a non-smooth function type provided by the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
The object of the present invention is to provide a text classification method and system based on a non-smooth function type that can distinguish texts more accurately in a text classification setting.
To make the above objects, features and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a flow chart of the text classification method based on a non-smooth function type provided by the present invention. As shown in Fig. 1, a text classification method based on a non-smooth function type comprises:
Step 101: obtain a loss function and a regularization term for sample text.
Step 102: construct a first objective function according to the loss function and the regularization term.
As shown in Fig. 2, taking binary classification as an example, consider an independent and identically distributed text classification sample set [set definition not reproduced in this text], where w_i is the feature vector of the i-th text classification sample, y_i is the classification type (+1 or -1) of that sample, and R^n is a Hilbert space. The optimization objective function is: [formula not reproduced in this text]
where f(w_i, y) is the loss function, λR(x) is the regularization term, and m is an integer greater than 0.
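As an illustration only, an objective built from a loss function and a regularization term λR(x) can be sketched in Python. The sketch assumes the common regularized empirical-risk form (the average loss over the m samples plus λR(x)), and it assumes a hinge loss and an L1 regularizer; the text does not fix either choice, and every function name below is hypothetical.

```python
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally long vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def hinge_loss(x: Sequence[float], w: Sequence[float], y: int) -> float:
    """Assumed loss f: hinge loss of weight vector x on sample (w, y)."""
    return max(0.0, 1.0 - y * dot(x, w))


def l1_norm(x: Sequence[float]) -> float:
    """Assumed regularizer R(x): the L1 norm, which encourages sparsity."""
    return sum(abs(v) for v in x)


def objective(x, samples, labels, lam: float) -> float:
    """Average loss over the m samples plus lam * R(x) (assumed form)."""
    m = len(samples)
    avg_loss = sum(hinge_loss(x, w, y) for w, y in zip(samples, labels)) / m
    return avg_loss + lam * l1_norm(x)
```

With the all-zero weight vector every hinge loss equals 1 and the regularizer is 0, so the objective evaluates to 1.0 regardless of the data.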
Step 103: obtain the number of iterations, and determine variable step sizes according to the number of iterations.
Step 104: construct a K-iteration model according to the variable step sizes.
Step 105: determine an optimized weight vector according to the K-iteration model and the variable step sizes.
Initialize the regularization parameter and the variable step sizes, namely α_0 = 1, β_0 = 1, x_0 = x_1 = 0, and execute K iterations in the following way: [formula not reproduced in this text]
where the gradient term [symbol not reproduced] is the gradient at x = x_k; x is the optimized weight vector for the training sample set w_i; x_k is the optimized weight vector at the k-th iteration step; x_{k+1} is the optimized weight vector at the (k+1)-th iteration step; ψ(x) is the regularization term, ψ(x) = λR(x); the variable step sizes comprise a first variable step size and a second variable step size; the first variable step size is [formula not reproduced in this text]; the second variable step size is [formula not reproduced in this text]; and K is the number of iterations.
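The iteration can be sketched in Python as a generic heavy-ball (momentum) subgradient step followed by a proximal step for the regularization term, returning the final iterate rather than an average. This is a sketch under stated assumptions, not the patented update rule: the hinge loss, the L1 regularizer (whose proximal step is soft-thresholding), and the two step-size schedules 1/sqrt(k+1) and 1/(k+1) are all assumptions, chosen only so that the first values equal 1 as in the initialization above. All names are hypothetical.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def hinge_subgradient(x, samples, labels) -> List[float]:
    """Subgradient of the average hinge loss at x (assumed loss)."""
    m = len(samples)
    g = [0.0] * len(x)
    for w, y in zip(samples, labels):
        if y * dot(x, w) < 1.0:  # only margin-violating samples contribute
            for j, wj in enumerate(w):
                g[j] -= y * wj / m
    return g


def soft_threshold(v: Sequence[float], t: float) -> List[float]:
    """Proximal step for t * ||x||_1: shrink each entry toward zero by t."""
    return [math.copysign(max(abs(vi) - t, 0.0), vi) for vi in v]


def heavy_ball_last_iterate(samples, labels, lam: float, K: int) -> List[float]:
    """Run K momentum steps from x_0 = x_1 = 0 and return the last iterate."""
    n = len(samples[0])
    x_prev = [0.0] * n
    x = [0.0] * n
    for k in range(K):
        alpha = 1.0 / math.sqrt(k + 1)  # assumed first variable step size
        beta = 1.0 / (k + 1)            # assumed second (momentum) step size
        g = hinge_subgradient(x, samples, labels)
        moved = [xi - alpha * gi + beta * (xi - xpi)
                 for xi, gi, xpi in zip(x, g, x_prev)]
        x_prev, x = x, soft_threshold(moved, alpha * lam)
    return x
```

Because the function returns the final iterate, coordinates that the soft-thresholding step sets to exactly zero stay zero in the output.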
Step 106: determine a second objective function according to the optimized weight vector.
Step 107: determine the sample text type according to the second objective function.
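For the binary case, steps 106 and 107 can be illustrated with a linear decision rule. This is an assumption made for the sketch: the text does not spell out the form of the second objective function, so the code below simply treats it as the inner product of the optimized weight vector with a sample's feature vector and takes its sign as the type. The function names are hypothetical.

```python
from typing import Sequence


def decision_value(x: Sequence[float], w: Sequence[float]) -> float:
    """Assumed decision function: inner product of weights x and features w."""
    return sum(xi * wi for xi, wi in zip(x, w))


def predict_type(x: Sequence[float], w: Sequence[float]) -> int:
    """Map the decision value to a classification type, +1 or -1."""
    return 1 if decision_value(x, w) >= 0.0 else -1
```

A sparse weight vector makes this rule cheap to evaluate, since features with zero weight do not affect the decision value.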
After step 107, the method further comprises: judging whether the sample text type reaches the expected classification type; if so, determining that the sample text type is the correct sample text type; if not, readjusting the regularization term.
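The check-and-readjust step can be sketched as a small loop in Python. The sketch makes several assumptions that the text does not state: "reaching the expected classification type" is measured as accuracy against known labels, "readjusting the regularization term" is done by shrinking its coefficient by a fixed factor, and training is supplied as a caller-provided function. All names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def accuracy(x: Sequence[float], samples, labels) -> float:
    """Fraction of samples whose predicted sign matches the label."""
    hits = 0
    for w, y in zip(samples, labels):
        score = sum(xi * wi for xi, wi in zip(x, w))
        hits += (1 if score >= 0.0 else -1) == y
    return hits / len(samples)


def tune_regularization(
    train: Callable[[Sequence[Vector], Sequence[int], float], Vector],
    samples: Sequence[Vector],
    labels: Sequence[int],
    lam: float,
    target: float,
    shrink: float = 0.5,
    max_rounds: int = 10,
) -> Tuple[Vector, float]:
    """Retrain with a smaller coefficient until the target accuracy is met."""
    x: Vector = []
    for _ in range(max_rounds):
        x = train(samples, labels, lam)
        if accuracy(x, samples, labels) >= target:
            break
        lam *= shrink  # assumed adjustment rule
    return x, lam
```

The loop stops after `max_rounds` attempts even if the target is never met, so the caller should compare the returned accuracy with the target before trusting the result.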
Fig. 3 is a structure diagram of the text classification system based on a non-smooth function type provided by the present invention. As shown in Fig. 3, a text classification system based on a non-smooth function type comprises:
Parameter acquisition module 301, for obtaining a loss function and a regularization term for sample text.
First objective function construction module 302, for constructing a first objective function according to the loss function and the regularization term.
The first objective function construction module 302 specifically comprises: an objective function construction unit, for constructing the objective function according to the formula [formula not reproduced in this text], where f(w_i, y) is the loss function, λR(x) is the regularization term, and m is an integer greater than 0.
Iteration number acquisition module 303, for obtaining the number of iterations and determining variable step sizes according to the number of iterations.
K-iteration model construction module 304, for constructing a K-iteration model according to the variable step sizes.
The K-iteration model construction module 304 specifically comprises: a K-iteration model construction unit, for constructing the K-iteration model according to the formula [formula not reproduced in this text], where the gradient term [symbol not reproduced] is the gradient at x = x_k; x is the optimized weight vector for the training sample set w_i; x_k is the value of x at the k-th iteration step; x_{k+1} is the value of x at the (k+1)-th iteration step; ψ(x) = λR(x); the variable step sizes comprise a first variable step size and a second variable step size; the first variable step size is [formula not reproduced in this text]; the second variable step size is [formula not reproduced in this text]; and K is the number of iterations.
Optimized weight vector determination module 305, for determining an optimized weight vector according to the K-iteration model and the variable step sizes.
Second objective function determination module 306, for determining a second objective function according to the optimized weight vector.
Sample text type determination module 307, for determining the sample text type according to the second objective function.
The invention further comprises: a first judgment module, for judging whether the sample text type reaches the expected classification type, to obtain a first judgment result; a sample text type determination module, for determining that the sample text type is the correct sample text type if the first judgment result indicates that the sample text type reaches the expected classification type; and a regularization term adjustment module, for readjusting the regularization term if the first judgment result indicates that the sample text type does not reach the expected classification type.
In practice, the text classification method and system based on a non-smooth function type provided by the present invention are a text classification method and system based on an improved heavy-ball method. Owing to the inertia (momentum) factor of the heavy-ball method, the globally optimal solution can be found faster when handling large-scale text classification problems, so that a better text classification result is reached more quickly and accurately.
At present, most optimization methods output an averaged solution, which has poor sparsity. The present invention outputs an individual solution, whose convergence rate reaches the theoretical optimum while retaining good sparsity, so that texts can be distinguished more accurately in a text classification setting.
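The effect of averaging on sparsity can be seen in a few lines of Python. The iterates below are made-up illustrative values, not results from the method, and the helper names are hypothetical; the point is only that averaging vectors with different zero patterns generally produces a vector with few or no zeros, while a single iterate keeps its own zeros.

```python
from typing import List, Sequence


def average(iterates: Sequence[Sequence[float]]) -> List[float]:
    """Coordinate-wise mean of a list of equally long vectors."""
    count = len(iterates)
    return [sum(it[j] for it in iterates) / count
            for j in range(len(iterates[0]))]


def nonzeros(x: Sequence[float]) -> int:
    """Number of entries that are not exactly zero."""
    return sum(1 for v in x if v != 0.0)


# Made-up sparse iterates with different zero patterns.
iterates = [[0.4, 0.0, 0.1], [0.0, 0.2, 0.0], [0.5, 0.0, 0.0]]
averaged_output = average(iterates)   # every coordinate becomes non-zero
individual_output = iterates[-1]      # the last iterate keeps its zeros
```

Here `nonzeros(averaged_output)` is 3 while `nonzeros(individual_output)` is 1.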
By using a regularization term, the present invention achieves good generalization and is widely applicable; it can also be extended to other, similar optimization problems based on machine learning.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can be referred to one another. Because the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, it is described relatively briefly; for the relevant details, refer to the description of the method.
Specific examples are used herein to illustrate the principle and implementation of the present invention. The description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. A person skilled in the art may, following the idea of the present invention, make changes to the specific implementation and the scope of application. In conclusion, the content of this specification should not be construed as limiting the present invention.

Claims (8)

1. A text classification method based on a non-smooth function type, characterized by comprising:
obtaining a loss function and a regularization term for sample text;
constructing a first objective function according to the loss function and the regularization term;
obtaining the number of iterations, and determining variable step sizes according to the number of iterations;
constructing a K-iteration model according to the variable step sizes;
determining an optimized weight vector according to the K-iteration model and the variable step sizes;
determining a second objective function according to the optimized weight vector;
determining the sample text type according to the second objective function.
2. The text classification method based on a non-smooth function type according to claim 1, characterized in that constructing the first objective function according to the loss function and the regularization term specifically comprises:
constructing the first objective function according to the formula [formula not reproduced in this text], where f(w_i, y) is the loss function, λR(x) is the regularization term, and m is an integer greater than 0.
3. The text classification method based on a non-smooth function type according to claim 1, characterized in that constructing the K-iteration model according to the variable step sizes specifically comprises:
constructing the K-iteration model according to the formula [formula not reproduced in this text], where the gradient term [symbol not reproduced] is the gradient at x = x_k; x is the optimized weight vector for the training sample set w_i; x_k is the optimized weight vector at the k-th iteration step; x_{k+1} is the optimized weight vector at the (k+1)-th iteration step; ψ(x) = λR(x); the variable step sizes comprise a first variable step size and a second variable step size; the first variable step size is [formula not reproduced in this text]; the second variable step size is [formula not reproduced in this text]; and K is the number of iterations.
4. The text classification method based on a non-smooth function type according to claim 1, characterized in that, after determining the sample text type according to the second objective function, the method further comprises:
judging whether the sample text type reaches the expected classification type, to obtain a first judgment result;
if the first judgment result indicates that the sample text type reaches the expected classification type, determining that the sample text type is the correct sample text type;
if the first judgment result indicates that the sample text type does not reach the expected classification type, readjusting the optimized weight vector.
5. A text classification system based on a non-smooth function type, characterized by comprising:
a parameter acquisition module, for obtaining a loss function and a regularization term for sample text;
a first objective function construction module, for constructing a first objective function according to the loss function and the regularization term;
an iteration number acquisition module, for obtaining the number of iterations and determining variable step sizes according to the number of iterations;
a K-iteration model construction module, for constructing a K-iteration model according to the variable step sizes;
an optimized weight vector determination module, for determining an optimized weight vector according to the K-iteration model and the variable step sizes;
a second objective function determination module, for determining a second objective function according to the optimized weight vector;
a sample text type determination module, for determining the sample text type according to the second objective function.
6. The text classification system based on a non-smooth function type according to claim 5, characterized in that the first objective function construction module specifically comprises:
an objective function construction unit, for constructing the first objective function according to the formula [formula not reproduced in this text], where f(w_i, y) is the loss function, λR(x) is the regularization term, and m is an integer greater than 0.
7. The text classification system based on a non-smooth function type according to claim 5, characterized in that the K-iteration model construction module specifically comprises:
a K-iteration model construction unit, for constructing the K-iteration model according to the formula [formula not reproduced in this text], where the gradient term [symbol not reproduced] is the gradient at x = x_k; x is the optimized weight vector for the training sample set w_i; x_k is the optimized weight vector at the k-th iteration step; x_{k+1} is the optimized weight vector at the (k+1)-th iteration step; ψ(x) = λR(x); the variable step sizes comprise a first variable step size and a second variable step size; the first variable step size is [formula not reproduced in this text]; the second variable step size is [formula not reproduced in this text]; and K is the number of iterations.
8. The text classification system based on a non-smooth function type according to claim 1, characterized by further comprising:
a first judgment module, for judging whether the sample text type reaches the expected classification type, to obtain a first judgment result;
a sample text type determination module, for determining that the sample text type is the correct sample text type if the first judgment result indicates that the sample text type reaches the expected classification type;
an optimized weight vector adjustment module, for readjusting the optimized weight vector if the first judgment result indicates that the sample text type does not reach the expected classification type.
CN201910023612.5A 2019-01-10 2019-01-10 Text classification method and system based on non-smooth function type Active CN109766443B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910023612.5A CN109766443B (en) 2019-01-10 2019-01-10 Text classification method and system based on non-smooth function type

Publications (2)

Publication Number Publication Date
CN109766443A true CN109766443A (en) 2019-05-17
CN109766443B CN109766443B (en) 2020-10-09

Family

ID=66453624

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910023612.5A Active CN109766443B (en) 2019-01-10 2019-01-10 Text classification method and system based on non-smooth function type

Country Status (1)

Country Link
CN (1) CN109766443B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105046284A (en) * 2015-08-31 2015-11-11 鲁东大学 Feature selection based multi-example multi-tag learning method and system
CN107133626A (en) * 2017-05-10 2017-09-05 安徽大学 A kind of medical image classification method based on part mean random Optimized model
US20170308790A1 (en) * 2016-04-21 2017-10-26 International Business Machines Corporation Text classification by ranking with convolutional neural networks
US20180365228A1 (en) * 2017-06-15 2018-12-20 Oracle International Corporation Tree kernel learning for text classification into classes of intent

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
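高乾坤 et al.: "Coordinate optimization algorithm for non-smooth losses based on the alternating direction method of multipliers", 《计算机应用》 (Journal of Computer Applications) *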

Also Published As

Publication number Publication date
CN109766443B (en) 2020-10-09

Similar Documents

Publication Publication Date Title
CN108876021B (en) Medium-and-long-term runoff forecasting method and system
CN111639429B (en) Underwater sound field numerical simulation method, system and medium based on Chebyshev polynomial spectrum
WO2009038271A1 (en) Method for automatic clustering, and method and apparatus for multipath clustering in wireless communication using the same
CN105631090A (en) Finite element model optimization device and method
Slavakis et al. Adaptive algorithm for sparse system identification using projections onto weighted ℓ 1 balls
CN109871622A (en) A kind of low-voltage platform area line loss calculation method and system based on deep learning
CN106202756B (en) Deficient based on single layer perceptron determines blind source separating source signal restoration methods
CN110444022A (en) The construction method and device of traffic flow data analysis model
CN111062462A (en) Local search and global search fusion method and system based on differential evolution algorithm
CN105895089A (en) Speech recognition method and device
CN110376290A (en) Acoustic emission source locating method based on multidimensional Density Estimator
US9275304B2 (en) Feature vector classification device and method thereof
CN110118979A (en) The method of improved differential evolution algorithm estimation multipath parameter based on broad sense cross-entropy
CN104122584B (en) Method and device for determining directionality according to seismic data
CN109738852A (en) The distributed source two-dimensional space Power estimation method rebuild based on low-rank matrix
CN109766443A (en) Text classification method and system based on non-smooth function type
CN105044698B (en) Method suitable for micro-Doppler analysis of space target in short-time observation
CN109274352A (en) More convex combination adaptive filter methods based on maximal correlation entropy
CN109508785A (en) A kind of asynchronous parallel optimization method for neural metwork training
CN106384298B (en) A kind of intelligent power missing data modification method based on two stages interpolation model
CN104683006B (en) Beamforming Method based on Landweber iterative methods
CN110571825A (en) Static synchronous compensator model parameter identification method and system
CN109885877A (en) A kind of constrained domain optimization Latin hypercube design method based on clustering algorithm
Rahman et al. Convergence of the fast state estimation for power systems
CN109143371A (en) A kind of noise remove method and device of seismic data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant