CN107729926A - Data augmentation method based on high-dimensional space transformation, and machine recognition system - Google Patents

Data augmentation method based on high-dimensional space transformation, and machine recognition system

Info

Publication number
CN107729926A
Authority
CN
China
Prior art keywords
sample
data
dimensional space
higher dimensional
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710899032.3A
Other languages
Chinese (zh)
Other versions
CN107729926B (en)
Inventor
赵凤军
吴斌
贺小伟
侯榆青
易黄建
曹欣
王宾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwest University
Original Assignee
Northwest University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwest University
Priority to CN201710899032.3A
Publication of CN107729926A
Application granted
Publication of CN107729926B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/52 Scale-space analysis, e.g. wavelet analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/758 Involving statistics of pixels or of feature values, e.g. histogram matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of image processing and machine learning, and discloses a data augmentation method based on high-dimensional space transformation and a machine recognition system. Background sample data are transformed from the original space into a high-dimensional space; the distribution of the target samples in the high-dimensional space is obtained from the distribution histogram of the background samples, and target sample data are generated in the high-dimensional space; a system of equations derived from a distance function is then used to transform the augmented data from the high-dimensional space back to the original space. By learning the distribution histogram of the negative samples, the invention augments the corresponding positive sample set, addresses the imbalance between positive and negative samples in machine learning models, and improves classification performance, in particular the classification accuracy of positive samples. Because the distribution of the target samples to be generated is obtained by statistical analysis of the background samples, the augmented data are more effective, and the sample overlap and model over-fitting caused by conventional methods that synthesize new target samples from a small number of samples are avoided.

Description

Data augmentation method based on high-dimensional space transformation, and machine recognition system
Technical field
The invention belongs to the technical field of image processing and machine learning, and in particular relates to a data augmentation method based on high-dimensional space transformation and a machine recognition system.
Background art
Machine learning studies how machines recognize existing knowledge and acquire new knowledge and new skills, and it is widely applied in fields such as image recognition, data mining and fault diagnosis. Machine learning first requires sample data to be processed and used for training. In practice, sample data sets are often imbalanced: the number of negative samples usually far exceeds the number of positive samples, and training on such a data set degrades the performance of the classifier. In vascular plaque recognition, for example, plaque accounts for only a small share of the vascular samples and most samples are healthy vessels. A classifier trained on such samples has low accuracy: it may identify a normal vessel as one with plaque, misjudging the patient's condition, or identify a vessel with plaque as normal, delaying the patient's treatment. Correctly classifying such imbalanced data and improving classification accuracy is therefore of great importance to the fields concerned.
At present, imbalanced data sets are handled in two main ways. From the data perspective, the data set is balanced by sampling or augmenting the samples; from the algorithm perspective, the algorithm is modified to improve classifier performance. Conventional data-level methods fall into two kinds. The first is sampling: the negative samples are subsampled so that their number equals that of the original positive samples. This discards the information carried by the samples that are not drawn; when the negative samples far outnumber the positive samples, most of the information in the data is lost and the number of samples taking part in training is seriously insufficient. The second is data augmentation, which increases the number of positive samples: the target samples are analysed and new samples are synthesized from them to balance the data set, for example by simply copying positive samples, adding noise to them, or rotating or flipping them. Simple augmentation, however, easily causes sample overlap and model over-fitting and makes the model harder to train.
To improve on simple augmentation, researchers have proposed further algorithms. The SMOTE algorithm synthesizes new samples by linear interpolation between positive samples that lie close together. It generates new samples for every positive sample, which alleviates over-fitting but easily causes sample overlap; it also ignores the influence of samples near the class boundary and of isolated points on the classification of target samples, so the synthesis is to some extent blind. The BSMOTE algorithm builds on SMOTE: it uses a nearest-neighbour algorithm to divide the target samples into noise samples, interior samples (samples far from the class boundary) and boundary samples, and synthesizes new samples from the target samples at the class boundary. It ignores the background samples and isolated points and is not suitable for data with few target samples.
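The linear interpolation that SMOTE relies on can be illustrated with a minimal NumPy sketch. It is not part of the patent: the function name `smote_like_interpolate`, the neighbour count `k` and the toy data are assumptions chosen for illustration, and the published SMOTE algorithm has details that are not shown here.

```python
import numpy as np

def smote_like_interpolate(positives, k=3, seed=0):
    """For each positive sample, synthesize one point on the segment
    to a randomly chosen one of its k nearest positive neighbours."""
    rng = np.random.default_rng(seed)
    positives = np.asarray(positives, dtype=float)
    n = len(positives)
    # Pairwise squared distances between positive samples.
    diff = positives[:, None, :] - positives[None, :, :]
    d2 = np.sum(diff ** 2, axis=2)
    np.fill_diagonal(d2, np.inf)           # a sample is not its own neighbour
    k = min(k, n - 1)
    neighbours = np.argsort(d2, axis=1)[:, :k]
    picks = neighbours[np.arange(n), rng.integers(0, k, size=n)]
    gap = rng.random((n, 1))               # interpolation weight in [0, 1)
    return positives + gap * (positives[picks] - positives)

# Toy positives: the four corners of the unit square.
positives = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like_interpolate(positives)
```

Every synthetic point lies on a segment between two existing positives, so the new points stay inside the region the positives already occupy; this is the overlap that the passage above refers to.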
In summary, the problems of the prior art are as follows. Synthesizing new samples from an analysis of the target samples alone easily causes sample overlap and ignores boundary samples and isolated points. Because the training samples are limited, the classifier is inaccurate and the achievable improvement in the classification of target samples is restricted: sample overlap can lead to model over-fitting, and ignoring boundary samples and isolated points can cause those sample points to be misclassified.
Summary of the invention
In view of the problems in the prior art, the invention provides a data augmentation method based on high-dimensional space transformation and a machine recognition system.
The invention is implemented as follows. In the data augmentation method based on high-dimensional space transformation, background sample data are transformed from the original space into a high-dimensional space; the distribution of the target samples in the high-dimensional space is obtained from the distribution histogram of the background samples, and high-dimensional target sample data are generated; a system of equations derived from a distance function is then used to transform the augmented data from the high-dimensional space back to the original space.
Further, the data augmentation method based on high-dimensional space transformation comprises the following steps:
Step 1: the data samples are divided into positive samples and negative samples; the positive samples are the target samples and the negative samples are the background samples. For each background sample, the squared Euclidean distance to every background sample is computed; this gives the high-dimensional transformation of the background samples and thereby transforms the background sample data from the original space into the high-dimensional space;
Step 2: the histogram of the high-dimensional background samples is computed for each dimension, and the sample distribution in each dimension is normalized. The normalized histogram of the background samples is complemented to give the histogram distribution of the target samples in each dimension, which is standardized to give the probability distribution of the target samples. From the probability distribution in each dimension, the number of sample points to generate and their value range are obtained for that dimension. Preliminary target sample data are generated from the probability distribution of each dimension, and the order of the values within each dimension is randomly shuffled to produce the target sample data of the high-dimensional space;
Step 3: the distance between a background sample point and a generated target sample point is taken as the distance function, which gives a system of distance equations between the background sample points and a given point of the augmented data. Adjacent equations of the system are subtracted, terms are moved and coefficients merged, giving a system of linear equations in one point of the data to be generated. The solution for one point is generalized to all points of the data to be generated, giving a matrix equation in the low-dimensional augmented data. Solving the matrix equation transforms the augmented data from the high-dimensional space to the original space and yields the augmented target sample data.
Further, transforming the background sample data from the original space into the high-dimensional space in step 1 specifically comprises:
(1) The original data are divided into research samples and background samples. The number of background samples is N and the background sample points are $x_{01}, x_{02}, \ldots, x_{0n}, \ldots, x_{0N}$, where each sample point has Q dimensions and the i-th sample is the row vector $x_{0i} = [x_{0i1}, x_{0i2}, \ldots, x_{0iq}, \ldots, x_{0iQ}]$;
(2) For each background sample point $x_{0i}$, the squared Euclidean distance to every background sample point is computed, giving $d_{i,1}, d_{i,2}, \ldots, d_{i,n}, \ldots, d_{i,N}$, where $d_{i,n} = \|x_{0i} - x_{0n}\|_2^2 = (x_{0i1} - x_{0n1})^2 + (x_{0i2} - x_{0n2})^2 + \ldots + (x_{0iq} - x_{0nq})^2 + \ldots + (x_{0iQ} - x_{0nQ})^2$ ($1 \le i \le N$, $1 \le n \le N$) and $\|x_{0i} - x_{0n}\|_2$ denotes the L2 norm of $(x_{0i} - x_{0n})$. This finally gives the N-dimensional space sample data of the background samples, the $N \times N$ array whose i-th row is $[d_{i,1}, d_{i,2}, \ldots, d_{i,N}]$.
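The transformation in step 1 amounts to building a pairwise squared-distance array. The following is a minimal NumPy sketch of that computation; the function name `to_distance_space` and the toy data are illustrative and do not come from the patent.

```python
import numpy as np

def to_distance_space(background):
    """Map N background samples (an N x Q array) to the N x N array of
    squared Euclidean distances d[i, n] = ||x0_i - x0_n||^2."""
    background = np.asarray(background, dtype=float)
    diff = background[:, None, :] - background[None, :, :]
    return np.sum(diff ** 2, axis=2)

# Three toy background samples with Q = 2 features each.
background = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
D = to_distance_space(background)
# D is [[0, 25, 100], [25, 0, 25], [100, 25, 0]]
```

Each row of `D` is one background sample expressed in the N-dimensional distance space, so the dimension of the new space equals the number of background samples rather than the number of original features.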
Further, generating the target sample data of the high-dimensional space in step 2 specifically comprises:
(1) For each dimension of the high-dimensional transformation of the background samples, the histogram of its N values is computed, with the data of each dimension divided into h bins;
(2) The number of samples in each bin is counted and denoted $y_t$, a row vector holding the per-bin sample counts of the t-th dimension of the high-dimensional transformation of the background samples. The bin counts of this dimension are normalized by dividing $y_t$ by the largest sample count over all bins, $y_t' = y_t / \max(y_t)$;
(3) The normalized bin counts $y_t'$ are complemented and standardized to obtain the probability distribution $p_t$ of the target samples;
(4) The number of target sample points to generate in each bin of this dimension is computed as $k_t = M \times p_t$, where $k_t$ is a row vector of the per-bin counts of generated data in the t-th dimension and M is the number of data points to generate. Within each bin, the number of points given by the corresponding entry of $k_t$ is generated at random from a uniform distribution, and the generated target sample data are recorded as $l_{1,t}, l_{2,t}, \ldots, l_{m,t}, \ldots, l_{M,t}$;
(5) The above operations are applied to every dimension of the high-dimensional transformation of the background samples, giving the per-dimension data of the M points to be augmented in the high-dimensional space. The values within each dimension are then randomly shuffled, giving the high-dimensional sample data of the augmented data, the $M \times N$ array whose m-th row is $[l_{m,1}, l_{m,2}, \ldots, l_{m,N}]$.
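Sub-steps (1) to (4) can be sketched for a single dimension as follows. The text does not spell out the complement and standardization formulas, so this sketch assumes the complement is one minus the normalized count and the standardization is division by the sum; the rounding of $k_t$ to integers, the uniform fallback for a flat histogram, and the function names are likewise assumptions.

```python
import numpy as np

def target_bin_probabilities(values, h):
    """Histogram one dimension into h bins, normalize by the largest
    count, complement, and rescale so the result sums to 1."""
    counts, edges = np.histogram(values, bins=h)
    y_norm = counts / counts.max()          # y_t' = y_t / max(y_t)
    complement = 1.0 - y_norm               # assumed form of the complement
    total = complement.sum()
    if total == 0:                          # flat histogram: assumed fallback
        return np.full(h, 1.0 / h), edges
    return complement / total, edges

def sample_dimension(values, h, M, rng):
    """Draw about M values for one dimension, uniform within each bin."""
    p, edges = target_bin_probabilities(values, h)
    k = np.round(M * p).astype(int)         # k_t = M * p_t, rounded (assumption)
    parts = [rng.uniform(edges[j], edges[j + 1], size=k[j]) for j in range(h)]
    return np.concatenate(parts)

rng = np.random.default_rng(0)
values = np.array([0.1, 0.2, 0.3, 0.4, 2.5, 3.9])   # toy distances, one dimension
p, edges = target_bin_probabilities(values, h=4)
generated = sample_dimension(values, h=4, M=20, rng=rng)
```

In this toy case the first bin holds most of the background values, so its probability is zero and no target values are generated there; the generated values fall in the bins the background samples occupy least.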
Further, transforming the augmented data from the high-dimensional space to the original space in step 3 specifically comprises:
(1) The M augmented sample points are denoted $x_1, x_2, \ldots, x_m, \ldots, x_M$, where each sample point has Q dimensions and the i-th sample is the row vector $x_i = [x_{i1}, x_{i2}, \ldots, x_{iq}, \ldots, x_{iQ}]$. With the distance function $l_{m,n} = \|x_m - x_{0n}\|_2^2$ ($1 \le m \le M$, $1 \le n \le N$), where $x_m$ is the m-th generated target sample point and $x_{0n}$ is the n-th background sample point, the system of distance equations between the background sample points and the augmented data is obtained, one equation for each n;
(2) The quadratic terms of the distance equations are expanded and the n-th and (n+1)-th equations are subtracted ($1 \le n \le N$). The term that is quadratic in $x_m$ cancels, leaving a linear equation in the m-th point of the augmented data for each pair of adjacent equations.
After the terms are moved and the coefficients merged, the linear equations form a system that is written as a matrix equation.
Solving the matrix equation gives one point $x_m$ of the augmented data;
(3) The procedure for computing one point $x_m$ of the augmented data is generalized to all M points, giving the matrix equation for the data points to be generated:
AX=B+C;
Solving this system gives the unknown $X = A^{-1}(B + C)$, where $A^{-1}$ denotes the pseudo-inverse of the matrix A. The result is the set of augmented data points, which completes the transformation of the augmented data from the high-dimensional space to the original space.
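The equations of step 3 can be recovered from the definitions in the text. The following is one consistent way to write them, offered as a reconstruction under assumptions and not as the patent's own notation: the symbols $a_n$, $b_{m,n}$ and $c_n$, the sign and scaling conventions, and the choice to take differences for $n = 1, \ldots, N-1$ are all assumptions.

```latex
\begin{aligned}
l_{m,n} &= \lVert x_m - x_{0n} \rVert_2^2
         = \sum_{q=1}^{Q} x_{mq}^2 - 2\sum_{q=1}^{Q} x_{0nq}\,x_{mq} + \sum_{q=1}^{Q} x_{0nq}^2, \\
l_{m,n} - l_{m,n+1} &= 2\sum_{q=1}^{Q} \bigl(x_{0(n+1)q} - x_{0nq}\bigr)\,x_{mq}
         + \sum_{q=1}^{Q} \bigl(x_{0nq}^2 - x_{0(n+1)q}^2\bigr), \\
\underbrace{2\,\bigl(x_{0(n+1)} - x_{0n}\bigr)}_{a_n}\, x_m^{\mathsf T}
        &= \underbrace{l_{m,n} - l_{m,n+1}}_{b_{m,n}}
         + \underbrace{\lVert x_{0(n+1)} \rVert_2^2 - \lVert x_{0n} \rVert_2^2}_{c_n}.
\end{aligned}
```

Stacking the rows $a_n$ into a matrix $A$, the values $b_{m,n}$ into a column $b_m$ and the values $c_n$ into a column $c$ gives $A x_m^{\mathsf T} = b_m + c$ for one point. With $X = [x_1^{\mathsf T}, \ldots, x_M^{\mathsf T}]$, $B = [b_1, \ldots, b_M]$ and $C = [c, \ldots, c]$ this takes the form $AX = B + C$. Only $B$ depends on the generated distances; $A$ and $c$ depend on the background samples alone.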
Another object of the invention is to provide a machine recognition system that uses the data augmentation method based on high-dimensional space transformation.
Another object of the invention is to provide an image recognition system that uses the data augmentation method based on high-dimensional space transformation.
The advantages and positive effects of the invention are as follows. In a machine learning model, a classifier trained on the original samples performs poorly because the positive samples are too few. By learning the distribution histogram of the negative samples, the invention augments the corresponding positive sample set, addresses the imbalance between positive and negative samples in machine learning models, and improves classification performance, in particular substantially improving the classification accuracy of positive samples. The invention performs statistical analysis on the background samples (negative samples) to obtain the distribution of the target samples (positive samples) to be generated, and then generates the target samples. This addresses the problem that conventional methods ignore boundary samples and isolated points when generating target samples, improves the effectiveness of the augmented data, and avoids the sample overlap and model over-fitting caused by conventional synthesis of new target samples from a small number of samples.
Brief description of the drawings
Fig. 1 is a flow chart of the data augmentation method based on high-dimensional space transformation provided by an embodiment of the invention.
Fig. 2 shows the region selection for sample spatial feature extraction provided by an embodiment of the invention.
Fig. 3 is a detailed flow chart of generating augmented data in the data augmentation method based on high-dimensional space transformation provided by an embodiment of the invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the invention clearer, the invention is described in further detail below with reference to embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
By learning the distribution histogram of the negative samples, the invention augments the corresponding positive sample set and addresses the imbalance between positive and negative samples in machine learning models. Statistical analysis of the background samples (negative samples) gives the distribution of the target samples (positive samples) to be generated, from which the target samples are generated, improving the effectiveness of the augmented data.
The application principle of the invention is explained in detail below with reference to the drawings.
As shown in Fig. 1, the data augmentation method based on high-dimensional space transformation provided by an embodiment of the invention comprises the following steps:
S101: the samples are preprocessed and the background sample data are transformed from the original space into the high-dimensional space;
S102: the high-dimensional background sample data are analysed by histogram statistics to obtain the distribution of the target samples in the high-dimensional space, and high-dimensional target sample data are generated;
S103: a system of equations derived from the distance function is used to transform the augmented data to the original space.
The application principle of the invention is further described below with reference to the drawings.
As shown in Fig. 3, the data augmentation method based on high-dimensional space transformation provided by an embodiment of the invention specifically comprises the following steps:
(1) The samples are preprocessed and the background sample data are transformed from the original space into the high-dimensional space as follows:
(1a) The data used in this example are vascular cross-section images along the central-axis direction in the human vascular system;
(1b) Normal vessel cross-section images are chosen as background samples and vascular plaque cross-section images as target samples. The number of background samples is denoted N and the background sample points are $x_{01}, x_{02}, \ldots, x_{0n}, \ldots, x_{0N}$;
(1c) As shown in Fig. 2, with the centre point of the current background sample as the centre of the circles, samples are taken on the circles at distances of 1, 3 and 5 voxels from the sample centre point. Starting from the innermost circle, the sampling angles are 90°, 45° and 30° in turn, giving 24 sampling regions;
(1d) Features are extracted from the background sample. The mean grey value of each region is the average grey value of all voxels in the region, giving a 24-element feature vector $[x_{0i1}, x_{0i2}, \ldots, x_{0i24}]$, where i denotes the i-th background sample. The mean curvature of each region is computed as the curvature feature of the region, giving a 24-element feature vector $[x_{0i25}, x_{0i26}, \ldots, x_{0i48}]$. Texture features are obtained from the texture map produced by two-dimensional Gabor filtering at 90°, giving the feature vector $[x_{0i49}, x_{0i50}, \ldots, x_{0i72}]$. The Hessian matrix of each point is computed, giving three eigenvalues that represent the directions, from which the feature vector $[x_{0i73}, x_{0i74}, \ldots, x_{0i144}]$ is obtained;
(1e) The above sampling is applied to each background sample and its feature vector is computed, so that each background sample point has Q = 144 dimensions made up of four types of features. The i-th sample is the row vector $x_{0i} = [x_{0i1}, x_{0i2}, \ldots, x_{0iq}, \ldots, x_{0iQ}]$;
(1f) For each background sample point $x_{0i}$, the squared Euclidean distance to every background sample point is computed, giving $d_{i,1}, d_{i,2}, \ldots, d_{i,n}, \ldots, d_{i,N}$, where $d_{i,n} = \|x_{0i} - x_{0n}\|_2^2 = (x_{0i1} - x_{0n1})^2 + (x_{0i2} - x_{0n2})^2 + \ldots + (x_{0iq} - x_{0nq})^2 + \ldots + (x_{0iQ} - x_{0nQ})^2$ ($1 \le i \le N$, $1 \le n \le N$) and $\|x_{0i} - x_{0n}\|_2$ denotes the L2 norm of $(x_{0i} - x_{0n})$. This finally gives the N-dimensional space sample data of the background samples, the $N \times N$ array whose i-th row is $[d_{i,1}, d_{i,2}, \ldots, d_{i,N}]$.
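The sampling layout of step (1c) can be sketched as follows. The sketch assumes that the sampling positions lie on circles of radius 1, 3 and 5 voxels at angular steps of 90°, 45° and 30°; the function name `ring_sample_offsets` and the choice to start each circle at angle 0 are hypothetical and are not stated in the text.

```python
import math

def ring_sample_offsets(radii=(1, 3, 5), steps_deg=(90, 45, 30)):
    """Return (dx, dy) offsets from the sample centre, one per sampling
    position, going outward from the innermost circle."""
    offsets = []
    for radius, step in zip(radii, steps_deg):
        for k in range(360 // step):
            angle = math.radians(k * step)
            offsets.append((radius * math.cos(angle), radius * math.sin(angle)))
    return offsets

offsets = ring_sample_offsets()
print(len(offsets))  # 4 + 8 + 12 = 24 positions
```

The three circles contribute 4, 8 and 12 positions, which matches the 24 sampling regions and the 24-element grey-value and curvature feature vectors of step (1d).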
(2) The high-dimensional background sample data are analysed and the target sample data of the high-dimensional space are generated as follows:
(2a) For each dimension of the high-dimensional transformation of the background samples, the histogram of its N values is computed, with the data of each dimension divided into h bins;
(2b) The number of samples in each bin is counted and denoted $y_t$, a row vector holding the per-bin sample counts of the t-th dimension of the high-dimensional transformation of the background samples. The bin counts of this dimension are normalized by dividing $y_t$ by the largest sample count over all bins, $y_t' = y_t / \max(y_t)$;
(2c) The normalized bin counts $y_t'$ are complemented and standardized to obtain the probability distribution $p_t$ of the target samples;
(2d) The number of target sample points to generate in each bin of this dimension is computed as $k_t = M \times p_t$, where $k_t$ is a row vector of the per-bin counts of generated data in the t-th dimension and M is the number of data points to generate. Within each bin, the number of points given by the corresponding entry of $k_t$ is generated at random from a uniform distribution, and the generated target sample data are recorded as $l_{1,t}, l_{2,t}, \ldots, l_{m,t}, \ldots, l_{M,t}$;
(2e) The above operations are applied to every dimension of the high-dimensional transformation of the background samples, giving the per-dimension data of the M points to be augmented in the high-dimensional space. The values within each dimension are then randomly shuffled, giving the high-dimensional sample data of the augmented data, the $M \times N$ array whose m-th row is $[l_{m,1}, l_{m,2}, \ldots, l_{m,N}]$.
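Steps (2a) to (2e) can be combined into a single routine that produces the whole $M \times N$ array. This is a minimal sketch under assumptions: the complement is taken as one minus the normalized count, the per-bin counts use largest-remainder rounding so that every dimension yields exactly M values, and the names `bin_counts`, `generate_high_dim_targets`, `h` and `seed` are illustrative.

```python
import numpy as np

def bin_counts(p, M):
    """Split M points over the bins in proportion to p (largest-remainder rounding)."""
    raw = M * p
    k = np.floor(raw).astype(int)
    order = np.argsort(raw - k)[::-1]        # bins with the largest remainders first
    k[order[: M - k.sum()]] += 1
    return k

def generate_high_dim_targets(D, M, h=10, seed=0):
    """D: N x N squared-distance array of the background samples.
    Returns an M x N array of generated distances, one column per dimension."""
    rng = np.random.default_rng(seed)
    N = D.shape[1]
    L = np.empty((M, N))
    for t in range(N):
        counts, edges = np.histogram(D[:, t], bins=h)
        comp = 1.0 - counts / counts.max()   # assumed complement of the normalized histogram
        p = comp / comp.sum() if comp.sum() > 0 else np.full(h, 1.0 / h)
        k = bin_counts(p, M)
        col = np.concatenate(
            [rng.uniform(edges[j], edges[j + 1], size=k[j]) for j in range(h)]
        )
        L[:, t] = rng.permutation(col)       # shuffle within the dimension
    return L

# Toy data: 30 background samples with 5 features each.
rng = np.random.default_rng(1)
background = rng.normal(size=(30, 5))
diff = background[:, None, :] - background[None, :, :]
D = np.sum(diff ** 2, axis=2)
L = generate_high_dim_targets(D, M=12)
```

Because each column is shuffled independently, a row of `L` combines values drawn separately per dimension, so it need not be the exact distance vector of any single point in the original space.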
(3) The augmented data are transformed from the high-dimensional space to the original space as follows:
(3a) The M augmented sample points are denoted $x_1, x_2, \ldots, x_m, \ldots, x_M$, where each sample point has Q dimensions and the i-th sample is the row vector $x_i = [x_{i1}, x_{i2}, \ldots, x_{iq}, \ldots, x_{iQ}]$. With the distance function $l_{m,n} = \|x_m - x_{0n}\|_2^2$ ($1 \le m \le M$, $1 \le n \le N$), where $x_m$ is the m-th generated target sample point and $x_{0n}$ is the n-th background sample point, the system of distance equations between the background sample points and the augmented data is obtained, one equation for each n;
(3b) The quadratic terms of the distance equations are expanded and the n-th and (n+1)-th equations are subtracted ($1 \le n \le N$). The term that is quadratic in $x_m$ cancels, leaving a linear equation in the m-th point of the augmented data for each pair of adjacent equations.
After the terms are moved and the coefficients merged, the linear equations form a system that can be written as a matrix equation.
Solving the matrix equation gives one point $x_m$ of the augmented data;
(3c) The procedure for computing one point $x_m$ of the augmented data is generalized to all M points, giving the matrix equation for the data points to be generated:
AX=B+C;
where $C = [c, c, \ldots, c]$.
Solving this system gives the unknown $X = A^{-1}(B + C)$, where $A^{-1}$ denotes the pseudo-inverse of the matrix A. The result is the set of augmented data points, which completes the transformation of the augmented data from the high-dimensional space to the original space.
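The back-transformation of step (3) can be sketched with NumPy's pseudo-inverse. The text does not give the entries of A, B and c, so the sketch assumes one form that follows from subtracting adjacent distance equations: row n of A is $2(x_{0(n+1)} - x_{0n})$, entry (n, m) of B is $l_{m,n} - l_{m,n+1}$, and entry n of c is $\|x_{0(n+1)}\|^2 - \|x_{0n}\|^2$. The function name `back_to_original_space` is illustrative.

```python
import numpy as np

def back_to_original_space(background, L):
    """background: N x Q background samples; L: M x N squared distances from
    each generated point to each background sample. Returns M x Q points."""
    background = np.asarray(background, dtype=float)
    L = np.asarray(L, dtype=float)
    A = 2.0 * (background[1:] - background[:-1])   # (N-1) x Q, assumed form of A
    sq = np.sum(background ** 2, axis=1)
    c = sq[1:] - sq[:-1]                           # (N-1,), same for every point
    B = (L[:, :-1] - L[:, 1:]).T                   # (N-1) x M
    X = np.linalg.pinv(A) @ (B + c[:, None])       # Q x M, X = pinv(A)(B + C)
    return X.T

# Round-trip check on toy data: distances computed from known points
# map back to those points when A has full column rank.
rng = np.random.default_rng(0)
background = rng.normal(size=(8, 3))
true_points = rng.normal(size=(4, 3))
diff = true_points[:, None, :] - background[None, :, :]
L = np.sum(diff ** 2, axis=2)
recovered = back_to_original_space(background, L)
```

When the generated distances do not correspond exactly to any point in the original space, the pseudo-inverse returns the least-squares solution of the linear system, which is one reason a pseudo-inverse rather than an ordinary inverse is needed here.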
The foregoing is merely a preferred embodiment of the invention and is not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (7)

  1. A data augmentation method based on high-dimensional space transformation, characterized in that the method transforms background sample data from the original space into a high-dimensional space; obtains the distribution of the target samples in the high-dimensional space from the distribution histogram of the background samples and generates high-dimensional target sample data; and uses a system of equations derived from a distance function to transform the augmented data from the high-dimensional space to the original space.
  2. The data augmentation method based on high-dimensional space transformation according to claim 1, characterized in that the method comprises the following steps:
    Step 1: the data samples are divided into positive samples and negative samples; the positive samples are the target samples and the negative samples are the background samples. For each background sample, the squared Euclidean distance to every background sample is computed; this gives the high-dimensional transformation of the background samples and thereby transforms the background sample data from the original space into the high-dimensional space;
    Step 2: the histogram of the high-dimensional background samples is computed for each dimension, and the sample distribution in each dimension is normalized. The normalized histogram of the background samples is complemented to give the histogram distribution of the target samples in each dimension, which is standardized to give the probability distribution of the target samples. From the probability distribution in each dimension, the number of sample points to generate and their value range are obtained for that dimension. Preliminary target sample data are generated from the probability distribution of each dimension, and the order of the values within each dimension is randomly shuffled to produce the target sample data of the high-dimensional space;
    Step 3: the distance between a background sample point and a generated target sample point is taken as the distance function, which gives a system of distance equations between the background sample points and a given point of the augmented data. Adjacent equations of the system are subtracted, terms are moved and coefficients merged, giving a system of linear equations in one point of the data to be generated. The solution for one point is generalized to all points of the data to be generated, giving a matrix equation in the low-dimensional augmented data. Solving the matrix equation transforms the augmented data from the high-dimensional space to the original space and yields the augmented target sample data.
  3. The data augmentation method based on high-dimensional space transformation according to claim 2, characterized in that transforming the background sample data from the original space into the high-dimensional space in step 1 specifically comprises:
    (1) The original data are divided into research samples and background samples. The number of background samples is N and the background sample points are $x_{01}, x_{02}, \ldots, x_{0n}, \ldots, x_{0N}$, where each sample point has Q dimensions and the i-th sample is the row vector $x_{0i} = [x_{0i1}, x_{0i2}, \ldots, x_{0iq}, \ldots, x_{0iQ}]$;
    (2) For each background sample point $x_{0i}$, the squared Euclidean distance to every background sample point is computed, giving $d_{i,1}, d_{i,2}, \ldots, d_{i,n}, \ldots, d_{i,N}$, where $d_{i,n} = \|x_{0i} - x_{0n}\|_2^2 = (x_{0i1} - x_{0n1})^2 + (x_{0i2} - x_{0n2})^2 + \ldots + (x_{0iq} - x_{0nq})^2 + \ldots + (x_{0iQ} - x_{0nQ})^2$ ($1 \le i \le N$, $1 \le n \le N$) and $\|x_{0i} - x_{0n}\|_2$ denotes the L2 norm of $(x_{0i} - x_{0n})$. This finally gives the N-dimensional space sample data of the background samples, the $N \times N$ array whose i-th row is $[d_{i,1}, d_{i,2}, \ldots, d_{i,N}]$.
  4. The data amplification method based on high-dimensional space transformation as claimed in claim 2, characterized in that generating the target sample data in the high-dimensional space in step 2 specifically comprises:
    (1) computing, dimension by dimension, the histogram of the $N$ data values in the high-dimensional space transformation of the background samples, the data of each dimension being divided into $h$ intervals;
    (2) counting the samples in each interval and denoting the counts as $y_t$, where $y_t$ is a row vector giving the sample count in each interval of the $t$-th dimension of the high-dimensional space transformation of the background samples, and normalizing the interval sample counts $y_t$ of that dimension by dividing them by the maximum sample count over all intervals, $y_t' = y_t / \max(y_t)$;
    (3) complementing and normalizing the normalized interval sample counts $y_t'$ to obtain the probability distribution $p_t$ of the target samples;
    (4) calculating the number of target sample data points to be generated in each interval of that dimension, $k_t = M \times p_t$, where $k_t$ is a row vector giving the number of data points to be generated in each interval of the $t$-th dimension and $M$ is the number of data points to be generated; randomly generating $k_t$ data points in each interval according to a uniform distribution, and recording the generated target sample data as $l_{1,t}, l_{2,t}, \ldots, l_{m,t}, \ldots, l_{M,t}$;
    (5) performing the above operations on the sample data of every dimension of the high-dimensional space transformation of the background samples to generate the sample data of each dimension of the high-dimensional space for the $M$ data points to be amplified, and randomly shuffling the data within each dimension to obtain the high-dimensional space sample data of the amplification data, in which the $m$-th sample is the row vector $[l_{m,1}, l_{m,2}, \ldots, l_{m,N}]$.
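A rough sketch of steps (1) to (5) for one possible reading of the claim is given below. Several details are assumptions rather than statements of the patent: the complement is taken as 1 − y′ before renormalizing, the non-integer product M × p is rounded with a largest-remainder rule so that exactly M points are produced, and the names `generate_high_dim`, `d_demo`, and `g` are hypothetical.

```python
import numpy as np


def generate_high_dim(d, m, h, rng):
    """Generate m synthetic samples in the N-dimensional distance space.

    d is the N x N high-dimensional representation of the background
    samples; h is the number of histogram intervals per dimension.
    """
    n_dims = d.shape[1]
    out = np.empty((m, n_dims))
    for t in range(n_dims):
        counts, edges = np.histogram(d[:, t], bins=h)   # steps (1)-(2)
        y_norm = counts / counts.max()                  # divide by the maximum
        comp = 1.0 - y_norm                             # ASSUMPTION: complement = 1 - y'
        if comp.sum() == 0:                             # every interval equally full
            comp = np.ones(h)
        p = comp / comp.sum()                           # step (3)
        raw = m * p                                     # step (4): k_t = M * p_t
        k = np.floor(raw).astype(int)
        # ASSUMPTION: hand the leftover points to the largest fractional parts.
        leftover = m - k.sum()
        k[np.argsort(raw - k)[::-1][:leftover]] += 1
        samples = np.concatenate(
            [rng.uniform(edges[j], edges[j + 1], size=k[j]) for j in range(h)]
        )
        out[:, t] = rng.permutation(samples)            # step (5): shuffle within dimension
    return out


# Hypothetical toy input: a 3 x 3 squared-distance matrix.
d_demo = np.array([[0.0, 25.0, 100.0], [25.0, 0.0, 25.0], [100.0, 25.0, 0.0]])
g = generate_high_dim(d_demo, m=10, h=4, rng=np.random.default_rng(0))
```

Under this reading, intervals that are densely populated by background samples receive few generated points and sparse intervals receive many; a different interpretation of the complement step would change that behavior.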
  5. The data amplification method based on high-dimensional space transformation as claimed in claim 2, characterized in that transforming the amplification data from the high-dimensional space to the original space in step 3 specifically comprises:
    (1) denoting the $M$ amplified sample points as $x_1, x_2, \ldots, x_m, \ldots, x_M$, where each sample point contains $Q$-dimensional data and the $i$-th sample is a row vector $x_i = [x_{i1}, x_{i2}, \ldots, x_{iq}, \ldots, x_{iQ}]$; from the distance function $l_{m,n} = \|x_m - x_{0n}\|_2^2$ $(1 \le m \le M,\ 1 \le n \le N)$, where $x_m$ is the $m$-th generated target sample point and $x_{0n}$ is the $n$-th background sample point, the system of distance equations between the background sample points and the amplification data is obtained: $l_{m,n} = \sum_{q=1}^{Q} (x_{mq} - x_{0nq})^2$ for $n = 1, \ldots, N$;
    (2) expanding the quadratic terms of the system of distance equations and subtracting the $(n+1)$-th equation from the $n$-th, $1 \le n \le N$, which cancels the quadratic terms in $x_m$ and gives a linear equation in the $m$-th point of the amplification data: $l_{m,n} - l_{m,n+1} = \sum_{q=1}^{Q} \left[ (x_{mq} - x_{0nq})^2 - (x_{mq} - x_{0(n+1)q})^2 \right]$;
    moving terms across and merging coefficients in the linear equation gives: $2 \sum_{q=1}^{Q} (x_{0(n+1)q} - x_{0nq})\, x_{mq} = l_{m,n} - l_{m,n+1} + \sum_{q=1}^{Q} \left( x_{0(n+1)q}^2 - x_{0nq}^2 \right)$;
    the system of equations is then written as a matrix equation;
    by solving the matrix equation, one point $x_m$ of the amplification data is calculated;
    (3) generalizing the calculation of one point $x_m$ of the amplification data to all $M$ points gives the matrix equation for the data points to be generated:
    $AX = B + C$;
    where $A$ collects the coefficients of the unknowns and $B + C$ collects the right-hand-side terms;
    solving the above system gives the unknown $X = A^{-1}(B + C)$, where $A^{-1}$ denotes the pseudo-inverse of matrix $A$; the result is the set of amplification data points, which completes the transformation of the amplification data from the high-dimensional space to the original space.
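The inverse transformation can be sketched as a linear least-squares problem. The split of the right-hand side used here is an assumption, since the source does not reproduce the definitions of A, B, and C: A holds the rows 2(x0_{n+1} − x0_n), B holds the differences l_{m,n} − l_{m,n+1}, and C holds the differences of squared norms of consecutive background points. Only the N − 1 consecutive pairs are used, and the names `to_original_space`, `bg`, `true_points`, and `recovered` are hypothetical.

```python
import numpy as np


def to_original_space(background, l):
    """Recover M points (M x Q) from their squared distances l (M x N)
    to the N background samples (N x Q)."""
    # ASSUMED coefficient matrix: one row per consecutive pair, shape (N-1, Q).
    a = 2.0 * (background[1:] - background[:-1])
    # ASSUMED B: differences of squared distances, shape (N-1, M).
    b = (l[:, :-1] - l[:, 1:]).T
    # ASSUMED C: differences of squared norms, broadcast over the M columns.
    sq_norm = np.sum(background ** 2, axis=1)
    c = (sq_norm[1:] - sq_norm[:-1])[:, None]
    # X = pinv(A) (B + C), shape (Q, M); transpose to one point per row.
    return (np.linalg.pinv(a) @ (b + c)).T


# Hypothetical check: distances computed from known points should map back to them.
bg = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
true_points = np.array([[0.3, 0.7], [2.0, -1.0]])
l_demo = np.sum((true_points[:, None, :] - bg[None, :, :]) ** 2, axis=-1)
recovered = to_original_space(bg, l_demo)
```

When the distances are exactly consistent with some point, as in this check, the pseudo-inverse returns that point provided A has full column rank. Distances generated dimension by dimension from histograms are generally not mutually consistent, in which case the pseudo-inverse yields the least-squares solution instead.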
  6. A machine recognition system using the data amplification method based on high-dimensional space transformation according to any one of claims 1 to 5.
  7. An image recognition system using the data amplification method based on high-dimensional space transformation according to any one of claims 1 to 5.
CN201710899032.3A 2017-09-28 2017-09-28 Data amplification method and machine identification system based on high-dimensional space transformation Active CN107729926B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710899032.3A CN107729926B (en) 2017-09-28 2017-09-28 Data amplification method and machine identification system based on high-dimensional space transformation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710899032.3A CN107729926B (en) 2017-09-28 2017-09-28 Data amplification method and machine identification system based on high-dimensional space transformation

Publications (2)

Publication Number Publication Date
CN107729926A (en) 2018-02-23
CN107729926B CN107729926B (en) 2021-07-13

Family

ID=61208384

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710899032.3A Active CN107729926B (en) 2017-09-28 2017-09-28 Data amplification method and machine identification system based on high-dimensional space transformation

Country Status (1)

Country Link
CN (1) CN107729926B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108164291A (en) * 2018-03-22 2018-06-15 广西鸿光农牧有限公司 A kind of chicken manure fertilizer device for making
CN108182286A (en) * 2018-01-29 2018-06-19 重庆交通大学 A kind of highway maintenance detection and virtual interactive interface method based on Internet of Things
CN108388203A (en) * 2018-04-09 2018-08-10 衢州学院 A kind of intelligent numerical control machine tool heat dissipation monitoring system
CN108491456A (en) * 2018-03-02 2018-09-04 西安财经学院 The processing method of purchase information is sold in a kind of insurance service based on big data
CN108549281A (en) * 2018-04-11 2018-09-18 湖南城市学院 A kind of architectural design safe escape method of calibration and system
CN109344904A (en) * 2018-10-16 2019-02-15 杭州睿琪软件有限公司 Generate method, system and the storage medium of training sample
CN109919183A (en) * 2019-01-24 2019-06-21 北京大学 A kind of image-recognizing method based on small sample, device, equipment and storage medium
CN110033417A (en) * 2019-04-12 2019-07-19 江西财经大学 A kind of image enchancing method based on deep learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100274539A1 (en) * 2009-04-24 2010-10-28 Hemant VIRKAR Methods for mapping data into lower dimensions
CN104268593A (en) * 2014-09-22 2015-01-07 华东交通大学 Multiple-sparse-representation face recognition method for solving small sample size problem
CN104751191A (en) * 2015-04-23 2015-07-01 重庆大学 Sparse self-adaptive semi-supervised manifold learning hyperspectral image classification method
CN106096640A (en) * 2016-05-31 2016-11-09 合肥工业大学 A kind of feature dimension reduction method of multi-mode system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YI YU et al.: "Construction for true three-dimensional imaging display system and analysis based on state-space model", 2015 IEEE INTERNATIONAL CONFERENCE ON MECHATRONICS AND AUTOMATION (ICMA) *
WANG Shoujue et al.: "A new algorithm for feature space transformation of color images and its application", Acta Electronica Sinica (电子学报) *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108182286A (en) * 2018-01-29 2018-06-19 重庆交通大学 A kind of highway maintenance detection and virtual interactive interface method based on Internet of Things
CN108491456A (en) * 2018-03-02 2018-09-04 西安财经学院 The processing method of purchase information is sold in a kind of insurance service based on big data
CN108164291A (en) * 2018-03-22 2018-06-15 广西鸿光农牧有限公司 A kind of chicken manure fertilizer device for making
CN108388203A (en) * 2018-04-09 2018-08-10 衢州学院 A kind of intelligent numerical control machine tool heat dissipation monitoring system
CN108549281A (en) * 2018-04-11 2018-09-18 湖南城市学院 A kind of architectural design safe escape method of calibration and system
CN109344904A (en) * 2018-10-16 2019-02-15 杭州睿琪软件有限公司 Generate method, system and the storage medium of training sample
CN109344904B (en) * 2018-10-16 2020-10-30 杭州睿琪软件有限公司 Method, system and storage medium for generating training samples
CN109919183A (en) * 2019-01-24 2019-06-21 北京大学 A kind of image-recognizing method based on small sample, device, equipment and storage medium
CN109919183B (en) * 2019-01-24 2020-12-18 北京大学 Image identification method, device and equipment based on small samples and storage medium
CN110033417A (en) * 2019-04-12 2019-07-19 江西财经大学 A kind of image enchancing method based on deep learning
CN110033417B (en) * 2019-04-12 2023-06-13 江西财经大学 Image enhancement method based on deep learning

Also Published As

Publication number Publication date
CN107729926B (en) 2021-07-13

Similar Documents

Publication Publication Date Title
CN107729926A (en) A kind of data amplification method based on higher dimensional space conversion, mechanical recognition system
CN109522857B (en) Crowd counting method based on a generative adversarial network model
WO2022160771A1 (en) Method for classifying hyperspectral images on basis of adaptive multi-scale feature extraction model
CN110443281B (en) Text classification self-adaptive oversampling method based on HDBSCAN clustering
Wang et al. Automatic analysis of lateral cephalograms based on multiresolution decision tree regression voting
CN107194937B (en) Traditional Chinese medicine tongue picture image segmentation method in open environment
CN110276745B (en) Pathological image detection algorithm based on a generative adversarial network
CN106295124B (en) The method of a variety of image detecting technique comprehensive analysis gene subgraph likelihood probability amounts
CN103400388B (en) A kind of method utilizing RANSAC to eliminate Brisk key point error matching points pair
CN111784721B (en) Ultrasonic endoscopic image intelligent segmentation and quantification method and system based on deep learning
CN109685768A (en) Lung neoplasm automatic testing method and system based on lung CT sequence
CN108537751B (en) Thyroid ultrasound image automatic segmentation method based on radial basis function neural network
CN110543906B (en) Automatic skin recognition method based on Mask R-CNN model
CN104217213B (en) A kind of medical image multistage sorting technique based on symmetric theory
CN110516525A (en) SAR image target recognition method based on GAN and SVM
CN106650572A (en) Method for assessing quality of fingerprint image
CN115496720A (en) Gastrointestinal cancer pathological image segmentation method based on ViT mechanism model and related equipment
CN111507184B (en) Human body posture detection method based on parallel dilated convolution and body structure constraint
CN105488798B (en) SAR image method for measuring similarity based on point set contrast
CN116763295B (en) Livestock scale measuring method, electronic equipment and storage medium
CN109919215A (en) The object detection method of feature pyramid network is improved based on clustering algorithm
CN108846845A (en) SAR image segmentation method based on thumbnail and hierarchical fuzzy cluster
CN113160392A (en) Optical building target three-dimensional reconstruction method based on deep neural network
CN106951873A (en) A kind of Remote Sensing Target recognition methods
CN115019065A (en) CT image lesion recognition method based on improved training network

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant