CN105931224A - Pathology identification method for routine scan CT image of liver based on random forests - Google Patents

Pathology identification method for routine scan CT image of liver based on random forests

Info

Publication number
CN105931224A
Authority
CN
China
Prior art keywords
image
feature
pathological changes
liver
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610231280.6A
Other languages
Chinese (zh)
Inventor
金心宇
武海涛
金奇樑
刘帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201610231280.6A
Publication of CN105931224A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10072 Tomographic images
    • G06T2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30056 Liver; Hepatic

Abstract

The invention discloses a pathology identification method for routine-scan (plain) CT images of the liver based on random forests. Gray-level texture features are extracted from the lesion region of a plain CT image of the liver and used as the image feature vector. A random forest is then used to perform feature selection on that feature vector to obtain the most effective feature combination. The most effective feature data set is used for training and learning, and the identification ability of the decision trees in the random forest is balanced and optimized, yielding the final lesion identification model.

Description

Lesion identification method for plain CT images of the liver based on the random forest algorithm
Technical field
The present invention relates to a lesion identification method for plain CT images of the liver based on the random forest algorithm, and in particular to the introduction of a most-effective-feature selection method and an improvement of the random forest algorithm.
Background
With the development and maturation of medical imaging technology, medical images play an important role in the diagnosis of liver disease. Liver cancer has become one of the diseases with the highest fatality rates in the world. Because treatment options are limited and the pathological indicators of early liver cancer are not obvious, misdiagnosis is likely, and the optimal window for treatment is then missed. A definitive diagnosis of liver cancer relies mainly on liver biopsy, but this technique causes some damage to the patient's liver, is relatively difficult to perform, and is followed by slow post-operative recovery. For these reasons, the diagnosis of liver disease currently still relies mainly on medical imaging, such as liver CT.
Summary of the invention
The technical problem to be solved by the present invention is to provide a lesion identification method for plain CT images of the liver based on the random forest algorithm.
To solve the above technical problem, the present invention provides the following technical solution:
The present invention implements a method for identifying local lesions in plain CT images of the liver. Specifically, gray-level texture features of the lesion region of a plain CT image of the liver are extracted and used as the image feature vector. The random forest algorithm is then used to perform feature selection on the feature vectors of liver CT lesion regions to select the most effective feature combination. The most effective feature data set is then used for training and learning, and the identification ability of the decision trees of the random forest is balanced and optimized to obtain the final lesion identification model.
The block diagram for building the lesion identification model is shown in Figure 1. The process is divided into the following steps:
1) Build the feature data set of liver CT image lesion regions
For gray-level texture features of an image, the image processing field offers three feature representations: the gray-level histogram, the gray-level co-occurrence matrix, and the gray-gradient co-occurrence matrix. The lesion region marked by the physician is taken from the physician-annotated plain CT image of the liver, a rectangular box that covers the lesion region is used as the lesion feature extraction region, and image features based on the gray-level histogram, gray-level co-occurrence matrix, and gray-gradient co-occurrence matrix are extracted from that region.
A. Gray-level histogram feature extraction
The gray-level histogram represents the gray-level distribution and statistical properties of an image. Image features based on the gray-level histogram include the mean, variance, skewness, kurtosis, energy, and entropy.
B. Gray-level co-occurrence matrix feature extraction
The gray-level co-occurrence matrix describes the gray-level relationship between neighboring pixels in a gray-scale image. Image features based on the gray-level co-occurrence matrix include the angular second moment, contrast, inverse difference moment, entropy, and correlation.
C. Gray-gradient co-occurrence matrix feature extraction
The gray-gradient co-occurrence matrix characterizes the relationship between the gray value and the gradient value of image pixels. It describes the distribution of pixel gray levels and gradients within the image and reflects the spatial relationship between a pixel and the pixels in its neighborhood, so it expresses image texture well. Image features based on the gray-gradient co-occurrence matrix include small gradient dominance, large gradient dominance, non-uniformity of the gray-level distribution, non-uniformity of the gradient distribution, energy, gray-level mean, gradient mean, gray-level mean square deviation, gradient mean square deviation, correlation, gray-level entropy, gradient entropy, mixed entropy, inertia, and inverse difference moment.
The three types of image features above are combined, together with the lesion type label of the image, to form the feature vector data set D.
2) Select the most effective features
The three feature categories extracted in step 1) comprise 26 feature values in total. Not all of these 26 features reflect the specificity of local lesion features in plain CT images of the liver, so selecting poor features may result in an identification model with poor performance. Selecting effective features is therefore essential to the identification model being built, and it also reduces the computational cost of the algorithm.
The random forest algorithm used by the invention provides an importance measure for each feature in the feature vector. The least important feature is removed by loop iteration, a new random forest model is built on the remaining features, and the feature combination corresponding to the model with the smallest generalization error is taken as the most effective feature combination. The detailed iterative process is given in the detailed description.
3) Build and improve the random forest lesion identification model
Using the most effective feature attribute combination obtained in step 2), the most effective feature data set is filtered out of the original feature data set. A random forest model is built on this data set, and the decision trees in the model are optimized and balanced to obtain the final random forest lesion identification model. The specific optimization method is given in the detailed description.
The generated random forest model contains many decision trees. Some of them identify lesions well and others poorly, so the trees with poor identification performance can be discarded. When screening for high-performing trees, however, the invention takes into account that different types of local liver lesions differ in how well they are classified and identified. Trees are therefore screened not by the overall OOB estimate but by the OOB estimate of the identification performance on each single class of samples: for each lesion type, an equal number of trees with the best single-class classification performance is selected, and these trees form the new random forest. Take a random forest of 40 decision trees as an example: 10 trees have the best OOB estimate for identifying normal liver features, 10 for hepatic hemangioma lesion features, 10 for hepatic cyst lesion features, and 10 for liver cancer lesion features. Selecting and balancing trees in this way avoids the defect of global screening, in which the identification accuracy for a certain lesion type ends up low.
The invention performs automatic identification of local lesions in plain CT images of the liver, focusing on lesion types such as liver cancer, hepatic hemangioma, and hepatic cyst. The image difference between a lesion region and a normal region shows up as changes in gray level and texture. The commonly used gray-level texture features are those based on the gray-level histogram, gray-level co-occurrence matrix, and gray-gradient co-occurrence matrix. The suspected lesion region of interest is extracted as the feature representation of the image, numerical feature data are obtained from it with the feature extraction algorithms, and the improved random forest classification algorithm is then used to train on, learn from, and predict on the image features. The identification result can give the physician diagnostic suggestions. Although a computer-aided result cannot serve as a diagnostic standard, it can be combined with the physician's personal experience to reach a more scientific diagnosis and thereby reduce the diagnostic error rate, which is of great medical value for the early diagnosis of liver cancer.
The random forest algorithm used by the invention is an ensemble learning method. Compared with single-classifier machine learning algorithms such as Bayesian classifiers, neural networks, decision trees, and support vector machines, it is less prone to overfitting. The learning ability of a single-classifier model is confined to the overall data sample: it can learn the characteristics of the whole data sample well, but strong generalization ability, that is, good predictive ability on unseen data samples, is not guaranteed. The random forest algorithm addresses this problem. By integrating multiple weak classifiers it overcomes the limited learning ability of a single classifier, and bagging gives each individual decision tree classifier a strong ability to learn part of the features. Each single classifier thus has strong learning ability for local features rather than for the features as a whole and is responsible for learning part of the features, so a random forest that combines multiple single classifiers has stronger learning ability. A single-classifier algorithm can be likened to one learning machine with strong all-round ability, whereas a random forest is like a combined learning machine made up of several experts. When data samples are limited, the random forest algorithm has a clear advantage, so the invention uses it as the classification learner for lesion identification.
The innovations of the invention are the introduction of the most-effective-feature selection method and the improvements of decision tree selection optimization and decision tree identification-ability balancing. The improved method achieves better identification accuracy for local lesions in plain CT images of the liver.
Brief description of the drawings
The specific embodiments of the invention are described in further detail below with reference to the drawings.
Fig. 1 shows the random forest lesion identification model;
Fig. 2 shows the feature importance ranking;
Fig. 3 shows the effect of feature selection on the OOB estimate;
Fig. 4 compares the classification performance of the improved random forest with that of the original algorithm;
Fig. 5 shows the operating procedure for liver CT image lesion identification.
Detailed description
The identification of local lesions in plain CT images of the liver according to the invention is carried out in the following two stages.
Stage 1: Build the lesion identification model based on the random forest algorithm
1. Build the image feature data set of local lesion regions in plain CT images of the liver:
The invention uses 3000 plain CT images of the liver annotated by a hospital in Hangzhou, each of size 512*512, covering several types including normal, liver cancer, hepatic hemangioma, and hepatic cyst. A rectangular box covering the annotated lesion region is extracted as the region of interest, which yields 3000 lesion block images.
For each lesion block, the gray-level histogram, gray-level co-occurrence matrix, and gray-gradient co-occurrence matrix of the lesion block image are computed.
For a lesion block image, the gray-level histogram matrix H is computed as follows:
$$H(i) = \frac{n_i}{N}, \quad i = 0, 1, \ldots, L-1$$
where N is the number of pixels in the lesion block image, L is the number of gray levels (256 in the invention), and $n_i$ is the number of pixels in the lesion block image whose gray level is i. The gray-level histogram H(i) is the proportion of pixels with a given gray level among all pixels in the image and provides a global description of the image. The H matrix has size 256*1.
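The histogram definition can be illustrated with a minimal sketch in plain Python. The function name `gray_histogram` and the tiny 2x3 block are hypothetical and serve only to show the ratio $n_i / N$; they are not taken from the patent.

```python
def gray_histogram(block, levels=256):
    """Return H with H[i] = n_i / N for a 2-D list of integer gray values."""
    counts = [0] * levels
    total = 0
    for row in block:
        for value in row:
            counts[value] += 1  # n_i: pixels with gray level i
            total += 1          # N: all pixels in the block
    return [c / total for c in counts]


# Hypothetical lesion block, used only for illustration.
block = [[0, 0, 1],
         [1, 1, 255]]
H = gray_histogram(block)
```

Because every pixel falls into exactly one gray level, the entries of H sum to 1, which is a quick sanity check on any real implementation.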
For a lesion block image, the gray-level co-occurrence matrix P is computed as follows:
$$P(I_1, I_2) = P_1(I_1, I_2) + P_2(I_1, I_2) + P_3(I_1, I_2) + P_4(I_1, I_2)$$
where $I_1$ and $I_2$ are pixel gray values and the number of gray levels L is 256. $P_1(I_1, I_2)$ is the ratio of the number of pixel pairs in the lesion block image that are at horizontal distance 1 and have gray values $I_1$ and $I_2$ to the total number of pixel pairs. $P_2(I_1, I_2)$, $P_3(I_1, I_2)$, and $P_4(I_1, I_2)$ are the corresponding ratios for pixel pairs at diagonal distance 1, vertical distance 1, and anti-diagonal distance 1, respectively. The four terms are computed as follows:
Horizontal:
Diagonal:
Vertical:
Anti-diagonal:
The gray-level co-occurrence matrix $P(I_1, I_2)$ then gives the ratio of the number of pixel pairs in the lesion block image that are at distance 1 and have gray values $I_1$ and $I_2$ to the total number of pixel pairs at distance 1. With 256 gray levels, the number of distinct gray-value pairs at distance 1 is 65536, and the P matrix has size 256*256.
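A minimal sketch of the four-direction co-occurrence count follows. It assumes ordered pairs, the row/column offsets listed in `DIRECTIONS`, and normalization by the total pair count over all four directions; none of these choices is confirmed by the text, and the result is stored sparsely as a dictionary instead of a 256*256 array.

```python
from collections import Counter

# (row offset, column offset) for the four distance-1 directions (assumed).
DIRECTIONS = {
    "horizontal": (0, 1),
    "diagonal": (1, 1),
    "vertical": (1, 0),
    "anti_diagonal": (1, -1),
}


def glcm(block):
    """Return {(I1, I2): ratio} over all distance-1 pairs in four directions."""
    rows, cols = len(block), len(block[0])
    counts = Counter()
    for dr, dc in DIRECTIONS.values():
        for r in range(rows):
            for c in range(cols):
                r2, c2 = r + dr, c + dc
                if 0 <= r2 < rows and 0 <= c2 < cols:
                    counts[(block[r][c], block[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical 2x2 block, used only for illustration.
block = [[0, 0],
         [1, 1]]
P = glcm(block)
```

In this toy block, six pixel pairs exist across the four directions, and four of them pair gray value 0 with gray value 1, so that entry dominates the matrix.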
For a lesion block image, the gray-gradient co-occurrence matrix T is computed as follows:
The number of gray levels $L_f$ is 256. For a pixel (i, j), the gradient is computed as:
$$g(i,j) = \left[ g_x^2 + g_y^2 \right]^{1/2}$$
$$g_x(i,j) = f(i+1,j-1) + 2f(i+1,j) + f(i+1,j+1) - f(i-1,j-1) - 2f(i-1,j) - f(i-1,j+1)$$
$$g_y(i,j) = f(i-1,j+1) + 2f(i,j+1) + f(i+1,j+1) - f(i-1,j-1) - 2f(i,j-1) - f(i+1,j-1)$$
The normalized gradient matrix G is:
$$G(i,j) = \mathrm{INT}\left( g(i,j) \times L_g / g_M \right) + 1$$
where $L_g$ is the number of gradient levels (32 in the invention), $g_M$ is the maximum gradient value in the image, and INT is the integer rounding operation. The gradient value of each pixel is thus defined by the gradient matrix G, and the size of the G matrix is 256*32.
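The gradient computation can be sketched as follows. The sketch writes both $g_x$ and $g_y$ in the standard Sobel form, leaves border pixels at zero, and implements INT as truncation; all three are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def sobel_gradient(f):
    """Gradient magnitude g(i, j) for interior pixels of a 2-D list f."""
    rows, cols = len(f), len(f[0])
    g = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            gx = (f[i + 1][j - 1] + 2 * f[i + 1][j] + f[i + 1][j + 1]
                  - f[i - 1][j - 1] - 2 * f[i - 1][j] - f[i - 1][j + 1])
            gy = (f[i - 1][j + 1] + 2 * f[i][j + 1] + f[i + 1][j + 1]
                  - f[i - 1][j - 1] - 2 * f[i][j - 1] - f[i + 1][j - 1])
            g[i][j] = math.sqrt(gx * gx + gy * gy)
    return g


def normalize_gradient(g, levels=32):
    """G(i, j) = INT(g(i, j) * levels / g_max) + 1, with INT as truncation."""
    g_max = max(max(row) for row in g)
    if g_max == 0:
        return [[1 for _ in row] for row in g]  # flat image: lowest level
    return [[int(v * levels / g_max) + 1 for v in row] for row in g]


# Hypothetical 3x3 block with a step edge in the last column.
f = [[0, 0, 10],
     [0, 0, 10],
     [0, 0, 10]]
g = sobel_gradient(f)
G = normalize_gradient(g)
```

Note that, taken literally, the normalization maps the pixel with the maximum gradient to level $L_g + 1$; a practical implementation may want to clamp that value.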
From the gray-level histogram matrix H, gray-level co-occurrence matrix P, and gray-gradient co-occurrence matrix T of the lesion block image obtained above, the gray-level texture features of the image are computed. The computation methods are given in the following three tables:
Table 1: Gray-level histogram feature measures
Table 2: Gray-level co-occurrence matrix feature measures
Table 3: Gray-gradient co-occurrence matrix feature measures
This yields the feature vector of the lesion block image:
[f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,f17,f18,f19,f20,f21,f22,f23,f24,f25,f26,label]
Here label is the lesion label: 1 denotes normal, 2 liver cancer, 3 hepatic hemangioma, 4 hepatic cyst, and 5 other lesion types. Feature vectors are extracted from all 3000 lesion block images to obtain the training and test data sets.
2. Select the most effective features
To select the most effective features, a basic random forest model must first be built. The 3000 feature vectors from step 1 are used as model training data, and the random forest is built as follows:
1) Set the number of CART decision trees in the random forest to 40, and repeat steps 2) and 3) 40 times to generate 40 CART decision tree classification models;
2) Draw a set of samples from the training data set by sampling with replacement, with the same sample size as the original training set;
3) Randomly select 10 of the 26 sample attributes, and build a classification decision tree on the subtree training data set drawn in 2) using the 10 selected attributes.
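The sampling part of these three steps can be sketched as below. The function `bootstrap_plan`, its parameters, and the fixed seed are hypothetical; the sketch only draws the bootstrap sample indices and the 10-attribute subset for each tree and does not train the trees themselves.

```python
import random


def bootstrap_plan(n_samples, n_features=26, n_trees=40, n_selected=10, seed=0):
    """For each tree, draw in-bag indices, out-of-bag indices, and attributes."""
    rng = random.Random(seed)
    plan = []
    for _ in range(n_trees):
        # Sampling with replacement, same size as the original training set.
        in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
        # Samples never drawn for this tree (its out-of-bag set).
        oob = sorted(set(range(n_samples)) - set(in_bag))
        # 10 of the 26 attributes, chosen once per tree as the text describes.
        attrs = sorted(rng.sample(range(n_features), n_selected))
        plan.append((in_bag, oob, attrs))
    return plan


# Hypothetical small training set of 50 samples.
plan = bootstrap_plan(n_samples=50)
```

Each entry of `plan` holds what one tree would be trained on, plus the indices left out of its bootstrap sample, which later serve as test data for that tree.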
CART decision trees use the Gini index as the splitting criterion. Suppose the node data set T has K classes and the probability that a sample belongs to class k is $p_k$. For the data set corresponding to node T, the Gini index is computed as:
$$Gini(T) = \sum_{k=1}^{K} p_k (1 - p_k) = 1 - \sum_{k=1}^{K} p_k^2$$
For a CART decision tree, if the training set T does not satisfy the condition "all samples in T belong to the same class, or only one sample remains in T", the node is a non-leaf node. Each attribute of the samples and each of its possible values is then tried as a binary split of the samples. Suppose a split divides T into A and B, where A holds a proportion p of the samples in T and B a proportion q (clearly p + q = 1). The impurity change is Gini(T) - p*Gini(A) - q*Gini(B). The purpose of trying every attribute value is to find the split that maximizes this impurity change; the subtrees produced by that split form the optimal branch. At each node, the splitting attribute and attribute value that maximize the impurity change on the node's data set are used to split the node, until all samples in a node belong to the same class. The decision tree model is obtained by recursively splitting nodes in this way. CART is an existing algorithm and is not described in further detail here.
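The Gini index and the impurity change of a binary split can be sketched in a few lines. The helper names and the single-attribute threshold search are illustrative assumptions; a full CART implementation would repeat the search over every attribute and recurse on the two child nodes.

```python
from collections import Counter


def gini(labels):
    """Gini(T) = 1 - sum_k p_k^2 for a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Gini(T) - p*Gini(A) - q*Gini(B) for a binary split of parent."""
    p = len(left) / len(parent)
    q = len(right) / len(parent)
    return gini(parent) - p * gini(left) - q * gini(right)


def best_threshold(values, labels):
    """Try 'value <= t' for each observed t and keep the largest decrease."""
    best = (None, -1.0)
    for t in sorted(set(values))[:-1]:  # the last value would leave B empty
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        gain = impurity_decrease(labels, left, right)
        if gain > best[1]:
            best = (t, gain)
    return best


# Hypothetical one-attribute data set with two classes.
values = [1, 2, 3, 4]
labels = [1, 1, 2, 2]
threshold, gain = best_threshold(values, labels)
```

Here the split at value 2 separates the two classes perfectly, so both children have Gini index 0 and the impurity change equals the parent's Gini index of 0.5.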
When a CART decision tree of the random forest is built, a sample set of the same size as the original feature data set is drawn from it at random with replacement. The samples that are not drawn are called out-of-bag data, OOB for short. OOB data can be used to test the random forest model and so measure its generalization error. This generalization error is called the OOB estimate and is the mean of the OOB generalization errors of the individual decision trees. The OOB data of the random forest algorithm also give the importance of each sample feature attribute. For an attribute X of the samples, importance is computed as follows:
1) For each decision tree in the random forest, test the decision tree model on its OOB data and compute the test error errOOB1;
2) Add random noise to attribute X of all samples in the OOB data, that is, randomly change the value of each sample at attribute X, and recompute the OOB generalization error errOOB2;
3) Let the number of decision trees in the random forest be k, with k = 40. The importance of attribute X is then:
$$importance_X = \sum_{i=1}^{k} \left( errOOB2_i - errOOB1_i \right) / k$$
This expression serves as the importance measure of attribute X: if adding random noise to attribute X causes the OOB error to rise sharply, attribute X has a large effect on the classification accuracy of the samples.
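The importance formula can be sketched with toy stand-ins for the trees. In this sketch each tree is a hypothetical triple of a prediction function and its own OOB samples and labels, and the noise step is implemented by shuffling the values of attribute X among the OOB samples, which is one common way to randomize an attribute and is an assumption here.

```python
import random


def error_rate(predict, samples, labels):
    """Fraction of samples that predict() labels incorrectly."""
    wrong = sum(1 for x, y in zip(samples, labels) if predict(x) != y)
    return wrong / len(samples)


def oob_importance(trees, feature, seed=0):
    """Mean of (errOOB2 - errOOB1) over trees for one attribute.

    trees: list of (predict, oob_samples, oob_labels) triples.
    """
    rng = random.Random(seed)
    total = 0.0
    for predict, samples, labels in trees:
        err1 = error_rate(predict, samples, labels)
        column = [x[feature] for x in samples]
        rng.shuffle(column)  # randomize attribute X across the OOB samples
        noisy = [x[:feature] + [v] + x[feature + 1:]
                 for x, v in zip(samples, column)]
        err2 = error_rate(predict, noisy, labels)
        total += err2 - err1
    return total / len(trees)


# Hypothetical stump that only looks at attribute 0.
def stump(x):
    return 1 if x[0] > 0.5 else 0


samples = [[0.1, 5.0], [0.9, 3.0], [0.2, 4.0], [0.8, 1.0]]
labels = [0, 1, 0, 1]
trees = [(stump, samples, labels)] * 3
importance = [oob_importance(trees, feature) for feature in range(2)]
```

Attribute 1 is ignored by the stump, so randomizing it cannot change any prediction and its importance is exactly zero; randomizing attribute 0 can only increase the error of this stump.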
During construction of the random forest, the importance of each of the 26 feature attributes is computed with the method above, giving the importance vector:
[importance1, importance2, ..., importance26]
Some of the measures among the image features based on the gray-level histogram, gray-level co-occurrence matrix, and gray-gradient co-occurrence matrix are ineffective for identifying local lesions in liver CT images, so feature selection can be used to improve the classification accuracy of the random forest algorithm. The random forest model provides the importance of each sample attribute. For the liver lesion region feature data, the 26 image features are numbered 1 to 26, and the resulting feature importance ranking is shown in Figure 2. Sorted by importance, the 26 features are:
f15,f22,f19,f9,f14,f2,f6,f5,f8,f4,f25,f3,f26,f1,f17,f21,f25,f7,f16, f11,f12,f10,f18,f20,f23,f13
The steps of most-effective-feature selection are as follows:
1) Build a random forest on the training sample set, compute the OOB estimate and the importance of each feature attribute, and sort the feature attributes by importance.
2) Remove the least important feature by deleting that attribute from the training sample set to obtain a new training sample set, then build a random forest on the new samples and compute the OOB estimate.
3) Repeat step 2), and find the feature attribute set of the random forest model with the smallest OOB estimate; this set is the most effective feature attribute combination.
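The elimination loop in these steps can be sketched independently of any particular random forest library. The function `select_features` takes a hypothetical `evaluate` callback that stands in for "build a random forest on these features and return its OOB estimate and per-feature importance"; the toy callback below is invented purely so that the loop can run.

```python
def select_features(features, evaluate):
    """Backward elimination: drop the least important feature each round.

    evaluate(features) must return (oob_error, {feature: importance}).
    """
    current = list(features)
    best_error, best_set = float("inf"), list(current)
    while current:
        error, importance = evaluate(current)
        if error < best_error:
            best_error, best_set = error, list(current)
        if len(current) == 1:
            break
        weakest = min(current, key=lambda name: importance[name])
        current.remove(weakest)
    return best_set, best_error


# Toy stand-in for training a forest: f1 and f2 are useful, the rest is noise.
USEFUL = {"f1", "f2"}


def toy_evaluate(features):
    noise = sum(1 for name in features if name not in USEFUL)
    missing = sum(1 for name in USEFUL if name not in features)
    error = 0.1 * noise + 0.3 * missing
    importance = {name: (1.0 if name in USEFUL else 0.0) for name in features}
    return error, importance


best_set, best_error = select_features(["f1", "f2", "f3", "f4"], toy_evaluate)
```

The loop keeps the feature set with the lowest error seen at any round, so it matches the idea of picking the point where the OOB curve is lowest instead of stopping at the first increase.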
With this selection method, the trend of the mean OOB estimate error as the number of selected features changes is shown in Figure 3. Figure 3 shows that the mean OOB estimate error is smallest when the number of features is 19, so the first 19 of the 26 features sorted by importance are selected as the most effective feature combination. The feature attributes are as follows:
f15,f22,f19,f9,f14,f2,f6,f5,f8,f4,f25,f3,f26,f1,f17,f21,f25,f7,f16
These 19 features are the mean, variance, skewness, kurtosis, energy, and entropy features based on the gray-level histogram; the angular second moment, contrast, and inverse difference moment features based on the gray-level co-occurrence matrix; and the gray-level distribution non-uniformity, gradient distribution non-uniformity, energy, gray-level mean, gray-level mean square deviation, correlation, gray-level entropy, mixed entropy, inertia, and inverse difference moment features based on the gray-gradient co-occurrence matrix.
3. Build and improve the random forest lesion identification model.
Using the most effective feature attribute combination obtained in step 2, the most effective feature data are filtered from every record of the original feature data set to form a new feature data set. A random forest model is built on the new data set, and decision tree selection optimization and identification-ability balancing are applied to it as follows:
1) From the 26 feature attributes of every sample in the feature data set, select the 19 most effective feature attributes obtained in step 2 and build a random forest model containing 400 decision trees, denoted h1(x), h2(x), ..., h400(x);
2) Test these 400 decision trees on the test data set. For the normal liver feature data in the test data set, sort the 400 decision trees by prediction error in ascending order. Do the same for the liver cancer feature data, the hepatic hemangioma feature data, and the hepatic cyst feature data in the test data set, giving one sorted list of the 400 trees per type.
3) Select the top-ranked decision trees from each of these sorted lists as the final random forest model. The selected trees may include repeats. A repeated tree has good identification accuracy for several liver lesion types and is exactly the kind of tree that should be selected; selecting it more than once is equivalent to increasing the weight of such high-performing trees, which gives the final random forest lesion identification model better identification performance.
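The per-class selection can be sketched as follows, assuming the per-class prediction errors of every tree have already been measured. The function name, the `per_class` parameter, and the small error table are hypothetical; repeats are deliberately kept, in line with the weighting effect described above.

```python
def select_balanced_trees(class_errors, per_class):
    """Pick the per_class best trees for each class, keeping repeats.

    class_errors[c][t] is the prediction error of tree t on samples of class c.
    Returns a list of tree indices that forms the final forest.
    """
    selected = []
    for cls in sorted(class_errors):
        errors = class_errors[cls]
        # Tree indices sorted by ascending error on this class.
        ranked = sorted(range(len(errors)), key=lambda t: errors[t])
        selected.extend(ranked[:per_class])
    return selected


# Hypothetical errors of 3 trees on 2 classes (1 = normal, 2 = liver cancer).
class_errors = {
    1: [0.1, 0.3, 0.2],
    2: [0.4, 0.1, 0.5],
}
selected = select_balanced_trees(class_errors, per_class=2)
```

Tree 0 appears twice in the result because it ranks in the top two for both classes, so it casts two votes in the final forest.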
Testing the improved random forest model on the test set shows that the prediction error is reduced to some extent. Figure 4 compares how the prediction error changes as the number of decision trees increases: the upper curve, marked with asterisks, is the prediction error of the original random forest algorithm as a function of the number of decision trees, and the lower curve, marked with crosses, is that of the model after the new decision tree selection optimization.
Stage 2: Lesion identification in plain CT images of the liver
For a given plain CT image of the liver, the suspected lesion region to be identified must first be outlined. A drawing tool is used to draw a rectangular box around the suspected lesion on the CT image, and the boxed region is taken as the image of the suspected lesion region.
1) Using the most effective feature attributes obtained in Stage 1, extract the 19 effective feature attributes of the suspected lesion region of the liver CT image as the feature vector;
2) Using the random forest lesion identification model obtained in Stage 1, have each decision tree predict the lesion type from the feature vector extracted in step 1), tally the prediction votes of all decision trees, and take the lesion type with the most votes as the final lesion identification result.
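The voting step can be sketched with stand-in trees. Each tree here is a hypothetical callable that maps a feature vector to a lesion label, and the label codes follow the label definition given earlier in the description (1 normal, 2 liver cancer, 3 hepatic hemangioma, 4 hepatic cyst, 5 other).

```python
from collections import Counter

LESION_NAMES = {
    1: "normal",
    2: "liver cancer",
    3: "hepatic hemangioma",
    4: "hepatic cyst",
    5: "other",
}


def predict_lesion(trees, features):
    """Majority vote over the predictions of all decision trees."""
    votes = Counter(tree(features) for tree in trees)
    label, _ = votes.most_common(1)[0]
    return label, votes


# Hypothetical trees that always vote for a fixed label.
trees = [lambda x: 2, lambda x: 2, lambda x: 4]
features = [0.0] * 19  # placeholder 19-element feature vector
label, votes = predict_lesion(trees, features)
name = LESION_NAMES[label]
```

If the final forest contains repeated trees, listing a tree more than once in `trees` gives it correspondingly more votes.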
The specific lesion identification procedure is shown in Figure 5.
Finally, it should be noted that the above are only several specific embodiments of the invention. The invention is obviously not limited to the above embodiments, and many variations are possible. All variations that a person of ordinary skill in the art can directly derive or conceive from the disclosure of the invention shall be considered within the scope of protection of the invention.

Claims (2)

1. liver plain CT image pathological changes recognition methods based on random forests algorithm, is characterized in that including herein below:
The gradation of image textural characteristics extracting liver plain CT image lesion region represents as image feature vector, then uses Random forests algorithm carries out feature selection to CT image for liver lesion region image feature vector, selects maximally effective feature group Close, then most effective characteristic data set be trained and learn, and the decision tree of random forest is identified ability equilibrium Optimize, obtain final pathological changes identification model.
Liver plain CT image pathological changes recognition methods based on random forests algorithm the most according to claim 1, its feature It is: pathological changes identification model is set up, and comprises the steps:
1), CT image for liver lesion region characteristic data set is set up:
Including:
A. grey level histogram feature extraction;
B. gray level co-occurrence matrixes feature extraction;
C. Gray level-gradient co-occurrence matrix feature extraction;
Combine the characteristics of image of above-mentioned three types, and combine the lesion type label of image as characteristic vector data collection D;
2), most effective feature is selected:
Use random forests algorithm to provide the importance degree of each feature in characteristic vector, rejected by loop iteration the heaviest Want feature, and the residue character after rejecting feature is set up new Random Forest model, find out the model pair that extensive error is minimum The feature combination answered, is the combination of maximally effective feature;
3) Establishing and improving the random forest lesion recognition model:
Using the most effective feature combination obtained in step 2), the most effective feature data set is filtered out of the initial feature data set; a random forest model is built with the most effective feature data set, and the decision trees in the random forest model are optimized and balanced to obtain the final random forest lesion recognition model.
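
Steps 1) and 2) of claim 2 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The function names (`histogram_features`, `glcm_features`, `build_dataset`, `select_features`), the particular statistics computed, the 16-level quantization, the single pixel offset, and the use of the out-of-bag error of scikit-learn's `RandomForestClassifier` as the estimate of generalization error are all assumptions made for illustration. Lesion regions are assumed to be already cropped and windowed to 8-bit gray levels. The gray-gradient co-occurrence features and the tree-level optimization of step 3) are not shown, since the claims do not specify their procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def histogram_features(roi, levels=256):
    """First-order statistics of the gray-level histogram of an 8-bit ROI."""
    hist = np.bincount(roi.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()                      # gray-level probabilities
    g = np.arange(levels)
    mean = (g * p).sum()
    var = ((g - mean) ** 2 * p).sum()
    energy = (p ** 2).sum()
    nz = p[p > 0]
    entropy = -(nz * np.log2(nz)).sum()
    return np.array([mean, var, energy, entropy])


def glcm_features(roi, levels=16, offset=(0, 1)):
    """Statistics of a symmetric, normalized gray-level co-occurrence matrix.

    `offset` is a (row, column) displacement with non-negative entries
    (an assumption of this sketch); the ROI must be larger than the offset.
    """
    q = (roi.astype(np.int64) * levels) // 256  # quantize 0..255 into bins
    dr, dc = offset
    rows, cols = q.shape
    a = q[:rows - dr, :cols - dc].ravel()       # reference pixels
    b = q[dr:, dc:].ravel()                     # neighbors at the offset
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (a, b), 1.0)                # count co-occurring pairs
    glcm = glcm + glcm.T                        # make the matrix symmetric
    glcm /= glcm.sum()
    i, j = np.indices(glcm.shape)
    contrast = ((i - j) ** 2 * glcm).sum()
    energy = (glcm ** 2).sum()
    homogeneity = (glcm / (1.0 + np.abs(i - j))).sum()
    nz = glcm[glcm > 0]
    entropy = -(nz * np.log2(nz)).sum()
    return np.array([contrast, energy, homogeneity, entropy])


def build_dataset(rois, labels):
    """Combine the feature types per ROI and pair them with lesion labels."""
    X = np.array([np.concatenate([histogram_features(r), glcm_features(r)])
                  for r in rois])
    return X, np.asarray(labels)


def select_features(X, y, n_trees=200, seed=0):
    """Drop the least important feature each round and keep the subset whose
    forest has the lowest out-of-bag error (used here as a stand-in for the
    generalization error named in the claim)."""
    remaining = list(range(X.shape[1]))
    best_subset, best_error = list(remaining), 1.0
    while remaining:
        rf = RandomForestClassifier(n_estimators=n_trees, oob_score=True,
                                    random_state=seed)
        rf.fit(X[:, remaining], y)
        error = 1.0 - rf.oob_score_
        if error <= best_error:                 # ties favor fewer features
            best_subset, best_error = list(remaining), error
        if len(remaining) == 1:
            break
        # importances are aligned with the columns of X[:, remaining]
        remaining.pop(int(np.argmin(rf.feature_importances_)))
    return best_subset, best_error
```

Under these assumptions, `build_dataset(rois, labels)` yields the data set D, and `select_features(X, y)` returns the column indices of the retained feature combination together with its out-of-bag error; a final forest would then be trained on `X[:, best_subset]`.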
CN201610231280.6A 2016-04-14 2016-04-14 Pathology identification method for routine scan CT image of liver based on random forests Pending CN105931224A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610231280.6A CN105931224A (en) 2016-04-14 2016-04-14 Pathology identification method for routine scan CT image of liver based on random forests

Publications (1)

Publication Number Publication Date
CN105931224A true CN105931224A (en) 2016-09-07

Family

ID=56838951

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610231280.6A Pending CN105931224A (en) 2016-04-14 2016-04-14 Pathology identification method for routine scan CT image of liver based on random forests

Country Status (1)

Country Link
CN (1) CN105931224A (en)

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897821A (en) * 2017-01-24 2017-06-27 中国电力科学研究院 A kind of transient state assesses feature selection approach and device
CN107480702A (en) * 2017-07-20 2017-12-15 东北大学 Towards the feature selecting and Feature fusion of the identification of HCC pathological images
CN107632995A (en) * 2017-03-13 2018-01-26 平安科技(深圳)有限公司 The method and model training control system of Random Forest model training
CN107845098A (en) * 2017-11-14 2018-03-27 南京理工大学 Liver cancer image full-automatic partition method based on random forest and fuzzy clustering
CN108090507A (en) * 2017-10-19 2018-05-29 电子科技大学 A kind of medical imaging textural characteristics processing method based on integrated approach
WO2018098697A1 (en) * 2016-11-30 2018-06-07 中国科学院深圳先进技术研究院 Image feature repeatability measurement method and device
CN108320026A (en) * 2017-05-16 2018-07-24 腾讯科技(深圳)有限公司 Machine learning model training method and device
CN108335282A (en) * 2017-01-20 2018-07-27 浙江京新术派医疗科技有限公司 The cut surface generation method and generating means of operation on liver
CN108764355A (en) * 2018-05-31 2018-11-06 清华大学 Image processing apparatus and method based on textural characteristics classification
CN108805858A (en) * 2018-04-10 2018-11-13 燕山大学 Hepatopathy CT image computers assistant diagnosis system based on data mining and method
CN108961207A (en) * 2018-05-02 2018-12-07 上海大学 Lymph node Malignant and benign lesions aided diagnosis method based on multi-modal ultrasound image
CN109003659A (en) * 2017-06-07 2018-12-14 万香波 Stomach Helicobacter pylori infects pathological diagnosis and supports system and method
CN109117890A (en) * 2018-08-24 2019-01-01 腾讯科技(深圳)有限公司 A kind of image classification method, device and storage medium
CN109344907A (en) * 2018-10-30 2019-02-15 顾海艳 Based on the method for discrimination for improving judgment criteria sorting algorithm
CN109493886A (en) * 2018-12-13 2019-03-19 西安电子科技大学 Speech-emotion recognition method based on feature selecting and optimization
CN109840535A (en) * 2017-11-29 2019-06-04 北京京东尚科信息技术有限公司 The method and apparatus for realizing classification of landform
CN109858562A (en) * 2019-02-21 2019-06-07 腾讯科技(深圳)有限公司 A kind of classification method of medical image, device and storage medium
CN109934179A (en) * 2019-03-18 2019-06-25 中南大学 Human motion recognition method based on automated characterization selection and Ensemble Learning Algorithms
CN109965829A (en) * 2019-03-06 2019-07-05 重庆金山医疗器械有限公司 Imaging optimization method, image processing apparatus, imaging device and endoscopic system
CN110378875A (en) * 2019-06-18 2019-10-25 中国科学院苏州生物医学工程技术研究所 Internal lithangiuria ingredient discrimination method based on machine learning algorithm
CN110619633A (en) * 2019-09-10 2019-12-27 武汉科技大学 Liver image segmentation method based on multi-path filtering strategy
CN111291896A (en) * 2020-02-03 2020-06-16 深圳前海微众银行股份有限公司 Interactive random forest subtree screening method, device, equipment and readable medium
CN111445946A (en) * 2020-03-26 2020-07-24 北京易康医疗科技有限公司 Calculation method for calculating lung cancer genotyping by using PET/CT (positron emission tomography/computed tomography) images
WO2020199692A1 (en) * 2019-04-04 2020-10-08 中国科学院深圳先进技术研究院 Method and apparatus for screening predictive image features for cancer metastasis, and storage medium
WO2020233259A1 (en) * 2019-07-12 2020-11-26 之江实验室 Multi-center mode random forest algorithm-based feature importance sorting system
CN112633317A (en) * 2020-11-02 2021-04-09 国能信控互联技术有限公司 CNN-LSTM fan fault prediction method and system based on attention mechanism
CN112883962A (en) * 2021-01-29 2021-06-01 北京百度网讯科技有限公司 Fundus image recognition method, device, apparatus, storage medium, and program product
CN113593707A (en) * 2021-09-29 2021-11-02 武汉楚精灵医疗科技有限公司 Stomach early cancer model training method and device, computer equipment and storage medium
WO2021259003A1 (en) * 2020-06-23 2021-12-30 平安科技(深圳)有限公司 Feature recognition method and apparatus, and computer device and storage medium
CN116309593A (en) * 2023-05-23 2023-06-23 天津市中西医结合医院(天津市南开医院) Liver puncture biopsy B ultrasonic image processing method and system based on mathematical model

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100158332A1 (en) * 2008-12-22 2010-06-24 Dan Rico Method and system of automated detection of lesions in medical images
CN104751160A (en) * 2015-03-12 2015-07-01 西安电子科技大学 Mammary gland image processing method based on sparse automatic coding depth network
CN104866862A (en) * 2015-04-27 2015-08-26 中南大学 Strip steel surface area type defect identification and classification method
CN105427325A (en) * 2015-12-07 2016-03-23 苏州大学 Automatic lung tumour segmentation method based on random forest and monotonically decreasing function

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
于殿泓: "《图像检测与处理技术》", 31 December 2006, 西安电子科技大学出版社 *
姜慧 等: "基于双树复数小波的肝脏疾病分类", 《电脑与信息技术》 *
胡峻峰: "基于机器视觉的实木地板分选技术研究", 《中国博士学位论文全文数据库》 *

Cited By (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018098697A1 (en) * 2016-11-30 2018-06-07 中国科学院深圳先进技术研究院 Image feature repeatability measurement method and device
CN108335282A (en) * 2017-01-20 2018-07-27 浙江京新术派医疗科技有限公司 The cut surface generation method and generating means of operation on liver
CN106897821B (en) * 2017-01-24 2023-07-21 中国电力科学研究院 Transient evaluation feature selection method and device
CN106897821A (en) * 2017-01-24 2017-06-27 中国电力科学研究院 A kind of transient state assesses feature selection approach and device
CN107632995B (en) * 2017-03-13 2018-09-11 平安科技(深圳)有限公司 The method and model training control system of Random Forest model training
CN107632995A (en) * 2017-03-13 2018-01-26 平安科技(深圳)有限公司 The method and model training control system of Random Forest model training
CN108320026B (en) * 2017-05-16 2022-02-11 腾讯科技(深圳)有限公司 Machine learning model training method and device
CN108320026A (en) * 2017-05-16 2018-07-24 腾讯科技(深圳)有限公司 Machine learning model training method and device
CN109003659A (en) * 2017-06-07 2018-12-14 万香波 Stomach Helicobacter pylori infects pathological diagnosis and supports system and method
CN107480702B (en) * 2017-07-20 2021-03-30 东北大学 Feature selection and feature fusion method for HCC pathological image recognition
CN107480702A (en) * 2017-07-20 2017-12-15 东北大学 Towards the feature selecting and Feature fusion of the identification of HCC pathological images
CN108090507A (en) * 2017-10-19 2018-05-29 电子科技大学 A kind of medical imaging textural characteristics processing method based on integrated approach
CN107845098A (en) * 2017-11-14 2018-03-27 南京理工大学 Liver cancer image full-automatic partition method based on random forest and fuzzy clustering
CN109840535A (en) * 2017-11-29 2019-06-04 北京京东尚科信息技术有限公司 The method and apparatus for realizing classification of landform
CN108805858A (en) * 2018-04-10 2018-11-13 燕山大学 Hepatopathy CT image computers assistant diagnosis system based on data mining and method
CN108961207A (en) * 2018-05-02 2018-12-07 上海大学 Lymph node Malignant and benign lesions aided diagnosis method based on multi-modal ultrasound image
CN108961207B (en) * 2018-05-02 2022-11-04 上海大学 Auxiliary diagnosis method for benign and malignant lymph node lesion based on multi-modal ultrasound images
CN108764355A (en) * 2018-05-31 2018-11-06 清华大学 Image processing apparatus and method based on textural characteristics classification
CN109117890A (en) * 2018-08-24 2019-01-01 腾讯科技(深圳)有限公司 A kind of image classification method, device and storage medium
CN109344907A (en) * 2018-10-30 2019-02-15 顾海艳 Based on the method for discrimination for improving judgment criteria sorting algorithm
CN109493886A (en) * 2018-12-13 2019-03-19 西安电子科技大学 Speech-emotion recognition method based on feature selecting and optimization
CN109858562A (en) * 2019-02-21 2019-06-07 腾讯科技(深圳)有限公司 A kind of classification method of medical image, device and storage medium
CN109965829A (en) * 2019-03-06 2019-07-05 重庆金山医疗器械有限公司 Imaging optimization method, image processing apparatus, imaging device and endoscopic system
CN109965829B (en) * 2019-03-06 2022-05-06 重庆金山医疗技术研究院有限公司 Imaging optimization method, image processing apparatus, imaging apparatus, and endoscope system
CN109934179A (en) * 2019-03-18 2019-06-25 中南大学 Human motion recognition method based on automated characterization selection and Ensemble Learning Algorithms
WO2020199692A1 (en) * 2019-04-04 2020-10-08 中国科学院深圳先进技术研究院 Method and apparatus for screening predictive image features for cancer metastasis, and storage medium
CN110378875A (en) * 2019-06-18 2019-10-25 中国科学院苏州生物医学工程技术研究所 Internal lithangiuria ingredient discrimination method based on machine learning algorithm
WO2020233259A1 (en) * 2019-07-12 2020-11-26 之江实验室 Multi-center mode random forest algorithm-based feature importance sorting system
CN110619633A (en) * 2019-09-10 2019-12-27 武汉科技大学 Liver image segmentation method based on multi-path filtering strategy
CN111291896A (en) * 2020-02-03 2020-06-16 深圳前海微众银行股份有限公司 Interactive random forest subtree screening method, device, equipment and readable medium
CN111445946B (en) * 2020-03-26 2021-07-30 山东省肿瘤防治研究院(山东省肿瘤医院) Calculation method for calculating lung cancer genotyping by using PET/CT (positron emission tomography/computed tomography) images
CN111445946A (en) * 2020-03-26 2020-07-24 北京易康医疗科技有限公司 Calculation method for calculating lung cancer genotyping by using PET/CT (positron emission tomography/computed tomography) images
WO2021259003A1 (en) * 2020-06-23 2021-12-30 平安科技(深圳)有限公司 Feature recognition method and apparatus, and computer device and storage medium
CN112633317A (en) * 2020-11-02 2021-04-09 国能信控互联技术有限公司 CNN-LSTM fan fault prediction method and system based on attention mechanism
CN112883962B (en) * 2021-01-29 2023-07-18 北京百度网讯科技有限公司 Fundus image recognition method, fundus image recognition apparatus, fundus image recognition device, fundus image recognition program, and fundus image recognition program
CN112883962A (en) * 2021-01-29 2021-06-01 北京百度网讯科技有限公司 Fundus image recognition method, device, apparatus, storage medium, and program product
CN113593707A (en) * 2021-09-29 2021-11-02 武汉楚精灵医疗科技有限公司 Stomach early cancer model training method and device, computer equipment and storage medium
CN113593707B (en) * 2021-09-29 2021-12-14 武汉楚精灵医疗科技有限公司 Stomach early cancer model training method and device, computer equipment and storage medium
CN116309593A (en) * 2023-05-23 2023-06-23 天津市中西医结合医院(天津市南开医院) Liver puncture biopsy B ultrasonic image processing method and system based on mathematical model
CN116309593B (en) * 2023-05-23 2023-09-12 天津市中西医结合医院(天津市南开医院) Liver puncture biopsy B ultrasonic image processing method and system based on mathematical model

Similar Documents

Publication Publication Date Title
CN105931224A (en) Pathology identification method for routine scan CT image of liver based on random forests
CN109493308B (en) Medical image synthesis and classification method for generating confrontation network based on condition multi-discrimination
CN109145921A (en) A kind of image partition method based on improved intuitionistic fuzzy C mean cluster
CN110853011B (en) Method for constructing convolutional neural network model for pulmonary nodule detection
CN106529448A (en) Method for performing multi-visual-angle face detection by means of integral channel features
JP2008077677A (en) Measurement of mitotic activity
CN105913086A (en) Computer-aided mammary gland diagnosing method by means of characteristic weight adaptive selection
CN104282008B (en) The method and apparatus that Texture Segmentation is carried out to image
CN101081168A (en) Method for shielding sex part on foetus image for preventing recognizing foetus sex
CN109978880A (en) Lung tumors CT image is carried out sentencing method for distinguishing using high dimensional feature selection
CN106326834A (en) Human body gender automatic identification method and apparatus
CN106127735A (en) A kind of facilities vegetable edge clear class blade face scab dividing method and device
Beevi et al. Detection of mitotic nuclei in breast histopathology images using localized ACM and Random Kitchen Sink based classifier
CN109671060A (en) Area of computer aided breast lump detection method based on selective search and CNN
CN113191359B (en) Small sample target detection method and system based on support and query samples
Giusti et al. A comparison of algorithms and humans for mitosis detection
CN103279960B (en) A kind of image partition method of human body cache based on X-ray backscatter images
Hassanien et al. Detection of spiculated masses in Mammograms based on fuzzy image processing
CN110348320A (en) A kind of face method for anti-counterfeit based on the fusion of more Damage degrees
WO2022141201A1 (en) Breast cancer grading method based on dce-mri
Nasir et al. Detection of acute leukaemia cells using variety of features and neural networks
Cao et al. 3D convolutional neural networks fusion model for lung nodule detection onclinical CT scans
Fan et al. Automated blood vessel segmentation in fundus image based on integral channel features and random forests
CN107545565A (en) A kind of solar energy half tone detection method
CN103902997B (en) Feature subspace integration method for biological cell microscope image classification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160907