CN106407992A - Method and system for self-learning extraction of breast ultrasound image features based on a stacked denoising autoencoder

Info

Publication number: CN106407992A
Application number: CN201610834295.1A
Authority: CN
Inventors: 陈壮威
Original assignee: Fujian Maternal And Child Care Service Centre
Current assignee: Fujian Maternal And Child Care Service Centre
Priority date: 2016-09-20
Filing date: 2016-09-20
Publication date: 2017-02-15
Anticipated expiration: 2036-09-20
Also published as: CN106407992B

Abstract

The invention discloses a method and system for self-learning extraction of breast ultrasound image features based on a stacked denoising autoencoder (SDAE). The method comprises: extracting hand-crafted shallow features from each breast ultrasound lesion region image (ROI) as one training sample, forming a training sample set set_unlabeled = {x⁽¹⁾, x⁽²⁾, …, x⁽ⁿ⁾}, where the i-th sample x⁽ⁱ⁾ ∈ [0,1]^d, i = 1, 2, …, n; training a first denoising autoencoder DAE1 on the training sample set; after DAE1 has been trained, inputting the training sample set again and using the autoencoder trained in step S4 to extract the feature representation learned by the hidden layer for every sample, forming new samples {y⁽¹⁾, y⁽²⁾, …, y⁽ⁿ⁾}; and using the new samples as the input of a second denoising autoencoder DAE2 in order to train it. The invention extracts features from breast ultrasound images, thereby providing a valuable reference for clinical diagnosis and improving the accuracy and efficiency of breast cancer diagnosis.

Description

Method and system for self-learning extraction of breast ultrasound image features based on a stacked denoising autoencoder

Technical field

The present invention relates to the technical field of feature engineering, and more particularly to a method and system for self-learning extraction of breast ultrasound image features based on a stacked denoising autoencoder.

Background art

Breast cancer is the most common malignant tumor among women worldwide, and about 400,000 people die of it every year. China is one of the countries where the incidence of breast cancer is rising fastest; in recent years breast cancer has become the malignant tumor with the highest incidence among Chinese women. Early-stage breast cancer responds well to treatment, which can save the patient's life to a great extent, so improving the precision and accuracy of early diagnosis of breast cancer is increasingly important.

At present, clinical diagnosis of breast cancer relies mainly on imaging examinations such as breast ultrasound and mammography (molybdenum target), in which the diagnostician analyzes the image through features such as masses, calcification and blood flow signals. Breast ultrasonography is widely used in clinical practice in China; it is easy to operate, radiation-free and non-invasive, locates masses accurately, and is economical. Ultrasound examination nevertheless has many shortcomings. For example, images of early-stage breast cancer are often atypical and hard to distinguish. In particular, differences in diagnosticians' visual perception, visual fatigue, the use of different features and diagnostic criteria, and the lack of quantitative measurement of image features lead to differing diagnostic results, so that misdiagnosis and missed diagnosis of early-stage breast cancer still occur frequently.

With the continuing development of medical imaging technology and computer technology, it has become possible to improve diagnostic accuracy through computer-aided diagnosis. For example, digital image processing techniques are used to extract pathology-related features from breast ultrasound images, and machine learning methods such as SVM then classify and identify breast tumors according to these features.

Judging from the current state of computer-aided diagnosis of breast cancer, its accuracy depends largely on whether pathology-related features of the B-mode ultrasound image are extracted effectively. At present, medical image feature extraction for computer-aided diagnosis essentially consists of manually locating the lesion region of interest and then extracting basic conventional features by elementary image processing methods, such as gray-level histogram features, shape features, gray-level co-occurrence matrix features and wavelet features. This approach has the following shortcomings: first, extracting these basic conventional features one by one is time-consuming and laborious; second, each basic conventional feature is not domain-specific in itself and has little relevance to the specific application of breast cancer; third, designing a combination of basic conventional features that is effective for computer-aided diagnosis of breast cancer involves serious uncertainty. The essential reason for these limitations is that, after feature selection, the features are still low-level features of the medical image and, on the whole, have no direct mapping to the high-level pathological semantic features of the medical image. The best solution is therefore a method that automatically learns, from conventional breast cancer B-mode ultrasound images, image features that are related to pathology and can be used for auxiliary diagnosis.

Summary of the invention

It is therefore necessary to provide a scheme for self-learning extraction of breast ultrasound image features based on a stacked denoising autoencoder, which solves the problem of how to automatically learn, from conventional breast cancer B-mode ultrasound images, image features that are related to pathology and can be used for auxiliary diagnosis.

To achieve the above object, the inventor provides a method for self-learning extraction of breast ultrasound image features based on a stacked denoising autoencoder, comprising the following steps:

Step S1: A breast ultrasound lesion region image set of at least medium scale is given, where medium scale means that the image set contains at least 200 breast ultrasound diagnostic images;

Step S2: The lesion region image ROI of each breast ultrasound diagnostic image in the image set of step S1 is extracted manually;

Step S3: Hand-crafted shallow features are extracted from each breast ultrasound lesion region image ROI as one training sample, forming the training sample set set_unlabeled = {x⁽¹⁾, x⁽²⁾, …, x⁽ⁿ⁾}, where the i-th sample x⁽ⁱ⁾ ∈ [0,1]^d, i = 1, 2, …, n; d denotes the feature dimension of a sample and n the number of samples in the training set;

Step S4: The first denoising autoencoder DAE1 is trained on the training sample set;

Step S5: After DAE1 has been trained, the training sample set is input again, and the encoder trained in step S4 extracts the feature representation learned by the hidden layer for every sample, forming new samples {y⁽¹⁾, y⁽²⁾, …, y⁽ⁿ⁾}, which serve as the input for training the second denoising autoencoder DAE2;

Step S6: The two trained denoising autoencoders DAE1 and DAE2 are stacked to obtain a three-layer SDAE structure: the first layer is the input layer, of dimension d; the second layer is the hidden layer of DAE1, of dimension d_h1; the third layer is the hidden layer of DAE2, of dimension d_h2. Given the hand-crafted shallow features of a breast mammography image, feed-forward propagation through this SDAE structure yields the high-level abstract semantic feature representation based on the stacked denoising autoencoder.

Further, step S3 is: four kinds of hand-crafted shallow features, namely GLCM, wavelet, wavelet packet and MPEG-7 features, are extracted from each ROI image and concatenated into one d-dimensional feature vector that serves as one sample.

Further, in step S4 the first denoising autoencoder consists of a three-layer network whose input layer x, hidden layer y and output layer z have d, d_h1 and d neurons respectively. The input of the input layer is a sample x⁽ⁱ⁾ ∈ [0,1]^d, i = 1, 2, …, n, from the training sample set, and noise is deliberately introduced at the input layer. The parameters are θ₁ = {W₁, b₁} and θ₂ = {W₂, b₂}, where b₁ and b₂ are the bias vectors of the hidden layer and the output layer, of dimension d_h1 and d respectively, and W₁ and W₂ are the weight matrices connecting the input layer to the hidden layer and the hidden layer to the output layer, of size d × d_h1 and d_h1 × d respectively. All activation functions are sigmoid functions;

Step S4 comprises the following steps:

Step S41: Split the training sample set. Specifically, the training sample set of ultrasound images is randomly divided into num batches, each batch_i ∈ [0,1]^{batch_size×d}, i = 1, 2, …, num;

Step S42: Initialize the network parameters, specifically:

learningrate = 1;

W₁ = (rand(d, d_h1) − 0.5) × √(6 / (d + d_h1));

W₂ = (rand(d_h1, d) − 0.5) × √(6 / (d_h1 + d));

b₁ = 0, b₂ = 0;

where learningrate denotes the learning rate and rand(m, n) generates a random m × n matrix with entries in [0,1];

Step S43: Set the maximum number of iterations NN;

Step S44: Outer loop: for t = 1 to NN;

Inner loop: for s = 1 to num;

Step S441: Corrupt the data. Using binary masking noise, some values of the input feature vector x are randomly reset to 0 with a certain probability. Specifically:

batch_s = batch_s × (rand(batch_size, d) > threshold), where batch_s is a batch_size × d matrix and threshold is a preset threshold, here set to 0.2. If an element A_ij of the generated random matrix A = rand(batch_size, d) is less than threshold, the element at the corresponding position of batch_s is reset to 0. The i-th sample in batch_s is denoted x⁽ⁱ⁾, and x̃⁽ⁱ⁾ after corruption;

Step S442: Feed-forward propagation:

y⁽ⁱ⁾ = sigmoid(W₁^T x̃⁽ⁱ⁾ + b₁), i = 1, 2, …, batch_size;

z⁽ⁱ⁾ = sigmoid(W₂^T y⁽ⁱ⁾ + b₂), i = 1, 2, …, batch_size;

Step S443: Backpropagation;

Step S444: Update the parameters;

where the residuals for the i-th sample are taken at the j-th node of the output layer and of the hidden layer, respectively.
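
By way of illustration only, steps S42, S441 and S442 can be sketched in NumPy as follows. The sketch uses row vectors (one sample per row), so the products appear as X·W rather than Wᵀx. The weight initialization follows the formulas given in the claims. The hidden-layer expression is inferred from the stated matrix sizes, and the function names, layer sizes and random seed are assumptions made for the example.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_dae(d, d_h1, rng):
    """Step S42: uniform weights scaled by sqrt(6 / (d + d_h1)), zero biases."""
    scale = np.sqrt(6.0 / (d + d_h1))
    W1 = (rng.random((d, d_h1)) - 0.5) * scale
    W2 = (rng.random((d_h1, d)) - 0.5) * scale
    return W1, np.zeros(d_h1), W2, np.zeros(d)

def corrupt(batch, threshold, rng):
    """Step S441: binary masking noise. Entries whose random draw is not
    above `threshold` are reset to 0; the rest are left unchanged."""
    mask = rng.random(batch.shape) > threshold
    return batch * mask

def forward(batch_tilde, W1, b1, W2, b2):
    """Step S442 in row-vector form: Y = sigmoid(X W1 + b1), Z = sigmoid(Y W2 + b2)."""
    Y = sigmoid(batch_tilde @ W1 + b1)
    Z = sigmoid(Y @ W2 + b2)
    return Y, Z

rng = np.random.default_rng(0)
d, d_h1, batch_size = 16, 8, 32            # illustrative sizes only
W1, b1, W2, b2 = init_dae(d, d_h1, rng)
batch = rng.random((batch_size, d))        # stand-in for one batch_s
batch_tilde = corrupt(batch, threshold=0.2, rng=rng)
Y, Z = forward(batch_tilde, W1, b1, W2, b2)
```

With threshold = 0.2, roughly one input value in five is masked on average, while the reconstruction Z is still compared against the uncorrupted batch during training.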

The present invention also provides a system for self-learning extraction of breast ultrasound image features based on a stacked denoising autoencoder, comprising the following modules:

Image set providing module: provides a breast ultrasound lesion region image set of at least medium scale, where medium scale means that the image set contains at least 200 breast ultrasound diagnostic images;

Lesion region extraction module: manually extracts the lesion region image ROI of each breast ultrasound diagnostic image in the image set of step S1;

Sample training module: extracts hand-crafted shallow features from each breast ultrasound lesion region image ROI as one training sample, forming the training sample set set_unlabeled = {x⁽¹⁾, x⁽²⁾, …, x⁽ⁿ⁾}, where the i-th sample x⁽ⁱ⁾ ∈ [0,1]^d, i = 1, 2, …, n; d denotes the feature dimension of a sample and n the number of samples in the training set;

First encoder training module: trains the first denoising autoencoder DAE1 on the training sample set;

Second encoder training module: after DAE1 has been trained, inputs the training sample set again, extracts with the encoder trained in step S4 the feature representation learned by the hidden layer for every sample, forming new samples {y⁽¹⁾, y⁽²⁾, …, y⁽ⁿ⁾}, and uses them as input to train the second denoising autoencoder DAE2;

Semantic feature generation module: stacks the two trained denoising autoencoders DAE1 and DAE2 to obtain a three-layer SDAE structure, in which the first layer is the input layer, of dimension d; the second layer is the hidden layer of DAE1, of dimension d_h1; and the third layer is the hidden layer of DAE2, of dimension d_h2. Given the hand-crafted shallow features of a breast mammography image, feed-forward propagation through this SDAE structure yields the high-level abstract semantic feature representation based on the stacked denoising autoencoder.

Further, the sample training module is also used to extract four kinds of hand-crafted shallow features, namely GLCM, wavelet, wavelet packet and MPEG-7 features, from each ROI image and to concatenate them into one d-dimensional feature vector that serves as one sample.

Further, in the first encoder training module, the first denoising autoencoder consists of a three-layer network whose input layer x, hidden layer y and output layer z have d, d_h1 and d neurons respectively. The input of the input layer is a sample x⁽ⁱ⁾ ∈ [0,1]^d, i = 1, 2, …, n, from the training sample set, and noise is deliberately introduced at the input layer. The parameters are θ₁ = {W₁, b₁} and θ₂ = {W₂, b₂}, where b₁ and b₂ are the bias vectors of the hidden layer and the output layer, of dimension d_h1 and d respectively, and W₁ and W₂ are the weight matrices connecting the input layer to the hidden layer and the hidden layer to the output layer, of size d × d_h1 and d_h1 × d respectively. All activation functions are sigmoid functions;

The first encoder training module comprises the following units:

Sample splitting unit: splits the training sample set; specifically, the training sample set of ultrasound images is randomly divided into num batches, each batch_i ∈ [0,1]^{batch_size×d}, i = 1, 2, …, num;

Network parameter initialization unit: initializes the network parameters, specifically:

learningrate = 1;

W₁ = (rand(d, d_h1) − 0.5) × √(6 / (d + d_h1));

W₂ = (rand(d_h1, d) − 0.5) × √(6 / (d_h1 + d));

b₁ = 0, b₂ = 0;

Iteration count setting unit: sets the maximum number of iterations NN;

Loop setting unit: sets the outer loop, for t = 1 to NN,

and the inner loop, for s = 1 to num;

Data corruption unit: corrupts the data; using binary masking noise, some values of the input feature vector x are randomly reset to 0 with a certain probability; specifically, batch_s = batch_s × (rand(batch_size, d) > threshold), with threshold set to 0.2;

Feed-forward unit: performs feed-forward propagation:

z⁽ⁱ⁾ = sigmoid(W₂^T y⁽ⁱ⁾ + b₂), i = 1, 2, …, batch_size;

Backpropagation unit: performs backpropagation;

Parameter update unit: updates the parameters.
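
As a minimal sketch of the random division into num batches performed by the first unit listed above, the following NumPy code shuffles the samples and cuts them into batch_size × d blocks. The function name, the seed, the sample sizes, and the choice to drop samples that do not fill a whole batch are assumptions made for the example; the text does not say how a remainder is handled.

```python
import numpy as np

def split_into_batches(samples, batch_size, seed=0):
    """Shuffle the rows of `samples` and cut them into batches of `batch_size` rows.

    Trailing samples that do not fill a whole batch are dropped in this sketch
    (an assumption; the text does not specify how a remainder is treated).
    """
    samples = np.asarray(samples, dtype=float)
    rng = np.random.default_rng(seed)
    order = rng.permutation(samples.shape[0])
    num = samples.shape[0] // batch_size
    return [samples[order[k * batch_size:(k + 1) * batch_size]] for k in range(num)]

# Made-up training set: n = 200 samples, d = 12 features in [0, 1).
set_unlabeled = np.random.default_rng(1).random((200, 12))
batches = split_into_batches(set_unlabeled, batch_size=20)
```

Here num = 10 and each batch has shape (20, 12), matching the batch_size × d blocks over which the inner loop iterates.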

Unlike the prior art, the above technical solution trains two autoencoders on a medium-scale set of breast mammography lesion region images, builds the SDAE structure from the two autoencoders, and finally obtains semantic features. It thereby extracts breast ultrasound image features, provides a valuable "reference opinion" for clinical diagnosis, and improves the accuracy and efficiency of breast cancer diagnosis.

Brief description of the drawings

Fig. 1 shows the training process of a single denoising autoencoder for deep learning of breast ultrasound image features in an embodiment of the present invention;

Fig. 2 shows the training process of the stacked denoising autoencoder in an embodiment of the present invention.

Detailed description of the embodiments

To explain in detail the technical content, structural features, objects and effects of the technical solution, a detailed description is given below with reference to specific embodiments and the accompanying drawings.

Referring to Fig. 1 and Fig. 2, this embodiment provides a method for self-learning extraction of breast ultrasound image features based on a stacked denoising autoencoder, as follows:

Step S1: A breast ultrasound lesion region image set of at least medium scale is given, where medium scale means that the image set contains at least 200 breast mammography diagnostic images;

Step S2: The breast mammography lesion region image ROI (region of interest) of each breast ultrasound diagnostic image in the image set of step S1 is extracted manually; the size of the breast ultrasound lesion region image ROI is 150 × 150;

Step S3: Hand-crafted shallow features are extracted from each breast ultrasound lesion region image ROI as one training sample, forming the training sample set set_unlabeled = {x⁽¹⁾, x⁽²⁾, …, x⁽ⁿ⁾}, where the i-th sample x⁽ⁱ⁾ ∈ [0,1]^d, i = 1, 2, …, n. Here d denotes the feature dimension of a sample and n the number of samples in the training set.

Step S4: The first denoising autoencoder DAE1 is trained on the training sample set, where DAE stands for denoising autoencoder.

Step S5: After DAE1 has been trained, the set_unlabeled samples are input again as the training sample set, and the model DAE1 trained in step S4 extracts the feature representation learned by the hidden layer for every sample, forming new samples {y⁽¹⁾, y⁽²⁾, …, y⁽ⁿ⁾}, which serve as the input for training the second denoising autoencoder DAE2.

Step S6: The two trained denoising autoencoders (DAE1, DAE2) are stacked to obtain a three-layer SDAE (stacked denoising autoencoder) structure, as shown in Fig. 2. The first layer is the input layer, of dimension d; the second layer is the hidden layer of DAE1, of dimension d_h1; the third layer is the hidden layer of DAE2, of dimension d_h2. With this model, given the hand-crafted shallow features of a breast ultrasound image, feed-forward propagation yields the high-level abstract semantic feature representation based on the stacked denoising autoencoder. The semantic features obtained in this way realize the extraction of breast ultrasound image features, thereby providing a valuable "reference opinion" for clinical diagnosis and improving the accuracy and efficiency of breast cancer diagnosis.
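
The feed-forward pass through the stacked structure of step S6 can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: only the encoder halves of DAE1 and DAE2 are applied, no masking noise is added at extraction time, the function and parameter names are hypothetical, and randomly drawn weights stand in for trained ones.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sdae_features(X, W1_dae1, b1_dae1, W1_dae2, b1_dae2):
    """Feed hand-crafted features through the two stacked encoders.

    X: (n, d) array of features in [0, 1]; returns an (n, d_h2) array.
    Row-vector convention: each row of X is one sample.
    """
    h1 = sigmoid(X @ W1_dae1 + b1_dae1)       # layer 2: hidden layer of DAE1, size d_h1
    return sigmoid(h1 @ W1_dae2 + b1_dae2)    # layer 3: hidden layer of DAE2, size d_h2

rng = np.random.default_rng(0)
d, d_h1, d_h2 = 16, 8, 4                      # illustrative sizes only
X = rng.random((5, d))                        # five stand-in feature vectors
feats = sdae_features(
    X,
    rng.normal(size=(d, d_h1)), np.zeros(d_h1),      # stand-in for trained DAE1 encoder
    rng.normal(size=(d_h1, d_h2)), np.zeros(d_h2),   # stand-in for trained DAE2 encoder
)
```

Each row of feats is the d_h2-dimensional high-level representation of one ROI and could be passed to a downstream classifier.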

Further, step S3 is: four kinds of hand-crafted shallow features, namely GLCM (gray-level co-occurrence matrix), wavelet, wavelet packet and MPEG-7 (Moving Picture Experts Group) features, are extracted from each ROI image and concatenated into one d-dimensional feature vector that serves as one sample. Because the actual maximum and minimum of some feature attributes are unknown and outliers may be present, z-score normalization is applied first:

x' = (x − mean) / std

where x is an observation of a given feature dimension, mean is the mean of the observations of that dimension, std is their standard deviation, and x' is the result of z-score normalization of x. Because the neurons are treated as probabilities when the autoencoder is trained, Min-Max normalization to the interval [0,1] is then applied:

x'' = (x' − min) / (max − min)

where x' is an observation of a given feature dimension, min and max are the minimum and maximum of the observations of that dimension, and x'' is the result of Min-Max normalization of x'.
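
The two-stage normalization described above can be sketched per feature column as follows. The function name, the use of the population standard deviation, and the small eps guard against constant columns are assumptions added for the example.

```python
import numpy as np

def zscore_then_minmax(features, eps=1e-12):
    """Normalize each feature column: z-score first, then Min-Max to [0, 1].

    features: array of shape (n_samples, d).
    eps: hypothetical guard against division by zero for constant columns.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)                       # population standard deviation
    z = (features - mean) / np.maximum(std, eps)     # x' = (x - mean) / std
    z_min = z.min(axis=0)
    z_max = z.max(axis=0)
    return (z - z_min) / np.maximum(z_max - z_min, eps)   # x'' = (x' - min) / (max - min)

raw = np.array([[1.0, 200.0],
                [2.0, 400.0],
                [3.0, 1000.0]])
normalized = zscore_then_minmax(raw)
```

After this step every column lies in [0, 1], as required for samples x⁽ⁱ⁾ ∈ [0,1]^d.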

In step S3 the key point is the concatenation. In general, each of the four shallow features (GLCM, wavelet, wavelet packet, MPEG-7) captures only part of the physical characteristics of the image and is not comprehensive. To ensure that better high-level features can subsequently be learned from comprehensive physical features, the four different shallow features are concatenated as the basis for subsequent learning, so that the physical information of the ROI image is covered as fully as possible.
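
To make the concatenation concrete, the sketch below computes three simple statistics from a gray-level co-occurrence matrix and joins them with placeholder vectors for the other three descriptors. Everything specific here is an assumption for illustration: the single horizontal offset, the 8 gray levels, the three statistics chosen, the 0..255 gray range, and the placeholder lengths. The wavelet, wavelet packet and MPEG-7 extractors are not implemented.

```python
import numpy as np

def glcm_features(roi, levels=8):
    """Contrast, energy and homogeneity from a horizontal-neighbor GLCM."""
    roi = np.asarray(roi, dtype=float)
    # Quantize gray values (assumed to lie in 0..255) into `levels` bins.
    q = np.minimum((roi / 256.0 * levels).astype(int), levels - 1)
    glcm = np.zeros((levels, levels))
    # Count co-occurrences of each pixel with its right-hand neighbor.
    np.add.at(glcm, (q[:, :-1].ravel(), q[:, 1:].ravel()), 1)
    glcm /= glcm.sum()
    i, j = np.indices(glcm.shape)
    contrast = np.sum(glcm * (i - j) ** 2)
    energy = np.sum(glcm ** 2)
    homogeneity = np.sum(glcm / (1.0 + np.abs(i - j)))
    return np.array([contrast, energy, homogeneity])

def cascade(*descriptors):
    """Concatenate per-ROI descriptor vectors into one d-dimensional sample."""
    return np.concatenate([np.ravel(v) for v in descriptors])

rng = np.random.default_rng(0)
roi = rng.integers(0, 256, size=(150, 150))   # stand-in for a 150 x 150 ROI
wavelet = np.zeros(4)          # placeholder for wavelet features
wavelet_packet = np.zeros(6)   # placeholder for wavelet packet features
mpeg7 = np.zeros(5)            # placeholder for MPEG-7 features
sample = cascade(glcm_features(roi), wavelet, wavelet_packet, mpeg7)
```

In this toy setting d = 3 + 4 + 6 + 5 = 18; in practice d is whatever the four real extractors produce together.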

Further, step S4 is as follows. As shown in Fig. 1, the whole denoising autoencoder consists of a three-layer network whose input layer x, hidden layer y and output layer z have d, d_h1 and d neurons respectively. The input of the input layer is a sample x⁽ⁱ⁾ ∈ [0,1]^d, i = 1, 2, …, n, from the training set set_unlabeled, and noise is deliberately introduced at the input layer. The parameters are θ₁ = {W₁, b₁} and θ₂ = {W₂, b₂}, where b₁ and b₂ are the bias vectors of the hidden layer and the output layer, of dimension d_h1 and d respectively, and W₁ and W₂ are the weight matrices connecting the input layer to the hidden layer and the hidden layer to the output layer, of size d × d_h1 and d_h1 × d respectively. All activation functions are sigmoid functions.

Specifically, the following steps are included:

Step S41: Split the training sample set. Specifically, the training sample set set_unlabeled of ultrasound images is randomly divided into num batches (blocks), each batch_i ∈ [0,1]^{batch_size×d}, i = 1, 2, …, num;

Step S42: Initialize the network parameters, specifically:

learningrate = 1;

W₁ = (rand(d, d_h1) − 0.5) × √(6 / (d + d_h1));

W₂ = (rand(d_h1, d) − 0.5) × √(6 / (d_h1 + d));

b₁ = 0, b₂ = 0;

Step S43: Set the maximum number of iterations NN;

Step S44: Outer loop: for t = 1 to NN;

Inner loop: for s = 1 to num;

Step S441: Corrupt the data. Using binary masking noise, some values of the input feature vector x are randomly reset to 0 with a certain probability: batch_s = batch_s × (rand(batch_size, d) > threshold), with threshold set to 0.2. The i-th sample in batch_s is denoted x⁽ⁱ⁾, and x̃⁽ⁱ⁾ after corruption;

Step S442: Feed-forward propagation:

y⁽ⁱ⁾ = sigmoid(W₁^T x̃⁽ⁱ⁾ + b₁), i = 1, 2, …, batch_size;

z⁽ⁱ⁾ = sigmoid(W₂^T y⁽ⁱ⁾ + b₂), i = 1, 2, …, batch_size;

Step S443: Backpropagation;

Step S444: Update the parameters;

where the residuals for the i-th sample are taken at the j-th node of the output layer and of the hidden layer, respectively.

The advantage of this embodiment is as follows. The traditional shallow features (GLCM, wavelet, wavelet packet, MPEG-7) are in fact conventional physical features of an image and are not directly related to the pathological features needed when an ultrasound image, as a medical image, is used for auxiliary diagnosis; using them to describe the pathology of an ultrasound image is therefore unreliable. The features learned by the learner are higher-level than physical features, closer to the semantic features of the image, and more strongly related to the pathology of the ultrasound image, so they are better suited to describing the pathology of an ultrasound image.
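
The training loop of step S4 can be sketched end to end as below. The backpropagation and update equations are not reproduced in this text, so the sketch assumes a squared-error reconstruction loss averaged over the batch and plain gradient descent; these choices, the function names, and the synthetic data are assumptions for illustration. The initialization follows the formulas given in the claims, and rows are samples.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_dae(samples, d_h, batch_size=20, NN=50, learningrate=1.0,
              threshold=0.2, seed=0):
    """Train one denoising autoencoder; returns (W1, b1, W2, b2).

    Assumed, not taken from the text: squared-error loss averaged over the
    batch and plain gradient descent.
    """
    rng = np.random.default_rng(seed)
    n, d = samples.shape
    scale = np.sqrt(6.0 / (d + d_h))
    W1 = (rng.random((d, d_h)) - 0.5) * scale            # S42: initialization
    W2 = (rng.random((d_h, d)) - 0.5) * scale
    b1, b2 = np.zeros(d_h), np.zeros(d)
    num = n // batch_size
    order = rng.permutation(n)                           # S41: random split
    for t in range(NN):                                  # S44: outer loop
        for s in range(num):                             # inner loop over batches
            x = samples[order[s * batch_size:(s + 1) * batch_size]]
            x_tilde = x * (rng.random(x.shape) > threshold)       # S441: masking noise
            y = sigmoid(x_tilde @ W1 + b1)                        # S442: forward pass
            z = sigmoid(y @ W2 + b2)
            delta_out = (z - x) * z * (1.0 - z)                   # S443: output residual
            delta_hid = (delta_out @ W2.T) * y * (1.0 - y)        # hidden residual
            W2 -= learningrate * (y.T @ delta_out) / batch_size   # S444: updates
            b2 -= learningrate * delta_out.mean(axis=0)
            W1 -= learningrate * (x_tilde.T @ delta_hid) / batch_size
            b1 -= learningrate * delta_hid.mean(axis=0)
    return W1, b1, W2, b2

def reconstruction_error(samples, W1, b1, W2, b2):
    """Mean squared error of reconstructing uncorrupted samples."""
    z = sigmoid(sigmoid(samples @ W1 + b1) @ W2 + b2)
    return float(np.mean((z - samples) ** 2))

rng = np.random.default_rng(1)
latent = rng.random((200, 3))
data = sigmoid(latent @ rng.normal(size=(3, 12)))   # synthetic samples in (0, 1)
untrained = train_dae(data, d_h=6, NN=0)            # initialization only
trained = train_dae(data, d_h=6, NN=50)
err_before = reconstruction_error(data, *untrained)
err_after = reconstruction_error(data, *trained)
```

To train DAE2 in the same way, the hidden activations sigmoid(data @ W1 + b1) of the trained DAE1 would be passed to train_dae as the new sample set.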

Semantic feature generation module: stacks the two trained denoising autoencoders DAE1 and DAE2 to obtain a three-layer SDAE structure, in which the first layer is the input layer, of dimension d; the second layer is the hidden layer of DAE1, of dimension d_h1; and the third layer is the hidden layer of DAE2, of dimension d_h2. Given the hand-crafted shallow features of a breast mammography image, feed-forward propagation through this SDAE structure yields the high-level abstract semantic feature representation based on the stacked denoising autoencoder. The semantic features obtained in this way realize the extraction of breast ultrasound image features, thereby providing a valuable "reference opinion" for clinical diagnosis and improving the accuracy and efficiency of breast cancer diagnosis.

Further, the sample training module is also used to extract four kinds of hand-crafted shallow features, namely GLCM, wavelet, wavelet packet and MPEG-7 features, from each ROI image and to concatenate them into one d-dimensional feature vector that serves as one sample. The key point of the sample training module is the concatenation. In general, each of the four shallow features captures only part of the physical characteristics of the image and is not comprehensive. To ensure that better high-level features can subsequently be learned from comprehensive physical features, the four different shallow features are concatenated as the basis for subsequent learning, so that the physical information of the ROI image is covered as fully as possible.

The first encoder training module comprises the following units:

learningrate = 1;

b₁ = 0, b₂ = 0;

Iteration count setting unit: sets the maximum number of iterations NN;

Inner loop setting: for s = 1 to num;

Feed-forward unit: performs feed-forward propagation:

z⁽ⁱ⁾ = sigmoid(W₂^T y⁽ⁱ⁾ + b₂), i = 1, 2, …, batch_size;

Backpropagation unit: performs backpropagation;

Parameter update unit: updates the parameters.

It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or terminal device that comprises a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or terminal device. Without further limitation, an element defined by the phrase "comprising …" or "including …" does not exclude the presence of other elements in the process, method, article or terminal device that comprises the element. In addition, in this document, "greater than", "less than", "exceeding" and the like are understood as excluding the stated number, while "or more", "or less", "within" and the like are understood as including it.

Those skilled in the art should understand that the above embodiments may be provided as a method, an apparatus or a computer program product. These embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. All or part of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware; the program may be stored in a storage medium readable by a computer device and is used to execute all or part of the steps of the methods in the above embodiments. The computer device includes but is not limited to: personal computers, servers, general-purpose computers, special-purpose computers, network devices, embedded devices, programmable devices, intelligent mobile terminals, smart home devices, wearable smart devices, in-vehicle smart devices and the like. The storage medium includes but is not limited to: RAM, ROM, magnetic disks, magnetic tapes, optical discs, flash memory, USB flash drives, portable hard drives, memory cards, memory sticks, network server storage, network cloud storage and the like.

The above embodiments are described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a computer device to produce a machine, so that the instructions executed by the processor of the computer device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a memory readable by a computer device that can direct the computer device to operate in a particular manner, so that the instructions stored in that memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer device, so that a series of operational steps is performed on the computer device to produce a computer-implemented process, whereby the instructions executed on the computer device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Although the above embodiments have been described, those skilled in the art can make further changes and modifications to these embodiments once they grasp the basic inventive concept. The foregoing is therefore only embodiments of the present invention and does not limit the scope of patent protection of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims

1. A method for self-learning extraction of breast ultrasound image features based on a stacked denoising autoencoder, characterized by comprising the following steps:

Step S1: A breast ultrasound lesion region image set of at least medium scale is given, where medium scale means that the image set contains at least 200 breast ultrasound diagnostic images;

Step S2: The lesion region image ROI of each breast ultrasound diagnostic image in the image set of step S1 is extracted manually;

Step S3: Hand-crafted shallow features are extracted from each breast ultrasound lesion region image ROI as one training sample, forming the training sample set set_unlabeled = {x⁽¹⁾, x⁽²⁾, …, x⁽ⁿ⁾}, where the i-th sample x⁽ⁱ⁾ ∈ [0,1]^d, i = 1, 2, …, n; d denotes the feature dimension of a sample and n the number of samples in the training set;

Step S4: The first denoising autoencoder DAE1 is trained on the training sample set;

Step S5: After the first denoising autoencoder has been trained, the training sample set is input again, and the encoder trained in step S4 extracts the feature representation learned by the hidden layer for every sample, forming new samples {y⁽¹⁾, y⁽²⁾, …, y⁽ⁿ⁾}, which serve as the input for training the second denoising autoencoder DAE2;

Step S6: The two trained denoising autoencoders DAE1 and DAE2 are stacked to obtain a three-layer SDAE structure: the first layer is the input layer, of dimension d; the second layer is the hidden layer of DAE1, of dimension d_h1; the third layer is the hidden layer of DAE2, of dimension d_h2. Given the hand-crafted shallow features of a breast mammography image, feed-forward propagation through this SDAE structure yields the high-level abstract semantic feature representation based on the stacked denoising autoencoder.

2. The method for self-learning extraction of breast ultrasound image features based on a stacked denoising autoencoder according to claim 1, characterized in that:

Step S3 is: four kinds of hand-crafted shallow features, namely GLCM, wavelet, wavelet packet and MPEG-7 features, are extracted from each ROI image and concatenated into one d-dimensional feature vector that serves as one sample.

3. The method for self-learning extraction of breast ultrasound image features based on a stacked denoising autoencoder according to claim 1, characterized in that:

In step S4 the first denoising autoencoder consists of a three-layer network whose input layer x, hidden layer y and output layer z have d, d_h1 and d neurons respectively. The input of the input layer is a sample x⁽ⁱ⁾ ∈ [0,1]^d, i = 1, 2, …, n, from the training sample set, and noise is deliberately introduced at the input layer. The parameters are θ₁ = {W₁, b₁} and θ₂ = {W₂, b₂}, where b₁ and b₂ are the bias vectors of the hidden layer and the output layer, of dimension d_h1 and d respectively, and W₁ and W₂ are the weight matrices connecting the input layer to the hidden layer and the hidden layer to the output layer, of size d × d_h1 and d_h1 × d respectively. All activation functions are sigmoid functions;

Step S4 comprises the following steps：

Step S41：Training sample set is split, concretely comprises the following steps：The training sample set random division of ultrasonoscopy is Num batch, each batch_i∈[0,1]^{batch_size×d}, i=1,2 ..., num；
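The random split of step S41 might look like the sketch below. The function name `split_into_batches` and the fixed seed are illustrative, and dropping leftover samples when n is not a multiple of batch_size is an assumption; the text does not say how a remainder is handled.

```python
import numpy as np

def split_into_batches(samples, batch_size, seed=0):
    """Randomly partition the rows of `samples` (n x d) into batches."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(samples.shape[0])   # random sample order
    shuffled = samples[order]
    num = samples.shape[0] // batch_size        # leftover rows are dropped (assumption)
    return [shuffled[k * batch_size:(k + 1) * batch_size] for k in range(num)]

X = np.random.default_rng(1).random((200, 12))  # placeholder training set
batches = split_into_batches(X, batch_size=20)
```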

Step S42: Initialize the network parameters, specifically:

learning rate = 1;

W_{1} = (\mathrm{rand}(d, d_{h1}) - 0.5) \times \sqrt{\frac{6}{d + d_{h1}}};

W_{2} = (\mathrm{rand}(d_{h1}, d) - 0.5) \times \sqrt{\frac{6}{d_{h1} + d}};

b₁ = 0, b₂ = 0;
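The initialization above translates almost directly into NumPy, taking rand(·) to mean uniform samples on [0, 1). The function name `init_dae_params` and the seed are illustrative choices.

```python
import numpy as np

def init_dae_params(d, d_h, seed=0):
    """Initialise W1, b1, W2, b2 following the formulas of step S42."""
    rng = np.random.default_rng(seed)
    scale = np.sqrt(6.0 / (d + d_h))
    W1 = (rng.random((d, d_h)) - 0.5) * scale   # input -> hidden, d x d_h
    W2 = (rng.random((d_h, d)) - 0.5) * scale   # hidden -> output, d_h x d
    b1 = np.zeros(d_h)                          # hidden-layer bias
    b2 = np.zeros(d)                            # output-layer bias
    return W1, b1, W2, b2

W1, b1, W2, b2 = init_dae_params(d=12, d_h=5)
```

With this scheme every weight lies within ±0.5·sqrt(6/(d + d_h1)), so the initial sigmoid pre-activations stay small.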

Step S43: Set the maximum number of iterations NN;

Step S44: Outer loop: t = 1 to NN;

Inner loop: s = 1 to num;

Step S441: Corrupt the data: using binary masking noise, randomly reset some of the values in the input feature vector x to 0 with a certain probability; specifically:

batch_s = batch_s × (rand(batch_size, d) > threshold), where batch_s is a batch_size × d matrix, the product is taken element-wise, and threshold is a preset threshold, specifically set to 0.2; if an element A_ij of the generated random matrix A = rand(batch_size, d) is less than threshold, the element at the corresponding position of batch_s is reset to 0; the i-th sample in batch_s is defined as x⁽ⁱ⁾, and after corruption it is
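The masking step can be sketched as follows, with the function name `corrupt` chosen for illustration. Each entry is kept with probability 1 − threshold and zeroed otherwise.

```python
import numpy as np

def corrupt(batch, threshold=0.2, rng=None):
    """Binary masking noise: zero each entry whose random draw is <= threshold."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(batch.shape) > threshold   # True where the value is kept
    return batch * mask                          # element-wise product

rng = np.random.default_rng(0)
# Strictly positive placeholder data, so zeros in the output come only from masking.
batch = 0.1 + 0.9 * rng.random((20, 12))
noisy = corrupt(batch, threshold=0.2, rng=rng)
```

With threshold = 0.2, roughly one fifth of the entries of each batch are expected to be zeroed.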

Step S442: Feed-forward propagation:

z^{(i)} = \mathrm{sigmoid}(W_{2}^{T} y^{(i)} + b_{2}), \quad i = 1, 2, \ldots, \mathrm{batch\_size};

Step S443: Backpropagation:

\delta_{j}^{(i)h} = \left( \sum_{t=1}^{d} W_{2,jt} \, \delta_{t}^{(i)o} \right) y_{j}^{(i)} (1 - y_{j}^{(i)}), \quad i = 1, 2, \ldots, \mathrm{batch\_size}, \; 1 \leq j \leq d_{h1};

Step S444: Update the parameters:

W_{2} = W_{2} - \frac{1}{\mathrm{batch\_size}} \sum_{i=1}^{\mathrm{batch\_size}} y^{(i)} \left( \delta^{(i)o} \right)^{T};

b_{1} = b_{1} - \frac{1}{\mathrm{batch\_size}} \sum_{i=1}^{\mathrm{batch\_size}} \left( \delta_{j}^{(i)h} \right)^{T};

b_{2} = b_{2} - \frac{1}{\mathrm{batch\_size}} \sum_{i=1}^{\mathrm{batch\_size}} \left( \delta_{j}^{(i)o} \right)^{T};
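Steps S441 to S444 can be put together as one mini-batch update, sketched below with one sample per row. Two pieces are assumptions, because they do not appear in the text at this point: the hidden activation y = sigmoid(W₁ᵀx̃ + b₁) and the W₁ update, both written in the standard backpropagation form. The output delta uses the form (z − x)·z·(1 − z) given for the backpropagation unit in claim 6. The function name `dae_step` is hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def dae_step(X, W1, b1, W2, b2, threshold=0.2, rng=None):
    """One in-place mini-batch update of a denoising autoencoder (learning rate 1)."""
    rng = np.random.default_rng() if rng is None else rng
    bs = X.shape[0]
    X_noisy = X * (rng.random(X.shape) > threshold)   # S441: masking noise
    Y = sigmoid(X_noisy @ W1 + b1)                    # hidden layer (assumed form)
    Z = sigmoid(Y @ W2 + b2)                          # S442: reconstruction
    delta_o = (Z - X) * Z * (1.0 - Z)                 # S443: output delta, clean target
    delta_h = (delta_o @ W2.T) * Y * (1.0 - Y)        # S443: hidden delta
    W2 -= Y.T @ delta_o / bs                          # S444
    W1 -= X_noisy.T @ delta_h / bs                    # assumed standard form
    b1 -= delta_h.mean(axis=0)
    b2 -= delta_o.mean(axis=0)
    return 0.5 * np.mean(np.sum((Z - X) ** 2, axis=1))

rng = np.random.default_rng(0)
d, d_h = 8, 4
X = 0.8 + 0.1 * rng.random((20, d))                   # placeholder batch in [0, 1]^d
scale = np.sqrt(6.0 / (d + d_h))
W1 = (rng.random((d, d_h)) - 0.5) * scale
W2 = (rng.random((d_h, d)) - 0.5) * scale
b1, b2 = np.zeros(d_h), np.zeros(d)
losses = [dae_step(X, W1, b1, W2, b2, rng=rng) for _ in range(200)]
```

Both deltas are computed before any weight is changed, so the hidden delta uses the same W₂ as the forward pass.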

4. A breast ultrasound image feature self-learning extraction system based on a stacked denoising autoencoder, characterized by comprising the following modules:

Image set providing module: for providing a breast ultrasound lesion region image set of at least medium scale, where medium scale means that the image set contains 200 or more breast ultrasound diagnostic images;

Lesion region extraction module: for manually extracting the lesion region image ROI of each breast ultrasound diagnostic image in the image set of step S1;

Sample training module: for extracting manual shallow-layer features from each breast ultrasound lesion region image ROI as one training sample, forming the training sample set set_unlabeled = {x⁽¹⁾, x⁽²⁾, …, x⁽ⁿ⁾}, where the i-th sample x⁽ⁱ⁾ ∈ [0,1]^d, i = 1, 2, ..., n; d denotes the feature dimension of a sample and n denotes the number of samples in the training set;

Second encoder training module: for, after the first denoising autoencoder has been trained, inputting the training sample set again and using the encoder trained in step S4 to extract the feature representations learned by the hidden layer for all samples, forming a new sample set {y⁽¹⁾, y⁽²⁾, …, y⁽ⁿ⁾}, which is used as the input for training the second denoising autoencoder DAE2;

Semantic feature generation module: for stacking the two trained denoising autoencoders DAE1 and DAE2 to obtain a three-layer SDAE structure: the first layer is the input layer, of dimension d; the second layer is the hidden layer of DAE1, of dimension d_h1; the third layer is the hidden layer of DAE2, of dimension d_h2. Given the manual shallow-layer features of a breast molybdenum target image, feed-forward propagation through this SDAE structure yields the high-level abstract semantic feature representation based on the stacked denoising autoencoder.

5. The breast ultrasound image feature self-learning extraction system based on a stacked denoising autoencoder according to claim 4, characterized in that:

The sample training module is further used to extract four kinds of manual shallow-layer features (GLCM, wavelet, wavelet packet, and MPEG-7) from each ROI image and concatenate them into a d-dimensional feature vector that serves as one sample.

6. The breast ultrasound image feature self-learning extraction system based on a stacked denoising autoencoder according to claim 4, characterized in that:

In the first encoder training module, the first denoising autoencoder consists of a three-layer network whose input layer x, hidden layer y, and output layer z have d, d_h1, and d neurons respectively; the input of the input layer is a sample x⁽ⁱ⁾ ∈ [0,1]^d, i = 1, 2, ..., n, from the training sample set; noise is artificially introduced at the input layer; the parameters are θ₁ = {W₁, b₁} and θ₂ = {W₂, b₂}, where b₁ and b₂ are the bias vectors of the hidden layer and the output layer, of dimensions d_h1 and d respectively, and W₁ and W₂ are the weight matrices connecting the input layer to the hidden layer and the hidden layer to the output layer, of sizes d × d_h1 and d_h1 × d respectively; all activation functions are the sigmoid function;

The first encoder training module comprises the following units:

Sample splitting unit: for splitting the training sample set, specifically: randomly divide the training sample set of ultrasound images into num batches, each batch_i ∈ [0,1]^{batch_size×d}, i = 1, 2, ..., num;

learning rate = 1;

W_{1} = (\mathrm{rand}(d, d_{h1}) - 0.5) \times \sqrt{\frac{6}{d + d_{h1}}};

W_{2} = (\mathrm{rand}(d_{h1}, d) - 0.5) \times \sqrt{\frac{6}{d_{h1} + d}};

b₁ = 0, b₂ = 0;

Iteration count setting unit: for setting the maximum number of iterations NN;

Inner and outer loop setting unit: for setting the outer loop t = 1 to NN;

and for setting the inner loop s = 1 to num;

Data corruption unit: for corrupting the data: using binary masking noise, randomly reset some of the values in the input feature vector x to 0 with a certain probability; specifically:

Feed-forward unit: for feed-forward propagation:

z^{(i)} = \mathrm{sigmoid}(W_{2}^{T} y^{(i)} + b_{2}), \quad i = 1, 2, \ldots, \mathrm{batch\_size};

Backpropagation unit: for backpropagation:

\delta_{j}^{(i)o} = (z_{j}^{(i)} - x_{j}^{(i)}) \, z_{j}^{(i)} (1 - z_{j}^{(i)}), \quad i = 1, 2, \ldots, \mathrm{batch\_size}, \; 1 \leq j \leq d;

\delta_{j}^{(i)h} = \left( \sum_{t=1}^{d} W_{2,jt} \, \delta_{t}^{(i)o} \right) y_{j}^{(i)} (1 - y_{j}^{(i)}), \quad i = 1, 2, \ldots, \mathrm{batch\_size}, \; 1 \leq j \leq d_{h1};

Parameter update unit: for updating the parameters:

W_{2} = W_{2} - \frac{1}{\mathrm{batch\_size}} \sum_{i=1}^{\mathrm{batch\_size}} y^{(i)} \left( \delta^{(i)o} \right)^{T};

b_{1} = b_{1} - \frac{1}{\mathrm{batch\_size}} \sum_{i=1}^{\mathrm{batch\_size}} \left( \delta_{j}^{(i)h} \right)^{T};

b_{2} = b_{2} - \frac{1}{\mathrm{batch\_size}} \sum_{i=1}^{\mathrm{batch\_size}} \left( \delta_{j}^{(i)o} \right)^{T};