CN108109160A - An interaction-free GrabCut tongue segmentation method based on deep learning - Google Patents
An interaction-free GrabCut tongue segmentation method based on deep learning
- Publication number
- CN108109160A (application CN201711133796.8A)
- Authority
- CN
- China
- Prior art keywords
- tongue
- layer
- candidate region
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis; G06T7/10—Segmentation; Edge detection
- G06T7/162—involving graph-based methods
- G06T7/194—involving foreground-background segmentation
- G06T2207/00—Indexing scheme for image analysis or image enhancement; G06T2207/20—Special algorithmic details
- G06T2207/20072—Graph-based image processing
- G06T2207/20081—Training; Learning
- G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/20112—Image segmentation details
- G06T2207/20132—Image cropping
Abstract
An interaction-free GrabCut tongue segmentation method based on deep learning, comprising a deep convolutional neural network for whole-tongue feature extraction, a region-of-interest localization network for preliminary detection of the tongue region, a deep convolutional neural network for deep abstract feature extraction from the regions of interest, and a GrabCut algorithm for segmenting the tongue image. The invention effectively solves the problem that the existing GrabCut algorithm depends too heavily on human interaction when segmenting the tongue body, and improves the degree of automation of the GrabCut algorithm in tongue segmentation.
Description
Technical field
The present invention relates to a segmentation method, and in particular to the application of TCM tongue diagnosis, computer vision, digital image processing, pattern recognition, deep learning, deep convolutional neural networks and related techniques to the field of automatic tongue image segmentation.
Background technology
Tongue diagnosis is an important component of inspection in traditional Chinese medicine: by observing related attributes of the tongue coating and tongue body, including color and form, the physician judges where the disease lies and then guides diagnosis and treatment. Today, the standardization, quantification and objectification of TCM tongue diagnosis have become the main research directions in the modernization of TCM diagnostics, and are of profound significance for the development of traditional Chinese medicine as a whole.
Research on standardizing, quantifying and objectifying tongue diagnosis has grown out of techniques such as photography, digital image processing, pattern recognition and computer vision. It mainly covers tongue image acquisition, color correction, tongue segmentation, region division (coating-body separation), and the analysis of tongue color, tongue shape, tooth marks, tongue body and sublingual vessels. These studies form the basis of modernized tongue diagnosis applications and are essential to making tongue diagnosis quantitative and objective.
Precisely separating the tongue body from the tongue image is a prerequisite for tongue diagnosis. In recent years researchers have applied the GrabCut algorithm to tongue segmentation and achieved a certain effect. In use, however, the GrabCut algorithm requires the foreground-background bounding box to be given through human interaction, a step that greatly reduces the automation of the algorithm.
This invention proposes an interaction-free GrabCut algorithm based on deep learning: a deep convolutional neural network locates the tongue body and derives the foreground-background bounding box automatically. The two key techniques the invention relies on are introduced as follows:
(1) Convolutional neural networks
Deep learning has been widely applied in computer vision in recent years, thanks to the rapid development of deep learning technology. Convolutional neural networks can make full use of large numbers of training samples to extract abstract information layer by layer, learning the deep features of images more directly and more comprehensively. In a large number of tasks these features have proven to have stronger representational power than traditional hand-crafted features and can describe the overall structure of an image in more detail. Convolutional network technology has developed from R-CNN and Fast R-CNN to Faster R-CNN, and from CNN to FCN, covering almost all the key areas of computer vision such as object detection, classification and segmentation.
Convolutional neural networks are built in imitation of the human perceptual system. The brain processes information layer by layer, from the concrete to the abstract: the low-level features of the input information are processed and extracted to obtain the essential information of the data, forming higher-level abstractions the brain can understand. This hierarchical structure preserves the essential information of an object while reducing the amount of data the brain must process. By simulating this pyramid-like transfer of information, a deep convolutional neural network gains an important advantage: it extracts information layer by layer, from pixel-level raw data up to abstract semantic concepts, giving it a prominent strength in extracting the deep features and semantic information of images.
(2) The GrabCut algorithm
GrabCut is a very effective segmentation method. It first requires the user to mark foreground and background information by hand, i.e. to specify a rectangle containing the foreground, and then models the foreground and background with Gaussian mixture models (GMM). Based on the user's input, the GMMs learn and create new pixel distributions; pixels of unknown class are classified according to their relationships with pixels of known class. A graph can then be created from this pixel distribution, with the pixels as its nodes, and the graph is segmented with a min-cut algorithm.
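The GMM-based classification of unknown pixels described above can be sketched as follows. This is an illustrative simplification rather than the patent's implementation: GrabCut proper fits multi-component GMMs over RGB values, while this sketch uses a single Gaussian per class over grayscale intensities.

```python
import numpy as np

def fit_gaussian(pixels):
    # Fit a single Gaussian (mean, variance) to 1-D pixel samples.
    # GrabCut itself uses a multi-component GMM per class over RGB;
    # this single-Gaussian grayscale version only illustrates the idea.
    return pixels.mean(), pixels.var() + 1e-6

def log_likelihood(x, mean, var):
    # Log-density of x under a Gaussian with the given mean and variance.
    return -0.5 * np.log(2 * np.pi * var) - (x - mean) ** 2 / (2 * var)

def classify(unknown, fg_pixels, bg_pixels):
    # Label each unknown pixel foreground (1) or background (0) by
    # comparing its likelihood under the two class models.
    fg = fit_gaussian(fg_pixels)
    bg = fit_gaussian(bg_pixels)
    return (log_likelihood(unknown, *fg) > log_likelihood(unknown, *bg)).astype(int)

fg = np.array([200.0, 210.0, 205.0, 195.0])  # bright, tongue-like samples
bg = np.array([30.0, 25.0, 40.0, 35.0])      # dark background samples
print(classify(np.array([198.0, 33.0]), fg, bg))  # → [1 0]
```

In the full algorithm this likelihood comparison only sets the data term of the graph; the final labels come from the min-cut step.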
Summary of the invention
To overcome the problem that the existing GrabCut algorithm, when used for tongue segmentation, requires human interaction to supply the foreground-background bounding box and is therefore poorly automated, the present invention proposes an interaction-free GrabCut tongue segmentation method based on deep learning: a deep convolutional neural network is built to locate the tongue body automatically, so the foreground-background bounding box is obtained without manual input, improving the degree of automation of the segmentation algorithm.
The technical solution adopted by the present invention to solve this technical problem is as follows:
An interaction-free GrabCut tongue segmentation method based on deep learning comprises a deep convolutional neural network for whole-tongue feature extraction, a region-of-interest localization network for preliminary detection of the tongue region, a deep convolutional neural network for deep abstract feature extraction from the regions of interest, and a GrabCut algorithm for segmenting the tongue image;
The deep convolutional neural network for whole-tongue feature extraction serves as the base network of the whole network model. It is divided into five layers, a deep structure of alternating convolutional, activation and pooling layers, and implicitly performs unsupervised learning from the given tongue image data, avoiding explicit manual feature extraction;
The region-of-interest localization network for preliminary detection of the tongue region, i.e. the RPN network, detects and partitions the regions corresponding to the different attributes of the tongue surface and obtains preliminary tongue proposals;
The deep convolutional neural network for deep abstract feature extraction from the regions of interest consists of fully connected layers. It performs deep feature extraction on the tongue proposal regions obtained in the previous stage: the input region is mapped layer by layer through the network, yielding different representations and extracting its abstract features, thereby achieving a deep representation of the tongue image and obtaining the tongue localization result.
The GrabCut algorithm for segmenting the tongue image takes the tongue bounding box obtained above as input, thereby distinguishing the foreground and background of the tongue image and completing automatic tongue segmentation without any human interaction.
Further, the deep convolutional neural network for whole-tongue feature extraction is divided into five layers; the convolutional network is a deep structure of alternating convolutional, activation and pooling layers. The convolution operations enhance the original signal and reduce noise; the pooling operations exploit the principle of local image correlation to sub-sample the image, reducing the amount of data to process while retaining the useful information of the image;
The network accepts a tongue image of arbitrary size as input; its structure is as follows. The first convolutional layer Conv1 has 96 kernels of size 7 × 7 × 3, stride 2, padding 3. The first pooling layer Pool1 has a 7 × 7 × 3 pooling kernel, stride 2, padding 1, followed by a ReLU activation. The second convolutional layer Conv2 has 256 kernels of size 5 × 5 × 96, stride 2, padding 2. The second pooling layer Pool2 has a 7 × 7 × 96 pooling kernel, stride 2, padding 1, followed by a ReLU activation. The third convolutional layer Conv3 has 384 kernels of size 3 × 3 × 256, padding 1, followed by a ReLU activation. The fourth convolutional layer Conv4 has 384 kernels of size 3 × 3 × 384, padding 1, followed by a ReLU activation. The fifth convolutional layer Conv5 has 256 kernels of size 3 × 3 × 384, padding 1, followed by a ReLU activation.
Through these five layers of feature extraction, each tongue image yields 256 feature maps, which serve as the input to the RPN network.
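As a sanity check on the layer specification above, the spatial size of the feature maps can be traced with the standard output-size formula. The strides of Conv3 to Conv5 are not stated in the text, so stride 1 is assumed here, and the 224-pixel input is only an example, since the network accepts arbitrary sizes.

```python
def out_size(n, k, s, p):
    # Standard conv/pool output size: floor((n + 2p - k) / s) + 1.
    return (n + 2 * p - k) // s + 1

# (kernel, stride, padding) per layer as listed in the text; the strides
# of Conv3-Conv5 are not stated there, so stride 1 is assumed.
layers = [("Conv1", 7, 2, 3), ("Pool1", 7, 2, 1),
          ("Conv2", 5, 2, 2), ("Pool2", 7, 2, 1),
          ("Conv3", 3, 1, 1), ("Conv4", 3, 1, 1), ("Conv5", 3, 1, 1)]

n = 224  # example input width/height
for name, k, s, p in layers:
    n = out_size(n, k, s, p)
    print(name, n)  # spatial size after each layer
```

Under these assumptions a 224 × 224 input shrinks to a 12 × 12 × 256 feature volume, i.e. the 256 feature maps fed to the RPN.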
Further, in the region-of-interest localization network for preliminary detection of the tongue region, the RPN network receives the 256 feature maps generated by the base network as input, post-processes the feature maps with three convolutional layers and an algorithm layer, and outputs a set of rectangular target candidate boxes, each containing 4 position-coordinate variables and a score;
The first convolutional layer Conv1/rpn of the RPN network has 256 kernels of size 3 × 3 × 256; the second convolutional layer Conv2/rpn has 18 kernels of size 1 × 1 × 256; the third convolutional layer Conv3/rpn has 36 kernels of size 1 × 1 × 256;
The RPN network additionally adds an algorithm layer to generate the region candidate boxes, performing a multi-scale convolution operation on the feature map. It is implemented as follows: at each sliding-window position, 3 scales and 3 aspect ratios are used; with the center of the current sliding window as center, each combination of scale and aspect ratio is mapped back to the original image, giving candidate regions of 9 different sizes. For a shared convolutional feature map of size w × h there are thus w × h × 9 candidate regions in total. Finally, the classification layer outputs the scores of the w × h × 9 candidate regions as w × h × 9 × 2 values, i.e. the estimated target/non-target probability of each region, and the regression layer outputs w × h × 9 × 4 parameters, i.e. the coordinate parameters of the candidate regions;
The RPN training process is as follows. First, a 3 × 3 sliding window traverses every point of the feature map; the position in the original image to which the window center maps is found, and, with that point as center, candidate regions of 3 scales (128², 256², 512²) and 3 aspect ratios (1:1, 2:1, 1:2) are generated in the original image. Each point of the feature map thus corresponds to 9 candidate regions in the original image, so for a feature map of size w × h the number of generated candidate regions is w × h × 9. Next, all candidate regions are screened and labeled in two passes. First, candidate regions extending beyond the original image are deleted, completing the first screening; for each remaining candidate region, its intersection over union (overlap ratio) with every ground-truth region is then computed, and according to this ratio each candidate region is assigned a binary label that decides whether the region is tongue. The criteria are: 1) the candidate region with the maximum ratio is regarded as a positive sample, i.e. tongue; 2) among the other candidate regions, those with a ratio above 0.7 are regarded as positive samples and those below 0.3 as negative samples, i.e. background; candidate regions with a ratio in between are discarded.
The overlap ratio between a candidate region A and the ground-truth box GT is computed by formula (1):
IoU(A, GT) = area(A ∩ GT) / area(A ∪ GT)  (1)
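The intersection-over-union computation can be sketched as a short Python function; the (x1, y1, x2, y2) corner format for boxes is an assumption made for illustration.

```python
def iou(box_a, box_b):
    # Intersection over union of two boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # → 0.3333... (50 / 150)
```

A value above 0.7 would mark the candidate as tongue, below 0.3 as background, per the labeling criteria above.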
After the second screening of the candidate regions is complete, a second labeling pass is performed: each positive candidate region takes as its label the label of the ground-truth region with which it has the maximum intersection over union, i.e. the foreground label, and all negative samples are given the background label. The positive and negative samples are then randomly sampled, with the sample count set to 128 and the positive-to-negative ratio set to 1:1. Positive samples are usually the fewer; if there are fewer than 64 of them, the shortfall is made up with negative samples. The 128 positive and negative samples are merged and trained together in the subsequent network to enhance the discrimination between labeled and unlabeled samples.
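The candidate-region generation underlying this training procedure (3 scales × 3 aspect ratios per sliding-window position) can be sketched as below. Keeping the area fixed at scale² while varying the aspect ratio follows the usual RPN anchor convention, which is assumed here rather than spelled out in the text.

```python
import math

SCALES = [128, 256, 512]           # candidate areas are scale**2 (128², 256², 512²)
RATIOS = [(1, 1), (2, 1), (1, 2)]  # width : height

def anchors_at(cx, cy):
    # The 9 candidate boxes (x1, y1, x2, y2) centred on the point in the
    # original image to which one sliding-window position maps.
    boxes = []
    for s in SCALES:
        for rw, rh in RATIOS:
            w = s * math.sqrt(rw / rh)  # keep area s*s under the ratio
            h = s * math.sqrt(rh / rw)
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes

boxes = anchors_at(300, 300)
print(len(boxes))  # → 9, so a w × h feature map yields w * h * 9 regions
```

Boxes extending beyond the image border would be deleted in the first screening pass described above.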
Further, the deep convolutional neural network for deep abstract feature extraction from the regions of interest consists of fully connected layers, preceded by a pyramid pooling layer that normalizes the input size;
This sub-network extracts features from the sampled candidate regions with fully connected layers. The candidate regions come in 9 sizes, while fully connected layers require inputs of a consistent size, so a pyramid pooling layer first performs size normalization. The result is then fed to three fully connected layers for deep feature extraction; the output neuron count of these layers is set to 1024, yielding a 1024-dimensional feature vector. This feature vector is then fed into two fully connected layers for feature compression, with output neuron counts of 2 and 8 respectively. Finally, the outputs are compared with the ground-truth label values and constrained by regression of the loss functions;
The loss function is given by formula (2):
L({p_i}, {t_i}) = (1/N_cls) Σ L_cls(p_i, p_i*) + λ (1/N_reg) Σ p_i* L_reg(t_i, t_i*)  (2)
In the formula, the classification loss is defined by formula (3):
L_cls(p_i, p_i*) = −log[p_i p_i* + (1 − p_i)(1 − p_i*)]  (3)
The position regression loss is defined by formula (4):
L_reg(t_i, t_i*) = R(t_i − t_i*)  (4)
R is the robust smooth-L1 loss function, given by formula (5):
R(x) = 0.5 x², if |x| < 1; |x| − 0.5, otherwise  (5)
In the formulas, N_cls and N_reg are regularization terms that guard against over-fitting, λ is a weight coefficient, i is the class index of the candidate region, t_i is the predicted coordinate offset of the candidate region, t_i* is its ground-truth coordinate offset, p_i is the probability that the predicted candidate region belongs to the i-th class, and p_i* denotes its true class: p_i* = 0 denotes the background class and p_i* = 1 the tongue class;
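The robust smooth-L1 term named above is quadratic near zero and linear elsewhere; the threshold of 1 follows the common Fast R-CNN definition, which is assumed here.

```python
def smooth_l1(x):
    # Robust smooth-L1: 0.5 * x**2 for |x| < 1, |x| - 0.5 otherwise.
    # The threshold of 1 follows the common Fast R-CNN definition.
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

print(smooth_l1(0.5))  # → 0.125  (quadratic region)
print(smooth_l1(3.0))  # → 2.5    (linear region)
```

The linear tail makes the regression loss less sensitive to outlier coordinate errors than a plain squared loss.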
These two loss functions compute the errors between the predicted values and the given ground truth; the back-propagation algorithm passes the errors back layer by layer, and stochastic gradient descent adjusts and updates the parameters of every layer, with the update rule shown as formula (6):
w' = w − η ∂E/∂w  (6)
so that the network's predictions come ever closer to the true values, i.e. the outputs of the last two fully connected layers approach the class and location information given in the labels;
In the formula, w and w' are the parameter values before and after the update respectively, E is the error value computed by the loss function layer, and η is the learning rate.
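The parameter update described here can be written as a one-line rule; plain stochastic gradient descent without momentum or weight decay is assumed, since the text mentions neither.

```python
import numpy as np

def sgd_step(w, grad, lr):
    # One stochastic-gradient-descent update: w' = w - eta * dE/dw.
    return w - lr * grad

w = np.array([0.5, -0.2])      # current layer parameters
g = np.array([0.1, -0.4])      # gradient of the loss E w.r.t. w
print(sgd_step(w, g, lr=0.1))  # → [ 0.49 -0.16]
```

Applied layer by layer with back-propagated gradients, this drives the outputs of the last two fully connected layers toward the labeled class and location values.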
The GrabCut algorithm for segmenting the tongue image comprises the following steps:
Step 1: Given the foreground-background bounding box, which marks the foreground and background information, GrabCut performs statistical modeling of the foreground data and the background data with Gaussian mixture models (GMM). The GMMs learn and create new pixel distributions; pixels of unknown class are classified according to their pixel relationships with pixels of known class;
Step 2: From the result of Step 1, a graph is created from the pixel distribution, with the pixels as its nodes. Besides the pixel nodes there are two further nodes, Source_node and Sink_node: all foreground pixels are connected to Source_node and all background pixels to Sink_node. The weight of the edge connecting a pixel to Source_node or Sink_node is determined by the probability that they belong to the same class (both foreground or both background); the weight between two pixels is determined by the edge information or by the similarity of the two pixels. If the colors of two pixels are very different, the weight of the edge between them will be very small;
Step 3: The graph obtained above is segmented with a min-cut algorithm, which divides the graph into a Source_node part and a Sink_node part according to a minimum-cost function; the cost is the sum of the weights of all edges that are cut. After the cut, all pixels connected to Source_node are regarded as foreground and all pixels connected to Sink_node as background;
Step 4: This process is repeated until the classification converges, at which point the segmentation is complete.
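The min-cut of Step 3 can be illustrated with a tiny max-flow computation: by the max-flow/min-cut theorem, the maximum flow from Source_node to Sink_node equals the cost of the cheapest cut. The Edmonds-Karp algorithm below is a didactic choice for a toy two-pixel graph; practical GrabCut implementations use faster specialized min-cut solvers.

```python
from collections import deque

def max_flow(cap, s, t):
    # Edmonds-Karp max-flow on an adjacency-matrix capacity graph.
    # Its value equals the minimum s-t cut, as used in GrabCut's Step 3.
    n = len(cap)
    flow = [[0] * n for _ in range(n)]
    total = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return total  # no augmenting path left: flow is maximal
        # Find the bottleneck along the path, then push that much flow.
        v, bottleneck = t, float("inf")
        while v != s:
            u = parent[v]
            bottleneck = min(bottleneck, cap[u][v] - flow[u][v])
            v = u
        v = t
        while v != s:
            u = parent[v]
            flow[u][v] += bottleneck
            flow[v][u] -= bottleneck
            v = u
        total += bottleneck

# Toy graph: node 0 = Source_node, node 3 = Sink_node, nodes 1-2 = pixels.
cap = [[0, 9, 1, 0],
       [0, 0, 2, 1],
       [0, 0, 0, 9],
       [0, 0, 0, 0]]
print(max_flow(cap, 0, 3))  # → 4 (cut severs edges 0→2, 1→2, 1→3)
```

After the cut, pixels reachable from the source in the residual graph are labeled foreground and the rest background, matching Step 3.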
The beneficial effect of the present invention is that no foreground-background bounding box needs to be given manually, which improves the degree of automation of the segmentation algorithm.
Description of the drawings
Fig. 1 is the overall network framework for tongue localization;
Fig. 2 is the RPN network structure;
Fig. 3 shows partial results of tongue segmentation;
Fig. 4 is the flow chart of the interaction-free GrabCut tongue segmentation method based on deep learning.
Specific embodiment
The present invention will be further described below in conjunction with the accompanying drawings.
Referring to Figs. 1 to 4, an interaction-free GrabCut tongue segmentation method based on deep learning comprises a deep convolutional neural network for whole-tongue feature extraction, a region-of-interest localization network for preliminary detection of the tongue region, a deep convolutional neural network for deep abstract feature extraction from the regions of interest, and a GrabCut algorithm for segmenting the tongue image, configured as described in the summary above.
The depth convolutional neural networks for the extraction of tongue global feature, the facilities network as whole network model
Network is divided into five layers, the depth structure being alternately made of convolutional layer, active coating and pond layer, implicitly from given tongue picture number
According to middle carry out unsupervised learning, avoid and manually carry out explicit feature extraction;
It is described to be used to position network, i.e. RPN networks to the area-of-interest of tongue body region Preliminary detection, on lingual surface not
With attribute, corresponding region is detected and divides, and obtains the preliminary advice result of tongue body;
The depth convolutional neural networks for being used to carry out area-of-interest deep layer abstract characteristics extraction, by connecting entirely
Layer composition carries out further feature extraction to the tongue body suggestion areas obtained on last stage, and input area carries out layer by layer in a network
Mapping, obtains different representations, extracts its abstract characteristics, so as to fulfill the depth representing to tongue picture, obtains tongue body positioning
Result.
The GrabCut algorithms being split to tongue picture, using tongue body posting obtained above as input, thus
The foreground and background of tongue picture figure is distinguished, and then the automatic segmentation of tongue body is completed on the premise of without man-machine interactively.
Further, the depth convolutional neural networks for the extraction of tongue global feature, are divided into five layers, convolutional Neural
Network is the depth structure being alternately made of convolutional layer, active coating and pond layer;By convolution operation, prime information is made to enhance and subtract
Few noise;It is operated by pondization, using the principle of image local correlation, sub-sample is carried out to image, image is useful retaining
The treating capacity of data is reduced on the basis of information;
Network receives the tongue picture of arbitrary dimension as inputting, and network structure is as follows:The convolution kernel of first convolutional layer Conv1
Number is 96, and size is 7 × 7 × 3, and convolution step-length is 2, Filling power 3;The Chi Huahe of first pond layer (Pool1) for 7 ×
7 × 3, pond step-length is 2, Filling power 1;ReLU active coatings 1 are then carried out to handle;Second convolutional layer Conv2 has 256 volumes
Product core, size are 5 × 5 × 96, step-length 2, Filling power 2;The Chi Huahe of second pond layer Pool2 is 7 × 7 × 96, step-length
For 2, Filling power 1;ReLU active coatings 1 are then carried out to handle;3rd convolutional layer Conv3 has 384 convolution kernels, size 3
× 3 × 256, Filling power 1;ReLU active coatings 1 are then carried out to handle;4th convolutional layer Conv4 has 384 convolution kernels, greatly
Small is 3 × 3 × 384, Filling power 1;ReLU active coatings 1 are then carried out to handle;5th convolutional layer Conv5 has 256 convolution
Core, size are 3 × 3 × 384, Filling power 1;ReLU active coatings 1 are then carried out to handle;
By this five layers of feature extraction, every tongue picture obtains 256 characteristic patterns, the input as RPN networks.
Further, described to be used in the area-of-interest positioning network to tongue body region Preliminary detection, RPN networks receive
256 characteristic patterns of basic network generation carry out after-treatment as inputting, using three convolutional layers and algorithm layer to characteristic pattern,
The set of rectangular target candidate frame is exported, each frame includes 4 position coordinates variables and a score;
First convolutional layer Conv1/rpn of RPN networks has 256 convolution kernels, and size is 3 × 3 × 256;RPN networks
Second convolutional layer Conv2/rpn has 18 convolution kernels, and size is the 3rd convolutional layer Conv3/ of 1 × 1 × 256, RPN networks
Rpn has 36 convolution kernels, and size is 1 × 1 × 256;
RPN networks additionally add algorithm layer for formation zone candidate frame, and multiple dimensioned convolution behaviour is carried out on characteristic pattern
Make, be implemented as:In the position of each sliding window using 3 kinds of scales and 3 kinds of length-width ratios, with current sliding window mouth center
Centered on, and a kind of corresponding scale and length-width ratio, then mapping obtains the candidate region of 9 kinds of different scales in artwork, such as
Size is the shared convolution characteristic pattern of w × h, then a total of w × h × 9 candidate region;Finally, classify layer output w × h × 9 ×
The score of 2 candidate regions is the estimated probability of target/non-targeted to each region, return layer output w × h × 9 × 4
The coordinate parameters of parameter, i.e. candidate region;
Training process is as follows in RPN networks:First with each point on 3 × 3 sliding window traversal characteristic pattern, find
Sliding window central point is mapped in the position in artwork, and point centered on it at the point, and 3 kinds of scales are generated in artwork
(1282,2562,5122) and 3 kinds of length-width ratios (1:1,2:1,1:2) each point on candidate region, i.e. characteristic pattern is in artwork
9 candidate regions are all corresponded to, if characteristic pattern size is w × h, then the candidate region number generated is w × h × 9, next to institute
There is candidate region to be screened and judged twice twice;Leave out first and complete to sieve for the first time beyond the candidate region of artwork scope
Choosing then calculates remaining candidate region it and hands over the ratio between unions i.e. Duplication with all real label areas, and according to than
It is worth and distributes a binary label for each candidate region, judges that the region is tongue body with this, criterion is:1) will
The candidate region of ratio maximum is considered as positive sample, i.e. tongue body;2) in other candidate regions, if ratio is more than 0.7, then it is assumed that be
Positive sample, less than 0.3, then it is assumed that be negative sample, i.e., background, the candidate region that ratio is interposed between the two are given up;
Candidate region and the calculating of true callout box GT Duplication are represented by formula (1):
After completing to the postsearch screening of candidate region, second of marker for judgment is carried out to it, there will be maximum hand over simultaneously with it
Label of the label of the true tab area of the ratio between collection as the candidate region, i.e. prospect label, and added for all negative samples
Background label carries out stochastical sampling to positive negative sample, and number of samples is set to 128, and oversampling ratio is set to 1:1, under normal circumstances just
Sample number is less, if positive sample number is less than 64, differential section is supplied by negative sample, in subsequent network by 128 just
Negative sample is merged trains together, with the discrimination of enhancing mark sample and non-mark sample.
Further, the depth convolutional Neural net for being used to carry out area-of-interest deep layer abstract characteristics extraction
Network is made of full articulamentum, and is added pyramid pond layer before this and carried out dimension normalization;
Sub-network carries out feature extraction using full articulamentum to the candidate region after sampling, and candidate region shares 9 kinds of sizes,
And full articulamentum requires input size consistent, therefore dimension normalization is carried out first with pyramid pond layer herein, then be sent to
Three full articulamentums carry out further feature extraction, and full articulamentum output neuron number is set to 1024 in sub-network, obtains
The feature vector of 1024 dimensions;Then, this feature vector is respectively fed to two full articulamentums and carries out Feature Compression, full articulamentum
Output neuron number is set to 2 and 8;Finally, output valve with true tag value is compared respectively, carries out returning for loss function
Reduction beam;
Loss function is represented by formula (2):
In formula, classification loss function is defined as by formula (3):
Position returns loss function and is defined as by formula (4):
R is the loss function smooth of robustL1, it is expressed as by formula (5):
In formula, NclsAnd NregIt is to avoid the regular terms of over-fitting, λ is weight coefficient, and i is the classification rope of the candidate region
Draw value, tiIt is the prediction coordinate shift amount of the candidate region, t*i is the actual coordinate offset of the candidate region, piIt is pre- astronomical observation
Favored area belongs to the probability of the i-th class, and p*i represents its true classification, and p*i=0 represents background classes, and p*i=1 represents tongue body class;
The error between predicted value and given actual value is calculated respectively by the two loss functions, is calculated using backpropagation
Method returns error layer by layer, and every layer of parameter is adjusted and updated using stochastic gradient descent method, more new formula such as formula (6)
It is shown so that closer to actual value, i.e., the output of most latter two full articulamentum is closer gives in mark value the predicted value of network
Classification and location information;
In formula, w and w' are respectively to update front and rear parameter value, and E is the error amount being calculated by loss function layer, η
For learning rate.
The described GrabCut algorithms for being split to tongue picture comprise the following steps:
Step1: Given the initial foreground/background bounding box, the foreground and background information is marked. GrabCut builds statistical models of the foreground data and the background data separately with Gaussian Mixture Models (GMM); the GMMs learn and create new pixel distributions, so that pixels whose class is unknown can be classified according to their relationship to pixels of known class;
Step2: From Step1, a graph is created over the pixel distribution: the nodes of the graph are the pixels, plus two additional terminal nodes, Source_node and Sink_node. Every foreground pixel is connected to Source_node and every background pixel to Sink_node. The weight of the edge connecting a pixel to Source_node/Sink_node is determined by the probability that they belong to the same class (both foreground or both background); the weight between two pixels is determined by the edge information, i.e. the similarity of the two pixels: if the colors of two pixels differ greatly, the weight of the edge between them is very small;
Step3: The graph obtained above is partitioned with the mincut algorithm, which divides it into a Source_node part and a Sink_node part according to a minimum-cost equation, the cost being the sum of the weights of all cut edges. After the cut, all pixels connected to Source_node are regarded as foreground, and all pixels connected to Sink_node are regarded as background.
Step4: This process is repeated until the classification converges, completing the segmentation.
Claims (5)
1. An interaction-free GrabCut tongue-body segmentation method based on deep learning, characterized by comprising: a deep convolutional neural network for whole-tongue feature extraction; a region-of-interest positioning network for preliminary detection of the tongue-body region; a deep convolutional neural network for extracting deep abstract features from the region of interest; and a GrabCut algorithm for segmenting the tongue image;
The deep convolutional neural network for whole-tongue feature extraction serves as the basic network of the overall model. It is divided into five layers, a deep structure of alternating convolutional, activation, and pooling layers, which learns from the given tongue-image data implicitly and without supervision, avoiding explicit manual feature extraction;
The region-of-interest positioning network for preliminary detection of the tongue-body region, i.e. the RPN network, detects and separates regions of different attributes on the tongue surface and produces candidate regions for the tongue body;
The deep convolutional neural network for extracting deep abstract features from the region of interest consists of fully connected layers. It performs further feature extraction on the tongue-body candidate regions obtained in the previous stage, mapping the input region layer by layer within the network to obtain different representations and extract its abstract features, thereby achieving a deep representation of the tongue image and yielding the tongue-body localization result;
The GrabCut algorithm for segmenting the tongue image takes the tongue-body bounding box obtained above as input, distinguishes the foreground and background of the tongue image, and thus completes automatic tongue-body segmentation without any human interaction.
2. The interaction-free GrabCut tongue-body segmentation method based on deep learning as claimed in claim 1, characterized in that: the deep convolutional neural network for whole-tongue feature extraction is divided into five layers, a deep structure of alternating convolutional, activation, and pooling layers. Convolution enhances the original signal and reduces noise; pooling exploits the local correlation of the image to sub-sample it, reducing the amount of data to process while retaining the useful image information;
The network accepts tongue images of arbitrary size as input, with the following structure: the first convolutional layer Conv1 has 96 convolution kernels of size 7 × 7 × 3, convolution stride 2, padding 3; the first pooling layer (Pool1) has a 7 × 7 × 3 pooling kernel, pooling stride 2, padding 1, followed by a ReLU activation layer; the second convolutional layer Conv2 has 256 kernels of size 5 × 5 × 96, stride 2, padding 2; the second pooling layer Pool2 has a 7 × 7 × 96 pooling kernel, stride 2, padding 1, followed by a ReLU activation layer; the third convolutional layer Conv3 has 384 kernels of size 3 × 3 × 256, padding 1, followed by a ReLU activation layer; the fourth convolutional layer Conv4 has 384 kernels of size 3 × 3 × 384, padding 1, followed by a ReLU activation layer; the fifth convolutional layer Conv5 has 256 kernels of size 3 × 3 × 384, padding 1, followed by a ReLU activation layer;
Through these five layers of feature extraction, each tongue image yields 256 feature maps, which serve as the input of the RPN network.
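The per-layer spatial sizes implied by these kernel, stride, and padding settings can be checked with a short sketch (illustrative only; the 224 × 224 input, stride 1 for Conv3 to Conv5, and square feature maps are assumptions not fixed by the claim):

```python
def out_size(n, k, s, p):
    """Spatial size after a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

n = 224                     # assumed square input side
n = out_size(n, 7, 2, 3)    # Conv1: 96 kernels 7x7, stride 2, pad 3 -> 112
n = out_size(n, 7, 2, 1)    # Pool1: 7x7 kernel, stride 2, pad 1     -> 54
n = out_size(n, 5, 2, 2)    # Conv2: 256 kernels 5x5, stride 2, pad 2 -> 27
n = out_size(n, 7, 2, 1)    # Pool2: 7x7 kernel, stride 2, pad 1      -> 12
n = out_size(n, 3, 1, 1)    # Conv3: 384 kernels 3x3, pad 1 (stride 1) -> 12
n = out_size(n, 3, 1, 1)    # Conv4: 384 kernels 3x3, pad 1            -> 12
n = out_size(n, 3, 1, 1)    # Conv5: 256 kernels 3x3, pad 1            -> 12
# Conv5's 256 kernels yield the 256 feature maps fed to the RPN.
```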
3. The interaction-free GrabCut tongue-body segmentation method based on deep learning as claimed in claim 1 or 2, characterized in that: in the region-of-interest extraction network for preliminary detection of the tongue-body region, the RPN network receives the 256 feature maps generated by the basic network as input, post-processes them with three convolutional layers and an algorithm layer, and outputs a set of rectangular target candidate boxes, each box carrying 4 position-coordinate variables and one score;
The first convolutional layer Conv1/rpn of the RPN network has 256 convolution kernels of size 3 × 3 × 256; the second convolutional layer Conv2/rpn has 18 kernels of size 1 × 1 × 256; the third convolutional layer Conv3/rpn has 36 kernels of size 1 × 1 × 256;
The RPN network additionally adds an algorithm layer for generating region candidate boxes, performing multi-scale convolution on the feature map. Concretely: at each sliding-window position, 3 scales and 3 aspect ratios are used; taking the current sliding-window center as the center, each scale/aspect-ratio pair is mapped back into the original image, giving candidate regions at 9 different scales, so a shared convolutional feature map of size w × h yields w × h × 9 candidate regions in total. Finally, the classification layer outputs w × h × 9 × 2 scores for the candidate regions, i.e. each region's estimated target/non-target probability, and the regression layer outputs w × h × 9 × 4 parameters, i.e. the coordinate parameters of the candidate regions;
The training process of the RPN network is as follows: first, a 3 × 3 sliding window traverses each point of the feature map; the position in the original image onto which the window center maps is found and used as the center to generate candidate regions of 3 scales (128², 256², 512²) and 3 aspect ratios (1:1, 2:1, 1:2) in the original image, so each point of the feature map corresponds to 9 candidate regions; for a feature map of size w × h, the number of generated candidate regions is w × h × 9. All candidate regions are then screened and judged twice: first, candidate regions extending beyond the original image are deleted, completing the first screening; next, the overlap (intersection over union) of each remaining candidate region with all ground-truth label regions is computed, and each candidate region is assigned a binary label according to this ratio to judge whether the region is the tongue body. The criteria are: 1) the candidate region with the maximum ratio is regarded as a positive sample, i.e. tongue body; 2) among the other candidate regions, those with a ratio greater than 0.7 are regarded as positive samples, and those below 0.3 as negative samples, i.e. background; candidate regions with a ratio between the two thresholds are discarded;
The overlap between a candidate region and the ground-truth box GT is computed by formula (1):
$$IoU=\frac{\mathrm{Anchor}\cap \mathrm{GT}}{\mathrm{Anchor}\cup \mathrm{GT}}\tag{1}$$
After this second screening of the candidate regions, a second label judgment is performed: each candidate region receives the label of the ground-truth region with which it has the largest intersection over union, i.e. a foreground label, while all negative samples receive a background label. The positive and negative samples are randomly sampled, with the sample count set to 128 and the sampling ratio set to 1:1. Since positive samples are usually scarce, if their number is below 64 the shortfall is made up with negative samples. The 128 positive and negative samples are merged and trained together in the subsequent network to enhance the discrimination between labeled and unlabeled samples.
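The overlap computation of formula (1) and the labelling rule above can be sketched as follows (illustrative only; the corner-coordinate (x1, y1, x2, y2) box format and the function names are assumptions):

```python
def iou(box_a, box_b):
    """Formula (1): intersection over union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

def label_anchor(overlap, is_max):
    """Claimed rule: the max-overlap anchor is positive; IoU > 0.7 is
    positive (tongue body); IoU < 0.3 is negative (background); anchors
    in between are discarded (None)."""
    if is_max or overlap > 0.7:
        return 1
    if overlap < 0.3:
        return 0
    return None
```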
4. The interaction-free GrabCut tongue-body segmentation method based on deep learning as claimed in claim 1 or 2, characterized in that: the deep convolutional neural network for extracting deep abstract features from the region of interest consists of fully connected layers, preceded by a pyramid pooling layer for size normalization;
The sub-network extracts features from the sampled candidate regions with fully connected layers. Since the candidate regions come in 9 sizes while fully connected layers require inputs of a uniform size, a pyramid pooling layer first normalizes the size; the result is then fed to three fully connected layers for further feature extraction, each with its output neuron number set to 1024, yielding a 1024-dimensional feature vector. This feature vector is then fed into two fully connected layers for feature compression, whose output neuron numbers are set to 2 and 8 respectively. Finally, the output values are compared with the true label values, and the regression is constrained through the loss function;
The loss function is given by formula (2):
$$L(\{p_i\},\{t_i\})=\frac{1}{N_{cls}}\sum_i L_{cls}(p_i,p_i^{*})+\lambda\frac{1}{N_{reg}}\sum_i p_i^{*}L_{reg}(t_i,t_i^{*})\tag{2}$$
In the formula, the classification loss function is defined by formula (3):
$$L_{cls}(p_i,p_i^{*})=-\log p_i\tag{3}$$
The position regression loss function is defined by formula (4):
$$L_{reg}(t_i,t_i^{*})=R(t_i-t_i^{*})\tag{4}$$
Here R is the robust loss function $\mathrm{smooth}_{L1}$, expressed by formula (5):
$$\mathrm{smooth}_{L1}(x)=\begin{cases}0.5x^{2}&\text{if }|x|<1\\|x|-0.5&\text{otherwise}\end{cases}\tag{5}$$
In the formula, $N_{cls}$ and $N_{reg}$ are regularization terms that help avoid over-fitting, λ is a weight coefficient, i is the class index of the candidate region, $t_i$ is the predicted coordinate offset of the candidate region, $t_i^{*}$ is its ground-truth coordinate offset, $p_i$ is the probability that the candidate region belongs to the i-th class, and $p_i^{*}$ is its true class label: $p_i^{*}=0$ denotes the background class and $p_i^{*}=1$ the tongue-body class;
These two loss functions compute the errors between the predicted values and the given ground-truth values; the back-propagation algorithm passes the error back layer by layer, and stochastic gradient descent adjusts and updates each layer's parameters according to the update rule shown in formula (6), so that the network's predictions move closer to the ground truth, i.e. the outputs of the last two fully connected layers approach the annotated class and location information;
$$w'=w-\eta\frac{\partial E}{\partial w}\tag{6}$$
In the formula, w and w' are the parameter values before and after the update respectively, E is the error value computed by the loss-function layer, and η is the learning rate.
5. The interaction-free GrabCut tongue-body segmentation method based on deep learning as claimed in claim 1 or 2, characterized in that: the GrabCut algorithm for segmenting the tongue image comprises the following steps:
Step1: Given the initial foreground/background bounding box, the foreground and background information is marked; GrabCut models the foreground data and the background data separately with a Gaussian Mixture Model (GMM); the GMMs learn and create new pixel distributions, so that pixels whose class is unknown can be classified according to their relationship to pixels of known class;
Step2: From Step1, a graph is created over the pixel distribution: the nodes of the graph are the pixels, plus two additional terminal nodes, Source_node and Sink_node; every foreground pixel is connected to Source_node and every background pixel to Sink_node; the weight of the edge connecting a pixel to Source_node/Sink_node is determined by the probability that they belong to the same class (both foreground or both background); the weight between two pixels is determined by the edge information, i.e. the similarity of the two pixels: if the colors of two pixels differ greatly, the weight of the edge between them is very small;
Step3: The graph obtained above is partitioned with the mincut algorithm, which divides it into a Source_node part and a Sink_node part according to a minimum-cost equation, the cost being the sum of the weights of all cut edges; after the cut, all pixels connected to Source_node are regarded as foreground and all pixels connected to Sink_node as background;
Step4: This process is repeated until the classification converges, completing the segmentation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711133796.8A CN108109160A (en) | 2017-11-16 | 2017-11-16 | It is a kind of that interactive GrabCut tongue bodies dividing method is exempted from based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108109160A true CN108109160A (en) | 2018-06-01 |
Family
ID=62207321
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711133796.8A Pending CN108109160A (en) | 2017-11-16 | 2017-11-16 | It is a kind of that interactive GrabCut tongue bodies dividing method is exempted from based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108109160A (en) |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109376756A (en) * | 2018-09-04 | 2019-02-22 | 青岛大学附属医院 | Upper abdomen metastatic lymph node section automatic recognition system, computer equipment, storage medium based on deep learning |
CN109410168A (en) * | 2018-08-31 | 2019-03-01 | 清华大学 | For determining the modeling method of the convolutional neural networks model of the classification of the subgraph block in image |
CN109584251A (en) * | 2018-12-06 | 2019-04-05 | 湘潭大学 | A kind of tongue body image partition method based on single goal region segmentation |
CN109766877A (en) * | 2019-03-12 | 2019-05-17 | 北京羽医甘蓝信息技术有限公司 | The method and apparatus of whole scenery piece artificial tooth body identification based on deep learning |
CN110210319A (en) * | 2019-05-07 | 2019-09-06 | 平安科技(深圳)有限公司 | Computer equipment, tongue body photo constitution identification device and storage medium |
CN110599497A (en) * | 2019-07-31 | 2019-12-20 | 中国地质大学(武汉) | Drivable region segmentation method based on deep neural network |
CN110674807A (en) * | 2019-08-06 | 2020-01-10 | 中国科学院信息工程研究所 | Curved scene character detection method based on semi-supervised and weakly supervised learning |
CN110729045A (en) * | 2019-10-12 | 2020-01-24 | 闽江学院 | Tongue image segmentation method based on context-aware residual error network |
CN110738223A (en) * | 2018-07-18 | 2020-01-31 | 郑州宇通客车股份有限公司 | Point cloud data clustering method and device for laser radars |
WO2020038462A1 (en) * | 2018-08-24 | 2020-02-27 | 深圳市前海安测信息技术有限公司 | Tongue segmentation device and method employing deep learning, and storage medium |
CN110956225A (en) * | 2020-02-25 | 2020-04-03 | 浙江啄云智能科技有限公司 | Contraband detection method and system, computing device and storage medium |
WO2020108436A1 (en) * | 2018-11-26 | 2020-06-04 | 深圳市前海安测信息技术有限公司 | Tongue surface image segmentation device and method, and computer storage medium |
CN111462132A (en) * | 2020-03-20 | 2020-07-28 | 西北大学 | Video object segmentation method and system based on deep learning |
CN111488871A (en) * | 2019-01-25 | 2020-08-04 | 斯特拉德视觉公司 | Method and apparatus for switchable mode R-CNN based monitoring |
CN111818449A (en) * | 2020-06-15 | 2020-10-23 | 华南师范大学 | Visible light indoor positioning method based on improved artificial neural network |
CN112508968A (en) * | 2020-12-10 | 2021-03-16 | 马鞍山市瀚海云星科技有限责任公司 | Image segmentation method, device, system and storage medium |
CN113569855A (en) * | 2021-07-07 | 2021-10-29 | 江汉大学 | Tongue picture segmentation method, equipment and storage medium |
CN114511567A (en) * | 2022-04-20 | 2022-05-17 | 天中依脉(天津)智能科技有限公司 | Tongue body and tongue coating image identification and separation method |
CN114627136A (en) * | 2022-01-28 | 2022-06-14 | 河南科技大学 | Tongue picture segmentation and alignment method based on feature pyramid network |
WO2022252565A1 (en) * | 2021-06-04 | 2022-12-08 | 浙江智慧视频安防创新中心有限公司 | Target detection system, method and apparatus, and device and medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110210915A1 (en) * | 2009-05-01 | 2011-09-01 | Microsoft Corporation | Human Body Pose Estimation |
CN104021566A (en) * | 2014-06-24 | 2014-09-03 | 天津大学 | GrabCut algorithm-based automatic segmentation method of tongue diagnosis image |
CN106295139A (en) * | 2016-07-29 | 2017-01-04 | 姹ゅ钩 | A kind of tongue body autodiagnosis health cloud service system based on degree of depth convolutional neural networks |
Non-Patent Citations (4)
Title |
---|
CARSTEN ROTHER 等: ""GrabCut" -Interactive Foreground Extraction using Iterated Graph Cuts", 《SIGGRAPH "04 ACM SIGGRAPH 2004 PAPERS》 * |
ROSS GIRSHICK: "Fast R-CNN", 《THE IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)》 * |
SHAOQING REN 等: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 * |
杜玉龙 et al.: "Saliency Detection Based on Deep Cross CNN and Interaction-Free GrabCut (基于深度交叉CNN和免交互GrabCut的显著性检测)" * |
Cited By (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110738223A (en) * | 2018-07-18 | 2020-01-31 | 郑州宇通客车股份有限公司 | Point cloud data clustering method and device for laser radars |
CN110738223B (en) * | 2018-07-18 | 2022-04-08 | 宇通客车股份有限公司 | Point cloud data clustering method and device of laser radar |
WO2020038462A1 (en) * | 2018-08-24 | 2020-02-27 | 深圳市前海安测信息技术有限公司 | Tongue segmentation device and method employing deep learning, and storage medium |
CN109410168A (en) * | 2018-08-31 | 2019-03-01 | 清华大学 | For determining the modeling method of the convolutional neural networks model of the classification of the subgraph block in image |
CN109410168B (en) * | 2018-08-31 | 2021-11-16 | 清华大学 | Modeling method of convolutional neural network for determining sub-tile classes in an image |
CN109376756A (en) * | 2018-09-04 | 2019-02-22 | 青岛大学附属医院 | Upper abdomen metastatic lymph node section automatic recognition system, computer equipment, storage medium based on deep learning |
WO2020108436A1 (en) * | 2018-11-26 | 2020-06-04 | 深圳市前海安测信息技术有限公司 | Tongue surface image segmentation device and method, and computer storage medium |
CN109584251A (en) * | 2018-12-06 | 2019-04-05 | 湘潭大学 | A kind of tongue body image partition method based on single goal region segmentation |
CN111488871B (en) * | 2019-01-25 | 2023-08-04 | 斯特拉德视觉公司 | Method and apparatus for R-CNN based monitoring of switchable modes |
CN111488871A (en) * | 2019-01-25 | 2020-08-04 | 斯特拉德视觉公司 | Method and apparatus for switchable mode R-CNN based monitoring |
CN109766877A (en) * | 2019-03-12 | 2019-05-17 | 北京羽医甘蓝信息技术有限公司 | The method and apparatus of whole scenery piece artificial tooth body identification based on deep learning |
CN110210319A (en) * | 2019-05-07 | 2019-09-06 | 平安科技(深圳)有限公司 | Computer equipment, tongue body photo constitution identification device and storage medium |
CN110599497A (en) * | 2019-07-31 | 2019-12-20 | 中国地质大学(武汉) | Drivable region segmentation method based on deep neural network |
CN110674807A (en) * | 2019-08-06 | 2020-01-10 | 中国科学院信息工程研究所 | Curved scene character detection method based on semi-supervised and weakly supervised learning |
CN110729045A (en) * | 2019-10-12 | 2020-01-24 | 闽江学院 | Tongue image segmentation method based on context-aware residual error network |
CN110956225A (en) * | 2020-02-25 | 2020-04-03 | 浙江啄云智能科技有限公司 | Contraband detection method and system, computing device and storage medium |
CN110956225B (en) * | 2020-02-25 | 2020-05-29 | 浙江啄云智能科技有限公司 | Contraband detection method and system, computing device and storage medium |
CN111462132A (en) * | 2020-03-20 | 2020-07-28 | 西北大学 | Video object segmentation method and system based on deep learning |
CN111818449A (en) * | 2020-06-15 | 2020-10-23 | 华南师范大学 | Visible light indoor positioning method based on improved artificial neural network |
CN111818449B (en) * | 2020-06-15 | 2022-04-15 | 华南师范大学 | Visible light indoor positioning method based on improved artificial neural network |
CN112508968A (en) * | 2020-12-10 | 2021-03-16 | 马鞍山市瀚海云星科技有限责任公司 | Image segmentation method, device, system and storage medium |
WO2022252565A1 (en) * | 2021-06-04 | 2022-12-08 | 浙江智慧视频安防创新中心有限公司 | Target detection system, method and apparatus, and device and medium |
CN113569855A (en) * | 2021-07-07 | 2021-10-29 | 江汉大学 | Tongue picture segmentation method, equipment and storage medium |
CN114627136A (en) * | 2022-01-28 | 2022-06-14 | 河南科技大学 | Tongue picture segmentation and alignment method based on feature pyramid network |
CN114627136B (en) * | 2022-01-28 | 2024-02-27 | 河南科技大学 | Tongue image segmentation and alignment method based on feature pyramid network |
CN114511567B (en) * | 2022-04-20 | 2022-08-05 | 天中依脉(天津)智能科技有限公司 | Tongue body and tongue coating image identification and separation method |
CN114511567A (en) * | 2022-04-20 | 2022-05-17 | 天中依脉(天津)智能科技有限公司 | Tongue body and tongue coating image identification and separation method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108109160A (en) | It is a kind of that interactive GrabCut tongue bodies dividing method is exempted from based on deep learning | |
CN107977671A (en) | A kind of tongue picture sorting technique based on multitask convolutional neural networks | |
CN107316307B (en) | Automatic segmentation method of traditional Chinese medicine tongue image based on deep convolutional neural network | |
EP3614308B1 (en) | Joint deep learning for land cover and land use classification | |
CN104281853B (en) | A kind of Activity recognition method based on 3D convolutional neural networks | |
CN109344736B (en) | Static image crowd counting method based on joint learning | |
CN104992223B (en) | Intensive population estimation method based on deep learning | |
CN107610087B (en) | Tongue coating automatic segmentation method based on deep learning | |
CN110766051A (en) | Lung nodule morphological classification method based on neural network | |
CN110532900A (en) | Facial expression recognizing method based on U-Net and LS-CNN | |
CN109166100A (en) | Multi-task learning method for cell count based on convolutional neural networks | |
CN107909566A (en) | A kind of image-recognizing method of the cutaneum carcinoma melanoma based on deep learning | |
CN107341506A (en) | A kind of Image emotional semantic classification method based on the expression of many-sided deep learning | |
CN107945153A (en) | A kind of road surface crack detection method based on deep learning | |
CN107316294A (en) | One kind is based on improved depth Boltzmann machine Lung neoplasm feature extraction and good pernicious sorting technique | |
CN106682569A (en) | Fast traffic signboard recognition method based on convolution neural network | |
CN109977955A (en) | A kind of precancerous lesions of uterine cervix knowledge method for distinguishing based on deep learning | |
CN107622233A (en) | A kind of Table recognition method, identifying system and computer installation | |
CN106408030A (en) | SAR image classification method based on middle lamella semantic attribute and convolution neural network | |
CN105740915B (en) | A kind of collaboration dividing method merging perception information | |
CN108717693A (en) | A kind of optic disk localization method based on RPN | |
CN107506793A (en) | Clothes recognition methods and system based on weak mark image | |
CN106778852A (en) | A kind of picture material recognition methods for correcting erroneous judgement | |
CN106340016A (en) | DNA quantitative analysis method based on cell microscope image | |
CN101556650A (en) | Distributed self-adapting pulmonary nodule computer detection method and system thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20180601 |