CN105184303A - Image marking method based on multi-mode deep learning - Google Patents


Info

Publication number
CN105184303A
CN105184303A (application CN201510198325.XA)
Authority
CN
China
Prior art keywords
image
mark
alpha
layer
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510198325.XA
Other languages
Chinese (zh)
Other versions
CN105184303B (en)
Inventor
朱松豪
孙成建
师哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University filed Critical Nanjing Post and Telecommunication University
Priority to CN201510198325.XA priority Critical patent/CN105184303B/en
Publication of CN105184303A publication Critical patent/CN105184303A/en
Application granted granted Critical
Publication of CN105184303B publication Critical patent/CN105184303B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention discloses an image annotation method based on multi-modal deep learning. The method comprises the following steps: first, a deep neural network is pre-trained using unlabeled images; second, each single modality is optimized using back-propagation; finally, the weights between different modalities are optimized using an online exponentiated gradient algorithm. The method employs convolutional neural network techniques to optimize the parameters of the deep neural network and improves annotation accuracy. Experiments on public data sets show that the method effectively improves image annotation performance.

Description

An image annotation method based on multi-modal deep learning
Technical field
The present invention relates to image annotation methods, and in particular to an image annotation method based on multi-modal deep learning, belonging to the technical field of image processing.
Background
In recent years, with the rapid growth in the number of images, efficient annotation of image content is urgently needed to enable effective retrieval and management of large-scale image collections.
From a pattern-recognition perspective, image annotation assigns a set of labels to an image according to its content, and how suitable features are chosen to characterize the image content largely determines annotation performance. Owing to the well-known semantic-gap problem, prior-art semantic image annotation rarely achieves satisfactory results. In recent years, Hinton and others proposed using deep neural networks to learn effective features from training data. Different types of deep neural networks have been successfully applied to various language and information-retrieval tasks. Through their deep architectures, these methods discover hidden data structure and representative features in the training data and thereby improve system performance.
Summary of the invention
The object of the invention there are provided a kind of image labeling method learnt based on the multi-modal degree of depth, and the method is applied to convolutional neural networks technology, optimizes deep-neural-network parameter, improves mark precision.The method is summed up on the basis of single mode study, realize multi-modal study, wherein both comprise the low-level image feature of research token image, as color, shape or texture etc., also similarity function between dimensioned plan picture and mark is comprised, as linear similarity, cosine similarity and radial distance etc.
The present invention solves the technical scheme that its technical matters takes: the invention provides a kind of image labeling method learnt based on the multi-modal degree of depth, the method comprises the following steps:
Step 1: utilize the image pattern collection without label, the node weights of pre-training deep neural network.
Step 2: adopt back-propagation algorithm, optimize the weight of each single mode.
Step 3: the power gradient algorithm adopting on-line study, optimizes the weight between modality combinations.
The deep neural network of step 1 is an eight-layer convolutional neural network, in which the first five layers are convolutional layers and the remaining three are fully connected layers. The output of the fully connected layers feeds a Softmax classifier, which produces 1000 label classes; both the pre-training and the fine-tuning stages use a multinomial logistic regression objective.
Among the convolutional layers, the first, second and fifth layers are followed by normalization layers; to preserve invariance, all normalization layers use max pooling. In addition, all convolutional and fully connected layers use rectified linear units (ReLU) as the nonlinear activation function.
In the convolutional network used, all input images are resized to 256 × 256. The first two convolutional filters are set to 7 × 7 and 5 × 5 respectively, with a stride of 2; filters of this size capture information across all frequency bands, and the small stride avoids producing "dead features" that would harm the next layer. The remaining three convolutional layers are connected in sequence, with 3 × 3 filters and a stride of 1. Finally, each fully connected layer has an output size of 4096. During pre-training, the dropout rate of the first two fully connected layers is set to 0.6.
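As a rough arithmetic check of the layer dimensions above, the spatial sizes produced by the stated filter sizes and strides can be sketched as follows. This is an illustration only: it ignores the pooling and normalization layers and assumes zero padding, neither of which the description specifies.

```python
def conv_out(size, kernel, stride, pad=0):
    # Standard convolution arithmetic: floor((size + 2*pad - kernel) / stride) + 1
    return (size + 2 * pad - kernel) // stride + 1

s = 256                  # unified input size, 256 x 256
s = conv_out(s, 7, 2)    # first convolutional layer: 7 x 7 filter, stride 2
s = conv_out(s, 5, 2)    # second convolutional layer: 5 x 5 filter, stride 2
for _ in range(3):       # remaining three convolutional layers: 3 x 3, stride 1
    s = conv_out(s, 3, 1)
print(s)                 # spatial size entering the 4096-wide fully connected layers
```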
The back-propagation optimization of each single modality in step 2 comprises:
1. Single-modality pre-training:
The convolutional neural network is pre-trained on an unlabeled training set to obtain intermediate representations of image objects and, at the same time, to initialize the network. The detailed process is as follows. First, contrastive divergence is used to train the node weights W_1 between the input layer and the first convolutional layer. Then, the conditional probability of the first convolutional layer's nodes is used as the input of the second convolutional layer:
p(Γ | x_j) = S(W_1, x_j)    (1)
where x_j is the j-th feature vector, Γ is the annotation information, and S(·) is one of the similarity functions below:
S(W_1, x_j) = W_1^T x_j / (||W_1|| ||x_j||)    (cosine function)
S(W_1, x_j) = W_1^T x_j    (linear function)
S(W_1, x_j) = exp(-||W_1 - x_j||^2 / (2σ))    (RBF function)    (2)
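The three similarity functions of equation (2) can be sketched in plain Python as follows. This is an illustration: the function and variable names are ours, and σ = 2 is taken from the experimental section later in the description.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine_sim(w, x):
    # Cosine function: W1^T xj / (||W1|| ||xj||)
    return dot(w, x) / (math.sqrt(dot(w, w)) * math.sqrt(dot(x, x)))

def linear_sim(w, x):
    # Linear function: W1^T xj
    return dot(w, x)

def rbf_sim(w, x, sigma=2.0):
    # RBF function: exp(-||W1 - xj||^2 / (2*sigma)); sigma = 2 in the experiments
    d2 = sum((a - b) ** 2 for a, b in zip(w, x))
    return math.exp(-d2 / (2.0 * sigma))

w, x = [3.0, 4.0], [3.0, 4.0]
print(cosine_sim(w, x), linear_sim(w, x), rbf_sim(w, x))  # identical vectors
```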
Next, the first and second convolutional layers are combined to train the node weights W_2; the same method is used to train the node weights of the remaining three convolutional layers and the three fully connected layers.
2. Single-modality fine-tuning stage:
In the single-modality fine-tuning stage, back-propagation of the annotation error is used to optimize the node weights. From a pattern-recognition perspective, multi-label learning can be regarded as multi-task learning; therefore, the overall annotation error of the convolutional neural network can be regarded as the sum of the per-label errors. The node optimization process is described below for the l-th label error.
First, for an image x, represented as x_j under the j-th feature modality, the probability that it carries the l-th label Γ_l can be expressed by the posterior probability:
p_jl = exp(p(Γ_l | x_j)) / Σ_{k=1}^{L} p(Γ_k | x_j)    (3)
where L denotes the number of labels.
Then, the KL divergence between the predicted probabilities and the reference probabilities is minimized. Assuming each image has multiple labels, represented by a vector y ∈ R^{1×c}, y_l = 1 indicates that the label set of image x contains the l-th label, and y_l = 0 indicates that it does not. Let q_il denote the ground-truth probability that image x_i carries label l; the error of correctly assigning the l-th label to the images is then:
J_l = -Σ_{i=1}^{M} Σ_{l=1}^{L} [ q_il log(p_il) + (1 - q_il) log(1 - p_il) ]    (4)
The distribution error over all labels is:
J = Σ_{l=1}^{L} J_l    (5)
Finally, back-propagation is used to update, in turn, the node weights of the other two fully connected layers and the five convolutional layers.
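The annotation error of equations (4) and (5) can be sketched as follows. This is a hedged illustration that computes the total error J directly; q and p are small hand-made arrays, and the eps guard against log(0) is our addition.

```python
import math

def annotation_error(q, p, eps=1e-12):
    # Equations (4)-(5): total cross-entropy between ground-truth label
    # indicators q[i][l] and predicted probabilities p[i][l], summed over
    # all images i and labels l. eps guards against log(0).
    J = 0.0
    for qi, pi in zip(q, p):
        for qil, pil in zip(qi, pi):
            J -= qil * math.log(pil + eps) + (1.0 - qil) * math.log(1.0 - pil + eps)
    return J

q = [[1, 0], [0, 1]]          # two images, two labels (ground truth)
p = [[0.9, 0.1], [0.2, 0.8]]  # predicted probabilities
print(annotation_error(q, p))
```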
The exponentiated gradient optimization of the weights between different modalities in step 3 comprises:
For the multi-modal deep network, another key task is to learn the optimal combination weights of the N modalities, α = (α_1, α_2, ..., α_n, ..., α_N), where each α_n is initialized to 1/N. The invention adopts an online exponentiated gradient algorithm to optimize the multi-modal weight combination:
α_{t+1} = argmin_α KL(α || α_t) + μ h_t(α)    (6)
where KL(·) denotes the KL divergence and h(α) the hinge loss:
D_KL(u || v) = Σ_i u_i ln(u_i / v_i),    h_t(α) = max(0, ψ - α^T S_t)    (7)
where S_t is:
S_t = (S_1(x, Γ+) - S_1(x, Γ-), ..., S_N(x, Γ+) - S_N(x, Γ-))^T    (8)
where label Γ+ is assumed to reflect the image content better than label Γ-.
A first-order Taylor expansion of h(α) at α_t simplifies the optimization problem, so equation (6) can be written in first-order Taylor form:
α_{t+1} = argmin_α KL(α || α_t) + μ [ h_t(α_t) + ∇h_t(α_t)(α - α_t) ]    (9)
If Γ+ and Γ- are not ranked in the correct order, the node weights α are updated automatically.
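One step of the KL-regularized update of equations (6) to (9) can be sketched as follows. Minimizing the linearized objective (9) under the KL penalty yields the familiar multiplicative (exponentiated gradient) update; the renormalization onto the simplex and all numeric values are our assumptions, not taken from the description.

```python
import math

def eg_update(alpha, S, mu=0.1, psi=1.0):
    # One exponentiated-gradient step for equations (6)-(9). S[n] is the
    # per-modality score difference S_n(x, G+) - S_n(x, G-). When the hinge
    # loss max(0, psi - alpha^T S) is active (G+ not ranked far enough above
    # G-), each weight is scaled by exp(mu * S_n) and the weights are
    # renormalized; otherwise alpha is left unchanged.
    margin = sum(a * s for a, s in zip(alpha, S))
    if psi - margin <= 0:   # ranking already correct: no update
        return list(alpha)
    new = [a * math.exp(mu * s) for a, s in zip(alpha, S)]
    z = sum(new)
    return [a / z for a in new]

N = 3
alpha = [1.0 / N] * N                        # alpha_n initialized to 1/N
alpha = eg_update(alpha, [0.5, -0.2, 0.1])   # hypothetical score differences
print(alpha)
```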
Beneficial effects:
1. The invention optimizes the parameters of the deep neural network and improves annotation accuracy.
2. The invention better realizes effective image annotation based on a deep neural network learning model.
3. The invention effectively improves image annotation performance.
Brief description of the drawings
Fig. 1 is the flow chart of the method of the present invention.
Fig. 2 is the deep neural network model of the present invention.
Fig. 3 shows example images from the natural scene image library.
Fig. 4 shows images from the NUS-WIDE image library.
Fig. 5 shows example images from the IAPRTC-12 image database.
Fig. 6 shows the resulting modality combination weights on the three public image libraries.
Detailed description of embodiments
The invention is described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, the invention provides an image annotation method based on multi-modal deep learning. The method comprises: first, training a deep neural network with unlabeled images; second, optimizing each single modality with back-propagation; finally, optimizing the weights between different modalities with an online exponentiated gradient algorithm.
The deep neural network in the present invention is a convolutional neural network whose model structure is shown in Fig. 2. A series of experiments evaluates the performance of the proposed multi-modal deep learning image annotation algorithm.
Step 1: the data sets used to evaluate algorithm performance.
The experiments use three public image data sets: the natural scene image library shown in Fig. 3, the NUS-WIDE image library shown in Fig. 4, and the IAPRTC-12 image library shown in Fig. 5. The three image libraries are described below:
The natural scene image library contains 2000 images, all annotated with the following 5 labels: desert, mountain, sea, sunset and trees. More than 20% of the images carry more than one label, and the average number of labels per image is 1.3. Fig. 3 shows two example images from the natural scene image library: Fig. 3(a) is labeled sunset and sea, and Fig. 3(b) is labeled mountain and trees.
The NUS-WIDE image library contains 30,000 images annotated with 31 labels, including boat, car, flag, horse, sky, sun, tower, airplane and zebra. Fig. 4 shows two images from the NUS-WIDE image library: the labels of Fig. 4(a) include sky and airplane, and the labels of Fig. 4(b) include sea and sunset.
The IAPRTC-12 image database contains 20,000 images and 291 labels, with an average of 5.7 labels per image. Fig. 5 shows two example images from the IAPRTC-12 image database: the labels of Fig. 5(a) include brown, face, hair, man and woman, and the labels of Fig. 5(b) include boat, lake, sky and trees.
Step 2: the visual features characterizing the images and the learned optimal parameters.
Feature selection has a large impact on system performance. The invention chooses the following global and local feature descriptors to characterize the images:
Global features: (1) a 128-dimensional HSV color histogram and 225-dimensional LAB color moments; (2) a 37-dimensional edge orientation histogram; (3) a 36-dimensional pyramid wavelet texture; (4) a 59-dimensional local binary pattern descriptor; (5) a 960-dimensional GIST descriptor.
Local features: local texture features are extracted with two different sampling methods and three different local descriptors, as follows. First, dense sampling and Harris corner detection are performed. Then, SIFT, C-SIFT and RGB-SIFT features are extracted, and a 1000-entry codebook is built by k-means clustering. Next, a second-order spatial pyramid is used to build a 5000-dimensional vector for each image. Finally, TF-IDF weighting is used to generate the final visual bag of words. Throughout the experiments, all feature vectors are normalized to the range [0, 1].
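The last two stages of this pipeline, TF-IDF weighting of the visual-word histograms and normalization to [0, 1], can be sketched as follows. This is an illustration; the exact TF and IDF variants used are not specified in the description, so both are assumptions.

```python
import math

def tfidf(counts):
    # TF-IDF weighting of per-image visual-word count vectors (bag of
    # visual words). TF is the within-image frequency; IDF uses a smoothed
    # document frequency. Both variants are assumptions.
    n_docs, n_words = len(counts), len(counts[0])
    df = [sum(1 for doc in counts if doc[w] > 0) for w in range(n_words)]
    out = []
    for doc in counts:
        total = sum(doc) or 1
        out.append([(c / total) * math.log(n_docs / (1.0 + df[w]))
                    for w, c in enumerate(doc)])
    return out

def minmax_01(v):
    # Normalize a feature vector to the range [0, 1], as in the experiments.
    lo, hi = min(v), max(v)
    return [0.0 for _ in v] if hi == lo else [(x - lo) / (hi - lo) for x in v]

print(minmax_01([2, 4, 6]))
```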
For each query-label pair, the three similarity measures given in formula (2) above are used, and the parameter μ is selected by cross-validation. After cross-validation, μ = 0.18 for the cosine similarity measure; μ = 1 for the linear similarity measure; and σ = 2, μ = 0.18 for the RBF similarity measure.
Step 3: testing the performance of the proposed algorithm through comparative experiments.
Algorithm comparison
The comparative experiments are carried out among the following three image annotation methods:
Lazy-learning baseline: first, for each test image, find the K most similar images in the training image library; then, aggregate the label statistics of those K most similar images; finally, assign labels to the test image according to the maximum a posteriori probability.
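This lazy-learning (K-nearest-neighbor) baseline can be sketched as follows. The sketch uses hypothetical data; Euclidean distance and simple label counting stand in for whatever similarity measure and aggregation the compared method actually uses.

```python
import math
from collections import Counter

def knn_annotate(test_feat, train_feats, train_labels, k=3, n_out=2):
    # Find the K most similar training images (Euclidean distance here, an
    # assumption), pool their labels, and assign the most frequent ones.
    ranked = sorted(
        (math.dist(test_feat, f), labels)
        for f, labels in zip(train_feats, train_labels)
    )
    pool = Counter()
    for _, labels in ranked[:k]:
        pool.update(labels)
    return [label for label, _ in pool.most_common(n_out)]

train_feats = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]]          # toy features
train_labels = [["sea"], ["sea", "sunset"], ["mountain"]]   # toy labels
print(knn_annotate([0.02, 0.0], train_feats, train_labels, k=2))
```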
Deep representation and coding baseline: a hierarchical model is used to learn pixel-level image representations for image annotation.
The proposed method: image annotation via the deep neural network described above.
Modality weights
In the proposed method, the combination weights α of the different modalities have a large impact on system performance. Fig. 6 shows the resulting modality combination weights on the three public image libraries: Fig. 6(a) on the natural scene image library; Fig. 6(b) on the NUS-WIDE image library; Fig. 6(c) on the IAPRTC-12 image library.
As the results in Fig. 6 show, the proportions of the different modalities do not differ significantly. This means that every modality contributes, more or less, to classifying different images, mainly because the three image libraries contain natural scene images of many different classes; it also further demonstrates the importance of obtaining the optimal combination of the different modalities.
Performance comparison
Table 1 gives the experimental comparison results of the several image annotation techniques described above.
Table 1: Experimental comparison results.
As the results in Table 1 show, the NDCG@w performance of the proposed method is better than that of the other two existing methods, which verifies the effectiveness of image annotation based on a deep neural network learning model.

Claims (5)

1. An image annotation method based on multi-modal deep learning, characterized in that the method comprises the following steps:
Step 1: pre-train the node weights of a deep neural network using an unlabeled image sample set;
Step 2: optimize the weights of each single modality using the back-propagation algorithm;
Step 3: optimize the weights of the modality combination using an online exponentiated gradient algorithm.
2. The image annotation method based on multi-modal deep learning according to claim 1, characterized in that the deep neural network of step 1 is an eight-layer convolutional neural network, in which the first five layers are convolutional layers and the remaining three are fully connected layers; the output of the fully connected layers feeds a Softmax classifier that produces 1000 label classes; both the pre-training and the fine-tuning stages use a multinomial logistic regression objective;
the first, second and fifth convolutional layers are followed by normalization layers; to preserve invariance, all normalization layers use max pooling; all convolutional and fully connected layers use rectified linear units (ReLU) as the nonlinear activation function;
in the convolutional network used, all input images are resized to 256 × 256; the first two convolutional filters are set to 7 × 7 and 5 × 5 respectively, with a stride of 2, where filters of this size capture information across all frequency bands and the small stride avoids producing "dead features" that would harm the next layer; the remaining three convolutional layers are connected in sequence, with 3 × 3 filters and a stride of 1; finally, each fully connected layer has an output size of 4096, and during pre-training the dropout rate of the first two fully connected layers is set to 0.6.
3. The image annotation method based on multi-modal deep learning according to claim 1, characterized in that the back-propagation algorithm of step 2 comprises:
1. Single-modality pre-training:
The convolutional neural network is pre-trained on an unlabeled training set to obtain intermediate representations of image objects and, at the same time, to initialize the network, comprising: first, using contrastive divergence to train the node weights W_1 between the input layer and the first convolutional layer; then, using the conditional probability of the first convolutional layer's nodes as the input of the second convolutional layer:
p(Γ | x_j) = S(W_1, x_j)    (1)
where x_j is the j-th feature vector, Γ is the annotation information, and S(·) is one of the similarity functions below:
S(W_1, x_j) = W_1^T x_j / (||W_1|| ||x_j||)    (cosine function)
S(W_1, x_j) = W_1^T x_j    (linear function)
S(W_1, x_j) = exp(-||W_1 - x_j||^2 / (2σ))    (RBF function)    (2)
Next, the first and second convolutional layers are combined to train the node weights W_2; the same method is used to train the node weights of the remaining three convolutional layers and the three fully connected layers;
2. Single-modality fine-tuning stage:
In the single-modality fine-tuning stage, back-propagation of the annotation error is used to optimize the node weights; from a pattern-recognition perspective, multi-label learning can be regarded as multi-task learning; the overall annotation error of the convolutional neural network can be regarded as the sum of the per-label errors, and the node optimization process is described with the l-th label error, comprising:
First, for an image x, represented as x_j under the j-th feature modality, the probability that it carries the l-th label Γ_l can be expressed by the posterior probability:
p_jl = exp(p(Γ_l | x_j)) / Σ_{k=1}^{L} p(Γ_k | x_j)    (3)
where L denotes the number of labels;
Then, the KL divergence between the predicted probabilities and the reference probabilities is minimized; assuming each image has multiple labels, represented by a vector y ∈ R^{1×c}, y_l = 1 indicates that the label set of image x contains the l-th label and y_l = 0 indicates that it does not; let q_il denote the ground-truth probability that image x_i carries label l; the error of correctly assigning the l-th label to the images is then:
J_l = -Σ_{i=1}^{M} Σ_{l=1}^{L} [ q_il log(p_il) + (1 - q_il) log(1 - p_il) ]    (4)
The distribution error over all labels is:
J = Σ_{l=1}^{L} J_l    (5)
Finally, back-propagation is used to update, in turn, the node weights of the other two fully connected layers and the five convolutional layers.
4. The image annotation method based on multi-modal deep learning according to claim 1, characterized in that the exponentiated gradient optimization of the weights between different modalities in step 3 comprises:
For the multi-modal deep network, another key task is to learn the optimal combination weights of the N modalities, α = (α_1, α_2, ..., α_n, ..., α_N), where each α_n is initialized to 1/N; the online exponentiated gradient algorithm is adopted to optimize the multi-modal weight combination, comprising:
α_{t+1} = argmin_α KL(α || α_t) + μ h_t(α)    (6)
where KL(·) denotes the KL divergence and h(α) the hinge loss:
D_KL(u || v) = Σ_i u_i ln(u_i / v_i),    h_t(α) = max(0, ψ - α^T S_t)    (7)
where S_t is:
S_t = (S_1(x, Γ+) - S_1(x, Γ-), ..., S_N(x, Γ+) - S_N(x, Γ-))^T    (8)
where label Γ+ is assumed to reflect the image content better than label Γ-;
A first-order Taylor expansion of h(α) at α_t simplifies the optimization problem, so equation (6) can be written in first-order Taylor form:
α_{t+1} = argmin_α KL(α || α_t) + μ [ h_t(α_t) + ∇h_t(α_t)(α - α_t) ]    (9)
If Γ+ and Γ- are not ranked in the correct order, the node weights α are updated automatically.
5. The image annotation method based on multi-modal deep learning according to claim 1, characterized in that the method is applied to convolutional neural networks.
CN201510198325.XA 2015-04-23 2015-04-23 An image annotation method based on multi-modal deep learning Active CN105184303B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510198325.XA CN105184303B (en) 2015-04-23 2015-04-23 An image annotation method based on multi-modal deep learning


Publications (2)

Publication Number Publication Date
CN105184303A true CN105184303A (en) 2015-12-23
CN105184303B CN105184303B (en) 2019-08-09

Family

ID=54906369

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510198325.XA Active CN105184303B (en) 2015-04-23 2015-04-23 An image annotation method based on multi-modal deep learning

Country Status (1)

Country Link
CN (1) CN105184303B (en)

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105654942A (en) * 2016-01-04 2016-06-08 北京时代瑞朗科技有限公司 Speech synthesis method of interrogative sentence and exclamatory sentence based on statistical parameter
CN105678340A (en) * 2016-01-20 2016-06-15 福州大学 Automatic image marking method based on enhanced stack type automatic encoder
CN105760859A (en) * 2016-03-22 2016-07-13 中国科学院自动化研究所 Method and device for identifying reticulate pattern face image based on multi-task convolutional neural network
CN105894012A (en) * 2016-03-29 2016-08-24 天津大学 Object identification method based on cascade micro neural network
CN105930877A (en) * 2016-05-31 2016-09-07 上海海洋大学 Multimodal depth learning-based remote sensing image classification method
CN106056602A (en) * 2016-05-27 2016-10-26 中国人民解放军信息工程大学 CNN (convolutional neural network)-based fMRI (functional magnetic resonance imaging) visual function data object extraction method
CN106202338A (en) * 2016-06-30 2016-12-07 合肥工业大学 Image search method based on the many relations of multiple features
CN106682592A (en) * 2016-12-08 2017-05-17 北京泛化智能科技有限公司 Automatic image recognition system and method based on neural network method
CN106845427A (en) * 2017-01-25 2017-06-13 北京深图智服技术有限公司 A kind of method for detecting human face and device based on deep learning
CN107122800A (en) * 2017-04-27 2017-09-01 南京大学 A kind of Robust digital figure mask method based on the screening that predicts the outcome
CN107273784A (en) * 2016-04-01 2017-10-20 富士施乐株式会社 Image steganalysis apparatus and method
CN108307205A (en) * 2017-12-06 2018-07-20 中国电子科技集团公司电子科学研究院 Merge the recognition methods of video expressive force, terminal and the storage medium of audio visual feature
CN108388768A (en) * 2018-02-08 2018-08-10 南京恺尔生物科技有限公司 Utilize the biological nature prediction technique for the neural network model that biological knowledge is built
CN108960015A (en) * 2017-05-24 2018-12-07 优信拍(北京)信息科技有限公司 A kind of vehicle system automatic identifying method and device based on deep learning
CN109196526A (en) * 2016-06-01 2019-01-11 三菱电机株式会社 For generating the method and system of multi-modal digital picture
CN109543835A (en) * 2018-11-30 2019-03-29 上海寒武纪信息科技有限公司 Operation method, device and Related product
CN109544517A (en) * 2018-11-06 2019-03-29 中山大学附属第医院 Method and system are analysed in multi-modal ultrasound group credit based on deep learning
CN109543833A (en) * 2018-11-30 2019-03-29 上海寒武纪信息科技有限公司 Operation method, device and Related product
CN109583580A (en) * 2018-11-30 2019-04-05 上海寒武纪信息科技有限公司 Operation method, device and Related product
CN109583583A (en) * 2017-09-29 2019-04-05 腾讯科技(深圳)有限公司 Neural network training method, device, computer equipment and readable medium
CN109711464A (en) * 2018-12-25 2019-05-03 中山大学 Image Description Methods based on the building of stratification Attributed Relational Graps
CN109886226A (en) * 2019-02-27 2019-06-14 北京达佳互联信息技术有限公司 Determine method, apparatus, electronic equipment and the storage medium of the characteristic of image
CN110019652A (en) * 2019-03-14 2019-07-16 九江学院 A kind of cross-module state Hash search method based on deep learning
CN111127456A (en) * 2019-12-28 2020-05-08 北京无线电计量测试研究所 Image annotation quality evaluation method
CN111383744A (en) * 2020-06-01 2020-07-07 北京协同创新研究院 Medical microscopic image annotation information processing method and system and image analysis equipment
WO2021046970A1 (en) * 2019-09-11 2021-03-18 山东浪潮人工智能研究院有限公司 Arithmetic coding-based neural network model compression encryption method and system
CN112633394A (en) * 2020-12-29 2021-04-09 厦门市美亚柏科信息股份有限公司 Intelligent user label determination method, terminal equipment and storage medium
CN114170481A (en) * 2022-02-10 2022-03-11 北京字节跳动网络技术有限公司 Method, apparatus, storage medium, and program product for image processing
CN115356363A (en) * 2022-08-01 2022-11-18 河南理工大学 Wide ion beam polishing-scanning electron microscope-based pore structure characterization method
CN116563400A (en) * 2023-07-12 2023-08-08 南通原力云信息技术有限公司 Small program image information compression processing method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120254086A1 (en) * 2011-03-31 2012-10-04 Microsoft Corporation Deep convex network with joint use of nonlinear random projection, restricted boltzmann machine and batch-based parallelizable optimization
CN102902966A (en) * 2012-10-12 2013-01-30 大连理工大学 Super-resolution face recognition method based on deep belief networks
CN103345656A (en) * 2013-07-17 2013-10-09 中国科学院自动化研究所 Method and device for data identification based on multitask deep neural network


Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105654942A (en) * 2016-01-04 2016-06-08 北京时代瑞朗科技有限公司 Speech synthesis method of interrogative sentence and exclamatory sentence based on statistical parameter
CN105678340A (en) * 2016-01-20 2016-06-15 福州大学 Automatic image marking method based on enhanced stack type automatic encoder
CN105678340B (en) * 2016-01-20 2018-12-25 福州大学 A kind of automatic image marking method based on enhanced stack autocoder
CN105760859B (en) * 2016-03-22 2018-12-21 中国科学院自动化研究所 Reticulate pattern facial image recognition method and device based on multitask convolutional neural networks
CN105760859A (en) * 2016-03-22 2016-07-13 中国科学院自动化研究所 Method and device for identifying reticulate pattern face image based on multi-task convolutional neural network
CN105894012A (en) * 2016-03-29 2016-08-24 天津大学 Object identification method based on cascade micro neural network
CN107273784B (en) * 2016-04-01 2022-04-15 富士胶片商业创新有限公司 Image pattern recognition apparatus and method
CN107273784A (en) * 2016-04-01 2017-10-20 富士施乐株式会社 Image steganalysis apparatus and method
CN106056602B (en) * 2016-05-27 2019-06-28 中国人民解放军信息工程大学 FMRI visual performance datum target extracting method based on CNN
CN106056602A (en) * 2016-05-27 2016-10-26 中国人民解放军信息工程大学 CNN (convolutional neural network)-based fMRI (functional magnetic resonance imaging) visual function data object extraction method
CN105930877B (en) * 2016-05-31 2020-07-10 上海海洋大学 Remote sensing image classification method based on multi-mode deep learning
CN105930877A (en) * 2016-05-31 2016-09-07 上海海洋大学 Multimodal depth learning-based remote sensing image classification method
CN109196526B (en) * 2016-06-01 2021-09-28 三菱电机株式会社 Method and system for generating multi-modal digital images
CN109196526A (en) * 2016-06-01 2019-01-11 三菱电机株式会社 For generating the method and system of multi-modal digital picture
CN106202338A (en) * 2016-06-30 2016-12-07 合肥工业大学 Image retrieval method based on multiple features and multiple relations
CN106202338B (en) * 2016-06-30 2019-04-05 合肥工业大学 Image retrieval method based on multiple features and multiple relations
CN106682592B (en) * 2016-12-08 2023-10-27 北京泛化智能科技有限公司 Automatic image recognition system and method based on neural networks
CN106682592A (en) * 2016-12-08 2017-05-17 北京泛化智能科技有限公司 Automatic image recognition system and method based on neural networks
CN106845427A (en) * 2017-01-25 2017-06-13 北京深图智服技术有限公司 Face detection method and device based on deep learning
CN106845427B (en) * 2017-01-25 2019-12-06 北京深图智服技术有限公司 Face detection method and device based on deep learning
CN107122800A (en) * 2017-04-27 2017-09-01 南京大学 Robust digital image labeling method based on prediction result screening
CN107122800B (en) * 2017-04-27 2020-09-18 南京大学 Robust digital image labeling method based on prediction result screening
CN108960015A (en) * 2017-05-24 2018-12-07 优信拍(北京)信息科技有限公司 Automatic vehicle series identification method and device based on deep learning
CN109583583B (en) * 2017-09-29 2023-04-07 腾讯科技(深圳)有限公司 Neural network training method and device, computer equipment and readable medium
CN109583583A (en) * 2017-09-29 2019-04-05 腾讯科技(深圳)有限公司 Neural network training method, device, computer equipment and readable medium
CN108307205A (en) * 2017-12-06 2018-07-20 中国电子科技集团公司电子科学研究院 Video expressiveness recognition method, terminal, and storage medium fusing audio-visual features
CN108388768A (en) * 2018-02-08 2018-08-10 南京恺尔生物科技有限公司 Biological property prediction method using a neural network model built with biological knowledge
CN109544517A (en) * 2018-11-06 2019-03-29 中山大学附属第医院 Multi-modal ultrasound omics analysis method and system based on deep learning
CN109543833A (en) * 2018-11-30 2019-03-29 上海寒武纪信息科技有限公司 Operation method, device and related product
CN109583580A (en) * 2018-11-30 2019-04-05 上海寒武纪信息科技有限公司 Operation method, device and related product
CN109543835A (en) * 2018-11-30 2019-03-29 上海寒武纪信息科技有限公司 Operation method, device and related product
CN109543835B (en) * 2018-11-30 2021-06-25 上海寒武纪信息科技有限公司 Operation method, device and related product
CN109583580B (en) * 2018-11-30 2021-08-03 上海寒武纪信息科技有限公司 Operation method, device and related product
CN109711464A (en) * 2018-12-25 2019-05-03 中山大学 Image description method based on the construction of hierarchical attributed relational graphs
CN109711464B (en) * 2018-12-25 2022-09-27 中山大学 Image description method constructed based on hierarchical feature relationship diagram
CN109886226A (en) * 2019-02-27 2019-06-14 北京达佳互联信息技术有限公司 Method, apparatus, electronic device, and storage medium for determining image feature data
CN110019652B (en) * 2019-03-14 2022-06-03 九江学院 Cross-modal Hash retrieval method based on deep learning
CN110019652A (en) * 2019-03-14 2019-07-16 九江学院 Cross-modal hash retrieval method based on deep learning
WO2021046970A1 (en) * 2019-09-11 2021-03-18 山东浪潮人工智能研究院有限公司 Arithmetic coding-based neural network model compression encryption method and system
CN111127456A (en) * 2019-12-28 2020-05-08 北京无线电计量测试研究所 Image annotation quality evaluation method
CN111383744A (en) * 2020-06-01 2020-07-07 北京协同创新研究院 Medical microscopic image annotation information processing method and system and image analysis equipment
CN112633394A (en) * 2020-12-29 2021-04-09 厦门市美亚柏科信息股份有限公司 Intelligent user label determination method, terminal equipment and storage medium
CN114170481A (en) * 2022-02-10 2022-03-11 北京字节跳动网络技术有限公司 Method, apparatus, storage medium, and program product for image processing
CN115356363B (en) * 2022-08-01 2023-06-20 河南理工大学 Pore structure characterization method based on wide ion beam polishing-scanning electron microscope
CN115356363A (en) * 2022-08-01 2022-11-18 河南理工大学 Wide ion beam polishing-scanning electron microscope-based pore structure characterization method
CN116563400A (en) * 2023-07-12 2023-08-08 南通原力云信息技术有限公司 Small program image information compression processing method
CN116563400B (en) * 2023-07-12 2023-09-05 南通原力云信息技术有限公司 Small program image information compression processing method

Also Published As

Publication number Publication date
CN105184303B (en) 2019-08-09

Similar Documents

Publication Publication Date Title
CN105184303A (en) Image marking method based on multi-mode deep learning
Ševo et al. Convolutional neural network based automatic object detection on aerial images
He et al. Remote sensing scene classification using multilayer stacked covariance pooling
Imbriaco et al. Aggregated deep local features for remote sensing image retrieval
Liu et al. Scene classification via triplet networks
Yu et al. A two-stream deep fusion framework for high-resolution aerial scene classification
Zhang et al. Scene classification via a gradient boosting random convolutional network framework
Avila et al. Pooling in image representation: The visual codeword point of view
He et al. Learning and incorporating top-down cues in image segmentation
EP3029606A2 (en) Method and apparatus for image classification with joint feature adaptation and classifier learning
Cheng et al. Learning coarse-to-fine sparselets for efficient object detection and scene classification
Qayyum et al. Scene classification for aerial images based on CNN using sparse coding technique
Ali et al. A hybrid geometric spatial image representation for scene classification
CN110321967B (en) Image classification improvement method based on convolutional neural network
Zhu et al. Plant identification based on very deep convolutional neural networks
CN104462494B (en) Remote sensing image retrieval method and system based on unsupervised feature learning
Tobías et al. Convolutional Neural Networks for object recognition on mobile devices: A case study
Holder et al. From on-road to off: Transfer learning within a deep convolutional neural network for segmentation and classification of off-road scenes
Al-Haija et al. Multi-class weather classification using ResNet-18 CNN for autonomous IoT and CPS applications
CN105184298A (en) Image classification method based on fast locality-constrained low-rank coding
Ren et al. Ship recognition based on Hu invariant moments and convolutional neural network for video surveillance
Qayyum et al. Designing deep CNN models based on sparse coding for aerial imagery: a deep-features reduction approach
Ghadi et al. Robust object categorization and Scene classification over remote sensing images via features fusion and fully convolutional network
Alzu'Bi et al. Compact root bilinear cnns for content-based image retrieval
Li et al. Image decomposition with multilabel context: Algorithms and applications

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20151223

Assignee: Zhangjiagang Institute of Zhangjiagang

Assignor: Nanjing Post & Telecommunication Univ.

Contract record no.: X2019980001251

Denomination of invention: Image marking method based on multi-mode deep learning

Granted publication date: 20190809

License type: Common License

Record date: 20191224