CN105184303A - Image marking method based on multi-mode deep learning - Google Patents
- Publication number
- CN105184303A CN105184303A CN201510198325.XA CN201510198325A CN105184303A CN 105184303 A CN105184303 A CN 105184303A CN 201510198325 A CN201510198325 A CN 201510198325A CN 105184303 A CN105184303 A CN 105184303A
- Authority
- CN
- China
- Prior art keywords
- image
- mark
- alpha
- layer
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses an image annotation method based on multi-modal deep learning. The method comprises the following steps: first, a deep neural network is pre-trained using unlabeled images; second, each single modality is optimized using backpropagation; finally, the weights among the different modalities are optimized using an online exponentiated-gradient algorithm. The method employs convolutional neural network techniques to optimize the parameters of the deep neural network, raising annotation precision. Experiments on public data sets show that the method effectively improves image annotation performance.
Description
Technical field
The present invention relates to an image annotation method, in particular to an image annotation method based on multi-modal deep learning, and belongs to the technical field of image processing.
Background art
In recent years, with the sharp increase in the number of images, there is an urgent need for efficient annotation of image content, in order to enable effective retrieval and management of large-scale image collections.
From a pattern-recognition point of view, the image annotation problem can be regarded as assigning a group of labels to an image according to its content; how suitable features are chosen to characterize the image content largely determines annotation performance. Owing to the well-known semantic-gap problem, prior-art semantic image annotation rarely achieves satisfactory results. In recent years, Hinton and others have proposed using deep neural networks to learn effective features from training sets. Different types of deep neural networks have been successfully applied to various language and information-retrieval tasks. Through their deep structure, these deep-learning methods discover the hidden data structure and effective representative features in the training data, and thereby improve system performance.
Summary of the invention
The object of the invention is to provide an image annotation method based on multi-modal deep learning that applies convolutional neural network techniques to optimize the parameters of a deep neural network and improve annotation precision. The method builds on single-modality learning to realize multi-modal learning, which covers both the low-level features characterizing the image (such as color, shape, and texture) and the similarity functions between images and annotations (such as linear similarity, cosine similarity, and radial basis distance).
The technical scheme adopted by the present invention to solve its technical problem is as follows: the invention provides an image annotation method based on multi-modal deep learning, comprising the following steps:
Step 1: pre-train the node weights of a deep neural network using an unlabeled image sample set.
Step 2: optimize the weights of each single modality using the back-propagation algorithm.
Step 3: optimize the weights of the modality combination using an online exponentiated-gradient algorithm.
The deep neural network described in step 1 of the present invention is an eight-layer convolutional neural network, in which the first five layers are convolutional layers and the remaining three are fully connected layers. The output of the fully connected layers feeds a Softmax classifier, which produces 1000 annotation classes. Both the pre-training and fine-tuning stages use a multinomial logistic regression objective function.
Among the convolutional layers of the invention described above, the first, second, and fifth layers are followed by normalization layers; to maintain invariance, all normalization layers use max pooling. In addition, rectified linear units (ReLU) are used as the nonlinear activation function in all convolutional and fully connected layers.
In the convolutional neural network used by the present invention, all input images are resized to 256 × 256. The first two convolution filters are set to 7 × 7 and 5 × 5 with a stride of 2; filters of this size capture information in all frequency bands, and the small stride avoids the "dead features" that would harm the next network layer. The last three convolutional layers are then connected in sequence, with 3 × 3 filters and a stride of 1. Finally, each fully connected layer has an output size of 4096. During the pre-training stage, the dropout rate of the first two fully connected layers is set to 0.6.
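As a rough illustration of the layer geometry described above, the following sketch computes the spatial size of each convolutional feature map for a 256 × 256 input. The padding and the normalization/max-pooling layers are omitted, since the patent does not specify those parameters, and the layer names are assumptions:

```python
# Sketch of the five convolutional layers described above.
# Padding and the pooling/normalization layers are omitted (the
# patent does not give those parameters), so the sizes below only
# illustrate the filter-size/stride arithmetic.

def conv_out(size, kernel, stride, pad=0):
    """Spatial output size of a convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1

layers = [
    ("conv1", 7, 2),  # 7 x 7 filter, stride 2
    ("conv2", 5, 2),  # 5 x 5 filter, stride 2
    ("conv3", 3, 1),  # 3 x 3 filter, stride 1
    ("conv4", 3, 1),
    ("conv5", 3, 1),
]

size = 256  # all input images are resized to 256 x 256
for name, kernel, stride in layers:
    size = conv_out(size, kernel, stride)
    print(name, size, "x", size)
```

The three fully connected layers (output size 4096 each) and the 1000-way Softmax classifier would follow the last convolutional feature map.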
The back-propagation optimization of each single modality described in step 2 of the present invention comprises:
1. Single-modality pre-training:
The convolutional neural network is pre-trained on the unlabeled training set to obtain an intermediate representation of the image objects and, at the same time, to initialize the network. The detailed process is as follows. First, contrastive divergence is used to train the node weights W1 between the input layer and the first convolutional layer. Then, the conditional probability of the first-convolutional-layer nodes is used as the input of the second convolutional layer:

p(Γ | x_j) = S(W_1, x_j)    (1)

where x_j is the j-th feature vector, Γ is the annotation information, and S(·) is the similarity function given by the formula below.
Then, the first and second convolutional layers are combined to train the node weights W2; the same method is used to train the node weights of the remaining three convolutional layers and the three fully connected layers.
2. Single-modality fine-tuning stage:
In the single-modality fine-tuning stage, the annotation error is back-propagated to optimize the node weights. From a pattern-recognition point of view, multi-label learning can be regarded as multi-task learning; the overall annotation error of the convolutional neural network can therefore be regarded as the sum of the individual annotation errors. The node optimization process is described below for the l-th annotation error.
First, for an image x, represented as x_j under the j-th feature modality, the probability that it carries the l-th annotation Γ_l can be expressed by the posterior probability of the following formula, where L denotes the number of annotations.
Then, the KL divergence between the predicted probability and the reference probability is minimized. Assume every image has multiple annotations, represented by a vector y ∈ R^{1×c}, where y_l = 1 indicates that the annotation set of image x contains the l-th annotation and y_l = 0 indicates that it does not. Let q_il denote the probability between image x_i and annotation l; then the error of correctly assigning the l-th annotation to the image is given by the following formula.
The distribution error over all annotations is:
Finally, backpropagation is used to update the node weights of the other two fully connected layers and the five convolutional layers in turn.
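The per-annotation error above can be sketched as follows. With binary reference labels y_l and predicted probabilities q_l, minimizing the KL divergence to the reference distribution reduces to a cross-entropy term summed over the L annotations; this is an assumed concrete form, not the patent's exact formula, and the clamping constant `eps` is an implementation detail:

```python
# Hedged sketch of the multi-label annotation error: cross-entropy
# between the binary reference labels y and predicted probabilities q,
# summed over the annotations.
import math

def label_error(y, q, eps=1e-12):
    """Sum over labels of -[y_l*log(q_l) + (1 - y_l)*log(1 - q_l)]."""
    total = 0.0
    for yl, ql in zip(y, q):
        ql = min(max(ql, eps), 1.0 - eps)  # clamp for numerical safety
        total += -(yl * math.log(ql) + (1 - yl) * math.log(1 - ql))
    return total

y = [1, 0, 1]          # the image carries annotations 1 and 3
q = [0.9, 0.2, 0.8]    # predicted per-annotation probabilities
err = label_error(y, q)
```

In the fine-tuning stage this error would be back-propagated through the fully connected and convolutional layers to update the node weights.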
The process of optimizing the weights between the different modalities with the online exponentiated-gradient algorithm described in step 3 of the present invention comprises:
For the multi-modal deep network, another vital task is to learn the best combination weights of the N modalities, α = (α_1, α_2, …, α_n, …, α_N), where each α_n is initialized to 1/N. The present invention adopts an online exponentiated-gradient algorithm to optimize the multi-modal weight combination:
where KL(·) denotes the KL divergence and h(α) denotes the hinge loss function:
where S_t is

S_t = (S_1(x, Γ+) − S_1(x, Γ−), …, S_N(x, Γ+) − S_N(x, Γ−))^T    (8)

and the annotation Γ+ reflects the image content better than Γ−.
A first-order Taylor expansion of the function h(α) is taken at α_t to simplify the optimization problem, so equation (8) can be written in first-order Taylor-expansion form. Whenever Γ+ and Γ− are not ranked in the correct order, the value of the node weights α is updated automatically.
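A minimal sketch of an exponentiated-gradient update for the modality weights α is given below; the learning rate `eta`, the toy per-modality score gaps, and the use of the hinge-loss subgradient are assumptions, since the patent does not state these details:

```python
# Hedged sketch of an exponentiated-gradient update for the modality
# combination weights alpha: a multiplicative update followed by
# renormalization keeps alpha a distribution over the N modalities.
import math

def eg_update(alpha, grad, eta=0.1):
    """One exponentiated-gradient step on the weight vector alpha."""
    w = [a * math.exp(-eta * g) for a, g in zip(alpha, grad)]
    z = sum(w)
    return [x / z for x in w]

N = 4
alpha = [1.0 / N] * N        # initialized to 1/N as in step 3
s_t = [0.3, -0.1, 0.2, 0.0]  # score gaps S_n(x, Γ+) - S_n(x, Γ-)
# When the hinge loss is violated, its subgradient w.r.t. alpha is
# -s_t, so modalities that rank the correct annotation higher are
# given larger weight:
alpha = eg_update(alpha, [-s for s in s_t])
```

After the update the weights still sum to one, and the modality with the largest score gap receives the largest weight.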
Beneficial effects:
1. The present invention optimizes the parameters of the deep neural network and improves annotation precision.
2. The present invention better realizes the validity of image annotation based on a deep-neural-network learning model.
3. The present invention effectively improves image annotation performance.
Description of the drawings
Fig. 1 is the flow diagram of the method of the invention.
Fig. 2 is the deep neural network model of the invention.
Fig. 3 shows example images from the natural scene image library of the invention.
Fig. 4 shows images from the NUS-WIDE image library of the invention.
Fig. 5 shows example images from the IAPRTC-12 image database of the invention.
Fig. 6 is a schematic diagram of the results of different modality weight combinations on the three common image libraries of the invention.
Embodiment
The invention is described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, the invention provides an image annotation method based on multi-modal deep learning. The method comprises: first, training a deep neural network with unlabeled images; second, optimizing each single modality with backpropagation; and finally, optimizing the weights between the different modalities with an online exponentiated-gradient algorithm.
The deep neural network of the invention adopts a convolutional neural network, whose model structure is shown in Fig. 2. Through a series of experiments, the invention assesses the performance of the proposed multi-modal deep learning image annotation algorithm.
Step 1: introduce the data sets used to assess the algorithm performance.
The experiments use three common image data sets: the natural scene image library shown in Fig. 3, the NUS-WIDE image library shown in Fig. 4, and the IAPRTC-12 image library shown in Fig. 5. Details of the three image libraries are as follows:
The natural scene image library contains 2000 images, all annotated with the following five labels: desert, mountains, sea, sunset, and trees. More than 20% of the images carry more than one annotation, and the average number of annotations per image is 1.3. Fig. 3 shows two example images from the natural scene library: Fig. 3(a) is annotated with sunset and sea, and Fig. 3(b) with mountains and trees.
The NUS-WIDE image library contains 30,000 images covering 31 annotations, including boat, car, flag, horse, sky, sun, tower, airplane, and zebra. Fig. 4 shows two images from the NUS-WIDE library: the annotations of Fig. 4(a) contain sky and airplane, and those of Fig. 4(b) contain sea and sunset.
The IAPRTC-12 image database contains 20,000 images and 291 annotations, with an average of 5.7 annotations per image. Fig. 5 shows two example images from the IAPRTC-12 database: the annotations of Fig. 5(a) contain brown, face, hair, man, and woman, and those of Fig. 5(b) contain boat, lake, sky, and trees.
Step 2: specify the visual features characterizing the images and the optimized parameters obtained by learning.
Feature selection has a great impact on system performance. The invention chooses the following global and local feature descriptors to characterize the images:
Global features: (1) a 128-dimensional HSV color histogram and 225-dimensional LAB color moments, (2) a 37-dimensional edge orientation histogram, (3) 36-dimensional pyramid wavelet textures, (4) a 59-dimensional local binary pattern (LBP) descriptor, (5) a 960-dimensional GIST descriptor.
Local features: local texture features are extracted with two different sampling methods and three different local descriptors, as follows. First, dense sampling and Harris corner detection are carried out. Then, SIFT, C-SIFT, and RGB-SIFT features are extracted, and a 1000-class codebook is built with k-means clustering. Next, a second-order spatial pyramid scheme is adopted to build a 5000-dimensional vector for every image. Finally, the TF-IDF weighting method is used to generate the final visual bag of words. Throughout the experiments, all feature vectors are normalized to the range [0, 1].
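The final TF-IDF weighting step can be sketched as follows, using toy visual-word counts in place of the 1000-word codebook built by k-means (a standard tf-idf formulation, assumed rather than taken from the patent):

```python
# Sketch of TF-IDF weighting over per-image visual-word counts:
# term frequency within each image times the log inverse document
# frequency of each visual word across the image set.
import math

def tf_idf(counts_per_image):
    """counts_per_image: list of per-image visual-word count vectors."""
    n_images = len(counts_per_image)
    n_words = len(counts_per_image[0])
    # document frequency: in how many images each visual word appears
    df = [sum(1 for img in counts_per_image if img[w] > 0)
          for w in range(n_words)]
    out = []
    for img in counts_per_image:
        total = sum(img) or 1
        out.append([(c / total) * math.log(n_images / df[w])
                    if df[w] else 0.0
                    for w, c in enumerate(img)])
    return out

bows = tf_idf([[3, 0, 1], [0, 2, 1], [1, 1, 1]])
```

Note that a visual word appearing in every image receives weight zero, which is exactly the down-weighting of uninformative words that TF-IDF is meant to provide.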
For every query–annotation pair, three similarity measures are given in formula (4) above, and the margin parameter μ is selected by cross-validation. After cross-validation, the μ value of the cosine similarity measure is 0.18; the μ value of the linear similarity measure is 1; and for the RBF similarity measure, σ is 2 and μ is 0.18.
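The three similarity measures named above can be sketched with their standard textbook definitions; the exact parametrization of formula (4), including where the margin μ enters, is not reproduced here:

```python
# Standard definitions of the linear, cosine, and RBF similarity
# measures (assumed forms; the patent's formula (4) is not available).
import math

def linear_sim(x, y):
    """Linear similarity: the dot product of the two vectors."""
    return sum(a * b for a, b in zip(x, y))

def cosine_sim(x, y):
    """Cosine similarity: dot product over the product of norms."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return linear_sim(x, y) / (nx * ny) if nx and ny else 0.0

def rbf_sim(x, y, sigma=2.0):  # sigma = 2 as selected by cross-validation
    """RBF similarity: Gaussian of the squared Euclidean distance."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * sigma ** 2))
```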
Step 3: test the performance of the proposed algorithm through comparative experiments.
Algorithm comparison
The comparative experiments of the invention are carried out among the following three image annotation methods:
Lazy-learning-based algorithm: first, for each test image, find the K most similar images in the training image library; then, count the annotation statistics of the K most similar images; finally, assign annotations to the test image according to the maximum a posteriori probability.
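The lazy-learning baseline can be sketched as a K-nearest-neighbour vote; K, the toy feature vectors, and the label names below are illustrative assumptions, not data from the experiments:

```python
# Sketch of the lazy-learning baseline: find the K nearest training
# images and assign the most frequently voted annotations.
from collections import Counter

def knn_annotate(test_feat, train, k=3, n_labels=2):
    """train: list of (feature_vector, annotation_set) pairs."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda item: dist(test_feat, item[0]))[:k]
    votes = Counter(lab for _, labels in nearest for lab in labels)
    # assign the annotations with the highest vote counts
    return [lab for lab, _ in votes.most_common(n_labels)]

train = [([0.10, 0.20], {"sea", "sunset"}),
         ([0.15, 0.22], {"sea"}),
         ([0.90, 0.80], {"mountain", "trees"})]
labels = knn_annotate([0.12, 0.21], train, k=2)
```

Here the vote count plays the role of the posterior probability: annotations supported by more of the K neighbours are assigned first.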
Depth-representation-and-coding-based algorithm: a hierarchical model is used to learn pixel-level image representations to realize image annotation.
The method of the invention: image annotation is realized by a deep neural network.
Modality weights
In the method for the invention, the combining weights α of different modalities has very large impact to system performance.Fig. 5 provides in three kinds of common image storehouses, the result of different modalities weight combination.Fig. 6 (a): the different modalities combining weights under natural image storehouse.Different modalities combining weights under Fig. 6 (b): NUS-WIDE image.Different modalities combining weights under Fig. 6 (c): IAPRTC-12 image.
As can easily be seen from the results in Fig. 6, the ratios between the different modalities show no significant differences. This means that every modality is more or less helpful for classifying different images, mainly because the three image libraries contain natural scene images of many different categories; it also further demonstrates the importance of obtaining the optimal combination of the different modalities.
Performance comparison
Table 1 gives the experimental comparison results of the different image annotation techniques.
Table 1: Experimental comparison results.
As can be seen from the results in Table 1, the NDCG@w performance of the proposed method is better than that of the other two existing methods, which verifies the validity of image annotation based on a deep-neural-network learning model.
Claims (5)
1. An image annotation method based on multi-modal deep learning, characterized in that the method comprises the following steps:
Step 1: pre-train the node weights of a deep neural network using an unlabeled image sample set;
Step 2: optimize the weights of each single modality using the back-propagation algorithm;
Step 3: optimize the weights of the modality combination using an online exponentiated-gradient algorithm.
2. The image annotation method based on multi-modal deep learning according to claim 1, characterized in that the deep neural network of step 1 adopts an eight-layer convolutional neural network, in which the first five layers are convolutional layers and the remaining three are fully connected layers; the output of the fully connected layers feeds a Softmax classifier, which produces 1000 annotation classes; both the pre-training and fine-tuning stages use a multinomial logistic regression objective function;
The first, second, and fifth of said convolutional layers are followed by normalization layers; to maintain invariance, all normalization layers use max pooling; rectified linear units (ReLU) are used as the nonlinear activation function in all convolutional and fully connected layers;
In the convolutional neural network used, all input images are resized to 256 × 256; the first two convolution filters are set to 7 × 7 and 5 × 5 with a stride of 2, where filters of this size capture information in all frequency bands and the small stride avoids the "dead features" that would harm the next network layer; the last three convolutional layers are then connected in sequence, with 3 × 3 filters and a stride of 1; finally, each fully connected layer has an output size of 4096, and during the pre-training stage the dropout rate of the first two fully connected layers is set to 0.6.
3. The image annotation method based on multi-modal deep learning according to claim 1, characterized in that the back-propagation algorithm of step 2 comprises:
1. Single-modality pre-training:
The convolutional neural network is pre-trained on the unlabeled training set to obtain an intermediate representation of the image objects and, at the same time, to initialize the network, comprising: first, using contrastive divergence to train the node weights W1 between the input layer and the first convolutional layer; then, using the conditional probability of the first-convolutional-layer nodes as the input of the second convolutional layer:

p(Γ | x_j) = S(W_1, x_j)    (1)

where x_j is the j-th feature vector, Γ is the annotation information, and S(·) is the similarity function given by the formula below;
Then, the first and second convolutional layers are combined to train the node weights W2, and the same method is used to train the node weights of the remaining three convolutional layers and three fully connected layers;
2. Single-modality fine-tuning stage:
In the single-modality fine-tuning stage, the annotation error is back-propagated to optimize the node weights; from a pattern-recognition point of view, multi-label learning can be regarded as multi-task learning; the overall annotation error of the convolutional neural network can be regarded as the sum of the individual annotation errors, and the node optimization process is described with the l-th annotation error, comprising:
First, for an image x, represented as x_j under the j-th feature modality, the probability that it carries the l-th annotation Γ_l can be expressed by the posterior probability of the following formula, where L denotes the number of annotations;
Then, the KL divergence between the predicted probability and the reference probability is minimized; assume every image has multiple annotations, represented by a vector y ∈ R^{1×c}, where y_l = 1 indicates that the annotation set of image x contains the l-th annotation and y_l = 0 indicates that it does not; let q_il denote the probability between image x_i and annotation l; then the error of correctly assigning the l-th annotation to the image is given by the following formula, and the distribution error over all annotations by the formula after it;
Finally, backpropagation is used to update the node weights of the other two fully connected layers and the five convolutional layers in turn.
4. The image annotation method based on multi-modal deep learning according to claim 1, characterized in that the process of optimizing the weights between the different modalities with the online exponentiated-gradient algorithm of step 3 comprises:
For the multi-modal deep network, another vital task is to learn the best combination weights of the N modalities, α = (α_1, α_2, …, α_n, …, α_N), where each α_n is initialized to 1/N; an online exponentiated-gradient algorithm is adopted to optimize the multi-modal weight combination, comprising:
where KL(·) denotes the KL divergence and h(α) denotes the hinge loss function:
where S_t is

S_t = (S_1(x, Γ+) − S_1(x, Γ−), …, S_N(x, Γ+) − S_N(x, Γ−))^T    (8)

and the annotation Γ+ reflects the image content better than Γ−;
A first-order Taylor expansion of the function h(α) is taken at α_t to simplify the optimization problem, so equation (8) can be written in first-order Taylor-expansion form; whenever Γ+ and Γ− are not ranked in the correct order, the value of the node weights α is updated automatically.
5. The image annotation method based on multi-modal deep learning according to claim 1, characterized in that the method is applied to convolutional neural networks.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510198325.XA CN105184303B (en) | 2015-04-23 | 2015-04-23 | A kind of image labeling method based on multi-modal deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105184303A true CN105184303A (en) | 2015-12-23 |
CN105184303B CN105184303B (en) | 2019-08-09 |
Family
ID=54906369
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510198325.XA Active CN105184303B (en) | 2015-04-23 | 2015-04-23 | A kind of image labeling method based on multi-modal deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105184303B (en) |
Cited By (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105654942A (en) * | 2016-01-04 | 2016-06-08 | 北京时代瑞朗科技有限公司 | Speech synthesis method of interrogative sentence and exclamatory sentence based on statistical parameter |
CN105678340A (en) * | 2016-01-20 | 2016-06-15 | 福州大学 | Automatic image marking method based on enhanced stack type automatic encoder |
CN105760859A (en) * | 2016-03-22 | 2016-07-13 | 中国科学院自动化研究所 | Method and device for identifying reticulate pattern face image based on multi-task convolutional neural network |
CN105894012A (en) * | 2016-03-29 | 2016-08-24 | 天津大学 | Object identification method based on cascade micro neural network |
CN105930877A (en) * | 2016-05-31 | 2016-09-07 | 上海海洋大学 | Multimodal depth learning-based remote sensing image classification method |
CN106056602A (en) * | 2016-05-27 | 2016-10-26 | 中国人民解放军信息工程大学 | CNN (convolutional neural network)-based fMRI (functional magnetic resonance imaging) visual function data object extraction method |
CN106202338A (en) * | 2016-06-30 | 2016-12-07 | 合肥工业大学 | Image search method based on the many relations of multiple features |
CN106682592A (en) * | 2016-12-08 | 2017-05-17 | 北京泛化智能科技有限公司 | Automatic image recognition system and method based on neural network method |
CN106845427A (en) * | 2017-01-25 | 2017-06-13 | 北京深图智服技术有限公司 | A kind of method for detecting human face and device based on deep learning |
CN107122800A (en) * | 2017-04-27 | 2017-09-01 | 南京大学 | A kind of Robust digital figure mask method based on the screening that predicts the outcome |
CN107273784A (en) * | 2016-04-01 | 2017-10-20 | 富士施乐株式会社 | Image steganalysis apparatus and method |
CN108307205A (en) * | 2017-12-06 | 2018-07-20 | 中国电子科技集团公司电子科学研究院 | Merge the recognition methods of video expressive force, terminal and the storage medium of audio visual feature |
CN108388768A (en) * | 2018-02-08 | 2018-08-10 | 南京恺尔生物科技有限公司 | Utilize the biological nature prediction technique for the neural network model that biological knowledge is built |
CN108960015A (en) * | 2017-05-24 | 2018-12-07 | 优信拍(北京)信息科技有限公司 | A kind of vehicle system automatic identifying method and device based on deep learning |
CN109196526A (en) * | 2016-06-01 | 2019-01-11 | 三菱电机株式会社 | For generating the method and system of multi-modal digital picture |
CN109543835A (en) * | 2018-11-30 | 2019-03-29 | 上海寒武纪信息科技有限公司 | Operation method, device and Related product |
CN109544517A (en) * | 2018-11-06 | 2019-03-29 | 中山大学附属第医院 | Method and system are analysed in multi-modal ultrasound group credit based on deep learning |
CN109543833A (en) * | 2018-11-30 | 2019-03-29 | 上海寒武纪信息科技有限公司 | Operation method, device and Related product |
CN109583580A (en) * | 2018-11-30 | 2019-04-05 | 上海寒武纪信息科技有限公司 | Operation method, device and Related product |
CN109583583A (en) * | 2017-09-29 | 2019-04-05 | 腾讯科技(深圳)有限公司 | Neural network training method, device, computer equipment and readable medium |
CN109711464A (en) * | 2018-12-25 | 2019-05-03 | 中山大学 | Image Description Methods based on the building of stratification Attributed Relational Graps |
CN109886226A (en) * | 2019-02-27 | 2019-06-14 | 北京达佳互联信息技术有限公司 | Determine method, apparatus, electronic equipment and the storage medium of the characteristic of image |
CN110019652A (en) * | 2019-03-14 | 2019-07-16 | 九江学院 | A kind of cross-module state Hash search method based on deep learning |
CN111127456A (en) * | 2019-12-28 | 2020-05-08 | 北京无线电计量测试研究所 | Image annotation quality evaluation method |
CN111383744A (en) * | 2020-06-01 | 2020-07-07 | 北京协同创新研究院 | Medical microscopic image annotation information processing method and system and image analysis equipment |
WO2021046970A1 (en) * | 2019-09-11 | 2021-03-18 | 山东浪潮人工智能研究院有限公司 | Arithmetic coding-based neural network model compression encryption method and system |
CN112633394A (en) * | 2020-12-29 | 2021-04-09 | 厦门市美亚柏科信息股份有限公司 | Intelligent user label determination method, terminal equipment and storage medium |
CN114170481A (en) * | 2022-02-10 | 2022-03-11 | 北京字节跳动网络技术有限公司 | Method, apparatus, storage medium, and program product for image processing |
CN115356363A (en) * | 2022-08-01 | 2022-11-18 | 河南理工大学 | Wide ion beam polishing-scanning electron microscope-based pore structure characterization method |
CN116563400A (en) * | 2023-07-12 | 2023-08-08 | 南通原力云信息技术有限公司 | Small program image information compression processing method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120254086A1 (en) * | 2011-03-31 | 2012-10-04 | Microsoft Corporation | Deep convex network with joint use of nonlinear random projection, restricted boltzmann machine and batch-based parallelizable optimization |
CN102902966A (en) * | 2012-10-12 | 2013-01-30 | 大连理工大学 | Super-resolution face recognition method based on deep belief networks |
CN103345656A (en) * | 2013-07-17 | 2013-10-09 | 中国科学院自动化研究所 | Method and device for data identification based on multitask deep neural network |
- 2015-04-23 CN CN201510198325.XA patent/CN105184303B/en active Active
Cited By (47)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105654942A (en) * | 2016-01-04 | 2016-06-08 | 北京时代瑞朗科技有限公司 | Speech synthesis method of interrogative sentence and exclamatory sentence based on statistical parameter |
CN105678340A (en) * | 2016-01-20 | 2016-06-15 | 福州大学 | Automatic image marking method based on enhanced stack type automatic encoder |
CN105678340B (en) * | 2016-01-20 | 2018-12-25 | 福州大学 | A kind of automatic image marking method based on enhanced stack autocoder |
CN105760859B (en) * | 2016-03-22 | 2018-12-21 | 中国科学院自动化研究所 | Reticulate pattern facial image recognition method and device based on multitask convolutional neural networks |
CN105760859A (en) * | 2016-03-22 | 2016-07-13 | 中国科学院自动化研究所 | Method and device for identifying reticulate pattern face image based on multi-task convolutional neural network |
CN105894012A (en) * | 2016-03-29 | 2016-08-24 | 天津大学 | Object identification method based on cascade micro neural network |
CN107273784B (en) * | 2016-04-01 | 2022-04-15 | 富士胶片商业创新有限公司 | Image pattern recognition apparatus and method |
CN107273784A (en) * | 2016-04-01 | 2017-10-20 | 富士施乐株式会社 | Image steganalysis apparatus and method |
CN106056602B (en) * | 2016-05-27 | 2019-06-28 | 中国人民解放军信息工程大学 | FMRI visual performance datum target extracting method based on CNN |
CN106056602A (en) * | 2016-05-27 | 2016-10-26 | 中国人民解放军信息工程大学 | CNN (convolutional neural network)-based fMRI (functional magnetic resonance imaging) visual function data object extraction method |
CN105930877B (en) * | 2016-05-31 | 2020-07-10 | 上海海洋大学 | Remote sensing image classification method based on multi-mode deep learning |
CN105930877A (en) * | 2016-05-31 | 2016-09-07 | 上海海洋大学 | Multimodal depth learning-based remote sensing image classification method |
CN109196526B (en) * | 2016-06-01 | 2021-09-28 | 三菱电机株式会社 | Method and system for generating multi-modal digital images |
CN109196526A (en) * | 2016-06-01 | 2019-01-11 | 三菱电机株式会社 | For generating the method and system of multi-modal digital picture |
CN106202338A (en) * | 2016-06-30 | 2016-12-07 | 合肥工业大学 | Image search method based on the many relations of multiple features |
CN106202338B (en) * | 2016-06-30 | 2019-04-05 | 合肥工业大学 | Image search method based on the more relationships of multiple features |
CN106682592B (en) * | 2016-12-08 | 2023-10-27 | 北京泛化智能科技有限公司 | Image automatic identification system and method based on neural network method |
CN106682592A (en) * | 2016-12-08 | 2017-05-17 | 北京泛化智能科技有限公司 | Automatic image recognition system and method based on neural network method |
CN106845427A (en) * | 2017-01-25 | 2017-06-13 | 北京深图智服技术有限公司 | A kind of method for detecting human face and device based on deep learning |
CN106845427B (en) * | 2017-01-25 | 2019-12-06 | 北京深图智服技术有限公司 | face detection method and device based on deep learning |
CN107122800A (en) * | 2017-04-27 | 2017-09-01 | 南京大学 | A kind of Robust digital figure mask method based on the screening that predicts the outcome |
CN107122800B (en) * | 2017-04-27 | 2020-09-18 | 南京大学 | Robust digital image labeling method based on prediction result screening |
CN108960015A (en) * | 2017-05-24 | 2018-12-07 | 优信拍(北京)信息科技有限公司 | A kind of vehicle system automatic identifying method and device based on deep learning |
CN109583583B (en) * | 2017-09-29 | 2023-04-07 | 腾讯科技(深圳)有限公司 | Neural network training method and device, computer equipment and readable medium |
CN109583583A (en) * | 2017-09-29 | 2019-04-05 | 腾讯科技(深圳)有限公司 | Neural network training method, device, computer equipment and readable medium |
CN108307205A (en) * | 2017-12-06 | 2018-07-20 | 中国电子科技集团公司电子科学研究院 | Merge the recognition methods of video expressive force, terminal and the storage medium of audio visual feature |
CN108388768A (en) * | 2018-02-08 | 2018-08-10 | 南京恺尔生物科技有限公司 | Utilize the biological nature prediction technique for the neural network model that biological knowledge is built |
CN109544517A (en) * | 2018-11-06 | 2019-03-29 | 中山大学附属第医院 | Method and system are analysed in multi-modal ultrasound group credit based on deep learning |
CN109543833A (en) * | 2018-11-30 | 2019-03-29 | 上海寒武纪信息科技有限公司 | Operation method, device and related product |
CN109583580A (en) * | 2018-11-30 | 2019-04-05 | 上海寒武纪信息科技有限公司 | Operation method, device and related product |
CN109543835A (en) * | 2018-11-30 | 2019-03-29 | 上海寒武纪信息科技有限公司 | Operation method, device and related product |
CN109543835B (en) * | 2018-11-30 | 2021-06-25 | 上海寒武纪信息科技有限公司 | Operation method, device and related product |
CN109583580B (en) * | 2018-11-30 | 2021-08-03 | 上海寒武纪信息科技有限公司 | Operation method, device and related product |
CN109711464A (en) * | 2018-12-25 | 2019-05-03 | 中山大学 | Image description method based on hierarchical attributed relational graph construction |
CN109711464B (en) * | 2018-12-25 | 2022-09-27 | 中山大学 | Image description method constructed based on hierarchical feature relationship diagram |
CN109886226A (en) * | 2019-02-27 | 2019-06-14 | 北京达佳互联信息技术有限公司 | Method, apparatus, electronic device and storage medium for determining image characteristics |
CN110019652B (en) * | 2019-03-14 | 2022-06-03 | 九江学院 | Cross-modal Hash retrieval method based on deep learning |
CN110019652A (en) * | 2019-03-14 | 2019-07-16 | 九江学院 | Cross-modal Hash retrieval method based on deep learning |
WO2021046970A1 (en) * | 2019-09-11 | 2021-03-18 | 山东浪潮人工智能研究院有限公司 | Arithmetic coding-based neural network model compression encryption method and system |
CN111127456A (en) * | 2019-12-28 | 2020-05-08 | 北京无线电计量测试研究所 | Image annotation quality evaluation method |
CN111383744A (en) * | 2020-06-01 | 2020-07-07 | 北京协同创新研究院 | Medical microscopic image annotation information processing method and system and image analysis equipment |
CN112633394A (en) * | 2020-12-29 | 2021-04-09 | 厦门市美亚柏科信息股份有限公司 | Intelligent user label determination method, terminal equipment and storage medium |
CN114170481A (en) * | 2022-02-10 | 2022-03-11 | 北京字节跳动网络技术有限公司 | Method, apparatus, storage medium, and program product for image processing |
CN115356363B (en) * | 2022-08-01 | 2023-06-20 | 河南理工大学 | Pore structure characterization method based on wide ion beam polishing-scanning electron microscope |
CN115356363A (en) * | 2022-08-01 | 2022-11-18 | 河南理工大学 | Wide ion beam polishing-scanning electron microscope-based pore structure characterization method |
CN116563400A (en) * | 2023-07-12 | 2023-08-08 | 南通原力云信息技术有限公司 | Small program image information compression processing method |
CN116563400B (en) * | 2023-07-12 | 2023-09-05 | 南通原力云信息技术有限公司 | Small program image information compression processing method |
Also Published As
Publication number | Publication date |
---|---|
CN105184303B (en) | 2019-08-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105184303A (en) | Image marking method based on multi-mode deep learning | |
Ševo et al. | Convolutional neural network based automatic object detection on aerial images | |
He et al. | Remote sensing scene classification using multilayer stacked covariance pooling | |
Imbriaco et al. | Aggregated deep local features for remote sensing image retrieval | |
Liu et al. | Scene classification via triplet networks | |
Yu et al. | A two-stream deep fusion framework for high-resolution aerial scene classification | |
Zhang et al. | Scene classification via a gradient boosting random convolutional network framework | |
Avila et al. | Pooling in image representation: The visual codeword point of view | |
He et al. | Learning and incorporating top-down cues in image segmentation | |
EP3029606A2 (en) | Method and apparatus for image classification with joint feature adaptation and classifier learning | |
Cheng et al. | Learning coarse-to-fine sparselets for efficient object detection and scene classification | |
Qayyum et al. | Scene classification for aerial images based on CNN using sparse coding technique | |
Ali et al. | A hybrid geometric spatial image representation for scene classification | |
CN110321967B (en) | Image classification improvement method based on convolutional neural network | |
Zhu et al. | Plant identification based on very deep convolutional neural networks | |
CN104462494B (en) | Remote sensing image retrieval method and system based on unsupervised feature learning | |
Tobías et al. | Convolutional Neural Networks for object recognition on mobile devices: A case study | |
Holder et al. | From on-road to off: Transfer learning within a deep convolutional neural network for segmentation and classification of off-road scenes | |
Al-Haija et al. | Multi-class weather classification using ResNet-18 CNN for autonomous IoT and CPS applications | |
CN105184298A (en) | Image classification method via fast locality-constrained low-rank coding | |
Ren et al. | Ship recognition based on Hu invariant moments and convolutional neural network for video surveillance | |
Qayyum et al. | Designing deep CNN models based on sparse coding for aerial imagery: a deep-features reduction approach | |
Ghadi et al. | Robust object categorization and Scene classification over remote sensing images via features fusion and fully convolutional network | |
Alzu'Bi et al. | Compact root bilinear CNNs for content-based image retrieval |
Li et al. | Image decomposition with multilabel context: Algorithms and applications |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract |
Application publication date: 20151223 Assignee: Zhangjiagang Institute of Zhangjiagang Assignor: Nanjing Post & Telecommunication Univ. Contract record no.: X2019980001251 Denomination of invention: Image marking method based on multi-mode deep learning Granted publication date: 20190809 License type: Common License Record date: 20191224 |
|