WO2018209932A1 - Multi-quantization deep binary feature learning method and apparatus - Google Patents

Multi-quantization deep binary feature learning method and apparatus

Info

Publication number
WO2018209932A1
Authority
WO
WIPO (PCT)
Prior art keywords
depth
image
feature
quantization
real
Prior art date
Application number
PCT/CN2017/115622
Other languages
English (en)
French (fr)
Inventor
鲁继文
周杰
段岳圻
Original Assignee
清华大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 清华大学
Publication of WO2018209932A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]

Definitions

  • The present invention relates to the field of computer vision and machine learning, and in particular to a multi-quantization deep binary feature learning method and apparatus.
  • Visual recognition is a basic problem in computer vision and is widely used in visual applications such as face recognition, object recognition, scene recognition, and texture recognition.
  • The main steps of visual recognition are feature extraction and feature matching.
  • The goal of feature representation is to obtain a feature vector for each picture such that the feature vectors of pictures of the same class are more similar, while feature matching identifies the class of a picture from a similarity measure on the picture features. Because illumination, pose, background, viewing angle, and occlusion vary widely in natural environments, objects of the same class may have low similarity and objects of different classes may have high similarity, so obtaining accurate and efficient feature vectors is the most critical part of visual recognition.
  • Image feature extraction methods fall into two groups: hand-crafted feature extraction and feature learning.
  • The bag-of-words model is a representative hand-crafted method. Its main steps are: 1) extract key points or key regions from the image; 2) extract local feature descriptors for the key points or key regions; 3) build a dictionary for the bag-of-words model; 4) pool the local feature descriptors and extract histogram features.
  • Extracting key points or key regions and extracting feature descriptors are traditional problems in visual computing. Because local invariant features adapt well to interference such as occlusion, scale change, and illumination, in recent years they have gradually replaced global features as the mainstream method of image representation.
  • Key point or key region extraction finds regions of the image that remain stable and repeatable as the image changes. Feature descriptor extraction then provides an efficient and robust description of the key points or key regions found.
  • Local invariant feature detection methods are generally divided into corner detectors, blob detectors, and region detectors.
  • Feature learning methods learn visual features from a training set by summarizing the regularities contained in the data.
  • Deep learning has achieved excellent results in visual recognition.
  • Visual perception has entered the era of big data.
  • Big data is large in quantity on the one hand and large in dimensionality on the other.
  • Deep learning makes better use of visual big data to learn effective visual features, because it attends not only to global features but also to the local features that matter in image recognition, and it integrates local feature extraction into the neural network, thereby effectively completing the feature representation of the visual target.
  • Although deep learning has achieved excellent results in visual recognition, its computational cost is currently high, which is a bottleneck in practical applications.
  • Binary feature learning is fast to compute, store, and match. Deep binary feature learning obtains high descriptive power at low computational cost, so it can be both accurate and efficient and meet the needs of practical applications. For example, DeepBit learns deep binary features in an unsupervised way and achieves excellent recognition rates on multiple data sets.
  • However, existing binary feature learning methods all binarize with the sign function, which results in a large quantization loss.
  • The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
  • Another object of the present invention is to provide a multi-quantization deep binary feature learning apparatus.
  • An embodiment of the present invention provides a multi-quantization deep binary feature learning method, including the following steps: extracting deep real-valued features of an image; multi-quantizing the deep real-valued features of the image through K autoencoder networks to obtain a quantization result; and binary-coding the deep real-valued features of the image according to the quantization result to obtain binary features of the image.
  • The method of this embodiment binarizes by multi-quantization and implements the multi-quantization-based binarization with K autoencoder networks, which effectively addresses the quantization error caused by binarization, improves the accuracy and efficiency of learning, is simpler, and better meets the needs of practical applications.
  • In addition, the multi-quantization deep binary feature learning method according to the above embodiment may further have the following additional technical features:
  • Further, in one embodiment, extracting the deep real-valued features of the image further includes: forward-passing the original input image through a deep convolutional neural network to obtain the deep real-valued features of each image after the last fully connected layer of the network; and passing the deep real-valued features of each image to a dimensionality-reducing fully connected layer to obtain low-dimensional real-valued features.
  • The overall loss function is: [formula image not reproduced]
  • J is the objective to be optimized.
  • X is the real-valued feature to be solved for.
  • Two further symbols (also images in the source) denote the reconstruction error of the real-valued feature of the n-th picture at the k-th autoencoder and the projection of the l-th layer of the k-th autoencoder; U is the mean vector of all real-valued features, and λ1 and λ2 are the weights of the different terms.
  • Further, in one embodiment, multi-quantizing the deep real-valued features of the image through the K autoencoder networks further includes: reconstructing the deep real-valued features of the image with each of the K autoencoders, where each training sample is assigned to the autoencoder with the smallest reconstruction error; for each autoencoder, training it according to the first and second loss terms using all samples assigned to it and, after several iterations, taking for each sample the index of the autoencoder with the smallest reconstruction error as that sample's quantization result; and training the preprocessing fully connected layer with all samples using the third and second loss terms, repeating the training until the maximum number of iterations is reached.
  • Further, in one embodiment, binary-coding the deep real-valued features of the image according to the quantization result further includes: for each element of the deep real-valued feature, assigning to that element the binary index of the autoencoder that minimizes the reconstruction error of that element.
  • Another embodiment of the present invention provides a multi-quantization deep binary feature learning apparatus, including: an extraction module for extracting deep real-valued features of an image; a multi-quantization module for multi-quantizing the deep real-valued features of the image through K autoencoder networks to obtain a quantization result; and an encoding module for binary-coding the deep real-valued features of the image according to the quantization result to obtain binary features of the image.
  • The apparatus of this embodiment binarizes by multi-quantization and implements the multi-quantization-based binarization with K autoencoder networks, which effectively addresses the quantization error caused by binarization, improves the accuracy and efficiency of learning, is simpler, and better meets the needs of practical applications.
  • In addition, the multi-quantization deep binary feature learning apparatus according to the above embodiment may further have the following additional technical features:
  • Further, in one embodiment, the extraction module is specifically configured to forward-pass the original input image through a deep convolutional neural network to obtain the deep real-valued features of each image after the last fully connected layer of the network, and to pass the deep real-valued features of each image to a dimensionality-reducing fully connected layer to obtain low-dimensional real-valued features.
  • The overall loss function is: [formula image not reproduced]
  • J is the objective to be optimized.
  • X is the real-valued feature to be solved for.
  • Two further symbols (also images in the source) denote the reconstruction error of the real-valued feature of the n-th picture at the k-th autoencoder and the projection of the l-th layer of the k-th autoencoder; U is the mean vector of all real-valued features, and λ1 and λ2 are the weights of the different terms.
  • Further, in one embodiment, the multi-quantization module is further configured to: reconstruct the deep real-valued features of the image with each of K autoencoders, where each training sample is assigned to the autoencoder with the smallest reconstruction error; for each autoencoder, train it according to the first and second loss terms using all samples assigned to it and, after several iterations, take for each sample the index of the autoencoder with the smallest reconstruction error as that sample's quantization result; and train the preprocessing fully connected layer with all samples using the third and second loss terms, repeating the training until the maximum number of iterations is reached.
  • Further, in one embodiment, the encoding module is further configured to assign to each element of the deep real-valued feature the binary index of the autoencoder that minimizes the reconstruction error of that element.
  • FIG. 1 is a flowchart of a multi-quantization deep binary feature learning method according to an embodiment of the present invention;
  • FIG. 2 is a flowchart of a multi-quantization deep binary feature learning method according to one embodiment of the present invention;
  • FIG. 3 is a flowchart of a multi-quantization deep binary feature learning method according to a specific embodiment of the present invention;
  • FIG. 4 is a schematic structural diagram of a multi-quantization deep binary feature learning apparatus according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of the multi-quantization deep binary feature learning method of an embodiment of the present invention.
  • As shown in FIG. 1, the multi-quantization deep binary feature learning method includes the following steps:
  • In step S101, the deep real-valued features of the image are extracted.
  • In one embodiment, extracting the deep real-valued features of the image further includes: forward-passing the original input image through a deep convolutional neural network to obtain the deep real-valued features of each image after the last fully connected layer of the network; and passing the deep real-valued features of each image to a dimensionality-reducing fully connected layer to obtain low-dimensional real-valued features.
  • For example, the original input image is forward-passed through the deep convolutional neural network, a 4096-dimensional deep real-valued feature is obtained for each image after the last fully connected layer, and this 4096-dimensional feature is then passed to a dimensionality-reducing fully connected layer to obtain a low-dimensional real-valued feature, which is the result of image preprocessing.
  • As shown in FIG. 2, the embodiment first feeds the preprocessed real-valued image features into K autoencoders connected in parallel, so that, under the objective of optimizing the loss function, each input real-valued feature trains the autoencoder that minimizes its overall reconstruction error.
  • In step S102, the deep real-valued features of the image are multi-quantized through K autoencoder networks to obtain a quantization result.
  • The overall loss function is: [formula image not reproduced]
  • J is the objective to be optimized.
  • X is the real-valued feature to be solved for.
  • Two further symbols (also images in the source) denote the reconstruction error of the real-valued feature of the n-th picture at the k-th autoencoder and the projection of the l-th layer of the k-th autoencoder; U is the mean vector of all real-valued features, and λ1 and λ2 are the weights of the different terms.
  • Further, in one embodiment, multi-quantizing the deep real-valued features of the image through the K autoencoder networks further includes: reconstructing the deep real-valued features of the image with each of the K autoencoders, where each training sample is assigned to the autoencoder with the smallest reconstruction error; for each autoencoder, training it according to the first and second loss terms using all samples assigned to it and, after several iterations, taking for each sample the index of the autoencoder with the smallest reconstruction error as that sample's quantization result; and training the preprocessing fully connected layer with all samples using the third and second loss terms, repeating the training until the maximum number of iterations is reached.
  • As shown in FIG. 3, because the embodiment is a feature extraction method based on unsupervised learning, a loss function is introduced to train the network.
  • K autoencoder networks multi-quantize the real-valued features, and the result serves as the basis for binarization.
  • The multi-quantization method should have the following properties:
  • 1. Minimum reconstruction error. The preprocessed real-valued features are reconstructed with an autoencoder.
  • The features extracted at the narrowest layer of the autoencoder are the original features reduced to a subspace. The smaller the error in reconstructing the preprocessed real-valued features from this narrowest layer, the more information of the original data is retained by the projection onto that subspace.
  • 2. Preventing overfitting. Because the number of samples is limited, the network must be kept from learning only local characteristics of the samples, so a penalty term is introduced that keeps the coefficients sparse and prevents overfitting.
  • 3. Maximum variance of the preprocessed real-valued features. Increasing the variance helps increase the amount of information carried by the data.
  • The dimensionality reduction in preprocessing should preserve as much of the information in the original high-dimensional data as possible, so a constraint is introduced to maximize the variance of the preprocessed features.
  • In summary, the overall loss function is: [formula image not reproduced]
  • J is the objective to be optimized.
  • X is the real-valued feature to be solved for.
  • Two further symbols (also images in the source) denote the reconstruction error of the real-valued feature of the n-th picture at the k-th autoencoder and the projection of the l-th layer of the k-th autoencoder; U is the mean vector of all real-valued features, and λ1 and λ2 are the weights of the different terms.
  • The embodiment trains the multi-quantization based on K autoencoder networks by two-step iteration:
  • Step 1: reconstruct the preprocessed real-valued features with each of the K autoencoders and assign each training sample to the autoencoder with the smallest reconstruction error.
  • Step 2: according to the first and second loss terms, train each autoencoder using all samples assigned to it. After several iterations, take for each sample the index of the autoencoder with the smallest reconstruction error as that sample's quantization result. Each autoencoder is a projection of the sample onto a different subspace, and the subspace of the autoencoder with the smallest reconstruction error retains the most information about the sample, so quantizing the sample to the class represented by that autoencoder minimizes the quantization loss.
  • Finally, the third and second loss terms are used to train the preprocessing fully connected layer with all samples, so that as little of the original information as possible is lost during preprocessing. The two-stage training above is repeated until the maximum number of iterations is reached.
  • In step S103, the deep real-valued features of the image are binary-coded according to the quantization result to obtain the binary features of the image.
  • In one embodiment, binary-coding the deep real-valued features of the image according to the quantization result further includes: for each element of the deep real-valued feature, assigning to that element the binary index of the autoencoder that minimizes the reconstruction error of that element.
  • As shown in FIG. 1, after the network has been trained, each element of the real-valued feature of an input test sample is encoded as the binary index of the autoencoder that minimizes the reconstruction error of that element, which finally yields the binary features of the image. That is, once the quantization result is obtained, the preprocessed real-valued feature is binary-coded element by element in this way.
  • To avoid the large quantization error caused by binarizing real-valued image features with the sign function, as in traditional binary feature extraction, the embodiment uses K autoencoder networks to multi-quantize the projections of the real-valued image features onto subspaces and uses the multi-quantization result as the basis for binarization.
  • Key point 1: fine binarization by multi-quantization. Binarizing a real-valued function with the sign function or a hand-defined threshold ignores the information in the data set and causes a large quantization loss; multi-quantization clusters the data set and thereby achieves a data-adaptive, fine binarization.
  • Key point 2: multi-quantization with K autoencoder networks. The parameters of the autoencoders are trained by two-step iteration, which yields the final quantization result.
  • Key point 3: a framework for multi-quantization deep binary feature extraction, in which a deep network learns the real-valued features and a multi-quantization network binarizes them.
  • According to the method of the embodiment, real-valued features are first extracted from a picture with a deep network. For fine binarization, a multi-quantization method based on K autoencoder networks binarizes the real-valued features and minimizes the binarization loss of this step. To obtain more accurate binary features, a unified objective function is used to train the parameters of the whole network, so that accurate and efficient multi-quantization deep binary features are finally obtained for each picture.
  • Binarizing by multi-quantization, implemented with K autoencoder networks, effectively addresses the quantization error caused by binarization, improves the accuracy and efficiency of learning, is simpler, and better meets the needs of practical applications.
  • FIG. 4 is a schematic structural diagram of the multi-quantization deep binary feature learning apparatus of an embodiment of the present invention.
  • As shown in FIG. 4, the multi-quantization deep binary feature learning apparatus 10 includes an extraction module 100, a multi-quantization module 200, and an encoding module 300.
  • The extraction module 100 is configured to extract the deep real-valued features of the image.
  • The multi-quantization module 200 is configured to multi-quantize the deep real-valued features of the image through K autoencoder networks to obtain a quantization result.
  • The encoding module 300 is configured to binary-code the deep real-valued features of the image according to the quantization result to obtain the binary features of the image.
  • The apparatus 10 of the embodiment can effectively address the quantization error caused by binarization, improves the accuracy and efficiency of learning, is simpler, and better meets the needs of practical applications.
  • Further, in one embodiment, the extraction module 100 is specifically configured to forward-pass the original input image through a deep convolutional neural network to obtain the deep real-valued features of each image after the last fully connected layer of the network, and to pass the deep real-valued features of each image to a dimensionality-reducing fully connected layer to obtain low-dimensional real-valued features.
  • The overall loss function is: [formula image not reproduced]
  • J is the objective to be optimized.
  • X is the real-valued feature to be solved for.
  • Two further symbols (also images in the source) denote the reconstruction error of the real-valued feature of the n-th picture at the k-th autoencoder and the projection of the l-th layer of the k-th autoencoder; U is the mean vector of all real-valued features, and λ1 and λ2 are the weights of the different terms.
  • Further, in one embodiment, the multi-quantization module 200 is further configured to: reconstruct the deep real-valued features of the image with each of K autoencoders, where each training sample is assigned to the autoencoder with the smallest reconstruction error; for each autoencoder, train it according to the first and second loss terms using all samples assigned to it and, after several iterations, take for each sample the index of the autoencoder with the smallest reconstruction error as that sample's quantization result; and train the preprocessing fully connected layer with all samples using the third and second loss terms, repeating the training until the maximum number of iterations is reached.
  • Further, in one embodiment, the encoding module 300 is further configured to assign to each element of the deep real-valued feature the binary index of the autoencoder that minimizes the reconstruction error of that element.
  • According to the apparatus of the embodiment, real-valued features are first extracted from a picture with a deep network. For fine binarization, a multi-quantization method based on K autoencoder networks binarizes the real-valued features and minimizes the binarization loss of this step. To obtain more accurate binary features, a unified objective function is used to train the parameters of the whole network, so that accurate and efficient multi-quantization deep binary features are finally obtained for each picture.
  • This effectively addresses the quantization error caused by binarization, improves the accuracy and efficiency of learning, is simpler, and better meets the needs of practical applications.
  • The terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to.
  • Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature.
  • "A plurality" means at least two, for example two or three, unless specifically defined otherwise.
  • Unless explicitly stated and defined otherwise, the terms "mounted", "joined", "connected", "fixed" and the like shall be understood broadly: a connection may be fixed, detachable, or integral; mechanical or electrical; direct, or indirect through an intermediate medium; and it may be internal communication between two elements or an interaction between two elements.
  • Those of ordinary skill in the art can understand the specific meaning of the above terms in the present invention according to the specific circumstances.
  • Unless explicitly stated and defined otherwise, a first feature being "on" or "under" a second feature may mean that the two features are in direct contact or that they are in indirect contact through an intermediate medium.
  • A first feature being "over", "above", or "on top of" a second feature may mean that the first feature is directly above or obliquely above the second feature, or merely that the first feature is at a higher level than the second feature.
  • A first feature being "under", "below", or "beneath" a second feature may mean that the first feature is directly below or obliquely below the second feature, or merely that the first feature is at a lower level than the second feature.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

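A multi-quantization deep binary feature learning method and apparatus. The method includes: extracting deep real-valued features of an image (S101); multi-quantizing the deep real-valued features of the image through K autoencoder networks to obtain a quantization result (S102); and binary-coding the deep real-valued features of the image according to the quantization result to obtain binary features of the image (S103). The method can effectively address the quantization error caused by binarization, improves the accuracy and efficiency of learning, is simpler, and better meets the needs of practical applications.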

Description

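Multi-quantization deep binary feature learning method and apparatus
Cross-reference to related applications
This application claims priority to Chinese patent application No. 201710349641.1, entitled "Multi-quantization deep binary feature learning method and apparatus", filed by Tsinghua University on May 17, 2017.
Technical field
The present invention relates to the field of computer vision and machine learning, and in particular to a multi-quantization deep binary feature learning method and apparatus.
Background
Visual recognition is a basic problem in computer vision and is widely used in visual applications such as face recognition, object recognition, scene recognition, and texture recognition. As a classic pattern recognition problem, its main steps are feature extraction and feature matching. The goal of feature representation is to obtain a feature vector for each picture such that the feature vectors of pictures of the same class are more similar, while feature matching identifies the class of a picture from a similarity measure on the picture features. Because illumination, pose, background, viewing angle, and occlusion vary widely in natural environments, objects of the same class may have low similarity and objects of different classes may have high similarity, so obtaining accurate and efficient feature vectors is the most critical part of visual recognition.
Image feature extraction methods fall into two groups: hand-crafted feature extraction and feature learning. The bag-of-words model is a representative hand-crafted method. Its main steps are: 1) extract key points or key regions from the image; 2) extract local feature descriptors for the key points or key regions; 3) build a dictionary for the bag-of-words model; 4) pool the local feature descriptors and extract histogram features. Extracting key points or key regions and extracting feature descriptors are traditional problems in visual computing. Because local invariant features adapt well to interference such as occlusion, scale change, and illumination, in recent years they have gradually replaced global features as the mainstream method of image representation, and these two operations have become the most important stages of the bag-of-words model. Key point or key region extraction finds regions of the image that remain stable and repeatable as the image changes. Feature descriptor extraction then provides an efficient and robust description of the key points or key regions found. Local invariant feature detection methods are generally divided into corner detectors, blob detectors, and region detectors. Feature learning methods learn visual features from a training set by summarizing the regularities contained in the data.
At present, deep learning has achieved outstanding results in visual recognition. With the rapid development of the Internet, visual perception has entered the era of big data, in which data is large both in quantity and in dimensionality. Deep learning makes better use of visual big data to learn effective visual features, because it attends not only to global features but also to the local features that matter in image recognition, and it integrates local feature extraction into the neural network, thereby effectively completing the feature representation of the visual target.
Although deep learning has achieved excellent results in visual recognition, its computational cost is currently high, which is a bottleneck in practical applications. Binary feature learning is fast to compute, store, and match. Deep binary feature learning obtains high descriptive power at low computational cost, so it can be both accurate and efficient and meet the needs of practical applications. For example, DeepBit learns deep binary features in an unsupervised way and achieves excellent recognition rates on multiple data sets. However, existing binary feature learning methods all binarize with the sign function, which results in a large quantization loss.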
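To illustrate why binary features are fast to match, the sketch below compares binary codes by Hamming distance, computed with an XOR and a bit count. It is a minimal illustration only: storing codes as Python integers and the names `hamming_distance` and `nearest_code` are assumptions made for this example and do not come from the patent.

```python
from typing import Sequence


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def nearest_code(query: int, database: Sequence[int]) -> int:
    """Index of the database code closest to the query in Hamming distance."""
    return min(range(len(database)),
               key=lambda i: hamming_distance(query, database[i]))


# Hypothetical 8-bit codes; real descriptors would be longer.
codes = [0b10110010, 0b01101100, 0b10110110]
query = 0b10110011
print(nearest_code(query, codes))  # prints 0: codes[0] differs in one bit
```

Each comparison touches only a machine word or two, whereas comparing real-valued vectors needs one floating-point operation per dimension.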
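Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, one object of the present invention is to provide a multi-quantization deep binary feature learning method that improves the accuracy and efficiency of learning.
Another object of the present invention is to provide a multi-quantization deep binary feature learning apparatus.
To achieve the above objects, an embodiment of one aspect of the present invention provides a multi-quantization deep binary feature learning method, including the following steps: extracting deep real-valued features of an image; multi-quantizing the deep real-valued features of the image through K autoencoder networks to obtain a quantization result; and binary-coding the deep real-valued features of the image according to the quantization result to obtain binary features of the image.
The method of this embodiment binarizes by multi-quantization and implements the multi-quantization-based binarization with K autoencoder networks, which effectively addresses the quantization error caused by binarization, improves the accuracy and efficiency of learning, is simpler, and better meets the needs of practical applications.
In addition, the multi-quantization deep binary feature learning method according to the above embodiment may further have the following additional technical features:
Further, in one embodiment of the present invention, extracting the deep real-valued features of the image further includes: forward-passing the original input image through a deep convolutional neural network to obtain the deep real-valued features of each image after the last fully connected layer of the network; and passing the deep real-valued features of each image to a dimensionality-reducing fully connected layer to obtain low-dimensional real-valued features.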
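Further, in one embodiment of the present invention, the overall loss function is:
Figure PCTCN2017115622-appb-000001
where J is the objective to be optimized, X is the real-valued feature to be solved for,
Figure PCTCN2017115622-appb-000002
is the reconstruction error of the real-valued feature of the n-th picture at the k-th autoencoder,
Figure PCTCN2017115622-appb-000003
is the projection of the l-th layer of the k-th autoencoder, U is the mean vector of all real-valued features, and λ1 and λ2 are the weights of the different terms.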
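Further, in one embodiment of the present invention, multi-quantizing the deep real-valued features of the image through the K autoencoder networks further includes: reconstructing the deep real-valued features of the image with each of the K autoencoders, where each training sample is assigned to the autoencoder with the smallest reconstruction error; for each autoencoder, training it according to the first and second loss terms using all samples assigned to it and, after several iterations, taking for each sample the index of the autoencoder with the smallest reconstruction error as that sample's quantization result; and training the preprocessing fully connected layer with all samples using the third and second loss terms, repeating the training until the maximum number of iterations is reached.
Further, in one embodiment of the present invention, binary-coding the deep real-valued features of the image according to the quantization result further includes: for each element of the deep real-valued feature, assigning to that element the binary index of the autoencoder that minimizes the reconstruction error of that element.
To achieve the above objects, an embodiment of another aspect of the present invention provides a multi-quantization deep binary feature learning apparatus, including: an extraction module for extracting deep real-valued features of an image; a multi-quantization module for multi-quantizing the deep real-valued features of the image through K autoencoder networks to obtain a quantization result; and an encoding module for binary-coding the deep real-valued features of the image according to the quantization result to obtain binary features of the image.
The apparatus of this embodiment binarizes by multi-quantization and implements the multi-quantization-based binarization with K autoencoder networks, which effectively addresses the quantization error caused by binarization, improves the accuracy and efficiency of learning, is simpler, and better meets the needs of practical applications.
In addition, the multi-quantization deep binary feature learning apparatus according to the above embodiment may further have the following additional technical features:
Further, in one embodiment of the present invention, the extraction module is specifically configured to forward-pass the original input image through a deep convolutional neural network to obtain the deep real-valued features of each image after the last fully connected layer of the network, and to pass the deep real-valued features of each image to a dimensionality-reducing fully connected layer to obtain low-dimensional real-valued features.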
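Further, in one embodiment of the present invention, the overall loss function is:
Figure PCTCN2017115622-appb-000004
where J is the objective to be optimized, X is the real-valued feature to be solved for,
Figure PCTCN2017115622-appb-000005
is the reconstruction error of the real-valued feature of the n-th picture at the k-th autoencoder,
Figure PCTCN2017115622-appb-000006
is the projection of the l-th layer of the k-th autoencoder, U is the mean vector of all real-valued features, and λ1 and λ2 are the weights of the different terms.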
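Further, in one embodiment of the present invention, the multi-quantization module is further configured to: reconstruct the deep real-valued features of the image with each of K autoencoders, where each training sample is assigned to the autoencoder with the smallest reconstruction error; for each autoencoder, train it according to the first and second loss terms using all samples assigned to it and, after several iterations, take for each sample the index of the autoencoder with the smallest reconstruction error as that sample's quantization result; and train the preprocessing fully connected layer with all samples using the third and second loss terms, repeating the training until the maximum number of iterations is reached.
Further, in one embodiment of the present invention, the encoding module is further configured to assign to each element of the deep real-valued feature the binary index of the autoencoder that minimizes the reconstruction error of that element.
Additional aspects and advantages of the present invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.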
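Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flowchart of a multi-quantization deep binary feature learning method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a multi-quantization deep binary feature learning method according to one embodiment of the present invention;
FIG. 3 is a flowchart of a multi-quantization deep binary feature learning method according to a specific embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a multi-quantization deep binary feature learning apparatus according to an embodiment of the present invention.
Detailed description
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals throughout denote identical or similar elements or elements with identical or similar functions. The embodiments described below with reference to the drawings are exemplary and intended to explain the present invention; they are not to be construed as limiting it.
The multi-quantization deep binary feature learning method and apparatus proposed according to embodiments of the present invention are described below with reference to the drawings, beginning with the method.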
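FIG. 1 is a flowchart of the multi-quantization deep binary feature learning method of an embodiment of the present invention.
As shown in FIG. 1, the multi-quantization deep binary feature learning method includes the following steps:
In step S101, the deep real-valued features of the image are extracted.
In one embodiment of the present invention, extracting the deep real-valued features of the image further includes: forward-passing the original input image through a deep convolutional neural network to obtain the deep real-valued features of each image after the last fully connected layer of the network; and passing the deep real-valued features of each image to a dimensionality-reducing fully connected layer to obtain low-dimensional real-valued features.
For example, the original input image is forward-passed through the deep convolutional neural network, a 4096-dimensional deep real-valued feature is obtained for each image after the last fully connected layer, and this 4096-dimensional feature is then passed to a dimensionality-reducing fully connected layer to obtain a low-dimensional real-valued feature, which is the result of image preprocessing.
Note that the 4096-dimensional deep real-valued feature results from the use of a VGG network; those skilled in the art will understand that any deep network can be used to extract binary features in a similar way, and no specific limitation is imposed here.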
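The dimensionality-reducing fully connected layer described above is an affine map from the 4096-dimensional feature to a shorter vector. The sketch below shows only that map in plain Python. The output size of 32, the random untrained weights, and the names `fully_connected` and `random_layer` are assumptions made for illustration; the text does not state the target dimension, and in the described method the layer would be learned.

```python
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def fully_connected(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Affine map y = W x + b, with one row of `weights` per output dimension."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def random_layer(in_dim: int, out_dim: int, seed: int = 0) -> Tuple[Matrix, Vector]:
    """Untrained placeholder weights; a real layer would be learned."""
    rng = random.Random(seed)
    scale = 1.0 / in_dim ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


IN_DIM, OUT_DIM = 4096, 32  # 4096 is from the text; 32 is an assumed target size
rng = random.Random(1)
deep_feature = [rng.random() for _ in range(IN_DIM)]  # stand-in for a CNN output
weights, bias = random_layer(IN_DIM, OUT_DIM)
low_dim_feature = fully_connected(deep_feature, weights, bias)
print(len(low_dim_feature))  # prints 32
```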
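As shown in FIG. 2, the embodiment first feeds the preprocessed real-valued image features into K autoencoders connected in parallel, so that, under the objective of optimizing the loss function, each input real-valued feature trains the autoencoder that minimizes its overall reconstruction error.
In step S102, the deep real-valued features of the image are multi-quantized through K autoencoder networks to obtain a quantization result.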
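In one embodiment of the present invention, the overall loss function is:
Figure PCTCN2017115622-appb-000007
where J is the objective to be optimized, X is the real-valued feature to be solved for,
Figure PCTCN2017115622-appb-000008
is the reconstruction error of the real-valued feature of the n-th picture at the k-th autoencoder,
Figure PCTCN2017115622-appb-000009
is the projection of the l-th layer of the k-th autoencoder, U is the mean vector of all real-valued features, and λ1 and λ2 are the weights of the different terms.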
进一步地,在本发明的一个实施例中,通过K个自编码网络对图像的深度实值特征进行多量化,进一步包括:通过K个自编码器对图像的深度实值特征分别进行重构,其中,令每个训练样本属于重构误差最小的编码器;根据第一项损失函数和第二项损失函数,对于每个自编码器,使用属于该自编码器的全部样本对其进行训练,以迭代多步后,对于每个样本使用重构误差最小的编码器的编号作为该样本的量化结果;利用第三项损失函数和第二项损失函数用所有样本预处理的全连接层,并且反复迭代训练,直到最大迭代次数。
具体而言,如图3所示,由于本发明实施例是基于非监督学习的特征提取方法,所以引入损失函数来对网络进行训练。本发明实施例用K个自编码网络对于实值特征进行多量化,并以此作为二值化的依据。该多量化方法应具有如下性质:
1、重构误差最小
用自编码器对预处理的实值特征进行重构。自编码器最短的层提取的特征,是原始特征降维到子空间的结果。若通过自编码器最短层的特征重构预处理实值特征产生的误差越小,则在该子空间投影能保留原始数据越多的信息。
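To make the link between reconstruction error and retained information concrete, the sketch below compares two hypothetical one-dimensional bottlenecks on the same feature. It assumes a linear autoencoder with tied weights (the decoder is the transpose of the encoder), which is a simplification chosen for illustration and not something the source specifies; all names and values are illustrative.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def encode(x: Vector, w: Matrix) -> Vector:
    """Project x onto the bottleneck: code[j] = sum_i w[j][i] * x[i]."""
    return [sum(wji * xi for wji, xi in zip(row, x)) for row in w]


def decode(code: Vector, w: Matrix) -> Vector:
    """Map the code back to the input space with the transposed (tied) weights."""
    dim = len(w[0])
    return [sum(code[j] * w[j][i] for j in range(len(w))) for i in range(dim)]


def reconstruction_error(x: Vector, w: Matrix) -> float:
    """Squared Euclidean distance between x and its reconstruction."""
    x_hat = decode(encode(x, w), w)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


# A 3-d feature and two hypothetical 1-d bottlenecks with unit-norm rows.
x = [3.0, 4.0, 0.0]
w_first_axis = [[1.0, 0.0, 0.0]]   # keeps only the first coordinate
w_aligned = [[0.6, 0.8, 0.0]]      # points along x itself

err_first_axis = reconstruction_error(x, w_first_axis)
err_aligned = reconstruction_error(x, w_aligned)
```

The bottleneck aligned with the feature reconstructs it almost exactly, while the axis-aligned one loses the second coordinate, so the aligned subspace is the one that retains more of the original information.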
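2. Prevention of overfitting
Because the number of samples is limited, the network must be kept from learning only local characteristics of the samples. The present invention introduces a penalty term that keeps the coefficients sparse and thereby prevents the network from overfitting.
3. Maximum variance of the preprocessed real-valued features
Increasing the variance helps increase the amount of information the data carries. The dimension reduction in preprocessing should preserve as much of the information in the original high-dimensional data as possible, so a constraint is introduced that maximizes the variance of the preprocessed features.
In summary, the overall loss function is:
Figure PCTCN2017115622-appb-000010
where J is the objective to be optimized, X is the real-valued feature to be solved for,
Figure PCTCN2017115622-appb-000011
is the reconstruction error of the real-valued feature of the n-th image at the k-th autoencoder,
Figure PCTCN2017115622-appb-000012
is the projection of the l-th layer of the k-th autoencoder, U is the mean vector of all real-valued features, and λ1 and λ2 are the weights of the different terms.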
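The loss function itself survives in this text only as a figure reference. As a hedged illustration, and not a transcription of that figure, one objective consistent with the three properties and the symbol definitions above could be written as follows. The symbols ε_{nk} (reconstruction error) and W_k^l (layer projection) are labels assumed here for the quantities shown only as figures, and the choice of norm, the min over k, and the absence of normalizing constants are all assumptions.

```latex
\min \; J \;=\;
\underbrace{\sum_{n=1}^{N} \min_{1 \le k \le K} \varepsilon_{nk}}_{\text{term 1: reconstruction error}}
\;+\; \lambda_1 \underbrace{\sum_{k=1}^{K} \sum_{l} \bigl\lVert W_k^{l} \bigr\rVert_F^{2}}_{\text{term 2: overfitting penalty}}
\;-\; \lambda_2 \underbrace{\operatorname{tr}\!\bigl( (X - U)^{\top} (X - U) \bigr)}_{\text{term 3: feature variance}}
```

The third term enters with a negative sign because the variance is to be maximized while J is minimized. The text speaks of keeping the coefficients sparse, so the published penalty may use a different norm than the Frobenius norm assumed here.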
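Further, the embodiment of the present invention trains the multi-quantization based on K autoencoder networks in a two-step iterative manner:
Step 1: the preprocessed real-valued features are reconstructed with each of the K autoencoders, and each training sample is assigned to the autoencoder with the smallest reconstruction error.
Step 2: according to the first and second terms of the loss function, each autoencoder is trained on all samples assigned to it. After a number of iterations, the index of the autoencoder with the smallest reconstruction error is used as the quantization result of each sample. Each autoencoder projects the sample onto a different subspace, and the subspace of the autoencoder with the smallest reconstruction error is the one in which the sample's projection carries the most information. Quantizing the sample to the class represented by that autoencoder therefore minimizes the quantization loss.
Finally, the preprocessing fully connected layer is trained on all samples using the third and second terms of the loss function, so as to reduce as far as possible the loss of original information during preprocessing. The two training stages above are iterated repeatedly until the maximum number of iterations is reached.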
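The alternation between assignment and per-autoencoder training can be sketched with a deliberately simplified stand-in. Here each "autoencoder" is a single unit vector (a one-dimensional tied linear bottleneck) and "training" is a power iteration on the scatter matrix of its assigned samples, in place of gradient training of a deep autoencoder. The regularization term and the final stage that trains the preprocessing layer are omitted, and an early stop on unchanged assignments is added for convenience. All names and data are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def recon_error(x: Vector, w: Vector) -> float:
    """Squared error of reconstructing x from its 1-d code along unit vector w."""
    c = dot(x, w)
    return sum((xi - c * wi) ** 2 for xi, wi in zip(x, w))


def assign(samples: List[Vector], encoders: List[Vector]) -> List[int]:
    """Step 1: give each sample the index of the encoder that reconstructs it best."""
    return [min(range(len(encoders)), key=lambda k: recon_error(x, encoders[k]))
            for x in samples]


def refit(members: List[Vector], w: Vector, steps: int = 20) -> Vector:
    """Step 2 stand-in: power iteration toward the members' principal direction."""
    for _ in range(steps):
        new = [0.0] * len(w)
        for x in members:
            c = dot(x, w)
            for i, xi in enumerate(x):
                new[i] += c * xi
        norm = math.sqrt(dot(new, new))
        if norm == 0.0:
            break  # members are orthogonal to w; keep the previous direction
        w = [v / norm for v in new]
    return w


def train(samples: List[Vector], encoders: List[Vector],
          max_iters: int = 10) -> Tuple[List[int], List[Vector]]:
    labels = assign(samples, encoders)
    for _ in range(max_iters):
        for k in range(len(encoders)):
            members = [x for x, lab in zip(samples, labels) if lab == k]
            if members:
                encoders[k] = refit(members, encoders[k])
        new_labels = assign(samples, encoders)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, encoders


# Toy data: three samples near the first axis, three near the second.
samples = [[2.0, 0.1], [3.0, -0.1], [-1.0, 0.05],
           [0.1, 2.0], [-0.1, 3.0], [0.05, -1.5]]
n1 = math.sqrt(1.0 + 0.25)
n2 = math.sqrt(0.09 + 1.0)
initial = [[1.0 / n1, 0.5 / n1], [0.3 / n2, 1.0 / n2]]
labels, encoders = train(samples, initial)
```

The returned `labels` play the role of the quantization result: each sample is labelled with the index of the subspace that reconstructs it with the smallest error.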
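In step S103, the deep real-valued feature of the image is binary-encoded according to the quantization result to obtain the binary feature of the image.
In an embodiment of the present invention, binary-encoding the deep real-valued feature of the image according to the quantization result further includes: for each element of the deep real-valued feature, assigning to that element the binary index of the autoencoder that minimizes the reconstruction error of that element.
It will be understood that, as shown in Fig. 1, once the network has been trained, each element of the real-valued feature of an input test sample is encoded as the binary index of the autoencoder that minimizes the reconstruction error of that element, which yields the binary feature of the image. In other words, after the quantization result is obtained, the preprocessed real-valued feature is binary-encoded element by element, each element being assigned the binary index of the autoencoder that reconstructs it with the smallest error.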
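The element-wise encoding rule can be sketched as follows, assuming the K reconstructions of a feature are already available. The function name `binary_code`, the fixed-width bit layout, and the toy reconstructions are assumptions made for illustration; the source only states that each element receives the binary index of its best autoencoder.

```python
from typing import List

Vector = List[float]


def binary_code(feature: Vector, reconstructions: List[Vector]) -> List[int]:
    """For each element, emit the binary index of the best-reconstructing autoencoder."""
    k_total = len(reconstructions)
    # Bits needed to write indices 0..K-1 (at least one bit).
    bits_per_element = max(1, (k_total - 1).bit_length())
    code: List[int] = []
    for d, value in enumerate(feature):
        best = min(range(k_total),
                   key=lambda k: (value - reconstructions[k][d]) ** 2)
        code.extend(int(bit) for bit in format(best, "0{}b".format(bits_per_element)))
    return code


# One 3-d feature and its reconstructions by K = 4 hypothetical autoencoders.
feature = [0.9, -0.2, 0.4]
reconstructions = [
    [0.0, 0.0, 0.0],     # autoencoder 0
    [1.0, -1.0, 0.5],    # autoencoder 1: closest on element 0
    [0.5, -0.25, 1.0],   # autoencoder 2: closest on element 1
    [2.0, 0.3, 0.35],    # autoencoder 3: closest on element 2
]
code = binary_code(feature, reconstructions)
```

With K = 4 each element contributes two bits, so the three-element feature becomes a six-bit code (indices 1, 2, 3 written as 01, 10, 11).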
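In the embodiments of the present invention, to avoid the large quantization error that arises when conventional binary feature extraction techniques binarize real-valued image features with the sign function, K autoencoder networks are used to multi-quantize the projections of the real-valued image features onto subspaces, and the multi-quantization result is used as the basis for binarization.
Specifically, the key points of the embodiments of the present invention are as follows. Key point 1: fine-grained binarization by multi-quantization. Binarizing real-valued features with the sign function or with hand-defined thresholds ignores the information in the data set and leads to a large quantization loss, whereas multi-quantization clusters the data set and thereby achieves a fine-grained, data-adaptive binarization. Key point 2: multi-quantization with K autoencoder networks. The parameters of the autoencoders are trained in a two-step iterative manner, which yields the final quantization result. Key point 3: a framework for multi-quantization deep binary feature extraction. A deep network learns the real-valued features and a multi-quantization network binarizes them, which realizes multi-quantization deep binary feature learning.
According to the multi-quantization deep binary feature learning method proposed in the embodiments of the present invention, a deep network is first used to extract real-valued features from an image. To make the binarization fine-grained, a multi-quantization method based on K autoencoder networks is used to binarize the real-valued features, minimizing the binarization loss this step introduces. To obtain more accurate binary features, a unified optimization function is used to train the parameters of the entire network, so that an accurate and efficient multi-quantization deep binary feature is obtained for each image. Performing binarization through multi-quantization, implemented with the K autoencoder networks, effectively addresses the quantization error caused by binarization, improves the accuracy and efficiency of learning, is simpler, and better meets the needs of practical applications.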
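Next, the multi-quantization deep binary feature learning device proposed according to the embodiments of the present invention is described with reference to the drawings.
Fig. 4 is a schematic structural diagram of the multi-quantization deep binary feature learning device of an embodiment of the present invention.
As shown in Fig. 4, the multi-quantization deep binary feature learning device 10 includes an extraction module 100, a multi-quantization module 200, and an encoding module 300.
The extraction module 100 is configured to extract a deep real-valued feature of an image. The multi-quantization module 200 is configured to multi-quantize the deep real-valued feature of the image by K autoencoder networks to obtain a quantization result. The encoding module 300 is configured to binary-encode the deep real-valued feature of the image according to the quantization result to obtain the binary feature of the image. The device 10 of the embodiment of the present invention effectively addresses the quantization error caused by binarization, improves the accuracy and efficiency of learning, is simpler, and better meets the needs of practical applications.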
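The three-module structure can be pictured as a simple composition of callables. The class name `BinaryFeatureDevice` and the three stub functions below are placeholders invented for this sketch; they only show how data flows from module 100 to module 300 and do not implement the learned components.

```python
from typing import Callable, List

Image = List[List[float]]
Feature = List[float]


class BinaryFeatureDevice:
    """Chains an extraction, a multi-quantization, and an encoding module."""

    def __init__(self,
                 extraction_module: Callable[[Image], Feature],
                 multi_quantization_module: Callable[[Feature], List[int]],
                 encoding_module: Callable[[Feature, List[int]], List[int]]) -> None:
        self.extraction_module = extraction_module
        self.multi_quantization_module = multi_quantization_module
        self.encoding_module = encoding_module

    def run(self, image: Image) -> List[int]:
        feature = self.extraction_module(image)
        quantization = self.multi_quantization_module(feature)
        return self.encoding_module(feature, quantization)


def toy_extract(image: Image) -> Feature:
    """Stub for module 100: one value per image row (the row mean)."""
    return [sum(row) / len(row) for row in image]


def toy_quantize(feature: Feature) -> List[int]:
    """Stub for module 200: index of the nearer of two fixed levels, 0.0 and 1.0."""
    levels = [0.0, 1.0]
    return [min(range(len(levels)), key=lambda k: abs(v - levels[k])) for v in feature]


def toy_encode(feature: Feature, quantization: List[int]) -> List[int]:
    """Stub for module 300: with two levels, each index is already a single bit."""
    return list(quantization)


device = BinaryFeatureDevice(toy_extract, toy_quantize, toy_encode)
binary_feature = device.run([[0.1, 0.3], [0.9, 0.7]])
```

Keeping the modules as separate callables mirrors the text's statement that any deep network may serve as the extractor.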
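Further, in an embodiment of the present invention, the extraction module 100 is specifically configured to forward the original input image through a deep convolutional neural network, so that the deep real-valued feature of each image is obtained after the last fully connected layer of the network, and to pass the deep real-valued feature of each image into a dimension-reducing fully connected layer to obtain a low-dimensional real-valued feature.
Further, in an embodiment of the present invention, the overall loss function is:
Figure PCTCN2017115622-appb-000013
where J is the objective to be optimized, X is the real-valued feature to be solved for,
Figure PCTCN2017115622-appb-000014
is the reconstruction error of the real-valued feature of the n-th image at the k-th autoencoder,
Figure PCTCN2017115622-appb-000015
is the projection of the l-th layer of the k-th autoencoder, U is the mean vector of all real-valued features, and λ1 and λ2 are the weights of the different terms.
Further, in an embodiment of the present invention, the multi-quantization module 200 is further configured to reconstruct the deep real-valued feature of the image with each of the K autoencoders, where each training sample is assigned to the autoencoder with the smallest reconstruction error; to train each autoencoder, according to the first and second terms of the loss function, on all samples assigned to it, so that after a number of iterations the index of the autoencoder with the smallest reconstruction error is used as the quantization result of each sample; and to train the preprocessing fully connected layer on all samples using the third and second terms of the loss function, repeating this iterative training until the maximum number of iterations is reached.
Further, in an embodiment of the present invention, the encoding module 300 is further configured, for each element of the deep real-valued feature, to assign to that element the binary index of the autoencoder that minimizes the reconstruction error of that element.
It should be noted that the foregoing explanation of the method embodiments also applies to the multi-quantization deep binary feature learning device of this embodiment and is not repeated here.
According to the multi-quantization deep binary feature learning device proposed in the embodiments of the present invention, a deep network is first used to extract real-valued features from an image. To make the binarization fine-grained, a multi-quantization method based on K autoencoder networks is used to binarize the real-valued features, minimizing the binarization loss this step introduces. To obtain more accurate binary features, a unified optimization function is used to train the parameters of the entire network, so that an accurate and efficient multi-quantization deep binary feature is obtained for each image. Performing binarization through multi-quantization, implemented with the K autoencoder networks, effectively addresses the quantization error caused by binarization, improves the accuracy and efficiency of learning, is simpler, and better meets the needs of practical applications.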
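In the description of the present invention, it should be understood that orientations or positional relationships indicated by terms such as "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial", and "circumferential" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they are therefore not to be construed as limiting the present invention.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to. A feature qualified by "first" or "second" may thus explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality of" means at least two, for example two or three, unless otherwise expressly and specifically defined.
In the present invention, unless otherwise expressly specified and defined, terms such as "mounted", "connected", "coupled", and "fixed" are to be understood broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium; and it may be an internal communication between two elements or an interaction between two elements, unless otherwise expressly defined. Those of ordinary skill in the art can understand the specific meanings of these terms in the present invention according to the specific circumstances.
In the present invention, unless otherwise expressly specified and defined, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact or that they are in indirect contact through an intermediate medium. Moreover, a first feature being "over", "above", or "on top of" a second feature may mean that the first feature is directly above or obliquely above the second feature, or merely that the first feature is at a higher level than the second feature. A first feature being "beneath", "below", or "underneath" a second feature may mean that the first feature is directly below or obliquely below the second feature, or merely that the first feature is at a lower level than the second feature.
In this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Schematic uses of these terms in this specification do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, where they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
Although embodiments of the present invention have been shown and described above, it will be understood that these embodiments are exemplary and are not to be construed as limiting the present invention. Those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.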

Claims (10)

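  1. A multi-quantization deep binary feature learning method, characterized by comprising the following steps:
    extracting a deep real-valued feature of an image;
    multi-quantizing the deep real-valued feature of the image by K autoencoder networks to obtain a quantization result; and
    binary-encoding the deep real-valued feature of the image according to the quantization result to obtain a binary feature of the image.
  2. The multi-quantization deep binary feature learning method according to claim 1, characterized in that extracting the deep real-valued feature of the image further comprises:
    forwarding an original input image through a deep convolutional neural network, so that the deep real-valued feature of each image is obtained after the last fully connected layer of the network;
    passing the deep real-valued feature of each image into a dimension-reducing fully connected layer to obtain a low-dimensional real-valued feature.
  3. The multi-quantization deep binary feature learning method according to claim 1, characterized in that the overall loss function is:
    Figure PCTCN2017115622-appb-100001
    where J is the objective to be optimized, X is the real-valued feature to be solved for,
    Figure PCTCN2017115622-appb-100002
    is the reconstruction error of the real-valued feature of the n-th image at the k-th autoencoder,
    Figure PCTCN2017115622-appb-100003
    is the projection of the l-th layer of the k-th autoencoder, U is the mean vector of all real-valued features, and λ1 and λ2 are the weights of the different terms.
  4. The multi-quantization deep binary feature learning method according to claim 1, characterized in that multi-quantizing the deep real-valued feature of the image by K autoencoder networks further comprises:
    reconstructing the deep real-valued feature of the image with each of the K autoencoders, wherein each training sample is assigned to the autoencoder with the smallest reconstruction error;
    training each autoencoder, according to the first and second terms of the loss function, on all samples assigned to it, so that after a number of iterations the index of the autoencoder with the smallest reconstruction error is used as the quantization result of each sample;
    training the preprocessing fully connected layer on all samples using the third term and the second term of the loss function, and repeating the iterative training until the maximum number of iterations is reached.
  5. The multi-quantization deep binary feature learning method according to claim 4, characterized in that binary-encoding the deep real-valued feature of the image according to the quantization result further comprises:
    for each element of the deep real-valued feature, assigning to that element the binary index of the autoencoder that minimizes the reconstruction error of that element.
  6. A multi-quantization deep binary feature learning device, characterized by comprising:
    an extraction module configured to extract a deep real-valued feature of an image;
    a multi-quantization module configured to multi-quantize the deep real-valued feature of the image by K autoencoder networks to obtain a quantization result; and
    an encoding module configured to binary-encode the deep real-valued feature of the image according to the quantization result to obtain a binary feature of the image.
  7. The multi-quantization deep binary feature learning device according to claim 6, characterized in that the extraction module is specifically configured to forward an original input image through a deep convolutional neural network, so that the deep real-valued feature of each image is obtained after the last fully connected layer of the network, and to pass the deep real-valued feature of each image into a dimension-reducing fully connected layer to obtain a low-dimensional real-valued feature.
  8. The multi-quantization deep binary feature learning device according to claim 6, characterized in that the overall loss function is:
    Figure PCTCN2017115622-appb-100004
    where J is the objective to be optimized, X is the real-valued feature to be solved for,
    Figure PCTCN2017115622-appb-100005
    is the reconstruction error of the real-valued feature of the n-th image at the k-th autoencoder,
    Figure PCTCN2017115622-appb-100006
    is the projection of the l-th layer of the k-th autoencoder, U is the mean vector of all real-valued features, and λ1 and λ2 are the weights of the different terms.
  9. The multi-quantization deep binary feature learning device according to claim 6, characterized in that the multi-quantization module is further configured to reconstruct the deep real-valued feature of the image with each of the K autoencoders, wherein each training sample is assigned to the autoencoder with the smallest reconstruction error; to train each autoencoder, according to the first and second terms of the loss function, on all samples assigned to it, so that after a number of iterations the index of the autoencoder with the smallest reconstruction error is used as the quantization result of each sample; and to train the preprocessing fully connected layer on all samples using the third term and the second term of the loss function, repeating the iterative training until the maximum number of iterations is reached.
  10. The multi-quantization deep binary feature learning device according to claim 9, characterized in that the encoding module is further configured, for each element of the deep real-valued feature, to assign to that element the binary index of the autoencoder that minimizes the reconstruction error of that element.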
PCT/CN2017/115622 2017-05-17 2017-12-12 Multi-quantization deep binary feature learning method and device WO2018209932A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710349641.1A CN107239793B (zh) 2017-05-17 2017-05-17 Multi-quantization deep binary feature learning method and device
CN201710349641.1 2017-05-17

Publications (1)

Publication Number Publication Date
WO2018209932A1 true WO2018209932A1 (zh) 2018-11-22

Family

ID=59984523

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/115622 WO2018209932A1 (zh) 2017-05-17 2017-12-12 Multi-quantization deep binary feature learning method and device

Country Status (2)

Country Link
CN (1) CN107239793B (zh)
WO (1) WO2018209932A1 (zh)

Also Published As

Publication number Publication date
CN107239793B (zh) 2020-01-17
CN107239793A (zh) 2017-10-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17909917

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17909917

Country of ref document: EP

Kind code of ref document: A1