CN111695508B - Multi-scale Retinex and gesture image retrieval method based on improved VGGNet network - Google Patents


Info

Publication number
CN111695508B
CN111695508B (application CN202010532767.4A)
Authority
CN
China
Prior art keywords: image, gesture, features, feature, gesture image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010532767.4A
Other languages
Chinese (zh)
Other versions
CN111695508A (en)
Inventor
谢武
贾清玉
刘满意
强保华
崔梦银
瞿元昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology filed Critical Guilin University of Electronic Technology
Priority to CN202010532767.4A priority Critical patent/CN111695508B/en
Publication of CN111695508A publication Critical patent/CN111695508A/en
Application granted granted Critical
Publication of CN111695508B publication Critical patent/CN111695508B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107Static hand or arm
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Human Computer Interaction (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Library & Information Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a gesture image retrieval method based on multi-scale Retinex and an improved VGGNet network. After the gesture image retrieval model is trained, the features extracted by the last FC layer are used as the image representation for the gesture image retrieval task. A hash layer is introduced into the improved multi-branch VGGNet structure: the model takes a gesture image and a category label as input, the category label serves as supervision information for learning image features, and each branch learns different label information. The features learned by the first two branches are fused through a fully connected layer to obtain a nonlinear combined feature, a low-dimensional hash feature is then obtained through the hash layer and binarized into a hash code, and finally the binary hash code is used as the feature vector for gesture retrieval. The method improves the efficiency of gesture retrieval while maintaining accuracy.

Description

Multi-scale Retinex and gesture image retrieval method based on improved VGGNet network
Technical Field
The invention relates to a gesture image retrieval method, in particular to a gesture image retrieval method based on multi-scale Retinex and an improved VGGNet network.
Background
In recent years, deep learning has achieved results in computer vision that surpass traditional methods, and it has become one of the most popular research directions. When high-dimensional feature vectors are used for gesture feature similarity calculation, the penultimate fully connected layer of VGGNet yields features of up to 4096 dimensions, stored as floating-point numbers; this greatly increases similarity-matching time during gesture image retrieval and leads to a poor user experience.
Disclosure of Invention
The invention aims to provide a gesture image retrieval method based on multi-scale Retinex and an improved VGGNet network, addressing the slow similarity calculation of high-dimensional gesture feature vectors in the prior art.
The technical scheme for realizing the purpose of the invention is as follows:
a gesture image retrieval method based on multi-scale Retinex and improved VGGNet network comprises the following steps:
(1) picture preprocessing: performing dim-light enhancement on the gesture image with a multi-scale Retinex algorithm, normalizing the enhanced gesture image, and converting it into the data input format required by the CNN (convolutional neural network) model;
(2) feature extraction: performing a series of convolution, pooling and fully connected operations on the gesture image with a trained CNN model to extract its features; preprocessing the gesture data set and labeling and integrating the labels; constructing, initializing and training a VGGNet-based network structure; taking the features extracted by the last FC layer of the trained gesture model as the image representation for the gesture image retrieval task; introducing a hash layer, fusing the features through the fully connected layer to obtain nonlinear combined features, then obtaining a binary hash code through the hash layer; performing gesture retrieval with the binary hash code as the feature vector and constructing a feature database;
(3) similarity matching: and acquiring an image list from the feature database, and matching features similar to the query picture.
The method by which the gesture picture is normalized into the data input format required by the CNN model in step (1) comprises the following steps:
1) inputting an original image I (x, y);
2) estimating the noise of each position and removing it; assuming that the image I seen by the human eye is the product of the image illumination component L and the reflectance component R, as expressed in formula 1:
I(x,y)=R(x,y)·L(x,y) (1)
3) separating the three color channel components and converting them to the logarithmic domain; the illumination L is estimated from the captured picture I, preserving the intrinsic object attribute R and eliminating interference from uneven illumination; taking the logarithm of both sides of formula 1 and letting i(x, y) = log(I(x, y)), r(x, y) = log(R(x, y)) and l(x, y) = log(L(x, y)) gives formula 2:
i(x,y)=r(x,y)+l(x,y) (2)
4) setting the number and size of Gaussian function scales;
5) the Gaussian function filters the three channels of the image; the filtered result is the illumination component, from which the image r(x, y) is obtained, and the reflection component is computed as follows:
r_i(x, y) = i_i(x, y) − i_i(x, y) * G(x, y)   (3)

G(x, y) = (1/(2πσ²)) · exp(−(x² + y²)/(2σ²))   (4)

where i_i(x, y) denotes the original image of the i-th channel, G(x, y) is the Gaussian filter function, r_i(x, y) denotes the reflection component of the i-th channel, * denotes convolution, and σ is the scale parameter.
The data enhancement of the gesture image in the step (1) is carried out, and the adopted method comprises the following steps:
1) for a gesture image, filtering its three channels with Gaussian filter functions of several scales and taking the weighted average of the reflection components at each scale as the final output, so that formula (3) becomes:
r_i(x, y) = Σ_{k=1}^{N} w_k [ i_i(x, y) − i_i(x, y) * G_k(x, y) ]   (5)
where G_k(x, y) denotes the k-th Gaussian filter function and N the number of Gaussian filter functions; experiments show that the gesture image data is enhanced most effectively when N = 3; w_k is the weight of the k-th scale, and the weights of the N Gaussian filter functions satisfy the constraint:
Σ_{k=1}^{N} w_k = 1   (6)
2) converting R (x, y) from a logarithmic domain to a real domain to obtain R (x, y);
3) performing linear correction on R(x, y) (since its range does not lie within 0-255); the corrected result is the enhanced gesture image.
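The enhancement steps above can be sketched as follows. This is a minimal NumPy/SciPy sketch, not the patent's exact implementation: the scale values (15, 80, 250) and the equal weights w_k = 1/N are illustrative assumptions, and the final linear correction is implemented as a simple min-max stretch into 0-255.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multi_scale_retinex(img, sigmas=(15, 80, 250), weights=None):
    """Multi-scale Retinex dim-light enhancement sketch.

    img: H x W x 3 array of intensities.
    sigmas: illustrative Gaussian scales (N = 3, as the text suggests).
    weights: per-scale weights w_k; must satisfy sum(w_k) = 1.
    """
    sigmas = np.asarray(sigmas, dtype=float)
    if weights is None:
        weights = np.full(len(sigmas), 1.0 / len(sigmas))
    weights = np.asarray(weights, dtype=float)
    assert np.isclose(weights.sum(), 1.0)          # constraint (6): sum w_k = 1

    img = img.astype(np.float64) + 1.0             # avoid log(0)
    log_img = np.log(img)                          # i(x, y) = log I(x, y)

    r = np.zeros_like(log_img)
    for w, sigma in zip(weights, sigmas):
        for c in range(img.shape[2]):              # filter each color channel
            # illumination estimate l = log(I * G_k); reflection r = i - l
            blurred = gaussian_filter(img[..., c], sigma)
            r[..., c] += w * (log_img[..., c] - np.log(blurred))

    # linear correction: stretch r back into the displayable 0-255 range
    r_min, r_max = r.min(), r.max()
    out = (r - r_min) / (r_max - r_min + 1e-12) * 255.0
    return out.astype(np.uint8)
```

A dark gesture photo passed through this function comes back with its reflectance detail stretched over the full intensity range, which is what makes the subsequent CNN features more stable under uneven lighting.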
The feature extraction of the gesture image in step (2) covers two aspects: extracting features of the query picture uploaded by the user, and extracting features of the picture database to construct an image feature database. The feature extraction method comprises the following steps:
(1) data preprocessing: preprocessing the gesture data set and labeling and integrating the labels; the preprocessing includes data enhancement, data normalization, etc.;
(2) constructing a VGGNet-based network structure: training with the VGGNet16 network model; defining and initializing the VGGNet16 structure and setting the learning rate lr, the batch size, the number of epochs, etc.;
(3) training a model: training and verifying the model alternately;
(4) taking the features extracted from the last FC layer of the gesture model trained in step (3) as the image representation for the gesture image retrieval task; the input is a gesture image and a category label, the category label serves as supervision information for learning image features, and each branch learns different label information; the features learned by the first two branches are fused through a fully connected layer to obtain a nonlinear combined feature, a low-dimensional hash feature is obtained through the hash layer and binarized into a hash code, and finally the binary hash code is used as the feature vector for gesture retrieval;
(5) saving the model file;
(6) randomly selecting 100 pictures from the test set as query pictures, using the remaining pictures as the image database, selecting the model with the best classification performance as the feature extractor, and constructing the feature database.
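Building the feature database from hash-layer outputs can be sketched as follows. This is a hypothetical NumPy sketch: `fake_hash_features` stands in for the trained model's hash-layer output, the 48-bit code length is an illustrative assumption, and thresholding the activations at 0.5 is one common binarization choice the patent text does not fix.

```python
import numpy as np

rng = np.random.default_rng(42)

def binarize(hash_activations, threshold=0.5):
    """Turn real-valued hash-layer outputs in (0, 1) into a 0/1 code."""
    return (np.asarray(hash_activations) >= threshold).astype(np.uint8)

def build_feature_database(images, extract_hash_features):
    """Map image id -> binary hash code for every database picture."""
    return {i: binarize(extract_hash_features(img))
            for i, img in enumerate(images)}

def fake_hash_features(img):
    # stand-in for the trained model's 48-unit hash layer (assumption)
    return rng.random(48)

database = build_feature_database([None] * 5, fake_hash_features)
```

At query time the same extractor-plus-binarize path is applied to the uploaded picture, so query and database codes live in the same Hamming space.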
The invention has the following beneficial effects: after the gesture image data is enhanced with the multi-scale Retinex method, the model is trained with deep learning; once the gesture image retrieval model is trained, the features extracted from the last FC layer serve as the image representation for the gesture image retrieval task, and a hash layer is introduced into the improved multi-branch VGGNet structure, so that gesture retrieval efficiency is improved while accuracy is maintained.
Drawings
Fig. 1 is a flow chart of an improved VGGNet network according to an embodiment of the present invention;
FIG. 2 is a flow chart of the calculation of the reflection component according to the embodiment of the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
The embodiment is as follows:
a gesture image retrieval method based on multi-scale Retinex and an improved VGGNet network comprises the following steps:
1. a user uploads a picture needing gesture inquiry;
2. picture preprocessing: dim-light enhancement is performed on the gesture image with the multi-scale Retinex algorithm, and the enhanced gesture image is normalized and converted into the data input format required by the CNN (convolutional neural network) model; the specific steps are as follows:
(1) inputting an original image I (x, y);
(2) estimating the noise of each position and removing it. Assuming that the image I seen by the human eye is the product of the image illumination component L and the reflectance component R, as expressed in formula 1:
I(x,y)=R(x,y)·L(x,y) (1)
(3) separating the three color channel components and converting them to the logarithmic domain. The illumination L is estimated from the captured picture I, preserving the intrinsic object attribute R, eliminating interference from uneven illumination, and improving the perceptual quality of the image. For convenience of calculation, the logarithm of both sides of formula 1 is taken, and letting i(x, y) = log(I(x, y)), r(x, y) = log(R(x, y)) and l(x, y) = log(L(x, y)) gives formula 2:
i(x,y)=r(x,y)+l(x,y) (2)
the calculation process of the reflection component is shown in fig. 2;
(4) setting the number and size of Gaussian function scales;
(5) the Gaussian function filters the three channels of the image; the filtered result is the illumination component, from which the image r(x, y) is obtained, and the reflection component is computed as:
r_i(x, y) = i_i(x, y) − i_i(x, y) * G(x, y)   (3)

G(x, y) = (1/(2πσ²)) · exp(−(x² + y²)/(2σ²))   (4)

where i_i(x, y) denotes the original image of the i-th channel, G(x, y) is the Gaussian filter function, r_i(x, y) denotes the reflection component of the i-th channel, * denotes convolution, and σ is the scale parameter.
(6) data enhancement is applied to the gesture image with the multi-scale Retinex algorithm. The specific process is: for a gesture image, filter its three channels with Gaussian filter functions of several scales and take the weighted average of the reflection components at each scale as the final output, so that formula (3) becomes:
r_i(x, y) = Σ_{k=1}^{N} w_k [ i_i(x, y) − i_i(x, y) * G_k(x, y) ]   (5)
where G_k(x, y) denotes the k-th Gaussian filter function and N the number of Gaussian filter functions; experiments show that the gesture image data is enhanced most effectively when N = 3. w_k is the weight of the k-th scale, and the weights of the N Gaussian filter functions satisfy the constraint:
Σ_{k=1}^{N} w_k = 1   (6)
(7) converting R (x, y) from logarithmic domain to real domain to obtain R (x, y)
(8) performing linear correction on R(x, y) (since its range does not lie within 0-255); the corrected result is the enhanced gesture image.
3. Feature extraction: a trained CNN model performs a series of convolution, pooling and fully connected operations on the gesture image to extract its features. The feature extraction covers two aspects: extracting features of the query picture uploaded by the user, and extracting features of the picture database to construct an image feature database. The implementation steps are as follows:
(1) data preprocessing: and preprocessing the gesture data set and labeling and integrating the label, wherein the preprocessing comprises data enhancement, data normalization and the like.
(2) constructing a VGGNet-based network structure: training with the VGGNet16 network model; defining and initializing the VGGNet16 structure and setting the learning rate lr, the batch size, the number of epochs, etc.;
(3) training the model: training and validation alternate;
(4) the features extracted from the last FC layer of the trained gesture model serve as the image representation for the gesture image retrieval task. Since the penultimate fully connected layer of VGGNet yields features of up to 4096 dimensions, similarity matching during gesture image retrieval would otherwise be slow; an improved multi-branch VGGNet structure with a hash layer is therefore proposed, which improves gesture retrieval efficiency while maintaining accuracy. The overall network model is shown in fig. 1: the input is a gesture image and a category label, the category label serves as supervision information for learning image features, and each branch learns different label information; the features learned by the first two branches are fused through a fully connected layer to obtain nonlinear combined features, a low-dimensional hash feature is then obtained through the hash layer and binarized into a hash code, and finally the binary hash code is used as the feature vector for gesture retrieval;
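The fusion-plus-hash step described above can be sketched numerically as follows. This is a toy forward pass, not the trained network: the weight matrices are random stand-ins for the learned fully connected and hash layers, the tanh/sigmoid nonlinearities and the 4096/512/48 dimensions are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fuse_and_hash(branch_a, branch_b, W_fc, b_fc, W_hash, b_hash):
    """Fuse two branch feature vectors and produce a binary hash code.

    1. Concatenate the features learned by the two branches.
    2. A fully connected layer yields the nonlinear combined feature.
    3. The hash layer maps it to a low-dimensional value in (0, 1).
    4. Thresholding at 0.5 gives the binary hash code.
    """
    fused = np.concatenate([branch_a, branch_b])
    combined = np.tanh(W_fc @ fused + b_fc)        # nonlinear combined feature
    h = sigmoid(W_hash @ combined + b_hash)        # low-dimensional hash feature
    return (h >= 0.5).astype(np.uint8)             # binary hash code

rng = np.random.default_rng(0)
a, b = rng.random(4096), rng.random(4096)          # two branch FC outputs
W_fc, b_fc = rng.standard_normal((512, 8192)) * 0.01, np.zeros(512)
W_hash, b_hash = rng.standard_normal((48, 512)) * 0.1, np.zeros(48)
code = fuse_and_hash(a, b, W_fc, b_fc, W_hash, b_hash)
```

The payoff of the design is visible in the shapes alone: a 4096-dimensional float representation per branch is compressed into a 48-bit code, which is what keeps similarity matching cheap.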
(5) saving the model file;
(6) randomly selecting 100 pictures from the test set as query pictures, using the remaining pictures as the image database, selecting the model with the best classification performance as the feature extractor, and constructing the feature database;
4. similarity matching: matching out features similar to the query picture from the feature database;
5. returning the result of the query: and acquiring the sorted image list from the image database and presenting the image list to the user.
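With binary hash codes as feature vectors, the similarity matching of steps 4-5 reduces to Hamming distance and a sort. A sketch with illustrative 8-bit codes (real codes would be longer; the exact length is not fixed by the text):

```python
import numpy as np

def hamming_rank(query_code, db_codes):
    """Rank database entries by Hamming distance to the query code.

    query_code: (B,) array of 0/1 bits.
    db_codes: (M, B) array of 0/1 bits, one row per database image.
    Returns database indices sorted from most to least similar.
    """
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")

query = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)
db = np.array([
    [1, 0, 1, 1, 0, 0, 1, 0],   # identical code: distance 0
    [1, 1, 1, 1, 0, 0, 1, 0],   # one bit flipped: distance 1
    [0, 1, 0, 0, 1, 1, 0, 1],   # all bits flipped: distance 8
], dtype=np.uint8)
ranking = hamming_rank(query, db)   # most similar first: [0, 1, 2]
```

The sorted index list is exactly the "sorted image list" returned to the user, with the closest codes first.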

Claims (2)

1. A gesture image retrieval method based on multi-scale Retinex and an improved VGGNet network, characterized in that the method comprises the following steps:
(1) picture preprocessing: performing dim-light enhancement on the gesture image with a multi-scale Retinex algorithm, normalizing the enhanced gesture image, and converting it into the data input format required by the CNN model; the method by which the gesture picture is normalized into the data input format required by the CNN model comprises the following steps:
1) inputting an original image I(x, y);
2) estimating the noise of each position and removing it; assuming that the original image I seen by the human eye is the product of the image illumination component L and the reflectance component R, as expressed in formula 1:
I(x, y) = R(x, y) · L(x, y)   (1)
3) separating the three color channel components and converting them to the logarithmic domain; the illumination is estimated from the captured picture, preserving the intrinsic object attribute R and eliminating interference from uneven illumination; taking the logarithm of both sides of formula 1 and letting i(x, y) = log(I(x, y)), r(x, y) = log(R(x, y)) and l(x, y) = log(L(x, y)) gives formula 2:
i(x, y) = r(x, y) + l(x, y)   (2)
4) setting the number and size of the Gaussian function scales;
5) the Gaussian function filters the three channels of the image; the filtered result is the illumination component, and the reflection component is computed as:
r_i(x, y) = i_i(x, y) − i_i(x, y) * G(x, y)   (3)
where i_i(x, y) denotes the original image of the i-th channel, G(x, y) is the Gaussian filter function, r_i(x, y) denotes the reflection component of the i-th channel, * denotes convolution, and σ is the scale parameter;
the data enhancement of the gesture image uses the following method:
1) for a gesture image, filtering its three channels with Gaussian filter functions of several scales and taking the weighted average of the reflection components at each scale as the final output, so that formula (3) becomes:
r_i(x, y) = Σ_{k=1}^{N} w_k [ i_i(x, y) − i_i(x, y) * G_k(x, y) ]
where G_k(x, y) denotes the k-th Gaussian filter function, N the number of Gaussian filter functions, and w_k the weight of the k-th scale; the weights of the N Gaussian filter functions satisfy the constraint:
Σ_{k=1}^{N} w_k = 1
2) converting r(x, y) from the logarithmic domain to the real domain to obtain R(x, y);
3) performing linear correction on R(x, y); the corrected result is the enhanced gesture image;
(2) feature extraction: performing a series of convolution, pooling and fully connected operations on the gesture image with a trained CNN model to extract its features; preprocessing the gesture data set and labeling and integrating the labels; constructing, initializing and training a VGGNet-based network structure; taking the features extracted by the last FC layer of the trained gesture model as the image representation for the gesture image retrieval task; introducing a hash layer, fusing the features through the fully connected layer to obtain nonlinear combined features, then obtaining a binary hash code through the hash layer; performing gesture retrieval with the binary hash code as the feature vector and constructing a feature database;
(3) similarity matching: and acquiring an image list from the feature database, and matching features similar to the query picture.
2. The gesture image retrieval method according to claim 1, characterized in that the feature extraction of the gesture image in step (2) covers two aspects: extracting features of the query picture uploaded by the user, and extracting features of the picture database to construct an image feature database; the feature extraction method comprises the following steps:
(1) data preprocessing: preprocessing a gesture data set and labeling and integrating labels, wherein the preprocessing comprises data enhancement and data normalization;
(2) constructing a network structure based on VGGNet: training by adopting a VGGNet16 network model, defining and initializing a network structure of VGGNet16, setting a learning rate lr, a batch size batch and iteration rounds epochs;
(3) training a model: training and verifying the model alternately;
(4) taking the features extracted from the last FC layer of the gesture model trained in step (3) as the image representation for the gesture image retrieval task; the input is a gesture image and a category label, the category label serves as supervision information for learning image features, and each branch learns different label information; the features learned by the first two branches are fused through a fully connected layer to obtain a nonlinear combined feature, a low-dimensional hash feature is obtained through the hash layer and binarized into a hash code, and finally the binary hash code is used as the feature vector for gesture retrieval;
(5) saving the model file;
(6) randomly selecting 100 pictures from the test set as query pictures, using the remaining pictures as the image database, selecting the model with the best classification performance as the feature extractor, and constructing the feature database.
CN202010532767.4A 2020-06-12 2020-06-12 Multi-scale Retinex and gesture image retrieval method based on improved VGGNet network Active CN111695508B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010532767.4A CN111695508B (en) 2020-06-12 2020-06-12 Multi-scale Retinex and gesture image retrieval method based on improved VGGNet network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010532767.4A CN111695508B (en) 2020-06-12 2020-06-12 Multi-scale Retinex and gesture image retrieval method based on improved VGGNet network

Publications (2)

Publication Number Publication Date
CN111695508A CN111695508A (en) 2020-09-22
CN111695508B true CN111695508B (en) 2022-07-19

Family

ID=72480517

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010532767.4A Active CN111695508B (en) 2020-06-12 2020-06-12 Multi-scale Retinex and gesture image retrieval method based on improved VGGNet network

Country Status (1)

Country Link
CN (1) CN111695508B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109214250A (en) * 2017-07-05 2019-01-15 中南大学 A kind of static gesture identification method based on multiple dimensioned convolutional neural networks
CN109241313A (en) * 2018-08-14 2019-01-18 大连大学 A kind of image search method based on the study of high-order depth Hash
CN109815920A (en) * 2019-01-29 2019-05-28 南京信息工程大学 Gesture identification method based on convolutional neural networks and confrontation convolutional neural networks
CN109947963A (en) * 2019-03-27 2019-06-28 山东大学 A kind of multiple dimensioned Hash search method based on deep learning
CN110427509A (en) * 2019-08-05 2019-11-08 山东浪潮人工智能研究院有限公司 A kind of multi-scale feature fusion image Hash search method and system based on deep learning
CN110784253A (en) * 2018-07-31 2020-02-11 深圳市白麓嵩天科技有限责任公司 Information interaction method based on gesture recognition and Beidou satellite


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Haonan Chen, Yaowu Chen, "A Cascade Face Spoofing Detector Based on Face Anti-Spoofing R-CNN and Improved Retinex LBP", IEEE Access, vol. 7, pp. 170116-170133, 2019. *
Jin Kyu Kang, Toan Minh Hoang, "Person Re-Identification Between Visible and Thermal Camera Images Based on Deep Residual CNN Using Single Input", IEEE Access, vol. 7, 2019. *

Also Published As

Publication number Publication date
CN111695508A (en) 2020-09-22

Similar Documents

Publication Publication Date Title
CN107122809B (en) Neural network feature learning method based on image self-coding
Radenovic et al. Deep shape matching
CN109543502B (en) Semantic segmentation method based on deep multi-scale neural network
CN108510012B (en) Target rapid detection method based on multi-scale feature map
CN106372581B (en) Method for constructing and training face recognition feature extraction network
CN108898620B (en) Target tracking method based on multiple twin neural networks and regional neural network
WO2020228525A1 (en) Place recognition method and apparatus, model training method and apparatus for place recognition, and electronic device
CN108304826A (en) Facial expression recognizing method based on convolutional neural networks
CN111985581B (en) Sample-level attention network-based few-sample learning method
CN110032925B (en) Gesture image segmentation and recognition method based on improved capsule network and algorithm
CN111340814A (en) Multi-mode adaptive convolution-based RGB-D image semantic segmentation method
CN112036288B (en) Facial expression recognition method based on cross-connection multi-feature fusion convolutional neural network
CN112580590A (en) Finger vein identification method based on multi-semantic feature fusion network
CN110634170B (en) Photo-level image generation method based on semantic content and rapid image retrieval
CN109740679B (en) Target identification method based on convolutional neural network and naive Bayes
CN109033978B (en) Error correction strategy-based CNN-SVM hybrid model gesture recognition method
CN111814611B (en) Multi-scale face age estimation method and system embedded with high-order information
CN114299559A (en) Finger vein identification method based on lightweight fusion global and local feature network
CN111079514A (en) Face recognition method based on CLBP and convolutional neural network
CN110363156A (en) A kind of Facial action unit recognition methods that posture is unrelated
CN110610138A (en) Facial emotion analysis method based on convolutional neural network
CN111694977A (en) Vehicle image retrieval method based on data enhancement
CN115049814B (en) Intelligent eye protection lamp adjusting method adopting neural network model
CN115393225A (en) Low-illumination image enhancement method based on multilevel feature extraction and fusion
Lata et al. Data augmentation using generative adversarial network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant