CN111695508B - Multi-scale Retinex and gesture image retrieval method based on improved VGGNet network - Google Patents
- Publication number
- CN111695508B CN111695508B CN202010532767.4A CN202010532767A CN111695508B CN 111695508 B CN111695508 B CN 111695508B CN 202010532767 A CN202010532767 A CN 202010532767A CN 111695508 B CN111695508 B CN 111695508B
- Authority
- CN
- China
- Legal status: Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/107—Static hand or arm
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/55—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
Abstract
The invention discloses a gesture image retrieval method based on multi-scale Retinex and an improved VGGNet network. After the gesture image retrieval model is trained, the features extracted by the last fully connected (FC) layer represent the image in the retrieval task. A hash layer is introduced into the improved multi-branch VGGNet structure. The model takes a gesture image and its category label as input; the category label serves as supervision information for learning image features, and each branch learns different label information. The features learned by the first two branches are fused through a fully connected layer to obtain a nonlinear combined feature, which the hash layer maps to a low-dimensional binary hash code; this binary hash code is finally used as the feature vector for gesture retrieval. The method improves gesture retrieval efficiency while preserving accuracy.
Description
Technical Field
The invention relates to a gesture image retrieval method, in particular to a gesture image retrieval method based on multi-scale Retinex and an improved VGGNet network.
Background
In recent years, deep learning has produced a series of results in computer vision that surpass traditional methods, and it has become one of the most popular research approaches. However, when high-dimensional feature vectors are used for gesture feature similarity computation, the penultimate fully connected layer of VGGNet yields 4096-dimensional features stored as floating-point numbers, which greatly increases similarity-matching time during gesture image retrieval and severely degrades the user experience.
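The storage gap described above can be made concrete with a small numeric sketch. The comparison below assumes a hypothetical 48-bit hash code (the patent does not fix a code length) against the 4096-dimensional float32 features of VGGNet's penultimate FC layer:

```python
import numpy as np

# Per-image storage of 4096-d float32 features versus compact binary
# hash codes. The 48-bit code length is an illustrative assumption.
n_images = 10_000
float_features = np.random.rand(n_images, 4096).astype(np.float32)
hash_codes = np.random.rand(n_images, 48) > 0.5   # boolean binary codes

bytes_per_float_feature = float_features.nbytes // n_images        # 4096 * 4
bytes_per_hash_code = np.packbits(hash_codes, axis=1).nbytes // n_images

print(bytes_per_float_feature, bytes_per_hash_code)  # -> 16384 6
```

At these sizes the binary code is several thousand times smaller per image, and Hamming distances over packed bits are correspondingly cheaper than float dot products.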
Disclosure of Invention
The invention aims to provide a gesture image retrieval method based on multi-scale Retinex and an improved VGGNet network, addressing the problem in the prior art that high-dimensional feature vectors make gesture feature similarity computation slow.
The technical scheme for realizing the purpose of the invention is as follows:
a gesture image retrieval method based on multi-scale Retinex and improved VGGNet network comprises the following steps:
(1) picture preprocessing: performing dim light enhancement on the gesture image with a multi-scale Retinex algorithm, normalizing the enhanced gesture image, and processing it into the data input format required by the CNN (convolutional neural network) model;
(2) feature extraction: performing a series of convolution, pooling and fully connected operations on the gesture image with a trained CNN model to extract its features; preprocessing the gesture data set and labeling and fusing the labels; constructing a VGGNet-based network structure, then defining it and training it from initialization; taking the features extracted by the last FC layer of the trained gesture model as the image representation in the retrieval task; introducing a hash layer, fusing the features through the fully connected layer to obtain a nonlinear combined feature, obtaining a binary hash code through the hash layer, using the binary hash code as the feature vector for gesture retrieval, and constructing a feature database;
(3) similarity matching: and acquiring an image list from the feature database, and matching features similar to the query picture.
In step (1), the gesture picture is normalized into the data input format required by the CNN model; the method comprises the following steps:
1) inputting an original image I (x, y);
2) estimating and removing the noise at each position, assuming that the image I seen by human eyes is the product of the image illumination component L and the reflectance component R, as expressed in formula 1:
I(x,y)=R(x,y)·L(x,y) (1)
3) separating the three color channel space components and converting them into the logarithmic domain; calculating the illumination L from the captured picture I preserves the inherent attribute R of the object and eliminates the interference of uneven illumination distribution. Taking logarithms of both sides of formula 1 and letting i(x, y) = log(I(x, y)), r(x, y) = log(R(x, y)) and l(x, y) = log(L(x, y)) gives formula 2:
i(x,y)=r(x,y)+l(x,y) (2)
4) setting the number and size of Gaussian function scales;
5) filtering the three channels of the image with the Gaussian function; the filtered image serves as the illumination component, and the reflection component r(x, y) is computed as:
r_i(x, y) = i_i(x, y) − i_i(x, y) * G(x, y) (3)
where i_i(x, y) denotes the log-domain image of the i-th channel, G(x, y) is the Gaussian filter function, r_i(x, y) denotes the reflection component of the i-th channel, * denotes convolution, and σ is the scale parameter of G.
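The single-scale step of formula (3) can be sketched in Python. This is an illustrative implementation, not the patent's code: `scipy.ndimage.gaussian_filter` stands in for the Gaussian filter G(x, y), and the default sigma is an assumed value, since the patent leaves the scale parameter open.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def single_scale_retinex(image, sigma=80.0):
    """Formula (3): r_i(x, y) = i_i(x, y) - i_i(x, y) * G(x, y),
    computed per colour channel of an H x W x 3 image."""
    img = image.astype(np.float64) + 1.0   # +1 avoids log(0)
    i = np.log(img)                        # log-domain image i(x, y)
    # Convolve each log-domain channel with the Gaussian of scale sigma;
    # the blurred result estimates the illumination component l(x, y).
    l = np.stack(
        [gaussian_filter(i[..., c], sigma) for c in range(3)], axis=-1
    )
    return i - l                           # reflection component r(x, y)
```

The returned array is still in the log domain; the later steps convert it back to the real domain and linearly correct it.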
In step (1), data enhancement of the gesture image is carried out by the following method:
1) for a gesture image, filtering the three channels of the image with Gaussian filter functions of several scales and taking a weighted average of the per-scale reflection components as the final output, so that formula (3) becomes:
r_i(x, y) = Σ_{k=1}^{N} w_k [i_i(x, y) − i_i(x, y) * G_k(x, y)] (4)
where G_k(x, y) denotes the k-th Gaussian filter function and N the number of Gaussian filter functions; experiments show that the gesture image data is enhanced most effectively when N = 3. w_k is the weight of the k-th scale, and the N weights satisfy the constraint:
Σ_{k=1}^{N} w_k = 1
2) converting r(x, y) from the logarithmic domain back to the real domain to obtain R(x, y);
3) performing linear correction processing on R(x, y) (because its range is not within 0–255); the corrected result is the enhanced gesture image.
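Formula (4) and the subsequent linear correction can be sketched together. This minimal version assumes equal weights w_k = 1/N and three commonly used sigma values; the patent fixes N = 3 but specifies neither the sigmas nor the weights.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multi_scale_retinex(image, sigmas=(15.0, 80.0, 250.0)):
    """Formula (4): weighted sum of single-scale reflection components
    with equal weights w_k = 1/N, followed by the linear correction
    that stretches the result into the 0-255 range."""
    n = len(sigmas)
    img = image.astype(np.float64) + 1.0   # +1 avoids log(0)
    i = np.log(img)                        # log-domain image
    r = np.zeros_like(i)
    for sigma in sigmas:                   # sum over the N scales
        blur = np.stack(
            [gaussian_filter(i[..., c], sigma) for c in range(3)], axis=-1
        )
        r += (1.0 / n) * (i - blur)        # w_k * (i - i * G_k)
    # Linear correction: r is not in [0, 255], so stretch it linearly.
    r = (r - r.min()) / (r.max() - r.min() + 1e-12) * 255.0
    return r.astype(np.uint8)
```

The equal weights trivially satisfy the constraint Σ w_k = 1; unequal weights could be passed in the same way.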
The feature extraction of the gesture image in step (2) covers two aspects: one is extracting the features of the query picture uploaded by the user; the other is extracting the features of the picture database to construct an image feature database. The feature extraction method comprises the following steps:
(1) data preprocessing: preprocessing a gesture data set and labeling and integrating labels, wherein the preprocessing comprises data enhancement, data normalization and the like;
(2) constructing a network structure based on VGGNet: training by adopting a VGGNet16 network model, performing network structure definition and initialization on VGGNet16, and setting a learning rate lr, a batch size batch, iteration rounds epochs and the like;
(3) training a model: training and verifying the model alternately;
(4) taking the features extracted by the last FC layer of the gesture model trained in step (3) as the image representation in the retrieval task; the input is a gesture image and its category label, the category label serves as supervision information for learning image features, and each branch learns different label information; the features learned by the first two branches are fused through a fully connected layer to obtain a nonlinear combined feature, the hash layer maps it to a low-dimensional hash feature and then to a binary hash code, and the binary hash code is finally used as the feature vector for gesture retrieval;
(5) saving the model file;
(6) randomly selecting 100 pictures from the test set as query pictures, using the remaining pictures as the image database, selecting the model with the best classification effect as the feature extractor, and constructing the feature database.
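The binarisation that turns the hash layer's output into a code, and the feature-database construction around it, can be sketched as follows. The zero threshold is an assumption (the patent says a binary hash code is obtained through the hash layer without fixing the quantiser), and `db_activations` is a random placeholder standing in for the trained model's hash-layer outputs over the image database.

```python
import numpy as np

def binarize(hash_activations):
    """Threshold real-valued hash-layer activations at zero to produce
    a binary hash code: one row of {0, 1} bits per image."""
    return (np.asarray(hash_activations) > 0).astype(np.uint8)

# Feature database: one binary code per database image. The 48-unit
# hash layer width and the activations themselves are illustrative.
db_activations = np.random.randn(1000, 48)
feature_db = binarize(db_activations)   # shape (1000, 48), bits in {0, 1}
```

Each query picture is pushed through the same model and `binarize` at query time, so query and database codes live in the same Hamming space.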
The invention has the following beneficial effects: after the gesture image data is enhanced by the multi-scale Retinex method, the model is trained with a deep learning method; the features extracted by the last FC layer of the trained retrieval model represent the image in the retrieval task, and a hash layer is introduced into the improved multi-branch VGGNet structure, so gesture retrieval efficiency is improved while accuracy is preserved.
Drawings
Fig. 1 is a flow chart of an improved VGGNet network according to an embodiment of the present invention;
FIG. 2 is a flow chart of the calculation of the reflection component according to the embodiment of the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
The embodiment is as follows:
a gesture image retrieval method based on multi-scale Retinex and an improved VGGNet network comprises the following steps:
1. a user uploads a picture needing gesture inquiry;
2. picture preprocessing: performing dim light enhancement on the gesture image with the multi-scale Retinex algorithm, normalizing the enhanced gesture image, and processing it into the data input format required by the CNN (convolutional neural network) model; the specific steps are as follows:
(1) inputting an original image I (x, y);
(2) and estimating the noise of each position and removing the noise. Assuming that an image I seen by human eyes is a product of an image illumination component L and a reflectivity component R, the specific expression is as shown in formula 1:
I(x,y)=R(x,y)·L(x,y) (1)
(3) the three color channel spatial components are separated and converted to the log domain. Calculating the illumination L from the captured picture I preserves the inherent attribute R of the object, eliminates the interference of uneven illumination distribution, and improves the sensory effect of the image. For convenience of calculation, logarithms are taken on both sides of formula 1; letting i(x, y) = log(I(x, y)), r(x, y) = log(R(x, y)) and l(x, y) = log(L(x, y)) gives formula 2:
i(x,y)=r(x,y)+l(x,y) (2)
the calculation process of the reflection component is shown in fig. 2;
(4) setting the number and size of Gaussian function scales;
(5) the three channels of the image are filtered with the Gaussian function; the filtered image serves as the illumination component, and the reflection component r(x, y) is computed as:
r_i(x, y) = i_i(x, y) − i_i(x, y) * G(x, y) (3)
where i_i(x, y) denotes the log-domain image of the i-th channel, G(x, y) is the Gaussian filter function, r_i(x, y) denotes the reflection component of the i-th channel, * denotes convolution, and σ is the scale parameter of G.
(6) the multi-scale Retinex algorithm is used for data enhancement of the gesture image. For a gesture image, the three channels are filtered with Gaussian filter functions of several scales, and a weighted average of the per-scale reflection components is taken as the final output, so that formula (3) becomes:
r_i(x, y) = Σ_{k=1}^{N} w_k [i_i(x, y) − i_i(x, y) * G_k(x, y)] (4)
where G_k(x, y) denotes the k-th Gaussian filter function and N the number of Gaussian filter functions; experiments show that the gesture image data is enhanced most effectively when N = 3. w_k is the weight of the k-th scale, and the N weights satisfy the constraint:
Σ_{k=1}^{N} w_k = 1
(7) r(x, y) is converted from the logarithmic domain back to the real domain to obtain R(x, y);
(8) linear correction processing is performed on R(x, y) (because its range is not within 0–255); the corrected result is the enhanced gesture image.
3. Feature extraction: the feature extraction is mainly to use a trained CNN model to carry out a series of convolution, pooling and full-connection operations on the gesture image so as to extract the features of the gesture image. The feature extraction here includes two faces: one is to extract the characteristics of the query picture uploaded by the user, and the other is to extract the characteristics of the picture database to construct an image characteristic database. The specific implementation steps are as follows:
(1) data preprocessing: and preprocessing the gesture data set and labeling and integrating the label, wherein the preprocessing comprises data enhancement, data normalization and the like.
(2) Constructing a network structure based on VGGNet: and training by adopting a VGGNet16 network model. Performing network structure definition and initialization on VGGNet16, and setting a learning rate lr, a batch size batch, iteration rounds epochs and the like;
(3) and (5) training the model. Training and verifying the model alternately;
(4) the features extracted by the last FC layer of the trained gesture model are taken as the image representation in the retrieval task. Because the feature dimension of the penultimate fully connected layer of VGGNet reaches 4096, similarity-matching time is greatly increased during gesture image retrieval; an improved multi-branch VGGNet network structure is therefore proposed, and a hash layer is introduced, which improves gesture retrieval efficiency while preserving accuracy. The overall network model is shown in fig. 1: the input is a gesture image and its category label, the category label serves as supervision information for learning image features, and each branch learns different label information; the features learned by the first two branches are fused through a fully connected layer to obtain a nonlinear combined feature, the hash layer maps it to a low-dimensional hash feature and then to a binary hash code, and the binary hash code is finally used as the feature vector for gesture retrieval;
(5) saving the model file;
(6) randomly selecting 100 pictures from the test set as query pictures, selecting the rest pictures as an image database, selecting a model with the best classification effect as a feature extractor, and constructing a feature database;
4. similarity matching: matching out features similar to the query picture from the feature database;
5. returning the result of the query: and acquiring the sorted image list from the image database and presenting the image list to the user.
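With binary hash codes, the similarity-matching and result-ranking steps reduce to sorting by Hamming distance. A minimal sketch (function and variable names are illustrative, not from the patent):

```python
import numpy as np

def hamming_rank(query_code, feature_db, top_k=10):
    """Rank database images by Hamming distance between binary hash
    codes; smaller distance means more similar, so the returned index
    list is the sorted result presented to the user."""
    distances = np.count_nonzero(feature_db != query_code, axis=1)
    return np.argsort(distances, kind="stable")[:top_k]

# Toy database of 4-bit codes: the all-zeros entry matches a zero query
# exactly, and the entry differing in one bit comes second.
db = np.array([[0, 0, 0, 0], [1, 1, 1, 1], [0, 1, 0, 0]], dtype=np.uint8)
query = np.zeros(4, dtype=np.uint8)
print(hamming_rank(query, db, top_k=3))  # -> [0 2 1]
```

Because the comparison is a per-bit inequality followed by a count, it avoids the floating-point dot products that make 4096-dimensional matching slow.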
Claims (2)
1. A gesture image retrieval method based on multi-scale Retinex and improved VGGNet network is characterized in that: the method comprises the following steps:
(1) picture preprocessing: performing dim light enhancement on the gesture image by adopting a multi-scale Retinex algorithm, normalizing the gesture image after data enhancement processing, and processing the gesture image into a data input format required by a CNN model; the method for normalizing the gesture picture into the data input format required by the CNN model comprises the following steps of:
1) inputting an original image I(x, y);
2) estimating and removing the noise at each position, assuming that the original image I seen by human eyes is the product of the image illumination component L and the reflectance component R, as expressed in formula 1:
I(x, y) = R(x, y) · L(x, y) (1)
3) separating the three color channel space components and converting them into the logarithmic domain; calculating the illumination L from the captured picture preserves the inherent attribute R of the object and eliminates the interference of uneven illumination distribution. Taking logarithms of both sides of formula 1 and letting i(x, y) = log(I(x, y)), r(x, y) = log(R(x, y)) and l(x, y) = log(L(x, y)) gives formula 2:
i(x, y) = r(x, y) + l(x, y) (2)
4) setting the number and size of Gaussian function scales;
5) filtering the three channels of the image with the Gaussian function; the filtered image serves as the illumination component, and the reflection component is computed as:
r_i(x, y) = i_i(x, y) − i_i(x, y) * G(x, y) (3)
where i_i(x, y) denotes the log-domain image of the i-th channel, G(x, y) is the Gaussian filter function, r_i(x, y) denotes the reflection component of the i-th channel, * denotes convolution, and σ is the scale parameter;
the data enhancement is carried out on the gesture image, and the adopted method comprises the following steps:
1) for a gesture image, filtering the three channels of the image with Gaussian filter functions of several scales and taking a weighted average of the per-scale reflection components as the final output, so that formula (3) becomes:
r_i(x, y) = Σ_{k=1}^{N} w_k [i_i(x, y) − i_i(x, y) * G_k(x, y)] (4)
where G_k(x, y) denotes the k-th Gaussian filter function, N denotes the number of Gaussian filter functions, w_k is the weight of the k-th scale, and the N weights satisfy the constraint:
Σ_{k=1}^{N} w_k = 1
2) converting r(x, y) from the logarithmic domain back to the real domain to obtain R(x, y);
3) performing linear correction processing on R(x, y); the corrected result is the enhanced gesture image;
(2) feature extraction: performing a series of convolution, pooling and fully connected operations on the gesture image with a trained CNN model to extract its features; preprocessing the gesture data set and labeling and fusing the labels; constructing a VGGNet-based network structure, then defining it and training it from initialization; taking the features extracted by the last FC layer of the trained gesture model as the image representation in the retrieval task; introducing a hash layer, fusing the features through the fully connected layer to obtain a nonlinear combined feature, obtaining a binary hash code through the hash layer, using the binary hash code as the feature vector for gesture retrieval, and constructing a feature database;
(3) similarity matching: and acquiring an image list from the feature database, and matching features similar to the query picture.
2. The gesture image retrieval method according to claim 1, wherein: the feature extraction of the gesture image in step (2) covers two aspects: one is extracting the features of the query picture uploaded by the user; the other is extracting the features of the picture database to construct an image feature database; the feature extraction method comprises the following steps:
(1) data preprocessing: preprocessing a gesture data set and labeling and integrating labels, wherein the preprocessing comprises data enhancement and data normalization;
(2) constructing a network structure based on VGGNet: training by adopting a VGGNet16 network model, defining and initializing a network structure of VGGNet16, setting a learning rate lr, a batch size batch and iteration rounds epochs;
(3) training a model: training and verifying the model alternately;
(4) taking the features extracted by the last FC layer of the gesture model trained in step (3) as the image representation in the retrieval task; the input is a gesture image and its category label, the category label serves as supervision information for learning image features, and each branch learns different label information; the features learned by the first two branches are fused through a fully connected layer to obtain a nonlinear combined feature, the hash layer maps it to a low-dimensional hash feature and then to a binary hash code, and the binary hash code is finally used as the feature vector for gesture retrieval;
(5) saving the model file;
(6) randomly selecting 100 pictures from the test set as query pictures, using the remaining pictures as the image database, selecting the model with the best classification effect as the feature extractor, and constructing the feature database.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010532767.4A CN111695508B (en) | 2020-06-12 | 2020-06-12 | Multi-scale Retinex and gesture image retrieval method based on improved VGGNet network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111695508A CN111695508A (en) | 2020-09-22 |
CN111695508B true CN111695508B (en) | 2022-07-19 |
Family
ID=72480517
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010532767.4A Active CN111695508B (en) | 2020-06-12 | 2020-06-12 | Multi-scale Retinex and gesture image retrieval method based on improved VGGNet network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111695508B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109214250A (en) * | 2017-07-05 | 2019-01-15 | 中南大学 | A kind of static gesture identification method based on multiple dimensioned convolutional neural networks |
CN109241313A (en) * | 2018-08-14 | 2019-01-18 | 大连大学 | A kind of image search method based on the study of high-order depth Hash |
CN109815920A (en) * | 2019-01-29 | 2019-05-28 | 南京信息工程大学 | Gesture identification method based on convolutional neural networks and confrontation convolutional neural networks |
CN109947963A (en) * | 2019-03-27 | 2019-06-28 | 山东大学 | A kind of multiple dimensioned Hash search method based on deep learning |
CN110427509A (en) * | 2019-08-05 | 2019-11-08 | 山东浪潮人工智能研究院有限公司 | A kind of multi-scale feature fusion image Hash search method and system based on deep learning |
CN110784253A (en) * | 2018-07-31 | 2020-02-11 | 深圳市白麓嵩天科技有限责任公司 | Information interaction method based on gesture recognition and Beidou satellite |
Non-Patent Citations (2)
- Haonan Chen, Yaowu Chen, "A Cascade Face Spoofing Detector Based on Face Anti-Spoofing R-CNN and Improved Retinex LBP," IEEE Access, vol. 7, pp. 170116–170133, 2019.
- Jin Kyu Kang, Toan Minh Hoang, "Person Re-Identification Between Visible and Thermal Camera Images Based on Deep Residual CNN Using Single Input," IEEE Access, vol. 7, 2019.
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |