WO2021258482A1 - Beauty prediction method, device and storage medium based on migration and weak supervision - Google Patents

Beauty prediction method, device and storage medium based on migration and weak supervision

Info

Publication number
WO2021258482A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
domain network
value
migration
loss function
Prior art date
Application number
PCT/CN2020/104569
Other languages
English (en)
French (fr)
Inventor
甘俊英
白振峰
翟懿奎
何国辉
Original Assignee
五邑大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 五邑大学 filed Critical 五邑大学
Priority to US17/414,196 priority Critical patent/US11769319B2/en
Publication of WO2021258482A1 publication Critical patent/WO2021258482A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/809Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of classification results, e.g. where the classifiers operate on the same input data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Definitions

  • The invention relates to the field of image processing, and in particular to a beauty prediction method, device and storage medium based on migration and weak supervision.
  • Face beauty prediction technology combines image processing with artificial intelligence to judge the beauty level of a face automatically.
  • At present, face beauty prediction is mainly implemented through deep learning, but deep-learning networks require large numbers of training samples, trained models overfit easily, the correlation and differences between multiple tasks are ignored, the cost of data labeling in strongly supervised learning is high, and the practical difficulty of obtaining all truth labels for a database is ignored.
  • Most existing tasks train models on single-task, strongly labeled data. Single-task training ignores the correlation between tasks, even though real-world tasks are often closely linked; strongly labeled data is difficult to obtain in full, and obtaining all truth labels is expensive.
  • The purpose of the present invention is to solve at least one of the technical problems existing in the prior art by providing a beauty prediction method, device and storage medium based on migration and weak supervision.
  • The method of the first aspect of the present invention includes the following steps:
  • preprocessing the input face image to obtain a preprocessed image;
  • using the preprocessed image to train a source domain network, and migrating the trained parameters of the source domain network to a target domain network, wherein during the migration, for the source domain network, the loss function of the source domain network is obtained from the output of its softmax layer with a T value greater than 1 and the original labels; for the target domain network, a first sub-loss function is obtained from the output of its softmax layer with a T value greater than 1 and the original labels, a second sub-loss function is obtained from the output of its softmax layer with a T value equal to 1 and the original labels, and the first sub-loss function and the second sub-loss function are added to obtain the loss function of the target domain network;
  • dividing the preprocessed image into noise images marked with noise labels and truth images marked with truth labels, and inputting the noise images and the truth images into the target domain network to obtain image features;
  • inputting the image features into a residual network that learns the mapping from the image features to the difference between the noise labels and the truth labels and yields a first predicted value, and into a standard neural network that learns the mapping from the image features to the truth labels and yields a second predicted value; adding the first predicted value and the second predicted value and inputting the sum into a first classifier to obtain a first face beauty prediction result; inputting the second predicted value into a second classifier to obtain a second face beauty prediction result; and obtaining the final face beauty prediction result from the first face beauty prediction result and the second face beauty prediction result.
  • Preprocessing the input face image to obtain the preprocessed image specifically comprises: sequentially performing image enhancement, image correction, image cropping, image de-duplication, and image normalization on the face image to obtain the preprocessed image.
  • The T value is an adjustment parameter defined in the softmax function of the softmax layer. The softmax function is q_i = exp(z_i / T) / Σ_j exp(z_j / T), where q_i is the output of the softmax function and z is the input of the softmax function.
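The temperature-scaled softmax with adjustment parameter T can be sketched in NumPy. This is the standard temperature form q_i = exp(z_i / T) / Σ_j exp(z_j / T) assumed from the description of T; the function name is illustrative:

```python
import numpy as np

def softmax_with_temperature(z, T=1.0):
    """Temperature-scaled softmax: q_i = exp(z_i / T) / sum_j exp(z_j / T).

    T == 1 recovers the ordinary softmax; T > 1 yields a softer
    distribution, as used for the soft-label loss of the source and
    target domain networks.
    """
    z = np.asarray(z, dtype=np.float64) / T
    z = z - z.max()          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
q_hard = softmax_with_temperature(logits, T=1.0)   # ordinary softmax (T = 1)
q_soft = softmax_with_temperature(logits, T=4.0)   # softer distribution (T > 1)
```

Raising T flattens the distribution without changing the ranking of the classes, which is why the T > 1 output carries more information about inter-class similarity.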
  • The loss function of the first classifier is L_1 = -(1/N_n) Σ_{i ∈ D_n} y_i log(h_i), and the loss function of the second classifier is L_2 = -(1/N_n) Σ_{j ∈ D_n} v_j log(g_j), where h_i is the sum of the first predicted value and the second predicted value, g_j is the second predicted value, y_i is the noise label, v_j is the truth label, D_n is the set of image features, and N_n is the number of image features.
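The two classifier losses can be sketched as cross-entropies over the noisy-labeled and truth-labeled samples. This is a plausible reading of the variable definitions (h_i, g_j, y_i, v_j, D_n, N_n) and is an assumption, not the published formulas; the batch values below are hypothetical:

```python
import numpy as np

def cross_entropy(pred, onehot, eps=1e-12):
    """Mean cross-entropy over N_n samples: -(1/N_n) * sum_i onehot_i . log(pred_i)."""
    pred = np.clip(pred, eps, 1.0)
    return -np.mean(np.sum(onehot * np.log(pred), axis=1))

# Hypothetical batch of N_n = 2 samples over 3 beauty classes.
h = np.array([[0.7, 0.2, 0.1],    # h_i: first + second predicted values (softmaxed)
              [0.1, 0.8, 0.1]])
g = np.array([[0.6, 0.3, 0.1],    # g_j: second predicted value
              [0.2, 0.7, 0.1]])
y_noise = np.array([[1, 0, 0], [0, 1, 0]])  # noise labels y_i
v_truth = np.array([[1, 0, 0], [0, 1, 0]])  # truth labels v_j

loss_first = cross_entropy(h, y_noise)   # supervises the residual-network branch
loss_second = cross_entropy(g, v_truth)  # supervises the standard-network branch
```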
  • A beauty prediction device based on migration and weak supervision includes:
  • a preprocessing module, configured to preprocess the input face image to obtain a preprocessed image;
  • a migration module, configured to train the source domain network using the preprocessed image and migrate the trained parameters of the source domain network to the target domain network, wherein during the migration, for the source domain network, the loss function of the source domain network is obtained from the output of its softmax layer with a T value greater than 1 and the original labels; for the target domain network, the first sub-loss function is obtained from the output of its softmax layer with a T value greater than 1 and the original labels, the second sub-loss function is obtained from the output of its softmax layer with a T value equal to 1 and the original labels, and the first sub-loss function and the second sub-loss function are added to obtain the loss function of the target domain network;
  • a feature extraction module, configured to divide the preprocessed image into noise images marked with noise labels and truth images marked with truth labels, and input the noise images and the truth images into the target domain network to obtain image features;
  • a classification module, configured to input the image features into the residual network to learn the mapping from the image features to the difference between the noise labels and the truth labels and obtain a first predicted value, and into the standard neural network to learn the mapping from the image features to the truth labels and obtain a second predicted value; to add the first predicted value and the second predicted value and input the sum into the first classifier to obtain a first face beauty prediction result; to input the second predicted value into the second classifier to obtain a second face beauty prediction result; and to obtain the final face beauty prediction result from the first face beauty prediction result and the second face beauty prediction result.
  • Preprocessing the input face image to obtain the preprocessed image specifically comprises: sequentially performing image enhancement, image correction, image cropping, image de-duplication, and image normalization on the face image to obtain the preprocessed image.
  • The T value is an adjustment parameter defined in the softmax function of the softmax layer. The softmax function is q_i = exp(z_i / T) / Σ_j exp(z_j / T), where q_i is the output of the softmax function and z is the input of the softmax function.
  • The loss function of the first classifier is L_1 = -(1/N_n) Σ_{i ∈ D_n} y_i log(h_i), and the loss function of the second classifier is L_2 = -(1/N_n) Σ_{j ∈ D_n} v_j log(g_j), where h_i is the sum of the first predicted value and the second predicted value, g_j is the second predicted value, y_i is the noise label, v_j is the truth label, D_n is the set of image features, and N_n is the number of image features.
  • A beauty prediction device based on migration and weak supervision includes a processor and a memory connected to the processor; the memory stores executable instructions, and the processor executes the executable instructions to perform the beauty prediction method based on migration and weak supervision described in the first aspect of the present invention.
  • The storage medium stores executable instructions that can be executed by a computer, causing the computer to perform the beauty prediction method based on migration and weak supervision described in the first aspect of the present invention.
  • The above scheme has at least the following beneficial effects: migration is used not only to solve the problem of insufficient sample size, but the parameters of the source domain network also enhance the target domain network, which effectively addresses the problems of excessive data requirements, easy overfitting, weak generalization, and long training time, improving the stability and robustness of the model; it also addresses the unreliability of database labels.
  • Model training can still be carried out when the data labels are inaccurate, insufficient, or unspecific; the scheme is highly adaptable and reduces both the cost of data labeling and the impact of mislabeling on the model.
  • Fig. 1 is a flowchart of a beauty prediction method based on migration and weak supervision according to an embodiment of the present invention;
  • Fig. 2 is a structural diagram of a beauty prediction device based on migration and weak supervision according to an embodiment of the present invention;
  • Fig. 3 is a structural diagram of the face beauty prediction model.
  • Orientation descriptions such as up, down, front, back, left and right indicate orientations or positional relationships based on those shown in the drawings; they are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation, and therefore cannot be understood as limiting the present invention.
  • Some embodiments of the present invention provide a beauty prediction method based on migration and weak supervision, including the following steps:
  • Step S100: preprocess the input face image to obtain a preprocessed image;
  • Step S200: use the preprocessed image to train the source domain network 110, and migrate the trained parameters of the source domain network 110 to the target domain network 120, wherein during the migration, for the source domain network 110, the loss function of the source domain network 110 is obtained from the output of its softmax layer with a T value greater than 1 and the original labels; for the target domain network 120, the first sub-loss function of the target domain network 120 is obtained from the output of its softmax layer with a T value greater than 1 and the original labels, the second sub-loss function is obtained from the output of its softmax layer with a T value equal to 1 and the original labels, and the first sub-loss function and the second sub-loss function are added to obtain the loss function of the target domain network 120;
  • Step S300 Divide the preprocessed image into a noise image marked with a noise label and a true value image marked with a truth value label, and input the noise image and the true value image into the target domain network 120 to obtain image features;
  • Step S400: input the image features into the residual network 210 to learn the mapping from the image features to the difference between the noise labels and the truth labels and obtain a first predicted value, and into the standard neural network 220 to learn the mapping from the image features to the truth labels and obtain a second predicted value; add the first predicted value and the second predicted value and input the sum into the first classifier 230 to obtain a first face beauty prediction result; input the second predicted value into the second classifier 240 to obtain a second face beauty prediction result; and obtain the final face beauty prediction result from the first face beauty prediction result and the second face beauty prediction result.
  • Migration is used not only to solve the problem of insufficient sample size: the parameters of the source domain network 110 also enhance the target domain network 120, which effectively addresses the problems of excessive data requirements, easy overfitting, weak generalization, and long training time, improving the stability and robustness of the model, and also addresses the unreliability of database labels.
  • Model training can still be carried out when the data labels are inaccurate, insufficient, or unspecific; the approach is highly adaptable and reduces both the cost of data labeling and the impact of mislabeling on the model.
  • The input face images are fused from multiple databases, including the LSFBD face beauty database, the Fer2013 facial expression database, the GENKI-4K smile recognition database, the IMDB-WIKI 500k+ database, and the SCUT-FBP5500 database.
  • In step S100, image enhancement, image correction, image cropping, image de-duplication, and image normalization are performed sequentially on the face image to obtain the preprocessed image.
  • The preprocessing can efficiently perform region detection and key-point detection on the face image, as well as alignment and cropping, so that the sizes of the face images are consistent, which facilitates subsequent operations.
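The per-image preprocessing stages can be sketched as a chain of array transforms. Everything here is a simplified stand-in (NumPy only; a real pipeline would use an image library, and correction/alignment and de-duplication, which operate across the dataset, are omitted), not the patent's actual implementation:

```python
import numpy as np

def enhance(img):
    """Crude contrast stretch standing in for image enhancement."""
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + 1e-8) * 255.0

def center_crop(img, size):
    """Crop a size x size patch from the image centre (stand-in for cropping)."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def normalize(img):
    """Zero-mean, unit-variance normalization."""
    return (img - img.mean()) / (img.std() + 1e-8)

def preprocess(img, crop_size=224):
    # Correction (alignment) and de-duplication are dataset-level steps
    # and are omitted from this per-image sketch.
    return normalize(center_crop(enhance(img), crop_size))

face = np.random.default_rng(0).uniform(0, 255, size=(256, 256))
out = preprocess(face)   # every face image ends up the same size
```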
  • The preprocessed image is input into the face beauty prediction model, which executes step S200, step S300, and step S400.
  • In step S200, the source domain network 110 is trained using the preprocessed image, and the trained parameters of the source domain network 110 are migrated to the target domain network 120.
  • During the migration, the loss function of the source domain network 110 is obtained from the output of its softmax layer with a T value greater than 1 and the original labels; for the target domain network 120, the first sub-loss function of the target domain network 120 is obtained from the output of its softmax layer with a T value greater than 1 and the original labels, and the second sub-loss function is obtained from the output of its softmax layer with a T value equal to 1 and the original labels.
  • The first sub-loss function and the second sub-loss function are added to obtain the loss function of the target domain network 120.
  • The parameters obtained by training the source domain network 110 on the preprocessed images are thereby extracted.
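The loss construction above matches the standard knowledge-distillation recipe: a soft loss computed with T > 1 and a hard loss computed with T = 1, added together for the target network. A minimal sketch under that reading — matching the T > 1 term against the source network's softened outputs is an assumption about what "original label" means for the soft term, and all values are illustrative:

```python
import numpy as np

def softened(z, T):
    """Softmax of logits z at temperature T."""
    e = np.exp((z - z.max()) / T)
    return e / e.sum()

def xent(target, pred, eps=1e-12):
    """Cross-entropy of predicted distribution pred against target distribution."""
    return -np.sum(target * np.log(np.clip(pred, eps, 1.0)))

T = 4.0
source_logits = np.array([3.0, 1.0, 0.2])   # source domain network output
target_logits = np.array([2.5, 1.2, 0.3])   # target domain network output
hard_label = np.array([1.0, 0.0, 0.0])      # original one-hot label

# First sub-loss: T > 1 softmax output vs. the softened source labels.
soft_loss = xent(softened(source_logits, T), softened(target_logits, T))
# Second sub-loss: T = 1 softmax output vs. the original one-hot label.
hard_loss = xent(hard_label, softened(target_logits, 1.0))
# Loss function of the target domain network: the two sub-losses added.
target_loss = soft_loss + hard_loss
```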
  • The T value is the adjustment parameter, defined in the softmax function of the softmax layer.
  • The softmax function is q_i = exp(z_i / T) / Σ_j exp(z_j / T), where q_i is the output of the softmax function and z is the input of the softmax function.
  • the target domain network 120 functions as a feature extraction layer.
  • the feature extraction layer is one of VGG16, ResNet50, Google Inception V3 or DenseNet.
  • The specific structure of the target domain network 120 is: layers 1 to 3 are 3×3 convolutional layers; layer 4 is a pooling layer; layers 5 and 6 are 3×3 convolutional layers; layer 7 is a pooling layer; layers 8 to 10 are 3×3 convolutional layers; layer 11 is a pooling layer; layers 12 and 13 are 3×3 convolutional layers; layer 14 is a pooling layer.
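The 14-layer stack can be sanity-checked with simple shape arithmetic. Assuming 3×3 convolutions with padding 1 (size-preserving) and 2×2 pooling with stride 2 — neither parameter is stated in the text — only the four pooling layers change the spatial size of a 224×224 input:

```python
# Layer list from the text: C = 3x3 convolution, P = pooling.
layers = ["C", "C", "C", "P",
          "C", "C", "P",
          "C", "C", "C", "P",
          "C", "C", "P"]

def output_size(size, layers):
    """Track spatial size: a padded 3x3 conv keeps it, a stride-2 pool halves it."""
    for layer in layers:
        if layer == "P":
            size //= 2
    return size

final = output_size(224, layers)   # four halvings: 224 / 2**4
```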
  • In step S400, the image features are input into the residual network 210, which learns the mapping from the image features to the difference between the noise labels and the truth labels and outputs the first predicted value; the noise labels supervise all image features entering the residual network 210. The image features are also input into the standard neural network 220, which learns the mapping from the image features to the truth labels and outputs the second predicted value; the truth labels supervise all image features entering the standard neural network 220.
  • The first predicted value and the second predicted value are added and input into the first classifier 230 to obtain the first face beauty prediction result, and the second predicted value is input into the second classifier 240 to obtain the second face beauty prediction result.
  • The final face beauty prediction result is obtained as K = w1*K1 + w2*K2, where K1 and K2 are the first face beauty prediction result and the second face beauty prediction result, respectively, and w1 and w2 are their weighting coefficients.
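The fusion K = w1*K1 + w2*K2 is a plain weighted sum; the weights are not specified in the text, so equal weights are assumed in this sketch and the score values are hypothetical:

```python
def fuse(k1, k2, w1=0.5, w2=0.5):
    """Final face beauty prediction result: K = w1*K1 + w2*K2."""
    return w1 * k1 + w2 * k2

# Hypothetical beauty scores from the two classifiers.
K = fuse(k1=4.0, k2=3.0)
```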
  • The loss function of the first classifier 230 is L_1 = -(1/N_n) Σ_{i ∈ D_n} y_i log(h_i), and the loss function of the second classifier 240 is L_2 = -(1/N_n) Σ_{j ∈ D_n} v_j log(g_j), where h_i is the sum of the first predicted value and the second predicted value, g_j is the second predicted value, y_i is the noise label, v_j is the truth label, D_n is the set of image features, and N_n is the number of image features.
  • The overall objective of the part consisting of the residual network 210, the standard neural network 220, the first classifier 230, and the second classifier 240 is min_W (L_1 + λ·L_2), where W is a hyperparameter and λ is a trade-off parameter between the loss value of the residual network 210 and the loss value of the standard neural network 220.
  • Some embodiments of the present invention provide a beauty prediction device based on migration and weak supervision, which applies the beauty prediction method described in the method embodiment.
  • The beauty prediction device includes:
  • the preprocessing module 10, configured to preprocess the input face image to obtain a preprocessed image;
  • the migration module 20, configured to train the source domain network 110 using the preprocessed image and migrate the trained parameters of the source domain network 110 to the target domain network 120, wherein during the migration, for the source domain network 110, the loss function of the source domain network 110 is obtained from the output of its softmax layer with a T value greater than 1 and the original labels; for the target domain network 120, the first sub-loss function is obtained from the output of its softmax layer with a T value greater than 1 and the original labels, the second sub-loss function is obtained from the output of its softmax layer with a T value equal to 1 and the original labels, and the first sub-loss function and the second sub-loss function are added to obtain the loss function of the target domain network 120;
  • the feature extraction module 30 is configured to divide the preprocessed image into a noise image marked with a noise label and a true value image marked with a truth value label, and input the noise image and the true value image to the target domain network 120 to obtain image features;
  • the classification module 40, configured to input the image features into the residual network 210 to learn the mapping from the image features to the difference between the noise labels and the truth labels and obtain the first predicted value, and into the standard neural network 220 to learn the mapping from the image features to the truth labels and obtain the second predicted value; to add the first predicted value and the second predicted value and input the sum into the first classifier 230 to obtain the first face beauty prediction result; to input the second predicted value into the second classifier 240 to obtain the second face beauty prediction result; and to obtain the final face beauty prediction result from the first face beauty prediction result and the second face beauty prediction result.
  • The beauty prediction device based on migration and weak supervision applies the beauty prediction method based on migration and weak supervision described in the method embodiment. Through the cooperation of its modules, it can perform each step of that method and achieves the same technical effects, which will not be detailed here.
  • A beauty prediction device based on migration and weak supervision includes a processor and a memory connected to the processor; the memory stores executable instructions, and the processor executes the executable instructions to perform the beauty prediction method based on migration and weak supervision described above.
  • The storage medium stores executable instructions that can be executed by a computer, causing the computer to perform the beauty prediction method based on migration and weak supervision described in the method embodiment.
  • Examples of storage media include, but are not limited to, phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can store information accessible by a computing device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A face beauty prediction method, device and storage medium based on migration and weak supervision. The method includes: preprocessing an input face image; training a source domain network with the preprocessed image and migrating the parameters of the source domain network to a target domain network; inputting noise images marked with noise labels and truth images marked with truth labels into the target domain network to obtain image features; and inputting the image features into a classification network to obtain the final face beauty prediction result. The method effectively solves the problems of excessive data requirements, easy overfitting, weak generalization, and long training time, improving the stability and robustness of the model, and also solves the unreliability of database labels.

Description

Beauty prediction method, device and storage medium based on migration and weak supervision

Technical Field

The present invention relates to the field of image processing, and in particular to a beauty prediction method, device and storage medium based on migration and weak supervision.

Background

Face beauty prediction technology combines image processing with artificial intelligence to judge the beauty level of a face automatically. At present, face beauty prediction is mainly implemented through deep learning, but deep-learning networks require large numbers of training samples, trained models overfit easily, the correlation and differences between multiple tasks are ignored, the cost of data labeling in strongly supervised learning is high, and the practical difficulty of obtaining all truth labels for a database is ignored. At present, most tasks train models on single-task, strongly labeled data; single-task training ignores the correlation between tasks, even though real-world tasks are often closely linked, strongly labeled data is difficult to obtain in full, and obtaining all truth labels is expensive.
Summary of the Invention

The purpose of the present invention is to solve at least one of the technical problems existing in the prior art by providing a beauty prediction method, device and storage medium based on migration and weak supervision.

The technical solution adopted by the present invention to solve its problems is as follows:

In a first aspect of the present invention, a beauty prediction method based on migration and weak supervision includes the following steps:

preprocessing the input face image to obtain a preprocessed image;

training a source domain network with the preprocessed image, and migrating the trained parameters of the source domain network to a target domain network, wherein during the migration, for the source domain network, the loss function of the source domain network is obtained from the output of its softmax layer with a T value greater than 1 and the original labels; for the target domain network, a first sub-loss function is obtained from the output of its softmax layer with a T value greater than 1 and the original labels, a second sub-loss function is obtained from the output of its softmax layer with a T value equal to 1 and the original labels, and the first sub-loss function and the second sub-loss function are added to obtain the loss function of the target domain network;

dividing the preprocessed image into noise images marked with noise labels and truth images marked with truth labels, and inputting the noise images and the truth images into the target domain network to obtain image features;

inputting the image features into a residual network that learns the mapping from the image features to the difference between the noise labels and the truth labels and obtains a first predicted value, and into a standard neural network that learns the mapping from the image features to the truth labels and obtains a second predicted value; adding the first predicted value and the second predicted value and inputting the sum into a first classifier to obtain a first face beauty prediction result; inputting the second predicted value into a second classifier to obtain a second face beauty prediction result; and obtaining the final face beauty prediction result from the first face beauty prediction result and the second face beauty prediction result.
According to the first aspect of the present invention, preprocessing the input face image to obtain the preprocessed image specifically comprises: sequentially performing image enhancement, image correction, image cropping, image de-duplication, and image normalization on the face image to obtain the preprocessed image.

According to the first aspect of the present invention, the T value is an adjustment parameter defined in the softmax function of the softmax layer; the softmax function is

q_i = exp(z_i / T) / Σ_j exp(z_j / T)

where q_i is the output of the softmax function and z is the input of the softmax function.
According to the first aspect of the present invention, the loss function of the first classifier is

L_1 = -(1/N_n) Σ_{i ∈ D_n} y_i log(h_i)

and the loss function of the second classifier is

L_2 = -(1/N_n) Σ_{j ∈ D_n} v_j log(g_j)

where h_i is the sum of the first predicted value and the second predicted value, g_j is the second predicted value, y_i is the noise label, v_j is the truth label, D_n is the set of image features, and N_n is the number of image features.
In a second aspect of the present invention, a beauty prediction device based on migration and weak supervision includes:

a preprocessing module, configured to preprocess the input face image to obtain a preprocessed image;

a migration module, configured to train the source domain network with the preprocessed image and migrate the trained parameters of the source domain network to the target domain network, wherein during the migration, for the source domain network, the loss function of the source domain network is obtained from the output of its softmax layer with a T value greater than 1 and the original labels; for the target domain network, a first sub-loss function is obtained from the output of its softmax layer with a T value greater than 1 and the original labels, a second sub-loss function is obtained from the output of its softmax layer with a T value equal to 1 and the original labels, and the first sub-loss function and the second sub-loss function are added to obtain the loss function of the target domain network;

a feature extraction module, configured to divide the preprocessed image into noise images marked with noise labels and truth images marked with truth labels, and input the noise images and the truth images into the target domain network to obtain image features;

a classification module, configured to input the image features into a residual network to learn the mapping from the image features to the difference between the noise labels and the truth labels and obtain a first predicted value, and into a standard neural network to learn the mapping from the image features to the truth labels and obtain a second predicted value; to add the first predicted value and the second predicted value and input the sum into a first classifier to obtain a first face beauty prediction result; to input the second predicted value into a second classifier to obtain a second face beauty prediction result; and to obtain the final face beauty prediction result from the first face beauty prediction result and the second face beauty prediction result.
According to the second aspect of the present invention, preprocessing the input face image to obtain the preprocessed image specifically comprises: sequentially performing image enhancement, image rectification, image cropping, image deduplication and image normalization on the face image to obtain the preprocessed image.
According to the second aspect of the present invention, the T value is an adjustment parameter defined in the softmax function of the softmax layer, the softmax function being

q_i = exp(z_i / T) / Σ_j exp(z_j / T)

where q_i is the output of the softmax function and z is its input.
According to the second aspect of the present invention, the loss function of the first classifier is

L_1 = -(1/N_n) Σ_{i∈D_n} y_i log h_i

and the loss function of the second classifier is

L_2 = -(1/N_n) Σ_{j∈D_n} v_j log g_j

where h_i is the sum of the first predicted value and the second predicted value, g_j is the second predicted value, y_i is the noise label, v_j is the ground-truth label, D_n is the set of image features, and N_n is the number of image features.
According to a third aspect of the present invention, a beauty prediction device based on migration and weak supervision comprises a processor and a memory connected to the processor; the memory stores executable instructions, and the processor executes the executable instructions to perform the beauty prediction method based on migration and weak supervision according to the first aspect of the present invention.

According to a fourth aspect of the present invention, a storage medium stores executable instructions which can be executed by a computer to cause the computer to perform the beauty prediction method based on migration and weak supervision according to the first aspect of the present invention.

The above solution has at least the following beneficial effects: migration is used to overcome the shortage of samples, and the parameters of the source-domain network strengthen the target-domain network, which effectively relieves the problems of excessive data requirements, easy overfitting, weak generalization and long training time and improves the stability and robustness of the model; it also addresses the unreliability of database labels, so that models can be trained even when the data labels are inaccurate, insufficient or unspecific, with strong adaptability, lower annotation cost and less impact of mislabeling on the model.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and will in part become apparent from the description or be learned by practice of the invention.
Brief Description of the Drawings

The present invention is further described below with reference to the drawings and examples.

Fig. 1 is a flowchart of a beauty prediction method based on migration and weak supervision according to an embodiment of the present invention;

Fig. 2 is a structural diagram of a beauty prediction device based on migration and weak supervision according to an embodiment of the present invention;

Fig. 3 is a structural diagram of the facial beauty prediction model.
Detailed Description

This section describes specific embodiments of the present invention in detail. Preferred embodiments are shown in the drawings, whose role is to supplement the textual description graphically so that each technical feature and the overall technical solution can be understood intuitively; they shall not, however, be construed as limiting the scope of protection of the present invention.

In the description of the present invention, it should be understood that orientation descriptions such as up, down, front, rear, left and right refer to the orientations or positional relations shown in the drawings, serve only to simplify the description, and do not indicate or imply that the referenced devices or elements must have a particular orientation or be constructed and operated in a particular orientation; they shall therefore not be construed as limiting the present invention.

In the description of the present invention, "several" means one or more and "a plurality of" means two or more; "greater than", "less than", "exceeding" and the like exclude the stated number, while "above", "below", "within" and the like include it. Where "first" and "second" are used, they serve only to distinguish technical features and shall not be understood as indicating relative importance, the number of the indicated technical features, or their order.

In the description of the present invention, unless otherwise expressly defined, terms such as "arranged", "mounted" and "connected" shall be understood broadly; those skilled in the art can reasonably determine their specific meanings herein in view of the specific content of the technical solution.
Referring to Fig. 1 and Fig. 3, some embodiments of the present invention provide a beauty prediction method based on migration and weak supervision, comprising the following steps.

Step S100: preprocess an input face image to obtain a preprocessed image.

Step S200: train a source-domain network 110 with the preprocessed image, and migrate the parameters of the trained source-domain network 110 to a target-domain network 120. During migration, for the source-domain network 110, the loss function of the source-domain network 110 is obtained from the original labels and the output of a softmax layer of the source-domain network 110 whose T value is greater than 1; for the target-domain network 120, a first sub-loss function of the target-domain network 120 is obtained from the original labels and the output of a softmax layer of the target-domain network 120 whose T value is greater than 1, a second sub-loss function of the target-domain network 120 is obtained from the original labels and the output of a softmax layer of the target-domain network 120 whose T value is equal to 1, and the first and second sub-loss functions are added to obtain the loss function of the target-domain network 120.

Step S300: divide the preprocessed images into noise images carrying noise labels and ground-truth images carrying ground-truth labels, and feed the noise images and the ground-truth images into the target-domain network 120 to obtain image features.

Step S400: feed the image features into a residual net 210, which learns a mapping from the image features to the difference between the noise labels and the ground-truth labels and outputs a first predicted value, and into a standard neural network 220, which learns a mapping from the image features to the ground-truth labels and outputs a second predicted value; add the first predicted value and the second predicted value and feed the sum into a first classifier 230 to obtain a first facial beauty prediction result; feed the second predicted value into a second classifier 240 to obtain a second facial beauty prediction result; and obtain a final facial beauty prediction result from the first and second facial beauty prediction results.
In this embodiment, migration is used not only to overcome the shortage of samples but also to strengthen the target-domain network 120 with the parameters of the source-domain network 110, which effectively relieves the problems of excessive data requirements, easy overfitting, weak generalization and long training time and improves the stability and robustness of the model; it also addresses the unreliability of database labels, so that models can be trained even when the data labels are inaccurate, insufficient or unspecific, with strong adaptability, lower annotation cost and less impact of mislabeling on the model.

Further, the input face images merge data from several databases, including the LSFBD facial beauty database, the Fer2013 facial expression database, the GENKI-4K smile recognition database, the IMDB-WIKI 500k+ database and the SCUT-FBP5500 database.
In step S100, image enhancement, image rectification, image cropping, image deduplication and image normalization are performed sequentially on the face image to obtain the preprocessed image. Preprocessing efficiently performs face region detection and landmark detection, as well as alignment and cropping, so that the face images are of uniform size, which facilitates the subsequent operations.
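As a minimal sketch of this preprocessing pipeline: the cropping, deduplication and normalization stages are shown below in NumPy; the enhancement and rectification stages are omitted because they depend on detectors the description does not specify, and the crop size and hashing scheme are illustrative assumptions, not the patent's implementation.

```python
import hashlib
import numpy as np

def center_crop(img, size):
    """Crop a square region of side `size` from the image center."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def preprocess(images, size=224):
    """Crop, deduplicate (by content hash) and normalize a batch of images."""
    out, seen = [], set()
    for img in images:
        img = center_crop(img, size)                   # image cropping
        key = hashlib.md5(img.tobytes()).hexdigest()   # image deduplication
        if key in seen:
            continue
        seen.add(key)
        out.append(img.astype(np.float32) / 255.0)     # normalization to [0, 1]
    return out
```

Exact duplicates are dropped after cropping, so two images that differ only outside the cropped region are also treated as duplicates here.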
The preprocessed image is input to the facial beauty prediction model, which performs steps S200, S300 and S400.
Further, in step S200, the source-domain network 110 is trained with the preprocessed image, and the parameters of the trained source-domain network 110 are migrated to the target-domain network 120. During migration, for the source-domain network 110, the loss function of the source-domain network 110 is obtained from the original labels and the output of a softmax layer of the source-domain network 110 whose T value is greater than 1; for the target-domain network 120, a first sub-loss function is obtained from the original labels and the output of a softmax layer of the target-domain network 120 whose T value is greater than 1, a second sub-loss function is obtained from the original labels and the output of a softmax layer of the target-domain network 120 whose T value is equal to 1, and the two sub-loss functions are added to obtain the loss function of the target-domain network 120. By raising the T value and restoring it to 1 in a later stage, the parameters learned by the source-domain network 110 from the preprocessed images are extracted.

The loss function is then computed, gradient descent is applied, and the parameters of the target-domain network 120 are updated.
Here, the T value is an adjustment parameter defined in the softmax function of the softmax layer:

q_i = exp(z_i / T) / Σ_j exp(z_j / T)

where q_i is the output of the softmax function and z is its input. The larger the T value, the flatter the output distribution of the softmax function; as T tends to infinity, the output approaches a uniform distribution and approximates that of the source-domain network 110.
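A minimal NumPy sketch of this temperature-scaled softmax (written for illustration, not taken from the patent itself):

```python
import numpy as np

def softmax_t(z, T=1.0):
    """Temperature-scaled softmax: q_i = exp(z_i/T) / sum_j exp(z_j/T)."""
    z = np.asarray(z, dtype=float) / T
    z -= z.max()            # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
p1 = softmax_t(logits, T=1.0)   # sharp distribution at T = 1
p5 = softmax_t(logits, T=5.0)   # flatter distribution at T = 5
```

Raising T flattens the distribution, which matches the behaviour the description attributes to large T values during migration.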
In addition, the overall loss function of the migration process is L = CE(y, p) + αCE(q, p), where CE(p, q) = -Σ p log q and p is the distribution produced by the source-domain network 110.
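Under a knowledge-distillation reading of this overall loss (a hard-label term plus a soft-label term weighted by α), a sketch could look as follows; the function and variable names are illustrative assumptions, not the patent's notation.

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """CE = -sum(p * log q), clipping q to avoid log(0)."""
    return float(-np.sum(p * np.log(np.clip(q, eps, 1.0))))

def migration_loss(y_onehot, student_probs, teacher_soft, student_soft, alpha=0.5):
    """L = CE(y, p) + alpha * CE(q, p): the hard term compares the
    student's T=1 output with the original labels; the soft term compares
    the teacher's and student's T>1 softmax outputs."""
    hard = cross_entropy(y_onehot, student_probs)
    soft = cross_entropy(teacher_soft, student_soft)
    return hard + alpha * soft
```

With alpha = 0 the soft term vanishes and the loss reduces to ordinary cross-entropy against the original labels.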
Further, in step S300, the target-domain network 120 acts as the feature extraction layer, which is one of VGG16, ResNet50, Google Inception V3 or DenseNet. In this embodiment, the target-domain network 120 is structured as follows: layers 1 to 3 are 3×3 convolutional layers; layer 4 is a pooling layer; layers 5 and 6 are 3×3 convolutional layers; layer 7 is a pooling layer; layers 8 to 10 are 3×3 convolutional layers; layer 11 is a pooling layer; layers 12 and 13 are 3×3 convolutional layers; and layer 14 is a pooling layer. The feature extraction layer extracts the image features; during extraction its structure and parameters can be adjusted and optimized for the actual task to find the optimal form.
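The 14-layer stack above can be sketched as a plain layer list to trace feature-map sizes. The channel counts below are assumptions (the description specifies only kernel sizes), and the 3×3 convolutions are assumed to use padding 1 with 2×2 pooling, as in VGG16:

```python
# Each entry is ("conv", out_channels) or ("pool",). With padding 1, a 3x3
# conv preserves spatial size; each 2x2 pool halves it.
LAYERS = [
    ("conv", 64), ("conv", 64), ("conv", 64), ("pool",),
    ("conv", 128), ("conv", 128), ("pool",),
    ("conv", 256), ("conv", 256), ("conv", 256), ("pool",),
    ("conv", 512), ("conv", 512), ("pool",),
]

def output_shape(h, w):
    """Trace the spatial size of a feature map through the 14-layer stack."""
    for layer in LAYERS:
        if layer[0] == "pool":
            h, w = h // 2, w // 2
    return h, w
```

Four pooling layers divide the input resolution by 16, so a 224×224 face image would yield 14×14 feature maps under these assumptions.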
Further, in step S400, the image features are fed into the residual net 210, which learns the mapping from the image features to the difference between the noise labels and the ground-truth labels and outputs the first predicted value; the noise labels supervise all image features entering the residual net 210. The image features are also fed into the standard neural network 220, which learns the mapping from the image features to the ground-truth labels and outputs the second predicted value; the ground-truth labels supervise all image features entering the standard neural network 220. The sum of the first and second predicted values is fed into the first classifier 230 to obtain the first facial beauty prediction result, and the second predicted value is fed into the second classifier 240 to obtain the second facial beauty prediction result. The final facial beauty prediction result is obtained from the two as K = w1*K1 + w2*K2, where K is the final facial beauty prediction result, w1 and w2 are weights, and K1 and K2 are the first and second facial beauty prediction results respectively.
The loss function of the first classifier 230 is

L_1 = -(1/N_n) Σ_{i∈D_n} y_i log h_i

and the loss function of the second classifier 240 is

L_2 = -(1/N_n) Σ_{j∈D_n} v_j log g_j

where h_i is the sum of the first predicted value and the second predicted value, g_j is the second predicted value, y_i is the noise label, v_j is the ground-truth label, D_n is the set of image features, and N_n is the number of image features.
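Read as mean cross-entropies over the N_n features (an assumption, since the published formulas appear only as image placeholders), both classifier losses share one form and can be sketched as:

```python
import numpy as np

def classifier_loss(labels_onehot, preds, eps=1e-12):
    """-(1/N_n) * sum_i y_i . log(h_i): mean cross-entropy between labels
    (noise labels y for the first classifier, ground-truth labels v for
    the second) and classifier outputs (h or g)."""
    preds = np.clip(preds, eps, 1.0)   # avoid log(0)
    return float(-np.mean(np.sum(labels_onehot * np.log(preds), axis=1)))
```

The same function would be applied twice: once with the noise labels against the summed predictions h, and once with the ground-truth labels against the second predictions g.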
In addition, the overall objective of the part composed of the residual net 210, the standard neural network 220, the first classifier 230 and the second classifier 240 is

min_W [ L_1(W) + α L_2(W) ]

where W denotes the hyperparameters and α is the trade-off parameter between the loss value of the residual net 210 and the loss value of the standard neural network 220.
Referring to Fig. 2, some embodiments of the present invention provide a beauty prediction device based on migration and weak supervision, which applies the beauty prediction method based on migration and weak supervision of the method embodiments. The beauty prediction device comprises:

a preprocessing module 10 configured to preprocess an input face image to obtain a preprocessed image;

a migration module 20 configured to train a source-domain network 110 with the preprocessed image and to migrate the parameters of the trained source-domain network 110 to a target-domain network 120, wherein during migration, for the source-domain network 110, the loss function of the source-domain network 110 is obtained from the original labels and the output of a softmax layer of the source-domain network 110 whose T value is greater than 1; for the target-domain network 120, a first sub-loss function is obtained from the original labels and the output of a softmax layer of the target-domain network 120 whose T value is greater than 1, a second sub-loss function is obtained from the original labels and the output of a softmax layer of the target-domain network 120 whose T value is equal to 1, and the two sub-loss functions are added to obtain the loss function of the target-domain network 120;

a feature extraction module 30 configured to divide the preprocessed images into noise images carrying noise labels and ground-truth images carrying ground-truth labels, and to feed the noise images and the ground-truth images into the target-domain network 120 to obtain image features; and

a classification module 40 configured to feed the image features into a residual net 210 which learns a mapping from the image features to the difference between the noise labels and the ground-truth labels and outputs a first predicted value, and into a standard neural network 220 which learns a mapping from the image features to the ground-truth labels and outputs a second predicted value, to add the first and second predicted values and feed the sum into a first classifier 230 to obtain a first facial beauty prediction result, to feed the second predicted value into a second classifier 240 to obtain a second facial beauty prediction result, and to obtain a final facial beauty prediction result from the first and second facial beauty prediction results.

In this device embodiment, the beauty prediction device based on migration and weak supervision applies the beauty prediction method based on migration and weak supervision of the method embodiments; through the cooperation of its modules it can perform each step of the method and achieves the same technical effects, which are not detailed again here.
In some embodiments of the present invention, a beauty prediction device based on migration and weak supervision comprises a processor and a memory connected to the processor; the memory stores executable instructions, and the processor executes the executable instructions to perform the beauty prediction method based on migration and weak supervision of the method embodiments.

In some embodiments of the present invention, a storage medium stores executable instructions which can be executed by a computer to cause the computer to perform the beauty prediction method based on migration and weak supervision of the method embodiments.

Examples of the storage medium include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.

The above are merely preferred embodiments of the present invention; the present invention is not limited to the above implementations, and any implementation that achieves the technical effects of the present invention by the same means shall fall within the scope of protection of the present invention.

Claims (10)

  1. A beauty prediction method based on migration and weak supervision, characterized by comprising the following steps:
    preprocessing an input face image to obtain a preprocessed image;
    training a source-domain network with the preprocessed image, and migrating the parameters of the trained source-domain network to a target-domain network, wherein during migration, for the source-domain network, a loss function of the source-domain network is obtained from original labels and the output of a softmax layer of the source-domain network whose T value is greater than 1; for the target-domain network, a first sub-loss function of the target-domain network is obtained from the original labels and the output of a softmax layer of the target-domain network whose T value is greater than 1, a second sub-loss function of the target-domain network is obtained from the original labels and the output of a softmax layer of the target-domain network whose T value is equal to 1, and the first sub-loss function and the second sub-loss function are added to obtain the loss function of the target-domain network;
    dividing the preprocessed images into noise images carrying noise labels and ground-truth images carrying ground-truth labels, and feeding the noise images and the ground-truth images into the target-domain network to obtain image features; and
    feeding the image features into a residual net which learns a mapping from the image features to the difference between the noise labels and the ground-truth labels and outputs a first predicted value, and into a standard neural network which learns a mapping from the image features to the ground-truth labels and outputs a second predicted value; adding the first predicted value and the second predicted value and feeding the sum into a first classifier to obtain a first facial beauty prediction result; feeding the second predicted value into a second classifier to obtain a second facial beauty prediction result; and obtaining a final facial beauty prediction result from the first facial beauty prediction result and the second facial beauty prediction result.
  2. The beauty prediction method based on migration and weak supervision according to claim 1, characterized in that preprocessing the input face image to obtain the preprocessed image specifically comprises: sequentially performing image enhancement, image rectification, image cropping, image deduplication and image normalization on the face image to obtain the preprocessed image.
  3. The beauty prediction method based on migration and weak supervision according to claim 1, characterized in that the T value is an adjustment parameter defined in the softmax function of the softmax layer, the softmax function being
    q_i = exp(z_i / T) / Σ_j exp(z_j / T)
    where q_i is the output of the softmax function and z is its input.
  4. The beauty prediction method based on migration and weak supervision according to claim 1, characterized in that the loss function of the first classifier is
    L_1 = -(1/N_n) Σ_{i∈D_n} y_i log h_i
    and the loss function of the second classifier is
    L_2 = -(1/N_n) Σ_{j∈D_n} v_j log g_j
    where h_i is the sum of the first predicted value and the second predicted value, g_j is the second predicted value, y_i is the noise label, v_j is the ground-truth label, D_n is the set of image features, and N_n is the number of image features.
  5. A beauty prediction device based on migration and weak supervision, characterized by comprising:
    a preprocessing module configured to preprocess an input face image to obtain a preprocessed image;
    a migration module configured to train a source-domain network with the preprocessed image and to migrate the parameters of the trained source-domain network to a target-domain network, wherein during migration, for the source-domain network, a loss function of the source-domain network is obtained from original labels and the output of a softmax layer of the source-domain network whose T value is greater than 1; for the target-domain network, a first sub-loss function of the target-domain network is obtained from the original labels and the output of a softmax layer of the target-domain network whose T value is greater than 1, a second sub-loss function of the target-domain network is obtained from the original labels and the output of a softmax layer of the target-domain network whose T value is equal to 1, and the first sub-loss function and the second sub-loss function are added to obtain the loss function of the target-domain network;
    a feature extraction module configured to divide the preprocessed images into noise images carrying noise labels and ground-truth images carrying ground-truth labels, and to feed the noise images and the ground-truth images into the target-domain network to obtain image features; and
    a classification module configured to feed the image features into a residual net which learns a mapping from the image features to the difference between the noise labels and the ground-truth labels and outputs a first predicted value, and into a standard neural network which learns a mapping from the image features to the ground-truth labels and outputs a second predicted value, to add the first predicted value and the second predicted value and feed the sum into a first classifier to obtain a first facial beauty prediction result, to feed the second predicted value into a second classifier to obtain a second facial beauty prediction result, and to obtain a final facial beauty prediction result from the first facial beauty prediction result and the second facial beauty prediction result.
  6. The beauty prediction device based on migration and weak supervision according to claim 5, characterized in that preprocessing the input face image to obtain the preprocessed image specifically comprises: sequentially performing image enhancement, image rectification, image cropping, image deduplication and image normalization on the face image to obtain the preprocessed image.
  7. The beauty prediction device based on migration and weak supervision according to claim 5, characterized in that the T value is an adjustment parameter defined in the softmax function of the softmax layer, the softmax function being
    q_i = exp(z_i / T) / Σ_j exp(z_j / T)
    where q_i is the output of the softmax function and z is its input.
  8. The beauty prediction device based on migration and weak supervision according to claim 5, characterized in that the loss function of the first classifier is
    L_1 = -(1/N_n) Σ_{i∈D_n} y_i log h_i
    and the loss function of the second classifier is
    L_2 = -(1/N_n) Σ_{j∈D_n} v_j log g_j
    where h_i is the sum of the first predicted value and the second predicted value, g_j is the second predicted value, y_i is the noise label, v_j is the ground-truth label, D_n is the set of image features, and N_n is the number of image features.
  9. A beauty prediction device based on migration and weak supervision, characterized by comprising a processor and a memory connected to the processor, wherein the memory stores executable instructions and the processor executes the executable instructions to perform the beauty prediction method based on migration and weak supervision according to any one of claims 1 to 4.
  10. A storage medium, characterized in that the storage medium stores executable instructions which can be executed by a computer to cause the computer to perform the beauty prediction method based on migration and weak supervision according to any one of claims 1 to 4.
PCT/CN2020/104569 2020-06-24 2020-07-24 基于迁移与弱监督的美丽预测方法、装置及存储介质 WO2021258482A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/414,196 US11769319B2 (en) 2020-06-24 2020-07-24 Method and device for predicting beauty based on migration and weak supervision, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010586901.9 2020-06-24
CN202010586901.9A CN111832435A (zh) 2020-06-24 Beauty prediction method and device based on migration and weak supervision, and storage medium

Publications (1)

Publication Number Publication Date
WO2021258482A1 true WO2021258482A1 (zh) 2021-12-30

Family

ID=72898155

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/104569 WO2021258482A1 (zh) 2020-06-24 2020-07-24 Beauty prediction method and device based on migration and weak supervision, and storage medium

Country Status (3)

Country Link
US (1) US11769319B2 (zh)
CN (1) CN111832435A (zh)
WO (1) WO2021258482A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114550315A (zh) * 2022-01-24 2022-05-27 云南联合视觉科技有限公司 Identity comparison and recognition method, device and terminal equipment

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114898424B (zh) * 2022-04-01 2024-04-26 中南大学 Lightweight facial aesthetics prediction method based on dual label distribution

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108182394A (zh) * 2017-12-22 2018-06-19 浙江大华技术股份有限公司 Training method for a convolutional neural network, and face recognition method and device
CN108520213A (zh) * 2018-03-28 2018-09-11 五邑大学 Facial beauty prediction method based on multi-scale depth
CN108629338A (zh) * 2018-06-14 2018-10-09 五邑大学 Facial beauty prediction method based on LBP and a convolutional neural network
CN109344855A (zh) * 2018-08-10 2019-02-15 华南理工大学 Facial beauty evaluation method using a ranking-guided regression deep model
CN109492666A (zh) * 2018-09-30 2019-03-19 北京百卓网络技术有限公司 Image recognition model training method, device and storage medium
CN110119689A (zh) * 2019-04-18 2019-08-13 五邑大学 Facial beauty prediction method based on multi-task transfer learning
CN110414489A (zh) * 2019-08-21 2019-11-05 五邑大学 Facial beauty prediction method based on multi-task learning
CN110705406A (zh) * 2019-09-20 2020-01-17 五邑大学 Facial beauty prediction method and device based on adversarial transfer learning
CN110728294A (zh) * 2019-08-30 2020-01-24 北京影谱科技股份有限公司 Method and device for constructing a cross-domain image classification model based on transfer learning
CN111274422A (zh) * 2018-12-04 2020-06-12 北京嘀嘀无限科技发展有限公司 Model training method, image feature extraction method, device and electronic device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11126826B1 (en) * 2019-04-03 2021-09-21 Shallow.Ai Inc. Machine learning system and method for recognizing facial images
CN111985265B (zh) * 2019-05-21 2024-04-12 华为技术有限公司 Image processing method and device
US11521011B2 (en) * 2019-06-06 2022-12-06 Samsung Electronics Co., Ltd. Method and apparatus for training neural network model for enhancing image detail
CN110705407B (zh) * 2019-09-20 2022-11-15 五邑大学 Facial beauty prediction method and device based on multi-task migration
CN111080123A (zh) * 2019-12-14 2020-04-28 支付宝(杭州)信息技术有限公司 User risk assessment method and device, electronic device and storage medium


Also Published As

Publication number Publication date
CN111832435A (zh) 2020-10-27
US11769319B2 (en) 2023-09-26
US20220309768A1 (en) 2022-09-29


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20941714; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20941714; Country of ref document: EP; Kind code of ref document: A1)