CN114943999A - Method for training age detection model, age detection method and related device - Google Patents

Method for training age detection model, age detection method and related device

Info

Publication number
CN114943999A
CN114943999A
Authority
CN
China
Prior art keywords
age
information
neural network
training
training set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210623051.4A
Other languages
Chinese (zh)
Inventor
陈仿雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Shuliantianxia Intelligent Technology Co Ltd
Original Assignee
Shenzhen Shuliantianxia Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Shuliantianxia Intelligent Technology Co Ltd filed Critical Shenzhen Shuliantianxia Intelligent Technology Co Ltd
Priority to CN202210623051.4A priority Critical patent/CN114943999A/en
Publication of CN114943999A publication Critical patent/CN114943999A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/178Human faces, e.g. facial parts, sketches or expressions estimating age from face image; using age information for improving recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/216Parsing using statistical methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/242Dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The embodiment of the application relates to the technical field of facial image attribute prediction, and discloses a method for training an age detection model, an age detection method and a related device. In the method, a training set comprising a plurality of face images each labeled with a real age is acquired, together with age interference information corresponding to each face image, such as shooting information and/or expression information. The age interference information is then encoded to obtain an information feature code. Finally, a neural network is iteratively trained using the training set and the information feature codes corresponding to the face images in the training set to obtain an age detection model. In this way, the neural network learns both the features of the face images and the age interference features during training, that is, it clearly distinguishes the age interference features from the effective features that determine age-prediction accuracy; the interference features can therefore be filtered out during age prediction, their influence on network learning is reduced, and the trained age detection model can detect age accurately and stably.

Description

Method for training age detection model, age detection method and related device
Technical Field
The embodiment of the application relates to the technical field of human face image attribute prediction, in particular to a method for training an age detection model, an age detection method and a related device.
Background
A face image contains a variety of facial feature information, such as face shape, skin state, expression, facial features and age, among which age is relatively important feature information that is widely used in the field of face image detection. For example, some clients running on mobile devices provide an age detection function: the client obtains a face image and, based on it, outputs the detected age as feedback to the user.
However, variations in shooting angle, facial expression and the like can cause the age recognized for the same person to fluctuate greatly, so such detection lacks stability to a certain extent.
Disclosure of Invention
The embodiment of the application mainly solves the technical problem of providing a method for training an age detection model, an age detection method and a related device, wherein the age detection model obtained by training by the method can accurately and stably detect the age.
In order to solve the above technical problem, in a first aspect, an embodiment of the present application provides a method for training an age detection model, including:
acquiring a training set, wherein the training set comprises a plurality of face images, and each face image is marked with a real age;
acquiring age interference information corresponding to each face image in a training set, wherein the age interference information comprises shooting information and/or expression information, and the age interference information is an interference factor for interfering a neural network to accurately detect age;
coding the age interference information to obtain an information characteristic code;
and performing iterative training on a neural network by adopting the training set and the information characteristic codes corresponding to the face images in the training set to obtain an age detection model.
In some embodiments, said encoding said age interference information to obtain an information characteristic code comprises:
coding the text data in the age interference information to obtain a text code;
and splicing the text code and the numerical data in the age interference information to obtain the information characteristic code.
In some embodiments, said encoding text data in the age interference information to obtain a text code includes:
and coding the text data in the age interference information by adopting a bag-of-words model to obtain the text code.
In some embodiments, before the performing iterative training on a neural network by using the training set and the information feature codes corresponding to the face images in the training set to obtain an age detection model, the method further includes:
extracting the characteristics of the information characteristic codes by adopting a multilayer perceptron module to obtain target information vectors;
adopting the training set and the information feature codes corresponding to the face images in the training set to carry out iterative training on a neural network to obtain an age detection model, comprising the following steps:
and performing iterative training on the neural network and the multilayer perceptron module by adopting the training set and target information vectors corresponding to the training set to obtain the age detection model.
In some embodiments, the neural network comprises, in cascade, a convolution module, a fully connected layer, a fusion layer and a classification layer, wherein the convolution module comprises a plurality of convolutional layers;
adopting the training set and the target information vector corresponding to the training set to carry out iterative training on the neural network and the multilayer perceptron module to obtain the age detection model, comprising the following steps:
inputting the training set into the convolution module of the neural network, wherein the last convolutional layer of the convolution module outputs an age feature map;
inputting the age feature map into the fully connected layer of the neural network to obtain an age feature vector;
inputting the age feature vector and the target information vector into the fusion layer for fusion, and inputting the fused vector obtained by fusion into the classification layer for age classification to obtain a predicted age;
and calculating the loss between each real age and the corresponding predicted age by adopting a loss function, and adjusting model parameters of the neural network and the multilayer perceptron module according to the loss until convergence to obtain the age detection model.
In some embodiments, said inputting said age feature vector and said target information vector into said fusion layer for fusion comprises:
and multiplying and fusing the age characteristic vector and the target information vector to obtain the fused vector.
In some embodiments, the loss function comprises:
L = -∑_{i=1}^{n} Y_Ti · P_i

P_i = log(Y_ci)

where Y_ci (i ∈ {1, 2, …, n}) is the probability assigned to the i-th age in the predicted age distribution, Y_Ti is the probability of the i-th age in the real-age label, and n is the maximum age.
In order to solve the above technical problem, in a second aspect, an age detection method is provided in an embodiment of the present application, and includes:
acquiring a human face image to be detected;
acquiring age interference information corresponding to the face image to be detected, wherein the age interference information comprises shooting information and/or expression information, and the age interference information is an interference factor for interfering a neural network to accurately detect age;
encoding age interference information corresponding to the face image to be detected to obtain information characteristic codes corresponding to the face image to be detected;
and inputting the facial image to be detected and the information characteristic code corresponding to the facial image to be detected into an age detection model for age detection to obtain the age corresponding to the facial image to be detected.
In order to solve the above technical problem, in a third aspect, an embodiment of the present application provides an electronic device, including:
at least one processor, and
a memory communicatively coupled to the at least one processor, wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect.
In order to solve the above technical problem, in a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing computer-executable instructions for causing a computer device to perform the method of the first aspect.
The beneficial effects of the embodiment of the application are as follows: different from the prior art, in the method for training an age detection model provided in the embodiment of the present application, a training set comprising a plurality of face images labeled with real ages is used, and age interference information corresponding to each face image in the training set is first acquired; for example, the age interference information includes shooting information and/or expression information. The age interference information is then encoded to obtain an information feature code. Finally, the neural network is iteratively trained using the training set and the information feature codes corresponding to the face images in the training set to obtain an age detection model. In this embodiment, the neural network learns both the features of the face images and the age interference features during training, that is, it clearly distinguishes the age interference features from the effective features affecting age-prediction accuracy, making the effective features more discriminable; the interference features can thus be filtered out during age prediction, their influence on network learning is reduced, and the trained age detection model can detect age accurately and stably.
Drawings
One or more embodiments are illustrated by way of example in the figures of the accompanying drawings, in which like reference numerals denote similar elements; the figures are not to scale unless otherwise specified.
FIG. 1 is a schematic diagram of an application scenario of an age detection system in some embodiments of the present application;
FIG. 2 is a schematic diagram of an electronic device according to some embodiments of the present application;
FIG. 3 is a schematic flow chart of a method of training an age detection model according to some embodiments of the present application;
FIG. 4 is a schematic flow chart illustrating a sub-step S30 of the method shown in FIG. 3;
FIG. 5 is a schematic flow chart illustrating a sub-process of step S40 in the method of FIG. 3;
FIG. 6 is a schematic diagram of a neural network used for training in some embodiments of the present application;
fig. 7 is a schematic flow chart of an age detection method according to some embodiments of the present application.
Detailed Description
The present application will be described in detail with reference to specific examples. The following examples will aid those skilled in the art in further understanding the present application, but are not intended to limit it in any way. It should be noted that various changes and modifications can be made by one skilled in the art without departing from the spirit of the application; all such changes and modifications fall within the scope of protection of the present application.
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It should be noted that, where they do not conflict, the various features of the embodiments of the present application may be combined with each other within the scope of protection of the present application. Additionally, although the device schematics divide functionality into modules and the flowcharts show logical sequences, in some cases the steps shown or described may be performed in an order different from the module division or the flowchart order. Further, the terms "first," "second," "third," and the like used herein do not limit data or execution order, but merely distinguish identical or similar items having substantially the same function and effect.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the present application. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
In addition, the technical features mentioned in the embodiments of the present application described below may be combined with each other as long as they do not conflict with each other.
To facilitate understanding of the method provided in the embodiments of the present application, first, terms referred to in the embodiments of the present application will be described:
(1) neural network
The neural network may be composed of neural units and may be understood as a network having an input layer, hidden layers and an output layer; generally, the first layer is the input layer, the last layer is the output layer, and all intermediate layers are hidden layers. A neural network with many hidden layers is called a deep neural network (DNN). The work of each layer in the neural network can be described by the mathematical expression y = a(W·x + b). At the physical level, the work of each layer can be understood as completing the transformation from input space to output space (i.e., from the row space to the column space of the matrix) through five operations on the input space (the set of input vectors): 1. raising/lowering the dimension; 2. scaling up/down; 3. rotation; 4. translation; 5. "bending". Operations 1, 2 and 3 are performed by W·x, operation 4 by +b, and operation 5 by a(). The word "space" is used here because the classified object is not a single thing but a class of things, and space refers to the set of all individuals of that class. W is the weight matrix of each layer of the neural network, and each value in the matrix represents the weight of one neuron of that layer. The matrix W determines the spatial transformation from input space to output space described above, i.e., W at each layer controls how the space is transformed. The purpose of training the neural network is ultimately to obtain the weight matrices of all layers of the trained network. Therefore, the training process of the neural network is essentially learning how to control the spatial transformation, or more specifically, learning the weight matrices.
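As a minimal illustration of the per-layer operation y = a(W·x + b) described above, the following sketch (assuming PyTorch; all names and sizes are illustrative) builds one layer as a linear map followed by a nonlinear activation:

```python
# One neural-network layer: y = a(W·x + b).
import torch
import torch.nn as nn

layer = nn.Linear(in_features=8, out_features=4)  # holds the weight matrix W and offset b
activation = nn.ReLU()                            # the nonlinearity a(), providing the "bending"

x = torch.randn(1, 8)        # one input vector
y = activation(layer(x))     # y = a(W·x + b)
print(y.shape)               # torch.Size([1, 4])
```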
It should be noted that, in the embodiment of the present application, based on the model adopted by the machine learning task, the model is essentially a neural network. The common components in the neural network include a convolutional layer, a pooling layer, a normalization layer, a reverse convolutional layer and the like, the common components in the neural network are assembled to design a model, and when model parameters (weight matrixes of each layer) are determined so that model errors meet preset conditions or the number of the adjusted model parameters reaches a preset threshold, the model converges.
The convolution layer is configured with a plurality of convolution kernels, and each convolution kernel is provided with a corresponding step length so as to carry out convolution operation on the image. The convolution operation aims to extract different features of an input image, a first layer of convolution layer can only extract some low-level features such as edges, lines, angles and other levels, and a deeper convolution layer can iteratively extract more complex features from the low-level features.
The deconvolution (transposed convolution) layer is used to map a low-dimensional space to a higher-dimensional space while maintaining the connection relationship/pattern between them (the connection relationship here refers to the connection relationship during convolution). The deconvolution layer is configured with a plurality of convolution kernels, each with a corresponding step size, to perform the deconvolution operation on the image. Generally, framework libraries for designing neural networks (for example, the PyTorch library) provide a built-in upsampling function (for example, Upsample), and the low-dimensional to high-dimensional spatial mapping can be realized by calling it.
Pooling is a process that mimics the human visual system, in which data can be reduced in size or an image can be represented by higher-level features. Common pooling operations include max pooling, mean pooling, random pooling, median pooling, combined pooling and the like. Generally, pooling layers are periodically inserted between the convolutional layers of a neural network to achieve dimensionality reduction.
The normalization layer is used to perform normalization operations on all neurons in the middle layer to prevent gradient explosion and gradient disappearance.
(2) Loss function
In the process of training a neural network, because the output of the network is expected to be as close as possible to the value actually desired, the weight matrix of each layer can be updated according to the difference between the current predicted value and the truly desired target value (an initialization process is usually performed before the first update, i.e., parameters are pre-configured for each layer of the network). For example, if the network's predicted value is too high, the weight matrices are adjusted so that it predicts lower, and the adjustment continues until the network can predict the truly desired target value. It is therefore necessary to define in advance how the difference between the predicted value and the target value is measured; this is the role of the loss function (or objective function), an important equation for measuring that difference. Taking the loss function as an example, a higher output value (loss) of the loss function indicates a larger difference, so training the neural network becomes a process of reducing the loss as much as possible.
Before describing the embodiments of the present application, a brief description is first given of the age detection method known to the inventor of the present application, so that it is convenient to understand the embodiments of the present application in the following.
In some age detection methods, training samples are obtained, and each sample is a face image labeled with a corresponding age label; determining the age label distribution corresponding to the age label of each sample; and training the age estimation model based on the sample and the age label distribution until the total loss function of the age estimation model converges.
This is a classification method based on statistics. In practical applications, differences in facial expression, shooting angle and the like cause the age predicted for the same person to fluctuate greatly across expressions and shooting angles, so the method lacks stability and accuracy.
In view of the foregoing problems, embodiments of the present application provide a method for training an age detection model, an age detection method, an electronic device and a storage medium. The training method trains a neural network using a plurality of face images and the corresponding age interference information. The neural network can therefore learn both the features of the face images and the age interference features during training, that is, clearly distinguish the age interference features that disturb age prediction from the effective features that determine prediction accuracy, making the effective features more discriminable; the interference features can thus be filtered out during age prediction, their influence on network learning is reduced, and the trained age detection model can detect age accurately and stably.
An exemplary application of the electronic device for training the age detection model or for age detection provided in the embodiment of the present application is described below, and it is understood that the electronic device may train the age detection model, and may also perform age detection on a face image by using the age detection model.
The electronic device provided by the embodiment of the application may be a server, for example a server deployed in the cloud. When used for training the age detection model, the server iteratively trains a neural network provided by other devices or by those skilled in the art using the training set and determines the final model parameters; the neural network configured with these final model parameters is the age detection model. When used for age detection, the server invokes the built-in age detection model and performs the corresponding computation on the face image to be detected provided by another device or a user to obtain the corresponding age.
The electronic device provided by some embodiments of the present application may be various types of terminals such as a notebook computer, a desktop computer, or a mobile device. When the terminal is used for training the age detection model, a person skilled in the art inputs the prepared training set into the terminal, designs a neural network on the terminal, and iteratively trains the neural network by using the training set by the terminal to determine final model parameters, so that the neural network configures the final model parameters, and the age detection model can be obtained. When the terminal is used for age detection, a built-in age detection model is called, corresponding calculation processing is carried out on a face image to be detected input by a user, and the age corresponding to the face image to be detected is obtained.
By way of example, referring to fig. 1, fig. 1 is a schematic view of an application scenario of the age detection system provided in the embodiment of the present application, and the terminal 10 is connected to the server 20 through a network, where the network may be a wide area network or a local area network, or a combination of the two.
The terminal 10 may be used to obtain training sets and build neural networks, for example, one skilled in the art downloads the prepared training sets on the terminal and builds a network structure for the neural network. It is understood that the terminal 10 may also be used to obtain a face image, for example, a user inputs the face image through an input interface, and the terminal automatically obtains the face image after the input is completed; for example, the terminal 10 is provided with a camera through which a face image is captured.
In some embodiments, the terminal 10 locally executes the method for training the age detection model provided in this embodiment to complete training the designed neural network by using the training set, and determine the final model parameters, so that the neural network configures the final model parameters, and the age detection model can be obtained. In some embodiments, the terminal 10 may also send, to the server 20 through the network, a training set stored on the terminal by a person skilled in the art and a constructed neural network, the server 20 receives the training set and the neural network, trains the designed neural network with the training set, determines final model parameters, and then sends the final model parameters to the terminal 10, and the terminal 10 stores the final model parameters, so that the neural network configuration can obtain the final model parameters, that is, the age detection model can be obtained.
In some embodiments, the terminal 10 locally executes the age detection method provided in this embodiment to provide an age detection service for the user, invokes a built-in age detection model, and performs corresponding calculation processing on the facial image to be detected and the age interference information to obtain the age corresponding to the facial image to be detected. In some embodiments, the terminal 10 may also send, to the server 20 through the network, the facial image to be detected and the age interference information input by the user on the terminal, and the server 20 receives the facial image to be detected and the age interference information, invokes a built-in age detection model to perform corresponding calculation processing on the facial image to be detected and the age interference information, to obtain the age corresponding to the facial image to be detected, and then sends the age to the terminal 10. The terminal 10 displays the age on its own display interface to inform the user of the age.
The structure of the electronic device in the embodiment of the present application is described below, and fig. 2 is a schematic structural diagram of the electronic device 500 in the embodiment of the present application, where the electronic device 500 includes at least one processor 510, a memory 550, at least one network interface 520, and a user interface 530. The various components in the electronic device 500 are coupled together by a bus system 540. It is understood that the bus system 540 is used to enable communications among the components. The bus system 540 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 540 in fig. 2.
The Processor 510 may be an integrated circuit chip having Signal processing capabilities, such as a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like, wherein the general purpose Processor may be a microprocessor or any conventional Processor, or the like.
The user interface 530 includes one or more output devices 531 enabling presentation of media content, including one or more speakers and/or one or more visual display screens. The user interface 530 also includes one or more input devices 532, including user interface components to facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
The memory 550 may comprise volatile memory or nonvolatile memory, and may also comprise both volatile and nonvolatile memory. The non-volatile Memory may be a Read Only Memory (ROM), and the volatile Memory may be a Random Access Memory (RAM). The memory 550 described in embodiments herein is intended to comprise any suitable type of memory. Memory 550 optionally includes one or more storage devices physically located remote from processor 510.
In some embodiments, memory 550 may be capable of storing data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 551 including system programs for processing various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks;
a network communication module 552 for communicating with other computing devices via one or more (wired or wireless) network interfaces 520, exemplary network interfaces 520 including Bluetooth, Wireless Fidelity (WiFi), and Universal Serial Bus (USB), among others;
a display module 553 for enabling presentation of information (e.g., a user interface for operating peripherals and displaying content and information) via one or more output devices 531 (e.g., a display screen, speakers, etc.) associated with the user interface 530;
an input processing module 554 to detect one or more user inputs or interactions from one of the one or more input devices 532 and to translate the detected inputs or interactions.
As can be understood from the foregoing, the method for training an age detection model and the age detection method provided in the embodiments of the present application may be implemented by various types of electronic devices with computing processing capabilities, such as an intelligent terminal and a server.
The method for training the age detection model provided by the embodiment of the present application is described below with reference to an exemplary application and implementation of the server provided by the embodiment of the present application. Referring to fig. 3, fig. 3 is a schematic flowchart of a method for training an age detection model according to an embodiment of the present application.
Referring to fig. 3 again, the method S100 may specifically include the following steps:
s10: and acquiring a training set, wherein the training set comprises a plurality of face images, and the real age is marked on each face image.
The training set comprises a number of face images, each of which contains a human face. It is understood that in the training set each face image is labeled with a real age, and the real ages range from 1 to 100 years.
The real age of each face image may be labeled by one-hot encoding; for example, if the age range to be detected is 1 to 100, a 100-dimensional vector is used as the real-age label of a face image. It will be appreciated that one-hot label encoding is common practice for those skilled in the art and is not described in detail here.
In some embodiments, the terminal or the server may normalize the face images in the training set, which helps improve the convergence speed and accuracy of subsequent model training. Specifically, in some embodiments, the size of each face image may be set to 256 × 256, and the pixel values scaled from the range 0-255 to the range 0-1.
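A sketch of this preprocessing and of the one-hot age labeling described above, assuming Python with OpenCV and NumPy (function and constant names are illustrative):

```python
# Resize to 256x256, scale pixels to [0, 1], and one-hot encode the real age.
import numpy as np
import cv2

MAX_AGE = 100  # ages 1..100, per the description above

def preprocess(image_path: str, true_age: int):
    img = cv2.imread(image_path)
    img = cv2.resize(img, (256, 256))
    img = img.astype(np.float32) / 255.0          # pixel range 0-255 -> 0-1
    label = np.zeros(MAX_AGE, dtype=np.float32)   # 100-dimensional one-hot vector
    label[true_age - 1] = 1.0                     # age i -> index i-1
    return img, label
```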
In some embodiments, the number of face images in the training set is on the order of tens of thousands, for example 20,000, which helps train an accurate, general model. The number of face images can be determined by those skilled in the art according to the actual situation.
S20: and acquiring age interference information corresponding to each face image in the training set.
It is understood that the facial skin state, the look of the eyes and the like in a face image are effective information for judging age. Besides such effective information, the face image also contains age interference information. Age interference information is an interference factor that hinders the neural network from detecting age accurately. It can be understood that the face of the same person presents different states at different shooting angles, which can affect the learning and judgment of the neural network; likewise, the face of the same person presents different states under different expressions, which can also affect the learning and judgment of the neural network. In some embodiments, the age interference information includes shooting information and/or expression information.
In some embodiments, the age interference information may also include hairstyle information. It can be understood that different hairstyles occlude the face to different degrees and change its apparent temperament or perceived gender. Thus, for the same person with different hairstyles, the neural network may give different age predictions.
Considering the influence of age interference information on the neural network, the age interference information corresponding to each face image in the training set is acquired here. In some embodiments, the age interference information includes shooting information and expression information. The shooting information includes a pitch angle, a yaw angle, a roll angle and the like; the expression information includes laughing, smiling, sadness and the like. For example, the age interference information of a face image of a 32-year-old may be: pitch angle 45°, yaw angle 10°, roll angle 10°, laughing (expression).
S30: and coding the age interference information to obtain an information characteristic code.
It is understood that the age interference information includes text data and numerical data. In order to convert the age interference information into numerical data which can be learned by a neural network, the age interference information is coded into an information characteristic code before training. It will be appreciated that the information characteristic code is numerical data, which may be a vector, for example. Thus, the neural network can learn age-related information reflected by the information feature code.
In some embodiments, referring to fig. 4, the step S30 specifically includes:
s31: and coding the text data in the age interference information to obtain a text code.
S32: and splicing the text code and numerical data in the age interference information to obtain an information characteristic code.
It will be appreciated that the text data may include expressions, angle names and the like, such as the text "pitch angle, yaw angle, roll angle, laughing" mentioned above. In some embodiments, the text data may be encoded by one-hot encoding or a similar scheme to obtain the corresponding text code.
In some embodiments, the step S31 specifically includes: and coding the text data in the age interference information by adopting a bag-of-words model to obtain a text code.
The bag-of-words model selects the words in the text data and puts them into a word bag, counts how many times each word in the bag appears in the text data, and expresses the counts in vector form. In this embodiment, the text data in the age interference information of all face images in the training set is first aggregated to construct a dictionary. It will be appreciated that the dictionary contains the words appearing in the text data of all face images in the training set, together with their positions; for example, the constructed dictionary may be { 1: pitch angle, 2: yaw angle, 3: roll angle, 4: laughing, 5: sad, 6: smiling }, in which the number preceding each word is its position information.
In this embodiment, for example, for the text data "pitch angle, yaw angle, roll angle, laughing", the text code obtained with the dictionary of the above example may be "111100". It can be seen from the text code "111100" that "pitch angle", "yaw angle", "roll angle" and "laughing" each occur once in the text data, while "sad" and "smiling" occur zero times.
And after the text code corresponding to the text data in the age interference information is acquired, splicing the text code with the numerical data in the age interference information to obtain the information characteristic code. The splicing mode of the text code and the numerical data can be horizontal splicing.
It is to be understood that the numerical data in the age interference information may include the various angle values, for example a pitch angle of 45°, a yaw angle of 10° and a roll angle of 10°. In some embodiments, 45 is encoded as the 9-bit binary string 000101101 and 10 as 000001010, so that after the text code "111100" is spliced with the numerical data "000101101, 000001010, 000001010", the resulting information feature code may be "111100000101101000001010000001010".
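The two steps S31-S32 can be sketched as follows, using the example dictionary and the 9-bit binary width from above (both are illustrative choices, not requirements):

```python
# Bag-of-words text code plus binary-encoded numeric data, spliced horizontally.
VOCAB = ["pitch angle", "yaw angle", "roll angle", "laughing", "sad", "smiling"]

def encode_text(words):
    # Count of each dictionary word in the text data, as a digit string.
    return "".join(str(words.count(w)) for w in VOCAB)

def encode_number(value, bits=9):
    return format(value, f"0{bits}b")  # e.g. 45 -> "000101101"

text_code = encode_text(["pitch angle", "yaw angle", "roll angle", "laughing"])
feature_code = text_code + "".join(encode_number(v) for v in (45, 10, 10))
print(feature_code)  # 111100 + 000101101 + 000001010 + 000001010
```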
In this embodiment, the text data is encoded and the resulting text code is spliced with the numerical data, thereby digitizing the age interference information and facilitating subsequent learning by the neural network.
S40: and performing iterative training on the neural network by adopting the training set and the information characteristic codes corresponding to the face images in the training set to obtain an age detection model.
It can be understood that, in the training process, the neural network learns the image features of a plurality of face images in the training set and the interference features reflected by the corresponding information feature codes, makes an age prediction, and reversely adjusts the model parameters based on the predicted age and the real age, that is, adjusts the feature extraction parameters of the face images, so that the neural network can extract and learn effective features, such as skin features and the like, which affect the accuracy of the age prediction.
In addition, the interference features reflected by the information feature code tell the neural network that interference exists in the face image, so the interference features can be ignored when predicting age, avoiding their influence on the prediction. For example, when the information feature code reflects shooting information and expression information, it informs the neural network that the face in the image carries an expression and was shot at a particular angle; when learning features from the face image, interference features such as expression and shooting angle should therefore be filtered out, so that attention is focused on the effective features that determine age-prediction accuracy.
After multiple times of iterative training and parameter adjustment, the neural network converges to obtain the age detection model. The information feature coding can help the neural network focus attention on effective features influencing the accuracy of age prediction, so that the converged age detection model can accurately and stably detect the age.
It is understood that convergence may refer to the fact that under a certain model parameter, the sum of the differences between the actual ages and the predicted ages in the training set is smaller than a preset threshold or fluctuates within a certain range.
In some embodiments, the Adam algorithm is used to optimize the model parameters. For example, the number of iterations is set to 100,000, the initial learning rate to 0.001, and the weight decay to 0.0005, with the learning rate decayed to 1/10 of its previous value every 1,000 iterations. The learning rate and the differences between each real age in the training set and the corresponding predicted age are input into the Adam algorithm to obtain adjusted model parameters, which are used for the next round of training, until the model parameters of the converged neural network are output when training completes. The converged neural network then serves as the age detection model.
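A hedged sketch of this optimization schedule, assuming PyTorch (the model, data loader and loss function are placeholders supplied by the caller; the loss is defined later in this document):

```python
# Adam with lr 0.001, weight decay 0.0005, lr divided by 10 every 1000 steps,
# run for 100,000 iterations.
import itertools
import torch

def train(model, data_loader, loss_fn, max_steps=100_000):
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.0005)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1000, gamma=0.1)
    data = itertools.islice(itertools.cycle(data_loader), max_steps)
    for images, info_codes, true_ages in data:
        optimizer.zero_grad()
        pred = model(images, info_codes)   # predicted age distribution
        loss = loss_fn(pred, true_ages)
        loss.backward()
        optimizer.step()
        scheduler.step()                   # lr -> lr/10 every 1000 iterations
    return model                           # the converged age detection model
```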
In this embodiment, the neural network learns both the features of the face images and the age interference features during training, that is, it clearly distinguishes the age interference features from the effective features affecting age-prediction accuracy, making the effective features more discriminable; the interference features can thus be filtered out during age prediction, their influence on network learning is reduced, and the trained age detection model can detect age accurately and stably.
In some embodiments, before the foregoing step S40, the method further comprises:
s50: and performing feature extraction on the information feature codes by adopting a multilayer perceptron module to obtain target information vectors.
It is understood that the multi-layered perceptron module includes an input layer, a multi-layered hidden layer, and an output layer, wherein the input layer includes N neurons, the hidden layer includes Q neurons, and the output layer includes K neurons. The operation of each layer may be described by a functional expression, it being understood that the functional expression differs for each layer.
It will be appreciated that if the input information feature code is denoted by x, the input layer passes x to the hidden layer, and the output of the hidden layer may be f(w_1·x + b_1), where w_1 is a weight, b_1 is an offset, and the function f may be a commonly used sigmoid or tanh function. The hidden layer to the output layer is equivalent to a multi-class logistic regression, i.e., softmax regression, so the output of the output layer is softmax(w_2·x_1 + b_2), where x_1 = f(w_1·x + b_1) is the output of the hidden layer.
Therefore, the multi-layer perceptron module can be represented by the following formula:
MLP(x) = G(W_{h+1} · S(W_h · … S(W_1 · x + b_1) … + b_h) + b_{h+1})

where G represents the softmax activation function, h represents the number of hidden layers, W_i and b_i represent the weights and offsets of the i-th hidden layer, x represents the input information feature code, W_1 and b_1 represent the weights and offsets of the input layer, S represents the activation function, and MLP(x) represents the target information vector.
In some embodiments, K may be 1024, so that the output layer outputs a one-dimensional vector with length of 1024, i.e. a target information vector with length of 1024.
Each layer of the multilayer perceptron module uses an activation function, which introduces nonlinearity into the neurons so that the module can approximate arbitrary nonlinear functions and thus exploit more nonlinear models. The multilayer perceptron module has good feature extraction capability for discrete information, so the extracted target information vector sufficiently reflects the features of the information feature code, i.e., the features of the age interference information.
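A minimal sketch of such a multilayer perceptron module, assuming PyTorch; the input width, hidden width and hidden depth are illustrative, while the output width K = 1024 follows the description above:

```python
# Hidden layers use S (sigmoid/tanh); the output layer uses G (softmax).
import torch.nn as nn

class InfoMLP(nn.Module):
    def __init__(self, n_in, q_hidden=256, h_layers=2, k_out=1024):
        super().__init__()
        layers, width = [], n_in
        for _ in range(h_layers):
            layers += [nn.Linear(width, q_hidden), nn.Sigmoid()]  # S
            width = q_hidden
        layers += [nn.Linear(width, k_out), nn.Softmax(dim=-1)]   # G
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # x: the information feature code; returns a length-1024 target information vector
        return self.net(x)
```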
In this embodiment, the step S40 specifically includes:
s41: and performing iterative training on the neural network and the multilayer perceptron module by adopting a training set and target information vectors corresponding to the training set to obtain an age detection model.
In this embodiment, the target information vector extracted by the multilayer perceptron module has the advantages of low dimensionality, fine feature granularity and the like, which facilitates learning by the neural network and improves the accuracy and training speed of the model.
In some embodiments, referring to fig. 6, the neural network includes, in cascade, a convolution module, a fully connected layer, a fusion layer and a classification layer. The convolution module comprises a plurality of convolutional layers and performs down-sampling feature extraction on the input face image. The fully connected layer integrates and classifies the input feature map and outputs a one-dimensional vector. The fusion layer fuses at least two features. The classification layer converts the input vector into a probability vector with values between 0 and 1, thereby realizing classification. It is worth noting that convolutional layers, fully connected layers and classification layers are common components in neural networks, are well known to those skilled in the art, and are not described in detail here.
In this embodiment, referring to fig. 5, the step S41 specifically includes:
s411: and inputting the training set into a convolution module of the neural network, and outputting an age characteristic diagram by the last convolution layer of the convolution module.
In some embodiments, referring to fig. 6, the convolution module includes a plurality of convolutional layers, each followed by a pooling layer for dimensionality reduction. The convolutional layers are configured with 3 × 3 convolution kernels with a step size of 2, the numbers of convolution kernels in the successive convolutional layers are 32, 64, 128, 256 and 512, respectively, and the sizes of the corresponding output feature maps are 256 × 256 × 32, 128 × 128 × 64, 64 × 64 × 128, 32 × 32 × 256 and 16 × 16 × 512, respectively. It can be understood that the feature map output by the last convolutional layer in the convolution module is the age feature map.
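A sketch of such a backbone, assuming PyTorch. The listed feature-map sizes correspond to one spatial halving per stage, so this sketch uses stride-1 convolutions with 2 × 2 max pooling between stages to reproduce those sizes; the exact layer composition is an assumption:

```python
import torch.nn as nn

def stage(c_in, c_out):
    # 3x3 convolution, then pooling that halves the spatial size once per stage.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2, stride=2),
    )

conv_module = nn.Sequential(
    stage(3, 32),                                   # conv out 256x256x32
    stage(32, 64),                                  # conv out 128x128x64
    stage(64, 128),                                 # conv out 64x64x128
    stage(128, 256),                                # conv out 32x32x256
    nn.Conv2d(256, 512, kernel_size=3, padding=1),  # 16x16x512: the age feature map
    nn.ReLU(),
)
```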
S412: and inputting the age characteristic diagram into a full connection layer of the neural network to obtain an age characteristic vector.
The fully connected layer integrates and classifies the input features and outputs a one-dimensional vector; thus, after the age feature map is processed by the fully connected layer, a one-dimensional vector, namely the age feature vector, is obtained.
S413: and inputting the age characteristic vector and the target information vector into a fusion layer for fusion, and inputting the fusion vector obtained by fusion into a classification layer for age classification to obtain the predicted age.
The fusion layer fuses at least two features, so after the age feature vector and the target information vector are input into the fusion layer, the resulting fused vector carries both the image features reflected by the age feature vector and the interference features reflected by the target information vector.
Referring again to FIG. 6, the fused vector is processed by the classification layer and converted into a probability vector with values between 0 and 1. It can be understood that the probability vector is a vector representation of the predicted age, in which each element is the probability that the face in the face image is of the corresponding age; the age with the highest probability is the predicted age.
Because the predicted age is obtained by classifying the fused vector, the neural network can learn the features of the face image together with the age interference features, that is, clearly distinguish the age interference features from the effective features affecting age-prediction accuracy, so that the effective features are more discriminable; the interference features can thus be filtered out during age prediction, and their influence on network learning is reduced.
In some embodiments, the aforementioned "inputting the age feature vector and the target information vector into the fusion layer for fusion" includes: multiplying and fusing the age feature vector and the target information vector to obtain the fused vector.
It can be understood that multiplication fusion multiplies the age feature vector and the target information vector element-wise at corresponding positions, which increases the weight of the target information within the age feature vector; that is, the target information carries substantial weight in the resulting fused vector and is easy to discriminate, which helps the neural network clearly distinguish the age interference features from the effective features affecting age-prediction accuracy.
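A sketch of this fusion and classification step, assuming PyTorch and assuming the age feature vector and target information vector share the same length (for example 1024) so that element-wise multiplication is defined; names and dimensions are illustrative:

```python
import torch.nn as nn

class FusionHead(nn.Module):
    def __init__(self, dim=1024, n_ages=100):
        super().__init__()
        # classification layer: probabilities over the n age classes
        self.classifier = nn.Sequential(nn.Linear(dim, n_ages), nn.Softmax(dim=-1))

    def forward(self, age_vec, info_vec):
        fused = age_vec * info_vec      # element-wise multiplication fusion
        return self.classifier(fused)   # probability per age, values in (0, 1)

# predicted age = index of the highest probability, offset to the range 1..100:
# pred_age = probs.argmax(dim=-1) + 1
```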
S414: and calculating the loss between each real age and the corresponding predicted age by adopting a loss function, and adjusting model parameters of the neural network and the multilayer perceptron module according to the loss until convergence to obtain an age detection model.
Here, the loss function may be configured in the terminal by a person skilled in the art, the configured loss function is sent to the server along with the neural network and the multi-layer perceptron module, after the server processes the predicted age corresponding to each face image in the training set, the server calculates the loss between each real age and the predicted age in the training set by using the loss function, and iteratively trains the neural network and the multi-layer perceptron module based on the loss until convergence to obtain the age detection model.
In this embodiment, a loss function is used to calculate the difference between the real age and the predicted age corresponding to each face image in the training set. The loss function has been described in detail in the term description "(2) Loss function" above and is not repeated here. It is understood that the structure of the loss function can be set according to the actual situation, based on the network structure and training mode.
In some embodiments, the loss function comprises:
L = -∑_{i=1}^{n} Y_Ti · P_i

P_i = log(Y_ci)

where Y_ci (i ∈ {1, 2, …, n}) is the probability assigned to the i-th age in the predicted age distribution, Y_Ti is the probability of the i-th age in the real-age label, and n is the maximum age.
This loss function has a strong ability to distinguish between classes, and thus gradient vanishing can be effectively avoided.
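Under the reconstruction above (standard cross-entropy between the one-hot real-age vector and the predicted probabilities), the loss can be sketched as follows, assuming PyTorch:

```python
# L = -sum_i Y_Ti * P_i with P_i = log(Y_ci), averaged over the batch.
import torch

def age_loss(pred_probs: torch.Tensor, true_onehot: torch.Tensor) -> torch.Tensor:
    # pred_probs, true_onehot: shape (batch, n), with n the maximum age
    p = torch.log(pred_probs.clamp_min(1e-12))   # P_i = log(Y_ci), clamped for stability
    return -(true_onehot * p).sum(dim=1).mean()
```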
In summary, in the method for training an age detection model provided in the embodiment of the present application, the training set comprises a plurality of face images labeled with real ages, and the age interference information corresponding to each face image in the training set is first acquired; for example, the age interference information includes shooting information and/or expression information. The age interference information is then encoded to obtain an information feature code. Finally, the neural network is iteratively trained using the training set and the information feature codes corresponding to the face images in the training set to obtain the age detection model. In this embodiment, the neural network learns both the features of the face images and the age interference features during training, that is, it clearly distinguishes the age interference features from the effective features affecting age-prediction accuracy, making the effective features more discriminable; the interference features can thus be filtered out during age prediction, their influence on network learning is reduced, and the trained age detection model can detect age accurately and stably.
After the age detection model has been trained by the method provided in the embodiments of the present application, it can be applied to age detection. The age detection method provided in the embodiments of the present application can be implemented by various electronic devices with computing capability, such as smart terminals and servers.
The age detection method provided by the embodiment of the present application is described below with reference to an exemplary application and implementation of the terminal provided by the embodiment of the present application. Referring to fig. 7, fig. 7 is a schematic flowchart of an age detection method according to an embodiment of the present disclosure. The method S30 may include the steps of:
S31: and acquiring a human face image to be detected.
Here, the face image to be detected refers to a face image whose age is to be detected. It is understood that the face image to be detected contains a human face.
In a specific implementation, application software built into a terminal (for example, a smart phone) acquires the face image to be detected. The face image to be detected may be captured by the terminal or input into the terminal by the user.
S32: and acquiring age interference information corresponding to the face image to be detected.
It can be understood that the skin state, the look of the eyes, and other such attributes of the face in the face image to be detected are effective information for judging age. Besides this effective information, the face image to be detected also contains age interference information, that is, image information that interferes with the effective information on which the age detection model relies for identification.
S33: and coding age interference information corresponding to the face image to be detected to obtain information characteristic codes corresponding to the face image to be detected.
It is understood that the age interference information includes text data and numerical data. In order to convert the age interference information into numerical data that the age detection model can compute on, the age interference information is encoded into an information characteristic code before detection. The specific implementation of step S33 may refer to the encoding manner described in the training embodiments above and is not repeated here.
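As a sketch of one plausible encoding consistent with the description above (the expression vocabulary and the choice of numerical fields are illustrative assumptions), the text data can be bag-of-words encoded and then spliced with the numerical data:

```python
import numpy as np

# Illustrative vocabulary of text-valued interference attributes such as
# expression labels; a real vocabulary would be built from the training set.
VOCAB = ["smile", "laugh", "neutral", "frown"]

def encode_interference(expression, shooting_values):
    """Bag-of-words code for the text data, spliced with the numerical data."""
    text_code = np.array([1.0 if word == expression else 0.0 for word in VOCAB])
    # Splice (concatenate) the text code with the numerical shooting data,
    # e.g. shooting angle and shooting distance, into one feature code.
    return np.concatenate([text_code, np.asarray(shooting_values, dtype=float)])

info_code = encode_interference("smile", [30.0, 1.5])  # e.g. angle 30 deg, distance 1.5 m
```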
S34: and inputting the face image to be detected and the information characteristic code corresponding to the face image to be detected into the age detection model for age detection, so as to obtain the age corresponding to the face image to be detected.
Here, the age detection model refers to the age detection model obtained by training according to the method embodiments of fig. 3 to fig. 6; it has the same structure and function as the age detection model in the embodiments described above, which is not repeated here.
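Putting steps S31 to S34 together, an end-to-end sketch might look as follows; the interfaces of the trained model and perceptron module are assumptions for illustration only.

```python
import torch

def detect_age(model, mlp_module, face_image, info_code):
    """S31-S34 in one call; `model` and `mlp_module` are the trained age
    detection model and multilayer perceptron module (interfaces assumed).

    face_image: preprocessed image tensor, shape (3, H, W)
    info_code:  information characteristic code from step S33, shape (code_len,)
    """
    with torch.no_grad():
        target_info = mlp_module(info_code.unsqueeze(0))       # target information vector
        probs = model(face_image.unsqueeze(0), target_info)    # age class probabilities
    return int(probs.argmax(dim=1).item())                     # age with highest probability
```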
Embodiments of the present application also provide a computer-readable storage medium storing computer-executable instructions for causing an electronic device to perform a method for training an age detection model provided in embodiments of the present application, for example, a method for training an age detection model as shown in fig. 3 to 6, or an age detection method provided in embodiments of the present application.
In some embodiments, the storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, an optical disc, or a CD-ROM; or it may be any of various devices including one or any combination of the above memories.
In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may correspond, but do not necessarily correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, for example, in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, subprograms, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device (a device including a smart terminal and a server), or on multiple computing devices located at one site, or distributed across multiple sites interconnected by a communication network.
Embodiments of the present application also provide a computer-readable storage medium storing a computer program, where the computer program includes program instructions, and when the program instructions are executed by a computer, the computer executes a method for training an age detection model or an age detection method as in the foregoing embodiments.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Within the idea of the present application, the technical features in the above embodiments or in different embodiments may also be combined, the steps may be implemented in any order, and many other variations of the different aspects of the present application exist as described above, which are not provided in detail for the sake of brevity. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of the technical features may be equivalently replaced, and such modifications or substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A method of training an age detection model, comprising:
acquiring a training set, wherein the training set comprises a plurality of face images, and each face image is marked with a real age;
acquiring age interference information corresponding to each face image in the training set, wherein the age interference information comprises shooting information and/or expression information, and the age interference information is an interference factor that interferes with accurate age detection by a neural network;
coding the age interference information to obtain an information characteristic code;
and performing iterative training on a neural network by adopting the training set and the information characteristic codes corresponding to the face images in the training set to obtain an age detection model.
2. The method of claim 1, wherein said encoding the age interference information to obtain an information characteristic code comprises:
coding the text data in the age interference information to obtain a text code;
and splicing the text code and the numerical data in the age interference information to obtain the information characteristic code.
3. The method of claim 2, wherein encoding the text data in the age interference information to obtain a text code comprises:
and coding the text data in the age interference information by adopting a bag-of-words model to obtain the text code.
4. The method according to any one of claims 1 to 3, wherein before performing iterative training on a neural network by using the training set and the information feature codes corresponding to the face images in the training set to obtain an age detection model, the method further comprises:
extracting the characteristics of the information characteristic codes by adopting a multilayer perceptron module to obtain target information vectors;
adopting the training set and the information characteristic codes corresponding to the face images in the training set to carry out iterative training on a neural network to obtain an age detection model, comprising the following steps:
and performing iterative training on the neural network and the multilayer perceptron module by adopting the training set and target information vectors corresponding to the training set to obtain the age detection model.
5. The method of claim 4, wherein the neural network comprises a cascade of convolution modules, fully-connected layers, fused layers, and classified layers, wherein the convolution modules comprise a plurality of convolution layers;
adopting the training set and the target information vector corresponding to the training set to carry out iterative training on the neural network and the multilayer perceptron module to obtain the age detection model, comprising the following steps:
inputting the training set into a convolution module of the neural network, wherein the last convolution layer of the convolution module outputs an age characteristic diagram;
inputting the age characteristic diagram into a full connection layer of the neural network to obtain an age characteristic vector;
inputting the age characteristic vector and the target information vector into the fusion layer for fusion, and inputting the fusion vector obtained by fusion into the classification layer for age classification to obtain a predicted age;
and calculating the loss between each real age and the corresponding predicted age by adopting a loss function, and adjusting model parameters of the neural network and the multilayer perceptron module according to the loss until convergence to obtain the age detection model.
6. The method of claim 5, wherein said inputting said age feature vector and said target information vector into said fusion layer for fusion comprises:
and multiplying and fusing the age characteristic vector and the target information vector to obtain the fusion vector.
7. The method of claim 5, wherein the loss function comprises:
$$\mathrm{Loss} = -\sum_{i=1}^{n} Y_{Ti} \cdot P_i$$

$$P_i = \log(Y_{ci})$$

wherein $Y_{ci}$, $i \in \{1, 2, \ldots, n\}$, is the probability corresponding to the i-th age among the predicted ages, $Y_{Ti}$ is the probability corresponding to the i-th age among the real ages, and $n$ is the maximum age.
8. An age detection method, comprising:
acquiring a human face image to be detected;
acquiring age interference information corresponding to the face image to be detected, wherein the age interference information comprises shooting information and/or expression information, and the age interference information is an interference factor that interferes with accurate age detection by a neural network;
encoding age interference information corresponding to the face image to be detected to obtain information characteristic codes corresponding to the face image to be detected;
and inputting the facial image to be detected and the information characteristic code corresponding to the facial image to be detected into an age detection model for age detection to obtain the age corresponding to the facial image to be detected.
9. An electronic device, comprising:
at least one processor, and
a memory communicatively coupled to the at least one processor, wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
10. A computer-readable storage medium having computer-executable instructions stored thereon for causing a computer device to perform the method of any one of claims 1-8.
CN202210623051.4A 2022-06-01 2022-06-01 Method for training age detection model, age detection method and related device Pending CN114943999A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210623051.4A CN114943999A (en) 2022-06-01 2022-06-01 Method for training age detection model, age detection method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210623051.4A CN114943999A (en) 2022-06-01 2022-06-01 Method for training age detection model, age detection method and related device

Publications (1)

Publication Number Publication Date
CN114943999A true CN114943999A (en) 2022-08-26

Family

ID=82908459

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210623051.4A Pending CN114943999A (en) 2022-06-01 2022-06-01 Method for training age detection model, age detection method and related device

Country Status (1)

Country Link
CN (1) CN114943999A (en)

Similar Documents

Publication Publication Date Title
US11657286B2 (en) Structure learning in convolutional neural networks
US11967151B2 (en) Video classification method and apparatus, model training method and apparatus, device, and storage medium
CN110533097B (en) Image definition recognition method and device, electronic equipment and storage medium
CN108764195B (en) Handwriting model training method, handwritten character recognition method, device, equipment and medium
CN109948149B (en) Text classification method and device
JP7403909B2 (en) Operating method of sequence mining model training device, operation method of sequence data processing device, sequence mining model training device, sequence data processing device, computer equipment, and computer program
KR20160034814A (en) Client device with neural network and system including the same
CN109086653B (en) Handwriting model training method, handwritten character recognition method, device, equipment and medium
CN110378438A (en) Training method, device and the relevant device of Image Segmentation Model under label is fault-tolerant
CN112395979B (en) Image-based health state identification method, device, equipment and storage medium
CN114021524B (en) Emotion recognition method, device, equipment and readable storage medium
CN114398961A (en) Visual question-answering method based on multi-mode depth feature fusion and model thereof
CN111475622A (en) Text classification method, device, terminal and storage medium
CN109726291B (en) Loss function optimization method and device of classification model and sample classification method
CN114266897A (en) Method and device for predicting pox types, electronic equipment and storage medium
CN115050064A (en) Face living body detection method, device, equipment and medium
CN108985442B (en) Handwriting model training method, handwritten character recognition method, device, equipment and medium
KR20200143450A (en) Image processing method, device, electronic device and storage medium
CN112749737A (en) Image classification method and device, electronic equipment and storage medium
CN115238909A (en) Data value evaluation method based on federal learning and related equipment thereof
CN113221695A (en) Method for training skin color recognition model, method for recognizing skin color and related device
CN114821244A (en) Method for training clothes classification model, clothes classification method and related device
CN113449840A (en) Neural network training method and device and image classification method and device
CN115439179A (en) Method for training fitting model, virtual fitting method and related device
CN114943999A (en) Method for training age detection model, age detection method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination