CN117315445B - Target identification method, device, electronic equipment and readable storage medium - Google Patents

Target identification method, device, electronic equipment and readable storage medium

Info

Publication number
CN117315445B
CN117315445B
Authority
CN
China
Prior art keywords
task
attribute identification
target
correlation
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311598731.6A
Other languages
Chinese (zh)
Other versions
CN117315445A (en)
Inventor
葛沅
赵雅倩
史宏志
温东超
赵健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd
Priority to CN202311598731.6A
Publication of CN117315445A
Application granted
Publication of CN117315445B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/0464: Convolutional networks [CNN, ConvNet]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172: Classification, e.g. identification

Abstract

The invention discloses a target identification method and device, electronic equipment and a readable storage medium, applied to the technical field of artificial intelligence. During training of an initial multi-task target recognition model, correlation information among different attribute recognition tasks is calculated based on the multi-task loss function of the model. Based on this correlation information, the attribute recognition tasks are grouped with the objective of maximizing the correlation information among all attribute recognition tasks. The grouping result is deployed by replacing the task parameter feature learning layer of the initial multi-task target recognition model, yielding a multi-task target recognition model; this model is then trained with the target sample data set to obtain a multi-task target recognition model that executes a plurality of attribute recognition tasks simultaneously. The method and the device solve the problem in the related art that the correlation among different attribute recognition tasks cannot be determined simply and accurately, and effectively improve the accuracy of target recognition.

Description

Target identification method, device, electronic equipment and readable storage medium
Technical Field
The present invention relates to the field of artificial intelligence, and in particular, to a target identification method, apparatus, electronic device, and readable storage medium.
Background
With the development of artificial intelligence technology, multi-task learning, in which multiple tasks are learned in parallel and multiple prediction results are output, has emerged. Multi-task learning occupies fewer computing resources and can mine the common data characteristics hidden among different tasks, thereby effectively improving model prediction accuracy.
Because the prediction accuracy of a multi-task model is affected by the correlation between tasks, in target recognition based on multi-task learning the related art generally either uses prior knowledge to force the model to group tasks according to certain rules, or performs an evolutionary search over the similarity of multiple attribute recognition tasks with a multi-factor evolutionary algorithm, in order to determine the correlation between different tasks. However, judging task correlation manually from prior knowledge has low accuracy and high randomness, while the multi-factor evolutionary method is computationally complex and unsuitable for scenarios with a very large number of tasks.
In view of this, simply and accurately determining the correlation between different attribute recognition tasks to improve the accuracy of target recognition is a technical problem that needs to be solved by those skilled in the art.
Disclosure of Invention
The invention provides a target identification method, a target identification device, electronic equipment and a readable storage medium, which can simply and accurately determine the correlation among different attribute identification tasks and effectively improve the accuracy of target identification.
In order to solve the technical problems, the invention provides the following technical scheme:
the first aspect of the present invention provides a target recognition method, including:
acquiring a first target sample data set and a second target sample data set labeled with attribute category labels;
in the process of training an initial multi-task target recognition model by using the first target sample data set, calculating correlation information among different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task target recognition model; the initial multi-task target recognition model comprises an input layer, a shared parameter characteristic learning layer, a task parameter characteristic learning layer and an output layer which are sequentially connected;
based on each correlation information, grouping each attribute identification task with the aim of maximizing the correlation information among all attribute identification tasks to obtain a grouping result; each group of attribute identification tasks at least comprises one attribute identification task;
Deploying the grouping result to the initial multi-task target recognition model by replacing the task parameter feature learning layer to obtain a multi-task target recognition model;
and training the multi-task target recognition model by using the second target sample data set to obtain the multi-task target recognition model for simultaneously executing a plurality of attribute recognition tasks.
In a first exemplary embodiment, the calculating, based on the multi-task loss function corresponding to the initial multi-task object recognition model, correlation information between different attribute recognition tasks includes:
for every two attribute identification tasks, determining the correlation between the first attribute identification task and the second attribute identification task as correlation information according to the loss influence degree of the first attribute identification task on the second attribute identification task;
the correlation relationship comprises a positive correlation relationship and a negative correlation relationship, wherein the positive correlation relationship is used for representing that the first attribute identification task and the second attribute identification task are suitable as a group; the negative correlation is used to indicate that the first attribute identification task and the second attribute identification task are not suitable as a group.
In a second exemplary embodiment, the determining, according to the loss influence degree of the first attribute identification task on the second attribute identification task, the correlation between the first attribute identification task and the second attribute identification task includes:
calculating a first loss function value of the first attribute identification task at the current moment based on the first target sample data;
gradient updating is carried out on the shared parameter feature learning layer by utilizing the first loss function value, so that a new shared parameter is obtained;
calculating a second original loss function value of a second attribute identification task based on the original sharing parameter at the current moment and second target sample data;
calculating a second new loss function value for the second attribute identification task based on the new sharing parameter and the second target sample data;
if the second original loss function value is larger than the second new loss function value, the first attribute identification task and the second attribute identification task are in positive correlation; and if the second original loss function value is smaller than or equal to the second new loss function value, the first attribute identification task and the second attribute identification task are in a negative correlation.
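As an illustration of the above procedure (a sketch under assumptions, not the patent's implementation), the following Python snippet assumes a PyTorch model split into a shared trunk `shared` and per-task heads `heads`, a per-task cross-entropy loss and a fixed learning rate; all of these names and choices are hypothetical.

```python
import copy
import torch
import torch.nn as nn

def correlation_sign(shared, heads, x, labels, p, q, lr=0.01):
    """Decide whether attribute task p is positively or negatively correlated with
    task q: take one gradient step of the shared parameters along task p's loss and
    check whether task q's loss on the same batch decreases."""
    criterion = nn.CrossEntropyLoss()  # assumed per-task loss

    # Second original loss function value: task q under the original shared parameters.
    with torch.no_grad():
        loss_q_org = criterion(heads[q](shared(x)), labels[q]).item()

    # Gradient-update a copy of the shared parameters using task p's first loss value.
    shared_new = copy.deepcopy(shared)
    loss_p = criterion(heads[p](shared_new(x)), labels[p])
    grads = torch.autograd.grad(loss_p, list(shared_new.parameters()))
    with torch.no_grad():
        for w, g in zip(shared_new.parameters(), grads):
            w -= lr * g

    # Second new loss function value: task q under the new shared parameters, same batch.
    with torch.no_grad():
        loss_q_new = criterion(heads[q](shared_new(x)), labels[q]).item()

    return "positive" if loss_q_org > loss_q_new else "negative"
```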
In a third exemplary embodiment, the calculating a second original loss function value of a second attribute identification task based on the original sharing parameter at the current time and second target sample data includes:
calling an original loss function value calculation relation, and calculating a second original loss function value of a second attribute identification task; the original loss function value calculation relation is as follows:
loss_q-org = (1/K) · Σ_{i=1}^{K} ℓ( F(x_i; w_q^t, w_s-org^t), y_i^q )
In the formula, loss_q-org is the second original loss function value; q denotes the second attribute identification task; Y is the true value corresponding to the second target sample data X_t at time t; K is the total number of target data contained in the second target sample data X_t; F(X_t) is the attribute prediction function corresponding to the second target sample data X_t, evaluated here on its i-th item x_i; w_q^t is the task attribute parameter of the second attribute identification task at time t; w_s-org^t is the original shared parameter at time t; y_i^q is the actual label of the second attribute identification task for the i-th target data of the second target sample data X_t; and ℓ(·,·) is the per-sample loss term.
In a fourth exemplary embodiment, the calculating a second new loss function value for the second attribute identification task based on the new sharing parameter and the second target sample data includes:
invoking a new loss function value calculation relation, and calculating a second new loss function value of the second attribute identification task; the new loss function value calculation relation is:
loss_q-new = (1/K) · Σ_{i=1}^{K} ℓ( F(x_i; w_q^t, w_s-new^t), y_i^q )
In the formula, loss_q-new is the second new loss function value, and w_s-new^t is the new shared parameter set at time t; the remaining symbols have the same meanings as in the original loss function value calculation relation.
In a fifth exemplary embodiment, the calculating, based on the multi-task loss function corresponding to the initial multi-task object recognition model, correlation information between different attribute recognition tasks includes:
for every two attribute identification tasks, calculating the relevance scores of the first attribute identification task and the second attribute identification task according to a new loss value updated by the gradient of the second attribute identification task along the descending direction of the loss function of the first attribute identification task and an original loss value before gradient updating;
according to the numerical relation between the relevance score and a preset value, determining the relevance relation between the first attribute identification task and the second attribute identification task as the relevance information;
the correlation relationship comprises a positive correlation relationship and a negative correlation relationship, wherein the positive correlation relationship is used for representing that the first attribute identification task and the second attribute identification task are suitable as a group; the negative correlation is used to indicate that the first attribute identification task and the second attribute identification task are not suitable as a group.
In a sixth exemplary embodiment, the calculating, according to the new loss value after the gradient of the second attribute identification task is updated along the decreasing direction of the loss function of the first attribute identification task and the original loss value before the gradient is updated, a relevance score of the first attribute identification task and the second attribute identification task includes:
invoking a correlation score calculation relational expression to calculate correlation scores of the first attribute identification task and the second attribute identification task; the correlation score calculation relational expression is as follows:
R = 1 - (loss_B-new / loss_B-org)
In the formula, R is the relevance score, loss_B-new is the new loss value after the gradient update, and loss_B-org is the original loss value before the gradient update.
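A minimal sketch of this relational expression and of the sign test (the preset value of 0 follows the example given later in the description; the function names are illustrative):

```python
def relevance_score(loss_org: float, loss_new: float) -> float:
    """R = 1 - (loss_new / loss_org): positive when the gradient step taken for the
    first task also lowered the second task's loss, negative otherwise."""
    return 1.0 - (loss_new / loss_org)

# Example: a score above the preset value (0 here) marks the two tasks as
# positively correlated and therefore suitable to share a group.
R = relevance_score(loss_org=0.82, loss_new=0.74)
positively_correlated = R > 0.0
```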
In a seventh exemplary embodiment, the calculating, based on the multi-task loss function corresponding to the initial multi-task object recognition model, correlation information between different attribute recognition tasks includes:
calculating correlation information between every two attribute identification tasks for a plurality of times according to a preset frequency;
when the total training duration is up, for every two attribute identification tasks, taking the average processing result of the multiple correlation information as the correlation information between the corresponding attribute identification tasks.
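The periodic calculation and final averaging might be organized as below; the recording frequency and the dictionary layout are assumptions, not prescribed by the text.

```python
from collections import defaultdict

scores = defaultdict(list)  # (task_p, task_q) -> recorded relevance scores

def maybe_record(iteration, every, task_pairs, compute_score):
    """Every `every` iterations, record one relevance score per ordered task pair."""
    if iteration % every == 0:
        for p, q in task_pairs:
            scores[(p, q)].append(compute_score(p, q))

def averaged_scores():
    """Average of the recorded scores, used as the final correlation information."""
    return {pair: sum(vals) / len(vals) for pair, vals in scores.items()}
```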
In an eighth exemplary embodiment, the initial multi-task target recognition model adopts a hard parameter sharing mode, the shared parameter feature learning layer is located at a bottom layer, the task parameter feature learning layer is located at a top layer, the task parameter feature learning layer includes a plurality of subtask parameter feature learning layers, and each subtask parameter feature learning layer corresponds to an attribute recognition task and is used for learning parameter features of the corresponding attribute recognition task.
In a ninth exemplary embodiment, the shared parameter feature learning layer includes a first feature extraction layer, a second feature extraction layer, a third feature extraction layer, a fourth feature layer, a fifth feature extraction layer, a first fully connected layer, and a second fully connected layer;
the first feature extraction layer, the second feature extraction layer and the fifth feature extraction layer comprise a convolution layer, a batch normalization layer and a maximum pooling layer which are sequentially connected; the third feature extraction layer and the fourth feature layer comprise a convolution layer and a batch normalization layer which are sequentially connected.
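A sketch of such a shared parameter feature learning layer in PyTorch; only the layer ordering follows the description, while the channel widths, kernel sizes, activations and fully connected sizes are assumptions.

```python
import torch.nn as nn

def feature_block(c_in, c_out, pool):
    """Convolution + batch normalization (+ max pooling) block."""
    layers = [nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
              nn.BatchNorm2d(c_out),
              nn.ReLU(inplace=True)]          # activation assumed, not stated in the text
    if pool:
        layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

shared_trunk = nn.Sequential(
    feature_block(3,   32,  pool=True),   # first feature extraction layer
    feature_block(32,  64,  pool=True),   # second feature extraction layer
    feature_block(64,  128, pool=False),  # third feature extraction layer
    feature_block(128, 128, pool=False),  # fourth feature extraction layer
    feature_block(128, 256, pool=True),   # fifth feature extraction layer
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(256, 512), nn.ReLU(inplace=True),  # first fully connected layer
    nn.Linear(512, 256),                         # second fully connected layer
)
```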
In a tenth exemplary embodiment, the multitasking object recognition model includes the input layer, the shared parameter feature learning layer, the grouping parameter feature learning layer, and the output layer that are sequentially connected;
the grouping parameter characteristic learning layers comprise a plurality of subgroup parameter characteristic learning layers, and each subgroup parameter characteristic learning layer corresponds to one subgroup and is used for learning the parameter characteristics of the corresponding subgroup; each subgroup parameter feature learning layer comprises a third fully connected layer and a fourth fully connected layer.
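Continuing the same assumptions, a grouped model could attach one two-layer branch per subgroup to the shared trunk; the per-task output heads and all dimensions are illustrative additions, not specified by the text.

```python
import torch.nn as nn

class GroupedMultiTaskModel(nn.Module):
    """Shared trunk followed by one subgroup parameter feature learning branch per
    task group (third and fourth fully connected layers), with one output head per
    attribute identification task."""
    def __init__(self, trunk, groups, num_classes, feat_dim=256, hidden=128):
        super().__init__()
        self.trunk = trunk
        self.groups = groups  # e.g. [["beard", "moustache"], ["glasses"]]
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(inplace=True),
                          nn.Linear(hidden, hidden))
            for _ in groups)
        self.heads = nn.ModuleDict({
            task: nn.Linear(hidden, num_classes[task])
            for group in groups for task in group})

    def forward(self, x):
        feat = self.trunk(x)
        out = {}
        for branch, group in zip(self.branches, self.groups):
            g = branch(feat)
            for task in group:
                out[task] = self.heads[task](g)
        return out
```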
In an eleventh exemplary embodiment, the grouping each attribute identification task based on each correlation information with the objective of maximizing correlation information among all attribute identification tasks, to obtain a grouping result includes:
For each attribute identification task, acquiring correlation information between the current attribute identification task and other attribute identification tasks, and dividing candidate attribute identification tasks belonging to positive correlation and the current attribute identification tasks into a first subgroup based on each correlation information;
and deleting candidate attribute identification tasks which do not meet the conditions in the first subgroup according to the total number of attribute identification tasks contained in the first subgroup, the correlation degree of each candidate attribute identification task and the current attribute identification task and the correlation information among the candidate attribute identification tasks so as to achieve the maximum correlation information among all attribute identification tasks.
In a twelfth exemplary embodiment, the deleting the candidate attribute identifying task that does not meet the condition in the first subset according to the total number of attribute identifying tasks included in the first subset, the degree of correlation between each candidate attribute identifying task and the current attribute identifying task, and the correlation information between each candidate attribute identifying task includes:
if the total number of attribute identification tasks contained in the first subgroup is greater than 2 attribute identification tasks, the first subgroup reserves initial target attribute identification tasks of which the correlation information corresponding to each candidate target attribute identification task is greater than a preset correlation threshold;
Calculating the correlation information of every two initial target attribute identification tasks in the first subgroup;
for a first initial target attribute identification task and a second initial target attribute identification task belonging to a negative correlation, acquiring a first correlation score and a second correlation score between the first initial target attribute identification task and the second initial target attribute identification task and the current attribute identification task respectively; and if the first correlation score is greater than the second correlation score, deleting the second initial target attribute identification task from the first subgroup.
In a thirteenth exemplary embodiment, the relevance information is a relevance score, and the first subset retains initial target attribute identification tasks with relevance information corresponding to each candidate target attribute identification task greater than a preset relevance threshold, including:
selecting a maximum correlation score from the correlation scores between the current attribute identification task and the candidate attribute identification tasks, and determining a preset correlation threshold based on the maximum correlation score and a preset adjustment factor, wherein the preset adjustment factor is more than 0 and less than 1;
taking a candidate target attribute identification task with the relevance score larger than the preset relevant threshold value as an initial target attribute identification task;
And deleting candidate target attribute identification tasks which are not the initial target attribute identification tasks in the first subgroup.
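In outline, the grouping and pruning described above might look as follows; `scores[(a, b)]` is assumed to hold the (possibly asymmetric) relevance score of task a toward task b, and `alpha` is the preset adjustment factor in (0, 1). This is a sketch of the selection logic only.

```python
def build_group(current, tasks, scores, alpha=0.5):
    """Form the first subgroup for `current`: keep positively correlated candidates,
    drop those below max_score * alpha, then resolve mutually negative candidate
    pairs by keeping the one more correlated with `current`."""
    candidates = [t for t in tasks if t != current and scores[(current, t)] > 0]
    group = [current] + candidates
    if len(group) > 2:
        max_score = max(scores[(current, t)] for t in candidates)
        threshold = alpha * max_score
        group = [current] + [t for t in candidates if scores[(current, t)] > threshold]
        for a in list(group[1:]):
            for b in list(group[1:]):
                if a != b and a in group and b in group and scores[(a, b)] <= 0:
                    loser = a if scores[(current, a)] < scores[(current, b)] else b
                    group.remove(loser)
    return group
```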
In a fourteenth exemplary embodiment, the determining a preset correlation threshold based on the maximum correlation score and a preset adjustment factor includes:
and when a grouping parameter adjusting instruction is received, updating a preset adjusting factor stored locally according to a new adjusting factor in the grouping parameter adjusting instruction.
In a fifteenth exemplary embodiment, the training the multitasking object recognition model using the second object sample data set includes:
according to the preset total training period number, repeatedly selecting target sample data in the second target sample data set, calling a multi-task target recognition loss function relation to calculate a loss function of the multi-task target recognition model until a preset training period number is reached, and ending iteration; the multitasking target recognition loss function relation is:
loss = (1/N) · Σ_{i=1}^{N} Σ_{j=1}^{M} ℓ( F(X_i; w_j, w_s), y_i^j )
In the formula, {w_j} (j = 1, …, M) is the task attribute parameter set of the M attribute identification tasks, w_s is the original shared parameter set, y_i^j is the actual label of the j-th attribute identification task for the i-th item of target data in the second target sample data set X, F(X_i) is the attribute prediction function corresponding to the i-th item of target data, N is the total number of target data contained in the second target sample data set, and ℓ(·,·) is the per-sample loss term.
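Read this way, the joint loss is a per-sample, per-task sum. A hedged PyTorch sketch, assuming the grouped model above (which returns one logit tensor per task) and a cross-entropy per attribute task; the patent does not fix the per-task loss form:

```python
import torch.nn as nn

def multitask_loss(model, batch_x, batch_labels, criterion=None):
    """Sum the per-attribute-task losses over a batch; the default CrossEntropyLoss
    averages over the batch, playing the role of the 1/N factor in the relation."""
    criterion = criterion or nn.CrossEntropyLoss()
    outputs = model(batch_x)  # {task: logits}
    return sum(criterion(outputs[task], batch_labels[task]) for task in outputs)
```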
The second aspect of the present invention provides a target recognition method, including:
training by using the target recognition method according to any one of the previous claims to obtain a multi-task target recognition model;
acquiring data to be processed of a target to be identified; the object to be identified comprises a plurality of attributes;
and inputting the data to be processed into the multi-task target recognition model to obtain a recognition result of at least one attribute in the target to be recognized.
The third aspect of the present invention provides a target recognition method, including:
acquiring a first face image data set and a second face image data set labeled with attribute category labels;
in the process of training an initial multi-task face recognition model by utilizing the first face image data set, calculating correlation information among different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task face recognition model; the initial multi-task face recognition model comprises an input layer, a shared parameter characteristic learning layer, a task parameter characteristic learning layer and an output layer which are sequentially connected;
Based on each correlation information, grouping each attribute identification task with the aim of maximizing the correlation information among all attribute identification tasks to obtain a grouping result; each group of attribute identification tasks at least comprises one attribute identification task;
disposing the grouping result to the initial multi-task face recognition model by replacing the task parameter feature learning layer to obtain a multi-task face recognition model;
and training the multi-task face recognition model by using the second face image data set to obtain the multi-task face recognition model for executing the multi-face attribute recognition tasks simultaneously.
A fourth aspect of the present invention provides an object recognition apparatus comprising:
the image sample acquisition module is used for acquiring a first target sample data set and a second target sample data set labeled with attribute category labels;
the correlation determination module is used for calculating correlation information among different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task target recognition model in the process of training the initial multi-task target recognition model by using the first target sample data set; the initial multi-task target recognition model comprises an input layer, a shared parameter characteristic learning layer, a task parameter characteristic learning layer and an output layer which are sequentially connected;
The grouping module is used for grouping the attribute identification tasks based on the correlation information and aiming at maximizing the correlation information among all the attribute identification tasks to obtain a grouping result; each group of attribute identification tasks at least comprises one attribute identification task;
the target recognition module is used for deploying the grouping result to the initial multi-task target recognition model by replacing the task parameter feature learning layer to obtain a multi-task target recognition model; and training the multi-task target recognition model by using the second target sample data set to obtain the multi-task target recognition model for simultaneously executing a plurality of attribute recognition tasks.
A fifth aspect of the present invention provides an object recognition apparatus comprising:
the model training module is used for training by utilizing the target recognition method according to any one of the previous items to obtain a multi-task target recognition model;
the data acquisition module is used for acquiring the data to be processed of the target to be identified; the object to be identified comprises a plurality of attributes;
the recognition result generation module is used for inputting the data to be processed into the multi-task target recognition model to obtain a recognition result of at least one attribute in the target to be recognized.
A sixth aspect of the present invention provides an object recognition apparatus comprising:
the face sample acquisition module is used for acquiring a first face image data set and a second face image data set labeled with attribute category labels;
the face attribute relevance determining module is used for calculating relevance information among different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task face recognition model in the process of training the initial multi-task face recognition model by utilizing the first face image data set; the initial multi-task face recognition model comprises an input layer, a shared parameter characteristic learning layer, a task parameter characteristic learning layer and an output layer which are sequentially connected;
the face attribute recognition task grouping module is used for grouping all attribute recognition tasks based on all the correlation information and aiming at maximizing the correlation information among all the attribute recognition tasks to obtain a grouping result; each group of attribute identification tasks at least comprises one attribute identification task;
the face recognition module is used for deploying the grouping result to the initial multi-task face recognition model by replacing the task parameter feature learning layer to obtain a multi-task face recognition model; and training the multi-task face recognition model by using the second face image data set to obtain the multi-task face recognition model for executing the multi-face attribute recognition tasks simultaneously.
The invention also provides an electronic device comprising a processor for implementing the steps of the object recognition method according to any one of the preceding claims when executing a computer program stored in a memory.
The invention finally provides a readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the object recognition method according to any of the preceding claims.
The technical scheme provided by the invention has the following advantages. The correlation of different subtasks is quantified using the change of the loss function, so the correlation between different attribute identification tasks can be determined simply and accurately, and each attribute identification task is grouped with the attribute identification tasks it is similar to so that they share local parameters. The model parameters of several similar tasks can thus be learned jointly and common information can be mined, which solves the problem that the training precision of a small-sample task cannot be improved due to insufficient samples and effectively improves target identification precision. The risk of precision deviation caused by unreasonable, subjectively chosen groupings of the multi-task target recognition model is effectively avoided; reasonable grouping helps the multi-task target recognition model learn more general and global features and prevents it from over-fitting by paying excessive attention to certain local features, so the precision and efficiency of target recognition can be effectively improved.
In addition, the invention also provides a corresponding implementation device, electronic equipment and a readable storage medium for the target identification method, so that the method is more practical, and the device, the electronic equipment and the readable storage medium have corresponding advantages.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.
Drawings
For a clearer description of the present invention or of the technical solutions related thereto, the following brief description will be given of the drawings used in the description of the embodiments or of the related art, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained from these drawings without the inventive effort of a person skilled in the art.
FIG. 1 is a schematic flow chart of a target recognition method according to the present invention;
FIG. 2 is a schematic flow chart of another object recognition method according to the present invention;
FIG. 3 is a structural framework diagram of an initial multi-tasking object recognition model provided by the present invention under an exemplary embodiment;
FIG. 4 is a structural framework diagram of a multi-tasking object recognition model provided by the present invention under an exemplary embodiment;
FIG. 5 is a schematic flow chart of another object recognition method according to the present invention;
FIG. 6 is a flow chart of a final object recognition method according to the present invention;
FIG. 7 is a schematic diagram of an exemplary application scenario provided by the present invention;
FIG. 8 is a block diagram of an embodiment of a target recognition device according to the present invention;
FIG. 9 is a block diagram of another embodiment of a target recognition device according to the present invention;
FIG. 10 is a block diagram of another embodiment of a target recognition device according to the present invention;
fig. 11 is a block diagram of an embodiment of an electronic device according to the present invention.
Detailed Description
In order to make the technical scheme of the present invention better understood by those skilled in the art, the present invention will be further described in detail with reference to the accompanying drawings and the detailed description. Wherein the terms "first," "second," "third," "fourth," and the like in the description and in the claims and in the above-described figures, are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "comprise" and "have," as well as any variations of the two, are intended to cover a non-exclusive inclusion. The term "exemplary" means "serving as an example, embodiment, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
With the rapid development of artificial intelligence technology, machine learning models have developed from single-task learning to multi-task joint learning, and multi-task learning is widely applied to downstream tasks of various computer vision technologies because of its higher precision and lower resource occupation. Single-task learning learns one task at a time: a complex learning problem is first decomposed into theoretically independent sub-problems, each sub-problem is learned separately, and finally a mathematical model of the complex problem is built by combining the sub-problem learning results. Multi-task learning is joint learning: one model trains multiple tasks and outputs multiple prediction results, the tasks are learned in parallel, and the similarity and correlation among tasks influence one another.
In multi-task learning, a model usually has multiple loss functions (losses). While learning multiple tasks, the network model parameters are shared uniformly, and each task contributes equally to the overall loss function. Multiple tasks can share one model, occupying fewer computing resources than single-task learning. Meanwhile, multi-task learning can take the associations and constraints among tasks into account and can mine the hidden common data characteristics among different tasks, which helps improve the prediction accuracy of the model; in particular, for subtasks with few samples, the parameters shared with other tasks help improve their prediction accuracy.
The prediction accuracy of a multi-task model is affected by the correlation between tasks. Taking face attribute recognition as an example, recognizing a single face attribute is a solved problem, but when one model is required to recognize multiple face attributes at the same time and the attributes influence one another, the recognition accuracy of the model is difficult to guarantee. In target recognition based on multi-task learning, multiple subtasks are learned together; different subtasks differ in data distribution and importance, and the generalization ability and learning speed of the subtask models also differ. The loss functions of the subtasks are combined into a single aggregate loss function, and the model is trained to minimize this aggregate loss. One related technology forces the model to group the subtasks according to certain rules derived from prior knowledge, or models the relationships among the subtasks so as to group tasks with similar, related characteristics, so that the subtask loss functions can be combined accurately, the contributions of different subtasks can be measured, and weights can be added to the subtask losses. However, forcing the model to group according to rules derived from prior knowledge relies on manual experience, which makes the target recognition random and not highly accurate. Another related technology performs optimization analysis on task groups to address the correlation among different tasks and the clustering of multiple tasks into groups. This method performs an evolutionary search over the similarity of multiple tasks with an MFEA (Multi-Factor Evolutionary Algorithm), studies the similarity measure between MFEA tasks from three different angles (the distance between optimal solutions, fitness rank correlation, and fitness value range analysis), and computes the correlation through spatial sampling of the objective function. However, this method has not verified the generality of the algorithm in multi-task learning, is computationally complex, and is unsuitable when the number of tasks is very large.
In view of the above, during training of the initial multi-task target recognition model, the invention calculates the correlation information among different attribute recognition tasks based on the multi-task loss function of the model; whether attribute identification tasks are suitable to be classified into the same group is judged from the change trend of the loss function of each attribute identification task during model training, a calculation method that is simple, widely applicable and able to determine the correlation among attribute identification tasks accurately. The attribute identification tasks are then grouped based on the correlation information, with the aim of maximizing the correlation information among all attribute identification tasks. The grouping result is deployed by replacing the task parameter feature learning layer of the initial multi-task target recognition model to obtain the multi-task target recognition model; the multi-task target recognition model is trained with the target sample data set to obtain a multi-task target recognition model that executes a plurality of attribute recognition tasks simultaneously, which can effectively improve the accuracy and efficiency of target recognition.
Having described aspects of the invention, various non-limiting embodiments of the invention are described in detail below. Numerous specific details are set forth in the following description in order to provide a better understanding of the invention. It will be understood by those skilled in the art that the present invention may be practiced without these specific details. In other instances, well known methods, procedures, components, and circuits have not been described in detail so as not to obscure the present invention.
Referring to fig. 1 first, fig. 1 is a flow chart of a target recognition method provided in this embodiment, where the method may include the following steps:
s101: a first target sample data set and a second target sample data set of the tagged attribute category tag are obtained.
In this embodiment, the first target sample data set and the second target sample data set serve as training sample data sets of the model and may be the same data set or different data sets; for example, they may be obtained by dividing all target sample data according to a certain ratio, such as 6:4, into a training sample set and a test sample set. The first target sample data set is used for training the initial multi-task target recognition model, and the second target sample data set is used for training the multi-task target recognition model. Both data sets comprise a large amount of target data covering a variety of rich scenes. The target can be any entity that needs to be recognized, such as a face, a license plate or an automobile; the attribute categories refer to the attributes of the target to be identified; and the target data may be target image data, target audio data, target text data or target video data. The target to be identified has a plurality of attributes, each target sample in the first target sample data set and in the second target sample data set is labeled with its attributes in advance, and the labeled information is the attribute category label.
Taking the target to be identified as a human face as an example, the first target sample data set and the second target sample data set comprise a certain number of face image data sets, and the images may cover a variety of rich scenes, including different postures, facial expressions, photographing angles, illumination changes, occlusions, age changes, resolutions and the like; the data sets also cover people of different genders, ages, heights, builds, skin colors and ethnicities. In general, the more training data the first target sample data set and the second target sample data set contain, the higher the accuracy of the model trained with them. The face image data set is labeled with face attribute information, where the attributes include but are not limited to: a blumea, a small moustache, a goatee, short hair, a bald top, a receding hairline, thick eyebrows, a tie, lipstick, eye makeup, a necklace, earrings, long hair, curly hair, straight bangs (Ji Liuhai), a round face, a goose-egg (oval) face, a double chin, a sharp chin, high cheekbones, temples (sideburns), red cheeks, white skin, wrinkles, double eyelids, big eyes, beautiful pupils, black eye rings, eye bags, wearing glasses, a high nose, a sharp nose, a tremella, thick lips and a small cherry mouth. The process of labeling the face image data set with category labels according to the attributes may be as follows: if the first target sample data set or the second target sample data set contains N face image samples and M (M = 35) face attribute categories, a simple class label assignment runs from 1 to M; other assignments, such as 0 to M-1, may also be used without affecting the implementation of the invention. In order to train a high-precision multi-task face attribute recognition model, the first target sample data set or the second target sample data set contains at least 100,000 face images and 10,000 different identities. The face attributes are marked with sequence numbers from 1 to M. The first target sample data set or the second target sample data set has N training face images and M face attribute labels, and is represented as D = {X, Y}, where X represents the face images of the data set, the i-th face image being denoted x_i, and Y represents the true values of the face attributes corresponding to the faces, y_i^j denoting the j-th attribute truth label of the i-th image. Y can be expressed in the following form:
y_i = [y_i^1, y_i^2, …, y_i^M], i = 1, …, N;  Y = {y_1, y_2, …, y_N}
In the formula, y_i^1 to y_i^M represent the 1 to M attribute labels of the i-th image, and y_1 to y_N represent the face attribute labels of the 1 to N face images.
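A small sketch of the D = {X, Y} representation with N images and M = 35 attribute labels numbered 1 to M; the file names and the zero initialization are placeholders.

```python
import numpy as np

N, M = 100_000, 35                                 # images and face-attribute tasks
X = [f"face_{i:06d}.jpg" for i in range(N)]        # placeholder image identifiers
Y = np.zeros((N, M), dtype=np.int64)               # Y[i, j-1] holds y_i^j

def label_of(i, j):
    """Truth label y_i^j of the j-th attribute (1-based) for the i-th image."""
    return Y[i, j - 1]
```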
S102: in the process of training an initial multi-task target recognition model by using the first target sample data set, calculating correlation information among different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task target recognition model.
In this step, the initial multi-task target recognition model is an initial target recognition model, which is one or more network model description files, the training algorithm loads the first target sample data set and the network model description files, constructs an input initial multi-task target recognition model according to the network model description files, initializes the weight parameters of the constructed initial multi-task target recognition model, and then calculates the updated weight in the loss function in a subsequent iteration. The initial multi-task target recognition model trains all tasks together, network parameters of the bottom layer of all tasks are uniformly shared, each subtask on the top layer corresponds to a specific parameter characteristic learning layer, namely the initial multi-task target recognition model can comprise an input layer, a shared parameter characteristic learning layer, a task parameter characteristic learning layer and an output layer which are connected in sequence. The input layer is used for inputting target sample data in the first target sample data set, the shared parameter feature learning layer can be used for learning to obtain shared network parameters, the task parameter feature learning layer is used for learning specific parameters corresponding to each attribute identification task, and the output layer can output identification results.
During training of the initial multi-task target recognition model or the multi-task target recognition model, the first target sample data set and the second target sample data set can each be divided into a training set and a test set. A single target sample in the training set or the test set does not need to contain all attributes, but the target samples in the training set and the test set together need to cover the labels of all target attributes, and the minimum number of labels of each target attribute in the test set should be no less than 1/2 of that in the training set; for example, 60% of the target sample data can be used for the training set and 40% for the test set. The initial multi-task target recognition model or the multi-task target recognition model can be trained with mini-batch stochastic gradient descent; before the gradient update iterations, the gradient descent algorithm is initialized and the epoch (training period), batch size, weight update period t and number of iterations are set. For example, the total number of training samples may be 60,000, and the model is trained for at least 100 training periods. One training period (epoch) means that all training samples in the training set are used once, without repetition, to update the model parameters of the neural network, one batch of data being taken at a time to update the parameters, which completes one pass of training. In the gradient update iterations, 500 training samples are used per update; these 500 samples form one batch, i.e. batch_size samples. The number of iterations refers to the number of parameter updates using batch_size samples, and completing one epoch takes iterations = 60000/500 = 120. The weight update period means that, when training the initial multi-task target recognition model or the multi-task target recognition model, the weights are updated once every t iterations.
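Plugging the example numbers above into code makes the bookkeeping explicit; the value of the weight update period t is an assumption.

```python
num_train = 60_000        # total number of training samples
batch_size = 500
epochs = 100              # at least 100 training periods
update_period_t = 4       # assumed: weights are updated once every t iterations

iterations_per_epoch = num_train // batch_size   # 60000 / 500 = 120
total_iterations = epochs * iterations_per_epoch
total_weight_updates = total_iterations // update_period_t
```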
In this step, in the training process of the initial multi-task target recognition model, a loss function needs to be defined, the initial multi-task target recognition model includes a plurality of attribute recognition tasks, each attribute recognition task can be used for recognizing one attribute or a plurality of attributes of each target sample data in input data, namely, first target sample data set, the change of the loss function is used for quantifying the correlation information of different attribute recognition tasks, the correlation information is used for representing the similarity degree between different attribute recognition tasks, so that each attribute recognition task and the attribute recognition task with similarity are divided into a group of shared local parameters, and the problem existing in the algorithm that the multi-task learning target recognition model uses priori knowledge to artificially restrict the grouping in the related technology is solved.
S103: based on each correlation information, the correlation information among all attribute identification tasks is maximized as a target, and each attribute identification task is grouped to obtain a grouping result.
As shown in fig. 2, after the correlation information between every two attribute identification tasks is obtained in the previous step, after traversing to obtain whether the relationships between every two sub-tasks have positive correlation or negative correlation, all attribute identification tasks need to be grouped, so as to achieve the optimal solution of the overall correlation. The grouping mode is not limited to grouping two attribute identification tasks into one group, and a plurality of attribute identification tasks can be grouped into one group, or one attribute identification task can be singly grouped, that is, each group of attribute identification tasks at least comprises one attribute identification task.
S104: and deploying the grouping result to the initial multi-task target recognition model by replacing the task parameter feature learning layer to obtain the multi-task target recognition model.
After the grouping result is obtained in the previous step, the grouping result comprises a plurality of subgroups, each subgroup corresponds to a group of attribute identification tasks with strong correlation, each subgroup at least comprises one attribute identification task, the attribute identification task is used as a structure to replace a task parameter characteristic learning layer in an initial multi-task target identification model, the initial multi-task target identification model after replacement is defined as a multi-task target identification model, namely the multi-task target identification model refers to the initial multi-task target identification model with the grouping result deployed.
S105: training the multi-task target recognition model by using the second target sample data set to obtain the multi-task target recognition model for simultaneously executing a plurality of attribute recognition tasks.
After the multi-task target recognition model is obtained through the deployment grouping result in the previous step, the multi-task target recognition model is trained by utilizing the second target sample data set, the multi-task target recognition model locally shares parameters, the optimal estimation of a single attribute recognition task is realized through the fine tuning feature, and when the model iteration ending condition of the multi-task target recognition model is met, the multi-task target recognition model is obtained, and the multi-task target recognition model can recognize a plurality of attributes of data of a target to be recognized, so that the recognition result of each attribute of the target to be recognized is obtained.
In the technical scheme provided by this embodiment, the correlation of different subtasks is quantified using the change of the loss function, so the correlation among different attribute identification tasks can be determined simply and accurately, and each attribute identification task is grouped with the attribute identification tasks it is similar to so that they share local parameters. The model parameters of several similar tasks can thus be learned jointly and common information can be mined, which solves the problem that the training precision of a small-sample task cannot be improved due to insufficient samples and effectively improves target identification precision. The risk of precision deviation caused by unreasonable, subjectively chosen groupings of the multi-task target recognition model is effectively avoided; reasonable grouping helps the multi-task target recognition model learn more general and global features and prevents it from over-fitting by paying excessive attention to certain local features, so the precision and efficiency of target recognition can be effectively improved.
In the above embodiment, the method for calculating the correlation information between the different attribute identification tasks is not limited, and a practical, feasible, simple and easy-to-implement calculation method for calculating the correlation information between the different attribute identification tasks may include:
And for every two attribute identification tasks, determining the correlation between the first attribute identification task and the second attribute identification task as correlation information according to the loss influence degree of the first attribute identification task on the second attribute identification task.
In this embodiment, the correlation includes a positive correlation and a negative correlation, where the first attribute identification task and the second attribute identification task refer to any one of the attribute identification tasks, and in the training process for all the attribute identification tasks, the sharing parameters of the attribute identification tasks are trained together, and the loss influence degree of the first attribute identification task on the second attribute identification task can be used to determine whether the two attribute identification tasks are positive correlation or negative correlation, and the positive correlation is used to indicate that the first attribute identification task and the second attribute identification task are suitable as a group; the negative correlation is used to indicate that the first attribute identification task and the second attribute identification task are not suitable as a group. The degree of influence of the first attribute identification task on the loss of the second attribute identification task can be achieved by judging whether the gradient of the first attribute identification task moves along the descending direction of the loss function of the second attribute identification task, if the loss of the first attribute identification task and the loss of the second attribute identification task are both smaller during gradient updating, the first attribute identification task and the second attribute identification task are suitable for being put together to share the parameters, otherwise, the first attribute identification task and the second attribute identification task are not suitable for being put together to share the parameters. Embodiments of the process may include:
Calculating a first loss function value of the first attribute identification task at the current moment based on the first target sample data; gradient updating is carried out on the shared parameter characteristic learning layer by utilizing the first loss function value, so that new shared parameters are obtained; calculating a second original loss function value of a second attribute identification task based on the original sharing parameter at the current moment and second target sample data; calculating a second new loss function value for a second attribute identification task based on the new sharing parameter and the second target sample data; if the second original loss function value is larger than the second new loss function value, the first attribute identification task and the second attribute identification task are in positive correlation; and if the second original loss function value is smaller than or equal to the second new loss function value, the first attribute identification task and the second attribute identification task are in a negative correlation relationship.
For example, using training data of a batch (batch), calculating a first loss function value of a first attribute identification task at a time t, gradient updating a shared parameter by using the first loss function value of the first attribute identification task to obtain a new shared parameter, calculating a second original loss function value of a second attribute identification task by using the original shared parameter and the training data, and calculating a second new loss function value of the second attribute identification task by using the new shared parameter and the same training data, wherein if the second new loss function value is smaller than the second original loss function value, namely, the gradient of the second attribute identification task is along the loss falling direction of the first attribute identification task, the influence of the gradient update parameter generated based on the first attribute identification task on the second attribute identification task is positive, and the two tasks are considered to be suitable as a group; whereas the task impact on the second attribute recognition is negative, the two tasks are considered mutually exclusive and unsuitable as a group. In this way, the relationship between the attribute recognition tasks is calculated to be positive correlation or negative correlation in the initial multi-task target recognition model training process.
Furthermore, in order to calculate the correlation information between every two attribute identification tasks more conveniently and quickly, an original loss function value calculation relation and a new loss function value calculation relation can be stored in advance; the original loss function value calculation relation is then called to calculate the second original loss function value of the second attribute identification task, and the new loss function value calculation relation is called to calculate the second new loss function value of the second attribute identification task. The original loss function value calculation relation can be expressed as:
loss_{q-org} = \sum_{i=1}^{K} loss(F(X_t; \theta_{s-org}^{t}, \theta_q^{t}), y_i^{q})

where loss_{q-org} is the second original loss function value, q denotes the second attribute identification task, Y is the truth value corresponding to the second target sample data X_t at time t, K is the total number of target data contained in the second target sample data X_t, F(X_t) is the attribute prediction function corresponding to the second target sample data X_t, \theta_q^{t} is the task attribute parameter of the second attribute identification task at time t, \theta_{s-org}^{t} is the original shared parameter at time t, and y_i^{q} is the actual label of the second attribute identification task of the i-th target data in the second target sample data X_t.
Wherein the new loss function value calculation relation can be expressed as:
loss_{q-new} = \sum_{i=1}^{K} loss(F(X_t; \theta_{s-new}^{t}, \theta_q^{t}), y_i^{q})

where loss_{q-new} is the second new loss function value and \theta_{s-new}^{t} is the new shared parameter set at time t; the remaining symbols have the same meaning as in the original loss function value calculation relation.
Furthermore, the present invention also provides another calculation method of the correlation information, and the present embodiment may quantitatively represent the correlation information, that is, represent the degree of correlation between attribute identification tasks by using a correlation score, which may include the following contents:
for every two attribute identification tasks, calculating the relevance scores of the first attribute identification task and the second attribute identification task according to the new loss value updated by the gradient of the second attribute identification task along the descending direction of the loss function of the first attribute identification task and the original loss value before gradient updating;
according to the numerical relation between the correlation score and a preset value, determining the correlation relation between the first attribute identification task and the second attribute identification task to be used as correlation information;
in this embodiment, the correlation relationship includes a positive correlation relationship and a negative correlation relationship, and the positive correlation or the negative correlation is determined by comparing the correlation score with a preset value, where the preset value can be flexibly selected according to an actual application scenario, and the positive correlation relationship is also used to indicate that the first attribute identification task and the second attribute identification task are suitable as a group; the negative correlation is used to indicate that the first attribute identification task and the second attribute identification task are not suitable as a group. That is, in this embodiment, the correlation between the first attribute identification task and the second attribute identification task is quantitatively defined according to the second new loss function value after the gradient of the second attribute identification task is updated along the loss falling direction of the first attribute identification task and the second original loss function value before the gradient is updated, if the preset value is 0, if the correlation score of the first attribute identification task and the second attribute identification task is greater than 0, the two are positively correlated, and if the correlation score of the first attribute identification task and the second attribute identification task is less than 0, the two are negatively correlated.
In order to calculate the correlation information between every two attribute identification tasks more conveniently and quickly, a correlation score calculation relation can be stored in advance, and then the correlation scores of the first attribute identification task and the second attribute identification task are calculated by calling the correlation score calculation relation; the relevance score calculation relation can be expressed as:
R = 1 - (loss_{B-new} / loss_{B-org})

where R is the relevance score, loss_{B-new} is the updated new loss value, i.e., the second new loss function value, and loss_{B-org} is the original loss value before the gradient update, i.e., the second original loss function value.
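Continuing the earlier sketch, the two loss values can be turned into the relevance score and compared against a preset value (0 in the example above); the helper names are assumptions:

```python
def relevance_score(loss_q_org: float, loss_q_new: float) -> float:
    # R = 1 - (loss_B-new / loss_B-org)
    return 1.0 - loss_q_new / loss_q_org

def positively_correlated(score: float, preset_value: float = 0.0) -> bool:
    # A score above the preset value means the two tasks suit one group.
    return score > preset_value
```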
In order to make the correlation information calculation manner of the different attribute recognition tasks of the present invention more clear to those skilled in the art, the present invention provides an exemplary embodiment taking a face as an example, and the method may include the following:
it can be understood that the prediction of each attribute recognition task corresponds to a classification task, taking a face as a target to be recognized as an example, i represents an ith face image, and j represents a jth face attribute tag. The result is output through the neural network, and the function expression form can be as follows:
F_j(X_i) = p_{ij}, with p_{ij}^{k} = \frac{\exp(z_{ij}^{k})}{\sum_{k'=1}^{2} \exp(z_{ij}^{k'})}

where, to classify the prediction result, each attribute identification task is a two-class classification task, p_{ij} is a 1×2 vector, k is the index over its two entries, and z_{ij}^{k} is the k-th output of the network layer for the j-th face attribute of the i-th target sample data of the initial multi-task recognition model. Based on this, the loss function of a single face attribute recognition task can be defined as

loss(F(X_i), y_{ij}) = -\sum_{k=1}^{2} 1[y_{ij} = k] \log p_{ij}^{k}

where F(X_i) is the function output result and y_{ij} is the true sample label, i.e., the true label reflecting the j-th attribute of the i-th face image; F is the attribute prediction function, and F(X_i) represents the predicted value output by the model for the i-th face. The objective function corresponding to the j-th face attribute, namely the objective function of the single face attribute recognition task, can be expressed as

(\theta_s^{*}, \theta_j^{*}) = \arg\min_{\theta_s, \theta_j} \sum_{i} loss(F(X_i; \theta_s, \theta_j), y_{ij})

where loss(\cdot, \cdot) is the loss function between the model predicted value and the true value, \theta_s denotes the shared parameter set, \theta_j denotes the j-th set of attribute-specific parameters, and argmin(\cdot) denotes the shared parameter set \theta_s and specific parameter set \theta_j that minimize the loss function. If each face attribute has its own independent objective function, multi-task learning is performed in parallel: the M face attributes correspond to M classification tasks, each with an objective function of the above form for j = 1, …, M.
For a training data set X of one batch, assume the corresponding truth value set is Y and the batch contains K training data. Using the training data X_t of one batch at time t, the total loss function of the j-th face attribute identification task can be expressed as

L_j^{t} = \sum_{i=1}^{K} loss(F(X_i^{t}; \theta_s^{t}, \theta_j^{t}), y_{ij})

After the shared parameters \theta_s^{t} are gradient-updated with the loss function of the j-th attribute, new shared parameters are obtained, denoted \theta_{s-new}^{t}. With the new shared parameters \theta_{s-new}^{t} and the same training data X_t, the new loss function of another face attribute recognition task q after the shared parameter update is calculated as

loss_{q-new} = \sum_{i=1}^{K} loss(F(X_i^{t}; \theta_{s-new}^{t}, \theta_q^{t}), y_{iq})

The relevance score of tasks j and q is determined by the change of the loss of face attribute q before and after the shared parameters change. The original shared parameter at time t is \theta_{s-org}^{t}, and the original loss function of face attribute q at time t can be expressed as

loss_{q-org} = \sum_{i=1}^{K} loss(F(X_i^{t}; \theta_{s-org}^{t}, \theta_q^{t}), y_{iq})

If the relevance score R_{jq} = 1 - (loss_{q-new} / loss_{q-org}) is greater than 0, the gradient update of the shared parameters by face attribute recognition task j reduces the loss function of face attribute recognition task q, the influence on task q is positive, and the two tasks are suitable to be placed in one group. Conversely, if the relevance score R_{jq} is smaller than 0, the gradient update of face attribute recognition task j increases the loss function of face attribute recognition task q, the influence on task q is negative, and the two tasks should be trained separately.
Furthermore, in order to improve the accuracy of the subsequent grouping result, the correlation information between every two attribute identification tasks can be calculated multiple times at a preset frequency during the training of the initial multi-task target recognition model. When the total training duration is reached, each pair of attribute identification tasks is processed in turn for the subsequent grouping, and the average of the multiple correlation information values collected for that pair over the whole training process is taken as its final correlation information. For example, the total training duration can be divided into ten equal parts, the relevance scores of all task pairs are calculated once per part, and the average of these relevance scores is taken as the final task relevance score, as shown in the sketch below.
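A minimal sketch of this periodic collection and averaging of pairwise scores, with purely illustrative names (e.g., one score dictionary per measurement point, such as ten points over the training duration):

```python
from collections import defaultdict

def averaged_relevance_scores(scores_per_round):
    """Average the relevance score of every task pair over all measurement rounds.

    scores_per_round: list of dicts mapping (j, q) -> relevance score,
                      one dict per measurement point.
    Returns a dict mapping (j, q) -> final averaged relevance score.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for round_scores in scores_per_round:
        for pair, score in round_scores.items():
            sums[pair] += score
            counts[pair] += 1
    return {pair: sums[pair] / counts[pair] for pair in sums}
```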
The above embodiment does not limit the model structure of the initial multi-task target recognition model. Based on the above embodiment, the present invention also provides a simple, well-performing network structure for the initial multi-task target recognition model, as shown in fig. 3. The initial multi-task target recognition model adopts hard parameter sharing: the shared parameter feature learning layer is located at the bottom, the task parameter feature learning layer is located at the top, and the task parameter feature learning layer includes multiple subtask parameter feature learning layers, where each subtask parameter feature learning layer corresponds to one attribute recognition task; that is, each subtask at the top layer has its own specific task parameter feature learning layer for learning the parameter features of the corresponding attribute recognition task. The shared parameter feature learning layer comprises a first feature extraction layer, a second feature extraction layer, a third feature extraction layer, a fourth feature extraction layer, a fifth feature extraction layer, a first fully connected layer and a second fully connected layer; the first feature extraction layer, the second feature extraction layer and the fifth feature extraction layer each comprise a convolution layer, a batch normalization layer and a max pooling layer connected in sequence; the third feature extraction layer and the fourth feature extraction layer each comprise a convolution layer and a batch normalization layer connected in sequence.
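A minimal PyTorch sketch of such a shared parameter feature learning layer; the channel widths, kernel sizes, activations and the pooling-to-vector step are assumptions not specified in the patent:

```python
import torch.nn as nn

def conv_bn_pool(in_ch, out_ch):
    # Feature extraction layers 1, 2 and 5: convolution -> batch norm -> max pooling.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

def conv_bn(in_ch, out_ch):
    # Feature extraction layers 3 and 4: convolution -> batch norm.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class SharedBackbone(nn.Module):
    """Five feature extraction layers followed by two fully connected layers."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.features = nn.Sequential(
            conv_bn_pool(3, 32),     # first feature extraction layer
            conv_bn_pool(32, 64),    # second feature extraction layer
            conv_bn(64, 128),        # third feature extraction layer
            conv_bn(128, 128),       # fourth feature extraction layer
            conv_bn_pool(128, 256),  # fifth feature extraction layer
            nn.AdaptiveAvgPool2d(1), # assumed pooling to a fixed-size vector
            nn.Flatten(),
        )
        self.fc = nn.Sequential(
            nn.Linear(256, feat_dim),       # first fully connected layer
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim, feat_dim),  # second fully connected layer
        )

    def forward(self, x):
        return self.fc(self.features(x))
```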
The above embodiment does not limit the model structure of the multi-task object recognition model, and based on the above embodiment, the present invention further provides a simple and good-performance network structure of the multi-task object recognition model, as shown in fig. 4, where the multi-task object recognition model includes an input layer, a shared parameter feature learning layer, a grouping parameter feature learning layer, and an output layer that are sequentially connected; the grouping parameter characteristic learning layers comprise a plurality of subgroup parameter characteristic learning layers, and each subgroup parameter characteristic learning layer corresponds to one subgroup and is used for learning the parameter characteristics of the corresponding subgroup; each subgroup parameter feature learning layer comprises a third fully connected layer and a fourth fully connected layer.
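A sketch of how the grouped model could be assembled on top of the shared backbone above, assuming PyTorch. The per-subgroup third and fourth fully connected layers follow the description; the hidden dimension, the per-attribute 1×2 classifiers realizing the output layer, and all names are assumptions:

```python
import torch.nn as nn

class SubgroupHead(nn.Module):
    """Grouping parameter feature learning layer for one subgroup of attribute tasks."""
    def __init__(self, feat_dim, task_names, hidden_dim=128):
        super().__init__()
        # Third and fourth fully connected layers, shared within the subgroup.
        self.group_fc = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(inplace=True),
        )
        # One two-way classifier per attribute identification task in the subgroup.
        self.classifiers = nn.ModuleDict({t: nn.Linear(hidden_dim, 2) for t in task_names})

    def forward(self, shared_feat):
        g = self.group_fc(shared_feat)
        return {t: clf(g) for t, clf in self.classifiers.items()}

class GroupedMultiTaskModel(nn.Module):
    """Shared backbone plus one subgroup head per group of attribute tasks."""
    def __init__(self, backbone, groups, feat_dim=256):
        super().__init__()
        self.backbone = backbone
        self.heads = nn.ModuleList([SubgroupHead(feat_dim, g) for g in groups])

    def forward(self, x):
        feat = self.backbone(x)
        out = {}
        for head in self.heads:
            out.update(head(feat))
        return out
```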
In the above embodiment, how to obtain the grouping result is not limited, and a practical, feasible, simple and easy-to-implement grouping manner is provided in this embodiment, which may include:
for each attribute identification task, acquiring correlation information between the current attribute identification task and other attribute identification tasks, and dividing candidate attribute identification tasks belonging to positive correlation and the current attribute identification tasks into a first subgroup based on each correlation information;
and deleting candidate attribute identification tasks which do not meet the conditions in the first subgroup according to the total number of attribute identification tasks contained in the first subgroup, the correlation degree of each candidate attribute identification task and the current attribute identification task and the correlation information among the candidate attribute identification tasks so as to achieve the maximum correlation information among all the attribute identification tasks.
Taking the correlation information as a correlation score, taking a preset value as 0 as an example, randomly selecting an attribute identification task to define as a first attribute identification task, and obtaining the correlation scores of the first attribute identification task and all other attribute identification tasks. If the relevance scores of the first attribute identification task and all other face attributes are smaller than 0, the first attribute identification task and all other face attributes are independently divided into a group, and the top layer of the model independently corresponds to a specific task parameter feature learning layer. If the correlation score between 1 attribute identification task and the first attribute identification task is greater than 0, the attribute identification task and the first attribute identification task are divided into a group, parameters are locally shared in each group, and the correlation of different attributes in the group is utilized to enable the attribute with commonality to learn advanced features in the group, so that the precision of each attribute task is further refined. If the correlation score between 2 or more attribute identification tasks and the first attribute identification task is greater than 0, deleting the candidate attribute identification tasks which do not meet the conditions in the first subgroup according to the correlation degree of each candidate attribute identification task and the current attribute identification task and the correlation information among the candidate attribute identification tasks. The process of deleting candidate attribute identification tasks in the first subset that do not satisfy the condition may include: if the total number of the attribute identification tasks contained in the first subgroup is greater than 2 attribute identification tasks, the first subgroup reserves initial target attribute identification tasks of which the correlation information corresponding to each candidate target attribute identification task is greater than a preset correlation threshold; calculating the correlation information of every two initial target attribute identification tasks in the first subgroup; for a first initial target attribute identification task and a second initial target attribute identification task belonging to a negative correlation, acquiring a first correlation score and a second correlation score between the first initial target attribute identification task and the second initial target attribute identification task and the current attribute identification task respectively; if the first relevance score is greater than the second relevance score, the second initial target attribute identification task is deleted from the first subset.
In this embodiment, the preset correlation threshold may be flexibly selected according to actual situations, and for example, if the correlation information is represented by a correlation score, the process of reserving the initial target attribute identification task with the correlation information corresponding to each candidate target attribute identification task greater than the preset correlation threshold in the first subset may include:
selecting a maximum correlation score from the correlation scores between the current attribute identification task and each candidate attribute identification task, and determining a preset correlation threshold based on the maximum correlation score and a preset adjustment factor, wherein the preset adjustment factor is more than 0 and less than 1; taking a candidate target attribute identification task with the correlation score with the current attribute identification task being larger than a preset correlation threshold value as an initial target attribute identification task; candidate target attribute identification tasks in the first subset that are not identified for the initial target attribute are deleted.
For example, if there are n attribute identification tasks, take the first attribute identification task as an example: the relevance scores between the first attribute identification task and all other tasks are calculated and sorted. If more than one attribute identification task has a relevance score greater than 0 with the first attribute identification task, the maximum value is denoted Rmax; with an adjustment factor of 0.5, 0.5·Rmax is taken as the preset correlation threshold, and all attribute identification tasks whose relevance score with the first attribute identification task is greater than 0.5·Rmax are divided into one set. The pairwise correlations within this set are then calculated: if the correlation between attribute identification task D and attribute identification task E in the set is smaller than 0, the two are mutually exclusive and not suitable to be placed in one group, so their relevance scores with the first attribute identification task are compared, the task with the higher correlation is kept and the task with the weaker correlation is excluded, until all subtasks in the set have been compared, completing one round of grouping. An attribute identification task H is then selected from the remaining tasks outside the set, and the relevance scores between H and all other remaining attribute identification tasks are calculated again following the above procedure to complete the next round of grouping. A sketch of this procedure is given below.
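A minimal sketch of this greedy grouping procedure, assuming the averaged pairwise relevance scores are available in a dictionary keyed by task pairs; the helper names and the symmetric-lookup convention are assumptions:

```python
def rel(R, a, b):
    # Relevance scores are computed pairwise; either ordering may be stored.
    return R.get((a, b), R.get((b, a), 0.0))

def group_tasks(tasks, R, adjust_factor=0.5):
    """Greedy grouping that keeps only mutually positively correlated tasks together."""
    remaining, groups = list(tasks), []
    while remaining:
        anchor = remaining[0]
        scores = {t: rel(R, anchor, t) for t in remaining if t != anchor}
        positives = [t for t, s in scores.items() if s > 0]
        if not positives:                     # negatively correlated with everything
            groups.append([anchor])
            remaining.remove(anchor)
            continue
        threshold = adjust_factor * max(scores[t] for t in positives)
        group = [anchor] + [t for t in positives if scores[t] > threshold]
        # Remove mutually exclusive candidates: keep the one closer to the anchor.
        changed = True
        while changed:
            changed = False
            for d in list(group[1:]):
                for e in list(group[1:]):
                    if d != e and rel(R, d, e) < 0:
                        group.remove(d if scores[d] < scores[e] else e)
                        changed = True
                        break
                if changed:
                    break
        groups.append(group)
        for t in group:
            remaining.remove(t)
    return groups
```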
Furthermore, in order to improve practicability, based on the above embodiment, the present invention further supports adjustment of the preset correlation threshold: when a grouping parameter adjustment instruction is received, the locally stored preset adjustment factor is updated according to the new adjustment factor in the grouping parameter adjustment instruction. If the number of groups is to be large, the number of attribute identification tasks in each group can be reduced by increasing the preset correlation threshold; if the number of groups is to be small, the number of attribute identification tasks in each group can be increased by reducing the preset correlation threshold.
The above embodiment does not limit how to train the multi-task target recognition model by using the second target sample data set, and based on the above embodiment, the present invention further provides a practical, feasible, simple and easy-to-implement training manner of the multi-task target recognition model, which may include the following contents:
The multi-task target recognition model divides the M attribute identification tasks into K groups (K ≤ M): the bottom layer shares network parameters, and the top-layer grouping parameter feature learning layer is split into K subgroups. Parameters are partially shared within each subgroup, and the correlation of the different attributes in a group is exploited so that attributes with commonality learn high-level features together, further refining the precision of each attribute task. The parameters of the multi-task target recognition model are re-initialized, and the model is trained using the same initialization and gradient descent algorithm as the initial multi-task target recognition model: according to a preset total number of training epochs, target sample data are repeatedly selected from the second target sample data set and the multi-task target recognition loss function relation is called to calculate the loss function of the multi-task target recognition model, until the preset number of training epochs is reached and iteration ends, yielding the trained multi-task target recognition model. The multi-task target recognition loss function relation may be expressed as:
L = \sum_{j=1}^{M} \sum_{i=1}^{N} loss(F(X_i; \theta_{s-org}, \theta_j), y_{ij})

where \{\theta_1, …, \theta_M\} is the task attribute parameter set of the M attribute identification tasks, \theta_{s-org} is the original shared parameter set, y_{ij} is the actual label of the j-th attribute identification task for the i-th item of target data in the second target sample data set X, F(X_i) is the attribute prediction function corresponding to the i-th target data, and N is the total number of target data contained in the second target sample data set.
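A minimal sketch of this training stage, assuming PyTorch, the GroupedMultiTaskModel from the earlier sketch, a dataloader yielding inputs together with a dictionary of per-task label tensors, and cross-entropy as the per-task loss; the optimizer choice and hyperparameters are assumptions:

```python
import torch
import torch.nn as nn

def train_grouped_model(model, dataloader, num_epochs=30, lr=1e-3, device="cpu"):
    """Train the grouped multi-task model by summing the per-task losses."""
    model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)

    for epoch in range(num_epochs):           # preset total number of training epochs
        for x, labels in dataloader:
            x = x.to(device)
            outputs = model(x)                 # dict: task name -> 1x2 logits per sample
            # Multi-task loss: sum over the M attribute identification tasks.
            loss = sum(criterion(outputs[t], labels[t].to(device)) for t in outputs)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```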
Based on the above embodiment, in the practical application process, the present invention further provides another target recognition method, which is applicable to any target recognition scene, referring to fig. 5, and may include the following contents:
S501: the multi-task target recognition model is trained in advance.
The present embodiment may be trained to obtain a multi-task target recognition model using the target recognition method steps described in any of the embodiments described above.
S502: and acquiring the data to be processed of the target to be identified.
The object to be identified of the present embodiment includes a plurality of attributes.
S503: inputting the target data to be identified into a multitask target identification model to obtain an identification result of at least one attribute in the target to be identified.
The target data to be identified is data corresponding to the target to be identified, namely data to be processed.
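As a usage illustration of this step, a minimal inference sketch continuing the PyTorch sketches above; the names and the single-image input are assumptions:

```python
import torch

@torch.no_grad()
def identify(model, image_tensor):
    """Return the predicted class index of every attribute for one target image."""
    model.eval()
    logits = model(image_tensor.unsqueeze(0))   # add a batch dimension
    return {task: out.argmax(dim=1).item() for task, out in logits.items()}
```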
From the above, the present embodiment can simply and accurately determine the correlation problem between different attribute recognition tasks, and can effectively improve the accuracy of target recognition.
Based on the above embodiment, the present invention further provides another target recognition method, which is applicable to any face attribute recognition scene, referring to fig. 6, and may include the following contents:
S601: a first face image dataset and a second face image dataset labeled with attribute category labels are obtained.
S602: in the process of training an initial multi-task face recognition model by using the first face image data set, calculating correlation information among different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task face recognition model.
The initial multi-task face recognition model comprises an input layer, a shared parameter characteristic learning layer, a task parameter characteristic learning layer and an output layer which are sequentially connected.
S603: based on each correlation information, grouping each attribute identification task with the aim of maximizing the correlation information among all attribute identification tasks to obtain a grouping result; each set of attribute identification tasks includes at least one attribute identification task.
S604: and deploying the grouping result to the initial multi-task face recognition model by replacing the task parameter feature learning layer to obtain the multi-task face recognition model.
S605: and training the multi-task face recognition model by using the second face image data set to obtain the multi-task face recognition model for simultaneously executing a plurality of face attribute recognition tasks.
In this embodiment, the object of the foregoing embodiment is the face of the present embodiment, the attribute recognition task of the foregoing embodiment is the face attribute recognition task of the present embodiment, the first target sample dataset of the foregoing embodiment is the first face image dataset of the present embodiment, the second target sample dataset of the foregoing embodiment is the second face image dataset of the present embodiment, the initial multi-tasking object recognition model of the foregoing embodiment is the initial multi-tasking face recognition model of the present embodiment, the multi-tasking object recognition model of the foregoing embodiment is the multi-tasking face recognition model of the present embodiment, and the concepts of the present embodiment are replaced with the corresponding concepts of the foregoing embodiment, so as to solve the face recognition task of the multi-attribute task in the face attribute recognition application scenario.
As can be seen from the above, this embodiment designs and trains a multi-task face attribute classification model based on the grouping method, which can accurately output multiple different face attributes at the same time. The architecture adopts hard parameter sharing: the bottom layers of the model share parameters, while each top-layer task has its own feature parameter layer. Multi-task grouping is performed on this basis, so that the model parameters of several related tasks are learned jointly to mine their common information, which also alleviates the problem that tasks with few samples cannot reach adequate training precision. It further addresses a shortcoming of related-art multi-task learning, which usually relies on prior knowledge to group sub-tasks in a fixed way: if mutually exclusive tasks are forcibly bound together to share parameters, the tasks compete for parameters, or the loss of one task is updated in a decreasing direction while the loss of another increases, because the tasks are not correlated with each other. The risk of bias caused by unreasonable, subjective manual grouping of the multi-task model is effectively avoided; at the same time, reasonable grouping helps the model learn more general and global features and prevents over-fitting caused by excessive attention to certain local features. The method can simply and accurately determine the correlation between different face attribute recognition tasks, and can effectively improve the accuracy of face recognition.
It should be noted that, in the present invention, the steps are not strictly executed sequentially, so long as they conform to the logic sequence, and the steps may be executed simultaneously or may be executed according to a certain preset sequence, and fig. 1, fig. 5, and fig. 6 are only schematic, and are not meant to represent only such execution sequence.
Finally, based on the above technical solution of the present invention, the following description will exemplify some possible application scenarios related to the technical solution of the present invention with reference to fig. 7, and fig. 7 is a schematic diagram of a hardware composition frame to which the target recognition method provided by the present invention is applicable, where the following may be included:
the hardware component framework may include a first electronic device 71 and a second electronic device 72, with the first electronic device 71 and the second electronic device 72 being connected by a network 73. The first electronic device 71 is disposed with a processor for executing the target recognition method described in any of the above embodiments, and the second electronic device 72 is disposed with a user terminal for providing a man-machine interaction interface. The first electronic device 71 may complete all or part of the steps in the target recognition method described in the above embodiments.
Based on the above technical solutions of the present application, one of the application scenarios of the embodiments of the present invention may be implemented through interaction between the second electronic device 72 and the user, in this application scenario, the user may issue a command or request, such as a packet parameter adjustment command, to the first electronic device 71 through the second electronic device 72, may upload or select, through the second electronic device 72, the first target sample data set, the second target sample data set, and the data to be processed of the target to be identified, and may issue an access information request, where the access information may be information on accessing the first electronic device 71 through interaction between the second electronic device 72 and the first electronic device 71, or may be information for directly accessing the second electronic device 72 itself, which is not limited in this embodiment.
It should be noted that the above application scenario is only shown for the convenience of understanding the idea and principle of the present invention, and the embodiment of the present invention is not limited in any way. Rather, embodiments of the invention may be applied to any scenario where applicable.
From the above, the present embodiment can simply and accurately determine the correlation problem between different attribute recognition tasks, and can effectively improve the accuracy of target recognition.
The invention also provides a corresponding device for the target identification method, so that the method has more practicability. Wherein the device may be described separately from the functional module and the hardware. In the following description, the object recognition apparatus provided by the present invention is used to implement the object recognition method provided by the present invention, and in this embodiment, the object recognition apparatus may include or be divided into one or more program modules, where the one or more program modules are stored in a storage medium and executed by one or more processors, to implement the object recognition method disclosed in the first embodiment. Program modules in the present embodiment refer to a series of computer program instruction segments capable of performing specific functions, and are more suitable than programs themselves for describing the execution of the object recognition apparatus in a storage medium. The following description will specifically describe functions of each program module of the present embodiment, and the object recognition apparatus described below and the object recognition method described above may be referred to correspondingly to each other.
Referring to fig. 8, fig. 8 is a block diagram of an object recognition device according to the embodiment under a specific implementation manner, where the device may include:
An image sample acquisition module 801 is configured to acquire a first target sample data set and a second target sample data set labeled with attribute type labels.
A correlation determination module 802, configured to calculate correlation information between different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task target recognition model in a process of training the initial multi-task target recognition model using the first target sample data set; the initial multi-task target recognition model comprises an input layer, a shared parameter characteristic learning layer, a task parameter characteristic learning layer and an output layer which are sequentially connected.
The grouping module 803 is configured to group each attribute identification task based on each correlation information, with the objective of maximizing correlation information among all attribute identification tasks, to obtain a grouping result; each set of attribute identification tasks includes at least one attribute identification task.
The target recognition module 804 is configured to deploy the grouping result to the initial multi-task target recognition model by replacing the task parameter feature learning layer, so as to obtain the multi-task target recognition model; training the multi-task target recognition model by using the second target sample data set to obtain the multi-task target recognition model for simultaneously executing a plurality of attribute recognition tasks.
Illustratively, in some implementations of the present embodiment, the correlation determination module 802 may be further configured to:
for every two attribute identification tasks, determining the correlation between the first attribute identification task and the second attribute identification task as correlation information according to the loss influence degree of the first attribute identification task on the second attribute identification task; the correlation relationship comprises a positive correlation relationship and a negative correlation relationship, wherein the positive correlation relationship is used for representing that the first attribute identification task and the second attribute identification task are suitable as a group; the negative correlation is used to indicate that the first attribute identification task and the second attribute identification task are not suitable as a group.
As another exemplary implementation of the above embodiment, the above correlation determination module 802 may be further configured to:
calculating a first loss function value of the first attribute identification task at the current moment based on the first target sample data; gradient updating is carried out on the shared parameter characteristic learning layer by utilizing the first loss function value, so that new shared parameters are obtained; calculating a second original loss function value of a second attribute identification task based on the original sharing parameter at the current moment and second target sample data; calculating a second new loss function value for a second attribute identification task based on the new sharing parameter and the second target sample data; if the second original loss function value is larger than the second new loss function value, the first attribute identification task and the second attribute identification task are in positive correlation; and if the second original loss function value is smaller than or equal to the second new loss function value, the first attribute identification task and the second attribute identification task are in a negative correlation relationship.
As an exemplary implementation of the above embodiment, the above correlation determination module 802 may be further configured to:
calling an original loss function value calculation relation, and calculating a second original loss function value of a second attribute identification task; the original loss function value calculation relation is:
loss_{q-org} = \sum_{i=1}^{K} loss(F(X_t; \theta_{s-org}^{t}, \theta_q^{t}), y_i^{q})

where loss_{q-org} is the second original loss function value, q denotes the second attribute identification task, Y is the truth value corresponding to the second target sample data X_t at time t, K is the total number of target data contained in the second target sample data X_t, F(X_t) is the attribute prediction function corresponding to the second target sample data X_t, \theta_q^{t} is the task attribute parameter of the second attribute identification task at time t, \theta_{s-org}^{t} is the original shared parameter at time t, and y_i^{q} is the actual label of the second attribute identification task of the i-th target data in the second target sample data X_t.
As yet another exemplary implementation of the above embodiment, the above correlation determination module 802 may be further configured to:
invoking a new loss function value calculation relation, and calculating a second new loss function value of the second attribute identification task; the new loss function value calculation relation is:
loss_{q-new} = \sum_{i=1}^{K} loss(F(X_t; \theta_{s-new}^{t}, \theta_q^{t}), y_i^{q})

where loss_{q-new} is the second new loss function value and \theta_{s-new}^{t} is the new shared parameter set at time t; the remaining symbols have the same meaning as in the original loss function value calculation relation.
Illustratively, in other implementations of the present embodiment, the correlation determination module 802 may be further configured to:
For every two attribute identification tasks, calculating the relevance scores of the first attribute identification task and the second attribute identification task according to the new loss value updated by the gradient of the second attribute identification task along the descending direction of the loss function of the first attribute identification task and the original loss value before gradient updating; according to the numerical relation between the correlation score and a preset value, determining the correlation relation between the first attribute identification task and the second attribute identification task to be used as correlation information; the correlation relationship comprises a positive correlation relationship and a negative correlation relationship, wherein the positive correlation relationship is used for representing that the first attribute identification task and the second attribute identification task are suitable as a group; the negative correlation is used to indicate that the first attribute identification task and the second attribute identification task are not suitable as a group.
As an exemplary implementation of the above embodiment, the above correlation determination module 802 may be further configured to:
invoking a correlation score calculation relational expression, and calculating correlation scores of the first attribute identification task and the second attribute identification task; the correlation score calculation relationship is:
R = 1 - (loss_{B-new} / loss_{B-org})

where R is the relevance score, loss_{B-new} is the updated new loss value, and loss_{B-org} is the original loss value before the gradient update.
Illustratively, in still other implementations of the present embodiment, the correlation determination module 802 described above may be further configured to:
calculating correlation information between every two attribute identification tasks for a plurality of times according to a preset frequency; when the total training duration is up, for every two attribute identification tasks, taking the average processing result of the multiple correlation information as the correlation information between the corresponding attribute identification tasks.
In still other embodiments of the present embodiment, the initial multi-task target recognition model uses a hard parameter sharing manner, where the shared parameter feature learning layer is located at a bottom layer, the task parameter feature learning layer is located at a top layer, and the task parameter feature learning layer includes a plurality of subtask parameter feature learning layers, where each subtask parameter feature learning layer corresponds to one attribute recognition task and is used for learning parameter features of the corresponding attribute recognition task.
As an exemplary implementation of the foregoing embodiment, the shared parameter feature learning layer includes a first feature extraction layer, a second feature extraction layer, a third feature extraction layer, a fourth feature extraction layer, a fifth feature extraction layer, a first fully connected layer, and a second fully connected layer; the first feature extraction layer, the second feature extraction layer and the fifth feature extraction layer each comprise a convolution layer, a batch normalization layer and a max pooling layer connected in sequence; the third feature extraction layer and the fourth feature extraction layer each comprise a convolution layer and a batch normalization layer connected in sequence.
Illustratively, in still other implementations of the present embodiment, the multitasking object recognition model includes an input layer, a shared parameter feature learning layer, a grouping parameter feature learning layer, and an output layer that are sequentially connected;
the grouping parameter characteristic learning layers comprise a plurality of subgroup parameter characteristic learning layers, and each subgroup parameter characteristic learning layer corresponds to one subgroup and is used for learning the parameter characteristics of the corresponding subgroup; each subgroup parameter feature learning layer comprises a third fully connected layer and a fourth fully connected layer.
Illustratively, in still other implementations of the present embodiment, the grouping module 803 described above may be further configured to:
for each attribute identification task, acquiring correlation information between the current attribute identification task and other attribute identification tasks, and dividing candidate attribute identification tasks belonging to positive correlation and the current attribute identification tasks into a first subgroup based on each correlation information; and deleting candidate attribute identification tasks which do not meet the conditions in the first subgroup according to the total number of attribute identification tasks contained in the first subgroup, the correlation degree of each candidate attribute identification task and the current attribute identification task and the correlation information among the candidate attribute identification tasks so as to achieve the maximum correlation information among all the attribute identification tasks.
As an exemplary implementation of the foregoing embodiment, the foregoing grouping module 803 may further be configured to:
if the total number of the attribute identification tasks contained in the first subgroup is greater than 2 attribute identification tasks, the first subgroup reserves initial target attribute identification tasks of which the correlation information corresponding to each candidate target attribute identification task is greater than a preset correlation threshold; calculating the correlation information of every two initial target attribute identification tasks in the first subgroup; for a first initial target attribute identification task and a second initial target attribute identification task belonging to a negative correlation, acquiring a first correlation score and a second correlation score between the first initial target attribute identification task and the second initial target attribute identification task and the current attribute identification task respectively; if the first relevance score is greater than the second relevance score, the second initial target attribute identification task is deleted from the first subset.
As another exemplary implementation of the above embodiment, the grouping module 803 may be further configured to:
selecting a maximum correlation score from the correlation scores between the current attribute identification task and each candidate attribute identification task, and determining a preset correlation threshold based on the maximum correlation score and a preset adjustment factor, wherein the preset adjustment factor is more than 0 and less than 1; taking a candidate target attribute identification task with the correlation score with the current attribute identification task being larger than a preset correlation threshold value as an initial target attribute identification task; candidate target attribute identification tasks in the first subset that are not identified for the initial target attribute are deleted.
As yet another exemplary implementation of the above embodiment, the grouping module 803 may be further configured to:
when a grouping parameter adjusting instruction is received, the preset adjusting factors stored locally are updated according to the new adjusting factors in the grouping parameter adjusting instruction.
Illustratively, in still other implementations of this embodiment, the object recognition module 804 may be further configured to:
according to the preset total training period number, repeatedly selecting target sample data in the second target sample data set, calling a multi-task target recognition loss function relation to calculate a loss function of the multi-task target recognition model, and ending iteration until the preset training period number is reached; the multitasking target recognition loss function relationship is:
L = \sum_{j=1}^{M} \sum_{i=1}^{N} loss(F(X_i; \theta_{s-org}, \theta_j), y_{ij})

where \{\theta_1, …, \theta_M\} is the task attribute parameter set of the M attribute identification tasks, \theta_{s-org} is the original shared parameter set, y_{ij} is the actual label of the j-th attribute identification task for the i-th item of target data in the second target sample data set X, F(X_i) is the attribute prediction function corresponding to the i-th target data, and N is the total number of target data contained in the second target sample data set.
Finally, taking the face as an example, the present invention also provides an implementation manner of the object recognition device from the perspective of the functional module, please refer to fig. 9, fig. 9 is a schematic structural frame diagram of the object recognition device provided in this embodiment under another specific implementation manner, where the device may include:
The model training module 901 is configured to train to obtain a multi-task target recognition model by using the target recognition method according to any one of the above embodiments.
A data acquisition module 902, configured to acquire data to be processed of a target to be identified; the object to be identified includes a plurality of attributes.
The recognition result generating module 903 is configured to input the data to be processed into a multitasking target recognition model, and obtain a recognition result of at least one attribute in the target to be recognized.
Referring to fig. 10, fig. 10 is a schematic structural frame diagram of an object recognition device according to another embodiment of the present invention, based on the angle of the functional modules, where the device may include:
The face sample acquisition module 101 is configured to acquire a first face image dataset and a second face image dataset labeled with attribute category labels.
The face attribute relevance determining module 102 is configured to calculate relevance information between different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task face recognition model in a process of training the initial multi-task face recognition model by using the first face image data set; the initial multi-task face recognition model comprises an input layer, a shared parameter characteristic learning layer, a task parameter characteristic learning layer and an output layer which are sequentially connected.
The face attribute recognition task grouping module 103 is configured to group each attribute recognition task based on each correlation information, with the aim of maximizing correlation information among all attribute recognition tasks, to obtain a grouping result; each set of attribute identification tasks includes at least one attribute identification task.
The face recognition module 104 is configured to deploy the grouping result to the initial multi-task face recognition model by replacing the task parameter feature learning layer, so as to obtain the multi-task face recognition model; and training the multi-task face recognition model by using the second face image data set to obtain the multi-task face recognition model for simultaneously executing a plurality of face attribute recognition tasks.
The functions of each functional module of the target recognition device in this embodiment may be specifically implemented according to the method in the foregoing method embodiment, and the specific implementation process may refer to the relevant description of the foregoing method embodiment, which is not repeated herein.
From the above, the accuracy of target identification can be effectively improved.
The above-mentioned object recognition device is described from the viewpoint of a functional module, and further, the invention also provides an electronic device, which is described from the viewpoint of hardware. Fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 11, the electronic device comprises a memory 110 for storing a computer program; a processor 111 for implementing the steps of the object recognition method as mentioned in any of the embodiments above when executing a computer program.
Processor 111 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and processor 111 may also be a controller, microcontroller, microprocessor, or other data processing chip, among others. The processor 111 may be implemented in at least one hardware form of DSP (Digital Signal Processing ), FPGA (Field-Programmable Gate Array, field programmable gate array), PLA (Programmable Logic Array ). The processor 111 may also include a main processor, which is a processor for processing data in an awake state, also called a CPU (Central Processing Unit ), and a coprocessor; a coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 111 may be integrated with a GPU (Graphics Processing Unit, graphics processor) for taking care of rendering and drawing of content that the display screen is required to display. In some embodiments, the processor 111 may also include an AI (Artificial Intelligence ) processor for processing computing operations related to machine learning.
Memory 110 may include one or more computer-readable storage media, which may be non-transitory. Memory 110 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. Memory 110 may be an internal storage unit of an electronic device, such as a hard disk of a server, in some embodiments. The memory 110 may also be an external storage device of the electronic device, such as a plug-in hard disk provided on a server, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card), etc. in other embodiments. Further, the memory 110 may also include both internal storage units and external storage devices of the electronic device. The memory 110 may be used to store not only application software installed in an electronic device, but also various types of data, such as: code of a program or the like in executing the object recognition method may also be used to temporarily store data that has been output or is to be output. In this embodiment, the memory 110 is at least used to store a computer program 1101, where the computer program is loaded and executed by the processor 111 to implement the relevant steps of the object recognition method disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 110 may further include an operating system 1102, data 1103, and the like, and the storage manner may be transient storage or permanent storage. The operating system 1102 may include Windows, unix, linux, among other things. The data 1103 may include, but is not limited to, data corresponding to the target recognition result, and the like.
In some embodiments, the electronic device may further include a display 112, an input/output interface 113, a communication interface 114, or referred to as a network interface, a power supply 115, and a communication bus 116. Among other things, a display 112, an input output interface 113 such as a Keyboard (Keyboard) pertain to user interfaces, which may also include standard wired interfaces, wireless interfaces, and the like. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch, or the like. The display may also be referred to as a display screen or display unit, as appropriate, for displaying information processed in the electronic device and for displaying a visual user interface. The communication interface 114 may illustratively include a wired interface and/or a wireless interface, such as a WI-FI interface, a bluetooth interface, etc., typically used to establish a communication connection between an electronic device and other electronic devices. The communication bus 116 may be a peripheral component interconnect standard (peripheral component interconnect, PCI) bus, or an extended industry standard architecture (extended industry standard architecture, EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 11, but not only one bus or one type of bus.
Those skilled in the art will appreciate that the configuration shown in fig. 11 is not limiting of the electronic device and may include more or fewer components than shown, for example, may also include sensors 117 to perform various functions.
The functions of each functional module of the electronic device in this embodiment may be specifically implemented according to the method in the foregoing method embodiment, and the specific implementation process may refer to the relevant description of the foregoing method embodiment, which is not repeated herein.
From the above, the accuracy of target identification can be effectively improved.
It will be appreciated that the object recognition method of the above embodiment, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied essentially or in part or all of the technical solution contributing to the related art, or may be embodied in the form of a software product stored in a storage medium, which performs all or part of the steps of the methods of the various embodiments of the present invention. And the aforementioned storage medium includes: a U disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), an electrically erasable programmable ROM, registers, a hard disk, a multimedia card, a card-type Memory (e.g., SD or DX Memory, etc.), a magnetic Memory, a removable disk, a CD-ROM, a magnetic disk, or an optical disk, etc., that can store program code.
Based on this, the invention also provides a readable storage medium storing a computer program which, when executed by a processor, performs the steps of the object recognition method according to any one of the embodiments above.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other. For the hardware including the device and the electronic equipment disclosed in the embodiments, the description is relatively simple because the hardware includes the device and the electronic equipment corresponding to the method disclosed in the embodiments, and relevant places refer to the description of the method.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The above describes in detail a target recognition method, device, electronic equipment and readable storage medium provided by the invention. The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to facilitate an understanding of the method of the present invention and its core ideas. It should be noted that, based on the embodiments of the present invention, all other embodiments obtained by a person skilled in the art without making any inventive effort fall within the scope of protection of the present invention. The invention is capable of numerous modifications and adaptations without departing from the principles of the invention, and such modifications and adaptations are intended to be within the scope of the invention as set forth in the claims.

Claims (20)

1. A method of target identification, comprising:
acquiring a first target sample data set and a second target sample data set labeled with attribute category labels; the target sample data is target image data, target audio data, target text data, or target video data;
in the process of training an initial multi-task target recognition model by using the first target sample data set, calculating correlation information among different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task target recognition model; the initial multi-task target recognition model comprises an input layer, a shared parameter feature learning layer, a task parameter feature learning layer and an output layer which are sequentially connected;
based on each correlation information, grouping each attribute identification task with the aim of maximizing the correlation information among all attribute identification tasks to obtain a grouping result; each group of attribute identification tasks comprises at least one attribute identification task;
deploying the grouping result to the initial multi-task target recognition model by replacing the task parameter feature learning layer to obtain a multi-task target recognition model;
training the multi-task target recognition model by using the second target sample data set to obtain a multi-task target recognition model for executing a plurality of attribute recognition tasks simultaneously;
the step of grouping the attribute identification tasks based on the correlation information with the aim of maximizing the correlation information among all the attribute identification tasks to obtain a grouping result comprises the following steps:
for every two attribute identification tasks, calculating the correlation scores of the first attribute identification task and the second attribute identification task according to a new loss value of the second attribute identification task after a gradient update along the descending direction of the loss function of the first attribute identification task, and an original loss value before the gradient update;
if the correlation scores between 2 or more attribute identification tasks and the first attribute identification task are greater than a preset value, deleting the candidate attribute identification tasks which do not meet the conditions in the first subgroup according to the degree of correlation between each candidate attribute identification task and the current attribute identification task and the correlation information among the candidate attribute identification tasks; the first subgroup comprises candidate attribute identification tasks that are positively correlated with the current attribute identification task; the process of deleting the candidate attribute identification tasks in the first subgroup that do not meet the conditions includes: selecting a maximum correlation score from the correlation scores between the current attribute identification task and each candidate attribute identification task, determining a preset correlation threshold based on the maximum correlation score and a preset adjustment factor, the preset adjustment factor being greater than 0 and less than 1, and taking a candidate attribute identification task whose correlation score with the current attribute identification task is greater than the preset correlation threshold as an initial target attribute identification task; if the total number of attribute identification tasks contained in the first subgroup is greater than 2, the first subgroup retains the initial target attribute identification tasks, namely the candidate target attribute identification tasks whose corresponding correlation information is greater than the preset correlation threshold; calculating the correlation information of every two initial target attribute identification tasks in the first subgroup; for a first initial target attribute identification task and a second initial target attribute identification task that are negatively correlated with each other, acquiring a first correlation score and a second correlation score between the first initial target attribute identification task and the current attribute identification task and between the second initial target attribute identification task and the current attribute identification task, respectively; if the first correlation score is greater than the second correlation score, deleting the second initial target attribute identification task from the first subgroup; and when a grouping parameter adjustment instruction is received, updating a locally stored preset adjustment factor according to a new adjustment factor in the grouping parameter adjustment instruction; the preset adjustment factor increases as the number of groups increases and decreases as the number of groups decreases.
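The grouping and pruning procedure of claim 1 can be illustrated with a short sketch. The Python snippet below is a minimal, illustrative reading of that limitation only: it assumes a precomputed dictionary of pairwise correlation scores, and the function name build_group_for_task, the integer task ids, and the default values of the preset value and the adjustment factor are all hypothetical, not elements of the claims.

```python
from typing import Dict, List, Tuple

def build_group_for_task(current: int,
                         scores: Dict[Tuple[int, int], float],
                         tasks: List[int],
                         preset_value: float = 0.0,
                         adjust_factor: float = 0.5) -> List[int]:
    """Form the first subgroup for `current` and prune candidates that do not qualify.
    `scores` holds pairwise correlation scores and is assumed to contain both (a, b) and (b, a)."""
    # Candidate tasks positively correlated with the current task (score above the preset value).
    candidates = [t for t in tasks if t != current and scores[(current, t)] > preset_value]
    if len(candidates) < 2:
        return [current] + candidates  # pruning only applies when the subgroup holds more than 2 tasks

    # Correlation threshold derived from the maximum score and the preset adjustment factor.
    max_score = max(scores[(current, t)] for t in candidates)
    threshold = adjust_factor * max_score
    kept = {t for t in candidates if scores[(current, t)] > threshold}  # initial target tasks

    # Among the initial target tasks, resolve negatively correlated pairs by keeping the one
    # that is more strongly correlated with the current task.
    initial = sorted(kept)
    for i, a in enumerate(initial):
        for b in initial[i + 1:]:
            if a in kept and b in kept and scores[(a, b)] <= preset_value:
                kept.discard(b if scores[(current, a)] > scores[(current, b)] else a)

    return [current] + sorted(kept)
```

A larger adjustment factor raises the threshold and therefore tends to produce more, smaller groups, which matches the claim's statement that the factor grows with the number of groups.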
2. The method according to claim 1, wherein calculating correlation information between different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task target recognition model includes:
for every two attribute identification tasks, determining the correlation between the first attribute identification task and the second attribute identification task as correlation information according to the loss influence degree of the first attribute identification task on the second attribute identification task;
the correlation relationship comprises a positive correlation relationship and a negative correlation relationship, wherein the positive correlation relationship is used for representing that the first attribute identification task and the second attribute identification task are suitable as a group; the negative correlation is used to indicate that the first attribute identification task and the second attribute identification task are not suitable as a group.
3. The method according to claim 2, wherein determining the correlation between the first attribute identification task and the second attribute identification task according to the loss influence degree of the first attribute identification task on the second attribute identification task comprises:
calculating a first loss function value of the first attribute identification task at the current moment based on the first target sample data;
gradient updating is carried out on the shared parameter feature learning layer by using the first loss function value to obtain a new shared parameter;
calculating a second original loss function value of the second attribute identification task based on the original shared parameter at the current moment and second target sample data;
calculating a second new loss function value of the second attribute identification task based on the new shared parameter and the second target sample data;
if the second original loss function value is greater than the second new loss function value, the first attribute identification task and the second attribute identification task are in a positive correlation; and if the second original loss function value is less than or equal to the second new loss function value, the first attribute identification task and the second attribute identification task are in a negative correlation.
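To make the correlation test of claims 2 and 3 concrete: the sign of the correlation comes from a "lookahead" step that moves the shared parameters along the first task's gradient and checks whether the second task's loss drops. The PyTorch sketch below is illustrative only; the function name correlation_sign, the loss_fn_q closure, and the learning rate are assumptions, not features described in the claims.

```python
import torch

def correlation_sign(loss_p: torch.Tensor, loss_fn_q, shared_params: list, lr: float = 1e-3) -> int:
    """Return +1 (positive correlation) if one gradient step on task p's loss over the
    shared parameters also lowers task q's loss, otherwise -1 (negative correlation)."""
    # Second task's loss with the original shared parameters.
    loss_q_org = float(loss_fn_q())

    # One gradient step on the shared parameters along task p's descent direction.
    backup = [p.detach().clone() for p in shared_params]
    grads = torch.autograd.grad(loss_p, shared_params, retain_graph=True)
    with torch.no_grad():
        for p, g in zip(shared_params, grads):
            p -= lr * g

    # Second task's loss with the new shared parameters.
    loss_q_new = float(loss_fn_q())

    # Restore the original shared parameters so normal training is unaffected.
    with torch.no_grad():
        for p, b in zip(shared_params, backup):
            p.copy_(b)

    return 1 if loss_q_org > loss_q_new else -1
```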
4. The method according to claim 3, wherein calculating a second original loss function value of a second attribute identification task based on the original shared parameter at the current time and second target sample data comprises:
calling an original loss function value calculation relation, and calculating a second original loss function value of a second attribute identification task; the original loss function value calculation relation is as follows:
loss_{q-org} = -(1/K) Σ_{i=1}^{K} y_i^q · log F(X_t; θ_org^t, θ_q^t)_i;
where loss_{q-org} is the second original loss function value, q denotes the second attribute identification task, Y is the true value corresponding to the second target sample data X_t at time t, K is the total number of target data contained in the second target sample data X_t, F(X_t) is the attribute prediction function corresponding to the second target sample data X_t, θ_q^t is the task attribute parameter of the second attribute identification task at time t, θ_org^t is the original shared parameter at time t, and y_i^q is the actual label of the second attribute identification task for the i-th target data of the second target sample data X_t.
5. The method of claim 4, wherein the calculating a second new loss function value for the second attribute identification task based on the new shared parameter and the second target sample data comprises:
invoking a new loss function value calculation relation, and calculating a second new loss function value of the second attribute identification task; the new loss function value calculation relation is:
loss_{q-new} = -(1/K) Σ_{i=1}^{K} y_i^q · log F(X_t; θ_new^t, θ_q^t)_i;
where loss_{q-new} is the second new loss function value and θ_new^t is the new shared parameter set at time t.
6. The method according to claim 1, wherein calculating correlation information between different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task target recognition model includes:
according to the numerical relation between the correlation score and a preset value, determining the correlation relationship between the first attribute identification task and the second attribute identification task as the correlation information;
the correlation relationship comprises a positive correlation relationship and a negative correlation relationship, wherein the positive correlation relationship is used for representing that the first attribute identification task and the second attribute identification task are suitable as a group; the negative correlation is used to indicate that the first attribute identification task and the second attribute identification task are not suitable as a group.
7. The method according to claim 6, wherein calculating the correlation scores of the first attribute identification task and the second attribute identification task from the new loss value after the gradient of the second attribute identification task is updated along the decreasing direction of the loss function of the first attribute identification task and the original loss value before the gradient is updated, comprises:
invoking a correlation score calculation relational expression to calculate correlation scores of the first attribute identification task and the second attribute identification task; the correlation score calculation relational expression is as follows:
R = 1 - (loss_{B-new} / loss_{B-org});
where R is the correlation score, loss_{B-new} is the new loss value after the gradient update, and loss_{B-org} is the original loss value before the gradient update.
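Claim 7's correlation score is a one-line computation; a tiny helper with illustrative naming could read:

```python
def correlation_score(loss_b_new: float, loss_b_org: float) -> float:
    """R = 1 - (loss_B-new / loss_B-org): positive when the gradient step taken for the
    first task also reduced the second task's loss, negative otherwise."""
    return 1.0 - (loss_b_new / loss_b_org)
```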
8. The method according to claim 1, wherein calculating correlation information between different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task target recognition model includes:
calculating correlation information between every two attribute identification tasks for a plurality of times according to a preset frequency;
when the total training duration is reached, for every two attribute identification tasks, taking the average of the plurality of pieces of correlation information as the correlation information between the corresponding attribute identification tasks.
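Claim 8 samples the pairwise correlation several times during training and averages the samples when the total training duration is reached. A minimal bookkeeping sketch follows; the class and method names are hypothetical.

```python
from collections import defaultdict

class CorrelationTracker:
    """Accumulates pairwise correlation scores sampled at a preset frequency."""
    def __init__(self):
        self._sums = defaultdict(float)
        self._counts = defaultdict(int)

    def record(self, task_a: int, task_b: int, score: float) -> None:
        self._sums[(task_a, task_b)] += score
        self._counts[(task_a, task_b)] += 1

    def averaged(self) -> dict:
        # Called once the total training duration is reached.
        return {pair: self._sums[pair] / self._counts[pair] for pair in self._sums}
```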
9. The target recognition method according to claim 1, wherein the initial multi-task target recognition model adopts a hard parameter sharing mode, the shared parameter feature learning layer is located at a bottom layer, the task parameter feature learning layer is located at a top layer, the task parameter feature learning layer comprises a plurality of subtask parameter feature learning layers, each subtask parameter feature learning layer corresponds to one attribute recognition task and is used for learning parameter features of the corresponding attribute recognition task.
10. The target recognition method of claim 9, wherein the shared parameter feature learning layer comprises a first feature extraction layer, a second feature extraction layer, a third feature extraction layer, a fourth feature extraction layer, a fifth feature extraction layer, a first fully connected layer, and a second fully connected layer;
the first feature extraction layer, the second feature extraction layer and the fifth feature extraction layer each comprise a convolution layer, a batch normalization layer and a maximum pooling layer which are sequentially connected; the third feature extraction layer and the fourth feature extraction layer each comprise a convolution layer and a batch normalization layer which are sequentially connected.
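The shared-parameter feature learning layer of claims 9 and 10 (five feature extraction layers followed by two fully connected layers, with max pooling only in the first, second and fifth blocks) could be sketched in PyTorch roughly as follows. The channel counts, kernel sizes, ReLU activations, image input, and the use of nn.LazyLinear are assumptions made only to keep the sketch runnable; they are not values taken from the patent.

```python
import torch
import torch.nn as nn

def extraction_block(in_ch: int, out_ch: int, pool: bool) -> nn.Sequential:
    """Convolution + batch normalization, optionally followed by max pooling."""
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
              nn.BatchNorm2d(out_ch),
              nn.ReLU(inplace=True)]
    if pool:
        layers.append(nn.MaxPool2d(kernel_size=2))
    return nn.Sequential(*layers)

class SharedFeatureLearningLayer(nn.Module):
    def __init__(self, feat_dim: int = 512):
        super().__init__()
        self.backbone = nn.Sequential(
            extraction_block(3,   64,  pool=True),   # first feature extraction layer
            extraction_block(64,  128, pool=True),   # second feature extraction layer
            extraction_block(128, 256, pool=False),  # third (no pooling)
            extraction_block(256, 256, pool=False),  # fourth (no pooling)
            extraction_block(256, 512, pool=True),   # fifth feature extraction layer
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(feat_dim),        # first fully connected layer
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim, feat_dim),  # second fully connected layer
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(self.backbone(x))
```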
11. The object recognition method according to claim 1, wherein the multitasking object recognition model includes the input layer, the shared parameter feature learning layer, the grouping parameter feature learning layer, and the output layer connected in this order;
the grouping parameter feature learning layer comprises a plurality of subgroup parameter feature learning layers, each subgroup parameter feature learning layer corresponds to one subgroup and is used for learning the parameter features of the corresponding subgroup; each subgroup parameter feature learning layer comprises a third fully connected layer and a fourth fully connected layer.
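Claim 11's grouping-parameter feature learning layer, one subgroup head of two fully connected layers per task group, might look like the sketch below. The hidden dimension, the ReLU activation, and the per-group output sizes are placeholders.

```python
import torch
import torch.nn as nn

class GroupedTaskHeads(nn.Module):
    """One subgroup parameter feature learning layer per task group; each head stacks
    a third and a fourth fully connected layer, as described in claim 11."""
    def __init__(self, feat_dim: int, group_output_dims: list):
        super().__init__()
        self.heads = nn.ModuleList([
            nn.Sequential(
                nn.Linear(feat_dim, feat_dim // 2),   # third fully connected layer
                nn.ReLU(inplace=True),
                nn.Linear(feat_dim // 2, out_dim),    # fourth fully connected layer
            )
            for out_dim in group_output_dims
        ])

    def forward(self, shared_features: torch.Tensor) -> list:
        # Every subgroup head consumes the same shared features and produces its own outputs.
        return [head(shared_features) for head in self.heads]
```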
12. The method for identifying objects according to claim 1, wherein the grouping each attribute identification task based on each correlation information with the objective of maximizing correlation information among all attribute identification tasks to obtain a grouping result includes:
for each attribute identification task, acquiring correlation information between the current attribute identification task and the other attribute identification tasks, and dividing the candidate attribute identification tasks that are positively correlated with the current attribute identification task, together with the current attribute identification task, into a first subgroup based on each correlation information;
and deleting the candidate attribute identification tasks in the first subgroup that do not meet the conditions according to the total number of attribute identification tasks contained in the first subgroup, the degree of correlation between each candidate attribute identification task and the current attribute identification task, and the correlation information among the candidate attribute identification tasks, so as to maximize the correlation information among all attribute identification tasks.
13. The method for identifying a target according to claim 1, wherein the correlation information is a correlation score, and the first subgroup retaining the initial target attribute identification tasks whose corresponding correlation information is greater than the preset correlation threshold comprises:
deleting the candidate target attribute identification tasks in the first subgroup that are not initial target attribute identification tasks.
14. The method according to any one of claims 1 to 13, wherein training the multi-task target recognition model by using the second target sample data set comprises:
according to a preset total number of training periods, repeatedly selecting target sample data from the second target sample data set and calling a multi-task target recognition loss function relation to calculate the loss function of the multi-task target recognition model, and ending the iteration when the preset number of training periods is reached; the multi-task target recognition loss function relation is:
loss = -(1/N) Σ_{i=1}^{N} Σ_{j=1}^{M} y_i^j · log F(X_i; θ_org, Θ)_j;
where Θ is the task attribute parameter set of the M attribute identification tasks, θ_org is the original shared parameter set, y_i^j is the actual label of the j-th attribute identification task for the i-th item of target data in the second target sample data set X, F(X_i) is the attribute prediction function corresponding to the i-th target data, and N is the total number of target data contained in the second target sample data set.
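The overall multi-task loss of claim 14 aggregates each attribute task's loss over all samples of the second target sample data set. The sketch below simply averages a cross-entropy term over the N samples and M tasks; treating the per-task loss as cross-entropy and averaging (rather than summing) are assumptions, not statements about the claimed relation.

```python
import torch
import torch.nn.functional as F

def multitask_loss(task_logits: list, labels: torch.Tensor) -> torch.Tensor:
    """task_logits: M tensors of shape (N, C_j), one per attribute identification task.
    labels: long tensor of shape (N, M); labels[i, j] is the actual label of task j for sample i."""
    per_task = [F.cross_entropy(logits, labels[:, j]) for j, logits in enumerate(task_logits)]
    # Average the M per-task losses, each already averaged over the N samples.
    return torch.stack(per_task).mean()
```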
15. A method of target identification, comprising:
training to obtain a multi-task target recognition model by using the target recognition method according to any one of claims 1 to 14;
acquiring data to be processed of a target to be identified; the target to be identified comprises a plurality of attributes;
and inputting the data to be processed into the multi-task target recognition model to obtain a recognition result of at least one attribute of the target to be identified.
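Claim 15 describes inference: feed the data to be processed through the trained multi-task model and read off one result per attribute. A short usage sketch follows, assuming the model returns a list of per-task logits; the function and attribute names are made up for illustration.

```python
import torch

@torch.no_grad()
def recognise_attributes(model: torch.nn.Module, batch: torch.Tensor, attribute_names: list) -> dict:
    """Run the trained multi-task target recognition model on the data to be processed
    and return one recognition result per attribute identification task."""
    model.eval()
    task_outputs = model(batch)  # assumed to return a list of per-task logits
    return {name: logits.argmax(dim=1) for name, logits in zip(attribute_names, task_outputs)}
```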
16. A method of target identification, comprising:
acquiring a first face image data set and a second face image data set labeled with attribute category labels;
in the process of training an initial multi-task face recognition model by using the first face image data set, calculating correlation information among different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task face recognition model; the initial multi-task face recognition model comprises an input layer, a shared parameter feature learning layer, a task parameter feature learning layer and an output layer which are sequentially connected;
based on each correlation information, grouping each attribute identification task with the aim of maximizing the correlation information among all attribute identification tasks to obtain a grouping result; each group of attribute identification tasks comprises at least one attribute identification task;
deploying the grouping result to the initial multi-task face recognition model by replacing the task parameter feature learning layer to obtain a multi-task face recognition model;
training the multi-task face recognition model by using the second face image data set to obtain a multi-task face recognition model for executing a plurality of face attribute recognition tasks simultaneously;
the step of grouping the attribute identification tasks based on the correlation information with the aim of maximizing the correlation information among all the attribute identification tasks to obtain a grouping result comprises the following steps:
for every two attribute identification tasks, calculating the correlation scores of the first attribute identification task and the second attribute identification task according to a new loss value of the second attribute identification task after a gradient update along the descending direction of the loss function of the first attribute identification task, and an original loss value before the gradient update;
if the correlation scores between 2 or more attribute identification tasks and the first attribute identification task are greater than a preset value, deleting the candidate attribute identification tasks which do not meet the conditions in the first subgroup according to the degree of correlation between each candidate attribute identification task and the current attribute identification task and the correlation information among the candidate attribute identification tasks; the first subgroup comprises candidate attribute identification tasks that are positively correlated with the current attribute identification task; the process of deleting the candidate attribute identification tasks in the first subgroup that do not meet the conditions includes: selecting a maximum correlation score from the correlation scores between the current attribute identification task and each candidate attribute identification task, determining a preset correlation threshold based on the maximum correlation score and a preset adjustment factor, the preset adjustment factor being greater than 0 and less than 1, and taking a candidate attribute identification task whose correlation score with the current attribute identification task is greater than the preset correlation threshold as an initial target attribute identification task; if the total number of attribute identification tasks contained in the first subgroup is greater than 2, the first subgroup retains the initial target attribute identification tasks, namely the candidate target attribute identification tasks whose corresponding correlation information is greater than the preset correlation threshold; calculating the correlation information of every two initial target attribute identification tasks in the first subgroup; for a first initial target attribute identification task and a second initial target attribute identification task that are negatively correlated with each other, acquiring a first correlation score and a second correlation score between the first initial target attribute identification task and the current attribute identification task and between the second initial target attribute identification task and the current attribute identification task, respectively; if the first correlation score is greater than the second correlation score, deleting the second initial target attribute identification task from the first subgroup; and when a grouping parameter adjustment instruction is received, updating a locally stored preset adjustment factor according to a new adjustment factor in the grouping parameter adjustment instruction; the preset adjustment factor increases as the number of groups increases and decreases as the number of groups decreases.
17. An object recognition apparatus, comprising:
the image sample acquisition module is used for acquiring a first target sample data set and a second target sample data set labeled with attribute category labels; the target sample data is target image data, target audio data, target text data, or target video data;
the correlation determination module is used for calculating correlation information among different attribute recognition tasks based on a multi-task loss function corresponding to the initial multi-task target recognition model in the process of training the initial multi-task target recognition model by using the first target sample data set; the initial multi-task target recognition model comprises an input layer, a shared parameter feature learning layer, a task parameter feature learning layer and an output layer which are sequentially connected;
the grouping module is used for grouping the attribute identification tasks based on the correlation information with the aim of maximizing the correlation information among all the attribute identification tasks to obtain a grouping result; each group of attribute identification tasks comprises at least one attribute identification task;
the target recognition module is used for deploying the grouping result to the initial multi-task target recognition model by replacing the task parameter feature learning layer to obtain a multi-task target recognition model; training the multi-task target recognition model by using the second target sample data set to obtain a multi-task target recognition model for executing a plurality of attribute recognition tasks simultaneously;
Wherein the grouping module is further to:
for every two attribute identification tasks, calculating the correlation scores of the first attribute identification task and the second attribute identification task according to a new loss value of the second attribute identification task after a gradient update along the descending direction of the loss function of the first attribute identification task, and an original loss value before the gradient update;
if the correlation scores between 2 or more attribute identification tasks and the first attribute identification task are greater than a preset value, deleting the candidate attribute identification tasks which do not meet the conditions in the first subgroup according to the degree of correlation between each candidate attribute identification task and the current attribute identification task and the correlation information among the candidate attribute identification tasks; the first subgroup comprises candidate attribute identification tasks that are positively correlated with the current attribute identification task; the process of deleting the candidate attribute identification tasks in the first subgroup that do not meet the conditions includes: selecting a maximum correlation score from the correlation scores between the current attribute identification task and each candidate attribute identification task, determining a preset correlation threshold based on the maximum correlation score and a preset adjustment factor, the preset adjustment factor being greater than 0 and less than 1, and taking a candidate attribute identification task whose correlation score with the current attribute identification task is greater than the preset correlation threshold as an initial target attribute identification task; if the total number of attribute identification tasks contained in the first subgroup is greater than 2, the first subgroup retains the initial target attribute identification tasks, namely the candidate target attribute identification tasks whose corresponding correlation information is greater than the preset correlation threshold; calculating the correlation information of every two initial target attribute identification tasks in the first subgroup; for a first initial target attribute identification task and a second initial target attribute identification task that are negatively correlated with each other, acquiring a first correlation score and a second correlation score between the first initial target attribute identification task and the current attribute identification task and between the second initial target attribute identification task and the current attribute identification task, respectively; if the first correlation score is greater than the second correlation score, deleting the second initial target attribute identification task from the first subgroup; and when a grouping parameter adjustment instruction is received, updating a locally stored preset adjustment factor according to a new adjustment factor in the grouping parameter adjustment instruction; the preset adjustment factor increases as the number of groups increases and decreases as the number of groups decreases.
18. An object recognition apparatus, comprising:
a model training module for training to obtain a multi-task target recognition model by using the target recognition method according to any one of claims 1 to 14;
the data acquisition module is used for acquiring the data to be processed of the target to be identified; the target to be identified comprises a plurality of attributes;
the recognition result generation module is used for inputting the data to be processed into the multi-task target recognition model to obtain a recognition result of at least one attribute of the target to be identified.
19. An electronic device comprising a processor and a memory, the processor being configured to implement the steps of the target identification method according to any one of claims 1 to 16 when executing a computer program stored in the memory.
20. A readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, implements the steps of the target identification method according to any one of claims 1 to 16.
CN202311598731.6A 2023-11-28 2023-11-28 Target identification method, device, electronic equipment and readable storage medium Active CN117315445B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311598731.6A CN117315445B (en) 2023-11-28 2023-11-28 Target identification method, device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311598731.6A CN117315445B (en) 2023-11-28 2023-11-28 Target identification method, device, electronic equipment and readable storage medium

Publications (2)

Publication Number Publication Date
CN117315445A (en) 2023-12-29
CN117315445B (en) 2024-03-22

Family

ID=89273947

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311598731.6A Active CN117315445B (en) 2023-11-28 2023-11-28 Target identification method, device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN117315445B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765288A (en) * 2019-09-04 2020-02-07 北京旷视科技有限公司 Image information synchronization method, device and system and storage medium
CN111507263A (en) * 2020-04-17 2020-08-07 电子科技大学 Face multi-attribute recognition method based on multi-source data
CN111459646A (en) * 2020-05-09 2020-07-28 南京大学 Big data quality management task scheduling method based on pipeline model and task combination
CN115146865A (en) * 2022-07-22 2022-10-04 中国平安财产保险股份有限公司 Task optimization method based on artificial intelligence and related equipment
CN116563932A (en) * 2023-05-15 2023-08-08 辽宁蜻蜓健康科技有限公司 Eye image recognition method and related equipment based on multitask learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Efficiently Identifying Task Groupings for Multi-Task Learning; Christopher Fifty et al.; NeurIPS 2021; pp. 1-14 *

Also Published As

Publication number Publication date
CN117315445A (en) 2023-12-29

Similar Documents

Publication Publication Date Title
US11537884B2 (en) Machine learning model training method and device, and expression image classification method and device
CN108701216B (en) Face recognition method and device and intelligent terminal
WO2020182121A1 (en) Expression recognition method and related device
WO2020211398A1 (en) Portrait attribute model creating method and apparatus, computer device and storage medium
CN110210624A (en) Execute method, apparatus, equipment and the storage medium of machine-learning process
Rafique et al. Age and gender prediction using deep convolutional neural networks
CN108124486A (en) Face living body detection method based on cloud, electronic device and program product
CN107958230B (en) Facial expression recognition method and device
CN109919252B (en) Method for generating classifier by using few labeled images
CN107145857A (en) Face character recognition methods, device and method for establishing model
CN106295591A (en) Gender identification method based on facial image and device
CN110414428A (en) A method of generating face character information identification model
US11861514B2 (en) Using machine learning algorithms to prepare training datasets
CN112889065A (en) System and method for providing personalized product recommendations using deep learning
CN113569732A (en) Face attribute recognition method and system based on parallel sharing multitask network
CN114913923A (en) Cell type identification method aiming at open sequencing data of single cell chromatin
CN112801236A (en) Image recognition model migration method, device, equipment and storage medium
CN113569627A (en) Human body posture prediction model training method, human body posture prediction method and device
Zhang et al. Facial component-landmark detection with weakly-supervised lr-cnn
CN117315445B (en) Target identification method, device, electronic equipment and readable storage medium
CN113643283A (en) Method, device, equipment and storage medium for detecting aging condition of human body
CN113221695A (en) Method for training skin color recognition model, method for recognizing skin color and related device
CN112101185A (en) Method for training wrinkle detection model, electronic device and storage medium
El Sayed et al. 3D face detection based on salient features extraction and skin colour detection using data mining
Widiyanto et al. Implementation the convolutional neural network method for classification the draw-A-person test

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant