CN113298176B - Heterogeneous model adaptive collaboration method

Info

Publication number
CN113298176B
CN113298176B (application CN202110650567.3A)
Authority
CN
China
Prior art keywords
heterogeneous
model
mapping relation
heterogeneous model
establishing
Prior art date
Legal status
Active
Application number
CN202110650567.3A
Other languages
Chinese (zh)
Other versions
CN113298176A
Inventor
Zhang Lan (张兰)
Li Xiangyang (李向阳)
Yuan Mu (袁牧)
Current Assignee
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date: 2021-06-10
Filing date: 2021-06-10
Publication date: 2023-04-25
Application filed by University of Science and Technology of China (USTC)
Priority to CN202110650567.3A
Publication of CN113298176A (2021-08-24)
Application granted
Publication of CN113298176B (2023-04-25)
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415: Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F 18/25: Fusion techniques
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The disclosure provides a heterogeneous model adaptive collaboration method, comprising: acquiring training results of heterogeneous models, and establishing a multi-source mapping relation group and a multi-domain mapping relation group according to the training results; and calculating execution probabilities and a model schedule for the multi-source mapping relation group and the multi-domain mapping relation group according to preset indexes. The method enables heterogeneous models that originally ran in isolation to cooperate, supports fusing knowledge from multiple tasks and multiple domains, and improves the effectiveness and generalization of the collaboration. Based on the collaborative scheduling strategy, the accuracy of the models' output labels can be improved under limited resources, and the range of the output labels is enlarged.

Description

Heterogeneous model adaptive collaboration method
Technical Field
The present disclosure relates to the field of deep learning, and in particular to a heterogeneous model adaptive collaboration method.
Background
The analysis tasks of increasingly complex intelligent applications are often composed of multiple heterogeneous models. Urban video analysis, for example, must detect parked vehicles and the like from single frames, use timing information to analyze dynamic events in the video (e.g., vehicle tracking across cameras), and sometimes process audio data (e.g., noise monitoring). Such tasks are performed by multiple heterogeneous machine learning models, which incur great computational overhead and high data latency. Existing methods focus on optimizing a single model: by compressing the model, reusing its computation, filtering its input data, and so on, they can accelerate inference while keeping analysis accuracy essentially unchanged.
However, such methods have the following problems. Single-model optimization has low reusability: the knowledge required by an optimization method targeting one model is highly specific, and the trial-and-error cost is hard to estimate, so the optimization process is difficult to reuse; when a new model is to be deployed, a large amount of manpower and material resources are still needed for targeted optimization.
Furthermore, existing model optimization methods all assume white-box knowledge of the model, i.e., optimization is performed before deployment. For already deployed heterogeneous models, obtaining white-box information for optimization is infeasible, which makes existing methods hard to extend.
Disclosure of Invention
(I) Technical problem to be solved
The present disclosure proposes a heterogeneous model adaptive collaboration method to at least alleviate the above problems in the prior art.
(II) Technical scheme
To achieve the above object, the present disclosure provides a heterogeneous model adaptive collaboration method, comprising: acquiring training results of heterogeneous models, and establishing a multi-source mapping relation group and a multi-domain mapping relation group according to the training results;
and calculating execution probabilities and a model schedule for the multi-source mapping relation group and the multi-domain mapping relation group according to preset indexes.
In some embodiments of the present disclosure, the obtaining training results of the heterogeneous model includes:
collecting N kinds of model data and establishing the heterogeneous models corresponding to the N kinds of model data, wherein N is an integer greater than or equal to 2;
taking the Nth heterogeneous model as the original heterogeneous model and the heterogeneous models other than the Nth as target heterogeneous models, and establishing mapping relations from the original heterogeneous model to the target heterogeneous models;
setting real labels for the heterogeneous models, and training the mapping relations under the guidance of the real labels and the output data of the heterogeneous models to obtain the training results.
In some embodiments of the present disclosure, the set of output data of a heterogeneous model forms an output space comprising fixed-length vectors and variable-length sequences, where the output space of the original heterogeneous model is the original output space and the output space of the target heterogeneous model is the target output space.
Establishing the mapping relations from the original heterogeneous model to the target heterogeneous model comprises:
establishing a first mapping relation from the fixed-length vector in the original output space to the fixed-length vector in the target output space;
establishing a second mapping relation from the fixed-length vector in the original output space to the variable-length sequence in the target output space;
establishing a third mapping relation from the variable-length sequence in the original output space to the fixed-length vector in the target output space;
and establishing a fourth mapping relation from the variable-length sequence in the original output space to the variable-length sequence in the target output space.
In some embodiments of the present disclosure, training the mapping relations through the real labels and the output data of the heterogeneous models to obtain the training results comprises:
establishing a first data tuple from the output data of the original heterogeneous model and the real label of the target heterogeneous model, and training the mapping relation through the first data tuple; or
establishing a second data tuple from the output data of the original heterogeneous model and the output data of the target heterogeneous model, and training the mapping relation through the second data tuple.
In some embodiments of the disclosure, establishing the multi-source mapping relation group and the multi-domain mapping relation group according to the training results comprises:
fusing the training results of each target heterogeneous model to form a multi-source mapping relation, where the number of multi-source mapping relations is N and the N multi-source mapping relations form the multi-source mapping relation group;
and performing gradient aggregation on the training results of each original heterogeneous model to form a multi-domain mapping relation, where the number of multi-domain mapping relations is N and the N multi-domain mapping relations form the multi-domain mapping relation group.
In some embodiments of the present disclosure, the preset indexes include:
a first index, characterizing the mapping relations in which a heterogeneous model acts as the source heterogeneous model, calculated by

$$P_i^1 = \frac{1}{|M| - 1} \sum_{j \in M,\, j \neq i} P_{ij}$$

where $P_i^1$ denotes the average accuracy with which the ith heterogeneous model predicts the heterogeneous models other than itself, $P_{ij}$ denotes the accuracy with which the ith heterogeneous model predicts the jth heterogeneous model, and $M$ denotes the set of heterogeneous models;
the heterogeneous model is used as the representation of the mapping relation of the target heterogeneous model, and the second index is calculated by the following formula:
Figure BDA0003109766060000032
wherein P is i 2 Expressed as the average accuracy of the prediction of the ith heterogeneous model by heterogeneous models other than the ith heterogeneous model, P ji Expressed as the accuracy with which the ith heterogeneous model is predicted by the jth heterogeneous model, M is expressed as a set of heterogeneous models;
and a third index, wherein the resource expense of the heterogeneous model is executed, the resource expense comprises time, occupied memory or occupied video memory required by running the heterogeneous model, and the third index is calculated by the following formula:
Figure BDA0003109766060000041
wherein P is i 3 Expressed as probability of selecting the ith heterogeneous model, c i Resource overhead, denoted as the ith heterogeneous model, w is represented by the normalized conditional formula
Figure BDA0003109766060000042
And (5) calculating to obtain the product.
In some embodiments of the present disclosure, establishing the first mapping relation from the fixed-length vector in the original output space to the fixed-length vector in the target output space comprises: establishing a fixed-length-vector-to-fixed-length-vector mapping through a neural network model, and fitting the first mapping relation through a gradient descent method.
In some embodiments of the disclosure, establishing the second mapping relation from the fixed-length vector in the original output space to the variable-length sequence in the target output space comprises: establishing a fixed-length-vector-to-variable-length-sequence mapping through the neural network model, and fitting the second mapping relation through the gradient descent method.
In some embodiments of the disclosure, establishing the third mapping relation from the variable-length sequence in the original output space to the fixed-length vector in the target output space comprises: establishing a variable-length-sequence-to-fixed-length-vector mapping through the neural network model, and fitting the third mapping relation through the gradient descent method.
In some embodiments of the disclosure, establishing the fourth mapping relation from the variable-length sequence in the original output space to the variable-length sequence in the target output space comprises: establishing a variable-length-sequence-to-variable-length-sequence mapping through the neural network model, and fitting the fourth mapping relation through the gradient descent method.
(III) Beneficial effects
From the above technical solutions, it can be seen that the heterogeneous model adaptive collaboration method of the present disclosure has at least some of the following advantages:
the method enables heterogeneous models that originally ran in isolation to cooperate, supports fusing knowledge from multiple tasks and multiple domains, and improves the effectiveness and generalization of the collaboration. The scheduling strategy based on heterogeneous model collaboration can improve the accuracy of the models' output labels under limited resources and enlarge the range of the output labels.
Drawings
FIG. 1 is a flow chart of a heterogeneous model adaptive collaboration method in an embodiment of the present disclosure;
FIG. 2 is a flow chart of heterogeneous model training steps in an embodiment of the present disclosure;
FIG. 3 is a training schematic diagram of the mapping relationship of the target heterogeneous model i in an embodiment of the disclosure;
FIG. 4 is a schematic diagram of a process for establishing a set of multi-source mappings in an embodiment of the disclosure;
FIG. 5 is a training schematic diagram of the mapping relationship of the original heterogeneous model i in an embodiment of the present disclosure;
fig. 6 is a schematic diagram of a process for establishing a multi-domain mapping relation set in an embodiment of the disclosure.
Detailed Description
The present disclosure provides a heterogeneous model adaptive collaboration method, comprising: acquiring training results of heterogeneous models, and establishing a multi-source mapping relation group and a multi-domain mapping relation group according to the training results; and calculating execution probabilities and a model schedule for the multi-source mapping relation group and the multi-domain mapping relation group according to preset indexes. The method enables heterogeneous models that originally ran in isolation to cooperate, supports fusing knowledge from multiple tasks and multiple domains, and improves the effectiveness and generalization of the collaboration. The scheduling strategy based on heterogeneous model collaboration can improve the accuracy of the models' output labels under limited resources and enlarge the range of the output labels.
For the purposes of promoting an understanding of the principles and advantages of the disclosure, reference will now be made to the embodiments illustrated in the drawings, and specific language will be used to describe the same. This disclosure may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey its scope to those skilled in the art. In the drawings, the sizes of layers and regions, and their relative sizes, may be exaggerated for clarity; like reference numerals denote like elements throughout.
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where an expression like "at least one of A, B and C" is used, it should generally be interpreted in accordance with its commonly understood meaning (e.g., "a system having at least one of A, B and C" includes, but is not limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together). Where an expression like "at least one of A, B or C" is used, it should likewise be interpreted in accordance with its commonly understood meaning.
The present disclosure provides a heterogeneous model adaptive collaboration method, as shown in fig. 1, including the following operations:
operation S1: and acquiring a training result of the heterogeneous model, and establishing a multi-source mapping relation group and a multi-domain mapping relation group according to the training result.
Operation S2: and respectively calculating the execution probability and the model scheduling of the multi-source mapping relation group and the multi-domain mapping relation group according to the preset indexes.
In operation S2, the preset indexes include the first index, the second index, and the third index, whose meanings and calculation methods are as follows.
The first index characterizes the mapping relations in which a heterogeneous model acts as the source heterogeneous model, and is calculated by

$$P_i^1 = \frac{1}{|M| - 1} \sum_{j \in M,\, j \neq i} P_{ij}$$

where $P_i^1$ denotes the average accuracy with which the ith heterogeneous model predicts the heterogeneous models other than itself, $P_{ij}$ denotes the accuracy with which the ith heterogeneous model predicts the jth heterogeneous model, and $M$ denotes the set of heterogeneous models.
The second index characterizes the mapping relations in which a heterogeneous model acts as the target heterogeneous model, and is calculated by

$$P_i^2 = \frac{1}{|M| - 1} \sum_{j \in M,\, j \neq i} P_{ji}$$

where $P_i^2$ denotes the average accuracy with which the ith heterogeneous model is predicted by the heterogeneous models other than itself, and $P_{ji}$ denotes the accuracy with which the jth heterogeneous model predicts the ith heterogeneous model.
The third index is the resource overhead of executing a heterogeneous model, where the resource overhead includes the running time, memory, or video memory occupied by the heterogeneous model; it is calculated by

$$P_i^3 = \frac{w}{c_i}$$

where $P_i^3$ denotes the probability of selecting the ith heterogeneous model, $c_i$ denotes the resource overhead of the ith heterogeneous model, and $w$ is obtained from the normalization condition

$$\sum_{i \in M} \frac{w}{c_i} = 1.$$
The execution probability reflects that, when computing resources (e.g., memory, latency budget) are limited, not all heterogeneous models can be executed for data analysis, so different execution probabilities must be allocated to different heterogeneous models. Model scheduling then uses the calculated execution probabilities to obtain the final model scheduling strategy.
Based on the execution probabilities of the respective heterogeneous models, the heterogeneous model with the highest execution probability can be adaptively selected under the resource constraint. Owing to the cooperation among heterogeneous models, labels of high accuracy and wide coverage can be obtained under limited computing resources, which suits scenarios such as large-scale cloud data analysis, real-time analysis of Internet of Things edge data, and low-power data analysis on terminal devices.
The heterogeneous models in the method are different artificial intelligence models; they may differ, for example, in analysis task or in input data modality (e.g., pictures, video, audio, text).
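As an illustrative sketch only (not part of the patent text), the three indexes and a resulting schedule could be computed as follows; the pairwise accuracy matrix P and the cost vector c are assumed inputs, the inverse-cost form of the third index follows the reconstruction above, and the multiplicative fusion of the three indexes is an assumption, since the disclosure does not fix a combination rule.

```python
import numpy as np

def scheduling_probabilities(P: np.ndarray, c: np.ndarray) -> np.ndarray:
    """Combine the three preset indexes into per-model execution probabilities.

    P[i, j] is the accuracy with which heterogeneous model i predicts the
    output of model j (the mapping i -> j); c[i] is the resource overhead
    (e.g., runtime or memory) of executing model i.
    """
    n = len(c)
    # First index: average accuracy of model i predicting the other models.
    p1 = (P.sum(axis=1) - P.diagonal()) / (n - 1)
    # Second index: average accuracy of the other models predicting model i.
    p2 = (P.sum(axis=0) - P.diagonal()) / (n - 1)
    # Third index: selection probability inversely proportional to cost,
    # normalized so that the probabilities sum to one (assumed form).
    w = 1.0 / np.sum(1.0 / c)
    p3 = w / c
    # Assumed fusion of the three indexes into one execution probability.
    score = p1 * p2 * p3
    return score / score.sum()

# Hypothetical example with three heterogeneous models.
P = np.array([[1.0, 0.8, 0.7],
              [0.6, 1.0, 0.9],
              [0.5, 0.7, 1.0]])
c = np.array([1.0, 2.0, 4.0])
print(scheduling_probabilities(P, c))  # highest entry -> schedule first
```

Under a hard resource budget, one would then execute models in decreasing order of execution probability until the budget is exhausted.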
As shown in fig. 2, acquiring training results of the heterogeneous model includes the following operations:
operation S3: and collecting N model data, and establishing a heterogeneous model corresponding to the N model data, wherein N is an integer greater than or equal to 2.
Operation S4: taking the N-th heterogeneous model as an original heterogeneous model, taking heterogeneous models except the N-th heterogeneous model as target heterogeneous models, and establishing a mapping relation from the original heterogeneous model to the target heterogeneous model.
Operation S5: setting a real label for the heterogeneous model, and training by guiding the mapping relation through the output data of the real label and the heterogeneous model to obtain a training result.
The collection of output data of the heterogeneous model forms an output space comprising: and (3) fixing the length vector and the variable length sequence, wherein the output space of the original heterogeneous model is the original output space, and the output space of the target heterogeneous model is the target output space.
In operation S4, the mapping relationship between the output spaces is directional, i.e., mapped from the far output space to the target output space. The establishing a mapping relation between the original heterogeneous model and the target heterogeneous model specifically comprises the following steps:
operation S41: and establishing a first mapping relation from the fixed-length vector in the original output space to the fixed-length vector in the target output space. And establishing a mapping relation from the fixed-length vector to the fixed-length vector through a neural network model, and establishing a first mapping relation through a gradient descent method.
Operation S42: establishing a second mapping relation from the fixed-length vector in the original output space to the variable-length sequence in the target output space; and establishing a mapping relation from the fixed-length vector to the variable-length sequence through a neural network model, and establishing a second mapping relation through a gradient descent method.
Operation S43: establishing a third mapping relation from the variable-length sequence in the original output space to the fixed-length vector in the target output space; and establishing a mapping relation from the variable-length sequence to the fixed-length vector through a neural network model, and establishing a third mapping relation through a gradient descent method.
Operation S44: and establishing a fourth mapping relation from the variable length sequence in the original output space to the variable length sequence in the target output space. And establishing a mapping relation from the variable length sequence to the variable length sequence through a neural network model, and establishing the fourth mapping relation through a gradient descent method.
The calculation process of the gradient descent method is to solve the neural network parameters with minimized loss along the gradient descent direction of the data.
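As a minimal sketch under assumed design choices (the disclosure does not specify the network architecture or loss), the fixed-length-vector case of operation S41 could look as follows in PyTorch; the sequence-valued cases of operations S42 to S44 would substitute sequence encoders/decoders for the MLP.

```python
import torch
import torch.nn as nn

class VectorMapping(nn.Module):
    """Maps a fixed-length output vector of the original (source) model
    into the fixed-length output space of the target model (operation S41)."""
    def __init__(self, src_dim: int, dst_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(src_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, dst_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def train_mapping(mapping: nn.Module,
                  src_outputs: torch.Tensor,
                  targets: torch.Tensor,
                  epochs: int = 100,
                  lr: float = 1e-3) -> nn.Module:
    """Fit the mapping by plain gradient descent. `targets` may be real
    labels (first data tuple) or the target model's outputs (second data
    tuple); MSE is an assumed loss."""
    opt = torch.optim.SGD(mapping.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(mapping(src_outputs), targets)
        loss.backward()
        opt.step()
    return mapping
```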
In operation S5, training the mapping relations under the guidance of the real labels and the output data of the heterogeneous models specifically comprises:
Operation S51: establish a first data tuple from the output data of the original heterogeneous model and the real label of the target heterogeneous model, and train the mapping relation through the first data tuple; or
Operation S52: establish a second data tuple from the output data of the original heterogeneous model and the output data of the target heterogeneous model, and train the mapping relation through the second data tuple.
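For illustration only, the two kinds of training tuples could be assembled as below; `src_model`, `dst_model`, and the calling convention are hypothetical.

```python
def build_tuples(src_model, dst_model, inputs, real_labels=None):
    """Build training tuples for the mapping from src_model to dst_model.

    First data tuple:  (source output, real label of the target task).
    Second data tuple: (source output, target model's own output).
    """
    src_out = [src_model(x) for x in inputs]
    if real_labels is not None:
        return list(zip(src_out, real_labels))   # first data tuple (S51)
    dst_out = [dst_model(x) for x in inputs]
    return list(zip(src_out, dst_out))           # second data tuple (S52)
```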
In operation S1, establishing the multi-source mapping relation group and the multi-domain mapping relation group according to the training results comprises:
Operation S11: fuse the training results of each target heterogeneous model to form a multi-source mapping relation; the number of multi-source mapping relations is N, and the N multi-source mapping relations form the multi-source mapping relation group.
As shown in fig. 3, taking the ith target heterogeneous model as an example:
First, mapping relations from original heterogeneous model 1, original heterogeneous model 2, ..., original heterogeneous model j to the target heterogeneous model i are established.
Second, these mapping relations from all original heterogeneous models to the target heterogeneous model i are trained to obtain the training result corresponding to the target heterogeneous model i.
Similarly, following the steps for obtaining the training result of target heterogeneous model i, training result 1 of target heterogeneous model 1, training result 2 of target heterogeneous model 2, ..., and training result N of target heterogeneous model N are obtained.
As shown in fig. 4, the training results of all target heterogeneous models are fused to obtain multi-source mapping relation 1, multi-source mapping relation 2, ..., multi-source mapping relation N. The N multi-source mapping relations form the multi-source mapping relation group.
The fused multi-source mapping relation group can cooperatively use information from multiple different original heterogeneous models, and comprehensively using these different kinds of information improves the accuracy of the mapping relations.
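A minimal sketch of the fusion step follows; averaging the mapped predictions of all source models is an assumed fusion operator, since the disclosure does not prescribe one.

```python
import torch

def fuse_multi_source(mappings, src_outputs):
    """Fuse the predictions of all source->target mappings for one target.

    mappings[k] maps source model k's output into the target output space;
    src_outputs[k] is source model k's output for the same input sample.
    """
    mapped = [m(o) for m, o in zip(mappings, src_outputs)]
    return torch.stack(mapped).mean(dim=0)  # assumed: simple averaging
```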
Operation S12: perform gradient aggregation on the training results of each original heterogeneous model to form a multi-domain mapping relation; the number of multi-domain mapping relations is N, and the N multi-domain mapping relations form the multi-domain mapping relation group.
As shown in fig. 5, taking the ith original heterogeneous model as an example:
First, mapping relations from the original heterogeneous model i to target heterogeneous model 1, target heterogeneous model 2, ..., target heterogeneous model j are established.
Second, these mapping relations from the original heterogeneous model i to all target heterogeneous models are trained to obtain the training result corresponding to the original heterogeneous model i.
Similarly, following the steps for obtaining the training result of original heterogeneous model i, training result 1 of original heterogeneous model 1, training result 2 of original heterogeneous model 2, ..., and training result N of original heterogeneous model N are obtained.
As shown in fig. 6, the training results of all original heterogeneous models are gradient-aggregated to obtain multi-domain mapping relation 1, multi-domain mapping relation 2, ..., multi-domain mapping relation N. The N multi-domain mapping relations form the multi-domain mapping relation group.
Gradient aggregation over heterogeneous models deployed in different domains (such as classrooms, hospitals, and factories) yields the multi-domain mapping relation group, which benefits the generalization of the mapping relations.
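A sketch of the gradient aggregation step, in the spirit of federated averaging; equal weighting of the per-domain gradients is an assumption.

```python
import torch

def aggregate_gradients(domain_grads):
    """Average per-domain gradients for one shared mapping model.

    domain_grads: a list over domains; each element is a list of gradient
    tensors, one per parameter of the mapping model.
    """
    return [torch.stack(gs).mean(dim=0) for gs in zip(*domain_grads)]

def apply_aggregated_step(mapping, domain_grads, lr=1e-3):
    """Apply one gradient-descent step using the aggregated gradients."""
    with torch.no_grad():
        for p, g in zip(mapping.parameters(), aggregate_gradients(domain_grads)):
            p -= lr * g
```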
It should be further noted that directional terms mentioned in the embodiments, such as "upper", "lower", "front", "rear", "left", and "right", refer only to the directions in the drawings and are not intended to limit the scope of the present disclosure. Like elements are denoted by like or similar reference numerals throughout the drawings. Where they would not aid understanding of the present disclosure, descriptions of conventional structures are omitted, and the shapes and dimensions of parts in the drawings do not reflect actual sizes and proportions but merely illustrate the contents of the embodiments.
Unless otherwise defined, numerical parameters in this specification and the appended claims are approximations that may vary depending upon the desired properties sought to be obtained by the present disclosure. In particular, all numbers expressing quantities of ingredients, reaction conditions, and so forth used in the specification and claims are to be understood as being modified in all instances by the term "about". In general, such an expression means a variation of ±10% in some embodiments, ±5% in some embodiments, ±1% in some embodiments, and ±0.5% in some embodiments.
The use of ordinal numbers such as "first", "second", and "third" in the description and claims to modify corresponding elements does not by itself imply any order among the elements or any order of manufacture or use; such ordinals merely distinguish one element from another element having the same name.
Furthermore, unless specifically described or steps must occur in sequence, the order of the above steps is not limited to the list above and may be changed or rearranged according to the desired design. In addition, the above embodiments may be mixed with each other or other embodiments based on design and reliability, i.e. the technical features of the different embodiments may be freely combined to form more embodiments.
The foregoing specific embodiments further describe the objects, technical solutions, and beneficial effects of the present disclosure in detail. It should be understood that the foregoing is merely exemplary of the present disclosure and is not intended to limit its scope; any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within its scope of protection.

Claims (7)

1. A heterogeneous model adaptive collaboration method, comprising:
collecting N kinds of model data and establishing heterogeneous models corresponding to the N kinds of model data, wherein N is an integer greater than or equal to 2 and the N kinds of model data comprise at least two of picture data, video data, and audio text data;
taking the Nth heterogeneous model as a target heterogeneous model and the heterogeneous models other than the Nth as original heterogeneous models, and establishing mapping relations from the original heterogeneous models to the target heterogeneous model to obtain training results of the target heterogeneous model;
taking the Nth heterogeneous model as an original heterogeneous model and the heterogeneous models other than the Nth as target heterogeneous models, and establishing mapping relations from the original heterogeneous model to the target heterogeneous models to obtain training results of the original heterogeneous model;
setting real labels for the heterogeneous models, training the mapping relations under the guidance of the real labels and the output data of the heterogeneous models to obtain training results, and establishing a multi-source mapping relation group and a multi-domain mapping relation group according to the training results;
calculating execution probabilities and a model schedule for the multi-source mapping relation group and the multi-domain mapping relation group according to preset indexes;
wherein the execution probabilities characterize that different execution probabilities are allocated to different heterogeneous models under limited computing resources, and the model scheduling characterizes that a model scheduling strategy is obtained from the respective execution probabilities of the heterogeneous models;
wherein the preset indexes comprise:
a third index, the resource overhead of executing a heterogeneous model, where the resource overhead includes the time, memory, or video memory required to run the heterogeneous model, calculated by

$$P_i^3 = \frac{w}{c_i}$$

where $P_i^3$ denotes the probability of selecting the ith heterogeneous model, $c_i$ denotes the resource overhead of the ith heterogeneous model, and $w$ is obtained from the normalization condition

$$\sum_{i \in M} \frac{w}{c_i} = 1;$$
wherein establishing the multi-source mapping relation group and the multi-domain mapping relation group according to the training results comprises the following steps:
fusing the training results of each target heterogeneous model to form a multi-source mapping relation, wherein the number of multi-source mapping relations is N and the N multi-source mapping relations form the multi-source mapping relation group;
performing gradient aggregation on the training results of each original heterogeneous model to form a multi-domain mapping relation, wherein the number of multi-domain mapping relations is N and the N multi-domain mapping relations form the multi-domain mapping relation group;
and wherein the preset indexes further comprise:
a first index, characterizing the mapping relations in which a heterogeneous model acts as the source heterogeneous model, calculated by

$$P_i^1 = \frac{1}{|M| - 1} \sum_{j \in M,\, j \neq i} P_{ij}$$

where $P_i^1$ denotes the average accuracy with which the ith heterogeneous model predicts the heterogeneous models other than itself, $P_{ij}$ denotes the accuracy with which the ith heterogeneous model predicts the jth heterogeneous model, and $M$ denotes the set of heterogeneous models;
and a second index, characterizing the mapping relations in which a heterogeneous model acts as the target heterogeneous model, calculated by

$$P_i^2 = \frac{1}{|M| - 1} \sum_{j \in M,\, j \neq i} P_{ji}$$

where $P_i^2$ denotes the average accuracy with which the ith heterogeneous model is predicted by the heterogeneous models other than itself, and $P_{ji}$ denotes the accuracy with which the jth heterogeneous model predicts the ith heterogeneous model.
2. The heterogeneous model adaptive collaboration method of claim 1, wherein the set of output data of a heterogeneous model forms an output space comprising fixed-length vectors and variable-length sequences, the output space of the original heterogeneous model being the original output space and the output space of the target heterogeneous model being the target output space,
and wherein establishing the mapping relations from the original heterogeneous model to the target heterogeneous model comprises:
establishing a first mapping relation from the fixed-length vector in the original output space to the fixed-length vector in the target output space;
establishing a second mapping relation from the fixed-length vector in the original output space to the variable-length sequence in the target output space;
establishing a third mapping relation from the variable-length sequence in the original output space to the fixed-length vector in the target output space;
and establishing a fourth mapping relation from the variable-length sequence in the original output space to the variable-length sequence in the target output space.
3. The heterogeneous model adaptive collaboration method of claim 1, wherein training the mapping relations through the real labels and the output data of the heterogeneous models to obtain the training results comprises:
establishing a first data tuple from the output data of the original heterogeneous model and the real label of the target heterogeneous model, and training the mapping relation through the first data tuple; or
establishing a second data tuple from the output data of the original heterogeneous model and the output data of the target heterogeneous model, and training the mapping relation through the second data tuple.
4. The heterogeneous model adaptive collaboration method of claim 2, wherein establishing the first mapping relation from the fixed-length vector in the original output space to the fixed-length vector in the target output space comprises:
establishing a fixed-length-vector-to-fixed-length-vector mapping through a neural network model, and fitting the first mapping relation through a gradient descent method.
5. The heterogeneous model adaptive collaboration method of claim 2, wherein establishing the second mapping relation from the fixed-length vector in the original output space to the variable-length sequence in the target output space comprises:
establishing a fixed-length-vector-to-variable-length-sequence mapping through a neural network model, and fitting the second mapping relation through a gradient descent method.
6. The heterogeneous model adaptive collaboration method of claim 2, wherein establishing the third mapping relation from the variable-length sequence in the original output space to the fixed-length vector in the target output space comprises:
establishing a variable-length-sequence-to-fixed-length-vector mapping through a neural network model, and fitting the third mapping relation through a gradient descent method.
7. The heterogeneous model adaptive collaboration method of claim 2, wherein establishing the fourth mapping relation from the variable-length sequence in the original output space to the variable-length sequence in the target output space comprises:
establishing a variable-length-sequence-to-variable-length-sequence mapping through a neural network model, and fitting the fourth mapping relation through a gradient descent method.
CN202110650567.3A 2021-06-10 2021-06-10 Heterogeneous model adaptive collaboration method Active CN113298176B

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110650567.3A CN113298176B (en) Heterogeneous model adaptive collaboration method


Publications (2)

Publication Number Publication Date
CN113298176A CN113298176A (en) 2021-08-24
CN113298176B true CN113298176B (en) 2023-04-25

Family

ID=77328038


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110533183A (en) * 2019-08-30 2019-12-03 东南大学 The model partition and task laying method of heterogeneous network perception in a kind of assembly line distribution deep learning
CN111930524A (en) * 2020-10-10 2020-11-13 上海兴容信息技术有限公司 Method and system for distributing computing resources
CN112433819A (en) * 2020-11-30 2021-03-02 中国科学院深圳先进技术研究院 Heterogeneous cluster scheduling simulation method and device, computer equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190378016A1 (en) * 2018-06-07 2019-12-12 International Business Machines Corporation Distributed computing architecture for large model deep learning


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Shen Z. et al., "MEAL: Multi-Model Ensemble via Adversarial Learning," in Proceedings of the AAAI Conference on Artificial Intelligence, 2019, pp. 4886-4893. *
Wang Chao, Wang Teng, Ma Xiang, Zhou Xuehai, "Research Progress on FPGA-Based Machine Learning Hardware Acceleration," Chinese Journal of Computers, 2020, no. 6, pp. 191-212. *

Also Published As

Publication number Publication date
CN113298176A (en) 2021-08-24


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information
Inventors after change: Li Xiangyang; Yuan Mu; Zhang Lan
Inventors before change: Zhang Lan; Li Xiangyang; Yuan Mu