CN117274732B - Method and system for constructing optimized diffusion model based on scene memory drive - Google Patents


Info

Publication number: CN117274732B (grant); other version: CN117274732A (application)
Application number: CN202311208862.9A
Authority: CN (China)
Other languages: Chinese (zh)
Inventors: 张磊, 张志宇, 甄先通, 左利云, 李欣, 王宝艳
Current and original assignee: Guangdong University of Petrochemical Technology
Application filed by Guangdong University of Petrochemical Technology; priority to CN202311208862.9A
Legal status: Active (granted)

Classifications

    • G06V 10/774 — Image or video recognition or understanding using pattern recognition or machine learning; generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N 3/0455 — Auto-encoder networks; encoder-decoder networks
    • G06N 3/0985 — Hyperparameter optimisation; meta-learning; learning-to-learn
    • G06V 10/82 — Image or video recognition or understanding using neural networks
    • G06V 10/94 — Hardware or software architectures specially adapted for image or video understanding
    • Y02T 10/40 — Engine management systems (climate-change mitigation cross-reference tag)


Abstract

The invention relates to the field of image processing, and in particular to a method and a system for constructing an optimized diffusion model driven by scene memory, comprising the following steps: constructing a plurality of tasks, each comprising a plurality of image samples and the manual annotations corresponding to those samples; constructing one part of the image samples and their manual annotations as a support set and the other part as a query set; constructing a scene memory store and optimizing it with the support and query sets of the plurality of tasks; constructing a scene-memory-driven diffusion model; optimizing the scene-memory-driven diffusion model with the support and query sets of the plurality of tasks; and constructing a new task comprising a plurality of images to be predicted, which are then predicted with the optimized scene-memory-driven diffusion model. The scene-memory-driven diffusion model addresses the problem of parameter noise during prediction and improves the accuracy of the prediction results.

Description

Method and system for constructing optimized diffusion model based on scene memory drive
Technical Field
The invention relates to the field of image processing, and in particular to a method and a system for constructing an optimized diffusion model driven by scene memory.
Background
Meta-learning is an approach to few-shot classification: it learns transferable characteristics of a classifier across many tasks and then uses the support-set data of a specific task to improve classification performance in the few-shot regime. Concretely, assume a task distribution p(T) from which tasks can be sampled repeatedly; each sampled task T_t comprises a support set D^S and a query set D^Q. For each task the training data consists only of the support set, i.e. data from C categories with K samples per category. Because of overfitting, when only K samples per class are available, a classifier trained with conventional methods is unlikely to achieve good performance on the test set. The basic idea of meta-learning is to compensate for the shortage of samples with a large number of tasks. This mirrors human learning habits: having learned to ride a bicycle, one finds it easy to learn to ride a motorcycle, because general knowledge shared between the tasks has been acquired; when a new task is encountered, it can be adapted to quickly without many samples.
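The C-way K-shot episode construction described above can be sketched as follows; the function name `make_episode` and the dict-of-lists dataset layout are illustrative assumptions, not the patent's implementation.

```python
import random

def make_episode(dataset, n_way=5, k_shot=1, n_query=15, seed=0):
    """Sample one C-way K-shot task: a support set of k_shot samples per
    class and a query set of n_query samples per class (layout illustrative)."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(dataset), n_way)          # pick C classes
    support, query = [], []
    for label in classes:
        picks = rng.sample(dataset[label], k_shot + n_query)
        support += [(x, label) for x in picks[:k_shot]]   # K per class
        query += [(x, label) for x in picks[k_shot:]]     # n_query per class
    return support, query
```

Sampling many such episodes from the same pool of classes is what plays the role of the task distribution p(T) here.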
In conventional parameter prediction, model parameters learned from noisily labeled data are themselves noisy, i.e. affected by nonstandard or inaccurate annotation, yet current approaches rarely, if ever, account for this noise. The denoising capability of a diffusion model can help recover clean parameters and predict accurately. Optimization-based meta-learning is gradually becoming an effective way to cope with the few-shot learning dilemma; however, with limited samples, gradient estimates become noisy, often causing the meta-model either to diverge or to sink into a suboptimal minimum of each few-shot task.
Disclosure of Invention
The invention aims to overcome at least one deficiency of the prior art by providing a method and a system for constructing an optimized diffusion model driven by scene memory, which use a scene memory store to solve the parameter-noise problem of the diffusion model and improve the accuracy of the prediction results.
The technical scheme adopted by the invention is as follows:
In a first aspect, a method for constructing an optimized diffusion model based on a scene memory drive is provided, including:
constructing a plurality of tasks, wherein each task comprises a plurality of image samples and manual labels corresponding to the image samples;
constructing a part of the image samples and the manual labels corresponding to the image samples as a support set, and constructing the other part as a query set;
Constructing a scene memory storage, and optimizing the scene memory storage by using a support set and a query set of a plurality of tasks;
Constructing a diffusion model based on scene memory driving according to the scene memory;
optimizing a diffusion model based on a scene memory driver by using a support set and a query set of a plurality of tasks;
and constructing a new task, wherein the new task comprises a plurality of images to be predicted, and predicting the images to be predicted by using the optimized diffusion model based on the scene memory driving.
With optimization-based meta-learning under limited samples, gradient estimates become noisy, often causing the meta-model either to diverge or to sink into a suboptimal minimum of each few-shot task. To mitigate this, the invention constructs a scene-memory-driven diffusion model for meta-learning, avoiding task-specific local-minimum traps by updating the base-model parameters incrementally. First, a scene memory store is built from neural-network checkpoints and used as a prompt to guide updates of the base model on new tasks. Second, a conditional diffusion model serves as the optimizer: conditioned on the parameters to be updated, guided by the scene memory, and combined with the diffusion time step, it predicts the optimal parameter update required by the expected base model. As a result, the base model can be optimized with unfamiliar parameters after only one or a few updates at meta-test time, effectively resolving the parameter-noise problem of the diffusion model and improving prediction accuracy. The method is also universal and flexible, and can be integrated smoothly with existing optimization-based meta-learning models.
Further, constructing the scene memory store and optimizing it with the support and query sets of a plurality of tasks specifically comprises:
constructing a scene memory store for storing task keywords and value fields;
performing inner-loop optimization on the initial parameters of the scene memory store with the support sets of the plurality of tasks to obtain the support-set-optimized parameters;
after the inner-loop update finishes, performing outer-loop updates on the support-set-optimized parameters of the scene memory store with the query sets of the plurality of tasks to obtain the query-set-optimized parameters;
and optimizing the scene memory store according to the support-set-optimized and query-set-optimized parameters.
The invention alternates inner and outer loops over the support and query sets, repeatedly optimizing the parameters and thereby the scene memory store, so that the parameters used while constructing the diffusion model are the current optimum and the noise in the model parameters is reduced.
Further, constructing a scene memory store for storing task keywords and value fields specifically comprises:
constructing the scene memory store M = {M_1, M_2, …, M_N},
where N is the number of storage cells in the scene memory store and is no greater than its capacity; each cell M_i stores a key-value pair, i.e. a keyword and a value field, denoted M_i = [K_i, V_i];
where V_i is the value field of the i-th task and K_i is the keyword of the i-th task, specifically expressed as:
K_i = Transformer([cls, e_1, …, e_n])[0];
where n is the total number of image samples in the support and query sets of the i-th task, e_j = g_φ(x_j) is the feature representation of image sample x_j after it passes through encoder g_φ, x_j is the j-th image sample in the support set of the i-th task, and cls is the class-placeholder token of the i-th task.
The invention stores the keywords and value fields of different tasks in the scene memory store, the value fields being key checkpoints of the different tasks. While optimizing the scene memory store, once the number of tasks grows beyond the store's maximum capacity, the cell that entered the store first is replaced on a first-in, first-out basis, and the M_i = [K_i, V_i] obtained on a new task is added to the store.
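The first-in, first-out replacement policy described above can be sketched with a bounded deque; the class and method names are illustrative assumptions, not the patent's implementation.

```python
from collections import deque

class SceneMemory:
    """FIFO key-value store M = {M_i = [K_i, V_i]} with capacity N:
    once full, the cell that entered first is evicted on each write."""
    def __init__(self, capacity):
        self.cells = deque(maxlen=capacity)  # oldest cell dropped automatically

    def write(self, key, value):
        self.cells.append((key, value))

    def keys(self):
        return [k for k, _ in self.cells]
```

A `deque` with `maxlen` gives the eviction behavior for free: appending to a full deque silently discards the oldest element.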
Further, performing inner-loop optimization on the initial parameters of the scene memory store with the support sets of the plurality of tasks to obtain the support-set-optimized parameters specifically comprises:
using the image samples x_j in the support set of the i-th task T_i and their corresponding manual annotations y_j, the original parameters φ are optimized as:
φ_i = φ − α ∇_φ (1/S) Σ_{j=1}^{S} L(f_φ(x_j), y_j),
where α is the learning rate of the inner-loop optimization, S is the number of image samples in the support set, f_φ(x_j) is the predicted annotation of image sample x_j under the original parameters φ (obtained via the encoder and predictor), L(·,·) is the loss between the prediction and the annotation, and φ_i denotes the support-set-optimized parameters;
the parameters φ_i are updated iteratively with all image samples in the support set of task T_i;
across the task set {T_i}, the parameter φ_i corresponding to each task is updated iteratively, with |{T_i}| denoting the number of tasks in the task set;
the support-set-optimized parameters φ_i corresponding to each task are thereby obtained.
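A minimal numeric sketch of the inner-loop step above, using a linear base learner with squared loss in place of the patent's encoder/predictor f_φ (an assumption for illustration only):

```python
import numpy as np

def inner_update(phi, support_x, support_y, alpha=0.1):
    """One inner-loop step: phi_i = phi - alpha * gradient of the mean
    support-set loss, here the squared error of a linear model."""
    preds = support_x @ phi
    grad = (2.0 / len(support_y)) * support_x.T @ (preds - support_y)
    return phi - alpha * grad
```

With a suitably small α, a single step already lowers the support-set loss, which is the property the alternating optimization relies on.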
Further, performing outer-loop updates on the support-set-optimized parameters of the scene memory store with the query sets of the plurality of tasks to obtain the query-set-optimized parameters specifically comprises:
using the image samples x in the query set of the i-th task and their corresponding manual annotations y, the support-set-optimized parameters φ_i are further optimized as:
φ_i ← φ_i − γ ∇_{φ_i} (1/Q) Σ_{(x,y)∈D_i^Q} L(f_{φ_i}(x), y),
where γ is the learning rate of the outer-loop optimization, Q is the number of image samples in the query set, f_{φ_i}(x) is the predicted annotation of image sample x under the support-set-optimized parameters φ_i, and |{T_i}| denotes the number of tasks in the task set;
across the task set {T_i}, the parameter φ_i corresponding to each task is updated iteratively with that task's query set;
the query-set-optimized parameters φ_i corresponding to each task are thereby obtained.
Within the meta-learning framework, tasks are sampled from a task distribution p(T), producing a sequence of tasks. The core idea of meta-learning is to find a generic meta-learner across the training tasks of the meta-training phase; hence for each task T_i the parameters φ can be optimized to the optimum by alternating inner and outer loops. The inner loop optimizes the parameters with support-set sample data, yielding the support-set-optimized parameters φ_i; the outer loop optimizes with query-set sample data, yielding the query-set-optimized parameters φ_i.
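The alternating inner/outer loops can be sketched as a first-order MAML-style update; this is a simplification (the full outer step differentiates through the inner update), and the linear base learner is an assumption for illustration.

```python
import numpy as np

def meta_step(phi, tasks, alpha=0.1, gamma=0.05):
    """One outer-loop step: adapt phi on each task's support set (inner
    loop), then move phi along the mean query-set gradient evaluated at
    the adapted parameters (first-order approximation)."""
    meta_grad = np.zeros_like(phi)
    for sx, sy, qx, qy in tasks:
        phi_i = phi - alpha * (2.0 / len(sy)) * sx.T @ (sx @ phi - sy)  # inner
        meta_grad += (2.0 / len(qy)) * qx.T @ (qx @ phi_i - qy)         # outer
    return phi - gamma * meta_grad / len(tasks)
```

Repeating this step over the task set drives the shared φ toward parameters that adapt well on every task's query set.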
More specifically, optimizing the scene memory store according to the support-set-optimized and query-set-optimized parameters specifically comprises:
storing, as the value field V_i of the i-th task, the query-set-optimized parameters φ_i, the support-set-optimized parameters φ_i, the average loss computed over the query set, and the accuracy U obtained by comparing the predicted annotations under the query-set-optimized parameters φ_i with those under the original parameters φ, where f_{φ_i}(x) denotes the predicted annotation of image sample x using the query-set-optimized parameters φ_i.
The query-set-optimized parameters, the support-set-optimized parameters, the average loss computed over the query set, and the accuracy obtained from the optimized and original predicted annotations are thus constructed as the value field of the i-th task and stored in the scene memory store for subsequent optimization of the diffusion model.
Further, constructing the scene-memory-driven diffusion model according to the scene memory store specifically comprises:
taking the query-set-optimized parameters φ', randomly initialized initial parameters φ, a prompt P, initial information c, and a time step t as the inputs of the scene-memory-driven diffusion model;
the prompt P comprises the support-set-optimized parameters φ_r, the average loss, and the accuracy U stored in the scene memory store; the initial information c takes the average loss and accuracy obtained on the current task as the initial loss and initial accuracy; and the time step t is the number of loop-optimization iterations in the scene memory store;
the output of the scene-memory-driven diffusion model is expressed as:
φ̂ = Θ_θ(φ', φ, P, c, t, ε),
where Θ_θ denotes the diffusion model with parameters θ, implemented with a Transformer; ε denotes the noise as the diffusion process proceeds, sampled randomly from the Gaussian distribution N(0, I); and α̂ denotes the learning rate of the inner-loop optimization.
When noise is present in a signal, a diffusion model can be configured to predict either the signal or the noise. Conventional diffusion models are usually used to predict the noise; the invention instead uses the diffusion model to predict the signal, i.e. the diffusion model is parameterized to output task-specific parameters. The keyword K_i is computed for the current task, the cosine distance to the keys K of all cells in the scene memory store is calculated, and the parameters φ_r stored in the value field V_r of the cell with the smallest distance are selected as the first input φ'. This input is generally noisy, and the proposed model combines the remaining information to denoise φ' and thereby make more accurate predictions.
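The retrieval step above, selecting the stored checkpoint whose key has the smallest cosine distance to the current task key, can be sketched as follows (function and variable names are illustrative):

```python
import numpy as np

def retrieve(keys, values, query_key):
    """Return the value field V_r of the memory cell whose key K_r has the
    smallest cosine distance (largest cosine similarity) to the task key."""
    K = np.asarray(keys, dtype=float)
    q = np.asarray(query_key, dtype=float)
    sims = K @ q / (np.linalg.norm(K, axis=1) * np.linalg.norm(q) + 1e-12)
    return values[int(np.argmax(sims))]
```

Because cosine similarity normalizes both vectors, retrieval depends only on the direction of the task key, not its magnitude.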
Further, optimizing the scene-memory-driven diffusion model with the support and query sets of a plurality of tasks specifically comprises:
optimizing the objective function by constructing the total loss function:
L(θ, φ) = E_{ε,t}[‖φ* − Θ_θ(φ', φ, P, c, t, ε)‖²] − λ·(1/Q) Σ_{q=1}^{Q} log p(ŷ_q | x_q, φ),
where ε denotes the noise as the diffusion process proceeds, sampled randomly from the Gaussian distribution N(0, I); θ denotes the diffusion-model parameters, obtained via a Transformer; α̂ denotes the learning rate of the inner-loop optimization; φ* denotes the target (query-set-optimized) parameters; x_q denotes an image sample and ŷ_q its corresponding predicted annotation; Q is the number of image samples in the query set; E[·] denotes the expectation; λ is an adjustment parameter; and p(ŷ_q | x_q, φ) is the probability of predicting ŷ_q given x_q and the parameters φ;
the parameters θ and φ of the scene-memory-driven diffusion model are optimized with this total loss function.
The diffusion model represented by Θ_θ in the objective function is implemented with a Transformer of parameters θ; the total loss function is then used to optimize the diffusion-model parameters θ and the baseline-network parameters φ of the scene memory drive, where E[·] denotes the expectation and λ is an adjustment parameter whose specific value can be tuned experimentally.
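Under the reading above, the total loss pairs a signal-prediction (denoising) term with a λ-weighted query-set log-likelihood term. A numeric sketch follows; the exact functional form and all names are assumptions for illustration, not the patent's formula.

```python
import numpy as np

def total_loss(phi_target, phi_pred, query_log_probs, lam=0.1):
    """Sketch: squared error between the clean target parameters and the
    diffusion model's predicted parameters, plus a lambda-weighted negative
    mean log-likelihood over the query samples."""
    diffusion_term = np.mean((np.asarray(phi_target) - np.asarray(phi_pred)) ** 2)
    likelihood_term = -np.mean(query_log_probs)  # negative log-likelihood
    return diffusion_term + lam * likelihood_term
```

The λ knob trades off faithfulness to the retrieved clean parameters against fit to the current task's query set.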
Further, constructing a new task comprising a plurality of images to be predicted and predicting them with the optimized scene-memory-driven diffusion model specifically comprises:
constructing a new task T_new comprising a plurality of images to be predicted, divided into a support set and a query set;
acquiring the keyword K corresponding to the new task T_new;
calculating the cosine distance between K and the keywords K_j of all storage cells in the scene memory store M;
selecting the value field V_r corresponding to the keyword K_r with the smallest cosine distance; acquiring the query-set-optimized parameters φ_r, the support-set-optimized parameters φ_r, the average loss, and the accuracy U stored in V_r; and constructing the support-set-optimized parameters φ_r, the average loss, and the accuracy U into a prompt P;
computing the initial loss and initial accuracy with the support set of the new task T_new and generating the initial information c;
randomly generating the initial parameters φ corresponding to the new task T_new;
inputting the acquired query-set-optimized parameters φ_r, the prompt P, the initial information c, the initial parameters φ of the new task, and the time step t into the scene-memory-driven diffusion model to generate the prediction parameters φ̂ corresponding to the new task T_new;
and completing the prediction of the query set of the new task T_new with the prediction parameters φ̂.
The invention predicts the images to be predicted through the scene memory store and the scene-memory-driven diffusion model, which effectively reduces noise in the prediction process; compared with the prior art, the prediction results are more accurate.
In a second aspect, a system for constructing an optimized diffusion model based on a scene memory drive is provided, comprising:
The task construction module is used for constructing a plurality of tasks, and each task comprises a plurality of image samples and manual labels corresponding to the image samples; one part of the image samples and the manual labels corresponding to the image samples are constructed as a support set, and the other part is constructed as a query set;
The scene memory storage construction module is used for constructing a scene memory storage and optimizing the scene memory storage by using a support set and a query set of a plurality of tasks;
the diffusion model construction module is used for constructing a diffusion model based on the scene memory drive according to the scene memory;
the diffusion model optimization module is used for optimizing a diffusion model based on the scene memory driving by using a support set and a query set of a plurality of tasks;
And the prediction module is used for constructing a new task, wherein the new task comprises a plurality of images to be predicted, and the optimized diffusion model based on the scene memory drive is used for predicting the images to be predicted.
Compared with the prior art, the invention has the beneficial effects that:
(1) According to the invention, the support set and the query set are used for carrying out internal circulation and external circulation, and parameters are repeatedly optimized, so that the scene memory storage is optimized, the parameters in the process of constructing the diffusion model are ensured to be the current optimal, and the noise of the model parameters is reduced;
(2) According to the invention, the basic model can be optimized by using unfamiliar parameters only through single or few times of updating in meta-test, so that the noise problem of diffusion model parameters is effectively solved, and the accuracy of a prediction result is improved;
(3) The invention has universality and flexibility, and can be smoothly integrated with the model of the existing meta learning method based on optimization.
Drawings
Fig. 1 is a flow chart of the method of embodiment 1 of the present invention.
Fig. 2 is a schematic diagram of a scene memory according to embodiment 1 of the present invention.
Fig. 3 is a technical frame diagram of a diffusion model optimization process based on a scene memory driving in embodiment 1 of the present invention.
Fig. 4 is a system configuration diagram of embodiment 2 of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the invention. For better illustration of the following embodiments, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the actual product dimensions; it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
Example 1
As shown in fig. 1, the present embodiment provides a method for constructing an optimized diffusion model based on a scene memory drive, including:
s1, constructing a plurality of tasks, wherein each task comprises a plurality of image samples and manual labels corresponding to the image samples;
S2, constructing a part of the image samples and the manual labels corresponding to the image samples as a support set, and constructing the other part as a query set;
S3, constructing a scene memory storage, and optimizing the scene memory storage by using a support set and a query set of a plurality of tasks;
s4, constructing a diffusion model based on the scene memory drive according to the scene memory;
S5, optimizing a diffusion model based on the scene memory drive by using a support set and a query set of a plurality of tasks;
S6, constructing a new task, wherein the new task comprises a plurality of images to be predicted, and predicting the images to be predicted by using an optimized diffusion model based on the scene memory drive.
This embodiment builds a scene-memory-driven diffusion model and avoids task-specific local-minimum traps by updating the base-model parameters incrementally. First, a scene memory store is built from neural-network checkpoints and used as a prompt to guide updates of the base model on new tasks. Second, a conditional diffusion model serves as the optimizer: conditioned on the parameters to be updated, guided by the scene memory, and combined with the diffusion time step, it predicts the optimal parameter update required by the expected base model, so that the base model can be optimized with unfamiliar parameters after only one or a few updates at meta-test time, effectively resolving the parameter-noise problem of the diffusion model and improving the accuracy of the prediction results.
In a specific implementation process, step S1 of this embodiment specifically includes:
assume that there is a task distribution From which the task distribution can be sampled a number of times, for each task sampledThe method comprises a plurality of image samples x i and artificial labels y i corresponding to the image samples, i represents an ith image sample, and t represents a tth task;
The step S2 of this embodiment specifically includes:
the plurality of image samples x_i of task T_t and their corresponding manual annotations y_i are partly constructed into a support set D^S and partly into a query set D^Q, where the support set contains data of C categories with K samples per category, and the query set contains M samples in total.
The step S3 of this embodiment specifically includes:
s301, constructing a scene memory for storing task keywords and value fields;
s302, performing internal circulation optimization on initial parameters of a scene memory by using a support set of a plurality of tasks to obtain parameters after the support set optimization;
S303, after the internal circulation updating is finished, using a query set of a plurality of tasks to carry out external circulation updating on parameters after the scene memory support set is optimized, and obtaining parameters after the query set is optimized;
s304, optimizing the scene memory according to the parameters after the support set optimization and the parameters after the query set optimization.
This embodiment optimizes the scene memory store with meta-learning-based gradient optimization. As shown in fig. 2, x_S denotes the support-set input in few-shot learning, i.e. the support-set image samples of this embodiment; y_S denotes the true support-set output, i.e. the manual annotations corresponding to the support-set image samples; ŷ_S denotes the prediction corresponding to x_S, i.e. the annotation predicted for the support-set image samples by the current encoder and predictor; x_Q denotes the query-set input, i.e. the query-set image samples; y_Q denotes the true query-set output, i.e. the manual annotations corresponding to the query-set image samples; and ŷ_Q denotes the prediction corresponding to x_Q, i.e. the annotation predicted for the query-set image samples by the current encoder and predictor.
The step S301 in this embodiment specifically includes:
constructing the scene memory store M = {M_1, M_2, …, M_N},
where N is the number of storage cells in the scene memory store and is no greater than its capacity; each cell M_i stores a key-value pair, i.e. a keyword and a value field, denoted M_i = [K_i, V_i];
where V_i is the value field of the i-th task and K_i is the keyword of the i-th task, specifically expressed as:
K_i = Transformer([cls, e_1, …, e_n])[0];
where n is the total number of image samples in the support and query sets of the i-th task, e_j = g_φ(x_j) is the feature representation of image sample x_j after it passes through encoder g_φ, x_j is the j-th image sample in the support set of the i-th task, and cls is the class-placeholder token of the i-th task.
As shown in FIG. 2, each cell of the context memory stores a key value pair, namely a key and a value field value, denoted by M i=[Ki,Vi. Wherein K i represents the keyword representation of the ith task, which is an output representation corresponding to the category occupation information after the task is subjected to the conversion according to the characteristics of the task on the support set and the query set and the category occupation information cls; the value field V i stores key checkpoints in different tasks, such as model parameters, loss, and accuracy.
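The key construction above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the Transformer is replaced by a stand-in mixing function, since only the interface — prepend the cls token, take the output at position 0 — is exercised here, and all names are illustrative.

```python
import numpy as np

def toy_transformer(tokens: np.ndarray) -> np.ndarray:
    """Stand-in for the Transformer: each output position is the input
    token mixed with the mean over all tokens (a crude form of attention)."""
    ctx = tokens.mean(axis=0, keepdims=True)
    return 0.5 * tokens + 0.5 * ctx

def task_key(features: np.ndarray, cls: np.ndarray) -> np.ndarray:
    """K_i = Transformer([cls, e_1, ..., e_n])[0]: prepend the class
    placeholder token, run the sequence model, keep the output at position 0."""
    tokens = np.vstack([cls[None, :], features])
    return toy_transformer(tokens)[0]

rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 8))   # e_1..e_n: encoded support/query samples
cls = np.zeros(8)                 # class placeholder token
K_i = task_key(feats, cls)        # keyword of the ith task
V_i = {"phi_q": None, "phi_s": None, "avg_loss": None, "acc": None}
M_i = (K_i, V_i)                  # one storage unit: a key-value pair
```

Because the key is read off at the cls position, it summarizes the whole task in a single vector, which later makes the cosine-distance lookup a simple vector comparison.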
The step S302 of this embodiment specifically includes:
The support set of the ith task T_i — the image samples x_j and their corresponding manual annotations y_j — is used to optimize the original parameters φ, specifically expressed as:
φ_i = φ − α·∇_φ (1/S) Σ_{j=1}^{S} L(f_φ(x_j), y_j);
where α denotes the learning rate of the inner-loop optimization, S denotes the number of image samples in the support set, f_φ(x_j) denotes the prediction for image sample x_j obtained by the encoder and predictor with the original parameters φ, L denotes the loss function, and φ_i denotes the support-set-optimized parameters;
the parameters φ_i are updated iteratively using all image samples in the support set of task T_i;
the parameter φ_i corresponding to each task is updated iteratively over the task set;
obtaining the support-set-optimized parameter φ_i for each task.
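The inner-loop update above is a gradient step on the mean support-set loss. A minimal sketch with a toy linear model and squared loss (so the gradient is analytic); the model, learning rate, and names are illustrative, not from the patent:

```python
import numpy as np

def inner_update(phi, x_support, y_support, alpha=0.2):
    """One inner-loop step: phi_i = phi - alpha * grad_phi of the mean
    loss over the S support samples; the model is a toy linear predictor
    f_phi(x) = x . phi with squared loss, so the gradient is analytic."""
    S = len(x_support)
    residual = x_support @ phi - y_support        # f_phi(x_j) - y_j
    return phi - alpha * (2.0 / S) * x_support.T @ residual

rng = np.random.default_rng(1)
x_s = rng.normal(size=(10, 3))                    # support-set features
true_phi = np.array([1.0, -2.0, 0.5])
y_s = x_s @ true_phi                              # manual annotations
phi = np.zeros(3)                                 # original parameters
for _ in range(500):                              # iterate over the support set
    phi = inner_update(phi, x_s, y_s)
```

Repeating the step over the support set drives φ toward the task-specific optimum, which is what the support-set-optimized parameter φ_i captures.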
After the inner-loop update is completed, the outer loop starts; that is, step S303 of this embodiment specifically includes:
The image samples x in the query set of the ith task and their corresponding manual annotations y are used to optimize the support-set-optimized parameter φ_i, specifically expressed as:
φ'_i = φ_i − γ·∇_{φ_i} (1/Q) Σ_{(x,y)} L(f_{φ_i}(x), y);
where γ denotes the learning rate of the outer-loop optimization, Q denotes the number of image samples in the query set, and f_{φ_i}(x) denotes the prediction for image sample x using the support-set-optimized parameter φ_i;
the parameter φ'_i corresponding to each task is updated iteratively over the task set using the query set of each task;
obtaining the query-set-optimized parameter φ'_i for each task.
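The alternating inner/outer update can be sketched in the same toy setting; this is the first-order variant of the scheme (the outer gradient is not differentiated through the inner step), and all names are illustrative:

```python
import numpy as np

def grad_mse(phi, x, y):
    """Gradient of the mean squared loss for a toy linear model f_phi(x) = x . phi."""
    return (2.0 / len(x)) * x.T @ (x @ phi - y)

def adapt(phi, task, alpha=0.1, gamma=0.05):
    """Inner step on the support set, then outer step on the query set.
    First-order variant: the outer gradient is evaluated at phi_i
    directly rather than differentiated through the inner step."""
    (x_s, y_s), (x_q, y_q) = task
    phi_i = phi - alpha * grad_mse(phi, x_s, y_s)            # support-set-optimized
    phi_i_prime = phi_i - gamma * grad_mse(phi_i, x_q, y_q)  # query-set-optimized
    return phi_i, phi_i_prime

rng = np.random.default_rng(2)
w = np.array([0.5, 1.5])                          # ground-truth task parameters
x_s, x_q = rng.normal(size=(8, 2)), rng.normal(size=(8, 2))
task = ((x_s, x_s @ w), (x_q, x_q @ w))
phi = np.zeros(2)
for _ in range(500):                              # alternate inner and outer steps
    _, phi = adapt(phi, task)
```

Each call returns both φ_i and φ'_i, which is exactly the pair the scene memory later stores in the value field V_i.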
The step S304 in this embodiment specifically includes:
The query-set-optimized parameter φ'_i of the ith task, the support-set-optimized parameter φ_i, the average loss, and the accuracy U — obtained by comparing the predictions made with the query-set-optimized parameter φ'_i against those made with the original parameter φ — are stored in the scene memory as the value field V_i, where f_{φ'_i}(x) denotes the prediction for image sample x using the query-set-optimized parameter φ'_i.
In the meta-learning framework, tasks are sampled from a task distribution, generating a sequence of tasks. The core idea of meta-learning is to find a generic meta-learner on the training tasks of the meta-training phase; accordingly, for each task T_i, the parameter φ is brought to its optimum by alternating the inner and outer loops. The inner loop optimizes the parameters with the support-set sample data, yielding the support-set-optimized parameter φ_i; the outer loop optimizes the parameters with the query-set sample data, yielding the query-set-optimized parameter φ'_i.
In the specific implementation, when the scene memory is updated and the number of tasks grows beyond the maximum capacity of the scene memory, a first-in-first-out (FIFO) policy is adopted: the storage unit that entered the scene memory earliest is replaced, and the pair M_i = [K_i, V_i] obtained from the new task is stored in the scene memory.
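A FIFO-replaced key-value store of this kind can be sketched as follows; the class name and value-field layout are illustrative, not from the patent:

```python
import numpy as np

class SceneMemory:
    """N storage units M_i = [K_i, V_i] with first-in-first-out (FIFO)
    replacement once the capacity is exceeded, and nearest-key lookup
    by cosine distance."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.units = []                  # list of (key, value-field) pairs

    def write(self, key, value):
        if len(self.units) >= self.capacity:
            self.units.pop(0)            # evict the unit that entered first
        self.units.append((key, value))

    def read(self, key):
        def cos_dist(k):
            return 1.0 - key @ k / (np.linalg.norm(key) * np.linalg.norm(k))
        return min(self.units, key=lambda unit: cos_dist(unit[0]))[1]

mem = SceneMemory(capacity=2)
mem.write(np.array([1.0, 0.0]), {"task": "A"})
mem.write(np.array([0.0, 1.0]), {"task": "B"})
mem.write(np.array([1.0, 1.0]), {"task": "C"})   # capacity exceeded: "A" is evicted
```

FIFO keeps the memory bounded while favoring the most recently seen tasks; cosine lookup is what later matches a new task's key K against the stored keywords.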
The step S4 of this embodiment specifically includes:
The query-set-optimized parameter φ', the randomly initialized initial parameter φ, the prompt, the initial information, and the timestamp t are used as the inputs of the scene-memory-driven diffusion model;
the prompt comprises the support-set-optimized parameter φ_r, the average loss, and the accuracy U stored in the scene memory; the initial information comprises the average loss and accuracy obtained on the current task, used as the initial loss and initial accuracy; the timestamp t is the number of loop-optimization iterations in the scene memory;
the output of the scene-memory-driven diffusion model is expressed as a denoising prediction of the task-specific parameters;
where Θ_θ denotes the diffusion model and θ its parameters, obtained through a Transformer; ε denotes the noise added as the diffusion process proceeds, sampled at random from a Gaussian distribution; and α denotes the learning rate of the inner-loop optimization.
This embodiment uses the diffusion model to predict signals, i.e., the diffusion model is parameterized to output task-specific parameters. The overall technical framework is shown in FIG. 3. The inputs of the scene-memory-driven diffusion model comprise the following parts:
(1) The query-set-optimized parameter φ': the cosine distance is computed between the K_i obtained for the current task and the keys K of all units in the scene memory; the parameter φ'_r stored in the value field V_r of the unit with the smallest distance is selected as the first input φ'. This input is generally noisy, and this embodiment combines the other information to denoise φ' for a more accurate prediction;
(2) The initial parameter φ: randomly initialized parameters;
(3) The prompt: the input part indicated by the dashed line in FIG. 3, comprising the support-set-optimized parameter φ_r, the average loss, and the accuracy U stored in the scene memory;
(4) The initial information: the initial loss and the initial accuracy computed from the support set and the query set of the current task, following the flow of FIG. 2;
(5) The timestamp t: the timestamp of the diffusion model, whose specific value can be tuned experimentally.
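The five inputs above can be gathered into one structure. A minimal sketch; every field name here is illustrative, not from the patent:

```python
from dataclasses import dataclass

@dataclass
class DiffusionInput:
    """The five inputs of the scene-memory-driven diffusion model;
    field names are illustrative, not from the patent."""
    phi_query: list   # (1) query-set-optimized parameter phi', from the nearest unit
    phi_init: list    # (2) randomly initialized initial parameter phi
    prompt: dict      # (3) {phi_r, average loss, accuracy U} from the scene memory
    init_info: dict   # (4) initial loss/accuracy of the current task
    t: int            # (5) diffusion timestamp

inp = DiffusionInput(
    phi_query=[0.1, -0.3],
    phi_init=[0.0, 0.0],
    prompt={"phi_r": [0.2, -0.1], "avg_loss": 0.8, "accuracy": 0.6},
    init_info={"loss": 1.2, "accuracy": 0.4},
    t=50,
)
```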
From the above five inputs, the output of the scene-memory-driven diffusion model is obtained as expressed in step S4 above.
The step S5 in this embodiment specifically includes:
Optimizing the objective function and constructing the total loss function, whose terms are as follows:
ε denotes the noise added as the diffusion process proceeds, sampled at random from a Gaussian distribution; θ denotes the diffusion model parameters, obtained through a Transformer; α denotes the learning rate of the inner-loop optimization; x_q denotes an image sample and ŷ_q its corresponding prediction label; Q denotes the number of image samples in the query set; E denotes the expectation; λ denotes an adjustment parameter; and p(ŷ_q | x_q, φ) denotes the probability of the prediction ŷ_q given x_q and the parameter φ;
the parameters θ and φ of the scene-memory-driven diffusion model are optimized using the total loss function.
In this embodiment, the diffusion model represented by Θ_θ is implemented with a Transformer with parameters θ, and the total loss function is then used to optimize the scene-memory-driven diffusion model parameter θ and the baseline network parameter φ; here E denotes the expectation and λ is an adjustment parameter whose specific value can be tuned experimentally.
Step S6 of this embodiment, i.e., predicting the image to be predicted using the optimized scene-memory-driven diffusion model, specifically includes:
constructing a new task T_new, the new task comprising a number of images to be predicted, divided into a support set and a query set;
obtaining the keyword K corresponding to the new task;
computing the cosine distances between the keyword K and the keywords K_j of all storage units in the scene memory;
selecting the value field V_r corresponding to the keyword K_r with the smallest cosine distance; obtaining the query-set-optimized parameter φ'_r, the support-set-optimized parameter φ_r, the average loss, and the accuracy U stored in the value field V_r; and constructing the support-set-optimized parameter φ_r, the average loss, and the accuracy U into the prompt;
computing the initial loss and the initial accuracy with the support set and query set of the new task, generating the initial information;
randomly generating the initial parameter φ corresponding to the new task;
inputting the obtained query-set-optimized parameter φ'_r, the prompt, the initial information, the initial parameter φ of the new task, and the timestamp t into the scene-memory-driven diffusion model, generating the prediction parameter φ' corresponding to the new task;
completing the prediction for the new task with the query set of the new task and the prediction parameter φ'.
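The retrieval steps above — compute the new task's key, take the cosine-nearest unit, and build the prompt from its value field — can be sketched as follows; the dictionary layout and names are illustrative, not from the patent:

```python
import numpy as np

def cosine_distance(a, b):
    return 1.0 - a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def retrieve_prompt(K_new, memory):
    """Find the unit whose keyword K_r is closest to the new task's key K
    (smallest cosine distance), then build the prompt from its value field
    V_r = {phi'_r, phi_r, average loss, accuracy U}."""
    r = min(memory, key=lambda i: cosine_distance(K_new, memory[i]["K"]))
    V_r = memory[r]["V"]
    prompt = {"phi_r": V_r["phi_r"], "avg_loss": V_r["avg_loss"], "acc": V_r["acc"]}
    return V_r["phi_q_r"], prompt     # phi'_r and the prompt

memory = {
    0: {"K": np.array([1.0, 0.0]),
        "V": {"phi_q_r": [0.3], "phi_r": [0.2], "avg_loss": 0.9, "acc": 0.5}},
    1: {"K": np.array([0.0, 1.0]),
        "V": {"phi_q_r": [0.7], "phi_r": [0.6], "avg_loss": 0.4, "acc": 0.8}},
}
phi_q_r, prompt = retrieve_prompt(np.array([0.1, 1.0]), memory)
```

The retrieved φ'_r and prompt then go into the diffusion model together with the initial parameters, initial information, and timestamp.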
In this embodiment, the image to be predicted is predicted through the scene memory and the scene-memory-driven diffusion model, which effectively reduces noise in the prediction process; compared with the prior art, the prediction results are more accurate.
Example 2
As shown in fig. 2, the present embodiment provides a system for constructing an optimized diffusion model based on a context memory driver, which is used for implementing the method for constructing an optimized diffusion model based on a context memory driver provided in embodiment 1, and the system specifically includes:
The task construction module 101 is configured to construct a plurality of tasks, where each task includes a plurality of image samples and manual labels corresponding to the image samples; one part of the image samples and the manual labels corresponding to the image samples are constructed as a support set, and the other part is constructed as a query set;
A scene memory constructing module 102, configured to construct a scene memory and optimize the scene memory using the support sets and query sets of a number of tasks; specifically comprising:
constructing a scene memory for storing task keywords and value fields;
constructing the scene memory M = {M_1, M_2, …, M_N}, where N denotes the number of storage units in the scene memory, N being no larger than its storage capacity; each storage unit M_i stores a key-value pair, i.e. a keyword and a value field, denoted M_i = [K_i, V_i]; V_i denotes the value field of the ith task and K_i its keyword, specifically K_i = Transformer([cls, e_1, …, e_n])[0]; n denotes the total number of image samples in the support set and query set of the ith task, e_j = g_φ(x_j) is the feature representation of image sample x_j after passing through the encoder g_φ, x_j denotes the jth image sample of the ith task, and cls denotes the class placeholder token of the ith task;
performing inner-loop optimization on the initial parameters of the scene memory using the support sets of the tasks, obtaining the support-set-optimized parameters;
the support set of the ith task T_i — the image samples x_j and their manual annotations y_j — is used to optimize the original parameters φ, specifically φ_i = φ − α·∇_φ (1/S) Σ_{j=1}^{S} L(f_φ(x_j), y_j), where α denotes the learning rate of the inner-loop optimization, S the number of image samples in the support set, f_φ(x_j) the prediction for x_j obtained by the encoder and predictor with the original parameters φ, L the loss function, and φ_i the support-set-optimized parameters; the parameters φ_i are updated iteratively using all image samples in the support set of task T_i; the parameter φ_i corresponding to each task is updated iteratively over the task set; obtaining the support-set-optimized parameter φ_i for each task;
after the inner-loop update is completed, performing outer-loop update on the support-set-optimized parameters of the scene memory using the query sets of the tasks, obtaining the query-set-optimized parameters;
the image samples x in the query set of the ith task and their manual annotations y are used to optimize the support-set-optimized parameter φ_i, specifically φ'_i = φ_i − γ·∇_{φ_i} (1/Q) Σ_{(x,y)} L(f_{φ_i}(x), y), where γ denotes the learning rate of the outer-loop optimization, Q the number of image samples in the query set, and f_{φ_i}(x) the prediction for x using the support-set-optimized parameter φ_i; the parameter φ'_i corresponding to each task is updated iteratively over the task set using the query set of each task; obtaining the query-set-optimized parameter φ'_i for each task;
optimizing the scene memory according to the support-set-optimized parameters and the query-set-optimized parameters;
the query-set-optimized parameter φ'_i of the ith task, the support-set-optimized parameter φ_i, the average loss, and the accuracy U — obtained by comparing the predictions made with the query-set-optimized parameter φ'_i against those made with the original parameter φ — are stored in the scene memory as the value field V_i, where f_{φ'_i}(x) denotes the prediction for image sample x using the query-set-optimized parameter φ'_i.
A diffusion model constructing module 103, configured to construct a scene-memory-driven diffusion model according to the scene memory; specifically comprising:
using the query-set-optimized parameter φ', the randomly initialized initial parameter φ, the prompt, the initial information, and the timestamp t as the inputs of the scene-memory-driven diffusion model;
wherein the prompt comprises the support-set-optimized parameter φ_r, the average loss, and the accuracy U stored in the scene memory; the initial information comprises the average loss and accuracy obtained on the current task, used as the initial loss and initial accuracy; and the timestamp t is the number of loop-optimization iterations in the scene memory;
the output of the scene-memory-driven diffusion model is expressed as a denoising prediction of the task-specific parameters;
where Θ_θ denotes the diffusion model and θ its parameters, obtained through a Transformer; ε denotes the noise added as the diffusion process proceeds, sampled at random from a Gaussian distribution; and α denotes the learning rate of the inner-loop optimization.
A diffusion model optimizing module 104, configured to optimize the scene-memory-driven diffusion model using the support sets and query sets of a number of tasks; specifically comprising:
optimizing the objective function and constructing the total loss function, whose terms are as follows:
ε denotes the noise added as the diffusion process proceeds, sampled at random from a Gaussian distribution; θ denotes the diffusion model parameters, obtained through a Transformer; α denotes the learning rate of the inner-loop optimization; x_q denotes an image sample and ŷ_q its corresponding prediction label; Q denotes the number of image samples in the query set; E denotes the expectation; λ denotes an adjustment parameter; and p(ŷ_q | x_q, φ) denotes the probability of the prediction ŷ_q given x_q and the parameter φ;
the parameters θ and φ of the scene-memory-driven diffusion model are optimized using the total loss function.
A prediction module 105, configured to construct a new task comprising a number of images to be predicted, and predict the images to be predicted using the optimized scene-memory-driven diffusion model; specifically comprising:
constructing a new task T_new, the new task comprising a number of images to be predicted, divided into a support set and a query set;
obtaining the keyword K corresponding to the new task;
computing the cosine distances between the keyword K and the keywords K_j of all storage units in the scene memory;
selecting the value field V_r corresponding to the keyword K_r with the smallest cosine distance; obtaining the query-set-optimized parameter φ'_r, the support-set-optimized parameter φ_r, the average loss, and the accuracy U stored in the value field V_r; and constructing the support-set-optimized parameter φ_r, the average loss, and the accuracy U into the prompt;
computing the initial loss and the initial accuracy with the support set and query set of the new task, generating the initial information;
randomly generating the initial parameter φ corresponding to the new task;
inputting the obtained query-set-optimized parameter φ'_r, the prompt, the initial information, the initial parameter φ of the new task, and the timestamp t into the scene-memory-driven diffusion model, generating the prediction parameter φ' corresponding to the new task;
completing the prediction for the new task with the query set of the new task and the prediction parameter φ'.
In this embodiment, a scene memory is first constructed from neural-network checkpoints and used as a prompt to guide the updating of the base model on new tasks; second, the conditional diffusion model serves as the optimizer: the parameters to be updated act as the condition of the diffusion model, the scene memory acts as guidance, and, combined with the diffusion time steps, the model predicts the optimal parameter update required by the expected base model. This embodiment can therefore adapt the base model to unfamiliar parameters with only a single or a few updates at meta-test time, effectively addresses the noise problem of the diffusion model parameters, and improves the accuracy of the prediction results. Meanwhile, the embodiment is general and flexible, and can be integrated smoothly with existing optimization-based meta-learning models.
It should be understood that the foregoing examples of the present invention are merely illustrative of the present invention and are not intended to limit the present invention to the specific embodiments thereof. Any modification, equivalent replacement, improvement, etc. that comes within the spirit and principle of the claims of the present invention should be included in the protection scope of the claims of the present invention.

Claims (2)

1. The method for constructing the optimized diffusion model based on the scene memory driving is characterized by comprising the following steps of:
constructing a plurality of tasks, wherein each task comprises a plurality of image samples and manual labels corresponding to the image samples;
constructing a part of the image samples and the manual labels corresponding to the image samples as a support set, and constructing the other part as a query set;
Constructing a scene memory storage, and optimizing the scene memory storage by using a support set and a query set of a plurality of tasks;
Constructing a diffusion model based on scene memory driving according to the scene memory;
optimizing a diffusion model based on a scene memory driver by using a support set and a query set of a plurality of tasks;
Constructing a new task, wherein the new task comprises a plurality of images to be predicted, and predicting the images to be predicted by using an optimized diffusion model based on scene memory driving;
the construction of the scene memory storage uses a support set and a query set of a plurality of tasks to optimize the scene memory storage, and specifically comprises the following steps:
Constructing a scene memory for storing task keywords and value fields;
internal circulation optimization is carried out on initial parameters of the scene memory storage by using a support set of a plurality of tasks, and parameters after the support set optimization are obtained;
After the internal circulation updating is finished, the parameters after the scene memory support set optimization are subjected to external circulation updating by using the query sets of a plurality of tasks, so that the parameters after the query set optimization are obtained;
Optimizing the scene memory according to the parameters after optimizing the support set and the parameters after optimizing the query set;
the constructing a scene memory for storing task keywords and value fields specifically comprises:
constructing the scene memory M = {M_1, M_2, …, M_N};
where N denotes the number of storage units in the scene memory, N being no larger than its storage capacity; each storage unit M_i stores a key-value pair, i.e. a keyword and a value field, denoted M_i = [K_i, V_i];
where V_i denotes the value field of the ith task and K_i denotes the keyword of the ith task, specifically expressed as: K_i = Transformer([cls, e_1, …, e_n])[0];
where n denotes the total number of image samples in the support set and query set of the ith task, e_j = g_φ(x_j) is the feature representation of image sample x_j after passing through the encoder g_φ, x_j denotes the jth image sample of the ith task, and cls denotes the class placeholder token of the ith task;
the performing inner-loop optimization on the initial parameters of the scene memory using the support sets of a number of tasks to obtain the support-set-optimized parameters specifically comprises:
using the support set of the ith task T_i — the image samples x_j and their corresponding manual annotations y_j — to optimize the original parameters φ, specifically expressed as:
φ_i = φ − α·∇_φ (1/S) Σ_{j=1}^{S} L(f_φ(x_j), y_j);
where α denotes the learning rate of the inner-loop optimization, S denotes the number of image samples in the support set, f_φ(x_j) denotes the prediction for image sample x_j obtained by the encoder and predictor with the original parameters φ, L denotes the loss function, and φ_i denotes the support-set-optimized parameters;
updating the parameters φ_i iteratively using all image samples in the support set of task T_i;
updating the parameter φ_i corresponding to each task iteratively over the task set;
obtaining the support-set-optimized parameter φ_i for each task;
the performing outer-loop update on the support-set-optimized parameters of the scene memory using the query sets of a number of tasks to obtain the query-set-optimized parameters specifically comprises:
using the image samples x in the query set of the ith task and their corresponding manual annotations y to optimize the support-set-optimized parameter φ_i, specifically expressed as:
φ'_i = φ_i − γ·∇_{φ_i} (1/Q) Σ_{(x,y)} L(f_{φ_i}(x), y);
where γ denotes the learning rate of the outer-loop optimization, Q denotes the number of image samples in the query set, and f_{φ_i}(x) denotes the prediction for image sample x using the support-set-optimized parameter φ_i;
updating the parameter φ'_i corresponding to each task iteratively over the task set using the query set of each task;
obtaining the query-set-optimized parameter φ'_i for each task;
the optimizing the scene memory according to the support-set-optimized parameters and the query-set-optimized parameters specifically comprises:
storing the query-set-optimized parameter φ'_i of the ith task, the support-set-optimized parameter φ_i, the average loss, and the accuracy U — obtained by comparing the predictions made with the query-set-optimized parameter φ'_i against those made with the original parameter φ — in the scene memory as the value field V_i, where f_{φ'_i}(x) denotes the prediction for image sample x using the query-set-optimized parameter φ'_i;
the constructing a scene-memory-driven diffusion model according to the scene memory specifically comprises:
using the query-set-optimized parameter φ', the randomly initialized initial parameter φ, the prompt, the initial information, and the timestamp t as the inputs of the scene-memory-driven diffusion model;
wherein the prompt comprises the support-set-optimized parameter φ_r, the average loss, and the accuracy U stored in the scene memory; the initial information comprises the average loss and accuracy obtained on the current task, used as the initial loss and initial accuracy; and the timestamp t is the number of loop-optimization iterations in the scene memory;
the output of the scene-memory-driven diffusion model is expressed as a denoising prediction of the task-specific parameters;
where Θ_θ denotes the diffusion model and θ its parameters, obtained through a Transformer; ε denotes the noise added as the diffusion process proceeds, sampled at random from a Gaussian distribution; and α denotes the learning rate of the inner-loop optimization;
the optimizing the scene-memory-driven diffusion model using the support sets and query sets of a number of tasks specifically comprises:
optimizing the objective function and constructing the total loss function, whose terms are as follows:
ε denotes the noise added as the diffusion process proceeds, sampled at random from a Gaussian distribution; θ denotes the diffusion model parameters, obtained through a Transformer; α denotes the learning rate of the inner-loop optimization; x_q denotes an image sample and ŷ_q its corresponding prediction label; Q denotes the number of image samples in the query set; E denotes the expectation; λ denotes an adjustment parameter; and p(ŷ_q | x_q, φ) denotes the probability of the prediction ŷ_q given x_q and the parameter φ;
optimizing the parameters θ and φ of the scene-memory-driven diffusion model using the total loss function;
the constructing a new task comprising a number of images to be predicted, and predicting the images to be predicted using the optimized scene-memory-driven diffusion model, specifically comprises:
constructing a new task T_new, the new task comprising a number of images to be predicted, divided into a support set and a query set;
obtaining the keyword K corresponding to the new task;
computing the cosine distances between the keyword K and the keywords K_j of all storage units in the scene memory;
selecting the value field V_r corresponding to the keyword K_r with the smallest cosine distance; obtaining the query-set-optimized parameter φ'_r, the support-set-optimized parameter φ_r, the average loss, and the accuracy U stored in the value field V_r; and constructing the support-set-optimized parameter φ_r, the average loss, and the accuracy U into the prompt;
computing the initial loss and the initial accuracy with the support set and query set of the new task, generating the initial information;
randomly generating the initial parameter φ corresponding to the new task;
inputting the obtained query-set-optimized parameter φ'_r, the prompt, the initial information, the initial parameter φ of the new task, and the timestamp t into the scene-memory-driven diffusion model, generating the prediction parameter φ' corresponding to the new task;
completing the prediction for the new task with the query set of the new task and the prediction parameter φ'.
2. A system for constructing an optimized diffusion model based on a context memory driver, comprising:
The task construction module is used for constructing a plurality of tasks, and each task comprises a plurality of image samples and manual labels corresponding to the image samples; one part of the image samples and the manual labels corresponding to the image samples are constructed as a support set, and the other part is constructed as a query set;
The scene memory storage construction module is used for constructing a scene memory storage and optimizing the scene memory storage by using a support set and a query set of a plurality of tasks;
the diffusion model construction module is used for constructing a diffusion model based on the scene memory drive according to the scene memory;
the diffusion model optimization module is used for optimizing a diffusion model based on the scene memory driving by using a support set and a query set of a plurality of tasks;
the prediction module is used for constructing a new task, wherein the new task comprises a plurality of images to be predicted, and the optimized diffusion model based on the scene memory drive is used for predicting the images to be predicted;
the construction of the scene memory storage uses a support set and a query set of a plurality of tasks to optimize the scene memory storage, and specifically comprises the following steps:
Constructing a scene memory for storing task keywords and value fields;
internal circulation optimization is carried out on initial parameters of the scene memory storage by using a support set of a plurality of tasks, and parameters after the support set optimization are obtained;
After the internal circulation updating is finished, the parameters after the scene memory support set optimization are subjected to external circulation updating by using the query sets of a plurality of tasks, so that the parameters after the query set optimization are obtained;
Optimizing the scene memory according to the parameters after optimizing the support set and the parameters after optimizing the query set;
the constructing a scene memory for storing task keywords and value fields specifically comprises:
constructing the scene memory M = {M_1, M_2, …, M_N};
where N denotes the number of storage units in the scene memory, N being no larger than its storage capacity; each storage unit M_i stores a key-value pair, i.e. a keyword and a value field, denoted M_i = [K_i, V_i];
where V_i denotes the value field of the ith task and K_i denotes the keyword of the ith task, specifically expressed as: K_i = Transformer([cls, e_1, …, e_n])[0];
where n denotes the total number of image samples in the support set and query set of the ith task, e_j = g_φ(x_j) is the feature representation of image sample x_j after passing through the encoder g_φ, x_j denotes the jth image sample of the ith task, and cls denotes the class placeholder token of the ith task;
the performing of inner-loop optimization on the initial parameters of the scene memory by using the support sets of a plurality of tasks to obtain support-set-optimized parameters specifically comprises the following steps:
optimizing the original parameters φ with the image samples x_j in the support set of the i-th task and their corresponding manual annotations y_j, specifically expressed as:
φ_i = φ − α · (1/S) · Σ_{j=1}^{S} ∇_φ L(f_φ(x_j), y_j);
wherein α denotes the learning rate of the inner-loop optimization, S denotes the number of image samples in the support set, f_φ(x_j) denotes the predicted annotation of image sample x_j obtained with the original parameters φ through the encoder and the predictor, L denotes the loss between the predicted annotation and the manual annotation, and φ_i denotes the support-set-optimized parameters;
iteratively updating the parameters φ_i with all image samples in the support set of the i-th task;
iteratively updating the parameters φ_i corresponding to each task in the task set;
obtaining the support-set-optimized parameters φ_i corresponding to each task.
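The inner-loop step above has the shape of a standard support-set gradient update. A minimal sketch, assuming (for illustration only) a linear predictor in place of the encoder-plus-predictor f_φ and a squared loss in place of the task loss:

```python
import numpy as np

def predict(phi, X):
    # f_phi: a linear model standing in for the encoder and predictor
    return X @ phi

def inner_update(phi, support_X, support_y, alpha):
    # one inner-loop step: phi_i = phi - alpha * grad of mean squared loss
    S = len(support_X)
    grad = 2.0 / S * support_X.T @ (predict(phi, support_X) - support_y)
    return phi - alpha * grad

rng = np.random.default_rng(1)
phi = rng.normal(size=4)                     # shared original parameters
# three tasks, each with a support set of S = 6 samples
tasks = [(rng.normal(size=(6, 4)), rng.normal(size=6)) for _ in range(3)]
# one support-set-optimized phi_i per task, as in the text
phis = [inner_update(phi, X, y, alpha=0.1) for X, y in tasks]
```

Each task keeps its own adapted copy φ_i while the original φ is left untouched, which is what the outer loop relies on.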
the performing of an outer-loop update on the support-set-optimized parameters of the scene memory by using the query sets of a plurality of tasks to obtain query-set-optimized parameters specifically comprises the following steps:
optimizing the support-set-optimized parameters φ_i with the image samples x in the query set of the i-th task and their corresponding manual annotations y, specifically expressed as:
φ′_i = φ_i − γ · (1/Q) · Σ_{(x,y)} ∇_{φ_i} L(f_{φ_i}(x), y);
wherein γ denotes the learning rate of the outer-loop optimization, Q denotes the number of image samples in the query set, and f_{φ_i}(x) denotes the predicted annotation of image sample x obtained with the support-set-optimized parameters φ_i;
iteratively updating the parameters φ′_i corresponding to each task with the query set of each task in the task set;
obtaining the query-set-optimized parameters φ′_i corresponding to each task.
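The outer-loop step mirrors the inner one, but starts from the support-adapted φ_i and descends on the query set with its own learning rate γ. Again a sketch with a hypothetical linear model and squared loss:

```python
import numpy as np

def grad_mse(phi, X, y):
    # gradient of the mean squared loss of a linear predictor
    return 2.0 / len(X) * X.T @ (X @ phi - y)

def outer_update(phi_i, query_X, query_y, gamma):
    # phi'_i = phi_i - gamma * query-set gradient at the adapted parameters
    return phi_i - gamma * grad_mse(phi_i, query_X, query_y)

rng = np.random.default_rng(2)
phi_i = rng.normal(size=4)                   # support-set-optimized parameters
query_X, query_y = rng.normal(size=(8, 4)), rng.normal(size=8)  # Q = 8
phi_prime = outer_update(phi_i, query_X, query_y, gamma=0.05)
```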
the optimizing of the scene memory according to the support-set-optimized parameters and the query-set-optimized parameters specifically comprises the following steps:
storing, as the value field V_i in the scene memory, the query-set-optimized parameters φ′_i of the i-th task, the support-set-optimized parameters φ_i, the average loss, and the accuracy U, wherein the accuracy U is obtained from the predicted annotations produced with the query-set-optimized parameters φ′_i and the predicted annotations produced with the original parameters φ, and f_{φ′_i}(x) denotes the predicted annotation of image sample x with the query-set-optimized parameters φ′_i;
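The write into the scene memory can be sketched as a capacity-bounded list of key-value units; the class name and value-field names below are hypothetical, but each unit bundles exactly the four quantities the text names.

```python
import numpy as np

class SceneMemory:
    """Fixed-capacity store of (key, value) units M_i = [K_i, V_i]."""
    def __init__(self, capacity):
        self.capacity = capacity             # N must stay <= capacity
        self.units = []                      # each unit: (K_i, V_i)

    def write(self, key, phi_prime, phi_i, avg_loss, accuracy):
        # V_i bundles the query-set-optimized and support-set-optimized
        # parameters together with the task's average loss and accuracy U
        value = {"phi_prime": phi_prime, "phi_i": phi_i,
                 "avg_loss": avg_loss, "accuracy": accuracy}
        if len(self.units) < self.capacity:
            self.units.append((key, value))

mem = SceneMemory(capacity=4)
mem.write(np.ones(8), phi_prime=np.zeros(4), phi_i=np.ones(4),
          avg_loss=0.31, accuracy=0.87)
print(len(mem.units))  # 1
```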
the constructing of the scene-memory-driven diffusion model from the scene memory specifically comprises the following steps:
taking the query-set-optimized parameters φ′, the randomly initialized initial parameters φ, the prompt, the initial information, and the timestamp t as inputs of the scene-memory-driven diffusion model;
wherein the prompt comprises the support-set-optimized parameters φ_r, the average loss, and the accuracy U stored in the scene memory; the initial information comprises the average loss and accuracy obtained on the current task, serving as the initial loss and initial accuracy; and the timestamp t is the number of iterations of loop optimization in the scene memory;
the output of the scene-memory-driven diffusion model is expressed as the predicted parameters produced by Θ_θ;
wherein Θ_θ denotes the diffusion model and θ denotes the diffusion model parameters, obtained through a Transformer; ε denotes the noise injected as the diffusion process proceeds, which is sampled at random from a Gaussian distribution; and α denotes the learning rate of the inner-loop optimization.
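The input/output interface described above can be sketched with a toy denoiser. The two-layer network below is purely a placeholder for the Transformer-based Θ_θ; the dimensions, weight shapes, and the way the inputs are concatenated are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

def theta_theta(phi_noisy, prompt, info, t, W1, W2):
    # toy stand-in for the Transformer-based denoiser Theta_theta: it
    # consumes the noised parameters, the prompt, the initial information,
    # and the timestamp, and emits predicted parameters of the same size
    x = np.concatenate([phi_noisy, prompt, info, [t]])
    h = np.tanh(x @ W1)
    return h @ W2

d_phi, d_prompt, d_info = 4, 6, 2
phi = rng.normal(size=d_phi)                 # randomly initialized phi
eps = rng.normal(size=d_phi)                 # Gaussian noise epsilon
t = 5                                        # loop-iteration count as timestamp
W1 = rng.normal(size=(d_phi + d_prompt + d_info + 1, 16)) * 0.1
W2 = rng.normal(size=(16, d_phi)) * 0.1
prompt = rng.normal(size=d_prompt)           # stands in for [phi_r, avg loss, U]
info = rng.normal(size=d_info)               # stands in for [init loss, init acc]
phi_pred = theta_theta(phi + eps, prompt, info, t, W1, W2)
print(phi_pred.shape)  # (4,)
```

The essential point the sketch preserves is that the model maps noised parameters, conditioned on the prompt and initial information, back to a parameter vector of the original shape.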
the optimizing of the scene-memory-driven diffusion model by using the support sets and query sets of a plurality of tasks specifically comprises the following steps:
constructing an optimization objective function;
constructing a total loss function;
wherein ε denotes the noise injected as the diffusion process proceeds, sampled at random from a Gaussian distribution; θ denotes the diffusion model parameters, obtained through a Transformer; α denotes the learning rate of the inner-loop optimization; x_q denotes an image sample; ŷ_q denotes the predicted annotation corresponding to image sample x_q; Q denotes the number of image samples in the query set; E denotes the computed expectation; λ denotes an adjustment parameter; and p(ŷ_q | x_q, φ) denotes the probability of the prediction ŷ_q given x_q and the parameters φ;
optimizing the parameters θ and φ of the scene-memory-driven diffusion model with the total loss function.
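A total loss of the shape the symbols above suggest, a noise-prediction term plus a λ-weighted negative log-likelihood over the query set, can be sketched as follows. The exact combination is an assumption; the patent's own formula is not reproduced in the extracted text.

```python
import numpy as np

def total_loss(eps, eps_pred, query_logits, query_labels, lam):
    # diffusion term: squared error between true and predicted noise epsilon
    diff_term = np.mean((eps - eps_pred) ** 2)
    # prediction term: -log p(y_q | x_q, phi), averaged over the Q query samples
    probs = np.exp(query_logits - query_logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    nll = -np.mean(np.log(probs[np.arange(len(query_labels)), query_labels]))
    # lambda trades the two terms off against each other
    return diff_term + lam * nll

rng = np.random.default_rng(4)
eps = rng.normal(size=4)
eps_pred = eps + 0.1 * rng.normal(size=4)    # an imperfect noise prediction
logits = rng.normal(size=(8, 3))             # Q = 8 query samples, 3 classes
labels = rng.integers(0, 3, size=8)
loss = total_loss(eps, eps_pred, logits, labels, lam=0.5)
print(loss > 0)  # True
```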
the constructing of a new task, the new task comprising a plurality of images to be predicted, and the predicting of the images to be predicted by using the optimized scene-memory-driven diffusion model specifically comprises the following steps:
constructing a new task comprising a plurality of images to be predicted, the images to be predicted being divided into a support set and a query set;
acquiring the keyword K corresponding to the new task;
calculating the cosine distances between the keyword K and the keywords K_j of all storage units in the scene memory;
selecting the value field V_r corresponding to the keyword K_r with the minimum cosine distance, acquiring the query-set-optimized parameters φ′_r, the support-set-optimized parameters φ_r, the average loss, and the accuracy U stored in the value field V_r, and constructing the support-set-optimized parameters φ_r, the average loss, and the accuracy U as the prompt;
calculating the initial loss and initial accuracy with the support set of the new task and generating the initial information;
randomly generating the initial parameters φ corresponding to the new task;
inputting the acquired query-set-optimized parameters φ′_r, the prompt, the initial information, the initial parameters φ corresponding to the new task, and the timestamp t into the scene-memory-driven diffusion model to generate the prediction parameters φ′ corresponding to the new task;
completing the prediction of the query set of the new task with the prediction parameters φ′.
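The retrieval step at the core of this procedure, nearest keyword by cosine distance, then reading out the stored value field, can be sketched directly; the memory contents here are placeholder values.

```python
import numpy as np

def cosine_distance(a, b):
    return 1.0 - a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def retrieve(memory, key):
    # pick the unit whose keyword K_r has minimum cosine distance to K
    dists = [cosine_distance(key, K_j) for K_j, _ in memory]
    r = int(np.argmin(dists))
    return memory[r][1]                      # the value field V_r

mem = [(np.array([1.0, 0.0]), "task A value"),
       (np.array([0.0, 1.0]), "task B value")]
K_new = np.array([0.9, 0.1])                 # keyword of the new task
print(retrieve(mem, K_new))  # task A value
```

The retrieved value field supplies φ′_r and the prompt components, which then condition the diffusion model for the new task.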
CN202311208862.9A 2023-09-18 2023-09-18 Method and system for constructing optimized diffusion model based on scene memory drive Active CN117274732B (en)

Publications (2)

Publication Number Publication Date
CN117274732A CN117274732A (en) 2023-12-22
CN117274732B true CN117274732B (en) 2024-07-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant