CN112686318B - Zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration - Google Patents

Zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration

Info

Publication number
CN112686318B
CN112686318B
Authority
CN
China
Prior art keywords
sphere
embedding
category
semantic
alignment
Prior art date
Legal status
Active
Application number
CN202011629663.1A
Other languages
Chinese (zh)
Other versions
CN112686318A (en)
Inventor
张磊
沈佳怡
甄先通
李欣
Current Assignee
Guangdong University of Petrochemical Technology
Original Assignee
Guangdong University of Petrochemical Technology
Priority date
Filing date
Publication date
Application filed by Guangdong University of Petrochemical Technology filed Critical Guangdong University of Petrochemical Technology
Priority to CN202011629663.1A priority Critical patent/CN112686318B/en
Publication of CN112686318A publication Critical patent/CN112686318A/en
Application granted granted Critical
Publication of CN112686318B publication Critical patent/CN112686318B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The application discloses a zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration, which comprises: construction of the overall system framework, a semantic embedding network parameter learning process, and identification of unseen-category samples. The application proposes a joint objective function in which alpha and beta are hyperparameters tuned during the experiments. The application unifies sphere embedding, sphere alignment and sphere calibration in a single optimization formula to address the semantic gap, pivot and prediction bias problems respectively, and maps the distance between the visual features of an image and the semantic description of a category onto a sphere for calculation. The traditional Euclidean distance ignores angle information, and the cosine distance discards radial distance entirely, so the sphere embedding adopted by the application takes both angle information and radial distance into account more comprehensively. The application further adopts different radial distances for the visible and invisible categories, thereby emphasizing the effect of the unseen-category samples.

Description

Zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration
Technical Field
The application relates to the technical field of zero sample learning, in particular to a zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration.
Background
In real-world scenarios, many tasks require identifying instance categories that have never been seen before, which makes conventional training methods unsuitable; zero sample learning emerged to address this. Zero sample learning, also called zero-shot learning, aims to predict and identify the categories of unseen-category data through related prior knowledge learned from the visible-category data in the training set.
Existing methods mainly fall into three lines of work: embedding models, generative models and metric methods. Embedding-model methods transfer knowledge from visible categories to unknown categories by mapping visual-space features onto the category prototypes of the semantic representation. Generative-model methods synthesize samples of unknown classes from the semantic descriptions of the classes, using generative adversarial networks or variational autoencoders, thereby converting zero sample learning into few-shot or many-shot learning. Metric methods choose a suitable distance measure in the embedding space and establish the similarity between visual features and category prototypes.
Existing zero sample learning methods encounter several problems: 1. the semantic gap between visible and invisible categories: existing methods establish the association between known and unknown categories through the semantic space, but describe this association only crudely; 2. the pivot problem: in zero sample learning, samples of many different categories may lie close to only a few category prototypes and far from most others; 3. the prediction bias problem: test images from unknown classes always tend to be identified as known classes that are very close to the unknown class.
The application provides a zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration, giving a unified solution to these three problems, namely the semantic gap, pivot and prediction bias problems, by fusing sphere embedding, sphere alignment and sphere calibration into a single framework.
Disclosure of Invention
The embodiment of the application provides a zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration, comprising: construction of the overall system framework, a semantic embedding network parameter learning process, and identification of unseen-category samples;
The overall system framework construction includes:
The image is embedded through a visual feature embedding network φ, and the category information is embedded through a semantic embedding network. Using the sphere-embedding KL distance, the sphere-alignment function R and the sphere-calibration minimum entropy constraint, an objective function of the following form is constructed:

L = E_{x_n∼D_C^tr}[ D_KL(ỹ_n ‖ p) ] + α·R(η*) + β·E_{x_n∼D_C^tr}[ H[q] ]

where the first term is the sphere-embedding KL distance, i.e. the KL distance between the distribution ỹ_n of the actual labelled sample and the predicted distribution p (the specific steps are given in step 2.5); R denotes the sphere-alignment function, whose specific calculation formula is given in step 2.3; the third term is the minimum entropy constraint of sphere calibration, whose specific calculation formula is given in step 2.4; α and β are hyperparameters tuned during the experiments;
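For illustration only, the following PyTorch-style sketch shows one way the three terms of such an objective could be combined. The function and argument names (joint_objective, log_p, eta_star, and so on) are assumptions for this example, and the squared-error form of R and the softmax forms of p and q are likewise assumed, since the patent's exact formulas appear only as figures in the original publication.

```python
def joint_objective(log_p, y_true, q, cos_theta, eta_star, alpha, beta):
    """All arguments are PyTorch tensors. Combines the three terms of the objective:
    sphere-embedding KL + alpha * sphere-alignment R + beta * sphere-calibration entropy.
    The concrete forms used here are assumptions, not the patent's exact formulas."""
    # Sphere embedding: KL distance to the actual label; for a one-hot label it reduces to -log p_y.
    kl_term = -(y_true * log_p).sum(dim=1).mean()
    # Sphere alignment R: pull pairwise prototype cosines toward the alignment factors eta*.
    r_term = ((cos_theta - eta_star) ** 2).mean()
    # Sphere calibration: expected entropy H[q] of the prediction over the C + U classes.
    entropy_term = -(q * (q + 1e-12).log()).sum(dim=1).mean()
    return kl_term + alpha * r_term + beta * entropy_term
```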
The semantic embedding network parameter learning process comprises the following steps:
Input: the category prototype set A_C of the visible categories, the category prototype set A_U of the unknown categories, the training data set D_C^tr, and the visual feature embedding network φ;
Output: the semantic embedding network parameters;
Step 1: initialization: set the batch size B and the number of iterations l, and initialize the semantic embedding network parameters;
Step 2: for iteration number iter = [1 : l], perform the following operations:
step 2.1: randomly sampling B samples;
Step 2.2: project the visible-category prototypes A_C and the unknown-category prototypes A_U into the sphere embedding space, i.e. for each category prototype a in A_C ∪ A_U, generate its sphere embedding through the semantic embedding network;
Step 2.3: r is calculated according to the following formula:
wherein i and j respectively represent class labels corresponding to prototype a, cos (θ i,j ) Then the cosine distance between the prototypes of the two categories i, j is represented;representing an alignment factor between two categories i, j; the specific calculation is as follows:
wherein a is i Prototype representing class i, a j A prototype representing class j;representing a semantic embedded network;
wherein u represents the uniform alignment factor, i.eRepresenting the balance alignment factor. S-valued represents the semantic alignment factor, i.e. +.>Representing a similarity alignment factor; and λ is a balance parameter that balances semantic alignment and uniform alignment, in [0,1]The value is taken in between;
the balance alignment factor is calculated as follows:
the similarity alignment factor is calculated as follows:
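Since the exact expressions for these factors appear only as figures in the original publication, the sketch below shows one way the alignment factors and R could be realized. The uniform-cosine value -1/(K-1), the use of raw semantic cosines for the similarity factor, the convex combination by λ, and the squared-error penalty are all assumptions, and the names sphere_alignment_R, proto_emb and proto_sem are hypothetical.

```python
import torch
import torch.nn.functional as F

def sphere_alignment_R(proto_emb, proto_sem, lam=0.5):
    """Illustrative sphere-alignment term R (assumed squared-error form), K >= 2 classes."""
    K = proto_emb.size(0)
    # cos(theta_ij): pairwise cosines between the sphere-embedded prototypes.
    e = F.normalize(proto_emb, dim=1)
    cos_theta = e @ e.t()
    # Similarity (semantic) alignment factor: cosine between the original class prototypes a_i, a_j.
    s = F.normalize(proto_sem, dim=1)
    eta_sim = s @ s.t()
    # Balance (uniform) alignment factor: cosine of K points spread as evenly as possible on the sphere.
    eta_uni = torch.full_like(eta_sim, -1.0 / (K - 1))
    # Combined alignment factor eta*, traded off by lambda in [0, 1].
    eta_star = lam * eta_sim + (1.0 - lam) * eta_uni
    # Penalize deviation of the embedded geometry from the target alignment over off-diagonal pairs.
    off_diag = ~torch.eye(K, dtype=torch.bool, device=proto_emb.device)
    return ((cos_theta - eta_star)[off_diag] ** 2).mean()
```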
Step 2.4: calculate the minimum entropy constraint according to the following formula:
where D_C^tr denotes the training data set, x_n ∼ D_C^tr denotes a sample x_n drawn from the training data set, the expectation is taken over all samples of the data set, and H[q] denotes the entropy of the distribution q; q denotes the probability distribution over class labels y predicted for sample x_n after it passes through the visual feature embedding network φ and the semantic embedding network, given the prototype a; a is the corresponding prototype and y is the predicted class label;
the probability distribution q is calculated as:
where φ denotes the visual feature embedding network and the semantic embedding network provides the prototype embeddings, C denotes the number of known categories, and U denotes the number of unknown categories; a_i denotes the prototype of category i and a_j the prototype of category j; y_i denotes the i-th component of the vector y, i.e. the probability that x_n belongs to category i given the prototypes, the semantic embedding network and the visual feature embedding network;
where the function f_ρ is calculated as follows:
where ρ_1 and ρ_2 are the spherical radius functions corresponding to the visible and invisible categories respectively; the unknown categories are set to have a larger radius than the known categories, that is ρ_2 > ρ_1.
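As a concrete illustration of step 2.4 and of f_ρ, the sketch below assumes a radius-scaled cosine score f_ρ(cos θ) = ρ·cos θ with ρ_2 > ρ_1 for unseen classes and a softmax over the C + U classes. These concrete forms, and the names class_probabilities, calibration_entropy and is_unseen, are assumptions, since the exact formulas appear as figures in the original publication.

```python
import torch
import torch.nn.functional as F

def class_probabilities(x_feat, proto_emb, is_unseen, rho1=1.0, rho2=2.0):
    """q(y | x_n, a) under an assumed score f_rho(cos) = rho * cos, with rho2 > rho1 for unseen classes."""
    x = F.normalize(x_feat, dim=-1)       # visual embedding phi(x_n), placed on the unit sphere
    p = F.normalize(proto_emb, dim=-1)    # sphere-embedded class prototypes (C + U rows)
    cos = x @ p.t()                       # cos(theta) between each sample and every prototype
    rho = torch.full((proto_emb.size(0),), rho1)
    rho[is_unseen] = rho2                 # give unseen classes the larger radius rho2 > rho1
    return (rho * cos).softmax(dim=-1)    # probability distribution over the C + U classes

def calibration_entropy(q):
    """Sphere-calibration term: expected entropy H[q] over the batch."""
    return -(q * (q + 1e-12).log()).sum(dim=-1).mean()
```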
Step 2.5: minimize the following objective function:
where D_C^tr denotes the training data set, x_n ∼ D_C^tr denotes a sample x_n drawn from the training data set, and the expectation is taken over all samples of the data set; D_KL denotes the KL distance between the actual category label ỹ_n of x_n and its predicted label distribution p;
the p function is calculated as follows:
Step 2.6: update the semantic embedding network parameters using back-propagation;
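Putting steps 2.1 through 2.6 together, a minimal training-loop sketch might look as follows. It reuses the helper sketches above (sphere_alignment_R, class_probabilities, calibration_entropy), and the names psi, phi, A_C, A_U and loader, as well as the choice p = q for the predicted distribution, are assumptions rather than the patent's exact definitions.

```python
import torch

def train_semantic_net(psi, phi, A_C, A_U, loader, alpha, beta, lam, iters, lr=1e-3):
    """psi: semantic embedding network (trained); phi: visual embedding network (frozen);
    A_C / A_U: seen / unseen class prototypes; loader yields (image batch, label batch)."""
    opt = torch.optim.Adam(psi.parameters(), lr=lr)
    protos = torch.cat([A_C, A_U], dim=0)
    is_unseen = torch.arange(protos.size(0)) >= A_C.size(0)
    for _, (x, y) in zip(range(iters), loader):            # step 2.1: sample a batch of B samples
        proto_emb = psi(protos)                            # step 2.2: project prototypes onto the sphere
        r = sphere_alignment_R(proto_emb, protos, lam)     # step 2.3: sphere-alignment term R
        q = class_probabilities(phi(x), proto_emb, is_unseen)
        ent = calibration_entropy(q)                       # step 2.4: minimum entropy constraint
        y_onehot = torch.nn.functional.one_hot(y, protos.size(0)).float()
        # Step 2.5: KL distance between the actual (one-hot) label and the prediction; here p = q.
        kl = -(y_onehot * (q + 1e-12).log()).sum(dim=1).mean()
        loss = kl + alpha * r + beta * ent
        opt.zero_grad()
        loss.backward()                                    # step 2.6: back-propagate and update psi
        opt.step()
    return psi
```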
the unseen category sample identification includes:
Input: a test image x_m, the category prototypes A_C of the visible categories and A_U of the unknown categories, the semantic embedding network parameters, and the parameters of the visual feature embedding network φ;
Output: the prediction for the test image;
Step 1: for the test image x_m, calculate its visual representation;
Step 2: project the visible-category prototypes A_C and the unknown-category prototypes A_U into the sphere embedding space, i.e. for each category prototype a in A_C ∪ A_U, generate its sphere embedding through the semantic embedding network;
Step 3: calculate the class prediction value for the test image according to the following formula:
where, to remain consistent with the training data, f_ρ is calculated as follows:
where θ_{n,i} denotes the angle between the prototypes a_n and a_i, and n denotes the category of the sample x_n;
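A minimal sketch of this identification procedure, assuming the same ρ·cos θ scoring used in the training sketches above (the names predict, x_m, psi and phi are hypothetical):

```python
import torch
import torch.nn.functional as F

def predict(x_m, phi, psi, A_C, A_U, rho1=1.0, rho2=2.0):
    """Classify a test image over the C + U classes using the sphere-embedded prototypes."""
    protos = torch.cat([A_C, A_U], dim=0)
    is_unseen = torch.arange(protos.size(0)) >= A_C.size(0)
    with torch.no_grad():
        v = F.normalize(phi(x_m), dim=-1)        # step 1: visual representation of the test image
        e = F.normalize(psi(protos), dim=-1)     # step 2: sphere embeddings of all prototypes
        rho = torch.full((protos.size(0),), rho1)
        rho[is_unseen] = rho2                    # unseen classes keep the larger radius rho2
        scores = rho * (v @ e.t())               # step 3: f_rho-scaled cosine score per class
    return scores.argmax(dim=-1)                 # index of the predicted category
```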
the embodiment of the application adopts the following technical scheme: learning to obtain a semantic embedded network by utilizing an objective function constructed in the whole framework of the systemParameters, thus realizing the fusion of sphere embedding, sphere alignment and sphere calibration into one frame, and solving the problems of semantic gap, pivot and prediction deviation.
The embodiment of the application adopts the following technical scheme: in the semantic embedding network parameter learning process, in the calculation formula of step 2.3, λ ∈ [0,1] is a hyperparameter tuned during experimentation.
The embodiment of the application adopts the following technical scheme: in the semantic embedding network parameter learning process, in the calculation formula of step 2.4, H is the entropy of the probability distribution q of the training set samples.
The embodiment of the application adopts the following technical scheme: in the semantic embedding network parameter learning process, in the calculation formula of step 2.4, C is the number of visible categories and U is the number of invisible categories; φ(x_n) is the visual feature embedding function applied to image x_n, the semantic feature embedding function is applied to the category prototype a, and y_i denotes the i-th component of the vector y, i.e. the probability that x_n belongs to category i given the prototypes, the semantic embedding network and the visual feature embedding network.
The embodiment of the application adopts the following technical scheme: in the semantic embedding network parameter learning process, in the formula of step 2.5, α and β are hyperparameters adjusted according to the experimental data.
The embodiment of the application adopts the following technical scheme: the first term in the formula embodies sphere embedding.
The embodiment of the application adopts the following technical scheme: the second term in the formula, αR(η*), embodies sphere alignment.
The embodiment of the application adopts the following technical scheme: the third term in the formula embodies sphere calibration.
The embodiment of the application adopts the following technical scheme: sphere embedding, sphere alignment and sphere calibration are unified in a single optimization formula to address the semantic gap, pivot and prediction bias problems respectively.
The above at least one technical scheme adopted by the embodiment of the application can achieve the following beneficial effects:
the main differences between the application and other zero sample learning methods at present are as follows:
1. The joint objective function proposed in the present application, L = E_{x_n∼D_C^tr}[ D_KL(ỹ_n ‖ p) ] + α·R(η*) + β·E_{x_n∼D_C^tr}[ H[q] ], where α and β are hyperparameters tuned during the experiments. The first term of this objective represents sphere embedding, the second term sphere alignment, and the third term sphere calibration. The application unifies sphere embedding, sphere alignment and sphere calibration in a single optimization formula to address the semantic gap, pivot and prediction bias problems respectively;
2. The application maps the distance between the visual features of the image and the semantic description of the category onto a sphere for calculation. The traditional Euclidean distance ignores angle information, and the cosine distance discards radial distance entirely, so the sphere embedding adopted by the application takes both angle information and radial distance into account more comprehensively. Further, the application uses different radial distances for the visible and invisible categories, thereby emphasizing the effect of the unseen-category samples.
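A toy numerical illustration of this radial idea, under the assumed score form ρ·cos θ (the values below are made up for illustration): with equal cosine similarity, the larger radius assigned to an unseen class raises its score relative to a seen class.

```python
# Equal angular similarity, different radii: the unseen class (rho2) outscores the seen class (rho1).
cos_seen, cos_unseen = 0.80, 0.80
rho1, rho2 = 1.0, 2.0
print("seen score:", rho1 * cos_seen, "unseen score:", rho2 * cos_unseen)  # 0.8 vs 1.6
```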
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a system overall frame diagram of a zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration of the present application;
fig. 2 is a schematic diagram of experimental performance of the zero sample learning mechanism of the present application based on sphere embedding, sphere alignment and sphere calibration.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The following describes in detail the technical solutions provided by the embodiments of the present application with reference to the accompanying drawings.
Examples
A zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration, comprising: construction of the overall system framework, a semantic embedding network parameter learning process, and identification of unseen-category samples;
The overall system framework construction includes:
The image is embedded through a visual feature embedding network φ, and the category information is embedded through a semantic embedding network. Using the sphere-embedding KL distance, the sphere-alignment function R and the sphere-calibration minimum entropy constraint, an objective function of the following form is constructed:

L = E_{x_n∼D_C^tr}[ D_KL(ỹ_n ‖ p) ] + α·R(η*) + β·E_{x_n∼D_C^tr}[ H[q] ]

where the first term is the sphere-embedding KL distance, i.e. the KL distance between the distribution ỹ_n of the actual labelled sample and the predicted distribution p (the specific steps are given in step 2.5); R denotes the sphere-alignment function, whose specific calculation formula is given in step 2.3; the third term is the minimum entropy constraint of sphere calibration, whose specific calculation formula is given in step 2.4; α and β are hyperparameters tuned during the experiments;
The semantic embedding network parameter learning process comprises the following steps:
Input: the category prototype set A_C of the visible categories, the category prototype set A_U of the unknown categories, the training data set D_C^tr, and the visual feature embedding network φ;
Output: the semantic embedding network parameters;
Step 1: initialization: set the batch size B and the number of iterations l, and initialize the semantic embedding network parameters;
Step 2: for iteration number iter = [1 : l], perform the following operations:
Step 2.1: randomly sample B samples;
Step 2.2: project the visible-category prototypes A_C and the unknown-category prototypes A_U into the sphere embedding space, i.e. for each category prototype a in A_C ∪ A_U, generate its sphere embedding through the semantic embedding network;
Step 2.3: calculate R according to the following formula:
where i and j denote the class labels of the corresponding prototypes a, cos(θ_{i,j}) denotes the cosine distance between the prototypes of the two categories i and j, and η*_{i,j} denotes the alignment factor between the two categories i and j, which is calculated as follows:
where a_i denotes the prototype of category i and a_j the prototype of category j, both mapped through the semantic embedding network;
where the superscript u denotes the uniform alignment factor, i.e. the balance alignment factor, the superscript s denotes the semantic alignment factor, i.e. the similarity alignment factor, and λ is a balance parameter that weighs semantic alignment against uniform alignment, taking a value in [0,1];
the balance alignment factor is calculated as follows:
the similarity alignment factor is calculated as follows:
Step 2.4: calculate the minimum entropy constraint according to the following formula:
where D_C^tr denotes the training data set, x_n ∼ D_C^tr denotes a sample x_n drawn from the training data set, the expectation is taken over all samples of the data set, and H[q] denotes the entropy of the distribution q; q denotes the probability distribution over class labels y predicted for sample x_n after it passes through the visual feature embedding network φ and the semantic embedding network, given the prototype a; a is the corresponding prototype and y is the predicted class label; the probability distribution q is calculated as:
where φ denotes the visual feature embedding network and the semantic embedding network provides the prototype embeddings, C denotes the number of known categories, and U denotes the number of unknown categories; a_i denotes the prototype of category i and a_j the prototype of category j; y_i denotes the i-th component of the vector y, i.e. the probability that x_n belongs to category i given the prototypes, the semantic embedding network and the visual feature embedding network;
where the function f_ρ is calculated as follows:
where ρ_1 and ρ_2 are the spherical radius functions corresponding to the visible and invisible categories respectively; the unknown categories are set to have a larger radius than the known categories, that is ρ_2 > ρ_1.
Step 2.5: minimize the following objective function:
where D_C^tr denotes the training data set, x_n ∼ D_C^tr denotes a sample x_n drawn from the training data set, and the expectation is taken over all samples of the data set; D_KL denotes the KL distance between the actual category label ỹ_n of x_n and its predicted label distribution p;
the p function is calculated as follows:
Step 2.6: update the semantic embedding network parameters using back-propagation;
the unseen category sample identification includes:
Input: a test image x_m, the category prototypes A_C of the visible categories and A_U of the unknown categories, the semantic embedding network parameters, and the parameters of the visual feature embedding network φ;
Output: the prediction for the test image;
Step 1: for the test image x_m, calculate its visual representation;
Step 2: project the visible-category prototypes A_C and the unknown-category prototypes A_U into the sphere embedding space, i.e. for each category prototype a in A_C ∪ A_U, generate its sphere embedding through the semantic embedding network;
Step 3: calculate the class prediction value for the test image according to the following formula:
where, to remain consistent with the training data, f_ρ is calculated as follows:
where θ_{n,i} denotes the angle between the prototypes a_n and a_i, and n denotes the category of the sample x_n;
The semantic embedding network parameters are learned using the objective function constructed in the overall system framework, thereby fusing sphere embedding, sphere alignment and sphere calibration into one framework and addressing the semantic gap, pivot and prediction bias problems.
In the semantic embedding network parameter learning process, in the calculation formula of step 2.3, λ ∈ [0,1] is a hyperparameter tuned during experimentation; in the calculation formula of step 2.4, H is the entropy of the probability distribution q of the training set samples, C is the number of visible categories and U is the number of invisible categories, φ(x_n) is the visual feature embedding function applied to image x_n, and the semantic feature embedding function is applied to the category prototype a.
In the semantic embedding network parameter learning process, in the formula of step 2.5, α and β are hyperparameters adjusted according to the experimental data.
The first term in the formula embodies sphere embedding; the second term, αR(η*), embodies sphere alignment; the third term embodies sphere calibration; sphere embedding, sphere alignment and sphere calibration are unified in a single optimization formula to address the semantic gap, pivot and prediction bias problems respectively.
To sum up: the joint objective function proposed in the present application, L = E_{x_n∼D_C^tr}[ D_KL(ỹ_n ‖ p) ] + α·R(η*) + β·E_{x_n∼D_C^tr}[ H[q] ], where α and β are hyperparameters tuned during the experiments. The first term represents sphere embedding, the second term sphere alignment, and the third term sphere calibration. The application unifies sphere embedding, sphere alignment and sphere calibration in a single optimization formula to address the semantic gap, pivot and prediction bias problems respectively;
The application maps the distance between the visual features of the image and the semantic description of the category onto a sphere for calculation. The traditional Euclidean distance ignores angle information, and the cosine distance discards radial distance entirely, so the sphere embedding adopted by the application takes both angle information and radial distance into account more comprehensively. Further, the application uses different radial distances for the visible and invisible categories, with ρ_2 > ρ_1, thereby emphasizing the effect of the unseen-category samples.
It should also be noted that the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (10)

1. A zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration, comprising: construction of the overall system framework, a semantic embedding network parameter learning process, and identification of unseen-category samples;
the overall system framework construction includes:
the image is embedded through a visual feature embedding network φ, and the category information is embedded through a semantic embedding network; using the sphere-embedding KL distance, the sphere-alignment function R and the sphere-calibration minimum entropy constraint, an objective function of the following form is constructed:

L = E_{x_n∼D_C^tr}[ D_KL(ỹ_n ‖ p) ] + α·R(η*) + β·E_{x_n∼D_C^tr}[ H[q] ]

where the first term is the sphere-embedding KL distance, i.e. the KL distance between the distribution ỹ_n of the actual labelled sample and the predicted distribution p, and R denotes the sphere-alignment function; the third term is the minimum entropy constraint of sphere calibration, and α and β are hyperparameters tuned during the experiments;
the semantic embedding network parameter learning process comprises the following steps:
input: the category prototype set A_C of the visible categories, the category prototype set A_U of the unknown categories, the training data set D_C^tr, and the visual feature embedding network φ;
output: the semantic embedding network parameters;
step 1: initialization: set the batch size B and the number of iterations l, and initialize the semantic embedding network parameters;
step 2: for iteration number iter = [1 : l], perform the following operations:
step 2.1: randomly sample B samples;
step 2.2: project the visible-category prototypes A_C and the unknown-category prototypes A_U into the sphere embedding space, i.e. for each category prototype a in A_C ∪ A_U, generate its sphere embedding through the semantic embedding network;
step 2.3: calculate R according to the following formula:
where i and j denote the class labels of the corresponding prototypes a, cos(θ_{i,j}) denotes the cosine distance between the prototypes of the two categories i and j, and η*_{i,j} denotes the alignment factor between the two categories i and j:
where a_i denotes the prototype of category i and a_j the prototype of category j, both mapped through the semantic embedding network;
where the superscript u denotes the uniform alignment factor, i.e. the balance alignment factor, the superscript s denotes the semantic alignment factor, i.e. the similarity alignment factor, and λ is a balance parameter that weighs semantic alignment against uniform alignment, taking a value in [0,1],
the balance alignment factor is calculated as follows:
the similarity alignment factor is calculated as follows:
step 2.4: calculate the minimum entropy constraint according to the following formula:
where D_C^tr denotes the training data set, x_n ∼ D_C^tr denotes a sample x_n drawn from the training data set, the expectation is taken over all samples of the data set, and H[q] denotes the entropy of the distribution q; q denotes the probability distribution over class labels y predicted for sample x_n after it passes through the visual feature embedding network φ and the semantic embedding network, given the prototype a; a is the corresponding prototype, y is the predicted class label, and the probability distribution q is calculated as:
where φ denotes the visual feature embedding network and the semantic embedding network provides the prototype embeddings, C denotes the number of known categories, U denotes the number of unknown categories, a_i denotes the prototype of category i, a_j denotes the prototype of category j, and y_i denotes the i-th component of the vector y, i.e. the probability that x_n belongs to category i given the prototypes, the semantic embedding network and the visual feature embedding network,
where the function f_ρ is calculated as follows:
where ρ_1 and ρ_2 are the spherical radius functions corresponding to the visible and invisible categories respectively, with ρ_2 > ρ_1;
step 2.5: minimize the following objective function:
where D_C^tr denotes the training data set, x_n ∼ D_C^tr denotes a sample x_n drawn from the training data set, and the expectation is taken over all samples of the data set; D_KL denotes the KL distance between the actual category label ỹ_n of x_n and its predicted label distribution p,
and the p function is calculated as follows:
step 2.6: update the semantic embedding network parameters using back-propagation;
the unseen category sample identification includes:
input: a test image x_m, the category prototypes A_C of the visible categories and A_U of the unknown categories, the semantic embedding network parameters, and the parameters of the visual feature embedding network φ;
output: the prediction for the test image;
step 1: for the test image x_m, calculate its visual representation;
step 2: project the visible-category prototypes A_C and the unknown-category prototypes A_U into the sphere embedding space, i.e. for each category prototype a in A_C ∪ A_U, generate its sphere embedding through the semantic embedding network;
step 3: calculate the class prediction value for the test image according to the following formula:
where, to remain consistent with the training data, f_ρ is calculated as follows:
where θ_{n,i} denotes the angle between the prototypes a_n and a_i, and n denotes the category of the sample x_n.
2. The zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration according to claim 1, wherein the semantic embedding network parameters are learned using the objective function constructed in the overall system framework, thereby fusing sphere embedding, sphere alignment and sphere calibration into one framework and addressing the semantic gap, pivot and prediction bias problems.
3. The zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration according to claim 1, wherein in the semantic embedding network parameter learning process, in the calculation formula of step 2.3, λ ∈ [0,1] is a hyperparameter tuned during experimentation.
4. The zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration according to claim 1, wherein in the semantic embedding network parameter learning process, in the calculation formula of step 2.4, H is the entropy of the probability distribution q of the training set samples.
5. The zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration according to claim 1, wherein in the semantic embedding network parameter learning process, in the calculation formula of step 2.4, C is the number of visible categories and U is the number of invisible categories; φ(x_n) is the visual feature embedding function applied to image x_n, the semantic feature embedding function is applied to the category prototype a, and y_i denotes the i-th component of the vector y, i.e. the probability that x_n belongs to category i given the prototypes, the semantic embedding network and the visual feature embedding network.
6. The zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration according to claim 1, wherein in the semantic embedding network parameter learning process, in the formula of step 2.5, α and β are hyperparameters adjusted according to the experimental data.
7. The zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration according to claim 6, wherein the first term in the formula embodies sphere embedding.
8. The zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration according to claim 6, wherein the second term in the formula, αR(η*), embodies sphere alignment.
9. The zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration according to claim 6, wherein the third term in the formula embodies sphere calibration.
10. The zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration according to claim 6, wherein sphere embedding, sphere alignment and sphere calibration are unified in a single optimization formula to address the semantic gap, pivot and prediction bias problems respectively.
CN202011629663.1A 2020-12-31 2020-12-31 Zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration Active CN112686318B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011629663.1A CN112686318B (en) 2020-12-31 2020-12-31 Zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011629663.1A CN112686318B (en) 2020-12-31 2020-12-31 Zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration

Publications (2)

Publication Number Publication Date
CN112686318A CN112686318A (en) 2021-04-20
CN112686318B true CN112686318B (en) 2023-08-29

Family

ID=75455944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011629663.1A Active CN112686318B (en) 2020-12-31 2020-12-31 Zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration

Country Status (1)

Country Link
CN (1) CN112686318B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163258A (en) * 2019-04-24 2019-08-23 浙江大学 A kind of zero sample learning method and system reassigning mechanism based on semantic attribute attention
CN110516718A (en) * 2019-08-12 2019-11-29 西北工业大学 The zero sample learning method based on depth embedded space
WO2020156303A1 (en) * 2019-01-30 2020-08-06 广州市百果园信息技术有限公司 Method and apparatus for training semantic segmentation network, image processing method and apparatus based on semantic segmentation network, and device and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017166137A1 (en) * 2016-03-30 2017-10-05 中国科学院自动化研究所 Method for multi-task deep learning-based aesthetic quality assessment on natural image
US11087174B2 (en) * 2018-09-25 2021-08-10 Nec Corporation Deep group disentangled embedding and network weight generation for visual inspection

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020156303A1 (en) * 2019-01-30 2020-08-06 广州市百果园信息技术有限公司 Method and apparatus for training semantic segmentation network, image processing method and apparatus based on semantic segmentation network, and device and storage medium
CN110163258A (en) * 2019-04-24 2019-08-23 浙江大学 A kind of zero sample learning method and system reassigning mechanism based on semantic attribute attention
CN110516718A (en) * 2019-08-12 2019-11-29 西北工业大学 The zero sample learning method based on depth embedded space

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Towards Effective Deep Embedding; Lei Zhang et al.; IEEE; 2020-09-30; Vol. 30, No. 9; pp. 2843-2852 *

Also Published As

Publication number Publication date
CN112686318A (en) 2021-04-20

Similar Documents

Publication Publication Date Title
Brachmann et al. Dsac-differentiable ransac for camera localization
US20180240243A1 (en) Segmenting three-dimensional shapes into labeled component shapes
US20220067588A1 (en) Transforming a trained artificial intelligence model into a trustworthy artificial intelligence model
CN108021930A (en) A kind of adaptive multi-view image sorting technique and system
CN110597956B (en) Searching method, searching device and storage medium
CN111052128B (en) Descriptor learning method for detecting and locating objects in video
CN111127364A (en) Image data enhancement strategy selection method and face recognition image data enhancement method
US20230049817A1 (en) Performance-adaptive sampling strategy towards fast and accurate graph neural networks
CN111161249A (en) Unsupervised medical image segmentation method based on domain adaptation
WO2020256732A1 (en) Domain adaptation and fusion using task-irrelevant paired data in sequential form
CN110443273B (en) Zero-sample-confrontation learning method for cross-class identification of natural images
CN111159241A (en) Click conversion estimation method and device
CN112686318B (en) Zero sample learning mechanism based on sphere embedding, sphere alignment and sphere calibration
CN111062406B (en) Heterogeneous domain adaptation-oriented semi-supervised optimal transmission method
CN111161238A (en) Image quality evaluation method and device, electronic device, and storage medium
CN110717037A (en) Method and device for classifying users
US20240020531A1 (en) System and Method for Transforming a Trained Artificial Intelligence Model Into a Trustworthy Artificial Intelligence Model
CN114255381B (en) Training method of image recognition model, image recognition method, device and medium
CN110135507A (en) A kind of label distribution forecasting method and device
CN114595787A (en) Recommendation model training method, recommendation device, medium and equipment
Han et al. Vanishing point detection and line classification with BPSO
CN112329833A (en) Image metric learning method based on spherical surface embedding
CN111523649A (en) Method and device for preprocessing data aiming at business model
Lee Accumulating conversational skills using continual learning
CN115984653B (en) Construction method of dynamic intelligent container commodity identification model

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant