CN109213876A - Cross-modal retrieval method based on generative adversarial network - Google Patents

Cross-modal retrieval method based on generative adversarial network

Info

Publication number
CN109213876A
Authority
CN
China
Prior art keywords
data
cross-modal
discriminator
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810871910.5A
Other languages
Chinese (zh)
Other versions
CN109213876B (en)
Inventor
刘立波
徐峰
程晓龙
郑斌
郭进祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningxia University
Original Assignee
Ningxia University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningxia University filed Critical Ningxia University
Priority to CN201810871910.5A priority Critical patent/CN109213876B/en
Publication of CN109213876A publication Critical patent/CN109213876A/en
Application granted granted Critical
Publication of CN109213876B publication Critical patent/CN109213876B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a cross-modal retrieval method based on a generative adversarial network (GAN), relating to the technical field of multimedia data retrieval. The method comprises the following steps: step 1, extract features from the data of the input modality and the data of the target modality using a feature extraction method; step 2, build and train a GAN model, so that the GAN model can generate target-modality data from input-modality data; step 3, match the target-modality data generated by the GAN model against the corresponding-modality data obtained in step 1 for similarity, i.e., compute their Euclidean distances; step 4, sort the computed Euclidean distances in ascending order to obtain the cross-modal retrieval result; the smaller the Euclidean distance, the higher the similarity between a higher-ranked result and the retrieval target. Compared with existing cross-modal retrieval models, the present invention makes fuller use of the learning and mapping capabilities of deep neural networks and improves cross-modal retrieval accuracy.

Description

Cross-modal retrieval method based on generative adversarial network
Technical field
The present invention relates to the technical field of multimedia data retrieval, and in particular to a cross-modal retrieval method based on a generative adversarial network.
Background art
With the development of Internet technology, more and more media data carrying the same semantics appear simultaneously in multiple modalities, for example the photographs accompanying a news report, or a patient's diagnostic notes and the corresponding medical images. People increasingly need to retrieve one modality by means of another, not merely to search within a single modality. For instance, a user who sees a photo can submit it to a retrieval system, and the system returns text related to that photo; a patient can submit an X-ray image, and the system returns a diagnostic text matching the X-ray. This method of retrieving data of other modalities with data of one modality is called cross-modal retrieval.
Traditional cross-modal retrieval methods, such as retrieving images by text, in fact work from textual annotations attached to the images, so they are essentially still single-modality retrieval. But the Internet grows rapidly and large amounts of images and text keep emerging, which makes manual annotation time-consuming and laborious; moreover, manual annotations often fail to fully describe the content of an image, which affects the retrieval results. Deep learning has meanwhile achieved good results in processing text and images, which opens a path toward new cross-modal retrieval techniques based on deep learning.
Current deep-learning-based algorithms fall into two classes: 1) methods of the first class abstract the data of each modality separately and then map the abstracted results into a common representation space to establish associations between the modalities; but such methods lack a connection between representation learning and association learning, so the common representation space contains not only the information shared across modalities but also information peculiar to a single modality, which hinders cross-modal retrieval; 2) methods of the second class fuse association learning and representation learning into a whole, but these methods still suffer from unstable retrieval performance and low retrieval precision.
Therefore, those skilled in the art are working to develop a better cross-modal retrieval method that remedies the unstable performance and low precision described above.
Summary of the invention
To address the problem of cross-modal retrieval, the present invention proposes a cross-modal retrieval method based on generative adversarial networks (Generative Adversarial Networks, GAN). It exploits the strong encoding capability of deep neural networks to build a bridge for conversion between data of different modalities, so that the deep model represents the data better and cross-modal retrieval accuracy is higher.
To achieve the above object, the present invention provides a cross-modal retrieval method based on a generative adversarial network, characterized in that the method comprises the following steps:
Step 1: extract features from the data of the input modality and the data of the target modality using a feature extraction method;
Step 2: build and train a GAN model, so that the GAN model can generate target-modality data from input-modality data;
Step 3: match the target-modality data generated by the GAN model against the corresponding-modality data obtained in step 1 for similarity, i.e., compute their Euclidean distances;
Step 4: sort the computed Euclidean distances in ascending order to obtain the cross-modal retrieval result; the smaller the Euclidean distance, the higher the similarity between a higher-ranked result and the retrieval target.
Further, the feature extraction in step 1 comprises the following steps:
Step 1.1: when text data is the input-modality data, image data is the target-modality data, and vice versa;
Step 1.2: extract features with different methods for the data of different modalities: image features are extracted by the VGG-16 or FCN method; text-modality data undergo feature extraction by the word2vec method; for both image and text data, the extracted features are represented as vectors.
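As an illustration only, a minimal sketch of step 1.2 in Python, assuming tf.keras' pretrained VGG-16 for the image features and a gensim word2vec model for the text features; the patent names the methods but fixes no implementation, and averaging word vectors is just one common way to obtain a single text vector:

```python
import numpy as np
import tensorflow as tf
from gensim.models import Word2Vec

# Image features: activations of VGG-16's fc2 layer (4096-d).
vgg = tf.keras.applications.VGG16(weights="imagenet", include_top=True)
fc2 = tf.keras.Model(vgg.input, vgg.get_layer("fc2").output)

def image_vector(path):
    img = tf.keras.preprocessing.image.load_img(path, target_size=(224, 224))
    x = tf.keras.applications.vgg16.preprocess_input(
        tf.keras.preprocessing.image.img_to_array(img)[None])
    return fc2.predict(x)[0]          # shape (4096,)

def text_vector(tokens, w2v: Word2Vec):
    # One text vector as the mean of its word2vec word vectors.
    vecs = [w2v.wv[t] for t in tokens if t in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)
```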
Further, building and training the GAN model in step 2 comprises the following steps:
Step 2.1: build the GAN network model using the TensorFlow framework;
Step 2.2: train the GAN model with the training set data to obtain the parameters of the GAN model.
Further, training the GAN model with the training set data in step 2.2 comprises the following steps:
Step 2.2.1: initialize the discriminator parameters θ_d and the generator parameters θ_g;
Step 2.2.2: train the discriminator in the GAN: feed the target-modality data set into the discriminator for training, so that the discriminator learns the semantic information of the input data;
Step 2.2.3: train the generator in the GAN: take data of one modality as input-modality data and feed it into the generator; the generator generates target-modality data from the input-modality data and sends it to the discriminator; the discriminator judges the generated target-modality data and feeds the result back to the generator;
Step 2.2.4: repeat steps 2.2.2 and 2.2.3 until the discriminator and the generator converge, obtaining the parameter set θ of the GAN model.
Further, training the discriminator in step 2.2.2 comprises the following steps:
Step 2.2.2.1: take m training samples {x^1, x^2, ..., x^m} of the input-modality data from the training set data P_data(x);
Step 2.2.2.2: take m samples {z^1, z^2, ..., z^m} of the target-modality data from the training set data P_data(x);
Step 2.2.2.3: obtain the generated data $\tilde{x}^i = G(z^i)$;
Step 2.2.2.4: update the discriminator parameters θ_d to maximize
$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m}\log D(x^{i}) + \frac{1}{m}\sum_{i=1}^{m}\log\bigl(1 - D(\tilde{x}^{i})\bigr)$$
where P_data(x) is the training set represented as vectors, comprising the input-modality data and the target-modality data, G denotes the generator's distribution, and D denotes the discriminator's output.
Further, training the generator in step 2.2.3 comprises the following steps:
Step 2.2.3.1: take, from the preset training set data P_data(x), m samples {z^1, z^2, ..., z^m} different from those in step 2.2.2.2;
Step 2.2.3.2: update the generator parameters θ_g to minimize
$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m}\log\bigl(1 - D(G(z^{i}))\bigr)$$
Further, the Euclidean distance computation in step 3 proceeds as follows: after the input-modality data enters the GAN model, target-modality data is obtained; this generated data is compared by Euclidean distance against all data of the true corresponding modality, the Euclidean distance reflecting the degree of similarity between two vectors.
Further, in an n-dimensional space, the Euclidean distance d in step 3 is computed as
$$d = \sqrt{\sum_{i=1}^{n}(t_i - y_i)^{2}}$$
where t_i and y_i are the components of the two n-dimensional vectors.
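For concreteness, the distance of step 3 transcribed directly into numpy (a plain sketch, not part of the patent text):

```python
import numpy as np

def euclidean(t, y):
    """Euclidean distance between two n-dimensional vectors."""
    t, y = np.asarray(t, dtype=float), np.asarray(y, dtype=float)
    return float(np.sqrt(((t - y) ** 2).sum()))
```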
The present invention has the following beneficial effects: it makes full use of the encoding capability of the GAN to construct a mapping bridge between data of different modalities, discarding the more complicated network structures of existing deep cross-modal retrieval models; compared with existing cross-modal retrieval models, it makes fuller use of the learning and mapping capabilities of deep neural networks and improves cross-modal retrieval accuracy.
The concept, specific structure and technical effects of the present invention are further described below with reference to the accompanying drawings, so that its object, features and effects can be fully understood.
Description of the drawings
Fig. 1 is the flowchart of the technical solution of the present invention;
Fig. 2 is the structure diagram of the generative adversarial network model in the present invention;
Fig. 3 is the flowchart of training the generative adversarial network model in the present invention;
Fig. 4 is the flowchart of an embodiment of the present invention;
Fig. 5 compares the results of cross-modal retrieval.
Detailed description of the embodiments
Several preferred embodiments of the present invention are described below with reference to the accompanying drawings, to make its technical content clearer and easier to understand. The present invention can be embodied in many different forms, and its scope of protection is not limited to the embodiments mentioned herein.
To solve the cross-modal retrieval problem, the present invention proposes a cross-modal retrieval method based on a GAN. The flowchart of the technical solution is shown in Fig. 1 and comprises the following steps:
Step 1: extract features from the data of the input modality and the data of the target modality using a feature extraction method, obtaining vector representations of the input-modality and target-modality data;
Step 2: build and train a GAN model, so that the GAN model can generate data of one modality (the target modality) from data of another modality (the input modality);
Step 3: match the target-modality data generated by the GAN model against the corresponding-modality data obtained in step 1 for similarity, i.e., compute their Euclidean distances;
Step 4: sort the computed Euclidean distances in ascending order to obtain the cross-modal retrieval result.
Fig. 2 shows the neural network structure of the GAN model in the present invention. The GAN model comprises a generator (Generator) and a discriminator (Discriminator). The generator's role is to produce, from the feature-vector representation of the input-modality data, the feature-vector representation of the corresponding target-modality data; the discriminator's role is to ensure during training that the generator can correctly produce the feature-vector representation of the target-modality data, so that the mapping between the input modality and the output modality is correct. The GAN model is described in detail below.
GAN model:
The purpose of a generative adversarial network (Generative Adversarial Network, GAN) is to generate target data from input data. Unlike an ordinary encoder, a GAN consists of two networks, a generator (Generator) and a discriminator (Discriminator); the two play a game against each other, and through this mutual adversarial process the best generation effect is reached. Guided by the discriminator's judgments, the generator is trained iteratively until the discriminator can no longer tell the generated data from the real data; at that point the generator has reached a fitted state and can serve as the bridge for converting input data into output data.
The working principle of the GAN model, illustrated with image generation:
For the distribution P_data(x) of real images, x is a real image and can be represented as a vector; the distribution of these vectors is P_data, and the task is to generate images under this distribution.
Suppose the generator's distribution is P_G(x; θ), controlled by the parameters θ (if it were a Gaussian mixture model, θ would be the mean and variance of each Gaussian component). Given real data {x^1, x^2, ..., x^m}, their likelihood under the generator's model is
$$L = \prod_{i=1}^{m} P_G(x^{i}; \theta)$$
If the generator is to produce real images with maximum probability, a θ* is needed that maximizes L.
Letting the generator produce real images with maximum probability means finding a θ that brings P_G closer to P_data. Assume here that P_G(x; θ) is a neural network. First a vector z is generated at random; the network G(z) = x generates a picture x. To compare whether z and x are similar, one can take a batch of samples of z obeying some distribution; passing them through the network produces another distribution P_G, which is then compared with the true distribution P_data.
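The link between maximizing the likelihood and bringing P_G close to P_data can be made explicit (a standard derivation, not spelled out in the patent text):

```latex
\theta^{*} = \arg\max_{\theta} \prod_{i=1}^{m} P_G(x^{i};\theta)
           = \arg\max_{\theta} \sum_{i=1}^{m} \log P_G(x^{i};\theta)
           \approx \arg\max_{\theta} \mathbb{E}_{x \sim P_{data}}\!\left[\log P_G(x;\theta)\right]
           = \arg\min_{\theta} \mathrm{KL}\!\left(P_{data} \,\|\, P_G\right)
```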
The objective function of the GAN is
$$V(G, D) = \mathbb{E}_{x \sim P_{data}}\bigl[\log D(x)\bigr] + \mathbb{E}_{x \sim P_G}\bigl[\log(1 - D(x))\bigr]$$
where G denotes the generator's distribution, D the discriminator's output, P_data the real data and P_G the generated data. With G fixed, max_D V(G, D) measures the difference between P_G and P_data; it then suffices to find the best G that minimizes max_D V, i.e., the G for which the difference between the two distributions is smallest.
First fix G and solve for the optimal D:
For a given x, the quantity to maximize over D is
$$P_{data}(x)\log D(x) + P_G(x)\log\bigl(1 - D(x)\bigr)$$
Writing a = P_data(x) and b = P_G(x), this has the form f(D) = a log D + b log(1 - D); setting f'(D) = 0 yields the optimal discriminator
$$D^{*}(x) = \frac{P_{data}(x)}{P_{data}(x) + P_G(x)}$$
Substituting the optimal D* back into V(G, D) gives
$$\max_D V(G, D) = -2\log 2 + 2\,\mathrm{JSD}\bigl(P_{data} \,\|\, P_G\bigr)$$
where JSD, the symmetric smoothed version of the KL divergence, measures the difference between two distributions. This shows that with G fixed, max_D V(G, D) expresses the difference between the two distributions; its minimum value is -2 log 2 and its maximum value is 0. When P_G(x) = P_data(x), G is optimal.
Training the GAN model:
The GAN comprises a generator G and a discriminator D, and the two networks are trained alternately. Suppose the initial generator and discriminator are G_0 and D_0: first train D_0 to find max_D V(G_0, D), then fix D_0 and start training G_0, using gradient descent; and so on, training D_1, G_1, D_2, G_2, ...
The training steps are shown in Fig. 3; the detailed steps are as follows:
1) Step 2.1: initialize the discriminator parameters θ_d and the generator parameters θ_g;
2) Step 2.2: train the discriminator;
3) Step 2.3: train the generator;
4) Step 2.4: alternate steps 2) and 3) until the algorithm converges.
In step 2.2, training the discriminator comprises the following steps:
1) take m training samples {x^1, x^2, ..., x^m} from the data P_data(x);
2) take m samples {z^1, z^2, ..., z^m} from the preset random vector distribution P_prior(z);
3) obtain the generated data $\tilde{x}^i = G(z^i)$;
4) update the discriminator parameters θ_d by gradient ascent to maximize
$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m}\log D(x^{i}) + \frac{1}{m}\sum_{i=1}^{m}\log\bigl(1 - D(\tilde{x}^{i})\bigr)$$
In step 2.3, training the generator comprises the following steps:
1) take, from the preset random vector distribution P_prior(z), m samples {z^1, z^2, ..., z^m} different from those in step 2);
2) update the generator parameters θ_g by gradient descent to minimize
$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m}\log\bigl(1 - D(G(z^{i}))\bigr)$$
The parameter set θ of the GAN model is obtained by the above method.
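A minimal sketch of this alternating loop in TensorFlow 2. The patent fixes only the update rules, so the layer sizes, optimizers and the two-layer placeholder networks below are assumptions; the generator step is written in the equivalent cross-entropy (non-saturating) form rather than the raw log(1 - D(G(z))):

```python
import tensorflow as tf

def mlp(out_dim):
    # Placeholder two-layer network; the patent does not fix an architecture.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dense(out_dim),
    ])

generator = mlp(300)      # input-modality vector -> target-modality vector (300-d assumed)
discriminator = mlp(1)    # target-modality vector -> real/fake logit
d_opt = tf.keras.optimizers.Adam(1e-4)
g_opt = tf.keras.optimizers.Adam(1e-4)
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

@tf.function
def train_step(x_real, z_in):
    """One alternation; x_real are target-modality vectors {x^i},
    z_in are input-modality vectors {z^i} (same batch size m)."""
    m = tf.shape(x_real)[0]
    # Discriminator: ascend (1/m) sum[log D(x) + log(1 - D(G(z)))],
    # implemented by descending the equivalent cross-entropy loss.
    with tf.GradientTape() as tape:
        fake = generator(z_in)
        d_loss = bce(tf.ones((m, 1)), discriminator(x_real)) \
               + bce(tf.zeros((m, 1)), discriminator(fake))
    d_opt.apply_gradients(zip(tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    # Generator: push D to label G(z) as real (non-saturating form of the update).
    with tf.GradientTape() as tape:
        g_loss = bce(tf.ones((m, 1)), discriminator(generator(z_in)))
    g_opt.apply_gradients(zip(tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    return d_loss, g_loss
```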
To make the object and technical solution of the present invention clearer, the present invention is described in further detail below with reference to the drawings and a specific embodiment.
Embodiment:
Suppose there are m pairs of text and image data with known correspondence, i.e., the training set, and n text items and n image items with unknown correspondence, i.e., the test set. Retrieving text by image is taken as the example: the retrieval target is an image s in the test set, and the search library comprises k retrieval members from the test set, the members being text-modality data. As shown in Fig. 4, the method comprises the following 4 steps:
1) Step 401: extract features from the text and image data in the training set and the test set using feature extraction methods: for text data, vector representations can be obtained with methods such as word2vec; for image data, features can be extracted with methods such as VGG-16 or FCN to obtain vector representations. This step yields the feature vectors of the m pairs of different-modality data with known correspondence, and n feature vectors each for the text- and image-modality data with unknown correspondence;
2) Step 402: train the GAN model with the feature vectors of the m pairs of different-modality data with known correspondence in the training set. After this step, the GAN can generate text- or image-modality data of approximately the same semantics from the input image- or text-modality data.
The specific steps for training the GAN model in this step are:
1) initialize the discriminator parameters θ_d and the generator parameters θ_g;
2) train the discriminator;
3) train the generator;
4) alternate steps 2) and 3) until the algorithm converges.
In step 2), training the discriminator comprises the following steps:
1. take m text-modality training samples {x^1, x^2, ..., x^m} from the training set;
2. take m image-modality samples {z^1, z^2, ..., z^m} from the training set;
3. obtain the generated data $\tilde{x}^i = G(z^i)$;
4. update the discriminator parameters θ_d by gradient ascent to maximize
$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m}\log D(x^{i}) + \frac{1}{m}\sum_{i=1}^{m}\log\bigl(1 - D(\tilde{x}^{i})\bigr)$$
In step 3), training the generator comprises the following steps:
1. take, from the training set, m image-modality samples {z^1, z^2, ..., z^m} different from those in step 2);
2. update the generator parameters θ_g by gradient descent to minimize
$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m}\log\bigl(1 - D(G(z^{i}))\bigr)$$
The parameter set θ of the GAN model is obtained by the above method.
3) Step 403: represent the image s to be retrieved by its vector obtained in step 401, and feed this vector representation into the trained GAN model; the GAN model generates the target-modality vector representation of s, i.e., a text vector representation s' carrying the same semantics as s;
4) Step 404: compute the Euclidean distance between the generated s' and the vector representation of each member of the target-modality data corresponding to s, i.e., each of the k text-modality retrieval members, and produce the result list in ascending order of Euclidean distance.
In this step, the Euclidean distance d_i is computed as
$$d_i = \sqrt{\sum_{j=1}^{n}\bigl(s'_j - k_{i,j}\bigr)^{2}}$$
where s' is the generated vector representation of the target image s to be retrieved, k_i denotes the i-th of the k text-modality retrieval members, and d_i denotes the Euclidean distance between s' and k_i. Computing the Euclidean distance between s' and each k_i yields the d_i; arranging the d_i in ascending order together with the corresponding k_i gives the retrieval result list.
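Step 404 under the same conventions, as a short numpy sketch (the k members stacked row-wise into a (k, n) array; illustrative only):

```python
import numpy as np

def rank_members(s_prime, members):
    """Indices of the k text-modality retrieval members, ordered from
    smallest to largest Euclidean distance to the generated vector s'."""
    d = np.sqrt(((members - s_prime) ** 2).sum(axis=1))
    return np.argsort(d)    # ascending distance = descending similarity
```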
As shown in Fig. 5, the cross-modal retrieval results of the GAN model are compared with those of existing cross-modal retrieval methods. The evaluation metric is mAP (mean Average Precision), a common standard for measuring the quality of information retrieval results: for a specified query, the top R results are returned, and the average precision is computed as
$$AP = \frac{1}{M}\sum_{r=1}^{R} p(r)\,\mathrm{rel}(r)$$
where M is the number of relevant results retrieved for an image s, p(r) is the precision at position r, and rel(r) is the relevance of the result at position r to the image s (1 for maximally relevant, 0 for irrelevant); the criterion is whether s and the result at position r have the same semantics. mAP averages AP over all queries. In the present invention, the top 50 retrieval results are returned (R = 50).
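The AP/mAP computation, transcribed as a sketch (rel is the 0/1 relevance of each returned result; the helper names are illustrative, not from the patent):

```python
import numpy as np

def average_precision(rel, R=50):
    """AP = (1/M) * sum_{r<=R} p(r) * rel(r), with M the number of
    relevant results among the top R."""
    rel = np.asarray(rel[:R], dtype=float)
    if rel.sum() == 0.0:
        return 0.0
    p_at_r = np.cumsum(rel) / (np.arange(rel.size) + 1)  # precision at rank r
    return float((p_at_r * rel).sum() / rel.sum())

def mean_average_precision(rel_lists, R=50):
    return float(np.mean([average_precision(r, R) for r in rel_lists]))
```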
In Fig. 5, i-t is indicated by image retrieval text, and t-i indicates that, by text retrieval image, AVG is indicated by image retrieval text This and by text retrieval image average mAP value;From fig. 5, it can be seen that method of the invention in Wikipedia data set and Retrieval precision in NUS-WIDE-10K data set is above other methods;Embody the cross-module state retrieval side based on GAN model Method more accurately learns the semantic relation arrived between different modalities, and cross-module state retrieval accuracy is higher.
The preferred embodiments of the present invention have been described in detail above. It should be understood that a person of ordinary skill in the art could make many modifications and variations according to the concept of the present invention without inventive effort. Therefore, any technical solution that a person skilled in the art can obtain from the prior art through logical analysis, reasoning or limited experimentation on the basis of the concept of the present invention shall fall within the scope of protection determined by the claims.

Claims (8)

1. A cross-modal retrieval method based on a generative adversarial network, characterized in that the method comprises the following steps:
Step 1: extract features from the data of the input modality and the data of the target modality using a feature extraction method;
Step 2: build and train a GAN model, so that the GAN model can generate target-modality data from input-modality data;
Step 3: match the target-modality data generated by the GAN model against the corresponding-modality data obtained in step 1 for similarity, i.e., compute their Euclidean distances;
Step 4: sort the computed Euclidean distances in ascending order to obtain the cross-modal retrieval result; the smaller the Euclidean distance, the higher the similarity between a higher-ranked result and the retrieval target.
2. The cross-modal retrieval method based on a generative adversarial network according to claim 1, characterized in that the feature extraction in step 1 comprises the following steps:
Step 1.1: when text data is the input-modality data, image data is the target-modality data, and vice versa;
Step 1.2: extract features with different methods for the data of different modalities: image features are extracted by the VGG-16 or FCN method; text-modality data undergo feature extraction by the word2vec method; for both image and text data, the extracted features are represented as vectors.
3. The cross-modal retrieval method based on a generative adversarial network according to claim 1, characterized in that building and training the GAN model in step 2 comprises the following steps:
Step 2.1: build the GAN network model using the TensorFlow framework;
Step 2.2: train the GAN model with the training set data to obtain the parameters of the GAN model.
4. The cross-modal retrieval method based on a generative adversarial network according to claim 3, characterized in that training the GAN model with the training set data in step 2.2 comprises the following steps:
Step 2.2.1: initialize the discriminator parameters θ_d and the generator parameters θ_g;
Step 2.2.2: train the discriminator in the GAN: feed the target-modality data set into the discriminator for training, so that the discriminator learns the semantic information of the input data;
Step 2.2.3: train the generator in the GAN: take data of one modality as input-modality data and feed it into the generator; the generator generates target-modality data from the input-modality data and sends it to the discriminator; the discriminator judges the generated target-modality data and feeds the result back to the generator;
Step 2.2.4: repeat steps 2.2.2 and 2.2.3 until the discriminator and the generator converge, obtaining the parameter set θ of the GAN model.
5. The cross-modal retrieval method based on a generative adversarial network according to claim 4, characterized in that training the discriminator in step 2.2.2 comprises the following steps:
Step 2.2.2.1: take m training samples {x^1, x^2, ..., x^m} of the input-modality data from the training set data P_data(x);
Step 2.2.2.2: take m samples {z^1, z^2, ..., z^m} of the target-modality data from the training set data P_data(x);
Step 2.2.2.3: obtain the generated data $\tilde{x}^i = G(z^i)$;
Step 2.2.2.4: update the discriminator parameters θ_d to maximize
$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m}\log D(x^{i}) + \frac{1}{m}\sum_{i=1}^{m}\log\bigl(1 - D(\tilde{x}^{i})\bigr)$$
where P_data(x) is the training set represented as vectors, comprising the input-modality data and the target-modality data, G denotes the generator's distribution, and D denotes the discriminator's output.
6. The cross-modal retrieval method based on a generative adversarial network according to claim 4, characterized in that training the generator in step 2.2.3 comprises the following steps:
Step 2.2.3.1: take, from the preset training set data P_data(x), m samples {z^1, z^2, ..., z^m} different from those in step 2.2.2.2;
Step 2.2.3.2: update the generator parameters θ_g to minimize
$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m}\log\bigl(1 - D(G(z^{i}))\bigr)$$
7. The cross-modal retrieval method based on a generative adversarial network according to claim 1, characterized in that the Euclidean distance computation in step 3 proceeds as follows: after the input-modality data enters the GAN model, target-modality data is obtained; this generated data is compared by Euclidean distance against all data of the true corresponding modality, the Euclidean distance reflecting the degree of similarity between two vectors.
8. The cross-modal retrieval method based on a generative adversarial network according to claim 1, characterized in that, in an n-dimensional space, the Euclidean distance d in step 3 is computed as
$$d = \sqrt{\sum_{i=1}^{n}(t_i - y_i)^{2}}$$
where t_i and y_i are the components of the two n-dimensional vectors.
CN201810871910.5A 2018-08-02 2018-08-02 Cross-modal retrieval method based on generative adversarial network Active CN109213876B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810871910.5A CN109213876B (en) 2018-08-02 2018-08-02 Cross-modal retrieval method based on generative adversarial network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810871910.5A CN109213876B (en) 2018-08-02 2018-08-02 Cross-modal retrieval method based on generative adversarial network

Publications (2)

Publication Number Publication Date
CN109213876A true CN109213876A (en) 2019-01-15
CN109213876B CN109213876B (en) 2022-12-02

Family

ID=64988109

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810871910.5A Active CN109213876B (en) 2018-08-02 2018-08-02 Cross-modal retrieval method based on generative adversarial network

Country Status (1)

Country Link
CN (1) CN109213876B (en)



Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629275A (en) * 2012-03-21 2012-08-08 复旦大学 Face-name alignment method and system for cross-media news retrieval
CN102663447A (en) * 2012-04-28 2012-09-12 中国科学院自动化研究所 Cross-media retrieval method based on discriminative correlation analysis
CN107832351A (en) * 2017-10-21 2018-03-23 桂林电子科技大学 Cross-modal retrieval method based on deep correlation network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIUXIANG GU et al.: "Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition *
YUXIN PENG et al.: "CM-GANs: Cross-modal Generative Adversarial Networks for Common Representation Learning", https://arxiv.org/abs/1710.05106v2 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783644A (en) * 2019-01-18 2019-05-21 福州大学 Cross-domain sentiment classification system and method based on text representation learning
CN111985243A (en) * 2019-05-23 2020-11-24 中移(苏州)软件技术有限公司 Emotion model training method, emotion analysis device and storage medium
CN111985243B (en) * 2019-05-23 2023-09-08 中移(苏州)软件技术有限公司 Emotion model training method, emotion analysis device and storage medium
CN110502743A (en) * 2019-07-12 2019-11-26 北京邮电大学 Cross-media retrieval method for social networks based on adversarial learning and semantic similarity
CN112487217A (en) * 2019-09-12 2021-03-12 腾讯科技(深圳)有限公司 Cross-modal retrieval method, device, equipment and computer-readable storage medium
CN110909181A (en) * 2019-09-30 2020-03-24 中国海洋大学 Cross-modal retrieval method and system for multi-type ocean data
CN110827232B (en) * 2019-11-14 2022-07-15 四川大学 Cross-modal MRI (magnetic resonance imaging) synthesis method based on morphological-feature GAN
CN110827232A (en) * 2019-11-14 2020-02-21 四川大学 Cross-modal MRI (magnetic resonance imaging) synthesis method based on morphological-feature GAN
CN111179207B (en) * 2019-12-05 2022-04-08 浙江工业大学 Cross-modal medical image synthesis method based on parallel generation network
CN111179207A (en) * 2019-12-05 2020-05-19 浙江工业大学 Cross-modal medical image synthesis method based on parallel generation network
CN111782921A (en) * 2020-03-25 2020-10-16 北京沃东天骏信息技术有限公司 Method and device for searching target
CN111861949A (en) * 2020-04-21 2020-10-30 北京联合大学 Multi-exposure image fusion method and system based on generation countermeasure network
CN111861949B (en) * 2020-04-21 2023-07-04 北京联合大学 Multi-exposure image fusion method and system based on generation countermeasure network
CN113420166A (en) * 2021-03-26 2021-09-21 阿里巴巴新加坡控股有限公司 Commodity mounting, retrieving, recommending and training processing method and device and electronic equipment
CN113435206A (en) * 2021-05-26 2021-09-24 卓尔智联(武汉)研究院有限公司 Image-text retrieval method and device and electronic equipment
CN113435206B (en) * 2021-05-26 2023-08-01 卓尔智联(武汉)研究院有限公司 Image-text retrieval method and device and electronic equipment
CN117390210A (en) * 2023-12-07 2024-01-12 山东建筑大学 Building indoor positioning method, positioning system, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN109213876B (en) 2022-12-02

Similar Documents

Publication Publication Date Title
CN109213876A (en) Based on the cross-module state search method for generating confrontation network
CN110147457B (en) Image-text matching method, device, storage medium and equipment
Wu et al. Cascaded fully convolutional networks for automatic prenatal ultrasound image segmentation
Alfarisy et al. Deep learning based classification for paddy pests & diseases recognition
Huang et al. Instance-aware image and sentence matching with selective multimodal lstm
CN108509463B (en) Question response method and device
CN106909924B (en) Remote sensing image rapid retrieval method based on depth significance
CN112384948A (en) Generating countermeasure networks for image segmentation
Ahishali et al. Advance warning methodologies for covid-19 using chest x-ray images
KR102265573B1 (en) Method and system for reconstructing mathematics learning curriculum based on artificial intelligence
WO2018196718A1 (en) Image disambiguation method and device, storage medium, and electronic device
EP3968337A1 (en) Target object attribute prediction method based on machine learning and related device
Bu Human motion gesture recognition algorithm in video based on convolutional neural features of training images
JP2018022496A (en) Method and equipment for creating training data to be used for natural language processing device
CN110163130B (en) Feature pre-alignment random forest classification system and method for gesture recognition
Zhou et al. Learn fine-grained adaptive loss for multiple anatomical landmark detection in medical images
Zhou et al. Model uncertainty guides visual object tracking
Heidler et al. A deep active contour model for delineating glacier calving fronts
CN108804470B (en) Image retrieval method and device
CN116934747A (en) Fundus image segmentation model training method, fundus image segmentation model training equipment and glaucoma auxiliary diagnosis system
CN116416334A (en) Scene graph generation method of embedded network based on prototype
Zachmann et al. Random forests for tracking on ultrasonic images
Rochmawati et al. Brain tumor classification using transfer learning
Wibowo Performances of Chimpanzee Leader Election Optimization and K-Means in Multilevel Color Image Segmentation
Wu et al. Loop Closure Detection for Visual SLAM Based on SuperPoint Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant