CN109213876A - Cross-modal retrieval method based on a generative adversarial network - Google Patents
Cross-modal retrieval method based on a generative adversarial network Download PDF Info
- Publication number
- CN109213876A (application number CN201810871910.5A / CN201810871910A)
- Authority
- CN
- China
- Prior art keywords
- data
- cross-modal
- discriminator
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a cross-modal retrieval method based on a generative adversarial network (GAN), in the technical field of multimedia data retrieval. The method comprises the following steps: step 1, extract features from the data of the input modality and the data of the target modality using feature extraction methods; step 2, build and train a GAN model, so that the GAN model can generate target-modality data from input-modality data; step 3, match the target-modality data generated by the GAN model against the corresponding-modality data obtained in step 1 by computing Euclidean distances; step 4, sort the computed Euclidean distances in ascending order to obtain the cross-modal retrieval result; the smaller the Euclidean distance, the higher the similarity between the higher-ranked result and the retrieval target. Compared with existing cross-modal retrieval models, the present invention makes fuller use of the learning and mapping ability of deep neural networks and improves cross-modal retrieval accuracy.
Description
Technical field
The present invention relates to the technical field of multimedia data retrieval, and more particularly to a cross-modal retrieval method based on a generative adversarial network.
Background technique
With the development of Internet technology, media data carrying the same semantics increasingly appear in multiple modalities at the same time, for example a news report and its accompanying photos, or a patient's diagnostic notes and medical images. People now need to retrieve one modality by means of another, rather than retrieving within a single modality. For example, a user may submit a photo to a retrieval system and obtain text information related to that photo; a patient may submit an X-ray image and obtain diagnostic text that matches the X-ray. This method of retrieving data of one modality using data of another modality is called cross-modal retrieval.
Traditional cross-modal retrieval methods, such as retrieving images by text, in principle rely on the textual annotations attached to the images, and are therefore essentially single-modality retrieval. However, the Internet develops rapidly, and large volumes of images and text emerge continuously, which makes annotating images time-consuming and laborious; moreover, manual annotation often fails to fully express the content of an image, which degrades the retrieval results. Deep learning has meanwhile achieved good results in processing both text and images, which opens a path for new cross-modal retrieval techniques built on deep learning.
Current deep-learning-based algorithms can be divided into two classes: 1) the first class abstracts each modality's data separately and then maps the abstracted results into a common representation space, thereby establishing associations between the modalities; however, such methods lack a connection between representation learning and association learning, so the common representation space contains both the information shared across modalities and the information specific to a single modality, which is unfavorable for cross-modal retrieval; 2) the second class fuses association learning and representation learning into a whole, but current methods of this class still suffer from unstable retrieval performance and limited retrieval precision.
Therefore, those skilled in the art are dedicated to developing a better cross-modal retrieval method that alleviates the above problems of unstable retrieval performance and limited retrieval precision.
Summary of the invention
Aiming at the problem of cross-modal retrieval, the present invention proposes a cross-modal retrieval method based on generative adversarial networks (Generative Adversarial Networks, GAN). It uses the strong encoding capability of deep neural networks to build a bridge that converts between data of different modalities, so that the deep model has better representational power and higher cross-modal retrieval accuracy.
To achieve the above object, the present invention provides a cross-modal retrieval method based on a generative adversarial network, characterized in that the method comprises the following steps:
Step 1: extract features from the data of the input modality and the data of the target modality using feature extraction methods;
Step 2: build and train a GAN model, so that the GAN model can generate target-modality data from input-modality data;
Step 3: match the target-modality data generated by the GAN model against the corresponding-modality data obtained in step 1 by computing Euclidean distances;
Step 4: sort the computed Euclidean distances in ascending order to obtain the cross-modal retrieval result; the smaller the Euclidean distance, the higher the similarity between the higher-ranked result and the retrieval target.
Further, the feature extraction in step 1 comprises the following steps:
Step 1.1: when text data is the input-modality data, image data is the target-modality data, and vice versa;
Step 1.2: extract features with methods suited to each modality: image features are extracted with VGG-16 or FCN; text-modality features are extracted with word2vec; for both image and text data, the extracted features are represented as vectors.
Further, building and training the GAN model in step 2 comprises the following steps:
Step 2.1: build the GAN network model based on the Tensorflow framework;
Step 2.2: train the GAN model with the training-set data to obtain the parameters of the GAN model.
Further, training the GAN model with the training-set data in step 2.2 comprises the following steps:
Step 2.2.1: initialize the discriminator parameters θ_d and the generator parameters θ_g;
Step 2.2.2: train the discriminator in the GAN: feed the target-modality data set into the discriminator for training, so that the discriminator learns the semantic information of the input data;
Step 2.2.3: train the generator in the GAN: take data of one modality as the input-modality data and feed it into the generator; the generator produces target-modality data from the input-modality data and sends it to the discriminator; the discriminator judges the generated target-modality data and feeds the result back to the generator;
Step 2.2.4: repeat step 2.2.2 and step 2.2.3 until the discriminator and the generator converge, obtaining the parameter set θ of the GAN model.
Further, training the discriminator in step 2.2.2 comprises the following steps:
Step 2.2.2.1: take m training samples of the input-modality data {x_1, x_2, ..., x_m} from the training set P_data(x);
Step 2.2.2.2: take m samples of the target-modality data {z_1, z_2, ..., z_m} from the training set P_data(x);
Step 2.2.2.3: obtain the generated data x̃_i = G(z_i), i = 1, ..., m;
Step 2.2.2.4: update the discriminator parameters θ_d by gradient ascent to maximize:
(1/m) Σ_{i=1}^{m} [log D(x_i) + log(1 − D(x̃_i))]
where P_data(x) is the training set represented as vectors, containing both input-modality and target-modality data, G denotes the generator distribution, and D denotes the discriminator output.
Further, training the generator in step 2.2.3 comprises the following steps:
Step 2.2.3.1: take m samples {z_1, z_2, ..., z_m}, different from those in step 2.2.2.2, from the pre-set training set P_data(x);
Step 2.2.3.2: update the generator parameters θ_g by gradient descent to minimize:
(1/m) Σ_{i=1}^{m} log(1 − D(G(z_i)))
Further, the Euclidean distance computation in step 3 is as follows: after the input-modality data passes through the GAN model, target-modality data is obtained; this generated data is compared against all data of the true corresponding modality by computing Euclidean distances, and the Euclidean distance reflects the degree of similarity between two vectors.
Further, in an n-dimensional space, the Euclidean distance d in step 3 is computed as:
d = sqrt( Σ_{i=1}^{n} (t_i − y_i)^2 )
where t_i and y_i are the components of the two n-dimensional vectors.
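As a quick sanity check, the distance formula above can be computed directly; `euclidean_distance` is an illustrative helper, not part of the patented method:

```python
import numpy as np

def euclidean_distance(t, y):
    """d = sqrt(sum_i (t_i - y_i)^2) for two n-dimensional vectors."""
    t, y = np.asarray(t, dtype=float), np.asarray(y, dtype=float)
    return float(np.sqrt(np.sum((t - y) ** 2)))

d = euclidean_distance([0.0, 3.0], [4.0, 0.0])  # classic 3-4-5 triangle: d = 5.0
```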
The invention has the following advantages: it makes full use of the encoding capability of GANs to build a mapping bridge between data of different modalities, and it avoids the more complicated network structures of existing deep cross-modal retrieval models; compared with existing cross-modal retrieval models, it makes fuller use of the learning and mapping ability of deep neural networks and improves cross-modal retrieval accuracy.
The concept, specific structure, and technical effects of the present invention are further described below with reference to the accompanying drawings, so that the purpose, features, and effects of the present invention can be fully understood.
Detailed description of the invention
Fig. 1 is the flow chart of the technical solution of the present invention;
Fig. 2 is the structure diagram of the generative adversarial network model in the present invention;
Fig. 3 is the flow chart of training the generative adversarial network model in the present invention;
Fig. 4 is the flow chart of the embodiment of the present invention;
Fig. 5 is the comparison chart of cross-modal retrieval results.
Specific embodiment
Several preferred embodiments of the invention are introduced below with reference to the accompanying drawings to make the technical content clearer and easier to understand. The present invention can be embodied in many different forms, and its protection scope is not limited to the embodiments mentioned herein.
To solve the cross-modal retrieval problem, the present invention proposes a cross-modal retrieval method based on GAN. The flow chart of the technical solution is shown in Fig. 1 and comprises the following steps:
Step 1: extract features from the data of the input modality and the data of the target modality using feature extraction methods, obtaining vector representations of the input-modality and target-modality data;
Step 2: build and train a GAN model, so that the GAN model can generate data of one modality (the target modality) from data of another modality (the input modality);
Step 3: match the target-modality data generated by the GAN model against the corresponding-modality data obtained in step 1 by computing Euclidean distances;
Step 4: sort the computed Euclidean distances in ascending order to obtain the cross-modal retrieval result.
Fig. 2 is the neural network structure diagram of the GAN model in the present invention. The GAN model consists of a generator and a discriminator. The generator produces the feature-vector representation of the corresponding target-modality data from the feature-vector representation of the input-modality data; the discriminator ensures during training that the generator learns to produce correct target-modality feature vectors, guaranteeing a correct mapping between the input modality and the output modality. The GAN model is described in detail below.
GAN model:
The purpose of a generative adversarial network (Generative Adversarial Network, GAN) is to generate target data from input data. Unlike an ordinary encoder, a GAN consists of two networks, a generator and a discriminator; the two networks play a game against each other to reach the best generation effect. Through the discriminator's judgments, the generator is trained iteratively until the discriminator can no longer distinguish generated data from real data; at that point the generator has reached its fitted state and can serve as the bridge that converts input data into output data.
The working principle of the GAN model is illustrated with image generation:
For the distribution of real images P_data(x), x is a real image that can be represented as a vector, and the distribution of these vectors is P_data; the task is to generate images under this distribution.
Suppose the generator's distribution is P_G(x; θ), controlled by the parameters θ (if it is a Gaussian mixture model, θ is the mean and variance of each Gaussian component). Given real data {x_1, x_2, ..., x_m}, the likelihood of these data under the generator's model is
L = Π_{i=1}^{m} P_G(x_i; θ)
If we want the generator to produce real images with maximum probability, we need a θ* that maximizes L.
Making the generator produce real images with maximum probability means finding a θ that brings P_G closer to P_data. Here P_G(x; θ) is assumed to be a neural network. First a vector z is sampled at random, and the network G generates an image x = G(z); to compare whether z and x are similar, one takes a set of samples of z that obey a known distribution, passes them through the network to obtain another distribution P_G, and then compares it with the true distribution P_data.
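The maximum-likelihood idea above can be checked numerically. A unit-variance Gaussian with unknown mean stands in for P_G(x; θ); the θ that maximizes the likelihood of the sampled data should coincide with the sample mean, the classic closed-form MLE. The function names and the grid search are illustrative assumptions:

```python
import numpy as np

# Log-likelihood of data under a unit-variance Gaussian model P_G(x; theta),
# where theta is the mean -- a stand-in for the generator's density.
def log_likelihood(theta, xs):
    return float(np.sum(-0.5 * (xs - theta) ** 2 - 0.5 * np.log(2 * np.pi)))

rng = np.random.default_rng(0)
xs = rng.normal(3.0, 1.0, size=1000)          # "real" data centred near 3

# Scan candidate theta values; the maximizer should sit at the sample mean.
grid = np.linspace(0.0, 6.0, 601)
best = grid[np.argmax([log_likelihood(t, xs) for t in grid])]
```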
The objective function of the GAN is as follows:
min_G max_D V(G, D) = E_{x~P_data}[log D(x)] + E_{x~P_G}[log(1 − D(x))]
where G denotes the generator distribution, D denotes the discriminator output, P_data is the real data, and P_G is the generated data. With G fixed, max_D V(G, D) measures the difference between P_G and P_data; one then only needs to find the best G that minimizes max_D V, i.e. minimizes the difference between the two distributions.
First fix G and solve for the optimal D: for a given x, maximize
P_data(x) log D(x) + P_G(x) log(1 − D(x))
Writing f(D) = a log(D) + b log(1 − D) with a = P_data(x) and b = P_G(x), setting f′(D) = 0 gives the optimal discriminator
D*(x) = P_data(x) / (P_data(x) + P_G(x))
Substituting the optimal D* into V(G, D) yields
max_D V(G, D) = −2 log 2 + 2·JSD(P_data ‖ P_G)
where JSD, the Jensen–Shannon divergence, is the symmetrized and smoothed version of the KL divergence and measures the difference between the two distributions. This shows that for a fixed G, max_D V(G, D) expresses the difference between the two distributions, with minimum value −2 log 2 and maximum value 0. When P_G(x) = P_data(x), G is optimal.
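The optimal-discriminator formula and the identity max_D V = −2 log 2 + 2·JSD can be verified numerically on a pair of small discrete distributions; the two distributions chosen here are arbitrary illustrations:

```python
import numpy as np

# Two discrete distributions over the same support, standing in for
# P_data and P_G.
p_data = np.array([0.5, 0.3, 0.2])
p_g    = np.array([0.2, 0.3, 0.5])

# Optimal discriminator from the derivation: D*(x) = P_data / (P_data + P_G).
d_star = p_data / (p_data + p_g)

# Value of V(G, D*) = sum_x [P_data log D* + P_G log(1 - D*)].
v_star = float(np.sum(p_data * np.log(d_star) + p_g * np.log(1 - d_star)))

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

# JSD(P || Q) = 0.5 KL(P || M) + 0.5 KL(Q || M), with M the mixture.
m = 0.5 * (p_data + p_g)
jsd = 0.5 * kl(p_data, m) + 0.5 * kl(p_g, m)

identity = -2 * np.log(2) + 2 * jsd          # should equal v_star
```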
Training of the GAN model:
The GAN network consists of a generator G and a discriminator D, and the two networks are trained alternately. Suppose the initial generator and discriminator are G_0 and D_0: first train D_0 to find max_D V(G_0, D), then fix D_0 and train G_0 using gradient descent; repeating in this manner yields D_1, G_1, D_2, G_2, ...
The training steps are shown in Fig. 3, and in detail are as follows:
1) Step 2.1: initialize the discriminator and generator parameters θ_d and θ_g;
2) Step 2.2: train the discriminator;
3) Step 2.3: train the generator;
4) Step 2.4: alternate step 2) and step 3) until the algorithm converges.
In step 2.2, training the discriminator comprises the following steps:
1) take m training samples {x_1, x_2, ..., x_m} from the data P_data(x);
2) take m samples {z_1, z_2, ..., z_m} from the pre-set random vector distribution P_prior(z);
3) obtain the generated data x̃_i = G(z_i), i = 1, ..., m;
4) update the discriminator parameters θ_d by gradient ascent to maximize:
(1/m) Σ_{i=1}^{m} [log D(x_i) + log(1 − D(x̃_i))]
In step 2.3, training the generator comprises the following steps:
1) take m samples {z_1, z_2, ..., z_m}, different from those in step 2), from the pre-set random vector distribution P_prior(z);
2) update the generator parameters θ_g by gradient descent to minimize:
(1/m) Σ_{i=1}^{m} log(1 − D(G(z_i)))
The parameter set θ of the GAN model can be obtained by the above method.
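The alternating updates above can be sketched in a toy 1-D setting with a linear generator and a logistic discriminator; every parameter value, learning rate, and distribution here is an illustrative assumption, not the patent's TensorFlow implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda u: 1.0 / (1.0 + np.exp(-u))

# Toy 1-D GAN: generator G(z) = wg*z + bg, discriminator D(x) = sigmoid(wd*x + bd).
wg, bg = 1.0, 0.0          # theta_g
wd, bd = 0.1, 0.0          # theta_d

def d_objective(xs, zs):
    """(1/m) sum [log D(x_i) + log(1 - D(G(z_i)))] -- ascended by D."""
    fake = wg * zs + bg
    return np.mean(np.log(sigmoid(wd * xs + bd))
                   + np.log(1.0 - sigmoid(wd * fake + bd)))

def d_step(xs, zs, lr=1e-2):
    """One gradient-ascent step on the discriminator objective."""
    global wd, bd
    fake = wg * zs + bg
    dr, df = sigmoid(wd * xs + bd), sigmoid(wd * fake + bd)
    grad_wd = np.mean((1 - dr) * xs) - np.mean(df * fake)
    grad_bd = np.mean(1 - dr) - np.mean(df)
    wd += lr * grad_wd
    bd += lr * grad_bd

def g_step(zs, lr=1e-2):
    """One gradient-descent step on (1/m) sum log(1 - D(G(z_i)))."""
    global wg, bg
    fake = wg * zs + bg
    df = sigmoid(wd * fake + bd)
    grad_wg = np.mean(-df * wd * zs)
    grad_bg = np.mean(-df * wd)
    wg -= lr * grad_wg
    bg -= lr * grad_bg

xs = rng.normal(3.0, 1.0, 256)   # "real" target-modality features
zs = rng.normal(0.0, 1.0, 256)   # input noise / input-modality codes
before = d_objective(xs, zs)
d_step(xs, zs)                   # step 2.2: train discriminator
after = d_objective(xs, zs)
g_step(zs)                       # step 2.3: train generator
```

Because the discriminator objective is concave in (wd, bd), a sufficiently small ascent step is guaranteed to increase it on the same batch; in the patented method both networks are deep and trained with stochastic gradients in TensorFlow.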
For the purpose, technical solution, and advantages of the present invention to be understood more clearly, the present invention is further described in detail below with reference to the accompanying drawings and a specific embodiment.
Embodiment:
Suppose there are m pairs of text and image data with known correspondence, i.e. the training data set, and n text items and n image items with unknown correspondence, i.e. the test data set. Retrieval of text by image is taken as the example: the retrieval target is some image s in the test data set, and the search library contains k retrieval members from the test set, each of which is text-modality data. As shown in Fig. 4, the embodiment comprises the following 4 steps:
1) Step 401: extract features from the text and image data in the training set and test set; for text data, vector representations can be obtained with methods such as word2vec; for image data, features can be extracted with methods such as VGG16 or FCN to obtain vector representations. Through this step, the feature vectors of the m pairs of different-modality data with known correspondence are obtained, as well as n feature vectors each for the text and image modality data with unknown correspondence;
2) Step 402: train the GAN model with the feature vectors of the m pairs of different-modality data with known correspondence in the training set; after this step, the GAN can generate approximately semantically matching text-modality data from input image-modality data, or image-modality data from input text-modality data.
The specific steps for training the GAN model in this step are:
1) initialize the discriminator and generator parameters θ_d and θ_g;
2) train the discriminator;
3) train the generator;
4) alternate step 2) and step 3) until the algorithm converges.
In step 2), training the discriminator comprises the following steps:
1. take m text-modality training samples {x_1, x_2, ..., x_m} from the training set;
2. take m image-modality samples {z_1, z_2, ..., z_m} from the training set;
3. obtain the generated data x̃_i = G(z_i), i = 1, ..., m;
4. update the discriminator parameters θ_d by gradient ascent to maximize:
(1/m) Σ_{i=1}^{m} [log D(x_i) + log(1 − D(x̃_i))]
In step 3), training the generator comprises the following steps:
1. take m image-modality samples {z_1, z_2, ..., z_m}, different from those in step 2), from the training set;
2. update the generator parameters θ_g by gradient descent to minimize:
(1/m) Σ_{i=1}^{m} log(1 − D(G(z_i)))
The parameter set θ of the GAN model can be obtained by the above method.
3) Step 403: represent the image s to be retrieved by the vector of s obtained in step 401, and feed this vector representation into the trained GAN model; the GAN model generates the target-modality vector representation of s, i.e. a text vector representation s′ carrying the same semantics as s;
4) Step 404: compute the Euclidean distance between the generated vector s′ and the vector representation of each member of the target-modality data, i.e. each of the k text-modality retrieval members, and generate the result list in ascending order of Euclidean distance.
In this step, the Euclidean distance d_i is computed as:
d_i = sqrt( Σ_{j=1}^{n} (s′_j − k_{i,j})^2 )
where s′ is the vector representation of the target image s to be retrieved, k_i denotes the i-th of the k text-modality retrieval members, and d_i is the Euclidean distance between s′ and k_i. Computing the Euclidean distance between s′ and each k_i yields the d_i; sorting the d_i in ascending order together with the corresponding k_i yields the retrieval result list.
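Step 404 can be sketched as follows; `retrieve` and the toy vectors are illustrative assumptions rather than the patent's data:

```python
import numpy as np

def retrieve(s_prime, members):
    """Rank retrieval members by Euclidean distance to the generated vector s'.

    `members` is a (k, n) array: one row per text-modality retrieval member.
    Returns member indices sorted from nearest (most similar) to farthest.
    """
    d = np.sqrt(np.sum((members - s_prime) ** 2, axis=1))   # d_i for each k_i
    return list(np.argsort(d))

s_prime = np.array([1.0, 0.0])
members = np.array([[4.0, 4.0],    # far
                    [1.0, 0.1],    # nearest
                    [0.0, 1.0]])   # middle
ranking = retrieve(s_prime, members)   # → [1, 2, 0]
```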
As shown in Fig. 5, the results of cross-modal retrieval with the GAN model are compared against those of existing cross-modal retrieval methods; the evaluation metric is mAP (mean Average Precision), a common standard for measuring the quality of information retrieval results. For a given query, the first R results are returned, and the average precision is computed as:
AP = (1/M) Σ_{r=1}^{R} p(r)·rel(r)
where M is the number of relevant results retrieved for the query image s, p(r) is the precision at position r, and rel(r) is the relevance of the result at position r to s (1 for relevant, 0 for irrelevant); the relevance criterion is whether s and the result at position r share the same semantics; mAP is the mean of AP over all queries. In the present invention, the first 50 results of each retrieval are returned (R = 50).
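The AP formula can be computed directly from a ranked relevance list; `average_precision` and the toy relevance values are illustrative:

```python
import numpy as np

def average_precision(rel, R=None):
    """AP = (1/M) * sum_{r=1}^{R} p(r) * rel(r).

    `rel` holds the relevance (1/0) of each ranked result; p(r) is the
    precision over the first r results; M is the number of relevant
    results among the first R.
    """
    rel = np.asarray(rel[:R], dtype=float)
    M = rel.sum()
    if M == 0:
        return 0.0
    ranks = np.arange(1, len(rel) + 1)
    p = np.cumsum(rel) / ranks            # precision at each position r
    return float(np.sum(p * rel) / M)

# Relevant results at ranks 1 and 3: AP = (1/2) * (1/1 + 2/3) = 5/6.
ap = average_precision([1, 0, 1, 0])
```

mAP is then the mean of this AP over all queries in the test set.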
In Fig. 5, i-t denotes retrieving text by image, t-i denotes retrieving images by text, and AVG denotes the average mAP of the two directions. Fig. 5 shows that the retrieval precision of the method of the invention on both the Wikipedia and NUS-WIDE-10K data sets exceeds that of the other methods, demonstrating that the GAN-based cross-modal retrieval method learns the semantic relations between modalities more accurately and achieves higher cross-modal retrieval accuracy.
The preferred embodiments of the present invention have been described in detail above. It should be appreciated that a person of ordinary skill in the art can make many modifications and variations according to the concept of the present invention without creative labor. Therefore, any technical solution that a technician in the art can obtain on the basis of the prior art through logical analysis, reasoning, or limited experimentation under the concept of this invention shall fall within the protection scope determined by the claims.
Claims (8)
1. A cross-modal retrieval method based on a generative adversarial network, characterized in that the method comprises the following steps:
Step 1: extract features from the data of the input modality and the data of the target modality using feature extraction methods;
Step 2: build and train a GAN model, so that the GAN model can generate target-modality data from input-modality data;
Step 3: match the target-modality data generated by the GAN model against the corresponding-modality data obtained in step 1, i.e. compute the Euclidean distances;
Step 4: sort the computed Euclidean distances in ascending order to obtain the cross-modal retrieval result; the smaller the Euclidean distance, the higher the similarity between the higher-ranked result and the retrieval target.
2. The cross-modal retrieval method based on a generative adversarial network according to claim 1, characterized in that the feature extraction in step 1 comprises the following steps:
Step 1.1: when text data is the input-modality data, image data is the target-modality data, and vice versa;
Step 1.2: extract features with methods suited to each modality: image features are extracted with VGG-16 or FCN; text-modality features are extracted with word2vec; for both image and text data, the extracted features are represented as vectors.
3. The cross-modal retrieval method based on a generative adversarial network according to claim 1, characterized in that building and training the GAN model in step 2 comprises the following steps:
Step 2.1: build the GAN network model based on the Tensorflow framework;
Step 2.2: train the GAN model with the training-set data to obtain the parameters of the GAN model.
4. The cross-modal retrieval method based on a generative adversarial network according to claim 3, characterized in that training the GAN model with the training-set data in step 2.2 comprises the following steps:
Step 2.2.1: initialize the discriminator parameters θ_d and the generator parameters θ_g;
Step 2.2.2: train the discriminator in the GAN: feed the target-modality data set into the discriminator for training, so that the discriminator learns the semantic information of the input data;
Step 2.2.3: train the generator in the GAN: take data of one modality as the input-modality data and feed it into the generator; the generator produces target-modality data from the input-modality data and sends it to the discriminator; the discriminator judges the generated target-modality data and feeds the result back to the generator;
Step 2.2.4: repeat step 2.2.2 and step 2.2.3 until the discriminator and the generator converge, obtaining the parameter set θ of the GAN model.
5. The cross-modal retrieval method based on a generative adversarial network according to claim 4, characterized in that training the discriminator in step 2.2.2 comprises the following steps:
Step 2.2.2.1: take m training samples of the input-modality data {x_1, x_2, ..., x_m} from the training set P_data(x);
Step 2.2.2.2: take m samples of the target-modality data {z_1, z_2, ..., z_m} from the training set P_data(x);
Step 2.2.2.3: obtain the generated data x̃_i = G(z_i), i = 1, ..., m;
Step 2.2.2.4: update the discriminator parameters θ_d by gradient ascent to maximize:
(1/m) Σ_{i=1}^{m} [log D(x_i) + log(1 − D(x̃_i))]
where P_data(x) is the training set represented as vectors, containing both input-modality and target-modality data, G denotes the generator distribution, and D denotes the discriminator output.
6. The cross-modal retrieval method based on a generative adversarial network according to claim 4, characterized in that training the generator in step 2.2.3 comprises the following steps:
Step 2.2.3.1: take m samples {z_1, z_2, ..., z_m}, different from those in step 2.2.2.2, from the pre-set training set P_data(x);
Step 2.2.3.2: update the generator parameters θ_g by gradient descent to minimize:
(1/m) Σ_{i=1}^{m} log(1 − D(G(z_i)))
7. The cross-modal retrieval method based on a generative adversarial network according to claim 1, characterized in that the Euclidean distance computation in step 3 is as follows: after the input-modality data passes through the GAN model, target-modality data is obtained; this generated data is compared against all data of the true corresponding modality by computing Euclidean distances, and the Euclidean distance reflects the degree of similarity between two vectors.
8. The cross-modal retrieval method based on a generative adversarial network according to claim 1, characterized in that, in an n-dimensional space, the Euclidean distance d in step 3 is computed as:
d = sqrt( Σ_{i=1}^{n} (t_i − y_i)^2 )
where t_i and y_i are the components of the two n-dimensional vectors.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810871910.5A CN109213876B (en) | 2018-08-02 | 2018-08-02 | Cross-modal retrieval method based on generative adversarial network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810871910.5A CN109213876B (en) | 2018-08-02 | 2018-08-02 | Cross-modal retrieval method based on generative adversarial network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109213876A true CN109213876A (en) | 2019-01-15 |
CN109213876B CN109213876B (en) | 2022-12-02 |
Family
ID=64988109
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810871910.5A Active CN109213876B (en) | 2018-08-02 | 2018-08-02 | Cross-modal retrieval method based on generative adversarial network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109213876B (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109783644A (en) * | 2019-01-18 | 2019-05-21 | 福州大学 | Cross-domain sentiment classification system and method based on text representation learning |
CN110502743A (en) * | 2019-07-12 | 2019-11-26 | 北京邮电大学 | Cross-media retrieval method for social networks based on adversarial learning and semantic similarity |
CN110827232A (en) * | 2019-11-14 | 2020-02-21 | 四川大学 | Cross-modal MRI (magnetic resonance imaging) synthesis method based on morphological feature GAN (gain) |
CN110909181A (en) * | 2019-09-30 | 2020-03-24 | 中国海洋大学 | Cross-modal retrieval method and system for multi-type ocean data |
CN111179207A (en) * | 2019-12-05 | 2020-05-19 | 浙江工业大学 | Cross-modal medical image synthesis method based on parallel generation network |
CN111782921A (en) * | 2020-03-25 | 2020-10-16 | 北京沃东天骏信息技术有限公司 | Method and device for searching target |
CN111861949A (en) * | 2020-04-21 | 2020-10-30 | 北京联合大学 | Multi-exposure image fusion method and system based on generation countermeasure network |
CN111985243A (en) * | 2019-05-23 | 2020-11-24 | 中移(苏州)软件技术有限公司 | Emotion model training method, emotion analysis device and storage medium |
CN112487217A (en) * | 2019-09-12 | 2021-03-12 | 腾讯科技(深圳)有限公司 | Cross-modal retrieval method, device, equipment and computer-readable storage medium |
CN113420166A (en) * | 2021-03-26 | 2021-09-21 | 阿里巴巴新加坡控股有限公司 | Commodity mounting, retrieving, recommending and training processing method and device and electronic equipment |
CN113435206A (en) * | 2021-05-26 | 2021-09-24 | 卓尔智联(武汉)研究院有限公司 | Image-text retrieval method and device and electronic equipment |
CN117390210A (en) * | 2023-12-07 | 2024-01-12 | 山东建筑大学 | Building indoor positioning method, positioning system, storage medium and electronic equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102629275A (en) * | 2012-03-21 | 2012-08-08 | 复旦大学 | Face and name aligning method and system facing to cross media news retrieval |
CN102663447A (en) * | 2012-04-28 | 2012-09-12 | 中国科学院自动化研究所 | Cross-media searching method based on discrimination correlation analysis |
CN107832351A (en) * | 2017-10-21 | 2018-03-23 | 桂林电子科技大学 | Cross-modal retrieval method based on deep correlation network |
- 2018-08-02: CN CN201810871910.5A patent/CN109213876B/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102629275A (en) * | 2012-03-21 | 2012-08-08 | 复旦大学 | Face and name aligning method and system facing to cross media news retrieval |
CN102663447A (en) * | 2012-04-28 | 2012-09-12 | 中国科学院自动化研究所 | Cross-media searching method based on discrimination correlation analysis |
CN107832351A (en) * | 2017-10-21 | 2018-03-23 | 桂林电子科技大学 | Cross-modal retrieval method based on deep correlation network |
Non-Patent Citations (2)
Title |
---|
JIUXIANG GU et al.: "Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition * |
YUXIN PENG et al.: "CM-GANs: Cross-modal Generative Adversarial Networks for Common Representation Learning", https://arxiv.org/abs/1710.05106v2 * |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109783644A (en) * | 2019-01-18 | 2019-05-21 | 福州大学 | Cross-domain sentiment classification system and method based on text representation learning |
CN111985243A (en) * | 2019-05-23 | 2020-11-24 | 中移(苏州)软件技术有限公司 | Emotion model training method, emotion analysis device and storage medium |
CN111985243B (en) * | 2019-05-23 | 2023-09-08 | 中移(苏州)软件技术有限公司 | Emotion model training method, emotion analysis device and storage medium |
CN110502743A (en) * | 2019-07-12 | 2019-11-26 | 北京邮电大学 | Cross-media retrieval method for social networks based on adversarial learning and semantic similarity |
CN112487217A (en) * | 2019-09-12 | 2021-03-12 | 腾讯科技(深圳)有限公司 | Cross-modal retrieval method, device, equipment and computer-readable storage medium |
CN110909181A (en) * | 2019-09-30 | 2020-03-24 | 中国海洋大学 | Cross-modal retrieval method and system for multi-type ocean data |
CN110827232B (en) * | 2019-11-14 | 2022-07-15 | 四川大学 | Cross-modal MRI (magnetic resonance imaging) synthesis method based on morphological feature GAN |
CN110827232A (en) * | 2019-11-14 | 2020-02-21 | 四川大学 | Cross-modal MRI synthesis method based on morphological feature GAN |
CN111179207B (en) * | 2019-12-05 | 2022-04-08 | 浙江工业大学 | Cross-modal medical image synthesis method based on parallel generative networks |
CN111179207A (en) * | 2019-12-05 | 2020-05-19 | 浙江工业大学 | Cross-modal medical image synthesis method based on parallel generative networks |
CN111782921A (en) * | 2020-03-25 | 2020-10-16 | 北京沃东天骏信息技术有限公司 | Target retrieval method and device |
CN111861949A (en) * | 2020-04-21 | 2020-10-30 | 北京联合大学 | Multi-exposure image fusion method and system based on generative adversarial network |
CN111861949B (en) * | 2020-04-21 | 2023-07-04 | 北京联合大学 | Multi-exposure image fusion method and system based on generative adversarial network |
CN113420166A (en) * | 2021-03-26 | 2021-09-21 | 阿里巴巴新加坡控股有限公司 | Product listing, retrieval, recommendation and training processing method and device, and electronic device |
CN113435206A (en) * | 2021-05-26 | 2021-09-24 | 卓尔智联(武汉)研究院有限公司 | Image-text retrieval method and device and electronic equipment |
CN113435206B (en) * | 2021-05-26 | 2023-08-01 | 卓尔智联(武汉)研究院有限公司 | Image-text retrieval method and device and electronic equipment |
CN117390210A (en) * | 2023-12-07 | 2024-01-12 | 山东建筑大学 | Building indoor positioning method, positioning system, storage medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109213876B (en) | 2022-12-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109213876A (en) | Cross-modal retrieval method based on generative adversarial network | |
CN110147457B (en) | Image-text matching method, device, storage medium and equipment | |
Wu et al. | Cascaded fully convolutional networks for automatic prenatal ultrasound image segmentation | |
Alfarisy et al. | Deep learning based classification for paddy pests & diseases recognition | |
Huang et al. | Instance-aware image and sentence matching with selective multimodal lstm | |
CN108509463B (en) | Question answering method and device | |
CN106909924B (en) | Remote sensing image rapid retrieval method based on depth significance | |
CN112384948A (en) | Generative adversarial networks for image segmentation | |
Ahishali et al. | Advance warning methodologies for covid-19 using chest x-ray images | |
KR102265573B1 (en) | Method and system for reconstructing mathematics learning curriculum based on artificial intelligence | |
WO2018196718A1 (en) | Image disambiguation method and device, storage medium, and electronic device | |
EP3968337A1 (en) | Target object attribute prediction method based on machine learning and related device | |
Bu | Human motion gesture recognition algorithm in video based on convolutional neural features of training images | |
JP2018022496A (en) | Method and equipment for creating training data to be used for natural language processing device | |
CN110163130B (en) | Feature pre-alignment random forest classification system and method for gesture recognition | |
Zhou et al. | Learn fine-grained adaptive loss for multiple anatomical landmark detection in medical images | |
Zhou et al. | Model uncertainty guides visual object tracking | |
Heidler et al. | A deep active contour model for delineating glacier calving fronts | |
CN108804470B (en) | Image retrieval method and device | |
CN116934747A (en) | Fundus image segmentation model training method, fundus image segmentation model training equipment and glaucoma auxiliary diagnosis system | |
CN116416334A (en) | Scene graph generation method of embedded network based on prototype | |
Zachmann et al. | Random forests for tracking on ultrasonic images | |
Rochmawati et al. | Brain tumor classification using transfer learning | |
Wibowo | Performances of Chimpanzee Leader Election Optimization and K-Means in Multilevel Color Image Segmentation | |
Wu et al. | Loop Closure Detection for Visual SLAM Based on SuperPoint Network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||