CN110334363A - Descriptor translation and similarity measurement method based on a hybrid encoder - Google Patents
Descriptor translation and similarity measurement method based on a hybrid encoder
- Publication number: CN110334363A (application CN201910630989.7A)
- Authority
- CN
- China
- Prior art keywords
- descriptor
- translation
- similarity
- reconstruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/22 — Pattern recognition; Analysing; Matching criteria, e.g. proximity measures
- G06F40/194 — Handling natural language data; Text processing; Calculation of difference between files
- G06F40/58 — Handling natural language data; Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
Abstract
A descriptor translation and similarity measurement method based on a hybrid encoder, relating to image retrieval and descriptor translation. Several kinds of hand-crafted or learned descriptors are extracted from a training image set to prepare the training set of the descriptor translator, and a hybrid autoencoder is trained on paired features. The hybrid encoder is trained to map from a source feature to a target feature; the decoder is trained jointly on the two paths of reconstruction and translation, with a dedicated encoder per descriptor and a shared decoder, yielding a translation loss and a reconstruction loss. The translation loss and the reconstruction loss are used to measure the similarity between the two kinds of descriptor. A descriptor to be translated is fed into the corresponding trained translator, which performs the descriptor translation; the confidence of the result is measured by the similarity. The method breaks through the retrieval barrier between retrieval systems built on different descriptors and provides a convenient and efficient transfer platform between heterogeneous systems.
Description
Technical field
The present invention relates to image retrieval and descriptor translation, and more particularly to a descriptor translation and similarity measurement method based on a hybrid encoder.
Background technique
In an era of rapidly developing computer technology, and especially with the popularity of social networking and microblogging sites such as Flickr, heterogeneous data such as images, video, audio, and text grow at an astonishing rate every day. For example, Facebook has more than one billion registered users, who upload more than one billion pictures per month; users of the photo-sharing site Flickr uploaded 728 million pictures in 2015, an average of about two million per day; and the back-end systems of Taobao, the largest e-commerce platform in China, store more than 28.6 billion pictures. Faced with this mass of pictures rich in visual information, how to query and retrieve the images a user needs, or is interested in, conveniently, quickly, and accurately from such immense image libraries has become a research hotspot in multimedia information retrieval. Content-based image retrieval gives full play to the advantage computers have at repetitive tasks, and frees people from manual annotation, which would otherwise consume large amounts of manpower, material, and financial resources. After more than a decade of development, content-based image retrieval has been widely applied in search engines, e-commerce, medicine, the textile and leather industries, and many other aspects of daily life. Image retrieval can be divided into two classes by the way image content is described: text-based image retrieval (TBIR, Text Based Image Retrieval) and content-based image retrieval (CBIR, Content Based Image Retrieval).
Text-based image retrieval dates back to the 1970s. It describes image content by means of text annotations, forming for each image keywords that describe its content, such as the objects or scenes in the image. The annotation can be done manually, or semi-automatically with the help of image recognition technology. At retrieval time, a user supplies query keywords according to his or her interest, the retrieval system finds the pictures labeled with those keywords, and finally returns the query results to the user. Because it is easy to implement, and because human intervention during annotation keeps precision relatively high, this text-description-based retrieval mode is still in use today on some small and medium scale picture-search sites. Its drawbacks, however, are equally obvious. First, it requires manual annotation, so it is only applicable to small-scale image data; completing the same process on large-scale image data would consume enormous manpower and financial resources, and newly arriving images can never be stored without human intervention. Second, "a picture is worth a thousand words": for precise queries, users sometimes find it difficult to depict the image they really want with a few brief keywords. Third, manual annotation is inevitably influenced by the annotators' knowledge, language habits, and subjective judgment, which causes discrepancies between the text descriptions of the same picture.
Image descriptors are a data type that image retrieval must handle, and they are the basis of most existing visual search systems. In a typical setup, a visual search system can only handle predefined features extracted offline from an image collection. Such a setup prevents a given visual feature from being reused across different systems. Moreover, when a visual search system is upgraded, time-consuming steps are needed to extract new features and build the corresponding indexes, while the previous features and indexes are discarded. Breaking through such a setup would be very useful in any case.
Summary of the invention
The object of the present invention is to solve the problem that different descriptors used in different retrieval systems cannot be used interchangeably, by providing a descriptor translation and similarity measurement method based on a hybrid encoder.
The present invention comprises the following steps:
1) extracting several different kinds of descriptor from a training image set to prepare the training set of the descriptor translator, and training a hybrid autoencoder on paired features;
2) training the hybrid encoder to map from the source feature to the target feature, training the decoder jointly on the two paths of reconstruction and translation, with a dedicated encoder per descriptor and a shared decoder, and obtaining a translation loss and a reconstruction loss;
3) measuring the similarity between the two kinds of descriptor using the translation loss and reconstruction loss obtained in step 2);
4) feeding the descriptor to be translated into the corresponding translator, performing the descriptor translation with the trained translator, and measuring the confidence of the result by the similarity of step 3).
In step 1), the several kinds of descriptor may include hand-crafted descriptors or learned descriptors; any descriptor extracted from an image can serve as an object of translation.
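Step 1) only requires that, for each training image, the two kinds of descriptor are extracted from the same image, so that the features come in pairs. A minimal NumPy sketch of this pairing follows; the random vectors, names, and dimensions are illustrative stand-ins for real hand-crafted or learned descriptors, not part of the patent:

```python
import numpy as np

def build_paired_training_set(n_images=1000, dim_src=128, dim_tgt=512, seed=0):
    """Simulate step 1): extract one source descriptor and one target
    descriptor per image, kept paired by image index. Random vectors
    stand in for real descriptors (e.g. a hand-crafted vs. a learned one)."""
    rng = np.random.default_rng(seed)
    V_s = rng.standard_normal((n_images, dim_src))  # source descriptors
    V_t = rng.standard_normal((n_images, dim_tgt))  # target descriptors
    return V_s, V_t

V_s, V_t = build_paired_training_set()
# Row i of V_s and row i of V_t describe the same image.
```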
In step 2), the specific method for training the hybrid encoder to map from the source feature to the target feature may be as follows: first, the encoders Es and Et encode the source descriptor Vs and the target descriptor Vt, yielding codes zs and zt respectively; then one shared decoder decodes zs and zt, yielding the translated descriptor Vst and the reconstructed descriptor Vtt respectively. The training loss function is the L2 norm between the reconstructed descriptor and the target descriptor plus the L2 norm between the translated descriptor and the target descriptor, i.e. L = ‖Vtt − Vt‖₂ + ‖Vst − Vt‖₂, where the L2 norm between the reconstructed descriptor and the target descriptor, ‖Vtt − Vt‖₂, is called the reconstruction loss, and the L2 norm between the translated descriptor and the target descriptor, ‖Vst − Vt‖₂, is called the translation loss.
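The encoder/decoder arrangement and the two losses described above can be sketched as follows. This is an illustrative NumPy implementation: the patent text does not specify the network architecture, so linear encoders and decoder, and the dimensions, are assumptions, not the patented implementation itself.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_s, dim_t, dim_z = 128, 512, 64  # assumed descriptor and code sizes

# Encoders E_s and E_t are specific to each descriptor; decoder D is shared.
E_s = rng.standard_normal((dim_s, dim_z)) * 0.05
E_t = rng.standard_normal((dim_t, dim_z)) * 0.05
D = rng.standard_normal((dim_z, dim_t)) * 0.05

def hybrid_autoencoder_losses(V_s, V_t):
    """Forward pass for one batch: encode V_s and V_t into z_s and z_t,
    decode both with the shared decoder, and return the translation loss
    ||V_st - V_t||_2 and the reconstruction loss ||V_tt - V_t||_2,
    averaged over the batch."""
    z_s = V_s @ E_s          # source code z_s = E_s(V_s)
    z_t = V_t @ E_t          # target code z_t = E_t(V_t)
    V_st = z_s @ D           # translated descriptor
    V_tt = z_t @ D           # reconstructed descriptor
    trans_loss = np.linalg.norm(V_st - V_t, axis=1).mean()
    rec_loss = np.linalg.norm(V_tt - V_t, axis=1).mean()
    return trans_loss, rec_loss

batch_s = rng.standard_normal((16, dim_s))
batch_t = rng.standard_normal((16, dim_t))
trans_loss, rec_loss = hybrid_autoencoder_losses(batch_s, batch_t)
total_loss = trans_loss + rec_loss  # the training objective of step 2)
```

In a real system the parameters would be fitted by gradient descent on `total_loss`; only the forward pass and the two loss terms are shown here.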
In step 3), the specific method for measuring the similarity between two kinds of descriptor using the translation loss and reconstruction loss obtained in step 2) may be as follows: a directed similarity is first constructed from the difference between the translation loss and the reconstruction loss, and is then normalized to obtain the final similarity.
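One way to realize the directed similarity just described: the smaller the gap between a pair's translation loss and its reconstruction loss, the more similar the two descriptors are taken to be. The sign convention and the min-max normalization below are assumptions, since the patent text does not reproduce the original formulas:

```python
import numpy as np

def directed_similarity(trans_loss, rec_loss):
    """Step 3): a directed similarity built from the difference between
    the translation loss and the reconstruction loss; the smaller the
    gap, the higher the similarity."""
    return -(trans_loss - rec_loss)

def normalize(sims):
    """Min-max normalization to [0, 1]. This particular normalization
    is an assumption; the patent text does not give the formula."""
    sims = np.asarray(sims, dtype=float)
    lo, hi = sims.min(), sims.max()
    return (sims - lo) / (hi - lo) if hi > lo else np.ones_like(sims)

# (translation loss, reconstruction loss) for three descriptor pairs.
pairs = [(0.9, 0.5), (0.6, 0.5), (1.5, 0.5)]
raw = [directed_similarity(t, r) for t, r in pairs]
sims = normalize(raw)
# The pair whose translation loss is closest to its reconstruction
# loss ends up with the highest normalized similarity.
```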
In step 4), the descriptor to be translated is fed into the corresponding translator, the trained translator performs the descriptor translation, and the confidence of the result is measured by the similarity of step 3). The specific method is:
(1) selecting the descriptor translator according to the target descriptor and the source descriptor;
(2) translating the source descriptor into the target descriptor;
(3) measuring the confidence of the translation result according to the similarity between the descriptors;
(4) completing the associated downstream task with the translated descriptor.
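The four sub-steps above can be strung together as a small sketch. The translator registry, the descriptor names, the linear translator weights, and the 0.5 confidence threshold are all illustrative assumptions, not part of the patent:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical registry of trained translators, keyed by
# (source descriptor name, target descriptor name), plus the
# pairwise similarities precomputed in step 3).
W = rng.standard_normal((128, 512)) * 0.05
translators = {("sift_fv", "cnn_rmac"): lambda v: v @ W}
similarities = {("sift_fv", "cnn_rmac"): 0.83}

def translate_with_confidence(v_src, src_name, tgt_name, threshold=0.5):
    """Steps (1)-(4): select the translator for this (source, target)
    pair, translate, attach the precomputed similarity as the
    confidence of the result, and gate downstream use on a threshold."""
    translator = translators[(src_name, tgt_name)]    # (1) select
    v_tgt = translator(v_src)                         # (2) translate
    confidence = similarities[(src_name, tgt_name)]   # (3) confidence
    usable = confidence >= threshold                  # (4) gate downstream use
    return v_tgt, confidence, usable

v_src = rng.standard_normal(128)
v_tgt, confidence, usable = translate_with_confidence(v_src, "sift_fv", "cnn_rmac")
```

The confidence lets a system decide, before running the downstream task, whether a given descriptor pair is suitable for translation at all.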
Compared with the prior art, the present invention has the following prominent advantages:
The invention breaks through the retrieval barrier between retrieval systems built on different descriptors, guarantees that different retrieval systems can interoperate, provides a transfer platform between heterogeneous systems, and offers a more convenient and efficient way to upgrade a retrieval system. The proposed scheme yields both a descriptor translator and the confidence of the descriptors after translation. The translator provides a conversion model between different descriptors, and the confidence indicates in advance whether two different descriptors are suitable for translation between them. The descriptor translation method with a hybrid encoder proposed by the present invention is more stable than the traditional multi-layer perceptron algorithm, and the degree of convertibility between 16 different features has been demonstrated.
Detailed description of the invention
Fig. 1 is flow chart of the invention.
Specific embodiment
The following embodiment describes the present invention in detail in conjunction with the accompanying drawing.
Referring to Fig. 1, the present invention comprises the following steps:
1) extracting different hand-crafted or learned features (descriptors) from a training image set to prepare the training set of the descriptor translator, and training a hybrid autoencoder on paired features;
2) training the hybrid encoder to map from the source feature to the target feature, training the decoder jointly on the two paths of reconstruction and translation, with a dedicated encoder per descriptor and a shared decoder;
3) measuring the similarity between the two kinds of descriptor using the translation error and reconstruction error of step 2);
4) feeding the descriptor to be translated into the corresponding translator, performing the descriptor translation with the trained translator, and measuring the confidence of the result by the similarity of step 3).
In step 2), the specific method for training the hybrid encoder to map from the source feature to the target feature may be as follows: first, the encoders Es and Et encode the source descriptor Vs and the target descriptor Vt, yielding codes zs and zt respectively; then one shared decoder decodes zs and zt, yielding the translated descriptor Vst and the reconstructed descriptor Vtt respectively. The training loss function is the L2 norm between the reconstructed descriptor and the target descriptor plus the L2 norm between the translated descriptor and the target descriptor, i.e. L = ‖Vtt − Vt‖₂ + ‖Vst − Vt‖₂, where ‖Vtt − Vt‖₂ is called the reconstruction loss and ‖Vst − Vt‖₂ is called the translation loss.
In step 3), the specific method for measuring the similarity between two kinds of descriptor using the translation loss and reconstruction loss obtained in step 2) may be as follows: a directed similarity is first constructed from the difference between the translation loss and the reconstruction loss, and is then normalized to obtain the final similarity.
In step 4), the specific method for performing the descriptor translation with the trained translator is: (1) selecting the descriptor translator according to the target descriptor and the source descriptor; (2) translating the source descriptor into the target descriptor; (3) measuring the confidence of the translation result according to the similarity between the descriptors; (4) completing the associated downstream task with the translated descriptor.
The present invention was evaluated on the classical retrieval datasets Holidays, Oxford5k, and Paris6k, with corresponding translation experiments conducted for 16 kinds of descriptor, as shown in Table 1.
Table 1
Table 1 gives the errors of the descriptors before and after translation. It can be seen from Table 1 that the proposed method completes the translation task on the great majority of descriptors with good performance.
Claims (5)
1. A descriptor translation and similarity measurement method based on a hybrid encoder, characterized by comprising the following steps:
1) extracting several different kinds of descriptor from a training image set to prepare the training set of the descriptor translator, and training a hybrid autoencoder on paired features;
2) training the hybrid encoder to map from the source feature to the target feature, training the decoder jointly on the two paths of reconstruction and translation, with a dedicated encoder per descriptor and a shared decoder, and obtaining a translation loss and a reconstruction loss;
3) measuring the similarity between the two kinds of descriptor using the translation loss and reconstruction loss obtained in step 2);
4) feeding the descriptor to be translated into the corresponding translator, performing the descriptor translation with the trained translator, and measuring the confidence of the result by the similarity of step 3).
2. The descriptor translation and similarity measurement method based on a hybrid encoder according to claim 1, characterized in that in step 1) the several kinds of descriptor include hand-crafted descriptors or learned descriptors, and any descriptor extracted from an image can serve as an object of translation.
3. The descriptor translation and similarity measurement method based on a hybrid encoder according to claim 1, characterized in that in step 2) the specific method for training the hybrid encoder to map from the source feature to the target feature is: first, the encoders Es and Et encode the source descriptor Vs and the target descriptor Vt, yielding codes zs and zt respectively; then one shared decoder decodes zs and zt, yielding the translated descriptor Vst and the reconstructed descriptor Vtt respectively; the training loss function is the L2 norm between the reconstructed descriptor and the target descriptor plus the L2 norm between the translated descriptor and the target descriptor, the former being called the reconstruction loss and the latter the translation loss.
4. The descriptor translation and similarity measurement method based on a hybrid encoder according to claim 1, characterized in that in step 3) the specific method for measuring the similarity between two kinds of descriptor using the translation loss and reconstruction loss obtained in step 2) is: a directed similarity is first constructed from the difference between the translation loss and the reconstruction loss, and is then normalized to obtain the final similarity.
5. The descriptor translation and similarity measurement method based on a hybrid encoder according to claim 1, characterized in that in step 4) the descriptor to be translated is fed into the corresponding translator, the trained translator performs the descriptor translation, and the confidence of the result is measured by the similarity of step 3), the specific method being:
(1) selecting the descriptor translator according to the target descriptor and the source descriptor;
(2) translating the source descriptor into the target descriptor;
(3) measuring the confidence of the translation result according to the similarity between the descriptors;
(4) completing the associated downstream task with the translated descriptor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201910630989.7A | 2019-07-12 | 2019-07-12 | Descriptor translation and similarity measurement method based on a hybrid encoder
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201910630989.7A | 2019-07-12 | 2019-07-12 | Descriptor translation and similarity measurement method based on a hybrid encoder
Publications (1)
Publication Number | Publication Date
---|---
CN110334363A | 2019-10-15
Family
ID=68146680
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN201910630989.7A (Pending) | Descriptor translation and similarity measurement method based on a hybrid encoder | 2019-07-12 | 2019-07-12
Country Status (1)
Country | Link
---|---
CN | CN110334363A
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN107622311A | 2017-10-09 | 2018-01-23 | 深圳市唯特视科技有限公司 | Robot imitation learning method based on contextual translation
CN107967262A | 2017-11-02 | 2018-04-27 | 内蒙古工业大学 | Mongolian-Chinese machine translation method based on neural networks
Non-Patent Citations (2)
Title
---
Jie Hu et al.: "Towards Visual Feature Translation", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Xiangwen Zhang et al.: "Asynchronous Bidirectional Decoding for Neural Machine Translation", The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)
Similar Documents
Publication | Title
---|---
CN110232152B | Content recommendation method, device, server and storage medium
CN106776849B | Method for quickly searching scenic spots by using pictures and tour guide system
US8577882B2 | Method and system for searching multilingual documents
Stefanini et al. | Artpedia: A new visual-semantic dataset with visual and contextual sentences in the artistic domain
WO2019169872A1 | Method and device for searching for content resource, and server
CN110399515B | Picture retrieval method, device and system
CN110516096A | Compositing-aware digital image search
KR100471927B1 | Web-based image data search system and method
JP2009537901A | Annotation by search
JP4699954B2 | Multimedia data management method and apparatus
US11568018B2 | Utilizing machine-learning models to generate identifier embeddings and determine digital connections between digital content items
CN103226547A | Method and device for producing verse for a picture
WO2023108980A1 | Information push method and device based on text adversarial samples
CN108491543A | Image search method, image storage method and image retrieval system
WO2021159812A1 | Cancer staging information processing method and apparatus, and storage medium
CN102508901A | Content-based massive image search method and system
CN107391599B | Image retrieval method based on style features
CN112948601A | Cross-modal hash retrieval method based on controlled semantic embedding
CN114637886A | Machine vision system based on multiple protocols
CN112989811B | History book reading auxiliary system based on BiLSTM-CRF and control method thereof
CN110442736B | Semantically enhanced subspace cross-media retrieval method based on quadratic discriminant analysis
Poornima et al. | Multi-modal features and correlation incorporated Naive Bayes classifier for a semantic-enriched lecture video retrieval system
CN110334363A | Descriptor translation and similarity measurement method based on a hybrid encoder
Kim et al. | Towards a fairer landmark recognition dataset
CN103092935A | Approximate-copy image detection method based on SIFT quantization
Legal Events
Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 2019-10-15