CN110334363A - Descriptor translation and similarity measurement method based on a hybrid encoder

Descriptor translation and similarity measurement method based on a hybrid encoder

Info

Publication number
CN110334363A
CN110334363A (application CN201910630989.7A)
Authority
CN
China
Prior art keywords
descriptor
translation
similarity
reconstruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910630989.7A
Other languages
Chinese (zh)
Inventor
纪荣嵘
胡杰
李新阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University filed Critical Xiamen University
Priority to CN201910630989.7A priority Critical patent/CN110334363A/en
Publication of CN110334363A publication Critical patent/CN110334363A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/22 - Matching criteria, e.g. proximity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/10 - Text processing
    • G06F40/194 - Calculation of difference between files
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/40 - Processing or translation of natural language
    • G06F40/58 - Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Machine Translation (AREA)

Abstract

A descriptor translation and similarity measurement method based on a hybrid encoder, relating to image retrieval and descriptor translation. Different hand-crafted or learned descriptors are extracted from a training image set to prepare the training set of the descriptor translator, and a hybrid auto-encoder is trained with paired features. The hybrid encoder learns the mapping from source features to target features; the decoder is trained with the two complementary paths of reconstruction and translation, where each encoder is specific to its descriptor and the decoder is shared, yielding a translation loss and a reconstruction loss. These two losses are used to measure the similarity between two kinds of descriptors. A descriptor to be translated is input into the corresponding trained translator, the descriptor is translated, and the confidence of the result is measured by the similarity. The method breaks through the retrieval barrier between retrieval systems based on different descriptors and provides a convenient and efficient transfer platform between different systems.

Description

Descriptor translation and similarity measurement method based on a hybrid encoder
Technical field
The present invention relates to image retrieval and descriptor translation, and more particularly to a descriptor translation and similarity measurement method based on a hybrid encoder.
Background technique
In an era of rapidly developing computer technology, and especially with the popularity of social networking sites such as Flickr and Weibo, heterogeneous data such as images, video, audio and text grow at an astonishing speed every day. For example, Facebook has more than one billion registered users and receives more than one billion uploaded pictures per month; in 2015 the picture-sharing site Flickr received up to 728 million uploaded pictures, about two million per day; and the back-end system of Taobao, the largest e-commerce platform in China, stores more than 28.6 billion pictures. For this mass of pictures rich in visual information, how to conveniently, quickly and accurately query and retrieve the images that a user needs or is interested in from such immense image libraries has become a research hotspot in multimedia information retrieval. Content-based image retrieval gives full play to the advantage of computers at handling repetitive tasks and frees people from the large amount of manpower, material and financial resources that manual annotation requires. After more than ten years of development, content-based image retrieval technology has been widely applied in search engines, e-commerce, medicine, the textile industry, the leather industry and many other aspects of daily life. Image retrieval can be divided into two classes according to how the image content is described: text-based image retrieval (TBIR, Text Based Image Retrieval) and content-based image retrieval (CBIR, Content Based Image Retrieval).
Text-based image retrieval dates back to the 1970s. It describes image content by means of text annotations, forming keywords that describe the content of each image, such as the objects or scenes it contains. The annotation can be done manually or semi-automatically with the help of image recognition technology. At retrieval time, the user provides query keywords according to his or her interest, the retrieval system finds the pictures labeled with those keywords, and finally returns the query results to the user. Because it is easy to implement and because manual intervention in the annotation yields relatively high precision, this text-description-based retrieval mode is still used by some small and medium-scale picture search websites today. However, its drawbacks are also obvious. First, it requires manual annotation, so it is only applicable to small-scale image data; completing this process on large-scale image data consumes a great deal of manpower and financial resources, and newly added images continually require manual intervention. Second, "a picture is worth a thousand words": for precise queries, users sometimes find it hard to depict the image they really want with a few brief keywords. Third, the manual annotation process is inevitably affected by the annotators' knowledge, language habits and subjective judgement, which causes differences in the textual descriptions of pictures.
Image descriptors are a data type that must be handled in image retrieval and are the basis of most existing visual search systems. In a typical setup, a visual search system can only handle predefined features extracted offline from an image collection. Such a setup prevents different visual features from being reused across different systems. In addition, when a visual search system is upgraded, time-consuming steps are needed to extract new features and build the corresponding indexes, while the previous features and indexes are discarded. Breaking through such a setup is therefore very useful in any case.
Summary of the invention
The purpose of the present invention is to solve the problem that different descriptors in different retrieval systems cannot be used interchangeably, by providing a descriptor translation and similarity measurement method based on a hybrid encoder.
The present invention comprises the following steps:
1) extracting a variety of different descriptors from a training image set to prepare the training set of the descriptor translator, and training a hybrid auto-encoder with paired features;
2) training the hybrid encoder that maps source features to target features, and training the decoder with the two complementary paths of reconstruction and translation, where each encoder is specific to its descriptor and the decoder is shared, to obtain the translation loss and the reconstruction loss;
3) measuring the similarity between two kinds of descriptors using the translation loss and the reconstruction loss obtained in step 2);
4) inputting the descriptor to be translated into the corresponding translator, translating the descriptor with the translator obtained by training, and measuring the confidence of the result by the similarity in step 3).
In step 1), the descriptors may include hand-crafted descriptors or learned descriptors; any descriptor extracted from images can serve as the object of translation.
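As an illustration of step 1), the sketch below shows one way to assemble the paired training set: two different descriptors are extracted from the same training images and stored as (source, target) pairs. The extractor callables extract_src and extract_tgt are placeholders for any hand-crafted or learned descriptor extractor and are not prescribed by the patent; all code examples in this document are illustrative Python sketches, not the patent's own implementation.

import numpy as np

def build_pairs(image_paths, extract_src, extract_tgt):
    """Extract a source and a target descriptor from each image and pair them."""
    pairs = []
    for path in image_paths:
        v_s = extract_src(path)   # e.g. a hand-crafted descriptor such as SIFT + VLAD
        v_t = extract_tgt(path)   # e.g. a learned CNN descriptor
        pairs.append((np.asarray(v_s, dtype=np.float32),
                      np.asarray(v_t, dtype=np.float32)))
    return pairs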
In step 2), the specific method of training the hybrid encoder that maps source features to target features may be as follows: the source descriptor V_s and the target descriptor V_t are first encoded by the encoders E_s and E_t to obtain the codes z_s and z_t, respectively; a shared decoder then decodes z_s and z_t to obtain the translated descriptor V_st and the reconstructed descriptor V_tt, respectively. The training loss function consists of the L2 norm between the reconstructed descriptor and the target descriptor and the L2 norm between the translated descriptor and the target descriptor.
Here, the L2 norm between the reconstructed descriptor and the target descriptor is called the reconstruction loss, and the L2 norm between the translated descriptor and the target descriptor is called the translation loss.
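The structure described above (one encoder per descriptor type, a single shared decoder, and L2 reconstruction and translation losses) can be sketched in PyTorch as follows. The layer widths, depths and the equal weighting of the two loss terms are illustrative assumptions; the patent only fixes the overall hybrid auto-encoder structure and the two L2 loss terms.

import torch
import torch.nn as nn

class HybridAutoEncoder(nn.Module):
    def __init__(self, src_dim, tgt_dim, code_dim=512):
        super().__init__()
        # Each descriptor type has its own encoder.
        self.enc_s = nn.Sequential(nn.Linear(src_dim, code_dim), nn.ReLU(),
                                   nn.Linear(code_dim, code_dim))
        self.enc_t = nn.Sequential(nn.Linear(tgt_dim, code_dim), nn.ReLU(),
                                   nn.Linear(code_dim, code_dim))
        # The decoder is shared by the reconstruction and translation paths.
        self.dec = nn.Sequential(nn.Linear(code_dim, code_dim), nn.ReLU(),
                                 nn.Linear(code_dim, tgt_dim))

    def forward(self, v_s, v_t):
        z_s, z_t = self.enc_s(v_s), self.enc_t(v_t)
        v_st = self.dec(z_s)   # translated descriptor V_st (source -> target)
        v_tt = self.dec(z_t)   # reconstructed descriptor V_tt (target -> target)
        return v_st, v_tt

def reconstruction_and_translation_losses(v_st, v_tt, v_t):
    # L2 norms as defined in the text, averaged over the batch.
    rec_loss = torch.norm(v_tt - v_t, p=2, dim=1).mean()
    trans_loss = torch.norm(v_st - v_t, p=2, dim=1).mean()
    return rec_loss, trans_loss

During training, the two terms would be combined (here with equal weight, an assumption) and minimised with a standard optimiser over the paired descriptors prepared in step 1).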
In step 3), the specific method of using the translation loss and the reconstruction loss obtained in step 2) to measure the similarity between two kinds of descriptors may be as follows:
A directed similarity is first constructed from the difference between the translation loss and the reconstruction loss, and it is then normalized to obtain the final similarity.
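The formulas for the directed similarity and its normalization appear only as images in the original publication, so the following Python sketch is an illustrative reading of the textual description rather than the patent's literal equations: the directed score is taken as the reconstruction loss minus the translation loss (a translation that is nearly as accurate as reconstruction scores high), and a simple min-max normalization maps the scores of all descriptor pairs to [0, 1].

import numpy as np

def directed_similarity(trans_loss, rec_loss):
    # Difference of the two losses; higher means the source descriptor
    # translates into the target descriptor with little extra error.
    return rec_loss - trans_loss

def normalize(scores):
    # One possible normalization: min-max scaling over all directed scores.
    scores = np.asarray(scores, dtype=np.float64)
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo + 1e-12)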
In step 4), the descriptor to be translated is input into the corresponding translator, the descriptor is translated with the translator obtained by training, and the confidence of the result is measured by the similarity in step 3); the specific procedure is as follows (see the sketch after the list):
(1) selecting the descriptor translator according to the target descriptor and the source descriptor;
(2) translating the source descriptor into the target descriptor;
(3) measuring the confidence of the translation result according to the similarity measure between descriptors;
(4) completing the associated downstream task with the translated descriptor.
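The sketch below illustrates this procedure, reusing the HybridAutoEncoder from the step 2) sketch; the containers translators and similarity are hypothetical lookups holding the trained models and the normalized scores from step 3), not names defined by the patent.

import torch

def translate_descriptor(v_s, src_name, tgt_name, translators, similarity):
    """Translate a source descriptor into the target descriptor space and
    return it together with the precomputed translation confidence."""
    model = translators[(src_name, tgt_name)]        # (1) select the translator
    model.eval()
    with torch.no_grad():
        v_st = model.dec(model.enc_s(v_s))           # (2) translate source -> target
    confidence = similarity[(src_name, tgt_name)]    # (3) look up the confidence
    return v_st, confidence                          # (4) feed v_st to the downstream task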
Compared with the prior art, the present invention has the following prominent advantages:
The invention breaks through the retrieval barrier between retrieval systems based on different descriptors, ensures that different retrieval systems can work with one another, provides a transfer platform between different systems, and offers a more convenient and efficient way to upgrade a retrieval system. The proposed scheme yields both a descriptor translator and the confidence of the translated descriptors. The translator provides a translation model for conversion between different descriptors, and the confidence indicates in advance whether two different descriptors are suitable for translation. The hybrid-encoder descriptor translation method proposed by the present invention is more stable than the traditional multi-layer perceptron algorithm, and the degree of translatability among 16 different features has been verified.
Description of the drawings
Fig. 1 is a flowchart of the present invention.
Specific embodiments
The following embodiment describes the present invention in detail with reference to the accompanying drawing.
Referring to Fig. 1, the present invention comprises the following steps:
1) extracting different hand-crafted or learned features (descriptors) from a training image set to prepare the training set of the descriptor translator, and training a hybrid auto-encoder with paired features;
2) training the hybrid encoder that maps source features to target features, and training the decoder with the two complementary paths of reconstruction and translation, where each encoder is specific to its descriptor and the decoder is shared;
3) measuring the similarity between two kinds of descriptors using the translation error and the reconstruction error from step 2);
4) inputting the descriptor to be translated into the corresponding translator, translating the descriptor with the translator obtained by training, and measuring the confidence of the result by the similarity in step 3).
In step 2), the specific method of training the hybrid encoder that maps source features to target features may be as follows: the source descriptor V_s and the target descriptor V_t are first encoded by the encoders E_s and E_t to obtain the codes z_s and z_t, respectively; a shared decoder then decodes z_s and z_t to obtain the translated descriptor V_st and the reconstructed descriptor V_tt, respectively. The training loss function consists of the L2 norm between the reconstructed descriptor and the target descriptor and the L2 norm between the translated descriptor and the target descriptor.
Here, the L2 norm between the reconstructed descriptor and the target descriptor is called the reconstruction loss, and the L2 norm between the translated descriptor and the target descriptor is called the translation loss.
In step 3), the specific method of using the translation loss and the reconstruction loss obtained in step 2) to measure the similarity between two kinds of descriptors may be as follows: a directed similarity is first constructed from the difference between the translation loss and the reconstruction loss, and it is then normalized to obtain the final similarity.
In step 4), the specific method of translating the descriptor with the translator obtained by training is as follows: (1) selecting the descriptor translator according to the target descriptor and the source descriptor; (2) translating the source descriptor into the target descriptor; (3) measuring the confidence of the translation result according to the similarity measure between descriptors; (4) completing the associated downstream task with the translated descriptor.
For 16 kinds of descriptors, replication experiments were carried out with the present invention on the classical retrieval datasets Holidays, Oxford5k and Paris6k; the corresponding results are shown in Table 1.
Table 1
Table 1 reports the errors of the descriptors before and after translation. As can be seen from Table 1, the method proposed by the present invention completes the translation task for most descriptors with good performance.

Claims (5)

1. A descriptor translation and similarity measurement method based on a hybrid encoder, characterized by comprising the following steps:
1) extracting a variety of different descriptors from a training image set to prepare the training set of the descriptor translator, and training a hybrid auto-encoder with paired features;
2) training the hybrid encoder that maps source features to target features, and training the decoder with the two complementary paths of reconstruction and translation, where each encoder is specific to its descriptor and the decoder is shared, to obtain the translation loss and the reconstruction loss;
3) measuring the similarity between two kinds of descriptors using the translation loss and the reconstruction loss obtained in step 2);
4) inputting the descriptor to be translated into the corresponding translator, translating the descriptor with the translator obtained by training, and measuring the confidence of the result by the similarity in step 3).
2. The descriptor translation and similarity measurement method based on a hybrid encoder according to claim 1, characterized in that in step 1), the descriptors include hand-crafted descriptors or learned descriptors, and any descriptor extracted from images can serve as the object of translation.
3. The descriptor translation and similarity measurement method based on a hybrid encoder according to claim 1, characterized in that in step 2), the specific method of training the hybrid encoder that maps source features to target features is: the source descriptor V_s and the target descriptor V_t are first encoded by the encoders E_s and E_t to obtain the codes z_s and z_t, respectively; a shared decoder then decodes z_s and z_t to obtain the translated descriptor V_st and the reconstructed descriptor V_tt, respectively; the training loss function consists of the L2 norm between the reconstructed descriptor and the target descriptor, called the reconstruction loss, and the L2 norm between the translated descriptor and the target descriptor, called the translation loss.
4. The descriptor translation and similarity measurement method based on a hybrid encoder according to claim 1, characterized in that in step 3), the specific method of using the translation loss and the reconstruction loss obtained in step 2) to measure the similarity between two kinds of descriptors is: a directed similarity is first constructed from the difference between the translation loss and the reconstruction loss, and it is then normalized to obtain the final similarity.
5. The descriptor translation and similarity measurement method based on a hybrid encoder according to claim 1, characterized in that in step 4), the descriptor to be translated is input into the corresponding translator, the descriptor is translated with the translator obtained by training, and the confidence of the result is measured by the similarity in step 3); the specific method is:
(1) selecting the descriptor translator according to the target descriptor and the source descriptor;
(2) translating the source descriptor into the target descriptor;
(3) measuring the confidence of the translation result according to the similarity measure between descriptors;
(4) completing the associated downstream task with the translated descriptor.
CN201910630989.7A 2019-07-12 2019-07-12 Descriptor translation and similarity measurement method based on a hybrid encoder Pending CN110334363A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910630989.7A CN110334363A (en) 2019-07-12 2019-07-12 Descriptor translation and similarity measurement method based on a hybrid encoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910630989.7A CN110334363A (en) 2019-07-12 2019-07-12 Descriptor translation and similarity measurement method based on a hybrid encoder

Publications (1)

Publication Number Publication Date
CN110334363A true CN110334363A (en) 2019-10-15

Family

ID=68146680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910630989.7A Pending CN110334363A (en) 2019-07-12 2019-07-12 Descriptor translation and similarity measurement method based on a hybrid encoder

Country Status (1)

Country Link
CN (1) CN110334363A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107622311A * 2017-10-09 2018-01-23 深圳市唯特视科技有限公司 A robot imitation learning method based on contextual translation
CN107967262A * 2017-11-02 2018-04-27 内蒙古工业大学 A neural-network Mongolian-Chinese machine translation method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107622311A * 2017-10-09 2018-01-23 深圳市唯特视科技有限公司 A robot imitation learning method based on contextual translation
CN107967262A * 2017-11-02 2018-04-27 内蒙古工业大学 A neural-network Mongolian-Chinese machine translation method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIE HU: "Towards Visual Feature Translation", The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) *
XIANGWEN ZHANG: "Asynchronous Bidirectional Decoding for Neural Machine Translation", The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18) *

Similar Documents

Publication Publication Date Title
CN110232152B (en) Content recommendation method, device, server and storage medium
CN106776849B (en) Method for quickly searching scenic spots by using pictures and tour guide system
US8577882B2 (en) Method and system for searching multilingual documents
Stefanini et al. Artpedia: A new visual-semantic dataset with visual and contextual sentences in the artistic domain
WO2019169872A1 (en) Method and device for searching for content resource, and server
CN110399515B (en) Picture retrieval method, device and system
CN110516096A (en) Synthesis perception digital picture search
KR100471927B1 (en) System for searching image data being based on web and method thereof
JP2009537901A (en) Annotation by search
JP4699954B2 (en) Multimedia data management method and apparatus
US11568018B2 (en) Utilizing machine-learning models to generate identifier embeddings and determine digital connections between digital content items
CN103226547A (en) Method and device for producing verse for picture
WO2023108980A1 (en) Information push method and device based on text adversarial sample
CN108491543A (en) Image search method, image storage method and image indexing system
WO2021159812A1 (en) Cancer staging information processing method and apparatus, and storage medium
CN102508901A (en) Content-based massive image search method and content-based massive image search system
CN107391599B (en) Image retrieval method based on style characteristics
CN112948601A (en) Cross-modal Hash retrieval method based on controlled semantic embedding
CN114637886A (en) Machine vision system based on multiple protocols
CN112989811B (en) History book reading auxiliary system based on BiLSTM-CRF and control method thereof
CN110442736B (en) Semantic enhancer spatial cross-media retrieval method based on secondary discriminant analysis
Poornima et al. Multi-modal features and correlation incorporated Naive Bayes classifier for a semantic-enriched lecture video retrieval system
CN110334363A (en) 2019-10-15 Descriptor translation and similarity measurement method based on a hybrid encoder
Kim et al. Towards a fairer landmark recognition dataset
CN103092935A (en) Approximate copy image detection method based on scale invariant feature transform (SIFT) quantization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20191015