CN104933029A - Text image joint semantics analysis method based on probability theme model - Google Patents

Info

Publication number
CN104933029A
Authority
CN
China
Prior art keywords
image
text
theme
semantic
semantics
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510350978.5A
Other languages
Chinese (zh)
Inventor
朱海龙
庞彦伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201510350978.5A
Publication of CN104933029A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a joint text-image semantic analysis method based on a probabilistic topic model. The method comprises the following steps: collecting a large number of texts that contain images, preprocessing the texts and images appropriately, and pairing images with texts one to one to form an image-text pair database; training on these samples to obtain a joint topic distribution model for text-image semantic analysis; for an input image to be analyzed, extracting its visual words; applying a PLSA (Probabilistic Latent Semantic Analysis) model to the image and its visual words and combining the result with the joint text-image topic distribution to obtain the topic semantics of the image; matching these topic semantics against the topics of the texts in the image-text pair database to select the best-matching text; and performing a semantic evaluation of the matched text in combination with the input image. The method can obtain semantic knowledge beyond directly visible scene and object information.

Description

A joint text-image semantic analysis method based on a probabilistic topic model
Technical field
The present invention relates to text-image semantic analysis in fields such as computer vision, pattern analysis, and artificial intelligence, and in particular to a joint text-image semantic analysis method based on a probabilistic topic model.
Background
Image understanding (IU) is the semantic interpretation of images. Taking images as its object and knowledge as its core, it studies what targets are present in an image, how those targets relate to one another (what is where), what scene the image depicts, and how the scene can be applied. Its input is data and its output is knowledge, which places it among the high-level topics of image research [1][2]. Semantics, as the basic carrier for describing knowledge, can convert the complete content of an image into text-like expressions that are intuitive to understand, and therefore plays a vital role in image understanding.
Semantic analysis has great application potential in image understanding. The rich semantic knowledge in images can support more accurate image search engines, intelligent digital photo albums, and the generation of visual scene descriptions in virtual worlds. In research on image understanding itself, it can also form an effective mutually driven "data-knowledge" system that incorporates meaningful context information and hierarchical structure information, so that specific targets in a scene can be recognized and detected faster and more accurately.
Although semantic analysis occupies a very important position in image understanding, traditional image analysis methods have largely avoided semantics and analyze only the raw image data. There are two main reasons: 1) it is difficult to establish a reasonable association between the visual representation of an image and its semantics, which produces a large semantic gap between the described entities; 2) semantics are themselves polysemous and ambiguous. A growing body of research now addresses these bottlenecks and seeks effective models and methods for representing semantics in image understanding.
Bridging the semantic gap in image understanding requires establishing a correspondence between images and text. Existing approaches fall roughly into three classes. The first focuses on the image itself: it builds models or methods consistent with image content and incorporates semantics into them implicitly, establishing a directed "text to image" link; the core question is how to fuse semantics into the model or method. Work following this strategy is mostly generative or discriminative. The second starts from the syntax (grammar) and structural relations of semantic expression: it analyzes their components and mutual relations, builds a similarly structured representation of the visual elements of an image, and explicitly embeds semantic description and analysis into a visual graph that encodes syntactic relations, establishing a directed "image to text" link; the core question is how to build a visual relation graph that obeys semantic rules. The third is application-oriented: centered on content-based image retrieval (CBIR), it enlarges the semantic vocabulary and builds multi-semantic, multi-user, multi-process image retrieval and query systems.
Resolving the ambiguity of semantics itself requires a reasonable description standard and structural system. As early as the 1980s, cognitive scientists and linguists at Princeton University studied and built a relatively unified tree-like structure [3]. It is now regarded as the generally accepted reference standard for semantic relations in visual image research; in the design and annotation of large-scale image datasets it effectively classifies and unifies polysemous words. In addition, objective evaluation criteria for semantic retrieval are being actively explored.
Objective semantic evaluation is an important way to measure the quality of an algorithm. Traditional methods generally evaluate recall and precision over a limited set of semantic classes, judging whether each target in the scene is detected; the recall-precision curve (RPC) formed by the two indices usually serves as the basic evaluation object.
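As an illustration of the recall/precision evaluation described above, the following minimal Python sketch computes both indices for a single image. The function name `precision_recall` and the label lists are hypothetical and chosen only for illustration; the source does not specify how labels are represented.

```python
def precision_recall(predicted, relevant):
    """Return (precision, recall) for predicted labels against ground-truth labels."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    # Precision: fraction of predicted labels that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of ground-truth labels that were found.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical scene labels for one image (assumed data, not from the source).
predicted_labels = ["person", "car", "tree", "dog"]
ground_truth = ["person", "car", "building"]
p, r = precision_recall(predicted_labels, ground_truth)
```

Sweeping a detection threshold and recording the resulting (recall, precision) pairs would trace out a recall-precision curve.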
This patent mainly addresses the semantic gap in image understanding by establishing a correspondence between images and text. It borrows the probabilistic topic model analysis used in text semantic analysis to obtain a joint text-image semantic analysis method, which belongs to the generative class of image semantic analysis methods. Semantic understanding of text is by now fairly mature, with models such as probabilistic latent semantic analysis (PLSA) [4][5] and latent Dirichlet allocation (LDA) [6]. Existing work that combines text and images has applied probabilistic topic models to the enrichment of text [7][8][9], but not to the semantic understanding of images.
Borrowing the text analysis strategy first requires building corresponding objects: a whole image corresponds to a whole document, and the vocabulary (lexicon) of the document corresponds to visual words. Visual words are generally obtained by extracting low-level image features through saliency analysis of the image information. Low-level features come mostly from the image data itself and include simple point, line, and surface features as well as some special composite features. Suitable visual words are then generated through a robust feature representation; visual words generally have high reusability and certain invariance properties.
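The image-to-document correspondence can be sketched as follows. This is only an illustrative assumption: the source does not say how visual words are generated, so the sketch uses a common bag-of-visual-words construction (k-means quantization of local descriptors), with random vectors standing in for real image features. All function names are hypothetical.

```python
import numpy as np

def assign_words(descriptors, centroids):
    """Map each descriptor to the index of its nearest centroid (Euclidean distance)."""
    dists = np.linalg.norm(descriptors[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def build_codebook(descriptors, num_words, num_iters=10, seed=0):
    """Cluster descriptors with plain Lloyd's k-means; the centroids act as visual words."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), num_words, replace=False)
    centroids = descriptors[start].copy()
    for _ in range(num_iters):
        labels = assign_words(descriptors, centroids)
        for k in range(num_words):
            members = descriptors[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[k] = members.mean(axis=0)
    return centroids

def word_histogram(descriptors, centroids):
    """Count visual-word occurrences in one image: one row of the count matrix c(w, d)."""
    labels = assign_words(descriptors, centroids)
    return np.bincount(labels, minlength=len(centroids))

rng = np.random.default_rng(0)
training_descriptors = rng.normal(size=(200, 8))  # stand-in for real local features
codebook = build_codebook(training_descriptors, num_words=5)
image_descriptors = rng.normal(size=(30, 8))      # descriptors of one image
hist = word_histogram(image_descriptors, codebook)
```

The resulting histogram plays the role that a word-count vector plays for a text document.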
References:
[1] J. Gao, Z. Xie. Image Understanding Theory and Approach. Beijing, China: Science Press, 2009 (in Chinese).
[2] Z. Xie, J. Gao. A Novel Method for Scene Categorization with Constraint Mechanism Based on Gaussian Statistical Model [J]. Acta Electronica Sinica, 2009 (in Chinese).
[3] D. Cruse. Lexical Semantics. Cambridge, UK: Cambridge University Press, 1986.
[4] T. Hofmann. Unsupervised Learning by Probabilistic Latent Semantic Analysis [J]. Machine Learning, 2001.
[5] T. Hofmann. Probabilistic Latent Semantic Indexing [C]. Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence. Stockholm, Sweden, 1999.
[6] D. M. Blei, A. Y. Ng, M. I. Jordan. Latent Dirichlet Allocation [J]. Journal of Machine Learning Research, 2003.
[7] M. Bressan, G. Csurka, Y. Hoppenot, J. M. Renders. Travel Blog Assistant System (TBAS) - An Example Scenario of How to Enrich Text with Images and Images with Text using Online Multimedia Repositories [C]. VISAPP Workshop on Metadata Mining for Image Understanding, 2008.
[8] Y. Pang, X. Lu, Y. Yuan, X. Li. Travelogue enriching and scenic spot overview based on textual and visual topic models [J]. International Journal of Pattern Recognition and Artificial Intelligence, 2011.
[9] Y. Pang, Q. Hao, Y. Yuan, T. Hu, R. Cai, L. Zhang. Summarizing tourist destinations by mining user-generated travelogues and photos [J]. Computer Vision and Image Understanding, 2011.
[10] Z. Xie, J. Gao. Object Localization Based on Visual Statistical Probabilistic Models [J]. Journal of Image and Graphics, 2007, 12(7): 1234-1242 (in Chinese).
[11] H. P. Moravec. Obstacle Avoidance and Navigation in the Real World by a Seeing Robot Rover. Technical Report CMU-RI-TR-80-03, Pittsburgh, USA: Carnegie Mellon University, Robotics Institute, 1980.
Summary of the invention
The object of the invention is to overcome the following shortcomings: traditional image analysis methods avoid semantics and analyze only raw image data; image semantic analysis methods based on region segmentation and object labeling provide little information and reach only a shallow level of understanding of image content; and the background of an image and the relations between scene and targets remain unclear. To this end, the invention proposes a joint text-image semantic analysis method based on a probabilistic topic model, which exploits the large volume of data available on the web to mine the rich high-level semantics of images as fully as possible. The technical scheme of the invention is as follows:
A joint text-image semantic analysis method based on a probabilistic topic model comprises the following steps:
Step 1: collect a large number of texts that contain images, preprocess the texts and images appropriately, and pair images with texts one to one to form an image-text pair database;
Step 2: train on these samples to obtain a joint topic distribution model for text-image semantic analysis;
Step 3: for an input image to be analyzed, extract its visual words;
Step 4: apply the PLSA model to the image and its visual words and, in combination with the joint text-image topic distribution, obtain the topic semantics of the image to be analyzed;
Step 5: match the topic semantics obtained in the previous step against the topics of the texts in the image-text pair database and select the best-matching text; the match may use a similarity measure such as Euclidean distance, KL divergence, or cosine similarity;
Step 6: perform a semantic evaluation of the matched text in combination with the input image.
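The matching in Step 5 can be sketched in Python as follows, under the assumption that each text and the query image are represented by a topic distribution vector. The function names, the `measure` argument, and the example vectors are hypothetical; the source names only the three measures, not how they are applied.

```python
import numpy as np

def euclidean_distance(p, q):
    return float(np.linalg.norm(p - q))

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions; eps guards against log(0)."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def cosine_similarity(p, q):
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))

def best_match(query, database, measure="cosine"):
    """Return the index of the database row that best matches the query."""
    if measure == "cosine":
        # Cosine is a similarity: larger is better.
        return int(np.argmax([cosine_similarity(query, t) for t in database]))
    if measure == "euclidean":
        # Distances and divergences: smaller is better.
        return int(np.argmin([euclidean_distance(query, t) for t in database]))
    if measure == "kl":
        return int(np.argmin([kl_divergence(query, t) for t in database]))
    raise ValueError(f"unknown measure: {measure}")

# Assumed topic distributions of three database texts and one query image.
text_topics = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.8, 0.1],
                        [0.2, 0.2, 0.6]])
image_topics = np.array([0.15, 0.75, 0.10])
matches = {m: best_match(image_topics, text_topics, m)
           for m in ("cosine", "euclidean", "kl")}
```

In this toy case all three measures agree on the second text; on real data they can disagree, since KL divergence is asymmetric and sensitive to near-zero probabilities.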
Adopt the method for the invention, analyzed by text image combination semantic, the abundanter semantic information except object except seeing intuitively from image and scene can be obtained.Relative to traditional simple image object and area marking, and the image understanding method of image Scene afterwards and object relationship, text image combination semantic analytical approach based on probability topic model by means of the strength of large data, when carrying out image understanding, employ the reference of more text image information, the more multi-semantic meaning knowledge except object scene information intuitively can be obtained, backstory of such as news picture etc.
Brief description of the drawings
The overall framework of the technical scheme adopted by the invention is described with reference to the accompanying drawings.
Fig. 1 shows the probabilistic latent semantic analysis model for images.
Fig. 2 shows the learning process of the joint text-image analysis model based on the probabilistic topic model.
Fig. 3 shows the implementation flow of the joint text-image semantic analysis method adopted by the invention.
Detailed description of the embodiments
The preferred embodiment is briefly described below using the semantic understanding of news pictures as a concrete example; the invention is of course not limited to any particular category of text or image.
In the PLSA model for text semantic analysis, the joint probability distribution of documents and words is expressed as
$$P(w, d) = P(d) \sum_{z} P(w \mid z)\, P(z \mid d) \tag{1.1}$$
where d denotes a document, w a word in the document, and z a topic of the document; P(d) is the probability of the document, P(w|z) is the conditional distribution of words given a topic, and P(z|d) is the conditional distribution of topics given a document.
Under the PLSA model, the parameters to be estimated are θ = {P(w | z_k), P(z_k | d) | w ∈ V, d ∈ C, 1 ≤ k ≤ K}, where C is the document collection, V is the set of all words in C, and K is the number of topics. Writing c(w, d) for the number of times word w occurs in document d, the log-likelihood of the collection C can be expressed as:
$$L(\theta) = \log P(C \mid \theta) = \sum_{d \in C} \sum_{w \in V} c(w, d) \log P(w, d) = \sum_{d \in C} \sum_{w \in V} c(w, d) \log \sum_{k=1}^{K} P(z_k \mid d)\, P(w \mid z_k) \tag{1.2}$$
The EM algorithm is applied iteratively to obtain the distribution of the latent topic variable:
$$\begin{aligned} P(z_k \mid d_i) &= \frac{\sum_{w} c(w, d_i)\, P(z_k \mid d_i, w)}{\sum_{w} c(w, d_i)}, && k = 1, \dots, K,\ d_i \in C \\ P(w_j \mid z_k) &= \frac{\sum_{d} c(w_j, d)\, P(z_k \mid d, w_j)}{\sum_{d} \sum_{w} c(w, d)\, P(z_k \mid d, w)}, && k = 1, \dots, K,\ w_j \in V \end{aligned} \tag{1.3}$$
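The EM updates in (1.3) can be sketched with NumPy as follows. The sketch assumes the standard PLSA E-step, in which the posterior P(z_k | d, w) is proportional to P(z_k | d) P(w | z_k); the source uses this posterior but does not write it out. The function name `plsa_em`, the random initialization, the small constant `eps`, and the toy count matrix are all illustrative assumptions.

```python
import numpy as np

def plsa_em(counts, num_topics, num_iters=50, seed=0):
    """Fit PLSA by EM on a (D, W) count matrix c(w, d).

    Returns P(z|d) with shape (D, K), P(w|z) with shape (K, W), and the
    value of the last expression in (1.2) at the start of each iteration.
    """
    rng = np.random.default_rng(seed)
    num_docs, num_words = counts.shape
    p_z_d = rng.random((num_docs, num_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((num_topics, num_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # numerical guard only
    history = []
    for _ in range(num_iters):
        # E-step: P(z_k | d, w) proportional to P(z_k | d) P(w | z_k); shape (D, W, K).
        joint = p_z_d[:, None, :] * p_w_z.T[None, :, :]
        mixture = joint.sum(axis=2)  # sum over k, shape (D, W)
        history.append(float(np.sum(counts * np.log(mixture + eps))))
        posterior = joint / (mixture[:, :, None] + eps)
        # M-step: the two updates of (1.3).
        weighted = counts[:, :, None] * posterior  # c(w, d) P(z_k | d, w)
        p_z_d = weighted.sum(axis=1) / (counts.sum(axis=1, keepdims=True) + eps)
        p_w_z = weighted.sum(axis=0).T
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z, history

# Toy count matrix: 4 documents (or images) over 4 words (assumed data).
counts = np.array([[4, 3, 0, 0],
                   [5, 2, 1, 0],
                   [0, 0, 3, 4],
                   [0, 1, 2, 5]], dtype=float)
p_z_d, p_w_z, history = plsa_em(counts, num_topics=2)
```

The dense (D, W, K) posterior array keeps the code close to the formulas; for a realistic corpus a sparse formulation would be preferable.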
Earlier research borrowed Hofmann's probabilistic latent semantic analysis (PLSA) model from text analysis, placing the "semantic" description in a latent space Z and generating corresponding "topic" nodes; the basic structure is shown in Fig. 1. D is the set of M images d, and z denotes the concept category of a target (called a topic). Each image is formed as a convex combination of K topic vectors. Parameters are estimated iteratively by maximum likelihood; the likelihood takes an exponential form in p(w|d) and depends on the frequency with which semantic words occur in images. The model alternates between the E-step (computing the posterior expectation of the latent variable) and the M-step (updating the parameters to maximize the likelihood) of the expectation-maximization (EM) algorithm. In the decision stage, the latent semantic assignment satisfies
$$z^{*} = \arg\max_{z} P(z \mid d) \tag{1.4}$$
The PLSA model establishes the correspondence between features and images through a latent variable, and each text unit is a proportional combination of several semantic concepts. The semantic distribution in the latent space is nevertheless, in essence, a sparse discrete distribution, which makes it difficult to satisfy the sufficiency conditions of statistics. Moreover, a combination of semantic concepts obtained from the visual words of an image alone carries little information beyond the visual words themselves, so it can hardly supply richer semantics or more background knowledge about the image.
To obtain richer and more complete semantic knowledge from the visual words of an image, the invention combines the image PLSA model with a text PLSA model to form a joint text-image semantic analysis method based on a probabilistic topic model. For image-text pairs, a joint text-image topic distribution is first built from the image PLSA model and the corresponding text PLSA model, on the principle that paired text and image share the same topics. For an input image, the relevant topic concepts are then obtained by combining the image PLSA model with the joint text-image topic distribution. Finally, these topic concepts are matched against the topics of the texts in the database, and the text of the best match serves as the semantic interpretation of the input image.
After text and image have each been processed by a PLSA model, their semantic expressions remain polysemous and diverse. To obtain a common semantic expression, the decision step that selects the topic variable uses the following formula:
$$z^{*} = \arg\max_{z} \left[ P(z \mid d_{doc}) + \lambda\, P(z \mid d_{img}) \right] \tag{1.5}$$
The topic variable that expresses both text and image is selected and assigned to both. Because text and image have an inherent many-to-many relationship, topics are assigned additively: only topics not yet assigned are appended to the text or image. The parameter λ balances the semantic weights of text and image.
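The decision rule (1.5) can be illustrated with a few lines of Python. The function name `joint_topic` and the example distributions are assumptions made for illustration; the source gives no values for λ.

```python
import numpy as np

def joint_topic(p_z_doc, p_z_img, lam=1.0):
    """Pick z* = argmax_z [P(z|d_doc) + lam * P(z|d_img)] as in (1.5)."""
    scores = p_z_doc + lam * p_z_img
    return int(np.argmax(scores)), scores

# Assumed topic distributions of one text and its paired image.
p_z_doc = np.array([0.5, 0.3, 0.2])
p_z_img = np.array([0.1, 0.6, 0.3])

topic_equal, _ = joint_topic(p_z_doc, p_z_img, lam=1.0)  # image and text weighted equally
topic_text, _ = joint_topic(p_z_doc, p_z_img, lam=0.1)   # text dominates
```

With λ = 1 the image's strong preference for the second topic wins; with λ = 0.1 the text's preferred first topic is chosen, which shows how λ shifts the balance between the two modalities.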
In summary, the method of the invention proceeds as follows:
Step 1: collect a large number of Chinese news reports with pictures, and separate the news text from the pictures to form one-to-one relations; if a news report contains several pictures, each picture forms its own one-to-one text-image pair with the text.
Step 2: obtain the joint text-image topic probability distribution and generate topics for each text-image pair.
Step 3: for an input image to be analyzed, extract its visual words in the same way that images are processed in Step 2.
Step 4: apply the PLSA model to the image and its visual words and, in combination with the topic set of the joint text-image topic distribution, obtain the topic vector of the image to be analyzed.
Step 5: match the topic vector obtained in the previous step against the topics of the texts in the image-text pair database and select the best-matching text. The match may use a similarity measure such as Euclidean distance, KL divergence, or cosine similarity.
Step 6: for the matched text, compute the recall and precision of the scene objects. Alternatively, the semantic evaluation can compute the similarity between the input image and the image paired with the best-matching text, and use it as the accuracy of the semantic understanding.
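Step 4 requires a topic vector for an image that was not part of training. The source does not say how this is computed; one common option with PLSA is "fold-in", which reruns EM for the new image while keeping the learned P(w|z) fixed. The sketch below assumes that option, and its function name and toy values are hypothetical.

```python
import numpy as np

def fold_in(word_counts, p_w_z, num_iters=30):
    """Estimate P(z | d_new) for one new image with P(w|z) held fixed."""
    num_topics = p_w_z.shape[0]
    p_z_d = np.full(num_topics, 1.0 / num_topics)  # uniform start
    for _ in range(num_iters):
        # E-step: posterior P(z_k | d_new, w), shape (K, W).
        joint = p_z_d[:, None] * p_w_z
        posterior = joint / (joint.sum(axis=0, keepdims=True) + 1e-12)
        # M-step for P(z | d_new) only; P(w|z) is not updated.
        p_z_d = (posterior * word_counts[None, :]).sum(axis=1) / word_counts.sum()
    return p_z_d

# Assumed trained word distributions for two topics over four visual words.
p_w_z = np.array([[0.6, 0.3, 0.1, 0.0],
                  [0.0, 0.1, 0.3, 0.6]])
new_image_counts = np.array([0.0, 1.0, 4.0, 5.0])  # visual-word histogram of the query
theta = fold_in(new_image_counts, p_w_z)
```

The resulting vector `theta` can then be compared with the stored text topic vectors using any of the similarity measures named in Step 5.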

Claims (1)

1. A joint text-image semantic analysis method based on a probabilistic topic model, comprising the following steps:
Step 1: collect a large number of texts that contain images, preprocess the texts and images appropriately, and pair images with texts one to one to form an image-text pair database;
Step 2: train on these samples to obtain a joint topic distribution model for text-image semantic analysis;
Step 3: for an input image to be analyzed, extract its visual words;
Step 4: apply the PLSA model to the image and its visual words and, in combination with the joint text-image topic distribution, obtain the topic semantics of the image to be analyzed;
Step 5: match the topic semantics obtained in the previous step against the topics of the texts in the image-text pair database and select the best-matching text; the match may use a similarity measure such as Euclidean distance, KL divergence, or cosine similarity;
Step 6: perform a semantic evaluation of the matched text in combination with the input image.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510350978.5A CN104933029A (en) 2015-06-23 2015-06-23 Text image joint semantics analysis method based on probability theme model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510350978.5A CN104933029A (en) 2015-06-23 2015-06-23 Text image joint semantics analysis method based on probability theme model

Publications (1)

Publication Number Publication Date
CN104933029A true CN104933029A (en) 2015-09-23

Family

ID=54120198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510350978.5A Pending CN104933029A (en) 2015-06-23 2015-06-23 Text image joint semantics analysis method based on probability theme model

Country Status (1)

Country Link
CN (1) CN104933029A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106682060A (en) * 2015-11-11 2017-05-17 奥多比公司 Structured Knowledge Modeling, Extraction and Localization from Images
CN107785014A (en) * 2017-10-23 2018-03-09 上海百芝龙网络科技有限公司 A kind of home scenarios semantic understanding method
CN107967494A (en) * 2017-12-20 2018-04-27 华东理工大学 A kind of image-region mask method of view-based access control model semantic relation figure
CN108647705A (en) * 2018-04-23 2018-10-12 北京交通大学 Image, semantic disambiguation method and device based on image and text semantic similarity
CN109559315A (en) * 2018-09-28 2019-04-02 天津大学 A kind of water surface dividing method based on multipath deep neural network
CN110413819A (en) * 2019-07-12 2019-11-05 深兰科技(上海)有限公司 A kind of acquisition methods and device of picture description information
CN111276149A (en) * 2020-01-19 2020-06-12 科大讯飞股份有限公司 Voice recognition method, device, equipment and readable storage medium
CN111383302A (en) * 2018-12-29 2020-07-07 中兴通讯股份有限公司 Image collocation method and device, terminal and computer readable storage medium
CN112508048A (en) * 2020-10-22 2021-03-16 复旦大学 Image description generation method and device
CN112765992A (en) * 2021-01-14 2021-05-07 深圳市人马互动科技有限公司 Training data construction method and device, computer equipment and storage medium
CN116244306A (en) * 2023-01-10 2023-06-09 江苏理工学院 Academic paper quotation recommendation method and system based on knowledge organization semantic relation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101388022A (en) * 2008-08-12 2009-03-18 北京交通大学 Web portrait search method for fusing text semantic and vision content
CN101853295A (en) * 2010-05-28 2010-10-06 天津大学 Image search method
CN102314610A (en) * 2010-07-07 2012-01-11 北京师范大学 Object-oriented image clustering method based on probabilistic latent semantic analysis (PLSA) model
CN104268546A (en) * 2014-05-28 2015-01-07 苏州大学 Dynamic scene classification method based on topic model
CN104573711A (en) * 2014-12-22 2015-04-29 上海交通大学 Object and scene image understanding method based on text-object-scene relations

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
MARCO BRESSAN ET AL: "Travel Blog Assistant System (TBAS) - An Example Scenario of How to Enrich Text with Images and Images with Text using Online Multimedia Repositories", 《VISAPP WORKSHOP ON METADATA MINING FOR IMAGE UNDERSTANDING》 *
YANWEI PANG ET AL: "Summarizing tourist destinations by mining user-generated travelogues and photos", 《COMPUTER VISION AND IMAGE UNDERSTANDING》 *
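LI Zhixin (李志欣) et al.: "Image semantic annotation method modeling continuous visual features" (建模连续视觉特征的图像语义标注方法), 《Journal of Computer-Aided Design & Computer Graphics》 (计算机辅助设计与图形学学报) *
LI Zhixin (李志欣) et al.: "Automatic image annotation fusing semantic topics" (融合语义主题的图像自动标注), 《Journal of Software》 (软件学报) *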

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106682060A (en) * 2015-11-11 2017-05-17 奥多比公司 Structured Knowledge Modeling, Extraction and Localization from Images
CN106682060B (en) * 2015-11-11 2022-03-15 奥多比公司 Modeling, extracting, and localizing from structured knowledge of images
CN107785014A (en) * 2017-10-23 2018-03-09 上海百芝龙网络科技有限公司 A kind of home scenarios semantic understanding method
CN107967494B (en) * 2017-12-20 2020-12-11 华东理工大学 Image region labeling method based on visual semantic relation graph
CN107967494A (en) * 2017-12-20 2018-04-27 华东理工大学 A kind of image-region mask method of view-based access control model semantic relation figure
CN108647705A (en) * 2018-04-23 2018-10-12 北京交通大学 Image, semantic disambiguation method and device based on image and text semantic similarity
CN109559315A (en) * 2018-09-28 2019-04-02 天津大学 A kind of water surface dividing method based on multipath deep neural network
CN109559315B (en) * 2018-09-28 2023-06-02 天津大学 Water surface segmentation method based on multipath deep neural network
CN111383302A (en) * 2018-12-29 2020-07-07 中兴通讯股份有限公司 Image collocation method and device, terminal and computer readable storage medium
CN110413819A (en) * 2019-07-12 2019-11-05 深兰科技(上海)有限公司 A kind of acquisition methods and device of picture description information
CN110413819B (en) * 2019-07-12 2022-03-29 深兰科技(上海)有限公司 Method and device for acquiring picture description information
CN111276149A (en) * 2020-01-19 2020-06-12 科大讯飞股份有限公司 Voice recognition method, device, equipment and readable storage medium
CN111276149B (en) * 2020-01-19 2023-04-18 科大讯飞股份有限公司 Voice recognition method, device, equipment and readable storage medium
CN112508048A (en) * 2020-10-22 2021-03-16 复旦大学 Image description generation method and device
CN112508048B (en) * 2020-10-22 2023-06-06 复旦大学 Image description generation method and device
CN112765992A (en) * 2021-01-14 2021-05-07 深圳市人马互动科技有限公司 Training data construction method and device, computer equipment and storage medium
CN116244306A (en) * 2023-01-10 2023-06-09 江苏理工学院 Academic paper quotation recommendation method and system based on knowledge organization semantic relation
CN116244306B (en) * 2023-01-10 2023-11-03 江苏理工学院 Academic paper quotation recommendation method and system based on knowledge organization semantic relation

Similar Documents

Publication Publication Date Title
CN104933029A (en) Text image joint semantics analysis method based on probability theme model
CN104899253B (en) Towards the society image across modality images-label degree of correlation learning method
Devika et al. Sentiment analysis: a comparative study on different approaches
CN110598005B (en) Public safety event-oriented multi-source heterogeneous data knowledge graph construction method
Mohammed et al. A state-of-the-art survey on semantic similarity for document clustering using GloVe and density-based algorithms
CN105760507B (en) Cross-module state topic relativity modeling method based on deep learning
US9183467B2 (en) Sketch segmentation
Gao et al. Multi‐dimensional data modelling of video image action recognition and motion capture in deep learning framework
CN106056082B (en) A kind of video actions recognition methods based on sparse low-rank coding
Liu et al. Weakly supervised graph propagation towards collective image parsing
Chen et al. Differential topic models
Zhu et al. Visual relationship detection with object spatial distribution
Xu et al. Multi-modal transformer with global-local alignment for composed query image retrieval
Soltanian et al. Hierarchical concept score postprocessing and concept-wise normalization in CNN-based video event recognition
Gu et al. Toward facial expression recognition in the wild via noise-tolerant network
Pan et al. A bottom-up summarization algorithm for videos in the wild
Wajid et al. Deep learning and knowledge graph for image/video captioning: A review of datasets, evaluation metrics, and methods
Yan et al. Mitigating label-noise for facial expression recognition in the wild
Al-Tameemi et al. Multi-model fusion framework using deep learning for visual-textual sentiment classification
Long et al. Bi-calibration networks for weakly-supervised video representation learning
Li et al. Self-supervised nodes-hyperedges embedding for heterogeneous information network learning
Bertini et al. Learning ontology rules for semantic video annotation
Wang et al. Improving embedding learning by virtual attribute decoupling for text-based person search
Sun et al. Learning spatio-temporal co-occurrence correlograms for efficient human action classification
Feng et al. Multiple style exploration for story unit segmentation of broadcast news video

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150923
