GB2546360B - Image captioning with weak supervision - Google Patents
Image captioning with weak supervision
- Publication number
- GB2546360B (application GB1618932.6A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- weak supervision
- image captioning
- captioning
- image
- supervision
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2155—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24143—Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/169—Annotation, e.g. comment data or footnotes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/55—Rule-based translation
- G06F40/56—Natural language generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/35—Categorising the entire scene, e.g. birthday party or wedding scene
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/70—Labelling scene content, e.g. deriving syntactic or semantic representations
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/488—Data services, e.g. news ticker
- H04N21/4884—Data services, e.g. news ticker for displaying subtitles
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/995,042 US9792534B2 (en) | 2016-01-13 | 2016-01-13 | Semantic natural language vector space |
US14/995,032 US9811765B2 (en) | 2016-01-13 | 2016-01-13 | Image captioning with weak supervision |
Publications (2)
Publication Number | Publication Date |
---|---|
GB2546360A (en) | 2017-07-19 |
GB2546360B (en) | 2020-08-19 |
Family
ID=59078284
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1618936.7A Active GB2547068B (en) | 2016-01-13 | 2016-11-09 | Semantic natural language vector space |
GB1618932.6A Active GB2546360B (en) | 2016-01-13 | 2016-11-09 | Image captioning with weak supervision |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1618936.7A Active GB2547068B (en) | 2016-01-13 | 2016-11-09 | Semantic natural language vector space |
Country Status (2)
Country | Link |
---|---|
DE (2) | DE102016013372A1 (en) |
GB (2) | GB2547068B (en) |
Families Citing this family (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108205684B (en) * | 2017-04-25 | 2022-02-11 | 北京市商汤科技开发有限公司 | Image disambiguation method, device, storage medium and electronic equipment |
CN107608943B (en) * | 2017-09-08 | 2020-07-28 | 中国石油大学(华东) | Image subtitle generating method and system fusing visual attention and semantic attention |
CN108108351B (en) * | 2017-12-05 | 2020-05-22 | 华南理工大学 | Text emotion classification method based on deep learning combination model |
CN108230413B (en) * | 2018-01-23 | 2021-07-06 | 北京市商汤科技开发有限公司 | Image description method and device, electronic equipment and computer storage medium |
CN108921764B (en) * | 2018-03-15 | 2022-10-25 | 中山大学 | Image steganography method and system based on generation countermeasure network |
CN108959512B (en) * | 2018-06-28 | 2022-04-29 | 清华大学 | Image description network and technology based on attribute enhanced attention model |
CN109086405B (en) * | 2018-08-01 | 2021-09-14 | 武汉大学 | Remote sensing image retrieval method and system based on significance and convolutional neural network |
CN109858487B (en) * | 2018-10-29 | 2023-01-17 | 温州大学 | Weak supervision semantic segmentation method based on watershed algorithm and image category label |
US11704487B2 (en) * | 2019-04-04 | 2023-07-18 | Beijing Jingdong Shangke Information Technology Co., Ltd. | System and method for fashion attributes extraction |
CN110191096B (en) * | 2019-04-30 | 2023-05-09 | 安徽工业大学 | Word vector webpage intrusion detection method based on semantic analysis |
CN110288665B (en) * | 2019-05-13 | 2021-01-15 | 中国科学院西安光学精密机械研究所 | Image description method based on convolutional neural network, computer-readable storage medium and electronic device |
CN110276001B (en) * | 2019-06-20 | 2021-10-08 | 北京百度网讯科技有限公司 | Checking page identification method and device, computing equipment and medium |
JP6830514B2 (en) | 2019-07-26 | 2021-02-17 | zro株式会社 | How visual and non-visual semantic attributes are associated with visuals and computing devices |
CN110472642B (en) * | 2019-08-19 | 2022-02-01 | 齐鲁工业大学 | Fine-grained image description method and system based on multi-level attention |
CN110750669B (en) * | 2019-09-19 | 2023-05-23 | 深思考人工智能机器人科技(北京)有限公司 | Method and system for generating image captions |
CN110851644A (en) * | 2019-11-04 | 2020-02-28 | 泰康保险集团股份有限公司 | Image retrieval method and device, computer-readable storage medium and electronic device |
CN111275110B (en) * | 2020-01-20 | 2023-06-09 | 北京百度网讯科技有限公司 | Image description method, device, electronic equipment and storage medium |
CN111444367B (en) * | 2020-03-24 | 2022-10-14 | 哈尔滨工程大学 | Image title generation method based on global and local attention mechanism |
CN111986730A (en) * | 2020-07-27 | 2020-11-24 | 中国科学院计算技术研究所苏州智能计算产业技术研究院 | Method for predicting siRNA silencing efficiency |
CN112580362B (en) * | 2020-12-18 | 2024-02-20 | 西安电子科技大学 | Visual behavior recognition method, system and computer readable medium based on text semantic supervision |
CN113128410A (en) * | 2021-04-21 | 2021-07-16 | 湖南大学 | Weak supervision pedestrian re-identification method based on track association learning |
CN115186655A (en) * | 2022-07-06 | 2022-10-14 | 重庆软江图灵人工智能科技有限公司 | Character semantic recognition method, system, medium and device based on deep learning |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20090112020A (en) * | 2008-04-23 | 2009-10-28 | 엔에이치엔(주) | System and method for extracting caption candidate and system and method for extracting image caption using text information and structural information of document |
CN105389326A (en) * | 2015-09-16 | 2016-03-09 | 中国科学院计算技术研究所 | Image annotation method based on weak matching probability canonical correlation model |
WO2016070098A2 (en) * | 2014-10-31 | 2016-05-06 | Paypal, Inc. | Determining categories for weakly labeled images |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101354704B (en) * | 2007-07-23 | 2011-01-12 | 夏普株式会社 | Apparatus for making grapheme characteristic dictionary and document image processing apparatus having the same |
CN104572940B (en) * | 2014-12-30 | 2017-11-21 | 中国人民解放军海军航空工程学院 | A kind of image automatic annotation method based on deep learning and canonical correlation analysis |
-
2016
- 2016-11-09 GB GB1618936.7A patent/GB2547068B/en active Active
- 2016-11-09 GB GB1618932.6A patent/GB2546360B/en active Active
- 2016-11-10 DE DE102016013372.4A patent/DE102016013372A1/en active Pending
- 2016-11-11 DE DE102016013487.9A patent/DE102016013487A1/en active Pending
Non-Patent Citations (1)
Title |
---|
Xi, Su Mei, and Young Im Cho. "Image caption automatic generation method based on weighted feature." Control, Automation and Systems (ICCAS), 2013 13th International Conference on. IEEE, 2013 * |
Also Published As
Publication number | Publication date |
---|---|
GB2546360A (en) | 2017-07-19 |
GB2547068B (en) | 2019-06-19 |
GB2547068A (en) | 2017-08-09 |
DE102016013487A1 (en) | 2017-07-13 |
DE102016013372A1 (en) | 2017-07-13 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
GB2546360B (en) | Image captioning with weak supervision | |
IL258803A (en) | Single image detection | |
HK1251427A1 (en) | Image analysis | |
ZA201708648B (en) | Stabilizing video | |
GB201507320D0 (en) | Camera | |
AU201612508S (en) | Video camera | |
LT3256486T (en) | Griffithsin mutants | |
GB2541589B (en) | Image modification | |
PL3131064T3 (en) | Searching image content | |
GB201515953D0 (en) | Improved overflow device | |
SG10201608874WA (en) | Lens | |
ZA201707892B (en) | Video laryngoscopes | |
IL253940A0 (en) | Video encoder | |
GB201619936D0 (en) | Projector with improved contrast | |
GB201505049D0 (en) | Video guide system | |
GB201404101D0 (en) | Image modification | |
GB2550124B (en) | Camera | |
GB201521000D0 (en) | Video content synchronisation | |
GB2578263B (en) | Video laryngoscopes | |
GB201520066D0 (en) | Video system | |
EP3206627C0 (en) | An improved lens design | |
GB201600522D0 (en) | TV etc subtitles | |
GB201719957D0 (en) | Instamatic cameras | |
GB201515075D0 (en) | Television | |
GB201514293D0 (en) | Television |