CN111985505A - Interest visual relationship detection method and device based on interest propagation network - Google Patents
- Publication number
- CN111985505A (application CN202010848981.0A)
- Authority
- CN
- China
- Prior art keywords
- interest
- visual
- relation
- predicate
- pair
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
Abstract
Objects are extracted from an input image and combined pairwise to obtain object pairs; the corresponding object features and joint features are calculated, and the visual, semantic, and position features of the objects and object pairs are generated. Interest features of the objects and object pairs are obtained through linear transformation, from which the interest degree of each object pair is predicted; interest features of a relational predicate are likewise obtained through linear transformation of the visual, semantic, and position features of the object pair's relational predicate, and the interest degree of the relational predicates between objects is predicted. Finally, the object-pair interest degree is combined with the relational predicate interest degree to obtain a visual relationship interest degree, and the visual relationships with high interest degree are the finally detected interest visual relationships. The invention can predict relationship interest degree more reasonably by taking semantic importance as the standard in the process of detecting visual relationships, finds interest visual relationships that accurately convey the main content of the image, and has good universality and practicability.
Description
Technical Field
The invention belongs to the technical field of computer vision, relates to visual relation detection in images, and particularly relates to an interest visual relation detection method based on an interest propagation network.
Background Art
As a bridge between vision and natural language, visual relationship detection aims to describe the objects in an image and the interactions between them in the form of relationship triples of <subject, relationship predicate, object>. The subject and object are generally represented by an object's border and category, while relational predicates are typically verbs (e.g., "raise", "ride", "see"), orientation words (e.g., "beside", "in front of", "above"), or verb phrases (e.g., "stand beside", "sit on", "walk through"). Visual relationship detection can help a machine understand and analyze the content of an image or video, and can be widely applied in scenes such as image retrieval and video analysis.
Conventional visual relationship detection methods aim to detect all visual relationships in an image. In fact, due to the combinatorial explosion of subjects, relational predicates, and objects, conventional methods typically detect a very rich set of visual relationships, as shown in fig. 2. Although this describes the image content more fully, too much detail may mislead the machine's understanding of the main image content, causing a loss of precision in scenes such as image retrieval and hindering accurate machine analysis of the image or video.
Intuitively, not all detected visual relationships are semantically "interesting"; that is, not every visual relationship expresses the main content of an image. Often only a small part of the relationships is important for conveying the main content of the image, and such relationships are interest visual relationships. The goal of interest visual relationship detection is to detect the visual relationships that really matter for conveying the main content of the image, i.e., the "interesting" visual relationships.
At present, no research work has attempted to detect interest visual relationships; only some related works measure the visual saliency of relationships through an attention module, determine saliency weights for the relationships, and thereby find salient visual relationships. However, such methods only take into account the visual saliency of a relationship, not its semantic importance, so the obtained relationships are not necessarily really "interesting".
Disclosure of Invention
The invention aims to solve the following problem: an excessively rich set of visual relationships in an image easily causes machine understanding deviation, so the interest visual relationships that accurately convey the main content of the image need to be detected in order to help a machine understand and analyze images or videos more accurately.
The technical scheme of the invention is as follows: an interest visual relationship detection method based on an interest propagation network, comprising establishing the interest propagation network, inputting an image, and outputting the interest visual relationships in the image, wherein the interest propagation network comprises a panoramic object detection module, an object interest prediction module, and a relational predicate interest prediction module. Firstly, objects are extracted from the input image through the panoramic object detection module and combined pairwise into object pairs; the object features of the objects and the joint features of the object pairs are calculated; the visual, semantic, and position features of the objects and object pairs are generated in the object interest prediction module, and the interest features of the objects and object pairs are respectively obtained through linear transformation so as to predict the interest degree of each object pair. Meanwhile, the relational predicate interest prediction module obtains the interest features of the relational predicates by linearly transforming the visual, semantic, and position features of the object pairs' relational predicates, and predicts the relational predicate interest degrees between objects using semi-supervised learning. Finally, the object interest degree is combined with the relational predicate interest degree to obtain the visual relationship interest degree, and the visual relationships with high interest degree are the finally detected interest visual relationships.
Further, the invention comprises the following steps:
1) extracting the frames and categories of all objects from the input image, calculating the features within the frames of the n objects as the object features, combining the n objects pairwise to form n(n-1) object pairs, and calculating the features within the union frame of the subject and object in each object pair as the joint features;
2) for each object, obtaining the word embedding feature of its class name from a pre-trained GloVe model; taking the object feature as the visual feature, the word embedding feature of the class name as the semantic feature, and the position of the object relative to the whole image as the position feature, and combining the three features to obtain the interest feature of the object; for each object pair, calculating the three features of the subject and the object respectively in the same way, then calculating the three features of the object pair, and combining them to obtain the interest feature of the object pair; inputting the interest features of the objects and object pairs into a graph convolutional neural network to predict the interest degree of each object pair;
3) for each object pair, calculating the visual, semantic, and position features of its relational predicate to obtain the interest feature of the relational predicate; for each relational predicate, using semi-supervised learning to predict the probability that the relational predicate is interesting under the condition that the object pair is interesting, that is, the relational predicate interest degree;
4) adding the loss of the object class prediction in step 1), the loss of the object and object-pair interest prediction in step 2), and the loss of the relational predicate interest prediction in step 3) to obtain the total loss; combining the object interest degree and the relational predicate interest degree obtained by minimizing the total loss to obtain the visual relationship interest degree; and sorting all the visual relationships by interest degree, the visual relationships with high interest degree being the finally detected interest visual relationships.
The invention also provides an interest visual relationship detection device based on the interest propagation network, which is configured with a computer program, wherein the computer program correspondingly realizes the interest propagation network and realizes the interest visual relationship detection method.
The beneficial effects of the invention are: the invention addresses the problem of machine understanding deviation caused by excessively rich visual relationships in images; by considering the semantic, position, and visual features of objects and object pairs, the relationship interest degree can be predicted more reasonably with semantic importance as the standard in the process of detecting visual relationships, and interest visual relationships that accurately convey the main content of the image are found. The method has good universality and practicability.
Drawings
FIG. 1 is a flow chart of the architecture and interest visual relationship detection of the interest propagation network of the present invention.
Fig. 2 shows the effect of detecting an excessively rich visual relationship by the conventional method.
FIG. 3 is a diagram illustrating the results of the interest visual relationship detection method of the present invention.
Detailed Description
The interest visual relationship detection method based on an interest propagation network according to the invention provides a solution to the problem of machine understanding deviation caused by excessively rich visual relationships in an image: interest features of the objects, object pairs, and relational predicates are obtained by linearly transforming the semantic, position, and visual features of the objects and object pairs extracted from the input image; the interest degree of visual relationships is predicted reasonably with semantic importance as the standard; and an interest visual relationship result that accurately conveys the main content of the image is produced.
The practice of the present invention is described in detail below.
As shown in FIG. 1, the invention establishes an interest propagation network that takes an image as input and outputs the interest visual relationships in the image; the interest propagation network comprises a panoramic object detection module, an object interest prediction module, and a relational predicate interest prediction module. The panoramic object detection module performs panoptic segmentation on the image; the content of an image can be divided into the "things" and "stuff" categories according to whether it has a fixed shape: objects with a fixed shape, such as people and cars, belong to the things category (countable nouns usually belong to things), while objects without a fixed shape, such as sky and grass, belong to the stuff category (uncountable nouns belong to stuff). The objects in the image are obtained through panoptic segmentation; the object features of the objects and the joint features of the object pairs are obtained through an instance encoder and input respectively to the object interest prediction module and the relational predicate interest prediction module; the semantic, visual, and position features of the objects, the object pairs, and the object pairs' relational predicates are obtained through a semantic encoder, a visual encoder, and a position encoder; the interest features of the objects, object pairs, and relational predicates are obtained through linear transformation of these three features; the interest degrees of the objects, object pairs, and relational predicates are predicted from the interest features through supervised and semi-supervised learning; the visual relationship interest degree is obtained by combining the two interest degrees; and the interest visual relationships are ranked and output according to the visual relationship interest degree.
On the basis of the interest propagation network, firstly, objects are extracted from an input image through a panoramic object detection module, pairwise combination is carried out to form object pairs, object features of the objects and combined features of the object pairs are calculated, visual features, semantic features and position features of the objects and the object pairs are generated, then interest features of the objects and the object pairs are obtained through an object interest prediction module respectively, and accordingly interest degrees of the objects are predicted; meanwhile, the relation predicate interest prediction module obtains interest characteristics of the relation predicates according to the visual characteristics, the semantic characteristics and the position characteristics of the object pair relation predicates, and predicts the relation predicate interest degrees among the objects by using semi-supervised learning; and finally, combining the interest degree of the object with the interest degree of the relational predicate to obtain a visual relation interest degree, wherein the visual relation with high interest degree is the finally detected interest visual relation.
The following describes the implementation of the present invention in detail. The invention comprises the following steps:
1) for the input image, a panoramic object detection module of the interest propagation network is adopted to calculate object features and joint features:
1.1) extracting the frames and the categories of all objects in the graph;
1.2) calculating the characteristics in the n object frames in the step 1.1) as object characteristics;
1.3) combining the n objects in the step 1.1) pairwise to form n (n-1) object pairs, and calculating the characteristics in the subject and object union frame in each object pair as combined characteristics.
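The pairwise combination of step 1.3) can be sketched as follows (illustrative Python only; the box format and helper names are assumptions, not part of the patent):

```python
# Sketch of step 1.3): form all n*(n-1) ordered (subject, object) pairs and the
# union frame of each pair. Frames are [x_l, y_t, x_r, y_b].
from itertools import permutations

def union_box(a, b):
    """Smallest frame enclosing frames a and b."""
    return [min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3])]

def make_pairs(boxes):
    """All ordered (subject_index, object_index, union_box) triples."""
    return [(i, j, union_box(boxes[i], boxes[j]))
            for i, j in permutations(range(len(boxes)), 2)]

boxes = [[0, 0, 10, 10], [5, 5, 20, 20], [30, 0, 40, 10]]
pairs = make_pairs(boxes)  # 3 objects -> 3 * 2 = 6 ordered pairs
```

The joint feature of each pair would then be extracted from within the union frame.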
2) For the objects extracted in the step 1) and the object pairs formed by the objects, calculating the interest degree of the objects by adopting an object interest prediction module of an interest propagation network:
2.1) for each object extracted in step 1), taking the object feature as the visual feature, obtaining the word embedding feature of the class name from a pre-trained GloVe model as the semantic feature, and taking the position of the object relative to the whole image as the position feature; the three features are combined to obtain the interest feature of the object. The position feature of the object is calculated as follows:
$Loc_i = \frac{x_i^l}{w} \oplus \frac{y_i^t}{h} \oplus \frac{x_i^r}{w} \oplus \frac{y_i^b}{h}$

wherein $Loc_i$ is the position feature of object i, $\oplus$ represents the concatenation operation, $x_i^l, y_i^t, x_i^r, y_i^b$ are respectively the coordinates of the left, upper, right, and lower boundaries of object i, and w and h are respectively the width and height of the input image.
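A minimal sketch of this position feature: the frame coordinates normalized by the input image size and concatenated into one vector (the function name is illustrative, not from the patent):

```python
def object_position_feature(box, w, h):
    """Loc_i: frame coordinates [x_l, y_t, x_r, y_b] normalized by the input
    image width w and height h, concatenated into a single 4-d vector."""
    x_l, y_t, x_r, y_b = box
    return [x_l / w, y_t / h, x_r / w, y_b / h]

loc = object_position_feature([50, 20, 150, 120], w=200, h=200)
```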
2.2) calculating three characteristics of the subject and the object respectively in a similar mode for each object pair formed in the step 1), and then calculating the three characteristics of the object pair to jointly obtain the interest characteristics of the object pair. The method for calculating the position characteristics of the object pair comprises the following steps:
$Loc_p = Loc_{s_p} \cup Loc_{o_p}$

wherein $Loc_p$ is the position feature of object pair p, $Loc_i$ is the position feature of object i, $s_p$ and $o_p$ respectively represent the subject and the object of the pair, and $\cup$ represents the object-level concatenation operation.
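The object-level concatenation of the subject's and object's position features can be sketched as (illustrative names; the position vectors are assumed to come from the preceding step):

```python
def pair_position_feature(loc_s, loc_o):
    """Loc_p: stack the subject's and the object's 4-d position feature
    vectors (object-level concatenation)."""
    return loc_s + loc_o

loc_p = pair_position_feature([0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8])
```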
The method for calculating the visual characteristics of the object pair comprises the following steps:
$F_p = F_{s_p} \oplus F_{o_p} \oplus F_{s_p \cup o_p}$

wherein $F_p$ is the visual feature of object pair p, $F_{s_p}$ and $F_{o_p}$ respectively represent the subject and object features of the pair, and $F_{s_p \cup o_p}$ represents the joint feature of the subject and object of the pair.
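The concatenation of subject, object, and joint features can be sketched with NumPy (a minimal illustration; the feature dimensions here are arbitrary):

```python
import numpy as np

def pair_visual_feature(f_s, f_o, f_union):
    """F_p: concatenate the subject feature, object feature, and joint
    feature of the pair into one vector."""
    return np.concatenate([f_s, f_o, f_union])

f_p = pair_visual_feature(np.ones(4), np.zeros(4), np.full(4, 2.0))
```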
2.3) inputting the two interest characteristics in the step 2.1) and the step 2.2) into a graph convolution neural network, and predicting the interest degree of the object.
3) Calculating a relational predicate interest degree by adopting a relational predicate interest prediction module of the interest propagation network for the object pair formed in the step 1):
3.1) calculating the visual characteristics, semantic characteristics and position characteristics of the object pair relational predicates for each object pair formed in the step 1), and jointly obtaining the interest characteristics of the relational predicates. The calculation method of the relational predicate position characteristics of the object pairs comprises the following steps:
$Loc'_p = \left(\frac{x_{s_p}^l}{w'} \oplus \frac{y_{s_p}^t}{h'} \oplus \frac{x_{s_p}^r}{w'} \oplus \frac{y_{s_p}^b}{h'}\right) \cup \left(\frac{x_{o_p}^l}{w'} \oplus \frac{y_{o_p}^t}{h'} \oplus \frac{x_{o_p}^r}{w'} \oplus \frac{y_{o_p}^b}{h'}\right)$

wherein $Loc'_p$ is the relational predicate position feature of object pair p, and w' and h' are respectively the width and height of the union frame of the subject and object in the pair. The relational predicate visual feature is calculated in the same way as the visual feature of an object pair.
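A sketch of the relational predicate position feature: both frames re-normalized by the union frame of the pair. Since the original formula image is absent, the exact normalization (including the origin shift to the union frame) is an assumption consistent with the surrounding description:

```python
def predicate_position_feature(box_s, box_o):
    """Loc'_p: subject and object frame coordinates normalized by the width
    w' and height h' of the pair's union frame (assumed origin-shifted)."""
    x_l = min(box_s[0], box_o[0]); y_t = min(box_s[1], box_o[1])
    x_r = max(box_s[2], box_o[2]); y_b = max(box_s[3], box_o[3])
    w_u, h_u = x_r - x_l, y_b - y_t  # union frame w', h'

    def norm(b):
        return [(b[0] - x_l) / w_u, (b[1] - y_t) / h_u,
                (b[2] - x_l) / w_u, (b[3] - y_t) / h_u]

    return norm(box_s) + norm(box_o)

loc_rel = predicate_position_feature([0, 0, 10, 10], [10, 0, 20, 10])
```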
3.2) for each relational predicate, the semi-supervised learning is used to predict the probability that the relational predicate is also interesting under the condition that the object pair is interesting, namely the relational predicate interestingness. The loss of semi-supervised learning is calculated as follows:
$L_{rela} = l_{rela}(\hat{y}^l, y^l) + \beta \cdot l_{rela}(\hat{y}^u, y^u)$

wherein $L_{rela}$ is the loss of the relational predicate interest prediction module, $l_{rela}$ is the loss function, $\hat{y}^l$ and $\hat{y}^u$ respectively represent the prediction results on labeled and unlabeled data, $y^l$ and $y^u$ respectively represent the ground-truth results of labeled and unlabeled data, and $\beta$ is the loss weight of the unlabeled data.
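A hedged sketch of this semi-supervised loss, assuming binary cross-entropy as the base loss $l_{rela}$ (the patent does not fix the form of the loss function):

```python
import math

def bce(p, y):
    """Binary cross-entropy of one prediction p against label y in {0, 1}."""
    eps = 1e-7  # numerical guard against log(0)
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def rela_loss(pred_l, y_l, pred_u, y_u, beta=0.5):
    """L_rela = l_rela(labeled) + beta * l_rela(unlabeled pseudo-labels)."""
    loss_labeled = sum(bce(p, y) for p, y in zip(pred_l, y_l))
    loss_unlabeled = sum(bce(p, y) for p, y in zip(pred_u, y_u))
    return loss_labeled + beta * loss_unlabeled
```

Here `beta` down-weights the unlabeled term, so noisy pseudo-labels contribute less than ground-truth annotations.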
4) Minimizing the total loss of the interest propagation network, predicting the interest visual relationship:
4.1) adding the loss of the object type prediction in the step 1), the loss of the object and the object interest prediction in the step 2) and the loss of the relation predicate interest prediction in the step 3) to obtain the total loss of the interest propagation network, and combining the object interest degree and the relation predicate interest degree obtained by minimizing the total loss to obtain the visual relation interest degree. The total loss of the interest propagation network is calculated as follows:
$L_{pos} = -(1 - p_{pos})^2 \log p_{pos}$

$L_{neg} = -p_{neg} \log(1 - p_{neg})$

$L_{total} = L_{class} + L_{o}^{pos} + L_{o}^{neg} + L_{p}^{pos} + L_{p}^{neg} + L_{rela}^{pos} + L_{rela}^{neg}$

wherein $L_{pos}$ and $L_{neg}$ represent the loss of a positive sample and a negative sample respectively, $p_{pos}$ and $p_{neg}$ respectively represent the probability scores of positive and negative samples, $L_{total}$ is the total loss of the interest propagation network, $L_{class}$ is the loss of the object class prediction, $L_{o}^{pos}$ and $L_{o}^{neg}$ respectively represent the positive and negative losses of the object interest prediction, $L_{p}^{pos}$ and $L_{p}^{neg}$ respectively represent the positive and negative losses of the object-pair interest prediction, and $L_{rela}^{pos}$ and $L_{rela}^{neg}$ respectively represent the positive and negative losses of the relational predicate interest prediction.
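The per-sample losses above can be sketched directly; the positive-sample loss is a focal-style term that down-weights easy positives, which is an interpretation of the formulas rather than a statement from the patent:

```python
import math

def pos_loss(p):
    """L_pos = -(1 - p)^2 * log(p): loss of a positive sample with score p;
    the (1 - p)^2 factor shrinks the loss of already-confident positives."""
    return -((1 - p) ** 2) * math.log(p)

def neg_loss(p):
    """L_neg = -p * log(1 - p): loss of a negative sample with score p."""
    return -p * math.log(1 - p)
```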
4.2) sequencing all visual relations according to the interest degree, wherein the visual relation with high interest degree is the finally detected interest visual relation. The interestingness of the visual relationship is calculated as follows:
$I_{spo} = E_{so} \cdot I_{so} \cdot P_{spo}$

wherein $I_{spo}$ is the interest degree of the visual relationship, $I_{so}$ and $P_{spo}$ respectively represent the interest degree of the object pair and of the relational predicate, and $E_{so}$ is a binary parameter: when the subject and object of the pair are the same object, $E_{so}$ takes 0; otherwise $E_{so}$ takes 1.
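The combination and ranking of step 4.2) can be sketched as follows (the candidate triples and scores are invented for illustration):

```python
def interestingness(e_so, i_so, p_spo):
    """I_spo = E_so * I_so * P_spo; E_so is 0 when the subject and object of
    the pair are the same object, and 1 otherwise."""
    return e_so * i_so * p_spo

# (triple, E_so, object-pair interest I_so, predicate interest P_spo)
candidates = [("person-ride-horse", 1, 0.9, 0.8),
              ("horse-beside-horse", 0, 0.9, 0.9),  # subject == object -> E_so = 0
              ("person-near-tree", 1, 0.5, 0.4)]
ranked = sorted(candidates, key=lambda c: interestingness(*c[1:]), reverse=True)
```

Self-pairs are suppressed by $E_{so}$, so only relations between distinct objects can rank highly.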
The method of the present invention can be implemented by a computer program, and therefore, an interest visual relationship detecting apparatus based on an interest propagation network is also provided, wherein the apparatus is configured with a computer program and when executed, implements the interest visual relationship detecting method of the present invention.
The method is implemented on the MSCOCO image data set, and compared with the result of the traditional visual relation detection method. Fig. 2 and 3 are comparative examples of the results of conventional visual relationship detection and the results of the present invention. Fig. 2(a) and 3(a) are input images, and objects related to the visual relationship detection result are marked. Fig. 2(b) is the result of conventional visual relationship detection, which includes up to 24 visual relationships, and most of the relationships are weakly associated with the main content of the input image. Fig. 3(b) is the result of the interest visual relationship detection of the present invention, which includes only 5 visual relationships, and all of which are strongly associated with the main content of the input image.
Claims (9)
1. An interest visual relationship detection method based on an interest propagation network is characterized in that the interest visual relationship detection method is characterized in that the interest propagation network is established, images are input, and interest visual relationships in the images are output, and the interest propagation network comprises a panoramic object detection module, an object interest prediction module and a relationship predicate interest prediction module; firstly, extracting objects from an input image through a panoramic object detection module, combining every two objects into object pairs, calculating object characteristics of the objects and joint characteristics of the object pairs, generating visual characteristics, semantic characteristics and position characteristics of the objects and the object pairs in an object pair interest prediction module, and obtaining interest characteristics of the objects and the object pairs through linear transformation so as to predict interest degrees of the object pairs; meanwhile, the relation predicate interest prediction module obtains interest characteristics of the relation predicates through linear transformation of the visual characteristics, the semantic characteristics and the position characteristics of the object pair relation predicates, and predicts the relation predicate interest degrees among the objects by using semi-supervised learning; and finally, combining the interest degree of the object with the interest degree of the relational predicate to obtain a visual relation interest degree, wherein the visual relation with high interest degree is the finally detected interest visual relation.
2. The interest visual relationship detection method based on the interest propagation network as claimed in claim 1, characterized by comprising the following steps:
1) extracting the frames and categories of all objects from the input image, calculating the features within the frames of the n objects as the object features, combining the n objects pairwise to form n(n-1) object pairs, and calculating the features within the union frame of the subject and object in each object pair as the joint features;
2) for each object, obtaining the word embedding feature of its class name from a pre-trained GloVe model; taking the object feature as the visual feature, the word embedding feature of the class name as the semantic feature, and the position of the object relative to the whole image as the position feature, and combining the three features to obtain the interest feature of the object; for each object pair, calculating the three features of the subject and the object respectively in the same way, then calculating the three features of the object pair, and combining them to obtain the interest feature of the object pair; inputting the interest features of the objects and object pairs into a graph convolutional neural network to predict the interest degree of each object pair;
3) for each object pair, calculating the visual, semantic, and position features of its relational predicate to obtain the interest feature of the relational predicate; for each relational predicate, using semi-supervised learning to predict the probability that the relational predicate is interesting under the condition that the object pair is interesting, that is, the relational predicate interest degree;
4) adding the loss of the object class prediction in step 1), the loss of the object and object-pair interest prediction in step 2), and the loss of the relational predicate interest prediction in step 3) to obtain the total loss; combining the object interest degree and the relational predicate interest degree obtained by minimizing the total loss to obtain the visual relationship interest degree; and sorting all the visual relationships by interest degree, the visual relationships with high interest degree being the finally detected interest visual relationships.
3. The interest visual relationship detection method based on the interest propagation network as claimed in claim 2, wherein in step 2), the position feature of the object is calculated by:

$Loc_i = \frac{x_i^l}{w} \oplus \frac{y_i^t}{h} \oplus \frac{x_i^r}{w} \oplus \frac{y_i^b}{h}$

wherein $Loc_i$ is the position feature of object i, $\oplus$ represents the concatenation operation, $x_i^l, y_i^t, x_i^r, y_i^b$ are respectively the coordinates of the left, upper, right, and lower boundaries of object i, and w and h are respectively the width and height of the input image.
4. The interest visual relationship detection method based on the interest propagation network as claimed in claim 2, wherein in step 2), the calculation method of the position features of the object pairs comprises:
$Loc_p = Loc_{s_p} \cup Loc_{o_p}$

wherein $Loc_p$ is the position feature of object pair p, $Loc_i$ is the position feature of object i, $s_p$ and $o_p$ respectively represent the subject and the object of the pair, and $\cup$ represents the object-level concatenation operation.
5. The interest visual relationship detection method based on the interest propagation network as claimed in claim 2, wherein the visual feature of an object pair is calculated by:

$F_p = F_{s_p} \oplus F_{o_p} \oplus F_{s_p \cup o_p}$

wherein $F_p$ is the visual feature of object pair p, $F_{s_p}$ and $F_{o_p}$ respectively represent the subject and object features of the pair, and $F_{s_p \cup o_p}$ represents the joint feature of the subject and object of the pair.
6. The interest visual relationship detection method based on the interest propagation network as claimed in claim 2, wherein in step 3), the calculation method of the relational predicate position characteristics of the object pairs comprises:
$Loc'_p = \left(\frac{x_{s_p}^l}{w'} \oplus \frac{y_{s_p}^t}{h'} \oplus \frac{x_{s_p}^r}{w'} \oplus \frac{y_{s_p}^b}{h'}\right) \cup \left(\frac{x_{o_p}^l}{w'} \oplus \frac{y_{o_p}^t}{h'} \oplus \frac{x_{o_p}^r}{w'} \oplus \frac{y_{o_p}^b}{h'}\right)$

wherein $Loc'_p$ is the relational predicate position feature of object pair p, $\oplus$ represents the concatenation operation, $x_i^l, y_i^t, x_i^r, y_i^b$ are respectively the coordinates of the left, upper, right, and lower boundaries of object i, $s_p$ and $o_p$ respectively represent the subject and the object of the pair, $\cup$ represents the object-level concatenation operation, and w' and h' are respectively the width and height of the union frame of the subject and object in the pair.
7. The interest visual relationship detection method based on the interest propagation network as claimed in claim 2, wherein in the semi-supervised learning predicted relationship predicate interestingness of step 3), the calculation method of the prediction loss is as follows:
wherein L_rela is the relational-predicate interestingness prediction loss, l_rela is the loss function, the predictions on labeled and unlabeled data are each compared against their respective ground-truth results, and β is the loss weight of the unlabeled data.
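The structure of claim 7's loss, a labeled term plus a β-weighted unlabeled term, can be sketched as below. The base loss `l_rela` is not specified in the translated record, so a mean-squared-error stand-in is used purely for illustration.

```python
def semi_supervised_predicate_loss(pred_labeled, y_labeled,
                                   pred_unlabeled, y_pseudo,
                                   beta, base_loss):
    """L_rela = l_rela(labeled) + beta * l_rela(unlabeled).

    base_loss plays the role of l_rela; the additive beta weighting
    follows the claim, while the concrete base loss is an assumption.
    """
    return (base_loss(pred_labeled, y_labeled)
            + beta * base_loss(pred_unlabeled, y_pseudo))

def mse(p, y):
    # A stand-in base loss for illustration only.
    return sum((pi - yi) ** 2 for pi, yi in zip(p, y)) / len(p)

loss = semi_supervised_predicate_loss([0.9, 0.1], [1.0, 0.0],
                                      [0.5], [1.0],
                                      beta=0.5, base_loss=mse)
```
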
8. The interest visual relationship detection method based on the interest propagation network as claimed in claim 2, wherein the total loss in step 4) is calculated by:
L_pos = -(1 - p_pos)^2 · log(p_pos)
L_neg = -p_neg · log(1 - p_neg)
wherein L_pos and L_neg denote the losses of positive and negative samples respectively, p_pos and p_neg denote the probability scores of positive and negative samples respectively, L_total is the total loss, L_class is the object-class prediction loss, and the remaining paired terms denote the positive and negative losses of the object interestingness prediction, the object-pair interestingness prediction, and the relational-predicate interestingness prediction, respectively.
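The two per-sample terms of claim 8 can be written directly from the formulas above. Only these per-sample terms are sketched; summing one such positive/negative pair per interestingness head together with the object-class loss would give L_total.

```python
import math

def positive_loss(p_pos):
    """L_pos = -(1 - p_pos)^2 * log(p_pos): a focal-style term that
    down-weights easy, high-scoring positive samples."""
    return -((1.0 - p_pos) ** 2) * math.log(p_pos)

def negative_loss(p_neg):
    """L_neg = -p_neg * log(1 - p_neg): penalizes negative samples
    that receive high probability scores."""
    return -p_neg * math.log(1.0 - p_neg)

def interestingness_loss(pos_scores, neg_scores):
    # Sum of the per-sample positive and negative terms for one
    # interestingness head (object, object pair, or predicate).
    return (sum(positive_loss(p) for p in pos_scores)
            + sum(negative_loss(p) for p in neg_scores))

val = interestingness_loss([0.9], [0.1])
```
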
9. An interest visual relationship detection device based on an interest propagation network, characterized in that the device is configured with a computer program corresponding to the interest propagation network of claim 1, and the interest visual relationship detection method of claim 1 is realized when the program is executed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010848981.0A CN111985505B (en) | 2020-08-21 | 2020-08-21 | Interest visual relation detection method and device based on interest propagation network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111985505A true CN111985505A (en) | 2020-11-24 |
CN111985505B CN111985505B (en) | 2024-02-13 |
Family
ID=73442732
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010848981.0A Active CN111985505B (en) | 2020-08-21 | 2020-08-21 | Interest visual relation detection method and device based on interest propagation network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111985505B (en) |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100278420A1 (en) * | 2009-04-02 | 2010-11-04 | Siemens Corporation | Predicate Logic based Image Grammars for Complex Visual Pattern Recognition |
CN105045907A (en) * | 2015-08-10 | 2015-11-11 | 北京工业大学 | Method for constructing visual attention-label-user interest tree for personalized social image recommendation |
US20160314597A1 (en) * | 2007-07-03 | 2016-10-27 | Shoppertrak Rct Corporation | System and process for detecting, tracking and counting human objects of interest |
CN108229272A (en) * | 2017-02-23 | 2018-06-29 | 北京市商汤科技开发有限公司 | Vision relationship detection method and device and vision relationship detection training method and device |
CN108229491A (en) * | 2017-02-28 | 2018-06-29 | 北京市商汤科技开发有限公司 | The method, apparatus and equipment of detection object relationship from picture |
CN108229477A (en) * | 2018-01-25 | 2018-06-29 | 深圳市商汤科技有限公司 | For visual correlation recognition methods, device, equipment and the storage medium of image |
WO2019035771A1 (en) * | 2017-08-17 | 2019-02-21 | National University Of Singapore | Video visual relation detection methods and systems |
CN110796472A (en) * | 2019-09-02 | 2020-02-14 | 腾讯科技(深圳)有限公司 | Information pushing method and device, computer readable storage medium and computer equipment |
CN110889397A (en) * | 2018-12-28 | 2020-03-17 | 南京大学 | Visual relation segmentation method taking human as main body |
CN111125406A (en) * | 2019-12-23 | 2020-05-08 | 天津大学 | Visual relation detection method based on self-adaptive cluster learning |
CN111325279A (en) * | 2020-02-26 | 2020-06-23 | 福州大学 | Pedestrian and personal sensitive article tracking method fusing visual relationship |
CN111325243A (en) * | 2020-02-03 | 2020-06-23 | 天津大学 | Visual relation detection method based on regional attention learning mechanism |
CN111368829A (en) * | 2020-02-28 | 2020-07-03 | 北京理工大学 | Visual semantic relation detection method based on RGB-D image |
CN116089732A (en) * | 2023-04-11 | 2023-05-09 | 江西时刻互动科技股份有限公司 | User preference identification method and system based on advertisement click data |
CN116628052A (en) * | 2022-02-18 | 2023-08-22 | 罗伯特·博世有限公司 | Apparatus and computer-implemented method for adding quantity facts to a knowledge base |
Non-Patent Citations (4)
Title |
---|
YU, FAN, et al.: "Visual Relation of Interest Detection", MM '20: Proceedings of the 28th ACM International Conference on Multimedia, pages 1386-1394 *
ZHOU, HAO, et al.: "Visual Relationship Detection with Relative Location Mining", Proceedings of the 27th ACM International Conference on Multimedia (MM '19), pages 30-38 *
WU, Jianchao, et al.: "A Survey of Video Group Behavior Recognition", Journal of Software, vol. 34, no. 2, pages 964-984 *
CHEN, Fangfang: "Visual Relationship Detection Based on Object Pair Screening and Joint Predicate Recognition", China Master's Theses Full-text Database, Information Science and Technology, no. 8, pages 138-657 *
Also Published As
Publication number | Publication date |
---|---|
CN111985505B (en) | 2024-02-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110516067B (en) | Public opinion monitoring method, system and storage medium based on topic detection | |
CN112182166B (en) | Text matching method and device, electronic equipment and storage medium | |
US11514244B2 (en) | Structured knowledge modeling and extraction from images | |
US9183467B2 (en) | Sketch segmentation | |
WO2020248391A1 (en) | Case brief classification method and apparatus, computer device, and storage medium | |
CN110598005A (en) | Public safety event-oriented multi-source heterogeneous data knowledge graph construction method | |
CN111259940A (en) | Target detection method based on space attention map | |
CN113822224A (en) | Rumor detection method and device integrating multi-modal learning and multi-granularity structure learning | |
CN114663915A (en) | Image human-object interaction positioning method and system based on Transformer model | |
CN114429566A (en) | Image semantic understanding method, device, equipment and storage medium | |
CN116610778A (en) | Bidirectional image-text matching method based on cross-modal global and local attention mechanism | |
US20200364259A1 (en) | Image retrieval | |
CN113902764A (en) | Semantic-based image-text cross-modal retrieval method | |
CN111159411B (en) | Knowledge graph fused text position analysis method, system and storage medium | |
US20230290118A1 (en) | Automatic classification method and system of teaching videos based on different presentation forms | |
CN111985505B (en) | Interest visual relation detection method and device based on interest propagation network | |
CN112069898A (en) | Method and device for recognizing human face group attribute based on transfer learning | |
Shf et al. | Review on deep based object detection | |
Liu et al. | RDBN: Visual relationship detection with inaccurate RGB-D images | |
CN111368829A (en) | Visual semantic relation detection method based on RGB-D image | |
WO2023173552A1 (en) | Establishment method for target detection model, application method for target detection model, and device, apparatus and medium | |
CN110750673A (en) | Image processing method, device, equipment and storage medium | |
CN115292533A (en) | Cross-modal pedestrian retrieval method driven by visual positioning | |
CN113159071B (en) | Cross-modal image-text association anomaly detection method | |
He et al. | Investigating YOLO Models Towards Outdoor Obstacle Detection For Visually Impaired People |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |