CN111985505B - Interest visual relation detection method and device based on interest propagation network - Google Patents
Interest visual relation detection method and device based on interest propagation network
- Publication number
- Publication number: CN111985505B · Application number: CN202010848981.0A
- Authority
- CN
- China
- Prior art keywords
- interest
- visual
- relation
- objects
- features
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
Abstract
An interest visual relationship detection method and device based on an interest propagation network. Objects are extracted from an input image and combined pairwise into object pairs; the corresponding object features and joint features are calculated, and the visual, semantic and position features of the objects and object pairs are generated. The interest features of the objects and object pairs are obtained through linear transformation, from which the object-pair interestingness is predicted. Likewise, the interest features of the relation predicates are obtained through linear transformation of the visual, semantic and position features of the relation predicates, from which the interestingness of the relation predicates between objects is predicted. Finally, the object-pair interestingness and the relation-predicate interestingness are combined into the visual-relationship interestingness; the visual relationships with high interestingness are the finally detected interest visual relationships. During visual relationship detection, the invention predicts relationship interestingness more reasonably by taking semantic importance as the criterion, finds the interest visual relationships that accurately convey the subject content of an image, and has good universality and practicality.
Description
Technical Field
The invention belongs to the technical field of computer vision, relates to vision relation detection in images, and particularly relates to an interest vision relation detection method based on an interest propagation network.
Background Art
As a bridge between vision and natural language, visual relationship detection aims to describe objects in images and the interactions between them in the form of a relationship triplet <subject, relationship predicate, object>. The subject and object are typically represented by bounding boxes and object categories; relationship predicates are typically verbs (e.g., "lift", "ride", "see"), positional words (e.g., "beside", "in front of", "above") or verb phrases (e.g., "stand beside", "sit on", "walk over"). Visual relationship detection helps machines understand and analyze the content of images and videos, and is widely applicable to scenarios such as image retrieval and video analysis.
Conventional visual relationship detection methods aim to detect all visual relationships in an image. In practice, owing to the combinatorial explosion of subjects, relationship predicates and objects, conventional approaches typically detect an excessively rich set of visual relationships, as shown in FIG. 2. Although this describes the image content more completely, too many details can mislead the machine's understanding of the image's subject content, reducing accuracy in scenarios such as image retrieval and hindering accurate analysis of the image or video.
Intuitively, not all detected visual relationships are truly "interesting" in terms of semantics, i.e., not all visual relationships express the subject content of an image, often only a small portion of which has an important meaning for conveying the subject content of an image, such relationships being interesting visual relationships. The goal of interest visual relationship detection is to detect visual relationships that are truly important for conveying the content of the image subject, i.e., visual relationships that are "interesting".
To date, no research work has attempted to detect interesting visual relationships. Some related works measure the visual saliency of relationships through an attention module, determining saliency weights in order to find salient visual relationships. However, such approaches consider only the visual saliency of a relationship, not its semantic importance, so the resulting relationships are not necessarily truly "interesting".
Disclosure of Invention
The invention aims to solve the following problem: an excessively rich set of visual relationships in an image easily biases machine understanding, so it is necessary to detect the interest visual relationships that accurately convey the image's subject content, helping machines understand and analyze images and videos more accurately.
The technical scheme of the invention is as follows: an interest visual relationship detection method based on an interest propagation network establishes an interest propagation network that takes an image as input and outputs the interest visual relationships in the image; the interest propagation network comprises a panoramic object detection module, an object-pair interest prediction module and a relation-predicate interest prediction module. First, the panoramic object detection module extracts objects from the input image, combines them pairwise into object pairs, and calculates the object features of the objects and the joint features of the object pairs. The object-pair interest prediction module generates the visual, semantic and position features of the objects and object pairs and obtains the interest features of the objects and object pairs respectively, thereby predicting the object-pair interestingness. Meanwhile, the relation-predicate interest prediction module obtains the interest features of the relation predicates from the visual, semantic and position features of the relation predicates of the object pairs, and uses semi-supervised learning to predict the interestingness of the relation predicates between objects. Finally, the object-pair interestingness and the relation-predicate interestingness are combined to obtain the visual-relationship interestingness; the visual relationships with high interestingness are the finally detected interest visual relationships.
Further, the invention comprises the following steps:
1) For an input image, the bounding boxes and categories of all objects are extracted; the features inside the n object bounding boxes are calculated as object features; the n objects are combined pairwise to form n(n-1) object pairs; and the features inside the bounding box of the union of the subject and object of each pair are calculated as joint features;
2) For each object, word-embedding features of the category name are obtained with a pre-trained GloVe model; the object's object features serve as its visual features, the word-embedding features of the category name as its semantic features, and the position of the object relative to the whole image as its position features; the three features are combined to obtain the interest features of the object. For each object pair, the three features of the subject and of the object are calculated in the same way, then the three features of the object pair are calculated and combined to obtain the interest features of the object pair. The interest features of the objects and object pairs are input into a graph convolutional neural network to predict the object-pair interestingness;
3) For each object pair, calculating visual features, semantic features and position features of the relation predicates to obtain interest features of the object pair relation predicates, and for each relation predicate, using semi-supervised learning to predict the probability that the relation predicate is interesting under the condition that the object pair is interesting, namely, the interest degree of the relation predicate;
4) Adding the loss of the object category prediction in the step 1), the loss of the object and the object for the interest degree prediction in the step 2) and the loss of the relation predicate interest degree prediction in the step 3) to obtain total loss, combining the object for the interest degree and the relation predicate interest degree obtained by minimizing the total loss to obtain visual relation interest degree, and sorting all visual relations according to the interest degree, wherein the visual relation with high interest degree is the finally detected interest visual relation.
The invention also provides an interest visual relation detection device based on the interest propagation network, which is configured with a computer program, wherein the computer program correspondingly realizes the interest propagation network and realizes the interest visual relation detection method.
The invention has the following beneficial effects: it provides a solution to the problem of biased machine understanding caused by an excessively rich set of visual relationships in an image. By considering the semantic, position and visual features of objects and object pairs, it predicts relationship interestingness reasonably, taking semantic importance as the criterion during visual relationship detection, and finds the interest visual relationships that accurately convey the image's subject content. The method has good universality and practicality.
Drawings
FIG. 1 is a flow chart of the architecture and interest visual relationship detection of the interest propagation network of the present invention.
Fig. 2 shows an effect of too rich visual relationship detected by the conventional method.
FIG. 3 is a graphical representation of the results of the visual relationship of interest detection method of the present invention.
Detailed Description
The interest visual relationship detection method based on an interest propagation network provides a solution to the problem of biased machine understanding caused by an excessively rich set of visual relationships in an image. For an input image, the interest features of the objects, object pairs and relation predicates are obtained by linear transformation of the combined semantic, position and visual features of the objects and object pairs; the interestingness of visual relationships is predicted reasonably with semantic importance as the criterion, producing interest visual relationship results that accurately convey the image's subject content.
The practice of the invention is specifically described below.
As shown in FIG. 1, the invention establishes an interest propagation network that takes an image as input and outputs the interest visual relationships in the image; it comprises a panoramic object detection module, an object-pair interest prediction module and a relation-predicate interest prediction module. The panoramic object detection module performs panoptic segmentation on the image: the content of an image can be divided into "things" and "stuff" according to whether it has a fixed shape. Objects with fixed shapes, such as people and vehicles, belong to things (roughly, countable nouns belong to things); objects without a fixed shape, such as sky and grass, belong to stuff (roughly, uncountable nouns belong to stuff). Objects are extracted from the image by panoptic segmentation, and an instance encoder obtains the object features and the joint features of the object pairs, which are input to the object-pair interest prediction module and the relation-predicate interest prediction module respectively. A semantic encoder, a visual encoder and a position encoder obtain the semantic, visual and position features of the objects, object pairs and relation predicates; linear transformation then yields their interest features. From the interest features, supervised learning and semi-supervised learning predict the object-pair interestingness and the relation-predicate interestingness respectively; the two are combined into the visual-relationship interestingness, and the interest visual relationships are ranked and output according to it.
On the basis of the interest propagation network, firstly, extracting objects from an input image through a panoramic object detection module, combining the objects into object pairs in pairs, calculating the combined characteristics of the object characteristics of the objects and the object pairs, generating the visual characteristics, semantic characteristics and position characteristics of the objects and the object pairs, and respectively obtaining the interest characteristics of the objects and the object pairs through an object pair interest prediction module, thereby predicting the interest degree of the objects; meanwhile, the interest prediction module of the relation predicates obtains interest features of the relation predicates from visual features, semantic features and position features of the objects on the relation predicates, and uses semi-supervised learning to predict interest degree of the relation predicates among the objects; finally, the object interest degree and the relation predicate interest degree are combined to obtain the visual relation interest degree, and the visual relation with high interest degree is the finally detected interest visual relation.
The implementation of the present invention is described in detail below. The invention comprises the following steps:
1) For an input image, calculating object features and joint features by adopting a panoramic object detection module of an interest propagation network:
1.1) Extracting the bounding boxes and categories of all objects in the image;
1.2) Calculating the features inside the n object bounding boxes from step 1.1) as the object features;
1.3) Combining the n objects from step 1.1) pairwise to form n(n-1) object pairs, and calculating the features inside the bounding box of the union of the subject and object of each pair as the joint features.
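As a concrete sketch of steps 1.1) to 1.3), the pairwise combination and the union box can be written as follows (the function names and the box format are illustrative, not taken from the patent):

```python
# Sketch of step 1): enumerate all n(n-1) ordered (subject, object) pairs
# and compute the union bounding box whose inner features serve as the
# joint features of a pair. Boxes are (x_left, y_top, x_right, y_bottom).
from itertools import permutations

def make_object_pairs(boxes):
    """All ordered index pairs (subject, object) over n detected objects."""
    return list(permutations(range(len(boxes)), 2))

def union_box(box_a, box_b):
    """Smallest box enclosing both input boxes."""
    return (min(box_a[0], box_b[0]), min(box_a[1], box_b[1]),
            max(box_a[2], box_b[2]), max(box_a[3], box_b[3]))
```

For n = 3 objects this yields 6 ordered pairs, matching the n(n-1) count in the text.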
2) For the object and the object pair composed by the object extracted in the step 1), calculating the object pair interestingness by adopting an object pair interestingness prediction module of an interest propagation network:
2.1) For each object extracted in step 1), the object features serve as visual features; word-embedding features of the category name, obtained with a pre-trained GloVe model, serve as semantic features; and the position of the object relative to the whole image serves as position features. The three features are combined to obtain the interest features of the object. The position features of an object are calculated as follows:
where Loc_i is the position feature of object i, ⊕ denotes the concatenation operation, x_l, y_t, x_r, y_b are the coordinates of the left, top, right and bottom boundaries of the object, and w and h are the width and height of the input image, respectively.
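Since the original formula appears only as a figure, a plausible reading is that the boundary coordinates are normalized by the image width w and height h and then concatenated. This can be sketched as follows (an assumption, not the patent's exact formula):

```python
def position_feature(box, w, h):
    # Assumed form of Loc_i: left/right boundaries divided by image width w,
    # top/bottom boundaries divided by image height h, concatenated.
    xl, yt, xr, yb = box
    return [xl / w, yt / h, xr / w, yb / h]
```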
2.2 For each object pair formed in the step 1), three characteristics of a subject and an object are calculated respectively in a similar manner, and then three characteristics of the object pair are calculated, so that the interesting characteristics of the object pair are obtained in a combined manner. The method for calculating the position characteristics of the object pairs comprises the following steps:
where Loc_p is the position feature of object pair p, Loc_i is the position feature of object i, s_p and o_p denote the subject and object of the pair respectively, and ∪ denotes the object-level concatenation operation.
The method for calculating the visual characteristics of the object pairs comprises the following steps:
where F_p is the visual feature of object pair p, F_{s_p} and F_{o_p} are the object features of the pair's subject and object respectively, and F_{s_p∪o_p} is the joint feature of the pair's subject and object.
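The two pair-level combinations above can be sketched as simple concatenations (a minimal reading; the learned linear transformation that follows in the network is omitted):

```python
def pair_position_feature(loc_s, loc_o):
    # Object-level concatenation of the subject's and object's position features.
    return loc_s + loc_o

def pair_visual_feature(f_s, f_o, f_joint):
    # Concatenation of the subject feature, object feature and the joint
    # (union-box) feature of the pair.
    return f_s + f_o + f_joint
```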
2.3) The interest features from steps 2.1) and 2.2) are input into a graph convolutional neural network to predict the object-pair interestingness.
3) For the object pairs formed in the step 1), calculating the interest degree of the relation predicates by adopting a relation predicate interest prediction module of the interest propagation network:
3.1 For each object pair composed in the step 1), calculating visual features, semantic features and position features of the object pair relation predicates, and combining to obtain interest features of the relation predicates. The method for calculating the position characteristics of the relation predicates of the object pairs comprises the following steps:
where Loc′_p is the relation-predicate position feature of object pair p, and w′ and h′ are the width and height of the bounding box of the union of the pair's subject and object. The visual features are calculated in the same way as the visual features of an object.
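One way to realize boundary coordinates normalized by the union box's width w′ and height h′ is sketched below (the exact normalization is an assumption, since the formula appears only as a figure):

```python
def predicate_position_feature(box_s, box_o):
    # Assumed form of Loc'_p: subject and object boundary coordinates,
    # shifted to the union box's origin and divided by its width w' and
    # height h', then concatenated.
    ux = min(box_s[0], box_o[0])
    uy = min(box_s[1], box_o[1])
    w_u = max(box_s[2], box_o[2]) - ux   # w'
    h_u = max(box_s[3], box_o[3]) - uy   # h'
    def norm(b):
        return [(b[0] - ux) / w_u, (b[1] - uy) / h_u,
                (b[2] - ux) / w_u, (b[3] - uy) / h_u]
    return norm(box_s) + norm(box_o)
```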
3.2) For each relation predicate, semi-supervised learning is used to predict the probability that the relation predicate is interesting given that the object pair is interesting, i.e., the relation-predicate interestingness. The loss of semi-supervised learning is calculated as follows:
where L_rela is the loss of the relation-predicate interest prediction module, l_rela is the loss function, ŷ_l and ŷ_u denote the prediction results on labeled and unlabeled data respectively, y_l and y_u denote the ground-truth results for labeled and unlabeled data respectively, and β is the loss weight of the unlabeled data.
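The weighted combination of labeled and unlabeled losses can be sketched as follows, with binary cross-entropy standing in for the unspecified loss function l_rela (both this loss choice and the use of pseudo-labels for the unlabeled data are assumptions):

```python
import math

def bce(p, y):
    # Binary cross-entropy for one prediction p against target y in {0, 1}.
    eps = 1e-7
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def semi_supervised_loss(pred_l, y_l, pred_u, y_u, beta):
    # Assumed reading: L_rela = l_rela(labeled) + beta * l_rela(unlabeled).
    loss_labeled = sum(bce(p, y) for p, y in zip(pred_l, y_l))
    loss_unlabeled = sum(bce(p, y) for p, y in zip(pred_u, y_u))
    return loss_labeled + beta * loss_unlabeled
```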
4) Minimizing the total loss of the interest propagation network, predicting the visual relationship of interest:
4.1 Adding the loss of the object category prediction in the step 1), the loss of the object and the object to the interest prediction in the step 2) and the loss of the relation predicate interest prediction in the step 3) to obtain the total loss of the interest propagation network, and combining the object to the interest degree and the relation predicate interest degree obtained by minimizing the total loss to obtain the visual relation interest degree. The total loss of the interest propagation network is calculated as follows:
L_pos = -(1 - p_pos)^2 · log(p_pos)
L_neg = -p_neg · log(1 - p_neg)
where L_pos and L_neg denote the losses of positive and negative samples, p_pos and p_neg denote the probability scores of positive and negative samples respectively, L_total is the total loss of the interest propagation network, L_class is the loss of object category prediction, and the remaining terms denote, respectively, the positive and negative losses of object interestingness prediction, of object-pair interestingness prediction, and of relation-predicate interestingness prediction.
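The two per-sample losses are stated explicitly and can be transcribed directly; the positive loss is a focal-style term that down-weights already-confident positives:

```python
import math

def loss_positive(p_pos):
    # L_pos = -(1 - p_pos)^2 * log(p_pos)
    return -((1 - p_pos) ** 2) * math.log(p_pos)

def loss_negative(p_neg):
    # L_neg = -p_neg * log(1 - p_neg)
    return -p_neg * math.log(1 - p_neg)
```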
4.2 Ordering all visual relations according to the interestingness, wherein the visual relation with high interestingness is the finally detected interest visual relation. The interestingness of the visual relationship is calculated as follows:
I_spo = E_so · I_so · P_spo
where I_spo is the interestingness of the visual relationship, I_so and P_spo denote the object-pair interestingness and the relation-predicate interestingness respectively, and E_so is a binary parameter: E_so is 0 when the subject and object of the pair are the same object, and 1 otherwise.
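The final combination I_spo = E_so · I_so · P_spo is simple enough to transcribe directly:

```python
def visual_relation_interestingness(i_so, p_spo, subject_idx, object_idx):
    # E_so is 0 when subject and object are the same instance, else 1,
    # which zeroes out degenerate self-relations.
    e_so = 0 if subject_idx == object_idx else 1
    return e_so * i_so * p_spo
```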
The method of the present invention can be implemented by a computer program, and thus there is also provided an interest visual relationship detection apparatus based on an interest propagation network, the apparatus being configured with a computer program, which when executed implements the interest visual relationship detection method of the present invention.
The invention was implemented on the MSCOCO image dataset and compared with the results of a conventional visual relationship detection method. FIG. 2 and FIG. 3 are comparative examples of a conventional detection result and the result of the invention. FIG. 2(a) and FIG. 3(a) are input images with the objects involved in the detection results marked. FIG. 2(b) shows a conventional detection result containing as many as 24 visual relationships, most of which are only weakly associated with the subject content of the input image. FIG. 3(b) shows the interest visual relationship detection result of the invention, containing only 5 visual relationships, each strongly correlated with the subject content of the input image.
Claims (9)
1. An interest visual relationship detection method based on an interest propagation network, characterized in that an interest propagation network is established that takes an image as input and outputs the interest visual relationships in the image, the interest propagation network comprising a panoramic object detection module, an object-pair interest prediction module and a relation-predicate interest prediction module; first, the panoramic object detection module extracts objects from the input image, combines them pairwise into object pairs, and calculates the object features of the objects and the joint features of the object pairs; the object-pair interest prediction module generates the visual, semantic and position features of the objects and object pairs and obtains the interest features of the objects and object pairs through linear transformation, thereby predicting the object-pair interestingness; meanwhile, the relation-predicate interest prediction module obtains the interest features of the relation predicates through linear transformation of the visual, semantic and position features of the relation predicates of the object pairs, and uses semi-supervised learning to predict the interestingness of the relation predicates between objects; finally, the object-pair interestingness and the relation-predicate interestingness are combined to obtain the visual-relationship interestingness, and the visual relationships with high interestingness are the finally detected interest visual relationships.
2. The interest visual relationship detection method based on the interest propagation network as claimed in claim 1, comprising the steps of:
1) For an input image, the bounding boxes and categories of all objects are extracted; the features inside the n object bounding boxes are calculated as object features; the n objects are combined pairwise to form n(n-1) object pairs; and the features inside the bounding box of the union of the subject and object of each pair are calculated as joint features;
2) For each object, word-embedding features of the category name are obtained with a pre-trained GloVe model; the object features serve as visual features, the word-embedding features of the category name as semantic features, and the position of the object relative to the whole image as position features; the three features are combined to obtain the interest features of the object; for each object pair, the three features of the subject and of the object are calculated in the same way, then the three features of the object pair are calculated and combined to obtain the interest features of the object pair; the interest features of the objects and object pairs are input into a graph convolutional neural network to predict the object-pair interestingness;
3) For each object pair, calculating visual features, semantic features and position features of the relation predicates to obtain interest features of the object pair relation predicates, and for each relation predicate, using semi-supervised learning to predict the probability that the relation predicate is interesting under the condition that the object pair is interesting, namely, the interest degree of the relation predicate;
4) Adding the loss of the object category prediction in the step 1), the loss of the object and the object for the interest degree prediction in the step 2) and the loss of the relation predicate interest degree prediction in the step 3) to obtain total loss, combining the object for the interest degree and the relation predicate interest degree obtained by minimizing the total loss to obtain visual relation interest degree, and sorting all visual relations according to the interest degree, wherein the visual relation with high interest degree is the finally detected interest visual relation.
3. The interest visual relationship detection method based on the interest propagation network as claimed in claim 2, wherein in the step 2), the calculation method of the position characteristics of the object is as follows:
where Loc_i is the position feature of object i, ⊕ denotes the concatenation operation, x_l, y_t, x_r, y_b are the coordinates of the left, top, right and bottom boundaries of object i, and w and h are the width and height of the input image, respectively.
4. The interest visual relationship detection method based on the interest propagation network as claimed in claim 2, wherein in the step 2), the method for calculating the position characteristics of the object pair is as follows:
where Loc_p is the position feature of object pair p, Loc_i is the position feature of object i, s_p and o_p denote the subject and object of the pair respectively, and ∪ denotes the object-level concatenation operation.
5. The interest visual relationship detection method based on the interest propagation network as claimed in claim 2, wherein the method for calculating the visual characteristics of the object pair is as follows:
where F_p is the visual feature of object pair p, F_{s_p} and F_{o_p} are the object features of the pair's subject and object respectively, and F_{s_p∪o_p} is the joint feature of the pair's subject and object.
6. The interest visual relationship detection method based on the interest propagation network as claimed in claim 2, wherein in step 3), the method for calculating the relationship predicate position feature of the object pair is as follows:
where Loc′_p is the relation-predicate position feature of object pair p, ⊕ denotes the concatenation operation, x_l, y_t, x_r, y_b are the coordinates of the left, top, right and bottom boundaries of object i, s_p and o_p denote the subject and object of the pair respectively, ∪ denotes the object-level concatenation operation, and w′ and h′ are the width and height of the bounding box of the union of the pair's subject and object.
7. The interest visual relationship detection method based on the interest propagation network according to claim 2, wherein in the semi-supervised learning prediction relationship predicate interestingness in step 3), the calculation method of the prediction loss is as follows:
where L_rela is the loss of the relation-predicate interestingness prediction, l_rela is the loss function, ŷ_l and ŷ_u denote the prediction results on labeled and unlabeled data respectively, y_l and y_u denote the ground-truth results for labeled and unlabeled data respectively, and β is the loss weight of the unlabeled data.
8. The interest visual relationship detection method based on the interest propagation network as claimed in claim 2, wherein the calculation method of the total loss in step 4) is as follows:
L_pos = -(1 - p_pos)^2 · log(p_pos)

L_neg = -p_neg · log(1 - p_neg)

L_total = L_class + L_obj^{pos} + L_obj^{neg} + L_pair^{pos} + L_pair^{neg} + L_rela^{pos} + L_rela^{neg}

wherein L_pos and L_neg denote the losses of positive and negative samples, p_pos and p_neg respectively denote the probability scores of positive and negative samples, L_total is the total loss, L_class is the object class prediction loss, L_obj^{pos} and L_obj^{neg} respectively denote the positive and negative losses of the object interestingness prediction, L_pair^{pos} and L_pair^{neg} respectively denote the positive and negative losses of the object-pair interestingness prediction, and L_rela^{pos} and L_rela^{neg} respectively denote the positive and negative losses of the relation predicate interestingness prediction.
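The two per-sample losses L_pos and L_neg given in this claim can be written directly; the modulating factors (1 - p_pos)^2 and p_neg down-weight easy, well-classified samples, in the spirit of a focal loss. This sketch only covers the two formulas stated in the claim, not the full training loop:

```python
import math

def loss_positive(p_pos):
    """L_pos = -(1 - p_pos)^2 * log(p_pos).
    The (1 - p_pos)^2 factor shrinks the loss of confident positives."""
    return -(1.0 - p_pos) ** 2 * math.log(p_pos)

def loss_negative(p_neg):
    """L_neg = -p_neg * log(1 - p_neg).
    The p_neg factor shrinks the loss of confident negatives."""
    return -p_neg * math.log(1.0 - p_neg)
```

Both losses vanish when the sample is perfectly classified (p_pos = 1 or p_neg = 0) and grow as the prediction moves toward the wrong side.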
9. An interest visual relationship detection apparatus based on an interest propagation network, characterized in that the apparatus is configured with a computer program which, when executed, implements the interest visual relationship detection method of claim 1, corresponding to the interest propagation network of claim 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010848981.0A CN111985505B (en) | 2020-08-21 | 2020-08-21 | Interest visual relation detection method and device based on interest propagation network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111985505A CN111985505A (en) | 2020-11-24 |
CN111985505B true CN111985505B (en) | 2024-02-13 |
Family
ID=73442732
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010848981.0A Active CN111985505B (en) | 2020-08-21 | 2020-08-21 | Interest visual relation detection method and device based on interest propagation network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111985505B (en) |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105045907A (en) * | 2015-08-10 | 2015-11-11 | 北京工业大学 | Method for constructing visual attention-label-user interest tree for personalized social image recommendation |
CN108229491A (en) * | 2017-02-28 | 2018-06-29 | 北京市商汤科技开发有限公司 | The method, apparatus and equipment of detection object relationship from picture |
CN108229272A (en) * | 2017-02-23 | 2018-06-29 | 北京市商汤科技开发有限公司 | Vision relationship detection method and device and vision relationship detection training method and device |
CN108229477A (en) * | 2018-01-25 | 2018-06-29 | 深圳市商汤科技有限公司 | For visual correlation recognition methods, device, equipment and the storage medium of image |
WO2019035771A1 (en) * | 2017-08-17 | 2019-02-21 | National University Of Singapore | Video visual relation detection methods and systems |
CN110796472A (en) * | 2019-09-02 | 2020-02-14 | 腾讯科技(深圳)有限公司 | Information pushing method and device, computer readable storage medium and computer equipment |
CN110889397A (en) * | 2018-12-28 | 2020-03-17 | 南京大学 | Visual relation segmentation method taking human as main body |
CN111125406A (en) * | 2019-12-23 | 2020-05-08 | 天津大学 | Visual relation detection method based on self-adaptive cluster learning |
CN111325279A (en) * | 2020-02-26 | 2020-06-23 | 福州大学 | Pedestrian and personal sensitive article tracking method fusing visual relationship |
CN111325243A (en) * | 2020-02-03 | 2020-06-23 | 天津大学 | Visual relation detection method based on regional attention learning mechanism |
CN111368829A (en) * | 2020-02-28 | 2020-07-03 | 北京理工大学 | Visual semantic relation detection method based on RGB-D image |
CN116089732A (en) * | 2023-04-11 | 2023-05-09 | 江西时刻互动科技股份有限公司 | User preference identification method and system based on advertisement click data |
CN116628052A (en) * | 2022-02-18 | 2023-08-22 | 罗伯特·博世有限公司 | Apparatus and computer-implemented method for adding quantity facts to a knowledge base |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7965866B2 (en) * | 2007-07-03 | 2011-06-21 | Shoppertrak Rct Corporation | System and process for detecting, tracking and counting human objects of interest |
US8548231B2 (en) * | 2009-04-02 | 2013-10-01 | Siemens Corporation | Predicate logic based image grammars for complex visual pattern recognition |
- 2020-08-21 CN CN202010848981.0A patent/CN111985505B/en active Active
Non-Patent Citations (4)
Title |
---|
Yu, Fan, et al. Visual Relation of Interest Detection. MM '20: Proceedings of the 28th ACM International Conference on Multimedia. 2020, pp. 1386-1394. * |
Zhou, Hao, et al. Visual Relationship Detection with Relative Location Mining. MM '19: Proceedings of the 27th ACM International Conference on Multimedia. 2019, pp. 30-38. * |
Visual Relation Detection Based on Object-Pair Filtering and Joint Predicate Recognition; Chen Fangfang; China Master's Theses Full-text Database, Information Science and Technology (No. 8); I138-657 *
A Survey of Video Group Behavior Recognition; Wu Jianchao, et al.; Journal of Software; Vol. 34, No. 2; pp. 964-984 *
Also Published As
Publication number | Publication date |
---|---|
CN111985505A (en) | 2020-11-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109740148B (en) | Text emotion analysis method combining BiLSTM with Attention mechanism | |
CN110598005B (en) | Public safety event-oriented multi-source heterogeneous data knowledge graph construction method | |
CN110516067B (en) | Public opinion monitoring method, system and storage medium based on topic detection | |
CN105760507B (en) | Cross-module state topic relativity modeling method based on deep learning | |
CN106682059B (en) | Modeling and extraction from structured knowledge of images | |
US20200097604A1 (en) | Stacked cross-modal matching | |
CN113220919B (en) | Dam defect image text cross-modal retrieval method and model | |
CN111259940B (en) | Target detection method based on space attention map | |
CN111680159B (en) | Data processing method and device and electronic equipment | |
CN114936623B (en) | Aspect-level emotion analysis method integrating multi-mode data | |
CN112434732A (en) | Deep learning classification method based on feature screening | |
CN103336835B (en) | Image retrieval method based on weight color-sift characteristic dictionary | |
CN115311463B (en) | Category-guided multi-scale decoupling marine remote sensing image text retrieval method and system | |
CN114780690A (en) | Patent text retrieval method and device based on multi-mode matrix vector representation | |
CN114429566A (en) | Image semantic understanding method, device, equipment and storage medium | |
CN112115253A (en) | Depth text ordering method based on multi-view attention mechanism | |
CN107526721A (en) | A kind of disambiguation method and device to electric business product review vocabulary | |
Gong et al. | A method for wheat head detection based on yolov4 | |
Du | High-precision portrait classification based on mtcnn and its application on similarity judgement | |
CN115658934A (en) | Image-text cross-modal retrieval method based on multi-class attention mechanism | |
CN116610778A (en) | Bidirectional image-text matching method based on cross-modal global and local attention mechanism | |
CN115311465A (en) | Image description method based on double attention models | |
US11281714B2 (en) | Image retrieval | |
CN112711693A (en) | Litigation clue mining method and system based on multi-feature fusion | |
CN111985505B (en) | Interest visual relation detection method and device based on interest propagation network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |