CN111985505B - Interest visual relation detection method and device based on interest propagation network

Interest visual relation detection method and device based on interest propagation network

Info

Publication number
CN111985505B
CN111985505B
Authority
CN
China
Prior art keywords
interest
visual
relation
objects
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010848981.0A
Other languages
Chinese (zh)
Other versions
CN111985505A (en)
Inventor
任桐炜
武港山
王浩楠
于凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN202010848981.0A
Publication of CN111985505A
Application granted
Publication of CN111985505B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods

Abstract

An interest visual relationship detection method and device based on an interest propagation network extracts objects from an input image, combines the objects pairwise into object pairs, and computes the corresponding object features and joint features; it generates the visual, semantic and position features of the objects and object pairs and obtains their interest features through linear transformation, thereby predicting the object-pair interestingness; the interest features of the relation predicates are likewise obtained by linear transformation of the visual, semantic and position features of the relation predicates, and the interestingness of the relation predicates between objects is predicted; finally, the object-pair interestingness and the relation-predicate interestingness are combined into the visual-relationship interestingness, and the visual relationships with high interestingness are the finally detected interest visual relationships. By taking semantic importance as the criterion during visual relationship detection, the invention predicts relationship interestingness more reasonably, finds the interest visual relationships that accurately convey the main content of the image, and has good universality and practicability.

Description

Interest visual relation detection method and device based on interest propagation network
Technical Field
The invention belongs to the technical field of computer vision, relates to visual relationship detection in images, and particularly relates to an interest visual relationship detection method based on an interest propagation network.
Background Art
As a bridge between vision and natural language, visual relationship detection aims at describing the objects in an image and the interactions between them in the form of a relationship triplet <subject, relation predicate, object>. The subject and object are typically represented by an object's bounding box and category; relation predicates are typically verbs (e.g., "lift", "ride", "see"), directional words (e.g., "beside", "in front of", "above") or verb phrases (e.g., "stand beside", "sit on", "walk over"). Visual relationship detection helps machines understand and analyze the content of images or videos and can be widely applied in scenarios such as image retrieval and video analysis.
Conventional visual relationship detection methods aim to detect all visual relationships in an image. In practice, because of the combinatorial explosion of subjects, relation predicates and objects, conventional approaches typically detect an excessively rich set of visual relationships, as shown in FIG. 2. Although this describes the image content more completely, the excess of detail can mislead a machine's understanding of the main content of the image, reducing accuracy in scenarios such as image retrieval and hindering accurate analysis of images or videos.
Intuitively, not all detected visual relationships are truly "interesting" in the semantic sense: not every visual relationship expresses the main content of an image, and often only a small portion of them is important for conveying it; such relationships are the interest visual relationships. The goal of interest visual relationship detection is to detect the visual relationships that are truly important for conveying the main content of the image, i.e., the visual relationships that are "interesting".
At present, no research work has attempted to detect interest visual relationships; only some related works measure the visual saliency of relationships through an attention module, determining saliency weights to find salient visual relationships. However, such approaches consider only the visual saliency of a relationship and ignore its semantic importance, so the resulting relationships are not necessarily truly "interesting".
Disclosure of Invention
The invention aims to solve the following problem: an excess of visual relationships in an image easily biases machine understanding, so the interest visual relationships that accurately convey the main content of the image need to be detected, in order to help machines understand and analyze images or videos more accurately.
The technical scheme of the invention is as follows: an interest visual relationship detection method based on an interest propagation network establishes an interest propagation network that takes an image as input and outputs the interest visual relationships in the image; the interest propagation network comprises a panoramic object detection module, an object-pair interest prediction module and a relation-predicate interest prediction module. Firstly, objects are extracted from the input image by the panoramic object detection module and combined pairwise into object pairs, and the object features of the objects and the joint features of the object pairs are computed; the object-pair interest prediction module generates the visual, semantic and position features of the objects and object pairs and obtains their respective interest features, thereby predicting the object-pair interestingness. Meanwhile, the relation-predicate interest prediction module obtains the interest features of the relation predicates from the visual, semantic and position features of the relation predicates of the object pairs, and uses semi-supervised learning to predict the interestingness of the relation predicates between objects. Finally, the object-pair interestingness and the relation-predicate interestingness are combined into the visual-relationship interestingness, and the visual relationships with high interestingness are the finally detected interest visual relationships.
Further, the invention comprises the following steps:
1) For an input image, extracting the bounding boxes and categories of all objects, computing the features within the bounding boxes of the n objects as the object features, combining the n objects pairwise into n(n-1) object pairs, and computing the features within the union bounding box of the subject and object of each pair as the joint features;
2) For each object, obtaining the word embedding features of its category name from a pre-trained GloVe model, taking the object features as its visual features, the word embedding features of the category name as its semantic features, and the position of the object relative to the whole image as its position features, and combining the three features to obtain the interest features of the object; for each object pair, computing the three features of the subject and the object in the same way, then computing the three features of the pair, and combining them to obtain the interest features of the pair; inputting the interest features of the objects and object pairs into a graph convolutional neural network to predict the object-pair interestingness;
3) For each object pair, computing the visual, semantic and position features of its relation predicate to obtain the interest features of the relation predicate, and, for each relation predicate, using semi-supervised learning to predict the probability that the relation predicate is interesting given that the object pair is interesting, i.e., the relation-predicate interestingness;
4) Adding the loss of object category prediction in step 1), the loss of object and object-pair interestingness prediction in step 2) and the loss of relation-predicate interestingness prediction in step 3) to obtain the total loss; combining the object-pair interestingness and relation-predicate interestingness obtained by minimizing the total loss into the visual-relationship interestingness; and sorting all visual relationships by interestingness, the visual relationships with high interestingness being the finally detected interest visual relationships.
The invention also provides an interest visual relationship detection device based on an interest propagation network; the device is configured with a computer program that implements the interest propagation network and, when executed, carries out the above interest visual relationship detection method.
The invention has the following beneficial effects: it provides a scheme for solving the problem of biased machine understanding caused by excessively rich visual relationships in images; by considering the semantic, position and visual features of objects and object pairs, it predicts relationship interestingness reasonably with semantic importance as the criterion during visual relationship detection, and finds the interest visual relationships that accurately convey the main content of the image. The method has good universality and practicability.
Drawings
FIG. 1 shows the architecture of the interest propagation network of the present invention and the flow of interest visual relationship detection.
FIG. 2 shows an example of the excessively rich visual relationships detected by a conventional method.
FIG. 3 shows an example of the results of the interest visual relationship detection method of the present invention.
Detailed Description
The interest visual relationship detection method based on an interest propagation network provides a solution to the problem of biased machine understanding caused by excessively rich visual relationships in an image. For an input image, the interest features of the objects, object pairs and relation predicates are obtained by linear transformation of the combined semantic, position and visual features of the objects and object pairs; the interestingness of the visual relationships is predicted reasonably with semantic importance as the criterion; and interest visual relationship results that accurately convey the main content of the image are produced.
The practice of the invention is specifically described below.
As shown in FIG. 1, the invention establishes an interest propagation network that takes an image as input and outputs the interest visual relationships in the image; it comprises a panoramic object detection module, an object-pair interest prediction module and a relation-predicate interest prediction module. The panoramic object detection module performs panoptic segmentation on the image, in which the content of the image is divided into the "things" and "stuff" categories according to whether it has a fixed shape: objects with fixed shapes, such as people and vehicles, belong to things (roughly, countable nouns are things), while objects without a fixed shape, such as sky and grass, belong to stuff (roughly, uncountable nouns are stuff). The objects in the image are extracted through panoptic segmentation, and an instance encoder obtains the object features and the joint features of the object pairs, which are fed into the object-pair interest prediction module and the relation-predicate interest prediction module respectively. A semantic encoder, a visual encoder and a position encoder obtain the semantic, visual and position features of the objects, object pairs and relation predicates; linear transformation yields their interest features; from the interest features, the object-pair interestingness and the relation-predicate interestingness are predicted by supervised and semi-supervised learning, respectively; the two interestingness scores are combined into the visual-relationship interestingness, and the interest visual relationships are sorted and output according to it.
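For illustration only, the following is a minimal sketch of how one of these encoders might fuse visual, semantic and position features into an interest feature by a single linear transformation, assuming PyTorch; the class name, feature dimensions and ReLU activation are hypothetical choices, not the network's actual implementation.

```python
import torch
import torch.nn as nn

class InterestFeatureEncoder(nn.Module):
    """Fuses visual, semantic and position features into one interest feature."""
    def __init__(self, vis_dim, sem_dim, loc_dim, out_dim):
        super().__init__()
        self.fc = nn.Linear(vis_dim + sem_dim + loc_dim, out_dim)

    def forward(self, vis, sem, loc):
        # Concatenate the three feature types, then apply the linear transform.
        return torch.relu(self.fc(torch.cat([vis, sem, loc], dim=-1)))

n, vis_dim, sem_dim = 4, 1024, 300           # hypothetical sizes
encoder = InterestFeatureEncoder(vis_dim, sem_dim, loc_dim=4, out_dim=512)
vis = torch.randn(n, vis_dim)                # object features from the detector
sem = torch.randn(n, sem_dim)                # GloVe embeddings of category names
loc = torch.rand(n, 4)                       # normalized box coordinates
interest = encoder(vis, sem, loc)            # (n, 512) interest features
```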
On the basis of the interest propagation network, objects are first extracted from the input image by the panoramic object detection module and combined pairwise into object pairs, and the object features of the objects and the joint features of the object pairs are computed; the object-pair interest prediction module generates the visual, semantic and position features of the objects and object pairs and obtains their respective interest features, thereby predicting the object-pair interestingness. Meanwhile, the relation-predicate interest prediction module obtains the interest features of the relation predicates from the visual, semantic and position features of the relation predicates of the object pairs, and uses semi-supervised learning to predict the interestingness of the relation predicates between objects. Finally, the object-pair interestingness and the relation-predicate interestingness are combined into the visual-relationship interestingness, and the visual relationships with high interestingness are the finally detected interest visual relationships.
The implementation of the present invention is described in detail below. The invention comprises the following steps:
1) For an input image, computing the object features and joint features with the panoramic object detection module of the interest propagation network:
1.1) Extracting the bounding boxes and categories of all objects in the image;
1.2) Computing the features within the bounding boxes of the n objects from step 1.1) as the object features;
1.3) Combining the n objects from step 1.1) pairwise into n(n-1) object pairs, and computing the features within the union bounding box of the subject and object of each pair as the joint features.
2) For the objects extracted in step 1) and the object pairs they form, computing the object-pair interestingness with the object-pair interest prediction module of the interest propagation network:
2.1) For each object extracted in step 1), taking its object features as the visual features, obtaining the word embedding features of its category name from a pre-trained GloVe model as the semantic features, taking the position of the object relative to the whole image as the position features, and combining the three features to obtain the interest features of the object. The position features of an object are computed as:

Loc_i = (x_i^l / w) ‖ (y_i^t / h) ‖ (x_i^r / w) ‖ (y_i^b / h)

where Loc_i is the position feature of object i, ‖ denotes the juxtaposition (concatenation) operation, x_i^l, y_i^t, x_i^r and y_i^b are the coordinates of the left, top, right and bottom boundaries of object i, respectively, and w and h are the width and height of the input image, respectively.
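The normalization above transcribes directly into code; the helper below is a sketch with hypothetical argument names, not the patented code.

```python
def object_position_feature(box, image_w, image_h):
    """box = (x_left, y_top, x_right, y_bottom) in pixels."""
    xl, yt, xr, yb = box
    # Juxtapose the four boundary coordinates, normalized by the image size.
    return [xl / image_w, yt / image_h, xr / image_w, yb / image_h]

# Example: a box in a 640x480 image.
print(object_position_feature((50, 40, 150, 100), 640, 480))
# -> [0.078125, 0.0833..., 0.234375, 0.2083...]
```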
2.2) For each object pair formed in step 1), computing the three features of the subject and the object in the same way, then computing the three features of the pair, and combining them to obtain the interest features of the pair. The position features of an object pair are computed as:

Loc_p = Loc_{s_p} ∪ Loc_{o_p}

where Loc_p is the position feature of object pair p, Loc_i is the position feature of object i, s_p and o_p denote the subject and object of the pair, respectively, and ∪ denotes the object-level juxtaposition operation.

The visual features of an object pair are computed as:

F_p = F_{s_p} ‖ F_{o_p} ‖ F_{s_p ∪ o_p}

where F_p is the visual feature of object pair p, F_{s_p} and F_{o_p} are the object features of the subject and object of the pair, respectively, and F_{s_p ∪ o_p} is the joint feature of the subject and object of the pair.
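Both juxtaposition operations amount to concatenation; the sketch below assumes PyTorch tensors, and everything apart from the concatenations themselves is illustrative.

```python
import torch

def pair_position_feature(loc_s, loc_o):
    # Object-level juxtaposition of subject and object position features.
    return torch.cat([loc_s, loc_o], dim=-1)

def pair_visual_feature(f_s, f_o, f_joint):
    # Subject feature, object feature and their union-box joint feature.
    return torch.cat([f_s, f_o, f_joint], dim=-1)

loc_s, loc_o = torch.rand(4), torch.rand(4)
f_s, f_o, f_joint = torch.randn(1024), torch.randn(1024), torch.randn(1024)
print(pair_position_feature(loc_s, loc_o).shape)     # torch.Size([8])
print(pair_visual_feature(f_s, f_o, f_joint).shape)  # torch.Size([3072])
```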
2.3) Inputting the interest features from steps 2.1) and 2.2) into a graph convolutional neural network and predicting the object-pair interestingness.
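As a rough illustration of this step, the sketch below passes object and object-pair interest features through one simplified graph-convolution layer, assuming PyTorch; the adjacency wiring (each pair node linked to its subject and object nodes), the single layer and the sigmoid scorer are assumptions, not the patented architecture.

```python
import torch
import torch.nn as nn

class SimpleGraphConv(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(dim, dim)

    def forward(self, x, adj):
        # Mean-aggregate each node's neighborhood, then transform: D^-1 A X W.
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1)
        return torch.relu(self.fc(adj @ x / deg))

n, dim = 3, 512
num_pairs = n * (n - 1)                      # ordered pairs, as in step 1)
x = torch.randn(n + num_pairs, dim)          # object then pair interest features
adj = torch.eye(n + num_pairs)               # self-loops
p = n
for s in range(n):
    for o in range(n):
        if s != o:
            adj[p, s] = adj[s, p] = 1.0      # pair node <-> its subject
            adj[p, o] = adj[o, p] = 1.0      # pair node <-> its object
            p += 1
gcn = SimpleGraphConv(dim)
scorer = nn.Linear(dim, 1)
pair_interest = torch.sigmoid(scorer(gcn(x, adj)[n:]))  # one score per pair
```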
3) For the object pairs formed in step 1), computing the relation-predicate interestingness with the relation-predicate interest prediction module of the interest propagation network:
3.1) For each object pair formed in step 1), computing the visual, semantic and position features of its relation predicate and combining them to obtain the interest features of the relation predicate. The position features of the relation predicate of an object pair are computed as:

Loc'_p = Loc'_{s_p} ∪ Loc'_{o_p}, with Loc'_i = (x_i^l / w') ‖ (y_i^t / h') ‖ (x_i^r / w') ‖ (y_i^b / h')

where Loc'_p is the relation-predicate position feature of object pair p, and w' and h' are the width and height of the union bounding box of the subject and object of the pair. The visual features of the relation predicate are computed in the same way as the visual features of the object pair.
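A sketch of the union-box normalization described above; since it is not stated here whether the coordinates are also shifted to the union-box origin, this version simply divides the raw boundary coordinates by w' and h' as in the formula.

```python
def union_box(box_s, box_o):
    """Boxes are (x_left, y_top, x_right, y_bottom) in pixels."""
    return (min(box_s[0], box_o[0]), min(box_s[1], box_o[1]),
            max(box_s[2], box_o[2]), max(box_s[3], box_o[3]))

def predicate_position_feature(box_s, box_o):
    xl, yt, xr, yb = union_box(box_s, box_o)
    w_u, h_u = xr - xl, yb - yt              # w' and h' of the union box
    feat = []
    for box in (box_s, box_o):               # object-level juxtaposition
        feat += [box[0] / w_u, box[1] / h_u, box[2] / w_u, box[3] / h_u]
    return feat

print(predicate_position_feature((50, 40, 150, 100), (120, 60, 300, 200)))
```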
3.2) For each relation predicate, using semi-supervised learning to predict the probability that the relation predicate is interesting given that the object pair is interesting, i.e., the relation-predicate interestingness. The semi-supervised learning loss is computed as:

L_rela = l_rela(p̂^l, p^l) + β · l_rela(p̂^u, p^u)

where L_rela is the loss of the relation-predicate interest prediction module, l_rela is the loss function, p̂^l and p̂^u denote the predictions on labeled and unlabeled data, respectively, p^l and p^u denote the ground-truth results of labeled and unlabeled data, respectively, and β is the loss weight of the unlabeled data.
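A sketch of this two-term loss, assuming PyTorch; the choice of binary cross-entropy for l_rela and of pseudo-labels as the targets for unlabeled data are assumptions.

```python
import torch
import torch.nn.functional as F

def rela_loss(pred_labeled, y_labeled, pred_unlabeled, y_pseudo, beta=0.5):
    # Supervised term on labeled pairs plus a beta-weighted term on
    # unlabeled pairs whose targets come from pseudo-labels.
    loss_l = F.binary_cross_entropy(pred_labeled, y_labeled)
    loss_u = F.binary_cross_entropy(pred_unlabeled, y_pseudo)
    return loss_l + beta * loss_u

pred_l = torch.tensor([0.9, 0.2]); y_l = torch.tensor([1.0, 0.0])
pred_u = torch.tensor([0.7]);      y_u = torch.tensor([1.0])
print(rela_loss(pred_l, y_l, pred_u, y_u))
```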
4) Minimizing the total loss of the interest propagation network, predicting the visual relationship of interest:
4.1) Adding the loss of object category prediction in step 1), the loss of object and object-pair interestingness prediction in step 2) and the loss of relation-predicate interestingness prediction in step 3) to obtain the total loss of the interest propagation network, and combining the object-pair interestingness and relation-predicate interestingness obtained by minimizing the total loss into the visual-relationship interestingness. The total loss of the interest propagation network is computed as:
L_pos = -(1 - p_pos)^2 · log(p_pos)
L_neg = -p_neg · log(1 - p_neg)
L_total = L_class + L_o^pos + L_o^neg + L_p^pos + L_p^neg + L_r^pos + L_r^neg

where L_pos and L_neg denote the loss of positive and negative samples, p_pos and p_neg denote the probability scores of positive and negative samples, respectively, L_total is the total loss of the interest propagation network, L_class is the loss of object category prediction, L_o^pos and L_o^neg denote the positive and negative losses of object interestingness prediction, L_p^pos and L_p^neg denote the positive and negative losses of object-pair interestingness prediction, and L_r^pos and L_r^neg denote the positive and negative losses of relation-predicate interestingness prediction.
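The two sample losses transcribe directly into code; the sketch below assumes PyTorch and adds a small clamp for numerical stability.

```python
import torch

def pos_loss(p_pos, eps=1e-7):
    # L_pos = -(1 - p_pos)^2 * log(p_pos): down-weights easy positives.
    return -((1 - p_pos) ** 2) * torch.log(p_pos.clamp(min=eps))

def neg_loss(p_neg, eps=1e-7):
    # L_neg = -p_neg * log(1 - p_neg): penalizes confident false positives.
    return -p_neg * torch.log((1 - p_neg).clamp(min=eps))

print(pos_loss(torch.tensor(0.9)))  # small: confident correct positive
print(neg_loss(torch.tensor(0.9)))  # large: confident false positive
```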
4.2) Sorting all visual relationships by interestingness, the visual relationships with high interestingness being the finally detected interest visual relationships. The interestingness of a visual relationship is computed as:
I_spo = E_so · I_so · P_spo

where I_spo is the interestingness of the visual relationship, I_so and P_spo denote the object-pair interestingness and the relation-predicate interestingness, respectively, and E_so is a binary parameter: E_so is 0 when the subject and object of the pair are the same object, and 1 otherwise.
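A sketch of the final combination and ranking step based on the formula above; the candidate tuple format and the equality test standing in for the same-object check are illustrative.

```python
def rank_relations(candidates):
    """candidates: list of (subject, predicate, object, I_so, P_spo)."""
    scored = []
    for s, pred, o, i_so, p_spo in candidates:
        e_so = 0.0 if s == o else 1.0        # E_so suppresses self-relations
        scored.append((e_so * i_so * p_spo, (s, pred, o)))
    return sorted(scored, reverse=True)      # highest interestingness first

cands = [("person", "ride", "horse", 0.9, 0.8),
         ("person", "beside", "tree", 0.4, 0.5),
         ("horse1", "next to", "horse1", 0.7, 0.9)]  # same object instance
for score, triplet in rank_relations(cands):
    print(f"{score:.2f}", triplet)
```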
The method of the present invention can be implemented by a computer program; accordingly, an interest visual relationship detection device based on an interest propagation network is also provided, the device being configured with a computer program that, when executed, implements the interest visual relationship detection method of the present invention.
The invention was implemented on the MS COCO image dataset and compared with the results of a conventional visual relationship detection method. Figs. 2 and 3 compare the conventional visual relationship detection results with the results of the present invention. Fig. 2(a) and Fig. 3(a) are input images with the objects involved in the visual relationship detection results marked. Fig. 2(b) shows the result of conventional visual relationship detection: it contains as many as 24 visual relationships, most of which are only weakly related to the main content of the input image. Fig. 3(b) shows the result of the interest visual relationship detection of the present invention: it contains only 5 visual relationships, and they are strongly related to the main content of the input image.

Claims (9)

1. An interest visual relationship detection method based on an interest propagation network, characterized in that an interest propagation network is established that takes an image as input and outputs the interest visual relationships in the image, the interest propagation network comprising a panoramic object detection module, an object-pair interest prediction module and a relation-predicate interest prediction module; firstly, objects are extracted from the input image by the panoramic object detection module and combined pairwise into object pairs, and the object features of the objects and the joint features of the object pairs are computed; the object-pair interest prediction module generates the visual, semantic and position features of the objects and object pairs and obtains their interest features through linear transformation, thereby predicting the object-pair interestingness; meanwhile, the relation-predicate interest prediction module obtains the interest features of the relation predicates through linear transformation of the visual, semantic and position features of the relation predicates of the object pairs, and uses semi-supervised learning to predict the interestingness of the relation predicates between objects; finally, the object-pair interestingness and the relation-predicate interestingness are combined into the visual-relationship interestingness, and the visual relationships with high interestingness are the finally detected interest visual relationships.
2. The interest visual relationship detection method based on the interest propagation network as claimed in claim 1, comprising the steps of:
1) For an input image, extracting the bounding boxes and categories of all objects, computing the features within the bounding boxes of the n objects as the object features, combining the n objects pairwise into n(n-1) object pairs, and computing the features within the union bounding box of the subject and object of each pair as the joint features;
2) For each object, obtaining the word embedding features of its category name from a pre-trained GloVe model, taking the object features as its visual features, the word embedding features of the category name as its semantic features, and the position of the object relative to the whole image as its position features, and combining the three features to obtain the interest features of the object; for each object pair, computing the three features of the subject and the object in the same way, then computing the three features of the pair, and combining them to obtain the interest features of the pair; inputting the interest features of the objects and object pairs into a graph convolutional neural network to predict the object-pair interestingness;
3) For each object pair, computing the visual, semantic and position features of its relation predicate to obtain the interest features of the relation predicate, and, for each relation predicate, using semi-supervised learning to predict the probability that the relation predicate is interesting given that the object pair is interesting, i.e., the relation-predicate interestingness;
4) Adding the loss of object category prediction in step 1), the loss of object and object-pair interestingness prediction in step 2) and the loss of relation-predicate interestingness prediction in step 3) to obtain the total loss; combining the object-pair interestingness and relation-predicate interestingness obtained by minimizing the total loss into the visual-relationship interestingness; and sorting all visual relationships by interestingness, the visual relationships with high interestingness being the finally detected interest visual relationships.
3. The interest visual relationship detection method based on an interest propagation network as claimed in claim 2, wherein in step 2) the position features of an object are computed as:

Loc_i = (x_i^l / w) ‖ (y_i^t / h) ‖ (x_i^r / w) ‖ (y_i^b / h)

where Loc_i is the position feature of object i, ‖ denotes the juxtaposition operation, x_i^l, y_i^t, x_i^r and y_i^b are the coordinates of the left, top, right and bottom boundaries of object i, respectively, and w and h are the width and height of the input image, respectively.
4. The interest visual relationship detection method based on an interest propagation network as claimed in claim 2, wherein in step 2) the position features of an object pair are computed as:

Loc_p = Loc_{s_p} ∪ Loc_{o_p}

where Loc_p is the position feature of object pair p, Loc_i is the position feature of object i, s_p and o_p denote the subject and object of the pair, respectively, and ∪ denotes the object-level juxtaposition operation.
5. The interest visual relationship detection method based on an interest propagation network as claimed in claim 2, wherein the visual features of an object pair are computed as:

F_p = F_{s_p} ‖ F_{o_p} ‖ F_{s_p ∪ o_p}

where F_p is the visual feature of object pair p, F_{s_p} and F_{o_p} are the object features of the subject and object of the pair, respectively, and F_{s_p ∪ o_p} is the joint feature of the subject and object of the pair.
6. The interest visual relationship detection method based on an interest propagation network as claimed in claim 2, wherein in step 3) the relation-predicate position features of an object pair are computed as:

Loc'_p = Loc'_{s_p} ∪ Loc'_{o_p}, with Loc'_i = (x_i^l / w') ‖ (y_i^t / h') ‖ (x_i^r / w') ‖ (y_i^b / h')

where Loc'_p is the relation-predicate position feature of object pair p, ‖ denotes the juxtaposition operation, x_i^l, y_i^t, x_i^r and y_i^b are the coordinates of the left, top, right and bottom boundaries of object i, s_p and o_p denote the subject and object of the pair, ∪ denotes the object-level juxtaposition operation, and w' and h' are the width and height of the union bounding box of the subject and object of the pair.
7. The interest visual relationship detection method based on an interest propagation network as claimed in claim 2, wherein in the semi-supervised learning prediction of the relation-predicate interestingness in step 3), the prediction loss is computed as:

L_rela = l_rela(p̂^l, p^l) + β · l_rela(p̂^u, p^u)

where L_rela is the loss of relation-predicate interestingness prediction, l_rela is the loss function, p̂^l and p̂^u denote the predictions on labeled and unlabeled data, respectively, p^l and p^u denote the ground-truth results of labeled and unlabeled data, respectively, and β is the loss weight of the unlabeled data.
8. The interest visual relationship detection method based on an interest propagation network as claimed in claim 2, wherein the total loss in step 4) is computed as:

L_pos = -(1 - p_pos)^2 · log(p_pos)
L_neg = -p_neg · log(1 - p_neg)
L_total = L_class + L_o^pos + L_o^neg + L_p^pos + L_p^neg + L_r^pos + L_r^neg

where L_pos and L_neg denote the loss of positive and negative samples, p_pos and p_neg denote the probability scores of positive and negative samples, respectively, L_total is the total loss, L_class is the loss of object category prediction, L_o^pos and L_o^neg denote the positive and negative losses of object interestingness prediction, L_p^pos and L_p^neg denote the positive and negative losses of object-pair interestingness prediction, and L_r^pos and L_r^neg denote the positive and negative losses of relation-predicate interestingness prediction.
9. An interest visual relationship detection device based on an interest propagation network, characterized in that the device is configured with a computer program corresponding to the interest propagation network of claim 1, and the computer program, when executed, implements the interest visual relationship detection method of claim 1.
CN202010848981.0A 2020-08-21 2020-08-21 Interest visual relation detection method and device based on interest propagation network Active CN111985505B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010848981.0A CN111985505B (en) 2020-08-21 2020-08-21 Interest visual relation detection method and device based on interest propagation network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010848981.0A CN111985505B (en) 2020-08-21 2020-08-21 Interest visual relation detection method and device based on interest propagation network

Publications (2)

Publication Number Publication Date
CN111985505A (en) 2020-11-24
CN111985505B (en) 2024-02-13

Family

ID=73442732

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010848981.0A Active CN111985505B (en) 2020-08-21 2020-08-21 Interest visual relation detection method and device based on interest propagation network

Country Status (1)

Country Link
CN (1) CN111985505B (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7965866B2 (en) * 2007-07-03 2011-06-21 Shoppertrak Rct Corporation System and process for detecting, tracking and counting human objects of interest
US8548231B2 (en) * 2009-04-02 2013-10-01 Siemens Corporation Predicate logic based image grammars for complex visual pattern recognition

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045907A (en) * 2015-08-10 2015-11-11 北京工业大学 Method for constructing visual attention-label-user interest tree for personalized social image recommendation
CN108229272A (en) * 2017-02-23 2018-06-29 北京市商汤科技开发有限公司 Vision relationship detection method and device and vision relationship detection training method and device
CN108229491A (en) * 2017-02-28 2018-06-29 北京市商汤科技开发有限公司 The method, apparatus and equipment of detection object relationship from picture
WO2019035771A1 (en) * 2017-08-17 2019-02-21 National University Of Singapore Video visual relation detection methods and systems
CN108229477A (en) * 2018-01-25 2018-06-29 深圳市商汤科技有限公司 For visual correlation recognition methods, device, equipment and the storage medium of image
CN110889397A (en) * 2018-12-28 2020-03-17 南京大学 Visual relation segmentation method taking human as main body
CN110796472A (en) * 2019-09-02 2020-02-14 腾讯科技(深圳)有限公司 Information pushing method and device, computer readable storage medium and computer equipment
CN111125406A (en) * 2019-12-23 2020-05-08 天津大学 Visual relation detection method based on self-adaptive cluster learning
CN111325243A (en) * 2020-02-03 2020-06-23 天津大学 Visual relation detection method based on regional attention learning mechanism
CN111325279A (en) * 2020-02-26 2020-06-23 福州大学 Pedestrian and personal sensitive article tracking method fusing visual relationship
CN111368829A (en) * 2020-02-28 2020-07-03 北京理工大学 Visual semantic relation detection method based on RGB-D image
CN116628052A (en) * 2022-02-18 2023-08-22 罗伯特·博世有限公司 Apparatus and computer-implemented method for adding quantity facts to a knowledge base
CN116089732A (en) * 2023-04-11 2023-05-09 江西时刻互动科技股份有限公司 User preference identification method and system based on advertisement click data

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Yu, Fan, et al. Visual Relation of Interest Detection. MM '20: Proceedings of the 28th ACM International Conference on Multimedia, 2020, pp. 1386-1394. *
Zhou, Hao, et al. Visual Relationship Detection with Relative Location Mining. Proceedings of the 27th ACM International Conference on Multimedia (MM '19), 2019, pp. 30-38. *
Chen Fangfang. Visual Relationship Detection Based on Object Pair Screening and Joint Predicate Recognition. China Master's Theses Full-text Database, Information Science and Technology, No. 8: I138-657. *
Wu Jianchao, et al. A Survey of Video Group Behavior Recognition. Journal of Software, Vol. 34, No. 2: 964-984. *

Also Published As

Publication number Publication date
CN111985505A (en) 2020-11-24

Similar Documents

Publication Publication Date Title
CN109740148B (en) Text emotion analysis method combining BiLSTM with Attention mechanism
CN110598005B (en) Public safety event-oriented multi-source heterogeneous data knowledge graph construction method
CN110516067B (en) Public opinion monitoring method, system and storage medium based on topic detection
CN105760507B (en) Cross-module state topic relativity modeling method based on deep learning
CN106682059B (en) Modeling and extraction from structured knowledge of images
US20200097604A1 (en) Stacked cross-modal matching
CN113220919B (en) Dam defect image text cross-modal retrieval method and model
CN111259940B (en) Target detection method based on space attention map
CN111680159B (en) Data processing method and device and electronic equipment
CN114936623B (en) Aspect-level emotion analysis method integrating multi-mode data
CN112434732A (en) Deep learning classification method based on feature screening
CN103336835B (en) Image retrieval method based on weight color-sift characteristic dictionary
CN115311463B (en) Category-guided multi-scale decoupling marine remote sensing image text retrieval method and system
CN114780690A (en) Patent text retrieval method and device based on multi-mode matrix vector representation
CN114429566A (en) Image semantic understanding method, device, equipment and storage medium
CN112115253A (en) Depth text ordering method based on multi-view attention mechanism
CN107526721A (en) A kind of disambiguation method and device to electric business product review vocabulary
Gong et al. A method for wheat head detection based on yolov4
Du High-precision portrait classification based on mtcnn and its application on similarity judgement
CN115658934A (en) Image-text cross-modal retrieval method based on multi-class attention mechanism
CN116610778A (en) Bidirectional image-text matching method based on cross-modal global and local attention mechanism
CN115311465A (en) Image description method based on double attention models
US11281714B2 (en) Image retrieval
CN112711693A (en) Litigation clue mining method and system based on multi-feature fusion
CN111985505B (en) Interest visual relation detection method and device based on interest propagation network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant