CN112612900A - Knowledge graph guided multi-scene image generation method - Google Patents

Knowledge graph guided multi-scene image generation method

Info

Publication number: CN112612900A
Application number: CN202011434422.1A
Authority: CN (China)
Prior art keywords: knowledge, graph, matrix, layout, layer
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 肖贺文, 孔雨秋, 刘秀平, 尹宝才
Assignee (current and original): Dalian University of Technology
Application filed by Dalian University of Technology

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation

Abstract

The invention provides a knowledge graph guided multi-scene image generation method, belonging to the field of image generation. The method uses a knowledge graph to assist the image generation task: first, a knowledge graph containing object layout relations is constructed; then a group of object labels is input into the graph, and a layout search module produces multiple layout relation graphs that conform to the facts; finally, as each layout relation graph passes through the image generation module, the generator and discriminator are trained together with the object knowledge matrix and global knowledge vector obtained from the knowledge module, so as to generate a scene image corresponding to each relation graph. Using the knowledge graph, the method realizes a one-to-many task in which one group of labels generates multiple images, and the embedded knowledge representation information improves image generation quality. The method is evaluated on a real image dataset and shows improvement over state-of-the-art baselines.

Description

Knowledge graph guided multi-scene image generation method
Technical Field
The invention belongs to the field of image generation, and particularly relates to a method for generating a plurality of scene images guided by a knowledge graph.
Background
A knowledge graph is a database organized as triples, each storing entity information and the relation between entities. Among its applications, the KG2E method in the trans series is one of the classical knowledge representation methods: it embeds the entities and relations in the graph as high-dimensional Gaussian distributions, and during training uses KL divergence to make the distribution of the difference between the head and tail entities approach the distribution of the relation as closely as possible. Such a knowledge representation method can introduce the information in the graph into other models in distributed form, and it is the method the invention adopts to extract graph information.
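For reference, the KL-based KG2E scoring summarized above can be written compactly. This is a sketch of the standard formulation (He et al., 2015); the notation μ, Σ is ours rather than the patent's, and the sign convention for the entity difference follows the TransE-style reading:

```latex
% Sketch of the asymmetric (KL-based) KG2E energy; notation is ours.
% Entities and relations are Gaussians:
%   h ~ N(mu_h, Sigma_h),  t ~ N(mu_t, Sigma_t),  r ~ N(mu_r, Sigma_r)
\mathcal{P}_{e} = \mathcal{N}\!\left(\mu_t - \mu_h,\; \Sigma_t + \Sigma_h\right),
\qquad
E(h, r, t) = D_{\mathrm{KL}}\!\left(\mathcal{P}_{e} \,\Vert\, \mathcal{P}_{r}\right)
```

Training lowers this energy for observed triples, which is the "distribution approaches the distribution of the relation" behavior described above.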
Cross-modality conversion is a classic task in multi-modality learning; generating images from modalities such as text and sound belongs to this field. At present, image generation is mainly realized with generative adversarial networks. A generative adversarial network consists of a generator and a discriminator, whose specific design is determined by the task. The generator generally consists of a multilayer perceptron and a deep convolutional network: it takes a feature vector extracted from text or sound as input and outputs a generated image. The discriminator consists of a shallow convolutional network: it takes an image as input and outputs a real/fake score for the image, and can more finely output the category of the image. During training, the discriminator aims to give generated images a low score and real images a high score, playing the role of a critic; the generator aims to have its generated images judged high-scoring by the discriminator, passing the fake off as real. The generator and the discriminator are trained alternately, confronting each other, thereby ensuring the quality of the generated images.
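In code, the alternating scheme described above reduces to a pair of optimizer steps per batch. The following is a minimal PyTorch sketch; G, D, their optimizers, and the conditioning input `cond` are illustrative placeholders, not the networks defined later in this patent:

```python
import torch
import torch.nn as nn

def gan_step(G, D, opt_G, opt_D, real_images, cond):
    """One round of alternating adversarial training, as described above."""
    bce = nn.BCEWithLogitsLoss()

    # Discriminator step: real images should score high, generated ones low.
    opt_D.zero_grad()
    fake_images = G(cond).detach()  # block gradients into the generator
    d_real, d_fake = D(real_images), D(fake_images)
    loss_D = bce(d_real, torch.ones_like(d_real)) + \
             bce(d_fake, torch.zeros_like(d_fake))
    loss_D.backward()
    opt_D.step()

    # Generator step: generated images should be judged high-scoring.
    opt_G.zero_grad()
    d_fake = D(G(cond))
    loss_G = bce(d_fake, torch.ones_like(d_fake))
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```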
At present, most methods for synthesizing scene images from text suffer from the following problems: (1) the input text is usually a sentence, and requiring users to compose a sentence in order to generate an image is inconvenient in practical applications; (2) more than one image can conform to a given description, but most current methods can only realize a one-to-one generation task, perform poorly when generating complex scenes with many objects, and cannot produce good layouts; (3) text and images belong to different modalities, and the amount of information text can provide is insufficient to support the generation of high-quality images.
Disclosure of Invention
The purpose of the invention is as follows: aiming at the shortcomings of text-to-image generation methods, the invention provides a knowledge graph guided multi-scene image generation method that takes labels as input, obtains layout relations from a knowledge graph to realize one-to-many generation, and adds knowledge information to the generative adversarial network to improve image generation quality.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method for generating a plurality of scene images guided by a knowledge graph comprises the following steps:
Step S1: constructing a knowledge graph: extracting the required triples in the form (head entity, relation, tail entity) and integrating them into a small knowledge graph;
Step S2: inputting a group of object labels into a layout search module to obtain multiple layout relation graphs that conform to the facts;
Further, step S2 specifically comprises:
Step S21: inputting the object labels into the constructed knowledge graph for graph search, retrieving all triples containing relations between the input labels, and sorting the retrieved triples by occurrence frequency;
Step S22: according to the parameter settings, selecting the required number of triples to form the most probable layout relation graph, while generating multiple other different layout relation graphs by random combination;
Step S3: inputting each layout relation graph into a pre-trained knowledge module to obtain an object knowledge matrix and a global knowledge vector;
Further, step S3 specifically comprises:
Step S31: pre-training the knowledge graph with the classical knowledge representation method KG2E, expressing the knowledge representations corresponding to all objects and relations in the graph as different Gaussian distributions;
Step S32: performing data processing on the layout relation graph and decomposing it into in-graph object labels and in-graph relation labels;
Step S33: inputting the in-graph object labels and in-graph relation labels, and sampling from the pre-trained KG2E knowledge representations of objects and relations to obtain an object knowledge matrix and a relation knowledge matrix;
Step S34: the sum of the object knowledge matrix and the relation knowledge matrix is called the global knowledge matrix, which generates a global knowledge vector through a fully connected layer; the global knowledge vector represents the knowledge information that the whole layout relation graph extracts from the knowledge graph;
Step S4: adding the object knowledge matrix and the global knowledge vector to the generator;
Further, step S4 specifically comprises:
Step S41: initializing and embedding the in-graph object labels and in-graph relation labels obtained by decomposition to obtain an object initial matrix and a relation initial matrix;
Step S42: inputting the object and relation initial matrices into a graph convolution network 5 layers deep to obtain object and relation update matrices;
Step S43: connecting the object knowledge matrix output by the knowledge module with the object update matrix to obtain an object prediction matrix;
Step S44: the object prediction matrix generates the object bounding box positions through multilayer perceptron 1 and the object shape masks through multilayer perceptron 2, which are mapped and combined to generate the scene layout tensor;
Step S45: automatically expanding the global knowledge vector output by the knowledge module to the same size as the picture, connecting it with the scene layout tensor, and inputting the result into a cascade generation network to generate a scene image;
Step S5: adding the object knowledge matrix and the global knowledge vector to the discriminator;
Further, step S5 specifically comprises:
Step S51: when identifying different objects in the image, the object image slices obtained by data processing of the scene image and the object knowledge matrix are input together into convolutional neural network 1 to obtain the real/fake score and the object class prediction of each object image slice;
Step S52: when identifying the whole image, the scene image and the global knowledge vector are input together into convolutional neural network 2 to obtain the real/fake score of the image;
Step S6: training the generator and the discriminator alternately according to the overall loss function, ensuring both the generation quality of the whole image and that the object image slices conform to the categories of their labels. The trained generator is the tool that completes the generation from layout relation graph to scene image.
Compared with the prior art, the invention has the following beneficial effects:
(1) Unlike most methods, whose input is a sentence of text, the input of the invention is a set of selectable labels, which is more convenient for the user; (2) most methods can only complete a one-to-one generation task, but by introducing the constructed knowledge graph, the invention generates multiple layout relation graphs from one group of labels and thus multiple scene images, completing a one-to-many generation task while ensuring that each image has a reasonable layout; (3) in the generative adversarial network, knowledge information obtained from the graph by the knowledge representation method KG2E is added to the generator and the discriminator: the object knowledge matrix is added from the perspective of local objects, and the global knowledge vector from the perspective of global layout, making up for the insufficiency of text information and improving image generation quality. This is also the first application of knowledge-graph knowledge representation in the field of image generation.
Drawings
FIG. 1 shows the overall structure designed by the present invention.
FIG. 2 shows the layout search module designed by the present invention.
FIG. 3 shows the knowledge module designed by the present invention.
FIG. 4 shows the generator structure in the image generation module designed by the present invention.
FIG. 5 shows the discriminator structure in the image generation module designed by the present invention.
Detailed description of the invention
The technical solution of the present invention will be further described with reference to the following specific embodiments and accompanying drawings.
A method for generating a plurality of scene images guided by a knowledge graph comprises the following steps:
Step S1: extracting all triples (head entity, relation, tail entity) from the VG (Visual Genome) dataset, where the set of head and tail entities covers all label objects and the relations include words that can express object layout relations, such as 'adjacent', 'above', and 'behind'; all such triples are extracted and integrated into a small knowledge graph;
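A minimal sketch of this construction step follows, assuming the VG relationship annotations have already been parsed into (head, relation, tail) string triples; the relation whitelist is an illustrative assumption:

```python
from collections import Counter

# Illustrative layout relations; the patent names words such as
# "adjacent", "above", "behind" -- the exact whitelist is an assumption.
LAYOUT_RELATIONS = {"adjacent", "above", "behind", "below", "near", "inside"}

def build_small_kg(triples):
    """Integrate (head, relation, tail) triples into a small knowledge graph.

    Returns a Counter mapping each layout triple to its occurrence
    frequency, which the layout search module later uses for ranking.
    """
    kg = Counter()
    for head, rel, tail in triples:
        if rel in LAYOUT_RELATIONS:
            kg[(head, rel, tail)] += 1
    return kg
```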
Step S2: as shown in FIG. 2, a group of n object labels is input into the layout search module to obtain m layout relation graphs that conform to the facts;
Further, step S2 specifically comprises:
Step S21: inputting the n object labels into the constructed knowledge graph for graph search, retrieving all triples containing relations between the input labels, and sorting the retrieved triples by occurrence frequency from high to low;
Step S22: according to the parameter settings, selecting the required number of top-ranked triples to form the most probable layout relation graph for the n labels, while generating the other different layout relation graphs by random combination, yielding m layout relation graphs in total;
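Steps S21 and S22 might be sketched as follows, reusing the frequency Counter built in step S1; the function and parameter names are illustrative:

```python
import random

def layout_search(kg, labels, num_triples, m, seed=0):
    """Return m layout relation graphs (lists of triples) for the input
    labels, each conforming to the facts stored in `kg`."""
    label_set = set(labels)
    # All triples whose head and tail are both among the input labels,
    # sorted by occurrence frequency from high to low.
    candidates = sorted(
        ((t, c) for t, c in kg.items()
         if t[0] in label_set and t[2] in label_set),
        key=lambda tc: tc[1], reverse=True)
    triples = [t for t, _ in candidates]

    # Most probable layout: the top-ranked triples.
    layouts = [triples[:num_triples]]
    # Remaining layouts: random combinations of fact-conforming triples.
    rng = random.Random(seed)
    for _ in range(1000 * m):  # retry cap avoids spinning on tiny graphs
        if len(layouts) >= m or len(triples) < num_triples:
            break
        layout = rng.sample(triples, num_triples)
        if layout not in layouts:
            layouts.append(layout)
    return layouts
```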
Step S3: as shown in FIG. 3, each layout relation graph is input into the knowledge module pre-trained on the knowledge graph to obtain the corresponding object knowledge matrix and global knowledge vector;
Further, step S3 specifically comprises:
Step S31: pre-training the knowledge graph with the classical knowledge representation method KG2E to obtain d-dimensional Gaussian distributions (μ_i, σ_i), i = 1, …, N, corresponding to all N entities in the graph, and d-dimensional Gaussian distributions (μ_j, σ_j), j = 1, …, K, corresponding to all K relations in the graph; these are the KG2E knowledge representations of the objects and relations.
Step S32: and performing data processing on the layout relationship diagram, and decomposing the layout relationship diagram into an in-diagram object label and an in-diagram relationship label.
Step S33: inputting object labels and relation labels in the graph, sampling from KG2E knowledge representation of pre-trained objects and relations, and obtaining an object knowledge matrix Ok∈Rn×dAnd relation knowledge matrix Pk∈Rk×d. Wherein n is the number of object labels in the layout relational graph, k is the number of relational labels in the layout relational graph, and d is the embedding dimension represented by the map knowledge.
Step S34: knowledge matrix O of objectk∈Rn×dAnd relation knowledge matrix Pk∈Rk×dThe global knowledge matrix S is obtained by adding the column directionsk∈R1×dGenerating a global knowledge vector G through the full connection layerk∈Rd
Step S4: as shown in fig. 4, the object knowledge matrix and the global knowledge vector are added to the generator, and a layout relationship diagram is input to generate a scene image.
Further, the step S4 specifically includes:
Step S41: initializing and embedding the n in-graph object labels and k in-graph relation labels obtained by decomposition to obtain the object initial matrix O_o ∈ R^{n×d} and the relation initial matrix P_o ∈ R^{k×d}, where d is the embedding dimension, consistent with the embedding dimension of the knowledge module.
Step S42: initial matrix O of object and relationo∈Rn×dAnd Po∈Rk×dInput into the graph convolution networkTo obtain an object update matrix On∈Rn×dAnd relation initial matrix Pn∈Rk×dThe graph convolution network is formed by stacking 5 layers of same graph convolution blocks, and each block is formed by combining a full connection layer, a Relu layer, a full connection layer and a Relu layer in sequence.
Step S43: the object knowledge matrix O output in the knowledge module of step S3k∈Rn×dUpdate matrix O with objectn∈Rn×dConnected together according to the direction of the row to obtain an object prediction matrix Op∈Rn×2dAnd the knowledge information of each object is integrated into the generator.
Step S44: predicting the object with a matrix Op∈Rn×2dGenerating a value B epsilon of the position of an object frame through a multilayer perceptron 1n×4Generating an object shape mask M E R through a multilayer perceptron 2n×s×s×dMapping and combining the two, and setting a scene layout tensor L epsilon RH×W×dThe multilayer perceptron 1 is composed of a full connection layer, a Relu layer and a full connection layer, and the multilayer perceptron 2 is composed of an upper sampling layer, a BN layer, a volume data layer and a Relu layer which are stacked for 4 times in sequence. Object frame position B is corresponding to Rn×4In the drawing, n represents the number of objects in the drawing, and 4 represents the position values of the objects at the lower left corner, the lower right corner, the upper left corner and the lower right corner of the bounding box. M ∈ R in object shape maskn×s×s×dS represents the size of the object mask, d is the embedding dimension of the object matrix input, and the scene layout tensor L is the RH×W×dIn the above description, H represents the height of the scene image to be generated, and W represents the width of the scene image to be generated.
Step S45: global knowledge vector G output in the knowledge module of step S3k∈RdAutomatically extending dimensionality with the same size as the picture to obtain G'k∈RH×W×dAnd is associated with the scene layout tensor L ∈ RH×W×dConnected together, input into a cascade generation network to generate a scene image I e RH×W×3. The cascade generation network consists of 5 cascade generation modules, and the structure of each cascade generation module comprises an average pooling layer, an upsampling layer, a convolutional layer, a BN layer, a Relu layer, a convolutional layer, a BN layer and RelThe u layers are 8 layers in total.
Step S5: as shown in fig. 5, an object knowledge matrix and a global knowledge vector are added to the discriminator.
Further, the step S5 specifically includes:
Step S51: when identifying different objects in the image, the scene image I ∈ R^{H×W×3} is processed to obtain object image slices C' ∈ R^{n×L×L×3}, where L is the size of an image slice. The slices are connected with the object knowledge matrix O_k ∈ R^{n×d} and grouped along the first dimension, each group being the pair of an object image slice and its corresponding knowledge vector; the groups are input into convolutional neural network 1 to obtain the real/fake score and the object class prediction of each object image slice. Convolutional neural network 1 is structured as a convolutional layer, a BN layer, a ReLU layer, a convolutional layer, an average pooling layer, and a fully connected layer.
Step S52: when the whole image is identified, the scene image I belongs to RH×W×3And global knowledge vector Gk∈RdAnd simultaneously inputting the image into a convolutional neural network 2 to obtain the true and false scores of the image, wherein the convolutional neural network 2 has the structure of a convolutional layer, a BN layer, a Relu layer, a convolutional layer, a BN layer, a Relu layer and a convolutional layer.
Step S6: training the generator and discriminator alternately, minimizing the overall loss function:
L=λ1Lbox2Lpixel3LGAN4Limg-per5Lobj-per
wherein L isboxFor the L1 loss between the predicted object bounding box position and the real object bounding box, LpixelFor the L1 loss between the generated push image and the real image, LGANFor the generation of the generator and discriminator, Limg-perTo generate a loss of perception at the feature level of the image and the real image, Lobj-perFor generating a loss of perception at the feature level, λ, of an object slice of an image with an object slice of a real image1,λ2,λ3,λ4,λ5And the hyper-parameters are manually set in the training process. And the generator part after training is used for generating a scene image from the layout relation diagram.
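Assembling the overall loss is then a weighted sum; a trivial sketch, with placeholder λ defaults rather than the patent's values:

```python
def total_loss(l_box, l_pixel, l_gan, l_img_per, l_obj_per,
               lambdas=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the five loss terms above. The lambda values are
    illustrative; the patent sets them manually per training run."""
    terms = (l_box, l_pixel, l_gan, l_img_per, l_obj_per)
    return sum(lam * t for lam, t in zip(lambdas, terms))
```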
The above model, which extends sg2im as the baseline for the generator and discriminator in steps S4 and S5, is only a preferred embodiment of the present invention; all equivalent changes and modifications made within the scope claimed by the present invention should be covered by the present invention.

Claims (7)

1. A method for generating a plurality of scene images guided by a knowledge graph is characterized by comprising the following steps:
step S1: extracting required triples in the form of (head entity, relation and tail entity) and integrating the triples into a knowledge graph;
step S2: inputting a group of object tags into a layout searching module to obtain a plurality of layout relational graphs which accord with facts;
step S3: inputting each layout relation diagram into a pre-trained knowledge module to obtain an object knowledge matrix and a global knowledge vector;
step S4: adding an object knowledge matrix and a global knowledge vector into a generator;
step S5: adding an object knowledge matrix and a global knowledge vector into a discriminator;
step S6: training a generator and a discriminator alternately according to an overall loss function, ensuring both the generation quality of the whole image and that the object image slices conform to the categories of their labels; the obtained generator is the tool that completes the generation from the layout relation graph to the scene image.
2. The method for generating a plurality of scene images under knowledge-graph guidance according to claim 1, wherein the step S2 is specifically as follows:
step S21: inputting a group of object tags into the knowledge graph constructed in the step S1 for graph search, searching all triples containing the relationship between the input tags, and sequencing the searched triples according to the occurrence frequency;
step S22: according to the parameter settings, selecting the required number of triples to form the most probable layout relation graph, while generating multiple other different layout relation graphs by random combination.
3. The method for generating a plurality of scene images under knowledge-graph guidance according to claim 1 or 2, wherein the step S3 is specifically as follows:
step S31: pre-training the knowledge graph constructed in step S1 with the knowledge representation method KG2E, representing the knowledge representations corresponding to all objects and relations in the graph by different Gaussian distributions;
step S32: performing data processing on the layout relation diagram obtained in the step S2, and decomposing the layout relation diagram into an in-diagram object label and an in-diagram relation label;
step S33: inputting in-graph object labels and in-graph relation labels, and sampling from KG2E knowledge representation of pre-trained objects and relations to obtain an object knowledge matrix and a relation knowledge matrix;
step S34: the sum of the object knowledge matrix and the relation knowledge matrix is called the global knowledge matrix, which generates a global knowledge vector through a fully connected layer; the global knowledge vector represents the knowledge information that the whole layout relation graph extracts from the knowledge graph.
4. The method for generating a plurality of scene images under knowledge-graph guidance according to claim 3, wherein the step S4 is specifically as follows:
step S41: initializing and embedding the label of the object in the graph and the label of the relation in the graph obtained by decomposition to obtain an initial matrix of the object and an initial matrix of the relation;
step S42: inputting the initial matrix of the object and the relation into a graph convolution network with 5 layers of depth to respectively obtain an object updating matrix and a relation updating matrix;
step S43: connecting the object knowledge matrix output from the knowledge module with the object update matrix to obtain an object prediction matrix;
step S44: the object prediction matrix generates the object bounding box positions through multilayer perceptron 1 and the object shape masks through multilayer perceptron 2, and the object bounding box positions and object shape masks are mapped and combined to generate the scene layout tensor;
step S45: automatically expanding the global knowledge vector output by the knowledge module to the same size as the picture, connecting it with the scene layout tensor, and inputting the result into a cascade generation network to generate a scene image.
5. The method for generating a plurality of scene images under knowledge-graph guidance according to claim 4, wherein the step S5 is specifically as follows:
step S51: when identifying different objects in the image, the object image slices obtained by data processing of the scene image and the object knowledge matrix output by the knowledge module are input together into convolutional neural network 1 to obtain the real/fake score and the object class prediction of each object image slice;
step S52: when identifying the whole image, the scene image and the global knowledge vector output by the knowledge module are input together into convolutional neural network 2 to obtain the real/fake score of the image.
6. The method for generating a plurality of scene images guided by a knowledge graph according to claim 4, wherein multilayer perceptron 1 consists of a fully connected layer, a ReLU layer, and a fully connected layer, and multilayer perceptron 2 consists of an upsampling layer, a BN layer, a convolutional layer, and a ReLU layer stacked 4 times in sequence.
7. The method for generating a plurality of scene images guided by a knowledge graph according to claim 5, wherein convolutional neural network 1 is structured as a convolutional layer, a BN layer, a ReLU layer, a convolutional layer, an average pooling layer, and a fully connected layer; convolutional neural network 2 is structured as a convolutional layer, a BN layer, a ReLU layer, a convolutional layer, a BN layer, a ReLU layer, and a convolutional layer.
CN202011434422.1A 2020-12-10 2020-12-10 Knowledge graph guided multi-scene image generation method Pending CN112612900A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011434422.1A CN112612900A (en) 2020-12-10 2020-12-10 Knowledge graph guided multi-scene image generation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011434422.1A CN112612900A (en) 2020-12-10 2020-12-10 Knowledge graph guided multi-scene image generation method

Publications (1)

Publication Number Publication Date
CN112612900A (en) 2021-04-06

Family

ID=75232566

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011434422.1A Pending CN112612900A (en) 2020-12-10 2020-12-10 Knowledge graph guided multi-scene image generation method

Country Status (1)

Country Link
CN (1) CN112612900A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113407645A (en) * 2021-05-19 2021-09-17 福建福清核电有限公司 Intelligent sound image archive compiling and researching method based on knowledge graph
CN114299194A (en) * 2021-12-23 2022-04-08 北京百度网讯科技有限公司 Training method of image generation model, image generation method and device


Similar Documents

Publication Publication Date Title
US9558268B2 (en) Method for semantically labeling an image of a scene using recursive context propagation
CN109344285B (en) Monitoring-oriented video map construction and mining method and equipment
CN109783666B (en) Image scene graph generation method based on iterative refinement
Alush et al. Ensemble segmentation using efficient integer linear programming
CN108875076B (en) Rapid trademark image retrieval method based on Attention mechanism and convolutional neural network
CN110188228A (en) Cross-module state search method based on Sketch Searching threedimensional model
CN112200266B (en) Network training method and device based on graph structure data and node classification method
CN109993102A (en) Similar face retrieval method, apparatus and storage medium
CN114419642A (en) Method, device and system for extracting key value pair information in document image
CN114419304A (en) Multi-modal document information extraction method based on graph neural network
Guo et al. Using multi-scale and hierarchical deep convolutional features for 3D semantic classification of TLS point clouds
CN112612900A (en) Knowledge graph guided multi-scene image generation method
CN111325237A (en) Image identification method based on attention interaction mechanism
CN113159067A (en) Fine-grained image identification method and device based on multi-grained local feature soft association aggregation
CN115098675A (en) Emotion triple generation method based on multi-class table filling
CN112861970A (en) Fine-grained image classification method based on feature fusion
CN113392244A (en) Three-dimensional model retrieval method and system based on depth measurement learning
CN113240033B (en) Visual relation detection method and device based on scene graph high-order semantic structure
CN110111365B (en) Training method and device based on deep learning and target tracking method and device
CN114387608B (en) Table structure identification method combining convolution and graph neural network
Varlik et al. Filtering airborne LIDAR data by using fully convolutional networks
Ma et al. Compound exemplar based object detection by incremental random forest
CN114998647A (en) Breast cancer full-size pathological image classification method based on attention multi-instance learning
Bakhtiarnia et al. PromptMix: Text-to-image diffusion models enhance the performance of lightweight networks
CN114170460A (en) Multi-mode fusion-based artwork classification method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination