WO2023137916A1 - Graph neural network-based image scene classification method and apparatus - Google Patents

Graph neural network-based image scene classification method and apparatus

Info

Publication number
WO2023137916A1
Authority
WO
WIPO (PCT)
Prior art keywords
superpixel
target
image
sample
unit
Prior art date
Application number
PCT/CN2022/090725
Other languages
French (fr)
Chinese (zh)
Inventor
Wang Jun (王俊)
Original Assignee
Ping An Technology (Shenzhen) Co., Ltd. (平安科技(深圳)有限公司)
Priority date
Filing date
Publication date
Application filed by Ping An Technology (Shenzhen) Co., Ltd.
Publication of WO2023137916A1 publication Critical patent/WO2023137916A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular to a graph neural network-based image scene classification method, device, electronic equipment, and storage medium.
  • Image scene classification means that, for a given image, the scene it belongs to (such as nature, street, or indoor) is judged by identifying the information and content the image contains, so as to achieve the purpose of scene classification.
  • Convolutional Neural Networks are widely used in computer vision tasks such as image scene classification.
  • the embodiment of the present application proposes an image scene classification method based on a graph neural network, the method comprising:
  • performing superpixel segmentation on the target image to be classified to obtain a target superpixel segmented image;
  • obtaining a plurality of target superpixel units under the target superpixel segmented image, using each of the target superpixel units as a node, and obtaining the node features of each target superpixel unit and the edge features between adjacent target superpixel units;
  • for each target superpixel unit, determining the state vector of the target superpixel unit according to the node features of the target superpixel unit;
  • for each target superpixel unit, updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit;
  • inputting the updated state vectors of all target superpixel units into a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image;
  • determining an image scene classification result corresponding to the target image according to the target scene label.
  • the embodiment of the present application proposes an image scene classification device based on a graph neural network, including:
  • the image segmentation module is used to perform superpixel segmentation on the target image to be classified to obtain the target superpixel segmented image;
  • the feature extraction module is used to obtain a plurality of target superpixel units under the target superpixel segmentation image, and each of the target superpixel units is used as a node to obtain node features of each target superpixel unit and edge features between adjacent target superpixel units;
  • a state determination module configured to, for each target superpixel unit, determine the state vector of the target superpixel unit according to the node characteristics of the target superpixel unit;
  • a state update module configured to, for each target superpixel unit, update the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit;
  • a label output module configured to input the updated state vectors of all target superpixel units to the pre-trained image scene classification model, so that the image scene classification model outputs the target scene label based on the target superpixel segmented image;
  • a scene classification module configured to determine an image scene classification result corresponding to the target image according to the target scene label.
  • an embodiment of the present application proposes an electronic device, the electronic device includes a memory, a processor, a program stored in the memory and operable on the processor, and a data bus for realizing connection and communication between the processor and the memory, when the program is executed by the processor, an image scene classification method based on a graph neural network is implemented, wherein the image scene classification method based on a graph neural network includes:
  • performing superpixel segmentation on the target image to be classified to obtain a target superpixel segmented image;
  • obtaining a plurality of target superpixel units under the target superpixel segmented image, using each of the target superpixel units as a node, and obtaining the node features of each target superpixel unit and the edge features between adjacent target superpixel units;
  • for each target superpixel unit, determining the state vector of the target superpixel unit according to the node features of the target superpixel unit;
  • for each target superpixel unit, updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit;
  • inputting the updated state vectors of all target superpixel units into a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image;
  • determining an image scene classification result corresponding to the target image according to the target scene label.
  • an embodiment of the present application proposes a storage medium, the storage medium is a computer-readable storage medium for computer-readable storage, the storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement a graph neural network-based image scene classification method, wherein the graph neural network-based image scene classification method includes:
  • performing superpixel segmentation on the target image to be classified to obtain a target superpixel segmented image;
  • obtaining a plurality of target superpixel units under the target superpixel segmented image, using each of the target superpixel units as a node, and obtaining the node features of each target superpixel unit and the edge features between adjacent target superpixel units;
  • for each target superpixel unit, determining the state vector of the target superpixel unit according to the node features of the target superpixel unit;
  • for each target superpixel unit, updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit;
  • inputting the updated state vectors of all target superpixel units into a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image;
  • determining an image scene classification result corresponding to the target image according to the target scene label.
  • the scheme of this application is based on graph neural network modeling, and constructs graph data based on superpixel units obtained by superpixel segmentation of target images.
  • the correlations between adjacent superpixel units and their edge features are considered in the modeling process, so that the message passing property of the graph neural network can be used to achieve effective image scene classification.
  • learning correlations between local features through the graph neural network, rather than being limited to correlations between single pixel pairs, better realizes feature transfer and reuse, effectively obtains global context information, improves the accuracy of deep models in image understanding tasks, and avoids the limitations of high-cost spatial information acquisition.
  • Fig. 1 is a schematic flow diagram of an image scene classification method based on a graph neural network provided by an embodiment of the present application
  • Fig. 2a is a schematic diagram of the target image to be classified in an embodiment of the present application.
  • Fig. 2b is a schematic diagram of a target superpixel segmented image in an embodiment of the present application.
  • Fig. 3 is a schematic diagram of the message passing process of the graph neural network of the embodiment of the present application.
  • Fig. 4 is a schematic flow chart of step S140 in Fig. 1;
  • FIG. 5 is a schematic diagram of a training process of an image scene classification model provided by an embodiment of the present application.
  • FIG. 6 is a schematic flow diagram of another image scene classification method based on a graph neural network provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of target superpixel segmentation images of different levels generated based on different preset segmentation thresholds
  • FIG. 8 is a schematic diagram of another image scene classification model training process provided by the embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of an image scene classification device based on a graph neural network provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.
  • Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. As a branch of computer science, artificial intelligence attempts to understand the essence of intelligence and to produce new intelligent machines that can respond in a way similar to human intelligence. Research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems. Artificial intelligence can simulate the information processes of human consciousness and thinking. It is also a theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • GNN is a neural network that operates directly on graph structures.
  • a graph structure usually includes multiple nodes.
  • a node can represent an object or concept, and an edge can represent the relationship between nodes.
  • GNN uses a state vector to represent the state of the node.
  • GNN is based on a message propagation mechanism: each node updates its state by exchanging messages with the other nodes until the states reach stable values. The output of the GNN is then computed at each node from its current state. The main process of GNN learning can thus be described as iteratively aggregating and updating the neighborhood information of nodes in the graph data.
  • at each layer, a node updates its own information by aggregating the features of adjacent nodes together with its own features from the previous layer, usually applying a nonlinear transformation to the aggregated information.
  • after k such layers, each node has obtained the information of all adjacent nodes within k hops, as illustrated by the sketch below.
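  • The following toy sketch (plain Python, with illustrative names not taken from the source) shows this k-hop effect: after k aggregation rounds, each node's value depends on every node within k hops.

```python
def aggregate_rounds(values, adjacency, k):
    """Toy k-round neighborhood aggregation on a graph.

    `values[v]` is a number attached to node v; `adjacency[v]` lists
    the neighbors of v. After k rounds, each node's value depends on
    every node within k hops, illustrating the k-layer receptive field.
    """
    for _ in range(k):
        values = {v: values[v] + sum(values[u] for u in adjacency[v])
                  for v in values}
    return values
```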
  • Region growing algorithm: digital image segmentation algorithms are generally based on one of two fundamental properties of gray values: discontinuity and similarity.
  • the former property is applied by segmenting the image at discontinuous changes in gray level, such as image edges.
  • the latter property is applied by segmenting the image into similar regions according to predefined similarity criteria.
  • the region growing algorithm is based on the second property, that is, the similarity of the image's gray values.
  • the basic idea of the region growing algorithm is to merge pixels with similar properties. For each region, a seed point is designated as the starting point of growth; the pixels in the area around the seed point are then compared with the seed point, and points with similar properties are merged into the region, which continues growing outward until no pixel meeting the condition remains. The growth of the region is then complete.
  • Image scene classification means that, for a given image, the scene it belongs to (such as nature, street, or indoor) is judged by identifying the information and content the image contains, so as to achieve the purpose of scene classification.
  • Convolutional neural network (CNN) is widely used in computer vision tasks such as image scene classification.
  • directly using a convolutional neural network model for classification can achieve scene category classification with a certain accuracy.
  • however, the way a conventional convolutional neural network extracts and models image scene information does not conform to the actual way the human brain performs cognition, which brings problems such as poor model interpretability and limited accuracy.
  • existing methods for acquiring global context information, such as non-local modules and various attention mechanisms, have too high a parameter cost and are difficult to apply to scenarios with high-resolution input images. Therefore, how to improve the accuracy of image scene classification and reduce the amount of parameters in the classification process has become a technical problem to be solved urgently.
  • the embodiment of the present application provides a graph neural network-based image scene classification method, device, electronic equipment, and storage medium, aiming at improving the accuracy of image scene classification and reducing the amount of parameters in the classification process.
  • Artificial intelligence (AI) is a theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • Artificial intelligence basic technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes computer vision technology, robotics technology, biometrics technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • the image scene classification method based on the graph neural network provided in the embodiment of the present application relates to the technical fields of artificial intelligence and image processing.
  • the image scene classification method provided in the embodiment of the present application may be applied to a terminal, may also be applied to a server, and may also be software running on the terminal or the server.
  • the terminal can be a smart phone, tablet computer, notebook computer, desktop computer, etc.
  • the server can be configured as an independent physical server, or can be configured as a server cluster or distributed system composed of multiple physical servers, and can also be configured as a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms;
  • the application can be used in numerous general purpose or special purpose computer system environments or configurations. Examples: personal computers, server computers, handheld or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics devices, network PCs, minicomputers, mainframe computers, distributed computing environments including any of the above systems or devices, etc.
  • This application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer.
  • program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • the application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote computer storage media including storage devices.
  • FIG. 1 shows a schematic flowchart of a method for classifying image scenes based on a graph neural network provided by an embodiment of the present application.
  • the image scene classification method includes but is not limited to the following steps S110-S160.
  • Step S110 performing superpixel segmentation on the target image to be classified to obtain the target superpixel segmented image.
  • FIG. 2a is a target image to be classified, and by performing superpixel segmentation on the target image, a target superpixel segmented image as shown in FIG. 2b can be obtained.
  • step S110, performing superpixel segmentation on the target image to be classified to obtain the target superpixel segmented image, may be specifically implemented as follows: a region growing algorithm is used to perform region segmentation on the target image to be classified to obtain the target superpixel segmented image.
  • the region growing algorithm is an image segmentation method that segments the image based on the similarity of its gray values.
  • the target image can be regarded as an image composed of N*N pixels based on preset pixel parameters, and a seed point is selected from the N*N pixels based on preset rules; it is then judged whether the gray value of an adjacent pixel and the gray value of the current seed point meet a preset similarity criterion, and if so, the adjacent pixel is added to the region to which the current seed point belongs.
  • the condition for region growing is in fact a set of similarity criteria defined according to the continuity between pixel gray levels, and the stopping condition defines a termination rule: essentially, when no pixel satisfies the condition for joining a certain region, region growing stops.
  • the algorithm defines a variable: the maximum pixel gray value distance reg_maxdist. When the absolute value of the difference between the gray value of the pixel to be added and the average gray value of all pixels in the segmented region is less than or equal to reg_maxdist, the pixel is added to the segmented region; otherwise, region growing stops. A minimal sketch of this rule follows.
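  • The following is a minimal sketch of this stopping rule, assuming a single-channel image held as a 2-D NumPy array; the function name and the default threshold are illustrative, not taken from the source:

```python
import numpy as np
from collections import deque

def region_grow(gray, seed, reg_maxdist=20.0):
    """Grow a single region from `seed` on a 2-D grayscale image.

    A neighboring pixel joins the region when the absolute difference
    between its gray value and the current region mean is at most
    reg_maxdist; growing stops when no 4-connected neighbor qualifies.
    """
    h, w = gray.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    region_sum, region_count = float(gray[seed]), 1
    frontier = deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                mean = region_sum / region_count
                if abs(float(gray[ny, nx]) - mean) <= reg_maxdist:
                    mask[ny, nx] = True
                    region_sum += float(gray[ny, nx])
                    region_count += 1
                    frontier.append((ny, nx))
    return mask
```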
  • Step S120 acquiring a plurality of target superpixel units under the target superpixel segmentation image, using each of the target superpixel units as a node, and acquiring node features of each target superpixel unit and edge features between adjacent target superpixel units.
  • in superpixel segmentation, the image is subdivided into multiple image sub-regions (sets of pixels), and these image sub-regions are the superpixel units.
  • the superpixel unit under the target superpixel segmentation image is used as the target superpixel unit, and then the graph data is constructed based on the target superpixel unit.
  • each of the target superpixel units is regarded as a node, and the node features of each target superpixel unit and the edge features between adjacent target superpixel units are obtained.
  • the node features include at least one of grayscale features, shape features, and texture features.
  • the grayscale feature, shape feature, and texture feature will be exemplified below.
  • the grayscale feature describes the apparent physical properties of the target object through grayscale variation, which directly reflects the color characteristics of the target itself.
  • the grayscale values of the target in the red, green, and blue bands form a set of vectors, from which multiple indicators including mean, brightness, variance, and standard deviation can be calculated.
  • shape features are generally determined by the outer contour of the target in the image, which reflects the geometric form of the target to a certain extent; a target can be distinguished from other objects based on its compact outer contour, and shape features have the advantage of rotation invariance at a recognizable resolution.
  • Commonly used shape features of objects include length, width, area, perimeter, density, roundness, shape index, and rectangularity.
  • the texture information in the image embodies the combination of grayscale features and spatial features, and can reflect the spatial distribution properties of pixel color information. It usually appears as a locally regular pattern at an intermediate scale between the pixel level and the scene level, and constitutes semi-macro-level target knowledge. In actual computation, texture is often described as the spatial regularity and correlation of image gray levels within a specific 3×3, 5×5, 7×7, or larger window.
  • the most classic and widely used statistical calculation method for texture features is the gray level co-occurrence matrix method proposed by Haralick et al.
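  • As an illustration of how such node features might be computed, the following is a sketch assuming scikit-image ≥ 0.19 for the gray level co-occurrence matrix; the exact feature set is not specified by the source, so the selection below is an assumption:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def superpixel_node_features(gray, labels, unit_id):
    """Grayscale and texture features for one superpixel unit.

    `gray` is a uint8 grayscale image; `labels` assigns a unit id to
    every pixel. Returns an illustrative feature vector:
    [mean, std, area, GLCM contrast, GLCM homogeneity].
    """
    mask = labels == unit_id
    values = gray[mask]
    ys, xs = np.where(mask)
    # Bounding-box window around the unit for the co-occurrence matrix.
    window = gray[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    glcm = graycomatrix(window, distances=[1], angles=[0],
                        levels=256, symmetric=True, normed=True)
    return np.array([
        values.mean(),                        # grayscale mean
        values.std(),                         # grayscale variation
        float(mask.sum()),                    # area (shape proxy)
        graycoprops(glcm, "contrast")[0, 0],  # texture contrast
        graycoprops(glcm, "homogeneity")[0, 0],
    ])
```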
  • Step S130 for each target superpixel unit, determine the state vector of the target superpixel unit according to the node characteristics of the target superpixel unit.
  • the initial state of the target superpixel unit may be determined from the node features of the target superpixel unit according to a preset mapping rule.
  • for a target superpixel unit $v$, an initial state vector (denoted here as $h_v^{(0)}$) can be obtained, where $v \in N$ and $N$ is the set of target superpixel units.
  • Step S140 for each target superpixel unit, update the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit.
  • after each node determines its initial state vector, it exchanges messages with adjacent nodes and updates its own node state according to the messages of the adjacent nodes and of the edges connecting it to those nodes.
  • FIG. 3 shows a schematic diagram of a message passing process of a graph neural network.
  • the state vector of node v1 can be updated by considering the state vectors of neighboring nodes v3, v5 and v8 and the edge features connected with neighboring nodes.
  • step S140 according to the state vector of the target superpixel unit, the state vector of the adjacent target superpixel unit, and the edge features between the target superpixel unit and the adjacent target superpixel unit, the state vector of the target superpixel unit is updated to obtain the updated state vector of the target superpixel unit, which can be specifically implemented through the following method steps S141-S142:
  • Step S141 according to the state vector of the target superpixel unit, the state vector of the adjacent target superpixel unit, and the edge features between the target superpixel unit and the adjacent target superpixel unit, determine the relationship feature vector of the target superpixel unit.
  • calculation formula of the relationship feature vector can refer to the following formula (1):
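  • The body of formula (1) was not preserved in this text; a plausible standard message-passing form, stated as an assumption rather than as the patent's exact formula, is:

```latex
% Hypothetical reconstruction of formula (1): the relationship feature
% vector r_v^{(t)} aggregates the neighbor states and edge features.
r_v^{(t)} = \sum_{u \in \mathcal{N}(v)} f\!\left(h_v^{(t)},\, h_u^{(t)},\, e_{vu}\right) \tag{1}
```

  • where $\mathcal{N}(v)$ is the set of target superpixel units adjacent to $v$, $e_{vu}$ is the edge feature between $v$ and $u$, and $f$ is a learnable function.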
  • Step S142 updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit and the relationship feature vector, to obtain an updated state vector of the target superpixel unit.
  • the updated state vector of the target superpixel unit can be expressed by the following formula (2):
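  • The body of formula (2) was likewise not preserved; a plausible form consistent with the description, stated as an assumption, is:

```latex
% Hypothetical reconstruction of formula (2): the updated state combines
% the previous state with the relationship feature vector through a
% learnable function g (for example a GRU cell or an MLP).
h_v^{(t+1)} = g\!\left(h_v^{(t)},\, r_v^{(t)}\right) \tag{2}
```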
  • for example, if the initial state vector of a certain target superpixel unit indicates that its initial state is "plant", then after collecting the state vectors and edge feature vectors of adjacent target superpixel units and updating its own state vector, the updated state vector may indicate that it is "tree".
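  • Putting formulas (1) and (2) together, the following is a minimal NumPy sketch of one update round; the sum aggregation, the weight matrices W_msg and W_self, and the tanh nonlinearity are illustrative assumptions, since the patent text does not disclose the exact parameterization:

```python
import numpy as np

def update_states(states, neighbors, W_self, W_msg):
    """One synchronous message-passing round over the superpixel graph.

    states[v]    : current state vector of target superpixel unit v
    neighbors[v] : list of (u, e_vu) pairs, e_vu being the edge feature
                   between v and its adjacent unit u
    W_self, W_msg: illustrative weight matrices (not from the source)
    """
    new_states = {}
    for v, h_v in states.items():
        # Relationship feature vector: aggregate neighbor states together
        # with the connecting edge features (cf. formula (1)).
        r_v = np.zeros_like(h_v)
        for u, e_vu in neighbors[v]:
            r_v = r_v + W_msg @ np.concatenate([states[u], e_vu])
        # Updated state from the old state and the relationship feature
        # vector (cf. formula (2)).
        new_states[v] = np.tanh(W_self @ h_v + r_v)
    return new_states
```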
  • in step S150, the updated state vectors of all target superpixel units are input to the pre-trained image scene classification model, so that the image scene classification model outputs the corresponding target scene label according to the state vectors of all target superpixel units.
  • in step S160, the image scene classification result can be determined according to the target scene label.
  • before step S150 is performed, the image scene classification model provided by the embodiment of the present application needs to be trained to obtain the pre-trained image scene classification model.
  • FIG. 5 shows a schematic diagram of a training process of an image scene classification model.
  • the training process of the image scene classification model provided by the embodiment of the present application may include the following steps S200-S250.
  • Step S200 acquiring a sample image and a sample scene label corresponding to the sample image.
  • a sample image set is constructed first, and a sample scene label is attached to each sample image in the sample image set.
  • Step S210 performing superpixel segmentation on the sample image to obtain a sample superpixel segmented image.
  • Step S220 obtaining a plurality of sample superpixel units under the sample superpixel segmented image, using each sample superpixel unit as a node, and obtaining node features of each sample superpixel unit and edge features between adjacent sample superpixel units.
  • Step S230 for each sample superpixel unit, determine the state vector of the sample superpixel unit according to the node characteristics of the sample superpixel unit.
  • Step S240 for each sample superpixel unit, update the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and adjacent sample superpixel units, to obtain the updated state vector of the sample superpixel unit.
  • Step S250 using the updated state vectors of all sample superpixel units as input and the sample scene label as expected output, to train the image scene classification model.
  • by performing multiple rounds of iterative training on the image scene classification model until the image scene classification model satisfies the training end condition, a trained image scene classification model can be obtained.
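  • Steps S200-S250 amount to a standard supervised training loop. The following is a hedged PyTorch-style sketch; `build_graph` is a hypothetical helper standing in for steps S210-S240, and the loss and optimizer choices are illustrative:

```python
import torch

def train_classifier(model, samples, epochs=10, lr=1e-3):
    """Supervised training loop corresponding to steps S200-S250.

    `samples` yields (image, label) pairs, where `label` is a LongTensor
    of shape (1,). `build_graph` is a hypothetical helper covering steps
    S210-S240 (segmentation, feature extraction, message passing); it
    returns the stacked updated state vectors for one sample image.
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for image, label in samples:
            states = build_graph(image)    # steps S210-S240
            logits = model(states)         # shape (1, num_classes)
            loss = loss_fn(logits, label)  # step S250: supervise
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```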
  • FIG. 6 shows a schematic flowchart of another image scene classification method based on a graph neural network provided by an embodiment of the present application.
  • the image scene classification method includes but is not limited to the following steps S310-S380.
  • Step S310 performing superpixel segmentation on the target image to be classified based on different preset segmentation thresholds to obtain a plurality of target superpixel segmented images, each target superpixel segmented image including a different number of target superpixel units.
  • the embodiments of the present application control the threshold of superpixel segmentation to generate different target superpixel segmentation images based on different preset segmentation thresholds, and each target superpixel segmentation image includes different numbers of target superpixel units.
  • three different levels of target superpixel segmentation images can be generated based on different preset segmentation thresholds, so as to decompose the image into a multi-level network structure for target information expression.
  • for example, the image is divided into three levels of superpixel unit sets, containing 8, 4, and 2 superpixel units respectively.
  • different segmentation thresholds can be determined by adjusting the size of preset pixel parameters in the process of performing superpixel segmentation on the target image based on the region growing algorithm.
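  • A compact sketch of step S310, where `segment` is a hypothetical region-growing routine and the threshold values are illustrative:

```python
def multilevel_segmentations(image, thresholds=(10.0, 25.0, 60.0)):
    """One superpixel segmentation per preset threshold (step S310).

    `segment` is a hypothetical region-growing routine parameterized by
    the maximum gray-value distance; looser thresholds merge more pixels,
    so each successive level contains fewer, coarser superpixel units.
    """
    return [segment(image, reg_maxdist=t) for t in thresholds]
```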
  • Step S320 traversing each target superpixel segmented image, and acquiring a plurality of target superpixel units under the currently traversed target superpixel segmented image.
  • each target superpixel segmented image is traversed, and the following steps S330-S360 are respectively performed for each target superpixel segmented image.
  • Step S330 for multiple target superpixel units under the currently traversed target superpixel segmentation image, use each target superpixel unit as a node, and acquire node features of each target superpixel unit and edge features between adjacent target superpixel units.
  • Step S340 for each target superpixel unit, determine the state vector of the target superpixel unit according to the node characteristics of the target superpixel unit.
  • Step S350 for each target superpixel unit, update the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and adjacent target superpixel units, to obtain an updated state vector of the target superpixel unit.
  • step S360 the updated state vectors of all target superpixel units are input to the pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image.
  • Step S370 acquiring target scene labels output by the image scene classification model based on each target superpixel segmented image.
  • for example, target scene labels of different levels are obtained based on the target superpixel segmented images of different levels:
  • the target scene label based on level 1 is "nature";
  • the target scene label based on level 2 is "forest";
  • the target scene label based on level 3 is "shrub forest".
  • Step S380 Determine an image scene classification result corresponding to the target image according to the target scene label output based on each superpixel segmented image.
  • Step S380 may specifically include: concatenating target scene labels output from each superpixel segmented image to obtain an image scene classification result corresponding to the target image.
  • in this way, the target scene labels corresponding to the three levels, "nature", "forest", and "shrub forest", can be obtained, and the target scene labels of the three levels are finally concatenated to output the final scene classification result "nature-forest-shrub forest".
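  • The concatenation in step S380 reduces to a simple join of the per-level labels, for example:

```python
# Per-level target scene labels concatenated into the final result.
levels = ["nature", "forest", "shrub forest"]
result = "-".join(levels)   # "nature-forest-shrub forest"
```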
  • before step S360 is performed, the image scene classification model needs to be trained to obtain the pre-trained image scene classification model.
  • FIG. 8 shows a schematic diagram of a training process of an image scene classification model.
  • the training process of the image scene classification model provided by the embodiment of the present application may include the following steps S400-S420.
  • Step S400 acquiring a sample image and a sample scene label corresponding to the sample image.
  • a sample image set is constructed first, and a sample scene label is attached to each sample image in the sample image set.
  • Step S410 perform superpixel segmentation on the sample image based on different preset segmentation thresholds to obtain multiple sample superpixel segmented images, each sample superpixel segmented image includes a different number of sample superpixel units.
  • the superpixel segmentation is performed on the sample image based on different preset segmentation thresholds, and the specific segmentation method can refer to the implementation process of the previous step S310, which will not be repeated here.
  • Step S420 traversing each sample superpixel segmented image to train the image scene classification model based on each sample superpixel segmented image, the training process includes steps S421-S424:
  • Step S421 taking each of the sample superpixel units under the currently traversed sample superpixel segmented image as a node, and acquiring node features of each sample superpixel unit and edge features between adjacent sample superpixel units.
  • Step S422 for each sample superpixel unit, determine the state vector of the sample superpixel unit according to the node characteristics of the sample superpixel unit.
  • Step S423 For each sample superpixel unit, update the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and adjacent sample superpixel units, to obtain the updated state vector of the sample superpixel unit.
  • Step S424 taking the updated state vectors of all sample superpixel units as input and the sample scene label as expected output to train the image scene classification model.
  • the implementation process of steps S421-S424 is similar to that of the previous steps S220-S250, so reference may be made to the relevant descriptions of steps S220-S250, which will not be repeated here.
  • the embodiment of the present application also provides an image scene classification device based on a graph neural network, the device includes:
  • the image segmentation module 810 is used to perform superpixel segmentation on the target image to be classified to obtain the target superpixel segmentation image
  • the feature extraction module 820 is used to obtain a plurality of target superpixel units under the target superpixel segmentation image, and use each of the target superpixel units as a node to obtain node features of each target superpixel unit and edge features between adjacent target superpixel units;
  • a state determination module 830 configured to, for each target superpixel unit, determine the state vector of the target superpixel unit according to the node characteristics of the target superpixel unit;
  • the state update module 840 is configured to, for each target superpixel unit, update the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit;
  • the label output module 850 is used to input the updated state vectors of all target superpixel units to the image scene classification model trained in advance, so that the image scene classification model outputs the target scene label based on the target superpixel segmented image;
  • a scene classification module 860 configured to determine an image scene classification result corresponding to the target image according to the target scene label.
  • the graph neural network-based image scene classification device of the present application further includes a training module for training the image scene classification model to obtain a trained image scene classification model.
  • the embodiment of the present application also provides an electronic device, the electronic device includes: a memory, a processor, a program stored on the memory and operable on the processor, and a data bus for realizing connection and communication between the processor and the memory, and the above-mentioned image scene classification method is implemented when the program is executed by the processor.
  • the electronic device may be any intelligent terminal including a tablet computer, a vehicle-mounted computer, and the like.
  • FIG. 10 illustrates a hardware structure of an electronic device in another embodiment.
  • the electronic device includes:
  • the processor 901 can be implemented by a general-purpose CPU (Central Processing Unit, central processing unit), microprocessor, application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is used to execute related programs to realize the technical solutions provided by the embodiments of the present application;
  • the memory 902 may be implemented in the form of a read-only memory (Read Only Memory, ROM), a static storage device, a dynamic storage device, or a random access memory (Random Access Memory, RAM).
  • the memory 902 can store an operating system and other application programs.
  • the relevant program codes are stored in the memory 902, and the processor 901 invokes and executes a graph neural network-based image scene classification method according to the embodiment of the present application.
  • the method includes: performing superpixel segmentation on the target image to be classified to obtain a target superpixel segmented image; obtaining a plurality of target superpixel units under the target superpixel segmented image, using each target superpixel unit as a node, and obtaining the node features of each target superpixel unit and the edge features between adjacent target superpixel units; for each target superpixel unit, determining the state vector of the target superpixel unit according to the node features of the target superpixel unit;
  • for each target superpixel unit, updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit; inputting the updated state vectors of all target superpixel units into the pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image; and determining the image scene classification result corresponding to the target image according to the target scene label;
  • the input/output interface 903 is used to realize information input and output
  • the communication interface 904 is used to realize communication interaction between this device and other devices, which may be realized in a wired manner (such as USB or a network cable) or wirelessly (such as a mobile network, Wi-Fi, or Bluetooth); and
  • bus 905 for transferring information between various components of the device (such as processor 901, memory 902, input/output interface 903 and communication interface 904);
  • the processor 901, the memory 902, the input/output interface 903, and the communication interface 904 are connected to each other within the device through the bus 905.
  • in some embodiments, before the updated state vectors of all target superpixel units are input into the pre-trained image scene classification model to obtain the target scene label through the image scene classification model, the method further includes: obtaining a sample image and a sample scene label corresponding to the sample image; performing superpixel segmentation on the sample image to obtain a sample superpixel segmented image; obtaining a plurality of sample superpixel units under the sample superpixel segmented image, using each of the sample superpixel units as a node, and obtaining the node features of each sample superpixel unit and the edge features between adjacent sample superpixel units; for each sample superpixel unit, determining the state vector of the sample superpixel unit according to the node features of the sample superpixel unit; for each sample superpixel unit, updating the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and adjacent sample superpixel units, to obtain the updated state vector of the sample superpixel unit; and using the updated state vectors of all sample superpixel units as input and the sample scene label as expected output to train the image scene classification model.
  • in some embodiments, before the updated state vectors of all target superpixel units are input into the pre-trained image scene classification model to obtain the target scene label through the image scene classification model, the method further includes: obtaining a sample image and a sample scene label corresponding to the sample image; performing superpixel segmentation on the sample image based on different preset segmentation thresholds to obtain a plurality of sample superpixel segmented images, each sample superpixel segmented image including a different number of sample superpixel units; and traversing each sample superpixel segmented image to train the image scene classification model based on each sample superpixel segmented image, where the training process includes: taking each of the sample superpixel units under the currently traversed sample superpixel segmented image as a node, and obtaining the node features of each sample superpixel unit and the edge features between adjacent sample superpixel units; for each sample superpixel unit, determining the state vector of the sample superpixel unit according to the node features of the sample superpixel unit; for each sample superpixel unit, updating the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and adjacent sample superpixel units, to obtain the updated state vector of the sample superpixel unit; and using the updated state vectors of all sample superpixel units as input and the sample scene label as expected output to train the image scene classification model.
  • the superpixel segmentation of the target image to be classified to obtain the target superpixel segmentation image includes:
  • the target image to be classified is subjected to superpixel segmentation based on different preset segmentation thresholds to obtain multiple target superpixel segmented images, and each target superpixel segmented image includes different numbers of target superpixel units.
  • the acquiring a plurality of target superpixel units under the target superpixel segmentation image includes: traversing each target superpixel segmentation image, and acquiring a plurality of target superpixel units under the currently traversed target superpixel segmentation image; determining the image scene classification result corresponding to the target image according to the target scene label includes: acquiring the target scene label output by the image scene classification model based on each target superpixel segmentation image; and determining the image scene classification result corresponding to the target image based on the target scene label output based on each superpixel segmentation image.
  • the determining the image scene classification result corresponding to the target image according to the target scene label output based on each superpixel segmented image includes: concatenating the target scene labels output from the superpixel segmented images to obtain the image scene classification result corresponding to the target image.
  • updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit, includes: determining the relationship feature vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and adjacent target superpixel units; and updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit and the relationship feature vector, to obtain the updated state vector of the target superpixel unit.
  • the embodiment of the present application also provides a storage medium, which is a computer-readable storage medium for computer-readable storage.
  • the storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement a graph neural network-based image scene classification method, wherein the method includes: performing superpixel segmentation on the target image to be classified to obtain a target superpixel segmented image; obtaining a plurality of target superpixel units under the target superpixel segmented image, using each target superpixel unit as a node, and obtaining the node features of each target superpixel unit and the edge features between adjacent target superpixel units; for each target superpixel unit, determining the state vector of the target superpixel unit according to the node features of the target superpixel unit; for each target superpixel unit, updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit; inputting the updated state vectors of all target superpixel units into a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image; and determining an image scene classification result corresponding to the target image according to the target scene label.
  • in some embodiments, before the updated state vectors of all target superpixel units are input into the pre-trained image scene classification model to obtain the target scene label through the image scene classification model, the method further includes: obtaining a sample image and a sample scene label corresponding to the sample image; performing superpixel segmentation on the sample image to obtain a sample superpixel segmented image; obtaining a plurality of sample superpixel units under the sample superpixel segmented image, using each of the sample superpixel units as a node, and obtaining the node features of each sample superpixel unit and the edge features between adjacent sample superpixel units; for each sample superpixel unit, determining the state vector of the sample superpixel unit according to the node features of the sample superpixel unit; for each sample superpixel unit, updating the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and adjacent sample superpixel units, to obtain the updated state vector of the sample superpixel unit; and using the updated state vectors of all sample superpixel units as input and the sample scene label as expected output to train the image scene classification model.
  • in some embodiments, before the updated state vectors of all target superpixel units are input into the pre-trained image scene classification model to obtain the target scene label through the image scene classification model, the method further includes: obtaining a sample image and a sample scene label corresponding to the sample image; performing superpixel segmentation on the sample image based on different preset segmentation thresholds to obtain a plurality of sample superpixel segmented images, each sample superpixel segmented image including a different number of sample superpixel units; and traversing each sample superpixel segmented image to train the image scene classification model based on each sample superpixel segmented image, where the training process includes: taking each of the sample superpixel units under the currently traversed sample superpixel segmented image as a node, and obtaining the node features of each sample superpixel unit and the edge features between adjacent sample superpixel units; for each sample superpixel unit, determining the state vector of the sample superpixel unit according to the node features of the sample superpixel unit; for each sample superpixel unit, updating the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and adjacent sample superpixel units, to obtain the updated state vector of the sample superpixel unit; and using the updated state vectors of all sample superpixel units as input and the sample scene label as expected output to train the image scene classification model.
  • the superpixel segmentation of the target image to be classified to obtain the target superpixel segmentation image includes:
  • the target image to be classified is subjected to superpixel segmentation based on different preset segmentation thresholds to obtain multiple target superpixel segmented images, and each target superpixel segmented image includes different numbers of target superpixel units.
  • the acquiring a plurality of target superpixel units under the target superpixel segmentation image includes: traversing each target superpixel segmentation image, and acquiring a plurality of target superpixel units under the currently traversed target superpixel segmentation image; determining the image scene classification result corresponding to the target image according to the target scene label includes: acquiring the target scene label output by the image scene classification model based on each target superpixel segmentation image; and determining the image scene classification result corresponding to the target image based on the target scene label output based on each superpixel segmentation image.
  • the determining the image scene classification result corresponding to the target image according to the target scene label output based on each superpixel segmented image includes: concatenating the target scene labels output from the superpixel segmented images to obtain the image scene classification result corresponding to the target image.
  • updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit, includes: determining the relationship feature vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and adjacent target superpixel units; and updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit and the relationship feature vector, to obtain the updated state vector of the target superpixel unit.
  • the computer-readable storage medium may be non-volatile or volatile.
  • memory can be used to store non-transitory software programs and non-transitory computer-executable programs.
  • the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage devices.
  • the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the processor via a network. Examples of the aforementioned networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to the technical field of artificial intelligence. Embodiments of the present application provide a graph neural network-based image scene classification method and apparatus, an electronic device, and a storage medium. The method comprises: performing superpixel segmentation on a target image to be classified to obtain a target superpixel segmented image; for each target superpixel unit, updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit; and inputting the updated state vectors of all target superpixel units into a pre-trained image scene classification model to obtain a target scene label. According to the present application, global context information can be obtained effectively, which improves the accuracy of the model in image understanding tasks and avoids the limitations of high-cost spatial information acquisition.

Description

基于图神经网络的图像场景分类方法及装置Image scene classification method and device based on graph neural network
本申请要求于2022年1月21日提交中国专利局、申请号为202210073146.3,发明名称为“基于图神经网络的图像场景分类方法及装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims the priority of the Chinese patent application with the application number 202210073146.3 and the invention title "Image Scene Classification Method and Device Based on Graph Neural Network" submitted to the China Patent Office on January 21, 2022, the entire contents of which are incorporated in this application by reference.
技术领域technical field
本申请涉及人工智能技术领域,尤其涉及一种基于图神经网络的图像场景分类方法、装置、电子设备及存储介质。The present application relates to the technical field of artificial intelligence, and in particular to a graph neural network-based image scene classification method, device, electronic equipment, and storage medium.
背景技术Background technique
图像场景分类是指对于已经给定的图像,通过识别它所包含的信息和内容来判断其所属的场景(例如自然、街道、室内等),从而达到场景分类的目的。卷积神经网络(CNN)在计算机视觉任务如图像场景分类中应用十分广泛。Image scene classification means that for a given image, by identifying the information and content it contains to judge the scene it belongs to (such as nature, street, indoor, etc.), so as to achieve the purpose of scene classification. Convolutional Neural Networks (CNNs) are widely used in computer vision tasks such as image scene classification.
技术问题technical problem
以下是发明人意识到的现有技术的技术问题:直接利用卷积神经网络模型进行分类,虽然可以实现一定精度的场景类别分类,但是,常规的卷积神经网络对图像场景信息的提取和建模,并不符合人脑认知的实际方式,因此也带来了模型可解释性差、精度有限等问题。现有的全局上下文信息获取的方法,如采用非局部均值(non-loca l)、各种注意力机制,参数成本太高,难以应用于高分辨率输入图像的场景中。因此,如何提高图像场景分类的准确性及减小分类过程中的参数量,成为了亟待解决的技术问题。The following is the technical problem of the prior art realized by the inventor: directly using the convolutional neural network model for classification, although it can achieve a certain accuracy of scene category classification, but the extraction and modeling of image scene information by conventional convolutional neural networks does not conform to the actual way of human brain cognition, so it also brings problems such as poor interpretability and limited accuracy of the model. The existing global context information acquisition methods, such as non-local means and various attention mechanisms, have too high a parameter cost and are difficult to apply to high-resolution input image scenarios. Therefore, how to improve the accuracy of image scene classification and reduce the amount of parameters in the classification process has become a technical problem to be solved urgently.
技术解决方案technical solution
第一方面,本申请实施例提出了一种基于图神经网络的图像场景分类方法,所述方法包括:In the first aspect, the embodiment of the present application proposes an image scene classification method based on a graph neural network, the method comprising:
对待分类的目标图像进行超像素分割,得到目标超像素分割图像;Perform superpixel segmentation on the target image to be classified to obtain the target superpixel segmentation image;
获取所述目标超像素分割图像下的多个目标超像素单元,将每个所述目标超像素单元分别作为一个节点,获取每个目标超像素单元的节点特征、相邻目标超像素单元之间的边特征;Obtaining a plurality of target superpixel units under the target superpixel segmentation image, using each of the target superpixel units as a node, and acquiring node features of each target superpixel unit and edge features between adjacent target superpixel units;
对于每个目标超像素单元,根据所述目标超像素单元的节点特征,确定所述目标超像素单元的状态向量;For each target superpixel unit, according to the node characteristics of the target superpixel unit, determine the state vector of the target superpixel unit;
对于每个目标超像素单元,根据所述目标超像素单元的状态向量、相邻目标超像素单元的状态向量、所述目标超像素单元与相邻目标超像素单元之间的边特征,对所述目标超像素单元的状态向量进行更新,得到所述目标超像素单元更新后的状态向量;For each target superpixel unit, according to the state vector of the target superpixel unit, the state vector of the adjacent target superpixel unit, and the edge feature between the target superpixel unit and the adjacent target superpixel unit, the state vector of the target superpixel unit is updated to obtain the updated state vector of the target superpixel unit;
将所有目标超像素单元更新后的状态向量输入至预先训练好的图像场景分类模型,以使所述图像场景分类模型输出基于所述目标超像素分割图像的目标场景标签;The updated state vectors of all target superpixel units are input to a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image;
根据所述目标场景标签确定对应于所述目标图像的图像场景分类结果。An image scene classification result corresponding to the target image is determined according to the target scene label.
In a second aspect, an embodiment of the present application proposes a graph neural network-based image scene classification apparatus, comprising:

an image segmentation module, configured to perform superpixel segmentation on a target image to be classified to obtain a target superpixel segmented image;

a feature extraction module, configured to acquire a plurality of target superpixel units under the target superpixel segmented image, use each target superpixel unit as a node, and acquire node features of each target superpixel unit and edge features between adjacent target superpixel units;

a state determination module, configured to, for each target superpixel unit, determine a state vector of the target superpixel unit according to the node features of the target superpixel unit;

a state update module, configured to, for each target superpixel unit, update the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain an updated state vector of the target superpixel unit;

a label output module, configured to input the updated state vectors of all target superpixel units to a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image; and

a scene classification module, configured to determine an image scene classification result corresponding to the target image according to the target scene label.
In a third aspect, an embodiment of the present application proposes an electronic device, comprising a memory, a processor, a program stored on the memory and executable on the processor, and a data bus for connection and communication between the processor and the memory, wherein the program, when executed by the processor, implements a graph neural network-based image scene classification method, the method comprising:

performing superpixel segmentation on a target image to be classified to obtain a target superpixel segmented image;

acquiring a plurality of target superpixel units under the target superpixel segmented image, using each target superpixel unit as a node, and acquiring node features of each target superpixel unit and edge features between adjacent target superpixel units;

for each target superpixel unit, determining a state vector of the target superpixel unit according to the node features of the target superpixel unit;

for each target superpixel unit, updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain an updated state vector of the target superpixel unit;

inputting the updated state vectors of all target superpixel units to a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image; and

determining an image scene classification result corresponding to the target image according to the target scene label.
In a fourth aspect, an embodiment of the present application proposes a storage medium, the storage medium being a computer-readable storage medium for computer-readable storage, the storage medium storing one or more programs executable by one or more processors to implement a graph neural network-based image scene classification method, the method comprising:

performing superpixel segmentation on a target image to be classified to obtain a target superpixel segmented image;

acquiring a plurality of target superpixel units under the target superpixel segmented image, using each target superpixel unit as a node, and acquiring node features of each target superpixel unit and edge features between adjacent target superpixel units;

for each target superpixel unit, determining a state vector of the target superpixel unit according to the node features of the target superpixel unit;

for each target superpixel unit, updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain an updated state vector of the target superpixel unit;

inputting the updated state vectors of all target superpixel units to a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image; and

determining an image scene classification result corresponding to the target image according to the target scene label.
Beneficial Effects
The solution of the present application performs modeling based on a graph neural network and constructs graph data from the superpixel units obtained by superpixel segmentation of the target image. In addition, to fully mine the spatio-temporal topological relationships of the target image scene, the correlations between adjacent superpixel units and the edge features are taken into account during modeling, so that the message-passing property of the graph neural network can be exploited to achieve effective image scene classification. By learning correlations between local features through the graph neural network, rather than being limited to correlations between individual pixel pairs, feature transfer and reuse are better realized, global context information is obtained effectively, the accuracy of deep models on image understanding tasks is improved, and the limitation of costly spatial information is removed.
Description of Drawings
Fig. 1 is a schematic flowchart of a graph neural network-based image scene classification method provided by an embodiment of the present application;

Fig. 2a is a schematic diagram of a target image to be classified in an embodiment of the present application;

Fig. 2b is a schematic diagram of a target superpixel segmented image in an embodiment of the present application;

Fig. 3 is a schematic diagram of the message-passing process of the graph neural network in an embodiment of the present application;

Fig. 4 is a schematic flowchart of step S140 in Fig. 1;

Fig. 5 is a schematic diagram of a training process of an image scene classification model provided by an embodiment of the present application;

Fig. 6 is a schematic flowchart of another graph neural network-based image scene classification method provided by an embodiment of the present application;

Fig. 7 is a schematic diagram of target superpixel segmented images at different levels generated based on different preset segmentation thresholds;

Fig. 8 is a schematic diagram of another training process of an image scene classification model provided by an embodiment of the present application;

Fig. 9 is a schematic structural diagram of a graph neural network-based image scene classification apparatus provided by an embodiment of the present application;

Fig. 10 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present application.
Embodiments of the Present Invention
To make the purpose, technical solutions, and advantages of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present application, not to limit it.

First, several terms involved in the present application are explained:
Artificial intelligence (AI): a technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. As a branch of computer science, artificial intelligence attempts to understand the essence of intelligence and to produce intelligent machines that can respond in a manner similar to human intelligence; research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems. Artificial intelligence can simulate the information processes of human consciousness and thinking, and uses digital computers or machines controlled by digital computers to perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
Graph neural networks (GNN): a GNN is a neural network that operates directly on a graph structure. A graph structure usually comprises multiple nodes; a node can represent an object or concept, and an edge can represent a relationship between nodes. A GNN represents the state of each node by a state vector. Based on a message-propagation mechanism, each node updates its own state by exchanging messages with other nodes until a stable value is reached; the output of the GNN is then computed at each node from its current state. The main process of GNN learning is to iteratively aggregate and update the neighborhood information of the nodes in the graph data. In one iteration, each node updates its own information by aggregating the features of its adjacent nodes and its own features from the previous layer, usually followed by a nonlinear transformation of the aggregated information. By stacking multiple layers, each node can obtain information from adjacent nodes within the corresponding number of hops.
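To make this concrete, the following is a minimal sketch, in Python, of one way such graph data could be represented: nodes carrying state vectors and unordered node pairs carrying edge features. The class and field names are hypothetical, not taken from the application.

```python
from dataclasses import dataclass, field


@dataclass
class SuperpixelGraph:
    # node id -> state vector of that node (e.g. a list or numpy array)
    node_states: dict = field(default_factory=dict)
    # unordered node pair (u, v) -> edge feature vector between u and v
    edge_feats: dict = field(default_factory=dict)

    def neighbors(self, v):
        """Return every node that shares an edge with node v."""
        return [b if a == v else a for (a, b) in self.edge_feats if v in (a, b)]
```

Running several update iterations over such a graph lets each node accumulate information from neighbors within the corresponding number of hops, as described above.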
Region growing algorithm: digital image segmentation algorithms are generally based on one of two basic properties of gray values: discontinuity and similarity. The former property is applied by segmenting an image according to discontinuous changes in gray level, such as image edges. The latter property is mainly applied by partitioning an image into similar regions according to predefined criteria. The region growing algorithm is based on the second property, that is, the similarity of image gray values. Its basic idea is to merge pixels with similar properties. For each region, a seed point is first designated as the starting point of growth; the pixels in the neighborhood of the seed point are then compared with the seed point, and points with similar properties are merged and continue to grow outward until no pixel satisfies the inclusion condition. The growth of that region is then complete.
Image scene classification means that, for a given image, the scene to which it belongs (for example, nature, street, or indoor) is judged by identifying the information and content the image contains, thereby achieving the purpose of scene classification. Convolutional neural networks (CNNs) are widely used in computer vision tasks such as image scene classification. However, although directly using a convolutional neural network model for classification can achieve scene category classification with a certain accuracy, the way a conventional convolutional neural network extracts and models image scene information does not conform to the actual manner of human cognition, which brings problems such as poor model interpretability and limited accuracy. Existing methods for acquiring global context information, such as non-local means and various attention mechanisms, carry too high a parameter cost and are difficult to apply to scenarios with high-resolution input images. Therefore, how to improve the accuracy of image scene classification while reducing the number of parameters involved in the classification process has become an urgent technical problem.

On this basis, embodiments of the present application provide a graph neural network-based image scene classification method, apparatus, electronic device, and storage medium, aiming to improve the accuracy of image scene classification and to reduce the number of parameters involved in the classification process.
The embodiments of the present application may acquire and process relevant data based on artificial intelligence technology. Artificial intelligence (AI) is a theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.

Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, robotics, biometrics, speech processing, natural language processing, and machine learning/deep learning.
The graph neural network-based image scene classification method provided by the embodiments of the present application relates to the technical fields of artificial intelligence and image processing. The method may be applied to a terminal or to a server, or may be software running on a terminal or server. In some embodiments, the terminal may be a smartphone, tablet computer, notebook computer, desktop computer, or the like; the server may be configured as an independent physical server, as a server cluster or distributed system composed of multiple physical servers, or as a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms; the software may be an application implementing the image scene classification method, but is not limited to the above forms.

The present application can be used in numerous general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices. The application may be described in the general context of computer-executable instructions, such as program modules, executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments, where tasks are performed by remote processing devices linked through a communications network; in such environments, program modules may be located in both local and remote computer storage media, including storage devices.
Please refer to Fig. 1, which shows a schematic flowchart of a graph neural network-based image scene classification method provided by an embodiment of the present application. As shown in Fig. 1, the method includes, but is not limited to, the following steps S110-S160.

Step S110: perform superpixel segmentation on a target image to be classified to obtain a target superpixel segmented image.

Exemplarily, Fig. 2a shows a target image to be classified; by performing superpixel segmentation on this target image, the target superpixel segmented image shown in Fig. 2b can be obtained.
Exemplarily, in step S110, performing superpixel segmentation on the target image to be classified to obtain the target superpixel segmented image may be implemented as follows: a region growing algorithm is used to perform region segmentation on the target image to be classified, yielding the target superpixel segmented image.

The region growing algorithm segments an image digitally based on the similarity of its gray values. When performing superpixel segmentation on the target image with a region growing algorithm, the target image may first be regarded, based on preset pixel parameters, as an image composed of N*N pixels; seed points are selected from these N*N pixels according to preset rules, and it is then judged whether the gray value of each neighboring pixel and the gray value of the current seed point satisfy a preset similarity; if so, the neighboring pixel is added to the region to which the current seed point belongs. The condition for region growing is in effect a set of similarity criteria defined on the continuity of pixel gray levels, and the stopping condition defines a termination rule: essentially, region growing stops when no pixel satisfies the condition for joining a region. In the algorithm, a variable reg_maxdist, the maximum gray-value distance, is defined. When the absolute difference between the gray value of a candidate pixel and the average gray value of all pixels in the already-segmented region is less than or equal to reg_maxdist, the pixel is added to that region; otherwise, the region growing algorithm stops.
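As a minimal sketch of this procedure, assuming a single-channel grayscale image, 4-connectivity, and a running region mean (illustrative choices, not prescribed by the text):

```python
from collections import deque

import numpy as np


def region_grow(gray: np.ndarray, seed: tuple, reg_maxdist: float = 10.0) -> np.ndarray:
    """Grow one region from `seed` and return its boolean mask."""
    h, w = gray.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    region_sum, region_cnt = float(gray[seed]), 1
    frontier = deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):  # 4-neighborhood
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                # Join if the candidate pixel is within reg_maxdist of the region mean.
                if abs(float(gray[ny, nx]) - region_sum / region_cnt) <= reg_maxdist:
                    mask[ny, nx] = True
                    region_sum += float(gray[ny, nx])
                    region_cnt += 1
                    frontier.append((ny, nx))
    return mask  # growth stops once no border pixel satisfies the criterion
```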
Step S120: acquire a plurality of target superpixel units under the target superpixel segmented image, use each target superpixel unit as a node, and acquire node features of each target superpixel unit and edge features between adjacent target superpixel units.

As shown in Fig. 2b, after the target image undergoes superpixel segmentation, it is subdivided into multiple image sub-regions (sets of pixels); these sub-regions are the superpixel units. In the embodiment of the present application, the superpixel units under the target superpixel segmented image are used as target superpixel units, and graph data is then constructed from them. Specifically, each target superpixel unit is used as a node, and the node features of each target superpixel unit and the edge features between adjacent target superpixel units are acquired.
Exemplarily, the node features include at least one of a grayscale feature, a shape feature, and a texture feature, which are illustrated below.

(1) Grayscale features

Grayscale features describe the apparent physical properties of a target object through gray-level variation and directly reflect the target's own color pattern. In a multi-channel image, the gray values of a target in the red, green, and blue bands form a vector, from which multiple indicators can be computed, including mean, brightness, variance, and standard deviation.

(2) Shape features

For image targets, most targets usually exhibit regular geometric forms in their apparent shape. As an important means of target representation in images, shape features are generally determined by the outer contour the target presents in the image and reflect its geometric form to a certain extent. A compact outer contour distinguishes a target from others, and at a recognizable resolution shape features have advantages such as rotation invariance. Commonly used shape features include length, width, area, perimeter, density, roundness, shape index, and rectangularity.

(3) Texture features

Texture information in an image embodies a combination of grayscale and spatial characteristics and reflects the spatial distribution of pixel color information. It usually appears as a locally regular pattern at an intermediate scale between the pixel level and the scene level, constituting a semi-macroscopic level of target knowledge. In practical computation, texture is often described as the spatially regular distribution of, and correlation between, image gray levels within a specific 3*3, 5*5, 7*7, or larger window. The most classic and widely used statistical method for computing texture features is the gray-level co-occurrence matrix (GLCM) method proposed by Haralick et al.
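A minimal sketch of assembling such node features for one superpixel is given below. The particular statistics, and the use of scikit-image's GLCM helpers graycomatrix/graycoprops, are illustrative assumptions rather than the feature set prescribed by the application; `gray` is assumed to be a uint8 grayscale image and `mask` a boolean mask of the superpixel.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops


def node_features(gray: np.ndarray, mask: np.ndarray) -> np.ndarray:
    pixels = gray[mask].astype(np.float64)
    # (1) Grayscale features: mean, variance, standard deviation.
    gray_feats = [pixels.mean(), pixels.var(), pixels.std()]
    # (2) Shape features: area and a bounding-box rectangularity measure.
    ys, xs = np.nonzero(mask)
    area = float(mask.sum())
    bbox_area = float((ys.max() - ys.min() + 1) * (xs.max() - xs.min() + 1))
    shape_feats = [area, area / bbox_area]
    # (3) Texture features: GLCM statistics over the superpixel's bounding box
    # (using the bounding box is a common simplification).
    patch = gray[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    glcm = graycomatrix(patch, distances=[1], angles=[0], symmetric=True, normed=True)
    tex_feats = [graycoprops(glcm, "contrast")[0, 0], graycoprops(glcm, "homogeneity")[0, 0]]
    return np.array(gray_feats + shape_feats + tex_feats, dtype=np.float32)
```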
Step S130: for each target superpixel unit, determine the state vector of the target superpixel unit according to its node features.
Exemplarily, in the embodiment of the present application, after the node features of each target superpixel unit are determined, the initial state of each target superpixel unit may be determined from its node features according to a preset mapping rule. For a target superpixel unit $v$, its initial state vector can be denoted $h_v^{(0)}$, where $v \in N$ and $N$ is the set of target superpixel units.
Step S140: for each target superpixel unit, update the state vector of the target superpixel unit according to its state vector, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit.

It can be understood that the embodiment of the present application is based on the message-propagation mechanism of the graph neural network: after determining its initial state vector, each node also exchanges messages with its adjacent nodes and updates its own state according to the messages of those nodes and of the edges connecting it to them.

Please refer to Fig. 3, which shows a schematic diagram of the message-passing process of the graph neural network. For node v1 in Fig. 3, the state vector of v1 can be updated by considering the state vectors of the adjacent nodes v3, v5, and v8 together with the features of the edges connecting them.
Referring to Fig. 4, in step S140, updating the state vector of a target superpixel unit according to its state vector, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units may specifically be implemented through the following steps S141-S142:

Step S141: determine a relationship feature vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units.
Specifically, the relationship feature vector is computed by the following formula (1):

$$m_v^{(t+1)} = \sum_{w \in \mathcal{N}(v)} M_t\left(h_v^{(t)},\, h_w^{(t)},\, e_{vw}\right) \tag{1}$$

In the above formula, $m_v^{(t+1)}$ denotes the relationship feature vector of the target superpixel unit $v$; $h_v^{(t)}$ denotes the state vector of the target superpixel unit $v$ at iteration $t$ (its initial state vector when $t = 0$); $h_w^{(t)}$ denotes the state vector of an adjacent target superpixel unit $w$; $e_{vw}$ denotes the edge feature vector between the target superpixel unit $v$ and the adjacent target superpixel unit $w$; $M_t$ denotes the message passing function; and $\mathcal{N}(v)$ denotes the set of adjacent target superpixel units.
Step S142: update the state vector of the target superpixel unit according to its state vector and the relationship feature vector, to obtain the updated state vector of the target superpixel unit.
Specifically, the updated state vector of the target superpixel unit can be expressed by the following formula (2):

$$h_v^{(t+1)} = U_t\left(h_v^{(t)},\, m_v^{(t+1)}\right) \tag{2}$$

In the above formula, $h_v^{(t+1)}$ denotes the updated state vector of the target superpixel unit $v$; $h_v^{(t)}$ denotes its state vector before the update; $m_v^{(t+1)}$ denotes the relationship feature vector of the target superpixel unit $v$; and $U_t$ denotes the state update model.
For example, in a specific example, the initial state vector of a certain target superpixel unit indicates that its initial state is "plant"; after collecting the state vectors and edge feature vectors of adjacent target superpixel units, its own state vector is updated, and the updated state vector indicates "tree".
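A minimal sketch of one such update iteration, implementing formulas (1) and (2), is given below; the tanh-of-linear-map forms chosen for the message passing function $M_t$ and the state update model $U_t$ are illustrative stand-ins, not the parameterization prescribed by the application.

```python
import numpy as np


def message_passing_step(states, edge_feats, W_msg, W_upd):
    """One iteration of formulas (1)-(2) over all nodes.

    states:     {v: h_v} current state vectors, each of shape (d,)
    edge_feats: {(v, w): e_vw} edge feature vectors, each of shape (d_e,)
    W_msg:      (d, 2*d + d_e) weights of the message function M_t
    W_upd:      (d, 2*d) weights of the state update model U_t
    """
    neighbors = {v: [] for v in states}
    for (a, b) in edge_feats:
        neighbors[a].append(b)
        neighbors[b].append(a)
    new_states = {}
    for v, h_v in states.items():
        # Formula (1): m_v = sum over w in N(v) of M_t(h_v, h_w, e_vw).
        m_v = np.zeros_like(h_v)
        for w in neighbors[v]:
            e_vw = edge_feats[(v, w)] if (v, w) in edge_feats else edge_feats[(w, v)]
            m_v += np.tanh(W_msg @ np.concatenate([h_v, states[w], e_vw]))
        # Formula (2): h_v' = U_t(h_v, m_v).
        new_states[v] = np.tanh(W_upd @ np.concatenate([h_v, m_v]))
    return new_states
```

Repeating this step until the states change negligibly between iterations corresponds to the "stable value" described above.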
S150: input the updated state vectors of all target superpixel units to a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image.

It can be understood that after the state vectors of all nodes have reached stable values through the update iterations, the updated state vectors of all target superpixel units are input to the pre-trained image scene classification model, so that the model outputs the corresponding target scene label according to the state vectors of all target superpixel units.
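As a minimal sketch of this readout step, where mean-pooling of the node states and a linear softmax head are illustrative assumptions:

```python
import numpy as np


def classify_scene(updated_states, W_cls, b_cls, scene_labels):
    """updated_states: {v: h_v'}; W_cls: (num_classes, d); b_cls: (num_classes,)."""
    h_graph = np.mean(list(updated_states.values()), axis=0)  # pool all node states
    logits = W_cls @ h_graph + b_cls
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                                      # softmax
    return scene_labels[int(np.argmax(probs))]                # target scene label
```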
S160: determine an image scene classification result corresponding to the target image according to the target scene label.

It can be understood that once the target scene label corresponding to the target image is obtained, the image scene classification result can be determined from it.
It can be understood that before step S150, the image scene classification model provided by the embodiment of the present application needs to be trained to obtain the pre-trained model. Please refer to Fig. 5, which shows a schematic diagram of the training process of the image scene classification model. As shown in Fig. 5, the training process may include the following steps S200-S250.
Step S200: acquire a sample image and a sample scene label corresponding to the sample image.

It can be understood that before model training, a sample image set is first constructed, and each sample image in the set is assigned a sample scene label.

Step S210: perform superpixel segmentation on the sample image to obtain a sample superpixel segmented image.

Step S220: acquire a plurality of sample superpixel units under the sample superpixel segmented image, use each sample superpixel unit as a node, and acquire node features of each sample superpixel unit and edge features between adjacent sample superpixel units.

Step S230: for each sample superpixel unit, determine the state vector of the sample superpixel unit according to its node features.

Step S240: for each sample superpixel unit, update the state vector of the sample superpixel unit according to its state vector, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and the adjacent sample superpixel units, to obtain the updated state vector of the sample superpixel unit.

The implementation of steps S210-S240 is similar to that of steps S110-S140 above; reference may be made to the related descriptions of steps S110-S140, which are not repeated here.

Step S250: train the image scene classification model with the updated state vectors of all sample superpixel units as input and the sample scene label as expected output.

By performing multiple rounds of iterative training on the image scene classification model until it satisfies the training termination condition, a trained image scene classification model is obtained.
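A minimal sketch of this training loop, assuming the updated sample state vectors have already been computed as described in steps S210-S240, and using PyTorch with cross-entropy loss and the Adam optimizer as illustrative choices:

```python
import torch
import torch.nn as nn


def train_classifier(model: nn.Module, samples, epochs: int = 10, lr: float = 1e-3):
    """samples: iterable of (states, label) pairs, where `states` is a
    (num_superpixels, dim) tensor of updated sample state vectors and
    `label` is the index of the sample scene label."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for states, label in samples:
            logits = model(states.mean(dim=0))  # pooled graph-level input
            loss = loss_fn(logits.unsqueeze(0), torch.tensor([label]))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```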
Please refer to Fig. 6, which shows a schematic flowchart of another graph neural network-based image scene classification method provided by an embodiment of the present application. As shown in Fig. 6, the method includes, but is not limited to, the following steps S310-S380.
Step S310: perform superpixel segmentation on the target image to be classified based on different preset segmentation thresholds to obtain multiple target superpixel segmented images, each containing a different number of target superpixel units.

It can be understood that, since an image scene usually has a certain hierarchy, the embodiment of the present application controls the superpixel segmentation threshold so that different target superpixel segmented images, each containing a different number of target superpixel units, are generated based on different preset segmentation thresholds.

For example, as shown in Fig. 7, target superpixel segmented images at three different levels can be generated for the target image based on different preset segmentation thresholds, decomposing the image into a multi-level network structure for expressing target information. In the example shown in Fig. 7, the image is divided into three levels of superpixel unit sets, containing 8, 4, and 2 superpixels respectively.

In a specific implementation, the different segmentation thresholds can be determined by adjusting the preset pixel parameters when performing superpixel segmentation on the target image based on the region growing algorithm.
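A minimal sketch, assuming a hypothetical segment_superpixels wrapper around the region-growing procedure sketched earlier:

```python
def multi_level_segmentation(image, thresholds=(5.0, 15.0, 40.0)):
    """Run the same segmentation under several preset thresholds.

    A larger threshold merges more pixels, yielding fewer, larger superpixels,
    so the returned list runs from the finest to the coarsest level.
    The threshold values here are illustrative.
    """
    return [segment_superpixels(image, reg_maxdist=t) for t in thresholds]
```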
Step S320: traverse each target superpixel segmented image and acquire the plurality of target superpixel units under the currently traversed target superpixel segmented image.

It can be understood that after the target superpixel segmented images at different levels are obtained, each of them is traversed, and the following steps S330-S360 are performed for each target superpixel segmented image.

Step S330: for the plurality of target superpixel units under the currently traversed target superpixel segmented image, use each target superpixel unit as a node, and acquire node features of each target superpixel unit and edge features between adjacent target superpixel units.

Step S340: for each target superpixel unit, determine the state vector of the target superpixel unit according to its node features.

Step S350: for each target superpixel unit, update the state vector of the target superpixel unit according to its state vector, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit.

Step S360: input the updated state vectors of all target superpixel units to the pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image.

The implementation of steps S330-S360 is similar to that of steps S120-S150 above; reference may be made to the related descriptions of steps S120-S150, which are not repeated here.
Step S370: acquire the target scene labels output by the image scene classification model based on each target superpixel segmented image.

It can be understood that the present application obtains target scene labels at different levels based on the target superpixel segmented images at the different levels. For example, the target scene label obtained at level 1 is "nature", the label obtained at level 2 is "forest", and the label obtained at level 3 is "shrub forest".

Step S380: determine the image scene classification result corresponding to the target image according to the target scene labels output based on each superpixel segmented image.

Step S380 may specifically include: concatenating the target scene labels output from the superpixel segmented images to obtain the image scene classification result corresponding to the target image.

Following the previous example, the target superpixel segmented images at the three different levels yield the target scene labels "nature", "forest", and "shrub forest" corresponding to the three levels; concatenating the labels of the three levels outputs the final scene classification result "nature-forest-shrub forest".
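A one-line sketch of this concatenation step:

```python
def concat_labels(level_labels):
    # e.g. ["nature", "forest", "shrub forest"] -> "nature-forest-shrub forest"
    return "-".join(level_labels)
```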
In this embodiment, by generating a multi-scale representation of the target image that covers both microscopic and macroscopic scales, the extraction and mining of the component units in the image scene and of their spatial topological relationships is better realized, and the key multi-level information of the scene is captured more deeply, achieving more accurate and effective image scene classification.
It can be understood that before step S360, the image scene classification model needs to be trained to obtain the pre-trained model. Please refer to Fig. 8, which shows a schematic diagram of the training process of the image scene classification model. As shown in Fig. 8, the training process provided by the embodiment of the present application may include the following steps S400-S420.
Step S400: acquire a sample image and a sample scene label corresponding to the sample image.

It can be understood that before model training, a sample image set is first constructed, and each sample image in the set is assigned a sample scene label.

Step S410: perform superpixel segmentation on the sample image based on different preset segmentation thresholds to obtain multiple sample superpixel segmented images, each containing a different number of sample superpixel units.

It can be understood that, for the sample image, superpixel segmentation is performed based on different preset segmentation thresholds; for the specific segmentation method, reference may be made to the implementation of step S310 above, which is not repeated here.

Step S420: traverse each sample superpixel segmented image to train the image scene classification model based on each sample superpixel segmented image; the training process includes steps S421-S424:

Step S421: use each sample superpixel unit under the currently traversed sample superpixel segmented image as a node, and acquire node features of each sample superpixel unit and edge features between adjacent sample superpixel units.

Step S422: for each sample superpixel unit, determine the state vector of the sample superpixel unit according to its node features.

Step S423: for each sample superpixel unit, update the state vector of the sample superpixel unit according to its state vector, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and the adjacent sample superpixel units, to obtain the updated state vector of the sample superpixel unit.

Step S424: train the image scene classification model with the updated state vectors of all sample superpixel units as input and the sample scene label as expected output.
It should be noted that the specific implementation of steps S421-S424 is similar to that of steps S220-S250 above; reference may be made to the related descriptions of steps S220-S250, which are not repeated here.
Please refer to Fig. 9; an embodiment of the present application also provides a graph neural network-based image scene classification apparatus, comprising:

an image segmentation module 810, configured to perform superpixel segmentation on a target image to be classified to obtain a target superpixel segmented image;

a feature extraction module 820, configured to acquire a plurality of target superpixel units under the target superpixel segmented image, use each target superpixel unit as a node, and acquire node features of each target superpixel unit and edge features between adjacent target superpixel units;

a state determination module 830, configured to, for each target superpixel unit, determine a state vector of the target superpixel unit according to the node features of the target superpixel unit;

a state update module 840, configured to, for each target superpixel unit, update the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain an updated state vector of the target superpixel unit;

a label output module 850, configured to input the updated state vectors of all target superpixel units to a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image; and

a scene classification module 860, configured to determine an image scene classification result corresponding to the target image according to the target scene label.
It can be understood that, in some embodiments, the graph neural network-based image scene classification apparatus of the present application further includes a training module, configured to train the image scene classification model to obtain a trained image scene classification model.

It should be noted that, since the information exchange and execution processes between the modules of the above apparatus are based on the same conception as the method embodiments of the present application, reference may be made to the method embodiment sections for their specific functions and technical effects, which are not repeated here.
An embodiment of the present application also provides an electronic device, comprising a memory, a processor, a program stored on the memory and executable on the processor, and a data bus for connection and communication between the processor and the memory; the program, when executed by the processor, implements the above image scene classification method. The electronic device may be any intelligent terminal, including a tablet computer, a vehicle-mounted computer, and the like.

Please refer to Fig. 10, which illustrates the hardware structure of an electronic device of another embodiment, the electronic device comprising:
a processor 901, which may be implemented by a general-purpose CPU (central processing unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to realize the technical solutions provided by the embodiments of the present application;

a memory 902, which may be implemented in the form of a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 902 may store an operating system and other application programs. When the technical solutions provided by the embodiments of this specification are implemented through software or firmware, the relevant program code is stored in the memory 902 and invoked by the processor 901 to execute the graph neural network-based image scene classification method of the embodiments of the present application, the method comprising: performing superpixel segmentation on a target image to be classified to obtain a target superpixel segmented image; acquiring a plurality of target superpixel units under the target superpixel segmented image, using each target superpixel unit as a node, and acquiring node features of each target superpixel unit and edge features between adjacent target superpixel units; for each target superpixel unit, determining a state vector of the target superpixel unit according to the node features of the target superpixel unit; for each target superpixel unit, updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain an updated state vector of the target superpixel unit; inputting the updated state vectors of all target superpixel units to a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image; and determining an image scene classification result corresponding to the target image according to the target scene label;

an input/output interface 903, configured to realize information input and output;

a communication interface 904, configured to realize communication interaction between this device and other devices, where communication may be realized in a wired manner (for example, USB or network cable) or in a wireless manner (for example, mobile network, WIFI, or Bluetooth); and

a bus 905, which transfers information between the components of the device (for example, the processor 901, the memory 902, the input/output interface 903, and the communication interface 904);

wherein the processor 901, the memory 902, the input/output interface 903, and the communication interface 904 realize communication connections with one another inside the device through the bus 905.
Exemplarily, before inputting the updated state vectors of all target superpixel units to the pre-trained image scene classification model to obtain the target scene label through the image scene classification model, the method further includes: acquiring a sample image and a sample scene label corresponding to the sample image; performing superpixel segmentation on the sample image to obtain a sample superpixel segmented image; acquiring a plurality of sample superpixel units under the sample superpixel segmented image, using each sample superpixel unit as a node, and acquiring node features of each sample superpixel unit and edge features between adjacent sample superpixel units; for each sample superpixel unit, determining a state vector of the sample superpixel unit according to its node features; for each sample superpixel unit, updating the state vector of the sample superpixel unit according to its state vector, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and the adjacent sample superpixel units, to obtain an updated state vector of the sample superpixel unit; and training the image scene classification model with the updated state vectors of all sample superpixel units as input and the sample scene label as expected output.
Exemplarily, before inputting the updated state vectors of all target superpixel units to the pre-trained image scene classification model to obtain the target scene label through the image scene classification model, the method further includes: acquiring a sample image and a sample scene label corresponding to the sample image; performing superpixel segmentation on the sample image based on different preset segmentation thresholds to obtain multiple sample superpixel segmented images, each containing a different number of sample superpixel units; and traversing each sample superpixel segmented image to train the image scene classification model based on each sample superpixel segmented image, the training process including: using each sample superpixel unit under the currently traversed sample superpixel segmented image as a node, and acquiring node features of each sample superpixel unit and edge features between adjacent sample superpixel units; for each sample superpixel unit, determining a state vector of the sample superpixel unit according to its node features; for each sample superpixel unit, updating the state vector of the sample superpixel unit according to its state vector, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and the adjacent sample superpixel units, to obtain an updated state vector of the sample superpixel unit; and training the image scene classification model with the updated state vectors of all sample superpixel units as input and the sample scene label as expected output.
Exemplarily, the performing superpixel segmentation on the target image to be classified to obtain the target superpixel segmented image includes:
performing superpixel segmentation on the target image to be classified based on different preset segmentation thresholds to obtain a plurality of target superpixel segmented images, each target superpixel segmented image including a different number of target superpixel units. Correspondingly, the acquiring a plurality of target superpixel units under the target superpixel segmented image includes: traversing the target superpixel segmented images, and acquiring a plurality of target superpixel units under the currently traversed target superpixel segmented image. The determining an image scene classification result corresponding to the target image according to the target scene label includes: acquiring the target scene label output by the image scene classification model based on each target superpixel segmented image, and determining the image scene classification result corresponding to the target image according to the target scene labels output based on the respective superpixel segmented images.
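At inference time, the corresponding traversal over the per-threshold segmented images could be sketched as follows. Again, `build_superpixel_graph` and the threshold values are assumptions, and the model is a generic stand-in that returns scene logits for one graph.

```python
import torch

@torch.no_grad()
def classify_multiscale(model, image, thresholds=(50, 100, 200)):
    model.eval()
    labels = []
    for n_segments in thresholds:       # traverse each segmented image
        node_feats, edges, edge_feats = build_superpixel_graph(
            image, n_segments=n_segments)
        logits = model(torch.from_numpy(node_feats),
                       torch.from_numpy(edges),
                       torch.from_numpy(edge_feats))
        labels.append(int(logits.argmax()))
    return labels                       # one target scene label per threshold
```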
Exemplarily, the determining the image scene classification result corresponding to the target image according to the target scene labels output based on the respective superpixel segmented images includes: splicing the target scene labels output for the respective superpixel segmented images to obtain the image scene classification result corresponding to the target image.
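The application does not define how the splicing is performed; one plausible reading, shown below, is to concatenate the per-segmentation labels into a composite result and keep a majority vote as the headline scene. Both choices are assumptions.

```python
from collections import Counter

def splice_labels(per_scale_labels, class_names):
    # Concatenate the per-scale labels; a majority vote picks the headline class.
    names = [class_names[i] for i in per_scale_labels]
    majority = Counter(names).most_common(1)[0][0]
    return {"per_scale": names, "scene": majority}

# splice_labels([2, 2, 0], ["indoor", "street", "nature"])
# -> {"per_scale": ["nature", "nature", "indoor"], "scene": "nature"}
```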
Exemplarily, the updating, for each target superpixel unit, the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit, includes: determining a relationship feature vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units; and updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit and the relationship feature vector, to obtain the updated state vector of the target superpixel unit.
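A sketch of one such update round follows, under the assumption of a summed message function for the relationship feature vector and a GRU cell for the fusion step; neither is specified by the application.

```python
import torch
import torch.nn as nn

class StateUpdater(nn.Module):
    def __init__(self, state_dim, edge_dim):
        super().__init__()
        # Stand-in message MLP over (own state, neighbour state, edge feature).
        self.message = nn.Sequential(
            nn.Linear(2 * state_dim + edge_dim, state_dim), nn.ReLU())
        self.gru = nn.GRUCell(state_dim, state_dim)

    def forward(self, states, edges, edge_feats):
        src, dst = edges[:, 0], edges[:, 1]
        # Messages flow in both directions along each undirected edge.
        m_fwd = self.message(torch.cat([states[dst], states[src], edge_feats], dim=1))
        m_bwd = self.message(torch.cat([states[src], states[dst], edge_feats], dim=1))
        relation = torch.zeros_like(states)      # relationship feature vector
        relation.index_add_(0, dst, m_fwd)
        relation.index_add_(0, src, m_bwd)
        # Fuse the relationship features with the current state vector.
        return self.gru(relation, states)
```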
An embodiment of the present application further provides a storage medium. The storage medium is a computer-readable storage medium for computer-readable storage, and stores one or more programs executable by one or more processors to implement a graph neural network-based image scene classification method, wherein the graph neural network-based image scene classification method includes: performing superpixel segmentation on a target image to be classified to obtain a target superpixel segmented image; acquiring a plurality of target superpixel units under the target superpixel segmented image, taking each target superpixel unit as a node, and acquiring the node feature of each target superpixel unit and the edge features between adjacent target superpixel units; for each target superpixel unit, determining the state vector of the target superpixel unit according to the node feature of the target superpixel unit; for each target superpixel unit, updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit; inputting the updated state vectors of all target superpixel units to a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image; and determining an image scene classification result corresponding to the target image according to the target scene label.
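For the classification model itself, the application only states that the updated state vectors are its input and a scene label its output. A minimal assumed readout, which would sit on top of the state updater sketched earlier, pools the updated state vectors and maps them to scene logits:

```python
import torch.nn as nn

class SceneReadout(nn.Module):
    # The mean pooling and single linear head are illustrative assumptions;
    # the application does not fix the readout architecture.
    def __init__(self, state_dim, num_scenes):
        super().__init__()
        self.classifier = nn.Linear(state_dim, num_scenes)

    def forward(self, updated_states):           # (num_units, state_dim)
        graph_vec = updated_states.mean(dim=0)   # pool all superpixel units
        return self.classifier(graph_vec)        # scene logits
```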
Exemplarily, before the updated state vectors of all target superpixel units are input to the pre-trained image scene classification model so that the target scene label is obtained through the image scene classification model, the method further includes: acquiring a sample image and a sample scene label corresponding to the sample image; performing superpixel segmentation on the sample image to obtain a sample superpixel segmented image; acquiring a plurality of sample superpixel units under the sample superpixel segmented image, taking each sample superpixel unit as a node, and acquiring the node feature of each sample superpixel unit and the edge features between adjacent sample superpixel units; for each sample superpixel unit, determining the state vector of the sample superpixel unit according to the node feature of the sample superpixel unit; for each sample superpixel unit, updating the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and the adjacent sample superpixel units, to obtain the updated state vector of the sample superpixel unit; and training the image scene classification model with the updated state vectors of all sample superpixel units as input and the sample scene label as the expected output.
Exemplarily, before the updated state vectors of all target superpixel units are input to the pre-trained image scene classification model so that the target scene label is obtained through the image scene classification model, the method further includes: acquiring a sample image and a sample scene label corresponding to the sample image; performing superpixel segmentation on the sample image based on different preset segmentation thresholds to obtain a plurality of sample superpixel segmented images, each sample superpixel segmented image including a different number of sample superpixel units; and traversing the sample superpixel segmented images to train the image scene classification model based on each sample superpixel segmented image. The training process includes: taking each sample superpixel unit under the currently traversed sample superpixel segmented image as a node, and acquiring the node feature of each sample superpixel unit and the edge features between adjacent sample superpixel units; for each sample superpixel unit, determining the state vector of the sample superpixel unit according to the node feature of the sample superpixel unit; for each sample superpixel unit, updating the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and the adjacent sample superpixel units, to obtain the updated state vector of the sample superpixel unit; and training the image scene classification model with the updated state vectors of all sample superpixel units as input and the sample scene label as the expected output.
Exemplarily, the performing superpixel segmentation on the target image to be classified to obtain the target superpixel segmented image includes:
performing superpixel segmentation on the target image to be classified based on different preset segmentation thresholds to obtain a plurality of target superpixel segmented images, each target superpixel segmented image including a different number of target superpixel units. Correspondingly, the acquiring a plurality of target superpixel units under the target superpixel segmented image includes: traversing the target superpixel segmented images, and acquiring a plurality of target superpixel units under the currently traversed target superpixel segmented image. The determining an image scene classification result corresponding to the target image according to the target scene label includes: acquiring the target scene label output by the image scene classification model based on each target superpixel segmented image, and determining the image scene classification result corresponding to the target image according to the target scene labels output based on the respective superpixel segmented images.
Exemplarily, the determining the image scene classification result corresponding to the target image according to the target scene labels output based on the respective superpixel segmented images includes: splicing the target scene labels output for the respective superpixel segmented images to obtain the image scene classification result corresponding to the target image.
Exemplarily, the updating, for each target superpixel unit, the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit, includes: determining a relationship feature vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units; and updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit and the relationship feature vector, to obtain the updated state vector of the target superpixel unit.
The computer-readable storage medium may be non-volatile or volatile.
As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs and non-transitory computer-executable programs. In addition, the memory may include high-speed random access memory, and may further include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some implementations, the memory optionally includes memories disposed remotely from the processor, and these remote memories may be connected to the processor via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The preferred embodiments of the present application have been described above with reference to the accompanying drawings, which does not thereby limit the scope of rights of the embodiments of the present application. Any modification, equivalent replacement, or improvement made by those skilled in the art without departing from the scope and essence of the embodiments of the present application shall fall within the scope of rights of the embodiments of the present application.

Claims (20)

  1. A graph neural network-based image scene classification method, wherein the method comprises:
    performing superpixel segmentation on a target image to be classified to obtain a target superpixel segmented image;
    acquiring a plurality of target superpixel units under the target superpixel segmented image, taking each target superpixel unit as a node, and acquiring a node feature of each target superpixel unit and edge features between adjacent target superpixel units;
    for each target superpixel unit, determining a state vector of the target superpixel unit according to the node feature of the target superpixel unit;
    for each target superpixel unit, updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain an updated state vector of the target superpixel unit;
    inputting the updated state vectors of all target superpixel units to a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image; and
    determining an image scene classification result corresponding to the target image according to the target scene label.
  2. The method according to claim 1, wherein, before the updated state vectors of all target superpixel units are input to the pre-trained image scene classification model so that the target scene label is obtained through the image scene classification model, the method further comprises:
    acquiring a sample image and a sample scene label corresponding to the sample image;
    performing superpixel segmentation on the sample image to obtain a sample superpixel segmented image;
    acquiring a plurality of sample superpixel units under the sample superpixel segmented image, taking each sample superpixel unit as a node, and acquiring a node feature of each sample superpixel unit and edge features between adjacent sample superpixel units;
    for each sample superpixel unit, determining a state vector of the sample superpixel unit according to the node feature of the sample superpixel unit;
    for each sample superpixel unit, updating the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and the adjacent sample superpixel units, to obtain an updated state vector of the sample superpixel unit; and
    training the image scene classification model with the updated state vectors of all sample superpixel units as input and the sample scene label as the expected output.
  3. The method according to claim 1, wherein, before the updated state vectors of all target superpixel units are input to the pre-trained image scene classification model so that the target scene label is obtained through the image scene classification model, the method further comprises:
    acquiring a sample image and a sample scene label corresponding to the sample image;
    performing superpixel segmentation on the sample image based on different preset segmentation thresholds to obtain a plurality of sample superpixel segmented images, each sample superpixel segmented image comprising a different number of sample superpixel units; and
    traversing the sample superpixel segmented images to train the image scene classification model based on each sample superpixel segmented image, the training process comprising:
    taking each sample superpixel unit under the currently traversed sample superpixel segmented image as a node, and acquiring a node feature of each sample superpixel unit and edge features between adjacent sample superpixel units;
    for each sample superpixel unit, determining a state vector of the sample superpixel unit according to the node feature of the sample superpixel unit;
    for each sample superpixel unit, updating the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and the adjacent sample superpixel units, to obtain an updated state vector of the sample superpixel unit; and
    training the image scene classification model with the updated state vectors of all sample superpixel units as input and the sample scene label as the expected output.
  4. The method according to claim 1, wherein the performing superpixel segmentation on the target image to be classified to obtain the target superpixel segmented image comprises:
    performing superpixel segmentation on the target image to be classified based on different preset segmentation thresholds to obtain a plurality of target superpixel segmented images, each target superpixel segmented image comprising a different number of target superpixel units;
    the acquiring a plurality of target superpixel units under the target superpixel segmented image comprises:
    traversing the target superpixel segmented images, and acquiring a plurality of target superpixel units under the currently traversed target superpixel segmented image; and
    the determining an image scene classification result corresponding to the target image according to the target scene label comprises:
    acquiring the target scene label output by the image scene classification model based on each target superpixel segmented image; and
    determining the image scene classification result corresponding to the target image according to the target scene labels output based on the respective superpixel segmented images.
  5. The method according to claim 4, wherein the determining the image scene classification result corresponding to the target image according to the target scene labels output based on the respective superpixel segmented images comprises:
    splicing the target scene labels output for the respective superpixel segmented images to obtain the image scene classification result corresponding to the target image.
  6. The method according to claim 1, wherein the updating, for each target superpixel unit, the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit, comprises:
    determining a relationship feature vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units; and
    updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit and the relationship feature vector, to obtain the updated state vector of the target superpixel unit.
  7. The method according to claim 1, wherein the performing superpixel segmentation on the target image to be classified to obtain the target superpixel segmented image comprises:
    performing region segmentation on the target image to be classified by using a region growing algorithm to obtain the target superpixel segmented image.
  8. A graph neural network-based image scene classification apparatus, comprising:
    an image segmentation module, configured to perform superpixel segmentation on a target image to be classified to obtain a target superpixel segmented image;
    a feature extraction module, configured to acquire a plurality of target superpixel units under the target superpixel segmented image, take each target superpixel unit as a node, and acquire a node feature of each target superpixel unit and edge features between adjacent target superpixel units;
    a state determination module, configured to determine, for each target superpixel unit, a state vector of the target superpixel unit according to the node feature of the target superpixel unit;
    a state update module, configured to update, for each target superpixel unit, the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain an updated state vector of the target superpixel unit;
    a label output module, configured to input the updated state vectors of all target superpixel units to a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image; and
    a scene classification module, configured to determine an image scene classification result corresponding to the target image according to the target scene label.
  9. An electronic device, wherein the electronic device comprises a memory, a processor, a program stored on the memory and executable on the processor, and a data bus for connection and communication between the processor and the memory, wherein the program, when executed by the processor, implements a graph neural network-based image scene classification method;
    wherein the graph neural network-based image scene classification method comprises:
    performing superpixel segmentation on a target image to be classified to obtain a target superpixel segmented image;
    acquiring a plurality of target superpixel units under the target superpixel segmented image, taking each target superpixel unit as a node, and acquiring a node feature of each target superpixel unit and edge features between adjacent target superpixel units;
    for each target superpixel unit, determining a state vector of the target superpixel unit according to the node feature of the target superpixel unit;
    for each target superpixel unit, updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain an updated state vector of the target superpixel unit;
    inputting the updated state vectors of all target superpixel units to a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image; and
    determining an image scene classification result corresponding to the target image according to the target scene label.
  10. The electronic device according to claim 9, wherein, before the updated state vectors of all target superpixel units are input to the pre-trained image scene classification model so that the target scene label is obtained through the image scene classification model, the method further comprises:
    acquiring a sample image and a sample scene label corresponding to the sample image;
    performing superpixel segmentation on the sample image to obtain a sample superpixel segmented image;
    acquiring a plurality of sample superpixel units under the sample superpixel segmented image, taking each sample superpixel unit as a node, and acquiring a node feature of each sample superpixel unit and edge features between adjacent sample superpixel units;
    for each sample superpixel unit, determining a state vector of the sample superpixel unit according to the node feature of the sample superpixel unit;
    for each sample superpixel unit, updating the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and the adjacent sample superpixel units, to obtain an updated state vector of the sample superpixel unit; and
    training the image scene classification model with the updated state vectors of all sample superpixel units as input and the sample scene label as the expected output.
  11. The electronic device according to claim 9, wherein, before the updated state vectors of all target superpixel units are input to the pre-trained image scene classification model so that the target scene label is obtained through the image scene classification model, the method further comprises:
    acquiring a sample image and a sample scene label corresponding to the sample image;
    performing superpixel segmentation on the sample image based on different preset segmentation thresholds to obtain a plurality of sample superpixel segmented images, each sample superpixel segmented image comprising a different number of sample superpixel units; and
    traversing the sample superpixel segmented images to train the image scene classification model based on each sample superpixel segmented image, the training process comprising:
    taking each sample superpixel unit under the currently traversed sample superpixel segmented image as a node, and acquiring a node feature of each sample superpixel unit and edge features between adjacent sample superpixel units;
    for each sample superpixel unit, determining a state vector of the sample superpixel unit according to the node feature of the sample superpixel unit;
    for each sample superpixel unit, updating the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and the adjacent sample superpixel units, to obtain an updated state vector of the sample superpixel unit; and
    training the image scene classification model with the updated state vectors of all sample superpixel units as input and the sample scene label as the expected output.
  12. The electronic device according to claim 9, wherein the performing superpixel segmentation on the target image to be classified to obtain the target superpixel segmented image comprises:
    performing superpixel segmentation on the target image to be classified based on different preset segmentation thresholds to obtain a plurality of target superpixel segmented images, each target superpixel segmented image comprising a different number of target superpixel units;
    the acquiring a plurality of target superpixel units under the target superpixel segmented image comprises:
    traversing the target superpixel segmented images, and acquiring a plurality of target superpixel units under the currently traversed target superpixel segmented image; and
    the determining an image scene classification result corresponding to the target image according to the target scene label comprises:
    acquiring the target scene label output by the image scene classification model based on each target superpixel segmented image; and
    determining the image scene classification result corresponding to the target image according to the target scene labels output based on the respective superpixel segmented images.
  13. The electronic device according to claim 12, wherein the determining the image scene classification result corresponding to the target image according to the target scene labels output based on the respective superpixel segmented images comprises:
    splicing the target scene labels output for the respective superpixel segmented images to obtain the image scene classification result corresponding to the target image.
  14. The electronic device according to claim 9, wherein the updating, for each target superpixel unit, the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit, comprises:
    determining a relationship feature vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units; and
    updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit and the relationship feature vector, to obtain the updated state vector of the target superpixel unit.
  15. A storage medium, the storage medium being a computer-readable storage medium for computer-readable storage, wherein the storage medium stores one or more programs, and the one or more programs are executable by one or more processors to implement a graph neural network-based image scene classification method;
    wherein the graph neural network-based image scene classification method comprises:
    performing superpixel segmentation on a target image to be classified to obtain a target superpixel segmented image;
    acquiring a plurality of target superpixel units under the target superpixel segmented image, taking each target superpixel unit as a node, and acquiring a node feature of each target superpixel unit and edge features between adjacent target superpixel units;
    for each target superpixel unit, determining a state vector of the target superpixel unit according to the node feature of the target superpixel unit;
    for each target superpixel unit, updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain an updated state vector of the target superpixel unit;
    inputting the updated state vectors of all target superpixel units to a pre-trained image scene classification model, so that the image scene classification model outputs a target scene label based on the target superpixel segmented image; and
    determining an image scene classification result corresponding to the target image according to the target scene label.
  16. The storage medium according to claim 15, wherein, before the updated state vectors of all target superpixel units are input to the pre-trained image scene classification model so that the target scene label is obtained through the image scene classification model, the method further comprises:
    acquiring a sample image and a sample scene label corresponding to the sample image;
    performing superpixel segmentation on the sample image to obtain a sample superpixel segmented image;
    acquiring a plurality of sample superpixel units under the sample superpixel segmented image, taking each sample superpixel unit as a node, and acquiring a node feature of each sample superpixel unit and edge features between adjacent sample superpixel units;
    for each sample superpixel unit, determining a state vector of the sample superpixel unit according to the node feature of the sample superpixel unit;
    for each sample superpixel unit, updating the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and the adjacent sample superpixel units, to obtain an updated state vector of the sample superpixel unit; and
    training the image scene classification model with the updated state vectors of all sample superpixel units as input and the sample scene label as the expected output.
  17. The storage medium according to claim 15, wherein, before the updated state vectors of all target superpixel units are input to the pre-trained image scene classification model so that the target scene label is obtained through the image scene classification model, the method further comprises:
    acquiring a sample image and a sample scene label corresponding to the sample image;
    performing superpixel segmentation on the sample image based on different preset segmentation thresholds to obtain a plurality of sample superpixel segmented images, each sample superpixel segmented image comprising a different number of sample superpixel units; and
    traversing the sample superpixel segmented images to train the image scene classification model based on each sample superpixel segmented image, the training process comprising:
    taking each sample superpixel unit under the currently traversed sample superpixel segmented image as a node, and acquiring a node feature of each sample superpixel unit and edge features between adjacent sample superpixel units;
    for each sample superpixel unit, determining a state vector of the sample superpixel unit according to the node feature of the sample superpixel unit;
    for each sample superpixel unit, updating the state vector of the sample superpixel unit according to the state vector of the sample superpixel unit, the state vectors of adjacent sample superpixel units, and the edge features between the sample superpixel unit and the adjacent sample superpixel units, to obtain an updated state vector of the sample superpixel unit; and
    training the image scene classification model with the updated state vectors of all sample superpixel units as input and the sample scene label as the expected output.
  18. The storage medium according to claim 15, wherein the performing superpixel segmentation on the target image to be classified to obtain the target superpixel segmented image comprises:
    performing superpixel segmentation on the target image to be classified based on different preset segmentation thresholds to obtain a plurality of target superpixel segmented images, each target superpixel segmented image comprising a different number of target superpixel units;
    the acquiring a plurality of target superpixel units under the target superpixel segmented image comprises:
    traversing the target superpixel segmented images, and acquiring a plurality of target superpixel units under the currently traversed target superpixel segmented image; and
    the determining an image scene classification result corresponding to the target image according to the target scene label comprises:
    acquiring the target scene label output by the image scene classification model based on each target superpixel segmented image; and
    determining the image scene classification result corresponding to the target image according to the target scene labels output based on the respective superpixel segmented images.
  19. The storage medium according to claim 18, wherein the determining the image scene classification result corresponding to the target image according to the target scene labels output based on the respective superpixel segmented images comprises:
    splicing the target scene labels output for the respective superpixel segmented images to obtain the image scene classification result corresponding to the target image.
  20. The storage medium according to claim 15, wherein the updating, for each target superpixel unit, the state vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units, to obtain the updated state vector of the target superpixel unit, comprises:
    determining a relationship feature vector of the target superpixel unit according to the state vector of the target superpixel unit, the state vectors of adjacent target superpixel units, and the edge features between the target superpixel unit and the adjacent target superpixel units; and
    updating the state vector of the target superpixel unit according to the state vector of the target superpixel unit and the relationship feature vector, to obtain the updated state vector of the target superpixel unit.
PCT/CN2022/090725 2022-01-21 2022-04-29 Graph neural network-based image scene classification method and apparatus WO2023137916A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210073146.3A CN114399002A (en) 2022-01-21 2022-01-21 Image scene classification method and device based on graph neural network
CN202210073146.3 2022-01-21

Publications (1)

Publication Number Publication Date
WO2023137916A1 (en)

Family ID: 81232596

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/090725 WO2023137916A1 (en) 2022-01-21 2022-04-29 Graph neural network-based image scene classification method and apparatus

Country Status (2)

Country Link
CN (1) CN114399002A (en)
WO (1) WO2023137916A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114399002A (en) * 2022-01-21 2022-04-26 平安科技(深圳)有限公司 Image scene classification method and device based on graph neural network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111695636B (en) * 2020-06-15 2023-07-14 北京师范大学 Hyperspectral image classification method based on graph neural network
CN113298129B (en) * 2021-05-14 2024-02-02 西安理工大学 Polarized SAR image classification method based on superpixel and graph convolution network

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180061046A1 (en) * 2016-08-31 2018-03-01 International Business Machines Corporation Skin lesion segmentation using deep convolution networks guided by local unsupervised learning
CN109741341A (en) * 2018-12-20 2019-05-10 华东师范大学 A kind of image partition method based on super-pixel and long memory network in short-term
CN110110741A (en) * 2019-03-26 2019-08-09 深圳大学 A kind of multiple dimensioned classification method, device and computer readable storage medium
CN113160177A (en) * 2021-04-23 2021-07-23 杭州电子科技大学 Plane segmentation method based on superpixel and graph convolution network
CN113313164A (en) * 2021-05-27 2021-08-27 复旦大学附属肿瘤医院 Digital pathological image classification method and system based on superpixel segmentation and image convolution
CN114399002A (en) * 2022-01-21 2022-04-26 平安科技(深圳)有限公司 Image scene classification method and device based on graph neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LONG JIANWU, YAN ZERAN, CHEN HONGFA: "A Graph Neural Network for superpixel image classification", JOURNAL OF PHYSICS: CONFERENCE SERIES, INSTITUTE OF PHYSICS PUBLISHING, GB, vol. 1871, no. 1, 1 April 2021 (2021-04-01), GB , pages 012071, XP093079775, ISSN: 1742-6588, DOI: 10.1088/1742-6596/1871/1/012071 *

Also Published As

Publication number Publication date
CN114399002A (en) 2022-04-26

Similar Documents

Publication Publication Date Title
Chen et al. Linear spectral clustering superpixel
CN112232293B (en) Image processing model training method, image processing method and related equipment
TWI821671B (en) A method and device for positioning text areas
WO2022120997A1 (en) Distributed slam system and learning method therefor
CN114677565B (en) Training method and image processing method and device for feature extraction network
Xu et al. Weakly supervised deep semantic segmentation using CNN and ELM with semantic candidate regions
CN111784699B (en) Method and device for carrying out target segmentation on three-dimensional point cloud data and terminal equipment
WO2020125100A1 (en) Image search method, apparatus, and device
Wang et al. Duplicate discovery on 2 billion internet images
WO2023137916A1 (en) Graph neural network-based image scene classification method and apparatus
CN108829692B (en) Flower image retrieval method based on convolutional neural network
CN112132145A (en) Image classification method and system based on model extended convolutional neural network
CN116452810A (en) Multi-level semantic segmentation method and device, electronic equipment and storage medium
CN113987236A (en) Unsupervised training method and unsupervised training device for visual retrieval model based on graph convolution network
Salem et al. Semantic image inpainting using self-learning encoder-decoder and adversarial loss
Gao et al. Full-scale video-based detection of smoke from forest fires combining ViBe and MSER algorithms
US9875528B2 (en) Multi-frame patch correspondence identification in video
Al-Saidi et al. Fuzzy fractal dimension based on escape time algorithm
CN111079930A (en) Method and device for determining quality parameters of data set and electronic equipment
CN108961268B (en) Saliency map calculation method and related device
CN112906517B (en) Self-supervision power law distribution crowd counting method and device and electronic equipment
CN112330697B (en) Image segmentation method and device, electronic equipment and readable storage medium
CN111626311A (en) Heterogeneous graph data processing method and device
Hu et al. Convolutional neural networks with hybrid weights for 3D point cloud classification
JP2021527859A (en) Irregular shape segmentation in an image using deep region expansion