CN117829265A - Electric power cross-mode bidirectional knowledge migration method based on intermediate space construction - Google Patents


Info

Publication number
CN117829265A
CN117829265A (application CN202410232948.3A)
Authority
CN
China
Prior art keywords
feature
node
connection
intermediate space
joint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410232948.3A
Other languages
Chinese (zh)
Other versions
CN117829265B (en)
Inventor
郑敏
吴春鹏
王岳
叶青河
刘卫卫
常珂
周飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Smart Grid Research Institute Co ltd
State Grid Corp of China SGCC
Original Assignee
State Grid Smart Grid Research Institute Co ltd
State Grid Corp of China SGCC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Smart Grid Research Institute Co ltd, State Grid Corp of China SGCC filed Critical State Grid Smart Grid Research Institute Co ltd
Priority to CN202410232948.3A priority Critical patent/CN117829265B/en
Publication of CN117829265A publication Critical patent/CN117829265A/en
Application granted granted Critical
Publication of CN117829265B publication Critical patent/CN117829265B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of knowledge migration, and in particular to a power cross-modal bidirectional knowledge migration method based on intermediate space construction. The method comprises the following steps: extracting first features of first modality data by adopting graph modeling and a graph neural network, and extracting second features of second modality data by adopting a large language model; constructing an intermediate space based on the similarity of the first features and the second features and the corresponding loss function; iteratively optimizing parameters in the graph neural network and the large language model by adopting the loss function in the intermediate space; and carrying out knowledge migration in the intermediate space based on the features extracted by the graph modeling, the parameter-optimized graph neural network and the large language model. By aligning the relation features extracted from the different modalities in the intermediate space, the method realizes bidirectional knowledge migration between data of different modalities.

Description

Electric power cross-mode bidirectional knowledge migration method based on intermediate space construction
Technical Field
The invention relates to the technical field of knowledge migration, in particular to a power cross-mode bidirectional knowledge migration method based on intermediate space construction.
Background
Knowledge migration refers to the process of applying knowledge learned from one domain or task to another domain or task, for example applying knowledge learned in the visual domain to the text domain. It helps solve new problems or tasks with existing knowledge and experience, thereby speeding up the learning process and improving performance. The goal of knowledge migration is to improve the learning efficiency and performance on new tasks by exploiting existing knowledge and experience.
In the power domain, many business scenarios rely on knowledge migration between the visual domain and the text domain. However, since there is a modality difference between the visual domain and the text domain, knowledge migration cannot be performed between them directly.
Disclosure of Invention
In view of the above, the present invention provides a method and apparatus for power cross-modal bidirectional knowledge migration based on intermediate space construction, so as to solve the problem of how to perform knowledge migration between cross-modal data.
In a first aspect, the present invention provides a method for power cross-modal bidirectional knowledge migration based on intermediate space construction, the method comprising: extracting first characteristics of first modal data by adopting graph modeling and a graph neural network, and extracting second characteristics of second modal data by adopting a large language model; constructing an intermediate space based on the similarity of the first feature and the second feature and the corresponding loss function; adopting a loss function in an intermediate space to iteratively optimize parameters in the graph neural network and the large language model; and carrying out knowledge migration in an intermediate space based on the characteristics extracted by the graph neural network after graph modeling and parameter iterative optimization and the large language model.
According to the power cross-mode bidirectional knowledge migration method based on intermediate space construction, provided by the embodiment of the invention, the first characteristics of the first mode data are extracted by adopting graph modeling and graph neural network, and the second characteristics of the second mode data are extracted by adopting a large language model; constructing an intermediate space based on the similarity of the first feature and the second feature and the corresponding loss function; adopting a loss function in an intermediate space to iteratively optimize parameters in the graph neural network and the large language model; and carrying out knowledge migration in an intermediate space based on the characteristics extracted by the graph neural network after graph modeling and parameter iterative optimization and the large language model. Therefore, the method aligns the relationship features extracted by different modes in the intermediate space, and realizes the bidirectional knowledge migration between the data of different modes.
In an alternative embodiment, the first modality data is image data, and extracting the first feature of the first modality data using graph modeling and a graph neural network includes: extracting an object node set, a connection node set and a node label of the first modal data by adopting graph modeling; respectively extracting first node characteristics of the object node set and first connection characteristics of the connection node set by adopting a graph neural network; determining node label semantic features and connection label semantic features based on the product of the weight matrix of the graph neural network and the node labels; the first feature of the first modality data is determined based on joint optimization of the first node feature and the node tag semantic feature and the first connection feature and the connection tag semantic feature.
In an alternative embodiment, determining the first feature of the first modality data based on joint optimization of the first node feature and the node tag semantic feature and the first connection feature and the connection tag semantic feature comprises: based on the activation function and the weight matrix of the graph neural network, the joint processing of the first node characteristic and the node label semantic characteristic is carried out to obtain an initial joint node characteristic; based on the activation function and the weight matrix of the graph neural network, the joint processing of the first connection feature and the connection tag semantic feature is carried out to obtain an initial joint connection feature; and optimizing the joint node characteristics based on the initial joint node characteristics, optimizing the joint connection characteristics based on the initial joint connection characteristics and the initial joint node characteristics of the corresponding connection, and obtaining optimized joint node characteristics and joint connection characteristics, wherein the joint node characteristics and the joint connection characteristics form a first characteristic.
In this embodiment, the first node feature and the node tag semantic feature, and the first connection feature and the connection tag semantic feature are extracted through the graph neural network, and feature optimization updating is performed in a joint optimization mode, so that information can be effectively propagated and integrated, and a global context is acquired on the whole image data.
In an alternative embodiment, the second modality data is text data, and the extracting the second feature of the second modality data using the large language model includes: acquiring a word order path of the second-mode data as an object node and semantic features of the second-mode data as a connection node; and extracting the word sequence features of the object nodes and the semantic features of the connection nodes by adopting a large language model, wherein the word sequence features and the semantic features form second features.
In this embodiment, text feature extraction is performed based on context feature representation of language order-semantics, and particularly, feature extraction is performed by using a large language model, so that the context relationship contained in the language order and the semantics of the text can be effectively captured.
In an alternative embodiment, constructing the intermediate space based on the similarity of the first feature and the second feature and the corresponding loss function includes: calculating the similarity of the first feature and the second feature; the loss function is constructed based on the maximization of the similarity, and the similarity and the loss function constitute an intermediate space.
In this embodiment, the loss function is constructed through similarity calculation and similarity maximization, so that bidirectional knowledge migration between different modalities can be accurately realized in an intermediate space constructed based on the similarity and the loss function.
In an alternative embodiment, the loss function is expressed using the following formula:
where s(·,·) represents the similarity, τ_1 and τ_2 represent hyper-parameters, x_i represents the anchor, x_j represents a positive sample, x_k represents the first negative sample, x_l represents the second negative sample, M represents the modality, M = 1 represents the first modality, and M = 2 represents the second modality.
In an alternative embodiment, the first feature comprises a joint node feature and a joint connection feature, and the second feature comprises a word order feature and a semantic feature; calculating the similarity of the first feature and the second feature comprises: calculating first similarity of joint node characteristics and language sequence characteristics by adopting an inner product; calculating a second similarity of the joint connection feature and the semantic feature by adopting an inner product; and adding the first similarity and the second similarity to obtain the similarity of the first feature and the second feature.
In this embodiment, since the first feature and the second feature respectively include two features, the similarity of the corresponding features is calculated respectively, and then the two similarities are added, so that accurate calculation of the similarity can be achieved.
In a second aspect, the present invention provides a power cross-modal bidirectional knowledge migration apparatus constructed based on an intermediate space, the apparatus comprising: the feature extraction module is used for extracting first features of the first modal data by adopting graph modeling and a graph neural network, and extracting second features of the second modal data by adopting a large language model; the space construction module is used for constructing an intermediate space based on the similarity of the first feature and the second feature and the corresponding loss function; the optimization module is used for iteratively optimizing parameters in the graph neural network and the large language model by adopting a loss function in an intermediate space; and the knowledge migration module is used for carrying out knowledge migration in the intermediate space based on the graph modeling, the graph neural network after parameter iteration optimization and the characteristics extracted by the large language model.
In a third aspect, the present invention provides a computer device comprising: the system comprises a memory and a processor, wherein the memory and the processor are in communication connection, the memory stores computer instructions, and the processor executes the computer instructions so as to execute the power cross-mode bidirectional knowledge migration method based on the intermediate space construction in the first aspect or any corresponding embodiment.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon computer instructions for causing a computer to perform the power cross-modal bidirectional knowledge migration method based on intermediate space construction of the first aspect or any one of its corresponding embodiments.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow diagram of a method for power cross-modal bi-directional knowledge migration based on intermediate space construction in accordance with an embodiment of the invention;
FIG. 2 is a flow diagram of yet another method for power cross-modal bi-directional knowledge migration based on intermediate space construction in accordance with an embodiment of the invention;
FIG. 3 is a block diagram of a power cross-modality bi-directional knowledge migration device constructed based on intervening spaces in accordance with an embodiment of the invention;
fig. 4 is a schematic diagram of a hardware structure of a computer device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In accordance with an embodiment of the present invention, there is provided an embodiment of a power cross-modal bi-directional knowledge migration method based on intermediate space construction, it being noted that the steps illustrated in the flowchart of the figures may be performed in a computer system, such as a set of computer executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order other than that illustrated herein.
In this embodiment, a method for transferring power cross-modal bidirectional knowledge constructed based on an intermediate space is provided, which can be used for electronic devices, such as computers, mobile phones, tablet computers, etc., fig. 1 is a flowchart of a method for transferring power cross-modal bidirectional knowledge constructed based on an intermediate space according to an embodiment of the present invention, as shown in fig. 1, where the flowchart includes the following steps:
step S101, a graph modeling and graph neural network is adopted to extract first features of first modal data, and a large language model is adopted to extract second features of second modal data. Specifically, the first modality data and the second modality data belong to data of different modalities, for example, the first modality data is visual domain data, the second modality data is text domain data, that is, the first modality data may be image data, and the second modality data may be text data. When the first characteristic of the first modal data is extracted, a graph modeling algorithm and a graph neural network algorithm are adopted, wherein the graph modeling algorithm can adopt a relational modeling algorithm in the related technology, so that relational mining in the first modal data is realized. And extracting the characteristics of the mined relation by adopting a graph neural network algorithm, so that the first characteristics in the first modal data are accurately extracted. When the feature extraction is performed on the second modality data, any large language model in the related art may be adopted, the large language model refers to a deep learning model trained by using a large amount of text data, and natural language text may be generated or meaning of language text may be understood.
Step S102, constructing an intermediate space based on the similarity of the first feature and the second feature and the corresponding loss function. Specifically, the intermediate space is used for realizing alignment among different modal data features, namely, mapping the first feature of the first modal data and the second feature of the second modal data into the intermediate space, so that knowledge migration between the first modal data and the second modal data is realized. In this embodiment, the construction of the intermediate space is realized through the calculation of the similarity and the construction of the loss function, wherein the loss function is constructed based on the calculated similarity.
And step S103, adopting a loss function in an intermediate space to iteratively optimize parameters in the graph neural network and the large language model. Specifically, since the loss function is determined by calculating the similarity between the first feature of the first modality data and the second feature of the second modality data, the process of tuning the model parameters by using the loss function is a process of making the first feature and the second feature more similar, that is, enabling alignment between features extracted by the model parameters after tuning, thereby realizing knowledge migration between different modality data. It should be noted that, the process of tuning the model parameters based on the loss function may be implemented by referring to related technologies, which will not be described herein.
Step S104, performing knowledge migration in the intermediate space based on the graph modeling, the graph neural network after parameter iterative optimization, and the features extracted by the large language model. Specifically, after the model parameters are optimized with the loss function, the first features of the first modality data are extracted by the graph modeling and the parameter-optimized graph neural network, and the second features of the second modality data are extracted by the parameter-optimized large language model, so that the extracted first features and second features are maximally similar and knowledge migration can be performed in the intermediate space.
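As an illustration of the migration step, once the two feature extractors are aligned, features from either modality can be compared directly in the intermediate space. The following is a minimal sketch (all names and values are hypothetical; the patent does not prescribe a concrete retrieval procedure) of cross-modal lookup by inner-product similarity:

```python
def inner(u, v):
    # Inner product of two feature vectors represented as plain lists.
    return sum(a * b for a, b in zip(u, v))

def cross_modal_retrieve(query_feat, candidate_feats):
    # After alignment, a feature from one modality retrieves the most similar
    # feature of the other modality by inner-product similarity, since both
    # now live in the same intermediate space.
    scores = [inner(query_feat, c) for c in candidate_feats]
    return scores.index(max(scores))

# Orthogonal unit vectors stand in for aligned text features of 4 candidates.
text_feats = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
image_query = [0.1, 0.0, 0.9, 0.0]  # an image feature closest to candidate 2
assert cross_modal_retrieve(image_query, text_feats) == 2
```

In this sketch the direction of migration is symmetric: a text feature could equally be used as the query against image-side candidates, which is what makes the migration bidirectional.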
According to the power cross-mode bidirectional knowledge migration method based on intermediate space construction, provided by the embodiment of the invention, the first characteristics of the first mode data are extracted by adopting graph modeling and graph neural network, and the second characteristics of the second mode data are extracted by adopting a large language model; constructing an intermediate space based on the similarity of the first feature and the second feature and the corresponding loss function; adopting a loss function in an intermediate space to iteratively optimize parameters in the graph neural network and the large language model; and carrying out knowledge migration in an intermediate space based on the characteristics extracted by the graph neural network after graph modeling and parameter iterative optimization and the large language model. Therefore, the method aligns the relationship features extracted by different modes in the intermediate space, and realizes the bidirectional knowledge migration between the data of different modes.
In this embodiment, a method for power cross-modal bidirectional knowledge migration based on intermediate space construction is provided, and the process includes the following steps:
step S201, a graph modeling and graph neural network is adopted to extract first characteristics of first modal data, and a large language model is adopted to extract second characteristics of second modal data.
Specifically, extracting the first feature of the first modality data using graph modeling and a graph neural network includes:
step S2011, extracting an object node set, a connection node set and a node label of the first modal data by adopting graph modeling; in this embodiment, a relational modeling method is adoptedNeuralMotifsAnd extracting an object node set, a connection node set and a node label of the first modal data. The specific extraction method can refer to the related technology, and will not be described herein. For example, if the first modality data is a picture, the content extracted by using the relational modeling method can be formalized asWherein->Representing the set of object nodes->Representing a set of connection nodes>Connection->And->Two object nodes, E representing edge sets, are virtual, not physical, are not involved in this embodiment, and are therefore formalized +.>
It should be noted that the object node set may refer to the set of objects in a picture; for example, if a picture includes two people, glasses and food, the set of these objects is the object node set. The connection node set may refer to the relationships between objects in the picture; for example, if in the picture one person wears the glasses and the other person is eating the food, then "wearing" and "eating" indicate relationships between objects, that is, wearing and eating in the picture form the connection node set. As for the node labels, those extracted by the relational modeling method Neural Motifs are generally represented by one-hot codes; in the picture, the node labels are specifically used for marking the specific positions of the object nodes and the connection nodes in the picture, or for marking the object nodes and the connection nodes themselves. In the present embodiment, one-hot encodings l_o and l_c are adopted to represent the labels of the object nodes and the connection nodes.
Step S2012, respectively extracting first node features of the object node set and first connection features of the connection node set by adopting a graph neural network. In this embodiment, the YOLO-V8 algorithm is used as the graph neural network for feature extraction. For the object node set V_o, salient block features f_o are extracted as the first node features. For the connection node set V_c, features f_c of the intersection regions between salient blocks are extracted as the first connection features. In this way, node feature learning of the object nodes and connection nodes is achieved. It should be noted that the feature extraction of the object node set and the connection node set by the YOLO-V8 algorithm may be implemented by referring to the feature extraction method of that algorithm in the related art, which is not described herein.
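The text does not spell out how the intersection region between two salient blocks is obtained; a common convention, sketched here under that assumption with hypothetical box coordinates, is to intersect the two objects' bounding boxes and extract the connection feature from the overlapping area:

```python
def box_intersection(box_a, box_b):
    # Boxes as (x1, y1, x2, y2). The connection node's feature region is
    # taken as the area where the two objects' salient blocks overlap.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    if x2 <= x1 or y2 <= y1:
        return None  # the salient blocks do not overlap
    return (x1, y1, x2, y2)

assert box_intersection((0, 0, 4, 4), (2, 2, 6, 6)) == (2, 2, 4, 4)
assert box_intersection((0, 0, 1, 1), (2, 2, 3, 3)) is None
```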
Step S2013, determining node label semantic features and connection label semantic features based on the product of a weight matrix of the graph neural network and the node labels; the node label semantic features may be represented as s_o = W_o l_o and the connection label semantic features as s_c = W_c l_c, where W_o and W_c represent weight matrices of the graph neural network.
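Because the node labels are one-hot codes, multiplying them by a weight matrix amounts to selecting one column of that matrix, i.e. a learnable label embedding. A small sketch with hypothetical weight values:

```python
def one_hot(index, num_classes):
    # One-hot label code for a node category, as produced by Neural Motifs.
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

def label_semantic_feature(weight_matrix, label):
    # Matrix-vector product; with a one-hot label this selects exactly one
    # column of the weight matrix, i.e. an embedding lookup for the category.
    return [sum(w * l for w, l in zip(row, label)) for row in weight_matrix]

# Hypothetical 2x3 weight matrix (2-dim semantic features, 3 label classes).
W = [[0.5, -1.0, 2.0],
     [1.5,  0.0, -0.5]]
label = one_hot(2, 3)
assert label_semantic_feature(W, label) == [2.0, -0.5]  # third column of W
```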
Step S2014, determining a first feature of the first modality data based on the joint optimization of the first node feature and the node tag semantic feature and the first connection feature and the connection tag semantic feature. Wherein, the joint optimization specifically comprises the following steps: based on the activation function and the weight matrix of the graph neural network, the joint processing of the first node characteristic and the node label semantic characteristic is carried out to obtain an initial joint node characteristic; based on the activation function and the weight matrix of the graph neural network, the joint processing of the first connection feature and the connection tag semantic feature is carried out to obtain an initial joint connection feature; and optimizing the joint node characteristics based on the initial joint node characteristics, optimizing the joint connection characteristics based on the initial joint connection characteristics and the initial joint node characteristics of the corresponding connection, and obtaining optimized joint node characteristics and joint connection characteristics, wherein the joint node characteristics and the joint connection characteristics form a first characteristic.
In particular, the initial joint node feature may be expressed as h_o^(0) = relu(W_1 [f_o; s_o]) and the initial joint connection feature as h_c^(0) = relu(W_2 [f_c; s_c]), where W_1 and W_2 represent weight matrices of the graph neural network, [·; ·] represents concatenation, and relu represents the activation function. In the joint optimization of features, h_o^(0) and h_c^(0) are updated iteratively, wherein the object nodes update themselves and the connection nodes are updated by aggregating their neighbor object nodes. Specifically, at update time the joint node feature is h_o = FC(h_o^(0)) and the joint connection feature is h_c = FC([h_c^(0); h_i^(0); h_j^(0)]), where h_i^(0) and h_j^(0) are the initial joint node features of the two object nodes connected by the connection node, and FC represents a fully connected layer followed by a relu activation function. The joint node features and joint connection features obtained through this update-optimization process constitute the first features.
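The joint processing described above, concatenating a visual feature with its label-semantic feature, projecting through a weight matrix with relu, and then updating a connection node from its two endpoint object nodes, can be sketched as follows (toy dimensions and weight values; the exact update equations are only partially reproduced in the text, so this is an illustrative instance, not the patent's implementation):

```python
def relu(x):
    return [max(v, 0.0) for v in x]

def linear(W, x):
    # Dense layer over plain lists: matrix-vector product.
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def initial_joint_feature(W, visual_feat, semantic_feat):
    # Concatenate the visual feature with the label-semantic feature,
    # project through the weight matrix, and apply the activation.
    return relu(linear(W, visual_feat + semantic_feat))

def update_connection(W_fc, h_conn0, h_i0, h_j0):
    # A connection node aggregates the initial joint features of its two
    # endpoint object nodes through a fully connected layer plus relu.
    return relu(linear(W_fc, h_conn0 + h_i0 + h_j0))

# Toy 2-dim features and a 2x4 projection matrix (hypothetical values).
W1 = [[1.0, 0.0, 1.0, 0.0],
      [0.0, -1.0, 0.0, 1.0]]
h_node0 = initial_joint_feature(W1, [0.5, 0.5], [0.25, 0.25])
assert h_node0 == [0.75, 0.0]  # second component clipped by relu
```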
Specifically, the method for extracting the second characteristic of the second modal data by adopting the large language model comprises the following steps:
step S2015, acquiring a word order path of the second mode data as an object node and semantic features of the second mode data as a connection node; specifically, in this embodiment, text data is used as the second modality data, and the word order feature of the text data is used as the object nodeSemantic features of text data are used as connection nodes. For example, one piece of text data of the text data contains +.>Individual words, then this->The word sequence of the word in the text data is a word sequence path, and at this time, the first node characteristic corresponding to the text data is 1 length +.>Is a word order path of (a). The semantic path is obtained by extracting keywords such as a main predicate in the word order path and deleting modifier words in the word order path. When the main predicate in the word order path is extracted as a semantic path, the length of the semantic path is 3. In this embodiment, will be made of +.>The semantic set formed by the semantic paths serves as a second connection feature.
Step S2016, extracting word-order features of the object nodes and semantic features of the connection nodes by adopting the large language model, wherein the word-order features and the semantic features form the second features. In this embodiment, the open-source pre-trained model GPT-2 is used as the large language model for feature extraction; that is, word-order features f_w of the object nodes and semantic features f_s of the connection nodes are extracted as the second features. The specific feature extraction method of the large language model may refer to the feature extraction methods in the related art, which are not described herein.
Step S202, constructing an intermediate space based on the similarity of the first feature and the second feature and the corresponding loss function.
Specifically, the step S202 includes:
step S2021, calculating the similarity of the first feature and the second feature; the calculation process of the similarity specifically comprises the steps of calculating first similarity of joint node characteristics and language sequence characteristics by adopting an inner product; calculating a second similarity of the joint connection feature and the semantic feature by adopting an inner product; and adding the first similarity and the second similarity to obtain the similarity of the first feature and the second feature.
That is, the first similarity of the joint node features and the word-order features may be calculated by the formula s_1 = <f_w, h_o>, where f_w represents the word-order feature, h_o represents the joint node feature, f_w and h_o are both vectors of dimension d, and <·, ·> represents the inner-product operation. The second similarity of the joint connection features and the semantic features may be calculated by the formula s_2 = <f_s, h_c>, where f_s represents the semantic feature and h_c represents the joint connection feature. The similarity of the first features and the second features may then be calculated by the formula s = s_1 + s_2.
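The similarity computation above, two inner products summed, can be sketched directly (toy feature values):

```python
def inner(u, v):
    # Inner product of two d-dimensional feature vectors.
    return sum(a * b for a, b in zip(u, v))

def cross_modal_similarity(f_order, h_node, f_sem, h_conn):
    # s = <f_order, h_node> + <f_sem, h_conn>: the first similarity between
    # word-order and joint node features, plus the second similarity between
    # semantic and joint connection features.
    s1 = inner(f_order, h_node)
    s2 = inner(f_sem, h_conn)
    return s1 + s2

# Hypothetical 3-dimensional features: s1 = 2 + 3 = 5, s2 = 7, so s = 12.
assert cross_modal_similarity([1, 0, 1], [2, 5, 3], [0, 1, 0], [4, 7, 9]) == 12
```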
In step S2022, a loss function is constructed based on the maximization of the similarity, and the similarity and the loss function constitute the intermediate space. The loss function is expressed using the following formula:
in the method, in the process of the invention,representing similarity (S)>And->Representing hyper-parameters->Representing anchor points->Representing a positive sample, +.>Representing a first negative sample, +.>Representing the second negative sample, +.>Representing modality(s)>Representing a first modality->Representing a second modality.
Correspondingly, in the embodiment of the present invention, x may be understood as a feature extracted from the first modality data or the second modality data. That is, when the subscript of x is i or j, x may be understood as a first feature extracted from the first modality data (since the first feature comprises two features, i and j are used for them respectively; the same applies to the second feature), and when the subscript of x is k or l, x may be understood as a second feature extracted from the second modality data; alternatively, when the subscript of x is i or j, x may be understood as a second feature extracted from the second modality data, and when the subscript of x is k or l, as a first feature extracted from the first modality data. For better distinction, a superscript M is set for x: when M is 1, the corresponding x is a feature of the first modality data, and when M is 2, the corresponding x is a feature of the second modality data. The subscripts i, j, k, l of M correspond to, i.e. have the same meaning as, the subscripts of x.
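The exact closed form of the loss is not recoverable from this text, so the following is a hypothetical instantiation consistent with the stated ingredients, an anchor, a positive sample, two negative samples and two temperature-like hyper-parameters, written in the style of an InfoNCE contrastive loss that is minimized when the anchor-positive similarity is maximized:

```python
import math

def contrastive_loss(s_pos, s_neg1, s_neg2, tau1=0.07, tau2=0.07):
    # Hypothetical form: negative log of the positive pair's softmax share.
    # tau1 tempers the positive and first-negative terms, tau2 the second
    # negative; a high s_pos relative to the negatives drives the loss to 0.
    pos = math.exp(s_pos / tau1)
    den = pos + math.exp(s_neg1 / tau1) + math.exp(s_neg2 / tau2)
    return -math.log(pos / den)

well_aligned = contrastive_loss(s_pos=0.9, s_neg1=0.1, s_neg2=0.1)
badly_aligned = contrastive_loss(s_pos=0.1, s_neg1=0.9, s_neg2=0.9)
assert 0.0 < well_aligned < badly_aligned
```

Any loss with this monotonic behavior (lower when the aligned cross-modal pair is more similar than the mismatched pairs) would serve the parameter-tuning step S103.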
Step S203, adopting a loss function in the intermediate space to iteratively optimize parameters in the graph neural network and the large language model; for details, please refer to step S103 of the embodiment shown in FIG. 1, which is not repeated here.
Step S204, performing knowledge migration in the intermediate space based on the graph modeling, the graph neural network after parameter iterative optimization, and the features extracted by the large language model; for details, please refer to step S104 of the embodiment shown in FIG. 1, which is not repeated here.
As a specific application embodiment of the invention, as shown in FIG. 2, taking the visual domain and the text domain as examples, the power cross-modal bidirectional knowledge migration method constructed based on the intermediate space is as follows. First, relational modeling is performed in the two domains respectively: in the visual domain, a relational modeling method (Neural Motifs) is adopted to extract the object node set, the connection node set and the node labels of the first modality data; in the text domain, the word order path of the second modality data is acquired as the object nodes and the semantic features of the second modality data as the connection nodes. Then, a graph neural network based on heterogeneous features (the YOLO-V8 algorithm) and a contextual feature representation based on word order and semantics (a large language model) are adopted to extract the relational features of vision and text, respectively. Finally, the intermediate space is constructed, the extracted relational features are mapped into the intermediate space, and similarity measurement is performed, thereby realizing bidirectional knowledge migration from the visual domain to the text domain and from the text domain to the visual domain.
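As a hedged sketch of the final migration step (the matrix shapes and the nearest-neighbour matching rule are illustrative assumptions, not the patented procedure), bidirectional migration in the intermediate space can be pictured as mutual matching on the pairwise similarity matrix:

```python
import numpy as np

def bidirectional_match(visual_feats, text_feats):
    """visual_feats: (n, d) relation features from the visual domain;
    text_feats: (m, d) relation features from the text domain, both already
    mapped into the intermediate space. Returns the index of the most
    similar text relation for each visual relation, and vice versa."""
    sims = visual_feats @ text_feats.T  # pairwise inner-product similarities
    v2t = sims.argmax(axis=1)           # visual -> text migration
    t2v = sims.argmax(axis=0)           # text -> visual migration
    return v2t, t2v

v = np.array([[1.0, 0.0], [0.0, 1.0]])
t = np.array([[0.9, 0.1], [0.2, 0.8]])
print(bidirectional_match(v, t))  # (array([0, 1]), array([0, 1]))
```

Because the same similarity matrix serves both directions, knowledge can flow from the visual domain to the text domain and back without retraining either encoder.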
The embodiment also provides a power cross-modal bidirectional knowledge migration device constructed based on the intermediate space, which is used for implementing the above embodiment and its preferred implementations; what has already been described is not repeated here. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the means described in the following embodiments are preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
The embodiment provides a power cross-mode bidirectional knowledge migration device constructed based on an intermediate space, as shown in fig. 3, including:
the feature extraction module 31 is configured to extract a first feature of the first modality data by using graph modeling and a graph neural network, and extract a second feature of the second modality data by using a large language model;
a space construction module 32 for constructing an intermediate space based on the similarity of the first feature and the second feature and the corresponding loss function;
an optimization module 33, configured to iteratively optimize parameters in the graph neural network and the large language model by using a loss function in the intermediate space;
the knowledge migration module 34 is configured to perform knowledge migration in the intermediate space based on the graph modeling, the graph neural network after parameter iterative optimization, and the features extracted by the large language model.
In an alternative embodiment, the first modality data is image data, and the feature extraction module includes: the first extraction module, used for extracting an object node set, a connection node set and node labels of the first modality data by adopting graph modeling; the second extraction module, used for respectively extracting the first node features of the object node set and the first connection features of the connection node set by adopting the graph neural network; the label semantic determining module, used for determining node label semantic features and connection label semantic features based on the product of the weight matrix of the graph neural network and the node labels; and the extraction optimization module, used for determining the first feature of the first modality data based on joint optimization of the first node feature with the node label semantic feature and of the first connection feature with the connection label semantic feature.
In an alternative embodiment, the extraction optimization module is specifically configured to: perform joint processing of the first node feature and the node label semantic feature based on the activation function and the weight matrix of the graph neural network to obtain an initial joint node feature; perform joint processing of the first connection feature and the connection label semantic feature based on the activation function and the weight matrix of the graph neural network to obtain an initial joint connection feature; and optimize the joint node feature based on the initial joint node feature, and optimize the joint connection feature based on the initial joint connection feature and the initial joint node features of the correspondingly connected nodes, obtaining optimized joint node features and joint connection features, wherein the joint node feature and the joint connection feature constitute the first feature.
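The joint processing performed by this module can be sketched as follows. The concatenation layout and the choice of ReLU as the activation are assumptions for illustration only; the patent fixes neither:

```python
import numpy as np

def joint_process(feature, label_semantic, W):
    """sigma(W . [feature ; label_semantic]): combine a node (or connection)
    feature with its label semantic feature through the GNN weight matrix W
    and an activation function (ReLU assumed here)."""
    z = W @ np.concatenate([feature, label_semantic])
    return np.maximum(z, 0.0)  # ReLU

# Toy weight matrix mapping a 4-dim concatenation back to 2 dims.
W = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])
out = joint_process(np.array([1.0, -2.0]), np.array([0.5, 0.5]), W)
print(out)  # [1.5 0. ]
```

The same routine applies to node features (with node label semantics) and to connection features (with connection label semantics), yielding the initial joint features that are then iteratively refined.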
In an alternative embodiment, the second modality data is text data, and the feature extraction module further includes: the path acquisition module is used for acquiring a word order path of the second mode data as an object node and semantic features of the second mode data as a connection node; and the third extraction module is used for extracting the word sequence characteristics of the object nodes and the semantic characteristics of the connection nodes by adopting the large language model, and the word sequence characteristics and the semantic characteristics form second characteristics.
In an alternative embodiment, the space building module includes: the similarity calculation module is used for calculating the similarity of the first feature and the second feature; and the construction submodule is used for constructing a loss function based on the maximization of the similarity, and the similarity and the loss function form an intermediate space.
In an alternative embodiment, the loss function is expressed using the following formula:

$\mathcal{L}=-\log\dfrac{\exp\bigl(s(x_i^{M_i},x_j^{M_j})/\tau\bigr)}{\exp\bigl(s(x_i^{M_i},x_j^{M_j})/\tau\bigr)+\lambda\exp\bigl(s(x_i^{M_i},x_k^{M_k})/\tau\bigr)+\lambda\exp\bigl(s(x_i^{M_i},x_l^{M_l})/\tau\bigr)}$

where $s(\cdot,\cdot)$ represents the similarity, $\lambda$ and $\tau$ represent hyper-parameters, $x_i$ represents the anchor, $x_j$ represents the positive sample, $x_k$ represents the first negative sample, $x_l$ represents the second negative sample, $M$ represents the modality, $M=1$ represents the first modality, and $M=2$ represents the second modality.
In an alternative embodiment, the first feature comprises a joint node feature and a joint connection feature, and the second feature comprises a word order feature and a semantic feature; the similarity calculation module is specifically configured to: calculate a first similarity of the joint node feature and the word order feature by adopting an inner product; calculate a second similarity of the joint connection feature and the semantic feature by adopting an inner product; and add the first similarity and the second similarity to obtain the similarity of the first feature and the second feature.
Further functional descriptions of the above respective modules and units are the same as those of the above corresponding embodiments, and are not repeated here.
The embodiment of the invention also provides a computer device equipped with the electric power cross-modal bidirectional knowledge migration device constructed based on the intermediate space shown in FIG. 3.
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a computer device according to an alternative embodiment of the present invention. As shown in FIG. 4, the computer device includes: one or more processors 10, a memory 20, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are communicatively coupled to each other using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the computer device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In some alternative embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories and multiple types of memory. Also, multiple computer devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 10 is illustrated in FIG. 4.
The processor 10 may be a central processor, a network processor, or a combination thereof. The processor 10 may further include a hardware chip, among others. The hardware chip may be an application specific integrated circuit, a programmable logic device, or a combination thereof. The programmable logic device may be a complex programmable logic device, a field programmable gate array, a general-purpose array logic, or any combination thereof.
Wherein the memory 20 stores instructions executable by the at least one processor 10 to cause the at least one processor 10 to perform a method for implementing the embodiments described above.
The memory 20 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and at least one application program required by a function, and the storage data area may store data created according to the use of the computer device, and the like. In addition, the memory 20 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some alternative embodiments, the memory 20 may optionally include memory located remotely from the processor 10, which may be connected to the computer device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Memory 20 may include volatile memory, such as random access memory; the memory may also include non-volatile memory, such as flash memory, hard disk, or solid state disk; the memory 20 may also comprise a combination of the above types of memories.
The computer device also includes a communication interface 30 for the computer device to communicate with other devices or communication networks.
The embodiments of the present invention also provide a computer-readable storage medium. The method according to the embodiments of the present invention described above may be implemented in hardware or firmware, or as computer code recorded on a storage medium, or as computer code originally stored in a remote storage medium or a non-transitory machine-readable storage medium and downloaded through a network to be stored in a local storage medium, so that the method described herein may be processed by such software stored on a storage medium using a general-purpose computer, a special-purpose processor, or programmable or special-purpose hardware. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, a flash memory, a hard disk, a solid-state disk, or the like; further, the storage medium may also comprise a combination of the above types of memory. It will be appreciated that a computer, processor, microprocessor controller or programmable hardware includes a storage element that can store or receive software or computer code which, when accessed and executed by the computer, processor or hardware, implements the methods illustrated by the above embodiments.
Although embodiments of the present invention have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope of the invention as defined by the appended claims.

Claims (10)

1. An electric power cross-modal bidirectional knowledge migration method based on intermediate space construction, which is characterized by comprising the following steps:
extracting first characteristics of first modal data by adopting graph modeling and a graph neural network, and extracting second characteristics of second modal data by adopting a large language model;
constructing an intermediate space based on the similarity of the first feature and the second feature and the corresponding loss function;
adopting a loss function in an intermediate space to iteratively optimize parameters in the graph neural network and the large language model;
and carrying out knowledge migration in an intermediate space based on the characteristics extracted by the graph neural network after graph modeling and parameter iterative optimization and the large language model.
2. The method of claim 1, wherein the first modality data is image data, and wherein extracting the first feature of the first modality data using graph modeling and a graph neural network comprises:
extracting an object node set, a connection node set and a node label of the first modal data by adopting graph modeling;
respectively extracting first node characteristics of the object node set and first connection characteristics of the connection node set by adopting a graph neural network;
determining node label semantic features and connection label semantic features based on the product of the weight matrix of the graph neural network and the node labels;
the first feature of the first modality data is determined based on joint optimization of the first node feature and the node tag semantic feature and the first connection feature and the connection tag semantic feature.
3. The method of claim 2, wherein determining the first feature of the first modality data based on joint optimization of the first node feature and the node tag semantic feature and the first connection feature and the connection tag semantic feature comprises:
based on the activation function and the weight matrix of the graph neural network, the joint processing of the first node characteristic and the node label semantic characteristic is carried out to obtain an initial joint node characteristic;
based on the activation function and the weight matrix of the graph neural network, the joint processing of the first connection feature and the connection tag semantic feature is carried out to obtain an initial joint connection feature;
optimizing the joint node characteristics based on the initial joint node characteristics, optimizing the joint connection characteristics based on the initial joint connection characteristics and the initial joint node characteristics of the corresponding connection, and obtaining optimized joint node characteristics and joint connection characteristics, wherein the joint node characteristics and the joint connection characteristics form a first characteristic.
4. The method of claim 1, wherein the second modality data is text data, and wherein extracting the second feature of the second modality data using the large language model comprises:
acquiring a word order path of the second-mode data as an object node and semantic features of the second-mode data as a connection node;
and extracting the word sequence characteristics of the object nodes and the semantic characteristics of the connection nodes by adopting a large language model, wherein the word sequence characteristics and the semantic characteristics form second characteristics.
5. The method of claim 1, wherein constructing the mediating space based on the similarities of the first feature and the second feature and the corresponding loss function comprises:
calculating the similarity of the first feature and the second feature;
and constructing a loss function based on the maximization of the similarity, wherein the similarity and the loss function form an intermediate space.
6. The method of claim 5, wherein the loss function is expressed by the following formula:

$\mathcal{L}=-\log\dfrac{\exp\bigl(s(x_i^{M_i},x_j^{M_j})/\tau\bigr)}{\exp\bigl(s(x_i^{M_i},x_j^{M_j})/\tau\bigr)+\lambda\exp\bigl(s(x_i^{M_i},x_k^{M_k})/\tau\bigr)+\lambda\exp\bigl(s(x_i^{M_i},x_l^{M_l})/\tau\bigr)}$

where $s(\cdot,\cdot)$ represents the similarity, $\lambda$ and $\tau$ represent hyper-parameters, $x_i$ represents the anchor, $x_j$ represents the positive sample, $x_k$ represents the first negative sample, $x_l$ represents the second negative sample, $M$ represents the modality, $M=1$ represents the first modality, and $M=2$ represents the second modality.
7. The method of claim 5, wherein the first features comprise joint node features and joint connection features and the second features comprise word order features and semantic features; calculating the similarity of the first feature and the second feature comprises:
calculating a first similarity of the joint node features and the word order features by adopting an inner product;
calculating a second similarity of the joint connection feature and the semantic feature by adopting an inner product;
and adding the first similarity and the second similarity to obtain the similarity of the first feature and the second feature.
8. An apparatus for power cross-modal bi-directional knowledge migration based on intermediate space construction, the apparatus comprising:
the feature extraction module is used for extracting first features of the first modal data by adopting graph modeling and a graph neural network, and extracting second features of the second modal data by adopting a large language model;
the space construction module is used for constructing an intermediate space based on the similarity of the first feature and the second feature and the corresponding loss function;
the optimization module is used for iteratively optimizing parameters in the graph neural network and the large language model by adopting a loss function in an intermediate space;
and the knowledge migration module is used for carrying out knowledge migration in the intermediate space based on the graph modeling, the graph neural network after parameter iteration optimization and the characteristics extracted by the large language model.
9. A computer device, comprising:
a memory and a processor communicatively coupled to each other, the memory having stored therein computer instructions that, upon execution, perform the method of power cross-modal bidirectional knowledge migration based on intermediate space construction of any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of power cross-modal bidirectional knowledge migration based on intermediate space construction of any one of claims 1 to 7.
CN202410232948.3A 2024-03-01 2024-03-01 Electric power cross-mode bidirectional knowledge migration method based on intermediate space construction Active CN117829265B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410232948.3A CN117829265B (en) 2024-03-01 2024-03-01 Electric power cross-mode bidirectional knowledge migration method based on intermediate space construction


Publications (2)

Publication Number Publication Date
CN117829265A 2024-04-05
CN117829265B 2024-06-18

Family

ID=90524381

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410232948.3A Active CN117829265B (en) 2024-03-01 2024-03-01 Electric power cross-mode bidirectional knowledge migration method based on intermediate space construction

Country Status (1)

Country Link
CN (1) CN117829265B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110849627A (en) * 2019-11-27 2020-02-28 哈尔滨理工大学 Width migration learning network and rolling bearing fault diagnosis method based on same
CN111885000A (en) * 2020-06-22 2020-11-03 网宿科技股份有限公司 Network attack detection method, system and device based on graph neural network
CN111985333A (en) * 2020-07-20 2020-11-24 中国科学院信息工程研究所 Behavior detection method based on graph structure information interaction enhancement and electronic device
CN115952277A (en) * 2022-12-20 2023-04-11 上海人工智能创新中心 Knowledge relationship based retrieval enhancement method, model, device and storage medium
WO2023093205A1 (en) * 2021-11-26 2023-06-01 中国银联股份有限公司 Entity tag association prediction method and device and computer readable storage medium
CN116932873A (en) * 2022-04-02 2023-10-24 腾讯科技(深圳)有限公司 Video account recommending method, device, equipment, storage medium and program product
CN117111464A (en) * 2023-08-08 2023-11-24 中国科学院软件研究所 Self-adaptive fault diagnosis method under multiple working conditions


Non-Patent Citations (6)

Title
MONKA, SEBASTIAN et al.: "A Survey on Visual Transfer Learning Using Knowledge Graphs", Semantic Web, vol. 13, no. 3, 6 April 2022 (2022-04-06), pages 477-510 *
BI, WENDONG et al.: "Bridged-GNN: Knowledge Bridge Learning for Effective Knowledge Transfer", Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 21 October 2023 (2023-10-21), page 99, XP059407859, DOI: 10.1145/3583780.3614796 *
LI, JIACHU et al.: "Construction of a Transformer Fault Knowledge Graph Based on Bidirectional Graph Neural Networks", Electric Power Science and Engineering, vol. 39, no. 9, 28 September 2023 (2023-09-28), pages 38-45 *
SHEN, ZHANJIAN: "Research on Cross-Modal Hashing Methods for Large-Scale Multimedia Retrieval", China Masters' Theses Full-text Database, Information Science and Technology Series (monthly), no. 2, 15 February 2021 (2021-02-15), pages 138-2533 *
HU, YUE: "Interaction Relationship Modeling Based on Graph Neural Networks", China Masters' Theses Full-text Database, Information Science and Technology Series (monthly), no. 1, 15 January 2022 (2022-01-15), pages 138-1284 *
ZHENG, MIN et al.: "Stability of Nonlinear Size-Structured Populations with Boundary Control and Migration", Bulletin of Science and Technology, vol. 29, no. 7, 15 July 2013 (2013-07-15), pages 1-7 *

Also Published As

Publication number Publication date
CN117829265B (en) 2024-06-18

Similar Documents

Publication Publication Date Title
JP7322044B2 (en) Highly Efficient Convolutional Networks for Recommender Systems
WO2022007823A1 (en) Text data processing method and device
US10755048B2 (en) Artificial intelligence based method and apparatus for segmenting sentence
CN110827929B (en) Disease classification code recognition method and device, computer equipment and storage medium
US11593642B2 (en) Combined data pre-process and architecture search for deep learning models
US11216701B1 (en) Unsupervised representation learning for structured records
JP2017049684A (en) Method for learning classification model, computer system, and computer program
WO2021051574A1 (en) English text sequence labelling method and system, and computer device
CN111709240A (en) Entity relationship extraction method, device, equipment and storage medium thereof
CN113906452A (en) Low resource entity resolution with transfer learning
US20170116521A1 (en) Tag processing method and device
CN112836502B (en) Financial field event implicit causal relation extraction method
WO2022174496A1 (en) Data annotation method and apparatus based on generative model, and device and storage medium
CN112749300B (en) Method, apparatus, device, storage medium and program product for video classification
CN113656587B (en) Text classification method, device, electronic equipment and storage medium
US20220100772A1 (en) Context-sensitive linking of entities to private databases
US20220100967A1 (en) Lifecycle management for customized natural language processing
CN111444335B (en) Method and device for extracting central word
US11496775B2 (en) Neural network model compression with selective structured weight unification
JP2024502400A (en) Automatic depiction and extraction of tabular data in portable document formats using graph neural networks
US11935271B2 (en) Neural network model compression with selective structured weight unification
CN113657411A (en) Neural network model training method, image feature extraction method and related device
CN117829265B (en) Electric power cross-mode bidirectional knowledge migration method based on intermediate space construction
JP7236501B2 (en) Transfer learning method and computer device for deep learning model based on document similarity learning
CN113392929B (en) Biological sequence feature extraction method based on word embedding and self-encoder fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant