CN117808084A - Pre-selection method based on graph reduction brief representation and graph neural network - Google Patents
Pre-selection method based on graph reduction brief representation and graph neural network
- Publication number
- CN117808084A CN117808084A CN202311490926.9A CN202311490926A CN117808084A CN 117808084 A CN117808084 A CN 117808084A CN 202311490926 A CN202311490926 A CN 202311490926A CN 117808084 A CN117808084 A CN 117808084A
- Authority
- CN
- China
- Prior art keywords
- graph
- node
- neural network
- information
- candidate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000013528 artificial neural network Methods 0.000 title claims abstract description 27
- 238000010187 selection method Methods 0.000 title claims abstract description 16
- 230000009467 reduction Effects 0.000 title claims abstract description 11
- 239000013598 vector Substances 0.000 claims abstract description 44
- 238000003062 neural network model Methods 0.000 claims abstract description 17
- 230000007246 mechanism Effects 0.000 claims abstract description 16
- 238000010586 diagram Methods 0.000 claims abstract description 14
- 238000011176 pooling Methods 0.000 claims abstract description 8
- 230000006870 function Effects 0.000 claims description 19
- 238000000034 method Methods 0.000 claims description 17
- 230000002776 aggregation Effects 0.000 claims description 7
- 238000004220 aggregation Methods 0.000 claims description 7
- 238000012546 transfer Methods 0.000 claims description 5
- 230000004931 aggregating effect Effects 0.000 claims description 3
- 230000008569 process Effects 0.000 claims description 3
- 239000002243 precursor Substances 0.000 claims 3
- 238000013473 artificial intelligence Methods 0.000 abstract description 4
- 125000002015 acyclic group Chemical group 0.000 description 3
- 238000010835 comparative analysis Methods 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 239000010410 layer Substances 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 230000003044 adaptive effect Effects 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000013527 convolutional neural network Methods 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 238000004880 explosion Methods 0.000 description 1
- 238000010801 machine learning Methods 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 230000000306 recurrent effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 230000006403 short-term memory Effects 0.000 description 1
- 239000002356 single layer Substances 0.000 description 1
- 238000012549 training Methods 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
- G06N5/013—Automatic theorem proving
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of artificial intelligence and discloses a premise selection method based on graph reduction representation and a graph neural network, comprising the following steps. Step 1: obtain a simplified first-order logic formula graph by identifying and deleting consecutive repeated quantifiers. Step 2: based on the simplified formula graph, propose a term-walk graph neural network model with an attention mechanism; following the term-walk pattern, the model aggregates the information of nodes located at the upper, middle and lower positions of term-walk triples, introduces an attention mechanism to compute the term-walk feature weights of each node, combines the weights with the node information to generate new node embedding vectors, and obtains the final formula graph feature vector through global average pooling. Step 3: input the graph feature vectors of the candidate premises and the given conjecture into a binary classifier, thereby classifying the candidate premises. The invention can perform premise selection effectively.
Description
Technical Field
The present invention relates to the field of artificial intelligence, and in particular to a premise selection method based on graph reduction representation and a graph neural network.
Background Art
Automated theorem provers (ATPs) are a core and cutting-edge direction in the field of artificial intelligence. As an important component of artificial intelligence systems, they are widely used in expert systems, circuit design, compilers, software verification and other fields. An ATP first formalizes a conjecture and its premises as logical formulas and then feeds these formulas into the prover, so that the conjecture is deduced automatically from the premises. ATPs prove new problems by iteratively searching all pending clause sets in a problem library, which leads to an exponential explosion of the search space on larger libraries. Premise selection offers a new way to address this problem: before the logical formulas are passed to the ATP, the formulas that are likely to help prove the conclusion of the given problem are identified and selected.
An effective premise selection method can greatly improve the capability of ATPs. Early premise selection methods were mainly hand-designed heuristics based on symbolic comparative analysis: symbolic and structural features such as clause depth and symbol counts were extracted from the input formulas and compared in order to screen, from the premise set, the premises most relevant to the conclusion. Such methods are limited to hand-designed features. With the growth of computing power, machine learning methods have become an effective alternative for premise selection. They naturally cast the problem as classification or ranking and capture the deep characteristics of logical formulas better than the earlier methods; examples include convolutional neural networks, long short-term memory networks and gated recurrent neural networks.
On this basis, because logical formulas can be naturally represented as graphs, and features that incorporate graph topology can fully reflect the characteristics of logical formulas, the combination of graph neural networks and automated theorem proving has become a popular research topic. Although existing premise selection methods based on graph neural networks can improve premise classification to a certain extent, they still have some shortcomings:
(1) Logical formula graphs contain rich syntactic and semantic properties. Most premise selection methods ignore the influence that different graph representations of logical formulas have on the graph neural network model, so the network cannot capture the internal and external information of the formulas well;
(2) Existing graph neural network models usually generate features that retain more formula information by aggregating information from neighbouring or other nodes. These features often contain a large amount of node information, so the generated formula features may be affected by unimportant information on the graph and thus fail to adequately represent the logical formula.
Summary of the Invention
The object of the present invention is to provide a premise selection method based on graph reduction representation and a graph neural network, which can assign different weights to the nodes of the graph and thereby better encode the graph features of first-order logic formulas.
The premise selection method based on graph reduction representation and a graph neural network according to the present invention comprises the following steps:
Step 1: obtain a simplified first-order logic formula graph by identifying and deleting consecutive repeated quantifiers;
Step 2: based on the simplified formula graph, propose a term-walk graph neural network model with an attention mechanism; following the term-walk pattern, the model aggregates the information of the nodes located at the upper, middle and lower positions of term-walk triples, introduces an attention mechanism to compute the term-walk feature weights of each node, combines the weights with the node information to generate new node embedding vectors, and obtains the final formula graph feature vector through global average pooling;
Step 3: input the graph feature vectors of the candidate premises and the given conjecture into a binary classifier, thereby classifying the candidate premises.
Preferably, in Step 1, on the basis of directed acyclic graphs (DAGs), quantifiers that are identical and consecutive are merged, so that a simplified first-order logic formula graph is obtained.
Preferably, in Step 2, the term-walk graph neural network model with an attention mechanism is specified as follows:
(1) Input a graph G = (V, E), where V is the set of all nodes of the graph and E is the set of all edges. Each node v ∈ V is first assigned an initial embedding xv, and the message passing process is then realized through the node state vectors hv(k) generated over k rounds of iteration, where k ∈ {1,…,K}:
hv(0) = FV(xv)
where hv(0) ∈ R^dh is a fixed-size initial state vector, dh is the output dimension of the node embedding vectors, and FV is a lookup table;
(2) Node information is gathered according to the term-walk pattern. For each node v of the input graph G = (V, E), let Tu(v), Tm(v) and Tl(v) denote the sets of term-walk features in which node v is located at the upper, middle and lower position respectively:
Tu(v) = {(v,u,w) | (v,u),(u,w) ∈ E},
Tm(v) = {(u,v,w) | (u,v),(v,w) ∈ E},
Tl(v) = {(u,w,v) | (u,w),(w,v) ∈ E}
where u and w are arbitrary nodes of the graph;
To distinguish occurrences of node v at the different positions of the term-walk features, the state vectors in the triples of Tu(v), Tm(v) and Tl(v) are concatenated, and the model then gathers the information of node v from the different positions with respect to Tu(v), Tm(v) and Tl(v) and the aggregation functions Fu, Fm and Fl, for example for the upper position
tvu(k) = (1 / |Tu(v)|) Σ(v,u,w)∈Tu(v) Fu([hv(k-1); hu(k-1); hw(k-1)])
and analogously tvm(k) and tvl(k) for the middle and lower positions;
where the semicolon in the formula denotes the concatenation of the different node state vectors, and |Tu(v)|, |Tm(v)| and |Tl(v)| denote the numbers of triples in the sets Tu(v), Tm(v) and Tl(v) respectively;
The model introduces an attention mechanism to balance the node state information hv(k-1) with the term-walk feature information tvu(k), tvm(k) and tvl(k) of the node. To reduce the complexity of the model structure, tvu(k), tvm(k) and tvl(k) are used directly as the attention scores of the term-walk feature information with respect to the node information; these scores are normalized with the softmax function to obtain the attention weights αvu, αvm and αvl, and the aggregation functions FU, FM and FL finally yield the balanced node information t'vu(k), t'vm(k) and t'vl(k);
where αvu, αvm and αvl denote the attention weights of the term-walk feature information tvu(k), tvm(k) and tvl(k) with respect to the node information hv(k-1), and vu, vm and vl denote the nodes located at the upper, middle and lower positions of the term-walk triples respectively;
Finally, node v gathers the balanced node information t'vu(k), t'vm(k) and t'vl(k) from the sets Tu(v), Tm(v) and Tl(v), which is summarized into the total node information mv(k);
(3) The node vector hv is transferred and updated using the adjusted total node information mv(k) from the sets Tu(v), Tm(v) and Tl(v) together with the node state vector hv(k-1) of the previous step:
hv(k) = Fsum([hv(k-1); mv(k)])
where Fsum is the node information transfer function;
(4) An average pooling operation AvgPool is applied to all nodes of the logical formula graph, and the final formula graph embedding vector hG is:
hG = AvgPool({hv(K) | v ∈ V})
where AvgPool denotes average pooling.
Preferably, in Step 3, the graph embedding vector pair (hp, hc) of a candidate premise and the given conjecture is input into the classification function Fclass to obtain the usefulness score of the candidate premise under the conjecture:
z = Fclass([hp; hc])
where z ∈ R^2 represents the scores of the candidate premise being useful and useless for the conjecture;
The model normalizes the useful and useless scores of the candidate premise with the softmax function and divides the candidate premises according to the score values:
ŷi = exp(zi) / Σl exp(zl)
where ŷ denotes the normalized useful and useless scores of the premise, zi is the i-th element of z, and zl is the l-th element of z; the useful score and the useless score of a candidate premise correspond to different label attributes, so the label attribute of the candidate premise is determined from the division result, compared with the existing label, and classification is thereby achieved.
The beneficial effects of the present invention are as follows:
1) The simplified first-order logic formula graph representation proposed by the invention, which deletes repeated quantifiers, prevents different graph representations of logical formulas from affecting the graph neural network model, so that the graph neural network can well capture the internal and external information of the logical formulas;
2) The invention proposes a term-walk graph neural network model with an attention mechanism and applies it to the premise selection problem. The model prevents the formula features in the graph neural network from being affected by unimportant information on the graph and assigns different weights to the nodes of the graph, thereby better encoding the graph features of first-order logic formulas.
Brief Description of the Drawings
Figure 1 is a flow chart of a premise selection method based on graph reduction representation and a graph neural network according to the embodiment.
Detailed Description
In order to further understand the content of the present invention, the invention is described in detail with reference to the accompanying drawing and the embodiment. It should be understood that the embodiment only explains the invention and does not limit it.
Embodiment
As shown in Figure 1, this embodiment provides a premise selection method based on graph reduction representation and a graph neural network, comprising the following steps:
Step 1: obtain a simplified first-order logic formula graph by identifying and deleting consecutive repeated quantifiers; the first-order logic formula graphs include first-order logic premise formula graphs and first-order logic conjecture formula graphs;
Most common representations of logical formulas as graphs are extended directed acyclic graphs (DAGs). The general steps are: 1) convert the logical formula into a syntax parse tree similar to that of a programming language; 2) merge identical sub-expressions and leaf nodes of the parse tree; 3) rename the variables in the logical formula.
In order to reduce the size of the graph data and let the DAGs carry more logical properties, the invention proposes directed acyclic graphs based on the deletion of repeated quantifiers (Simplified-DAGs) to represent first-order logic formulas. The Simplified-DAG operation is equivalent to merging, on top of the original DAGs, quantifiers that are identical and consecutive, as illustrated by the sketch below.
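The following is a minimal, illustrative sketch of this quantifier-merging step; the node representation, field names and example formula are assumptions made for illustration and are not taken from the patent itself.

```python
# Illustrative sketch (assumed data structures): a formula tree/DAG node keeps its
# symbol (e.g. '!' for a universal quantifier, '?' for an existential one),
# its bound variables and its children. Identical, consecutive quantifiers
# are merged into a single node that binds the union of the variables.
from dataclasses import dataclass, field
from typing import List

QUANTIFIERS = {"!", "?"}  # universal / existential

@dataclass
class Node:
    symbol: str
    variables: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)

def merge_repeated_quantifiers(node: Node) -> Node:
    """Return a simplified graph in which identical, consecutive quantifiers are merged."""
    # Simplify the children first (bottom-up traversal).
    node.children = [merge_repeated_quantifiers(c) for c in node.children]
    if node.symbol in QUANTIFIERS and len(node.children) == 1:
        child = node.children[0]
        if child.symbol == node.symbol:        # same quantifier, directly nested
            node.variables += child.variables  # merge the bound-variable lists
            node.children = child.children     # skip the redundant quantifier node
    return node

# Example: ! [X] : ! [Y] : p(X, Y)  ->  ! [X, Y] : p(X, Y)
formula = Node("!", ["X"], [Node("!", ["Y"], [Node("p", [], [Node("X"), Node("Y")])])])
simplified = merge_repeated_quantifiers(formula)
print(simplified.symbol, simplified.variables)  # ! ['X', 'Y']
```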
Step 2: based on the simplified logical formula graph, a term-walk graph neural network model with an attention mechanism (Attention-TW-GNN) is proposed; following the term-walk pattern, the model aggregates the information of the nodes located at the upper, middle and lower positions of term-walk triples, introduces an attention mechanism to compute the term-walk feature weights of each node, combines the weights with the node information to generate new node embedding vectors, and obtains the final formula graph feature vector through global average pooling;
In Step 2, the term-walk graph neural network model with an attention mechanism follows the workflow of graph neural network models: it iteratively updates the node embedding information and obtains the final first-order logic formula graph vector through a graph node vector initialization stage, a graph node information aggregation stage, a graph node information transfer stage and a graph feature readout stage (graph pooling). Specifically:
(1) Input a graph G = (V, E), where V is the set of all nodes of the graph and E is the set of all edges. Each node v ∈ V is first assigned an initial embedding xv, and the message passing process is then realized through the node state vectors hv(k) generated over k rounds of iteration, where k ∈ {1,…,K}:
hv(0) = FV(xv)
where hv(0) ∈ R^dh is a fixed-size initial state vector, dh is the output dimension of the node embedding vectors, and FV is a lookup table;
(2) Node information is gathered according to the term-walk pattern. For each node v of the input graph G = (V, E), let Tu(v), Tm(v) and Tl(v) denote the sets of term-walk features in which node v is located at the upper, middle and lower position respectively:
Tu(v) = {(v,u,w) | (v,u),(u,w) ∈ E},
Tm(v) = {(u,v,w) | (u,v),(v,w) ∈ E},
Tl(v) = {(u,w,v) | (u,w),(w,v) ∈ E}
where u and w are arbitrary nodes of the graph;
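The three term-walk sets defined above can be enumerated directly from the edge list; the following sketch is only illustrative, and the adjacency-dictionary representation is an assumption.

```python
from collections import defaultdict

def term_walk_sets(nodes, edges):
    """Enumerate T_u(v), T_m(v), T_l(v) for every node v from a directed edge list."""
    succ = defaultdict(list)                     # successors of each node
    for (a, b) in edges:
        succ[a].append(b)
    # all term walks (a, b, c): directed paths of length two
    walks = [(a, b, c) for (a, b) in edges for c in succ[b]]
    T_u = {v: [t for t in walks if t[0] == v] for v in nodes}   # v at the upper position
    T_m = {v: [t for t in walks if t[1] == v] for v in nodes}   # v at the middle position
    T_l = {v: [t for t in walks if t[2] == v] for v in nodes}   # v at the lower position
    return T_u, T_m, T_l

# Example on a tiny graph:  0 -> 1 -> 2
T_u, T_m, T_l = term_walk_sets([0, 1, 2], [(0, 1), (1, 2)])
print(T_u[0], T_m[1], T_l[2])   # [(0, 1, 2)] [(0, 1, 2)] [(0, 1, 2)]
```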
To distinguish occurrences of node v at the different positions of the term-walk features, the state vectors in the triples of Tu(v), Tm(v) and Tl(v) are concatenated, and the model then gathers the information of node v from the different positions with respect to Tu(v), Tm(v) and Tl(v) and the aggregation functions Fu, Fm and Fl, for example for the upper position
tvu(k) = (1 / |Tu(v)|) Σ(v,u,w)∈Tu(v) Fu([hv(k-1); hu(k-1); hw(k-1)])
and analogously tvm(k) and tvl(k) for the middle and lower positions;
where the semicolon in the formula denotes the concatenation of the different node state vectors, and |Tu(v)|, |Tm(v)| and |Tl(v)| denote the numbers of triples in the sets Tu(v), Tm(v) and Tl(v) respectively;
The model introduces an attention mechanism to balance the node state information hv(k-1) with the term-walk feature information tvu(k), tvm(k) and tvl(k) of the node. To reduce the complexity of the model structure, tvu(k), tvm(k) and tvl(k) are used directly as the attention scores (contributions) of the term-walk feature information with respect to the node information; these scores are normalized with the softmax function to obtain the attention weights αvu, αvm and αvl, and the aggregation functions FU, FM and FL finally yield the balanced node information t'vu(k), t'vm(k) and t'vl(k);
where αvu, αvm and αvl denote the attention weights of the term-walk feature information tvu(k), tvm(k) and tvl(k) with respect to the node information hv(k-1), and vu, vm and vl denote the nodes located at the upper, middle and lower positions of the term-walk triples respectively;
Finally, node v gathers the balanced node information t'vu(k), t'vm(k) and t'vl(k) from the sets Tu(v), Tm(v) and Tl(v), which is summarized into the total node information mv(k);
(3) The node vector hv is transferred and updated using the adjusted total node information mv(k) from the sets Tu(v), Tm(v) and Tl(v) together with the node state vector hv(k-1) of the previous step:
hv(k) = Fsum([hv(k-1); mv(k)])
where Fsum is the node information transfer function; Fsum is a simple single-layer MLP.
(4) An average pooling operation (AvgPool) is applied to all nodes of the logical formula graph, and the final formula graph embedding vector hG is:
hG = AvgPool({hv(K) | v ∈ V})
where AvgPool denotes average pooling. A compact sketch of one message-passing round and the readout is given below.
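The following PyTorch sketch puts the aggregation, attention, update and readout stages of one iteration together. It is a rough interpretation of the description above; the layer dimensions, the way the attention scores are reduced to scalars, and all tensor and module names are assumptions rather than details fixed by the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionTWLayer(nn.Module):
    """One round of term-walk aggregation with attention, followed by the node update."""

    def __init__(self, d_h: int):
        super().__init__()
        # F_u, F_m, F_l: aggregate concatenated triple states (3*d_h) into d_h features.
        self.F_u = nn.Linear(3 * d_h, d_h)
        self.F_m = nn.Linear(3 * d_h, d_h)
        self.F_l = nn.Linear(3 * d_h, d_h)
        # F_U, F_M, F_L: combine the attention-weighted term-walk feature with the node state.
        self.F_U = nn.Linear(2 * d_h, d_h)
        self.F_M = nn.Linear(2 * d_h, d_h)
        self.F_L = nn.Linear(2 * d_h, d_h)
        # F_sum: node information transfer function (single-layer MLP).
        self.F_sum = nn.Linear(2 * d_h, d_h)

    def _position_feature(self, h, triples, pos, agg):
        """Mean of agg(...) over all triples containing each node at position `pos`."""
        t = torch.zeros_like(h)
        count = torch.zeros(h.size(0), 1, device=h.device)
        for (a, b, c) in triples:
            v = (a, b, c)[pos]
            msg = agg(torch.cat([h[a], h[b], h[c]], dim=-1))
            t[v] = t[v] + msg
            count[v] += 1
        return t / count.clamp(min=1)

    def forward(self, h, triples):
        # Term-walk features of every node at the upper / middle / lower position.
        t_u = self._position_feature(h, triples, 0, self.F_u)
        t_m = self._position_feature(h, triples, 1, self.F_m)
        t_l = self._position_feature(h, triples, 2, self.F_l)
        # Attention: use the (scalar-reduced) term-walk features as scores and
        # normalize them across the three positions with softmax.
        scores = torch.stack([t_u.sum(-1), t_m.sum(-1), t_l.sum(-1)], dim=-1)
        alpha = F.softmax(scores, dim=-1)                     # (num_nodes, 3)
        b_u = self.F_U(torch.cat([alpha[:, 0:1] * t_u, h], dim=-1))
        b_m = self.F_M(torch.cat([alpha[:, 1:2] * t_m, h], dim=-1))
        b_l = self.F_L(torch.cat([alpha[:, 2:3] * t_l, h], dim=-1))
        m = b_u + b_m + b_l                                   # total node information m_v
        # Update: transfer the previous state and the total information through F_sum.
        return self.F_sum(torch.cat([h, m], dim=-1))

d_h = 64
layer = AttentionTWLayer(d_h)
h0 = torch.randn(3, d_h)                  # states of 3 nodes
h1 = layer(h0, [(0, 1, 2)])               # one term-walk triple 0 -> 1 -> 2
h_G = h1.mean(dim=0)                      # AvgPool readout: formula graph embedding
print(h_G.shape)                          # torch.Size([64])
```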
Step 3: input the graph feature vectors of the candidate premises and the given conjecture into a binary classifier, thereby classifying the candidate premises.
The graph embedding vector pair (hp, hc) of a candidate premise and the given conjecture is input into the classification function Fclass to obtain the usefulness score of the candidate premise under the conjecture:
z = Fclass([hp; hc])
where z ∈ R^2 represents the scores of the candidate premise being useful and useless for the conjecture;
The model normalizes the useful and useless scores of the candidate premise with the softmax function and divides the candidate premises according to the score values:
ŷi = exp(zi) / Σl exp(zl)
where ŷ denotes the normalized useful and useless scores of the premise, zi is the i-th element of z, and zl is the l-th element of z; the useful score and the useless score of a candidate premise correspond to different label attributes (1 or 0), so the label attribute of the candidate premise is determined from the division result (the larger score), compared with the existing label, and classification is thereby achieved.
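A rough sketch of this binary classification stage follows; the hidden dimension, activation and layer names are assumptions (the embodiment below only specifies that Fclass uses two fully connected layers with batch normalization and softmax).

```python
import torch
import torch.nn as nn

class PremiseClassifier(nn.Module):
    """F_class: scores a (premise, conjecture) embedding pair as useful / useless."""

    def __init__(self, d_h: int, d_hidden: int = 128):
        super().__init__()
        self.fc1 = nn.Linear(2 * d_h, d_hidden)   # first FC layer on [h_p; h_c]
        self.bn = nn.BatchNorm1d(d_hidden)        # batch normalization
        self.fc2 = nn.Linear(d_hidden, 2)         # second FC layer: useful / useless scores

    def forward(self, h_p, h_c):
        z = self.fc2(torch.relu(self.bn(self.fc1(torch.cat([h_p, h_c], dim=-1)))))
        return torch.softmax(z, dim=-1)            # normalized usefulness scores

clf = PremiseClassifier(d_h=64)
h_p, h_c = torch.randn(8, 64), torch.randn(8, 64)  # a batch of 8 premise/conjecture pairs
scores = clf(h_p, h_c)                              # shape: (8, 2)
labels = scores.argmax(dim=-1)                      # index of the larger score (label convention assumed)
```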
Experiments
(1) Datasets
This embodiment builds two datasets based on the MPTP2078 benchmark, an original (MPTP) dataset and a conjunctive normal form (CNF) dataset, to test the predictive classification performance of the model. The MPTP dataset contains the logical formulas of the MPTP2078 benchmark, and the CNF dataset contains the conjunctive normal forms corresponding to those formulas. The benchmark contains 1469 conjectures and 24087 premises used to prove these conjectures.
This embodiment constructs MPTP datasets for training, validation and testing (40996, 13990 and 14068 samples). Each sample has the form of a triple (premise, conjecture, label), where the premise is a candidate premise for the given conjecture and the label is 0 or 1 as in binary classification (1 means the premise is useful, 0 means it is useless); the CNF dataset constructed in this embodiment follows the same distribution as the MPTP dataset.
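Purely as an illustration of this sample layout (the field names, example formulas and label are assumptions, not entries from the actual dataset), one triple can be represented as follows:

```python
from dataclasses import dataclass

@dataclass
class PremiseSelectionSample:
    premise: str     # a first-order formula, e.g. in TPTP-style syntax
    conjecture: str  # the conjecture for which the premise is a candidate
    label: int       # 1 = premise is useful for proving the conjecture, 0 = useless

sample = PremiseSelectionSample(
    premise="! [X] : subset(X, X)",
    conjecture="! [A, B] : (A = B => subset(A, B))",
    label=1,
)
```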
(2) Model settings
According to the datasets, this embodiment converts the logical formulas into simplified logical formula graphs based on the deletion of repeated quantifiers and obtains the graph information of each simplified formula (node id, node name, parent node ids, child node ids). The formula graph is then fed into the graph neural network to obtain the formula graph feature vector. The graph neural network model is configured as follows:
In the graph neural network model of this embodiment, the initial one-hot vector of each node has dv dimensions. FV is an embedding network that embeds the dv-dimensional initial one-hot vector into the dh-dimensional initial node state vector. Fu, Fm and Fl share the same configuration: they are fully connected (FC) layers that map the concatenated node state vectors to dh-dimensional outputs. The configurations of Fsum, FU, FM and FL are similar to that of Fu, differing only in the input dimension. Fclass has two fully connected layers: the first is an FC layer with batch normalization (BN); the second is an FC layer of dimension 2 followed by softmax. Notably, dv is 793, corresponding to 793 node labels, among which variables are uniformly represented as "Var".
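A minimal sketch of how these components could be wired together end to end; apart from dv = 793, every dimension, the number of rounds K and all module names are assumptions, and AttentionTWLayer refers to the illustrative sketch in step (4) above.

```python
import torch
import torch.nn as nn

class AttentionTWGNN(nn.Module):
    """Illustrative assembly: embedding lookup F_V, K message-passing rounds, AvgPool readout."""

    def __init__(self, d_v: int = 793, d_h: int = 64, K: int = 2):
        super().__init__()
        self.F_V = nn.Embedding(d_v, d_h)   # d_v = 793 node labels, variables unified as "Var"
        # AttentionTWLayer: see the sketch after step (4) above.
        self.layers = nn.ModuleList([AttentionTWLayer(d_h) for _ in range(K)])

    def forward(self, node_labels, triples):
        h = self.F_V(node_labels)            # h_v^(0) from the lookup table
        for layer in self.layers:            # K rounds of term-walk message passing
            h = layer(h, triples)
        return h.mean(dim=0)                 # AvgPool readout -> h_G
```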
(3) Experimental settings
The model parameters of this embodiment are set as follows (a configuration sketch is given after the list):
(a) the model is trained with the default settings of the adaptive moment estimation (Adam) optimizer;
(b) the batch size is set to 32;
(c) the regularization parameter is set to 0.0001;
(d) the initial learning rate is set to 0.01;
(e) the model automatically adjusts the learning rate with the ReduceLROnPlateau strategy from the PyTorch library;
(f) the model is trained with the cross-entropy loss function.
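A minimal sketch of this training configuration; the model variable refers to the illustrative sketches above, and interpreting the regularization parameter as Adam weight decay is an assumption.

```python
import torch.nn as nn
import torch.optim as optim

model = AttentionTWGNN()                           # illustrative model from the sketches above
criterion = nn.CrossEntropyLoss()                  # (f) cross-entropy loss
optimizer = optim.Adam(model.parameters(),
                       lr=0.01,                    # (d) initial learning rate
                       weight_decay=1e-4)          # (c) regularization parameter (assumed as weight decay)
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer)  # (e) automatic learning-rate adjustment
batch_size = 32                                    # (b) batch size
```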
(4) Experimental results and analysis
This embodiment evaluates the Attention-TW-GNN-based premise selection model on the MPTP dataset and the CNF dataset and compares it with several mainstream methods. The following can be observed:
The premise selection methods based on graph neural network models all achieve fairly good classification accuracy, outperforming some mainstream graph neural network models. For example, the Accuracy of the GCN, GAT and SGC models on the MPTP dataset is 86.25%, 85.38% and 85.67% respectively. This shows that mainstream graph neural networks only capture the topological structure of the logical formula graph and cannot capture the deep information of the logical formulas.
Among the other baseline methods, the graph neural networks based on hand-designed features, PC-GCN and TW-GNN, are clearly superior to the mainstream graph neural networks. For example, the F1 scores of the PC-GCN and TW-GNN models on the CNF dataset are 83.98% and 83.72% respectively. A reasonable explanation is that, in addition to aggregating information from neighbouring nodes, these graph neural network models also aggregate information from more distant nodes.
The Attention-TW-GNN-based premise selection model of this embodiment exceeds the existing premise selection models based on graph neural networks in classification accuracy in the vast majority of cases. For example, on the MPTP dataset, Attention-TW-GNN improves over the mainstream graph neural networks by at least 2% and over the other baseline methods by 0.5%; on the CNF dataset, Attention-TW-GNN improves over the mainstream graph neural network models by 3%. This shows that a graph neural network adjusted with the attention mechanism can better represent the syntactic and semantic information of first-order logic formulas, and that the graph representation of first-order logic formulas also affects the classification performance of the model to a certain extent.
The invention and its embodiment have been described schematically above, and the description is not restrictive. What is shown in the drawing is only one embodiment of the invention, and the actual structure is not limited thereto. Therefore, if a person of ordinary skill in the art, inspired by the invention, devises structural arrangements and embodiments similar to this technical solution without creative effort and without departing from the spirit of the invention, they shall all fall within the protection scope of the invention.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311490926.9A CN117808084A (en) | 2023-11-09 | 2023-11-09 | Pre-selection method based on graph reduction brief representation and graph neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311490926.9A CN117808084A (en) | 2023-11-09 | 2023-11-09 | Pre-selection method based on graph reduction brief representation and graph neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117808084A true CN117808084A (en) | 2024-04-02 |
Family
ID=90432464
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311490926.9A Pending CN117808084A (en) | 2023-11-09 | 2023-11-09 | Pre-selection method based on graph reduction brief representation and graph neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117808084A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118966273A (en) * | 2024-10-16 | 2024-11-15 | 西南交通大学 | A graph neural network and logic fusion method for premise selection |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210150373A1 (en) * | 2019-11-15 | 2021-05-20 | International Business Machines Corporation | Capturing the global structure of logical formulae with graph long short-term memory |
CN115204372A (en) * | 2022-07-20 | 2022-10-18 | 成都飞机工业(集团)有限责任公司 | Precondition selection method and system based on item walking graph neural network |
- 2023-11-09 CN CN202311490926.9A patent/CN117808084A/en active Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210150373A1 (en) * | 2019-11-15 | 2021-05-20 | International Business Machines Corporation | Capturing the global structure of logical formulae with graph long short-term memory |
CN115204372A (en) * | 2022-07-20 | 2022-10-18 | 成都飞机工业(集团)有限责任公司 | Precondition selection method and system based on item walking graph neural network |
Non-Patent Citations (1)
Title |
---|
LAN Yongqi et al.: "面向前提选择的新型图约简表示与图神经网络模型" (A new graph reduction representation and graph neural network model for premise selection), Computer Science (《计算机科学》), 26 September 2023 (2023-09-26), pages 193-199 *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118966273A (en) * | 2024-10-16 | 2024-11-15 | 西南交通大学 | A graph neural network and logic fusion method for premise selection |
CN118966273B (en) * | 2024-10-16 | 2025-01-24 | 西南交通大学 | A graph neural network and logic fusion method for premise selection |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | Detecting code clones with graph neural network and flow-augmented abstract syntax tree | |
Yang et al. | Learn to explain efficiently via neural logic inductive learning | |
Chen et al. | Towards synthesizing complex programs from input-output examples | |
Wu et al. | Knowledge distillation improves graph structure augmentation for graph neural networks | |
CN112507699B (en) | Remote supervision relation extraction method based on graph convolution network | |
CN112257066A (en) | Malicious behavior identification method, system and storage medium for weighted heterogeneous graph | |
CN107292097B (en) | Chinese medicine principal symptom selection method based on feature group | |
CN113569906A (en) | Method and Device for Extracting Heterogeneous Graph Information Based on Meta-Path Subgraph | |
CN104036023B (en) | Method for creating context fusion tree video semantic indexes | |
CN109857457B (en) | A Method for Learning Function Hierarchical Embedding Representations in Source Codes in Hyperbolic Spaces | |
CN112749757B (en) | Thesis classification model construction method and system based on gating graph annotation force network | |
CN116521882A (en) | Domain Long Text Classification Method and System Based on Knowledge Graph | |
CN110826639A (en) | A zero-sample image classification method using full data training | |
Wang et al. | Detecting code clones with graph neural network and flow-augmented abstract syntax tree | |
CN115828143A (en) | A node classification method for path aggregation of heterogeneous graph elements based on graph convolution and self-attention mechanism | |
CN113988083B (en) | Factual information coding and evaluating method for generating shipping news abstract | |
CN117808084A (en) | Pre-selection method based on graph reduction brief representation and graph neural network | |
CN114897181A (en) | Meta-learning interpretation method based on causal relationship | |
CN117633811A (en) | A code vulnerability detection method based on multi-view feature fusion | |
CN110830291B (en) | A Node Classification Method for Heterogeneous Information Network Based on Meta-Path | |
Li et al. | Lexical attention and aspect-oriented graph convolutional networks for aspect-based sentiment analysis | |
CN115641599A (en) | Entity alignment method for customhouse import and export commodity knowledge map | |
CN115935367A (en) | Static source code vulnerability detection and positioning method based on graph neural network | |
CN114444515A (en) | A relation extraction method based on entity semantic fusion | |
CN115860122A (en) | A knowledge map multi-hop reasoning method based on multi-agent reinforcement learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |