CN117011874A - Text detection method, device, equipment and storage medium based on artificial intelligence - Google Patents


Info

Publication number
CN117011874A
CN117011874A (application CN202311001120.9A)
Authority
CN
China
Prior art keywords
vector representation
graph
news
text
language model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311001120.9A
Other languages
Chinese (zh)
Inventor
陈雪娇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd filed Critical Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202311001120.9A priority Critical patent/CN117011874A/en
Publication of CN117011874A publication Critical patent/CN117011874A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/40 Document-oriented image-based pattern recognition
    • G06V 30/41 Analysis of document content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0499 Feedforward networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/19 Recognition using electronic means
    • G06V 30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V 30/19147 Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Abstract

The application belongs to the field of artificial intelligence and the field of financial technology, and relates to a text detection method based on artificial intelligence, which comprises the following steps: constructing a topological graph of the news text; processing the topological graph through a first regularization layer, an attention layer, a graph topological structure learning layer and a node updating module in the language model to obtain a target node vector representation; generating a graph vector representation of the target node vector representation; training the language model through the graph vector representation to obtain a news detection model; and predicting the news text to be processed through the news detection model and outputting a corresponding prediction result. The application also provides an artificial-intelligence-based text detection device, computer equipment and a storage medium. In addition, the application also relates to blockchain technology, and error positioning information can be stored in the blockchain. The method can be applied to false news detection scenarios in the financial field, and can effectively improve the accuracy of the news detection model in false news detection.

Description

Text detection method, device, equipment and storage medium based on artificial intelligence
Technical Field
The application relates to the technical field of artificial intelligence development and the technical field of finance, in particular to a text detection method, a text detection device, computer equipment and a storage medium based on artificial intelligence.
Background
Financial technology enterprises, such as insurance companies and banks, often have a business need for false news detection. False news detection aims at identifying untrue news information within a large volume of news documents. By designing an effective false news detection means, untrue news about a company can be detected on the Internet rapidly and accurately, so that reasonable countermeasures can be formulated in time to reduce the damage to the company's interests caused by untrue news on the Internet.
Currently, the industry mainly uses the following two means to realize false news detection: means (1), a conventional BERT-like sequence-to-sequence language model; and means (2), a false information detection model based on a graph neural network. However, both of these prior arts still suffer from the following drawbacks. Means (1) treats each word in a sentence independently and lacks the structural information of the text (such as the syntactic structure and contextual structural information), which reduces the detection accuracy of the model in a real environment. Although means (2) considers the structural information of the text, because it adopts the local aggregation strategy of the traditional graph neural network, it can only model the semantic information between two words that are close together in the text and cannot model the semantic information of long word sequences in the text, which leads to over-fitting or over-squashing of the model and reduces its accuracy.
Disclosure of Invention
The embodiment of the application aims to provide a text detection method, a device, computer equipment and a storage medium based on artificial intelligence, so as to solve the technical problem that the existing means for false news detection, namely the traditional BERT (Bidirectional Encoder Representations from Transformers) sequence-to-sequence language model and the graph neural network-based false information detection model, have low detection accuracy.
In order to solve the technical problems, the embodiment of the application provides a text detection method based on artificial intelligence, which adopts the following technical scheme:
acquiring news texts and constructing a topological graph corresponding to the news texts;
performing layer regularization processing on feature vectors of nodes contained in the topological graph through a first regularization layer in a preset language model to obtain corresponding first feature vectors, and generating a first attention weight matrix corresponding to the first feature vectors through an attention layer in the language model; wherein the language model is a model based on the Transformer structure;
processing the first attention weight matrix through a graph topology structure learning layer in the language model, and outputting a corresponding second attention weight matrix;
Node updating processing is carried out on the second attention weight matrix through a node updating module in the language model, and corresponding target node vector representation is obtained;
generating a graph vector representation corresponding to the target node vector representation;
training the language model through the graph vector representation to obtain a trained news detection model;
and predicting the news text to be processed through the news detection model, and outputting a prediction result corresponding to the news text to be processed.
Further, the step of constructing a topological graph corresponding to the news text specifically includes:
carrying out graph construction processing on the news text based on a preset sliding window mechanism to generate a corresponding topological structure diagram;
generating vector representations corresponding to each word contained in the news text based on a preset model;
generating a topological graph corresponding to the news text based on the vector representation and the topological structure diagram.
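The patent does not disclose the exact graph-construction procedure, only that it uses a sliding-window mechanism. The following is a minimal sketch of a sliding-window word co-occurrence graph of the kind described above; the `window_size` value and token list are illustrative, and the per-word feature vectors from the preset model are omitted:

```python
def build_text_graph(tokens, window_size=3):
    """Sketch of sliding-window graph construction: each unique word
    becomes a node, and an undirected edge links two words whenever
    they co-occur inside the same sliding window."""
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    edges = set()
    for start in range(len(tokens)):
        window = tokens[start:start + window_size]
        for i in range(len(window)):
            for j in range(i + 1, len(window)):
                a, b = index[window[i]], index[window[j]]
                if a != b:
                    # store each undirected edge once, smaller id first
                    edges.add((min(a, b), max(a, b)))
    return vocab, sorted(edges)

vocab, edges = build_text_graph(
    ["bank", "reports", "record", "loss", "bank"], window_size=3)
```

With this toy input every word pair falls into some shared window, so the four-node graph ends up fully connected (6 edges).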
Further, the step of processing the first attention weight matrix through a graph topology learning layer in the language model and outputting a corresponding second attention weight matrix specifically includes:
Acquiring preset adjacent matrix information and shortest path matrix information;
constructing a matrix generation formula based on the adjacency matrix information and the shortest path matrix information;
and processing the first attention weight matrix by using the matrix generation formula through a graph topological structure learning layer in the language model to obtain the corresponding second attention weight matrix.
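The matrix-generation formula built from the adjacency and shortest-path matrices is not given in this excerpt. The sketch below assumes one plausible form in which the first attention weight matrix is biased additively by the adjacency matrix and penalised by the shortest-path distance before being re-normalised row-wise; the scalars `alpha` and `beta` are illustrative placeholders, not from the patent:

```python
import numpy as np

def topology_biased_attention(attn, adjacency, shortest_path,
                              alpha=1.0, beta=-0.1):
    """Hypothetical graph topology learning step: combine the raw
    attention scores with structural signals, then softmax each row so
    the result is again a valid attention weight matrix."""
    scores = attn + alpha * adjacency + beta * shortest_path
    # numerically stable row-wise softmax
    exp = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)
```

The design intent is that connected (or structurally close) node pairs receive a boost relative to distant pairs, injecting the text's structural information into the Transformer's otherwise fully dense attention.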
Further, the step of performing node update processing on the second attention weight matrix by using a node update module in the language model to obtain a corresponding target node vector representation specifically includes:
acquiring a preset nonlinear activation function and a vector updating mode corresponding to the node updating module;
processing the second attention weight matrix through the nonlinear activation function to obtain a corresponding first node vector representation;
updating the first node vector representation based on the vector updating mode to obtain a corresponding second node vector representation;
the second node vector representation is taken as the target node vector representation.
Further, the step of updating the first node vector representation based on the vector updating manner to obtain a corresponding second node vector representation specifically includes:
Processing the first node vector representation through a second regularization layer in the language model to obtain a corresponding third node vector representation;
processing the third node vector representation through a feedforward neural network in the language model to obtain a corresponding fourth node vector representation;
obtaining residual information of the first node vector;
the second node vector representation is generated based on the fourth node vector representation and the residual information.
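The update just described (second regularization layer, feed-forward network, then combination with the residual information of the first node vector representation) can be sketched as follows; the ReLU activation and layer sizes are assumptions, since the patent does not specify them:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Per-node layer regularization over the feature dimension."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def node_update(h, w1, w2):
    """Sketch of the node vector update: normalise the first node
    vector representation, pass it through a two-layer feed-forward
    network, then add the residual (the original h) back."""
    normed = layer_norm(h)                   # third node vector repr.
    ffn = np.maximum(normed @ w1, 0.0) @ w2  # fourth node vector repr.
    return ffn + h                           # residual -> second repr.
```

The residual connection preserves the pre-update node vectors, which is what lets many encoder layers be stacked without the representations degenerating.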
Further, the step of generating a graph vector representation corresponding to the target node vector representation specifically includes:
acquiring a preset average pooling function;
performing pooling processing on the target node vector representation based on the average pooling function to obtain corresponding vector data;
the vector data is represented as the graph vector.
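The average-pooling readout described above reduces the per-node vectors to a single graph vector by taking the element-wise mean; a minimal sketch:

```python
def mean_pool(node_vectors):
    """Average pooling: the graph vector representation is the
    element-wise mean of all target node vector representations."""
    dim = len(node_vectors[0])
    n = len(node_vectors)
    return [sum(v[d] for v in node_vectors) / n for d in range(dim)]
```

Because the mean is permutation-invariant, the resulting graph vector does not depend on the order in which the nodes are listed.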
Further, the training the language model through the graph vector representation to obtain a trained news detection model specifically includes:
nonlinear conversion is carried out on the graph vector representation through a full connection layer in the language model, and a corresponding target graph vector representation is obtained;
performing prediction processing on the target graph vector representation through a softmax layer in the language model to obtain a corresponding prediction tag;
Acquiring a real label of a news text corresponding to the target graph vector;
calculating cross entropy loss based on the prediction tag and the real tag;
training the language model based on the cross entropy loss to obtain a trained language model;
and taking the trained language model as the news detection model.
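A minimal sketch of the loss computation described in these steps, i.e. a softmax over the two classes followed by cross entropy against the real label; the fully connected layer that produces the logits is omitted, and labels follow the patent's convention (0 = real news, 1 = false news):

```python
import math

def softmax(logits):
    """Convert class logits to probabilities (numerically stable)."""
    m = max(logits)
    exp = [math.exp(x - m) for x in logits]
    s = sum(exp)
    return [e / s for e in exp]

def cross_entropy(probs, true_label):
    """Cross-entropy loss between the predicted class probabilities
    and the one-hot true label; small epsilon guards against log(0)."""
    return -math.log(probs[true_label] + 1e-12)
```

During training, the cross-entropy loss over a batch of labelled news graphs would be minimised by gradient descent to obtain the trained news detection model.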
In order to solve the technical problems, the embodiment of the application also provides a text detection device based on artificial intelligence, which adopts the following technical scheme:
the acquisition module is used for acquiring the news text and constructing a topological graph corresponding to the news text;
the first processing module is used for performing layer regularization processing on the feature vectors of the nodes contained in the topological graph through a first regularization layer in a preset language model to obtain corresponding first feature vectors, and generating a first attention weight matrix corresponding to the first feature vectors through an attention layer in the language model; wherein the language model is a model based on the Transformer structure;
the second processing module is used for processing the first attention weight matrix through a graph topological structure learning layer in the language model and outputting a corresponding second attention weight matrix;
The third processing module is used for carrying out node updating processing on the second attention weight matrix through the node updating module in the language model to obtain a corresponding target node vector representation;
a generating module, configured to generate a graph vector representation corresponding to the target node vector representation;
the training module is used for training the language model through the graph vector representation to obtain a trained news detection model;
and the prediction module is used for predicting the news text to be processed through the news detection model and outputting a prediction result corresponding to the news text to be processed.
In order to solve the above technical problems, the embodiment of the present application further provides a computer device, which adopts the following technical schemes:
acquiring news texts and constructing a topological graph corresponding to the news texts;
performing layer regularization processing on feature vectors of nodes contained in the topological graph through a first regularization layer in a preset language model to obtain corresponding first feature vectors, and generating a first attention weight matrix corresponding to the first feature vectors through an attention layer in the language model; wherein the language model is a model based on the Transformer structure;
Processing the first attention weight matrix through a graph topology structure learning layer in the language model, and outputting a corresponding second attention weight matrix;
node updating processing is carried out on the second attention weight matrix through a node updating module in the language model, and corresponding target node vector representation is obtained;
generating a graph vector representation corresponding to the target node vector representation;
training the language model through the graph vector representation to obtain a trained news detection model;
and predicting the news text to be processed through the news detection model, and outputting a prediction result corresponding to the news text to be processed.
In order to solve the above technical problems, an embodiment of the present application further provides a computer readable storage medium, which adopts the following technical schemes:
acquiring news texts and constructing a topological graph corresponding to the news texts;
performing layer regularization processing on feature vectors of nodes contained in the topological graph through a first regularization layer in a preset language model to obtain corresponding first feature vectors, and generating a first attention weight matrix corresponding to the first feature vectors through an attention layer in the language model; wherein the language model is a model based on the Transformer structure;
Processing the first attention weight matrix through a graph topology structure learning layer in the language model, and outputting a corresponding second attention weight matrix;
node updating processing is carried out on the second attention weight matrix through a node updating module in the language model, and corresponding target node vector representation is obtained;
generating a graph vector representation corresponding to the target node vector representation;
training the language model through the graph vector representation to obtain a trained news detection model;
and predicting the news text to be processed through the news detection model, and outputting a prediction result corresponding to the news text to be processed.
Compared with the prior art, the embodiment of the application has the following main beneficial effects:
firstly, acquiring news texts and constructing a topological graph corresponding to the news texts; performing layer regularization processing on feature vectors of nodes contained in the topological graph through a first regularization layer in a preset language model to obtain corresponding first feature vectors, and generating a first attention weight matrix corresponding to the first feature vectors through an attention layer in the language model; processing the first attention weight matrix through a graph topological structure learning layer in the language model, and outputting a corresponding second attention weight matrix; subsequently, performing node updating processing on the second attention weight matrix through a node updating module in the language model to obtain a corresponding target node vector representation; further generating a graph vector representation corresponding to the target node vector representation; finally, training the language model through the graph vector representation to obtain a trained news detection model; and predicting the news text to be processed through the news detection model, and outputting a prediction result corresponding to the news text to be processed. After the news text is obtained, the text is first modeled into a graph structure, and a Transformer model is then used to learn the contextual relations of the text sequence, so that the semantic information of long word sequences in the text is learned. Meanwhile, a graph topological structure learning module is introduced into the traditional Transformer model, which drives the model to learn the semantic structure information in the text and solves the problem that existing methods find it difficult to simultaneously utilize the contextual semantic information and the structural information of the text.
And predicting the news text to be processed by using the trained news detection model, so that a prediction result corresponding to the news text to be processed can be rapidly and accurately output, and the accuracy of the news detection model in false news detection is effectively improved.
Drawings
In order to more clearly illustrate the solution of the present application, a brief description of the drawings required for the description of the embodiments is given below. It is apparent that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained from these drawings by a person of ordinary skill in the art without inventive effort.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of one embodiment of an artificial intelligence based text detection method in accordance with the present application;
FIG. 3 is a schematic diagram of one embodiment of an artificial intelligence based text detection device in accordance with the present application;
FIG. 4 is a schematic structural diagram of one embodiment of a computer device in accordance with the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the applications herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description of the application and the claims and the description of the drawings above are intended to cover a non-exclusive inclusion. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
In order to make the person skilled in the art better understand the solution of the present application, the technical solution of the embodiment of the present application will be clearly and completely described below with reference to the accompanying drawings.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a web browser application, a shopping class application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, electronic book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that, the text detection method based on artificial intelligence provided by the embodiment of the application is generally executed by a server/terminal device, and correspondingly, the text detection device based on artificial intelligence is generally arranged in the server/terminal device.
The embodiment of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer or a digital-computer-controlled machine to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow chart of one embodiment of an artificial intelligence based text detection method in accordance with the present application is shown. The order of the steps in the flowchart may be changed and some steps may be omitted according to various needs. The text detection method based on the artificial intelligence provided by the embodiment of the application can be applied to any scene needing false news detection, and can be applied to products of the scenes, such as false news detection in the field of financial insurance. The text detection method based on artificial intelligence comprises the following steps:
Step S201, obtaining news texts and constructing a topological graph corresponding to the news texts.
In this embodiment, the electronic device (e.g., the server/terminal device shown in fig. 1) on which the text detection method based on artificial intelligence operates may acquire the news text through a wired or wireless connection. It should be noted that the wireless connection may include, but is not limited to, 3G/4G/5G, WiFi, Bluetooth, WiMAX, Zigbee, UWB (ultra wideband) and other now known or later developed wireless connections. In the financial application scenario, the news text may refer to news text about a financial company, such as an insurance company or a bank, acquired from the Internet. The specific implementation process of constructing the topological graph corresponding to the news text will be described in further detail in the following specific embodiments and is not described herein.
Step S202, performing layer regularization processing on feature vectors of nodes contained in the topological graph through a first regularization layer in a preset language model to obtain corresponding first feature vectors, and generating a first attention weight matrix corresponding to the first feature vectors through an attention layer in the language model.
In this embodiment, the language model is a model based on the Transformer structure. The data layers within the language model referred to in S202-S204 may be collectively referred to as encoder layers, and the present application specifically stacks 8 encoder layers, i.e., S202, S203 and S204 are performed 8 times. This step aims to learn the attention weight between any two nodes in the topological graph. Specifically, the topological graph is layer regularized (Layer Normalization) using the first regularization layer within the language model, as shown in equation (1): Ĥ^l = LN(H^l) (1); wherein LN(H^l) denotes performing layer regularization on H^l. Subsequently, the attention layer within the language model generates the first attention weight matrix corresponding to the first feature vectors, as shown in equations (2) and (3): Q = Ĥ^l W_Q, K = Ĥ^l W_K, V = Ĥ^l W_V (2); A = softmax(Q K^T / √d_K) (3); wherein H^l is the input of the self-attention layer in the l-th encoder. For the topological graph G_i, let the initialization input of the 1st encoder be X_i, i.e. H^0 = X_i. W_Q, W_K and W_V are the learnable projection matrices applied to Ĥ^l to generate the query matrix Q, the key matrix K and the value matrix V. d_K is the dimension of the node features after the attention projection. A is the attention weight matrix between nodes; the element A_ij in the i-th row and j-th column of A represents the attention weight between the two nodes corresponding to that row and column.
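The self-attention step above is standard scaled dot-product attention over the node features; a minimal single-head sketch, assuming the layer regularization of equation (1) has already been applied to H:

```python
import numpy as np

def self_attention(H, Wq, Wk, Wv):
    """Project node features H into query/key/value matrices and
    compute the attention weight matrix A = softmax(Q K^T / sqrt(d_K)),
    returning A and the attended node features A V."""
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    dk = K.shape[-1]
    scores = Q @ K.T / np.sqrt(dk)
    # numerically stable row-wise softmax
    exp = np.exp(scores - scores.max(axis=-1, keepdims=True))
    A = exp / exp.sum(axis=-1, keepdims=True)
    return A, A @ V
```

Each row of A sums to 1 and row i holds the attention weights from node i to every node in the graph, matching the description of A_ij above.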
Step S203, processing the first attention weight matrix by a graph topology learning layer in the language model, and outputting a corresponding second attention weight matrix.
In this embodiment, the specific implementation process of processing the first attention weight matrix through the graph topology learning layer in the language model and outputting the corresponding second attention weight matrix will be described in further detail in the following specific embodiments, and is not described herein.
And step S204, performing node updating processing on the second attention weight matrix through a node updating module in the language model to obtain a corresponding target node vector representation.
In this embodiment, the foregoing node update processing is performed on the second attention weight matrix by using the node update module in the language model to obtain a specific implementation process of the corresponding target node vector representation, which will be described in further detail in the following specific embodiments, which will not be described herein.
Step S205, generating a graph vector representation corresponding to the target node vector representation.
In this embodiment, the specific implementation process of generating the graph vector representation corresponding to the target node vector representation will be described in further detail in the following specific embodiments, which will not be described herein.
And step S206, training the language model through the graph vector representation to obtain a trained news detection model.
In this embodiment, the training of the language model by the graph vector representation to obtain a specific implementation process of the trained news detection model is described in further detail in the following specific embodiments, which will not be described herein.
And step S207, predicting the news text to be processed through the news detection model, and outputting a prediction result corresponding to the news text to be processed.
In this embodiment, the label (i.e., the prediction result: real news or false news) of the news text to be processed is predicted by inputting the news text to be processed into the news detection model. Specifically, the news text to be processed is input into the trained news detection model, which outputs the label ŷ_{d_j} of each news text d_j (j = 1, …, w), where w is the number of news texts to be processed. Label 0 represents real news and label 1 represents false news. Applying the text detection method provided by the application to false news detection in finance and technology companies, such as insurance companies and banks, provides an effective technical means for a company to rapidly detect unreal news about itself on the Internet, and effectively improves the accuracy of false news detection.
Among the semantic structural features of text, syntactic structural features are the most common. For example, sentences typically contain parts of speech such as nouns, verbs and adjectives, and these parts of speech are well organized relative to one another in the text. Therefore, how to model this underlying text structure information is critical to improving the accuracy of the model on false news detection tasks. Furthermore, the semantic information of long word sequences in a text refers to the information between two words that are far apart in the text. Generally, existing graph-neural-network-based methods can only model the semantic relation between two words that are close to each other and cannot capture the semantic information between two nodes that are far apart (for example, in a sentence, a subject noun and an object noun are often far apart but have a strong semantic relation), so the long-sequence semantic information of the text cannot be described.
Firstly, a news text is acquired, and a topological graph corresponding to the news text is constructed; layer regularization processing is performed on the feature vectors of the nodes contained in the topological graph through a first regularization layer in a preset language model to obtain corresponding first feature vectors, and a first attention weight matrix corresponding to the first feature vectors is generated through an attention layer in the language model; the first attention weight matrix is processed through a graph topological structure learning layer in the language model, and a corresponding second attention weight matrix is output; subsequently, node updating processing is carried out on the second attention weight matrix through a node updating module in the language model, and a corresponding target node vector representation is obtained; a graph vector representation corresponding to the target node vector representation is further generated; finally, the language model is trained through the graph vector representation to obtain a trained news detection model; and the news text to be processed is predicted through the news detection model, and a prediction result corresponding to the news text to be processed is output. After the news text is obtained, the text is first modeled into a graph structure form, and then the context relation of the text sequence is learned by using a Transformer model, so that the semantic information of long word sequences in the text is learned. Meanwhile, a graph topological structure learning module is introduced into the traditional Transformer model, so that the model is driven to learn the semantic structure information in the text, solving the problem that existing methods have difficulty simultaneously utilizing the context semantic information and the text structure information of the text.
And predicting the news text to be processed by using the trained news detection model, so that a prediction result corresponding to the news text to be processed can be rapidly and accurately output, and the accuracy of the news detection model in false news detection is effectively improved.
In some alternative implementations, step S201 includes the steps of:
and carrying out graph construction processing on the news text based on a preset sliding window mechanism, and generating a corresponding topological structure diagram.
In this embodiment, the topology structure diagram may be built using an existing sliding window mechanism. Specifically, for each news text, each word in the news text is regarded as a node, and a connecting edge is established between every two nodes located in the same sliding window, where the size of the sliding window is set to 3. Based on this, a topology structure diagram based on semantic information can be built separately for each news text.
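The sliding-window edge construction described above can be sketched as a few lines of Python; the word list is a made-up example, and the window size of 3 follows the text:

```python
def build_edges(words, window=3):
    # Each word is a node (identified by its position); connect every pair of
    # nodes that co-occur inside a sliding window of the given size.
    edges = set()
    for start in range(len(words) - window + 1):
        span = range(start, start + window)
        for i in span:
            for j in span:
                if i < j:
                    edges.add((i, j))
    return sorted(edges)

words = ["company", "denies", "false", "report", "today"]
print(build_edges(words))
# windows {0,1,2}, {1,2,3}, {2,3,4}, each fully connected internally
```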
Vector representations corresponding to the words contained in the news text are generated based on a preset model.
In this embodiment, the preset model may be a BERT model. By inputting the news text to the BERT model, a vector representation corresponding to each word contained in the news text is generated using the BERT model, and this is taken as a feature vector corresponding to each node.
Generating a topological graph corresponding to the news text based on the vector representation and the topological structure diagram.
In this embodiment, after the vector representation corresponding to each word included in the news text is obtained, the vector representation is used as the feature vector of the corresponding node in the topology structure diagram. On this basis, a topological graph can be constructed for each news text, and the semantic information of the text can be represented by the topological graph. Specifically, for the news text set D = {d1, d2, …, dp}, one topology graph Gi = {Vi, Ei, Xi} is established for each news text di, where Vi is the set of nodes in the graph Gi, Ei is the set of edges in the graph Gi, and Xi is the feature matrix of the graph Gi, each row of which represents the feature vector of a node in Gi. The words in the news text are regarded as nodes, and a sliding window mechanism is adopted to establish connecting edges among the nodes (words). Specifically, the size of the sliding window can be set to 3, and a connecting edge is established between every two nodes in the same window. For the construction of the feature matrix Xi, the initial feature vector representation of each node (i.e., each row of Xi) can be obtained by inputting the news text di into the BERT model. In addition, the topology of the graph can also be represented by the adjacency matrix A_adj, each element of which has a value of 0 or 1: if an element of A_adj has the value 1, there is a connecting edge between the nodes corresponding to the row and column of that element; if an element has the value 0, there is no such edge.
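The adjacency matrix A_adj described above can be built directly from the window edges; this sketch assumes an undirected graph and omits the BERT-derived feature matrix Xi:

```python
import numpy as np

def adjacency_matrix(n_nodes, edges):
    # A_adj[i][j] = 1 if there is a connecting edge between nodes i and j,
    # else 0; the graph is undirected, so the matrix is symmetric.
    A_adj = np.zeros((n_nodes, n_nodes), dtype=int)
    for i, j in edges:
        A_adj[i, j] = A_adj[j, i] = 1
    return A_adj

edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]  # example window edges
A_adj = adjacency_matrix(4, edges)
print(A_adj)
```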
The method comprises the steps of carrying out graph construction processing on the news text based on a preset sliding window mechanism to generate a corresponding topology structure diagram; then generating vector representations corresponding to each word contained in the news text based on a preset model; and generating a topological graph corresponding to the news text based on the vector representations and the topology structure diagram. According to the application, the news text is processed based on the sliding window mechanism and the preset model, so that the topological graph corresponding to the news text can be generated rapidly, improving the generation efficiency of the topological graph.
In some alternative implementations of the present embodiment, step S203 includes the steps of:
and acquiring preset adjacent matrix information and shortest path matrix information.
In this embodiment, the adjacency matrix A_adj of the topology graph and the shortest path matrix S represent structural information at different levels of the graph. The application simultaneously introduces the adjacency matrix information and the shortest path matrix information as bias terms and models them in the attention weight matrix to construct the corresponding matrix generation formula.
And constructing a matrix generation formula based on the adjacent matrix information and the shortest path matrix information.
In this embodiment, the matrix generation formula is: A = A + A_adj·W_A + S·W_S, where S is the shortest path matrix, each element of which represents the length of the shortest path between the corresponding two nodes, and W_A and W_S are learnable weight matrices. Because the adjacency matrix A_adj and the shortest path matrix S of the graph represent structural information at different levels of the graph, the finally obtained attention weight matrix A effectively describes the structural information of the topological graph, i.e., the semantic structure information of the text. Based on the unified modeling of S202 and S203, the method can simultaneously model the semantic relation between any two nodes and the semantic structure information of the text, overcoming the defect that existing methods have difficulty simultaneously capturing the long-sequence semantic information of words and the complex semantic structure in the text.
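A minimal sketch of this structural-bias step: the shortest path matrix S is computed by per-node BFS, and the bias formula A = A + A_adj·W_A + S·W_S is applied. To keep the sketch small, the learnable matrices W_A and W_S are reduced to scalar weights; this is a simplifying assumption, not the patent's parameterization:

```python
import numpy as np
from collections import deque

def shortest_path_matrix(A_adj):
    # S[i][j] = length of the shortest path between nodes i and j (BFS per
    # node). Disconnected pairs stay at inf; a real implementation would cap
    # or mask these before using S as a bias term.
    n = len(A_adj)
    S = np.full((n, n), np.inf)
    for src in range(n):
        S[src, src] = 0
        q = deque([src])
        while q:
            u = q.popleft()
            for v in range(n):
                if A_adj[u][v] and S[src, v] == np.inf:
                    S[src, v] = S[src, u] + 1
                    q.append(v)
    return S

def add_structural_bias(A, A_adj, S, w_A, w_S):
    # A = A + A_adj * W_A + S * W_S, with scalar stand-ins for W_A and W_S.
    return A + A_adj * w_A + S * w_S

A_adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])  # path graph 0-1-2
S = shortest_path_matrix(A_adj)
A_biased = add_structural_bias(np.zeros((3, 3)), A_adj, S, w_A=0.5, w_S=0.1)
print(S[0, 2])  # 2.0: two hops from node 0 to node 2
```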
And processing the first attention weight matrix by using the matrix generation formula through a graph topological structure learning layer in the language model to obtain the corresponding second attention weight matrix.
The method comprises the steps of obtaining preset adjacent matrix information and shortest path matrix information; then constructing a matrix generation formula based on the adjacent matrix information and the shortest path matrix information; and subsequently processing the first attention weight matrix by using the matrix generation formula through the graph topological structure learning layer in the language model to obtain the corresponding second attention weight matrix. According to the application, the first attention weight matrix is processed by the graph topological structure learning layer in the language model using the matrix generation formula constructed from the adjacency matrix information and the shortest path matrix information, so that the required second attention weight matrix can be constructed quickly and accurately. By introducing the adjacency matrix and the shortest path matrix information at the same time, the language model can effectively characterize the structural information of the graph in addition to the long-sequence semantic information of the text, and thus effectively outputs node vector representations that take both long-sequence information and structural information into account.
In some alternative implementations, step S204 includes the steps of:
and acquiring a preset nonlinear activation function and a vector updating mode corresponding to the node updating module.
In this embodiment, the nonlinear activation function is applied as: H_tem^l = softmax(A)·V + H^{l-1}, where softmax(·) represents the nonlinear activation function, V is the value matrix obtained in formula (2) above, and H^{l-1} is the input of the l-th encoder layer (the language model of the present application has 8 layers in total). H_tem^l is the updated node vector representation.
And processing the second attention weight matrix through the nonlinear activation function to obtain a corresponding first node vector representation.
In this embodiment, the second attention weight matrix may be substituted into the nonlinear activation function for calculation to obtain the corresponding first node vector representation.
And updating the first node vector representation based on the vector updating mode to obtain a corresponding second node vector representation.
In this embodiment, the specific implementation process of updating the first node vector representation based on the vector updating manner to obtain the corresponding second node vector representation will be described in further detail in the following specific embodiments and is not described herein.
The second node vector representation is taken as the target node vector representation.
The method comprises the steps of obtaining a preset nonlinear activation function and a vector updating mode corresponding to the node updating module; then processing the second attention weight matrix through the nonlinear activation function to obtain a corresponding first node vector representation; and subsequently updating the first node vector representation based on the vector updating mode to obtain a corresponding second node vector representation, which is taken as the target node vector representation. The application processes the second attention weight matrix based on the nonlinear activation function and the vector updating mode, so that the required target node vector representation can be generated quickly, improving the generation efficiency and accuracy of the target node vector representation.
In some optional implementations, the updating the first node vector representation based on the vector updating manner to obtain a corresponding second node vector representation includes the following steps:
and processing the first node vector representation through a second regularization layer in the language model to obtain a corresponding third node vector representation.
In this embodiment, the obtained node vector representation H_tem^l can be further input into a layer regularization layer LN (Layer Normalization) in the language model and then into a feed-forward neural network (FFN) in the language model, after which the residual information is added to output the final node vector representation, as shown in the formula H^l = FFN(LN(H_tem^l)) + H_tem^l.
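The two update formulas above (H_tem^l = softmax(A)·V + H^{l-1}, then H^l = FFN(LN(H_tem^l)) + H_tem^l) can be sketched as follows. The two-layer ReLU FFN is an assumed, typical shape, since the text does not spell out the FFN internals, and the random inputs are placeholders:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def layer_norm(H, eps=1e-5):
    return (H - H.mean(-1, keepdims=True)) / (H.std(-1, keepdims=True) + eps)

def node_update(A, V, H_prev, W1, W2):
    # H_tem^l = softmax(A) V + H^{l-1}: attention aggregation plus residual.
    H_tem = softmax(A) @ V + H_prev
    # H^l = FFN(LN(H_tem^l)) + H_tem^l, with an assumed two-layer ReLU FFN.
    ffn = np.maximum(layer_norm(H_tem) @ W1, 0) @ W2
    return ffn + H_tem

rng = np.random.default_rng(1)
n, d, d_ff = 4, 6, 12
A = rng.normal(size=(n, n))        # second attention weight matrix (placeholder)
V = rng.normal(size=(n, d))        # value matrix from formula (2)
H_prev = rng.normal(size=(n, d))   # H^{l-1}: input of the l-th encoder layer
W1, W2 = rng.normal(size=(d, d_ff)), rng.normal(size=(d_ff, d))
H_l = node_update(A, V, H_prev, W1, W2)
print(H_l.shape)  # (4, 6): one updated vector per node
```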
And processing the third node vector representation by a feedforward neural network device in the language model to obtain a corresponding fourth node vector representation.
In this embodiment, the feedforward neural network device refers to the FFN (feed-forward block) described above.
And obtaining residual information of the first node vector.
The second node vector representation is generated based on the fourth node vector representation and the residual information.
In this embodiment, the second node vector representation may be obtained by calculating a sum between the fourth node vector representation and the residual information.
The first node vector representation is processed through a second regularization layer in the language model to obtain a corresponding third node vector representation; then the third node vector representation is processed through a feedforward neural network device in the language model to obtain a corresponding fourth node vector representation; residual information of the first node vector is obtained; and the second node vector representation is subsequently generated based on the fourth node vector representation and the residual information. The application processes the first node vector representation based on the vector updating mode, so that the required second node vector representation can be generated quickly, improving the generation efficiency and accuracy of the second node vector representation.
In some alternative implementations of the present embodiment, step S205 includes the steps of:
and obtaining a preset average pooling function.
In this embodiment, after the 8 encoder layers above are executed in a loop (i.e., steps S202-S204), the output of the last layer is denoted H^l (at this time l = 7). The average pooling function is specifically h_Gi = MeanPooling(H^l), where h_Gi represents the vector representation of the graph Gi and MeanPooling(·) represents the average pooling function.
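The pooling step h_Gi = MeanPooling(H^l) is simply a column-wise average of the final node vectors, as this small sketch illustrates (the matrix values are arbitrary):

```python
import numpy as np

def mean_pooling(H_l):
    # h_Gi = MeanPooling(H^l): average the final node vectors of graph Gi
    # into a single graph-level vector representation.
    return H_l.mean(axis=0)

H_l = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # 3 nodes, 2 features
h_Gi = mean_pooling(H_l)
print(h_Gi)  # [3. 4.]
```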
And carrying out pooling treatment on the target node vector based on the average pooling function to obtain corresponding vector data.
The vector data is taken as the graph vector representation.
The method comprises the steps of obtaining a preset average pooling function; then performing pooling processing on the target node vector based on the average pooling function to obtain corresponding vector data; and subsequently taking the vector data as the graph vector representation. According to the application, the target node vector is pooled by using the average pooling function, so that the graph vector representation corresponding to the target node vector representation can be output rapidly and accurately, improving the generation efficiency and accuracy of the graph vector representation.
In some alternative implementations of the present embodiment, step S206 includes the steps of:
And carrying out nonlinear conversion on the graph vector representation through a full connection layer in the language model to obtain a corresponding target graph vector representation.
In this embodiment, the obtained graph vector representation h_Gi is input into a fully connected layer (MLP) within the language model for nonlinear conversion.
And carrying out prediction processing on the target graph vector representation through a softmax layer in the language model to obtain a corresponding prediction tag.
In this embodiment, the softmax layer is: ŷ_Gi = softmax(W_Y·h_Gi + b), where W_Y represents a learnable weight matrix and b is a learnable bias vector. ŷ_Gi denotes the predicted label of the graph Gi.
And acquiring the real label of the news text corresponding to the target graph vector.
And calculating cross entropy loss based on the prediction label and the real label.
In this embodiment, the cross entropy loss between the predicted label and the real label can be calculated by the formula L = −Σ_Gi [ y_Gi·log ŷ_Gi + (1 − y_Gi)·log(1 − ŷ_Gi) ], where y_Gi is the true label of the graph Gi. For both y_Gi and ŷ_Gi, label 0 represents real news and label 1 represents false news.
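The prediction head and loss can be sketched together as follows; the weight values and graph vector are arbitrary placeholders, and the loss shown is the standard cross entropy on the softmax output:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def predict_and_loss(h_G, W_Y, b, y_true):
    # ŷ_Gi = softmax(W_Y h_Gi + b): probabilities over {0: real, 1: false}.
    y_hat = softmax(W_Y @ h_G + b)
    # Cross entropy between the true class and the predicted distribution.
    loss = -np.log(y_hat[y_true])
    return y_hat, loss

h_G = np.array([0.2, -0.1, 0.4])                      # graph vector (placeholder)
W_Y = np.array([[1.0, 0.0, 0.5], [-1.0, 0.5, 0.0]])    # 2 classes x 3 features
b = np.array([0.0, 0.0])
y_hat, loss = predict_and_loss(h_G, W_Y, b, y_true=0)
print(y_hat, loss)
```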
And training the language model based on the cross entropy loss to obtain a trained language model.
In this embodiment, training of the language model may be accomplished by minimizing the cross entropy loss described above. Iteration stops when the difference between the loss values of two successive iterations is smaller than m, or when the number of iterations is greater than q, at which point model training is complete. The parameters m and q are not particularly limited and can be set according to actual training requirements.
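The stopping rule above (loss change below m, or more than q iterations) can be sketched as a plain loop; here `step_loss` is a stand-in for one real training iteration, and the decaying toy loss is purely illustrative:

```python
def train_until_converged(step_loss, m=1e-4, q=100):
    # Run training steps; stop when the loss change between two successive
    # iterations drops below m, or the iteration count exceeds q.
    prev = float("inf")
    for it in range(1, q + 1):
        loss = step_loss(it)
        if abs(prev - loss) < m:
            return it, loss
        prev = loss
    return q, prev

# Toy loss that decays toward 1.0; converges once successive changes < m.
it, loss = train_until_converged(lambda i: 1.0 + 1.0 / i, m=1e-3, q=1000)
print(it, loss)
```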
And taking the trained language model as the news detection model.
The application carries out nonlinear conversion on the graph vector representation through a full connection layer in the language model to obtain a corresponding target graph vector representation; then, carrying out prediction processing on the target graph vector representation through a softmax layer in the language model to obtain a corresponding prediction tag; then obtaining a real label of the news text corresponding to the target graph vector; subsequently calculating cross entropy loss based on the prediction tag and the real tag; and finally training the language model based on the cross entropy loss to obtain a trained language model, and taking the trained language model as the news detection model. According to the method, the graph vector is processed based on the full connection layer and the softmax layer in the language model to obtain the prediction label, then the cross entropy loss is calculated based on the prediction label and the real label of the target graph vector, and further the language model is trained based on the cross entropy loss, so that training of the language model can be completed and the language model is used as a final news detection model, and the construction efficiency of the news detection model is improved.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
It should be emphasized that to further guarantee the privacy and security of the prediction, the prediction may also be stored in a blockchain node.
The blockchain is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms and encryption algorithms. The blockchain (Blockchain) is essentially a decentralized database: a chain of data blocks generated in association with one another by cryptographic means, each data block containing a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
The embodiment of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer or a digital-computer-controlled machine to simulate, extend and expand human intelligence, sense the environment, acquire knowledge, and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by computer readable instructions stored in a computer readable storage medium that, when executed, may comprise the steps of the embodiments of the methods described above. The storage medium may be a nonvolatile storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a random access Memory (Random Access Memory, RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order of their execution not necessarily being sequential, but may be performed in turn or alternately with other steps or at least a portion of the other steps or stages.
With further reference to fig. 3, as an implementation of the method shown in fig. 2, the present application provides an embodiment of an artificial intelligence-based text detection apparatus, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable to various electronic devices.
As shown in fig. 3, the text detection device 300 based on artificial intelligence according to the present embodiment includes: an acquisition module 301, a first processing module 302, a second processing module 303, a third processing module 304, a generation module 305, a training module 306, and a prediction module 307. Wherein:
the acquisition module 301 is configured to acquire a news text and construct a topological graph corresponding to the news text;
the first processing module 302 is configured to perform layer regularization processing on feature vectors of nodes included in the topological graph through a first regularization layer in a preset language model to obtain corresponding first feature vectors, and generate a first attention weight matrix corresponding to the first feature vectors through an attention layer in the language model; wherein the language model is a model based on the Transformer structure;
a second processing module 303, configured to process the first attention weight matrix through a graph topology learning layer in the language model, and output a corresponding second attention weight matrix;
A third processing module 304, configured to perform node update processing on the second attention weight matrix through a node update module in the language model, so as to obtain a corresponding target node vector representation;
a generating module 305, configured to generate a graph vector representation corresponding to the target node vector representation;
training module 306, configured to train the language model through the graph vector representation, to obtain a trained news detection model;
and the prediction module 307 is configured to predict a news text to be processed through the news detection model, and output a prediction result corresponding to the news text to be processed.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the text detection method based on artificial intelligence in the foregoing embodiment one by one, which is not described herein again.
In some optional implementations of this embodiment, the acquiring module 301 includes:
the first processing sub-module is used for carrying out graph construction processing on the news text based on a preset sliding window mechanism to generate a corresponding topological structure diagram;
the first generation sub-module is used for generating vector representations corresponding to each word contained in the news text based on a preset model;
And the second generation sub-module is used for generating a topological graph corresponding to the news text based on the vector representation and the topological structure diagram.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the text detection method based on artificial intelligence in the foregoing embodiment one by one, which is not described herein again.
In some alternative implementations of the present embodiment, the second processing module 303 includes:
the first acquisition submodule is used for acquiring preset adjacent matrix information and shortest path matrix information;
a construction submodule for constructing a matrix generation formula based on the adjacent matrix information and the shortest path matrix information;
and the second processing sub-module is used for processing the first attention weight matrix by using the matrix generation formula through a graph topological structure learning layer in the language model to obtain the corresponding second attention weight matrix.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the text detection method based on artificial intelligence in the foregoing embodiment one by one, which is not described herein again.
In some alternative implementations of the present embodiment, the third processing module 304 includes:
The second acquisition sub-module is used for acquiring a preset nonlinear activation function and a vector updating mode corresponding to the node updating module;
the third processing sub-module is used for processing the second attention weight matrix through the nonlinear activation function to obtain a corresponding first node vector representation;
a fourth processing sub-module, configured to update the first node vector representation based on the vector update manner, to obtain a corresponding second node vector representation;
a first determination submodule for taking the second node vector representation as the target node vector representation.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the text detection method based on artificial intelligence in the foregoing embodiment one by one, which is not described herein again.
In some optional implementations of this embodiment, the fourth processing sub-module includes:
the first processing unit is used for processing the first node vector representation through a second regularization layer in the language model to obtain a corresponding third node vector representation;
the second processing unit is used for processing the third node vector representation through a feedforward neural network device in the language model to obtain a corresponding fourth node vector representation;
An obtaining unit, configured to obtain residual information of the first node vector;
and a processing unit configured to generate the second node vector representation based on the fourth node vector representation and the residual information.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the text detection method based on artificial intelligence in the foregoing embodiment one by one, which is not described herein again.
In some alternative implementations of the present embodiment, the generating module 305 includes:
the third acquisition submodule is used for acquiring a preset average pooling function;
a fifth processing sub-module, configured to perform pooling processing on the target node vector based on the average pooling function, to obtain corresponding vector data;
and the second determining submodule is used for taking the vector data as the graph vector representation.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the text detection method based on artificial intelligence in the foregoing embodiment one by one, which is not described herein again.
In some alternative implementations of the present embodiment, training module 306 includes:
the conversion sub-module is used for carrying out nonlinear conversion on the graph vector representation through a full connection layer in the language model to obtain a corresponding target graph vector representation;
The prediction sub-module is used for carrying out prediction processing on the target graph vector representation through a softmax layer in the language model to obtain a corresponding prediction label;
a fourth obtaining sub-module, configured to obtain a real tag of a news text corresponding to the target graph vector;
a computation sub-module for computing a cross entropy loss based on the prediction tag and the real tag;
the training sub-module is used for training the language model based on the cross entropy loss to obtain a trained language model;
and the third determining submodule is used for taking the trained language model as the news detection model.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the text detection method based on artificial intelligence in the foregoing embodiment one by one, which is not described herein again.
In order to solve the technical problems, the embodiment of the application also provides computer equipment. Referring specifically to fig. 4, fig. 4 is a basic structural block diagram of a computer device according to the present embodiment.
The computer device 4 comprises a memory 41, a processor 42 and a network interface 43 communicatively connected to each other via a system bus. It should be noted that only a computer device 4 having components 41-43 is shown in the figures, but it should be understood that not all of the illustrated components are required to be implemented, and that more or fewer components may be implemented instead. It will be appreciated by those skilled in the art that the computer device here is a device capable of automatically performing numerical calculation and/or information processing in accordance with predetermined or stored instructions, and its hardware includes, but is not limited to, microprocessors, application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA), digital signal processors (Digital Signal Processor, DSP), embedded devices, etc.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory 41 includes at least one type of readable storage medium, the readable storage medium including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, and the like. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the computer device 4. Of course, the memory 41 may also comprise both an internal storage unit of the computer device 4 and an external storage device. In this embodiment, the memory 41 is typically used to store an operating system and various application software installed on the computer device 4, such as computer-readable instructions of the artificial intelligence-based text detection method. Further, the memory 41 may be used to temporarily store various types of data that have been output or are to be output.
The processor 42 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute computer readable instructions stored in the memory 41 or process data, such as executing computer readable instructions of the artificial intelligence based text detection method.
The network interface 43 may comprise a wireless network interface or a wired network interface, which network interface 43 is typically used for establishing a communication connection between the computer device 4 and other electronic devices.
Compared with the prior art, the embodiment of the application has the following main beneficial effects:
in the embodiment of the application, after the news text is acquired, the text is first modeled as a graph structure, and a Transformer model is then used to learn the contextual relations of the text sequence, thereby capturing the semantic information of long word sequences in the text. Meanwhile, a graph topological structure learning module is introduced into the conventional Transformer model, driving the model to learn the semantic structure information in the text and solving the problem that existing methods struggle to exploit the contextual semantic information and the structural information of the text simultaneously. The news text to be processed is then predicted with the trained news detection model, so that the prediction result corresponding to the news text to be processed can be output rapidly and accurately, effectively improving the accuracy of the news detection model in fake news detection.
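The graph modeling step described above (the sliding-window construction of claim 2) can be sketched as follows; the window size, whitespace tokenization, and example sentence are all assumptions for illustration, not the patented construction:

```python
from itertools import combinations

def build_topology(words, window_size=3):
    """Build an undirected word co-occurrence graph: each distinct word in
    the news text becomes a node, and an edge links two words that appear
    together inside the same sliding window."""
    nodes = sorted(set(words))
    index = {w: i for i, w in enumerate(nodes)}
    edges = set()
    for start in range(len(words) - window_size + 1):
        window = words[start:start + window_size]
        for a, b in combinations(window, 2):
            if a != b:
                # store each edge once, as a sorted node-index pair
                edges.add(tuple(sorted((index[a], index[b]))))
    return nodes, sorted(edges)

# hypothetical news-text fragment
words = "officials deny the viral claim about the vaccine".split()
nodes, edges = build_topology(words)
```

In the method above, each node would additionally carry a feature vector produced by a pretrained embedding model, and the resulting adjacency and shortest-path matrices would feed the graph topological structure learning layer that biases the Transformer's attention weights.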
The present application also provides another embodiment, namely, a computer-readable storage medium storing computer-readable instructions executable by at least one processor to cause the at least one processor to perform the steps of an artificial intelligence-based text detection method as described above.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the preferred implementation. Based on such an understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present application.
The above-described embodiments are only some embodiments of the present application, not all of them, and the preferred embodiments of the present application are shown in the drawings, which do not limit the scope of the patent claims. This application may be embodied in many different forms; these embodiments are provided so that the present disclosure will be thorough and complete. Although the application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their features. Any equivalent structure made using the content of the specification and the drawings of the application, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of the application.

Claims (10)

1. The text detection method based on artificial intelligence is characterized by comprising the following steps:
acquiring news texts and constructing a topological graph corresponding to the news texts;
performing layer regularization processing on feature vectors of nodes contained in the topological graph through a first regularization layer in a preset language model to obtain corresponding first feature vectors, and generating a first attention weight matrix corresponding to the first feature vectors through an attention layer in the language model; wherein the language model is a model based on a Transformer structure;
processing the first attention weight matrix through a graph topology structure learning layer in the language model, and outputting a corresponding second attention weight matrix;
node updating processing is carried out on the second attention weight matrix through a node updating module in the language model, and corresponding target node vector representation is obtained;
generating a graph vector representation corresponding to the target node vector representation;
training the language model through the graph vector representation to obtain a trained news detection model;
and predicting the news text to be processed through the news detection model, and outputting a prediction result corresponding to the news text to be processed.
2. The artificial intelligence based text detection method according to claim 1, wherein the step of constructing a topological graph corresponding to the news text specifically comprises:
carrying out graph construction processing on the news text based on a preset sliding window mechanism to generate a corresponding topological structure diagram;
generating vector representations corresponding to each word contained in the news text based on a preset model;
generating a topological graph corresponding to the news text based on the vector representation and the topological structure diagram.
3. The method for detecting text based on artificial intelligence according to claim 1, wherein the step of processing the first attention weight matrix by a graph topology learning layer in the language model and outputting a corresponding second attention weight matrix specifically includes:
acquiring preset adjacent matrix information and shortest path matrix information;
constructing a matrix generation formula based on the adjacency matrix information and the shortest path matrix information;
and processing the first attention weight matrix by using the matrix generation formula through a graph topological structure learning layer in the language model to obtain the corresponding second attention weight matrix.
4. The text detection method based on artificial intelligence according to claim 1, wherein the step of performing node update processing on the second attention weight matrix by a node update module in the language model to obtain a corresponding target node vector representation specifically includes:
acquiring a preset nonlinear activation function and a vector updating mode corresponding to the node updating module;
processing the second attention weight matrix through the nonlinear activation function to obtain a corresponding first node vector representation;
updating the first node vector representation based on the vector updating mode to obtain a corresponding second node vector representation;
the second node vector representation is taken as the target node vector representation.
5. The method for detecting text based on artificial intelligence according to claim 4, wherein the step of updating the first node vector representation based on the vector updating method to obtain a corresponding second node vector representation specifically comprises:
processing the first node vector representation through a second regularization layer in the language model to obtain a corresponding third node vector representation;
processing the third node vector representation through a feedforward neural network in the language model to obtain a corresponding fourth node vector representation;
obtaining residual information of the first node vector representation;
the second node vector representation is generated based on the fourth node vector representation and the residual information.
6. The artificial intelligence based text detection method of claim 1, wherein the step of generating a graph vector representation corresponding to the target node vector representation specifically comprises:
acquiring a preset average pooling function;
carrying out pooling processing on the target node vector representation based on the average pooling function to obtain corresponding vector data;
and taking the vector data as the graph vector representation.
7. The artificial intelligence based text detection method of claim 1, wherein the training the language model by the graph vector representation to obtain a trained news detection model comprises:
nonlinear conversion is carried out on the graph vector representation through a full connection layer in the language model, and a corresponding target graph vector representation is obtained;
performing prediction processing on the target graph vector representation through a softmax layer in the language model to obtain a corresponding prediction tag;
acquiring a real label of a news text corresponding to the target graph vector;
calculating cross entropy loss based on the prediction tag and the real tag;
training the language model based on the cross entropy loss to obtain a trained language model;
and taking the trained language model as the news detection model.
8. An artificial intelligence based text detection device, comprising:
the acquisition module is used for acquiring the news text and constructing a topological graph corresponding to the news text;
the first processing module is used for carrying out layer regularization processing on the feature vectors of the nodes contained in the topological graph through a first regularization layer in a preset language model to obtain corresponding first feature vectors, and generating a first attention weight matrix corresponding to the first feature vectors through an attention layer in the language model; wherein the language model is a model based on a Transformer structure;
the second processing module is used for processing the first attention weight matrix through a graph topological structure learning layer in the language model and outputting a corresponding second attention weight matrix;
the third processing module is used for carrying out node updating processing on the second attention weight matrix through the node updating module in the language model to obtain a corresponding target node vector representation;
a generating module, configured to generate a graph vector representation corresponding to the target node vector representation;
the training module is used for training the language model through the graph vector representation to obtain a trained news detection model;
and the prediction module is used for predicting the news text to be processed through the news detection model and outputting a prediction result corresponding to the news text to be processed.
9. A computer device comprising a memory having stored therein computer readable instructions which when executed implement the steps of the artificial intelligence based text detection method of any of claims 1 to 7.
10. A computer readable storage medium having stored thereon computer readable instructions which when executed by a processor implement the steps of the artificial intelligence based text detection method of any of claims 1 to 7.
CN202311001120.9A 2023-08-09 2023-08-09 Text detection method, device, equipment and storage medium based on artificial intelligence Pending CN117011874A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311001120.9A CN117011874A (en) 2023-08-09 2023-08-09 Text detection method, device, equipment and storage medium based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN117011874A true CN117011874A (en) 2023-11-07

Family

ID=88565252

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311001120.9A Pending CN117011874A (en) 2023-08-09 2023-08-09 Text detection method, device, equipment and storage medium based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN117011874A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination