CN111428033A - Automatic threat information extraction method based on double-layer convolutional neural network

Info

Publication number: CN111428033A
Application number: CN202010203140.4A
Authority: CN
Inventors: 李小勇; 荀爽
Original assignee: Beijing University of Posts and Telecommunications
Current assignee: Beijing University of Posts and Telecommunications
Priority date: 2020-03-20
Filing date: 2020-03-20
Publication date: 2020-07-17
Anticipated expiration: 2040-03-20
Also published as: CN111428033B

Abstract

The application provides an automatic threat information extraction method based on a double-layer convolutional neural network. The method uses the network model to identify the documents in the data to be analyzed and to judge whether they contain threat information; the remaining parts of a document to be analyzed can then be identified according to the judgment result. This makes the threat information easier to extract, improves extraction efficiency, and reduces labor cost.

Description

Automatic threat information extraction method based on double-layer convolutional neural network

Technical Field

The application relates to the technical field of information, in particular to an automatic threat information extraction method based on a double-layer convolutional neural network.

Background

In recent years, with the rapid development of information technology, network security issues have become increasingly important. At the same time, the development of technologies such as big data allows threat data to be matched more accurately during analysis. For example, threat intelligence techniques can provide decision makers with actionable recommendations based on knowledge of known or emerging threats and hazards.

However, existing threat information extraction techniques match the data to be analyzed against preset rules, and these rules must be defined manually. As a result, the labor cost is high, the rules generalize poorly, and extraction efficiency is low.

Disclosure of Invention

The embodiment of the application aims to provide an automatic threat information extraction method based on a double-layer convolutional neural network so as to achieve the purpose of reducing labor cost. The specific technical scheme is as follows:

In a first aspect of this embodiment, there is provided an automated threat intelligence extraction method based on a double-layer convolutional neural network, including:

obtaining a statement to be analyzed;

preprocessing a statement to be analyzed to obtain a data matrix in a preset format;

inputting the data matrix into a first convolution layer of a pre-trained double-layer convolutional neural network, performing a convolution operation on the data matrix with a first preset number of first preset convolution kernels to obtain a first preset number of first matrices to be output, and performing an average pooling operation on the data matrix to obtain a first preset number of first feature matrices;

splicing the first matrices to be output and the first feature matrices to obtain a first output matrix;

inputting the first output matrix into a second convolution layer of the double-layer convolutional neural network, performing a convolution operation on the first output matrix with a second preset number of second preset convolution kernels to obtain a second preset number of second matrices to be output, performing a maximum pooling operation on the second preset number of second matrices to be output to obtain a second preset number of second feature matrices, and performing a global average pooling operation on the data matrix to obtain a third feature matrix;

splicing the second preset number of second feature matrices with the third feature matrix to obtain a second output matrix;

and inputting the second output matrix into a full connection layer of the double-layer convolutional neural network, and classifying the second output matrix to obtain a judgment result of whether the statement to be analyzed is a threat intelligence statement.

Optionally, inputting the second output matrix into a full connection layer of the double-layer convolutional neural network, classifying the second output matrix, and obtaining a judgment result of whether the statement to be analyzed is a threat intelligence statement, including:

inputting the second output matrix into a full-connection layer of the double-layer convolutional neural network, and calculating probability distribution of the second output matrix corresponding to each preset classification;

and obtaining a judgment result of whether the statement to be analyzed is a threat intelligence statement or not according to the probability distribution.

Optionally, obtaining the statement to be analyzed includes:

and obtaining the sentence to be analyzed from the target database through a preset crawler program.

Optionally, after the second output matrix is input to the full connection layer of the double-layer convolutional neural network and classified to obtain a judgment result of whether the statement to be analyzed is a threat intelligence statement, the method further includes:

obtaining a pre-generated analysis report, wherein the analysis report is a relevant analysis report of threat intelligence generated by a third party;

and converting the analysis report and the judgment result into an analysis result in a specified format.

Optionally, the training process of the pre-trained double-layer convolutional neural network includes:

obtaining a sample text, wherein the sample text comprises a sample statement marked with threat intelligence;

preprocessing a sample statement to obtain a data matrix with a preset format;

inputting the data matrix into a first convolution layer of a double-layer convolutional neural network to be trained, performing a convolution operation on the data matrix with a first preset number of first preset convolution kernels to obtain a first preset number of first matrices to be output, and performing an average pooling operation on the data matrix to obtain a first preset number of first feature matrices;

inputting the first output matrix into a second convolution layer of a double-layer convolution neural network to be trained, performing convolution operation on the first output matrix through a second preset convolution kernel with a second preset number to obtain a second preset number of second output matrices, performing maximum pooling operation on the second preset number of second output matrices to obtain a second preset number of second feature matrices, and performing global average pooling operation on the data matrix to obtain a third feature matrix;

inputting the second output matrix into a full-connection layer of a double-layer convolutional neural network to be trained, and classifying the second output matrix to obtain a judgment result of whether the statement to be analyzed is a threat intelligence statement;

and retraining the double-layer convolutional neural network to be trained according to the judgment result until the judgment result of the double-layer convolutional neural network to be trained meets the preset condition.

In a second aspect of this application, an automated threat intelligence extraction apparatus based on a double-layer convolutional neural network is provided, including:

the sentence acquisition module is used for acquiring a sentence to be analyzed;

the preprocessing module is used for preprocessing the statement to be analyzed to obtain a data matrix with a preset format;

the first convolution module is used for inputting the data matrix into a first convolution layer of a pre-trained double-layer convolutional neural network, performing a convolution operation on the data matrix with a first preset number of first preset convolution kernels to obtain a first preset number of first matrices to be output, and performing an average pooling operation on the data matrix to obtain a first preset number of first feature matrices;

the first splicing module is used for splicing the first matrix to be output and the first characteristic matrix to obtain a first output matrix;

the second convolution module is used for inputting the first output matrix into a second convolution layer of the double-layer convolution neural network, performing convolution operation on the first output matrix through a second preset convolution kernel with a second preset number to obtain a second preset number of second to-be-output matrices, performing maximum pooling operation on the second preset number of second to-be-output matrices to obtain a second preset number of second feature matrices, and performing global average pooling operation on the data matrix to obtain a third feature matrix;

the second splicing module is used for splicing a second feature matrix and a third feature matrix of a second preset number to obtain a second output matrix;

and the result output module is used for inputting the second output matrix into the full connection layer of the double-layer convolutional neural network, classifying the second output matrix and obtaining a judgment result of whether the statement to be analyzed is a threat intelligence statement.

Optionally, the result output module includes:

the probability distribution submodule is used for inputting the second output matrix into a full-connection layer of the double-layer convolutional neural network and calculating the probability distribution of the second output matrix corresponding to each preset classification;

and the result judgment submodule is used for obtaining a judgment result whether the statement to be analyzed is a threat intelligence statement or not according to the probability distribution.

Optionally, the statement obtaining module includes:

and the crawler program submodule is used for acquiring the sentence to be analyzed from the target database through a preset crawler program.

Optionally, the apparatus further comprises:

the report acquisition module is used for acquiring a pre-generated analysis report, wherein the analysis report is a related analysis report of threat intelligence generated by a third party;

and the format conversion module is used for converting the analysis report and the judgment result into an analysis result in a specified format.

In a third aspect of the present application, an electronic device is provided, which includes a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;

a memory for storing a computer program;

and the processor is used for realizing any automatic threat information extraction method based on the double-layer convolutional neural network when executing the computer program stored in the memory.

In a fourth aspect of this embodiment of the present application, a computer-readable storage medium is provided, in which a computer program is stored, and the computer program, when executed by a processor, implements any of the above-mentioned automated threat intelligence extraction methods based on a double-layer convolutional neural network.

With the automatic threat information extraction method based on the double-layer convolutional neural network provided by the embodiments of the application, a statement to be analyzed is obtained; the statement to be analyzed is preprocessed to obtain a data matrix in a preset format; the data matrix is input into a first convolution layer of a pre-trained double-layer convolutional neural network, a convolution operation is performed on the data matrix with a first preset number of first preset convolution kernels to obtain a first preset number of first matrices to be output, and an average pooling operation is performed on the data matrix to obtain a first preset number of first feature matrices; the first matrices to be output and the first feature matrices are spliced to obtain a first output matrix; the first output matrix is input into a second convolution layer of the double-layer convolutional neural network, a convolution operation is performed on the first output matrix with a second preset number of second preset convolution kernels to obtain a second preset number of second matrices to be output, a maximum pooling operation is performed on the second preset number of second matrices to be output to obtain a second preset number of second feature matrices, and a global average pooling operation is performed on the data matrix to obtain a third feature matrix; the second preset number of second feature matrices are spliced with the third feature matrix to obtain a second output matrix; and the second output matrix is input into a fully connected layer of the double-layer convolutional neural network and classified to obtain a judgment result of whether the statement to be analyzed is a threat intelligence statement.
Therefore, the documents in the data to be analyzed can be identified according to the network model, whether threat information exists or not is judged, and the rest parts in the documents to be analyzed can be identified according to the judgment result, so that the threat information is conveniently extracted, the extraction efficiency of the threat information is improved, and the labor cost is reduced. Of course, it is not necessary for any product or method of the present application to achieve all of the above advantages at the same time.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is a flow chart of an automated threat information extraction method based on a double-layer convolutional neural network as embodied by the present application;

FIG. 2 is another flow chart of an automated threat information extraction method based on a double-layer convolutional neural network as implemented in the present application;

FIG. 3 is a diagram illustrating an example of an automated threat information extraction apparatus based on a double-layer convolutional neural network according to an embodiment of the present disclosure;

FIG. 4 is a flow chart of a sentence classifier implemented in the present application;

FIG. 5 is a diagram illustrating an example of an automated threat information extraction apparatus based on a double-layer convolutional neural network implemented in the present application;

FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Therefore, according to the automatic threat information extraction method based on the double-layer convolutional neural network, the document in the data to be analyzed can be identified through the network model, whether threat information exists or not is judged, and meanwhile, the rest parts in the document to be analyzed can be identified according to the judgment result, so that the threat information can be conveniently extracted, the extraction efficiency of the threat information is improved, and the labor cost is reduced.

Referring to fig. 1, fig. 1 is a flowchart of an automated threat intelligence extraction method based on a double-layer convolutional neural network implemented in the present application, including:

step S11, a sentence to be analyzed is acquired.

The statements to be analyzed may be statements in a designated database. They may also be obtained by using a preset crawler to monitor a list of websites related to security information and fetch articles such as blogs and reports updated on those websites, or by using an APT (Advanced Persistent Threat) report collector to monitor the latest APT reports and white papers in an APT database.

After the statements to be analyzed are obtained, they may be stored in a preset storage database. For example, text obtained by the blog and forum article crawler and by the APT report collector may be stored in an external NoSQL (Not Only SQL, i.e., non-relational) database.

Optionally, obtaining the statement to be analyzed includes: and obtaining the sentence to be analyzed from the target database through a preset crawler program.

The automatic threat information extraction method based on the double-layer convolutional neural network is directed at threat intelligence residing in an intelligent terminal device, or at threat intelligence in a network that is obtained through an intelligent terminal. The method can therefore be executed by an intelligent terminal device, which may specifically be a computer, a server, or the like.

And step S12, preprocessing the statement to be analyzed to obtain a data matrix with a preset format.

The sentences to be analyzed can be preprocessed with the tools in NLTK (Natural Language Toolkit), which perform word segmentation, cleaning, stop-word removal, stemming, and similar operations on each sentence. The sentence classifier based on the double-layer convolutional neural network then feeds the preprocessed sentences into a BERT (Bidirectional Encoder Representations from Transformers) word embedding tool, which extracts word vectors, to obtain a data matrix in a preset format. The preset format can be any format.

For example, after the sentences to be analyzed are acquired, b sentences are taken from a data set consisting of multiple sentences to be analyzed. Word segmentation is performed on each sentence and the number of tokens in each sentence is counted. After the token counts of all sentences have been tallied, a threshold w is set as the length of the sentence vector. When a sentence has fewer than w tokens, its sentence vector is padded to length w; when a sentence has more than w tokens, the part beyond w is deleted and only the part within length w is kept. After the sentence vector is obtained, each token in the sentence is converted into an e-dimensional vector through a word embedding operation with the BERT model.
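The preprocessing operations named above (word segmentation, cleaning, stop-word removal, stemming) can be illustrated with a minimal sketch. It deliberately uses only the Python standard library as a stand-in for NLTK; the stop-word list, the `naive_stem` helper, and the `preprocess` function are hypothetical simplifications, not the tools the application actually uses.

```python
import re
from typing import List

# Hypothetical, deliberately tiny stop-word list; NLTK ships a much larger one.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of", "and", "in", "on"}


def naive_stem(token: str) -> str:
    """Crude suffix stripping, a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(sentence: str) -> List[str]:
    """Segment, clean, remove stop words from, and stem one sentence."""
    # Lower-casing plus the regex performs segmentation and drops punctuation.
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    # Stop-word removal.
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # Stemming.
    return [naive_stem(t) for t in tokens]


tokens = preprocess("The attackers are exploiting a known vulnerability.")
print(tokens)  # ['attacker', 'exploit', 'known', 'vulnerability']
```

A production pipeline would replace the regex and the suffix rules with a proper tokenizer and stemmer; the sketch only shows the order of the operations.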

Let W_i be the e-dimensional word embedding vector of the i-th token in a sentence. A sentence consisting of w tokens can then be expressed as:

W_{1:w} = W_1 ⊕ W_2 ⊕ ... ⊕ W_w

where ⊕ denotes the concatenation operation.

The model is trained with the mini-batch gradient descent algorithm, so b sentences are stacked together as input to obtain a three-dimensional matrix of scale b × e × w; this matrix is the input matrix I.
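The padding, truncation, and stacking described above can be sketched as follows. This is an illustration under stated assumptions: the `embed` function is a hypothetical pseudo-random stand-in for the BERT word embedding, and the `[PAD]` token and function names are not taken from the application.

```python
import numpy as np


def pad_or_truncate(tokens, w, pad_token="[PAD]"):
    """Force a token list to exactly w entries (pad if short, cut if long)."""
    return (tokens + [pad_token] * w)[:w]


def embed(token, e):
    """Hypothetical stand-in for BERT: a deterministic pseudo-random e-dim vector."""
    seed = sum(ord(ch) for ch in token)
    return np.random.default_rng(seed).standard_normal(e)


def build_input_matrix(sentences, w, e):
    """Stack b tokenized sentences into an input matrix I of shape (b, e, w)."""
    batch = []
    for tokens in sentences:
        fixed = pad_or_truncate(tokens, w)
        # Columns are the word vectors W_1 ... W_w, giving an (e, w) matrix.
        batch.append(np.stack([embed(t, e) for t in fixed], axis=1))
    return np.stack(batch, axis=0)


sentences = [["malware", "drops", "payload"], ["apt"] * 10]
I = build_input_matrix(sentences, w=6, e=8)
print(I.shape)  # (2, 8, 6)
```

The first sentence is padded from 3 to 6 tokens and the second is cut from 10 to 6, so both contribute an 8 × 6 slice to the batch.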

Step S13, inputting the data matrix into a first convolution layer of a pre-trained double-layer convolution neural network, performing convolution operation on the data matrix through a first preset convolution kernel with a first preset number to obtain a first to-be-output matrix with a first preset number, and performing average pooling operation on the data matrix to obtain a first feature matrix with a first preset number.

The convolution in the first layer may proceed as follows. Five convolution kernels of different sizes are selected in the first convolution layer; each is a three-dimensional matrix of scale b × e × k that is convolved with the three-dimensional input matrix I. Four of the kernels have k = 5, 4, 3, and 2 respectively; the fifth also has k = 3 but uses dilated (atrous) convolution with a dilation rate d = 1 to enlarge its effective size. Each convolution kernel has 4 output channels. Each kernel is convolved with I along the w dimension, and the result is passed through an activation function for nonlinear transformation, finally yielding a three-dimensional feature matrix of scale b × 4 × w for each kernel. The feature matrices obtained under the five convolution kernels of the first convolution layer are the first matrices to be output.

The average pooling operation on the data matrix may be performed along the e dimension. The resulting average values form a three-dimensional feature matrix of scale b × 1 × w, i.e., the first feature matrix.

And step S14, splicing the first matrix to be output and the first characteristic matrix to obtain a first output matrix.

The first matrices to be output and the first feature matrix are spliced to obtain the first output matrix. That is, the feature matrices obtained by convolving I with each of the five convolution kernels and the feature matrix obtained by one-dimensional average pooling of I are spliced together. The final result of the first convolution layer is a three-dimensional feature matrix of scale b × 21 × w (5 kernels × 4 channels, plus 1 channel from average pooling), i.e., the first output matrix M.

The first to-be-output matrix can reflect local characteristics of the data matrix, and the first feature matrix with a first preset number is obtained by performing average pooling operation on the data matrix, so that the feature matrix can have global features. Therefore, the first matrix to be output is spliced with the first feature matrix, and the obtained first output matrix covers both local features and overall features.
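Steps S13 and S14 can be sketched in NumPy as below. Several details are assumptions rather than statements of the application: zero padding is used so that the w dimension is preserved, ReLU is used as the unspecified activation function, the weights are random, and the text's dilation rate d = 1 is interpreted as one gap between kernel taps (a `dilation` of 2 in the convention used here).

```python
import numpy as np


def conv1d_same(x, weight, dilation=1):
    """1-D convolution along the last axis, zero-padded to keep length w.

    x: (b, c_in, w); weight: (c_out, c_in, k); returns (b, c_out, w).
    """
    b, c_in, w = x.shape
    c_out, _, k = weight.shape
    span = dilation * (k - 1)  # extent covered beyond the first tap
    left = span // 2
    padded = np.pad(x, ((0, 0), (0, 0), (left, span - left)))
    out = np.zeros((b, c_out, w))
    for j in range(k):  # accumulate one kernel tap at a time
        window = padded[:, :, j * dilation : j * dilation + w]
        out += np.einsum("oc,bcw->bow", weight[:, :, j], window)
    return out


def first_layer(I, kernels):
    """kernels: list of (weight, dilation) pairs, one per convolution kernel."""
    # ReLU is an assumed activation; the text only says "an activation function".
    conv_outputs = [np.maximum(conv1d_same(I, wgt, d), 0.0) for wgt, d in kernels]
    # Average pooling along the e dimension gives a (b, 1, w) feature matrix.
    avg = I.mean(axis=1, keepdims=True)
    # Splice local (convolution) and global (pooling) features along the channel axis.
    return np.concatenate(conv_outputs + [avg], axis=1)


rng = np.random.default_rng(0)
b, e, w = 2, 8, 6
I = rng.standard_normal((b, e, w))
# k = 5, 4, 3, 2 plus one dilated k = 3 kernel, each with 4 output channels.
specs = [(5, 1), (4, 1), (3, 1), (2, 1), (3, 2)]
kernels = [(rng.standard_normal((4, e, k)) * 0.1, d) for k, d in specs]
M = first_layer(I, kernels)
print(M.shape)  # (2, 21, 6)
```

The channel count follows the arithmetic in the text: 5 kernels × 4 channels from the convolutions, plus 1 channel from average pooling.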

Step S15, inputting the first output matrix into a second convolution layer of the double-layer convolution neural network, performing convolution operation on the first output matrix through a second preset convolution kernel with a second preset number to obtain a second preset number of second to-be-output matrices, performing maximum pooling operation on the second preset number of second to-be-output matrices to obtain a second preset number of second feature matrices, and performing global average pooling operation on the data matrix to obtain a third feature matrix.

The convolution in the second layer may proceed as follows. Three convolution kernels with k = 1, 2, and 3 respectively are selected in the second convolution layer, each being a three-dimensional matrix of scale b × e × k, and the feature matrix M is convolved with each of them. Each convolution kernel has two output channels. Each kernel is convolved with M along the w dimension, the convolution result is passed through an activation function for nonlinear transformation, and the channels are spliced together, finally yielding a three-dimensional feature matrix of scale b × 2 × w for each kernel, i.e., a second matrix to be output.

The maximum pooling operation on the second preset number of second matrices to be output may be a one-dimensional maximum pooling along the w dimension, applied to the feature matrix obtained from each convolution kernel. The resulting maximum values form a feature matrix of scale b × 2 × 1 for each kernel, i.e., a second feature matrix.

In other words, for the feature matrix under a given convolution kernel, the vector with the largest weight along the word dimension is selected to replace the sentence vector of that feature matrix, thereby reducing dimensionality.

And step S16, splicing the second feature matrixes with a second preset number with the third feature matrixes to obtain second output matrixes.

The second preset number of second feature matrices are spliced with the third feature matrix to obtain the second output matrix. That is, the feature matrices obtained by convolving M with each of the three convolution kernels of the second convolution layer and then applying maximum pooling are spliced with the feature matrix A⁽²⁾ obtained by applying global average pooling to I, yielding a three-dimensional feature matrix of scale b × 7 × 1 (3 kernels × 2 channels, plus 1 channel from global average pooling), i.e., the second output matrix O.

The second output matrix O retains the spatial information and semantic information extracted by each convolution layer and each pooling layer, so that the detection accuracy of the model can be improved.
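Steps S15 and S16 can be sketched in the same style. As before, this is an illustration under assumptions: zero padding, ReLU as the activation function, and random weights and inputs are chosen for the example and are not specified by the application.

```python
import numpy as np


def conv1d_same(x, weight):
    """1-D convolution along the last axis, zero-padded to keep length w.

    x: (b, c_in, w); weight: (c_out, c_in, k); returns (b, c_out, w).
    """
    b, c_in, w = x.shape
    c_out, _, k = weight.shape
    left = (k - 1) // 2
    padded = np.pad(x, ((0, 0), (0, 0), (left, k - 1 - left)))
    out = np.zeros((b, c_out, w))
    for j in range(k):
        out += np.einsum("oc,bcw->bow", weight[:, :, j], padded[:, :, j : j + w])
    return out


def second_layer(M, I, kernels):
    """M: (b, 21, w) first output matrix; I: (b, e, w) input matrix."""
    pooled = []
    for wgt in kernels:
        feat = np.maximum(conv1d_same(M, wgt), 0.0)  # (b, 2, w), ReLU assumed
        # One-dimensional max pooling along w gives a (b, 2, 1) feature matrix.
        pooled.append(feat.max(axis=2, keepdims=True))
    # Global average pooling of the input matrix gives a (b, 1, 1) feature matrix.
    gap = I.mean(axis=(1, 2), keepdims=True)
    return np.concatenate(pooled + [gap], axis=1)


rng = np.random.default_rng(1)
b, e, w = 2, 8, 6
I = rng.standard_normal((b, e, w))
M = rng.standard_normal((b, 21, w))  # placeholder for the first output matrix
# Three kernels with k = 1, 2, 3 and two output channels each.
kernels = [rng.standard_normal((2, 21, k)) * 0.1 for k in (1, 2, 3)]
O = second_layer(M, I, kernels)
print(O.shape)  # (2, 7, 1)
```

The 7 channels are 3 kernels × 2 max-pooled channels plus 1 channel from global average pooling, matching the scale b × 7 × 1 given in the text.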

And step S17, inputting the second output matrix into the full connection layer of the double-layer convolutional neural network, and classifying the second output matrix to obtain a judgment result of whether the statement to be analyzed is a threat intelligence statement.

The second output matrix O is taken as input. It is first multiplied by a weight matrix of scale 2 × 7 and then normalized through an activation function, and a normalized feature matrix of scale b × 2 × 1 is output, representing the probability distribution of each sentence over the classification labels. The judgment result of whether the sentence to be analyzed is a threat intelligence statement is then obtained from this probability distribution.
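The fully connected step can be sketched as a matrix product followed by normalization. Softmax is assumed here as the normalizing activation function, which the text does not name, and the weights are random placeholders.

```python
import numpy as np


def fully_connected(O, weight):
    """O: (b, 7, 1); weight: (2, 7); returns probabilities of shape (b, 2, 1)."""
    logits = np.einsum("cf,bfo->bco", weight, O)  # (b, 2, 1)
    # Subtracting the per-sentence maximum keeps the exponentials stable.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    # Softmax over the two classification labels (assumed normalization).
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(2)
O = rng.standard_normal((3, 7, 1))       # placeholder second output matrix
weight = rng.standard_normal((2, 7))     # 2 x 7 weight matrix, random here
P = fully_connected(O, weight)
print(P.shape)  # (3, 2, 1)
```

Each sentence's two probabilities sum to 1, so `P[i, :, 0]` can be read directly as its distribution over the two labels.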

Optionally, inputting the second output matrix into a full connection layer of the double-layer convolutional neural network, classifying the second output matrix, and obtaining a judgment result of whether the statement to be analyzed is a threat intelligence statement, including: inputting the second output matrix into a full-connection layer of the double-layer convolutional neural network, and calculating probability distribution of the second output matrix corresponding to each preset classification; and obtaining a judgment result of whether the statement to be analyzed is a threat intelligence statement or not according to the probability distribution.

Referring to fig. 2, fig. 2 is another flowchart of an automated threat intelligence extraction method based on a double-layer convolutional neural network implemented in the present application, including:

and step S111, obtaining the sentence to be analyzed from the target database through a preset crawler program.

Obtaining the statements to be analyzed from the target database through a preset crawler program may consist of using the preset crawler program to continuously monitor websites related to security information, such as blogs and forum articles, and crawl the articles those websites update, and of using an APT (Advanced Persistent Threat) report collector to continuously monitor APTnotes (an APT database) and obtain the latest APT reports and white papers. This improves the timeliness of the statements to be analyzed and hence the validity of the analysis results.

Step S171, inputting the second output matrix into the full connection layer of the double-layer convolutional neural network, and calculating probability distribution of the second output matrix corresponding to each preset classification.

Step S172, obtaining the judgment result whether the statement to be analyzed is the threat intelligence statement according to the probability distribution.

In steps S171 and S172, the output matrix can be compared with preset standard classifications, and the similarity between the output matrix and each standard classification is calculated to obtain the probability distribution, from which the judgment result of whether the statement to be analyzed is a threat intelligence statement is obtained. This improves both the accuracy and the efficiency of classification.
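One simple way to turn a probability distribution into a judgment result is to pick the most probable label for each sentence. The sketch below does exactly that and does not compute any similarity to standard classifications; the label names and their order are hypothetical.

```python
import numpy as np

# Hypothetical label order: index 0 = not threat intelligence, index 1 = threat intelligence.
LABELS = ("not_threat_intelligence", "threat_intelligence")


def judge(probabilities):
    """probabilities: (b, 2, 1) array; returns one label per sentence."""
    indices = probabilities[:, :, 0].argmax(axis=1)
    return [LABELS[i] for i in indices]


P = np.array([[[0.9], [0.1]], [[0.2], [0.8]]])
results = judge(P)
print(results)  # ['not_threat_intelligence', 'threat_intelligence']
```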

Step S18, obtaining a pre-generated analysis report, wherein the analysis report is a relevant analysis report of threat intelligence generated by a third party.

Step S19, converting the analysis report and the judgment result into an analysis result in a specified format.

The pre-generated analysis report is a threat-intelligence-related analysis report generated by a third party. It may be text, other than the statements to be analyzed, that the preset crawler program obtains from the target database. The analysis report and the judgment result are converted into an analysis result in a specified format, so that the generated analysis result contains more comprehensive information. Acquiring third-party threat intelligence analysis reports also improves the efficiency with which analysis results are generated.
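The application does not say what the specified format is. As a sketch only, the conversion in step S19 could look like the following if JSON were chosen; the field names and the `to_analysis_result` function are hypothetical.

```python
import json


def to_analysis_result(sentence, is_threat, probability, report):
    """Merge one judgment result and a third-party report into a JSON string.

    JSON and all field names are assumptions made for this illustration.
    """
    record = {
        "sentence": sentence,
        "is_threat_intelligence": bool(is_threat),
        "probability": round(float(probability), 4),
        "third_party_report": report,
    }
    return json.dumps(record, ensure_ascii=False, sort_keys=True)


result = to_analysis_result(
    "The dropper contacts a command-and-control server.",
    is_threat=True,
    probability=0.93,
    report={"source": "example sandbox report", "summary": "network beaconing observed"},
)
print(result)
```

Keeping the judgment result and the report in one record is what allows the combined analysis result to carry both pieces of information.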

Referring to fig. 3, fig. 3 is a diagram illustrating an example of an automatic threat information extraction apparatus based on a double-layer convolutional neural network according to an embodiment of the present application, including:

a data collection module 201 and a threat intelligence identification module 202;

the data collection module 201 includes a crawler 2011, a report collector 2012, a storage database 2013, and an analysis detection system 2014:

A crawler 2011 that continuously monitors a list of websites related to security information through a purpose-written article crawler program and crawls the articles, such as blogs and reports, newly published on those websites.

A report collector 2012 that continuously monitors the latest APT reports and white papers in the APTnotes database through a purpose-written report crawler and downloads them.

A storage database 2013 that stores the text retrieved by the blog and forum article crawler and by the APT report collector in an external NoSQL database.

The analysis and detection system 2014 automatically analyzes malware and generates analysis reports by building a CMAS (Cuckoo Malware Analysis System, an open-source malware analysis system), and stores the analysis reports in an internal database of the system.

The threat intelligence identification module 202 is composed of a data preprocessor 2021, a sentence classifier 2022, and a threat intelligence recognizer 2023:

The data preprocessor 2021 takes part of the text data from the database and, by manual labeling, classifies each sentence into one of two categories, relevant or irrelevant to threat intelligence. It then preprocesses each sentence, performing word segmentation, cleaning, stop-word removal, stemming and the like, using the word segmentation tools in NLTK (Natural Language Toolkit).
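The preprocessing steps named above (word segmentation, cleaning, stop-word removal, stemming) can be sketched as follows. This is a standard-library stand-in rather than the NLTK pipeline the application uses: the stop-word set is a tiny illustrative sample, and `crude_stem` is a hypothetical suffix stripper that only approximates a real stemmer.

```python
import re

# Tiny illustrative stop-word set (an assumption; NLTK ships a much larger list).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "on", "with"}
SUFFIXES = ("ing", "ed", "es", "s")

def crude_stem(word):
    """Strip one common suffix, keeping at least three characters of the word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(sentence):
    """Lowercase, tokenize on alphanumerics, drop stop words, then stem."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

print(preprocess("The malware is connecting to a remote server"))
# ['malware', 'connect', 'remote', 'server']
```

In practice the tokenizer, stop-word list and stemmer would come from NLTK itself; the sketch only shows the order of the operations.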

As shown in fig. 4, which is the processing flow chart of the sentence classifier 2022, the collected text data, after passing through the data preprocessor, is fed into the BERT (Bidirectional Encoder Representations from Transformers) word embedding tool to extract word vectors. The resulting word vector matrix is then input into a double-layer CNN (Convolutional Neural Network), and a sentence classifier based on the double-layer CNN is trained.

The sentence classifier 2022 includes:

A word embedding layer. The purpose of the word embedding layer is to convert every word instance in a sentence into its word vector (word embedding vector). First, b sentences are taken from the data set, word segmentation is performed on each sentence, and the number of words in each sentence is counted. Once the distribution of word counts over all sentences is known, a threshold w is set as the dimension of the sentence vector. When the number of word instances in a sentence is smaller than w, the sentence vector is padded to size w; when it is larger than w, the part beyond w is deleted and only the part within w is kept. After the sentence vector is obtained, each word instance in the sentence is converted into an e-dimensional vector through word embedding with the BERT model.

The model is trained with a mini-batch gradient descent algorithm, so b sentences are spliced together as the input, which gives a three-dimensional matrix with the scale b × e × w; this matrix is the input matrix I.
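The padding and truncation to the threshold w can be sketched in a few lines. The helper names `pad_or_truncate` and `build_batch` and the `[PAD]` token are assumptions made for illustration; whitespace splitting stands in for the word segmentation step.

```python
def pad_or_truncate(tokens, w, pad_token="[PAD]"):
    """Return exactly w tokens: cut long sentences, pad short ones."""
    return tokens[:w] + [pad_token] * max(0, w - len(tokens))

def build_batch(sentences, w):
    """Tokenize b sentences and bring each to length w."""
    return [pad_or_truncate(s.split(), w) for s in sentences]

batch = build_batch(
    [
        "malware contacts remote server",
        "attacker sends phishing email to finance staff",
    ],
    w=5,
)
for row in batch:
    print(row)
# ['malware', 'contacts', 'remote', 'server', '[PAD]']
# ['attacker', 'sends', 'phishing', 'email', 'to']
```

Embedding each of the w tokens as an e-dimensional vector and stacking the b sentences then yields the b × e × w input matrix I described above.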

In the first convolutional layer, five convolution kernels of different sizes (k) are first selected; each is a three-dimensional matrix with the scale b × e × k and is used to perform a convolution operation on the input matrix I. Four of the kernels have k equal to 5, 4, 3 and 2, respectively; the fifth has k equal to 3 and additionally uses a dilated (hole) convolution with a dilation rate (d) of 1 to enlarge the effective size of the kernel. Each convolution kernel has 4 output channels. Each kernel is convolved with I along the w dimension, the convolution result is passed through an activation function for nonlinear transformation, and a three-dimensional feature matrix with the scale b × 4 × w is finally obtained under each kernel.

These five matrices are the feature matrices obtained under the five convolution kernels in the first convolutional layer. Next, a one-dimensional average pooling operation is performed on I along the e dimension. The feature value obtained after the average pooling is a three-dimensional feature matrix with the scale b × 1 × w, and the average pooling operation gives the feature matrix global features.

Finally, the feature matrices obtained from I through the five convolution kernels and the feature matrix obtained from I through one-dimensional average pooling are spliced into a three-dimensional feature matrix with the scale b × 21 × w (5 × 4 + 1 = 21), which is the final result of the first convolutional layer; this matrix is the intermediate matrix M.
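The shape bookkeeping of the first convolutional layer can be checked with a small NumPy sketch. Several points are assumptions rather than statements of the application: the activation function is taken to be ReLU, zero padding is used so that the w dimension is preserved, the kernels are stored as (output channels, e, k) arrays, and the dilation rate d = 1 is read as one gap between kernel taps (a tap spacing of 2). The function names are hypothetical.

```python
import numpy as np

def conv1d_same(x, weight, spacing=1):
    """x: (b, c_in, w), weight: (c_out, c_in, k) -> (b, c_out, w), zero-padded."""
    b, _, w = x.shape
    c_out, _, k = weight.shape
    span = (k - 1) * spacing              # distance between first and last tap
    left = span // 2
    padded = np.pad(x, ((0, 0), (0, 0), (left, span - left)))
    out = np.zeros((b, c_out, w))
    for j in range(k):                    # accumulate one kernel tap at a time
        window = padded[:, :, j * spacing : j * spacing + w]
        out += np.einsum("oc,bcw->bow", weight[:, :, j], window)
    return out

def first_layer(I, kernels):
    """kernels: list of (weight, spacing). Returns M with 4 * len(kernels) + 1 channels."""
    feats = [np.maximum(conv1d_same(I, wgt, s), 0.0) for wgt, s in kernels]  # ReLU assumed
    avg = I.mean(axis=1, keepdims=True)   # average pooling along e -> (b, 1, w)
    return np.concatenate(feats + [avg], axis=1)

rng = np.random.default_rng(0)
b, e, w = 2, 8, 10
I = rng.standard_normal((b, e, w))
specs = [(5, 1), (4, 1), (3, 1), (2, 1), (3, 2)]   # (k, tap spacing); last one is dilated
kernels = [(rng.standard_normal((4, e, k)) * 0.1, s) for k, s in specs]
M = first_layer(I, kernels)
print(M.shape)  # (2, 21, 10)
```

The 21 channels are the 5 × 4 convolution channels plus the single average-pooled channel, matching the b × 21 × w scale of the intermediate matrix M.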

In the second convolutional layer, three convolution kernels with k equal to 1, 2 and 3 are selected; each is a three-dimensional matrix with the scale b × e × k. The feature matrix M obtained from the first convolutional layer is convolved with each of the three kernels. Two filters are set for each kernel size; each filter is convolved with the feature matrix M along the w dimension, the convolution result is passed through an activation function for nonlinear transformation, and the results are spliced into a three-dimensional feature matrix with the scale b × 2 × w under that kernel.

These three matrices are the feature matrices obtained under the three convolution kernels in the second convolutional layer. A maximum pooling operation is then applied to the feature matrix obtained under each convolution kernel; the feature value obtained after the maximum pooling is a feature matrix with the scale b × 2 × 1.

Finally, the feature matrices obtained from the intermediate matrix M through the three convolution kernels of the second convolutional layer and the maximum pooling operation are spliced with the feature matrix A⁽²⁾ obtained from I through a global average pooling operation, giving a three-dimensional feature matrix with the scale b × 7 × 1 (3 × 2 + 1 = 7) as the final result; this matrix is called the output matrix O.
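The second convolutional layer can be sketched in the same way. The assumptions here are ReLU as the activation function, zero padding along w, and kernels stored as (2, 21, k) arrays so that they span the 21 channels of M; the text gives the kernel scale as b × e × k, so this layout is an interpretation, not the stated method. A random array stands in for the intermediate matrix M.

```python
import numpy as np

def conv1d_same(x, weight):
    """x: (b, c_in, w), weight: (c_out, c_in, k) -> (b, c_out, w), zero-padded."""
    w, k = x.shape[2], weight.shape[2]
    left = (k - 1) // 2
    padded = np.pad(x, ((0, 0), (0, 0), (left, k - 1 - left)))
    return sum(
        np.einsum("oc,bcw->bow", weight[:, :, j], padded[:, :, j : j + w])
        for j in range(k)
    )

def second_layer(M, I, kernels):
    """Convolve M, max-pool along w, and splice with the global average of I."""
    pooled = []
    for wgt in kernels:
        feat = np.maximum(conv1d_same(M, wgt), 0.0)        # (b, 2, w), ReLU assumed
        pooled.append(feat.max(axis=2, keepdims=True))     # max pooling -> (b, 2, 1)
    gap = I.mean(axis=(1, 2), keepdims=True)               # global average -> (b, 1, 1)
    return np.concatenate(pooled + [gap], axis=1)          # (b, 7, 1)

rng = np.random.default_rng(0)
b, e, w = 2, 8, 10
I = rng.standard_normal((b, e, w))
M = rng.standard_normal((b, 21, w))        # stand-in for the first-layer output
kernels = [rng.standard_normal((2, 21, k)) * 0.1 for k in (1, 2, 3)]
O = second_layer(M, I, kernels)
print(O.shape)  # (2, 7, 1)
```

The 7 channels are the 3 × 2 max-pooled channels plus the single global-average channel, matching the b × 7 × 1 scale of the output matrix O.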

The output matrix O is first multiplied by a weight matrix with the scale 2 × 7 and then normalized by an activation function (the Softmax normalized exponential function, y = Softmax(wx + b)), giving a normalized feature matrix with the scale b × 2 × 1 that represents the probability distribution of each sentence over the classification labels.
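The classification step y = Softmax(wx + b) can be illustrated as follows. The weights are random placeholders rather than trained values, the function name `classify` is hypothetical, and the label order (index 1 meaning "threat intelligence") is an assumption.

```python
import numpy as np

def classify(O, W, bias):
    """O: (b, 7, 1), W: (2, 7), bias: (2,) -> probabilities of shape (b, 2, 1)."""
    logits = np.einsum("cf,bfo->bco", W, O) + bias.reshape(1, 2, 1)
    logits = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
O = rng.standard_normal((3, 7, 1))     # three sentences, stand-in output matrix
W = rng.standard_normal((2, 7))        # untrained placeholder weights
bias = np.zeros(2)
probs = classify(O, W, bias)
labels = probs[:, :, 0].argmax(axis=1)  # assumed: 1 = threat intelligence statement
print(probs.shape)  # (3, 2, 1)
```

Each sentence receives two probabilities that sum to 1, and the larger one determines the judgment result.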

The threat intelligence recognizer 2023 takes the text data other than that used for data preprocessing out of the database and feeds it into the sentence classifier based on the double-layer CNN, which directly separates threat intelligence statements from non-threat-intelligence statements.

Referring to fig. 5, fig. 5 is a diagram illustrating an example of an automated threat intelligence extraction apparatus based on a double-layer convolutional neural network according to an embodiment of the present application, including:

a statement obtaining module 501, configured to obtain a statement to be analyzed;

a preprocessing module 502, configured to preprocess a statement to be analyzed to obtain a data matrix in a preset format;

the first convolution module 503 is configured to input the data matrix into a first convolution layer of a pre-trained double-layer convolutional neural network, perform convolution operation on the data matrix through a first preset number of first preset convolution kernels to obtain a first preset number of first to-be-output matrices, and perform average pooling operation on the data matrix to obtain a first preset number of first feature matrices;

a first splicing module 504, configured to splice the first to-be-output matrix and the first feature matrix to obtain a first output matrix;

a second convolution module 505, configured to input the first output matrix into a second convolution layer of the double-layer convolutional neural network, perform convolution operation on the first output matrix through a second preset number of second preset convolution kernels to obtain a second preset number of second to-be-output matrices, perform maximum pooling operation on the second preset number of second to-be-output matrices to obtain a second preset number of second feature matrices, and perform global average pooling operation on the data matrix to obtain a third feature matrix;

a second splicing module 506, configured to splice a second feature matrix of a second preset number with the third feature matrix to obtain a second output matrix;

and a result output module 507, configured to input the second output matrix into a full connection layer of the double-layer convolutional neural network, classify the second output matrix, and obtain a determination result of whether the statement to be analyzed is a threat intelligence statement.

Optionally, the result output module 507 includes:

Optionally, the statement obtaining module 501 includes:

Optionally, the apparatus further comprises:

preprocessing a sample statement to obtain a data matrix with a preset format;

Therefore, the automatic threat information extraction device based on the double-layer convolutional neural network can identify the document in the data to be analyzed through the network model, judge whether threat information exists or not, and identify the rest parts in the document to be analyzed according to the judgment result, so that the threat information can be conveniently extracted, the extraction efficiency of the threat information is improved, and the labor cost is reduced.

The embodiment of the present application further provides an electronic device, as shown in fig. 6, which includes a processor 601, a communication interface 602, a memory 603, and a communication bus 604, where the processor 601, the communication interface 602, and the memory 603 communicate with one another through the communication bus 604;

a memory 603 for storing a computer program;

the processor 601 is configured to implement the following steps when executing the program stored in the memory 603:

obtaining a statement to be analyzed;

Optionally, the processor is configured to implement any one of the above automated threat information extraction methods based on a double-layer convolutional neural network when executing a program stored in the memory.

The communication bus mentioned in the electronic device may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus.

The communication interface is used for communication between the electronic equipment and other equipment.

The Memory may include a RAM (Random Access Memory) or an NVM (Non-Volatile Memory), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.

The processor may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

In yet another embodiment provided by the present application, a computer-readable storage medium is further provided, which has instructions stored therein, and when the instructions are executed on a computer, the instructions cause the computer to execute any one of the above-mentioned automated threat intelligence extraction methods based on a double-layer convolutional neural network.

In yet another embodiment provided by the present application, there is also provided a computer program product containing instructions that, when run on a computer, cause the computer to perform any of the above-described embodiments of automated threat intelligence extraction methods based on a double-layer convolutional neural network.

It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

The embodiments in the present specification are described in a related manner; the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, since the apparatus embodiment is substantially similar to the method embodiment, its description is relatively brief, and for the relevant points reference may be made to the corresponding description of the method embodiment.

The above description is only for the preferred embodiment of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application are included in the protection scope of the present application.

Claims

1. An automatic threat information extraction method based on a double-layer convolutional neural network is characterized by comprising the following steps:

obtaining a statement to be analyzed;

preprocessing the statement to be analyzed to obtain a data matrix with a preset format;

inputting the data matrix into a first convolution layer of a pre-trained double-layer convolution neural network, performing convolution operation on the data matrix through a first preset convolution kernel with a first preset number to obtain a first to-be-output matrix with a first preset number, and performing average pooling operation on the data matrix to obtain a first feature matrix with a first preset number;

inputting the first output matrix into a second convolution layer of the double-layer convolution neural network, performing convolution operation on the first output matrix through a second preset convolution kernel with a second preset number to obtain a second preset number of second to-be-output matrices, performing maximum pooling operation on the second preset number of second to-be-output matrices to obtain a second preset number of second feature matrices, and performing global average pooling operation on the data matrix to obtain a third feature matrix;

splicing the second feature matrixes of the second preset number with the third feature matrixes to obtain second output matrixes;

and inputting the second output matrix into a full connection layer of a double-layer convolutional neural network, and classifying the second output matrix to obtain a judgment result of whether the statement to be analyzed is a threat intelligence statement.

2. The method according to claim 1, wherein the inputting the second output matrix into a full connection layer of a double-layer convolutional neural network, and classifying the second output matrix to obtain a result of determining whether the statement to be analyzed is a threat intelligence statement, includes:

3. The method of claim 1, wherein obtaining the statement to be analyzed comprises:

and acquiring the statement to be analyzed from a target database through a preset crawler program.

4. The method according to claim 3, wherein the second output matrix is input to a full connection layer of a double-layer convolutional neural network, and the second output matrix is classified to obtain a result of determining whether the statement to be analyzed is a threat intelligence statement, and the method further comprises:

5. The method of claim 1, wherein the training process of the pre-trained double-layer convolutional neural network comprises:

preprocessing the sample statement to obtain a data matrix with a preset format;

inputting the data matrix into a first convolution layer of a double-layer convolution neural network to be trained, performing convolution operation on the data matrix through a first preset convolution kernel with a first preset number to obtain a first to-be-output matrix with a first preset number, and performing average pooling operation on the data matrix to obtain a first feature matrix with a first preset number;

inputting the second output matrix into a full connection layer of a double-layer convolutional neural network to be trained, and classifying the second output matrix to obtain a judgment result of whether the statement to be analyzed is a threat intelligence statement;

6. An automatic threat intelligence extraction apparatus based on a double-layer convolutional neural network, characterized by comprising:

the first convolution module is used for inputting the data matrix into a first convolution layer of a pre-trained double-layer convolution neural network, performing convolution operation on the data matrix through a first preset convolution kernel with a first preset number to obtain a first to-be-output matrix with the first preset number, and performing average pooling operation on the data matrix to obtain a first feature matrix with the first preset number;

the second splicing module is used for splicing the second feature matrixes of the second preset number with the third feature matrixes to obtain second output matrixes;

and the result output module is used for inputting the second output matrix into a full connection layer of the double-layer convolutional neural network, classifying the second output matrix and obtaining a judgment result of whether the statement to be analyzed is a threat intelligence statement.

7. The apparatus of claim 6, wherein the result output module comprises:

the probability distribution submodule is used for inputting the second output matrix into the full-connection layer of the double-layer convolutional neural network and calculating the probability distribution of the second output matrix corresponding to each preset classification;

8. The apparatus of claim 6, wherein the sentence acquisition module comprises:

and the crawler program submodule is used for acquiring the statement to be analyzed from the target database through a preset crawler program.

9. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete communication with each other through the communication bus;

a memory for storing a computer program;

a processor for implementing the method steps of any of claims 1-5 when executing the computer program stored in the memory.

10. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of the claims 1-5.