CN112511546A

CN112511546A - Vulnerability scanning method, device, equipment and storage medium based on log analysis

Info

Publication number: CN112511546A
Application number: CN202011396275.3A
Authority: CN
Inventors: 刘伟雄; 李泳权
Original assignee: Guangzhou Wonfone Technology Co ltd
Current assignee: Guangzhou Wonfone Technology Co ltd
Priority date: 2020-12-03
Filing date: 2020-12-03
Publication date: 2021-03-16

Abstract

The embodiment of the application discloses a vulnerability scanning method, device, equipment and storage medium based on log analysis. The method comprises the steps of obtaining a sample log file corresponding to sample vulnerability data, and analyzing the sample log file to obtain sample characteristic data; training a first sub-classifier according to the sample characteristic data set, wherein the first sub-classifier is used for outputting the probability of the sample log file corresponding to each vulnerability attack type; training a second sub-classifier according to the output result of the first sub-classifier, wherein the second sub-classifier is used for confirming the vulnerability attack type corresponding to the sample log file according to the probability; integrating the first sub-classifier and the second sub-classifier into a vulnerability classifier; and inputting the analysis result of the log file to be detected into the vulnerability classifier for classification detection so as to confirm the vulnerability attack type corresponding to the log file to be detected. The vulnerability attack type can be quickly and accurately identified, so that a targeted defense strategy is provided.

Description

Vulnerability scanning method, device, equipment and storage medium based on log analysis

Technical Field

The embodiment of the application relates to the field of computer networks, in particular to a vulnerability scanning method, device, equipment and storage medium based on log analysis.

Background

With the development of the internet, the number of network users keeps increasing, and so does the number of host devices connected to the internet. At the same time, attacks that exploit vulnerabilities in hosts have become more common, making the security of host devices increasingly important. To guarantee the safe operation of a host to the greatest extent, attack behavior against the host needs to be monitored in real time so that danger is discovered in time.

As technology continues to advance, attacks grow more complex and their intent is hidden more deeply. With existing attack discovery approaches, an attack is in most cases discovered only after it has had consequences, so discovery is not timely enough; and even when an attack is discovered, its type cannot be accurately judged, so targeted defense cannot be made in time.

Disclosure of Invention

The embodiment of the application provides a vulnerability scanning method, device, equipment and storage medium based on log analysis, and aims to solve the technical problems that attack behaviors are not discovered in a timely and accurate manner and that targeted defense cannot be performed.

In a first aspect, an embodiment of the present invention provides a vulnerability scanning method based on log analysis, including:

obtaining a sample log file corresponding to sample vulnerability data, and analyzing the sample log file to obtain sample characteristic data;

training a first sub-classifier according to the sample characteristic data set, wherein the first sub-classifier is used for outputting the probability of the sample log file corresponding to each vulnerability attack type;

training a second sub-classifier according to the output result of the first sub-classifier, wherein the second sub-classifier is used for confirming the vulnerability attack type corresponding to the sample log file according to the probability;

integrating the first sub-classifier and the second sub-classifier into a vulnerability classifier;

and inputting the analysis result of the log file to be detected into the vulnerability classifier for classification detection so as to confirm the vulnerability attack type corresponding to the log file to be detected.

Further, the obtaining a sample log file corresponding to the sample vulnerability data, and analyzing the sample log file to obtain sample characteristic data includes:

obtaining a sample log file corresponding to sample vulnerability data from a vulnerability database;

and vectorizing the sample log file based on the text features and the statistical features to obtain sample feature data.

Further, the text features are extracted based on an N-Gram text model method; the statistical characteristics comprise length statistical characteristics, character statistical characteristics and keyword statistical characteristics.

Further, the step of inputting the analysis result of the log file to be detected into the vulnerability classifier for classification detection includes:

acquiring a log file to be tested from a data source to be tested;

vectorizing the log file to be tested based on the text features and the statistical features to obtain feature data to be tested;

and inputting the characteristic data to be detected into the vulnerability classifier for classification detection.

Further, the vulnerability attack types include: network attacks, system attacks, information attacks, and hardware attacks.

In a second aspect, an embodiment of the present invention provides a vulnerability scanning apparatus based on log analysis, including:

the sample obtaining unit is used for obtaining a sample log file corresponding to the sample vulnerability data and analyzing the sample log file to obtain sample characteristic data;

the first training unit is used for training a first sub-classifier according to the sample feature data set, and the first sub-classifier is used for outputting the probability that the sample log file corresponds to each vulnerability attack type;

the second training unit is used for training a second sub-classifier according to the output result of the first sub-classifier, and the second sub-classifier is used for confirming the vulnerability attack type corresponding to the sample log file according to the probability;

a classification integration unit, configured to integrate the first sub-classifier and the second sub-classifier into a vulnerability classifier;

and the detection classification unit is used for inputting the analysis result of the log file to be detected into the vulnerability classifier for classification detection so as to confirm the vulnerability attack type corresponding to the log file to be detected.

Further, the sample acquiring unit includes:

the log obtaining module is used for obtaining a sample log file corresponding to the sample vulnerability data from the vulnerability database;

and the characteristic extraction module is used for vectorizing the sample log file based on the text characteristic and the statistical characteristic to obtain sample characteristic data.

Further, the detection classification unit includes:

the data acquisition module is used for acquiring a log file to be detected from a data source to be detected;

the characteristic generating module is used for vectorizing the log file to be tested based on the text characteristic and the statistical characteristic to obtain characteristic data to be tested;

and the characteristic classification module is used for inputting the characteristic data to be detected into the vulnerability classifier for classification detection.

In a third aspect, an embodiment of the present invention further provides a terminal device, including:

one or more processors;

a memory for storing one or more programs;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the vulnerability scanning method based on log analysis as described in any one of the first aspect.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the vulnerability scanning method based on log analysis according to any one of the first aspect.

According to the vulnerability scanning method, device, equipment and storage medium based on log analysis, a sample log file corresponding to sample vulnerability data is obtained, and the sample log file is analyzed to obtain sample characteristic data; training a first sub-classifier according to the sample characteristic data set, wherein the first sub-classifier is used for outputting the probability of the sample log file corresponding to each vulnerability attack type; training a second sub-classifier according to the output result of the first sub-classifier, wherein the second sub-classifier is used for confirming the vulnerability attack type corresponding to the sample log file according to the probability; integrating the first sub-classifier and the second sub-classifier into a vulnerability classifier; and inputting the analysis result of the log file to be detected into the vulnerability classifier for classification detection so as to confirm the vulnerability attack type corresponding to the log file to be detected. By training the first sub-classifier and the second sub-classifier, the vulnerability classifier with high identification precision is formed by the first sub-classifier and the second sub-classifier, the vulnerability attack type can be quickly and accurately identified, and a targeted defense strategy is provided.

Drawings

Fig. 1 is a flowchart of a vulnerability scanning method based on log analysis according to an embodiment of the present disclosure;

Fig. 2 is a schematic structural diagram of a vulnerability scanning apparatus based on log analysis according to a second embodiment of the present application;

Fig. 3 is a schematic structural diagram of a terminal device according to a third embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more clear, the present invention is further described in detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are for purposes of illustration and not limitation. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.

It should be noted that, for the sake of brevity, this description does not exhaust all alternative embodiments, and it should be understood by those skilled in the art after reading this description that any combination of features may constitute an alternative embodiment as long as the features are not mutually inconsistent. Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.

Example one

Fig. 1 is a flowchart of a vulnerability scanning method based on log analysis according to an embodiment of the present invention. The vulnerability scanning method based on log analysis provided in the embodiment may be executed by an operating device corresponding to the vulnerability scanning method based on log analysis, the operating device may be implemented in a software and/or hardware manner, and the operating device may be composed of two or more physical entities or may be composed of one physical entity.

Referring to Fig. 1, the vulnerability scanning method based on log analysis specifically includes:

Step S110: obtaining a sample log file corresponding to the sample vulnerability data, and analyzing the sample log file to obtain sample characteristic data.

In this scheme, the sample vulnerability data used for training may be log files accumulated from vulnerability attacks discovered and responded to in daily operation, or log files from various vulnerability databases. To achieve broader identification, log files in vulnerability databases may be obtained as the sample vulnerability data.

In a specific implementation, step S110 may be implemented by step S111 and step S112.

Step S111: obtaining a sample log file corresponding to the sample vulnerability data from the vulnerability database.

The source databases of the vulnerability data in this scheme include, but are not limited to, the national information security vulnerability library, the national information security vulnerability sharing platform, the Symantec vulnerability library, the American national information security vulnerability library, the global information security vulnerability fingerprint library, the vulnerability library of Offensive Security, CVE (Common Vulnerabilities and Exposures), the Ucloud security vulnerability reporting platform, and the debug vulnerability library. In addition, so that normal data access can be compared and identified, the sample log files also include a plurality of log files corresponding to normal data access behaviors, which are used for comparative analysis against vulnerability attacks.

Step S112: vectorizing the sample log file based on the text features and the statistical features to obtain sample feature data.

In this scheme, the sample log file is vectorized based on text features to obtain sample feature data, for example through N-Gram. N-Gram is an algorithm based on a statistical language model. Its basic idea is to slide a window of size N over the text content byte by byte, forming a sequence of byte fragments of length N. Each byte fragment is called a gram. The occurrence frequency of all grams is counted and filtered according to a preset threshold to form a key gram list, which is the vector feature space of the text; each gram in the list is one dimension of the feature vector. The model is based on the assumption that the occurrence of the N-th word depends only on the preceding N-1 words and on no other words, and that the probability of a complete sentence is the product of the occurrence probabilities of its words. These probabilities can be obtained directly from the corpus by counting how many times N words occur together. Commonly used are the binary Bi-Gram model (also known as the 2-Gram model) and the ternary Tri-Gram model (also known as the 3-Gram model).
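The sliding-window vectorization described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the toy log lines and the threshold value are hypothetical and are not specified in the present application.

```python
from collections import Counter

def byte_ngrams(text: str, n: int) -> list:
    """Slide a window of n bytes over the UTF-8 encoding of text."""
    data = text.encode("utf-8")
    return [data[i:i + n] for i in range(len(data) - n + 1)]

def build_key_grams(corpus: list, n: int, min_count: int) -> list:
    """Count every gram in the corpus and keep those seen at least min_count times."""
    counts = Counter()
    for line in corpus:
        counts.update(byte_ngrams(line, n))
    # Sorting gives every gram a fixed dimension index in the feature space.
    return sorted(g for g, c in counts.items() if c >= min_count)

def vectorize(text: str, key_grams: list, n: int) -> list:
    """Map text to a count vector with one dimension per key gram."""
    counts = Counter(byte_ngrams(text, n))
    return [counts[g] for g in key_grams]

# Hypothetical toy log lines, not taken from the application.
corpus = ["GET /a?id=1", "GET /a?id=1' OR '1'='1"]
key_grams = build_key_grams(corpus, n=2, min_count=2)
vector = vectorize("GET /b?id=2", key_grams, n=2)
```

Grams that fall below the threshold get no dimension, so the length of every vector equals the length of the key gram list regardless of the input text.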

In the scheme, considering that the difference of the characteristic dimensions of different n-gram models is large, the characteristic dimension of a 1-gram model is obviously smaller than that of 2-gram and 3-gram, and in order to reduce the calculation amount and improve the calculation speed, only the 1-gram model is used for extracting the text characteristics.

The statistical characteristics comprise length statistical characteristics, character statistical characteristics and keyword statistical characteristics. The length statistical characteristics mainly refer to parameters of the access requests recorded in the sample log file, such as the length of the request path, the length of the request parameters and the number of request paths; more request-related parameters can be extracted according to actual classification requirements to obtain more accurate judgment results. The character statistical characteristics include the frequency of occurrence of various characters, character entropy, character attributes, and the like. The keyword statistical characteristics mainly count the code markers, recorded in the sample log file, that are used to inject abnormal code for intrusion. For example, in an SQL injection attack, single quotation marks and double quotation marks are frequently used to close quoted values in HTTP requests, so subsequent identification of vulnerability scanning behavior is realized by counting the keywords of each attack type.
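The three groups of statistical characteristics can be illustrated with a short sketch. The keyword list, the feature names and the reading of "number of request paths" as the number of path segments are assumptions made for illustration; the application does not enumerate them.

```python
import math
from collections import Counter

# Hypothetical keyword list for illustration only.
KEYWORDS = ("'", '"', "select", "union", "<script")

def char_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of the character distribution."""
    if not text:
        return 0.0
    total = len(text)
    probs = [c / total for c in Counter(text).values()]
    return sum(-p * math.log2(p) for p in probs)

def statistical_features(request: str) -> dict:
    """Length, character and keyword statistics for one request string."""
    path, _, query = request.partition("?")
    lowered = request.lower()
    features = {
        # Length statistical characteristics
        "path_length": len(path),
        "query_length": len(query),
        "path_segments": len([s for s in path.split("/") if s]),
        # Character statistical characteristics
        "digit_ratio": sum(ch.isdigit() for ch in request) / max(len(request), 1),
        "entropy": char_entropy(request),
    }
    # Keyword statistical characteristics: one count per keyword
    for kw in KEYWORDS:
        features["kw_" + kw] = lowered.count(kw)
    return features

feats = statistical_features("/item/view?id=1' UNION SELECT 1")
```

In practice these values would be concatenated with the N-Gram count vector to form the sample feature data.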

Step S120: training a first sub-classifier according to the sample characteristic data set, wherein the first sub-classifier is used for outputting the probability of the sample log file corresponding to each vulnerability attack type.

Step S130: training a second sub-classifier according to the output result of the first sub-classifier, wherein the second sub-classifier is used for confirming the vulnerability attack type corresponding to the sample log file according to the probability.

In this scheme, the vulnerability attack type is identified through two classification passes carried out by the first sub-classifier and the second sub-classifier respectively. Specifically, the output of the first sub-classifier is used to determine correct samples and erroneous samples: a sample whose predicted vulnerability attack type is consistent with the actual label of the sample log file is taken as a correct sample, and otherwise as an erroneous sample. In this way, the multi-class classification problem of the sample log files is converted into a binary classification problem, which improves the classification detection accuracy of the subsequent second sub-classifier. For example, suppose that for a normal sample log file the first sub-classifier outputs a probability of 31% for vulnerability attack type A, 25% for vulnerability attack type B, and 22% for normal access. When input to the second sub-classifier, the first two predictions, which are wrong, are erroneous samples, and the last prediction, which is right, is the correct sample. The second sub-classifier then makes its prediction using the sample characteristic data corresponding to the sample log file together with the candidate vulnerability attack types predicted by the first sub-classifier.
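The conversion from a multi-class output to binary samples can be sketched with the figures from the example above. The function name and the record layout are hypothetical; the application does not prescribe a data format.

```python
def to_binary_samples(probabilities: dict, true_label: str) -> list:
    """Turn one multi-class prediction into one binary sample per candidate type."""
    # Rank candidates from most to least probable, as the first stage would report them.
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    samples = []
    for rank, (candidate, prob) in enumerate(ranked, start=1):
        samples.append({
            "candidate": candidate,
            "probability": prob,
            "rank": rank,
            # 1 marks a correct sample, 0 an erroneous sample.
            "is_correct": int(candidate == true_label),
        })
    return samples

# Probabilities from the example in the text; the log file is actually normal access.
first_stage_output = {"A": 0.31, "B": 0.25, "normal": 0.22}
samples = to_binary_samples(first_stage_output, true_label="normal")
```

Each record, joined with the feature data of the log file, would then serve as one training row for the second sub-classifier.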

The effect of a single classification pass is limited by the features available to it. In this scheme, the probability of each vulnerability attack type is obtained through the first sub-classifier, and statistical features related to the categories are then added in the second sub-classifier, so that classification errors of the first sub-classifier can be corrected. In essence, the second sub-classifier learns a mapping that combines the feature data of a single log file with the features of its candidate vulnerability attack category; generally speaking, the correct type ranks near the top of the first sub-classifier's classification results. Through the two classification passes, log files that the first sub-classifier cannot judge correctly can be revised to the correct vulnerability attack type.

In the implementation of this scheme, the vulnerability attack types are summarized into the following categories: network attacks, system attacks, information attacks, and hardware attacks. It should be noted that this classification is only an exemplary choice for a specific implementation and does not exclude other classification manners; vulnerability attack types may be added or removed to obtain a classification scheme better suited to the application scenario and to technology updates. The specific vulnerability attack types are types of attacks already known in the prior art and are not described again here.

Step S140: integrating the first sub-classifier and the second sub-classifier into a vulnerability classifier.

The vulnerability classifier obtained by integrating the first sub-classifier and the second sub-classifier is the final training target based on the sample log files and is used for classifying the samples to be detected.
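One possible way to integrate the two sub-classifiers is to chain them behind a single prediction call, as in the sketch below. The class name, the callable signatures and the toy stand-in models are assumptions for illustration; the application does not specify how the integration is implemented.

```python
from typing import Callable, Dict, Sequence

Features = Sequence[float]

class VulnerabilityClassifier:
    """Chains a probability-producing first stage with a candidate-scoring second stage."""

    def __init__(self,
                 first: Callable[[Features], Dict[str, float]],
                 second: Callable[[Features, str, float], float]):
        self.first = first
        self.second = second

    def predict(self, features: Features) -> str:
        probabilities = self.first(features)
        # The second stage scores each (features, candidate, probability) triple;
        # the candidate with the highest score is returned.
        return max(probabilities,
                   key=lambda c: self.second(features, c, probabilities[c]))

# Hypothetical stand-ins for trained models.
def toy_first(features):
    return {"A": 0.31, "B": 0.25, "normal": 0.22}

def toy_second(features, candidate, probability):
    # Pretend feature 0 counts quote characters; with none present, favour "normal".
    bonus = 0.2 if candidate == "normal" and features[0] == 0 else 0.0
    return probability + bonus

clf = VulnerabilityClassifier(toy_first, toy_second)
label = clf.predict([0, 12])
```

With the toy models, the second stage overrides the first stage's top candidate when the input contains no quote characters, which mirrors the error-correcting role described for the second sub-classifier.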

Step S150: inputting the analysis result of the log file to be detected into the vulnerability classifier for classification detection so as to confirm the vulnerability attack type corresponding to the log file to be detected.

In this scheme, when a new access occurs, a corresponding log file is generated, and detection is performed based on the log file to obtain the corresponding judgment result.

In a specific implementation, step S150 may be implemented by steps S151-S153.

Step S151: acquiring the log file to be tested from the data source to be tested.

Each newly generated log file is put into the data source to be tested and is judged within a reasonable time limit determined by the security level and security requirements of the monitored object. Generally speaking, the higher the security level, the stricter the time limit; the strictest is real-time judgment, in which a log file is judged as soon as it is generated.
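Ordering pending log files by their judgment deadline can be sketched with a priority queue. The security levels, the time limits and the class name below are hypothetical; the application gives no concrete values or data structure.

```python
import heapq

# Hypothetical mapping from security level to judgment time limit in seconds;
# level 3 stands for real-time judgment. The application gives no concrete values.
TIME_LIMIT_SECONDS = {1: 3600.0, 2: 60.0, 3: 0.0}

class PendingLogs:
    """Data source to be tested, ordered by judgment deadline."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so equal deadlines keep arrival order

    def put(self, log_line: str, security_level: int, created_at: float) -> None:
        deadline = created_at + TIME_LIMIT_SECONDS[security_level]
        heapq.heappush(self._heap, (deadline, self._seq, log_line))
        self._seq += 1

    def pop_next(self) -> str:
        """Return the log whose deadline expires first."""
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)

queue = PendingLogs()
queue.put("GET /low", security_level=1, created_at=0.0)
queue.put("GET /high", security_level=3, created_at=5.0)
queue.put("GET /mid", security_level=2, created_at=1.0)
order = [queue.pop_next() for _ in range(len(queue))]
```

A log from a high-security object is handled first even though it arrived later, because its deadline is the earliest.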

Step S152: vectorizing the log file to be tested based on the text features and the statistical features to obtain feature data to be tested.

For the log file to be tested, the corresponding feature data to be tested is obtained in the same way as in the training process; the specific process is not repeated here.

Step S153: inputting the feature data to be tested into the vulnerability classifier for classification detection.

In the specific classification detection process, classification detection is completed by inputting the feature data to be tested into the vulnerability classifier. Identifying data to be tested with a trained model is conventional and is not described again here.

The sample log file corresponding to the sample vulnerability data is obtained, and the sample log file is analyzed to obtain sample characteristic data; training a first sub-classifier according to the sample characteristic data set, wherein the first sub-classifier is used for outputting the probability of the sample log file corresponding to each vulnerability attack type; training a second sub-classifier according to the output result of the first sub-classifier, wherein the second sub-classifier is used for confirming the vulnerability attack type corresponding to the sample log file according to the probability; integrating the first sub-classifier and the second sub-classifier into a vulnerability classifier; and inputting the analysis result of the log file to be detected into the vulnerability classifier for classification detection so as to confirm the vulnerability attack type corresponding to the log file to be detected. By training the first sub-classifier and the second sub-classifier, the vulnerability classifier with high identification precision is formed by the first sub-classifier and the second sub-classifier, the vulnerability attack type can be quickly and accurately identified, and a targeted defense strategy is provided.

Example two

Fig. 2 is a schematic structural diagram of a vulnerability scanning apparatus based on log analysis according to a second embodiment of the present disclosure, and referring to fig. 2, the vulnerability scanning apparatus based on log analysis includes a sample obtaining unit 210, a first training unit 220, a second training unit 230, a classification integrating unit 240, and a detection classifying unit 250.

The sample obtaining unit 210 is configured to obtain a sample log file corresponding to sample vulnerability data, and analyze the sample log file to obtain sample characteristic data; a first training unit 220, configured to train a first sub-classifier according to the sample feature data set, where the first sub-classifier is configured to output probabilities that the sample log file corresponds to each vulnerability attack type; a second training unit 230, configured to train a second sub-classifier according to an output result of the first sub-classifier, where the second sub-classifier is configured to determine, according to the probability, a vulnerability attack type corresponding to the sample log file; a classification integration unit 240, configured to integrate the first sub-classifier and the second sub-classifier into a vulnerability classifier; and the detection classification unit 250 is configured to input the analysis result of the log file to be detected to the vulnerability classifier for classification detection, so as to determine the vulnerability attack type corresponding to the log file to be detected.

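On the basis of the above embodiment, the sample acquiring unit 210 includes:

the log obtaining module, used for obtaining a sample log file corresponding to the sample vulnerability data from the vulnerability database;

and the characteristic extraction module, used for vectorizing the sample log file based on the text characteristic and the statistical characteristic to obtain sample characteristic data.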

On the basis of the embodiment, the text features are extracted based on an N-Gram text model method; the statistical characteristics comprise length statistical characteristics, character statistical characteristics and keyword statistical characteristics.

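On the basis of the above embodiment, the detection classification unit 250 includes:

the data acquisition module, used for acquiring a log file to be detected from a data source to be detected;

the characteristic generating module, used for vectorizing the log file to be tested based on the text characteristic and the statistical characteristic to obtain characteristic data to be tested;

and the characteristic classification module, used for inputting the characteristic data to be detected into the vulnerability classifier for classification detection.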

On the basis of the above embodiment, the vulnerability attack types include: network attacks, system attacks, information attacks, and hardware attacks.

The vulnerability scanning apparatus based on log analysis provided by the embodiment of the invention is contained in the terminal equipment, can be used for executing any vulnerability scanning method based on log analysis provided by the first embodiment of the invention, and has corresponding functions and beneficial effects.

It should be noted that, in the embodiment of the vulnerability scanning apparatus based on log analysis, each unit and each module included in the vulnerability scanning apparatus are only divided according to functional logic, but are not limited to the above division as long as the corresponding function can be realized; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.

Example three

Fig. 3 is a schematic structural diagram of a terminal device according to a third embodiment of the present invention, where the terminal device is a specific hardware presentation scheme of the operating device of the vulnerability scanning method based on log analysis. As shown in fig. 3, the terminal device includes a processor 310, a memory 320, an input device 330, an output device 340, and a communication device 350; the number of the processors 310 in the terminal device may be one or more, and one processor 310 is taken as an example in fig. 3; the processor 310, the memory 320, the input device 330, the output device 340 and the communication device 350 in the terminal equipment may be connected by a bus or other means, and the connection by the bus is taken as an example in fig. 3.

The memory 320 may be used as a computer-readable storage medium for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the vulnerability scanning method based on log analysis according to the embodiment of the present invention (for example, the sample obtaining unit 210, the first training unit 220, the second training unit 230, the classification integrating unit 240, and the detection classifying unit 250 in the vulnerability scanning apparatus based on log analysis). The processor 310 executes various functional applications of the terminal device and data processing by executing software programs, instructions and modules stored in the memory 320, namely, implements the above-mentioned vulnerability scanning method based on log analysis.

The memory 320 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal device, and the like. Further, the memory 320 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, memory 320 may further include memory located remotely from processor 310, which may be connected to the terminal device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 330 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the terminal apparatus. The output device 340 may include a display device such as a display screen.

The terminal equipment comprises the vulnerability scanning apparatus based on log analysis, can be used for executing any vulnerability scanning method based on log analysis, and has corresponding functions and beneficial effects.

Example four

The embodiment of the present invention further provides a storage medium containing computer-executable instructions, where the computer-executable instructions are used to execute relevant operations in the vulnerability scanning method based on log analysis provided in any embodiment of the present application when executed by a computer processor, and have corresponding functions and beneficial effects.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product.

Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory. The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.

Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.

It should be noted that the foregoing merely illustrates the preferred embodiments of the present application and the technical principles employed. The present application is not limited to the particular embodiments described herein; various obvious changes, rearrangements and substitutions will be apparent to those skilled in the art and can be made without departing from the scope of the present application. Therefore, although the present application has been described in some detail with reference to the above embodiments, it is not limited to those embodiments and may include other equivalent embodiments without departing from the spirit of the present application. The scope of the present application is determined by the scope of the claims.

Claims

1. A vulnerability scanning method based on log analysis, characterized by comprising the following steps:

2. The method of claim 1, wherein the obtaining of the sample log file corresponding to the sample vulnerability data and the analyzing of the sample log file to obtain the sample characteristic data comprises:

3. The method according to claim 2, wherein the text features are extracted based on an N-Gram text model method; the statistical characteristics comprise length statistical characteristics, character statistical characteristics and keyword statistical characteristics.
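For illustration only, and not as part of the claims, the text and statistical features named above might be computed as in the following Python sketch. The source does not say whether the N-Gram model works on characters or words, nor which keywords are counted. The sketch therefore assumes character n-grams, and `KEYWORDS`, `char_ngrams`, `statistical_features` and `extract_features` are hypothetical names introduced here.

```python
from collections import Counter

# Hypothetical keyword list; the source does not enumerate the keywords.
KEYWORDS = ("select", "union", "script", "passwd", "../")


def char_ngrams(text, n=3):
    """Count overlapping character n-grams (assumed N-Gram variant)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def statistical_features(line):
    """Length, character, and keyword statistics for one log line."""
    lowered = line.lower()
    return {
        # Length statistic: total number of characters.
        "length": len(line),
        # Character statistics: digits and non-alphanumeric, non-space symbols.
        "digit_count": sum(c.isdigit() for c in line),
        "special_count": sum(
            not c.isalnum() and not c.isspace() for c in line
        ),
        # Keyword statistic: total occurrences of the assumed keywords.
        "keyword_count": sum(lowered.count(k) for k in KEYWORDS),
    }


def extract_features(line, n=3):
    """Merge n-gram counts and statistics into one feature dictionary."""
    feats = {f"ng:{g}": c for g, c in char_ngrams(line.lower(), n).items()}
    feats.update(statistical_features(line))
    return feats


sample = "GET /index.php?id=1 UNION SELECT passwd FROM users"
features = extract_features(sample)
```

Each log line yields one sparse feature dictionary. A real pipeline would likely map these dictionaries onto a fixed-length vector before training the first sub-classifier.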

4. The method according to claim 1, wherein the inputting the analysis result of the log file to be tested to the vulnerability classifier for classification detection comprises:

acquiring a log file to be tested from a data source to be tested;

5. The method of claim 1, wherein the vulnerability attack types include: network attacks, system attacks, information attacks, and hardware attacks.
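As a non-limiting illustration, the four attack types can be represented as an enumeration, and a second-stage decision can be made over the per-type probabilities output by the first sub-classifier. The source does not name the algorithm used by the second sub-classifier. The sketch below substitutes a simple nearest-centroid rule as a stand-in, and `AttackType`, `train_second_stage` and `confirm_type` are hypothetical names.

```python
from enum import Enum


class AttackType(Enum):
    NETWORK = "network attack"
    SYSTEM = "system attack"
    INFORMATION = "information attack"
    HARDWARE = "hardware attack"


def train_second_stage(prob_vectors, labels):
    """Average the first-stage probability vectors per label.

    Stand-in for training the second sub-classifier: each attack type
    is summarized by the centroid of its probability vectors.
    """
    sums, counts = {}, {}
    for vec, label in zip(prob_vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, p in enumerate(vec):
            acc[i] += p
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [s / counts[label] for s in acc]
        for label, acc in sums.items()
    }


def confirm_type(centroids, vec):
    """Return the attack type whose centroid is closest to `vec`."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, vec))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Made-up first-stage outputs, ordered as in AttackType.
train_probs = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
]
train_labels = [AttackType.NETWORK, AttackType.INFORMATION]
centroids = train_second_stage(train_probs, train_labels)
predicted = confirm_type(centroids, [0.6, 0.2, 0.1, 0.1])
```

Training the second stage on the first stage's outputs, rather than on the raw features, follows the stacking arrangement described in the abstract. Any trainable classifier could take the place of the centroid rule.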

6. A vulnerability scanning device based on log analysis, characterized by comprising:

7. The apparatus of claim 6, wherein the sample acquisition unit comprises:

8. The apparatus of claim 7, wherein the text features are extracted based on an N-Gram text model method; the statistical characteristics comprise length statistical characteristics, character statistical characteristics and keyword statistical characteristics.

9. A terminal device, comprising:

one or more processors;

a memory for storing one or more programs;

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the log analysis-based vulnerability scanning method of any of claims 1-5.

10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the log analysis-based vulnerability scanning method according to any one of claims 1-5.