CN111695117B - Webshell script detection method and device - Google Patents

Publication number
CN111695117B
Authority
CN
China
Prior art keywords
webshell
features
script
normal web
scripts
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010534994.0A
Other languages
Chinese (zh)
Other versions
CN111695117A (en)
Inventor
戚伟强
徐柳婧
王艳艳
郑星航
范超
陈可
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Zhejiang Electric Power Co Ltd
Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Original Assignee
State Grid Zhejiang Electric Power Co Ltd
Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Application filed by State Grid Zhejiang Electric Power Co Ltd and Information and Telecommunication Branch of State Grid Zhejiang Electric Power Co Ltd
Priority to CN202010534994.0A
Publication of CN111695117A
Application granted
Publication of CN111695117B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55: Detecting local intrusion or implementing counter-measures
    • G06F21/56: Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562: Static detection
    • G06F21/563: Static detection by source code analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application provides a webshell script detection method and device. The method extracts features of multiple different set types, which ensures the diversity of the features used to train the SVM model. It then screens the extracted features with the Fisher score algorithm and trains the SVM model on the screened features, which can further improve the classification accuracy of the SVM model. On this basis, features of the same multiple set types are extracted from the script to be detected, screened with the Fisher score algorithm, and input into the SVM model, which can improve the accuracy of the classification result output by the SVM model.

Description

Webshell script detection method and device
Technical Field
The application relates to the technical field of information security, in particular to a webshell script detection method and device.
Background
A webshell script is a backdoor program installed on a successfully compromised computer. An attacker can use it to keep persistent access to the compromised computer and carry out a range of malicious actions, such as executing system commands, stealing and tampering with user data, and modifying website home pages. Detecting webshell scripts is therefore essential.
However, the accuracy of current webshell script detection methods needs to be improved.
Disclosure of Invention
To solve the above technical problem, embodiments of the application provide a webshell script detection method and device that improve the accuracy of webshell script detection. The technical solution is as follows:
a webshell script detection method, the method comprising:
extracting features from the script to be detected according to a preset feature extraction template;
screening out the characteristics which accord with a set rule from the characteristics, and taking the screened out characteristics as characteristics to be used;
inputting the features to be used into a pre-trained SVM model to obtain a classification result output by the SVM model, wherein the pre-trained SVM model is obtained by training by utilizing webshell training features and normal web training features;
the obtaining process of the webshell training feature and the normal web training feature comprises the following steps:
acquiring a webshell script set and a normal web script set;
extracting features from webshell scripts in the webshell script set and normal web scripts in the normal web script set respectively according to the preset feature extraction templates to obtain to-be-processed webshell features and to-be-processed normal web features;
and screening out, from the to-be-processed webshell features and the to-be-processed normal web features respectively, the features that conform to the set rule, and taking the screened-out features as the webshell training features and the normal web training features respectively.
Preferably, the feature extraction template includes:
extracting templates of a plurality of features of different setting types;
the extracting the features from the script to be detected according to the preset feature extraction template comprises the following steps:
extracting a plurality of features of different setting types from the script to be detected according to templates for extracting the features of a plurality of different setting types;
extracting features from the webshell scripts in the webshell script set and the normal web scripts in the normal web script set according to the preset feature extraction templates respectively, wherein the feature extraction comprises the following steps:
and extracting the characteristics of the different setting types from the webshell scripts in the webshell script set and the normal web scripts in the normal web script set according to templates for extracting the characteristics of the different setting types.
Preferably, the setting rule includes:
a rule that the rank of a feature is above a set rank threshold;
the feature rank is obtained by scoring the features with the Fisher score algorithm and sorting them from the lowest score to the highest.
Preferably, the extracting the features from the webshell scripts in the webshell script set and the normal web scripts in the normal web script set according to the preset feature extraction templates respectively includes:
denoising the webshell script set and the normal web script set respectively to obtain a first webshell script set to be processed and a first normal web script set to be processed;
performing redundancy elimination processing on the first webshell script set to be processed and the first normal web script set to be processed respectively to obtain a second webshell script set to be processed and a second normal web script set to be processed;
clustering the second webshell script set to be processed and the second normal web script set to be processed respectively to obtain at least one webshell target script set and at least one normal web script set;
and extracting features from each webshell target script set and each normal web script set according to the preset feature extraction templates respectively.
Preferably, the denoising processing for the webshell script set and the normal web script set respectively includes:
treating each script in the webshell script set and the normal web script set whose length meets a set script length threshold as a noise script, and removing the noise scripts;
and using a deobfuscation technique to clean the BASE64-encoded content in the scripts that remain in the webshell script set and the normal web script set after the noise scripts are removed.
A webshell script detection device, comprising:
the extraction module is used for extracting features from the script to be detected according to a preset feature extraction template;
the screening module is used for screening out the characteristics which accord with the set rule from the characteristics, and taking the screened characteristics as the characteristics to be used;
the classification module is used for inputting the characteristics to be used into a pre-trained SVM model to obtain a classification result output by the SVM model; the pre-trained SVM model is obtained by training a training module through webshell training features and normal web training features;
the obtaining process of the webshell training feature and the normal web training feature comprises the following steps:
acquiring a webshell script set and a normal web script set;
extracting features from webshell scripts in the webshell script set and normal web scripts in the normal web script set respectively according to the preset feature extraction templates to obtain to-be-processed webshell features and to-be-processed normal web features;
and screening out, from the to-be-processed webshell features and the to-be-processed normal web features respectively, the features that conform to the set rule, and taking the screened-out features as the webshell training features and the normal web training features respectively.
Preferably, the feature extraction template includes:
extracting templates of a plurality of features of different setting types;
the extraction module is specifically configured to: extracting a plurality of features of different setting types from the script to be detected according to templates for extracting the features of a plurality of different setting types;
the training module is specifically configured to:
and extracting the characteristics of the different setting types from the webshell scripts in the webshell script set and the normal web scripts in the normal web script set according to templates for extracting the characteristics of the different setting types.
Preferably, the setting rule includes:
a rule that the rank of a feature is above a set rank threshold;
the feature rank is obtained by scoring the features with the Fisher score algorithm and sorting them from the lowest score to the highest.
Preferably, the training module is specifically configured to:
denoising the webshell script set and the normal web script set respectively to obtain a first webshell script set to be processed and a first normal web script set to be processed;
performing redundancy elimination processing on the first webshell script set to be processed and the first normal web script set to be processed respectively to obtain a second webshell script set to be processed and a second normal web script set to be processed;
clustering the second webshell script set to be processed and the second normal web script set to be processed respectively to obtain at least one webshell target script set and at least one normal web script set;
and extracting features from each webshell target script set and each normal web script set according to the preset feature extraction templates respectively.
Preferably, the training module is specifically configured to:
treating each script in the webshell script set and the normal web script set whose length meets a set script length threshold as a noise script, and removing the noise scripts;
and using a deobfuscation technique to clean the BASE64-encoded content in the scripts that remain in the webshell script set and the normal web script set after the noise scripts are removed.
Compared with the prior art, the application has the beneficial effects that:
In this application, the script to be detected is processed in the same way as the data used to train the SVM model, that is, by preprocessing, feature extraction, and feature screening, to obtain the features to be used. Those features are then input into the pre-trained SVM model, so the SVM model can classify them more accurately, which improves the accuracy of webshell script detection.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of the webshell script detection method provided in embodiment 1 of the present application;
Fig. 2 is a flowchart of the process of obtaining the webshell training features and the normal web training features provided in embodiment 1 of the present application;
Fig. 3 is a flowchart of the webshell script detection method provided in embodiment 2 of the present application;
Fig. 4 is a flowchart of the process of obtaining the webshell training features and the normal web training features provided in embodiment 2 of the present application;
Fig. 5 is a flowchart of the webshell script detection method provided in embodiment 3 of the present application;
Fig. 6 is a flowchart of the process of obtaining the webshell training features and the normal web training features provided in embodiment 3 of the present application;
Fig. 7 is a schematic diagram of the logical structure of the webshell script detection device provided by the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In order that the above-recited objects, features and advantages of the present application will become more readily apparent, a more particular description of the application will be rendered by reference to the appended drawings and appended detailed description.
Fig. 1 is a flowchart of the webshell script detection method provided in embodiment 1 of the present application. As shown in Fig. 1, the method may include, but is not limited to, the following steps:
Step S11, extracting features from the script to be detected according to a preset feature extraction template.
The feature extraction template may be set as needed, and is not limited in this embodiment.
It should be noted that, according to the preset feature extraction template, features are extracted from the script to be detected, so that the feature extraction efficiency can be improved.
Step S12, screening out, from the extracted features, the features that conform to a set rule, and taking the screened-out features as the features to be used.
The setting rule may be set as needed, and is not limited in this embodiment.
Screening out the features that conform to the set rule and using only those as the features to be used at least reduces the workload of webshell script detection.
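The set rule is left open in this embodiment, while the summary above names ranking by Fisher score as one preferred rule. The following is a minimal sketch of that kind of screening, assuming the common two-class Fisher score (between-class scatter of a feature divided by its within-class scatter). The exact formula used by the applicants is not given in this section, and the names `fisher_scores` and `select_top_features` are hypothetical.

```python
from statistics import fmean, pvariance


def fisher_scores(webshell_rows, normal_rows):
    """Return one Fisher score per feature column for two labelled classes.

    Each row is a list of numeric feature values; both classes must use the
    same column order. This is the textbook two-class score, assumed here.
    """
    classes = [webshell_rows, normal_rows]
    n_features = len(webshell_rows[0])
    scores = []
    for j in range(n_features):
        columns = [[row[j] for row in rows] for rows in classes]
        overall_mean = fmean(v for col in columns for v in col)
        # Between-class scatter: how far each class mean sits from the overall mean.
        between = sum(len(col) * (fmean(col) - overall_mean) ** 2 for col in columns)
        # Within-class scatter: how spread out each class is around its own mean.
        within = sum(len(col) * pvariance(col) for col in columns)
        if within > 0:
            scores.append(between / within)
        else:
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_top_features(scores, keep):
    """Indices of the `keep` highest-scoring features, best first."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:keep]
```

Keeping the `keep` highest scores is equivalent to sorting the features from low score to high score and retaining those ranked above a threshold.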
Step S13, inputting the features to be used into a pre-trained SVM model to obtain the classification result output by the SVM model, wherein the pre-trained SVM model is obtained by training with webshell training features and normal web training features.
The obtaining process of the webshell training feature and the normal web training feature may refer to fig. 2, and the obtaining process may include:
S131, acquiring a webshell script set and a normal web script set.
In this embodiment, the process of obtaining the webshell script set and the normal web script set may include:
S1311, collecting a webshell script set and a normal web script set from the network (e.g., GitHub) using a web crawler.
Alternatively, the process of obtaining the webshell script set and the normal web script set may include:
S1312, collecting webshell script sets and normal web script sets from the network (e.g., GitHub) using a web crawler.
S1313, screening out webshell script sets meeting set conditions from the collected webshell script sets.
The set condition may be chosen as needed and is not limited in this embodiment. For example, the set condition may be that the script is written in a set language (e.g., PHP).
S1314, screening out the normal web script set meeting the set condition from the collected normal web script set.
The setting conditions in this step are the same as those in step S1313, and will not be described here.
S132, extracting features from the webshell scripts in the webshell script set and the normal web scripts in the normal web script set respectively according to the preset feature extraction templates to obtain the to-be-processed webshell features and the to-be-processed normal web features.
Extracting features from the webshell scripts in the webshell script set and the normal web scripts in the normal web script set according to the preset feature extraction templates respectively to obtain to-be-processed webshell features and to-be-processed normal web features, which can be understood as: extracting features from webshell scripts in the webshell script set according to the preset feature extraction template to obtain webshell features to be processed, and extracting features from normal web scripts in the normal web script set according to the preset feature extraction template to obtain normal web features to be processed.
The preset feature extraction template in this step is the same as the feature extraction template in step S11, and will not be described here again.
S133, screening out the characteristics conforming to the set rule from the webshell characteristics to be processed and the normal web characteristics to be processed, and taking the screened characteristics as webshell training characteristics and normal web training characteristics respectively.
The features conforming to the set rule are respectively screened from the webshell features to be processed and the normal web features to be processed, and the screened features are respectively used as webshell training features and normal web training features, which can be understood as follows: screening out characteristics conforming to the set rule from the characteristics of the webshell to be processed, and taking the screened characteristics as webshell training characteristics; and screening out the characteristics conforming to the set rule from the normal web characteristics to be processed, and taking the screened characteristics as normal web training characteristics.
The set rule in this step is the same as the set rule in step S12 and is not described again here.
In this application, the script to be detected is processed in the same way as the data used to train the SVM model, that is, by preprocessing, feature extraction, and feature screening, to obtain the features to be used. Those features are then input into the pre-trained SVM model, so the SVM model can classify them more accurately, which improves the accuracy of webshell script detection.
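To make the training and classification steps of embodiment 1 concrete, here is a minimal sketch that trains a linear SVM on already-screened feature vectors and then classifies new vectors. It assumes a linear kernel and plain sub-gradient descent on the regularised hinge loss; the application does not state the kernel or solver, and a practical system would more likely rely on an established SVM library. The function names, hyperparameters, and toy feature vectors are illustrative only.

```python
def train_linear_svm(samples, labels, epochs=200, learning_rate=0.01, reg=0.01):
    """Fit weights w and bias b by sub-gradient descent on the hinge loss.

    samples: list of feature vectors; labels: +1 for webshell, -1 for normal.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The L2 regulariser shrinks the weights a little on every step.
            w = [wi * (1 - learning_rate * reg) for wi in w]
            if margin < 1:
                # The sample is inside the margin: move the boundary away from it.
                w = [wi + learning_rate * y * xi for wi, xi in zip(w, x)]
                b += learning_rate * y
    return w, b


def classify(w, b, x):
    """Return the predicted class of one feature vector."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "webshell" if score >= 0 else "normal"


# Toy training data (hypothetical): [count of $_POST, contains eval flag].
webshell_training_features = [[3, 1], [4, 1], [5, 1]]
normal_web_training_features = [[0, 0], [1, 0], [0, 0]]
samples = webshell_training_features + normal_web_training_features
labels = [1] * len(webshell_training_features) + [-1] * len(normal_web_training_features)
w, b = train_linear_svm(samples, labels)
```

The script to be detected must pass through the same extraction and screening as the training data, so that its vector has the same columns in the same order before `classify` is called.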
As another optional embodiment of the present application, Fig. 3 is a flowchart of the webshell script detection method provided in embodiment 2, which mainly refines the webshell script detection method described in embodiment 1 above. As shown in Fig. 3, the method may include, but is not limited to, the following steps:
Step S21, extracting features of a plurality of different set types from the script to be detected according to templates for extracting features of a plurality of different set types.
The templates for extracting features of a plurality of different set types are one implementation of the preset feature extraction template in embodiment 1.
Step S21 is a specific implementation of step S11 in embodiment 1.
The features of the different setting types may include, but are not limited to: lexical features, syntactic features, and abstract features.
Lexical features can be understood as follows: because a webshell receives various commands and exchanges information with the victim server, the global variables that a webshell needs are analyzed, and the counts of the global variables that represent received information are taken as features. The global variables may include, but are not limited to, PHP superglobals such as $_GET (F1), $_POST (F2), $_COOKIE (F3), $_REQUEST (F4), $_FILES (F5), and $_SESSION (F6).
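As an illustration of the lexical features, the sketch below counts how often each global variable appears in a PHP source string. It assumes the variables are the PHP superglobals `$_GET`, `$_POST`, `$_COOKIE`, `$_REQUEST`, `$_FILES`, and `$_SESSION`, and that matching is case-sensitive; the function name `lexical_features` is hypothetical, and the scan does not skip comments or string literals.

```python
import re

# Assumed mapping of F1..F6 to PHP superglobals, in order.
SUPERGLOBALS = ["$_GET", "$_POST", "$_COOKIE", "$_REQUEST", "$_FILES", "$_SESSION"]


def lexical_features(php_source):
    """Return the occurrence count of each superglobal (F1..F6)."""
    counts = []
    for name in SUPERGLOBALS:
        # The trailing \b stops $_GET from matching inside a longer identifier.
        pattern = re.escape(name) + r"\b"
        counts.append(len(re.findall(pattern, php_source)))
    return counts
```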
Syntactic features can be understood as follows: by analyzing the expressions an attacker uses when writing webshell scripts, features are determined that reflect how a webshell automatically adapts to various operating systems and automatically attempts to obtain the permissions of related software. Syntactic features may include, but are not limited to: the conditional statement ratio, which represents the percentage of conditional statements among all statements of the script, e.g., if (F7), elseif (F8), else (F9), and case (F10); and/or the loop statement ratio, which represents the percentage of loop statements among all statements of the script, e.g., for (F11), while (F12), and foreach (F13).
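The statement ratios can be approximated without a full PHP parser. The sketch below is one rough way to do it, under loudly stated assumptions: the total statement count is estimated as the number of semicolons plus the number of control keywords, F8 is taken to be `elseif`, and keywords inside comments or string literals are not excluded. The function name `syntactic_features` is hypothetical.

```python
import re

# Assumed keywords for F7..F13, in order.
KEYWORDS = ["if", "elseif", "else", "case", "for", "while", "foreach"]


def syntactic_features(php_source):
    """Return {keyword: share of all statements} for F7..F13."""
    counts = {
        # \b on both sides keeps "for" from matching inside "foreach".
        k: len(re.findall(r"\b" + k + r"\b", php_source))
        for k in KEYWORDS
    }
    # Crude estimate of the number of statements in the script.
    total = php_source.count(";") + sum(counts.values())
    if total == 0:
        return {k: 0.0 for k in KEYWORDS}
    return {k: c / total for k, c in counts.items()}
```

A tokenizer-based count would be more faithful; the regular-expression version is only meant to show what the ratio measures.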
Abstract features can be understood as the sensitive function matching degree, which is determined by checking whether a sensitive function is present in the script to be detected: if the script contains a sensitive function, the matching degree is set to 1; otherwise it is set to 0. The sensitive function matching degree reflects the use of certain keywords in the PHP language that webshells often use to carry out suspicious behavior, such as disguised execution functions (eval), file acquisition functions (wget, curl, lynx, get, fetch), reverse connection functions (perl, python, gcc, chmod, nohup, nc), and information collection functions (uname, id, ver, sysctl, whoami, $OSTYPE, pwd). Whether the script contains each of these function types is taken as a feature, namely: contains a disguised execution function (F14), contains a file acquisition function (F15), contains a reverse connection function (F16), and contains an information collection function (F17). In addition, three common features can be added: the maximum word length in the script (F18), the maximum line length in the PHP source code (F19), and the information entropy (F20).
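The abstract features F14 to F20 can be sketched as follows. This is an illustrative reading rather than the applicants' implementation: keyword matching is whole-word and case-sensitive, a "word" is taken to be a run of letters, digits, or underscores, and the information entropy is assumed to be the character-level Shannon entropy in bits. All function names are hypothetical.

```python
import math
import re
from collections import Counter

# Keyword groups for F14..F17, taken from the description above.
SENSITIVE = {
    "disguised_execution": ["eval"],
    "file_acquisition": ["wget", "curl", "lynx", "get", "fetch"],
    "reverse_connection": ["perl", "python", "gcc", "chmod", "nohup", "nc"],
    "information_collection": ["uname", "id", "ver", "sysctl", "whoami", "$OSTYPE", "pwd"],
}


def contains_any(source, names):
    """Return 1 if any name occurs as a whole word in source, else 0."""
    for name in names:
        # Lookarounds replace \b so that names starting with "$" also match.
        if re.search(r"(?<!\w)" + re.escape(name) + r"(?!\w)", source):
            return 1
    return 0


def shannon_entropy(text):
    """Character-level Shannon entropy in bits (assumed definition of F20)."""
    if not text:
        return 0.0
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in Counter(text).values())


def abstract_features(php_source):
    """Return [F14, F15, F16, F17, F18, F19, F20] for one script."""
    flags = [contains_any(php_source, names) for names in SENSITIVE.values()]
    max_word = max((len(w) for w in re.findall(r"\w+", php_source)), default=0)
    max_line = max((len(line) for line in php_source.splitlines()), default=0)
    return flags + [max_word, max_line, shannon_entropy(php_source)]
```

Short keywords such as `id` or `nc` will also match ordinary variable names, so a real detector would likely need a stricter notion of "function call" than this whole-word test.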
Step S22, screening out, from the extracted features, the features that conform to the set rule, and taking the screened-out features as the features to be used.
Step S23, inputting the features to be used into a pre-trained SVM model to obtain the classification result output by the SVM model, wherein the pre-trained SVM model is obtained by training with webshell training features and normal web training features.
The detailed procedure of steps S22-S23 can be referred to in the related description of steps S12-S13 in embodiment 1, and will not be described herein.
In this embodiment, the obtaining process of the webshell training feature and the normal web training feature may refer to fig. 4, and the obtaining process may include:
Step S231, acquiring a webshell script set and a normal web script set.
The detailed process of step S231 can be referred to the related description of step S131 in embodiment 1, and will not be repeated here.
Step S232, extracting features of a plurality of different set types from the webshell scripts in the webshell script set and the normal web scripts in the normal web script set according to templates for extracting features of a plurality of different set types.
The template for extracting the plurality of features of different setting types is the same as the template for extracting the plurality of features of different setting types in step S21, and will not be described here.
In this embodiment, the process of extracting features from the webshell scripts in the webshell script set and the normal web scripts in the normal web script set according to the preset feature extraction templates respectively may include:
S2321, denoising the webshell script set and the normal web script set respectively to obtain a first webshell script set to be processed and a first normal web script set to be processed.
S2322, performing redundancy elimination processing on the first webshell script set to be processed and the first normal web script set to be processed respectively to obtain a second webshell script set to be processed and a second normal web script set to be processed.
The redundancy elimination processing performed on the first webshell script set to be processed and the first normal web script set to be processed can be understood as follows: a standardization operation is performed on each set, which includes deleting all code comments and blank lines, calculating text similarity with a TF-IDF model, and retaining only one script from any group of scripts whose similarity exceeds a threshold.
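The redundancy elimination step can be sketched with the standard library alone. The code below is a minimal illustration under several assumptions: comments are stripped with simple regular expressions (which would also cut `//` or `#` inside string literals), TF-IDF uses raw term counts with a smoothed inverse document frequency, similarity is the cosine between TF-IDF vectors, and the 0.9 threshold is a placeholder because the application does not give a value. All names are hypothetical.

```python
import math
import re
from collections import Counter


def normalize(php_source):
    """Delete comments and blank lines (rough, regex-based approximation)."""
    no_block = re.sub(r"/\*.*?\*/", "", php_source, flags=re.DOTALL)
    no_line = re.sub(r"(//|#).*", "", no_block)
    return "\n".join(line for line in no_line.splitlines() if line.strip())


def tfidf_vectors(documents):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [re.findall(r"\w+", doc) for doc in documents]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(documents)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF keeps terms shared by every document from vanishing.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def deduplicate(scripts, threshold=0.9):
    """Keep the first script of every group whose similarity exceeds threshold."""
    cleaned = [normalize(s) for s in scripts]
    vectors = tfidf_vectors(cleaned)
    kept = []
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) <= threshold for j in kept):
            kept.append(i)
    return [cleaned[i] for i in kept]
```

The pairwise comparison is quadratic in the number of scripts, which is acceptable for a sketch but would need indexing or batching for a large crawl.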
S2323, clustering the second webshell script set to be processed and the second normal web script set to be processed respectively to obtain at least one webshell target script set and at least one normal web script set.
Because functional modules may overlap between webshell scripts or between normal web scripts, the second webshell script set to be processed and the second normal web script set to be processed can each be clustered using the idea of transitive closure. Specifically, webshell scripts or normal web scripts whose functional-module similarity exceeds a set clustering threshold are clustered into one class.
More specifically, the second webshell script set to be processed and the second normal web script set to be processed may each be clustered by script family. Scripts in the same family are highly similar, and scripts in different families are much less similar. For example, one webshell family is called C99; it contains many webshell scripts whose code and functions are similar (i.e., their functional modules are close), and different scripts may differ by only a few lines of code. Another family is called r57, and the webshells in the r57 family are likewise similar to one another. However, the webshell scripts in r57 are very different from those in C99.
For example, if scripts m1 and m2 share some functional modules, and scripts m2 and m3 share some functional modules, it can be inferred that m1, m2, and m3 all belong to one family. This transitive property makes it possible to obtain all members of the same family. In the process, the functions of the webshells in a family also gradually become clear.
In this embodiment, the set clustering threshold may be set to 30%, that is, scripts in which 30% of the functional modules are identical are grouped into one class.
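The transitive grouping described above maps naturally onto a union-find (disjoint-set) structure. The sketch below assumes each script has already been reduced to a set of functional-module identifiers, and that similarity is the number of shared modules divided by the size of the smaller set; the application does not define the similarity measure, so both are assumptions, and the names are hypothetical.

```python
def cluster_by_transitive_closure(modules_by_script, threshold=0.30):
    """Group scripts into families; returns a sorted list of sorted name lists.

    modules_by_script: {script_name: set of functional-module identifiers}.
    """
    names = list(modules_by_script)
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the family representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def similarity(a, b):
        smaller = min(len(a), len(b))
        return len(a & b) / smaller if smaller else 0.0

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if similarity(modules_by_script[first], modules_by_script[second]) >= threshold:
                # Merging the two families is what makes the relation transitive.
                parent[find(first)] = find(second)

    families = {}
    for name in names:
        families.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in families.values())
```

With m1 and m2 sharing one of three modules, and m2 and m3 sharing another, all three end up in one family even though m1 and m3 have nothing in common directly.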
Steps S2321-S2323 may be understood as data preprocessing of the webshell scripts in the webshell script set and the normal web scripts in the normal web script set.
S2324, extracting features from each webshell target script set and each normal web script set according to the preset feature extraction templates respectively.
The detailed process of step S2324 may refer to the related description of step S21, which is not described herein.
In this embodiment, the process of extracting features from the webshell scripts in the webshell script set and the normal web scripts in the normal web script set according to the preset feature extraction templates respectively may also include:
S2325, treating each script in the webshell script set and the normal web script set whose length meets a set script length threshold as a noise script, and removing the noise scripts.
The script length threshold may be set as needed, and is not limited in this embodiment. For example, the set script length threshold may be set as, but not limited to: 3 megabytes.
S2326, using an anti-obfuscation (deobfuscation) technique to clean the BASE64 encoding in the scripts that remain in the webshell script set and in the normal web script set after the noise scripts are removed, so as to obtain a first webshell script set to be processed and a first normal web script set to be processed.
Cleaning the BASE64 encoding in the remaining scripts with a deobfuscation technique can be understood as follows: in the webshell script set and in the normal web script set respectively, the BASE64-encoded content in the scripts remaining after noise removal is restored to the original PHP script.
Since BASE64-encoded content carries no semantic information, features cannot be extracted from it directly; it therefore needs to be restored to PHP script so that features can be extracted.
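As an illustration of restoring BASE64 content, the sketch below handles only one simple pattern: a PHP `base64_decode(...)` call whose argument is a string literal. Real obfuscated scripts use many other constructions, and the embodiment does not say which deobfuscation technique is used, so this is an assumption-laden example rather than the method itself; `restore_base64_literals` is a hypothetical name.

```python
import base64
import binascii
import re

# Matches base64_decode('...') or base64_decode("...") with a literal argument.
_B64_CALL = re.compile(r"""base64_decode\(\s*(['"])([A-Za-z0-9+/=]+)\1\s*\)""")


def restore_base64_literals(php_source):
    """Replace base64_decode(<literal>) calls with the decoded text as a PHP string."""
    def _decode(match):
        try:
            decoded = base64.b64decode(match.group(2), validate=True)
            text = decoded.decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            return match.group(0)  # leave undecodable payloads untouched
        # Escape backslashes and single quotes for a single-quoted PHP string.
        escaped = text.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"

    return _B64_CALL.sub(_decode, php_source)


encoded = base64.b64encode(b"system($_GET['c']);").decode("ascii")
obfuscated = "<?php eval(base64_decode('" + encoded + "')); ?>"
restored = restore_base64_literals(obfuscated)
```

After restoration, the call to `system` and the `$_GET` variable are visible in the script text, so features that depend on them can be extracted.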
Steps S2325-S2326 are a specific embodiment of step S2321.
S2327, performing redundancy elimination processing on the first webshell script set to be processed and the first normal web script set to be processed respectively to obtain a second webshell script set to be processed and a second normal web script set to be processed.
S2328, clustering the second webshell script set to be processed and the second normal web script set to be processed respectively to obtain at least one webshell target script set and at least one normal web script set.
S2329, extracting features from each webshell target script set and each normal web script set according to the preset feature extraction templates respectively.
The detailed process of steps S2327-S2329 may be referred to in the description of steps S2322-S2324, and will not be described herein.
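The redundancy elimination of step S2327 is detailed elsewhere in the description. Purely as an illustration, one simple way to drop redundant scripts is to keep a single script per distinct content hash. The whitespace normalization, the SHA-256 hash and the name `remove_redundant_scripts` are assumptions, not details taken from the embodiment.

```python
import hashlib


def remove_redundant_scripts(scripts):
    """Keep the first script for each distinct whitespace-normalized content hash."""
    seen = set()
    unique = {}
    for name, source in scripts.items():
        normalized = " ".join(source.split())  # collapse runs of whitespace
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique[name] = source
    return unique


sample = {
    "a.php": "<?php  echo 1; ?>",
    "copy_of_a.php": "<?php echo 1; ?>\n",
    "b.php": "<?php echo 2; ?>",
}
deduplicated = remove_redundant_scripts(sample)
```

In this example `copy_of_a.php` differs from `a.php` only in whitespace, so it is treated as redundant and removed.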
S233, screening out, from the webshell features to be processed and the normal web features to be processed respectively, the features that conform to the set rule, and taking the screened features as the webshell training features and the normal web training features respectively.
For the detailed process of step S233, reference may be made to the description of step S133 in embodiment 1, which is not repeated here.
In this embodiment, extracting features of a plurality of different set types ensures the diversity of the features used to train the SVM model. These features are then screened, and the SVM model is trained with the screened features, which can further improve the classification accuracy of the SVM model. On this basis, features of the plurality of different set types are extracted from the script to be detected, the extracted features are screened, and the screened features are input into the SVM model, which can improve the accuracy of the classification result output by the SVM model.
As another alternative embodiment of the present application, fig. 5 shows a flowchart of the webshell script detection method provided in embodiment 3 of the present application. This embodiment is mainly a refinement of the webshell script detection method described in embodiment 2 above. As shown in fig. 5, the method may include, but is not limited to, the following steps:
Step S31, extracting a plurality of features of different set types from the script to be detected according to the template for extracting features of a plurality of different set types.
For the detailed process of step S31, reference may be made to the description of step S21 in embodiment 2, which is not repeated here.
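The claims characterize the lexical feature as the number of global variables that receive information and the abstract feature as a sensitive function matching degree. The sketch below shows one possible way to compute two such features from PHP source text. The list of superglobals, the list of sensitive functions and the definition of the matching degree as a fraction of listed functions are all assumptions made for illustration; the function names are hypothetical.

```python
import re

# Assumed list of PHP superglobals that carry externally supplied input.
INPUT_GLOBALS = ("$_GET", "$_POST", "$_REQUEST", "$_COOKIE", "$_FILES")
# Assumed list of sensitive PHP functions; the text does not enumerate them.
SENSITIVE_FUNCTIONS = ("eval", "assert", "system", "exec", "shell_exec", "passthru")


def count_input_globals(php_source):
    """Lexical feature: number of references to input-carrying superglobals."""
    return sum(php_source.count(name) for name in INPUT_GLOBALS)


def sensitive_function_match(php_source):
    """Abstract feature: fraction of the listed functions called at least once."""
    hits = 0
    for name in SENSITIVE_FUNCTIONS:
        # Require a call-like occurrence that is not part of a longer identifier,
        # a variable name, or a method call.
        pattern = r"(?<![\w$>])" + re.escape(name) + r"\s*\("
        if re.search(pattern, php_source):
            hits += 1
    return hits / len(SENSITIVE_FUNCTIONS)


def extract_features(php_source):
    """Return the two illustrative features as a name-to-value mapping."""
    return {
        "input_globals": count_input_globals(php_source),
        "sensitive_match": sensitive_function_match(php_source),
    }


sample = "<?php system($_GET['cmd']); eval($_POST['code']); ?>"
features = extract_features(sample)
```

For the sample script, two input superglobals are referenced and two of the six listed functions are called.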
Step S32, screening out, from the features, the features that conform to the rule that the feature rank is higher than a set ranking threshold, and taking the screened features as the features to be used.
In this rule, the feature rank is the rank obtained by scoring the features with the Fisher scoring algorithm and sorting them by score from low to high.
The ranking threshold may be set as needed and is not limited in this embodiment.
The process of screening out the features that conform to the rule that the feature rank is higher than the set ranking threshold may include:
S321, scoring the features based on the following relation of the Fisher scoring algorithm:

$$F(f_i) = \frac{\sum_{k} n_k \left(\mu_i^{k} - \mu_i\right)^2}{\sum_{k} \sum_{j=1}^{n_k} \left(x_{i,j}^{k} - \mu_i^{k}\right)^2}$$

where $\mu_i$ represents the mean value of the i-th feature over the dataset, $\mu_i^{k}$ represents the mean value of the i-th feature in the k-th class, $n_k$ represents the number of samples in the k-th class, and $x_{i,j}^{k}$ represents the value at the j-th position of the i-th feature in the k-th class.
The larger F(f_i) is, the smaller the variance of the feature's values within the same class relative to the difference between classes, and the better the feature.
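A minimal sketch of the scoring follows, using the common form of the Fisher score: the between-class scatter (class sizes times squared distance of each class mean from the overall mean) divided by the within-class scatter (squared deviations of values from their class mean). That this matches the embodiment's formula exactly is an assumption, and the function name and sample values are hypothetical.

```python
def fisher_score(values_by_class):
    """Fisher score of one feature.

    values_by_class maps a class label to the list of values that the feature
    takes on the samples of that class.
    """
    all_values = [v for values in values_by_class.values() for v in values]
    overall_mean = sum(all_values) / len(all_values)

    between = 0.0  # sum over classes of n_k * (class mean - overall mean)^2
    within = 0.0   # sum over classes and samples of (value - class mean)^2
    for values in values_by_class.values():
        class_mean = sum(values) / len(values)
        between += len(values) * (class_mean - overall_mean) ** 2
        within += sum((v - class_mean) ** 2 for v in values)

    # A feature that is constant inside every class has zero within-class scatter.
    return float("inf") if within == 0 else between / within


separating = {"webshell": [9.0, 10.0, 11.0], "normal": [1.0, 2.0, 3.0]}
overlapping = {"webshell": [1.0, 5.0, 9.0], "normal": [2.0, 5.0, 8.0]}
good_score = fisher_score(separating)
poor_score = fisher_score(overlapping)
```

The first feature has well-separated class means and small spread inside each class, so it scores high; the second has identical class means, so its score is zero.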
S322, sorting the scores from low to high to obtain feature ranks, and selecting features with feature ranks higher than a set ranking threshold.
The set ranking threshold may be set as, but is not limited to: 16.
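The ranking rule can be sketched as follows. The sketch reads the rule literally: features are sorted by score from low to high, ranks 1..N are assigned in that order, and features whose rank number exceeds the threshold are kept. Whether the embodiment intends this reading or a "top N" selection is not clear from the text, so the interpretation is an assumption; the feature names and scores are made up for illustration.

```python
def select_by_rank(scores, rank_threshold):
    """Sort features by score from low to high, assign ranks 1..N in that order,
    and return the names whose rank is greater than rank_threshold."""
    ordered = sorted(scores, key=lambda name: scores[name])  # low to high
    return [name for rank, name in enumerate(ordered, start=1) if rank > rank_threshold]


# Hypothetical Fisher scores for four features.
scores = {"f_globals": 24.0, "f_os_probe": 3.5, "f_sensitive": 12.0, "f_length": 0.2}
selected = select_by_rank(scores, rank_threshold=2)
```

With a threshold of 2, the two lowest-scoring features are discarded and the two highest-scoring ones are kept.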
This embodiment also performs a calculation on the dataset of features selected by the Fisher scoring algorithm to verify that these features distinguish normal scripts from malicious scripts well. The verification may proceed as follows. For each dimension of the features, first calculate the center point of each class of script (normal and malicious), denoted c_n and c_m respectively. Then calculate the average Euclidean distance from the corresponding dimension data of the normal-class and malicious-class samples to the center points c_n and c_m; these average radii are denoted r_n and r_m respectively. Finally, calculate the Euclidean distance between the center points of the normal and malicious scripts, denoted dc_{m,n}. If r_n and r_m are much smaller than dc_{m,n}, the feature distinguishes normal scripts from malicious scripts well.
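The verification above can be sketched for a single feature dimension, where the Euclidean distance reduces to an absolute difference. The text only says the radii should be "much smaller" than the center distance; the factor `margin=3.0` used to decide this is an assumption, as are the function name and the sample values.

```python
def separation_check(normal, malicious, margin=3.0):
    """Compare class radii with the distance between class centers for one dimension."""
    c_n = sum(normal) / len(normal)        # center of the normal class
    c_m = sum(malicious) / len(malicious)  # center of the malicious class
    r_n = sum(abs(v - c_n) for v in normal) / len(normal)        # average radius, normal
    r_m = sum(abs(v - c_m) for v in malicious) / len(malicious)  # average radius, malicious
    dc = abs(c_n - c_m)                    # distance between the two centers
    return {
        "c_n": c_n, "c_m": c_m, "r_n": r_n, "r_m": r_m, "dc": dc,
        # Assumed criterion: both radii, scaled by margin, stay within dc.
        "separable": margin * max(r_n, r_m) <= dc,
    }


result = separation_check([1.0, 2.0, 3.0], [9.0, 10.0, 11.0])
```

In this example the centers are 8 apart while each class has an average radius of about 0.67, so the dimension is judged to separate the classes well.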
Step S33, inputting the features to be used into a pre-trained SVM model to obtain a classification result output by the SVM model, wherein the pre-trained SVM model is obtained by training with webshell training features and normal web training features.
For the detailed process of step S33, reference may be made to the description of step S23 in embodiment 2, which is not repeated here.
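To make the training and classification step concrete, the sketch below trains a small linear soft-margin SVM with batch subgradient descent on the regularized hinge loss and then classifies new feature vectors. This passage does not state the kernel, solver or library used, so the linear model, the hyperparameters, the zero-centered toy data and all names are assumptions chosen to keep the example self-contained.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=500):
    """Batch subgradient descent on lam/2 * ||w||^2 + mean hinge loss.

    labels must be +1 (webshell) or -1 (normal).
    """
    n, dim = len(samples), len(samples[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # sample violates the margin, hinge loss is active
                grad_w = [g - y * xi / n for g, xi in zip(grad_w, x)]
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def classify(features, w, b):
    """Return the class label given by the sign of the decision function."""
    score = sum(wi * xi for wi, xi in zip(w, features)) + b
    return "webshell" if score > 0 else "normal"


# Toy training features, assumed to be standardized (zero-centered).
webshell_features = [[2.0, 2.0], [3.0, 3.0]]
normal_features = [[-2.0, -2.0], [-3.0, -3.0]]
samples = webshell_features + normal_features
labels = [1] * len(webshell_features) + [-1] * len(normal_features)

w, b = train_linear_svm(samples, labels)
verdict_a = classify([2.5, 2.0], w, b)
verdict_b = classify([-2.0, -1.5], w, b)
```

In practice an established SVM implementation would normally be used instead of a hand-written trainer; the point here is only the flow from training features to a classification result.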
In this embodiment, the obtaining process of the webshell training feature and the normal web training feature may refer to fig. 6, and the obtaining process may include:
step S331, acquiring a webshell script set and a normal web script set.
Step S332, extracting the features of the plurality of different setting types from the webshell scripts in the webshell script set and the normal web scripts in the normal web script set according to templates for extracting the features of the plurality of different setting types.
The detailed process of steps S331 to S332 can be referred to in the related description of steps S231 to S232 in embodiment 2, and will not be described herein.
Step S333, screening out, from the webshell features to be processed and the normal web features to be processed respectively, the features that conform to the rule that the feature rank is higher than the set ranking threshold, and taking the screened features as the webshell training features and the normal web training features respectively.
For the rule that the feature rank is higher than the set ranking threshold, and for the detailed process of this screening, reference may be made to the description of step S32, which is not repeated here.
In this embodiment, extracting features of a plurality of different set types ensures the diversity of the features used to train the SVM model. These features are screened with the Fisher scoring algorithm, and the SVM model is trained with the screened features, which can further improve the classification accuracy of the SVM model. On this basis, features of the plurality of different set types are extracted from the script to be detected, the extracted features are screened with the Fisher scoring algorithm, and the screened features are input into the SVM model, which can improve the accuracy of the classification result output by the SVM model.
The webshell script detection device provided by the present application is described below. The device described below and the webshell script detection method described above may be referred to in correspondence with each other.
Referring to fig. 7, the webshell script detection device includes: an extraction module 11, a screening module 12, a classification module 13, a training module 14 and a feature obtaining module 15.
The extraction module 11 is configured to extract features from the script to be detected according to a preset feature extraction template.
The screening module 12 is configured to screen out, from the features, the features that conform to a set rule, and to take the screened features as the features to be used.
The classification module 13 is configured to input the features to be used into a pre-trained SVM model to obtain a classification result output by the SVM model; the pre-trained SVM model is obtained by the training module 14 through training with the webshell training features and the normal web training features.
The feature obtaining module 15 is configured to:
acquire a webshell script set and a normal web script set;
extract features from the webshell scripts in the webshell script set and from the normal web scripts in the normal web script set respectively, according to the preset feature extraction template, to obtain webshell features to be processed and normal web features to be processed;
screen out, from the webshell features to be processed and the normal web features to be processed respectively, the features that conform to the set rule, and take the screened features as the webshell training features and the normal web training features respectively.
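The division of the device into modules can be sketched as a small composition of callables. The class name `WebshellDetector`, its field names and the toy extraction and classification functions are hypothetical; the sketch only mirrors the flow from extraction module 11 through screening module 12 to classification module 13, with the screening outcome represented as a fixed list of selected feature names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class WebshellDetector:
    extract: Callable[[str], Dict[str, float]]  # role of extraction module 11
    selected_features: List[str]                # outcome of screening (module 12)
    classify: Callable[[List[float]], str]      # role of classification module 13

    def detect(self, script: str) -> str:
        features = self.extract(script)
        # Keep only the features that passed screening, in a fixed order.
        to_use = [features[name] for name in self.selected_features]
        return self.classify(to_use)


# Toy stand-ins for the real modules, for illustration only.
detector = WebshellDetector(
    extract=lambda src: {
        "input_globals": float(src.count("$_POST")),
        "length": float(len(src)),
    },
    selected_features=["input_globals"],
    classify=lambda vec: "webshell" if vec[0] > 0 else "normal",
)
verdict = detector.detect("<?php eval($_POST['x']); ?>")
```

In a fuller implementation the `classify` callable would wrap the trained SVM model, and `selected_features` would come from the same screening rule that was applied to the training features.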
In this embodiment, the feature extraction template may include:
a template for extracting features of a plurality of different set types.
Accordingly, the extraction module 11 may be specifically configured to: extract a plurality of features of different set types from the script to be detected according to the template for extracting features of a plurality of different set types.
The feature obtaining module 15 may be specifically configured to:
extract the features of the plurality of different set types from the webshell scripts in the webshell script set and from the normal web scripts in the normal web script set respectively, according to the template for extracting features of a plurality of different set types.
In this embodiment, the set rule may include:
a rule that the feature rank is higher than a set ranking threshold;
the feature rank is the rank obtained by scoring the features with the Fisher scoring algorithm and sorting them by score from low to high.
In this embodiment, the process by which the feature obtaining module 15 extracts features from the webshell scripts in the webshell script set and from the normal web scripts in the normal web script set according to the preset feature extraction template may include:
denoising the webshell script set and the normal web script set respectively to obtain a first webshell script set to be processed and a first normal web script set to be processed;
performing redundancy elimination on the first webshell script set to be processed and the first normal web script set to be processed respectively to obtain a second webshell script set to be processed and a second normal web script set to be processed;
clustering the second webshell script set to be processed and the second normal web script set to be processed respectively to obtain at least one webshell target script set and at least one normal web script set;
extracting features from each webshell target script set and each normal web script set respectively, according to the preset feature extraction template.
In this embodiment, the process by which the feature obtaining module 15 denoises the webshell script set and the normal web script set respectively may include:
in the webshell script set and in the normal web script set respectively, treating scripts whose length meets a set script length threshold as noise scripts, and removing the noise scripts;
using an anti-obfuscation (deobfuscation) technique to clean the BASE64 encoding in the scripts that remain in the webshell script set and in the normal web script set after the noise scripts are removed.
It should be noted that each embodiment emphasizes its differences from the other embodiments; for parts that are the same or similar, the embodiments may be referred to one another. The device embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant points, refer to the description of the method embodiments.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
For convenience of description, the above device is described with its functions divided into various units. Of course, when implementing the present application, the functions of the units may be implemented in one or more pieces of software and/or hardware.
From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the embodiments or some parts of the embodiments of the present application.
The webshell script detection method and device provided by the present application have been described in detail above. Specific examples are used herein to illustrate the principle and implementation of the application, and the description of the above embodiments is only intended to help understand the method and core idea of the application. Meanwhile, a person skilled in the art may, in accordance with the idea of the application, make changes to the specific implementation and the scope of application. In view of the above, the content of this description should not be construed as limiting the application.

Claims (8)

1. A webshell script detection method, characterized in that the method includes:
extracting features from a script to be detected according to a preset feature extraction template; the feature extraction template includes: a template for extracting features of a plurality of different set types;
the extracting features from the script to be detected according to the preset feature extraction template includes: extracting a plurality of features of the different set types from the script to be detected according to the template for extracting features of a plurality of different set types; the features of the different set types include: lexical features, syntactic features and abstract features; the lexical feature is obtained by analyzing, based on the characteristic that a webshell accepts various commands and exchanges information with the victim server, the global variables required by the webshell, and taking the number of global variables that represent accepted information as the feature; the syntactic feature is a feature reflecting that the webshell can automatically adapt to various operating systems and automatically attempt to obtain permissions for related software; the abstract feature is a sensitive function matching degree, and the sensitive function matching degree is used to characterize the use of keywords in the PHP language;
screening out, from the features, features that conform to a set rule, and taking the screened features as features to be used;
inputting the features to be used into a pre-trained SVM model to obtain a classification result output by the SVM model, the pre-trained SVM model being obtained by training with webshell training features and normal web training features;
the process of obtaining the webshell training features and the normal web training features includes:
acquiring a webshell script set and a normal web script set;
extracting features from the webshell scripts in the webshell script set and from the normal web scripts in the normal web script set respectively, according to the preset feature extraction template, to obtain webshell features to be processed and normal web features to be processed; the extracting features from the webshell scripts in the webshell script set and from the normal web scripts in the normal web script set respectively according to the preset feature extraction template includes: extracting the features of the plurality of different set types from the webshell scripts in the webshell script set and from the normal web scripts in the normal web script set respectively, according to the template for extracting features of a plurality of different set types;
screening out, from the webshell features to be processed and the normal web features to be processed respectively, features that conform to the set rule, and taking the screened features as the webshell training features and the normal web training features respectively.

2. The method according to claim 1, characterized in that the set rule includes:
a rule that the feature rank is higher than a set ranking threshold;
the feature rank is the rank obtained by scoring the features based on the Fisher scoring algorithm and sorting them by score from low to high.

3. The method according to any one of claims 1-2, characterized in that the extracting features from the webshell scripts in the webshell script set and from the normal web scripts in the normal web script set respectively according to the preset feature extraction template includes:
denoising the webshell script set and the normal web script set respectively to obtain a first webshell script set to be processed and a first normal web script set to be processed;
performing redundancy elimination on the first webshell script set to be processed and the first normal web script set to be processed respectively to obtain a second webshell script set to be processed and a second normal web script set to be processed;
clustering the second webshell script set to be processed and the second normal web script set to be processed respectively to obtain at least one webshell target script set and at least one normal web script set;
extracting features from each webshell target script set and each normal web script set respectively, according to the preset feature extraction template.

4. The method according to claim 3, characterized in that the denoising the webshell script set and the normal web script set respectively includes:
in the webshell script set and in the normal web script set respectively, treating scripts whose length meets a set script length threshold as noise scripts, and removing the noise scripts;
using an anti-obfuscation technique to clean the BASE64 encoding in the scripts that remain in the webshell script set and in the normal web script set after the respective noise scripts are removed.

5. A webshell script detection device, characterized by including:
an extraction module, configured to extract features from a script to be detected according to a preset feature extraction template; the feature extraction template includes:
a template for extracting features of a plurality of different set types;
the extraction module is specifically configured to: extract a plurality of features of the different set types from the script to be detected according to the template for extracting features of a plurality of different set types; the features of the different set types include: lexical features, syntactic features and abstract features; the lexical feature is obtained by analyzing, based on the characteristic that a webshell accepts various commands and exchanges information with the victim server, the global variables required by the webshell, and taking the number of global variables that represent accepted information as the feature; the syntactic feature is a feature reflecting that the webshell can automatically adapt to various operating systems and automatically attempt to obtain permissions for related software; the abstract feature is a sensitive function matching degree, and the sensitive function matching degree is used to characterize the use of keywords in the PHP language;
a screening module, configured to screen out, from the features, features that conform to a set rule, and to take the screened features as features to be used;
a classification module, configured to input the features to be used into a pre-trained SVM model to obtain a classification result output by the SVM model; the pre-trained SVM model is obtained by a training module through training with webshell training features and normal web training features;
a feature obtaining module, configured to:
acquire a webshell script set and a normal web script set;
extract features from the webshell scripts in the webshell script set and from the normal web scripts in the normal web script set respectively, according to the preset feature extraction template, to obtain webshell features to be processed and normal web features to be processed;
the feature obtaining module is specifically configured to:
extract the features of the plurality of different set types from the webshell scripts in the webshell script set and from the normal web scripts in the normal web script set respectively, according to the template for extracting features of a plurality of different set types;
screen out, from the webshell features to be processed and the normal web features to be processed respectively, features that conform to the set rule, and take the screened features as the webshell training features and the normal web training features respectively.

6. The device according to claim 5, characterized in that the set rule includes:
a rule that the feature rank is higher than a set ranking threshold;
the feature rank is the rank obtained by scoring the features based on the Fisher scoring algorithm and sorting them by score from low to high.

7. The device according to any one of claims 5-6, characterized in that the feature obtaining module is specifically configured to:
denoise the webshell script set and the normal web script set respectively to obtain a first webshell script set to be processed and a first normal web script set to be processed;
perform redundancy elimination on the first webshell script set to be processed and the first normal web script set to be processed respectively to obtain a second webshell script set to be processed and a second normal web script set to be processed;
cluster the second webshell script set to be processed and the second normal web script set to be processed respectively to obtain at least one webshell target script set and at least one normal web script set;
extract features from each webshell target script set and each normal web script set respectively, according to the preset feature extraction template.

8. The device according to claim 7, characterized in that the feature obtaining module is specifically configured to:
in the webshell script set and in the normal web script set respectively, treat scripts whose length meets a set script length threshold as noise scripts, and remove the noise scripts;
use an anti-obfuscation technique to clean the BASE64 encoding in the scripts that remain in the webshell script set and in the normal web script set after the respective noise scripts are removed.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010534994.0A CN111695117B (en) 2020-06-12 2020-06-12 Webshell script detection method and device

Publications (2)

Publication Number Publication Date
CN111695117A CN111695117A (en) 2020-09-22
CN111695117B true CN111695117B (en) 2023-10-03

Family

ID=72480538

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010534994.0A Active CN111695117B (en) 2020-06-12 2020-06-12 Webshell script detection method and device

Country Status (1)

Country Link
CN (1) CN111695117B (en)

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant