CN113742726A - Program recognition model training and program recognition method, device, equipment and medium - Google Patents

Program recognition model training and program recognition method, device, equipment and medium

Publication number
CN113742726A
Authority
CN
China
Prior art keywords
program
signature
character string
information
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110997676.2A
Other languages
Chinese (zh)
Inventor
范乙琛
卿润东
梁彧
傅强
阿曼太
蔡琳
杨满智
田野
王杰
金红
陈晓光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Eversec Beijing Technology Co Ltd
Original Assignee
Eversec Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Eversec Beijing Technology Co Ltd filed Critical Eversec Beijing Technology Co Ltd
Priority to CN202110997676.2A priority Critical patent/CN113742726A/en
Publication of CN113742726A publication Critical patent/CN113742726A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Abstract

The embodiment of the invention discloses a method, a device, equipment and a medium for training a program identification model and identifying a program. The program recognition model training method comprises the following steps: acquiring behavior characteristic information and certificate signature information of each sample program; acquiring program behavior characteristics of each sample program according to the behavior characteristic information; acquiring the character string randomness of the certificate signature information to obtain the signature randomness characteristics of each sample program; and inputting the program behavior characteristics and the signature randomness characteristics as sample training data into the program identification model to train the program identification model. The embodiment of the invention can realize multi-angle feature extraction and identification of the application program based on a machine learning method, and improve the efficiency and accuracy of malicious program detection.

Description

Program recognition model training and program recognition method, device, equipment and medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a method, a device, equipment and a medium for training a program identification model and identifying a program.
Background
In the prior art, malicious programs are usually identified in one of three ways: by examining static features of a program, such as API (Application Programming Interface) calls, permissions, and boot startup items; by examining dynamic features, such as call sequences, registry behaviors, and file behaviors; or by extracting internal information from the program and matching it against known malicious information stored in a database.
However, these prior-art methods identify malicious programs using only simple feature discrimination. They cannot extract and discriminate features of an application program from multiple angles, so both the efficiency and the accuracy of malicious program identification are low.
Disclosure of Invention
The embodiment of the invention provides a program identification model training and program identification method, device, equipment and medium, which aim to realize multi-angle feature extraction and discrimination of an application program based on a machine learning method and improve efficiency and accuracy of malicious program identification.
In a first aspect, an embodiment of the present invention provides a method for training a program recognition model, including:
acquiring behavior characteristic information and certificate signature information of each sample program;
acquiring program behavior characteristics of each sample program according to the behavior characteristic information;
acquiring the character string randomness of the certificate signature information to obtain the signature randomness characteristics of each sample program;
and inputting the program behavior characteristics and the signature randomness characteristics as sample training data to a program recognition model to train the program recognition model.
In a second aspect, an embodiment of the present invention further provides a program identification method, including:
acquiring behavior characteristic information and certificate signature information of a program to be identified;
acquiring the program behavior characteristics of the program to be identified according to the behavior characteristic information;
acquiring the character string randomness of the certificate signature information to obtain the signature randomness characteristics of the program to be identified;
inputting the program behavior characteristics and the signature randomness characteristics into a program identification model as to-be-detected data of the program to be identified to obtain a program identification result of the program to be identified; the program identification model is obtained by training through the program identification model training method in any embodiment of the invention.
In a third aspect, an embodiment of the present invention further provides a program recognition model training apparatus, including:
the sample information acquisition module is used for acquiring behavior characteristic information and certificate signature information of each sample program;
the sample behavior characteristic acquisition module is used for acquiring the program behavior characteristics of each sample program according to the behavior characteristic information;
the sample signature characteristic acquisition module is used for acquiring the character string randomness of the certificate signature information to obtain the signature randomness characteristics of each sample program;
and the program identification model training module is used for inputting the program behavior characteristics and the signature randomness characteristics into a program identification model as sample training data so as to train the program identification model.
In a fourth aspect, an embodiment of the present invention further provides a program identification device, including:
the to-be-identified information acquisition module is used for acquiring behavior characteristic information and certificate signature information of a program to be identified;
the behavior feature acquiring module is used for acquiring the program behavior feature of the program to be identified according to the behavior feature information;
the signature feature acquisition module to be identified is used for acquiring the character string randomness of the certificate signature information to obtain the signature randomness features of the program to be identified;
the program identification module is used for inputting the program behavior characteristics and the signature randomness characteristics into a program identification model as to-be-detected data of the program to be identified to obtain a program identification result of the program to be identified; the program identification model is obtained by training through the program identification model training method in any embodiment of the invention.
In a fifth aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:
one or more processors;
storage means for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the program recognition model training method or the program recognition method provided by any embodiment of the present invention.
In a sixth aspect, an embodiment of the present invention further provides a computer storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the program recognition model training method or the program recognition method provided in any embodiment of the present invention.
The embodiment of the invention acquires the behavior characteristic information and the certificate signature information of the sample program, acquires the program behavior characteristic of each sample program according to the behavior characteristic information, and acquires the signature randomness characteristic of each sample program according to the certificate signature information, so that the program behavior characteristic and the signature randomness characteristic are input to a program identification model as sample training data to train the program identification model, a model for detecting the malicious program based on multiple characteristics is obtained, the multi-angle characteristic extraction and discrimination of the application program are realized based on a machine learning method, and the efficiency and the accuracy of malicious program identification are improved.
Drawings
Fig. 1 is a flowchart of a program recognition model training method according to an embodiment of the present invention.
Fig. 2 is a flowchart of a program recognition model training method according to a second embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a feature extraction model according to a second embodiment of the present invention.
Fig. 4 is a flowchart illustrating a program recognition model training method according to a second embodiment of the present invention.
Fig. 5 is a flowchart of a program identification method according to a third embodiment of the present invention.
Fig. 6 is a schematic structural diagram of a program recognition model training apparatus according to a fourth embodiment of the present invention.
Fig. 7 is a schematic structural diagram of a program identification device according to a fifth embodiment of the present invention.
Fig. 8 is a schematic structural diagram of a computer device according to a sixth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention.
It should be further noted that, for the convenience of description, only some but not all of the relevant aspects of the present invention are shown in the drawings. Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Example one
Fig. 1 is a flowchart of a program recognition model training method according to an embodiment of the present invention. This embodiment is applicable to training a model so that it can be used to recognize malicious programs. The method may be performed by the program recognition model training apparatus provided by an embodiment of the present invention; the apparatus may be implemented in software and/or hardware and may generally be integrated in a computer device. As shown in Fig. 1, the method includes the following operations:
S110, acquiring behavior characteristic information and certificate signature information of each sample program.
The sample programs may be malicious programs or non-malicious programs, and the number of the sample programs may be preset according to needs. Behavior feature information may include information generated by the behavior of a program that may be used to reflect the behavior of the program. The certificate signing information may be a digital signature in a digital certificate of the program.
Accordingly, a sufficient number of sample programs, including a certain number of malicious programs and a certain number of non-malicious programs, may be obtained in advance as needed. The behavior characteristic information of each sample program may be obtained in any manner. For example, the sample program may be decompiled to obtain static characteristic information such as application components, permissions, and APIs, and dynamic characteristic information such as log information, file access, and databases of the sample program may be collected while the sample program is running.
Further, the digital certificate is an electronic file for public key infrastructure, which is used to prove the identity of the owner of the public key, and the certification authority applies a digital signature to the public key to be certified with its own private key and generates a certificate. Thus, the certificate signature information thereof can be acquired in the digital certificate of the sample program.
Illustratively, the certificate signature information may include a CN (Common Name) field, an O (Organization Name) field, an OU (Organizational Unit) field, an L (Locality, i.e. city) field, an ST (State/Province) field, and a C (Country) field. In the signature of a malicious program, the field contents include randomly generated character strings that hide the real information of the malicious program's developer, for example the signature "CN=So, OU=wq, O=dqoindwei, L=at, ST=th, C=China". In the signature of a non-malicious program, by contrast, the content of each field has a definite meaning: the CN field holds a website domain name, the O field a unit name, the OU field an organizational unit name, the L field a city name, the ST field a province name, and the C field a country name or code, for example the signature "CN=Android QZone Team, OU=Tencent Company, O=QZone of Tencent Company, L=Beijing City, ST=Beijing City, C=86".
S120, acquiring the program behavior characteristics of each sample program according to the behavior characteristic information.
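As an illustration of the field layout described above, the following minimal sketch splits a signature string into its fields. It assumes the signature is available as plain "KEY=value" pairs separated by commas; the function name parse_subject is hypothetical, and the naive comma split would not handle values that themselves contain commas.

```python
def parse_subject(subject):
    """Split a 'KEY=value, KEY=value' signature string into a field dictionary."""
    fields = {}
    for part in subject.split(","):
        if "=" not in part:
            continue  # skip fragments that are not KEY=value pairs
        key, value = part.split("=", 1)
        fields[key.strip().upper()] = value.strip()
    return fields


# The random-looking example signature from the text above.
suspicious = parse_subject("CN=So, OU=wq, O=dqoindwei, L=at, ST=th, C=China")
print(suspicious["O"])
```

Each field value can then be checked for randomness separately or joined back into a single string, depending on how the downstream feature is defined.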
The program behavior feature may be a feature obtained by discriminating, according to a behavior difference between the malicious program and the non-malicious program, a behavior of the program reflected by the behavior feature information.
Correspondingly, the behavior of each sample program can be known according to the behavior characteristic information, so that each behavior of the sample program can be distinguished, and the program behavior characteristics can be obtained.
For example, in a case where it is determined that the behavior of the sample program reflected by the behavior feature information conforms to the behavior feature of the malicious program, the program behavior feature of the sample program may be determined as a feature value 1; and under the condition that the behavior of the sample program reflected by the behavior characteristic information is determined to be consistent with the behavior characteristic of the non-malicious program, determining the program behavior characteristic of the sample program as a characteristic value 0.
S130, obtaining the character string randomness of the certificate signature information to obtain the signature randomness characteristics of each sample program.
Wherein, the character string randomness may be information describing whether the character string is randomly generated. The signature randomness characteristic may be a characteristic obtained by judging whether or not a digital signature in a digital certificate of the program is a randomly generated character string.
Correspondingly, the character string randomness of the certificate signature information can be obtained by judging whether the certificate signature information is a randomly generated character string. For any sample program, when the certificate signature information is a randomly generated character string, the signature randomness characteristic is signature random; when the certificate signature information is not randomly generated but is a character string with a specific meaning, the signature randomness characteristic is signature non-random. Optionally, signature random may be represented by the feature value 1 and signature non-random by the feature value 0.
It should be noted that the execution sequence between S120 and S130 is not limited, and may be executed sequentially or simultaneously, and both may be executed after S110 and before S140.
S140, inputting the program behavior characteristics and the signature randomness characteristics into a program recognition model as sample training data so as to train the program recognition model.
The sample training data may be data used for model training. The program recognition model may be a model for identifying malicious programs.
Correspondingly, the program behavior characteristics and signature randomness characteristics of all sample programs are used as sample training data, and the program identification model can be trained, so that the trained program identification model can determine whether the program is a malicious program according to the input program behavior characteristics and signature randomness characteristics of the program.
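The assembly of sample training data in S140 can be sketched as follows. This is only an illustration under simplifying assumptions: each sample program is reduced to one behavior feature value and one signature randomness feature value (both 0 or 1, as in the examples above), and the names SampleProgram and build_training_data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SampleProgram:
    behavior_feature: int   # 1 = behavior matches malicious programs, 0 = non-malicious (S120)
    signature_random: int   # 1 = signature random, 0 = signature non-random (S130)
    is_malicious: int       # training label: 1 = malicious, 0 = non-malicious


def build_training_data(samples):
    """Return (feature rows, labels) ready to be passed to a classifier."""
    features = [[s.behavior_feature, s.signature_random] for s in samples]
    labels = [s.is_malicious for s in samples]
    return features, labels


samples = [
    SampleProgram(behavior_feature=1, signature_random=1, is_malicious=1),
    SampleProgram(behavior_feature=0, signature_random=0, is_malicious=0),
]
X, y = build_training_data(samples)
```

The resulting rows and labels are in the tabular form that tree-based classifiers generally accept; a real feature set would likely contain many more behavior columns.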
Optionally, the program recognition model may be a LightGBM (Light Gradient Boosting Machine) model, which is a gradient boosting decision tree model.
Table 1 is a training effect record for the program recognition model provided in the embodiment of the present invention. It records the accuracy, precision, and recall obtained when different types of program recognition models are trained using the static features and signature randomness features of the sample programs as sample training data. As shown in Table 1, in tests of algorithms including CatBoost, XGBoost, LightGBM, logistic regression, MLP (Multilayer Perceptron), KNN (K-Nearest Neighbors), and random forest, the LightGBM model was the best in both final test accuracy and training efficiency, and the LightGBM model is therefore selected as the program recognition model.
TABLE 1
[Table 1 appears as an image (BDA0003234631960000071) in the original publication; its values are not reproduced here.]
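For reference, the three metrics recorded in Table 1 can be computed from true and predicted labels as in the sketch below. It assumes binary labels with 1 meaning malicious and 0 meaning non-malicious; the function name classification_metrics is hypothetical and the labels in the usage lines are made-up illustration data, not values from Table 1.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels where 1 = malicious."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malicious, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malicious, missed
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Illustration data only.
accuracy, precision, recall = classification_metrics([1, 1, 1, 0], [1, 1, 0, 0])
```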
The embodiment of the invention provides a program identification model training method, which comprises the steps of obtaining behavior characteristic information and certificate signature information of sample programs, obtaining program behavior characteristics of each sample program according to the behavior characteristic information, obtaining signature randomness characteristics of each sample program according to the certificate signature information, inputting the program behavior characteristics and the signature randomness characteristics into a program identification model as sample training data to train the program identification model so as to obtain a multi-characteristic-based malicious program detection model, extracting and identifying multi-angle characteristics of an application program based on a machine learning method, and improving the efficiency and accuracy of malicious program detection.
Example two
Fig. 2 is a flowchart of a program recognition model training method according to a second embodiment of the present invention. The embodiment of the present invention is embodied on the basis of the above-described embodiment, and in the embodiment of the present invention, a specific optional implementation manner is provided for obtaining the randomness of the character strings of the certificate signature information to obtain the signature randomness characteristics of each sample program.
As shown in fig. 2, the method of the embodiment of the present invention specifically includes:
S210, acquiring behavior characteristic information and certificate signature information of each sample program.
S220, acquiring the program behavior characteristics of each sample program according to the behavior characteristic information.
S230, acquiring the character string randomness of the certificate signature information to obtain the signature randomness characteristics of each sample program.
In an optional embodiment of the present invention, S230 may specifically include:
S231, acquiring the character string features of each piece of certificate signature information according to known signature character strings.
The known signature character strings may include character strings determined to be digital signatures and/or character strings determined not to be digital signatures. The character string feature may be a feature obtained by judging, against the character arrangement rule of the known signature character strings, which character arrangement rule the certificate signature information conforms to; it describes whether a character string conforms to the character arrangement rule that a digital signature should follow, as determined from the known signature character strings.
Correspondingly, the character arrangement rule that a digital signature should follow can be determined from the known signature character strings, so the character string feature of the certificate signature information can be determined according to whether the certificate signature information conforms to that rule.
Optionally, the known signature character strings may include a certain number of collected digital signatures of non-malicious programs and English words, together with a certain number of random character strings synthesized by randomly encoding those digital signatures and English words.
In an optional embodiment of the present invention, the obtaining of the character string characteristic of the certificate signature information according to the known signature character string may include: and inputting the certificate signature information into a pre-trained feature extraction model to obtain the character string features output by the feature extraction model.
The feature extraction model is obtained by training by using the known signature character string as a sample and is used for obtaining the character string features of the input information.
Accordingly, the character string features of the certificate signature information may be obtained using a feature extraction model. The feature extraction model can be any machine learning model, the known signature character string is adopted as a sample to be trained in advance, the character arrangement rule of the known signature character string can be learned, and therefore the character string features of the input information can be obtained according to the learned character arrangement rule of the known signature character string.
In an optional embodiment of the present invention, before the obtaining the character string feature of the certificate signature information according to the known signature character string, the method may further include: acquiring a first preset number of known signature character strings; performing feature marking on character string features of the known signature character strings; and inputting the known signature character string after the characteristic marking into a characteristic extraction model as a sample, and training the characteristic extraction model.
And the trained feature extraction model is used for extracting the character string features of the certificate signature information.
Specifically, the first preset number may be the total number of samples required for training the feature extraction model and may be determined as needed; it is not limited here. Feature marking may be the operation of marking each known signature character string with its corresponding character string feature.
Correspondingly, the known signature character string is used as a sample to train the feature extraction model, so that the feature extraction model can learn the character arrangement rule of each character string in the known signature character string and the character string features corresponding to the character string, when any character string is input in the feature extraction model, the character string features of the character string can be output, and whether the input character string conforms to the features which the digital signature determined according to the known signature character string conforms to can be determined.
For example, if it is determined that the digital signature of the non-malicious program may adopt an existing english word or an original word conforming to the spelling rule of the english word, such as an english name of a program operator, the trained feature extraction model may determine whether the input character string conforms to the spelling rule of the english word, so as to output a corresponding character string feature.
In an optional embodiment of the present invention, the obtaining of the first preset number of known signature strings may include: acquiring a second preset number of legal signature character strings; carrying out random coding processing on the legal signature character strings to obtain a second preset number of random character strings; determining the legal signature character string and the random character string as the known signature character string; the characterizing the string features of each of the known signature strings may include: marking legal character string characteristics in the legal signature character string; and marking illegal character string characteristics in the random character string.
The second preset number may be half of the first preset number, and may be determined according to the first preset number. A legitimate signature string may be a string determined to be a digital signature in an application digital certificate. The random encoding process may be an operation of generating a random string from a legitimate signature string. The random string may be a string determined not to be a digital signature in an application digital certificate, which does not conform to any particular character arrangement rule. The legal character string features can be used to describe the character arrangement rule of the character string conforming to the legal signature character string. The illegal character string features can describe the character arrangement rule that the character string does not accord with the legal signature character string.
Correspondingly, by acquiring a second preset number of legal signature character strings and randomly encoding them to obtain a second preset number of random character strings, a training set is obtained in which black samples (random character strings) and white samples (legal signature character strings) each account for half. The model can then be trained on these samples so that it learns the legal signature character strings, the random character strings, and the character string features with which each is marked.
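The construction of a balanced set of known signature character strings can be sketched as follows. The source does not specify the random encoding procedure, so this sketch assumes one simple possibility: every letter of a legal string is replaced with a random lowercase letter while its length and non-letter characters are kept. The names random_encode and build_known_signature_strings are hypothetical.

```python
import random
import string


def random_encode(legal, rng):
    """Assumed encoding: replace each letter with a random lowercase letter."""
    return "".join(
        rng.choice(string.ascii_lowercase) if ch.isalpha() else ch
        for ch in legal
    )


def build_known_signature_strings(legal_strings, seed=0):
    """Return shuffled (string, mark) pairs, half marked legal and half illegal."""
    rng = random.Random(seed)  # seeded so the synthetic set is reproducible
    samples = [(s, "legal") for s in legal_strings]
    samples += [(random_encode(s, rng), "illegal") for s in legal_strings]
    rng.shuffle(samples)
    return samples


known = build_known_signature_strings(["Android QZone Team", "Tencent Company"])
```

Because one random string is produced per legal string, the two classes are balanced by construction.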
Optionally, the feature extraction model may be a Bi-LSTM (Bidirectional Long Short Term Memory) model.
Fig. 3 is a schematic structural diagram of a feature extraction model according to an embodiment of the present invention. As shown in Fig. 3, the first layer is the Input layer, through which each training sample is loaded into the model. Next is the Embedding layer, which maps each letter of each word to a vector. For example, when the word "Apple" is embedded, each of its letters becomes a 100-dimensional vector, so the whole word becomes a 5 × 100 matrix. The Bi-LSTM layer is trained on the information of the input sample in both the forward and backward directions. The Dropout layer discards a portion of the neurons during training to prevent overfitting; optionally, the dropout rate is set to 0.5. The Dense (fully connected) layer flattens the resulting output vector and feeds it into a fully connected layer; the final output (Output) is obtained after the tanh activation function of the Activation layer, and the character string feature is obtained from that output through a Softmax function (normalized exponential function).
Optionally, the second preset number of legal signature character strings may comprise 370,000 signature values and English words. Data are then synthesized by generating 370,000 random character strings through random encoding, giving a total of 740,000 known signature character strings as samples.
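The shapes involved in the embedding step and the final tanh and Softmax stages can be illustrated with the standard-library sketch below. It is not the Bi-LSTM model itself: the embedding table holds untrained random values, the two input scores are arbitrary, and the names build_embedding_table, embed, and softmax are hypothetical.

```python
import math
import random

EMBEDDING_DIM = 100  # each letter becomes a 100-dimensional vector, as in the text


def build_embedding_table(alphabet, dim=EMBEDDING_DIM, seed=0):
    """Untrained stand-in for a learned embedding: one random vector per letter."""
    rng = random.Random(seed)
    return {ch: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for ch in alphabet}


def embed(word, table):
    """Look up one vector per letter, giving a len(word) x dim matrix."""
    return [table[ch] for ch in word.lower()]


def softmax(scores):
    """Normalized exponential function over a list of scores."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


table = build_embedding_table(string_alphabet := "abcdefghijklmnopqrstuvwxyz")
matrix = embed("Apple", table)
print(len(matrix), len(matrix[0]))  # 5 100

# Arbitrary scores passed through tanh and then Softmax, giving two class probabilities.
probabilities = softmax([math.tanh(0.3), math.tanh(-1.2)])
```

In a trained model the embedding vectors and the scores fed to tanh would come from learned weights rather than random or hand-picked numbers.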
S232, under the condition that the character string features are determined to be legal character string features, determining that the certificate signature information is a non-random character string, and obtaining that the signature randomness features of the sample program corresponding to the certificate signature information are signature non-random.
The legal character string features can describe the character arrangement rule that the character string accords with the digital signature determined according to the known signature character string. Signature non-random may describe that the digital signature of the program is not a randomly generated string.
Correspondingly, if the character string feature is a legal character string feature, it can be shown that the certificate signature information conforms to the character arrangement rule to which the digital signature determined according to the known signature character string should conform, and it can be determined that the character string is a non-random character string, so that it can be determined that the digital signature of the corresponding sample program is not randomly generated, that is, the signature randomness feature is signature non-random.
In an optional embodiment of the present invention, after the obtaining the character string feature of the certificate signature information according to the known signature character string, the method may further include: under the condition that the character string features are determined to be illegal character string features, matching the certificate signature information in a character string comparison library generated in advance; under the condition that the certificate signature information is successfully matched in the character string comparison library, determining that the certificate signature information is a non-random character string, and obtaining that the signature randomness characteristic of the sample program corresponding to the certificate signature information is signature non-random; and under the condition that the certificate signature information is determined to be unsuccessfully matched in the character string comparison library, determining the certificate signature information to be a random character string, and obtaining that the signature randomness characteristic of the sample program corresponding to the certificate signature information is signature randomness.
The illegal character string features may describe that a character string does not conform to the character arrangement rule that a digital signature should follow, as determined from the known signature character strings. The character string comparison library may be a database for storing character strings; the stored character strings may include character strings that can be used as digital signatures and that conform to a specific character arrangement rule. Signature randomness may describe that the digital signature of the program is a randomly generated character string.
Correspondingly, if the character string feature of the certificate signature information is an illegal character string feature, this indicates that the certificate signature information does not conform to the character arrangement rule that a digital signature should follow, as determined from the known signature character strings. However, the rule determined from the known signature character strings cannot cover all non-random character strings. For example, from known signature character strings composed of English words, only the character arrangement rule of English words can be determined; if the digital signature of a program uses another language, such as Chinese pinyin, or includes English abbreviations, the rule determined from English words would misjudge such a non-random character string as a random character string. Therefore, a character string comparison library can be generated in advance, in which character strings conforming to a specific character arrangement rule are stored. If the certificate signature information is successfully matched in the character string comparison library, the certificate signature information is also a non-random character string, and the signature randomness feature of the corresponding sample program is signature non-random. If the matching fails, which indicates that the certificate signature information does not conform to the specific character arrangement rule either, the certificate signature information is determined to be a random character string, and the signature randomness feature of the corresponding sample program is signature randomness.
For example, suppose the certificate signature information contains Chinese pinyin, such as "CN=wangjun, OU=xingdanfang". In this case the signature content does not consist of English words, so if it is judged by the character rule of English words, it will be determined to be a random character string. Therefore, Chinese pinyin can be assembled into a character string comparison library in advance, for example the pinyin of single Chinese characters such as "ai", "wo" and "chen". Then, through a greedy algorithm, when a character string such as "xindongfang" can be enumerated and matched in the character string comparison library, the character string can be judged to be Chinese pinyin rather than a random character string. Optionally, the character string comparison library may be formed from the pinyin of 408 Chinese characters.
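The pinyin matching step can be sketched as a greedy longest-match segmentation. The ten-syllable library below is a tiny illustrative subset chosen for this example, not the 408-entry library mentioned in the text, and the function name is hypothetical. Pure greedy matching without backtracking is assumed here; it can reject some ambiguous strings that a backtracking search would accept.

```python
# Tiny illustrative subset of a pinyin comparison library (assumption).
PINYIN_SYLLABLES = {
    "ai", "wo", "chen", "wang", "jun", "xin", "xing", "dan", "dong", "fang",
}


def is_pinyin(text, syllables=PINYIN_SYLLABLES):
    """Return True if `text` splits entirely into library syllables.

    At each position the longest matching syllable is consumed (greedy).
    """
    text = text.lower()
    if not text or not syllables:
        return False
    longest = max(len(s) for s in syllables)
    pos = 0
    while pos < len(text):
        for size in range(min(longest, len(text) - pos), 0, -1):
            if text[pos:pos + size] in syllables:
                pos += size
                break
        else:
            return False  # no syllable matches at this position
    return True


print(is_pinyin("xindongfang"))  # True
print(is_pinyin("qzxkvb"))       # False
```

A field such as "CN=wangjun" would first be split on "=" so that only the value "wangjun" is passed to the matcher.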
In the above implementation provided by the embodiment of the present invention, the feature extraction model trained on the Bi-LSTM model, combined with pinyin discrimination, can reliably determine whether a character string is randomly generated; in testing, the discrimination accuracy was close to 100%. The method was also applied to test data: among 700,000 malicious (virus) software samples, more than 500,000 random character strings were detected in the signatures, whereas among 110,000 normal software samples, only a little more than 1,000 random character strings appeared in the signatures. This result confirms the feasibility of the scheme.
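To make the contrast between the two groups concrete, the reported counts can be turned into approximate rates. This is only arithmetic on the figures quoted above; because the text gives them as "more than" values, and because one signature per program is assumed here, the rates are rough lower bounds rather than measured results.

```python
# Counts quoted in the description (the "more than" values are lower bounds).
malicious_total = 700_000
malicious_random = 500_000
normal_total = 110_000
normal_random = 1_000

# Assumption: one certificate signature per program.
malicious_rate = malicious_random / malicious_total
normal_rate = normal_random / normal_total

print(f"malicious: {malicious_rate:.1%}")  # malicious: 71.4%
print(f"normal: {normal_rate:.1%}")        # normal: 0.9%
```

Roughly 71% of the malicious samples but under 1% of the normal samples carry a random signature, which is why signature randomness is useful as an input feature.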
S240, inputting the program behavior characteristics and the signature randomness characteristics into a program recognition model as sample training data to train the program recognition model.
Fig. 4 is a flowchart illustrating a program recognition model training method according to a second embodiment of the present invention. In a specific example, as shown in Fig. 4, the original sample may be input into the trained feature extraction model to determine whether the certificate signature information is a random character string. If the feature extraction model determines that the certificate signature information is not a random character string, the signature randomness feature of the sample can be directly determined to be signature non-random. If the feature extraction model determines that the certificate signature information is a random character string, it can be further judged whether the certificate signature information is pinyin. If the certificate signature information is determined not to be pinyin, the signature randomness feature of the sample may be determined to be signature randomness. If the certificate signature information is determined to be pinyin, the signature randomness feature of the sample may be determined to be signature non-random. In this way, the signature randomness feature of the sample can be extracted and input into the program recognition model together with the program behavior features of the sample for model training.
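The two-stage decision in Fig. 4 can be sketched as a small function. The two predicates are passed in as callables; the stub versions below (a word set and a pinyin set) are placeholders assumed for illustration and stand in for the trained feature extraction model and the character string comparison library.

```python
def signature_randomness(cert_signature, model_says_legal, in_comparison_library):
    """Return "non-random" or "random" following the two-stage check."""
    if model_says_legal(cert_signature):
        return "non-random"  # stage 1: legal character string feature
    if in_comparison_library(cert_signature):
        return "non-random"  # stage 2: matched, for example as pinyin
    return "random"          # failed both checks


# Stub predicates for illustration only (assumptions, not the real model).
ENGLISH_WORDS = {"google", "microsoft"}
PINYIN_NAMES = {"wangjun", "xindongfang"}


def stub_model(s):
    return s in ENGLISH_WORDS


def stub_library(s):
    return s in PINYIN_NAMES


print(signature_randomness("google", stub_model, stub_library))   # non-random
print(signature_randomness("wangjun", stub_model, stub_library))  # non-random
print(signature_randomness("qzxkvb", stub_model, stub_library))   # random
```

Ordering the checks this way means the comparison library is consulted only for strings the model has already flagged, which matches the flow in Fig. 4.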
The embodiment of the invention provides a program recognition model training method. Behavior feature information and certificate signature information of sample programs are acquired; the program behavior features of each sample program are obtained from the behavior feature information, and the signature randomness features of each sample program are obtained from the certificate signature information. The program behavior features and the signature randomness features are then input into a program recognition model as sample training data to train the program recognition model, so as to obtain a model that detects malicious programs based on multiple features. In this way, multi-angle features of application programs are extracted and discriminated based on a machine learning method, which improves the efficiency and accuracy of malicious program identification.
Example three
Fig. 5 is a flowchart of a program identification method provided in a third embodiment of the present invention, where this embodiment is applicable to the case of identifying a malicious program, and this method may be executed by a program identification apparatus provided in the third embodiment of the present invention, where the apparatus may be implemented by software and/or hardware, and may be generally integrated in a computer device. Accordingly, as shown in fig. 5, the method includes the following operations:
S310, acquiring behavior characteristic information and certificate signature information of the program to be identified.
The program to be identified may be any application program that needs to identify whether the program is a malicious program.
S320, acquiring the program behavior characteristics of the program to be identified according to the behavior characteristic information.
S330, acquiring the character string randomness of the certificate signature information to obtain the signature randomness characteristics of the program to be identified.
S340, inputting the program behavior characteristics and the signature randomness characteristics into a program identification model as to-be-detected data of the program to be identified, and obtaining a program identification result of the program to be identified.
The program identification model is obtained by training through the program identification model training method in any embodiment of the invention.
For details not described in this embodiment, reference may be made to the descriptions of any of the foregoing embodiments of the present invention, which are not repeated here.
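Steps S310 to S340 can be sketched as a single function that assembles the to-be-detected data and passes it to a model. The encoding of signature randomness as 1.0 or 0.0, the function names, and the stub model with its threshold rule are assumptions made for illustration; the stub is a placeholder, not the trained program identification model.

```python
from typing import Callable, List, Sequence


def identify_program(
    behavior_features: Sequence[float],
    signature_is_random: bool,
    model: Callable[[List[float]], str],
) -> str:
    """Combine both feature groups and return the model's identification result."""
    # Signature randomness is appended as one extra numeric feature (assumption).
    to_be_detected = list(behavior_features) + [1.0 if signature_is_random else 0.0]
    return model(to_be_detected)


def stub_model(features: List[float]) -> str:
    """Placeholder rule standing in for the trained program identification model."""
    return "malicious" if sum(features) >= 2 else "normal"


result = identify_program([1.0, 0.0], signature_is_random=True, model=stub_model)
print(result)  # malicious
```

Any trained classifier that accepts a flat feature vector could be substituted for the stub.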
In an optional embodiment of the present invention, the obtaining of the randomness of the character string of the certificate signature information to obtain the signature randomness characteristic of the program to be identified includes: acquiring character string characteristics of each certificate signature information according to a known signature character string; and under the condition that the character string features are determined to be legal character string features, determining the certificate signature information to be a non-random character string, and obtaining that the signature randomness features of the program to be identified corresponding to the certificate signature information are signature non-random.
In an optional embodiment of the present invention, after the obtaining the character string feature of the certificate signature information according to the known signature character string, the method further includes: under the condition that the character string features are determined to be illegal character string features, matching the certificate signature information in a character string comparison library generated in advance; under the condition that the certificate signature information is successfully matched in the character string comparison library, determining that the certificate signature information is a non-random character string, and obtaining that the signature randomness characteristic of the program to be identified corresponding to the certificate signature information is signature non-random; and under the condition that the certificate signature information is determined to be unsuccessfully matched in the character string comparison library, determining the certificate signature information to be a random character string, and obtaining that the signature randomness characteristic of the program to be identified corresponding to the certificate signature information is signature randomness.
In an optional embodiment of the present invention, the obtaining the character string characteristic of the certificate signature information according to the known signature character string includes: inputting the certificate signature information into a pre-trained feature extraction model to obtain the character string features output by the feature extraction model; the feature extraction model is obtained by training by using the known signature character string as a sample and is used for obtaining the character string features of the input information.
In an optional embodiment of the present invention, before the obtaining the character string feature of the certificate signature information according to the known signature character string, the method further includes: acquiring a first preset number of known signature character strings; performing feature marking on character string features of the known signature character strings; inputting the known signature character string after the characteristic marking into a characteristic extraction model as a sample, and training the characteristic extraction model; and the trained feature extraction model is used for extracting the character string features of the certificate signature information.
In an optional embodiment of the present invention, the obtaining a first preset number of known signature strings includes: acquiring a second preset number of legal signature character strings; carrying out random coding processing on the legal signature character strings to obtain a second preset number of random character strings; determining the legal signature character string and the random character string as the known signature character string; the characteristic marking of the character string characteristics of each known signature character string comprises the following steps: marking legal character string characteristics in the legal signature character string; and marking illegal character string characteristics in the random character string.
The embodiment of the invention provides a program identification method. Behavior feature information and certificate signature information of a program to be identified are acquired; the program behavior features of the program to be identified are obtained from the behavior feature information, and the signature randomness features of the program to be identified are obtained from the certificate signature information. The program behavior features and the signature randomness features are then input into a program identification model to obtain an identification result. In this way, multi-angle feature extraction and identification of application programs is realized based on a machine learning method, which improves the efficiency and accuracy of malicious program detection.
Example four
Fig. 6 is a schematic structural diagram of a program recognition model training apparatus according to a fourth embodiment of the present invention. As shown in Fig. 6, the apparatus includes: a sample information acquisition module 410, a sample behavior feature acquisition module 420, a sample signature feature acquisition module 430, and a program identification model training module 440.
The sample information obtaining module 410 is configured to obtain behavior feature information and certificate signature information of each sample program.
A sample behavior feature obtaining module 420, configured to obtain program behavior features of each sample program according to the behavior feature information.
A sample signature characteristic obtaining module 430, configured to obtain a character string randomness of the certificate signature information, to obtain a signature randomness characteristic of each sample program.
A program identification model training module 440, configured to input the program behavior features and the signature randomness features as sample training data to a program identification model to train the program identification model.
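The cooperation of modules 410 to 440 can be sketched as a small class whose four fields correspond to the four modules. The class name, the behavior fields "sends_sms" and "reads_contacts", and the stub callables are all hypothetical and chosen only to make the data flow visible; a real apparatus would plug in actual extractors and a real learner.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple


@dataclass
class TrainingApparatus:
    acquire_info: Callable[[Any], Tuple[Any, str]]                # module 410
    behavior_features: Callable[[Any], Sequence[float]]           # module 420
    signature_is_random: Callable[[str], bool]                    # module 430
    fit_model: Callable[[List[List[float]], Sequence[int]], Any]  # module 440

    def train(self, sample_programs, labels):
        """Build one feature row per sample program, then fit the model."""
        rows = []
        for program in sample_programs:
            behavior_info, cert_signature = self.acquire_info(program)
            row = list(self.behavior_features(behavior_info))
            row.append(1.0 if self.signature_is_random(cert_signature) else 0.0)
            rows.append(row)
        return self.fit_model(rows, labels)


# Stub modules for illustration only (assumptions).
apparatus = TrainingApparatus(
    acquire_info=lambda p: (p["behavior"], p["signature"]),
    behavior_features=lambda info: [float(info["sends_sms"]), float(info["reads_contacts"])],
    signature_is_random=lambda sig: sig not in {"google", "wangjun"},
    fit_model=lambda rows, labels: {"rows": rows, "labels": list(labels)},
)
programs = [
    {"behavior": {"sends_sms": 1, "reads_contacts": 1}, "signature": "qzxkvb"},
    {"behavior": {"sends_sms": 0, "reads_contacts": 0}, "signature": "google"},
]
trained = apparatus.train(programs, labels=[1, 0])
print(trained["rows"])  # [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
```

Keeping each module as a separate callable mirrors the apparatus structure, so any one of them can be replaced without touching the others.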
In an optional implementation manner of the embodiment of the present invention, the sample signature feature obtaining module 430 may include: the sample character string characteristic acquisition submodule is used for acquiring the character string characteristics of the certificate signature information according to the known signature character string; and the first sample non-random determining submodule is used for determining the certificate signature information as a non-random character string under the condition that the character string features are determined to be legal character string features, and obtaining that the signature randomness features of the sample program corresponding to the certificate signature information are signature non-random.
In an optional implementation manner of the embodiment of the present invention, the sample signature feature obtaining module 430 may further include: the sample signature matching submodule is used for matching the certificate signature information in a pre-generated character string comparison library under the condition that the character string features are determined to be illegal character string features; a second sample non-random determining submodule, configured to determine that the certificate signature information is a non-random character string under the condition that it is determined that the certificate signature information is successfully matched in the character string comparison library, and obtain that the signature randomness characteristic of the sample program corresponding to the certificate signature information is signature non-random; and the sample random determining submodule is used for determining the certificate signature information as a random character string under the condition that the matching of the certificate signature information in the character string comparison library is failed, and obtaining that the signature randomness characteristic of the sample program corresponding to the certificate signature information is signature randomness.
In an optional implementation manner of the embodiment of the present invention, the sample character string characteristic acquisition submodule may be specifically configured to: inputting the certificate signature information into a pre-trained feature extraction model to obtain the character string features output by the feature extraction model; the feature extraction model is obtained by training by using the known signature character string as a sample and is used for obtaining the character string features of the input information.
In an optional implementation manner of the embodiment of the present invention, the sample signature feature obtaining module 430 may further include: the model training submodule is used for acquiring a first preset number of known signature character strings; performing feature marking on character string features of the known signature character strings; inputting the known signature character string after the characteristic marking into a characteristic extraction model as a sample, and training the characteristic extraction model; and the trained feature extraction model is used for extracting the character string features of the certificate signature information.
In an optional implementation manner of the embodiment of the present invention, the model training sub-module may be specifically configured to: acquiring a second preset number of legal signature character strings; carrying out random coding processing on the legal signature character strings to obtain a second preset number of random character strings; determining the legal signature character string and the random character string as the known signature character string; marking legal character string characteristics in the legal signature character string; and marking illegal character string characteristics in the random character string.
The device can execute the program recognition model training method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of executing the program recognition model training method.
The embodiment of the invention provides a program recognition model training apparatus. Behavior feature information and certificate signature information of sample programs are acquired; the program behavior features of each sample program are obtained from the behavior feature information, and the signature randomness features of each sample program are obtained from the certificate signature information. The program behavior features and the signature randomness features are then input into a program recognition model as sample training data to train the program recognition model, so as to obtain a model that detects malicious programs based on multiple features.
Example five
Fig. 7 is a schematic structural diagram of a program identification apparatus according to a fifth embodiment of the present invention. As shown in Fig. 7, the apparatus includes: a to-be-identified information obtaining module 510, a to-be-identified behavior feature obtaining module 520, a to-be-identified signature feature obtaining module 530, and a program identification module 540.
The to-be-identified information obtaining module 510 is configured to acquire behavior feature information and certificate signature information of a program to be identified.
The to-be-identified behavior feature obtaining module 520 is configured to obtain, according to the behavior feature information, the program behavior features of the program to be identified.
The to-be-identified signature feature obtaining module 530 is configured to acquire the character string randomness of the certificate signature information to obtain the signature randomness feature of the program to be identified.
The program identification module 540 is configured to input the program behavior features and the signature randomness feature into a program identification model as to-be-detected data of the program to be identified, so as to obtain a program identification result of the program to be identified.
The program identification model is obtained by training through the program identification model training method in any embodiment of the invention.
In an optional implementation manner of the embodiment of the present invention, the to-be-identified signature feature obtaining module 530 may include: the character string feature acquisition submodule to be identified is used for acquiring the character string features of the certificate signature information according to the known signature character string; and the first non-random determining submodule to be identified is used for determining that the certificate signature information is a non-random character string under the condition that the character string features are determined to be legal character string features, and obtaining that the signature randomness features of the program to be identified corresponding to the certificate signature information are signature non-random.
In an optional implementation manner of the embodiment of the present invention, the to-be-identified signature feature obtaining module 530 may further include: the signature matching submodule to be identified is used for matching the certificate signature information in a pre-generated character string comparison library under the condition that the character string features are determined to be illegal character string features; the second to-be-identified non-random determining submodule is used for determining that the certificate signature information is a non-random character string under the condition that the certificate signature information is successfully matched in the character string comparison library, and obtaining that the signature randomness characteristic of the to-be-identified program corresponding to the certificate signature information is signature non-random; and the to-be-identified random determining submodule is used for determining that the certificate signature information is a random character string under the condition that the matching of the certificate signature information in the character string comparison library is failed, and obtaining that the signature randomness characteristic of the to-be-identified program corresponding to the certificate signature information is signature randomness.
In an optional implementation manner of the embodiment of the present invention, the character string feature acquisition submodule to be identified may be specifically configured to: inputting the certificate signature information into a pre-trained feature extraction model to obtain the character string features output by the feature extraction model; the feature extraction model is obtained by training by using the known signature character string as a sample and is used for obtaining the character string features of the input information.
In an optional implementation manner of the embodiment of the present invention, the to-be-identified signature feature obtaining module 530 may further include: the model training submodule is used for acquiring a first preset number of known signature character strings; performing feature marking on character string features of the known signature character strings; inputting the known signature character string after the characteristic marking into a characteristic extraction model as a sample, and training the characteristic extraction model; and the trained feature extraction model is used for extracting the character string features of the certificate signature information.
In an optional implementation manner of the embodiment of the present invention, the model training sub-module may be specifically configured to: acquiring a second preset number of legal signature character strings; carrying out random coding processing on the legal signature character strings to obtain a second preset number of random character strings; determining the legal signature character string and the random character string as the known signature character string; marking legal character string characteristics in the legal signature character string; and marking illegal character string characteristics in the random character string.
The device can execute the program identification method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of executing the program identification method.
The embodiment of the invention provides a program identification apparatus. Behavior feature information and certificate signature information of a program to be identified are acquired; the program behavior features of the program to be identified are obtained from the behavior feature information, and the signature randomness features of the program to be identified are obtained from the certificate signature information. The program behavior features and the signature randomness features are then input into a program identification model to obtain an identification result. In this way, multi-angle feature extraction and identification of application programs is realized based on a machine learning method, which improves the efficiency and accuracy of malicious program detection.
Example six
Fig. 8 is a schematic structural diagram of a computer device according to a sixth embodiment of the present invention. Fig. 8 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present invention. The computer device 12 shown in Fig. 8 is only an example and should not impose any limitation on the functionality or scope of use of the embodiments of the present invention.
As shown in FIG. 8, computer device 12 is in the form of a general purpose computing device. The components of computer device 12 may include, but are not limited to: one or more processors 16, a memory 28, and a bus 18 that connects the various system components (including the memory 28 and the processors 16).
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. Computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in Fig. 8, and commonly referred to as a "hard drive"). Although not shown in Fig. 8, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any devices (e.g., network card, modem, etc.) that enable computer device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, computer device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via network adapter 20. As shown, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be appreciated that although not shown in FIG. 8, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processor 16 executes various functional applications and data processing by running the programs stored in the memory 28, so as to implement the program recognition model training method provided by the embodiments of the present invention: acquiring behavior characteristic information and certificate signature information of each sample program; acquiring program behavior characteristics of each sample program according to the behavior characteristic information; acquiring the character string randomness of the certificate signature information to obtain the signature randomness characteristics of each sample program; and inputting the program behavior characteristics and the signature randomness characteristics as sample training data into a program recognition model to train the program recognition model; or
to implement the program identification method: acquiring behavior characteristic information and certificate signature information of a program to be identified; acquiring the program behavior characteristics of the program to be identified according to the behavior characteristic information; acquiring the character string randomness of the certificate signature information to obtain the signature randomness characteristics of the program to be identified; and inputting the program behavior characteristics and the signature randomness characteristics into a program recognition model as to-be-detected data of the program to be identified to obtain a program identification result of the program to be identified; wherein the program recognition model is obtained by training with the program recognition model training method provided by any embodiment of the present invention.
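The training and identification steps above share one feature construction: program behavior characteristics concatenated with a signature randomness characteristic. The following is a minimal Python sketch of that flow, not the implementation described by the source. The behavior vocabulary `BEHAVIORS`, the `Program` container, the nearest-centroid classifier, the labels, and the vowel-based `toy_is_random` check are all hypothetical stand-ins, because this passage does not fix a model type, a behavior list, or a label set.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical behavior vocabulary (assumption; not taken from the source).
BEHAVIORS = ("send_sms", "read_contacts", "load_dex")


@dataclass
class Program:
    behaviors: Sequence[str]  # behavior characteristic information
    signature: str            # certificate signature information


def toy_is_random(signature: str) -> bool:
    """Placeholder randomness check (assumption): no vowels means random."""
    return not any(ch in "aeiou" for ch in signature.lower())


def feature_vector(program: Program, is_random: Callable[[str], bool]) -> List[float]:
    """Program behavior features followed by the signature randomness feature."""
    behavior = [1.0 if b in program.behaviors else 0.0 for b in BEHAVIORS]
    return behavior + [1.0 if is_random(program.signature) else 0.0]


class NearestCentroid:
    """Tiny stand-in for the program recognition model (assumption)."""

    def fit(self, X: List[List[float]], y: List[str]) -> "NearestCentroid":
        self.centroids: Dict[str, List[float]] = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            # Column-wise mean of all rows carrying this label.
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, x: List[float]) -> str:
        def sq_dist(centroid: List[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, centroid))
        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


# Training: sample programs with known (hypothetical) labels.
samples = [
    (Program(["send_sms", "load_dex"], "xkq7zt"), "malicious"),
    (Program(["read_contacts"], "Example Corp"), "benign"),
    (Program(["send_sms"], "qwrtpl"), "malicious"),
    (Program([], "Acme Software"), "benign"),
]
X = [feature_vector(p, toy_is_random) for p, _ in samples]
y = [label for _, label in samples]
model = NearestCentroid().fit(X, y)

# Identification: the same feature construction, then a model query.
unknown = Program(["send_sms", "load_dex"], "zzkpt9")
result = model.predict(feature_vector(unknown, toy_is_random))
print(result)  # prints "malicious"
```

The point of the sketch is only that training and identification must build their input vectors in the same way; any classifier that accepts a fixed-length numeric vector could replace `NearestCentroid`.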
EXAMPLE seven
The seventh embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, it implements the program recognition model training method provided by the embodiments of the present invention: acquiring behavior characteristic information and certificate signature information of each sample program; acquiring program behavior characteristics of each sample program according to the behavior characteristic information; acquiring the character string randomness of the certificate signature information to obtain the signature randomness characteristics of each sample program; and inputting the program behavior characteristics and the signature randomness characteristics as sample training data into a program recognition model to train the program recognition model; or
it implements the program identification method: acquiring behavior characteristic information and certificate signature information of a program to be identified; acquiring the program behavior characteristics of the program to be identified according to the behavior characteristic information; acquiring the character string randomness of the certificate signature information to obtain the signature randomness characteristics of the program to be identified; and inputting the program behavior characteristics and the signature randomness characteristics into a program recognition model as to-be-detected data of the program to be identified to obtain a program identification result of the program to be identified; wherein the program recognition model is obtained by training with the program recognition model training method provided by any embodiment of the present invention.
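The step "acquiring the character string randomness of the certificate signature information" is refined in claims 2 and 3 into a two-stage decision: a signature with a legal character string feature is non-random; otherwise it is looked up in a pre-generated character string comparison library, and only a failed lookup yields "random". A minimal Python sketch of that decision follows. The `toy_extractor` function is a hypothetical stand-in for the pre-trained feature extraction model, and exact set membership is an assumed matching rule; the source specifies neither.

```python
from typing import Callable, Set

LEGAL, ILLEGAL = "legal", "illegal"


def toy_extractor(signature: str) -> str:
    """Stand-in for the feature extraction model (assumption).

    Treats a signature as legal when every space-separated token is
    alphabetic and contains at least one vowel.
    """
    tokens = signature.split()
    looks_legal = bool(tokens) and all(
        t.isalpha() and any(c in "aeiou" for c in t.lower()) for t in tokens
    )
    return LEGAL if looks_legal else ILLEGAL


def signature_randomness(
    signature: str,
    extract_string_feature: Callable[[str], str],
    comparison_library: Set[str],
) -> str:
    """Return "non-random" or "random" following the two-stage decision."""
    # Stage 1: a legal character string feature means non-random.
    if extract_string_feature(signature) == LEGAL:
        return "non-random"
    # Stage 2: an illegal feature is checked against the comparison library
    # (exact match is an assumption made for this sketch).
    if signature in comparison_library:
        return "non-random"
    return "random"


# Hypothetical comparison library of odd-looking but known signatures.
library = {"XJ9 Labs"}

for sig in ("Example Corp", "XJ9 Labs", "q8zr1kx"):
    print(sig, "->", signature_randomness(sig, toy_extractor, library))
```

The library lookup acts as a correction for the extractor: a signature that looks random but is already known is still reported as non-random.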
Any combination of one or more computer-readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or computer device. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (11)

1. A method for training a program recognition model, comprising:
acquiring behavior characteristic information and certificate signature information of each sample program;
acquiring program behavior characteristics of each sample program according to the behavior characteristic information;
acquiring the character string randomness of the certificate signature information to obtain the signature randomness characteristics of each sample program;
and inputting the program behavior characteristics and the signature randomness characteristics as sample training data to a program recognition model to train the program recognition model.
2. The method according to claim 1, wherein the obtaining of the randomness of the character strings of the certificate signature information to obtain the signature randomness characteristics of each of the sample programs comprises:
acquiring character string characteristics of each certificate signature information according to a known signature character string;
and under the condition that the character string features are determined to be legal character string features, determining the certificate signature information to be a non-random character string, and obtaining that the signature randomness features of the sample program corresponding to the certificate signature information are signature non-random.
3. The method according to claim 2, further comprising, after the obtaining the string feature of the certificate signature information from the known signature string:
under the condition that the character string features are determined to be illegal character string features, matching the certificate signature information in a character string comparison library generated in advance;
under the condition that the certificate signature information is successfully matched in the character string comparison library, determining that the certificate signature information is a non-random character string, and obtaining that the signature randomness characteristic of the sample program corresponding to the certificate signature information is signature non-random;
and under the condition that the certificate signature information is determined to be unsuccessfully matched in the character string comparison library, determining the certificate signature information to be a random character string, and obtaining that the signature randomness characteristic of the sample program corresponding to the certificate signature information is signature random.
4. The method according to claim 2, wherein the obtaining the character string characteristic of the certificate signature information according to the known signature character string comprises:
inputting the certificate signature information into a pre-trained feature extraction model to obtain the character string features output by the feature extraction model;
the feature extraction model is obtained by training by using the known signature character string as a sample and is used for obtaining the character string features of the input information.
5. The method according to claim 4, further comprising, before the obtaining the string feature of the certificate signature information from the known signature string:
acquiring a first preset number of known signature character strings;
performing feature marking on character string features of the known signature character strings;
inputting the known signature character string after the characteristic marking into a characteristic extraction model as a sample, and training the characteristic extraction model;
and the trained feature extraction model is used for extracting the character string features of the certificate signature information.
6. The method of claim 5, wherein obtaining the first preset number of known signature strings comprises:
acquiring a second preset number of legal signature character strings;
carrying out random coding processing on the legal signature character strings to obtain a second preset number of random character strings;
determining the legal signature character string and the random character string as the known signature character string;
the characteristic marking of the character string characteristics of each known signature character string comprises the following steps:
marking legal character string characteristics in the legal signature character string;
and marking illegal character string characteristics in the random character string.
7. A program identification method, comprising:
acquiring behavior characteristic information and certificate signature information of a program to be identified;
acquiring the program behavior characteristics of the program to be identified according to the behavior characteristic information;
acquiring the character string randomness of the certificate signature information to obtain the signature randomness characteristics of the program to be identified;
inputting the program behavior characteristics and the signature randomness characteristics into a program identification model as to-be-detected data of the program to be identified to obtain a program identification result of the program to be identified; wherein, the program recognition model is obtained by training through the program recognition model training method of any one of claims 1 to 6.
8. A program recognition model training apparatus, comprising:
the sample information acquisition module is used for acquiring behavior characteristic information and certificate signature information of each sample program;
the sample behavior characteristic acquisition module is used for acquiring the program behavior characteristics of each sample program according to the behavior characteristic information;
the sample signature characteristic acquisition module is used for acquiring the character string randomness of the certificate signature information to obtain the signature randomness characteristics of each sample program;
and the program identification model training module is used for inputting the program behavior characteristics and the signature randomness characteristics into a program identification model as sample training data so as to train the program identification model.
9. A program identifying apparatus, comprising:
the to-be-identified information acquisition module is used for acquiring behavior characteristic information and certificate signature information of a program to be identified;
the behavior feature acquiring module is used for acquiring the program behavior feature of the program to be identified according to the behavior feature information;
the signature feature acquisition module to be identified is used for acquiring the character string randomness of the certificate signature information to obtain the signature randomness features of the program to be identified;
the program identification module is used for inputting the program behavior characteristics and the signature randomness characteristics into a program identification model as to-be-detected data of the program to be identified to obtain a program identification result of the program to be identified; wherein, the program recognition model is obtained by training through the program recognition model training method of any one of claims 1 to 6.
10. A computer device, characterized in that the computer device comprises:
one or more processors;
storage means for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the program recognition model training method of any one of claims 1-6 or the program recognition method of claim 7.
11. A computer storage medium on which a computer program is stored, which program, when being executed by a processor, is adapted to carry out a program recognition model training method according to any one of claims 1 to 6 or a program recognition method according to claim 7.
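Claims 5 and 6 describe how the training samples for the feature extraction model are assembled: a second preset number of legal signature strings, the same number of random strings derived from them by random coding, and a feature mark on each. A minimal Python sketch follows. The source does not define "random coding processing", so replacing every character with a random alphanumeric character is an assumption, as are the function names, the label strings, and the fixed seed.

```python
import random
import string
from typing import List, Tuple

# Character set used by the assumed random coding step.
ALPHABET = string.ascii_letters + string.digits


def random_encode(legal_signature: str, rng: random.Random) -> str:
    """Assumed 'random coding': redraw every character from ALPHABET."""
    return "".join(rng.choice(ALPHABET) for _ in legal_signature)


def build_known_signature_samples(
    legal_signatures: List[str], seed: int = 0
) -> List[Tuple[str, str]]:
    """Return (string, feature mark) pairs for training the extractor.

    Each legal signature is marked "legal"; each randomly coded
    counterpart is marked "illegal". The result therefore holds twice
    as many known signature strings as there are legal inputs.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    marked = [(s, "legal") for s in legal_signatures]
    marked += [(random_encode(s, rng), "illegal") for s in legal_signatures]
    return marked


# Hypothetical legal signature strings.
legal = ["Example Corp", "Acme Software"]
known = build_known_signature_samples(legal)
for text, mark in known:
    print(f"{mark:8s} {text}")
```

Under this reading, the first preset number of known signature strings is twice the second preset number, which keeps the two feature marks balanced in the training set.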
CN202110997676.2A 2021-08-27 2021-08-27 Program recognition model training and program recognition method, device, equipment and medium Pending CN113742726A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110997676.2A CN113742726A (en) 2021-08-27 2021-08-27 Program recognition model training and program recognition method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110997676.2A CN113742726A (en) 2021-08-27 2021-08-27 Program recognition model training and program recognition method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN113742726A true CN113742726A (en) 2021-12-03

Family

ID=78733509

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110997676.2A Pending CN113742726A (en) 2021-08-27 2021-08-27 Program recognition model training and program recognition method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN113742726A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105354496A (en) * 2015-10-10 2016-02-24 邱寅峰 Detection method and system of malicious program automatically generated on Android platform
CN106778277A (en) * 2017-01-13 2017-05-31 北京邮电大学 Malware detection methods and device
US20180204119A1 (en) * 2017-01-19 2018-07-19 International Business Machines Corporation Method and Apparatus for Driver Identification Leveraging Telematics Data
CN108304720A (en) * 2018-02-06 2018-07-20 恒安嘉新(北京)科技股份公司 A kind of Android malware detection methods based on machine learning
CN108764226A (en) * 2018-04-13 2018-11-06 顺丰科技有限公司 Image text recognition methods, device, equipment and its storage medium
CN109379377A (en) * 2018-11-30 2019-02-22 极客信安(北京)科技有限公司 Encrypt malicious traffic stream detection method, device, electronic equipment and storage medium
CN109829307A (en) * 2018-06-26 2019-05-31 360企业安全技术(珠海)有限公司 Process behavior recognition methods and device
CN111476290A (en) * 2020-04-03 2020-07-31 北京推想科技有限公司 Detection model training method, lymph node detection method, apparatus, device and medium
CN112347475A (en) * 2020-11-11 2021-02-09 北京航空航天大学 Malicious certificate automatic detection system and method based on deep learning technology

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HISHAM SHEHATA GALAL et al.: "Behavior-based features model for malware detection", pages 1 - 9, Retrieved from the Internet <URL: published online at https://link.springer.com/article/10.1007/s11416-015-0244-0> *
徐玄骥 et al.: "Android malware detection method based on multi-dimensional features", Communications Technology (《通信技术》), vol. 54, no. 5, 26 May 2021 (2021-05-26), pages 1240 - 1245 *

Similar Documents

Publication Publication Date Title
CN108053545B (en) Certificate verification method and device, server and storage medium
CN111159697B (en) Key detection method and device and electronic equipment
CN112966713B (en) DGA domain name detection method and device based on deep learning and computer equipment
US11361058B2 (en) Method used in a mobile equipment with a trusted execution environment for authenticating a user based on his face
CN110807194A (en) Webshell detection method and device
CN116366377B (en) Malicious file detection method, device, equipment and storage medium
WO2023093346A1 (en) Exogenous feature-based model ownership verification method and apparatus
CN112733140A (en) Detection method and system for model tilt attack
CN110858247A (en) Android malicious application detection method, system, device and storage medium
CN110879888A (en) Virus file detection method, device and equipment
CN113742726A (en) Program recognition model training and program recognition method, device, equipment and medium
CN114448664B (en) Method and device for identifying phishing webpage, computer equipment and storage medium
CN113742727A (en) Program recognition model training and program recognition method, device, equipment and medium
CN115688107A (en) Fraud-related APP detection system and method
CN115310087A (en) Website backdoor detection method and system based on abstract syntax tree
CN113935022A (en) Homologous sample capturing method and device, electronic equipment and storage medium
CN113836297A (en) Training method and device for text emotion analysis model
CN115022001B (en) Training method and device of domain name recognition model, electronic equipment and storage medium
CN113609352B (en) Character string retrieval method, device, computer equipment and storage medium
Seas et al. Automated Vulnerability Detection in Source Code Using Deep Representation Learning
Patil et al. Impact of PCA Feature Extraction Method used in Malware Detection for Security Enhancement
CN113806715B (en) SDK security analysis method and system for embedded equipment
CN116611057B (en) Data security detection method and system thereof
CN115718696B (en) Source code cryptography misuse detection method and device, electronic equipment and storage medium
RU2483355C1 (en) Method of identifying mobile device user from unique signature thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination