CN107491696B

CN107491696B - Software security analysis method and system based on immune model

Info

Publication number: CN107491696B
Application number: CN201710718223.5A
Authority: CN
Inventors: 丁滟; 戴华东; 董攀; 黄辰林; 谭郁松; 陈松政; 魏立峰
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2017-08-21
Filing date: 2017-08-21
Publication date: 2020-07-10
Anticipated expiration: 2037-08-21
Also published as: CN107491696A

Abstract

The invention discloses a software security analysis method and system based on an immune model. The method comprises the following steps: performing feature sampling on a normally running system and software to extract a self-feature library, and generating an antibody library from the self-feature library according to the negative selection principle of artificial immunity; performing feature sampling on the currently running system and software to extract an antigen library, matching the feature strings in the antigen library against the antibodies in the antibody library, and reporting a security event if a matching condition is met. The system comprises program modules corresponding to the method. Drawing on the biological immune system and on practical application needs, the invention maps the abstract artificial immune model onto concrete indicators of a software system, extracts the various feature events of the software system, defines the "self" features of the normal system, and detects and responds to security events in the system by identifying "non-self", thereby applying artificial immunity to software security detection based on system operation.

Description

Software security analysis method and system based on immune model

Technical Field

The invention relates to a software security analysis technology, in particular to a software security analysis method and system based on an immune model.

Background

Software security analysis is computationally large in scale and highly complex, and existing methods fall short in detection efficiency and result accuracy. Static analysis produces large result sets with a high false alarm rate and lacks analysis and checking targeted at functions and program structure; dynamic analysis is inefficient and hard to automate; fuzzing lacks generality and requires a long time to construct tests. In view of these shortcomings, this work studies a new approach, a software security analysis method based on an immune model, and explores how to improve the accuracy and efficiency of software security analysis and detection.

In nature, the immune system of an organism distinguishes the "self" that belongs to the normal body from the abnormal "non-self" that arises inside or outside the organism, detects and eliminates antigens that do not belong to the body, and produces antibodies against those antigens in response. Computer security therefore closely resembles the problem faced by the biological immune system: both must keep a system stable in a constantly changing environment. The main idea of the Artificial Immune System (AIS) is to take the biological immune system as a reference, use its characteristics as a theoretical basis, and combine them with the practical constraints of engineering and application to solve problems. Artificial immune systems have already been studied for virus detection, network intrusion detection, malicious code analysis and similar applications. How to apply artificial immunity to software security detection based on system operation, however, remains a key technical problem to be solved.

Disclosure of Invention

The technical problem to be solved by the invention is as follows: in view of the above problems in the prior art, a software security analysis method and system based on an immune model are provided. The invention addresses software security analysis by taking the biological immune system as a reference, using its characteristics as a theoretical basis, and combining them with the practical constraints of engineering and application. It maps the abstract artificial immune model onto concrete indicators of a software system, extracts the various feature events of the software system, defines the "self" features of the normal system, and detects and responds to security events in the system by identifying "non-self", thereby applying artificial immunity to software security detection based on system operation.

In order to solve the technical problems, the invention adopts the technical scheme that:

in one aspect, the invention provides a software security analysis method based on an immune model, which comprises the following implementation steps:

1) performing feature sampling in advance on a normally running system and software to extract a self-feature library S, and generating an antibody library from the self-feature library S according to the negative selection principle of artificial immunity; proceeding to the next step when software security analysis is required;

2) performing feature sampling on the currently running system and software to extract an antigen library, matching the feature strings in the antigen library against the antibodies in the antibody library, and reporting a security event if the matching condition is met.

Preferably, the features collected during feature sampling in step 1) and step 2) comprise a system state feature vector S, a network communication feature vector N, a software behavior feature vector A and a software configuration feature vector C. The system state feature vector S includes at least one of CPU usage and memory usage; the network communication feature vector N includes at least one of the inbound traffic, outbound traffic and access targets of the system network; the software behavior feature vector A includes at least one of the software's system resource usage, network resource usage, system call behavior and file access behavior features; the software configuration feature vector C includes at least one feature of the system's software configuration.

Preferably, the detailed steps of performing feature sampling to extract the self-feature library S in step 1) include: encoding the features collected in each sampling run to obtain a vector for each feature, encoding these vectors to generate a self-feature string, and merging the self-feature strings obtained from multiple sampling runs to form the self-feature library S.
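The encoding step can be illustrated with a minimal sketch. The source does not specify an encoding rule, so everything concrete here is an assumption: each feature is clamped to a known range, quantized to a fixed number of bits, and the codes are concatenated in a fixed order into a binary feature string. The names `quantize`, `encode_sample`, `build_self_library` and `BITS_PER_FEATURE` are hypothetical.

```python
from typing import Dict, Sequence, Set, Tuple

BITS_PER_FEATURE = 4  # assumed quantization width; the source does not fix one


def quantize(value: float, lo: float, hi: float, bits: int = BITS_PER_FEATURE) -> str:
    """Map a raw feature value in [lo, hi] to a fixed-width binary code."""
    clamped = min(max(value, lo), hi)
    levels = (1 << bits) - 1
    level = round((clamped - lo) / (hi - lo) * levels)
    return format(level, f"0{bits}b")


def encode_sample(sample: Dict[str, float],
                  ranges: Dict[str, Tuple[float, float]]) -> str:
    """Concatenate the codes of all features, in sorted name order, into one string."""
    return "".join(quantize(sample[name], *ranges[name]) for name in sorted(ranges))


def build_self_library(samples: Sequence[Dict[str, float]],
                       ranges: Dict[str, Tuple[float, float]]) -> Set[str]:
    """Merge the strings from repeated sampling runs into a self-feature library."""
    return {encode_sample(s, ranges) for s in samples}
```

Because every sample is encoded with the same ranges and feature order, all strings in the library have equal length, which the antibody generation steps rely on.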

Preferably, the detailed steps of generating the antibody library based on the self-characteristics library S according to the negative selection principle of artificial immunity in step 1) include:

1.1) encoding the self-feature library S in advance to generate a set $F$ of character strings of equal length $l$;

1.2) defining a character string matching rule; initializing the value of the antibody count $j$ of the antibody library;

1.3) randomly generating a character string $a$ of length $l$, and initializing the value of the loop traversal count $i$;

1.4) selecting the $i$-th feature string $f_i$ from the string set $F$;

1.5) judging whether the string $a$ and the $i$-th feature string $f_i$ satisfy the predefined string matching rule; if so, jumping to step 1.3); otherwise, updating the value of the loop traversal count $i$;

1.6) judging whether the value of the loop traversal count $i$ is greater than the total number of self strings in the string set $F$; if so, adding the string $a$ to the antibody library as an antibody, incrementing the antibody count $j$ by 1, and jumping to step 1.7); otherwise, jumping to step 1.4);

1.7) judging whether the antibody count $j$ of the antibody library equals the expected number of antibodies N; if so, generation of the antibody library is complete; otherwise, jumping to step 1.3).
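Steps 1.1) to 1.7) can be sketched as follows. The source leaves the string matching rule open, so this sketch assumes the r-contiguous-bits rule often used with negative selection: two strings match if they agree on at least `r` consecutive positions. The function names, the `max_tries` guard and the binary alphabet are likewise assumptions made for illustration.

```python
import random
from typing import List, Optional, Set


def r_contiguous_match(a: str, b: str, r: int) -> bool:
    """Assumed matching rule: a and b agree on at least r consecutive positions."""
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False


def generate_antibodies(self_strings: Set[str], length: int, n_antibodies: int,
                        r: int, rng: Optional[random.Random] = None,
                        max_tries: int = 100_000) -> List[str]:
    """Negative selection: keep random candidates that match no self string."""
    rng = rng or random.Random()
    antibodies: List[str] = []              # step 1.2: antibody count j starts at 0
    tries = 0
    while len(antibodies) < n_antibodies:   # step 1.7: stop once j equals N
        tries += 1
        if tries > max_tries:               # guard not in the source; avoids endless loops
            raise RuntimeError("too few antibodies found; relax r or lower N")
        candidate = "".join(rng.choice("01") for _ in range(length))  # step 1.3
        # steps 1.4-1.6: a single match with any self string rejects the candidate
        if any(r_contiguous_match(candidate, s, r) for s in self_strings):
            continue
        antibodies.append(candidate)        # duplicates are kept; the source does not filter them
    return antibodies
```

A smaller `r` makes matching easier, so more candidates are rejected and the surviving antibodies cover less of the non-self space; choosing `r` is a trade-off the source does not discuss.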

Preferably, the detailed steps of extracting the antigen library in step 2) include: performing feature sampling and encoding on the currently running system and software to obtain a vector for each feature, and encoding these vectors to generate feature strings that serve as the antigen library.
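The matching detection of step 2) then reduces to comparing each antigen with each antibody. The sketch below is illustrative only: the matching rule is passed in as a parameter because the source does not fix one, and the r-contiguous rule in the usage test is an assumed example. `detect` and `MatchRule` are hypothetical names.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

MatchRule = Callable[[str, str], bool]


def detect(antigens: Iterable[str], antibodies: Sequence[str],
           rule: MatchRule) -> List[Tuple[str, str]]:
    """Return one (antigen, antibody) pair per antigen that some antibody matches.

    Each returned pair corresponds to a security event to be reported.
    """
    events: List[Tuple[str, str]] = []
    for antigen in antigens:
        for antibody in antibodies:
            if rule(antigen, antibody):
                events.append((antigen, antibody))
                break  # one matching antibody is enough to flag this antigen
    return events


def contiguous_rule(r: int) -> MatchRule:
    """Assumed example rule: strings agree on at least r consecutive positions."""
    def rule(a: str, b: str) -> bool:
        run = 0
        for x, y in zip(a, b):
            run = run + 1 if x == y else 0
            if run >= r:
                return True
        return False
    return rule
```

The same rule used to generate the antibody library must be used here; otherwise antibodies could match strings from the self-feature library.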

Preferably, step 2) further comprises a step of synergistically stimulating an immune response, and the detailed steps comprise:

S1) when a security event is reported, inputting the current change in system resource occupation into a pre-established danger signal calculation model to obtain a determination of whether the corresponding system resource occupation is abnormal; the danger signal calculation model comprises at least one mapping relationship between a change in system resource occupation and the occurrence of a system resource occupation abnormality, and outputs the corresponding danger signal whenever any system resource occupation is determined to be abnormal;

S2) judging, according to the abnormality determinations output by the danger signal calculation model, whether the comprehensive danger condition for cooperatively stimulating the immune response triggers the comprehensive danger signal; if the comprehensive danger signal is triggered, determining that harmful software exists in the system and that the system is in a dangerous state, and reporting a security event.

Preferably, the step of establishing the mapping relationship between the change in the system resource occupation and the occurrence of the system resource occupation exception in the step S1) includes:

S1.1) in the training stage, sampling the system resource occupation in the normal state, where the sampling interval is $h$, the time interval of the training stage is denoted $[t_1, t_n]$, and $t_i = t_{i-1} + h$;

S1.2) after the training stage ends, connecting the recorded system resource occupation data to generate a curve of system resource occupation over time, $y = f(t)$;

S1.3) generating a normal system resource occupation change feature library M' based on the curve of system resource occupation over time, where M' consists of system resource occupation feature two-tuples, each expressed as $\{f(t_i),\ f'_{\mathrm{left}}(t_i)\}$; $f(t_i)$ denotes the system resource occupation at time $t_i$, and $f'_{\mathrm{left}}(t_i)$ denotes the left differential of the feature point, given by $f'_{\mathrm{left}}(t_i) = \dfrac{f(t_i) - f(t_{i-1})}{h}$, where $f(t_{i-1})$ denotes the system resource occupation at time $t_{i-1}$ and $h$ denotes the distance between $t_i$ and $t_{i-1}$;

S1.4) establishing the mapping relationship between a change in system resource occupation and the occurrence of a system resource occupation abnormality: after the system enters the running stage, a real-time system resource occupation feature two-tuple is generated from the system resource occupation monitored in real time and compared with the historical feature two-tuples in the normal system resource occupation change feature library M'; if the real-time two-tuple is larger than all the historical two-tuples in M', the system resource occupation is determined to be abnormal, and the corresponding danger signal is output whenever any system resource occupation is determined to be abnormal.
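Steps S1.1) to S1.4) can be sketched for a single resource as follows. The source does not define what it means for one two-tuple to be "larger than" all historical two-tuples; this sketch assumes a component-wise reading in which both the occupation value and the left difference quotient must exceed every value seen in training. Other readings, such as either component exceeding its maximum, are equally possible. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

FeaturePair = Tuple[float, float]  # (f(t_i), left difference quotient at t_i)


def build_feature_library(samples: Sequence[float], h: float) -> List[FeaturePair]:
    """S1.1-S1.3: turn equally spaced training samples into feature two-tuples.

    The first sample has no left neighbour, so pairs start at the second sample.
    """
    return [
        (samples[i], (samples[i] - samples[i - 1]) / h)
        for i in range(1, len(samples))
    ]


def is_abnormal(pair: FeaturePair, library: Sequence[FeaturePair]) -> bool:
    """S1.4 under the assumed component-wise reading of 'larger than all'.

    Requires a non-empty library.
    """
    max_value = max(value for value, _ in library)
    max_rate = max(rate for _, rate in library)
    return pair[0] > max_value and pair[1] > max_rate
```

With training samples `[10, 12, 11, 15]` and `h = 1.0`, the library holds `(12, 2)`, `(11, -1)` and `(15, 4)`, so a real-time pair such as `(30, 15)` is flagged while `(14, 3)` is not.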

Preferably, the mapping relationship between a change in system resource occupation and the occurrence of a system resource occupation abnormality in step S1) includes at least one of mapping relationships ① to ③: mapping relationship ① between a change in memory resource occupation and the occurrence of a memory resource occupation abnormality, mapping relationship ② between a change in CPU resource occupation and the occurrence of a CPU resource occupation abnormality, and mapping relationship ③ between a change in file size and the occurrence of a file size abnormality.

Preferably, the detailed steps in step S2) of judging, according to the abnormality determinations output by the danger signal calculation model, whether the comprehensive danger condition for cooperatively stimulating the immune response triggers the comprehensive danger signal include:

S2.1) performing a weighted summation of the danger signals output for the abnormal occupation of each system resource, where each system resource has a corresponding weight;

S2.2) judging whether the weighted summation result is greater than a preset threshold; if so, determining that the comprehensive danger condition for cooperatively stimulating the immune response triggers the comprehensive danger signal; otherwise, determining that it does not.
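Steps S2.1) and S2.2) amount to a weighted sum compared against a threshold. A minimal sketch follows; the signal names, weights and threshold in the usage test are illustrative values only, not values given in the source, and `danger_triggered` is a hypothetical name.

```python
from typing import Mapping


def danger_triggered(signals: Mapping[str, float], weights: Mapping[str, float],
                     threshold: float) -> bool:
    """S2.1: weighted sum of per-resource danger signals; S2.2: compare to threshold."""
    score = sum(weights[name] * value for name, value in signals.items())
    return score > threshold
```

Here each signal is assumed to be 1 when the corresponding resource occupation is abnormal and 0 otherwise, although graded values would work with the same function.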

In another aspect, the present invention further provides an immune model-based software security analysis system, including:

the antibody library program unit is used for sampling features of a normal system and software during running in advance to extract a self-feature library S, and generating an antibody library based on the self-feature library S according to the negative selection principle of artificial immunity;

and the software safety analysis unit is used for sampling the characteristics of the current system and software during running to extract an antigen library when software safety analysis is needed, matching and detecting the characteristic strings in the antigen library and the antibodies in the antibody library, and reporting a safety event if the matching conditions are met.

Preferably, the software security analysis unit further comprises a program module for cooperatively stimulating the immune response, including:

the cooperative immune response detection subprogram module, used for inputting the current change in system resource occupation into a pre-established danger signal calculation model when a security event is reported, to obtain a determination of whether the corresponding system resource occupation is abnormal; the danger signal calculation model comprises at least one mapping relationship between a change in system resource occupation and the occurrence of a system resource occupation abnormality;

and the cooperative immune response danger judgment subprogram module, used for judging whether all the system resource occupations output by the danger signal calculation model are abnormal, and if so, determining that harmful software exists in the system and that the system is in a dangerous state.

The software security analysis method based on the immune model has the following advantages: it addresses software security analysis by taking the biological immune system as a reference, using its characteristics as a theoretical basis, and combining them with the practical constraints of engineering and application; it maps the abstract artificial immune model onto concrete indicators of the software system, extracts the various feature events of the software system, defines the "self" features of the normal system, and detects and responds to security events in the system by identifying "non-self", thereby applying artificial immunity to software security detection based on system operation.

The software safety analysis system based on the immune model is a system completely corresponding to the steps of the software safety analysis method based on the immune model, so that the software safety analysis system based on the immune model also has the advantages of the software safety analysis method based on the immune model, and is not repeated herein.

Drawings

FIG. 1 is a schematic diagram of a basic flow of a method according to an embodiment of the present invention.

FIG. 2 is a schematic diagram illustrating the principle of detecting the security of software by using an antibody library according to an embodiment of the present invention.

FIG. 3 is a schematic diagram of a process for generating an antibody library according to an embodiment of the present invention.

FIG. 4 is a schematic flow chart of the cooperative stimulation of the immune response according to the embodiment of the present invention.

FIG. 5 is a schematic structural diagram of a system to which the method of the embodiment of the present invention is applied.

Detailed Description

As shown in fig. 1, the implementation steps of the software safety analysis method based on the immune model in this embodiment include:

The Negative Selection Algorithm is a bionic algorithm modeled on the negative selection principle of the biological immune system, and is one of the core algorithms of artificial immune systems. Drawing on the negative selection principle and on the immune system's mechanism for recognizing "self" and "non-self", Forrest et al. proposed the negative selection algorithm and discussed its specific application in engineering. The software security analysis method based on the immune model applies artificial immunity to software security detection based on system operation. It provides a software security analysis method based on a negative selection model that takes the negative selection algorithm as its theoretical basis, maps the abstract artificial immune model onto concrete indicators of a software system, extracts the various feature events of the software system, defines the "self" features of the normal system, and detects and responds to security events in the system by identifying "non-self".

In this embodiment, the features collected during feature sampling in step 1) and step 2) include a system state feature vector S, a network communication feature vector N, a software behavior feature vector A and a software configuration feature vector C. The system state feature vector S includes at least one of CPU usage and memory usage; the network communication feature vector N includes at least one of the inbound traffic, outbound traffic and access targets of the system network; the software behavior feature vector A includes at least one of the software's system resource usage, network resource usage, system call behavior and file access behavior features; the software configuration feature vector C includes at least one feature of the system's software configuration. These features are exemplary rather than exhaustive. Feature vectors can also be built from other features related to harmful software; the principle is the same as in this embodiment and is not repeated here.

In this embodiment, the detailed steps of performing feature sampling to extract the self-feature library S in step 1) include: encoding the features collected in each sampling run to obtain a vector for each feature, encoding these vectors to generate a self-feature string, and merging the self-feature strings obtained from multiple sampling runs to form the self-feature library S. Similarly, the detailed steps of extracting the antigen library in step 2) include: performing feature sampling and encoding on the currently running system and software to obtain a vector for each feature, and encoding these vectors to generate feature strings that serve as the antigen library. The encoding rule here is exactly the same as in step 1).

As shown in FIG. 3, the detailed steps of generating the antibody library based on the self-characteristics library S according to the negative selection principle of artificial immunity in step 1) include:

1.7) judging whether the antibody count $j$ of the antibody library equals the expected number of antibodies N; if so, generation of the antibody library is complete; otherwise, jumping to step 1.3).

As shown in FIG. 2, when the antibody library is used to check software security, the system features and software features observed while the software runs are first extracted to generate a software antigen library; the feature strings in the antigen library are then matched against the antibodies in the antibody library, and any string that meets the matching condition is reported as a security event.

In this embodiment, step 2) further includes a step of cooperatively stimulating an immune response, and the detailed steps include:

This embodiment combines the negative selection model with the danger theory model through the steps of cooperatively stimulating the immune response. The negative selection model strictly divides the system into "self" and "non-self", and detects and responds to the abnormal "non-self" state. Danger theory relaxes the overly sharp definition of abnormality in negative selection theory and responds only when system indicators show that danger has occurred, but defining the dangerous state of the system then becomes the difficulty. Each approach has its own strengths in detection, and each has limitations. Negative selection rests on a strict definition of the system's "self" and "non-self". However, a computer system changes dynamically, so its behavior patterns keep changing; operations such as adding users or installing new software affect the system state and make it highly unpredictable, and the feature space to be judged undergoes combinatorial explosion, which greatly degrades detection performance. Danger theory algorithms, in turn, do not define abnormality clearly, so anomaly identification suffers from a high false alarm rate and a high false negative rate. This embodiment therefore keeps the recognition mechanism of the negative selection algorithm and adds danger theory for cooperative recognition, giving a cooperative recognition model of self/non-self and danger theory in which an immune response is triggered only when "non-self" is recognized and danger is sensed at the same time.

As for the definition of "danger": if the virtual computing system does not change, no danger arises, so danger is sensed through changes in system parameters when the system changes beyond its normal range. The boundary of that range is uncertain, so a tool for describing uncertainty can be used to describe the changes in system parameters and thereby define the occurrence of danger. In an actual software system, the monitored system variables are often not continuously differentiable, and even when they are, the function curve is complex and inconvenient to describe in derivative form. In view of the uncertainty in the occurrence of danger signals, an immune model is used to describe the changes in system variables. The normal state of the system is described by the self-features collected while the system is known to be normal. The membership degree of the sampled value of a system variable at the monitored moment is calculated; if the sampled value is in the normal state, no danger signal is considered to be generated, otherwise a danger signal is considered to be generated. Combining the negative selection model with the danger theory model in this way can effectively improve the accuracy of harmful software detection.

Referring to FIG. 4, the mapping relationship between a change in system resource occupation and the occurrence of a system resource occupation abnormality in step S1) includes mapping relationships ① to ③: mapping relationship ① is between a change in memory resource occupation and the occurrence of a memory resource occupation abnormality (danger signal M), mapping relationship ② is between a change in CPU resource occupation and the occurrence of a CPU resource occupation abnormality (danger signal C), and mapping relationship ③ is between a change in file size and the occurrence of a file size abnormality (danger signal F).

In this embodiment, the step of establishing the mapping relationship between the change in system resource occupation and the occurrence of the abnormality in system resource occupation in step S1) includes:

S1.3) generating a normal system resource occupation change feature library M' based on the curve of system resource occupation over time, where M' consists of system resource occupation feature two-tuples, each expressed as $\{f(t_i),\ f'_{\mathrm{left}}(t_i)\}$; $f(t_i)$ denotes the system resource occupation at time $t_i$, and $f'_{\mathrm{left}}(t_i)$ denotes the left differential of the feature point, given by $f'_{\mathrm{left}}(t_i) = \dfrac{f(t_i) - f(t_{i-1})}{h}$, where $f(t_{i-1})$ denotes the system resource occupation at time $t_{i-1}$ and $h$ denotes the distance between $t_i$ and $t_{i-1}$;

Taking the change in memory occupancy as an example, the training stage first collects statistics on how memory occupancy changes while the system runs normally, in the following steps: 1. The system samples memory usage in the normal state; the sampling interval is $h$, the whole training interval is denoted $[t_1, t_n]$, and $t_i = t_{i-1} + h$. 2. After the training stage ends, the recorded discrete memory occupancy data are connected to generate a curve of system memory occupancy over time, $y = f(t)$. 3. A normal memory occupancy change feature library M' is generated from the memory occupancy curve. M' consists of memory occupancy feature two-tuples, each expressed as $\{f(t_i),\ f'_{\mathrm{left}}(t_i)\}$, where $f(t_i)$ denotes the memory occupancy of the system at time $t_i$ and $f'_{\mathrm{left}}(t_i)$ denotes the left differential of the feature point: $f'_{\mathrm{left}}(t_i) = \dfrac{f(t_i) - f(t_{i-1})}{h}$, where $h$ denotes the distance between $t_i$ and $t_{i-1}$. After the system enters the running stage, a real-time memory occupancy feature two-tuple is generated from the memory occupancy data monitored in real time and compared with the historical feature two-tuples in M'; if the real-time two-tuple is larger than all the historical two-tuples in M', the current memory occupancy is considered abnormal and the memory danger signal M is issued. The above describes danger signal identification using the change in memory occupancy as an example; the system may define and identify multiple danger signals on the same principle.

When multiple danger signals are identified in the system, they must be considered together to determine whether the system is currently in a dangerous state.

In this embodiment, the detailed steps in step S2) of judging, according to the abnormality determinations output by the danger signal calculation model, whether the comprehensive danger condition for cooperatively stimulating the immune response triggers the comprehensive danger signal include:

In this embodiment, for the danger signal M, the danger signal C, and the danger signal F, the following formula is adopted to perform weighted summation:

$S_d = \alpha_1 M + \alpha_2 C + \alpha_3 F$

In the above formula, S_d is the weighted summation result, α₁ to α₃ are the weights of danger signal M, danger signal C and danger signal F respectively, and M, C and F denote the values of danger signal M, danger signal C and danger signal F respectively. The comprehensive danger signal (the weighted summation result S_d) output on the basis of danger signals M, C and F is in an AND relation with the danger signal output by the detector: relying on the detector alone to detect the non-self state is too absolute and can cause a high false alarm rate, so the danger theory based on cooperative excitation of the immune response is used as an aid. When the comprehensive danger signal is triggered but the detector judges the system to be in the self state, the danger is considered to be relieved, the normal feature library corresponding to the danger signal is updated, and the feature binary group sampled this time is added to the normal feature library. Similarly, when the detector identifies the system state as non-self but the comprehensive danger signal is not triggered, the feature string identified this time is also treated as a normal feature and is added to the self-feature library F.

Danger signal M, danger signal C and danger signal F each derive from a single system variable. A danger signal generated by only one system variable is not sufficient to determine that danger has occurred; the danger signals of the various system variables must be combined for a joint judgment. The danger signals generated by different system variables contribute differently to the final determination of whether "danger" has occurred, so a weight is set for each danger signal according to the size of its contribution. This weight varies with the security level of the system and the abnormal events to be guarded against, and should therefore be set as a variable and adjusted according to the actual situation. Abnormal events usually manifest in the system as stealing the user's private information, modifying system configuration files, occupying system resources and the like, in an attempt to disrupt the normal operation of the system. When the software safety analysis system detects a dangerous anomaly, a safety response is triggered: the anomaly and the processes and files related to it are recorded, and the running state of the software is restored.
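The decision logic of this embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function names, the trigger threshold and the example weights are hypothetical, since the embodiment specifies only the weighted sum S_d = α₁M + α₂C + α₃F and its AND relation with the detector output.

```python
def combined_danger(m, c, f, weights):
    """Weighted sum S_d = a1*M + a2*C + a3*F of the three danger signals."""
    a1, a2, a3 = weights
    return a1 * m + a2 * c + a3 * f


def decide(detector_nonself, m, c, f, weights, threshold):
    """Combine the detector output with the comprehensive danger signal (AND).

    The threshold that "triggers" the comprehensive danger signal is an
    assumption; the embodiment does not state how S_d is thresholded.
    """
    danger = combined_danger(m, c, f, weights) >= threshold
    if detector_nonself and danger:
        return "report"                  # both agree: report a safety event
    if danger:
        return "update_normal_features"  # danger relieved: extend normal feature library
    if detector_nonself:
        return "update_self_features"    # treat the string as a normal feature
    return "normal"
```

For example, with the hypothetical weights (0.5, 0.3, 0.2) and threshold 0.5, a non-self detection accompanied by only the file danger signal (F = 1) yields S_d = 0.2, so no event is reported and the self-feature library is extended instead.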

The software security analysis method based on the immune model in this embodiment is specifically implemented by a computer program, and the software security analysis system based on the immune model implemented by the computer program includes:

In this embodiment, the software security analysis unit further includes a collaborative immune response-provoking program module, including:

The structure of the software safety analysis system based on the negative selection model, to which the software security analysis method based on the immune model of this embodiment is applied, is shown in FIG. 5. It mainly comprises four modules (self-feature extraction, software safety monitoring, dangerous event warning, and safety response) and three libraries (self-features, antigens, and antibodies). The self-feature extraction module works in the training stage to generate the self-feature library and the antibody library; the software safety monitoring module, the dangerous event warning module and the safety response module work in the operation stage. The system applies the artificial immune model to the security analysis of software: the system calls, files, legitimate software behaviors, normal system states and the like involved in the running of normal software in the system are defined as "self", and abnormal software states are defined as "non-self". Artificial immune algorithms such as negative selection and danger theory are adopted, detector antibodies are generated from the known self features, abnormal software states during testing are identified, and logs are recorded while a response is made. The system works in two stages, a training stage and an operation stage. In the training stage, the self-features of the normally running system and software are extracted, and an antibody library serving as the detection standard is generated from the self-features. Once the trained detection antibody library is obtained, the system can enter the operation stage, in which changes in the system features are obtained through the software safety monitoring mechanism in the system, antigens are generated, and detection is performed. If an abnormality is detected, an alarm is raised according to the type of danger determined, and the safety response is started.
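The training-stage generation of detector antibodies by negative selection can be sketched as follows. This is only an illustrative sketch: the binary string encoding, the r-contiguous matching rule, the random candidate generator and all names are assumptions, because this passage does not fix the string format or the predefined matching rule.

```python
import random


def matches(a, b, r):
    """Assumed r-contiguous rule: a and b agree on at least r consecutive positions."""
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False


def generate_antibodies(self_strings, count, length, r, seed=0, max_tries=100000):
    """Negative selection: keep random candidates that match no self string."""
    rng = random.Random(seed)
    antibodies = []
    for _ in range(max_tries):
        if len(antibodies) >= count:
            break
        candidate = "".join(rng.choice("01") for _ in range(length))
        # A candidate that matches any self string is discarded.
        if not any(matches(candidate, s, r) for s in self_strings):
            antibodies.append(candidate)
    return antibodies
```

Every string kept in the returned list fails to match all self strings, so a later match against it indicates a state outside the recorded "self".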

In the software security analysis method based on the immune model, the self-feature extraction module must first identify the system features of the normally running system, namely the self-features; abnormal system states can then be judged on that basis. Therefore, in the training stage, features are extracted from the state of the normally running system and from the specific behaviors of normal software in the system, such as system calls and file accesses, mainly in terms of system state features, network communication features, software behavior features and software configuration features; the software features are then encoded to form the self-feature library of the software.

The software safety monitoring module works in the operation stage and monitors the state of the system and the software after new software is added during system operation. The monitored content covers system state, network communication, software behavior and software configuration, and information collection reuses the information acquisition mechanism implemented in the self-feature extraction module. The collected data are encoded by the same method as in the self-feature extraction module to generate monitoring antigen strings, and monitoring the running of the software over multiple time periods finally yields the antigen library of the software.

The analysis and alarm module mainly uses the negative selection algorithm to match the antigen strings in the software antigen library one by one against the antibody library. If a match succeeds, the antigen is regarded as a problem antigen and requires further security analysis: the type of security event is analyzed according to the information features of the software antigen, and once a dangerous event such as a denial-of-service attack, a buffer overflow attack or an information leakage attack is found, alarm information is sent to the safety response module, requesting the system to carry out the corresponding processing.
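The one-by-one matching of antigen strings against the antibody library can be sketched as follows. The sketch leaves the matching rule pluggable, since the text refers only to a predefined rule; the Hamming-distance rule and all names below are hypothetical stand-ins, not the rule used by the invention.

```python
def hamming_rule(max_distance):
    """Build an assumed rule: match when two equal-length strings differ
    in at most max_distance positions."""
    def rule(antigen, antibody):
        if len(antigen) != len(antibody):
            return False
        return sum(x != y for x, y in zip(antigen, antibody)) <= max_distance
    return rule


def detect(antigen_library, antibody_library, rule):
    """Return (antigen, antibody) pairs for every antigen that matches an antibody."""
    problem_antigens = []
    for antigen in antigen_library:
        for antibody in antibody_library:
            if rule(antigen, antibody):
                problem_antigens.append((antigen, antibody))
                break  # one matching antibody is enough to flag the antigen
    return problem_antigens
```

Each returned pair corresponds to a problem antigen that would be passed on for further security analysis and, if a dangerous event is confirmed, to the safety response module.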

The safety response module forms a log record for the safety event, judges the threat degree of the dangerous event according to the alarm information, and performs recovery actions such as software repair or program restart according to different threat degrees.

The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.

Claims

1. A software security analysis method based on immune model is characterized by comprising the following implementation steps:

2) carrying out characteristic sampling to extract an antigen library when the current system and software run, carrying out matching detection on a characteristic string in the antigen library and an antibody in an antibody library, and reporting a safety event if the characteristic string meets a matching condition;

the step 2) also comprises a step of cooperatively stimulating immune response, and the detailed steps comprise:

S1) when a security event is reported, inputting the current system resource occupation change into a pre-established danger signal calculation model to obtain a judgment result of whether the corresponding system resource occupation is abnormal, and outputting a corresponding danger signal if the system resource occupation is judged to be abnormal, wherein the danger signal calculation model comprises a mapping relation between at least one system resource occupation change and abnormal system resource occupation, and the mapping relation between system resource occupation change and abnormal system resource occupation comprises at least one of mapping relations ① to ③: mapping relation ①, a mapping relation between memory resource occupation change and abnormal memory resource occupation; mapping relation ②, a mapping relation between CPU resource occupation change and abnormal CPU resource occupation; mapping relation ③, a mapping relation between file size change and abnormal file size;

S2) judging, according to the judgment result of abnormal system resource occupation output by the danger signal calculation model, whether the comprehensive danger condition of the cooperatively excited immune response triggers the comprehensive danger signal; if the comprehensive danger signal is triggered, judging that harmful software exists in the system and the system is in a dangerous state, and reporting a security event;

the step of establishing the mapping relationship between the change of the system resource occupation and the abnormal system resource occupation in the step S1) includes:

S1.3) generating a normal system resource occupation change characteristic library M' based on the change curve of the system resource occupation over time, wherein the normal system resource occupation change characteristic library M' is composed of system resource occupation characteristic binary groups, one system resource occupation characteristic binary group being expressed as {f(t_i), f'(t_i)_left}, where f(t_i) denotes the system resource occupation at time t_i and f'(t_i)_left denotes the left differential at the feature point, expressed as f'(t_i)_left = (f(t_i) - f(t_{i-1})) / h, where f(t_{i-1}) denotes the system resource occupation at time t_{i-1} and h denotes the distance between t_i and t_{i-1};
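As an illustration of the characteristic binary groups defined in step S1.3), the following minimal sketch computes {f(t_i), f'(t_i)_left} from sampled occupation data. The function name and the representation of the samples as (t, f(t)) pairs sorted by time are assumptions made for the example.

```python
def occupancy_features(samples):
    """Build (f(t_i), left differential) tuples from (t, f(t)) samples sorted by t."""
    features = []
    for (t_prev, f_prev), (t_cur, f_cur) in zip(samples, samples[1:]):
        h = t_cur - t_prev  # distance between t_{i-1} and t_i
        features.append((f_cur, (f_cur - f_prev) / h))
    return features
```

The first sample has no predecessor, so it contributes no tuple; a series of n samples therefore yields n - 1 characteristic binary groups.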

2. The immune model-based software security analysis method of claim 1, wherein the features collected during feature sampling in step 1) and step 2) comprise a system state feature vector S, a network communication feature vector N, a software behavior feature vector A and a software configuration feature vector C; the system state feature vector S comprises at least one of the CPU usage and memory usage features; the network communication feature vector N comprises at least one of the system network inflow traffic, outflow traffic and access target features; the software behavior feature vector A comprises at least one of the system resource usage features, network resource usage features, system call behavior features and file access behavior features of the software; and the software configuration feature vector C comprises at least one feature relating to the software configuration of the system.

3. The immune model-based software security analysis method of claim 1, wherein the detailed step of performing feature sampling to extract the self-feature library S in step 1) comprises: separately encoding the features acquired in each feature sampling to obtain the vector corresponding to each feature, encoding the vectors corresponding to the features to generate self-feature strings, and combining the self-feature strings obtained from multiple feature samplings to form the self-feature library S; and the detailed step of extracting the antigen library in step 2) comprises: performing feature sampling and encoding on the current system and software during running to obtain the vectors corresponding to the features, and encoding the vectors corresponding to the features to generate self-feature strings that serve as the antigen library.

4. The immune model-based software security analysis method of claim 1, wherein the detailed steps of generating the antibody library based on the self-characteristics library S according to the negative selection principle of artificial immunity in step 1) comprise:

1.5) judging whether the character string a and the i-th feature character string f_i satisfy a predefined string matching rule; if so, jumping to step 1.3); otherwise, updating the value of the loop traversal count i;

1.6) judging whether the value of the loop traversal count i is greater than the total number of self character strings in the selection string set F; if the loop traversal count i is greater than the total number of self character strings in the selection string set F, adding the character string a to the antibody library as an antibody, incrementing the antibody count j of the antibody library by 1, and jumping to step 1.7); otherwise, jumping to step 1.4);

5. The immune model-based software security analysis method according to claim 1, wherein the step, in step S2), of judging, according to the judgment result of abnormal system resource occupation output by the danger signal calculation model, whether the comprehensive danger condition of the cooperatively excited immune response triggers the comprehensive danger signal comprises:

6. An immune model-based software security analysis system, comprising:

the software safety analysis unit is used for sampling the characteristics of the current system and software during running to extract an antigen library when software safety analysis is needed, matching and detecting the characteristic strings in the antigen library and the antibodies in the antibody library, and reporting a safety event if the matching conditions are met;

the software safety analysis unit also comprises a collaborative excitation immune response program module, which comprises:

a cooperative excitation immune response detection subprogram module, used for inputting, when a security event is reported, the current system resource occupation change into a pre-established danger signal calculation model to obtain a judgment result of whether the corresponding system resource occupation is abnormal, wherein the danger signal calculation model comprises a mapping relation between at least one system resource occupation change and abnormal system resource occupation, and the mapping relation between system resource occupation change and abnormal system resource occupation comprises at least one of mapping relations ① to ③: mapping relation ①, a mapping relation between memory resource occupation change and abnormal memory resource occupation; mapping relation ②, a mapping relation between CPU resource occupation change and abnormal CPU resource occupation; mapping relation ③, a mapping relation between file size change and abnormal file size;

the cooperative excitation immune response danger judgment subprogram module is used for judging whether all system resource occupation output by the danger signal calculation model is abnormal, and if all system resource occupation is abnormal, judging that harmful software exists in the system and the system is in a dangerous state;

the establishment of the mapping relation between the change of the system resource occupation and the abnormal system resource occupation in the cooperative excitation immune response detection subprogram module comprises the following steps:

S1.2) after the training phase is finished, connecting the recorded system resource occupation data to generate a change curve of the system resource occupation over time, y = f(t);

S1.3) generating a normal system resource occupation change characteristic library M' based on the change curve of the system resource occupation over time, wherein the normal system resource occupation change characteristic library M' consists of system resource occupation characteristic binary groups, one system resource occupation characteristic binary group being expressed as {f(t_i), f'(t_i)_left}, where f(t_i) denotes the system resource occupation at time t_i and f'(t_i)_left denotes the left differential at the feature point, expressed as f'(t_i)_left = (f(t_i) - f(t_{i-1})) / h, where f(t_{i-1}) denotes the system resource occupation at time t_{i-1} and h denotes the distance between t_i and t_{i-1};