CN110135167A - Random-forest-based security level assessment method for edge computing terminals - Google Patents

Random-forest-based security level assessment method for edge computing terminals

Publication number
CN110135167A
Authority
CN
China
Prior art keywords
test
security level
sample
random forest
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910399303.8A
Other languages
Chinese (zh)
Other versions
CN110135167B (en)
Inventor
雷文鑫
文红
侯文静
刘文洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Research Institute of Southern Power Grid Co Ltd
Original Assignee
University of Electronic Science and Technology of China
Research Institute of Southern Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China, Research Institute of Southern Power Grid Co Ltd
Priority to CN201910399303.8A
Publication of CN110135167A
Application granted
Publication of CN110135167B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a random-forest-based security level assessment method for edge-computing-side terminals, comprising the following steps: S1. set the security test items for a terminal and the possible results of each test item; S2. test the intelligent terminals that are connected; S3. determine the correspondence between an intelligent terminal's security level and its set of individual test results; S4. compute the security level of each edge terminal to obtain a data set; S5. divide the data set into a training set and a test set; S6. feed the training set into a random forest for training to obtain a trained classifier model; S7. feed the test set into the trained random forest classifier model and compare the predicted results with the security levels from step S4 to obtain a qualified classifier; S8. use the qualified classifier model to assess the security level of newly connected terminals. The invention grades the data security requirements of edge terminals and, according to the security risks faced and the system complexity, assesses the security of edge-computing-side terminals against quantified, objective criteria.

Description

Random-forest-based security level assessment method for edge computing terminals
Technical field
The present invention relates to security level assessment methods for edge computing terminals, and in particular to a random-forest-based security level assessment method for edge computing terminals.
Background
With the rapid development and wide application of the Internet of Everything, intelligent terminals are becoming its key nodes and generate massive amounts of real-time data. According to IDC statistics, more than 50 billion terminals and devices will be connected to networks by 2020, and more than 50% of the data will need to be analyzed, processed and stored at the network edge. The massive data produced by large numbers of edge devices calls for faster connections, more efficient data processing and, at the same time, better data protection. With large numbers of heterogeneous terminals joining the Internet of Things, the edge computing side also faces greater data security threats and hidden risks, including untrusted terminals and illegal access by mobile edge application developers. It is therefore necessary to grade the data security requirements of edge computing terminals and to establish new secure access mechanisms among terminals, edge nodes and edge computing services, so as to guarantee data confidentiality, data integrity and the privacy of user information. Against this background, the security performance of edge computing terminals is evaluated as follows: individual security tests are first run on the terminal at the edge computing side; the terminal's security level is then computed systematically from the result of each test item, so that requirements of different security levels can be handled securely and intelligent terminals remain safe and effective.
Supported by the computing resources available at the edge, more complex computation can be used to assess terminal security performance, so that terminal security levels can be divided objectively, effectively and accurately. This patent proposes grading terminal and data security requirements and, according to the security risks faced, the system complexity and so on, evaluating the security level of edge-computing-side terminals against quantified, objective criteria.
Random forest is a machine learning algorithm proposed by Leo Breiman in 2001 and is mainly applied to regression and classification. Its basic idea is to build multiple decision trees using bootstrap resampling and random node splitting: k samples are repeatedly drawn at random with replacement from the original training sample set N to generate new training sample sets, k classification trees are then generated from these bootstrap sample sets to form the random forest, and the classification of new data is obtained by voting among the classification trees.
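The two ingredients named in the paragraph above, bootstrap resampling and voting, can be illustrated with a minimal Python sketch. The function names `bootstrap_sample` and `majority_vote` are illustrative and do not come from the patent; tree construction itself is left out.

```python
import random
from collections import Counter


def bootstrap_sample(samples, rng):
    """Draw len(samples) items with replacement: one bootstrap replicate."""
    return [rng.choice(samples) for _ in samples]


def majority_vote(predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)            # fixed seed so the sketch is repeatable
original = list(range(10))        # stand-in for the original training set
replicate = bootstrap_sample(original, rng)

# A replicate has the same size as the original but may repeat items.
print(len(replicate))
# Five hypothetical trees vote on the class of one new sample.
print(majority_vote([2, 3, 2, 1, 2]))
```

Each tree would be trained on its own replicate, and the forest's answer for a new sample is the vote over all trees.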
With the support of edge computing capability, grading the data security requirements of intelligent terminals with the random forest algorithm is of great significance for maximizing the security performance of edge computing systems.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art by providing a random-forest-based security level assessment method for edge computing terminals. Test results are obtained by testing each individual security item of an intelligent terminal, and the random forest algorithm is then used to classify the terminal's security level, improving the accuracy of the security level classification.
The object of the present invention is achieved through the following technical solution: a random-forest-based security level assessment method for edge-computing-side terminals, comprising the following steps:
S1. Build a security test platform on the edge computing side and set k test items for a terminal. The result of each test item is 0 or 1, where 0 means fail and 1 means pass;
S2. On the edge-side security test platform, test m+n intelligent terminals against the k test items to obtain the set of individual security test results of each terminal, where the result set of the i-th intelligent terminal is:
$X_i = [x_{i1}, x_{i2}, \ldots, x_{ik}]$, $i = 1, 2, \ldots, m+n$;
where $x_{ij}$ is the score of the i-th intelligent terminal on the j-th test item, $j = 1, 2, \ldots, k$. The individual test results of all intelligent terminals are represented by the $(m+n) \times k$ matrix X:
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1k} \\ x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & & \vdots \\ x_{(m+n)1} & x_{(m+n)2} & \cdots & x_{(m+n)k} \end{bmatrix}$$
S3. Determine the correspondence between an intelligent terminal's security level and its set of individual test results;
S4. Using the correspondence from step S3, compute the security level $y_i$ corresponding to each $X_i = [x_{i1}, x_{i2}, \ldots, x_{ik}]$, which yields the data set $D = \{(X_1, y_1), (X_2, y_2), \ldots, (X_{m+n}, y_{m+n})\}$;
S5. Divide the data set D, taking its first m entries as the training set T and its last n entries as the test set S:
Training set $T = \{(X_1, y_1), (X_2, y_2), \ldots, (X_m, y_m)\}$, which makes up $\frac{m}{m+n}$ of the data set;
Test set $S = \{(X_{m+1}, y_{m+1}), (X_{m+2}, y_{m+2}), \ldots, (X_{m+n}, y_{m+n})\}$, which makes up $\frac{n}{m+n}$ of the data set;
Preferably, the sizes of the training set T and the test set S are adjustable: the larger the data set and the more training data there is, the better the training effect and the more accurate the classification of the test set;
S6. Use the training set $T = \{(X_1, y_1), (X_2, y_2), \ldots, (X_m, y_m)\}$ as the sample set and feed it into the random forest classifier model for training to obtain a trained classifier model;
S7. After training is complete, feed the test set $S = \{(X_{m+1}, y_{m+1}), (X_{m+2}, y_{m+2}), \ldots, (X_{m+n}, y_{m+n})\}$ into the trained random forest classifier model and compare the predicted results with the security levels from step S4 to obtain a qualified classifier;
S8. Connect a newly added edge-computing-side intelligent terminal under test to the security test platform to obtain its test results, and feed them into the qualified classifier model for assessment to obtain the corresponding security level.
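Steps S4 and S5 above amount to labelling each result vector and then cutting the data set at position m. A minimal Python sketch follows; `build_dataset`, `split_dataset` and the labelling rule `toy_level` are hypothetical names introduced only for illustration, and the toy rule is not the patent's step S3 mapping.

```python
def build_dataset(results, level_of):
    """S4: pair every result vector X_i with its level y_i = level_of(X_i)."""
    return [(x, level_of(x)) for x in results]


def split_dataset(dataset, m):
    """S5: the first m pairs form training set T, the remaining n form test set S."""
    return dataset[:m], dataset[m:]


# Toy data: 5 terminals, k = 4 binary test items each.
results = [
    [1, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
]


def toy_level(x):
    """Hypothetical two-level rule used only to make the example runnable."""
    return 1 if sum(x) >= 3 else 0


D = build_dataset(results, toy_level)
T, S = split_dataset(D, m=3)
print(len(T), len(S))
```

Because the split is positional, any ordering in the collected data carries over into T and S; shuffling D beforehand is a design choice the text does not address.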
Further, step S3 comprises the following sub-steps:
S31. Divide the security level of intelligent terminals into y classes;
S32. Let the total test score of the i-th intelligent terminal be $sum_i = \sum_{j=1}^{k} x_{ij}$, $0 \le sum_i \le k$;
S33. Partition the score range into equal intervals, one per security level: when $sum_i$ falls in the lowest interval the security level of the i-th intelligent terminal is 0, in the next interval it is 1, in the next it is 2, and so on, so that the (t+1)-th interval corresponds to security level t, $t = 1, 2, \ldots, y-1$. A larger $sum_i$ indicates better security performance of the intelligent terminal.
Further, step S6 comprises the following sub-steps:
S61. Select the random forest algorithm to build the classifier model. Random forest is a bagging-type ensemble: it combines multiple weak classifiers and obtains the final result by voting or averaging, so that the overall model achieves high accuracy and good generalization;
S62. Divide the training set $T = \{(X_1, y_1), (X_2, y_2), \ldots, (X_m, y_m)\}$ into a minority-class sample set $T_{min}$ and a majority-class sample set $T_{max}$, where $T_{min} \cup T_{max} = T$ and $T_{min} \cap T_{max} = \varnothing$;
S63. Randomly draw two thirds of the sample points from the original sample set to obtain a training set $T'$, and observe its minority-class data set $T'_{min}$ and majority-class data set $T'_{max}$;
S64. Compute the value of [expression not reproduced in the source text] and specify the two acceptance conditions [not reproduced in the source text];
S65. If the training set $T'$ satisfies the conditions in S64, keep the drawn training set; if it does not, discard it;
S66. Repeat steps S63 to S65 until $N_{tree}$ training sets that satisfy the conditions have been obtained, where $N_{tree}$ is the number of decision trees to be built. The resulting $N_{tree}$ training sets are $T_i$, $i = 1, 2, \ldots, N_{tree}$;
S67. For $i = 1, 2, \ldots, N_{tree}$, use training set $T_i$ to train one CART decision tree $H_i$, choosing the optimal feature according to the Gini index.
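The draw, check and keep loop of steps S63 to S66 can be sketched in Python as below. Two things are assumptions rather than the patent's method: the draw is taken without replacement, and the acceptance test is passed in as a callable `accept`, because the actual conditions of S64 are not available in this text. The example condition, that both classes appear in the subset, is purely hypothetical.

```python
import random


def draw_training_sets(train, n_tree, accept, rng, max_tries=10_000):
    """Repeat S63-S65 until n_tree accepted subsets exist (S66).

    train  : list of (X, y) pairs
    accept : hypothetical predicate standing in for the S64 conditions
    """
    subset_size = (2 * len(train)) // 3      # two thirds of the samples
    kept = []
    for _ in range(max_tries):
        if len(kept) == n_tree:
            break
        candidate = rng.sample(train, subset_size)  # assumed: no replacement
        if accept(candidate):
            kept.append(candidate)                  # S65: keep
        # otherwise the candidate is simply discarded
    if len(kept) < n_tree:
        raise RuntimeError("acceptance condition was met too rarely")
    return kept


def has_both_classes(candidate):
    """Hypothetical stand-in condition: every class label appears at least once."""
    return len({y for _, y in candidate}) == 2


train = [([0, 1], 0)] * 6 + [([1, 1], 1)] * 3
sets = draw_training_sets(train, n_tree=5, accept=has_both_classes,
                          rng=random.Random(0))
print(len(sets), len(sets[0]))
```

The `max_tries` guard is an addition for safety: a condition that is rarely satisfiable would otherwise loop forever.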
Step S62 comprises the following sub-steps:
S621. Count the number of samples of each security level in the training set $T = \{(X_1, y_1), (X_2, y_2), \ldots, (X_m, y_m)\}$;
S622. For each security level, if its number of samples is greater than a preset threshold H, add all samples of that security level to the majority-class sample set $T_{max}$; if its number of samples is less than or equal to the preset threshold H, add all samples of that security level to the minority-class sample set $T_{min}$.
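Sub-steps S621 and S622 translate directly into a count followed by a threshold test. The Python sketch below assumes samples are stored as `(X, y)` pairs; the function name `split_by_class_size` is illustrative.

```python
from collections import Counter


def split_by_class_size(train, threshold):
    """Return (t_min, t_max) for a list of (X, y) pairs.

    S621: count the samples of each security level.
    S622: levels with more than `threshold` samples go to the majority set,
          all other levels go to the minority set.
    """
    counts = Counter(y for _, y in train)
    t_min = [(x, y) for x, y in train if counts[y] <= threshold]
    t_max = [(x, y) for x, y in train if counts[y] > threshold]
    return t_min, t_max


# Level 0 has 3 samples, level 1 has 1, level 2 has 2; threshold H = 2.
train = [("a", 0), ("b", 0), ("c", 0), ("d", 1), ("e", 2), ("f", 2)]
t_min, t_max = split_by_class_size(train, threshold=2)
print(len(t_min), len(t_max))
```

Every sample lands in exactly one of the two sets, so together they cover the training set without overlap.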
Step S67 comprises the following sub-steps:
S671. For training set $T_i$, compute the Gini index $Gini(T_i) = 1 - \sum_{k} P_k^2$, where $P_k$ is the frequency with which the k-th class occurs in the classification results. A smaller Gini index means that a sample selected from the set is less likely to be misclassified, i.e. the set is purer; conversely, a larger value means the set is less pure;
S672. For a training set $T_i$ containing N samples, split $T_i$ into two parts $T_{i1}$ and $T_{i2}$ according to the i-th value of attribute A and compute $Gain\_GINI_{A,i}(T_i) = \frac{n_1}{N} Gini(T_{i1}) + \frac{n_2}{N} Gini(T_{i2})$, where $n_1$ and $n_2$ are the numbers of samples in $T_{i1}$ and $T_{i2}$;
S673. For attribute A, compute the Gain_GINI obtained when the data set is split into two parts at each attribute value and take the minimum as the optimal binary split for attribute A: $\min_{i} Gain\_GINI_{A,i}(T_i)$;
S674. For sample set $T_i$, compute the optimal binary split of every attribute and take the minimum among them as the optimal binary split of $T_i$: $\min_{A} \min_{i} Gain\_GINI_{A,i}(T_i)$.
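The split selection of S671 to S674 can be sketched in Python using the standard CART definitions of the Gini index and the size-weighted Gini of a binary split. Since every test item here is 0 or 1, each attribute has exactly one binary split, so the search over attribute values in S673 collapses to a single evaluation. The function names are illustrative.

```python
from collections import Counter


def gini(labels):
    """S671: Gini index 1 - sum_k P_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gini(samples, feature):
    """S672: size-weighted Gini after splitting on one binary feature."""
    left = [y for x, y in samples if x[feature] == 0]
    right = [y for x, y in samples if x[feature] == 1]
    n = len(samples)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_feature(samples, features):
    """S674: the feature whose split yields the smallest weighted Gini."""
    return min(features, key=lambda f: split_gini(samples, f))


# Feature 0 separates the two classes perfectly, feature 1 not at all.
samples = [([0, 1], 0), ([0, 0], 0), ([1, 1], 1), ([1, 0], 1)]
print(gini([y for _, y in samples]))
print(split_gini(samples, 0), split_gini(samples, 1))
print(best_feature(samples, [0, 1]))
```

A full CART tree would apply `best_feature` recursively to each resulting subset until the nodes are pure or no features remain.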
Further, step S7 comprises the following sub-steps:
S71. The test set $S = \{(X_{m+1}, y_{m+1}), (X_{m+2}, y_{m+2}), \ldots, (X_{m+n}, y_{m+n})\}$ serves as the samples to be tested;
S72. For $i = 1, 2, \ldots, N_{tree}$, the initial voting weight of each decision tree is 1; let $R_i = T'_{i\,max} / T'_{i\,min}$. The voting weight of each decision tree is then updated to [formula not reproduced in the source text];
S73. For $j = m+1, m+2, \ldots, m+n$ and $i = 1, 2, \ldots, N_{tree}$, input the sample under test $X_j$; the decision tree $H_i$ from S67 outputs $H_i(X_j)$, and the final predicted class is [formula not reproduced in the source text], which is taken as the security level corresponding to test sample $X_j$;
S74. Set a threshold $\theta$ for judging the classifier's error, $0 \le \theta \le 1$.
If [condition not reproduced in the source text] holds for $m+1 \le j \le m+n$, the classifier meets the preset threshold and is a qualified classifier; otherwise, return to step S5 and retrain, where [definition not reproduced in the source text].
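The qualification check of S74 can be sketched in Python under one explicit assumption: that the criterion is "share of test samples whose predicted level equals the S4 level is at least θ". The actual condition is not available in this text, so `accuracy` and `is_qualified` below are hypothetical stand-ins, not the patent's formula.

```python
def accuracy(predicted, actual):
    """Fraction of test samples whose predicted level matches the S4 level."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


def is_qualified(predicted, actual, theta):
    """Assumed S74 criterion: qualified when accuracy reaches theta."""
    return accuracy(predicted, actual) >= theta


predicted = [0, 1, 2, 3]
actual = [0, 1, 2, 2]
print(accuracy(predicted, actual))
print(is_qualified(predicted, actual, theta=0.98))
```

When the check fails, the method returns to step S5 and retrains, so in practice this function would sit inside a retraining loop.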
Further, step S8 comprises the following sub-steps:
S81. Connect the newly added edge-computing-side intelligent terminal under test to the security test platform to obtain the results of the k test items, $X = [x_1, x_2, \ldots, x_k]$;
S82. Feed the test results into the qualified classifier model to obtain $F(X)$ = [formula not reproduced in the source text], $i = 1, 2, \ldots, N_{tree}$. $F(X)$ is the corresponding security level.
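Steps S73 and S82 combine the outputs of the individual trees using per-tree voting weights. A generic weighted vote can be sketched in Python as below. The weights are taken as given inputs, since the patent's weight-update formula is not available in this text, and the tie-breaking rule (smallest label wins) is an arbitrary choice made for the sketch.

```python
from collections import defaultdict


def weighted_vote(tree_outputs, weights):
    """Return the label with the largest total weight.

    tree_outputs : label predicted by each tree, H_i(X)
    weights      : voting weight of each tree (hypothetical inputs here)
    Ties are broken in favour of the smallest label.
    """
    if len(tree_outputs) != len(weights):
        raise ValueError("one weight per tree is required")
    totals = defaultdict(float)
    for label, weight in zip(tree_outputs, weights):
        totals[label] += weight
    return max(sorted(totals), key=lambda label: totals[label])


# With equal weights this reduces to plain majority voting.
print(weighted_vote([0, 1, 1], [1.0, 1.0, 1.0]))
# A heavily weighted tree can overrule the majority.
print(weighted_vote([0, 1, 1], [3.0, 1.0, 1.0]))
```

With all weights left at their initial value of 1, the result is identical to the unweighted vote of a standard random forest.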
The beneficial effects of the present invention are: (1) based on tests of each individual security item of an edge computing intelligent terminal, the invention uses the random forest classification algorithm to divide intelligent terminal security levels objectively and accurately, maximizing the security performance of the edge computing system; (2) the invention builds the classification model with the random forest algorithm, whose built-in randomness makes it resistant to over-fitting and robust to noise; training is fast, multi-level classification results can be obtained, and a fairly accurate quantified objective standard results; (3) the invention runs security tests on different edge intelligent terminal devices and uses each terminal's test result data set as feedback to train the classifier and divide the security levels, improving the credibility of the security level classification results.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is a flow chart of the random-forest-based security level assessment method for edge computing terminals in the embodiment.
Detailed description of the embodiments
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings, but the scope of protection of the present invention is not limited to what is described below.
As shown in Fig. 1, the random-forest-based security level assessment method for edge-computing-side terminals comprises steps S1 to S8, with sub-steps S31 to S33, S61 to S67 (including S621 to S622 and S671 to S674), S71 to S74 and S81 to S82, exactly as set out in the Summary of the invention above. In the embodiments of this application, the sizes of the training set T and the test set S are adjustable.
As shown in Fig. 2, in the embodiment of this application, the process of using the trained random forest to obtain the security level of an edge terminal under test is as follows:
1. First connect 10 edge intelligent terminals to the security test platform on the edge computing side. The number of terminal test items is set to 22, so the individual test results of each edge intelligent terminal are $X_i = [x_{i1}, x_{i2}, \ldots, x_{i22}]$, $i = 1, 2, \ldots, 10$. The individual test results of all edge intelligent terminals form a $10 \times 22$ matrix X, where $x_{ij} = 0$ or $x_{ij} = 1$.
2. Determine the correspondence between edge terminal security level and the set of individual test results.
1) This assessment divides the security level of edge intelligent terminals into four classes: 0, 1, 2 and 3;
2) Let the total test score of the i-th intelligent terminal be $sum_i = \sum_{j=1}^{22} x_{ij}$, $0 \le sum_i \le 22$;
3) Determine the security level from the value of sum: the security level is 0 when $0 \le sum \le 5$, 1 when $6 \le sum \le 10$, 2 when $11 \le sum \le 15$, and 3 when $16 \le sum \le 22$. A higher security level indicates better security performance of the terminal. The correspondence is shown in the following table:
Total score sum       0~5          6~10     11~15      16~22
Security level y_i    0            1        2          3
Degree of safety      Very poor    Poor     Average    Safe
3. Compute the security level $y_i$ of each $X_i = [x_{i1}, x_{i2}, \ldots, x_{i22}]$, which yields the data set:
$D = \{(X_1, y_1), (X_2, y_2), \ldots, (X_{10}, y_{10})\}$.
4. Because the data set is not large enough, expand data set D proportionally using a Monte Carlo algorithm.
5. Divide data set D into the training set $T = \{(X_1, y_1), (X_2, y_2), \ldots, (X_m, y_m)\}$ and the test set $S = \{(X_{m+1}, y_{m+1}), (X_{m+2}, y_{m+2}), \ldots, (X_{m+n}, y_{m+n})\}$; the test set serves as the samples to be tested.
6. Randomly draw two thirds of the sample points from the original sample set to obtain a training set $T'$. Observe the minority-class data set $T'_{min}$ and majority-class data set $T'_{max}$ of $T'$.
7. Compute the value of [expression not reproduced in the source text]. If the training set $T'$ satisfies the two conditions [not reproduced in the source text], it is kept. Step 6 is repeated $N_{tree}$ times, where $N_{tree}$ is the number of decision trees to be built. This yields the randomly sampled training sets $T_i$, $i = 1, 2, \ldots, N_{tree}$.
8. For $i = 1, 2, \ldots, N_{tree}$, use training set $T_i$ to grow an unpruned tree $H_i$: randomly select M of the 22 features and, at each node, choose the optimal feature among these M according to the Gini index, splitting until the tree is fully grown.
9. For $i = 1, 2, \ldots, N_{tree}$, the initial voting weight of each decision tree is 1; let $R_i = T'_{i\,max} / T'_{i\,min}$, and update the voting weight of each decision tree to [formula not reproduced in the source text].
10. For $j = m+1, m+2, \ldots, m+n$ and $i = 1, 2, \ldots, N_{tree}$, input the sample under test $X_j$; decision tree $H_i$ outputs $H_i(X_j)$, and the predicted class of the test sample is [formula not reproduced in the source text], which is taken as the security level corresponding to the test sample.
11. Set the threshold for judging the classifier's error to $\theta = 0.98$. If [condition not reproduced in the source text] holds for $m+1 \le j \le m+n$, the classifier meets the preset threshold and is a qualified classifier.
12. Connect the newly added edge-computing-side intelligent terminal under test to the security test platform to obtain the results of the 22 test items, $X = [x_1, x_2, \ldots, x_{22}]$.
13. Feed the test results $X = [x_1, x_2, \ldots, x_{22}]$ into the qualified classifier model to obtain $F(X)$ = [formula not reproduced in the source text], $i = 1, 2, \ldots, N_{tree}$. $F(X)$ is the security level of the edge-computing-side intelligent terminal under test.
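The score-to-level table in step 2 of the embodiment can be expressed as a small Python lookup. The thresholds 5, 10 and 15 come from the table; the function name `security_level` and the input validation are additions for illustration.

```python
from bisect import bisect_left

# Inclusive upper bounds of levels 0, 1 and 2; anything above 15 is level 3.
UPPER_BOUNDS = [5, 10, 15]
NUM_ITEMS = 22


def security_level(results):
    """Map 22 pass/fail results (1/0) to a security level 0..3."""
    if len(results) != NUM_ITEMS or any(r not in (0, 1) for r in results):
        raise ValueError("expected 22 results, each 0 or 1")
    total = sum(results)                      # total score sum_i
    return bisect_left(UPPER_BOUNDS, total)   # index of the first bound >= total


def terminal_with(passes):
    """Helper for the demo: a terminal that passes exactly `passes` items."""
    return [1] * passes + [0] * (NUM_ITEMS - passes)


for passes in (0, 5, 6, 10, 11, 15, 16, 22):
    print(passes, security_level(terminal_with(passes)))
```

Applying `security_level` to every row of the result matrix produces the labels $y_i$ of step 3, which is how the data set D would be assembled before training.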
In the embodiments of this application, besides building the classification model with the random forest machine learning algorithm, step S6 may instead use the k-nearest-neighbour algorithm, the naive Bayes algorithm, the SVM algorithm or a decision tree algorithm, or may build a corresponding neural network with a convolutional neural network, feedforward neural network or radial basis function neural network algorithm and train it on the training set to obtain a corresponding trained model.
In summary, the present invention proposes a random-forest-based security level assessment method for edge computing terminals built on a machine learning classification model. Based on tests of each individual security item of an edge computing intelligent terminal, it uses the random forest classification algorithm to divide intelligent terminal security levels objectively and accurately, maximizing the security performance of the edge computing system. The classification model is built with the random forest algorithm, whose built-in randomness makes it resistant to over-fitting and robust to noise; training is fast, multi-level classification results can be obtained, and a fairly accurate quantified objective standard results. Security tests are run on different edge intelligent terminal devices, and each terminal's test result data set is used as feedback to train the classifier and divide the security levels, improving the credibility of the security level classification results. In addition, the invention improves the resampling process used to obtain the training sets: sampling results are screened by added constraint conditions, which ensures that the random training sets represent the original training set better. For the process of combining decision trees into a forest, the invention changes the voting weights of the decision trees, which effectively reduces the shortcomings of the random forest algorithm itself; the improvement is significant for scenarios with unbalanced data distributions, and the results are also good when the amount of data is small.
The above is a preferred embodiment of the present invention. It should be understood that the present invention is not limited to the form described herein, which should not be regarded as excluding other embodiments; the invention can be used in various other combinations, modifications and environments, and can be modified within the scope contemplated herein through the above teachings or through the skill or knowledge of the related art. Modifications and changes made by those skilled in the art that do not depart from the spirit and scope of the present invention shall all fall within the scope of protection of the appended claims.

Claims (7)

1. a kind of edge calculations lateral terminal security level appraisal procedure of random forest, it is characterised in that: the following steps are included:
S1. in edge calculations side Build Security test platform, k test individual event of setting terminal, the test of each test individual event It as a result is 0 or 1, wherein 0 indicates not pass through, 1 indicates to pass through;
S2. On the edge-side security test platform, test m+n intelligent terminals against the k test items to obtain the set of individual security performance test results for each terminal, where the result set of the i-th intelligent terminal is:
$X_i = [x_{i1}, x_{i2}, \ldots, x_{ik}]$, $i = 1, 2, \ldots, m+n$;
where $x_{ij}$ is the score of the i-th intelligent terminal on the j-th test item, $j = 1, 2, \ldots, k$; the individual test results of all intelligent terminals are represented by an $(m+n) \times k$ matrix $X$:
$X = \begin{bmatrix} x_{11} & \cdots & x_{1k} \\ \vdots & \ddots & \vdots \\ x_{(m+n)1} & \cdots & x_{(m+n)k} \end{bmatrix}$;
S3. Determine the correspondence between intelligent terminal security levels and the individual test result sets;
S4. According to the correspondence in step S3, compute the security level $y_i$ corresponding to each $X_i = [x_{i1}, x_{i2}, \ldots, x_{ik}]$, which yields the data set $D = \{(X_1, y_1), (X_2, y_2), \ldots, (X_{m+n}, y_{m+n})\}$;
S5. Divide data set D, taking the first m samples of D as the training set T and the last n samples as the test set S:
Training set $T = \{(X_1, y_1), (X_2, y_2), \ldots, (X_m, y_m)\}$, which accounts for $\frac{m}{m+n}$ of the data set;
Test set $S = \{(X_{m+1}, y_{m+1}), (X_{m+2}, y_{m+2}), \ldots, (X_{m+n}, y_{m+n})\}$, which accounts for $\frac{n}{m+n}$ of the data set;
S6. Use the training set $T = \{(X_1, y_1), (X_2, y_2), \ldots, (X_m, y_m)\}$ as the sample set and input it into the random forest classifier model for training to obtain a trained classifier model;
S7. After training is complete, input the test set $S = \{(X_{m+1}, y_{m+1}), (X_{m+2}, y_{m+2}), \ldots, (X_{m+n}, y_{m+n})\}$ into the trained random forest classifier model, and compare the resulting predictions with the security levels from step S4 to obtain a qualified classifier;
S8. Connect a newly attached edge computing side intelligent terminal under test to the security test platform to obtain its test results, and input them into the qualified classifier model for evaluation to obtain the corresponding security level.
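The data preparation in steps S2, S4 and S5 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `build_dataset` and `split_dataset`, the sample results, and the placeholder labelling rule (`label_fn=sum`, which simply uses the total score as the label) are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[List[int], int]  # (X_i, y_i)


def build_dataset(results: Sequence[Sequence[int]],
                  label_fn: Callable[[Sequence[int]], int]) -> List[Sample]:
    """Pair each 0/1 result vector X_i with its level y_i (steps S2 and S4)."""
    for row in results:
        if any(value not in (0, 1) for value in row):
            raise ValueError("each test item result must be 0 or 1")
    return [(list(row), label_fn(row)) for row in results]


def split_dataset(dataset: List[Sample], m: int) -> Tuple[List[Sample], List[Sample]]:
    """Step S5: the first m samples form T, the remaining n samples form S."""
    return dataset[:m], dataset[m:]


# Hypothetical results for m + n = 5 terminals and k = 4 test items.
results = [[1, 1, 0, 1], [0, 0, 0, 1], [1, 1, 1, 1], [0, 1, 0, 0], [1, 0, 1, 1]]
D = build_dataset(results, label_fn=sum)  # placeholder labelling rule (assumption)
T, S = split_dataset(D, m=3)
```

With m = 3 and n = 2, the training set holds 3/5 of the data and the test set 2/5, matching the ratios m/(m+n) and n/(m+n).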
2. The random-forest-based method for evaluating the security level of edge computing side terminals according to claim 1, characterized in that step S3 comprises the following sub-steps:
S31. Divide the security levels of intelligent terminals into y classes;
S32. Let the total test score of the i-th intelligent terminal be $sum_i = \sum_{j=1}^{k} x_{ij}$;
S33. Determine the security level ranges at intervals of [formula missing in source]: when [formula missing in source], the security level of the i-th intelligent terminal is 0; when [formula missing in source], the security level is 1; when [formula missing in source], the security level is 2; and so on, when [formula missing in source], the security level is t, t = 1, 2, ..., y-1; a larger $sum_i$ indicates better security performance of the intelligent terminal.
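The mapping from total score to security level in S31 to S33 can be sketched as below. The interval expressions did not survive in the text, so this example assumes equal-width intervals of k/y (level t when t·k/y ≤ sum_i < (t+1)·k/y) and assigns a perfect score to the top level y-1; both the interval width and the function name `security_level` are assumptions for illustration only.

```python
from typing import Sequence


def security_level(x: Sequence[int], y: int) -> int:
    """Map a 0/1 result vector to a level in 0..y-1.

    Assumption: levels are separated at equal intervals of k / y,
    where k is the number of test items.
    """
    k = len(x)
    if k == 0 or y < 1:
        raise ValueError("need at least one test item and one level")
    total = sum(x)  # sum_i, the total test score
    # floor(total * y / k) is the interval index; clamp so total == k maps to y - 1
    return min(total * y // k, y - 1)


# k = 10 test items, y = 5 levels: each level spans two score points.
print(security_level([1, 1, 1, 0, 0, 0, 0, 0, 0, 0], y=5))  # total score 3
```

Integer arithmetic (`total * y // k`) avoids floating-point boundary errors when k is not a multiple of y.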
3. The random-forest-based method for evaluating the security level of edge computing side terminals according to claim 1, characterized in that step S6 comprises the following sub-steps:
S61. Select the random forest algorithm to construct the random forest classifier model; it is a Bagging-type ensemble that combines multiple weak classifiers and obtains the final result by voting or averaging, so that the overall model achieves high accuracy and good generalization performance;
S62. Divide the training set $T = \{(X_1, y_1), (X_2, y_2), \ldots, (X_m, y_m)\}$ into a minority class sample set $T_{\min}$ and a majority class sample set $T_{\max}$, where $T_{\min} \cap T_{\max} = \varnothing$ and $T_{\min} \cup T_{\max} = T$;
S63. Randomly draw two-thirds of the sample points from the original sample set to obtain a training set $T'$, and observe the minority class data set $T_{\min}'$ and the majority class data set $T_{\max}'$ of $T'$;
S64. Calculate the value of [formula missing in source] and impose the conditions [formula missing in source] and [formula missing in source];
S65. If the training set $T'$ satisfies the conditions in S64, keep the drawn training set; if $T'$ does not satisfy the conditions in S64, discard it;
S66. Repeat steps S63 to S65 until $N_{tree}$ training sets that satisfy the conditions are obtained, where $N_{tree}$ is the number of decision trees to be constructed; the $N_{tree}$ training sets finally obtained are $T_i$, where $i = 1, 2, \ldots, N_{tree}$;
S67. For $i = 1, 2, \ldots, N_{tree}$, use training set $T_i$ to train a CART decision tree $H_i$, selecting the optimal features according to the Gini index.
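The draw-screen-repeat loop of S63 to S66 can be sketched as follows. The actual conditions of S64 are not recoverable from the text, so this example substitutes a hypothetical constraint: a drawn subset is kept only if it contains at least one minority sample and its minority share differs from that of the original training set by at most `tol`. Sampling without replacement, the parameter names, and the `max_tries` guard are likewise assumptions.

```python
import random
from typing import List, Set, Tuple

Sample = Tuple[List[int], int]  # (X_i, y_i)


def constrained_subsets(train: List[Sample], minority_levels: Set[int],
                        n_tree: int, tol: float = 0.1, seed: int = 0,
                        max_tries: int = 10000) -> List[List[Sample]]:
    """Draw 2/3-size subsets and keep those passing a screening constraint."""
    size = (2 * len(train)) // 3
    if size == 0:
        raise ValueError("training set is too small to draw two-thirds of it")
    rng = random.Random(seed)

    def minority_share(samples: List[Sample]) -> float:
        return sum(1 for _, y in samples if y in minority_levels) / len(samples)

    target = minority_share(train)
    accepted: List[List[Sample]] = []
    tries = 0
    while len(accepted) < n_tree:          # S66: repeat until N_tree sets are kept
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not draw enough sets satisfying the constraint")
        candidate = rng.sample(train, size)  # S63: draw without replacement (assumption)
        share = minority_share(candidate)
        # Stand-in for the S64 conditions (hypothetical constraint).
        if share > 0 and abs(share - target) <= tol:
            accepted.append(candidate)     # S65: keep; otherwise discard
    return accepted


# Hypothetical training set: 24 samples of level 1 and 6 minority samples of level 0.
train = [([1, 1], 1)] * 24 + [([0, 0], 0)] * 6
subsets = constrained_subsets(train, minority_levels={0}, n_tree=5)
```

Each accepted subset would then be used to train one CART decision tree, as in S67.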
4. The random-forest-based method for evaluating the security level of edge computing side terminals according to claim 1, characterized in that step S7 comprises the following sub-steps:
S71. Take the test set $S = \{(X_{m+1}, y_{m+1}), (X_{m+2}, y_{m+2}), \ldots, (X_{m+n}, y_{m+n})\}$ as the samples to be tested;
S72. For $i = 1, 2, \ldots, N_{tree}$, the initial voting weight of each decision tree is 1; let $R_i = T_{i\max}' / T_{i\min}'$;
update the voting weight of each decision tree to [formula missing in source];
S73. For $j = m+1, m+2, \ldots, m+n$ and $i = 1, 2, \ldots, N_{tree}$, input the sample to be tested $X_j$; the decision tree $H_i$ from S66 outputs $H_i(X_j)$, and the final predicted class is [formula missing in source], which is the security level corresponding to test sample $X_j$;
S74. Set a classifier error threshold θ, 0 ≤ θ ≤ 1;
if [formula missing in source], the classifier meets the preset threshold and is a qualified classifier; otherwise, return to step S5 and retrain, where [formula missing in source].
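A weighted vote and a threshold check in the spirit of S72 to S74 can be sketched as below. The weight update formula, the final-class formula, and the exact error condition are missing from the text, so this example takes the per-tree weights as given inputs, picks the class with the largest total weight, and assumes that "qualified" means the misclassification rate on the test set is at most θ. All function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence


def weighted_vote(predictions: Sequence[int], weights: Sequence[float]) -> int:
    """Return the class with the largest summed voting weight.

    predictions[i] is H_i(X_j); weights[i] is the voting weight of tree i.
    Ties go to the lower security level (an arbitrary choice for this sketch).
    """
    scores: Dict[int, float] = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(sorted(scores), key=lambda label: scores[label])


def error_rate(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of test samples whose predicted level differs from y_j."""
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)


def is_qualified(predicted: Sequence[int], actual: Sequence[int], theta: float) -> bool:
    """Assumed reading of S74: qualified when the error rate does not exceed theta."""
    return error_rate(predicted, actual) <= theta


# Three trees vote on one sample; the first tree carries a larger (hypothetical) weight.
level = weighted_vote([0, 1, 1], [3.0, 1.0, 1.0])
```

If `is_qualified` returns False, the procedure in the claim returns to step S5 and retrains.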
5. The random-forest-based method for evaluating the security level of edge computing side terminals according to claim 1, characterized in that step S8 comprises the following sub-steps:
S81. Connect a newly attached edge computing side intelligent terminal under test to the security test platform to obtain the results of the k test items, $X = [x_1, x_2, \ldots, x_k]$;
S82. Input the test results into the qualified classifier model: [formula missing in source], where $f(X)$ is the corresponding security level.
6. The random-forest-based method for evaluating the security level of edge computing side terminals according to claim 3, characterized in that step S62 comprises the following sub-steps:
S621. Count the number of samples of each security level in the training set $T = \{(X_1, y_1), (X_2, y_2), \ldots, (X_m, y_m)\}$;
S622. For each security level, if its number of samples is greater than a preset threshold H, add all samples of that security level to the majority class sample set $T_{\max}$; if its number of samples is less than or equal to the preset threshold H, add all samples of that security level to the minority class sample set $T_{\min}$.
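The split in S621 and S622 can be sketched as follows; the function name `split_by_class_size` and the sample data are assumptions made for this example.

```python
from collections import Counter
from typing import List, Tuple

Sample = Tuple[List[int], int]  # (X_i, y_i)


def split_by_class_size(train: List[Sample],
                        threshold: int) -> Tuple[List[Sample], List[Sample]]:
    """Return (T_min, T_max) using the preset threshold H on per-level counts."""
    counts = Counter(y for _, y in train)                    # S621: samples per level
    t_min = [s for s in train if counts[s[1]] <= threshold]  # S622: count <= H
    t_max = [s for s in train if counts[s[1]] > threshold]   # S622: count > H
    return t_min, t_max


# Hypothetical training set: level 2 has four samples, level 0 has one.
train_set = [([1, 1], 2), ([1, 1], 2), ([1, 0], 2), ([0, 1], 2), ([0, 0], 0)]
T_min, T_max = split_by_class_size(train_set, threshold=2)
```

Because every sample falls on exactly one side of the threshold, the two sets are disjoint and together make up the whole training set.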
7. The random-forest-based method for evaluating the security level of edge computing side terminals according to claim 3, characterized in that step S67 comprises the following sub-steps:
S671. For training set $T_i$, compute the Gini index, $\mathrm{Gini}(T_i) = 1 - \sum_{k} P_k^2$; a smaller Gini index indicates that a sample selected from the set is less likely to be misclassified, that is, the purity of the set is higher; conversely, the set is less pure; where $P_k$ denotes the frequency with which the k-th class occurs in the classification results;
S672. For a training set $T_i$ containing N samples, divide data set $T_i$ into two parts $T_{i1}$ and $T_{i2}$ according to the i-th attribute value of attribute A, and compute $\mathrm{Gain\_GINI} = \frac{n_1}{N}\mathrm{Gini}(T_{i1}) + \frac{n_2}{N}\mathrm{Gini}(T_{i2})$, where $n_1$ and $n_2$ are the numbers of samples in sample sets $T_{i1}$ and $T_{i2}$;
S673. For attribute A, compute the Gain_GINI obtained after dividing the data set into two parts by each attribute value, and choose the minimum among them as the optimal binary split for attribute A: [formula missing in source]
S674. For sample set $T_i$, compute the optimal binary split of every attribute and choose the minimum among them as the optimal binary split of sample set $T_i$: [formula missing in source]
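The Gini computations in S671 to S674 can be sketched as below, using the standard CART definitions Gini = 1 − Σ P_k² and Gain_GINI = (n₁/N)·Gini(T_i1) + (n₂/N)·Gini(T_i2). The function names and the toy samples are assumptions for illustration; attributes are indexed by position in the result vector.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[List[int], int]  # (X_i, y_i)


def gini(labels: Sequence[int]) -> float:
    """S671: Gini index of a label collection (0.0 for an empty or pure set)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gain_gini(samples: List[Sample], attr: int, value: int) -> float:
    """S672: weighted Gini after splitting on `x[attr] == value`."""
    part1 = [y for x, y in samples if x[attr] == value]
    part2 = [y for x, y in samples if x[attr] != value]
    n = len(samples)
    return len(part1) / n * gini(part1) + len(part2) / n * gini(part2)


def best_split(samples: List[Sample]) -> Tuple[float, int, int]:
    """S673 and S674: minimum Gain_GINI over all attributes and attribute values.

    Returns (gain_gini, attribute index, attribute value).
    """
    k = len(samples[0][0])
    candidates = [
        (gain_gini(samples, attr, value), attr, value)
        for attr in range(k)
        for value in sorted({x[attr] for x, _ in samples})
    ]
    return min(candidates)


# Toy data: the first test item alone determines the level.
toy = [([1, 0], 1), ([1, 1], 1), ([0, 0], 0), ([0, 1], 0)]
score, attr, value = best_split(toy)
```

Since each test item result is 0 or 1, every attribute offers at most two candidate split values, and splitting on either value produces the same two parts.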
CN201910399303.8A 2019-05-14 2019-05-14 Edge computing terminal security level evaluation method for random forest Active CN110135167B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910399303.8A CN110135167B (en) 2019-05-14 2019-05-14 Edge computing terminal security level evaluation method for random forest

Publications (2)

Publication Number Publication Date
CN110135167A true CN110135167A (en) 2019-08-16
CN110135167B CN110135167B (en) 2020-11-20

Family

ID=67573839

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910399303.8A Active CN110135167B (en) 2019-05-14 2019-05-14 Edge computing terminal security level evaluation method for random forest

Country Status (1)

Country Link
CN (1) CN110135167B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100250473A1 (en) * 2009-03-27 2010-09-30 Porikli Fatih M Active Learning Method for Multi-Class Classifiers
CN107180362A (en) * 2017-05-03 2017-09-19 浙江工商大学 Retail commodity sales forecasting method based on deep learning
CN107886135A (en) * 2017-12-01 2018-04-06 江苏蓝深远望科技股份有限公司 A kind of parallel random forests algorithm for handling uneven big data
CN108306894A (en) * 2018-03-19 2018-07-20 西安电子科技大学 A kind of network security situation evaluating method and system that confidence level occurring based on attack
CN108874927A (en) * 2018-05-31 2018-11-23 桂林电子科技大学 Intrusion detection method based on hypergraph and random forest
CN109325844A (en) * 2018-06-25 2019-02-12 南京工业大学 Net under multidimensional data borrows borrower's credit assessment method
CN109344848A (en) * 2018-07-13 2019-02-15 电子科技大学 Mobile intelligent terminal security level classification method based on Adaboost

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KUGUA233: "《机器学习实战_决策树》", 15 June 2018, HTTPS://SEGMENTFAULT.COM/A/1190000015299657 *
傅佳: "《随机森林算法介绍》", 24 September 2018, HTTPS://MP.WEIXIN.QQ.COM/S?SRC=11&TIMESTAMP=1573523236&VER=1969&SIGNATURE=CWMDMOGHFR353CJFNKGWKRRT6GM1*UHMHK3Q2EXKMAKLLRWVRB3-F9WGHNRPPNTBRZNEA7FQF9GZZZBOECLFZKMRRSJBCSZVMU5FI3RWIENFSGQGMJGV5BBZ0GEWBKPK&NEW=1 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111124855A (en) * 2019-11-29 2020-05-08 苏州浪潮智能科技有限公司 Hard disk introduction risk assessment method, system and equipment
CN113128532B (en) * 2019-12-31 2023-06-20 北京超星未来科技有限公司 Training sample data acquisition method, processing method, device and system
CN113128532A (en) * 2019-12-31 2021-07-16 北京超星未来科技有限公司 Method for acquiring training sample data, method for processing training sample data, device and system
KR20210040884A (en) * 2020-04-17 2021-04-14 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. Edge computing test methods, devices, electronic devices and computer-readable media
EP3842948A3 (en) * 2020-04-17 2021-10-20 Beijing Baidu Netcom Science And Technology Co. Ltd. Method and apparatus for testing edge computing, device, and readable storage medium
KR102493449B1 (en) * 2020-04-17 2023-01-31 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. Edge computing test methods, devices, electronic devices and computer-readable media
CN111935171A (en) * 2020-08-24 2020-11-13 南方电网科学研究院有限责任公司 Terminal security policy selection method based on machine learning under edge calculation
CN112287345A (en) * 2020-10-29 2021-01-29 中南大学 Credible edge computing system based on intelligent risk detection
CN112287345B (en) * 2020-10-29 2024-04-16 中南大学 Trusted edge computing system based on intelligent risk detection
CN112583844B (en) * 2020-12-24 2021-09-03 北京航空航天大学 Big data platform defense method for advanced sustainable threat attack
CN112583844A (en) * 2020-12-24 2021-03-30 北京航空航天大学 Big data platform defense method for advanced sustainable threat attack
CN112801145B (en) * 2021-01-12 2024-05-28 深圳市中博科创信息技术有限公司 Security monitoring method, device, computer equipment and storage medium
CN112801145A (en) * 2021-01-12 2021-05-14 深圳市中博科创信息技术有限公司 Safety monitoring method and device, computer equipment and storage medium
CN113191455A (en) * 2021-05-26 2021-07-30 平安国际智慧城市科技股份有限公司 Edge computing box election method and device, electronic equipment and medium
CN113191455B (en) * 2021-05-26 2024-06-07 平安国际智慧城市科技股份有限公司 Edge computing box election method and device, electronic equipment and medium
CN113569482A (en) * 2021-07-29 2021-10-29 石家庄铁道大学 Method and device for evaluating service performance of tunnel, terminal and storage medium
CN113569482B (en) * 2021-07-29 2024-02-06 石家庄铁道大学 Tunnel service performance evaluation method, device, terminal and storage medium

Also Published As

Publication number Publication date
CN110135167B (en) 2020-11-20

Similar Documents

Publication Publication Date Title
CN110135167A (en) A kind of edge calculations terminal security grade appraisal procedure of random forest
CN107862347A (en) A kind of discovery method of the electricity stealing based on random forest
CN108632279A (en) A kind of multilayer method for detecting abnormality based on network flow
CN107846392A (en) A kind of intrusion detection algorithm based on improvement coorinated training ADBN
CN105373606A (en) Unbalanced data sampling method in improved C4.5 decision tree algorithm
CN104794368A (en) Rolling bearing fault classifying method based on FOA-MKSVM (fruit fly optimization algorithm-multiple kernel support vector machine)
CN116108758B (en) Landslide susceptibility evaluation method
CN110084610A (en) A kind of network trading fraud detection system based on twin neural network
CN106202952A (en) A kind of Parkinson disease diagnostic method based on machine learning
CN112735097A (en) Regional landslide early warning method and system
CN105913450A (en) Tire rubber carbon black dispersity evaluation method and system based on neural network image processing
CN109299741A (en) A kind of network attack kind identification method based on multilayer detection
CN109034194A (en) Transaction swindling behavior depth detection method based on feature differentiation
CN102200981B (en) Feature selection method and feature selection device for hierarchical text classification
CN110225055A (en) A kind of network flow abnormal detecting method and system based on KNN semi-supervised learning model
CN108052968A (en) A kind of perception intrusion detection method of QSFLA-SVM
CN112785450A (en) Soil environment quality partitioning method and system
CN109948726A (en) A kind of Power Quality Disturbance Classification Method based on depth forest
CN109829627A (en) A kind of safe confidence appraisal procedure of Electrical Power System Dynamic based on integrated study scheme
CN115577357A (en) Android malicious software detection method based on stacking integration technology
CN112561176A (en) Early warning method for online running state of electric power metering device
CN106056164A (en) Classification forecasting method based on Bayesian network
CN110177112B (en) Network intrusion detection method based on double subspace sampling and confidence offset
CN108920477A (en) A kind of unbalanced data processing method based on binary tree structure
Prasenna et al. Network programming and mining classifier for intrusion detection using probability classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant