CN109871686A - Rogue program recognition methods and device based on icon representation and software action consistency analysis - Google Patents
- Publication number
- CN109871686A (application CN201910123265.3A)
- Authority
- CN
- China
- Prior art keywords
- icon
- software
- classification
- sample
- tested
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention belongs to the technical field of network security and relates in particular to a method and device for identifying malicious programs based on consistency analysis of icon representation and software behavior. The method comprises: collecting normal software of known categories, extracting the icon resource data and import-table API data of that software, constructing CNN deep learning models, training them separately on the icons and on the import-table API information to establish an icon classification model and a software classification model, and obtaining a routine information base of software programs according to the icon category and software behavior category information; parsing the structure of a sample under test, extracting its icon resource data and import-table API function data, and testing them with the CNN deep learning models to obtain the icon category and software behavior category information of the sample; and judging, from the test results, whether the icon category and the software behavior category of the sample are consistent. The invention achieves automatic, fast, batch detection of malicious programs and effectively identifies malicious code that disguises itself with icons resembling those of legitimate software.
Description
Technical field
The invention belongs to the technical field of network security and relates in particular to a method and device for identifying malicious programs based on consistency analysis of icon representation and software behavior.
Background art
Traditional malicious code analysis falls into two broad categories: static analysis and dynamic analysis. Static analysis disassembles or decompiles a program without executing it and then analyzes the result; the main methods are static source code analysis, static disassembly analysis, and decompilation analysis. Dynamic analysis traces malicious code with a program debugging tool, observes its execution, dissects its working mechanism, and verifies the results of static analysis; the main methods are system call behavior analysis and heuristic scanning. However, traditional detection methods based on code and behavioral features generally need many tedious steps and a great deal of time to reach good results.
According to statistics, a sizable share of malicious code is deceptive: it typically disguises itself with an icon resembling that of popular software such as WORD to lure users into clicking it. Once clicked, such code carries out operations such as stealing secrets or extortion, putting the user's information assets at risk. In recent years a new approach based on icon similarity analysis has therefore been proposed in the field of malicious code detection. Its innovation is to start from the icon: it exploits the fact that malicious code disguises itself with icons similar to those of normal software, which greatly improves detection efficiency and precision. Research on malicious code detection based on icon similarity analysis is therefore of practical importance for detection work.
One existing approach uses machine learning to extract information from icons in order to improve detection precision, in two steps: 1) icon features are extracted using summary statistics, histograms of oriented gradients (HOG), and a convolutional autoencoder; 2) the icons are clustered according to the extracted features. Experiments show that using icons in the prediction model raises mean precision by 10%. The drawbacks are that features must be extracted manually, icon classification is neither fast nor precise, and no effective behavior analysis method is provided.
Another approach, a mobile terminal malicious code detection method and system based on application icons, first analyzes the installation package of an application and extracts its icon, then extracts the system API functions from the application's code files. The icon is matched against an application icon function rule base to retrieve the function rules corresponding to that icon, and the API functions called by the application are compared with those rules: if they agree, the application is normal; otherwise it is malicious. This technique is not practical, however: malicious code comes in many varieties, and an application's API information cannot fully reflect its functionality, so serious false positives or false negatives can occur in practice.
Judged by actual results against the need for automatic detection of large, diverse sets of malicious samples, existing icon-based detection methods have two shortcomings. They lack generality: they can only detect samples that use icons similar to those of normal software and cannot detect samples that use other icons. They lack practicality: efficiency is very low when large numbers of samples are examined.
Summary of the invention
To this end, the present invention provides a method and device for identifying malicious programs based on consistency analysis of icon representation and software behavior. It greatly reduces the amount of computation and achieves automatic, fast, batch detection of malicious programs with high efficiency, generality, and broad applicability.
According to the design scheme provided by the present invention, a malicious program identification method based on consistency analysis of icon representation and software behavior comprises the following:
A) collecting normal software of known categories, extracting its icon resource data and import-table API data, constructing CNN deep learning models, training them separately on the icons and on the import-table API information, and obtaining a conventional software library according to the icon category and software behavior category information;
B) parsing the structure of a sample under test, extracting its icon resource data and import-table API function data, and passing them to the trained conventional software library for testing to obtain the icon category and software behavior category information of the sample;
C) judging, from the test results, whether the icon category and the software behavior category of the sample are consistent: if they are consistent, the sample is judged to be normal software; if they are inconsistent, it is judged to be malware, and a malicious program detection report is generated and output.
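The decision in step C can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Verdict`, `judge`, and `report`, the category labels, and the report format are hypothetical and do not come from the patent, which does not specify how the detection report is laid out.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    sample: str
    icon_class: str       # category predicted from the icon
    behavior_class: str   # category predicted from the import-table APIs
    malicious: bool


def judge(sample: str, icon_class: str, behavior_class: str) -> Verdict:
    """Step C: a sample is flagged when its two predicted categories disagree."""
    return Verdict(sample, icon_class, behavior_class,
                   malicious=(icon_class != behavior_class))


def report(verdicts):
    """Build one report line per sample judged malicious (format is an assumption)."""
    return [
        f"{v.sample}: icon={v.icon_class}, behavior={v.behavior_class} -> MALICIOUS"
        for v in verdicts
        if v.malicious
    ]


# Hypothetical classifier outputs for two samples.
ok = judge("a.exe", "office", "office")
bad = judge("b.exe", "office", "downloader")
lines = report([ok, bad])
```

Here the first sample passes because both classifiers agree, while the second is reported because an office-style icon is paired with behavior of a different category.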
In the above, the CNN deep learning models include an icon classification model, which obtains the category of a software program from its icon resource data, and a software behavior classification model, which obtains the behavior category of a software program from its import-table API data.
Preferably, the icon classification model is a convolutional neural network comprising an input layer, convolutional layers, pooling layers, a fully connected layer, and an output layer. The convolutional and pooling layers alternate: the convolutional layers extract icon features by convolution, the pooling layers reduce the feature dimensionality, and the fully connected layer outputs the icon classification result.
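To make the alternation of convolution and pooling concrete, the following pure-Python sketch applies one convolution kernel and one pooling step to a toy grayscale "icon". It is not the patent's network: the 5x5 image, the edge kernel, and the function names `conv2d` and `max_pool2d` are assumptions chosen for illustration, and a real model would stack several such layers with learned kernels before the fully connected layer.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation of a grayscale image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            max(fmap[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 5x5 "icon" with a vertical edge between columns 1 and 2.
icon = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]             # responds to left-to-right brightening

feature_map = conv2d(icon, edge_kernel)      # 4x4 feature map
features = max_pool2d(feature_map)           # 2x2 map after dimensionality reduction
```

The convolution produces a 4x4 map that is non-zero only where the edge sits, and pooling halves each dimension while keeping that strongest response.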
Preferably, in the software behavior classification model, the import-table API function data is saved as a text file, which converts the software behavior classification problem into a text classification problem.
Further, the conversion proceeds as follows. Each extracted set of import-table API function data is treated as one sample; every line of a sample represents one behavior, and a line is treated as a single token. All import-table API texts are traversed and every behavior that occurs is deduplicated to obtain a dictionary. Each word in the dictionary is labeled with a consecutive number, giving a mapping from dynamic behaviors to label ids. The text is then converted to a two-dimensional matrix: each word in the dictionary is represented by a vector; for each sample, the behaviors in the text are converted to the corresponding id sequence according to the dictionary, and the sample is converted to a two-dimensional matrix from that id sequence and the vector of each id in the dictionary. The samples are used to train a convolutional neural network: the input matrix is convolved with several convolution kernels, and max pooling is then applied to each convolution result to take the maximum value of its column vector. The maxima of all kernels are concatenated to form the fully connected layer, and a softmax classifier performs the multi-class classification.
While the words in the dictionary are being labeled with consecutive numbers to obtain the mapping from dynamic behaviors to label ids, an unknown dynamic behavior is also added; it is used later to match unknown behaviors that are not in the dictionary.
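The dictionary construction and the role of the unknown behavior can be sketched as follows. The function names `build_vocab` and `encode`, the choice of id 0 for the unknown entry, and the sample API names are assumptions made for illustration; the patent only states that ids are consecutive and that an unknown behavior is added.

```python
UNKNOWN = "Unknown"  # placeholder for behaviors never seen during training


def build_vocab(samples):
    """Map every distinct behavior line to a consecutive id.

    Assumption: id 0 is reserved for the unknown behavior.
    """
    vocab = {UNKNOWN: 0}
    for lines in samples:
        for line in lines:
            behavior = line.strip()
            if behavior and behavior not in vocab:
                vocab[behavior] = len(vocab)
    return vocab


def encode(lines, vocab):
    """Convert one sample (a list of behavior lines) to its id sequence."""
    return [vocab.get(line.strip(), vocab[UNKNOWN])
            for line in lines if line.strip()]


# Each inner list stands for one import-table API text file, one API per line.
training_samples = [
    ["CreateFileW", "ReadFile"],
    ["ReadFile", "RegOpenKeyExW"],
]
vocab = build_vocab(training_samples)
ids = encode(["ReadFile", "VirtualAlloc"], vocab)  # VirtualAlloc was never seen
```

Because `VirtualAlloc` does not occur in the training texts, it falls back to the unknown id instead of raising an error, which is the purpose of the extra dictionary entry.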
In the above, if the icon category and software behavior category information of a sample under test cannot be obtained during testing against the conventional software library, the sample is judged to be a new category and is added to the routine information base of software programs.
A malicious program identification device based on consistency analysis of icon representation and software behavior comprises a collection module, a test module, and a determination module, wherein:
the collection module is configured to collect normal software of known categories, extract its icon resource data and import-table API data, construct CNN deep learning models, train them separately on the icons and on the import-table API information, and obtain a conventional software library according to the icon category and software behavior category information;
the test module is configured to parse the structure of a sample under test, extract its icon resource data and import-table API function data, and pass them to the trained conventional software library for testing to obtain the icon category and software behavior category information of the sample;
the determination module is configured to judge, from the test results, whether the icon category and the software behavior category of the sample are consistent: if they are consistent, the sample is judged to be normal software; if they are inconsistent, it is judged to be malware, and a malicious program detection report is generated and output.
Beneficial effects of the present invention:
Having weighed the strengths and weaknesses of traditional malicious code detection methods and of current icon-based detection methods, the present invention uses a conventional-software test model built by machine learning to obtain both the software category expressed by the icon and the software behavior category. Identification based on consistency analysis of icon representation and software behavior effectively solves the low efficiency and high cost of traditional malicious code detection methods. It also solves a shortcoming of current icon-based detection methods, namely that malicious programs which do not use icons similar to those of normal software cannot be detected effectively. With large numbers of samples it achieves automatic, fast, batch detection of malicious programs with high efficiency, and it can effectively identify malicious code on the network that disguises itself with icons resembling those of legitimate software. This further safeguards the information assets of network users and is of significance for the development of network security technology.
Brief description of the drawings:
Fig. 1 is the first flow chart of the malicious program identification method in an embodiment;
Fig. 2 is the second flow chart of the malicious program identification method in an embodiment;
Fig. 3 is a schematic diagram of the malicious program identification device in an embodiment;
Fig. 4 is a working principle diagram of the malicious program identification device in an embodiment.
Detailed description of the embodiments:
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and the technical solutions.
To meet the need for automatic detection of large, diverse sets of malicious samples, an embodiment of the present invention, referring to Fig. 1, provides a malicious program identification method based on consistency analysis of icon representation and software behavior, comprising the following:
S101) collecting normal software of known categories, extracting its icon resource data and import-table API data, constructing CNN deep learning models, training them separately on the icons and on the import-table API information, and obtaining a conventional software library according to the icon category and software behavior category information;
S102) parsing the structure of a sample under test, extracting its icon resource data and import-table API function data, and passing them to the trained conventional software library for testing to obtain the icon category and software behavior category information of the sample;
S103) judging, from the test results, whether the icon category and the software behavior category of the sample are consistent: if they are consistent, the sample is judged to be normal software; if they are inconsistent, it is judged to be malware, and a malicious program detection report is generated and output.
Referring to Fig. 2, in another embodiment of the present invention the CNN deep learning models include an icon classification model, which obtains the category of a software program from its icon resource data, and a software behavior classification model, which obtains the behavior category of a software program from its import-table API data. Both models are established. The icon classification model analyzes the icon information of the software and obtains the software category that the icon expresses. The software classification model takes the result of the icon classification model and analyzes whether the behavior information of the software is consistent with the behavior of that category of software in the software behavior classification model.
For the icon classification model: deep learning models such as CNNs have achieved good results in image classification, so classifying icon images is likewise feasible. In another embodiment of the present invention, the icon classification model is a convolutional neural network comprising an input layer, convolutional layers, pooling layers, a fully connected layer, and an output layer. The convolutional and pooling layers alternate: the convolutional layers extract icon features by convolution, the pooling layers reduce the feature dimensionality, and the fully connected layer outputs the icon classification result.
For the software behavior classification model, in another embodiment of the present invention the import-table API function data is saved as a text file, converting the software behavior classification problem into a text classification problem to which CNN methods from natural language processing can be applied. Starting from the icon and using deep learning, an icon classification model and a software classification model are established. Using the icon information of malicious code effectively solves the low efficiency and high cost of traditional malicious code detection methods. It also remedies the shortcoming of current icon-based detection methods, namely that malicious programs which do not use icons similar to those of normal software cannot be detected effectively, and with large numbers of samples it achieves automatic, fast, batch detection of malicious programs.
In another embodiment of the present invention, the software behavior classification problem is converted into a text classification problem as follows. Each extracted set of import-table API function data is treated as one sample; every line of a sample represents one behavior, and a line is treated as a single token. All import-table API texts are traversed and every behavior that occurs is deduplicated to obtain a dictionary. Each word in the dictionary is labeled with a consecutive number, giving a mapping from dynamic behaviors to label ids. The text is then converted to a two-dimensional matrix: each word in the dictionary is represented by a vector; for each sample, the behaviors in the text are converted to the corresponding id sequence according to the dictionary, and the sample is converted to a two-dimensional matrix from that id sequence and the vector of each id in the dictionary. The samples are used to train a convolutional neural network: the input matrix is convolved with several convolution kernels, and max pooling is then applied to each convolution result to take the maximum value of its column vector. The maxima of all kernels are concatenated to form the fully connected layer, and a softmax classifier performs the multi-class classification.
While the words in the dictionary are being labeled with consecutive numbers to obtain the mapping from dynamic behaviors to label ids, an unknown dynamic behavior is also added; it is used later to match unknown behaviors that are not in the dictionary.
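The forward pass of the text-classification network described above (id sequence to matrix, convolution with several kernels, maximum of each result, concatenation, softmax) can be sketched in pure Python. This is a forward pass only, with randomly initialized and untrained parameters; the dimensions, kernel heights, zero-padding of short samples, and all function names are assumptions for illustration rather than values given in the patent.

```python
import math
import random


def embed(ids, table):
    """Turn an id sequence into a 2-D matrix: one vector (row) per behavior."""
    return [table[i] for i in ids]


def conv_max(matrix, kernel):
    """Slide a full-width kernel over consecutive rows and keep the largest response."""
    height, width = len(kernel), len(kernel[0])
    if len(matrix) < height:  # assumption: zero-pad samples shorter than the kernel
        matrix = matrix + [[0.0] * width for _ in range(height - len(matrix))]
    responses = [
        sum(matrix[r + i][j] * kernel[i][j]
            for i in range(height) for j in range(width))
        for r in range(len(matrix) - height + 1)
    ]
    return max(responses)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(ids, table, kernels, weights, bias):
    matrix = embed(ids, table)
    pooled = [conv_max(matrix, k) for k in kernels]   # one maximum per kernel
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


rng = random.Random(0)
dim, n_vocab, n_classes = 4, 6, 3
table = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_vocab)]
kernels = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(h)]
           for h in (2, 3)]                           # two kernels of height 2 and 3
weights = [[rng.uniform(-1, 1) for _ in kernels] for _ in range(n_classes)]
bias = [0.0] * n_classes

probs = forward([1, 2, 5, 0, 3], table, kernels, weights, bias)
window_max = conv_max([[1, 0], [0, 1], [2, 2]], [[1, 1], [1, 1]])
```

Taking the maximum over each kernel's responses makes the output independent of the number of API lines in a sample, so import tables of different lengths map to a fixed-size vector before classification.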
Referring to Fig. 2, the workflow involves three parts.
1) The sample information extraction module parses the PE structure of the incoming sample under test and extracts its icon resource information and the API function information in its import table. The icon is saved in icon format and the import-table APIs are saved as text; both serve as input to the subsequent maliciousness detection.
2) A large amount of normal software is collected, such as office software, video software, and music software; icons of applications such as WORD are used very widely in malicious code and are highly deceptive to ordinary users. Icon resource information and import-table API information are first extracted from the normal software in the same way as in the sample information extraction module, and the icon classification model and software behavior classification model are then established. The icon classification model is a CNN consisting of an input layer, convolutional layers, pooling layers, a fully connected layer, and an output layer, with the convolutional and pooling layers alternating. For the software classification model, each extracted import-table API text is treated as a sample; every line of a sample represents one behavior, and a line is treated as a single token. All import-table API texts are traversed, and all behaviors that occur (deduplicated) form the dictionary. Each word in the dictionary is labeled with a consecutive number, which yields the mapping from dynamic behaviors to label ids. In addition to the behaviors that occur, an "Unknown" dynamic behavior is added; it is used later to match unknown behaviors that are not in the dictionary. The text is then converted to a two-dimensional matrix. Each word in the dictionary is first represented by a vector, which is initialized randomly and updated continually during training. For each sample, the behaviors in the text are converted to the corresponding id sequence according to the dictionary, and the sample is converted to a two-dimensional matrix from that id sequence and the vector of each id in the dictionary. Finally, the samples are used to train a CNN: the input sample matrix is convolved with several convolution kernels, max-pooling is applied to each convolution result to take the maximum value of its column vector, the maxima of all kernels are concatenated to form the fully connected layer, and softmax performs the multi-class classification.
3) The maliciousness detection module feeds the sample under test into the conventional software model. The icon classification model determines a category for the software and the software classification model determines a category for the software; if the two agree, icon representation and software behavior are consistent and the sample is normal software; otherwise it is malware. In another embodiment of the present invention, if the icon category and software behavior category information of a sample cannot be obtained during testing against the conventional software library, the sample is judged to be a new category and is added to the routine information base of software programs.
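The patent does not say how the models decide that a category "cannot be obtained". One plausible reading, shown here purely as an assumption, is a confidence threshold on the softmax output: when either classifier's best probability falls below the threshold, the sample is treated as a new category and queued for the routine information base. The threshold value, the labels, and the names `top_class` and `detect` are hypothetical.

```python
def top_class(probs, labels, threshold=0.6):
    """Return the most probable label, or None when confidence is below the threshold.

    Assumption: the threshold rule is not part of the patent text.
    """
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best] if probs[best] >= threshold else None


def detect(sample, icon_probs, behavior_probs, labels, library):
    """Combine both classifier outputs into normal / malicious / new-category."""
    icon_cls = top_class(icon_probs, labels)
    behavior_cls = top_class(behavior_probs, labels)
    if icon_cls is None or behavior_cls is None:
        library.append(sample)          # queue the sample as a new category
        return "new-category"
    return "normal" if icon_cls == behavior_cls else "malicious"


labels = ["office", "video", "music"]
library = []                            # stands in for the routine information base
r1 = detect("a.exe", [0.90, 0.05, 0.05], [0.80, 0.10, 0.10], labels, library)
r2 = detect("b.exe", [0.90, 0.05, 0.05], [0.10, 0.10, 0.80], labels, library)
r3 = detect("c.exe", [0.40, 0.30, 0.30], [0.80, 0.10, 0.10], labels, library)
```

Only the third sample reaches the library, because its icon classifier is not confident enough to name any known category.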
Based on the above method, the present invention also provides a malicious program identification device based on consistency analysis of icon representation and software behavior. Referring to Fig. 3, it comprises a collection module 101, a test module 102, and a determination module 103, wherein:
the collection module 101 is configured to collect normal software of known categories, extract its icon resource data and import-table API data, construct CNN deep learning models, train them separately on the icons and on the import-table API information, and obtain a conventional software library according to the icon category and software behavior category information;
the test module 102 is configured to parse the structure of a sample under test, extract its icon resource data and import-table API function data, and pass them to the trained conventional software library for testing to obtain the icon category and software behavior category information of the sample;
the determination module 103 is configured to judge, from the test results, whether the icon category and the software behavior category of the sample are consistent: if they are consistent, the sample is judged to be normal software; if they are inconsistent, it is judged to be malware, and a malicious program detection report is generated and output.
Referring to Fig. 4, sample information extraction parses the PE structure of the input sample under test and extracts its icon resource information and the API function information in its import table as input to the subsequent maliciousness detection. The conventional software model is built by collecting the executables of a large amount of normal software, extracting their icon information and import-table API information, constructing CNN deep learning models, and training them separately on the icons and on the import-table API information to establish the icon classification model and the software classification model, which serve as the reference for maliciousness detection. Maliciousness detection feeds the sample under test into the conventional software model: the icon classification model determines a category for the software and the software classification model determines a category for the software; if the two agree, icon representation and software behavior are consistent and the sample is normal software; otherwise it is malware. If the icon classification model or the software classification model cannot reach a decision, so that maliciousness cannot be determined, the input sample is judged to be a new category and is added to the routine information base.
This solves the low efficiency and high cost of traditional malicious code detection methods, as well as the inability of icon-based detection methods to effectively detect malicious programs that do not use icons similar to those of normal software. Combined with the routine information base and its real-time updating, it achieves automatic, fast, batch detection of malicious programs on large numbers of samples and effectively identifies malicious code on the network that disguises itself with icons resembling those of legitimate software. It is easy to implement and efficient, safeguards the information assets of network users, and is of practical importance for network security detection.
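The PE structure parsing mentioned for sample information extraction begins by locating the PE header. The sketch below covers only that first step, using the documented layout of the DOS header (`MZ` magic, `e_lfanew` at offset 0x3C) and the COFF file header; it does not walk the import table or the resource directory, which a full extractor would need to do. The function name `parse_pe_header` and the synthetic test blob are assumptions for illustration.

```python
import struct


def parse_pe_header(data: bytes):
    """Locate the PE signature and read two fields of the COFF file header."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)   # offset of the PE signature
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # COFF header follows the 4-byte signature: Machine, NumberOfSections, ...
    machine, n_sections = struct.unpack_from("<HH", data, e_lfanew + 4)
    return {"pe_offset": e_lfanew, "machine": machine, "sections": n_sections}


def make_fake_pe(pe_offset=0x80, machine=0x14C, sections=3) -> bytes:
    """Build a minimal synthetic header for demonstration (not a runnable executable)."""
    blob = bytearray(pe_offset + 24)
    blob[:2] = b"MZ"
    struct.pack_into("<I", blob, 0x3C, pe_offset)
    blob[pe_offset:pe_offset + 4] = b"PE\0\0"
    struct.pack_into("<HH", blob, pe_offset + 4, machine, sections)
    return bytes(blob)


info = parse_pe_header(make_fake_pe())

try:
    parse_pe_header(b"not a PE file")
    rejected = False
except ValueError:
    rejected = True
```

Rejecting non-PE input early keeps malformed samples from reaching the icon and import-table extraction stages.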
Unless specifically stated otherwise, the relative steps of the components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the invention.
Based on the above method, an embodiment of the present invention also provides a server comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the above method.
Based on the above method, an embodiment of the present invention also provides a computer-readable medium on which a computer program is stored, wherein the program implements the above method when executed by a processor.
The device provided by the embodiment of the present invention has the same implementation principle and technical effects as the foregoing method embodiment. For brevity, where the device embodiment is silent, refer to the corresponding content in the foregoing method embodiment.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system and device described above can be found in the corresponding processes of the foregoing method embodiment and are not repeated here.
In all examples shown and described herein, any specific value should be interpreted as merely illustrative rather than limiting; other examples of the exemplary embodiments may therefore have different values.
It should also be noted that similar reference numerals and letters denote similar items in the drawings; once an item has been defined in one drawing, it need not be further defined or explained in subsequent drawings.
The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In the several embodiments provided herein, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through communication interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the scheme of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a non-volatile computer-readable storage medium executable by a processor. On this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods of the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the embodiments described above are only specific implementations of the present invention, intended to illustrate its technical solution rather than to limit it, and the scope of protection of the present invention is not limited to them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone skilled in the art may, within the technical scope disclosed by the present invention, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of the technical features. Such modifications, variations, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be covered by the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.
Claims (10)
1. A malicious program identification method based on icon representation and software behavior consistency analysis, characterized by comprising:
A) collecting normal software data of known categories, extracting the icon resource data and import table API data of the known normal software, constructing a CNN deep learning model, training on the icon information and the import table API information separately, and obtaining a conventional software library according to the icon classification and software behavior classification information;
B) parsing the structure of a sample to be tested, extracting its icon resource data and import table API function data, and passing them to the trained conventional software library for testing, to obtain the icon classification and software behavior classification information of the sample to be tested;
C) judging, according to the test result, whether the icon classification and the software behavior classification information of the sample to be tested are consistent; if they are consistent, the sample is determined to be normal software; if they are inconsistent, the sample is determined to be malware, and a malicious program detection report is generated and output.
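The consistency decision of step C) can be sketched as follows. This is only an illustration under stated assumptions: the two classifiers are assumed to return a category label each, `None` is assumed to mean that no category could be obtained, and the names `judge_consistency` and `build_report` as well as the report fields are hypothetical and do not come from the patent text.

```python
from typing import Optional


def judge_consistency(icon_category: Optional[str],
                      behavior_category: Optional[str]) -> str:
    """Compare the icon category with the behavior category of one sample.

    Both arguments are hypothetical classifier outputs; None means the
    corresponding model could not assign a category.
    """
    if icon_category is None or behavior_category is None:
        # Step C) does not cover this case; the sketch reports it separately.
        return "unknown"
    if icon_category == behavior_category:
        return "normal"
    return "malicious"


def build_report(sample_name: str,
                 icon_category: Optional[str],
                 behavior_category: Optional[str]) -> dict:
    """Assemble a simple detection report record for one sample."""
    return {
        "sample": sample_name,
        "icon_category": icon_category,
        "behavior_category": behavior_category,
        "verdict": judge_consistency(icon_category, behavior_category),
    }
```

For example, a sample whose icon is classified as a browser but whose import table behavior is classified as a downloader would receive the verdict "malicious" in this sketch.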
2. The malicious program identification method based on icon representation and software behavior consistency analysis according to claim 1, characterized in that the CNN deep learning model comprises an icon classification model for obtaining the software program category from the icon resource data, and a software behavior classification model for obtaining the behavior category of the software program from the import table API data.
3. The malicious program identification method based on icon representation and software behavior consistency analysis according to claim 2, characterized in that the icon classification model is a convolutional neural network model comprising an input layer, convolutional layers, pooling layers, a fully connected layer, and an output layer, wherein the convolutional layers and pooling layers are arranged alternately, the convolutional layers extract icon features by convolution, the pooling layers reduce the feature dimensionality, and the icon classification result is output through the fully connected layer.
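The two core operations named in claim 3, convolution for feature extraction and pooling for dimensionality reduction, can be illustrated with a small standard-library sketch. It is not the patented model: the function names, the single-channel input, the absence of padding, and the 2x2 pooling window are all assumptions made for illustration, and training is omitted.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Valid (no padding) 2-D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling; rows and columns that do not fill
    a complete window are dropped."""
    out_h = len(feature) // size
    out_w = len(feature[0]) // size
    return [
        [
            max(feature[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]
```

Alternating these two steps shrinks the feature map at each stage; a 5x5 input convolved with a 2x2 kernel gives a 4x4 map, which 2x2 pooling reduces to 2x2.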
4. The malicious program identification method based on icon representation and software behavior consistency analysis according to claim 2, characterized in that, in the software behavior classification model, the import table API function data are saved in text file format, so that the software behavior classification problem is converted into a text classification problem.
5. The malicious program identification method based on icon representation and software behavior consistency analysis according to claim 4, characterized in that, in converting the software behavior classification problem into a text classification problem: each set of extracted import table API function data is taken as a sample, in which every line represents one behavior and is treated as a whole; all import table API function data texts are traversed and all behaviors that occur are deduplicated to obtain a dictionary; each word in the dictionary is labeled with consecutive numbers to obtain a mapping from dynamic behaviors to label ids; the text is converted into a two-dimensional matrix by representing each word in the dictionary with a vector and, for each sample, converting the behaviors in the text into the corresponding id sequence according to the dictionary, and then converting the sample into a two-dimensional matrix according to the id sequence and the vector of each id in the dictionary; the samples are trained with a convolutional neural network, in which the input two-dimensional matrix is convolved with multiple convolution kernels separately and, after convolution, max pooling is applied to each convolution result to take the maximum value of the column vector; the maximum values corresponding to all convolution kernel results are concatenated to form a fully connected layer, and multi-class classification is performed with a softmax classifier.
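The pipeline of claim 5 can be traced end to end in a short standard-library sketch: build the dictionary, map behaviors to ids, stack the id vectors into a two-dimensional matrix, convolve with several kernels, take the maximum of each convolution result, and classify with softmax. Everything concrete here is an assumption for illustration: the function names, the toy vectors, kernels, and dense weights are hypothetical, the parameters are fixed rather than learned, and each sample is assumed to have at least as many lines as the kernel height.

```python
import math
from typing import Dict, List

Matrix = List[List[float]]


def build_vocab(samples: List[List[str]]) -> Dict[str, int]:
    """Deduplicate all behaviors (one per line) and number them consecutively."""
    vocab: Dict[str, int] = {}
    for lines in samples:
        for behavior in lines:
            if behavior not in vocab:
                vocab[behavior] = len(vocab)
    return vocab


def to_matrix(lines: List[str], vocab: Dict[str, int],
              embeddings: Matrix) -> Matrix:
    """Convert behaviors to ids, then stack the vector of each id row by row."""
    return [embeddings[vocab[behavior]] for behavior in lines]


def conv_max(matrix: Matrix, kernel: Matrix) -> float:
    """Slide a kernel spanning the full vector width down the rows, then
    keep the maximum response (max pooling over the column vector)."""
    height, width = len(kernel), len(kernel[0])
    responses = [
        sum(matrix[r + i][j] * kernel[i][j]
            for i in range(height) for j in range(width))
        for r in range(len(matrix) - height + 1)
    ]
    return max(responses)


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(matrix: Matrix, kernels: List[Matrix],
            weights: Matrix) -> List[float]:
    """One max value per kernel, concatenated, then a dense layer and softmax."""
    features = [conv_max(matrix, kernel) for kernel in kernels]
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return softmax(logits)
```

Because each kernel contributes exactly one maximum value, samples with different numbers of import table lines all produce a feature vector of the same length, which is what allows a fixed-size fully connected layer to follow.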
6. The malicious program identification method based on icon representation and software behavior consistency analysis according to claim 5, characterized in that, in the process of labeling each word in the dictionary with consecutive numbers to obtain the mapping from dynamic behaviors to label ids, an unknown dynamic behavior is also added, which is used to match unknown behaviors encountered later that are not in the dictionary.
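The unknown-behavior entry of claim 6 corresponds to the common practice of reserving one id for out-of-dictionary tokens. A minimal sketch follows; the placeholder string `<UNK>`, its id 0, and the function names are assumptions chosen for illustration, not details given in the patent.

```python
from typing import Dict, List

UNKNOWN = "<UNK>"  # hypothetical placeholder name, not taken from the patent


def build_vocab_with_unknown(behaviors: List[str]) -> Dict[str, int]:
    """Number behaviors consecutively, reserving id 0 for unknown behaviors."""
    vocab: Dict[str, int] = {UNKNOWN: 0}
    for behavior in behaviors:
        if behavior not in vocab:
            vocab[behavior] = len(vocab)
    return vocab


def encode(lines: List[str], vocab: Dict[str, int]) -> List[int]:
    """Map each behavior to its id; behaviors not in the dictionary
    fall back to the reserved unknown id."""
    return [vocab.get(behavior, vocab[UNKNOWN]) for behavior in lines]
```

With this fallback, a sample containing an API function never seen during training can still be converted into an id sequence instead of causing a lookup failure.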
7. The malicious program identification method based on icon representation and software behavior consistency analysis according to claim 1, characterized in that, in B), during testing against the conventional software library, if the icon classification and software behavior classification information of the sample to be tested cannot be obtained, the sample to be tested is determined to be a new category and is added to the software program conventional information library.
8. A malicious program identification device based on icon representation and software behavior consistency analysis, characterized by comprising a collection module, a test module, and a determination module, wherein:
the collection module is configured to collect normal software data of known categories, extract the icon resource data and import table API data of the known normal software, construct a CNN deep learning model, train on the icon information and the import table API information separately, and obtain a conventional software library according to the icon classification and software behavior classification information;
the test module is configured to parse the structure of a sample to be tested, extract its icon resource data and import table API function data, and pass them to the trained conventional software library for testing, to obtain the icon classification and software behavior classification information of the sample to be tested;
the determination module is configured to judge, according to the test result, whether the icon classification and the software behavior classification information of the sample to be tested are consistent; if they are consistent, the sample is determined to be normal software; if they are inconsistent, the sample is determined to be malware, and a malicious program detection report is generated and output.
9. The malicious program identification device based on icon representation and software behavior consistency analysis according to claim 8, characterized in that the CNN deep learning model comprises an icon classification model for obtaining the software program category from the icon resource data, and a software behavior classification model for obtaining the behavior category of the software program from the import table API data.
10. The malicious program identification device based on icon representation and software behavior consistency analysis according to claim 8, characterized by further comprising an update module configured, for the case in which the icon classification and software behavior classification information of a sample to be tested cannot be obtained from the conventional software library, to add the sample in which this case occurs to the software program conventional information library.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2019100997272 | 2019-01-31 | ||
CN201910099727 | 2019-01-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109871686A true CN109871686A (en) | 2019-06-11 |
Family
ID=66918928
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910123265.3A Pending CN109871686A (en) | 2019-01-31 | 2019-02-18 | Rogue program recognition methods and device based on icon representation and software action consistency analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109871686A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111290785A (en) * | 2020-03-06 | 2020-06-16 | 北京百度网讯科技有限公司 | Method and device for evaluating deep learning framework system compatibility, electronic equipment and storage medium |
CN112257757A (en) * | 2020-09-27 | 2021-01-22 | 北京锐服信科技有限公司 | Malicious sample detection method and system based on deep learning |
CN112364309A (en) * | 2021-01-13 | 2021-02-12 | 北京云真信科技有限公司 | Information processing method, electronic device, and computer-readable storage medium |
CN112487432A (en) * | 2020-12-10 | 2021-03-12 | 杭州安恒信息技术股份有限公司 | Method, system and equipment for malicious file detection based on icon matching |
CN112860932A (en) * | 2021-02-19 | 2021-05-28 | 电子科技大学 | Image retrieval method, device, equipment and storage medium for resisting malicious sample attack |
CN113076539A (en) * | 2021-04-13 | 2021-07-06 | 郑州信息科技职业学院 | Big data-based computer security protection system |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103761481A (en) * | 2014-01-23 | 2014-04-30 | 北京奇虎科技有限公司 | Method and device for automatically processing malicious code sample |
CN103853979A (en) * | 2010-12-31 | 2014-06-11 | 北京奇虎科技有限公司 | Program identification method and device based on machine learning |
CN103902906A (en) * | 2013-12-25 | 2014-07-02 | 武汉安天信息技术有限责任公司 | Mobile terminal malicious code detecting method and system based on application icon |
CN105938485A (en) * | 2016-04-14 | 2016-09-14 | 北京工业大学 | Image description method based on convolution cyclic hybrid model |
US20180063169A1 (en) * | 2016-09-01 | 2018-03-01 | Cylance Inc. | Container file analysis using machine learning model |
CN108898015A (en) * | 2018-06-26 | 2018-11-27 | 暨南大学 | Application layer dynamic intruding detection system and detection method based on artificial intelligence |
CN109165688A (en) * | 2018-08-28 | 2019-01-08 | 暨南大学 | A kind of Android Malware family classification device construction method and its classification method |
2019
- 2019-02-18 CN CN201910123265.3A patent/CN109871686A/en active Pending
Non-Patent Citations (3)
Title |
---|
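Zhuo Xinjian et al.: "Principles and Prevention of Computer Viruses", 30 April 2004, Beijing University of Posts and Telecommunications Press * |
Meng Xi: "Research on Malicious Code Classification and Clustering Techniques Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology * |
Li Deyi: "Introduction to Artificial Intelligence", 31 August 2018 * |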
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109871686A (en) | Rogue program recognition methods and device based on icon representation and software action consistency analysis | |
Warnecke et al. | Evaluating explanation methods for deep learning in security | |
Kolosnjaji et al. | Empowering convolutional networks for malware classification and analysis | |
CN106709345B (en) | Method, system and equipment for deducing malicious code rules based on deep learning method | |
CN109784056B (en) | Malicious software detection method based on deep learning | |
CN110020422B (en) | Feature word determining method and device and server | |
CN110837550A (en) | Knowledge graph-based question and answer method and device, electronic equipment and storage medium | |
Xiao et al. | Image-based malware classification using section distribution information | |
CN108491228A (en) | A kind of binary vulnerability Code Clones detection method and system | |
CN109446328A (en) | A kind of text recognition method, device and its storage medium | |
CN111866004B (en) | Security assessment method, apparatus, computer system, and medium | |
CN109344258A (en) | A kind of intelligent self-adaptive sensitive data identifying system and method | |
CN109063478A (en) | Method for detecting virus, device, equipment and the medium of transplantable executable file | |
CN116361801B (en) | Malicious software detection method and system based on semantic information of application program interface | |
CN109829302A (en) | Android malicious application family classification method, apparatus and electronic equipment | |
Chen et al. | Applying convolutional neural network for malware detection | |
CN103530312A (en) | User identification method and system using multifaceted footprints | |
Zhang et al. | Malicious code detection based on code semantic features | |
CN114386511B (en) | Malicious software family classification method based on multidimensional feature fusion and model integration | |
CN111400713A (en) | Malicious software family classification method based on operation code adjacency graph characteristics | |
Fonseca et al. | Model-agnostic approaches to handling noisy labels when training sound event classifiers | |
CN116663019B (en) | Source code vulnerability detection method, device and system | |
CN108985052A (en) | A kind of rogue program recognition methods, device and storage medium | |
EP4227855A1 (en) | Graph explainable artificial intelligence correlation | |
CN114817925B (en) | Android malicious software detection method and system based on multi-modal graph features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190611 |