CN106951780B - Static detection method and device for repackaged malicious applications - Google Patents

Static detection method and device for repackaged malicious applications

Info

Publication number
CN106951780B
Authority
CN
China
Prior art keywords
malicious
classes
application program
sequence
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710069633.1A
Other languages
Chinese (zh)
Other versions
CN106951780A (en)
Inventor
刘超
喻民
谭民
朱大立
姜建国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS
Priority to CN201710069633.1A
Publication of CN106951780A
Application granted
Publication of CN106951780B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/563 Static detection by source code analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention relates to a static detection method and device for repackaged malicious applications. The method comprises: obtaining the API call sequences of the installation package of an application program to be detected and the association relationships between the classes to which the API call sequences belong; constructing a class-based function call relationship graph; clustering the classes to obtain multiple clusters, and removing a preset number of clusters whose classes have the strongest association relationships to obtain a malicious code cluster; extracting sensitive API call sequences from the API call sequences of each class in the malicious code cluster, and performing similarity matching between the extracted sensitive API call sequence of each class and the feature sequence samples of malicious applications in a pre-established sample library; and determining whether the application program to be detected is a repackaged malicious application. The present invention does not depend on Android official applications and extracts malicious code in units of classes, so it can maintain relatively high accuracy even for variant malicious code.

Description

Static detection method and device for repackaged malicious applications
Technical Field
The invention relates to the technical field of malicious code detection, and in particular to a static detection method and a static detection device for repackaged malicious applications.
Background
With the rapid development of the mobile internet, sales of smart terminals (e.g., smartphones and tablets) are increasing rapidly owing to their portability, excellent performance, and rich functions (e.g., instant messaging, office work, and online games). At present, China has more than 800 million mobile Internet users, Google Play surpassed 1.4 million applications in 2015, and the various third-party application markets in China also host a large number of mobile applications. These applications bring great convenience, but they also bring serious information security hazards and risks. A study analyzing malicious applications on the Android system found that, of 1260 malicious application samples, 1083 (86%) had been produced by repackaging a legitimate application together with malicious code.
Faced with the flood of malicious repackaged applications on the Android platform, researchers in China and abroad have proposed various detection methods. DroidMOSS is a typical representative. It first assumes that the Android applications in the official Android application market are original, not repackaged, and non-malicious, and on that basis detects whether Android applications from other sources, such as third-party application markets, are repackaged malicious applications. The detection process uses a fuzzy hashing algorithm to generate a unique signature for each Android application from its instruction sequence, and then compares signatures pairwise to decide whether an application is malicious.
This detection method assumes that applications in the official Android application market are original, non-malicious, and not repackaged. The assumption is too optimistic in some respects, and it leaves the method unable to detect repackaged applications within the official market itself. Moreover, its ability to detect variant malicious code is quite limited, so the malicious sample library must be updated promptly. Both points lower the detection accuracy of DroidMOSS.
Disclosure of Invention
In view of the above defects, the invention provides a static detection method and a static detection device for repackaged malicious applications, which can improve detection accuracy.
In a first aspect, the static detection method for repackaged malicious applications provided by the present invention includes:
acquiring API calling sequences of an installation package of an application program to be detected and an association relation between classes to which each API calling sequence belongs;
constructing a function call relation graph based on classes according to the strength of the association relation between the classes; the nodes in the function call relation graph are classes;
according to the strength degree of the association relationship between the classes, clustering and dividing each class to obtain a plurality of clusters, and removing a preset number of clusters with the strongest association relationship between the classes in each cluster to obtain a malicious code cluster;
extracting sensitive API calling sequences from the API calling sequences of all classes in the malicious code cluster, and respectively carrying out similarity matching on the extracted sensitive API calling sequences of all classes and a characteristic sequence sample of a malicious application program in a pre-established sample library;
and determining whether the application program to be detected is a repackaged malicious application program or not according to the similarity matching result.
Optionally, obtaining the API call sequences of the installation package of the application to be detected includes: preprocessing the installation package to obtain the classes.dex file; decompiling the classes.dex file to obtain a smali file; and extracting the API call sequences from the smali file.
Optionally, preprocessing the installation package to obtain the classes.dex file includes: decompressing the installation package and extracting the classes.dex file.
Optionally, extracting the API call sequences from the smali file includes: searching and backtracking from the position in the smali file that corresponds to each entry point of the application to be detected, so as to extract the API call sequences.
Optionally, the performing similarity matching between the extracted sensitive API call sequence of each class and a pre-established sample of a feature sequence of a malicious application program in a sample library includes: carrying out similarity matching on the extracted sensitive API calling sequence of each class and family characteristics of the malicious application program families of the same class in the sample library; wherein the malicious application family comprises a plurality of malicious applications of the same category; the family features are a sequence of features of the malicious application family and include a sequence sample of sensitive API calls for each malicious application in the malicious code family.
Optionally, the method further includes: and if the application program to be detected is determined to be the repackaged malicious application program according to the similarity matching result, adding the sensitive API calling sequence of the application program to be detected into the family characteristics of the same category malicious application program family.
In a second aspect, the present invention provides a static detection apparatus for repackaging malicious applications, including:
the acquisition module is used for acquiring API calling sequences of the installation package of the application program to be detected and the association relation between the classes to which the API calling sequences belong;
the building module is used for building a function call relation graph based on classes according to the strength of the incidence relation between the classes; the nodes in the function call relation graph are classes;
the cluster module is used for clustering and dividing each class according to the strength degree of the association relationship between the classes to obtain a plurality of clusters, and removing the clusters with the strongest preset number of association relationships between the classes in each cluster to obtain a malicious code cluster;
the matching module is used for extracting sensitive API calling sequences from the API calling sequences of all classes in the malicious code cluster and respectively carrying out similarity matching on the extracted sensitive API calling sequences of all classes and a characteristic sequence sample of a malicious application program in a pre-established sample library;
and the determining module is used for determining whether the application program to be detected is a repackaged malicious application program or not according to the similarity matching result.
Optionally, the obtaining module includes:
the preprocessing unit is used for preprocessing the installation package to obtain the classes.dex file; the decompiling unit is used for decompiling the classes.dex file to obtain a smali file; and the extraction unit is used for extracting the API calling sequence from the smali file.
Optionally, the matching module is specifically configured to: carrying out similarity matching on the extracted sensitive API calling sequence of each class and family characteristics of the malicious application program families of the same class in the sample library; wherein the malicious application family comprises a plurality of malicious applications of the same category; the family features are a sequence of features of the malicious application family and include a sequence sample of sensitive API calls for each malicious application in the malicious code family.
Optionally, the apparatus further comprises:
and the updating module is used for adding the sensitive API calling sequence of the application program to be detected to the family characteristics of the same category of malicious application program families when the application program to be detected is determined to be the repackaged malicious application program according to the similarity matching result.
The static detection method and device for repackaged malicious applications provided by the invention do not assume, in any step, that applications in the official Android application market are original, non-malicious, and not repackaged. In other words, they do not depend on Android official applications, which further improves detection accuracy and also allows applications from the official market to be checked. In addition, in S3 the classes are clustered according to the strength of their calling relationships in order to extract the malicious code portion. Because this process operates in units of classes, even if the developer of the malicious program modifies the injected portion, the detection result can still be determined from the similarity, so relatively high accuracy can be maintained for variant malicious code. Furthermore, because the malicious code portion is extracted by clustering during detection, modifications to or deletions from the normal code portion of the application do not affect the detection result, which further improves detection accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart illustrating a static detection method for repackaging malicious applications according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating the API call sequence obtained in S1 according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating the structure of an application program according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating the function call relationship diagram constructed in S2 according to an embodiment of the present invention;
FIG. 5 is a schematic diagram illustrating the clusters obtained after the cluster analysis in S3 according to an embodiment of the present invention;
FIG. 6 shows a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
In a first aspect, the invention provides a static detection method for repackaging malicious applications, which can be used for detecting repackaged malicious applications and is suitable for Android application programs in an Android official application market, a third-party application market or other sources. As shown in fig. 1, the method includes:
S1, obtaining API calling sequences of the installation package of the application program to be detected and the association relation between the classes to which the API calling sequences belong;
It is understood that an API (Application Programming Interface) is a calling interface that the operating system exposes to application programs; by calling the operating system's APIs, an application program has the operating system execute its commands or actions. The API call sequence acquired in this step is illustrated in FIG. 2.
It is understood that the association between classes may include inheritance relationships, references, function call relationships, and the like.
S2, constructing a function call relation graph based on classes according to the strength of the incidence relation between the classes; the nodes in the function call relation graph are classes;
It can be understood that, during development, an application program is mainly organized in the form of classes and packages. As shown in FIG. 3, an APK (Android Package) contains n classes, each class contains multiple calling functions, and each calling function contains multiple API call sequences. During development, a developer places related code into a class and then organizes the classes into packages, so classes and packages carry specific semantic information. Accordingly, the API call sequences are grouped, that is, aggregated by class, to construct the function call relation graph, which is illustrated in FIG. 4.
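The aggregation by class described in S2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `build_class_graph` and the choice of call counts as the measure of association strength are assumptions, since the text does not fix a specific measure.

```python
from collections import Counter


def build_class_graph(calls):
    """Aggregate method-level calls into undirected, weighted class-level edges.

    `calls` is an iterable of (caller_class, callee_class) pairs.
    The weight of an edge is the number of calls between the two classes
    (an assumed measure of association strength).
    """
    weights = Counter()
    for caller, callee in calls:
        if caller != callee:  # calls inside one class add no edge
            weights[frozenset((caller, callee))] += 1
    return weights


# Hypothetical call pairs for illustration only.
demo_calls = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "A"), ("M", "N")]
graph = build_class_graph(demo_calls)
for edge, weight in sorted(graph.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(edge), weight)
```

Each node of the resulting graph is a class, and inheritance or reference relationships could be folded into the same weights if a combined strength is wanted.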
S3, according to the strength degree of the association relationship between the classes, clustering and dividing each class to obtain a plurality of clusters, and removing the clusters with the strongest preset number of association relationships between the classes in each cluster to obtain a malicious code cluster;
It can be understood that the clustering is performed according to the strength of the calling relationships between classes: classes with stronger calling relationships are grouped together, and classes with weaker calling relationships are grouped together. The threshold can be set according to actual needs; a calling relationship stronger than the threshold is considered strong, and otherwise it is considered weak. Research on malicious-application hiding techniques on the Android platform shows that, after a developer embeds a malicious application into a normal application by repackaging, the repackaged malicious application contains all components and instructions of the malicious application. To ensure that the application's original functions still execute normally, most repackaged malicious applications inject an independent component to carry out the malicious behavior, for example an independent broadcast listener that monitors the boot event of the mobile phone. After the phone restarts, the broadcast listener is triggered to perform the malicious activity. Because the components that perform malicious activity are independent, the malicious code portion has only a weak function call relationship with the code of the normal application. Therefore, the malicious code portion can be extracted from the APK file by clustering according to the strength of the calling relationships between classes. The clusters obtained after the cluster analysis in this step are illustrated in FIG. 5.
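One way to realize the clustering in S3 is sketched below, under stated assumptions: edges weaker than a threshold are ignored, connected components of the remaining graph are taken as clusters, and the clusters with the largest internal strength are removed as normal code. The names `cluster_classes`, `malicious_candidates`, `min_weight` and `drop_strongest` are hypothetical, and connected components stand in for whatever clustering algorithm is actually used.

```python
def cluster_classes(weights, min_weight):
    """Group classes into connected components over edges with weight >= min_weight."""
    neighbours = {}
    for pair, weight in weights.items():
        a, b = tuple(pair)
        neighbours.setdefault(a, set())
        neighbours.setdefault(b, set())
        if weight >= min_weight:
            neighbours[a].add(b)
            neighbours[b].add(a)
    clusters, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        members, stack = set(), [start]
        while stack:  # depth-first walk over strong edges
            node = stack.pop()
            if node not in members:
                members.add(node)
                stack.extend(neighbours[node] - members)
        seen |= members
        clusters.append(members)
    return clusters


def internal_strength(cluster, weights):
    """Sum of edge weights whose two endpoints both lie inside the cluster."""
    return sum(w for pair, w in weights.items() if set(pair) <= cluster)


def malicious_candidates(weights, min_weight=2, drop_strongest=1):
    """Drop the `drop_strongest` most tightly connected clusters; keep the rest."""
    clusters = cluster_classes(weights, min_weight)
    ranked = sorted(clusters, key=lambda c: internal_strength(c, weights), reverse=True)
    return ranked[drop_strongest:]


# Hypothetical class graph: the Boot/Sms pair is only weakly tied to the main code.
demo = {("Main", "Ui"): 5, ("Ui", "Net"): 4, ("Main", "Boot"): 1, ("Boot", "Sms"): 3}
print(malicious_candidates(demo))
```

In the demo, the single weak edge between `Main` and `Boot` falls below the threshold, so the injected component ends up in its own cluster and survives the removal of the strongest one.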
It should be noted that the meaning of the repackaged malicious application and the malicious application mentioned in the present invention is different, the malicious application only refers to all components and instructions for executing malicious behavior, and the repackaged malicious application includes the malicious application and normal code part, and is formed by repackaging after injecting the malicious application into the normal application.
S4, extracting sensitive API calling sequences from the API calling sequences of all classes in the malicious code cluster, and respectively carrying out similarity matching on the extracted sensitive API calling sequences of all classes and a characteristic sequence sample of a malicious application program in a pre-established sample library;
It is understood that a malicious code cluster is composed of several different classes, and its signature is composed of the sensitive API sequence signatures of those classes.
It is understood that a sensitive API call sequence is a call sequence of sensitive APIs. As shown in Table 1, sensitive APIs fall into a short message type, a device information type, a geographic information type, a broadcast type, a database type, a network type, a voice recording type, and other types. To carry out its malicious behavior, a repackaged malicious application must call sensitive APIs, so the sensitive API call sequence is used as the feature sequence of the application to be detected. Extracting the sensitive API call sequence and matching it for similarity against the feature sequence samples of malicious applications in the sample library makes it possible to determine whether the application to be detected is a repackaged malicious application.
Table 1. Description of the categories to which sensitive APIs belong
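Extracting a sensitive API call sequence amounts to filtering a class's API call sequence against a list of sensitive APIs while preserving call order. The sketch below assumes a small illustrative mapping; the three entries are real Android framework methods chosen as examples of the short message, device information and geographic information types, and they are not the contents of Table 1.

```python
# Illustrative subset only; not the patent's Table 1.
SENSITIVE_APIS = {
    "Landroid/telephony/SmsManager;->sendTextMessage": "short message",
    "Landroid/telephony/TelephonyManager;->getDeviceId": "device information",
    "Landroid/location/LocationManager;->getLastKnownLocation": "geographic information",
}


def sensitive_sequence(api_calls):
    """Keep only the sensitive APIs, in their original call order."""
    return [api for api in api_calls if api in SENSITIVE_APIS]


# Hypothetical API call sequence of one class in the malicious code cluster.
calls_of_class = [
    "Ljava/lang/StringBuilder;->append",
    "Landroid/telephony/TelephonyManager;->getDeviceId",
    "Ljava/lang/String;->length",
    "Landroid/telephony/SmsManager;->sendTextMessage",
]
print(sensitive_sequence(calls_of_class))
```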
S5, determining whether the application program to be detected is a repackaged malicious application program according to the similarity matching result.
It can be understood that if the similarity matching degree is high, the application program to be detected can be determined to be a repackaged malicious application program. If sensitive API call sequences exist for multiple classes, the application to be detected is also considered a repackaged malicious application as long as the sensitive API call sequence of any one of those classes matches the sample library with high similarity.
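The decision rule in S5 can be illustrated with a short sketch. The text does not name a similarity measure, so Python's `difflib.SequenceMatcher` ratio is swapped in here as a stand-in, and the threshold of 0.8 as well as the function names are assumptions.

```python
from difflib import SequenceMatcher


def similarity(seq_a, seq_b):
    """Similarity in [0, 1] between two API call sequences (stand-in measure)."""
    return SequenceMatcher(None, seq_a, seq_b).ratio()


def is_repackaged_malware(class_sequences, sample_library, threshold=0.8):
    """Flag the application if the sequence of any one class matches any sample."""
    return any(
        similarity(seq, sample) >= threshold
        for seq in class_sequences.values()
        if seq
        for sample in sample_library
    )


# Hypothetical sample library and per-class sensitive API sequences.
library = [["getDeviceId", "sendTextMessage", "abortBroadcast"]]
variant = ["getDeviceId", "getSubscriberId", "sendTextMessage", "abortBroadcast"]
app = {"Boot": variant, "Ui": ["getLastKnownLocation"]}
benign = {"Ui": ["getLastKnownLocation"]}

print(is_repackaged_malware(app, library))
print(is_repackaged_malware(benign, library))
```

The `variant` sequence contains one extra call compared with the library sample, yet it still scores above the threshold, which mirrors the claim that modified injected code can be caught by similarity rather than exact matching.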
The detection method provided by the invention does not assume, in any step, that applications in the official Android application market are original, non-malicious, and not repackaged. In other words, it does not depend on Android official applications, which further improves detection accuracy and also allows applications from the official market to be checked. In addition, in S3 the classes are clustered according to the strength of their calling relationships in order to extract the malicious code portion. Because this process operates in units of classes, even if the developer of the malicious program modifies the injected portion, the detection result can still be determined from the similarity, so relatively high accuracy can be maintained for variant malicious code. Furthermore, because the malicious code portion is extracted by clustering during detection, modifications to or deletions from the normal code portion of the application do not affect the detection result, which further improves detection accuracy.
When implemented, the specific process of S1 may include:
S11, preprocessing the installation package to obtain the classes.dex file.
It will be appreciated that the basic structure of the installation package includes:
META-INF\, which is commonly seen in JAR files;
res\, which is a directory for storing resource files;
xml, which is a program global configuration file;
classes.dex, which is the Dalvik bytecode;
arsc, which is a compiled binary resource file.
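As can be seen from the above structure, the classes.dex file holds the Dalvik bytecode of the application. Because an application installation package is essentially a compressed file in ZIP format, the classes.dex file can be obtained by decompressing the installation package and extracting it.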
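Since the installation package is a ZIP archive, the preprocessing in S11 can be sketched with Python's standard `zipfile` module. The helper names are hypothetical, and `make_fake_apk` only builds a tiny in-memory archive for demonstration; it is not a valid APK.

```python
import io
import zipfile


def extract_classes_dex(apk_bytes: bytes) -> bytes:
    """Return the raw classes.dex payload from an installation package given as bytes."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        # classes.dex sits at the root of the archive.
        return apk.read("classes.dex")


def make_fake_apk() -> bytes:
    """Build a tiny in-memory ZIP that mimics the package layout (demo only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("classes.dex", b"dex\n035\x00demo")
        z.writestr("res/placeholder.txt", b"")
    return buf.getvalue()


dex = extract_classes_dex(make_fake_apk())
print(len(dex), dex[:3])
```

`ZipFile.read` raises `KeyError` when the archive has no `classes.dex` entry, which a real pipeline would need to handle.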
S12, decompiling the classes.dex file to obtain a smali file.
It can be understood that decompilation is in essence a reverse-analysis technique, and the smali file is obtained by decompiling the classes.dex file.
S13, extracting an API calling sequence from the smali file.
In a specific implementation, the API call sequence may be extracted from the smali file in the following manner: searching and backtracking from the position in the smali file that corresponds to each entry point of the application program to be detected, so as to extract the API calling sequence. The return values and parameters of the called functions can also be extracted as additional information, so as to construct a function call relation graph carrying richer information.
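As a simplified illustration of S13, the sketch below scans smali text line by line and records every `invoke-*` instruction together with the class and method it appears in. It is an assumption-laden stand-in: it does not perform the entry-point search and backtracking described above, and the function name `extract_calls` and the sample smali fragment are hypothetical.

```python
import re

# Matches e.g.: invoke-virtual {v0, v1}, Lcom/demo/Helper;->run(Ljava/lang/String;)V
INVOKE_RE = re.compile(r"^invoke-[\w/-]+\s+\{[^}]*\},\s*(\[*L[^;]+;)->([^\s(]+)\(")


def extract_calls(smali_text):
    """Return (caller_class, caller_method, callee_class, callee_method) in file order."""
    calls = []
    current_class = current_method = None
    for raw in smali_text.splitlines():
        line = raw.strip()
        if line.startswith(".class "):
            current_class = line.split()[-1]   # last token is the class descriptor
        elif line.startswith(".method "):
            current_method = line.split()[-1]  # last token is name plus signature
        elif line.startswith(".end method"):
            current_method = None
        else:
            m = INVOKE_RE.match(line)
            if m and current_class and current_method:
                calls.append((current_class, current_method, m.group(1), m.group(2)))
    return calls


SAMPLE = """\
.class public Lcom/demo/Boot;
.super Landroid/content/BroadcastReceiver;

.method public onReceive(Landroid/content/Context;Landroid/content/Intent;)V
    invoke-static {}, Landroid/telephony/SmsManager;->getDefault()Landroid/telephony/SmsManager;
    invoke-virtual {v0, v1}, Lcom/demo/Helper;->run(Ljava/lang/String;)V
.end method
"""

calls = extract_calls(SAMPLE)
for call in calls:
    print(call)
```

The caller and callee classes in each tuple are exactly the pairs needed to build the class-based function call relation graph in S2.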
In the above, a manner of obtaining the API call sequence of the installation package is provided, and of course, other manners may be adopted to obtain the API call sequence, which is not limited in the present invention.
In a specific implementation, the similarity matching between the extracted sensitive API call sequences of each class and the feature sequence samples of the malicious application programs in the pre-established sample library in S4 may include:
carrying out similarity matching on the extracted sensitive API calling sequence of each class and family characteristics of the malicious application program families of the same class in the sample library;
the malicious application family comprises a plurality of malicious applications of the same category; the family features are a sequence of features of the malicious application family and include a sequence sample of sensitive API calls for each malicious application in the malicious code family.
Each known malicious application is assigned to a family in advance according to the category of malicious behavior it performs. Once the sensitive API call sequence has been extracted, it is matched for similarity against the family features of the malicious application family of the same category. This achieves family-level detection without comparing against the feature sequences of all malicious applications, which improves detection efficiency.
In a specific implementation, if the application to be detected is determined to be a repackaged malicious application according to the similarity matching result, its sensitive API call sequence can be added to the family features of the malicious application family of the same category, updating the sample library so that it continues to meet detection needs.
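Family-level matching and the sample library update can be sketched together. The class `FamilyLibrary`, the category label `sms_fraud`, the threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration; the text does not specify how the category of a sequence is determined, so the caller supplies it here.

```python
from difflib import SequenceMatcher


class FamilyLibrary:
    """Sample library keyed by malicious-behavior category (hypothetical layout)."""

    def __init__(self):
        self.families = {}  # category -> list of sensitive API call sequences

    def add(self, category, sequence):
        self.families.setdefault(category, []).append(list(sequence))

    def match(self, category, sequence):
        """Best similarity against the family of the same category only."""
        samples = self.families.get(category, [])
        return max(
            (SequenceMatcher(None, sequence, s).ratio() for s in samples),
            default=0.0,
        )

    def detect_and_update(self, category, sequence, threshold=0.8):
        """If the sequence matches its family, record it as a new family sample."""
        if self.match(category, sequence) >= threshold:
            self.add(category, sequence)
            return True
        return False


lib = FamilyLibrary()
lib.add("sms_fraud", ["getDeviceId", "sendTextMessage", "abortBroadcast"])
variant = ["getDeviceId", "getSubscriberId", "sendTextMessage", "abortBroadcast"]

print(lib.detect_and_update("sms_fraud", variant))
print(len(lib.families["sms_fraud"]))
```

Restricting the comparison to one family is what avoids matching against every known sample, and adding confirmed variants lets later variants of the same family match more closely.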
In actual testing on 1009 publicly available repackaged malicious Android applications, the detection accuracy reached 93%. The detection method provided by the invention therefore performs well in both accuracy and usability and can be applied to practical detection work.
In a second aspect, the present invention further provides an apparatus for statically detecting a repackaged malicious application, including:
the acquisition module is used for acquiring API calling sequences of the installation package of the application program to be detected and the association relation between the classes to which the API calling sequences belong;
the building module is used for building a function call relation graph based on classes according to the strength of the incidence relation between the classes; the nodes in the function call relation graph are classes;
the cluster module is used for clustering and dividing each class according to the strength degree of the association relationship between the classes to obtain a plurality of clusters, and removing the clusters with the strongest preset number of association relationships between the classes in each cluster to obtain a malicious code cluster;
the matching module is used for extracting sensitive API calling sequences from the API calling sequences of all classes in the malicious code cluster and respectively carrying out similarity matching on the extracted sensitive API calling sequences of all classes and a characteristic sequence sample of a malicious application program in a pre-established sample library;
and the determining module is used for determining whether the application program to be detected is a repackaged malicious application program or not according to the similarity matching result.
Optionally, the obtaining module includes:
the preprocessing unit is used for preprocessing the installation package to obtain the classes.dex file;
the decompiling unit is used for decompiling the classes.dex file to obtain a smali file;
and the extraction unit is used for extracting the API calling sequence from the smali file.
Optionally, the matching module is specifically configured to: carrying out similarity matching on the extracted sensitive API calling sequence of each class and family characteristics of the malicious application program families of the same class in the sample library; wherein the malicious application family comprises a plurality of malicious applications of the same category; the family features are a sequence of features of the malicious application family and include a sequence sample of sensitive API calls for each malicious application in the malicious code family.
Optionally, the apparatus further comprises:
and the updating module is used for adding the sensitive API calling sequence of the application program to be detected to the family characteristics of the same category of malicious application program families when the application program to be detected is determined to be the repackaged malicious application program according to the similarity matching result.
It can be understood that the static detection apparatus provided by the present invention is the functional module architecture of the static detection method. For explanations, optional implementations, beneficial effects, and other related content, refer to the corresponding content of the static detection method; these are not described here again.
The present invention also provides an electronic device, and referring to fig. 6, the electronic device includes: a processor (processor)601, a memory (memory)602, a communication Interface (Communications Interface)603, and a bus 604; wherein,
the processor 601, the memory 602 and the communication interface 603 complete mutual communication through the bus 604;
the communication interface 603 is used for information transmission between the electronic device and a corresponding communication device;
the processor 601 is configured to call program instructions in the memory 602 to perform the methods provided by the above-mentioned method embodiments, for example, including: acquiring API calling sequences of an installation package of an application program to be detected and an association relation between classes to which each API calling sequence belongs; constructing a function call relation graph based on classes according to the strength of the association relation between the classes; the nodes in the function call relation graph are classes; according to the strength degree of the association relationship between the classes, clustering and dividing each class to obtain a plurality of clusters, and removing a preset number of clusters with the strongest association relationship between the classes in each cluster to obtain a malicious code cluster; extracting sensitive API calling sequences from the API calling sequences of all classes in the malicious code cluster, and respectively carrying out similarity matching on the extracted sensitive API calling sequences of all classes and a characteristic sequence sample of a malicious application program in a pre-established sample library; and determining whether the application program to be detected is a repackaged malicious application program or not according to the similarity matching result.
The present invention also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method provided by the above-described method embodiments, for example comprising: acquiring API calling sequences of an installation package of an application program to be detected and an association relation between classes to which each API calling sequence belongs; constructing a function call relation graph based on classes according to the strength of the association relation between the classes; the nodes in the function call relation graph are classes; according to the strength degree of the association relationship between the classes, clustering and dividing each class to obtain a plurality of clusters, and removing a preset number of clusters with the strongest association relationship between the classes in each cluster to obtain a malicious code cluster; extracting sensitive API calling sequences from the API calling sequences of all classes in the malicious code cluster, and respectively carrying out similarity matching on the extracted sensitive API calling sequences of all classes and a characteristic sequence sample of a malicious application program in a pre-established sample library; and determining whether the application program to be detected is a repackaged malicious application program or not according to the similarity matching result.
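As a rough illustration of the acquisition and sensitive-API extraction steps, the following Python sketch pulls the invoked APIs out of the text of one smali file and keeps only those considered sensitive. It is a simplification under stated assumptions: it scans the file linearly rather than searching and backtracking from entry points, the regular expressions cover only the common `invoke-*` instruction forms, and the `SENSITIVE_PREFIXES` list is a hypothetical placeholder, not the patent's list of sensitive APIs.

```python
import re

# Matches e.g. "invoke-virtual {v0, v1}, Lpkg/Class;->method(" and the
# "/range" variants; captures the target class and the method name.
INVOKE_RE = re.compile(
    r"invoke-[a-z]+(?:/range)?\s+\{[^}]*\},\s*(L[^;]+;)->([\w$<>]+)\("
)
# Matches the ".class" directive that names the class defined by the file.
CLASS_RE = re.compile(r"^\.class\s+.*?(L[^;]+;)\s*$", re.MULTILINE)

# Hypothetical list of API packages treated as sensitive.
SENSITIVE_PREFIXES = ("Landroid/telephony/", "Landroid/location/")


def api_call_sequence(smali_text):
    """Return (owning_class, [API calls in order of appearance])."""
    match = CLASS_RE.search(smali_text)
    owner = match.group(1) if match else None
    calls = [f"{cls}->{name}" for cls, name in INVOKE_RE.findall(smali_text)]
    return owner, calls


def sensitive_calls(calls, prefixes=SENSITIVE_PREFIXES):
    """Keep only the calls whose target class starts with a sensitive prefix."""
    return [call for call in calls if call.startswith(prefixes)]
```

In practice the smali files would be produced by decompiling the classes.dex file of the installation package; the sketch only covers the text-processing part that follows.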
The present invention also provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above method embodiments, for example, comprising: acquiring API calling sequences of an installation package of an application program to be detected and an association relation between classes to which each API calling sequence belongs; constructing a function call relation graph based on classes according to the strength of the association relation between the classes; the nodes in the function call relation graph are classes; according to the strength degree of the association relationship between the classes, clustering and dividing each class to obtain a plurality of clusters, and removing a preset number of clusters with the strongest association relationship between the classes in each cluster to obtain a malicious code cluster; extracting sensitive API calling sequences from the API calling sequences of all classes in the malicious code cluster, and respectively carrying out similarity matching on the extracted sensitive API calling sequences of all classes and a characteristic sequence sample of a malicious application program in a pre-established sample library; and determining whether the application program to be detected is a repackaged malicious application program or not according to the similarity matching result.
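The similarity matching and decision steps can be sketched as follows. This is only an illustrative Python example under assumptions: the similarity measure used here is the ratio computed by the standard library's `difflib.SequenceMatcher`, which is a stand-in and not necessarily the measure intended by the invention; the `threshold` value, the dictionary layout of the sample library, and all function names are hypothetical.

```python
from difflib import SequenceMatcher


def sequence_similarity(seq_a, seq_b):
    """Similarity in [0, 1] between two API call sequences (lists of strings)."""
    return SequenceMatcher(None, seq_a, seq_b, autojunk=False).ratio()


def best_family_match(sequence, sample_library):
    """Find the family whose samples best match the sequence.

    sample_library maps a family name to a list of sensitive API call
    sequences, one per known malicious application of that family.
    """
    best_family, best_score = None, 0.0
    for family, samples in sample_library.items():
        for sample in samples:
            score = sequence_similarity(sequence, sample)
            if score > best_score:
                best_family, best_score = family, score
    return best_family, best_score


def detect_and_update(sequence, sample_library, threshold=0.8):
    """Return the matched family name, or None if nothing is similar enough.

    On a match, the new sequence is added to the family's features so that
    later detections can also match against it.
    """
    family, score = best_family_match(sequence, sample_library)
    if family is not None and score >= threshold:
        if sequence not in sample_library[family]:
            sample_library[family].append(list(sequence))
        return family
    return None
```

Adding the matched sequence back into the library mirrors the optional update of family features: each confirmed repackaged sample widens the set of sequences against which future applications are compared.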
Those of ordinary skill in the art will understand that all or part of the steps of the method embodiments may be implemented by program instructions directing the relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical disks.
The above-described embodiments of the electronic device and the like are merely illustrative, wherein the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the embodiments of the present invention, not to limit them; although the embodiments of the present invention have been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A static detection method for repackaged malicious applications, characterized by comprising the following steps:
acquiring API calling sequences of an installation package of an application program to be detected and an association relation between classes to which each API calling sequence belongs;
constructing a function call relation graph based on classes according to the strength of the association relation between the classes; the nodes in the function call relation graph are classes;
according to the strength degree of the association relationship between the classes, clustering and dividing each class to obtain a plurality of clusters, and removing a preset number of clusters with the strongest association relationship between the classes in each cluster to obtain a malicious code cluster;
extracting sensitive API calling sequences from the API calling sequences of all classes in the malicious code cluster, and respectively carrying out similarity matching on the extracted sensitive API calling sequences of all classes and a characteristic sequence sample of a malicious application program in a pre-established sample library;
and determining whether the application program to be detected is a repackaged malicious application program or not according to the similarity matching result.
2. The method according to claim 1, wherein the obtaining of the API call sequence of the installation package of the application to be detected comprises:
preprocessing the installation package to obtain a classes.dex file;
decompiling the classes.dex file to obtain a smali file;
and extracting an API calling sequence from the smali file.
3. The method of claim 2, wherein said preprocessing the installation package to obtain a classes.dex file comprises:
decompressing the installation package and extracting the classes.dex file.
4. The method of claim 2, wherein extracting the API call sequence from the smali file comprises:
and searching and backtracking from the corresponding position of each entry point of the application program to be detected in the smali file to extract the API calling sequence.
5. The method according to claim 1, wherein the similarity matching of the extracted sensitive API call sequences of each class with the feature sequence samples of malicious applications in the pre-established sample library respectively comprises:
carrying out similarity matching on the extracted sensitive API calling sequence of each class and family characteristics of the malicious application program families of the same class in the sample library;
the malicious application family comprises a plurality of malicious applications of the same category; the family features are a sequence of features of the malicious application family and include a sample of a sequence of sensitive API calls for each malicious application in the malicious application family.
6. The method of claim 5, further comprising:
and if the application program to be detected is determined to be the repackaged malicious application program according to the similarity matching result, adding the sensitive API calling sequence of the application program to be detected into the family characteristics of the same category malicious application program family.
7. A static detection apparatus for repackaged malicious applications, comprising:
the acquisition module is used for acquiring API calling sequences of the installation package of the application program to be detected and the association relation between the classes to which the API calling sequences belong;
the building module is used for building a function call relation graph based on classes according to the strength of the association relation between the classes; the nodes in the function call relation graph are classes;
the cluster module is used for clustering and dividing each class according to the strength degree of the association relationship between the classes to obtain a plurality of clusters, and removing a preset number of clusters with the strongest association relationship between the classes in each cluster to obtain a malicious code cluster;
the matching module is used for extracting sensitive API calling sequences from the API calling sequences of all classes in the malicious code cluster and respectively carrying out similarity matching on the extracted sensitive API calling sequences of all classes and a characteristic sequence sample of a malicious application program in a pre-established sample library;
and the determining module is used for determining whether the application program to be detected is a repackaged malicious application program or not according to the similarity matching result.
8. The apparatus of claim 7, wherein the obtaining module comprises:
the preprocessing unit is used for preprocessing the installation package to obtain a classes.dex file;
the decompiling unit is used for decompiling the classes.dex file to obtain a smali file;
and the extraction unit is used for extracting the API calling sequence from the smali file.
9. The apparatus of claim 7, wherein the matching module is specifically configured to: carrying out similarity matching on the extracted sensitive API calling sequence of each class and family characteristics of the malicious application program families of the same class in the sample library; wherein the malicious application family comprises a plurality of malicious applications of the same category; the family features are a sequence of features of the malicious application family and include a sample of a sequence of sensitive API calls for each malicious application in the malicious application family.
10. The apparatus of claim 9, further comprising:
and the updating module is used for adding the sensitive API calling sequence of the application program to be detected to the family characteristics of the same category of malicious application program families when the application program to be detected is determined to be the repackaged malicious application program according to the similarity matching result.
CN201710069633.1A 2017-02-08 2017-02-08 Beat again the static detection method and device of packet malicious application Active CN106951780B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710069633.1A CN106951780B (en) 2017-02-08 2017-02-08 Beat again the static detection method and device of packet malicious application

Publications (2)

Publication Number Publication Date
CN106951780A CN106951780A (en) 2017-07-14
CN106951780B true CN106951780B (en) 2019-09-10

Family

ID=59465812

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710069633.1A Active CN106951780B (en) 2017-02-08 2017-02-08 Beat again the static detection method and device of packet malicious application

Country Status (1)

Country Link
CN (1) CN106951780B (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107992742A (en) * 2017-10-27 2018-05-04 维沃移动通信有限公司 A kind of method and apparatus of installation kit identification
CN109918906B (en) * 2017-12-12 2022-09-02 财团法人资讯工业策进会 Abnormal behavior detection model generation device and abnormal behavior detection model generation method thereof
CN110390185B (en) * 2018-04-20 2022-08-09 武汉安天信息技术有限责任公司 Repackaging application detection method, rule base construction method and related device
CN109145589B (en) * 2018-07-27 2023-04-07 平安科技(深圳)有限公司 Application program acquisition method and device
CN109145605A (en) * 2018-08-23 2019-01-04 北京理工大学 A kind of Android malware family clustering method based on SinglePass algorithm
CN109462403A (en) * 2018-10-09 2019-03-12 北京邮电大学 A kind of method and system for realizing consignment address code
CN109635565A (en) * 2018-11-28 2019-04-16 江苏通付盾信息安全技术有限公司 The detection method of rogue program, calculates equipment and computer storage medium at device
CN110765457A (en) * 2018-12-24 2020-02-07 哈尔滨安天科技集团股份有限公司 Method and device for identifying homologous attack based on program logic and storage device
CN109858249B (en) * 2019-02-18 2020-08-07 暨南大学 Rapid intelligent comparison and safety detection method for mobile malicious software big data
CN110175045A (en) * 2019-05-20 2019-08-27 北京邮电大学 Android application program beats again bag data processing method and processing device
CN112035836B (en) * 2019-06-04 2023-04-14 四川大学 Malicious code family API sequence mining method
CN113051561A (en) * 2019-12-27 2021-06-29 中国电信股份有限公司 Application program feature extraction method and device and classification method and device
CN111339531B (en) * 2020-02-24 2023-12-19 南开大学 Malicious code detection method and device, storage medium and electronic equipment
CN111783095A (en) * 2020-07-28 2020-10-16 支付宝(杭州)信息技术有限公司 Method and device for identifying malicious code of applet and electronic equipment
CN112115480A (en) * 2020-09-09 2020-12-22 重庆广播电视大学重庆工商职业学院 Hotlinking risk reminding method, device and equipment used in cloud platform environment
CN112651024B (en) * 2020-12-29 2024-08-23 重庆大学 Method, device and equipment for detecting malicious codes
CN113641964B (en) * 2021-10-19 2022-05-17 北京邮电大学 Repackaging application detection method, electronic device and storage medium
CN116775050A (en) * 2022-03-07 2023-09-19 华为技术有限公司 Method, terminal and server for identifying SDK in application program

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103473346A (en) * 2013-09-24 2013-12-25 北京大学 Android re-packed application detection method based on application programming interface
CN103473507A (en) * 2013-09-25 2013-12-25 西安交通大学 Android malicious software detection method based on method call graph
CN104091121A (en) * 2014-06-12 2014-10-08 上海交通大学 Method for detecting, removing and recovering malicious codes of Android repackaging malicious software
CN104331436A (en) * 2014-10-23 2015-02-04 西安交通大学 Rapid classification method of malicious codes based on family genetic codes
CN104778409A (en) * 2015-04-16 2015-07-15 电子科技大学 Method and device for detecting similarities of Android application software
CN105205356A (en) * 2015-09-17 2015-12-30 清华大学深圳研究生院 APP application re-packaging detection method

Also Published As

Publication number Publication date
CN106951780A (en) 2017-07-14

Similar Documents

Publication Publication Date Title
CN106951780B (en) Beat again the static detection method and device of packet malicious application
CN108133139B (en) Android malicious application detection system based on multi-operation environment behavior comparison
Crussell et al. Andarwin: Scalable detection of android application clones based on semantics
CN104715196B (en) The Static Analysis Method and system of smart mobile phone application program
US9454658B2 (en) Malware detection using feature analysis
US9792433B2 (en) Method and device for detecting malicious code in an intelligent terminal
CN103473346B (en) A kind of Android based on application programming interface beats again bag applying detection method
US20160063244A1 (en) Method and system for recognizing advertisement plug-ins
CN109271788B (en) Android malicious software detection method based on deep learning
CN107346284B (en) Application program detection method and detection device
KR20150044490A (en) A detecting device for android malignant application and a detecting method therefor
KR20170068814A (en) Apparatus and Method for Recognizing Vicious Mobile App
CN103679030B (en) Malicious code analysis and detection method based on dynamic semantic features
US10296743B2 (en) Method and device for constructing APK virus signature database and APK virus detection system
CN108090360B (en) Behavior feature-based android malicious application classification method and system
CN113468524B (en) RASP-based machine learning model security detection method
CN105205398B (en) It is a kind of that shell side method is looked into based on APK shell adding software dynamic behaviours
KR101803888B1 (en) Method and apparatus for detecting malicious application based on similarity
CN112231697A (en) Third-party SDK behavior detection method, device, medium and electronic equipment
CN111324893B (en) Detection method and background system for android malicious software based on sensitive mode
CN109670311A (en) Malicious code analysis and detection method based on high-level semantics
Akram et al. DroidMD: an efficient and scalable android malware detection approach at source code level
CN108229168B (en) Heuristic detection method, system and storage medium for nested files
CN103093147B (en) A kind of method identifying information and electronic installation
Feichtner et al. Obfuscation-resilient code recognition in Android apps

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant