WO2014166312A1 - Method and system for identifying advertisement plug-ins - Google Patents

Method and system for identifying advertisement plug-ins

Info

Publication number
WO2014166312A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
dimension
advertisement
feature vector
plug
Prior art date
Application number
PCT/CN2014/071596
Other languages
English (en)
French (fr)
Inventor
张迪
唐淳
Original Assignee
北京奇虎科技有限公司
奇智软件(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京奇虎科技有限公司 and 奇智软件(北京)有限公司
Priority to US14/783,042 (US9824212B2)
Publication of WO2014166312A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/552 Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3347 Query execution using vector based model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/4424 Monitoring of the internal components or processes of the client device, e.g. CPU or memory load, processing speed, timer, counter or percentage of the hard disk space used
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/443 OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
    • H04N21/4431 OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB characterized by the use of Application Program Interface [API] libraries
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/812 Monomedia components thereof involving advertisement data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/8166 Monomedia components thereof involving executable data, e.g. software
    • H04N21/8193 Monomedia components thereof involving executable data, e.g. software dedicated tools, e.g. video decoder software or IPMP tool
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03 Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/034 Test or assess a computer or a system

Definitions

  • The present invention relates to the field of computer technologies, and in particular to a method and system for identifying an advertisement plug-in, a computer program, and a computer-readable medium.
  • Background art
  • Smart mobile terminals, such as iPhones and smartphones running Android (a free, open-source operating system based on Linux), are becoming increasingly popular.
  • Various mobile applications have sprung up on smart mobile terminals, and more and more advertisement plug-ins are embedded in those applications. At the least, malicious advertising software harasses users; at worst, it leaks user privacy. On a mobile phone in particular, it may consume a large amount of data traffic or even secretly send fee-deducting SMS messages, causing various losses to the user.
  • The first step in protecting users from malicious advertisements is to identify which applications are adware and what harm they cause, so that users can learn whether the adware is malicious and harmful and can choose to uninstall it. Identification also provides data support for further ad blocking.
  • In the prior art, identification of advertisement plug-ins on smart mobile terminals mostly relies on simple, fixed detection of advertisement component names to determine whether an application contains an advertisement plug-in. However, many advertisers obfuscate the software code when embedding advertisement components in applications, so there may be no obvious component names to distinguish. The prior art therefore cannot identify advertisement plug-ins accurately, and the recognition rate is low.
  • An advertisement plug-in identification method is provided, including: searching for each file related to an application plug-in; scanning each file related to the application plug-in based on the feature vector of each feature dimension in a predetermined advertisement feature vector set, and calculating the feature vector similarity between the data in each file and the feature vector of each feature dimension;
  • An advertisement plug-in identification system is provided, including: a lookup module configured to search for each file related to an application plug-in;
  • a feature scanning module configured to scan each file related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and to calculate the feature vector similarity between the data in each file and the feature vector of each feature dimension;
  • an advertisement similarity calculation module configured to calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
  • a determining module configured to compare the advertisement similarity with a threshold and to determine, according to the comparison result, whether the application plug-in is an advertisement plug-in.
  • A computer program comprising computer-readable code which, when run on a mobile terminal, causes the mobile terminal to perform the advertisement plug-in identification method according to any one of claims 1-10.
  • A computer-readable medium storing the computer program according to claim 21.
  • The beneficial effects of the invention are:
  • The advertisement plug-in identification method can analyze feature data in various feature dimensions for an application plug-in of the smart terminal and comprehensively determine, from the feature data in those dimensions, whether the application plug-in includes an advertisement plug-in. This solves the problems of the prior art and enables effective behavior detection on the identified advertisement plug-in.
  • FIG. 1 is a schematic flow chart of an advertisement plug-in identification method according to Embodiment 1 of the present invention.
  • FIG. 2 is a schematic flow chart of an advertisement plug-in identification method according to Embodiment 2 of the present invention.
  • FIG. 3 is a schematic flow chart of an advertisement plug-in identification method according to Embodiment 3 of the present invention.
  • FIG. 4 is a schematic structural diagram of an advertisement plug-in identification system according to Embodiment 4 of the present invention.
  • FIG. 5 is a schematic structural diagram of an advertisement plug-in identification system according to Embodiment 5 of the present invention.
  • FIG. 6 is a schematic structural diagram of an advertisement plug-in identification system according to Embodiment 6 of the present invention.
  • FIG. 7 is a block diagram schematically showing a mobile terminal for performing the method according to the present invention.
  • FIG. 8 schematically shows a storage unit for holding or carrying program code implementing the method according to the present invention.
  • Detailed description
  • Embodiment 1: Referring to FIG. 1, a schematic flow chart of an advertisement plug-in identification method according to Embodiment 1 of the present invention is shown. The method may specifically include:
  • Step 110: Find each file related to the application plug-in.
  • The installation package of the application plug-in and the related files at its installation location are searched. For example, the installation package of an application plug-in is stored on the phone's SD card (Secure Digital Memory Card), for example in the root directory as SD:\A. After the application plug-in is installed, the resulting files are located under SD:\program\ml on the phone's SD card. These files include configuration files and executable files (such as .dex files, which are generally the executable files on Android).
  • Optionally, step 110 may be: searching the smart mobile terminal for each file related to the application plug-in.
  • That is, the present application may target application plug-ins in a smart mobile terminal, such as the apps on a smartphone.
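  • A minimal sketch of step 110, assuming the installation package is an Android APK (which is a ZIP archive). The function name `find_plugin_files` and the grouping of entries by file extension are illustrative assumptions, not details taken from the patent.

```python
import io
import zipfile


def find_plugin_files(apk_bytes: bytes) -> dict:
    """Group the entries of an installation package by their role in scanning."""
    groups = {"config": [], "executable": [], "other": []}
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as package:
        for name in package.namelist():
            lower = name.lower()
            if lower.endswith(".dex"):
                groups["executable"].append(name)   # Android executable code
            elif lower.endswith(".xml"):
                groups["config"].append(name)       # configuration files
            else:
                groups["other"].append(name)
    return groups


# Build a tiny stand-in package in memory so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("AndroidManifest.xml", "<manifest/>")
    demo.writestr("classes.dex", b"dex\n035\x00")
    demo.writestr("res/icon.png", b"")

files = find_plugin_files(buffer.getvalue())
print(files)
```

  • The returned groups correspond to the configuration files and executable files that the later scanning steps read.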
  • Step 120: Scan each file based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and calculate the feature vector similarity between the data in each file and the feature vector of each feature dimension.
  • Optionally, the method further includes:
  • Step S100: In the cloud server, analyze each application plug-in in the advertisement sample library to obtain the feature data in each feature dimension, and construct the advertisement feature vector set from that feature data.
  • In this embodiment, the advertisement feature vector set D(d1, d2, d3, ..., dn) may be built in advance, where n is the number of feature vectors and each feature vector corresponds to a feature matching condition, such as:
  • 1) Constant pool: advertisement features specific to a plug-in are identified through the string constant pool. For example, the plug-in version number string and the domain name strings that the advertisement plug-in connects to are stored in the constant pool.
  • For example, statistics over the constant pools of adware yield 100 strings, which can be combined into various string conditions, such as "string A and string B; string C or string D or string E; not string F". If such a string condition determines the constant pool dimension of an advertisement plug-in, then "string A and string B; string C or string D or string E; not string F" is the feature matching condition for that feature vector.
  • 2) Package name and class name: an advertisement plug-in contains specific package names and class names, which can be used to determine whether a specific advertisement plug-in is included.
  • The package name and class name feature values should be class names that will not be obfuscated, such as the class names of the components included in the plug-in and the class name of the advertisement View.
  • 3) Configuration declarations: advertisement plug-ins declare the information they require in configuration files, and this feature can improve the recognition rate.
  • Many advertisement plug-ins store information such as their AppKey in, for example, AndroidManifest.xml or a custom configuration file.
  • The AppKey is the unique ID that the advertising provider issues to the developer.
  • 4) Class inheritance relationship sequence: in a specific application scenario, it may be necessary to identify exactly which components in the program installation package are advertisement components, rather than only whether advertisements are included. Because some components inherit from a known advertisement component, whether a component is an advertisement component can be determined from the sequence of inheritance relationships.
  • For example, if the class inheritance relationship used to identify the adware is class a -> class b -> class c, or class a -> class b -> class d, then "class a -> class b -> class c, or class a -> class b -> class d" is the feature matching condition in this dimension.
  • 5) Function call sequence: for example, if the function call sequence used to identify the adware is function a -> function b -> function c + function d, or class a -> class b -> function c + function d, then "function a -> function b -> function c + function d, or class a -> class b -> function c + function d" is the feature matching condition in this dimension.
  • 6) Installation package: the feature can be determined from the MD5 of the plug-in's released installation package.
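  • The class inheritance dimension described above can be sketched as follows. This is an illustration under stated assumptions: the `parent_of` mapping (class name to superclass name) stands in for data extracted from the executable file, the patterns are written from the known advertisement component down to the derived class, and all names are hypothetical.

```python
def inheritance_chain(class_name, parent_of):
    """Walk superclass links upward and return [class, parent, grandparent, ...]."""
    chain, seen = [], set()
    current = class_name
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)          # guard against malformed cyclic data
        current = parent_of.get(current)
    return chain


def contains_sequence(chain, pattern):
    """True if pattern occurs as a contiguous run inside chain."""
    n = len(pattern)
    return any(chain[i:i + n] == pattern for i in range(len(chain) - n + 1))


def matches_ad_inheritance(class_name, parent_of, patterns):
    """Check a class against the inheritance patterns of the feature vector."""
    # Reverse so the chain reads from the base class down to class_name.
    chain = inheritance_chain(class_name, parent_of)[::-1]
    return any(contains_sequence(chain, pattern) for pattern in patterns)


# Hypothetical data: c and d both extend b, which extends the ad component a.
parent_of = {"c": "b", "b": "a", "d": "b", "x": "y"}
patterns = [["a", "b", "c"], ["a", "b", "d"]]

print(matches_ad_inheritance("c", parent_of, patterns))
print(matches_ad_inheritance("x", parent_of, patterns))
```

  • The same contiguous-sequence check could be reused for function call sequences, given a list of calls instead of a class chain.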
  • For each application plug-in sample in the cloud sample library, the feature data for dimensions such as 1) to 5) above can be obtained by analyzing the application plug-in as follows:
  • For the executable file, taking the .dex file as an example, the location of the class structures can be found through the index in the .dex file header and the offset values, so that the package names and class names can be extracted from those structures, and the class inheritance relationships can be found from the parent-child inheritance of each class. The location of the method descriptors can likewise be found from the index in the .dex file header and the offset values. The method descriptors record the calling relationships between functions during execution of the plug-in, from which the function call relationships can be found.
  • Step S121: Add application plug-in samples that contain advertisements to the advertisement sample library according to users' feedback on the application plug-ins.
  • A plug-in for which user feedback identifying it as an advertisement plug-in reaches a certain value can be added to the advertisement sample library. This continuously improves the data source and the accuracy of the advertisement feature vector set.
  • Optionally, the method further includes: in the cloud server, pre-compiling the advertisement feature vector set constructed from the feature data in each feature dimension into a binary XML format.
  • Pre-compiling the feature data into a binary XML format has two advantages. First, parsing is faster, which suits mobile devices with little memory and limited CPU. Second, redundant strings in the XML file, such as attribute names and element names, are referenced by index into a shared string pool, which greatly reduces the size of the data file and suits network transmission, especially over mobile networks.
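  • The shared string pool idea can be illustrated with a short sketch. It shows only the principle of replacing repeated strings with indexes; it is not the actual binary XML encoding, and the function name `build_string_pool` and the token list are assumptions made for the example.

```python
def build_string_pool(tokens):
    """Replace each string with an index into a de-duplicated pool."""
    pool, index_of, refs = [], {}, []
    for token in tokens:
        if token not in index_of:
            index_of[token] = len(pool)   # first occurrence gets a new slot
            pool.append(token)
        refs.append(index_of[token])
    return pool, refs


# Element and attribute names repeat many times in a feature file.
tokens = ["feature", "name", "feature", "name", "feature"]
pool, refs = build_string_pool(tokens)
print(pool)
print(refs)
```

  • Each repeated name is stored once in the pool, and every later occurrence costs only a small integer.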
  • Step 120 is then performed.
  • Optionally, scanning each file related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and calculating the feature vector similarity between the data in each file and the feature vector of each feature dimension, includes:
  • Because each feature vector is obtained by scanning a specified file or a specified location, the scan location specified by each feature dimension of the predetermined advertisement feature vector set is scanned to obtain the feature data in the corresponding feature dimension.
  • Optionally, obtaining the feature data in the corresponding feature dimension according to the scan location specified by each feature dimension of the predetermined advertisement feature vector set includes:
  • Step b11: Scan the installation package of the application plug-in, and obtain from the installation package each piece of feature information in the installation package dimension of the advertisement feature vector set as the first feature vector. For example, the installation package corresponding to the application plug-in is scanned and its MD5 value is calculated; the MD5 value is used as the feature value of the feature vector in the installation package dimension.
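  • A minimal sketch of the MD5 feature in step b11, using Python's standard `hashlib`. Treating an exact MD5 match as similarity 1.0 and anything else as 0.0 is an assumption made for illustration, and the entry in `KNOWN_AD_PACKAGE_MD5` is a placeholder rather than a real advertisement package.

```python
import hashlib
import io


def package_md5(stream, chunk_size: int = 65536) -> str:
    """Compute the MD5 hex digest of an installation package, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Placeholder feature values; this one is simply the MD5 of b"hello".
KNOWN_AD_PACKAGE_MD5 = {"5d41402abc4b2a76b9719d911017c592"}


def installation_package_similarity(stream) -> float:
    """Assumed scoring: 1.0 on an exact MD5 match, otherwise 0.0."""
    return 1.0 if package_md5(stream) in KNOWN_AD_PACKAGE_MD5 else 0.0


print(package_md5(io.BytesIO(b"hello")))
print(installation_package_similarity(io.BytesIO(b"something else")))
```

  • Reading in chunks keeps memory use flat for large packages, which matters on the low-memory devices the description targets.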
  • Step b12: Scan the configuration file, and obtain from the configuration file each piece of declaration information in the configuration information dimension of the advertisement feature vector set as the first feature vector.
  • For example, a configuration file obtained from the installation package, such as AndroidManifest.xml, is scanned, and declaration information such as the AppKey is extracted from it as the feature value of the feature vector in this dimension.
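  • A sketch of step b12 under two assumptions: the manifest has already been decoded to plain-text XML (inside an APK it is stored in a binary form), and the AppKey is declared in a `<meta-data>` element. The key name `EXAMPLE_AD_APPKEY` and its value are hypothetical.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"


def extract_declarations(manifest_xml: str) -> dict:
    """Return {name: value} for every <meta-data> element in a decoded manifest."""
    root = ET.fromstring(manifest_xml)
    declarations = {}
    for meta in root.iter("meta-data"):
        name = meta.get(f"{{{ANDROID_NS}}}name")     # android:name attribute
        value = meta.get(f"{{{ANDROID_NS}}}value")   # android:value attribute
        if name is not None:
            declarations[name] = value
    return declarations


SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android">
  <application>
    <meta-data android:name="EXAMPLE_AD_APPKEY" android:value="abc123"/>
  </application>
</manifest>"""

declarations = extract_declarations(SAMPLE_MANIFEST)
print(declarations)
```

  • The extracted names can then be compared with the declaration names recorded in the feature vector for this dimension.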
  • Step b13: Scan the constant pool in the executable file, and obtain from the constant pool each string in the constant pool dimension of the advertisement feature vector set as the first feature vector.
  • As described above, advertisement features specific to a plug-in are identified through the string constant pool; for example, the plug-in version number string and the domain name strings that the advertisement plug-in connects to are stored there. This step obtains those strings as the feature values of the feature vector in this dimension.
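  • A hedged sketch of reading the string table of a .dex file for step b13. It relies on the published .dex layout (string_ids_size and string_ids_off at header offsets 0x38 and 0x3C, each string stored as a ULEB128 length followed by NUL-terminated data) and, as a simplification, decodes the MUTF-8 data as UTF-8. The function names are illustrative.

```python
import struct


def read_uleb128(data: bytes, offset: int):
    """Decode an unsigned LEB128 integer; return (value, next_offset)."""
    result, shift = 0, 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, offset
        shift += 7


def dex_strings(data: bytes) -> list:
    """Return the strings stored in a .dex file's string table."""
    if data[:4] != b"dex\n":
        raise ValueError("not a .dex file")
    # Header fields: number of string ids and offset of the string id table.
    count, table_off = struct.unpack_from("<II", data, 0x38)
    strings = []
    for i in range(count):
        (data_off,) = struct.unpack_from("<I", data, table_off + 4 * i)
        _, start = read_uleb128(data, data_off)   # skip the length prefix
        end = data.index(b"\x00", start)          # data is NUL-terminated
        strings.append(data[start:end].decode("utf-8", errors="replace"))
    return strings
```

  • The resulting list is the set of constant pool strings that the matching conditions of this dimension are evaluated against.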
  • Step b14: Scan the class structure in the executable file, and obtain from the class structure each package name and class name in the package name and class name dimension of the advertisement feature vector set as the first feature vector.
  • For example, the .dex file obtained from the installation package of the application plug-in is scanned to locate the class structure, and the package names and class names are obtained from the class structure as the feature values of the feature vector in the package name and class name dimension.
  • Step b15: Scan the class structure in the executable file, and obtain, according to the class structure, each class inheritance relationship in the inheritance relationship sequence dimension of the advertisement feature vector set as the first feature vector.
  • Step b16: Scan the method descriptors in the executable file, and obtain from the method descriptors the function call sequences in the function call sequence dimension of the advertisement feature vector set as the first feature vector.
  • For example, the .dex file obtained from the installation package of the application plug-in is scanned to locate the method descriptors. The function call relationship sequence is extracted from the function calling relationships recorded by the method descriptors and used as the feature value in the function call sequence dimension.
  • Step A12: Calculate the feature vector similarity between the feature data and the feature value of the feature vector in the corresponding feature dimension of the advertisement feature vector set, to obtain the feature vector similarity in that feature dimension.
  • Optionally, comparing the feature data with the feature value of the feature vector in the corresponding feature dimension of the advertisement feature vector set to obtain the feature vector similarity in that feature dimension includes:
  • Step b17: Match the feature data against the feature matching conditions corresponding to the feature vector in that feature dimension of the advertisement feature vector set, and calculate the feature vector similarity in that feature dimension according to the matching result.
  • For example, take the feature matching condition "string A and string B; string C or string D or string E; not string F" in the constant pool dimension of the advertisement feature vector set described above. Suppose that scanning the constant pool of the current application plug-in yields the strings A, C, and N. These satisfy the sub-conditions "string C or string D or string E" and "not string F" but not "string A and string B", so the calculated similarity is 2/3.
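  • The 2/3 example can be reproduced with a short sketch. The scoring rule used here, the fraction of sub-conditions that are satisfied, is an assumption that is consistent with the example; the patent text does not state the formula explicitly.

```python
from fractions import Fraction

# Each sub-condition is a predicate over the set of strings found in the
# constant pool of the scanned plug-in.
CONSTANT_POOL_CONDITIONS = [
    lambda found: "A" in found and "B" in found,   # string A and string B
    lambda found: bool(found & {"C", "D", "E"}),   # string C or D or E
    lambda found: "F" not in found,                # not string F
]


def dimension_similarity(found, conditions):
    """Fraction of sub-conditions satisfied by the scanned feature data."""
    matched = sum(1 for condition in conditions if condition(found))
    return Fraction(matched, len(conditions))


similarity = dimension_similarity({"A", "C", "N"}, CONSTANT_POOL_CONDITIONS)
print(similarity)  # 2/3
```

  • Using `Fraction` keeps the result exact; a float would serve equally well once the value is fed into the weighted calculation of step 130.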
  • Step 130: Calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension.
  • A corresponding feature recognition weight may be set for each feature dimension. The feature vector similarities and the corresponding feature recognition weights are then substituted into the similarity calculation function f to calculate the advertisement similarity V of the current application plug-in.
  • For example, for the preset advertisement feature vector set D(d1, d2, d3, ..., dn) described above, where n is the number of feature vectors, the corresponding weights are W(w1, w2, w3, ..., wn).
  • Step 140: Compare the advertisement similarity with a threshold, and determine, according to the comparison result, whether the application plug-in is an advertisement plug-in.
  • In this embodiment, an advertisement similarity threshold t may also be set.
  • When V > t, the application plug-in can be determined to be an advertisement plug-in.
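  • Steps 130 and 140 can be sketched together. The patent leaves the calculation function f unspecified, so the weighted average below is only one plausible choice, labeled here as an assumption; the similarity values, weights, and threshold are likewise made up for the example.

```python
def ad_similarity(similarities, weights):
    """Assumed form of f: weighted average of per-dimension similarities."""
    if len(similarities) != len(weights):
        raise ValueError("one weight is required per feature dimension")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not all be zero")
    return sum(s * w for s, w in zip(similarities, weights)) / total


def is_ad_plugin(similarities, weights, threshold):
    """Step 140: the plug-in is judged an ad plug-in when V > t."""
    return ad_similarity(similarities, weights) > threshold


# Hypothetical values for three feature dimensions.
similarities = [1.0, 0.5, 0.0]
weights = [2, 1, 1]

v = ad_similarity(similarities, weights)
print(v)
print(is_ad_plugin(similarities, weights, threshold=0.6))
```

  • Giving a larger weight to dimensions that resist obfuscation, such as the installation package MD5, would let a single strong match dominate the result.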
  • Optionally, the method further includes:
  • Step C11: Record the scan and judgment result of each application plug-in; when scanning again, skip application plug-ins that have already been judged, according to the recorded results.
  • For example, when the system of the present invention is reinstalled on the smart mobile terminal, or an installed application plug-in is reinstalled, application plug-ins that have already been scanned and judged can be handled quickly by reusing the recorded results.
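  • A minimal sketch of step C11. Keying the recorded results by the installation package MD5 is an assumption made for the example, as are the class name `ScanResultCache` and the stand-in scan function.

```python
class ScanResultCache:
    """Remembers the verdict for each package so repeat scans can be skipped."""

    def __init__(self):
        self._verdicts = {}

    def judge(self, package_key, scan):
        """Return the cached verdict, running scan() only for unseen packages."""
        if package_key not in self._verdicts:
            self._verdicts[package_key] = scan()
        return self._verdicts[package_key]


calls = []


def expensive_scan():
    calls.append(1)   # count how often the full scan really runs
    return True       # pretend the plug-in was judged to be an ad plug-in


cache = ScanResultCache()
first = cache.judge("5d41402abc4b2a76b9719d911017c592", expensive_scan)
second = cache.judge("5d41402abc4b2a76b9719d911017c592", expensive_scan)
print(first, second, len(calls))
```

  • The second call returns the recorded verdict without running the scan again.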
  • Application scenarios of the present invention include:
  • An end user can run a scan on the smartphone to see which adware is present and what it does, and can choose to uninstall that software. The scan results can also supply data to the ad blocking module.
  • Software can be scanned on a platform for smart mobile terminals (such as an application market), so that users know whether the software is adware, and how it behaves, before downloading and installing it.
  • The embodiment of the invention has complete feature recognition rules that can be combined with cloud data for feature recognition. It achieves fine-grained and accurate identification of adware, offers strong feature-matching ability against obfuscated adware code, and improves the recognition rate of advertisement plug-ins.
  • The embodiment of the present invention can analyze feature data in various feature dimensions for an application plug-in of the smart terminal and comprehensively determine, from the feature data in those dimensions, whether the application plug-in includes an advertisement plug-in. This solves the prior-art problems of inaccurate identification of advertisement plug-ins, inability to identify advertisement plug-ins in obfuscated applications, and low recognition rates. It provides complete feature recognition rules that can be combined with cloud data for feature recognition, achieves fine-grained and accurate identification of adware, offers strong feature-matching ability against obfuscated adware code, and enables effective behavior detection on the identified advertisement plug-in.
  • Embodiment 2: Referring to FIG. 2, a schematic flow chart of an advertisement plug-in identification method according to Embodiment 2 of the present invention is shown.
  • Step 210: Find each file related to the application plug-in.
  • Step 220: Scan each file related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and calculate the feature vector similarity between the data in each file and the feature vector of each feature dimension; the advertisement feature vector set is constructed and delivered by the cloud server.
  • Step 230: Calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension.
  • Step 240: Compare the advertisement similarity with a threshold, and determine, according to the comparison result, whether the application plug-in is an advertisement plug-in.
  • Step 250: If the application plug-in is an advertisement plug-in, detect the operation behavior of the advertisement plug-in on the smart mobile terminal system.
  • Steps 210 to 240 in this embodiment are substantially the same as the corresponding steps in Embodiment 1 and are not described in detail here.
  • For step 250, an advertisement behavior analysis engine may be set up to perform behavior detection on the advertisement plug-in, for example whether the advertisement plug-in requests advertisement content from the network, whether it extracts the user's private information, and whether the destination to which the private information is sent is an external network.
  • For example, an active defense engine monitors calls to application programming interfaces (APIs) in real time, such as a call to the API for reading SMS content. Alternatively, the behavior of the plug-in is derived from the advertisement features, such as the description of a function call sequence, or static analysis of the plug-in code determines whether a sensitive API is called, such as the API for reading contacts.
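  • The static check for sensitive API calls can be sketched as a simple lookup. The API identifiers and behavior labels in `SENSITIVE_APIS` are hypothetical placeholders, not real Android API names, and the list of called APIs is assumed to have been extracted from the plug-in code beforehand.

```python
# Hypothetical API identifiers mapped to the behavior they indicate.
SENSITIVE_APIS = {
    "sms.read_messages": "reads SMS content",
    "contacts.read": "reads contacts",
}


def detect_sensitive_behaviors(called_apis):
    """Return the sorted, de-duplicated behaviors implied by the called APIs."""
    return sorted({SENSITIVE_APIS[api] for api in called_apis if api in SENSITIVE_APIS})


behaviors = detect_sensitive_behaviors(
    ["ui.show_banner", "contacts.read", "contacts.read", "sms.read_messages"]
)
print(behaviors)
```

  • The resulting list is the kind of summary that could be shown to the user as the advertisement behavior of the plug-in.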
  • The behavior detection may be performed in real time on the smart mobile terminal and reported to the user, or it may be performed in advance on the platform where the application plug-in is hosted and then reported to the user.
  • In this way, the behavior of the advertisement plug-in can be detected in a targeted manner, and the user can be informed of the advertisement behavior of the current advertisement plug-in.
  • Optionally, the subsequent operations of the advertisement plug-in can also be intercepted. For example, after a call to the API for reading SMS content is detected, the advertisement plug-in's call for that specific content is intercepted; other behaviors are handled similarly.
  • The embodiment of the invention has complete feature recognition rules that can be combined with cloud data for feature recognition. It achieves fine-grained and accurate identification of adware, offers strong feature-matching ability against obfuscated adware code, improves the recognition rate of advertisement plug-ins, and can further detect the behavior of the advertisement plug-in.
  • Embodiment 3: Referring to FIG. 3, a schematic flow chart of an advertisement plug-in identification method according to Embodiment 3 of the present invention is shown. The method may specifically include:
  • Step S200: Construct the advertisement feature vector set.
  • Step S210: Search for each file related to the application plug-in in the application platform of the smart mobile terminal. That is, the present application can target application plug-ins in the application platform of the smart mobile terminal, for example the various apps in a smartphone application platform.
  • Step S220: Scan each file related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and calculate the feature vector similarity between the data in each file and the feature vector of each feature dimension.
  • Step S230: Calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension.
  • Step S240: Compare the advertisement similarity with a threshold, and determine, according to the comparison result, whether the application plug-in is an advertisement plug-in.
  • Step S250: If the application plug-in is an advertisement plug-in, detect the operation behavior of the advertisement plug-in on the smart mobile terminal system.
  • When a smart mobile terminal, such as the user's mobile phone, downloads an application plug-in that has been identified as an advertisement plug-in, the user is prompted, for example that the application plug-in is an advertisement plug-in and that its advertising behavior includes accessing short messages, accessing contacts, and the like.
  • In this embodiment, advertisement identification and processing of application plug-ins is performed mainly on the mobile phone platform. Therefore, if detecting a behavior of the advertisement plug-in requires real-time information from the phone, that behavior may be left undetected for the time being; the other detection methods are similar to those of Embodiment 2.
  • This embodiment of the invention has comprehensive feature recognition rules that can be combined with cloud data for feature recognition. It identifies advertisement software precisely and accurately, retains a high feature matching capability on obfuscated advertisement software code, improves the recognition rate for advertisement plug-ins, and can detect the behavior of advertisement plug-ins in advance in the application platform.
  • Embodiment 4: Referring to FIG. 4, a schematic structural diagram of an advertisement plug-in identification system according to Embodiment 4 of the present invention is shown, which may specifically include:
  • The finding module 310 is configured to find each file related to the application plug-in;
  • The feature scanning module 320 is configured to scan each file related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and to calculate the feature vector similarity between the data in each file and the feature vector of each feature dimension;
  • The advertisement similarity calculation module 330 is configured to calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
  • The determining module 340 is configured to compare the advertisement similarity with a threshold and to determine, according to the comparison result, whether the application plug-in is an advertisement plug-in.
  • The finding module includes:
  • A first finding module, configured to find each file related to the application plug-ins in the application platform of the smart mobile terminal.
  • Optionally, the system further includes a cloud server, and the cloud server includes:
  • A feature vector set building module, configured to have the cloud server analyze each application plug-in in the advertisement sample library to obtain feature data in each feature dimension, and to construct the advertisement feature vector set from that feature data.
  • Optionally, the cloud server further includes:
  • A feedback supplementing module, configured to add, according to user feedback on application plug-ins, the application plug-ins reported as containing advertisements to the advertisement application plug-in sample library.
  • Optionally, the system further includes:
  • A feature vector set conversion module, configured to precompile, on the cloud server, the advertisement feature vector set built from the feature data in each feature dimension into a binary XML format.
  • Optionally, the feature scanning module includes:
  • A feature data extraction module, configured to obtain the feature data in each feature dimension according to the scan position specified by that feature dimension of the predetermined advertisement feature vector set;
  • A feature data analysis module, configured to compute the feature vector similarity between the feature data and the feature values of the feature vector in the corresponding feature dimension of the advertisement feature vector set, obtaining the feature vector similarity in that feature dimension.
  • Optionally, the feature data extraction module includes:
  • An application plug-in scanning module, configured to scan the installation package of the application plug-in and obtain from it the feature information in the installation package dimension of the advertisement feature vector set as a first feature vector;
  • And/or a configuration information dimension obtaining module, configured to scan the configuration file and obtain from it the feature values that match the feature values in the configuration information dimension of the preset advertisement feature vector set;
  • And/or a constant pool dimension obtaining module, configured to scan the constant pool in the executable file and obtain from it the strings that match the strings in the constant pool dimension of the preset advertisement feature vector set;
  • And/or a package name and class name obtaining module, configured to scan the class structure in the executable file and obtain from it the package names and class names in the package name and class name dimension of the advertisement feature vector set as a first feature vector;
  • And/or a class inheritance relationship dimension obtaining module, configured to scan the class structure in the executable file and obtain from it the class inheritance relationships that match those in the class inheritance relationship sequence dimension of the preset advertisement feature vector set;
  • And/or a function call sequence dimension obtaining module, configured to scan the method descriptors in the executable file and obtain from them the function call sequences that match those in the function call sequence dimension of the preset advertisement feature vector set.
  • Optionally, the feature data analysis module includes: a first analysis module, configured to match the feature data against the feature matching conditions that correspond to the feature vector in the corresponding feature dimension of the advertisement feature vector set, and to calculate the feature vector similarity in that feature dimension according to the matching result.
  • Optionally, the system further includes:
  • A recording module, configured to record the scan and determination result for each application plug-in;
  • Further, the system includes a quick scan module, configured to skip, when scanning again, the application plug-ins that have already been determined, according to the recorded scan and determination results.
  • Embodiment 5: Referring to FIG. 5, a schematic structural diagram of an advertisement plug-in identification system according to Embodiment 5 of the present invention is shown, which may specifically include a smart mobile terminal 410 and a cloud server 420.
  • The smart mobile terminal includes:
  • The finding module 411 is configured to find each file related to the application plug-in;
  • The feature scanning module 412 is configured to scan each file related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and to calculate the feature vector similarity between the data in each file and the feature vector of each feature dimension;
  • The advertisement similarity calculation module 413 is configured to calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
  • The determining module 414 is configured to compare the advertisement similarity with a threshold and to determine, according to the comparison result, whether the application plug-in is an advertisement plug-in;
  • The behavior detection module 415 is configured to detect, after it has been determined whether the application plug-in is an advertisement plug-in, the operation behavior of the advertisement plug-in on the smart mobile terminal system.
  • This embodiment is similar to Embodiment 4 and is not described in detail here.
  • Embodiment 6: Referring to FIG. 6, a schematic structural diagram of an advertisement plug-in identification system according to Embodiment 6 of the present invention is shown, which may specifically include a smart mobile terminal S410 and a cloud server S420.
  • The cloud server S420 includes:
  • The vector set building module S421 is configured to construct the advertisement feature vector set;
  • The finding module S422 is configured to find each file related to the application plug-in;
  • The feature scanning module S423 is configured to scan each file related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and to calculate the feature vector similarity between the data in each file and the feature vector of each feature dimension;
  • The advertisement similarity calculation module S424 is configured to calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
  • The determining module S425 is configured to compare the advertisement similarity with a threshold and to determine, according to the comparison result, whether the application plug-in is an advertisement plug-in;
  • The behavior detecting module S426 is configured to detect, after it has been determined whether the application plug-in is an advertisement plug-in, the operation behavior of the advertisement plug-in on the smart mobile terminal system.
  • On the smart mobile terminal, the user is prompted when the terminal downloads an application plug-in that has been identified as an advertisement plug-in.
  • The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of the two. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the mobile terminal call request information processing apparatus according to embodiments of the present invention.
  • The invention may also be implemented as a device or apparatus program (e.g., a computer program and a computer program product) for performing part or all of the method described herein.
  • Such a program implementing the invention may be stored on a computer readable medium or may be in the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
  • FIG. 7 schematically illustrates a block diagram of a mobile terminal for performing a method in accordance with the present invention, which conventionally includes a processor 710 and a computer program product or computer readable medium in the form of a memory 720.
  • Memory 720 can be an electronic memory such as a flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM.
  • Memory 720 has a memory space 730 for program code 731 for performing any of the method steps described above.
  • storage space 730 for program code can include various program code 731 for implementing various steps in the above methods, respectively.
  • the program code can be read from or written to one or more computer program products.
  • Such computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks.
  • Such a computer program product is typically a portable or fixed storage unit as described with reference to Figure 8.
  • The storage unit may have storage segments, storage space, and the like arranged similarly to the memory 720 in the mobile terminal of FIG. 7.
  • The program code may, for example, be compressed in an appropriate form.
  • Typically, the storage unit includes computer readable code 731', i.e., code that can be read by a processor such as 710, which, when run by the mobile terminal, causes the mobile terminal to perform each step of the methods described above.
  • "One embodiment", "an embodiment", or "one or more embodiments" as used herein means that the particular features, structures, or characteristics described in connection with the embodiments are included in at least one embodiment of the invention.
  • the phrase “in one embodiment” herein does not necessarily refer to the same embodiment.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Marketing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Transfer Between Computers (AREA)
  • Stored Programmes (AREA)

Abstract

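The present invention discloses a method and device for identifying advertisement plug-ins, and relates to the field of computer technology. The method includes: finding the files related to an application plug-in; scanning the files related to the application plug-in based on the feature vector of each feature dimension in a predetermined advertisement feature vector set, and calculating the feature vector similarity between the data in the files and the feature vector of each feature dimension; calculating the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension; and comparing the advertisement similarity with a threshold and determining, according to the comparison result, whether the application plug-in is an advertisement plug-in. The invention provides comprehensive feature recognition rules that can be combined with cloud data for feature recognition, identifies advertisement software precisely and accurately, and retains a high feature matching capability on obfuscated advertisement software code.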

Description

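Method and system for identifying advertisement plug-ins
Technical Field
The present invention relates to the field of computer technology, and in particular to a method and system for identifying advertisement plug-ins, a computer program, and a computer-readable medium.
Background
With the development of technology, smart mobile terminals have become increasingly widespread, for example smartphones running Android (a free and open-source operating system based on Linux) and the iPhone. As smart mobile terminals such as smartphones have spread, mobile applications have sprung up on them, and more and more applications embed advertisement plug-ins. Malicious advertisement software at the least harasses the user; in more serious cases it leaks the user's private information, and on a mobile phone in particular it may consume large amounts of data traffic or even secretly send premium-rate short messages, causing the user various losses.
The first step in protecting users from malicious advertisements is to identify which applications are advertisement software and what harm they do. Users can then learn whether a piece of advertisement software is malicious and how harmful it is, and can choose to uninstall it. This also provides data support for subsequent advertisement interception.
At present, identification of advertisement plug-ins on smart mobile terminals mostly relies on simple, fixed checks of advertisement component names to decide whether an application is an advertisement plug-in. However, many advertisers embed the advertisement component in the application, and the resulting obfuscated code may have no distinctive component name to go by. The prior art therefore cannot identify advertisement plug-ins precisely, and its recognition rate is low.
Summary of the Invention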
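In view of the above problems, the present invention is proposed to provide an advertisement plug-in identification system and a corresponding advertisement plug-in identification method, a computer program, and a computer-readable medium that overcome the above problems or at least partially solve them.
According to one aspect of the invention, an advertisement plug-in identification method is provided, including:
finding the files related to an application plug-in;
scanning the files related to the application plug-in based on the feature vector of each feature dimension in a predetermined advertisement feature vector set, and calculating the feature vector similarity between the data in the files and the feature vector of each feature dimension;
calculating the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
comparing the advertisement similarity with a threshold and determining, according to the comparison result, whether the application plug-in is an advertisement plug-in.
According to another aspect of the invention, an advertisement plug-in identification system is also provided, including:
a finding module, configured to find the files related to an application plug-in;
a feature scanning module, configured to scan the files related to the application plug-in based on the feature vector of each feature dimension in a predetermined advertisement feature vector set, and to calculate the feature vector similarity between the data in the files and the feature vector of each feature dimension;
an advertisement similarity calculation module, configured to calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
a determining module, configured to compare the advertisement similarity with a threshold and to determine, according to the comparison result, whether the application plug-in is an advertisement plug-in.
According to another aspect of the invention, a computer program is provided that includes computer-readable code which, when run on a mobile terminal, causes the mobile terminal to perform the advertisement plug-in identification method according to any one of claims 1-10.
According to another aspect of the invention, a computer-readable medium is provided in which the computer program according to claim 21 is stored.
The beneficial effects of the invention are as follows:
The advertisement plug-in identification method of the invention can analyze the feature data of an application plug-in on a smart terminal in several feature dimensions and combine that data to judge whether the application plug-in includes an advertisement plug-in. This addresses the problems that the prior art cannot identify advertisement plug-ins precisely, cannot identify advertisement plug-ins in obfuscated applications, and has a low recognition rate. The invention provides comprehensive feature recognition rules that can be combined with cloud data for feature recognition, identifies advertisement software precisely and accurately, retains a high feature matching capability on obfuscated advertisement software code, and can perform effective behavior detection on the advertisement plug-ins it identifies.
The above is only an overview of the technical solution of the invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the invention are easier to follow, specific embodiments of the invention are set out below.
Brief Description of the Drawings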
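Various other advantages and benefits will become clear to those of ordinary skill in the art on reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be regarded as limiting the invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
FIG. 1 schematically shows a flowchart of an advertisement plug-in identification method according to Embodiment 1 of the invention;
FIG. 2 schematically shows a flowchart of an advertisement plug-in identification method according to Embodiment 2 of the invention;
FIG. 3 schematically shows a flowchart of an advertisement plug-in identification method according to Embodiment 3 of the invention;
FIG. 4 schematically shows a structural diagram of an advertisement plug-in identification system according to Embodiment 4 of the invention;
FIG. 5 schematically shows a structural diagram of an advertisement plug-in identification system according to Embodiment 5 of the invention;
FIG. 6 schematically shows a structural diagram of an advertisement plug-in identification system according to Embodiment 6 of the invention;
FIG. 7 schematically shows a block diagram of a mobile terminal for performing the method according to the invention; and
FIG. 8 schematically shows a storage unit for holding or carrying program code that implements the method according to the invention.
Detailed Description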
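Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope conveyed fully to those skilled in the art.
Embodiment 1
Referring to FIG. 1, a flowchart of an advertisement plug-in identification method according to Embodiment 1 of the invention is shown, which may specifically include:
Step 110: find the files related to the application plug-in;
In this embodiment, at identification time, for a newly installed application plug-in, the plug-in's installation package and the related files at the location where the package was extracted are found. For example, an application's plug-in is stored on the phone's SD card (Secure Digital Memory Card), say in the root directory as the file SD:\A; after installation, the files it extracts are in SD:\program\ml on the SD card, and include a configuration file and an executable file (for example a .Dex file; .Dex files are generally the executable files of the Android system).
In addition, where the installation package was deleted after installation, the files at the plug-in's extraction location, such as the configuration file and the executable file, can be scanned.
Preferably, step 110 may be: find the files related to the application plug-ins on a smart mobile terminal.
That is, this application targets application plug-ins on smart mobile terminals, for example plug-ins such as apps on a smartphone.
Step 120: based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, scan the files and calculate the feature vector similarity between the data in the files and the feature vector of each feature dimension;
Preferably, the invention further includes: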
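Step S100: the cloud server analyzes each application plug-in in the advertisement sample library to obtain feature data in each feature dimension, and constructs the advertisement feature vector set from that feature data.
In this embodiment, the advertisement feature vector set D(d1, d2, d3, ..., dn) may be constructed in advance, where n is the number of feature vectors and each feature vector corresponds to feature matching conditions, for example:
1) Constant pool dimension feature vector
Plug-in-specific advertisement features are identified through the string constant pool; for example, the version number strings of many plug-ins and the domain name strings that advertisement plug-ins connect to are stored in the constant pool.
In this dimension, suppose that statistics over the constant pools of advertisement software yield 100 strings with various combinations of string conditions, for example that only the conditions "string A and string B", "string C or string D or string E", and "not string F" together identify the constant pool dimension feature vector of an advertisement plug-in. Then "string A and string B; string C or string D or string E; not string F" are the feature matching conditions of that feature vector.
2) Package name and class name dimension feature vector
Advertisement plug-ins contain specific package names and class names, from which it can be determined whether a specific advertisement plug-in is present. However, many advertisement plug-ins are obfuscated along with their host application, so the package name and class name feature values should be chosen from class names that are not obfuscated, for example the class names of components such as services contained in the plug-in, and the class name of the advertisement View.
3) Configuration information dimension feature vector
This dimension covers the information declared in the manifest file and in configuration. Some advertisement plug-ins declare the information they need in that file, and using this feature can raise the recognition rate. Many advertisement plug-ins store information such as the plug-in's AppKey in, for example, AndroidManifest.xml or a custom configuration file. The AppKey is a unique Id that the advertisement provider issues to the developer.
4) Class inheritance relationship sequence dimension feature vector
In certain application scenarios, advertisement plug-in identification may need to pinpoint the advertisement components inside the installation package, rather than merely determine whether the package contains advertisements. Some components may inherit from a known advertisement component, in which case the inheritance relationship sequence can determine whether a component is an advertisement component.
In this dimension, suppose the class inheritance relationships found to identify advertisement software are class a -> class b -> class c, or class a -> class b -> class d; then "class a -> class b -> class c, or class a -> class b -> class d" is the feature matching condition in this dimension.
5) Function call sequence dimension feature vector
By scanning the program code, function call sequences can be determined; analyzing them determines whether suspicious advertisement-sending behavior is present.
In this dimension, suppose the function call sequences found to identify advertisement software are function a -> function b -> function c + function d, or class a -> class b -> function c + function d; then "function a -> function b -> function c + function d, or class a -> class b -> function c + function d" is the feature matching condition in this dimension.
6) Installation package dimension feature vector
For example, for a plug-in already confirmed to be an advertisement, this feature can be determined by distributing the md5 of its installation package.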
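The class inheritance condition in dimension 4) can be sketched as a comparison between a class's superclass chain and the known chains. This is a minimal illustration, not the patent's implementation: it assumes the arrow "class a -> class b" points from a class to its superclass, that the class structure has already been reduced to a `parent_of` mapping, and that a match means the chain starts with one of the known sequences. All names here are hypothetical.

```python
def superclass_chain(cls, parent_of):
    """Follow parent_of from cls and return [cls, parent, grandparent, ...]."""
    chain = [cls]
    seen = {cls}
    while chain[-1] in parent_of:
        nxt = parent_of[chain[-1]]
        if nxt in seen:  # guard against malformed, cyclic input
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain


def matches_known_chain(cls, parent_of, known_chains):
    """True if the superclass chain of cls starts with any known ad chain."""
    chain = superclass_chain(cls, parent_of)
    return any(chain[:len(known)] == list(known) for known in known_chains)


# Matching condition from the text: a -> b -> c, or a -> b -> d.
KNOWN_AD_CHAINS = [("a", "b", "c"), ("a", "b", "d")]
```

With `parent_of = {"a": "b", "b": "d"}`, `matches_known_chain("a", parent_of, KNOWN_AD_CHAINS)` is true, while a chain ending in an unrelated class does not match.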
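The above feature vectors can be obtained by the cloud through analysis and statistics over the application plug-in samples in the advertisement plug-in sample library. For 1) to 5), for example, each application plug-in can be analyzed as follows:
scan the configuration file and obtain from it the declared information in the configuration information dimension of the advertisement feature vector set as a first feature vector;
scan the constant pool in the executable file and obtain from it the strings in the constant pool dimension of the advertisement feature vector set as a first feature vector;
scan the class structure in the executable file and obtain from it the package names and class names in the package name and class name dimension of the advertisement feature vector set as a first feature vector;
scan the class structure in the executable file and obtain from it the class inheritance relationships in the class inheritance relationship sequence dimension of the advertisement feature vector set as a first feature vector;
scan the method descriptors in the executable file and obtain from them the function call sequences in the function call sequence dimension of the advertisement feature vector set as a first feature vector.
Taking a .Dex file as the example executable: the header index and offset values of the .Dex file locate the class structure, from which package names and class names can be extracted, and the parent-child inheritance of each class yields the class inheritance relationships. The header index and offset values of the .Dex file likewise locate the method descriptors, which record the call relationships among the functions of the various methods during plug-in execution, so the function call relationships can be found.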
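As an illustration of locating structures through the .Dex header, the sketch below reads only the fixed-size header and returns its size and offset fields. It assumes the header layout of the published Dalvik executable format (an 8-byte magic, a checksum, a 20-byte signature, then twenty little-endian 32-bit fields); the function name is hypothetical, and walking the class definitions or method tables from these offsets is left out.

```python
import struct

# magic (8 bytes), checksum (uint32), signature (20 bytes), 20 x uint32
DEX_HEADER = struct.Struct("<8sI20s20I")

FIELD_NAMES = [
    "file_size", "header_size", "endian_tag", "link_size", "link_off",
    "map_off", "string_ids_size", "string_ids_off", "type_ids_size",
    "type_ids_off", "proto_ids_size", "proto_ids_off", "field_ids_size",
    "field_ids_off", "method_ids_size", "method_ids_off",
    "class_defs_size", "class_defs_off", "data_size", "data_off",
]


def read_dex_header(data: bytes) -> dict:
    """Return the size/offset fields of a DEX header as a dict."""
    if len(data) < DEX_HEADER.size:
        raise ValueError("input is too short to hold a DEX header")
    fields = DEX_HEADER.unpack_from(data, 0)
    if not fields[0].startswith(b"dex\n"):
        raise ValueError("missing DEX magic")
    # fields[1] is the checksum and fields[2] the signature; skip both.
    return dict(zip(FIELD_NAMES, fields[3:]))
```

A scanner along these lines would use `class_defs_off` and `class_defs_size` to find the class structure, and `method_ids_off` and `method_ids_size` to find the method table.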
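For 6), cloud data can be used to perform feature semantic analysis for advertisement software with many users, for fast and accurate identification. The feature recognition schemes 1) to 5) suit most cases. Package-specific features rely more on cloud data: for example, for a plug-in confirmed as an advertisement through the cloud and user feedback, the md5 of its installation package can be distributed to determine whether a plug-in is an advertisement plug-in.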
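The installation package check can be sketched with the standard `hashlib` module: compute the md5 of the package and look it up in a set of digests distributed by the cloud. The function names and the 0/1 similarity convention are assumptions made for illustration.

```python
import hashlib


def stream_md5(stream, chunk_size=65536):
    """Return the hex md5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def package_dimension_similarity(package_path, known_ad_md5s):
    """1.0 if the package's md5 is a known advertisement digest, else 0.0."""
    with open(package_path, "rb") as f:
        return 1.0 if stream_md5(f) in known_ad_md5s else 0.0
```

Reading in chunks keeps memory use flat for large packages, which matters on the low-memory devices the text has in mind.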
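In addition, the method further includes:
Step S121: according to user feedback on application plug-ins, add the application plug-ins reported as containing advertisements to the advertisement application plug-in sample library.
That is, user feedback on all kinds of software is received continuously, and a plug-in whose reports as an advertisement plug-in reach a certain value can be added to the advertisement sample library, steadily improving the data source and the accuracy of the advertisement feature vector set.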
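The feedback step can be sketched as a report counter with a promotion threshold. The text only says the count must "reach a certain value", so `min_reports` and the class and method names are hypothetical.

```python
from collections import Counter


class FeedbackCollector:
    """Promote a plug-in to the ad sample library after enough reports."""

    def __init__(self, min_reports):
        self.min_reports = min_reports
        self.reports = Counter()
        self.sample_library = set()

    def report_ad(self, plugin_id):
        """Record one user report; return True once the plug-in is a sample."""
        self.reports[plugin_id] += 1
        if self.reports[plugin_id] >= self.min_reports:
            self.sample_library.add(plugin_id)
        return plugin_id in self.sample_library
```

A production system would presumably also deduplicate reports per user; that is outside what the source describes.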
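Optionally, the method further includes: on the cloud server, precompiling the advertisement feature vector set built from the feature data in each feature dimension into a binary XML format.
Precompiling the feature data into a binary xml format has two benefits. First, it parses faster, which suits low-memory, low-CPU devices such as mobile phones. Second, redundant strings in the xml file, such as attribute names and element names, are placed in a shared string pool and referenced by index, which greatly reduces the size of the data file and suits network transmission, mobile networks in particular. In this embodiment, once the advertisement feature vector set has been constructed, step 120 can be carried out.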
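The string pool idea can be shown without committing to any particular binary XML encoding, which the source does not specify. In this sketch every attribute name and value is stored once in a pool, and records refer to pool entries by index; the record contents are made up for illustration.

```python
def build_string_pool(records):
    """Replace each (name, value) string pair with a pair of pool indexes."""
    pool, index = [], {}

    def intern(s):
        if s not in index:
            index[s] = len(pool)
            pool.append(s)
        return index[s]

    encoded = [[(intern(k), intern(v)) for k, v in rec] for rec in records]
    return pool, encoded


def decode(pool, encoded):
    """Rebuild the original records from the pool and the index pairs."""
    return [[(pool[k], pool[v]) for k, v in rec] for rec in encoded]
```

Repeated strings such as attribute names cost one pool entry plus a small integer per use, which is where the size reduction described above comes from.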
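Optionally, scanning the files related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and calculating the feature vector similarity between the data in the files and the feature vector of each feature dimension, includes:
Step A11: according to the scan position specified by each feature dimension of the predetermined advertisement feature vector set, obtain the feature data in the corresponding feature dimension;
Because each kind of feature vector can only be obtained by scanning a specified file or location, the scan positions specified by the feature dimensions of the predetermined advertisement feature vector set must be scanned to obtain the feature data in the corresponding feature dimensions.
Optionally, obtaining the feature data in the corresponding feature dimension according to the scan position specified by each feature dimension of the predetermined advertisement feature vector set includes:
Step b11: scan the installation package of the application plug-in and obtain from it the feature information in the installation package dimension of the advertisement feature vector set as a first feature vector. For example, scan the installation package corresponding to the application plug-in, compute its md5 value, and use that md5 value as the feature value of the feature vector in the installation package dimension.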
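And/or, step b12: scan the configuration file and obtain from it the declared information in the configuration information dimension of the advertisement feature vector set as a first feature vector;
For example, starting from the extraction location of the installation package, scan the configuration files extracted from the package, such as AndroidManifest.xml, and extract declared information from them, such as the AppKey, as the feature value of the feature vector in this dimension.
And/or, step b13: scan the constant pool in the executable file and obtain from it the strings in the constant pool dimension of the advertisement feature vector set as a first feature vector;
Plug-in-specific advertisement features are identified through the string constant pool; for example, the version number strings of many plug-ins and the domain name strings that advertisement plug-ins connect to are stored in the constant pool. This step therefore obtains strings such as plug-in version number strings and advertisement plug-in domain name strings as the feature values of the feature vector in this dimension.
And/or, step b14: scan the class structure in the executable file and obtain from it the package names and class names in the package name and class name dimension of the advertisement feature vector set as a first feature vector;
For example, scan the .Dex file extracted from the application plug-in's installation package, locate its class structure, and obtain the package names and class names from it as the feature values of the feature vector in the package name and class name dimension.
And/or, step b15: scan the class structure in the executable file and obtain from it the class inheritance relationships in the class inheritance relationship sequence dimension of the advertisement feature vector set as a first feature vector;
For example, scan the .Dex file extracted from the application plug-in's installation package, locate its class structure, and extract the class inheritance relationship sequence from the reference and inheritance relationships among the classes as the feature value of the class inheritance relationship sequence dimension.
And/or, step b16: scan the method descriptors in the executable file and obtain from them the function call sequences in the function call sequence dimension of the advertisement feature vector set as a first feature vector.
For example, scan the .Dex file extracted from the application plug-in's installation package, locate its method descriptors, and extract the function call relationship sequence from the function call relationships they record as the feature value of the function call sequence dimension.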
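Step b12 can be illustrated with the standard `xml.etree.ElementTree` module. This sketch assumes the manifest is available as text XML (inside an APK it is stored in a binary form that would need decoding first) and that the plug-in declares its key in a `meta-data` element. The key name `EXAMPLE_AD_APPKEY` is made up.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"


def manifest_meta_data(manifest_xml: str) -> dict:
    """Map android:name -> android:value for every meta-data element."""
    root = ET.fromstring(manifest_xml)
    result = {}
    for node in root.iter("meta-data"):
        name = node.get(ANDROID_NS + "name")
        if name is not None:
            result[name] = node.get(ANDROID_NS + "value")
    return result


SAMPLE_MANIFEST = """\
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.app">
  <application>
    <meta-data android:name="EXAMPLE_AD_APPKEY" android:value="abc123"/>
  </application>
</manifest>
"""
```

The names found this way would then be compared with the declared-information feature values of the configuration information dimension.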
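Step A12: compute the feature vector similarity between the feature data and the feature values of the feature vector in the corresponding feature dimension of the advertisement feature vector set, obtaining the feature vector similarity in that feature dimension.
Once the feature data in each feature dimension has been obtained by scanning, it is compared with the feature values of the feature vector in the corresponding feature dimension of the advertisement feature vector set to obtain the feature vector similarity in that feature dimension.
Optionally, computing the feature vector similarity between the feature data and the feature values of the feature vector in the corresponding feature dimension of the advertisement feature vector set includes:
Step b17: match the feature data against the feature matching conditions that correspond to the feature vector in the corresponding feature dimension of the advertisement feature vector set, and calculate the feature vector similarity in that feature dimension according to the matching result.
For example, take the matching conditions described above for the constant pool dimension of the advertisement feature vector set: "string A and string B; string C or string D or string E; not string F". Suppose scanning the current application plug-in's constant pool yields strings A, C, and N. The plug-in then fully satisfies the conditions "string C or string D or string E" and "not string F" but not "string A and string B", so the similarity can be computed as 2/3.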
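The 2/3 figure in the example above is consistent with scoring a dimension as the fraction of matching conditions that are satisfied. The sketch below encodes the three constant pool conditions as predicates over the set of scanned strings. Treating the similarity as "satisfied conditions divided by total conditions" is an interpretation of the example, not a formula stated in the source.

```python
CONSTANT_POOL_CONDITIONS = [
    lambda found: {"A", "B"} <= found,            # string A and string B
    lambda found: bool(found & {"C", "D", "E"}),  # string C or D or E
    lambda found: "F" not in found,               # not string F
]


def condition_similarity(found_strings, conditions=CONSTANT_POOL_CONDITIONS):
    """Fraction of matching conditions satisfied by the scanned strings."""
    found = set(found_strings)
    satisfied = sum(1 for condition in conditions if condition(found))
    return satisfied / len(conditions)
```

For the scanned strings A, C, and N, the first condition fails and the other two hold, giving 2/3 as in the text.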
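For the feature dimensions above, once the feature vector similarity of each feature dimension i has been computed, the plug-in's feature vector similarity set S(s1, s2, ..., si, ..., sn) (i = 1, 2, ..., n) is obtained, where each si ranges from 0 to 1, with 0 meaning completely dissimilar and 1 meaning a complete match.
Step 130: calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
In the invention, a corresponding feature recognition weight can be set for each feature dimension; the feature vector similarities and the corresponding weights are then substituted into the similarity calculation function f to compute the advertisement similarity v of the current application plug-in. For example, for the preset advertisement feature vector set D(d1, d2, d3, ..., dn) described above, where n is the number of feature vectors, the corresponding weights W(w1, w2, w3, ..., wn) can be preset.
In particular, tests of actual scanning accuracy show that in most conditions the similarity function f can degenerate to a weighted average. A weighted average is also simple and fast to compute, with few floating-point operations, which suits smart mobile terminals such as phones. That is: v = s1*w1 + s2*w2 + ... + sn*wn.
Step 140: compare the advertisement similarity with a threshold and determine, according to the comparison result, whether the application plug-in is an advertisement plug-in.
This embodiment may also set an advertisement similarity threshold t; when v > t, the application plug-in is determined to be an advertisement plug-in.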
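Steps 130 and 140 reduce to a weighted sum followed by a strict comparison with the threshold. The sketch below follows the formula v = s1*w1 + ... + sn*wn and the rule v > t from the text; the function names are illustrative, and the weights and threshold in any usage are placeholders, since the source gives no concrete values.

```python
def ad_similarity(similarities, weights):
    """v = s1*w1 + s2*w2 + ... + sn*wn over the feature dimensions."""
    if len(similarities) != len(weights):
        raise ValueError("need exactly one weight per feature dimension")
    return sum(s * w for s, w in zip(similarities, weights))


def is_ad_plugin(similarities, weights, threshold):
    """Apply the rule from the text: advertisement plug-in when v > t."""
    return ad_similarity(similarities, weights) > threshold
```

With similarities `[0.5, 1.0, 0.0]` and weights `[0.5, 0.25, 0.25]`, v is 0.5, so the plug-in is flagged at a threshold of 0.4 but not at 0.5, because the comparison is strict.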
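Optionally, the method further includes:
Step C11: record the scan and determination result for each application plug-in; when scanning again, skip the application plug-ins that have already been determined, according to the recorded results.
In this embodiment, the smart mobile terminal may reinstall the system of the invention, or an application plug-in that was installed before may be reinstalled. In these cases, the recorded results can be used to scan and judge quickly the application plug-ins that have already been scanned and determined.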
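Step C11 amounts to caching verdicts. The sketch below keys the cache on an identifier of the plug-in package; using the package md5 as that identifier is an assumption (the source does not say how a previously scanned plug-in is recognized), as are the class and method names.

```python
class ScanCache:
    """Remember verdicts so an already-judged plug-in is not scanned again."""

    def __init__(self):
        self._verdicts = {}

    def scan(self, package_id, scan_fn):
        """Return the cached verdict, or run scan_fn() once and record it."""
        if package_id in self._verdicts:
            return self._verdicts[package_id]
        verdict = scan_fn()
        self._verdicts[package_id] = verdict
        return verdict
```

To survive a reinstall of the scanner, as the text describes, the recorded verdicts would have to be persisted outside the process, for example in a file or on the cloud server.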
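Application scenarios of the invention include:
1) The end user can run a scan on the smartphone to see which advertisement software is present and what it does, and can choose to uninstall it. The scan can also supply interception data to an advertisement interception module.
2) Software can be scanned on a platform for smart mobile terminals (such as an application market), so that users can learn before downloading and installing whether the software is advertisement software and how it behaves.
This embodiment of the invention has comprehensive feature recognition rules that can be combined with cloud data for feature recognition. It identifies advertisement software precisely and accurately, retains a high feature matching capability on obfuscated advertisement software code, and improves the recognition rate for advertisement plug-ins.
This embodiment can analyze the feature data of an application plug-in on a smart terminal in several feature dimensions and combine that data to judge whether the application plug-in includes an advertisement plug-in. This addresses the problems that the prior art cannot identify advertisement plug-ins precisely, cannot identify advertisement plug-ins in obfuscated applications, and has a low recognition rate, and it allows effective behavior detection on the advertisement plug-ins that are identified.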
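Embodiment 2
Referring to FIG. 2, a flowchart of an advertisement plug-in identification method according to Embodiment 2 of the invention is shown, which may specifically include:
Step 210: find the files related to the application plug-in;
Step 220: scan the files related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and calculate the feature vector similarity between the data in the files and the feature vector of each feature dimension; the advertisement feature vector set is constructed and sent by the cloud server;
Step 230: calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
Step 240: compare the advertisement similarity with a threshold and determine, according to the comparison result, whether the application plug-in is an advertisement plug-in;
Step 250: if the application plug-in is an advertisement plug-in, detect the operation behavior of the advertisement plug-in on the smart mobile terminal system.
Steps 210 to 240 of this embodiment are essentially the same as the corresponding steps of Embodiment 1 and are not described in detail here.
For step 250, this embodiment may provide an advertisement behavior analysis engine that performs behavior detection on advertisement plug-ins. For example, it detects whether the advertisement plug-in requests advertisement content from the network at run time, whether it extracts the user's private information, and whether the destination to which it transmits private information is an external network. As another example, an active defense engine monitors the advertisement software's calls to sensitive APIs (Application Programming Interfaces) in real time, such as a call to the API that reads short message content; or the behavior is taken from the description of the plug-in's behavior in the advertisement features, such as the description of a function call sequence; or static analysis of the plug-in code determines whether a sensitive API is called, such as the API that reads contacts. The behavior detection may be performed in real time on the smart mobile terminal so that the user is notified, or performed in advance on the platform hosting the application plug-in, with the user notified afterwards.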
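The static-analysis variant of step 250 can be sketched as a lookup of the methods a plug-in calls against a table of sensitive APIs. The identifiers in the table are placeholders rather than real Android method names, and the list of called methods is assumed to come from an earlier scan of the executable.

```python
# Hypothetical identifiers mapped to the behavior shown to the user.
SENSITIVE_APIS = {
    "example.sms.readMessages": "reads short message content",
    "example.contacts.readAll": "reads contacts",
}


def detect_sensitive_behavior(called_apis, sensitive=SENSITIVE_APIS):
    """Return the sorted, de-duplicated behaviors implied by the calls."""
    return sorted({sensitive[api] for api in called_apis if api in sensitive})
```

The returned descriptions are the kind of text that could be used to prompt the user about an advertisement plug-in's behavior.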
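Because this embodiment can identify advertisement plug-ins effectively, it can run behavior detection specifically on them and prompt the user with the advertising behavior of the current advertisement plug-in.
Of course, the invention can also intercept subsequent operations of the advertisement plug-in. For example, after detecting a call to the API that reads short message content, the invention intercepts the advertisement plug-in's call for the specific advertisement content; other behaviors are handled similarly.
This embodiment of the invention has comprehensive feature recognition rules that can be combined with cloud data for feature recognition. It identifies advertisement software precisely and accurately, retains a high feature matching capability on obfuscated advertisement software code, improves the recognition rate for advertisement plug-ins, and can further detect the behavior of an advertisement plug-in.
Embodiment 3
Referring to FIG. 3, a flowchart of an advertisement plug-in identification method according to Embodiment 3 of the invention is shown, which may specifically include:
Step S200: construct the advertisement feature vector set;
Step S210: find the files related to the application plug-ins in the application platform of the smart mobile terminal;
That is, this application can look for application plug-ins in the application platform of the smart mobile terminal, for example the various apps in a smartphone application platform.
Step S220: scan the files related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and calculate the feature vector similarity between the data in the files and the feature vector of each feature dimension;
Step S230: calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
Step S240: compare the advertisement similarity with a threshold and determine, according to the comparison result, whether the application plug-in is an advertisement plug-in;
Step S250: if the application plug-in is an advertisement plug-in, detect the operation behavior of the advertisement plug-in on the smart mobile terminal system.
When a smart mobile terminal, such as the user's mobile phone, downloads an application plug-in that has been identified as an advertisement plug-in, the user is prompted, for example that the application plug-in is an advertisement plug-in and that its advertising behavior includes accessing short messages, accessing contacts, and the like.
In this embodiment, advertisement identification and processing of application plug-ins is performed mainly on the mobile phone platform. Therefore, if detecting a behavior of the advertisement plug-in requires real-time information from the phone, that behavior may be left undetected for the time being; the other detection methods are similar to those of Embodiment 2.
The steps of this embodiment that resemble those of Embodiment 2 work on similar principles and are not described in detail here.
This embodiment of the invention has comprehensive feature recognition rules that can be combined with cloud data for feature recognition. It identifies advertisement software precisely and accurately, retains a high feature matching capability on obfuscated advertisement software code, improves the recognition rate for advertisement plug-ins, and can detect the behavior of advertisement plug-ins in advance in the application platform.
Embodiment 4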
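Referring to FIG. 4, a structural diagram of an advertisement plug-in identification system according to Embodiment 4 of the invention is shown, which may specifically include:
a finding module 310, configured to find the files related to an application plug-in;
a feature scanning module 320, configured to scan the files related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and to calculate the feature vector similarity between the data in the files and the feature vector of each feature dimension;
an advertisement similarity calculation module 330, configured to calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
a determining module 340, configured to compare the advertisement similarity with a threshold and to determine, according to the comparison result, whether the application plug-in is an advertisement plug-in.
The finding module includes:
a first finding module, configured to find the files related to the application plug-ins in the application platform of the smart mobile terminal.
Optionally, the system further includes:
a cloud server, the cloud server including: a feature vector set building module, configured to have the cloud server analyze each application plug-in in the advertisement sample library to obtain feature data in each feature dimension, and to construct the advertisement feature vector set from that feature data.
Optionally, the cloud server further includes:
a feedback supplementing module, configured to add, according to user feedback on application plug-ins, the application plug-ins reported as containing advertisements to the advertisement application plug-in sample library.
Optionally, the system further includes:
a feature vector set conversion module, configured to precompile, on the cloud server, the advertisement feature vector set built from the feature data in each feature dimension into a binary XML format.
Optionally, the feature scanning module includes:
a feature data extraction module, configured to obtain the feature data in each feature dimension according to the scan position specified by that feature dimension of the predetermined advertisement feature vector set;
a feature data analysis module, configured to compute the feature vector similarity between the feature data and the feature values of the feature vector in the corresponding feature dimension of the advertisement feature vector set, obtaining the feature vector similarity in that feature dimension.
Optionally, the feature data extraction module includes:
an application plug-in scanning module, configured to scan the installation package of the application plug-in and obtain from it the feature information in the installation package dimension of the advertisement feature vector set as a first feature vector;
and/or a configuration information dimension obtaining module, configured to scan the configuration file and obtain from it the feature values that match the feature values in the configuration information dimension of the preset advertisement feature vector set;
and/or a constant pool dimension obtaining module, configured to scan the constant pool in the executable file and obtain from it the strings that match the strings in the constant pool dimension of the preset advertisement feature vector set;
and/or a package name and class name obtaining module, configured to scan the class structure in the executable file and obtain from it the package names and class names in the package name and class name dimension of the advertisement feature vector set as a first feature vector;
and/or a class inheritance relationship dimension obtaining module, configured to scan the class structure in the executable file and obtain from it the class inheritance relationships that match those in the class inheritance relationship sequence dimension of the preset advertisement feature vector set;
and/or a function call sequence dimension obtaining module, configured to scan the method descriptors in the executable file and obtain from them the function call sequences that match those in the function call sequence dimension of the preset advertisement feature vector set.
Optionally, the feature data analysis module includes: a first analysis module, configured to match the feature data against the feature matching conditions that correspond to the feature vector in the corresponding feature dimension of the advertisement feature vector set, and to calculate the feature vector similarity in that feature dimension according to the matching result.
Optionally, the system further includes:
a recording module, configured to record the scan and determination result for each application plug-in;
Further, the system includes a quick scan module, configured to skip, when scanning again, the application plug-ins that have already been determined, according to the recorded scan and determination results.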
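Embodiment 5
Referring to FIG. 5, a structural diagram of an advertisement plug-in identification system according to Embodiment 5 of the invention is shown, which may specifically include:
a smart mobile terminal 410 and a cloud server 420;
The smart mobile terminal includes:
a finding module 411, configured to find the files related to an application plug-in;
a feature scanning module 412, configured to scan the files related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and to calculate the feature vector similarity between the data in the files and the feature vector of each feature dimension;
an advertisement similarity calculation module 413, configured to calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
a determining module 414, configured to compare the advertisement similarity with a threshold and to determine, according to the comparison result, whether the application plug-in is an advertisement plug-in;
a behavior detection module 415, configured to detect, after it has been determined whether the application plug-in is an advertisement plug-in, the operation behavior of the advertisement plug-in on the smart mobile terminal system.
This embodiment is similar to Embodiment 4 and is not described in detail here.
Embodiment 6
Referring to FIG. 6, a structural diagram of an advertisement plug-in identification system according to Embodiment 6 of the invention is shown, which may specifically include:
a smart mobile terminal S410 and a cloud server S420;
The cloud server S420 includes:
a vector set building module S421, configured to construct the advertisement feature vector set;
a finding module S422, configured to find the files related to an application plug-in;
a feature scanning module S423, configured to scan the files related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and to calculate the feature vector similarity between the data in the files and the feature vector of each feature dimension;
an advertisement similarity calculation module S424, configured to calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
a determining module S425, configured to compare the advertisement similarity with a threshold and to determine, according to the comparison result, whether the application plug-in is an advertisement plug-in;
a behavior detection module S426, configured to detect, after it has been determined whether the application plug-in is an advertisement plug-in, the operation behavior of the advertisement plug-in on the smart mobile terminal system.
On the smart mobile terminal, the user is prompted when the terminal downloads an application plug-in that has been identified as an advertisement plug-in.
This embodiment is similar to Embodiment 5 and is not described in detail here.
The component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of the two. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the mobile terminal call request information processing apparatus according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described here. Such a program implementing the invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.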
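For example, FIG. 7 schematically shows a block diagram of a mobile terminal for performing the method according to the invention. The mobile terminal conventionally includes a processor 710 and a computer program product or computer-readable medium in the form of a memory 720. The memory 720 may be an electronic memory such as flash memory, EEPROM (electrically erasable programmable read-only memory), EPROM, a hard disk, or ROM. The memory 720 has a storage space 730 for program code 731 for performing any of the method steps described above. For example, the storage space 730 for program code may include individual program codes 731 for implementing the various steps of the above method. The program code can be read from, or written to, one or more computer program products. These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards, or floppy disks. Such a computer program product is typically a portable or fixed storage unit as described with reference to FIG. 8. The storage unit may have storage segments, storage space, and the like arranged similarly to the memory 720 in the mobile terminal of FIG. 7. The program code may, for example, be compressed in an appropriate form. Typically, the storage unit includes computer-readable code 731', that is, code that can be read by a processor such as 710, which, when run by the mobile terminal, causes the mobile terminal to perform the steps of the method described above.
"One embodiment", "an embodiment", or "one or more embodiments" as used here means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. Note also that instances of the phrase "in one embodiment" here do not necessarily all refer to the same embodiment.
Numerous specific details are set out in the specification provided here. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this specification.
It should be noted that the above embodiments illustrate the invention rather than limit it, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names.
It should also be noted that the language used in this specification has been selected principally for readability and instructional purposes, not to delineate or circumscribe the subject matter of the invention. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. With respect to the scope of the invention, the disclosure made here is illustrative and not restrictive, and the scope of the invention is defined by the appended claims.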

Claims

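1. An advertisement plug-in identification method, comprising:
finding the files related to an application plug-in;
scanning the files related to the application plug-in based on the feature vector of each feature dimension in a predetermined advertisement feature vector set, and calculating the feature vector similarity between the data in the files and the feature vector of each feature dimension;
calculating the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
comparing the advertisement similarity with a threshold and determining, according to the comparison result, whether the application plug-in is an advertisement plug-in.
2. The method of claim 1, further comprising:
analyzing, by a cloud server, each application plug-in in an advertisement sample library to obtain feature data in each feature dimension, and constructing the advertisement feature vector set from the feature data.
3. The method of claim 2, further comprising:
adding, according to user feedback on application plug-ins, the application plug-ins reported as containing advertisements to the advertisement application plug-in sample library.
4. The method of claim 2, further comprising:
precompiling, on the cloud server, the advertisement feature vector set built from the feature data in each feature dimension into a binary XML format.
5. The method of claim 1, wherein scanning the files related to the application plug-in based on the feature vector of each feature dimension in the predetermined advertisement feature vector set, and calculating the feature vector similarity between the data in the files and the feature vector of each feature dimension, comprises:
obtaining the feature data in each feature dimension according to the scan position specified by that feature dimension of the predetermined advertisement feature vector set;
computing the feature vector similarity between the feature data and the feature values of the feature vector in the corresponding feature dimension of the advertisement feature vector set, to obtain the feature vector similarity in that feature dimension.
6. The method of claim 5, wherein obtaining the feature data in each feature dimension according to the scan position specified by that feature dimension of the predetermined advertisement feature vector set comprises:
scanning the installation package of the application plug-in and obtaining from it the feature information in the installation package dimension of the advertisement feature vector set as a first feature vector;
and/or scanning the configuration file and obtaining from it the declared information in the configuration information dimension of the advertisement feature vector set as a first feature vector;
and/or scanning the constant pool in the executable file and obtaining from it the strings in the constant pool dimension of the advertisement feature vector set as a first feature vector;
and/or scanning the class structure in the executable file and obtaining from it the package names and class names in the package name and class name dimension of the advertisement feature vector set as a first feature vector;
and/or scanning the class structure in the executable file and obtaining from it the class inheritance relationships in the class inheritance relationship sequence dimension of the advertisement feature vector set as a first feature vector;
and/or scanning the method descriptors in the executable file and obtaining from them the function call sequences in the function call sequence dimension of the advertisement feature vector set as a first feature vector.
7. The method of claim 6, wherein computing the feature vector similarity between the feature data and the feature values of the feature vector in the corresponding feature dimension of the advertisement feature vector set comprises:
matching the feature data against the feature matching conditions that correspond to the feature vector in the corresponding feature dimension of the advertisement feature vector set, and calculating the feature vector similarity in that feature dimension according to the matching result.
8. The method of claim 1, further comprising: recording the scan and determination result for each application plug-in; and, when scanning again, skipping the application plug-ins that have already been determined, according to the recorded results.
9. The method of claim 1, wherein finding the files related to the application plug-in comprises: finding the files related to the application plug-ins in the application platform of a smart mobile terminal.
10. The method of claim 1, further comprising, after determining whether the application plug-in is an advertisement plug-in: detecting the operation behavior of the advertisement plug-in on the smart mobile terminal system.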
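11. An advertisement plug-in identification system, comprising:
a finding module, configured to find the files related to an application plug-in;
a feature scanning module, configured to scan the files related to the application plug-in based on the feature vector of each feature dimension in a predetermined advertisement feature vector set, and to calculate the feature vector similarity between the data in the files and the feature vector of each feature dimension;
an advertisement similarity calculation module, configured to calculate the advertisement similarity of the current application plug-in according to the feature vector similarity of each feature dimension and the feature recognition weight of that feature dimension;
a determining module, configured to compare the advertisement similarity with a threshold and to determine, according to the comparison result, whether the application plug-in is an advertisement plug-in.
12. The system of claim 11, further comprising:
a cloud server, the cloud server comprising: a feature vector set building module, configured to have the cloud server analyze each application plug-in in an advertisement sample library to obtain feature data in each feature dimension, and to construct the advertisement feature vector set from the feature data.
13. The system of claim 12, wherein the cloud server further comprises:
a feedback supplementing module, configured to add, according to user feedback on application plug-ins, the application plug-ins reported as containing advertisements to the advertisement application plug-in sample library.
14. The system of claim 12, further comprising:
a feature vector set conversion module, configured to precompile, on the cloud server, the advertisement feature vector set built from the feature data in each feature dimension into a binary XML format.
15. The system of claim 11, wherein the feature scanning module comprises:
a feature data extraction module, configured to obtain the feature data in each feature dimension according to the scan position specified by that feature dimension of the predetermined advertisement feature vector set;
a feature data analysis module, configured to compute the feature vector similarity between the feature data and the feature values of the feature vector in the corresponding feature dimension of the advertisement feature vector set, to obtain the feature vector similarity in that feature dimension.
16. The system of claim 15, wherein the feature data extraction module comprises:
an application plug-in scanning module, configured to scan the installation package of the application plug-in and obtain from it the feature information in the installation package dimension of the advertisement feature vector set as a first feature vector;
and/or a configuration information dimension obtaining module, configured to scan the configuration file and obtain from it the feature values that match the feature values in the configuration information dimension of the preset advertisement feature vector set;
and/or a constant pool dimension obtaining module, configured to scan the constant pool in the executable file and obtain from it the strings that match the strings in the constant pool dimension of the preset advertisement feature vector set;
and/or a package name and class name obtaining module, configured to scan the class structure in the executable file and obtain from it the package names and class names in the package name and class name dimension of the advertisement feature vector set as a first feature vector;
and/or a class inheritance relationship dimension obtaining module, configured to scan the class structure in the executable file and obtain from it the class inheritance relationships that match those in the class inheritance relationship sequence dimension of the preset advertisement feature vector set;
and/or a function call sequence dimension obtaining module, configured to scan the method descriptors in the executable file and obtain from them the function call sequences that match those in the function call sequence dimension of the preset advertisement feature vector set.
17. The system of claim 15, wherein the feature data analysis module comprises:
a first analysis module, configured to match the feature data against the feature matching conditions that correspond to the feature vector in the corresponding feature dimension of the advertisement feature vector set, and to calculate the feature vector similarity in that feature dimension according to the matching result.
18. The system of claim 11, further comprising:
a recording module, configured to record the scan and determination result for each application plug-in; and further a quick scan module, configured to skip, when scanning again, the application plug-ins that have already been determined, according to the recorded results.
19. The system of claim 11, wherein the finding module comprises:
a first finding module, configured to find the files related to the application plug-ins in the application platform of a smart mobile terminal.
20. The system of claim 11, further comprising:
a behavior detection module, configured to detect, after it has been determined whether the application plug-in is an advertisement plug-in, the operation behavior of the advertisement plug-in on the smart mobile terminal system.
21. A computer program comprising computer-readable code which, when run on a mobile terminal, causes the mobile terminal to perform the advertisement plug-in identification method according to any one of claims 1-10.
22. A computer-readable medium storing the computer program according to claim 21.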
PCT/CN2014/071596 2013-04-08 2014-01-27 一种广告插件识别的方法和系统 WO2014166312A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/783,042 US9824212B2 (en) 2013-04-08 2014-01-27 Method and system for recognizing advertisement plug-ins

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310119812.3A CN103226583B (zh) 2013-04-08 2013-04-08 一种广告插件识别的方法和装置
CN201310119812.3 2013-04-08

Publications (1)

Publication Number Publication Date
WO2014166312A1 (zh) 2014-10-16

Family

ID=48837029

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/071596 WO2014166312A1 (zh) 2013-04-08 2014-01-27 Method and system for recognizing advertisement plug-ins

Country Status (3)

Country Link
US (1) US9824212B2 (zh)
CN (1) CN103226583B (zh)
WO (1) WO2014166312A1 (zh)

Also Published As

Publication number Publication date
CN103226583B (zh) 2017-07-28
CN103226583A (zh) 2013-07-31
US20160063244A1 (en) 2016-03-03
US9824212B2 (en) 2017-11-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14782669

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 14783042

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14782669

Country of ref document: EP

Kind code of ref document: A1