WO2023169212A1 - Method, terminal and server for identifying SDK in an application - Google Patents

Method, terminal and server for identifying SDK in an application

Info

Publication number
WO2023169212A1
Authority
WO
WIPO (PCT)
Prior art keywords
class
target application
file
application
sdk
Application number
PCT/CN2023/077711
Other languages
English (en)
French (fr)
Inventor
李松
孙靓
许汝波
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2023169212A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10: Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/12: Protecting executable software
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/60: Software deployment
    • G06F8/61: Installation

Definitions

  • This application relates to the field of electronic technology, and in particular to a method, terminal and server for identifying SDK in an application.
  • As third-party software development kits (SDKs) are used on a large scale, certain risk factors have also been introduced, such as privacy leaks and security vulnerabilities.
  • the SDK integrated in the application generally exists in the application code in the form of a package. Therefore, the SDKs that may be included in the application (also called potential SDKs) can be determined by detecting the package names or package structure in the application code.
  • some third-party application developers use certain anti-analysis techniques to circumvent this detection method. For example, arbitrarily converting the SDK package name into an irregular, unreadable, low-entropy string obfuscates the SDK package name. Alternatively, arbitrarily moving sub-packages of the SDK into the root directory of the application or into other packages flattens the package structure of the SDK and destroys its normal package structure. If the above detection method is still used in these cases, the potential SDKs in the application code may not be detected.
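  • To make the failure mode above concrete, the following minimal sketch shows a naive package-name detector that matches class paths against known SDK package prefixes. The prefix table, SDK names and class paths are all hypothetical and chosen only for illustration.

```python
# Hypothetical signature list: known SDK package prefixes (illustrative names only).
KNOWN_SDK_PREFIXES = {
    "com/example/adsdk": "ExampleAds",
    "com/example/pushsdk": "ExamplePush",
}


def detect_by_package_name(class_paths):
    """Return the names of SDKs whose known package prefix appears in the class paths."""
    found = set()
    for path in class_paths:
        for prefix, sdk_name in KNOWN_SDK_PREFIXES.items():
            if path.startswith(prefix + "/"):
                found.add(sdk_name)
    return found


# Unobfuscated build: package names are intact, so the prefix match succeeds.
plain = ["com/example/adsdk/Loader.class", "com/example/adsdk/net/Http.class"]

# Obfuscated build: the package is renamed and the sub-package is moved elsewhere.
obfuscated = ["a/b/Loader.class", "c/Http.class"]

plain_result = detect_by_package_name(plain)
obfuscated_result = detect_by_package_name(obfuscated)
```

  • In this sketch the same classes are found in the plain build but missed entirely in the obfuscated build, because nothing in the renamed and flattened paths matches a known prefix.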
  • This application provides a method, terminal and server for identifying an SDK in an application, which can overcome the impact of package name obfuscation and package structure flattening in the target application, and improve the accuracy of identifying the SDKs included in the target application.
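  • a first aspect provides a method for identifying an SDK in an application.
  • the method includes: obtaining the code file of the target application from the installation package of the target application; determining, according to the path relationship of the class files in the code file of the target application, the basic units included in the target application and the class files corresponding to each basic unit; determining the association relationship between the basic units according to the dependency relationship between the class files corresponding to the basic units; and determining, according to the association relationship between the basic units, the SDKs included in the target application, where multiple basic units whose association relationships form a closed loop belong to one SDK.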
  • the embodiment of the present application first constructs basic units according to the path relationship of the class files in the target application, then determines the association between basic units according to the dependency relationship between the class files corresponding to the basic units, and identifies the SDKs from those associations. Since the SDK identification method provided by the embodiment of the present application no longer relies on the package names or the structural characteristics of the packages in the target application, it can overcome the impact of package name obfuscation and package structure flattening in the target application and accurately identify the SDKs included in the target application.
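  • The grouping rule above (basic units whose associations form a closed loop belong to one SDK) can be sketched as follows. The source does not name an algorithm; this sketch reads "closed loop" as a strongly connected component of the directed association graph and uses Tarjan's algorithm. The unit names are hypothetical, and treating a unit that lies on no loop as a group of its own is an assumption of the sketch.

```python
def group_by_closed_loops(units, edges):
    """Group basic units into strongly connected components (Tarjan's algorithm).

    Assumes every edge endpoint is listed in `units`.
    """
    graph = {u: [] for u in units}
    for src, dst in edges:
        graph[src].append(dst)

    index = {}      # discovery order of each unit
    lowlink = {}    # smallest discovery index reachable from the unit
    stack, on_stack = [], set()
    groups = []

    def visit(u):
        index[u] = lowlink[u] = len(index)
        stack.append(u)
        on_stack.add(u)
        for v in graph[u]:
            if v not in index:
                visit(v)
                lowlink[u] = min(lowlink[u], lowlink[v])
            elif v in on_stack:
                lowlink[u] = min(lowlink[u], index[v])
        if lowlink[u] == index[u]:
            # u is the root of a component: pop its members off the stack.
            group = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                group.add(w)
                if w == u:
                    break
            groups.append(group)

    for u in units:
        if u not in index:
            visit(u)
    return groups


# Hypothetical basic units: a -> b -> c -> a form a loop; d is only depended upon.
units = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
groups = group_by_closed_loops(units, edges)
```

  • Here units a, b and c end up in one group (one candidate SDK), while d is left in a group by itself. The recursive form is kept for readability; a very deep dependency chain would call for an iterative variant.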
  • determining the basic units included in the target application and the class files corresponding to each basic unit according to the path relationship of the class files in the code file of the target application includes: according to the path relationship of the class files in the code file of the target application, determining that a package whose root directory contains one or more class files is a basic unit, where the class files corresponding to the basic unit are all the class files in the root directory of that basic unit.
  • class files corresponding to the basic unit do not include the class files in the sub-packages of the basic unit.
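  • The basic-unit rule described above can be sketched in a few lines: every package that directly contains at least one class file becomes a basic unit, and class files in its sub-packages are assigned to the sub-package's own unit instead. The function name and the class paths are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def build_basic_units(class_paths):
    """Map each package that directly contains class files to those class files."""
    units = defaultdict(list)
    for path in class_paths:
        # The immediate parent directory is the package that owns this class file.
        package = str(PurePosixPath(path).parent)
        units[package].append(path)
    return {pkg: sorted(files) for pkg, files in units.items()}


class_paths = [
    "a/Foo.class",
    "a/Bar.class",
    "a/b/Baz.class",   # belongs to unit "a/b", not to unit "a"
    "c/d/Qux.class",   # "c" holds no class file directly, so it is not a unit
]
units = build_basic_units(class_paths)
```

  • In this example the units are `a`, `a/b` and `c/d`; the package `c` is not a basic unit because its root directory contains no class file.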
  • class files belonging to the same package were developed during the same time period, and class files developed during the same time period are considered to belong to the same SDK. It is understandable that developers generally focus their time on developing certain SDK functions. Therefore, when class files belong to the same time period, it is considered that the class files belong to the same SDK with a high probability. It should be noted that the development time of a class file here does not refer to the time when the class file is actually created, but uses a certain analysis method to analyze the timing relationship between class files. The analysis methods include analysis through the path relationship of class files here, and analysis from the dependency relationships of class files below.
  • the package whose root directory includes class files is determined as the basic unit, and the association ring is built from basic units.
  • this avoids an overly complicated association ring caused by the large number of dependencies among the class files in the code file. It can be seen that the embodiment of the present application chooses an appropriate basic unit for building the association ring, which not only ensures the accuracy of SDK identification but also improves the efficiency of identification.
  • the dependency relationship between the class files corresponding to the basic units is determined based on the dependency relationship between the classes in each class file.
  • the dependency relationship between classes includes one or more of an inheritance relationship, a calling relationship, a parameter reference relationship, and a return relationship.
  • the class file in the code file contains the business execution logic of the target application, which contains the dependencies between the classes in the class file.
  • Dependencies between classes include but are not limited to inheritance relationships, calling relationships, parameter reference relationships, and return relationships.
  • the calling relationship, parameter reference relationship and return relationship between classes can be determined indirectly through the relationship between classes and methods (such as the parameter reference relationship and the return relationship) and the relationship between methods (such as the calling relationship). It can be seen that the embodiments of the present application consider more kinds of dependencies, which helps establish more dependencies between classes, overcomes the insufficiency of relying only on direct dependencies between classes to establish associations between basic units, and helps improve the accuracy of SDK identification.
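  • As a rough illustration of the two steps above, the sketch below first derives class-level calling dependencies from method-level call edges, then lifts class-level dependencies to associations between basic units. All function names, class names and unit names are hypothetical; how the patent actually extracts these relationships from the code file is not shown in this section.

```python
def call_deps_from_methods(method_owner, method_calls):
    """Derive class-level calling dependencies from method-level call edges."""
    deps = []
    for caller, callee in method_calls:
        src, dst = method_owner[caller], method_owner[callee]
        if src != dst:
            deps.append((src, dst, "call"))
    return deps


def unit_associations(class_to_unit, class_deps):
    """Lift class-level dependencies to directed associations between basic units."""
    edges = set()
    for src_class, dst_class, _kind in class_deps:
        src_unit = class_to_unit.get(src_class)
        dst_unit = class_to_unit.get(dst_class)
        # Skip classes outside the analysed code and dependencies that
        # stay inside a single basic unit.
        if src_unit is None or dst_unit is None or src_unit == dst_unit:
            continue
        edges.add((src_unit, dst_unit))
    return edges


# Hypothetical input: method A.run calls B.send, and class B inherits from class C.
method_owner = {"A.run": "A", "B.send": "B"}
method_calls = [("A.run", "B.send")]
class_deps = [("B", "C", "inherit")] + call_deps_from_methods(method_owner, method_calls)

# Classes B and C live in the same basic unit "y"; class A lives in unit "x".
class_to_unit = {"A": "x", "B": "y", "C": "y"}
edges = unit_associations(class_to_unit, class_deps)
```

  • Only the call from A to B crosses a unit boundary, so the result is the single association from unit `x` to unit `y`; the inheritance between B and C stays inside unit `y` and adds no edge.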
  • after determining the SDKs included in the target application based on the association relationship between the basic units, the method also includes: identifying the sensitive APIs called by each SDK based on the class files included in each SDK, and generating a security prompt for each SDK, where sensitive APIs are used to obtain sensitive data.
  • the code files corresponding to the SDK are analyzed to identify sensitive behaviors of the SDK, such as calling sensitive APIs and performing sensitive broadcast behaviors.
  • the sensitive API is used to obtain sensitive data, such as obtaining the location of the terminal, connecting to the network, accessing the photo album in the terminal, etc.
  • the identified sensitive behaviors of each SDK are expressed in natural language to form a risk report (also called a security prompt) for the SDK, which is presented to the user so that the user can decide whether to install the application, or whether to change privacy-related system settings for the application after installation, etc.
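  • A minimal sketch of turning identified sensitive API calls into a natural-language security prompt is shown below. The table of sensitive APIs, the wording of the descriptions and the SDK names are illustrative assumptions, not the list used in the patent.

```python
# Illustrative table of sensitive APIs and plain-language descriptions (assumed).
SENSITIVE_APIS = {
    "android.location.LocationManager.getLastKnownLocation": "obtains the location of the terminal",
    "java.net.URL.openConnection": "connects to the network",
}


def security_prompts(sdk_calls):
    """Build one natural-language prompt per SDK that calls a sensitive API."""
    prompts = {}
    for sdk_name, called_apis in sdk_calls.items():
        behaviors = sorted({SENSITIVE_APIS[api] for api in called_apis if api in SENSITIVE_APIS})
        if behaviors:
            prompts[sdk_name] = f"{sdk_name} " + " and ".join(behaviors) + "."
    return prompts


# Hypothetical per-SDK API call lists, as produced by analysing each SDK's class files.
sdk_calls = {
    "sdk_1": ["android.location.LocationManager.getLastKnownLocation", "java.lang.String.length"],
    "sdk_2": ["java.lang.String.length"],
}
prompts = security_prompts(sdk_calls)
```

  • In this example only `sdk_1` receives a prompt, because `sdk_2` calls no API from the sensitive table.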
  • the method further includes: obtaining the configuration file of the target application from the installation package of the target application; and identifying, according to the configuration file of the target application, class files of a first type and class files of a second type from the code file of the target application.
  • determining the basic units included in the target application and the class files corresponding to each basic unit according to the path relationship of the class files in the code file of the target application is then specifically: determining, according to the path relationship of the second-type class files in the code file of the target application, the basic units included in the target application and the second-type class files corresponding to each basic unit.
  • the target application generally includes program code from two sources, namely program code developed by the application developer itself, and program code provided by integrated third-party application developers (usually one or more SDKs). It is understandable that, under normal circumstances, the risk of program code developed by application developers themselves is lower, while the risk of program code provided by third-party application developers is uncontrollable and higher. Therefore, the terminal can also perform SDK identification and/or analysis of sensitive behaviors in the SDK for the program codes from the two sources.
  • identifying the first-type class files and the second-type class files from the code file of the target application includes: determining that the class files declared in the configuration file of the target application are first-type class files; and determining that the class files not declared in the configuration file of the target application are second-type class files.
  • the first type of class file is a class file declared in the configuration file. It is generally considered to be program code developed by the application developer for the target application. The sensitive behavior of program code developed by the application developers themselves usually carries low risk.
  • the second category of class files refers to class files that are not declared in the configuration file. They are generally considered to be SDKs developed by third-party service providers. The sensitive behavior of these SDKs should be focused on, as the risks are higher.
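  • The split into first-type and second-type class files can be sketched as below for an Android-style configuration file. Several points are assumptions of the sketch rather than statements from the source: the manifest has already been decoded to text XML, "declared" means the class is named by an activity, service, receiver or provider element, and matching is by exact class name. The manifest content and class names are hypothetical.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
COMPONENT_TAGS = ("activity", "service", "receiver", "provider")


def declared_classes(manifest_xml):
    """Collect fully qualified class names declared as components in the manifest."""
    root = ET.fromstring(manifest_xml)
    package = root.get("package", "")
    names = set()
    for tag in COMPONENT_TAGS:
        for elem in root.iter(tag):
            name = elem.get(ANDROID_NS + "name")
            if not name:
                continue
            # A leading dot means the name is relative to the manifest package.
            if name.startswith("."):
                name = package + name
            names.add(name)
    return names


def split_classes(all_classes, manifest_xml):
    """Return (first-type, second-type) class lists: declared vs. undeclared."""
    declared = declared_classes(manifest_xml)
    first = sorted(c for c in all_classes if c in declared)
    second = sorted(c for c in all_classes if c not in declared)
    return first, second


MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.app">
  <application>
    <activity android:name=".MainActivity"/>
    <service android:name="com.example.app.SyncService"/>
  </application>
</manifest>"""

all_classes = ["com.example.app.MainActivity", "com.example.app.SyncService", "a.b.Loader"]
first_type, second_type = split_classes(all_classes, MANIFEST)
```

  • Here the two declared components form the first type, and the undeclared `a.b.Loader` falls into the second type, which is the input to basic-unit construction.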
  • the method further includes: identifying the first-type class files as one SDK in the target application.
  • a collection of class files of the first type may be determined as an SDK.
  • the class files corresponding to this SDK are all first-class class files in the target application.
  • the same method as for the class files of the second type may be performed on the class files of the first type to identify multiple SDKs included in the class files of the first type.
  • a second aspect provides a terminal, including a processor and a memory.
  • the memory is coupled to the processor.
  • the memory is used to store computer program code.
  • the computer program code includes computer instructions.
  • a third aspect is to provide a device, which is included in a terminal and has the function of realizing the terminal behavior in any of the above aspects and possible implementation methods.
  • This function can be implemented by hardware, or it can be implemented by hardware executing corresponding software.
  • the hardware or software includes at least one module or unit corresponding to the above functions. For example, communication modules or units, storage modules or units, and processing modules or units, etc.
  • a fourth aspect is to provide a computer-readable storage medium, which includes computer instructions.
  • when the computer instructions are run on a terminal, the terminal is caused to perform the method described in the above aspect and any possible implementation manner.
  • a fifth aspect provides a graphical user interface on a terminal.
  • the terminal has a display screen, a camera, a memory, and one or more processors.
  • the one or more processors are used to execute one or more computer programs stored in the memory.
  • the graphical user interface includes a graphical user interface displayed when the terminal executes the method described in the above aspect and any possible implementation manner therein.
  • a computer program product is provided.
  • when the computer program product is run on a computer, it causes the computer to execute the method described in the above aspects and any of the possible implementations.
  • a seventh aspect provides a chip system, including a processor.
  • when the processor executes instructions, the processor performs the method described in the above aspects and any of the possible implementations.
  • Figure 1 is a schematic structural diagram of a communication system provided by an embodiment of the present application.
  • Figure 2A is a schematic structural diagram of a terminal provided by an embodiment of the present application.
  • Figure 2B is a schematic structural diagram of another terminal provided by an embodiment of the present application.
  • Figure 3 is a schematic structural diagram of an application program provided by an embodiment of the present application.
  • Figure 4 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • Figure 5 is a schematic flowchart of a method for identifying SDK in an application provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of a method for determining a basic unit according to an embodiment of the present application.
  • Figure 7 is a schematic diagram of a method for identifying SDK in an application according to an embodiment of the present application.
  • Figure 8 is a schematic flowchart of yet another method of identifying SDK in an application provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a chip system provided by an embodiment of the present application.
  • the communication system includes a terminal 100 and a server 200 .
  • the server 200 provides application installation package download services for various types of terminals (such as mobile terminals), and is specifically an application server.
  • Various application programs can be installed on the terminal 100, such as an application mall (also called an application market), a browser, etc. The user can access the server 200 through the application mall or browser on the terminal 100, and download the installation package of the application program from the server 200.
  • the terminal 100 may specifically execute the method for identifying the SDK in the application provided by the embodiments of this application. After the terminal downloads the application installation package from the application mall or browser, before or when the terminal installs the application, the terminal can identify the SDK in the application installation package and analyze sensitive behaviors in the SDK (such as obtaining the location of the terminal, accessing the terminal's photo album, connecting to the network, etc.) to remind the user of possible risks in installing the application, so that the user can determine whether to install it.
  • the terminal identifies the SDK in the application installation package and analyzes sensitive behaviors in the SDK to inform the user of possible risks in the application, and to prompt the user whether to uninstall the application or whether to change privacy and security-related settings for the application (such as prohibiting the application from obtaining the terminal's location information, prohibiting the application from accessing the terminal's photo album, prohibiting the application from connecting to Wi-Fi or cellular networks, etc.).
  • the server 200 may specifically execute the method for identifying the SDK in the application provided by the embodiments of this application.
  • the application server can analyze the application installation package, identify the SDK in the application installation package, and analyze the risks of sensitive behaviors in the SDK.
  • when the terminal downloads the application from the application server, the application server can push the risk of the application to the terminal so that the user can determine whether to download it.
  • when the application server sends the application installation package to the terminal, it also pushes the risk of the application to the terminal, so as to prompt the user to confirm whether to install it, or to prompt the user whether to change privacy and security-related settings for the application.
  • the terminal 100 in the embodiment of the present application can be, for example, a mobile phone, a tablet computer, a personal computer (PC), a personal digital assistant (PDA), a smart watch, a netbook, a wearable terminal, an augmented reality (AR) device, a virtual reality (VR) device, vehicle-mounted equipment, a smart screen, a smart speaker, etc.
  • This application does not impose special restrictions on the specific form of the terminal.
  • FIG. 2A shows a schematic structural diagram of the terminal 100.
  • the terminal 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, etc.
  • the sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
  • the structure illustrated in the embodiment of the present invention does not constitute a specific limitation on the terminal 100.
  • the terminal 100 may include more or fewer components than shown in the figures, or some components may be combined, or some components may be separated, or may be arranged differently.
  • the components illustrated may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units.
  • the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • the processor 110 may also be provided with a memory for storing instructions and data.
  • the memory in the processor 110 is a cache memory. This memory may hold instructions or data that the processor 110 has recently used or uses repeatedly. If the processor 110 needs to use the instructions or data again, it can fetch them directly from this memory. Repeated access is avoided and the waiting time of the processor 110 is reduced, thus improving the efficiency of the system.
  • the charging management module 140 is used to receive charging input from the charger.
  • the charger can be a wireless charger or a wired charger.
  • the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the display screen 194, the camera 193, the wireless communication module 160, and the like.
  • the power management module 141 can also be used to monitor battery capacity, battery cycle times, battery health status (leakage, impedance) and other parameters.
  • the wireless communication function of the terminal 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.
  • the mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G applied to the terminal 100.
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, perform filtering, amplification and other processing on the received electromagnetic waves, and transmit them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves through the antenna 1 for radiation.
  • at least part of the functional modules of the mobile communication module 150 may be disposed in the processor 110 .
  • the wireless communication module 160 can provide wireless communication solutions applied to the terminal 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) and other wireless communication technologies.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110, frequency modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
  • the terminal 100 implements the display function through the GPU, the display screen 194, and the application processor.
  • the terminal 100 can implement the shooting function through the ISP, camera 193, video codec, GPU, display screen 194, application processor, etc.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the terminal 100.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example, saving files such as music and videos in the external memory card.
  • Internal memory 121 may be used to store computer executable program code, which includes instructions.
  • the internal memory 121 may include a program storage area and a data storage area.
  • the program storage area can store an operating system and an application program required for at least one function (such as a sound playback function, an image playback function, etc.).
  • the data storage area may store data created during use of the terminal 100 (such as audio data, phone book, etc.).
  • the internal memory 121 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, universal flash storage (UFS), etc.
  • the processor 110 executes various functional applications and data processing of the terminal 100 by executing instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor.
  • the terminal 100 can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor.
  • the buttons 190 include a power button, a volume button, etc.
  • the button 190 may be a mechanical button or a touch button.
  • the terminal 100 may receive key input and generate key signal input related to user settings and function control of the terminal 100.
  • the motor 191 can generate vibration prompts.
  • the motor 191 can be used for vibration prompts for incoming calls and can also be used for touch vibration feedback.
  • touch operations for different applications can correspond to different vibration feedback effects.
  • the motor 191 can also produce different vibration feedback effects for touch operations in different areas of the display screen 194.
  • different application scenarios (such as time reminders, receiving information, alarm clocks, games, etc.) can also correspond to different vibration feedback effects.
  • the touch vibration feedback effect can also be customized.
  • the indicator 192 may be an indicator light, which may be used to indicate charging status, power changes, or may be used to indicate messages, missed calls, notifications, etc.
  • the SIM card interface 195 is used to connect a SIM card.
  • the SIM card can be connected to or separated from the terminal 100 by inserting it into the SIM card interface 195 or pulling it out from the SIM card interface 195 .
  • the terminal 100 can support 1 or N SIM card interfaces, where N is a positive integer greater than 1.
  • the terminal 100 adopts eSIM, that is, an embedded SIM card.
  • the eSIM card can be embedded in the terminal 100 and cannot be separated from the terminal 100.
  • the software system of the terminal 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture.
  • This embodiment of the present invention takes the Android system with a layered architecture as an example to illustrate the software structure of the terminal 100 .
  • Figure 2B is a software structure block diagram of the terminal 100 according to the embodiment of the present invention.
  • the layered architecture divides the software into several layers, and each layer has clear roles and division of labor.
  • the layers communicate through software interfaces.
  • the Android system is divided into four layers, from top to bottom: application layer, application framework layer, Android runtime and system libraries, and kernel layer.
  • the application layer may include a series of application packages.
  • For example, an application mall and a browser.
  • the application layer may also include calendar, gallery, music, weather, map, video and other applications (not shown in the figure).
  • the application mall/browser allows users to download other applications.
  • the application mall/browser can specifically implement the method for identifying the SDK in the application provided by the embodiments of this application.
  • Figure 3 is a schematic structural diagram of an application program provided by an embodiment of the present application.
  • the application mall/browser can adopt the application structure shown in Figure 3, which specifically includes a preprocessing module, a basic unit building module, an association ring building module, a dependency module, an SDK identification module, and an SDK analysis module.
  • Figure 3 is only a structural diagram of an application mall/browser.
  • the application mall/browser may include more or fewer modules than in Figure 3, or combine some modules, or split some modules, etc.
  • the specific functions of each module in Figure 3 will be described in detail below in conjunction with specific embodiments, and will not be described here.
  • the application framework layer provides an application programming interface (API) and programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer can include a window manager, content provider, view system, phone manager, resource manager, notification manager, etc.
  • a window manager is used to manage window programs.
  • the window manager can obtain the display size, determine whether there is a status bar, lock the screen, capture the screen, etc.
  • Content providers are used to store and retrieve data and make this data accessible to applications. Said data can include videos, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
  • the view system includes visual controls, such as controls that display text, controls that display pictures, etc.
  • a view system can be used to build applications.
  • the display interface can be composed of one or more views. For example, a display interface including a text message notification icon may include a view for displaying text and a view for displaying pictures.
  • the phone manager is used to provide communication functions of the terminal 100. For example, call status management (including connected, hung up, etc.).
  • the resource manager provides various resources to applications, such as localized strings, icons, pictures, layout files, video files, etc.
  • the notification manager allows applications to display notification information in the status bar, which can be used to convey notification-type messages and can automatically disappear after a short stay without user interaction. For example, the notification manager is used to notify download completion, message reminders, etc.
  • Notifications can also appear in the status bar at the top of the system in the form of charts or scroll-bar text, such as notifications for applications running in the background, or appear on the screen in the form of dialog windows. For example, text information is prompted in the status bar, a prompt sound is emitted, the terminal vibrates, or the indicator light flashes.
  • Android Runtime includes core libraries and virtual machines.
  • Android runtime is responsible for the scheduling and management of the Android system.
  • The core library contains two parts: one part is the functions that the Java language needs to call, and the other part is the core library of Android.
  • The application layer and application framework layer run in virtual machines.
  • The virtual machine executes the Java files of the application layer and application framework layer as binary files.
  • the virtual machine is used to perform object life cycle management, stack management, thread management, security and exception management, and garbage collection and other functions.
  • System libraries can include multiple functional modules, for example: a surface manager, media libraries, 3D graphics processing libraries (for example, OpenGL ES), and 2D graphics engines (for example, SGL).
  • the surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as static image files, etc.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, composition, and layer processing.
  • 2D Graphics Engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • The kernel layer contains at least a display driver, a camera driver, an audio driver, and a sensor driver.
  • the server 200 includes at least one processor 210, at least one memory 220, and at least one communication interface 230.
  • the server 200 may also include an output device and an input device, which are not shown in the figure.
  • the processor 210, the memory 220 and the communication interface 230 are connected through a bus.
  • The processor 210 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the program of the present application.
  • The processor 210 may also include multiple CPUs, and the processor 210 can be a single-CPU processor or a multi-CPU processor.
  • a processor here may refer to one or more devices, circuits, or processing cores for processing data (eg, computer program instructions).
  • The processor 210 may also include a preprocessing module, a basic unit building module, an association ring building module, a dependency module, an SDK identification module, and an SDK analysis module.
  • the functions implemented by these modules in the server 200 are similar to the functions implemented by the corresponding modules in the terminal shown in FIG. 3. This article will not elaborate on the specific functions of each module in the server 200.
  • The memory 220 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM) or another type of dynamic storage device that can store information and instructions. It can also be an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disk storage, optical disc storage (including compressed optical discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.
  • The memory 220 may exist independently and be connected to the processor 210 through a bus. The memory 220 may also be integrated with the processor 210. The memory 220 is used to store the application code for executing the solution of the present application, and the processor 210 controls its execution.
  • the communication interface 230 can be used to communicate with other devices or communication networks, such as Ethernet, wireless local area networks (WLAN), etc.
  • The server 200 can receive, through the communication interface 230, a request from the terminal 100 to call the terminal 200, and establish a signaling channel with the terminal 100 and the terminal 200 that is used to exchange control-related information, for example to transfer network call instructions.
  • Output devices communicate with the processor and can display information in a variety of ways.
  • The output device may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, etc.
  • Input devices communicate with the processor and can receive user input in a variety of ways.
  • the input device may be a mouse, a keyboard, a touch screen device or a sensing device, etc.
  • After obtaining the installation package of the target application, the terminal decompresses the installation package and obtains the code file of the target application from the decompressed files.
  • Take an Android application package (APK) as an example. The terminal decompresses the APK installation package, obtains the .dex file from the decompressed files, and disassembles the .dex file to obtain the smali code.
  • This step and the subsequent steps may be executed by the application store or browser in the terminal, or by a system service (such as a security service) or a third-party application (such as a decompression application or service, or a third-party security application or service) in the terminal.
  • When the application store or browser performs this step and the subsequent steps, this step can more specifically be performed by the preprocessing module in Figure 3.
  • When a system service of the terminal (such as a security service) or a third-party application (such as a decompression application or service, or a third-party security application or service) performs this step and the subsequent steps, the system service or third-party application may have a structure similar to that in FIG. 3.
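  • The decompression step can be sketched as follows. This is a minimal illustration, assuming only that an APK is a ZIP archive whose code files end in `.dex`; the function names `list_dex_entries` and `build_toy_apk` are hypothetical, and disassembly to smali (which needs an external disassembler) is deliberately left out.

```python
import io
import zipfile


def list_dex_entries(apk_bytes):
    """Return the names of the .dex code files inside an APK (a ZIP archive)."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        return sorted(name for name in apk.namelist() if name.endswith(".dex"))


def build_toy_apk():
    """Build an in-memory ZIP that mimics an APK layout (demonstration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("AndroidManifest.xml", b"")
        archive.writestr("classes.dex", b"placeholder")
        archive.writestr("classes2.dex", b"placeholder")
        archive.writestr("res/layout/main.xml", b"")
    return buf.getvalue()


if __name__ == "__main__":
    print(list_dex_entries(build_toy_apk()))
```

  • Reading the archive from memory keeps the sketch free of file-system assumptions; a real terminal would open the downloaded installation package instead of the toy archive.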
  • a basic unit is a package, and the root directory of the basic unit includes one or more class files.
  • the class files corresponding to the basic unit are all class files in the root directory of the basic unit. It should be emphasized that the class files corresponding to the basic unit do not include the class files in the sub-packages of the basic unit.
  • the root directory of package 3 includes class file a.
  • Package 3 is determined to be a basic unit (the package determined to be a basic unit is marked with a five-pointed star in the figure), corresponding to class file a.
  • the root directory of package 4 includes class file b and class file c.
  • Package 4 is determined as a basic unit, corresponding to class file b and class file c.
  • the root directory of package 5 includes class file e and package 6 (that is, package 6 is a sub-package of package 5).
  • Package 5 is determined to be a basic unit corresponding to class file e.
  • the root directory of package 6 includes class file d, and package 6 is determined as a basic unit corresponding to class file d.
  • the root directory of package 7 includes class file f and package 8 (that is, package 8 is a sub-package of package 7).
  • Package 7 is determined to be a basic unit corresponding to class file f.
  • the root directories of package 1, package 2 and package 8 do not contain class files and are not basic units.
  • It is generally assumed that class files belonging to the same package were developed during the same time period, and class files developed during the same time period are considered to belong to the same SDK. Developers generally concentrate their time on developing a particular SDK function, so class files from the same time period belong to the same SDK with high probability. Note that the development time of a class file here does not refer to the time when the class file was actually created; it refers to the temporal ordering between class files inferred by an analysis method. The analysis methods include the analysis of class-file path relationships described here and the analysis of class-file dependencies described below.
  • this step can be performed by the basic unit building module in Figure 3.
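  • The rule for determining basic units can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the function name `find_basic_units` and the example paths are hypothetical, and the nesting of packages 3 to 7 under packages 1 and 2 is an assumption, since the figure itself is not reproduced here.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_basic_units(class_paths):
    """Map every package that directly contains class files to those files.

    A package whose root directory holds no class file never appears as a key,
    and class files in a sub-package are attributed to the sub-package only.
    """
    units = defaultdict(list)
    for path in class_paths:
        p = PurePosixPath(path)
        units[str(p.parent)].append(p.name)
    return {package: sorted(files) for package, files in units.items()}


if __name__ == "__main__":
    # Hypothetical layout loosely following the package/class-file example above.
    paths = [
        "pkg1/pkg3/a.smali",
        "pkg1/pkg4/b.smali",
        "pkg1/pkg4/c.smali",
        "pkg2/pkg5/e.smali",
        "pkg2/pkg5/pkg6/d.smali",
        "pkg2/pkg7/f.smali",
    ]
    for package, files in sorted(find_basic_units(paths).items()):
        print(package, files)
```

  • Because the grouping key is the immediate parent directory, package 5 corresponds only to class file e, and class file d is attributed to package 6, matching the rule that sub-package class files are excluded.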
  • the class file in the code file contains the business execution logic of the target application, which contains the dependencies between the classes in the class file.
  • Dependencies between classes include but are not limited to inheritance relationships, calling relationships, parameter reference relationships, and return relationships.
  • The calling relationship, parameter reference relationship, and return relationship between classes can be determined indirectly through the relationships between classes and methods (such as the parameter reference relationship and the return relationship) and the relationships between methods (such as the calling relationship). For example, if method 0 in class A calls method 1 in class B, then class A and class B have a calling relationship. For another example, if the parameters referenced by method 2 under class C include class D, then class C and class D have a parameter reference relationship.
  • For another example, if a method under class C returns class E, then class C and class E have a return relationship. It can be seen that the embodiments of the present application consider more kinds of dependencies, which helps establish more dependencies between classes and overcomes the shortcoming of establishing associations between basic units solely from direct dependencies between classes. This helps improve the identification accuracy of the SDK.
  • The dependency relationship between classes also reflects the temporal ordering of the development times of the classes. For example, if class A (the subclass) inherits class B (the parent class), then the development time of class B is earlier than the development time of class A. For another example, if a method of class A calls a method of class B, then the development time of class A is later than the development time of class B. For another example, if class A is used as a parameter of a method in class B, then the development time of class A is earlier than the development time of class B. For another example, if class A is the return value type of a method of class B, then the development time of class A is earlier than the development time of class B. Note that the development time of a class here is not the time when the class was actually created, but the temporal ordering between classes inferred from the dependency relationships between them.
  • The dependency relationship between class files is consistent with the dependency relationship between the classes in the class files. The dependency relationship of class files therefore also reflects the temporal ordering of the development times of class files. Again, the development time of a class file is not the time when the class file was actually created, but the temporal ordering between class files inferred by an analysis method.
  • the analysis methods include analysis from the dependency relationships between classes in the class files here, and analysis through the path relationships of the class files above.
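  • The four kinds of class dependency and the ordering they imply can be sketched as follows. This is a rough illustration that assumes the code has already been disassembled to smali text; the function names and the regular expression are hypothetical, and only the `.class`, `.super`, `.method`, and `invoke-*` lines of smali are inspected.

```python
import re

# A smali reference type looks like Lcom/example/Foo;
TYPE = re.compile(r"L([\w/$]+);")


def class_dependencies(smali_text):
    """Return (owner, deps) where deps is a set of (kind, other_class) pairs."""
    owner, deps = None, set()
    for raw in smali_text.splitlines():
        line = raw.strip()
        if line.startswith(".class"):
            match = TYPE.search(line)
            owner = match.group(1) if match else None
        elif line.startswith(".super"):
            match = TYPE.search(line)
            if match:
                deps.add(("inherits", match.group(1)))
        elif line.startswith(".method"):
            signature = line.split()[-1]          # e.g. foo(Lx/C;I)Lx/D;
            params, _, ret = signature.partition(")")
            deps.update(("param", t) for t in TYPE.findall(params))
            deps.update(("returns", t) for t in TYPE.findall(ret))
        elif line.startswith("invoke-"):
            target = line.split("},")[-1].split("->")[0]
            match = TYPE.search(target)
            if match:
                deps.add(("calls", match.group(1)))
    return owner, {(kind, cls) for kind, cls in deps if cls != owner}


def earlier_than(owner, deps):
    """Every class that owner depends on is treated as developed earlier."""
    return {(other, owner) for _, other in deps}


if __name__ == "__main__":
    sample = """.class public Lcom/a/A;
.super Lcom/b/B;
.method public foo(Lcom/c/C;I)Lcom/d/D;
    invoke-virtual {v0, v1}, Lcom/e/E;->run(I)V
.end method
"""
    owner, deps = class_dependencies(sample)
    print(owner, sorted(deps))
    print(sorted(earlier_than(owner, deps)))
```

  • In all four cases described above, the class being inherited, called, taken as a parameter, or returned is the earlier one, which is why `earlier_than` can ignore the dependency kind when producing (earlier, later) pairs.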
  • The association between basic units is consistent with the dependency between the class files corresponding to the basic units. In other words, the temporal relationship between basic units is consistent with the temporal relationship of the class files corresponding to those basic units. The association between basic units is therefore directional.
  • For example, if basic unit 1#class file A → basic unit 2#class file B, then basic unit 1 → basic unit 2. That is, if the development time of class file A in basic unit 1 is earlier than the development time of class file B in basic unit 2, then the development time of basic unit 1 is earlier than the development time of basic unit 2.
  • For another example, if basic unit 1#class file A → basic unit 2#class file B and basic unit 2#class file C → basic unit 1#class file B, then basic unit 1 → basic unit 2 and basic unit 2 → basic unit 1, that is, basic unit 1 ↔ basic unit 2. In this case, basic unit 1 and basic unit 2 can be considered to have been developed at the same time, and can therefore be considered to belong to the same SDK.
  • the dependency relationship between classes in the class file can be identified by the dependency relationship module in Figure 3. Then, the association ring building module determines the dependency relationship between class files based on the dependency relationship between classes in the class file, and further determines the association relationship between basic units.
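  • Lifting class-file dependencies to associations between basic units can be sketched as follows. This is a minimal illustration under the assumption that class names are slash-separated paths and that a class's basic unit is its immediate package; `unit_of_class` and `unit_associations` are hypothetical names.

```python
def unit_of_class(class_name):
    """Return the immediate package of a slash-separated class name."""
    return class_name.rpartition("/")[0]


def unit_associations(class_edges):
    """Lift (earlier_class, later_class) pairs to (earlier_unit, later_unit) pairs.

    Dependencies between two class files of the same basic unit are dropped,
    because they say nothing about the relationship between different units.
    """
    edges = set()
    for earlier, later in class_edges:
        u, v = unit_of_class(earlier), unit_of_class(later)
        if u != v:
            edges.add((u, v))
    return edges


if __name__ == "__main__":
    class_edges = {
        ("unit1/A", "unit2/B"),   # unit 1 precedes unit 2
        ("unit2/C", "unit1/D"),   # unit 2 precedes unit 1
        ("unit1/A", "unit1/D"),   # same unit, ignored
    }
    print(sorted(unit_associations(class_edges)))
```

  • When both directions appear between two units, as in the example, the two units form the smallest possible closed loop.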
  • S504. Determine the SDK included in the target application based on the association between basic units.
  • an SDK includes multiple basic units whose relationships form a closed loop.
  • If two closed loops share the same basic unit, the two closed loops can be merged into one closed loop, that is, they are determined to belong to the same SDK.
  • FIG. 7 is a schematic structural diagram of some packages and class files in the code file of the target application.
  • After the terminal executes the above steps S501 to S504, four closed loops formed by the associations between basic units are obtained, as shown in Figure 7, namely association ring 1, association ring 2, association ring 3, and association ring 4.
  • Association ring 2 and association ring 3 share a basic unit and can be merged into one association ring.
  • As a result, three SDKs can be identified, namely SDK1, SDK2, and SDK3 as shown in Figure 7.
  • The SDK identification module in Figure 3 can determine the SDKs included in the target application based on the associations between basic units identified by the association ring building module.
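  • Grouping basic units into SDKs can be sketched as follows. The text does not name an algorithm; this sketch assumes that "closed loops, merged when they share a basic unit" can be computed as groups of mutually reachable basic units (the strongly connected components of the association graph). The function names and the unit labels `u1` to `u8` are hypothetical, and basic units that lie on no closed loop are simply left out, since the text does not say how they are handled.

```python
def reachable(graph, start):
    """Return every node reachable from start by following at least one edge."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def group_into_sdks(edges):
    """Group basic units so that units on a common closed loop share one SDK."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    reach = {node: reachable(graph, node) for node in graph}
    sdks, assigned = [], set()
    for node in sorted(graph):
        if node in assigned:
            continue
        # Units that can reach node and be reached from it lie on a loop with it.
        group = {other for other in reach[node] if node in reach[other]}
        if len(group) >= 2:
            sdks.append(sorted(group))
            assigned |= group
    return sdks


if __name__ == "__main__":
    edges = {
        ("u1", "u2"), ("u2", "u1"),                  # ring 1
        ("u3", "u4"), ("u4", "u3"),                  # ring 2
        ("u4", "u5"), ("u5", "u4"),                  # ring 3, shares u4 with ring 2
        ("u6", "u7"), ("u7", "u8"), ("u8", "u6"),    # ring 4
        ("u2", "u3"),                                # one-way association only
    }
    for sdk in group_into_sdks(edges):
        print(sdk)
```

  • The one-way association from `u2` to `u3` does not merge the first two groups, which reflects the idea that a one-directional dependency alone does not place two basic units in the same SDK.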
  • In step S504 it is determined that an SDK includes multiple basic units. Each basic unit corresponds to one or more class files, and the code in those class files is considered to be the code corresponding to that SDK.
  • analyze the code files corresponding to the SDK and identify sensitive behaviors of the SDK, such as calling sensitive APIs and executing sensitive broadcast behaviors.
  • the sensitive API is used to obtain sensitive data, such as obtaining the location of the terminal, connecting to the network, accessing the photo album in the terminal, etc.
  • The SDK analysis module in Figure 3 can identify the sensitive behaviors contained in each SDK based on the information about the SDKs identified by the SDK identification module. Furthermore, the SDK analysis module expresses the identified sensitive behaviors of each SDK in natural language to form a risk report (also called a security prompt) for the SDK. The report is presented to the user so that the user can decide whether to install the application, or whether to change the privacy-related system settings for the application after installing it.
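  • Identifying sensitive API calls per SDK and phrasing them as a short report can be sketched as follows. This is an illustration only: the `SENSITIVE_APIS` table is a hypothetical, deliberately tiny sample (the text does not enumerate which APIs count as sensitive), the wording of the report is invented for the example, and the input is assumed to be smali text grouped by SDK.

```python
# Hypothetical sample table: smali call target -> plain-language description.
SENSITIVE_APIS = {
    "Landroid/location/LocationManager;->getLastKnownLocation":
        "reads the location of the terminal",
    "Landroid/telephony/TelephonyManager;->getDeviceId":
        "reads the device identifier",
}


def sdk_risk_report(sdk_name, smali_by_class):
    """Return a one-line security prompt for one SDK."""
    findings = set()
    for text in smali_by_class.values():
        for raw in text.splitlines():
            line = raw.strip()
            if not line.startswith("invoke-"):
                continue
            for api, description in SENSITIVE_APIS.items():
                if api + "(" in line:
                    findings.add(description)
    if not findings:
        return f"{sdk_name}: no sensitive API calls found."
    return f"{sdk_name}: " + "; ".join(sorted(findings)) + "."


if __name__ == "__main__":
    smali = {
        "com/x/Tracker": (
            "invoke-virtual {v0, v1}, Landroid/location/LocationManager;"
            "->getLastKnownLocation(Ljava/lang/String;)Landroid/location/Location;"
        ),
    }
    print(sdk_risk_report("SDK1", smali))
```

  • Reporting per SDK rather than per application is the point of the preceding identification steps: the same sensitive call carries a different meaning depending on which SDK makes it.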
  • In summary, the embodiment of the present application first constructs basic units according to the path relationships of the class files in the target application, determines the associations between basic units according to the dependencies between the class files corresponding to the basic units, and identifies the SDKs according to those associations. Since the SDK identification method provided by the embodiment of the present application no longer relies on the package names or the structural characteristics of the packages in the target application, it can overcome the impact of package-name obfuscation and package-structure flattening in the target application and accurately identify the SDKs included in the target application.
  • In addition, in the embodiment of the present application, a package whose root directory includes class files is determined as a basic unit, and association rings are built in units of basic units.
  • Compared with building association rings in units of class files, this method avoids the association rings becoming overly complex because of the large number of dependencies among the class files in the code file. It can be seen that the embodiment of the present application chooses an appropriate basic unit for building association rings, which preserves the accuracy of SDK identification while also improving its efficiency.
  • the target application generally includes program code from two sources, namely program code developed by the application developer itself, and program code provided by integrated third-party application developers (usually one or more SDKs). It is understandable that, under normal circumstances, the risk of program code developed by application developers themselves is lower, while the risk of program code provided by third-party application developers is uncontrollable and higher. Therefore, in some embodiments, the terminal can also perform SDK identification and/or analysis of sensitive behaviors in the SDK for the program codes from the two sources.
  • FIG. 8 is a schematic flow chart of another method for identifying SDKs in an application provided by the embodiment of the present application.
  • the process includes:
  • After obtaining the installation package of the target application, the terminal decompresses the installation package and obtains the code file and configuration file of the target application from the decompressed files.
  • Take an Android application package (APK) as an example. The terminal decompresses the APK installation package and obtains the .dex file and the AndroidManifest.xml file from the decompressed files.
  • the .dex file is a code file, and further the .dex file can be disassembled to obtain the smali code.
  • The AndroidManifest.xml file is a configuration file. It describes each component of the target application, including activities, services, broadcast receivers, and content providers, and declares the class files that implement each component, etc.
  • The first category of class files consists of the class files declared in the configuration file. They are generally considered to be program code that the application developer wrote for the target application. The sensitive behavior of program code that application developers write themselves is usually lower risk.
  • In some embodiments, the collection of first-category class files may be determined as one SDK.
  • The class files corresponding to this SDK are all the first-category class files in the target application.
  • Alternatively, in other embodiments, the same method used for the second-category class files may be executed on the first-category class files to identify multiple SDKs among the first-category class files.
  • the second category of class files refers to class files that are not declared in the configuration file. They are generally considered to be SDKs developed by third-party service providers. The sensitive behavior of these SDKs should be focused on, as the risks are higher.
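  • Splitting class files into the two categories can be sketched as follows. This is a minimal illustration that assumes the manifest has already been decoded into plain-text XML (inside an APK it is stored in a binary form) and that component classes are named by the `android:name` attribute of `activity`, `service`, `receiver`, and `provider` elements; the function names and the example class names are hypothetical.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
COMPONENT_TAGS = ("activity", "service", "receiver", "provider")


def declared_classes(manifest_xml):
    """Return the fully qualified class names declared as components."""
    root = ET.fromstring(manifest_xml)
    package = root.get("package", "")
    names = set()
    for tag in COMPONENT_TAGS:
        for element in root.iter(tag):
            name = element.get(ANDROID_NS + "name")
            if not name:
                continue
            # A leading dot means the name is relative to the manifest package.
            names.add(package + name if name.startswith(".") else name)
    return names


def split_by_declaration(all_classes, manifest_xml):
    """Return (first_category, second_category) lists of class names."""
    declared = declared_classes(manifest_xml)
    first = sorted(c for c in all_classes if c in declared)
    second = sorted(c for c in all_classes if c not in declared)
    return first, second


if __name__ == "__main__":
    manifest = """<manifest
        xmlns:android="http://schemas.android.com/apk/res/android"
        package="com.example.app">
      <application>
        <activity android:name=".MainActivity"/>
        <service android:name="com.example.app.SyncService"/>
      </application>
    </manifest>"""
    classes = ["com.example.app.MainActivity", "com.example.app.SyncService", "a.b.c"]
    print(split_by_declaration(classes, manifest))
```

  • Only the second list would then be passed on to the basic-unit and association-ring steps, while the first list can be treated as a single SDK or analyzed in the same way.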
  • a basic unit is a package, and the root directory of the basic unit includes one or more class files of the second type.
  • the class files corresponding to the basic unit are all the second class class files in the root directory of the basic unit. It should be emphasized that the class files corresponding to the basic unit do not include the class files in the sub-packages of the basic unit.
  • S805. Determine the SDK included in the target application according to the association between basic units.
  • an SDK includes multiple basic units whose relationships form a closed loop.
  • If two closed loops share the same basic unit, the two closed loops can be merged into one closed loop, that is, they are determined to belong to the same SDK.
  • S806. Determine the sensitive behavior of each SDK according to the class files included in each SDK in the target application.
  • The sensitive behaviors analyzed here can cover two types of SDKs. One type is the SDK identified in step S802 (that is, the SDK formed by the collection of first-category class files), and the other type is the SDKs identified in step S805 (that is, the multiple SDKs obtained by analyzing the second-category class files).
  • The risks of the two types of SDKs may be prompted separately.
  • For other details of steps S801 to S806, refer to the corresponding descriptions of steps S501 to S505, which are not repeated here.
  • the chip system includes at least one processor 1101 and at least one interface circuit 1102.
  • the processor 1101 and the interface circuit 1102 may be interconnected by wires.
  • interface circuitry 1102 may be used to receive signals from other devices, such as the memory of terminal 100.
  • interface circuit 1102 may be used to send signals to other devices (eg, processor 1101).
  • the interface circuit 1102 can read instructions stored in the memory and send the instructions to the processor 1101.
  • When the instructions are executed by the processor 1101, the terminal can be caused to perform the various steps performed by the terminal 100 (such as a mobile phone) in the above embodiments.
  • the chip system may also include other discrete devices, which are not specifically limited in the embodiments of this application.
  • Embodiments of the present application also provide a device, which is included in a terminal or a server, and has the function of realizing the behavior of the terminal or server in any of the methods in the above embodiments.
  • This function can be implemented by hardware, or it can be implemented by hardware executing corresponding software.
  • the hardware or software includes at least one module or unit corresponding to the above functions. For example, communication modules or units, storage modules or units, and processing modules or units, etc.
  • Embodiments of the present application also provide a computer storage medium that includes computer instructions.
  • When the computer instructions are run on a terminal or server, the terminal or server is caused to perform any of the methods in the above embodiments.
  • Embodiments of the present application also provide a computer program product.
  • When the computer program product is run on a computer, it causes the computer to perform any of the methods in the above embodiments.
  • the above-mentioned terminal or server includes hardware structures and/or software modules corresponding to each function.
  • The embodiments of the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered to be beyond the scope of the embodiments of the present invention.
  • Embodiments of the present application can divide the terminal or server into functional modules according to the above method examples.
  • each functional module can be divided corresponding to each function, or two or more functions can be integrated into one processing module.
  • the above integrated modules can be implemented in the form of hardware or software function modules. It should be noted that the division of modules in the embodiment of the present invention is schematic and is only a logical function division. In actual implementation, there may be other division methods.
  • Each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software functional units.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solutions of the embodiments of the present application, in essence, or the part that contributes to the existing technology, or all or part of the technical solutions, can be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of this application.
  • the aforementioned storage media include: flash memory, mobile hard disk, read-only memory, random access memory, magnetic disk or optical disk and other media that can store program codes.


Abstract

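A method, terminal, and server for identifying SDKs in an application, relating to the field of electronic technology, which can overcome the impact of package-name obfuscation and package-structure flattening in a target application and improve the accuracy of identifying the SDKs contained in the target application. The method includes: obtaining the code file of the target application from the installation package of the target application; determining, according to the path relationships of the class files in the code file, the basic units contained in the target application and the class files corresponding to each basic unit; establishing associations between the basic units according to the dependencies between the class files corresponding to each basic unit; and determining the SDKs contained in the target application according to the associations between the basic units, where multiple basic units whose associations form a closed loop belong to one SDK.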

Description

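Method, Terminal, and Server for Identifying SDKs in an Application
This application claims priority to Chinese patent application No. 202210225699.6, entitled "Method, Terminal, and Server for Identifying SDKs in an Application", filed with the China National Intellectual Property Administration on March 7, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of electronic technology, and in particular to a method, terminal, and server for identifying SDKs in an application.
Background
With the rapid development of the mobile Internet, a wide variety of applications can be installed on terminals to provide users with rich functions. To improve development efficiency and reduce costs, application developers can integrate software development kits (SDKs) developed by third-party application developers, such as open-source third-party SDKs. As third-party SDKs are used on a large scale, certain risk factors have also been introduced, such as privacy leaks and security vulnerabilities.
An SDK integrated in an application generally exists in the application code in the form of packages. Therefore, the SDKs that may be included in the application, also called potential SDKs, can be determined by detecting the package names or package structure in the application code. However, some third-party application developers use anti-analysis techniques to evade this detection method. For example, they arbitrarily convert the package names of the SDK into irregular, hard-to-read, low-entropy strings, that is, they obfuscate the package names of the SDK. Or they arbitrarily place sub-packages of the SDK in the root directory of the application or under other packages, which flattens the package structure of the SDK and destroys the normal package structure. In these cases, if the above detection method is still used, the potential SDKs in the application code may not be detected.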
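Summary
The method, terminal, and server for identifying SDKs in an application provided by this application can overcome the impact of package-name obfuscation and package-structure flattening in a target application and improve the accuracy of identifying the SDKs contained in the target application.
To achieve the above objective, the embodiments of this application provide the following technical solutions:
In a first aspect, a method for identifying SDKs in an application is provided. The method includes: obtaining the code file of a target application from the installation package of the target application; determining, according to the path relationships of the class files in the code file of the target application, the basic units contained in the target application and the class files corresponding to each basic unit; establishing associations between the basic units according to the dependencies between the class files corresponding to each basic unit; and determining the SDKs contained in the target application according to the associations between the basic units, where multiple basic units whose associations form a closed loop belong to one SDK.
In summary, the embodiments of this application first construct basic units according to the path relationships of the class files in the target application, determine the associations between basic units according to the dependencies between the class files corresponding to the basic units, and identify the SDKs according to those associations. Since the SDK identification method provided by the embodiments of this application no longer relies on the package names or the structural characteristics of the packages in the target application, it can overcome the impact of package-name obfuscation and package-structure flattening in the target application and accurately identify the SDKs contained in the target application.
In addition, the embodiments of this application determine that multiple basic units whose associations form a closed loop belong to the same SDK. This avoids the situation in which a malicious SDK that inherits from or calls another SDK in one direction only is mistakenly identified as the same SDK as that other SDK, so that the malicious SDK goes unidentified or the risk of its sensitive behavior is underestimated. It can be seen that the SDK identification method provided by the embodiments of this application can distinguish different SDKs more accurately and improve the reliability of risk prompts.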
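In a possible implementation, determining, according to the path relationships of the class files in the code file of the target application, the basic units contained in the target application and the class files corresponding to each basic unit includes: according to the path relationships of the class files in the code file of the target application, determining a package whose root directory includes one or more class files as a basic unit, where the class files corresponding to the basic unit are all the class files in the root directory of the basic unit.
It should be emphasized that the class files corresponding to a basic unit do not include the class files in the sub-packages of the basic unit.
It is generally assumed that class files belonging to the same package were developed during the same time period, and class files developed during the same time period are considered to belong to the same SDK. Developers generally concentrate their time on developing a particular SDK function, so class files from the same time period belong to the same SDK with very high probability. Note that the development time of a class file here does not refer to the time when the class file was actually created; it refers to the temporal ordering between class files inferred by an analysis method. The analysis methods include the analysis of class-file path relationships described here and the analysis of class-file dependencies described below.
In summary, in the embodiments of this application, a package whose root directory includes class files is determined as a basic unit, and association rings are built in units of basic units. Compared with building association rings in units of class files, this method avoids the association rings becoming overly complex because of the large number of dependencies among the class files in the code file. It can be seen that the embodiments of this application choose an appropriate basic unit for building association rings, which preserves the accuracy of SDK identification while also improving its efficiency.
In a possible implementation, the dependencies between the class files corresponding to the basic units are determined according to the dependencies between the classes in each class file, and the dependencies between classes include one or more of an inheritance relationship, a calling relationship, a parameter reference relationship, and a return relationship.
The class files in the code file contain the business execution logic of the target application, which includes the dependencies between the classes in the class files. The dependencies between classes include but are not limited to the inheritance relationship, calling relationship, parameter reference relationship, and return relationship. The calling relationship, parameter reference relationship, and return relationship between classes can be determined indirectly through the relationships between classes and methods (such as the parameter reference relationship and the return relationship) and the relationships between methods (such as the calling relationship). It can be seen that the embodiments of this application consider more kinds of dependencies, which helps establish more dependencies between classes and overcomes the shortcoming of establishing associations between basic units solely from direct dependencies between classes. This helps improve the accuracy of SDK identification.
In a possible implementation, after determining the SDKs contained in the target application according to the associations between the basic units, the method further includes: identifying, according to the class files contained in each SDK, the sensitive APIs called by each SDK, and generating a security prompt for each SDK, where a sensitive API is used to obtain sensitive data.
That is, taking the SDK as the unit, the code files corresponding to the SDK are analyzed to identify the sensitive behaviors of the SDK, such as calling sensitive APIs and performing sensitive broadcast behaviors. A sensitive API is used to obtain sensitive data, for example obtaining the location of the terminal, connecting to the network, or accessing the photo album in the terminal. The identified sensitive behaviors of each SDK are expressed in natural language to form a risk report (also called a security prompt) for the SDK, which is presented to the user so that the user can decide whether to install the application, or whether to change the privacy-related system settings for the application after installing it.
In a possible implementation, the method further includes: also obtaining the configuration file of the target application from the installation package of the target application; and identifying, according to the configuration file of the target application, first-category class files and second-category class files from the code file of the target application. Determining, according to the path relationships of the class files in the code file of the target application, the basic units contained in the target application and the class files corresponding to each basic unit is specifically: determining, according to the path relationships of the second-category class files in the code file of the target application, the basic units contained in the target application and the second-category class files corresponding to each basic unit.
The target application generally includes program code from two sources, namely program code developed by the application developer, and program code provided by integrated third-party application developers (usually one or more SDKs). Under normal circumstances, the risk of program code developed by the application developer is lower, while the risk of program code provided by third-party application developers is uncontrollable and higher. Therefore, the terminal can also perform SDK identification and/or analysis of sensitive behaviors in the SDKs separately for the program code from the two sources.
In a possible implementation, identifying, according to the configuration file of the target application, first-category class files and second-category class files from the code file of the target application includes: determining that the class files declared in the configuration file of the target application are first-category class files; and determining that the class files not declared in the configuration file of the target application are second-category class files.
The first-category class files are the class files declared in the configuration file. They are generally considered to be program code that the application developer wrote for the target application, and the sensitive behavior of such self-developed program code is usually lower risk. The second-category class files are the class files not declared in the configuration file. They are generally considered to be SDKs developed by third-party service providers. The sensitive behavior of these SDKs deserves particular attention, as the risk is higher.
Dividing class files with different risks into different SDKs makes it possible to evaluate the risks of different SDKs separately later, and helps users better understand the risks of the different SDKs in the target application. In addition, removing the lower-risk class files when identifying SDKs among the higher-risk class files helps improve the efficiency and accuracy of SDK identification.
In a possible implementation, after identifying the first-category class files and the second-category class files from the code file of the target application according to the configuration file of the target application, the method further includes: determining the first-category class files as one SDK in the target application.
In some embodiments, the collection of first-category class files may be determined as one SDK. The class files corresponding to this SDK are all the first-category class files in the target application. Alternatively, in other embodiments, the same method used for the second-category class files may be executed on the first-category class files to identify multiple SDKs among the first-category class files.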
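In a second aspect, a terminal is provided, including a processor and a memory. The memory is coupled to the processor and is used to store computer program code, and the computer program code includes computer instructions. When the processor reads the computer instructions from the memory, the terminal is caused to perform the method described in the above aspect and any of its possible implementations.
In a third aspect, a device is provided. The device is included in a terminal and has the function of realizing the behavior of the terminal in any of the methods in the above aspect and its possible implementations. This function can be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes at least one module or unit corresponding to the above function, for example a communication module or unit, a storage module or unit, and a processing module or unit.
In a fourth aspect, a computer-readable storage medium is provided, including computer instructions. When the computer instructions are run on a terminal, the terminal is caused to perform the method described in the above aspect and any of its possible implementations.
In a fifth aspect, a graphical user interface on a terminal is provided. The terminal has a display screen, a camera, a memory, and one or more processors, and the one or more processors are used to execute one or more computer programs stored in the memory. The graphical user interface includes the graphical user interface displayed when the terminal performs the method described in the above aspect and any of its possible implementations.
In a sixth aspect, a computer program product is provided. When the computer program product is run on a computer, the computer is caused to perform the method described in the above aspect and any of its possible implementations.
In a seventh aspect, a chip system is provided, including a processor. When the processor executes instructions, the processor performs the method described in the above aspect and any of its possible implementations.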
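Brief Description of the Drawings
Figure 1 is a schematic structural diagram of a communication system provided by an embodiment of this application;
Figure 2A is a schematic structural diagram of a terminal provided by an embodiment of this application;
Figure 2B is a schematic structural diagram of another terminal provided by an embodiment of this application;
Figure 3 is a schematic structural diagram of an application provided by an embodiment of this application;
Figure 4 is a schematic structural diagram of a server provided by an embodiment of this application;
Figure 5 is a schematic flow chart of a method for identifying SDKs in an application provided by an embodiment of this application;
Figure 6 is a schematic diagram of a method for determining basic units provided by an embodiment of this application;
Figure 7 is a schematic diagram of a method for identifying SDKs in an application provided by an embodiment of this application;
Figure 8 is a schematic flow chart of another method for identifying SDKs in an application provided by an embodiment of this application;
Figure 9 is a schematic structural diagram of a chip system provided by an embodiment of this application.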
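Detailed Description
In the embodiments of this application, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of this application should not be construed as being more preferred or more advantageous than other embodiments or designs. Rather, words such as "exemplary" or "for example" are intended to present the related concepts in a concrete manner.
Figure 1 shows a communication system provided by an embodiment of this application. The communication system includes a terminal 100 and a server 200. The server 200 provides application installation package download services for various terminals (such as mobile terminals), and is, for example, an application server. A wide variety of applications can be installed on the terminal 100, for example an application store (also called an application market) and a browser. The user can access the server 200 through the application store or browser on the terminal 100 and download the installation package of an application from the server 200.
In some embodiments, the method for identifying SDKs in an application provided by the embodiments of this application can be executed by the terminal 100. After the terminal downloads the installation package of an application from the application store or browser, and before or while the terminal installs the application, the terminal can identify the SDKs in the application installation package and analyze the sensitive behaviors in the SDKs (such as obtaining the location of the terminal, accessing the photo album of the terminal, or connecting to the network), so as to prompt the user about the possible risks of installing the application and help the user decide whether to install it. Alternatively, after the terminal installs the application, the terminal identifies the SDKs in the application installation package and analyzes the sensitive behaviors in the SDKs, so as to prompt the user about the possible risks of the application and ask the user whether to uninstall the application, or whether to change privacy and security settings for the application (for example, prohibiting the application from obtaining the location information of the terminal, prohibiting the application from accessing the photo album of the terminal, or prohibiting the application from connecting to Wi-Fi or the cellular network).
In other embodiments, the method for identifying SDKs in an application provided by the embodiments of this application can be executed by the server 200. After an application developer uploads the installation package of a developed application to the application server, the application server can analyze the installation package, identify the SDKs in it, and analyze the risks of the sensitive behaviors in the SDKs. When a terminal downloads the application from the application server, the application server can push the risks of the application to the terminal so that the user can decide whether to download it. Alternatively, when the application server sends the application installation package to the terminal, it also pushes the risks of the application to the terminal, so as to prompt the user to decide whether to install it, or whether to change privacy and security settings for the application.
The technical solutions provided by the embodiments of this application are described in detail below.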
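For example, the terminal 100 in the embodiments of this application may be a mobile phone, a tablet computer, a personal computer (PC), a personal digital assistant (PDA), a smart watch, a netbook, a wearable terminal, an augmented reality (AR) device, a virtual reality (VR) device, an in-vehicle device, a smart screen, a smart speaker, or the like. This application places no particular limitation on the specific form of the terminal.
Refer to Figure 2A, which shows a schematic structural diagram of the terminal 100.
The terminal 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It can be understood that the structure illustrated in the embodiments of this application does not constitute a specific limitation on the terminal 100. In other embodiments of this application, the terminal 100 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors.
A memory may also be provided in the processor 110 to store instructions and data. In some embodiments, the memory in the processor 110 is a cache. This memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby improving system efficiency.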
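The charging management module 140 is configured to receive charging input from a charger, which may be a wireless charger or a wired charger. The power management module 141 is configured to connect the battery 142, the charging management module 140, and the processor 110. The power management module 141 receives input from the battery 142 and/or the charging management module 140 and supplies power to the processor 110, the internal memory 121, the display screen 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 may also be used to monitor parameters such as battery capacity, battery cycle count, and battery health (leakage, impedance).
The wireless communication function of the terminal 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like.
The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. The mobile communication module 150 may provide wireless communication solutions applied on the terminal 100, including 2G/3G/4G/5G. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The mobile communication module 150 may receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and pass them to the modem processor for demodulation. The mobile communication module 150 may also amplify a signal modulated by the modem processor and radiate it as electromagnetic waves through the antenna 1. In some embodiments, at least some functional modules of the mobile communication module 150 may be provided in the processor 110. In some embodiments, at least some functional modules of the mobile communication module 150 may be provided in the same device as at least some modules of the processor 110. The wireless communication module 160 may provide wireless communication solutions applied on the terminal 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR). The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves through the antenna 2, frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110. The wireless communication module 160 may also receive signals to be sent from the processor 110, frequency-modulate and amplify them, and radiate them as electromagnetic waves through the antenna 2.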
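The terminal 100 implements the display function through the GPU, the display screen 194, the application processor, and the like.
The terminal 100 may implement the shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194, the application processor, and the like.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the terminal 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example saving files such as music and videos on the external memory card.
The internal memory 121 may be used to store computer-executable program code, which includes instructions. The internal memory 121 may include a program storage area and a data storage area. The program storage area may store an operating system and the applications required by at least one function (such as a sound playback function or an image playback function). The data storage area may store data created during use of the terminal 100 (such as audio data and a phone book). In addition, the internal memory 121 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or universal flash storage (UFS). The processor 110 executes the various functional applications and data processing of the terminal 100 by running instructions stored in the internal memory 121 and/or instructions stored in the memory provided in the processor.
The terminal 100 may implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headset jack 170D, the application processor, and the like.
The button 190 includes a power button, volume buttons, and the like. The button 190 may be a mechanical button or a touch button. The terminal 100 may receive button input and generate key signal input related to user settings and function control of the terminal 100.
The motor 191 may generate vibration alerts. The motor 191 may be used for incoming call vibration alerts and for touch vibration feedback. For example, touch operations on different applications (such as photographing or audio playback) may correspond to different vibration feedback effects. For touch operations on different areas of the display screen 194, the motor 191 may also produce different vibration feedback effects. Different application scenarios (such as time reminders, receiving messages, alarm clocks, and games) may also correspond to different vibration feedback effects. The touch vibration feedback effect may also be customized.
The indicator 192 may be an indicator light, which may be used to indicate the charging status and battery level changes, and may also be used to indicate messages, missed calls, notifications, and the like.
The SIM card interface 195 is used to connect a SIM card. A SIM card can be brought into contact with or separated from the terminal 100 by inserting it into or removing it from the SIM card interface 195. The terminal 100 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1. In some embodiments, the terminal 100 uses an eSIM, that is, an embedded SIM card. The eSIM card may be embedded in the terminal 100 and cannot be separated from the terminal 100.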
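The software system of the terminal 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. The embodiments of this application take the Android system with a layered architecture as an example to illustrate the software structure of the terminal 100.
Refer to Figure 2B, which is a block diagram of the software structure of the terminal 100 in an embodiment of this application.
The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate through software interfaces. In some embodiments, the Android system is divided into four layers, which from top to bottom are the application layer, the application framework layer, the Android runtime and system libraries, and the kernel layer.
1. Application layer
As shown in Figure 2B, the application layer may include a series of application packages, for example an application store and a browser. The application layer may of course also include applications such as calendar, gallery, music, weather, maps, and video (not shown in the figure).
The application store/browser allows the user to download other applications. In some embodiments, the application store/browser may perform the method for identifying an SDK in an application provided by the embodiments of this application.
Figure 3 is a schematic structural diagram of an application provided by an embodiment of this application. The application store/browser may adopt the application structure shown in Figure 3, which specifically includes a preprocessing module, a basic unit construction module, an association ring construction module, a dependency module, an SDK identification module, and an SDK analysis module. It can be understood that Figure 3 is only a schematic of the structure of the application store/browser. In practice, the application store/browser may include more or fewer modules than in Figure 3, combine some modules, or split some modules. The specific functions of the modules in Figure 3 are described in detail below in conjunction with specific embodiments and are not explained here.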
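2. Application framework layer
The application framework layer provides an application programming interface (API) and a programming framework for the applications in the application layer. The application framework layer includes some predefined functions.
As shown in Figure 2B, the application framework layer may include a window manager, a content provider, a view system, a telephony manager, a resource manager, a notification manager, and the like.
The window manager is used to manage window programs. It can obtain the display size, determine whether there is a status bar, lock the screen, take screenshots, and so on. The content provider is used to store and retrieve data and make the data accessible to applications. The data may include videos, images, audio, calls made and received, browsing history and bookmarks, the phone book, and the like. The view system includes visual controls, such as controls for displaying text and controls for displaying pictures. The view system can be used to build applications. A display interface may consist of one or more views. For example, a display interface that includes an SMS notification icon may include a view for displaying text and a view for displaying pictures. The telephony manager provides the communication functions of the terminal 100, for example management of call status (including connected, hung up, and so on). The resource manager provides applications with various resources, such as localized strings, icons, pictures, layout files, and video files. The notification manager enables an application to display notification information in the status bar. It can be used to convey informational messages that disappear automatically after a short stay without user interaction, for example to announce that a download has completed or to deliver a message reminder. A notification may also appear in the status bar at the top of the system in the form of a chart or scrolling text, such as a notification from an application running in the background, or appear on the screen in the form of a dialog window. Examples include prompting text information in the status bar, playing a prompt sound, vibrating the terminal, and flashing the indicator light.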
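3. Android runtime and system libraries
As shown in Figure 2B, the Android runtime includes core libraries and a virtual machine. The Android runtime is responsible for scheduling and management of the Android system. The core libraries consist of two parts: one part is the functions that the Java language needs to call, and the other part is the core libraries of Android. The application layer and the application framework layer run in the virtual machine. The virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine performs functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.
The system libraries may include multiple functional modules, for example a surface manager, media libraries, a three-dimensional graphics processing library (for example, OpenGL ES), and a 2D graphics engine (for example, SGL).
The surface manager manages the display subsystem and provides fusion of 2D and 3D layers for multiple applications. The media libraries support playback and recording of a variety of commonly used audio and video formats, as well as still image files. The media libraries can support a variety of audio, video, and image encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG. The three-dimensional graphics processing library is used to implement three-dimensional graphics drawing, image rendering, compositing, and layer processing. The 2D graphics engine is a drawing engine for 2D drawing.
4. Kernel layer
The kernel layer is the layer between hardware and software. The kernel layer contains at least a display driver, a camera driver, an audio driver, and a sensor driver.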
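Figure 4 is a schematic structural diagram of a server 200 provided by an embodiment of this application. The server 200 includes at least one processor 210, at least one memory 220, and at least one communication interface 230. Optionally, the server 200 may also include an output device and an input device, which are not shown in the figure.
The processor 210, the memory 220, and the communication interface 230 are connected through a bus. The processor 210 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits used to control the execution of the program of the solution of this application. The processor 210 may also include multiple CPUs, and the processor 210 may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor here may refer to one or more devices, circuits, or processing cores for processing data (for example, computer program instructions).
It can be understood that when the server 200 performs the method for identifying an SDK in an application provided by the embodiments of this application, the processor 210 may also include a preprocessing module, a basic unit construction module, an association ring construction module, a dependency module, an SDK identification module, and an SDK analysis module. The functions of these modules in the server 200 are similar to those of the corresponding modules in the terminal shown in Figure 3, so the specific functions of the modules in the server 200 are not described here.
The memory 220 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these. The memory 220 may exist independently and be connected to the processor 210 through the bus. The memory 220 may also be integrated with the processor 210. The memory 220 is used to store the application code for executing the solution of this application, and its execution is controlled by the processor 210.
The communication interface 230 may be used to communicate with other devices or communication networks, such as Ethernet or a wireless local area network (WLAN).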
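In the embodiments of this application, the server 200 may use the communication interface 230 to receive requests from the terminal 100, for example a request to download the installation package of an application, and to send the installation package and the risk information of the application to the terminal 100.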
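The output device communicates with the processor and can display information in a variety of ways. For example, the output device may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector. The input device communicates with the processor and can receive user input in a variety of ways. For example, the input device may be a mouse, a keyboard, a touch screen device, or a sensing device.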
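The technical solutions involved in the following embodiments can all be implemented in the terminal 100 or the server 200 described above. The following description takes the terminal 100 performing the technical solution provided by the embodiments of this application as an example. Figure 5 is a schematic flowchart of a method for identifying an SDK in an application provided by an embodiment of this application. The method includes:
S501. Obtain the code file of the target application from the installation package of the target application.
After obtaining the installation package of the target application, the terminal decompresses the installation package and obtains the code file of the target application from the decompressed files. Take an Android application package as an example: the APK installation package is decompressed, the .dex file is obtained from the decompressed files, and the .dex file is disassembled to obtain smali code.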
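The extraction part of step S501 can be sketched as follows. This is a minimal illustration, not the implementation described in this application: it relies only on the fact that an APK is a ZIP archive, and the helper names `list_dex_entries` and `build_demo_apk` as well as the demo archive contents are assumptions made for the example. Disassembling the .dex files into smali requires a separate tool (such as baksmali) and is not shown.

```python
import io
import zipfile


def list_dex_entries(apk_bytes):
    """Return the sorted names of the .dex code files stored in an APK (ZIP) archive."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        return sorted(name for name in apk.namelist() if name.endswith(".dex"))


def build_demo_apk():
    """Build a tiny in-memory archive that mimics an APK layout (hypothetical contents)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        apk.writestr("AndroidManifest.xml", b"<manifest/>")
        apk.writestr("classes.dex", b"placeholder dex bytes")
        apk.writestr("classes2.dex", b"placeholder dex bytes")
        apk.writestr("res/layout/main.xml", b"<LinearLayout/>")
    return buffer.getvalue()


if __name__ == "__main__":
    print(list_dex_entries(build_demo_apk()))
```

Working on bytes in memory keeps the sketch independent of the file system; a real preprocessing module would read the downloaded installation package instead.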
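In a specific embodiment, this step and the subsequent steps may be performed by the application store/browser in the terminal, or by a system service (for example, a security service) or a third-party application (for example, a decompression application or service, or a third-party security application or service) in the terminal. Specifically, after the terminal downloads the installation package of the target application from the application store/browser, the application store/browser may perform this step and the subsequent steps. More specifically, this step may be performed by the preprocessing module in Figure 3.
Alternatively, after the terminal downloads the installation package of the target application from the application store/browser, when it is detected that the terminal starts to install the target application, a system service of the terminal (for example, a security service) or another third-party application (for example, a decompression application or service, or a third-party security application or service) may perform this step and the subsequent steps. It can be understood that when the steps are performed by a system service or another third-party application of the terminal, that system service or third-party application may have a structure similar to that in Figure 3.
S502. Determine, according to the path relationships of the class files in the code file, the basic units contained in the target application and the class files corresponding to each basic unit.
A basic unit is a package whose root directory contains one or more class files. The class files corresponding to a basic unit are all the class files in the root directory of the basic unit. It should be emphasized that the class files corresponding to a basic unit do not include the class files in the subpackages of the basic unit.
For example, Figure 6 is a schematic diagram of the path relationships of some packages in the code file. The root directory of package 3 contains class file a, so package 3 is determined to be a basic unit (packages determined to be basic units are marked with a five-pointed star in the figure) corresponding to class file a. The root directory of package 4 contains class file b and class file c, so package 4 is determined to be a basic unit corresponding to class file b and class file c. The root directory of package 5 contains class file e and package 6 (that is, package 6 is a subpackage of package 5), so package 5 is determined to be a basic unit corresponding to class file e. The root directory of package 6 contains class file d, so package 6 is determined to be a basic unit corresponding to class file d. The root directory of package 7 contains class file f and package 8 (that is, package 8 is a subpackage of package 7), so package 7 is determined to be a basic unit corresponding to class file f. The root directories of package 1, package 2, and package 8 contain no class files, so these packages are not basic units.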
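The rule for basic units can be sketched as a grouping of class files by their immediate parent package. The function name `find_basic_units` and the concrete paths are assumptions for illustration; in particular, the text does not state how packages 1 to 8 are nested beyond the two subpackage relations it names, so the nesting below is hypothetical.

```python
from collections import defaultdict


def find_basic_units(class_paths):
    """Map every package that directly contains class files to those class files.

    A class file counts only toward its immediate parent package, so class
    files in a subpackage never belong to the parent package's basic unit.
    """
    units = defaultdict(list)
    for path in class_paths:
        package, _, class_name = path.rpartition("/")
        units[package].append(class_name)
    return {package: sorted(names) for package, names in units.items()}


# Hypothetical layout loosely following the Figure 6 description.
FIGURE_6_PATHS = [
    "pkg1/pkg2/pkg3/a",
    "pkg1/pkg2/pkg4/b",
    "pkg1/pkg2/pkg4/c",
    "pkg1/pkg5/e",
    "pkg1/pkg5/pkg6/d",
    "pkg1/pkg7/f",
]

if __name__ == "__main__":
    for package, names in sorted(find_basic_units(FIGURE_6_PATHS).items()):
        print(package, names)
```

Packages that hold only subpackages (such as `pkg1` and `pkg1/pkg2` here) never appear as keys, which matches the statement that packages without class files in their root directory are not basic units.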
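It is generally considered that class files belonging to the same package were developed in the same time period, and class files developed in the same time period are considered to belong to the same SDK. Developers generally concentrate on developing one SDK function at a time, so class files that belong to the same time period very probably belong to the same SDK. It should be noted that the development time of a class file here does not refer to the time when the class file was actually created. Rather, it refers to the temporal order between class files obtained through analysis, both by analyzing the path relationships of class files as described here and by analyzing the dependencies of class files as described below.
In a specific embodiment, when the application store/browser performs this step, the step may more specifically be performed by the basic unit construction module in Figure 3.
S503. Establish the association relationships between basic units according to the dependencies between the class files corresponding to the basic units.
The class files in the code file contain the business execution logic of the target application, which includes the dependencies between the classes in the class files. Dependencies between classes include, but are not limited to, inheritance relationships, call relationships, parameter reference relationships, and return relationships. The call, parameter reference, and return relationships between classes can be determined indirectly from the relationships between classes and methods (such as parameter reference and return relationships) and the relationships between methods (such as call relationships). For example, if method 0 in class A calls method 1 in class B, then class A and class B have a call relationship. As another example, if the parameters referenced by method 2 of class C include class D, then class C and class D have a parameter reference relationship. As another example, if the type of the return value of method 2 of class C is class E, then class C and class E have a return relationship. The embodiments of this application thus take more kinds of dependencies into account, which helps establish more dependencies between classes, overcomes the shortcoming of establishing the association relationships between basic units solely from direct dependencies between classes, and helps improve the accuracy of SDK identification.
It can be understood that the dependencies between classes also reflect the temporal order of their development. For example, if class A (the subclass) inherits from class B (the parent class), then class B was developed earlier than class A. As another example, if a method of class A calls a method of class B, then class A was developed later than class B. As another example, if class A is a parameter of a method in class B, then class A was developed earlier than class B. As another example, if class A is the type of a return value returned by a method of class B, then class A was developed earlier than class B. It should be noted that the development time of a class here is not the time when the class was actually created, but the temporal order between classes derived from their dependencies.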
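The four cases above share one rule: the class that is depended upon is treated as the earlier one. A minimal sketch of that rule follows; the `Dep` enumeration and the `earlier_later` function are names invented for this illustration and do not come from the application.

```python
from enum import Enum


class Dep(Enum):
    """Kinds of class-to-class dependency named in the text."""
    INHERITS = "inherits"  # source extends target
    CALLS = "calls"        # a method of source calls a method of target
    PARAM = "param"        # a method of source takes target as a parameter type
    RETURNS = "returns"    # a method of source returns a value of type target


def earlier_later(source, kind, target):
    """Return (earlier, later) for one dependency of `source` on `target`.

    For every dependency kind, the class being depended upon (target) is
    treated as developed no later than the class that depends on it (source).
    """
    if not isinstance(kind, Dep):
        raise TypeError("kind must be a Dep member")
    return (target, source)


if __name__ == "__main__":
    print(earlier_later("A", Dep.INHERITS, "B"))  # B is the parent class
    print(earlier_later("C", Dep.PARAM, "D"))     # D is a parameter type in C
```

Keeping the dependency kind in the signature, even though the ordering does not vary with it, leaves room for weighting or filtering kinds later; that extension is not described in the source.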
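As an example, the following is part of the code in a basic unit:
From the code above, the dependencies contained in the class files can be derived, as shown in Table 1:
Table 1
Further, the dependencies between class files are consistent with the dependencies between the classes in the class files. It can be understood that the dependencies of class files also reflect the temporal order of the development of the class files. Again, the development time of a class file here is not the time when the class file was actually created, but the temporal order between class files obtained through analysis, both by analyzing the dependencies between the classes in the class files as described here and by analyzing the path relationships of class files as described above.
Still further, the association relationships between basic units are consistent with the dependencies of the class files corresponding to the basic units. In other words, the temporal order between basic units is consistent with the temporal order of the class files corresponding to the basic units. It can be understood that the association relationships between basic units are directional.
For ease of description, the notation "basic unit#class file" is used here to denote the correspondence between a basic unit and a class file, and "class file/basic unit A<class file/basic unit B" is used to denote that class file/basic unit A was developed earlier than class file/basic unit B. The notation itself has no other meaning.
For example, if basic unit 1#class file A<basic unit 2#class file B, then basic unit 1<basic unit 2. That is, if class file A in basic unit 1 was developed earlier than class file B in basic unit 2, then basic unit 1 was developed earlier than basic unit 2.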
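Lifting the ordering from class files to basic units can be sketched as below. The function name `lift_to_units` and the dictionary-based lookup are assumptions for illustration. Skipping dependencies between class files of the same basic unit is also an assumption: the text does not discuss that case, and such a dependency would only relate a unit to itself.

```python
def lift_to_units(class_order, unit_of):
    """Turn (earlier_class, later_class) pairs into (earlier_unit, later_unit) edges.

    class_order: iterable of (earlier_class_file, later_class_file) pairs.
    unit_of:     mapping from class file name to the basic unit that holds it.
    """
    edges = set()
    for earlier, later in class_order:
        earlier_unit, later_unit = unit_of[earlier], unit_of[later]
        if earlier_unit != later_unit:  # assumption: ignore intra-unit dependencies
            edges.add((earlier_unit, later_unit))
    return edges


if __name__ == "__main__":
    unit_of = {"A": "unit1", "B": "unit2"}
    print(lift_to_units([("A", "B")], unit_of))
```

The result is a set of directed edges, which reflects the statement that the association relationships between basic units are directional.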
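As another example, if basic unit 1#class file A<basic unit 2#class file B and basic unit 2#class file C<basic unit 3#class file D, then basic unit 1<basic unit 2 and basic unit 2<basic unit 3, from which it can be inferred that basic unit 1<basic unit 3. That is, the association relationships between basic units are transitive. This transitivity helps extend the association relationships to more basic units and helps improve the accuracy of SDK identification.
As another example, if basic unit 1#class file A<basic unit 2#class file B and basic unit 2#class file C<basic unit 1#class file D, then basic unit 1<basic unit 2 and basic unit 2<basic unit 1, from which it can be inferred that basic unit 1=basic unit 2. That is, basic unit 1 and basic unit 2 can be considered to have been developed at the same time, and can therefore be considered to belong to the same SDK.
It follows that when the association relationships between multiple basic units form a closed ring, the transitivity of the association relationships implies that all the basic units in the closed ring can be considered to belong to the same SDK.
In a specific embodiment, when the application store/browser performs this step, more specifically, the dependency module in Figure 3 may identify the dependencies between the classes in the class files. The association ring construction module then determines the dependencies between class files according to the dependencies between the classes in the class files, and further determines the association relationships between basic units.
S504. Determine the SDKs contained in the target application according to the association relationships between basic units, where one SDK includes multiple basic units whose association relationships form a closed ring.
Optionally, in some embodiments, if two closed rings share a basic unit, the two closed rings may be merged into one closed ring, that is, they are determined to belong to the same SDK.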
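One way to realize step S504, offered here as a sketch rather than as the method of this application, is to compute the strongly connected components of the directed graph of basic units. Units lying on a common closed ring end up in the same component, and rings that share a unit are merged automatically. The choice of Kosaraju's algorithm, the function name `find_sdks`, and the decision to leave out units that lie on no ring are all assumptions of this example.

```python
from collections import defaultdict


def find_sdks(edges):
    """Group basic units whose associations form closed rings.

    edges: iterable of (earlier_unit, later_unit) pairs.
    Returns a list of sets; each set is one group of two or more units.
    """
    graph, reverse, nodes = defaultdict(set), defaultdict(set), set()
    for earlier, later in edges:
        graph[earlier].add(later)
        reverse[later].add(earlier)
        nodes.update((earlier, later))

    # First pass: depth-first search, recording units in order of completion.
    visited, finish_order = set(), []

    def visit(node):
        visited.add(node)
        for nxt in sorted(graph[node]):
            if nxt not in visited:
                visit(nxt)
        finish_order.append(node)

    for node in sorted(nodes):
        if node not in visited:
            visit(node)

    # Second pass: walk the reversed graph in decreasing completion order.
    assigned, sdks = set(), []

    def collect(node, component):
        assigned.add(node)
        component.add(node)
        for prev in reverse[node]:
            if prev not in assigned:
                collect(prev, component)

    for node in reversed(finish_order):
        if node not in assigned:
            component = set()
            collect(node, component)
            if len(component) > 1:  # a single unit forms no closed ring
                sdks.append(component)
    return sorted(sdks, key=sorted)


if __name__ == "__main__":
    demo = [("u1", "u2"), ("u2", "u1"), ("u2", "u3")]
    print(find_sdks(demo))
```

A one-directional edge such as `("u2", "u3")` does not pull `u3` into the group of `u1` and `u2`, which is the property the closed-ring criterion relies on. The recursive search is adequate for small graphs; an iterative version would be preferable for very deep package graphs.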
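Figure 7 is a schematic structural diagram of some packages and class files in the code file of the target application. After the terminal performs steps S501 to S504 above, four closed rings formed by the association relationships of the basic units in Figure 7 are obtained: association ring 1, association ring 2, association ring 3, and association ring 4. Association ring 2 and association ring 3 share a basic unit and can be merged into one association ring. Three SDKs can therefore be determined, shown in Figure 7 as SDK1, SDK2, and SDK3.
In a specific embodiment, when the application store/browser performs this step, more specifically, the SDK identification module in Figure 3 may determine the SDKs contained in the target application according to the association relationships between basic units identified by the association ring construction module.
S505. Determine the sensitive behaviors of each SDK according to the class files contained in each SDK of the target application.
Step S504 has determined that an SDK includes multiple basic units and that each basic unit corresponds to one or more class files. The code in these class files is regarded as the code of the corresponding SDK. The code of each SDK is analyzed, with the SDK as the unit of analysis, to identify the sensitive behaviors of the SDK, such as calling sensitive APIs or performing sensitive broadcast behaviors. A sensitive API is an API used to obtain sensitive data or perform sensitive operations, for example obtaining the location of the terminal, connecting to a network, or accessing the photo album of the terminal.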
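The per-SDK analysis in step S505 can be sketched as a search for known API signatures in the smali text of an SDK's class files. The table `SENSITIVE_APIS` is an illustrative assumption with only two entries, the function name `sensitive_behaviors` is invented for this example, and plain substring matching is a simplification; the source does not specify how sensitive APIs are listed or matched.

```python
# Illustrative signature table (an assumption, not taken from the application).
SENSITIVE_APIS = {
    "Landroid/location/LocationManager;->getLastKnownLocation": "reads the device location",
    "Ljava/net/URL;->openConnection": "opens a network connection",
}


def sensitive_behaviors(sdk_code):
    """Return sorted descriptions of the sensitive APIs referenced by one SDK.

    sdk_code: mapping from class file name to its smali text.
    """
    found = set()
    for text in sdk_code.values():
        for signature, description in SENSITIVE_APIS.items():
            if signature in text:
                found.add(description)
    return sorted(found)


# Hypothetical smali fragments for one identified SDK.
DEMO_SDK = {
    "Tracker.smali": (
        "invoke-virtual {v0, v1}, Landroid/location/LocationManager;"
        "->getLastKnownLocation(Ljava/lang/String;)Landroid/location/Location;"
    ),
    "Util.smali": "const-string v0, \"hello\"",
}

if __name__ == "__main__":
    print(sensitive_behaviors(DEMO_SDK))
```

Because the input is grouped per SDK, the resulting descriptions can be reported per SDK rather than for the application as a whole.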
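Table 2 illustrates the sensitive behaviors included in some SDKs.
Table 2
In a specific embodiment, when the application store/browser performs this step, more specifically, the SDK analysis module in Figure 3 may identify the sensitive behaviors contained in each SDK according to the SDK information identified by the SDK identification module. Further, the SDK analysis module expresses the identified sensitive behaviors of each SDK in natural language to form an SDK risk report (also called a security prompt), which is presented to the user so that the user can decide whether to install the application, or whether to change the privacy-related system settings for the application after installing it.
In summary, the embodiments of this application first construct basic units according to the path relationships of the class files in the target application, then determine the association relationships of the basic units according to the dependencies between the class files corresponding to the basic units, and identify SDKs according to those association relationships. Because the SDK identification method provided by the embodiments of this application no longer relies on the package names or package structure features of the target application, it can overcome the effects of package name obfuscation and package structure flattening and accurately identify the SDKs contained in the target application.
In addition, the embodiments of this application determine that multiple basic units belong to the same SDK only when their association relationships form a closed ring. This prevents a malicious SDK that inherits from or calls another SDK in one direction only from being misidentified as part of that SDK, in which case the malicious SDK would go unidentified or the risk of its sensitive behaviors would be underestimated. The SDK identification method provided by the embodiments of this application can therefore distinguish different SDKs more accurately and improve the reliability of risk prompts.
Furthermore, the embodiments of this application determine packages whose root directory contains class files to be basic units and construct association rings with the basic unit as the unit. Compared with constructing association rings with the class file as the unit, this prevents the large number of dependencies between class files in the code file from making the association rings more complex. By choosing suitable basic units for constructing association rings, the embodiments of this application improve identification efficiency while maintaining the accuracy of SDK identification.
The target application broadly includes program code from two sources: program code developed by the application developer, and integrated program code provided by third-party developers (usually one or more SDKs). It can be understood that, in general, program code developed by the application developer carries a lower risk, whereas the risk of program code provided by third-party developers is uncontrollable and higher. Therefore, in still other embodiments, the terminal may perform SDK identification and/or analysis of sensitive behaviors in SDKs separately for the program code from the two sources.
Figure 8 is a schematic flowchart of another method for identifying an SDK in an application provided by an embodiment of this application. The process includes:
S801. Obtain the code file and the configuration file of the target application from the installation package of the target application.
After obtaining the installation package of the target application, the terminal decompresses the installation package and obtains the code file and the configuration file of the target application from the decompressed files.
Take an Android application package as an example: the APK installation package is decompressed, and the .dex file and the AndroidManifest.xml file are obtained from the decompressed files. The .dex file is the code file, which can further be disassembled to obtain smali code. The AndroidManifest.xml file is the configuration file. It describes the components of the target application, including activities, services, broadcast receivers, and content providers, and declares the class file that implements each component.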
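Reading the component declarations can be sketched as follows. The sketch assumes the manifest has already been decoded to plain-text XML, since the AndroidManifest.xml inside an APK is stored in a binary XML format and must first be decoded by a tool such as apktool. The function name `declared_classes` and the sample manifest are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
COMPONENT_TAGS = ("activity", "service", "receiver", "provider")


def declared_classes(manifest_xml):
    """Return the class names declared for components in a decoded manifest."""
    root = ET.fromstring(manifest_xml)
    package = root.get("package", "")
    classes = set()
    for tag in COMPONENT_TAGS:
        for element in root.iter(tag):
            name = element.get(ANDROID_NS + "name")
            if not name:
                continue
            if name.startswith("."):  # shorthand relative to the manifest package
                name = package + name
            classes.add(name)
    return classes


# Hypothetical decoded manifest.
DEMO_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.app">
  <application>
    <activity android:name=".MainActivity"/>
    <service android:name="com.example.app.sync.SyncService"/>
  </application>
</manifest>
"""

if __name__ == "__main__":
    print(sorted(declared_classes(DEMO_MANIFEST)))
```

The returned names use dotted Java notation, so they would need to be converted to the path form used for class files before being compared with the disassembled code.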
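S802. Identify the class files of the first type and the class files of the second type from the code file according to the configuration file.
It can be understood that the class files of the first type are the class files declared in the configuration file. They are generally considered to be program code that the application developer developed for the target application, and the sensitive behaviors of such self-developed code usually carry a lower risk. In some embodiments, the set of class files of the first type may be determined as one SDK, whose class files are all the class files of the first type in the target application. Alternatively, in other embodiments, the method applied to the class files of the second type may also be applied to the class files of the first type, to identify multiple SDKs contained in the class files of the first type.
The class files of the second type are the class files not declared in the configuration file. They are generally considered to be SDKs developed by third-party service providers. The sensitive behaviors of these SDKs carry a higher risk and deserve particular attention.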
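The split described in step S802 can be sketched as a simple partition. The function name `split_class_files` and the sample class names are assumptions for illustration; the rule itself (declared in the configuration file means first type, otherwise second type) follows the text.

```python
def split_class_files(all_classes, declared):
    """Partition class files into (first_type, second_type).

    first_type:  class files declared in the configuration file.
    second_type: class files not declared in the configuration file.
    """
    declared = set(declared)
    first_type = sorted(c for c in all_classes if c in declared)
    second_type = sorted(c for c in all_classes if c not in declared)
    return first_type, second_type


if __name__ == "__main__":
    all_classes = [
        "com.example.app.MainActivity",
        "com.vendor.ads.AdLoader",       # hypothetical third-party class
        "com.vendor.ads.net.Uploader",   # hypothetical third-party class
    ]
    first, second = split_class_files(all_classes, {"com.example.app.MainActivity"})
    print(first, second)
```

Only the second list would then be passed on to the basic unit construction of step S803, while the first list can be treated as a single SDK as described above.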
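It can be understood that dividing class files with different risks into different SDKs makes it possible to evaluate the risks of the SDKs separately and helps the user better understand the risk of each SDK in the target application. In addition, excluding the lower-risk class files when identifying SDKs among the higher-risk class files helps improve the efficiency and accuracy of SDK identification.
S803. Determine, according to the path relationships of the class files of the second type, the basic units contained in the target application and the class files of the second type corresponding to each basic unit.
A basic unit is a package whose root directory contains one or more class files of the second type. The class files corresponding to a basic unit are all the class files of the second type in the root directory of the basic unit. It should be emphasized that the class files corresponding to a basic unit do not include the class files in the subpackages of the basic unit.
It can be understood that the basic units here are determined for the class files of the second type, and the class files corresponding to the basic units are class files of the second type.
S804. Establish the association relationships between basic units according to the dependencies between the class files of the second type corresponding to the basic units.
S805. Determine the SDKs contained in the target application according to the association relationships between basic units, where one SDK includes multiple basic units whose association relationships form a closed ring.
Optionally, in some embodiments, if two closed rings share a basic unit, the two closed rings may be merged into one closed ring, that is, they are determined to belong to the same SDK.
S806. Determine the sensitive behaviors of each SDK according to the class files contained in each SDK of the target application.
It should be noted that the SDKs whose sensitive behaviors are analyzed here may be of two types. One type is the SDK identified in step S802 (that is, the SDK formed by the set of class files of the first type), and the other type is the SDKs identified in step S805 (that is, the multiple SDKs obtained by analyzing the class files of the second type). In some embodiments, because the two types of SDKs carry different risks, the risks of the two types of SDKs may be presented separately.
For other details of steps S801 to S806, refer to the corresponding content of steps S501 to S505 above. They are not repeated here.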
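An embodiment of this application further provides a chip system. As shown in Figure 9, the chip system includes at least one processor 1101 and at least one interface circuit 1102. The processor 1101 and the interface circuit 1102 may be interconnected through lines. For example, the interface circuit 1102 may be used to receive signals from other devices (such as the memory of the terminal 100). As another example, the interface circuit 1102 may be used to send signals to other devices (such as the processor 1101). For example, the interface circuit 1102 may read instructions stored in the memory and send the instructions to the processor 1101. When the instructions are executed by the processor 1101, the terminal may be caused to perform the steps performed by the terminal 100 (for example, a mobile phone) in the foregoing embodiments. The chip system may of course also include other discrete devices, which is not specifically limited in the embodiments of this application.
An embodiment of this application further provides an apparatus. The apparatus is included in a terminal or a server and has the function of implementing the behavior of the terminal or server in any method of the foregoing embodiments. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes at least one module or unit corresponding to the function, for example, a communication module or unit, a storage module or unit, and a processing module or unit.
An embodiment of this application further provides a computer storage medium, including computer instructions. When the computer instructions run on a terminal or a server, the terminal or server is caused to perform any method of the foregoing embodiments.
An embodiment of this application further provides a computer program product. When the computer program product runs on a computer, the computer is caused to perform any method of the foregoing embodiments.
It can be understood that, to implement the foregoing functions, the terminal or server includes the hardware structures and/or software modules corresponding to each function. Those skilled in the art should readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, the embodiments of this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled professionals may use different methods to implement the described functions for each specific application, but such implementation should not be considered to go beyond the scope of the embodiments of this application.
The embodiments of this application may divide the terminal or server into functional modules according to the foregoing method examples. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiments of this application is schematic and is only a division by logical function. Other division methods are possible in actual implementation.
From the description of the foregoing implementations, those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the foregoing functional modules is used as an example. In practical applications, the foregoing functions may be allocated to different functional modules as needed, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. For the specific working processes of the systems, apparatuses, and units described above, refer to the corresponding processes in the foregoing method embodiments. They are not repeated here.
The functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of this application. The storage medium includes various media that can store program code, such as flash memory, a removable hard disk, read-only memory, random access memory, a magnetic disk, or an optical disc.
The foregoing is only a specific implementation of this application, but the protection scope of this application is not limited to it. Any change or replacement within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.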

Claims (10)

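  1. A method for identifying an SDK in an application, characterized in that the method comprises:
    obtaining a code file of a target application from an installation package of the target application;
    determining, according to path relationships of class files in the code file of the target application, basic units contained in the target application and the class files corresponding to each of the basic units;
    establishing association relationships between the basic units according to dependencies between the class files corresponding to each of the basic units;
    determining SDKs contained in the target application according to the association relationships between the basic units, wherein multiple basic units whose association relationships form a closed ring belong to one SDK.
  2. The method according to claim 1, characterized in that the determining, according to the path relationships of the class files in the code file of the target application, the basic units contained in the target application and the class files corresponding to each of the basic units comprises:
    determining, according to the path relationships of the class files in the code file of the target application, a package whose root directory contains one or more class files to be a basic unit, wherein the class files corresponding to the basic unit are all the class files in the root directory of the basic unit.
  3. The method according to claim 1 or 2, characterized in that
    the dependencies between the class files corresponding to the basic units are determined according to dependencies between classes in the class files, and the dependencies between classes include one or more of an inheritance relationship, a call relationship, a parameter reference relationship, and a return relationship.
  4. The method according to any one of claims 1 to 3, characterized in that, after the determining the SDKs contained in the target application according to the association relationships between the basic units, the method further comprises:
    identifying, according to the class files contained in each SDK, the sensitive APIs called by each SDK, and generating a security prompt for each SDK, wherein the sensitive APIs are used to obtain sensitive data.
  5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
    further obtaining a configuration file of the target application from the installation package of the target application;
    identifying class files of a first type and class files of a second type from the code file of the target application according to the configuration file of the target application;
    wherein the determining, according to the path relationships of the class files in the code file of the target application, the basic units contained in the target application and the class files corresponding to each basic unit is specifically:
    determining, according to the path relationships of the class files of the second type in the code file of the target application, the basic units contained in the target application and the class files of the second type corresponding to each basic unit.
  6. The method according to claim 5, characterized in that the identifying the class files of the first type and the class files of the second type from the code file of the target application according to the configuration file of the target application comprises:
    determining the class files declared in the configuration file of the target application to be the class files of the first type;
    determining the class files not declared in the configuration file of the target application to be the class files of the second type.
  7. The method according to claim 5 or 6, characterized in that, after the identifying the class files of the first type and the class files of the second type from the code file of the target application according to the configuration file of the target application, the method further comprises:
    determining the class files of the first type as one SDK in the target application.
  8. A terminal, characterized by comprising a processor and a memory, wherein the memory is coupled to the processor and is configured to store computer program code, the computer program code comprises computer instructions, and when the processor reads the computer instructions from the memory, the terminal is caused to perform the method for identifying an SDK in an application according to any one of claims 1 to 7.
  9. A computer-readable storage medium, characterized by comprising computer instructions, wherein when the computer instructions run on a terminal, the terminal is caused to perform the method for identifying an SDK in an application according to any one of claims 1 to 7.
  10. A chip system, characterized by comprising one or more processors, wherein when the one or more processors execute instructions, the one or more processors perform the method for identifying an SDK in an application according to any one of claims 1 to 7.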
PCT/CN2023/077711 2022-03-07 2023-02-22 一种识别应用程序中sdk的方法、终端及服务器 WO2023169212A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210225699.6 2022-03-07
CN202210225699.6A CN116775050A (zh) 2022-03-07 2022-03-07 一种识别应用程序中sdk的方法、终端及服务器

Publications (1)

Publication Number Publication Date
WO2023169212A1 true WO2023169212A1 (zh) 2023-09-14

Family

ID=87937148

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/077711 WO2023169212A1 (zh) 2022-03-07 2023-02-22 一种识别应用程序中sdk的方法、终端及服务器

Country Status (2)

Country Link
CN (1) CN116775050A (zh)
WO (1) WO2023169212A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110283268A1 (en) * 2010-05-17 2011-11-17 Salter Mark O Mechanism for Cross-Building Support Using Dependency Information
CN105630684A (zh) * 2016-01-26 2016-06-01 百度在线网络技术(北京)有限公司 软件开发工具包识别方法和装置
CN106951780A (zh) * 2017-02-08 2017-07-14 中国科学院信息工程研究所 重打包恶意应用的静态检测方法和装置
CN112748952A (zh) * 2019-10-30 2021-05-04 武汉斗鱼鱼乐网络科技有限公司 一种环形依赖关系的检测方法、装置、设备和存储介质

Also Published As

Publication number Publication date
CN116775050A (zh) 2023-09-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23765790

Country of ref document: EP

Kind code of ref document: A1